MATERIAL SUBSTRUCTURES IN COMPLEX BODIES
MATERIAL SUBSTRUCTURES IN COMPLEX BODIES: FROM ATOMIC LEVEL TO CONTINUUM
Edited by
GIANFRANCO CAPRIZ Università di Pisa
PAOLO MARIA MARIANO D.I.C., Università di Firenze
Amsterdam • Boston • Heidelberg • London • New York • Oxford Paris • San Diego • San Francisco • Singapore • Sydney • Tokyo
Elsevier
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, UK
84 Theobald’s Road, London WC1X 8RR, UK

First edition 2007

Copyright © 2007 Elsevier BV. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means electronic, mechanical, photocopying, recording or otherwise without the prior written permission of the publisher.

Permissions may be sought directly from Elsevier’s Science & Technology Rights Department in Oxford, UK: phone (+44) (0) 1865 843830; fax (+44) (0) 1865 853333; email: [email protected]. Alternatively you can submit your request online by visiting the Elsevier web site at http://elsevier.com/locate/permissions, and selecting "Obtaining permission to use Elsevier material".

Notice
No responsibility is assumed by the publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein. Because of rapid advances in the medical sciences, in particular, independent verification of diagnoses and drug dosages should be made.

British Library Cataloguing in Publication Data
Material substructures in complex bodies : from atomic level to continuum
1. Microstructure 2. Materials science 3. Atomic structure 4. Molecular structure 5. Nanostructured materials
I. Capriz, G. (Gianfranco) II. Mariano, Paolo Maria, 1966–
620.1 1299

Library of Congress Control Number: 200693845
ISBN 13: 978-0-08-044535-9
ISBN 10: 0-08-044535-7

For information on all Elsevier publications visit our web site at books.elsevier.com

Typeset by Charon Tec Ltd (A Macmillan Company), Chennai, India. www.charontec.com
Printed and bound in Great Britain
Contents

Contributors
Preface

1. Asymptotic Continuum Models for Plasmas and Disparate Mass Gaseous Binary Mixtures
   Pierre Degond
   1.1 Introduction
   1.2 The Kinetic Model
   1.3 Moment Method and Conservation Laws for Gas Mixtures: Why it Cannot Apply to Plasmas
       1.3.1 Properties of the collision operators
       1.3.2 Moments and conservation laws
       1.3.3 Closure of the moment system: LTE
       1.3.4 Why the standard mixture model does not apply to plasmas
   1.4 The Plasma Fluid Model
       1.4.1 Energy-transport form of the system
       1.4.2 Hydrodynamic form of the system
       1.4.3 Discussion of the plasma fluid model and applications
       1.4.4 Approximate expression of the diffusion matrices
   1.5 Scaling Hypotheses
   1.6 Expansion of the Interspecies Collision Operators
   1.7 Moment Method and Conservation Laws for Plasmas
       1.7.1 Properties of the expanded collision operators
       1.7.2 Moments and conservation laws for the scaled kinetic model
       1.7.3 Closure of the plasma moment system
   1.8 Computation of the Fluxes and of the Collision Terms
       1.8.1 Preliminaries
       1.8.2 Properties of LB
       1.8.3 Resolution of the perturbation equation (1.242)
       1.8.4 Computation of the fluxes
       1.8.5 Expression of the fluxes in terms of ne and Te
       1.8.6 Expression of the collision terms
       1.8.7 Back to physical variables
   1.9 Conclusion

2. Microscopic Foundations of the Mechanics of Gases and Granular Materials
   Carlo Cercignani
   2.1 Introduction
   2.2 Kinetic Theory of Smooth Spheres
   2.3 Collision Dynamics of Rough Spheres
   2.4 The Boltzmann–Enskog Equation
   2.5 The Macroscopic Balance Equations
   2.6 Concluding Remarks

3. Quantization of Affine Bodies: Theory and Applications in Mechanics of Structured Media
   Jan J. Sławianowski
   3.1 Introduction
   3.2 Classical Preliminaries
   3.3 General Ideas of Quantization

4. Moving Least-Square Basis for Band-Structure Calculations of Natural and Artificial Crystals
   Sukky Jun and Wing Kam Liu
   4.1 Introduction
       4.1.1 Electronic band structures
       4.1.2 Photonic and acoustic band structures
       4.1.3 Meshless methods and moving least-square basis
       4.1.4 Periodicity
   4.2 MLS Basis and Periodicity
       4.2.1 MLS approximation
       4.2.2 Implementation of periodicity condition
   4.3 Atomic Crystals and Semiconductors
       4.3.1 Galerkin formulation of Schrödinger equation
       4.3.2 The Kronig–Penney model potential
       4.3.3 Empirical pseudopotentials of Si and GaAs
       4.3.4 Strain effect in compound semiconductors
   4.4 PhoXonic Crystals
       4.4.1 Maxwell equations for 2D photonic crystals
       4.4.2 Band structures of various 2D photonic crystals
       4.4.3 Acoustic bandgap materials
   4.5 Strain-Tunable Photonic Bandgap Materials
       4.5.1 Deformations of 2D triangular photonic crystals
       4.5.2 Band structures of deformed photonic crystals
   4.6 Concluding Remarks

5. Modelling Ziegler–Natta Polymerization in High Pressure Reactors
   Antonio Fasano, K. Kannan, Alberto Mancini and K. R. Rajagopal
   5.1 Introduction
   Modelling the Growth of the Agglomerate (Macroscale)
   5.2 Governing Equations
       5.2.1 Constancy of porosity
       5.2.2 Density of microspheres
       5.2.3 Liquid monomer balance
       5.2.4 Solid mass balance
       5.2.5 Relating the agglomerate expansion to microspheres growth
       5.2.6 Liquid flow
       5.2.7 Energy balance
   5.3 Initial and Boundary Conditions
   Modelling the Growth of Microspheres
   5.4 Kinematics
   5.5 The Governing Equations
       5.5.1 Mass balance
       5.5.2 Momentum balance
       5.5.3 Constitutive equations
   5.6 Initial and Boundary Conditions
   5.7 Analysis of the Equations in the Microscale with Spherical Symmetry
   5.8 Consistency of the Boundary Conditions
   Bridging the Two Scales. The Complete Model
   5.9 Determining the Free Terms in the Macroscopic Transport Equations
   5.10 The Complete Model
       5.10.1 Macroscale
       5.10.2 Microscale
   5.11 Not Evolving Natural Configuration
   5.12 Conclusions

6. Pseudofluids
   Gianfranco Capriz
   6.1 Preamble
   6.2 Material Element
   6.3 Basic Fields
   6.4 Measures of Deformation and Distorsion
   6.5 Strain Rates and Distorsion Rates
   6.6 Inertia Measures
   6.7 Relations with Thermal Concepts
   6.8 Balance Equations
   6.9 Boundary Conditions: Sample Flows
   6.10 A Lagrangian Approach

7. A Thermodynamical Framework Incorporating the Effect of the Thermal History on the Solidification of Molten Polymers
   K. Kannan and K. R. Rajagopal
   7.1 Introduction
   7.2 Kinematics
   7.3 Modeling
       7.3.1 Modeling prior to the initiation of solidification
       7.3.2 Modeling after the initiation of solidification
   7.4 Summary and Conclusions

8. Effects of Stress on Formation and Properties of Semiconductor Nanostructures
   Harley T. Johnson
   8.1 Overview
   8.2 Background
       8.2.1 Applications of semiconductor nanostructures
       8.2.2 Fabrication methods
       8.2.3 Fundamental properties of nanostructures
       8.2.4 Modeling methods for semiconductor nanostructures
   8.3 Effects of Stress on the Formation of Semiconductor Nanostructures
       8.3.1 Modeling stress-induced surface self-assembly
       8.3.2 Modeling stress effects in the sputter-erosion instability
       8.3.3 Modeling stress effects in compositional segregation in thin films
   8.4 Stress Effects on the Electronic/Optical Properties of Semiconductor Nanostructures
       8.4.1 Models for stress effects on parallel transport in thin films
       8.4.2 Modeling the effects of stress on quantum confinement in wires and dots
       8.4.3 Multiscale coupled mechanical/electronic modeling in semiconductor nanostructures
   8.5 Conclusions

9. Continua with Spin Structure
   Paolo Maria Mariano
   9.1 Introduction
   9.2 Spatial Representation
       9.2.1 Heisenberg spins and balance equations
       9.2.2 Connection with non-viscous compressible spin fluids
   9.3 Referential Description: Invariance with Respect to Relabeling
   9.4 Covariant Evolution of Interstitial Point Defects and Disclinations
       9.4.1 Interstitial point defects
       9.4.2 Disclination lines

Index
Contributors
Prof. Gianfranco Capriz
Dipartimento di Matematica, Università di Pisa, Largo B. Pontecorvo 5, I-56127 Pisa, Italy. E-mail: [email protected]

Prof. Carlo Cercignani
Dipartimento di Matematica, Politecnico di Milano, Piazza Leonardo da Vinci 32, 20133 Milano, Italy. Phone: +39 02 2399 4557; Fax: +39 02 2399 4606; E-mail: carlo.cercignani@mate.polimi.it

Prof. Pierre Degond
Laboratoire MIP, Université Paul Sabatier, 31062 Toulouse Cedex 9, France. Phone: +33 05 61 55 63 69; Fax: +33 05 61 55 83 85; E-mail: [email protected]

Prof. Antonio Fasano
Dipartimento di Matematica “U. Dini”, Università di Firenze, viale Morgagni 67/A, I-50134 Firenze, Italy. Phone: +39 055 4237145; Fax: +39 055 4222695; E-mail: [email protected]fi.it

Dr. Harley T. Johnson
Department of Mechanical Science and Engineering, University of Illinois at Urbana-Champaign, 1206 W. Green Street, Urbana, IL 61801-2906, USA. Phone: +1 217-265-5468; Fax: +1 217-244-6534; E-mail: [email protected]

Sukky Jun
Department of Mechanical and Materials Engineering, Florida International University, 10555 West Flagler Street, EC 3463, Miami, FL 33174, USA. Phone: +1 305 348 1217; Fax: +1 305 348 1932; E-mail: juns@fiu.edu

K. Kannan
Department of Biomedical Engineering, Texas A&M University, 230A Engineering/Physics Building Office Wing, 3120 TAMU.

Prof. Wing Kam Liu
Northwestern University, Department of Mechanical Engineering, 2145 Sheridan, Evanston, IL 60208-3111, USA. Phone: +1 847-491-7094; Fax: +1 847-491-3915; E-mail: [email protected]
Alberto Mancini
Dipartimento di Matematica “U. Dini”, Università degli Studi di Firenze, Viale Morgagni 67/a, 50134 Firenze, Italy.

Prof. Paolo Maria Mariano
D.I.C., Università di Firenze, via Santa Marta 3, I-50139 Firenze, Italy. Phone: +39.055.4796470; Fax: +39.055.4796320; E-mail: paolo.mariano@unifi.it

Prof. K. R. Rajagopal
Department of Mechanical Engineering, Texas A&M University, 230A Engineering/Physics Building Office Wing, College Station, Texas 77843, USA. Phone: +1 979-862-4552; Fax: +1 979-845-3081; E-mail: [email protected]

Prof. Jan J. Sławianowski
Institute of Fundamental Technological Research, Polish Academy of Sciences, ul. Swietokrzyska 21, 00-049 Warsaw, Poland. Phone: +48 (22) 8261281; Fax: +48 (22) 8269815; E-mail: [email protected]

Preface
Stringent industrial requirements for sophisticated performance and for detailed control of micro-devices and other types of machinery at multiple scales can often be satisfied only by resorting to, or allowing for, complex materials. The adjective “complex” points to the fact that the substructure influences gross mechanical behaviour in a prominent way. Interactions due to substructural changes are represented directly. Examples, to list just a few, are liquid crystals, quasi-periodic alloys, polymeric bodies, spin glasses, magnetostrictive materials and ferroelectrics, suspensions (in particular liquids with gas bubbles), polarizable fluids, etc. Hopefully, substructures can be exploited, even invented anew, to reach predetermined goals. To help in the process, theories must be developed, so that challenging theoretical problems arise, often of a fundamental nature. A precise grasp of the physical meaning of mathematical entities is critical for the correct, adequate proposal of models of behaviour and even of the consequent computational analyses. A basic problem is that of bridging scales, even from the atomic to the macroscopic level, translating through the continuum limit the prominent aspects of the subtle discrete substructural features. Their number and nature may also be enriched by specific circumstances. The collection of chapters composing this book aims to underline some aspects of these questions, also proposing new matter for discussion together with specific solutions.

In Chapter 1, Pierre Degond derives hydrodynamic models of plasmas and disparate mass binary mixtures by evaluating the continuum limit of kinetic “small-scale” events represented by means of Fokker–Planck or Boltzmann equations. Macroscopic diffusion equations for the density of particles and energy follow, coupled with an Euler-type equation for ions or heavy species. Inconsistencies in existing models are pointed out.
In Chapter 2, Carlo Cercignani continues the discussion of how kinetic schemes based on the Boltzmann equation may offer microscopic foundations for continuous dynamical models. He examines how old and new techniques in the kinetic theory of dense gases may be useful for describing the fast flow of granular materials.

In some circumstances, substructural kinetic effects may be less prominent than quantum phenomena. In Chapter 3, Jan Jerzy Sławianowski develops a quantization scheme for affine bodies, a special class of complex bodies where the natural morphological descriptor is a second-order tensor: in other words, each material element is considered as a system which may (microscopically) deform independently of its neighbours.

Once reasonable models have been established, computational techniques are essential in finding explicit solutions in special cases. When phenomena at various scales are involved, non-trivial computational problems arise and may be tackled with different methods, depending on circumstances. In Chapter 4, Sukky Jun and Wing Kam Liu discuss computational methods appropriate for analysing the formation of electronic band structures in periodic atomic lattices. The approach makes use of periodic meshless shape functions based on the moving least-square approximation. Wave equations are analysed in the reciprocal space determined by the standard Fourier basis. The analyses of semiconductors and of photonic and phononic crystals are natural applications.

Complex bodies are produced in non-simple industrial processes, so that the process of formation of substructures deserves to be described per se. Among possible industrial processes, in Chapter 5, Antonio Fasano, Krishna Kannan, Alberto Mancini and Kumbakonam R. Rajagopal propose a new model for Ziegler–Natta polymerization in a high-pressure reactor by considering, after fragmentation, a single agglomerate of catalytic particles, then analysing the mechanics of growing nano-spheres. A non-linear hyperbolic system of governing equations arises.

Other aspects of the mechanics of polymers are further discussed by Krishna Kannan and Kumbakonam R. Rajagopal in Chapter 7. Attention is focused on the solidification process of molten polymers, where there is competition between the effects of substructural quenching and deformation of the melt: the former is an obstacle to crystallization, while the latter enhances it in a way in which memory effects have to be accounted for.

Deformation and the corresponding macroscopic stress also influence the formation of nanostructures in semiconductors during their fabrication and the collective mechanical behaviour in applications. The modelling of these effects includes atomistic, continuum and multiscale features. These topics are discussed in Chapter 8 by Harley T. Johnson.
Finally, our personal contributions are in Chapters 6 and 9. Basic foundations of the mechanics of bodies in which substructural phenomena have a kinetic nature are discussed in Chapter 6 (by G.C.) without resorting to any version of the Boltzmann equation. The interaction between gross deformation and spin structures is discussed in Chapter 9 (by P.M.M.), paying attention to the evolution of disclination lines and point defects. The covariance of the relevant evolution equations is proven.

Gianfranco Capriz, Bridport (UK)
Paolo Maria Mariano, Firenze (Italy)
CHAPTER ONE

Asymptotic Continuum Models for Plasmas and Disparate Mass Gaseous Binary Mixtures

Pierre Degond∗
Contents
1.1 Introduction
1.2 The Kinetic Model
1.3 Moment Method and Conservation Laws for Gas Mixtures: Why it Cannot Apply to Plasmas
    1.3.1 Properties of the collision operators
    1.3.2 Moments and conservation laws
    1.3.3 Closure of the moment system: LTE
    1.3.4 Why the standard mixture model does not apply to plasmas
1.4 The Plasma Fluid Model
    1.4.1 Energy-transport form of the system
    1.4.2 Hydrodynamic form of the system
    1.4.3 Discussion of the plasma fluid model and applications
    1.4.4 Approximate expression of the diffusion matrices
1.5 Scaling Hypotheses
1.6 Expansion of the Interspecies Collision Operators
1.7 Moment Method and Conservation Laws for Plasmas
    1.7.1 Properties of the expanded collision operators
    1.7.2 Moments and conservation laws for the scaled kinetic model
    1.7.3 Closure of the plasma moment system
1.8 Computation of the Fluxes and of the Collision Terms
    1.8.1 Preliminaries
    1.8.2 Properties of LB
    1.8.3 Resolution of the perturbation equation (1.242)
    1.8.4 Computation of the fluxes
    1.8.5 Expression of the fluxes in terms of ne and Te
    1.8.6 Expression of the collision terms
    1.8.7 Back to physical variables
1.9 Conclusion
∗ MIP, UMR 5640 (CNRS-UPS-INSA), Université Paul Sabatier, 118, route de Narbonne, 31062 Toulouse Cedex, France. E-mail: [email protected]
Abstract
We review the derivation of macroscopic models for plasmas and disparate mass binary mixtures from a large-scale limit of the underlying kinetic (Fokker–Planck or Boltzmann) equations. The Knudsen number (ratio of the collision mean free-path to the size of the system) is assumed to be of the same order of magnitude as the square root of the mass ratio between the particles. The resulting macroscopic model consists of a system of two diffusion equations for the electron (or light species) density and energy (often referred to as the energy-transport model), coupled with the gas dynamics Euler system for the ions (or heavy species). The mathematical properties of this system are reviewed and its applicability to various physical contexts is outlined.

Key Words: Gas mixture, Disparate masses, Plasmas, Boltzmann equation, Fokker–Planck–Landau equation, Conservation equations, Local thermodynamical equilibrium, Energy-transport system, Hydrodynamic equations, Onsager relation

AMS Subject classification: 41A60, 35Q20, 76P05
1.1 Introduction
This chapter reviews the derivation of macroscopic models for plasmas and disparate mass binary mixtures from a large-scale limit of the underlying kinetic (Fokker–Planck or Boltzmann) equations. The material is based on a series of works starting with Refs. [1], [2] for the case of unmagnetized gases and plasmas and complemented by Ref. [3] for the magnetized case. A related series of works is concerned with semiconductor models [4], [5].

The systematic derivation of macroscopic continuum models from microscopic models has been the cornerstone of modern statistical physics since the seminal works of Boltzmann and Maxwell. The mathematical approach based on asymptotic expansions was proposed and implemented by Hilbert [6] and independently by Chapman [7] and Enskog [8]; it allows one to rigorously derive the Euler and Navier–Stokes equations from the Boltzmann equation. This approach was later pursued by Grad [9–11] and Cercignani [12]. A recent review of these topics and the latest mathematical developments can be found in Ref. [13] and in the works of Caflisch [14], Kawashima et al. [15] and, more recently, Bardos et al. [16]. In the same vein, the passage from microscopic models of radiative transfer, neutron transport or semiconductors toward diffusion models (respectively called the Rosseland approximation, the diffusion approximation or the drift-diffusion model) has been investigated in Refs. [17–21].

In the present work, we concentrate on plasmas or disparate mass binary mixtures, and stay at the level of the formal asymptotics, which is rich and complex enough to justify a study of its own. Our goal is to study the passage from the kinetic description of the plasma or mixture toward macroscopic equations. It
should be noted that there exists no mathematically rigorous justification yet of the formal theory presented below.

The derivation of macroscopic models for plasmas and disparate mass binary mixtures already has a rich history. Macroscopic transport models for plasmas were solidly established by Braginskii [22], while the first numerical computation of the electron thermal diffusivity dates back to the work of Spitzer and Härm [23]. Disparate mass binary mixtures have been considered in Refs. [24–28].

In ordinary gas mixtures (when the particles have similar masses), the momenta and energies of the two species relax toward each other on the same time scale as the relaxation of the distribution functions to local thermodynamical equilibrium (LTE); this scale is characterized by the collision times and mean free-paths. As a consequence, the macroscopic velocity and temperature of the two species are equal (at leading order) and the resulting system contains only four equations: two conservation equations for the densities of each species, but only one momentum equation and one energy equation for the global mixture. If Navier–Stokes corrections are included, specific interspecies diffusion terms are added to the momentum and energy equations to account for the velocity and temperature differences between the two species, which are first order in the Knudsen number [29] (we recall that the Knudsen number is the ratio of the collision mean free-path to the size of the system).

In Ref. [22], Braginskii naturally considers two sets of coupled hydrodynamic equations (of Navier–Stokes type) for the electrons and the ions. He therefore assumes that the velocity and temperature differences between the species are zeroth order in the Knudsen number, which implicitly means that momentum and energy relaxation of the species toward each other occurs on a longer time scale than the relaxation of the distribution functions to LTE.
This obviously requires some additional mechanism compared with an ordinary mixture. This mechanism is related to the small mass ratio between the electrons and the ions, which results in a scale separation between the relaxation of each species to its own thermal equilibrium and the momentum and energy relaxation of the species toward each other.

In the present chapter, we propose a consistent asymptotic analysis of the set of plasma kinetic equations which takes into account that both the Knudsen number and the mass ratio are simultaneously small. The relevant scaling is indeed that the Knudsen number is of the same order as the square root of the mass ratio. When the Knudsen number goes to zero in this way, one does not get a system of hydrodynamic equations for each species, as in Ref. [22], but a system of diffusion equations for the electron density and energy, coupled with a system of hydrodynamic equations for the ions. The diffusion system for the electrons (sometimes referred to as the energy-transport system in the literature) can be viewed as a compressible Navier–Stokes system in which all terms proportional to the electron inertia are neglected (i.e. the inertia and viscosity terms in the momentum balance equation, the drift component of the total energy in the energy balance equation, etc.). In other words, the coupled Navier–Stokes system
in Ref. [22] retains terms of second order in the Knudsen number, while of course, some other terms of the same order (such as the second-order (Burnett) corrections in the ion balance equations) are dropped. To some extent, Braginskii’s model (which has more or less become the conventional plasma model) lacks consistency in this respect. The present work can be viewed as an attempt to correct this inconsistency.

Braginskii’s model [22] with the Spitzer–Härm [23] thermal conductivity suffers from other deficiencies that shall not be addressed in the present chapter, such as the inadequacy of the Spitzer–Härm formula for hot-electron transport (a review of this question can be found in Ref. [30]), the anomalously large value of certain transport coefficients due to plasma turbulence [31], the need for closures in regimes where collisions are not dominant ([32] and, more recently, [33]), etc. In the present work, we do not question the LTE assumption, which is the conventional strategy for deriving macroscopic models. Therefore, we cannot address the above-mentioned points, which are obviously situations where the LTE assumption is not valid. The theory of extended thermodynamics [34], recently revisited by Levermore [35], has been designed to address such situations. We shall not develop this viewpoint and just mention the recent work of Anile et al. [36–37], who have practically implemented this theory in the semiconductor context.

In disparate mass binary mixtures, the derivation of macroscopic models from kinetic ones has been studied by a large number of authors [24–28], who have proposed a wide range of different models. Our approach bears similarities with that of Petit and Darrozes [28].

Although based on earlier articles [1–3], this presentation is different in several aspects.
In the present work, we have tried to make the text accessible to a physics audience (the mathematical formalism of the previous articles makes them difficult for non-mathematicians). First, we base our discussion on the physical conservation laws (mass, momentum and energy balance equations, entropy inequality), instead of using exclusively the Hilbert method as in the previous works. Second, we express the energy-transport fluxes in terms of the gradients of the entropic variables (essentially the chemical potential and the reciprocal of the temperature), which makes Onsager’s reciprocity relation particularly simple. The other major additions concern discussions of the physical motivation and implications of the model and the derivation of explicit formulae for the diffusion coefficients and relaxation rates in some simple cases (numerical computations are required for the most general case).

These models have been implemented in various applications, such as the modeling of a plasma opening switch [38, 39] and the description of the primary discharge development in the process of arcing on satellite solar cells [40] (see also the review [41]). In the second case, surface collision mechanisms were added, following [42].

The energy-transport model appears in the solid-state physics literature as early as Refs. [43–45]. It is routinely used in semiconductor modeling [46–50]. The mathematical analysis of the energy-transport model has been conducted in Refs. [51–53]. As previously stated, its derivation from kinetic
models has been formally established in Refs. [1–3] in the case of plasmas, and in Refs. [4, 5] for the semiconductor case. Elements toward a rigorous convergence proof are collected in Ref. [54] for the semiconductor case. To our knowledge, there is no mathematically rigorous analysis of the plasma case. Another approach, which consists in deriving the energy-transport model from a relaxation limit of the hydrodynamic equations, is reported in Ref. [55]. The energy-transport model can also be derived from a relaxation limit of the so-called spherical harmonics expansion (SHE) model, also called the Fokker–Planck (FP) equation [4, 56]. For other references about the SHE/FP model and its numerical simulation, we refer the reader to Refs. [57–65], [84], [85]. A review of the relation between SHE/FP models and energy-transport models for semiconductors can be found in Ref. [66]. For general references concerning rarefied gas dynamics, we refer the reader to [12], [13], [67]; for plasmas, to [68–70]; for semiconductors, to [71], [72].

Extensions of the present work have been made in several directions. A derivation of an energy-transport model for relativistic electrons can be found in Ref. [39]. The case where the ionization level of the ions is large has been investigated in Ref. [3]: it leads to an FP/SHE model for the electrons, coupled with a hydrodynamic model for the ions. Impact ionization for the case of semiconductors has been taken into account in Refs. [73–75], and leads to non-conventional energy-transport models. A somewhat related study for plasmas has been done in Ref. [76]. Relaxation limits of the energy-transport model under high-field conditions toward non-equilibrium drift-diffusion models have been investigated in Ref. [77].

The outline of the chapter is the following: we start with the kinetic level in Section 1.2 and investigate the moment approach and LTE closure for conventional mixtures in Section 1.3.
We show why the case of plasmas or disparate mass mixtures requires a special treatment. Then, we state our plasma fluid model in Section 1.4. In this section, we discuss the physical implications of the model and provide analytical formulae for the diffusion and relaxation coefficients for some specific examples of collision cross-sections. We then develop the arguments leading to the plasma fluid model in the following sections. Section 1.5 is devoted to the scaling assumptions (which are the key point of our approach) and the subsequent dimensionless form of the equations. The most important dimensionless parameter in this analysis is the square root of the mass ratio ε, which is supposed equal to the Knudsen number (ratio of the collision mean free-path to the macroscopic scale). Section 1.6 gives the expansion of the interspecies collision operators in terms of ε and lists some useful properties of these operators. Then, in Section 1.7, the moment approach and the LTE closure are revisited in the framework of this new scaling. In doing so, the evaluation of the electron fluxes requires the inversion of an auxiliary equation involving the linearized collision operators. This analysis is performed in Section 1.8 and, after scaling the results back to physical variables, leads to the model stated in Section 1.4. The analysis of the operator involved in the auxiliary equation is deferred to Appendix A.
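To give a feeling for the size of the small parameter ε mentioned above, the following minimal sketch evaluates the square root of the mass ratio for an electron–proton plasma. It is an illustration only and is not taken from the chapter: the choice of a proton as the heavy species, the function name `scaling_parameter` and the system size `system_size` are assumptions made for the example.

```python
import math

# CODATA values in kilograms. Taking the proton as the heavy species is an
# assumption made for this example; the chapter treats a generic ion species.
ELECTRON_MASS = 9.1093837015e-31
PROTON_MASS = 1.67262192369e-27


def scaling_parameter(light_mass, heavy_mass):
    """Return epsilon, the square root of the light-to-heavy mass ratio."""
    return math.sqrt(light_mass / heavy_mass)


epsilon = scaling_parameter(ELECTRON_MASS, PROTON_MASS)

# Hypothetical macroscopic length in meters. If the Knudsen number is taken
# equal to epsilon, the corresponding collision mean free-path is epsilon * L.
system_size = 1.0e-2
mean_free_path = epsilon * system_size

print(f"epsilon = {epsilon:.5f}")                  # roughly 0.023
print(f"mean free-path = {mean_free_path:.3e} m")
```

For heavier ions ε is smaller still, so under these assumptions the regime in which the Knudsen number and ε are comparable corresponds to mean free-paths of at most a few percent of the macroscopic scale.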
1.2 The Kinetic Model
We investigate a fully ionized plasma model consisting of electrons and one species of ions, with no neutrals. We are interested in the macroscopic effects induced in the mixture by the collisions among the particles, whether of the same species or not. In this respect, the specificity of plasmas is that the species considered have very different masses. On the other hand, all the conclusions of this study are relevant to disparate mass binary gases as well, the nature of the interaction (of Coulombic type in the case of plasmas, and of short-range type for neutral molecules) being of little importance for these conclusions. Therefore, we do not restrict ourselves to the Landau collision operator (relevant for charged particles) but also consider Boltzmann type operators. We shall use the index “e” for electrons in the plasma case and for the light species in the gas-dynamic case, and the index “i” for ions or the heavy species.

We start from kinetic models of Boltzmann or Fokker–Planck type for the two species of particles. Let $f_e(x, v, t)$ denote the electron (or light species) distribution function, which depends on the space coordinate $x \in \mathbb{R}^3$, the velocity coordinate $v \in \mathbb{R}^3$ and the time $t > 0$. The pair $(x, v)$ spans the phase space, and the quantity $f_e(x, v, t)\,dx\,dv$ represents the number of electrons in the elementary volume $dx\,dv$ about the point $(x, v)$ of phase space at time $t$. Similarly, we denote by $f_i(x, v, t)$ the ion (or heavy species) distribution function. The functions $f_e$ and $f_i$ evolve in time according to the so-called Boltzmann (or Fokker–Planck) equations:

$$\partial_t f_e + v \cdot \nabla_x f_e + \frac{1}{m_e} F_e \cdot \nabla_v f_e = Q_{ee}(f_e, f_e) + Q_{ei}(f_e, f_i), \tag{1.1}$$

$$\partial_t f_i + v \cdot \nabla_x f_i + \frac{1}{m_i} F_i \cdot \nabla_v f_i = Q_{ie}(f_i, f_e) + Q_{ii}(f_i, f_i), \tag{1.2}$$
where $m_e$ and $m_i$ denote the electron and ion masses, $Q_{ee}(f_e, f_e)$, $Q_{ii}(f_i, f_i)$ are the electron–electron and ion–ion collision operators, $Q_{ei}(f_e, f_i)$, $Q_{ie}(f_i, f_e)$ are the electron–ion collision operators, acting respectively on the electron and ion distribution functions, and finally $F_e$, $F_i$ are force terms acting on the electrons and ions. These equations express that the rates of change of $f_e$ or $f_i$ while following the particles in their classical mechanical motion (i.e. the left-hand sides) are due to the pair collisions between the particles (expressed by the collision operators on the right-hand sides). The derivation and justification of the Boltzmann equation is beyond the scope of the present work (see e.g. Refs. [12, 13, 67]). For rarefied gas mixtures, an example of force acting on the molecules is gravity, and the collision operators are of Boltzmann type (see below). In the case of a plasma, the main force acting on the charged particles is the Lorentz force:

$$F_e = -q_e (E + v \times B), \qquad F_i = q_i (E + v \times B),$$
Models for Plasmas and Disparate Mass Gaseous Binary Mixtures
where $q_e$ and $q_i$ denote the absolute value of the electron and ion charges, $E = E(x, t)$ is the electric field and $B = B(x, t)$ the magnetic field. A specificity of plasmas is that the two terms of the Lorentz force may have different orders of magnitude. This kind of splitting does not occur (at least to our knowledge) in gas mixtures. In plasmas, the collision operators are of Fokker–Planck–Landau type (see below), and we shall distinguish them from Boltzmann type operators by using the superscript P for the Fokker–Planck–Landau operator, and B for the Boltzmann operator. Therefore, the case of rarefied gas mixtures is described by equations (1.1) and (1.2) where $Q = Q^B$ while, in the case of plasmas, these equations take the form:

$$\partial_t f_e + v \cdot \nabla_x f_e - \frac{q_e}{m_e} (E + v \times B) \cdot \nabla_v f_e = Q^P_{ee}(f_e, f_e) + Q^P_{ei}(f_e, f_i), \tag{1.3}$$

$$\partial_t f_i + v \cdot \nabla_x f_i + \frac{q_i}{m_i} (E + v \times B) \cdot \nabla_v f_i = Q^P_{ie}(f_i, f_e) + Q^P_{ii}(f_i, f_i). \tag{1.4}$$
Throughout the text, all statements valid for both $Q^B$ and $Q^P$ will be stated using the generic notation $Q$. In plasmas, the collisions are modeled by the Fokker–Planck–Landau operator. This operator is written, for collisions of the species α against the species β, α and β being equal to either "e" or "i":

$$Q^P_{\alpha\beta}(f_\alpha, f_\beta)(v) = \frac{\mu_{\alpha\beta}^2}{m_\alpha} \nabla_v \cdot \int_{v_1 \in \mathbb{R}^3} \sigma^P_{\alpha\beta}\, |v - v_1|^3\, S(v - v_1) \left( \frac{1}{m_\alpha} (f_\beta)_1 (\nabla_v f_\alpha) - \frac{1}{m_\beta} f_\alpha (\nabla_v f_\beta)_1 \right) dv_1, \tag{1.5}$$
where $\mu_{\alpha\beta} = m_\alpha m_\beta / (m_\alpha + m_\beta)$ is the reduced mass of the pair of particles, $S(v) = \mathrm{Id} - (v \otimes v)/|v|^2$ is the 3 × 3 projection matrix onto the plane orthogonal to $v$ and $\sigma^P_{\alpha\beta}$ is the scattering cross-section. As usual in kinetic theory, we have introduced the short-cuts $f \equiv f(v)$, $f_1 \equiv f(v_1)$ and for instance $(\nabla_v f_\beta)_1 \equiv (\nabla_v f_\beta)(v_1)$. It is customary to make $\sigma^P_{\alpha\beta}$ depend on the relative velocity of the colliding particles $|v - v_1|$. In fact, for our purpose, we shall prefer to make it depend on the relative energy $\mu_{\alpha\beta} |v - v_1|^2$, i.e. $\sigma^P_{\alpha\beta} = \sigma^P_{\alpha\beta}(\mu_{\alpha\beta} |v - v_1|^2)$. We have $\sigma^P_{\alpha\beta} = \sigma^P_{\beta\alpha}$.

In rarefied gas mixtures, the collision operator is the Boltzmann operator, given by:

$$Q^B_{\alpha\beta}(f_\alpha, f_\beta)(v) = \int_{v_1 \in \mathbb{R}^3} \int_{\omega \in S^2} \sigma^B_{\alpha\beta}\, |v - v_1| \left( f'_\alpha (f_\beta)'_1 - f_\alpha (f_\beta)_1 \right) dv_1\, d\omega, \tag{1.6}$$
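The projection property of the matrix S(v) that enters the Fokker–Planck–Landau operator can be checked numerically. The following is a minimal sketch in plain Python; the helper names `projection_matrix` and `matvec` and the sample vectors are illustrative assumptions, not part of the original text.

```python
def projection_matrix(v):
    """Return S(v) = Id - (v (x) v)/|v|^2 as a 3x3 nested list."""
    norm2 = sum(c * c for c in v)
    if norm2 == 0.0:
        raise ValueError("S(v) is not defined for v = 0")
    return [[(1.0 if i == j else 0.0) - v[i] * v[j] / norm2
             for j in range(3)] for i in range(3)]


def matvec(m, x):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * x[j] for j in range(3)) for i in range(3)]


v = [1.0, 2.0, -2.0]          # illustrative velocity (hypothetical values)
w = [0.5, -1.0, 3.0]          # arbitrary test vector (hypothetical values)
S = projection_matrix(v)

Sv = matvec(S, v)             # should vanish: S(v) removes the component along v
Sw = matvec(S, w)             # component of w orthogonal to v
SSw = matvec(S, Sw)           # projecting twice should change nothing
Sw_dot_v = sum(Sw[k] * v[k] for k in range(3))
```

Because S(v − v1) annihilates the relative velocity, only the components of the gradients orthogonal to v − v1 contribute to the integrand of (1.5).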
where again $f_\alpha \equiv f_\alpha(v)$, $(f_\beta)_1 \equiv f_\beta(v_1)$, $f'_\alpha \equiv f_\alpha(v')$, $(f_\beta)'_1 \equiv f_\beta(v'_1)$. The pair $(v', v'_1)$ is related to $(v, v_1)$ and $\omega$ through the collision transform

$$v' = v_G + \frac{m_\beta}{m_\alpha + m_\beta} |v - v_1|\, \omega, \qquad v'_1 = v_G - \frac{m_\alpha}{m_\alpha + m_\beta} |v - v_1|\, \omega, \tag{1.7}$$

where $v_G$ is the center-of-mass velocity

$$v_G = \frac{m_\alpha v + m_\beta v_1}{m_\alpha + m_\beta}.$$

Introducing the unit vector

$$e = \frac{v - v_1}{|v - v_1|},$$

$v$ and $v_1$ can be written

$$v = v_G + \frac{m_\beta}{m_\alpha + m_\beta} |v - v_1|\, e, \qquad v_1 = v_G - \frac{m_\alpha}{m_\alpha + m_\beta} |v - v_1|\, e. \tag{1.8}$$

We also note that

$$\omega = \frac{v' - v'_1}{|v' - v'_1|}.$$

Therefore, $e$ and $\omega$ define the direction of the relative velocities before and after the collision, respectively, in the center-of-mass frame. $\omega$ is called the scattering direction. It ranges over the entire unit sphere $S^2 = \{\omega \in \mathbb{R}^3, |\omega| = 1\}$. Relation (1.7) can be deduced from momentum and energy conservation during the collision $(v, v_1) \leftrightarrow (v', v'_1)$, i.e.:

$$m_\alpha v + m_\beta v_1 = m_\alpha v' + m_\beta v'_1, \qquad m_\alpha |v|^2 + m_\beta |v_1|^2 = m_\alpha |v'|^2 + m_\beta |v'_1|^2.$$

Indeed, momentum conservation implies that the center-of-mass velocity $v_G$ is conserved during the collision. Then, from energy conservation, we deduce that the magnitudes of the relative velocities $|v - v_1|$ and $|v' - v'_1|$ are equal. Only the direction of the relative velocity changes from $e$ to a unit vector $\omega$ randomly distributed on the unit sphere. Formulas (1.7) and (1.8) then just express the reconstruction of the velocities from the center-of-mass velocity and the relative velocities.

The scattering angle $\theta = \cos^{-1}(\omega \cdot e)$ is the angle between the directions of the relative velocities. θ ranges in $[0, \pi]$. The plane defined by the two vectors $(e, \omega)$ is the collision plane. All planes containing the relative velocity $v - v_1$ are possible collision planes. All these planes can be located with respect to a reference plane by using a single angle $\varphi \in [0, 2\pi]$. The pair $(\theta, \varphi)$ defines a spherical coordinate system for the vector $\omega \in S^2$ and we have $d\omega = \sin\theta\, d\theta\, d\varphi$.
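The conservation properties behind the collision transform (1.7) can be verified on a single collision. The sketch below is an illustration under stated assumptions: the function names, the mass pair and the velocities are hypothetical, and the scattering direction is drawn uniformly on the sphere, which is a choice made for the test and says nothing about the cross-section.

```python
import math
import random


def collide(v, v1, m_a, m_b, omega):
    """Apply the collision transform (1.7): return (v', v1') for a unit vector omega."""
    total = m_a + m_b
    v_g = [(m_a * v[k] + m_b * v1[k]) / total for k in range(3)]   # center-of-mass velocity
    g = math.sqrt(sum((v[k] - v1[k]) ** 2 for k in range(3)))      # |v - v1|
    v_post = [v_g[k] + m_b / total * g * omega[k] for k in range(3)]
    v1_post = [v_g[k] - m_a / total * g * omega[k] for k in range(3)]
    return v_post, v1_post


def random_unit_vector(rng):
    """Draw a direction uniformly on the unit sphere (normalized Gaussian vector)."""
    while True:
        x = [rng.gauss(0.0, 1.0) for _ in range(3)]
        r = math.sqrt(sum(c * c for c in x))
        if r > 1e-12:
            return [c / r for c in x]


def momentum(v, v1, m_a, m_b):
    """Total momentum m_a v + m_b v1 of the pair."""
    return [m_a * v[k] + m_b * v1[k] for k in range(3)]


def energy(v, v1, m_a, m_b):
    """Twice the total kinetic energy, m_a |v|^2 + m_b |v1|^2."""
    return m_a * sum(c * c for c in v) + m_b * sum(c * c for c in v1)


rng = random.Random(0)
m_a, m_b = 1.0, 1836.0                     # illustrative light/heavy mass pair
v, v1 = [3.0, -1.0, 0.5], [0.1, 0.2, -0.3]
omega = random_unit_vector(rng)
v_post, v1_post = collide(v, v1, m_a, m_b, omega)

p_before, p_after = momentum(v, v1, m_a, m_b), momentum(v_post, v1_post, m_a, m_b)
e_before, e_after = energy(v, v1, m_a, m_b), energy(v_post, v1_post, m_a, m_b)
```

Up to rounding, `p_before` equals `p_after` and `e_before` equals `e_after` for any unit vector `omega`, which is the content of the two conservation relations stated above.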
The energy in the center-of-mass frame (or relative energy) is (up to the factor 2):

$$m_\alpha |v - v_G|^2 + m_\beta |v_1 - v_G|^2 = \mu_{\alpha\beta} |v - v_1|^2.$$

As in the case of the Fokker–Planck–Landau operator, we shall make the differential scattering cross-section $\sigma^B_{\alpha\beta}$ depend on the relative energy rather than on the relative velocity $|v - v_1|$ itself. $\sigma^B_{\alpha\beta}$ also depends on the scattering angle through $\cos\theta$, i.e. in general,

$$\sigma^B_{\alpha\beta} = \sigma^B_{\alpha\beta}(\mu_{\alpha\beta} |v - v_1|^2, \cos\theta).$$

The θ dependence of $\sigma^B_{\alpha\beta}$ describes the relative probability that a collision with a definite scattering angle occurs. It depends on the nature of the collision mechanism. Of course, we have $\sigma^B_{\alpha\beta} = \sigma^B_{\beta\alpha}$.

For hard sphere collisions, i.e. if the molecules can be modeled like billiard balls of radii $r_\alpha$ and $r_\beta$, respectively, elastically bouncing on each other, we have

$$\sigma^B_{\alpha\beta}(\mu_{\alpha\beta} |v - v_1|^2, \cos\theta) = \pi (r_\alpha + r_\beta)^2.$$

It is independent of $|v - v_1|$ and θ. For rarefied gas dynamics, a better model is obtained by letting $r_\alpha + r_\beta$ depend on $|v - v_1|$ in a specific fashion. This is called the variable hard sphere model (VHS). For electron-neutral or ion-neutral collisions, isotropic scattering cross sections are often considered. This also gives rise to scattering cross sections which only depend on the energy.

We now review the relation between the Boltzmann and the Fokker–Planck operators and in particular, between the scattering cross sections $\sigma^B_{\alpha\beta}$ and $\sigma^P_{\alpha\beta}$. This problem has been studied in Refs. [78–81]. The Fokker–Planck operator is relevant when the Boltzmann collision operator (1.6) is divergent at small scattering angles θ ∼ 0. This occurs when $\sigma^B_{\alpha\beta} \sim \theta^{-s}$ when θ → 0 with s ≥ 4. More precisely, we suppose that the following asymptotic relation holds

$$\sigma^B_{\alpha\beta}(\mu_{\alpha\beta} |v - v_1|^2, \cos\theta) = \frac{1}{\theta^s}\, \tilde\sigma^B_{\alpha\beta}(\mu_{\alpha\beta} |v - v_1|^2) + o\!\left(\frac{1}{\theta^s}\right) \quad \text{as } \theta \to 0, \tag{1.9}$$

with s ≥ 4, and this defines $\tilde\sigma^B_{\alpha\beta}$. We define a regularized Boltzmann operator $Q^{B\delta}_{\alpha\beta}$ by performing the integration with respect to θ in equation (1.6) in the interval $[\delta, \pi]$. In other words, we replace the cross section $\sigma^B_{\alpha\beta}$ by a cut-off cross section

$$\sigma^{B\delta}_{\alpha\beta} = \begin{cases} \sigma^B_{\alpha\beta} & \text{if } \theta \in [\delta, \pi], \\ 0 & \text{if } \theta \in [0, \delta]. \end{cases} \tag{1.10}$$

Then, as δ → 0 we have

$$Q^{B\delta}_{\alpha\beta} \sim Q^P_{\alpha\beta}, \tag{1.11}$$
where $Q^P_{\alpha\beta}$ is the Fokker–Planck–Landau operator (1.5) with scattering cross section given by

$$\sigma^P_{\alpha\beta} = \frac{\pi}{2}\, \Lambda_s(\delta)\, \tilde\sigma^B_{\alpha\beta}, \tag{1.12}$$

and where $\Lambda_s(\delta)$ is defined by

$$\Lambda_s(\delta) = \begin{cases} |\ln \delta| & \text{if } s = 4, \\[4pt] \dfrac{1}{s - 4}\, \dfrac{1}{\delta^{s-4}} & \text{if } s > 4. \end{cases} \tag{1.13}$$

This point is proved e.g. in Refs. [78–81]. The Coulomb scattering cross section is given by Refs. [69, 70]:

$$\sigma^B_{\alpha\beta}(\mu_{\alpha\beta} |v - v_1|^2, \cos\theta) = \left( \frac{C_{\alpha\beta}}{2 \mu_{\alpha\beta} |v - v_1|^2} \right)^2 \frac{1}{\sin^4(\theta/2)}, \qquad C_{\alpha\beta} = \frac{q_\alpha q_\beta}{4 \pi \varepsilon_0}, \tag{1.14}$$

where $\varepsilon_0$ is the vacuum permittivity. Therefore, it is of the form (1.9) with s = 4 and the corresponding Boltzmann integral (1.6) is divergent at θ = 0. According to (1.11), the relevant operator for Coulomb collisions is the Fokker–Planck operator with cross-section

$$\sigma^P_{\alpha\beta}(\mu_{\alpha\beta} |v - v_1|^2) = \frac{\pi}{2}\, |\ln \delta| \left( \frac{2 C_{\alpha\beta}}{\mu_{\alpha\beta} |v - v_1|^2} \right)^2. \tag{1.15}$$

The problem is now to assign a physical value to the parameter δ which, in practice, is very small. For that purpose, we return to the relation between the scattering angle θ and the impact parameter $b$, the distance of closest approach the two particles would have if they did not interact. For Coulomb collisions, θ is a monotonically decreasing function of $b$, given by

$$\tan(\theta/2) = \frac{C_{\alpha\beta}}{\mu_{\alpha\beta} |v - v_1|^2\, b}.$$

For small angles, $\tan(\theta/2) \sim \theta/2$ and we can write

$$\theta = \frac{2 C_{\alpha\beta}}{\mu_{\alpha\beta} |v - v_1|^2\, b}. \tag{1.16}$$
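The logarithmic divergence of the Coulomb cross-section at small angles can be made concrete by integrating the cut-off cross-section against the momentum-transfer weight (1 − cos θ) sin θ. The sketch below is an illustration, not the chapter's derivation: it uses the Rutherford form, in which the cross-section is proportional to 1/sin⁴(θ/2), it works in dimensionless units, and the function names and numerical values are hypothetical. For this form the integral has the closed expression 4π (C/E)² |ln sin(δ/2)|, where E stands for the relative energy, and this agrees with 4π |ln δ| (C/E)² to leading logarithmic order.

```python
import math


def rutherford_cross_section(theta, c_ab, rel_energy):
    """Coulomb cross section; rel_energy stands for mu_ab * |v - v1|^2."""
    return (c_ab / (2.0 * rel_energy * math.sin(theta / 2.0) ** 2)) ** 2


def momentum_transfer(delta, c_ab, rel_energy, n=20000):
    """Midpoint rule for 2*pi * int_delta^pi sigma (1 - cos t) sin t dt, on a log grid in t."""
    a, b = math.log(delta), math.log(math.pi)
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = math.exp(a + (k + 0.5) * h)
        sigma = rutherford_cross_section(t, c_ab, rel_energy)
        total += sigma * (1.0 - math.cos(t)) * math.sin(t) * t   # extra factor t from dt = t ds
    return 2.0 * math.pi * total * h


c_ab, rel_energy, delta = 1.0, 1.0, 1.0e-3      # dimensionless, illustrative values
numeric = momentum_transfer(delta, c_ab, rel_energy)
exact = 4.0 * math.pi * (c_ab / rel_energy) ** 2 * abs(math.log(math.sin(delta / 2.0)))
leading = 4.0 * math.pi * (c_ab / rel_energy) ** 2 * abs(math.log(delta))
```

The difference between `exact` and `leading` is a term of order ln 2 next to |ln δ|, which becomes negligible as δ → 0.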
Now, the cut-off angle δ is related through (1.16) to the screening length, i.e. the distance beyond which the Coulomb interaction vanishes. In a plasma, this screening (or Debye) length $\lambda_D$ is given by Refs. [69, 70]:

$$\frac{1}{\lambda_D^2} = \frac{1}{2} \left( \frac{q_\alpha^2 n_\alpha}{\varepsilon_0 k_B T_\alpha} + \frac{q_\beta^2 n_\beta}{\varepsilon_0 k_B T_\beta} \right), \tag{1.17}$$
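where $n_\alpha$ and $T_\alpha$ are the density and temperature of the α species and $k_B$ is the Boltzmann constant. The relative kinetic energy of the two particles is estimated by their average thermal energy, i.e.

$$\mu_{\alpha\beta} |v - v_1|^2 \sim \frac{1}{2} (k_B T_\alpha + k_B T_\beta).$$

Therefore, setting $b = \lambda_D$ in (1.16), the cut-off angle is defined by

$$\delta = \frac{2 C_{\alpha\beta}}{\frac{1}{2} (k_B T_\alpha + k_B T_\beta)\, \lambda_D}. \tag{1.18}$$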
The quantity $|\ln \delta|$ is often called the Coulomb logarithm. In practice, its value ranges from 1 to 20 [69, 70]. A constant δ is usually chosen, where estimates of the densities and temperatures of the species are used in place of their local values in (1.17) and (1.18).

In the remainder of this chapter, we shall illustrate our results by giving explicit formulas in the following two simplified cases:

(i) Boltzmann operators with constant cross sections $\sigma^B_{\alpha\beta}$ (hard-sphere like models).

(ii) Fokker–Planck operators with the Coulomb cross-section (1.15).
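To give a feeling for the orders of magnitude involved, the cut-off angle (1.18) and the Coulomb logarithm can be evaluated for a given screening length. The following is a minimal sketch: the plasma parameters and the screening length are hypothetical values chosen for illustration, the function names are assumptions, and the physical constants are the usual SI values.

```python
import math

EPS0 = 8.8541878128e-12      # vacuum permittivity [F/m]
K_B = 1.380649e-23           # Boltzmann constant [J/K]
Q_E = 1.602176634e-19        # elementary charge [C]


def coulomb_constant(q_a, q_b):
    """C_ab = q_a q_b / (4 pi eps0), as in (1.14)."""
    return q_a * q_b / (4.0 * math.pi * EPS0)


def cutoff_angle(q_a, q_b, temp_a, temp_b, screening_length):
    """Cut-off angle delta from (1.18); temperatures in kelvin, length in metres."""
    mean_thermal_energy = 0.5 * (K_B * temp_a + K_B * temp_b)
    return 2.0 * coulomb_constant(q_a, q_b) / (mean_thermal_energy * screening_length)


# Hypothetical plasma: T_e = T_i = 1.16e7 K (about 1 keV), screening length 2.35e-5 m.
delta = cutoff_angle(Q_E, Q_E, 1.16e7, 1.16e7, 2.35e-5)
coulomb_log = abs(math.log(delta))
```

Since δ enters only through its logarithm, even a rough estimate of the screening length and of the temperatures changes `coulomb_log` very little, which is why a constant value is commonly used.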
1.3 Moment Method and Conservation Laws for Gas Mixtures: Why it Cannot Apply to Plasmas

The numerical simulation of a gas or a plasma by means of the kinetic models (1.1), (1.2) or (1.3), (1.4) is very expensive, because it involves the discretization of the 6-dimensional phase space (plus the time). Therefore, it is often preferable to use more macroscopic models, describing the evolution of observables defined on the 3-dimensional physical space (such as density, mean velocity, mean energy and so on). It is possible to derive such macroscopic models from the kinetic ones by the moment method, which we are now going to introduce. For this purpose, we first need to investigate some properties of the collision operators.
1.3.1 Properties of the collision operators

The following properties are classical [12, 13, 67]:

(i) Mass conservation:

$$\int Q_{ee}\, dv = 0, \qquad \int Q_{ei}\, dv = 0, \qquad \int Q_{ie}\, dv = 0, \qquad \int Q_{ii}\, dv = 0. \tag{1.19}$$
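(ii) Momentum conservation:

$$\int Q_{ee}\, m_e v\, dv = 0, \qquad \int (Q_{ei}\, m_e v + Q_{ie}\, m_i v)\, dv = 0, \qquad \int Q_{ii}\, m_i v\, dv = 0. \tag{1.20}$$

(iii) Energy conservation:

$$\int Q_{ee}\, m_e |v|^2\, dv = 0, \qquad \int (Q_{ei}\, m_e |v|^2 + Q_{ie}\, m_i |v|^2)\, dv = 0, \qquad \int Q_{ii}\, m_i |v|^2\, dv = 0. \tag{1.21}$$

(iv) Entropy inequalities:

$$H_e(f_e) := \int Q_{ee}(f_e, f_e) \ln f_e\, dv \le 0, \tag{1.22}$$

$$H_{ei}(f_e, f_i) := \int \left( Q_{ei}(f_e, f_i) \ln f_e + Q_{ie}(f_i, f_e) \ln f_i \right) dv \le 0, \tag{1.23}$$

$$H_i(f_i) := \int Q_{ii}(f_i, f_i) \ln f_i\, dv \le 0. \tag{1.24}$$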
The quantities $H_e(f_e)$, $H_i(f_i)$ and $H_{ei}(f_e, f_i)$ are called the entropy dissipation rates of the electron–electron, ion–ion and electron–ion collisions, respectively. That these quantities are non-positive was the key observation made by Boltzmann and explains why large mechanical systems are time irreversible while point mechanics itself is reversible. As a consequence of the entropy inequality, we can find those distribution functions which cancel the collision operators. We have the following property:

(v) Thermal equilibria: Let $f_e$ be such that $Q_{ee}(f_e, f_e) = 0$. Then, $f_e$ is a Maxwellian:

$$M_{n_e, u_e, T_e}(v) = \frac{n_e}{\left( \dfrac{2 \pi k_B T_e}{m_e} \right)^{3/2}} \exp\left( - \frac{m_e |v - u_e|^2}{2 k_B T_e} \right), \tag{1.25}$$

with $n_e > 0$ to maintain the positivity of the distribution function and $T_e > 0$ to guarantee its integrability with respect to $v$. The density $n_e$, mean velocity $u_e$ and temperature $T_e$ are such that

$$\int M_{n_e, u_e, T_e}(v) \begin{pmatrix} 1 \\ m_e v \\ m_e |v|^2 \end{pmatrix} dv = \begin{pmatrix} n_e \\ m_e n_e u_e \\ m_e n_e |u_e|^2 + 3 n_e k_B T_e \end{pmatrix}. \tag{1.26}$$

The quantities $m_e n_e u_e$ and $W_e = \frac{1}{2} m_e n_e |u_e|^2 + \frac{3}{2} n_e k_B T_e$ are the momentum and energy densities of the Maxwellian and $k_B$ is the Boltzmann constant. We have the same property for the ions: if the ion distribution function $f_i$ is such that $Q_{ii}(f_i, f_i) = 0$, then it is necessarily a Maxwellian $M_{n_i, u_i, T_i}$ with density $n_i$, average velocity $u_i$ and temperature $T_i$.
Finally, let a pair $f_e$, $f_i$ be such that both $Q_{ee}(f_e, f_e) + Q_{ei}(f_e, f_i) = 0$ and $Q_{ie}(f_i, f_e) + Q_{ii}(f_i, f_i) = 0$. Then $f_e$ and $f_i$ are two Maxwellians with the same average velocity and the same temperature: $f_e = M_{n_e, u, T}$, $f_i = M_{n_i, u, T}$, where $n_e$ and $n_i$ are the electron and ion densities and $u$ and $T$ are the common mean velocity and temperature.
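The moments of a Maxwellian can be checked by sampling, since a Maxwellian divided by its density is a Gaussian with mean equal to the mean velocity and variance k_B T/m per velocity component. The sketch below is a Monte Carlo illustration in dimensionless units; the parameter values, the sample size and the names are assumptions made for the example. It compares the sampled moments of 1, m v and m|v|² with n, m n u and m n |u|² + 3 n k_B T.

```python
import math
import random


def sample_maxwellian(n_samples, u, kT_over_m, rng):
    """Draw velocities from a normalized Maxwellian: Gaussian, mean u, variance kT/m per axis."""
    sd = math.sqrt(kT_over_m)
    return [[rng.gauss(u[k], sd) for k in range(3)] for _ in range(n_samples)]


# Dimensionless, illustrative parameters (not taken from the text).
n_e, m_e, kT_e = 2.0, 1.0, 0.5
u_e = [1.0, 0.0, -0.5]

rng = random.Random(1)
samples = sample_maxwellian(100_000, u_e, kT_e / m_e, rng)
weight = n_e / len(samples)                      # each sample carries density n_e / N

density = weight * len(samples)
momentum = [weight * m_e * sum(v[k] for v in samples) for k in range(3)]
energy2 = weight * m_e * sum(v[0] ** 2 + v[1] ** 2 + v[2] ** 2 for v in samples)

# Expected values of the three moments for the Maxwellian
expected_momentum = [m_e * n_e * c for c in u_e]
expected_energy2 = m_e * n_e * sum(c * c for c in u_e) + 3.0 * n_e * kT_e
```

The sampled values agree with the expected ones up to a statistical error that decreases like the inverse square root of the number of samples.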
1.3.2 Moments and conservation laws

Macroscopic quantities such as the number, momentum and energy densities can be constructed from integrals of the distribution function with respect to the velocity. We call such quantities "moments". The density $n_\alpha$, mean velocity $u_\alpha$ and energy density $W_\alpha$ of the species α can be computed from the α-species distribution function $f_\alpha$ according to

$$\begin{pmatrix} n_\alpha \\ m_\alpha n_\alpha u_\alpha \\ W_\alpha \end{pmatrix} = \int f_\alpha(v) \begin{pmatrix} 1 \\ m_\alpha v \\ \dfrac{m_\alpha |v|^2}{2} \end{pmatrix} dv. \tag{1.27}$$

The quantity $m_\alpha n_\alpha u_\alpha$ is the α-species momentum density. The energy density $W_\alpha$ can be split into drift energy $\frac{1}{2} m_\alpha n_\alpha |u_\alpha|^2$ and thermal energy $w_\alpha$ by

$$W_\alpha = \frac{1}{2} m_\alpha n_\alpha |u_\alpha|^2 + w_\alpha, \qquad w_\alpha = \frac{1}{2} \int f_\alpha\, m_\alpha |v - u_\alpha|^2\, dv.$$

Evolution equations for these quantities can be obtained from the kinetic models (1.1), (1.2) or (1.3), (1.4) by integration with respect to $v$, after multiplication by the appropriate function of $v$. Using the mass conservation property of all collision operators (1.19), and the momentum and energy conservation properties of the like-particle collision operators (1.20), (1.21), we get the following system of balance laws for the electrons:

$$\partial_t n_e + \nabla_x \cdot (n_e u_e) = 0, \tag{1.28}$$

$$\partial_t (m_e n_e u_e) + \nabla_x \left( \int_v f_e\, m_e\, v \otimes v\, dv \right) = n_e F_e + S_{ei}, \tag{1.29}$$

$$\partial_t W_e + \nabla_x \cdot \left( \int_v f_e\, \frac{m_e |v|^2}{2}\, v\, dv \right) = n_e F_e \cdot u_e + U_{ei}, \tag{1.30}$$
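and similarly for the ions

$$\partial_t n_i + \nabla_x \cdot (n_i u_i) = 0, \tag{1.31}$$

$$\partial_t (m_i n_i u_i) + \nabla_x \left( \int_v f_i\, m_i\, v \otimes v\, dv \right) = n_i F_i + S_{ie}, \tag{1.32}$$

$$\partial_t W_i + \nabla_x \cdot \left( \int_v f_i\, \frac{m_i |v|^2}{2}\, v\, dv \right) = n_i F_i \cdot u_i + U_{ie}, \tag{1.33}$$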
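where $S_{\alpha\beta}$ and $U_{\alpha\beta}$ are the momentum and energy transfer rates toward species α from species β:

$$\begin{pmatrix} S_{\alpha\beta} \\ U_{\alpha\beta} \end{pmatrix} = \int Q_{\alpha\beta}(f_\alpha, f_\beta)(v) \begin{pmatrix} m_\alpha v \\ \dfrac{m_\alpha |v|^2}{2} \end{pmatrix} dv. \tag{1.34}$$

We note that, because of the momentum and energy conservation properties of the unlike-collision operators (1.20), (1.21), we have

$$S_{ei} + S_{ie} = 0, \qquad U_{ei} + U_{ie} = 0. \tag{1.35}$$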
Indeed, since the mixture as a whole is isolated as far as collisions are concerned, total momentum and total energy must be preserved.

This system consists of balance laws for the densities (1.28), (1.31), momenta (1.29), (1.32) and energies (1.30), (1.33). However, the integrals inside the space derivatives cannot be expressed in closed form as functions of the primary variables $(n_e, u_e, W_e)$ and $(n_i, u_i, W_i)$, unless appropriate expressions for the distribution functions $f_e$ and $f_i$ are given. A similar remark can be made for the transfer rates $S_{\alpha\beta}$ and $U_{\alpha\beta}$. This is the closure problem for the macroscopic equations. It is customary to separate the drift motion (defined by the average velocity $u_\alpha$) and the random kinetic motion (defined by the velocity $v - u_\alpha$) in evaluating these integrals:

$$\int_v f_\alpha\, m_\alpha\, v \otimes v\, dv = m_\alpha n_\alpha\, u_\alpha \otimes u_\alpha + P_\alpha, \tag{1.36}$$

$$P_\alpha = \int_v f_\alpha\, m_\alpha\, (v - u_\alpha) \otimes (v - u_\alpha)\, dv, \tag{1.37}$$

$$\int_v f_\alpha\, \frac{m_\alpha |v|^2}{2}\, v\, dv = W_\alpha u_\alpha + P_\alpha u_\alpha + Q_\alpha, \tag{1.38}$$

$$Q_\alpha = \int_v f_\alpha\, \frac{m_\alpha |v - u_\alpha|^2}{2}\, (v - u_\alpha)\, dv. \tag{1.39}$$
Pα is called the pressure tensor and Qα , the heat flux vector of the α-species.
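The splitting of the second and third velocity moments into drift and random parts is an algebraic identity, so it holds for any distribution, including a weighted cloud of sample velocities. The sketch below checks it on arbitrary (non-Maxwellian) samples; the function name `moments`, the weights and the mass value are illustrative assumptions.

```python
import random


def moments(velocities, weights, mass):
    """Discrete analogue of density, mean velocity, energy, pressure tensor and heat flux."""
    n = sum(weights)
    u = [sum(w * v[k] for v, w in zip(velocities, weights)) / n for k in range(3)]
    W = sum(w * 0.5 * mass * sum(c * c for c in v) for v, w in zip(velocities, weights))
    P = [[0.0] * 3 for _ in range(3)]
    Q = [0.0] * 3
    for v, w in zip(velocities, weights):
        c = [v[k] - u[k] for k in range(3)]          # random (peculiar) velocity v - u
        c2 = sum(x * x for x in c)
        for i in range(3):
            Q[i] += w * 0.5 * mass * c2 * c[i]
            for j in range(3):
                P[i][j] += w * mass * c[i] * c[j]
    return n, u, W, P, Q


rng = random.Random(2)
vel = [[rng.uniform(-1.0, 2.0) for _ in range(3)] for _ in range(500)]   # arbitrary samples
wts = [rng.uniform(0.5, 1.5) for _ in vel]
mass = 1.7
n, u, W, P, Q = moments(vel, wts, mass)

# Full second and third moments, computed directly from the samples
flux2 = [[sum(w * mass * v[i] * v[j] for v, w in zip(vel, wts)) for j in range(3)]
         for i in range(3)]
flux3 = [sum(w * 0.5 * mass * sum(c * c for c in v) * v[i] for v, w in zip(vel, wts))
         for i in range(3)]

# Drift part plus random part: m n u (x) u + P, and W u + P u + Q
rhs2 = [[mass * n * u[i] * u[j] + P[i][j] for j in range(3)] for i in range(3)]
rhs3 = [W * u[i] + sum(P[i][j] * u[j] for j in range(3)) + Q[i] for i in range(3)]
```

Up to rounding, `flux2` equals `rhs2` and `flux3` equals `rhs3`. The closure problem is therefore entirely contained in expressing the pressure tensor and the heat flux in terms of the primary variables.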
1.3.3 Closure of the moment system: LTE

To close the system of balance laws (1.28)–(1.33), we need to find an expression for the pressure tensors $P_\alpha$ and heat flux vectors $Q_\alpha$. To this aim, we need to specify a form of the distribution functions depending on the primary variables $(n_\alpha, u_\alpha, W_\alpha)$. This cannot be done in complete generality, but requires specifying a regime in which this approximate form of the distribution function is valid. The most important case is that of local thermodynamic equilibrium (LTE). In this regime, we assume that the distribution function is close to a Maxwellian (1.25), whose density, mean velocity and temperature may depend on space and time. This approximation
is justified when the size of the system and the time scale of interest are large compared with the typical space and time scales of collisions, i.e. the mean free path and the mean time between collisions. To make this statement clearer, we return to the kinetic equations (1.1), (1.2) or (1.3), (1.4) and suppose that we make the following change of space and time coordinates

$$x' = \varepsilon x, \qquad t' = \varepsilon t, \qquad (F_e, F_i) = \varepsilon\, (F'_e, F'_i), \tag{1.40}$$

where $\varepsilon \ll 1$ is a small parameter. This amounts to changing the units for space and time to larger units by a factor $\varepsilon^{-1}$. In practice, ε is set to the ratio of the collision mean free path to the size of the system under consideration. In rarefied gas dynamics, ε is called the Knudsen number. The scaling of the force corresponds to assuming that the momentum gained during the free flight between two collisions is also of order ε. Performing the change of scale (1.40) in (1.1), (1.2) or (1.3), (1.4), and dropping the primes, we are led to the following system:

$$\partial_t f_e^\varepsilon + v \cdot \nabla_x f_e^\varepsilon + \frac{1}{m_e} F_e \cdot \nabla_v f_e^\varepsilon = \frac{1}{\varepsilon} \left( Q_{ee}(f_e^\varepsilon, f_e^\varepsilon) + Q_{ei}(f_e^\varepsilon, f_i^\varepsilon) \right), \tag{1.41}$$

$$\partial_t f_i^\varepsilon + v \cdot \nabla_x f_i^\varepsilon + \frac{1}{m_i} F_i \cdot \nabla_v f_i^\varepsilon = \frac{1}{\varepsilon} \left( Q_{ie}(f_i^\varepsilon, f_e^\varepsilon) + Q_{ii}(f_i^\varepsilon, f_i^\varepsilon) \right). \tag{1.42}$$
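The corresponding moment equations are deduced from (1.28)–(1.33) by using the same change of scale (1.40). This leads to

$$\partial_t n_e^\varepsilon + \nabla_x \cdot (n_e^\varepsilon u_e^\varepsilon) = 0, \tag{1.43}$$

$$\partial_t (m_e n_e^\varepsilon u_e^\varepsilon) + \nabla_x (m_e n_e^\varepsilon\, u_e^\varepsilon \otimes u_e^\varepsilon) + \nabla_x P_e^\varepsilon = n_e^\varepsilon F_e + \frac{1}{\varepsilon} S_{ei}^\varepsilon, \tag{1.44}$$

$$\partial_t W_e^\varepsilon + \nabla_x \cdot (W_e^\varepsilon u_e^\varepsilon + P_e^\varepsilon u_e^\varepsilon + Q_e^\varepsilon) = n_e^\varepsilon F_e \cdot u_e^\varepsilon + \frac{1}{\varepsilon} U_{ei}^\varepsilon, \tag{1.45}$$

and similarly for the ions

$$\partial_t n_i^\varepsilon + \nabla_x \cdot (n_i^\varepsilon u_i^\varepsilon) = 0, \tag{1.46}$$

$$\partial_t (m_i n_i^\varepsilon u_i^\varepsilon) + \nabla_x (m_i n_i^\varepsilon\, u_i^\varepsilon \otimes u_i^\varepsilon) + \nabla_x P_i^\varepsilon = n_i^\varepsilon F_i + \frac{1}{\varepsilon} S_{ie}^\varepsilon, \tag{1.47}$$

$$\partial_t W_i^\varepsilon + \nabla_x \cdot (W_i^\varepsilon u_i^\varepsilon + P_i^\varepsilon u_i^\varepsilon + Q_i^\varepsilon) = n_i^\varepsilon F_i \cdot u_i^\varepsilon + \frac{1}{\varepsilon} U_{ie}^\varepsilon, \tag{1.48}$$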
where we have denoted with a superscript ε the solutions of these equations. The macroscopic limit corresponds to letting ε → 0 in these systems. From (1.41) to (1.42), we see that feε , fiε formally converge as ε → 0 to functions fe , fi such that Qee ( fe , fe ) + Qei ( fe , fi ) = 0 and Qie ( fi , fe ) + Qii ( fi , fi ) = 0. Then, from
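Section 1.3.1, property (v), we know that $f_e$, $f_i$ are Maxwellians with the same mean velocity and the same temperature, i.e.:

$$f_e = M_{n_e, u, T}, \qquad f_i = M_{n_i, u, T}. \tag{1.49}$$

Additionally, we obviously have that

$$n_e^\varepsilon \to n_e, \qquad n_i^\varepsilon \to n_i, \qquad u_e^\varepsilon \to u, \qquad u_i^\varepsilon \to u. \tag{1.50}$$

Using (1.26), we also see that

$$W_e^\varepsilon \to W_e := n_e \left( m_e \frac{|u|^2}{2} + \frac{3}{2} k_B T \right), \qquad W_i^\varepsilon \to W_i := n_i \left( m_i \frac{|u|^2}{2} + \frac{3}{2} k_B T \right). \tag{1.51}$$

Finally, a simple computation gives

$$P_e^\varepsilon \to P_e = n_e k_B T\, \mathrm{Id} = p_e\, \mathrm{Id}, \qquad P_i^\varepsilon \to P_i := n_i k_B T\, \mathrm{Id} = p_i\, \mathrm{Id}, \tag{1.52}$$

$$Q_e^\varepsilon \to 0, \qquad Q_i^\varepsilon \to 0, \tag{1.53}$$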
where $p_e$ and $p_i$ are called the scalar pressures. The macroscopic limit ε → 0 provides us with four independent quantities, $n_e$, $n_i$, $u$, and $T$ (three scalars and one vector) while there are six independent moment equations (1.43)–(1.48) for finite ε. Therefore, these six equations must be combined in such a way that only four equations remain in the limit. This is the place to make use of the total momentum and energy conservations (1.35). Adding up the two momentum balance equations and the two energy balance equations, so that the singular terms in $1/\varepsilon$ cancel, letting ε → 0 and using (1.50)–(1.53), we are led to

$$\partial_t n_e + \nabla_x \cdot (n_e u) = 0, \tag{1.54}$$

$$\partial_t n_i + \nabla_x \cdot (n_i u) = 0, \tag{1.55}$$

$$\partial_t (\rho u) + \nabla_x (\rho\, u \otimes u) + \nabla_x p = n_e F_e + n_i F_i, \tag{1.56}$$

$$\partial_t W + \nabla_x \cdot ((W + p) u) = (n_e F_e + n_i F_i) \cdot u, \tag{1.57}$$

with ρ the mass density, $p$ the total pressure and $W$ the total energy given by

$$\rho = m_e n_e + m_i n_i, \tag{1.58}$$

$$p = p_e + p_i = (n_e + n_i) k_B T, \tag{1.59}$$

$$W = W_e + W_i = \rho \frac{|u|^2}{2} + \frac{3}{2} p. \tag{1.60}$$
This is the final macroscopic model for the mixture derived under the scaling limit ε → 0. It consists of as many density equations as components in the mixture, but only one velocity and one temperature for the whole mixture. The total energy and total pressure of the gas are the sums of those of its components. The relation
between the internal energy and the pressure of each component (as following from equation (1.60)) is the perfect gas equation-of-state. By performing an expansion in powers of ε, it is possible to obtain diffusive corrections to this model. In addition to the usual viscosity and heat conductivity terms present in the standard Navier–Stokes equations, these higher-order models contain specific terms resulting from the small deviations of the velocity and temperature of each component from the mean velocity and temperature of the mixture [29]. We are not going to pursue this direction because one of the main specificities of plasmas, namely the fact that the two species have very different masses, has not been taken into account in this model. We now discuss this specific point and our strategy to improve the approach.
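The algebraic closure relations of the single-velocity, single-temperature mixture model are simple enough to be evaluated directly. The sketch below computes the mass density, total pressure and total energy from (1.58)–(1.60) and checks that the total energy is the sum of the species energies of the form n (m |u|²/2 + 3 k_B T/2). It uses dimensionless, illustrative values, and the function names are assumptions.

```python
def mixture_state(n_e, n_i, u, kT, m_e, m_i):
    """Return (rho, p, W) from (1.58)-(1.60) for a common velocity u and temperature kT = k_B T."""
    u2 = sum(c * c for c in u)
    rho = m_e * n_e + m_i * n_i
    p = (n_e + n_i) * kT
    W = 0.5 * rho * u2 + 1.5 * p
    return rho, p, W


def species_energy(n, m, u, kT):
    """Energy density of one species: drift energy plus thermal energy."""
    return n * (0.5 * m * sum(c * c for c in u) + 1.5 * kT)


n_e, n_i, m_e, m_i = 1.0, 1.0, 1.0, 1836.0   # dimensionless, illustrative values
u, kT = [1.0, 0.0, 0.0], 2.0
rho, p, W = mixture_state(n_e, n_i, u, kT, m_e, m_i)
W_sum = species_energy(n_e, m_e, u, kT) + species_energy(n_i, m_i, u, kT)
```

With these inputs the drift energy is carried almost entirely by the heavy species, while the thermal energy is shared equally, which already hints at the role played by the mass ratio in what follows.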
1.3.4 Why the standard mixture model does not apply to plasmas

In plasmas, electrons have a much smaller mass than ions. Because of this very small mass, electron–ion collisions are very efficient at relaxing the electron velocity toward the ion one, but very inefficient at relaxing the temperatures of the two species toward each other. The reason is that collisions between two particles of very different masses result in large momentum changes for the light particle (but very small momentum changes for the heavy one), and almost no energy transfers between the particles. Therefore, two typical collision time scales occur, respectively related with momentum and energy transfer rates. For this reason, in plasmas or disparate mass binary gas mixtures, the collision time scales for the two species of particles are not of the same order of magnitude. More specifically, the light species collision time scale is smaller by a factor $\sqrt{m_e/m_i}$ than that of the heavy species. These considerations will be detailed below (see Section 1.5). Therefore, the heavy species collision time scale is already a macroscopic time scale for the light species. At this scale, it can be shown that the light species can be described by a set of macroscopic equations coupled with a kinetic equation for the heavy species. This model has been developed in Ref. [2]. In the present chapter, however, we shall rather consider a different scaling, where both species can be described by macroscopic equations. The time scale that we shall be interested in is a macroscopic time scale relative to the heavy species, which corresponds to an even longer time scale, the diffusion time scale, for the light species. Therefore, this approach will give rise to a system of hydrodynamic equations for the ions (or the heavy species) coupled with diffusion equations for the electrons (or the light species).
As a by-product, this separation of scales between the light and heavy species opens the possibility of constructing macroscopic models where the velocities and temperatures of the two species are different. Therefore, the dynamics of a plasma or a disparate mass binary mixture leads to a more complex physics than that of ordinary mixtures. To achieve our goal, we have to go back to the kinetic level and insert the relevant scaling of the masses into the kinetic equations. The best way to perform
this scaling in a systematic way is to use dimensionless variables. We shall perform this task in Section 1.5. In Section 1.4 below, we summarize the plasma fluid model obtained from this scaling.
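The size of the small parameter at stake, the square root of the mass ratio, is easy to evaluate for concrete species. The sketch below does so for an electron–proton plasma and, as a second purely illustrative case, for electrons and argon ions (with the ion mass approximated by the atomic mass). The constant and function names are assumptions; the masses are the usual SI values.

```python
import math

M_ELECTRON = 9.1093837015e-31        # electron mass [kg]
M_PROTON = 1.67262192369e-27         # proton mass [kg]
ATOMIC_MASS_UNIT = 1.66053906660e-27 # atomic mass unit [kg]


def mass_ratio_parameter(m_light, m_heavy):
    """epsilon = sqrt(m_light / m_heavy), the square root of the mass ratio."""
    return math.sqrt(m_light / m_heavy)


eps_hydrogen = mass_ratio_parameter(M_ELECTRON, M_PROTON)
eps_argon = mass_ratio_parameter(M_ELECTRON, 39.948 * ATOMIC_MASS_UNIT)

# With comparable temperatures, thermal speeds scale like sqrt(k_B T / m),
# so the electron-to-ion thermal speed ratio is 1 / epsilon.
speed_ratio_hydrogen = 1.0 / eps_hydrogen
```

For hydrogen the parameter is a few percent, and it is smaller still for heavier ions, which supports treating it as a small expansion parameter.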
1.4 The Plasma Fluid Model

We now present the plasma fluid model, as it is obtained when we perform a moment method and an LTE closure respecting the fact that the electron mass is much smaller than the ion mass. This model is obtained under the assumptions that the density and temperature scales of the two species of particles are of the same order of magnitude, and therefore, that the electron kinetic velocity scale is larger than that of the ions by a factor equal to $\sqrt{m_i/m_e}$. There are two different expressions of the model. The first one uses a formulation of the electron fluid system in terms of two conservation equations. This form will be referred to as the "energy-transport" form of the system, referring to a terminology used in semiconductor physics (see e.g. Refs. [4, 47]). The second one uses three balance equations, and will be referred to as the hydrodynamic form of the system.
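1.4.1 Energy-transport form of the system

The system takes the following form of two balance equations for the electron density and energy:

$$\partial_t n_e + \nabla_x \cdot j_{n_e} = 0, \tag{1.61}$$

$$\partial_t W_e + \nabla_x \cdot j_{W_e} + q_e E \cdot j_{n_e} = U_{ei}, \tag{1.62}$$

coupled with a system of hydrodynamic equations for the ions

$$\partial_t n_i + \nabla_x \cdot (n_i u_i) = 0, \tag{1.63}$$

$$m_i \left( \partial_t (n_i u_i) + \nabla_x (n_i\, u_i \otimes u_i) \right) + \nabla_x p_i - q_i n_i (E + u_i \times B) = S_{ie}, \tag{1.64}$$

$$\partial_t W_i + \nabla_x \cdot (W_i u_i + p_i u_i) - q_i n_i E \cdot u_i = U_{ie}, \tag{1.65}$$

with

$$W_e = \frac{3}{2} n_e k_B T_e, \qquad p_e = n_e k_B T_e, \qquad W_i = \frac{1}{2} m_i n_i |u_i|^2 + \frac{3}{2} n_i k_B T_i, \qquad p_i = n_i k_B T_i. \tag{1.66}$$

The electron density and energy fluxes are given by

$$\begin{pmatrix} j_{n_e} \\ j_{W_e} \end{pmatrix} = \begin{pmatrix} n_e u_i \\ \dfrac{5}{2} n_e k_B T_e\, u_i \end{pmatrix} - \mathbb{L}_B \begin{pmatrix} \nabla_x \left( \dfrac{\mu_e}{k_B T_e} \right) + q_e \dfrac{E + u_i \times B}{k_B T_e} \\[8pt] \nabla_x \left( - \dfrac{1}{k_B T_e} \right) \end{pmatrix}, \tag{1.67}$$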
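where $\mu_e = \ln n_e - \frac{3}{2} \ln(k_B T_e) + \mathrm{Cst}$ and $\mathbb{L}_B$ is a 6 × 6 matrix defined by its 3 × 3 blocks $\{(\mathbb{L}_B)_{k\ell}\}_{k,\ell = 1, 2}$:

$$(\mathbb{L}_B)_{k\ell} = - \int \psi_k \otimes \varphi_\ell\, \frac{dv}{M_e} = - \int \psi_k \otimes (L_B^{-1} \psi_\ell)\, \frac{dv}{M_e} = - \int (L_B \varphi_k) \otimes \varphi_\ell\, \frac{dv}{M_e}, \tag{1.68}$$

with $\psi_1(v) = v M_e$, $\psi_2(v) = m_e (|v|^2/2)\, v M_e$, $\varphi_k = L_B^{-1}(\psi_k)$, $k = 1, 2$, and where $M_e = M_{n_e, 0, T_e}$ is the electron Maxwellian (1.25) with zero mean velocity. In (1.68), $L_B$ is the following operator:

$$L_B \varphi := \frac{q_e}{m_e} (v \times B) \cdot \nabla_v \varphi + 2 Q_{ee}(M_e, \varphi) + Q^0_{ei}(\varphi, M_i), \tag{1.69}$$

where $2 Q_{ee}(M_e, \varphi)$ is the linearization of the electron–electron collision operator about the electron Maxwellian $M_e$, while $Q^0_{ei}(\varphi, M_i)$ is the leading order (when $\varepsilon = \sqrt{m_e/m_i}$ is small) of the electron–ion collision operator $Q_{ei}$ and $M_i = M_{n_i, u_i, T_i}$ is the ion Maxwellian. These operators are given in the Boltzmann case by

$$2 Q^B_{ee}(M_e, \varphi)(v) = \int\!\!\int \sigma^B_{ee}\!\left( \frac{m_e}{2} |v - v_1|^2, \cos\theta \right) |v - v_1| \left( M'_e \varphi'_1 + \varphi' (M_e)'_1 - M_e \varphi_1 - \varphi (M_e)_1 \right) dv_1\, d\omega, \tag{1.70}$$

$$Q^{B0}_{ei}(\varphi, M_i)(v) = n_i \int_{S^2} |v|\, \sigma^B_{ei}(m_e |v|^2, \cos\theta) \left( \varphi(|v| \omega) - \varphi(v) \right) d\omega, \tag{1.71}$$

and in the Fokker–Planck case, by:

$$2 Q^P_{ee}(M_e, \varphi)(v) = \frac{1}{4} \nabla_v \cdot \int \sigma^P_{ee}\!\left( \frac{m_e}{2} |v - v_1|^2 \right) |v - v_1|^3\, S(v - v_1) \left( (M_e)_1 (\nabla_v \varphi) + \varphi_1 (\nabla_v M_e) - M_e (\nabla_v \varphi)_1 - \varphi (\nabla_v M_e)_1 \right) dv_1, \tag{1.72}$$

$$Q^{P0}_{ei}(\varphi, M_i)(v) = n_i \nabla_v \cdot \left( |v|^3 \sigma^P_{ei}(m_e |v|^2)\, S(v) \nabla_v \varphi \right). \tag{1.73}$$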
It can be shown (see Section 1.8.2 and [82]) that the equation $L_B \varphi = \psi$ is solvable provided that ψ satisfies the following solvability conditions: $\int \psi\, dv = 0$ and $\int \psi |v|^2\, dv = 0$. Moreover, the solution φ is unique provided that we impose the constraints $\int \varphi\, dv = 0$ and $\int \varphi |v|^2\, dv = 0$. Under the solvability conditions, this unique solution φ is denoted $(L_B)^{-1} \psi$. In particular, $\psi_1$ and $\psi_2$ satisfy the solvability conditions, which uniquely defines $\varphi_1$ and $\varphi_2$. Since $\psi_1$ and $\psi_2$ have values in $\mathbb{R}^3$, so have $\varphi_1$ and $\varphi_2$, and the operator $(L_B)^{-1}$ is applied componentwise. In the absence of a magnetic field ($B = 0$), the operator $L_0$ is rotationally invariant, and the matrices $(\mathbb{L}_0)_{k\ell}$ are scalar matrices $(\mathbb{L}_0)_{k\ell} = (L_0)_{k\ell}\, \mathrm{Id}$ with $(L_0)_{k\ell}$ a
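real number. However, if $B \neq 0$, $L_B$ is not rotationally invariant and the $(\mathbb{L}_B)_{k\ell}$ are non-scalar 3 × 3 tensors. We show in Section 1.8.4 that the tensor $\mathbb{L}_B$ is positive definite and satisfies Onsager's symmetry relation

$$\mathbb{L}_B^* = \mathbb{L}_{-B}, \tag{1.74}$$

where $\mathbb{L}_B^*$ is the transpose of $\mathbb{L}_B$. Finally, the interaction terms are given by

$$S_{ei} = -S_{ie} = \nabla_x p_e + q_e n_e (E + u_e \times B), \tag{1.75}$$

$$U_{ei} = u_i \cdot S_{ei} - \nu_{ei,W}\, n_e (k_B T_e - k_B T_i) = -U_{ie} = -\left( u_i \cdot S_{ie} - \nu_{ie,W}\, n_i (k_B T_i - k_B T_e) \right), \tag{1.76}$$

with

$$\nu_{ei,W} = \frac{m_e}{m_i}\, n_i \sqrt{\frac{k_B T_e}{m_e}}\; \bar\sigma_{ei,u}(T_e), \tag{1.77}$$

$$\nu_{ie,W} = \frac{m_e}{m_i}\, n_e \sqrt{\frac{k_B T_e}{m_e}}\; \bar\sigma_{ei,u}(T_e), \tag{1.78}$$

$$\bar\sigma_{ei,u}(T_e) = \int_{\mathbb{R}^3} |w|^3\, \sigma_{ei,u}(k_B T_e |w|^2)\, e^{-|w|^2/2}\, \frac{dw}{(2\pi)^{3/2}}, \tag{1.79}$$

where the cross-section for momentum exchange $\sigma_{ei,u}$ is given by

$$\sigma_{ei,u} = \sigma^B_{ei,u} = 2\pi \int_0^\pi \sigma^B_{ei}(m_e |v|^2, \cos\theta)\, (1 - \cos\theta) \sin\theta\, d\theta, \tag{1.80}$$

in the Boltzmann case, and

$$\sigma_{ei,u} = \sigma^P_{ei,u} = 2 \sigma^P_{ei}(m_e |v|^2), \tag{1.81}$$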
in the Fokker–Planck case, and $\bar\sigma_{ei,u}$ is an average of $\sigma_{ei,u}$ weighted by the Maxwellian distribution function. We can give some explicit formulas in the Boltzmann case for hard spheres ($\sigma^B_{ei}$ = constant), and in the Fokker–Planck case, for Coulomb collisions. We have:

(i) Boltzmann case (hard-sphere cross section)

$$\sigma^B_{ei,u} = 4\pi \sigma^B_{ei}, \qquad \bar\sigma_{ei,u}(T_e) = 32 \sqrt{2\pi}\, \sigma^B_{ei}, \tag{1.82}$$

$$\nu_{ei,W} = 32 \sqrt{2\pi}\, \sigma^B_{ei}\, n_i\, \frac{m_e}{m_i} \sqrt{\frac{k_B T_e}{m_e}}, \qquad \nu_{ie,W} = 32 \sqrt{2\pi}\, \sigma^B_{ei}\, n_e\, \frac{m_e}{m_i} \sqrt{\frac{k_B T_e}{m_e}}. \tag{1.83}$$
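(ii) Fokker–Planck case (Coulomb cross section)

$$\sigma^P_{ei,u}(m_e |v|^2) = 4\pi |\ln \delta| \left( \frac{C_{ei}}{m_e |v|^2} \right)^2, \tag{1.84}$$

$$\bar\sigma_{ei,u}(T_e) = 4 \sqrt{2\pi}\, |\ln \delta| \left( \frac{C_{ei}}{k_B T_e} \right)^2, \tag{1.85}$$

$$\nu_{ei,W} = 4 \sqrt{2\pi}\, |\ln \delta| \left( \frac{C_{ei}}{k_B T_e} \right)^2 \frac{m_e}{m_i}\, n_i \sqrt{\frac{k_B T_e}{m_e}}, \qquad \nu_{ie,W} = 4 \sqrt{2\pi}\, |\ln \delta| \left( \frac{C_{ei}}{k_B T_e} \right)^2 \frac{m_e}{m_i}\, n_e \sqrt{\frac{k_B T_e}{m_e}}, \tag{1.86}$$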
where Cei and the Coulomb logarithm | ln δ| are given by equations (1.14) and (1.18). It is more difficult to give explicit expressions of the coefficients (LB )k since they require the inversion of the operator LB . It is, however, possible to give some approximate analytic expressions in some limit cases, which are investigated in Section 1.4.4. Approximate formulas can be also deduced through a detailed inspection of [22, 23].
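The closed-form Maxwellian averages quoted for the hard-sphere and Coulomb cases can be cross-checked by quadrature. The sketch below assumes the average is the integral over R³ of |w|³ σ(k_B T_e |w|²) exp(−|w|²/2) divided by (2π)^{3/2}, and reduces it to a radial integral, which is legitimate because the integrand depends on |w| only. It then compares the result with 32 √(2π) σ for a constant hard-sphere cross-section σ (momentum-exchange cross-section 4πσ) and with 4 √(2π) |ln δ| (C/k_B T_e)² for a momentum-exchange cross-section of the form 4π |ln δ| (C/energy)². All numerical inputs are dimensionless and hypothetical, and the function name is an assumption.

```python
import math


def maxwellian_average(sigma_u, kT, w_max=12.0, n=20000):
    """Radial form of the Maxwellian average of a momentum-exchange cross-section:
    4*pi/(2*pi)**1.5 * int_0^inf w**5 * sigma_u(kT * w**2) * exp(-w**2/2) dw."""
    h = w_max / n
    total = 0.0
    for k in range(n):
        w = (k + 0.5) * h                       # midpoint rule
        total += w ** 5 * sigma_u(kT * w * w) * math.exp(-0.5 * w * w)
    return 4.0 * math.pi / (2.0 * math.pi) ** 1.5 * total * h


# Illustrative, dimensionless inputs (not values from the text).
kT, sigma_hs, c_ei, coulomb_log = 0.7, 1.3, 0.9, 15.0

hard_sphere = maxwellian_average(lambda energy: 4.0 * math.pi * sigma_hs, kT)
coulomb = maxwellian_average(
    lambda energy: 4.0 * math.pi * coulomb_log * (c_ei / energy) ** 2, kT)

hard_sphere_closed = 32.0 * math.sqrt(2.0 * math.pi) * sigma_hs
coulomb_closed = 4.0 * math.sqrt(2.0 * math.pi) * coulomb_log * (c_ei / kT) ** 2
```

In the hard-sphere case the average does not depend on the temperature, whereas in the Coulomb case it decreases like the inverse square of the temperature, so hot plasmas are weakly collisional.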
1.4.2 Hydrodynamic form of the system

In this formulation, the model takes the form of two coupled systems of hydrodynamic equations:

$$\partial_t n_e + \nabla_x \cdot (n_e u_e) = 0, \tag{1.87}$$

$$\nabla_x p_e + q_e n_e (E + u_e \times B) = S_{ei}, \tag{1.88}$$

$$\partial_t W_e + \nabla_x \cdot (W_e u_e + p_e u_e + Q_e) + q_e n_e E \cdot u_e = U_{ei}, \tag{1.89}$$

and

$$\partial_t n_i + \nabla_x \cdot (n_i u_i) = 0, \tag{1.90}$$

$$m_i \left( \partial_t (n_i u_i) + \nabla_x (n_i\, u_i \otimes u_i) \right) + \nabla_x p_i - q_i n_i (E + u_i \times B) = S_{ie}, \tag{1.91}$$

$$\partial_t W_i + \nabla_x \cdot (W_i u_i + p_i u_i) - q_i n_i E \cdot u_i = U_{ie}, \tag{1.92}$$

in which the inertia terms in the electron momentum equation have been dropped, and where the energies and pressures are given in terms of the densities and temperatures by equation (1.66).
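The friction force $S_{ei}$ and the heat flux $Q_e$ are given by

$$\begin{pmatrix} S_{ei} \\ Q_e \end{pmatrix} = \mathbb{C}_B \begin{pmatrix} -n_e k_B T_e\, (u_e - u_i) \\ -n_e \nabla_x (k_B T_e) \end{pmatrix}, \qquad \mathbb{C}_B = \begin{pmatrix} \dfrac{1}{k_B T_e} \mathbb{M}_B & \mathbb{A}_B \\[8pt] \mathbb{A}^*_{-B} & \dfrac{1}{n_e} \mathbb{K}_B \end{pmatrix}, \tag{1.93}$$

with

$$\mathbb{M}_B = n_e k_B T_e\, \mathbb{L}_{11}^{-1} - q_e \mathbb{B}, \tag{1.94}$$

$$\mathbb{A}_B = \frac{1}{k_B T_e}\, \mathbb{L}_{11}^{-1} \mathbb{L}_{12} - \frac{5}{2}\, \mathrm{Id}, \tag{1.95}$$

$$\mathbb{K}_B = \frac{1}{(k_B T_e)^2} \left( \mathbb{L}_{22} - \mathbb{L}_{21} \mathbb{L}_{11}^{-1} \mathbb{L}_{12} \right), \tag{1.96}$$

where $\mathbb{B}$ is the matrix of the vector product by $B$:

$$\mathbb{B} = \begin{pmatrix} 0 & B_3 & -B_2 \\ -B_3 & 0 & B_1 \\ B_2 & -B_1 & 0 \end{pmatrix}, \tag{1.97}$$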
and the $\mathbb{L}_{k\ell}$ are defined by (1.68). Finally, the energy relaxation rates are given by (1.76). It can be shown that the matrix $\mathbb{K}_B$ is positive definite [4]. Also, $\mathbb{C}_B$ satisfies Onsager's symmetry relation

$$\mathbb{C}_B^* = \mathbb{C}_{-B}. \tag{1.98}$$
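The matrix of the vector product by B can be checked componentwise. The sketch below builds the matrix with rows (0, B3, −B2), (−B3, 0, B1), (B2, −B1, 0), verifies that applying it to a vector v returns v × B, and verifies that its transpose is the matrix associated with −B, which is the elementary instance of the symmetry under field reversal. The helper names and numerical values are illustrative assumptions.

```python
def cross_matrix(b):
    """Matrix of the vector product by b: cross_matrix(b) applied to v gives v x b."""
    return [[0.0, b[2], -b[1]],
            [-b[2], 0.0, b[0]],
            [b[1], -b[0], 0.0]]


def matvec(m, x):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * x[j] for j in range(3)) for i in range(3)]


def cross(a, b):
    """Vector product a x b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


B = [0.3, -1.2, 2.0]      # illustrative magnetic field components
v = [1.0, 4.0, -2.5]      # illustrative velocity

lhs = matvec(cross_matrix(B), v)
rhs = cross(v, B)

# Transposing the matrix is the same as reversing the field
transposed = [[cross_matrix(B)[j][i] for j in range(3)] for i in range(3)]
reversed_field = cross_matrix([-c for c in B])
```

Here `lhs` and `rhs` coincide, so the magnetic part of the friction matrix acts on the relative velocity as a vector product with B.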
1.4.3 Discussion of the plasma fluid model and applications

Mathematically, the plasma fluid model in its energy-transport form consists of a system of two parabolic diffusion equations for the electron density and temperature (1.61), (1.62), (1.67) coupled with the Euler system of hydrodynamic equations for the ion density, velocity and temperature (1.63)–(1.65). The latter is very classical (except for the coupling with the electron system, which we discuss in greater detail below). So, we concentrate first on the energy-transport system for the electrons. This system consists of continuity-like equations (1.61) and (1.62) where the time variations of the electron density and energy are balanced by the divergence of the corresponding density and energy fluxes (apart from the source term $E \cdot j_n$ in the energy equation which corresponds to the work of the Lorentz force). These fluxes are given by the constitutive relation (1.67) which tells that in the frame moving with the ions, the electron fluxes are of diffusive nature. The first
term in equation (1.67) just corresponds to the change of frame to the ion co-moving frame. The second term is a non-classical expression of the Fourier–Fick law which involves gradients of $\mu_e / k_B T_e$ and $-1 / k_B T_e$ instead of those of $n_e$ and $W_e$ (where $\mu_e$ and $T_e$ are the electron chemical potential and temperature). The two sets of variables $(n_e, W_e)$ on the one hand and $(\mu_e / k_B T_e, -1 / k_B T_e)$ on the other hand, are dual to each other through the concept of entropy. More specifically, there exists a strictly concave entropy function $S(n_e, W_e)$ such that $(\mu_e / k_B T_e, -1 / k_B T_e) = -\nabla_{(n_e, W_e)} S$. Additionally, this relation can be inverted by means of the Legendre–Fenchel transform of $S$. The fact that the matrix $\mathbb{L}$ is symmetric positive definite (at least in the zero-magnetic field case) guarantees that the entropy is an increasing function of time. The function $S$ is derived from the microscopic entropy $H(f_e) = -\int f_e \ln f_e\, dv$ by evaluating it for the local Maxwellian $f_e = M_e$. We do not elaborate more on the entropic structure of the energy-transport model, which has been discussed in greater detail in Ref. [66]. The set $(n_e, W_e)$ is referred to as the conservative, or extensive variables, while the set $(\mu_e / k_B T_e, -1 / k_B T_e)$ are the entropic, or intensive variables.

The fluxes are also related with the electric and magnetic fields. The term $u_i \times B$ can simply be explained as arising from the change of frame to the ion co-moving frame. Indeed, in this frame, the electrons move with velocity $u_i$ across the magnetic field lines and this motion creates an induced electric field $u_i \times B$. Therefore, in this frame, the electrons "see" a total electric field equal to $E + u_i \times B$. This "total" electric field induces electron mass and energy fluxes which are related with the fluxes produced by gradients of $\mu_e / k_B T_e$. Indeed, in both the density and energy fluxes, the same combination

$$\nabla_x \left( \frac{\mu_e}{k_B T_e} \right) + q_e\, \frac{E + u_i \times B}{k_B T_e}$$

appears.
The matrix relating the fluxes to the fields is the "mobility". Here, the mobility and diffusion matrices are equal. This link is known as the Einstein relation [71, 72]. The electron system is coupled with the ion one in the expression of the fluxes (1.67), which contain $u_i$ and, in a more concealed way, $n_i$ (through the diffusion matrix $L_B$; note that the collision operator $Q^0_{ei}(\varphi, M_i)$ depends on $n_i$). The source term $U_{ei}$ in the electron energy balance equation (1.62) also involves all three ion macroscopic quantities $n_i$, $u_i$ and $T_i$. Conversely, the ion system is coupled with the electron system through the interaction terms $S_{ie}$ and $U_{ie}$. The energy-transport form of the system is likely to be best suited to numerical simulations, although the use of the chemical potential can be awkward in regions devoid of electrons (for which $\mu_e \to -\infty$). The hydrodynamic form of the system has been discussed in Section 1.4.2 to show the analogy (and differences) with the standard hydrodynamic equations. Indeed, the energy-transport system can be connected with the compressible Navier–Stokes equations by a rescaling in which we assume that the mean velocity is small. In this way the inertia and viscosity terms in the momentum balance equations of the Navier–Stokes equations can be
Pierre Degond
dropped and we recover the energy-transport model. This analogy allows us to compare the values of the momentum relaxation rate (the matrix $m_e^{-1} M_B$), thermo-electric coefficient $A_B$ and thermal conductivity $K_B$ with those of the literature. A careful inspection of Ref. [22] will convince the reader that we recover the same results as in the literature. In principle, the electric and magnetic fields must be coupled with the plasma variables through the Maxwell or Poisson equations. In many circumstances, the coupling with the Poisson equation can be replaced by a quasi-neutrality assumption, i.e. the assumption that the charge density vanishes everywhere. In this case, the electric potential becomes a kind of Lagrange multiplier of the quasi-neutrality constraint. The way the electric and magnetic fields are obtained (and eventually coupled with the plasma fluid model) does not interfere with the considerations that will guide our derivation of the plasma fluid model (see the following sections). Therefore, it is legitimate, for this operation, to assume that they are just given data. It is possible to pursue the asymptotic expansion which leads to the plasma fluid model. It has been proved in Ref. [2] that going to the next order through a Chapman–Enskog expansion leads to classical diffusive corrections in the ion hydrodynamic equations, which only involve the standard viscosity and heat conductivity terms. The electron system itself remains unmodified by these next order terms. The applications of this approach go far beyond the case of one-ion species plasmas or disparate mass binary mixtures. It can be applied to multi-component ion plasmas, as well as weakly ionized plasmas. In the latter case, the dominant species are the neutrals and both the electrons and ions can be viewed as the light species. The outcome is a set of coupled energy-transport systems for both the ions and electrons, with diffusion matrices computed from the Boltzmann operator.
Since all formulas here apply for Boltzmann as well as Fokker–Planck operators, the approach can be extended with very little change. In particular, these systems, with a convenient additional modeling of the chemical kinetics, can be used to model high-pressure gas discharges, high-pressure plasma processes, ionospheric plasma disturbances and so on. An example of a practical application of these models for the simulation of a plasma opening switch can be found in Ref. [38]. Another application dealing with the modeling of arcing on satellite solar cells is described in Refs. [40, 41].
1.4.4 Approximate expression of the diffusion matrices

1.4.4.1 Small electron–electron collision approximation

There exist no analytical expressions of the diffusion matrices $L_B$ or $C_B$ in the general case and one has to resort to numerical simulations in order to invert the operator $L_B$ (1.69). It is possible, however, to obtain analytical expressions if the electron–electron collision operator in the expression of $L_B$ is neglected. In fact, we shall suppose that the term $2Q_{ee}(M_e, \varphi)$ is so small that it can be neglected in the computation, but does not vanish completely, so that the solvability condition of the equation $L_B \varphi = \psi$ remains unchanged. Neglecting the electron–electron collision operator is a valid physical hypothesis in some situations, for instance, if the electron–ion collision cross section is significantly larger than the electron–electron one. This happens if the ion charge $q_i$ is larger than the electron charge $q_e$ by a factor of the order of 10 [3]. Another situation where this approximation is legitimate is when the ion density is much larger than the electron density. This happens in weakly ionized gases when the heavy particles considered are not the ions but the neutral molecules.

We first note that if $\varphi$ is of the form $\varphi(v) = v \, \chi(|v|)$, with $\chi$ a scalar function, then
$$Q^0_{ei}(\varphi, M_i) = -\nu_{ei,u} \, \varphi, \qquad \nu_{ei,u} = n_i \, \sigma_{ei,u} \, |v|, \tag{1.99}$$
where $\sigma_{ei,u}$ is the scattering cross section for momentum exchange given by (1.80) or (1.81), and $\nu_{ei,u}$ is a collision frequency for momentum exchange. The proof of this fact is easy and is left to the reader. For the Boltzmann operator with constant cross section $\sigma^B_{ei}$, the collision frequency for momentum exchange is given by
$$\nu^B_{ei,u}(|v|) = 4\pi \sigma^B_{ei} \, n_i \, |v|, \tag{1.100}$$
while for the Fokker–Planck operator with Coulomb collisions, it is given by
$$\nu^P_{ei,u}(|v|) = 4\pi |\ln \delta| \left( \frac{C_{ei}}{m_e |v|^2} \right)^2 n_i \, |v|, \tag{1.101}$$
with $C_{ei}$ and $\delta$ given by equations (1.14) and (1.18).

1.4.4.2 Non-magnetized case

We first investigate the case $B = 0$. With these assumptions, $L_0 \varphi = Q^0_{ei}(\varphi, M_i)$. We note that both $\psi_1 = v M_e$ and $\psi_2 = v \, \frac{m_e |v|^2}{2} M_e$ are of the form $v \, \chi(|v|)$. We try to solve the equations $L_0 \varphi_k = \psi_k$ by functions $\varphi_k$ of the same form, $\varphi_k = v \, \chi_k(|v|)$. Using (1.99), this leads to
$$\varphi_k = -\frac{1}{\nu_{ei,u}} \, \psi_k. \tag{1.102}$$
Since the $\varphi_k$ obtained in this way satisfy the constraints $\int \varphi_k \, dv = 0$ and $\int \varphi_k |v|^2 \, dv = 0$, they are the unique solution of the problem (in fact, this is true only if we consider that $L_0$ incorporates a vanishingly small but non-zero electron–electron collision operator). We can now obtain the expression of $L_0$ in this approximation:
$$(L_0)_{11} = \frac{1}{3} \left( \int_{\mathbb{R}^3} \frac{1}{\nu_{ei,u}} M_e |v|^2 \, dv \right) \mathrm{Id}, \tag{1.103}$$
$$(L_0)_{12} = (L_0)_{21} = \frac{m_e}{6} \left( \int_{\mathbb{R}^3} \frac{1}{\nu_{ei,u}} M_e |v|^4 \, dv \right) \mathrm{Id}, \tag{1.104}$$
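$$(L_0)_{22} = \frac{m_e^2}{12} \left( \int_{\mathbb{R}^3} \frac{1}{\nu_{ei,u}} M_e |v|^6 \, dv \right) \mathrm{Id}. \tag{1.105}$$
Inserting (1.103) into (1.104) (with $B = 0$) leads to
$$M_0 = \frac{3 n_e k_B T_e}{\int_{\mathbb{R}^3} \frac{1}{\nu_{ei,u}} M_e |v|^2 \, dv} \, \mathrm{Id}. \tag{1.106}$$
We can define the electron momentum relaxation rate $\bar\nu_{ei,u}$ by
$$M_0 = m_e \, \bar\nu_{ei,u} \, \mathrm{Id}, \tag{1.107}$$
with
$$\bar\nu_{ei,u} = 3 n_e \frac{k_B T_e}{m_e} \left( \int_{\mathbb{R}^3} \frac{1}{\nu_{ei,u}} M_e |v|^2 \, dv \right)^{-1}. \tag{1.108}$$
$\bar\nu_{ei,u}$ measures the rate at which the electron velocity relaxes toward the ion velocity. Expressions of $A_0$ and $K_0$ can be obtained by inserting equations (1.103)–(1.105) into (1.95) and (1.96) as well.

In the case of Boltzmann collisions with a constant cross-section $\sigma^B_{ei}$ (hard-sphere model), straightforward computations lead to
$$\bar\nu^B_{ei,u} = \frac{3}{2} (2\pi)^{3/2} \, \sigma^B_{ei} \, n_i \sqrt{\frac{k_B T_e}{m_e}}, \tag{1.109}$$
and
$$L_0 = n_e \frac{k_B T_e}{m_e} \frac{1}{\bar\nu^B_{ei,u}} \begin{pmatrix} \mathrm{Id} & 2 k_B T_e \, \mathrm{Id} \\ 2 k_B T_e \, \mathrm{Id} & 6 (k_B T_e)^2 \, \mathrm{Id} \end{pmatrix}. \tag{1.110}$$
$M_0$ is given by (1.107) and $A_0$, $K_0$ by
$$A_0 = -\frac{1}{2} \, \mathrm{Id}, \qquad K_0 = 2 n_e \frac{k_B T_e}{m_e} \frac{1}{\bar\nu^B_{ei,u}} \, \mathrm{Id}. \tag{1.111}$$
In the Fokker–Planck case with the Coulomb scattering cross-section $\sigma^P_{ei}$ (1.15), we can show that
$$\bar\nu^P_{ei,u} = \frac{\sqrt{2\pi}}{16} \, \sigma^P_{ei}(k_B T_e) \, n_i \sqrt{\frac{k_B T_e}{m_e}} = \frac{(2\pi)^{3/2}}{16} \, |\ln \delta| \left( \frac{C_{ei}}{k_B T_e} \right)^2 n_i \sqrt{\frac{k_B T_e}{m_e}}, \tag{1.112}$$
where $C_{ei}$ and the Coulomb logarithm $|\ln \delta|$ are given by equations (1.14) and (1.18). We deduce that
$$L_0 = n_e \frac{k_B T_e}{m_e} \frac{1}{\bar\nu^P_{ei,u}} \begin{pmatrix} \mathrm{Id} & 4 k_B T_e \, \mathrm{Id} \\ 4 k_B T_e \, \mathrm{Id} & 20 (k_B T_e)^2 \, \mathrm{Id} \end{pmatrix}. \tag{1.113}$$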
$M_0$ is given by equation (1.107) and $A_0$, $K_0$ by
$$A_0 = \frac{3}{2} \, \mathrm{Id}, \qquad K_0 = 4 n_e \frac{k_B T_e}{m_e} \frac{1}{\bar\nu^P_{ei,u}} \, \mathrm{Id}. \tag{1.114}$$
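The numerical factors in (1.109), (1.110) and (1.113) can be checked with a few lines of Python. The sketch below is only an illustration under stated assumptions: it takes $\nu_{ei,u} \propto |v|^p$ ($p = 1$ for hard spheres, $p = -3$ for Coulomb collisions), so that the integrals (1.103)–(1.105) reduce to closed-form Maxwellian moments. The function names are hypothetical and not part of the original text.

```python
import math


def maxwellian_moment(n, a=1.0):
    """Mean of |v|**n over a 3D Maxwellian with thermal speed a = sqrt(kB*Te/me)."""
    return a**n * 2.0 ** (n / 2.0) * math.gamma((n + 3) / 2.0) / math.gamma(1.5)


def l0_block_ratios(p, kT=1.0, m=1.0):
    """Return ((L0)_12/(L0)_11, (L0)_22/(L0)_11) for nu_ei,u proportional to |v|**p.

    The density n_e and the constant prefactor of nu_ei,u cancel in the ratios.
    """
    a = math.sqrt(kT / m)
    l11 = maxwellian_moment(2 - p, a) / 3.0           # (1.103)
    l12 = m * maxwellian_moment(4 - p, a) / 6.0       # (1.104)
    l22 = m**2 * maxwellian_moment(6 - p, a) / 12.0   # (1.105)
    return l12 / l11, l22 / l11


def nu_bar_hard_sphere(sigma, n_i, kT, m):
    """Relaxation rate (1.108) for nu_ei,u = 4*pi*sigma*n_i*|v|, as in (1.100)."""
    a = math.sqrt(kT / m)
    return 3.0 * a**2 * 4.0 * math.pi * sigma * n_i / maxwellian_moment(1, a)


hard = l0_block_ratios(p=1)    # hard spheres: nu ~ |v|
coul = l0_block_ratios(p=-3)   # Coulomb:      nu ~ |v|**-3
print("hard spheres:", hard)
print("Coulomb:     ", coul)
```

With $k_B T_e = 1$, the two ratios returned for each case should reproduce the off-diagonal and lower-right entries of the matrices in (1.110) and (1.113), and `nu_bar_hard_sphere` can be compared with the closed form (1.109).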
We now briefly comment on these values. First, we notice that the electron energy relaxation rate $\nu_{ei,W}$ is smaller than the electron momentum relaxation rate $\bar\nu_{ei,u}$ by a factor of order $m_e/m_i$, both in the hard-sphere case and in the Coulomb case (compare (1.83) with (1.109) and (1.85) with (1.112)). This is consistent with the scaling hypotheses outlined in Section 1.3.4, which will be developed in Section 1.5. It explains why momentum relaxation contributes to the macroscopic equations through a diffusion term while energy relaxation contributes through a relaxation term (see Section 1.4.1).

Second, we comment on the differences between hard-sphere and Coulomb collisions. In the hard-sphere case, both relaxation rates increase with the electron temperature like a square root function. This dependence is due to the electron–ion relative velocity, which is roughly estimated to be the electron thermal velocity (the ion thermal velocity is small compared to it). In the Coulomb case, both decrease like the power $-3/2$ of the electron temperature. This is the signature of the strong decay of the Coulomb scattering cross section at large energies.

Third, the thermo-electric coefficient $A_0$ has a different sign in the hard-sphere case and in the Coulomb case. This means that an electron temperature gradient induces flows in opposite directions in these two cases. The reason is, again, the very different behavior of the scattering cross section as a function of the relative energy of the particles.

Finally, for a similar value of the momentum relaxation rates, the thermal conductivity coefficient $K_0$ in the Coulomb case is twice its value in the hard-sphere case. This is because highly energetic particles travel longer distances in the Coulomb case (because of the decay of the scattering cross section with energy) than in the hard-sphere case, thereby contributing to a larger heat conductivity. A close inspection of Refs. [22], [23] shows that the above values are compatible with the values found in the literature.

1.4.4.3 Magnetized case

Still in the limit where electron–electron collisions can be neglected, we now consider the case where $B$ does not vanish. We suppose that $B$ is aligned with the third coordinate axis, $B = |B| e_3$, where $e_k$ is the unit vector in the $k$th direction. We denote by $\omega_c = (q_e |B|)/m_e$ the electron cyclotron frequency. In this case, $L_B$ is no longer rotationally invariant and the sub-blocks $(L_B)_{k\ell}$ are not scalars. We can still use property (1.99) to find the elementary functions $\varphi_1$, $\varphi_2$. We recall that these functions are vector valued and denote by $(\varphi_1)_m$, $m = 1, 2, 3$, the components of $\varphi_1$, and similarly for $\varphi_2$. Using the same ideas as above, we find
$$(\varphi_1)_1 = \frac{1}{\nu_{ei,u}^2 + \omega_c^2} \, (-\nu_{ei,u} v_1 - \omega_c v_2) \, M_e, \tag{1.115}$$
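$$(\varphi_1)_2 = \frac{1}{\nu_{ei,u}^2 + \omega_c^2} \, (\omega_c v_1 - \nu_{ei,u} v_2) \, M_e, \tag{1.116}$$
$$(\varphi_1)_3 = -\frac{1}{\nu_{ei,u}} \, v_3 \, M_e, \tag{1.117}$$
$$(\varphi_2)_1 = \frac{1}{\nu_{ei,u}^2 + \omega_c^2} \, (-\nu_{ei,u} v_1 - \omega_c v_2) \, \frac{m_e |v|^2}{2} \, M_e, \tag{1.118}$$
$$(\varphi_2)_2 = \frac{1}{\nu_{ei,u}^2 + \omega_c^2} \, (\omega_c v_1 - \nu_{ei,u} v_2) \, \frac{m_e |v|^2}{2} \, M_e, \tag{1.119}$$
$$(\varphi_2)_3 = -\frac{1}{\nu_{ei,u}} \, v_3 \, \frac{m_e |v|^2}{2} \, M_e, \tag{1.120}$$
with $\nu_{ei,u} = \nu_{ei,u}(|v|)$ given by equation (1.99). With these values, the blocks $(L_B)_{k\ell}$ are easily computed:
$$(L_B)_{k\ell} = \begin{pmatrix} L^P_{k\ell} & -L^H_{k\ell} & 0 \\ L^H_{k\ell} & L^P_{k\ell} & 0 \\ 0 & 0 & L^\parallel_{k\ell} \end{pmatrix}, \tag{1.121}$$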
with
$$L^P_{11} = \frac{1}{3} \int \frac{\nu_{ei,u} |v|^2}{\nu_{ei,u}^2 + \omega_c^2} \, M_e \, dv, \qquad L^H_{11} = \frac{\omega_c}{3} \int \frac{|v|^2}{\nu_{ei,u}^2 + \omega_c^2} \, M_e \, dv, \tag{1.122}$$
$$L^P_{12} = \frac{m_e}{6} \int \frac{\nu_{ei,u} |v|^4}{\nu_{ei,u}^2 + \omega_c^2} \, M_e \, dv, \qquad L^H_{12} = \frac{m_e \omega_c}{6} \int \frac{|v|^4}{\nu_{ei,u}^2 + \omega_c^2} \, M_e \, dv, \tag{1.123}$$
$$L^P_{22} = \frac{m_e^2}{12} \int \frac{\nu_{ei,u} |v|^6}{\nu_{ei,u}^2 + \omega_c^2} \, M_e \, dv, \qquad L^H_{22} = \frac{m_e^2 \omega_c}{12} \int \frac{|v|^6}{\nu_{ei,u}^2 + \omega_c^2} \, M_e \, dv, \tag{1.124}$$
and $(L_B)_{21} = (L_B)_{12}$, with $L^\parallel_{k\ell}$ given by the zero-magnetic field values (1.103)–(1.105). The values $L^P_{k\ell}$, $L^H_{k\ell}$, $L^\parallel_{k\ell}$ are traditionally referred to as the Pedersen, Hall and field-aligned diffusion coefficients. For the hard-sphere case or the Coulomb case, where $\nu_{ei,u} \sim |v|$ and $\nu_{ei,u} \sim |v|^{-3}$, respectively, these coefficients can easily be computed by numerical quadrature. In the Coulomb case, an approximate expression can be deduced from Ref. [22]. We do not elaborate more on these general formulas but investigate the case of a large magnetic field or, in other words, the case where $\omega_c$ is large compared with $\nu_{ei,u}(v)$ or, more precisely, with a typical value of it, e.g. $\bar\nu_{ei,u}$. We are interested
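in the form of the matrices $M_B$, $A_B$, $K_B$. We find
$$\frac{1}{m_e} M_B = \begin{pmatrix} \nu^\perp_{ei,u} & 0 & 0 \\ 0 & \nu^\perp_{ei,u} & 0 \\ 0 & 0 & \nu^\parallel_{ei,u} \end{pmatrix} + o\!\left( \frac{\bar\nu_{ei,u}}{\omega_c} \right) \quad \text{as } \omega_c \to \infty, \tag{1.125}$$
with
$$\nu^\perp_{ei,u} = n_i \sqrt{\frac{k_B T_e}{m_e}} \, \bar\sigma_{ei,u}(T_e) = \frac{m_i}{m_e} \, \nu_{ei,W}, \qquad \nu^\parallel_{ei,u} = \bar\nu_{ei,u}, \tag{1.126}$$
where we recall that $\bar\sigma_{ei,u}(T_e)$, $\nu_{ei,W}$ and $\bar\nu_{ei,u}$ are, respectively, given by (1.78), (1.79) and (1.108). $\nu^\perp_{ei,u}$ and $\nu^\parallel_{ei,u}$ are respectively the electron momentum relaxation rates in the plane orthogonal to $B$ and in the direction aligned with $B$. A close inspection of these formulas reveals that $\nu^\perp_{ei,u}$ is an arithmetic average of $\nu_{ei,u}(|v|)$ over the normalized distribution $\frac{1}{3} (n_e (k_B T_e/m_e))^{-1} |v|^2 M_e(v)$, while $\nu^\parallel_{ei,u}$ is a harmonic average of the same quantity. The off-diagonal terms of $M_B$ vanish because the Hall terms for large $B$ merely correspond to the $E \times B$ drift velocity of the guiding centers [68], [70]. Since this drift is the same for the electrons and the ions, it does not contribute (at leading order in $B$) to the relative velocity $u_e - u_i$, which is what $M_B$ is about. Similarly, we get
$$\frac{1}{m_e} A_B = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & a \end{pmatrix} + o\!\left( \frac{\bar\nu_{ei,u}}{\omega_c} \right) \quad \text{as } \omega_c \to \infty, \tag{1.127}$$
and
$$\frac{1}{m_e} K_B = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & \kappa \end{pmatrix} + o\!\left( \frac{\bar\nu_{ei,u}}{\omega_c} \right) \quad \text{as } \omega_c \to \infty, \tag{1.128}$$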
where a and κ are the zero-magnetic field values of the scalar thermo-electric coefficient A0 and thermal conductivity K0 (Section 1.4.4.2). Therefore, in a very strong magnetic field, heat conductivity and thermo-electric processes are almost totally impeded in the direction normal to the magnetic field, while they are unmodified (compared with their zero-magnetic field value) in the direction parallel to B. This is of course due to the Larmor rotation of the particles in the strong magnetic field which impedes transport in the direction normal to it.
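The distinction between the arithmetic and harmonic averages that define the perpendicular and parallel relaxation rates can be made concrete for hard spheres. The following Python sketch is an illustration only: it assumes $\nu_{ei,u} = c\,|v|$ with a constant $c$ standing for $4\pi\sigma^B_{ei} n_i$ as in (1.100), and uses closed-form Maxwellian moments. The function names are hypothetical.

```python
import math


def maxwellian_moment(n, a=1.0):
    """Mean of |v|**n over a 3D Maxwellian with thermal speed a = sqrt(kB*Te/me)."""
    return a**n * 2.0 ** (n / 2.0) * math.gamma((n + 3) / 2.0) / math.gamma(1.5)


def perpendicular_rate(c, a):
    """Arithmetic average of nu = c|v| over the weight |v|^2 M_e / (3 n_e a^2)."""
    return c * maxwellian_moment(3, a) / (3.0 * a**2)


def parallel_rate(c, a):
    """Harmonic average of nu = c|v| over the same weight, i.e. (1.108)."""
    return 3.0 * a**2 * c / maxwellian_moment(1, a)


c, a = 2.5, 1.3  # arbitrary illustrative values
ratio = perpendicular_rate(c, a) / parallel_rate(c, a)
print("nu_perp / nu_par =", ratio)
```

Under these assumptions the ratio is independent of $c$ and of the thermal speed, and it is larger than one, as expected from the inequality between arithmetic and harmonic means.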
1.5 Scaling Hypotheses

In the present and following sections, we develop the various steps of the derivation of the plasma fluid model from a scaling limit of the coupled
Boltzmann or Fokker–Planck equations. In this section, our first task is to put the Fokker–Planck–Landau equations (1.3), (1.4) or Boltzmann equations (1.1), (1.2) in dimensionless form. Our key point will be that the mass ratio is a small parameter:
$$\varepsilon^2 := \frac{m_e}{m_i} \ll 1. \tag{1.129}$$
We first make the assumption that the density and temperature scales (respectively denoted by $n_0$ and $T_0$) are the same for the two species of particles. Since the masses are so different, the kinetic velocity scales for electrons, $v_{e0}$, and for ions, $v_{i0}$, must be chosen differently:
$$v_{e0} = \sqrt{\frac{k_B T_0}{m_e}}, \qquad v_{i0} = \sqrt{\frac{k_B T_0}{m_i}},$$
and in particular, we have
$$v_{i0} = \varepsilon \, v_{e0} \ll v_{e0}. \tag{1.130}$$
This suggests that we should introduce a different scaling for the electron and ion velocities by letting
$$v = v_{e0} \, \bar v \ \text{ for electrons (or light particles)}, \qquad v = v_{i0} \, \bar v \ \text{ for ions (or heavy particles)},$$
where $\bar v$ are dimensionless velocities. Scaling units for the distribution functions are chosen accordingly: $f_{e0} = n_0 v_{e0}^{-3}$, $f_{i0} = n_0 v_{i0}^{-3}$, and the dimensionless distribution functions $\bar f_e$, $\bar f_i$ are defined by $f_e = f_{e0} \bar f_e$, $f_i = f_{i0} \bar f_i$.

Our next task is to scale the expressions (1.5) and (1.6) of the collision operators. For this we need a hypothesis that allows us to compare the various scattering cross-sections. We suppose that there exist a common scattering cross-section scale $\sigma_0$ and dimensionless functions $\bar\sigma^B_{\alpha\beta}(E, \cos\theta)$, where $E > 0$ is a dimensionless variable, such that
$$\sigma^B_{\alpha\beta}(\mu_{\alpha\beta} |v - v_1|^2, \cos\theta) = \sigma_0 \, \bar\sigma^B_{\alpha\beta}\!\left( \frac{\mu_{\alpha\beta} |v - v_1|^2}{k_B T_0}, \cos\theta \right),$$
in the case of the Boltzmann operator (1.6), and similarly for the scattering cross sections for grazing collisions in the case of the Fokker–Planck–Landau operator (1.5):
$$\sigma^P_{\alpha\beta}(\mu_{\alpha\beta} |v - v_1|^2) = \sigma_0 \, \bar\sigma^P_{\alpha\beta}\!\left( \frac{\mu_{\alpha\beta} |v - v_1|^2}{k_B T_0} \right).$$
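We note that
$$\mu_{ee} = \frac{1}{2} m_e, \qquad \mu_{ii} = \frac{1}{2} m_i, \qquad \mu_{ei} = m_e \frac{1}{1 + \varepsilon^2} = m_i \frac{\varepsilon^2}{1 + \varepsilon^2}.$$
Associated with the scattering cross-section scale $\sigma_0$, we can define two typical collision frequencies, the electron collision frequency $\nu_{e0} = \sigma_0 n_0 v_{e0}$ and the ion collision frequency $\nu_{i0} = \sigma_0 n_0 v_{i0}$. Because of (1.130), we have $\nu_{i0} = \varepsilon \nu_{e0}$. We adopt the inverse collision frequencies (or collision times) $\nu_{e0}^{-1}$ and $\nu_{i0}^{-1}$ as time scales for the electron and ion collision phenomena. Therefore, we let $Q_{e0} = \nu_{e0} f_{e0}$ and $Q_{i0} = \nu_{i0} f_{i0}$ be the scaling units of the electron and ion collision operators, respectively. The dimensionless collision operators $\bar Q_{ee}$, $\bar Q_{ei}$, … are defined by $Q_{ee} = Q_{e0} \bar Q_{ee}$, $Q_{ei} = Q_{e0} \bar Q_{ei}$, $Q_{ie} = Q_{i0} \bar Q_{ie}$, $Q_{ii} = Q_{i0} \bar Q_{ii}$.

With these hypotheses the scaled Boltzmann operators read (dropping the bars for the sake of simplicity):
$$Q^{B\varepsilon}_{ee}(f_e, f_e)(v) = \int \sigma^B_{ee}\!\left( \tfrac{1}{2} |v - v_1|^2, \cos\theta \right) |v - v_1| \, \big( f_e' (f_e)_1' - f_e (f_e)_1 \big) \, dv_1 \, d\omega, \tag{1.131}$$
together with the collision rules:
$$v' = \frac{v + v_1}{2} + \frac{|v - v_1|}{2} \, \omega, \qquad v_1' = \frac{v + v_1}{2} - \frac{|v - v_1|}{2} \, \omega, \tag{1.132}$$
and
$$Q^{B\varepsilon}_{ei}(f_e, f_i)(v) = \int \sigma^B_{ei}\!\left( \frac{|v - \varepsilon v_1|^2}{1 + \varepsilon^2}, \cos\theta \right) |v - \varepsilon v_1| \, \big( f_e' (f_i)_1' - f_e (f_i)_1 \big) \, dv_1 \, d\omega, \tag{1.133}$$
with:
$$v' = \frac{\varepsilon}{1 + \varepsilon^2} (\varepsilon v + v_1) + \frac{1}{1 + \varepsilon^2} |v - \varepsilon v_1| \, \omega, \qquad v_1' = \frac{1}{1 + \varepsilon^2} (\varepsilon v + v_1) - \frac{\varepsilon}{1 + \varepsilon^2} |v - \varepsilon v_1| \, \omega, \tag{1.134}$$
and
$$Q^{B\varepsilon}_{ie}(f_i, f_e)(v) = \frac{1}{\varepsilon} \int \sigma^B_{ei}\!\left( \frac{|\varepsilon v - v_1|^2}{1 + \varepsilon^2}, \cos\theta \right) |\varepsilon v - v_1| \, \big( f_i' (f_e)_1' - f_i (f_e)_1 \big) \, dv_1 \, d\omega, \tag{1.135}$$
with:
$$v' = \frac{1}{1 + \varepsilon^2} (v + \varepsilon v_1) + \frac{\varepsilon}{1 + \varepsilon^2} |\varepsilon v - v_1| \, \omega, \qquad v_1' = \frac{\varepsilon}{1 + \varepsilon^2} (v + \varepsilon v_1) - \frac{1}{1 + \varepsilon^2} |\varepsilon v - v_1| \, \omega, \tag{1.136}$$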
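and finally
$$Q^{B\varepsilon}_{ii}(f_i, f_i)(v) = \int \sigma^B_{ii}\!\left( \tfrac{1}{2} |v - v_1|^2, \cos\theta \right) |v - v_1| \, \big( f_i' (f_i)_1' - f_i (f_i)_1 \big) \, dv_1 \, d\omega, \tag{1.137}$$
with again the same collision rule as (1.132). Similarly, the scaled Fokker–Planck operators are given by:
$$Q^{P\varepsilon}_{ee}(f_e, f_e)(v) = \frac{1}{4} \nabla_v \cdot \int \sigma^P_{ee}\!\left( \tfrac{1}{2} |v - v_1|^2 \right) |v - v_1|^3 \, S(v - v_1) \big( (f_e)_1 (\nabla_v f_e) - f_e (\nabla_v f_e)_1 \big) \, dv_1, \tag{1.138}$$
$$Q^{P\varepsilon}_{ei}(f_e, f_i)(v) = \left( \frac{1}{1 + \varepsilon^2} \right)^2 \nabla_v \cdot \int \sigma^P_{ei}\!\left( \frac{|v - \varepsilon v_1|^2}{1 + \varepsilon^2} \right) |v - \varepsilon v_1|^3 \, S(v - \varepsilon v_1) \big( (f_i)_1 (\nabla_v f_e) - \varepsilon f_e (\nabla_v f_i)_1 \big) \, dv_1, \tag{1.139}$$
$$Q^{P\varepsilon}_{ie}(f_i, f_e)(v) = \left( \frac{1}{1 + \varepsilon^2} \right)^2 \nabla_v \cdot \int \sigma^P_{ei}\!\left( \frac{|\varepsilon v - v_1|^2}{1 + \varepsilon^2} \right) |\varepsilon v - v_1|^3 \, S(\varepsilon v - v_1) \big( \varepsilon (f_e)_1 (\nabla_v f_i) - f_i (\nabla_v f_e)_1 \big) \, dv_1, \tag{1.140}$$
$$Q^{P\varepsilon}_{ii}(f_i, f_i)(v) = \frac{1}{4} \nabla_v \cdot \int \sigma^P_{ii}\!\left( \tfrac{1}{2} |v - v_1|^2 \right) |v - v_1|^3 \, S(v - v_1) \big( (f_i)_1 (\nabla_v f_i) - f_i (\nabla_v f_i)_1 \big) \, dv_1. \tag{1.141}$$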
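A quick way to check the scaled electron–ion collision rule (1.134) is to verify numerically that it preserves the scaled momentum and energy. The Python sketch below rests on one assumption drawn from the scaling above: since the electron and ion velocities are measured in units of $v_{e0}$ and $v_{i0} = \varepsilon v_{e0}$, the conserved quantities are taken to be $\varepsilon v + v_1$ (momentum, up to a constant factor) and $|v|^2 + |v_1|^2$ (energy). The function names are hypothetical.

```python
import math
import random


def scaled_ei_collision(v, v1, omega, eps):
    """Post-collision velocities (v', v1') from the scaled rule (1.134).

    v is the electron velocity (units of v_e0), v1 the ion velocity
    (units of v_i0) and omega a unit vector on the sphere.
    """
    g = math.sqrt(sum((a - eps * b) ** 2 for a, b in zip(v, v1)))  # |v - eps*v1|
    k = 1.0 / (1.0 + eps**2)
    v_new = [eps * k * (eps * a + b) + k * g * w for a, b, w in zip(v, v1, omega)]
    v1_new = [k * (eps * a + b) - eps * k * g * w for a, b, w in zip(v, v1, omega)]
    return v_new, v1_new


def random_unit_vector(rng):
    """Uniformly distributed direction obtained by normalising a Gaussian vector."""
    x = [rng.gauss(0.0, 1.0) for _ in range(3)]
    n = math.sqrt(sum(c * c for c in x))
    return [c / n for c in x]


rng = random.Random(0)
eps = 0.05  # illustrative value of sqrt(m_e/m_i)
v = [rng.gauss(0.0, 1.0) for _ in range(3)]
v1 = [rng.gauss(0.0, 1.0) for _ in range(3)]
vp, v1p = scaled_ei_collision(v, v1, random_unit_vector(rng), eps)

momentum_before = [eps * a + b for a, b in zip(v, v1)]
momentum_after = [eps * a + b for a, b in zip(vp, v1p)]
energy_before = sum(a * a for a in v) + sum(b * b for b in v1)
energy_after = sum(a * a for a in vp) + sum(b * b for b in v1p)
print(momentum_before, momentum_after)
print(energy_before, energy_after)
```

The same check applies to (1.136) after exchanging the roles of the two species.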
The scaled Fokker–Planck operators are such that $Q^{P\varepsilon}_{\alpha\beta} = O(1)$. The same observation is true for the Boltzmann operators, except for $Q^{B\varepsilon}_{ie}$, which appears to be $O(\varepsilon^{-1})$. In fact, this impression is false, because $\lim_{\varepsilon \to 0} \varepsilon Q^{B\varepsilon}_{ie} = 0$. Indeed, this limit can be easily computed by formally setting $\varepsilon = 0$ in the expression of $\varepsilon Q^{B\varepsilon}_{ie}$ and we get:
$$\lim_{\varepsilon \to 0} \varepsilon Q^{B\varepsilon}_{ie}(f_i, f_e)(v) = \int \sigma^B_{ei}(|v_1|^2, \cos\theta) \, |v_1| \, \big( f_i' (f_e)_1' - f_i (f_e)_1 \big) \, dv_1 \, d\omega, \tag{1.142}$$
with the collision rule $v' = v$, $v_1' = -|v_1| \omega$. Passing to spherical coordinates in $v_1$, i.e. $v_1 = |v_1| e$, $e \in S^2$, we get:
$$\lim_{\varepsilon \to 0} \varepsilon Q^{B\varepsilon}_{ie}(f_i, f_e)(v) = f_i(v) \int \sigma^B_{ei}(|v_1|^2, \cos(\omega \cdot e)) \, \big( f_e(|v_1| \omega) - f_e(|v_1| e) \big) \, |v_1|^3 \, d|v_1| \, de \, d\omega = 0, \tag{1.143}$$
by antisymmetry in the interchange of $\omega$ and $e$. Therefore, $Q^{B\varepsilon}_{ie} = O(1)$ and we get the same scaling for all Boltzmann operators, $Q^{B\varepsilon}_{\alpha\beta} = O(1)$. We also notice that $Q^\varepsilon_{ii}$ and $Q^\varepsilon_{ee}$ do not depend on $\varepsilon$ and we shall drop the superscript $\varepsilon$ for these operators.

We now pass to the choice of the time and space scales. We have already introduced the collision times $\nu_{e0}^{-1}$ and $\nu_{i0}^{-1}$ as the appropriate time scales for the collision operators. The associated space scales are the electron and ion mean free paths (i.e. the average distance travelled by the particles between two collisions) $\ell_e = v_{e0} \nu_{e0}^{-1}$ and $\ell_i = v_{i0} \nu_{i0}^{-1}$. We note that the electron collision time is much shorter than the ion one (because $\nu_{e0}^{-1} = \varepsilon \nu_{i0}^{-1}$). By contrast, the electron and ion mean free paths are the same, $\ell_e = \ell_i$ (because the larger electron velocity exactly compensates for the smaller collision time). Our goal is to observe the system on large time and space scales compared with the electron and ion kinetic scales, in order to obtain hydrodynamic-like models for both the electrons and the ions. To this aim, we choose units adapted to the observation of a large system (of typical size $O(\varepsilon^{-1})$) over long periods of time (of typical duration $O(\varepsilon^{-1})$) compared with the ion collision space and time scales. Specifically, we choose the time scale $t_0$ and space scale $x_0$ according to:
$$t_0 = \frac{1}{\varepsilon} \nu_{i0}^{-1} \left( = \frac{1}{\varepsilon^2} \nu_{e0}^{-1} \right), \qquad x_0 = \frac{1}{\varepsilon} \ell_i \left( = \frac{1}{\varepsilon} \ell_e = v_{i0} t_0 = \varepsilon v_{e0} t_0 \right). \tag{1.144}$$
We note that the size of the system is not chosen arbitrarily, but is linked to the mass ratio parameter. Other kinds of scalings are possible but this particular one leads to remarkably rich macroscopic models.

Finally, the force scale $F_0 = k_B T_0 / x_0$ corresponds to a force which induces energy changes of the order of unity over macroscopic distances and, therefore, small energy changes (of order $\varepsilon$) over the microscopic distance (the mean free paths $\ell_e = \ell_i$). In the case of the Lorentz force, the electric field scale is $E_0 = F_0 / q_e$. Here we suppose that the charge number of the ions $Z_i = q_i / q_e$ is of the order of unity, which means that the electric forces acting on electrons and ions have the same order of magnitude. If large charge numbers ($Z_i \sim 10$) are involved, a different scaling must be chosen [3]. The magnetic field scale will be chosen as $B_0 = E_0 / v_{i0}$. At this scale, the Laplace force (the magnetic component of the Lorentz force) is of order $O(1)$ for the ions. But since the electron velocity is much larger, the Laplace force is of order $O(\varepsilon^{-1})$ for the electrons.
Then, dimensionless time and space variables $\bar t$ and $\bar x$ can be introduced by $t = t_0 \bar t$, $x = x_0 \bar x$. Dimensionless force, electric and magnetic fields are defined similarly by letting $F_{e,i} = F_0 \bar F_{e,i}$, $E = E_0 \bar E$, $B = B_0 \bar B$. Performing these changes of variables in the Fokker–Planck equations (1.3) and (1.4), we obtain the scaled Fokker–Planck equations, which read (after dropping the bars since all quantities are now dimensionless):
$$\partial_t f_e + \frac{1}{\varepsilon} \left( v \cdot \nabla_x f_e - \left( E + \frac{1}{\varepsilon} v \times B \right) \cdot \nabla_v f_e \right) = \frac{1}{\varepsilon^2} \big( Q^P_{ee}(f_e, f_e) + Q^{P\varepsilon}_{ei}(f_e, f_i) \big), \tag{1.145}$$
$$\partial_t f_i + v \cdot \nabla_x f_i + Z_i (E + v \times B) \cdot \nabla_v f_i = \frac{1}{\varepsilon} \big( Q^{P\varepsilon}_{ie}(f_i, f_e) + Q^P_{ii}(f_i, f_i) \big). \tag{1.146}$$
We note that the Lorentz force acting on the electrons splits into two terms of different orders of magnitude (as a consequence of the remark above). This will be of importance in the asymptotic analysis below. For the Boltzmann equations of rarefied gas mixtures (1.1), (1.2) such a splitting does not occur. However, we easily recover the scaled form of the Boltzmann equations (1.1), (1.2) from the Fokker–Planck equations (1.145) and (1.146) by the following changes: $B \to 0$, $-E \to F_e$, $Z_i E \to F_i$, $Q^{P\varepsilon}_{\alpha\beta} \to Q^{B\varepsilon}_{\alpha\beta}$. Therefore, in the sequel, we shall mainly concentrate on the Fokker–Planck equations (1.145) and (1.146) and just mention the necessary changes to treat the case of the Boltzmann equations. In order to facilitate the conversion between the scaled and unscaled versions of the equations, we have summarized the scaling in Table 1.1.
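The scaling units of this section can be evaluated for concrete numbers. The Python sketch below is an illustration only: the function name, the dictionary keys and the sample plasma parameters (density, temperature, cross-section scale, a proton as the ion) are assumptions chosen for the example, not values taken from the text. It also recovers the powers of $\varepsilon$ that multiply the transport and collision terms in (1.145) and (1.146).

```python
import math

KB = 1.380649e-23  # Boltzmann constant in J/K


def scaling_units(n0, T0, sigma0, m_e, m_i):
    """Scaling units of Section 1.5 for given density, temperature and cross-section scales."""
    eps = math.sqrt(m_e / m_i)              # (1.129)
    v_e0 = math.sqrt(KB * T0 / m_e)         # electron kinetic velocity
    v_i0 = math.sqrt(KB * T0 / m_i)         # ion kinetic velocity, (1.130)
    nu_e0 = sigma0 * n0 * v_e0              # electron collision frequency
    nu_i0 = sigma0 * n0 * v_i0              # ion collision frequency
    mfp_e = v_e0 / nu_e0                    # electron mean free path
    mfp_i = v_i0 / nu_i0                    # ion mean free path
    t0 = 1.0 / (eps * nu_i0)                # time scale, (1.144)
    x0 = mfp_i / eps                        # space scale, (1.144)
    return {
        "eps": eps, "v_e0": v_e0, "v_i0": v_i0,
        "nu_e0": nu_e0, "nu_i0": nu_i0,
        "mfp_e": mfp_e, "mfp_i": mfp_i,
        "t0": t0, "x0": x0, "F0": KB * T0 / x0,
    }


# Hypothetical sample values: a hydrogen plasma at 10^4 K.
u = scaling_units(n0=1e18, T0=1e4, sigma0=1e-19,
                  m_e=9.1093837e-31, m_i=1.67262192e-27)

print("eps                  =", u["eps"])
print("electron transport   =", u["v_e0"] * u["t0"] / u["x0"], "vs 1/eps   =", 1 / u["eps"])
print("electron collisions  =", u["t0"] * u["nu_e0"], "vs 1/eps^2 =", 1 / u["eps"] ** 2)
print("ion collisions       =", u["t0"] * u["nu_i0"], "vs 1/eps   =", 1 / u["eps"])
```

The printed pairs should agree, which is one way to see where the factors $1/\varepsilon$ and $1/\varepsilon^2$ in the scaled equations come from.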
1.6 Expansion of the Interspecies Collision Operators

In this section, we expand the interspecies collision operators $Q^\varepsilon_{ei}(f_e, f_i)$ and $Q^\varepsilon_{ie}(f_i, f_e)$. We write:
$$Q^\varepsilon_{ei} = Q^0_{ei} + \varepsilon Q^1_{ei} + \cdots, \qquad Q^\varepsilon_{ie} = Q^0_{ie} + \varepsilon Q^1_{ie} + \cdots, \tag{1.147}$$
and compute the various terms of the expansion. In this section, we only give the result; the details of the computation can be found in Refs. [1, 2].

Table 1.1 Scaling units for the kinetic model

| Quantity | Scaling unit |
|---|---|
| Density | $n_0$ |
| Temperature | $T_0$ |
| Electron kinetic velocity | $v_{e0} = \sqrt{k_B T_0 / m_e}$ |
| Ion kinetic velocity | $v_{i0} = \sqrt{k_B T_0 / m_i}$ |
| Electron distribution function | $f_{e0} = n_0 / v_{e0}^3$ |
| Ion distribution function | $f_{i0} = n_0 / v_{i0}^3$ |
| Scattering cross-section | $\sigma_0$ |
| Electron collision frequency | $\nu_{e0} = n_0 \sigma_0 v_{e0}$ |
| Ion collision frequency | $\nu_{i0} = n_0 \sigma_0 v_{i0}$ |
| Electron collision operator | $Q_{e0} = \nu_{e0} f_{e0}$ |
| Ion collision operator | $Q_{i0} = \nu_{i0} f_{i0}$ |
| Time | $t_0 = \frac{1}{\varepsilon} \nu_{i0}^{-1} = \frac{1}{\varepsilon^2} \nu_{e0}^{-1}$ |
| Space | $x_0 = \frac{1}{\varepsilon} \ell_i = \frac{1}{\varepsilon} \ell_e = v_{i0} t_0 = \varepsilon v_{e0} t_0$ |
| Force | $F_0 = k_B T_0 / x_0$ |
| Electric field | $E_0 = F_0 / q_e$ |
| Magnetic field | $B_0 = E_0 / v_{i0}$ |

We recall that $\varepsilon = \sqrt{m_e / m_i}$.

We first treat the case of Boltzmann operators. We recall that $Q^{B\varepsilon}_{ei}$ is given by equation (1.133) with the collision rule (1.134), while $Q^{B\varepsilon}_{ie}$ is given by equation (1.135) with the collision rule (1.136). At leading order, the expansion of the electron–ion collision operator $Q^{B\varepsilon}_{ei}$ gives:
$$Q^{B0}_{ei}(f_e, f_i)(v) = n_i \int |v| \, \sigma^B_{ei}(|v|^2, \cos\theta) \, \big( f_e(|v| \omega) - f_e(v) \big) \, d\omega, \tag{1.148}$$
with $n_i = \int f_i \, dv$ the ion density and $\cos\theta = \omega \cdot v / |v|$. This operator describes the relaxation of the electron distribution function toward velocity-isotropic distribution functions about the zero average velocity.
It is often referred to as the Lorentz model. Indeed, it can be written:
$$Q^{B0}_{ei}(f_e, f_i)(v) = -\nu^B_{ei,T} \, \big( f_e - \langle f_e \rangle(|v|) \big), \tag{1.149}$$
with the electron–ion total collision frequency
$$\nu^B_{ei,T}(|v|) = n_i \, |v| \, \sigma^B_{ei,T}, \tag{1.150}$$
given in terms of the total scattering cross-section
$$\sigma^B_{ei,T} = 2\pi \int_0^\pi \sigma^B_{ei}(|v|^2, \cos\theta) \, \sin\theta \, d\theta, \tag{1.151}$$
and the angular average $\langle f_e \rangle$ of $f_e$ defined by:
$$\langle f_e \rangle(|v|) = \frac{1}{\sigma^B_{ei,T}} \int \sigma^B_{ei}(|v|^2, \cos\theta) \, f_e(|v| \omega) \, d\omega. \tag{1.152}$$
The ion distribution comes into the expression of $Q^{B0}_{ei}$ only through the density. This is a remarkable property, which can be explained as follows. At the electron velocity scale, the ion velocity is small (of order $\varepsilon$) and the ions can be viewed as steady. Because of their very small mass, the electrons bounce elastically off the steady ions, thereby relaxing their distribution function toward a velocity-isotropic one about the zero average velocity. There are no energy exchanges between the two species of particles at this order. We shall study this operator in more detail when studying the linearized operator $L_B$ (1.69).

We shall not need the expression of $Q^{B1}_{ei}(f_e, f_i)$ in its full generality, but only for isotropic functions $f_e$, i.e. when $f_e(v) = \phi(|v|^2/2)$ is a function of the kinetic energy only. In this case, we have
$$Q^{B1}_{ei}(f_e, f_i)(v) = -\nu^B_{ei,u} \, u_i \cdot \nabla_v f_e, \tag{1.153}$$
where $\nu^B_{ei,u}$ is the electron–ion collision frequency for momentum exchange
$$\nu^B_{ei,u}(|v|) = n_i \, |v| \, \sigma^B_{ei,u}, \tag{1.154}$$
given in terms of the scattering cross-section for momentum exchange
$$\sigma^B_{ei,u} = 2\pi \int_0^\pi \sigma^B_{ei}(|v|^2, \cos\theta) \, (1 - \cos\theta) \, \sin\theta \, d\theta, \tag{1.155}$$
and $u_i$ is the mean velocity of $f_i$, i.e. $n_i u_i = \int f_i \, v \, dv$. Again, we stress that this expression is not true if $f_e$ is not isotropic. In the isotropic case, we also have the following useful identity:
$$Q^{B1}_{ei}(f_e, f_i)(v) = Q^{B0}_{ei}(u_i \cdot \nabla_v f_e, f_i)(v), \tag{1.156}$$
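which relates the zeroth and first order terms of the expansion of $Q^{B\varepsilon}_{ei}$. Equations (1.153) and (1.156) are shown in Refs. [1, 2].

The leading and first order terms in the expansion of the ion–electron collision operator $Q^{B\varepsilon}_{ie}$ are given by:
$$Q^{B0}_{ie}(f_i, f_e)(v) = -\nabla_v f_i \cdot \left( \int (|v| \sigma^B_{ei,u}) \, f_e(v) \, v \, dv \right), \tag{1.157}$$
$$Q^{B1}_{ie}(f_i, f_e)(v) = -\nabla_v (v f_i) : \left( \int (|v| \sigma^B_{ei,u}) \, v \otimes \nabla_v f_e(v) \, dv \right) + D_v^2 f_i(v) : \left( \int (|v| \Sigma^B_{ei,D}) \, |v|^2 f_e(v) \, dv \right), \tag{1.158}$$
where the cross-section for momentum exchange is defined by (1.155) and the matrix cross section for diffusion $\Sigma^B_{ei,D}$ by:
$$\Sigma^B_{ei,D}(|v|^2, \omega) = \sigma^B_{ei,\parallel}(|v|^2) \, \frac{v}{|v|} \otimes \frac{v}{|v|} + \sigma^B_{ei,\perp}(|v|^2) \left( \mathrm{Id} - \frac{v}{|v|} \otimes \frac{v}{|v|} \right), \tag{1.159}$$
with the cross sections for parallel and transverse diffusion given by:
$$\sigma^B_{ei,\parallel}(|v|^2) = \pi \int_0^\pi \sigma^B_{ei}(|v|^2, \cos\theta) \, (1 - \cos\theta)^2 \sin\theta \, d\theta, \tag{1.160}$$
$$\sigma^B_{ei,\perp}(|v|^2) = \frac{\pi}{2} \int_0^\pi \sigma^B_{ei}(|v|^2, \cos\theta) \, \sin^3\theta \, d\theta. \tag{1.161}$$
We have denoted by $\nabla_v (v f_i)$ the tensor $(\partial_{v_k}(v_\ell f_i))_{k,\ell = 1, \ldots, 3}$, by $v \otimes \nabla_v f_e$ the tensor $v_k \partial_{v_\ell} f_e$, by $D_v^2 f_i$ the Hessian matrix $\partial^2_{v_k v_\ell} f_i$, and by the symbol ":" the contracted product of two tensors, $A : B = \sum_{k,\ell = 1}^{3} A_{k\ell} B_{k\ell}$. Elementary computations show that the various cross sections are related by:
$$\sigma^B_{ei,u} = \sigma^B_{ei,\parallel} + 2 \sigma^B_{ei,\perp}.$$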
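The relation between the momentum-exchange, parallel and transverse cross sections follows from the pointwise identity $\pi(1-\cos\theta)^2 + \pi \sin^2\theta = 2\pi(1-\cos\theta)$, and it can be confirmed by quadrature. The Python sketch below is an illustration under stated assumptions: the energy dependence of $\sigma^B_{ei}$ is suppressed, the anisotropic angular profile is an arbitrary made-up example, and the function names are hypothetical.

```python
import math


def theta_integral(integrand, n=20000):
    """Midpoint rule for the integral of integrand(theta) over [0, pi]."""
    h = math.pi / n
    return h * sum(integrand((k + 0.5) * h) for k in range(n))


def cross_sections(sigma):
    """Angular moments of a differential cross section sigma(cos(theta)).

    Returns the total (1.151), momentum-exchange (1.155), parallel (1.160)
    and transverse (1.161) cross sections at fixed energy.
    """
    s = lambda th: sigma(math.cos(th)) * math.sin(th)
    total = 2.0 * math.pi * theta_integral(s)
    momentum = 2.0 * math.pi * theta_integral(lambda th: s(th) * (1.0 - math.cos(th)))
    parallel = math.pi * theta_integral(lambda th: s(th) * (1.0 - math.cos(th)) ** 2)
    transverse = 0.5 * math.pi * theta_integral(lambda th: s(th) * math.sin(th) ** 2)
    return total, momentum, parallel, transverse


# Constant cross section (hard spheres) with sigma = 1.
iso = cross_sections(lambda c: 1.0)
# Arbitrary anisotropic profile, used only to exercise the identity.
aniso = cross_sections(lambda c: 1.0 + 0.5 * c + c * c)

print("isotropic  :", iso)
print("anisotropic:", aniso)
```

For a constant cross section the momentum-exchange value should equal $4\pi\sigma^B_{ei}$, in agreement with the collision frequency (1.100).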
The meaning of these formulas is as follows: at leading order, the electrons contribute to a force acting on the ions. This force is proportional to a velocity average of the distribution function, weighted by the total cross section for momentum exchange. Indeed, $Q^{B0}_{ie}$ can be written:
$$Q^{B0}_{ie}(f_i, f_e)(v) = -\nabla_v f_i \cdot A, \tag{1.162}$$
with the force:
$$A = \nu^B_{ie,u} \, v_e, \tag{1.163}$$
while the momentum transfer rate from electrons to ions $\nu^B_{ie,u}$ and the electron velocity average $v_e$ are respectively given by:
$$\nu^B_{ie,u} = \int (|v| \sigma^B_{ei,u}) \, f_e(v) \, dv, \qquad v_e = \frac{1}{\nu^B_{ie,u}} \int (|v| \sigma^B_{ei,u}) \, f_e(v) \, v \, dv. \tag{1.164}$$
At the next order, the force operator must be corrected by a term proportional to the velocity gradient of the electron distribution function (the first term in (1.158)). Using Green's formula, we notice that the matrix $\int (|v| \sigma^B_{ei,u}) \, v \otimes \nabla_v f_e(v) \, dv$ is a symmetric tensor. Then, the second contribution in (1.158) is a velocity diffusion operator $D : D_v^2 f_i$, with velocity diffusion tensor given by $D = \int (|v| \Sigma^B_{ei,D}) \, |v|^2 f_e(v) \, dv$. It is interesting to note the following property:
$$f_e \ \text{isotropic} \implies Q^{B0}_{ie}(f_i, f_e) = 0. \tag{1.165}$$
If additionally $f_e$ is a Maxwellian with zero mean velocity $M_{n_e, 0, T_e}$ (see (1.207)), then $Q^{B1}_{ie}$ reduces to the following Fokker–Planck operator:
$$Q^{B1}_{ie}(f_i, M_e) = \frac{1}{3} \nu^B_{ie,W} \big( T_e \Delta_v f_i + \nabla_v \cdot (v f_i) \big), \tag{1.166}$$
with the collision frequency
$$\nu^B_{ie,W} = \frac{1}{T_e} \left( \int |v|^3 \sigma^B_{ei,u} \, M_e \, dv \right), \tag{1.167}$$
and $M_e = M_{n_e, 0, T_e}$. We shall see later that $\nu^B_{ie,W}$ is the energy relaxation rate.
We now turn to the case of the Fokker–Planck–Landau operators. Again, the details of this computation are given in Refs. [1, 2]. At leading order, the electron–ion collision operator $Q^{P\varepsilon}_{ei}$ is given by:
$$Q^{P0}_{ei}(f_e, f_i)(v) = n_i \nabla_v \cdot \big( (|v|^3 \sigma^P_{ei}(|v|^2)) \, S(v) \nabla_v f_e \big). \tag{1.168}$$
Again, this operator describes a relaxation toward an isotropic distribution function. Indeed, if $f_e$ is a function of $|v|$ only, its gradient is proportional to $v$ and is in the null-space of the matrix $S$. Therefore, $Q^{P0}_{ei}$ vanishes on such functions. We can show [83] that this operator applied to any function $f_e$ decreases the distance of $f_e$ to velocity-isotropic distributions, which is what a relaxation operator does. As in the Boltzmann case, we only give the expression of $Q^{P1}_{ei}(f_e, f_i)$ for isotropic functions $f_e(v) = \phi(|v|^2/2)$, and in this case we have
$$Q^{P1}_{ei}(f_e, f_i)(v) = -\nu^P_{ei,u} \, u_i \cdot \nabla_v f_e, \tag{1.169}$$
where the electron–ion collision frequency is given by
$$\nu^P_{ei,u}(|v|) = n_i \, |v| \, \sigma^P_{ei,u}, \qquad \sigma^P_{ei,u} = 2 \sigma^P_{ei}. \tag{1.170}$$
Like (1.156), we also have:
$$Q^{P1}_{ei}(f_e, f_i)(v) = Q^{P0}_{ei}(u_i \cdot \nabla_v f_e, f_i)(v). \tag{1.171}$$
These identities are shown in Refs. [1, 2]. The ion–electron collision operator $Q^{P\varepsilon}_{ie}$ gives at leading and first orders:
$$Q^{P0}_{ie}(f_i, f_e)(v) = -\nabla_v f_i \cdot \left( \int (|v|^3 \sigma^P_{ei}) \, S(v) \nabla_v f_e(v) \, dv \right), \tag{1.172}$$
$$Q^{P1}_{ie}(f_i, f_e)(v) = -\nabla_v (v f_i) : \left( \int (|v|^3 \sigma^P_{ei}) \, S(v) D^2 f_e(v) \, dv \right) + D^2 f_i(v) : \left( \int (|v|^3 \sigma^P_{ei}) \, S(v) f_e(v) \, dv \right). \tag{1.173}$$
We have denoted by $S(v) D^2 f_e(v)$ the usual product of the two matrices $S(v)$ and $D^2 f_e(v)$. Again, the leading order ion–electron collision operator $Q^{P0}_{ie}$ is of the form (1.162), where now the force $A$ is given by:
$$A = \int (|v|^3 \sigma^P_{ei}) \, S(v) \nabla_v f_e(v) \, dv. \tag{1.174}$$
Using Green's formula and
$$\nabla_v \cdot S(v) = -2 \frac{v}{|v|^2}, \tag{1.175}$$
$A$ is equivalently written
$$A = \nu^P_{ie,u} \, v_e, \tag{1.176}$$
$$\nu^P_{ie,u} = \int (|v| \sigma^P_{ei,u}) \, f_e(v) \, dv, \qquad v_e = \frac{1}{\nu^P_{ie,u}} \int (|v| \sigma^P_{ei,u}) \, f_e(v) \, v \, dv. \tag{1.177}$$
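Identity (1.175) is easy to confirm numerically. The Python sketch below assumes that $S(v)$ is the projection onto the plane orthogonal to $v$, $S(v) = \mathrm{Id} - v \otimes v / |v|^2$; this definition is not restated in the present section and is used here as an assumption consistent with the null-space property mentioned after (1.168). The divergence is approximated by central finite differences, and the function names are hypothetical.

```python
import math


def projector(v):
    """Assumed form S(v) = Id - (v x v)/|v|^2, the projection orthogonal to v."""
    r2 = sum(c * c for c in v)
    return [[(1.0 if i == j else 0.0) - v[i] * v[j] / r2 for j in range(3)]
            for i in range(3)]


def divergence_of_projector(v, h=1e-5):
    """Central-difference approximation of (div S)_j = sum_i d S_ij / d v_i."""
    div = [0.0, 0.0, 0.0]
    for i in range(3):
        vp, vm = list(v), list(v)
        vp[i] += h
        vm[i] -= h
        sp, sm = projector(vp), projector(vm)
        for j in range(3):
            div[j] += (sp[i][j] - sm[i][j]) / (2.0 * h)
    return div


v = [0.3, -1.2, 0.7]  # arbitrary non-zero test velocity
r2 = sum(c * c for c in v)
numeric = divergence_of_projector(v)
exact = [-2.0 * c / r2 for c in v]            # right-hand side of (1.175)
s_times_v = [sum(projector(v)[i][j] * v[j] for j in range(3)) for i in range(3)]

print("numeric div S:", numeric)
print("-2 v / |v|^2 :", exact)
print("S(v) v       :", s_times_v)
```

The last line illustrates the null-space property: gradients of isotropic functions are parallel to $v$ and are annihilated by $S(v)$.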
We note the strong analogy with (1.164). The next order again splits into two terms. The first one is a correction term to the acceleration operator. By using Green's formula and (1.175), this operator can also be written
$$-\nabla_v (v f_i) : \int (|v| \sigma^P_{ei,u}) \, v \otimes \nabla_v f_e(v) \, dv,$$
in analogy with (1.158). The second one is a velocity diffusion operator, the diffusion coefficient of which is given by
$$D = \int (|v|^3 \sigma^P_{ei}) \, S(v) f_e(v) \, dv,$$
i.e. is an average of the velocity projection operator $S$ weighted by the distribution function and the scattering cross section. Again, we note that
$$f_e \ \text{isotropic} \implies Q^{P0}_{ie}(f_i, f_e) = 0. \tag{1.178}$$
If additionally $f_e$ is a Maxwellian with zero mean velocity $M_{n_e, 0, T_e}$ $(= M_e)$ (see (1.207)), then $Q^{P1}_{ie}$ has the following Fokker–Planck form:
$$Q^{P1}_{ie}(f_i, M_e) = \frac{1}{3} \nu^P_{ie,W} \big( T_e \Delta_v f_i + \nabla_v \cdot (v f_i) \big), \tag{1.179}$$
with the collision frequency
$$\nu^P_{ie,W} = \frac{1}{T_e} \left( \int |v|^3 \sigma^P_{ei,u} \, M_e \, dv \right). \tag{1.180}$$
1.7 Moment Method and Conservation Laws for Plasmas

In this section, we revisit the moment method and LTE closure in view of the new scaling (1.145) and (1.146). We start by investigating the conservation properties of the expanded collision operators, which can be deduced from those of the original ones (properties (i)–(v) of Section 1.3.1).

1.7.1 Properties of the expanded collision operators

Obviously, the properties of the scaled intra-particle collision operators $Q_{ee}$ and $Q_{ii}$ are the same as those of the unscaled versions. Therefore, we focus on the inter-particle collision operators $Q_{ei}^\varepsilon$ and $Q_{ie}^\varepsilon$. From (1.19)–(1.21), we deduce:

(i) Mass conservation:
$$\int Q_{ei}^k\, dv = 0, \qquad \int Q_{ie}^k\, dv = 0, \qquad k = 0, 1, \ldots. \tag{1.181}$$

(ii) Momentum conservation:
$$\int (Q_{ei}^k + Q_{ie}^k)\, v\, dv = 0, \qquad k = 0, 1, \ldots. \tag{1.182}$$

(iii) Energy conservation:
$$\int Q_{ei}^0\, |v|^2\, dv = 0, \tag{1.183}$$
$$\int (Q_{ei}^k + Q_{ie}^{k-1})\, |v|^2\, dv = 0, \qquad k = 1, \ldots. \tag{1.184}$$
Equations (1.181) are directly deduced from equation (1.19). The scaling of equations (1.20) and (1.21) leads to
$$\int (Q_{ei}^\varepsilon + Q_{ie}^\varepsilon)\, v\, dv = 0, \qquad \int (Q_{ei}^\varepsilon + \varepsilon Q_{ie}^\varepsilon)\, |v|^2\, dv = 0, \tag{1.185}$$
which, in the limit $\varepsilon \to 0$, leads to (1.182) and (1.184).

(iv) Entropy inequality:
$$H_{ei}^0(f_e, f_i) := \int Q_{ei}^0 \ln f_e\, dv \le 0. \tag{1.186}$$
Inequality (1.186) is found by scaling inequality (1.23), which leads to
$$\int (Q_{ei}^\varepsilon \ln f_e + \varepsilon Q_{ie}^\varepsilon \ln f_i)\, dv \le 0,$$
and, in the limit $\varepsilon \to 0$, to (1.186). Since we are dealing with inequalities, it is impossible to deduce any information for higher order terms.

It is interesting to note that properties of $Q_{ei}^0$ such as (1.181), (1.183) or (1.186) can be obtained directly from its expressions (1.148) or (1.168). Indeed, in the Boltzmann case, multiplying (1.148) by an arbitrary test function $\psi(v)$, integrating over $v$ and using spherical coordinates $(|v|, \omega = v/|v|)$ leads to:
$$\int Q_{ei}^{B0}(f_e, f_i)\, \psi\, dv = -\frac{n_i}{2} \int (|v| \sigma_{ei}^B(|v|^2, \omega \cdot \omega'))\, \{f_e(|v|\omega') - f_e(|v|\omega)\}\, \{\psi(|v|\omega') - \psi(|v|\omega)\}\, d\omega\, d\omega'\, |v|^2\, d|v| \tag{1.187}$$
$$= n_i \int (|v| \sigma_{ei}^B(|v|^2, \omega \cdot \omega'))\, f_e(|v|\omega)\, \{\psi(|v|\omega') - \psi(|v|\omega)\}\, d\omega\, d\omega'\, |v|^2\, d|v|. \tag{1.188}$$
It is readily seen that
$$\int Q_{ei}^{B0}(f_e, f_i)\, \psi\, dv = 0$$
for all $f_e$, $f_i$ if and only if $\psi$ does not depend on $\omega$, i.e. $\psi = \psi(|v|)$. In particular, the result applies for $\psi = 1$ or $\psi = |v|^2$, which leads to (1.181) (for $Q_{ei}^0$) and to (1.183). Furthermore, letting $\psi = \ln f_e$ and using that $\ln$ is a non-decreasing function, we get (1.186). In the Fokker–Planck case, we similarly have, using (1.168),
$$\int Q_{ei}^{P0}(f_e, f_i)\, \psi\, dv = -n_i \int (|v|^3 \sigma_{ei}^P(|v|^2))\, (\nabla_v \psi)^T S(v) \nabla_v f_e\, dv. \tag{1.189}$$
Again, this expression vanishes for all $f_e$, $f_i$ if and only if $\nabla_v \psi$ is proportional to $v$, i.e. if $\psi = \psi(|v|)$ does not depend on $\omega$. With $\psi = \ln f_e$, it can be written:
$$\int Q_{ei}^{P0}(f_e, f_i) \ln f_e\, dv = -n_i \int (|v|^3 \sigma_{ei}^P)\, f_e\, (\nabla_v \ln f_e)^T S(v) \nabla_v \ln f_e\, dv \le 0, \tag{1.190}$$
because $S(v)$ is a non-negative matrix.
(v) Equilibria: Let $f_e$ be such that $Q_{ei}^0(f_e, f_i) = 0$ for some non-zero $f_i$. Then $f_e = f_e(|v|)$ only depends on the magnitude $|v|$ of the velocity. Indeed, if $Q_{ei}^0(f_e, f_i) = 0$, then $H_{ei}^0(f_e, f_i) = 0$. In the case of the Boltzmann operator, using (1.187) (with $\psi = \ln f_e$), we get that $H_{ei}^0(f_e, f_i)$ is given by an integral of a non-positive quantity. Therefore, the quantity inside the integral is identically zero, which means that $f_e(|v|\omega') - f_e(|v|\omega) = 0$ for all $\omega$ and $\omega'$. This means that $f_e(|v|\omega)$ is independent of $\omega$ and is therefore a function of $|v|$ only. In the case of the Fokker–Planck operator, we pursue the same reasoning by remarking that the quantity inside the integral (1.190) must be identically zero. This means that $\nabla_v f_e$ belongs to the null-space of the matrix $S(v)$, i.e. is everywhere proportional to $v$. It is readily seen that this is equivalent to saying that $f_e$ only depends on $|v|$.
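1.7.2 Moments and conservation laws for the scaled kinetic model

Our goal is now to compute the velocity moments and find the conservation laws associated with the scaled Fokker–Planck equation (1.145), (1.146). We can rewrite these equations by using the expansions of the collision operators given in Section 1.6 according to:
$$\partial_t f_e^\varepsilon + \frac{1}{\varepsilon} \Big( v \cdot \nabla_x f_e^\varepsilon - \Big(E + \frac{1}{\varepsilon}\, v \times B\Big) \cdot \nabla_v f_e^\varepsilon \Big) = \frac{1}{\varepsilon^2} \big( Q_{ee}(f_e^\varepsilon, f_e^\varepsilon) + Q_{ei}^0(f_e^\varepsilon, f_i^\varepsilon) \big) + \frac{1}{\varepsilon}\, Q_{ei}^1(f_e^\varepsilon, f_i^\varepsilon) + Q_{ei}^2(f_e^\varepsilon, f_i^\varepsilon) + O(\varepsilon), \tag{1.191}$$
$$\partial_t f_i^\varepsilon + v \cdot \nabla_x f_i^\varepsilon + Z_i (E + v \times B) \cdot \nabla_v f_i^\varepsilon = \frac{1}{\varepsilon} \big( Q_{ie}^0(f_i^\varepsilon, f_e^\varepsilon) + Q_{ii}(f_i^\varepsilon, f_i^\varepsilon) \big) + Q_{ie}^1(f_i^\varepsilon, f_e^\varepsilon) + O(\varepsilon). \tag{1.192}$$
Beforehand, we need to scale the macroscopic quantities. The choice of scales follows directly from that of the microscopic quantities: we use $n_0$ and $W_0 = n_0 k_B T_0$ as scaling units for the number and energy densities of both species. The electron and ion mean velocities are scaled by $v_{e0}$ and $v_{i0}$. With this scaling, equation (1.27), which defines the macroscopic quantities, now reads:
$$\begin{pmatrix} n_\alpha^\varepsilon \\ n_\alpha^\varepsilon u_\alpha^\varepsilon \\ W_\alpha^\varepsilon \end{pmatrix} = \int f_\alpha^\varepsilon(v) \begin{pmatrix} 1 \\ v \\ \tfrac{1}{2} |v|^2 \end{pmatrix} dv, \qquad \alpha = i, e. \tag{1.193}$$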
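Similarly, we scale the thermal energies $w_\alpha$ and the pressure tensors $P_\alpha$ (for $\alpha = i, e$) by $W_0 = n_0 k_B T_0$ and the heat flux vectors for electrons and ions by $Q_{e0} = n_0 k_B T_0 v_{e0}$ and $Q_{i0} = n_0 k_B T_0 v_{i0}$, respectively. We get the following formulas:
$$w_\alpha^\varepsilon = \frac{1}{2} \int f_\alpha^\varepsilon(v)\, |v - u_\alpha^\varepsilon|^2\, dv, \qquad W_\alpha^\varepsilon = \frac{1}{2}\, n_\alpha^\varepsilon |u_\alpha^\varepsilon|^2 + w_\alpha^\varepsilon, \tag{1.194}$$
$$\begin{pmatrix} P_\alpha^\varepsilon \\ Q_\alpha^\varepsilon \end{pmatrix} = \int f_\alpha^\varepsilon(v) \begin{pmatrix} (v - u_\alpha^\varepsilon) \otimes (v - u_\alpha^\varepsilon) \\ \tfrac{1}{2} (v - u_\alpha^\varepsilon)\, |v - u_\alpha^\varepsilon|^2 \end{pmatrix} dv. \tag{1.195}$$
Finally, the momentum and energy transfer rates (1.34) are scaled as follows: the electron momentum transfer rate is scaled by $S_{e0} = \nu_{e0} m_e n_0 v_{e0}$, which corresponds to the electron momentum density $m_e n_0 v_{e0}$ multiplied by the electron collision frequency $\nu_{e0}$. Similarly, the ion momentum transfer rate is scaled by $S_{i0} = \nu_{i0} m_i n_0 v_{i0}$ and the energy transfer rates by $U_{e0} = \nu_{e0} n_0 k_B T_0$ and $U_{i0} = \nu_{i0} n_0 k_B T_0$. In scaled units, the relations between the transfer rates and the collision operators are written:
$$\begin{pmatrix} S_{\alpha\beta}^\varepsilon(f_\alpha, f_\beta) \\ U_{\alpha\beta}^\varepsilon(f_\alpha, f_\beta) \end{pmatrix} = \int Q_{\alpha\beta}^\varepsilon(f_\alpha, f_\beta) \begin{pmatrix} v \\ \tfrac{1}{2} |v|^2 \end{pmatrix} dv, \qquad \alpha = i, e. \tag{1.196}$$
We note that, because of (1.185), we have
$$S_{ei}^\varepsilon + S_{ie}^\varepsilon = 0, \qquad U_{ei}^\varepsilon + \varepsilon U_{ie}^\varepsilon = 0. \tag{1.197}$$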
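We summarize the scaling units for the macroscopic quantities in Table 1.2. Now, taking the moments of (1.191) and (1.192) leads to the following system:
$$\partial_t n_e^\varepsilon + \frac{1}{\varepsilon} \nabla_x \cdot (n_e^\varepsilon u_e^\varepsilon) = 0, \tag{1.198}$$
$$\partial_t (n_e^\varepsilon u_e^\varepsilon) + \frac{1}{\varepsilon} \nabla_x (n_e^\varepsilon u_e^\varepsilon \otimes u_e^\varepsilon) + \frac{1}{\varepsilon} \nabla_x P_e^\varepsilon + \frac{1}{\varepsilon}\, n_e^\varepsilon \Big( E + \frac{1}{\varepsilon}\, u_e^\varepsilon \times B \Big) = \frac{1}{\varepsilon^2}\, S_{ei}^\varepsilon(f_e^\varepsilon, f_i^\varepsilon), \tag{1.199}$$
$$\partial_t W_e^\varepsilon + \frac{1}{\varepsilon} \nabla_x \cdot (W_e^\varepsilon u_e^\varepsilon + P_e^\varepsilon u_e^\varepsilon + Q_e^\varepsilon) + \frac{1}{\varepsilon}\, n_e^\varepsilon E \cdot u_e^\varepsilon = \frac{1}{\varepsilon^2}\, U_{ei}^\varepsilon(f_e^\varepsilon, f_i^\varepsilon), \tag{1.200}$$
and similarly for the ions:
$$\partial_t n_i^\varepsilon + \nabla_x \cdot (n_i^\varepsilon u_i^\varepsilon) = 0, \tag{1.201}$$
$$\partial_t (n_i^\varepsilon u_i^\varepsilon) + \nabla_x (n_i^\varepsilon u_i^\varepsilon \otimes u_i^\varepsilon) + \nabla_x P_i^\varepsilon - Z_i n_i^\varepsilon (E + u_i^\varepsilon \times B) = \frac{1}{\varepsilon}\, S_{ie}^\varepsilon(f_i^\varepsilon, f_e^\varepsilon), \tag{1.202}$$
$$\partial_t W_i^\varepsilon + \nabla_x \cdot (W_i^\varepsilon u_i^\varepsilon + P_i^\varepsilon u_i^\varepsilon + Q_i^\varepsilon) - Z_i n_i^\varepsilon E \cdot u_i^\varepsilon = \frac{1}{\varepsilon}\, U_{ie}^\varepsilon(f_i^\varepsilon, f_e^\varepsilon). \tag{1.203}$$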
Table 1.2 Scaling units for the macroscopic quantities

| Quantity | Scaling unit |
| --- | --- |
| Electron mean velocity | $v_{e0}$ |
| Ion mean velocity | $v_{i0}$ |
| Total energy densities for both species | $W_0 = n_0 k_B T_0$ |
| Thermal energy densities for both species | $W_0$ |
| Pressure tensors for both species | $W_0$ |
| Electron heat flux vector | $Q_{e0} = n_0 k_B T_0 v_{e0}$ |
| Ion heat flux vector | $Q_{i0} = n_0 k_B T_0 v_{i0}$ |
| Electron momentum transfer rate | $S_{e0} = \nu_{e0} m_e n_0 v_{e0}$ |
| Ion momentum transfer rate | $S_{i0} = \nu_{i0} m_i n_0 v_{i0}$ |
| Electron energy transfer rate | $U_{e0} = \nu_{e0} n_0 k_B T_0$ |
| Ion energy transfer rate | $U_{i0} = \nu_{i0} n_0 k_B T_0$ |
We shall call this system the "plasma moment system". For the ions, we recognize the same scaling as (1.46)–(1.48). For the electrons, however, the scaling differs from that of (1.43)–(1.45) and carries the mark of that of the kinetic equation (1.191). We now look for the limit $\varepsilon \to 0$ of system (1.198)–(1.203). To this aim, we shall perform the limit $\varepsilon \to 0$ in the kinetic system (1.191) and (1.192).
1.7.3 Closure of the plasma moment system

We introduce the formal expansion of the solution $(f_e^\varepsilon, f_i^\varepsilon)$ of (1.191), (1.192) in powers of $\varepsilon$ (Hilbert expansion):
$$f_e^\varepsilon = f_e^0 + \varepsilon f_e^1 + O(\varepsilon^2), \qquad f_i^\varepsilon = f_i^0 + O(\varepsilon). \tag{1.204}$$
Introducing (1.204) in (1.191), (1.192) and keeping the leading order terms in $\varepsilon$ leads to:
$$(v \times B) \cdot \nabla_v f_e^0 + Q_{ee}(f_e^0, f_e^0) + Q_{ei}^0(f_e^0, f_i^0) = 0, \tag{1.205}$$
$$Q_{ie}^0(f_i^0, f_e^0) + Q_{ii}(f_i^0, f_i^0) = 0. \tag{1.206}$$
We first consider (1.205) and refer to (1.148) or (1.168) for the expressions of $Q_{ei}^0$ in the cases of the Boltzmann and Fokker–Planck operators. We claim that the solution of (1.205) is a Maxwellian with zero mean velocity:
$$f_e^0 = M_{n_e,0,T_e} = \frac{n_e}{(2\pi T_e)^{3/2}} \exp\Big( -\frac{|v|^2}{2 T_e} \Big), \tag{1.207}$$
with density $n_e$ and temperature $T_e$. Here, we have written the Maxwellian in scaled units using Tables 1.1 and 1.2. Indeed, multiplying (1.205) by $\ln f_e^0$ and integrating with respect to velocity leads to
$$H_e(f_e^0) + H_{ei}^0(f_e^0, f_i^0) = 0, \tag{1.208}$$
where $H_e(f_e^0)$ is the entropy dissipation rate of the electron–electron collision operator (1.22) and $H_{ei}^0(f_e^0, f_i^0)$ is that of the leading order electron–ion collision operator (1.186). The term arising from the magnetic field operator vanishes. Indeed, using that $\nabla_v \cdot (v \times B) = 0$ and Green's formula, we have, for $f_e$ decaying fast enough at infinity:
$$\int (v \times B) \cdot (\nabla_v f_e) \ln f_e\, dv = \int \nabla_v \cdot ((v \times B) f_e) \ln f_e\, dv = -\int ((v \times B) f_e) \cdot \nabla_v (\ln f_e)\, dv = -\int \nabla_v \cdot ((v \times B) f_e)\, dv = 0.$$
Now, each of the quantities $H_e(f_e^0)$ and $H_{ei}^0(f_e^0, f_i^0)$ is non-positive, by (1.22) and (1.186), and their sum is equal to zero. Therefore, they are separately equal to zero. That $H_e(f_e^0) = 0$ implies that $f_e^0$ is a Maxwellian $f_e^0 = M_{n_e,u_e,T_e}$ (see Section 1.3.1). That $H_{ei}^0(f_e^0, f_i^0) = 0$ means that $f_e^0$ is a function of $|v|$ only (see Section 1.7.1). Combining these two properties, we see that $f_e^0$ is a Maxwellian with zero mean velocity.

Now, we consider (1.206). We note the following property: if $f_e = f_e(|v|)$ is an isotropic function, then $Q_{ie}^0(f_i, f_e) = 0$. Indeed, $Q_{ie}^0$ is given by (1.162), with the force $A$ given by (1.163) or (1.174). It is readily seen that if $f_e = f_e(|v|)$, then $A = 0$ by antisymmetry. In particular, we have $Q_{ie}^0(f_i^0, f_e^0) = 0$ because $f_e^0$ given by (1.207) is isotropic. Then (1.206) reduces to $Q_{ii}(f_i^0, f_i^0) = 0$. We know from Section 1.3.1 that this implies that $f_i^0$ is a Maxwellian (in scaled units):
$$f_i^0 = M_{n_i,u_i,T_i} = \frac{n_i}{(2\pi T_i)^{3/2}} \exp\Big( -\frac{|v - u_i|^2}{2 T_i} \Big), \tag{1.209}$$
with associated density $n_i$, mean velocity $u_i$ and temperature $T_i$. There is no constraint that the ion and electron velocities or temperatures be equal. This is a consequence of our scaling using the smallness of the electron to ion mass ratio.
We now see how the information contained in (1.207) and (1.209) translates into the moment system (1.198)–(1.203). For that purpose, we perform a Hilbert expansion of the macroscopic quantities:
$$n_e^\varepsilon = n_e^0 + \varepsilon n_e^1 + O(\varepsilon^2), \qquad u_e^\varepsilon = u_e^0 + \varepsilon u_e^1 + O(\varepsilon^2), \qquad \ldots, \tag{1.210}$$
and similarly for the other variables and for the ion quantities. By doing so, we note that
$$n_e^0 = n_e, \qquad u_e^0 = 0, \qquad W_e^0 = w_e^0 = \frac{3}{2}\, n_e k_B T_e, \tag{1.211}$$
$$n_i^0 = n_i, \qquad u_i^0 = u_i, \qquad W_i^0 = \frac{1}{2}\, n_i |u_i|^2 + w_i^0, \qquad w_i^0 = \frac{3}{2}\, n_i k_B T_i, \tag{1.212}$$
where $n_e$, $T_e$, $n_i$, $u_i$, $T_i$ are the parameters of the Maxwellians (1.207) and (1.209). Similarly, we have:
$$P_e^0 = n_e k_B T_e\, \mathrm{Id}, \qquad P_i^0 = n_i k_B T_i\, \mathrm{Id}, \tag{1.213}$$
$$Q_e^0 = 0, \qquad Q_i^0 = 0. \tag{1.214}$$
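The leading-order moments (1.211) and (1.213) of the zero-mean Maxwellian (1.207) can be checked by direct quadrature. The following is only a sketch: it works in scaled units with $k_B$ set to 1 (an assumption of this example), uses hypothetical sample values for $n_e$ and $T_e$, and integrates radially with a midpoint rule whose cutoff and resolution are arbitrary.

```python
import math

def maxwellian(r, n, T):
    """Isotropic Maxwellian (1.207) at speed r = |v|, scaled units, k_B = 1."""
    return n / (2.0 * math.pi * T) ** 1.5 * math.exp(-r * r / (2.0 * T))

def radial_moment(power, n, T, rmax=12.0, steps=20000):
    """Approximate int |v|^power M dv = 4*pi * int_0^inf r^(2+power) M(r) dr."""
    cutoff = rmax * math.sqrt(T)       # integrate out to 12 thermal speeds
    h = cutoff / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * h              # midpoint rule
        total += r ** (2 + power) * maxwellian(r, n, T)
    return 4.0 * math.pi * total * h

n_e, T_e = 1.7, 0.8                    # hypothetical sample values
density = radial_moment(0, n_e, T_e)             # expected: n_e
energy = 0.5 * radial_moment(2, n_e, T_e)        # expected: (3/2) n_e T_e
pressure = radial_moment(2, n_e, T_e) / 3.0      # expected: n_e T_e
```

The mean velocity and the heat flux vanish by symmetry for an isotropic function, which is why only even moments are computed here.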
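Using that $u_e^0 = 0$ and $Q_e^0 = 0$, we can rescale the electron velocity and heat flux and define:
$$\bar u_e^\varepsilon = \frac{1}{\varepsilon}\, u_e^\varepsilon = u_e^1 + O(\varepsilon), \qquad u_e^1 = \frac{1}{n_e} \int f_e^1\, v\, dv, \tag{1.215}$$
and
$$\bar Q_e^\varepsilon = \frac{1}{\varepsilon}\, Q_e^\varepsilon = Q_e^1 + O(\varepsilon), \qquad Q_e^1 = \frac{1}{2} \int f_e^1\, v\, |v|^2\, dv - \frac{5}{2}\, n_e k_B T_e\, u_e^1. \tag{1.216}$$
This last expression is found by replacing $u_e^\varepsilon$ by $\varepsilon \bar u_e^\varepsilon$ in formula (1.195) for $Q_e^\varepsilon$.

We now turn to the collision terms. Following the development of the collision operators $Q_{\alpha\beta}^\varepsilon$, we develop the relaxation operators:
$$S_{\alpha\beta}^\varepsilon = S_{\alpha\beta}^0 + \varepsilon S_{\alpha\beta}^1 + O(\varepsilon^2), \qquad U_{\alpha\beta}^\varepsilon = U_{\alpha\beta}^0 + \varepsilon U_{\alpha\beta}^1 + O(\varepsilon^2),$$
$$S_{\alpha\beta}^k = \int Q_{\alpha\beta}^k\, v\, dv, \qquad U_{\alpha\beta}^k = \int Q_{\alpha\beta}^k\, \frac{|v|^2}{2}\, dv.$$
Now, we introduce $S_{\alpha\beta}^\varepsilon = S_{\alpha\beta}^\varepsilon(f_\alpha^\varepsilon, f_\beta^\varepsilon)$, $U_{\alpha\beta}^\varepsilon = U_{\alpha\beta}^\varepsilon(f_\alpha^\varepsilon, f_\beta^\varepsilon)$. We develop:
$$S_{\alpha\beta}^\varepsilon = S_{\alpha\beta}^0 + \varepsilon S_{\alpha\beta}^1 + O(\varepsilon^2), \qquad U_{\alpha\beta}^\varepsilon = U_{\alpha\beta}^0 + \varepsilon U_{\alpha\beta}^1 + O(\varepsilon^2).$$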
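For instance,
$$S_{\alpha\beta}^0 = S_{\alpha\beta}^0(f_\alpha^0, f_\beta^0), \qquad S_{\alpha\beta}^1 = S_{\alpha\beta}^1(f_\alpha^0, f_\beta^0) + S_{\alpha\beta}^0(f_\alpha^1, f_\beta^0) + S_{\alpha\beta}^0(f_\alpha^0, f_\beta^1),$$
and so on. From the fact that $Q_{ie}^0(f_i^0, f_e^0) = 0$, we have:
$$S_{ie}^0 = 0, \qquad U_{ie}^0 = 0. \tag{1.217}$$
Therefore, we can introduce
$$\bar S_{ie}^\varepsilon = \frac{1}{\varepsilon}\, S_{ie}^\varepsilon = S_{ie}^1 + O(\varepsilon), \qquad \bar U_{ie}^\varepsilon = \frac{1}{\varepsilon}\, U_{ie}^\varepsilon = U_{ie}^1 + O(\varepsilon). \tag{1.218}$$
Now, because of (1.197), we can also define:
$$\bar S_{ei}^\varepsilon = \frac{1}{\varepsilon}\, S_{ei}^\varepsilon = -\bar S_{ie}^\varepsilon = -S_{ie}^1 + O(\varepsilon), \tag{1.219}$$
$$\bar U_{ei}^\varepsilon = \frac{1}{\varepsilon^2}\, U_{ei}^\varepsilon = -\bar U_{ie}^\varepsilon = -U_{ie}^1 + O(\varepsilon). \tag{1.220}$$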
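In particular, this implies that $S_{ei}^0 = 0$, $U_{ei}^0 = U_{ei}^1 = 0$. The first two equalities could also have been deduced from $Q_{ei}^0(f_e^0, f_i^0) = 0$, but not the last one. Introducing the newly defined "barred" variables into (1.198)–(1.203) leads to:
$$\partial_t n_e^\varepsilon + \nabla_x \cdot (n_e^\varepsilon \bar u_e^\varepsilon) = 0, \tag{1.221}$$
$$\varepsilon^2 \big( \partial_t (n_e^\varepsilon \bar u_e^\varepsilon) + \nabla_x (n_e^\varepsilon \bar u_e^\varepsilon \otimes \bar u_e^\varepsilon) \big) + \nabla_x P_e^\varepsilon + n_e^\varepsilon (E + \bar u_e^\varepsilon \times B) = \bar S_{ei}^\varepsilon, \tag{1.222}$$
$$\partial_t W_e^\varepsilon + \nabla_x \cdot (W_e^\varepsilon \bar u_e^\varepsilon + P_e^\varepsilon \bar u_e^\varepsilon + \bar Q_e^\varepsilon) + n_e^\varepsilon E \cdot \bar u_e^\varepsilon = \bar U_{ei}^\varepsilon, \tag{1.223}$$
and:
$$\partial_t n_i^\varepsilon + \nabla_x \cdot (n_i^\varepsilon u_i^\varepsilon) = 0, \tag{1.224}$$
$$\partial_t (n_i^\varepsilon u_i^\varepsilon) + \nabla_x (n_i^\varepsilon u_i^\varepsilon \otimes u_i^\varepsilon) + \nabla_x P_i^\varepsilon - Z_i n_i^\varepsilon (E + u_i^\varepsilon \times B) = \bar S_{ie}^\varepsilon, \tag{1.225}$$
$$\partial_t W_i^\varepsilon + \nabla_x \cdot (W_i^\varepsilon u_i^\varepsilon + P_i^\varepsilon u_i^\varepsilon + Q_i^\varepsilon) - Z_i n_i^\varepsilon E \cdot u_i^\varepsilon = \bar U_{ie}^\varepsilon, \tag{1.226}$$
with
$$\bar S_{ei}^\varepsilon + \bar S_{ie}^\varepsilon = 0, \qquad \bar U_{ei}^\varepsilon + \bar U_{ie}^\varepsilon = 0. \tag{1.227}$$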
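Now, the limit $\varepsilon \to 0$ of this system is readily taken and leads to
$$\partial_t n_e + \nabla_x \cdot (n_e u_e) = 0, \tag{1.228}$$
$$\nabla_x p_e + n_e (E + u_e \times B) = S_{ei}, \tag{1.229}$$
$$\partial_t W_e + \nabla_x \cdot (W_e u_e + p_e u_e + Q_e) + n_e E \cdot u_e = U_{ei}, \tag{1.230}$$
and:
$$\partial_t n_i + \nabla_x \cdot (n_i u_i) = 0, \tag{1.231}$$
$$\partial_t (n_i u_i) + \nabla_x (n_i u_i \otimes u_i) + \nabla_x p_i - Z_i n_i (E + u_i \times B) = S_{ie}, \tag{1.232}$$
$$\partial_t W_i + \nabla_x \cdot (W_i u_i + p_i u_i) - Z_i n_i E \cdot u_i = U_{ie}, \tag{1.233}$$
with
$$S_{ei} + S_{ie} = 0, \qquad U_{ei} + U_{ie} = 0. \tag{1.234}$$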
We have introduced the scalar pressures $p_e = n_e T_e$, $p_i = n_i T_i$ and we have made the following identifications: $u_e = u_e^1$, $Q_e = Q_e^1$, $S_{ie} = -S_{ei} = S_{ie}^1$, $U_{ie} = -U_{ei} = U_{ie}^1$. In order to close this system, we need to compute these four quantities. Before doing so, we give an alternate equivalent formulation of the electron system (1.228)–(1.230). From equation (1.216), we note that
$$j_{W_e} = W_e u_e + p_e u_e + Q_e = \int f_e^1\, \frac{|v|^2}{2}\, v\, dv. \tag{1.235}$$
The quantity $j_{W_e}$ is the electron energy flux. Similarly, we define the electron particle flux $j_{n_e}$ according to:
$$j_{n_e} = n_e u_e = \int f_e^1\, v\, dv. \tag{1.236}$$
Thanks to these definitions, system (1.228)–(1.230) can be rewritten as a system of two conservation equations (for the density and energy):
$$\partial_t n_e + \nabla_x \cdot j_{n_e} = 0, \tag{1.237}$$
$$\partial_t W_e + \nabla_x \cdot j_{W_e} + E \cdot j_{n_e} = U_{ei}, \tag{1.238}$$
supplemented with a constitutive equation (often referred to in the physics literature as the generalized Ohm's law):
$$\nabla_x p_e + n_e (E + u_e \times B) = S_{ei}. \tag{1.239}$$
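For the reader's convenience, the step from (1.216) to (1.235) can be written out. The short derivation below is a sketch in scaled units, with $k_B$ absorbed so that $W_e = \tfrac{3}{2} n_e T_e$ and $p_e = n_e T_e$ at leading order (this identification of units is an assumption of the sketch):

```latex
\begin{aligned}
W_e u_e + p_e u_e + Q_e
  &= \tfrac{3}{2}\, n_e T_e\, u_e + n_e T_e\, u_e
     + \Big( \tfrac{1}{2} \int f_e^1\, v\, |v|^2\, dv - \tfrac{5}{2}\, n_e T_e\, u_e \Big) \\
  &= \int f_e^1\, \frac{|v|^2}{2}\, v\, dv \;=\; j_{W_e}.
\end{aligned}
```

The convective and pressure contributions, which add up to $\tfrac{5}{2} n_e T_e u_e$, cancel exactly against the correction term in $Q_e^1$, so that the energy flux is a pure moment of $f_e^1$.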
Now, we need to specify the values of the particle and energy fluxes $j_{n_e}$ and $j_{W_e}$, as well as that of the relaxation terms $U_{ei}$, $S_{ei}$. To this aim, we must compute the first order perturbation of the electron distribution function $f_e^1$. This is performed in the next section. Before closing this section, we recall the scaling units for the new quantities that have appeared following the introduction of the "barred" variables. This is summarized in Table 1.3.
Table 1.3 Scaling units for the new macroscopic quantities

| Quantity | Scaling unit |
| --- | --- |
| Rescaled electron mean velocity $u_e$ | $v_{i0}$ |
| Electron particle flux $j_{n_e}$ | $n_0 v_{i0}$ |
| Rescaled electron heat flux vector $Q_e$ | $n_0 k_B T_0 v_{i0}$ |
| Electron energy flux $j_{W_e}$ | $n_0 k_B T_0 v_{i0}$ |
| Rescaled momentum transfer rate $S_{ei}$, $S_{ie}$ | $t_0^{-1} m_i n_0 v_{i0}$ |
| Rescaled energy transfer rate $U_{ei}$, $U_{ie}$ | $t_0^{-1} n_0 k_B T_0$ |
1.8 Computation of the Fluxes and of the Collision Terms

1.8.1 Preliminaries

It will prove convenient to rewrite the electron Maxwellian by introducing the chemical potential $\mu_e$ according to:
$$M_{n_e,0,T_e} = \frac{n_e}{(2\pi T_e)^{3/2}} \exp\Big( -\frac{|v|^2}{2 T_e} \Big) = f_{e0} \exp\Big( -\frac{\tfrac{1}{2}|v|^2 - \mu_e}{T_e} \Big). \tag{1.240}$$
The presence of the unit $f_{e0}$ is just for physical homogeneity. A different choice would simply lead to a shift in the value of $\mu_e$ (in other words, $f_{e0}$ simply sets the origin of the chemical potential scale). According to this definition, $\mu_e$ is given by:
$$\frac{\mu_e}{T_e} = \ln n_e - \frac{3}{2} \ln T_e + \text{constant}. \tag{1.241}$$
The pair $(\mu_e / T_e, -1/T_e)$ are the entropic variables, conjugate to the conservative variables $(n_e, W_e)$ through the entropy [66]. For simplicity, we will abbreviate $f_e^0 = M_{n_e,0,T_e}$ into $M_e$, and similarly $f_i^0 = M_{n_i,u_i,T_i}$ into $M_i$.

Now, our first task is to compute the first order term $f_e^1$ of the Hilbert expansion of the electron distribution function (1.204). For this purpose, we insert (1.204) into (1.191). Since the leading order term is cancelled thanks to (1.205), the next order leads to:
$$L_B f_e^1 = v \cdot \nabla_x M_e - E \cdot \nabla_v M_e - Q_{ei}^0(M_e, f_i^1) - Q_{ei}^1(M_e, M_i), \tag{1.242}$$
where the linear operator $L_B$ is defined by:
$$L_B \varphi := (v \times B) \cdot \nabla_v \varphi + 2 Q_{ee}(M_e, \varphi) + Q_{ei}^0(\varphi, M_i). \tag{1.243}$$
Here, the index $B$ refers to the magnetic field. In deriving this expression, we have used the bilinearity of the collision operators. For more general nonlinear operators, we would have to replace $2 Q_{ee}(M_e, \varphi)$ by the derivative of $Q_{ee}$ about $M_e$ applied to $\varphi$. To find $f_e^1$, we need to invert (1.242). For this purpose, we need to study the operator $L_B$, which is done in the next section.
1.8.2 Properties of $L_B$

The list of properties of $L_B$ follows that of the nonlinear operators (see Sections 1.3.1 and 1.7.1):

(i) Mass conservation:
$$\int (L_B \varphi)\, dv = 0. \tag{1.244}$$

(ii) Energy conservation:
$$\int (L_B \varphi)\, |v|^2\, dv = 0. \tag{1.245}$$
More precisely, we have the following property:
$$\int (L_B \varphi)\, \psi\, \frac{dv}{M_e} = \int \varphi\, (L_{-B} \psi)\, \frac{dv}{M_e}. \tag{1.246}$$
This property means that the formal adjoint operator to $L_B$ (with respect to the scalar product $(\varphi, \psi)_{M_e} = \int \varphi \psi M_e^{-1}\, dv$ defined on the space of square integrable functions with respect to the measure $M_e^{-1}\, dv$) is $L_B^* = L_{-B}$, i.e. the same operator for the opposite value of the magnetic field. Then, (1.244) and (1.245) follow from the application of (1.246) with $\psi$ equal to one of the equilibria (see property (iv)). Additionally, any function $\psi$ such that $\int (L_B \varphi)\, \psi\, dv = 0$ for all $\varphi$ is a linear combination of $1$ and $|v|^2$ (collisional invariant).

(iii) Entropy inequality:
$$\int (L_B \varphi)\, \varphi\, \frac{dv}{M_e} \le 0. \tag{1.247}$$
That is, $L_B$ is a non-positive operator.

(iv) Equilibria: Let $\varphi$ be such that $L_B \varphi = 0$. Then, $\varphi$ is a linearized Maxwellian $\varphi = M_e (a + c |v|^2)$, with $a$, $c$ arbitrary real numbers.

The proofs of these four properties are sketched in Appendix A. Let now $\psi$ be given and let us consider the following equation for the unknown $\varphi$:
$$L_B \varphi = \psi. \tag{1.248}$$
From (1.244) and (1.245), we immediately see that a necessary condition for the existence of $\varphi$ is that $\psi$ satisfies the solvability constraint:
$$\int \psi \begin{pmatrix} 1 \\ |v|^2 \end{pmatrix} dv = 0. \tag{1.249}$$
Proving that this is also a sufficient condition is not obvious and in many cases is a challenging problem. In Ref. [82], it is shown that the property is true for Fokker–Planck operators, but with some assumptions which do not apply to the Coulomb scattering cross-section. When $L_B$ is the Boltzmann operator alone, this problem has been studied by many authors and the answer is not straightforward [12–14, 18–20, 67]. In the present work, we shall assume for simplicity that (1.249) is a necessary and sufficient condition for the existence of $\varphi$. Now, the solution is not unique, because we can add any equilibrium to a given solution to construct another solution. To single out a unique solution, we can require $\varphi$ to satisfy the additional constraint:
$$\int \varphi \begin{pmatrix} 1 \\ |v|^2 \end{pmatrix} dv = 0. \tag{1.250}$$
This unique solution is denoted by $L_B^{-1} \psi$, and $L_B^{-1}$ is referred to as the pseudo-inverse of $L_B$.
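The structure of (1.248)–(1.250) (a singular non-positive operator, a solvability constraint on the right-hand side, and a normalization that selects one solution) has a simple finite-dimensional analogue. The sketch below is an illustration only and is not the operator $L_B$: it uses a hypothetical stand-in, minus the Laplacian of a four-node path graph, whose null-space is spanned by the constant vector (a single invariant, instead of the two invariants $1$ and $|v|^2$ of $L_B$). The function names are introduced here for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def pseudo_inverse_apply(L, psi):
    """Return the solution phi of L phi = psi with sum(phi) = 0.

    Requires sum(psi) = 0 (analogue of the solvability constraint (1.249)).
    Subtracting the projector onto constants makes the matrix invertible
    without changing the solution on the orthogonal complement.
    """
    n = len(psi)
    if abs(sum(psi)) > 1e-12:
        raise ValueError("solvability constraint violated: sum(psi) != 0")
    shifted = [[L[i][j] - 1.0 / n for j in range(n)] for i in range(n)]
    return solve(shifted, psi)

# Toy stand-in for L_B: symmetric, non-positive, null-space = constants.
L = [[-1.0, 1.0, 0.0, 0.0],
     [1.0, -2.0, 1.0, 0.0],
     [0.0, 1.0, -2.0, 1.0],
     [0.0, 0.0, 1.0, -1.0]]
psi = [1.0, -2.0, 0.5, 0.5]            # sums to zero, so the problem is solvable
phi = pseudo_inverse_apply(L, psi)
residual = [sum(L[i][j] * phi[j] for j in range(4)) - psi[i] for i in range(4)]
```

In this finite-dimensional setting the solvability constraint is both necessary and sufficient, which is exactly the property that is assumed, rather than proven, for $L_B$ in the text.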
1.8.3 Resolution of the perturbation equation (1.242)

We first show that we can rewrite (1.242) in the following form:
$$L_B \Big( f_e^1 - \frac{u_i \cdot v}{T_e}\, M_e \Big) = v \cdot \nabla_x M_e - (E + u_i \times B) \cdot \nabla_v M_e. \tag{1.251}$$
Indeed, we first note that $M_e$ is isotropic (i.e. is a function of $|v|$ only). Therefore, we have
$$Q_{ei}^0(M_e, f_i^1) = 0, \tag{1.252}$$
by an application of property (v) of Section 1.7.1. Still because $M_e$ is isotropic, we can apply (1.156) (or (1.171) for the Fokker–Planck operator) and get
$$Q_{ei}^1(M_e, M_i)(v) = Q_{ei}^0(u_i \cdot \nabla_v M_e, M_i)(v). \tag{1.253}$$
Since $\nabla_v M_e = -\frac{v}{T_e} M_e$, we also get after some easy algebra
$$(v \times B) \cdot \nabla_v (u_i \cdot \nabla_v M_e) = -(u_i \times B) \cdot \nabla_v M_e. \tag{1.254}$$
Finally, using that $v M_e$ belongs to the kernel of the linearized Boltzmann operator $\varphi \to 2 Q_{ee}(M_e, \varphi)$, we have
$$Q_{ee}(M_e, u_i \cdot \nabla_v M_e) = 0. \tag{1.255}$$
Collecting (1.253)–(1.255), we deduce that
$$-L_B \Big( \frac{u_i \cdot v}{T_e}\, M_e \Big) = L_B (u_i \cdot \nabla_v M_e) = -(u_i \times B) \cdot \nabla_v M_e + Q_{ei}^1(M_e, M_i)(v). \tag{1.256}$$
Adding this to (1.242), we are led to (1.251). The right-hand side of (1.251) can be explicitly computed thanks to (1.240) and leads to:
$$L_B \Big( f_e^1 - \frac{u_i \cdot v}{T_e}\, M_e \Big) = \Big( \nabla_x \Big( \frac{\mu_e}{T_e} \Big) + \frac{E + u_i \times B}{T_e} \Big) \cdot v M_e + \nabla_x \Big( -\frac{1}{T_e} \Big) \cdot v\, \frac{|v|^2}{2}\, M_e. \tag{1.257}$$
For notational convenience, we introduce
$$\psi_1(v) = v M_e, \qquad \psi_2(v) = v\, \frac{|v|^2}{2}\, M_e. \tag{1.258}$$
The functions $\psi_k$, being odd, satisfy the solvability condition (1.249) and therefore, we can introduce $\varphi_k$, $k = 1, 2$, by:
$$\varphi_k = L_B^{-1}(\psi_k), \qquad k = 1, 2. \tag{1.259}$$
Since $\psi_k$ has three components, so has $\varphi_k$, and (1.259) should be understood componentwise. Therefore, equation (1.257) is solvable and its solution is given by:
$$f_e^1 = \frac{u_i \cdot v}{T_e}\, M_e + \Big( \nabla_x \Big( \frac{\mu_e}{T_e} \Big) + \frac{E + u_i \times B}{T_e} \Big) \cdot \varphi_1 + \nabla_x \Big( -\frac{1}{T_e} \Big) \cdot \varphi_2. \tag{1.260}$$
An arbitrary element of the null-space of $L_B$ (i.e. a linearized Maxwellian $M_e (a + b |v|^2)$) should be added to the expression (1.260). However, this term gives no contribution to the fluxes and the collision terms and is omitted.
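The "easy algebra" behind identity (1.254) can be spot-checked numerically. The sketch below uses an unnormalized zero-mean Maxwellian (the prefactor cancels on both sides), the closed form $\nabla_v M_e = -(v/T_e) M_e$ quoted in the text, and a central finite difference for the remaining gradient. The temperature, field and velocity values are arbitrary sample data chosen for illustration.

```python
import math

T_e = 0.8                    # sample temperature (assumption)
B = [0.2, -0.5, 1.1]         # sample magnetic field (assumption)
u = [0.7, 0.1, -0.4]         # sample ion mean velocity u_i (assumption)

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def M(v):
    """Unnormalized zero-mean Maxwellian exp(-|v|^2 / (2 T_e))."""
    return math.exp(-dot(v, v) / (2.0 * T_e))

def u_dot_grad_M(v):
    """u_i . grad_v M_e, using grad_v M_e = -(v / T_e) M_e."""
    return -dot(u, v) / T_e * M(v)

def grad(f, v, h=1e-5):
    """Central finite-difference gradient of a scalar function of v."""
    g = []
    for j in range(3):
        vp, vm = list(v), list(v)
        vp[j] += h
        vm[j] -= h
        g.append((f(vp) - f(vm)) / (2.0 * h))
    return g

v = [0.3, -0.6, 0.9]         # arbitrary test velocity
lhs = dot(cross(v, B), grad(u_dot_grad_M, v))              # left side of (1.254)
rhs = -dot(cross(u, B), [-c / T_e * M(v) for c in v])      # right side of (1.254)
```

Both sides reduce analytically to $(M_e / T_e)\, v \cdot (u_i \times B)$, so `lhs` and `rhs` should agree up to the finite-difference error.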
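1.8.4 Computation of the fluxes

The definition of the electron density and energy fluxes (1.235), (1.236) can be put in vector form:
$$\begin{pmatrix} j_{n_e} \\ j_{W_e} \end{pmatrix} = \int f_e^1 \begin{pmatrix} v \\ \tfrac{1}{2}\, v |v|^2 \end{pmatrix} dv = \int f_e^1 \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} \frac{dv}{M_e}. \tag{1.261}$$
Inserting (1.260) into this expression leads to:
$$\begin{pmatrix} j_{n_e} \\ j_{W_e} \end{pmatrix} = \begin{pmatrix} n_e u_i \\ \tfrac{5}{2}\, n_e T_e u_i \end{pmatrix} - \mathbb{L}_B \begin{pmatrix} \nabla_x \big( \tfrac{\mu_e}{T_e} \big) + \tfrac{E + u_i \times B}{T_e} \\ \nabla_x \big( -\tfrac{1}{T_e} \big) \end{pmatrix}, \tag{1.262}$$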
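where the $6 \times 6$ tensor $\mathbb{L}_B$ is block defined by
$$\mathbb{L}_B = \begin{pmatrix} (\mathbb{L}_B)_{11} & (\mathbb{L}_B)_{12} \\ (\mathbb{L}_B)_{21} & (\mathbb{L}_B)_{22} \end{pmatrix}, \tag{1.263}$$
and the $3 \times 3$ blocks $(\mathbb{L}_B)_{k\ell}$ are given by:
$$(\mathbb{L}_B)_{k\ell} = -\int \psi_k \otimes (L_B^{-1} \psi_\ell)\, \frac{dv}{M_e} = -\int \psi_k \otimes \varphi_\ell\, \frac{dv}{M_e} = -\int (L_B \varphi_k) \otimes \varphi_\ell\, \frac{dv}{M_e}. \tag{1.264}$$
In the absence of a magnetic field ($B = 0$), the operator $L_0$ is rotationally invariant, and the matrices $(\mathbb{L}_0)_{k\ell}$ are scalar matrices $(\mathbb{L}_0)_{k\ell} = (L_0)_{k\ell}\, \mathrm{Id}$ with $(L_0)_{k\ell}$ a real number. However, if $B \neq 0$, $L_B$ is not rotationally invariant and $(\mathbb{L}_B)_{k\ell}$ are non-scalar $3 \times 3$ tensors. We show below that the tensor $\mathbb{L}_B$ is positive definite and satisfies Onsager's symmetry relation (1.74). We recall that $L_{-B}$ is the formal adjoint of $L_B$ for the scalar product $(\varphi, \psi)_{M_e}$. Therefore, we have
$$(\mathbb{L}_{-B})_{k\ell} = -\int (L_{-B} \varphi_k) \otimes \varphi_\ell\, \frac{dv}{M_e} = -\int \varphi_k \otimes (L_B \varphi_\ell)\, \frac{dv}{M_e} = \Big( -\int (L_B \varphi_\ell) \otimes \varphi_k\, \frac{dv}{M_e} \Big)^* = ((\mathbb{L}_B)_{\ell k})^*, \tag{1.265}$$
which proves (1.74). Now, let $\xi_1$, $\xi_2$ be two vectors of $\mathbb{R}^3$ and $\xi = (\xi_1, \xi_2) \in \mathbb{R}^6$. We compute
$$\xi^* \mathbb{L}_B\, \xi = \sum_{k,\ell=1}^{2} \xi_k^* (\mathbb{L}_B)_{k\ell}\, \xi_\ell = -\int \Big( L_B \Big( \sum_{k=1}^{2} \xi_k \cdot \varphi_k \Big) \Big) \Big( \sum_{\ell=1}^{2} \xi_\ell \cdot \varphi_\ell \Big) \frac{dv}{M_e} = -\int (L_B \varphi_\xi)\, \varphi_\xi\, \frac{dv}{M_e} \ge 0, \tag{1.266}$$
with $\varphi_\xi = \sum_{k=1}^{2} \xi_k \cdot \varphi_k$. The quantity (1.266) is non-negative because the operator $L_B$ is non-positive. We now prove that there exists a positive constant $C > 0$ such that
$$\xi^* \mathbb{L}_B\, \xi \ge C |\xi|^2 = C (|\xi_1|^2 + |\xi_2|^2). \tag{1.267}$$
If this is not the case, then we have
$$\min_{|\xi|^2 = 1} \xi^* \mathbb{L}_B\, \xi = 0. \tag{1.268}$$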
Since the sphere in the six-dimensional space is a compact set, this minimal value is attained for a certain vector $\xi$ of norm 1. Because of (1.266), this means that $\varphi_\xi$ is in the null-space of $L_B$. But we have also chosen the $\varphi_k$'s in the orthogonal complement of this null-space, by (1.250), and so $\varphi_\xi$ is also orthogonal to the null-space of $L_B$. Therefore $\varphi_\xi = 0$ and, since the components of the $\varphi_k$'s are linearly independent, this means that $\xi = 0$, which contradicts the fact that $\xi$ is of norm 1. Therefore, (1.267) is proven, which shows that $\mathbb{L}_B$ is positive definite.
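1.8.5 Expression of the fluxes in terms of $n_e$ and $T_e$

We now rewrite the expression of the fluxes (1.262) in a more conventional form using the unknowns $n_e$ and $T_e$ rather than $\mu_e / T_e$ and $-1/T_e$. We have
$$\begin{pmatrix} j_{n_e} \\ j_{W_e} \end{pmatrix} = \begin{pmatrix} n_e u_i \\ \tfrac{5}{2}\, n_e T_e u_i \end{pmatrix} - D_B \begin{pmatrix} T_e \nabla_x n_e + n_e (E + u_i \times B) \\ n_e \nabla_x T_e \end{pmatrix}, \tag{1.269}$$
where $D_B$ is defined by
$$D_B = \begin{pmatrix} \dfrac{1}{n_e T_e} (\mathbb{L}_B)_{11} & \dfrac{1}{n_e T_e} \Big( \dfrac{1}{T_e} (\mathbb{L}_B)_{12} - \dfrac{3}{2} (\mathbb{L}_B)_{11} \Big) \\[2mm] \dfrac{1}{n_e T_e} (\mathbb{L}_B)_{21} & \dfrac{1}{n_e T_e} \Big( \dfrac{1}{T_e} (\mathbb{L}_B)_{22} - \dfrac{3}{2} (\mathbb{L}_B)_{21} \Big) \end{pmatrix}, \tag{1.270}$$
each block being denoted by $(D_B)_{k\ell}$. The first line of (1.269) can be rewritten (using (1.236) and omitting the indices $B$ for simplicity):
$$n_e (u_e - u_i) = -D_{11} \big( T_e \nabla_x n_e + n_e (E + u_i \times B) \big) - D_{12}\, n_e \nabla_x T_e \tag{1.271}$$
$$= -D_{11} \big( \nabla_x (n_e T_e) + n_e (E + u_e \times B) \big) + D_{11}\, n_e (u_e - u_i) \times B - (D_{12} - D_{11})\, n_e \nabla_x T_e. \tag{1.272}$$
It is easily seen that the $3 \times 3$ matrix $\mathbb{L}_{11}$ is non-singular and so is $D_{11}$. Therefore, in view of (1.229), it follows that the force term $S_{ei}$ has the following expression:
$$S_{ei} = \nabla_x (n_e T_e) + n_e (E + u_e \times B) = -D_{11}^{-1}\, n_e (u_e - u_i) + n_e (u_e - u_i) \times B - D_{11}^{-1} (D_{12} - D_{11})\, n_e \nabla_x T_e. \tag{1.273}$$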
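It can be decomposed into two terms, the friction and thermal forces, respectively given by:
$$S_{ei} = S_{ei}^u + S_{ei}^T, \tag{1.274}$$
$$S_{ei}^u = -M_B\, n_e (u_e - u_i), \tag{1.275}$$
$$S_{ei}^T = -A_B\, n_e \nabla_x T_e. \tag{1.276}$$
The $3 \times 3$ matrices $M_B$ and $A_B$ are respectively the electron–ion momentum transfer rate and the thermo-electric coefficient:
$$M_B = D_{11}^{-1} - \mathbb{B} = n_e T_e\, \mathbb{L}_{11}^{-1} - \mathbb{B}, \tag{1.277}$$
$$A_B = D_{11}^{-1} (D_{12} - D_{11}) = \frac{1}{T_e}\, \mathbb{L}_{11}^{-1} \mathbb{L}_{12} - \frac{5}{2}\, \mathrm{Id}, \tag{1.278}$$
where $\mathbb{B}$ is the matrix of the vector product by $B$ (1.97). A similar transformation of the second line of (1.269) allows us to compute the heat flux $Q_e = j_{W_e} - \frac{5}{2} n_e T_e u_e$. Using (1.271), we have:
$$\begin{aligned} Q_e &= -D_{21} \big( T_e \nabla_x n_e + n_e (E + u_i \times B) \big) - D_{22}\, n_e \nabla_x T_e - \frac{5}{2}\, n_e T_e (u_e - u_i) \\ &= D_{21} D_{11}^{-1} \big( n_e (u_e - u_i) + D_{12}\, n_e \nabla_x T_e \big) - D_{22}\, n_e \nabla_x T_e - \frac{5}{2}\, n_e T_e (u_e - u_i) \\ &= \Big( \frac{1}{T_e}\, D_{21} D_{11}^{-1} - \frac{5}{2}\, \mathrm{Id} \Big)\, n_e T_e (u_e - u_i) + (D_{21} D_{11}^{-1} D_{12} - D_{22})\, n_e \nabla_x T_e. \end{aligned} \tag{1.279}$$
We note that
$$\begin{aligned} \frac{1}{T_e}\, D_{21} D_{11}^{-1} - \frac{5}{2}\, \mathrm{Id} &= \frac{1}{T_e}\, (\mathbb{L}_B)_{21} (\mathbb{L}_B)_{11}^{-1} - \frac{5}{2}\, \mathrm{Id} \\ &= \Big( \frac{1}{T_e}\, \big( (\mathbb{L}_B)_{11}^* \big)^{-1} (\mathbb{L}_B)_{21}^* - \frac{5}{2}\, \mathrm{Id} \Big)^* \\ &= \Big( \frac{1}{T_e}\, (\mathbb{L}_{-B})_{11}^{-1} (\mathbb{L}_{-B})_{12} - \frac{5}{2}\, \mathrm{Id} \Big)^* \\ &= A_{-B}^*. \end{aligned} \tag{1.280}$$
Finally, we introduce the heat conductivity tensor $K_B$ as follows:
$$K_B = n_e (D_{22} - D_{21} D_{11}^{-1} D_{12}) = \frac{1}{T_e^2}\, (\mathbb{L}_{22} - \mathbb{L}_{21} \mathbb{L}_{11}^{-1} \mathbb{L}_{12}). \tag{1.281}$$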
With these definitions, the heat flux vector can be decomposed into two components, respectively the friction and thermal heat fluxes, given by:
$$Q_e = Q_e^u + Q_e^T, \tag{1.282}$$
$$Q_e^u = A_{-B}^*\, n_e T_e (u_e - u_i), \tag{1.283}$$
$$Q_e^T = -K_B \nabla_x T_e. \tag{1.284}$$
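In the unmagnetized case $B = 0$, all blocks are scalar and formulas (1.270), (1.277), (1.278) and (1.281) reduce to elementary arithmetic. The sketch below evaluates them for hypothetical scalar Onsager coefficients; the function name, the sample numbers and the use of $L_{21} = L_{12}$ (Onsager symmetry at $B = 0$) are assumptions of this illustration, not values taken from the text.

```python
def transport_coefficients(L11, L12, L22, n_e, T_e):
    """Scalar (B = 0) transport coefficients from Onsager coefficients.

    Assumes L21 = L12 (Onsager symmetry at B = 0) and L11 > 0.
    """
    L21 = L12
    nT = n_e * T_e
    # Blocks of D_B, formula (1.270)
    D11 = L11 / nT
    D12 = (L12 / T_e - 1.5 * L11) / nT
    D21 = L21 / nT
    D22 = (L22 / T_e - 1.5 * L21) / nT
    M = 1.0 / D11                        # momentum transfer rate (1.277), B = 0
    A = (D12 - D11) / D11                # thermo-electric coefficient (1.278)
    K = n_e * (D22 - D21 * D12 / D11)    # heat conductivity (1.281)
    return {"D": (D11, D12, D21, D22), "M": M, "A": A, "K": K}

# Hypothetical sample data; L11 * L22 > L12**2 keeps the tensor positive definite.
coeffs = transport_coefficients(L11=2.0, L12=3.0, L22=7.0, n_e=1.5, T_e=0.5)

# Closed forms in terms of the L coefficients, for comparison
M_closed = 1.5 * 0.5 / 2.0                       # n_e T_e / L11
A_closed = 3.0 / (0.5 * 2.0) - 2.5               # L12 / (T_e L11) - 5/2
K_closed = (7.0 - 3.0 * 3.0 / 2.0) / 0.5 ** 2    # (L22 - L12^2 / L11) / T_e^2
```

The agreement between the values computed through the $D$ blocks and the closed forms in terms of the $L$ coefficients mirrors the second equalities in (1.277), (1.278) and (1.281).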
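1.8.6 Expression of the collision terms

The force term $S_{ei} = -S_{ie}$ has already been computed (see (1.274)). We now show that the energy transfer term $U_{ei} = -U_{ie}$ is given by:
$$U_{ie} = u_i \cdot S_{ie} - \nu_{ie,W}\, n_i (T_i - T_e). \tag{1.285}$$
Indeed, from equations (1.218) and (1.165) or (1.178), we have
$$U_{ie} = U_{ie}^0(f_i^1, M_e) + U_{ie}^0(M_i, f_e^1) + U_{ie}^1(M_i, M_e) = \int \big( Q_{ie}^0(M_i, f_e^1) + Q_{ie}^1(M_i, M_e) \big)\, \frac{|v|^2}{2}\, dv, \tag{1.286}$$
and similarly for $S_{ie}$:
$$S_{ie} = S_{ie}^0(f_i^1, M_e) + S_{ie}^0(M_i, f_e^1) + S_{ie}^1(M_i, M_e) = \int \big( Q_{ie}^0(M_i, f_e^1) + Q_{ie}^1(M_i, M_e) \big)\, v\, dv. \tag{1.287}$$
Therefore, using (1.162) and (1.166) (or (1.179)), we obtain
$$\begin{aligned} U_{ie} - u_i \cdot S_{ie} &= \int \big( Q_{ie}^0(M_i, f_e^1) + Q_{ie}^1(M_i, M_e) \big) \Big( \frac{|v|^2}{2} - u_i \cdot v \Big) dv \\ &= -\int \nabla_v M_i \cdot A\, \Big( \frac{|v|^2}{2} - u_i \cdot v \Big) dv + \frac{\nu_{ie,W}}{3} \int \big( T_e \Delta_v M_i + \nabla_v \cdot (v M_i) \big) \Big( \frac{|v|^2}{2} - u_i \cdot v \Big) dv. \end{aligned} \tag{1.288}$$
Easy computations show that
$$\int \nabla_v M_i\, \Big( \frac{|v|^2}{2} - u_i \cdot v \Big) dv = 0,$$
and
$$\frac{\nu_{ie,W}}{3} \int \big( T_e \Delta_v M_i + \nabla_v \cdot (v M_i) \big) \Big( \frac{|v|^2}{2} - u_i \cdot v \Big) dv = -\nu_{ie,W}\, n_i (T_i - T_e).$$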
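The two "easy computations" can be sketched as follows, writing $\chi(v) = \tfrac{1}{2}|v|^2 - u_i \cdot v$ (a name introduced here for brevity), so that $\nabla_v \chi = v - u_i$ and $\Delta_v \chi = 3$. The only ingredients are integration by parts and the moments of $M_i$ in scaled units, $\int M_i\, dv = n_i$, $\int M_i\, v\, dv = n_i u_i$ and $\int M_i |v|^2 dv = n_i |u_i|^2 + 3 n_i T_i$:

```latex
\begin{aligned}
\int \nabla_v M_i\, \chi\, dv
  &= -\int M_i\, (v - u_i)\, dv = -(n_i u_i - n_i u_i) = 0, \\
\int T_e\, \Delta_v M_i\, \chi\, dv
  &= T_e \int M_i\, \Delta_v \chi\, dv = 3\, n_i T_e, \\
\int \nabla_v \cdot (v M_i)\, \chi\, dv
  &= -\int M_i\, v \cdot (v - u_i)\, dv
   = -\big( n_i |u_i|^2 + 3\, n_i T_i - n_i |u_i|^2 \big) = -3\, n_i T_i .
\end{aligned}
```

Adding the last two lines and multiplying by $\nu_{ie,W}/3$ gives $-\nu_{ie,W}\, n_i (T_i - T_e)$, which is the relaxation term in (1.285).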
1.8.7 Back to physical variables

If we collect the results of Sections 1.7 and 1.8 and rescale them back into the physical variables, we find the results given in Section 1.4.

1.9 Conclusion

We have presented a derivation of macroscopic equations for plasmas and disparate mass binary mixtures from a consistent asymptotic analysis of the underlying kinetic equations, taking into account the smallness of the mass ratio between the particles while simultaneously letting the Knudsen number (the ratio of the collision mean free path to the macroscopic scale) tend to zero. The resulting model couples a system of diffusion equations for the electron (or light particle) density and energy (the so-called energy-transport model) with a hydrodynamic system of Euler type for the ions (or the heavy species). We have pointed out that the conventional models which can be found in the literature, and which couple two distinct systems for the light and heavy particles, lack consistency in that they retain second order terms (in the Knudsen number/mass ratio) while dropping other terms of the same order. We have given a self-contained derivation of this model, discussed its major physical features and provided analytic computations of its coefficients in simplified situations. We have also discussed its applicability to various practical situations, ranging from plasmas and semiconductors to rarefied gases. In subsequent works, we shall extend this approach to more complex situations involving neutrals and more complex ionization kinetics, and apply it to various situations such as discharges in air at atmospheric pressure or the dynamics of ionospheric instabilities.
Appendix: Properties of $L_B$ (Proofs)

The operator $L_B$ is the sum of three operators:
$$L_B = L_{ee} + L_{ei} + T_B, \tag{1.A.1}$$
$$L_{ee} \varphi = 2 Q_{ee}(f_e^0, \varphi), \qquad L_{ei} \varphi = Q_{ei}^0(\varphi, f_i^0), \qquad T_B \varphi = (v \times B) \cdot \nabla_v \varphi. \tag{1.A.2}$$
It is easily shown that both $L_{ee}$ and $L_{ei}$ are non-positive formally self-adjoint operators with respect to the scalar product $(\varphi, \psi)_{M_e}$. The null-space of $L_{ee}$ (i.e. its space of equilibria) is spanned by linearized Maxwellians of the form $M_e (a + b |v|^2 + c \cdot v)$, with $a$, $b$ arbitrary real numbers and $c$ an arbitrary vector of $\mathbb{R}^3$. Therefore, its collisional invariants are all linear combinations of $1$, $v$, $|v|^2$ with arbitrary coefficients. These properties are classical (see e.g. Refs. [12], [13] and references therein).
We have determined the null-space of $L_{ei}$ in Section 1.7.1: it is spanned by isotropic functions $\varphi(v) = \phi(|v|^2/2)$. Therefore, its collisional invariants are also all functions of the type $\phi(|v|^2/2)$. Finally, a straightforward computation shows that $T_B$ is formally skew-adjoint. But since $T_{-B} = -T_B$, its formal adjoint is also $T_{-B}$. It is easily seen that $T_B \varphi = 0$ for all isotropic functions $\varphi(v) = \phi(|v|^2/2)$ and, consequently, that $\int (T_B \varphi)\, \phi(|v|^2/2)\, dv = 0$ for all such $\phi$'s. Collecting these properties immediately leads to properties (i)–(iii) of Section 1.8.2. For property (iv), suppose that $L_B \varphi = 0$. It follows that
$$0 = \int (L_B \varphi)\, \varphi\, \frac{dv}{M_e} = \int (L_{ee} \varphi)\, \varphi\, \frac{dv}{M_e} + \int (L_{ei} \varphi)\, \varphi\, \frac{dv}{M_e}. \tag{1.A.3}$$
Since the two terms on the right-hand side of (1.A.3) are non-positive, they are separately zero and $\varphi$ must be an equilibrium for $L_{ee}$ and $L_{ei}$ simultaneously. This means that $\varphi$ must be at the same time a linearized Maxwellian of the form $M_e (a + b |v|^2 + c \cdot v)$ and an isotropic function. This implies that $c = 0$ and $\varphi$ is a linearized Maxwellian of the form $M_e (a + b |v|^2)$, which gives property (iv).

ACKNOWLEDGEMENTS

Support by the European network HYKE, funded by the EC as contract HPRN-CT-2002-00282, is acknowledged.
REFERENCES

1. P. Degond and B. Lucquin-Desreux, The asymptotics of collision operators for two species of particles of disparate masses, Math. Mod. Meth. Appl. Sci., 6 (1996), 405–436.
2. P. Degond and B. Lucquin-Desreux, Transport coefficients of plasmas and disparate mass binary gases, Transp. Theory Stat. Phys., 25 (1996), 595–633.
3. B. Lucquin-Desreux, Diffusion of electrons by multicharged ions, Math. Mod. Meth. Appl. Sci., 10 (2000), 409–440.
4. N. Ben Abdallah and P. Degond, On a hierarchy of macroscopic models for semiconductors, J. Math. Phys., 37 (1996), 3306–3333.
5. N. Ben Abdallah, P. Degond, and S. Génieys, An energy-transport model for semiconductors derived from the Boltzmann equation, J. Stat. Phys., 84 (1996), 205–231.
6. D. Hilbert, Begründung der kinetischen Gastheorie, Mathematische Annalen, 72 (1916/1917), 562–577.
7. S. Chapman, The kinetic theory of simple and composite gases: viscosity, thermal conduction and diffusion, Proc. Roy. Soc. (London), A93 (1916/1917), 1–20.
8. D. Enskog, Kinetische Theorie der Vorgänge in mässig verdünnten Gasen, 1, in Allgemeiner Teil (ed.), Almqvist & Wiksell, Uppsala, 1917.
9. H. Grad, in Rarefied Gas Dynamics, F. Devienne (ed.), Pergamon, London, England, 1960, 10–138.
10. H. Grad, Asymptotic theory of the Boltzmann equation, Phys. Fluids, 8 (1963), 147–181, and Asymptotic theory of the Boltzmann equation II, Proceedings of the Third International Symposium on Rarefied Gas Dynamics, Vol. 1, J. A. Laurmann (ed.), Academic Press, New York, 1963, 26–59.
11. H. Grad, Asymptotic Equivalence of the Navier–Stokes and Non-linear Boltzmann Equation, AMS Symposium on applications of partial differential equations, 1964, 154–183.
Models for Plasmas and Disparate Mass Gaseous Binary Mixtures
59
12. C. Cercignani,The Boltzmann Equation and its Applications, Springer, Berlin, 1988. 13. C. Cercignani, R. Illner, and M. Pulvirenti,The Mathematical Theory of Dilute Gases, Springer, New York, 1994. 14. R. Caflisch, The fluid dynamical limit of the nonlinear Boltzmann equation, Commun. Pure Appl. Math., 33 (1980), 651–666. 15. S. Kawashima, A. Matsumura, and T. Nishida, On the fluid-dynamical approximation to the Boltzmann equation at the level of the Navier–Stokes equation, Commun. Math. Phys., 70 (1979), 97–124. 16. C. Bardos, F. Golse, and C. D. Levermore, Fluid dynamic limits of kinetic equations II: convergence proofs for the Boltzmann equation, Comm. Pure Appl. Math., 46 (1993), 667–753. 17. C. Bardos, F. Golse, B. Perthame, and R. Sentis,The Rosseland Approximation for the radiative transfer equations, Commun. Pure Appl. Math., XL (1987), 691–721. 18. C. Bardos, R. Santos, and R. Sentis, Diffusion approximation and computation of the critical size,Trans. A. M. S., 284 (1984), 617–649. 19. A. Bensoussan, J. L. Lions, and G. C. Papanicolaou, Boundary layers and homogenization of transport processes, J. Publ. RIMS Kyoto Univ., 15 (1979), 53–157. 20. F. Golse and F. Poupaud, Limite fluide des équations de Boltzmann des semiconducteurs pour une statistique de Fermi-Dirac,Asymptot. Analy., 6 (1992), 135–160. 21. F. Poupaud, Diffusion approximation of the linear semiconductor equation: analysis of boundary layers,Asymptot. Anal., 4 (1991), 293–317. 22. S. I. Braginskii, Transport processes in a plasma, in Reviews of Plasma Physics, Vol. 1, M. A. Leontovitch (ed.), Consultants Bureau, New York, 1965. 23. L. Spitzer and R. Härm, Transport phenomena in a completely ionized gas, Phy. Rev., 89 (1953), 977–981. 24. R. M. Chmieleski and J. H. Ferziger,Transport properties of a nonequilibrium partially ionized gas, Phys. Fluids, 10 (1967), 364–371. 25. J. S. 
Darrozes, Sur une théorie asymptotique de l’équation de Boltzmann et son application á l’ étude des écoulements en régime presque continu, Preprint # 130, ONERA, Chatillon, France, 1970. 26. E. Goldman and L. Sirovich, Equations for gas mixtures, Phys. Fluids, 10 (1967), 1928–1940 and Equations for gas mixtures II, Phys. Fluids, 12 (1969), 245–247. 27. E. A. Johnson, Energy and momentum equations for disparate-mass binary gases, Phys. Fluids, 16 (1973), 45–49. 28. J. P. Petit and J. S. Darrozes, Une nouvelle formulation des équations du mouvement d’un gaz ionisé dans un régime dominé par les collisions, J. Mécanique, 14 (1975), 745–759. 29. S. Chapman and T. G. Cowling, The Mathematical Theory of Non-Uniform Gases, Cambridge University Press, New York, 1958. 30. G. P. Schurtz, P. D. Nicolaï, and M. Busquet, A nonlocal electron conduction model for multidimensional radiation hydrodynamics codes, Phys. Plasma., 7 (2000), 4238–4249. 31. R. Balescu,Aspects of Anomalous Transport in Plasmas, IOP, 2005 (to appear). 32. G. F. Chew, M. L. Goldberger, and F. E. Low, The Boltzmann equation and the one-fluid hydromagnetic equations in the absence of particle collisions, Proc. R. Soc. London A 226, (1956), 112–118. 33. G. W. Hammett and F. W. Perkins, Fluid models for landau damping with application to the ion-temperature-gradient instability, Phys. Rev. Lett., 64 (1990), 3019. 34. I. Muller and T. Ruggeri, Rational Extended Thermodynamics, 2nd edition, Vol. 37, Springer Tracts in Natural Philosophy, New York, 1998. 35. C. D. Levermore, Moment closure hierarchies for kinetic theories, J. Stat. Phys., 83 (1996), 1021–1065. 36. A. M. Anile and O. Muscato, Improved hydrodynamical model for carrier transport in semiconductors, Phys. Rev. B, 51 (1995), 16728. 37. A. M. Anile and O. Muscato, Extended thermodynamics tested beyond the linear regime: the case of electron transport in silicon semiconductors, Cont. Mech. and Thermodynamics, 8 (1) (1996).
60
Pierre Degond
38. F. Deluzet, Mathematical modeling of plasma opening switches, Comp. Phy. Commun., 152 (2003), 34–54. 39. F. Deluzet, Mathematical Modeling and Numerical Simulation of Plasma Opening Switches, PhD dissertation, Institut National des Sciences Appliquées,Toulouse, 2002. 40. P. Degond, R. Talaalout, and M.-H.Vignal, Electron Transport and Secondary Emission in a Surface of a Solar Cell, comptes rendus de la conférence multipactor, RF and DC corona and passive intermodulation in space RF hardware’, ESTEC, Noordwijk, The Netherlands, September 4–6, 2000. 41. N. Ben Abdallah, P. Degond, F. Deluzet, V. Latocha, R. Talaalout, M.-H. Vignal, Diffusion limits of kinetic models, in Hyperbolic Problems:Theory, Numerics, Applications,T.Y. Hou and E. Tadmor (eds), Springer Verlag, Berlin, 2003, pp. 3–17. 42. P. Degond Transport of trapped particles in a surface potential, in Studies in Mathematics and its Applications, Vol. 31, D. Cioranescu and J. L. Lions (eds), Elsevier, 2002, pp. 273–296. 43. A. H. Marshak and K. M.Van-Vliet, Electrical current in solids with position dependent band structure, Solid State Electron., 21 (1977) 417–427. 44. R. Stratton, Diffusion of hot and cold electrons in semiconductor barriers, Phys. Rev., 126 (1962), 2002–2014. 45. K. M. Van-Vliet and A. M. Marshak, Conduction current and generalized Einstein relations for degenerate semiconductors and metals, Phys. Stat. Sol. (b), 78 (1976), 501–517. 46. Y. Apanavich, E. Lyumkis, B. Polsky,A. Shur, and P. Blakey, Steady-state and transient analysis of submicron devices using energy-balance and simplified hydrodynamic models, IEEE Trans. CAD of Integ. Circuits Syst. 13 (1994), 702–711. 47. D. Chen, E. C. Kan, U. Ravaioli, C. Shu, and R. W. Dutton, An improved energy transport model including non-parabolicity and non-maxwellian distribution effects, IEEE Electron Device Lett., 13 (1992), 235–239. 48. P. Degond, A. Jüngel, and P. 
Pietra, Numerical discretization of energy-transport models for semiconductors with non-parabolic band structure, SIAM on Scientific Computing, 22 (2000), 986–1007. 49. S. Holst,A. Jüngel, and P. Pietra,A mixed finite-element discretization of the energy-transport equations for semiconductors, SIAM J. Sci. Comp., 24 (2003), 2058–2075. 50. E. Lyumkis, B. Polsky, A. Shur, and P. Visocky, Transient semiconductor device simulation including energy balance equation, COMPEL, 11 (1992), 311–325. 51. P. Degond, S. Génieys, and A. Jüngel, A system of parabolic equations in nonequilibrium thermodynamics including thermal and electrical effects, J. Mathématiques Pures Appliquées., 76 (1997), 991–1015. 52. P. Degond, S. Génieys, and A. Jüngel, A steady-state system in non-equilibrium thermodynamics including thermal and electrical effects, Math. Meth. Appl. Sci., 21 (1998), 1399–1413. 53. W. Fang and K. Ito, Existence of stationary solutions to an energy drift-diffusion model for semiconductor device, Math. Models Methods Appl. Sci., 11 (2001), 827–840. 54. N. Ben Abdallah, L. Desvillettes, and S. Génieys On the convergence of the Boltzmann equation for semiconductors toward the energy transport model, J. Stat. Phys., 98 (2000), 835–870. 55. I. Gasser and R. Natalini,The energy-transport and the drift-diffusion equations as relaxations of the hydrodynamic model for semiconductors, Quart. Appl. Math., 57 (1999), 269–282. 56. P. Dmitruk, A. Saul, and L. Reyna, High electric field approximation to charge transporet in semiconductor devices,Appl. Math. Lett., 5 (1992), 99–102. 57. E. Bringuier, Kinetic theory of high-field transport in semiconductors, Phys. Rev. B, 57 (1998), 2280–2285. 58. E. Bringuier, Nonequilibrium statistical mechanics of drifting particles, Phys. Rev. E, 61 (2000), 6351–6358.
Models for Plasmas and Disparate Mass Gaseous Binary Mixtures
61
59. A. Gnudi, D. Ventura, G. Baccarani, and F. Odeh, Two-dimensional MOSFET simulation by means of a multidimensional spherical harmonic expansion of the Boltzmann transport equation, Solid State Electron., 36 (1993), 575–581. 60. E. Gnani, S. Reggiani, and M. Rudan, Full band transport properties of silicon dioxide using the spherical harmoncis expansion of the BTE, Physica B, 314 (2002), 193. 61. N. Goldsman, L. Henrickson, and J. Frey, A Physics-based analytical-numerical solution to the Boltzmann Transport Equation for use in device simulation, Solid State Electron., 34 (1991), 389–396. 62. A. Greiner, M. C. Vecchi, and M. Rudan, Modeling surface scattering effects in the solution of the Boltzmann transport equation based on the spherical harmonics expansion, Semicond. Sci. Technol., 13 (1998), 1080. 63. V. I. Kolobov, Fokker–Planck modeling of electron kinetics in plasmas and semiconductors, Comput. Mat. Sci., 28 (2003), 302–320. 64. W. Liang, N. Goldsman, and I. Meyergoyz, 2-D MOSFET modeling including surface effects and impact ionization by self-consistent solution of the Boltzmann, Poisson and hole-continuity equation, IEEE Trans. Electron Dev., 44 (1997), 257. 65. K. Rahmat, J. White, and D. A. Antoniadis, Simulation of semiconductor devices using a Galerkin/SHE approach to solving the couled Poisson-Boltzmann system, IEEETrans. Comput. Aided Des. Integ. Circuit. Syst., 15 (1996), 1181. 66. P. Degond, Mathematical modelling of microelectronics semiconductor devices, Proceedings of the Morningside Mathematical Center, Beijing, AMS/IP Studies in Advanced Mathematics, AMS Society and International Press, 2000, pp. 77–109. 67. P. Degond, Macroscopic limits of the Boltzmann equation: a review in Modeling and computational methods for kinetic equations, P. Degond, L. Pareschi, and G. Russo (eds), Modeling and Simulation in Science, Engineering and Technology Series, Birkhauser, 2003, pp. 3–57. 68. F. F. 
Chen, Introduction to Plasma Physics, Plenum, New York, 1974. 69. J. L. Delcroix and A. Bers, Physique Des Plasmas,Vol. 1 & 2, EDP Sciences, Paris, 1994. 70. N. A. Krall and A.W.Trivelpiece, Principles of Plasma Physics, San Francisco Press, San-Francisco, 1986. 71. P. A. Markowich, C. Ringhofer, and C. Schmeiser, Semiconductor Equations, Springer, Wien, 1990. 72. S. M. Sze, Physics of Semiconductor Devices, J. Wiley and Sons, New York, 1969. 73. I. Choquet, P. Degond, and C. Schmeiser, Energy-Transport models for charge carriers involving impact ionization in semiconductors,Transp. Theory Stat. Phys., 32 (2003), 99–132. 74. I. Choquet, P. Degond, and C. Schmeiser, Hydrodynamic models for charge carriers, Comm. Math. Sci., 1 (2003), 74–86. 75. P. Degond,A. Nouri, and C. Schmeiser, Macroscopic models for ionization in the presence of strong electric fields,Transp. Theory Stat. Phys., 29 (2000), 551–561. 76. I. Choquet and B. Lucquin-Desreux, Hydrodynamic limit of an arc discharge at atmospheric pressure, J. Stat. Phys., 119 (2005), 197–239. 77. P. Degond and A. Jüngel, High field approximations of the energy-transport model for semiconductors with non parabolic band structure, Zeitschrifts für Angewandte Mathematik und Physik, 52 (2001), 1053–1070. 78. R. Alexandre and C.Villani, On the Landau approximation in plasma physics,Ann. I.H.P., 21 (2004), 61–95. 79. A. A. Arsen’ev and O. E. Buryak, On the connection between a solution of the Boltzmann equation and a solution of the Landau–Fokker–Planck equation, Math. USSR Sbornik, 69 (1991), 465–478. 80. P. Degond and B. Lucquin-Desreux, The Fokker-Planck asymptotics of the Boltzmann collision operator in the coulomb case, Mathematical Methods and Models in the Applied Sciences, 2 (1992), 167–182.
62
Pierre Degond
81. L. Desvillettes, On asymptotics of the Boltzmann equation when the collisions become grazing, Transp. Th. Stat. Phys., 21 (1992), 259–276. 82. B. Lucquin-Desreux, Fluid limit of magnetized plasmas, Transp. Theory and Stat. Phys., 27 (1998), 99–135. 83. P. Degond, M. Lemou, Dispersion relations of the linearized Fokker-Planck equation, Arch. Ration. Mech. Anal., 138 (1997), 137–167. 84. A. Fedoseyev, V. Kolobov, R. Arslanbekov, and A. Przekwas, Kinetic simulation tools for nano-scale semiconductor devices, Microelectron. Eng., 69 (2003), 577–586. 85. D.Ventura,A. Gnudi, G. Baccarani, and F. Odeh, Multidimensional spherical harmonics expansion of Boltzmann equation for transport in semiconductors, Appl. Math. Lett., 5 (1992), 85–90.
CHAPTER TWO
Microscopic Foundations of the Mechanics of Gases and Granular Materials
Carlo Cercignani∗
Contents
2.1 Introduction
2.2 Kinetic Theory of Smooth Spheres
2.3 Collision Dynamics of Rough Spheres
2.4 The Boltzmann–Enskog Equation
2.5 The Macroscopic Balance Equations
2.6 Concluding Remarks
Abstract
We discuss microscopic models of ordinary gases and granular materials. Both can be treated by the statistical methods of kinetic theory. In particular, to take into account the exchange of kinetic energy between the rotational and translational degrees of freedom of the particles (assumed to be spherical), one must endow them with a suitable amount of roughness. This makes the theory of granular materials similar to that of a dense gas of rough spheres: the major difference arises from the fact that kinetic energy is not conserved in the collisions between particles.
Key Words: Kinetic theory, Boltzmann–Enskog equation, granular materials
∗ Dipartimento di Matematica, Politecnico di Milano, Milano, Italy

2.1 Introduction
Microscopic models of materials have been studied for a long time: the first and most successful example is certainly provided by the kinetic theory of gases (for historical aspects we refer the reader to Ref. [1]). In recent decades the study of the mechanics of granular materials has developed notably because of their growing importance in applications (sands, powders, rock and snow avalanches, landslides, grains, fluidized beds). Here the microscopic particles are much larger than the molecules of a gas. The problems related to the study of statics and dynamics of granular materials, which arise, more and
more frequently, in industrial processes and are of growing importance in the study of natural phenomena [2], have been the object of much attention and have been treated with various methods that differ in rigor and complexity. A granular material can occur in three states of aggregation and one can talk of granular gases, granular liquids and granular solids [3]. In this chapter we shall mainly deal with the case of both molecular and granular gases because the microscopic model is simpler than for liquid and solid states of aggregation. In particular, the analogy between molecular and granular gases is rather close. The important cases of collisional granular motion may include fluidized beds [4]. The methods used in these studies include: (i) development of physical and experimental models [5]; (ii) computer simulations [6–14] and (iii) kinetic theory [15–24]. Actually many studies [15–30] are based on the assumption that, under certain conditions of motion, collisions between the particles of a granular material supply the main mechanism of momentum and energy exchange. This assumption naturally suggests an analogy with the kinetic theory of gases [31–35]. As remarked above, in this theory the particles are of course molecules and there are thus essential differences between the two situations that must be duly taken into account. In particular, the intermolecular collisions are frequently elastic, whereas this is not a reasonable assumption when dealing with particles of a granular material. In addition, the sizes of the constituent particles are considerably different in the two cases. This second circumstance indicates that the evolution equation for the distribution function cannot be analogous to the equation originally established by Boltzmann in 1872 [1, 31–36] but rather to that proposed by Enskog [37–40] to describe dense gases.
One can also take advantage of the results obtained by Campbell and Brennen [5–7] who simulated on a computer the dynamics of idealized granular materials, made of disk-shaped particles; their calculations showed that, at least for the special scheme that they adopted to describe interparticle collisions, a considerable fraction of fluctuation kinetic energy was of rotational nature. This remark led Lun and Savage [19] to study the importance of the rotational degrees of freedom of the grains, which had been completely neglected in previous studies. A theory of the kind proposed by Lun and Savage shows a notable analogy with the kinetic theory of a dense gas of rough spheres, proposed by Cercignani and Lampis [38–40], who were not aware that the same model had been already utilized in chemical physics by Condiff et al. [41]. In the present chapter the basic principles of the theory are examined and, in particular, the collision relations and the balance relations for mass, momentum and energy for granular materials are discussed. The final aim of theories of this kind should be, according to most authors, to generate constitutive relations holding in the presence of high shear stresses. One should not neglect, however, the possibility of studying significant effects of granular nature in the neighborhood of solid boundaries, where boundary layers of a still unexplored nature might form. Having this in mind, one should also discuss the boundary conditions to be imposed upon the distribution functions; for this we refer the reader to a previous chapter of the author [42].
2.2 Kinetic Theory of Smooth Spheres
The exact dynamics of $N$ particles is a useful conceptual tool, but cannot in any way be used in practical calculations for a gas because it requires a huge number of real variables (of the order of $10^{20}$). This was realized by Maxwell and Boltzmann when they started to work with the one-particle probability density, or distribution function, $P^{(1)}(x, \xi, t)$. The latter is a function of seven variables, i.e., the components of the two vectors $x$ and $\xi$, and time $t$. In particular, Boltzmann wrote an evolution equation for $P^{(1)}$, which bears his name.

Let us first consider the meaning of $P^{(1)}(x, \xi, t)$; it gives the probability density of finding one fixed particle (say, the one labeled by 1) at a certain point $(x, \xi)$ of the six-dimensional reduced phase space associated with the position and velocity of that molecule, which we shall for the moment assume to be a hard smooth sphere, whose center has position $x$. When the molecules collide, momentum and kinetic energy are conserved; thus the velocities after the impact, $\xi_1'$ and $\xi_2'$, are related to those before the impact, $\xi_1$ and $\xi_2$, by

$$\xi_1' = \xi_1 - n[n\cdot(\xi_1 - \xi_2)], \qquad \xi_2' = \xi_2 + n[n\cdot(\xi_1 - \xi_2)], \tag{2.1}$$

where $n$ is the unit vector along $\xi_1 - \xi_1'$. Note that the relative velocity

$$V = \xi_1 - \xi_2 \tag{2.2}$$

satisfies

$$V' = V - 2n(n\cdot V), \tag{2.3}$$

i.e. undergoes a specular reflection at the impact. This means that if we split $V$ at the point of impact into a normal component $V_n$, directed along $n$, and a tangential component $V_t$ (in the plane normal to $n$), then $V_n$ changes sign and $V_t$ remains unchanged in a collision.

Let us remark that, in the absence of collisions, $P^{(1)}$ would remain unchanged along the trajectory of a particle. Accordingly we must evaluate the effects of collisions on the time evolution of $P^{(1)}$. The probability of occurrence of a collision is related to the probability of finding another molecule with a center at exactly one diameter from the center of the first one, whose distribution function is $P^{(1)}$. Thus, generally speaking, in order to write the evolution equation for $P^{(1)}$ we shall need another function, $P^{(2)}$, which gives the probability density of finding, at time $t$, the first molecule at $x_1$ with velocity $\xi_1$ and the second at $x_2$ with velocity $\xi_2$; obviously $P^{(2)} = P^{(2)}(x_1, x_2, \xi_1, \xi_2, t)$. Hence $P^{(1)}$ satisfies an equation of the following form:

$$\frac{\partial P^{(1)}}{\partial t} + \xi_1\cdot\frac{\partial P^{(1)}}{\partial x_1} = G - L. \tag{2.4}$$
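The collision rule (2.1) and the reflection law (2.3) can be checked numerically. The following is a minimal sketch for two equal-mass smooth spheres; the helper names and the sample velocities are illustrative assumptions, not part of the original text.

```python
def dot(u, v):
    """Euclidean scalar product of two 3-vectors stored as lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def smooth_collision(xi1, xi2, n):
    """Post-collision velocities of two equal smooth hard spheres, rule (2.1)."""
    vn = dot(n, [p - q for p, q in zip(xi1, xi2)])   # n . (xi1 - xi2)
    xi1_post = [p - vn * c for p, c in zip(xi1, n)]
    xi2_post = [q + vn * c for q, c in zip(xi2, n)]
    return xi1_post, xi2_post


# Illustrative (hypothetical) pre-collision data.
xi1 = [1.0, -2.0, 0.5]
xi2 = [-0.3, 0.4, 2.0]
n = [2.0 / 3.0, -1.0 / 3.0, 2.0 / 3.0]   # unit vector: 4/9 + 1/9 + 4/9 = 1

xi1_post, xi2_post = smooth_collision(xi1, xi2, n)

V = [p - q for p, q in zip(xi1, xi2)]                 # relative velocity (2.2)
V_post = [p - q for p, q in zip(xi1_post, xi2_post)]
V_reflected = [v - 2.0 * dot(n, V) * c for v, c in zip(V, n)]   # right-hand side of (2.3)

momentum_before = [p + q for p, q in zip(xi1, xi2)]
momentum_after = [p + q for p, q in zip(xi1_post, xi2_post)]
energy_before = dot(xi1, xi1) + dot(xi2, xi2)          # twice the kinetic energy per unit mass
energy_after = dot(xi1_post, xi1_post) + dot(xi2_post, xi2_post)
```

Comparing `momentum_before` with `momentum_after`, `energy_before` with `energy_after`, and `V_post` with `V_reflected` confirms, up to rounding, the conservation laws and the specular reflection of the relative velocity.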
Here $L\,dx_1\,d\xi_1\,dt$ gives the expected number of particles with position between $x_1$ and $x_1 + dx_1$ and velocity between $\xi_1$ and $\xi_1 + d\xi_1$ which disappear from these ranges of values because of a collision in the time interval between $t$ and $t + dt$, and $G\,dx_1\,d\xi_1\,dt$ gives the analogous number of particles entering the same range in the same time interval. The count of these numbers is easy, provided we use the trick of imagining particle 1 as a sphere at rest and endowed with twice the actual diameter $a$ and the other particles to be point masses with velocity $(\xi_i - \xi_1) = V_i$. In fact, each collision will send particle 1 out of the above range and the number of the collisions of particle 1 will be the number of expected collisions of any other particle with that sphere. Since there are exactly $(N - 1)$ identical point masses and multiple collisions are disregarded, $G = (N - 1)g$ and $L = (N - 1)l$, where the lower case letters indicate the contribution of a fixed particle, say particle 2. We shall then compute the effect of the collisions of particle 2 with 1.

Let $x_2$ be a point of the sphere such that the vector joining the center of the sphere with $x_2$ is $an$, where $n$ is a unit vector. A cylinder with height $|V\cdot n|\,dt$ (where we write just $V$ for $V_2$) and base area $dS = a^2\,dn$ (where $dn$ is the area of a surface element of the unit sphere about $n$) will contain the particles with velocity $\xi_2$ hitting the base $dS$ in the time interval $(t, t + dt)$; its volume is $a^2\,dn\,|V\cdot n|\,dt$. Thus the number of collisions of particle 2 with particle 1 in the ranges $(x_1, x_1 + dx_1)$, $(\xi_1, \xi_1 + d\xi_1)$, $(x_2, x_2 + dx_2)$, $(\xi_2, \xi_2 + d\xi_2)$, $(t, t + dt)$ occurring at points of $dS$ is $P^{(2)}(x_1, x_2, \xi_1, \xi_2, t)\,dx_1\,d\xi_1\,d\xi_2 \times a^2\,dn\,|V_2\cdot n|\,dt$. If we want the number of collisions of particle 1 with 2, when the range of the former is fixed but the latter may have any velocity $\xi_2$ and any position $x_2$ on the sphere (i.e. any $n$), we integrate over the sphere and all the possible velocities of particle 2 to obtain:

$$l\,dx_1\,d\xi_1\,dt = dx_1\,d\xi_1\,dt \int_{\mathbb{R}^3}\int_{B^-} P^{(2)}(x_1, x_1 + an, \xi_1, \xi_2, t)\,|V\cdot n|\,a^2\,dn\,d\xi_2, \tag{2.5}$$

where $B^-$ is the hemisphere corresponding to $V\cdot n < 0$. Thus we have the following result:

$$L = (N - 1)a^2 \int_{\mathbb{R}^3}\int_{B^-} P^{(2)}(x_1, x_1 + an, \xi_1, \xi_2, t)\,|(\xi_2 - \xi_1)\cdot n|\,d\xi_2\,dn. \tag{2.6}$$

The calculation of the gain term $G$ is exactly the same as the one for $L$, except for the fact that we have to integrate over the hemisphere $B^+$, defined by $V\cdot n > 0$. Thus we have:

$$G = (N - 1)a^2 \int_{\mathbb{R}^3}\int_{B^+} P^{(2)}(x_1, x_1 + an, \xi_1, \xi_2, t)\,|(\xi_2 - \xi_1)\cdot n|\,d\xi_2\,dn. \tag{2.7}$$
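As a consistency check on the geometric factor in (2.5)–(2.7), the integral over $n$ can be evaluated in closed form if, for illustration only, the dependence of $P^{(2)}$ on $n$ through $x_1 + an$ is neglected (an assumption introduced here, not made at this stage of the text). Using polar coordinates about the direction of $V = \xi_2 - \xi_1$, with $\theta$ the angle between $n$ and $V$:

$$\int_{B^-} |V\cdot n|\,dn = |V|\int_0^{2\pi} d\varphi \int_{\pi/2}^{\pi} |\cos\theta|\,\sin\theta\,d\theta = 2\pi\,|V|\cdot\frac{1}{2} = \pi\,|V|.$$

Hence $a^2\int_{B^-}|V\cdot n|\,dn = \pi a^2\,|V|$, the volume swept per unit time by a disk of radius $a$ moving with speed $|V|$; $\pi a^2$ is the collision cross-section of two spheres of diameter $a$. The same value is obtained on $B^+$.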
The probability density $P^{(2)}$ is continuous at a collision; in other words, although the velocities of the particles undergo the discontinuous change described by equation (2.1), we can write:

$$P^{(2)}(x_1, \xi_1, x_2, \xi_2, t) = P^{(2)}(x_1, \xi_1 - n(n\cdot V), x_2, \xi_2 + n(n\cdot V), t), \quad \text{if } |x_1 - x_2| = a, \tag{2.8}$$

with $V = \xi_1 - \xi_2$ as in (2.2). For brevity, we write (in agreement with equation (2.1)):

$$\xi_1' = \xi_1 - n(n\cdot V), \qquad \xi_2' = \xi_2 + n(n\cdot V). \tag{2.9}$$

Inserting equation (2.8) in equation (2.7) we thus obtain:

$$G = (N - 1)a^2 \int_{\mathbb{R}^3}\int_{B^+} P^{(2)}(x_1, x_1 + an, \xi_1', \xi_2', t)\,|(\xi_2 - \xi_1)\cdot n|\,d\xi_2\,dn, \tag{2.10}$$

which is a frequently used form. Sometimes $n$ is changed into $-n$ in order to have the same integration range as in $L$; the only change (in addition to the change in the range) is in the second argument of $P^{(2)}$, which becomes $x_1 - an$.

At this point we are ready to understand Boltzmann's argument. $N$ is a very large number and $a$ (expressed in common units, such as, e.g., centimeters) is very small; to fix the ideas, let us consider a box whose volume is 1 cm³ at room temperature and at atmospheric pressure. Then $N \cong 10^{20}$ and $a \cong 10^{-8}$ cm. Then $(N - 1)a^2 \cong Na^2 = 10^{4}\ \mathrm{cm}^2 = 1\ \mathrm{m}^2$ is a sizable quantity, while we can neglect the difference between $x_1$ and $x_1 + an$. This means that the equation to be written can be rigorously valid only in the so-called Boltzmann–Grad limit, when $N \to \infty$, $a \to 0$ with $Na^2$ finite. In addition, the collisions between two preselected particles are rather rare events. Thus two spheres that happen to collide can be thought to be two randomly chosen particles and it makes sense to assume that the probability density of finding the first molecule at $x_1$ with velocity $\xi_1$, and the second at $x_2$ with velocity $\xi_2$, is the product of the probability density of finding the first molecule at $x_1$ with velocity $\xi_1$ times the probability density of finding the second molecule at $x_2$ with velocity $\xi_2$. If we accept this we can write:

$$P^{(2)}(x_1, \xi_1, x_2, \xi_2, t) = P^{(1)}(x_1, \xi_1, t)\,P^{(1)}(x_2, \xi_2, t) \tag{2.11}$$

for two particles that are about to collide, or:

$$P^{(2)}(x_1, \xi_1, x_1 + an, \xi_2, t) = P^{(1)}(x_1, \xi_1, t)\,P^{(1)}(x_1, \xi_2, t) \quad \text{for } (\xi_2 - \xi_1)\cdot n < 0. \tag{2.12}$$
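The orders of magnitude behind the Boltzmann–Grad limit are easy to reproduce. The sketch below uses only the values of $N$ and $a$ quoted in the text; the additional quantity $Na^3$ is included here, as an illustration, to show that the volume occupied by the molecules is negligible while $Na^2$ is not.

```python
N = 1e20       # number of molecules in the 1 cm^3 box (order of magnitude from the text)
a_cm = 1e-8    # molecular diameter in cm (order of magnitude from the text)

Na2_cm2 = N * a_cm**2     # N a^2 in cm^2: the factor multiplying the collision integrals
Na2_m2 = Na2_cm2 / 1e4    # the same quantity in m^2 (1 m^2 = 10^4 cm^2)
Na3_cm3 = N * a_cm**3     # N a^3 in cm^3: order of the volume occupied by the molecules

print(f"N a^2 = {Na2_cm2:.1e} cm^2 = {Na2_m2:.1f} m^2")
print(f"N a^3 = {Na3_cm3:.1e} cm^3 (box volume: 1 cm^3)")
```

In the limit $N \to \infty$, $a \to 0$ with $Na^2$ fixed, the product $Na^3 = (Na^2)\,a$ tends to zero, which is why the shift from $x_1$ to $x_1 + an$ can be neglected while the collision term survives.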
Thus we can apply this recipe to the loss term (2.6) but not to the gain term in the form (2.7). It is possible, however, to apply equation (2.12) (with $\xi_1'$, $\xi_2'$ in place of $\xi_1$, $\xi_2$) to the form (2.10) of the gain term because the transformation (2.9) maps the hemisphere $B^+$ onto the hemisphere $B^-$. If we accept all the simplifying assumptions made by Boltzmann, we obtain the following form for the gain and loss terms:

$$G = Na^2 \int_{\mathbb{R}^3}\int_{B^-} P^{(1)}(x_1, \xi_1', t)\,P^{(1)}(x_1, \xi_2', t)\,|(\xi_2 - \xi_1)\cdot n|\,d\xi_2\,dn, \tag{2.13}$$

$$L = Na^2 \int_{\mathbb{R}^3}\int_{B^-} P^{(1)}(x_1, \xi_1, t)\,P^{(1)}(x_1, \xi_2, t)\,|(\xi_2 - \xi_1)\cdot n|\,d\xi_2\,dn. \tag{2.14}$$

By inserting these expressions in equation (2.4) we can write the Boltzmann equation in the following form:

$$\frac{\partial P^{(1)}}{\partial t} + \xi_1\cdot\frac{\partial P^{(1)}}{\partial x_1} = Na^2 \int_{\mathbb{R}^3}\int_{B^-} \left[P^{(1)}(x_1, \xi_1', t)\,P^{(1)}(x_1, \xi_2', t) - P^{(1)}(x_1, \xi_1, t)\,P^{(1)}(x_1, \xi_2, t)\right] |(\xi_2 - \xi_1)\cdot n|\,d\xi_2\,dn. \tag{2.15}$$

The Boltzmann equation is an evolution equation for $P^{(1)}$, without any reference to $P^{(2)}$. This is its main advantage. However, it has been obtained at the price of several assumptions; the chaos assumption present in equations (2.11) and (2.12) is particularly strong; for a discussion, see Refs. [1, 34].

The Boltzmann equation for smooth hard spheres can be generalized to other molecular models, the most obvious being the case of molecules which are identical point masses interacting with a central force, a good general model for monatomic gases. The only difference arises in the factor $a^2|(\xi_2 - \xi_1)\cdot n|$, which turns out to be replaced by a function of $V = |\xi_2 - \xi_1|$ and the angle $\theta$ between $n$ and $V$ [31–34]. If the gas is polyatomic, then the molecules have other degrees of freedom in addition to the translational ones. This in principle requires using quantum mechanics, but one can devise useful and accurate models in the classical scheme as well. Frequently the internal energy $E_i$ is the only additional variable that is needed, in which case one can think of the gas as a mixture of species, each differing from the others in the value of $E_i$. If the latter variable is discrete we obtain a strict analogy with a mixture; otherwise we have a continuum of species. We remark that in both cases, translational kinetic energy is not preserved by collisions because internal energy also enters into the balance; this means that a molecule changes its “species” when colliding. This is the simplest example of a “reacting collision”, which may be generalized to actual chemical species when chemical reactions occur.
For more information on mixtures and polyatomic gases the reader is referred to a book of the author [34]. Here we shall discuss just the rough sphere model, which in the case of gases can be considered as a particularly simple model arising in classical mechanics, but turns out to be particularly significant when dealing with granular materials. Before embarking on a discussion of this model, we remark that the unknown of the Boltzmann equation is not always chosen to be a probability density as we have done so far; it may be multiplied by a suitable factor and transformed into an (expected) number density or an (expected) mass density (in phase space, of course). The only thing that changes is the factor $N$ in equation (2.15), which becomes 1 or $1/m$. We shall choose the latter normalization and replace $P^{(1)}$ by $f$.
2.3 Collision Dynamics of Rough Spheres
The rough sphere model was first introduced in kinetic theory by Bryan [43] for the purpose of allowing energy exchange in a collision between translational and rotational degrees of freedom in a simple way. This model, later used by other authors [38–41], assumes a perfect roughness, i.e. a reversal of the relative velocity of the contact points at a collision. In addition, the mass distribution is assumed to be such that the center of mass of each particle is in the center of the sphere and the inertia tensor is isotropic. If we denote by $\xi_1, \xi_2, \omega_1, \omega_2$ the center-of-mass and angular velocities of two colliding spheres just before the collision, by $\xi_1', \xi_2', \omega_1', \omega_2'$ the same velocities just after the collision, and by $n$ the unit vector from the center of the first sphere to the center of the second, the relative velocity at the contact point, before collision, is given by

$$V = \xi_2 + \tfrac{1}{2}a\,n\wedge\omega_2 - \xi_1 + \tfrac{1}{2}a\,n\wedge\omega_1. \tag{2.16}$$

Since the relative velocity is reversed as a result of the collision, we also have:

$$V = -\xi_2' - \tfrac{1}{2}a\,n\wedge\omega_2' + \xi_1' - \tfrac{1}{2}a\,n\wedge\omega_1'. \tag{2.17}$$

Equations (2.16) and (2.17) supply a vector relation between the four vectors $\xi_1, \xi_2, \omega_1, \omega_2$ on one side and $\xi_1', \xi_2', \omega_1', \omega_2'$ on the other side. In addition, we can apply conservation of total momentum and the conservation of angular momentum of each sphere with respect to the contact point. We thus obtain four vector equations for the four unknown vectors, which can be solved to yield:

$$\xi_1' = \xi_1 + \frac{kV + n(n\cdot V)}{k + 1}, \tag{2.18}$$

$$\xi_2' = \xi_2 - \frac{kV + n(n\cdot V)}{k + 1}, \tag{2.19}$$

$$\omega_1' = \omega_1 + \frac{2\,n\wedge V}{a(k + 1)}, \tag{2.20}$$

$$\omega_2' = \omega_2 + \frac{2\,n\wedge V}{a(k + 1)}, \tag{2.21}$$
where $V$ is of course given by equation (2.16) and the constant $k$ depends upon the sphere mass $m$ and the moment of inertia $I$ in the following way:

$$k = \frac{4I}{ma^2} \quad \left(= \frac{2}{5} \text{ for homogeneous spheres}\right). \tag{2.22}$$

Energy conservation is automatically satisfied as a consequence of the other assumptions. Equations (2.18)–(2.21) give the final velocities in terms of the initial ones. If we bring the terms containing $V$ from the right-hand sides into the corresponding left-hand sides and Equation (2.17) is used to express $V$, we obtain $\xi_1, \xi_2, \omega_1, \omega_2$ in terms of $\xi_1', \xi_2', \omega_1', \omega_2'$, that is, the initial velocities in terms of the final ones. These relations may be used to express the velocities $\xi', \xi_*', \omega', \omega_*'$, which appear in the Boltzmann–Enskog equation, to be introduced in the next section, in terms of $\xi, \xi_*, \omega, \omega_*$.

As already mentioned, the perfectly rough sphere model has some advantages when used to represent gas molecules, but is not so convenient for the purpose of describing the particles of a granular material. As a matter of fact, the normal component (along $n$) and the tangential one (perpendicular to $n$) of the relative velocity are not simply reversed in a collision between grains. The collision process is, generally speaking, a complicated phenomenon depending on elastoplastic deformations of the particle material and friction phenomena between the particles themselves. This circumstance can be taken into account by introducing two coefficients, $e$ and $\beta$, which characterize the collision process [19]. If $V$ and $V'$ are the relative velocities of the contact points before and after a collision, obtained by evaluating the right-hand side of Equation (2.16) with the velocities before and after the collision, respectively, one lets:

$$(n\cdot V') = -e\,(n\cdot V), \tag{2.23}$$

$$(n\wedge V') = -\beta\,(n\wedge V). \tag{2.24}$$
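The perfectly rough collision rule (2.18)–(2.21), which corresponds to the reversal $V' = -V$ of the contact-point relative velocity, can be checked numerically. The sketch below is illustrative: the helper names, the sample velocities and the choice $m = a = 1$ are assumptions made here, while the formulas are those of (2.16) and (2.18)–(2.22).

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def axpy(u, v, s):
    """Return u + s * v componentwise."""
    return [ui + s * vi for ui, vi in zip(u, v)]


def contact_velocity(xi1, xi2, w1, w2, n, a):
    """Relative velocity of the contact points, right-hand side of (2.16)."""
    spin = cross(n, axpy(w1, w2, 1.0))            # n ^ (w1 + w2)
    return [x2 - x1 + 0.5 * a * s for x1, x2, s in zip(xi1, xi2, spin)]


def rough_collision(xi1, xi2, w1, w2, n, a, k):
    """Post-collision velocities of perfectly rough spheres, (2.18)-(2.21)."""
    V = contact_velocity(xi1, xi2, w1, w2, n, a)
    nV = dot(n, V)
    d_xi = [(k * Vi + nV * ni) / (k + 1.0) for Vi, ni in zip(V, n)]
    d_w = [2.0 * c / (a * (k + 1.0)) for c in cross(n, V)]
    return (axpy(xi1, d_xi, 1.0), axpy(xi2, d_xi, -1.0),
            axpy(w1, d_w, 1.0), axpy(w2, d_w, 1.0))


def energy(xi1, xi2, w1, w2, m, inertia):
    """Total translational plus rotational kinetic energy of the pair."""
    return (0.5 * m * (dot(xi1, xi1) + dot(xi2, xi2))
            + 0.5 * inertia * (dot(w1, w1) + dot(w2, w2)))


# Illustrative (hypothetical) data: homogeneous spheres, k = 2/5.
m, a, k = 1.0, 1.0, 0.4
inertia = k * m * a**2 / 4.0                      # from (2.22)
n = [2.0 / 3.0, -1.0 / 3.0, 2.0 / 3.0]            # unit vector
before = ([1.0, -2.0, 0.5], [-0.3, 0.4, 2.0], [0.2, 1.5, -0.7], [-1.1, 0.3, 0.9])

after = rough_collision(*before, n, a, k)
V_before = contact_velocity(*before, n, a)
V_after = contact_velocity(*after, n, a)
E_before = energy(*before, m, inertia)
E_after = energy(*after, m, inertia)
```

Up to rounding, `V_after` equals `-V_before`, the sum of the two translational velocities is unchanged, and `E_after` equals `E_before`, in agreement with the statement that energy conservation follows from the other assumptions.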
For perfectly rough spheres $e = \beta = 1$, whereas $e = 1$, $\beta = -1$ for perfectly smooth spheres. The coefficients $e$ and $\beta$ are far from being constant. They will not only depend on the size of the grains and their material composition, but also on the collision velocity (as indicated by experimental results [44] and also by theoretical considerations on the plastic deformations which occur during a collision [45]). However, it is clear that in a theoretical treatment, which is already of notable complexity, one will be obliged to take constant values for $e$ and $\beta$: this is also assumed in the aforementioned numerical calculations by Campbell and Brennen [6], who assumed $\beta = 0$ and $e = 1$. The value of $\beta$ is particularly interesting because it has a marked influence upon the collision dynamics. When $\beta = -1$ the spheres are smooth and all the kinetic energy can be attributed to translational degrees of freedom. When $\beta$ grows, the roughness of the spheres leads to a loss of translational kinetic energy due to friction and the energy exchange between translational and rotational degrees of freedom also grows. For $\beta > 0$, particle adherence and tangential elasticity grow: a sort of bounce-back takes place which tends to reverse the tangential components at the contact point. For $\beta = 1$ (and $e = 1$) the particles
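are perfectly rough and elastic and energy is no longer dissipated. The case β > 0 should not be interesting for typical granular materials, which do not tend to adhere to the point of reversing relative velocities. Equations (2.23) and (2.24) can be associated, in place of equations (2.16) and (2.17), with the conservation equations in order to obtain the velocities after collision in terms of those before collision. The result reads as follows:

\[ \boldsymbol{\xi}_1' = \boldsymbol{\xi}_1 - \eta_t \mathbf{V} - (\eta_n - \eta_t)\,\mathbf{n}(\mathbf{n}\cdot\mathbf{V}) + \eta_t \frac{a}{2}\,\mathbf{n}\wedge(\boldsymbol{\omega}_1 + \boldsymbol{\omega}_2), \tag{2.25} \]
\[ \boldsymbol{\xi}_2' = \boldsymbol{\xi}_2 + \eta_t \mathbf{V} + (\eta_n - \eta_t)\,\mathbf{n}(\mathbf{n}\cdot\mathbf{V}) - \eta_t \frac{a}{2}\,\mathbf{n}\wedge(\boldsymbol{\omega}_1 + \boldsymbol{\omega}_2), \tag{2.26} \]
\[ \boldsymbol{\omega}_1' = \boldsymbol{\omega}_1 - \frac{2\eta_t(\mathbf{n}\wedge\mathbf{V})}{ka} + \frac{\eta_t}{ka}\,\mathbf{n}\wedge(\mathbf{n}\wedge\mathbf{V}), \tag{2.27} \]
\[ \boldsymbol{\omega}_2' = \boldsymbol{\omega}_2 - \frac{2\eta_t(\mathbf{n}\wedge\mathbf{V})}{ka} + \frac{\eta_t}{ka}\,\mathbf{n}\wedge(\mathbf{n}\wedge\mathbf{V}), \tag{2.28} \]

where

\[ \eta_n = \frac{1+e}{2}, \qquad \eta_t = \frac{1+\beta}{2}\,\frac{k}{1+k}, \tag{2.29} \]
\[ \mathbf{V} = \boldsymbol{\xi}_1 - \boldsymbol{\xi}_2 . \tag{2.30} \]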
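The collision rule can be checked numerically. The following is a minimal sketch, not a transcription of (2.27)–(2.28): the translational update is the right-hand side of (2.25)–(2.26), while the angular update is obtained here from the assumption that the impulse acts at the contact point with lever arm a/2 and that I = kma²/4, as in (2.22). The contact-point relative velocity used in `contact_velocity` (ξ1 − ξ2 − (a/2) n ∧ (ω1 + ω2)) is likewise an assumption consistent with the structure of (2.25); all function names are illustrative.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def axpy(s, u, v):
    """Return s*u + v for 3-vectors stored as tuples."""
    return tuple(s * x + y for x, y in zip(u, v))

def contact_velocity(xi1, xi2, w1, w2, n, a):
    """Assumed contact-point relative velocity: (xi1 - xi2) - (a/2) n ^ (w1 + w2)."""
    V = axpy(-1.0, xi2, xi1)
    return axpy(-0.5 * a, cross(n, axpy(1.0, w1, w2)), V)

def collide(xi1, xi2, w1, w2, n, a=1.0, k=0.4, e=0.9, beta=0.0):
    """Post-collision velocities for constant e and beta; n is a unit vector."""
    eta_n = 0.5 * (1.0 + e)                      # (2.29)
    eta_t = 0.5 * (1.0 + beta) * k / (1.0 + k)   # (2.29)
    Vc = contact_velocity(xi1, xi2, w1, w2, n, a)
    # Impulse per unit mass on sphere 1: right-hand side of (2.25) minus xi1.
    j = axpy(-(eta_n - eta_t) * dot(n, Vc), n, tuple(-eta_t * c for c in Vc))
    # Assumed angular update (same for both spheres): (a/2) n ^ j divided by I/m = k a^2 / 4.
    dw = tuple((2.0 / (k * a)) * c for c in cross(n, j))
    return axpy(1.0, j, xi1), axpy(-1.0, j, xi2), axpy(1.0, dw, w1), axpy(1.0, dw, w2)

def energy_per_mass(xi1, xi2, w1, w2, a=1.0, k=0.4):
    """Translational plus rotational kinetic energy of the pair, divided by m."""
    return (0.5 * (dot(xi1, xi1) + dot(xi2, xi2))
            + (k * a * a / 8.0) * (dot(w1, w1) + dot(w2, w2)))

if __name__ == "__main__":
    state = ((1.0, 0.5, 2.0), (-0.3, 0.2, -1.0), (0.4, -1.0, 0.3), (-0.2, 0.7, 1.1))
    n = (0.0, 0.0, 1.0)
    after = collide(*state, n, e=0.8, beta=0.0)
    # Energy lost in the collision (per unit mass); positive for dissipative grains.
    print(energy_per_mass(*state) - energy_per_mass(*after))
```

Under these assumptions the updated contact velocity satisfies (2.23) and (2.24), total momentum is unchanged, and the kinetic energy is conserved only for e = 1, β = ±1.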
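Please note that for e = 1, β = 1 one reobtains Equations (2.18)–(2.21) (provided the relation between the V of (2.30) and the contact-point relative velocity given by (2.16) is duly taken into account). When e = 1, β = −1 (smooth spheres) one reobtains the classical relations [31–35]. One can see that as a consequence of the above relations, total momentum is conserved but the total mechanical energy decreases.

Since collisions are irreversible, the above equations are not invariant with respect to an exchange of the states before and after collision. At variance with what happens in the case of reversible collisions, the velocities that are transformed into ξ, ξ∗, ω, ω∗ by a collision are not ξ′, ξ∗′, ω′, ω∗′, but some other velocities ξ″, ξ∗″, ω″, ω∗″, which, assuming e and β to be constant, are given by

\[ \boldsymbol{\xi}_1'' = \boldsymbol{\xi}_1 - \frac{\eta_t}{\beta}\mathbf{V} - \left(\frac{\eta_n}{e} - \frac{\eta_t}{\beta}\right)\mathbf{n}(\mathbf{n}\cdot\mathbf{V}) + \frac{a\,\eta_t}{2\beta}\,\mathbf{n}\wedge(\boldsymbol{\omega}_1 + \boldsymbol{\omega}_2), \tag{2.31} \]
\[ \boldsymbol{\xi}_2'' = \boldsymbol{\xi}_2 + \frac{\eta_t}{\beta}\mathbf{V} + \left(\frac{\eta_n}{e} - \frac{\eta_t}{\beta}\right)\mathbf{n}(\mathbf{n}\cdot\mathbf{V}) - \frac{a\,\eta_t}{2\beta}\,\mathbf{n}\wedge(\boldsymbol{\omega}_1 + \boldsymbol{\omega}_2), \tag{2.32} \]
\[ \boldsymbol{\omega}_1'' = \boldsymbol{\omega}_1 - \frac{2\,\frac{\eta_t}{\beta}(\mathbf{n}\wedge\mathbf{V})}{ka} + \frac{\eta_t}{\beta ka}\,\mathbf{n}\wedge(\mathbf{n}\wedge\mathbf{V}), \tag{2.33} \]
\[ \boldsymbol{\omega}_2'' = \boldsymbol{\omega}_2 - \frac{2\,\frac{\eta_t}{\beta}(\mathbf{n}\wedge\mathbf{V})}{ka} + \frac{\eta_t}{\beta ka}\,\mathbf{n}\wedge(\mathbf{n}\wedge\mathbf{V}). \tag{2.34} \]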
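One can easily compute the Jacobian determinant

\[ J = \frac{\partial(\boldsymbol{\xi}_1'', \boldsymbol{\xi}_2'', \boldsymbol{\omega}_1'', \boldsymbol{\omega}_2'')}{\partial(\boldsymbol{\xi}_1, \boldsymbol{\xi}_2, \boldsymbol{\omega}_1, \boldsymbol{\omega}_2)} = -\frac{1}{e\beta^{2}} . \tag{2.35} \]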
Thus the Jacobian is not simply equal to −1 as in the case of reversible collisions (e = |β| = 1).
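The value in (2.35) can be recovered without expanding a 12 × 12 determinant. The following is a sketch under assumptions not spelled out in this section: the contact-point relative velocity is taken as V_c = ξ1 − ξ2 − (a/2) n ∧ (ω1 + ω2), and the conservation equations are taken to include, besides the total momentum P, the angular momenta L1 and L2 of each sphere about the contact point. The symbols P, L1, L2 and V_c are introduced here for illustration only.

```latex
% Linear coordinates on R^{12} (independent for k > 0, with I = k m a^2 / 4):
\begin{align*}
\mathbf{P}   &= \boldsymbol{\xi}_1 + \boldsymbol{\xi}_2, &
\mathbf{L}_1 &= I\boldsymbol{\omega}_1 - \tfrac{ma}{2}\,\mathbf{n}\wedge\boldsymbol{\xi}_1, \\
\mathbf{V}_c &= \boldsymbol{\xi}_1 - \boldsymbol{\xi}_2 - \tfrac{a}{2}\,\mathbf{n}\wedge(\boldsymbol{\omega}_1+\boldsymbol{\omega}_2), &
\mathbf{L}_2 &= I\boldsymbol{\omega}_2 + \tfrac{ma}{2}\,\mathbf{n}\wedge\boldsymbol{\xi}_2 .
\end{align*}
% In these coordinates the direct collision acts diagonally:
\begin{align*}
\mathbf{P}' &= \mathbf{P}, \qquad \mathbf{L}_1' = \mathbf{L}_1, \qquad \mathbf{L}_2' = \mathbf{L}_2
  && \text{(nine invariant components)}, \\
\mathbf{n}\cdot\mathbf{V}_c' &= -e\,\mathbf{n}\cdot\mathbf{V}_c
  && \text{(one normal component, from (2.23))}, \\
\mathbf{n}\wedge\mathbf{V}_c' &= -\beta\,\mathbf{n}\wedge\mathbf{V}_c
  && \text{(two tangential components, from (2.24))}.
\end{align*}
% Determinant of the direct map and of its inverse:
\[
\det(\text{direct map}) = 1^{9}\cdot(-e)\cdot(-\beta)^{2} = -e\beta^{2},
\qquad
J = \frac{1}{-e\beta^{2}} = -\frac{1}{e\beta^{2}} .
\]
```

Because the same linear change of coordinates is applied before and after the collision, the determinant computed in these coordinates equals the one in the original velocity variables.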
2.4 The Boltzmann–Enskog Equation

In order to study a granular material, we must write an equation for the distribution function f = f(x, ξ, ω, t), which gives information on the number of particles having position x, center-of-mass velocity ξ and angular velocity ω at time t. We remark that the Euler angles giving the orientation of the spheres are treated as ignorable coordinates; this lends additional strength to the assumption of spherical particles because one can think that an average over nonsphericity has been performed. We do not know how to deduce rigorously the evolution equation satisfied by f from particle dynamics, unless we consider the case of a rarefied material (and even in this case with notable restrictions [46]). We shall hence adopt the spirit of the monograph of Truesdell and Muncaster on kinetic theory [35] and postulate (with some motivation) the evolution equation for f in the following form for the case of a molecular gas:

\[ \frac{\partial f}{\partial t} + \boldsymbol{\xi}\cdot\frac{\partial f}{\partial \mathbf{x}} + \mathbf{X}\cdot\frac{\partial f}{\partial \boldsymbol{\xi}} = a^{2}\int d\boldsymbol{\xi}_*\, d\boldsymbol{\omega}_*\, d^{2}\mathbf{n}\,(\mathbf{n}\cdot\mathbf{V})_{+}\,\bigl[\, g(\mathbf{x}, \mathbf{x} - a\mathbf{n})\, f(\mathbf{x}, \boldsymbol{\xi}', \boldsymbol{\omega}', t)\, f(\mathbf{x} - a\mathbf{n}, \boldsymbol{\xi}_*', \boldsymbol{\omega}_*', t) - g(\mathbf{x}, \mathbf{x} + a\mathbf{n})\, f(\mathbf{x}, \boldsymbol{\xi}, \boldsymbol{\omega}, t)\, f(\mathbf{x} + a\mathbf{n}, \boldsymbol{\xi}_*, \boldsymbol{\omega}_*, t) \,\bigr], \tag{2.36} \]

where a is the sphere diameter, n the unit vector of the oriented segment from the molecule with position x and velocity ξ, ω to that with position x + an and velocity ξ∗, ω∗, whereas

\[ \mathbf{V} = \boldsymbol{\xi} - \boldsymbol{\xi}_* ; \qquad (\mathbf{n}\cdot\mathbf{V})_{+} = \begin{cases} \mathbf{n}\cdot\mathbf{V} & \text{if } \mathbf{n}\cdot\mathbf{V} > 0, \\ 0 & \text{if } \mathbf{n}\cdot\mathbf{V} \le 0. \end{cases} \tag{2.37} \]

In the gain term, ξ′, ξ∗′, ω′, ω∗′ indicate the molecular velocities just before a collision with unit vector −n which leads to after-collision velocities ξ, ξ∗, ω, ω∗. A modification of the Boltzmann equation of the form indicated in equation (2.36) was first introduced by Enskog [37] and is accordingly called the Boltzmann–Enskog equation or simply the Enskog equation. The function g represents the
statistical correlation function between the particles; it is because of this factor that we cannot attempt a rigorous derivation of equation (2.36). In fact, first of all, g should depend on ξ, ξ∗, ω, ω∗ as well and not only on x and x − an, although one believes that this dependence is not so important and, in any case, very difficult to evaluate. Even with the simplifying assumption that g depends just on x and on x − an, it is not easy to give an expression for g. In the case of smooth spheres, Enskog took for g its equilibrium value, i.e. a function of the number density

\[ n = \frac{1}{m}\int f \, d\boldsymbol{\xi}\, d\boldsymbol{\omega}. \tag{2.38} \]

However, more recent studies [47–50] show that g should preferably be thought of as a functional of n, implicitly defined in the manner to be presently described. Let us consider the functional

\[ \Xi(z(\mathbf{x}, t)) = \sum_{k=0}^{\infty} \frac{1}{k!} \int \prod_{j=1}^{k} \left( d\mathbf{x}_j \prod_{i=1}^{j-1} H(|\mathbf{x}_i - \mathbf{x}_j| - a)\, z(\mathbf{x}_j, t) \right), \tag{2.39} \]

where H is Heaviside's step function. The above series is clearly convergent because it can be bounded, term by term, by the series giving exp(∫ z(x, t) dx). Furthermore, z is related to n by

\[ n = z\,\frac{\delta(\log \Xi)}{\delta z}, \tag{2.40} \]

where δ/δz indicates the functional or variational derivative with respect to z. Equation (2.40) gives z(x | n(x, t)) by inversion; this inversion is certainly possible provided n is sufficiently small (for a detailed discussion we refer to an article by Cannone and Cercignani [51]). With these definitions one has

\[ g(\mathbf{x}_1, \mathbf{x}_2, t) = \frac{z(\mathbf{x}_1, t)\, z(\mathbf{x}_2, t)}{n(\mathbf{x}_1, t)\, n(\mathbf{x}_2, t)\, \Xi(t)}\, \frac{\delta^{2}(\log \Xi)}{\delta z^{2}}, \tag{2.41} \]

where δ²/δz² indicates the second functional derivative (a function of x1 and x2 and a functional of n(x, t)). As shown by Résibois [50] in the case of a smooth rigid sphere gas and by Cercignani and Lampis [39] in the case of a perfectly rough sphere gas, the above more complicated form of g plays an essential role in the proof of the H theorem and hence in the entropy balance. Presumably, however, this aspect is less important for a granular gas.

The theory of dense gases, whose molecules can be thought of as rough spheres, differs from the theory of granular materials with spherical grains because of the collision assumptions, which have been described in the previous section. In the present case the Boltzmann–Enskog equation turns out to require modifications, not only because of the different expressions of the velocities, but also because of
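the factor multiplying the gain term in the right-hand side of the equation. This factor arises because of two circumstances: the absolute value of the scalar product between the unit vector and the relative velocity is not the same when evaluated before and after a collision, and the absolute value of the Jacobian which appears in equation (2.35) is different from unity. Then the evolution equation takes on the following form:

\[ \frac{\partial f}{\partial t} + \boldsymbol{\xi}\cdot\frac{\partial f}{\partial \mathbf{x}} + \mathbf{X}\cdot\frac{\partial f}{\partial \boldsymbol{\xi}} = a^{2}\int d\boldsymbol{\xi}_*\, d\boldsymbol{\omega}_*\, d^{2}\mathbf{n}\,(\mathbf{n}\cdot\mathbf{V})_{+}\,\bigl[\, e^{-1}|J|\, g(\mathbf{x}, \mathbf{x} - a\mathbf{n})\, f(\mathbf{x}, \boldsymbol{\xi}'', \boldsymbol{\omega}'', t)\, f(\mathbf{x} - a\mathbf{n}, \boldsymbol{\xi}_*'', \boldsymbol{\omega}_*'', t) - g(\mathbf{x}, \mathbf{x} + a\mathbf{n})\, f(\mathbf{x}, \boldsymbol{\xi}, \boldsymbol{\omega}, t)\, f(\mathbf{x} + a\mathbf{n}, \boldsymbol{\xi}_*, \boldsymbol{\omega}_*, t) \,\bigr], \tag{2.42} \]

where the velocities before the collision ξ″, ξ∗″, ω″, ω∗″ are expressed in terms of ξ, ξ∗, ω, ω∗ in the way indicated in equations (2.31)–(2.34) and |J| is the absolute value of the Jacobian given by equation (2.35). We also remark that occasionally g has been supposed to depend on the strain rate as well [15, 26].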
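The two circumstances behind the factor multiplying the gain term can be made explicit in a short worked computation. This is a sketch only: double primes denote the pre-collision velocities that lead to ξ, ξ∗, ω, ω∗ (a notation assumed here), and the gain term is assumed to count collisions through those pre-collision velocities before changing variables.

```latex
\begin{align*}
|\mathbf{n}\cdot\mathbf{V}''| &= e^{-1}\,|\mathbf{n}\cdot\mathbf{V}|
  && \text{(from (2.23), read for the collision leading to the unprimed state)}, \\
d\boldsymbol{\xi}''\, d\boldsymbol{\xi}_*''\, d\boldsymbol{\omega}''\, d\boldsymbol{\omega}_*''
  &= |J|\; d\boldsymbol{\xi}\, d\boldsymbol{\xi}_*\, d\boldsymbol{\omega}\, d\boldsymbol{\omega}_* ,
  \qquad |J| = \frac{1}{e\beta^{2}}
  && \text{(from (2.35))}, \\
e^{-1}|J| &= \frac{1}{e^{2}\beta^{2}}
  \;\longrightarrow\; 1 \quad \text{for } e = |\beta| = 1 .
\end{align*}
```

In the reversible limit the factor reduces to unity and (2.42) goes back to (2.36).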
2.5 The Macroscopic Balance Equations

It is well known [31–35] that one can obtain balance equations for mass, momentum and energy from the Boltzmann equation. The same thing occurs when dealing with the Enskog equation for both smooth [37] and rough spheres [40, 41]. One may ask what happens in the case of granular materials made of inelastic particles. The problem has been examined both for smooth [16] and rough spheres [19]. The balance equation for mass follows by integration of the Boltzmann–Enskog equation with respect to ξ and ω and takes on the form traditional in continuum mechanics:

\[ \frac{\partial \rho}{\partial t} + \frac{\partial}{\partial \mathbf{x}}\cdot(\rho\mathbf{v}) = 0, \tag{2.43} \]

where

\[ \rho = \int_{\mathbb{R}^{6}} f \, d\boldsymbol{\xi}\, d\boldsymbol{\omega}, \tag{2.44} \]
\[ \rho\mathbf{v} = \int_{\mathbb{R}^{6}} \boldsymbol{\xi} f \, d\boldsymbol{\xi}\, d\boldsymbol{\omega}. \tag{2.45} \]
If we try to obtain the balance of momentum, it appears that momentum is not locally conserved; in fact collisions instantaneously transfer momentum from one point to a point located one particle diameter away. It is remarkable, however, that it is possible
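to represent this nonlocal transfer as an integral over the line joining the particle centers [39, 40]. Thus the momentum balance takes on the traditional form:

\[ \frac{\partial}{\partial t}(\rho\mathbf{v}) + \frac{\partial}{\partial \mathbf{x}}\cdot(\rho\mathbf{v}\mathbf{v} + \mathbf{p}) = \rho\mathbf{X}, \tag{2.46} \]

where

\[ \mathbf{p} = \mathbf{p}^{k} + \mathbf{p}^{c} \tag{2.47} \]

is made up of the traditional kinetic contribution,

\[ \mathbf{p}^{k} = \int_{\mathbb{R}^{6}} \mathbf{c}\mathbf{c}\, f \, d\boldsymbol{\xi}\, d\boldsymbol{\omega}, \qquad \mathbf{c} = \boldsymbol{\xi} - \mathbf{v}, \tag{2.48} \]

and a collisional contribution p^c, which is detailed in Refs. [38–40]. The tensor p^c is not symmetric; it becomes symmetric when the spheres are smooth and vanishes for a vanishing particle diameter. Thus a dense gas of rough spheres is a Cosserat continuum. It is possible to write the balance of angular momentum as well:

\[ \frac{\partial}{\partial t}(\rho\,\mathbf{x}\wedge\mathbf{v} + nI\boldsymbol{\omega}_0) + \frac{\partial}{\partial \mathbf{x}}\cdot\bigl[\mathbf{v}(\rho\,\mathbf{x}\wedge\mathbf{v} + nI\boldsymbol{\omega}_0) + \mathbf{x}\wedge\mathbf{p} + \mathbf{K}\bigr] = \rho\,\mathbf{x}\wedge\mathbf{X}, \tag{2.49} \]

where

\[ \mathbf{K} = \mathbf{K}^{k} + \mathbf{K}^{c} \tag{2.50} \]

is the couple stress tensor made up of a kinetic contribution

\[ \mathbf{K}^{k} = \int_{\mathbb{R}^{6}} \mathbf{c}\,\boldsymbol{\omega}\, f \, d\boldsymbol{\xi}\, d\boldsymbol{\omega}, \qquad \mathbf{c} = \boldsymbol{\xi} - \mathbf{v}, \tag{2.51} \]

and a collisional contribution K^c, which is detailed in Refs. [39, 40]. Obviously

\[ \boldsymbol{\omega}_0 = \int_{\mathbb{R}^{6}} \boldsymbol{\omega}\, f \, d\boldsymbol{\xi}\, d\boldsymbol{\omega}, \qquad \mathbf{c} = \boldsymbol{\xi} - \mathbf{v}. \tag{2.52} \]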
The energy balance appears to be interesting for several reasons. First of all, already in the case of a molecular gas [39, 40] the rate of work per unit area on a surface of normal unit vector m is not the sum of the rates of work of the stresses m · (pv) and of the couple stresses m · (Kω0), as assumed in most work on polar continua on the basis of a naive analogy with solid body dynamics. As a matter of fact, as early as 1962 Toupin [52] remarked that the traditional assumption on the surface power introduces an unnecessary and unnatural restriction and suggested introducing an extra rate of energy supply, which Dunn and Serrin [53] later reconsidered under the name of interstitial working.
Second, in the case of a granular gas one finds that together with the translational and rotational kinetic energy of the continuum, there are fluctuation energies associated with both kinds of degrees of freedom: these are analogous to the internal energy of a gas. It is common to talk of a “granular temperature” associated with these fluctuations: it is not the thermodynamic temperature, but rather a variable associated with the fluctuation energy by a rule similar to that relating the temperature of a perfect gas to its internal energy. This analogy has a purely formal character, but should not be discarded a priori; as a matter of fact, in a complete theory, it might, together with a “granular pressure” obtainable from the stress tensor, supply an intuitive description of the different kinds of aggregation of the grains. One might in this way speak of a “granular gas” or “granular liquid”, according to the degree of dilution or compactness of the material, and nothing excludes the possibility of describing “phase transitions” in granular materials.

Another important aspect of the energy balance is related to the fact that energy is not conserved in a collision between grains (unless e = 1, β = ±1). This circumstance leads to the appearance of a source term in the energy balance, representing the energy dissipation rate per unit volume, which can be explicitly given in terms of the distribution function.

A problem which has not been developed so far and should be studied along the lines of the analogous studies concerning polyatomic gases [38–40] is the balance of entropy for a granular material; a detailed study is required because the energy dissipation in a collision, absent in the kinetic theory of gases, must necessarily modify the traditional approach.
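The fluctuation energies mentioned above can be illustrated on a finite sample of grains. The sketch below is only one possible convention, assumed here and not taken from the text: equal-mass grains, the mean velocities computed as sample averages in the spirit of (2.45) and (2.52), and the translational and rotational “granular temperatures” (per unit mass) defined as one third of the mean squared fluctuation, by analogy with a perfect gas. The function names and the parameter `inertia_over_mass` (that is, I/m) are illustrative.

```python
def mean_vector(vectors):
    """Component-wise sample average of a list of 3-vectors."""
    count = len(vectors)
    return tuple(sum(v[i] for v in vectors) / count for i in range(3))

def mean_square_fluctuation(vectors, mean):
    """Sample average of |v - mean|^2."""
    return sum(sum((v[i] - mean[i]) ** 2 for i in range(3)) for v in vectors) / len(vectors)

def fluctuation_energies(xis, omegas, inertia_over_mass):
    """Return (v, omega0, T_tr, T_rot) for a sample of equal-mass grains.

    Assumed convention: T_tr = <|xi - v|^2> / 3 and
    T_rot = (I/m) <|omega - omega0|^2> / 3, both per unit mass.
    """
    v = mean_vector(xis)
    omega0 = mean_vector(omegas)
    t_tr = mean_square_fluctuation(xis, v) / 3.0
    t_rot = inertia_over_mass * mean_square_fluctuation(omegas, omega0) / 3.0
    return v, omega0, t_tr, t_rot

if __name__ == "__main__":
    xis = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
    omegas = [(0.0, 0.0, 2.0), (0.0, 0.0, -2.0)]
    print(fluctuation_energies(xis, omegas, inertia_over_mass=0.1))
```

With such a definition the two temperatures need not be equal, which reflects the separate fluctuation energies of the translational and rotational degrees of freedom.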
2.6 Concluding Remarks

We have presented an introduction to the kinetic theories of gases and granular materials. These theories have a notable interest from the viewpoints of both microscopic foundations of a dynamic theory of materials and applications to practical problems. We have seen that old and new techniques used in the kinetic theory of gases, in particular of dense gases, may be advantageously utilized in the study of granular materials. The new aspect, with respect to the case of a gas, is related to the fact that the collisions between particles do not preserve kinetic energy. As is well known, the equations of kinetic theory provide approximate solutions, which yield constitutive relations capable of “closing” the system of balance equations. This procedure cannot be immediately generalized to the case of granular materials, since (with the obvious exceptions of perfectly smooth or perfectly rough particles) a Maxwellian solution describing statistical equilibrium is not available. Recently Goldshtein and Shapiro [24], following a previous article by Goldshtein et al. [54], obtained an unsteady homogeneous solution which might play a role analogous to that played by a Maxwellian, as indicated in the articles
which have been just mentioned. As a matter of fact, Goldshtein and Shapiro [24] obtained equations analogous to Euler's (with a source in the energy equation) and applied them to a study of the vibrational motion of a layer of granular material put on an oscillating plate. They showed that shock waves may form in a granular material. A more systematic approach has subsequently been provided by Goldhirsch [55]. An alternative approach might be provided by using a Boltzmann equation with a velocity rescaled through a time-dependent thermal speed. In this case a space-homogeneous “equilibrium” exists, as conjectured by Ernst and Brito [56] and proved by Bobylev and the present author [57, 58].

REFERENCES

1. C. Cercignani, Ludwig Boltzmann. The Man Who Trusted Atoms, Oxford University Press, Oxford, 1998.
2. C. S. Campbell, Rapid granular flows, Ann. Rev. Fluid Mech., 22 (1990), 57–92.
3. N. V. Brilliantov and T. Pöschel, Kinetic Theory of Granular Gases, Oxford University Press, Oxford, 2004.
4. G. M. Homsy, R. Jackson, and J. R. Grace, Report of a Symposium on mechanics of fluidized beds, J. Fluid Mech., 236 (1992), 447–495.
5. T. G. Drake, Granular flow: physical experiments and their implication for microstructural theories, J. Fluid Mech., 225 (1990), 121–152.
6. C. S. Campbell and C. E. Brennen, Computer simulations of chute flows of granular materials, in Proceedings of the IUTAM Symposium on Deformation and Failure of Granular Materials, Delft, 1983.
7. C. S. Campbell and C. E. Brennen, Computer simulation of shear flows of granular materials, in Mechanics of Granular Materials: New Models and Constitutive Relations, J. T. Jenkins and M. Satake (eds.), Elsevier, Amsterdam, 1983, pp. 313–326.
8. C. S. Campbell and C. E. Brennen, Computer simulation of granular materials, J. Fluid Mech., 151 (1985), 167–188.
9. C. S. Campbell, The stress tensor for simple shear flows of a granular material, J. Fluid Mech., 203 (1989), 449–473.
10. A. D. Rosato, Y. Lan, and D. T. Wang, Vibratory particle-size sorting in multicomponent systems, Powder Tech., 66 (1991), 149–160.
11. J. A. C. Gallas, H. J. Herrmann, and S. Sokolowski, Two-dimensional powder transport on a vibrating belt, J. Phys. II France, 2 (1992), 1389–1400.
12. G. H. Ristow, Simulating granular flow with molecular dynamics, J. Phys. I France, 2 (1992), 649–662.
13. G. H. Ristow and H. J. Herrmann, Density patterns in granular media, submitted to Phys. Rev. Lett., 50 (1994), R5–R8.
14. G. H. Ristow, Molecular dynamics simulations of granular materials on the Intel iPSC/860, J. Mod. Phys. C., 3 (1992), 1281–1293.
15. J. T. Jenkins and S. B. Savage, A theory for the rapid flow of identical, smooth, nearly elastic, spherical particles, J. Fluid Mech., 130 (1983), 187–202.
16. C. K. K. Lun, S. B. Savage, D. J. Jeffrey, and N. Chepurniy, Kinetic theories for granular flow: inelastic particles in Couette flow and slightly inelastic particles in a general flow field, J. Fluid Mech., 140 (1984), 223–256.
17. J. T. Jenkins and M. W. Richman, Grad's 13-moment system for a dense gas of inelastic spheres, Arch. Rat. Mech. Anal., 87 (1985), 355–377.
18. J. T. Jenkins and M. W. Richman, Kinetic theory for plane flows of a dense gas of identical, rough, inelastic, circular disks, Phys. Fluids, 28 (1987), 3485–3494.
19. C. K. K. Lun and S. B. Savage, A simple kinetic theory for granular flows of rough, inelastic, spherical particles, Trans. ASME J. Appl. Mech., 54 (1987), 47–53.
20. C. K. K. Lun and S. B. Savage, The effect of an impact dependent coefficient of restitution on stresses developed by sheared granular materials, Acta Mech., 63 (1986), 15–44.
21. M. W. Richman, The source of second moment of dilute granular flows of highly inelastic spheres, J. Rheol., 33 (1989), 1293–1306.
22. C. K. K. Lun, Kinetic theory for granular flow of dense, slightly inelastic, slightly rough spheres, J. Fluid Mech., 233 (1991), 539–559.
23. C. Cercignani, Meccanica dei materiali granulari e teoria cinetica dei gas: una notevole analogia, Atti Sem. Mat. Fis. Univ. Modena, XXXVII (1989), 481–490.
24. A. Goldshtein and M. Shapiro, Mechanics of collisional motion of granular materials. Part I. General hydrodynamic equations, J. Fluid Mech., 282 (1995), 75–114.
25. S. Ogawa, A. Umezura, and N. Oshima, On the equations of fully fluidized granular materials, J. Appl. Math. Phys. (ZAMP), 31 (1980), 483–493.
26. S. B. Savage and D. J. Jeffrey, The stress tensor in a granular flow at high shear rates, J. Fluid Mech., 110 (1981), 265–272.
27. H. Shen and N. L. Ackermann, Constitutive relationships for fluid-solid mixtures, J. Eng. Mech. Div. ASCE, 108 (1982), 748–763.
28. N. L. Ackermann and H. Shen, Stresses in rapidly sheared fluid-solid mixtures, J. Eng. Mech. Div. ASCE, 108 (1982), 95–113.
29. P. K. Haff, Grain flow as a fluid mechanical phenomenon, J. Fluid Mech., 134 (1983), 401–430.
30. S. B. Savage, Streaming motions in a bed of vibrationally-fluidized dry granular material, J. Fluid Mech., 194 (1988), 457–458.
31. C. Cercignani, Mathematical Methods in Kinetic Theory, Plenum Press, New York, 1969.
32. C. Cercignani, The Boltzmann Equation and its Applications, Springer-Verlag, New York, 1988.
33. S. Chapman and T. G. Cowling, The Mathematical Theory of Nonuniform Gases, Cambridge University Press, London, 1960.
34. C. Cercignani, Rarefied Gas Dynamics. From Basic Concepts to Actual Calculations, Cambridge University Press, Cambridge, 2000.
35. C. Truesdell and R. G. Muncaster, Fundamentals of Maxwell's Kinetic Theory of a Simple Monatomic Gas, Academic Press, New York, 1980.
36. L. Boltzmann, Weitere Studien über das Wärmegleichgewicht unter Gasmolekülen, Wiener Berichte, 66 (1872), 275–370.
37. D. Enskog, Kungl. Svenska Vetenskaps Akademiens Handl., 63(4) (1921).
38. C. Cercignani and M. Lampis, Teoria Cinetica di un Gas Denso di Sfere Ruvide, Atti del Settimo Congresso AIMETA, Trieste (1984).
39. C. Cercignani and M. Lampis, On the kinetic theory of a dense gas of rough spheres, J. Stat. Phys., 53 (1988), 655–672.
40. C. Cercignani, Kinetic theory of a dense gas of rough spheres, J. Non-Equilib. Thermodyn., 11 (1986), 145–155.
41. D. W. Condiff, W. K. Lu, and J. S. Dahler, Transport properties of polyatomic fluids; a dilute gas of perfectly rough spheres, J. Chem. Phys., 42 (1965), 3445–3475.
42. C. Cercignani, Recent developments in the mechanics of granular materials, in Fisica matematica e ingegneria delle strutture: rapporti e compatibilità, G. Ferrarese (ed.), Pitagora Editrice, Bologna, 1995, pp. 119–132.
43. G. H. Bryan, Brit. Assoc. Reports, 83 (1894).
44. W. Goldsmith, Impact: The Theory and Physical Behavior of Colliding Solids, E. Arnold, London, 1960.
45. K. L. Johnson, One hundred years of Hertz contact, Proc. Inst. Mech. Engrs., 196 (1982), 363–378.
46. O. Lanford, Time evolution of large classical systems, in Proc. Battelle Rencontre on Dynamical Systems, J. Moser (ed.), LNP 35, Springer-Verlag, Berlin, 1975.
47. J. Lebowitz, J. Percus, and J. Sykes, Kinetic equation approach to time-dependent correlation functions, Phys. Rev., 188 (1969), 487–495.
48. J. Sykes, Short-time kinetic equations for hard spheres: comparison with other theories, J. Stat. Phys., 8 (1973), 279–292.
49. H. Van Beijeren and M. Ernst, The modified Enskog equation, Physica, 68, 437–456; The modified Enskog equation for mixtures, Physica, 70 (1973), 226–242.
50. P. Résibois, H-theorem for the (modified) non-linear Enskog equation, J. Stat. Phys., 19 (1978), 593–609.
51. M. Cannone and C. Cercignani, The inverse conjecture for the revised Enskog equation, J. Stat. Phys., 63 (1978), 363–387.
52. R. A. Toupin, Elastic material with couple stresses, Arch. Rat. Mech. Anal., 11 (1962), 385–414.
53. E. Dunn and J. Serrin, On the thermomechanics of interstitial working, Arch. Rat. Mech. Anal., 88 (1985), 95–133.
54. A. I. Goldshtein, V. N. Poturaev, and I. A. Shulyak, Structure of the equations of hydrodynamics for a medium consisting of inelastic rough spheres, Fluid Dynam., 25 (1990), 305–313.
55. I. Goldhirsch, Rapid granular flows, Annu. Rev. Fluid Mech., 35 (2003), 267–293.
56. M. H. Ernst and R. Brito, Scaling solutions of inelastic Boltzmann equations with over-populated high energy tails, J. Stat. Phys., 109 (2002), 407.
57. A. V. Bobylev and C. Cercignani, Self-similar asymptotics for the Boltzmann equation with inelastic and elastic interactions, J. Stat. Phys., 110 (2003), 333–375.
58. A. V. Bobylev, C. Cercignani, and G. Toscani, Proof of an asymptotic property of self-similar solutions of the Boltzmann for granular materials, J. Stat. Phys., 111 (2003), 403–416.
CHAPTER THREE

Quantization of Affine Bodies: Theory and Applications in Mechanics of Structured Media

Jan J. Sławianowski∗
Contents
3.1 Introduction
3.2 Classical Preliminaries
3.3 General Ideas of Quantization
Abstract

We discuss the kinematics and dynamics of bodies with affine degrees of freedom, i.e. homogeneously deformable “gyroscopes”. Special stress is laid on the status and physical justification of affine dynamical invariance. On the basis of the classical Hamiltonian formalism, the Schroedinger quantization procedure is performed. Some methods for the partial separation of variables, analytical treatment and the search for rigorous solutions are developed. The possibility of applications in the theory of structured media, nanophysics, and molecular physics is discussed.

Key Words: Affine degrees of freedom, Molecular dynamics, Nanophysics, Quantized media, Structured media
∗ Institute of Fundamental Technological Research, Polish Academy of Sciences, Świętokrzyska, Warsaw, Poland; e-mail: [email protected]

3.1 Introduction

The mechanics of affine bodies has been the subject of many papers [1–55]. It has been a field of intensive study in our group at the Institute of Fundamental Technological Research in Warsaw. To our knowledge, the idea of objects with affine degrees of freedom in mechanics first appeared in papers of Eringen [16, 56–58] devoted to structured continua, more precisely in his theory of micromorphic media. A micromorphic continuum is an affine extension of the micropolar Cosserat continuum. Roughly speaking, the Cosserat medium is a deformable continuum of infinitesimal gyroscopes. Similarly, the micromorphic body is a deformable continuum of infinitesimal homogeneously deformable gyroscopes. An affine model of collective degrees of freedom was also used in the theory of collective phenomena in atomic nuclei [59].

The idea of an affine body is interesting in itself from the point of view of analytical mechanics and the theory of dynamical systems. It is an instructive example of systems with degrees of freedom ruled by Lie groups. In the mechanics of non-constrained continua the configuration space may be identified with the group of all diffeomorphisms of the physical space (volume-preserving diffeomorphisms in the mechanics of ideal incompressible fluids). It is rather difficult to be rigorous with such infinite-dimensional groups. The affine model is placed between rigid-body mechanics and the general theory of deformable continua, i.e. it involves deformations but at the same time one deals there with a finite number of degrees of freedom. The Lie-group background of the geometry of the configuration space offers the possibility of the effective use of powerful analytic techniques. One can realize certain finite-dimensional generalizations when the configuration space geometry is ruled, e.g. by the projective or conformal group. Other finite-dimensional discretized approaches are also useful, but of course the models based on geometric transformation groups are particularly interesting and efficient.

The range of applications of the affine model of collective and internal degrees of freedom is very wide and has to do with various scales of physical phenomena:

• Macroscopic elastic problems when the length of excited waves is comparable with the linear size of the body.
• Purely computational and engineering problems connected with the finite element methods. A mixture of analytic and numerical procedures.
• Structured bodies, e.g. micromorphic continua and molecular crystals.
• Vibrations of astrophysical objects (stars, concentrations of cosmic dust), theory of the shape of the Earth.
• Molecular vibrations.
• Nuclear dynamics.
Obviously, the last two subjects must be based on the quantized version of the theory. Quantum description is also necessary in various problems concerning nanoscale phenomena, fullerenes, etc. It is a new fascinating subject where one deals with a very intriguing convolution of the classical and quantum levels, perhaps also with some as yet unsolved paradoxes from the realm of quantum-mechanical foundations, like decoherence. Quantization as a purely mathematical procedure is connected with certain ambiguities which may be resolved only a posteriori, on the basis of experimental data. There are some well-known problems with the ordering of operators. In models with a firm group-theoretic background there are some canonical procedures, usually confirmed by experiments. Because of this, the extensive geometric introduction presented below, almost a treatise as a matter of fact, is a constitutive element of the theory, motivated by deeper reasons than purely mathematical curiosity or artificial sophistication.
Affine models of degrees of freedom of the structured elements are very natural. When one deals with fullerenes, macromolecules, or microdefects, affine modes of motion are certainly the most relevant ones. There are molecules, e.g. P4, which have no other degrees of freedom; there are also ones for which non-affine behaviour is merely a small correction. As mentioned, affine modes also have to do with finite elements, when the body is described as an aggregate of small affine objects. Perhaps the quantization of such an approach might be a procedure alternative to the phonon description based on quantized plane elastic waves.

And finally, one of the most important things. The group-theoretical description of internal and collective modes is really effective when the dynamics is invariant (or in some sense almost invariant) with respect to the group underlying the kinematics of the problem. This is not the case in all models of affine bodies met in the literature. The kinematics there is affine, but the group of dynamical symmetries is broken to the isometry group. Because of this, Lie-group techniques are not used to their full extent and profit. Unlike this, we formulate here affinely-invariant dynamics, where elastic interactions may be encoded in appropriate kinetic energy models without (or “almost” without) any use of potential energy terms. This procedure is similar to that following from the Maupertuis variational principle. There are indications that just such models may be useful in condensed matter theory, where the structural elements are more sensitive to the geometry of a surrounding piece of the body, e.g. to the Cauchy deformation tensor, than to the “true” metric tensor of the physical space. This is something similar to the effective mass tensors of electrons in crystals. There is an interesting link between our models and theories of integrable lattices like Calogero-Moser, Sutherland, and others [60–62]. On the quantum level the deformation invariants behave like indistinguishable, exotically parastatistical one-dimensional “particles”.

Obviously, the real world, the arena of mechanical phenomena, is three dimensional. However, certain important invariance and other problems are explained in a more lucid way when described with non-physical generality, i.e. in n dimensions. By the way, two-dimensional problems are interesting not only in “Flatland” [63] but also in some realistic physical problems. At the same time, they are computationally simple due to some exceptional, so to speak pathological, feature of GL(2, R) among all GL(n, R) (SO(2, R) is Abelian, whereas SO(n, R) for n > 2 are semisimple).
3.2 Classical Preliminaries

Let us briefly describe various models of the configuration space of an affinely rigid body. Which of them is more convenient depends on the particular problem under consideration. The possibility and usefulness of many choices of geometric structures underlying physically the same degrees of freedom was
pointed out by Capriz [2, 64–69]. Various descriptions differ in assuming some auxiliary geometric objects.

We begin with some elementary concepts of affine geometry, just to fix the language and notation. An affine space is given by a triple (X, E, →), where X is a point set, just the “space itself”, E is a linear space of translations in X, and the arrow → denotes a mapping from the Cartesian product X × X onto E; the vector assigned to (p, q) ∈ X × X is denoted by $\overrightarrow{pq}$. The arrow operation satisfies some axioms, namely,

(i) $\overrightarrow{pq} + \overrightarrow{qr} + \overrightarrow{rp} = 0$ for any p, q, r ∈ X,
(ii) for any p ∈ X and v ∈ E there exists exactly one q ∈ X such that $\overrightarrow{pq} = v$; we write q = t_v(p).

For any v ∈ E, t_v: X → X is a one-to-one mapping of X onto X, the translation by v. And obviously,

\[ t_v \circ t_u = t_u \circ t_v = t_{v+u}, \qquad t_0 = \mathrm{id}_X, \qquad t_u^{-1} = t_{-u}. \]

In this way E, considered as an additive Abelian group, acts freely and transitively on X. Any linear space E may be considered as an affine space (E, E, −), i.e. $\overrightarrow{uv} = v - u$. The axiom (i) implies that $\overrightarrow{pp} = 0$, $\overrightarrow{pq} = -\overrightarrow{qp}$ for any p, q ∈ X.

Let Ω be an arbitrary set, in general a structureless one. The set of all mappings from Ω into X, denoted by X^Ω, is simply the Ω-indexed Cartesian product of X. For any mapping f: Ω → X, the image f(ω) ∈ X is interpreted as the ωth component of f. When Ω is a finite N-element set, e.g. Ω = {1, 2, . . . , N}, this is just the familiar finite Cartesian product X^N. The set X^Ω is in a natural way an affine space. Its translation space is identical with E^Ω, the set of all mappings from Ω into E. If F, G are mappings from Ω into X, then the corresponding translation vector $\overrightarrow{FG}$ ∈ E^Ω is simply given by

\[ \bigl(\overrightarrow{FG}\bigr)(\omega) := \overrightarrow{F(\omega)G(\omega)}, \tag{3.1} \]

for any ω ∈ Ω. One can easily show that all axioms of affine geometry are then satisfied.

Affine mappings, by definition, preserve all affine relationships between figures and points. So, if (N, U, →), (M, V, →) are affine spaces, then we say that Φ: N → M is affine if there exists a linear mapping L[Φ]: U → V, denoted also by DΦ, such that for any p, q ∈ N the following holds:

\[ \overrightarrow{\Phi(p)\Phi(q)} = L[\Phi]\,\overrightarrow{pq}. \]

The mapping L[Φ]: U → V is referred to as the linear part of Φ. The set of all affine mappings from N to M will be denoted by Af(N, M); similarly, L(U, V)
denotes the set of linear mappings. If $\Phi_1 \in \mathrm{Af}(P, M)$ and $\Phi_2 \in \mathrm{Af}(N, P)$, then $\Phi_1 \circ \Phi_2 \in \mathrm{Af}(N, M)$ and $L[\Phi_1 \circ \Phi_2] = L[\Phi_1] L[\Phi_2]$. The dimension of the translation space $E$ is referred to as the dimension of $X$ itself. Any fixed point $p \in M$ establishes the bijection of $M$ onto $V$ given by $M \ni q \mapsto \overrightarrow{pq} \in V$. Such $V$-valued charts endow $M$ with the structure of an analytic differential manifold of dimension $\dim V$. The manifold of affine injections from $N$ into $M$ will be denoted by $\mathrm{AfI}(N, M)$, and the corresponding set of linear injections from $U$ into $V$ by $\mathrm{LI}(U, V)$. They are open submanifolds of $\mathrm{Af}(N, M)$, $L(U, V)$, respectively. Obviously, they are non-empty if and only if $\dim M \geq \dim N$. If $\dim M = \dim N$, they become, respectively, the manifolds of affine and linear isomorphisms. If $N = M$ and $U = V$, i.e. when we work within some fixed affine space $(M, V, \rightarrow)$, then a simplified notation is used: $L(V, V)$, $\mathrm{Af}(M, M)$, $\mathrm{LI}(V, V)$, $\mathrm{AfI}(M, M)$ are denoted, respectively, by $L(V)$, $\mathrm{Af}(M)$, $\mathrm{GL}(V)$, $\mathrm{GAf}(M)$.
Obviously, the last two sets are groups, respectively, the general linear group in $V$ and the general affine group in $M$. Translations are affine isomorphisms; their set $T[V] = \{t_v : v \in V\}$ is a normal subgroup of $\mathrm{GAf}(M)$. This subgroup is the kernel of the group epimorphism $\mathrm{GAf}(M) \ni \varphi \mapsto L[\varphi] \in \mathrm{GL}(V)$. The quotient group $\mathrm{GAf}(M)/T[V]$ is isomorphic with $\mathrm{GL}(V)$ but in a non-canonical way; any choice of centre $o \in M$ gives rise to some isomorphism. The set of affine mappings from $(N, U, \rightarrow)$ to $(M, V, \rightarrow)$, i.e. $\mathrm{Af}(N, M)$, is an affine subspace of $M^{N}$ in the sense of (3.1); the translation space is identified with $V^{N}$. If some origin point $O \in N$ is chosen, then the manifold $\mathrm{Af}(N, M)$ may be reduced to the Cartesian product $M \times L(U, V)$. Namely, with any $\Phi \in \mathrm{Af}(N, M)$ we associate the pair $(x, \varphi) \in M \times L(U, V)$ such that $x = \Phi(O)$ and $\overrightarrow{\Phi(O)\Phi(a)} = \varphi \cdot \overrightarrow{Oa}$. When we restrict ourselves to the open submanifold of affine isomorphisms $\mathrm{AfI}(N, M) \subset \mathrm{Af}(N, M)$, then $\varphi$ in the above expression runs over the open submanifold $\mathrm{LI}(U, V) \subset L(U, V)$. And finally, let us fix some linear frame, i.e. an ordered basis in $U$, $E = (E_1, \ldots, E_A, \ldots, E_n)$, $n = \dim U = \dim V$. When it is kept fixed, any linear mapping $\varphi \in L(U, V)$ may be identified with the system $e = (e_1, \ldots, e_A, \ldots, e_n)$, where $e_A = \varphi E_A$, $A = 1, \ldots, n$. When $\Phi \in \mathrm{AfI}(N, M)$, i.e. $\varphi \in \mathrm{LI}(U, V)$, then $e$ is a linear frame in $V$. In this way $\mathrm{LI}(U, V)$ is identified with $F(V)$, the manifold of linear frames in $V$. And $\mathrm{AfI}(N, M)$ is identified with $M \times F(V)$, the manifold of affine frames in $M$ (the pairs consisting of points in $M$ and ordered bases in $V$).
Fixing an affine frame $(O, E)$ in $N$ we turn it into the arithmetic space $\mathbb{R}^n$. Linear isomorphisms of $U$ onto $V$ then become linear frames in $V$; their inverse isomorphisms are identified with the dual co-frames $\tilde{e} = (e^1, \ldots, e^A, \ldots, e^n)$, $\langle e^A, e_B \rangle = \delta^A{}_B$. As frames and dual co-frames mutually determine each other, $\mathrm{AfI}(N, M)$ may as well be identified with $M \times F(V^*) = M \times F(V)^*$; here $F(V^*)$ is the manifold of frames in the dual space $V^*$, denoted also as $F(V)^*$. If in addition some affine frame $(o, \varepsilon) = (o; \varepsilon_1, \ldots, \varepsilon_A, \ldots, \varepsilon_n)$ in $M$ is fixed, then $V$ also becomes identified with $\mathbb{R}^n$. The manifold $\mathrm{LI}(U, V)$ is then analytically identified with the general linear group $\mathrm{GL}(n, \mathbb{R})$, and $\mathrm{AfI}(N, M)$ may be identified with the semi-direct product $\mathrm{GAf}(n, \mathbb{R}) \simeq \mathrm{GL}(n, \mathbb{R}) \times_s \mathbb{R}^n$. The manifold $\mathrm{AfI}(N, M)$ is a homogeneous space of the affine groups $\mathrm{GAf}(M)$, $\mathrm{GAf}(N)$ acting, respectively, on the left and on the right:
$$A \in \mathrm{GAf}(M) : \mathrm{AfI}(N, M) \ni \Phi \mapsto A \circ \Phi, \qquad (3.2)$$
$$B \in \mathrm{GAf}(N) : \mathrm{AfI}(N, M) \ni \Phi \mapsto \Phi \circ B. \qquad (3.3)$$
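Similarly, the linear groups $\mathrm{GL}(V)$, $\mathrm{GL}(U)$ act transitively on $\mathrm{LI}(U, V)$:
$$\alpha \in \mathrm{GL}(V) : \mathrm{LI}(U, V) \ni \varphi \mapsto \alpha\varphi, \qquad (3.4)$$
$$\beta \in \mathrm{GL}(U) : \mathrm{LI}(U, V) \ni \varphi \mapsto \varphi\beta. \qquad (3.5)$$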
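Let us observe that although $\mathrm{GL}(V)$, $\mathrm{GL}(U)$ are logically distinct, disjoint sets, the corresponding transformation groups intersect non-trivially. Namely, dilatations belong to both of them: the left and right actions of $\alpha = \lambda\,\mathrm{Id}_V$, $\beta = \lambda\,\mathrm{Id}_U$ both result in multiplying $\varphi$ by $\lambda$, i.e. $\varphi \mapsto \lambda\varphi$. When some origin $O \in N$ is fixed and $\mathrm{AfI}(N, M)$ is identified with $M \times \mathrm{LI}(U, V)$, the left-acting transformation groups $\mathrm{GAf}(M)$, $\mathrm{GL}(V)$ may be represented as follows:
$$A \in \mathrm{GAf}(M) : M \times \mathrm{LI}(U, V) \ni (x, \varphi) \mapsto (A(x), L[A]\varphi), \qquad (3.6)$$
$$\alpha \in \mathrm{GL}(V) : M \times \mathrm{LI}(U, V) \ni (x, \varphi) \mapsto (x, \alpha\varphi). \qquad (3.7)$$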
The origin $O$ enables one to identify $\mathrm{GAf}(N)$ with the semi-direct product $\mathrm{GL}(U) \times_s U$. Namely, $B \in \mathrm{GAf}(N)$ is represented by the pair $(L[B], \overrightarrow{OB(O)})$. And conversely, the pair $(\beta, u) \in \mathrm{GL}(U) \times_s U$ gives rise to the mapping $B \in \mathrm{GAf}(N)$ such that for any $a \in N$:
$$\overrightarrow{OB(a)} = \beta \cdot \overrightarrow{Oa} + u.$$
The right action (3.3) of $B$ on $\mathrm{AfI}(N, M)$ is represented in $M \times \mathrm{LI}(U, V)$ as follows:
$$(\beta, u) \in \mathrm{GL}(U) \times_s U : (x, \varphi) \mapsto (t_{\varphi u}(x), \varphi\beta). \qquad (3.8)$$
If we put $u = 0$, then the group $\mathrm{GL}(U)$ itself acts only on the second component:
$$\beta \in \mathrm{GL}(U) : (x, \varphi) \mapsto (x, \varphi\beta). \qquad (3.9)$$
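The representations (3.6) and (3.8) can be tested numerically once coordinates are chosen. The sketch below assumes, purely for illustration, that both $N$ and $M$ are identified with the integer plane, that a configuration is stored as a pair `(x, phi)`, and that affine transformations are stored as pairs `(linear part, shift)`; all function names are hypothetical.

```python
def matvec(m, v):
    """Matrix times column vector, both given as nested tuples."""
    return tuple(sum(row[k] * v[k] for k in range(len(v))) for row in m)

def matmul(a, b):
    """Matrix product of nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(len(b)))
                       for j in range(len(b[0]))) for i in range(len(a)))

def vadd(u, v):
    return tuple(s + t for s, t in zip(u, v))

def apply_affine(T, p):
    """T = (linear part, shift): p -> linear part * p + shift."""
    linear, shift = T
    return vadd(matvec(linear, p), shift)

def evaluate(config, a):
    """Position of the material point a: x + phi a."""
    x, phi = config
    return vadd(x, matvec(phi, a))

def act_left(A, config):
    """Rule (3.6): (x, phi) -> (A(x), L[A] phi)."""
    x, phi = config
    return (apply_affine(A, x), matmul(A[0], phi))

def act_right(B, config):
    """Rule (3.8): (x, phi) -> (x + phi u, phi beta), with B = (beta, u)."""
    beta, u = B
    x, phi = config
    return (vadd(x, matvec(phi, u)), matmul(phi, beta))

config = ((1, 2), ((2, 1), (0, 1)))
A = (((0, -1), (1, 0)), (3, 3))     # spatial transformation
B = (((1, 1), (0, 2)), (5, -1))     # material transformation
a = (2, -3)                          # a sample material point
```

With these definitions, `act_left` reproduces the composition $A \circ \Phi$, `act_right` reproduces $\Phi \circ B$, and the two actions commute, as the assertions one would naturally write for this sample confirm.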
The standard language of continuum mechanics is based on the use of two affine spaces: the physical and the material one. We denote them, respectively, by $(M, V, \rightarrow)$ and $(N, U, \rightarrow)$. If we deal with an infinite continuous medium filling up the whole physical space, configurations are described by diffeomorphisms of $N$ onto $M$. The smoothness class of these diffeomorphisms depends on the peculiarities of the considered problem. The manifold $N$ is interpreted as the set of material points. In the configuration given by $\Phi : N \to M$, the material point $a \in N$ occupies the spatial position $\Phi(a) \in M$. The diffeomorphism groups $\mathrm{Diff}(M)$ and $\mathrm{Diff}(N)$ give rise to transformation groups acting on the configuration space $\mathrm{Diff}(N, M)$, i.e.
$$A \in \mathrm{Diff}(M) : \mathrm{Diff}(N, M) \ni \Phi \mapsto A \circ \Phi, \qquad (3.10)$$
$$B \in \mathrm{Diff}(N) : \mathrm{Diff}(N, M) \ni \Phi \mapsto \Phi \circ B. \qquad (3.11)$$
They are referred to, respectively, as spatial and material transformations. Obviously, spatial and material transformations mutually commute. In continuum mechanics they have to do with symmetries of space and of the material itself. Obviously, when one deals with realistic bounded bodies, this description should be modified, e.g. manifolds with boundary become a better model of the material space. Another possibility is to use a smooth smeared-out model of the boundary, i.e. to describe the bounded body as a non-bounded one, however with the mass density quickly vanishing outside the real object. Deeper modifications are necessary when describing continua with degenerate dimension like membranes, strings, infinitesimally thin shells, rods, etc. And obviously, for discrete systems the description based on the affine space $N$ as a material body is not applicable in the literal sense, unless some tricks like smeared-out density functions are used. It would be a good thing, especially when dealing with affine systems in microscopic applications (molecular, microstructural, etc.), to start from some general formulation applicable to both discrete and continuous systems of various kinds. There is also a more subtle point, namely, the material space is primarily the abstract set of material points or their labels, so to speak their “identification cards”. A priori this set is structureless; it is a kind of “powder” of material points. Let us denote it by $\Omega$. Configurations are mappings from $\Omega$ to $M$, i.e. elements of $M^{\Omega}$ (the usual finite Cartesian product $M^{N}$ when one deals with an $N$-particle system). More precisely, $M^{\Omega}$ also contains singular configurations, i.e. ones admitting coincidences of different material points at the same spatial point. To avoid such a “catastrophe” one must decide that the “true” configuration space is the set of injections from $\Omega$ into $M$, i.e. $\mathrm{Inj}(\Omega, M)$.
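As long as $\Omega$ is structureless, the only well-defined set of material transformations is $\mathrm{Bij}(\Omega)$, the set of bijections of $\Omega$ onto $\Omega$. They act on configurations according to the following rule:
$$B \in \mathrm{Bij}(\Omega) : \mathrm{Inj}(\Omega, M) \ni \Phi \mapsto \Phi \circ B. \qquad (3.12)$$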
These “permutations” of material points are the only admissible material transformations at this still amorphous stage. Transformations of $M$ onto itself, in
particular diffeomorphisms of $M$ onto itself of an appropriate class, act on $\mathrm{Inj}(\Omega, M)$ according to the following rule:
$$A \in \mathrm{Bij}(M) : \mathrm{Inj}(\Omega, M) \ni \Phi \mapsto A \circ \Phi. \qquad (3.13)$$
In general, this action is not transitive and splits into orbits, i.e. transitivity classes. Any fixed class carries geometric structures over from $M$ to $\Omega$. For example, if $\Omega$ has the cardinality of the continuum and only the bijections of $\Omega$ onto $M$ are admitted as configurations, then any fixed orbit of the left-acting diffeomorphism group $\mathrm{Diff}^r(M)$ induces in $\Omega$ the structure of a $C^r$-class differentiable manifold. The powder of material points becomes a continuous body and its configuration space is identified with $\mathrm{Diff}^r(\Omega, M)$, i.e. the set of $C^r$-class diffeomorphisms of $\Omega$ onto $M$. If some orbit of the left-acting affine group $\mathrm{GAf}(M)$ is fixed, $\Omega$ becomes endowed with the induced structure of an affine space. One can then sensibly speak about affine mappings from $\Omega$ onto $M$ and about affine relationships between material points in $\Omega$. Different orbits induce structures in $\Omega$ which are literally different, although usually isomorphic. In general, when dealing with constrained systems of material points, it is not the total group $\mathrm{Bij}(M)$ or the diffeomorphism group $\mathrm{Diff}^r(M)$, but some rather peculiar proper subgroup $G \subset \mathrm{Bij}(M)$ that rules the geometry of degrees of freedom and perhaps also the dynamics. Configuration spaces are constructed as orbits of such groups. Let us assume that some orbit $Q$, i.e. some particular model of degrees of freedom, is fixed. By the very definition, $G$ acts transitively on $Q$, i.e. $Q$ is a homogeneous space of the action (3.13). The point is how to define some right-hand-side action analogous to (3.11) or (3.3). In general, the transformation group $\mathrm{Bij}(\Omega)$ acting through (3.12) is too poor. For example, when one deals with a finite system of material points, $\mathrm{Bij}(\Omega)$ is the permutation group of $\Omega$, and nothing like the continuous groups of material transformations in (3.11) and (3.3) can be constructed on the basis of $\mathrm{Bij}(\Omega)$. One feels intuitively that there is something unsatisfactory here.
And indeed, it is possible to define a rich, in general continuous, group of material transformations acting on the right on $Q$. The construction is more lucid when one forgets for a moment about details and considers an abstract homogeneous space with the underlying point set $Q$ and the group $G$ acting transitively on $Q$ on the left. For simplicity, we denote the action of $G$ by $g \in G : Q \ni q \mapsto gq \in Q$. This graphical convention is well suited to the left-hand-side nature of this action: $(g_1 g_2)q = g_1(g_2 q)$. The action is assumed to be effective, i.e. the group identity $e \in G$ (its neutral element) is the only element $g$ of $G$ satisfying the condition $gq = q$ for any
$q \in Q$. Roughly speaking, $G$ is a “proper” transformation group, not something homomorphically (with a non-trivial kernel) mapped into the group $\mathrm{Bij}(Q)$. This is exactly the case in the problems described previously. Let $q_0 \in Q$ denote some arbitrarily fixed point, and $H(q_0) \subset G$ its isotropy group, i.e. the set of elements which do not move $q_0$:
$$H(q_0) := \{g \in G : gq_0 = q_0\}.$$
It is well known that $Q$ may be identified in a one-to-one way with the set of left cosets, i.e. with the quotient space $G/H(q_0)$. The action of $G$ on $G/H(q_0)$ is represented by left translations, namely, the coset $xH(q_0) := \{xh : h \in H(q_0)\}$ is transformed by $g \in G$ into $gxH(q_0)$; obviously, the result does not depend on the particular choice of $x$ within its coset, i.e. on the replacement $x \to xh$, $h \in H(q_0)$. The question is now whether there exist some right translations of representatives, $x \mapsto xg$, admitting an interpretation in terms of transformations acting on $G/H(q_0)$. It is easy to see that the answer is affirmative. Namely, let $N(H(q_0)) \subset G$ denote the maximal subgroup of $G$ in which $H(q_0)$ is a normal subgroup (the normalizer of $H(q_0)$). For every $n \in N(H(q_0))$ the corresponding right regular translation $G \ni x \mapsto xn \in G$ is projectable to the manifold of left cosets $G/H(q_0)$. Indeed, cosets are transformed onto cosets:
$$(xH(q_0))n = xH(q_0)n = x(nH(q_0)n^{-1})n = xnH(q_0) = (xn)H(q_0).$$
In this way transformations may be performed on representatives, $G \ni x \mapsto xn \in G$. Obviously, the choice of representatives does not matter because $(xh)nH(q_0) = (xhn)H(q_0) = xn\,n^{-1}hn\,H(q_0)$, and for any $h \in H(q_0)$ we have $n^{-1}hn \in H(q_0)$, thus $n^{-1}hnH(q_0) = H(q_0)$, and finally $((xh)n)H(q_0) = (xn)H(q_0)$. The non-effectiveness kernel of the right action of $N(H(q_0))$ on the coset manifold coincides with the group $H(q_0)$ itself, thus the true group of right-acting transformations is given by $N(H(q_0))/H(q_0) = H(q_0)\backslash N(H(q_0))$.
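The normalizer construction can be traced on a small finite example. The sketch below is an illustration only, under an assumption not made in the text: $G$ is taken to be the symmetry group of a square acting on its four vertices, so that $Q = \{0, 1, 2, 3\}$ and the isotropy group of a vertex has two elements. Permutations are stored as tuples, and all names are hypothetical.

```python
def compose(p, q):
    """Product pq of permutations: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generate(generators):
    """Finite group generated by the given permutations."""
    identity = tuple(range(len(generators[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

r = (1, 2, 3, 0)   # rotation of the square: vertex i -> i + 1 (mod 4)
s = (0, 3, 2, 1)   # reflection fixing vertices 0 and 2
G = generate([r, s])
q0 = 0             # reference point

# Isotropy group H(q0) and its normalizer N(H(q0)) in G.
H = {g for g in G if g[q0] == q0}
N = {n for n in G
     if {compose(compose(n, h), inverse(n)) for h in H} == H}

def right_action(n):
    """Map q = x q0 -> (x n) q0, or None if it depends on the choice of x."""
    images = {}
    for x in G:
        q, image = x[q0], compose(x, n)[q0]
        if images.setdefault(q, image) != image:
            return None
    return images

r2 = compose(r, r)  # rotation by a half-turn, an element of N(H(q0))
```

Here the normalizer has four elements, so the quotient by the isotropy group has two: the identity and the map sending each vertex to the opposite one. Right translation by an element outside the normalizer, such as `r`, does not project to $Q$, which `right_action` signals by returning `None`.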
The above construction of right-acting transformations presupposes some choice of the reference point $q_0 \in Q$. The question arises to what extent the presented prescription depends on $q_0$. It turns out that the constructed transformation group itself is well defined, and the particular choice of $q_0$ influences only the “parameterization”, so to speak the identification labels of the group elements. Let $q_1, q_2 \in Q$ be two arbitrarily chosen reference points. The subset of $G$ consisting of elements $k$ transforming $q_1$ into $q_2$, i.e. $kq_1 = q_2$, will be denoted by $H(q_1, q_2)$. Obviously, $H(q_1, q_2)$ is simultaneously a left and a right coset of the subgroups $H(q_1)$, $H(q_2)$, respectively; if $k$ is an element of $H(q_1, q_2)$, then so is $h_2 k h_1$ for
any $h_1 \in H(q_1)$, $h_2 \in H(q_2)$. (Incidentally, it would perhaps be convenient to write $H(q_1, q_1)$, $H(q_2, q_2)$ instead of $H(q_1)$, $H(q_2)$.) Obviously, for any $k \in H(q_1, q_2)$ we have
$$H(q_2) = kH(q_1)k^{-1}, \qquad N(q_2) = kN(q_1)k^{-1},$$
where $N(q)$ abbreviates $N(H(q))$.
Any choice of $k \in H(q_1, q_2)$ fixes some isomorphisms of $H(q_1)$, $N(q_1)$, respectively, onto $H(q_2)$, $N(q_2)$. Let us take some $g_1 \in G$ and the point $q \in Q$ produced by it from $q_1 \in Q$, i.e. $q = g_1 q_1$. When $q$ is fixed, $g_1$ is defined up to the gauging $g_1 \to g_1 h$, $h \in H(q_1)$. And now we transform $q$ by the right action of $n_1 \in N(H(q_1))$: $q \mapsto q' = g_1 n_1 q_1$. The result does not depend on the gaugings $g_1 \to g_1 h$, $n_1 \to \chi_1 n_1 \chi_2$, where $h, \chi_1, \chi_2 \in H(q_1)$. Let us now express this action in terms of the reference point $q_2 = kq_1$:
$$q' = g_1 n_1 q_1 = g_1 n_1 k^{-1} q_2 = (g_1 k^{-1})(k n_1 k^{-1}) q_2.$$
Now $q$ is produced from the reference point $q_2$ by $g_2 = g_1 k^{-1}$; $q = g_2 q_2$. And its representing group element $g_2$ is affected on the right by $n_2 = k n_1 k^{-1} \in N(H(q_2))$, the $k$-conjugate of $n_1 \in N(H(q_1))$. In this way different right-hand-side actions in $G$, i.e.
$$G \ni x \mapsto xn_1, \qquad G \ni x \mapsto xn_2,$$
describe the same transformation in Q. They are different “labels’’ of this transformation corresponding to various choices of reference points q1 , q2 ∈ Q and various choices of k ∈ H (q1 , q2 ). We are particularly interested in situations when the action of G on Q is free, i.e. when the isotropy groups are trivial, H (q) = {e} for any q ∈ Q. Then H (q1 , q2 ) are one-element sets, H (q1 , q2 ) = {k}, i.e. the above k-element is unique. The “labelling’’ of the right-acting transformation group by elements of G depends only on the choice of the reference point. It is clear that for any q0 ∈ Q, N (H (q0 )) = G, i.e. the right-acting transformation group is isomorphic with G itself. Choosing some reference point q0 we automatically fix one of these isomorphisms. The extended affinely rigid body is defined as a system of material points constrained in such a way that all affine relationships between constituents remain frozen during any admissible motion. Summarizing the above remarks we can formulate a few geometric models of its configuration space: 1. If we use the standard terms of continuum mechanics based on the affine physical and material spaces (M , V , → ), (N , U , → ), then the configuration
space is given by $\mathrm{AfI}(N, M)$, i.e. the manifold of affine isomorphisms of $N$ onto $M$. The affine groups $\mathrm{GAf}(M)$, $\mathrm{GAf}(N)$ act on $\mathrm{AfI}(N, M)$ according to the rules (3.2), (3.3) and describe, respectively, spatial and material transformations (kinematical symmetries). If we formally admitted singular configurations with degenerate dimension, the configuration space would be given by $\mathrm{Af}(N, M)$, i.e. the set of all affine mappings of $N$ into $M$, in general non-invertible ones. Incidentally, $\mathrm{Af}(N, M)$ is also an affine space with $\mathrm{Af}(N, V)$ as the translation space; this is just a special case of (3.1).
2. If some material origin $O \in N$ is fixed, the configuration space may be identified with $Q = Q_{\mathrm{tr}} \times Q_{\mathrm{int}} = M \times \mathrm{LI}(U, V)$; the first and second factors refer, respectively, to translational and internal (relative) motion. And again, when singular configurations are admitted, $\mathrm{LI}(U, V)$ is replaced by its linear shell $L(U, V)$. The natural groups of affine symmetries act on $Q$ according to (3.6)–(3.9). When translational motion is neglected, the configuration space reduces to $Q_{\mathrm{int}} = \mathrm{LI}(U, V)$, or simply to $L(U, V)$ when singular internal configurations are admitted.
3. When in addition to $O \in N$ some linear basis $E = (E_1, \ldots, E_A, \ldots, E_n)$ is chosen, i.e. when an affine frame $(O, E)$ is fixed in $N$, the configuration space becomes identified with $Q = Q_{\mathrm{tr}} \times Q_{\mathrm{int}} = M \times F(V)$; $F(V)$ denotes, as previously, the manifold of linear frames in $V$ ($n = \dim V$). When we are not interested in translational motion, simply $Q_{\mathrm{int}} = F(V)$ is used as the configuration space. If, for any reason, singular configurations are admitted, we extend $F(V)$ to $V^n = V \times \cdots \times V$ ($n$ Cartesian factors). Just as in the model $Q = M \times \mathrm{LI}(U, V)$, transformation groups act essentially according to the rules (3.6)–(3.9), the linear space $U$ being replaced by $\mathbb{R}^n$ (any choice of $E \in F(U)$ identifies $U$ with $\mathbb{R}^n$). More precisely, spatial transformations are given by
$$A \in \mathrm{GAf}(M) : M \times F(V) \ni (x; \ldots, e_K, \ldots) \mapsto (A(x); \ldots, L[A]e_K, \ldots),$$
$$\alpha \in \mathrm{GL}(V) : M \times F(V) \ni (x; \ldots, e_K, \ldots) \mapsto (x; \ldots, \alpha e_K, \ldots).$$
The frame $(O, E)$ identifies $N$ with $\mathbb{R}^n$; namely, the point $a \in N$ is identified with its coordinates $a^K(a)$ with respect to $(O, E)$: $\overrightarrow{Oa} = a^K E_K$. Therefore, $\mathrm{GAf}(N)$, $\mathrm{GL}(U)$ are identified, respectively, with $\mathrm{GL}(n, \mathbb{R}) \times_s \mathbb{R}^n$, $\mathrm{GL}(n, \mathbb{R})$. Their right-hand-side actions on $Q = M \times F(V)$ are, respectively,
described as follows:
$$(\beta, u) \in \mathrm{GL}(n, \mathbb{R}) \times_s \mathbb{R}^n : (x; \ldots, e_K, \ldots) \mapsto (t_{e_u}(x); \ldots, e_L \beta^L{}_K, \ldots),$$
$$\beta \in \mathrm{GL}(n, \mathbb{R}) : (x; \ldots, e_K, \ldots) \mapsto (x; \ldots, e_L \beta^L{}_K, \ldots),$$
where $e_u \in V$ denotes the vector whose coordinates with respect to the basis $e$ coincide with $u^K$, $K = 1, \ldots, n$: $e_u = u^K e_K$.
4. Model with the structureless material space. This is just the model based on orbits and homogeneous spaces, described in some detail above. So, $\Omega$ is the (structureless) set of material points and $\mathrm{Inj}(\Omega, M)$ denotes the set of injections of $\Omega$ into $M$. The spatial affine group $\mathrm{GAf}(M)$ acts on $\mathrm{Inj}(\Omega, M)$ through (3.13),
$$A \in \mathrm{GAf}(M) : \mathrm{Inj}(\Omega, M) \ni \Phi \mapsto A \circ \Phi. \qquad (3.14)$$
Any orbit of this action may be chosen as the configuration space of an affinely rigid body. Different orbits are related to each other by non-affine transformations. More precisely, we usually concentrate on such orbits $Q$ that for any $\Phi \in Q$ the image $\Phi(\Omega) \subset M$ is not contained in any proper affine subspace of $M$. Therefore, the body is essentially $n$-dimensional ($n = \dim M$), although, obviously, it need not be so in the rigorous topological sense (e.g. when $\Omega$ is finite the body is topologically zero-dimensional). Let us mention, however, that there are interesting applications of the model of a singular affine body. The configuration space $Q$ is then such an orbit of (3.14) that for any $\Phi \in Q$ the subset $\Phi(\Omega) \subset M$ is contained in an affine subspace of $M$ of dimension $k < n$. The right-acting partner of (3.14) is then constructed as described above for the general homogeneous space $(Q, G)$. Now $G = \mathrm{GAf}(M)$ and $Q$ is an orbit of (3.14) consisting of injections with $n$-dimensional affine shells of $\Phi(\Omega)$ ($n = \dim M$). There is another way of fixing the configurations of an extended affine body with the structureless material space. Let us assume that the body is non-degenerate. There then exists an $(n + 1)$-element subset $B \subset \Omega$ of material points such that for any $\Phi \in Q$ the image $\Phi(B) \subset M$ is not contained in any proper affine subspace, i.e. its affine shell coincides with $M$. Let us take the elements of $B$ in some particular order $(\omega_1, \ldots, \omega_{n+1})$. Every configuration $\Phi \in Q$ is uniquely fixed by the positions $(\Phi(\omega_1), \ldots, \Phi(\omega_{n+1}))$ of the ordered system $(\omega_1, \ldots, \omega_{n+1})$. The current positions $\Phi(\omega)$ of all other material points $\omega \in \Omega$ are uniquely determined by $(\Phi(\omega_1), \ldots, \Phi(\omega_{n+1}))$. The reason is that all affine relationships between positions of material points, i.e. all linear equations satisfied by the vectors $\overrightarrow{\Phi(\omega)\Phi(\omega')}$ ($\omega, \omega'$ being arbitrary elements of $\Omega$), are invariant during any admissible (affinely constrained) motion. In other words, they depend only on $\omega, \omega'$ but are independent of $\Phi \in Q$.
In this way, configurations are identified with elements of the Cartesian product $M^{n+1} = M \times \cdots \times M$ ($(n + 1)$ copies of $M$). When the body is non-singular, it is not the total $M^{n+1}$ that is admitted but its open subset consisting of such $(n + 1)$-tuples $(y_1, \ldots, y_A, \ldots, y_{n+1})$ which are not contained in any proper affine subspace of $M$ (therefore, the affine shell of $y_A$, $A = 1, \ldots, n + 1$, coincides with $M$). This means that the vectors $\overrightarrow{y_1 y_A}$, $A = 2, \ldots, n + 1$, are linearly independent. For a $k$-dimensional degenerate body ($k \leq n = \dim M$) the configuration space $Q$ (an orbit of $\mathrm{GAf}(M)$) may be identified with $M^{k+1} = M \times \cdots \times M$ ($(k + 1)$ Cartesian factors), or rather with its subset consisting of $(k + 1)$-tuples $(y_1, \ldots, y_A, \ldots, y_{k+1})$ with $k$-dimensional affine shells. Non-degenerate ordered $(n + 1)$-tuples $y \in M^{n+1}$ may be interpreted as affine bases (affine frames) in $M$. For any pair of such bases $y = (y_1, \ldots, y_A, \ldots, y_{n+1})$, $y' = (y'_1, \ldots, y'_A, \ldots, y'_{n+1})$ there exists exactly one affine transformation $\Phi \in \mathrm{GAf}(M)$ such that $y'_A = \Phi(y_A)$, $A = 1, \ldots, n + 1$. Any such basis may be interpreted as a reference configuration.
5. Finally, if some affine frames $(O, E)$, $(o, e)$ are fixed both in $N$ and $M$, these affine spaces become identified with $\mathbb{R}^n$, and the numerical affine group $\mathrm{GAf}(n, \mathbb{R}) \simeq \mathrm{GL}(n, \mathbb{R}) \times_s \mathbb{R}^n$ may be used as the configuration space. Spatial and material transformations then become, respectively, left and right regular translations.
Obviously, all the above models are mutually equivalent and their formal utility and practical usefulness depend on the kind of problem considered. For example, when describing discrete systems we shall use model 4 with the structureless material space.
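The statement in model 4 that $n + 1$ marker points fix the positions of all others can be illustrated in the plane ($n = 2$). The sketch below is hedged: it assumes integer reference coordinates, uses barycentric weights as one concrete way of encoding the invariant affine relationships, and the map `affine` is an arbitrary sample configuration chosen only for the check. All names are hypothetical.

```python
from fractions import Fraction

def barycentric(p, frame):
    """Weights (l0, l1, l2), summing to 1, with p = l0*y0 + l1*y1 + l2*y2."""
    (x0, y0), (x1, y1), (x2, y2) = frame
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    if det == 0:
        raise ValueError("the three markers lie on one line (degenerate frame)")
    l1 = Fraction((p[0] - x0) * (y2 - y0) - (x2 - x0) * (p[1] - y0), det)
    l2 = Fraction((x1 - x0) * (p[1] - y0) - (p[0] - x0) * (y1 - y0), det)
    return (1 - l1 - l2, l1, l2)

def combine(weights, frame):
    """Point with the given barycentric weights relative to the frame."""
    return tuple(sum(w * point[i] for w, point in zip(weights, frame))
                 for i in range(2))

def affine(p):
    """Sample affine configuration (an assumption): y = (3, -1) + phi p."""
    return (3 + 2 * p[0] + p[1], -1 - p[0] + p[1])

reference_frame = ((0, 0), (1, 0), (0, 1))   # three marker points
other = (2, 3)                               # any further material point

# Weights are computed once, in the reference configuration.
weights = barycentric(other, reference_frame)

# Current positions of the markers alone determine where 'other' goes.
current_frame = tuple(affine(p) for p in reference_frame)
predicted = combine(weights, current_frame)
```

The weights play the role of the configuration-independent linear relations mentioned above: recomputing them in the current configuration gives the same triple.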
In certain microstructural applications and in fundamental physics one uses internal degrees of freedom which are not interpreted in terms of composed multiparticle systems and which perhaps, for reasons of principle, do not admit such an interpretation at all (cf. the concept of spin of elementary particles). Obviously, model 3, based on the manifold of linear frames $FM$, is then the most adequate one. When one deals with the motion of structured bodies in non-Euclidean spaces, this is practically the only adequate approach if one wishes to remain within the framework of finite-dimensional analytical mechanics [34–37, 46–48, 70, 71]. There are some delicate points concerning the connectedness of the configuration space. Obviously, within the standard continuum treatment the singular situations are forbidden: $\Phi$ and $\varphi$ must be bijections, and $e$ is a linear frame, not an arbitrary $n$-tuple of vectors. Therefore, the genuine configuration space is then one of the two connected components of $\mathrm{AfI}(N, M)$, $\mathrm{LI}(U, V)$, $F(V)$. Otherwise one would have to pass through the forbidden “singular configurations”. Only the connected components of the group unity, i.e. the orientation-preserving subgroups $\mathrm{GAf}^{+}(M)$, $\mathrm{GL}^{+}(V)$, $\mathrm{GAf}^{+}(N)$, $\mathrm{GL}^{+}(U)$, $\mathrm{GL}^{+}(n, \mathbb{R})$, are admitted as transformation groups. However, this is not the case when one deals with discrete affine bodies. Then there is nothing catastrophic in passing through “singular” situations
when at some instant of time $\Phi(\Omega)$ is contained in a proper affine subspace of $M$. And there is nothing wrong with the mirror-reflected configurations forbidden in continuum mechanics. Obviously, the above description of configuration spaces must then be modified, e.g. $Q$ is no longer a homogeneous space of $\mathrm{GAf}(M)$; rather, it may be a union of various transitivity orbits corresponding to all possible dimensions $k \leq n$ of affine shells of $\Phi(\Omega)$. Analytical formulas will usually be expressed in terms of rectilinear Cartesian coordinates $x^i$, $a^K$, respectively, in $M$ and $N$. They are fixed by affine frames $(o, e)$, $(O, E)$:
$$\overrightarrow{ox} = x^i(x)e_i, \qquad \overrightarrow{Oa} = a^K(a)E_K.$$
The coordinates $x^i$, $a^K$ induce a parameterization of configurations. Eulerian and Lagrangian coordinates (spatial and material variables) are related to each other by the formula
$$y^i = \Phi^i(a) = x^i + \varphi^i{}_K a^K,$$
where $y^i$ are coordinates of the spatial position of the $a$th material point, and $x^i$ are coordinates of the spatial position $\Phi(O)$ of the fixed reference point $O \in N$. The quantities $(x^i, \varphi^i{}_K)$ are labels of $\Phi$ and may be used as generalized coordinates $q^{\alpha}$, $\alpha = 1, \ldots, n(n + 1)$, on the configuration space $Q$. The reference point $O \in N$, i.e. the origin of Lagrange coordinates ($a^K(O) = 0$), was chosen here in a completely arbitrary way. In practical problems the choice of $O$ is, as a rule, physically motivated. If there exist additional constraints due to which some material point is immovable (i.e. the body is pinned at it), then the material origin $O$ is usually chosen at this point. There is no translational motion then. If translations are not constrained, the centre of mass is usually chosen as the material reference point. Let us remember that in situations other than a continuous medium filling up the whole space, the centre of mass may happen to be placed “in vacuum”. And even if this is not the case, the centre of mass is something other than the material point coinciding with it.
Let the reference mass distribution be described by some positive regular measure $\mu$ on $N$; this means that the mass of the sub-body $B \subset N$ is given by
$$\mu(B) = \int_B d\mu.$$
The centre of mass $C(\mu)$ in $N$ is the only point satisfying
$$\int \overrightarrow{C(\mu)a}\, d\mu(a) = 0;$$
the dipole moment of $\mu$ with respect to $C(\mu)$ vanishes. Any configuration $\Phi$ gives rise to the $\Phi$-transported measure $\mu_{\Phi}$ on $M$,
$$\mu_{\Phi}(\Phi(B)) = \mu(B), \qquad \mu_{\Phi}(A) = \mu(\Phi^{-1}(A)).$$
The measure $\mu_{\Phi}$ describes the Eulerian mass distribution on $M$ (the current mass distribution). The current centre of mass $C(\mu_{\Phi}) \in M$ is given by the formula
$$\int \overrightarrow{C(\mu_{\Phi})y}\, d\mu_{\Phi}(y) = 0.$$
It is well known that the centre of mass is an invariant of affine transformations; incidentally, affine transformations may be defined just as those preserving centres of mass. Therefore, $\Phi(C(\mu)) = C(\mu_{\Phi})$. And besides, for any affine transformation $A \in \mathrm{GAf}(M)$, $A(C(\mu_{\Phi})) = C(\mu_{A \circ \Phi})$. This is not true for non-affine configurations and transformations. If the material reference point $O$ is chosen as $C(\mu)$, i.e. the Lagrangian centre of mass, then the generalized coordinates $(x^i, \varphi^i{}_K)$ on $Q$ are especially convenient because $x^i$ are spatial (Eulerian) coordinates of the instantaneous position of the centre of mass in $M$. The variables $\varphi^i{}_K$ refer to the purely relative (internal) motion. The physical quantity $\mu$ (Lagrangian mass distribution) is at the same time an auxiliary geometric object underlying the convenient models $M \times \mathrm{LI}(U, V)$, $M \times F(V)$ of the configuration space of the affine body. Practically throughout this treatment we put the origin of Lagrangian coordinates at the Lagrangian centre of mass. The centre of mass is defined as the point with respect to which the dipole moment of the mass distribution vanishes. The monopole moment equals the total mass of the body,
$$m = \mu(N) = \int_N d\mu.$$
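The invariance of the centre of mass under affine maps, and its failure for non-affine ones, can be seen on a discrete mass distribution. This is a minimal sketch: the three point masses, the sample map `affine` and the non-affine map `bend` are assumptions chosen only for illustration, and the measure is replaced by a finite sum.

```python
from fractions import Fraction

def centre_of_mass(masses, points):
    """Point C with sum_a m_a * (p_a - C) = 0, for a finite system of masses."""
    total = sum(masses)
    return tuple(Fraction(sum(m * p[i] for m, p in zip(masses, points)), total)
                 for i in range(2))

def affine(p):
    """Sample affine configuration (an assumption): y = (3, -1) + phi p."""
    return (3 + 2 * p[0] + p[1], -1 - p[0] + p[1])

def bend(p):
    """Sample non-affine map (an assumption), used as a counterexample."""
    return (p[0] ** 2, p[1])

masses = [1, 2, 3]
points = [(0, 0), (4, 0), (0, 2)]

C = centre_of_mass(masses, points)

# Dipole moment with respect to C; it should vanish componentwise.
dipole = tuple(sum(m * (p[i] - C[i]) for m, p in zip(masses, points))
               for i in range(2))

# Transported distribution: same masses sitting at the image points.
C_affine = centre_of_mass(masses, [affine(p) for p in points])
C_bend = centre_of_mass(masses, [bend(p) for p in points])
```

Exact rational arithmetic is used so that the equality $\Phi(C(\mu)) = C(\mu_{\Phi})$ can be compared without rounding.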
Higher-order multipole moments give an account of the inertial properties of extended bodies. Thus, the Lagrangian second-order moment $J \in U \otimes U$ is given by
$$J^{KL} := \int a^K a^L\, d\mu(a).$$
As mentioned, the origin of the $a^K$-coordinates is placed at the Lagrangian centre of mass $C(\mu)$. The above object $J^{KL}$ is algebraically equivalent to the usual co-moving tensor of inertia known from rigid body mechanics. One can $\Phi$-transport it to the physical space $M$. Obviously, the result $J(\varphi) \in V \otimes V$ is non-constant; it depends explicitly on the configuration $\Phi$ but only through its internal part $\varphi$,
$$J(\varphi)^{ij} = \varphi^i{}_K \varphi^j{}_L J^{KL}. \qquad (3.15)$$
It is clear that
$$J(\varphi)^{ij} = \int (y^i - x^i)(y^j - x^j)\, d\mu_{\Phi}(y). \qquad (3.16)$$
Obviously, the tensors $J$, $J(\varphi) = \varphi_{*} \cdot J$ are symmetric and positive definite. The most convenient choices of the material reference frame $E$ are those diagonalizing $J$. One can as well define higher-order multipoles, i.e.
$$J^{K_1 \cdots K_l} = \int a^{K_1} \cdots a^{K_l}\, d\mu(a), \qquad J(\varphi)^{i_1 \cdots i_l} = \int (y^{i_1} - x^{i_1}) \cdots (y^{i_l} - x^{i_l})\, d\mu_{\Phi}(y).$$
Obviously, in affine motion
$$J(\varphi)^{i_1 \cdots i_l} = \varphi^{i_1}{}_{K_1} \cdots \varphi^{i_l}{}_{K_l} J^{K_1 \cdots K_l}.$$
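The agreement between the transport rule (3.15) and the direct Eulerian expression (3.16) can be verified for a finite system of point masses. The sketch assumes a planar body of three masses whose Lagrangian centre of mass sits at the material origin, with the measure replaced by a sum; the matrix `phi` and all helper names are hypothetical sample choices.

```python
def second_moment(masses, vectors):
    """Discrete second-order moment: sum_a m_a v_a^i v_a^j."""
    n = len(vectors[0])
    return tuple(tuple(sum(m * v[i] * v[j] for m, v in zip(masses, vectors))
                       for j in range(n)) for i in range(n))

def matvec(m, v):
    """Matrix times column vector, both given as nested tuples."""
    return tuple(sum(row[k] * v[k] for k in range(len(v))) for row in m)

def transport(phi, J):
    """Rule (3.15): J(phi)^{ij} = phi^i_K phi^j_L J^{KL}."""
    n = len(J)
    return tuple(tuple(sum(phi[i][K] * phi[j][L] * J[K][L]
                           for K in range(n) for L in range(n))
                       for j in range(len(phi))) for i in range(len(phi)))

masses = [1, 1, 2]
material = [(2, 0), (-2, 2), (0, -1)]   # Lagrangian coordinates a^K
phi = ((2, 1), (-1, 1))                 # internal part of the configuration

# The material origin is the Lagrangian centre of mass: dipole moment is zero.
dipole = tuple(sum(m * a[i] for m, a in zip(masses, material)) for i in range(2))

J = second_moment(masses, material)

# Spatial positions relative to the current centre of mass: y - x = phi a.
relative = [matvec(phi, a) for a in material]
J_direct = second_moment(masses, relative)      # formula (3.16)
J_transported = transport(phi, J)               # formula (3.15)
```

Both routes give the same symmetric matrix for this sample, and the translational part $x$ of the configuration never enters, in line with the remark that $J(\varphi)$ depends on $\Phi$ only through $\varphi$.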
Inertial multipoles of order $l > 2$ do not occur in the mechanics of affine bodies; nevertheless they are useful in other problems of continuum mechanics [3]. So far we have used only affine concepts, i.e. we remained on the ascetic level of Thales geometry. No metric concepts like distances and angles were used. The admissible configurations $\Phi \in \mathrm{AfI}(N, M)$ are homogeneous in the sense that for any $a \in N$ the placement $D\Phi_a \in \mathrm{LI}(U, V)$ takes on the same value $\varphi = L[\Phi]$. Nevertheless, it would be incorrect to say that deformations are homogeneous, because without metric (Euclidean) geometry there is no deformation concept at all. The only well-defined “deformation” is then a violation of affine geometry, i.e. non-constancy of the mapping $a \mapsto D\Phi_a$. Let us now introduce metrical concepts. The spatial and material metric tensors will be denoted, respectively, by $g \in V^* \otimes V^*$, $\eta \in U^* \otimes U^*$. By definition they are symmetric and positive definite. Their contravariant inverses are denoted by $\tilde{g} \in V \otimes V$, $\tilde{\eta} \in U \otimes U$, but in analytical expressions we use the same kernel symbols; the distinction is indicated by the position of the indices (lower or upper):
$$g_{ij}, \quad g^{ij}, \quad \eta_{AB}, \quad \eta^{AB}; \qquad g^{ik} g_{kj} = \delta^i{}_j, \qquad \eta^{AC} \eta_{CB} = \delta^A{}_B.$$
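For any configuration $\Phi \in \mathrm{AfI}(N, M)$ we define the Green and Cauchy tensors $G[\Phi] \in U^* \otimes U^*$, $C[\Phi] \in V^* \otimes V^*$,
$$G[\Phi] = \varphi^{*} g, \qquad C[\Phi] = \varphi^{-1*} \eta,$$
i.e. analytically:
$$G[\Phi]_{AB} = g_{ij}\, \varphi^i{}_A \varphi^j{}_B, \qquad C[\Phi]_{ij} = \eta_{AB}\, (\varphi^{-1})^A{}_i (\varphi^{-1})^B{}_j.$$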
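In these formulas, as usual, $\varphi$ denotes the linear part of $\Phi$, $\varphi = L[\Phi] = D\Phi$. For general non-affine configurations these tensors become fields, respectively, on $N$, $M$, namely:
$$G[\Phi]_a = D\Phi_a^{*} \cdot g, \qquad C[\Phi]_y = D\Phi^{-1*}_y \cdot \eta.$$
Analytically:
$$G_{AB} = g_{ij} \frac{\partial y^i}{\partial a^A} \frac{\partial y^j}{\partial a^B}, \qquad C_{ij} = \eta_{AB} \frac{\partial a^A}{\partial y^i} \frac{\partial a^B}{\partial y^j}.$$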
Obviously, the Green and Cauchy tensors are symmetric and positive definite. $G$ is built of the spatial metric tensor $g$ and is independent of the material metric $\eta$. And conversely, $C$ is independent of $g$ and explicitly depends on $\eta$. Therefore, the traditional term “deformation tensors” is rather inadequate here; the deformation concept presumes a comparison of two metrics, whereas $G$ and $C$ are well defined even if, respectively, $\eta$ and $g$ are not fixed at all. Let us assume they are both fixed and denote the corresponding Euclidean space structures by $(N, U, \rightarrow, \eta)$, $(M, V, \rightarrow, g)$. When some configuration $\Phi$ is fixed, each of these two spaces is endowed with two metric-like tensors, respectively, $G, \eta \in U^* \otimes U^*$ and $C, g \in V^* \otimes V^*$. The Lagrange and Euler deformation tensors are, respectively, given by
$$E[\Phi] = \tfrac{1}{2}(G[\Phi] - \eta), \qquad e[\Phi] = \tfrac{1}{2}(g - C[\Phi]).$$
They vanish in the non-deformed configurations, i.e. when $\Phi$ are isometries. Obviously, these are the usual (metrically) rigid body configurations. Their manifold will be denoted by $\mathrm{Is}(N, \eta; M, g) \subset \mathrm{AfI}(N, M)$. The isometry groups $\mathrm{Is}(N, \eta) \subset \mathrm{GAf}(N)$, $\mathrm{Is}(M, g) \subset \mathrm{GAf}(M)$ act on $\mathrm{Is}(N, \eta; M, g)$, respectively, on the right and on the left in the sense of (3.2) and (3.3). Obviously, in realistic classical mechanics of (metrically) rigid bodies mirror-reflected coordinates are excluded and the genuine configuration space is given by some connected component of $\mathrm{Is}(N, \eta; M, g)$. When orientations are fixed in $N$, $M$, this will be the manifold $\mathrm{Is}^+(N, \eta; M, g)$ of orientation-preserving isometries. Physically admissible symmetries are given by the connected subgroups $\mathrm{Is}^+(N, \eta) \subset \mathrm{GAf}^+(N)$, $\mathrm{Is}^+(M, g) \subset \mathrm{GAf}^+(M)$ of orientation-preserving transformations (obviously, these groups themselves do not assume any fixed orientation; they preserve both of them separately). When translational degrees of freedom are neglected, configurations of internal (relative) motion are elements of $O(U, \eta; V, g)$, i.e. the manifold of linear isometries of $(U, \eta)$ onto $(V, g)$. The corresponding spatial and material transformations are, respectively, elements of the subgroups $O(V, g) \subset GL(V)$, $O(U, \eta) \subset GL(U)$, the $g$- and $\eta$-orthogonal transformation groups. And again, in classical rigid body mechanics one should restrict oneself to one of the two connected components of $O(U, \eta; V, g)$. When orientations in $U$, $V$ are fixed, this is $SO(U, \eta; V, g) \subset \mathrm{LI}^+(U, \eta; V, g)$, i.e. the manifold of orientation-preserving
linear isometries. The connected components are ruled by the proper orthogonal groups $SO(V, g)$, $SO(U, \eta)$, consisting of isometries with positive, thus plus-one, determinants (no orientations in $V$, $U$ are needed for fixing these subgroups). When using matrix representations we describe configurations by elements of the orthogonal group $O(n, \mathbb{R})$ or, in classical problems, by elements of the proper rotation group $SO(n, \mathbb{R})$. Let us now fix some metric tensors $\eta$, $g$. The geometric structure becomes richer and certain additional objects may be defined. For example, the material reference frame may be made less arbitrary and more based on physical concepts. Let us define two tensor objects built of the inertial tensor $J \in U \otimes U$, namely $\tilde J \in U^* \otimes U^*$ and $\hat J \in U \otimes U^* \simeq L(U)$, analytically given by the formulas:
\[ \tilde J_{AC} J^{CB} = \delta^B{}_A, \qquad \hat J^A{}_B = J^{AC}\eta_{CB}. \]
Obviously, the object $\tilde J$ is non-metrical, but $\hat J$ depends explicitly on the material metric tensor $\eta$. Now we have a well-defined eigenproblem in $U$: $\hat J E = lE$. In the generic non-degenerate case there are $n$ mutually distinct eigenvalues $l_A$, $A = 1, \dots, n$, and $n$ mutually orthogonal eigendirections. The directions are determined by vectors $E_A$ which may be chosen $\eta$-normalized to unity and such that the orthonormal frame $E = (E_1, \dots, E_A, \dots, E_n)$ is oriented positively with respect to the fixed orientation in $U$. When the eigenvalues $l_A$ are ordered by convention in increasing order, then the $E_A$ are unique up to multiplying some of them by factors of minus one. The inertial tensor $J$ is then represented as follows:
\[ J = \sum_{A=1}^{n} J^A\, E_A \otimes E_A. \tag{3.17} \]
This is just the best choice of the material reference frame. Obviously, it is no longer unique when degeneracy occurs, but these are non-generic situations. The extreme degeneracy corresponds to the $\eta$-spherical body, when $J$ is proportional to the contravariant inverse of $\eta$, $J^{AB} = \mu\eta^{AB}$. Remark: If $\varphi$ is not an isometry, then obviously the co-moving vectors $e_A = \varphi E_A$ are not orthonormal in the $g$-sense; however, they are orthonormal with respect to the Cauchy tensor $C$ used as a “metric’’ in $V$. And then:
\[ J[\varphi] = \sum_{A=1}^{n} J^A\, e_A \otimes e_A, \]
with the same values $J^A$ which occur in (3.17).
Remark: For the sake of economy of symbols we could as well denote $\hat J^A{}_B$ by $J^A{}_B$, following the convention used in Euclidean and Riemannian geometry. But this is not the case with $\tilde J_{AB}$. Writing it as $J_{AB}$ would suggest $\eta_{AC}\eta_{BD}J^{CD}$, quite incorrectly, because $\tilde J$ is not obtained by the $\eta$-lowering of indices. Typical notational shorthands may be misleading. Only for the $\eta$-manipulation of indices and for the contravariant inverse of $\eta$ ($\eta^{AC}\eta_{CB} = \delta^A{}_B$) can we safely use index manipulation with non-modified kernel symbols. When metric tensors are fixed, we can discuss the problem of interaction between rotations and deformations. In each of the linear spaces $U$, $V$ we are given two symmetric positive definite tensors: $\eta, G \in U^* \otimes U^*$ and $g, C \in V^* \otimes V^*$. Obviously, $\eta$, $g$ are fixed whereas $G$, $C$ are configuration dependent. Just as previously, we can define the byproduct objects $\tilde G \in U \otimes U$, $\hat G \in U \otimes U^* \simeq L(U)$, $\tilde C \in V \otimes V$, $\hat C \in V \otimes V^* \simeq L(V)$. Analytically they are given by
\[ \tilde G^{AC} G_{CB} = \delta^A{}_B, \qquad \hat G^A{}_B = \eta^{AC} G_{CB}, \qquad \tilde C^{ik} C_{kj} = \delta^i{}_j, \qquad \hat C^i{}_j = g^{ik} C_{kj}. \]
Again, the same care must be taken with the upper and lower indices. Now $G[\varphi]$, $C[\varphi]$ may be expressed in terms of their $\eta$- and $g$-orthonormal bases $(\dots, F_a[\varphi], \dots)$, $(\dots, f_a[\varphi], \dots)$ in $U$, $V$:
\[ G[\varphi] = \sum_{a=1}^{n} \lambda_a[\varphi]\, F^a[\varphi] \otimes F^a[\varphi], \qquad C[\varphi] = \sum_{a=1}^{n} \frac{1}{\lambda_a[\varphi]}\, f^a[\varphi] \otimes f^a[\varphi]; \]
obviously, $(\dots, F^a[\varphi], \dots)$, $(\dots, f^a[\varphi], \dots)$ are the dual orthonormal bases of $U^*$, $V^*$. When there is no danger of misunderstanding, the label $\varphi$ at $\lambda_a$, $F^a$, $f^a$, $F_a$, $f_a$ may be omitted. The quantities $\lambda_a[\varphi]$ are deformation invariants in the sense that they are insensitive to spatial and material linear isometries, $\lambda_a[\alpha\varphi\beta] = \lambda_a[\varphi]$ for any $\alpha \in O(V, g)$, $\beta \in O(U, \eta)$. All the more, they are insensitive to spatial and material affine isometries (because translations evidently do not affect them). By contrast, the Green and Cauchy deformation tensors are insensitive only to spatial and material isometries, respectively,
\[ G[\alpha\varphi] = G[\varphi], \qquad C[\varphi\beta] = C[\varphi] \]
for any $\alpha \in O(V, g)$, $\beta \in O(U, \eta)$, but not conversely. The vectors $F_a[\varphi] \in U$, $f_a[\varphi] \in V$ are normalized eigenvectors of, respectively, $\hat G[\varphi]$, $\hat C[\varphi]$:
\[ \hat G[\varphi] F_a[\varphi] = \lambda_a F_a[\varphi], \quad \eta(F_a, F_b) = \delta_{ab}, \qquad \hat C[\varphi] f_a[\varphi] = \frac{1}{\lambda_a} f_a[\varphi], \quad g(f_a, f_b) = \delta_{ab}, \]
and they are essentially unique for non-degenerate spectra. The fixed bases $(\dots, F_a[\varphi], \dots)$, $(\dots, f_a[\varphi], \dots)$, being orthonormal, give rise to an isometry $U[\varphi] \in O(U, \eta; V, g)$, namely the one such that
\[ U[\varphi] F_a[\varphi] = f_a[\varphi], \qquad a = 1, \dots, n. \]
The mapping $\varphi$ itself may be written in the form
\[ \varphi = U[\varphi] A[\varphi], \tag{3.18} \]
where $A[\varphi]: U \to U$ is $\eta$-symmetric and positive definite, i.e.
\[ \eta(A[\varphi]u, v) = \eta(u, A[\varphi]v), \qquad \eta(A[\varphi]u, u) > 0, \]
for arbitrary $u, v \in U$, with $u \neq 0$ in the inequality. Analytically:
\[ \eta_{AC} A[\varphi]^C{}_B = \eta_{BC} A[\varphi]^C{}_A, \]
and the matrix $[A[\varphi]^A{}_B]$ has positive eigenvalues which, by the way, coincide with the square roots of the deformation invariants $\lambda_a$ (which, obviously, are also positive). Obviously, (3.18) is a geometric interpretation of the polar decomposition, and analytically, when we put $U = V = \mathbb{R}^n$, it coincides exactly with the usual polar decomposition known from matrix theory. It may also be written alternatively in the form
\[ \varphi = B[\varphi] U[\varphi], \tag{3.19} \]
where $B[\varphi]: V \to V$ is $g$-symmetric and positive definite, and obviously $B[\varphi] = U[\varphi] A[\varphi] U[\varphi]^{-1}$. These are simply the left and right polar decompositions known from matrix theory. It is clear that the Green and Cauchy tensors satisfy, respectively:
\[ \hat G[\varphi] = A[\varphi]^2, \qquad \hat C[\varphi] = B[\varphi]^{-2}. \]
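The relations between the polar factors and the Green and Cauchy tensors can be checked numerically. The sketch below is not a general algorithm: it assumes $U = V = \mathbb{R}^2$ with Kronecker metrics and $\det\varphi > 0$, and it uses a closed-form angle formula that is specific to 2x2 matrices. The function name `polar_decomposition` and the sample matrix are illustrative assumptions.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def polar_decomposition(phi):
    """Return (U, A) with phi = U A, U a rotation, A symmetric positive definite.

    Closed-form 2x2 shortcut, valid only when det(phi) > 0.
    """
    (a, b), (c, d) = phi
    angle = math.atan2(c - b, a + d)
    U = [[math.cos(angle), -math.sin(angle)],
         [math.sin(angle), math.cos(angle)]]
    A = matmul(transpose(U), phi)   # U^-1 = U^T for a rotation
    return U, A

phi = [[2.0, 1.0], [0.0, 1.0]]
U, A = polar_decomposition(phi)
B = matmul(matmul(U, A), transpose(U))   # B = U A U^-1, so that phi = B U

# Green and Cauchy tensors for g = eta = identity (assumption of this sketch).
G_hat = matmul(transpose(phi), phi)
C_hat = matmul(transpose(inverse(phi)), inverse(phi))

A_squared = matmul(A, A)                          # should reproduce G_hat
B_squared_times_C = matmul(matmul(B, B), C_hat)   # should be the identity
```

In higher dimensions one would normally obtain the same factors from a singular value decomposition rather than from an explicit angle.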
Let us mention that there are also other possible choices of deformation invariants; every system of $n$ functionally independent functions of $\lambda_a$, $a = 1, \dots, n$, may be used as a basic system of invariants. Let us recall a few commonly used systems, e.g.
\[ K_a[\varphi] = \mathrm{Tr}\left(\hat G[\varphi]^a\right), \qquad a = 1, \dots, n. \]
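Obviously,
\[ K_a = \sum_{i=1}^{n} (\lambda_i)^a. \]
Another possibility is the system of coefficients $I_p[\varphi]$ of the characteristic polynomial of $\hat G[\varphi]$:
\[ \det\left[\hat G[\varphi]^A{}_B - \lambda\delta^A{}_B\right] = \sum_{k=0}^{n} (-1)^k I_{n-k}[\varphi]\,\lambda^k. \]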
Obviously, $I_0 = 1$, and for $p = 1, \dots, n$, $I_p$ is the sum of all possible products of $p$ quantities $\lambda_a$ with different (but not necessarily disjoint) sets of labels $a = 1, \dots, n$, e.g.
\[ I_1 = \sum_{i=1}^{n} \lambda_i = \mathrm{Tr}(\hat G), \qquad I_n = \lambda_1 \cdots \lambda_n = \det(\hat G). \]
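Both systems of invariants are easy to evaluate once the $\lambda_a$ are known. The following sketch, with illustrative function names and sample values that are not taken from the text, computes the power sums $K_a$ and the coefficients $I_p$ and compares the latter with the characteristic polynomial at one test point.

```python
from itertools import combinations
from math import prod

def power_sums(lams):
    """K_a = sum_i (lambda_i)^a for a = 1..n."""
    n = len(lams)
    return [sum(lam ** a for lam in lams) for a in range(1, n + 1)]

def elementary_invariants(lams):
    """I_p = sum over all p-element label subsets of the product of lambda_a, p = 0..n."""
    n = len(lams)
    return [sum(prod(subset) for subset in combinations(lams, p))
            for p in range(n + 1)]

lams = [1.0, 2.0, 4.0]          # sample deformation invariants (assumed values)
K = power_sums(lams)
I = elementary_invariants(lams)

# Characteristic polynomial of diag(lams) at a test point, computed two ways.
x = 3.0
n = len(lams)
direct = prod(lam - x for lam in lams)
from_invariants = sum((-1) ** k * I[n - k] * x ** k for k in range(n + 1))
```

Since $\hat G$ is diagonal in its eigenbasis, evaluating the determinant on the diagonal matrix of the $\lambda_a$ is sufficient for this comparison.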
In the physical three-dimensional case $I_2 = \lambda_2\lambda_3 + \lambda_3\lambda_1 + \lambda_1\lambda_2$. Let us observe that the orthonormal bases $(\dots, f_a[\varphi], \dots)$, $(\dots, F_a[\varphi], \dots)$ represent formally configurations of two fictitious rigid bodies, respectively in $(V, g)$ and $(U, \eta)$. They refer, respectively, to the eigenaxes of the Cauchy and Green deformation tensors; therefore, they tell us how the deformation state is oriented with respect to $V$, $U$ (what the instantaneous positions of the deformation ellipsoids are). By contrast, deformation invariants contain only the scalar information about the deformation state (how large the stretchings are). The manifold of scalar deformation states (parametrized by deformation invariants) may be considered as the double coset space of $\mathrm{LI}(U, V)$ with respect to the left and right actions of $O(V, g)$, $O(U, \eta)$, $\mathrm{Inv} = O(V, g)\backslash \mathrm{LI}(U, V)/O(U, \eta)$, or, when reflections are excluded, $\mathrm{Inv} = SO(V, g)\backslash \mathrm{LI}^+(U, V)/SO(U, \eta)$. Let us observe that, as usual, the linear frames $F[\varphi]$, $f[\varphi]$ may be naturally identified with linear isomorphisms $R[\varphi]: \mathbb{R}^n \to U$, $L[\varphi]: \mathbb{R}^n \to V$. As they are orthonormal, they are linear isometries of $(\mathbb{R}^n, \delta)$ onto $(U, \eta)$ and of $(\mathbb{R}^n, \delta)$ onto $(V, g)$; $\delta$ denotes here the natural Descartes–Kronecker metric of $\mathbb{R}^n$. Similarly, the dual
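co-frames $\tilde F[\varphi]$, $\tilde f[\varphi]$ may be identified with the inverse mappings $R[\varphi]^{-1}: U \to \mathbb{R}^n$, $L[\varphi]^{-1}: V \to \mathbb{R}^n$; obviously, they are also linear isometries. One can show that $\varphi$ may be represented as $\varphi = L[\varphi] D[\varphi] R[\varphi]^{-1}$, where the linear mapping $D[\varphi]: \mathbb{R}^n \to \mathbb{R}^n$, i.e. simply a matrix, is diagonal,
\[ D[\varphi] = \begin{bmatrix} D_1[\varphi] & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & D_n[\varphi] \end{bmatrix} = \mathrm{Diag}(D_1[\varphi], \dots, D_n[\varphi]). \]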
If there is no danger of misunderstanding we shall omit the label $\varphi$ at the quantities $L$, $D$, $R$. So, finally, we write $\varphi = LDR^{-1}$, and in this way any internal (relative) configuration $\varphi$ is formally identified with the configuration of two fictitious (metrically) rigid bodies and $n$ purely oscillatory degrees of freedom of the stretching state. If we put $U = V = \mathbb{R}^n$, then the above decomposition (two-polar decomposition, triple decomposition) is formally obtained from the polar one. Namely, for any $\varphi \in GL(n, \mathbb{R})$ one starts from the polar decomposition
\[ \varphi = UA, \qquad U \in O(n, \mathbb{R}), \quad A \in \mathrm{Sym}^+(n, \mathbb{R}), \]
and then $A$ is orthogonally diagonalized,
\[ A = RDR^{-1}, \qquad D \in \mathrm{Diag}(n, \mathbb{R}), \quad R \in O(n, \mathbb{R}), \]
so, finally,
\[ \varphi = LDR^{-1}, \qquad L = UR \in O(n, \mathbb{R}). \]
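The two-step construction just described (polar decomposition followed by orthogonal diagonalization) can be traced numerically. The sketch below is restricted to 2x2 matrices with positive determinant and Kronecker metrics, and relies on closed-form rotation angles that do not generalize to higher dimensions; all names and the sample matrix are illustrative assumptions.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def rot(angle):
    """Plane rotation matrix by the given angle."""
    return [[math.cos(angle), -math.sin(angle)],
            [math.sin(angle), math.cos(angle)]]

def two_polar_decomposition(phi):
    """Return (L, D, R) with phi = L D R^-1 for a 2x2 matrix with det(phi) > 0."""
    (a, b), (c, d) = phi
    # Polar step: phi = U A with U a rotation and A symmetric positive definite.
    U = rot(math.atan2(c - b, a + d))
    A = matmul(transpose(U), phi)
    # Diagonalization step: A = R D R^-1 with R a rotation (Jacobi angle).
    R = rot(0.5 * math.atan2(2.0 * A[0][1], A[0][0] - A[1][1]))
    D = matmul(transpose(R), matmul(A, R))
    L = matmul(U, R)
    return L, D, R

phi = [[2.0, 1.0], [0.0, 1.0]]
L, D, R = two_polar_decomposition(phi)
recomposed = matmul(matmul(L, D), transpose(R))   # R^-1 = R^T for a rotation
```

The diagonal entries of `D` are the square roots of the deformation invariants; swapping them, together with the corresponding columns of `L` and `R`, gives another valid triple.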
Unlike the polar decomposition, the two-polar one is non-unique. In the nondegenerate case, when all diagonal elements of D are pairwise distinct, this nonuniqueness is discrete and controlled by the permutation group S(n) interchanging deformation invariants. When degeneracy occurs the non-uniqueness is more catastrophic, in a sense continuous, and resembles the singularity of spherical coordinates at r = 0 (although, one must say, it is much more complicated). Let us finish with some kinematical concepts. Generalized velocity of an affine body is given by the pair (v, ξ) ∈ V × L(U , V ) consisting of the translational
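velocity $v$ and the internal one $\xi$. Along a given classical motion $\mathbb{R} \ni t \mapsto (x(t), \varphi(t))$ it is analytically given by the system
\[ \left(\dots, \frac{dx^i}{dt}, \dots; \dots, \frac{d\varphi^i{}_A}{dt}, \dots\right). \]
When $U = \mathbb{R}^n$, i.e. $Q = M \times F(V)$, velocities are elements of $V \times V^n = V^{n+1}$. It is convenient to use affine velocities in the spatial and material (co-moving) representations, $\Omega \in L(V)$, $\hat\Omega \in L(U)$, namely:
\[ \Omega = \xi\varphi^{-1} = \frac{d\varphi}{dt}\varphi^{-1}, \qquad \hat\Omega = \varphi^{-1}\xi = \varphi^{-1}\frac{d\varphi}{dt}. \]
They are interrelated as follows: $\Omega = \varphi\hat\Omega\varphi^{-1}$. Analytically:
\[ \Omega^i{}_j = \frac{d\varphi^i{}_A}{dt}(\varphi^{-1})^A{}_j, \qquad \hat\Omega^A{}_B = (\varphi^{-1})^A{}_i\frac{d\varphi^i{}_B}{dt}, \qquad \Omega^i{}_j = \varphi^i{}_A\,\hat\Omega^A{}_B\,(\varphi^{-1})^B{}_j. \]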
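A small numerical experiment may help to see how the two representations of affine velocity are related. The sketch below assumes a hypothetical planar motion (a rotation by the angle $t$ composed with a time-dependent stretch) and approximates $d\varphi/dt$ by a central finite difference; the motion, the step size and all names are illustrative assumptions.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def phi_of_t(t):
    """Hypothetical motion: rotation by angle t composed with a diagonal stretch."""
    c, s = math.cos(t), math.sin(t)
    stretch = [[math.exp(0.5 * t), 0.0], [0.0, math.exp(-0.2 * t)]]
    return matmul([[c, -s], [s, c]], stretch)

def time_derivative(f, t, h=1e-6):
    """Central finite difference of a 2x2 matrix-valued function."""
    plus, minus = f(t + h), f(t - h)
    return [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)]
            for i in range(2)]

t = 0.7
phi = phi_of_t(t)
xi = time_derivative(phi_of_t, t)          # internal velocity d(phi)/dt

Omega = matmul(xi, inverse(phi))           # spatial representation
Omega_hat = matmul(inverse(phi), xi)       # co-moving representation
conjugated = matmul(matmul(phi, Omega_hat), inverse(phi))  # phi Omega_hat phi^-1
```

For this particular motion the diagonal of `Omega_hat` reproduces the logarithmic stretch rates 0.5 and -0.2, and both representations have the same trace.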
Eringen in his micromorphic theory [16, 57, 58] uses the term “gyration’’ for them. They are Lie-algebraic objects related, respectively, to the right-invariant and left-invariant vector fields and differential forms,
\[ X[E]_\varphi = E\varphi, \qquad \hat X[\hat E]_\varphi = \varphi\hat E, \qquad \omega = d\varphi\,\varphi^{-1}, \qquad \hat\omega = \varphi^{-1}d\varphi. \tag{3.20} \]
In very rough, formal terms we would say that $\omega$, $\hat\omega$ are obtained from $\Omega$, $\hat\Omega$ via multiplication by $dt$. More rigorously, they are, respectively, $L(V)$-valued and $L(U)$-valued differential one-forms on $\mathrm{LI}(U, V)$. Their evaluations on vectors tangent to trajectories just coincide with $\Omega$, $\hat\Omega$. The right and left invariance is meant, obviously, in the sense of transformations (3.4) and (3.5). These become right and left regular group translations when $U = V = \mathbb{R}^n$ and $\mathrm{LI}(U, V)$ is identified with $GL(n, \mathbb{R})$. In the above formulas for vector fields, $E$ and $\hat E$ are fixed elements of, respectively, $L(V)$, $L(U)$. Affine velocities are non-holonomic in the sense that there are no generalized coordinates for which they would be time derivatives. This is due to the non-commutativity of the full linear group. In continuum mechanics $\Omega$ may be interpreted in terms of the Euler velocity field. Namely, the material point which at a given instant of time passes the spatial point $y \in M$ has the velocity:
\[ v(y) = v + \Omega\,\overrightarrow{xy}; \]
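Analytically: $v^i(y) = v^i + \Omega^i{}_j (y^j - x^j)$. When the motion is metrically rigid, i.e. gyroscopic, the affine velocity becomes skew-symmetric with respect to the appropriate metric tensor,
\[ \Omega^i{}_j = -g^{ia} g_{jb}\,\Omega^b{}_a = -\Omega_j{}^i, \qquad \hat\Omega^A{}_B = -\eta^{AK}\eta_{BL}\,\hat\Omega^L{}_K = -\hat\Omega_B{}^A. \]
These objects are the usual angular velocities, respectively, in the spatial and co-moving representations. In the physical three-dimensional case they are identified in a standard way with the usual pseudo-vectors of angular velocity; namely, in orthonormal coordinates:
\[ \Omega^i{}_j = -\varepsilon^i{}_{jk}\Omega^k, \quad \Omega^i = -\tfrac12\,\varepsilon^i{}_k{}^j\,\Omega^k{}_j, \quad \hat\Omega^A{}_B = -\varepsilon^A{}_{BC}\hat\Omega^C, \quad \hat\Omega^A = -\tfrac12\,\varepsilon^A{}_C{}^B\,\hat\Omega^C{}_B, \quad \Omega^i = \varphi^i{}_A\hat\Omega^A. \]
Obviously, $\varepsilon$ denotes here the totally antisymmetric Ricci symbol and indices are trivially shifted with the use of the Kronecker delta (orthonormal coordinates). In the two-dimensional case (also physically interesting), angular velocities are one-dimensional objects, $\Omega$ and $\hat\Omega$ coincide numerically, and in orthonormal coordinates:
\[ [\Omega^i{}_j] = [\hat\Omega^A{}_B] = \omega\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} = \frac{d\theta}{dt}\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}, \qquad [\varphi^i{}_A] = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}. \]
For the fictitious rigid bodies corresponding to the polar and two-polar decompositions we also introduce the corresponding angular velocities:
\[ \begin{aligned} \omega &= \frac{dU}{dt}U^{-1} \in SO(V, g)' \subset L(V), & \hat\omega &= U^{-1}\frac{dU}{dt} \in SO(U, \eta)' \subset L(U), \\ \chi &= \frac{dL}{dt}L^{-1} \in SO(V, g)' \subset L(V), & \hat\chi &= L^{-1}\frac{dL}{dt} \in SO(n, \mathbb{R})' \subset L(n, \mathbb{R}), \\ \vartheta &= \frac{dR}{dt}R^{-1} \in SO(U, \eta)' \subset L(U), & \hat\vartheta &= R^{-1}\frac{dR}{dt} \in SO(n, \mathbb{R})' \subset L(n, \mathbb{R}). \end{aligned} \tag{3.21} \]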
These objects are elements of the indicated Lie algebras of orthogonal groups. So, $\omega$, $\chi$ are $g$-skew-symmetric, $\hat\omega$, $\vartheta$ are $\eta$-skew-symmetric, and $\hat\chi$, $\hat\vartheta$ are skew-symmetric in the usual Kronecker sense. In certain problems it is convenient to use the co-moving representation of the translational velocity,
\[ \hat v^A := (\varphi^{-1})^A{}_i\, v^i = (\varphi^{-1})^A{}_i\,\frac{dx^i}{dt}. \]
Canonical momenta, i.e. linear functionals on generalized velocities, are pairs $(p, \pi) \in V^* \times L(V, U)$, i.e. analytically $(\dots, p_i, \dots; \dots, p^A{}_i, \dots)$. Their evaluations on virtual velocities are given by
\[ \langle (p, \pi), (v, \xi) \rangle = \langle p, v \rangle + \mathrm{Tr}(\pi\xi) = p_i v^i + p^A{}_i\,\xi^i{}_A. \tag{3.22} \]
When $U = \mathbb{R}^n$ and $Q = M \times F(V)$, canonical momenta are elements of $V^* \times V^{*n} = V^{*(n+1)}$. The duality between $\pi$ and $\xi$ may be expressed in terms of the duality between affine spin and affine velocity. More rigorously, the affine spin $\Sigma \in L(V)$ and its co-moving representation $\hat\Sigma \in L(U)$ are defined as:
\[ \Sigma = \varphi\pi, \qquad \hat\Sigma = \pi\varphi. \]
Analytically:
\[ \Sigma^i{}_j = \varphi^i{}_A\, p^A{}_j, \qquad \hat\Sigma^A{}_B = p^A{}_i\,\varphi^i{}_B. \]
One also uses the term “hypermomentum’’. The quantities $\Sigma$, $\hat\Sigma$ are, respectively, Hamiltonian generators of the transformation groups (3.4) and (3.5). The linear momentum $p_i$ generates spatial translations. In many problems it is convenient to use its co-moving representation
\[ \hat p_A = p_i\,\varphi^i{}_A, \]
which has to do with material translations. One also uses the orbital affine momentum $\Lambda$ and the total affine momentum $J$, given, respectively, by
\[ \Lambda^i{}_j = x^i p_j, \qquad J^i{}_j = \Lambda^i{}_j + \Sigma^i{}_j. \]
Unlike $\Sigma$, the quantities $\Lambda$, $J$ depend on the choice of the origin $o \in M$ of affine coordinates in $M$. And $J$, more precisely $J(o)$, is a Hamiltonian generator (momentum mapping) of the centre-affine subgroup $\mathrm{GAf}(M, o) \subset \mathrm{GAf}(M)$ (affine transformations preserving $o$) acting through (3.2). The doubled skew-symmetric parts of hypermomenta,
\[ S^i{}_j = \Sigma^i{}_j - g^{ik} g_{jl}\,\Sigma^l{}_k, \qquad L^i{}_j = x^i p_j - g^{ik} g_{jl}\, x^l p_k, \qquad \mathcal{J}^i{}_j = L^i{}_j + S^i{}_j, \]
are the usual angular momenta: the internal one (spin), the orbital one, and the total one. They are Hamiltonian generators (momentum mappings) of the corresponding isometry groups. The quantity
\[ V^A{}_B := \hat\Sigma^A{}_B - \eta^{AL}\eta_{BK}\,\hat\Sigma^K{}_L, \]
called “vorticity’’ by Dyson, is the Hamiltonian generator of the right-acting rotation group (3.9). When $\varphi$ is not an isometry, $V$ is not the co-moving representation of $S$,
\[ S^i{}_j \neq \varphi^i{}_A\, V^A{}_B\,(\varphi^{-1})^B{}_j. \]
$S$ and $V$ generate, respectively, spatial and material rotations of the internal degrees of freedom. $\mathcal{J}$ and $p$ together are Hamiltonian generators of spatial isometries. In a sense, $\Sigma$ and $\hat\Sigma$ are non-holonomic canonical momenta; non-holonomic, because their Poisson brackets do not vanish. The pairing between internal canonical momenta and velocities may now be expressed as follows:
\[ \langle \Sigma, \Omega \rangle = \langle \hat\Sigma, \hat\Omega \rangle = \mathrm{Tr}(\Sigma\Omega) = \mathrm{Tr}(\hat\Sigma\hat\Omega) = \mathrm{Tr}(\pi\xi). \]
Analytically:
\[ \langle \Sigma, \Omega \rangle = \Sigma^i{}_j\,\Omega^j{}_i = \hat\Sigma^A{}_B\,\hat\Omega^B{}_A = \langle \hat\Sigma, \hat\Omega \rangle. \]
Just as $\Omega$, $\hat\Omega$ themselves, $\Sigma$, $\hat\Sigma$ may be interpreted in terms of right- and left-invariant vector fields or differential forms. We use the standard conventions of differential geometry according to which vector fields with components $Z^i$ related to local coordinates $z^i$ are identified with first-order differential operators:
\[ Z = Z^i\frac{\partial}{\partial z^i}. \]
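Then $\Sigma$, $\hat\Sigma$, as systems of vector fields dual to the systems of Pfaff forms (3.20) (more precisely, to the $L(V)$- and $L(U)$-valued differential one-forms), are given by
\[ E^i{}_j = \varphi^i{}_A\frac{\partial}{\partial\varphi^j{}_A}, \qquad \hat E^A{}_B = \varphi^i{}_B\frac{\partial}{\partial\varphi^i{}_A}. \tag{3.23} \]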
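In other words, at the point $\varphi \in \mathrm{LI}(U, V)$, the $\binom{k}{A}$th component of $E^i{}_j$ equals $\varphi^i{}_A\,\delta^k{}_j$, and the $\binom{i}{C}$th component of $\hat E^A{}_B$ equals $\varphi^i{}_B\,\delta^A{}_C$. Interpreted as invariant forms, the $\Sigma$- and $\hat\Sigma$-objects become, respectively, the following fields on $\mathrm{LI}(U, V)$:
\[ Y[F]_\varphi = \varphi^{-1}F, \qquad \hat Y[\hat F]_\varphi = \hat F\varphi^{-1}, \]
where $F$, $\hat F$ are arbitrarily fixed elements of $L(V)$, $L(U)$, and we remember that the dual space $L(U, V)^*$ is canonically isomorphic with $L(V, U)$ through the formula (3.22). Let us quote the obvious transformation rules of $\Omega$, $\hat\Omega$, $\Sigma$, $\hat\Sigma$ under transformations (3.4) and (3.5) of internal degrees of freedom:
\[ \begin{aligned} \alpha \in GL(V)&: & \Omega &\mapsto \alpha\Omega\alpha^{-1}, & \hat\Omega &\mapsto \hat\Omega, \\ \beta \in GL(U)&: & \Omega &\mapsto \Omega, & \hat\Omega &\mapsto \beta^{-1}\hat\Omega\beta, \\ \alpha \in GL(V)&: & \Sigma &\mapsto \alpha\Sigma\alpha^{-1}, & \hat\Sigma &\mapsto \hat\Sigma, \\ \beta \in GL(U)&: & \Sigma &\mapsto \Sigma, & \hat\Sigma &\mapsto \beta^{-1}\hat\Sigma\beta. \end{aligned} \]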
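Now let us quote the basic Poisson brackets. The most important of them, namely those involving the above generators, are determined by the structure constants of the linear and affine groups:
\[ \{\Sigma^i{}_j, \Sigma^k{}_l\} = \delta^i{}_l\,\Sigma^k{}_j - \delta^k{}_j\,\Sigma^i{}_l, \qquad \{\hat\Sigma^A{}_B, \hat\Sigma^C{}_D\} = \delta^C{}_B\,\hat\Sigma^A{}_D - \delta^A{}_D\,\hat\Sigma^C{}_B, \]
\[ \{J^i{}_j, J^k{}_l\} = \delta^i{}_l\,J^k{}_j - \delta^k{}_j\,J^i{}_l, \qquad \{\Sigma^i{}_j, \hat\Sigma^A{}_B\} = 0, \]
\[ \{\Lambda^i{}_j, \Lambda^k{}_l\} = \delta^i{}_l\,\Lambda^k{}_j - \delta^k{}_j\,\Lambda^i{}_l, \qquad \{\hat\Sigma^A{}_B, \hat p_C\} = \delta^A{}_C\,\hat p_B, \]
\[ \{J^i{}_j, p_k\} = \{\Lambda^i{}_j, p_k\} = \delta^i{}_k\, p_j. \]
For any function $F$ depending only on the generalized coordinates $x^i$, $\varphi^i{}_A$ we have:
\[ \{\Sigma^i{}_j, F\} = -E^i{}_j F = -\varphi^i{}_A\frac{\partial F}{\partial\varphi^j{}_A}, \qquad \{\Lambda^i{}_j, F\} = -x^i\frac{\partial F}{\partial x^j}, \]
\[ \{J^i{}_j, F\} = -x^i\frac{\partial F}{\partial x^j} - \varphi^i{}_A\frac{\partial F}{\partial\varphi^j{}_A}, \qquad \{\hat\Sigma^A{}_B, F\} = -\hat E^A{}_B F = -\varphi^i{}_B\frac{\partial F}{\partial\varphi^i{}_A}. \]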
These Poisson brackets are in principle sufficient for obtaining equations of motion in the form
\[ \frac{dF}{dt} = \{F, H\}, \tag{3.24} \]
$H$ denoting the Hamilton function. In addition to the vector fields $E^i{}_j$, $\hat E^A{}_B$ and the differential one-forms $\omega^i{}_j$, $\hat\omega^A{}_B$, it is convenient to introduce some other objects, namely the vector fields $H_a$, $\hat H_A$ and the differential one-forms $\theta^a$, $\hat\theta^A$, all defined on the configuration space $Q = M \times \mathrm{LI}(U, V)$. In terms of affine coordinates $x^i$, $\varphi^i{}_A$ (the result is coordinate independent) they are given by
\[ H_a = \frac{\partial}{\partial x^a}, \qquad \hat H_A = \varphi^i{}_A\frac{\partial}{\partial x^i}, \qquad \theta^a = dx^a, \qquad \hat\theta^A = (\varphi^{-1})^A{}_i\, dx^i. \tag{3.25} \]
The following duality relations hold among them:
\[ \langle \hat\omega^A{}_B, \hat E^C{}_D \rangle = \delta^A{}_D\,\delta^C{}_B, \qquad \langle \hat\omega^A{}_B, \hat H_C \rangle = 0, \qquad \langle \hat\theta^A, \hat H_B \rangle = \delta^A{}_B, \qquad \langle \hat\theta^A, \hat E^C{}_D \rangle = 0, \]
and similarly,
\[ \langle \omega^a{}_b, E^c{}_d \rangle = \delta^a{}_d\,\delta^c{}_b, \qquad \langle \omega^a{}_b, H_c \rangle = 0, \qquad \langle \theta^a, E^c{}_d \rangle = 0, \qquad \langle \theta^a, H_b \rangle = \delta^a{}_b. \]
Therefore, they are mutually dual fields of non-holonomic frames and co-frames on $Q$. As seen from the structure of Poisson brackets, these fields are geometrically important. In the theory of principal fibre bundles of linear frames or co-frames they are known as structural fields and standard horizontal fields [72, 39]. Additional important Poisson brackets:
\[ \{p_a, F\} = -H_a F, \qquad \{\hat p_A, F\} = -\hat H_A F \tag{3.26} \]
for any function $F$ depending only on generalized coordinates. Let us finish the above description of the classical geometry of degrees of freedom with a brief review of symmetry problems underlying the polar and two-polar decompositions. They are very important for quantization problems. First of all we shall slightly modify our notation. We introduce new generalized coordinates parameterizing deformation invariants. In many problems it is convenient to denote the diagonal elements of $D$ by $Q^a$, $a = 1, \dots, n$,
\[ Q^a := D_{aa}, \qquad \lambda_a = (D_{aa})^2 = (Q^a)^2. \]
And many formulas become remarkably simpler when the logarithmic scale is used for parameterizing deformation invariants, $q^a = \ln Q^a$, thus:
\[ Q^a = D_{aa} = \exp(q^a), \qquad \lambda_a = \exp(2q^a). \]
Deformation parameters $q^a$ run over the total real range $\mathbb{R}$. They are fictitious “material points’’ moving along the real axis. As such they are essentially identical and indistinguishable; this has to do with the mentioned non-uniqueness of the two-polar decomposition. This indistinguishability is particularly striking and interesting in the quantized version of the theory. The volume extension ratio is given by $\det D = \exp(q^1 + \cdots + q^n)$, thus it may be measured in a convenient way by the sum $(q^1 + \cdots + q^n)$. It is often convenient to split $D$ into the isochoric (incompressible) and the purely dilatational parts,
\[ D = l\Delta, \qquad \det\Delta = 1, \quad l \in \mathbb{R}^+. \]
The factor $l$ is the linear size extension ratio; obviously,
\[ l = \sqrt[n]{\det D} = \exp\left(\frac{1}{n}(q^1 + \cdots + q^n)\right). \]
The logarithmic measure of this ratio,
\[ q = \ln l = \frac{1}{n}(q^1 + \cdots + q^n), \]
is simply the “centre of mass’’ of the mentioned “material points’’. The isochoric part $\Delta$ depends only on the ratios $Q^i/Q^j$, i.e. logarithmically, on the “relative positions’’ $q^i - q^j$. The splitting of internal configurations $\varphi$ into dilatational and isochoric parts may be written down as follows:
\[ \varphi = l\Psi = \exp(q)\Psi = \exp(q)\,L\Delta R^{-1}, \]
where, as previously, $\Delta$ is diagonal and isochoric ($\det\Delta = 1$). The term $\Psi$ refers to the shear-rotational degrees of freedom. It is convenient to use the isochoric affine velocities:
\[ \nu = \frac{d\Psi}{dt}\Psi^{-1}, \qquad \hat\nu = \Psi^{-1}\frac{d\Psi}{dt} = \Psi^{-1}\nu\Psi. \]
They are traceless, i.e. $\nu \in SL(V)'$, $\hat\nu \in SL(U)'$ (elements of the Lie algebras of $SL(V)$, $SL(U)$). The total affine velocities may be expressed as follows:
\[ \Omega = \nu + \frac{dq}{dt}\mathrm{Id}_V, \qquad \hat\Omega = \hat\nu + \frac{dq}{dt}\mathrm{Id}_U, \tag{3.27} \]
where $\mathrm{Id}_V$, $\mathrm{Id}_U$ are the identity transformations in $V$, $U$.
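The splitting of the diagonal stretch into dilatational and isochoric parts is elementary in the logarithmic variables. The following sketch uses illustrative names and sample values that are not taken from the text.

```python
import math

def split_dilatation(q_values):
    """Split D = diag(exp(q^a)) into D = l * Delta with det(Delta) = 1.

    Returns the mean logarithmic stretch q, the linear size ratio l = exp(q),
    and the diagonal entries of the isochoric part Delta.
    """
    n = len(q_values)
    q_mean = sum(q_values) / n                           # "centre of mass" of the q^a
    scale = math.exp(q_mean)                             # linear size extension ratio l
    delta = [math.exp(qa - q_mean) for qa in q_values]   # depends only on q^a - q
    return q_mean, scale, delta

q_values = [0.3, -0.1, 0.4]        # sample logarithmic deformation invariants
q_mean, scale, delta = split_dilatation(q_values)
det_D = math.exp(sum(q_values))    # volume extension ratio
```

Shifting all $q^a$ by the same constant changes only `q_mean` and `scale`, while `delta` is unaffected, which is the numerical counterpart of the statement that the isochoric part depends only on the relative positions $q^i - q^j$.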
Similarly, the affine spin may be decomposed as follows:
\[ \Sigma = \sigma + \frac{p}{n}\mathrm{Id}_V, \qquad \hat\Sigma = \hat\sigma + \frac{p}{n}\mathrm{Id}_U, \tag{3.28} \]
where $\sigma \in SL(V)'$, $\hat\sigma \in SL(U)'$ (traceless) and $p$ is the dilatational canonical momentum,
\[ \mathrm{Tr}(\Sigma) = \mathrm{Tr}(\hat\Sigma) = p. \tag{3.29} \]
This momentum is canonically conjugate to the above $q$-variable (logarithmic size variable). The pairing between velocities and momenta may be expressed as follows:
\[ \mathrm{Tr}(\Sigma\Omega) = \mathrm{Tr}(\hat\Sigma\hat\Omega) = \mathrm{Tr}(\sigma\nu) + p\dot q = \mathrm{Tr}(\hat\sigma\hat\nu) + p\dot q. \]
Poisson brackets for the components of $\sigma$ are based on the structure constants of $SL(V)$. The same holds for $\hat\sigma$, the only difference being that the signs are reversed. The mutual Poisson brackets $\{\sigma, \hat\sigma\}$ vanish. And obviously, $\{q, p\} = 1$, and the dilatational phase-space variables have vanishing Poisson brackets with the shear-rotational quantities $\Psi$, $\sigma$, $\hat\sigma$. The dilatational canonical momentum $p$ may be interpreted as the total linear momentum of the one-dimensional $q^a$-particles. The two-polar decomposition identifies (modulo some non-uniqueness) internal configurations with the triplets $(L; q^1, \dots, q^n; R)$, where $(L, R)$ is the pair of rigid bodies (principal axes of the Cauchy and Green deformation tensors), and the fictitious one-dimensional material points $q^1, \dots, q^n$ are deformation invariants (in logarithmic scale). The formulas (3.21) suggest making use of two possible systems of non-holonomic velocities:
\[ (\hat\chi; \dots, \dot q^a, \dots; \hat\vartheta), \qquad (\chi; \dots, \dot q^a, \dots; \vartheta). \tag{3.30} \]
As expected from (metrically) rigid body mechanics, the first system is more effective in the analysis of dynamical models. Similarly, when the polar decompositions are used, we have at our disposal the following natural systems of non-holonomic velocities:
\[ (\hat\omega, \dot A), \qquad (\hat\omega, \dot B), \qquad (\omega, \dot A), \qquad (\omega, \dot B), \tag{3.31} \]
cf. (3.18) and (3.19). For the qualitative analysis of practically important dynamical models the $(\hat\omega, \dot A)$-system is the most convenient. Non-holonomic canonical momenta conjugate to (3.30) are, respectively, denoted by $(\hat\rho; \dots, p_a, \dots; \hat\tau)$, $(\rho; \dots, p_a, \dots; \tau)$, where
\[ \hat\rho \in SO(n, \mathbb{R})', \qquad \hat\tau \in SO(n, \mathbb{R})', \qquad \rho \in SO(V, g)', \qquad \tau \in SO(U, \eta)', \tag{3.32} \]
and $p_a$ are canonical momenta conjugate to $q^a$.
Canonical spin variables of the Green and Cauchy gyroscopes and their dual angular velocities are considered as elements of the same linear spaces; this is due to the natural isomorphisms between orthogonal Lie algebras and their duals. So, the corresponding pairings are given by
\[ \langle (\hat\rho, \bar p, \hat\tau), (\hat\chi, \dot{\bar q}, \hat\vartheta) \rangle = p_a\dot q^a + \tfrac12\mathrm{Tr}(\hat\rho\hat\chi) + \tfrac12\mathrm{Tr}(\hat\tau\hat\vartheta), \]
\[ \langle (\rho, \bar p, \tau), (\chi, \dot{\bar q}, \vartheta) \rangle = p_a\dot q^a + \tfrac12\mathrm{Tr}(\rho\chi) + \tfrac12\mathrm{Tr}(\tau\vartheta). \]
Similarly, the dual objects of (3.31) will be denoted by
\[ (\hat\sigma, \alpha), \qquad (\hat\mu, \beta), \qquad (\sigma, \alpha), \qquad (\mu, \beta), \]
where $\hat\sigma, \hat\mu \in SO(U, \eta)'$, $\sigma, \mu \in SO(V, g)'$, and $\alpha \in L(U)$, $\beta \in L(V)$ are, respectively, $\eta$- and $g$-symmetric. The corresponding pairings are given by
\[ \langle (\hat\sigma, \alpha), (\hat\omega, \dot A) \rangle = \tfrac12\mathrm{Tr}(\hat\sigma\hat\omega) + \mathrm{Tr}(\alpha\dot A), \]
and analogously for the other combinations. Let us recall that in the physical three-dimensional case skew-symmetric tensors are identified with axial vectors, e.g.
\[ \chi^i{}_j = -\varepsilon^i{}_{jk}\chi^k, \qquad \chi_i = -\tfrac12\,\varepsilon_{ik}{}^j\,\chi^k{}_j \]
in orthonormal coordinates. For the dual angular momentum quantities we have the reversed sign convention, e.g.
\[ \rho^i{}_j = \varepsilon^i{}_{jk}\rho^k, \qquad \rho_i = \tfrac12\,\varepsilon_{ik}{}^j\,\rho^k{}_j. \]
The shift of indices here is meant in the cosmetic Kronecker-delta sense. Then the former formulas are compatible with the standard $\mathbb{R}^3$-conventions, e.g.
\[ \tfrac12\mathrm{Tr}(\rho\chi) = \rho_i\chi^i, \]
and similarly for the other angular velocity and angular momentum quantities. The quantities $\rho$, $\tau$ coincide, respectively, with the spin $S$ and the negative vorticity $-V$. They are Hamiltonian generators of the transformations
\[ \varphi \mapsto A\varphi, \qquad \varphi \mapsto \varphi C^{-1}, \qquad A \in SO(V, g), \quad C \in SO(U, \eta), \]
i.e. in terms of the two-polar decomposition:
\[ L \mapsto AL, \qquad R \mapsto CR. \]
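The objects $\hat\rho$, $\hat\tau$ are Hamiltonian generators of the transformations
\[ L \mapsto LA, \qquad R \mapsto RC, \qquad A, C \in SO(n, \mathbb{R}). \]
Similarly, if we use the polar decomposition, $\sigma$ coincides with the spin $S$, because it generates the transformations $\varphi \mapsto W\varphi$, $W \in SO(V, g)$, i.e.
\[ U \mapsto WU, \qquad \varphi = UA \mapsto WUA = W\varphi. \]
The quantity $\hat\sigma$ is the Hamiltonian generator of
\[ \varphi = UA \mapsto UKA, \qquad K \in SO(U, \eta). \]
Similarly, the object $\hat\mu$ generates the transformations
\[ \varphi = BU \mapsto BUC = \varphi C, \qquad C \in SO(U, \eta). \]
Therefore, it coincides with the canonical vorticity $V$. And finally, $\mu$ generates the transformation group
\[ \varphi = BU \mapsto BWU, \qquad W \in SO(V, g). \]
Everything said above implies that
\[ \chi^i{}_j = L^i{}_a\,\hat\chi^a{}_b\,(L^{-1})^b{}_j, \qquad \vartheta^A{}_B = R^A{}_a\,\hat\vartheta^a{}_b\,(R^{-1})^b{}_B, \qquad \omega^i{}_j = U^i{}_A\,\hat\omega^A{}_B\,(U^{-1})^B{}_j, \]
and similarly
\[ \rho^i{}_j = L^i{}_a\,\hat\rho^a{}_b\,(L^{-1})^b{}_j, \qquad \tau^A{}_B = R^A{}_a\,\hat\tau^a{}_b\,(R^{-1})^b{}_B, \qquad \sigma^i{}_j = U^i{}_A\,\hat\sigma^A{}_B\,(U^{-1})^B{}_j, \qquad \mu^i{}_j = U^i{}_A\,\hat\mu^A{}_B\,(U^{-1})^B{}_j. \]
In situations where the variables $Q^i$ are more convenient than $q^i = \ln Q^i$, the canonical momenta $P_i$ conjugate to $Q^i$ will be used instead of $p_i$ (conjugate to $q^i$). The relationship is as follows:
\[ P_i = p_i\exp(-q^i) = \frac{p_i}{Q^i}. \]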
Having in view applications on the fundamental level, including the atomic and molecular structure, we concentrate here on the quantization procedure.
Because of this, on the classical level we are interested mainly in Hamiltonian models. Even if dissipative phenomena are taken into account, they are considered as a correction to the Hamiltonian background. In any case, the primary concept is that of the kinetic energy. If we assume that the mechanism of affine constraints is compatible with the d’Alembert principle, then the kinetic energy is obtained by restriction of the primary multiparticle kinetic energy to the tangent bundle of the constraints manifold. After easy calculations one obtains:
\[ T = T_{tr} + T_{int} = \frac{m}{2}\,g_{ij}\frac{dx^i}{dt}\frac{dx^j}{dt} + \frac12\,g_{ij}\frac{d\varphi^i{}_A}{dt}\frac{d\varphi^j{}_B}{dt}J^{AB}, \]
where $m$, $J$ denote, as previously, the total mass and the second-order moment of the mass distribution. They characterize, respectively, the translational and internal inertia. It is instructive to quote two alternative formulas:
\[ T = \frac{m}{2}\,G_{AB}\hat v^A\hat v^B + \frac12\,G_{AB}\,\hat\Omega^A{}_K\,\hat\Omega^B{}_L\,J^{KL}, \]
\[ T = \frac{m}{2}\,g_{ij}v^iv^j + \frac12\,g_{ij}\,\Omega^i{}_k\,\Omega^j{}_l\,J[\varphi]^{kl}. \]
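The Legendre transformation may be written in any of the following equivalent forms:
\[ p_i = \frac{\partial T}{\partial v^i} = m g_{ij}v^j, \qquad \hat p_A = \frac{\partial T}{\partial\hat v^A} = m G_{AB}\hat v^B, \]
\[ p^A{}_i = \frac{\partial T}{\partial\xi^i{}_A} = g_{ij}\,\xi^j{}_B\,J^{BA}, \qquad \hat\Sigma^A{}_B = \frac{\partial T}{\partial\hat\Omega^B{}_A} = G_{BC}\,\hat\Omega^C{}_D\,J^{DA}, \qquad \Sigma^i{}_j = \frac{\partial T}{\partial\Omega^j{}_i} = g_{jk}\,\Omega^k{}_l\,J[\varphi]^{li}. \]
Obviously, it is assumed here that there is no generalized potential depending on velocities (e.g. no magnetic forces). Otherwise we would have to replace the kinetic energy $T$ by the total Lagrangian. Inverting the above formulas and substituting them into the kinetic energy expressions, we obtain three formulas for the kinetic Hamiltonian:
\[ T = T_{tr} + T_{int} = \frac{1}{2m}\,g^{ij}p_ip_j + \frac12\,\tilde J_{AB}\,p^A{}_i\,p^B{}_j\,g^{ij}, \tag{3.33} \]
\[ T = \frac{1}{2m}\,\tilde G^{AB}\hat p_A\hat p_B + \frac12\,\tilde J_{AB}\,\hat\Sigma^A{}_K\,\hat\Sigma^B{}_L\,\tilde G^{KL}, \tag{3.34} \]
\[ T = \frac{1}{2m}\,g^{ij}p_ip_j + \frac12\,\tilde J[\varphi]_{ij}\,\Sigma^i{}_k\,\Sigma^j{}_l\,g^{kl}, \tag{3.35} \]
where the “tilde’’ objects are reciprocal tensors.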
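The equivalence of the Lagrangian and Hamiltonian expressions for the internal kinetic energy can be verified on a small numerical example. The sketch below is restricted to $n = 2$ and to the internal part $T_{int}$; it assumes $J[\varphi] = \varphi J\varphi^T$ in matrix form (consistent with $e_A = \varphi E_A$), and all numerical values and variable names are illustrative assumptions.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def trace(a):
    return a[0][0] + a[1][1]

g = [[1.0, 0.0], [0.0, 1.0]]        # spatial metric (orthonormal coordinates)
J = [[2.0, 0.5], [0.5, 1.0]]        # inertial tensor J^{AB}, symmetric positive definite
phi = [[1.0, 0.3], [-0.2, 1.5]]     # internal configuration phi^i_A
xi = [[0.1, -0.4], [0.7, 0.2]]      # internal velocity xi^i_A

# Lagrangian form: T_int = (1/2) g_ij xi^i_A xi^j_B J^{AB} = (1/2) Tr(xi^T g xi J)
T_lagr = 0.5 * trace(matmul(matmul(transpose(xi), matmul(g, xi)), J))

# Spatial form: T_int = (1/2) g_ij Omega^i_k Omega^j_l J[phi]^{kl}
Omega = matmul(xi, inverse(phi))
J_phi = matmul(matmul(phi, J), transpose(phi))
T_spatial = 0.5 * trace(matmul(matmul(transpose(Omega), matmul(g, Omega)), J_phi))

# Legendre transformation: p^A_i = g_ij xi^j_B J^{BA}, stored as pi[A][i]
pi = matmul(J, matmul(transpose(xi), g))

# Hamiltonian form (3.33): T_int = (1/2) J~_AB p^A_i p^B_j g^{ij}
T_ham = 0.5 * trace(matmul(matmul(transpose(pi), matmul(inverse(J), pi)), inverse(g)))

# For a kinetic energy quadratic in velocities, Tr(pi xi) equals 2 T_int.
pairing = trace(matmul(pi, xi))
```

The three values agree up to rounding, and the pairing $\mathrm{Tr}(\pi\xi)$ equals twice the kinetic energy, as expected for a homogeneous quadratic form.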
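For Lagrangians $L = T - V(x, \varphi)$, the resulting Hamiltonians have the form $H = T + V(x, \varphi)$. The usual kinetic energy quadratic in velocities is geometrically equivalent to some Riemann structure on the configuration space,
\[ T = \frac12\,\Gamma_{\mu\nu}(q)\frac{dq^\mu}{dt}\frac{dq^\nu}{dt}, \qquad \Gamma = \Gamma_{\mu\nu}(q)\,dq^\mu \otimes dq^\nu, \]
or, in traditional notation using the arc element: $d\sigma^2 = \Gamma_{\mu\nu}(q)\,dq^\mu dq^\nu$. For the kinetic Hamiltonian we have:
\[ T = \frac12\,\tilde\Gamma^{\mu\nu}(q)\,p_\mu p_\nu, \qquad \tilde\Gamma = \tilde\Gamma^{\mu\nu}(q)\frac{\partial}{\partial q^\mu} \otimes \frac{\partial}{\partial q^\nu}. \]
Using the previously introduced symbols we can express the metric tensor underlying our kinetic energy in any of the following equivalent forms:
\[ \Gamma = m g_{ij}\,dx^i \otimes dx^j + g_{ij}J^{AB}\,d\varphi^i{}_A \otimes d\varphi^j{}_B, \tag{3.36} \]
\[ \Gamma = m G_{AB}\,\hat\theta^A \otimes \hat\theta^B + G_{AB}J^{KL}\,\hat\omega^A{}_K \otimes \hat\omega^B{}_L, \]
\[ \Gamma = m g_{ij}\,\theta^i \otimes \theta^j + g_{ij}J[\varphi]^{kl}\,\omega^i{}_k \otimes \omega^j{}_l. \]
Similarly, for the inverse metric $\tilde\Gamma$ we have the following equivalent expressions:
\[ \tilde\Gamma = \frac1m\,g^{ij}\frac{\partial}{\partial x^i} \otimes \frac{\partial}{\partial x^j} + g^{ij}\tilde J_{AB}\frac{\partial}{\partial\varphi^i{}_A} \otimes \frac{\partial}{\partial\varphi^j{}_B}, \]
\[ \tilde\Gamma = \frac1m\,\tilde G^{AB}\hat H_A \otimes \hat H_B + \tilde G^{AB}\tilde J_{KL}\,\hat E^K{}_A \otimes \hat E^L{}_B, \]
\[ \tilde\Gamma = \frac1m\,g^{ij}H_i \otimes H_j + g^{ij}\tilde J[\varphi]_{kl}\,E^k{}_i \otimes E^l{}_j. \]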
It is clear that the above kinetic energies (metric tensors) are invariant under the group of spatial isometries Is(M , g) acting through (3.2). It is also invariant under O(U , J ) acting through (3.5), i.e. the subgroup of GL(U ) preserving J . In particular, it is materially isotropic, i.e. invariant under O(U , η) when the inertial tensor is spherical, J = μ˜η. However, there is no total affine invariance either in the spatial or material sense. This kinematical symmetry is broken by the tensors g, J . Therefore, the traditional d’Alembert model of the kinetic energy does not belong to the framework of left- or right- (or two-side-) invariant geodetic systems on Lie groups or their group spaces, i.e. to the theory developed by Hermann and Arnold on the basis of rigid body or incompressible ideal fluid dynamics. By
the way, with the above metric tensors the geodetic systems are non-physical, because they predict unlimited contraction and expansion of the body. And when some extra potential is introduced as a dynamical model of deformative vibrations, then, except for some very special potential shapes, little or no profit is gained from the group-theoretical model of degrees of freedom. So it is tempting to formulate dynamical models unifying two things: geodetic description (no potential, as far as possible) and affine invariance. It turns out that to some extent this may be done successfully: not only the inertia but also the interactions are encoded in some affinely-invariant kinetic energy forms (metrics on the configuration space). The most general reasonable class of dynamical geodetic models invariant under the spatial affine group $GAf(M)$ acting through (3.2) is given by the metric tensor

$$\Gamma = m\,\eta_{AB}\,\hat{\theta}^A \otimes \hat{\theta}^B + L^B{}_A{}^D{}_C\,\hat{\omega}^A{}_B \otimes \hat{\omega}^C{}_D, \tag{3.37}$$

where $L^B{}_A{}^D{}_C$ are constant and symmetric in their bi-indices, $L^B{}_A{}^D{}_C = L^D{}_C{}^B{}_A$. The inverse metric is given by

$$\tilde{\Gamma} = \frac{1}{m}\,\eta^{AB}\,\hat{H}_A \otimes \hat{H}_B + \tilde{L}^B{}_A{}^D{}_C\,\hat{E}^A{}_B \otimes \hat{E}^C{}_D,$$

where $\tilde{L}^A{}_B{}^K{}_L\,L^L{}_K{}^C{}_D = \delta^A{}_D\,\delta^C{}_B$.
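Similarly, the right-invariant metric tensors have the form:

$$\Gamma = m\,g_{ij}\,\theta^i \otimes \theta^j + R^j{}_i{}^l{}_k\,\omega^i{}_j \otimes \omega^k{}_l \tag{3.38}$$

with similar properties of the constants $R$: $R^j{}_i{}^l{}_k = R^l{}_k{}^j{}_i$. The inverse contravariant metric (underlying the kinetic Hamiltonian) is given by

$$\tilde{\Gamma} = \frac{1}{m}\,g^{ij}\,H_i \otimes H_j + \tilde{R}^j{}_i{}^l{}_k\,E^i{}_j \otimes E^k{}_l, \qquad \tilde{R}^j{}_i{}^l{}_k\,R^k{}_l{}^a{}_b = \delta^j{}_b\,\delta^a{}_i.$$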
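The corresponding explicit expressions for kinetic energies and kinetic Hamiltonians are:

$$\begin{aligned} T &= \frac{m}{2}\,\eta_{AB}\,\hat{v}^A \hat{v}^B + \frac{1}{2}\,L^B{}_A{}^D{}_C\,\hat{\Omega}^A{}_B\,\hat{\Omega}^C{}_D, \\ T &= \frac{1}{2m}\,\eta^{AB}\,\hat{p}_A \hat{p}_B + \frac{1}{2}\,\tilde{L}^B{}_A{}^D{}_C\,\hat{\Sigma}^A{}_B\,\hat{\Sigma}^C{}_D, \end{aligned} \tag{3.39}$$

$$\begin{aligned} T &= \frac{m}{2}\,g_{ij}\,v^i v^j + \frac{1}{2}\,R^j{}_i{}^d{}_c\,\Omega^i{}_j\,\Omega^c{}_d, \\ T &= \frac{1}{2m}\,g^{ij}\,p_i p_j + \frac{1}{2}\,\tilde{R}^j{}_i{}^d{}_c\,\Sigma^i{}_j\,\Sigma^c{}_d. \end{aligned} \tag{3.40}$$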
In certain problems it is convenient to use other, equivalent expressions for the translational parts. They are, respectively, given by

$$T_{\mathrm{tr}} = \frac{m}{2}\,C_{ij}\,v^i v^j, \qquad T_{\mathrm{tr}} = \frac{1}{2m}\,\tilde{C}^{ij}\,p_i p_j, \tag{3.41}$$

$$T_{\mathrm{tr}} = \frac{m}{2}\,G_{AB}\,\hat{v}^A \hat{v}^B, \qquad T_{\mathrm{tr}} = \frac{1}{2m}\,\tilde{G}^{AB}\,\hat{p}_A \hat{p}_B. \tag{3.42}$$

Let us observe that the $GAf(M)$-invariant expressions may be interpreted in the following way. No metric tensor $g$ in the physical space is assumed, and even if it exists (as it does in reality) it does not enter the kinetic energy expression (if it did, affine symmetry would be broken and restricted to the isometric one). The role of the metric tensor in the contraction of tensorial indices is played by the Cauchy tensor $C$.

There is no kinetic energy model invariant simultaneously under spatial and material affine transformations. More precisely, any symmetric twice covariant tensor field on $Q = M \times LI(U, V)$ invariant under both groups must be degenerate. This is due to the very malicious non-semisimplicity of the affine group. However, if we neglect the translational motion, then there exist internal metrics on $LI(U, V)$ invariant under both spatial and material (homogeneous) affine transformations (3.4) and (3.5). They are given by

$$\Gamma^0_{\mathrm{int}} = A\,\hat{\omega}^K{}_L \otimes \hat{\omega}^L{}_K + B\,\hat{\omega}^K{}_K \otimes \hat{\omega}^L{}_L = A\,\omega^k{}_l \otimes \omega^l{}_k + B\,\omega^k{}_k \otimes \omega^l{}_l, \tag{3.43}$$

$A$, $B$ denoting constants. Their inverses have the form:

$$\tilde{\Gamma}^0_{\mathrm{int}} = \frac{1}{A}\,\hat{E}^K{}_L \otimes \hat{E}^L{}_K - \frac{B}{A(A + nB)}\,\hat{E}^K{}_K \otimes \hat{E}^L{}_L = \frac{1}{A}\,E^k{}_l \otimes E^l{}_k - \frac{B}{A(A + nB)}\,E^k{}_k \otimes E^l{}_l.$$
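Such a metric is never positive-definite. The reason is that $SL(n, \mathbb{R})$ is non-compact and semisimple. $\Gamma^0_{\mathrm{int}}$ becomes the usual Killing metric when $A = 2n$, $B = -2$. But this is a pathological situation, because $\Gamma^0_{\mathrm{int}}$ is degenerate for $A/B = -n$ (due to the dilatational centre in $GL(n, \mathbb{R})$). The corresponding kinetic energies are given by

$$T^0_{\mathrm{int}} = \frac{A}{2}\,\mathrm{Tr}(\hat{\Omega}^2) + \frac{B}{2}\,(\mathrm{Tr}\,\hat{\Omega})^2 = \frac{A}{2}\,\mathrm{Tr}(\Omega^2) + \frac{B}{2}\,(\mathrm{Tr}\,\Omega)^2, \tag{3.44}$$

$$T^0_{\mathrm{int}} = \frac{1}{2A}\,\mathrm{Tr}(\hat{\Sigma}^2) - \frac{B}{2A(A + nB)}\,(\mathrm{Tr}\,\hat{\Sigma})^2 \tag{3.45}$$

$$\phantom{T^0_{\mathrm{int}}} = \frac{1}{2A}\,\mathrm{Tr}(\Sigma^2) - \frac{B}{2A(A + nB)}\,(\mathrm{Tr}\,\Sigma)^2. \tag{3.46}$$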
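The agreement between the velocity form (3.44) and the momentum form (3.45) can be checked numerically. The sketch below is only an illustration: it assumes that the Legendre map is $\Sigma = A\,\Omega + B\,(\mathrm{Tr}\,\Omega)\,\mathbb{1}$ (our reading of the pairing $\Sigma^i{}_j\,\Omega^j{}_i$), and the function names and the sample matrix are hypothetical.

```python
# Sketch: compare the velocity form (3.44) with the momentum form (3.45)
# of the affine-affine internal kinetic energy for one sample matrix.
# Assumption: the Legendre map is Sigma = A*Omega + B*Tr(Omega)*identity.

def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def kinetic_velocity(omega, a_const, b_const):
    """(A/2) Tr(Omega^2) + (B/2) (Tr Omega)^2."""
    return (0.5 * a_const * trace(matmul(omega, omega))
            + 0.5 * b_const * trace(omega) ** 2)


def legendre_map(omega, a_const, b_const):
    """Affine spin Sigma obtained from the affine velocity Omega."""
    n = len(omega)
    tr = trace(omega)
    return [[a_const * omega[i][j] + (b_const * tr if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def kinetic_momentum(sigma, a_const, b_const):
    """Tr(Sigma^2)/(2A) - B (Tr Sigma)^2 / (2A(A + nB))."""
    n = len(sigma)
    return (trace(matmul(sigma, sigma)) / (2.0 * a_const)
            - b_const * trace(sigma) ** 2
            / (2.0 * a_const * (a_const + n * b_const)))


OMEGA = [[0.3, -1.2, 0.5], [0.7, 0.1, -0.4], [1.1, 0.6, -0.9]]
A_CONST, B_CONST = 2.0, 0.5
T_VELOCITY = kinetic_velocity(OMEGA, A_CONST, B_CONST)
T_MOMENTUM = kinetic_momentum(legendre_map(OMEGA, A_CONST, B_CONST),
                              A_CONST, B_CONST)
print(abs(T_VELOCITY - T_MOMENTUM) < 1e-12)
```

The check fails to be meaningful when $A = 0$ or $A + nB = 0$, which are exactly the degenerate cases mentioned above.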
The $B$-controlled term in $T^0_{\mathrm{int}}$ above is merely a correction. The main ($A$-controlled) term has the hyperbolic signature $(n(n+1)/2\,+,\ n(n-1)/2\,-)$, where the “plus” contribution corresponds to the non-compact dimensions and the “minus” one to the compact dimensions in $GL(V)$, $GL(U)$. This is the highest possible symmetry of $T_{\mathrm{int}}$, an affine counterpart of the spherical top. One is rather reluctant to accept non-positive “kinetic energies”. However, one can show that in the above model the lack of positive definiteness is not really embarrassing; on the contrary, the negative contributions may encode the attractive part of the deformation dynamics. By the way, the same effect may be obtained within the framework of positive Riemannian structures on $Q$, when we use a slightly modified version of (3.44).

Let us observe that the translational kinetic energy in (3.39) and (3.41) is affinely invariant in the physical space and isometry-invariant in the material space. Conversely, the one in (3.40) and (3.42) is isometry-invariant (homogeneous and isotropic) in the physical space and affinely invariant in the material space. This focuses our attention on Riemannian structures on $Q = M \times LI(U, V)$ invariant under the spatial affine group $GAf(M)$ and the group of material isometries $\mathrm{Is}(N, \eta)$; the opposite models are those invariant under spatial isometries $\mathrm{Is}(M, g)$ and material affine transformations $GAf(N)$. The corresponding metric tensors are, respectively, given by

$$\Gamma = m\,\eta_{KL}\,\hat{\theta}^K \otimes \hat{\theta}^L + I\,\eta_{KL}\,\eta^{MN}\,\hat{\omega}^K{}_M \otimes \hat{\omega}^L{}_N + \Gamma^0_{\mathrm{int}},$$

$$\Gamma = m\,g_{ij}\,\theta^i \otimes \theta^j + I\,g_{ik}\,g^{jl}\,\omega^i{}_j \otimes \omega^k{}_l + \Gamma^0_{\mathrm{int}},$$

where the constants $I$, $A$, $B$ are generalized moments of inertia.
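The corresponding contravariant inverses have the form

$$\tilde{\Gamma} = \frac{1}{m}\,\eta^{KL}\,\hat{H}_K \otimes \hat{H}_L + \frac{1}{\tilde{I}}\,\eta_{KL}\,\eta^{MN}\,\hat{E}^K{}_M \otimes \hat{E}^L{}_N + \frac{1}{\tilde{A}}\,\hat{E}^K{}_L \otimes \hat{E}^L{}_K + \frac{1}{\tilde{B}}\,\hat{E}^K{}_K \otimes \hat{E}^L{}_L,$$

$$\tilde{\Gamma} = \frac{1}{m}\,g^{ij}\,H_i \otimes H_j + \frac{1}{\tilde{I}}\,g_{ik}\,g^{jl}\,E^i{}_j \otimes E^k{}_l + \frac{1}{\tilde{A}}\,E^i{}_j \otimes E^j{}_i + \frac{1}{\tilde{B}}\,E^k{}_k \otimes E^l{}_l,$$

where the inertial constants $\tilde{I}$, $\tilde{A}$, $\tilde{B}$ are given by

$$\tilde{I} = \frac{1}{I}\,(I^2 - A^2), \qquad \tilde{A} = \frac{1}{A}\,(A^2 - I^2), \qquad \tilde{B} = -\frac{1}{B}\,(I + A)(I + A + nB).$$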
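The corresponding kinetic energies are explicitly given by

$$T = T_{\mathrm{tr}} + T_{\mathrm{int}} = \frac{m}{2}\,\eta_{AB}\,\hat{v}^A \hat{v}^B + \frac{I}{2}\,\eta_{KL}\,\hat{\Omega}^K{}_M\,\hat{\Omega}^L{}_N\,\eta^{MN} + \frac{A}{2}\,\mathrm{Tr}(\hat{\Omega}^2) + \frac{B}{2}\,(\mathrm{Tr}\,\hat{\Omega})^2, \tag{3.47}$$

$$T = T_{\mathrm{tr}} + T_{\mathrm{int}} = \frac{m}{2}\,g_{ij}\,v^i v^j + \frac{I}{2}\,g_{ik}\,\Omega^i{}_j\,\Omega^k{}_l\,g^{jl} + \frac{A}{2}\,\mathrm{Tr}(\Omega^2) + \frac{B}{2}\,(\mathrm{Tr}\,\Omega)^2. \tag{3.48}$$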
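Obviously, the last two terms in both expressions coincide because $\mathrm{Tr}(\Omega^p) = \mathrm{Tr}(\hat{\Omega}^p)$ for any natural $p$. The corresponding kinetic Hamiltonians have the following form:

$$T = T_{\mathrm{tr}} + T_{\mathrm{int}} = \frac{1}{2m}\,\eta^{AB}\,\hat{p}_A \hat{p}_B + \frac{1}{2\tilde{I}}\,\eta_{KL}\,\hat{\Sigma}^K{}_M\,\hat{\Sigma}^L{}_N\,\eta^{MN} + \frac{1}{2\tilde{A}}\,\mathrm{Tr}(\hat{\Sigma}^2) + \frac{1}{2\tilde{B}}\,(\mathrm{Tr}\,\hat{\Sigma})^2, \tag{3.49}$$

$$T = T_{\mathrm{tr}} + T_{\mathrm{int}} = \frac{1}{2m}\,g^{ij}\,p_i p_j + \frac{1}{2\tilde{I}}\,g_{ik}\,\Sigma^i{}_j\,\Sigma^k{}_l\,g^{jl} + \frac{1}{2\tilde{A}}\,\mathrm{Tr}(\Sigma^2) + \frac{1}{2\tilde{B}}\,(\mathrm{Tr}\,\Sigma)^2. \tag{3.50}$$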
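As a consistency check of the constants $\tilde{I}$, $\tilde{A}$, $\tilde{B}$, one can verify numerically that the internal part of (3.50) reproduces the internal part of (3.48). This is a minimal sketch under stated assumptions: Cartesian coordinates with $g_{ij} = \delta_{ij}$, the Legendre map $\Sigma = I\,\Omega^{T} + A\,\Omega + B\,(\mathrm{Tr}\,\Omega)\,\mathbb{1}$, and illustrative values of $I$, $A$, $B$; all function names are ours.

```python
# Sketch: internal kinetic energy in velocity form versus momentum form
# for the metrical-affine model, with g = identity (assumption).

def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def velocity_energy(omega, i_c, a_c, b_c):
    """(I/2) Tr(Omega Omega^T) + (A/2) Tr(Omega^2) + (B/2) (Tr Omega)^2."""
    return (0.5 * i_c * trace(matmul(omega, transpose(omega)))
            + 0.5 * a_c * trace(matmul(omega, omega))
            + 0.5 * b_c * trace(omega) ** 2)


def legendre_map(omega, i_c, a_c, b_c):
    """Assumed Legendre map: Sigma = I*Omega^T + A*Omega + B*Tr(Omega)*1."""
    n = len(omega)
    tr = trace(omega)
    return [[i_c * omega[j][i] + a_c * omega[i][j]
             + (b_c * tr if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def dual_constants(i_c, a_c, b_c, n):
    """The constants I~, A~, B~ quoted in the text."""
    i_t = (i_c ** 2 - a_c ** 2) / i_c
    a_t = (a_c ** 2 - i_c ** 2) / a_c
    b_t = -(i_c + a_c) * (i_c + a_c + n * b_c) / b_c
    return i_t, a_t, b_t


def momentum_energy(sigma, i_c, a_c, b_c):
    i_t, a_t, b_t = dual_constants(i_c, a_c, b_c, len(sigma))
    return (trace(matmul(sigma, transpose(sigma))) / (2.0 * i_t)
            + trace(matmul(sigma, sigma)) / (2.0 * a_t)
            + trace(sigma) ** 2 / (2.0 * b_t))


OMEGA = [[0.3, -1.2, 0.5], [0.7, 0.1, -0.4], [1.1, 0.6, -0.9]]
I_C, A_C, B_C = 3.0, 1.0, 0.5
E_VELOCITY = velocity_energy(OMEGA, I_C, A_C, B_C)
E_MOMENTUM = momentum_energy(legendre_map(OMEGA, I_C, A_C, B_C), I_C, A_C, B_C)
print(abs(E_VELOCITY - E_MOMENTUM) < 1e-10)
```

The three invariants used here are unchanged under $\Sigma \mapsto \Sigma^{T}$, so the check does not depend on which index of $\Sigma$ is taken as the row index.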
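Let us observe that the metrical ($g$- and $\eta$-dependent) parts of kinetic energies may be alternatively written down in terms of the Cauchy and Green tensors:

$$\frac{m}{2}\,C_{ij}\,v^i v^j + \frac{I}{2}\,C_{ij}\,\Omega^i{}_k\,\Omega^j{}_l\,\tilde{C}^{kl}, \qquad \frac{1}{2m}\,\tilde{C}^{ij}\,p_i p_j + \frac{1}{2\tilde{I}}\,C_{ij}\,\Sigma^i{}_k\,\Sigma^j{}_l\,\tilde{C}^{kl},$$

$$\frac{m}{2}\,G_{AB}\,\hat{v}^A \hat{v}^B + \frac{I}{2}\,G_{AB}\,\hat{\Omega}^A{}_C\,\hat{\Omega}^B{}_D\,\tilde{G}^{CD}, \qquad \frac{1}{2m}\,\tilde{G}^{AB}\,\hat{p}_A \hat{p}_B + \frac{1}{2\tilde{I}}\,G_{AB}\,\hat{\Sigma}^A{}_C\,\hat{\Sigma}^B{}_D\,\tilde{G}^{CD}.$$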
It is important that in a certain open range of triples $(I, A, B) \in \mathbb{R}^3$ the above kinetic energies are positive-definite and at the same time have all the geometrical and analytical advantages of invariant geodetic systems on group manifolds. One can show that the spatially affine and materially metrical model (3.49), or more precisely its internal part, may be expressed as follows:

$$T_{\mathrm{int}} = \frac{1}{2\alpha}\,\mathrm{Tr}(\hat{\Sigma}^2) + \frac{1}{2\beta}\,(\mathrm{Tr}\,\hat{\Sigma})^2 + \frac{1}{2\mu}\,\|V\|^2, \tag{3.51}$$

where

$$\alpha = I + A, \qquad \beta = -\frac{1}{B}\,(I + A)(I + A + nB), \qquad \mu = \frac{1}{I}\,(I^2 - A^2), \tag{3.52}$$

and $\|V\|$ denotes the magnitude of the vorticity,

$$\|V\|^2 = -\frac{1}{2}\,\mathrm{Tr}(V^2).$$

Denoting the $k$th order Casimir invariant built of generators by $C(k)$,

$$C(k) = \mathrm{Tr}(\Sigma^k) = \mathrm{Tr}(\hat{\Sigma}^k),$$

we can write simply:

$$T_{\mathrm{int}} = \frac{1}{2\alpha}\,C(2) + \frac{1}{2\beta}\,C(1)^2 + \frac{1}{2\mu}\,\|V\|^2. \tag{3.53}$$

Similarly, for the spatially metrical and materially affine model (3.50) we have

$$T_{\mathrm{int}} = \frac{1}{2\alpha}\,C(2) + \frac{1}{2\beta}\,C(1)^2 + \frac{1}{2\mu}\,\|S\|^2, \tag{3.54}$$

with the same convention as before concerning the magnitude of spin:

$$\|S\|^2 = -\frac{1}{2}\,\mathrm{Tr}(S^2).$$

Let us note that $\|V\|^2$, $\|S\|^2$ are simply second-order Casimir invariants built of vorticity and spin.
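The passage from the tensor form of the internal Hamiltonian in (3.49) to the Casimir form (3.51) is a purely algebraic identity, which the following sketch checks for one sample matrix. It assumes Cartesian coordinates with $\eta_{AB} = \delta_{AB}$ and takes the vorticity as the doubled skew-symmetric part, $V = \hat{\Sigma} - \hat{\Sigma}^{T}$; the names and sample values are illustrative only.

```python
# Sketch: tensor form of T_int (internal part of (3.49)) versus the
# Casimir form (3.51), with eta = identity (assumption).

def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def vorticity_norm_sq(sigma):
    """||V||^2 = -(1/2) Tr(V^2) with V = Sigma - Sigma^T."""
    n = len(sigma)
    v = [[sigma[i][j] - sigma[j][i] for j in range(n)] for i in range(n)]
    return -0.5 * trace(matmul(v, v))


def tensor_form(sigma, i_c, a_c, b_c):
    n = len(sigma)
    i_t = (i_c ** 2 - a_c ** 2) / i_c
    a_t = (a_c ** 2 - i_c ** 2) / a_c
    b_t = -(i_c + a_c) * (i_c + a_c + n * b_c) / b_c
    return (trace(matmul(sigma, transpose(sigma))) / (2.0 * i_t)
            + trace(matmul(sigma, sigma)) / (2.0 * a_t)
            + trace(sigma) ** 2 / (2.0 * b_t))


def casimir_form(sigma, i_c, a_c, b_c):
    n = len(sigma)
    alpha = i_c + a_c
    beta = -(i_c + a_c) * (i_c + a_c + n * b_c) / b_c
    mu = (i_c ** 2 - a_c ** 2) / i_c
    return (trace(matmul(sigma, sigma)) / (2.0 * alpha)
            + trace(sigma) ** 2 / (2.0 * beta)
            + vorticity_norm_sq(sigma) / (2.0 * mu))


SIGMA = [[1.0, 0.4, -0.3], [-0.8, 0.2, 0.9], [0.5, -0.6, -1.1]]
I_C, A_C, B_C = 3.0, 1.0, 0.5
T_TENSOR = tensor_form(SIGMA, I_C, A_C, B_C)
T_CASIMIR = casimir_form(SIGMA, I_C, A_C, B_C)
print(abs(T_TENSOR - T_CASIMIR) < 1e-12)
```

The identity behind the check is $\mathrm{Tr}(\Sigma\Sigma^{T}) = \mathrm{Tr}(\Sigma^2) + \|V\|^2$ together with $1/\alpha - 1/\mu = 1/\tilde{A}$.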
For the model (3.43), affinely invariant both in the physical and in the material space, we have:

$$T^0_{\mathrm{int}} = \frac{1}{2A}\,C(2) - \frac{1}{2A(n + A/B)}\,C(1)^2. \tag{3.55}$$

It is very convenient to separate dilatational and incompressible motions, especially when affinely invariant kinetic energies (metrics on $Q$) are used. One can easily show that for the affine–affine model (3.43) we have

$$T^0_{\mathrm{int}} = \frac{A}{2}\,\mathrm{Tr}(\nu^2) + \frac{n(A + nB)}{2}\,\dot{q}^2 = T^0_{\mathrm{sh}} + T^0_{\mathrm{dil}}, \tag{3.56}$$

cf. (3.27). Performing the Legendre transformation we obtain

$$T^0_{\mathrm{int}} = \frac{1}{2A}\,\mathrm{Tr}(\sigma^2) + \frac{1}{2n(A + nB)}\,p^2 = T^0_{\mathrm{sh}} + T^0_{\mathrm{dil}}, \tag{3.57}$$

cf. (3.28) and (3.29). Similarly, for the affine–metrical and metrical–affine models one obtains, respectively, the following expressions:

$$T_{\mathrm{int}} = \frac{1}{2(I + A)}\,C_{SL(n)}(2) + \frac{1}{2n(I + A + nB)}\,p^2 + \frac{I}{2(I^2 - A^2)}\,\|V\|^2, \tag{3.58}$$

$$T_{\mathrm{int}} = \frac{1}{2(I + A)}\,C_{SL(n)}(2) + \frac{1}{2n(I + A + nB)}\,p^2 + \frac{I}{2(I^2 - A^2)}\,\|S\|^2, \tag{3.59}$$

where $C_{SL(n)}(k)$ are Casimir invariants built of $\sigma$,

$$C_{SL(n)}(k) = \mathrm{Tr}(\sigma^k) = \mathrm{Tr}(\hat{\sigma}^k).$$

Let us observe that the only difference concerns the last, third term. Both expressions reduce to (3.57) when we substitute $I = 0$. Conversely, they may be obtained from (3.57) by replacing $A \to I + A$ and introducing the mentioned terms:

$$T^{\mathrm{aff–met}}_{\mathrm{int}} = T^0_{\mathrm{int}}[A \to I + A] + \frac{I}{2(I^2 - A^2)}\,\|V\|^2,$$

$$T^{\mathrm{met–aff}}_{\mathrm{int}} = T^0_{\mathrm{int}}[A \to I + A] + \frac{I}{2(I^2 - A^2)}\,\|S\|^2.$$
Our philosophy is to base the dynamics as far as possible on geodetic affinely-invariant models. In particular, geodetic affine–isometric and isometric–affine models are of special interest. They are “as affine as possible” and at the same time compatible with the demand of positive definiteness. Nevertheless, some models with potentials are still of interest, and for non-affine models they are just
unavoidable. So, we shall also consider potential models $H = T + V$, where $V$ depends only on the configuration variables $(x, \varphi)$. As far as inertial properties are concerned, we have concentrated on highly symmetric models; in any case they are always spatially and usually materially isotropic (one can be general in formulation, but not so much in effective analysis). It is natural to assume that the potential energy $V$ is compatible with these invariance properties of the kinetic term. So, $V$ is invariant under internal spatial rotations if and only if it depends on $\varphi$ through the Green tensor $G$. It is invariant under material rotations if and only if it depends on $\varphi$ through the Cauchy tensor $C$. And finally, $V$ is both spatially and materially isotropic in internal degrees of freedom if and only if it depends on $\varphi$ only through the deformation invariants, parameterized, e.g., by $q^1, \ldots, q^n$. There is a very important special case when $V$ is invariant under the volume-preserving groups $SL(V)$, $SL(U)$. This means that it depends on $\varphi$ through the determinant $\det\varphi$. If we use the logarithmic scale of deformation invariants, this means that $V$ is a function of $q = (q^1 + \cdots + q^n)/n$, the “centre of mass” of the logarithmic deformation invariants $q^i$, $i = 1, \ldots, n$. In the kinetic energy models (3.57)–(3.59) the dilatational and shear-rotational degrees of freedom (incompressible motion) are mutually orthogonal; there is no interaction between them. This suggests concentrating also on adapted potentials in which these degrees of freedom are explicitly separated,

$$V(q^1, \ldots, q^n) = V_{\mathrm{dil}}(q) + V_{\mathrm{sh}}(\ldots, q^i - q^j, \ldots); \tag{3.60}$$

the labels “dil” and “sh” refer, respectively, to “dilatation” and “shear”. The most natural scheme for $V_{\mathrm{sh}}$ is that of “binary interactions” between deformation invariants,

$$V_{\mathrm{sh}} = \sum_{i \neq j} f_{ij}(q^i - q^j). \tag{3.61}$$
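The separated structure (3.60), (3.61) can be illustrated with a small sketch. The specific choices below, a quadratic $V_{\mathrm{dil}}$ and the pair function $f(x) = c\,(\mathrm{ch}\,x - 1)$, are hypothetical and serve only to show that the shear part is insensitive to a common shift of all $q^i$ (a pure dilatation), while the dilatational part sees only the mean $q$.

```python
import math

# Sketch of a separated potential V = V_dil(q) + V_sh(..., q_i - q_j, ...).
# The concrete functions below are illustrative assumptions, not taken
# from the text.


def mean_invariant(qs):
    """q = (q_1 + ... + q_n) / n, the 'centre of mass' of the invariants."""
    return sum(qs) / len(qs)


def v_dil(q, k=1.0):
    """Hypothetical dilatation-stabilizing potential (k/2) q^2."""
    return 0.5 * k * q * q


def pair_term(x, c=1.0):
    """Hypothetical even binary interaction f(x) = c (cosh x - 1)."""
    return c * (math.cosh(x) - 1.0)


def v_sh(qs):
    """Sum of binary terms over ordered pairs i != j."""
    n = len(qs)
    return sum(pair_term(qs[i] - qs[j])
               for i in range(n) for j in range(n) if i != j)


def potential(qs):
    return v_dil(mean_invariant(qs)) + v_sh(qs)


QS = [0.2, -0.5, 1.0]
SHIFT = 0.7
QS_SHIFTED = [q + SHIFT for q in QS]
print(abs(v_sh(QS) - v_sh(QS_SHIFTED)) < 1e-12)
```

Under the shift only `v_dil` changes, because the mean moves from $q$ to $q + 0.7$ while all differences $q^i - q^j$ stay the same.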
For isotropic models $H_{\mathrm{int}} = T_{\mathrm{int}} + V(q^1, \ldots, q^n)$ with $T_{\mathrm{int}}$ given by (3.49), (3.51), (3.58), the vorticity $V$ is a constant of motion, and the third term in (3.58) also has vanishing Poisson brackets with all terms of (3.58). The structure of Poisson brackets and equations of motion (3.24) implies that the evolution of the variables $\Sigma^i{}_j$, $q^a$, ruled by the above Hamiltonian $H_{\mathrm{int}}$, is the same as that ruled by

$$H^0_{\mathrm{int}} = T^0_{\mathrm{int}}[A \to I + A] + V(q^1, \ldots, q^n),$$

where $T^0_{\mathrm{int}}[A \to I + A]$ is obtained from $T^0_{\mathrm{int}}$ (3.45) by substituting $(I + A)$ for $A$. The difference occurs only in the degrees of freedom ruled by $SO(V, g)$, $SO(U, \eta)$, i.e. in the time evolution of the quantities $L$, $R$ describing the orientation of the principal axes of the deformation tensors $C$, $G$. If $V$ depends only on the dilatational invariant $q$, then the total motion in $Q$ is a direct product of two independent
things: the geodetic incompressible motion and the autonomous dynamics of the $q$-variable. The deviator

$$\sigma^i{}_j = \Sigma^i{}_j - \frac{1}{n}\,\Sigma^a{}_a\,\delta^i{}_j$$

is then a constant of motion. The general solution for geodetic models based on $T^0_{\mathrm{int}}$ (3.45) is explicitly given by the exponential mapping. Roughly speaking, it is produced from initial conditions by one-parameter subgroups of $GL(V)$, $GL(U)$. And it may be shown on the basis of the properties of matrix exponentials that for the incompressible geodetic affine–affine model

$$T^{0\,\mathrm{sh}}_{\mathrm{int}} = \frac{1}{2A}\,\mathrm{Tr}(\sigma^2), \qquad T^{0\,\mathrm{sh}}_{\mathrm{int}} = \frac{A}{2}\,\mathrm{Tr}(\nu^2)$$

(with the constraint $q = 0$) the general solution contains an open subset of bounded (oscillating) motions and an open subset of unbounded (escaping, dissociated) motions. When dilatations are allowed, then for any Hamiltonian

$$H = T^0_{\mathrm{int}} + V(q)$$

with $T^0_{\mathrm{int}}$ given by (3.57) and $V(q)$ stabilizing dilatations, there also exists an open subfamily of bounded motions (and an open subfamily of unbounded motions if $\sup V < \infty$). The same remains true for the general geodetic affine–metrical model (3.47), (3.49), (3.53) with the incompressibility constraint $q = 0$ and, similarly, without such a constraint but with a dilatation-stabilizing potential $V(q)$. The same arguments may be applied to dilatationally stabilized geodetic models (3.48), (3.50), (3.54) invariant under $O(V, g) \times GL(U)$, or to purely geodetic isochoric models with the symmetry group $O(V, g) \times SL(U)$ (materially special-affine and spatially metrical models). On the level of the state variables $\hat{\Sigma}^A{}_B$, $q^a$ the time evolution is exactly identical with that based on the affine–affine model of $T_{\mathrm{int}}$ (again with $A$ in (3.50) replaced by $I + A$).

Let us summarize the main message. Incompressible affine–affine, affine–metrical and metrical–affine models may encode the dynamics of elastic vibrations without any extra potential, because their general solutions contain open subsets of bounded motions. When no incompressibility constraints are imposed, the same may be achieved by introducing some dilatation-stabilizing potential $V(q)$, e.g. some potential well, the oscillator $V(q) = (k/2)q^2$, etc. The bounded or unbounded character of motion has to do only with the time evolution of the $q^a$-variables, and from this point of view the three models mentioned are essentially identical. The difference appears only on the level of the $L$, $R$ degrees of freedom, but these gyroscopic variables with compact topology cannot influence the property of trajectories to be bounded or escaping.

To finish this classical description we describe everything in terms of the two-polar decomposition. It is convenient to combine the non-holonomic canonical momenta $\hat{\rho}$, $\hat{\tau}$ in the following way:

$$M := -\hat{\rho} - \hat{\tau}, \qquad N := \hat{\rho} - \hat{\tau}. \tag{3.62}$$
This provides a partial diagonalization of the kinetic energy and the elimination of certain interference terms. Namely, after some calculations one obtains the following expressions for the Casimirs:

$$C(2) = \sum_a p_a^2 + \frac{1}{16}\sum_{a,b}\frac{(M_{ab})^2}{\mathrm{sh}^2\frac{q^a - q^b}{2}} - \frac{1}{16}\sum_{a,b}\frac{(N_{ab})^2}{\mathrm{ch}^2\frac{q^a - q^b}{2}}, \tag{3.63}$$

and, obviously,

$$C(1) = p = \sum_a p_a. \tag{3.64}$$

The first term in $C(2)$ may be suggestively decomposed into the “relative” and the overall (“centre of mass”) parts:

$$\frac{1}{2n}\sum_{a,b}(p_a - p_b)^2 + \frac{p^2}{n}.$$
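The decomposition of $\sum_a p_a^2$ into relative and centre-of-mass parts is elementary, and a short numerical sketch makes it concrete. The function name and the sample momenta are illustrative assumptions.

```python
# Sketch: sum_a p_a^2 = (1/2n) sum_{a,b} (p_a - p_b)^2 + p^2 / n,
# with p = sum_a p_a.


def split_momenta(ps):
    """Return the 'relative' and 'centre of mass' parts of sum_a p_a^2."""
    n = len(ps)
    total = sum(ps)
    relative = sum((pa - pb) ** 2 for pa in ps for pb in ps) / (2.0 * n)
    centre = total ** 2 / n
    return relative, centre


PS = [0.4, -1.3, 2.5, 0.7]
RELATIVE, CENTRE = split_momenta(PS)
print(abs(RELATIVE + CENTRE - sum(p * p for p in PS)) < 1e-12)
```

The relative part vanishes when all $p_a$ are equal, i.e. for purely dilatational momenta.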
This enables one to separate the incompressible and purely dilatational parts. It is interesting that $C(2)$, and therefore the kinetic energy itself, has a characteristic lattice structure known from the theory of one-dimensional many-body systems. Here the logarithmic deformation invariants $q^a$ are positions of $n$ indistinguishable fictitious “material points”. Unlike in the usual Sutherland, hyperbolic-Sutherland and Calogero–Moser lattices, where all binary coupling parameters are identical, now the quantities $M^i{}_j$, $N^i{}_j$ are not only non-identical, but also non-constant. Moreover, they are state variables subject, together with the other ones, to a closed system of evolution equations (3.24), where $F$ runs over the quantities $q^i$, $p_i$, $L$, $R$, $M^i{}_j$, $N^i{}_j$. Obviously, one should substitute into (3.24) the following basic Poisson brackets:

$$\{q^a, p_b\} = \delta^a{}_b, \qquad \{q^a, M^c{}_d\} = \{p_a, M^c{}_d\} = \{q^a, N^c{}_d\} = \{p_a, N^c{}_d\} = 0,$$

$$\{M_{ab}, M_{cd}\} = \{N_{ab}, N_{cd}\} = M_{cb}\,g_{ad} - M_{ad}\,g_{cb} + M_{ac}\,g_{db} - M_{db}\,g_{ac},$$

$$\{M_{ab}, N_{cd}\} = N_{cb}\,g_{ad} - N_{ad}\,g_{cb} + N_{ac}\,g_{db} - N_{db}\,g_{ac},$$

where the shift of indices is meant in the $g$-sense. Usually Cartesian orthonormal coordinates are used, and then simply $g_{ab} = \delta_{ab}$. The Poisson brackets for $M = -\hat{\rho} - \hat{\tau}$, $N = \hat{\rho} - \hat{\tau}$ follow from the following ones for $\hat{\rho}$, $\hat{\tau}$:

$$\{\hat{\rho}_{ab}, \hat{\rho}_{cd}\} = -\hat{\rho}_{cb}\,g_{ad} + \hat{\rho}_{ad}\,g_{cb} - \hat{\rho}_{ac}\,g_{db} + \hat{\rho}_{db}\,g_{ac},$$

$$\{\hat{\tau}_{ab}, \hat{\tau}_{cd}\} = -\hat{\tau}_{cb}\,g_{ad} + \hat{\tau}_{ad}\,g_{cb} - \hat{\tau}_{ac}\,g_{db} + \hat{\tau}_{db}\,g_{ac},$$

$$\{\hat{\rho}_{ab}, \hat{\tau}_{cd}\} = 0.$$

These brackets are based on the structure constants of $SO(n, \mathbb{R})$, because $\hat{\rho}$, $\hat{\tau}$ (similarly to $S$, $-V$) are the corresponding Hamiltonian generators of $SO(n, \mathbb{R})$.
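That the bracket quoted for $\hat{\rho}$ (and $\hat{\tau}$) really has the structure of $SO(n, \mathbb{R})$ can be seen by comparing it with the commutator of antisymmetric matrix units. The sketch below assumes $g_{ab} = \delta_{ab}$ and represents $\hat{\rho}_{ab}$ by $X_{ab} = E_{ab} - E_{ba}$; this matrix realization and the helper names are our assumptions, introduced only for the check.

```python
# Sketch: the commutator of antisymmetric matrix units X_ab = E_ab - E_ba
# reproduces the right-hand side of the rho-bracket with g = identity:
# [X_ab, X_cd] = -X_cb g_ad + X_ad g_cb - X_ac g_db + X_db g_ac.

N_DIM = 3


def delta(a, b):
    return 1 if a == b else 0


def generator(a, b):
    """Antisymmetric matrix unit X_ab (the zero matrix when a == b)."""
    m = [[0] * N_DIM for _ in range(N_DIM)]
    m[a][b] += 1
    m[b][a] -= 1
    return m


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(N_DIM))
             for j in range(N_DIM)] for i in range(N_DIM)]


def commutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(N_DIM)]
            for i in range(N_DIM)]


def bracket_rhs(a, b, c, d):
    """Right-hand side of the rho-bracket, evaluated on the matrices X."""
    terms = [(-delta(a, d), generator(c, b)),
             (delta(c, b), generator(a, d)),
             (-delta(d, b), generator(a, c)),
             (delta(a, c), generator(d, b))]
    return [[sum(coef * mat[i][j] for coef, mat in terms)
             for j in range(N_DIM)] for i in range(N_DIM)]


def structure_constants_match():
    idx = range(N_DIM)
    return all(commutator(generator(a, b), generator(c, d))
               == bracket_rhs(a, b, c, d)
               for a in idx for b in idx for c in idx for d in idx)


MATCH = structure_constants_match()
print(MATCH)
```

All arithmetic is over integers, so the comparison is exact. The brackets for $M$ and $N$ then follow by linearity from $M = -\hat{\rho} - \hat{\tau}$, $N = \hat{\rho} - \hat{\tau}$ and $\{\hat{\rho}_{ab}, \hat{\tau}_{cd}\} = 0$.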
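In the affine kinetic energies (3.57)–(3.59) the second-order $SL(n, \mathbb{R})$-Casimir invariant has the form:

$$C_{SL(n)}(2) = \frac{1}{2n}\sum_{a,b}(p_a - p_b)^2 + \frac{1}{16}\sum_{a,b}\frac{(M_{ab})^2}{\mathrm{sh}^2\frac{q^a - q^b}{2}} - \frac{1}{16}\sum_{a,b}\frac{(N_{ab})^2}{\mathrm{ch}^2\frac{q^a - q^b}{2}}. \tag{3.65}$$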
It is seen from (3.53)–(3.55), (3.57)–(3.59), (3.62)–(3.64) that the $M$-term describes some effective centrifugal repulsion of deformation invariants, whereas the $N$-term is a model of “centrifugal attraction” between the $q^a$-“particles”. This has nothing to do with any potential $V(q^1, \ldots, q^n)$; this attraction is due only to the negative contribution to the affine–affine kinetic energy (3.57). In this way an apparently “embarrassing” feature turns out to be just desirable. The affine–metrical and metrical–affine models (3.58) and (3.59) may be respectively written as follows:

$$T^{\mathrm{aff–met}}_{\mathrm{int}} = T^{\mathrm{aff–aff}}_{\mathrm{int}} + \frac{I}{2(I^2 - A^2)}\,\|V\|^2, \tag{3.66}$$

$$T^{\mathrm{met–aff}}_{\mathrm{int}} = T^{\mathrm{aff–aff}}_{\mathrm{int}} + \frac{I}{2(I^2 - A^2)}\,\|S\|^2, \tag{3.67}$$
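where $T^{\mathrm{aff–aff}}_{\mathrm{int}}$ here is just (3.57) but with $A$ replaced by $(I + A)$. Let us repeat the explicit formula for $T^{\mathrm{aff–aff}}_{\mathrm{int}}$ in terms of the two-polar parametrization and the shear-dilatation splitting:

$$T^{\mathrm{aff–aff}}_{\mathrm{int}} = \frac{1}{4(I + A)n}\sum_{a,b}(p_a - p_b)^2 + \frac{1}{32(I + A)}\sum_{a,b}\frac{(M_{ab})^2}{\mathrm{sh}^2\frac{q^a - q^b}{2}} - \frac{1}{32(I + A)}\sum_{a,b}\frac{(N_{ab})^2}{\mathrm{ch}^2\frac{q^a - q^b}{2}} + \frac{p^2}{2n(I + A + nB)}. \tag{3.68}$$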
It is interesting that for Hamiltonians of the form $H = T + V(q^1, \ldots, q^n)$, with potentials depending on deformation invariants only, the models (3.66)–(3.68) give exactly the same evolution equations for the system of state variables $(\ldots, q^a, \ldots; \ldots, p_a, \ldots; \ldots, M^a{}_b, \ldots; \ldots, N^a{}_b, \ldots)$. This follows from the basic Poisson brackets quoted above. The only distinction between these three models appears on the level of the variables $L$, $R$, i.e. the principal axes of the Cauchy and Green deformation tensors. These degrees of freedom have compact topology; thus they do not influence the bounded or unbounded character of motion.
It is roughly seen from (3.66)–(3.68), and may be rigorously shown, that the incompressible sector of our state variables admits an open family of bounded motions and an open family of unbounded ones even in purely geodetic models (without potential). This is interesting because invariant geodetic systems on Lie groups ($SL(n, \mathbb{R})$ this time) may be successfully analysed in terms of the exponential mapping and special functions on groups. Obviously, the dilatational sector violates these nice features. Without potential energy the dilatational parameter $q$ moves uniformly in time and the total motion is unbounded. The only bounded solutions, $q = \mathrm{const}$, are exponentially unstable on the level of the physical $\varphi$-variables. Therefore, the “maximally geodetic” affinely-invariant systems have the form $H = T + V(q)$, where $V$ stabilizes dilatations. There is no interaction between dilatational and shear-rotational degrees of freedom. The dilatational parameter $q$ is subject to the one-dimensional dynamics ruled by the Hamiltonian

$$H_{\mathrm{dil}} = \frac{p^2}{2n(I + A + nB)} + V(q).$$
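The one-dimensional dilatational dynamics can be illustrated numerically. The sketch below assumes the oscillator potential $V(q) = (k/2)q^2$ mentioned earlier as an example, treats $n(I + A + nB)$ as an effective mass, and uses a standard kick-drift-kick (velocity Verlet) step; the parameter values and function names are illustrative assumptions.

```python
# Sketch: dilatational dynamics H_dil = p^2 / (2 n (I + A + nB)) + (k/2) q^2
# integrated with a velocity Verlet (kick-drift-kick) scheme.


def effective_mass(n, i_c, a_c, b_c):
    """The combination n (I + A + nB) multiplying q-dot squared."""
    return n * (i_c + a_c + n * b_c)


def energy(q, p, mass, k):
    return p * p / (2.0 * mass) + 0.5 * k * q * q


def simulate(q0, p0, mass, k, dt, steps):
    """Return the final (q, p) and the largest |q| met along the way."""
    q, p = q0, p0
    max_abs_q = abs(q)
    for _ in range(steps):
        p -= 0.5 * dt * k * q      # half kick from the force -k q
        q += dt * p / mass         # drift
        p -= 0.5 * dt * k * q      # second half kick
        max_abs_q = max(max_abs_q, abs(q))
    return q, p, max_abs_q


MASS = effective_mass(3, 1.0, 1.0, 0.5)
K_SPRING = 1.0
Q0, P0 = 1.0, 0.0
E_START = energy(Q0, P0, MASS, K_SPRING)
Q_END, P_END, MAX_ABS_Q = simulate(Q0, P0, MASS, K_SPRING, 0.01, 20000)
E_END = energy(Q_END, P_END, MASS, K_SPRING)
print(abs(E_END - E_START) < 1e-4 * E_START, MAX_ABS_Q < 1.001)
```

With a stabilizing $V$ the variable $q$ stays bounded; setting `K_SPRING = 0.0` and a nonzero `P0` instead gives the uniform, unbounded drift of $q$ described above.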
The same is true in a more general situation when the potential energy depends also on the shear variables (non-geodetic models) and has the explicitly separated form $V(q^1, \ldots, q^n) = V_{\mathrm{dil}}(q) + V_{\mathrm{sh}}(\ldots, q^i - q^j, \ldots)$, cf. (3.60) and (3.61); usually the effective models of $V_{\mathrm{sh}}$ will have the binary structure (3.61). Finally, let us quote the two-polar representation of the doubly-isotropic d'Alembert model:

$$T_{\mathrm{int}} = \frac{1}{2I}\sum_a P_a^2 + \frac{1}{8I}\sum_{a,b}\frac{(M_{ab})^2}{(Q^a - Q^b)^2} + \frac{1}{8I}\sum_{a,b}\frac{(N_{ab})^2}{(Q^a + Q^b)^2}. \tag{3.69}$$
As mentioned earlier, on the purely geodetic level it would be completely non-physical. Here it is seen explicitly that $T_{\mathrm{int}}$ is purely repulsive on the level of the $Q$-variables. All realistic models should be based on some potential term, $H = T_{\mathrm{int}} + V(Q^1, \ldots, Q^n)$. The binary structure of $T_{\mathrm{int}}$ resembles the Calogero–Moser lattices. In fact, the general scattering solution of the Calogero–Moser chain is a subfamily of the general solution of the geodetic model (3.69).
3.3 General Ideas of Quantization

After all the above classical preliminaries we can formulate the general ideas of quantization. In practice we restrict ourselves to models based on Riemannian structures in the configuration space and on kinetic energies quadratic in velocities. Let us mention in this connection the idea of Capriz [64] about kinetic energies of a more general type, i.e. non-quadratic ones. Such models in fact appear in relativistic problems and may be useful in complicated problems of condensed matter theory, defect dynamics, etc. However, it may be very difficult to use them in quantization problems, because they may require pseudo-differential operators; this may be hopelessly difficult in curved configuration spaces. So, we remain within the traditional Schrödinger framework.

Let us assume that the classical problem is based (as above) on the Riemann structure $\Gamma$, i.e. on the kinetic energy form

$$T = \frac{1}{2}\,\Gamma_{\mu\nu}(q)\,\frac{dq^\mu}{dt}\,\frac{dq^\nu}{dt},$$

or, if canonical language is used, on the kinetic Hamiltonian

$$T = \frac{1}{2}\,\Gamma^{\mu\nu}(q)\,p_\mu p_\nu.$$

The Riemannian volume element is given by

$$d\mu_\Gamma(q) = \sqrt{|\det[\Gamma_{\mu\nu}]|}\;dq^1 \ldots dq^f.$$

The quantum-mechanical formulation is based on the Hilbert space $L^2(Q, \mu_\Gamma)$ of $\mathbb{C}$-valued functions with the scalar product

$$\langle\Psi_1|\Psi_2\rangle = \int \overline{\Psi_1}(q)\,\Psi_2(q)\,d\mu_\Gamma(q).$$

The quantum operator of the kinetic energy is given by

$$\mathbf{T} = -\frac{\hbar^2}{2}\,\Delta(\Gamma),$$

where $\hbar$ is the “crossed” Planck constant and $\Delta(\Gamma)$ denotes the Laplace–Beltrami operator of $\Gamma$:

$$\Delta(\Gamma) = \frac{1}{\sqrt{|\Gamma|}}\sum_{\mu,\nu}\partial_\mu\,\sqrt{|\Gamma|}\,\Gamma^{\mu\nu}\,\partial_\nu = \Gamma^{\mu\nu}\,\nabla_\mu\nabla_\nu. \tag{3.70}$$

Obviously, $\nabla$ denotes the Levi-Civita covariant derivative induced by the $\Gamma$-metric. This means that the quantum kinetic energy is obtained from the
classical one by formally replacing $p_\mu$ in $T$ by the operator $\mathbf{p}_\mu = (\hbar/i)\nabla_\mu$. Parallel transports preserve $\Gamma$ and $\sqrt{|\Gamma|}$; thus, $\mathbf{p}_\mu$ is formally self-adjoint in $L^2(Q, \mu_\Gamma)$. When the classical problem is non-geodetic and based on some potential $V(q^1, \ldots, q^f)$, i.e. on the Hamiltonian $H = T + V$, then the corresponding quantum Hamiltonian is given by $\mathbf{H} = \mathbf{T} + \mathbf{V}$, where $\mathbf{V}$ denotes the operator multiplying wave functions by the potential $V$, i.e. $\mathbf{V}\Psi = V\Psi$; usually we do not distinguish them graphically. Velocity-dependent generalized (magnetic) potentials are not considered here.

Strictly speaking, from the fundamental point of view wave functions are not scalars but scalar densities of weight 1/2, and the squared moduli are scalar densities of weight one. But in all realistic models, and we do not go outside this scope, some Riemann structure $\Gamma$ is used and all tensor densities are factorized into tensors and standard densities built of $\Gamma$. In particular, the 1/2-densities describing pure quantum states factorize as $\Psi\sqrt[4]{|\Gamma|}$, where $\Psi$ are just the aforementioned scalar wave functions.

Let us also mention that, despite some current views, the one-valuedness of wave functions is not a fundamental assumption of quantum mechanics. There are at least some situations where multivalued amplitudes seem to be acceptable. First of all, it is so when the configuration space $Q$ is multiply connected and has a finite homotopy group. Then it is natural to define the wave functions on the covering manifold $\bar{Q}$. They need not project onto $Q$ as one-valued amplitudes, but it seems natural to demand that, according to the statistical interpretation, the squared moduli are uniquely projectable. This has to do with projective representations. And just such situations are interesting in our model, where the configuration spaces of rigid and affinely rigid bodies in dimensions $n \geq 3$ have two-element homotopy groups. This point was stressed, e.g., in Refs. [73–78], where the possibility of doubly-valued wave functions for the quantized rigid body was pointed out.

Literally performed calculations of Laplace–Beltrami operators are usually very difficult and the results are rather unreadable. It is much more convenient to use directly the operators $\hat{H}_A$, $H_i$, $\hat{E}^A{}_B$, $E^i{}_j$ introduced formerly and the classical expressions for kinetic Hamiltonians based on these quantities. Then we can easily define quantum operators representing the corresponding physical quantities, e.g. $p_i$, $\hat{p}_A$, $\Sigma^i{}_j$, $\hat{\Sigma}^A{}_B$. Hilbert spaces may be constructed without calculating the complicated coordinate expressions for the metric tensors $\Gamma_{\mu\nu}$ and their densities $\sqrt{|\Gamma|}$. Namely, our configuration spaces may be in a sense identified with Lie groups (more precisely, their group spaces), therefore we can simply use Haar measures, which are explicitly known and given by simple expressions. We are usually dealing with left- and right-invariant metrics $\Gamma$; thus, the corresponding measures $\mu_\Gamma$ are also
invariant, and just coincide with the invariant Haar measures, because the latter are unique (up to normalization). First of all, let us observe that our configuration space $Q = M \times LI(U, V)$ as an affine space (with the translation space $V \times L(U, V)$) is endowed with the natural Lebesgue measure $a$, unique up to normalization. Fixing metric tensors $g \in V^* \otimes V^*$, $\eta \in U^* \otimes U^*$ and some adapted Cartesian coordinates $x^i$, $a^K$, $\varphi^i{}_K$ (orthonormal with respect to these tensors), we can normalize $a$ as follows:

$$da(x, \varphi) = dx^1 \ldots dx^n\,d\varphi^1{}_1 \ldots d\varphi^n{}_n.$$

When translational degrees of freedom are neglected, we use the usual Lebesgue measure $l$ on $LI(U, V)$ as an open subset of the linear space $L(U, V)$:

$$dl(\varphi) = d\varphi^1{}_1 \ldots d\varphi^n{}_n.$$

These measures are invariant under translations in the affine space $M \times L(U, V)$ and under spatial and material isometries. They are, however, non-invariant under spatial and material affine transformations. To achieve affine invariance we must use the following Haar measures $\alpha$, $\lambda$ on $Q = M \times LI(U, V)$ and $Q_{\mathrm{int}} = LI(U, V)$, induced from the affine group $GAf(n, \mathbb{R}) \simeq GL(n, \mathbb{R}) \times_s \mathbb{R}^n$ and the linear group $GL(n, \mathbb{R})$:

$$d\alpha(\varphi, x) = (\det\varphi)^{-n-1}\,da(x, \varphi) = (\det\varphi)^{-n-1}\,dx^1 \ldots dx^n\,d\varphi^1{}_1 \ldots d\varphi^n{}_n,$$

$$d\lambda(\varphi) = (\det\varphi)^{-n}\,dl(\varphi) = (\det\varphi)^{-n}\,d\varphi^1{}_1 \ldots d\varphi^n{}_n.$$

Expressing the measure $\lambda$ in terms of the two-polar decomposition $\varphi = LDR^{-1}$ we obtain

$$d\lambda(\varphi) = d\lambda(L; q^a; R) = \prod_{i \neq j}|\mathrm{sh}(q^i - q^j)|\;dq^1 \ldots dq^n\,d\mu(L)\,d\mu(R),$$

where $\mu$ is the left- and right-invariant Haar measure on the manifolds of linear isometries $LIs(\mathbb{R}^n, \delta; V, g)$, $LIs(\mathbb{R}^n, \delta; U, \eta)$. Obviously, when $LI(U, V)$ is identified with $GL(n, \mathbb{R})$ and the mentioned manifolds of isometries are identified with the orthogonal group $SO(n, \mathbb{R})$, then $\mu$ becomes simply the literally understood Haar measure on $SO(n, \mathbb{R})$. As the manifolds $LIs$ and the group $SO(n, \mathbb{R})$ are compact, the measure $\mu$ may, although need not, be normalized to unity (the manifold volume then equals unity). In certain formulas it is convenient to use the symbol

$$P_\lambda := \prod_{i \neq j}|\mathrm{sh}(q^i - q^j)|; \tag{3.71}$$

thus, $d\lambda(\varphi) = P_\lambda\,dq^1 \ldots dq^n\,d\mu(L)\,d\mu(R)$.
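The invariance of $d\lambda = (\det\varphi)^{-n}\,dl$ under left multiplication $\varphi \mapsto A\varphi$ rests on the fact that this linear map on $n \times n$ matrices has Jacobian determinant $(\det A)^n$. The sketch below checks this for $n = 2$. It is only an illustration: the density is taken as $|\det\varphi|^{-n}$ (the absolute value is our assumption, to avoid sign issues), and the sample matrices and helper names are hypothetical.

```python
# Sketch: the map phi -> A phi on n x n matrices has Jacobian det(A)^n,
# so the density |det phi|^(-n) with respect to Lebesgue measure is invariant.


def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [list(map(float, row)) for row in m]
    n = len(m)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def left_translation_jacobian(a):
    """Matrix of phi -> a phi acting on row-major flattened phi."""
    n = len(a)
    jac = [[0.0] * (n * n) for _ in range(n * n)]
    for i in range(n):
        for k in range(n):
            for j in range(n):
                # (a phi)[i][k] = sum_j a[i][j] * phi[j][k]
                jac[i * n + k][j * n + k] = a[i][j]
    return jac


def haar_density(phi):
    """Density of d(lambda) relative to d(l): |det phi|^(-n)."""
    return abs(det(phi)) ** (-len(phi))


A_MAT = [[2.0, 1.0], [0.5, 3.0]]
PHI = [[1.0, 2.0], [3.0, 5.0]]
JACOBIAN_DET = det(left_translation_jacobian(A_MAT))
TRANSFORMED = haar_density(matmul(A_MAT, PHI)) * abs(JACOBIAN_DET)
print(abs(JACOBIAN_DET - det(A_MAT) ** 2) < 1e-9,
      abs(TRANSFORMED - haar_density(PHI)) < 1e-9)
```

The same argument with $\varphi \mapsto \varphi B$ gives the Jacobian $(\det B)^n$, which is why $\lambda$ is both left- and right-invariant.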
One can also obtain the following convenient expression for the Lebesgue measure $l$:

$$dl = P_l\,dQ^1 \ldots dQ^n\,d\mu(L)\,d\mu(R),$$

where

$$P_l = \prod_{i \neq j}\left((Q^i)^2 - (Q^j)^2\right) = \prod_{i \neq j}(Q^i + Q^j)(Q^i - Q^j), \tag{3.72}$$

and, as we remember, $Q^a = \exp(q^a)$. The Haar measure on the internal configuration space of the isochoric (incompressible) affinely-rigid body may be expressed in terms of the Dirac distribution:

$$d\lambda_{SL}(\varphi) = P_\lambda(q^1, \ldots, q^n)\,\delta(q^1 + \cdots + q^n)\,dq^1 \ldots dq^n\,d\mu(L)\,d\mu(R).$$

Our quantum-mechanical models will be based on the Hilbert spaces $L^2(Q, a)$, $L^2(Q_{\mathrm{int}}, l)$, $L^2(Q, \alpha)$, and $L^2(Q_{\mathrm{int}}, \lambda)$. Obviously, for affinely-invariant models $L^2(Q, \alpha)$, $L^2(Q_{\mathrm{int}}, \lambda)$ are more convenient. Similarly, for the usual d'Alembert models $L^2(Q, a)$, $L^2(Q_{\mathrm{int}}, l)$ are more natural. Nevertheless, it is a matter of convenience; one should stress that both types of models may be formulated in terms of any of these Hilbert spaces.

The spatial and material actions of $GAf(M)$ and $GAf(N)$, (3.2), (3.3), on the configuration space $Q$ preserve the Haar measure $\alpha$. Similarly, (3.4), (3.5) preserve the Haar measure $\lambda$ on the internal configuration space $Q_{\mathrm{int}}$. On the other hand, except for isometries, they do not preserve the usual Lebesgue measures on affine spaces, i.e. $a$, $l$. The latter are invariant, however, under the usual affine translations given analytically by

$$(\ldots, x^i, \ldots; \ldots, \varphi^i{}_A, \ldots) \mapsto (\ldots, x^i + \xi^i, \ldots; \ldots, \varphi^i{}_A + \xi^i{}_A, \ldots),$$

i.e. just the usual additive translations in $M \times L(U, V)$ and $L(U, V)$ as affine spaces. (Remark: on $Q$ and $Q_{\mathrm{int}}$, when $L(U, V)$ is replaced by its open subset $LI(U, V)$, these translations act only locally.) On the other hand, these additive translations in general do not preserve the Haar measures $\alpha$, $\lambda$.

All the mentioned groups act argument-wise on wave functions. When they preserve the measure on $Q$ or $Q_{\mathrm{int}}$, the resulting transformations of wave functions preserve the corresponding $L^2$-spaces and are unitary, i.e. they preserve the scalar products too. Let us quote explicitly some expressions, at least to fix the notation used later on. For any $A \in GAf(M)$ we define the operator $\mathbf{A}$ which transforms the wave function $\Psi : AfI(N, M) \to \mathbb{C}$ into the one given by

$$(\mathbf{A}\Psi)(\Phi) = \Psi(A \circ \Phi). \tag{3.73}$$
Similarly, for any $B \in GAf(N)$ we define the operator $\mathbf{B}$ such that

$$(\mathbf{B}\Psi)(\Phi) = \Psi(\Phi \circ B). \tag{3.74}$$

If translational degrees of freedom are neglected and we deal with wave functions $\Psi : LI(U, V) \to \mathbb{C}$, then for any $A \in GL(V)$, $B \in GL(U)$ we define

$$(\mathbf{A}\Psi)(\varphi) = \Psi(A\varphi), \tag{3.75}$$

$$(\mathbf{B}\Psi)(\varphi) = \Psi(\varphi B). \tag{3.76}$$

Obviously, for any $A$, $B$ the operators $\mathbf{A}$, $\mathbf{B}$ are unitary in $L^2(Q, \alpha)$, $L^2(Q_{\mathrm{int}}, \lambda)$, because the measures $\alpha$, $\lambda$ are invariant under regular translations. By contrast, they are not unitary in $L^2(Q, a)$, $L^2(Q_{\mathrm{int}}, l)$, unless $A$, $B$ are volume-preserving mappings, i.e. elements of $SAf(M)$, $SAf(N)$, $SL(V)$, $SL(U)$ (more precisely, unimodularity is sufficient, i.e. $\det L(A) = \det L(B) = \pm 1$). Obviously, the differential operators (vector fields) $H_a$, $E^i{}_j$ defined in (3.23), (3.25) are generators of the unitary groups defined in (3.73), (3.75); therefore, they are formally anti-self-adjoint in $L^2(Q, \alpha)$, $L^2(Q_{\mathrm{int}}, \lambda)$, or rather in the subspaces of smooth functions. Being unbounded (non-continuous), they are not anti-Hermitian in the rigorous mathematical sense. However, they are so in the rough terms used in physics, and they possess anti-Hermitian extensions. The following differential operators:

$$P_a = \frac{\hbar}{i}\,H_a = \frac{\hbar}{i}\,\frac{\partial}{\partial x^a}, \qquad \Sigma^a{}_b = \frac{\hbar}{i}\,E^a{}_b = \frac{\hbar}{i}\,\varphi^a{}_K\,\frac{\partial}{\partial \varphi^b{}_K}$$

are formally Hermitian. They are, respectively, the quantum linear momentum and hyperspin operators. One can also introduce the operator of the total affine momentum (hypermomentum)

$$J^a{}_b = x^a P_b + \Sigma^a{}_b = \Lambda^a{}_b + \Sigma^a{}_b.$$

The ordering of non-commuting operators is meant just as written above; it follows from their geometric nature as group generators. Obviously, the coordinate operators are defined in the usual way,

$$(\mathbf{x}^a\Psi)(x, \varphi) = x^a\,\Psi(x, \varphi), \qquad (\boldsymbol{\varphi}^a{}_K\Psi)(x, \varphi) = \varphi^a{}_K\,\Psi(x, \varphi).$$

If we define the quantum Poisson bracket in the usual way,

$$\{\mathbf{A}, \mathbf{B}\} := \frac{1}{i\hbar}\,[\mathbf{A}, \mathbf{B}] = \frac{1}{i\hbar}\,(\mathbf{A}\mathbf{B} - \mathbf{B}\mathbf{A}),$$

then the above basic quantities satisfy rules identical with the classical ones (3.23), (3.24), (3.26).
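The same concerns the co-moving representants based on the differential operators $\hat{H}_A$, $\hat{E}^A{}_B$. The corresponding formally Hermitian operators
$$\hat{P}_A = \frac{\hbar}{i}\hat{H}_A = \frac{\hbar}{i}\varphi^a{}_A\frac{\partial}{\partial x^a}, \qquad \hat{\Sigma}^A{}_B = \frac{\hbar}{i}\hat{E}^A{}_B = \frac{\hbar}{i}\varphi^a{}_B\frac{\partial}{\partial\varphi^a{}_A},$$
and
$$\hat{J}^K{}_L = a^K\hat{P}_L + \hat{\Sigma}^K{}_L = \hat{\Lambda}^K{}_L + \hat{\Sigma}^K{}_L$$
are quantum generators of GAf(N), GL(U). Operators of angular momenta are given by the doubled $g$-skew-symmetric parts of the affine momenta:
$$S^a{}_b = \Sigma^a{}_b - g^{ac}g_{bd}\Sigma^d{}_c, \qquad L^a{}_b = \Lambda^a{}_b - g^{ac}g_{bd}\Lambda^d{}_c,$$
and
$$\bar{J}^a{}_b = J^a{}_b - g^{ac}g_{bd}J^d{}_c = L^a{}_b + S^a{}_b.$$
They are, respectively, the spin, orbital, and total angular momentum. Similarly, the quantum vorticity operator is given by
$$V^A{}_B = \hat{\Sigma}^A{}_B - \eta^{AC}\eta_{BD}\hat{\Sigma}^D{}_C.$$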
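The index structure of the "doubled g-skew-symmetric part" can be checked on ordinary numerical matrices. The following is a minimal sketch, not part of the original text: plain rational matrices stand in for the operator-valued components, and the metric `g`, its inverse `g_inv`, and the sample `sigma` are hypothetical illustrative data.

```python
from fractions import Fraction

def g_skew_part(sigma, g, g_inv):
    """Return S^a_b = Sigma^a_b - g^{ac} g_{bd} Sigma^d_c, with sigma[a][b] = Sigma^a_b."""
    n = len(sigma)
    return [[sigma[a][b] - sum(g_inv[a][c] * g[b][d] * sigma[d][c]
                               for c in range(n) for d in range(n))
             for b in range(n)] for a in range(n)]

def lower_first_index(t, g):
    """Return T_{ab} = g_{ac} T^c_b."""
    n = len(t)
    return [[sum(g[a][c] * t[c][b] for c in range(n)) for b in range(n)]
            for a in range(n)]

# Hypothetical sample data: a metric with det g = 1, its inverse, and an arbitrary Sigma.
g = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(1)]]
g_inv = [[Fraction(1), Fraction(-1)], [Fraction(-1), Fraction(2)]]
sigma = [[Fraction(3), Fraction(5)], [Fraction(-7), Fraction(11)]]

S = g_skew_part(sigma, g, g_inv)
S_low = lower_first_index(S, g)  # S_{ab}; antisymmetric by construction
```

Applying `g_skew_part` to `S` itself returns `2 * S`, which is the sense in which the operation doubles a quantity that is already g-skew-symmetric.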
The canonical linear momentum conjugate to $\varphi^i{}_A$ is represented on the quantum level by the operator
$$P^A{}_i = \frac{\hbar}{i}\frac{\partial}{\partial\varphi^i{}_A}.$$
Important: it is not formally Hermitian in $L^2(Q,\alpha)$, $L^2(Q_{\rm int},\lambda)$. Indeed, classically it generates additive translations in L(U, V):
$$\varphi^i{}_A \mapsto \varphi^i{}_A + \xi^i{}_A,$$
and those do not preserve the Haar measures $\alpha$, $\lambda$. But they preserve the Lebesgue measures $a$, $l$; therefore, $P^A{}_i$ is formally Hermitian in $L^2(Q,a)$ and $L^2(Q_{\rm int},l)$. Because of this, these Hilbert spaces are more convenient for describing the quantization of models based on the usual d’Alembert principle. However, in all kinds of models the “non-usual” Hilbert spaces may also be applied; simply the definition of some operators must be modified. For example, affine models may as well be quantized in $L^2(Q,a)$ and $L^2(Q_{\rm int},l)$, but the operators $\mathbf{A}$, $\mathbf{B}$ in (3.73), (3.75) are to be replaced by ${}'\mathbf{A}$, ${}'\mathbf{B}$ given by
$${}'\mathbf{A} := (\det L[A])^{-(n+1)/2}\,\mathbf{A}, \qquad {}'\mathbf{B} := (\det B)^{-n/2}\,\mathbf{B}.$$
Quantization of Affine Bodies: Theory and Applications in Mechanics of Structured Media
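More explicitly,
$$({}'\mathbf{A}\Psi)(x,\varphi) = (\det L[A])^{-(n+1)/2}\,\Psi(A(x), L[A]\varphi), \qquad ({}'\mathbf{B}\Psi)(x,\varphi) = (\det B)^{-n/2}\,\Psi(x,\varphi B).$$
Similarly, when considering only the action of $A \in \mathrm{GL}(V)$ on internal degrees of freedom, we define ${}'\mathbf{A} := (\det A)^{-n/2}\mathbf{A}$, i.e.
$$({}'\mathbf{A}\Psi)(x,\varphi) = (\det A)^{-n/2}\,\Psi(x, A\varphi).$$
Due to the multiplicative factors, the above operators are unitary in Hilbert spaces based on the Lebesgue measures. Their infinitesimal generators are then modified by additive correction terms, due to which they become formally Hermitian in $L^2(Q,a)$ and $L^2(Q_{\rm int},l)$. For example, $\Lambda^a{}_b$, $\Sigma^a{}_b$, $\hat{\Sigma}^A{}_B$ are, respectively, replaced by
$${}'\Lambda^a{}_b = \Lambda^a{}_b + \frac{\hbar n}{2i}\delta^a{}_b, \qquad {}'\Sigma^a{}_b = \Sigma^a{}_b + \frac{\hbar n}{2i}\delta^a{}_b, \qquad {}'\hat{\Sigma}^A{}_B = \hat{\Sigma}^A{}_B + \frac{\hbar n}{2i}\delta^A{}_B.$$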
It is easy to see that the linear momentum $P_a$, spin $S^a{}_b$, and vorticity $V^A{}_B$ remain unchanged. The finite actions generated by them preserve the Lebesgue measures. Similarly, when quantizing the d’Alembert models with the use of the Hilbert spaces $L^2(Q,\alpha)$ and $L^2(Q_{\rm int},\lambda)$, which are non-usual for them, we would have to modify $P^A{}_i = (\hbar/i)\,\partial/\partial\varphi^i{}_A$, but we shall not do this here. Nevertheless, it must be stressed that quantization in terms of “non-usual” Hilbert spaces may be convenient when one is interested in comparing various models. As mentioned, the best tool for quantizing the d’Alembert model is offered by the geometry of the Hilbert spaces $L^2(Q,a)$ and $L^2(Q_{\rm int},l)$. Then the quantized version of the kinetic energy (3.33) is given by the operator
$$T = T_{\rm tr} + T_{\rm int} = \frac{1}{2m}g^{ij}\mathbf{p}_i\mathbf{p}_j + \frac{1}{2}\tilde{J}_{AB}\,P^A{}_i P^B{}_j\, g^{ij}. \tag{3.77}$$
Explicitly, this is a kind of “Laplace operator” in the $n(n+1)$-dimensional Euclidean space:
$$T = -\frac{\hbar^2}{2m}g^{ij}\frac{\partial^2}{\partial x^i\partial x^j} - \frac{\hbar^2}{2}\tilde{J}_{AB}\,g^{ij}\frac{\partial^2}{\partial\varphi^i{}_A\,\partial\varphi^j{}_B}.$$
Unfortunately, geodetic (potential-free) models are non-physical because they predict only escaping, unbounded classical motion and a purely continuous positive spectrum after quantization (no bound states). And for realistic potentials the variables $(x^i, \varphi^i{}_A)$ are rather inadequate. The classical expressions
(3.34) and (3.35) are inconvenient for quantization because they suffer from the embarrassing problem of operator ordering. There are no such problems with models based on affine invariance, i.e. (3.39) and (3.40). Let us recall that the first of them is affinely invariant in the physical space and isometry-invariant in the material space. On the contrary, the other one is isometry-invariant in the physical space and affinely invariant in the body. There are no ordering problems, and the quantum operators of the kinetic energy may be obtained immediately via the simple replacement of the classical linear momentum and affine spin by the operators just written down. Therefore, for the quantized versions of (3.39) and (3.40) we, respectively, obtain
$$T = \frac{1}{2m}\eta^{AB}\hat{\mathbf{p}}_A\hat{\mathbf{p}}_B + \frac{1}{2}\tilde{L}^{B}{}_{A}{}^{D}{}_{C}\,\hat{\Sigma}^A{}_B\hat{\Sigma}^C{}_D = -\frac{\hbar^2}{2m}\tilde{C}^{ab}\frac{\partial^2}{\partial x^a\partial x^b} - \frac{\hbar^2}{2}\tilde{L}^{B}{}_{A}{}^{D}{}_{C}\,\varphi^k{}_B\frac{\partial}{\partial\varphi^k{}_A}\,\varphi^l{}_D\frac{\partial}{\partial\varphi^l{}_C}, \tag{3.78}$$
$$T = \frac{1}{2m}g^{ij}\mathbf{p}_i\mathbf{p}_j + \frac{1}{2}\tilde{R}^{j}{}_{i}{}^{l}{}_{k}\,\Sigma^i{}_j\Sigma^k{}_l = -\frac{\hbar^2}{2m}g^{ij}\frac{\partial^2}{\partial x^i\partial x^j} - \frac{\hbar^2}{2}\tilde{R}^{j}{}_{i}{}^{l}{}_{k}\,\varphi^i{}_A\frac{\partial}{\partial\varphi^j{}_A}\,\varphi^k{}_B\frac{\partial}{\partial\varphi^l{}_B}, \tag{3.79}$$
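where, as previously, $\tilde{C}$ denotes the inverse Cauchy deformation tensor. The ordering of the operators $\varphi$, $\partial/\partial\varphi$ is essential; therefore, there appear first-order differential operators, respectively,
$$-\frac{\hbar^2}{2}\tilde{L}^{B}{}_{A}{}^{A}{}_{C}\,\varphi^k{}_B\frac{\partial}{\partial\varphi^k{}_C} = -\frac{i\hbar}{2}\tilde{L}^{B}{}_{A}{}^{A}{}_{C}\,\hat{\Sigma}^C{}_B,$$
$$-\frac{\hbar^2}{2}\tilde{R}^{j}{}_{i}{}^{l}{}_{j}\,\varphi^i{}_A\frac{\partial}{\partial\varphi^l{}_A} = -\frac{i\hbar}{2}\tilde{R}^{j}{}_{i}{}^{l}{}_{j}\,\Sigma^i{}_l.$$
The second-order terms are obvious:
$$-\frac{\hbar^2}{2}\tilde{L}^{B}{}_{A}{}^{D}{}_{C}\,\varphi^k{}_B\varphi^l{}_D\frac{\partial^2}{\partial\varphi^k{}_A\,\partial\varphi^l{}_C},$$
$$-\frac{\hbar^2}{2}\tilde{R}^{j}{}_{i}{}^{l}{}_{k}\,\varphi^i{}_A\varphi^k{}_B\frac{\partial^2}{\partial\varphi^j{}_A\,\partial\varphi^l{}_B}.$$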
Their “curved” structure is obvious. Geometrically, this is due to the fact that the metric tensors on Q given by (3.37) and (3.38) define essentially Riemannian structures with non-vanishing curvature tensors. Unlike this, the d’Alembert model is based on the evidently flat, Euclidean geometry with the metric tensor (3.34). All this has to do with the strong nonlinearity encoded in the geodetic terms of
classical affine Hamiltonians. And this strong nonlinearity follows from the “large” group of assumed symmetries. As mentioned in the classical part, there are good reasons to concentrate on those metric tensors on $Q = M \times \mathrm{LI}(U,V)$ (those models of kinetic energy) which are: 1. affinely invariant in the physical space and simultaneously isometry-invariant (homogeneous and isotropic) in the material space, and 2. conversely, homogeneous and isotropic (isometry-invariant) in the physical space and simultaneously affinely invariant in the material space. It is impossible to satisfy both demands 1 and 2 simultaneously. However, if translational degrees of freedom are neglected, there exist metrics on $Q_{\rm int} = \mathrm{LI}(U,V)$ that are affinely (or rather centre-affinely) invariant both in the space and in the body. They are always pseudo-Riemannian, i.e. they have an indefinite (hyperbolic) signature. Obviously, (3.78) and (3.79) are formally Hermitian in $L^2(Q,\alpha)$, just as (3.77) is in $L^2(Q,a)$. The operators of translational kinetic energy $T_{\rm tr}$ are exactly as in the general models (3.78) and (3.79), so we concentrate on the internal parts $T_{\rm int}$. These are just very special cases of those in the general formulas (3.78) and (3.79). Due to their very peculiar features it is instructive to quote them explicitly. So, for internal degrees of freedom the quantized versions of (3.49) and (3.50) are obtained by the literal substitution of $\Sigma$, $\hat{\Sigma}$ for their classical counterparts; respectively,
$$T_{\rm int} = \frac{1}{2\tilde{I}}\eta_{KL}\eta^{MN}\hat{\Sigma}^K{}_M\hat{\Sigma}^L{}_N + \frac{1}{2\tilde{A}}\hat{\Sigma}^K{}_L\hat{\Sigma}^L{}_K + \frac{1}{2\tilde{B}}\hat{\Sigma}^K{}_K\hat{\Sigma}^L{}_L, \tag{3.80}$$
$$T_{\rm int} = \frac{1}{2\tilde{I}}g_{ik}g^{jl}\Sigma^i{}_j\Sigma^k{}_l + \frac{1}{2\tilde{A}}\Sigma^k{}_l\Sigma^l{}_k + \frac{1}{2\tilde{B}}\Sigma^k{}_k\Sigma^l{}_l \tag{3.81}$$
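with the same meaning of the inertial constants $\tilde{I}$, $\tilde{A}$, $\tilde{B}$ as previously, (1). And again the second terms of both expressions are identical; the same is true of the third ones. Let us write explicitly
$$\Sigma^k{}_l\Sigma^l{}_k = \hat{\Sigma}^A{}_B\hat{\Sigma}^B{}_A = -\hbar^2\varphi^k{}_B\varphi^l{}_A\frac{\partial^2}{\partial\varphi^k{}_A\,\partial\varphi^l{}_B} - \hbar^2 n\,\varphi^k{}_A\frac{\partial}{\partial\varphi^k{}_A}.$$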
The geometric interpretation of the affine spin, usual spin, and vorticity as generators of transformation groups implies that many quantum expressions involving them may be, as it was just seen, directly obtained from classical formulas by simple substitution of appropriate operators instead of the corresponding classical phase-space quantities, so that the difficult ordering problems are avoided. For example, the very convenient classical expressions (3.53) and (3.54) remain valid on the operator level as alternative expressions, respectively, for (3.80) and
(3.81), free of the inconvenient transposition term (the one with $\tilde{I}$):
$$T_{\rm int} = \frac{1}{2\alpha}C(2) + \frac{1}{2\beta}C(1)^2 + \frac{1}{2\mu}\|V\|^2 \tag{3.82}$$
(affine-metrical model), and
$$T_{\rm int} = \frac{1}{2\alpha}C(2) + \frac{1}{2\beta}C(1)^2 + \frac{1}{2\mu}\|S\|^2 \tag{3.83}$$
(metrical-affine model), where $\alpha$, $\beta$, $\mu$ are the same constants as in the classical formulas (3.52), and the operator Casimirs $C(k)$, $\|V\|^2$, $\|S\|^2$ are given by
$$C(1) = \Sigma^a{}_a = \hat{\Sigma}^A{}_A, \qquad C(2) = \Sigma^a{}_b\Sigma^b{}_a = \hat{\Sigma}^A{}_B\hat{\Sigma}^B{}_A,$$
and analogously for $k > 2$ (up to $k = n$), and
$$\|V\|^2 = -\frac{1}{2}V^A{}_B V^B{}_A, \qquad \|S\|^2 = -\frac{1}{2}S^a{}_b S^b{}_a.$$
All the more, the affine–affine model (3.55) retains its structure when quantized (transformed to the operator form):
$$T_{\rm int} = \frac{1}{2A}C(2) - \frac{1}{2A(n + A/B)}C(1)^2. \tag{3.84}$$
And similarly, the splitting of the kinetic energy into incompressible and dilatational parts smoothly survives the quantization procedure. The decomposition (3.28) and (3.29) of the affine spin into the spin-shear and dilatation parts has the following form:
$$\Sigma^a{}_b = s^a{}_b + \frac{1}{n}p\,\delta^a{}_b, \qquad \hat{\Sigma}^A{}_B = \hat{s}^A{}_B + \frac{1}{n}p\,\delta^A{}_B,$$
where the traceless parts are given by
$$s^a{}_b = \Sigma^a{}_b - \frac{1}{n}\Sigma^d{}_d\,\delta^a{}_b, \qquad \hat{s}^A{}_B = \hat{\Sigma}^A{}_B - \frac{1}{n}\hat{\Sigma}^D{}_D\,\delta^A{}_B.$$
They are formally Hermitian operators generating the unitary actions of SL(V), SL(U) on $L^2(Q_{\rm int},\lambda)$ in the sense of (3.73),
$$A \in \mathrm{SL}(V): \ (\mathbf{A}\Psi)(\varphi) = \Psi(A\varphi), \qquad B \in \mathrm{SL}(U): \ (\mathbf{B}\Psi)(\varphi) = \Psi(\varphi B).$$
Incidentally, they are also formally Hermitian in $L^2(Q_{\rm int},l)$, because the above actions of SL(V), SL(U) are unitary there.
The operator $p$ has the following form:
$$p = \Sigma^a{}_a = \hat{\Sigma}^A{}_A = \frac{\hbar}{i}\frac{\partial}{\partial q}.$$
It is formally Hermitian in $L^2(Q_{\rm int},\lambda)$ (but not in $L^2(Q_{\rm int},l)$) and generates the one-parameter unitary group of dilatations. This is the group induced by the additive translations of the logarithmic deformation invariants $q^a$, in particular, by the additive translations of their “centre of mass” $q$. This group acts as follows:
$$\lambda = e^{\xi} \in \mathbb{R}^+: \quad (\boldsymbol{\lambda}\Psi)(\varphi) = \Psi(\lambda\varphi), \qquad q \mapsto q + \xi.$$
Let us denote the second-order Casimir operator for SL(n, R) by $C_{SL(n)}(2)$:
$$C_{SL(n)}(2) = s^a{}_b s^b{}_a = \hat{s}^A{}_B\hat{s}^B{}_A,$$
and similarly for the higher-order ones $C_{SL(n)}(k)$ (but of course $C_{SL(n)}(1) = 0$; for orthogonal groups all of the odd orders vanish). On the quantized level the structure of the affine–affine, affine–metrical, and metrical–affine models (respectively, (3.57), (3.58) and (3.59)) beautifully survives in operator language. Namely, one obtains, respectively,
$$T^0_{\rm int} = \frac{1}{2A}C_{SL(n)}(2) + \frac{1}{2n(A+nB)}p^2 = T^0_{\rm sh} + T^0_{\rm dil},$$
$$T_{\rm int} = \frac{1}{2(I+A)}C_{SL(n)}(2) + \frac{1}{2n(I+A+nB)}p^2 + \frac{I}{2(I^2-A^2)}\|V\|^2,$$
$$T_{\rm int} = \frac{1}{2(I+A)}C_{SL(n)}(2) + \frac{1}{2n(I+A+nB)}p^2 + \frac{I}{2(I^2-A^2)}\|S\|^2.$$
Obviously,
$$p^2 = -\hbar^2\frac{\partial^2}{\partial q^2}.$$
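The way the shear and dilatation parts separate can be illustrated with exact arithmetic. The sketch below is an illustration only: it uses commuting rational matrices as stand-ins for the operators (so it checks the index algebra, not operator ordering), and it assumes that the $C(1)^2$ term of the affine–affine model enters with coefficient $-B/(2A(A+nB))$, which is what the $p^2$ coefficient $1/(2n(A+nB))$ quoted above requires. All function names are hypothetical.

```python
from fractions import Fraction

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def split_shear_dilatation(sigma):
    """Return (s, p) with sigma = s + (p/n) * identity and trace(s) = 0."""
    n = len(sigma)
    p = trace(sigma)
    s = [[sigma[i][j] - (p / n if i == j else 0) for j in range(n)]
         for i in range(n)]
    return s, p

def dilatation_coefficient(A, B, n):
    """Coefficient of p^2 after inserting C(2) = C_SL(2) + p^2/n and C(1) = p."""
    return 1 / (2 * A * n) - B / (2 * A * (A + n * B))
```

With `A = 2`, `B = 5`, `n = 3` the combined coefficient is `1/102`, which equals `1/(2n(A+nB))`.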
All these formulas are automatically obtained from the corresponding classical expressions (3.57), (3.58) and (3.59) by the formal substitution of operators instead of phase-space quantities. It is so because one deals here with generators of the underlying transformation groups, quantities of profound geometric interpretation. Just as in the classical case the quantum unbounded dilatational motion should be stabilized by some potential V (q) if the model is to describe quantum elastic vibrations. The Hamilton operator splits then into two independent mutually commuting parts: H = Hsh + Hdil .
The same is true for more general doubly isotropic potentials explicitly separating the shape and dilatation dynamics:
$$V = V_{\rm sh}(\ldots, q^i - q^j, \ldots) + V_{\rm dil}(q).$$
We have then
$$H_{\rm sh} = \frac{1}{2A}C_{SL(n)}(2) + V_{\rm sh},$$
$$H_{\rm sh} = \frac{1}{2(I+A)}C_{SL(n)}(2) + \frac{I}{2(I^2-A^2)}\|V\|^2 + V_{\rm sh},$$
$$H_{\rm sh} = \frac{1}{2(I+A)}C_{SL(n)}(2) + \frac{I}{2(I^2-A^2)}\|S\|^2 + V_{\rm sh},$$
respectively, for the affine–affine, affine–metrical, and metrical–affine models. The last two of them reduce to the first one when we put $I = 0$. And obviously
$$H_{\rm dil} = \frac{1}{2n(I+A+nB)}p^2 + V_{\rm dil}(q) = -\frac{\hbar^2}{2n(I+A+nB)}\frac{\partial^2}{\partial q^2} + V_{\rm dil}(q).$$
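As one concrete illustration, suppose the stabilizing potential is harmonic, $V_{\rm dil} = (\kappa/2)q^2$. Then $H_{\rm dil}$ has the form of a one-dimensional harmonic oscillator with effective mass $n(I+A+nB)$, and its levels follow from the textbook formula $E_k = \hbar\omega(k + 1/2)$ with $\omega = \sqrt{\kappa/(n(I+A+nB))}$. The sketch below only evaluates that formula; the function name, the default `hbar = 1.0` (natural units), and the sample parameters are assumptions made for illustration.

```python
import math

def dilatation_levels(kappa, n, I, A, B, count, hbar=1.0):
    """First `count` levels of H_dil = p^2 / (2 n (I + A + n B)) + (kappa / 2) q^2."""
    effective_mass = n * (I + A + n * B)
    if effective_mass <= 0 or kappa <= 0:
        raise ValueError("need n(I + A + nB) > 0 and kappa > 0 for bound states")
    omega = math.sqrt(kappa / effective_mass)
    return [hbar * omega * (k + 0.5) for k in range(count)]

# Hypothetical parameters: n = 3, I = 0, A = 1, B = 0, kappa = 12 (so omega = 2).
levels = dilatation_levels(kappa=12.0, n=3, I=0.0, A=1.0, B=0.0, count=3)
```

The spacing of the levels is uniform and shrinks as any of the inertial constants grows, since they enter only through the effective mass.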
$V_{\rm dil}$ may be chosen as some qualitatively satisfactory phenomenological model, e.g. $V_{\rm dil} = (\kappa/2)q^2$, a finite or infinite potential well, etc. When $V_{\rm sh} = 0$, we are dealing with purely geodetic affinely-invariant Hamiltonians built entirely of the group generators. In such situations one can expect solutions of the eigenproblem based completely on some purely algebraic ladder procedure. Finally, it is interesting to express everything in terms of the two-polar decomposition. And now comes an unpleasant surprise: the automatic replacement of classical quantities by seemingly natural operators no longer works. Namely, although $p$ may be automatically replaced by $\mathbf{p} = (\hbar/i)\,\partial/\partial q$, this is not the case with the $p_a$; they are not replaced by $(\hbar/i)\,\partial/\partial q^a$, and $\sum_a p_a^2$ in (3.63) is not “quantized” to
$$-\hbar^2\Delta[q^a] = -\hbar^2\sum_a\frac{\partial^2}{\partial (q^a)^2}.$$
The point is that the additive translations of logarithmic deformation invariants are not geometrically fundamental operations. So, whereas (3.82)–(3.84) are automatically $(-\hbar^2/2)$-multiples of the corresponding Laplace–Beltrami operators, there is no such automatism with the two-polar expression of these operators. Fortunately, there are no problems with the spin and vorticity operators $S^i{}_j$, $V^A{}_B$ and with the operators $\hat{r}^a{}_b$, $\hat{t}^a{}_b$ corresponding to the classical quantities (3.32). The reason is again their group-theoretical interpretation: spin and vorticity generate,
respectively, spatial and material rotations. And the operators $\hat{r}^a{}_b$, $\hat{t}^a{}_b$ are their representations in terms of the principal axes of the Cauchy and Green deformation tensors,
$$\hat{r}_{ab} = L^i{}_a L^j{}_b S_{ij}, \qquad \hat{t}_{ab} = -R^A{}_a R^B{}_b V_{AB};$$
the ordering of operators is just as written here. Just as in the classical theory, $\hat{r}^a{}_b$, $\hat{t}^a{}_b$ are generators (in the quantum-Poisson-bracket sense) of the right action of SO(n, R) on the quantities $L: \mathbb{R}^n \to V$, $R: \mathbb{R}^n \to U$,
$$L \mapsto LU, \qquad R \mapsto RU, \qquad U \in \mathrm{SO}(n,\mathbb{R}).$$
Just as $\hat{r}^a{}_b$, $\hat{t}^a{}_b$, the operators $S^i{}_j$, $V^A{}_B$ act only on the generalized coordinates $x^\mu$, $y^\mu$ parameterizing, respectively, $L$ and $R$ (some Euler angles, rotation vectors, first-kind canonical coordinates, and so on). Any of the mentioned classical quantities $S^i{}_j$, $V^A{}_B$, $\hat{r}^a{}_b$, $\hat{t}^a{}_b$ has the following form:
$$f^\mu(x)\,p(x)_\mu, \qquad g^\mu(y)\,p(y)_\mu,$$
where $p(x)_\mu$, $p(y)_\mu$ are the canonical momenta conjugate, respectively, to $x^\mu$, $y^\mu$. Due to the group-theoretical meaning of the mentioned quantities, the corresponding quantum operators are given, respectively, by the operators
$$\frac{\hbar}{i}f^\mu(x)\frac{\partial}{\partial x^\mu}, \qquad \frac{\hbar}{i}g^\mu(y)\frac{\partial}{\partial y^\mu};$$
the ordering is just as explicitly written. In analogy to (3.62) we introduce the operators:
$$M = -\hat{r} - \hat{t}, \qquad N = \hat{r} - \hat{t}.$$
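The kinetic energy operator for the affine–affine model is then given as
$$T^{\text{aff-aff}}_{\rm int} = -\frac{\hbar^2}{2A}D_\lambda + \frac{\hbar^2 B}{2A(A+nB)}\frac{\partial^2}{\partial q^2} + \frac{1}{32A}\sum_{a,b}\frac{(M^a{}_b)^2}{\operatorname{sh}^2\frac{q^a-q^b}{2}} - \frac{1}{32A}\sum_{a,b}\frac{(N^a{}_b)^2}{\operatorname{ch}^2\frac{q^a-q^b}{2}}, \tag{3.85}$$
where
$$D_\lambda = \frac{1}{P_\lambda}\sum_a\frac{\partial}{\partial q^a}P_\lambda\frac{\partial}{\partial q^a} = \sum_a\frac{\partial^2}{\partial (q^a)^2} + \sum_a\frac{\partial\ln P_\lambda}{\partial q^a}\frac{\partial}{\partial q^a} \tag{3.86}$$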
and $P_\lambda$ is given by (3.71). The expression (3.85) exactly equals $-(\hbar^2/2)\Delta(\Gamma^0)$, where the Laplace–Beltrami operator $\Delta(\Gamma^0)$ is built of the configuration metric $\Gamma^0$ (3.43), i.e. corresponds to the classical expressions (3.43) and (3.44). It is
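seen that $D_\lambda$ differs from the $\mathbb{R}^n$-Laplace operator $\sum_a \partial^2/\partial (q^a)^2$ by some first-order differential operator. This is just the mentioned breakdown of the naive classical analogy between $p_a$ and $(\hbar/i)\,\partial/\partial q^a$. The reason for this breakdown is that the additive translations $q^a \mapsto q^a + u^a$ do not preserve the measures $\lambda$, $\alpha$. Because of this, their argument-wise action on wave functions is not unitary in $L^2(Q,\alpha)$, $L^2(Q_{\rm int},\lambda)$. Incidentally, it is not unitary in $L^2(Q,a)$, $L^2(Q_{\rm int},l)$ either. And the infinitesimal generators $(\hbar/i)\,\partial/\partial q^a$, $(\hbar/i)\,\partial/\partial Q^a$ are not formally self-adjoint in these Hilbert spaces. The affine–metric and metric–affine models are, respectively, given by
$$T^{\text{aff-met}}_{\rm int} = -\frac{\hbar^2}{2\alpha}D_\lambda - \frac{\hbar^2}{2\beta}\frac{\partial^2}{\partial q^2} + \frac{1}{32\alpha}\sum_{a,b}\frac{(M^a{}_b)^2}{\operatorname{sh}^2\frac{q^a-q^b}{2}} - \frac{1}{32\alpha}\sum_{a,b}\frac{(N^a{}_b)^2}{\operatorname{ch}^2\frac{q^a-q^b}{2}} + \frac{1}{2\mu}\|V\|^2,$$
$$T^{\text{met-aff}}_{\rm int} = -\frac{\hbar^2}{2\alpha}D_\lambda - \frac{\hbar^2}{2\beta}\frac{\partial^2}{\partial q^2} + \frac{1}{32\alpha}\sum_{a,b}\frac{(M^a{}_b)^2}{\operatorname{sh}^2\frac{q^a-q^b}{2}} - \frac{1}{32\alpha}\sum_{a,b}\frac{(N^a{}_b)^2}{\operatorname{ch}^2\frac{q^a-q^b}{2}} + \frac{1}{2\mu}\|S\|^2.$$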
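Quite similarly, quantizing the doubly-isotropic d’Alembert model we obtain
$$T^{\rm d.A}_{\rm int} = -\frac{\hbar^2}{2I}D_l + \frac{1}{8I}\sum_{a,b}\frac{(M^a{}_b)^2}{(Q^a - Q^b)^2} + \frac{1}{8I}\sum_{a,b}\frac{(N^a{}_b)^2}{(Q^a + Q^b)^2},$$
where
$$D_l = \frac{1}{P_l}\sum_a\frac{\partial}{\partial Q^a}P_l\frac{\partial}{\partial Q^a} = \sum_a\frac{\partial^2}{\partial (Q^a)^2} + \sum_a\frac{\partial\ln P_l}{\partial Q^a}\frac{\partial}{\partial Q^a}, \tag{3.87}$$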
and the weight factor $P_l$ is given by (3.72). The ordering of non-commuting operators in $D_\lambda$, $D_l$ is just as explicitly written here. There are no other ordering problems because the operators $M^a{}_b$, $N^a{}_b$ (equivalently $\hat{r}^a{}_b$, $\hat{t}^a{}_b$) do commute with the deformation invariants $q^a$, $Q^a$. Let us observe that the first-order differential operators in (3.86) and (3.87) may be eliminated by introducing modified amplitudes $\phi$ given, respectively, by
$$\phi = \sqrt{P_\lambda}\,\Psi, \qquad \phi = \sqrt{P_l}\,\Psi. \tag{3.88}$$
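Then the action of the $D$-operators on $\Psi$ is represented by the action of the operators $\tilde{D}$ given, respectively, by
$$-\frac{\hbar^2}{2A}\tilde{D}_\lambda = -\frac{\hbar^2}{2A}\sum_a\frac{\partial^2}{\partial (q^a)^2} + \tilde{V}_\lambda, \qquad -\frac{\hbar^2}{2I}\tilde{D}_l = -\frac{\hbar^2}{2I}\sum_a\frac{\partial^2}{\partial (Q^a)^2} + \tilde{V}_l,$$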
where V˜ λ , V˜ l are auxiliary “artificial’’ potentials 1 2 1 ∂Pλ 2 + , 2A Pλ2 4A Pλ a ∂qa 1 2 1 ∂Pl 2 + . V˜ l = − 2I Pl2 4I Pl a ∂Q a
V˜ λ = −
In other words,
$$\tilde{D}\phi = \sqrt{P}\,D\Psi.$$
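The mechanism behind this substitution can be checked numerically in one dimension. The sketch below is an illustration under stated assumptions, not the multidimensional formulas of the text: it takes a single variable, a generic positive weight `P`, and the one-variable analogue $D\Psi = \Psi'' + (P'/P)\Psi'$, for which the standard identity is $\sqrt{P}\,D\Psi = \phi'' - W\phi$ with $\phi = \sqrt{P}\,\Psi$ and $W = (\sqrt{P})''/\sqrt{P}$. Derivatives are approximated by central finite differences, and all function names are hypothetical.

```python
import math

def d1(f, x, h=1e-3):
    """Central-difference first derivative."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def d2(f, x, h=1e-3):
    """Central-difference second derivative."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def weighted_operator(P, psi, x):
    """sqrt(P) * D psi, with D psi = psi'' + (P'/P) psi' (one-variable analogue of D)."""
    return math.sqrt(P(x)) * (d2(psi, x) + d1(P, x) / P(x) * d1(psi, x))

def transformed_operator(P, psi, x):
    """phi'' - W phi, where phi = sqrt(P) psi and W = (sqrt P)'' / sqrt(P)."""
    root = lambda q: math.sqrt(P(q))
    phi = lambda q: root(q) * psi(q)
    return d2(phi, x) - d2(root, x) / root(x) * phi(x)
```

For the sample weight `P(q) = cosh(q)**2` the term `W` is identically 1, so the first-order term is traded for a constant shift; for a less special weight `W` is a genuinely position-dependent "artificial" potential.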
All other terms of the kinetic energy operators commute with $\sqrt{P}$ interpreted as a position-type operator. Obviously, the same concerns the usual potential terms $V$. Therefore, finally, the Hamilton operator $H = T + V$ is represented in $\phi$-terms by $\tilde{H}$,
$$\tilde{H}\phi = \sqrt{P}\,H\Psi,$$
where analytically the action of $\tilde{H}$ differs from the action of $H$ in that the $D$-operators are replaced by the usual $\mathbb{R}^n$-Laplace operators and additional $\tilde{V}$-potentials appear and are combined with the “true” potentials $V$. The stationary Schrödinger equation, i.e. the eigenequation $H\Psi = E\Psi$, is equivalent to
$$\tilde{H}\phi = E\phi.$$
This eigenproblem is meant in the Hilbert space based on the modified scalar product without the weight factor $P$ in the integration element,
$$(\phi_1|\phi_2) = \int \overline{\phi_1}\,\phi_2\, dq^1 \cdots dq^n\, d\mu(L)\, d\mu(R)$$
in affine models, and
$$(\phi_1|\phi_2) = \int \overline{\phi_1}\,\phi_2\, dQ^1 \cdots dQ^n\, d\mu(L)\, d\mu(R)$$
in d’Alembert models. Obviously, $(\phi_1|\phi_2) = \langle\Psi_1|\Psi_2\rangle$. There is no real simplification in replacing $\Psi$ by $\phi$, because instead of the complicated first-order differential operator the equally complicated potential $\tilde{V}$ appears. In geodetic problems and in problems with doubly isotropic potentials $V(q^1, \ldots, q^n)$, in particular with the stabilizing dilatation potentials $V_{\rm dil}(q)$, the above Schrödinger equations may be reduced to ones involving only the coordinates $q^1, \ldots, q^n$, because the action of $H$ on the $(L, R)$-dependence of wave functions may be algebraized. This is based on generalized Fourier analysis on the compact group SO(n, R). To simplify the treatment we identify analytically $Q_{\rm int}$ with $\mathrm{GL}^+(n,\mathbb{R})$ and use the matrix form of the two-polar decomposition $\varphi = LDR^{-1}$. According to the Peter–Weyl theorem, the wave functions may be expanded in the $(L, R)$-variables with respect to matrix elements of irreducible unitary representations of SO(n, R). Their expansion coefficients are functions of the deformation invariants $q^a$, or equivalently, $Q^a$. Let $\Omega$ denote the set of irreducible unitary representations of SO(n, R) (more precisely, the set of their equivalence classes). Obviously, due to the compactness of the group SO(n, R), these representations are finite-dimensional; their dimensions will be denoted by $N(\alpha)$. In the physical three-dimensional case $\Omega$ is the set of all non-negative integers $s = 0, 1, 2, \ldots$ and $N(s) = 2s + 1$. If for some reason we replace the rotation groups by their universal coverings $\overline{\mathrm{SO}}(n,\mathbb{R})$ and so admit half-integer angular momenta, then $\Omega$ is the set of all non-negative half-integers and integers $s = 0, 1/2, 1, 3/2, \ldots$ and again $N(s) = 2s + 1$. Obviously, in the planar case $\Omega = \mathbb{Z}$ is the set of all integers and $N(m) = 1$ for any $m \in \mathbb{Z}$ (Abelian group). Let $D^\alpha$ be the $N(\alpha) \times N(\alpha)$ matrices of irreducible representations. Then the mentioned expansion has the following form:
$$\Psi(\varphi) = \Psi(L, D, R) = \sum_{\alpha,\beta\in\Omega}\ \sum_{m,n=1}^{N(\alpha)}\ \sum_{k,l=1}^{N(\beta)} D^\alpha_{mn}(L)\, f^{\alpha\beta}_{\substack{nk\\ ml}}(D)\, D^\beta_{kl}(R^{-1}). \tag{3.89}$$
The non-uniqueness of the polar decomposition implies that the deformation invariants $q^1, \ldots, q^n$ ($Q^1, \ldots, Q^n$) are very complicated indistinguishable parastatistical “particles” in $\mathbb{R}$. There is no place here to go into more detail. The point is that the reduced amplitudes $f^{\alpha\beta}_{\substack{nk\\ ml}}$ as functions of $q^1, \ldots, q^n$ must satisfy certain conditions due to which the resulting $\Psi$ as a function of $L$, $D$, and $R$ does not distinguish triplets $(L, D, R)$ representing the same configuration $\varphi$, i.e. $\Psi(L_1, D_1, R_1) = \Psi(L_2, D_2, R_2)$ if $L_1 D_1 R_1^{-1} = L_2 D_2 R_2^{-1}$. This is simply the condition for $\Psi$ to be a one-valued function on the configuration space $Q_{\rm int}$.
One can consider the matrix elements $D^\alpha_{mk}$ as explicitly known. In fact, they are deeply investigated special functions on the orthogonal groups SO(n, R). In the physical case $n = 3$ they are the well-known functions $D^j_{mk}$ found by Wigner. Here $j = 0, 1, 2, \ldots$ or, if we replace SO(3, R) by its universal covering SU(2) (for general $n$, SO(n, R) is replaced by the group Spin(n)), $j = 0, 1/2, 1, 3/2, \ldots$ According to the standard convention, $m = -j, -j+1, \ldots, j-1, j$ and $k = -j, -j+1, \ldots, j-1, j$; for fixed $j$, both $m$ and $k$ range over $2j+1$ values with jumps by one, for integer as well as half-integer $j$. The operators $S^i{}_j$, $V^A{}_B$, $\hat{r}^a{}_b$, $\hat{t}^a{}_b$, when acting on the functions $D^\alpha_{mk}$, may be replaced by some standard algebraic operations. This enables one to reduce the Schrödinger equation for wave functions depending on the $n^2$ variables $\varphi^i{}_A$ to eigenproblems for the multicomponent amplitudes $f^{\alpha\beta}$ depending only on the $n$ deformation invariants $q^a$. Therefore, in a sense, the problem may be reduced to the Cartan subgroup of diagonal matrices $\varphi$ (the maximal Abelian subgroup in GL(n, R)). In geodetic models and in models with doubly isotropic potentials (ones depending only on deformation invariants; dilatation-stabilizing potentials $V(q)$ provide the simplest example), the labels $m$, $n$ in (3.89) are good quantum numbers. The Hamilton operator $H$ commutes with the operators of spin and vorticity, i.e. $S^i{}_j$, $V^A{}_B$. Also the representation labels $\alpha, \beta \in \Omega$ are good quantum numbers. They are equivalent to the systems of eigenvalues of the Casimir invariants built of $S$, $V$:
$$C(S, p) = S^i{}_k S^k{}_m \cdots S^r{}_z S^z{}_i, \qquad C(V, p) = V^A{}_K V^K{}_M \cdots V^R{}_Z V^Z{}_A \tag{3.90}$$
($p$ factors). These eigenvalues will be denoted, respectively, by $C^\alpha(p)$, $C^\beta(p)$. Obviously, $C(S, p) = C(\hat{r}, p)$, $C(V, p) = C(\hat{t}, p)$. The above Casimir invariants vanish trivially for odd values of $p$; so in the physical case $n = 3$ there is only one possibility: $C(S, 2)$, $C(V, 2)$. Due to the peculiarity of dimension three, where skew-symmetric tensors may be identified with axial vectors, it is more convenient to use
$$\|S\|^2 = -\frac{1}{2}S^a{}_b S^b{}_a, \qquad \|V\|^2 = -\frac{1}{2}V^A{}_B V^B{}_A,$$
i.e. $(-1/2)$-multiples of $C(S, 2)$, $C(V, 2)$. The point is that for $n = 3$
$$\|S\|^2 = S_1^2 + S_2^2 + S_3^2, \qquad \|V\|^2 = V_1^2 + V_2^2 + V_3^2,$$
where
$$S_a = \frac{1}{2}\varepsilon_{ab}{}^{c}\,S^b{}_c, \qquad V_A = \frac{1}{2}\varepsilon_{AB}{}^{C}\,V^B{}_C.$$
The raising and lowering of indices is meant here in the sense of orthonormal coordinates (a trivial Kronecker-delta operation). The same convention is used for $\hat{r}^a{}_b$, $\hat{t}^a{}_b$, i.e.
$$\hat{r}_a = \frac{1}{2}\varepsilon_{ab}{}^{c}\,\hat{r}^b{}_c, \qquad \hat{t}_a = \frac{1}{2}\varepsilon_{ab}{}^{c}\,\hat{t}^b{}_c.$$
Obviously,
$$\|\hat{r}\|^2 = \|S\|^2, \qquad \|\hat{t}\|^2 = \|V\|^2.$$
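The three-dimensional identification of a skew-symmetric tensor with an axial vector can be checked on plain numbers. This is a minimal sketch in orthonormal coordinates (all indices treated as lower, zero-based), with an arbitrary integer skew matrix as hypothetical sample data; it verifies that $-\tfrac{1}{2}S^a{}_b S^b{}_a$ coincides with $S_1^2 + S_2^2 + S_3^2$ for such a matrix.

```python
from fractions import Fraction

def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

def axial_vector(S):
    """S_a = (1/2) * eps_{abc} * S_{bc} for a 3x3 skew-symmetric matrix S."""
    return [Fraction(sum(levi_civita(a, b, c) * S[b][c]
                         for b in range(3) for c in range(3)), 2)
            for a in range(3)]

def norm_squared(S):
    """||S||^2 = -(1/2) * S_{ab} S_{ba}."""
    return Fraction(-sum(S[a][b] * S[b][a]
                         for a in range(3) for b in range(3)), 2)

# Hypothetical sample: an arbitrary skew-symmetric integer matrix.
S = [[0, 3, -2],
     [-3, 0, 5],
     [2, -5, 0]]
vector = axial_vector(S)
```

For this sample the axial vector is `(5, 2, 3)` and both expressions for the squared norm give 38.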
The corresponding eigenvalues are given by
$$C(s, 2) = \hbar^2 s(s+1), \qquad C(j, 2) = \hbar^2 j(j+1),$$
where $s$, $j$ are non-negative integers, or non-negative integers and positive half-integers when $\mathrm{GL}^+(3,\mathbb{R})$, SO(3, R) are replaced by their coverings $\overline{\mathrm{GL}^+}(3,\mathbb{R})$, SU(2). It is convenient to use multicomponent wave functions with values in the space of complex $N(\alpha) \times N(\beta)$ matrices ($(2s+1) \times (2j+1)$ in the physical case $n = 3$):
$$\Psi(\varphi) = \Psi^{\alpha\beta}(L, D, R) = D^\alpha(L)\, f^{\alpha\beta}(D)\, D^\beta(R^{-1}), \tag{3.91}$$
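where $f^{\alpha\beta}$ are complex $N(\alpha) \times N(\beta)$ matrices, the reduced wave amplitudes depending only on the deformation invariants. In the physical three-dimensional case, when $D^\alpha_{mn}$ are the Wigner special functions $D^s_{mn}$, we, as usual, take $m$, $n$ running from $-s$ to $s$ and jumping by one (also in the “spinorial” case when $s$ is half-integer). And then,
$$\|S\|^2\Psi^{sj} = \hbar^2 s(s+1)\Psi^{sj}, \qquad \|V\|^2\Psi^{sj} = \hbar^2 j(j+1)\Psi^{sj},$$
$$S_3\Psi^{sj}_{ml} = \hbar m\,\Psi^{sj}_{ml}, \qquad V_3\Psi^{sj}_{ml} = \hbar l\,\Psi^{sj}_{ml}.$$
And similarly, when the values $n$, $k$ in the superposition (3.89) are kept fixed and we retain only the corresponding single term, for the resulting $\Psi^{sj}_{\substack{nk\\ ml}}$ we have
$$\hat{r}_3\Psi^{sj}_{\substack{nk\\ ml}} = \hbar n\,\Psi^{sj}_{\substack{nk\\ ml}}, \qquad \hat{t}_3\Psi^{sj}_{\substack{nk\\ ml}} = \hbar k\,\Psi^{sj}_{\substack{nk\\ ml}}.$$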
Let us now describe in a few words the aforementioned algebraization procedure in the sector of $(L, R)$-degrees of freedom. It is a compact-group counterpart of the usual Fourier-transform algebraization, where $\partial/\partial x^a$ is represented by the pointwise multiplication of the Fourier transform $\hat{f}(\bar{k})$ of $f(\bar{x})$ by $ik_a$. Let us introduce some auxiliary symbols.
The group SO(n, R) may be parameterized by the first-kind canonical coordinates $\omega$, namely,
$$W(\omega) = \exp\left(\frac{1}{2}\omega^a{}_b E^b{}_a\right),$$
where the basic matrices of the Lie algebra SO(n, R)$'$ are given by
$$(E^b{}_a)^c{}_d = \delta^b{}_d\,\delta^c{}_a - \delta^{bc}\delta_{ad},$$
and the matrix $\omega$ is skew-symmetric in the “cosmetic” Kronecker sense. Therefore, independent coordinates may be chosen as $\omega^a{}_b$, $a < b$, or conversely. However, for symmetry reasons it is more convenient to use the representation with the summation extended over all possible $\omega^a{}_b$. To be more “sophisticated”, the groups SO(V, g), SO(U, η) are parameterized as follows:
$$W(\omega) = \exp\left(\frac{1}{2}\omega^i{}_j E^j{}_i\right), \qquad \hat{W}(\hat{\omega}) = \exp\left(\frac{1}{2}\hat{\omega}^A{}_B \hat{E}^B{}_A\right),$$
where $E^i{}_j$, $\hat{E}^A{}_B$ are the basic matrices of the Lie algebras SO(V, g)$'$, SO(U, η)$'$ given by
$$(E^i{}_j)^k{}_l = \delta^i{}_l\,\delta^k{}_j - g^{ik}g_{jl}, \qquad (\hat{E}^A{}_B)^C{}_D = \delta^A{}_D\,\delta^C{}_B - \eta^{AC}\eta_{BD}.$$
The skew-symmetry of $\omega$ in the above exponential formulas is meant, respectively, as follows:
$$\omega^a{}_b = -g^{ac}g_{bd}\,\omega^d{}_c, \qquad \hat{\omega}^A{}_B = -\eta^{AC}\eta_{BD}\,\hat{\omega}^D{}_C.$$
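The parameterization can be tried out numerically in the orthonormal case $g_{ab} = \delta_{ab}$. The following is a minimal sketch under that assumption: it builds the basic matrices $(E^b{}_a)^c{}_d$, forms $\tfrac{1}{2}\omega^a{}_b E^b{}_a$ with the sum over all index pairs, and exponentiates it with a truncated Taylor series (an illustrative choice adequate for small $\omega$, not a general-purpose matrix exponential). The function names and the sample $\omega$ are hypothetical.

```python
def delta(i, j):
    return 1.0 if i == j else 0.0

def basis_matrix(b, a, n):
    """(E^b_a)^c_d = delta^b_d delta^c_a - delta^{bc} delta_{ad}; rows c, columns d."""
    return [[delta(b, d) * delta(c, a) - delta(b, c) * delta(a, d)
             for d in range(n)] for c in range(n)]

def generator(omega):
    """(1/2) * omega^a_b * E^b_a, summed over all a and b."""
    n = len(omega)
    gen = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            E = basis_matrix(b, a, n)
            for c in range(n):
                for d in range(n):
                    gen[c][d] += 0.5 * omega[a][b] * E[c][d]
    return gen

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(m, terms=40):
    """Truncated Taylor series of the matrix exponential (fine for small norms)."""
    n = len(m)
    result = [[delta(i, j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result
```

For a skew-symmetric `omega` the generator reproduces `omega` itself, which is why the factor 1/2 accompanies the sum over all index pairs, and `expm(generator(omega))` is an orthogonal matrix.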
Now the matrices of the irreducible representations $D^\alpha$ are given by
$$D^\alpha(L(l)) = \exp\left(\frac{1}{2}l^a{}_b M^{\alpha b}{}_a\right), \qquad D^\alpha(R(r)) = \exp\left(\frac{1}{2}r^a{}_b M^{\alpha b}{}_a\right),$$
where $l$ and $r$ denote the $\omega$-parameters, respectively, for the $L$- and $R$-factors of the two-polar decomposition. The anti-Hermitian matrices $M^\alpha$ will be expressed by the Hermitian ones $S^\alpha$,
$$S^{\alpha a}{}_b = \frac{\hbar}{i}M^{\alpha a}{}_b.$$
The commutation rules for $M^{\alpha a}{}_b$ are expressed through the structure constants of SO(n, R),
$$[M^s_{ab}, M^s_{cd}] = -g_{ad}M^s_{cb} + g_{cb}M^s_{ad} - g_{bd}M^s_{ac} + g_{ac}M^s_{bd},$$
and therefore
$$\frac{1}{\hbar i}[S^j_{ab}, S^j_{cd}] = g_{ad}S^j_{cb} - g_{cb}S^j_{ad} + g_{bd}S^j_{ac} - g_{ac}S^j_{bd}.$$
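Indices here are shifted with the help of $g_{ab}$; as a rule we use orthonormal coordinates, when $g_{ab} = \delta_{ab}$. In the physical three-dimensional case, when we put
$$S^j_a := \frac{1}{2}\varepsilon_a{}^{bc}S^j_{bc},$$
we obviously have
$$\frac{1}{\hbar i}[S^j_a, S^j_b] = \varepsilon_{ab}{}^{c}S^j_c.$$
From the fact that $D^\alpha$ are representations and $(i/\hbar)S^k{}_l$, $(i/\hbar)V^A{}_B$, $(i/\hbar)\hat{r}^a{}_b$, $(i/\hbar)\hat{t}^a{}_b$ are infinitesimal generators of the left and right orthogonal actions on the $(L, R)$ variables, it follows immediately that
$$S^i{}_j\Psi^{\alpha\beta} = S^{\alpha i}{}_j\Psi^{\alpha\beta}, \qquad \hat{r}^a{}_b\Psi^{\alpha\beta} = D^\alpha(L)\,S^{\alpha a}{}_b\,f^{\alpha\beta}(D)\,D^\beta(R^{-1}),$$
$$V^A{}_B\Psi^{\alpha\beta} = \Psi^{\alpha\beta}S^{\beta A}{}_B, \qquad \hat{t}^a{}_b\Psi^{\alpha\beta} = D^\alpha(L)\,f^{\alpha\beta}(D)\,S^{\beta a}{}_b\,D^\beta(R^{-1}).$$
Therefore, spin and vorticity act on the wave amplitudes $\Psi^{\alpha\beta}$ as a whole, and in a purely algebraic way. On the other hand, to describe in an algebraic way the action of $\hat{r}^a{}_b$, $\hat{t}^a{}_b$, one must extract from $\Psi^{\alpha\beta}$ the reduced amplitudes $f^{\alpha\beta}(q^1, \ldots, q^n)$. And it is only this amplitude that is affected by the action of $\hat{r}^a{}_b$, $\hat{t}^a{}_b$, according to the following rules:
$$\hat{r}^a{}_b: f^{\alpha\beta} \mapsto S^{\alpha a}{}_b f^{\alpha\beta}, \qquad \hat{t}^a{}_b: f^{\alpha\beta} \mapsto f^{\alpha\beta}S^{\beta a}{}_b.$$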
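It is very convenient to use the following notation:
$$\overrightarrow{S^{\alpha}}{}^{a}{}_{b}\,f^{\alpha\beta} := S^{\alpha a}{}_b f^{\alpha\beta}, \qquad \overleftarrow{S^{\beta}}{}^{a}{}_{b}\,f^{\alpha\beta} := f^{\alpha\beta}S^{\beta a}{}_b.$$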
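In matrix terms the two arrows are simply left and right multiplication of the $N(\alpha) \times N(\beta)$ amplitude, and the two kinds of action commute with each other by associativity of the matrix product. The sketch below illustrates this on small integer matrices; the matrices are arbitrary hypothetical sample data rather than actual representation matrices, and the function names are made up for the illustration.

```python
def matmul(x, y):
    """Product of an (r x m) and an (m x c) matrix given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def act_left(S_alpha, f):
    """Right-pointing arrow: f -> S_alpha f (acts on the row index of f)."""
    return matmul(S_alpha, f)

def act_right(S_beta, f):
    """Left-pointing arrow: f -> f S_beta (acts on the column index of f)."""
    return matmul(f, S_beta)

# Hypothetical sample data with N(alpha) = 3 and N(beta) = 2.
S_alpha = [[0, 1, 0], [1, 0, 2], [0, 2, 0]]
S_beta = [[1, 3], [3, -1]]
f = [[1, 2], [3, 4], [5, 6]]
```

Because the two actions touch different indices of `f`, `act_left(S_alpha, act_right(S_beta, f))` and `act_right(S_beta, act_left(S_alpha, f))` give the same matrix.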
As $D^\alpha$ are irreducible, the matrices
$$C(S^\alpha, p) := S^{\alpha a}{}_b\,S^{\alpha b}{}_c \cdots S^{\alpha u}{}_w\,S^{\alpha w}{}_a$$
($p$ factors) are proportional to the $N(\alpha) \times N(\alpha)$ identity matrix,
$$C(S^\alpha, p) = C^\alpha(p)\,I_{N(\alpha)},$$
where $C^\alpha(p)$ are the eigenvalues of (3.90). In particular, in the physical case $n = 3$ we have
$$\|S\|^2\Psi^{sj} = \|\hat{r}\|^2\Psi^{sj} = \hbar^2 s(s+1)\Psi^{sj}, \qquad S_a\Psi^{sj} = S^s_a\Psi^{sj},$$
$$\|V\|^2\Psi^{sj} = \|\hat{t}\|^2\Psi^{sj} = \hbar^2 j(j+1)\Psi^{sj}, \qquad V_A\Psi^{sj} = \Psi^{sj}S^j_A,$$
where $S^s_a$ are the standard Wigner matrices of the angular momentum with the squared magnitude $\hbar^2 s(s+1)$. Multiplying them by $(i/\hbar)$ we obtain standard
bases of irreducible representations of the Lie algebra SO(3, R)$'$. For the standard Wigner representation the following is also true:
$$S_3\Psi^{sj}_{ml} = \hbar m\,\Psi^{sj}_{ml}, \qquad V_3\Psi^{sj}_{ml} = \hbar l\,\Psi^{sj}_{ml}.$$
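Similarly, the action of the $\hat{r}$, $\hat{t}$ operators is represented by the following operations on the reduced amplitudes:
$$\hat{r}_a: f^{sj} \mapsto S^s_a f^{sj} = \overrightarrow{S^s}{}_a f^{sj}, \qquad \hat{t}_a: f^{sj} \mapsto f^{sj}S^j_a = \overleftarrow{S^j}{}_a f^{sj}.$$
In particular,
$$\hat{r}_3: \left[f^{sj}_{ml}\right] \mapsto \left[\hbar m\,f^{sj}_{ml}\right], \qquad \hat{t}_3: \left[f^{sj}_{ml}\right] \mapsto \left[\hbar l\,f^{sj}_{ml}\right].$$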
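Using the well-known orthogonality relations for the matrix elements of the irreducible unitary representations $D^\alpha_{mn}$ [79, 80] we can rewrite the scalar product in the following form:
$$\langle\Psi_1|\Psi_2\rangle = \sum_{\alpha,\beta}\frac{1}{N(\alpha)N(\beta)}\int \operatorname{Tr}\left(f_1^{\alpha\beta+}(q^a)\,f_2^{\alpha\beta}(q^b)\right)P_\lambda\,dq^1 \cdots dq^n, \tag{3.92}$$
where $P_\lambda$ is the weight factor given by (3.71), and the argument symbols like $q^a$ are abbreviations for the system $(q^1, \ldots, q^n)$. The trace operation is meant in the sense of matrix two-indices:
$$\operatorname{Tr}\left(f_1^{\alpha\beta+}f_2^{\alpha\beta}\right) = \sum_{n,m=1}^{N(\alpha)}\ \sum_{k,l=1}^{N(\beta)} \overline{f_1{}^{\alpha\beta}_{\substack{nk\\ ml}}}\; f_2{}^{\alpha\beta}_{\substack{nk\\ ml}}.$$
If no superposition over $m$, $l$ in (3.89) is performed and we use the matrix-valued wave functions (3.91), the trace operation is meant in the usual sense. Obviously, if we use the modified wave functions $\phi$ (3.88), then the scalar product expression is free of the weight factor $P_\lambda$,
$$(\phi_1|\phi_2) = \sum_{\alpha,\beta}\frac{1}{N(\alpha)N(\beta)}\int \operatorname{Tr}\left(g_1^{\alpha\beta+}(q^a)\,g_2^{\alpha\beta}(q^b)\right)dq^1 \cdots dq^n,$$
where, obviously, $g = \sqrt{P_\lambda}\,f$. Quite analogous formulas are true for the d’Alembert models; simply $P_\lambda$ is then replaced by $P_l$ (3.72), and instead of $q^a$ we use their exponential functions $Q^a$,
$$\langle\Psi_1|\Psi_2\rangle = \sum_{\alpha,\beta}\frac{1}{N(\alpha)N(\beta)}\int \operatorname{Tr}\left(f_1^{\alpha\beta+}(Q^a)\,f_2^{\alpha\beta}(Q^b)\right)P_l\,dQ^1 \cdots dQ^n,$$
$$(\phi_1|\phi_2) = \sum_{\alpha,\beta}\frac{1}{N(\alpha)N(\beta)}\int \operatorname{Tr}\left(g_1^{\alpha\beta+}(Q^a)\,g_2^{\alpha\beta}(Q^b)\right)dQ^1 \cdots dQ^n.$$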
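Remark: it is implicitly assumed in the above formulas that the Haar measure on the $(L, R)$-manifolds is normalized to unity (the total “volume” of the corresponding manifolds equals one). We restrict ourselves to Hamiltonians of the form $H = T + V$ with some doubly-isotropic potentials $V(q^1, \ldots, q^n)$, in particular, with some dilatation-stabilizing potentials $V(q)$ (affinely-invariant geodetic incompressible models). The energy eigenproblem, i.e. the stationary Schrödinger equation $H\Psi = E\Psi$, is equivalent to the infinite sequence of eigenequations for the reduced multicomponent amplitudes $f^{\alpha\beta}$:
$$H^{\alpha\beta}\Psi^{\alpha\beta} = E^{\alpha\beta}\Psi^{\alpha\beta}.$$
The simultaneous spatial and material isotropy implies the $N(\alpha) \times N(\beta)$-fold degeneracy, i.e. for every component of the $N(\alpha) \times N(\beta)$-matrix amplitude $f^{\alpha\beta}$ there exists an $N(\alpha) \times N(\beta)$-dimensional subspace of solutions, just as seen from the symbol $f^{\alpha\beta}_{\substack{nk\\ ml}}$ used in (3.89). The reduced Hamiltonians $H^{\alpha\beta} = T^{\alpha\beta} + V$ are $N(\alpha) \times N(\beta)$ matrices built of second-order differential operators. For the affine–affine model of the kinetic energy we have
$$T^{\alpha\beta}f^{\alpha\beta} = -\frac{\hbar^2}{2A}D_\lambda f^{\alpha\beta} + \frac{1}{32A}\sum_{a,b}\frac{\left(\overleftarrow{S^{\beta}}{}^{a}{}_{b} - \overrightarrow{S^{\alpha}}{}^{a}{}_{b}\right)^2}{\operatorname{sh}^2\frac{q^a-q^b}{2}}f^{\alpha\beta} - \frac{1}{32A}\sum_{a,b}\frac{\left(\overleftarrow{S^{\beta}}{}^{a}{}_{b} + \overrightarrow{S^{\alpha}}{}^{a}{}_{b}\right)^2}{\operatorname{ch}^2\frac{q^a-q^b}{2}}f^{\alpha\beta} + \frac{\hbar^2 B}{2A(A+nB)}\frac{\partial^2}{\partial q^2}f^{\alpha\beta}. \tag{3.93}$$
For the spatially metrical and materially affine model we obtain
$$T^{\alpha\beta}f^{\alpha\beta} = -\frac{\hbar^2}{2\alpha}D_\lambda f^{\alpha\beta} + \frac{1}{2\mu}C(\alpha, 2)f^{\alpha\beta} + \frac{1}{32\alpha}\sum_{a,b}\frac{\left(\overleftarrow{S^{\beta}}{}^{a}{}_{b} - \overrightarrow{S^{\alpha}}{}^{a}{}_{b}\right)^2}{\operatorname{sh}^2\frac{q^a-q^b}{2}}f^{\alpha\beta} - \frac{1}{32\alpha}\sum_{a,b}\frac{\left(\overleftarrow{S^{\beta}}{}^{a}{}_{b} + \overrightarrow{S^{\alpha}}{}^{a}{}_{b}\right)^2}{\operatorname{ch}^2\frac{q^a-q^b}{2}}f^{\alpha\beta} - \frac{\hbar^2}{2\beta}\frac{\partial^2}{\partial q^2}f^{\alpha\beta}, \tag{3.94}$$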
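where $C(\alpha, 2)$ is the $\alpha$-th eigenvalue of the rotational Casimir $\|S\|^2$, thus,
$$-\frac{1}{2}S^{\alpha i}{}_j\,S^{\alpha j}{}_i = C(\alpha, 2)\,I_{N(\alpha)}.$$
Obviously, for the physical dimension $n = 3$, $f^{\alpha\beta} = f^{sj}$, we have $C(s, 2) = \hbar^2 s(s+1)$. And similarly, for the spatially affine and materially metrical model we have
$$T^{\alpha\beta}f^{\alpha\beta} = -\frac{\hbar^2}{2\alpha}D_\lambda f^{\alpha\beta} + \frac{1}{2\mu}C(\beta, 2)f^{\alpha\beta} + \frac{1}{32\alpha}\sum_{a,b}\frac{\left(\overleftarrow{S^{\beta}}{}^{a}{}_{b} - \overrightarrow{S^{\alpha}}{}^{a}{}_{b}\right)^2}{\operatorname{sh}^2\frac{q^a-q^b}{2}}f^{\alpha\beta} - \frac{1}{32\alpha}\sum_{a,b}\frac{\left(\overleftarrow{S^{\beta}}{}^{a}{}_{b} + \overrightarrow{S^{\alpha}}{}^{a}{}_{b}\right)^2}{\operatorname{ch}^2\frac{q^a-q^b}{2}}f^{\alpha\beta} - \frac{\hbar^2}{2\beta}\frac{\partial^2}{\partial q^2}f^{\alpha\beta}, \tag{3.95}$$
where $C(\beta, 2)$ appears as the $\beta$-th eigenvalue of the vorticity Casimir $\|V\|^2$, and just as previously, for $n = 3$, $f^{\alpha\beta} = f^{sj}$, we have $C(j, 2) = \hbar^2 j(j+1)$. It is as if the doubly affine background ($T$ affinely invariant in the physical and material space) were responsible for some fundamental part of the spectra, perturbed by some internal rotations of the body itself or of the deformation axes. This perturbation and the resulting splitting of energy levels become remarkable when $\mu$ is small, i.e. when the inertial constants $I$, $A$ differ slightly. The suggestive terms
$$\frac{\hbar^2}{2\mu}s(s+1), \qquad \frac{\hbar^2}{2\mu}j(j+1)$$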
as contributions to energy levels are very interesting and seem to be supported by experimental data in various ranges of physical phenomena. Finally, let us quote the corresponding form of Tαβ for the quantized d’Alembert model: ← − − →αa 2 a β S b−S b 1 2 αβ αβ αβ T f = − Dl f + f αβ 2I 8I (Q a − Q b )2
a,b
← − − →αa 2 a β S b+S b 1 + f αβ . 8I (Q a + Q b )2 a,b
In this way the problem has been successfully reduced from n2 internal degrees of freedom (physically 9, sometimes 4) to the n purely deformative degrees of freedom (physically 3, sometimes 2). The price one pays for that is the use of
multicomponent wave functions subject to the strange parastatistical conditions in the reduced $q^a$-variables. The particular values of the labels $\alpha$, $\beta$ and the corresponding matrices $S^{\alpha a}{}_{b}$, $S^{\beta a}{}_{b}$ describe the influence of quantized rotational degrees of freedom on the quantized dynamics of deformation invariants. It is interesting that on the classical level there is no simple way to perform such a dynamical reduction to deformation invariants. For any reduced problem with $\alpha$, $\beta$ labels the quantity

$$\rho(q^i) := \operatorname{Tr}\left(f_1^{\alpha\beta +}(q^a)\, f_2^{\alpha\beta}(q^b)\right) P_\lambda(q^c)$$

is the probability density for finding the object in the state of deformation invariants $(q^1, \ldots, q^n)$. More precisely, $\rho(q^1, \ldots, q^n)\,dq^1 \cdots dq^n$ is the probability that the values of deformation invariants will be detected in the infinitesimal range $dq^1 \cdots dq^n$ about the values $(q^1, \ldots, q^n)$. Similarly, performing the integration

$$\rho(L, R) = \int \overline{\Psi}(L; q^a; R)\,\Psi(L; q^a; R)\,P_\lambda(q)\,dq^1 \cdots dq^n,$$

one obtains the probability density for detecting the "gyroscopic" degrees of freedom $L$, $R$ (equivalently, the Cauchy and Green deformation tensors) in some range of the configuration space. Obviously, this distribution is meant in the sense of the Haar measure $\mu$. The integrals

$$p^{\alpha\beta}_{\substack{nk\\ ml}} = \int \overline{f^{\alpha\beta}_{\substack{nk\\ ml}}}(q^a)\, f^{\alpha\beta}_{\substack{nk\\ ml}}(q^b)\, P_\lambda(q^c)\,dq^1 \cdots dq^n$$

are probabilities of detecting the particular indicated values of angular momenta and vorticities, $C(\alpha, 2)$, $C(\beta, 2)$, $m$, $l$, $n$, $k$. In the physical three-dimensional case they are, respectively, $\hbar^2 s(s + 1)$, $\hbar^2 j(j + 1)$, $m$, $l$, $n$, $k$, where $s$, $j$ are nonnegative integers and $m, n = -s, \ldots, s$, $k, l = -j, \ldots, j$, jumping by one. Particularly interesting are

$$p^{\alpha\beta}_{ml} = \sum_{n,k} p^{\alpha\beta}_{\substack{nk\\ ml}},$$

because they refer to the constants of motion $S^i{}_j$, $V^A{}_B$ and to the "good" quantum numbers $\alpha$, $\beta$, $m$, $l$. Except in the special case $n = 2$, $\hat{r}^a{}_b$, $\hat{t}^a{}_b$ are not constants of motion, and $n$, $k$ are not "good" quantum numbers. Quite analogous statements are true for the quantized d'Alembert model; the only formal difference is that the integration element in the manifold of invariants is given by $P_l(Q^1, \ldots, Q^n)\,dQ^1 \cdots dQ^n$.
Let us now write down the explicit formulas for the physical three-dimensional case. For the affine–affine model (3.93) we have now

$$\mathbf{T}^{sj}_{\text{aff-aff}} f^{sj} = -\frac{\hbar^2}{2A}\mathbf{D}_\lambda f^{sj} + \frac{\hbar^2 B}{2A(A + 3B)}\frac{\partial^2}{\partial q^2} f^{sj} + \frac{1}{16A}\sum_{a=1}^{3}\frac{(S^s_a)^2 f^{sj} - 2 S^s_a f^{sj} S^j_a + f^{sj} (S^j_a)^2}{\operatorname{sh}^2\frac{q^b - q^c}{2}} - \frac{1}{16A}\sum_{a=1}^{3}\frac{(S^s_a)^2 f^{sj} + 2 S^s_a f^{sj} S^j_a + f^{sj} (S^j_a)^2}{\operatorname{ch}^2\frac{q^b - q^c}{2}}, \tag{3.96}$$

where $S^j_a$ are the standard Wigner matrices for the $j$-angular momentum, i.e. of magnitude $\hbar^2 j(j + 1)$, and for the $a$th term of both sums we obviously have $b \neq a$, $c \neq a$, $b \neq c$. It does not matter in what order $q^b$, $q^c$ are written, because the denominators are insensitive to sign. $\mathbf{D}_\lambda$ is given by (3.86), where explicitly

$$P_\lambda = \left|\operatorname{sh}(q^2 - q^3)\,\operatorname{sh}(q^3 - q^1)\,\operatorname{sh}(q^1 - q^2)\right|.$$

Similarly, using the abbreviated form, we can write for the metrical–affine (3.94) and affine–metrical (3.95) models, respectively:

$$\mathbf{T}^{sj}_{\text{met-aff}} = \mathbf{T}^{sj}_{\text{aff-aff}}[A \to I + A] + \frac{I\hbar^2}{2(I^2 - A^2)}\, s(s + 1),$$

$$\mathbf{T}^{sj}_{\text{aff-met}} = \mathbf{T}^{sj}_{\text{aff-aff}}[A \to I + A] + \frac{I\hbar^2}{2(I^2 - A^2)}\, j(j + 1),$$

where, obviously, $\mathbf{T}^{sj}_{\text{aff-aff}}[A \to I + A]$ is obtained from $\mathbf{T}^{sj}_{\text{aff-aff}}$ (3.96) simply by replacing $A$ with $\alpha = I + A$. The doubly isotropic d'Alembert model in three dimensions has the following form:

$$\mathbf{T}^{sj}_{\text{d.A}} f^{sj} = -\frac{\hbar^2}{2I}\mathbf{D}_l f^{sj} + \frac{1}{4I}\sum_{a=1}^{3}\frac{(S^s_a)^2 f^{sj} - 2 S^s_a f^{sj} S^j_a + f^{sj} (S^j_a)^2}{(Q^b - Q^c)^2} + \frac{1}{4I}\sum_{a=1}^{3}\frac{(S^s_a)^2 f^{sj} + 2 S^s_a f^{sj} S^j_a + f^{sj} (S^j_a)^2}{(Q^b + Q^c)^2}, \tag{3.97}$$

with the same convention as previously, $\mathbf{D}_l$ given by (3.87), and explicitly

$$P_l = \left|\left((Q^2)^2 - (Q^3)^2\right)\left((Q^3)^2 - (Q^1)^2\right)\left((Q^1)^2 - (Q^2)^2\right)\right|.$$
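The statement that the matrices $S^j_a$ have magnitude $\hbar^2 j(j+1)$ can be illustrated with a short numerical sketch. It assumes the standard Hermitian physics convention for spin matrices and units with $\hbar = 1$; the helper names (`spin_matrices`, `casimir`) are hypothetical and are not taken from the chapter.

```python
import math

HBAR = 1.0  # assumption: units with hbar = 1


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matadd(a, b):
    """Entrywise sum of two square matrices."""
    n = len(a)
    return [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]


def scale(c, a):
    """Multiply every entry of a matrix by the scalar c."""
    return [[c * x for x in row] for row in a]


def spin_matrices(two_j):
    """Hermitian (S_x, S_y, S_z) for j = two_j / 2, basis ordered m = j, ..., -j."""
    j = two_j / 2
    dim = two_j + 1
    ms = [j - i for i in range(dim)]
    sz = [[HBAR * ms[i] if i == k else 0.0 for k in range(dim)]
          for i in range(dim)]
    # Raising operator: S_+ |j, m> = hbar * sqrt(j(j+1) - m(m+1)) |j, m+1>
    sp = [[0j] * dim for _ in range(dim)]
    for col in range(1, dim):
        m = ms[col]
        sp[col - 1][col] = HBAR * math.sqrt(j * (j + 1) - m * (m + 1))
    # Lowering operator is the Hermitian conjugate of S_+
    sm = [[sp[k][i].conjugate() for k in range(dim)] for i in range(dim)]
    sx = scale(0.5, matadd(sp, sm))
    sy = scale(-0.5j, matadd(sp, scale(-1, sm)))
    return sx, sy, sz


def casimir(two_j):
    """Sum over a of (S_a)^2; expected to equal hbar^2 j(j+1) times identity."""
    sx, sy, sz = spin_matrices(two_j)
    return matadd(matadd(matmul(sx, sx), matmul(sy, sy)), matmul(sz, sz))
```

Under these assumptions `casimir(1)` is $\tfrac{3}{4}$ times the $2 \times 2$ identity ($j = 1/2$) and `casimir(2)` is twice the $3 \times 3$ identity ($j = 1$).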
Let us mention that in principle half-integer angular momentum of extended objects may be formally introduced by replacing the group $GL(3, \mathbb{R})$ by its universal covering $\overline{GL(3, \mathbb{R})}$. There are some indications that the physical usefulness of such models is not excluded. Formally, the procedure is as follows. In (3.89) specialized to $n = 3$ we replace the group $SO(3, \mathbb{R})$ by its covering $SU(2)$ and write the following expression involving the known Wigner matrices $D^s_{mn}$:

$$\Psi(u, D, v) = \sum_{s,j}\ \sum_{m,n=-s}^{s}\ \sum_{k,l=-j}^{j} D^s_{mn}(u)\, f^{sj}_{\substack{nk\\ ml}}(D)\, D^j_{kl}(v^{-1}), \tag{3.98}$$

where $u, v \in SU(2)$, $D \in \mathrm{Diag}(3, \mathbb{R}) \subset GL(3, \mathbb{R})$, and both integer and half-integer values of $s$, $j$ are admissible. However, if the function of triples $(u, D, v)$ is to represent a function on $\overline{GL^+(3, \mathbb{R})}$, then the values of $s$, $j$ in the above series must have the same "halfness", i.e. either both $s$, $j$ in (3.98) are integers or both are non-integers. No superposition between elements of these two function spaces is admitted (a kind of superselection rule). The point is that for such "halfness-mixing" superpositions the squared modulus $\overline{\Psi}\Psi$ would be two-valued from the point of view of $SO(3, \mathbb{R})$. This would violate the probabilistic interpretation of $\Psi$ in $GL^+(3, \mathbb{R})$. If there is no mixing, then in the case of superposing over half-integer $s$, $j$ in (3.98) the resulting $\Psi$ is two-valued on $GL^+(3, \mathbb{R})$, i.e. it does not project from $\overline{GL^+(3, \mathbb{R})}$ to $GL^+(3, \mathbb{R})$, but $\overline{\Psi}\Psi$ does project, i.e. it is single-valued in $GL^+(3, \mathbb{R})$.

The simplest possible situation in (3.96) and (3.97) is $s = j = 0$, i.e. a purely scalar amplitude $f^{00}$. Then $\mathbf{T}^{00}$ reduces, respectively, to

$$\mathbf{T}^{00} = -\frac{\hbar^2}{2A}\mathbf{D}_\lambda + \frac{\hbar^2 B}{2A(A + 3B)}\frac{\partial^2}{\partial q^2}, \qquad \mathbf{T}^{00} = -\frac{\hbar^2}{2I}\mathbf{D}_l,$$

i.e. there is no direct contribution from internal degrees of freedom. If we admit half-integers, then the next simple situation is $s = j = 1/2$. Then $S^{1/2}_a = (\hbar/2)\sigma_a$, where $\sigma_a$ are the Pauli matrices. Therefore, $(S^{1/2}_a)^2 = (\hbar^2/4) I_2$.

Finally, let us briefly describe the two-dimensional situation, i.e. "Flatland" [63], $n = 2$. Obviously, it may have some direct physical applications when we deal with flat molecules or other structural elements. Besides, the two-dimensional models shed some light on the general situation and make it more comprehensible. Indeed, let us observe that the expressions (3.65) and (3.68) (without the last $p^2$-term) are superpositions of two-dimensional clusters corresponding to all possible $\mathbb{R}^2$-subspaces in $\mathbb{R}^n$. Obviously, these terms are in general non-disjoint, and for $n = 3$ they simply cannot be disjoint (all two-dimensional linear subspaces in $\mathbb{R}^3$ have intersections of dimension higher than zero; if different, they always intersect along one-dimensional linear subspaces). There are some very exceptional features of the dimension $n = 2$. They are very peculiar, in a sense pathological. But nevertheless the resulting simplifications
generate some ideas and hypotheses concerning the general dimension. Of course, later on they must be verified on an independent basis. Let us begin with the classical description. The one-dimensional group of planar rotations $SO(2, \mathbb{R})$ is Abelian, therefore $\hat{\rho} = \rho = S$, $\hat{\tau} = \tau = -V$. In doubly isotropic models $S$ and $V$ are constants of motion, and so are $\hat{\rho}$, $\hat{\tau}$, $M$, $N$ if $n = 2$. This is not the case for $n > 2$, where, as always in isotropic models, $S$, $V$ are constants of motion but $\hat{\rho}$, $\hat{\tau}$ do not equal $S$, $-V$ and are non-constant. But it is exactly the use of $\hat{\rho}$, $\hat{\tau}$ and their combinations $M$, $N$ that simplifies the problem and leads to a partial separation of variables. In two-dimensional space these things coincide and the problem may be effectively reduced to the dynamics of two deformation invariants both on the classical and quantum level. The two-polar decomposition $\varphi = LDR^{-1}$ will be parameterized in a standard way; using the matrix language we have

$$L = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix}, \qquad R = \begin{pmatrix} \cos\beta & -\sin\beta \\ \sin\beta & \cos\beta \end{pmatrix}, \qquad D = \begin{pmatrix} Q^1 & 0 \\ 0 & Q^2 \end{pmatrix} = \begin{pmatrix} \exp q^1 & 0 \\ 0 & \exp q^2 \end{pmatrix}.$$

To separate the dilatational and incompressible motion we introduce new variables:

$$q = \frac{1}{2}(q^1 + q^2), \qquad x = q^2 - q^1.$$

Their conjugate momenta are given by

$$p = p_1 + p_2, \qquad p_x = \frac{1}{2}(p_2 - p_1).$$

Angular velocities are given by the following matrices:

$$\chi = \frac{dL}{dt} L^{-1} = L^{-1}\frac{dL}{dt} = \hat{\chi} = \frac{d\alpha}{dt}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad \vartheta = \frac{dR}{dt} R^{-1} = R^{-1}\frac{dR}{dt} = \hat{\vartheta} = \frac{d\beta}{dt}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$

Spin and vorticity essentially coincide with the canonical conjugate momenta $p_\alpha$, $p_\beta$, i.e.

$$S = \rho = \hat{\rho} = p_\alpha\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad V = -\tau = -\hat{\tau} = p_\beta\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$
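The two-polar parameterization above can be explored numerically. The following is a minimal sketch in plain Python, with hypothetical helper names (`rot`, `two_polar`, `rotation_generator`), that builds $\varphi = LDR^{-1}$ from $(\alpha; q^1, q^2; \beta)$ and approximates $(dL/d\alpha)L^{-1}$ by a central difference; it is meant only as an illustration of the formulas, not as part of the original derivation.

```python
import math


def rot(angle):
    """Planar rotation matrix, the parameterization used for L and R."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]


def mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def two_polar(alpha, q1, q2, beta):
    """phi = L D R^{-1} with D = diag(exp q1, exp q2)."""
    d = [[math.exp(q1), 0.0], [0.0, math.exp(q2)]]
    # For rotations the inverse equals the transpose.
    return mul(mul(rot(alpha), d), transpose(rot(beta)))


def rotation_generator(angle, h=1e-6):
    """Central-difference estimate of (dL/d alpha) L^{-1}."""
    plus, minus = rot(angle + h), rot(angle - h)
    dl = [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)]
          for i in range(2)]
    return mul(dl, transpose(rot(angle)))
```

For any angles, `det(two_polar(alpha, q1, q2, beta))` should equal $\exp(q^1 + q^2) = \exp(2q)$, so the variable $q$ alone controls dilatation, while `rotation_generator(alpha)` should approximate the antisymmetric matrix multiplying $d\alpha/dt$ in $\chi$.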
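With this convention the pairing between velocities and momenta has the form:

$$p_\alpha\frac{d\alpha}{dt} = \frac{1}{2}\operatorname{Tr}(S\chi), \qquad p_\beta\frac{d\beta}{dt} = \frac{1}{2}\operatorname{Tr}(V\vartheta).$$

The diagonalizing quantities $M = -\hat{\rho} - \hat{\tau}$, $N = \hat{\rho} - \hat{\tau}$ are also expressed by matrices

$$M = m\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad N = n\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},$$

where

$$m = p_\beta - p_\alpha, \qquad n = p_\beta + p_\alpha.$$

In some formulas it is convenient to use modified variables

$$\gamma = \frac{1}{2}(\beta - \alpha), \qquad \delta = \frac{1}{2}(\beta + \alpha).$$

Their conjugate momenta just coincide with the above $m$, $n$, i.e.

$$p_\gamma = m, \qquad p_\delta = n.$$

The magnitudes of $S$, $V$ have the form:

$$\|S\| = |p_\alpha| = \frac{1}{2}|n - m|, \qquad \|V\| = |p_\beta| = \frac{1}{2}|n + m|.$$

As mentioned, $p_\alpha$, $p_\beta$, $m$, $n$ are constants of motion because in doubly isotropic models $\alpha$, $\beta$ are cyclic variables. The corresponding affine–affine, metrical–affine, and affine–metrical kinetic energies of internal degrees of freedom are, respectively, given by

$$T^{\text{aff-aff}} = \frac{1}{2A}(p_1^2 + p_2^2) - \frac{B}{2A(A + 2B)}\,p^2 + \frac{1}{16A}\frac{m^2}{\operatorname{sh}^2\frac{q^2 - q^1}{2}} - \frac{1}{16A}\frac{n^2}{\operatorname{ch}^2\frac{q^2 - q^1}{2}},$$

$$T^{\text{met-aff}} = T^{\text{aff-aff}}[A \to I + A] + \frac{I}{8(I^2 - A^2)}(n - m)^2,$$

$$T^{\text{aff-met}} = T^{\text{aff-aff}}[A \to I + A] + \frac{I}{8(I^2 - A^2)}(n + m)^2,$$

where, as usual, $T^{\text{aff-aff}}[A \to I + A]$ denotes $T^{\text{aff-aff}}$ with $A$ replaced by $I + A$.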
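Separating dilatational and incompressible motion we obtain, respectively, the following expressions:

$$T^{\text{aff-aff}}_{\text{int}} = \frac{p^2}{4(A + 2B)} + \frac{p_x^2}{A} + \frac{(p_\alpha - p_\beta)^2}{16A\operatorname{sh}^2\frac{x}{2}} - \frac{(p_\alpha + p_\beta)^2}{16A\operatorname{ch}^2\frac{x}{2}},$$

$$T^{\text{met-aff}}_{\text{int}} = T^{\text{aff-aff}}_{\text{int}}[A \to I + A] + \frac{I p_\alpha^2}{I^2 - A^2},$$

$$T^{\text{aff-met}}_{\text{int}} = T^{\text{aff-aff}}_{\text{int}}[A \to I + A] + \frac{I p_\beta^2}{I^2 - A^2}.$$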
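The passage from the variables $(q^1, q^2, p_1, p_2)$ to $(q, x, p, p_x)$ in the affine–affine kinetic energy can be checked numerically. The sketch below, with hypothetical function names, evaluates both forms at an arbitrary phase-space point; it assumes the classical convention $m = p_\beta - p_\alpha$, $n = p_\beta + p_\alpha$ used in the text.

```python
import math


def t_aff_aff(p1, p2, q1, q2, p_alpha, p_beta, A, B):
    """Affine-affine kinetic energy in the variables (q1, q2, p1, p2)."""
    m, n = p_beta - p_alpha, p_beta + p_alpha
    p = p1 + p2
    half = 0.5 * (q2 - q1)
    return ((p1 ** 2 + p2 ** 2) / (2 * A)
            - B * p ** 2 / (2 * A * (A + 2 * B))
            + m ** 2 / (16 * A * math.sinh(half) ** 2)
            - n ** 2 / (16 * A * math.cosh(half) ** 2))


def t_int_aff_aff(p, px, x, p_alpha, p_beta, A, B):
    """The same energy after separating dilatation (q, p) from shear (x, px)."""
    return (p ** 2 / (4 * (A + 2 * B))
            + px ** 2 / A
            + (p_alpha - p_beta) ** 2 / (16 * A * math.sinh(x / 2) ** 2)
            - (p_alpha + p_beta) ** 2 / (16 * A * math.cosh(x / 2) ** 2))


def to_separated(p1, p2, q1, q2):
    """Map (p1, p2, q1, q2) to (q, x, p, px) as defined in the text."""
    return 0.5 * (q1 + q2), q2 - q1, p1 + p2, 0.5 * (p2 - p1)
```

The agreement rests on $p_1^2 + p_2^2 = p^2/2 + 2p_x^2$ and on $\frac{1}{4A} - \frac{B}{2A(A+2B)} = \frac{1}{4(A+2B)}$, so the dilatational momentum $p$ enters only through the combination $A + 2B$.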
Canonical momenta $p_\alpha$, $p_\beta$, or equivalently $m$, $n$, are constants of motion, and their Poisson brackets with the variables $q$, $x$, $p$, $p_x$ vanish. Therefore, if we are interested only in the evolution of the variables $q$, $x$ but not in that of $\alpha$, $\beta$, we can simply replace $p_\alpha$, $p_\beta$, $m$, $n$ in the above expressions by constants characterizing a given family of solutions. They are effective coupling constants for the interaction between the deformation invariants $q^1$, $q^2$. The $\operatorname{sh}^{-2}(x/2)$-term controlled by $m$ is always repulsive and singular at the coincidence $x = 0$ (nondeformed shape, but dilatation admitted). The $\operatorname{ch}^{-2}(x/2)$-term controlled by $n$ is attractive and finite at $x = 0$. At large "distances" of deformation invariants, $|x| \to \infty$, attraction prevails if and only if $|n| > |m|$, i.e. if $p_\alpha$, $p_\beta$ have the same sign, $p_\alpha p_\beta > 0$. If $|n| < |m|$, i.e. $p_\alpha p_\beta < 0$, then the time evolution of $x$ is unbounded. This is just the very special ($n = 2$) example of what was said earlier, namely that in the incompressible and affinely invariant geodetic regime there exists an open family of bounded motions ("elastic vibrations") and an open family of unbounded motions ("dissociation", decay). If the total deformative motion is to be bounded, then some dilatation-stabilizing potential $V(q)$ must be included in the Hamiltonian. But even if there is no $x$-dependent potential, our affine geodetic model in the non-compact configuration space of incompressible motion may encode bounded elastic vibrations. The same is true for $n > 2$; however, the situation is then more complicated because $M_{ab}$, $N_{ab}$ are not constants of motion and also undergo some vibrations.

Analogous statements are true on the quantum level. The Haar measure in our coordinates is given by $d\lambda(\alpha; q, x; \beta) = |\operatorname{sh} x|\,dq\,dx\,d\alpha\,d\beta$; its weight factor equals $P_\lambda = |\operatorname{sh} x|$. The wave functions are expanded in the double Fourier series:

$$\Psi(\alpha; q, x; \beta) = \sum_{m,n \in \mathbb{Z}} f^{mn}(q, x)\, e^{im\alpha} e^{in\beta}.$$

This is obviously the Peter–Weyl theorem specialized to the two-dimensional torus group $T^2$. Our integers $m, n \in \mathbb{Z}$ are just the labels $\alpha$, $\beta$ from the general theory.
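The classical discussion of the $\operatorname{sh}^{-2}$ and $\operatorname{ch}^{-2}$ terms can be illustrated by evaluating the effective shear potential directly. The sketch below uses hypothetical names (`v_eff`, `v_tail`) and the classical convention $m = p_\beta - p_\alpha$, $n = p_\beta + p_\alpha$; the tail formula is a large-$|x|$ approximation following from $\operatorname{sh}^{-2}(x/2) \approx \operatorname{ch}^{-2}(x/2) \approx 4e^{-|x|}$ and $m^2 - n^2 = -4p_\alpha p_\beta$.

```python
import math


def v_eff(x, p_alpha, p_beta, A):
    """Effective potential m^2/(16A sh^2(x/2)) - n^2/(16A ch^2(x/2))."""
    m, n = p_beta - p_alpha, p_beta + p_alpha
    return (m ** 2 / math.sinh(x / 2) ** 2
            - n ** 2 / math.cosh(x / 2) ** 2) / (16 * A)


def v_tail(x, p_alpha, p_beta, A):
    """Leading large-|x| behaviour: -(p_alpha * p_beta / A) * exp(-|x|)."""
    return -p_alpha * p_beta * math.exp(-abs(x)) / A
```

With these definitions the tail is negative (attractive) when $p_\alpha p_\beta > 0$ and positive (repulsive) when $p_\alpha p_\beta < 0$, while near $x = 0$ the $\operatorname{sh}^{-2}$ term dominates whenever $m \neq 0$.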
For the affine–affine model the reduced operator of the kinetic energy is given by

$$\mathbf{T}^{mn}_{\text{aff-aff}} f^{mn} = -\frac{\hbar^2}{A}\mathbf{D}_\lambda f^{mn} - \frac{\hbar^2}{4(A + 2B)}\frac{\partial^2 f^{mn}}{\partial q^2} + \frac{\hbar^2 (n - m)^2}{16A\operatorname{sh}^2\frac{x}{2}} f^{mn} - \frac{\hbar^2 (n + m)^2}{16A\operatorname{ch}^2\frac{x}{2}} f^{mn},$$

where now $\mathbf{D}_\lambda$ is expressed as follows:

$$\mathbf{D}_\lambda = \frac{1}{|\operatorname{sh} x|}\frac{\partial}{\partial x}\left(|\operatorname{sh} x|\frac{\partial}{\partial x}\right).$$

Similarly, for the metric–affine and affine–metric models we obtain, respectively,

$$\mathbf{T}^{mn}_{\text{met-aff}} = \mathbf{T}^{mn}_{\text{aff-aff}}[A \to I + A] + \frac{I\hbar^2 m^2}{I^2 - A^2},$$

$$\mathbf{T}^{mn}_{\text{aff-met}} = \mathbf{T}^{mn}_{\text{aff-aff}}[A \to I + A] + \frac{I\hbar^2 n^2}{I^2 - A^2}.$$

To avoid a purely continuous spectrum one must include in the Hamiltonian at least some dilatation-stabilizing potential $V(q)$. The problem is then (as usual) separable in the $(x, q)$-variables. Just as on the classical level, for affinely invariant incompressible dynamics, i.e. for the $x$-sector of the above operators, there exists a discrete spectrum if $|n + m| > |n - m|$, i.e. if $mn > 0$. This is the quantum bounded motion. In three-dimensional problems the above condition will be replaced by a more complicated one between the quantum numbers labelling the reduced amplitudes $f$. Let us observe that in more general, not necessarily geodetic, problems in two dimensions with explicitly separable potentials $V(q, x) = V_{\text{dil}}(q) + V_{\text{sh}}(x)$, the Schrödinger equation $\mathbf{H}^{mn} f^{mn} = E f^{mn}$, where $\mathbf{H}^{mn} = \mathbf{T}^{mn} + V_{\text{dil}}(q) + V_{\text{sh}}(x)$ with any of the above $\mathbf{T}^{mn}$, splits into two one-dimensional Schrödinger equations. The reduced wave function is sought in the form $f^{mn}(q, x) = \varphi^{mn}(x)\chi(q)$, where $\varphi^{mn}$, $\chi$ satisfy the following eigenequations:

$$\mathbf{H}^{mn}_{\text{sh}}\varphi^{mn} = E_{\text{sh}}\varphi^{mn}, \qquad \mathbf{H}_{\text{dil}}\chi = -\frac{\hbar^2}{4(A + 2B)}\frac{d^2\chi}{dq^2} + V_{\text{dil}}\chi = E_{\text{dil}}\chi.$$
Here the shear-rotational Hamilton operator $\mathbf{H}^{mn}_{\text{sh}}$ is given by

$$\mathbf{H}^{mn}_{\text{sh-aff-aff}} = -\frac{\hbar^2}{A}\mathbf{D}_\lambda + \frac{\hbar^2 (n - m)^2}{16A\operatorname{sh}^2\frac{x}{2}} - \frac{\hbar^2 (n + m)^2}{16A\operatorname{ch}^2\frac{x}{2}} + V_{\text{sh}}(x)$$

in the affine–affine model, and by

$$\mathbf{H}^{mn}_{\text{sh-met-aff}} = \mathbf{H}^{mn}_{\text{sh-aff-aff}} + \frac{I\hbar^2 m^2}{I^2 - A^2}, \qquad \mathbf{H}^{mn}_{\text{sh-aff-met}} = \mathbf{H}^{mn}_{\text{sh-aff-aff}} + \frac{I\hbar^2 n^2}{I^2 - A^2},$$

respectively, in the metric–affine and affine–metric models. Obviously, the total energy equals $E = E_{\text{sh}} + E_{\text{dil}}$. It is seen that the main point of the analysis is the affine–affine model, because with fixed $m$, $n$ the other ones differ from it by ($m$, $n$-dependent) c-numbers. Just as in the classical model, for $|n + m| > |n - m|$, i.e. $nm > 0$, the "centrifugal" term

$$V_{\text{cfg}} := \frac{\hbar^2 (n - m)^2}{16A\operatorname{sh}^2\frac{x}{2}} - \frac{\hbar^2 (n + m)^2}{16A\operatorname{ch}^2\frac{x}{2}}$$

is singular repulsive at $x = 0$ and finite-attractive for $|x| \to \infty$. Then even for the purely geodetic incompressible model ($V_{\text{sh}} = 0$) there exist bound states and a discrete energy spectrum for $E_{\text{sh}}$. For $nm < 0$ the energy spectrum is continuous (scattering states).

Finally, let us quote the corresponding formulas for the quantized d'Alembert model. Using the same notation as above we have the following expression for the classical kinetic Hamiltonian:

$$T^{\text{d.A}}_{\text{int}} = \frac{1}{2I}(P_1^2 + P_2^2) + \frac{1}{4I}\frac{m^2}{(Q^1 - Q^2)^2} + \frac{1}{4I}\frac{n^2}{(Q^1 + Q^2)^2}.$$

With fixed values of $m$, $n$ the problem reduces again to the dynamics of the deformation invariants $Q^1$, $Q^2$. In the two-polar coordinates the Lebesgue measure element is given by

$$dl(\alpha; Q^1, Q^2; \beta) = P_l(Q^1, Q^2)\,dQ^1\,dQ^2\,d\alpha\,d\beta,$$

where

$$P_l = \left|(Q^1)^2 - (Q^2)^2\right| = \left|(Q^1 + Q^2)(Q^1 - Q^2)\right|.$$

The reduced amplitudes $f^{mn}$ satisfy the eigenequations

$$\mathbf{H}^{mn} f^{mn} = \mathbf{T}^{mn} f^{mn} + V(Q^1, Q^2) f^{mn} = E^{mn} f^{mn}$$

with

$$\mathbf{T}^{mn} f^{mn} = -\frac{\hbar^2}{2I}\mathbf{D}_l f^{mn} + \frac{\hbar^2 m^2}{4I(Q^1 - Q^2)^2} f^{mn} + \frac{\hbar^2 n^2}{4I(Q^1 + Q^2)^2} f^{mn}, \tag{3.99}$$
where

$$\mathbf{D}_l = \frac{1}{P_l}\frac{\partial}{\partial Q^1}\left(P_l\frac{\partial}{\partial Q^1}\right) + \frac{1}{P_l}\frac{\partial}{\partial Q^2}\left(P_l\frac{\partial}{\partial Q^2}\right).$$

The coordinates $Q^1$, $Q^2$ are very badly non-separable even in the kinetic energy expression itself. There are, however, other coordinates on the plane of deformation invariants which are much better from this point of view. The simplest ones are the coordinates $Q^+$, $Q^-$ obtained from $Q^1$, $Q^2$ by the rotation by the angle $\pi/4$,

$$Q^+ := \frac{1}{\sqrt{2}}(Q^1 + Q^2), \qquad Q^- := \frac{1}{\sqrt{2}}(Q^1 - Q^2),$$

where $Q^+$ and $Q^-$ may be expressed in terms of polar and elliptic coordinates, respectively, as

$$Q^+ = r\cos\varphi, \qquad Q^- = r\sin\varphi,$$

and

$$Q^+ = \operatorname{ch}\rho\cos\lambda, \qquad Q^- = \operatorname{sh}\rho\sin\lambda.$$

In all these variables the Hamilton–Jacobi and Schrödinger equations without potential are separable. Obviously, the geodetic d'Alembert model $T = (I/2)\operatorname{Tr}(\dot{\varphi}^T\dot{\varphi})$ is completely non-physical. However, the coordinate systems $(Q^+, Q^-)$, $(r, \varphi)$, $(\rho, \lambda)$ enable one to find a class of potentials which are physically realistic and for which, at the same time, both the Hamilton–Jacobi and Schrödinger equations are separable. The reduced Schrödinger eigenproblem (3.99) with doubly isotropic potentials is separable if

$$V(Q^1, Q^2) = V_+(Q^+) + V_-(Q^-).$$

Namely, we have then $\mathbf{H}^{mn} = \mathbf{H}^{mn}_+ + \mathbf{H}^{mn}_-$, where

$$\mathbf{H}^{mn}_+ f^{mn}_+ = -\frac{\hbar^2}{2I}\left(\frac{\partial^2}{\partial (Q^+)^2} + \frac{1}{Q^+}\frac{\partial}{\partial Q^+}\right) f^{mn}_+ + \frac{\hbar^2 (m - n)^2}{8I (Q^+)^2} f^{mn}_+ + V_+ f^{mn}_+,$$

$$\mathbf{H}^{mn}_- f^{mn}_- = -\frac{\hbar^2}{2I}\left(\frac{\partial^2}{\partial (Q^-)^2} + \frac{1}{Q^-}\frac{\partial}{\partial Q^-}\right) f^{mn}_- + \frac{\hbar^2 (m + n)^2}{8I (Q^-)^2} f^{mn}_- + V_- f^{mn}_-,$$

where

$$\mathbf{H}^{mn}_+ f^{mn}_+ = E^{mn}_+ f^{mn}_+, \qquad \mathbf{H}^{mn}_- f^{mn}_- = E^{mn}_- f^{mn}_-,$$

and

$$f^{mn}(Q^1, Q^2) = f^{mn}_+(Q^+)\, f^{mn}_-(Q^-), \qquad E^{mn} = E^{mn}_+ + E^{mn}_-.$$
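Obviously, the volume element is given by

$$dl(\alpha; Q^+, Q^-; \beta) = 2|Q^+||Q^-|\,dQ^+\,dQ^-\,d\alpha\,d\beta.$$

The doubly isotropic models separable in the coordinates $(r, \varphi)$ are based on potentials of the form

$$V(r, \varphi) = V_r(r) + \frac{1}{r^2}V_\varphi(\varphi).$$

The wave functions are factorized as follows:

$$\Psi = e^{im\alpha} e^{in\beta} f^{mn}(r, \varphi) = e^{im\alpha} e^{in\beta} R^{mn}(r)\,\Phi^{mn}(\varphi),$$

and then of course $S = \hat{r} = \hbar m$, $V = -\hat{t} = \hbar n$. The reduced Hamiltonian has the form:

$$\mathbf{H}^{mn} = \mathbf{H}^{mn}_r + \frac{1}{r^2}\mathbf{H}^{mn}_\varphi,$$

where

$$\mathbf{H}^{mn}_r = -\frac{\hbar^2}{2I}\left(\frac{\partial^2}{\partial r^2} + \frac{3}{r}\frac{\partial}{\partial r}\right) + V_r$$

is as a matter of fact independent of $m$, $n$; unlike this, $\mathbf{H}^{mn}_\varphi$ depends explicitly on $m$, $n$:

$$\mathbf{H}^{mn}_\varphi = -\frac{\hbar^2}{2I}\left(\frac{\partial^2}{\partial\varphi^2} + 2\operatorname{ctg}(2\varphi)\frac{\partial}{\partial\varphi}\right) + \frac{\hbar^2}{2I}\,\frac{m^2 + 2mn\cos(2\varphi) + n^2}{\sin^2(2\varphi)} + V_\varphi.$$

The functions $\Phi^{mn}$, $R^{mn}$ satisfy the eigenequations

$$\mathbf{H}^{mn}_\varphi\Phi^{mn} = E^{mn}_\varphi\Phi^{mn}, \tag{3.100}$$

$$\mathbf{H}^{mn}_r R^{mn} + \frac{1}{r^2}E^{mn}_\varphi R^{mn} = E R^{mn}. \tag{3.101}$$

The $\Phi$-equation (3.100) is to be solved first. Then the resulting quantized values of $E^{mn}_\varphi$, labelled by an additional quantum number $k$, are to be substituted into (3.101); because of this the labels $m$, $n$ appear in $R$ although the radial operator $\mathbf{H}^{mn}_r$, in spite of the notation used, is independent of $m$, $n$. There are also two additional quantum numbers in $R$, namely $k$ itself, appearing through $E^{mnk}_\varphi$, and the proper radial quantum number $\mu$; thus, $E$ obtained from (3.101) will be denoted by $E^{mnk\mu}$.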
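The $m$, $n$-dependent term of the angular operator can be cross-checked against the centrifugal terms written in the $(Q^+, Q^-)$ coordinates. The following sketch, with hypothetical function names and $\hbar$ left as a parameter, evaluates both expressions at a point related by $Q^+ = r\cos\varphi$, $Q^- = r\sin\varphi$.

```python
import math


def centrifugal_pm(q_plus, q_minus, m, n, inertia, hbar=1.0):
    """Sum of the (m - n)^2 / Q+^2 and (m + n)^2 / Q-^2 terms."""
    return hbar ** 2 * ((m - n) ** 2 / (8 * inertia * q_plus ** 2)
                        + (m + n) ** 2 / (8 * inertia * q_minus ** 2))


def centrifugal_polar(r, phi, m, n, inertia, hbar=1.0):
    """The (1/r^2)-weighted m, n-dependent term of the angular operator."""
    numerator = m ** 2 + 2 * m * n * math.cos(2 * phi) + n ** 2
    return hbar ** 2 * numerator / (2 * inertia * r ** 2 * math.sin(2 * phi) ** 2)
```

The two functions agree because $(m-n)^2\sin^2\varphi + (m+n)^2\cos^2\varphi = m^2 + n^2 + 2mn\cos 2\varphi$ and $4\sin^2\varphi\cos^2\varphi = \sin^2 2\varphi$.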
Let us quote a very interesting model, qualitatively compatible with the standard demands of macroscopic nonlinear elasticity,

$$V = \frac{2k}{r^2\cos(2\varphi)} + \frac{k}{2}r^2 = k\left(\frac{1}{D_1 D_2} + \frac{D_1^2 + D_2^2}{2}\right).$$

In the natural state of elastic equilibrium $D_1 = D_2 = 1$, i.e. $r = \sqrt{2}$, $\varphi = 0$. We do not quote the more complicated and rather less useful one-dimensional equations for the elliptic coordinates $\rho$, $\lambda$. Let us only mention the general shape of separable doubly isotropic potentials

$$V(\rho, \lambda) = \frac{V_\rho(\rho)}{2(\operatorname{ch}^2\rho - \cos^2\lambda)} + \frac{V_\lambda(\lambda)}{2(\operatorname{ch}^2\rho - \cos^2\lambda)}.$$
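The two forms of the elastic model potential quoted above can be compared numerically. This is a minimal sketch with hypothetical function names; it assumes that $D_1$, $D_2$ stand for the deformation invariants $Q^1$, $Q^2$, so that $Q^\pm = (D_1 \pm D_2)/\sqrt{2}$, $r^2 = D_1^2 + D_2^2$ and $r^2\cos 2\varphi = 2D_1 D_2$.

```python
import math


def v_polar(r, phi, k):
    """Model potential in the polar coordinates (r, phi)."""
    return 2 * k / (r ** 2 * math.cos(2 * phi)) + 0.5 * k * r ** 2


def v_stretch(d1, d2, k):
    """The same potential written in the principal stretches D1, D2."""
    return k * (1.0 / (d1 * d2) + 0.5 * (d1 ** 2 + d2 ** 2))


def to_polar(d1, d2):
    """(D1, D2) -> (r, phi) via Q+ = r cos(phi), Q- = r sin(phi)."""
    q_plus = (d1 + d2) / math.sqrt(2)
    q_minus = (d1 - d2) / math.sqrt(2)
    return math.hypot(q_plus, q_minus), math.atan2(q_minus, q_plus)
```

Setting the partial derivatives of `v_stretch` with respect to $D_1$ and $D_2$ to zero gives $D_1^3 D_2 = D_1 D_2^3 = 1$, hence the stationary point $D_1 = D_2 = 1$ with $V = 2k$.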
Finally, we quote a three-parameter family of doubly isotropic potentials for which both the classical and quantum problems are simultaneously separable in all the aforementioned coordinate systems:

$$V = \frac{A}{(Q^+)^2} + \frac{B}{(Q^-)^2} + C\left((Q^+)^2 + (Q^-)^2\right) = \frac{2A}{(Q^1 + Q^2)^2} + \frac{2B}{(Q^1 - Q^2)^2} + C\left((Q^1)^2 + (Q^2)^2\right).$$

Here $A$, $B$, $C$ are arbitrary constants. It is well known that the simultaneous separability of the Hamilton–Jacobi and Schrödinger equations in several coordinate systems has to do with degeneracy and hidden symmetries. Two-dimensional models are interesting not only from the philosophical point of view of the "Flatland" geometry. They may be practically useful in the theory of surfaces of structured bodies and in the dynamics of elongated molecules or other structural elements.

ACKNOWLEDGEMENTS

The chapter presented here as a contribution to this book has a long history. It was initiated during my stay in Pisa in 2001 at the Istituto Nazionale di Alta Matematica "Francesco Severi", Università di Pisa, with Professor Gianfranco Capriz, and owes very much to my discussions with him and with Professor Carmine Trimarco. In a sense, it should be interpreted as a document of our common work. Later on, during the visit of Professor Paolo Mariano at our Institute of Fundamental Technological Research in Warsaw, we discussed the material contained here, and I must say that this intellectual interaction influenced its final shape in a very deep way. Certain parts were prepared during my stay in Berlin in 2004 at the Institute of Theoretical Physics, Berlin Technical University, with Professors K. E. Hellwig and H. H. von Borzeszkowski, and the discussions with my German colleagues were very essential for me. There is, perhaps unfortunately, no intellectual work without financial support. I was blessed in this sense by the help of the Istituto Nazionale di Alta Matematica "Francesco Severi", Gruppo Nazionale per la Fisica Matematica Firenze in Pisa, and by the Alexander von Humboldt Stiftung in Berlin. I am really very grateful to all the mentioned professors and institutions.
REFERENCES

1. O. I. Bogoyavlensky, Methods of Qualitative Theory of Dynamical Systems in Astrophysics and Gas Dynamics, Springer, Berlin Heidelberg New York, 1985.
2. G. Capriz, Continua with Substructure, Phys. Mesomech., 3 (2000), 5–14, 37–50.
3. S. Chandrasekhar, Ellipsoidal Figures of Equilibrium, Yale Univ. Press, New Haven London, 1969.
4. D. P. Chevallier, On the foundations of ordinary and generalized rigid body dynamics and the principle of objectivity, Arch. Mech., 56 (4) (2004), 313–353.
5. H. Cohen, Pseudo-rigid bodies, Utilitas Math., 20 (1981), 221–247.
6. H. Cohen and G. P. Mac Sithigh, Plane motions of elastic pseudo-rigid bodies, J. Elasticity, 14 (1989), 193–226.
7. H. Cohen and G. P. Mac Sithigh, Impulsive motions of elastic pseudo-rigid bodies, ASME J. Appl. Mech., 58 (1991), 1042–1048.
8. H. Cohen and G. P. Mac Sithigh, Symmetry and asymmetry roto-deformations of a symmetrical, isotropic, elastic pseudo-rigid body, Int. J. Nonlinear Mech., 27 (1992), 519–526.
9. H. Cohen and G. P. Mac Sithigh, Impulsive motions of elastic pseudo-rigid bodies. II. Further results and examples, J. Elasticity, 34 (1994), 149–166.
10. H. Cohen and G. P. Mac Sithigh, Collisions of pseudo-rigid bodies: a Brach-type analysis, Int. J. Eng. Sci., 34 (1996), 249–256.
11. H. Cohen and M. G. Muncaster, The Theory of Pseudo-Rigid Bodies, Springer Tracts in Natural Philosophy, Springer, Berlin, 1989.
12. S. Cohen, F. Plasil, and W. J. Swiatecki, Equilibrium configurations of rotating charges or gravitating liquid masses with surface tension. II., Ann. Phys., 82 (1974), 557–596.
13. H. Cohen and Q.-X. Sun, Plane motions of elastic pseudo-rigid pendulums, Sol. Mech. Arch., 13 (1988), 147–176.
14. H. Cohen and Q.-X. Sun, Snapping of an elastic pseudo-rigid bodies, Int. J. Solids Struct., 28 (1991), 807–818.
15. F. J. Dyson, Dynamics of a spinning gas cloud, J. Math. Mech., 18 (1) (1968), 91.
16. A. C. Eringen, Mechanics of micromorphic continua, Proceedings of the IUTAM Symposium on Mechanics of Generalized Continua, Freudenstadt and Stuttgart, 1967, E. Kröner (ed.), Vol. 18, Springer, Berlin Heidelberg New York, 1968, 18–33.
17. F. W. Hehl, E. A. Lord, and Y. Ne'eman, Hadron dilatation, shear and spin as components of the intrinsic hypermomentum. Current and metric-affine theory of gravitation, Phys. Lett., 71B (1977), 432.
18. A. Martens, Dynamics of holonomically constrained affinely-rigid body, Rep. Math. Phys., 49 (2/3) (2002), 295–303.
19. A. Martens, Quantization of affinely-rigid body with constraints, Rep. Math. Phys., 51 (2/3) (2003), 287–295.
20. A. Martens, Hamiltonian dynamics of planar affinely-rigid body, J. Nonlinear Math. Phys., 11 (Suppl) (2004), 145–150.
21. A. Martens, Quantization of the planar affinely-rigid body, J. Nonlinear Math. Phys., 11 (Suppl) (2004), 151–156.
22. O. M. O'Reilly, A properly invariant theory of infinitesimal deformations of an elastic Cosserat point, Z. Angew. Math. Phys., 47 (1996), 179–193.
23. O. M. O'Reilly and P. C. Varadi, A unified treatment of constraints in the theory of a Cosserat point, Z. Angew. Math. Phys., 49 (1998), 205–223.
24. P. Papadopoulos, On a class of higher-order pseudo-rigid bodies, Math. Mech. Solids, 6 (2001), 631–640.
25. M. Roberts, C. Wulff, and J. Lamb, Hamiltonian systems near relative equilibria, J. Diff. Equat., 179 (2002), 562–604.
26. G. Rosensteel and J. Troupe, Nonlinear Collective Nuclear Motion, preprint, arXiv:nucl-th/9801040v1.
27. G. Rosensteel and J. Troupe, Gauge Theory of Riemann Ellipsoids, preprint, arXiv:math-ph/9909031v2.
28. M. B. Rubin, On the theory of a Cosserat point and its application to the numerical solution of continuum problems, ASME J. Appl. Mech., 52 (1985), 368–372.
29. M. B. Rubin, Free vibration of a rectangular parallelepiped using the theory of a Cosserat point, ASME J. Appl. Mech., 53 (1986), 45–50.
30. A. K. Sławianowska, On certain nonlinear many-body problems on lines and circles, Arch. Mech., 41 (5) (1989), 619–640.
31. A. K. Sławianowska and J. J. Sławianowski, Quantization of affinely rigid body in N dimensions, Rep. Math. Phys., 29 (3) (1991), 297–320.
32. J. J. Sławianowski, Analytical Mechanics of Homogeneous Deformations, Prace IPPT-IFTR Reports 8, 1973 (in Polish).
33. J. J. Sławianowski, Newtonian dynamics of homogeneous strains, Arch. Mech., 27 (1) (1975), 93–102.
34. J. J. Sławianowski, Homogeneously deformable body in a curved space, Bulletin de l'Académie Polonaise des Sciences, Série des sciences techniques, 23 (2) (1975), 43–47.
35. J. J. Sławianowski, Deformable gyroscope in a non-Euclidean space. Classical non-relativistic theory, Rep. Math. Phys., 10 (2) (1976), 219–243.
36. J. J. Sławianowski, GL(n, R), Tetrads and generalized space-time dynamics, in Differential Geometry, Group Representations, and Quantization, Lecture Notes in Physics, 379, Springer-Verlag, 1991.
37. J. J. Sławianowski, Group-theoretic approach to internal and collective degrees of freedom in mechanics and field theory, Technische Mechanik, 22 (1) (2002), 8–13.
38. J. J. Sławianowski, Quantum and classical models based on GL(n, R)-symmetry, in Proceedings of the Second International Symposium on Quantum Theory and Symmetries, Kraków, Poland, July 18–21, 2001, E. Kapuścik and A. Horzela (eds), World Scientific, New Jersey London Singapore Hong Kong, 2002, pp. 582–588.
39. J. J. Sławianowski, Linear frames in manifolds, Riemannian structures and description of internal degrees of freedom, Rep. Math. Phys., 51 (2/3) (2003), 345–369.
40. J. J. Sławianowski, Classical and quantum collective dynamics of deformable objects. Symmetry and integrability problems, in Proceedings of the Fifth International Conference on Geometry, Integrability and Quantization, June 5–12, 2003, Varna, Bulgaria, Ivaïlo M. Mladenov and Allen C. Hirshfeld (eds), SOFTEX, Sofia, 2004, 81–108.
41. J. J. Sławianowski, Geodetic systems on linear and affine groups. Classics and quantization, J. Nonlinear Math. Phys., 11 (Suppl) (2004), 130–137.
42. J. J. Sławianowski and V. Kovalchuk, Klein–Gordon–Dirac equation: physical justification and quantization attempts, Rep. Math. Phys., 49 (2/3) (2002), 249–257.
43. J. J. Sławianowski and V. Kovalchuk, Invariant geodetic problems on the affine group and related Hamiltonian systems, Rep. Math. Phys., 51 (2/3) (2003), 371–379.
44. J. J. Sławianowski and V. Kovalchuk, Invariant geodetic problems on the Projective Group Pr(n, R), Proceedings of Institute of Mathematics of NAS of Ukraine, A. G. Nikitin, V. M. Boyko, R. O. Popovych, and I. A. Yehorchenko (eds), 50, Part 2, Kyiv, Institute of Mathematics, 2004, 955–960.
45. J. J. Sławianowski and V. Kovalchuk, Classical and quantized affine physics. A step towards it, J. Nonlinear Math. Phys., 11 (Suppl) (2004), 157–166.
46. J. J. Sławianowski, V. Kovalchuk, A. Sławianowska, B. Gołubowska, A. Martens, E. E. Rożko, and Z. J. Zawistowski, Invariant Geodetic Systems on Lie Groups and Affine Models of Internal and Collective Degrees of Freedom, Prace IPPT-IFTR Reports 7, 2004.
47. J. J. Sławianowski, V. Kovalchuk, A. Sławianowska, B. Gołubowska, A. Martens, E. E. Rożko, and Z. J. Zawistowski, Affine symmetry in mechanics of collective and internal modes. Part I. Classical models, Rep. Math. Phys., 54 (3) (2004), 373–427.
48. J. J. Sławianowski, V. Kovalchuk, A. Sławianowska, B. Gołubowska, A. Martens, E. E. Rożko, and Z. J. Zawistowski, Affine symmetry in mechanics of collective and internal modes. Part II. Quantum models, Rep. Math. Phys., 55 (1) (2005), 1–45.
49. J. J. Sławianowski and A. K. Sławianowska, Virial coefficients, collective models and problems with the Galerkin procedure, Arch. Mech., 45 (3) (1993), 305–331.
50. J. M. Solberg and P. Papadopoulos, A simple finite element-based framework for the analysis of elastic pseudo-rigid bodies, Int. J. Numer. Meth. Eng., 45 (1999), 1297–1314.
51. J. M. Solberg and P. Papadopoulos, Impact of an elastic pseudo-rigid body on a rigid foundation, Int. J. Eng. Sci., 38 (2000), 589–603.
52. E. Sousa Dias, A Geometric Hamiltonian Approach to the Affine Rigid Body, in Dynamics, Bifurcation and Symmetry. New Trends and New Tools, P. Chossat (ed.), NATO ASI Series C, Vol. 437, Kluwer Academic Publishers, Netherlands, 1994, 291–299.
53. A. Trzęsowski and J. J. Sławianowski, Global invariance and Lie-algebraic description in the theory of dislocations, Int. J. Theor. Phys., 29 (11) (1990), 1239–1249.
54. K. Westpfahl, Relativistische Bewegungsprobleme, I, Das Freie Spinteilchen, Annalen der Physik, 20 (1967), 113–135.
55. C. Wulff and M. Roberts, Hamiltonian systems near relative periodic orbits, SIAM J. Dynamical Systems, 1 (1) (2002), 1–43.
56. A. C. Eringen, Nonlinear Theory of Continuous Media, McGraw-Hill Book Company, New York, 1962.
57. A. C. Eringen (ed.), Continuum Mechanics. Volume I. Mathematics, Academic Press, New York London, 1975.
58. A. C. Eringen (ed.), Continuum Mechanics. Volume II. Continuum Mechanics of Single-Substance Bodies, Academic Press, New York San Francisco London, 1975.
59. A. Bohr and B. A. Mottelson, Nuclear Structure. Volume II, W. A. Benjamin, Reading, MA, 1975.
60. F. Calogero and C. Marchioro, Exact solution of a one-dimensional three-body scattering problem with two-body and/or three-body inverse-square potentials, J. Math. Phys., 15 (1974), 1425.
61. J. Moser, Dynamical systems theory and applications, Lecture Notes in Physics, 38, Springer, Berlin, 1975.
62. J. Moser, Three integrable Hamiltonian systems connected with isospectral deformations, Advances Math., 16 (1975), 197–220.
63. E. A. Abbott, Flatland. A Romance of Many Dimensions, Seeley & Co., Ltd., London, 1884.
64. G. Capriz, Continua with Microstructure, Springer Tracts in Natural Philosophy, Vol. 35, Springer Verlag, New York Berlin Heidelberg Paris Tokyo, 1989.
65. G. Capriz and P. M. Mariano, Balance at a junction among coherent interfaces in materials with substructure, in Advances in Multifield Theories of Materials with Substructure, G. Capriz and P. M. Mariano (eds), Birkhäuser, Basel, 2003.
66. G. Capriz and P. M. Mariano, Symmetries and Hamiltonian formalism for complex materials, J. Elasticity, 72 (2003), 57–70.
67. P. M. Mariano, Configuration forces in continua with microstructure, Z. Angew. Math. Phys., 51 (2000), 752–791.
68. P. M. Mariano, Multifield theories in mechanics of solids, Adv. Appl. Mech., 38 (2001), 1–93.
69. P. M. Mariano, Cancellation of vorticity in steady-state non-isoentropic flows of complex fluids, J. Phys. A: Math. Gen., 36 (2003), 9961–9972.
70. J. J. Sławianowski, Field of linear frames as a fundamental self-interacting system, Rep. Math. Phys., 22 (3) (1985), 323–371.
71. J. J. Sławianowski, GL(n, R) as a candidate for fundamental symmetry in field theory, Il Nuovo Cimento, 106B (6) (1991), 645–668.
72. S. Kobayashi and K. Nomizu, Foundations of Differential Geometry, Interscience Publishers, New York, 1963.
73. D. Arsenović, A. O. Barut, and M. Božić, The critical turning points in the solutions of the magnetic-top equations of motion, Il Nuovo Cimento, 110B (2) (1995), 177–188.
74. D. Arsenović, A. O. Barut, Z. Marić, and M. Božić, Semi-classical quantization of the magnetic top, Il Nuovo Cimento, 110B (2) (1995), 163–175.
75. A. O. Barut, M. Božić, and Z. Marić, The magnetic top as a model of quantum spin, Ann. Phys., 214 (1) (1992), 53–83.
76. A. O. Barut and R. R¸aczka, Theory of Group Representations and Applications, PWN-Polish Scientific Publishers,Warsaw, 1977. 77. W. Pauli, Über ein Kriterium füer Ein- oder Zweiwertigkeit der Eigenfunktionen in der Wellenmechanik, Helvetica Physica Acta, 12 (1939), 147–168. 78. J. Reiss, Single-Valued and Multi-Valued Schrodinger Wave Functions, Helvetica Physica Acta, 45 (1972), 1066–1073. 79. M. Hamermesh, Group Theory, Addison-Wesley Publishing Company, Inc., Reading, MA London, 1962. 80. H. Weyl,The Theory of Groups and Quantum Mechanics, Dover, New York, 1931.
CHAPTER FOUR
Moving Least-Square Basis for Band-Structure Calculations of Natural and Artificial Crystals Sukky Jun∗ and Wing Kam Liu∗∗
Contents
4.1 Introduction
4.1.1 Electronic band structures
4.1.2 Photonic and acoustic band structures
4.1.3 Meshless methods and moving least-square basis
4.1.4 Periodicity
4.2 MLS Basis and Periodicity
4.2.1 MLS approximation
4.2.2 Implementation of periodicity condition
4.3 Atomic Crystals and Semiconductors
4.3.1 Galerkin formulation of Schrödinger equation
4.3.2 The Kronig–Penney model potential
4.3.3 Empirical pseudopotentials of Si and GaAs
4.3.4 Strain effect in compound semiconductors
4.4 PhoXonic Crystals
4.4.1 Maxwell equations for 2D photonic crystals
4.4.2 Band structures of various 2D photonic crystals
4.4.3 Acoustic bandgap materials
4.5 Strain-Tunable Photonic Bandgap Materials
4.5.1 Deformations of 2D triangular photonic crystals
4.5.2 Band structures of deformed photonic crystals
4.6 Concluding Remarks
Abstract
We present a unified meshfree framework for real-space band-structure calculations of semiconductors, photonic crystals, and phononic crystals. A general formulation of the periodic meshfree shape function is proposed in order to reproduce the periodicity of these natural and artificial crystal structures. Matrix eigen-equations for the Schrödinger equation, the Maxwell equations, and the elastic wave equation are then solved. Numerical examples include the electronic structure of strained compound semiconductors, the strain-tunable band-gap modification of photonic crystals, and acoustic band-gap materials. Results demonstrate that the meshfree method can be a promising tool for various strain-induced multiphysics problems of natural and artificial crystals.
Key Words: Meshfree method, semiconductor, photonic crystal, phononic crystal, moving least-square basis
∗ Department of Mechanical Engineering, University of Wyoming, Laramie, WY, USA
∗∗ Department of Mechanical Engineering, Northwestern University, Evanston, IL, USA
Material Substructures in Complex Bodies ISBN-10: 0-08-044535-7
© 2007 Elsevier Ltd. All rights reserved.
4.1 Introduction Electrons, light, and sound are all waves. Each of them is described by the corresponding wave equation: the Schrödinger equation, the Maxwell equations, and the elastic wave equation, respectively. This notion recognizes surprisingly interesting analogies among them [1]. That is, periodic arrangements of atoms, dielectric materials, and elastic materials, yield the electronic, photonic, and acoustic band structures, respectively. Bandgaps, generated either naturally or artificially by the appropriate periodicity, in turn, enable us to control the wave propagations of electrons, light, and sound, resulting in the development of new materials such as semiconductors, photonic crystals, and phononic crystals. Wave equations and periodicity are thus two keywords of this work.
4.1.1 Electronic band structures In computational condensed-matter physics, popular numerical methods for the calculation of electronic band structures of crystalline solids are the schemes that employ the plane wave (PW) basis. Wave functions are expanded in terms of Fourier basis and wave equations are solved in reciprocal space. PW-based methods have been naturally fit for describing the systems having periodic boundary conditions. Since a PW represents a free electron, the PW methods have also been very efficient in describing the almost free electrons like metals [2]. On the other hand, it is true that these methods have some shortcomings, especially for large-scale computations where the localized electron states are of specific importance. PW basis is not localized in real space, which makes the basis unsuitable for the adaptive simulations of those problems. Among several alternatives, the augmented plane wave (APW) method, for example, may be employed in order to overcome these deficiencies [3], but it still requires the Fourier transform that is generally known to be very inefficient for parallel computing. In addition to adaptivity and parallel computing, real-space calculations may also provide a good environment for multi-scale simulations, because coupling across scales can be implemented more naturally in real space than in reciprocal space [4]. Consequently, real-space techniques have been attracting attention as one of the promising numerical methods for electronic-structure calculations [5]. More specifically, we note here electronic-structure computations recently performed by two real-space methods which have been much more popular in engineering fields rather than in condensed-matter physics: the finite difference method [2, 6] and the finite element method [7–10].
4.1.2 Photonic and acoustic band structures
A photonic crystal is an artificial lattice structure in which one dielectric material (the inclusion) is periodically embedded in another dielectric medium (the matrix). Similar to the electronic energy bands of atomic crystals, the periodicity of the dielectric material produces frequency band structures. In addition, an appropriate difference in dielectric constants between inclusions and matrix may generate a photonic bandgap, the so-called optical analogue of the semiconductor. These properties enable us to allow or forbid the propagation of light in some directions at certain frequencies [11, 12]. Ever since the first theoretical prediction [13, 14], the photonic bandgap material has been one of the central issues in the photophysics and optoelectronics communities [11, 15–19]. Accurate computation of the band structure is indispensable for the development of various optoelectronic devices and optical fibers based on these materials [20–22]. The most popular method for calculating photonic band structures is again the PW method [23–27]. However, it is well known that the conventional PW method suffers from unsatisfactory accuracy arising from the abrupt change in the value of the dielectric function across the interface between matrix and inclusion [21, 28, 29]. The Fourier transform of the discontinuous dielectric function causes oscillatory behavior, which leads to very slow convergence. Special techniques are therefore required for the PW method to achieve faster convergence, such as interpolating the dielectric function [30]. Other numerical methods have also been tried for more efficient calculations of photonic band structures. Among them are the finite-difference time-domain (FDTD) method [31–33], the multiple-scattering theory [34, 35], the transfer matrix method [36], the finite difference method with effective medium theory [29], and most recently, the finite element method [37–39].
Thus, as in the case of electronic-structure calculations, real-space methods for the calculation of photonic band structures have attracted growing attention. Real-space methods are also very useful in the study of bandgap tuning by mechanical deformation or strain, because they model interfaces arbitrarily distorted by the deformation much more naturally and easily. We can take full advantage of the real-space technique when we focus on the direct modelling of the strain-induced change in the shape of the interface between matrix and inclusion. Consequently, the effect of the distorted interface on bandgap modification can be investigated more precisely by real-space methods. Similar to electromagnetic waves in dielectric materials, sound waves can also be controlled by elastic periodic structures. That is, a periodic array of one elastic material embedded in another elastic material yields a new material system with a crystal structure that controls the propagation of sound waves, called a phononic crystal. All the arguments given above for the calculation of photonic band structures are thus applicable to the calculation of the acoustic band structure of phononic crystals.
4.1.3 Meshless methods and moving least-square basis In this chapter, we introduce a novel approach toward the computations of those band structures for atomic, photonic, and phononic crystals, using the meshless method, a new type of real-space method, which has successfully been applied to various engineering problems, mainly in solid and fluid mechanics areas [40– 43]. Among examples are Reproducing Kernel Particle Methods [44–50] and Element-Free Galerkin Method [51]. Although there have been several types of meshless methods separately developed under different names, they are basically equivalent to each other, and one way to define their basis function is from the moving least-square approximation (MLS) in real space. The meshless methods and the MLS basis have not been studied systematically for the calculations of electronic structures and photonic band structures, except our recent works [52–54]. On account of its advantages reported [40–43], the MLS basis of meshless method may lead to an efficient real-space technique, which motivated us to implement it into the Schrödinger equation for calculating electronic structures, the Maxwell equations for photonic crystals, and finally the elastic wave equations for phononic crystals. Especially, the materials discontinuity in photonic and phononic crystals can naturally be modelled without requiring the Fourier transform. Furthermore, any irregular shape of interface between different materials in those crystals is easily recognized in meshless modelling by simply putting nodes along the interface, which is similar to finite element method. On the other hand, it is a point-based method that does not need grids or meshes in approximation procedure, being different from other real-space techniques. 
Owing to its inherent feature, the MLS-based meshless method may also be well suited to the adaptive computation, which is expected to provide a highly efficient methodology for the band-structure calculations of electronic, photonic, and acoustic bandgap materials.
4.1.4 Periodicity
In developing the MLS-based meshless formulations for the calculation of those various band structures, the most important issue is how to realize the crystal's periodicity in the meshless modelling procedure. The reason why we pay special attention to periodicity can be recognized through the following example. The solid-state electronic-structure calculation dealt with in this chapter is basically to solve the eigenvalue problem of the time-independent single-electron Schrödinger equation
$$-\frac{\hbar^2}{2m}\nabla^2\psi_{\mathbf{k}}(\mathbf{x}) + V(\mathbf{x})\,\psi_{\mathbf{k}}(\mathbf{x}) = E_{\mathbf{k}}\,\psi_{\mathbf{k}}(\mathbf{x}), \tag{4.1}$$
where $\hbar$ is the reduced Planck constant, and $m$, $E_{\mathbf{k}}$, $\psi_{\mathbf{k}}(\mathbf{x})$ are respectively the effective mass, energy, and wave function of the electron, for a given wave vector $\mathbf{k}$. The Bravais lattice provides a potential with periodicity, i.e.,
$$V(\mathbf{x}) = V(\mathbf{x} + \mathbf{L}), \tag{4.2}$$
where $\mathbf{L}$ denotes the lattice vector in real space. According to the Bloch theorem [55], the wave function $\psi_{\mathbf{k}}(\mathbf{x})$ of the Schrödinger equation with a periodic potential is the product of a periodic function and a PW,
$$\psi_{\mathbf{k}}(\mathbf{x}) = u_{\mathbf{k}}(\mathbf{x})\exp(i\mathbf{k}\cdot\mathbf{x}), \tag{4.3}$$
where $u_{\mathbf{k}}(\mathbf{x})$ has the periodicity of the crystal lattice,
$$u_{\mathbf{k}}(\mathbf{x}) = u_{\mathbf{k}}(\mathbf{x} + \mathbf{L}). \tag{4.4}$$
By substituting equation (4.3) into (4.1), the wave equation can be expressed in terms of the eigenfunction uk (x) and eigenvalue Ek for a given k. To this end, the periodicity is the essential feature that meshless solution uk (x) must possess. The PW method expands the eigenfunction in terms of reciprocal lattice vectors with Fourier basis. On the other hand, in real-space techniques, it is expanded in terms of the localized real-space basis such as, e.g. finite element shape function. We here employ the meshless approximation with MLS basis to represent the eigenfunction, and the periodicity is implemented in the process of defining the shape function. A simple algorithm is thus introduced for the periodic meshless shape function that perfectly satisfies the periodic conditions as required. In the following section, we review the conventional formulation of defining the meshless shape function from the MLS approximation and the implementation of periodicity into the formulation. Next, the meshless framework of the eigenvalue problems derived from the above-mentioned wave equations and their Galerkin formulations are presented. In numerical experiments, electronicstructure calculations of diamond and zinc-blende semiconductors are first carried out for demonstrating the performance of periodic MLS basis, which is then followed by the investigation of strain effects on bandgap changes in compound semiconductors. Next, the periodic shape functions are directly applied to the band-structure calculation of photonic crystals where we achieve the higher convergence of MLS-based meshless real-space calculations over the conventional PW method. We further demonstrate how the applied strain can be utilized for tuning the photonic bandgaps in 2D silicon-based photonic crystals. In addition, acoustic band structures of phononic crystals are also calculated using the periodic shape functions proposed. 
All those computations are carried out by the abovementioned unified formulation of periodic MLS basis for these strain-induced multiphysics simulations of natural and artificial crystals.
4.2 MLS Basis and Periodicity The theoretical preparation for the MLS computation of band structures is to implement the periodicity conditions into the regular meshless formulation. For this, we first briefly review the derivation of meshless shape functions from the MLS approximation in the following section. Next, we then introduce a translation-and-searching algorithm for the cell-periodic MLS basis to represent the periodic conditions involved in various crystals.
4.2.1 MLS approximation
In this section we briefly describe a typical procedure to define the meshless shape function using the MLS approximation. Although there are generally many routes to defining the meshless shape function, here we follow Ref. [56]. The MLS-based method begins with the local approximation of an arbitrary function $u(\mathbf{x})$ as
$$u^h(\mathbf{x}, \bar{\mathbf{x}}) = \mathbf{p}(\mathbf{x} - \bar{\mathbf{x}})\cdot\mathbf{a}(\bar{\mathbf{x}}) = \mathbf{p}^T(\mathbf{x} - \bar{\mathbf{x}})\,\mathbf{a}(\bar{\mathbf{x}}), \tag{4.5}$$
in a small region around $\bar{\mathbf{x}}$. The column vector $\mathbf{p}(\mathbf{x})$ must contain at least the constant 1 and the linear monomials in order to fulfill the linear reproducing conditions [42]. For example, $\mathbf{p}(\mathbf{x}) = [1, x, y, z]^T$ in 3D. By reproducing conditions, we mean that the MLS approximation can reproduce exactly the functions which appear in $\mathbf{p}(\mathbf{x})$. This is a well-known feature of MLS-based meshless methods. Polynomials of higher order or arbitrary types of functions can be added to the components of $\mathbf{p}(\mathbf{x})$ for the specific purpose of enrichment in the MLS approximation. After discretizing a problem domain by a set of points ($I = 1, \ldots, NP$), the idea of the MLS interpolant is used in determining the coefficient vector $\mathbf{a}(\bar{\mathbf{x}})$. That is, $\mathbf{a}(\bar{\mathbf{x}})$ is determined by minimizing the local error residual functional
$$J(\mathbf{a}(\bar{\mathbf{x}})) = \sum_{I=1}^{NP} \left|u(\mathbf{x}_I) - \mathbf{p}(\mathbf{x}_I - \bar{\mathbf{x}})\cdot\mathbf{a}(\bar{\mathbf{x}})\right|^2 W(\mathbf{x}_I - \bar{\mathbf{x}}), \tag{4.6}$$
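where the window function $W(\mathbf{x})$ is a compactly supported continuous function. By solving $\partial J/\partial\mathbf{a} = 0$ for $\mathbf{a}(\bar{\mathbf{x}})$, we have
$$\mathbf{a}(\bar{\mathbf{x}}) = \mathbf{M}^{-1}(\bar{\mathbf{x}})\sum_{I=1}^{NP}\mathbf{p}(\mathbf{x}_I - \bar{\mathbf{x}})\,W(\mathbf{x}_I - \bar{\mathbf{x}})\,u(\mathbf{x}_I), \tag{4.7}$$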
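where the moment matrix $\mathbf{M}(\bar{\mathbf{x}})$ is defined by
$$\mathbf{M}(\bar{\mathbf{x}}) \equiv \sum_{I=1}^{NP}\mathbf{p}(\mathbf{x}_I - \bar{\mathbf{x}})\,\mathbf{p}^T(\mathbf{x}_I - \bar{\mathbf{x}})\,W(\mathbf{x}_I - \bar{\mathbf{x}}). \tag{4.8}$$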
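Next we insert equation (4.7) into equation (4.5) for $\mathbf{a}(\bar{\mathbf{x}})$, and then obtain the best local approximation of $u(\mathbf{x})$ [56] as
$$u^h(\mathbf{x}, \bar{\mathbf{x}}) = \mathbf{p}(\mathbf{x} - \bar{\mathbf{x}})\cdot\mathbf{a}(\bar{\mathbf{x}}) = \mathbf{p}^T(\mathbf{x} - \bar{\mathbf{x}})\,\mathbf{M}^{-1}(\bar{\mathbf{x}})\sum_{I=1}^{NP}\mathbf{p}(\mathbf{x}_I - \bar{\mathbf{x}})\,W(\mathbf{x}_I - \bar{\mathbf{x}})\,u(\mathbf{x}_I). \tag{4.9}$$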
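Finally, the global MLS approximation of $u(\mathbf{x})$ is obtained by taking the limit of equation (4.9) as $\bar{\mathbf{x}} \to \mathbf{x}$,
$$u^h(\mathbf{x}) = \sum_{I=1}^{NP}\mathbf{p}^T(\mathbf{0})\,\mathbf{M}^{-1}(\mathbf{x})\,\mathbf{p}(\mathbf{x}_I - \mathbf{x})\,W(\mathbf{x}_I - \mathbf{x})\,u(\mathbf{x}_I), \tag{4.10}$$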
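where $\mathbf{p}(\mathbf{0}) = [1, 0, \ldots, 0]^T$. The moment matrix $\mathbf{M}(\mathbf{x})$ becomes
$$\mathbf{M}(\mathbf{x}) = \sum_{I=1}^{NP}\mathbf{p}(\mathbf{x}_I - \mathbf{x})\,\mathbf{p}^T(\mathbf{x}_I - \mathbf{x})\,W(\mathbf{x}_I - \mathbf{x}). \tag{4.11}$$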
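Equation (4.10) can now be rewritten in terms of the given sampling values $u_I = u(\mathbf{x}_I)$ and the MLS basis function (or shape function) $N_I(\mathbf{x}) = N(\mathbf{x}_I - \mathbf{x})$ as
$$u^h(\mathbf{x}) = \sum_{I=1}^{NP} N_I(\mathbf{x})\,u_I, \tag{4.12}$$
where
$$N_I(\mathbf{x}) = \mathbf{p}^T(\mathbf{0})\,\mathbf{M}^{-1}(\mathbf{x})\,\mathbf{p}(\mathbf{x}_I - \mathbf{x})\,W(\mathbf{x}_I - \mathbf{x}). \tag{4.13}$$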
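Due to the symmetry of $\mathbf{M}(\mathbf{x})$, it is straightforward to verify that the shape function can also be written as
$$N_I(\mathbf{x}) = \mathbf{p}^T(\mathbf{x}_I - \mathbf{x})\,\mathbf{M}^{-1}(\mathbf{x})\,\mathbf{p}(\mathbf{0})\,W(\mathbf{x}_I - \mathbf{x}) = \mathbf{p}^T(\mathbf{x}_I - \mathbf{x})\,\mathbf{b}(\mathbf{x})\,W(\mathbf{x}_I - \mathbf{x}). \tag{4.14}$$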
For comprehensive and systematic derivations of MLS shape function and its derivatives, it is recommended to refer to [40–42].
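As an illustration of equations (4.11) to (4.14), the following minimal sketch evaluates the MLS shape functions in 1D with the linear basis p(s) = [1, s]^T. The cubic-spline window, the node layout, and the support radius are assumptions chosen for the example; they are not taken from the chapter, and the function names are hypothetical.

```python
def window(r):
    """Cubic-spline window, nonzero only for |r| < 1 (an assumed choice of W)."""
    r = abs(r)
    if r < 0.5:
        return 2.0 / 3.0 - 4.0 * r * r + 4.0 * r ** 3
    if r < 1.0:
        return 4.0 / 3.0 - 4.0 * r + 4.0 * r * r - 4.0 / 3.0 * r ** 3
    return 0.0


def mls_shape_functions(x, nodes, support):
    """Return [N_I(x)] of eq. (4.14) in 1D with linear basis p(s) = [1, s]^T."""
    # Moment matrix M(x) = sum_I p p^T W of eq. (4.11); it is a symmetric 2x2 matrix.
    m00 = m01 = m11 = 0.0
    terms = []
    for xi in nodes:
        s = xi - x                      # x_I - x
        w = window(s / support)
        terms.append((s, w))
        m00 += w
        m01 += s * w
        m11 += s * s * w
    det = m00 * m11 - m01 * m01
    # b(x) = M^{-1}(x) p(0) with p(0) = [1, 0]^T, i.e. the first column of M^{-1}.
    b0, b1 = m11 / det, -m01 / det
    # N_I(x) = p^T(x_I - x) b(x) W(x_I - x)
    return [(b0 + b1 * s) * w for s, w in terms]


nodes = [0.5 * i for i in range(11)]                     # 11 uniform nodes on [0, 5]
N = mls_shape_functions(1.73, nodes, support=1.2)
partition_of_unity = sum(N)                              # reproduces u = 1
linear_value = sum(n * xi for n, xi in zip(N, nodes))    # reproduces u = x
```

With the linear basis, the two sums at the end recover 1 and the evaluation point up to round-off, which is the linear reproducing condition mentioned after equation (4.5).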
4.2.2 Implementation of periodicity condition
Now we modify the meshless representation in order to make the basis function inherently suitable for periodicity. The basic idea is motivated by the conventional implementation of periodic boundary conditions in molecular dynamics simulation [57, 58]. A similar idea has also been applied to the finite element method for value-periodic systems [9, 10]. As in Figure 4.1, we consider a 2D periodic structure in which identical circles are arranged on a triangular lattice. We can consider it as a photonic crystal that consists of circular rods embedded in air. The lattice vector $\mathbf{L}$ of the crystal is denoted by
$$\mathbf{L} = n_1\mathbf{a}_1 + n_2\mathbf{a}_2, \tag{4.15}$$
Figure 4.1 An example of 2D periodicity. The triangular lattice consists of circles (left). The parallelogram containing a circle is selected for the unit cell of the lattice (right). We can consider it as a photonic crystal of circular rod embedded in air.
where $\mathbf{a}_1$, $\mathbf{a}_2$ are primitive lattice vectors, and $n_1$, $n_2$ are integers. Take a unit cell as our domain of computation, discretized by a set of nodes $J = 1, \ldots, NP$. The periodicity to be fulfilled by $u^h(\mathbf{x})$ is
$$u^h(\mathbf{x}) = u^h(\mathbf{x} + \mathbf{L}), \tag{4.16}$$
for any point x in the unit cell. In Figure 4.1, a parallelogram with primitive vectors (a1 , a2 ) is shown as a unit cell. In order to compute the shape function NJ (x) at a specific point x, we have to search the associated discretization nodes denoted by J . These nodes are under the support (or domain-of-influence) of the shape function of which the center is at the point x [42]. For example, we consider the point x that is near the lower left corner in the unit cell as illustrated in Figure 4.2. The support (solid circle) then covers not only the portion of the unit cell but also some region outside the cell. Let us now imagine that there are some virtual nodes in the region outside the unit cell, but still under the support of point x. It is important that these virtual nodes must be taken into consideration as long as they are located inside the support, because they also make contribution to the value of shape function computed at x. Fortunately, owing to the symmetry of translation, there is always a corresponding node in the unit cell that is physically equivalent to the outside virtual node. This notion is from the periodicity of the crystal. Therefore, it is unnecessary to employ the virtual outside nodes in the modelling. Instead we search the corresponding interior nodes and put them on the list of the influencing nodes for the point x. This view can be implemented step by step, as shown in Figure 4.2 for the point x near the lower left corner of the parallelogram. We first find neighbor nodes (inside the unit cell only) associated with the original point x. Secondly, we translate the point x to x + a1 and search interior nodes again. Next, it is repeated for x → x + a2 and for x → x + a1 + a2 . This translation-and-searching procedure is carried out for all translated points at which the support of shape function covers the unit cell. The resulting supports are illustrated in gray color
[Figure 4.2 labels: boundaries Γ1, Γ2, Γ3, Γ4; primitive vectors a1, a2; point x; supports of shape function; the circular rod of photonic crystal.]
Figure 4.2 The translation-and-searching algorithm for the construction of a periodic meshless shape function based on the MLS approximation. Supports of the shape function are in gray within the solid circles. The dashed circle illustrates the rod of photonic crystal [52].
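in Figure 4.2. Then, the meshless approximation can be written as
$$u^h(\mathbf{x}) = \sum_{J}^{NP} N_J(\mathbf{x})\,u_J + \sum_{J}^{NP} N_J(\mathbf{x} + \mathbf{a}_1)\,u_J + \sum_{J}^{NP} N_J(\mathbf{x} + \mathbf{a}_2)\,u_J + \sum_{J}^{NP} N_J(\mathbf{x} + \mathbf{a}_1 + \mathbf{a}_2)\,u_J, \tag{4.17}$$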
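where the shape function $N_J(\mathbf{x} + \mathbf{a}_i)$ denotes the usual meshless shape function calculated at $\mathbf{x} + \mathbf{a}_i$, except for the definition of the moment matrix $\mathbf{M}$. Since the nodal summation is also required when defining $\mathbf{M}$, the translation-and-searching steps described above have to be performed as well. The moment matrix is hence defined by
$$\begin{aligned}
\mathbf{M}(\mathbf{x}) \equiv{} & \sum_{J=1}^{NP} W_J(\mathbf{x})\,\mathbf{p}_J(\mathbf{x})\,\mathbf{p}_J^T(\mathbf{x}) + \sum_{J=1}^{NP} W_J(\mathbf{x} + \mathbf{a}_1)\,\mathbf{p}_J(\mathbf{x} + \mathbf{a}_1)\,\mathbf{p}_J^T(\mathbf{x} + \mathbf{a}_1) \\
& + \sum_{J=1}^{NP} W_J(\mathbf{x} + \mathbf{a}_2)\,\mathbf{p}_J(\mathbf{x} + \mathbf{a}_2)\,\mathbf{p}_J^T(\mathbf{x} + \mathbf{a}_2) \\
& + \sum_{J=1}^{NP} W_J(\mathbf{x} + \mathbf{a}_1 + \mathbf{a}_2)\,\mathbf{p}_J(\mathbf{x} + \mathbf{a}_1 + \mathbf{a}_2)\,\mathbf{p}_J^T(\mathbf{x} + \mathbf{a}_1 + \mathbf{a}_2).
\end{aligned} \tag{4.18}$$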
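Again as a result of the translational symmetry, it is straightforward to verify the periodicity in equations (4.17) and (4.18). We thus have the equivalences
$$\mathbf{M}(\mathbf{x}) = \mathbf{M}(\mathbf{x} + \mathbf{a}_1) = \mathbf{M}(\mathbf{x} + \mathbf{a}_2) = \mathbf{M}(\mathbf{x} + \mathbf{a}_1 + \mathbf{a}_2), \tag{4.19}$$
and consequently,
$$u^h(\mathbf{x}) = u^h(\mathbf{x} + \mathbf{a}_1) = u^h(\mathbf{x} + \mathbf{a}_2) = u^h(\mathbf{x} + \mathbf{a}_1 + \mathbf{a}_2) \tag{4.20}$$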
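for the point $\mathbf{x}$ near the lower left corner. This description is now generalized to an arbitrary point $\mathbf{x}$ in the unit cell. Using the lattice vector $\mathbf{L} = n_1\mathbf{a}_1 + n_2\mathbf{a}_2$, equation (4.17) can be written in a more general form as
$$u^h(\mathbf{x}) = \sum_{\mathbf{L}}\sum_{J=1}^{NP} N_J(\mathbf{x} + \mathbf{L})\,u_J = \sum_{J=1}^{NP}\Big(\sum_{\mathbf{L}} N_J(\mathbf{x} + \mathbf{L})\Big)u_J = \sum_{J=1}^{NP} N_J^P(\mathbf{x})\,u_J, \tag{4.21}$$
where $N_J(\mathbf{x} + \mathbf{L})$ is defined by
$$N_J(\mathbf{x} + \mathbf{L}) = N(\mathbf{x}_J - \mathbf{x} - \mathbf{L}) \equiv \mathbf{p}^T(\mathbf{x}_J - \mathbf{x} - \mathbf{L})\,\mathbf{b}(\mathbf{x})\,W(\mathbf{x}_J - \mathbf{x} - \mathbf{L}) = \mathbf{p}_J^T(\mathbf{x} + \mathbf{L})\,\mathbf{b}(\mathbf{x})\,W_J(\mathbf{x} + \mathbf{L}). \tag{4.22}$$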
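Note that $\mathbf{b}$ is a function of $\mathbf{x}$, not of $\mathbf{x} + \mathbf{L}$. The vector $\mathbf{b}(\mathbf{x})$ is obtained by solving
$$\mathbf{M}(\mathbf{x})\,\mathbf{b}(\mathbf{x}) = \mathbf{p}(\mathbf{0}), \tag{4.23}$$
where
$$\mathbf{M}(\mathbf{x}) \equiv \sum_{\mathbf{L}}\sum_{J=1}^{NP} W_J(\mathbf{x} + \mathbf{L})\,\mathbf{p}_J(\mathbf{x} + \mathbf{L})\,\mathbf{p}_J^T(\mathbf{x} + \mathbf{L}). \tag{4.24}$$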
The summation with respect to $\mathbf{L}$ is performed over the current unit cell itself and the neighboring cells only, because, for any farther $\mathbf{L}$, the corresponding support does not overlap the unit cell of computation. In conclusion, the 2D cell-periodic meshless shape function for a parallelogram unit cell whose primitive vectors are $\mathbf{a}_1$ and $\mathbf{a}_2$ can be written as
$$N_J^P(\mathbf{x}) \equiv \sum_{\mathbf{L}} N_J(\mathbf{x} + \mathbf{L}) = \sum_{n_1, n_2} N_J(\mathbf{x} + n_1\mathbf{a}_1 + n_2\mathbf{a}_2). \tag{4.25}$$
Therefore the summation over the $n_i$ involves at most 9 cells in 2D (or 27 cells in 3D). That is, $n_i = -1, 0, +1$ in equation (4.25) is enough for any $\mathbf{x}$ in the unit cell. In the Galerkin formulation of the next sections, the boundary conditions are given in the specific form $u^h(\mathbf{x}) = u^h(\mathbf{x} + \mathbf{a}_i)$ for all $\mathbf{x} \in \Gamma_i$ ($i = 1, 2$), where $\Gamma_i$ denotes the boundary surfaces of the parallelogram cell as shown in Figure 4.2. These conditions are automatically satisfied by the translation-and-searching procedure described above. It should be carefully noted that the opposing boundaries are physically identical, i.e., $\Gamma_1 = \Gamma_3$ and $\Gamma_2 = \Gamma_4$ as illustrated in Figure 4.2. Therefore, for the nodal summation over $J = 1, \ldots, NP$, we must involve only one boundary of each identical pair in order to avoid over-summing. In other words, the nodes on $\Gamma_3$ and $\Gamma_4$ must make no contribution to the summation with respect to $J$ once the corresponding nodes along $\Gamma_1$ and $\Gamma_2$ are summed up. An example of a 2D cell-periodic shape function is plotted in Figure 4.3 for a parallelogram unit cell. The center is located near the lower left corner, as marked by the arrow in the figure. The shape function has non-vanishing values near all four corners as a result of the translation-and-searching algorithm proposed above. One of the attractive features of the meshless method is the reproducing property, i.e., the consistency of the meshless approximation function [42]. When the meshless shape function employs the linear basis vector (e.g. $\mathbf{p}^T(x) = [1, x]$ for 1D),
Figure 4.3 The periodic meshless shape function. The position arrowed is the center of the shape function [52, 54].
[Figure 4.4 plot: u^h(x) versus x over the unit cell; curves for the exact function u(x) = x and for NP = 5, 10, and 20.]
Figure 4.4 Reproducing property by the periodic meshless shape function. The exact function is linear u(x) = x in the unit cell and is a sawtooth globally. The meshless approximation correctly reproduces the sawtooth shape from interior-node values only [54].
the approximation function $u^h(x)$ has to exactly reproduce constants and linear functions. The periodic meshless shape function defined above satisfies constant consistency, i.e., $u^h(x) = 1$. On the other hand, special attention has to be paid to the linear consistency. Consider the 1D unit cell $x \in (0, 10)$ depicted in Figure 4.4, and assume that the exact function to be reproduced is $u(x) = x$ in the unit cell. From the global point of view, this linear function is in fact a sawtooth-type function due to its periodicity, as illustrated by the dashed line in Figure 4.4. Therefore, it is reasonable for the meshless approximation to take this periodicity into account. The results reproduced with the periodic meshless shape function are shown together in the figure for NP = 5, 10, and 20 nodes. The approximation $u^h(x)$ approaches the sawtooth shape as the number of nodes increases. It is also seen that $u^h(0) = u^h(10)$, i.e., the periodic boundary condition holds. This is a distinctive feature of the periodic meshless shape function proposed here. There seems to be no difference between this shape function and a particle-type kernel function, since we do not encounter a boundary in the usual sense, where the support of the shape function is cut off. Spline functions have already been used for the real-space calculation of electronic structures, even though periodicity has not been mentioned [8, 59]. However, we here insist on following the meshless formulation, not for the sake of boundary correction but for convergence and stability [40–43]. Although not dealt with in this chapter, the reward of the meshless method would be obvious if we performed adaptive computations, which are one of the main advantages of real-space electronic-structure calculation over the PW method [5, 9, 10].
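The translation-and-searching procedure of equations (4.22) to (4.25) can be sketched in 1D as follows. This is only an illustrative sketch: the cubic-spline window, the node spacing, and the support radius are assumptions, and the function names are hypothetical. The unit cell is taken as (0, 10) with the node at x = 10 omitted, since it is the periodic image of the node at x = 0.

```python
def window(r):
    """Cubic-spline window, nonzero only for |r| < 1 (an assumed choice of W)."""
    r = abs(r)
    if r < 0.5:
        return 2.0 / 3.0 - 4.0 * r * r + 4.0 * r ** 3
    if r < 1.0:
        return 4.0 / 3.0 - 4.0 * r + 4.0 * r * r - 4.0 / 3.0 * r ** 3
    return 0.0


def periodic_shape_functions(x, nodes, cell, support):
    """Cell-periodic N_J^P(x) of eq. (4.25) in 1D with p(s) = [1, s]^T.

    Valid for x in [0, cell] and support <= cell, so that n = -1, 0, +1 suffices.
    """
    m00 = m01 = m11 = 0.0
    images = []                        # (J, s, W) for every node image under the support
    for j, xj in enumerate(nodes):
        for n in (-1, 0, 1):           # lattice translations L = n * cell
            s = xj - x - n * cell      # x_J - x - L
            w = window(s / support)
            if w > 0.0:
                images.append((j, s, w))
                m00 += w               # accumulate M(x) of eq. (4.24)
                m01 += s * w
                m11 += s * s * w
    det = m00 * m11 - m01 * m01
    b0, b1 = m11 / det, -m01 / det     # b(x) solves M(x) b(x) = p(0), eq. (4.23)
    shape = [0.0] * len(nodes)
    for j, s, w in images:
        shape[j] += (b0 + b1 * s) * w  # sum over L of N_J(x + L), eq. (4.22)
    return shape


cell = 10.0
nodes = [float(j) for j in range(10)]  # interior nodes only; x = 10 is the image of x = 0
u_nodes = nodes[:]                     # samples of u(x) = x inside the unit cell


def u_h(x):
    """Periodic meshless approximation of eq. (4.21) for the sampled u(x) = x."""
    N = periodic_shape_functions(x, nodes, cell, support=2.5)
    return sum(n * u for n, u in zip(N, u_nodes))


left, right = u_h(0.0), u_h(10.0)
unity = sum(periodic_shape_functions(3.7, nodes, cell, support=2.5))
```

In this sketch `left` and `right` agree, which is the periodic boundary condition, and `unity` equals 1 up to round-off, which is the constant consistency discussed above.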
4.3 Atomic Crystals and Semiconductors Crystalline material is the periodic array of atoms. Depending on the atom species and array patterns, the atomic crystalline materials exhibit their own
electronic properties such as metal, semiconductor, or insulators. These properties can be predicted by calculating the electronic band structures of the materials. In this section, using the real-space periodic MLS basis defined previously, we solve the time-independent Schrödinger equation to calculate the electronic band structures of several representative semiconductors of which atomic crystal structures are diamond and zinc-blende lattice structures. We also calculate the compound semiconductors in order to demonstrate that the MLS-based computation properly yields the strain effect, caused by difference in lattice parameters, on the change of electronic structures.
4.3.1 Galerkin formulation of Schrödinger equation
Consider the eigenvalue problem of the Schrödinger equation written as
$$-\nabla^2\psi(\mathbf{x}) + V(\mathbf{x})\,\psi(\mathbf{x}) = E\,\psi(\mathbf{x}), \tag{4.26}$$
in Rydberg (Ry) and atomic units (a.u.). According to the Bloch theorem, the wave function $\psi(\mathbf{x})$ is expressed as the product of a periodic function $u(\mathbf{x})$ and a PW $\exp(i\mathbf{k}\cdot\mathbf{x})$, as stated in the introduction. By substituting equation (4.3) into equation (4.26), we obtain the equation to solve for $u(\mathbf{x})$ and $E$,
$$-\nabla^2 u(\mathbf{x}) - 2i\mathbf{k}\cdot\nabla u(\mathbf{x}) + [V(\mathbf{x}) + k^2]\,u(\mathbf{x}) = E\,u(\mathbf{x}), \tag{4.27}$$
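For completeness, the substitution that leads from equation (4.26) to equation (4.27) can be written out as a short worked step, using only $\psi = u\exp(i\mathbf{k}\cdot\mathbf{x})$ and $k^2 = \mathbf{k}\cdot\mathbf{k}$:

```latex
\begin{align*}
\nabla\psi &= e^{i\mathbf{k}\cdot\mathbf{x}}\,\bigl(\nabla u + i\mathbf{k}\,u\bigr),\\
\nabla^{2}\psi &= e^{i\mathbf{k}\cdot\mathbf{x}}\,\bigl(\nabla^{2}u + 2i\mathbf{k}\cdot\nabla u - k^{2}u\bigr),\\
-\nabla^{2}\psi + V\psi - E\psi
  &= e^{i\mathbf{k}\cdot\mathbf{x}}\,\bigl[-\nabla^{2}u - 2i\mathbf{k}\cdot\nabla u + (V + k^{2})\,u - E\,u\bigr] = 0 .
\end{align*}
```

Since the exponential factor never vanishes, the bracket must be zero, which is equation (4.27).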
where $u(\mathbf{x})$ and $V(\mathbf{x})$ are complex while $E$ is real. Since we consider the parallelepiped unit cell, we take the periodic boundary conditions of $u(\mathbf{x})$ and its normal derivatives as follows: for all $\mathbf{x} \in \Gamma_i$ ($i = 1, 2, 3$),
$$u(\mathbf{x}) = u(\mathbf{x} + \mathbf{a}_i) \quad\text{and}\quad \hat{\mathbf{n}}_i\cdot\nabla u(\mathbf{x}) = \hat{\mathbf{n}}_i\cdot\nabla u(\mathbf{x} + \mathbf{a}_i), \tag{4.28}$$
where $\Gamma_i$ are the boundary surfaces and $\mathbf{a}_i$ are the basis vectors of the unit cell. The outward unit normal to $\Gamma_i$ is denoted by $\hat{\mathbf{n}}_i$. The comprehensive derivation of the Galerkin formulation, including the equivalence between strong form and weak form, has been presented elsewhere [10]. Therefore we write here the final form of the Galerkin formulation: find $u^h(\mathbf{x})$ and $E$ such that, $\forall v^h \in \mathcal{V}^h$,
$$\int_\Omega \nabla(v^h)^{*}\cdot\nabla u^h\,d\Omega + \int_\Omega (v^h)^{*}\,[-2i\mathbf{k}\cdot\nabla u^h + (V + k^2)\,u^h]\,d\Omega = E\int_\Omega (v^h)^{*}\,u^h\,d\Omega, \tag{4.29}$$
where $(\,)^{*}$ denotes the complex conjugate. Note that the weighting function $v^h(\mathbf{x})$ is restricted to obey the periodic conditions, i.e., $\mathcal{V}^h = \{v^h \mid v^h \in H^1,\ v^h(\mathbf{x}) = v^h(\mathbf{x} + \mathbf{L}),\ \forall\mathbf{x} \in \Gamma_i\}$.
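Subsequently the meshless approximations
$$u^h(\mathbf{x}) = \sum_{J}^{NP} N_J^P(\mathbf{x})\,u_J \quad\text{and}\quad v^h(\mathbf{x}) = \sum_{J}^{NP} N_J^P(\mathbf{x})\,v_J, \tag{4.30}$$
are inserted into equation (4.29). Here $u_J$ and $v_J$ are complex, whereas $N_J^P(\mathbf{x})$ is real. We finally obtain the matrix equation of the eigenvalue problem as
$$\mathbf{H}\mathbf{u} = E\,\mathbf{S}\mathbf{u}, \tag{4.31}$$
where
$$H_{IJ} = \int_\Omega \{\nabla N_I^P(\mathbf{x})\cdot\nabla N_J^P(\mathbf{x})\}\,d\Omega - \int_\Omega \{2i\,N_I^P(\mathbf{x})\,\mathbf{k}\cdot\nabla N_J^P(\mathbf{x})\}\,d\Omega + \int_\Omega \{[V(\mathbf{x}) + k^2]\,N_I^P(\mathbf{x})\,N_J^P(\mathbf{x})\}\,d\Omega, \tag{4.32}$$
$$S_{IJ} = \int_\Omega N_I^P(\mathbf{x})\,N_J^P(\mathbf{x})\,d\Omega, \tag{4.33}$$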
and $\mathbf{u} = [u_1, u_2, \ldots, u_{NP}]^T$. The potential $V(\mathbf{x})$ considered here is a given function of $\mathbf{x}$. In their finite element formulations of the Schrödinger equation, Johnson and Freund [60], Jonson et al. [61], and Phillips [62] expressed the potential $V(\mathbf{x})$ in terms of the FE shape functions, as done for the trial solution $u^h(\mathbf{x})$. Alternatively, Pask et al. [9, 10] simply computed the potential at the quadrature points. We use the latter for the meshless method because interpolation or approximation is not really necessary for a potential that is already given as a function of $\mathbf{x}$. Two types of potential are employed in the following to demonstrate the real-space meshless calculation of electronic structures. One is the Kronig–Penney model potential [55], for which an analytic solution exists. The other, a more realistic example, is the local empirical pseudopotential of diamond and zinc-blende semiconductors [63]. In computational solid-state physics, ab initio calculations of electronic structures are usually conducted by solving the Kohn–Sham equation self-consistently with the exchange–correlation term in the potential, based on density functional theory [64, 65]. Such ab initio calculations are beyond the scope of this chapter, since a rigorous study of the physics of semiconductors is not our primary intention. Instead, the local empirical pseudopotential is used because it has the simplest form and is thus adequate for a quick verification of the meshless implementation. The Kohn–Sham equation has basically the same form as the wave equation considered here. The real-space meshless formulation introduced here can therefore be extended to first-principles calculations without any considerable modification as long as the potential is periodic.
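The structure of equations (4.32) and (4.33) can be illustrated with a small 1D assembly. As a loud caveat, the sketch below does not use the periodic MLS basis: it swaps in periodic piecewise-linear hat functions as a stand-in for $N_J^P$, so that the integrals have closed forms, and it samples the potential at element midpoints. All names and parameter values are hypothetical. For V = 0 the constant vector should satisfy Hu = k²Su, which is the free-electron level, and H should be Hermitian.

```python
def assemble_bloch_1d(n, length, k, potential):
    """Assemble H and S of eqs. (4.32)-(4.33) in 1D on a periodic cell.

    Stand-in basis (assumption): periodic linear hat functions on n >= 3 equal
    elements, not the MLS basis of the chapter.
    """
    h = length / n
    H = [[0j] * n for _ in range(n)]
    S = [[0.0] * n for _ in range(n)]
    K_e = [[1.0 / h, -1.0 / h], [-1.0 / h, 1.0 / h]]   # integral of N_a' N_b'
    C_e = [[-0.5, 0.5], [-0.5, 0.5]]                   # integral of N_a N_b'
    M_e = [[h / 3.0, h / 6.0], [h / 6.0, h / 3.0]]     # integral of N_a N_b
    for e in range(n):
        idx = (e, (e + 1) % n)                # last element wraps around: periodicity
        v_mid = potential((e + 0.5) * h)      # potential sampled at the element midpoint
        for a in range(2):
            for b in range(2):
                I, J = idx[a], idx[b]
                H[I][J] += (K_e[a][b] - 2j * k * C_e[a][b]
                            + (v_mid + k * k) * M_e[a][b])
                S[I][J] += M_e[a][b]
    return H, S


def rayleigh_quotient(H, S, u):
    """Return (u^H H u) / (u^H S u), real for Hermitian H and symmetric S."""
    n = len(u)
    num = sum(complex(u[i]).conjugate() * H[i][j] * u[j]
              for i in range(n) for j in range(n))
    den = sum(complex(u[i]).conjugate() * S[i][j] * u[j]
              for i in range(n) for j in range(n))
    return (num / den).real


k = 0.4
H, S = assemble_bloch_1d(8, 3.0, k, lambda x: 0.0)   # free electron, V = 0
E_const = rayleigh_quotient(H, S, [1.0] * 8)         # energy of the constant u(x)
```

A full band-structure calculation would pass H and S to a generalized Hermitian eigenvalue solver for each wave vector; that step is omitted here to keep the sketch dependency-free.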
4.3.2 The Kronig–Penney model potential
We employ the generalized form of the 1D Kronig–Penney model potential [55] given as
$$V(\mathbf{x}) = \sum_{i=1}^{3} V(x_i), \quad\text{where}\quad V(x_i) = \begin{cases} 0, & 0 \le x_i < a, \\ V_0, & a \le x_i < b, \end{cases} \tag{4.34}$$
where a = 2 (a.u.), b = 3 (a.u.) and V0 = 6.5 (Ry) are used for the test problem. The unit cell is simple cubic. The meshless results for the energy band structure with a 9×9×9 nodal distribution are shown in Figure 4.5 along the line from the Γ to the X point in the first Brillouin zone of the simple cubic lattice. For the reciprocal lattices and Brillouin zones of various crystal structures, see any textbook on condensed-matter physics (e.g. Ref. [55]). The analytic solution is easily obtained from the 1D solution by separation of variables. The convergence of the lowest five eigenvalues at wave vector k = (π/a)(0.5, 0.0, 0.0) is given in Figure 4.6, where Ee is the analytic eigenvalue and En is the meshless result (n = 1, . . . , 5). Note that some lines overlap each other because their eigenvalues are degenerate. Although the individual convergence rates deviate slightly, the average slope is approximately 3.2. We confirmed that this average rate of convergence holds for arbitrary values of k.
[Figure 4.5 plot: Energy (Ry) versus k up to the X point.]
Figure 4.5 Energy level for the Kronig–Penney model potential by meshless method (open circle) of 9×9×9 nodes and by analytic solution (line) [54].
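The separation of variables mentioned above can be sketched numerically. The following Python snippet is a minimal illustration and not the meshless method of this chapter. It assumes that the wave equation takes the Rydberg-unit form −∇²ψ + Vψ = Eψ, that b is the period of the 1D potential, and that the 1D problem is diagonalized in a small plane-wave basis; the function names and the basis size are hypothetical choices. Under these assumptions, the 3D reference levels follow as sums of three 1D eigenvalues.

```python
import numpy as np

def kp_fourier_coefficient(m, a=2.0, b=3.0, v0=6.5):
    """Fourier coefficient of the 1D potential (0 on [0, a), v0 on [a, b))
    for the reciprocal vector G = 2*pi*m/b (period b is an assumption)."""
    if m == 0:
        return v0 * (b - a) / b
    g = 2.0 * np.pi * m / b
    return v0 * (np.exp(-1j * g * a) - 1.0) / (1j * g * b)

def kp_bands_1d(k, n_pw=41, a=2.0, b=3.0, v0=6.5):
    """Eigenvalues (Ry) of -d^2/dx^2 + V(x) at Bloch wave number k,
    computed in a basis of n_pw plane waves exp(i (k + G) x)."""
    ms = np.arange(-(n_pw // 2), n_pw // 2 + 1)
    g = 2.0 * np.pi * ms / b
    h = np.diag((k + g) ** 2).astype(complex)   # kinetic part
    for i, mi in enumerate(ms):
        for j, mj in enumerate(ms):
            h[i, j] += kp_fourier_coefficient(mi - mj, a, b, v0)
    return np.linalg.eigvalsh(h)                # sorted ascending

def kp_levels_3d(kvec, n_levels=5, **kw):
    """Lowest 3D levels as sums of three 1D eigenvalues (separable potential)."""
    ex, ey, ez = (kp_bands_1d(k, **kw)[:n_levels] for k in kvec)
    sums = ex[:, None, None] + ey[None, :, None] + ez[None, None, :]
    return np.sort(sums.ravel())[:n_levels]

# Hypothetical wave vector: halfway from the zone centre to the zone face along x.
reference_levels = kp_levels_3d((0.5 * np.pi / 3.0, 0.0, 0.0))
```

Reference values of this kind play the role of Ee in a convergence study; the wave vector used in the last line is an assumed reading of the k-point quoted in the text.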
Sukky Jun and Wing Kam Liu
Figure 4.6 Convergence of the eigenvalues for the Kronig–Penney model potential at k = (π/a)(0.5, 0.0, 0.0) [54]. (Plot of the relative error (En − Ee)/|Ee| versus h for E1 to E5.)
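An average slope such as the one quoted above is typically obtained from a straight-line fit in log-log coordinates. The short Python sketch below shows one way to do this; the data are synthetic (generated from an assumed power law with exponent 3.2) and are not the results of Figure 4.6.

```python
import numpy as np

def convergence_rate(h, rel_err):
    """Least-squares slope of log(rel_err) versus log(h)."""
    slope, _intercept = np.polyfit(np.log(h), np.log(rel_err), 1)
    return slope

# Synthetic, hypothetical data: error = C * h**p with p = 3.2.
h_values = np.array([1.2, 0.8, 0.6, 0.4, 0.3])
errors = 0.05 * h_values ** 3.2
rate = convergence_rate(h_values, errors)
```

For several eigenvalues, the same fit can be applied to each curve and the slopes averaged.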
4.3.3 Empirical pseudopotentials of Si and GaAs

The local empirical pseudopotentials of semiconductors of the diamond and zinc-blende structures have been given by Cohen and Bergstresser [63]. The pseudopotentials are expressed in terms of the reciprocal lattice vector G as

$$V(\mathbf{x}) = \sum_{\mathbf{G}} V_{\mathbf{G}} \exp(i\mathbf{G}\cdot\mathbf{x}), \tag{4.35}$$

where the coefficient is split into the symmetric and antisymmetric parts as

$$V_{\mathbf{G}} = S^{S}_{G}(\mathbf{G})\,F^{S}_{G} + S^{A}_{G}(\mathbf{G})\,F^{A}_{G}, \tag{4.36}$$
where $S_G$ and $F_G$ are the structure factor and form factor, respectively. The diamond and zinc-blende lattice structures have two atoms per primitive cell. They are located at x = a(1/8, 1/8, 1/8) ≡ q and x = −q when the origin is taken halfway between them. Here a denotes the lattice constant of the material. In this case, the structure factors are given by

$$S^{S}_{G}(\mathbf{G}) = \cos(\mathbf{G}\cdot\mathbf{q}) \quad \text{and} \quad S^{A}_{G}(\mathbf{G}) = \sin(\mathbf{G}\cdot\mathbf{q}), \tag{4.37}$$

and the form factors vanish except for terms in which $|\mathbf{G}|^2$, in units of $(2\pi/a)^2$, is 0, 3, 4, 8 or 11. In the examples, two semiconductors are considered: silicon (Si) of the diamond structure, and gallium arsenide (GaAs) of the zinc-blende structure. The values of their form factors are given in [63]. As usual, the symmetric form factor for G = 0 is set to zero because it simply adds a constant to all energy levels.

Figure 4.7 Two-atom primitive cell and primitive vectors of the diamond and zinc-blende structures. a denotes lattice constant of the materials used. The outer cube is for illustration only [54].

For the domain of computation, we select the two-atom primitive cell as illustrated in Figure 4.7. Its primitive vectors are

$$\mathbf{a}_1 = \tfrac{1}{2}a(\hat{\mathbf{j}} + \hat{\mathbf{k}}), \quad \mathbf{a}_2 = \tfrac{1}{2}a(\hat{\mathbf{k}} + \hat{\mathbf{i}}), \quad \mathbf{a}_3 = \tfrac{1}{2}a(\hat{\mathbf{i}} + \hat{\mathbf{j}}), \tag{4.38}$$
where $\hat{\mathbf{i}}$, $\hat{\mathbf{j}}$, $\hat{\mathbf{k}}$ are the Cartesian unit vectors [55]. The 9×9×9-node meshless results for the energy levels of Si (a = 10.261 a.u.) are shown in Figure 4.8 and compared with those from the highly converged empirical pseudopotential method using PW (see e.g. Ref. [66]). The meshless results are deliberately plotted as circles because the two sets of results are indistinguishable when both are plotted as lines. We attain an average convergence rate of approximately $O(h^{3.5})$, as shown in Figure 4.9 for the lowest five eigenvalues at k = (2π/a)(0.5, 0.0, 0.0). In the figure, Ee is the eigenvalue of the PW method and En is the meshless result. The finite element approaches by Pask et al. [9, 10] have shown a convergence rate of $O(h^{6})$ for the same example. Their energy levels agree well with those of the PW method when the number of elements along each direction of the unit cell reaches 6 or more. However, they used the 32-node serendipity element to achieve satisfactory accuracy. The current meshless discretization, on the other hand, is more similar to 8-node brick elements than to the 32-node serendipity elements of FEM. Nevertheless, 9 nodes along each direction of the primitive cell are enough for this meshless method to produce results as good as those in Figure 4.8. The computing time for both FEM and the meshless method depends heavily on the size of the matrix, i.e., the number of nodes. Hence the accuracy of the meshless method with a relatively coarse nodal distribution (9×9×9 nodes) is remarkable. Finally, the band structure of GaAs (a = 10.658 a.u.) is shown in Figure 4.10, which again confirms the high accuracy of the meshless method.

Figure 4.8 Energy level of Si by meshless method (open circle) and plane wave method (line). 9×9×9 nodes are used for meshless method [54]. (Axes: energy in eV versus wave vector k along Γ, X, W, L, Γ, K.)
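Because the potential is only needed at the quadrature points, equations (4.35) to (4.37) can be evaluated directly in real space. The Python sketch below does this for Si, for which only the symmetric part contributes. It is an illustration under stated assumptions: the form-factor values are those commonly quoted from Cohen and Bergstresser and should be checked against Ref. [63]; the function names and the cut-off `max_index` are hypothetical.

```python
import numpy as np

A_SI = 10.261  # lattice constant of Si in a.u. (value from the text)
# Symmetric form factors in Ry, keyed by |G|^2 in units of (2*pi/a)^2.
# Assumed values, as commonly quoted from Ref. [63]; verify before use.
F_S_SI = {3: -0.21, 8: 0.04, 11: 0.08}

def fcc_reciprocal_indices(max_index=3):
    """Integer triples (h, k, l) of the reciprocal lattice of fcc,
    in units of 2*pi/a: all indices even or all odd."""
    rng = range(-max_index, max_index + 1)
    return np.array([(h, k, l) for h in rng for k in rng for l in rng
                     if h % 2 == k % 2 == l % 2])

def si_potential(x, a=A_SI, form_factors=F_S_SI):
    """V(x) = sum_G cos(G.q) F_G^S exp(i G.x) at points x of shape (n, 3).
    Returned as a complex array; the imaginary part should vanish."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    q = a * np.array([1.0, 1.0, 1.0]) / 8.0
    v = np.zeros(len(x), dtype=complex)
    for hkl in fcc_reciprocal_indices():
        f = form_factors.get(int(hkl @ hkl), 0.0)
        if f == 0.0:
            continue                      # form factor vanishes for this shell
        g = 2.0 * np.pi / a * hkl
        v += np.cos(g @ q) * f * np.exp(1j * (x @ g))
    return v

# Usage: potential (Ry) at two arbitrary sample points standing in for quadrature points.
sample_points = np.array([[0.1, 0.2, 0.3], [1.0, -2.0, 0.5]])
v_samples = si_potential(sample_points).real
```

For GaAs the antisymmetric term of (4.36) would also be needed; it is omitted here rather than guessed.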
4.3.4 Strain effect in compound semiconductors

We have also computed the strained compound semiconductor Si$_{1-x}$Ge$_x$ in order to show that the MLS basis properly represents the strain effect in band structures. Because the chemical and electronic properties of Si and Ge are similar, it is possible to produce a solid solution of one element in the other to obtain a silicon germanium alloy [67]. For this strained Si$_{1-x}$Ge$_x$ system, the band structures have to capture the band splitting at high-symmetry points, especially around the valence band maximum at the center of the Brillouin zone. The strain effect is implemented by modifying the primitive lattice vectors as

$$\tilde{\mathbf{a}}_\alpha = (\mathbf{I} + \boldsymbol{\varepsilon})\cdot\mathbf{a}_\alpha, \tag{4.39}$$

where I is the second-rank unit tensor and ε is the strain tensor. $\tilde{\mathbf{a}}_\alpha$ and $\mathbf{a}_\alpha$ denote the primitive lattice vectors of the strained and unstrained alloy, respectively (α = 1, 2, or 3). Without any shear strain in the growth layer, the strain tensor is given as

$$\boldsymbol{\varepsilon} = \begin{pmatrix} \varepsilon_{\parallel} & 0 & 0 \\ 0 & \varepsilon_{\parallel} & 0 \\ 0 & 0 & \varepsilon_{\perp} \end{pmatrix}, \tag{4.40}$$
Figure 4.9 Convergence of the eigenvalues for Si at k = (2π/a)(0.5, 0.0, 0.0) [54]. (Plot of the relative error (En − Ee)/|Ee| versus h for E1 to E5.)
Figure 4.10 Energy level of GaAs by meshless method (open circle) and plane wave method (line). 9×9×9 nodes are used for meshless method [54]. (Axes: energy in eV versus wave vector k along Γ, X, W, L, Γ, K.)
Figure 4.11 (Top) Energy levels of strained Si$_{1-x}$Ge$_x$ (x = 0.4) by meshless method (open circles) and PW method (solid lines). The 11×11×11 nodes are used for meshless method. (Bottom) Comparison between the intrinsic Si (solid lines) semiconductor and the strained compound semiconductor Si$_{1-x}$Ge$_x$ (dotted lines). Energy split at the Γ point is observed. (Axes: relative energy in eV versus wave vector k.)
where $\varepsilon_{\parallel}$ and $\varepsilon_{\perp}$ are the in-plane and transverse strain components, respectively, with respect to the growth layer. According to the minimum-energy condition [68], the relationship between these two strain components is expressed as

$$\varepsilon_{\perp} = -2\,\frac{C_{12}}{C_{11}}\,\varepsilon_{\parallel}, \tag{4.41}$$

where $C_{11}$ and $C_{12}$ are the elastic constants of the compound system. Since the growth layer experiences biaxial strain in the growth plane, the in-plane strain in equation (4.41) can be expressed as

$$\varepsilon_{\parallel} = \frac{a_s - a_c}{a_c}, \tag{4.42}$$

where $a_s$ is the lattice constant of the substrate and $a_c$ denotes the in-plane lattice constant of the compound. Finally, the elastic constants ($C_{11}$ and $C_{12}$) and the lattice constant ($a_c$) of the Si$_{1-x}$Ge$_x$ alloy are calculated by the virtual crystal approximation (VCA). For example, the lattice constant of Si$_{1-x}$Ge$_x$ is given as

$$a_{\mathrm{Si}_{1-x}\mathrm{Ge}_x}(x) = a_{\mathrm{Si}}(1 - x) + a_{\mathrm{Ge}}\,x. \tag{4.43}$$

In our example, we calculated the band structures of Si$_{1-x}$Ge$_x$ with a concentration of x = 0.4. The strain effect is clearly reflected in the splitting of the energy degeneracy at the Γ point, as shown in Figure 4.11.
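Equations (4.39) to (4.43) chain together into a short computation. The Python sketch below is illustrative only. It assumes growth along the z-axis, a Si substrate, linear (VCA) interpolation of the elastic constants in the same way as (4.43), and approximate literature-style values for the Ge lattice constant and the elastic constants in GPa; all of these are assumptions introduced here, not data from the chapter, and the function names are hypothetical.

```python
import numpy as np

def vca(x, p_si, p_ge):
    """Virtual crystal approximation: linear interpolation in the Ge fraction x."""
    return (1.0 - x) * p_si + x * p_ge

def strained_sige(x, a_si=10.261, a_ge=10.69, a_sub=10.261,
                  c11=(165.8, 128.5), c12=(63.9, 48.3)):
    """Return (a_c, eps_par, eps_perp, strained primitive vectors as rows).
    a_ge, a_sub, c11 and c12 (Si, Ge) are assumed illustrative values."""
    a_c = vca(x, a_si, a_ge)                       # eq. (4.43)
    c11_c, c12_c = vca(x, *c11), vca(x, *c12)
    eps_par = (a_sub - a_c) / a_c                  # eq. (4.42)
    eps_perp = -2.0 * c12_c / c11_c * eps_par      # eq. (4.41)
    strain = np.diag([eps_par, eps_par, eps_perp]) # eq. (4.40), growth along z
    primitive = 0.5 * a_c * np.array([[0.0, 1.0, 1.0],
                                      [1.0, 0.0, 1.0],
                                      [1.0, 1.0, 0.0]])   # eq. (4.38), rows a1..a3
    strained = primitive @ (np.eye(3) + strain).T  # eq. (4.39) applied to each row
    return a_c, eps_par, eps_perp, strained

a_c, eps_par, eps_perp, a_tilde = strained_sige(0.4)
```

With these assumed inputs the layer is compressed in-plane and extended along the growth direction, which lowers the cubic symmetry of the primitive cell.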
4.4 PhoXonic Crystals

Here we apply the real-space MLS approach to the band-structure calculation of phoXonic (photonic and phononic) bandgap materials. The matrix eigenvalue equations are developed from the Maxwell equations and the elastic wave equations via the Galerkin formulation in conjunction with MLS-based meshless shape functions. Various examples demonstrate the efficiency and accuracy of the MLS basis for the study of these bandgap materials.
4.4.1 Maxwell equations for 2D photonic crystals

The electromagnetic wave propagation through mixed dielectric media is governed by the macroscopic Maxwell equations. Following the harmonic mode approach, the equations are reduced to

$$\nabla \times \left[\frac{1}{\varepsilon(\mathbf{x})}\nabla \times \mathbf{H}(\mathbf{x};\omega)\right] = \left(\frac{\omega}{c}\right)^{2}\mathbf{H}(\mathbf{x};\omega), \tag{4.44}$$

$$\nabla \cdot \mathbf{H}(\mathbf{x};\omega) = 0, \tag{4.45}$$

where H(x; ω) is the magnetic field intensity and ε(x) is the dielectric function. ω is the frequency of the monochromatic electromagnetic wave and c the speed of light. For simplicity, we consider 2D crystal structures only, in which ε(x) is constant along the z-direction, whereas it is periodic according to the lattice structure in the x–y plane as

$$\varepsilon(\mathbf{x}) = \varepsilon(\mathbf{x} + \mathbf{L}), \tag{4.46}$$
where L is the lattice vector and x is a position vector, both in the x–y plane. In addition, light is assumed to propagate in the x–y plane so that the system is symmetric under reflection through the plane. This symmetry separates the modes into two distinct polarizations: transverse-electric (TE) modes (or H-polarization) and transverse-magnetic (TM) modes (or E-polarization). That is, the vector equation (4.44) is split into two scalar equations for the unknown field intensity ψ(x) as below:

$$-\nabla \cdot \left[\frac{1}{\varepsilon(\mathbf{x})}\nabla\psi(\mathbf{x})\right] = \lambda\psi(\mathbf{x}) \quad \text{(TE modes)}, \tag{4.47}$$

$$-\frac{1}{\varepsilon(\mathbf{x})}\nabla^{2}\psi(\mathbf{x}) = \lambda\psi(\mathbf{x}) \quad \text{(TM modes)}, \tag{4.48}$$
where $\lambda = (\omega/c)^2$. According to the Floquet–Bloch theorem, ψ(x) is expressed as the product of a periodic function and a PW of wave vector k, such as

$$\psi(\mathbf{x}) = u(\mathbf{x})\,e^{i\mathbf{k}\cdot\mathbf{x}}, \tag{4.49}$$
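where u(x) is a function with the periodicity of the lattice structure, i.e., u(x) = u(x + L). By inserting equation (4.49) into equations (4.47) and (4.48), the problem is reduced to solving two equations for u(x) as

$$-(\nabla + i\mathbf{k}) \cdot \left[\frac{1}{\varepsilon(\mathbf{x})}(\nabla + i\mathbf{k})u(\mathbf{x})\right] = \lambda u(\mathbf{x}) \quad \text{(TE modes)}, \tag{4.50}$$

$$-(\nabla + i\mathbf{k}) \cdot (\nabla + i\mathbf{k})u(\mathbf{x}) = \lambda\varepsilon(\mathbf{x})u(\mathbf{x}) \quad \text{(TM modes)}. \tag{4.51}$$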
Many introductions to the physics of photonic crystals are available; see, for example, Ref. [11], on which our brief review is based. The Galerkin formulations for equations (4.50) and (4.51) are stated as: find $u^h(\mathbf{x})$ such that, for all $v^h(\mathbf{x}) \in H^1$,

$$\int_\Omega \frac{1}{\varepsilon(\mathbf{x})}(\nabla + i\mathbf{k})u^h(\mathbf{x}) \cdot \overline{(\nabla + i\mathbf{k})v^h(\mathbf{x})}\,dV = \lambda\int_\Omega u^h(\mathbf{x})\,\overline{v^h(\mathbf{x})}\,dV \tag{4.52}$$

for TE modes and

$$\int_\Omega (\nabla + i\mathbf{k})u^h(\mathbf{x}) \cdot \overline{(\nabla + i\mathbf{k})v^h(\mathbf{x})}\,dV = \lambda\int_\Omega \varepsilon(\mathbf{x})\,u^h(\mathbf{x})\,\overline{v^h(\mathbf{x})}\,dV \tag{4.53}$$

for TM modes. The bars in both equations denote complex conjugates. The problem domain Ω is a unit cell of the crystal lattice. Note that both $u^h(\mathbf{x})$ and $v^h(\mathbf{x})$ have to satisfy the periodic boundary conditions. Details of the Galerkin formulation for this problem are found elsewhere [37, 38]. Let us now employ the meshless representation of $u^h(\mathbf{x})$ and $v^h(\mathbf{x})$ as below:

$$u^h(\mathbf{x}) = \sum_{J}^{NP} N_J^P(\mathbf{x})\,u_J \quad \text{and} \quad v^h(\mathbf{x}) = \sum_{J}^{NP} N_J^P(\mathbf{x})\,v_J, \tag{4.54}$$
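where NP denotes the number of nodes and $N_J^P(\mathbf{x})$ is the periodic meshless shape function. Here $u_J$ and $v_J$ are complex, whereas $N_J^P(\mathbf{x})$ is real. The periodic nature of the shape function is emphasized by the superscript P. Substituting equation (4.54) into equations (4.52) and (4.53) results in the matrix eigenequation

$$\mathbf{A}\mathbf{u} = \lambda\mathbf{B}\mathbf{u}, \tag{4.55}$$

where for TE modes,

$$A_{IJ} = \int_\Omega \frac{1}{\varepsilon(\mathbf{x})}\,\overline{(\nabla + i\mathbf{k})N_I^P(\mathbf{x})} \cdot (\nabla + i\mathbf{k})N_J^P(\mathbf{x})\,dV, \tag{4.56}$$

$$B_{IJ} = \int_\Omega N_I^P(\mathbf{x})\,N_J^P(\mathbf{x})\,dV, \tag{4.57}$$

and for TM modes,

$$A_{IJ} = \int_\Omega \overline{(\nabla + i\mathbf{k})N_I^P(\mathbf{x})} \cdot (\nabla + i\mathbf{k})N_J^P(\mathbf{x})\,dV, \tag{4.58}$$

$$B_{IJ} = \int_\Omega \varepsilon(\mathbf{x})\,N_I^P(\mathbf{x})\,N_J^P(\mathbf{x})\,dV. \tag{4.59}$$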
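The structure of the generalized eigenproblem can be illustrated with a deliberately simplified 1D analogue of the TM matrices. The Python sketch below substitutes periodic linear finite-element hat functions for the MLS shape functions, so it is not the meshless method itself; the function names and the element-wise constant ε are assumptions made for illustration. For ε = 1 the exact eigenvalues are (k + G)², with G = 2πm/a, which gives a simple check.

```python
import numpy as np

def assemble_tm_1d(k, eps_cells, a=1.0):
    """Assemble A and B of A u = lam B u for -(d/dx + i k)^2 u = lam eps u
    on a periodic cell of length a, using linear hat functions (a stand-in
    for the periodic MLS shape functions). eps_cells holds one value per element."""
    n = len(eps_cells)
    h = a / n
    ke = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h       # int N_I' N_J' dx
    me = h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])   # int N_I N_J dx
    ce = np.array([[0.0, -1.0], [1.0, 0.0]])            # int (N_I' N_J - N_I N_J') dx
    A = np.zeros((n, n), dtype=complex)
    B = np.zeros((n, n))
    for e in range(n):
        idx = [e, (e + 1) % n]                          # periodic connectivity
        A[np.ix_(idx, idx)] += ke + 1j * k * ce + k ** 2 * me
        B[np.ix_(idx, idx)] += eps_cells[e] * me
    return A, B

def generalized_eigvals(A, B):
    """Eigenvalues of the Hermitian-definite pencil (A, B) via Cholesky reduction."""
    L = np.linalg.cholesky(B)
    Linv = np.linalg.inv(L)
    return np.linalg.eigvalsh(Linv @ A @ Linv.conj().T)

k_bloch = 0.5 * np.pi
A_hom, B_hom = assemble_tm_1d(k_bloch, np.ones(64))
lam_hom = generalized_eigvals(A_hom, B_hom)
```

The Cholesky reduction turns the generalized problem into a standard Hermitian one, which is the extra work, relative to the PW method, that any real-space basis with a non-diagonal B has to pay.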
When we discretize the unit cell by a set of nodes (or particles) only, it is geometrically natural to locate nodes exactly on the interface between matrix and inclusion. Unfortunately, in this situation it is not clear which dielectric constant has to be assigned to the interface nodes, because the matrix and inclusion have different dielectric constants. In contrast, the meshless method, like the finite element method, can employ a set of integration points, separate from the set of nodes, to calculate the integrals in the matrix equations [42]. The integration points are located in either the inclusion or the matrix, not on the interface. Therefore, the given value of the dielectric constant is straightforwardly assigned to each integration point. This makes the meshless implementation very simple, since it does not require additional techniques such as averaging or smoothing the dielectric function, as other particle-based methods do.
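The assignment of the dielectric constant at integration points can be sketched as follows. The Python snippet is a hedged illustration: it assumes a square unit cell with a centred circular rod, a regular background grid of integration cells with 3×3 Gauss points each, and hypothetical function names; the values r = 0.2a and ε = 8.9 are taken from the circular-rod example of this chapter.

```python
import numpy as np

def gauss_points_2d(n_cells, order=3, a=1.0):
    """Gauss-Legendre points and weights on an n_cells x n_cells background
    grid of integration cells covering the square [0, a] x [0, a]."""
    xi, w = np.polynomial.legendre.leggauss(order)      # nodes/weights on [-1, 1]
    h = a / n_cells
    centers = (np.arange(n_cells) + 0.5) * h
    x1d = (centers[:, None] + 0.5 * h * xi[None, :]).ravel()
    w1d = np.tile(0.5 * h * w, n_cells)
    X, Y = np.meshgrid(x1d, x1d, indexing="ij")
    W = np.outer(w1d, w1d)
    return X.ravel(), Y.ravel(), W.ravel()

def dielectric_at_points(x, y, radius=0.2, eps_rod=8.9, eps_matrix=1.0, a=1.0):
    """Piecewise-constant dielectric function: each point is either inside
    the rod or in the matrix, so no interface averaging is involved."""
    inside = (x - 0.5 * a) ** 2 + (y - 0.5 * a) ** 2 < radius ** 2
    return np.where(inside, eps_rod, eps_matrix)

xq, yq, wq = gauss_points_2d(40)
eps_q = dielectric_at_points(xq, yq)
rod_fraction = wq[eps_q == 8.9].sum()   # approximates pi * 0.2**2
```

The weighted values `eps_q * wq` are what enter integrals such as (4.59); the estimate of the rod area is only a sanity check on the point assignment.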
4.4.2 Band structures of various 2D photonic crystals

The periodic meshless MLS basis is now applied to the computation of band structures in 2D photonic crystals of various lattices and inclusion shapes. For all numerical examples in this section, the quartic spline function is employed for the window function $W_J(\mathbf{x})$, together with the 2D linear basis vector $\mathbf{p}^T(\mathbf{x}) = [1, x, y]$, when calculating the meshless shape function.

4.4.2.1 Homogeneous square lattice

The first example is the square lattice of a homogeneous medium with ε = 1.0 (air). In this case, there is no difference between TM and TE modes. The analytic solution is obtained as

$$\lambda = \left(\frac{\omega}{c}\right)^{2} = |\mathbf{G} + \mathbf{k}|^{2}, \tag{4.60}$$

where G is a reciprocal lattice vector of the square lattice and the other quantities are as defined in previous sections. In spite of its homogeneity, this lattice exhibits frequency band structures due to the imposed periodicity. It is therefore a good example for verifying that our periodic meshless shape function is properly implemented. The meshless band-structure result using 11×11 nodes is compared with the analytic solution in Figure 4.12. The meshless result (open circles) agrees well with the analytic result (lines).

Figure 4.12 Band structures of the homogeneous square lattice calculated by the meshless method (open circles) and from the analytic solution (lines). 11×11 nodes are used for the meshless calculation. Normalized frequency implies ωa/(2πc). (Axes: normalized frequency versus wave vector along M, Γ, X, M.)

Convergence rates of five eigenvalues (E2, . . . , E6) at k = (π/a)(0.5, 0.5) are shown in Figure 4.13, where a is the lattice length and the eigenvalue En denotes the normalized frequency, i.e., En = ωn a/(2πc). The first eigenvalue (E1) is not shown in Figure 4.13 because it is identical to the analytic solution to within machine precision. Ee denotes the analytic solution computed from equation (4.60). Four different discretizations (7×7, 11×11, 21×21, and 41×41 nodes) are employed. The average slope is approximately 3.54, i.e., $O(h^{3.54})$. From this result, we conclude that our implementation of the periodic meshless shape function correctly represents the periodicity of the crystal lattice.

Figure 4.13 Convergence of the eigenvalues for the homogeneous square lattice (ε = 1.0) at k = (π/a)(0.5, 0.5). (Plot of the relative error (Ee − En)/Ee versus h for E2 to E6.)

4.4.2.2 Electromagnetic Kronig–Penney problem

The electromagnetic Kronig–Penney problem is now considered. The unit cell, depicted in Figure 4.14, is composed of two different dielectric materials.

Figure 4.14 Unit cell of the electromagnetic Kronig–Penney problem [52].
The analytic solution for this case can be found elsewhere [39]. This example is considered in order to investigate how accurately the meshless method handles the discontinuity of the dielectric function. Dielectric constants of ε1 = 1.0 and ε2 = 9.0 are used for the simulation. Frequency band structures of both TM and TE modes are given in Figure 4.15. Lines denote the analytic solution and open circles denote the meshless results, using 11×11 nodes for TM modes and 41×41 nodes for TE modes. Convergence rates of the lowest five eigenvalues at k = (π/a)(0.5, 0.5) are given in Figure 4.16, in which the En are the normalized frequency eigenvalues and Ee is the analytic solution, as before. The average slopes for TM and TE modes are 3.12 and 2.77, respectively. The results for TM modes are better than those for TE modes, not only in convergence but also in accuracy.

Figure 4.15 Band structures of the electromagnetic Kronig–Penney problem. Meshless results are denoted by open circles and analytic results by solid lines [52]. (Panels: TM modes and TE modes; axes: normalized frequency versus wave vector from Γ to X.)
Figure 4.16 Convergence rates of the lowest five eigenvalues for the electromagnetic Kronig–Penney problem at k = (π/a)(0.5, 0.5) [52]. (Panels: TM modes and TE modes; relative error (Ee − En)/Ee versus h for E1 to E5, meshfree and plane-wave results.)
This is due to the difference in the smoothness properties of the eigenvectors [38]. There may be several ways to enhance the accuracy of the meshless results for TE modes. A meshless treatment of discontinuity using Lagrange multipliers has been applied to solid mechanics in Ref. [69]. Another interesting implementation with discontinuity-capturing capability has recently been proposed in Ref. [70]. Application of these methods to photonic crystals remains for future study.

This example can also be solved by the PW method. In Figure 4.16, the results obtained by the conventional PW method are therefore compared with those of the meshless method. The meshless results are more accurate for both modes, and the faster convergence of the meshless method is especially apparent for TM modes. For the meshless method, the x-axis indicates the size of discretization, h = a/(NP − 1), where NP denotes the number of nodes in one direction. For the PW method, it similarly denotes h = a/(NPW − 1), where NPW is the number of PW used for the computation. This notation is arbitrary because the meshless method solves generalized eigenvalue problems, while the PW method solves standard eigenvalue problems. The meshless method thus requires more computing time than the PW method. Nonetheless, the notation is convenient for the direct comparison in Figure 4.16, because $(NP)^2$ and $(NPW)^2$ determine the size of the system matrix to be solved.

4.4.2.3 Square lattice of dielectric veins

Our next example is a square lattice of dielectric veins. The lattice structure and the meshless model of its unit cell are illustrated in Figure 4.17, where the unit cell is discretized by 21×21 nodes. The ratio r/a is 0.2, where a is the lattice length and r is the thickness of the vein. Dielectric constants are εvein = 18.0625 for veins and εm = 1.0 otherwise. Meshless results for the band structures of both TE and TM modes are shown in Figure 4.18. The wide band gap for TE modes is in good agreement with previous results computed by the PW method [11, 20].

Figure 4.17 A square lattice of dielectric veins (left) and its unit cell with meshless nodal discretization (right).
Figure 4.18 Meshless results on band structures of the dielectric veins. (Axes: normalized frequency versus wave vector along Γ, M, X, M; TE and TM modes are labelled.)
Figure 4.19 Comparisons of meshless method and PW method (TE modes) for the dielectric veins. (Panels: first TE mode at point M and second TE mode at point X; axes: normalized frequency versus total number of nodes or plane waves.)
However, further investigation reveals a considerable difference between the results of the meshless method and the conventional PW method, as demonstrated in Figures 4.19 and 4.20. The wide bandgap in TE modes is determined by the frequency difference between the first mode at point M and the second mode at point X. Therefore, in Figure 4.19, the frequency values obtained by both methods at these points are compared as the total number of nodes (for the meshless method) and the total number of PW (for the PW method) increase. The meshless results approach the converged solutions more rapidly than those of the PW method. As stated earlier, the meshless formulation results in a generalized matrix eigenvalue problem, whereas the PW method leads to a standard matrix eigenvalue equation. The meshless method thus requires more computing time even though it shows faster convergence. However, the merit of the meshless method is well demonstrated in the comparison for TM modes in Figure 4.20. The meshless results exhibit almost no change in the frequency values as the number of nodes increases, even starting from a very coarse discretization. This is consistent with the highly accurate meshless result for TM modes obtained above for the electromagnetic Kronig–Penney model.

Figure 4.20 Comparisons of meshless method and PW method (TM modes) for the dielectric veins. (Panels: first TM mode at point M and second TM mode at point X; axes: normalized frequency versus total number of nodes or plane waves.)

4.4.2.4 Square and triangular lattices of circular rods

Photonic band structures of square and triangular lattices composed of circular rods are computed here. First, the square lattice structure and the square unit cell are illustrated in Figure 4.21. The unit cell in the figure is discretized by 1697 meshless nodes. The radius of the circular cross section is r = 0.2a for this example. Dielectric constants are εm = 1.0 and εr = 8.9 for matrix and rods, respectively. In this case, a wide bandgap exists in TM modes, as is known from the literature. The meshless band structures also confirm the wide bandgap in TM modes, as shown in Figure 4.22. In this example, the width of the bandgap is determined by the first TM mode at point M and the second TM mode at point X. The numerical results at these points obtained by both the meshless method and the PW method are compared in Figure 4.23, which demonstrates the faster convergence of the meshless method over the PW method.

Figure 4.21 A square lattice of circular rod (left) and its unit cell discretized by meshless method (right) [52].

Figure 4.22 Meshless results on band structures of square lattice composed of circular rods [52]. (Axes: normalized frequency versus wave vector along M, Γ, X, M; TE and TM modes are labelled.)

The last example is a triangular lattice composed of circular air columns. The lattice is given in Figure 4.24. Its unit cell is a parallelogram as shown in the figure, discretized by 1486 nodes. Dielectric constants are εm = 13.0 and εr = 1.0 for matrix and air cylinders, respectively. The ratio of the radius of the circular cross section to the side length of the parallelogram is r/a = 0.45. For this lattice structure and these dielectric constants, it is known that a complete band gap exists for all polarizations around the normalized frequency of 0.4 [11, 26, 71]. This complete bandgap is also verified by the meshless results, as shown in Figure 4.25. The bandgap is determined by TM modes, for which the meshless method gives highly accurate results, as demonstrated in previous examples.
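Once band frequencies have been computed on a set of k-points, the gap between two consecutive bands follows from the maximum of the lower band and the minimum of the upper band. The Python sketch below illustrates this bookkeeping; the array layout, the function name and the sample numbers are hypothetical and are not results from this chapter.

```python
import numpy as np

def band_gap(bands, lower):
    """Gap between band `lower` and band `lower + 1` (0-based indices).
    `bands` has shape (n_kpoints, n_bands), sorted ascending along each row.
    Returns (gap_bottom, gap_top, gap_to_midgap_ratio), or None if the bands overlap."""
    bands = np.asarray(bands, dtype=float)
    gap_bottom = bands[:, lower].max()       # top of the lower band
    gap_top = bands[:, lower + 1].min()      # bottom of the upper band
    width = gap_top - gap_bottom
    if width <= 0.0:
        return None
    return gap_bottom, gap_top, width / (0.5 * (gap_bottom + gap_top))

# Hypothetical normalized frequencies at three k-points (rows) for two bands.
sample_bands = [[0.00, 0.45],
                [0.20, 0.50],
                [0.30, 0.44]]
gap = band_gap(sample_bands, lower=0)
```

For a complete bandgap, the same test would be applied to the union of the TE and TM bands at each k-point.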
Figure 4.23 Comparisons of meshless method and PW method (TM modes) for the square lattice of circular rods [52]. (Panels: 1st TM mode at point M and 2nd TM mode at point X; axes: normalized frequency versus total number of nodes or plane waves.)
Figure 4.24 A triangular lattice of circular rod (left) and its unit cell discretized by meshless method (right).

Figure 4.25 Meshless results on band structures of triangular lattice composed of circular rods. (Axes: normalized frequency versus wave vector along X, Γ, K, X; TE modes, TM modes and the complete band gap are labelled.)
4.4.3 Acoustic bandgap materials

A periodic distribution of an elastic material in another elastic background medium can yield a new material with a crystal structure that controls the propagation of sound waves, known as a phononic crystal. As a numerical example, using the MLS basis, we solve the eigenvalue problem of the elastic wave equation for 2D phononic crystals in which elastic rods are embedded in another elastic material. Let us consider a square lattice of circular elastic cylinders that are infinite in the $\hat{\mathbf{k}}$ direction. We then have transverse acoustic waves with displacement u parallel to the axis of the cylinders, i.e., $\mathbf{u}(\mathbf{r}, t) = u(\mathbf{r}, t)\hat{\mathbf{k}}$, where $\mathbf{r} = x\hat{\mathbf{i}} + y\hat{\mathbf{j}}$. The wave equation of this transverse acoustic wave model is given as

$$\rho(\mathbf{r})\frac{\partial^{2}u(\mathbf{r}, t)}{\partial t^{2}} = \nabla \cdot [c(\mathbf{r})\nabla u(\mathbf{r}, t)], \tag{4.61}$$
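where ρ(r) is the mass density and c(r) is the elastic constant ($= c_{44}$) of the composite for transverse acoustic waves; both are periodic functions of r. Employing the Bloch theorem and the Galerkin formulation, we obtain the generalized eigenvalue equation

$$(\mathbf{A} + \mathbf{G} + i\mathbf{H})\mathbf{U} + \omega^{2}\mathbf{M}\mathbf{U} = 0. \tag{4.62}$$

The components of each matrix are defined as

$$M_{IJ} = \int_\Omega \rho(\mathbf{r})\,N_I(\mathbf{r})\,N_J(\mathbf{r})\,dV, \tag{4.63}$$

$$A_{IJ} = -\int_\Omega \mathbf{B}_I^{T}\mathbf{D}\mathbf{B}_J\,dV, \tag{4.64}$$

$$G_{IJ} = -\int_\Omega \mathbf{K}_I^{T}\mathbf{D}\mathbf{K}_J\,dV, \tag{4.65}$$

$$H_{IJ} = \int_\Omega \mathbf{B}_I^{T}\mathbf{D}\mathbf{K}_J\,dV, \tag{4.66}$$

where

$$\mathbf{B}_I = \begin{pmatrix} N_{I,x} \\ N_{I,y} \end{pmatrix}, \quad \mathbf{D} = \begin{pmatrix} \rho(\mathbf{r})c(\mathbf{r})^{2} & 0 \\ 0 & \rho(\mathbf{r})c(\mathbf{r})^{2} \end{pmatrix}, \quad \mathbf{K}_I = \begin{pmatrix} k_x N_I \\ k_y N_I \end{pmatrix}, \tag{4.67}$$

and $k_x$ and $k_y$ are the components of the wave vector k. The shape functions $N_I(\mathbf{r})$ and $N_J(\mathbf{r})$ are the periodic meshless shape functions defined by the MLS approximation, as in the previous sections.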
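The step from the wave equation (4.61) to the matrix equation (4.62) is only summarized in the text. The following LaTeX sketch fills it in under stated assumptions: a time-harmonic Bloch ansatz with the sign convention $e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)}$, real periodic shape functions, and a single shape function $N_I$ as test function. The symbol $\tilde u$ for the periodic part is introduced here for illustration. The sign of the cross term depends on the Bloch convention, so the grouping is indicative and should not be read as the authors' exact definitions.

```latex
\begin{align*}
% Bloch ansatz inserted into (4.61)
u(\mathbf r,t) = \tilde u(\mathbf r)\,e^{i(\mathbf k\cdot\mathbf r-\omega t)}
\;&\Longrightarrow\;
-\omega^{2}\rho\,\tilde u
  = (\nabla+i\mathbf k)\cdot\bigl[c\,(\nabla+i\mathbf k)\tilde u\bigr],\\
% weak form: multiply by \bar v, integrate by parts over the periodic cell
\int_\Omega c\,(\nabla+i\mathbf k)\tilde u\cdot
  \overline{(\nabla+i\mathbf k)v}\,dV
  &= \omega^{2}\int_\Omega \rho\,\tilde u\,\bar v\,dV,\\
% discretisation with \tilde u = \sum_J N_J U_J and v = N_I
\sum_J\int_\Omega c\,\bigl[\nabla N_I\cdot\nabla N_J
  + |\mathbf k|^{2}N_I N_J
  + i\,(N_J\,\mathbf k\cdot\nabla N_I - N_I\,\mathbf k\cdot\nabla N_J)\bigr]dV\;U_J
  &= \omega^{2}\sum_J M_{IJ}\,U_J .
\end{align*}
```

Moving all terms to one side, the gradient-gradient term plays the role of A, the $|\mathbf{k}|^2$ term that of G, and the cross term that of iH in (4.62), with the material coefficient entering through D.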
Figure 4.26 Frequency band structures of 2D phononic crystals computed by meshless method (open circles) and PW method (solid lines). Volume fraction is 0.35. The 441 nodes are used for MLS-based meshless method, and the 2601 PW are employed as the reference solution. (Axes: frequency versus wave vector along M, Γ, X, M.)
In Figure 4.26, the meshless results with 441 nodes are compared with the PW results with 2601 PW for the lowest ten eigenvalues. The volume fraction is 0.35. The elastic constant of the cylinders is c = 4c0, where c0 is the elastic constant of the background medium (air in this example). Likewise, a mass density of ρ = 4ρ0 is used. We find a wide bandgap that prohibits the propagation of sound waves of the corresponding frequencies, and the meshless results converge well to the highly refined results of the PW method.
4.5 Strain-Tunable Photonic Bandgap Materials

During the last decade, photonic bandgap materials have attracted increasing interest from the photophysics and optoelectronics communities because of the possibility of controlling light propagation through the materials [11, 17], as discussed in the previous section. More recently, a growing number of research papers have specifically reported the tunability of the absolute bandgap in photonic crystals. For example, various techniques of bandgap tuning have been demonstrated by applying an electric field [72, 73], temperature [74, 75], and a magnetic field [76], as well as by infiltrating liquid crystals [77]. In addition, mechanical strain has also been examined as an alternative means of modifying band structures and, possibly, of tuning the bandgaps in photonic crystals as desired [78–82].
In this section, we investigate the modification of the bandgap in photonic crystals undergoing mechanical deformation. We consider the 2D triangular photonic crystal consisting of a silicon matrix and cylindrical air rods. Our analysis of the effect of deformation (or strain) on bandgap modification follows a route different from that of previous studies of strain-tunable photonic bandgap materials. First, using the MLS method, we numerically solve the linear elasticity problem for the unit cell of the photonic crystal, in order to find the actual deformation of this highly inhomogeneous material subject to prescribed displacements at the outer boundaries of the unit cell. Then, we perform the real-space band-structure calculation, also using the MLS method, which is very useful for modelling arbitrarily distorted interfaces due to the deformation. Taking full advantage of the real-space technique, we thus focus on the realistic modelling of the deformation-induced change in the shape of the interface between matrix and inclusion, and consequently on the effect of the distorted interface on bandgap modification. In this respect, the current numerical study differs from some results in the literature, where the emphasis was placed only on reconfiguring the lattice structure by mechanical strain without considering the detailed shape change along the interface.
4.5.1 Deformations of 2D triangular photonic crystals

In the mechanics of materials, strains or deformations are conventionally classified into fundamental modes such as pure shear, simple shear, and uniaxial tension, as shown in Figure 4.27, where the undeformed square is drawn with a dashed line for clarity [83]. Each deformation mode has its own characteristics. For example, pure shear implies that the shear stress is the only non-vanishing stress component at every point in the material. However, a photonic crystal is a highly inhomogeneous material with abrupt changes in material constants across the interface between matrix and inclusions. From the point of view of mechanics, the internal responses of matrix and inclusions to prescribed boundary conditions are quite different from each other. For example, the silicon matrix deforms in response to the applied load, while the air columns do not transfer the load. It is therefore difficult to define fundamental deformation modes of photonic crystals that are exactly equivalent to those of homogeneous materials illustrated in Figure 4.27. Nonetheless, we borrow here the terminology of those three deformation modes for convenience. We use them to specify how we impose the corresponding displacement boundary conditions on the outer boundary of the unit cell. The terminology can also be justified for inhomogeneous materials because those three modes provide the only means available to guide systematically how strains can be applied in practice.

Figure 4.27 Illustrations of fundamental deformation modes: (a) pure shear, (b) simple shear and (c) uniaxial tension. Dashed square is the undeformed original shape [53].

We consider 2D triangular photonic crystals consisting of cylindrical air (ε = 1.0) rods in a silicon (ε = 11.56) matrix. The volume fraction of the inclusion in the undeformed unit cell is f = 0.72. It is well known that this triangularly periodic structure of these dielectric materials has a complete photonic bandgap. The three deformation modes mentioned above are separately implemented by imposing the corresponding displacements ($u_x$, $u_y$) along the boundary of the unit cell. That is, we first prepare the undeformed parallelogram unit cell as shown in Figure 4.28, and then prescribe the displacement boundary conditions according to each deformation mode. In this chapter, we restrict our study to a relatively small deformation of 3% strain, as in [80]. The strain is defined as $\epsilon_{ij} = (\partial u_i/\partial x_j + \partial u_j/\partial x_i)/2$, where $u_1 = u_x$, $u_2 = u_y$, $x_1 = x$, and $x_2 = y$ in our notation. Note that the prescribed boundary condition is in fact imposed only on the silicon part (i.e., not on the air) of the outer boundary of the unit cell. Next, we numerically solve the governing equations of 2D plane-strain linear elasticity to obtain the displacement field at all points in the silicon matrix. Refer to Ref. [42] for how to solve linear elasticity problems using the MLS method. A Young's modulus of 185 GPa and a Poisson's ratio of 0.26 are used here for silicon.
A traction-free boundary condition is imposed on the circular interface between the silicon and the air, while the silicon part of the outer boundary is constrained by the displacements prescribed above. Figure 4.28 compares the computed results under 3% elastic strain for each of the three deformation modes with the undeformed original unit cell. It clearly shows that the circular interfaces are slightly distorted by the deformation. In the following section, we explore how the distorted interfaces contribute to modifying the original band structures of the photonic crystal.
Figure 4.28 Undeformed (grey) and deformed (solid black) unit cells of 2D triangular photonic crystal with cylindrical air rods: (a) pure shear, (b) simple shear and (c) uniaxial tension. In each mode, the corresponding shear or tensile strain of 3% is applied [53]. (See also colour plate 1.)
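As an illustration of the boundary conditions and strain measure described above, the following Python sketch builds homogeneous displacement fields for the three modes and evaluates ε_ij. The specific displacement gradients, the reading of "3% strain" as an engineering shear strain for the shear modes, and all function names are assumptions made for this example; the chapter defines the modes through Figure 4.27 and does not list these formulas. The sketch also converts the quoted elastic constants of silicon to Lamé constants and recovers the rod radius implied by f = 0.72, assuming circular rods on a triangular lattice.

```python
import numpy as np

STRAIN = 0.03              # 3% applied strain, as in the text
E_SI, NU_SI = 185.0, 0.26  # Young's modulus (GPa) and Poisson's ratio of silicon

# Assumed homogeneous displacement gradients H[i, j] = du_i/dx_j for each mode.
# These textbook forms are illustrative; they are not taken from the chapter.
MODES = {
    "pure shear": np.array([[0.0, STRAIN / 2], [STRAIN / 2, 0.0]]),
    "simple shear": np.array([[0.0, STRAIN], [0.0, 0.0]]),
    "uniaxial tension": np.array([[STRAIN, 0.0], [0.0, 0.0]]),
}

def boundary_displacement(H, points):
    """Displacements (u_x, u_y) = H x at boundary points given as an (N, 2) array."""
    return points @ H.T

def small_strain(H):
    """Small-strain tensor eps_ij = (du_i/dx_j + du_j/dx_i) / 2."""
    return 0.5 * (H + H.T)

def lame_constants(E, nu):
    """Lame constants (lambda, mu) from Young's modulus and Poisson's ratio."""
    lam = E * nu / ((1.0 + nu) * (1.0 - 2.0 * nu))
    mu = E / (2.0 * (1.0 + nu))
    return lam, mu

def plane_strain_stress(eps, E, nu):
    """In-plane stress for plane strain: sigma = lambda tr(eps) I + 2 mu eps."""
    lam, mu = lame_constants(E, nu)
    return lam * np.trace(eps) * np.eye(2) + 2.0 * mu * eps

def rod_radius_ratio(f):
    """r / (lattice constant) for circular rods of area fraction f on a triangular lattice."""
    return np.sqrt(np.sqrt(3.0) * f / (2.0 * np.pi))

# Corners of a parallelogram unit cell with unit lattice constant (assumed geometry).
corners = np.array([[0.0, 0.0], [1.0, 0.0], [1.5, np.sqrt(3) / 2], [0.5, np.sqrt(3) / 2]])

if __name__ == "__main__":
    for name, H in MODES.items():
        eps = small_strain(H)
        print(name, "strain:", eps.tolist(), "area change:", np.trace(eps))
        print("  corner displacements:", boundary_displacement(H, corners).tolist())
    print("Lame constants (GPa):", lame_constants(E_SI, NU_SI))
    print("r / lattice constant for f = 0.72:", rod_radius_ratio(0.72))
```

Under these assumptions the two shear modes share the same small-strain tensor and differ only by a rigid rotation, while only the tension mode changes the cell area to first order.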
4.5.2 Band structures of deformed photonic crystals
Based on the deformed unit cells obtained from the linear elasticity problem, we now compute the photonic band structures using the MLS method and investigate the effect of each deformation mode on bandgap modification. Because the method is a real-space technique, as stated before, it is well suited to accurately modelling the slightly distorted interfaces caused by deformation. The unit cell is discretized by 925 nodes and the associated 864 integration cells. The MLS model uses 3×3 quadrature points for each cell and the polynomial basis vector [1, x, y]. A detailed description of how the MLS method is used for highly accurate computation of photonic band structures is presented in previous sections. When computing the band structures of the deformed photonic crystal, we do not take into account the stress-induced change of the dielectric constant in the silicon matrix, because the deformation considered in this example is small enough to disregard it [84]. As the photonic crystal deforms, the interface between the silicon matrix and the air hole is distorted. In addition, the lattice itself deviates from the original perfect triangular lattice. Consequently, the perfect hexagonal symmetry of the reciprocal lattice is broken. When computing band structures, we therefore employ the quasi-hexagonal symmetry points of the deformed photonic crystals, as illustrated in Figure 4.29. For each deformation mode, we are able to obtain the exact geometry of each quasi-hexagon subjected to 3% strain. Three different symmetry zones, labelled 1–3 in the figure, are considered in calculating the band structures [80]. The band structures for the deformation modes are given in Figure 4.30, where the top, middle, and bottom rows are for pure shear, simple shear, and uniaxial tension, respectively.
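To make the loss of hexagonal symmetry concrete, the sketch below deforms the direct lattice vectors of a triangular lattice homogeneously and compares the distances from Γ to the three M-type points of the reciprocal lattice. The lattice-vector convention, the homogeneous displacement gradients, and the function names are assumptions for illustration; the chapter obtains the quasi-hexagon geometry from its own deformed unit cells.

```python
import numpy as np

def reciprocal_vectors(A):
    """Rows of A are direct lattice vectors a_i; rows of the result are b_j
    with a_i . b_j = 2 pi delta_ij."""
    return 2.0 * np.pi * np.linalg.inv(A).T

def deform(A, H):
    """Map each lattice vector x to (I + H) x for a displacement gradient H."""
    return A @ (np.eye(2) + H).T

def m_point_distances(A):
    """Distances from Gamma to the M-type points b1/2, b2/2 and (b1 + b2)/2."""
    b1, b2 = reciprocal_vectors(A)
    return np.array([np.linalg.norm(v) / 2.0 for v in (b1, b2, b1 + b2)])

# Triangular lattice with unit lattice constant (assumed convention).
A0 = np.array([[1.0, 0.0], [0.5, np.sqrt(3.0) / 2.0]])
H_TENSION = np.array([[0.03, 0.0], [0.0, 0.0]])  # assumed 3% uniaxial stretch along x
H_SHEAR = np.array([[0.0, 0.03], [0.0, 0.0]])    # assumed 3% simple shear

if __name__ == "__main__":
    print("undeformed:", m_point_distances(A0))
    print("tension:   ", m_point_distances(deform(A0, H_TENSION)))
    print("shear:     ", m_point_distances(deform(A0, H_SHEAR)))
```

In the undeformed lattice the three distances coincide; after deformation they generally differ, which is why separate symmetry zones have to be traversed.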
Figure 4.29 Schematic diagrams of symmetry points and zones in the reciprocal lattice of undeformed (left) and deformed (right) photonic crystals [53]. (See also colour plate 2.)
[Figure 4.30 panels, left to right in each row: undeformed, zone 1, zone 2, zone 3. Vertical axis: ωa/2πc, from 0 to 0.7. Horizontal axis: path through the symmetry points (M and K) of each zone.]
Figure 4.30 Photonic band structures under pure shear (top), simple shear (middle), and uniaxial tension (bottom). TM and TE modes are in solid black and grey, respectively. Dashed horizontal lines indicate the bandgap of the undeformed original photonic crystal. Insets in the top row illustrate the quasi-hexagonal symmetry zones of the deformed photonic crystal [53]. (See also colour plate 3.)
Table 4.1 Correlations between volume fraction and bandgap shift [53]
                    Absolute bandgap (ωa/2πc)    Volume fraction
Undeformed          0.400041 ∼ 0.434603          0.720000 (100.0%)
Pure shear          0.400546 ∼ 0.434311          0.717946 (99.71%)
Simple shear        0.401445 ∼ 0.435770          0.718931 (99.85%)
Uniaxial tension    0.384464 ∼ 0.420237          0.702820 (97.61%)
In Figure 4.30, ω is the frequency of the monochromatic electromagnetic wave, c the speed of light, and a the radius of the cylindrical air inclusions. In each row, the band structure of the undeformed photonic crystal is given in the left column. The other three columns correspond to the three different symmetry zones of the deformed reciprocal lattice, as depicted in Figure 4.30. The two dashed horizontal lines in each row indicate the upper and lower bounds of the absolute bandgap of the undeformed original photonic crystal. This original bandgap is overlaid to show clearly the bandgap changes due to each deformation mode. In Figure 4.30, it is notable that both pure and simple shear deformations cause little alteration of the absolute bandgap, although the band structures are slightly modified along the symmetry lines. In fact, the widths of the absolute bandgaps do not change significantly in any of the three deformation cases under this small applied strain. This conclusion is contrary to Ref. [80], which used the PW expansion method and thus did not consider the details of interface changes due to deformation. Our results are instead similar to those of more recent references, although different types of materials were employed [82, 84]. On the other hand, the only remarkable modification of the bandgap in our study is found in the case of uniaxial tension (the bottom row), where the absolute bandgap is shifted down without considerable change of its width. Table 4.1 shows that this is mainly due to the deformation-induced change of volume fraction. The volume fractions are computed from the deformed shapes obtained from the linear elasticity problems of the previous section. In the table, the reduction of volume fraction in the case of uniaxial tension is more significant than in both shear cases. Consequently, the bandgap shift is evident in the tensile mode only.
This type of correlation between volume fraction and bandgap shift can also be found in the gap map for a triangular lattice of air columns in a medium of ε = 11.4 given in [11], where the bandgap shifts down as the volume fraction decreases near f = 0.72, without perceptible change of bandgap width. As already seen in Figure 4.28, lattice distortions are somewhat greater in the shear modes than in the uniaxial tension mode. Nevertheless, as noted above, the bandgap shift is prominent in tension only. This is most probably because the effects of lattice distortion are compensated by the interface distortions in the shear modes, which is not the case in uniaxial tension. This strongly implies that, when computing band structures of strained photonic crystals, we must carefully consider not only the lattice distortion but also the shape changes of the interfaces.
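The correlation discussed above can be checked directly from the numbers in Table 4.1. The short sketch below computes the width and midgap position of each absolute bandgap and the volume fraction relative to the undeformed cell; the dictionary layout and function name are assumptions of this example, and the values are copied from the table.

```python
# Values copied from Table 4.1: (lower edge, upper edge, volume fraction).
TABLE_4_1 = {
    "undeformed": (0.400041, 0.434603, 0.720000),
    "pure shear": (0.400546, 0.434311, 0.717946),
    "simple shear": (0.401445, 0.435770, 0.718931),
    "uniaxial tension": (0.384464, 0.420237, 0.702820),
}

def summarize(table, reference="undeformed"):
    """Return width, midgap shift and relative volume fraction for each row."""
    lo0, hi0, f0 = table[reference]
    mid0 = 0.5 * (lo0 + hi0)
    summary = {}
    for mode, (lo, hi, f) in table.items():
        summary[mode] = {
            "width": hi - lo,                          # gap width in units of 2*pi*c/a
            "midgap_shift": 0.5 * (lo + hi) - mid0,    # shift of the gap centre
            "volume_fraction_percent": 100.0 * f / f0, # relative to the undeformed cell
        }
    return summary

if __name__ == "__main__":
    for mode, row in summarize(TABLE_4_1).items():
        print(f"{mode:17s} width={row['width']:.6f} "
              f"shift={row['midgap_shift']:+.6f} "
              f"f/f0={row['volume_fraction_percent']:.2f}%")
```

With these values the midgap moves down by about 0.015 in tension and changes by roughly 0.001 or less in the two shear modes, while all four widths stay between about 0.034 and 0.036.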
4.6 Concluding Remarks
We have proposed a unified approach to the band-structure calculations of atomic, photonic, and phononic crystals. Our approach uses periodic meshless shape functions based on the MLS approximation. It is a real-space method, which is promising compared with the conventional PW method, especially for large-scale computing and for the representation of arbitrary interfaces. The cell-periodic meshless shape function is introduced in order to represent the periodicity of crystals. The modification of the regular meshless shape function is made at the step where nodes are searched for under the support of the shape function. With this translation-and-searching algorithm, the MLS-based meshless shape function enables any periodic function to be reproduced correctly. We have shown that the MLS basis is readily applicable to various real-space calculations of the electronic structure of crystalline solids with periodic potentials. The periodic meshless shape function as implemented has been demonstrated to be very efficient for problems involving periodicity, and the accuracy obtained with it is outstanding. In the examples of 3D electronic-structure calculations, we employed a spherical support and 3×3×3 quadrature points for each integration cell. A systematic analysis of the effects of the support and of numerical integration on the accuracy of the eigenvalues is desirable for further development. For these electronic-structure calculations using the meshless method, it will be interesting to see the rates of convergence for shape functions with the consistency of sinusoidal functions and atomic orbitals, as well as of higher-order polynomials. This remains for future study. Adaptivity and parallel computing will immediately expand the range of tractable applications, including the first-principles calculation of the electronic structures of crystalline solids.
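The translation-and-searching idea can be sketched in one dimension: when collecting the nodes under the support of an evaluation point, node positions are translated by whole periods so that nodes near the opposite cell edge are also found. The following minimal example builds an MLS shape function with a linear basis in this way. The cubic-spline weight, the shifted basis [1, d], the minimum-image translation, and all names are assumptions chosen for brevity; they are not the authors' implementation.

```python
import numpy as np

def min_image(d, L):
    """Translate displacements by whole periods L so they lie within half a period."""
    return d - L * np.round(d / L)

def cubic_spline(q):
    """Cubic B-spline weight, nonzero only for |q| <= 1."""
    q = np.abs(q)
    w = np.zeros_like(q)
    inner = q <= 0.5
    outer = (q > 0.5) & (q <= 1.0)
    w[inner] = 2.0 / 3.0 - 4.0 * q[inner] ** 2 + 4.0 * q[inner] ** 3
    w[outer] = (4.0 / 3.0 - 4.0 * q[outer] + 4.0 * q[outer] ** 2
                - (4.0 / 3.0) * q[outer] ** 3)
    return w

def periodic_mls_shape(x, nodes, L, support):
    """MLS shape functions at x for all nodes of a periodic cell of length L."""
    d = min_image(nodes - x, L)             # periodic node search via translation
    w = cubic_spline(d / support)           # weights vanish outside the support
    P = np.vstack([np.ones_like(d), d])     # shifted linear basis [1, d], shape (2, N)
    M = (P * w) @ P.T                       # 2x2 moment matrix
    coeff = np.linalg.solve(M, np.array([1.0, 0.0]))  # M^{-1} p(0)
    return (coeff @ P) * w

if __name__ == "__main__":
    L = 1.0
    nodes = np.linspace(0.0, L, 10, endpoint=False)
    x = 0.97                                # evaluation point close to the cell edge
    phi = periodic_mls_shape(x, nodes, L, support=0.25)
    print("sum of shape functions:", phi.sum())
    print("contribution of the node at 0:", phi[0])
```

Because the translated nodes enter the moment matrix like ordinary neighbours, the shape functions still sum to one at evaluation points next to the cell boundary.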
Our implementation has also proven to be highly accurate in the application of the periodic meshless shape function to the frequency band-structure computation of 2D homogeneous photonic crystals. Several types of inhomogeneous photonic bandgap materials have been examined to study the performance of the meshless method for the band-structure calculation of phoXonic crystals. Throughout the numerical examples of photonic crystals, it has been verified that the meshless results for both TM and TE modes are more accurate than those of the conventional PW methods. In particular, the TM-mode results of the meshless method are outstanding compared with the TE-mode results. This property has also been found in other real-space techniques such as the finite element method and the finite difference method [29, 38]. To enhance the accuracy and convergence of the TE-mode solutions of the meshless method, special techniques may be applied in our future study. Furthermore, we have computed band structures of 2D triangular photonic crystals of silicon matrix and cylindrical air columns that deform under external strain. The contribution of shear strains to the bandgap modification is not remarkable, while uniaxial tension produces a considerable bandgap shift. This is contrary to the theoretical prediction from other methods. We have shown that this difference between deformation modes comes from the changes of volume fraction, which are closely related to the distortion of the interfaces between the matrix and the inclusions. This rigorous analysis would not have been straightforward with the conventional PW method. Our results are thus expected to serve as practical guidance for tuning photonic bandgaps by applying mechanical strain. Extending the meshless band-structure computation to 3D photonic crystals is an example to which the periodic meshless shape function can immediately be applied. Various value-periodic problems may also be simulated using this periodic MLS meshless basis. Examples include pattern formation problems such as the Ginzburg–Landau and Swift–Hohenberg equations, the surface morphology of soft materials, quantum island formation, strain-induced nanopatterning on surfaces, and periodic arrays of quantum heterostructures.
ACKNOWLEDGEMENTS
We gratefully acknowledge the support of this work by the National Science Foundation (NSF), the NSF-IGERT program, the NSF Summer Institute on Nano Mechanics and Materials, the Army Research Office (ARO) and the Office of Naval Research (ONR). SJ acknowledges the support of Korea Research Foundation Grant (KRF-003-D00007). We are also grateful to R. Murphy, Y.-S. Cho, and S. Y. Kim for their help.
REFERENCES
1. M. S. Kushwaha, Classical band structure of periodic elastic composites, Int. J. Mod. Phys. B, 10 (1996), 977–1094.
2. J.-L. Fattebert and M. B. Nardelli, Finite Difference Methods for Ab Initio Electronic Structure and Quantum Transport Calculations of Nanostructures, Handbook of Numerical Analysis, C. Le Bris (ed.), Vol. X: Special Volume: Computational Chemistry, Elsevier, Amsterdam, 2003.
3. E. Wimmer, H. Krakauer, M. Weinert, and A. J. Freeman, Full-potential self-consistent linearized-augmented-plane-wave method for calculating the electronic structure of molecules and surfaces: O2 molecule, Phys. Rev. B, 24 (1981), 864–875.
4. J. Q. Broughton, F. F. Abraham, N. Bernstein, and E. Kaxiras, Concurrent coupling of length scales: methodology and application, Phys. Rev. B, 60 (1999), 2391–2403.
5. T. L. Beck, Real-space mesh techniques in density-functional theory, Rev. Mod. Phys., 72 (2000), 1041–1080.
6. J. R. Chelikowsky, N. Troullier, and Y. Saad, Finite-difference-pseudopotential method: Electronic structure calculations without a basis, Phys. Rev. Lett., 72 (1994), 1240–1243.
7. R. L. Ferrari, Electronic band-structure for 2-dimensional periodic lattice quantum configurations by the finite-element method, Int. J. Numer. Model. Electron. Network. Device. Field., 6(4) (1993), 283–297.
8. E. Tsuchida and M. Tsukada, Adaptive finite-element method for electronic-structure calculations, Phys. Rev. B, 54 (1996), 7602–7605.
9. J. E. Pask, B. M. Klein, C. Y. Fong, and P. A. Sterne, Real-space local polynomial basis for solid-state electronic-structure calculations: A finite-element approach, Phys. Rev. B, 59 (1999), 12352–12358.
10. J. E. Pask, B. M. Klein, P. A. Sterne, and C. Y. Fong, Finite-element methods in electronic-structure theory, Comput. Phys. Comm., 135 (2001), 1–34.
11. J. D. Joannopoulos, R. D. Meade, and J. N. Winn, Photonic Crystals: Molding the Flow of Light, Princeton University Press, Princeton, 1995.
12. S. G. Johnson and J. D. Joannopoulos, Photonic Crystals: The Road from Theory to Practice, Kluwer Academic Publishers, Boston, 2002.
13. S. John, Strong localization of photons in certain disordered dielectric superlattices, Phys. Rev. Lett., 58(23) (1987), 2486–2489.
14. E. Yablonovitch, Inhibited spontaneous emission in solid-state physics and electronics, Phys. Rev. Lett., 58(20) (1987), 2059–2062.
15. J. D. Joannopoulos, P. R. Villeneuve, and S. Fan, Photonic crystals, Solid State Commun., 102(2–3) (1997), 165–173.
16. C. Weisbuch, H. Benisty, S. Olivier, C. J. M. Smith, and T. F. Krauss, Advances in photonic crystals, Phys. Stat. Sol. (b), 221 (2000), 93–99.
17. Y. Xia, Photonic crystals, Adv. Mater., 13(6) (2001), 369 and papers in this special issue.
18. E. Yablonovitch, Photonic band-gap crystals, J. Phys. Condens. Matter, 5 (1993), 2443–2460.
19. E. Yablonovitch, Photonic band-gap crystals: semiconductor of light, Scientific American, 2001, 47–55.
20. K. Busch, Photonic band structure theory: assessment and perspectives, C. R. Physique, 3 (2002), 53–66.
21. D. Cassagne, Photonic band gap materials, Ann. Phys. Fr., 23 (1998), 1–91.
22. J. B. Pendry, Calculating photonic band structure, J. Phys. Condens. Matter, 8 (1996), 1085–1108.
23. K. M. Ho, C. T. Chan, and C. M. Soukoulis, Existence of a photonic gap in periodic dielectric structures, Phys. Rev. Lett., 65 (1990), 3152–3155.
24. K. M. Leung and Y. F. Liu, Full vector wave calculation of photonic band structures in face-centered-cubic dielectric media, Phys. Rev. Lett., 65 (1990), 2646–2647.
25. M. Plihal and A. A. Maradudin, Photonic band structure of two-dimensional systems: The triangular lattice, Phys. Rev. B, 44 (1991), 8565–8571.
26. P. R. Villeneuve and M. Piché, Photonic bandgaps in two-dimensional square lattices: Square and circular rods, Phys. Rev. B, 46 (1992), 4973–4975.
27. Z. Zhang and S. Satpathy, Electromagnetic wave propagation in periodic structures: Bloch wave solution of Maxwell's equations, Phys. Rev. Lett., 65 (1990), 2650–2653.
28. H. S. Sözüer, J. W. Haus, and R. Inguva, Photonic bands: Convergence problems with the plane-wave method, Phys. Rev. B, 45 (1992), 13962–13972.
29. L. Shen, S. He, and S. Xiao, A finite-difference eigenvalue algorithm for calculating the band structure of a photonic crystal, Comput. Phys. Comm., 143 (2002), 213–221.
30. R. D. Meade, A. M. Rappe, K. D. Brommer, J. D. Joannopoulos, and O. L. Alerhand, Accurate theoretical analysis of photonic band-gap materials, Phys. Rev. B, 48 (1993), 8434–8437.
31. C. T. Chan, Q. L. Yu, and K. M. Ho, Order-N spectral method for electromagnetic waves, Phys. Rev. B, 51 (1995), 16635–16642.
32. M. Qiu and S. He, A nonorthogonal finite-difference time-domain method for computing the band structure of a two-dimensional photonic crystal with dielectric and metallic inclusions, J. Appl. Phys., 87 (2000), 8268–8275.
33. A. J. Ward and J. B. Pendry, Calculating photonic Green's functions using a nonorthogonal finite-difference time-domain method, Phys. Rev. B, 58 (1998), 7252–7259.
34. K. M. Leung and Y. Qiu, Multiple-scattering calculation of the two-dimensional photonic band structure, Phys. Rev. B, 48 (1993), 7767–7771.
35. X. Wang, X. G. Zhang, Q. Yu, and B. N. Harmon, Multiple-scattering theory for electromagnetic waves, Phys. Rev. B, 47 (1993), 4161–4167.
36. J. B. Pendry and A. MacKinnon, Calculation of photon dispersion relations, Phys. Rev. Lett., 69 (1992), 2772–2775.
37. W. Axmann and P. Kuchment, An efficient finite element method for computing spectra of photonic and acoustic band-gap materials: I. Scalar case, J. Comput. Phys., 150 (1999), 468–481.
38. D. C. Dobson, An efficient method for band structure calculations in 2D photonic crystals, J. Comput. Phys., 149 (1999), 363–376.
39. C. Mias, J. P. Webb, and R. L. Ferrari, Finite element modelling of electromagnetic waves in doubly and triply periodic structures, IEE Proc.-Optoelectron., 146(2) (1999), 111–118.
40. S. Li and W. K. Liu, Meshfree and particle methods and their applications, Appl. Mech. Rev., 55 (2002), 1–34.
41. I. Babuška, U. Banerjee, and J. E. Osborn, Survey of meshless and generalized finite element methods: A unified approach, Acta Numerica, 12 (2003), 1–125.
42. S. Li and W. K. Liu, Meshfree Particle Methods, Springer, New York, 2004.
43. T. Belytschko, Y. Krongauz, D. Organ, M. Fleming, and P. Krysl, Meshless methods: an overview and recent developments, Comput. Meth. Appl. Mech. Eng., 139 (1996), 3–47.
44. W. K. Liu, S. Jun, and Y. F. Zhang, Reproducing kernel particle methods, Int. J. Numer. Meth. Fluids, 20 (1995), 1081–1106.
45. W. K. Liu, Y. Chen, S. Jun, J. S. Chen, T. Belytschko, C. Pan, and C. T. Chang, Overview and applications of the reproducing kernel particle methods, Arch. Comput. Meth. Eng. State Art Rev., 3 (1996), 3–80.
46. W. K. Liu, Y. Chen, C. T. Chang, and T. Belytschko, Advances in multiple scale kernel particle methods, Comput. Mech., 18(2) (1996), 73–111.
47. W. K. Liu, Y. Chen, R. A. Uras, and C. T. Chang, Generalized multiple scale reproducing kernel particle methods, Comput. Meth. Appl. Mech. Eng., 139 (1996), 91–158.
48. W. K. Liu, S. Li, and T. Belytschko, Moving least square reproducing kernel method part I: Methodology and convergence, Comput. Meth. Appl. Mech. Eng., 143 (1997), 113–154.
49. W. K. Liu, R. A. Uras, and Y. Chen, Enrichment of the Finite Element Method with the Reproducing Kernel Particle Method, ASME J. Appl. Mech., 64 (1997), 861–870.
50. W. K. Liu and S. Jun, Multiple Scale Reproducing Kernel Particle Methods for Large Deformation Problems, Int. J. Numer. Meth. Eng., 41 (1998), 1339–1362.
51. T. Belytschko, Y. Y. Lu, and L. Gu, Element-free Galerkin method, Comput. Meth. Appl. Mech. Eng., 113 (1994), 397–414.
52. S. Jun, Y.-S. Cho, and S. Im, Moving least-square method for the band-structure calculation of 2D photonic crystals, Opt. Express, 11 (2003), 541–551.
53. S. Jun and Y.-S. Cho, Deformation-induced bandgap tuning of 2D silicon-based photonic crystals, Opt. Express, 11 (2003), 2769–2774.
54. S. Jun, Meshfree implementation for the real-space electronic-structure calculations of crystalline solids, Int. J. Numer. Meth. Eng., 59 (2004), 1909–1923.
55. C. Kittel, Introduction to Solid State Physics, John Wiley & Sons, New York, 1986.
56. D. W. Kim and Y. Kim, Point collocation method using the fast moving least-square reproducing kernel approximation, Int. J. Numer. Meth. Eng., 56 (2003), 1445–1464.
57. M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids, Oxford University Press, New York, 1989.
58. D. Frenkel and B. Smit, Understanding Molecular Simulation: From Algorithms to Applications, Academic Press, San Diego, 2002.
59. E. Hernández, M. J. Gillan, and C. M. Goringe, Basis functions for linear-scaling first-principles calculations, Phys. Rev. B, 55 (1997), 13486–13493.
60. H. T. Johnson and L. B. Freund, The influence of strain on confined electronic states in semiconductor quantum structures, Int. J. Solids Struct., 38 (2001), 1045–1062.
61. H. T. Johnson, L. B. Freund, C. D. Akyüz, and A. Zaslavsky, Finite element analysis of strain effects on electronic and transport properties in quantum dots and wires, J. Appl. Phys., 84 (1998), 3714–3725.
62. R. Phillips, Crystals, Defects and Microstructures: Modeling Across Scales, Cambridge University Press, Cambridge, 2001.
63. M. L. Cohen and T. K. Bergstresser, Band structures and pseudopotential form factors for fourteen semiconductors of the diamond and zinc-blende structures, Phys. Rev., 141 (1966), 789–796.
64. P. Hohenberg and W. Kohn, Inhomogeneous electron gas, Phys. Rev., 136 (1964), B864–B871.
65. W. Kohn and L. J. Sham, Self-consistent equations including exchange and correlation effects, Phys. Rev., 140 (1965), B1133–B1138.
66. J. M. Thijssen, Computational Physics, Cambridge University Press, Cambridge, 2001.
67. S. Gonzalez, D. Vasileska, and A. A. Demkov, Empirical pseudopotential method for the band structure calculation of strained-silicon germanium materials, J. Comput. Electronics, 1 (2002), 179–183.
68. M. M. Rieger and P. Vogl, Electronic-band parameters in strained Si1−xGex alloys on Si1−yGey substrates, Phys. Rev. B, 48 (1993), 14276–14287.
69. L. W. Cordes and B. Moran, Treatment of material discontinuity in the element-free Galerkin method, Comput. Meth. Appl. Mech. Eng., 139 (1996), 75–89.
70. D. W. Kim, Y.-C. Yoon, W. K. Liu, and T. Belytschko, Extrinsic meshfree approximation using asymptotic expansion for interfacial discontinuity of derivative, J. Comput. Phys., (2006), doi:10.1016/j.jcp.2006.06.023.
71. R. D. Meade, K. D. Brommer, A. M. Rappe, and J. D. Joannopoulos, Existence of a photonic band gap in two dimensions, Appl. Phys. Lett., 61 (1992), 495–497.
72. K. Busch and S. John, Liquid-crystal photonic-band-gap materials: the tunable electromagnetic vacuum, Phys. Rev. Lett., 83 (1999), 967–970.
73. Y. Shimoda, M. Ozaki, and K. Yoshino, Electric field tuning of a stop band in a reflection spectrum of synthetic opal infiltrated with nematic liquid crystal, Appl. Phys. Lett., 79 (2001), 3627–3629.
74. K. Yoshino, Y. Shimoda, Y. Kawagishi, K. Nakayama, and M. Ozaki, Temperature tuning of the stop band in transmission spectra of liquid-crystal infiltrated synthetic opal as tunable photonic crystal, Phys. Rev. Lett., 75 (1999), 932–934.
75. J. Zhou, C. Q. Sun, K. Pita, Y. L. Lam, Y. Zhou, S. L. Ng, C. H. Kam, L. T. Li, and Z. L. Gui, Thermally tuning of the photonic bandgap of SiO2 colloid-crystal infilled with ferroelectric BaTiO3, Appl. Phys. Lett., 78 (2001), 661–663.
76. Y. Saado, M. Golosovsky, D. Davidov, and A. Frenkel, Tunable photonic bandgap in self-assembled clusters of floating magnetic particles, Phys. Rev. B, 66 (2002), 195108–195113.
77. S. W. Leonard, J. P. Mondia, H. M. van Driel, O. Toader, S. John, K. Busch, A. Birner, U. Gösele, and V. Lehmann, Tunable two-dimensional photonic crystals using liquid-crystal infiltration, Phys. Rev. B, 61 (2000), R2389–R2392.
78. H. Pier, E. Kapon, and M. Moser, Strain effects and phase transitions in photonic resonator crystals, Nature, 407 (2000), 880–883.
79. S. Noda, M. Yokoyama, M. Imada, A. Chutinan, and M. Mochizuki, Polarization mode control of two-dimensional photonic crystal laser by unit cell structure design, Science, 293 (2001), 1123–1125.
80. S. Kim and V. Gopalan, Strain-tunable photonic bandgap crystals, Appl. Phys. Lett., 78 (2001), 3015–3017.
81. P. A. Bermel and M. Warner, Photonic band structure of highly deformable self-assembling systems, Phys. Rev. E, 65 (2002), 10702–10705.
82. V. Babin, P. Garstecki, and R. Holyst, Photonic properties of an inverted face centered cubic opal under stretch and shear, Appl. Phys. Lett., 82 (2003), 1553–1555.
83. J. M. Gere and S. P. Timoshenko, Mechanics of Materials, PWS Publishing Company, Boston, 1997.
84. M. Huang, Stress effects on the performance of optical waveguides, Int. J. Solids Struct., 40 (2003), 1615–1632.
CHAPTER FIVE
Modelling Ziegler–Natta Polymerization in High Pressure Reactors
Antonio Fasano∗, K. Kannan∗∗, Alberto Mancini∗, K. R. Rajagopal∗∗
Contents
5.1 Introduction
Modelling the Growth of the Agglomerate (Macroscale)
5.2 Governing Equations
5.2.1 Constancy of porosity
5.2.2 Density of microspheres
5.2.3 Liquid monomer balance
5.2.4 Solid mass balance
5.2.5 Relating the agglomerate expansion to microspheres growth
5.2.6 Liquid flow
5.2.7 Energy balance
5.3 Initial and Boundary Conditions
Modelling the Growth of Microspheres
5.4 Kinematics
5.5 The Governing Equations
5.5.1 Mass balance
5.5.2 Momentum balance
5.5.3 Constitutive equations
5.6 Initial and Boundary Conditions
5.7 Analysis of the Equations in the Microscale with Spherical Symmetry
5.8 Consistency of the Boundary Conditions
Bridging the Two Scales. The Complete Model
5.9 Determining the Free Terms in the Macroscopic Transport Equations
5.10 The Complete Model
5.10.1 Macroscale
5.10.2 Microscale
5.11 Not Evolving Natural Configuration
5.12 Conclusions
Abstract
The Ziegler–Natta process, in its modern version using spherical catalyst supports in high pressure reactors, is the most efficient and economic way of producing polymers of ubiquitous use such as polypropylene and polyethylene. Using the ideas of Ref. [1] as a starting point, we formulate a mathematical model encompassing for the first time the mechanical behaviour of the polymer while it grows around the nanometric catalytic particles. As a theoretical basis we use the theory of Ref. [2]. This study opens the way to a long-term project aimed at producing a code for simulating the entire process.
Key Words: Polymerisation, free boundary problems, mechanics of growth, reaction-diffusion
∗ Dipartimento di Matematica "U. Dini", Università degli Studi di Firenze, Italy
∗∗ Department of Biomedical Engineering, Texas A&M University, TAMU, Texas, USA
In this chapter we present an original study of some unexplored mechanical aspects of the polymerization process known today as the Ziegler–Natta process, after the Nobel laureates (1963) Karl W. Ziegler and Giulio Natta. Polypropylene and polyethylene are the polymers most widely used in innumerable applications. They are obtained from the corresponding monomers (propylene and ethylene) by means of a catalyst-triggered reaction in which pairs of monomers share a chemical bond, thus creating a long chain which may contain thousands of elements (see Figures 5.1 and 5.2).
Figure 5.1 Polypropylene. [Reaction scheme: propylene is converted to polypropylene by Ziegler–Natta polymerization or metallocene catalysis.]
Figure 5.2 Polyethylene. [Reaction scheme: ethylene is converted to polyethylene; ethylene and 4-methyl-1-pentene are converted by Ziegler–Natta polymerization to poly(ethylene-co-4-methyl-1-pentene) (BP's Innovex®, a form of LLDPE).] (See also colour plate 4.)
5.1 Introduction
The most efficient and economic way of producing polyethylene and polypropylene (and some kinds of copolymers) is based on the fundamental ideas of K. W. Ziegler and G. Natta (Figure 5.3) which, through a series of spectacular improvements over several decades, led to the modern process that uses spherical aggregates of catalytic particles. The final product is in the form of spherical pellets with diameters in the range 2–4 mm, which are ready to be commercialized, avoiding the costly procedure of heating, extruding and cutting that is needed to make pellets from polymeric material produced in more traditional ways. For an overview of how the process was developed we refer the reader to Ref. [3]. An aggregate consists of an enormous number of catalytic particles tightly packed in a sphere of 70 μm radius. Each catalytic particle is approximately 5 nm in diameter (thus the number of particles in the agglomerate is of the order of 10^11).
Figure 5.3 Fundamental ideas of G. Natta.
Figure 5.4 Effect of prepolymerization. [Panels: with prepolymerization; without prepolymerization.] (See also colour plate 5.)
Of course the chemistry of the catalyst is at the heart of the problem, but we are not going to deal with it, since our goal is limited to describing the growth of the agglomerate during the process, once some basic information is known about the rate of the polymerization reaction. The agglomerate is first exposed to the monomer (propylene or ethylene) at a temperature low enough that the polymerization rate does not damage the agglomerate. If this first stage is run correctly, spherical symmetry is conserved throughout the process; otherwise the final product is not usable (see Figure 5.4). During this stage a thin layer of polymer is produced around each catalytic element, creating the porosity which is absolutely necessary to allow the monomer to permeate the agglomerate. This early stage is called fragmentation (see Ref. [4]). The way the monomer penetrates the agglomerate depends on its state of aggregation. If the monomer is gaseous then diffusion may be considered the main transport mechanism (i.e. the driving force is the concentration gradient). In high pressure reactors the monomer is liquid at the operating temperature and its flow is driven by the pressure gradient (Darcy's law). In either case from the
permeated agglomerate the monomer flows through the already formed polymer shells to reach the surface of the catalytic particles, where new monomer molecules are continuously incorporated into polymer chains. In practice the process following fragmentation takes place in two stages at different temperatures. The first stage (prepolymerization) is carried out at a temperature lower than that of the last (and main) stage, again with the aim of stabilizing the shape and improving efficiency. Our model starts soon after the completion of fragmentation and can be applied sequentially to the following stages, just by changing the reactor parameters at the transition time. The first mathematical model proposed (with reference to the low pressure reactor, since the high pressure process is much more recent) was the so-called multigrain model (see Ref. [5]), which however suffers from serious geometrical inconsistencies. An approach closer to reality (still for the case of the gaseous monomer) was proposed in Ref. [1] and further developed in a series of papers (see e.g. the survey Ref. [6]). The final formulation, based on a suitable rescaling, was presented in Ref. [7], and numerical simulations performed in Ref. [8], using the data of Ref. [9], revealed that the model gave quite a sensible description of the process, in spite of some substantial simplifications that had been introduced and that will be discussed later. In particular, the simulations provided a clear indication of the optimal duration of the process and interesting information about the internal structure of the agglomerate at the switching-off time, which can be associated with the quality of the final product. The main idea underlying the model was to exploit the large difference in size between the agglomerate and the microspheres composing it. Accordingly, two spatial scales were introduced: the macroscale (agglomerate scale) and the microscale (at the level of the microspheres).
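For reference, the two transport mechanisms mentioned above (diffusion for a gaseous monomer, pressure-driven flow for a liquid monomer) can be written in their standard textbook forms. The symbols D (diffusivity), k (permeability), μ (monomer viscosity), J (diffusive flux) and q (volumetric flux) are notation assumed here for illustration and are not taken from this chapter; c and p denote the monomer concentration and pressure.

```latex
% Gaseous monomer: Fick's law, flux driven by the concentration gradient
\mathbf{J} = -D \, \nabla c
% Liquid monomer (high pressure reactor): Darcy's law, flux driven by the pressure gradient
\mathbf{q} = -\frac{k}{\mu} \, \nabla p
```

In both cases the flux points from high to low values of the driving quantity; only the driving quantity and the material coefficient differ.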
On the large scale the inner structure is described by the density of catalytic sites $\rho$ and by the porosity $\varepsilon_A$ (the volume fraction available to the monomer). One more piece of information is the spatial distribution of the size of the microspheres within the aggregate, measured by their radius $R$ (as a function of time $t$ and of the radial coordinate $r_A$ in the agglomerate). Other quantities to be determined on the large scale are the agglomerate radius $R_A(t)$, a quantity connected with the monomer flow (concentration $c_A$ in the gaseous case, pressure $p_A$ in the high pressure reactor), and the temperature $\theta_A$ (the polymerization reaction is strongly exothermic). Conversion of monomer into polymer occurs at the microscale, and in our scheme it is seen on the larger scale as a monomer sink. At the same time the heat released during the reaction is schematized as a distributed source. The radial expansion velocity of the agglomerate, $\vec{V}_{sA}$, is the cumulative result of the growth of the polymeric microspheres and is therefore strongly linked, like all other macroscopic quantities, to the evolution of the microscopic quantities. Besides the microsphere radius $R$, the latter are the same as on the large scale, namely the temperature $\theta$ and either the monomer concentration $c$ or the pressure $p$, together with geometrical and kinematical unknowns such as the porosity $\varepsilon$ and the radial
Modelling Ziegler–Natta Polymerization in High Pressure Reactors
expansion velocity $\vec{V}_s$, both depending on the “microradial” coordinate $r$ and time $t$. The polymer production rate at the catalyst surface depends on temperature and on mechanical quantities. All quantities on the microscale depend also on the “macroradial” coordinate $r_A$.

Two basic assumptions were introduced in Ref. [1], which have been maintained in all subsequent papers and which greatly simplify the model: (i) the porosity of the agglomerate $\varepsilon_A$ is constant and uniform; (ii) the porosity of the microspheres $\varepsilon$ is constant and uniform. While we could produce a convincing justification for (i), the second assumption had no physical basis and has to be considered a rather crude approximation, whose advantage is to eliminate all the difficulties connected with the mechanics of the growing polymer shells. In Ref. [10] a first naive attempt was made to formulate a model dropping assumption (ii). Although that attempt was confined to a toy problem in one space dimension, the substantial difficulties inherent in the determination of the polymer matrix porosity were already evident.

As a matter of fact, the main motivation of the present paper was to reconsider this aspect of the model, introducing a self-consistent description of the mechanics of the evolving microspheres. This results in quite a substantial expansion of the model, which treats each microsphere as a mixture of the liquid monomer (we restrict ourselves to the more modern high pressure process) and of the solid polymer. A peculiar feature of such a mechanical system is the conversion of monomer into polymer at the catalytic surface, which requires the specification of the stress state at which the new polymer is formed. As we shall see, irrespective of the choice of the constitutive equation for the polymer, the stress state of the newly formed polymer is one of the new unknowns we have to introduce, and it is related to the actual state of the whole microsphere, as well as of part of the agglomerate.
Thus we are dealing with a problem which is considerably more complicated than the one formulated and studied in Refs. [1, 7]. Indeed, not only do we incorporate the full mechanics of the microspheres but, as we will point out, dropping assumption (ii) on the microscale eventually has a deep effect also on the way balance laws have to be written on the macroscale. We believe that the material presented here represents real progress in the search for a comprehensive mathematical model for the Ziegler–Natta process. Nevertheless the final goal of providing quantitative predictions is still far off, because the numerical values of the parameters introduced in the constitutive equations of the growing polymer are not known and are unlikely to become available in the near future. Only numerical computation and data fitting can give substantial indications. Our work also needs further study to establish well posedness. Indeed the model formulated here presents great mathematical difficulties, and for the moment we have confined ourselves to checking that the differential equations and the boundary conditions, as they seem to be dictated by physics, at least do not contradict what is known from the general theory of hyperbolic systems. In other words this is the first stage of a long term project, which looks very stimulating and promising.
The plan of the chapter is the following:

1. Modelling the growth of the agglomerate. Here the partial differential equations governing the evolution of the agglomerate are derived (Section 5.2), together with the required initial and boundary conditions (Section 5.3).

2. Modelling the growth of microspheres. This is the core of our analysis, because the difficult question of the mechanical behaviour of the growing polymer shells is addressed. In Section 5.5 we write the mass and momentum balance laws along with the constitutive equations. The latter are formulated in the framework of the general theory of Ref. [2]. Section 5.6 deals with the delicate question of selecting the correct boundary conditions. Sections 5.7 and 5.8 have a more mathematical character, investigating the nature of the differential system governing the growth of the microspheres and the consistency of the boundary conditions.

3. Bridging the two scales. Here the coupling terms are calculated (heat source and monomer sink at the macroscopic scale, Section 5.9) and the complete model in spherical symmetry is summarized (Section 5.10). In the final section a simplified version is presented under the assumption that the polymer in the microsphere can be considered purely elastic.
MODELLING THE GROWTH OF THE AGGLOMERATE (MACROSCALE)

5.2 Governing Equations
In writing the set of equations driving the evolution of the agglomerate we have to make use of the quantities evolving at the microscale. This mixing is unavoidable because of the strong link between the two scales. Therefore we will be able to complete the formulation of the problem on the macroscale only after we have considered the problem on the microscale. We start with the justification of assumption (i) on the agglomerate porosity $\varepsilon_A$.
5.2.1 Constancy of porosity
It is well known (see Ref. [11]), and in any case easy to calculate, that an ideal porous medium made of equal spheres laid in the so-called rhombohedral arrangement has a porosity (0.25) which is independent of the radius of the “grains”. The final polymer pellet looks so compact that it is natural to suppose that in the expanding agglomerate the microspheres locally keep a configuration of maximal packing at all times (i.e. the rhombohedral one, see Ref. [12]). This situation, although ideal, is of course different from that of the ideally packed medium described above, because the spheres making up the agglomerate are not of equal size. The reason is very simple: the spheres close to the outer surface have been exposed to monomer for a longer time and at a higher pressure than the ones close to the centre, and therefore they have grown larger. However, if we consider a few adjacent spherical layers of microspheres (see Figure 5.5), their history is so similar that locally they form a maximum-packing arrangement at all times. Therefore, despite the dependence of the sphere radius $R$ on the Eulerian radial coordinate $r_A$ in the agglomerate, the assumption $\varepsilon_A \simeq 0.25$ seems very well justified.

[Figure 5.5: Porosity $\varepsilon$ constant. Ideal packing of the growing microspheres: neighbouring spheres have similar histories and approximately the same radius. (See also colour plate 6.)]
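The radius independence of the porosity can be checked with a few lines of arithmetic. The sketch below is only illustrative and rests on an assumption not stated in the text: the rhombohedral arrangement is identified with the closest packing of equal spheres, described through a face-centred cubic cell of edge $2\sqrt{2}\,R$ containing four spheres. Under that assumption the porosity is $1 - \pi/(3\sqrt{2}) \approx 0.26$, close to the rounded value 0.25 used above. The function name is hypothetical.

```python
import math


def rhombohedral_porosity(radius: float) -> float:
    """Void fraction of equal spheres in closest packing.

    Assumption (not from the text): the packing is described by a
    face-centred cubic cell of edge 2*sqrt(2)*radius holding 4 spheres.
    """
    cell_edge = 2.0 * math.sqrt(2.0) * radius
    cell_volume = cell_edge ** 3
    solid_volume = 4.0 * (4.0 / 3.0) * math.pi * radius ** 3
    return 1.0 - solid_volume / cell_volume


# The radius cancels out, so every grain size gives the same porosity.
for radius in (1e-6, 1e-5, 1e-4):
    print(f"R = {radius:.0e}  porosity = {rhombohedral_porosity(radius):.4f}")
```

The cubic dependence on the radius appears in both the cell volume and the solid volume, which is why the ratio, and hence $\varepsilon_A$, does not change as the microspheres grow.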
5.2.2 Density of microspheres
If $\rho(r_A, t)$ denotes such a quantity, then we obviously have
$$\frac{4}{3}\pi R^3 \rho = (1 - \varepsilon_A), \tag{5.1}$$
which relates $\rho$ to the microstructure.
5.2.3 Liquid monomer balance
If $\rho_f$ is the density of the liquid monomer (which we take as constant, neglecting its variation in the temperature range within the agglomerate) and $\vec{V}_{fA} = u_A \vec{e}_r$ is the fluid velocity, we write
$$\varepsilon_A\, \mathrm{div}(\rho_f u_A \vec{e}_r) = -M, \tag{5.2}$$
where M > 0 represents the rate at which the monomer is absorbed by the microspheres. Obviously M can be found only by solving the problem at the microscale and its expression will be given in Section 5.9. The fact that εA = constant has been used in (5.2).
5.2.4 Solid mass balance
Let $\vec{V}_{sA} = v_A \vec{e}_r$ denote the velocity of the solid component during the expansion of the agglomerate. Then we have
$$(1 - \varepsilon_A)\frac{\partial \rho_{sA}}{\partial t} + (1 - \varepsilon_A)\, \mathrm{div}(\rho_{sA} v_A \vec{e}_r) = M, \tag{5.3}$$
where $(1 - \varepsilon_A)\rho_{sA}$ is the solid mass per unit volume of the agglomerate. The latter quantity can be calculated as follows:
$$(1 - \varepsilon_A)\rho_{sA} = \left[\frac{4}{3}\pi r_0^3 \rho_{cat} + \int_{r_0}^{s} (1 - \varepsilon)\rho_s\, 4\pi r^2\, dr\right]\rho, \tag{5.4}$$
where $\rho_{cat}$ is the density of the catalytic particle sitting in the centre of each microsphere, $r_0$ is its radius, $\rho_s$ is the density of the pure polymer and $\varepsilon$ is the unknown porosity of the microspheres, which depends on $r$, $t$ and on $r_A$ (here $s$ denotes the outer radius of the microsphere). Since the contribution of the catalyst mass soon becomes negligible, we realize that if $\varepsilon$ were taken space independent then, thanks to (5.1), it would be $(1 - \varepsilon_A)\rho_{sA} \simeq (1 - \varepsilon)\frac{4}{3}\pi s^3 \rho \rho_s = (1 - \varepsilon_A)\rho_s$, i.e. $\rho_{sA} = \rho_s$. Thus the definition of $\rho_{sA}$ is deeply affected by the variability of $\varepsilon$ within microspheres.
5.2.5 Relating the agglomerate expansion to microspheres growth
The following equation:
$$\mathrm{div}\, \vec{V}_{sA} = 3\frac{\dot{R}}{R}, \tag{5.5}$$
where
$$\dot{R} = \frac{\partial R}{\partial t} + v_A \frac{\partial R}{\partial r_A},$$
is a consequence of the following simple argument. Since $\varepsilon_A$ is constant, the “grains” and “pores” in the agglomerate must vary at the same rate. So $\mathrm{div}\, \vec{V}_{sA}$ represents the specific volume increase rate both of the whole agglomerate and of the microspheres. The latter is precisely
$$\frac{4\pi R^2 \dot{R}}{(4/3)\pi R^3},$$
which gives (5.5). Owing to spherical symmetry, equation (5.5) is all we need to retrieve the expansion velocity from the microscopic evolution of the system:
$$r_A^2\, v_A(r_A, t) = 3\int_0^{r_A} \xi^2\, \frac{\dot{R}(t; \xi)}{R(t; \xi)}\, d\xi, \tag{5.6}$$
where we have used the obvious condition
$$v_A(0, t) = 0, \tag{5.7}$$
following the natural choice of the reference frame.
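Formula (5.6) lends itself to a direct numerical evaluation. The following is a minimal sketch, assuming that the relative growth rate $\dot{R}/R$ is already known on a radial grid starting at $r_A = 0$; the function name and the trapezoidal quadrature are illustrative choices, not part of the model. For a spatially uniform growth rate $g$ the formula reduces to $v_A = g\, r_A$, which gives a convenient check.

```python
import numpy as np


def expansion_velocity(r_a: np.ndarray, growth_rate: np.ndarray) -> np.ndarray:
    """Evaluate v_A from (5.6): r_A^2 v_A = 3 * int_0^{r_A} xi^2 (Rdot/R) dxi.

    r_a         : increasing radial grid with r_a[0] == 0 (illustrative input)
    growth_rate : values of Rdot/R on that grid
    """
    integrand = r_a ** 2 * growth_rate
    # Cumulative trapezoidal rule for the integral from 0 to each grid node.
    increments = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r_a)
    integral = np.concatenate(([0.0], np.cumsum(increments)))
    velocity = np.zeros_like(r_a, dtype=float)  # v_A(0, t) = 0, condition (5.7)
    velocity[1:] = 3.0 * integral[1:] / r_a[1:] ** 2
    return velocity


# Uniform relative growth rate g: the exact answer is v_A = g * r_A.
grid = np.linspace(0.0, 1.0, 2001)
g = 0.5
v_a = expansion_velocity(grid, np.full_like(grid, g))
print(float(np.max(np.abs(v_a - g * grid))))
```

The largest discrepancy sits at the first few nodes next to the centre, where the trapezoidal rule resolves $\xi^2$ poorly; it shrinks with the grid spacing.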
5.2.6 Liquid flow
The assumption that the microspheres keep maximal packing during the entire process allowed us to reduce the problem of the agglomerate expansion to a purely kinematic one, bypassing the need to investigate the mechanical interactions among the microspheres. A consistent level of simplicity should be adopted in describing the flow of the liquid monomer through the agglomerate. Darcy’s law looks an ideal choice:
$$\varepsilon_A(\vec{V}_{fA} - \vec{V}_{sA}) = -K \nabla p, \tag{5.8}$$
where $K$ is the hydraulic conductivity of the agglomerate, taken constant, and $p$ is the liquid pressure. To the relative velocity between the two components we can associate the interaction force $\vec{I}_A$ exerted by the fluid on the solid per unit volume, a quantity that will play an important role in the microscale mechanics. A natural way of defining $\vec{I}_A$ is
$$\vec{I}_A = \varepsilon_A(1 - \varepsilon_A)\kappa_A(\vec{V}_{fA} - \vec{V}_{sA}), \tag{5.9}$$
where $\kappa_A$ is a positive constant. By taking the divergence of (5.8) and using (5.2), (5.5) we arrive at the equation:
$$\Delta p = \frac{1}{r_A^2}\frac{\partial}{\partial r_A}\left(r_A^2 \frac{\partial p}{\partial r_A}\right) = \frac{1}{K}\left[\frac{M}{\rho_f} + 3\varepsilon_A \frac{\dot{R}}{R}\right]. \tag{5.10}$$
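To get a feeling for equation (5.10), it can be solved in closed form in a deliberately simplified situation. The sketch below assumes that $M$ and $\dot{R}/R$ are uniform in the agglomerate (an assumption made only for illustration, since in the model both come from the microscale problem), that the pressure gradient vanishes at the centre and that the pressure equals the reactor value on $r_A = R_A$. With a constant right-hand side $S$ the solution is $p = p_{ext} - S(R_A^2 - r_A^2)/6$. All function names and numerical values are hypothetical and dimensionless.

```python
import numpy as np


def darcy_source(sink_m, rho_f, eps_a, growth_rate, conductivity):
    """Right-hand side of (5.10): (M / rho_f + 3 eps_A Rdot/R) / K."""
    return (sink_m / rho_f + 3.0 * eps_a * growth_rate) / conductivity


def pressure_profile(r, radius_a, p_ext, source):
    """Solution of (5.10) for a uniform source, with dp/dr = 0 at r = 0
    and p = p_ext at r = radius_a."""
    return p_ext - source * (radius_a ** 2 - r ** 2) / 6.0


def radial_laplacian(p, r):
    """Conservative central difference of (1/r^2) d/dr (r^2 dp/dr)
    at the interior nodes of a uniform grid."""
    h = r[1] - r[0]
    r_mid = r[1:-1]
    flux_plus = (r_mid + 0.5 * h) ** 2 * (p[2:] - p[1:-1]) / h
    flux_minus = (r_mid - 0.5 * h) ** 2 * (p[1:-1] - p[:-2]) / h
    return (flux_plus - flux_minus) / (h * r_mid ** 2)


# Illustrative dimensionless values, not taken from the text.
source = darcy_source(sink_m=2.0, rho_f=1.0, eps_a=0.25,
                      growth_rate=0.4, conductivity=0.5)
r = np.linspace(0.0, 1.0, 1001)
p = pressure_profile(r, radius_a=1.0, p_ext=10.0, source=source)
residual = radial_laplacian(p, r) - source
print(float(np.max(np.abs(residual[r[1:-1] > 0.1]))))
```

Since the source is positive, the pressure is lowest at the centre of the agglomerate, as expected for a monomer that is consumed while it flows inwards.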
5.2.7 Energy balance
We define $C_f$, $k_f$ as the heat capacity and thermal conductivity of the pure (liquid) monomer and $\hat{C}_s$, $\hat{k}_s$ as the heat capacity and thermal conductivity of the microspheres. Thus we should write
$$\hat{C}_s(r_A, t) = \frac{1}{(4/3)\pi R(t; r_A)^3}\left\{\frac{4}{3}\pi r_0^3 \rho_{cat} C_{cat} + \int_{r_0}^{R(t; r_A)} 4\pi\left[(1 - \varepsilon(\hat{r}, t; r_A))\rho_s C_s + \varepsilon(\hat{r}, t; r_A)\rho_f C_f\right]\hat{r}^2\, d\hat{r}\right\}, \tag{5.11}$$
and
$$\hat{k}_s(r_A, t) = \frac{1}{(4/3)\pi R(t; r_A)^3}\left\{\frac{4}{3}\pi r_0^3 k_{cat} + \int_{r_0}^{R(t; r_A)} 4\pi\left[(1 - \varepsilon(\hat{r}, t; r_A))k_s + \varepsilon(\hat{r}, t; r_A)k_f\right]\hat{r}^2\, d\hat{r}\right\}, \tag{5.12}$$
where $C_{cat}$, $k_{cat}$ refer to the catalytic particle and $C_s$, $k_s$ to the pure polymer. Then we can write the energy balance equation as
$$[\varepsilon_A \rho_f C_f + (1 - \varepsilon_A)\hat{C}_s]\frac{\partial \theta_A}{\partial t} - \mathrm{div}\{[\varepsilon_A k_f + (1 - \varepsilon_A)\hat{k}_s]\nabla\theta_A\} + [\varepsilon_A \rho_f C_f \vec{V}_{fA} + (1 - \varepsilon_A)\hat{C}_s \vec{V}_{sA}]\cdot\nabla\theta_A = H, \tag{5.13}$$
where $H$ represents the amount of heat released to the aggregate per unit volume and unit time due to polymerization occurring at the microscale, and we are assuming thermal equilibrium of the components. Of course $H$ will be specified after the discussion of the problem at the microscale (see Section 5.9).
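The averaged coefficient (5.11) is a volume average over one microsphere and can be evaluated by simple quadrature once a porosity profile is available. The sketch below is illustrative only: the function names, the trapezoidal rule and all numerical values are assumptions, and the porosity profile is taken uniform so that the result can be compared with the closed-form value $[r_0^3\rho_{cat}C_{cat} + ((1-\varepsilon)\rho_s C_s + \varepsilon\rho_f C_f)(R^3 - r_0^3)]/R^3$.

```python
import numpy as np


def trapezoid(values: np.ndarray, nodes: np.ndarray) -> float:
    """Plain trapezoidal rule on a (possibly non-uniform) grid."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(nodes)))


def effective_heat_capacity(r, eps, rho_cat, c_cat, rho_s, c_s, rho_f, c_f):
    """Volume-averaged heat capacity of a microsphere, following (5.11).

    r   : radial grid from the catalyst radius r0 (r[0]) to R (r[-1])
    eps : porosity of the polymer shell sampled on that grid
    """
    r0, outer = r[0], r[-1]
    core = (4.0 / 3.0) * np.pi * r0 ** 3 * rho_cat * c_cat
    shell_density = (1.0 - eps) * rho_s * c_s + eps * rho_f * c_f
    shell = trapezoid(4.0 * np.pi * shell_density * r ** 2, r)
    return (core + shell) / ((4.0 / 3.0) * np.pi * outer ** 3)


# Illustrative dimensionless data with a uniform porosity profile.
r = np.linspace(0.1, 1.0, 2001)
eps = np.full_like(r, 0.3)
c_hat = effective_heat_capacity(r, eps, rho_cat=2.0, c_cat=1.5,
                                rho_s=1.0, c_s=2.0, rho_f=0.6, c_f=2.5)
mix = (1.0 - 0.3) * 1.0 * 2.0 + 0.3 * 0.6 * 2.5
exact = (0.1 ** 3 * 2.0 * 1.5 + mix * (1.0 - 0.1 ** 3)) / 1.0 ** 3
print(c_hat, exact)
```

The same routine applies to (5.12) after replacing the products $\rho C$ by the corresponding conductivities.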
5.3 Initial and Boundary Conditions
We set time $t = 0$ at the end of the fragmentation stage, which is very fast. This defines the initial position of the outer boundary
$$R_A(0) = R_{0A}, \tag{5.14}$$
whereas the expansion velocity is the velocity of the solid component
$$\dot{R}_A(t) = v_A(R_A(t), t). \tag{5.15}$$
For $r_A = 0$ the assumption of spherical symmetry gives the required boundary conditions, for instance
$$v_A(0, t) = 0, \tag{5.16}$$
$$\left.\frac{\partial p}{\partial r_A}\right|_{r_A = 0} = 0, \tag{5.17}$$
$$\left.\frac{\partial \theta_A}{\partial r_A}\right|_{r_A = 0} = 0, \tag{5.18}$$
and on the external boundary ($r_A = R_A(t)$) we can assume
$$p(R_A(t), t) = p_{ext}, \tag{5.19}$$
$$\theta_A(R_A(t), t) = \theta_{ext}, \tag{5.20}$$
where $p_{ext}$ and $\theta_{ext}$ are the pressure and temperature of the reactor.
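MODELLING THE GROWTH OF MICROSPHERES

5.4 Kinematics
We briefly review some elements of the kinematics of the solid–liquid mixture making up the microspheres, with the main objective of setting the notation. Let us consider a mixture of a solid infused by a liquid. Let $\vec{X}_i$, $i = s, f$ denote the material points associated with the solid and fluid, respectively, in the reference state. Both material points occupy the same position $\vec{x}$ at time $t$. The motion of each constituent is given through
$$\vec{x} = \vec{\chi}_i(\vec{X}_i, t), \quad i = s, f. \tag{5.21}$$
The velocity of these particles $\vec{X}_i$, $i = s, f$, i.e. $\vec{V}_i$, $i = s, f$, is given by (see [13])
$$\vec{V}_i = \frac{\partial \vec{\chi}_i}{\partial t}(\vec{X}_i, t), \quad i = s, f. \tag{5.22}$$
The velocity gradient associated with the $i$th constituent is defined through
$$\mathbf{L}_i = \nabla \vec{V}_i = \frac{\partial \vec{V}_i}{\partial \vec{x}}, \quad i = s, f. \tag{5.23}$$
The symmetric part of $\mathbf{L}_i$, i.e. $\mathbf{D}_i$, is
$$\mathbf{D}_i = \tfrac{1}{2}(\mathbf{L}_i + \mathbf{L}_i^T). \tag{5.24}$$
The deformation gradient and the left and right Cauchy–Green stretch tensors are given through
$$\mathbf{F}_{i\kappa_R} = \frac{\partial \vec{\chi}_i}{\partial \vec{X}_i}, \quad \mathbf{B}_{i\kappa_R} = \mathbf{F}_{i\kappa_R}\mathbf{F}_{i\kappa_R}^T \quad \text{and} \quad \mathbf{C}_{i\kappa_R} = \mathbf{F}_{i\kappa_R}^T\mathbf{F}_{i\kappa_R}. \tag{5.25}$$
The principal invariants of $\mathbf{B}_{i\kappa_R}$ are
$$I_{i\kappa_R} = \mathrm{tr}(\mathbf{B}_{i\kappa_R}), \quad II_{i\kappa_R} = \tfrac{1}{2}\{[\mathrm{tr}(\mathbf{B}_{i\kappa_R})]^2 - \mathrm{tr}(\mathbf{B}_{i\kappa_R}^2)\} \quad \text{and} \quad III_{i\kappa_R} = \det(\mathbf{B}_{i\kappa_R}). \tag{5.26}$$
From this point forward, we will focus on the kinematics related to the solid unless stated otherwise. The deformation gradient $\mathbf{F}_{s\kappa_R}$ is decomposed as follows:
$$\mathbf{F}_{s\kappa_R} = \mathbf{F}_{\kappa_{p(t)}}\mathbf{G}. \tag{5.27}$$
The mapping $\mathbf{L}_{\kappa_{p(t)}}$ is defined through
$$\mathbf{L}_{\kappa_{p(t)}} = \frac{D_s \mathbf{G}}{Dt}\mathbf{G}^{-1}, \tag{5.28}$$
where
$$\frac{D_s}{Dt}\phi(\vec{X}_s, t) := \frac{\partial}{\partial t}\phi(\vec{X}_s, t),$$
$\phi$ being any scalar valued function, and its symmetric part $\mathbf{D}_{\kappa_{p(t)}}$ is defined through
$$\mathbf{D}_{\kappa_{p(t)}} = \tfrac{1}{2}(\mathbf{L}_{\kappa_{p(t)}} + \mathbf{L}_{\kappa_{p(t)}}^T). \tag{5.29}$$
In a manner similar to that of equations (5.25) and (5.26) one can define the left and right Cauchy–Green stretch tensors, and their invariants, associated with the tensor $\mathbf{F}_{\kappa_{p(t)}}$. The upper-convected Oldroyd derivative of $\mathbf{B}_{\kappa_{p(t)}}$ (see Ref. [2]) is defined through
$$\overset{\nabla}{\mathbf{B}}_{\kappa_{p(t)}} = \frac{D_s \mathbf{B}_{\kappa_{p(t)}}}{Dt} - \mathbf{L}_s\mathbf{B}_{\kappa_{p(t)}} - \mathbf{B}_{\kappa_{p(t)}}\mathbf{L}_s^T = -2\mathbf{F}_{\kappa_{p(t)}}\mathbf{D}_{\kappa_{p(t)}}\mathbf{F}_{\kappa_{p(t)}}^T. \tag{5.30}$$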
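The definitions (5.25) and (5.26) translate directly into a few matrix operations. The following sketch, with hypothetical function names and an arbitrarily chosen deformation gradient, computes the left and right Cauchy–Green tensors and the principal invariants, and illustrates two standard facts: the two tensors share the same invariants, and the third invariant equals $(\det \mathbf{F})^2$.

```python
import numpy as np


def cauchy_green(def_grad: np.ndarray):
    """Left (B = F F^T) and right (C = F^T F) Cauchy-Green tensors, cf. (5.25)."""
    return def_grad @ def_grad.T, def_grad.T @ def_grad


def principal_invariants(tensor: np.ndarray):
    """Principal invariants I, II, III of a second-order tensor, cf. (5.26)."""
    first = np.trace(tensor)
    second = 0.5 * (np.trace(tensor) ** 2 - np.trace(tensor @ tensor))
    third = np.linalg.det(tensor)
    return first, second, third


# Arbitrary illustrative deformation gradient (stretch plus a small shear).
F = np.array([[1.2, 0.3, 0.0],
              [0.0, 0.9, 0.0],
              [0.0, 0.0, 0.9]])
B, C = cauchy_green(F)
print(principal_invariants(B))
print(principal_invariants(C))
print(np.linalg.det(F) ** 2)
```

For the spherically symmetric states considered later the tensors are diagonal, so the third invariant reduces to the product of the radial entry and the square of the hoop entry.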
5.5 The Governing Equations
As we said, the microspheres have a core (also modelled as a sphere), which is the catalyst, and a shell made of a mixture of liquid monomer and solid polymer. The latter is modelled as a compressible, homogeneous, isotropic viscoelastic solid. Polymerization takes place at the catalytic surface (radius $r_0$) and the new solid material pushes away the already formed polymer, so that the outer radius $R(t)$ increases. The deformability of the polymer and the stresses created during expansion produce a variation of the polymer volume ratio in the shell (which is permeated by the liquid monomer) both in space and time. In contrast to the problem at the macroscale, here we cannot avoid dealing with the full mechanics of the monomer–polymer mixture. Owing to the smallness of the ratio $R(t)/R_A(t)$ during the whole process, a simple rescaling argument shows that, in each microsphere, the temperature can be considered uniform and equal, at each time instant, to the temperature of the agglomerate at the location occupied by the microsphere:
$$\theta = \theta_A(r_A, t). \tag{5.31}$$
We have implicitly assumed that solid and liquid are always in thermal equilibrium within microspheres. We have to write down the equations governing the evolution of the mixture. The subscript $s$ refers to the solid (polymer) and the subscript $f$ to the liquid (monomer). Our formulation is inspired by the general theory of mixtures of Ref. [13].
5.5.1 Mass balance
$$(\varepsilon\rho_f)_t + \mathrm{div}(\varepsilon\rho_f \vec{V}_f) = 0, \tag{5.32}$$
$$[(1 - \varepsilon)\rho_s]_t + \mathrm{div}[(1 - \varepsilon)\rho_s \vec{V}_s] = 0. \tag{5.33}$$
We suppose that $\rho_s$ and $\rho_f$ are known constants (neglecting their dependence on temperature), so that they can be eliminated from the equations. The velocity fields $\vec{V}_s$, $\vec{V}_f$ are unknown. They can be taken radial: $\vec{V}_s = v\vec{e}_r$, $\vec{V}_f = u\vec{e}_r$, with $\vec{e}_r$ denoting the unit radial vector. Of course the liquid volume fraction $\varepsilon$ is also unknown.

5.5.2 Momentum balance
$$\varepsilon\rho_f (D_t \vec{V}_f) = -\vec{I} + \mathrm{div}(\mathbf{T}_f) = -\vec{I} + \mathrm{div}(\varepsilon\mathbf{T}_f), \tag{5.34}$$
$$(1 - \varepsilon)\rho_s (D_t \vec{V}_s) = \vec{I} + \mathrm{div}(\mathbf{T}_s) = \vec{I} + \mathrm{div}[(1 - \varepsilon)\mathbf{T}_s(\mathbf{B}_{\kappa_{p(t)}}, \mathbf{B}_{\kappa_R})]. \tag{5.35}$$
Here $\mathbf{T}_s$, $\mathbf{T}_f$ are the Cauchy stress tensors of the respective components and $\vec{I}$ denotes the interaction force (more precisely the action exerted by the fluid on the solid), all to be specified. The tensors $\mathbf{B}_{\kappa_{p(t)}}$, $\mathbf{B}_{\kappa_R}$ are the left Cauchy–Green tensors with respect to the natural configuration and to the reference configuration, respectively (see [14]). We shall assume that they obey the following evolution equations:
$$\overset{\nabla}{\mathbf{B}}_{\kappa_R} = \mathbf{0}, \tag{5.36}$$
$$\overset{\nabla}{\mathbf{B}}_{\kappa_{p(t)}} = \mathcal{F}(\mathbf{B}_{\kappa_{p(t)}}), \tag{5.37}$$
where $\mathcal{F}$ will be specified later and, for any tensor $\mathbf{t}$,
$$\overset{\nabla}{\mathbf{t}} = \frac{\partial \mathbf{t}}{\partial t} + \nabla\mathbf{t}\, \vec{V}_s - \nabla\vec{V}_s\, \mathbf{t} - \mathbf{t}\, (\nabla\vec{V}_s)^T$$
denotes the upper-convected Oldroyd derivative. In Section 5.6 we will return to the difficulty of defining correctly, e.g., $\mathbf{B}_{\kappa_R}$, which is connected with the fact that new material is being generated on the catalyst surface. Clearly (5.36), (5.37) need some motivation, which we are going to provide along with the specification of the various quantities not yet defined.
5.5.3 Constitutive equations
A very natural way of defining the interaction force $\vec{I}$ is
$$\vec{I} = \varepsilon(1 - \varepsilon)\kappa(\vec{V}_f - \vec{V}_s), \tag{5.38}$$
where $\kappa$ is a positive constant. This of course presumes that the interaction force is only due to the drag (see [13] for various possible interaction mechanisms). $\mathbf{T}_f$ is the usual Cauchy stress tensor for a viscous fluid of viscosity $\eta$:
$$\mathbf{T}_f = -p\mathbf{I} + \eta\mathbf{D}_f = -p\mathbf{I} + \eta(\nabla\vec{V}_f + (\nabla\vec{V}_f)^T). \tag{5.39}$$
$\mathbf{T}_s$ has a much more complicated structure, which we have to select so as to comply with the restrictions imposed by the second law of thermodynamics. To that end it is convenient to rewrite the mass balance for the solid in the Lagrangian form, namely,
$$\hat{\rho}_{os} = \hat{\rho}_s |\mathbf{F}_{s\kappa_R}|, \tag{5.40}$$
where $\hat{\rho}_{os}$ is the reference density of the solid when it is deformation free, $\hat{\rho}_s$ is the current density of the solid in the mixture state (note that $\hat{\rho}_s = (1 - \varepsilon)\rho_s$) and for any tensor $\mathbf{t}$ the symbol $|\mathbf{t}|$ stands for $\det(\mathbf{t})$. Using (5.27), the above equation becomes
$$\frac{\hat{\rho}_{os}}{|\mathbf{G}|} = \hat{\rho}_s |\mathbf{F}_{\kappa_{p(t)}}|. \tag{5.41}$$
Motivated by (5.40), we shall define the density associated with the natural configuration $\kappa_{p(t)}(B)$, i.e. $\rho_{\kappa_{p(t)}}$, to be
$$\rho_{\kappa_{p(t)}} = \hat{\rho}_s |\mathbf{F}_{\kappa_{p(t)}}|. \tag{5.42}$$
It is easy to show that
$$\frac{D_s}{Dt}\rho_{\kappa_{p(t)}} + \rho_{\kappa_{p(t)}}\, \mathrm{tr}(\mathbf{D}_{\kappa_{p(t)}}) = 0. \tag{5.43}$$
It is customary to impose the local form of the second law of thermodynamics for the mixture as a whole to obtain restrictions on the constitutive equations (see [13]). Here we require the local form of the second law of thermodynamics to be valid for each constituent, i.e.
$$\mathbf{T}_i \cdot \mathbf{D}_i - \hat{\rho}_i \frac{D_i \psi_i}{Dt} - \hat{\rho}_i \eta_i \frac{D_i \theta}{Dt} - \frac{\vec{q}_i \cdot \nabla\theta}{\theta} + \varepsilon_{si} + A_i \eta_i \theta = \hat{\rho}_i \xi_i \theta = \hat{\xi}_i \geq 0, \quad i = s, f, \tag{5.44}$$
subject to the constraint that $A_s + A_f = 0$, where $\mathbf{T}_i$, $i = s, f$ are the partial stresses of the solid and fluid, respectively. Also, $\psi_i$ and $\eta_i$ are the specific Helmholtz potential and entropy of the constituent $i$. The energy supply and the rate of entropy production of the $i$th constituent are denoted by $\varepsilon_{si}$ and $\xi_i$, respectively. The terms $A_i$ and $\varepsilon_{si}$ are zero for the problem under consideration, as there is no interconversion of constituents and it is assumed there is no energy supply. The term $\nabla\theta$ is zero as the temperature is assumed to be uniform in the microsphere. We shall further assume that the entropy production of the solid and fluid constituents is only due to dissipation. We also neglect the variation of $\theta$ with time, since in the real process the reactor temperature is fairly stable. Therefore the second law of thermodynamics reduces to
$$\mathbf{T}_i \cdot \mathbf{D}_i - \hat{\rho}_i \frac{D_i \psi_i}{Dt} = \hat{\xi}_i \geq 0. \tag{5.45}$$
We shall assume that the specific Helmholtz potential for the solid, $\psi_s$, is defined as follows (see [15]):
$$\psi_s = \frac{\mu_1}{2\rho_{\kappa_{p(t)}}}\{\mathrm{tr}(\mathbf{B}_{\kappa_{p(t)}}) - 3 - \log|\mathbf{B}_{\kappa_{p(t)}}|\} + \frac{\mu_2}{2\rho_{\kappa_{p(t)}}}\left(\sqrt{|\mathbf{B}_{\kappa_{p(t)}}|} - 1\right)^2 + \frac{\mu_3}{2\rho_0^s}\{\mathrm{tr}(\mathbf{B}_{s\kappa_R}) - 3 - \log|\mathbf{B}_{s\kappa_R}|\} + \frac{\mu_4}{2\rho_0^s}\left(\sqrt{|\mathbf{B}_{s\kappa_R}|} - 1\right)^2, \tag{5.46}$$
where $\mu_i$, $i = 1, \ldots, 4$ are nonnegative constants. The rate of dissipation associated with the solid, $\hat{\xi}_s$, is defined through
$$\hat{\xi}_s = 2\eta_1 (\mathrm{tr}\, \mathbf{D}_{\kappa_{p(t)}})^2 + 2\eta_2\, \mathrm{tr}(\mathbf{D}_{\kappa_{p(t)}}^2), \tag{5.47}$$
where $\eta_1$, $\eta_2$ are nonnegative constants. It is obvious that $\hat{\xi}_s$ is nonnegative and is zero if and only if $\mathbf{D}_{\kappa_{p(t)}} = \mathbf{0}$. The rate of dissipation has the same structure as that of a compressible Newtonian fluid. The equations (5.46) and (5.47) are substituted in (5.45).
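Thus selecting the constitutive form of the partial stress:
$$\mathbf{T}_s = \frac{\mu_1}{\sqrt{|\mathbf{B}_{\kappa_{p(t)}}|}}(\mathbf{B}_{\kappa_{p(t)}} - \mathbf{I}) + \mu_2\left(\sqrt{|\mathbf{B}_{\kappa_{p(t)}}|} - 1\right)\mathbf{I} + \frac{\mu_3}{\sqrt{|\mathbf{B}_{\kappa_R}|}}(\mathbf{B}_{\kappa_R} - \mathbf{I}) + \mu_4\left(\sqrt{|\mathbf{B}_{\kappa_R}|} - 1\right)\mathbf{I}, \tag{5.48}$$
is sufficient to ensure that the second law holds and we find:
$$\frac{\mu_1}{\sqrt{|\mathbf{B}_{\kappa_{p(t)}}|}}(\mathbf{B}_{\kappa_{p(t)}} \cdot \mathbf{D}_{\kappa_{p(t)}} - \mathrm{tr}\, \mathbf{D}_{\kappa_{p(t)}}) + \mu_2\left(\sqrt{|\mathbf{B}_{\kappa_{p(t)}}|} - 1\right)\mathrm{tr}\, \mathbf{D}_{\kappa_{p(t)}} - \frac{\mu_1}{2\sqrt{|\mathbf{B}_{\kappa_{p(t)}}|}}\, \mathrm{tr}\, \mathbf{D}_{\kappa_{p(t)}}\{\mathrm{tr}(\mathbf{B}_{\kappa_{p(t)}}) - 3 - \log|\mathbf{B}_{\kappa_{p(t)}}|\} - \frac{\mu_2}{2\sqrt{|\mathbf{B}_{\kappa_{p(t)}}|}}\, \mathrm{tr}\, \mathbf{D}_{\kappa_{p(t)}}\left(\sqrt{|\mathbf{B}_{\kappa_{p(t)}}|} - 1\right)^2 = 2\eta_1 (\mathrm{tr}\, \mathbf{D}_{\kappa_{p(t)}})^2 + 2\eta_2\, \mathrm{tr}(\mathbf{D}_{\kappa_{p(t)}}^2), \tag{5.49}$$
which can be viewed as a constraint on $\mathbf{D}_{\kappa_{p(t)}}$, i.e. only those $\mathbf{D}_{\kappa_{p(t)}}$ are possible which satisfy (5.49). The evolution equation for the natural configuration $\kappa_{p(t)}$ is determined such that the rate of dissipation is maximized subject to the constraint (5.49) (see [2]). We are now able to write the evolution equation (5.37), where
$$\mathcal{F} = -\frac{\mu_1}{\eta_2\sqrt{|\mathbf{B}_{\kappa_{p(t)}}|}}\mathbf{B}_{\kappa_{p(t)}}(\mathbf{B}_{\kappa_{p(t)}} - \mathbf{I}) - \frac{\mu_2}{\eta_2}\left(\sqrt{|\mathbf{B}_{\kappa_{p(t)}}|} - 1\right)\mathbf{B}_{\kappa_{p(t)}} + \frac{\mu_1}{2\eta_2\sqrt{|\mathbf{B}_{\kappa_{p(t)}}|}}\{\mathrm{tr}(\mathbf{B}_{\kappa_{p(t)}}) - 3 - \log|\mathbf{B}_{\kappa_{p(t)}}|\}\mathbf{B}_{\kappa_{p(t)}} + \frac{\mu_2}{2\eta_2\sqrt{|\mathbf{B}_{\kappa_{p(t)}}|}}\left(\sqrt{|\mathbf{B}_{\kappa_{p(t)}}|} - 1\right)^2\mathbf{B}_{\kappa_{p(t)}} + 2\frac{\eta_1}{\eta_2}(\mathrm{tr}\, \mathbf{D}_{\kappa_{p(t)}})\mathbf{B}_{\kappa_{p(t)}}, \tag{5.50}$$
where
$$\mathrm{tr}\, \mathbf{D}_{\kappa_{p(t)}} = \left\{\frac{\mu_1}{\sqrt{|\mathbf{B}_{\kappa_{p(t)}}|}}(\mathrm{tr}\, \mathbf{B}_{\kappa_{p(t)}} - 3) + 3\mu_2\left(\sqrt{|\mathbf{B}_{\kappa_{p(t)}}|} - 1\right) - \frac{3\mu_1}{2\sqrt{|\mathbf{B}_{\kappa_{p(t)}}|}}[\mathrm{tr}(\mathbf{B}_{\kappa_{p(t)}}) - 3 - \log|\mathbf{B}_{\kappa_{p(t)}}|] - \frac{3\mu_2}{2\sqrt{|\mathbf{B}_{\kappa_{p(t)}}|}}\left(\sqrt{|\mathbf{B}_{\kappa_{p(t)}}|} - 1\right)^2\right\}\Big/(6\eta_1 + 2\eta_2). \tag{5.51}$$
The latter equation is obtained by taking the trace of both sides of (5.50) and of (5.30) and eliminating $\overset{\nabla}{\mathbf{B}}_{\kappa_{p(t)}}$.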
5.6 Initial and Boundary Conditions
The question of the boundary conditions to be associated with the equations shown in the previous section is by no means trivial, because it is intimately linked to the mathematical nature of the system of governing partial differential equations (p.d.e.’s), which still has to be analysed in detail. Therefore in this section we just list a set of boundary conditions which look physically natural, postponing a more careful analysis to the next section. We make use of the spherical symmetry of the problem.

A. Motion of the external boundary:
$$\dot{R}(t) = v(R(t), t), \quad R(0) = R_0(r_A) \tag{5.52}$$
(strictly speaking the radius after fragmentation depends on $r_A$).

B. Pressure on the external boundary:
$$p(R(t), t) = p_A(r_A, t). \tag{5.53}$$

C. Monomer volume fraction at the inner boundary:
$$\varepsilon(r_0, t) = \varepsilon_0. \tag{5.54}$$
This condition is motivated by the assumption that on average there is a constant distribution of catalytic sites on the surface of the catalytic core.

D. Conditions on radial velocities $v$, $u$ on the inner boundary: The polymerization rate depends on various parameters. It is an increasing function of temperature $\theta$ and pressure $p$, and it is likely to depend on $\mathbf{T}_s$ too. Thus we suppose that a function $v_p(\theta, p, \mathbf{T}_s)$ is given such that
$$v(r_0, t) = v_p(\theta, p, \mathbf{T}_s). \tag{5.55}$$
Since the mass of the generated polymer is provided by the monomer flux, we write the balance law:
$$[\varepsilon\rho_f u + (1 - \varepsilon)\rho_s v]|_{r = r_0} = 0. \tag{5.56}$$
E. Stresses: The most delicate conditions refer to the stresses on the solid component. The normal stress acting on spheres at a distance $r_A$ from the agglomerate centre can be deduced by imposing global conservation of linear momentum of the solid in the agglomerate. Let $m(r_A, t)$ be the mass of the microspheres. Then we write (neglecting the force acting on the external surface of the agglomerate):
$$(1 - \varepsilon_A)4\pi r_A^2\, \mathbf{T}_s \cdot \vec{e}\,\big|_{r = R(t)} = \frac{d}{dt}\int_{r_A}^{R_A(t)} 4\pi x^2 (1 - \varepsilon_A)\rho(x, t)m(x, t)v_A(x, t)\, dx + \int_{r_A}^{R_A(t)} 4\pi x^2\, \vec{I}_A(x, t)\, dx, \tag{5.57}$$
where $\vec{I}_A$ denotes the interaction force per unit volume exerted by the fluid on the solid in the agglomerate (see (5.9)). At the catalyst surface $r = r_0$ again we suppose that normal stresses account for the global momentum balance of the polymer in the microsphere:
$$(1 - \varepsilon_0)4\pi r_0^2\, \mathbf{T}_s \cdot \vec{e}\,\big|_{r = r_0} = (1 - \varepsilon)4\pi R^2(t)\, \mathbf{T}_s \cdot \vec{e}\,\big|_{r = R(t)} + \frac{d}{dt}\int_{r_0}^{R(t)} 4\pi r^2 (1 - \varepsilon)\rho_s v(r, t)\, dr + \int_{r_0}^{R(t)} 4\pi r^2\, \vec{I}(r, t)\, dr. \tag{5.58}$$
We will return to the question of selecting appropriate boundary conditions for the tensor $\mathbf{B}_{\kappa_R}$ in Section 5.7.

Initial conditions must be prescribed for all the quantities if we take $R_0 > r_0$.
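5.7 Analysis of the Equations in the Microscale with Spherical Symmetry
The specific form of the mass balances (5.32), (5.33) is
$$\varepsilon_t + \frac{1}{r^2}\frac{\partial}{\partial r}[r^2(\varepsilon u)] = 0 \;\Rightarrow\; u_r + u\left(\frac{\varepsilon_r}{\varepsilon} + \frac{2}{r}\right) = -\frac{\varepsilon_t}{\varepsilon}, \tag{5.59}$$
$$(1 - \varepsilon)_t + \frac{1}{r^2}\frac{\partial}{\partial r}[r^2((1 - \varepsilon)v)] = 0 \;\Rightarrow\; \varepsilon_t + \varepsilon_r v - v_r(1 - \varepsilon) = \frac{2}{r}(1 - \varepsilon)v. \tag{5.60}$$
The tensors $\mathbf{T}_f$, $\mathbf{T}_s$, $\mathbf{B}_{\kappa_{p(t)}}$, $\mathbf{B}_{\kappa_R}$ in spherical coordinates have the form:
$$\mathbf{T}_f = -p\mathbf{I} + 2\eta\begin{pmatrix} v_r & 0 & 0 \\ 0 & v_r & 0 \\ 0 & 0 & v_r \end{pmatrix}, \quad \mathbf{T}_s = \begin{pmatrix} T_1 & 0 & 0 \\ 0 & T_2 & 0 \\ 0 & 0 & T_2 \end{pmatrix},$$
$$\mathbf{B}_{\kappa_{p(t)}} = \begin{pmatrix} B_1 & 0 & 0 \\ 0 & B_2 & 0 \\ 0 & 0 & B_2 \end{pmatrix}, \quad \mathbf{B}_{\kappa_R} = \begin{pmatrix} b_1 & 0 & 0 \\ 0 & b_2 & 0 \\ 0 & 0 & b_2 \end{pmatrix}.$$
The fluid momentum balance (5.34) reduces to
$$\varepsilon\rho_f(u_t + uu_r) = -\varepsilon(1 - \varepsilon)\kappa(u - v) + \frac{1}{r^2}\frac{\partial}{\partial r}[r^2\varepsilon(-p + \eta u_r)]$$
$$\Rightarrow\; p_r + p\left(\frac{\varepsilon_r}{\varepsilon} + \frac{2}{r}\right) = \frac{2\eta}{r}u_r + \eta u_{rr} + \eta\frac{\varepsilon_r}{\varepsilon}u_r - \rho_f(u_t + uu_r) - (1 - \varepsilon)\kappa(u - v). \tag{5.61}$$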
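Using the specific form (5.50), (5.51) of the function $\mathcal{F}$ and the fact that
$$T_1 = \mu_1\frac{B_1 - 1}{\sqrt{B_1B_2^2}} + \mu_2\left(\sqrt{B_1B_2^2} - 1\right) + \mu_3\frac{b_1 - 1}{\sqrt{b_1b_2^2}} + \mu_4\left(\sqrt{b_1b_2^2} - 1\right), \tag{5.62}$$
$$T_2 = \mu_1\frac{B_2 - 1}{\sqrt{B_1B_2^2}} + \mu_2\left(\sqrt{B_1B_2^2} - 1\right) + \mu_3\frac{b_2 - 1}{\sqrt{b_1b_2^2}} + \mu_4\left(\sqrt{b_1b_2^2} - 1\right), \tag{5.63}$$
(we can take $|\mathbf{B}_{\kappa_{p(t)}}|$ and $|\mathbf{B}_{\kappa_R}|$ both positive, hence $B_1, b_1 > 0$), the solid momentum balance (5.35) becomes
$$v_t + vv_r = \frac{\kappa}{\rho_s}\varepsilon(u - v) + \frac{1}{\rho_s(1 - \varepsilon)}\frac{\partial}{\partial r}[(1 - \varepsilon)T_1] + \frac{2}{\rho_s r}(T_1 - T_2),$$
yielding the full expression
$$v_t + vv_r + \varepsilon_r\frac{\hat{T}_1}{1 - \varepsilon} - (B_1)_r\left[\frac{\hat{\mu}_1}{2\sqrt{B_1B_2^2}} + \frac{\hat{\mu}_2B_2^2}{2\sqrt{B_1B_2^2}} + \frac{\hat{\mu}_1B_2^2}{2(B_1B_2^2)^{3/2}}\right] - (B_2)_r\left[-\frac{\hat{\mu}_1B_1}{B_2\sqrt{B_1B_2^2}} + \frac{\hat{\mu}_2B_1B_2}{\sqrt{B_1B_2^2}} + \frac{\hat{\mu}_1B_1B_2}{(B_1B_2^2)^{3/2}}\right]$$
$$- (b_1)_r\left[\frac{\hat{\mu}_3}{2\sqrt{b_1b_2^2}} + \frac{\hat{\mu}_4b_2^2}{2\sqrt{b_1b_2^2}} + \frac{\hat{\mu}_3b_2^2}{2(b_1b_2^2)^{3/2}}\right] - (b_2)_r\left[-\frac{\hat{\mu}_3b_1}{b_2\sqrt{b_1b_2^2}} + \frac{\hat{\mu}_4b_1b_2}{\sqrt{b_1b_2^2}} + \frac{\hat{\mu}_3b_1b_2}{(b_1b_2^2)^{3/2}}\right]$$
$$= \frac{\kappa}{\rho_s}\varepsilon(u - v) + \frac{2}{r}\left[\frac{\hat{\mu}_3}{\sqrt{b_1b_2^2}}(b_1 - b_2) + \frac{\hat{\mu}_1}{\sqrt{B_1B_2^2}}(B_1 - B_2)\right], \tag{5.64}$$
(where $\hat{\mu}_i := \mu_i/\rho_s$ and $\hat{T}_1$ denotes $T_1$ with $\mu_i$ replaced by $\hat{\mu}_i$), and the evolution laws (5.36), (5.37) of $\mathbf{B}_{\kappa_{p(t)}}$, $\mathbf{B}_{\kappa_R}$ lead to the respective systems
$$(b_1)_t + (b_1)_r v - 2b_1v_r = 0,$$
$$(b_2)_t + (b_2)_r v = \frac{2}{r}b_2v,$$
$$(B_1)_t + (B_1)_r v - 2B_1v_r = -\frac{B_1}{2\eta_2(\eta_2 + 3\eta_1)\sqrt{B_1B_2^2}}\big[-\mu_2\eta_2 - 3\mu_2\eta_1 + \mu_1\eta_2 + 3\mu_1\eta_1 - 3B_1^2\eta_1\mu_2B_2^2 - 3B_1\eta_1\mu_1\ln(B_1B_2^2) + B_1^2\eta_1\mu_1 + 3B_1\eta_1\mu_2 + 3\mu_2B_1B_2^2\eta_1 - 6\eta_1B_2\mu_1 + 2\eta_1B_2\mu_1B_1 + 3\mu_1\ln(B_1B_2^2)\eta_1 + \mu_1\ln(B_1B_2^2)\eta_2 - 2\mu_1B_2\eta_2 + \mu_1B_1\eta_2 + \mu_2B_1B_2^2\eta_2\big], \tag{5.65}$$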
$$(B_2)_t + (B_2)_r v = \frac{2}{r}B_2v - \frac{B_2}{2\eta_2(\eta_2 + 3\eta_1)\sqrt{B_1B_2^2}}\big[-\mu_2\eta_2 - 3\mu_2\eta_1 + \mu_1\eta_2 + 3\mu_1\eta_1 + 3\mu_2B_1B_2^2\eta_1 + 3\eta_1B_2\mu_2 - 3\eta_1B_2\mu_1 + 2\eta_1B_2^2\mu_1 - 3\eta_1B_2\mu_1\ln(B_1B_2^2) - 3\eta_1B_2^3\mu_2B_1 + \eta_1B_2\mu_1B_1 + 3\mu_1\ln(B_1B_2^2)\eta_1 + \mu_1\ln(B_1B_2^2)\eta_2 - 3\mu_1B_1\eta_1 - \mu_1B_1\eta_2 + \mu_2B_1B_2^2\eta_2\big]. \tag{5.66}$$
At this point we remark that equation (5.59) can be regarded as a first-order ordinary differential equation (o.d.e.) (w.r.t. $r$) in $u$ and can be formally integrated at each time using (5.56) as boundary condition on $r = r_0$. Thus $u$ can be expressed as a functional of $\varepsilon$, $\varepsilon_t$ and $v$. Likewise, equation (5.61) can be regarded as a first-order o.d.e. for $p$, which can be integrated at each time using (5.53) as boundary condition, thus expressing $p$ as a functional of $v$, $\varepsilon$, $\varepsilon_r$, $u$, $u_r$, $u_{rr}$. We order the remaining quantities in the vector:
$$\vec{U} = \{\varepsilon, B_1, B_2, b_1, b_2, v\}, \tag{5.67}$$
and note that they satisfy genuine “p.d.e.’s”. The final system, made of equations (5.60), (5.64)–(5.66), can be summarized as
$$\vec{U}_t + A\vec{U}_r = \vec{Y}, \tag{5.68}$$
where the matrix $A$ is
$$A = \begin{pmatrix} v & 0 & 0 & 0 & 0 & -(1 - \varepsilon) \\ 0 & v & 0 & 0 & 0 & -2B_1 \\ 0 & 0 & v & 0 & 0 & 0 \\ 0 & 0 & 0 & v & 0 & -2b_1 \\ 0 & 0 & 0 & 0 & v & 0 \\ \dfrac{\hat{T}_1}{1 - \varepsilon} & A_1 & A_2 & A_3 & A_4 & v \end{pmatrix} \tag{5.69}$$
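and $A_i$ are the coefficients of the $r$-derivatives of $B_1$, $B_2$, $b_1$ and $b_2$ in (5.64), $\hat{T}_1$ has the same expression as $T_1$ (see (5.62)) with $\mu_i$ replaced by $\hat{\mu}_i$, $i = 1, \ldots, 4$, and the vector $\vec{Y}$ is
$$\vec{Y} = \begin{pmatrix} \dfrac{2}{r}(1 - \varepsilon)v \\ \mathcal{B}_1 \\ \mathcal{B}_2 \\ 0 \\ \dfrac{2}{r}b_2v \\ \dfrac{\kappa}{\rho_s}\varepsilon(u - v) + \dfrac{2}{r}\left[\dfrac{\hat{\mu}_3}{\sqrt{b_1b_2^2}}(b_1 - b_2) + \dfrac{\hat{\mu}_1}{\sqrt{B_1B_2^2}}(B_1 - B_2)\right] \end{pmatrix}, \tag{5.70}$$
where $\mathcal{B}_i = \mathcal{B}_i(B_1, B_2)$, $i = 1, 2$ are the right-hand sides of the evolution equations for $B_1$ and $B_2$ (see (5.65), (5.66)) and $u$ in the last component is the functional obtained by the formal integration of equation (5.59), as noted before. It can be seen that the matrix $A$ has the following six eigenvalues:
$$v - C,\; v,\; v,\; v,\; v,\; v + C, \tag{5.71}$$
with
$$C^2 = -2A_1B_1 - 2A_3b_1 - \hat{T}_1, \tag{5.72}$$
which are all real, since the explicit computation of $C^2$ shows that
$$-2A_1B_1 - 2A_3b_1 - \hat{T}_1 = 2B_1\left[\frac{\hat{\mu}_1}{2\sqrt{B_1B_2^2}} + \frac{\hat{\mu}_2B_2^2}{2\sqrt{B_1B_2^2}} + \frac{\hat{\mu}_1B_2^2}{2(B_1B_2^2)^{3/2}}\right] + 2b_1\left[\frac{\hat{\mu}_3}{2\sqrt{b_1b_2^2}} + \frac{\hat{\mu}_4b_2^2}{2\sqrt{b_1b_2^2}} + \frac{\hat{\mu}_3b_2^2}{2(b_1b_2^2)^{3/2}}\right]$$
$$- \left(\hat{\mu}_1\frac{B_1 - 1}{\sqrt{B_1B_2^2}} + \hat{\mu}_2\sqrt{B_1B_2^2} - \hat{\mu}_2 + \hat{\mu}_3\frac{b_1 - 1}{\sqrt{b_1b_2^2}} + \hat{\mu}_4\sqrt{b_1b_2^2} - \hat{\mu}_4\right)$$
$$= \hat{\mu}_1\frac{B_1}{\sqrt{B_1B_2^2}} + \hat{\mu}_2\sqrt{B_1B_2^2} + \hat{\mu}_1\frac{1}{\sqrt{B_1B_2^2}} + \hat{\mu}_3\frac{b_1}{\sqrt{b_1b_2^2}} + \hat{\mu}_4\sqrt{b_1b_2^2} + \hat{\mu}_3\frac{1}{\sqrt{b_1b_2^2}}$$
$$- \hat{\mu}_1\frac{B_1}{\sqrt{B_1B_2^2}} + \hat{\mu}_1\frac{1}{\sqrt{B_1B_2^2}} - \hat{\mu}_2\sqrt{B_1B_2^2} + \hat{\mu}_2 - \hat{\mu}_3\frac{b_1}{\sqrt{b_1b_2^2}} + \hat{\mu}_3\frac{1}{\sqrt{b_1b_2^2}} - \hat{\mu}_4\sqrt{b_1b_2^2} + \hat{\mu}_4$$
$$= \hat{\mu}_4 + \hat{\mu}_2 + 2\hat{\mu}_3\frac{1}{\sqrt{b_1b_2^2}} + 2\hat{\mu}_1\frac{1}{\sqrt{B_1B_2^2}} > 0.$$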
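The eigenvalue structure (5.71), (5.72) can be checked numerically for a sample state. The sketch below is only an illustration under stated assumptions: the state values and the rescaled moduli $\hat{\mu}_i$ are arbitrary, the function names are hypothetical, and the coefficients $A_i$ are taken as minus the partial derivatives of $\hat{T}_1$ with respect to $B_1$, $B_2$, $b_1$, $b_2$, which is how they enter (5.64). The matrix is assembled as in (5.69) and its spectrum is compared with $v \pm C$ and the quadruple eigenvalue $v$.

```python
import numpy as np


def t1_hat(B1, B2, b1, b2, mu):
    """T_1 of (5.62) with mu_i replaced by mu_i / rho_s (passed in as mu)."""
    D, d = np.sqrt(B1 * B2 ** 2), np.sqrt(b1 * b2 ** 2)
    return (mu[0] * (B1 - 1) / D + mu[1] * (D - 1)
            + mu[2] * (b1 - 1) / d + mu[3] * (d - 1))


def a_coefficients(B1, B2, b1, b2, mu):
    """A_1..A_4: minus the derivatives of t1_hat w.r.t. B1, B2, b1, b2."""
    D, d = np.sqrt(B1 * B2 ** 2), np.sqrt(b1 * b2 ** 2)
    A1 = -(mu[0] / (2 * D) + mu[1] * B2 ** 2 / (2 * D) + mu[0] * B2 ** 2 / (2 * D ** 3))
    A2 = -(-mu[0] * (B1 - 1) * B1 * B2 / D ** 3 + mu[1] * B1 * B2 / D)
    A3 = -(mu[2] / (2 * d) + mu[3] * b2 ** 2 / (2 * d) + mu[2] * b2 ** 2 / (2 * d ** 3))
    A4 = -(-mu[2] * (b1 - 1) * b1 * b2 / d ** 3 + mu[3] * b1 * b2 / d)
    return A1, A2, A3, A4


def system_matrix(v, eps, B1, B2, b1, b2, mu):
    """Matrix A of (5.69) for the unknowns (eps, B1, B2, b1, b2, v)."""
    A = v * np.eye(6)
    A[0, 5], A[1, 5], A[3, 5] = -(1 - eps), -2 * B1, -2 * b1
    A[5, 0] = t1_hat(B1, B2, b1, b2, mu) / (1 - eps)
    A[5, 1:5] = a_coefficients(B1, B2, b1, b2, mu)
    return A


# Arbitrary illustrative state and rescaled moduli (not from the text).
state = dict(v=0.3, eps=0.4, B1=1.1, B2=0.95, b1=1.3, b2=0.9)
mu = (1.0, 0.5, 0.8, 0.3)
A = system_matrix(mu=mu, **state)
c_squared = (mu[3] + mu[1]
             + 2 * mu[2] / np.sqrt(state["b1"] * state["b2"] ** 2)
             + 2 * mu[0] / np.sqrt(state["B1"] * state["B2"] ** 2))
print(np.sort(np.linalg.eigvals(A).real))
print(state["v"] - np.sqrt(c_squared), state["v"] + np.sqrt(c_squared))
```

Only $A_1$ and $A_3$ enter $C^2$; $A_2$ and $A_4$ multiply zero entries of the last column and therefore do not affect the spectrum.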
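The eigenvalues will be denoted by $\lambda_1 < \lambda_2 = \lambda_3 = \lambda_4 = \lambda_5 < \lambda_6$. We can also compute the corresponding left eigenvectors $\vec{\zeta}_i$, $i = 1, \ldots, 6$:
$$\vec{\zeta}_1 = \left(1,\; -A_1\frac{1 - \varepsilon}{\hat{T}_1},\; -A_2\frac{1 - \varepsilon}{\hat{T}_1},\; -A_3\frac{1 - \varepsilon}{\hat{T}_1},\; -A_4\frac{1 - \varepsilon}{\hat{T}_1},\; C\frac{1 - \varepsilon}{\hat{T}_1}\right), \tag{5.73}$$
associated with $v - C$;
$$\vec{\zeta}_2 = \left(1,\; \frac{1}{2}\frac{\varepsilon - 1}{B_1},\; 0,\; 0,\; 0,\; 0\right), \tag{5.74}$$
$$\vec{\zeta}_3 = \left(\frac{1}{2},\; -\frac{(1 - \varepsilon)}{4(B_1 + b_1)},\; 1,\; -\frac{(1 - \varepsilon)}{4(B_1 + b_1)},\; -1,\; 0\right), \tag{5.75}$$
$$\vec{\zeta}_4 = (0, 0, 1, 0, 0, 0), \tag{5.76}$$
$$\vec{\zeta}_5 = (0, 0, 0, 0, 1, 0), \tag{5.77}$$
associated with $v$, and
$$\vec{\zeta}_6 = \left(1,\; -A_1\frac{1 - \varepsilon}{\hat{T}_1},\; -A_2\frac{1 - \varepsilon}{\hat{T}_1},\; -A_3\frac{1 - \varepsilon}{\hat{T}_1},\; -A_4\frac{1 - \varepsilon}{\hat{T}_1},\; -C\frac{1 - \varepsilon}{\hat{T}_1}\right), \tag{5.78}$$
associated with $v + C$. The matrix $Z$ whose rows are the left eigenvectors is nonsingular (as long as $\varepsilon < 1$ and $v > 0$). Indeed
$$|Z| = \frac{1}{4}\frac{(1 - \varepsilon)^3C^3}{B_1\hat{T}_1^2(B_1 + b_1)},$$
thus the eigenvectors span $\mathbb{R}^6$.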
5.8 Consistency of the Boundary Conditions
From the mathematical point of view the problem we have formulated is enormously complicated for the following reasons:
(i) the boundary conditions (5.57), (5.58) are nonlocal; the former involves the whole structure of part of the agglomerate, the latter the whole solid and liquid velocity fields in the microspheres;
(ii) the vector $\vec{Y}$ in equation (5.68) contains functionals of the unknowns and of their derivatives;
(iii) the domain in which we are looking for a solution has a complex structure, as illustrated in Figure 5.6, which shows that it is crossed by characteristics originating from the two corner points $(r_0, 0)$, $(R_0, 0)$.
Modelling Ziegler–Natta Polymerization in High Pressure Reactors
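[Figure 5.6: the (r, t) plane, showing the regions R0, R1, R2, R3, the free boundary R(t), the initial-condition segment (I.C.) between r0 and R0, and the characteristics c; case v(r0, t) − C(r0, t) < 0.]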
The latter circumstance implies the necessity of solving the problem first in the domain R0, where only the initial conditions

$$
\begin{cases}
\varepsilon(r, 0) = \varepsilon_0(r) \in (0, 1), \\
v(r, 0) = \upsilon_0(r) \in (0, C(r, 0)), \\
B_i(r, 0) = B_{i0}(r), & i = 1, 2, \\
b_i(r, 0) = b_{i0}(r), & i = 1, 2,
\end{cases}
\qquad r_0 < r < R_0, \tag{5.79}
$$
affect the solution, and subsequently in the domains R1, R2, R3, using the previous information and imposing compatibility of the data and continuity across the interface between R1 and R2. We note that Figure 5.6 refers to the specific case $0 < v < C$, which is the only physically meaningful one. We remark that the initial value problem in R0 is actually solvable because the domain is strong determinate in the sense of Ref. [16] and the conditions of the existence and uniqueness theorem (Theorem 4.1, Chapter 2 [16]) are satisfied. However, the interplay among the various regions is really complicated, and for the sake of simplicity we will confine ourselves to the particular case in which we may neglect the thickness of the initial shell, i.e. $R_0 \simeq r_0$, thus reducing the domain to the one shown in Figure 5.7. This simplification makes sense since the initial thickness of the layer (following fragmentation) is indeed very small, meaning that the lifetime of the domain R0 is extremely short. Now we have a problem in a wedge. Again we refer to Ref. [16] for the general theory. We limit our analysis to the verification that the boundary conditions prescribed on the basis of physical arguments are indeed well suited for the hyperbolic system of equations.
[Figure 5.7: the wedge-shaped domain in the (r, t) plane between r = r0 and the free boundary R(t), with characteristics c; case r0 = R0.]
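According to Ref. [16] the way of prescribing the boundary data has to follow the general rule:

$$\vec\zeta_i \cdot \vec U =: G_i(r, t, \vec U) = \gamma_i(r, t, \vec U) \quad \text{on } r = r_0 \text{ for all } i \text{ s.t. } \lambda_i > 0, \tag{5.80}$$

$$\vec\zeta_j \cdot \vec U =: G_j(r, t, \vec U) = \gamma_j(r, t, \vec U) \quad \text{on } r = R(t) \text{ for all } j \text{ s.t. } \lambda_j < v. \tag{5.81}$$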
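Moreover, if we denote by $\mathcal{I}_0$ the set of indices $i$ and by $\mathcal{I}_R$ the set of indices $j$, it must be $\mathcal{I}_0 \cap \mathcal{I}_R = \emptyset$ and $\mathcal{I}_0 \cup \mathcal{I}_R = \{1, \dots, 6\}$. We note that $\mathcal{I}_0 = \{2, 3, 4, 5, 6\}$, having supposed $0 < v < C$, while $\mathcal{I}_R = \{1\}$. Thus on $r = r_0$ we have to specify the quantities:

$$\vec\zeta_2 \cdot \vec U = \frac{3\varepsilon - 1}{2} := G_2, \tag{5.82}$$

$$\vec\zeta_3 \cdot \vec U = \frac{1}{2}\,\frac{3\varepsilon - 1}{2} + B_2 - b_2 := G_3, \tag{5.83}$$

$$\vec\zeta_4 \cdot \vec U = B_2 := G_4, \tag{5.84}$$

$$\vec\zeta_5 \cdot \vec U = b_2 := G_5, \tag{5.85}$$

$$\vec\zeta_6 \cdot \vec U = \varepsilon - \frac{1-\varepsilon}{\hat T_1}\,(A_1B_1 + A_2B_2 + A_3b_1 + A_4b_2 + Cv) := G_6, \tag{5.86}$$

while on $r = R(t)$ we need just one condition on

$$\vec\zeta_1 \cdot \vec U = \varepsilon - \frac{1-\varepsilon}{\hat T_1}\,(A_1B_1 + A_2B_2 + A_3b_1 + A_4b_2 + Cv) := G_1. \tag{5.87}$$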
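The combinations attached to the multiple eigenvalue v can be checked by direct arithmetic. The sketch below is only an illustration: the numerical state is hypothetical, and the eigenvectors are taken as read from (5.74)–(5.77). It evaluates the dot products exactly with rational numbers and compares them with (5.82)–(5.85).

```python
from fractions import Fraction as Fr

def dot(a, b):
    """Euclidean dot product of two equally long sequences."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical sample state U = (eps, B1, B2, b1, b2, v); values are illustrative only.
eps, B1, B2, b1, b2, v = Fr(1, 3), Fr(2), Fr(5, 4), Fr(3), Fr(7, 5), Fr(1, 2)
U = (eps, B1, B2, b1, b2, v)

# Left eigenvectors associated with the eigenvalue v, as read from (5.74)-(5.77).
zeta2 = (1, Fr(1, 2) * (eps - 1) / B1, 0, 0, 0, 0)
q = (1 - eps) / (4 * (B1 + b1))
zeta3 = (Fr(1, 2), -q, 1, -q, -1, 0)
zeta4 = (0, 0, 1, 0, 0, 0)
zeta5 = (0, 0, 0, 0, 1, 0)

G2 = dot(zeta2, U)
G3 = dot(zeta3, U)
G4 = dot(zeta4, U)
G5 = dot(zeta5, U)

# Compare with the closed forms (5.82)-(5.85).
print(G2 == (3 * eps - 1) / 2)
print(G3 == Fr(1, 2) * (3 * eps - 1) / 2 + B2 - b2)
print(G4 == B2, G5 == b2)
```

Exact fractions are used so that the comparison does not depend on floating-point tolerances.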
Remembering (5.65), (5.66) and the definitions of $A_i$, $i = 1, \dots, 4$, we know that the latter quantities and $T_1$, $T_2$ are all determined as functions of $B_1$, $B_2$, $b_1$ and $b_2$. The same is true for $C$ (see (5.72)). We note that with the choice we made of the eigenvectors the functions $G_i$, $i = 2, \dots, 6$, are linearly independent in the components of $\vec U$.
We recall that on $r = r_0$ we have prescribed $\varepsilon$ (see (5.54)) and $v$ (see (5.55)), while $G_6$, which is a known function of all the components of $\vec U$, is put equal to a functional of $\vec U$ via the nonlocal condition (5.37). Similarly, on $r = R(t)$, $G_1$ is a known function of $\vec U$ and the nonlocal condition (5.64) sets it equal to a functional of $\vec U$.
The missing boundary conditions are the ones in which we have to specify the reference configuration of the newly produced polymer at $r = r_0$, namely $b_1$, $b_2$. We recall that if $\vec X$ is the vector of Lagrangian coordinates in the reference configuration for a point produced at time $t$ at $r = r_0$, then

$$\mathbf{B}_{\kappa_R}(r_0, t) = \left.\left(\nabla\vec X\,\nabla\vec X^{T}\right)\right|_{r=r_0}. \tag{5.88}$$

Now we can argue as if all newly produced points were already existing and "stored" within the sphere $\|\vec x\| < r_0$. For instance we can attach to a point produced at the location $r_0\vec e$ at time $t$ the Lagrangian coordinate:

$$\vec X(r_0, t) = r_0 f(t)\,\vec e, \tag{5.89}$$

where $f(t)$ is any smooth function such that $f(0) = 1$ and $f'(t) < 0$ (e.g. $f(t) = e^{-t}$). It is now simple to establish that

$$\vec X(r_0 + \delta, t) - \vec X(r_0, t) = \vec X\!\left(r_0, t - \frac{\delta}{v_p}\right) - \vec X(r_0, t), \tag{5.90}$$

so that

$$\left.\frac{\partial \vec X}{\partial r}\right|_{r=r_0} = -\frac{1}{v_p} f'(t)\, r_0\,\vec e. \tag{5.91}$$

This finally yields

$$b_1(r_0, t) = \left(\frac{1}{v_p} f'(t)\, r_0\right)^2, \qquad b_2 = 1. \tag{5.92}$$
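The step from (5.89) to (5.92) can be illustrated numerically. The following is a minimal sketch under stated assumptions: f(t) = exp(-t) is the example suggested in the text, the values of r0, vp and t are hypothetical, and the function names are introduced here only for illustration. It differentiates the radial Lagrangian coordinate implied by (5.90) and compares its square with the expression for b1 in (5.92).

```python
import math

def f(t):
    """Example choice from the text: f(0) = 1 and f'(t) < 0."""
    return math.exp(-t)

def X_radial(r, t, r0, vp):
    """Radial Lagrangian coordinate near r0, following (5.90):
    material found at r = r0 + delta at time t was produced at time t - delta/vp."""
    return r0 * f(t - (r - r0) / vp)

def b1_closed_form(t, r0, vp, h=1e-6):
    """b1 from (5.92), with f'(t) estimated by a central difference."""
    f_prime = (f(t + h) - f(t - h)) / (2 * h)
    return (f_prime * r0 / vp) ** 2

# Hypothetical values, for illustration only.
r0, vp, t = 1.0, 0.5, 0.3
delta = 1e-6

# One-sided finite-difference estimate of dX/dr at r = r0, cf. (5.91).
dX_dr = (X_radial(r0 + delta, t, r0, vp) - X_radial(r0, t, r0, vp)) / delta
b1_numeric = dX_dr ** 2

print(b1_numeric, b1_closed_form(t, r0, vp))
```

The two printed values agree up to the finite-difference error, which is the content of (5.91) and (5.92) for this particular f.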
For $t = 0$ the degeneracy of the domain generates a discontinuity of $T_1$ (see (5.64) and (5.65)) which is however artificial, since in the physical problem we start with a thin layer of polymer produced in the fragmentation stage. To eliminate this discontinuity it suffices to modify (5.62) by introducing a cutoff factor $\xi(t)$, smooth and monotonically increasing from 0 to 1 in some short time interval $(0, \tau)$:

$$v(r_0, t) = \xi(t)\, v_p. \tag{5.93}$$
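One possible cutoff factor is sketched below. The text does not prescribe a specific ξ(t); the cubic "smoothstep" used here is an assumption chosen for simplicity (it is only once continuously differentiable, so a smoother profile may be preferable in practice), and the names xi and v_inner are hypothetical.

```python
def xi(t, tau):
    """Cubic smoothstep: 0 for t <= 0, 1 for t >= tau, monotone in between.
    An illustrative choice; the source only requires a smooth monotone cutoff."""
    if t <= 0.0:
        return 0.0
    if t >= tau:
        return 1.0
    s = t / tau
    return s * s * (3.0 - 2.0 * s)

def v_inner(t, vp, tau):
    """Inner boundary velocity v(r0, t) = xi(t) * vp, as in (5.93)."""
    return xi(t, tau) * vp

# Hypothetical values: ramp-up time tau and production velocity vp.
tau, vp = 0.1, 2.0
samples = [v_inner(k * tau / 10.0, vp, tau) for k in range(11)]
print(samples[0], samples[-1])
```

With this choice the velocity starts from zero and reaches vp at t = tau without a jump.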
BRIDGING THE TWO SCALES. THE COMPLETE MODEL

5.9 Determining the Free Terms in the Macroscopic Transport Equations
We still have to express the monomer absorption rate $M$ in equation (5.2) and the heat production rate $H$ in equation (5.13). If $\alpha(r_A, t)$ is the monomer flux entering each of the microspheres located at $r_A$ at time $t$, we just have $M = \rho\alpha$. The quantity $\alpha(r_A, t)$ is expressed by

$$\alpha = -\rho_f\, 4\pi R^2\, \varepsilon (u - v)\big|_{r=R(t)}, \tag{5.94}$$

yielding

$$M = -\frac{3}{R}\,\rho_f\, \varepsilon (1 - \varepsilon_A)(u - v)\Big|_{r=R(t)}. \tag{5.95}$$
Concerning $H$, we must observe that on the large scale it accounts for the heat generated by the polymerization reaction, which is ordinarily much larger than the change of heat capacity involved in the transformation of monomer into polymer. Thus we simply write $H = \rho\beta$, where $\beta(r_A, t)$ is the heat released per unit time by each microsphere located at $r_A$ at time $t$, namely

$$\beta = \chi\rho_s\, 4\pi r_0^2\, \varepsilon_0 v_p, \tag{5.96}$$

where $\chi$ is the latent heat generated by the production of the unit mass of polymer. Hence,

$$H = 3\chi\rho_s\, \varepsilon_0 (1 - \varepsilon_A)\,\frac{r_0^2 v_p}{R^3}. \tag{5.97}$$
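As a consistency check, (5.95) and (5.97) can be recovered from (5.94) and (5.96) under one assumption that is not spelled out in this passage: that the factor multiplying α and β is the number of microspheres per unit volume of agglomerate, written here as $n$ (a symbol introduced only for this sketch), for spheres of radius $R$ filling the solid fraction $1 - \varepsilon_A$.

```latex
% Assumption: n = number of microspheres per unit volume of agglomerate.
n = \frac{1-\varepsilon_A}{\tfrac{4}{3}\pi R^3} = \frac{3(1-\varepsilon_A)}{4\pi R^3},
\qquad
M = n\,\alpha
  = \frac{3(1-\varepsilon_A)}{4\pi R^3}\,\bigl(-\rho_f\,4\pi R^2\,\varepsilon(u-v)\bigr)
  = -\frac{3}{R}\,\rho_f\,\varepsilon(1-\varepsilon_A)(u-v),
\qquad
H = n\,\beta
  = \frac{3(1-\varepsilon_A)}{4\pi R^3}\,\chi\rho_s\,4\pi r_0^2\,\varepsilon_0 v_p
  = 3\chi\rho_s\,\varepsilon_0(1-\varepsilon_A)\,\frac{r_0^2 v_p}{R^3}.
```

Both results coincide with the expressions stated in the text, with $(u - v)$ evaluated at $r = R(t)$.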
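5.10 The Complete Model
For the reader's convenience we summarize the final set of equations:

5.10.1 Macroscale

$$\frac{\partial v_A}{\partial r_A} + \frac{2}{r_A}\, v_A = 3\,\frac{\dot R}{R}, \tag{5.98}$$

$$\frac{\partial^2 p}{\partial r_A^2} + \frac{2}{r_A}\frac{\partial p}{\partial r_A} = \frac{1}{K}\left(\frac{M}{\rho_f} + 3\varepsilon_A \frac{\dot R}{R}\right), \tag{5.99}$$

$$[\varepsilon_A\rho_f C_f + (1-\varepsilon_A)\hat C_s]\,\frac{\partial\theta_A}{\partial t} - \frac{1}{r_A^2}\frac{\partial}{\partial r_A}\left\{ r_A^2\,[\varepsilon_A k_f + (1-\varepsilon_A)\hat k_s]\,\frac{\partial\theta_A}{\partial r_A}\right\} + [\varepsilon_A\rho_f C_f \vec V_{fA} + (1-\varepsilon_A)\hat C_s \vec V_{sA}]\,\frac{\partial\theta_A}{\partial r_A} = H, \tag{5.100}$$

$$\frac{d}{dt} R_A(t) = v_A(R_A(t), t), \tag{5.101}$$

where $\hat C_s$ and $\hat k_s$ are defined in (5.11) and (5.12) and boundary and initial conditions are (5.52)–(5.58).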
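5.10.2 Microscale

$$u_r + u\left(\frac{\varepsilon_r}{\varepsilon} + \frac{2}{r}\right) = -\frac{\varepsilon_t}{\varepsilon}, \tag{5.102}$$

$$p_r + p\left(\frac{\varepsilon_r}{\varepsilon} + \frac{2}{r}\right) = \frac{2\eta}{r}\, u_r + \eta u_{rr} + \eta\,\frac{\varepsilon_r}{\varepsilon}\, u_r - \rho_f (u_t + u u_r) - (1-\varepsilon)\kappa(u - v), \tag{5.103}$$

$$\vec U_t + A\vec U_r = \vec Y, \tag{5.104}$$

$$\frac{d}{dt} R(t; r_A) = v(R(t), t; r_A), \tag{5.105}$$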
where $A$ and $\vec Y$ are defined in equations (5.67), (5.69), (5.70) and initial and boundary conditions are discussed in Section 5.8. Equations (5.102) and (5.103) are used to express $u$, $p$ as functionals of the quantities ordered in the vector $\vec U = \{\varepsilon, B_1, B_2, b_1, b_2, v\}$, allowing the construction of the vector $\vec Y$ appearing in (5.104), which is treated as a nonlinear hyperbolic system. Finally the terms bridging the two scales are

$$M = -\frac{3}{R}\,\rho_f\, \varepsilon (1 - \varepsilon_A)(u - v)\Big|_{r=R(t)}, \tag{5.106}$$

$$H = 3\chi\rho_s\, \varepsilon_0 (1 - \varepsilon_A)\,\frac{r_0^2 v_p}{R^3}, \tag{5.107}$$

$v_p$ being a function of $\theta$, $p$ and possibly $T_1$.
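5.11 Not Evolving Natural Configuration
In this case we have $\mathbf{B}_{\kappa_{p(t)}} \equiv \mathbf{B}_{\kappa_R}$ (the reference configuration is the natural one at all times) and the polymer has an elastic behaviour. Formally this is equivalent to putting $\mu_1 = 0$, $\mu_2 = 0$ in (5.48). The dimension of the vector $\vec U$ reduces to four:

$$\vec U = \{\varepsilon, b_1, b_2, v\}, \tag{5.108}$$

$$\vec U_t + A\vec U_r = \vec Y, \tag{5.109}$$

where now the matrix $A \in \mathbb{R}^{4\times 4}$ is defined by

$$A = \begin{pmatrix} v & 0 & 0 & -(1-\varepsilon) \\ 0 & v & 0 & -2b_1 \\ 0 & 0 & v & 0 \\ \dfrac{\hat T_1}{1-\varepsilon} & A_3 & A_4 & v \end{pmatrix}, \tag{5.110}$$

and $\vec Y \in \mathbb{R}^4$ is

$$\vec Y = \begin{pmatrix} \dfrac{2}{r}(1-\varepsilon)v \\ 0 \\ \dfrac{2}{r}\, b_2 v \\ \dfrac{\kappa}{\rho_s}\,\varepsilon(u - v) + \dfrac{2}{r}\left(\dfrac{\hat\mu_3}{\sqrt{b_1b_2^2}}\,(b_1 - b_2)\right) \end{pmatrix}, \tag{5.111}$$

and obviously

$$\hat T_1 = \left[\hat\mu_3\,\frac{b_1 - 1}{\sqrt{b_1b_2^2}} + \hat\mu_4\left(\sqrt{b_1b_2^2} - 1\right)\right], \tag{5.112}$$

$$A_3 = -\left[\frac{\hat\mu_3}{2\sqrt{b_1b_2^2}} + \frac{\hat\mu_4 b_2^2}{2\sqrt{b_1b_2^2}} + \frac{\hat\mu_3 b_2^2}{2\,(b_1b_2^2)^{3/2}}\right], \tag{5.113}$$

$$A_4 = -\left[-\hat\mu_3\,\frac{\sqrt{b_1}}{b_2^2} + \frac{\hat\mu_4\, 2b_1b_2}{2\sqrt{b_1b_2^2}} + \frac{\hat\mu_3\, 2b_1b_2}{2\,(b_1b_2^2)^{3/2}}\right], \tag{5.114}$$

whereas nothing changes in the equations for $p$ and $u$, namely (5.59), (5.61). As in the general case the system is hyperbolic but with reduced dimensions.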
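The eigenvalues are $v - C$, $v$, $v$, $v + C$, which are all real, since

$$C^2 = -2A_3b_1 - \hat T_1 > 0. \tag{5.115}$$

Accordingly we have the new set of left eigenvectors:

$$\vec\zeta_1 = \left(1,\ -A_3\frac{1-\varepsilon}{\hat T_1},\ -A_4\frac{1-\varepsilon}{\hat T_1},\ -C\frac{1-\varepsilon}{\hat T_1}\right), \tag{5.116}$$

associated with $v - C$;

$$\vec\zeta_2 = \left(1,\ \frac{1}{2}\frac{\varepsilon-1}{b_1},\ 0,\ 0\right), \qquad \vec\zeta_3 = (0, 0, 1, 0), \tag{5.117}$$

associated with $v$, and

$$\vec\zeta_4 = \left(1,\ -A_3\frac{1-\varepsilon}{\hat T_1},\ -A_4\frac{1-\varepsilon}{\hat T_1},\ C\frac{1-\varepsilon}{\hat T_1}\right), \tag{5.118}$$

associated with $v + C$, as well as the corresponding quantities:

$$G_1 := \vec\zeta_1\cdot\vec U = \varepsilon - (1-\varepsilon)\,\frac{1}{\hat T_1}\,(A_3b_1 + A_4b_2 + Cv), \tag{5.119}$$

$$G_2 := \vec\zeta_2\cdot\vec U = \frac{3\varepsilon - 1}{2}, \tag{5.120}$$

$$G_3 := \vec\zeta_3\cdot\vec U = b_2, \tag{5.121}$$

$$G_4 := \vec\zeta_4\cdot\vec U = \varepsilon - (1-\varepsilon)\,\frac{1}{\hat T_1}\,(A_3b_1 + A_4b_2 - Cv). \tag{5.122}$$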
The choice of the boundary conditions for this "reduced" hyperbolic problem follows the same rules used for the full model, with the same difficulties; but now we cannot choose the tensor $\mathbf{B}_{\kappa_R}$ on the internal boundary as we did in (5.89)–(5.92). Instead we need to choose $b_1$ using (5.58), which defines the stress state of the newly born material.
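The eigenvalue structure of the reduced system can be checked numerically. The sketch below is an illustration under stated assumptions: the coefficients follow the reading $\hat T_1 = \hat\mu_3 (b_1 - 1)/\sqrt{b_1 b_2^2} + \hat\mu_4(\sqrt{b_1 b_2^2} - 1)$ with $A_3 = -\partial\hat T_1/\partial b_1$ and $A_4 = -\partial\hat T_1/\partial b_2$, the matrix is the 4×4 matrix of the reduced system with rows (v, 0, 0, −(1−ε)), (0, v, 0, −2b1), (0, 0, v, 0), (T̂1/(1−ε), A3, A4, v), and all numerical values and function names are hypothetical. It verifies that $C^2 = -2A_3b_1 - \hat T_1$ reduces to $\hat\mu_4 + 2\hat\mu_3/\sqrt{b_1b_2^2}$ and that $v \pm C$ and $v$ annihilate the characteristic determinant.

```python
import math

def reduced_coefficients(b1, b2, mu3, mu4):
    """T1, A3, A4 of the reduced (elastic) system, under the reading stated above."""
    s = math.sqrt(b1 * b2 ** 2)
    T1 = mu3 * (b1 - 1.0) / s + mu4 * (s - 1.0)
    A3 = -(mu3 / (2 * s) + mu4 * b2 ** 2 / (2 * s) + mu3 * b2 ** 2 / (2 * s ** 3))
    A4 = -(-mu3 * math.sqrt(b1) / b2 ** 2 + mu4 * b1 * b2 / s + mu3 * b1 * b2 / s ** 3)
    return T1, A3, A4

def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def char_det(lam, eps, b1, b2, v, mu3, mu4):
    """det(A - lam*I) for the reduced 4x4 matrix A."""
    T1, A3, A4 = reduced_coefficients(b1, b2, mu3, mu4)
    A = [[v, 0.0, 0.0, -(1.0 - eps)],
         [0.0, v, 0.0, -2.0 * b1],
         [0.0, 0.0, v, 0.0],
         [T1 / (1.0 - eps), A3, A4, v]]
    shifted = [[A[i][j] - (lam if i == j else 0.0) for j in range(4)] for i in range(4)]
    return det(shifted)

# Hypothetical parameter values, for illustration only.
eps, b1, b2, v, mu3, mu4 = 0.4, 1.3, 0.9, 0.2, 2.0, 1.5

T1, A3, A4 = reduced_coefficients(b1, b2, mu3, mu4)
C2 = -2.0 * A3 * b1 - T1
closed_form = mu4 + 2.0 * mu3 / math.sqrt(b1 * b2 ** 2)
C = math.sqrt(C2)

print(C2, closed_form)
print(char_det(v - C, eps, b1, b2, v, mu3, mu4),
      char_det(v, eps, b1, b2, v, mu3, mu4),
      char_det(v + C, eps, b1, b2, v, mu3, mu4))
```

The closed form makes the positivity of $C^2$ evident for positive moduli, which is why the reduced system remains hyperbolic.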
5.12 Conclusions
We have formulated a new mathematical model for the Ziegler–Natta polymerization process in high pressure reactors. The operating conditions are such that the monomer (either ethylene or propylene) is liquid. A single agglomerate of catalytic particles is examined after the so-called fragmentation has occurred. The evolving system consists then of a very large number of nanospheres which during the whole process remain arranged in an agglomerate with radial symmetry. The agglomerate can be treated as a porous medium with constant porosity. Each nanosphere keeps growing due to the continuous production of polymer at the surface of the catalytic particle sitting in its core.
The new model follows some guidelines of a previous approach to the Ziegler–Natta process for low pressure reactors [1, 7] describing the system as a two-scale continuum, but introduces some radical innovations. The most important one is the study of the mechanics of the growing nanospheres (where porosity was also considered constant in Ref. [7]) based on the general mixture theory of Ref. [13] with the technique of evolving natural configurations [2]. Of course a remarkable difficulty is represented by the production of polymer and the consequent necessity of specifying the stress state of the newly born material.
This approach leads to a very complicated nonlinear hyperbolic system with even more complicated boundary conditions. Indeed the condition to be imposed on the stress at the outer surface of a growing nanosphere involves the dynamics of a whole portion of the agglomerate, and the condition at the catalyst surface links the stress to the entire evolving polymer shell. Couplings between the large scale (agglomerate) and the small scale (nanospheres) are much stronger than in the previous simplified model for low pressure reactors. In the present study we limited the theoretical analysis of the model to checking the consistency of the boundary conditions naturally suggested by physics with the structure of the hyperbolic system governing the evolution of the microspheres, according to Li Ta-Tsien's theory [16]. We believe that this work provides a new insight into and a deeper knowledge of such an economically relevant industrial process.
Further studies will follow which we hope can lead to the possibility of performing simulations and obtaining quantitative predictions and technically relevant information for optimizing the process.
REFERENCES
1. D. Andreucci, A. Fasano, and R. Ricci, Modello matematico di replica nel caso limite di distribuzione continua di centri attivi, in Simposio Montell 96, S. Mazzullo and G. Cecchi (eds.), Ferrara, 1996, SATE, pp. 154–174.
2. K. R. Rajagopal and A. R. Srinivasa, A thermodynamic framework for rate type fluid models, J. Non-Newtonian Fluid Mech., 88 (2000), 207–227.
3. P. Galli, Modello di crescita di un polimero e fenomeno della replica: dalla teoria alla conferma sperimentale, in Simposio Montell 96, S. Mazzullo and G. Cecchi (eds.), Ferrara, 1996, SATE, pp. 12–40.
4. S. Pizzi, The multigrain two-scale model for the Ziegler–Natta polymerization process with fragmentation of the catalytic aggregate, MECCANICA, 35 (2000), 312–323.
5. R. L. Laurence and M. G. Chiovetta, Heat and mass transfer during olefin polymerization from the gas phase, in Polymer Reaction Engineering, K. H. Reichert and W. Geisler (eds.), Munich, Hanser, publisher, 1983, pp. 73–112.
6. D. Andreucci and R. Ricci, Mathematical problems in Ziegler–Natta polymerization process, in Complex Flows in Industrial Processes, A. Fasano (ed.), MSSET Birkhauser, Boston, 2000, pp. 215–238.
7. A. Fasano, Mathematical models for polymerization processes of Ziegler–Natta type, in Mathematical Modelling for Polymer Processing, V. Capasso (ed.), Vol. 2 of Mathematics in Industry, ECMI, Springer, 2002, pp. 3–28.
8. A. Mancini, Numerical Simulations of the Ziegler–Natta Process. I2T3 Internal Report, 2001.
9. G. Mei, Modello polimerico multigrain e double grain, in Meccanismi di accrescimento di poliolefine su catalizzatori Ziegler–Natta, S. Mazzullo and G. Cecchi (eds.), Ferrara, 1996, SATE, pp. 135–153.
10. A. Fasano, A. Mancini, and R. Ricci, Solid core revisited, in Free Boundary Problems, P. Colli, C. Verdi and A. Visintin (eds.), Vol. 147, International Series of Numerical Mathematics, Birkhauser, 2003, pp. 139–149.
11. J. Bear, Dynamics of Fluids in Porous Media, American Elsevier, Boston, 1972.
12. T. C. Hales, The Kepler conjecture. http://www.math.pitt.edu/thales/PUBLICATIONS/kepler.pdf, 1998.
13. K. R. Rajagopal and L. Tao, Mechanics of Mixtures, Vol. 35, World Scientific, 1995.
14. K. Kannan and K. R. Rajagopal, A thermomechanical framework for the transition of a viscoelastic liquid to a viscoelastic solid, Math. Mech. Solid., 9 (2004), 37–59.
15. R. W. Ogden, Non-linear Elastic Deformations, Dover, New York, 1998.
16. L. Ta-Tsien and W. Ci Yu, Boundary Value Problems for Quasilinear Hyperbolic Systems, Vol. V, Duke University Mathematics Series, Mathematics Department, Duke University, Durham, NC, 1985.
CHAPTER SIX

Pseudofluids
Gianfranco Capriz∗

Contents
6.1 Preamble
6.2 Material Element
6.3 Basic Fields
6.4 Measures of Deformation and Distorsion
6.5 Strain Rates and Distorsion Rates
6.6 Inertia Measures
6.7 Relations with Thermal Concepts
6.8 Balance Equations
6.9 Boundary Conditions: Sample Flows
6.10 A Lagrangian Approach
Abstract
Pseudofluid is a borrowed name to designate here a soft continuum in which torques have a central role, as in a hyperfluid, but which, at bottom, shares, on the one hand, certain properties of hypoelastic bodies and, on the other hand, some of kinetic gases. The balance equations were sought, at first, to deal with certain flows of granular materials where random speeds of agitation could be of the same order as the average speed; however, those equations may have appeal and bearing in other contexts so that prospects opened are many. Preliminaries, elsewhere elementary, are here as inspired by the kinetic theory.

Key Words: Multifield theories, Granular materials, Kinetic theory
6.1 Preamble
Ordinary gases and granular materials are modelled, mainly, on the basis of kinetic theory; attention is focused on the consequences of collisions, on their mathematical representation by way of a Boltzmann integral over the space of peculiar velocities and on means of integrating the equation ruling the distribution function.

∗ Dipartimento di Matematica, Università di Pisa, Italy; e-mail: [email protected]

Peculiar velocities are the velocities of the molecules with respect to a reference translating with the centre of gravity. Under some circumstances, fortunately rare, several consequences of this approach are affected, if modestly, by a rotation of the observer and, in principle at least, attendant corollaries fail to be objective. The difficulty may be redressed by detracting the effects of observer rotation; however, such patching up, though effective, is not very elegant. An intrinsically valid approach should be based on the deployment, at each place, of a reference (with the origin still moving as mentioned above, but) with an orientation exacted of the complex kinetic behaviour of the molecules within each element rather than left to the whim of the observer. Such an approach is followed here with sundry quaint consequences. In some respects, however, the present essay is seriously wanting: no attention is paid to the collision term. Rather, borrowing suggestions from extended thermodynamics and relying on hints from standard continuum mechanics, an appropriate brew of stress and hyperstress is introduced and special constitutive laws are suggested depicting the effects of collisions and cohesion. As happens in extensions of the standard kinetic theory which account for rotations of the molecules, besides the equation of Cauchy, a Cosserat balance law emerges, but for tensor moments. That law involves also a new tensor (Reynolds'), quadratic in the peculiar velocities, which satisfies a last equation, which expresses a balance for a tensorial mechanical energy due to the agitation of the molecules. Thus, in a sense, the boundary between mechanical and thermal phenomena is pushed forward in favour of the former.
Continua for which the set of balances just quoted is appropriate could be named hypofluids in analogy with nomenclature within the theories of elasticity, but also, vice versa, hyperfluids because densities of moments are involved; hence the choice of the name pseudofluids, bearing in mind the physical behaviour of the media, "soft matter", they are intended to portray.
6.2 Material Element
At the outset a chief question must be answered: what is a material element or, rather, what is, exactly, here, the depth of its portrayal within our mathematical model? The answer is: we use two scales; within the gross scale the element is seen simply as a point $x$ in the Euclidean space $E^3$; tinier details belong to a minute scale and comprise often, at least in principle as within the present analysis, also the span $e$ of the element in $E^3$, i.e. a "microscopic" neighbourhood of $x$. The space of minutiae (information about $e$ apart, if it is needed at all) must be chosen as expedient. It may even be a functional space; more often it is a manifold of finite dimension (perhaps a group), as occurs when modelling nematic liquid crystals. On occasion the manifold collapses, conveniently, into a linear space, even into another, now local, Euclidean space, the typical example of the latter being multipolar continua; care must be taken then to avoid equivocations.
Hence it is worthwhile, even if indeed repetitious, to emphasize that although, coarsely, the element is replaced by a point, still, while instituting the prolegomena of the theory at least, one needs to magnify the element enormously and meditate over all its morphological complexities to extract the essential traits and measure them with appropriate variables, which will become the values at x of fields over the body. Here, as in the kinetic theory of gases, the element is imagined to comprise numerous specks of matter, or molecules; the latter are entrained by the gross motion but, besides, are prone to a peculiar agitation. Within the kinetic theory the essential ingredient is the distribution function θˆ of the peculiar velocities c of the molecules (measured with respect to a reference translating with x) irrespective of their place y within the element; thus c may range over the whole vector space V. To be precise, θˆ is the number density (within V) of the molecules belonging to e, so that the integral over V of θˆ (c) is equal to the total number of molecules in e, a number presumed constant throughout. That integral, when multiplied by the mass μ of one molecule, is reputed to be equal to the product of meas(e) by the gross density ρ at x. Again, as in the standard theory, x is imagined as the centre of mass of the molecules in e so that the integral over V of c by θˆ (c) vanishes. Sadly, the averages based on θˆ are marred by lack of objectivity. The blemish can be hardly diagnosed within the standard kinetic theory: one deals then with circumstances where the values of |c| are extremely larger than the speed |v| (v = x˙ ) attributed to the element or those which may ensue from a rotation of an observer. It is exhibited only when pressing on to approximations of higher order, such as in the Enskog expansion, or whenever the kinetic theory is stretched to cover cases when |v| and the speeds |c| are of the same order (as occurs with granular media). 
The entrainment evaluated through a link with $x$ must be complemented, as mentioned in the Preamble, by an intrinsic choice of the orientation of the reference so that it fits "best" the admittedly chaotic behaviour of the molecules. Thus the evidence we require here goes deeper: to begin with, the distribution function of both the places $y$ of molecules within $e$ and their peculiar velocities $c$ must be known, each subelement at $y$ containing many molecules with velocity distribution $\theta(y, c)$ (the dependence on $x$ and time $\tau$ not being made explicit). To proceed one must muster detailed notation and reckon with some crucial amendments. Whereas, as before, the integral of $\theta$ over its domain (now, obviously, $V \times e$) gives the constant total number of molecules in $e$ and, when multiplied by $\mu$, the total constant mass, the latter may now differ from the product of the gross density $\rho$ by meas($e$), because of the possible effects of self-penetration. The latter phenomenon need not necessarily imply mass variation in $e$, as the number of molecules leaving $e$ may well be, and often is, the same as the number entering it. In fact, such is the root presumption in our developments, so that, like before:

$$\rho\,\mathrm{meas}(e) = \mu\int_e\int_V \theta(y, c) \quad\text{and}\quad \int_e\int_V y\,\theta(y, c) = 0. \tag{6.1}$$

For any sufficiently smooth function $\psi$ of places and time (and indicating by a dot a time derivation), because of the presupposed conservation of mass of the element:

$$\left(\int_e\int_V \psi\theta\right)^{\!\cdot} = \int_e\int_V \dot\psi\,\theta. \tag{6.2}$$
To proceed, a change of variables (particularly, but not only in $\theta$) is crucial so that independence of the observer is easily validated, whenever necessary. For that purpose, an essential ingredient is Euler's inertia tensor per unit mass:

$$Y = \int_e\int_V \theta(y, c)\, y\otimes y. \tag{6.3}$$

Through $Y$, one can construct an intrinsic metric at $x$, its principal axes offering also the sought intrinsic reference there. Actually, one may venture beyond and introduce a non-dimensional tensor $G$, determined by $Y$, an orthogonal factor apart:

$$G = (\mathrm{meas}(e))^{-\frac{1}{3}}\, Y^{\frac{1}{2}} R. \tag{6.4}$$

$R$ leads from a reference, arbitrarily chosen so far, to the intrinsic present one. That reference need not be, nor is it convenient that it be, independent of time; its time dependence may rather be selected appropriately so as to satisfy an additional condition, as we shall see. A constant orthogonal factor (unaffected by an added rigid rotation of the body as a whole or by a rotation of anyone observing the motion) could always be introduced if convenient in solving special problems. The formulae which involve $G$ below must be, generally, indifferent to changes by such last factor, as is trivially the case for an expression of $Y$ in terms of $G$, namely:

$$Y = (\mathrm{meas}(e))^{\frac{2}{3}}\, G G^T. \tag{6.5}$$
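The passage from (6.4) to (6.5) is a one-line computation, spelled out below as a check. It uses only that $Y$ is symmetric and positive definite (so that $Y^{1/2}$ is symmetric) and that $R$ is orthogonal.

```latex
G G^{T}
  = (\mathrm{meas}(e))^{-\frac{2}{3}}\, Y^{\frac12} R\, R^{T} \bigl(Y^{\frac12}\bigr)^{T}
  = (\mathrm{meas}(e))^{-\frac{2}{3}}\, Y^{\frac12}\, Y^{\frac12}
  = (\mathrm{meas}(e))^{-\frac{2}{3}}\, Y
\quad\Longrightarrow\quad
Y = (\mathrm{meas}(e))^{\frac{2}{3}}\, G G^{T}.
```

The same computation shows that replacing $G$ by $GQ$, with $Q$ orthogonal, leaves $GG^T$, and hence $Y$, unchanged.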
Of course, $G$ determines $R$ uniquely: $R = (GG^T)^{-\frac{1}{2}} G$. Moreover, $G^{-1}$ can be used as the pull-back of any vector in $e$, $y$ in particular, into the reference; in earlier papers the notation $s$ was used for the pull-back of $y$, namely $y = Gs$. Obviously the pull-back of $Y$ is the spherical tensor $(\mathrm{meas}(e))^{\frac{2}{3}} I$ ($I$ the unit tensor). In this sense, $G$ extracts whichever affine displacement fits globally best (best also in another capacity to be made precise below) the actual displacements of the molecules; other changes are attributed to agitation proper, left unknowable in detail. In fact, the only other total over the subelement at $s$ is the velocity attributed to it, which is the average over $c$ at $s$:

$$\dot y = \dot G s + G\dot s = \frac{\int_V c\,\theta(y, c)}{\int_V \theta(y, c)}. \tag{6.6}$$

This decision can be intended also as an indirect definition of $\dot s$. The distribution $\hat\theta$ mentioned above can be got by integration over $e$:

$$\hat\theta(c) = \int_e \theta(y, c). \tag{6.7}$$

Exclusive use of $\hat\theta$, however, bars the road to a definition of $Y$ (hence of $G$); for which purpose a sort of dual distribution

$$\tilde\theta(y) = \int_V \theta(y, c) \tag{6.8}$$

suffices instead, because, obviously,

$$Y = \int_e \tilde\theta(y)\, y\otimes y. \tag{6.9}$$

A decision on the restricted alternative recourse either to $\hat\theta(c)$ or to $\tilde\theta(y)$ marks the junction where extended thermodynamics and the present sort of extended mechanics essentially separate. The explicit variable in $\tilde\theta$, above, is $y$, but we will write from now on $s$ instead, the dependence on $G$ being hidden again in the unexpressed dependence on $x$ and $\tau$. Remark that, from (6.1) and (6.6):

$$\int_e s\,\tilde\theta(s) = 0, \qquad \int_e \dot s\,\tilde\theta(s) = 0. \tag{6.10}$$

Actually, when operating on $\theta$, $G$ and $\dot G$ are intended to be known; in the ensuing field theory, known as functions of $x$ and $\tau$. Therefore, in $\theta$, they will be hidden through the implicit dependence on $x$ and $\tau$ so that the only arguments written explicitly for $\theta$ will be $s$ and $c$ (the latter, to respect traditional notation, is written in place of $\dot s$).
6.3 Basic Fields
Primary kinetic fields, in standard theories, are gross density $\rho$, gross velocity $v$, an ensuing kinetic energy per unit mass, namely $\kappa = \frac{1}{2}|v|^2$, the placement gradient $F$ and velocity gradient $L = \mathrm{grad}\, v = \dot F F^{-1}$.
Here, further fields are introduced; $Y$ and $G$ were already mentioned in Section 6.2. In addition:
(i) $B = \dot G G^{-1}$, matching and set against $L$ at the level of the element and relevant in a corollary of (6.5), namely:

$$\dot Y = BY + YB^T. \tag{6.11}$$

(ii) The tensor moment of momentum

$$K = \int_e\int_V \theta(y, c)\, y\otimes c = \int_e \tilde\theta(s)\, y\otimes\dot y, \tag{6.12}$$

the symmetric part of which is bound to $\dot Y$ through a corollary of (6.3) and (6.9):

$$\dot Y = 2\,\mathrm{sym}\, K. \tag{6.13}$$
Remark. Knowledge of the restricted distribution $\tilde\theta$ suffices for the appropriate evaluation of $Y$ and $K$ provided that $\dot y$ in equation (6.12) be the quantity introduced in equation (6.6). Some of the next definitions are in principle deficient.
(iii) The kinetic energy per unit mass should be

$$\frac{1}{2}|v|^2 + \frac{1}{2}\int_e\int_V \theta(y, c)\, c^2, \tag{6.14}$$

but we will use also a "reduced" value, denoted by $\tilde\kappa$, given, with the use of the distribution $\tilde\theta$, by

$$\tilde\kappa = \frac{1}{2}|v|^2 + \frac{1}{2}\int_e \tilde\theta\, |\dot y|^2; \tag{6.15}$$

it is $\tilde\kappa$, rather than $\kappa$, that will enter our extended mechanics; thus, implicitly, we leave the deficiency $\frac{1}{2}\int_e\left[\int_V \theta(y, c)\, c^2 - \frac{1}{\tilde\theta}\left(\int_V \theta(y, c)\, c\right)^2\right]$ to be compensated essentially by a thermal contribution.
(iv) Similarly, the full kinetic energy tensor

$$W = \frac{1}{2}\int_e\int_V \theta(y, c)\,(v + c)\otimes(v + c) \tag{6.16}$$

is accompanied by a reduced version

$$\tilde W = \frac{1}{2}\int_e \tilde\theta(s)\,(v + \dot y)\otimes(v + \dot y). \tag{6.17}$$

(v) A full Reynolds' tensor

$$H = \int_e\int_V \theta(y, c)\, c\otimes c = \int_V \hat\theta(c)\, c\otimes c \tag{6.18}$$

is accompanied by a reduced one

$$\tilde H = \int_e \tilde\theta(s)\,(\dot y - By)\otimes(\dot y - By) = G\left(\int_e \tilde\theta(s)\,\dot s\otimes\dot s\right) G^T. \tag{6.19}$$
Comparison of (6.11) with (6.13) suggests that $K$ might coincide with $YB^T$. However, for the time being, only their symmetric parts agree; still one can play with the partial indetermination of $G$ to ensure complete equality. In fact, substitute $G$ with $G' = GQ$; $Q$, any orthogonal tensor, arbitrary so far, as allowed by the acknowledged ambiguity. Then $B$ changes into

$$B' = (\dot G Q + G\dot Q)\, Q^T G^{-1} = \dot G G^{-1} + G\dot Q Q^T G^{-1}, \tag{6.20}$$

so that $YB^T$ is modified by the addition of the term $YG^{-T} Q\dot Q^T G^T$. Finally, we can choose $B'$ through a wise selection of $Q$ (or, rather, of its time-dependence, i.e. of the angular velocity associated with $\dot Q Q^T$) to ensure the coincidence of $YB^T$ with $K$. The proof requires elementary manipulations of a fourth-order tensor $G\otimes(G^{-1}Y) - (YG^{-T})\otimes G^T$ appropriately doubly contracted with Ricci's alternating tensor.
In the event, one can even imagine a reversed process, where an archetype $e_*$ of the element exists, or is conveniently invented, within which the agitation of the molecules is ruled by $s$ and the distribution function is again $\tilde\theta$ but multiplied by $\det G$. Within $e$, instead, the actual law of motion is given by $y(\tau)$; the latter motion is envisioned as split into an affine motion governed by $G$, and an irregular one duly determined by $s$, $G$ being chosen so that the tensor moment of momentum $K$ becomes equal to $YB^T$. Such judicious splitting implies by (6.3), (6.6) and (6.12) this result: the tensor moment of agitation in $e_*$ vanishes, namely:

$$\int_{e_*} (\det G)\,\tilde\theta(s)\, s\otimes\dot s = 0. \tag{6.21}$$

In other words, returning to the original process, the reference determined by $Q$ is one with respect to which (6.21) applies. In terms of $\dot s$, the reduced Reynolds' tensor becomes

$$\tilde H = G\tilde H_* G^T, \qquad \tilde H_* = \int_{e_*} \tilde\theta(s)\,(\det G)\,\dot s\otimes\dot s, \tag{6.22}$$

whereas, in view of (6.21), $\tilde W$ can be given a compact form

$$\tilde W = \frac{1}{2}\, v\otimes v + \frac{1}{2}\, BYB^T + \frac{1}{2}\,\tilde H. \tag{6.23}$$

Actually it is easy to prove that the choice made for $G$ is exactly one that makes the difference $\tilde\kappa - \frac{1}{2}(|v|^2 + \mathrm{tr}\, BYB^T)$ a minimum; also in this sense the option suggested for $G$ is "best fitting".
Indeed that difference amounts to $\frac{1}{2}\int_{e_*} \tilde\theta(s)\,(\dot y - By)^2$, a quantity the derivative of which with respect to $B$ is exactly $K - YB^T$.
A few other totals embody estimates of inertia; they are integrals where $\ddot y$ enters as a factor. Obviously one gets

$$\int_e \tilde\theta(s)\,\ddot y = 0, \tag{6.24}$$

so that $\ddot x$ still measures the resultant inertia of molecules in $e$ per unit mass. Besides, we need the tensor moment of inertia

$$\int_e \tilde\theta(s)\, y\otimes\ddot y = \left(\int_e \tilde\theta(s)\, y\otimes\dot y\right)^{\!\cdot} - \int_e \tilde\theta(s)\,\dot y\otimes\dot y = \dot K - BK - \tilde H \tag{6.25}$$

and the reduced Reynolds' inertia tensor

$$\int_e \tilde\theta(s)\,(G\dot s\otimes\ddot y + \ddot y\otimes G\dot s) = \dot{\tilde H} + \int_e \tilde\theta(s)\left((G\dot s)\otimes(\ddot G s + \dot G\dot s) + (\ddot G s + \dot G\dot s)\otimes(G\dot s)\right) = \dot{\tilde H} + \tilde H B^T + B\tilde H. \tag{6.26}$$
To obtain the last result, (6.21) is essential. Remark. All the definitions and results above signify that κ, H , W have their main role within extended thermodynamics, whereas the quantities shown by the same letters capped with a tilde prevail in what we have suggested to call extended mechanics. A deeper study of relations between quantities in the two sets would be worthwhile but it is not pursued here. Below, to simplify notations we seem to forget the differences and do not use alternative notation; however, the context makes it obvious which choice applies.
6.4 Measures of Deformation and Distorsion
In the approach and interpretation of Section 6.3, an inherent local frame is assigned (with origin $x$ and orientation fixed by the non-singular tensor $G$); inherent in the sense that the residual "peculiar" kinetic energy of the randomly varying granules about it is minimal. If, quite exceptionally, an archetypal setting of the element existed or, as we put it, might at least be invented without detriment, then a unique choice of $G$ (and for $F$) could ensue. Otherwise, only the rate $B$ has proper physical meaning (as $L$ has in the habitual theory), measuring changes from the present state.
Of course, in any case, $F$ and $G$ may combine into locally legitimate entities: both $FF^T$ and $GG^T$, not only the latter, may be interpreted as local metrics (interelement and inherent, respectively); $FG^{-1}$ may be taken as a persuasive local measure of stretch, whenever referential measures are physically meaningless. If we run again through the definition of quantities in Section 6.3, we realize that we refer to a fixed set of molecules supposed to belong instantaneously to $e$ as if it were a monad, whereas the element is part of a continuum, hence facing other elements all around (and that perception is mandatory to determine $F$, of course). The remark regards $G$, in particular; it emphasizes that $G$ accounts for all of:
(i) the influence within the element of the macrostretch measured locally by the placement gradient $F$;
(ii) the rearrangements of granules within the stretched element (for a given mass a "crust" leads to a larger value of $Y$ than a "compact ball");
(iii) the self-penetration of granules into and from neighbouring elements (due, say, to an excess of homogeneous apparent microstretch over the macroscopic one); for such an occurrence the name protrusion has been suggested.
The last two effects together influence the value of the tensor $FG^{-1}$, whereas the rate of self-penetration alone may be related to the tensor $H$ as described in the next section. When the alternative (iii) is significant and the gain and loss of granules is unbalanced, then, contrary to the initial tenets, the element becomes a system with variable mass; the corresponding oversight should be corrected. To achieve the required correction we need to estimate the local mass variation per unit volume.
A discrepancy between F and G by itself is not sufficient, even in the absence of effect (ii), to imply mass variation; it might simply give a hint as to the extent of self-penetration, a measure of which could be the different change of volume attributed to the element by F and G: $\alpha = \det(FG^{-1})$, leading to the rate
$$\dot\alpha = \alpha\,\operatorname{tr}(L - B) = \alpha\,(\operatorname{div} v - \operatorname{tr} B). \tag{6.27}$$
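A quick numerical check of (6.27) may reassure the reader. The sketch below is only an illustration under assumptions of ours: Python with NumPy, randomly chosen sample tensors, and the hypothetical helper names alpha and alpha_rate. It compares a central finite difference of α along a straight path in F and G with the right-hand side α tr(L − B).

```python
import numpy as np

def alpha(F, G):
    """Volume ratio alpha = det(F G^{-1})."""
    return np.linalg.det(F @ np.linalg.inv(G))

def alpha_rate(F, G, F_dot, G_dot):
    """Right-hand side of (6.27): alpha * tr(L - B)."""
    L = F_dot @ np.linalg.inv(F)   # L = F_dot F^{-1}
    B = G_dot @ np.linalg.inv(G)   # B = G_dot G^{-1}
    return alpha(F, G) * np.trace(L - B)

# Sample data (assumed, not taken from the text): tensors close to the identity.
rng = np.random.default_rng(0)
F = np.eye(3) + 0.1 * rng.standard_normal((3, 3))
G = np.eye(3) + 0.1 * rng.standard_normal((3, 3))
F_dot = rng.standard_normal((3, 3))
G_dot = rng.standard_normal((3, 3))

# Central finite difference of alpha along F + t F_dot, G + t G_dot at t = 0.
dt = 1e-6
fd = (alpha(F + dt * F_dot, G + dt * G_dot)
      - alpha(F - dt * F_dot, G - dt * G_dot)) / (2 * dt)
exact = alpha_rate(F, G, F_dot, G_dot)
print(fd, exact)
```

The two printed numbers should agree to within the accuracy of the finite difference; the check says nothing, of course, about the physical interpretation of α.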
Rather it is only the possible presence of a relatively steep change of gradient of $\alpha$ that is accompanied by a mass transfer. A rough estimate of mass variation per unit volume could be $\Delta(\rho\alpha)$, where $\Delta$ denotes the Laplacian. In fact, for a spherical element, one finds
$$(\operatorname{meas}(e))^{-1}\int_{\partial e}\operatorname{grad}(\rho\alpha)\cdot n = \Delta(\rho\alpha), \tag{6.28}$$
which is a measure of the “extra matter”, as in the theory of dislocations. We will keep an ambiguous stance on this issue and proceed as though the definitions of momenta and their moments were not affected, delaying amendments and delegating responsibility for them to a later stage, when constitutive laws are selected. Then, the whole gradient of $FG^{-1}$ may be involved rather than
Pseudofluids
only the gradient of $\alpha$. Clearly, direct estimates of momentum losses and gains, to subtract from or add to $\rho v$, K and H, would be more appropriate; they would, indeed, involve the whole gradient and its time rate.

Together with the local metrics $\tilde C = FF^T$ and $\tilde N = GG^T = (\operatorname{meas}(e))^{-2/3}\,Y$, the relative gradient $\tilde X = FG^{-1}$ must be relevant, although the latter is not totally independent of $\tilde C$ and $\tilde N$. In fact, if F and G are split into the products $\tilde C^{1/2}R$ and $\tilde N^{1/2}R'$, respectively, then $\tilde X$ appears as the product
$$\tilde X = \tilde C^{1/2}\,R\,(R')^T\,\tilde N^{-1/2}, \tag{6.29}$$
which puts in evidence the independent factor
$$\tilde Q = R\,(R')^T. \tag{6.30}$$
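The factorization (6.29) can be verified on sample data. The following sketch assumes NumPy and obtains the left polar decompositions through a singular value decomposition; the helper left_polar and the random sample tensors are our own illustrative choices, not part of the theory.

```python
import numpy as np

def left_polar(A):
    """Split A into V @ R with V = (A A^T)^(1/2) symmetric and R orthogonal."""
    U, s, Vt = np.linalg.svd(A)
    return U @ np.diag(s) @ U.T, U @ Vt

# Sample tensors (assumed for illustration only).
rng = np.random.default_rng(1)
F = np.eye(3) + 0.2 * rng.standard_normal((3, 3))  # placement gradient
G = np.eye(3) + 0.2 * rng.standard_normal((3, 3))  # inherent tensor

C_half, R = left_polar(F)        # F = C~^(1/2) R
N_half, R_prime = left_polar(G)  # G = N~^(1/2) R'
Q_tilde = R @ R_prime.T          # independent factor, as in (6.30)

X_tilde = F @ np.linalg.inv(G)                        # definition of X~
X_rebuilt = C_half @ Q_tilde @ np.linalg.inv(N_half)  # right-hand side of (6.29)
print(np.allclose(X_tilde, X_rebuilt))
```

The tensor Q_tilde is orthogonal by construction, so the only information in $\tilde X$ not already carried by the two metrics is a rotation, as stated above.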
We have inflicted a tilde upon some symbols above to reserve the bare letters for the near standard case, when a tangible archetype or reference exists and the appropriate strain tensors are
$$C = F^TF, \qquad N = G^TG. \tag{6.31}$$
Then, F and G are split into the products $RC^{1/2}$ and $R'N^{1/2}$, respectively, the relative gradient is $X = G^{-1}F$ and the relevant independent rotation is best measured by $Q = (R')^TR$, with component tensors the same as before. Also, gradients can be evaluated on the reference (capital letters being used for the corresponding operators: Grad, Div, Rot).

There are many fashions to measure how discrepant G is from a true gradient; they all imply, of course, the behaviour of G in neighbouring elements. If an archetype is available, one could examine, simply, how far from vanishing is the skew part (in the last two indices) of Grad G or, better, of $G^T\,\mathrm{Grad}\,G$; even in that case, an alternative sometimes convenient exists in the choice of $n = \mathrm{Grad}\,N$. Notions of torsion, dislocation density and Burgers vector ensue. However, those notions have citizenship also in a purely local theory, always, if indirectly, via the tensor G. The local wryness w could be gauged by
$$w = \operatorname{grad}\tilde X, \tag{6.32}$$
the local torsion h by
$$h = \tfrac12\,(w - w^t) \tag{6.33}$$
(where the exponent t indicates transposition of the last two indices of the third-order tensor w), the dislocation density by
$$e_{iab}\,h_{jab} \tag{6.34}$$
(where e is Ricci's permutation tensor; repeated indices are used to indicate contraction) and the Burgers vector b, relative to any plane of unit normal vector n, by
$$b_i = e_{iab}\,h_{jab}\,n_j. \tag{6.35}$$
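The index contractions in (6.33)–(6.35) are easy to mistype; a small sketch may help. It assumes NumPy, stores the wryness as a 3 × 3 × 3 array w[i, j, k], and uses function names of our own invention; the sample wryness is purely illustrative.

```python
import numpy as np

def permutation_tensor():
    """Ricci's permutation tensor e_ijk in three dimensions."""
    e = np.zeros((3, 3, 3))
    e[0, 1, 2] = e[1, 2, 0] = e[2, 0, 1] = 1.0
    e[0, 2, 1] = e[2, 1, 0] = e[1, 0, 2] = -1.0
    return e

def torsion(w):
    """h = (w - w^t)/2, with t swapping the last two indices, as in (6.33)."""
    return 0.5 * (w - np.swapaxes(w, 1, 2))

def dislocation_density(h):
    """Second-order tensor with components e_iab h_jab, as in (6.34)."""
    return np.einsum("iab,jab->ij", permutation_tensor(), h)

def burgers_vector(h, n):
    """b_i = e_iab h_jab n_j for a plane of unit normal n, as in (6.35)."""
    return dislocation_density(h) @ n

# Illustrative wryness with a single non-zero component w_012 = 1.
w = np.zeros((3, 3, 3))
w[0, 1, 2] = 1.0
h = torsion(w)
D = dislocation_density(h)
b = burgers_vector(h, np.array([1.0, 0.0, 0.0]))
print(D)
print(b)
```

A wryness symmetric in its last two indices yields vanishing torsion, hence vanishing dislocation density and Burgers vector, which is the discrete counterpart of G being locally a true gradient.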
As in the standard case, if both N and C degrade to the identity, the element (agitation apart) and its immediate neighbourhood (to the first order) are each involved only in a rigid rotation. The two rotations may be different, however, if Q itself does not reduce to the identity. Incidentally, even if locally G itself is the identity, wryness and torsion may not vanish there.

In the absence of effect (ii) mentioned at the beginning of Section 6.4, a measure of protrusion from or intrusion into the element could be achieved by imagining a unit vector $n_*$ within the archetype, as it is changed into $Gn_*$ and $Fn_*$ by the two relevant tensors. Protrusion or intrusion occurs in the direction of $Gn_*$ depending on whether $|Gn_*| \gtrless (Fn_*)\cdot(Gn_*)\,|Gn_*|^{-1}$. Hence, the unitary measure could be
$$1 - m\cdot\tilde Xm \qquad (m = Gn_*\,|Gn_*|^{-1}). \tag{6.36}$$
6.5 Strain Rates and Distorsion Rates

The tensors connected with deformation and distorsion (of the inner element and within its immediate neighbourhood) are mirrored in kinetic tensors linked with time rates. Again, direct recourse to derivatives is less suitable than slightly more elaborate options, which lessen or avoid altogether observer dependence. $L = \dot FF^{-1}$ and $B = \dot GG^{-1}$ have already been mentioned. They are affected by a superimposed rigid motion each through the addition of a common skew tensor. Thus $L - B$ is not modified, neither are sym L and sym B. It is interesting to observe that
$$\dot C = 2F^T(\operatorname{sym}L)F, \tag{6.37}$$
$$\dot N = 2G^T(\operatorname{sym}B)G, \tag{6.38}$$
$$\dot X = G^{-1}(L - B)F, \tag{6.39}$$
whereas
$$\dot{\tilde C} = 2\operatorname{sym}(L\tilde C), \qquad \dot{\tilde N} = 2\operatorname{sym}(B\tilde N), \qquad \dot{\tilde X} = L\tilde X - \tilde XB. \tag{6.40}$$
Note also that the expression of skw(L − B) is not as compact as that of sym(L − B):
$$2\operatorname{skw}(L - B) = 2R'\dot QR^T + R\big[(C^{1/2})^{\cdot}C^{-1/2} - C^{-1/2}(C^{1/2})^{\cdot}\big]R^T - R'\big[(N^{1/2})^{\cdot}N^{-1/2} - N^{-1/2}(N^{1/2})^{\cdot}\big](R')^T, \tag{6.41}$$
whereas, of course,
$$2\operatorname{sym}L = R\big[(C^{1/2})^{\cdot}C^{-1/2} + C^{-1/2}(C^{1/2})^{\cdot}\big]R^T = RC^{-1/2}\big[C^{1/2}(C^{1/2})^{\cdot} + (C^{1/2})^{\cdot}C^{1/2}\big]C^{-1/2}R^T = F^{-T}\dot CF^{-1} \tag{6.42}$$
and similarly for sym B.

Again, as in the standard case, if $\dot C$ and $\dot N$ vanish, then the instantaneous velocity fields within the element (agitation apart) and in its immediate neighbourhood (to the first order) are those of a rigid motion. The two rotational speeds may differ, however; although $\dot X$ reduces then to $N^{-1/2}\dot QC^{1/2}$, it need not vanish; for that to occur, also $\dot Q$ must go to zero.

Vice versa, returning now to the general case, equality of skew L with skew B does not imply necessarily the vanishing of $\dot Q$: the symmetric parts of L and B may be such as to imply different rates of change in the orientation of the principal axes of C and N.

As for the time rate of wryness, it depends on the choice in the definition of the latter. If one uses the third-order tensor $n_{ABC}$, then its time-derivative differs from a symmetric part of the third-order tensor $G_{Ai}G_{Bj}F_{kC}\,b_{ijk}$, $b_{ijk} = B_{ij,k}$, by an additional term proportional to sym B:
$$b_{ijk} + b_{jik} = G^{-1}_{Ai}G^{-1}_{Bj}F^{-1}_{Ck}\,\dot n_{ABC} - \big[G^{-1}_{Ai}(B_{jr} + B_{rj}) + G^{-1}_{Aj}(B_{ri} + B_{ir})\big]G_{rA,C}F^{-1}_{Ck}. \tag{6.43}$$

So far there are hardly novelties: formulae similar to those cited above, perchance with different interpretations, could be unearthed from texts on complex materials with affine microstructure. Rather, novel to some extent, is the way one profits here from the tensor H, some disguise of which has such a prominent part in studies of turbulence. The exegesis goes as follows. Recall that, in classical fluid mechanics, the gross interpretation of L or, better, of its symmetric and skew components is pervasive and very serviceable, random molecular motion notwithstanding. Still, to hide the effects of the latter motion outright within the thermodynamic maelstrom may, on occasion and particularly when coarser granulosity is involved, curtail correct perception of phenomena. A first crude grasp of that recondite behaviour
is offered, indeed, by the tensor H through its mesoscopic interpretation as follows:

(i) Write H in its canonical form highlighting non-negative eigenvalues $(\chi^{(s)})^2$ and unit eigenvectors $h^{(s)}$:
$$H = \sum_{s=1}^{3}(\chi^{(s)})^2\,h^{(s)}\otimes h^{(s)}. \tag{6.44}$$

(ii) Caricature the original definition of H by reading directly (6.44) as the sum of three terms as though the population of molecules (all having the same mass, as always presumed here) were spread among three tribes; within the sth tribe all the molecules moving along the line spanned by $h^{(s)}$ with speed $|\chi^{(s)}|$; each tribe split into two subtribes, equally numerous but with opposing velocities ($\pm\chi^{(s)}h^{(s)}$). Alternatively, one may imagine all molecules to have (not only the same mass but also) the same speed, but the fraction of those moving along $h^{(s)}$ to be
$$\frac{(\chi^{(s)})^2}{\sum_{s=1}^{3}(\chi^{(s)})^2}. \tag{6.45}$$

With such an interpretation in mind it becomes obvious that $H^{1/2}$ is the tensor regulating the balanced cross-flux of molecules, $H^{1/2}n$ being a measure of the flux through a plane of unit normal n.

What was just said of H could be mirrored in an interpretation of $H_*$, intrinsic now:
$$H_* = \sum_{s=1}^{3}(\chi_*^{(s)})^2\,h_*^{(s)}\otimes h_*^{(s)} \tag{6.46}$$
with
$$h^{(s)} = \frac{Gh_*^{(s)}}{|Gh_*^{(s)}|} \tag{6.47}$$
and
$$(\chi^{(s)})^2 = \big(h_*^{(s)}\cdot Nh_*^{(s)}\big)\,(\chi_*^{(s)})^2. \tag{6.48}$$
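The mesoscopic reading of H in steps (i) and (ii) amounts to a symmetric eigenvalue problem. The sketch below, which assumes NumPy and uses the hypothetical helper names tribes and cross_flux together with an arbitrary sample tensor, extracts the squared speeds and directions of (6.44), the population fractions of (6.45), and the vector $H^{1/2}n$.

```python
import numpy as np

def tribes(H):
    """Squared speeds, directions and population fractions read off H."""
    chi2, h = np.linalg.eigh(H)          # eigenvalues ascending, eigenvectors in columns
    chi2 = np.clip(chi2, 0.0, None)      # guard against tiny negative round-off
    fractions = chi2 / chi2.sum()        # equal-speed reading, as in (6.45)
    return chi2, h, fractions

def cross_flux(H, n):
    """H^(1/2) n, the suggested measure of flux through a plane of unit normal n."""
    chi2, h, _ = tribes(H)
    H_half = h @ np.diag(np.sqrt(chi2)) @ h.T
    return H_half @ n

H = np.diag([1.0, 4.0, 9.0])             # illustrative agitation tensor
chi2, h, fractions = tribes(H)
flux = cross_flux(H, np.array([0.0, 0.0, 1.0]))
print(chi2, fractions, flux)
```

For this diagonal sample the three tribes move along the coordinate axes with speeds 1, 2 and 3; in the equal-speed reading their populations stand in the ratio 1 : 4 : 9.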
6.6 Inertia Measures

The totals of momenta called upon in the previous sections involve $\dot x$, $\dot G$ (or B), H; those are the quantities appearing in the expression of kinetic energy.
To achieve the balance equations which hold for x, G and H, one must first express the relevant totals of inertia. The first total (per unit mass) is again $\ddot x$ as in the standard case. The acceleration of a molecule being equal to
$$\ddot x + \ddot Gs + 2\dot G\dot s + G\ddot s, \tag{6.49}$$
the straight average over e × V is obviously $\ddot x$. To attain the tensor moment of inertia we must take the tensor product of (6.49) by Gs to the left:
$$G(s\otimes\ddot x) + G(s\otimes s)\ddot G^T + 2G(s\otimes\dot s)\dot G^T + G(s\otimes\ddot s)G^T. \tag{6.50}$$
The first and the third addendum give null totals; the total of $s\otimes\ddot s$ is the opposite of the total of $\dot s\otimes\dot s$ (again because the total of $s\otimes\dot s$, and of its time derivative, vanishes). On the other hand, by known definitions and properties
$$\dot K = YG^{-T}\ddot G^T + \dot GG^{-1}YG^{-T}\dot G^T \tag{6.51}$$
so that the total (6.50) is equal to
$$\dot K - BYB^T - H. \tag{6.52}$$
H measures the variance of the distribution of momenta rather than an average or a moment. One must multiply (6.49) tensorially to the left by $G\dot s$ and take the symmetric part before integration. That operation cancels the first two terms; the sum of the remaining two, by the definition of H and through some straightforward calculations, is found to be equal to
$$\dot H + HB^T + BH. \tag{6.53}$$
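An alternative process to obtain totals of momenta is linked with the use of variational derivatives of kinetic energy density. Write first κ so as to put in evidence $\dot x$, G, $\dot G$ and $\dot s$:
$$\kappa = \frac12\dot x^2 + \frac12(\operatorname{meas}(e))^{2/3}\,\dot G\cdot\dot G + \frac12\operatorname{tr}\Big(G\Big(\int_{e_*}\int_V\dot s\otimes\dot s\Big)G^T\Big), \tag{6.54}$$
then one gets promptly
$$\Big(\frac{\partial\kappa}{\partial\dot x}\Big)^{\cdot} - \frac{\partial\kappa}{\partial x} = \ddot x, \tag{6.55}$$
$$\Big(\frac{\partial\kappa}{\partial\dot G}\Big)^{\cdot} - \frac{\partial\kappa}{\partial G} = (\operatorname{meas}(e))^{2/3}\,\ddot G - HG^{-T}. \tag{6.56}$$
On the other hand
$$\dot K^TG^{-T} = (\operatorname{meas}(e))^{2/3}\,\ddot G + BYB^TG^{-T}, \tag{6.57}$$
hence
$$\Big(\frac{\partial\kappa}{\partial\dot G}\Big)^{\cdot} - \frac{\partial\kappa}{\partial G} = (\dot K - BYB^T - H)^TG^{-T} \tag{6.58}$$
or
$$G\Big[\Big(\frac{\partial\kappa}{\partial\dot G}\Big)^{\cdot} - \frac{\partial\kappa}{\partial G}\Big]^T = \dot K - BYB^T - H. \tag{6.59}$$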
Of course, (6.56) directly leads to a double vector which must be pushed forward with the help of G to lead to the tensor (6.25). To evaluate the relevant “virial” of inertia, one must first perform the tensor product of (6.49) with $G\dot s$ to the left and take twice the symmetric part of the resulting tensor:
$$G\dot s\otimes\ddot x + \ddot x\otimes\dot sG^T + G\dot s\otimes s\ddot G^T + \ddot Gs\otimes\dot sG^T + 2G\dot s\otimes\dot s\dot G^T + 2\dot G\dot s\otimes\dot sG^T + G\dot s\otimes\ddot sG^T + G\ddot s\otimes\dot sG^T. \tag{6.60}$$
The first four addenda give null totals. The total of the next two amounts to $2(HB^T + BH)$. To interpret the total of the last two observe that
$$\Big(\frac{\partial(\dot s\otimes\dot s)}{\partial\dot s}\Big)^{\cdot} = \ddot s\otimes I + I\otimes\ddot s. \tag{6.61}$$
The density of κ must be taken now to be per unit “volume” in e × V so that the variational derivative must be taken at that depth. That derivative contracted with $\dot s$, when integrated again, gives appropriately
$$2\dot GH_*G^T + 2GH_*\dot G^T + G\dot H_*G^T. \tag{6.62}$$
We close this section quoting the local version of inertia densities
$$\rho\Big(\frac{\partial v}{\partial\tau} + Lv\Big), \qquad \rho\Big(\frac{\partial K}{\partial\tau} + (\operatorname{grad}K)v - BK - H\Big), \tag{6.63}$$
$$\rho\Big(\frac{\partial H}{\partial\tau} + (\operatorname{grad}H)v + BH + HB^T\Big). \tag{6.64}$$
6.7 Relations with Thermal Concepts

To appreciate the proximity of features of the tensor H with those of some typical quantities common in thermodynamics, remark that if the eigenvalues of H were all equal, then
$$H = \frac13\Big(\int_e\hat\theta\,c^2\Big)I, \tag{6.65}$$
so that $\frac12\operatorname{tr}H$ would take the role of temperature of the kinetic theory. In a sense, H generalizes that concept, giving it a tensorial aspect. To approach standard developments even further, suppose that $\hat\theta$ depends only on speeds |c|, actually can be expressed in terms of $c^2$. Then, with $S^2$ the unit sphere,
$$H = \int_0^\infty c^2\,\hat\theta(c^2)\int_{S^2}(\operatorname{vers}c)\otimes(\operatorname{vers}c) = \frac43\pi\Big(\int_0^\infty c^2\,\hat\theta(c^2)\Big)I, \tag{6.66}$$
with $\operatorname{vers}c = c/|c|$. Finally, if $\hat\theta$ reduces to the canonical form
$$\hat\theta(c^2) = \frac1\beta\,e^{-\beta c^2} \tag{6.67}$$
(β, a positive constant with the dimensions of a speed square), then the integral of $c^2\hat\theta(c^2)$ can be calculated explicitly with the result
$$H = \frac43\pi\beta I, \qquad \beta = \frac{1}{4\pi}\operatorname{tr}H. \tag{6.68}$$
Countless consequences of the choice (6.67) have shown its appropriateness under many circumstances. Variants apply in particular when certain ancillary effects predominate; for instance the distribution
$$\hat\theta(c^2) = \frac1\beta\,\Big|e^{|\gamma|\beta c^2}\,(1 - e^{\gamma})^{-1} - 1\Big|^{-1} \tag{6.69}$$
with β as in equation (6.68) and
$$\gamma^2 = \big|\mathrm{Di\,log}(1 - e^{\gamma})\big|, \qquad \mathrm{Di\,log}\,\vartheta = \sum_{\iota=1}^{\infty}\frac{\vartheta^{\iota}}{\iota^2} = -\int_0^{\vartheta}\frac{\log(1-\tau)}{\tau}\,d\tau \tag{6.70}$$
is valid in principle with one of the two possible choices for γ:
γ = −0.814651… (Bose–Einstein case),
γ = 1.405050… (Fermi–Dirac case).
The option (6.67) has one, distant but distressing, corollary: the speed of propagation of certain perturbations in H is infinite; not surprisingly as, by accepting (6.67), one admits also the existence of molecules, though fewer and fewer, with speed as large as desired. An alternative to equation (6.67) limits speeds to those with square less than $c_0^2$, say
$$\hat\theta(c^2) = 0 \ \text{ for } c^2 > c_0^2, \qquad \hat\theta(c^2) = \alpha\,e^{-\alpha\beta c^2}\big(1 - e^{-\alpha\beta c_0^2}\big)^{-1} \ \text{ for } c^2 < c_0^2, \tag{6.71}$$
with α solution of
$$\alpha - 1 = \alpha\beta c_0^2\,\big(1 - e^{\alpha\beta c_0^2}\big)^{-1}, \tag{6.72}$$
and β again as in (6.68). An inspection of (6.72) requiring only accurate evaluation of order of magnitude shows that there is one and only one value of α satisfying it for $\beta c_0^2 > 1$ and there are no values of α with that property for $\beta c_0^2 < 1$. The function $\alpha(\beta c_0^2)$, thus defined in (1, +∞), is strictly increasing from −∞ to 1; it vanishes for $\beta c_0^2 = 2$; its approximate expression in the neighbourhood of 2 is $\alpha \sim \frac32(\beta c_0^2 - 2)$. When $\beta c_0^2$ tends to 1 and α to −∞, the distribution (6.71) approaches a δ-function with an atom at $\beta c_0^2 = 1$. When $\beta c_0^2$ tends to the value 2 and α to zero, the distribution (6.71) tends to be piecewise constant: $\hat\theta = \frac12$ for $0 < \beta c^2 < 2$ and null otherwise. When $\beta c_0^2$ tends to ∞ and α to 1, the distribution tends to be canonical. This distribution is reminiscent of one which is suitable for quantum systems allowing only a finite number of states; one consequence is the possible occurrence of infinite and negative “temperatures”.

To explore, in a relatively simple alternative context, the new vistas implied by the use of the full tensor H, assume $\hat\theta$ to be a measure with support on a sphere of radius $\hat c$ say, so that
$$H = \hat c^2\int_{S^2}\hat\theta(\hat c^2, \operatorname{vers}c)\,(\operatorname{vers}c)\otimes(\operatorname{vers}c). \tag{6.73}$$
Of course, if $\hat\theta$ were independent of vers c, then H would reduce to a spherical tensor with trace $\hat c^2$. Formally we encounter circumstances exactly matching those envisioned when dealing with only partially ordered nematic liquid crystals. The order tensor could be here
$$\frac{1}{\hat c^2}H - \frac13I \tag{6.74}$$
with eigenvalues $\mu_i$ say, leading to measures of prolation and optical biaxiality given respectively by
$$\Big(\frac{27}{2}\prod_{i=1}^{3}\mu_i\Big)^{1/3}, \qquad \Big(6\sqrt3\prod_{i<j=1}^{3}|\mu_i - \mu_j|\Big)^{1/3}. \tag{6.75}$$
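The two measures can be evaluated directly from the eigenvalues of the order tensor. The sketch below is a minimal illustration in Python with NumPy; it assumes that the measures read $(\frac{27}{2}\prod\mu_i)^{1/3}$ and $(6\sqrt3\prod_{i<j}|\mu_i - \mu_j|)^{1/3}$, normalizations which give the value S for a uniaxial order tensor $S(n\otimes n - \frac13 I)$ and a biaxiality not exceeding 1. The function name order_measures and the sample tensors are ours.

```python
import numpy as np

def order_measures(H, c_hat):
    """Prolation and optical biaxiality of the order tensor H / c_hat^2 - I/3."""
    M = H / c_hat**2 - np.eye(3) / 3.0
    mu = np.linalg.eigvalsh(M)
    prolation = np.cbrt(13.5 * np.prod(mu))
    gaps = abs(mu[0] - mu[1]) * abs(mu[0] - mu[2]) * abs(mu[1] - mu[2])
    biaxiality = np.cbrt(6.0 * np.sqrt(3.0) * gaps)
    return prolation, biaxiality

c_hat = 2.0  # illustrative common speed

# Uniaxial sample: H / c_hat^2 = (1 - S)/3 I + S n (x) n with n along the third axis.
S = 0.4
H_uniaxial = c_hat**2 * np.diag([(1 - S) / 3, (1 - S) / 3, (1 - S) / 3 + S])
print(order_measures(H_uniaxial, c_hat))

# Sample with eigenvalues (a, b, 0) of H / c_hat^2 chosen to maximize the biaxiality.
a = 0.5 * (1 + 1 / np.sqrt(3.0))
H_biaxial = c_hat**2 * np.diag([a, 1.0 - a, 0.0])
print(order_measures(H_biaxial, c_hat))
```

With these normalizations an isotropic H yields zero for both measures, and the prolation of a uniaxial state ranges over the interval from −1/2 to 1.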
The main connotation is the conception of a “tensorial temperature” $\Theta^{-1}$, linked with a distribution $\hat\theta$ of the special form
$$\hat\theta(\operatorname{vers}c) = \theta_*\exp\Big[\Theta\cdot\Big((\operatorname{vers}c)\otimes(\operatorname{vers}c) - \frac13I\Big)\Big]. \tag{6.76}$$
The tensorial structure of Θ is strictly linked with that of H. When $\hat\theta$ is isotropic ($\hat\theta$ constant), the “temperature” goes to infinity. A scalar “temperature” can be easily defined in the case of a uniaxial Θ and turns out to have the sign of the prolation (a parameter which, in general, has values in the interval $[-\frac12, 1]$).
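Returning for a moment to the truncated distribution (6.71), the stated behaviour of α as a function of $\beta c_0^2$ can be explored numerically. The sketch below is an assumption-laden illustration in plain Python: it writes $x_0 = \beta c_0^2$, divides (6.72) by α (thereby setting aside the limiting value α = 0, at which both sides of (6.72) agree for every $x_0$) and solves the resulting condition $1/\alpha - x_0/(e^{\alpha x_0} - 1) = 1$ by bisection. The function names and tolerances are ours.

```python
import math

def truncated_mean(alpha, x0):
    """Left-hand side 1/alpha - x0/(exp(alpha*x0) - 1) of the condition solved below."""
    t = alpha * x0
    if abs(t) < 1e-6:
        # Series expansion near alpha = 0 to avoid cancellation.
        return x0 / 2.0 - alpha * x0**2 / 12.0
    return 1.0 / alpha - x0 / math.expm1(t)

def solve_alpha(x0):
    """Non-trivial solution alpha of (6.72) for x0 = beta * c0^2 > 1, by bisection."""
    if x0 <= 1.0:
        raise ValueError("no solution for beta*c0^2 <= 1")
    lo, hi = -1.0, 1.0
    while truncated_mean(lo, x0) < 1.0:   # widen the bracket towards -infinity
        lo *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if truncated_mean(mid, x0) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for x0 in (1.5, 2.0, 2.01, 50.0):
    print(x0, solve_alpha(x0))
```

The bisection relies on the left-hand side being decreasing in α, which holds because it is the mean of a truncated exponential density on $(0, x_0)$; the computed values can be compared with the properties listed after (6.72).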
6.8 Balance Equations

To achieve the balance equations for v, B and H we need, on the one hand, to refer to results of Section 6.6 concerning inertia totals and, on the other hand, to confront the arduous challenge of conceiving and recommending expressions for the totals of internal and external actions on the element. Here we meet severe, and deplorable, handicaps in our essay. There are intimate actions among molecules within the element, actions internal to the body from element to element and totally external actions, external to the body, such as those due to gravity and those due to contacts through the boundary. All these actions may be either caused by molecular collisions or owed to cohesion, attraction or repulsion at short or long distance. A detailed scrutiny as to how all those actions sum up in totals goes beyond our writ. Rather, in analogy with suggestions successful in classical cases, though perhaps less justifiably, we presume the existence of fields of stress T, hyperstress m and agitation flux s. Besides, in view of the fact that moments and virials of internal actions do not necessarily balance and thus lead to non-null resultants, we also presume the existence of fields of tensor moments −A and tensor virial −Z of the actions; plus, of course, the representatives of actions at distance ρf, ρM, ρS. In conclusion, we suggest the following set of balance equations:
$$\rho\Big(\frac{\partial v}{\partial\tau} + Lv\Big) = \rho f + \operatorname{div}T, \tag{6.77}$$
$$\rho\Big(\frac{\partial K}{\partial\tau} + (\operatorname{grad}K)v - BK - H\Big) = \rho M - A + \operatorname{div}m, \tag{6.78}$$
$$\rho\Big(\frac{\partial H}{\partial\tau} + (\operatorname{grad}H)v + BH + HB^T\Big) = \rho S - Z + \operatorname{div}s. \tag{6.79}$$
To this system, the equation of conservation of mass must be added, as in the classical case, together with a property of Y already mentioned in (6.11), a property which could even be interpreted as expressing the conservation of the moment of inertia, though we have already proposed it, deep down, as geometric in origin:
$$\frac{\partial\rho}{\partial\tau} + \operatorname{div}(\rho v) = 0, \tag{6.80}$$
$$\frac{\partial Y}{\partial\tau} + (\operatorname{grad}Y)v - BY - YB^T = 0. \tag{6.81}$$
Evidently, the sombre, furtive assumptions behind the choice of the right-hand sides of (6.77)–(6.79) above are very restrictive of types of interactions allowed. On the other hand, they leave still a vast scope for fantasy and adjustment to a wealth of circumstances via the choice of constitutive laws. Candidates for intermediate variables within those laws are the measures of deformations and their rates mentioned in Sections 6.4 and 6.5.

However, before we proceed with some special, even if appealing, suggestions, we must derive the kinetic energy theorem appropriate for pseudofluids. A major corollary of it, connected with an interpretation of terms involved, leads to a final balance law, inclusive of the classical one which implies the symmetry of T. If we multiply both sides of (6.77) tensorially by v, operate with B on both sides of (6.78), add term by term the two resulting equations and (6.79) multiplied by $\frac12$, take finally the symmetric part of all terms and integrate over any subbody b, by parts where needed, we arrive at the kinetic energy theorem in tensorial form:
$$\int_b\rho\dot W = \int_b\rho\operatorname{sym}\Big(\dot x\otimes f + BM + \frac12S\Big) + \int_{\partial b}\operatorname{sym}\Big(\dot x\otimes Tn + B(mn) + \frac12sn\Big) - \int_b\operatorname{sym}\Big(\frac12Z + LT^T + BA + bm^t\Big), \tag{6.82}$$
where n is the unit normal vector to ∂b and the exponent t to m indicates minor right transposition. The sum of the first two integrals on the right-hand side of (6.82) delivers the tensor power of external actions (respectively: body force, torque, stirring and boundary tractions, twist, agitation flux). Thus, the last integral must be interpreted as the tensor power of internal actions, with density
$$-\operatorname{sym}\Big(\frac12Z + LT^T + BA + bm^t\Big). \tag{6.83}$$
Hence, the density of actual power is given by the scalar
$$-\Big(\frac12\operatorname{tr}Z + L\cdot T + B\cdot A^T + b\cdot(m^t)^T\Big). \tag{6.84}$$
The equation (6.78), or rather its corollary obtained by taking the skew components of its two sides, though it exhausts the requirement of balance of vector moment of momentum, does not here secure automatically the demand on stresses to make the internal power (6.84) observer independent. Two observers on frames in relative motion read different values of L and B; the difference, in both, amounts to ew (e, Ricci's tensor; w, relative speed of rotation). Hence the condition:
$$\operatorname{skw}T = \operatorname{skw}A. \tag{6.85}$$
Oddly, it occurs sometimes that the constitutive choices for T and A are such that the stronger property
$$T = -A^T \tag{6.86}$$
applies. Then, as can easily be checked, even the tensor power (6.83) is observer-independent and reduces to
$$-\operatorname{sym}\Big(\frac12Z + (L - B)T^T + bm^t\Big).$$
Actually, in kindred investigations but where neither moments of momenta nor external torques are incorporated (K, M, m all vanish in (6.78)) one can dispense with a separate fashioning of the tensor A, as that tensor would necessarily always coincide with ρH, again by our (6.78), now greatly reduced in content. Then, from (6.86), the stronger identification obtains
$$T = -\rho H. \tag{6.87}$$
Thus, through this constitutive law, a formal connection is enacted with proposals advanced in hypoelasticity, extended thermodynamics, etc., where the Cauchy stress is the main evolving function in an added balance equation.

Another argument bears in favour of (6.86), or, at least, reveals its deep gist. −A represents the density of internal equilibrated tensor torques; thus in the absence of twist influx due to subtler mechanisms it can be gauged in terms of T only as follows: imagine the material element as filling a minute sphere $S_r$ of radius r, obviously “small” but not insignificant, and thus imagine further $-A(\operatorname{vol}S_r)$ to be equal to the total over the surface of $S_r$ of the tensor moment of traction Tn (n, unit normal to $S_r$):
$$-A(\operatorname{vol}S_r) = \int_{S_r}rn\otimes Tn. \tag{6.88}$$
If S were the sphere of radius 1, so that $\int_Sn\otimes n = \frac43\pi I$ (I, the identity tensor), then
$$\frac43\pi r^3A = -r^3\Big(\int_Sn\otimes n\Big)T^T, \tag{6.89}$$
hence (6.86).
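The surface integral used in this argument can be checked by quadrature. The sketch below assumes NumPy, a Gauss–Legendre rule in the cosine of the polar angle combined with a uniform rule in the azimuth, and an arbitrary sample stress T; the function name sphere_moment is ours. It evaluates the integral of $n\otimes n$ over the unit sphere and then recovers A from the total of $n\otimes Tn$.

```python
import numpy as np

def sphere_moment(n_theta=32, n_phi=64):
    """Quadrature of n (x) n over the unit sphere."""
    x, wx = np.polynomial.legendre.leggauss(n_theta)      # nodes in cos(theta)
    phi = 2.0 * np.pi * (np.arange(n_phi) + 0.5) / n_phi  # uniform azimuthal nodes
    w_phi = 2.0 * np.pi / n_phi
    M = np.zeros((3, 3))
    for xi, wi in zip(x, wx):
        s = np.sqrt(1.0 - xi * xi)
        for p in phi:
            n = np.array([s * np.cos(p), s * np.sin(p), xi])
            M += wi * w_phi * np.outer(n, n)
    return M

M = sphere_moment()

# Sample (assumed) stress; the total of n (x) Tn over the unit sphere equals M T^T.
T = np.array([[1.0, 0.3, 0.0],
              [0.1, 2.0, 0.5],
              [0.0, 0.2, 3.0]])
A = -(M @ T.T) / (4.0 * np.pi / 3.0)
print(np.allclose(M, 4.0 * np.pi / 3.0 * np.eye(3)), np.allclose(T, -A.T))
```

The common factor given by the cube of the radius cancels between the volume of the sphere and the total moment, so the unit sphere suffices for the check.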
6.9 Boundary Conditions: Sample Flows

The balance equations (6.77)–(6.79) go along with appropriate conditions at the boundaries which either render the requirements imposed there on v, B and H or embody the local effects of the environment through the assignment of tractions Tn, twister mn and stirrer sn (n, unit normal vector). Actually, boundary conditions cannot be expected to mimic always that standard model strictly. For instance, granularity and permeability of restraining walls play sometimes a decisive role; their effects on the inner flow must be identified and portrayed mathematically and that portrayal demands details of some “substructure” of the boundary. In any case, the variety of continua for which the proposed balance laws might apply makes general statements unfeasible: loose granular matter is hardly entrained by a moving boundary or restrained by a stationary one, whereas no slip is allowed for viscous suspensions.

The major ingredient still missing is some statement regarding constitutive rules for A, Z, and T, m, s. Each set of rules would characterize a tribe within the class of pseudofluids; criteria of objectivity would restrict the choice of those laws and put in evidence intermediate variables of strain such as C, N and Q (particularly in conservative instances) and the ensuing time rates (particularly when friction effects become relevant).

Some hypothetical simple flows examined in a kinetic context (i.e. apart from their dynamic feasibility) might suggest the simplest consequences of radically different boundary assumptions. Consider the following, almost trivial, plane flows (in the plane $\zeta_3 = 0$, say, of a Cartesian coordinate system) in an infinite channel: $-\infty < \zeta_1 < +\infty$, $0 \le \zeta_2 \le \delta$. Let a, b be two constant vectors in the directions of the first and second axis, respectively, and take
$$v = a, \qquad B = 0, \qquad H = b\otimes b. \tag{6.90}$$
Molecules jog up and down across the channel, with peculiar velocities ±b (half the population up and half down), bouncing at the walls, at the same time they move steadily down the channel with average velocity a. A bare image of the flow could be described also thus: at each place of the channel two clouds of molecules meet, equally numerous, one with velocity a + b and the other with velocity a − b without hindering each other. The bounce at the boundary occurs without energy loss (it is “perfectly elastic’’). However, there could be a totally different interpretation of (6.90). The lower boundary of the channel is stationary, while the upper plane slides with velocity 2a. At each place two clouds of molecules meet, equally numerous, one with velocity b and the other with velocity 2a − b. When they bounce at the lower wall, the normal component of velocity changes sign, whereas the tangential component is practically extinguished with a corresponding energy loss; when they bounce at the upper wall, again the normal component simply changes sign
whereas the tangential component is raised suddenly by the moving boundary from zero to 2a, with a corresponding acquisition of energy.
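The first interpretation of (6.90) can be reproduced with a trivial computation. The sketch below assumes NumPy, takes H as the plain average of the tensor square of the peculiar velocities over an equal-mass population, and uses sample vectors a and b of our choosing; the function name mean_and_H is hypothetical.

```python
import numpy as np

def mean_and_H(velocities):
    """Average velocity and average of c (x) c, with c the peculiar velocity."""
    v = np.asarray(velocities, dtype=float)
    mean = v.mean(axis=0)
    c = v - mean
    H = np.einsum("ki,kj->ij", c, c) / len(v)
    return mean, H

a = np.array([1.5, 0.0, 0.0])   # sample drift along the channel
b = np.array([0.0, 0.7, 0.0])   # sample agitation across the channel

# Two equally numerous clouds with velocities a + b and a - b.
v_mean, H = mean_and_H([a + b, a - b])
print(v_mean)
print(H)
```

For the two clouds a + b and a − b the computation returns the average velocity a and the tensor $b\otimes b$, as stated in (6.90).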
6.10 A Lagrangian Approach

The link of L, B and b with $\dot C$, $\dot N$, $\dot Q$, $\dot n$ provided by (6.40)–(6.43) allows one to write the power of internal actions as a function which is affine in those time rates. That circumstance induces one to presume, under conservative prerequisites, the possible existence of a potential ϕ depending on C, N, Q, n (and perhaps other variables as well, but with the latter not influenced by those four) and the ensuing possibility that the dynamics of the body be ruled totally by an associated Lagrange function κ − ϕ. Of course the four variables above can be expressed through F = Grad x, G and Grad G and so ϕ, as far as it goes, is determined by the latter three variables. Two balance equations follow, where the inertia terms are as derived in Section 6.6,
$$\rho\ddot x = \frac{1}{\det F}\operatorname{Div}\frac{\partial\varphi}{\partial F}, \tag{6.91}$$
$$\rho(\dot K - BYB^T - H) = (\det F)^{-1}\,G\Big[-\frac{\partial\varphi}{\partial G} + \operatorname{Div}\frac{\partial\varphi}{\partial(\operatorname{Grad}G)}\Big]^T. \tag{6.92}$$
This result reinforces our opinion on the validity of (6.77), (6.78) in general, i.e., even when T, A and m are not proportional to partial derivatives of a potential. In addition it offers a class of constitutive laws for T, A and m. A subtle analogy with remarks at the end of Section 6.6 suggests as plausible the eventuality of a dependence of ϕ also on some construct based on a total involving s ⊗ s, such as Y and its gradient, and the consequent admissibility of a continuum where the sum of the last two terms in the right-hand side of equation (6.79) coincides with the still missing terms of the time derivative of ϕ, terms expressed consequently as linear in the time derivative of Y and its gradient.

The question now arises as to the nature of the last balance equation (6.79) and, in particular, whether circumstances could be figured when a sort of deeper conservation rule might prevail, which would imply phenomena beyond mechanical ones. The root of the matter rests on the ambiguity we have admitted in the definition of H. The developments of Section 6.7 must be strictly referred to the definition (6.18), all others to the mechanical “reduced” expression (6.19) of $\tilde H$. $\tilde H$ leaves the disorderly agitation around any subelement out of account, recording only the subelemental average of molecular velocities. As is habitual, such deeper difference could be dealt with “thermally”, adding to the energy ϕ a quantity measured as heat.
Then one must propose a central axiom expressing the balance of total energy, an axiom appropriate to the pseudofluids discussed here, and dare to suggest that there should be, also for thermal phenomena, an inherent complexity parallel to the kinetic and dynamic one. One could conjecture that it be possible to measure, on each element and at each instant along a process, a density of thermal internal tensor energy, a third-order heat flux tensor, etc. To avoid too far-fetched conjectures (although we have already mentioned tensorial temperature at the end of Section 6.7), we follow a middle course and, while suggesting a tensorial form of the principle of balance of energy to match the kinetic energy theorem (6.82), we take the deeper ferment to be isotropic and thus propose spherical tensors $\frac13\varepsilon I$ to represent the total energy (thermal plus mechanical) and $\frac13\lambda I$ for the rate of heat generation; we also downgrade the third-order heat flux tensor to the form q ⊗ I, where q is the usual heat flux vector. In conclusion, we postulate the validity, over any subbody b, of a tensor balance equation modelled, formally, on the classical one
$$\int_b\rho\Big(\frac13\varepsilon I + W\Big)^{\cdot} = \int_b\rho\operatorname{sym}\Big(v\otimes f + BM + \frac12S + \frac13\lambda I\Big) + \int_{\partial b}\operatorname{sym}\Big(v\otimes Tn + B(mn) + \frac12sn - q\otimes n\Big). \tag{6.93}$$
Along any process, which is sufficiently regular to ensure the validity of the kinetic energy theorem (6.82), equation (6.93) yields
$$\int_b\rho\Big(\frac13\varepsilon I\Big)^{\cdot} = \int_b\operatorname{sym}\Big(LT^T + BA + bm^t + \frac12Z - \operatorname{grad}q + \frac13\rho\lambda I\Big), \tag{6.94}$$
and, because the choice of b among subbodies is arbitrary, the localization ensues
$$\frac13\rho\dot\varepsilon I = \operatorname{sym}\Big(LT^T + BA + bm^t + \frac12Z - \operatorname{grad}q + \frac13\rho\lambda I\Big). \tag{6.95}$$
Taking the trace of both members, the more common form of the energy principle is attained:
$$\rho\dot\varepsilon = L\cdot T + B\cdot A^T + b\cdot(m^t)^T + \frac12\operatorname{tr}Z - \operatorname{div}q + \rho\lambda. \tag{6.96}$$
As already remarked in Section 6.8, the first four terms in the right-hand side assign the power of the internal actions, the last two measure heat loss or generation.
Closing remark and acknowledgement

As the approach is, to an extent, tentative and some statements, no doubt, liable to criticism, the sources, innumerable and authoritative though vague in the context, are not referred to uselessly; the knowledgeable reader will recognize them easily. Specifically, some of the results quoted above have appeared already in Rend. Sem. Mat. Padova, 110 (2003) and 111 (2004) and in “Trends in Applications of Mathematics to Mechanics”, Proceedings of STAMM, Seeheim, Germany, Aachen: Shaker, Berichte aus der Mathematik, 85–92 (2005). Preliminary discussions (directly aimed at an expansion of Enskog type) with George Mullenger during a visit to the University of Canterbury were relevant though differently addressed. The research was carried out within the project Mathematical Models for the Science of Materials of the Italian MURST.
CHAPTER SEVEN

A Thermodynamical Framework Incorporating the Effect of the Thermal History on the Solidification of Molten Polymers

K. Kannan∗ and K.R. Rajagopal∗

Contents
7.1 Introduction
7.2 Kinematics
7.3 Modeling
7.3.1 Modeling prior to the initiation of solidification
7.3.2 Modeling after the initiation of solidification
7.4 Summary and Conclusions
Abstract If a polymer melt is quenched sufficiently quickly, the polymer solidifies to a predominantly glassy state. On the other hand, a polymer melt solidifies to a predominantly semicrystalline state when the cooling rate is sufficiently slow. The rate of crystallization is enhanced in the presence of deformation in the melt, which is termed as flow-induced crystallization, and even under quiescent conditions, the thermal history significantly affects the temperature at which the crystallization is initiated. Thus, there is a competition between quenching, which tends to suppress the crystallization process and deformation of the melt that enhances the same. The thermomechanical history undergone by the melt determines whether a polymer melt solidifies predominantly to a glass or a crystalline state or a mixture of the two. In this chapter, we incorporate deformation history effects in the framework developed by Rao and Rajagopal [1] to study the problem of solidification of polymer melts within a unified setting. When the cooling rate is sufficiently slow, the crystallization kinetics described by Nakamuratype kinetics is insufficient to describe the crystallization as the slower "secondary" crystallization related to the lamellar thickening becomes important. Secondary crystallization effects are also included and a general model is developed within a thermodynamic setting. The melt is modeled as a viscoelastic liquid and the crystalline solid, which comes into being over an interval of time associated with an evolving set ∗
Department of Mechanical Engineering,Texas A&M University, College Station, USA e-mail:
[email protected] Material Substructures in Complex Bodies ISBN-10: 0-08-044535-7
262
© 2007 Elsevier Ltd. All rights reserved.
Effect of Thermal History on Solidification of Molten Polymers
of natural configurations, is modeled as a one-parameter family of orthotropic elastic solids. The amorphous glassy solid is modeled as an isotropic viscoelastic solid.
Key Words: Natural configurations, semi-crystalline polymers, mathematical modeling, thermal history, crystallization, glass transition, secondary crystallization, continuous cooling diagram, quenching, mixture of crystalline phases
7.1 Introduction
The crystalline structure and hence the mechanical properties of a polymer depend on the thermomechanical history to which the polymer was subjected. However, the entire thermomechanical history of the polymer is usually unavailable. Subjecting different specimens of a particular polymer to the identical thermomechanical history may not result in the formation of identical microstructure. There can be a number of reasons for this memory effect. A small crystal grown in the crack of a foreign substance during a previous polymer treatment may survive melting (see Ref. [2]) and act as a nucleation center during the current process. Among the many other explanations are that nuclei of subcritical size (embryos) may become supercritical (growth nuclei), that self-nucleating high-molecular-weight crystals survive melting for a longer period than their low-molecular-weight counterparts, and that self-nucleation is caused by residual stress due to the previous thermomechanical history. To erase the effects of previous thermomechanical history, the polymer is held at a temperature above its melting point for a sufficient duration (several minutes). Brucato et al. [3] and Piccarola et al. [4] conducted quenching experiments, after nullifying memory effects, on isotactic polypropylene (iPP), nylon-6 and polyethylene terephthalate (PET) under processing conditions (up to 2000 °C/s). In those quenching experiments, cooling rates of the order of a few degrees per second were obtained using a differential scanning calorimeter (DSC) and the higher cooling rates were obtained by rapid quenching. Samples of about 100–200 μm were sandwiched between two pieces of Cu–Be alloy with very high thermal conductivity before quenching under quiescent conditions. A thermocouple embedded in the sample measured the temperature.
As opposed to DSC, where controlled cooling rates can be obtained (limited to a few hundred degrees per minute), rapid quenching, as expected, does not produce constant cooling rates. However, since the temperature history is measured, the results can be interpreted in terms of the thermal history undergone by the polymer. Wide angle X-ray diffraction (WAXD) was used to extract the amount of each phase present. At low cooling rates, the iPP melt produced large quantities of the α phase at room temperature, which is characterized by monoclinic crystals, while very high cooling rates (representative value of around 1000 °C/s) produced a predominantly mesomorphic phase¹; the crossover happens at around 100 °C/s.
¹ IUPAC Compendium of Chemical Terminology: these are states of matter in which anisometric molecules (or particles) are regularly arranged in one (nematic state) or two (smectic state) directions, but randomly arranged in the remaining direction(s).
The maximum
K. Kannan and K.R. Rajagopal
change in the density of the sample between the melt and α phase is around 2–3%. A similar behavior is also observed in the case of nylon-6 and PET. As PET is largely an amorphous polymer, the transition from α triclinic at low cooling rates to the amorphous glassy state takes place between 1 °C/s and 3 °C/s. Thus, iPP, nylon-6 and PET in the solid-like state are a mixture of α and mesomorphic phase, with the proportion of each constituent being determined by the thermal history undergone by the polymer. In another study, Gogolewski et al. [5] investigated the effects of annealing nylon-6 samples at temperatures between 50 °C and 220 °C (equilibrium melting temperature around 260 °C) for a duration of up to 2000 h. At such supercooled conditions, the melt begins to crystallize. They found that at a fixed temperature below the equilibrium melting temperature, the longer the annealing time the higher the melting temperature, which was determined by a subsequent heating cycle in a DSC. They also found, using X-ray diffraction, at 215 °C, that the lamellar crystal² thickness remains unchanged for a period of 10 h (mirroring melting temperature) and then increases with annealing time, finally reaching a plateau. This observation suggests that there is reordering of the crystals that had formed earlier after sufficient time (10 h), which increased the melting temperature. However, even after the cessation of lamellar thickening, the melting temperature of the nylon-6 sample continues to increase with annealing time. To explain this observation, they proposed secondary nucleation at the grain boundaries, which would increase the crystallinity and hence the melting temperature. At sufficiently low temperature, there is very little molecular mobility and hence the lamellar thickness of nylon-6 does not change even after annealing for 1000 h. As the temperature increases, the molecular mobility and the lamellar thickness increase.
At sufficiently high temperature, the molecules have enough mobility to achieve the equilibrium lamellar thickness. Gerardi et al. [6] also observed lamellar thickening during annealing of iPP quenched from the melt. Using small angle X-ray scattering (SAXS), a set of lamellar thicknesses corresponding to the monoclinic and mesomorphic phases was obtained. The melting temperature enhancement due to annealing, attributed to lamellar thickening, was also observed in syndiotactic polystyrene (sPS), polypropylene (PP) and PET (see Ref. [7]). A change of the crystal structure during lamellar thickening due to annealing was ruled out because the subsequent WAXD did not show any change in the crystal structure. Petermann et al. [8] studied annealing of linear polyethylene at various temperatures using transmission electron microscopy and concluded that two mechanisms were in effect during crystal growth. One process involves selective melting of thinner lamellae, which get incorporated into the unmelted crystal through epitaxial crystallization (recrystallization³) when sufficient time is allowed. The other process involves snaking or gliding movements along the chain axis of the crystal at sufficiently high temperatures where molecular mobility is high (reorganization⁴). During such a process, three-dimensional crystalline order is not broken. The latter process increases the lamellar thickness while the former leads to a stacking of one lamellar crystal on the other. For a more detailed discussion of the molecular mechanisms involved during annealing, refer to Refs. [9, 10]. Piccarola et al. [11] studied the effect of cooling rate on the crystallization of PET. All the samples were held at 280 °C (above the melting temperature) for 10 min to remove the effects of the previous thermomechanical history. At low cooling rates (≤1 °C/s), the effect of primary⁵ and secondary⁶ crystallization becomes important, and at high cooling rates (≥100 °C/s) an amorphous phase forms. At intermediate cooling rates, the change in the density of PET (measured at 10 °C after subjecting the sample to constant rate and quenching tests) is small, which is attributed to the changing quantities of the mixture of the mesomorphic and amorphous phases, and at the high cooling rates the density change is almost zero and the density is presumed to be that of the amorphous phase. The effect of cooling rate on the crystallization initiation temperature is significant. For example, the initiation temperature for a quenching rate of 0.08 °C/s is around 230 °C while for a quenching rate of 2.33 °C/s it is 170 °C. In addition to the effects of cooling history on a quiescent polyethylene and polypropylene melt (see continuous cooling transformation curves in Ref. [12]), similar effects are observed for PET. Moreover, the effect of stress in the melt is clear, i.e. higher stresses cause the crystallization start temperature to increase. Thus, in a thermomechanical process, the competition between cooling history and deformation will decide the temperature at which crystallization starts.
² IUPAC: A type of crystal with a large extension in two dimensions and a uniform thickness. The thickness of the crystal is around 5–50 nm.
³ Reorganization proceeding through partial melting.
Landau recognized the importance of capturing the quintessential feature concerned with crystallization, namely the discrete change of material symmetry. He remarks [13]: "Every transition from a crystal to a liquid or to a crystal of a different symmetry is associated with the disappearance or appearance of some element of symmetry. ... so continuous transitions (in the sense that transitions between liquid and gas are continuous) connected with changes of the symmetry of the body are absolutely impossible."
The thermodynamical framework presented here takes into account the solidification of a polymer melt (which is isotropic) to a semicrystalline solid that is a mixture of an isotropic solid and a crystalline solid that is orthotropic. More importantly, it also takes into account flow-induced crystallization of polymer melts and we obtain a Hillier-like model within a thermodynamic setting. The initiation conditions for the two types of crystal and glass naturally arise as a result of the thermodynamic considerations. We now turn to the details of the framework.
⁴ The molecular process by which (i) amorphous or poorly ordered regions of a polymer specimen become incorporated into crystals or (ii) a change to a more stable crystal structure takes place or (iii) defects within the crystal decrease.
⁵ IUPAC: The first stage of crystallization, considered to be ended when most of the spherulite surfaces impinge on each other.
⁶ IUPAC: Crystallization occurring after primary crystallization, usually proceeding at a lower rate.
Most bodies can exist in more than one stress-free configuration, and Eckart [14] seems to have been the first to recognize the consequence of this for the inelastic response of solids. As a body is subjected to a thermomechanical process, in general the underlying natural configuration associated with the current configuration evolves. Under certain conditions, namely, a homogeneous body undergoing homogeneous deformation during a homothermal process, the configuration attained by the body when the external stimuli are removed is the natural configuration corresponding to the current configuration. It is therefore natural to represent the stress tensor with respect to these evolving natural configurations. A polymer melt is capable of undergoing an elastic response that is isotropic with respect to these natural configurations. Since the crystalline material is formed continuously over a period of time, the stress response of the crystalline material is given by a one-parameter family of natural configurations. Even though the crystalline material that is born at any instant is assumed to be orthotropic, the associated preferred directions depend upon the deformation in the melt at the time at which the crystalline material is born. The glass transition regime is usually of the order of a few degrees and hence one can approximate the mechanical response of a glassy state with a configuration κr (see Figure 7.1) and an evolving natural configuration to capture effects such as stress relaxation and creep exhibited by predominantly glassy materials [15]. Here, the Cauchy stress is represented with respect to a fixed configuration κr and an evolving configuration κps(t). A model has to reflect how the material stores energy, produces entropy due to conduction, dissipation, crystallization, and glass transition, how it conducts heat, the structure of latent heat, the form of latent energy, etc.
The structure of these quantities determines the evolution of the natural configuration associated with the current configuration of the body. The precise manner in which the natural configuration evolves is determined by maximizing the rate of entropy production while enforcing the second law of thermodynamics and the incompressibility of the body as constraints. That is, the choice of a specific constitutive equation from a class of allowable constitutive equations is made by picking that which maximizes the rate of entropy production. The polymer melt is modeled as a viscoelastic liquid. Depending upon its thermomechanical history, a polymer melt can solidify to a semicrystalline state, a predominantly glassy state or a mixture of the two. As observed in experiments, depending upon the polymer, crystals of monoclinic or triclinic type are formed at higher temperatures and crystals of mesomorphic type are formed at lower temperatures. Accordingly, we consider two types of crystals in the semicrystalline state, which could be extended to accommodate other types of crystals. The initiation of crystallization takes place at a higher temperature at slow cooling rates and vice versa. The deformation in the melt causes the crystallization to initiate at a higher temperature compared to quiescent (no deformation) crystallization when the melt in both cases is subjected to the same temperature history. Thus, a higher cooling rate suppresses the temperature at which crystallization is initiated and deformation in the melt elevates the same. When sufficient time is allowed, a polymer melt crystallizes with a larger lamellar
thickness and the temperature at which initiation takes place is also higher, i.e. there is a characteristic time scale associated with the formation of crystals. When a polymer melt is cooled such that a large interval of time is allowed in relation to the characteristic time, then the temperature at which the crystallization process initiates approaches a maximum value (see Figure 7.2), namely the equilibrium melting temperature. When a polymer melt is quenched, very little time is available for the crystals to form in relation to the characteristic time and the temperature at which initiation takes place bottoms out. As a result, one obtains an s-shaped curve for the initiation of crystallization. In Figure 7.2, path 4 corresponds to the quenching of a polymer melt. The crystallization corresponding to the crystals of the form 1 and 2 is initiated at lower temperatures and the corresponding crystallization rates (see right side of Figure 7.2) are negligible, and hence one obtains negligible crystallinity. However, this path is favorable for the formation of a glassy state. Path 1 results in a semicrystalline state corresponding to the crystal of the form 1. Similarly, path 2 results in a semicrystalline state with a mixture of the two types of crystals and path 3 results in a semicrystalline state predominantly with the crystal of the form 2. All these effects are accommodated within our framework. When one restricts oneself to a spatial scale of several nanometers, depending on the type of polymer, a unit cell may have a monoclinic symmetry, triclinic symmetry, etc. However, the global response, e.g. in the case of quiescent crystallization, is that of an approximately isotropic body. The crystalline part of the semicrystalline polymer is modeled as a one-parameter family of orthotropic (global symmetry) elastic solids.

Figure 7.1 The above figure represents the configurations associated with the solidifying mixture. [Labels in the figure: reference configuration (point X); current configuration κt (point x); configurations κc(t)(1), κc(t)(2), κt1, κt2 and κr; mappings Fκc(t)(1), Fκc(t)(2), Fκps(t)(3), Fκpm(t), Gs and Gm; natural configuration κpm(t) associated with the melt; natural configuration κps(t) associated with the glassy material.]
As the crystals are formed from a molten polymer, the preferred directions of the crystal that comes into being at the current instant of time are given by the eigen-directions of the left Cauchy–Green stretch tensor associated with the melt. The glassy state is modeled as an isotropic viscoelastic solid [15].

Figure 7.2 Schematic of a continuous cooling diagram. [Axes: temperature versus time, with a side panel showing the rate of solidification for crystal 1 and crystal 2. Curves: initiation of crystallization for crystal 1 and for crystal 2. Cooling paths 1 to 4 start in the melt; region labels: Crystal 1, Crystal 2, Crystal 1 crystal 2, Glass, Crystal 1 crystal 2 glass.]
7.2 Kinematics
By the motion of a body we mean a one-to-one mapping that assigns to each point $X \in \kappa_R(B)$, $\kappa_R(B)$ being the reference configuration, a point $x \in \kappa_t(B)$, $\kappa_t(B)$ being the current configuration, for each $t \in \mathbb{R}$, i.e.
\[ x = \chi_{\kappa_R}(X_{\kappa_R}, t). \tag{7.1} \]
The velocity of a particle is defined through
\[ \mathbf{v} = \frac{\partial \chi_{\kappa_R}}{\partial t}. \tag{7.2} \]
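For notational convenience, henceforth we shall drop the argument $B$ in $\kappa_R(B)$, $\kappa_t(B)$, etc. The deformation gradient $\mathbf{F}_{\kappa_R}$ and the left and right Cauchy–Green stretch tensors $\mathbf{B}_{\kappa_R}$ and $\mathbf{C}_{\kappa_R}$ are defined through
\[ \mathbf{F}_{\kappa_R} = \frac{\partial \chi_{\kappa_R}}{\partial X_{\kappa_R}}, \qquad \mathbf{B}_{\kappa_R} = \mathbf{F}_{\kappa_R}\mathbf{F}_{\kappa_R}^{T} \quad\text{and}\quad \mathbf{C}_{\kappa_R} = \mathbf{F}_{\kappa_R}^{T}\mathbf{F}_{\kappa_R}. \tag{7.3} \]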
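The principal invariants of $\mathbf{B}_{\kappa_R}$ are
\[ I_{\kappa_R} = \operatorname{tr}(\mathbf{B}_{\kappa_R}), \qquad II_{\kappa_R} = \frac{1}{2}\left\{[\operatorname{tr}(\mathbf{B}_{\kappa_R})]^2 - \operatorname{tr}(\mathbf{B}_{\kappa_R}^2)\right\}, \qquad III_{\kappa_R} = \det(\mathbf{B}_{\kappa_R}). \tag{7.4} \]
Let $\kappa_{p_m(t)}$ denote the natural configuration associated with the configuration $\kappa_t$ (see Figure 7.1). The left and the right Cauchy–Green stretch tensors associated with the tensor $\mathbf{F}_{\kappa_{p_m(t)}}$, the mapping between the tangent spaces at the appropriate points belonging to the configurations $\kappa_{p_m(t)}$ and $\kappa_t$, are defined through
\[ \mathbf{B}_{\kappa_{p_m(t)}} = \mathbf{F}_{\kappa_{p_m(t)}}\mathbf{F}_{\kappa_{p_m(t)}}^{T} \quad\text{and}\quad \mathbf{C}_{\kappa_{p_m(t)}} = \mathbf{F}_{\kappa_{p_m(t)}}^{T}\mathbf{F}_{\kappa_{p_m(t)}}, \tag{7.5} \]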
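respectively. The principal invariants of $\mathbf{B}_{\kappa_{p_m(t)}}$ are denoted by $I_{\kappa_{p_m(t)}}$, $II_{\kappa_{p_m(t)}}$ and $III_{\kappa_{p_m(t)}}$ and are defined in a manner similar to that of equation (7.4). The mapping $\mathbf{G}_m$ is defined through (see Figure 7.1)
\[ \mathbf{G}_m = \mathbf{F}_{\kappa_R \to \kappa_{p_m(t)}} = \mathbf{F}_{\kappa_{p_m(t)}}^{-1}\mathbf{F}_{\kappa_R}, \tag{7.6} \]
and the velocity gradient $\mathbf{L}$ and the mapping $\mathbf{L}_{\kappa_{p_m(t)}}$ are defined through
\[ \mathbf{L} := \dot{\mathbf{F}}_{\kappa_R}\Big|_{X_{\kappa_R}=\text{const.}}\mathbf{F}_{\kappa_R}^{-1} \quad\text{and}\quad \mathbf{L}_{\kappa_{p_m(t)}} = \dot{\mathbf{G}}_m\mathbf{G}_m^{-1}. \tag{7.7} \]
The symmetric parts of $\mathbf{L}$ and $\mathbf{L}_{\kappa_{p_m(t)}}$ are defined through
\[ \mathbf{D} = \frac{1}{2}(\mathbf{L} + \mathbf{L}^{T}) \quad\text{and}\quad \mathbf{D}_{\kappa_{p_m(t)}} = \frac{1}{2}\left(\mathbf{L}_{\kappa_{p_m(t)}} + \mathbf{L}_{\kappa_{p_m(t)}}^{T}\right). \tag{7.8} \]
The upper convected Oldroyd derivative of $\mathbf{B}_{\kappa_{p_m(t)}}$ is defined through [16]
\[ \overset{\nabla}{\mathbf{B}}_{\kappa_{p_m(t)}} = \dot{\mathbf{B}}_{\kappa_{p_m(t)}} - \mathbf{L}\mathbf{B}_{\kappa_{p_m(t)}} - \mathbf{B}_{\kappa_{p_m(t)}}\mathbf{L}^{T} = -2\,\mathbf{F}_{\kappa_{p_m(t)}}\mathbf{D}_{\kappa_{p_m(t)}}\mathbf{F}_{\kappa_{p_m(t)}}^{T}. \tag{7.9} \]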
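In a manner similar to that used to obtain equations (7.5) through (7.9), one can obtain ([15])
\[ \overset{\nabla}{\mathbf{B}}{}^{(3)}_{\kappa_{p_s(t)}} = \dot{\mathbf{B}}^{(3)}_{\kappa_{p_s(t)}} - \mathbf{L}\mathbf{B}^{(3)}_{\kappa_{p_s(t)}} - \mathbf{B}^{(3)}_{\kappa_{p_s(t)}}\mathbf{L}^{T} = -2\,\mathbf{F}^{(3)}_{\kappa_{p_s(t)}}\mathbf{D}^{(3)}_{\kappa_{p_s(t)}}\mathbf{F}^{(3)T}_{\kappa_{p_s(t)}}, \qquad t_3 \le \tau \le t, \tag{7.10} \]
for the glassy solid and the equation
\[ \dot{\mathbf{B}}^{(3)}_{\kappa_r} - \mathbf{L}\mathbf{B}^{(3)}_{\kappa_r} - \mathbf{B}^{(3)}_{\kappa_r}\mathbf{L}^{T} = \mathbf{0}, \qquad t_3 \le \tau \le t. \tag{7.11} \]
Finally, the mappings $\mathbf{F}^{(i)}_{\kappa_{c(\tau)}}$, $i = 1, 2$, are defined through
\[ \mathbf{F}^{(i)}_{\kappa_{c(\tau)}} := \mathbf{F}_{\kappa_R}(X_{\kappa_R}, t)\,\mathbf{F}_{\kappa_R}^{-1}(X_{\kappa_R}, \tau), \qquad t_i \le \tau \le t, \quad i = 1, 2, \tag{7.12} \]
for the deformation gradient of the solid that came into being at time $\tau$, where $t_i$, $i = 1, 2, 3$ represent the times at which the crystals of type 1 and 2 and the glassy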
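solid are born, respectively. As described earlier in this section, one can define the left and the right Cauchy–Green stretch tensors (see Figure 7.1) and the invariants associated with the tensors $\mathbf{F}^{(i)}_{\kappa_{c(\tau)}}$, $i = 1, 2$. We shall assume that the material is incompressible, an assumption that is reasonable for the polymers under consideration. Thus,
\[ \det(\mathbf{B}_{\kappa_{p_m(t)}}) = \det(\mathbf{B}^{(3)}_{\kappa_{p_s(t)}}) = \det(\mathbf{B}^{(3)}_{\kappa_r}) = 1 \quad \left(\text{or } \operatorname{tr}(\mathbf{L}_{\kappa_{p_m(t)}}) = \operatorname{tr}(\mathbf{L}^{(3)}_{\kappa_{p_s(t)}}) = 0\right) \quad\text{and}\quad \operatorname{tr}(\mathbf{L}) = 0. \tag{7.13} \]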
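The kinematic definitions in equations (7.3), (7.4) and (7.13) can be checked numerically. The following is a minimal sketch in plain Python, not part of the original chapter: the helper names and the particular deformation (a uniaxial stretch followed by a simple shear, with illustrative values of `lam` and `gamma`) are assumptions made only for this example. It forms B and C from an isochoric F and evaluates the principal invariants.

```python
def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def trace(A):
    return A[0][0] + A[1][1] + A[2][2]

def det(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def cauchy_green(F):
    """Left and right Cauchy-Green tensors, B = F F^T and C = F^T F (eq. 7.3)."""
    return matmul(F, transpose(F)), matmul(transpose(F), F)

def principal_invariants(B):
    """Principal invariants (I, II, III) in the form of eq. (7.4)."""
    first = trace(B)
    second = 0.5 * (trace(B) ** 2 - trace(matmul(B, B)))
    third = det(B)
    return first, second, third

# Hypothetical isochoric deformation (illustrative values, not from the chapter):
# uniaxial stretch lam with lateral contraction, followed by a simple shear gamma.
lam, gamma = 1.5, 0.3
stretch = [[lam, 0.0, 0.0], [0.0, lam ** -0.5, 0.0], [0.0, 0.0, lam ** -0.5]]
shear = [[1.0, gamma, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
F = matmul(shear, stretch)

B, C = cauchy_green(F)
inv_B = principal_invariants(B)
inv_C = principal_invariants(C)
print(inv_B, inv_C)
```

Because det F = 1 for this choice, the third invariant equals one, in line with the incompressibility assumption; B and C share the same principal invariants since they have the same eigenvalues.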
7.3 Modeling
By making judicious choices for the specific Helmholtz potential, the rate of entropy production due to conduction, dissipation in the melt and the glassy phase, and phase change due to crystallization and glass transition, one can obtain a model describing the solidification of molten polymers to a mixture of glassy and semicrystalline phases, depending upon the thermomechanical process to which the melt is subjected. We refer the reader to Rao and Rajagopal [1] for a sufficiently general thermodynamical framework for describing the crystallization of polymers, which is extended further in what follows.
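7.3.1 Modeling prior to the initiation of solidification
The Helmholtz potential per unit mass and the rate of dissipation per unit volume of the melt are defined as follows:
\[ \Psi^m(\theta, I_{\kappa_{p_m(t)}}) = A^m + (B^m + c_2^m)(\theta - \theta_m) - c_1^m\frac{(\theta - \theta_m)^2}{2} - c_2^m\,\theta\ln\frac{\theta}{\theta_m} + \frac{\mu^m\theta}{2\rho\theta_m b}\left\{\left[1 + \frac{b}{n}(I_{\kappa_{p_m(t)}} - 3)\right]^{n} - 1\right\}, \tag{7.14} \]
where $\theta_m$, $A^m$, $B^m$, $c_1^m$, $c_2^m$, $b$, $n$ and $\mu^m$ are the reference temperature and the material constants, and
\[ \xi^m(\theta, \mathbf{B}_{\kappa_{p_m(t)}}, \mathbf{D}_{\kappa_{p_m(t)}}) = 2\nu^m(\theta, \mathbf{B}_{\kappa_{p_m(t)}})\left\{\mathbf{D}_{\kappa_{p_m(t)}} \cdot \mathbf{B}_{\kappa_{p_m(t)}}\mathbf{D}_{\kappa_{p_m(t)}}\right\}^{\beta}, \tag{7.15} \]
where $\nu^m(\theta, \mathbf{B}_{\kappa_{p_m(t)}})$ is a viscosity-like function associated with the melt and $\beta$ is a constant. The above choice for the rate of dissipation guarantees that it will be non-negative. The stress tensor and the evolution equation for the natural configuration are given through (see Ref. [17] for details)
\[ \mathbf{T}^m = -p\mathbf{I} + \frac{\mu^m\theta}{\theta_m}\left[1 + \frac{b}{n}(I_{\kappa_{p_m(t)}} - 3)\right]^{n-1}\mathbf{B}_{\kappa_{p_m(t)}} \tag{7.16} \]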
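and
\[ \overset{\nabla}{\mathbf{B}}_{\kappa_{p_m(t)}} = 2\left\{\frac{\mu^m\theta}{2\nu^m\theta_m}\left[1 + \frac{b}{n}(I_{\kappa_{p_m(t)}} - 3)\right]^{n-1}\left[\operatorname{tr}\mathbf{B}_{\kappa_{p_m(t)}} - \frac{9}{\operatorname{tr}(\mathbf{B}^{-1}_{\kappa_{p_m(t)}})}\right]^{1-\beta}\right\}^{\frac{1}{2\beta-1}}\left[\frac{3}{\operatorname{tr}(\mathbf{B}^{-1}_{\kappa_{p_m(t)}})}\mathbf{I} - \mathbf{B}_{\kappa_{p_m(t)}}\right] = \frac{1}{R}\left[\frac{3}{\operatorname{tr}(\mathbf{B}^{-1}_{\kappa_{p_m(t)}})}\mathbf{I} - \mathbf{B}_{\kappa_{p_m(t)}}\right], \tag{7.17} \]
where $R$ has units of time and is identified as the relaxation time.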
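To illustrate how equation (7.17) drives the natural configuration of the melt, the sketch below integrates it for stress relaxation after a step stretch. Several simplifications are assumed here and are not taken from the chapter: the velocity gradient is set to zero after the step (so the upper convected derivative reduces to the ordinary time derivative), the relaxation time R is treated as a constant, B is kept in its principal frame, and an explicit Euler scheme with illustrative values is used.

```python
def relax_step(b, dt, R):
    """One explicit Euler step of dB/dt = (1/R) * (3 / tr(B^-1) * I - B).

    b holds the three principal values of B; with L = 0 and a diagonal
    initial state, B remains diagonal, so three numbers suffice.
    """
    c = 3.0 / sum(1.0 / bi for bi in b)
    return [bi + dt * (c - bi) / R for bi in b]

def relax(b0, R, t_end, dt):
    """Integrate the relaxation from the initial principal values b0 up to t_end."""
    b = list(b0)
    for _ in range(int(round(t_end / dt))):
        b = relax_step(b, dt, R)
    return b

# Hypothetical initial state: principal values of B after an isochoric
# uniaxial step stretch lam (illustrative value), so that det(B) = 1.
lam = 1.5
b0 = [lam ** 2, 1.0 / lam, 1.0 / lam]

b_end = relax(b0, R=1.0, t_end=20.0, dt=1e-3)
det_end = b_end[0] * b_end[1] * b_end[2]
print(b_end, det_end)
```

In the continuous equation the trace of B⁻¹ times the right-hand side vanishes, so det B is preserved and B relaxes toward the identity; the Euler scheme reproduces this only approximately, with a drift that shrinks with the time step.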
7.3.2 Modeling after the initiation of solidification
Figure 7.3 shows the schematic of the various mass fractions present in the mixture. The shaded areas represent the semicrystalline polymer, which is a mixture of amorphous and crystalline regions, with mass fractions of the crystals of the form 1 and 2. It is necessary to introduce the semicrystalline mass fractions as further crystallization occurs within the semicrystalline region. Such an assumption is supported experimentally [18]. The mass fractions f(1) and f(2) represent the total crystalline region within the semicrystalline regions 1 and 2, respectively.

Figure 7.3 A schematic of mass fractions of the various phases. [Regions labeled in the figure: a melt-like region and the glass f(3) outside the semicrystalline regions; the semicrystalline regions f̂(1) and f̂(2), which contain crystal 1 f(1), crystal 2 f(2), the glass f(4) and a melt-like remainder.]

The amorphous regions within each semicrystalline region and the remaining melt are modeled as a viscoelastic liquid with the same material parameters as the original viscoelastic liquid except for the parameters appearing in the viscosity-like
function $\hat{\nu}^m$. The term $f^{(3)}$ represents the mass fraction of the glassy solid that is outside the semicrystalline region and $f^{(4)}$ represents the mass fraction of the glassy state within the semicrystalline region. No distinction is made between the glass that is formed within and outside of the semicrystalline region. It is necessary to introduce the mass fraction $f^{(3)}$ as many polymers could be quenched to a predominantly glassy phase with insignificant crystallinity and hence most of the glassy phase is formed outside the semicrystalline region. Similarly, glass transition also occurs in the amorphous regions of a predominantly semicrystalline model, necessitating the introduction of the mass fraction $f^{(4)}$. Every point in the mixture is occupied by the various phases present. This is achieved by homogenizing the actual polymer mixture. The mixture is modeled as a constrained mixture, i.e. at a point, the velocities of the various phases are assumed to be the same. The specific Helmholtz potential of the mixture is partitioned into four parts, with each phase being weighted by its respective mass fraction, i.e.
\[
\begin{aligned}
\Psi ={}& (1 - f^{(1)} - f^{(2)} - f^{(3)} - f^{(4)})\left\{A^m + (B^m + c_2^m)(\theta - \theta_m) - c_1^m\frac{(\theta - \theta_m)^2}{2} - c_2^m\,\theta\ln\frac{\theta}{\theta_m} + \frac{\mu^m\theta}{2\rho\theta_m b}\left\{\left[1 + \frac{b}{n}(I_{\kappa_{p_m(t)}} - 3)\right]^{n} - 1\right\}\right\} \\
&+ \sum_{i=1}^{2}\int_{t_i}^{t}\Bigg\{A^{s,i} + (B^{s,i} + c_2^{s,i})(\theta(t) - \theta_{s,i}) - c_1^{s,i}\frac{(\theta(t) - \theta_{s,i})^2}{2} - c_2^{s,i}\,\theta(t)\ln\frac{\theta(t)}{\theta_{s,i}} \\
&\qquad + D^{s,i}\exp\left\{-E^{s,i}\left[\int_{t_o}^{t}\exp\left(F^{s,i}(\theta(\tau) - \theta_o)\right)d\tau\right]^{n_i}\right\} \\
&\qquad + \frac{\mu_1^{s,i}}{2\rho}\left(I^{(i)}_{\kappa_{c(\tau)}}(t,\tau) - 3\right) + \frac{\mu_2^{s,i}}{2\rho}\left(J^{(i)}_{\kappa_{c(\tau)}}(t,\tau) - 1\right)^2 + \frac{\mu_3^{s,i}}{2\rho}\left(K^{(i)}_{\kappa_{c(\tau)}}(t,\tau) - 1\right)^2\Bigg\}\frac{df^{(i)}}{d\tau}\,d\tau \\
&+ (f^{(3)} + f^{(4)})\left\{A^{s,3} + (B^{s,3} + c_2^{s,3})(\theta - \theta_{s,3}) - c_1^{s,3}\frac{(\theta - \theta_{s,3})^2}{2} - c_2^{s,3}\,\theta\ln\frac{\theta}{\theta_{s,3}} + \frac{\mu_1^{s,3}}{2\rho}\left(I^{(3)}_{\kappa_{p_s(t)}} - 3\right) + \frac{\mu_2^{s,3}}{2\rho}\left(I^{(3)}_{\kappa_r} - 3\right)\right\} \\
={}& (1 - f^{(1)} - f^{(2)} - f^{(3)} - f^{(4)})\Psi^m + \sum_{i=1}^{2}\Psi^{s,i} + (f^{(3)} + f^{(4)})\Psi^{s,3},
\end{aligned}
\tag{7.18}
\]
where $f^{(i)} = \alpha_{op}\hat{f}^{(i)} + \alpha_s^{(i)}$, $i = 1, 2$, with $\hat{f}^{(i)}$, $i = 1, 2$ representing the semicrystalline mass fractions related to the crystals of the form 1 and 2. The total crystalline mass fraction, i.e. $f^{(i)}$, is assumed to be the sum of the primary crystalline fraction, defined as $\alpha_{op}\hat{f}^{(i)}$, where $\alpha_{op} \le 1$ is a constant, and the secondary
crystalline fraction $\alpha_s^{(i)}$ that is assumed to grow at the expense of the amorphous region within the semicrystalline fraction. The terms $K^{(i)}_{\kappa_{c(\tau)}} := \mathbf{C}^{(i)}_{\kappa_{c(\tau)}}\mathbf{m}_{\kappa_{c(\tau)}} \cdot \mathbf{m}_{\kappa_{c(\tau)}}$ and $J^{(i)}_{\kappa_{c(\tau)}} := \mathbf{C}^{(i)}_{\kappa_{c(\tau)}}\mathbf{n}_{\kappa_{c(\tau)}} \cdot \mathbf{n}_{\kappa_{c(\tau)}}$, $i = 1, 2$, where $\mathbf{n}_{\kappa_{c(\tau)}}$ and $\mathbf{m}_{\kappa_{c(\tau)}}$ are the eigenvectors of $\mathbf{B}_{\kappa_{p_m(t)}}$. Also, $\theta_m$, $A^m$, $B^m$, $c_1^m$, $c_2^m$, $b$, $n$ and $\mu^m$ are the reference temperature and the material constants associated with the melt; $\{\theta_{s,i}, A^{s,i}, B^{s,i}, c_1^{s,i}, c_2^{s,i}, \{\mu_j^{s,i}, j = 1, 2, 3\}, D^{s,i}, E^{s,i}, F^{s,i}$ and $n_i$, $i = 1, 2\}$ are the reference temperature and the material constants associated with the crystals of the form 1 and 2, respectively; and $\theta_{s,3}$, $A^{s,3}$, $B^{s,3}$, $c_1^{s,3}$, $c_2^{s,3}$, $\{\mu_j^{s,3}, j = 1, 2\}$ are the material constants associated with the glass. The specific functional form chosen for the melt is the same as that described in the previous subsection and that for the glass is the same as that described in Ref. [15], i.e. the glass is modeled as an isotropic viscoelastic solid. The crystalline material born at the expense of the melt at each instant $\tau$ is modeled as an orthotropic elastic solid. Since the crystallization occurs over a period of time, the Helmholtz potential of the crystalline solid at a material point at the current time $t$ is assumed to be the sum of the Helmholtz potentials of orthotropic elastic solids born over a period of time (see Ref. [17] for details). The term in equation (7.18) involving $D^{s,i}$, $E^{s,i}$ and $F^{s,i}$ captures the thermal history dependence. This term is such that when the material is exposed for more time and/or at higher temperature its contribution to $\Psi$ is small, representing the formation of crystals that have undergone reorganization and/or recrystallization. The polymer melt at time $t_o$ is assumed to be devoid of any previous thermomechanical history, i.e. memory effects are erased. Note that the indices 1 and 2 refer to the crystals of the form 1 and 2 and the indices 3 and 4 refer to the glassy solid. The second law, used to place restrictions on the constitutive equations, is introduced in the following form [19]:
\[ \mathbf{T} \cdot \mathbf{D} - \rho\dot{\Psi} - \rho\eta\dot{\theta} - \frac{\mathbf{q} \cdot \operatorname{grad}(\theta)}{\theta} = \rho\theta\zeta = \xi, \qquad \xi \ge 0. \tag{7.19} \]
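The behavior of the thermal-history term in equation (7.18), D exp{−E [∫ exp(F(θ(τ) − θo)) dτ]^n}, can be made concrete with a small numerical sketch. Everything quantitative below is an assumption chosen only for illustration (the constants, the linear cooling paths, and the trapezoidal quadrature); none of these values come from the chapter. The sketch compares a slow and a fast cooling history over the same temperature range.

```python
import math

def history_term(times, temps, D, E, F, n, theta_o):
    """Evaluate D * exp(-E * (integral of exp(F * (theta - theta_o)) dt) ** n).

    The time integral is approximated with the trapezoidal rule over the
    sampled temperature history (times[k], temps[k]).
    """
    integral = 0.0
    for k in range(1, len(times)):
        g_prev = math.exp(F * (temps[k - 1] - theta_o))
        g_next = math.exp(F * (temps[k] - theta_o))
        integral += 0.5 * (g_prev + g_next) * (times[k] - times[k - 1])
    return D * math.exp(-E * integral ** n)

def linear_cooling(theta_start, theta_end, rate, steps=1000):
    """Sample a constant-rate cooling path from theta_start down to theta_end."""
    duration = (theta_start - theta_end) / rate
    times = [duration * k / steps for k in range(steps + 1)]
    temps = [theta_start - rate * t for t in times]
    return times, temps

# Hypothetical constants, for illustration only.
params = dict(D=1.0, E=1.0, F=0.05, n=1.0, theta_o=500.0)

slow = history_term(*linear_cooling(500.0, 400.0, rate=1.0), **params)
fast = history_term(*linear_cooling(500.0, 400.0, rate=100.0), **params)
print(slow, fast)
```

With these illustrative numbers the slowly cooled history, which spends more time at elevated temperature, gives a much smaller contribution than the rapidly cooled one, which is the qualitative trend described in the text.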
Assuming additivity of the rate of entropy production due to the various mechanisms associated with solidification of molten polymers, the right-hand side of equation (7.19) becomes
\[ \xi = \xi_c + \xi_d^m + \xi_d^g + \xi_p^{(1)} + \xi_p^{(2)} + \xi_r^{(1)} + \xi_r^{(2)} + \xi_p^g, \tag{7.20} \]
with each of the terms on the right-hand side of equation (7.20) being non-negative; they represent the rate of entropy production (times $\rho\theta$) due to conduction ($\xi_c$), the rate of dissipation in the melt ($\xi_d^m$), the rate of dissipation in the glassy solid ($\xi_d^g$), the rate of entropy production (times $\rho\theta$) due to crystallization of the melt to the crystals of the form 1 and 2 ($\xi_p^{(1)}$, $\xi_p^{(2)}$), reorganization of the crystals that have formed for the crystals of the form 1 and 2 ($\xi_r^{(1)}$, $\xi_r^{(2)}$), and solidification of the remaining melt to a glassy solid ($\xi_p^g$), respectively. Recall that when sufficient time and/or temperature is available, rearrangement of the molecules
that form the crystal takes place. The terms $\xi_r^{(1)}$ and $\xi_r^{(2)}$ represent the rate of entropy production (times $\rho\theta$) due to this reorganization. We shall suppose that
\[ \xi_c = -\frac{\mathbf{q} \cdot \operatorname{grad}(\theta)}{\theta}. \tag{7.21} \]
The rate of entropy production (times $\rho\theta$) due to conduction is assumed to be
\[ \xi_c = \frac{\mathbf{k}(\mathbf{B}_{\kappa_{p_m(t)}}, f^{(1)}, f^{(2)}, f^{(3)}, f^{(4)})\operatorname{grad}(\theta) \cdot \operatorname{grad}(\theta)}{\theta}, \tag{7.22} \]
where $\mathbf{k}(\mathbf{B}_{\kappa_{p_m(t)}}, f^{(1)}, f^{(2)}, f^{(3)}, f^{(4)})$ is positive definite and symmetric and is given by
\[ \mathbf{k}(\mathbf{B}_{\kappa_{p_m(t)}}, f^{(1)}, f^{(2)}, f^{(3)}, f^{(4)}) = (1 - f^{(1)} - f^{(2)} - f^{(3)} - f^{(4)})\,\mathbf{k}^m(\mathbf{B}_{\kappa_{p_m(t)}}) + f^{(1)}\mathbf{K}^{(1)} + f^{(2)}\mathbf{K}^{(2)} + (f^{(3)} + f^{(4)})\,k^{(3)}\mathbf{I}. \tag{7.23} \]
These requirements guarantee that the rate of entropy production due to conduction is positive for non-zero $\operatorname{grad}(\theta)$, as required. When $\operatorname{grad}(\theta)$ is zero, there is no conduction and the associated rate of entropy production is zero. From equations (7.21) and (7.22), the constitutive equation for the heat flux is given by
\[ \mathbf{q} = -\left\{(1 - f^{(1)} - f^{(2)} - f^{(3)} - f^{(4)})\,\mathbf{k}^m(\mathbf{B}_{\kappa_{p_m(t)}}) + f^{(1)}\mathbf{K}^{(1)} + f^{(2)}\mathbf{K}^{(2)} + (f^{(3)} + f^{(4)})\,k^{(3)}\mathbf{I}\right\}\operatorname{grad}(\theta), \tag{7.24} \]
where $\mathbf{k}^m(\mathbf{B}_{\kappa_{p_m(t)}})$, $\mathbf{K}^{(1)}$ and $\mathbf{K}^{(2)}$ are symmetric and positive definite and $k^{(3)} > 0$. As the melt stretches, the molecules tend to get aligned in a certain direction and hence the components of the heat flux vector will be different from the components corresponding to the unstretched melt. The stretches in the crystals of the form 1 and 2 and in the glassy solid are small in typical industrial processes such as fiber spinning, film blowing, blow molding, etc. Therefore, their respective thermal conductivity tensors do not depend on their stretch tensors. The glassy solid is amorphous and hence a Fourier-type conduction is assumed. Using equations (7.21) and (7.20) in (7.19), we arrive at the reduced dissipation equation
\[ \mathbf{T} \cdot \mathbf{D} - \rho\dot{\Psi} - \rho\eta\dot{\theta} = \xi_d^m + \xi_d^g + \xi_p^{(1)} + \xi_p^{(2)} + \xi_r^{(1)} + \xi_r^{(2)} + \xi_p^g. \]
Since the melt and the glassy solid can dissipate, the rate of dissipation, appropriately weighted, is assumed to be of the following form:
\[
\begin{aligned}
\xi_d^m + \xi_d^g ={}& (1 - f^{(1)} - f^{(2)} - f^{(3)} - f^{(4)})\,2\hat{\nu}^m(\theta, \mathbf{B}_{\kappa_{p_m(t)}}, f^{(1)}, f^{(2)}, f^{(3)}, f^{(4)})\left\{\mathbf{D}_{\kappa_{p_m(t)}} \cdot \mathbf{B}_{\kappa_{p_m(t)}}\mathbf{D}_{\kappa_{p_m(t)}}\right\}^{\beta} \\
&+ (f^{(3)} + f^{(4)})\,2\hat{\nu}^s(\theta, \mathbf{B}^{(3)}_{\kappa_{p_s(t)}}, f^{(1)}, f^{(2)}, f^{(3)}, f^{(4)})\,\mathbf{D}^{(3)}_{\kappa_{p_s(t)}} \cdot \mathbf{B}^{(3)}_{\kappa_{p_s(t)}}\mathbf{D}^{(3)}_{\kappa_{p_s(t)}}.
\end{aligned}
\tag{7.25}
\]
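The mass-fraction weighting of the conductivity in equation (7.23), the heat flux of equation (7.24) and the conduction term of equation (7.22) translate directly into a few lines of code. The sketch below is only an illustration: the function names, the conductivity values, the mass fractions and the temperature gradient are hypothetical and are not taken from the chapter.

```python
def mixture_conductivity(f, k_melt, K1, K2, k3):
    """Mass-fraction-weighted conductivity tensor in the form of eq. (7.23).

    f = (f1, f2, f3, f4); k_melt, K1, K2 are 3x3 nested lists; k3 is the
    scalar conductivity of the glass (isotropic, Fourier-type).
    """
    f1, f2, f3, f4 = f
    w_melt = 1.0 - f1 - f2 - f3 - f4
    return [[w_melt * k_melt[i][j] + f1 * K1[i][j] + f2 * K2[i][j]
             + ((f3 + f4) * k3 if i == j else 0.0)
             for j in range(3)] for i in range(3)]

def heat_flux(k, grad_theta):
    """q = -k grad(theta), as in eq. (7.24)."""
    return [-sum(k[i][j] * grad_theta[j] for j in range(3)) for i in range(3)]

def conduction_entropy_rate(k, grad_theta, theta):
    """xi_c = k grad(theta) . grad(theta) / theta, as in eq. (7.22)."""
    q = heat_flux(k, grad_theta)
    return -sum(qi * gi for qi, gi in zip(q, grad_theta)) / theta

def diag(a, b, c):
    return [[a, 0.0, 0.0], [0.0, b, 0.0], [0.0, 0.0, c]]

# Hypothetical values for illustration only.
k_melt = diag(0.25, 0.20, 0.20)      # stretched melt, slightly anisotropic
K1 = diag(0.40, 0.30, 0.30)          # crystal of the form 1
K2 = diag(0.35, 0.30, 0.30)          # crystal of the form 2
k3 = 0.20                            # glass
fractions = (0.3, 0.1, 0.05, 0.05)   # f1, f2, f3, f4
grad_theta = [100.0, -50.0, 0.0]

k_mix = mixture_conductivity(fractions, k_melt, K1, K2, k3)
q = heat_flux(k_mix, grad_theta)
xi_c = conduction_entropy_rate(k_mix, grad_theta, theta=400.0)
print(q, xi_c)
```

Because each constituent conductivity is symmetric and positive definite and the weights are non-negative, the mixture tensor inherits both properties, so the computed conduction term is positive whenever the temperature gradient is non-zero.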
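The functions $\hat{\nu}^m$ and $\hat{\nu}^s$ are defined so that as $f^{(i)} \to 0$, $i = 1, 2, 3, 4$, the viscosity-like functions tend to their appropriate limits, i.e. $\hat{\nu}^m(\cdots) \to \nu^m(\cdot, \cdot)$ and $\hat{\nu}^s(\cdots) \to \nu^s(\cdot, \cdot)$. Since the viscosity-like terms are positive, it is obvious that the rate of dissipation is non-negative. Substituting equations (7.18) and (7.25) into the reduced dissipation equation, a sufficient condition to ensure that the reduced dissipation equation holds is that the Cauchy stress tensor takes the form:
\[
\begin{aligned}
\mathbf{T} ={}& -p\mathbf{I} + (1 - f^{(1)} - f^{(2)} - f^{(3)} - f^{(4)})\frac{\mu^m\theta}{\theta_m}\left[1 + \frac{b}{n}(I_{\kappa_{p_m(t)}} - 3)\right]^{n-1}\mathbf{B}_{\kappa_{p_m(t)}} \\
&+ \sum_{i=1}^{2}\int_{t_i}^{t}\Big[\mu_1^{s,i}\mathbf{B}^{(i)}_{\kappa_{c(\tau)}}(t,\tau) + 2\mu_2^{s,i}\left(J^{(i)}_{\kappa_{c(\tau)}}(t,\tau) - 1\right)\mathbf{F}^{(i)}_{\kappa_{c(\tau)}}(t,\tau)\left(\mathbf{n}_{\kappa_{c(\tau)}} \otimes \mathbf{n}_{\kappa_{c(\tau)}}\right)\mathbf{F}^{(i)T}_{\kappa_{c(\tau)}}(t,\tau) \\
&\qquad + 2\mu_3^{s,i}\left(K^{(i)}_{\kappa_{c(\tau)}}(t,\tau) - 1\right)\mathbf{F}^{(i)}_{\kappa_{c(\tau)}}(t,\tau)\left(\mathbf{m}_{\kappa_{c(\tau)}} \otimes \mathbf{m}_{\kappa_{c(\tau)}}\right)\mathbf{F}^{(i)T}_{\kappa_{c(\tau)}}(t,\tau)\Big]\frac{df^{(i)}}{d\tau}\,d\tau \\
&+ (f^{(3)} + f^{(4)})\left(\mu_1^{s,3}\mathbf{B}^{(3)}_{\kappa_{p_s(t)}} + \mu_2^{s,3}\mathbf{B}^{(3)}_{\kappa_r}\right),
\end{aligned}
\tag{7.26}
\]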
and the entropy of the mixture is given by η = (1 − f
(1)
−f
(2)
−f
(3)
−f
(4)
) −B
m
+ c1m (θ
− θm ) + c2m
θ ln θm
!n ** 3 7 b μm f (i) −B s,i + c1s,i (θ − θs,i ) 1 + (Iκpm (t) − 3) − 1 + − 2ρθm b n i=1 * * θ θ s,i s,3 s,3 (4) s,3 + c2 ln +f −B + c1 (θ − θs,3 ) + c2 ln . (7.27) θs,i θs,3 Using the relation = + ηθ, the internal energy is given through
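\[
\begin{aligned}
\varepsilon ={}& (1 - f^{(1)} - f^{(2)} - f^{(3)} - f^{(4)}) \left\{ A^{m} - B^{m}\theta_m + \frac{c_1^{m}}{2}\left(\theta^2 - \theta_m^2\right) + c_2^{m}(\theta - \theta_m) \right\} \\
&+ \sum_{i=1}^{2} \Bigg[ f^{(i)} \left\{ A^{s,i} - B^{s,i}\theta_{s,i} + \frac{c_1^{s,i}}{2}\left(\theta^2 - \theta_{s,i}^2\right) + c_2^{s,i}(\theta - \theta_{s,i}) \right\} \\
&\qquad + \int_{t_i}^{t} \left\{ \frac{\mu_1^{s,i}}{2\rho}\left( I^{(i)}_{\kappa_{c(\tau)}}(t,\tau) - 3 \right) + \frac{\mu_2^{s,i}}{2\rho}\left( J^{(i)}_{\kappa_{c(\tau)}}(t,\tau) - 1 \right)^2 + \frac{\mu_3^{s,i}}{2\rho}\left( K^{(i)}_{\kappa_{c(\tau)}}(t,\tau) - 1 \right)^2 \right\} \frac{df^{(i)}}{d\tau}\, d\tau \\
&\qquad + f^{(i)} D^{s,i} \exp\left\{ -E^{s,i} \left[ \int_{t_o}^{t} \exp\left( F^{s,i}(\theta(\tau) - \theta_o) \right) d\tau \right]^{n_i} \right\} \Bigg] \\
&+ (f^{(3)} + f^{(4)}) \left\{ A^{s,3} - B^{s,3}\theta_{s,3} + \frac{c_1^{s,3}}{2}\left(\theta^2 - \theta_{s,3}^2\right) + c_2^{s,3}(\theta - \theta_{s,3}) + \frac{\mu_1^{s,3}}{2\rho}\left( I^{(3)}_{\kappa_{p_s(t)}} - 3 \right) + \frac{\mu_2^{s,3}}{2\rho}\left( I^{(3)}_{\kappa_r} - 3 \right) \right\}.
\end{aligned}
\tag{7.28}
\]
In addition to the equations (7.26) and (7.27), one can also obtain
\[
\mu_m \frac{\theta}{\theta_m} \left[ 1 + \frac{b}{n}\left(I_{\kappa_{p_m(t)}} - 3\right) \right]^{n-1} \mathbf{B}_{\kappa_{p_m(t)}} \cdot \mathbf{D}_{\kappa_{p_m(t)}}
= 2\hat{\nu}_m(\theta, \mathbf{B}_{\kappa_{p_m(t)}}, f^{(1)}, f^{(2)}, f^{(3)}, f^{(4)}) \left\{ \mathbf{D}_{\kappa_{p_m(t)}} \cdot \mathbf{B}_{\kappa_{p_m(t)}} \mathbf{D}_{\kappa_{p_m(t)}} \right\}^{\beta}
\tag{7.29}
\]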
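and
\[
\mu_1^{s,3}\, \mathbf{B}^{(3)}_{\kappa_{p_s(t)}} \cdot \mathbf{D}^{(3)}_{\kappa_{p_s(t)}}
= 2\hat{\nu}_s(\theta, \mathbf{B}^{(3)}_{\kappa_{p_s(t)}}, f^{(1)}, f^{(2)}, f^{(3)}, f^{(4)})\, \mathbf{D}^{(3)}_{\kappa_{p_s(t)}} \cdot \mathbf{B}^{(3)}_{\kappa_{p_s(t)}} \mathbf{D}^{(3)}_{\kappa_{p_s(t)}},
\tag{7.30}
\]
which, for fixed values of $\mathbf{B}_{\kappa_{p_m(t)}}$ and $\mathbf{B}^{(3)}_{\kappa_{p_s(t)}}$, can be viewed as a constraint, in addition to the incompressibility constraints $\operatorname{tr} \mathbf{D}_{\kappa_{p_m(t)}} = 0$ and $\operatorname{tr} \mathbf{D}^{(3)}_{\kappa_{p_s(t)}} = 0$, on the tensors $\mathbf{D}_{\kappa_{p_m(t)}}$ and $\mathbf{D}^{(3)}_{\kappa_{p_s(t)}}$, respectively. The evolution equations for the natural configurations $\kappa_{p_m(t)}$ and $\kappa_{p_s(t)}$ are obtained in such a way that the rate of dissipation, i.e. equation (7.25), is maximized [15]; that is, the evolution equation for $\kappa_{p_m(t)}$ is equation (7.17) with $\nu_m$ replaced by $\hat{\nu}_m$, and that for $\kappa_{p_s(t)}$ is
\[
\overset{\triangledown}{\mathbf{B}}{}^{(3)}_{\kappa_{p_s(t)}}
= \frac{\mu_1^{s,3}}{\hat{\nu}_s} \left\{ \frac{3}{\operatorname{tr}\left[ \left( \mathbf{B}^{(3)}_{\kappa_{p_s(t)}} \right)^{-1} \right]}\, \mathbf{I} - \mathbf{B}^{(3)}_{\kappa_{p_s(t)}} \right\}.
\tag{7.31}
\]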
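The rate of entropy production (times $\rho\theta$) due to crystallization is defined as
\[
\xi_p^{(i)} = \frac{\rho\left(\psi_m - \hat{\psi}^{s,i}\right) \alpha_p^{o} \left( \dot{\hat{f}}^{(i)} \right)^2}{\phi_p^{(i)}}
+ \frac{\rho\left(\psi_m - \hat{\psi}^{s,i}\right) \left( \dot{\alpha}_s^{(i)} \right)^2}{\phi_s^{(i)}}, \quad i = 1, 2, \ldots
\tag{7.32}
\]
where
\[
\phi_p^{(i)} = K_i(\theta, \mathbf{B}_{\kappa_{p_m(t)}})\, n_i \left( 1 - \hat{f}^{(1)} - \hat{f}^{(2)} - f^{(3)} \right) \left[ \log \frac{1}{1 - \hat{f}^{(1)} - \hat{f}^{(2)} - f^{(3)}} \right]^{\frac{n_i - 1}{n_i}}, \quad i = 1, 2
\]
and
\[
\phi_s^{(i)} = \frac{d}{dt} \int_{t_i}^{t} \frac{d\hat{f}^{(i)}}{d\tau}\, s^{(i)}(t;\tau)\, d\tau, \quad i = 1, 2, \ldots
\]
subject to the condition that
\[
\frac{ds^{(i)}}{dt} = K_s^{(i)}(\theta, \mathbf{B}_{\kappa_{p_m(t)}})\, m_i \left( \alpha_s^{o} - s^{(1)} - s^{(2)} \right) \left[ \log \frac{\alpha_s^{o}}{\alpha_s^{o} - s^{(1)} - s^{(2)}} \right]^{\frac{m_i - 1}{m_i}}, \quad i = 1, 2, \text{ respectively.}
\tag{7.33}
\]
In the above equation $n_i, m_i \ge 1$ and $\alpha_s^{o}$ are constants, and the bell-shaped functions $K_i(\theta, \mathbf{B}_{\kappa_{p_m(t)}})$, $i = 1, 2$, and $K_s^{(i)}(\theta, \mathbf{B}_{\kappa_{p_m(t)}})$ are defined through
\[
K_i(\theta, \mathbf{B}_{\kappa_{p_m(t)}}) = K_i^{o,p} \exp\left( \frac{-C_1^{p,i}}{\theta - \theta_c^{p,i}} \right) \exp\left( \frac{-C_2^{p,i}}{\theta\left(\hat{\psi}_m - \hat{\psi}^{s,i}\right)} \right), \quad i = 1, 2, \ldots
\]
and
\[
K_s^{(i)}(\theta, \mathbf{B}_{\kappa_{p_m(t)}}) = K_i^{o,s} \exp\left( \frac{-C_1^{s,i}}{\theta - \theta_c^{s,i}} \right) \exp\left( \frac{-C_2^{s,i}}{\theta\left(\hat{\psi}_m - \hat{\psi}^{s,i}\right)} \right), \quad i = 1, 2, \ldots
\tag{7.34}
\]
where
\[
\hat{\psi}_m = A^{m} + (B^{m} + c_2^{m})(\theta - \theta_m) - c_1^{m} \frac{(\theta - \theta_m)^2}{2} - c_2^{m}\theta \ln\frac{\theta}{\theta_m}
+ C_b \frac{\mu_m \theta}{2\rho\theta_m b} \left\{ \left[ 1 + \frac{b}{n}\left(I_{\kappa_{p_m(t)}} - 3\right) \right]^{n} - 1 \right\},
\tag{7.35}
\]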
where $C_b$ is a positive constant. Note that $\hat{\psi}_m$ of equation (7.35) and $\psi_m$ of equation (7.18) are not the same. In equation (7.34), $K_i^{o,p}$, $C_1^{p,i}$, $C_2^{p,i}$, $K_i^{o,s}$, $C_1^{s,i}$, $C_2^{s,i}$, $\theta_c^{s,i}$ and $\theta_c^{p,i}$ are positive constants. The first and the second term of equation (7.32) represent the rate of entropy production (times $\rho\theta$) due to primary and secondary crystallization, respectively. Each term of equation (7.32) is non-negative provided that $\psi_m - \hat{\psi}^{s,i} \ge 0$ and $\phi_p^{(i)}$ and $\phi_s^{(i)}$ are positive. It is easy to see that $\phi_p^{(i)}$, $i = 1, 2$, are positive if $\hat{f}^{(1)} + \hat{f}^{(2)} + f^{(3)} \le 1$. On physical grounds, the maximum sum of mass fractions of the transformed volume must be unity (see Figure 7.3) and hence one must impose the requirement that $\hat{f}^{(1)} + \hat{f}^{(2)} + f^{(3)} \le 1$. We will show later that this is indeed the case. First, we note that the right-hand side of equation (7.33)$_3$ is bounded provided that $s^{(1)} + s^{(2)} < \alpha_s^{o}$. Recall that the constants associated with the functions $K_s^{(i)}$, $i = 1, 2$, are chosen so that the value of each of the functions is bounded. The terms $s^{(i)}$, $i = 1, 2$, are associated with the secondary crystallization, and the initial conditions for equation (7.33)$_3$ should be zero. However, the differential
equations (7.33)$_3$ are such that $s^{(i)}$, $i = 1, 2$, are zero at all times for zero initial conditions. Therefore, to obtain non-trivial results, one needs non-zero but otherwise arbitrarily small values for $s^{(i)}$ as initial conditions. If one has positive initial conditions, then the values of $s^{(i)}$, $i = 1, 2$, at the next instant of time are larger than their values at the previous instant, as their derivative with respect to time is always positive provided that $s^{(1)} + s^{(2)} < \alpha_s^{o}$. It immediately follows that $s^{(i)}$ increases monotonically. Thus $s^{(i)}$, $i = 1, 2$, are monotonic and continuous and their sum $s^{(1)} + s^{(2)}$ (also continuous) must approach $\alpha_s^{o}$ at some time. As the sum approaches $\alpha_s^{o}$, $ds^{(i)}/dt \to 0$, $i = 1, 2$, and as a result $s^{(i)}$, $i = 1, 2$, level off for subsequent times with $s^{(1)} + s^{(2)}$ approaching a maximum value of $\alpha_s^{o}$. To sum up, the solution to the differential equations (7.33)$_3$ with positive initial conditions is continuous and non-decreasing, with $s^{(1)} + s^{(2)}$ approaching a maximum value of $\alpha_s^{o}$.

Now, $\phi_s^{(i)}$ is positive if the integrand of equation (7.33)$_2$ is positive, i.e. if $d\hat{f}^{(i)}/dt > 0$. The rate of entropy production due to primary and secondary crystallization is non-negative provided $\hat{f}^{(1)} + \hat{f}^{(2)} + f^{(3)} < 1$ and $d\hat{f}^{(i)}/dt > 0$.

Now we will turn our attention towards the rate of entropy production (times $\rho\theta$) due to glass transition. It is defined as
\[
\xi_p^{g} = \frac{\rho\left(\psi_m - \hat{\psi}^{s,3}\right) \left( \dot{f}^{(3)} \right)^2}{\phi_3} + \frac{\rho\left(\psi_m - \hat{\psi}^{s,3}\right) \left( \dot{f}^{(4)} \right)^2}{\phi_4},
\]
where
\[
\phi_3 = K_3(\theta, \mathbf{B}_{\kappa_{p_m(t)}})\, n_3 \left( 1 - \hat{f}^{(1)} - \hat{f}^{(2)} - f^{(3)} \right) \left[ \log \frac{1}{1 - \hat{f}^{(1)} - \hat{f}^{(2)} - f^{(3)}} \right]^{\frac{n_3 - 1}{n_3}}
\]
and
\[
\phi_4 = K_4(\theta, \mathbf{B}_{\kappa_{p_m(t)}})\, n_4 \left\{ \hat{f}^{(1)} + \hat{f}^{(2)} - (f^{(1)} + f^{(2)}) - f^{(4)} \right\}
\left[ \log \frac{\hat{f}^{(1)} + \hat{f}^{(2)} - (f^{(1)} + f^{(2)})}{\hat{f}^{(1)} + \hat{f}^{(2)} - (f^{(1)} + f^{(2)}) - f^{(4)}} \right]^{\frac{n_4 - 1}{n_4}},
\tag{7.36}
\]
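where
\[
K_i(\theta, \mathbf{B}_{\kappa_{p_m(t)}}) = K_i^{o} \exp\left( \frac{-C_1^{(i)}}{\theta - \theta_c^{(i)}} \right) \exp\left( \frac{-C_2^{(i)}}{\theta\left(\hat{\psi}_m - \hat{\psi}^{s,3}\right)} \right), \quad i = 3, 4.
\tag{7.37}
\]
In equation (7.36), $n_3, n_4 \ge 1$, and in the above equation $K_i^{o}$, $C_1^{(i)}$, $C_2^{(i)}$ and $\theta_c^{(i)}$, $i = 3, 4$, are positive constants. In a manner similar to that for the entropy production due to crystallization, each term of equation (7.36) is non-negative provided $\psi_m - \hat{\psi}^{s,3} \ge 0$, $\hat{f}^{(1)} + \hat{f}^{(2)} + f^{(3)} < 1$ and $f^{(1)} + f^{(2)} + f^{(4)} < \hat{f}^{(1)} + \hat{f}^{(2)}$ are met.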
Substituting equations (7.32) and (7.36)$_1$ into the reduced dissipation equation, after collecting the terms involving $\dot{\hat{f}}^{(i)}$ and $\dot{\alpha}_s^{(i)}$, $i = 1, 2$, and $\dot{f}^{(i)}$, $i = 3, 4$, we set their respective coefficients to be zero because we are looking only for a sufficient condition to satisfy the reduced dissipation equation. This procedure leads to our determining the transformation kinetics for the semicrystalline mass fraction (and hence primary) and secondary crystallization within the semicrystalline region for the crystals of the form 1 and 2, and the kinetics for glass transition within and outside of the semicrystalline region. The rates of change of the semicrystalline mass fractions for the crystals of the form 1 and 2, i.e. $\hat{f}^{(i)}$, $i = 1, 2$, and of the glass are given through
\[
\frac{d\hat{f}^{(i)}}{dt} = K_i(\theta, \mathbf{B}_{\kappa_{p_m(t)}})\, n_i \left( 1 - \hat{f}^{(1)} - \hat{f}^{(2)} - f^{(3)} \right) \left[ \log \frac{1}{1 - \hat{f}^{(1)} - \hat{f}^{(2)} - f^{(3)}} \right]^{\frac{n_i - 1}{n_i}}, \quad i = 1, 2
\]
and
\[
\frac{df^{(3)}}{dt} = K_3(\theta, \mathbf{B}_{\kappa_{p_m(t)}})\, n_3 \left( 1 - \hat{f}^{(1)} - \hat{f}^{(2)} - f^{(3)} \right) \left[ \log \frac{1}{1 - \hat{f}^{(1)} - \hat{f}^{(2)} - f^{(3)}} \right]^{\frac{n_3 - 1}{n_3}}.
\tag{7.38$_{1,2}$}
\]
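As an illustrative check of the structure of the primary crystallization law, the following Python sketch integrates the rate equation for a single phase with a constant rate function and compares the result with the classical Avrami form $1 - \exp(-(Kt)^n)$. The single-phase reduction, the constant value of $K$, the exponent, and the function names `rate`, `avrami` and `integrate` are assumptions made only for this example; in the model above $K_i$ depends on temperature and deformation and the phases are coupled. Because the right-hand side vanishes at zero mass fraction when $n > 1$, the integration is started from a small positive value.

```python
import math

def rate(f, K, n):
    """Single-phase rate law: df/dt = K n (1 - f) [log(1/(1 - f))]^((n - 1)/n)."""
    return K * n * (1.0 - f) * math.log(1.0 / (1.0 - f)) ** ((n - 1.0) / n)

def avrami(t, K, n):
    """Classical isothermal Avrami expression, used here as the reference solution."""
    return 1.0 - math.exp(-((K * t) ** n))

def integrate(K, n, t0, t1, steps):
    """Classical fourth-order Runge-Kutta, started from the Avrami value at t0 > 0."""
    h = (t1 - t0) / steps
    f = avrami(t0, K, n)
    out = [(t0, f)]
    for k in range(steps):
        k1 = rate(f, K, n)
        k2 = rate(f + 0.5 * h * k1, K, n)
        k3 = rate(f + 0.5 * h * k2, K, n)
        k4 = rate(f + h * k3, K, n)
        f += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        out.append((t0 + (k + 1) * h, f))
    return out

# Hypothetical values: K = 1 (1/time), n = 3, integrate from t = 0.1 to t = 2.
solution = integrate(K=1.0, n=3.0, t0=0.1, t1=2.0, steps=1000)
max_error = max(abs(f - avrami(t, 1.0, 3.0)) for t, f in solution)
```

Under these assumptions the numerical solution should track the Avrami curve closely, which is consistent with reading the rate equation as the differential form of an Avrami-type law.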
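The secondary crystallization contribution to the total crystallinity of each form of crystal, using the fact that $\alpha_s^{(i)} = 0$ at $t = t_i$, is given by
\[
\alpha_s^{(i)} = \int_{t_i}^{t} \frac{d\hat{f}^{(i)}}{d\tau}\, s^{(i)}(t;\tau)\, d\tau, \quad i = 1, 2
\]
subject to the condition
\[
\frac{ds^{(i)}}{dt} = K_s^{(i)}(\theta, \mathbf{B}_{\kappa_{p_m(t)}})\, m_i \left( \alpha_s^{o} - s^{(1)} - s^{(2)} \right) \left[ \log \frac{\alpha_s^{o}}{\alpha_s^{o} - s^{(1)} - s^{(2)}} \right]^{\frac{m_i - 1}{m_i}}, \quad i = 1, 2, \text{ respectively.}
\tag{7.39$_{1,2}$}
\]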
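The qualitative behaviour of the secondary crystallization variable can be explored with a small numerical experiment. The Python sketch below reduces the condition above to a single crystal form with a constant rate function and an exponent greater than one; these simplifications, the parameter values, and the names `s_rate` and `evolve` are assumptions for illustration only. It uses a plain explicit Euler step, which is adequate here because the step is small compared with the time scale of the problem.

```python
import math

def s_rate(s, Ks, m, alpha):
    """ds/dt = Ks m (alpha - s) [log(alpha/(alpha - s))]^((m - 1)/m), one crystal form."""
    return Ks * m * (alpha - s) * math.log(alpha / (alpha - s)) ** ((m - 1.0) / m)

def evolve(s0, Ks=1.0, m=2.0, alpha=0.3, t_end=5.0, steps=5000):
    """Explicit Euler integration from the initial value s0; returns all iterates."""
    h = t_end / steps
    s = s0
    history = [s]
    for _ in range(steps):
        s += h * s_rate(s, Ks, m, alpha)
        history.append(s)
    return history

trivial = evolve(0.0)     # zero initial condition
seeded = evolve(1.0e-6)   # arbitrarily small positive initial condition
```

With a zero initial value the iterates remain zero, whereas a small positive seed grows monotonically and levels off just below the assumed ceiling `alpha`, in line with the argument given for equation (7.33)$_3$.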
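The rate of change of mass fraction of the glassy phase within the semicrystalline region (see Figure 7.3) is
\[
\frac{df^{(4)}}{dt} = K_4(\theta, \mathbf{B}_{\kappa_{p_m(t)}})\, n_4 \left\{ \hat{f}^{(1)} + \hat{f}^{(2)} - (f^{(1)} + f^{(2)}) - f^{(4)} \right\}
\left[ \log \frac{\hat{f}^{(1)} + \hat{f}^{(2)} - (f^{(1)} + f^{(2)})}{\hat{f}^{(1)} + \hat{f}^{(2)} - (f^{(1)} + f^{(2)}) - f^{(4)}} \right]^{\frac{n_4 - 1}{n_4}},
\tag{7.40$_{1,2}$}
\]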
where the total crystallinity $f^{(i)} = \alpha_p^{o}\hat{f}^{(i)} + \alpha_s^{(i)}$, $i = 1, 2$. From the differential equations (7.38)$_{1,2}$, using arguments similar to those used before, it is clear that $\hat{f}^{(1)} + \hat{f}^{(2)} + f^{(3)} < 1$ and $d\hat{f}^{(i)}/dt > 0$; the remaining condition $\psi_m - \hat{\psi}^{s,i} \ge 0$, $i = 1, 2$, is imposed as a constraint while solving the problem. Similarly, with regard to the rate of entropy production due to glass transition, one needs to impose the restriction $\psi_m - \hat{\psi}^{s,3} \ge 0$. By definition $f^{(i)} = \alpha_p^{o}\hat{f}^{(i)} + \alpha_s^{(i)} \le (\alpha_p^{o} + \alpha_s^{o})\hat{f}^{(i)}$, $i = 1, 2$. To obtain this inequality, we have used the fact that the maximum value that can be attained by $s^{(i)}$, $i = 1, 2$, is $\alpha_s^{o}$. Since only a portion of the semicrystalline fraction is occupied by the crystalline regions, $\alpha_p^{o} + \alpha_s^{o} < 1$. Since the rates of crystallization and glass transition are very slow at the beginning and the end of the solidification compared to the other periods, it is natural to assume that the rates of entropy production due to crystallization and glass transition tend to zero, i.e. the initiation conditions are (see equations (7.32) and (7.36)$_1$)
\[
\begin{aligned}
& A^{m} + (B^{m} + c_2^{m})(\theta - \theta_m) - c_1^{m} \frac{(\theta - \theta_m)^2}{2} - c_2^{m}\theta \ln\frac{\theta}{\theta_m} + \frac{\mu_m \theta}{2\rho\theta_m b} \left\{ \left[ 1 + \frac{b}{n}\left(I_{\kappa_{p_m(t)}} - 3\right) \right]^{n} - 1 \right\} \\
&\quad - \left\{ A^{s,i} + (B^{s,i} + c_2^{s,i})(\theta - \theta_{s,i}) - c_1^{s,i} \frac{(\theta - \theta_{s,i})^2}{2} - c_2^{s,i}\theta \ln\frac{\theta}{\theta_{s,i}} + D^{s,i} \exp\left\{ -E^{s,i} \left[ \int_{t_o}^{t} \exp\left( F^{s,i}(\theta(\tau) - \theta_o) \right) d\tau \right]^{n_i} \right\} \right\} \\
&\quad = \psi_m - \hat{\psi}^{s,i} = 0, \qquad i = 1, 2, 3 \text{ with } D^{s,3} \equiv 0.
\end{aligned}
\tag{7.41$_{1,2}$}
\]
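The last term of equation (7.41) depends on the whole temperature history through the time integral. The Python sketch below evaluates that term for linear cooling at two different rates, purely to illustrate the trend. All numerical values ($D$, $E$, $F$, $\theta_o$, the exponent, the start and end temperatures and the two cooling rates) are hypothetical placeholders rather than material data from the chapter, and the helper names `history_term` and `closed_form` are introduced only for this example.

```python
import math

def history_term(rate, theta_start, theta_end, D, E, F, theta_o, n_i, steps=20000):
    """D exp(-E [integral of exp(F (theta(tau) - theta_o)) dtau]^n_i)
    for linear cooling theta(tau) = theta_start - rate * tau (trapezoidal rule)."""
    t_end = (theta_start - theta_end) / rate
    h = t_end / steps

    def g(tau):
        return math.exp(F * (theta_start - rate * tau - theta_o))

    integral = h * (0.5 * g(0.0) + sum(g(k * h) for k in range(1, steps)) + 0.5 * g(t_end))
    return D * math.exp(-E * integral ** n_i)

def closed_form(rate, theta_start, theta_end, D, E, F, theta_o, n_i):
    """Same quantity with the integral evaluated analytically for linear cooling."""
    integral = (math.exp(F * (theta_start - theta_o)) - math.exp(F * (theta_end - theta_o))) / (F * rate)
    return D * math.exp(-E * integral ** n_i)

# Hypothetical parameters: temperatures in K, cooling rates in K/s.
params = dict(theta_start=500.0, theta_end=450.0, D=1.0, E=0.01, F=0.05, theta_o=500.0, n_i=1.0)
slow = history_term(rate=0.1, **params)
fast = history_term(rate=10.0, **params)
```

For these assumed values the slowly cooled history gives the smaller term, because the integral accumulates over a longer time at elevated temperature.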
If the cooling rate is low, then the melt spends a longer time at higher temperatures and thus the last term of equation (7.41) becomes small, causing a high crystallization initiation temperature (see path 1 of Figure 7.2), and vice versa, as observed in experiments. Even though there may be some melting of the thinner lamellae during annealing or under very slow cooling rates, the overall crystallinity of the polymer continues to increase. The rate of entropy production attributed to reorganization and recrystallization is given through
\[
\xi_r^{(i)} = -\rho f^{(i)} \frac{d}{dt} \left\{ D^{s,i} \exp\left\{ -E^{s,i} \left[ \int_{t_o}^{t} \exp\left( F^{s,i}(\theta(\tau) - \theta_o) \right) d\tau \right]^{n_i} \right\} \right\}, \quad i = 1, 2.
\tag{7.42$_{1,2}$}
\]
It is easy to check that the above expression is non-negative. Thus, the reduced dissipation equation is identically satisfied.

The crystallization kinetics equation has a structure similar to that reported in Refs. [18, 20]. A modification to the Avrami equation was proposed under isothermal and quiescent conditions by Hillier [18] and Price [21] to explain the
anomalous fractional values of the Avrami exponent when the Avrami equation was fit to the experimental data of certain polymers. Fractional values of the exponent do not seem to have a clear physical meaning, and to circumvent this problem, by allowing further crystallization to take place within the spherulite, Hillier [18] was able to obtain good agreement with the experimental data. The primary crystallization is associated with the constant radial growth rate of spherulites, which are semicrystalline, as observed in experiments, and further crystallization, termed post-Avrami crystallization, is allowed within a spherulite. Price [21] showed that if the post-Avrami process is sufficiently slower than the primary crystallization, then the crystallization curves resemble those of polymers where secondary crystallization (crystallinity is proportional to log(time) and is related to lamellar thickening) is present. Hillier [18] showed that if the two processes have comparable half-lives, then the crystallization curves obtained resemble those of certain polymers where secondary crystallization is absent, which further leads to anomalous fractional values when post-Avrami crystallization is ignored. The model was able to fit the isothermal quiescent crystallization data for polymethylene, polyethylene oxide, etc. We refer the reader to Verhoyen [22] for a discussion of other modifications to the Avrami equation.

Equations (7.38)$_1$ and (7.39) represent the non-isothermal version of a Hillier-like model including the effect of deformation. The first equation represents primary crystallization and has a form similar to that of the Nakamura equation [23]. The glass transformation kinetics, i.e. equations (7.38)$_2$ and (7.40), have a mathematical form similar to that of the crystallization kinetics, but the interpretation for these quantities is different from those that are offered in traditional studies.
7.4 Summary and Conclusions

A general thermodynamic setting was used to develop a model for the solidification of a polymer melt that took into consideration the thermomechanical history of the melt, including primary and post-Avrami crystallization as well as flow-induced crystallization. The amount and the type of the crystalline phase obtained depend on the thermomechanical history undergone by the polymer. For many polymers such as nylon-6, nylon-66, polypropylene, etc., two phases have been found in their semicrystalline state, namely, mesomorphic and monoclinic or triclinic phases. Even though the lattice structure may be triclinic, monoclinic and so forth, the crystals tend to get aligned, as dictated by the deformation, giving an overall anisotropy that could be approximated as being orthotropic. The model can predict the amount and the type of crystalline structures formed. Here, the model takes into account two different crystalline structures. However, it can be easily generalized to account for more crystalline phases. If the cooling rate is sufficiently high, it is possible to bypass the crystallization route and the polymer melt will solidify into a glass (see path 4 of Figure 7.2), and such effects can also be predicted by the model that has been developed.

ACKNOWLEDGEMENT

We thank the National Science Foundation for its support of this work.
REFERENCES

1. I. J. Rao and K. R. Rajagopal, A thermodynamic framework for the study of crystallization in polymers, Z. Angew. Math. Phys., 53 (2002), 365–406.
2. P. Tidick, S. Fakirov, N. Avramova, and H. G. Zachmann, Effect of the melt annealing time on the crystallization of nylon-6 with various molecular weights, Colloid. Polym. Sci., 262 (1984), 445–449.
3. V. Brucato, S. Piccarola and V. La Carrubba, An experimental methodology to study polymer crystallization under processing conditions. The influence of high cooling rates, Chem. Eng. Sci., 57 (2002), 4129–4143.
4. S. Piccarola, M. Saiu, V. Brucato and G. Titomanlio, Crystallization of polymer melts under fast cooling. II. High-purity iPP, J. Appl. Polym. Sci., 46 (1992), 625–634.
5. S. Gogolewski, M. Gasiorek, K. Czerniawska and A. J. Pennings, Annealing of melt-crystallized nylon-6, Colloid. Polym. Sci., 260 (1982), 859–863.
6. F. Gerardi, S. Piccarola, A. Martorana, and D. Sapoundjieva, Study of the long-period changes in samples of isotactic poly(propylene) obtained by quenching from the melt and subsequent annealing at different temperatures, Macromol. Chem. Phys., 198 (1997), 3979–3985.
7. N. V. Gvozdic and D. J. Meier, On the melting temperature of syndiotactic polystyrene: 2. Enhancement of the melting temperature of semicrystalline polymers by a novel annealing procedure, Polym. Commun., 32 (1991), 493–494.
8. J. Petermann, M. Miles, and H. Gleiter, Growth of polymer crystals during annealing, J. Macromol. Sci., B12 (1976), 393–404.
9. P. Dreyfus and A. Keller, A simple chain refolding scheme for the annealing behavior of polymer crystals, Polym. Lett., 8 (1970), 253–258.
10. G. S. Y. Yeh, R. Hosemann, J. Loboda-Cackovic, and H. Cackovic, Annealing effects of polymers and their underlying molecular mechanisms, Polymer, 17 (1976), 309–318.
11. S. Piccarola, V. Brucato, and Z. Kiflie, Non-isothermal crystallization kinetics of PET, Polym. Eng. Sci., 40 (2000), 1263–1272.
12. J. E. Spruiell and J. L. White, Structure development during polymer processing: studies of the melt spinning of polyethylene and polypropylene fibers, Polym. Eng. Sci., 15 (1975), 660–667.
13. L. D. Landau, On the Theory of Phase Transitions (in Collected papers), Gordon and Breach, New York, USA, 1967.
14. C. Eckart, The thermodynamics of irreversible processes. IV. The theory of elasticity and anelasticity, Phys. Rev., 73 (1948), 373–382.
15. K. Kannan and K. R. Rajagopal, A thermomechanical framework for the transition of a viscoelastic liquid to a viscoelastic solid, Math. Mech. Solid., 9 (2004), 37–59.
16. K. R. Rajagopal and A. R. Srinivasa, A thermodynamic framework for rate type fluid models, J. Non-Newtonian Fluid Mech., 88 (2000), 207–227.
17. K. Kannan and K. R. Rajagopal, Simulation of fiber spinning including flow induced crystallization, J. Rheol., 49 (2005), 683–703.
18. I. H. Hillier, Modified Avrami equation for the bulk crystallization kinetics of spherulitic polymers, J. Polym. Sci., 3 (1965), 3067–3078.
19. A. E. Green and P. M. Naghdi, On thermodynamics and nature of second law, Proc. Roy. Soc. Lond. A., 357 (1977), 253–270.
20. M. Gordon and I. H. Hillier, Mechanism of secondary crystallization of polymethylene, Phil. Mag., 11 (1965), 31–41.
21. F. P. Price, A phenomenological theory of spherulitic crystallization: primary and secondary crystallization processes, J. Polym. Sci., 3 (1965), 3079–3086.
22. O. Verhoyen, F. Dupret and R. Legras, Isothermal and non-isothermal crystallization kinetics of polyethylene terephthalate: mathematical modeling and experimental measurement, Polym. Eng. Sci., 38 (1998), 1594–1610.
23. K. Nakamura, T. Watanabe, and K. Katayama, Some aspects of nonisothermal crystallization of polymers. I. Relationship between crystallization temperature, crystallinity and cooling conditions, J. Appl. Polym. Sci., 16 (1972), 1077–1091.
CHAPTER EIGHT

Effects of Stress on Formation and Properties of Semiconductor Nanostructures

Harley T. Johnson∗

Contents
8.1 Overview
8.2 Background
8.2.1 Applications of semiconductor nanostructures
8.2.2 Fabrication methods
8.2.3 Fundamental properties of nanostructures
8.2.4 Modeling methods for semiconductor nanostructures
8.3 Effects of Stress on the Formation of Semiconductor Nanostructures
8.3.1 Modeling stress-induced surface self-assembly
8.3.2 Modeling stress effects in the sputter-erosion instability
8.3.3 Modeling stress effects in compositional segregation in thin films
8.4 Stress Effects on the Electronic/Optical Properties of Semiconductor Nanostructures
8.4.1 Models for stress effects on parallel transport in thin films
8.4.2 Modeling the effects of stress on quantum confinement in wires and dots
8.4.3 Multiscale coupled mechanical/electronic modeling in semiconductor nanostructures
8.5 Conclusions

Abstract

Quantum wires and quantum dots are semiconductor structures with two or more physical dimensions on the order of 10 nm or smaller. These structures have applications in nanoelectronics, optoelectronics, information technology, and biotechnology, due to the quantum mechanical confinement provided by the small spatial extent of the structures. At these very small scales, the continuum approximation breaks down and it becomes necessary to model the materials at the atomistic level; physically, the electronic and mechanical properties become strongly coupled. Experimental evidence shows that mechanical stress has important effects on these systems in terms of both fabrication and device applications. In this chapter, an overview of issues in modeling of stress effects on formation and properties of semiconductor nanostructures is presented. A summary of the applications of these structures is first presented, followed by a brief discussion of modeling methods available for studying quantum dots and wires. Then several problems are presented as examples of stress effects on semiconductor nanostructure formation: stress-driven self-assembly of quantum dots, the sputter erosion surface instability, and the stress-affected phase separation in semiconductor alloys. Three areas in which stress affects semiconductor nanostructure device properties are then discussed: the stress effect on parallel transport in semiconductor films, the effects of stress on electronic and optical properties of quantum dots and quantum wires, and the multiscale modeling of coupled mechanical/electronic properties in semiconductor nanostructures. Finally, prospects and challenges for future modeling of stress effects in semiconductor nanostructures are discussed.

Key Words: Semiconductor material, mechanics of thin films, quantum dots, self-assembly, electronic structure, surface instability, linear elasticity, quantum confinement, ion-bombardment, atomistic modeling, continuum modeling

∗ Department of Mechanical Science and Engineering, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA; e-mail: [email protected]
Material Substructures in Complex Bodies, ISBN-10: 0-08-044535-7. © 2007 Elsevier Ltd. All rights reserved.
8.1 Overview

This chapter focuses on mathematical and physical aspects of modeling semiconductor nanostructures. There are several excellent full-length monographs on the physics and technology of semiconductor nanostructures [1–3], and an equally broad and comprehensive review of the topic is not the aim of the present work. The purpose of this chapter is to present an overview of the ways in which mechanical stress affects semiconductor nanostructures from the perspectives of both fabrication and application. The physical problems of interest and the small scale of the structures demand models that include atomistic, continuum, and multiscale features. Furthermore, an understanding of the important phenomena requires an inherently coupled or multiphysics approach.
8.2 Background

Semiconductor nanostructures include three classes of structures with at least one physical dimension on the order of tens of nanometers or smaller: thin films with thickness of less than 100 nm, nanowires with two characteristic dimensions less than 100 nm and the third dimension larger, and nanodots, with all three dimensions having characteristic lengths of less than 100 nm. Applications of these material systems range from mechanical to electronic, optical, and biological, but the most novel uses of semiconductor nanostructures all exploit the feature of electron quantum confinement due to the one or more spatial dimensions of less than 100 nm, which approaches the de Broglie wavelength of the electron in the base material. Thus, nanowires and nanodots are often referred to as low-dimensional structures, and more popularly as quantum wires and quantum dots. The purpose of this chapter is to discuss modeling of the coupled mechanical and physical properties of these unique systems and particularly the way in which stress affects the formation and resulting properties of thin films, quantum wires, and quantum dots.

This section of the chapter covers background on semiconductor nanostructures. In the following subsections, some basic applications of semiconductor nanostructures are reviewed. Then several subsections are devoted to discussing principles underlying the important mechanical and physical properties of the structures.
8.2.1 Applications of semiconductor nanostructures

Applications of semiconductor thin films, quantum wires, and quantum dots, referred to collectively as quantum nanostructures, are made possible by quantum confinement of otherwise free electrons in the systems. The uses of quantum nanostructures depend on properties of confined electron wave functions or the confined electron energy levels. In principle, semiconductor structures of all sizes feature confined electron states, but only in nanoscale structures is the separation of these energy levels large enough to be experimentally useful. A simple estimate of the scale of the energy separation is given by considering the well-known particle-in-a-box energy levels for a three-dimensional system of size scale $L$, or
\[
E_n = \frac{\hbar^2 \pi^2 n^2}{2 m^{*} L^2},
\tag{8.1}
\]
where $\hbar$ is the reduced Planck constant and $m^{*}$ is the electron effective mass. For typical semiconductor electron effective mass values on the order of a tenth of the free electron mass, the energy separation between the first and second confined modes in a box of 10 nm size is more than 100 meV. This scale is large enough to be measured using a variety of electrical and optical techniques, and it makes possible a number of useful applications in the areas of optoelectronics and nanophotonics, nanoelectronics and quantum computing, and bionanotechnology.
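The order-of-magnitude estimate quoted for equation (8.1) can be reproduced with a few lines of Python. The sketch below assumes a cubic box with $n^2 = n_x^2 + n_y^2 + n_z^2$, so that the two lowest levels correspond to the index sets (1,1,1) and (2,1,1); this reading of $n$, the function name `box_energy_meV`, and the effective mass ratio of 0.1 are assumptions for illustration, while the physical constants are standard SI values.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # free electron mass, kg
EV = 1.602176634e-19     # joules per electronvolt

def box_energy_meV(n_squared, L_nm, m_eff_ratio):
    """Particle-in-a-box level E = hbar^2 pi^2 n^2 / (2 m* L^2), returned in meV."""
    L = L_nm * 1.0e-9
    energy_joule = HBAR ** 2 * math.pi ** 2 * n_squared / (2.0 * m_eff_ratio * M_E * L ** 2)
    return energy_joule / EV * 1.0e3

# Cubic box of side 10 nm with an assumed effective mass of 0.1 free electron masses.
ground = box_energy_meV(1 + 1 + 1, L_nm=10.0, m_eff_ratio=0.1)
first_excited = box_energy_meV(4 + 1 + 1, L_nm=10.0, m_eff_ratio=0.1)
separation = first_excited - ground
print(f"separation between the two lowest levels: {separation:.1f} meV")
```

Under these assumptions the separation comes out slightly above 100 meV, consistent with the estimate in the text; since the energy scales as $1/L^2$, a box ten times larger would give a separation one hundred times smaller.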
8.2.1.1 Optoelectronics and nanophotonics

Semiconductor nanostructures, and particularly quantum dots, have been studied and used most widely in applications relating to their optical properties. Perhaps the most useful nanostructures in this area are the direct bandgap type-I semiconductor quantum dots, since in these materials the bandgap is arranged so that transitions between confined electron energy levels and confined valence band energy levels can easily be mediated by interactions with individual photons. In this case the energy difference given (approximately) by the bandgap and the confined energy levels,
\[
E = E_{BG} + E_i^{\mathrm{electron}} + E_j^{\mathrm{hole}},
\tag{8.2}
\]
will match the energy of an emitted or absorbed photon, so that
\[
E = \hbar\omega,
\tag{8.3}
\]
where $\hbar$ is the reduced Planck constant and $\omega$ is the photon angular frequency, often in the visible, near infrared, or infrared range for group III–V or group II–VI semiconductor quantum dots. A number of important applications of semiconductor quantum dots depend on emitted or absorbed light at frequencies determined by this simple relationship. Among these applications are laser sources [4], active elements in nanophotonics [5], and photodetectors [6]. In many cases, quantum dots, quantum wires, and thin films are all candidate structures for optoelectronics and nanophotonics. But quantum dots hold tremendous promise over other higher-dimensional structures such as quantum wires and thin films due to their delta-function-like density of states, which allows for narrow line widths in the optical spectra of the devices. This is important in both active (source) and passive (detector) elements.

8.2.1.2 Nanoelectronics and quantum computing

Quantum dots and wires also have promise for next generation devices in nanoelectronics and in new quantum mechanical computing paradigms. Thin films with thicknesses approaching the few monolayer range are already in use in conventional devices in the microelectronics industry, but the quantum confinement effects in quantum dots have been proposed for use in devices operating in entirely new ways. It has been proposed that single electron transistors, for example, could take advantage not only of the discrete spectrum of energy levels in nanostructures, but also of the so-called Coulomb blockade, whereby charging of the structures is electrostatically limited to one or a few electrons at a time [7]. Coulomb interaction effects in quantum dots also form the basis on which some proposed quantum computing devices would operate [8]. Semiconductor nanostructure-based quantum computing elements would serve one of two functional purposes: as memory devices, in which quantum bits would store information for short periods to be used for logic operations; and as logic devices that would carry out simple operations very quickly by exploiting the quantum mechanical properties of confined energy states. In either memory or logic devices, quantum computing elements could operate on the properties of either charge confinement or spin manipulation, i.e. through the presence or absence of an exciton (electron/hole pair), or through the existence of a spin up or spin down state. New applications of quantum dots in computing are still being conceived, and only a few possible applications are in development.

8.2.1.3 Bionanotechnology

Semiconductor nanostructures, specifically quantum dots, have recently been brought into commercial use in some biomedical applications. Due to their unique optical properties, quantum dots have found some specialized uses in medical imaging as labeling agents that are superior to some conventional fluorescence methods. For example, free-standing (often colloid-based) quantum dots conjugated to antibodies can be used to mark specific biomolecules in live cells [9]. In such an application, light emission at a particular frequency can be used to indicate the presence of a particular protein. An even more fascinating application as a biological marker is reported by Winter and co-workers [10]. In this case quantum dots are attached to individual living neurons, and when the cells are electrically active, the quantum dots are directly excited by the electrical stimulus and light emission is induced. As with the nanophotonic and nanoelectronic applications of quantum dots, the biological applications are also based on the quantum confining features of the nanostructures.
8.2.2 Fabrication methods Semiconductor nanostructures such as quantum dots and quantum wires have emerged as a new technology due to decades of work in microelectronic materials processing. The basic classes of nanostructure formation are (i) top-down or lithographic methods, and (ii) bottom-up, or spontaneous self-assembly methods. 8.2.2.1 Top-down or lithographic methods Lithographic methods for semiconductor nanostructure formation emerged as a direct consequence of advances in the spatial resolution of conventional microelectronics processing technologies. These methods involve, essentially, the masking of selective nanoscale regions of a semiconductor film that is otherwise large in lateral extent, and the subsequent removal of all surrounding unmasked material. As combined use of photolithography and electron-beam or X-ray lithography has led to minimum feature sizes below 100 nm, semiconductor nanostructures can be directly fabricated in a top down manner. Still, controlled fabrication of quantum wires and quantum dots with dimensions in the 10 nm range is beyond the capability of current lithographic methods. 8.2.2.2 Bottom-up or spontaneous self-assembly methods There is considerable research activity in semiconductor nanostructure fabrication using bottom-up or self-assembly methods. The spontaneous formation of semiconductor quantum dots was observed in the early 1990s [11,12] as a result of what was first considered to be a defective film growth mode. Since then, several mechanisms for the formation of nanoscale clusters of semiconductor material have been identified in a range of processing scenarios, and each has been explored as a method for fabricating electrically active quantum dots or quantum wires. Stress is involved directly in each of these mechanisms, and because the focus of this chapter is on stress effects on the formation and properties of semiconductor nanostructures, each of the mechanisms is introduced in more detail below. 
The specific role of stress in these mechanisms, and the modeling methods associated with them, are discussed in Section 8.3 of this chapter. A tremendous amount of research effort has been devoted to understanding stress-induced surface instabilities that can lead to the formation of nanostructures in semiconductor materials. Stress-induced surface instabilities were first observed along stressed solid surfaces in the problem of growing stress-corrosion cracks in metal specimens, and explained by Asaro and Tiller [13]. Much later, a similar mechanism was used to explain the formation of quantum dots in the so-called Stranski–Krastanov growth mode, in which growing strained film first adopts a flat configuration and then shifts to a periodic morphology when the
Effects of Stress on Formation and Properties of Semiconductor Nanostructures
289
Figure 8.1 Schematic of three thin film growth modes. Left: Frank–van der Merwe or planar layer-by-layer growth. Center: Stranski–Krastanov or island growth on a wetting layer. Right: Volmer–Weber or island growth with no wetting of the substrate. (See also colour plate 7.)
relieved strain energy exceeds the added surface energy associated with ripple formation [14]. This explanation is also used to understand the Volmer–Weber film growth mode, in which deposited strained material never fully wets a substrate surface, but instead immediately takes on an island-like morphology. The Stranski–Krastanov, Volmer–Weber, and Frank–van der Merwe growth modes are illustrated schematically in Figure 8.1. The latter mode is dominant when the growing film is unstrained, so the morphology remains flat. In contrast to the growth mode stress-induced instabilities described above, there are several lesser-known surface material-removal instabilities that have been shown to produce device-quality semiconductor nanostructures. Stress is also believed to play a role in these subtractive processes. For example, under certain conditions ion-bombardment, the physical component of plasma processing, has long been known to lead to spontaneous nanostructure formation on surfaces [15].This phenomenon occurs because the ion-bombardment of a nonplanar surface results in selectively enhanced sputtering in “valleys’’ and reduced sputtering on“hills’’. The role of stress in this mechanism is described in detail in Section 8.3. A third category of spontaneous formation of semiconductor nanostructures occurs in the presence of a bulk configuration instead of at a free surface. When a semiconductor heterostructure is fabricated from layers consisting of alternating ternary or quaternary compound compositions, such as Inx Ga1−xAs and Iny Ga1−yAs, phase separation occurs if the three materials are not fully miscible at the compositions x and y [16]. Stress contributes to the extent of the phase separation, which is accomplished via bulk diffusion in the heterostructure. 
While the mechanisms and modeling of the stress effect in this case are rather different from those in the previous two categories, the resulting microstructure has also been shown to be useful for quantum dot and quantum wire devices. Interestingly, the characteristic length scales of this mechanism are comparable to those of the surface instability mechanisms. The details of the stress effect in this case are also described in Section 8.3.
8.2.3 Fundamental properties of nanostructures

The applications of semiconductor nanostructures are generally based on their unique electronic and corresponding optoelectronic characteristics. In order to study the effects of stress in these structures it is critical to understand both the fundamental mechanical properties of the structures and the unique electronic and optical properties.
Harley T. Johnson
8.2.3.1 Mechanical properties

The mechanical behavior of semiconductor nanostructures such as quantum dots and quantum wires is dominated by linear elastic properties, since these structures are typically single crystalline semiconducting materials. These materials have elastic properties similar to those of structural metals or ceramics. Often they are based on the open crystal structures that are typical of covalently bonded materials, such as diamond cubic (group IV elements and alloys) or zinc blende and wurtzite (III–V alloys). Dislocations in these crystal structures take many complex forms, but in most applications of semiconductor nanostructures dislocations are avoided by design. Epitaxially grown structures may have lattice mismatch strains on the order of a few percent, but most structures remain defect free due to the small volumes and complex morphologies adopted by the materials.

8.2.3.2 Electronic and optical properties

The electronic and corresponding optoelectronic properties of semiconductor nanostructures are governed, as described above, by the electron and hole states confined by the small spatial extent of the nanostructures. The delta-function-like densities of states of quantum dots and quantum wires, in particular, govern the special electronic and optical applications of these structures. Figure 8.2 shows a schematic view of the density of electronic states in semiconductor materials with decreasing dimensionality, from bulk materials to quantum dot configurations. The energy scale separating the peaks in the densities of states for quantum wires and quantum dots can be estimated using equation (8.1). The occupation of and transitions between these states govern most of the useful properties, including
[Figure 8.2 panels: density of states versus energy for bulk, film, wire, and dot configurations.]
Figure 8.2 Schematic of the electron density of states in bulk and quantum confined material systems. The delta-function-like densities of states in quantum wire and quantum dot configurations are desirable for many nanoelectronic and optoelectronic devices. (See also colour plate 8.)
Effects of Stress on Formation and Properties of Semiconductor Nanostructures
both the parallel transport and perpendicular, or quantum confined, properties of the structures. Electronic transport in the long directions of nanostructures, e.g. along the length of a nanowire, or in the plane of a film of nanometer thickness, is characterized on the basis of classical transport models. In this case it is common to define an electron or hole mobility, so that the conductivity is given by

\[ \sigma = n e \mu, \tag{8.4} \]
where n is the number density of carriers, either electrons or holes, e is the fundamental charge, and μ is the mobility of either an electron or a hole. For the present purposes this simple model is sufficient to describe the parallel transport characteristics of a nanoscale thin film structure. Mechanical effects on the conductivity can then be considered as influencing the in-plane electron or hole mobility μ. A more fundamental mechanistic definition for the mobility is given by adopting a classical scattering model in which the mobility is proportional to the mean scattering time τ and inversely proportional to the effective mass m∗. In this case

\[ \mu = \frac{e\tau}{m^{*}}. \tag{8.5} \]
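As a rough numerical illustration of equations (8.4) and (8.5), the sketch below evaluates the mobility and the conductivity for one set of inputs. The scattering time, effective mass ratio, and carrier density are assumed values chosen only for illustration; they are not taken from this chapter.

```python
# Illustrative evaluation of equations (8.4) and (8.5) in SI units.
E_CHARGE = 1.602176634e-19     # fundamental charge, C
M_ELECTRON = 9.1093837015e-31  # free electron mass, kg


def mobility(tau_s, m_eff_ratio):
    """Equation (8.5): mu = e * tau / m*, in m^2/(V s)."""
    return E_CHARGE * tau_s / (m_eff_ratio * M_ELECTRON)


def conductivity(n_per_m3, mu):
    """Equation (8.4): sigma = n * e * mu, in S/m."""
    return n_per_m3 * E_CHARGE * mu


# Assumed inputs (hypothetical, for illustration only).
tau = 1.0e-13    # mean scattering time, s
m_ratio = 0.26   # effective mass divided by the free electron mass
n = 1.0e22       # carrier number density, m^-3

mu = mobility(tau, m_ratio)
sigma = conductivity(n, mu)
print(f"mobility = {mu * 1e4:.0f} cm^2/(V s), conductivity = {sigma:.1f} S/m")
```

In this picture a strain-induced change in τ or m∗ enters the conductivity linearly through μ, which is the sense in which mechanical effects are said to act on the mobility.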
Any mechanical coupling to the electronic transport properties can be viewed as inducing changes to the mean scattering time. Along the length of a nanowire, such as a carbon nanotube, an additional effect on electronic properties can arise from lateral quantum confinement of the charge carriers. In this case, the parallel transport conductance of the structure becomes quantized, with the fundamental unit of conductance given by

\[ G = \frac{2e^{2}}{h}, \tag{8.6} \]
where h = 2πℏ is Planck's constant. In this case, mechanical effects influence the quantized density of states, shown for a range of structures in Figure 8.2; these effects directly modify the energies at which the quantized conductance increases or decreases [17].

The most significant feature governing the properties of semiconductor nanostructures is the confinement of electrons and holes leading to a discretized spectrum of energy levels or density of states. As noted above, all of the important applications of these structures make use of the quantum confined energy states. In the present work a mean-field or continuum view is adopted for the electronic properties of semiconductor nanostructures. It is possible to build up an understanding of these properties from first-principles methods, in which the quantum dot or quantum wire electronic properties are a consequence of the core and valence electron configuration around each nucleus [18], but for the practical reason of understanding the continuum mechanical effects on the electronic properties, a single-electron or continuum view is taken here. In this approach, the electron and hole properties are governed by a single-electron steady state Schrödinger equation written in a k · p Hamiltonian form,
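given by

\[ -\frac{\hbar^{2}}{2m}\,\nabla_i L_{ij}^{\alpha\beta}\,\nabla_j \Psi^{\beta} + V^{\alpha\beta}\Psi^{\beta} = E\,\Psi^{\alpha}, \tag{8.7} \]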
where Ψ is the wave function of the carrier, V is a spatially varying electrostatic potential field, and L is the Luttinger–Kohn Hamiltonian containing the effective mass tensor parameters and off-diagonal energy band coupling terms. In the case when energy subbands α and β are decoupled, the formulation is simply a single-electron effective mass Schrödinger equation, given by

\[ -\frac{\hbar^{2}}{2m}\,\nabla_i \frac{m}{m^{*}_{ij}}\,\nabla_j \Psi + V\Psi = E\,\Psi. \tag{8.8} \]
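A minimal numerical sketch of equation (8.8) in one dimension may help fix ideas. It assumes an isotropic, uniform effective mass, zero potential inside the structure, and hard-wall boundaries, and it uses a simple finite difference discretization rather than the finite element method discussed later in the chapter. The well width and the effective mass ratio are hypothetical values. The comparison formula is the textbook infinite-well result E_n = ℏ²π²n²/(2m∗L²), which is assumed here to correspond to the particle-in-a-box levels of equation (8.1).

```python
import numpy as np

HBAR2_OVER_2M0 = 0.0380998  # hbar^2 / (2 m0) in eV nm^2


def box_levels_fd(width_nm, m_eff_ratio, n_points=800, n_levels=3):
    """Lowest eigenvalues (eV) of -(hbar^2 / 2m*) psi'' = E psi, psi = 0 at the walls."""
    x = np.linspace(0.0, width_nm, n_points + 2)[1:-1]  # interior grid nodes
    dx = x[1] - x[0]
    t = HBAR2_OVER_2M0 / (m_eff_ratio * dx**2)
    # Standard three-point stencil for the second derivative.
    hamiltonian = (np.diag(np.full(n_points, 2.0 * t))
                   - np.diag(np.full(n_points - 1, t), 1)
                   - np.diag(np.full(n_points - 1, t), -1))
    return np.linalg.eigvalsh(hamiltonian)[:n_levels]


def box_levels_exact(width_nm, m_eff_ratio, n_levels=3):
    """Infinite-well levels E_n = hbar^2 pi^2 n^2 / (2 m* L^2) in eV."""
    n = np.arange(1, n_levels + 1)
    return HBAR2_OVER_2M0 * (n * np.pi / width_nm) ** 2 / m_eff_ratio


# Assumed inputs: a 10 nm wide structure and m*/m0 = 0.067 (illustrative only).
numeric = box_levels_fd(10.0, 0.067)
exact = box_levels_exact(10.0, 0.067)
for n, (e_num, e_ex) in enumerate(zip(numeric, exact), start=1):
    print(f"n = {n}: finite difference {e_num:.4f} eV, analytic {e_ex:.4f} eV")
```

A strain-dependent potential or effective mass would enter through the diagonal of the matrix and through the hopping term t, respectively, which is where a coupled mechanical–electronic calculation would modify this sketch.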
The electronic effect of the positively charged nuclei is accounted for in a mean-field sense through the electron effective mass m∗. The effective potential accounts for all electrostatic contributions to the potential energy of the charge carrier. The solutions of the equation consist of energy levels and wave functions for either a single electron or single hole in this effective medium approach. For a one-dimensional system, the energy solutions are approximately given by the particle-in-a-box energy levels in equation (8.1).

Transitions between the energy levels govern most important electronic and optical applications of these systems. For example, light of specific frequencies, given by equation (8.3), can be absorbed or emitted if the energy of the photon matches differences between available electron or hole states. For interband transitions between electron and hole states, the primary contribution to the spectrum of allowed frequencies is from combinations of electron and hole states with matching quantum numbers. This simple selection rule is also known as the Δn = 0 rule. However, due to the differences in confining potential and effective masses between the conduction and valence bands, the spatial extent of wave functions with matching quantum numbers is not perfectly matched, so a more complete analysis of optical emission and absorption frequencies can be obtained from the optical conductivity, given by [2]

\[ \sigma_1(\omega) = \frac{2\pi e^{2}}{\Omega\,\omega\, m} \sum_{i,j} \left|\mathbf{e}\cdot\mathbf{p}_{i,j}\right|^{2} \left|\langle i | j \rangle\right|^{2} \delta(E_i - E_j - \hbar\omega), \tag{8.9} \]

where the main terms in the summation, in order, represent the polarization of the interacting light relative to the momentum of the charge carrier, the spatial overlap of the two confined states i and j, and the matching criterion for the frequency relative to the energy difference between the two states. The volume of the system is given by Ω.
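The overlap factor in equation (8.9) can be illustrated with a small one-dimensional calculation. The sketch below solves for the two lowest states in two finite square wells of the same width but different depth and effective mass, standing in for a conduction band electron and a valence band hole, and then forms the squared overlaps. All parameter values are hypothetical, and the effective mass is taken to be the same in the well and in the barrier, which is a simplification.

```python
import numpy as np

HBAR2_OVER_2M0 = 0.0380998  # hbar^2 / (2 m0) in eV nm^2


def well_states(m_eff_ratio, depth_ev, width_nm=10.0, box_nm=40.0,
                n_points=600, n_states=2):
    """Lowest states of a finite square well, by finite differences."""
    x = np.linspace(-box_nm / 2, box_nm / 2, n_points + 2)[1:-1]
    dx = x[1] - x[0]
    t = HBAR2_OVER_2M0 / (m_eff_ratio * dx**2)
    potential = np.where(np.abs(x) <= width_nm / 2, 0.0, depth_ev)
    hamiltonian = (np.diag(2.0 * t + potential)
                   - np.diag(np.full(n_points - 1, t), 1)
                   - np.diag(np.full(n_points - 1, t), -1))
    energies, vectors = np.linalg.eigh(hamiltonian)
    # Columns of `vectors` are orthonormal on the grid.
    return energies[:n_states], vectors[:, :n_states]


# Assumed parameters: (m*/m0, barrier height in eV) for each carrier type.
E_e, psi_e = well_states(0.067, 0.30)  # electron-like
E_h, psi_h = well_states(0.50, 0.15)   # hole-like

# overlap2[i, j] = |<e_i | h_j>|^2; both sets share the same grid.
overlap2 = (psi_e.T @ psi_h) ** 2
print(np.round(overlap2, 4))
```

With these inputs the diagonal entries, which pair states of matching quantum number, dominate but fall slightly below one, while the off-diagonal entries vanish by symmetry. This is the sense in which the Δn = 0 rule is only approximate for the overlap term.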
8.2.4 Modeling methods for semiconductor nanostructures

Analysis of stress effects in nanostructures requires studying both the mechanical behavior and the electronic properties in a coupled manner using either continuum or atomistic methods. The methods must account sufficiently for the properties described above. While the focus of this book is on mean-field or continuum effective medium methods, which are described next, it is also instructive to briefly review suitable atomistic methods.

8.2.4.1 Continuum modeling methods

The linear elastic analysis of stress and strain in semiconductor nanostructures is conveniently done using continuum finite element methods. Since the systems are expected to be free of defects, and because the elastic properties of the heterostructure materials are well characterized, there is no difficulty in accurately solving for the deformation fields in this manner. Even the small length scale of the problem is generally not a barrier to adopting a continuum approach. Highly nonuniform deformation, such as may be found near interfaces and crystallographically sharp geometrical features of the structures, is generally well resolved with continuum elasticity modeling down to length scales of just a few atomic spacings [19]. Because accurate commercial finite element packages with elasticity modules are widely available, the use of finite difference or spectral element methods is very uncommon in studying the mechanics of semiconductor nanostructures [20]. In principle, however, these methods would also be suitable.

The electronic structure analysis of semiconductor nanostructures can also be carried out in a continuum effective medium approach, and implemented computationally by means of the finite element method, the finite difference method, or the spectral element method. The underlying continuum equation describing the system, the steady state single-electron Schrödinger equation, is described in the previous section.
This system is ideally suited for finite element analysis, like the elasticity problem, and it can be solved conveniently using the same mesh. While a finite element solution to this equation is much less common than for the elasticity problem, it is becoming more common and can now be done using commercial finite element software [21].

8.2.4.2 Atomistic modeling methods

With ever-increasing computational resources, semiconductor nanostructures are seemingly well suited for atomistic total energy and electronic structure analyses. In practice, however, the lack of defects and the highly ordered nature of the systems at the atomic scale make continuum analyses very accurate and desirable. Nevertheless, empirical potentials could be used to compute mechanical fields while tight-binding or density-functional theory methods could be used for computing electronic properties.

Empirical interatomic atomistic methods are available for modeling the structure and stress in semiconductor nanostructures through total energy calculations.
In general, semiconductor structures consisting of groups III, IV, and V elements and compounds crystallize in open lattice structures with predominantly covalent bonding. These systems are well suited to pair and cluster potential methods. With accurate bond-angle dependent total energy potentials available for these materials it is generally unnecessary to adopt environment dependent pair or cluster functional methods. For example, silicon is well characterized using the Stillinger–Weber potential [22]. Structures and stress in other compound semiconductors are commonly studied using methods such as the Valence Force Field approach [23]. However, in systems that are often dominated mechanically by the presence of free surfaces and interfaces, it is critical that the available potentials be transferable. In many cases, such as the often studied [110] free-surface structure of some III–V alloys, empirical potentials are not accurate enough [24].

To accurately resolve some structure and stress features in semiconductor nanostructures, and to take advantage of electronic structure information for coupled mechanical–electronic studies, tight-binding and first-principles atomistic calculations are sometimes preferable. At the present time these methods are too computationally costly to study larger structures or arrays of quantum dots; for some smaller specialized problems tight-binding total energy and electronic structure calculations are possible. A complete discussion of the essential differences between these methods, e.g. the treatment of fully quantum mechanical effects such as exchange and correlation energies of the electrons, is beyond the scope of the present chapter. More on the application of these methods to the problem of quantum confinement in semiconductor quantum dots is presented in Section 8.4.
Nevertheless, it is worth noting that these more computationally intensive electronic structure methods have many inherent advantages over empirical atomistic methods or continuum mean-field methods because they make possible direct, fully coupled studies of electronic and mechanical structure.

8.2.4.3 Multiscale modeling methods

The computational expense of quantum mechanical or simpler empirical atomistic methods makes the growing class of multiscale modeling methods increasingly attractive for studying semiconductor nanostructures. Most multiscale methods developed in the mechanics community in recent years can be described as either concurrent methods or hierarchical methods. Concurrent methods involve modeling various regions of a domain with different levels of total energy approximation, but only using a single method for each region. The regions are carefully joined so as to avoid mismatched forces or stiffnesses. One successful example of a concurrent method is the Macro-Atomistic-Ab initio-Dynamics method (MAAD) used to study fracture in silicon by Broughton and co-workers [25]. Hierarchical methods involve embedding a higher accuracy energy calculation inside a less costly mean-field or continuum framework, so that information from a finer scale feeds analysis at a coarser scale. An example of this type of analysis is the quasicontinuum method first formulated by Tadmor and co-workers [26]. As has been repeatedly shown in multiscale modeling methods for a range of applications, in the case of semiconductor nanostructures only specific problems are
suited to multiscale analysis. One such case is demonstrated by Johnson and coworkers for deforming silicon nanostructures [27], although significant problems with interfaces and free surfaces make the value of such a local quasicontinuum method for real quantum dots very limited. Thus, while these methods are instructive for some cases, fully atomistic or fully continuum methods are more practical in general. Still, despite rapidly diminishing computational limitations due to cheap and fast parallel clusters, continuum or mean-field methods are the current choice for studying the effects of stress on formation and properties of semiconductor nanostructures.
8.3 Effects of Stress on the Formation of Semiconductor Nanostructures

The presence of mechanical stress plays a role in the formation of semiconductor nanostructures by means of either the top-down lithographic methods or the bottom-up spontaneous self-assembly phenomena described in the previous section. In nanostructure formation by self-assembly, stress in some cases provides the primary force that drives the formation, while in other cases it provides a force resisting nanostructure development. In any of the self-assembly cases it is necessary to understand the role of stress and to include it in any accurate model of the formation process. The effects of stress on the various spontaneous formation phenomena for semiconductor nanostructures are described below.
8.3.1 Modeling stress-induced surface self-assembly

In the introductory section the phenomenon of stress-driven surface self-assembly of nanostructures was described qualitatively. Here the theory and modeling of this formation process are outlined by examining several significant contributions in the literature, with some comments about prospects for more accurate all-atomistic modeling of the phenomenon.

8.3.1.1 Mechanism of stressed surface instability

Work on stress-corrosion cracking by Asaro and Tiller [13] and extensions of the work to other systems by Grinfeld [28] and then Srolovitz [29] led to a framework for understanding the competition between strain energy and surface energy that can lead to nanoscale ripple formation in stressed epitaxial films. This instability is a primary mechanism for explaining the spontaneous formation of semiconductor heterostructures. The seminal contributions of Asaro, Tiller, Grinfeld, and Srolovitz (ATGS) described the dynamical behavior of these strained surfaces with an emphasis on linear stability analysis. A material system in this configuration, in general, is characterized by a variable describing the tendency for shape change in order to reduce total free energy. This variable, surface chemical potential, represents the change in energy per unit volume needed to add an infinitesimal element
Figure 8.3 Schematic of the Asaro–Tiller–Grinfeld–Srolovitz surface instability. Mass transport changes the shape of the surface, characterized by coordinate s. The shape change can be described by the surface normal velocity $v_n$.
of material at a position s along a surface, as shown in Figure 8.3. The chemical potential is given by

\[ \chi(s) = \Omega\,[\phi(s) - \kappa(s)\gamma], \tag{8.10} \]

where φ(s) is the strain energy density at position s, κ(s) is the curvature of the surface at position s, γ is the surface energy density, and Ω is the atomic volume. The two terms representing strain energy density along the surface and added surface energy due to curvature, or capillarity, compete to determine the total chemical potential. Assuming that the surface energy density is uniform, whether it is isotropic or not, the strain energy density plays a clear role in driving the surface instability. If a uniformly stressed surface, loaded e.g. by lattice or thermal mismatch strain, takes on a periodic morphology, the stress field near the surface is also periodic. Thus, the strain energy density also shares the periodicity of the surface, so the local chemical potential is influenced by the morphology. Peaks constitute areas of reduced chemical potential and valleys constitute areas of increased chemical potential due to stress concentrations. Then, given sufficient mobility, added material will diffusively migrate to the peaks instead of the valleys. The effect of surface energy or capillarity reduces this tendency over a range of length scales. But according to the linear stability analysis, the surface will nevertheless be unstable with respect to perturbations in the range of a few tens to hundreds of nanometers for typical semiconductors. The critical wavelength, above which capillarity is unable to stabilize the surface, is given by

\[ \lambda_{cr} = \frac{\pi}{1+\nu}\,\frac{\gamma}{U_0}, \tag{8.11} \]

where ν is Poisson's ratio and U0 is the strain energy density in a biaxially strained surface. It is worth noting that since U0 varies with the square of the stress, semiconductor thin films with larger lattice mismatch have much smaller critical wavelengths; this is consistent with observations of formation of semiconductor quantum dots and quantum wires in a range of strained groups IV and III–V semiconductor alloys.
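To give a feel for the magnitudes in equation (8.11), the sketch below evaluates the critical wavelength for two mismatch strains. It assumes that the strain energy density of an equi-biaxially strained film can be written as U0 = Mε², with M the biaxial modulus; the surface energy, Poisson's ratio, and modulus are hypothetical round numbers and are not taken from this chapter.

```python
import math


def biaxial_strain_energy_density(biaxial_modulus_pa, mismatch_strain):
    """Assumed form U0 = M * eps^2 for equi-biaxial strain, in J/m^3."""
    return biaxial_modulus_pa * mismatch_strain ** 2


def critical_wavelength(gamma_j_m2, poisson, u0_j_m3):
    """Equation (8.11): lambda_cr = pi * gamma / ((1 + nu) * U0), in m."""
    return math.pi * gamma_j_m2 / ((1.0 + poisson) * u0_j_m3)


# Assumed material inputs (illustrative only).
GAMMA = 1.0   # surface energy density, J/m^2
NU = 0.27     # Poisson's ratio
M = 140.0e9   # biaxial modulus, Pa

for strain in (0.01, 0.04):
    u0 = biaxial_strain_energy_density(M, strain)
    lam = critical_wavelength(GAMMA, NU, u0)
    print(f"mismatch strain {strain:.0%}: lambda_cr = {lam * 1e9:.0f} nm")
```

Because U0 scales with the square of the strain, quadrupling the mismatch strain shortens the critical wavelength by a factor of sixteen, which is the trend described in the text.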
8.3.1.2 Surface evolution models

The surface chemical potential formulation can be used as the basis for a fully continuum surface evolution analysis. A Fickian-type surface diffusive flux driven
by gradients in the surface chemical potential is governed by

\[ j(s) = -\frac{D_s c_s}{kT}\,\frac{\partial \chi}{\partial s}, \tag{8.12} \]
where Ds is the surface diffusivity, cs is the concentration of the diffusing species, and kT is thermal energy. Then, by invoking a local mass conservation along the surface, the normal velocity of the free surface is given by

\[ v_n = \frac{D_s c_s \Omega^{2}}{kT}\,\frac{\partial^{2}}{\partial s^{2}}\,[\phi(s) - \gamma\kappa(s)]. \tag{8.13} \]
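The behavior of equation (8.13) near a flat surface can be sketched in the small-slope limit without solving an elasticity problem at each step. The sketch below assumes that s ≈ x and κ ≈ ∂²h/∂x² for a shallow profile h(x), and that the surface strain energy density on a sinusoid of wavenumber k varies to first order in proportion to −kh, with the prefactor chosen so that the neutral mode reproduces the critical wavelength of equation (8.11). Under those assumptions each Fourier mode grows or decays at the rate s(k) = B[2(1+ν)U0k³ − γk⁴], where B stands for DscsΩ²/kT. All quantities are in arbitrary units, and this linearized model is an illustration, not the full coupled calculation described in the text.

```python
import numpy as np


def growth_rate(k, B=1.0, gamma=1.0, nu=0.27, U0=1.0):
    """Linearized growth rate of a mode with wavenumber k >= 0 (assumed form)."""
    return B * (2.0 * (1.0 + nu) * U0 * k**3 - gamma * k**4)


def evolve(h0, length, t, **params):
    """Advance a periodic small-slope profile h0(x) to time t, mode by mode."""
    k = 2.0 * np.pi * np.fft.rfftfreq(h0.size, d=length / h0.size)
    amplitudes = np.fft.rfft(h0) * np.exp(growth_rate(k, **params) * t)
    return np.fft.irfft(amplitudes, n=h0.size)


k_c = 2.0 * (1.0 + 0.27)     # neutral wavenumber for the default parameters
lam_cr = 2.0 * np.pi / k_c   # equals pi * gamma / ((1 + nu) * U0)

length = 4.0 * lam_cr
x = np.linspace(0.0, length, 256, endpoint=False)
long_wave = 1e-3 * np.cos(2.0 * np.pi * x / (2.0 * lam_cr))   # above lambda_cr
short_wave = 1e-3 * np.cos(2.0 * np.pi * x / (0.5 * lam_cr))  # below lambda_cr

grown = evolve(long_wave, length, t=1.0)
decayed = evolve(short_wave, length, t=1.0)
print(np.abs(grown).max() / 1e-3, np.abs(decayed).max() / 1e-3)
```

In this linearized picture perturbations longer than the critical wavelength amplify and shorter ones are smoothed by capillarity, with the fastest growth at three quarters of the neutral wavenumber. A full calculation would replace the assumed strain energy term with an elasticity solve on the current surface shape.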
This equation is known as a Mullins-type evolution equation after work on thermal grooving by Mullins in 1957 [30]. This simple model captures reasonably well the surface evolution of a strained semiconductor thin film provided that the diffusivity Ds is reasonably high, that the length scales are sufficiently small, as suggested by equation (8.11), that the strain is large, and that the surface is clean and free from impurities. These conditions are all satisfied in the case of typical molecular beam epitaxial (MBE) growth or chemical vapor deposition (CVD) of strained semiconductor heterostructures. For example, the mismatch strain in the case of Si/Ge growth is as large as approximately 4%, and the CVD growth temperature is approximately 600 °C.

Equation (8.13) can be solved explicitly by first imposing a small perturbation on a flat surface. Then the system is evolved by forward time integration for the surface height. At each small time step the corresponding elasticity problem is solved, since the change in configuration changes the driving force locally imposed by the lattice mismatch. Thus the stress continuously couples to the surface evolution. Zhang and Bower [31] have done calculations of this type for strained epitaxial film growth. The results compare qualitatively well with experimental observations of spontaneous nanostructure formation in stressed semiconductor thin films.

8.3.1.3 Evolution of surfaces with atomistic features

The continuum description of stressed surface evolution associated with the Mullins-type equation captures formation of quantum dots and quantum wires in a qualitatively accurate way. However, in some material systems atomistic features at the surface dominate the nanostructure formation process. In this physically more complex case the surface features, such as steps and adatoms, contribute their own local and long-range elastic effects to the system.
The elastic fields of these atomistic features induce interactions that essentially govern the specific faceted shape of these nanostructures. A formulation to account for these atomistic surface features in a continuum model is presented by Shenoy and Freund [32]. In this simple model, first formulated for a two-dimensional geometry and later extended to a three-dimensional geometry [33], surface energy of surfaces with a small slope h′ relative
to a high-symmetry plane in a crystalline semiconductor is expanded to include the effects of surface steps, so that

\[ \gamma[h(s), h'(s), \varepsilon(s)] = (\gamma_0 + \tau_0\varepsilon)\sqrt{1 - h'^{2}} + (\beta_1 + \beta_2\varepsilon)\,|h'| + \beta_3\,\frac{|h'|^{3}}{1 - h'^{2}}, \tag{8.14} \]
where h, h′, and ε are all functions of position s, τ0 is the surface stress of the flat surface, and β1, β2, and β3 are atomistic parameters relating to step formation energies. It is worth noting that the expression contains terms from both continuum elasticity and an atomistic-based expression for surface energy. Thus, by computing the constants β from atomistic models, the rest of the surface evolution problem can be formulated in a continuum framework. As in the numerical model reported by Zhang and Bower [31], the surface evolution is then determined from the local mass transport driven by gradients in surface chemical potential, where surface chemical potential at a point along the surface is interpreted as the energetic cost to add material at that point, per unit volume. However, due to the potentially singular nature of the surface energy as a function of vicinal angle θ, Shenoy and Freund initially adopt a Fourier transform method rather than formulating a direct time integration approach for the surface evolution process. A key result is that due to the coupling between stress and the angle dependent step energy contribution to the surface energy, a growing strained surface will tend to favor particular angles relative to the high-symmetry directions of the crystal. Thus, simply from energy minimization considerations, such a surface is expected to steadily evolve toward a faceted morphology with no nucleation barrier. This conclusion is consistent with many experimental observations of island or nano-dot growth in strained semiconductor thin films [34].

8.3.1.4 The challenge of all-atomistic modeling of stress-induced surface self-assembly

Atomistic features of surfaces of stressed semiconductor materials govern the behavior that leads to spontaneous self-assembly of nanostructures. In principle, the most accurate and useful model for nanostructure formation would be fully atomistic in nature.
Some models for properties of semiconductor nanostructures are fully resolved to the atomic scale; e.g. various atomistic methods have been used to model the electronic properties of quantum dots. However, the long time scales associated with mass transport for self-assembly limit the practicality of fully atomistic methods. Accurate tight-binding atomistic models for semiconductor quantum dots as large as tens of millions of atoms are now possible, for static analyses, and using large supercomputers [35]. The evolution problem described here, in atomistic time units of about $10^{-10}$ s, would require such a tight-binding calculation to be carried out $10^{8}$ or more times using conventional molecular dynamics methods. With present computational resources, these analyses are obviously impossible. One alternative is to use accelerated molecular dynamics methods [36] since, on the atomic scale, the self-assembly processes consist of rare thermally activated
atom motions. The other alternative is to construct multiscale models, in which atomistic information about stress effects on the formation of nanostructures must appear as input into continuum calculations. Besides the work of Shenoy and Freund [32], other authors have successfully extracted continuum surface properties from atomistic surface features; these properties could in principle be applied to continuum studies of nanostructure formation.
8.3.2 Modeling stress effects in the sputter-erosion instability

In semiconductor materials processing, the stress-driven surface instability that leads to nanostructure formation can be considered as a material-additive instability, since it generally occurs during deposition by either CVD or MBE. In that case, the added strain energy in a film of increasing thickness gives rise to the force driving the instability. By contrast, there is a class of material-subtractive surface instabilities that also results in nanostructure formation. Chemical etching of stressed surfaces, for example, is known to lead to nonplanar surface morphology [37]; as in the ATGS instability, the presence of stress destabilizes the surface with respect to wavelengths on the nanoscale. A possibly richer stress effect is observed in the so-called sputter-erosion instability, described below.

8.3.2.1 Mechanistic understanding of the sputter-erosion instability

When an initially flat semiconductor surface is exposed to an ion beam with energy on the order of 1 keV up to fluences of more than $10^{16}$/cm$^2$, which usually occurs over seconds to minutes, surface ripples and dots on the nanometer scale are often observed. This observation spans a broad selection of materials, over a wide temperature range, and for various incident ion beam angles. As in the ATGS instability, surface energetic effects tend to smooth a rough or wavy surface. But in the ion-bombardment instability, the driving force for roughening or formation of nanostructures is thought to be due to locally varying rates of material removal by sputtering. As shown in Figure 8.4, when an incident ion strikes a free surface at nonnormal incidence, it penetrates a finite distance before imparting its kinetic energy to the crystal.
Because of the assumed Gaussian profile of the deposited energy contours at a given time after impact, material downstream from the impact site is more likely to sputter away than material directly at the point of impact. In this case, the result is that valleys on the surface erode more quickly than peaks on the surface. So if an initially atomically flat surface develops a small perturbation in height, e.g. due to an impurity locally suppressing the sputter yield, the perturbation should grow in an unstable manner. Sigmund [15] shows that this behavior leads to sharpening of mounds or cones that have initial length scales on the order of the ion-penetration depth. Morphological changes with much longer length scales cannot be due to the sputter-erosion instability.

8.3.2.2 Continuum models for the sputter-erosion instability

Just as the ATGS instability is studied numerically using methods that directly integrate surface evolution equations, the Sigmund mechanism is studied numerically
[Figure 8.4 labels: incident ion (angle label 90 − θ); sputtered atoms; contours of "deposited energy".]
Figure 8.4 Schematic of the Sigmund mechanism for the sputter-erosion surface instability. When ions strike the surface at off-normal incidence, sputtering is more likely "downstream" from the impact point than at the impact point.
in order to follow the evolution of a sputter-eroded surface. Work by Bradley and Harper [38] formally reframed the Sigmund instability as being due to a sputter yield dependence on both the local slope and the local curvature of the surface. This allows an empirical formulation of the surface evolution in time given by

\[ \frac{\partial h}{\partial t} = -c_0(\theta) + c_1(\theta)\nabla h + c_2(\theta)\nabla^{2} h + c_4(\theta)\nabla^{4} h, \tag{8.15} \]
where the c(θ) terms can be considered as material constants used to fit the model to the experimentally observed morphologies. In this work Bradley and Harper point out that surface self-diffusion provides a smoothing mechanism through the biharmonic term of equation (8.15). Extensive work by Barabasi and co-workers [39–41] shows that by including a term in equation (8.15) that accounts for randomness, and by retaining higher-order nonlinear terms, it is possible to more closely match experimental results. Numerical integration of the resulting surface evolution equation leads, for certain chosen values of the constants c(θ), to regular nano-dot or nanowire formation. Whether these structures have quantum confining characteristics remains to be seen.

Stress plays a secondary role in the process as described by equation (8.15) and other continuum models. Early attempts to explain the significance of stress in the formation of nanostructures by sputter erosion focused on the possibility of near-surface buckling, although length scale arguments can be used to discard that theory. A more likely effect of stress is in the surface self-diffusion term. As in the ATGS instability described by the similar equation (8.13), strain energy can compete with surface energy to roughen the surface. This contribution has
been mostly neglected in analyses of sputter erosion, although it is known that ion bombardment leads to large near-surface stresses [42]. Other potential effects of stress in the sputter-erosion surface instability include changes in the sputter yield and the possibility of viscous relaxation of the amorphized near-surface layer. These effects are considered to be very difficult to quantify using continuum models, although work is presently being done to understand the sputter yield dependence on stress using atomistic modeling.

8.3.2.3 Initial connections between atomistic calculations and continuum methods

Some new results by the author and co-workers show the connection between the atomistics of sputtering and the continuum behavior that leads to a surface instability [43]. Atomistic modeling successfully predicts the sputter yield for normally incident 500 eV argon ions on silicon, as well as the stress development and structure evolution in the material. One significant finding is that the ion-induced rearrangement of material along the surface likely affects the morphology evolution as much as the sputtering of material off the surface. Molecular dynamics studies of slope and curvature dependence of sputter yield, as well as of the basic assumptions of energy deposition proposed by Sigmund, are underway. The eventual goal of atomistic studies of the sputter-erosion surface instability is to validate and inform the continuum empirical models proposed by Bradley and Harper and subsequently extended by others [39–41].
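As a small illustration of the continuum model in equation (8.15), the sketch below examines its linear part in one dimension. Substituting a mode proportional to exp(ikx + st) gives a real growth rate of −c2k² + c4k⁴, while the c1 term only translates the pattern and the c0 term erodes the surface uniformly. With the equation written as printed, ripple growth at long wavelengths together with smoothing at short wavelengths requires c2 < 0 and c4 < 0; that sign choice and the numerical values used here are assumptions for illustration, not fitted constants.

```python
import math


def bh_growth_rate(k, c2, c4):
    """Real part of the growth rate for a mode exp(i k x + s t) in equation (8.15)."""
    return -c2 * k**2 + c4 * k**4


def fastest_growing_wavelength(c2, c4):
    """Wavelength that maximizes the growth rate, for c2 < 0 and c4 < 0."""
    if not (c2 < 0 and c4 < 0):
        raise ValueError("this sketch assumes c2 < 0 and c4 < 0")
    k_star = math.sqrt(c2 / (2.0 * c4))
    return 2.0 * math.pi / k_star


# Hypothetical coefficients in arbitrary units.
c2, c4 = -1.0, -1.0
lam_star = fastest_growing_wavelength(c2, c4)
k_star = 2.0 * math.pi / lam_star
print(f"fastest-growing wavelength = {lam_star:.3f}, rate = {bh_growth_rate(k_star, c2, c4):.3f}")
```

The selected wavelength scales as the square root of c4/c2, so any stress dependence of the smoothing or roughening coefficients would shift the ripple spacing predicted by this linear picture.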
8.3.3 Modeling stress effects in compositional segregation in thin films

A third class of instability leading to nanostructure formation, where stress couples to the observed morphological change, occurs in growth of semiconductor heterostructures consisting of alternating layers of different binary, ternary, or quaternary alloys. This phase separation phenomenon allows for the growth of nanostructures in several III–V semiconductor systems, although, as in stress-induced surface self-assembly of quantum dots and wires, control over placement and spacing of the features is limited. Like the stress-induced surface self-assembly process, compositional segregation was first viewed as a symptom of defective processing before it gained attention as a possible self-assembly method for producing nanostructures. For example, in studying short-period superlattices of alternating layers of GaP and InP, Hsieh and co-workers observed the formation of quantum wires caused by lateral segregation of gallium into bands as narrow as 20 nm [16]. These structures are presently considered to be good candidates for optoelectronic quantum wire applications, and more work is in progress to understand the spontaneous compositional segregation phenomenon. As an outcome of the simple models used to explain the experimental results, it is seen that stress couples into the problem, but in a way that resists formation of the nanostructures.
8.3.3.1 A continuum model for stress coupling to compositional segregation

The thermodynamic tendency of a multicomponent system to phase separate can be described in a Cahn–Hilliard equation framework. A typical material system is the one studied by Hsieh et al., $\mathrm{In}_{\xi}\mathrm{Ga}_{1-\xi}\mathrm{P}$, where ξ describes the spatially varying composition in the ideal, or as-grown, structure. Following Freund and Suresh [44], the free energy in this system can be written as

$$ F = \int_R \left[ \phi(\xi) + \kappa^{CH}_{ij}\,\xi_{,i}\,\xi_{,j} + \tfrac{1}{2}\,C_{ijkl}\,\varepsilon_{ij}\,\varepsilon_{kl} \right] dR \tag{8.16} $$
where φ(ξ) is the local free energy as a function of composition ξ, the second term is a phenomenological term that corrects for the locality assumption in the first term by penalizing the free energy for the presence of strong composition gradients $\xi_{,i}$, and the third term is the elastic strain energy density. By application of the divergence theorem and assuming that the boundaries do no work on the system, this expression can be written in rate form to determine how the elastic energy affects the rate of change of free energy, and thus the stability. Freund and Suresh demonstrate that in this case, for a block of material with lattice mismatch strain $\varepsilon_m$ and with a periodic perturbation in composition, the system is stable only if

$$ \phi''(\xi) + 2E^{*}\varepsilon_m^{2} > 0 \tag{8.17} $$
where $E^{*}$ is the plane strain elastic modulus of the material. This result implies that, as one expects, a system with no mismatch strain is unstable if there is a miscibility gap for the composition ξ, which is indicated by $\phi''(\xi) < 0$. Without such a miscibility gap, the system is unconditionally stable, even in the presence of strain. In fact, strain stabilizes the system. The preceding results assume that the system is fully constrained against deformation, so that no stress relaxes. If the system is not fully constrained, the stress can relax and the stabilizing effect can be nearly lost. In that case, the tendency to phase separate depends almost entirely on the presence of a miscibility gap for the composition range of interest.

8.3.3.2 Prospects for modeling phase separation in real III–V semiconductor materials

Continuum models that account for the tendency of the solid solution to undergo phase separation, such as the simple model presented above, appear to be the most useful description of lateral composition segregation like that observed by Hsieh et al. [16]. More accurate modeling of the mechanics problem of stress relaxation through formation of lateral regions of phase-separated material is possible in either an atomistic or continuum framework. In fact, as the system goes from a fully constrained configuration like the strained superlattice to a partially stress-relaxed configuration like the compositionally modulated structure, the tendency toward
phase separation should decrease. This transition would be reflected in the kinetics of the morphological evolution. Just as the strain relaxation can be studied atomistically, so can the electronic structure. Mattila, Wang, and Zunger [45] have used empirical interatomic potentials to consider the strain in a compositionally segregated semiconductor system, and a semiempirical pseudopotential formulation within density functional theory to predict features of the electronic structure, such as band gaps and band alignment. However, a fully atomistic model of the evolution process, as in the case of stress-induced surface self-assembly, is impractical or impossible at this time.
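The stability condition (8.17) of Section 8.3.3.1 lends itself to a quick numerical check. The sketch below is a minimal illustration and is not part of the analysis of Freund and Suresh: it evaluates the left-hand side of (8.17) for a fully constrained system and, inside a miscibility gap, solves for the mismatch strain magnitude at which the elastic term exactly offsets the negative curvature of φ. The function names and all numerical values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math


def stability_measure(phi_dd, e_star, eps_m):
    """Left-hand side of (8.17): phi''(xi) + 2 E* eps_m^2.

    phi_dd : second derivative of the local free energy density (J/m^3)
    e_star : plane strain elastic modulus (Pa)
    eps_m  : lattice mismatch strain (dimensionless)
    """
    return phi_dd + 2.0 * e_star * eps_m**2


def is_stable(phi_dd, e_star, eps_m):
    """True if the fully constrained system satisfies condition (8.17)."""
    return stability_measure(phi_dd, e_star, eps_m) > 0.0


def critical_mismatch_strain(phi_dd, e_star):
    """Strain magnitude above which (8.17) holds; zero if there is no gap."""
    if phi_dd >= 0.0:
        return 0.0
    return math.sqrt(-phi_dd / (2.0 * e_star))


# Hypothetical values, not material data from this chapter.
PHI_DD = -1.0e9   # J/m^3, negative inside an assumed miscibility gap
E_STAR = 1.0e11   # Pa

for eps_m in (0.0, 0.01, 0.1):
    print(eps_m, is_stable(PHI_DD, E_STAR, eps_m))
print(critical_mismatch_strain(PHI_DD, E_STAR))
```

For these assumed numbers the critical mismatch strain is roughly 7%, so the unstrained and 1% strained cases remain unstable while the 10% case is stabilized. The sketch applies only to the fully constrained case. Once stress is allowed to relax, the elastic term in (8.17) no longer provides its full stabilizing contribution.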
8.4 Stress Effects on the Electronic/Optical Properties of Semiconductor Nanostructures

Stress effects are observed in the functional characteristics of semiconductor nanostructures in a variety of ways. To classify the physical mechanisms of stress coupling to electronic and optical properties, it is convenient to distinguish so-called parallel transport from perpendicular transport. Thin film configurations in which electrons behave as free particles in the plane of the film exhibit parallel transport characteristics, while quantum wires and quantum dots behave according to the rules of perpendicular transport when lateral confinement effects are considered. The following sections cover observations of the effects of stress on electronic and optical properties in nanostructures, with an emphasis on the modeling methods used to understand the physical phenomena.
8.4.1 Models for stress effects on parallel transport in thin films

Parallel transport refers to the essentially planar motion of electrons in a layer of material; the problem is often treated classically. Within the simple framework of the Drude model of conduction [46], electrons move ballistically between spatially and temporally separate scattering events associated with any one of a number of possible scattering sources. A single electron experiencing an electric field in the material accelerates until drag imposed by these scattering sources limits the velocity to a maximum value known as the drift velocity. This velocity v is linearly related to the applied field with a proportionality constant equal to the mobility μ. Thus, through equation (8.5), the velocity is inversely related to the effective mass m* of the electron. So the effective mass, and thus the mobility and drift velocity, can be altered by strain. It is worth noting that in this case it is more appropriate, strictly speaking, to refer to strain as the controlling variable rather than stress, since it is the change in the spacing of the atoms in the lattice that affects the coupling to electronic and optical properties. This contrasts with nanostructure formation described in the previous sections, where the change in free energy – not the change in spacing of atoms in the lattice – leads to the configurational forces that drive the phenomenon. Two examples of stress (or strain) effects in parallel transport are reviewed next in a mechanistic manner. These examples are included for completeness,
although the connection to semiconductor nanostructures is only in the sense that the planar films of material in which these effects are observed may have thicknesses on the order of nanometers.

8.4.1.1 Strained silicon

The technology based on a stress effect on electronic behavior having the most commercial impact is known popularly as strained silicon. This technology is currently in use by major microelectronics chip makers and is considered to be one reason that the semiconductor industry is able to extract better performance from transistors even as size scales approach fundamental limits. Observations by Fitzgerald and co-workers, among others, show that it is possible to fabricate thin films that achieve a significant increase in the in-plane electron or hole mobility of silicon when uniaxial or biaxial stress is applied to the material [47,48]. This increased mobility for electrons and holes is due to a reduction in effective mass. Applied strain reduces the effective mass by lifting the degeneracy in the conduction and valence energy band structure at the band edge. This effect has been known theoretically for many years and has more recently been confirmed by atomistic calculations based on density-functional theory [49]. Subsequent commercial activity to improve the technology has led to effective ways of growing strained silicon layers with acceptably low dislocation densities in the active region of the film, which would constitute the channel region of a silicon-on-insulator (SOI) device.

8.4.1.2 Dislocation strain effect in thin films

Applying strain to silicon decreases carrier effective masses and can be accomplished without inducing large dislocation densities. This is critical, because dislocation scattering is a primary source of scattering for electrons and holes in a semiconductor film.
Other scattering mechanisms include electron-phonon scattering, which is the dominant temperature-dependent scattering mechanism; ionized impurity scattering, which is the result of electrostatic interactions between charge carriers and dopant point charges in the material; and interface and surface scattering, which arise due to disorder and roughness at boundaries in the material. Dislocation scattering, however, is partially the result of elastic stress in the material. Electron scattering by dislocations in semiconductors was first studied in the 1950s, shortly after dislocation scattering in metals was first considered. The conventional theory of dislocations as scattering centers assumes that the dislocations are well spaced and can act as individual scatterers. The total scattering potential is due to the sum of the deformation potential associated with the strain distribution, and the electrostatic potential due to the trapping of charge along the dislocation cores. The effect of elastic strain, due to the self-stress field of the dislocation, is characterized by Dexter and Seitz [50], who show that the scattering potential is given by

$$ V(r) = V_0\,\frac{b}{2\pi}\,\frac{(1-2\nu)}{(1-\nu)}\,\frac{\sin\theta}{r}, \tag{8.18} $$
where $V_0$ is a material parameter that scales the scattering potential, b is the Burgers distance, ν is Poisson’s ratio for the material, and r and θ are position coordinates from the dislocation. Since this term decays as 1/r, where r is the distance from the dislocation core, in most cases the effect of the stress field of the dislocation is significantly less than the effect of the charging of the dislocation core. The potential associated with a charged and unscreened core is described independently by Read [51] and Bonch-Bruevich and Glasko [52]. This contribution to the scattering potential decays approximately as $\ln(r_0/r)$, and therefore has a longer-range effect. From such a scattering potential, it is possible to use classical scattering models to determine the scattering time and thus the electron or hole mobility in a semiconductor with a particular dislocation density. You and Johnson carry out this analysis, in closed form, for a semiconductor layer with a range of dislocation densities [53].
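To make the two ingredients of this subsection concrete, the sketch below evaluates the deformation-potential term of equation (8.18) at a position (r, θ) relative to the dislocation, and converts a scattering time into a mobility using the standard Drude relation μ = eτ/m*. Equation (8.5) is not reproduced in this passage, so the Drude form is stated here as an assumption about what it expresses. The scattering time would in practice come from a scattering analysis of the kind carried out in [53]. Here it is simply a hypothetical input, as are V0, b, ν, and the effective mass ratios in the usage lines.

```python
import math

E_CHARGE = 1.602176634e-19      # elementary charge, C
M_ELECTRON = 9.1093837015e-31   # free-electron rest mass, kg


def dislocation_strain_potential(r, theta, v0, b, nu):
    """Deformation-potential scattering term of equation (8.18).

    r, b  : distance from the core and Burgers distance (same length unit)
    theta : polar angle measured about the dislocation line (radians)
    v0    : material parameter scaling the potential (result has its units)
    nu    : Poisson's ratio
    """
    if r <= 0.0:
        raise ValueError("r must be positive; (8.18) is singular at the core")
    return v0 * b * (1.0 - 2.0 * nu) * math.sin(theta) / (
        2.0 * math.pi * (1.0 - nu) * r
    )


def drude_mobility(tau, mass_ratio):
    """Drude mobility mu = e * tau / m*, with m* = mass_ratio * m_e (SI units)."""
    return E_CHARGE * tau / (mass_ratio * M_ELECTRON)


def drift_velocity(mobility, field):
    """Drift velocity v = mu * E for an applied field E (V/m)."""
    return mobility * field


# Hypothetical inputs: the 1/r decay of the strain term, and the effect of
# halving the effective mass at a fixed scattering time.
v_near = dislocation_strain_potential(1.0, math.pi / 2, 1.0, 0.4, 0.25)
v_far = dislocation_strain_potential(10.0, math.pi / 2, 1.0, 0.4, 0.25)
mu_heavy = drude_mobility(1.0e-13, 0.4)
mu_light = drude_mobility(1.0e-13, 0.2)
print(v_near / v_far, mu_light / mu_heavy)
```

The first ratio reflects the 1/r decay noted above. The second shows that, at fixed scattering time, halving the effective mass doubles the mobility, which is the sense in which strain-induced changes in m* carry over to mobility and drift velocity.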
8.4.2 Modeling the effects of stress on quantum confinement in wires and dots

In quantum wires and quantum dots the physical configuration is appropriate for perpendicular transport, in the sense that charge transport through the structures would occur perpendicular to the physical interfaces responsible for quantum confinement. A basic estimate of quantum confinement energies for a quantum dot is given in Section 8.2. Whether a quantum dot or quantum wire is used in a transport device via perpendicular electron transport, or for a spectral measurement such as in photoluminescence, stress can have a relatively strong effect. In the remaining sections the effect of stress on the optical spectral characteristics of quantum dots and wires will be reviewed, although the literature in the area of stress effects in quantum dots, in particular, is too vast to permit a comprehensive review. One thrust of the review presented here is on unique modeling methods for solving these problems.

8.4.2.1 Atomistic modeling of quantum confinement in semiconductor quantum dots

Some basic methods of atomistic modeling of semiconductor nanostructures are discussed in Section 8.2. For the particular problem of studying electron confinement in semiconductor quantum dots, the use of atomistic methods is possible but significantly limited by current computational resources. In a typical realistic III–V quantum dot structure, there are as few as $10^5$ atoms; some quantum dots, particularly group IV self-assembled dots in the Si/SiGe system, are larger. In the case of an embedded quantum dot, in addition to the atoms in the quantum dot itself, it is necessary both mechanically and quantum mechanically to consider atoms in the surrounding material. This can raise the computational expense considerably.
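A back-of-the-envelope count shows where atom numbers of this order come from. The sketch below assumes a zinc-blende crystal with eight atoms per conventional cubic cell and estimates the number of atoms in a square-based pyramidal dot and in a surrounding box of matrix material. The lattice constant is the commonly quoted room-temperature value for InAs (about 0.606 nm). The dot and box dimensions are hypothetical and are not taken from any structure discussed in this chapter. Strain and alloying are ignored, and the matrix is counted at the same atomic density for simplicity.

```python
ATOMS_PER_CUBIC_CELL = 8   # zinc-blende conventional cell
A_INAS_NM = 0.60583        # InAs lattice constant in nm (commonly quoted value)


def atoms_per_nm3(lattice_constant_nm):
    """Atomic number density of a zinc-blende crystal, in atoms per nm^3."""
    return ATOMS_PER_CUBIC_CELL / lattice_constant_nm**3


def pyramid_volume_nm3(base_nm, height_nm):
    """Volume of a square-based pyramid."""
    return base_nm**2 * height_nm / 3.0


def atom_count(volume_nm3, lattice_constant_nm):
    """Approximate number of atoms in a given volume, rounded to an integer."""
    return round(volume_nm3 * atoms_per_nm3(lattice_constant_nm))


# Hypothetical geometry: a 20 nm x 20 nm x 10 nm pyramid inside a
# 60 nm x 60 nm x 40 nm box of surrounding material.
dot_atoms = atom_count(pyramid_volume_nm3(20.0, 10.0), A_INAS_NM)
domain_atoms = atom_count(60.0 * 60.0 * 40.0, A_INAS_NM)
print(dot_atoms, domain_atoms)
```

With these assumed dimensions the dot itself contains several tens of thousands of atoms, while the box of surrounding material contains several million, which is why the need to include the embedding matrix dominates the computational cost.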
Several modeling methods, namely tight-binding and pseudopotential-based density functional theory, can in principle be used for both electronic structure calculations and total energy calculations. However, due to the computational cost and the availability of accurate and inexpensive empirical potentials
for total energy calculations, atom positions are usually determined by empirical potentials and electronic structure is usually determined separately with either tight-binding or density functional theory. One notable exception to the practice of using separate methods for determining atom positions and electronic structure is the recent work of Oyafuso, Klimeck, and co-workers [35,54]. Through use of efficient algorithmic scaling and large parallel computers, they have studied multimillion atom III–V quantum dot domains with a highly accurate $sp^3d^5s^*$ tight-binding parameterization. Both mismatch strain and electronic structure parameters including confinement energies can be accurately computed using this approach. The semiempirical pseudopotential density functional theory method of Zunger and co-workers has also proven to be very useful for studying fundamental properties of semiconductor quantum dots [55], but mostly for much smaller systems. This method has yielded more detailed information about excitonic effects in quantum dots than other electronic structure methods. Until now this method has often been used in combination with the Valence Force Field method for determining the atom positions in the quantum dot; models using these methods have considered essentially geometrically idealized quantum dots such as pyramids, cones, or lens-shaped structures. Similarly, a mechanically and electronically decoupled approach, whereby the atom positions are obtained separately and input into an electronic structure calculation, is in development by the author and co-workers [56]. In this approach, atom positions for embedded III–V quantum dots are obtained either as the result of empirical potential calculations or by direct experimental measurement, as shown in Figure 8.5. Then a real space, order-N scaling tight-binding method
[Figure 8.5, plot axes: DOS (eV⁻¹) versus E (eV).]
Figure 8.5 Combined experimental and computational study of electronic structure of individual embedded quantum dot at the atomistic scale. Atom positions are determined using high resolution cross-sectional scanning tunneling microscopy (upper left) and then converted to an atomistic computational input file (lower left). Using a novel tight-binding method, the local density of states is determined at various positions (right) and compared to experimental data [56]. (See also colour plate 9.)
[57] is used locally to probe the electronic structure in regions of interest, such as near interfaces, point defects, or clustering. This method reveals the presence of quantum confined states in embedded dots, as well as the presence of surface states on nearby free surfaces. The tight-binding method used in this case is based on the method of moments, which allows for a construction of the local density of states from statistical information about the local atomistic structure of the material.

8.4.2.2 Continuum modeling of electron confinement in quantum dots and wires

Continuum methods are the most convenient approach for modeling the mechanical fields in quantum dots and quantum wires, and are conventionally the preferred approach in the mechanical engineering and mechanics community. Finite element methods are the simplest to implement for elasticity analyses of semiconductor nanostructures, and boundary element methods and Green’s function methods can be used as well [58,59]. As noted in Section 8.2, the level of accuracy is sufficient for most purposes. For modeling electron confinement in quantum dots and quantum wires, the continuum formulation can conveniently also be solved by a variety of numerical methods, but the accuracy is relatively limited for many purposes. In the continuum approach, electronic structure is assumed to be approximated by the properties of a single extra electron, or test charge, in the effective medium provided by the electrostatic potential field of the underlying crystal lattice. This physical picture is described by equation (8.7), the effective mass steady state Schrödinger equation. The model may be considered as an envelope function approach, since the wave function in equation (8.7) refers to the electron envelope function and not the localized atomic wave functions.
If multiple energy bands are considered, and if the properties of the electron, as described by this simple model, are coupled by the presence of other energy bands nearby in energy, then the full k·p Hamiltonian model is used. Within this model, already described in Section 8.2, the effect of stress is wholly included as part of the effective potential term, V. In fact, all external perturbations on the energetics of the single electron are included in the potential term, including an applied electric field, energy band misalignment in adjacent layers of a heterostructure, etc. The stress effect is obtained through a simple linear constitutive relation given by

$$ V^{\alpha\beta} = D^{\alpha\beta}_{ij}\,\varepsilon_{ij}, \tag{8.19} $$

where, as in equation (8.7), α and β range over energy subbands; $D^{\alpha\beta}_{ij}$ is the deformation potential tensor for the material, and $\varepsilon_{ij}$ is the elastic strain tensor. In this manner, the coupling of stress to electronic properties is effected computationally by first determining the elastic strain and then converting it to an effective electrostatic potential. This effective potential does not correspond to an applied electric field or a piezoelectric effect, but rather a shifting of the parabolic energy bands to higher or lower energies relative to the values in an unstrained
crystal. This approach is based on an assumption of locality, in the sense that the local effective medium in the crystal is constructed as though that point of material is embedded in a homogeneous, infinite crystal with the same local effective medium characteristics. The effective mass or k·p Hamiltonian method is used effectively by a number of investigators, including Grundmann et al. [60], who in an important contribution first use the approach to study electronic properties of strained quantum dots. Jiang and Singh also study InAs/GaAs quantum dots, but with a more accurate eight-band k·p Hamiltonian method [61]. Due to the computationally efficient nature of the model, the method is also useful for studying much larger samples of material than can be considered using atomistic electronic structure methods. The author and co-workers have used the k·p method to study large Si/SiGe quantum dots as well as entire arrays of III–V quantum dots [62,63]. In this case, a continuum method is particularly useful for studying externally applied stress, since the elastic fields are nonuniform over domains much larger than could be reasonably considered using any other method. For example, the authors have shown that an array of quantum dots will undergo a strong blue shift in emitted light as a result of strain imposed by nanoindentation. This work is illustrated in Figure 8.6 [64]. However, two shortcomings make this continuum method potentially inaccurate for modeling real quantum dot and quantum wire problems, namely (i) the breakdown of the envelope function approximation in structures with such small features, and (ii) the assumption of locality that underlies the effective medium approach.
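The two-step procedure implied by equation (8.19), in which the strain is obtained first and then contracted with a deformation potential tensor, can be sketched for a single band pair as follows. This is a minimal illustration under stated assumptions: the helper names, the isotropic form chosen for the deformation potential tensor, and the numerical values of the deformation potential constant and the strains are hypothetical placeholders, not material data from this chapter. The sign pattern of the example strain follows the lattice mismatch inset of Figure 8.6 (in-plane compression, out-of-plane extension).

```python
def strain_potential(deformation_tensor, strain):
    """Contract D_ij with eps_ij as in equation (8.19), for one band pair.

    Both arguments are 3x3 nested lists; the result has the units of D.
    """
    return sum(
        deformation_tensor[i][j] * strain[i][j]
        for i in range(3)
        for j in range(3)
    )


def isotropic_deformation_tensor(a):
    """Hypothetical isotropic form D_ij = a * delta_ij (hydrostatic coupling only)."""
    return [[a if i == j else 0.0 for j in range(3)] for i in range(3)]


def biaxial_strain(eps_in_plane, eps_out_of_plane):
    """Diagonal strain tensor with eps_xx = eps_yy and a separate eps_zz."""
    return [
        [eps_in_plane, 0.0, 0.0],
        [0.0, eps_in_plane, 0.0],
        [0.0, 0.0, eps_out_of_plane],
    ]


A_C = -5.0  # eV, hypothetical deformation potential constant

# Mismatch-like strain state: eps_xx = eps_yy < 0 and eps_zz > 0.
eps_mismatch = biaxial_strain(-0.02, 0.01)
shift_ev = strain_potential(isotropic_deformation_tensor(A_C), eps_mismatch)
print(shift_ev)
```

For these assumed numbers the band edge shifts upward by about 0.15 eV. In a finite element setting the same contraction would be applied at each integration point, using the computed strain field there, to build the potential that enters the effective mass or k·p equations.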
[Figure 8.6, inset panel labels: band gap difference; lattice mismatch (εxx, εyy < 0; εzz > 0); indentation (εzz < 0; εxx, εyy > 0). Axis units: Angstrom.]
Figure 8.6 Embedded quantum dot array finite element mesh. The color contour shows the electrostatic potential for a single electron in the system when the surface is nanoindented to a small depth. The three inset images show how bandgap difference, lattice mismatch strain, and nanoindentation strain contribute to the total potential field [64]. (See also colour plate 10.)
8.4.3 Multiscale coupled mechanical/electronic modeling in semiconductor nanostructures

Several multiscale modeling approaches are in use for modeling electronic and mechanical coupling in semiconductors. These methods are intended to overcome the continuum “limit”, to provide a nonlocal theory, or simply to build a continuum level model from atomistic input, for either elasticity or electronic structure. One of the most powerful multiscale methods, and the one on which the few models described here are patterned, is the so-called quasicontinuum method developed by Tadmor et al. [26]. In this method atomistic calculations are used to inform a higher level continuum model, usually the finite element method. The continuum discretization has some positions resolved to the atomic scale and some positions very coarsely discretized, but the overlap between atomistic and continuum is seamless. The details of the quasicontinuum method are not discussed here, but significant work is ongoing even now to generalize the method, and many additional references are available in the literature [65]. The first quasicontinuum-based method built on an atomistic model with electronic structure degrees of freedom is reported by the author and co-workers
[Figure 8.7, plot axes: energy gap (eV) versus κ (1/nm).]
Figure 8.7 Mechanical and electronic coupling in a single-wall carbon nanotube. Applying a small twist to an armchair nanotube results in a large, nonlinear shift in the bandgap. This, in turn, affects the transport properties of the structure [66].
[27]. In this work, a simple, purely local version of the quasicontinuum method is formulated so that tight-binding electronic structure calculations are embedded within continuum finite elements. This allows for a fully coupled mechanical and electronic study in which the local electronic structure determines the mechanical properties and vice versa. In this case the model is applied to electronic structure in simple deformed silicon geometries. It is shown that the local density of states near the bandgap in silicon can change fairly significantly in the presence of stresses in the range of what is often seen in semiconductor nanostructures such as quantum dots and quantum wires. Huang and co-workers, with the author, use precisely the same method in studying carbon nanotubes under various loadings. A local quasicontinuum method is used, in which mechanical properties are derived from tight-binding total energy calculations; then, simultaneously, the local electronic properties are derived [66]. In a subsequent study of nanotube transport properties, small tensile, compressive, and torsion loads lead to significant nonlinear changes in the quantized electron transport properties. The origins of this nonlinear coupling between mechanical and electronic properties are shown for the case of torsion in Figure 8.7 [17]. Recent work by Carter, Ortiz, and co-workers is similar: a local quasicontinuum model is built on density functional theoretical atomistic calculations. This allows for a multiscale, multiphysics model of a material that calls for the increased accuracy provided by first principles atomistics. These analyses are applied to study coupled mechanical/electronic properties in bulk materials with some nanoscale microstructure [67].
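The idea common to these local quasicontinuum formulations, that the continuum strain energy density at a material point is obtained from an atomistic energy calculation on a uniformly deformed lattice, can be illustrated in one dimension. The sketch below is not the tight-binding or density functional implementation described above: it substitutes a Lennard-Jones pair potential with nearest-neighbor interactions for the electronic structure calculation, purely to keep the example short. The function names, the potential parameters, and the finite-difference steps are assumptions made for illustration.

```python
def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones pair energy, standing in for an atomistic energy call."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


R0 = 2.0 ** (1.0 / 6.0)  # equilibrium spacing of the pair potential for sigma = 1


def energy_density(stretch, r0=R0):
    """Strain energy per unit reference length of a uniformly stretched chain.

    Every bond has deformed length stretch * r0. With nearest-neighbor
    interactions each atom owns one bond, so the energy per atom is one bond
    energy and the reference length per atom is r0.
    """
    return lj_energy(stretch * r0) / r0


def stress(stretch, h=1.0e-6):
    """First derivative of the energy density with respect to stretch."""
    return (energy_density(stretch + h) - energy_density(stretch - h)) / (2.0 * h)


def tangent_modulus(stretch, h=1.0e-4):
    """Second derivative of the energy density with respect to stretch."""
    return (
        energy_density(stretch + h)
        - 2.0 * energy_density(stretch)
        + energy_density(stretch - h)
    ) / h**2


for lam in (0.99, 1.00, 1.01):
    print(lam, stress(lam), tangent_modulus(lam))
```

In a finite element setting, a routine of this kind would be called at each integration point with the local deformation, returning the stress and tangent stiffness that a constitutive law would otherwise supply. Replacing the pair potential with a tight-binding or density functional energy evaluation is what lets the same call return local electronic structure alongside the mechanical response.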
8.5 Conclusions

Semiconductor nanostructures, especially quantum dots and quantum wires, display a broad range of rich mechanics behavior. Stress plays an important role not only in the properties of nanostructures for device purposes, but also in their formation. In the spontaneous formation of various classes of semiconductor nanostructures, stress in some cases provides a driving force for self-assembly, while in other cases it provides a stabilizing effect that resists self-assembly. Electronic and optical properties of quantum dots and quantum wires similarly display a broad range of stress effects, including shifts in optical spectral properties and in electron transport properties. The effects of stress in semiconductor nanostructures can only be fully studied by recourse to computational modeling; experimental studies on these nanoscale objects can only reveal rough trends in many cases. From this perspective, semiconductor nanostructures provide another rich class of issues to consider. First, these objects require multiphysics models that are designed to incorporate both mechanical behavior and electronic properties, since the structures have applications that are primarily optical and electronic. Second, because of the length scales of interest in semiconductor nanostructures, continuum approximations are not
always accurate, and purely atomistic models are not always possible. These factors undoubtedly contribute to the extraordinary scientific interest in semiconductor nanostructures in the last several years, only a small cross-section of which has been reviewed in this chapter.

REFERENCES

1. C. Weisbuch, and B. Vinter, Quantum Semiconductor Structures, Academic Press, San Diego, 1991.
2. J. H. Davies, The Physics of Low-Dimensional Semiconductors, Cambridge University Press, 1998.
3. D. Bimberg, M. Grundmann, and N. N. Ledentsov, Quantum Dot Heterostructures, John Wiley & Sons, 1999.
4. D. L. Huffaker, G. Park, Z. Zou, O. B. Shchekin, and D. G. Deppe, 1.3 μm room-temperature GaAs-based quantum-dot laser, Appl. Phys. Lett., 73 (1998), 2564–2566.
5. Yoshie, Tomoyuki, Axel Scherer, Hao Chen, Diana Huffaker, and Dennis Deppe, Optical characterization of two-dimensional photonic crystal cavities with indium arsenide quantum dot emitters, Appl. Phys. Lett., 79 (2001), 114–116.
6. S. J. Xu, S. J. Chua, T. Mei, X. C. Wang, X. H. Zhang, G. Karunasiri, W. J. Fan, C. H. Wang, J. Jiang, S. Wang, and X. G. Xie, Characteristics of InGaAs quantum dot infrared photodetectors, Appl. Phys. Lett., 73 (1998), 3153–3155.
7. C. J. Beenakker, Theory of Coulomb-blockade oscillations in the conductance of a quantum dot, Phys. Rev. B., 44 (1991), 1646–1656.
8. Pazy, Ehoud, Irene D’Amico, Paolo Zanardi, and Fausto Rossi, Storage qubits and their potential implementation through a semiconductor double quantum dot, Phys. Rev. B., 64 (2001), 195320.
9. Jaiswal, K. Jyoti, J. Hedi Mattoussi, Matthew Mauro, and Sanford M. Simon, Long-term multiple color imaging of live cells using quantum dot bioconjugates, Nat. Biotechnol., 21 (2002), 47–51.
10. Jessica O. Winter, Timothy Y. Liu, Brian A. Korgel, Christine E. Schmidt, Recognition molecular directed interfacing between semiconductor quantum dots and nerve cells, Adv. Mater., 13 (2001), 1673–1677.
11. D. J. Eaglesham, and M. Cerullo, Dislocation-free Stranski–Krastanow growth of Ge on Si(100), Phys. Rev. Lett., 64 (1990), 1943–1946.
12. D. Leonard, M. Krishnamurthy, C. M. Reaves, S. P. Denbaars, and P. M. Petroff, Direct formation of quantum-sized dots from uniform coherent islands of InGaAs on GaAs surfaces, Appl. Phys. Lett., 63 (1993), 3203–3205.
13. R. J. Asaro, and W. A. Tiller, Interface morphology development during stress corrosion cracking, Metall. Trans., 3 (1972), 1789–1796.
14. L. B. Freund, Evolution of waviness on the surface of a strained elastic solid due to stress-driven diffusion, Int. J. Solids Struct., 6–7 (1995), 911–923.
15. Sigmund, Peter, A mechanism of surface micro-roughening by ion bombardment, J. Mater. Sci., 8 (1973), 1545–1553.
16. K. C. Hsieh, J. N. Baillargeon, and K. Y. Cheng, Compositional modulation and long-range ordering in GaP/InP short-period superlattices grown by gas source molecular beam epitaxy, Appl. Phys. Lett., 57 (1990), 2244–2246.
17. H. T. Johnson, B. Liu, and Y. Y. Huang, Electron transport in deformed carbon nanotubes, ASME J. Eng. Mater. Technol., 126 (2004), 222–229.
18. A. Zunger, Electronic structure theory of semiconductor quantum dots, MRS Bull., 23 (1998), 35–42.
19. L. B. Hansen, K. Stokbro, B. I. Lundqvist, K. W. Jacobsen, and D. M. Deaven, Nature of dislocations in Silicon, Phys. Rev. Lett., 75 (1995), 4444–4447.
20. ABAQUS, general purpose finite element software, Hibbitt, Karlson, and Sorenson, Version 6.5.
21. FEMLAB, general purpose finite element software, COMSOL, Inc., Version 3.0.
22. F. H. Stillinger, and T. A. Weber, Computer simulation of local order in condensed phases of silicon, Phys. Rev. B., 31 (1985), 5262–5271.
23. C. Pryor, J. Kim, L. W. Wang, A. J. Williamson, and A. Zunger, Comparison of two methods for describing the strain profiles in quantum dots, J. Appl. Phys., 83 (1998), 2548.
24. J. R. Chelikowsky, M. L. Cohen, Self-consistent pseudopotential calculation for the relaxed (110) surface of GaAs, Phys. Rev. B., 20 (1979), 4150.
25. J. Q. Broughton, F. F. Abraham, N. Bernstein, and E. Kaxiras, Concurrent coupling of length scales: methodology and application, Phys. Rev. B., 60 (1999), 2391–2403.
26. E. B. Tadmor, M. Ortiz, and R. Phillips, Quasicontinuum analysis of defects in solids, Philos. Mag. A., 73 (1996), 1529–1563.
27. H. T. Johnson, R. Phillips, and L. B. Freund, Electronic structure boundary value problems without all of the atoms, Materials Research Society Symposium Proceedings 538 (1999), 479–484.
28. M. Grinfeld, Instability of the separation boundary between a nonhydrostatically stressed elastic body and a melt, Sov. Phys. Doklady, 31 (1986), 831–834.
29. Srolovitz, D, On the stability of surfaces of stressed solids, Acta. Metall. Mater., 37 (1989), 621–625.
30. W. W. Mullins, Theory of thermal grooving, J. Appl. Phys., 28 (1957), 333–339.
31. Y. W. Zhang, and A. F. Bower, Numerical simulations of island formation in a coherent strained epitaxial thin film system, J. Mech. Phys. Solids., 47 (1999), 2273–2297.
32. V. B. Shenoy, and L. B. Freund, A continuum description of the energetics and evolution of stepped surfaces in strained nanostructures, J. Mech. Phys. Solids., 50 (2002), 1817–1841.
33. A. Ramasubramaniam, and V. B. Shenoy, Three dimensional simulations of self-assembly of hut shaped Si-Ge quantum dots, J. Appl. Phys., 95 (2004), 7813–7824.
34. P. Sutter, and M. G. Lagally, Nucleationless three-dimensional island formation in low-misfit heteroepitaxy, Phys. Rev. Lett., 84 (2000), 4637–4640.
35. Oyafuso, Fabiano, Gerhard Klimeck, Paul von Allmen, Tim Boykin, and R. Chris Bowen, Strain effects in large-scale atomistic quantum dot simulations, Phys. Status Solidi b., 239 (2003), 71–79.
36. A. F. Voter, F. Montalenti, and T. C. Germann, Extending the time scale in atomistic simulation of materials, Annu. Rev. Mater. Res., 32 (2002), 321–346.
37. K.-S. Kim, J. A. Hurtado, and H. Tan, Evolution of a Surface-Roughness Spectrum Caused by Stress in Nanometer-Scale Chemical Etching, Phys. Rev. Lett., 83 (1999), 3872–3875.
38. Bradley, R. Mark, and James M. E. Harper, Theory of ripple topography induced by ion bombardment, J. Vac. Sci. Technol. A., 6 (1988), 2390–2395.
39. S. Park, B. Kahng, H. Jeong, and A.-L. Barabási, Dynamics of Ripple Formation in Sputter Erosion: Nonlinear Phenomena, Phys. Rev. Lett., 83 (1999), 3486–3489.
40. B. Kahng, H. Jeong, and A.-L. Barabási, Quantum dot and hole formation in sputter erosion, Appl. Phys. Lett., 78 (2001), 805–807.
41. Makeev, A. Maxim, Rodolfo Cuerno, and Albert-László Barabási, Morphology of ion-sputtered surfaces, Nucl. Instr. Methods Phys. Research Sect. B., 197 (2002), 185–227.
42. C. A. Davis, Simple model for the formation of compressive stress in thin films by ion bombardment, Thin Solid Films, 226 (1993), 30–34.
43. N. Kalyanasundaram, J. B. Freund, and H. T. Johnson, Atomistic origins of ion bombardment nanoscale surface instability, submitted for publication.
44. L. B. Freund, and S. Suresh, Thin Film Materials: Stress, Defect Formation, and Surface Evolution, Cambridge University Press, 2003.
45. T. Mattila, L.-W. Wang, and Alex Zunger, Electronic consequences of lateral composition modulation in semiconductor alloys, Phys. Rev. B., 59 (1999), 15270–15284.
46. N. W. Ashcroft, and N. D. Mermin, Solid State Physics, Saunders College Publishing, Fort Worth, TX, 1976.
Effects of Stress on Formation and Properties of Semiconductor Nanostructures
313
47. Y. J. Mii,Y. H. Xie, E. A. Fitzgerald, Don Monroe, F. A. Thiel, B. E. Weir, and L. C. Feldman, Extremely high electron mobility in Si/Gex Si1−x structures grown by molecular beam epitaxy, Appl. Phys. Lett., 59 (1991), 1611–1613. 48. C. W. Leitz, M. T. Currie, M. L. Lee, Z.-Y. Cheng, D. A. Antoniadis, E. A. Fitzgerald, Hole mobility enhancements and alloy scattering-limited mobility in tensile strained Si/SiGe surface channel metal–oxide–semiconductor field-effect transistors, J. Appl. Phys., 92 (2002), 3745– 3751. 49. Wang, Xin, D. L. Kencke, K. C. Liu, A. F. Tasch, Jr., L. F. Register, and S. K. Banerjee, Electron transport properties in novel orthorhombically-strained silicon material explored by the Monte Carlo method, International Conference on Simulation of Semiconductor Processes and Devices, (2000), 70–73. 50. D. L. Dexter, and F. Seitz, Effects of Dislocations on Mobilities in Semiconductors, Phys. Rev., 86 (1952), 964–965. 51. W. T. Read, Jr.,Theory of Dislocations in Germanium, Philos. Mag., 45 (1954), 775. 52. V. L. Bonch-Bruevich, and V. B. Glasko, The theory of electron states connected with dislocations, Fiz. Tverd. Tela., 3 (1960), 36. 53. J.-H.You, and H.T. Johnson, Electron scattering due to threading edge dislocations in epitaxial wurzite GaN, J. Appl. Phys., 99 (2006), 033706. 54. Oyafuso, Fabiano, Gerhard Klimeck, R. Chris Bowen, Tim Boykin, and Paul von Allmen, Disorder induced broadening in multimillion atom alloyed quantum dot systems, Phys. Stat. Sol. c., 0004 (2003), 1149–1152. 55. Zunger,Alex, Pseudopotential theory of semiconductor quantum dots, Phys. Stat. Sol. b., 224 (2001), 727–734. 56. Jun-Qiang Lu, H. T. Johnson, V. D. Dasika, and R. S. Goldman, Moments-based tightbinding analysis of local electronic structure in InAs/GaAs quantum dots for comparison to experimental measurements,Appl. Phys. Lett., 88 (2006), 053109. 57. Schuyler, D. Adam, G. S. Chirikjian, Jun-Qiang Lu, and H.T. 
Johnson, Random-walk statistics in moment-based O(N) tight binding and applications in carbon nanotubes, Phys. Rev. E., 71 (2005), 046701. 58. B. Yang, and E. Pan, Elastic analysis of an inhomogeneous quantum dot in multilayered semiconductors using a boundary element method, J. Appl. Phys., 92 (2002), 3084–3088 59. B. Yang, and E. Pan, Elastic fields of quantum dots in multilayered semiconductors: a novel Green’s function approach, J. Appl. Mech., 70 (2003), 161–168. 60. M. Grundmann, O. Stier, and D. Bimberg, InAs/GaAs pyramidal quantum dots: Strain distribution, optical phonons, and electronic structure, Phys. Rev. b., 52 (1995), 11969–11981. 61. Jiang, Hongtao, and Jasprit Singh, Strain distribution and electronic spectra of InAs/GaAs self-assembled dots:An eight-band study, Phys. Rev. b., 56 (1997), 4696–4701. 62. H. T. Johnson, L. B. Freund, C. D. Akyuz, and A. Zaslavsky, Finite element analysis of strain effects on electronic and transport properties in quantum dots and wires, J. Appl. Phys., 84 (1998), 3714–3725. 63. H. T. Johnson, V. Nguyen, and A. F. Bower, Simulated self-assembly and optoelectronic properties of InAs/GaAs quantum dot arrays, J. Appl. Phys., 92 (2002), 4653–4663. 64. H. T. Johnson, and R. Bose, Nanoindentation effect on optical properties of self-assembled quantum dots, J. Mech. Phys. Solids., 51 (2003), 2085–2104. 65. Quasicontinuum developers’ website: http://www.qcmethod.com. 66. B. Liu, H. Jiang, H. T. Johnson, and Y. Huang, The influence of mechanical deformation on electrical properties of single-wall carbon nanotubes, J. Mech. Phys. Solids., 52 (2004), 1–26. 67. M. Fago, Robin L. Hayes, Emily A. Carter, and Michael Ortiz, Density-functional-theorybased local quasicontinuum method: Prediction of dislocation nucleation, Phys. Rev. b., 70 (2004), 100102(R).
CHAPTER NINE
Continua with Spin Structure
Paolo Maria Mariano∗
Contents
9.1 Introduction
9.2 Spatial Representation
9.2.1 Heisenberg spins and balance equations
9.2.2 Connection with non-viscous compressible spin fluids
9.3 Referential Description: Invariance with Respect to Relabeling
9.4 Covariant Evolution of Interstitial Point Defects and Disclinations
9.4.1 Interstitial point defects
9.4.2 Disclination lines
Abstract
A continuum Lagrangian–Hamiltonian description of alloys with spin glass structure in magnetic saturation conditions is presented here, paying attention to the interaction between gross deformation and spin substructures. A version of Noether's theorem accounting for the gyroscopic inertia of spins is proven, so that the covariance of the balance of spin interactions and, as a special case, that of the Landau–Lifshitz–Gilbert equation follow. The formalism is also applied to the description of topological transitions in spin fluids. Finally, evolution equations for point defects and disclination lines are derived in a covariant way.
Key Words: Spin glasses, Multi-field theories, Complex bodies
∗ D.I.C., Università di Firenze, via Santa Marta 3, I-50139 Firenze (Italy); e-mail: paolo.mariano@unifi.it.

9.1 Introduction
“Quenched” (i.e. fixed in space) disorder is present in some metallic alloys admitting a phase in which crystalline cells display magnetic spins. Impurities (not altered by thermal fluctuations) break long-range order, and the material remains in a glass phase displaying random features. Such alloys are commonly called spin glasses. They are characterized not only by disorder but also by frustration between energetic valleys (see e.g. [1]), a frustration due to competing interactions between spins, so that highly degenerate ground states occur.

In an ideal representation (in a certain sense rather far from tangible aspects), one may represent a spin glass by means of the Ising model: an ideal (square) lattice endowed at each cross-link site with dipoles oriented just “up” or “down” along prescribed directions. Simply, one may imagine a spin system of this type that may undergo a finite number of possible configurations $\bar\varsigma = \{\varsigma_1, \dots, \varsigma_N\}$ selected in $\{-1, 1\}^N$, with $N$ a sufficiently large integer. Such a system is characterized by a certain (random) Hamiltonian $H_N(\bar\varsigma)$: disorder implies, in fact, that energy levels are randomly distributed. Each state belonging to a set of replicas of the original system may then be weighed by the Gibbs measure

$$G_N(\{\bar\varsigma\}) = \frac{\exp(-\beta H_N(\bar\varsigma))}{Z_N}, \tag{9.1}$$

where $\beta$ is the inverse temperature, $Z_N$ the partition function

$$Z_N = \hat Z_N(\beta) = \sum_{\bar\varsigma \in \{-1,1\}^N} \exp(-\beta H_N(\bar\varsigma)), \tag{9.2}$$

and the minus sign indicates that low energy states are favored. The Hamiltonian $H_N(\bar\varsigma)$ and the Gibbs measure $G_N(\{\bar\varsigma\})$ are essential in evaluating statistical (thermodynamical) properties of the spin lattice, although the estimate of the partition function is not immediate or, better, is not always clear. “Molecular” and “mean field” points of view may be pursued in developing analyses. In the former case one considers interactions decaying with increasing distance between atoms located at lattice vertices and obtains models that are hard to manage. In mean field approaches, on the contrary, one forgets the exact locations of atoms and assumes that interactions have infinite range. Under these assumptions, rather complete treatments exist (with related solutions), like the classical one in Ref. [2], even if the thermodynamics of spin glasses with finite extension in space is still unclear in various respects. The physics of such disordered systems is rather subtle (see [3]) and generates far-reaching vistas on problems of pure stochastic calculus, as pointed out in the encyclopedic treatise by Talagrand [4].

Here the attention is focused on the description of the interaction between local spin states and macroscopic deformation. To this end, I make use of the general model-building framework of multi-field theories describing complex bodies (see [5–8]) to construct a hydrodynamic scheme for the description of spin alloys. Metallic alloys with spin glass states are, in fact, complex bodies in the sense that their material substructure influences the gross behavior in such a way that interactions due to substructural changes are prominent and have to be described directly. My point of view on the picture of interactions lies between the molecular and mean field approaches. More precisely, I forget the exact location of each atom, as in the mean field approach, but consider just weakly non-local interactions of gradient type.
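As a concrete illustration of (9.1) and (9.2), the following Python sketch enumerates all $2^N$ configurations of a small Ising system and evaluates the partition function and the Gibbs weights by brute force. The form of the Hamiltonian (pairwise Gaussian couplings between all sites, a mean-field-type choice), the function names and the parameter values are assumptions made for illustration only; they are not taken from the text, and the enumeration is feasible only for small $N$.

```python
import itertools
import math
import random


def random_couplings(n, seed=0):
    """Symmetric Gaussian couplings J[i][j] (an illustrative model of disorder)."""
    rng = random.Random(seed)
    J = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            J[i][j] = J[j][i] = rng.gauss(0.0, 1.0) / math.sqrt(n)
    return J


def hamiltonian(spins, J):
    """Assumed form H_N = -sum_{i<j} J_ij s_i s_j, with s_i in {-1, +1}."""
    n = len(spins)
    return -sum(J[i][j] * spins[i] * spins[j]
                for i in range(n) for j in range(i + 1, n))


def gibbs_measure(J, beta):
    """Return (Z_N, {configuration: G_N}) by enumerating all 2^N states."""
    n = len(J)
    configs = list(itertools.product((-1, 1), repeat=n))
    weights = [math.exp(-beta * hamiltonian(c, J)) for c in configs]
    Z = sum(weights)                      # partition function, as in (9.2)
    return Z, {c: w / Z for c, w in zip(configs, weights)}  # measure (9.1)
```

For instance, `Z, G = gibbs_measure(random_couplings(6), beta=1.0)` returns weights that sum to one, and two configurations related by a global spin flip receive the same weight because the assumed Hamiltonian is even in the spins.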
Each material element is pictured as a patch containing a family of spin particles interacting with each other. The spatial variation of spin orientations in ground states due to the distribution of impurities is accounted for by the presence of the gradient of the spin field in the list of entries of the energy. Spin glasses may in fact be considered, in the hydrodynamic range, as the continuum limit of a magnetic lattice with disclinations.

The general formulation is discussed first in a purely spatial setting, without any use of a reference configuration, in order to get a formalism common to spin fluids and certain classes of hyperfluids. Then the referential counterpart of the description is explored to find the consequences of a requirement of invariance of the Lagrangian density with respect to “permutations” of defects in the reference place. A version of Noether's theorem accounting for the gyroscopic inertia of spins can be proven. It implies the covariance of the balance of spin interactions with respect to the action of Lie groups on the sphere $S^2$ (the manifold of substructural shapes in this case). As a special case, the covariance of the Landau–Lifshitz–Gilbert equation accounting for gyromagnetic effects follows. The mechanics of evolving point defects and disclination lines is also discussed, together with the covariance of the relevant evolution equations.

In the absence of macroscopic deformation, the analysis of ground states of spin glasses can be reduced to that of Dirichlet energies involving maps between a contractible bounded set of $\mathbb{R}^3$ and $S^2$. For this case, profound analytical results exist (see [9, 10]).

Some standard notations are recalled briefly. For any pair $A$ and $B$ of tensors of the same order, $A \cdot B$ is the scalar product contracting all indices. If $A$ is an $r$-covariant and $s$-contravariant tensor and $C$ is an $(r-n)$-covariant and $(s-m)$-contravariant tensor, with $r > n$, $s > m$, then $CA$ is a product contracting all indices of $C$. If both $A$ and $B$ are second order tensors, the product $AB$ contracts only one index, giving rise to a second order tensor.
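For second order tensors, the two products just recalled can be written in Cartesian components as follows. This is only a sketch: an orthonormal frame is assumed, so that the distinction between covariant and contravariant indices is suppressed, and summation over repeated indices is understood.

```latex
\begin{align*}
A \cdot B &= A_{ij} B_{ij} = \operatorname{tr}\bigl(A^{T} B\bigr), \\
(AB)_{ij} &= A_{ik} B_{kj}.
\end{align*}
```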
9.2 Spatial Representation
9.2.1 Heisenberg spins and balance equations
$\mathcal{B}$ indicates the “regular” region of the three-dimensional Euclidean point space occupied by the body in its generic current place. Regularity is intended in the sense that $\mathcal{B}$ is bounded, compact and with boundary $\partial\mathcal{B}$ smooth everywhere except at a finite number of corners and edges. Let $g$ be at each point $x$ the (Riemannian) metric pertinent to $\mathcal{B}$; it is the value of a field

$$\mathcal{B} \ni x \overset{\tilde g}{\longrightarrow} g = \tilde g(x) \in \mathrm{Sym}^+(T_x\mathcal{B}, T_x^*\mathcal{B}) \subset \mathrm{Hom}(T_x\mathcal{B}, T_x^*\mathcal{B}) \cong \mathbb{R}^3 \otimes \mathbb{R}^3, \tag{9.3}$$

defined over $\mathcal{B}$.

The sole description of the place $\mathcal{B}$ occupied by the body is not sufficient to describe the geometry of its minute spin structure. As in all cases of complex bodies, each material element is in fact a system here, and a descriptor of the local spins has to be introduced. Different choices of morphological descriptors can be made. In conditions of magnetic saturation, the natural choice is to assign to each material element the Heisenberg spin $\varsigma$, an element of the unit sphere $S^2$, so that $S^2$ plays here the role of manifold of substructural shapes of the magnetic substructural morphology. At each point of $\mathcal{B}$ the spin $\varsigma$ is then the value of a field

$$\mathcal{B} \ni x \overset{\tilde\varsigma_a}{\longrightarrow} \varsigma = \tilde\varsigma_a(x) \in S^2, \tag{9.4}$$

which is considered sufficiently regular to justify formally the developments below. Sections over the tangent bundles $T\mathcal{B}$ and $TS^2$, parametrized by a parameter $t \in [0, \bar t\,]$, define over $\mathcal{B}$ sufficiently smooth rate fields

$$\mathcal{B} \times [0, \bar t\,] \ni (x, t) \overset{\tilde v}{\longrightarrow} v = \tilde v(x, t) \in T_x\mathcal{B}, \tag{9.5}$$

$$\mathcal{B} \times [0, \bar t\,] \ni (x, t) \overset{\tilde\upsilon}{\longrightarrow} \upsilon = \tilde\upsilon(x, t) \in T_\varsigma S^2, \tag{9.6}$$

once one identifies $t$ with time.

An essential ingredient for describing the mechanical behavior in a conservative setting is a Lagrangian $L$, a 3 + 1 form admitting density $\mathcal{L}$, namely

$$L = \mathcal{L}\,(dx)^3 \wedge dt, \tag{9.7}$$

with $\mathcal{L}$ the value of a map $\tilde{\mathcal{L}}$, namely

$$\mathcal{L} = \tilde{\mathcal{L}}(x, v, g, \varsigma, \operatorname{grad}\varsigma) = \tfrac{1}{2}\rho|v|^2 - e(g, \varsigma, \operatorname{grad}\varsigma) - U(x). \tag{9.8}$$

In (9.8), $\operatorname{grad}\varsigma$ is the spatial derivative of $\varsigma$, $e(\cdot)$ the elastic potential and $U(\cdot)$ the potential of the external body forces, i.e. the gravitational field. No external fields acting directly on spins are considered; spin structures are spontaneously induced by deformation or they pre-exist the deformation itself. Each spin is constrained to have unit length, being an element of $S^2$: its inertia is then considered here to have a purely gyroscopic (powerless) nature. To account for it in a global extremum principle, appropriate variations of the spin field are necessary. They are given by

$$\mathcal{B} \times [0, \bar t\,] \ni (x, t) \overset{\delta\tilde\varsigma_a}{\longrightarrow} \delta\varsigma = \delta\tilde\varsigma_a(x, t) \in T_\varsigma S^2, \tag{9.9}$$

a generic time-parametrized section of $TS^2$ not necessarily coincident with $\tilde\upsilon$. The global extremum principle used here is given by

$$\hat\delta \int_{\mathcal{B}\times[0,\bar t\,]} \mathcal{L}\,(dx)^3 \wedge dt + \int_{\mathcal{B}\times[0,\bar t\,]} \gamma(\varsigma \times \upsilon) \cdot \delta\varsigma\,(dx)^3 \wedge dt = 0, \tag{9.10}$$

where $\hat\delta$ indicates a “variation” accounting for the possible spatial dependence of fields in (9.10) and $\gamma$ is a material constant set equal to 1 for the sake of simplicity.
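The second integral accounts for gyroscopic inertia; its integrand does not vanish unless $\delta\varsigma = \upsilon$. Further details about variations are needed. It is necessary first to select an arbitrary smooth curve in the group of automorphisms of the ambient space $\mathcal{E}^3$, a group indicated by $\mathrm{Aut}(\mathcal{E}^3)$. Precisely, one considers a smooth curve $[0, \bar s_2] \ni s_2 \longmapsto \bar f^{\,2}_{s_2} \in \mathrm{Aut}(\mathcal{E}^3)$, indicates by $v$ the derivative $\frac{d}{ds_2}\bar f^{\,2}_{s_2}\big|_{s_2=0}(x)$ and selects $\bar f^{\,2}_{s_2}$ in order to have $\operatorname{skew}(\operatorname{grad} v) = 0$. Total variations of the entries of the Lagrangian density can then be defined. They are

$$\hat\delta g := \frac{d}{ds_2}\bar f^{\,2*}_{s_2} g\Big|_{s_2=0} = \mathrm{L}_v g = 2\operatorname{sym}(\operatorname{grad} v) = 2\operatorname{grad} v, \tag{9.11}$$

where the asterisk denotes pull-back, so that $\mathrm{L}_v g$ is the autonomous Lie derivative following the flow $v$,

$$\hat\delta\varsigma := \delta\varsigma, \tag{9.12}$$

$$\hat\delta\operatorname{grad}\varsigma := \operatorname{grad}\delta\varsigma + (\operatorname{grad}\varsigma)\operatorname{grad} v. \tag{9.13}$$

Under sufficient smoothness of the Lagrangian density $\tilde{\mathcal{L}}$, straightforward calculations imply the balance equations

$$\rho\dot v = \operatorname{div}\sigma + b, \tag{9.14}$$

$$-\varsigma \times \upsilon = \operatorname{div} S_a - z_a, \tag{9.15}$$

where $\rho$ is the mass density, $b = -\partial_x\mathcal{L} \in T_x^*\mathcal{B}$ the co-vector of standard body forces and $\sigma$ the Cauchy stress tensor given by

$$\sigma = -2\,\partial_g\mathcal{L} + (\operatorname{grad}\varsigma)^* S_a \in \mathrm{Hom}(T_x^*\mathcal{B}, T_x^*\mathcal{B}), \tag{9.16}$$

with $S_a$ the actual microstress (describing contact interactions of gradient type associated with the spatial variation of the spin field), namely

$$S_a = -\partial_{\operatorname{grad}\varsigma}\mathcal{L} \in \mathrm{Hom}(T_x^*\mathcal{B}, T_\varsigma^* S^2), \tag{9.17}$$

and finally $z_a = \partial_\varsigma e \in T_\varsigma^* S^2$ the self-force ($x \longmapsto \varsigma$ is in fact a self-interacting field).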
If one requires invariance of $\tilde e(\cdot)$ with respect to the action of $SO(3)$ (objectivity) on the ambient space $\mathcal{E}^3$ and the manifold of substructural shapes $S^2$, one gets

$$\operatorname{skew}(\sigma + \varsigma \otimes z_a + S_a^* \operatorname{grad}\varsigma) = 0. \tag{9.18}$$

This is a special case of a general result in complex bodies (see [5]): the Cauchy stress is in general not symmetric, and its skew-symmetric part is determined by the substructural interactions measured by $z_a$ and $S_a$. The second order tensor $(\operatorname{grad}\varsigma)^* \partial_{\operatorname{grad}\varsigma}\mathcal{L}$ rules directly the transfer of energy from the microscopic spin level to the gross deformation scale and is a direct indicator of the competition between the two levels. In complex fluids such a mechanism is a direct source of topological transitions along flows.

Let us consider the simple case in which

$$\tilde{\mathcal{L}}(x, v, g, \varsigma, \operatorname{grad}\varsigma) = \tilde{\mathcal{L}}_1(x, v, g) + \tfrac{1}{2}\kappa|\operatorname{grad}\varsigma|^2, \tag{9.19}$$

with $\tilde{\mathcal{L}}_1(\cdot)$ the Lagrangian density of the gross motion and $\kappa$ a constitutive constant that we can set equal to 1 for the sake of simplicity. In this case, (9.15) reduces to

$$\upsilon = -\varsigma \times \bar\Delta\varsigma, \tag{9.20}$$

which coincides with the Landau–Lifshitz–Gilbert equation (written with respect to the current place of the body) when just gyromagnetic effects are accounted for. In (9.20), $\bar\Delta$ indicates the Laplacian with respect to coordinates in $\mathcal{B}$, namely $\bar\Delta = \operatorname{div}\operatorname{grad}$. Notice that the presence of the gyroscopic term $(\varsigma \times \upsilon) \cdot \delta\varsigma$ does not allow us to write the global extremum principle (9.10) in a purely variational form. In a sense, as in the case of substructural friction (see [8] for the paradigmatic example of quasi-periodic alloys), (9.10) allows one to get metastable states rather than true ground states. Basically, two mechanisms occur here: (i) the frustration between ground states belonging to different energetic wells when $e$ is multi-well and (ii) the occurrence of gyromagnetic effects.

The Hamiltonian density associated with $\mathcal{L}$ is given simply by

$$\mathcal{H} = \tilde{\mathcal{H}}(x, \bar p, g, \varsigma, \operatorname{grad}\varsigma) = \bar p \cdot v - \tilde{\mathcal{L}}(x, v, g, \varsigma, \operatorname{grad}\varsigma), \tag{9.21}$$
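where $\bar p$ is the canonical momentum given by $\bar p = \partial_v\mathcal{L} \in T_x^*\mathcal{B}$. No canonical momentum is associated with spins because spin inertia has only a gyroscopic nature.

Let $\mathcal{F}$ be any functional of the type $\mathcal{F} = \int_{\mathcal{B}} f(x, \bar p, \varsigma)\,(dx)^3$ with $f$ a sufficiently smooth density. For a boundary value problem in which just places $x$ and spins $\varsigma$ are prescribed at the boundary, the relevant Hamilton equations can be written for any $\mathcal{F}$ of the type above as

$$\dot{\mathcal{F}} = \int_{\mathcal{B}} \left(\{\mathcal{H}, f\}_{a,el} + \{\mathcal{H}, f\}_{a,s}(\varsigma)\right)(dx)^3, \tag{9.22}$$

where the superposed dot means partial derivative with respect to time, and the elastic and spin Poisson brackets $\{\cdot,\cdot\}_{a,el}$ and $\{\cdot,\cdot\}_{a,s}$ in the actual configuration $\mathcal{B}$ are defined, respectively, by

$$\{\mathcal{H}, f\}_{a,el} = \frac{\delta\mathcal{H}}{\delta x} \cdot \frac{\delta f}{\delta \bar p} - \frac{\delta f}{\delta x} \cdot \frac{\delta\mathcal{H}}{\delta \bar p}, \tag{9.23}$$

$$\{\mathcal{H}, f\}_{a,s}(\varsigma) = \left(\frac{\delta\mathcal{H}}{\delta\varsigma} \times \frac{\delta f}{\delta\varsigma}\right) \cdot \varsigma.$$

In the formulas above, for any vector $v = \frac{d}{ds_2}\bar f^{\,2}_{s_2}\big|_{s_2=0}(x)$, one defines $\delta\mathcal{H}/\delta x$ by

$$\frac{\delta\mathcal{H}}{\delta x} \cdot v = \left(-\partial_x\mathcal{H} + \operatorname{div}\left(2\,\partial_g\mathcal{H} - (\operatorname{grad}\varsigma)^* \partial_{\operatorname{grad}\varsigma}\mathcal{H}\right)\right) \cdot v, \tag{9.24}$$

holding $\bar p$ fixed and allowing $x$ to vary.

Analogies and differences with Poisson bracket-based descriptions of spin glasses, such as the proposals by Halperin and Saslow [11] and Dzyaloshinskii and Volovik [12], become evident (comparison is made here with the latter work only, because the basic equations of the former are generalized there):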
• Contrary to [12], here the analysis of the interaction between spin structures and macroscopic deformation allows us to underline the transfer of energy from the spin level to the macroscopic one, a transfer ruled by the second rank tensor $(\operatorname{grad}\varsigma)^* S_a$ appearing in (9.16) or, which is the same thing, by $(\operatorname{grad}\varsigma)^* \partial_{\operatorname{grad}\varsigma}\mathcal{H}$ appearing in (9.24).
• Contrary to [12], dissipative effects are not accounted for, but they can be included directly by means of the same standard technique used in Ref. [12], without any additional effort.
• A special aspect deserving attention is that in Ref. [12] gauge fields taking values in the special orthogonal group $SO(3)$ are introduced and their topological singularities are called disclinations. Gauge fields may also be used to represent spin substructures in a way alternative to [12]. In fact, one may even describe spin glasses by introducing, in a standard Lagrangian density for a simple fluid (thus a Lagrangian not including $\varsigma$ and its gradients in the list of constitutive entries), appropriate gauge fields that allow one to account for the quenched disorder. In this case, the description of spin glasses displays strict analogies with that of Yang–Mills fluids, as shown in Ref. [13]. Here, I do not introduce gauge fields and consider a disclination as a line defect where the spin field is not defined. Alternatively, one may consider a disclination as a line where the spin field takes, point by point, the entire $S^2$ as value.

Heisenberg spins could be directly substituted in the representation of the substructural morphology by elements of $SO(3)$ or $SU(2)/U(1)$. In fact, the special orthogonal group $SO(3)$ acts transitively on $S^2$, so that one may substitute $S^2$ with $SO(3)$. Moreover, if the body admits an “ordered” phase in which all spins are oriented along a given direction $e_1$ (an assumption valid only in principle, because the “quenched” disorder may exclude it), the symmetry group of such a phase is $SO(2)$, i.e. the group of all proper rotations about $e_1$. Then, one may use the quotient space $SO(3)/SO(2)$ as manifold of substructural shapes, giving it the role played up to now by $S^2$. The elements of $SO(3)/SO(2)$ are cosets of $SO(3)$ with respect to $SO(2)$, and the correspondence between the elements of $S^2$ and the cosets is one-to-one and continuous [14]. On the other hand, the special unitary group $SU(2)$ is homomorphic to $SO(3)$, which it covers twice. Its subgroup leaving $e_1$ invariant is isomorphic to the unitary group $U(1)$, so that we may use the coset space $SU(2)/U(1)$ as manifold of substructural shapes [14].
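The correspondence between cosets of $SO(3)/SO(2)$ and points of $S^2$ can be made tangible with a small numerical check. The Python sketch below takes $e_1 = (1, 0, 0)$ and maps a rotation $R$ to the unit vector $R e_1$; multiplying $R$ on the right by any rotation about $e_1$ (an element of $SO(2)$) leaves that vector unchanged, so the map depends only on the coset. The helper names and the chosen angles are hypothetical and serve only as illustration.

```python
import math


def rot_x(angle):
    """Proper rotation about e1 = (1, 0, 0): an element of the subgroup SO(2)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_z(angle):
    """Proper rotation about the third axis, used to build generic rotations."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(R, v):
    """Action of the matrix R on the vector v."""
    return [sum(R[i][k] * v[k] for k in range(3)) for i in range(3)]


E1 = [1.0, 0.0, 0.0]


def spin_of_coset(R):
    """Map the coset R SO(2) to the point R e1 of the unit sphere S^2."""
    return apply(R, E1)
```

With `R = matmul(rot_z(0.7), matmul(rot_x(1.1), rot_z(-0.3)))`, the vectors `spin_of_coset(R)` and `spin_of_coset(matmul(R, rot_x(a)))` agree to rounding error for any angle `a`, which is the numerical counterpart of the statement that all elements of one coset identify the same spin.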
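9.2.2 Connection with non-viscous compressible spin fluids
Spin fluids can be treated with the formalism above. For compressible spin fluids (see [15, 16]), the Lagrangian density depends on $g$ only through its determinant, i.e. the specific volume $\iota = \det g$, so that $\mathcal{L} = \tilde{\mathcal{L}}(x, v, \iota, \varsigma, \operatorname{grad}\varsigma)$. As a consequence, when viscous effects are absent, the Cauchy stress tensor becomes simply

$$T = -pI - (\operatorname{grad}\varsigma)^* \partial_{\operatorname{grad}\varsigma} e, \tag{9.25}$$

with the thermodynamic pressure $p$ given by $p = -\iota\,\partial_\iota e$. As anticipated above, the spin substructure may induce topological transitions along flows. The results below suggest some examples. Let the standard macroscopic vorticity be indicated by $\omega$, namely $\omega = \operatorname{curl} v$.

Proposition 1 For a steady-state, isentropic flow of a perfect compressible spin fluid, one gets

$$\omega \times v = -\operatorname{grad} h_s - \operatorname{grad}\varsigma\,(\operatorname{grad}\partial_{\operatorname{grad}\varsigma} e) - (\operatorname{grad}(\operatorname{div}\partial_{\operatorname{grad}\varsigma} e + \varsigma \times \upsilon))^* \varsigma + \iota(\operatorname{grad}\varsigma)^* \operatorname{div}\partial_{\operatorname{grad}\varsigma} e + \iota(\partial_{\operatorname{grad}\varsigma} e)^* \operatorname{grad}^2\varsigma, \tag{9.26}$$

with $h_s$ the total enthalpy of the spin fluid given by

$$h_s = \tfrac{1}{2}|v|^2 + e - \iota\,\partial_\iota e - \partial_\varsigma e \cdot \varsigma - \partial_{\operatorname{grad}\varsigma} e \cdot \operatorname{grad}\varsigma. \tag{9.27}$$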
This result is a direct application to spin fluids of a general theorem valid for arbitrary complex fluids (see Proposition 2 of [17]), so that the rather articulated proof is not written down. As an immediate corollary, one realizes that, contrary to simple fluids, when the enthalpy is constant along a steady-state isentropic flow of a perfect spin fluid, that is $\operatorname{grad} h_s = 0$, the spin substructure may generate vorticity when the remaining right-hand side terms of (9.26) are different from zero. Alternatively, for non-constant enthalpy, the spin substructure may cancel macroscopic vorticity when the right-hand side of (9.26) vanishes identically.

The landscape is more articulated in non-stationary flows where, by calculating the curl of the balance of forces, the evolution equation for the transport of vorticity can be derived. It is given by

$$\frac{d\omega}{dt} = (\omega \cdot \operatorname{grad})v + \operatorname{grad}\iota \times \left(-\operatorname{grad} p - \operatorname{div}((\operatorname{grad}\varsigma)^* S_a)\right) - \iota\operatorname{curl}\operatorname{div}((\operatorname{grad}\varsigma)^* S_a). \tag{9.28}$$

Notice that the Ericksen-like stress $(\operatorname{grad}\varsigma)^* S_a$ contributes directly to the time evolution of the vorticity, as mentioned previously in Remark 1. The nature of this contribution may be underlined clearly in the simpler case of an incompressible spin fluid flowing along a plane. Incompressibility means that $\operatorname{grad}\iota = 0$, while the planar nature of the motion also implies $\omega \perp \operatorname{grad}$, so that the term $(\omega \cdot \operatorname{grad})v$ vanishes and (9.28) reduces to

$$\frac{d\omega}{dt} = -\iota\operatorname{curl}\operatorname{div}((\operatorname{grad}\varsigma)^* S_a). \tag{9.29}$$

Theorem 1 For a perfect, incompressible spin fluid in non-stationary planar motion, the vorticity is never conserved unless
(i) there exists a scalar field $\mathcal{B} \ni x \longmapsto \pi(x)$ such that

$$(\operatorname{grad}\varsigma)^* S_a = \pi I, \tag{9.30}$$

with $I$ the second order unit tensor, or
(ii) there exists a second order tensor valued $C^1(\mathcal{B})$ field $\mathcal{B} \ni x \longmapsto A(x)$ such that

$$\operatorname{curl}(S_a^* \operatorname{grad}\varsigma) = (\operatorname{curl} A)^T. \tag{9.31}$$

In fact, from elementary tensor calculus we know first that $\operatorname{curl}\operatorname{grad}\pi = 0$ when $\pi(\cdot)$ is a scalar field. Moreover, for any $C^2(\mathcal{B})$ second order tensor valued field $A(\cdot)$, one gets both $\operatorname{curl}\operatorname{div} A^T = \operatorname{div}\operatorname{curl} A$ and $\operatorname{div}(\operatorname{curl} A)^T = 0$. Conversely, if the right-hand side of (9.29) vanishes, conditions (i) and (ii) apply indifferently. Condition (i) represents simply the case in which the Ericksen-like stress generates just a sort of “pressure”, while the physical interpretation of condition (ii) is not immediate.
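The identities invoked after Theorem 1 can be checked in Cartesian components. The sketch below assumes a specific convention, not stated in the text, for the divergence and the curl of a second order tensor field, namely $(\operatorname{div} A)_i = \partial_j A_{ij}$ and $(\operatorname{curl} A)_{ij} = \epsilon_{ipq}\partial_p A_{jq}$, with $\epsilon$ the permutation symbol; other conventions move the transposes around. The symbol $M$ is a shorthand introduced here for $(\operatorname{grad}\varsigma)^* S_a$.

```latex
\begin{align*}
(\operatorname{curl}\operatorname{grad}\pi)_i
  &= \epsilon_{ipq}\,\partial_p\partial_q \pi = 0, \\
\bigl(\operatorname{div}(\operatorname{curl}A)^{T}\bigr)_i
  &= \partial_j(\operatorname{curl}A)_{ji}
   = \epsilon_{jpq}\,\partial_j\partial_p A_{iq} = 0, \\
\bigl(\operatorname{curl}\operatorname{div}A^{T}\bigr)_i
  &= \epsilon_{ipq}\,\partial_p\bigl(\partial_j A_{jq}\bigr)
   = \partial_j\bigl(\epsilon_{ipq}\,\partial_p A_{jq}\bigr)
   = (\operatorname{div}\operatorname{curl}A)_i, \\[4pt]
\text{(i)}\quad M = \pi I
  &\;\Longrightarrow\; \operatorname{curl}\operatorname{div}M
   = \operatorname{curl}\operatorname{grad}\pi = 0, \\
\text{(ii)}\quad \operatorname{curl}M^{T} = (\operatorname{curl}A)^{T}
  &\;\Longrightarrow\; \operatorname{curl}\operatorname{div}M
   = \operatorname{div}\operatorname{curl}M^{T}
   = \operatorname{div}(\operatorname{curl}A)^{T} = 0.
\end{align*}
```

The first two lines vanish because a symmetric pair of derivatives is contracted with the skew-symmetric permutation symbol; in the last line $M^{T} = S_a^* \operatorname{grad}\varsigma$, which is the tensor appearing in (9.31).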
9.3 Referential Description: Invariance with Respect to Relabeling
The choice of a reference place $\mathcal{B}_0$, which the body may occupy in principle, is helpful when one investigates the behavior of defects in solids. $\mathcal{B}_0$ is presumed to be regular like $\mathcal{B}$, which is reached by a standard deformation, i.e. a mapping

$$\mathcal{B}_0 \ni X \overset{\tilde x}{\longrightarrow} x = \tilde x(X) \in \mathcal{B}, \tag{9.32}$$

which is presumed to be one-to-one, at least piecewise differentiable and orientation preserving. At each $X$, as usual, $F$ indicates the spatial derivative of $\tilde x$, so that $F = D\tilde x(X)$. The requirement that $\tilde x$ be orientation preserving implies that $\det F > 0$ everywhere. Spins are assigned in $\mathcal{B}_0$ by a continuous and piecewise continuously differentiable mapping

$$\mathcal{B}_0 \ni X \overset{\tilde\varsigma}{\longrightarrow} \varsigma = \tilde\varsigma(X) \in S^2 \tag{9.33}$$

such that $\tilde\varsigma_a = \tilde\varsigma \circ \tilde x^{-1}$ (alternatively, $\varsigma$ may be selected as an element of the coset space $SU(2)/U(1)$ and the related treatment developed).

Motions are then described by time-parametrized curves $\tilde x_t$ and $\tilde\varsigma_t$ in the space of maps $\tilde x$ and $\tilde\varsigma$, with $t \in [0, \bar t\,]$, curves assumed sufficiently smooth with respect to time. In this way $x = \tilde x(X, t)$ indicates the current place of the material element resting at $X$ when $t = 0$ and $\varsigma = \tilde\varsigma(X, t)$ the value of the spin attached at $X$ at the instant $t$. In this description involving $\mathcal{B}_0$ as paragon setting, rates in the referential description, that is rates considered as fields over $\mathcal{B}_0$, are indicated by $\dot x = \frac{d}{dt}\tilde x(X, t) \in T_x\mathcal{B}$ and $\dot\varsigma = \frac{d}{dt}\tilde\varsigma(X, t) \in T_\varsigma S^2$.

In the referential description, the extremum principle (9.10) then becomes

$$\delta \int_{\mathcal{B}_0\times[0,\bar t\,]} \tilde{\mathcal{L}}(X, x, \dot x, F, \varsigma, D\varsigma)\,(dX)^3 \wedge dt + \int_{\mathcal{B}_0\times[0,\bar t\,]} \gamma(\varsigma \times \dot\varsigma) \cdot \delta\varsigma\,(dX)^3 \wedge dt = 0, \tag{9.34}$$

where the explicit dependence of the Lagrangian density on $X$ indicates that one accounts for the inhomogeneity induced by the defect structure, $F$ substitutes $g$ and the material constant $\gamma$ is set once more equal to 1. More precisely, the Lagrangian density $\mathcal{L}$ is given by

$$\mathcal{L} = \tilde{\mathcal{L}}(X, x, \dot x, F, \varsigma, D\varsigma) = \tfrac{1}{2}\rho_0|\dot x|^2 - e(X, x, F, \varsigma, D\varsigma) - U(x), \tag{9.35}$$

with $\rho_0$ the referential mass density, conserved during the motion. Here, computing variations is a standard job and one gets pointwise balance equations of the form

$$\frac{d}{dt}\,\partial_{\dot x}\mathcal{L} = \partial_x\mathcal{L} - \operatorname{Div}\partial_F\mathcal{L}, \tag{9.36}$$

$$\dot\varsigma = -\varsigma \times (\partial_\varsigma\mathcal{L} - \operatorname{Div}\partial_{D\varsigma}\mathcal{L}). \tag{9.37}$$
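In deriving (9.37) one considers that $\dot\varsigma$ is orthogonal to $\varsigma$ since $\varsigma \cdot \varsigma = 1$.

The geometrical environments involved in the referential description are not only the ambient space $\mathcal{E}^3$, the interval of time and the manifold of substructural shapes $S^2$ (or $SU(2)/U(1)$ when relevant), as in Section 9.2, but also the reference place $\mathcal{B}_0$. With respect to the previous spatial description, the definition of observer now involves the representation of $\mathcal{B}_0$. Changes in synchronous observers (that is, observers measuring the same time scale) are then characterized by the transformations listed below (see [7, 8]), which alter the representation of $\mathcal{E}^3$, $S^2$ and $\mathcal{B}_0$, respectively:
(a) $\mathbb{R}^+ \ni s_2 \longmapsto f^2_{s_2} \in \mathrm{Aut}(\mathcal{E}^3)$, with $f^2_0$ the identity. Each $f^2_{s_2}$ is assumed sufficiently smooth in space. $u$ indicates the derivative $f^{2\prime}_0(x) := \frac{d}{ds_2} f^2_{s_2}(x)\big|_{s_2=0}$.
(b) A Lie group $G$, with Lie algebra $\mathfrak{g}$, acts over $S^2$. If $\xi \in \mathfrak{g}$, its action over $\varsigma \in S^2$ is indicated with $\xi_{S^2}(\varsigma)$. By indicating with $\varsigma_g$ the value of $\varsigma$ after the action of $g \in G$, if a one-parameter smooth curve $\mathbb{R}^+ \ni s_3 \longmapsto g_{s_3} \in G$ is considered over $G$ (such that $\xi = \frac{dg_{s_3}}{ds_3}\big|_{s_3=0}$) together with its corresponding curve $s_3 \longmapsto \varsigma_{g_{s_3}}$ over $S^2$, starting from a given $\varsigma$, one gets $\xi_{S^2}(\varsigma) = \frac{d}{ds_3}\varsigma_{g_{s_3}}\big|_{s_3=0}$.
(c) $\mathbb{R}^+ \ni s_1 \longmapsto f^1_{s_1} \in \mathrm{SDiff}(\mathcal{B}_0)$ is a smooth curve with $f^1_0$ the identity; at each $s_1$ one gets $X \longmapsto f^1_{s_1}(X)$, with $\operatorname{Div} f^{1\prime}_{s_1}(X) = 0$, where the prime denotes differentiation with respect to the parameter $s_1$. To simplify notations, $w$ will indicate the derivative $f^{1\prime}_0(X) := \frac{d}{ds_1} f^1_{s_1}(X)\big|_{s_1=0}$.

$\mathcal{L}$ is said to be invariant with respect to the action of $f^1_{s_1}$, $f^2_{s_2}$ and $G$ if

$$\mathcal{L} = \tilde{\mathcal{L}}(X, x, \dot x, F, \varsigma, D\varsigma) = \tilde{\mathcal{L}}\left(f^1, f^2, (\operatorname{grad} f^2)\dot x, (\operatorname{grad} f^2)F(Df^1)^{-1}, \varsigma_g, (D\varsigma_g)(Df^1)^{-1}\right), \tag{9.38}$$

for any $g \in G$ and $s_1, s_2 \in \mathbb{R}^+$, where, for the sake of brevity, $f^1$, $f^2$ and $\varsigma_g$ indicate the values $f^1_{s_1}(X)$, $f^2_{s_2}(x)$ and $\varsigma_{g_{s_3}}(X)$.

Let us define the scalar densities $\mathfrak{Q}$ and $\mathfrak{m}(\varsigma)$ and the vector density $\mathfrak{F}$, respectively, by

$$\mathfrak{Q} = \partial_{\dot x}\mathcal{L} \cdot (u - Fw), \tag{9.39}$$

$$\mathfrak{m}(\varsigma) = -\varsigma \times \dot\varsigma \cdot (\xi_{S^2}(\varsigma) - (D\varsigma)w), \tag{9.40}$$

$$\mathfrak{F} = \mathcal{L}w + (\partial_F\mathcal{L})^*(u - Fw) + (\partial_{D\varsigma}\mathcal{L})^*(\xi_{S^2}(\varsigma) - (D\varsigma)w). \tag{9.41}$$

Theorem 2 If $\mathcal{L}$ is invariant under $f^1_{s_1}$, $f^2_{s_2}$ and $G$, for any $g \in G$ and $s_1, s_2 \in \mathbb{R}^+$, then for the variational problem (9.34) one gets

$$\dot{\mathfrak{Q}} + \operatorname{Div}\mathfrak{F} - \mathfrak{m}(\varsigma) = 0. \tag{9.42}$$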
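Proof The requirement of invariance of $\mathcal{L}$ implies

$$\frac{d}{ds_1}\mathcal{L}\Big|_{s_1=0,\,s_2=0,\,s_3=0} = 0, \tag{9.43}$$

$$\frac{d}{ds_2}\mathcal{L}\Big|_{s_1=0,\,s_2=0,\,s_3=0} = 0, \tag{9.44}$$

$$\frac{d}{ds_3}\mathcal{L}\Big|_{s_1=0,\,s_2=0,\,s_3=0} = 0, \tag{9.45}$$

and these relations lead to

$$\partial_X\mathcal{L} \cdot w - \partial_F\mathcal{L} \cdot (F\,Dw) - \partial_{D\varsigma}\mathcal{L} \cdot ((D\varsigma)Dw) = 0, \tag{9.46}$$

$$\partial_x\mathcal{L} \cdot u + \partial_{\dot x}\mathcal{L} \cdot ((\operatorname{grad} u)\dot x) + \partial_F\mathcal{L} \cdot ((\operatorname{grad} u)F) = 0, \tag{9.47}$$

$$\partial_\varsigma\mathcal{L} \cdot \xi_{S^2}(\varsigma) + \partial_{D\varsigma}\mathcal{L} \cdot D\xi_{S^2}(\varsigma) = 0. \tag{9.48}$$

By computing the time derivative of $\mathfrak{Q}$ and the divergence of $\mathfrak{F}$ and using (9.46)–(9.48), thanks to the validity of (9.36) and (9.37), one gets

$$\dot{\mathfrak{Q}} + \operatorname{Div}\mathfrak{F} = -\varsigma \times \dot\varsigma \cdot (\xi_{S^2}(\varsigma) - (D\varsigma)w). \tag{9.49}$$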
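To see how (9.42) is used in practice, it may help to work out the case in which only $G$ acts, so that $u = 0$ and $w = 0$. The following sketch relies only on (9.39)–(9.42) and (9.48). The shorthand $S = -\partial_{D\varsigma}\mathcal{L}$ and $z = -\partial_\varsigma\mathcal{L}$ (the notation of Corollary 1 below) and the tangential projector $I - \varsigma \otimes \varsigma$ are introduced here for the purpose of the illustration, and it is assumed that the vectors $\xi_{S^2}(\varsigma)$ span $T_\varsigma S^2$ as $\xi$ varies, which is true for instance when $G = SO(3)$.

```latex
\begin{align*}
&\mathfrak{Q} = 0, \qquad
 \mathfrak{F} = (\partial_{D\varsigma}\mathcal{L})^{*}\,\xi_{S^2}(\varsigma), \qquad
 \mathfrak{m}(\varsigma) = -(\varsigma\times\dot\varsigma)\cdot\xi_{S^2}(\varsigma), \\
&\operatorname{Div}\mathfrak{F}
 = (\operatorname{Div}\partial_{D\varsigma}\mathcal{L})\cdot\xi_{S^2}(\varsigma)
 + \partial_{D\varsigma}\mathcal{L}\cdot D\xi_{S^2}(\varsigma)
 = (\operatorname{Div}\partial_{D\varsigma}\mathcal{L}
    - \partial_{\varsigma}\mathcal{L})\cdot\xi_{S^2}(\varsigma)
 \quad \text{by (9.48)}, \\
&\text{(9.42)} \;\Longrightarrow\;
 \bigl(\varsigma\times\dot\varsigma - (\operatorname{Div}S - z)\bigr)
 \cdot \xi_{S^2}(\varsigma) = 0
 \quad \text{for every } \xi_{S^2}(\varsigma) \in T_{\varsigma}S^2, \\
&\varsigma\times\dot\varsigma
 = (I - \varsigma\otimes\varsigma)(\operatorname{Div}S - z)
 \;\Longrightarrow\;
 \dot\varsigma = -\varsigma\times(\varsigma\times\dot\varsigma)
 = -\varsigma\times(\operatorname{Div}S - z).
\end{align*}
```

The last step uses $\varsigma \cdot \varsigma = 1$ and $\varsigma \cdot \dot\varsigma = 0$, so that $\varsigma \times (\varsigma \times \dot\varsigma) = -\dot\varsigma$, together with the fact that the component of $\operatorname{Div} S - z$ along $\varsigma$ drops out of the cross product. The result has the form of the balance of spin interactions stated in Corollary 1.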
Theorem 2 is a version (in a sense an extension) of Noether's theorem accounting for gyromagnetic effects of the type $\varsigma \times \dot\varsigma$. In standard elasticity and in its multi-field extension for complex bodies, the term $\mathfrak{m}(\varsigma)$ is absent (see [18] for simple bodies and [7, 19] for complex bodies). In deriving (9.42), use of the Euler–Lagrange equations has been made. Such a use implies a requirement of regularity for the fields involved (they have to be solutions to the Euler–Lagrange equations), a regularity not necessary in the conservative case free of gyroscopic terms, because there a direct proof of Noether's theorem not based on balance equations is available (see e.g. [20]).

Various corollaries arise naturally from Theorem 2. The first one is the common result prescribing that the balance of standard forces

$$\rho_0\ddot x = \operatorname{Div} P + b \tag{9.50}$$

(with $P = -\partial_F\mathcal{L}$ the first Piola–Kirchhoff stress and $b = -\partial_x U$) is covariant, in the sense that it is invariant with respect to changes in observers “deforming” arbitrarily $\mathcal{E}^3$. In fact, the balance of forces follows from (9.42) when $f^2$ acts alone (“alone” means that the action of $f^1$ and $G$ vanishes identically) and $u$ is selected arbitrarily. Other corollaries are listed below.

Corollary 1 If $G$, arbitrary, acts alone on $\mathcal{L}$, from (9.42) it follows that

$$\dot\varsigma = -\varsigma \times (\operatorname{Div} S - z), \tag{9.51}$$
with S = −∂Dς L ∈ Hom(TX∗ B0 , Tς∗ S 2 ) and z = −∂ς L ∈ Tς∗ S 2 . The arbitrariness of the choice of the element ξ of the Lie algebra of G implies the covariance of (9.51). A well known special case occurs when L = L˜ (X, x, x, ˙ F) + 12 'Dς2 ,
(9.52)
with ' a material constant that one may set equal to 1. In this case, another result follows from Corollary 1. Corollary 2 The Landau–Lifshitz–Gilbert equation ς˙ = −ς × ς
(9.53)
(now ς = Div( Dς)) is covariant. Of course, by requiring the invariance of e(X, x, F, ς, Dς) with respect to the action of SO(3) both on the ambient space E 3 and the manifold of substructural shapes S 2 , one obtains the referential counterpart of (9.18), namely: skew(PF∗ + ς ⊗ z + S ∗ Dς) = 0.
(9.54)
Corollary 3 Let $f^{1}_{s_1}$ act alone on $L$, with $w$ arbitrary. From (9.42) it follows that
$$\left(F^{*}\partial_{\dot{x}} L\right)^{\cdot} - D\varsigma^{*}(\varsigma \times \dot{\varsigma}) + F^{*}\,\mathrm{grad}\,U + \mathrm{Div}\!\left(\mathbb{P}_\varsigma - \tfrac{1}{2}\rho_0 |\dot{x}|^{2} I\right) - \partial_X L + F^{*} b = 0, \tag{9.55}$$
where $\mathbb{P}_\varsigma = eI - F^{*}P - D\varsigma^{*}\mathcal{S}$. Such a balance is covariant, the covariance being assured by the arbitrariness of $w$.

The second-order tensor $\mathbb{P}_\varsigma$ is a special version of a generalized Eshelby tensor derived in [6] for general multi-field theories of complex bodies. The dependence of $e$ on $\varsigma$ and $D\varsigma$ and the term $D\varsigma^{*}\mathcal{S}$ establish the difference with respect to the standard Eshelby stress. Equation (9.55) is a balance of configurational forces for spin glasses in absence of dissipative motion of defects. With respect to the general expression of configurational forces for multi-field theories as derived in [6]¹, standard substructural inertia is absent but there is a new term, namely $-D\varsigma^{*}(\varsigma \times \dot{\varsigma})$, which accounts for gyromagnetic effects². In other words, $-D\varsigma^{*}(\varsigma \times \dot{\varsigma})$ is the configurational analog of the gyroscopic inertia $-\varsigma \times \dot{\varsigma}$. The tensor $\mathbb{P}_\varsigma$ is in general non-symmetric: conditions assuring its symmetry are collected in the corollary below.

Corollary 4 Set $G$ coincident with $SO(3)$ and assume that the same copy of $SO(3)$ acts also alone on the reference place $\mathcal{B}_0$ so that, for any element $\dot{q}\times$ of the associated Lie algebra $so(3)$, $f^{1}$ is such that $w = \dot{q} \times (X - X_0)$. If the body is homogeneous and $G$ and $f^{1}$ act alone on $L$, $\mathbb{P}_\varsigma$ is symmetric.

The result can be derived from (9.42) by direct calculation. Of course, such a result is void when the disorder renders the body inhomogeneous.

¹ For the proof of the covariance of the balance of such forces see [7, 19, 21].
² A detailed treatment of configurational forces in micromagnetics from a different point of view can be found in Ref. [22].
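As a point of comparison for the symmetry statement, one may look at the classical, substructure-free case, in which the microstress vanishes and the energy depends on the deformation gradient alone; the tensor then reduces to the standard Eshelby stress $eI - F^{*}P$. The sketch below evaluates it for a homogeneous, isotropic, compressible neo-Hookean energy and measures its lack of symmetry. The energy, the moduli `mu` and `lam`, and all helper names are assumptions introduced only for this illustration; nothing here is taken from the derivation based on (9.42).

```python
import math
import random

def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def det(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def inverse(A):
    """Inverse of a 3x3 matrix through the transposed cofactor matrix."""
    d = det(A)
    inv = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            rows = [r for r in range(3) if r != i]
            cols = [c for c in range(3) if c != j]
            minor = (A[rows[0]][cols[0]] * A[rows[1]][cols[1]]
                     - A[rows[0]][cols[1]] * A[rows[1]][cols[0]])
            inv[j][i] = (-1) ** (i + j) * minor / d
    return inv

def eshelby_neo_hookean(F, mu=1.0, lam=2.0):
    """Return e*I - F^T P for a compressible neo-Hookean energy (assumed)."""
    log_j = math.log(det(F))
    C = matmul(transpose(F), F)
    energy = (0.5 * mu * (C[0][0] + C[1][1] + C[2][2] - 3.0)
              - mu * log_j + 0.5 * lam * log_j ** 2)
    F_inv_T = transpose(inverse(F))
    # First Piola-Kirchhoff stress, i.e. the derivative of the energy w.r.t. F.
    P = [[mu * (F[i][j] - F_inv_T[i][j]) + lam * log_j * F_inv_T[i][j]
          for j in range(3)] for i in range(3)]
    FtP = matmul(transpose(F), P)
    return [[(energy if i == j else 0.0) - FtP[i][j] for j in range(3)]
            for i in range(3)]

rng = random.Random(1)
# Identity plus a small perturbation: diagonally dominant, hence det(F) > 0.
F = [[(1.0 if i == j else 0.0) + 0.2 * rng.uniform(-1.0, 1.0) for j in range(3)]
     for i in range(3)]
E = eshelby_neo_hookean(F)
asymmetry = max(abs(E[i][j] - E[j][i]) for i in range(3) for j in range(3))
print(asymmetry)
```

For this isotropic energy the product $F^{*}P$ is a function of $F^{*}F$ only, which is why the printed asymmetry is at round-off level even though $F$ itself is not symmetric.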
Continua with Spin Structure
Let $b$ be any arbitrary part of $\mathcal{B}_0$, i.e. a subset of $\mathcal{B}_0$ with the same geometrical regularity as $\mathcal{B}_0$ itself. Theorem 2 implies the validity of the integral balance
$$\frac{d}{dt}\int_{b} Q\,(dX)^3 - \int_{b} m(\varsigma)\,(dX)^3 + \int_{\partial b} \mathfrak{F}\cdot n\, d\mathcal{H}^2 = 0, \tag{9.56}$$
where $d\mathcal{H}^2$ is the Hausdorff measure over the boundary $\partial b$, which is endowed with outward unit normal $n$ everywhere except at a finite number of corners and edges. From a physical point of view, (9.56) is a balance between an energetic (relative) flux and kinetic (relative) sources. In fact, by definition, when multiplied by $n$, the term $(\partial_F L)^{*}(u - Fw)$ is the power of the standard traction in the difference between the virtual velocity $u$ and the push-forward in $\mathcal{B}$ of the virtual referential rate of permutation of defects $w$, namely $Fw$. In this sense, it is a “relative” power. An analogous interpretation holds also for $(\partial_{D\varsigma} L)^{*}(\xi_{S^2}(\varsigma) - (D\varsigma)w)\cdot n$, which is now the “relative” power of $\mathcal{S}n$. Also, by definition, $Q$ is the same “relative” power of the canonical momentum $\partial_{\dot{x}} L$, and $m(\varsigma)$ the one of the gyroscopic inertial force $-\varsigma \times \dot{\varsigma}$ (see [7] for the interpretation of an analogous integral balance valid for abstract multi-field theories for condensed matter with complex substructural morphology, but not including the gyroscopic-like effects carried by $m(\varsigma)$).
9.4 Covariant Evolution of Interstitial Point Defects and Disclinations

9.4.1 Interstitial point defects

Up to this point, the defects have been considered “quenched”, i.e. fixed in space. Imagine now an interstitial impurity located at $\bar{X}$ in $\mathcal{B}_0$ that moves in $\mathcal{B}$ relatively to $\mathcal{B}$ itself. By means of $\tilde{x}^{-1}$ one may picture this motion in $\mathcal{B}_0$ as a sort of non-material motion of $\bar{X}$, characterized by velocity $\bar{w}$. From now on, $b_r$ will indicate a special part of $\mathcal{B}_0$, namely a sphere centered at $\bar{X}$. With reference to the definition of $f^{1}$ and $f^{2}$ in Section 9.3, it is assumed that:

(i) $\lim_{X\to\bar{X}} w(X) = \bar{w}$,
(ii) $u(\cdot)$ is continuous everywhere in $\mathcal{B}$,
(iii) there exist $r \in S^2$ and $\alpha \in T_r S^2$ such that $\lim_{X\to\bar{X}} \xi_{S^2}(\varsigma(X)) = \alpha$.

A basic assumption is that the sole dissipation mechanism present is due to the breaking of material bonds, a breaking necessary to allow the evolution of the point defect, and that the evolution is represented in $\mathcal{B}_0$ by the fictitious kinematics $t \mapsto \bar{X}(t)$. A force $\mathbf{f}$ then drives the defect at $\bar{X}$. It is additional to the system of standard and substructural (spin) interactions considered up to now and is power conjugated in $\mathcal{B}_0$ with the non-material kinematics $t \mapsto \bar{X}(t)$. It has also purely dissipative nature in the sense that,
Paolo Maria Mariano
when $\bar{w} \neq 0$,
$$\mathbf{f}\cdot\bar{w} \geq 0 \tag{9.57}$$
and the equality sign holds only when $\bar{w} = 0$. Of course, a solution to the inequality (9.57) is given by
$$\mathbf{f} = g(\bar{w})\,\bar{w} \tag{9.58}$$
with $g(\bar{w})$ a positive definite scalar function of constitutive nature.

Consider an arbitrary part $b$ of $\mathcal{B}_0$, including $\bar{X}(t)$ in its interior, so that there exist $r > 0$ and a ball $b_r \subset b$ such that $\partial b_r \cap \partial b = \emptyset$. For any field $a$ which takes values in a linear space and is possibly singular at $\bar{X}$, the integral of $a$ over $b$ is intended here in the limit sense
$$\int_{b} a(X)\,(dX)^3 = \lim_{r\to 0}\int_{b\setminus b_r} a(X)\,(dX)^3. \tag{9.59}$$
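The limit sense of (9.59) can be illustrated with a field that blows up at the defect but remains integrable. The sketch below takes $b$ to be the unit ball centred at the singular point and the hypothetical field $a(X) = |X|^{-2}$ (both are assumptions made only for this example), integrates over the ball with a small concentric ball excised, and lets the excised radius shrink.

```python
import math

def excised_integral(outer_radius, hole_radius, exponent, cells=20000):
    """Midpoint-rule integral of |X|**(-exponent) over hole_radius < |X| < outer_radius."""
    step = (outer_radius - hole_radius) / cells
    total = 0.0
    for i in range(cells):
        rho = hole_radius + (i + 0.5) * step
        shell_area = 4.0 * math.pi * rho * rho
        total += rho ** (-exponent) * shell_area * step
    return total

exponent = 2.0  # hypothetical singular field a(X) = |X|**-2
hole_radii = [1.0e-1, 1.0e-2, 1.0e-3, 1.0e-4]
values = [excised_integral(1.0, r, exponent) for r in hole_radii]

# Closed-form limit for the unit ball, valid for exponents smaller than 3.
limit = 4.0 * math.pi / (3.0 - exponent)
for r, value in zip(hole_radii, values):
    print(r, value, limit - value)
```

The gap to the closed-form limit shrinks in proportion to the excised radius. For an exponent of 3 or more the same loop would not settle, which is the situation excluded by presuming that the fields admit integrals in this sense.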
$Q$, $m(\varsigma)$ and $\mathfrak{F}$ are presumed to admit integrals over $b$ in the sense above. Moreover, consider maps $t \mapsto b(t)$ and $t \mapsto b_r(t)$ such that $b_r(t) \subset b(t)$ and $\partial b_r(t) \cap \partial b(t) = \emptyset$, to “follow” virtually the motion of the point defect (in fact the pre-image $\bar{X}$ in $\mathcal{B}_0$ of the point defect evolves in time). Points of $\partial b_r(t)$ have the same velocity as $\bar{X}$, namely $\bar{w}$, if one assumes that $b_r$ is centered at $\bar{X}$ at each instant. For any scalar field $a$ depending on space and time, by the standard transport theorem one then gets
$$\frac{d}{dt}\int_{b} a(X,t)\,(dX)^3 = \int_{b} \dot{a}(X,t)\,(dX)^3 - \lim_{r\to 0}\int_{\partial b_r} a(X,t)\,(\bar{w}\cdot n)\,d\mathcal{H}^2. \tag{9.60}$$
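The sign of the boundary term in (9.60) can be checked in a few lines. What follows is only a sketch, under two assumptions that are not spelled out in the text: the outer boundary $\partial b$ is held fixed over the time increment considered, and $n$ on $\partial b_r$ is the unit normal pointing out of $b_r$ (hence into the excised region), as in the divergence theorem used in the proof of Theorem 3.

```latex
% Reynolds transport theorem on the excised region b \ b_r(t),
% whose inner boundary moves with velocity \bar{w} and whose own
% outward normal on that inner boundary is -n.
\begin{align*}
\frac{d}{dt}\int_{b\setminus b_r(t)} a\,(dX)^3
  &= \int_{b\setminus b_r(t)} \dot{a}\,(dX)^3
   + \int_{\partial b_r(t)} a\,\bar{w}\cdot(-n)\,d\mathcal{H}^2 \\
  &= \int_{b\setminus b_r(t)} \dot{a}\,(dX)^3
   - \int_{\partial b_r(t)} a\,(\bar{w}\cdot n)\,d\mathcal{H}^2 .
\end{align*}
% Letting r -> 0 and reading the volume integrals in the sense of (9.59)
% gives the statement (9.60).
```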
Theorem 3 Let $b$ be an arbitrary part including in its interior the interstitial point defect located at $\bar{X}$ in $\mathcal{B}_0$. Let the assumptions above be valid and also
$$\frac{d}{dt}\int_{b} Q\,(dX)^3 - \int_{b} m(\varsigma)\,(dX)^3 + \int_{\partial b} \mathfrak{F}\cdot n\,d\mathcal{H}^2 + \mathbf{f}\cdot w = 0 \tag{9.61}$$
hold for any possible choice of $b$ including $\bar{X}$ as specified above. If $L$ is invariant under $f^{1}_{s_1}$, $f^{2}_{s_2}$ and $G$, for any $g \in G$ and $s_1, s_2 \in \mathbb{R}^{+}$, covariant pointwise balances for a point defect follow as in the list below:

1. The action of $f^{2}$ alone over $L$ implies:
$$\lim_{r\to 0}\int_{\partial b_r} P n\,d\mathcal{H}^2 = -\lim_{r\to 0}\int_{\partial b_r} (\rho_0\,\dot{x}\otimes\bar{w})\,n\,d\mathcal{H}^2. \tag{9.62}$$
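2. The action of $G$ alone over $L$ implies:
$$\lim_{r\to 0}\int_{\partial b_r} \mathcal{S} n\,d\mathcal{H}^2 = 0. \tag{9.63}$$

3. The action of $f^{1}$ alone implies:
$$\mathbf{f} = \lim_{b_r\to\bar{X}}\int_{\partial b_r} (\mathbb{P}_\varsigma - k_{rel} I)\,n\,d\mathcal{H}^2, \tag{9.64}$$
with $k_{rel} = \tfrac{1}{2}\rho_0 |\dot{x} - \tilde{w}|^2$, $\tilde{w} := F\bar{w}$.

Take note that the configurational counterpart of the gyroscopic inertia, namely $-D\varsigma^{*}(\varsigma\times\dot{\varsigma})$, has no influence on the motion of the interstitial point defect under the assumptions adopted here.

Proof Let $b$ and $b_r$ be selected around $\bar{X}$ as described above. By the divergence theorem,
$$\int_{\partial b} \mathfrak{F}\cdot n\,d\mathcal{H}^2 = \int_{b} \mathrm{Div}\,\mathfrak{F}\,(dX)^3 + \lim_{r\to 0}\int_{\partial b_r} \mathfrak{F}\cdot n\,d\mathcal{H}^2. \tag{9.65}$$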
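Then, by using (9.60) and (9.65), the relation (9.61) reduces to
$$\int_{b} (\dot{Q} + \mathrm{Div}\,\mathfrak{F} - m(\varsigma))\,(dX)^3 - \lim_{r\to 0}\int_{\partial b_r} Q\,(\bar{w}\cdot n)\,d\mathcal{H}^2 + \lim_{r\to 0}\int_{\partial b_r} \mathfrak{F}\cdot n\,d\mathcal{H}^2 + \mathbf{f}\cdot w = 0, \tag{9.66}$$
so that Theorem 1 implies
$$\lim_{r\to 0}\int_{\partial b_r} (\mathfrak{F}\cdot n - Q\,(\bar{w}\cdot n))\,d\mathcal{H}^2 + \mathbf{f}\cdot w = 0. \tag{9.67}$$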
If $f^{2}$ acts alone on $L$ (while the actions of $f^{1}$ and $G$ vanish identically), one finds
$$Q = \rho_0\,\dot{x}\cdot u, \qquad \mathfrak{F} = -P^{*}u \tag{9.68}$$
so that (9.67) reduces to
$$u\cdot\lim_{r\to 0}\int_{\partial b_r} (\rho_0\,\dot{x}\otimes\bar{w} + P)\,n\,d\mathcal{H}^2 = 0 \tag{9.69}$$
so that (9.62) follows thanks to the arbitrariness of $u$. When $G$ acts alone, it follows that
$$Q = 0, \qquad \mathfrak{F} = -\mathcal{S}^{*}\xi_{S^2}(\varsigma), \tag{9.70}$$
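and (9.67) reduces to
$$\alpha\cdot\lim_{r\to 0}\int_{\partial b_r} \mathcal{S} n\,d\mathcal{H}^2 = 0 \tag{9.71}$$
so that (9.63) follows due to the arbitrariness of $\xi$ generating $\alpha$ in the limit sense described at the beginning of this section. When $f^{1}$ acts alone, one gets
$$Q = -\rho_0\,\dot{x}\cdot Fw, \tag{9.72}$$
$$\mathfrak{F} = Lw + P^{*}(Fw) + \mathcal{S}^{*}(D\varsigma)w = \tfrac{1}{2}\rho_0|\dot{x}|^2 w - \mathbb{P}_\varsigma^{*}w - ww. \tag{9.73}$$
In this case (9.67) reduces to
$$\mathbf{f}\cdot w + \lim_{r\to 0}\int_{\partial b_r} \rho_0\,(\dot{x}\cdot Fw)(\bar{w}\cdot n)\,d\mathcal{H}^2 + \lim_{r\to 0}\int_{\partial b_r} \left(\tfrac{1}{2}\rho_0|\dot{x}|^2 w - \mathbb{P}_\varsigma^{*}w\right)\cdot n\,d\mathcal{H}^2 = 0. \tag{9.74}$$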
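The term $\lim_{r\to 0}\int_{\partial b_r} ww\cdot n\,d\mathcal{H}^2$ is not present because the integrand is continuous over $\mathcal{B}_0$. Since $w \to \bar{w}$ as $X \to \bar{X}$, as assumed above, and $\bar{w}$ is arbitrary, (9.74) must be valid under the transformation
$$\bar{w} \to -\bar{w} \tag{9.75}$$
so that, thanks to the arbitrariness of $\bar{w}$, (9.74) becomes
$$\mathbf{f} = \lim_{r\to 0}\int_{\partial b_r} \left(\mathbb{P}_\varsigma n - \left(\tfrac{1}{2}\rho_0|\dot{x}|^2 - \rho_0\,\dot{x}\cdot\tilde{w}\right) n\right) d\mathcal{H}^2, \tag{9.76}$$
where $\tilde{w} := F\bar{w}$ is the limiting value of $Fw$ as $X \to \bar{X}$. Equation (9.64) follows by taking into account that
$$\tfrac{1}{2}\rho_0|\dot{x}|^2 - \rho_0\,\dot{x}\cdot\tilde{w} = k_{rel} - \tfrac{1}{2}\rho_0|\tilde{w}|^2 \tag{9.77}$$
and
$$\lim_{r\to 0}\int_{\partial b_r} \rho_0|\tilde{w}|^2\,n\,d\mathcal{H}^2 = \rho_0|\tilde{w}|^2 \lim_{r\to 0}\int_{\partial b_r} n\,d\mathcal{H}^2 = 0. \tag{9.78}$$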
A result similar to (9.64) has been obtained with a different procedure in Ref. [23] in the case of nematic liquid crystals in absence of bulk deformations and inertia.
Of course, $\bar{w}$ is different from zero only when the driving force $\mathbf{f}$ overcomes a certain threshold. In this case (9.64) becomes the evolution equation of the point defect, given by
$$g(\bar{w})\,\bar{w} = \lim_{b_r\to\bar{X}}\int_{\partial b_r} (\mathbb{P}_\varsigma - k_{rel} I)\,n\,d\mathcal{H}^2. \tag{9.79}$$
Let $e$ be a unit vector attached at $\bar{X}$ so that $\bar{w} = \bar{w}e$, with $\bar{w}$ on the right-hand side denoting the scalar amplitude. By varying $e$ within $S^2$, one may find different strength of the material around $\bar{X}$ due to the inhomogeneity properties induced by the “quenched” disorder. A real function over $S^2$, namely
$$\mathcal{F}: S^2 \to \mathbb{R}^{+}, \tag{9.80}$$
may then describe the “strength” around $\bar{X}$. Following standard usage, one says that $\mathbf{f}$ is:

(a) Subcritical when $\mathbf{f}\cdot e < \mathcal{F}(e)$ for all $e \in S^2$.
(b) Critical when there exists some $e \in S^2$ such that $\mathbf{f}\cdot e = \mathcal{F}(e)$, while the subcritical state holds in all other directions.
(c) Supercritical when there exists some $e \in S^2$ such that $\mathbf{f}\cdot e > \mathcal{F}(e)$.

Of course, when there exists a unique $e$ along which $\mathbf{f}$ is supercritical, the point defect moves along $e$ itself. Moreover, by denoting by $f$ the product $\mathbf{f}\cdot e$, the dissipation $D$ along the motion is given by
$$D(f, e) = (\mathbf{f}\cdot e)\,\bar{w} = g(\bar{w})\,\bar{w}^2. \tag{9.81}$$
Also, $\bar{w}$ itself is determined by $f$ and $e$, so that one gets
$$\bar{w} = \tilde{w}(f, e), \tag{9.82}$$
with $\tilde{w}(\cdot,\cdot)$ such that $\tilde{w}(\cdot, e)$ is a strictly increasing function of $f$. Of course, in isotropic conditions $\bar{w}$ has the form $\bar{w} = \tilde{w}(f)$.

It may occur that there exist “many” $e$'s along which $\mathbf{f}$ is supercritical. In this case, the natural selection criterion is to say that the motion of the point defect is activated along the direction in which $\mathbf{f}$ is supercritical and the dissipation is maximized, so that, once $\mathbf{f}$ is fixed, one finds $e$ as the argument which maximizes the dissipation, namely
$$\max_{e\in S^2}\{D(f, e) \mid \mathbf{f}\cdot e > \mathcal{F}(e)\}. \tag{9.83}$$
It is immediate to realize that in isotropic conditions, due to the properties of $\tilde{w}(\cdot)$, $e$ is such that
$$\mathbf{f}\cdot e > \mathbf{f}\cdot s, \quad \forall s \in S^2,\ s \neq e. \tag{9.84}$$
Of course, $S^2$ above is a copy of the unit sphere different from the one selected as the manifold of substructural shapes.
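The selection criterion for the direction of motion lends itself to a direct numerical sketch: sample the unit sphere of directions, discard those along which the driving force does not exceed the strength, and keep the direction of largest dissipation. The constant strength function, the linear mobility law and the Fibonacci sampling below are hypothetical choices made only to have something runnable; the text prescribes none of them beyond monotonicity of the mobility in the scalar driving force.

```python
import math

def fibonacci_sphere(count):
    """Roughly uniform sample of unit vectors on the sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(count):
        z = 1.0 - 2.0 * (i + 0.5) / count
        ring = math.sqrt(1.0 - z * z)
        points.append((ring * math.cos(golden_angle * i),
                       ring * math.sin(golden_angle * i),
                       z))
    return points

def select_direction(force, strength, mobility, directions):
    """Return (e, D) maximising D = (f . e) * w over supercritical e, or None."""
    best = None
    for e in directions:
        drive = sum(fk * ek for fk, ek in zip(force, e))
        if drive <= strength(e):
            continue  # subcritical or critical along e: no motion activated
        speed = mobility(drive, e)
        dissipation = drive * speed
        if best is None or dissipation > best[1]:
            best = (e, dissipation)
    return best

# Hypothetical constitutive choices, for illustration only.
def strength(e):
    return 1.0  # isotropic threshold

def mobility(drive, e):
    return drive - strength(e)  # strictly increasing in the scalar drive

directions = fibonacci_sphere(2000)
strong = select_direction((0.0, 0.0, 3.0), strength, mobility, directions)
weak = select_direction((0.0, 0.0, 0.5), strength, mobility, directions)
print(strong, weak)
```

With the isotropic choices above, the selected direction is the sampled one closest to the direction of the force, in agreement with the isotropic statement; an anisotropic `strength` would in general tilt the selected direction away from the force.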
9.4.2 Disclination lines

A disclination is a discontinuity line $l$ for the fields involved where the spin field is not defined or, alternatively, may assume point by point the entire $S^2$ as value. $l$ is a simple curve parametrized by arc length $s \in [0, \bar{s}]$ and represented by a point-valued $C^2$ mapping $\tilde{Z}: [0, \bar{s}] \to \mathcal{B}_0$, so that the derivative $Z_{,s}$ of $\tilde{Z}$ with respect to $s$ at $Z = \tilde{Z}(s)$ is the tangent vector $t = \tilde{t}(s)$ at $Z$. The disclination moves in the current place $\mathcal{B}$ of the body, relatively to $\mathcal{B}$ itself. As in the case of the point defect, one may picture in $\mathcal{B}_0$ the motion of the disclination by means of the inverse mapping $\tilde{x}^{-1}$, which describes a non-material motion of $l$ in $\mathcal{B}_0$, characterized by an incoming velocity $\bar{w}(s)$, given then by a vector field along $l$ in $\mathcal{B}_0$.

Assumptions (i)–(iii) in Section 9.4.1 apply. In particular, one needs only to adapt slightly (i) to the situation envisaged here by assuming that $\lim_{X\to\tilde{Z}(s)} w(X) = \bar{w}(s)$ for any $s \in [0, \bar{s}]$, and also (iii) by saying that at each $s$ there exist $r(s) \in S^2$ and $\alpha(s) \in T_{r(s)}S^2$ such that $\lim_{X\to\tilde{Z}(s)} \xi_{S^2}(\varsigma(X)) = \alpha(s)$.

A special class of parts $b_{l,r}$ of $\mathcal{B}_0$ is helpful in subsequent calculations. Each representative $b_{l,r}$ of this class is a “curved” cylinder wrapped around $l$, a cylinder obtained by translating a disk $D_r$ of radius $r$ from $\tilde{Z}(s_1)$ to $\tilde{Z}(s_2)$, two arbitrary points of $l$ with $s_1 < s_2$, maintaining $D_r$ orthogonal to $t$ at each $Z = \tilde{Z}(s)$ and the center of $D_r$ coincident with $Z$. Consider two coaxial “curved” cylinders $b_{l,r_1}$ and $b_{l,r_2}$ with $r_1 > r_2$. Even in this case, for any field $a$ which takes values in a linear space and is possibly singular at $l$, the integral of $a$ over $b_{l,r_1}$ is intended here in the limit sense
$$\int_{b_{l,r_1}} a(X)\,(dX)^3 = \lim_{r_2\to 0}\int_{b_{l,r_1}\setminus b_{l,r_2}} a(X)\,(dX)^3. \tag{9.85}$$
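The geometry of the curved cylinder can be made concrete with a line whose arc-length parametrization is known in closed form. The sketch below uses a circular helix as a hypothetical disclination line (an assumption made only for illustration), returns the unit tangent together with two unit vectors spanning the orthogonal disk, and samples the boundary circle of the disk at a given arc length.

```python
import math

def helix_frame(s, radius=1.0, pitch=0.5):
    """Point, unit tangent and two unit normals of a helix at arc length s."""
    c = math.hypot(radius, pitch)  # makes s the arc length
    angle = s / c
    point = (radius * math.cos(angle), radius * math.sin(angle), pitch * angle)
    tangent = (-radius * math.sin(angle) / c, radius * math.cos(angle) / c, pitch / c)
    normal = (-math.cos(angle), -math.sin(angle), 0.0)
    binormal = (tangent[1] * normal[2] - tangent[2] * normal[1],
                tangent[2] * normal[0] - tangent[0] * normal[2],
                tangent[0] * normal[1] - tangent[1] * normal[0])
    return point, tangent, normal, binormal

def disk_boundary(s, r, samples=12):
    """Points of the circle of radius r centred on the line and orthogonal to it."""
    point, _, normal, binormal = helix_frame(s)
    ring = []
    for j in range(samples):
        phi = 2.0 * math.pi * j / samples
        ring.append(tuple(point[k] + r * (math.cos(phi) * normal[k]
                                          + math.sin(phi) * binormal[k])
                          for k in range(3)))
    return ring

# Lateral surface of a curved cylinder between two arc lengths, as a point cloud.
tube = [q for i in range(20) for q in disk_boundary(0.2 + 0.05 * i, 0.05)]
print(len(tube))
```

Shrinking the radius `r` while keeping the two end values of the arc length fixed reproduces the family of nested coaxial cylinders used in the limit of the integral.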
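A force field
$$\tilde{\mathbf{f}}: [0, \bar{s}] \ni s \mapsto \mathbf{f} = \tilde{\mathbf{f}}(s) \in \mathbb{R}^3 \tag{9.86}$$
drives $l$. It is power-conjugated with $\bar{w}(\cdot)$ and has also purely dissipative nature, so that at each $s$ we have $\tilde{\mathbf{f}}(s)\cdot\bar{w}(s) \geq 0$, with the equality sign holding when $\bar{w}(s) = 0$. A solution of the inequality is $\mathbf{f} = g(\bar{w})\,\bar{w}(s)$, with $g(\bar{w})$ a positive scalar function, as in (9.58).

Theorem 4 Let $b_{l,r}$ be an arbitrary curved cylinder wrapped around $l$ as described above. Let the assumptions above be valid and also
$$\frac{d}{dt}\int_{b_{l,r}} Q\,(dX)^3 - \int_{b_{l,r}} m(\varsigma)\,(dX)^3 + \int_{\partial b_{l,r}} \mathfrak{F}\cdot n\,d\mathcal{H}^2 + \int_{s_1}^{s_2} \mathbf{f}\cdot\bar{w}\,ds = 0 \tag{9.87}$$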
hold for any possible choice of $b_{l,r}$. If $L$ is invariant under $f^{1}_{s_1}$, $f^{2}_{s_2}$ and $G$, for any $g \in G$ and $s_1, s_2 \in \mathbb{R}^{+}$, covariant pointwise balances for a disclination line follow as in the list below.

1. The action of $f^{2}$ alone over $L$ implies at each $s \in [0, \bar{s}]$
$$\lim_{r\to 0}\int_{\partial D_r} P n\,d\mathcal{H}^2 = -\lim_{r\to 0}\int_{\partial D_r} (\rho_0\,\dot{x}\otimes\bar{w})\,n\,d\mathcal{H}^2. \tag{9.88}$$

2. The action of $G$ alone over $L$ implies at each $s \in [0, \bar{s}]$
$$\lim_{r\to 0}\int_{\partial D_r} \mathcal{S} n\,d\mathcal{H}^2 = 0. \tag{9.89}$$

3. The action of $f^{1}$ alone implies at each $s \in [0, \bar{s}]$
$$\mathbf{f} = \lim_{r\to 0}\int_{\partial D_r} (\mathbb{P}_\varsigma - k_{rel} I)\,n\,d\mathcal{H}^2, \tag{9.90}$$
with $k_{rel}(s) = \tfrac{1}{2}\rho_0|\dot{x}(s) - \tilde{w}(s)|^2$, $\tilde{w}(s) := F\bar{w}(s)$.

The proof follows the same path as that of Theorem 3. Since here $b_{l,r} = D_r \times [s_1, s_2]$, one takes into account in addition just the arbitrariness of the interval $[s_1, s_2]$. Remarks about the need to overcome some material threshold to activate the motion of the disclination also apply as in Section 9.4.1. They are not rewritten here for the sake of brevity.

ACKNOWLEDGEMENTS

The support of the Italian National Group of Mathematical Physics (GNFM INDAM) and of MIUR under the grant 2005085973- “Resistenza e degrado di interfacce in materiali e strutture” – COFIN 2005 is acknowledged.
REFERENCES

1. R. D. James and D. Kinderlehrer, Frustration in ferromagnetic materials, Cont. Mech. Thermodyn., 2 (1990), 215–239.
2. D. Sherrington and S. Kirkpatrick, Solvable model of a spin-glass, Phys. Rev. Lett., 35 (1975), 1792–1796.
3. M. Mézard, G. Parisi, and M. A. Virasoro, Spin Glass Theory and Beyond, World Scientific Publishing, Singapore, 1987.
4. M. Talagrand, Spin Glasses: A Challenge for Mathematicians, Springer Verlag, Berlin, 2003.
5. G. Capriz, Continua with Microstructure, Springer Verlag, Berlin, 1989.
6. P. M. Mariano, Multifield theories in mechanics of solids, Adv. Appl. Mech., 38 (2002), 1–93.
7. C. de Fabritiis and P. M. Mariano, Geometry of interactions in complex bodies, J. Geom. Phys., 54 (2005), 301–323.
8. P. M. Mariano, Mechanics of quasi-periodic alloys, J. Nonlinear Sci., 16 (2006), 45–77.
9. M. Giaquinta and G. Modica, On sequences of maps with equibounded energies, Calc. Var. Par. Diff. Eq., 12 (2001), 213–222.
10. M. Giaquinta, G. Modica, and J. Souček, Cartesian currents in the calculus of variations, Vol. I and II, Springer Verlag, Berlin, 1998.
11. B. I. Halperin and W. M. Saslow, Hydrodynamic theory of spin waves in spin glasses and other systems with noncollinear spin orientations, Phys. Rev. B, 16 (1977), 2154–2162.
12. I. E. Dzyaloshinskii and G. E. Volovik, Poisson brackets in condensed matter physics, Ann. Phys., 125 (1980), 67–97.
13. D. D. Holm and B. A. Kupershmidt, The analogy between spin glasses and Yang-Mills fluids, J. Math. Phys., 29 (1988), 21–30.
14. N. D. Mermin, The topological theory of defects in ordered media, Rev. Mod. Phys., 51 (1979), 591–648.
15. Y. N. Obukhov, On a model of unconstrained hyperfluid, Phys. Lett. A, 210 (1996), 163–167.
16. Y. N. Obukhov and R. Tresguerres, Hyperfluids – a model of classical matter with hypermomentum, Phys. Lett. A, 184 (1993), 17–22.
17. P. M. Mariano, Cancellation of vorticity in steady-state non-isentropic flows of complex fluids, J. Phys. A: Math. Gen., 36 (2003), 9961–9972.
18. J. E. Marsden and T. J. R. Hughes, Mathematical Foundations of Elasticity, Prentice Hall, Dover edition, London, 1994.
19. G. Capriz and P. M. Mariano, Symmetries and Hamiltonian formalism in complex materials, J. Elasticity, 72 (2003), 57–70.
20. J. E. Marsden and T. S. Ratiu, Introduction to Mechanics and Symmetry, Springer Verlag, New York, 1999.
21. A. Yavari, J. E. Marsden, and M. Ortiz, On spatial and material covariance laws in elasticity, J. Math. Phys., 47, 042903-1–042903-53.
22. R. D. James, Configurational forces in magnetism with application to the dynamics of a small scale ferromagnetic shape-memory cantilever, Cont. Mech. Thermodyn., 14 (2002), 56–86.
23. E. C. Gartland, A. M. Sonnet, and E. G. Virga, Elastic forces on nematic point defects, Cont. Mech. Thermodyn., 14 (2002), 307–319.
Index
2D cell-periodic meshless shape function, 173 2D periodicity, 169–170 2D photonic crystals, 183, 194 band structures, 185–193, 195 Maxwell equations, 183–185 2D triangular photonic crystals, deformation, 196–198 n = 0 rule, 292 Ab initio calculations, 176 Acoustic band structure, 165 Acoustic bandgap materials, 194–195 Affine bodies, 80–82 affine–affine model, 119, 121, 134, 135, 137, 146, 149, 154, 155 affine mapping, 83–84 affine–metric model, 119, 121, 123, 135, 136, 138, 149, 154 affine space, 83 dimension, 84 material, 86–87 physical, 86 application, 81 Cartesian coordinates, 93 centre of mass, 93–95, 108 configuration space, 86, 87 geometric models, 89–92 degrees of freedom, 80, 81, 82, 87, 107, 114, 120, 123, 127, 131, 133, 148, 152 discrete, 92–93 double-isotropic d’Alembert model, 149 Eulerian coordinates, 93, 94 extended, 89 frames, 84–85 Haar measures, 126–128, 146, 148, 153 Hilbert spaces, 126, 130
invariants, basic system of, 99–100 kinematical concepts, 101–106, 108–109 kinetic energies, 112–119, 123, 125, 137 and kinetic Hamiltonians, expressions, 115, 117 Lagrangian coordinates, 93 Lebesgue measure, 127–128, 130, 131, 155 left-acting transformation, 85–87 Legendre transformation, 112, 119 metric–affine model, 138, 154, 155 metric tensor, role, 115 metrical concepts, 95–99 operators, 126, 129 Poisson brackets, 106–107, 109, 120, 122–123, 129 polar decomposition, 101 symmetry problems, 109–110, 111 poles, 93, 94–95 powder of material points, 86, 87 quantization, 81, 125 Riemannian structure, 116, 125 right-acting transformation, 87–89, 90–91 spatial transformation, 90 structureless space, 91 tensors, 95–99 transformation actions, 87–89 two-polar decomposition, 101 symmetry problems, 109, 110 Agglomerate growth modelling, 232–233 governing equations agglomerate expansion and microspheres growth, 214–215 energy balance, 215–216 liquid flow, 215 liquid monomer balance, 213 microspheres, density, 213
Agglomerate growth modelling (Continued) porosity, constancy, 212–213 solid mass balance, 214 initial and boundary conditions, 216–217 Asaro–Tiller–Grinfeld–Srolovitz (ATGS) instability, 295–296, 299 Atomic crystals and semiconductors, 174 compound semiconductors, strain effect, 180–183 empirical pseudopotentials, of Si and GaAs, 178–180 Galerkin formulation, of Schrödinger equation, 175–176 Kronig–Penney model potential, 177–178 Atomistic modeling methods of electron confinement in semiconductor quantum dots, 305–307 of semiconductor nanostructures, 293–295 Augmented plane wave (APW) method, 164 Band structures, 163, 185 acoustic band structure, 165, 166 circular rods square lattices, 191, 192, 193 triangular lattices, 191–192, 193 deformed photonic crystals, 198–200 dielectric veins square lattice, 189–191 electromagnetic Kronig–Penney problem, 187–189 electronic band structure, 164 homogeneous square lattice, 186–187 photonic band structure, 165 Bionanotechnology, of semiconductor nanostructures, 287–288 Boltzman–Grad limit, 67 Boltzmann–Enskog equation, 72–74 Boltzmann equation, 2, 6, 30, 34, 68, 72, 74, 77
Bottom-up/self-assembly methods, 288–289 Braginskii’s model, 3, 4 Capriz, G., 238 Cauchy–Green stretch tensors, 217, 218, 268, 269, 270 Cauchy tensor, 95–96, 97, 99 Centre of mass, 93, 94, 108, 120, 122, 135, 240 Cercignani, C., 63 Circular rods square lattice, band structure, 191, 192, 193 triangular lattice, band structure, 191–192, 193 Collision dynamics, of rough spheres, 69–72 Collision frequency, 20, 25, 31, 36, 38, 40, 43 Collision operators dimensionless, 31 electron–electron, 6, 12, 19, 24–25, 45 electron–ion, 6, 12, 19, 45 expanded, see expanded collision operators interspecies, 34–40 ion–ion, 6, 12 properties energy conservation, 12 entropy inequalities, 12 mass conservation, 11, 13, 40 momentum conservation, 12 thermal equilibria, 12 Complex bodies, 315, 316, 319, 325, 326 Compound semiconductors, strain effect in, 180–183 Constrained mixture, 272 Continuous cooling diagram, 268 Continuum modeling methods, in semiconductor nanostructure, 293 atomistic surface features, 297–298 in phase separation, 302–303 in quantum dots and wires, 307–308 quasicontinuum method, 309–310
for sputter-erosion instability, 299–301 stress and strain analysis, 293 for stress coupling, 302 Coulomb logarithm, 10, 11 Critical function, 331 Crystallization, 262, 265, 266, 267, 271, 276, 279, 281 Darcy’s equation, 209, 215 Deformation gradient, 217, 268, 269 Deformations, 70, 81, 95, 96, 100 2D triangular photonic crystals, 196–198 deformed photonic crystals, band structures, 198–200 Degond, P., 1 Dielectric veins, square lattice band structure, 189–191 Diffusion approximation, 2 Dimensionless collision operators, 31 Disclinations, 320, 332–333 Dislocation strain effects, in thin films, 304–305 Disparate masses, 2, 3, 4, 5, 6, 17, 24, 57 Drift-diffusion model, 2, 5 Drift velocity, 29, 303 Einstein relation, 23 Electromagnetic Kronig–Penney problem band structure, 187–189 Electron density, 52 Electron–electron collision operator, 19, 24–25, 45 Electron–ion collision operator, 6, 19, 34, 38, 45 Electron momentum relaxation rate, 26, 27, 29 Electronic band structures, 164, 175 Empirical pseudopotentials, of Si and GaAs, 176, 178–180, 303, 306 Energy conservation in collision operators, 12 in expanded collision operators, 40–42 in LB , 50 Energy flux, 52
Energy-transport system, 3, 4, 5, 18–21, 22, 23, 24 Enskog equation, see Boltzmann–Enskog equation Entropy inequalities in collision operators, 12 in expanded collision operators, 41 in LB , 50–51 Entropy production rate, 266, 270, 273, 274, 277, 278, 280, 284 due to crystallization, 276, 280 due to glass transition, 278, 280 Euler deformation tensors, 96 Expanded collision operators properties energy conservation, 40–41 entropy inequality, 41 equilibria, 42 mass conservation, 40 momentum conservation, 40 Extremum principle, 317, 319, 323 Fabrication methods, of semiconductor nanostuctures bottom-up/self-assembly methods, 288–289 top-down/lithographic methods, 288 Fasano,A., 206 Flow-induced crystallization, 262, 265, 281 Fluxes and collision terms, computation, 49–57 collision terms, expression, 56 fluxes computation, 52–54 expression, using ne and Te , 54–56 LB , properties, 50–51 preliminaries, 49–50 Fokker–Planck (FP) equation, see spherical harmonics expansion model Fokker–Planck–Landau equation, 7, 38 Fourier-Fick law, 23 Fragmentation, 209 Free-standing quantum dots, 287
Galerkin formulation, of Schrödinger equation, 175–176 Gallium arsenide empirical pseudopotentials, 178–180, 181 Gas mixture, 3, 6, 7, 34 moment method and conservation laws, 11–18 Gases and granular materials, microscopic foundations Boltzmann–Enskog equation, 72–74 macroscopic balance equations, 74–76 rough spheres, collision dynamics, 69–72 smooth spheres, kinetic theory, 65–69 Gauge fields, 320–321 Gibbs measure, 315 Glass transition, 266, 272, 280 Granular materials, 63, 238 Green tensor, 95–96, 100, 118, 120, 219 Gyration, 102 Gyroscope, 80–81, 110 Haar measures, 126–127, 128, 146, 153 Heisemberg spins and balance equations, 316–321 Hilbert spaces, 126, 130–131, 139 Hydrodynamic equations, 3, 17, 18, 21–22, 23, 24 Incompressibility, 121, 322 Inertial tensor, 97, 113 Institute of Fundamental Technological Research, 80 Interaction force, 219, 220, 223 Interspecies collision operators, expansion, 5, 34–40 Interstitial point defects, covariant evolution, 327–332 Isotactic polypropylene (iPP), 263 Johnson, H. T., 284 Jun, S., 163 k · p Hamiltonian method, 308 Kannan, K., 206, 262 Kinetic energy tensor, 243
Kinetic energy theorem, in tensorial form, 256, 260 Kinetic model, 6–11 scaling units, 35, 42 Kinetic theory of gases, 240 of smooth spheres, 65–69 Knudsen number, 3–4, 15, 57 Kohn–Sham equation, 176 Kronig–Penney model potential, 176, 177–178 LB , 316, 318, 321, 323 properties, 50–51, 57–58 Lagrange deformation tensor, 96 Lagrangian approach, in pseudofluids, 259–260 Landau–Lifshitz–Gilbert equation covariance, 316, 326 Legendre transformation, 112, 119 Lie group, 81, 113, 124, 126, 316, 324 Liu,W. K., 163 Local thermodynamical equilibrium (LTE), 3, 14–17 Lorentz force, 6–7, 33, 34 Lorentz model, 36 Macroscale growth modelling, see agglomerate growth modelling Macroscopic balance equations, 74–76 Mancini,A., 206 Mariano, P. M., 314 Mass conservation in collision operators, 11 in expanded collision operators, 40 in LB , 50 Material affine space, 86 Material transformation, 86 Mathematical modeling, 210, 211, 235–236 Meshless approximation, 167, 171, 173, 174, 176 Meshless methods, 166, 168 2D photonic crystals, band structures, 185–186
energy level of GaAs, 180, 181 for Kronig–Penney model, 177–178 of Si, 179 of strained Si1−x Gex , 180–183 homogeneous square lattice, band structures, 186–187 and moving least-square basis, 166 and PW method, 189, 190–191, 193, 195 Micromorphic continuum, 80, 81 Microspheres growth modelling, 217, 233 boundary conditions, consistency, 228–231 governing equations, 218 constitutive equations, 220–222 mass balance, 219 momentum balance, 219 initial and boundary conditions, 222–224 kinematics, 217–218 microscale with spherical symmetry, equations analysis, 224–228 Mixture model application, 17–18 MLS approximation, 168–169 Modeling methods, of semiconductor nanostuctures, 293 atomistic modeling methods, 293–294 continuum modeling methods, 293 multiscale modeling methods, 294–295 Molecular dynamics, 169, 298, 301 Molten polymers, solidification kinematics, 268–270 modeling after initiation of solidification, 271–281 prior to initiation of solidification, 270–271 Moment matrix, 171–172 Moment method and conservation laws for gas mixtures, 11–18 for plasmas, 40–49 Momentum conservation in collision operators, 12 in expanded collision operators, 40
Moving least-square (MLS) basis atomic crystals and semiconductors, 174–183 for band-structure calculations, of natural and artificial crystals, 163 and meshless methods, 166 and periodicity, 166–167, 167–174 implementation, of periodicity condition, 169–174 MLS approximation, 168–169 phoXonic crystals, 183–195 strain-tunable photonic bandgap materials, 195–200 Mullins-type evolution equation, 297 Multifield theories, 326, 327 Multigrain model, 210 Multiscale coupled mechanical/electronic modeling in semiconductor nanostructure, 309–310 Multiscale modeling methods, in semiconductor nanostructure, 294–295 Nanophotonics, in semiconductor nanostuctures, 286–287 Natural configuration, 220, 222, 234–235, 236, 266, 267 Navier–Stokes system, 3–4, 23–24 Non-viscous compressible spin fluids, 321–322 Nylon-6, 264 Onsager relation, 4, 20, 22, 53 Optical analogy, of semiconductor, 165 Optoelectronics, of semiconductor nanostuctures, 286–287 Parallel transport, stress effects on, 303 Peculiar velocities, 239 Periodicity and moving least-square basis, 166–167, 167 Photonic band structure, 165, 198–200 Photonic bandgap materials, 195
Photonic crystal, 165 2D photonic crystal band structures, 185–193 Maxwell equations, 183–185 2D triangular photonic crystals, 196–198 deformed, 198–200 PhoXonic crystals, 183 2D phonic crystals, 183 band structures, 185–193 Maxwell equations, 183–185 acoustic bandgap materials, 194–195 Physical affine space, 86 Plane wave (PW)-based methods, 164, 165 Plasma fluid model, 18 applications, 22–24 diffusion matrices, approximate expression, 24–29 electron–electron collision, 24–25 magnetized case, 27–29 non-magnetized case, 25–27 energy-transport form, 18–21 hydrodynamic form, 21–22 Plasma moment system, closure, 44–49 Plasmas, 3, 5, 6, 7, 11, 22, 24, 40, 44, 57, 289 moment method and conservation laws, 40–49 Plasmas and disparate mass gaseous binary mixtures, 1 fluxes and collision terms, computation, 49–57 interspecies collision operators, expansion, 34–40 kinetic model, 6–11 moment method and conservation laws for gas mixtures, 11–18 for plasmas, 40–49 plasma fluid model, 18–29 scaling hypotheses, 29–34 Polyethylene, 207–208 Polyethylene terephthalate (PET), 263, 264, 265 Polymer melt, 262, 266–267
Polymer production rate, at catalyst surface, 210–211 Polypropylene, 207–208 Post-Avrami crystallization, 281 Prepolymerization, 209, 210 Primary crystallization, 281 Primary kinetic fields, 242 Pseudofluids, 238 balance equations, 255–257 basic fields, 242–245 boundary conditions, 258–259 deformation and distorsion measures, 245–248 inertial measures, 250–252 Lagrangian approach, 259–260 material element, 239–242 strain rates and distorsion rates, 248–250 thermal concepts, relations with, 252–255 Pseudopotentials, 176, 178–180, 303, 306 Quantization, 81, 125, 134 Quantum nanostructures, 286 Quantum wires and dots, 284 in nanoelectronics and quantum computing, 287 quantum confinement, 305–308 see also semiconductor nanostructures Quasicontinuum method, 294, 309, 310 Quenching, 263, 265, 267 Radial expansion velocity, 210 Rajagopal, K. R., 206, 262 Real-space techniques, 164, 165, 166, 167 Reynold’s tensor, 243–244 Rosseland approximation, 2 Rough spheres, collision dynamics, 69–72 Scaling hypotheses, 29–34 Scaling units, 30, 31, 42, 43, 48 kinetic model, 35 macroscopic quantities, 44 new macroscopic quantities, 49
Scattering direction, 8 Scattering potential, 304–305 Schrödinger equation, 141, 146, 167, 175, 292, 307 Galerkin formulation, 175–176 Secondary crystallization, 262, 277, 279, 281 Semiconductor nanostructures, 284 applications, 286 bionanotechnology, 287–288 nanoelectronics and quantum computing, 287 optoelectronics and nanophotonics, 286–287 background, 285–286 electronic and optical properties, 290–292 electronic and optical properties, stress effects on, 303 multiscale coupled mechanical/electronic modeling, 309–310 parallel transport, in thin films, 303–305 quantum confinement, on wires and dots, 305–308 fabrication methods, 288 bottom-up/self-assembly methods, 288–289 top-down/lithographic methods, 288 formation, stress effects on, 295 compositional segregation, in thin films, 301–303 sputter-erosion instability, 299–301 stress-induced surface self-assembly modeling, 295–299 mechanical properties, 290 modeling methods, 293 atomistic modeling methods, 293–294 continuum modeling methods, 293 multiscale modeling methods, 294–295 Semicrystalline polymer, 267, 271
Semiempirical pseudopotential density functional theory method, 306 Sigmund mechanism, 299 Silicon, 167, 178, 196, 295 empirical pseudopotentials, 178–180 strained silicon, 304 Sławianowski, J. J., 80 Small angle X-ray scattering, (SAXS), 264 Smooth spheres, kinetic theory, 65–69 Solidification, 265, 270, 271, 273 molten polymers, see molten polymers, solidification Spatial transformation, 86, 90 Spherical harmonics expansion (SHE) model, 5 Spin fluids perfect compressible spin fluid, 321 isentropic flow, 321 perfect incompressible spin fluid, 322 vorticity, 322 Spin glasses, 314, 315, 316, 320, 326 Spin structure, continua with, 314 disclination lines, covariant evolution, 332–333 Heisemberg spins and balance equations, 316–321 interstitial point defects, covariant evolution, 327–332 non-viscous compressible spin fluids, 321–322 referential description, 323–327 Square lattice, 186, 189, 191, 194, 315 band structure, 186 circular rods, 191, 192, 193 dielectric veins, 189–191 homogeneous media, 186–187 homogeneous square lattice, 186–187 Strain, 197, 200, 202, 248, 296, 308 dislocation strain effect, in thin films, 304–305 effect, in compound semiconductors, 180–183
Strain-tunable photonic bandgap materials, 195 2D triangular crystals, deformation, 196–198 deformed photonic crystals, band structures, 198–200 Strained silicon, 304 Stress effects modeling, semiconductor nanostructure, 295 atomistic modeling, of quantum confinement in semiconductor quantum dots, 305–307 in compositional segregation, in thin films, 301 continuum model, 302 phase separation, in real III–V semiconductor materials, 302–303 continuum modeling, of electron confinement in quantum dots and wires, 307–308 on parallel transport, in thin films, 303 dislocation strain effects, 304–305 strained silicon, 304 in sputter-erosion instability, 299 atomistic calculations and continuum methods, connections, 301 continuum models, 299–301 mechanistic understanding, 299 see also semiconductor nanostructure Stress-induced surface self-assembly modeling, 295–299 all-atomistic modeling, 298–299 stressed surface instability, mechanism, 295–296 surface evolution models, 296–297 with atomic features, 297–298 Stresses, 223–224, 265 Structured media, 80 Subcritical function, 331 Supercritical function, 331 Tensor moment, of momentum, 243 Thermal equilibria in collision operators, 12–13
Thermal history, on solidification of molten polymers, 262 kinematics, 268–270 modeling after initiation of solidification, 271–281 prior to initiation of solidification, 270–271 Tight-binding atomistic models, 298, 306–307, 310 Top-down/lithographic methods, 288 Translation-and searching algorithm, 170, 171, 173, 201 Triangular lattice, of circular rods band structure, 191–192, 193 Upper-convected Oldroyd derivative, 218, 269 Variable hard sphere model (VHS), 9 Wide angle X-ray diffraction (WAXD), 263, 264 Ziegler–Natta polymerization modeling, in high pressure reactors, 206 agglomerate growth modelling governing equations, 212–216 initial and boundary conditions, 216–217 complete model, 232–233 macroscale, 232–233 microscale, 233 macroscopic transport equations free terms determination, 232 microspheres growth modelling boundary conditions, consistency, 228–231 governing equations, 218–222 initial and boundary conditions, 222–224 kinematics, 217–218 microscale with spherical symmetry, equations analysis, 224–228 not evolving natural configuration, 234–235
Plate 1 Undeformed (red) and deformed (blue) unit cells of 2D triangular photonic crystal with cylindrical air rods: (a) pure shear, (b) simple shear and (c) uniaxial tension. In each mode, corresponding shear or tensile strain of 3% is applied.
[Figure: reciprocal-lattice diagrams marking the symmetry points Γ, M and K and the zones 1, 2 and 3.]
Plate 2 Schematic diagrams of symmetry points and zones in the reciprocal lattice of undeformed (left) and deformed (right) photonic crystals.
[Figure: three rows of band-structure panels. Each row shows the undeformed crystal followed by zones 1, 2 and 3 of the deformed crystal, for pure shear, simple shear and tension. Vertical axis: ωa/2πc, from 0 to 0.7. Horizontal axis: path through the symmetry points M and K.]
Plate 3 Photonic band structures under pure shear (top), simple shear (middle), and uniaxial tension (bottom). TM and TE modes are in blue and red, respectively. Dashed horizontal lines indicate the bandgap of the original, undeformed photonic crystal. Insets in the top row illustrate the quasi-hexagonal symmetry zones of the deformed photonic crystal.
[Figure: structural formulas labeled ethylene; 4-methyl-1-pentene; Ziegler–Natta polymerization; polyethylene; poly(ethylene-co-4-methyl-1-pentene) (BP’s Innoves ®, a form of LLDPE).]
Plate 4 Polyethylene.
[Figure panels: with prepolymerization; without prepolymerization.]
Plate 5 Effect of prepolymerization.
Ideal packing of the growing microspheres
Neighbouring spheres have similar histories and approximately the same radius
Plate 6 Porosity ε constant.
Plate 7 Schematic of three thin film growth modes. Left: Frank–van der Merwe or planar layer-by-layer growth. Center: Stranski–Krastanow or island growth on a wetting layer. Right: Volmer–Weber or island growth with no wetting of the substrate.
[Figure: density of states versus energy for bulk, film, wire and dot; source credit as printed: et al., 1995.]
Plate 8 Schematic of the electron density of states in bulk and quantum confined material systems. The delta-function-like densities of states in quantum wire and quantum dot configurations are desirable for many nanoelectronic and optoelectronic devices.
[Figure axes: DOS (eV⁻¹), from 0.0 to 2.0, versus E (eV).]
Plate 9 Combined experimental and computational study of the electronic structure of an individual embedded quantum dot at the atomistic scale. Atom positions are determined using high-resolution cross-sectional scanning tunneling microscopy (upper left) and then converted to an atomistic computational input file (lower left). Using a novel tight-binding method, the local density of states is determined at various positions (right) and compared to experimental data [56].
[Figure: mesh axes in Angstroms, from 0 to 2000. Inset labels: band gap difference; lattice mismatch (εxx, εyy < 0; εzz > 0); indentation (εzz < 0; εxx, εyy > 0).]
Plate 10 Embedded quantum dot array finite element mesh. The color contour shows the electrostatic potential for a single electron in the system when the surface is nanoindented to a small depth. The three inset images show how bandgap difference, lattice mismatch strain, and nanoindentation strain contribute to the total potential field [64].