First Edition, 2012
ISBN 978-81-323-3444-6
© All rights reserved.
Published by: Research World, 4735/22 Prakashdeep Bldg, Ansari Road, Darya Ganj, Delhi - 110002
Table of Contents
Chapter 1 - Maxwell's Equations
Chapter 2 - Navier–Stokes Existence and Smoothness
Chapter 3 - Noether's Theorem
Chapter 4 - Method of Characteristics and Method of Lines
Chapter 5 - Ricci Flow
Chapter 6 - Secondary Calculus and Cohomological Physics, Screened Poisson Equation and Saint-Venant's Compatibility Condition
Chapter 7 - Separation of Variables
Chapter 8 - Spherical Harmonics
Chapter 9 - Variational Inequality and Underdetermined System
Chapter-1
Maxwell's Equations
Maxwell's equations are a set of partial differential equations that, together with the Lorentz force law, form the foundation of classical electrodynamics, classical optics, and electric circuits. These in turn underlie modern electrical and communications technologies. Maxwell's equations have two major variants. The "microscopic" set of Maxwell's equations uses total charge and total current including the difficult-to-calculate atomic level charges and currents in materials. The "macroscopic" set of Maxwell's equations defines two new auxiliary fields that can sidestep having to know these 'atomic' sized charges and currents. Maxwell's equations are named after the Scottish physicist and mathematician James Clerk Maxwell, since in an early form they are all found in a four-part paper, "On Physical Lines of Force," which he published between 1861 and 1862. The mathematical form of the Lorentz force law also appeared in this paper. It is often useful to write Maxwell's equations in other forms; these representations are still formally termed "Maxwell's equations". A relativistic formulation in terms of covariant field tensors is used in special relativity, while, in quantum mechanics, a version based on the electric and magnetic potentials is preferred.
Conceptual description
Conceptually, Maxwell's equations describe how electric charges and electric currents act as sources for the electric and magnetic fields. Further, they describe how a time-varying electric field generates a time-varying magnetic field and vice versa. Of the four equations, two of them, Gauss's law and Gauss's law for magnetism, describe how the fields emanate from charges. (For the magnetic field there is no magnetic charge, and therefore magnetic field lines neither begin nor end anywhere.) The other two equations describe how the fields 'circulate' around their respective sources: the magnetic field 'circulates' around electric currents and time-varying electric fields in Ampère's law with Maxwell's correction, while the electric field 'circulates' around time-varying magnetic fields in Faraday's law.
Gauss's law
Gauss's law describes the relationship between an electric field and the electric charges that generate it: the electric field points away from positive charges and towards negative charges. In the field line description, electric field lines begin only at positive electric charges and end only at negative electric charges. 'Counting' the number of field lines passing through a closed surface therefore yields the total charge enclosed by that surface. More technically, Gauss's law relates the electric flux through any hypothetical closed "Gaussian surface" to the electric charge within the surface.
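The flux statement can be checked numerically. The following is a minimal sketch rather than anything taken from the original text: it assumes the SI form of Gauss's law (flux through a closed surface equals the enclosed charge divided by the permittivity of free space), a single point charge with a Coulomb field, and a midpoint-rule integration over a spherical Gaussian surface centred on the origin. The function names, the grid sizes and the example charge positions are illustrative assumptions.

```python
import math

EPS_0 = 8.8541878128e-12  # permittivity of free space in F/m (CODATA 2018 value)


def e_field(q, src, p):
    """Coulomb field at point p of a point charge q (in coulombs) located at src."""
    dx, dy, dz = p[0] - src[0], p[1] - src[1], p[2] - src[2]
    r2 = dx * dx + dy * dy + dz * dz
    k = q / (4.0 * math.pi * EPS_0 * r2 * math.sqrt(r2))
    return (k * dx, k * dy, k * dz)


def flux_through_sphere(q, src, radius=1.0, n_theta=200, n_phi=400):
    """Midpoint-rule estimate of the flux of E through a sphere centred on the origin."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        st, ct = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # Outward unit normal and the corresponding point on the sphere.
            n = (st * math.cos(phi), st * math.sin(phi), ct)
            p = (radius * n[0], radius * n[1], radius * n[2])
            e = e_field(q, src, p)
            e_dot_n = e[0] * n[0] + e[1] * n[1] + e[2] * n[2]
            total += e_dot_n * radius ** 2 * st * d_theta * d_phi
    return total


q = 1.0e-9  # 1 nC test charge (arbitrary choice)
expected = q / EPS_0
inside = flux_through_sphere(q, (0.3, 0.0, 0.2))   # charge inside the unit sphere
outside = flux_through_sphere(q, (0.0, 0.0, 2.0))  # charge outside the unit sphere
print(inside / expected, outside / expected)
```

Under these assumptions the first ratio should come out close to one and the second close to zero, whatever position is chosen for the charge inside or outside the surface.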
Gauss's law for magnetism: magnetic field lines neither begin nor end but form loops or extend to infinity, as shown here with the magnetic field due to a ring of current.
Gauss's law for magnetism Gauss's law for magnetism states that there are no "magnetic charges" (also called magnetic monopoles), analogous to electric charges. Instead, the magnetic field due to materials is generated by a configuration called a dipole. Magnetic dipoles are best represented as loops of current but resemble positive and negative 'magnetic charges', inseparably bound together, having no net 'magnetic charge'. In terms of field lines, this equation states that magnetic field lines neither begin nor end but make loops or extend to infinity and back. In other words, any magnetic field line that enters a given volume must somewhere exit that volume. Equivalent technical statements are that the sum total magnetic flux through any Gaussian surface is zero, or that the magnetic field is a solenoidal vector field.
Faraday's law
In a geomagnetic storm, a surge in the flux of charged particles temporarily alters Earth's magnetic field, which induces electric fields in Earth's atmosphere, thus causing surges in electrical power grids.
Faraday's law describes how a time-varying magnetic field creates ("induces") an electric field. This aspect of electromagnetic induction is the operating principle behind many electric generators: for example, a rotating bar magnet creates a changing magnetic field, which in turn generates an electric field in a nearby wire. (Note: there are two closely related equations which are called Faraday's law. The form used in Maxwell's equations is always valid but more restrictive than that originally formulated by Michael Faraday.)
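As an illustration of induction, the sketch below estimates the electromotive force around a fixed loop from the rate of change of the magnetic flux through it. It assumes the flux-rule form emf = -dΦ/dt and a flux that varies sinusoidally in time, which is an idealised stand-in for the rotating-magnet generator mentioned above. The amplitude, the frequency and the function names are hypothetical values chosen only for the example.

```python
import math

PHI_0 = 1.0e-3                 # assumed peak flux through the loop, in webers
OMEGA = 2.0 * math.pi * 50.0   # assumed angular frequency (50 Hz rotation), in rad/s


def flux(t):
    """Magnetic flux through the loop at time t for the assumed sinusoidal variation."""
    return PHI_0 * math.cos(OMEGA * t)


def emf(t, h=1.0e-7):
    """Induced emf = -dPhi/dt, estimated with a central finite difference of step h."""
    return -(flux(t + h) - flux(t - h)) / (2.0 * h)


t_peak = 0.005  # a quarter of the 20 ms period, where the flux changes fastest
print(emf(t_peak), PHI_0 * OMEGA)  # numerical estimate next to the analytic peak value
```

The numerical estimate and the analytic value PHI_0 * OMEGA * sin(OMEGA * t) should agree closely; the emf is largest when the flux passes through zero, not when the flux itself is largest.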
Ampère's law with Maxwell's correction
An Wang's magnetic core memory (1954) is an application of Ampère's law. Each core stores one bit of data.
Ampère's law with Maxwell's correction states that magnetic fields can be generated in two ways: by electrical current (this was the original "Ampère's law") and by changing electric fields (this was "Maxwell's correction"). Maxwell's correction to Ampère's law is particularly important: together with Faraday's law, it means that a changing magnetic field creates an electric field, and a changing electric field creates a magnetic field. Therefore, these equations allow self-sustaining "electromagnetic waves" to travel through empty space. The speed calculated for electromagnetic waves, which could be predicted from experiments on charges and currents, exactly matches the speed of light; indeed, light is one form of electromagnetic radiation (as are X-rays, radio waves, and others). Maxwell understood the connection between electromagnetic waves and light in 1861, thereby unifying the previously separate fields of electromagnetism and optics.
Units and summary of equations Maxwell's equations vary with the unit system used. Though the general form remains the same, various definitions get changed and different constants appear at different places. The equations here are given in SI units. Other units commonly used are Gaussian units (based on the cgs system), Lorentz-Heaviside units (used mainly in particle physics) and Planck units (used in theoretical physics). In the equations given below, symbols in bold represent vector quantities, and symbols in italics represent scalar quantities. The definitions of terms used in the two tables of equations are given in another table immediately following.
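Table of 'microscopic' equations
Formulation in terms of total charge and current
Name | Differential form | Integral form
Gauss's law | $\nabla \cdot \mathbf{E} = \frac{\rho}{\varepsilon_0}$ | $\oint_{\partial V} \mathbf{E} \cdot d\mathbf{A} = \frac{Q(V)}{\varepsilon_0}$
Gauss's law for magnetism | $\nabla \cdot \mathbf{B} = 0$ | $\oint_{\partial V} \mathbf{B} \cdot d\mathbf{A} = 0$
Maxwell–Faraday equation (Faraday's law of induction) | $\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}$ | $\oint_{\partial S} \mathbf{E} \cdot d\mathbf{l} = -\frac{\partial \Phi_{B,S}}{\partial t}$
Ampère's circuital law (with Maxwell's correction) | $\nabla \times \mathbf{B} = \mu_0 \mathbf{J} + \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}$ | $\oint_{\partial S} \mathbf{B} \cdot d\mathbf{l} = \mu_0 I_S + \mu_0 \varepsilon_0 \frac{\partial \Phi_{E,S}}{\partial t}$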
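Table of 'macroscopic' equations
Formulation in terms of free charge and current
Name | Differential form | Integral form
Gauss's law | $\nabla \cdot \mathbf{D} = \rho_f$ | $\oint_{\partial V} \mathbf{D} \cdot d\mathbf{A} = Q_f(V)$
Gauss's law for magnetism | $\nabla \cdot \mathbf{B} = 0$ | $\oint_{\partial V} \mathbf{B} \cdot d\mathbf{A} = 0$
Maxwell–Faraday equation (Faraday's law of induction) | $\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}$ | $\oint_{\partial S} \mathbf{E} \cdot d\mathbf{l} = -\frac{\partial \Phi_{B,S}}{\partial t}$
Ampère's circuital law (with Maxwell's correction) | $\nabla \times \mathbf{H} = \mathbf{J}_f + \frac{\partial \mathbf{D}}{\partial t}$ | $\oint_{\partial S} \mathbf{H} \cdot d\mathbf{l} = I_{f,S} + \frac{\partial \Phi_{D,S}}{\partial t}$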
Table of terms used in Maxwell's equations
The following table provides the meaning of each symbol and the SI unit of measure:
Symbol | Meaning (first term is the most common) | SI unit of measure
$\mathbf{E}$ | electric field, also called the electric field intensity | volt per meter or, equivalently, newton per coulomb
$\mathbf{B}$ | magnetic field, also called the magnetic induction, the magnetic field density, or the magnetic flux density | tesla or, equivalently, weber per square meter or volt-second per square meter
$\mathbf{D}$ | electric displacement field, also called the electric induction or the electric flux density | coulombs per square meter or, equivalently, newton per volt-meter
$\mathbf{H}$ | magnetizing field, also called auxiliary magnetic field, magnetic field intensity, or magnetic field | ampere per meter
$\nabla \cdot$ | the divergence operator | per meter (factor contributed by applying either operator)
$\nabla \times$ | the curl operator | per meter (factor contributed by applying either operator)
$\frac{\partial}{\partial t}$ | partial derivative with respect to time | per second (factor contributed by applying the operator)
$d\mathbf{A}$ | differential vector element of surface area A, with infinitesimally small magnitude and direction normal to surface S | square meters
$d\mathbf{l}$ | differential vector element of path length tangential to the path/curve | meters
$\varepsilon_0$ | permittivity of free space, also called the electric constant, a universal constant | farads per meter
$\mu_0$ | permeability of free space, also called the magnetic constant, a universal constant | henries per meter, or newtons per ampere squared
$\rho_f$ | free charge density (not including bound charge) | coulombs per cubic meter
$\rho$ | total charge density (including both free and bound charge) | coulombs per cubic meter
$\mathbf{J}_f$ | free current density (not including bound current) | amperes per square meter
$\mathbf{J}$ | total current density (including both free and bound current) | amperes per square meter
$Q_f(V)$ | net free electric charge within the three-dimensional volume V (not including bound charge) | coulombs
$Q(V)$ | net electric charge within the three-dimensional volume V (including both free and bound charge) | coulombs
$\oint_{\partial S} \mathbf{E} \cdot d\mathbf{l}$ | line integral of the electric field along the boundary ∂S of a surface S (∂S is always a closed curve) | joules per coulomb
$\oint_{\partial S} \mathbf{B} \cdot d\mathbf{l}$ | line integral of the magnetic field over the closed boundary ∂S of the surface S | tesla-meters
$\oint_{\partial V} \mathbf{E} \cdot d\mathbf{A}$ | the electric flux (surface integral of the electric field) through the closed surface ∂V (the boundary of the volume V) | joule-meters per coulomb
$\oint_{\partial V} \mathbf{B} \cdot d\mathbf{A}$ | the magnetic flux (surface integral of the magnetic B-field) through the closed surface ∂V (the boundary of the volume V) | tesla meters-squared or webers
$\Phi_{B,S} = \int_S \mathbf{B} \cdot d\mathbf{A}$ | magnetic flux through any surface S, not necessarily closed | webers or, equivalently, volt-seconds
$\Phi_{E,S} = \int_S \mathbf{E} \cdot d\mathbf{A}$ | electric flux through any surface S, not necessarily closed | joule-meters per coulomb
$\Phi_{D,S} = \int_S \mathbf{D} \cdot d\mathbf{A}$ | flux of electric displacement field through any surface S, not necessarily closed | coulombs
$I_{f,S} = \int_S \mathbf{J}_f \cdot d\mathbf{A}$ | net free electrical current passing through the surface S (not including bound current) | amperes
$I_S = \int_S \mathbf{J} \cdot d\mathbf{A}$ | net electrical current passing through the surface S (including both free and bound current) | amperes
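Proof that the two general formulations are equivalent
The two alternate general formulations of Maxwell's equations given above are mathematically equivalent and related by the following relations:
$\rho_b = -\nabla \cdot \mathbf{P}, \qquad \mathbf{J}_b = \nabla \times \mathbf{M} + \frac{\partial \mathbf{P}}{\partial t},$
$\mathbf{D} = \varepsilon_0 \mathbf{E} + \mathbf{P}, \qquad \mathbf{H} = \frac{1}{\mu_0} \mathbf{B} - \mathbf{M},$
$\rho = \rho_b + \rho_f, \qquad \mathbf{J} = \mathbf{J}_b + \mathbf{J}_f,$
where P and M are polarization and magnetization, and ρb and Jb are bound charge and current, respectively. Substituting these equations into the 'macroscopic' Maxwell's equations gives identically the microscopic equations.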
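As a worked check of this claim, the substitution can be carried out for the two equations that contain sources (the other two are the same in both formulations). The sketch below assumes the standard definitions $\mathbf{D} = \varepsilon_0 \mathbf{E} + \mathbf{P}$, $\mathbf{H} = \mathbf{B}/\mu_0 - \mathbf{M}$, $\rho_b = -\nabla \cdot \mathbf{P}$ and $\mathbf{J}_b = \nabla \times \mathbf{M} + \partial \mathbf{P}/\partial t$, together with $\rho = \rho_f + \rho_b$ and $\mathbf{J} = \mathbf{J}_f + \mathbf{J}_b$:

```latex
\begin{align*}
\nabla \cdot \mathbf{D} = \rho_f
  &\;\Longrightarrow\; \varepsilon_0 \nabla \cdot \mathbf{E} + \nabla \cdot \mathbf{P} = \rho_f
   \;\Longrightarrow\; \nabla \cdot \mathbf{E} = \frac{\rho_f + \rho_b}{\varepsilon_0} = \frac{\rho}{\varepsilon_0}, \\
\nabla \times \mathbf{H} = \mathbf{J}_f + \frac{\partial \mathbf{D}}{\partial t}
  &\;\Longrightarrow\; \frac{1}{\mu_0} \nabla \times \mathbf{B} - \nabla \times \mathbf{M}
     = \mathbf{J}_f + \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t} + \frac{\partial \mathbf{P}}{\partial t} \\
  &\;\Longrightarrow\; \nabla \times \mathbf{B}
     = \mu_0 \left( \mathbf{J}_f + \mathbf{J}_b \right) + \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}
     = \mu_0 \mathbf{J} + \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}.
\end{align*}
```

The bound charge and bound current simply move from inside the auxiliary fields D and H to the source side of the equations.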
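Maxwell's 'microscopic' equations
The microscopic variant of Maxwell's equations expresses the electric E field and the magnetic B field in terms of the total charge and total current present, including the charges and currents at the atomic level. It is sometimes called the general form of Maxwell's equations or "Maxwell's equations in a vacuum". Both variants of Maxwell's equations are equally general, though, as they are mathematically equivalent. The microscopic equations are most useful when there are no dielectric or magnetic materials nearby, for example in waveguides.
Formulation in terms of total charge and current
Name | Differential form | Integral form
Gauss's law | $\nabla \cdot \mathbf{E} = \frac{\rho}{\varepsilon_0}$ | $\oint_{\partial V} \mathbf{E} \cdot d\mathbf{A} = \frac{Q(V)}{\varepsilon_0}$
Gauss's law for magnetism | $\nabla \cdot \mathbf{B} = 0$ | $\oint_{\partial V} \mathbf{B} \cdot d\mathbf{A} = 0$
Maxwell–Faraday equation (Faraday's law of induction) | $\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}$ | $\oint_{\partial S} \mathbf{E} \cdot d\mathbf{l} = -\frac{\partial \Phi_{B,S}}{\partial t}$
Ampère's circuital law (with Maxwell's correction) | $\nabla \times \mathbf{B} = \mu_0 \mathbf{J} + \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}$ | $\oint_{\partial S} \mathbf{B} \cdot d\mathbf{l} = \mu_0 I_S + \mu_0 \varepsilon_0 \frac{\partial \Phi_{E,S}}{\partial t}$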
With neither charges nor currents
In a region with no charges (ρ = 0) and no currents (J = 0), such as in a vacuum, Maxwell's equations reduce to:
$\nabla \cdot \mathbf{E} = 0, \qquad \nabla \cdot \mathbf{B} = 0, \qquad \nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \qquad \nabla \times \mathbf{B} = \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}.$
These equations lead directly to E and B satisfying the wave equation, for which the solutions are linear combinations of plane waves traveling at the speed of light,
$c = \frac{1}{\sqrt{\mu_0 \varepsilon_0}}.$
In addition, E and B are perpendicular to each other and to the direction of propagation, and are in phase with each other. A sinusoidal plane wave is one special solution of these equations. In fact, Maxwell's equations explain how these waves can physically propagate through space. The changing magnetic field creates a changing electric field through Faraday's law. In turn, that electric field creates a changing magnetic field through Maxwell's correction to Ampère's law. This perpetual cycle allows these waves, now known as electromagnetic radiation, to move through space at velocity c.
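The plane-wave statement can be tested numerically. The sketch below is an illustrative check under stated assumptions: it takes the source-free curl equations in SI form, computes c from the two vacuum constants, and uses finite differences to confirm that a sinusoidal wave travelling along x, with E along y and B = E/c along z, satisfies both Faraday's law and Ampère's law with Maxwell's correction. The wavelength, the step sizes and the function names are arbitrary choices made for the example.

```python
import math

MU_0 = 4.0e-7 * math.pi       # permeability of free space in H/m (classical defined value)
EPS_0 = 8.8541878128e-12      # permittivity of free space in F/m (CODATA 2018 value)
C = 1.0 / math.sqrt(MU_0 * EPS_0)  # wave speed predicted by the vacuum equations

K = 2.0 * math.pi             # wavenumber for an assumed wavelength of 1 m
DX = 1.0e-5                   # spatial step for finite differences, in meters
DT = DX / 3.0e8               # time step of comparable scale, in seconds


def e_y(x, t):
    """y-component of E for a unit-amplitude wave travelling in the +x direction."""
    return math.sin(K * (x - C * t))


def b_z(x, t):
    """z-component of B, in phase with E and smaller by a factor of c."""
    return e_y(x, t) / C


def d_dx(f, x, t):
    return (f(x + DX, t) - f(x - DX, t)) / (2.0 * DX)


def d_dt(f, x, t):
    return (f(x, t + DT) - f(x, t - DT)) / (2.0 * DT)


def residuals(x, t):
    """Left side minus right side of the two curl equations for this wave."""
    faraday = d_dx(e_y, x, t) + d_dt(b_z, x, t)                 # (curl E)_z + dB_z/dt
    ampere = -d_dx(b_z, x, t) - MU_0 * EPS_0 * d_dt(e_y, x, t)  # (curl B)_y - mu0 eps0 dE_y/dt
    return faraday, ampere


faraday_res, ampere_res = residuals(0.37, 1.0e-9)
print(C, faraday_res, ampere_res)
```

Both residuals should be negligible compared with the individual terms (of order K and K/C respectively), and the printed speed should be close to 299 792 458 m/s. Changing the wave speed in `e_y` to anything other than `C` makes the residuals grow, which is the numerical counterpart of the statement that the vacuum equations fix the propagation speed.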
Maxwell's 'macroscopic' equations Unlike the 'microscopic' equations, "Maxwell's macroscopic equations", also known as Maxwell's equations in matter, factor out the bound charge and current to obtain equations that depend only on the free charges and currents. These equations are more similar to those that Maxwell himself introduced. The cost of this factorization is that additional fields need to be defined: the displacement field D which is defined in terms of the electric field E and the polarization P of the material, and the magnetic-H field, which is defined in terms of the magnetic-B field and the magnetization M of the material.
Bound charge and current
Left: A schematic view of how an assembly of microscopic dipoles produces opposite surface charges, as shown at top and bottom. Right: How an assembly of microscopic current loops adds together to produce a macroscopically circulating current loop. Inside the boundaries, the individual contributions tend to cancel, but at the boundaries no cancellation occurs.
When an electric field is applied to a dielectric material, its molecules respond by forming microscopic electric dipoles: their atomic nuclei move a tiny distance in the direction of the field, while their electrons move a tiny distance in the opposite direction. This produces a macroscopic bound charge in the material even though all of the charges involved are bound to individual molecules. For example, if every molecule responds the same, similar to that shown in the figure, these tiny movements of charge combine to produce a layer of positive bound charge on one side of the material and a layer of negative charge on the other side. The bound charge is most conveniently described in terms of a polarization, P, in the material. If P is uniform, a macroscopic separation of charge is produced only at the surfaces where P enters and leaves the material. For nonuniform P, a charge is also produced in the bulk.
Somewhat similarly, in all materials the constituent atoms exhibit magnetic moments that are intrinsically linked to the angular momentum of the atoms' components, most notably their electrons. The connection to angular momentum suggests the picture of an assembly of microscopic current loops. Outside the material, an assembly of such microscopic current loops is not different from a macroscopic current circulating around the material's surface, despite the fact that no individual magnetic moment is traveling a large distance. These bound currents can be described using the magnetization M.
The very complicated and granular bound charges and bound currents can therefore be represented on the macroscopic scale in terms of P and M, which average these charges and currents on a scale large enough not to see the granularity of individual atoms, but small enough that they still vary with location in the material. As such, Maxwell's macroscopic equations ignore many details on a fine scale that may be unimportant to understanding matters on a coarser scale, by calculating fields that are averaged over some suitably sized volume.
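Equations
Formulation in terms of free charge and current
Name | Differential form | Integral form
Gauss's law | $\nabla \cdot \mathbf{D} = \rho_f$ | $\oint_{\partial V} \mathbf{D} \cdot d\mathbf{A} = Q_f(V)$
Gauss's law for magnetism | $\nabla \cdot \mathbf{B} = 0$ | $\oint_{\partial V} \mathbf{B} \cdot d\mathbf{A} = 0$
Maxwell–Faraday equation (Faraday's law of induction) | $\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}$ | $\oint_{\partial S} \mathbf{E} \cdot d\mathbf{l} = -\frac{\partial \Phi_{B,S}}{\partial t}$
Ampère's circuital law (with Maxwell's correction) | $\nabla \times \mathbf{H} = \mathbf{J}_f + \frac{\partial \mathbf{D}}{\partial t}$ | $\oint_{\partial S} \mathbf{H} \cdot d\mathbf{l} = I_{f,S} + \frac{\partial \Phi_{D,S}}{\partial t}$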
Constitutive relations
In order to apply 'Maxwell's macroscopic equations', it is necessary to specify the relations between the displacement field D and E, and between the magnetic H-field and B. These equations specify the response of bound charge and current to the applied fields and are called constitutive relations. Determining the constitutive relationship between the auxiliary fields D and H and the E and B fields starts with the definition of the auxiliary fields themselves:
$\mathbf{D} = \varepsilon_0 \mathbf{E} + \mathbf{P}, \qquad \mathbf{H} = \frac{1}{\mu_0} \mathbf{B} - \mathbf{M},$
where P is the polarization field and M is the magnetization field, which are defined in terms of microscopic bound charges and bound currents respectively. Before getting to how to calculate M and P, though, it is useful to examine some special cases.
Without magnetic or dielectric materials
In the absence of magnetic or dielectric materials, the constitutive relations are simple:
$\mathbf{D} = \varepsilon_0 \mathbf{E}, \qquad \mathbf{H} = \frac{1}{\mu_0} \mathbf{B},$
where ε0 and μ0 are two universal constants, called the permittivity of free space and permeability of free space, respectively. Substituting these back into Maxwell's macroscopic equations leads directly to Maxwell's microscopic equations, except that the currents and charges are replaced with free currents and free charges. This is expected, since there are neither bound charges nor bound currents.
Isotropic linear materials
In an (isotropic) linear material, where P is proportional to E and M is proportional to B, the constitutive relations are also straightforward. In terms of the polarization P and the magnetization M they are:
$\mathbf{P} = \varepsilon_0 \chi_e \mathbf{E}, \qquad \mathbf{M} = \chi_m \mathbf{H},$
where χe and χm are the electric and magnetic susceptibilities of a given material respectively. In terms of D and H the constitutive relations are:
$\mathbf{D} = \varepsilon \mathbf{E}, \qquad \mathbf{H} = \frac{1}{\mu} \mathbf{B},$
where ε and μ are constants (which depend on the material), called the permittivity and permeability, respectively, of the material. These are related to the susceptibilities by:
$\frac{\varepsilon}{\varepsilon_0} = 1 + \chi_e, \qquad \frac{\mu}{\mu_0} = 1 + \chi_m.$
Substituting the constitutive relations above, Maxwell's equations in linear, dispersionless, time-invariant materials (differential form only) are:
$\nabla \cdot (\varepsilon \mathbf{E}) = \rho_f, \qquad \nabla \cdot \mathbf{B} = 0, \qquad \nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \qquad \nabla \times \left( \frac{\mathbf{B}}{\mu} \right) = \mathbf{J}_f + \varepsilon \frac{\partial \mathbf{E}}{\partial t}.$
These are formally identical to the general formulation in terms of E and B (given above), except that the permittivity of free space was replaced with the permittivity of the material, the permeability of free space was replaced with the permeability of the material, and only free charges and currents are included (instead of all charges and currents). Unless the material is homogeneous in space, ε and μ cannot be factored out of the derivative expressions on the left sides.
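A small numerical sketch can make the bookkeeping of the linear relations concrete. It assumes the conventions P = ε0 χe E, M = χm H, ε = ε0 (1 + χe) and μ = μ0 (1 + χm), treats a single field component as a scalar, and then confirms that the results agree with the defining relations D = ε0 E + P and H = B/μ0 - M. The susceptibility values, the field strengths and the function name are hypothetical and chosen only for illustration.

```python
import math

EPS_0 = 8.8541878128e-12   # permittivity of free space, F/m
MU_0 = 1.25663706212e-6    # permeability of free space, H/m


def linear_material_fields(e, b, chi_e, chi_m):
    """Return P, D, H and M for scalar field components e (V/m) and b (T)."""
    eps = EPS_0 * (1.0 + chi_e)   # permittivity of the material
    mu = MU_0 * (1.0 + chi_m)     # permeability of the material
    p = EPS_0 * chi_e * e         # polarization
    d = eps * e                   # displacement field
    h = b / mu                    # magnetizing field
    m = chi_m * h                 # magnetization
    return {"P": p, "D": d, "H": h, "M": m}


# Assumed example: E = 1 kV/m and B = 0.5 T in a weakly magnetic dielectric.
fields = linear_material_fields(e=1.0e3, b=0.5, chi_e=1.5, chi_m=2.0e-5)
d_ok = math.isclose(fields["D"], EPS_0 * 1.0e3 + fields["P"], rel_tol=1e-12)
h_ok = math.isclose(fields["H"], 0.5 / MU_0 - fields["M"], rel_tol=1e-9)
print(fields, d_ok, h_ok)
```

Both checks should print `True`: in a linear material the permittivity and permeability are just a compact way of carrying the polarization and magnetization along with the applied fields.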
General case
For real-world materials, the constitutive relations are not linear, except approximately. Calculating the constitutive relations from first principles involves determining how P and M are created from a given E and B. These relations may be empirical (based directly upon measurements), or theoretical (based upon statistical mechanics, transport theory or other tools of condensed matter physics). The detail employed may be macroscopic or microscopic, depending upon the level necessary to the problem under scrutiny. In general, though, the constitutive relations can usually still be written:
$\mathbf{D} = \varepsilon \mathbf{E}, \qquad \mathbf{H} = \frac{1}{\mu} \mathbf{B},$
but ε and μ are not, in general, simple constants, but rather functions. Examples are:
Dispersion and absorption, where ε and μ are functions of frequency. (Causality does not permit materials to be nondispersive; see, for example, Kramers–Kronig relations.) Nor do the fields need to be in phase, which leads to ε and μ being complex. This also leads to absorption.
Bi-(an)isotropy, where H and D depend on both B and E.
Nonlinearity, where ε and μ are functions of E and B.
Anisotropy (such as birefringence or dichroism), which occurs when ε and μ are second-rank tensors: $D_i = \sum_j \varepsilon_{ij} E_j, \quad B_i = \sum_j \mu_{ij} H_j.$
Dependence of P and M on E and B at other locations and times. This could be due to spatial inhomogeneity (for example in a domained structure, heterostructure or liquid crystal, or most commonly in the situation where there are simply multiple materials occupying different regions of space), or it could be due to a time-varying medium or to hysteresis. In such cases P and M can be calculated as:
$\mathbf{P}(\mathbf{r}, t) = \varepsilon_0 \int d^3\mathbf{r}' \, dt' \; \hat{\chi}_e(\mathbf{r}, \mathbf{r}', t, t'; \mathbf{E}) \, \mathbf{E}(\mathbf{r}', t'),$
with M given by an analogous integral of B over a magnetic susceptibility kernel,
in which the permittivity and permeability functions are replaced by integrals over the more general electric and magnetic susceptibilities. In practice, some materials properties have a negligible impact in particular circumstances, permitting neglect of small effects. For example: optical nonlinearities can be neglected for low field strengths; material dispersion is unimportant when frequency is limited to a narrow bandwidth; material absorption can be neglected for wavelengths for which a material is transparent; and metals with finite conductivity often are approximated at microwave or longer wavelengths as perfect metals with infinite conductivity (forming hard barriers with zero skin depth of field penetration). It may be noted that man-made materials can be designed to have customized permittivity and permeability, such as metamaterials and photonic crystals.
Calculation of constitutive relations
In general, the constitutive equations are theoretically determined by calculating how a molecule responds to the local fields through the Lorentz force. Other forces may need to be modeled as well, such as lattice vibrations in crystals or bond forces. Including all of the forces leads to changes in the molecule which are used to calculate P and M as a function of the local fields.
The local fields differ from the applied fields due to the fields produced by the polarization and magnetization of nearby material, an effect which also needs to be modeled. Further, real materials are not continuous media; the local fields of real materials vary wildly on the atomic scale. The fields need to be averaged over a suitable volume to form a continuum approximation.
These continuum approximations often require some type of quantum mechanical analysis, such as quantum field theory as applied to condensed matter physics. See, for example, density functional theory, Green–Kubo relations and Green's function. Various approximate transport equations have evolved, for example the Boltzmann equation, the Fokker–Planck equation or the Navier–Stokes equations. Some examples where these equations are applied are magnetohydrodynamics, fluid dynamics, electrohydrodynamics, superconductivity and plasma modeling. An entire physical apparatus for dealing with these matters has developed. A different set of homogenization methods (evolving from a tradition in treating materials such as conglomerates and laminates) is based upon approximation of an inhomogeneous material by a homogeneous effective medium (valid for excitations with wavelengths much larger than the scale of the inhomogeneity).
The theoretical modeling of the continuum-approximation properties of many real materials often relies upon measurement as well, for example ellipsometry measurements.
History
Relation between electricity, magnetism, and the speed of light
The relation between electricity, magnetism, and the speed of light can be summarized by the modern equation:
$c = \frac{1}{\sqrt{\mu_0 \varepsilon_0}}.$
The left-hand side is the speed of light, and the right-hand side is a quantity related to the equations governing electricity and magnetism. Although the right-hand side has units of velocity, it can be inferred from measurements of electric and magnetic forces, which involve no physical velocities. Therefore, establishing this relationship provided convincing evidence that light is an electromagnetic phenomenon.
The discovery of this relationship started in 1855, when Wilhelm Eduard Weber and Rudolf Kohlrausch determined that there was a quantity related to electricity and magnetism, "the ratio of the absolute electromagnetic unit of charge to the absolute electrostatic unit of charge" (in modern language, the value $1/\sqrt{\mu_0 \varepsilon_0}$), and determined that it should have units of velocity. They then measured this ratio by an experiment which involved charging and discharging a Leyden jar and measuring the magnetic force from the discharge current, and found a value of $3.107 \times 10^8$ m/s, remarkably close to the speed of light, which had recently been measured at $3.14 \times 10^8$ m/s by Hippolyte Fizeau in 1848 and at $2.98 \times 10^8$ m/s by Léon Foucault in 1850. However, Weber and Kohlrausch did not make the connection to the speed of light. Towards the end of 1861, while working on part III of his paper On Physical Lines of Force, Maxwell travelled from Scotland to London and looked up Weber and Kohlrausch's results. He converted them into a format which was compatible with his own writings, and in doing so he established the connection to the speed of light and concluded that light is a form of electromagnetic radiation.
The term Maxwell's equations
The four modern Maxwell's equations can be found individually throughout his 1861 paper, derived theoretically using a molecular vortex model of Michael Faraday's "lines of force" and in conjunction with the experimental result of Weber and Kohlrausch. But it wasn't until 1884 that Oliver Heaviside, concurrently with similar work by Willard Gibbs and Heinrich Hertz, grouped the four together into a distinct set. This group of four equations was known variously as the Hertz-Heaviside equations and the Maxwell-Hertz equations, and is sometimes still known as the Maxwell–Heaviside equations.
Maxwell's contribution to science in producing these equations lies in the correction he made to Ampère's circuital law in his 1861 paper On Physical Lines of Force. He added the displacement current term to Ampère's circuital law, and this enabled him to derive the electromagnetic wave equation in his later 1865 paper A Dynamical Theory of the Electromagnetic Field and to demonstrate that light is an electromagnetic wave. This was later confirmed experimentally by Heinrich Hertz in 1887. The physicist Richard Feynman predicted that, "The American Civil War will pale into provincial insignificance in comparison with this important scientific event of the same decade."
The concept of fields was introduced by, among others, Faraday. Albert Einstein wrote:
The precise formulation of the time-space laws was the work of Maxwell. Imagine his feelings when the differential equations he had formulated proved to him that electromagnetic fields spread in the form of polarised waves, and at the speed of light! To few men in the world has such an experience been vouchsafed ... it took physicists some decades to grasp the full significance of Maxwell's discovery, so bold was the leap that his genius forced upon the conceptions of his fellow-workers (Science, May 24, 1940)
Heaviside worked to eliminate the potentials (electric potential and magnetic potential) that Maxwell had used as the central concepts in his equations. This effort was somewhat controversial, though it was understood by 1884 that the potentials must propagate at the speed of light like the fields, unlike the instantaneous action-at-a-distance of the then-current conception of gravitational potential. Modern analysis of, for example, radio antennas makes full use of Maxwell's vector and scalar potentials to separate the variables, a common technique used in formulating the solutions of differential equations. However, the potentials can be introduced by algebraic manipulation of the four fundamental equations.
On Physical Lines of Force
The four modern-day Maxwell's equations appeared throughout Maxwell's 1861 paper On Physical Lines of Force:
1. Equation (56) in Maxwell's 1861 paper is ∇ ⋅ B = 0.
2. Equation (112) is Ampère's circuital law with Maxwell's displacement current added. It is the addition of displacement current that is the most significant aspect of Maxwell's work in electromagnetism, as it enabled him to later derive the electromagnetic wave equation in his 1865 paper A Dynamical Theory of the Electromagnetic Field, and hence show that light is an electromagnetic wave. It is therefore this aspect of Maxwell's work which gives the equations their full significance. (Interestingly, Kirchhoff derived the telegrapher's equations in 1857 without using displacement current. But he did use Poisson's equation and the equation of continuity, which are the mathematical ingredients of the displacement current. Nevertheless, Kirchhoff believed his equations to be applicable only inside an electric wire, and so he is not credited with having discovered that light is an electromagnetic wave.)
3. Equation (115) is Gauss's law.
4. Equation (54) is an equation that Oliver Heaviside referred to as 'Faraday's law'. This equation covers the time-varying aspect of electromagnetic induction, but not the motionally induced aspect, whereas Faraday's original flux law covers both aspects. Maxwell deals with the motionally dependent aspect of electromagnetic induction, v × B, at equation (77). Equation (77), which is the same as equation (D) in the original eight Maxwell's equations listed below, corresponds to all intents and purposes to the modern-day force law F = q( E + v × B ), which sits adjacent to Maxwell's equations and bears the name Lorentz force, even though Maxwell derived it when Lorentz was still a young boy.
The difference between the B and the H vectors can be traced back to Maxwell's 1855 paper entitled On Faraday's Lines of Force, which was read to the Cambridge Philosophical Society. The paper presented a simplified model of Faraday's work and showed how electricity and magnetism were related. He reduced all of the current knowledge into a linked set of differential equations.
Figure of Maxwell's molecular vortex model. For a uniform magnetic field, the field lines point outward from the display screen, as can be observed from the black dots in the middle of the hexagons. The vortex of each hexagonal molecule rotates counterclockwise. The small green circles are clockwise-rotating particles sandwiched between the molecular vortices.
The difference between B and H is later clarified in his concept of a sea of molecular vortices that appears in his 1861 paper On Physical Lines of Force. Within that context, H represented pure vorticity (spin), whereas B was a weighted vorticity that was weighted for the density of the vortex sea. Maxwell considered magnetic permeability µ to be a measure of the density of the vortex sea. Hence the relationship,
1. Magnetic induction current causes a magnetic current density: $\mathbf{B} = \mu \mathbf{H}$
was essentially a rotational analogy to the linear electric current relationship,
2. Electric convection current: $\mathbf{J} = \rho \mathbf{v}$
where ρ is electric charge density. B was seen as a kind of magnetic current of vortices aligned in their axial planes, with H being the circumferential velocity of the vortices. With µ representing vortex density, it follows that the product of µ with vorticity H leads to the magnetic field denoted as B.
The electric current equation can be viewed as a convective current of electric charge that involves linear motion. By analogy, the magnetic equation is an inductive current involving spin. There is no linear motion in the inductive current along the direction of the B vector. The magnetic inductive current represents lines of force. In particular, it represents lines of inverse square law force.
The extension of the above considerations confirms that, as B is to H and as J is to ρ, it necessarily follows from Gauss's law and from the equation of continuity of charge that E is to D; that is, B parallels E, whereas H parallels D.
A Dynamical Theory of the Electromagnetic Field
In 1864 Maxwell presented A Dynamical Theory of the Electromagnetic Field (published in 1865), in which he showed that light was an electromagnetic phenomenon. Confusion over the term "Maxwell's equations" sometimes arises because it has been used for a set of eight equations that appeared in Part III of this paper, entitled "General Equations of the Electromagnetic Field," and this confusion is compounded by the writing of six of those eight equations as three separate equations (one for each of the Cartesian axes), resulting in twenty equations and twenty unknowns. (As noted above, this terminology is not common: modern references to the term "Maxwell's equations" refer to the Heaviside restatements.)
The eight original Maxwell's equations can be written in modern vector notation as follows:
(A) The law of total currents: $\mathbf{J}_{tot} = \mathbf{J} + \frac{\partial \mathbf{D}}{\partial t}$
(B) The equation of magnetic force: $\mu \mathbf{H} = \nabla \times \mathbf{A}$
(C) Ampère's circuital law: $\nabla \times \mathbf{H} = \mathbf{J}_{tot}$
(D) Electromotive force created by convection, induction, and by static electricity (this is in effect the Lorentz force): $\mathbf{E} = \mu \mathbf{v} \times \mathbf{H} - \frac{\partial \mathbf{A}}{\partial t} - \nabla \phi$
(E) The electric elasticity equation: $\mathbf{E} = \frac{1}{\varepsilon} \mathbf{D}$
(F) Ohm's law: $\mathbf{E} = \frac{1}{\sigma} \mathbf{J}$
(G) Gauss's law: $\nabla \cdot \mathbf{D} = \rho$
(H) Equation of continuity: $\nabla \cdot \mathbf{J} = -\frac{\partial \rho}{\partial t}$ or $\nabla \cdot \mathbf{J}_{tot} = 0$
Notation
H is the magnetizing field, which Maxwell called the magnetic intensity.
J is the current density (with Jtot being the total current including displacement current).
D is the displacement field (called the electric displacement by Maxwell).
ρ is the free charge density (called the quantity of free electricity by Maxwell).
A is the magnetic potential (called the angular impulse by Maxwell).
E is called the electromotive force by Maxwell. The term electromotive force is nowadays used for voltage, but it is clear from the context that Maxwell's meaning corresponded more to the modern term electric field.
φ is the electric potential (which Maxwell also called electric potential).
σ is the electrical conductivity (Maxwell called the inverse of conductivity the specific resistance, what is now called the resistivity).
It is interesting to note the μv × H term that appears in equation D. Equation D is therefore effectively the Lorentz force, similarly to equation (77) of his 1861 paper. When Maxwell derives the electromagnetic wave equation in his 1865 paper, he uses equation D to account for electromagnetic induction rather than Faraday's law of induction, which is used in modern textbooks. (Faraday's law itself does not appear among his equations.) However, Maxwell drops the μv × H term from equation D when he is deriving the electromagnetic wave equation, as he considers the situation only from the rest frame.
A Treatise on Electricity and Magnetism In A Treatise on Electricity and Magnetism, an 1873 treatise on electromagnetism written by James Clerk Maxwell, eleven general equations of the electromagnetic field are listed and these include the eight that are listed in the 1865 paper.
Maxwell's equations and relativity Maxwell's original equations are based on the idea that light travels through a sea of molecular vortices known as the 'luminiferous aether', and that the speed of light has to be respective to the reference frame of this aether. Measurements designed to measure the speed of the Earth through the aether conflicted, though. A more theoretical approach was suggested by Hendrik Lorentz along with George FitzGerald and Joseph Larmor. Both Larmor (1897) and Lorentz (1899, 1904) derived the Lorentz transformation (so named by Henri Poincaré) as one under which Maxwell's equations were invariant. Poincaré (1900) analyzed the coordination of moving clocks by exchanging light signals. He also established mathematically the group property of the Lorentz transformation (Poincaré 1905). Einstein dismissed the aether as unnecessary and concluded that Maxwell's equations predict the existence of a fixed speed of light, independent of the speed of the observer, and as such he used Maxwell's equations as the starting point for his special theory of relativity. In doing so, he established the Lorentz transformation as being valid for all matter and not just Maxwell's equations. Maxwell's equations played a key role in Einstein's famous paper on special relativity; for example, in the opening paragraph of the paper, he motivated his theory by noting that a description of a conductor moving with respect to a magnet must generate a consistent set of fields irrespective of whether the force is calculated in the rest frame of the magnet or that of the conductor. General relativity has also had a close relationship with Maxwell's equations. For example, Theodor Kaluza and Oskar Klein showed in the 1920s that Maxwell's equations can be derived by extending general relativity into five dimensions. This strategy of using higher dimensions to unify different forces remains an active area of research in particle physics.
Modified to include magnetic monopoles
Maxwell's equations provide for an electric charge, but posit no magnetic charge. Magnetic charge has never been seen and may not exist. Nevertheless, Maxwell's equations including magnetic charge (and magnetic current) are of some theoretical interest. For one reason, Maxwell's equations can be made fully symmetric under interchange of electric and magnetic field by allowing for the possibility of magnetic charges with magnetic charge density ρm and currents with magnetic current density Jm. The extended Maxwell's equations (in cgs-Gaussian units) are:

Name | Without magnetic monopoles | With magnetic monopoles (hypothetical)
Gauss's law | $\nabla \cdot \mathbf{E} = 4\pi\rho_e$ | $\nabla \cdot \mathbf{E} = 4\pi\rho_e$
Gauss's law for magnetism | $\nabla \cdot \mathbf{B} = 0$ | $\nabla \cdot \mathbf{B} = 4\pi\rho_m$
Maxwell–Faraday equation (Faraday's law of induction) | $-\nabla \times \mathbf{E} = \frac{1}{c}\frac{\partial \mathbf{B}}{\partial t}$ | $-\nabla \times \mathbf{E} = \frac{1}{c}\frac{\partial \mathbf{B}}{\partial t} + \frac{4\pi}{c}\mathbf{J}_m$
Ampère's law (with Maxwell's extension) | $\nabla \times \mathbf{B} = \frac{1}{c}\frac{\partial \mathbf{E}}{\partial t} + \frac{4\pi}{c}\mathbf{J}_e$ | $\nabla \times \mathbf{B} = \frac{1}{c}\frac{\partial \mathbf{E}}{\partial t} + \frac{4\pi}{c}\mathbf{J}_e$

If magnetic charges do not exist, or if they exist but not in the region studied, then the new variables are zero, and the symmetric equations reduce to the conventional equations of electromagnetism, such as ∇ · B = 0. Further, if every particle has the same ratio of electric to magnetic charge, then an E and a B field can be defined that obey the normal Maxwell's equations (having no magnetic charges or currents) with their own charge and current densities.
Boundary conditions using Maxwell's equations
Like all sets of differential equations, Maxwell's equations cannot be uniquely solved without a suitable set of boundary conditions and initial conditions. For example, consider a region with no charges and no currents. One particular solution that satisfies all of Maxwell's equations in that region is E = 0 and B = 0 everywhere in the region. This solution is obviously false if there is a charge just outside of the region. In this particular example, all of the electric and magnetic fields in the interior are due to the charges outside of the volume. Different charges outside of the volume produce different fields on the surface of that volume and therefore correspond to different boundary conditions. In general, knowing the appropriate boundary conditions for a given region, along with the currents and charges in that region, allows one to solve for all the fields everywhere within that region. An example of this type is an electromagnetic scattering problem, where an electromagnetic wave originating outside the scattering region is scattered by a target, and the scattered wave is analyzed for the information it carries about the target.
In some cases, like waveguides or cavity resonators, the solution region is largely isolated from the universe, for example by metallic walls, and boundary conditions at the walls define the fields, with the influence of the outside world confined to the input/output ends of the structure. In other cases, the universe at large is approximated by an artificial absorbing boundary, or, for example for radiating antennas or communication satellites, the boundary conditions take the form of asymptotic limits imposed upon the solution. In addition, for example in an optical fiber or thin-film optics, the solution region is often broken up into subregions with their own simplified properties, and the solutions in each subregion must be joined to each other across the subregion interfaces using boundary conditions. A particular example of this use of boundary conditions is the replacement of a material with a volume polarization by a charged surface layer, or of a material with a volume magnetization by a surface current, as described in the section Bound charge and current. Topics of a general nature concerning boundary value problems include Sturm–Liouville theory, the Dirichlet, Neumann, mixed and Cauchy boundary conditions, and the Sommerfeld radiation condition. In every case, one must choose the boundary conditions appropriate to the problem being solved.
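The dependence of the interior solution on the boundary data can be illustrated with a minimal numerical sketch. It assumes a one-dimensional, charge-free region in the electrostatic limit, where the potential obeys Laplace's equation d²φ/dx² = 0. The function name `solve_laplace_1d`, the grid size and the boundary values are illustrative choices and do not come from the text.

```python
import numpy as np

def solve_laplace_1d(phi_left, phi_right, n=51):
    """Solve d^2(phi)/dx^2 = 0 on [0, 1] with Dirichlet boundary values.

    Uses the standard second-difference stencil on n equally spaced points.
    """
    m = n - 2  # number of interior unknowns
    # Tridiagonal matrix of the stencil phi[i-1] - 2*phi[i] + phi[i+1] = 0.
    A = (np.diag(-2.0 * np.ones(m))
         + np.diag(np.ones(m - 1), 1)
         + np.diag(np.ones(m - 1), -1))
    b = np.zeros(m)
    b[0] -= phi_left    # known boundary values move to the right-hand side
    b[-1] -= phi_right
    phi = np.empty(n)
    phi[0], phi[-1] = phi_left, phi_right
    phi[1:-1] = np.linalg.solve(A, b)
    return phi

# Same source-free region, two different sets of boundary data.
trivial = solve_laplace_1d(0.0, 0.0)  # the "fields vanish everywhere" solution
driven = solve_laplace_1d(0.0, 1.0)   # boundary values set by outside charges

print("max |phi| with zero boundary data:", np.abs(trivial).max())
print("midpoint potential with driven boundary:", driven[len(driven) // 2])
```

Both runs solve the same source-free equation in the same region; only the boundary values differ, and so do the interior fields.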
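Gaussian units
Gaussian units are a popular electromagnetism variant of the centimetre gram second system of units (cgs). In Gaussian units, Maxwell's equations are:
$$\nabla \cdot \mathbf{D} = 4\pi\rho_f$$
$$\nabla \cdot \mathbf{B} = 0$$
$$\nabla \times \mathbf{E} = -\frac{1}{c}\frac{\partial \mathbf{B}}{\partial t}$$
$$\nabla \times \mathbf{H} = \frac{1}{c}\frac{\partial \mathbf{D}}{\partial t} + \frac{4\pi}{c}\mathbf{J}_f$$
where c is the speed of light in a vacuum. The microscopic equations are:
$$\nabla \cdot \mathbf{E} = 4\pi\rho$$
$$\nabla \cdot \mathbf{B} = 0$$
$$\nabla \times \mathbf{E} = -\frac{1}{c}\frac{\partial \mathbf{B}}{\partial t}$$
$$\nabla \times \mathbf{B} = \frac{1}{c}\frac{\partial \mathbf{E}}{\partial t} + \frac{4\pi}{c}\mathbf{J}$$
The relation between electric displacement field, electric field and polarization density is:
$$\mathbf{D} = \mathbf{E} + 4\pi\mathbf{P}$$
And likewise the relation between magnetic induction, magnetic field and total magnetization is:
$$\mathbf{B} = \mathbf{H} + 4\pi\mathbf{M}$$
In the linear approximation, the electric susceptibility and magnetic susceptibility are defined so that:
$$\mathbf{P} = \chi_e \mathbf{E}, \qquad \mathbf{M} = \chi_m \mathbf{H}$$
(Note: although the susceptibilities are dimensionless numbers in both cgs and SI, they differ in value by a factor of 4π.) The permittivity and permeability are:
$$\varepsilon = 1 + 4\pi\chi_e, \qquad \mu = 1 + 4\pi\chi_m$$
so that
$$\mathbf{D} = \varepsilon\mathbf{E}, \qquad \mathbf{B} = \mu\mathbf{H}$$
In vacuum, ε = μ = 1, therefore D = E, and B = H. The force exerted upon a charged particle by the electric field and magnetic field is given by the Lorentz force equation:
$$\mathbf{F} = q\left(\mathbf{E} + \frac{\mathbf{v}}{c} \times \mathbf{B}\right)$$
where q is the charge on the particle and v is the particle velocity. This is slightly different from the SI-unit expression above. For example, the magnetic field B has the same units as the electric field E.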
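As a small illustration, the Gaussian-unit Lorentz force F = q(E + (v/c) × B) can be evaluated numerically. The sketch below assumes cgs inputs (charge in statcoulombs, velocity in cm/s, E and B in gaussian field units); the function name `lorentz_force_gaussian` and the sample values are hypothetical.

```python
import numpy as np

C_CGS = 2.99792458e10  # speed of light in cm/s

def lorentz_force_gaussian(q, E, v, B):
    """Return F = q * (E + (v / c) x B) in dynes, all inputs in Gaussian units."""
    E = np.asarray(E, dtype=float)
    v = np.asarray(v, dtype=float)
    B = np.asarray(B, dtype=float)
    return q * (E + np.cross(v, B) / C_CGS)

# A unit charge moving along x at one tenth of c through a unit field along z.
force = lorentz_force_gaussian(
    q=1.0,
    E=[0.0, 0.0, 0.0],
    v=[0.1 * C_CGS, 0.0, 0.0],
    B=[0.0, 0.0, 1.0],
)
print(force)  # magnetic part points along -y with magnitude q * (v/c) * B
```

Because the velocity enters only through the ratio v/c, E and B carry the same units in this system, which is the difference from the SI expression noted above.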
Alternative formulations of Maxwell's equations Special relativity motivated a compact mathematical formulation of Maxwell's equations, in terms of covariant tensors. Quantum mechanics also motivated other formulations. For example, consider a conductor moving in the field of a magnet. In the frame of the magnet, that conductor experiences a magnetic force. But in the frame of a conductor moving relative to the magnet, the conductor experiences a force due to an electric field. The following formulation shows how Maxwell's equations take the same form in any inertial coordinate system.
Covariant formulation of Maxwell's equations In special relativity, in order to more clearly express the fact that Maxwell's ('microscopic') equations take the same form in any inertial coordinate system, Maxwell's equations are written in terms of four-vectors and tensors in the "manifestly covariant" form. The purely spatial components of the following are in SI units. One ingredient in this formulation is the electromagnetic tensor, a rank-2 covariant antisymmetric tensor combining the electric and magnetic fields:
and the result of raising its indices
The other ingredient is the four-current:
where ρ is the charge density and J is the current density. With these ingredients, Maxwell's equations can be written:
and
The first tensor equation is an expression of the two inhomogeneous Maxwell's equations, Gauss's law and Ampere's law with Maxwell's correction. The second equation is an expression of the two homogeneous equations, Faraday's law of induction and Gauss's law for magnetism. The second equation is equivalent to
where
is the contravariant version of the Levi-Civita symbol, and
is the 4-gradient. In the tensor equations above, repeated indices are summed over according to Einstein summation convention. We have displayed the results in several common notations. Upper and lower components of a vector, vα and vα respectively, are interchanged with the fundamental tensor g, e.g., g = η = diag(−1, +1, +1, +1). Alternative covariant presentations of Maxwell's equations also exist, for example in terms of the four-potential.
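Potential formulation
In advanced classical mechanics and in quantum mechanics (where it is necessary) it is sometimes useful to express Maxwell's equations in a 'potential formulation' involving the electric potential (also called scalar potential), φ, and the magnetic potential (also called vector potential), A. These are defined such that:
$$\mathbf{E} = -\nabla\varphi - \frac{\partial \mathbf{A}}{\partial t}, \qquad \mathbf{B} = \nabla \times \mathbf{A}$$
With these definitions, the two homogeneous Maxwell's equations (Faraday's law and Gauss's law for magnetism) are automatically satisfied and the other two (inhomogeneous) equations give the following equations (for "Maxwell's microscopic equations", written here in SI units):
$$\nabla^2\varphi + \frac{\partial}{\partial t}\left(\nabla \cdot \mathbf{A}\right) = -\frac{\rho}{\varepsilon_0}$$
$$\left(\nabla^2\mathbf{A} - \frac{1}{c^2}\frac{\partial^2 \mathbf{A}}{\partial t^2}\right) - \nabla\left(\nabla \cdot \mathbf{A} + \frac{1}{c^2}\frac{\partial \varphi}{\partial t}\right) = -\mu_0\mathbf{J}$$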
These equations, taken together, are as powerful and complete as Maxwell's equations. Moreover, if we work only with the potentials and ignore the fields, the problem has been reduced somewhat, as the electric and magnetic fields each have three components which need to be solved for (six components altogether), while the electric and magnetic potentials have only four components altogether.
Many different choices of A and φ are consistent with a given E and B, making these choices physically equivalent – a flexibility known as gauge freedom. Suitable choice of A and φ can simplify these equations, or can adapt them to suit a particular situation.
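The two claims made about the potentials can be checked symbolically. The sketch below assumes the SI definitions E = −∇φ − ∂A/∂t and B = ∇ × A, and uses SymPy with arbitrary smooth functions; the helper names `grad`, `div`, `curl` and `fields`, and the gauge function `lam`, are illustrative. It verifies that the homogeneous equations hold identically and that the gauge transformation φ → φ − ∂λ/∂t, A → A + ∇λ leaves E and B unchanged.

```python
import sympy as sp

t, x, y, z = sp.symbols("t x y z", real=True)
coords = (x, y, z)

def grad(f):
    return [sp.diff(f, c) for c in coords]

def div(F):
    return sum(sp.diff(F[i], coords[i]) for i in range(3))

def curl(F):
    return [
        sp.diff(F[2], y) - sp.diff(F[1], z),
        sp.diff(F[0], z) - sp.diff(F[2], x),
        sp.diff(F[1], x) - sp.diff(F[0], y),
    ]

def fields(phi, A):
    """E = -grad(phi) - dA/dt and B = curl(A)."""
    E = [-g - sp.diff(a, t) for g, a in zip(grad(phi), A)]
    return E, curl(A)

# Arbitrary (unspecified) smooth potentials.
phi = sp.Function("phi")(t, x, y, z)
A = [sp.Function(f"A_{c}")(t, x, y, z) for c in "xyz"]
E, B = fields(phi, A)

# Homogeneous equations: div B = 0 and curl E + dB/dt = 0.
gauss_magnetism = sp.simplify(div(B))
faraday = [sp.simplify(c + sp.diff(b, t)) for c, b in zip(curl(E), B)]

# Gauge freedom: shift the potentials by an arbitrary function lam.
lam = sp.Function("lambda_")(t, x, y, z)
E2, B2 = fields(phi - sp.diff(lam, t), [a + g for a, g in zip(A, grad(lam))])
gauge_diff = [sp.simplify(u - w) for u, w in zip(E + B, E2 + B2)]

print(gauss_magnetism, faraday, gauge_diff)
```

All residuals reduce to zero without specifying φ, A or λ, so the result does not depend on any particular field configuration.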
Four-potential
In the Lorenz gauge, the two equations that represent the potentials can be reduced to one manifestly Lorentz invariant equation, using four-vectors: the four-current defined by
$$J^\mu = (c\rho, \mathbf{j})$$
formed from the current density j and charge density ρ, and the electromagnetic four-potential defined by
$$A^\mu = (\varphi / c, \mathbf{A})$$
formed from the vector potential A and the scalar potential φ. The resulting single equation, due to Arnold Sommerfeld, a generalization of an equation due to Bernhard Riemann and known as the Riemann–Sommerfeld equation or the covariant form of the Maxwell equations, is:
$$\Box A^\mu = \mu_0 J^\mu$$
where
$$\Box = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2$$
is the d'Alembertian operator, or four-Laplacian, sometimes written $\partial^2$ or, up to the sign convention chosen for the metric, $\partial_\alpha \partial^\alpha$, where $\partial^\alpha$ is the four-gradient.
Differential geometric formulations
In free space, where ε = ε0 and μ = μ0 are constant everywhere, Maxwell's equations simplify considerably once the language of differential geometry and differential forms is used. In what follows, cgs-Gaussian units, not SI units, are used. The electric and magnetic fields are now jointly described by a 2-form F in a 4-dimensional spacetime manifold. Maxwell's equations then reduce to the Bianchi identity
$$\mathrm{d}\mathbf{F} = 0$$
where d denotes the exterior derivative (a natural coordinate- and metric-independent differential operator acting on forms), and the source equation
$$\mathrm{d}{*\mathbf{F}} = \mathbf{J}$$
where the (dual) Hodge star operator * is a linear transformation from the space of 2-forms to the space of (4−2)-forms defined by the metric in Minkowski space (in four dimensions even by any metric conformal to this metric), and the fields are in natural units where 1/4πε0 = 1. Here, the 3-form J is called the electric current form or current 3-form, satisfying the continuity equation
$$\mathrm{d}\mathbf{J} = 0$$
The current 3-form can be integrated over a 3-dimensional space-time region. The physical interpretation of this integral is the charge in that region if it is spacelike, or the amount of charge that flows through a surface in a certain amount of time if that region is a spacelike surface cross a timelike interval. As the exterior derivative is defined on any manifold, the differential form version of the Bianchi identity makes sense for any 4-dimensional manifold, whereas the source equation is defined if the manifold is oriented and has a Lorentz metric. In particular, the differential form version of the Maxwell equations is a convenient and intuitive formulation of the Maxwell equations in general relativity. In a linear, macroscopic theory, the influence of matter on the electromagnetic field is described through a more general linear transformation in the space of 2-forms. We call
$$C : \Lambda^2 \ni \mathbf{F} \mapsto \mathbf{G} \in \Lambda^{(4-2)}$$
the constitutive transformation. The role of this transformation is comparable to the Hodge duality transformation. The Maxwell equations in the presence of matter then become:
$$\mathrm{d}\mathbf{F} = 0, \qquad \mathrm{d}\mathbf{G} = \mathbf{J}$$
where the current 3-form J still satisfies the continuity equation dJ = 0. When the fields are expressed as linear combinations (of exterior products) of basis forms $\theta^p$,
$$\mathbf{F} = \frac{1}{2} F_{pq}\, \theta^p \wedge \theta^q, \qquad \mathbf{G} = \frac{1}{2} G_{pq}\, \theta^p \wedge \theta^q$$
the constitutive relation takes the form
$$G_{pq} = C_{pq}^{mn} F_{mn}$$
where the field coefficient functions are antisymmetric in the indices and the constitutive coefficients are antisymmetric in the corresponding pairs. In particular, the Hodge duality transformation leading to the vacuum equations discussed above is obtained by taking
$$C_{pq}^{mn} = \frac{1}{2} g^{ma} g^{nb} \epsilon_{abpq} \sqrt{-g}$$
which up to scaling is the only invariant tensor of this type that can be defined with the metric. In this formulation, electromagnetism generalises immediately to any 4-dimensional oriented manifold or, with small adaptations, any manifold, requiring not even a metric. Thus the expression of Maxwell's equations in terms of differential forms leads to a further notational and conceptual simplification. Whereas Maxwell's equations could be written as two tensor equations instead of eight scalar equations, from which the propagation of electromagnetic disturbances and the continuity equation could be derived with a little effort, using differential forms leads to an even simpler derivation of these results.
Conceptual insight from this formulation
On the conceptual side, from the point of view of physics, this shows that the second and third Maxwell equations should be grouped together, be called the homogeneous ones, and be seen as geometric identities expressing nothing else than that the field F derives from a more "fundamental" potential A. The first and last ones, by contrast, should be seen as the dynamical equations of motion, obtained via the Lagrangian principle of least action from the "interaction term" A J (introduced through gauge covariant derivatives), coupling the field to matter. Often, the time derivative in the third law motivates calling this equation "dynamical", which is somewhat misleading; in the sense of the preceding analysis, this is rather an artifact of breaking relativistic covariance by choosing a preferred time direction. To have physical degrees of freedom propagated by these field equations, one must include a kinetic term F *F for A, and take into account the non-physical degrees of freedom which can be removed by the gauge transformation A → A' = A − dα.
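Geometric Algebra (GA) formulation
In geometric algebra, Maxwell's equations are reduced to a single equation,
$$\left(\frac{1}{c}\frac{\partial}{\partial t} + \boldsymbol{\nabla}\right) F = \mu_0 c J$$
where F and J are multivectors
$$F = \mathbf{E} + I c \mathbf{B}$$
and
$$J = c\rho - \mathbf{J}$$
with the unit pseudoscalar $I^2 = -1$. The GA spatial gradient operator ∇ acts on a vector field, such that
$$\boldsymbol{\nabla}\mathbf{F} = \boldsymbol{\nabla} \cdot \mathbf{F} + \boldsymbol{\nabla} \wedge \mathbf{F} = \boldsymbol{\nabla} \cdot \mathbf{F} + I\, \boldsymbol{\nabla} \times \mathbf{F}$$
In spacetime algebra using the same geometric product the equation is simply
$$\nabla F = \mu_0 c J$$
that is, the spacetime derivative of the electromagnetic field is its source. Here the (non-bold) spacetime gradient
$$\nabla = \gamma^\mu \partial_\mu$$
is a four vector, as is the current density
$$J = J^\mu \gamma_\mu$$
Expanding the geometric product of the first equation grade by grade (scalar, vector, bivector and pseudoscalar parts) reproduces the four Maxwell equations.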
Classical electrodynamics as the curvature of a line bundle
An elegant and intuitive way to formulate Maxwell's equations is to use complex line bundles or principal bundles with fibre U(1). The connection ∇ on the line bundle has a curvature F = ∇², which is a two-form that automatically satisfies dF = 0 and can be interpreted as a field strength. If the line bundle is trivial with flat reference connection d, we can write ∇ = d + A and F = dA, with A the 1-form composed of the electric potential and the magnetic vector potential. In quantum mechanics, the connection itself is used to define the dynamics of the system. This formulation allows a natural description of the Aharonov-Bohm effect. In this experiment, a static magnetic field runs through a long magnetic wire (e.g., an iron wire magnetized longitudinally). Outside of this wire the magnetic induction is zero, in contrast to the vector potential, which essentially depends on the magnetic flux through the cross-section of the wire and does not vanish outside. Since there is no electric field either, the Maxwell tensor F = 0 throughout the space-time region outside the wire during the experiment. This means by definition that the connection is flat there. However, as mentioned, the connection depends on the magnetic field through the wire, since the holonomy along a non-contractible curve encircling the wire is the magnetic flux through the wire in the proper units. This can be detected quantum-mechanically with a double-slit electron diffraction experiment on an electron wave traveling around the wire. The holonomy corresponds to an extra phase shift, which leads to a shift in the diffraction pattern.
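To give a sense of scale for this phase shift, the sketch below evaluates the standard Aharonov-Bohm expression Δφ = qΦ/ħ in SI units for a particle of charge q encircling a flux Φ. The function names are hypothetical, only the magnitude is computed (the sign depends on the sign of the charge and the orientation of the path), and the constants are the exact SI defining values.

```python
import math

PLANCK_H = 6.62607015e-34           # Planck constant in J s (exact SI value)
ELEMENTARY_CHARGE = 1.602176634e-19  # elementary charge in C (exact SI value)

def aharonov_bohm_phase(flux_weber, charge=ELEMENTARY_CHARGE):
    """Magnitude of the phase difference q * Phi / hbar, in radians."""
    hbar = PLANCK_H / (2.0 * math.pi)
    return charge * flux_weber / hbar

def fringe_shift(flux_weber, charge=ELEMENTARY_CHARGE):
    """Displacement of the pattern as a fraction of one fringe spacing."""
    return (aharonov_bohm_phase(flux_weber, charge) / (2.0 * math.pi)) % 1.0

# One flux quantum h/e shifts the phase by a full turn, so the pattern
# returns to itself; a quarter of it moves the fringes by a quarter spacing.
flux_quantum = PLANCK_H / ELEMENTARY_CHARGE
print(aharonov_bohm_phase(flux_quantum) / math.pi)
print(fringe_shift(0.25 * flux_quantum))
```

The pattern is therefore periodic in the enclosed flux with period h/e, even though the electrons only pass through the region where the field strength vanishes.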
Curved spacetime Traditional formulation Matter and energy generate curvature of spacetime. This is the subject of general relativity. Curvature of spacetime affects electrodynamics. An electromagnetic field having energy and momentum also generates curvature in spacetime. Maxwell's equations in curved spacetime can be obtained by replacing the derivatives in the equations in flat spacetime with covariant derivatives. (Whether this is the appropriate generalization requires separate investigation.) The sourced and source-free equations become (cgs-Gaussian units):
and
Here,
is a Christoffel symbol that characterizes the curvature of spacetime and Dγ is the covariant derivative.
Formulation in terms of differential forms The formulation of the Maxwell equations in terms of differential forms can be used without change in general relativity. The equivalence of the more traditional general relativistic formulation using the covariant derivative with the differential form formulation can be seen as follows. Choose local coordinates xα which gives a basis of 1forms dxα in every point of the open set where the coordinates are defined. Using this basis and cgs-Gaussian units we define
The antisymmetric infinitesimal field tensor Fαβ, corresponding to the field 2-form F
The current-vector infinitesimal 3-form J
Here g is as usual the determinant of the metric tensor gαβ. A small computation that uses the symmetry of the Christoffel symbols (i.e., the torsion-freeness of the Levi Civita connection) and the covariant constantness of the Hodge star operator then shows that in this coordinate neighborhood we have:
the Bianchi identity
the source equation
the continuity equation
Chapter-2
Navier–Stokes Existence and Smoothness
The Navier–Stokes equations are one of the pillars of fluid mechanics. These equations describe the motion of a fluid (that is, a liquid or a gas) in space. Solutions to the Navier– Stokes equations are used in many practical applications. However, theoretical understanding of the solutions to these equations is incomplete. In particular, solutions of the Navier–Stokes equations often include turbulence, which remains one of the greatest unsolved problems in physics despite its immense importance in science and engineering. Even much more basic properties of the solutions to Navier–Stokes have never been proven. For the three-dimensional system of equations, and given some initial conditions, mathematicians have not yet proved that smooth solutions always exist, or that if they do exist they have bounded kinetic energy. This is called the Navier–Stokes existence and smoothness problem. Since understanding the Navier–Stokes equations is considered to be the first step for understanding the elusive phenomenon of turbulence, the Clay Mathematics Institute offered in May 2000 a US$1,000,000 prize, not to whomever constructs a theory of turbulence but (more modestly) to the first person providing a hint on the phenomenon of turbulence. In that spirit of ideas, the Clay Institute set a concrete mathematical problem: Prove or give a counter-example of the following statement: In three space dimensions and time, given an initial velocity field, there exists a vector velocity and a scalar pressure field, which are both smooth and globally defined, that solve the Navier–Stokes equations.
The Navier–Stokes equations
In mathematics, the Navier–Stokes equations are a system of nonlinear partial differential equations for abstract vector fields of any size. In physics and engineering, they are a system of equations that models the motion of liquids or non-rarefied gases using continuum mechanics. The equations are a statement of Newton's second law, with the forces modeled according to those in a viscous Newtonian fluid, as the sum of contributions by pressure, viscous stress and an external body force. Since the setting of the problem proposed by the Clay Mathematics Institute is in three dimensions, for an incompressible and homogeneous fluid, we will consider only that case.
Let $\mathbf{v}(\mathbf{x},t)$ be a 3-dimensional vector, the velocity of the fluid, and let $p(\mathbf{x},t)$ be the pressure of the fluid. The Navier–Stokes equations are:
$$\frac{\partial \mathbf{v}}{\partial t} + (\mathbf{v} \cdot \nabla)\mathbf{v} = -\nabla p + \nu \Delta \mathbf{v} + \mathbf{f}(\mathbf{x},t)$$
where ν > 0 is the kinematic viscosity, $\mathbf{f}(\mathbf{x},t)$ is the external force, ∇ is the gradient operator and Δ is the Laplacian operator, which is also denoted by ∇ · ∇. Note that this is a vector equation, i.e. it has three scalar equations. If we write down the coordinates of the velocity and the external force
$$\mathbf{v}(\mathbf{x},t) = (v_1(\mathbf{x},t), v_2(\mathbf{x},t), v_3(\mathbf{x},t)), \qquad \mathbf{f}(\mathbf{x},t) = (f_1(\mathbf{x},t), f_2(\mathbf{x},t), f_3(\mathbf{x},t))$$
then for each i = 1,2,3 we have the corresponding scalar Navier–Stokes equation:
$$\frac{\partial v_i}{\partial t} + \sum_{j=1}^{3} v_j \frac{\partial v_i}{\partial x_j} = -\frac{\partial p}{\partial x_i} + \nu \sum_{j=1}^{3} \frac{\partial^2 v_i}{\partial x_j^2} + f_i(\mathbf{x},t)$$
The unknowns are the velocity $\mathbf{v}(\mathbf{x},t)$ and the pressure $p(\mathbf{x},t)$. Since in three dimensions we have three equations and four unknowns (three scalar velocities and the pressure), we need a supplementary equation. This extra equation is the continuity equation describing the incompressibility of the fluid:
$$\nabla \cdot \mathbf{v} = 0$$
Due to this last property, the solutions for the Navier–Stokes equations are sought in the set of "divergence-free" functions. For this flow of a homogeneous medium, density and viscosity are constants. We can eliminate the pressure p by taking the curl (alternative notation rot) of both sides of the Navier–Stokes equations. In this case the Navier–Stokes equations reduce to the vorticity transport equations. In two dimensions (2D), these equations are well known [6, p. 321]. In three dimensions (3D), it has long been known that the vorticity transport equations have additional terms [6, p. 294]. However, why are the 1D, 2D and 3D Navier–Stokes equations identical in vector form? In that case, presumably, the vorticity transport equations in vector form must be identical too.
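A concrete smooth, divergence-free solution helps to make the system tangible. The sketch below uses SymPy to check that the classical Taylor–Green vortex, extended trivially in the third coordinate, satisfies the momentum equation ∂v/∂t + (v·∇)v = −∇p + νΔv + f with f = 0 together with ∇·v = 0. The choice of this particular flow, the period 2π (rather than 1) and all variable names are assumptions made for illustration.

```python
import sympy as sp

x, y, z, t = sp.symbols("x y z t", real=True)
nu = sp.symbols("nu", positive=True)
X = (x, y, z)

# Taylor-Green vortex: a decaying array of vortices, independent of z.
decay = sp.exp(-2 * nu * t)
v = [sp.sin(x) * sp.cos(y) * decay,
     -sp.cos(x) * sp.sin(y) * decay,
     sp.Integer(0)]
p = sp.Rational(1, 4) * (sp.cos(2 * x) + sp.cos(2 * y)) * decay**2
f = [sp.Integer(0)] * 3  # no external force

def laplacian(g):
    return sum(sp.diff(g, c, 2) for c in X)

def momentum_residual(i):
    """Left side minus right side of the i-th scalar Navier-Stokes equation."""
    convective = sum(v[j] * sp.diff(v[i], X[j]) for j in range(3))
    lhs = sp.diff(v[i], t) + convective
    rhs = -sp.diff(p, X[i]) + nu * laplacian(v[i]) + f[i]
    return sp.simplify(sp.expand_trig(lhs - rhs))

residuals = [momentum_residual(i) for i in range(3)]
divergence = sp.simplify(sum(sp.diff(v[i], X[i]) for i in range(3)))

print(residuals, divergence)
```

This solution exists for all time and decays smoothly; the open problem concerns whether every admissible initial condition behaves this well.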
Two settings: unbounded and periodic space
There are two different settings for the one-million-dollar-prize Navier–Stokes existence and smoothness problem. The original problem is in the whole space $\mathbb{R}^3$, which needs extra conditions on the growth behavior of the initial condition and the solutions. In order to rule out the problems at infinity, the Navier–Stokes equations can be set in a periodic framework, which implies that we are no longer working on the whole space $\mathbb{R}^3$ but in the 3-dimensional torus $\mathbb{T}^3 = \mathbb{R}^3 / \mathbb{Z}^3$. We will treat each case separately.
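Statement of the problem in the whole space
Hypotheses and growth conditions
The initial condition $\mathbf{v}_0(\mathbf{x})$ is assumed to be a smooth and divergence-free function such that, for every multi-index α and any K > 0, there exists a constant C = C(α,K) > 0 (i.e. this "constant" depends on α and K) such that
$$|\partial^\alpha \mathbf{v}_0(\mathbf{x})| \le \frac{C}{(1 + |\mathbf{x}|)^K} \quad \text{for all } \mathbf{x} \in \mathbb{R}^3$$
The external force $\mathbf{f}(\mathbf{x},t)$ is assumed to be a smooth function as well, and satisfies a very analogous inequality (now the multi-index includes time derivatives as well):
$$|\partial^\alpha \mathbf{f}(\mathbf{x},t)| \le \frac{C}{(1 + |\mathbf{x}| + t)^K} \quad \text{for all } (\mathbf{x},t) \in \mathbb{R}^3 \times [0,\infty)$$
For physically reasonable conditions, the type of solutions expected are smooth functions that do not grow large as $|\mathbf{x}| \to \infty$. More precisely, the following assumptions are made:
1. $\mathbf{v}(\mathbf{x},t) \in \left[C^\infty(\mathbb{R}^3 \times [0,\infty))\right]^3, \quad p(\mathbf{x},t) \in C^\infty(\mathbb{R}^3 \times [0,\infty))$
2. There exists a constant $E \in (0,\infty)$ such that $\int_{\mathbb{R}^3} |\mathbf{v}(\mathbf{x},t)|^2 \, dx < E$ for all $t \ge 0$.
Condition 1 implies that the functions are smooth and globally defined and condition 2 means that the kinetic energy of the solution is globally bounded.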
The million-dollar-prize conjectures in the whole space
(A) Existence and smoothness of the Navier–Stokes solutions in $\mathbb{R}^3$
Let $\mathbf{f}(\mathbf{x},t) \equiv 0$. For any initial condition $\mathbf{v}_0(\mathbf{x})$ satisfying the above hypotheses there exist smooth and globally defined solutions to the Navier–Stokes equations, i.e. there is a velocity vector $\mathbf{v}(\mathbf{x},t)$ and a pressure p(x,t) satisfying conditions 1 and 2 above.
(B) Breakdown of the Navier–Stokes solutions in $\mathbb{R}^3$
There exist an initial condition $\mathbf{v}_0(\mathbf{x})$ and an external force $\mathbf{f}(\mathbf{x},t)$ such that there exist no solutions $\mathbf{v}(\mathbf{x},t)$ and p(x,t) satisfying conditions 1 and 2 above.
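Statement of the periodic problem
Hypotheses
The functions we seek now are periodic in the space variables with period 1. More precisely, let $e_i$ be the unit vector in the i-direction:
$$e_1 = (1,0,0), \qquad e_2 = (0,1,0), \qquad e_3 = (0,0,1)$$
Then $\mathbf{v}(\mathbf{x},t)$ is periodic in the space variables if for any i = 1,2,3 we have that
$$\mathbf{v}(\mathbf{x} + e_i, t) = \mathbf{v}(\mathbf{x},t) \quad \text{for all } (\mathbf{x},t) \in \mathbb{R}^3 \times [0,\infty)$$
Notice that we are considering the coordinates mod 1. This allows us to work not on the whole space $\mathbb{R}^3$ but on the quotient space $\mathbb{R}^3 / \mathbb{Z}^3$, which turns out to be the 3-dimensional torus
$$\mathbb{T}^3 = \{(\theta_1, \theta_2, \theta_3) : 0 \le \theta_i < 2\pi, \ i = 1,2,3\}$$
We can now state the hypotheses properly. The initial condition $\mathbf{v}_0(\mathbf{x})$ is assumed to be a smooth and divergence-free function and the external force $\mathbf{f}(\mathbf{x},t)$ is assumed to be a smooth function as well. The type of solutions that are physically relevant are those that satisfy these conditions:
3. $\mathbf{v}(\mathbf{x},t) \in \left[C^\infty(\mathbb{T}^3 \times [0,\infty))\right]^3, \quad p(\mathbf{x},t) \in C^\infty(\mathbb{T}^3 \times [0,\infty))$
4. There exists a constant $E \in (0,\infty)$ such that $\int_{\mathbb{T}^3} |\mathbf{v}(\mathbf{x},t)|^2 \, dx < E$ for all $t \ge 0$.
Just as in the previous case, condition 3 implies that the functions are smooth and globally defined and condition 4 means that the kinetic energy of the solution is globally bounded.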
The periodic million-dollar-prize conjectures
(C) Existence and smoothness of the Navier–Stokes solutions in $\mathbb{T}^3$
Let $\mathbf{f}(\mathbf{x},t) \equiv 0$. For any initial condition $\mathbf{v}_0(\mathbf{x})$ satisfying the above hypotheses there exist smooth and globally defined solutions to the Navier–Stokes equations, i.e. there is a velocity vector $\mathbf{v}(\mathbf{x},t)$ and a pressure p(x,t) satisfying conditions 3 and 4 above.
(D) Breakdown of the Navier–Stokes solutions in $\mathbb{T}^3$
There exist an initial condition $\mathbf{v}_0(\mathbf{x})$ and an external force $\mathbf{f}(\mathbf{x},t)$ such that there exist no solutions $\mathbf{v}(\mathbf{x},t)$ and p(x,t) satisfying conditions 3 and 4 above.
Partial results
1. The Navier–Stokes problem in two dimensions has already been solved positively since the 1960s: there exist smooth and globally defined solutions.
2. If the initial velocity $\mathbf{v}_0(\mathbf{x})$ is sufficiently small then the statement is true: there are smooth and globally defined solutions to the Navier–Stokes equations.
3. Given an initial velocity $\mathbf{v}_0(\mathbf{x})$ there exists a finite time T, depending on $\mathbf{v}_0(\mathbf{x})$, such that the Navier–Stokes equations on $\mathbb{R}^3 \times (0,T)$ have smooth solutions $\mathbf{v}(\mathbf{x},t)$ and p(x,t). It is not known if the solutions exist beyond that "blowup time" T.
4. The mathematician Jean Leray in 1934 proved the existence of so-called weak solutions to the Navier–Stokes equations, satisfying the equations in mean value, not pointwise.
Chapter-3
Noether's Theorem
Noether's (first) theorem states that any differentiable symmetry of the action of a physical system has a corresponding conservation law. The theorem was proved by German mathematician Emmy Noether in 1915 and published in 1918. The action of a physical system is the integral over time of a Lagrangian function (which may or may not be an integral over space of a Lagrangian density function), from which the system's behavior can be determined by the principle of least action. Noether's theorem has become a fundamental tool of modern theoretical physics and the calculus of variations. A generalization of the seminal formulations on constants of motion in Lagrangian and Hamiltonian mechanics (1788 and 1833, respectively), it does not apply to systems that cannot be modeled with a Lagrangian; for example, dissipative systems with continuous symmetries need not have a corresponding conservation law. For illustration, if a physical system behaves the same regardless of how it is oriented in space, its Lagrangian is rotationally symmetric; from this symmetry, Noether's theorem shows the angular momentum of the system must be conserved. The physical system itself need not be symmetric; a jagged asteroid tumbling in space conserves angular momentum despite its asymmetry – it is the laws of motion that are symmetric. As another example, if a physical experiment has the same outcome regardless of place or time (having the same outcome, say, somewhere in Asia on a Tuesday or in America on a Wednesday), then its Lagrangian is symmetric under continuous translations in space and time; by Noether's theorem, these symmetries account for the conservation laws of linear momentum and energy within this system, respectively. (These examples are just for illustration; in the first one, Noether's theorem added nothing new – the results were known to follow from Lagrange's equations and from Hamilton's equations.) 
Noether's theorem is important, both because of the insight it gives into conservation laws, and also as a practical calculational tool. It allows researchers to determine the conserved quantities from the observed symmetries of a physical system. Conversely, it allows researchers to consider whole classes of hypothetical Lagrangians to describe a physical system. For illustration, suppose that a new field is discovered that conserves a quantity X. Using Noether's theorem, the types of Lagrangians that conserve X because of a continuous symmetry can be determined, and then their fitness judged by other criteria.
There are numerous different versions of Noether's theorem, with varying degrees of generality. The original version only applied to ordinary differential equations (particles) and not partial differential equations (fields). The original versions also assume that the Lagrangian only depends upon the first derivative, while later versions generalize the theorem to Lagrangians depending on the nth derivative. There is also a quantum version of this theorem, known as the Ward–Takahashi identity. Generalizations of Noether's theorem to superspaces also exist.
Informal statement of the theorem
All fine technical points aside, Noether's theorem can be stated informally:
If a system has a continuous symmetry property, then there are corresponding quantities whose values are conserved in time.
A more sophisticated version of the theorem states that:
To every differentiable symmetry generated by local actions, there corresponds a conserved current.
The word "symmetry" in the above statement refers more precisely to the covariance of the form that a physical law takes with respect to a one-dimensional Lie group of transformations satisfying certain technical criteria. The conservation law of a physical quantity is usually expressed as a continuity equation. The formal proof of the theorem uses only the condition of invariance to derive an expression for a current associated with a conserved physical quantity. The conserved quantity is called the Noether charge and the flow carrying that 'charge' is called the Noether current. The Noether current is defined up to a solenoidal (divergence-free) vector field.
Historical context
A conservation law states that some quantity X describing a system remains constant throughout its motion; expressed mathematically, the rate of change of X (its derivative with respect to time) is zero:
$$\frac{dX}{dt} = 0.$$
Such quantities are said to be conserved; they are often called constants of motion, although motion per se need not be involved, just evolution in time. For example, if the energy of a system is conserved, its energy is constant at all times, which imposes a constraint on the system's motion and may help to solve for it. Aside from the insight that such constants of motion give into the nature of a system, they are a useful calculational tool; for example, an approximate solution can be corrected by finding the nearest state that satisfies the necessary conservation laws.
The earliest constants of motion discovered were momentum and energy, which were proposed in the 17th century by René Descartes and Gottfried Leibniz on the basis of collision experiments, and refined by subsequent researchers. Isaac Newton was the first to enunciate the conservation of momentum in its modern form, and showed that it was a consequence of Newton's third law; interestingly, conservation of momentum still holds even in situations when Newton's third law is incorrect. Modern physics has revealed that the conservation laws of momentum and energy are only approximately true, but their modern refinements – the conservation of four-momentum in special relativity and the zero covariant divergence of the stress-energy tensor in general relativity – are rigorously true within the limits of those theories. The conservation of angular momentum, a generalization to rotating rigid bodies, likewise holds in modern physics. Another important conserved quantity, discovered in studies of the celestial mechanics of astronomical bodies, was the Laplace–Runge–Lenz vector. In the late 18th and early 19th centuries, physicists developed more systematic methods for discovering conserved quantities. A major advance came in 1788 with the development of Lagrangian mechanics, which is related to the principle of least action. In this approach, the state of the system can be described by any type of generalized coordinates q; the laws of motion need not be expressed in a Cartesian coordinate system, as was customary in Newtonian mechanics. The action is defined as the time integral I of a function known as the Lagrangian L
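$$I = \int L(\mathbf{q}, \dot{\mathbf{q}}, t)\, dt,$$
where the dot over q signifies the rate of change of the coordinates q,
$$\dot{\mathbf{q}} = \frac{d\mathbf{q}}{dt}.$$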
Hamilton's principle states that the physical path q(t) – the one truly taken by the system – is a path for which infinitesimal variations in that path cause no change in I, at least up to first order. This principle results in the Euler–Lagrange equations
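$$\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{\mathbf{q}}} \right) = \frac{\partial L}{\partial \mathbf{q}}.$$
Thus, if one of the coordinates, say q_k, does not appear in the Lagrangian, the right-hand side of the equation is zero, and the left-hand side shows that
$$\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_k} \right) = \frac{dp_k}{dt} = 0,$$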
where the conserved momentum p_k is defined as the left-hand quantity in parentheses. The absence of the coordinate q_k from the Lagrangian implies that the Lagrangian is unaffected by changes or transformations of q_k; the Lagrangian is invariant, and is said to exhibit a kind of symmetry. This is the seed idea from which Noether's theorem was born. Several alternative methods for finding conserved quantities were developed in the 19th century, especially by William Rowan Hamilton. For example, he developed a theory of canonical transformations that allowed researchers to change coordinates so that some coordinates disappeared from the Lagrangian, resulting in conserved quantities. Another approach, and perhaps the most efficient for finding conserved quantities, is the Hamilton–Jacobi equation.
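As a worked illustration of a cyclic (absent) coordinate, consider a particle of mass m moving in a plane under a central potential V(r), written in polar coordinates (r, θ). This example is chosen here for illustration and is not taken from the original text.

```latex
L = \tfrac{1}{2} m \left( \dot{r}^2 + r^2 \dot{\theta}^2 \right) - V(r),
\qquad
\frac{\partial L}{\partial \theta} = 0
\;\Longrightarrow\;
\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{\theta}} \right)
= \frac{d}{dt}\left( m r^2 \dot{\theta} \right) = 0 .
```

Since θ does not appear in L, the conjugate momentum p_θ = m r² θ̇, which is the angular momentum about the origin, stays constant along the motion.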
Mathematical expression The essence of Noether's theorem is the following: Imagine that the action I defined above is invariant under small perturbations (warpings) of the time variable t and the generalized coordinates q; (in a notation commonly used by physicists) we write
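$$t \to t' = t + \delta t, \qquad \mathbf{q} \to \mathbf{q}' = \mathbf{q} + \delta \mathbf{q},$$
where the perturbations δt and δq are both small but variable. For generality, assume that there might be several such symmetry transformations of the action, say, N; we may use an index r = 1, 2, 3, …, N to keep track of them. Then a generic perturbation can be written as a linear sum of the individual types of perturbations
$$\delta t = \sum_r \varepsilon_r T_r, \qquad \delta \mathbf{q} = \sum_r \varepsilon_r \mathbf{Q}_r,$$
where the ε_r are infinitesimal parameters, T_r generates the change in time and Q_r generates the change in the generalized coordinates.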
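Using these definitions, Emmy Noether showed that the N quantities
$$\left( \frac{\partial L}{\partial \dot{\mathbf{q}}} \cdot \dot{\mathbf{q}} - L \right) T_r - \frac{\partial L}{\partial \dot{\mathbf{q}}} \cdot \mathbf{Q}_r$$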
are conserved, i.e., are constants of motion; this is a simple version of Noether's theorem.
Examples For illustration, consider a Lagrangian that does not depend on time, i.e., that is invariant (symmetric) under changes t → t + δt, without any change in the coordinates q. In this case, N = 1, T = 1 and Q = 0; the corresponding conserved quantity is the total energy H
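$$H = \frac{\partial L}{\partial \dot{\mathbf{q}}} \cdot \dot{\mathbf{q}} - L.$$
Similarly, consider a Lagrangian that does not depend on a coordinate q_k, i.e., that is invariant (symmetric) under changes q_k → q_k + δq_k. In that case, N = 1, T = 0, and Q_k = 1; the conserved quantity is the corresponding momentum p_k
$$p_k = \frac{\partial L}{\partial \dot{q}_k}.$$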
In special and general relativity, these apparently separate conservation laws are aspects of a single conservation law, that of the stress-energy tensor, that is derived in the next section. The conservation of the angular momentum L = r × p is slightly more complicated to derive, but analogous to its linear momentum counterpart. It is assumed that the symmetry of the Lagrangian is rotational, i.e., that the Lagrangian does not depend on the absolute orientation of the physical system in space. For concreteness, assume that the Lagrangian does not change under small rotations of an angle δθ about an axis n; such a rotation transforms the Cartesian coordinates by the equation
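$$\mathbf{r} \to \mathbf{r} + \delta\theta\, \mathbf{n} \times \mathbf{r}.$$
Since time is not being transformed, T equals zero. Taking δθ as the ε parameter and the Cartesian coordinates r as the generalized coordinates q, the corresponding Q variables are given by
$$\mathbf{Q}_r = \mathbf{n} \times \mathbf{r}.$$
Then Noether's theorem states that the following quantity is conserved:
$$\frac{\partial L}{\partial \dot{\mathbf{q}}} \cdot \mathbf{Q}_r = \mathbf{p} \cdot (\mathbf{n} \times \mathbf{r}) = \mathbf{n} \cdot (\mathbf{r} \times \mathbf{p}) = \mathbf{n} \cdot \mathbf{L}.$$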
In other words, the component of the angular momentum L along the n axis is conserved. If n is arbitrary, i.e., if the system is insensitive to any rotation, then every component of L is conserved; in short, angular momentum is conserved.
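The conservation of angular momentum under rotational symmetry can be checked numerically. The sketch below assumes a unit-mass particle in the central potential V(r) = −1/r and a velocity-Verlet integrator; the potential, the starting state and all function names are illustrative choices, not part of the original text.

```python
# Sketch: a unit-mass particle in the assumed central potential V(r) = -1/r.
# Velocity-Verlet steps preserve r x p for central forces up to rounding.

def central_force(x, y):
    """Force -grad V for V(r) = -1/r (unit mass, unit coupling)."""
    r3 = (x * x + y * y) ** 1.5
    return -x / r3, -y / r3


def angular_momentum_z(x, y, px, py):
    """z-component of L = r x p for motion in the xy-plane."""
    return x * py - y * px


def simulate(x, y, px, py, dt=1e-3, steps=20000):
    """Integrate the motion with velocity Verlet and return the final state."""
    fx, fy = central_force(x, y)
    for _ in range(steps):
        px += 0.5 * dt * fx          # half kick
        py += 0.5 * dt * fy
        x += dt * px                 # drift
        y += dt * py
        fx, fy = central_force(x, y)
        px += 0.5 * dt * fx          # second half kick
        py += 0.5 * dt * fy
    return x, y, px, py


start = (1.0, 0.0, 0.0, 0.8)         # assumed bound-orbit initial condition
end = simulate(*start)
drift = abs(angular_momentum_z(*end) - angular_momentum_z(*start))
print(f"change in L_z over the run: {drift:.2e}")
```

The force is always parallel to r, so each kick leaves r × p unchanged, and each drift moves r along p; the only change in L_z comes from floating-point rounding.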
Field-theory version
Although useful in its own right, the version of her theorem just given is a special case of the general version she derived in 1915. To give the flavor of the general theorem, a version of the Noether theorem for continuous fields in four-dimensional space-time is now given. Since field theory problems are more common in modern physics than mechanics problems, this field-theory version is the most commonly used version of Noether's theorem. Let there be a set of differentiable fields φ_k defined over all space and time; for example, the temperature T(x, t) would be representative of such a field, being a number defined at every place and time. The principle of least action can be applied to such fields, but the action is now an integral over space and time
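$$I = \int \mathcal{L}(\varphi_k, \partial_\mu \varphi_k, x^\mu)\, d^4x$$
(the theorem can be further generalized, using jet bundles, to the case where the Lagrangian depends on up to the nth derivative). Let the action be invariant under certain transformations of the space-time coordinates x^μ and the fields φ_k
$$x^\mu \to x^\mu + \delta x^\mu, \qquad \varphi_k \to \varphi_k + \delta\varphi_k,$$
where the transformations can be indexed by r = 1, 2, 3, …, N
$$\delta x^\mu = \sum_r \varepsilon_r X^\mu_r, \qquad \delta\varphi_k = \sum_r \varepsilon_r \Psi_{kr}.$$
For such systems, Noether's theorem states that there are N conserved current densities
$$j^\nu_r = -\sum_k \frac{\partial \mathcal{L}}{\partial \varphi_{k,\nu}}\, \Psi_{kr} + \sum_\sigma \left[ \sum_k \frac{\partial \mathcal{L}}{\partial \varphi_{k,\nu}}\, \varphi_{k,\sigma} - \mathcal{L}\, \delta^\nu_\sigma \right] X^\sigma_r,$$
where a comma denotes a partial derivative, φ_{k,ν} = ∂φ_k/∂x^ν. In such cases, the conservation law is expressed in a four-dimensional way
$$\sum_\nu \frac{\partial j^\nu_r}{\partial x^\nu} = 0,$$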
which expresses the idea that the amount of a conserved quantity within a sphere cannot change unless some of it flows out of the sphere. For example, electric charge is conserved; the amount of charge within a sphere cannot change unless some of the charge leaves the sphere. For illustration, consider a physical system of fields that behaves the same under translations in time and space, as considered above; in other words, the fields do not depend on the absolute position in space and time. In that case, N = 4, one for each dimension of space and time. Since only the positions in space-time are being warped, not the fields, the Ψ are all zero and the X^μ_ν equal the Kronecker delta δ^μ_ν, where we have used μ instead of r for the index. In that case, Noether's theorem corresponds to the conservation law for the stress-energy tensor T_μ^ν
$$T_\mu{}^\nu = \sum_k \frac{\partial \mathcal{L}}{\partial \varphi_{k,\nu}}\, \varphi_{k,\mu} - \mathcal{L}\, \delta^\nu_\mu,$$
where a comma denotes a partial derivative with respect to the coordinate that follows it.
The conservation of electric charge can be derived by considering transformations of the fields themselves. In quantum mechanics, the probability amplitude ψ(x) of finding a particle at a point x is a complex field, because it ascribes a complex number to every point in space and time. The probability amplitude itself is physically unmeasurable; only the probability p = |ψ|^2 is directly measurable. Therefore, the system is invariant under transformations of the ψ field and its complex conjugate field ψ* that leave |ψ|^2 unchanged, such as the phase rotation
$$\psi \to e^{i\theta} \psi, \qquad \psi^* \to e^{-i\theta} \psi^*.$$
In the limit when θ becomes infinitesimally small (δθ), it may be taken as the ε, and the Ψ are equal to iψ and −iψ*, respectively. A specific example is the Klein–Gordon equation, the relativistically correct version of the Schrödinger equation for spinless particles, which has the Lagrangian density
In this case, Noether's theorem states that the conserved current equals
which, when multiplied by the charge on that type of particle, equals the electric current density due to that type of particle. This transformation was first noted by Hermann Weyl and is one of the fundamental gauge symmetries of modern physics.
Derivations One independent variable Consider the simplest case, a system with one independent variable, time. Suppose the dependent variables are such that the action integral
is invariant under brief infinitesimal variations in the dependent variables. In other words, they satisfy the Euler–Lagrange equations
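And suppose that the integral is invariant under a continuous symmetry. Mathematically such a symmetry is represented as a flow, φ, which acts on the variables as follows
$$t \to t' = t + \varepsilon T, \qquad \mathbf{q}[t] \to \mathbf{q}'[t'] = \varphi[\mathbf{q}[t], \varepsilon] = \varphi[\mathbf{q}[t' - \varepsilon T], \varepsilon],$$
where ε is a real variable indicating the amount of flow and T is a real constant (which could be zero) indicating how much the flow shifts time.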
The action integral flows to
which may be regarded as a function of ε. Calculating the derivative at ε = 0 and using the symmetry, we get
Notice that the Euler–Lagrange equations imply
Substituting this into the previous equation, one gets
Again using the Euler–Lagrange equations we get
Substituting this into the previous equation, one gets
From which one can see that
is a constant of the motion, i.e. a conserved quantity. Since
, we get
and so the conserved quantity simplifies to
To avoid excessive complication of the formulas, this derivation assumed that the flow does not change as time passes. The same result can be obtained in the more general case.
Field-theoretic derivation
Noether's theorem may also be derived for tensor fields φ^A where the index A ranges over the various components of the various tensor fields. These field quantities are functions defined over a four-dimensional space whose points are labeled by coordinates x^μ where the index μ ranges over time (μ = 0) and three spatial dimensions (μ = 1, 2, 3). These four coordinates are the independent variables, and the values of the fields at each event are the dependent variables. Under an infinitesimal transformation, the variation in the coordinates is written
$$x^\mu \to \xi^\mu = x^\mu + \delta x^\mu,$$
whereas the transformation of the field variables is expressed as
$$\varphi^A \to \alpha^A(\xi^\mu) = \varphi^A(x^\mu) + \delta\varphi^A(x^\mu).$$
By this definition, the field variations δφA result from two factors: intrinsic changes in the field themselves and changes in coordinates, since the transformed field αA depends on the transformed coordinates ξμ. To isolate the intrinsic changes, the field variation at a single point xμ may be defined
If the coordinates are changed, the boundary of the region of space-time over which the Lagrangian is being integrated also changes; the original boundary and its transformed version are denoted as Ω and Ω’, respectively. Noether's theorem begins with the assumption that a specific transformation of the coordinates and field variables does not change the action, which is defined as the integral of the Lagrangian density over the given region of spacetime. Expressed mathematically, this assumption may be written as
where the comma subscript indicates a partial derivative with respect to the coordinate(s) that follows the comma, e.g.
Since ξ is a dummy variable of integration, and since the change in the boundary Ω is infinitesimal by assumption, the two integrals may be combined using the four-dimensional version of the divergence theorem into the following form
The difference in Lagrangians can be written to first-order in the infinitesimal variations as
However, because the variations are defined at the same point as described above, the variation and the derivative can be done in reverse order; they commute
Using the Euler–Lagrange field equations
the difference in Lagrangians can be written neatly as
Thus, the change in the action can be written as
Since this holds for any region Ω, the integrand must be zero
For any combination of the various symmetry transformations, the perturbation can be written
where $\mathcal{L}_X \varphi^A$ is the Lie derivative of φ^A in the X^μ direction. When φ^A is a scalar or $X^\mu{}_{,\nu} = 0$,
$$\mathcal{L}_X \varphi^A = \frac{\partial \varphi^A}{\partial x^\mu}\, X^\mu.$$
Differentiating the above divergence with respect to ε at ε=0 and changing the sign yields the conservation law
where the conserved current equals
Manifold/fiber bundle derivation
Suppose we have an n-dimensional oriented Riemannian manifold, M, and a target manifold T. Let $\mathcal{C}$ be the configuration space of smooth functions from M to T. (More generally, we can have smooth sections of a fiber bundle over M.) Examples of this M in physics include:
In classical mechanics, in the Hamiltonian formulation, M is the one-dimensional manifold R, representing time, and the target space is the cotangent bundle of the space of generalized positions.
In field theory, M is the spacetime manifold and the target space is the set of values the fields can take at any given point. For example, if there are m real-valued scalar fields, φ_1, …, φ_m, then the target manifold is R^m. If the field is a real vector field, then the target manifold is isomorphic to R^3.
Now suppose there is a functional
$$\mathcal{S} : \mathcal{C} \to \mathbb{R},$$
called the action. (Note that it takes values in $\mathbb{R}$, rather than $\mathbb{C}$; this is for physical reasons, and doesn't really matter for this proof.)
To get to the usual version of Noether's theorem, we need additional restrictions on the action. We assume the action $\mathcal{S}[\varphi]$ is the integral over M of a function
$$\mathcal{L}(\varphi, \partial_\mu \varphi, x),$$
called the Lagrangian density, depending on φ, its derivative and the position. In other words, for φ in the configuration space $\mathcal{C}$,
$$\mathcal{S}[\varphi] = \int_M \mathcal{L}[\varphi(x), \partial_\mu \varphi(x), x]\, d^n x.$$
Suppose we are given boundary conditions, i.e., a specification of the value of φ at the boundary if M is compact, or some limit on φ as x approaches ∞. Then the subspace of $\mathcal{C}$ consisting of functions φ such that all functional derivatives of $\mathcal{S}$ at φ are zero, that is:
$$\frac{\delta \mathcal{S}[\varphi]}{\delta \varphi(x)} = 0,$$
and that satisfies the given boundary conditions, is the subspace of on-shell solutions. Now, suppose we have an infinitesimal transformation on $\mathcal{C}$, generated by a functional derivation, Q, such that
for all compact submanifolds N or in other words,
for all x, where we set
If this holds on shell and off shell, we say Q generates an off-shell symmetry. If this only holds on shell, we say Q generates an on-shell symmetry. Then, we say Q is a generator of a one parameter symmetry Lie group. Now, for any N, because of the Euler–Lagrange theorem, on shell (and only on-shell), we have
Since this is true for any N, we have
But this is the continuity equation for the current
defined by:
which is called the Noether current associated with the symmetry. The continuity equation tells us that if we integrate this current over a space-like slice, we get a conserved quantity called the Noether charge (provided, of course, if M is noncompact, the currents fall off sufficiently fast at infinity).
Comments Noether's theorem is really a reflection of the relation between the boundary conditions and the variational principle. Assuming no boundary terms in the action, Noether's theorem implies that
Noether's theorem is an on-shell theorem. The quantum analogs of Noether's theorem are the Ward–Takahashi identities.
Generalization to Lie algebras
Suppose we have two symmetry derivations Q_1 and Q_2. Then, [Q_1, Q_2] is also a symmetry derivation. Let's see this explicitly. Let's say
and
Then,
where f12=Q1[f2μ]-Q2[f1μ]. So,
This shows we can extend Noether's theorem to larger Lie algebras in a natural way.
Generalization of the proof This applies to any local symmetry derivation Q satisfying , and also to more general local functional differentiable actions, including ones where the Lagrangian depends on higher derivatives of the fields. Let ε be any arbitrary smooth function of the spacetime (or time) manifold such that the closure of its support is disjoint from the boundary. ε is a test function. Then, because of the variational principle (which does not apply to the boundary, by the way), the derivation distribution q generated by q[ε][Φ(x)] = ε(x)Q[Φ(x)] satisfies for any ε, or more compactly, for all x not on the boundary (but remember that q(x) is a shorthand for a derivation distribution, not a derivation parametrized by x in general). This is the generalization of Noether's theorem. To see how the generalization related to the version given above, assume that the action is the spacetime integral of a Lagrangian that only depends on and its first derivatives. Also, assume
Then,
for all ε. More generally, if the Lagrangian depends on higher derivatives, then
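Examples
Example 1: Conservation of energy
Consider the specific case of a Newtonian particle of mass m, coordinate x, moving under the influence of a potential V, coordinatized by time t. The action, S, is:
$$S[x] = \int L[x(t), \dot{x}(t)]\, dt = \int \left( \frac{m}{2} \dot{x}^2 - V(x(t)) \right) dt.$$
Consider the generator of time translations, Q = ∂/∂t. In other words, Q[x(t)] = ẋ(t). Note that x has an explicit dependence on time, whilst V does not; consequently:
$$Q[L] = m \dot{x} \ddot{x} - \frac{\partial V}{\partial x}\, \dot{x} = \frac{d}{dt} \left[ \frac{m}{2} \dot{x}^2 - V(x) \right],$$
so, writing f for the function whose total time derivative equals Q[L], we can set
$$f = \frac{m}{2} \dot{x}^2 - V(x).$$
Then,
$$j = \frac{\partial L}{\partial \dot{x}}\, Q[x] - f = m \dot{x}^2 - \left[ \frac{m}{2} \dot{x}^2 - V(x) \right] = \frac{m}{2} \dot{x}^2 + V(x).$$
The right hand side is the energy, and Noether's theorem states that dj/dt = 0 (i.e. the principle of conservation of energy is a consequence of invariance under time translations). More generally, if the Lagrangian does not depend explicitly on time, the quantity
$$\frac{\partial L}{\partial \dot{x}}\, \dot{x} - L$$
(called the Hamiltonian) is conserved.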
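The energy conservation in Example 1 can be observed numerically. The following minimal sketch assumes the quartic potential V(x) = x⁴/4, unit mass and a velocity-Verlet integrator; these choices and the function names are illustrative assumptions, not taken from the text.

```python
def potential(x):
    """Assumed example potential V(x) = x**4 / 4."""
    return 0.25 * x ** 4


def force(x):
    """F = -dV/dx for the assumed potential."""
    return -x ** 3


def energy(x, v, m=1.0):
    """Noether charge for time translations: j = m v^2 / 2 + V(x)."""
    return 0.5 * m * v * v + potential(x)


def evolve(x, v, m=1.0, dt=1e-3, steps=10000):
    """Advance (x, v) with velocity Verlet; V has no explicit time dependence."""
    f = force(x)
    for _ in range(steps):
        v += 0.5 * dt * f / m
        x += dt * v
        f = force(x)
        v += 0.5 * dt * f / m
    return x, v


x0, v0 = 1.0, 0.0
x1, v1 = evolve(x0, v0)
relative_drift = abs(energy(x1, v1) - energy(x0, v0)) / energy(x0, v0)
print(f"relative energy drift: {relative_drift:.2e}")
```

The integrator is only approximate, so the energy is not constant to machine precision; the drift shrinks as the time step dt is reduced, which is consistent with exact conservation in the continuous system.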
Example 2: Conservation of center of momentum Still considering 1-dimensional time, let
i.e. N Newtonian particles where the potential only depends pairwise upon the relative displacement. For Q, let's consider the generator of Galilean transformations (i.e. a change in the frame of reference). In other words,
Note that
This has the form of
so we can set
Then,
$$j = P t - M x_{CM},$$
where P is the total momentum, M is the total mass and x_CM is the center of mass. Noether's theorem states:
$$\frac{dj}{dt} = 0 \;\Longrightarrow\; P - M \dot{x}_{CM} = 0.$$
Example 3: Conformal transformation Both examples 1 and 2 are over a 1-dimensional manifold (time). An example involving spacetime is a conformal transformation of a massless real scalar field with a quartic potential in (3 + 1)-Minkowski spacetime.
For Q, consider the generator of a spacetime rescaling. In other words,
The second term on the right hand side is due to the "conformal weight" of φ. Note that
This has the form of
(where we have performed a change of dummy indices) so set
Then,
Noether's theorem states that $\partial_\mu j^\mu = 0$ (as one may explicitly check by substituting the Euler–Lagrange equations into the left hand side). (Aside: If one tries to find the Ward–Takahashi analog of this equation, one runs into a problem because of anomalies.)
Applications
Application of Noether's theorem allows physicists to gain powerful insights into any general theory in physics, by just analyzing the various transformations that would make the form of the laws involved invariant. For example:
the invariance of physical systems with respect to spatial translation (in other words, that the laws of physics do not vary with locations in space) gives the law of conservation of linear momentum;
invariance with respect to rotation gives the law of conservation of angular momentum;
invariance with respect to time translation gives the well-known law of conservation of energy.
In quantum field theory, the analog to Noether's theorem, the Ward–Takahashi identity, yields further conservation laws, such as the conservation of electric charge from the invariance with respect to a change in the phase factor of the complex field of the charged particle and the associated gauge of the electric potential and vector potential. The Noether charge is also used in calculating the entropy of stationary black holes.
Chapter-4
Method of Characteristics and Method of Lines
Method of characteristics In mathematics, the method of characteristics is a technique for solving partial differential equations. Typically, it applies to first-order equations, although more generally the method of characteristics is valid for any hyperbolic partial differential equation. The method is to reduce a partial differential equation to a family of ordinary differential equations along which the solution can be integrated from some initial data given on a suitable hypersurface.
Characteristics of first-order partial differential equations
For a first-order PDE, the method of characteristics discovers curves (called characteristic curves or just characteristics) along which the PDE becomes an ordinary differential equation (ODE). Once the ODE is found, it can be solved along the characteristic curves and transformed into a solution for the original PDE. For the sake of motivation, we confine our attention to the case of a function of two independent variables x and y for the moment. Consider a quasilinear PDE of the form
$$a(x,y,u)\, \frac{\partial u}{\partial x} + b(x,y,u)\, \frac{\partial u}{\partial y} = c(x,y,u). \qquad (1)$$
Suppose that a solution u is known, and consider the surface graph z = u(x,y) in R^3. A normal vector to this surface is given by
$$\left( \frac{\partial u}{\partial x}(x,y),\ \frac{\partial u}{\partial y}(x,y),\ -1 \right).$$
As a result, equation (1) is equivalent to the geometrical statement that the vector field
$$\big( a(x,y,z),\ b(x,y,z),\ c(x,y,z) \big)$$
is tangent to the surface z = u(x,y) at every point. In other words, the graph of the solution must be a union of integral curves of this vector field. These integral curves are called the characteristic curves of the original partial differential equation. The equations of the characteristic curve may be expressed invariantly by the Lagrange–Charpit equations
$$\frac{dx}{a(x,y,z)} = \frac{dy}{b(x,y,z)} = \frac{dz}{c(x,y,z)},$$
or, if a particular parametrization t of the curves is fixed, then these equations may be written as a system of ordinary differential equations for x(t), y(t), z(t):
$$\frac{dx}{dt} = a(x,y,z), \qquad \frac{dy}{dt} = b(x,y,z), \qquad \frac{dz}{dt} = c(x,y,z).$$
These are the characteristic equations for the original system.
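To make the characteristic system concrete, the sketch below integrates dx/dt = a, dy/dt = b, dz/dt = c with a classical Runge–Kutta step for the assumed quasilinear PDE u_x + u_y = u, with assumed data u(x, 0) = sin x. For this choice the exact solution is u(x, y) = sin(x − y) e^y, which gives a check on the traced curve. The PDE, the data and the function names are illustrative assumptions.

```python
import math


def rk4_step(rhs, state, h):
    """One classical fourth-order Runge-Kutta step for d(state)/dt = rhs(state)."""
    k1 = rhs(state)
    k2 = rhs([s + 0.5 * h * k for s, k in zip(state, k1)])
    k3 = rhs([s + 0.5 * h * k for s, k in zip(state, k2)])
    k4 = rhs([s + h * k for s, k in zip(state, k3)])
    return [s + h * (a + 2 * b + 2 * c + d) / 6
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def characteristic_rhs(state):
    """(a, b, c) for the assumed PDE u_x + u_y = u, i.e. a = 1, b = 1, c = z."""
    x, y, z = state
    return [1.0, 1.0, z]


def initial_data(x0):
    """Assumed data on the line y = 0: u(x, 0) = sin(x)."""
    return math.sin(x0)


def trace_characteristic(x0, t_end, steps=100):
    """Follow the characteristic that starts at (x0, 0, u(x0, 0))."""
    state = [x0, 0.0, initial_data(x0)]
    h = t_end / steps
    for _ in range(steps):
        state = rk4_step(characteristic_rhs, state, h)
    return state


x, y, z = trace_characteristic(0.5, 1.0)
exact = initial_data(x - y) * math.exp(y)
print(f"traced u = {z:.10f}, exact u = {exact:.10f}")
```

Repeating the trace for many starting points x0 sweeps out the solution surface z = u(x, y) as a union of characteristic curves.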
Linear and quasilinear cases
Consider now a PDE of the form
$$\sum_{i=1}^{n} a_i(x_1,\dots,x_n,u)\, \frac{\partial u}{\partial x_i} = c(x_1,\dots,x_n,u).$$
For this PDE to be linear, the coefficients a_i may be functions of the spatial variables only, and independent of u. For it to be quasilinear, a_i may also depend on the value of the function, but not on any derivatives. The distinction between these two cases is inessential for the discussion here. For a linear or quasilinear PDE, the characteristic curves are given parametrically by
$$(x_1,\dots,x_n,u) = (x_1(s),\dots,x_n(s),u(s)),$$
such that the following system of ODEs is satisfied:
$$\frac{dx_i}{ds} = a_i(x_1,\dots,x_n,u), \qquad (2)$$
$$\frac{du}{ds} = c(x_1,\dots,x_n,u). \qquad (3)$$
Equations (2) and (3) give the characteristics of the PDE.
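Fully nonlinear case
Consider the partial differential equation
$$F(x_1,\dots,x_n,u,p_1,\dots,p_n) = 0,$$
where the variables p_i are shorthand for the partial derivatives
$$p_i = \frac{\partial u}{\partial x_i}.$$
Let (x_i(s), u(s), p_i(s)) be a curve in R^{2n+1}. Suppose that u is any solution, and that
$$u(s) = u(x_1(s),\dots,x_n(s)).$$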
Along a solution, differentiating (1) with respect to s gives
(The second equation follows from applying the chain rule to a solution u, and the third follows by taking an exterior derivative of the relation du-Σpidxi=0.) Manipulating these equations gives
where λ is a constant. Writing these equations more symmetrically, one obtains the Lagrange-Charpit equations for the characteristic
Geometrically, the method of characteristics in the fully nonlinear case can be interpreted as requiring that the Monge cone of the differential equation should everywhere be tangent to the graph of the solution.
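Example
As an example, consider the advection equation (this example assumes familiarity with PDE notation, and solutions to basic ODEs).
$$a\, \frac{\partial u}{\partial x} + \frac{\partial u}{\partial t} = 0,$$
where a is constant and u is a function of x and t. We want to transform this linear first-order PDE into an ODE along the appropriate curve; i.e. something of the form
$$\frac{d}{ds}\, u(x(s), t(s)) = F(u, x(s), t(s)),$$
where (x(s), t(s)) is a characteristic line. First, we find
$$\frac{d}{ds}\, u(x(s), t(s)) = \frac{\partial u}{\partial x} \frac{dx}{ds} + \frac{\partial u}{\partial t} \frac{dt}{ds}$$
by the chain rule. Now, if we set dx/ds = a and dt/ds = 1, we get
$$a\, \frac{\partial u}{\partial x} + \frac{\partial u}{\partial t},$$
which is the left hand side of the PDE we started with. Thus
$$\frac{d}{ds}\, u = a\, \frac{\partial u}{\partial x} + \frac{\partial u}{\partial t} = 0.$$
So, along the characteristic line (x(s), t(s)), the original PDE becomes the ODE du/ds = F(u, x(s), t(s)) = 0. That is to say that along the characteristics, the solution is constant. Thus, u(x_s, t_s) = u(x_0, 0), where (x_s, t_s) and (x_0, 0) lie on the same characteristic. So to determine the general solution, it is enough to find the characteristics by solving the characteristic system of ODEs:
dt/ds = 1; letting t(0) = 0, we know t = s,
dx/ds = a; letting x(0) = x_0, we know x = as + x_0 = at + x_0,
du/ds = 0; letting u(0) = f(x_0), we know u(x(t), t) = f(x_0) = f(x − at).
In this case, the characteristic lines are straight lines with slope a, and the value of u remains constant along any characteristic line.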
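The result of the advection example, that u is carried unchanged along straight characteristic lines, can be written as a few lines of code. The sketch assumes an advection speed a = 2 and a Gaussian initial profile f; both are illustrative choices, as are the function names.

```python
import math

A = 2.0  # assumed constant advection speed a


def f(x0):
    """Assumed initial profile u(x, 0) = f(x)."""
    return math.exp(-x0 * x0)


def u(x, t):
    """Value carried along the characteristic x = a t + x0, i.e. f(x - a t)."""
    return f(x - A * t)


def pde_residual(x, t, h=1e-5):
    """Central-difference estimate of a u_x + u_t, which should be near zero."""
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    return A * u_x + u_t


x0, t = 0.3, 0.7
print(u(x0 + A * t, t), f(x0))    # same value along one characteristic
print(pde_residual(0.3, 0.7))     # small, limited by finite differences
```

The first print compares the solution at a later point on a characteristic with the initial value at its foot; the second estimates how well the formula satisfies the PDE.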
Characteristics of linear differential operators
Let X be a differentiable manifold and P a linear differential operator
$$P : C^\infty(X) \to C^\infty(X)$$
of order k. In a local coordinate system x^i,
$$P = \sum_{|\alpha| \le k} P^\alpha(x)\, \frac{\partial}{\partial x^\alpha},$$
in which α denotes a multi-index. The principal symbol of P, denoted σ_P, is the function on the cotangent bundle T*X defined in these local coordinates by
$$\sigma_P(x,\xi) = \sum_{|\alpha| = k} P^\alpha(x)\, \xi_\alpha,$$
where the ξ_i are the fiber coordinates on the cotangent bundle induced by the coordinate differentials dx^i. Although this is defined using a particular coordinate system, the transformation law relating the ξ_i and the x^i ensures that σ_P is a well-defined function on the cotangent bundle. The function σ_P is homogeneous of degree k in the ξ variable. The zeros of σ_P, away from the zero section of T*X, are the characteristics of P. A hypersurface of X defined by the equation F(x) = c is called a characteristic hypersurface at x if σ_P(x, dF(x)) = 0. Invariantly, a characteristic hypersurface is a hypersurface whose conormal bundle is in the characteristic set of P.
Qualitative analysis of characteristics Characteristics are also a powerful tool for gaining qualitative insight into a PDE.
One can use the crossings of the characteristics to find shock waves. Intuitively, we can think of each characteristic line as implying a solution for u along itself. Thus, when two characteristics cross, two solutions are implied. This causes shock waves, and the solution u becomes a multivalued function. Solving PDEs with this behavior is a very difficult problem and an active area of research.
Characteristics may fail to cover part of the domain of the PDE. This is called a rarefaction, and indicates the solution typically exists only in a weak, i.e. integral equation, sense.
The direction of the characteristic lines indicates the flow of values through the solution, as the example above demonstrates. This kind of knowledge is useful when solving PDEs numerically, as it can indicate which finite difference scheme is best for the problem.
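The crossing of characteristics can be illustrated with a small calculation. The sketch assumes the inviscid Burgers equation u_t + u u_x = 0, which is not discussed in the text above and is used here only as a standard example: its characteristics are the straight lines x = x_0 + f(x_0) t, where f is the initial profile. The profile f(x_0) = −tanh(x_0) and the function names are also assumptions.

```python
import math


def speed(x0):
    """Assumed initial data f(x0) = -tanh(x0); it sets the characteristic speed."""
    return -math.tanh(x0)


def crossing_time(xa, xb):
    """Time at which the characteristics starting at xa < xb meet.

    Each characteristic is the line x = x0 + speed(x0) * t. Two of them meet
    only if the one behind is faster; otherwise they never cross.
    """
    closing_rate = speed(xa) - speed(xb)
    if closing_rate <= 0:
        return math.inf
    return (xb - xa) / closing_rate


# Neighbouring characteristics near x0 = 0, where the profile is steepest.
print(crossing_time(-1e-3, 1e-3))
# A pair further out crosses later.
print(crossing_time(1.0, 2.0))
```

For this profile the earliest crossing happens between characteristics starting near x_0 = 0, at a time close to 1; after that time the single-valued solution built from characteristics breaks down and a shock forms.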
Method of lines
The method of lines (MOL, NMOL, NUMOL) (Schiesser, 1991; Hamdi et al., 2007; Schiesser, 2009) is a technique for solving partial differential equations (PDEs) in which all but one dimension is discretized. MOL allows standard, general-purpose methods and software, developed for the numerical integration of ODEs and DAEs, to be used. A large number of integration routines have been developed over the years in many different programming languages, and some have been published as open source resources; see for example Lee and Schiesser (2004). The method of lines most often refers to the construction or analysis of numerical methods for partial differential equations that proceeds by first discretizing the spatial derivatives only and leaving the time variable continuous. This leads to a system of ordinary differential equations to which a numerical method for initial value ordinary differential equations can be applied. The method of lines in this context dates back to at least the early 1960s (Sarmin and Chudov). Many papers discussing the accuracy and stability of the method of lines for various types of partial differential equations have appeared since (for example Zafarullah, or Verwer and Sanz-Serna). W. E. Schiesser of Lehigh University is one of the major proponents of the method of lines, having published widely in this field.
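A minimal sketch of the method of lines follows, assuming the heat equation u_t = u_xx on [0, 1] with u = 0 at both ends and initial data sin(πx), for which the exact solution is sin(πx) exp(−π² t). Space is discretized with second-order central differences, and the resulting ODE system is advanced with a hand-written classical Runge–Kutta step. The test problem, grid sizes and function names are illustrative assumptions; in practice a library ODE integrator would normally take the place of `rk4_step`.

```python
import math


def heat_rhs(u, dx):
    """Semi-discrete heat equation: du_i/dt from central differences in x.

    The end values are held fixed (their time derivative is zero).
    """
    du = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        du[i] = (u[i - 1] - 2.0 * u[i] + u[i + 1]) / (dx * dx)
    return du


def rk4_step(u, dt, dx):
    """One classical Runge-Kutta step for the ODE system du/dt = heat_rhs(u)."""
    k1 = heat_rhs(u, dx)
    k2 = heat_rhs([a + 0.5 * dt * b for a, b in zip(u, k1)], dx)
    k3 = heat_rhs([a + 0.5 * dt * b for a, b in zip(u, k2)], dx)
    k4 = heat_rhs([a + dt * b for a, b in zip(u, k3)], dx)
    return [a + dt * (p + 2 * q + 2 * r + s) / 6
            for a, p, q, r, s in zip(u, k1, k2, k3, k4)]


def solve_heat(n=21, t_end=0.1, dt=5e-4):
    """Discretize x on n points, then integrate the ODEs in time up to t_end."""
    dx = 1.0 / (n - 1)
    x = [i * dx for i in range(n)]
    u = [math.sin(math.pi * xi) for xi in x]
    for _ in range(round(t_end / dt)):
        u = rk4_step(u, dt, dx)
    return x, u


x, u = solve_heat()
mid = len(x) // 2
exact_mid = math.exp(-math.pi ** 2 * 0.1)   # exact value at x = 0.5, t = 0.1
print(f"MOL: {u[mid]:.5f}   exact: {exact_mid:.5f}")
```

Only the spatial derivative is replaced by a difference formula; time stays continuous until the ODE integrator is applied, which is the defining feature of the approach. The time step must respect the stability limit of the explicit integrator for the chosen grid spacing.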
Application to elliptic equations
MOL requires that the PDE problem is well-posed as an initial value (Cauchy) problem in at least one dimension, because ODE and DAE integrators are initial value problem (IVP) solvers. Thus it cannot be used directly on purely elliptic equations, such as Laplace's equation. However, MOL has been used to solve Laplace's equation by using the method of false transients (Schiesser, 1991; Schiesser, 1994). In this method, a time derivative of the dependent variable is added to Laplace's equation. Finite differences are then used to approximate the spatial derivatives, and the resulting system of equations is solved by MOL. It is also possible to solve elliptic problems by a semi-analytical method of lines (Subramaniana, 2004). In this method the discretization process results in a set of ODEs that are solved by exploiting properties of the associated exponential matrix.
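The method of false transients can be sketched as follows: a time derivative is added to Laplace's equation, giving u_t = u_xx + u_yy, and the resulting system is marched forward until it stops changing. The sketch assumes the unit square, boundary values taken from the harmonic function xy, a forward-Euler time step and a simple stopping tolerance; all of these, and the function name, are illustrative assumptions rather than the procedure of the cited references.

```python
def solve_laplace_false_transient(n=11, tol=1e-12, max_steps=200000):
    """March u_t = u_xx + u_yy to steady state on an n-by-n grid.

    The steady state satisfies the discrete Laplace equation. Boundary values
    come from the assumed harmonic test function u(x, y) = x * y.
    """
    h = 1.0 / (n - 1)
    dt = 0.2 * h * h          # below the forward-Euler limit h*h/4 in 2D
    u = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i in (0, n - 1) or j in (0, n - 1):
                u[i][j] = (i * h) * (j * h)
    for _ in range(max_steps):
        new = [row[:] for row in u]
        change = 0.0
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                lap = (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1]
                       - 4.0 * u[i][j]) / (h * h)
                new[i][j] = u[i][j] + dt * lap
                change = max(change, abs(new[i][j] - u[i][j]))
        u = new
        if change < tol:       # the false transient has died out
            break
    return u, h


u, h = solve_laplace_false_transient()
print(u[5][5])                 # compare with x * y at (0.5, 0.5)
```

The time variable here has no physical meaning; it only supplies the initial value structure that ODE integrators need, and the answer of interest is the state reached once the transient has decayed.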
Chapter-5
Ricci Flow
Several stages of Ricci flow on a 2D manifold.
In differential geometry, the Ricci flow is an intrinsic geometric flow. It is a process that deforms the metric of a Riemannian manifold in a way formally analogous to the diffusion of heat, smoothing out irregularities in the metric. The Ricci flow was first introduced by Richard Hamilton in 1981, and is also referred to as the Ricci-Hamilton flow. It plays an important role in Grigori Perelman's solution of the Poincaré conjecture, as well as in the proof of the Differentiable sphere theorem by Brendle and Schoen.
Mathematical definition Given a Riemannian manifold with metric tensor gij, we can compute the Ricci tensor Rij, which collects averages of sectional curvatures into a kind of "trace" of the Riemann curvature tensor. If we consider the metric tensor (and the associated Ricci tensor) to be functions of a variable which is usually called "time" (but which may have nothing to do with any physical time), then the Ricci flow may be defined by the geometric evolution equation
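$$\partial_t g_{ij} = -2 R_{ij}.$$
The normalized Ricci flow makes sense for compact manifolds and is given by the equation
$$\partial_t g_{ij} = -2 R_{ij} + \frac{2}{n} R_{\mathrm{avg}}\, g_{ij},$$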
where Ravg is the average (mean) of the scalar curvature (which is obtained from the Ricci tensor by taking the trace) and n is the dimension of the manifold. This normalized equation preserves the volume of the metric. The factor of −2 is of little significance, since it can be changed to any nonzero real number by rescaling t. However the minus sign ensures that the Ricci flow is well defined for sufficiently small positive times; if the sign is changed then the Ricci flow would usually only be defined for small negative times. (This is similar to the way in which the heat equation can be run forwards in time, but not usually backwards in time.) Informally, the Ricci flow tends to expand negatively curved regions of the manifold, and contract positively curved regions.
Examples
If the manifold is Euclidean space, or more generally Ricci-flat, then Ricci flow leaves the metric unchanged. Conversely, any metric unchanged by Ricci flow is Ricci-flat.

If the manifold is a sphere (with the usual metric) then Ricci flow collapses the manifold to a point in finite time. If the sphere has radius 1 and dimension n, then after time t the metric will be multiplied by $1 - 2t(n - 1)$, so the manifold will collapse after time $1/(2(n - 1))$.

More generally, if the manifold is an Einstein manifold (Ricci = constant × metric), then Ricci flow will collapse it to a point if it has positive curvature, leave it invariant if it has zero curvature, and expand it if it has negative curvature. For a compact Einstein manifold, the metric is unchanged under normalized Ricci flow. Conversely, any metric unchanged by normalized Ricci flow is Einstein.

In particular, this shows that in general the Ricci flow cannot be continued for all time, but will produce singularities. For three-dimensional manifolds, Perelman showed how to continue past the singularities using surgery on the manifold.
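The collapse time of the round sphere can be checked numerically. The following is a minimal sketch rather than part of the original text. It assumes a round n-sphere of radius r, whose Ricci tensor is (n − 1) times the unit-sphere metric, so that the flow reduces to the ordinary differential equation dr/dt = −(n − 1)/r. The function names are illustrative.

```python
import math

def collapse_time(n, r0=1.0):
    """Time at which a round n-sphere of initial radius r0 shrinks to a point."""
    return r0 ** 2 / (2.0 * (n - 1))

def exact_radius(t, n, r0=1.0):
    """Closed form: r(t)^2 = r0^2 - 2 (n - 1) t."""
    return math.sqrt(r0 ** 2 - 2.0 * (n - 1) * t)

def euler_radius(t_end, n, r0=1.0, steps=20000):
    """Forward-Euler integration of dr/dt = -(n - 1) / r from 0 to t_end."""
    dt = t_end / steps
    r = r0
    for _ in range(steps):
        r += dt * (-(n - 1) / r)
    return r

if __name__ == "__main__":
    n = 3
    t = 0.8 * collapse_time(n)  # stop before the singular time
    print(exact_radius(t, n), euler_radius(t, n))
```

The integration is stopped before the collapse time because the right-hand side blows up as r tends to zero, which is the numerical signature of the singularity described above.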
A significant 2-dimensional example is the cigar soliton solution, which is given by the metric $(dx^2 + dy^2)/(e^{4t} + x^2 + y^2)$ on the Euclidean plane. Although this metric shrinks under the Ricci flow, its geometry remains the same. Such solutions are called steady Ricci solitons. An example of a 3-dimensional steady Ricci soliton is the "Bryant soliton", which is rotationally symmetric, has positive curvature, and is obtained by solving a system of ordinary differential equations.
Relationship to uniformization and geometrization
The Ricci flow (named after Gregorio Ricci-Curbastro) was introduced by Richard Hamilton in 1981 in order to gain insight into the geometrization conjecture of William Thurston, which concerns the topological classification of three-dimensional smooth manifolds. Hamilton's idea was to define a kind of nonlinear diffusion equation which would tend to smooth out irregularities in the metric. Then, by placing an arbitrary metric g on a given smooth manifold M and evolving the metric by the Ricci flow, the metric should approach a particularly nice metric, which might constitute a canonical form for M. Suitable canonical forms had already been identified by Thurston; the possibilities, called Thurston model geometries, include the three-sphere $S^3$, three-dimensional Euclidean space $E^3$, and three-dimensional hyperbolic space $H^3$, which are homogeneous and isotropic, and five slightly more exotic Riemannian manifolds, which are homogeneous but not isotropic. (This list is closely related to, but not identical with, the Bianchi classification of the three-dimensional real Lie algebras into nine classes.) Hamilton's idea was that these special metrics should behave like fixed points of the Ricci flow, and that if, for a given manifold, globally only one Thurston geometry was admissible, this might even act like an attractor under the flow.

Hamilton succeeded in proving that any smooth closed three-manifold which admits a metric of positive Ricci curvature also admits a unique Thurston geometry, namely a spherical metric, which does indeed act like an attracting fixed point under the Ricci flow, renormalized to preserve volume. (Under the unrenormalized Ricci flow, the manifold collapses to a point in finite time.) This doesn't prove the full geometrization conjecture because the most difficult case turns out to concern manifolds with negative Ricci curvature and more specifically those with negative sectional curvature. (A strange and interesting fact is that all closed three-manifolds admit metrics with negative Ricci curvatures! This was proved by L. Zhiyong Gao and Shing-Tung Yau in 1986.)

Indeed, a triumph of nineteenth century geometry was the proof of the uniformization theorem, the analogous topological classification of smooth two-manifolds, where Hamilton showed that the Ricci flow does indeed evolve a negatively curved two-manifold into a two-dimensional multi-holed torus which is locally isometric to the hyperbolic plane. This topic is closely related to important topics in analysis, number theory, dynamical systems, mathematical physics, and even cosmology.

Note that the term "uniformization" correctly suggests a kind of smoothing away of irregularities in the geometry, while the term "geometrization" correctly suggests placing a geometry on a smooth manifold. Geometry is being used here in a precise manner akin to Klein's notion of geometry. In particular, the result of geometrization may be a geometry that is not isotropic. In most cases, including the cases of constant curvature, the geometry is unique. An important theme in this area is the interplay between real and complex formulations. In particular, many discussions of uniformization speak of complex curves rather than real two-manifolds.

The Ricci flow does not preserve volume, so to be more careful in applying the Ricci flow to uniformization and geometrization one needs to normalize the Ricci flow to obtain a flow which preserves volume. If one fails to do this, the problem is that (for example) instead of evolving a given three-dimensional manifold into one of Thurston's canonical forms, we might just shrink its size.

It is possible to construct a kind of moduli space of n-dimensional Riemannian manifolds, and then the Ricci flow really does give a geometric flow (in the intuitive sense of particles flowing along flowlines) in this moduli space.
Relation to diffusion
To see why the evolution equation defining the Ricci flow is indeed a kind of nonlinear diffusion equation, we can consider the special case of (real) two-manifolds in more detail. Any metric tensor on a two-manifold can be written with respect to an exponential isothermal coordinate chart in the form

$$ds^2 = \exp(2p(x,y))\,(dx^2 + dy^2).$$

(These coordinates provide an example of a conformal coordinate chart, because angles, but not distances, are correctly represented.)

The easiest way to compute the Ricci tensor and Laplace-Beltrami operator for our Riemannian two-manifold is to use the differential forms method of Élie Cartan. Take the coframe field

$$\sigma^1 = \exp(p)\,dx, \qquad \sigma^2 = \exp(p)\,dy,$$

so that the metric tensor becomes

$$g = \sigma^1 \otimes \sigma^1 + \sigma^2 \otimes \sigma^2.$$

Next, given an arbitrary smooth function $h(x,y)$, compute the exterior derivative

$$dh = h_x\,dx + h_y\,dy = \exp(-p)\,h_x\,\sigma^1 + \exp(-p)\,h_y\,\sigma^2.$$

Take the Hodge dual

$$\star dh = -\exp(-p)\,h_y\,\sigma^1 + \exp(-p)\,h_x\,\sigma^2 = -h_y\,dx + h_x\,dy.$$

Take another exterior derivative

$$d{\star}dh = -h_{yy}\,dy \wedge dx + h_{xx}\,dx \wedge dy = (h_{xx} + h_{yy})\,dx \wedge dy$$

(where we used the anti-commutative property of the exterior product). That is,

$$d{\star}dh = \exp(-2p)\,(h_{xx} + h_{yy})\,\sigma^1 \wedge \sigma^2.$$

Taking another Hodge dual gives

$$\Delta h = \star d{\star}dh = \exp(-2p)\,(h_{xx} + h_{yy}),$$

which gives the desired expression for the Laplace-Beltrami operator

$$\Delta = \exp(-2p(x,y))\left(D_x^2 + D_y^2\right).$$

To compute the curvature tensor, we take the exterior derivative of the covector fields making up our coframe:

$$d\sigma^1 = p_y \exp(p)\,dy \wedge dx = -(p_y\,dx) \wedge \sigma^2 = -{\omega^1}_2 \wedge \sigma^2,$$

$$d\sigma^2 = p_x \exp(p)\,dx \wedge dy = -(p_x\,dy) \wedge \sigma^1 = -{\omega^2}_1 \wedge \sigma^1.$$

From these expressions, we can read off the only independent connection one-form

$${\omega^1}_2 = p_y\,dx - p_x\,dy.$$

Take another exterior derivative

$$d{\omega^1}_2 = p_{yy}\,dy \wedge dx - p_{xx}\,dx \wedge dy = -(p_{xx} + p_{yy})\,dx \wedge dy.$$

This gives the curvature two-form

$${\Omega^1}_2 = -\exp(-2p)\,(p_{xx} + p_{yy})\,\sigma^1 \wedge \sigma^2 = -\Delta p\;\sigma^1 \wedge \sigma^2,$$

from which we can read off the only linearly independent component of the Riemann tensor using

$${\Omega^1}_2 = {R^1}_{212}\,\sigma^1 \wedge \sigma^2.$$

Namely

$${R^1}_{212} = -\Delta p,$$

from which the only nonzero components of the Ricci tensor are $R_{22} = R_{11} = -\Delta p$. From this, we find components with respect to the coordinate cobasis, namely

$$R_{xx} = R_{yy} = -(p_{xx} + p_{yy}).$$

But the metric tensor is also diagonal, with $g_{xx} = g_{yy} = \exp(2p)$, and after some elementary manipulation, we obtain an elegant expression for the Ricci flow:

$$\frac{\partial p}{\partial t} = \Delta p.$$

This is manifestly analogous to the best known of all diffusion equations, the heat equation

$$\frac{\partial u}{\partial t} = \Delta u,$$

where now $\Delta = D_x^2 + D_y^2$ is the usual Laplacian on the Euclidean plane. The reader may object that the heat equation is of course a linear partial differential equation; where is the promised nonlinearity in the p.d.e. defining the Ricci flow? The answer is that nonlinearity enters because the Laplace-Beltrami operator depends upon the same function p which we used to define the metric. But notice that the flat Euclidean plane is given by taking $p(x,y) = 0$. So if p is small in magnitude, we can consider it to define small deviations from the geometry of a flat plane, and if we retain only first order terms in computing the exponential, the Ricci flow on our two-dimensional almost flat Riemannian manifold becomes the usual two-dimensional heat equation.

This computation suggests that, just as (according to the heat equation) an irregular temperature distribution in a hot plate tends to become more homogeneous over time, so too (according to the Ricci flow) an almost flat Riemannian manifold will tend to flatten out the same way that heat can be carried off "to infinity" in an infinite flat plate. But if our hot plate is finite in size, and has no boundary where heat can be carried off, we can expect to homogenize the temperature, but clearly we cannot expect to reduce it to zero. In the same way, we expect that the Ricci flow, applied to a distorted round sphere, will tend to round out the geometry over time, but not to turn it into a flat Euclidean geometry.
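As a consistency check, the cigar soliton from the Examples section can be tested against the conformal form of the flow. The sketch below is an illustration, not part of the original derivation. It assumes the metric is written as exp(2p)(dx² + dy²), so that the Ricci flow reads ∂p/∂t = exp(−2p)(p_xx + p_yy), and that the cigar corresponds to p = −½ ln(e^{4t} + x² + y²). Derivatives are approximated by central differences, and the function names are hypothetical.

```python
import math

def cigar_p(x, y, t):
    """Conformal exponent p of the cigar metric exp(2p) (dx^2 + dy^2)."""
    return -0.5 * math.log(math.exp(4.0 * t) + x * x + y * y)

def ricci_flow_residual(p, x, y, t, h=1e-3):
    """Central-difference estimate of p_t - exp(-2p) (p_xx + p_yy)."""
    p_t = (p(x, y, t + h) - p(x, y, t - h)) / (2.0 * h)
    flat_laplacian = (p(x + h, y, t) + p(x - h, y, t)
                      + p(x, y + h, t) + p(x, y - h, t)
                      - 4.0 * p(x, y, t)) / (h * h)
    return p_t - math.exp(-2.0 * p(x, y, t)) * flat_laplacian

if __name__ == "__main__":
    # The residual should be close to zero for a solution of the flow.
    print(ricci_flow_residual(cigar_p, 0.3, -0.7, 0.1))
```

A residual near zero (up to discretization error) at several sample points supports the claim that the cigar metric solves the flow; a time-independent choice such as p = x² gives a residual that is clearly nonzero.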
Recent developments The Ricci flow has been intensively studied since 1981. Some recent work has focused on the question of precisely how higher-dimensional Riemannian manifolds evolve under the Ricci flow, and in particular, what types of parametric singularities may form. For instance, a certain class of solutions to the Ricci flow demonstrates that neckpinch singularities will form on an evolving n-dimensional metric Riemannian manifold having a certain topological property (positive Euler characteristic), as the flow approaches some characteristic time t0. In certain cases, such neckpinches will produce manifolds called Ricci solitons. There are many related geometric flows, some of which (such as the Yamabe flow and the Calabi flow) have properties similar to the Ricci flow.
Chapter-6
Secondary Calculus and Cohomological Physics, Screened Poisson Equation and Saint-Venant's Compatibility Condition
Secondary calculus and cohomological physics In mathematics, secondary calculus is a proposed expansion of classical differential calculus on manifolds, to the "space" of solutions of a (nonlinear) partial differential equation. It is a sophisticated theory at the level of jet spaces and employing algebraic methods.
Secondary calculus Secondary calculus acts on the space of solutions of a system of partial differential equations (usually non-linear equations). When the number of independent variables is zero, i.e. the equations are algebraic ones, secondary calculus reduces to classical differential calculus. All objects in secondary calculus are cohomology classes of differential complexes growing on diffieties. The latter are, in the framework of secondary calculus, the analog of smooth manifolds.
Cohomological physics Cohomological physics was born with Gauss's theorem, describing the electric charge contained inside a given surface in terms of the flux of the electric field through the surface itself. Flux is the integral of a differential form and, consequently, a de Rham cohomology class. It is not by chance that formulas of this kind, such as the well known Stokes formula, though being a natural part of classical differential calculus, have entered in modern mathematics from physics.
Classical analogues All the constructions in classical differential calculus have an analog in secondary calculus. For instance, higher symmetries of a system of partial differential equations are the analog of vector fields on differentiable manifolds. The Euler operator, which associates to each variational problem the corresponding Euler-Lagrange equation, is the analog of the classical differential associating to a function on a variety its differential. The Euler operator is a secondary differential operator of first order, even if, according to its expression in local coordinates, it looks like one of infinite order. More generally, the analog of differential forms in secondary calculus are the elements of the first term of the so-called C-spectral sequence, and so on. The simplest diffieties are infinite prolongations of partial differential equations, which are sub varieties of infinite jet spaces. The latter are infinite dimensional varieties that can not be studied by means of standard functional analysis. On the contrary, the most natural language in which to study these objects is differential calculus over commutative algebras. Therefore, the latter must be regarded as a fundamental tool of secondary calculus. On the other hand, differential calculus over commutative algebras gives the possibility to develop algebraic geometry as if it were differential geometry.
Theoretical physics
Recent developments in particle physics, based on quantum field theories and their generalizations, have led to an understanding of the deep cohomological nature of the quantities describing both classical and quantum fields. The turning point was the discovery of the famous BRST transformation. For instance, it was understood that observables in field theory are classes in horizontal de Rham cohomology which are invariant under the corresponding gauge group, and so on. This current in modern theoretical physics is growing and is called cohomological physics. It is notable that secondary calculus and cohomological physics, which developed independently of each other for twenty years, arrived at the same results. Their confluence took place at the international conference Secondary Calculus and Cohomological Physics (Moscow, August 24–30, 1997).
Prospects A large number of modern mathematical theories harmoniously converges in the framework of secondary calculus, for instance: commutative algebra and algebraic geometry, homological algebra and differential topology, Lie group and Lie algebra theory, differential geometry, etc.
Screened Poisson equation
In physics, the screened Poisson equation is the following partial differential equation:

$$\left[\Delta - \lambda^2\right] u(\mathbf{r}) = -f(\mathbf{r}),$$

where Δ is the Laplace operator, λ is a constant, f is an arbitrary function of position (known as the "source function") and u is the function to be determined. The screened Poisson equation occurs frequently in physics, including Yukawa's theory of mesons and electric field screening in plasmas.

In the homogeneous case (f = 0), the screened Poisson equation is the same as the time-independent Klein–Gordon equation. In the inhomogeneous case, the screened Poisson equation is very similar to the inhomogeneous Helmholtz equation, the only difference being the sign within the brackets.

Without loss of generality, we will take λ to be non-negative. When λ is zero, the equation reduces to Poisson's equation. Therefore, when λ is very small, the solution approaches that of the unscreened Poisson equation, which, in dimension n = 3, is a superposition of 1/r functions weighted by the source function f:

$$u(\mathbf{r})_{\text{Poisson}} = \iiint d^3r'\,\frac{f(\mathbf{r}')}{4\pi\,|\mathbf{r} - \mathbf{r}'|}.$$

On the other hand, when λ is extremely large, u approaches the value f/λ², which goes to zero as λ goes to infinity. As we shall see, the solution for intermediate values of λ behaves as a superposition of screened (or damped) 1/r functions, with λ behaving as the strength of the screening.

The screened Poisson equation can be solved for general f using the method of Green's functions. The Green's function G is defined by

$$\left[\Delta - \lambda^2\right] G(\mathbf{r}) = -\delta^3(\mathbf{r}).$$

Assuming u and its derivatives vanish at large r, we may perform a continuous Fourier transform in spatial coordinates:

$$G(\mathbf{k}) = \iiint d^3r\; G(\mathbf{r})\, e^{-i\mathbf{k}\cdot\mathbf{r}},$$

where the integral is taken over all space. It is then straightforward to show that

$$\left[k^2 + \lambda^2\right] G(\mathbf{k}) = 1.$$

The Green's function in r is therefore given by the inverse Fourier transform,

$$G(\mathbf{r}) = \frac{1}{(2\pi)^3} \iiint d^3k\; \frac{e^{i\mathbf{k}\cdot\mathbf{r}}}{k^2 + \lambda^2}.$$

This integral may be evaluated using spherical coordinates in k-space. The integration over the angular coordinates is straightforward, and the integral reduces to one over the radial wavenumber $k_r$:

$$G(r) = \frac{1}{2\pi^2 r} \int_0^{\infty} dk_r\; \frac{k_r \sin(k_r r)}{k_r^2 + \lambda^2}.$$

This may be evaluated using contour integration. The result is:

$$G(r) = \frac{e^{-\lambda r}}{4\pi r}.$$

The solution to the full problem is then given by

$$u(\mathbf{r}) = \iiint d^3r'\; G(\mathbf{r} - \mathbf{r}')\, f(\mathbf{r}') = \iiint d^3r'\; \frac{e^{-\lambda |\mathbf{r} - \mathbf{r}'|}}{4\pi\,|\mathbf{r} - \mathbf{r}'|}\, f(\mathbf{r}').$$
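As stated above, this is a superposition of screened 1/r functions, weighted by the source function f and with λ acting as the strength of the screening. The screened 1/r function is often encountered in physics as a screened Coulomb potential, also called a "Yukawa potential".

In two dimensions
In the case of a magnetized plasma, the screened Poisson equation is quasi-2D:

$$\left(\Delta_\perp - \frac{1}{\rho^2}\right) u(\mathbf{r}_\perp) = -f(\mathbf{r}_\perp),$$

with $\Delta_\perp = \nabla \cdot \nabla_\perp$ and $\nabla_\perp = \nabla - \mathbf{b}\,(\mathbf{b} \cdot \nabla)$, where $\mathbf{b} = \mathbf{B}/|\mathbf{B}|$, $\mathbf{B}$ is the magnetic field and ρ is the (ion) Larmor radius. The two-dimensional Fourier transform of the associated Green's function is:

$$G(\mathbf{k}_\perp) = \iint d^2r\; G(\mathbf{r}_\perp)\, e^{-i\mathbf{k}_\perp\cdot\mathbf{r}_\perp}.$$

The 2D screened Poisson equation yields:

$$\left(k_\perp^2 + \frac{1}{\rho^2}\right) G(\mathbf{k}_\perp) = 1.$$

The Green's function is therefore given by the inverse Fourier transform:

$$G(\mathbf{r}_\perp) = \frac{1}{4\pi^2} \iint d^2k\; \frac{e^{i\mathbf{k}_\perp\cdot\mathbf{r}_\perp}}{k_\perp^2 + 1/\rho^2}.$$

This integral can be calculated using polar coordinates in k-space:

$$G(\mathbf{r}_\perp) = \frac{1}{4\pi^2} \int_0^{\infty} dk_r \int_0^{2\pi} d\theta\; \frac{k_r\, e^{i k_r r_\perp \cos\theta}}{k_r^2 + 1/\rho^2}.$$

The integration over the angular coordinate is straightforward, and the integral reduces to one over the radial wavenumber $k_r$:

$$G(\mathbf{r}_\perp) = \frac{1}{2\pi} \int_0^{\infty} dk_r\; \frac{k_r\, J_0(k_r r_\perp)}{k_r^2 + 1/\rho^2} = \frac{1}{2\pi}\, K_0(r_\perp/\rho),$$

where $J_0$ is the Bessel function of the first kind and $K_0$ is the modified Bessel function of the second kind, both of order zero.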
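Returning to the three-dimensional case, the screened 1/r Green's function can be checked numerically away from the origin. The sketch below is illustrative only. It assumes the equation is written as (Δ − λ²)G = −δ, so that for r > 0 the radially symmetric Green's function G(r) = e^{−λr}/(4πr) must satisfy G'' + (2/r)G' − λ²G = 0. The derivatives are estimated by central differences, and the function names are hypothetical.

```python
import math

def yukawa_green(r, lam):
    """Screened 1/r Green's function exp(-lam r) / (4 pi r) in three dimensions."""
    return math.exp(-lam * r) / (4.0 * math.pi * r)

def radial_residual(r, lam, h=1e-4):
    """Finite-difference value of G'' + (2/r) G' - lam^2 G at radius r > 0."""
    g_plus = yukawa_green(r + h, lam)
    g_mid = yukawa_green(r, lam)
    g_minus = yukawa_green(r - h, lam)
    first = (g_plus - g_minus) / (2.0 * h)
    second = (g_plus - 2.0 * g_mid + g_minus) / (h * h)
    return second + 2.0 * first / r - lam ** 2 * g_mid

if __name__ == "__main__":
    print(radial_residual(1.5, 2.0))   # close to zero away from the origin
    print(yukawa_green(1.0, 0.0))      # unscreened limit 1 / (4 pi)
```

Setting λ = 0 recovers the unscreened 1/(4πr) kernel of Poisson's equation, in line with the small-λ limit discussed above.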
Saint-Venant's compatibility condition
In the mathematical theory of elasticity the strain $\varepsilon$ is related to a displacement field $u$ by

$$\varepsilon_{ij} = \frac{1}{2}\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i}\right), \qquad 1 \le i, j \le 3.$$

Barré de Saint-Venant derived the compatibility condition for an arbitrary symmetric second rank tensor field to be of this form; this has now been generalized to higher rank symmetric tensor fields.

Rank 2 tensor fields
The integrability condition takes the form of the vanishing of the Saint-Venant's tensor defined by

$$W_{ijkl} = \frac{\partial^2 \varepsilon_{ij}}{\partial x_k\,\partial x_l} + \frac{\partial^2 \varepsilon_{kl}}{\partial x_i\,\partial x_j} - \frac{\partial^2 \varepsilon_{il}}{\partial x_j\,\partial x_k} - \frac{\partial^2 \varepsilon_{jk}}{\partial x_i\,\partial x_l}.$$

The result that, on a simply connected domain, W = 0 implies that the strain is the symmetric derivative of some vector field was first described by Barré de Saint-Venant in 1864 and proved rigorously by Beltrami in 1886. For non-simply connected domains there are finite dimensional spaces of symmetric tensors with vanishing Saint-Venant's tensor that are not the symmetric derivative of a vector field. The situation is analogous to de Rham cohomology.

Due to the symmetry conditions $W_{ijkl} = W_{klij} = -W_{jikl} = W_{ijlk}$ there are only six (in the three-dimensional case) distinct components of W. For example, all components can be deduced from $W_{ijkl}$ with the indices ijkl = 2323, 2331, 1223, 1313, 1312 and 1212. The six components in such minimal sets are not independent as functions, as they satisfy partial differential equations such as
and there are two further relations obtained by cyclic permutation. In its simplest form of course the components of must be assumed twice continuously differentiable, but more recent work proves the result in a much more general case. The relation between Saint-Venant's compatibility condition and Poincare's lemma can be understood more clearly using the operator , where is a symmetric tensor field. The matrix curl of a symmetric rank 2 tensor field T is defined by
where is the permutation symbol. The operator maps symmetric tensor fields to symmetric tensor fields. The vanishing of the Saint Venant's tensor W(T) is equivalent to . This illustrates more clearly the six independent components of satisfies W(T). The divergence of a tensor field . This exactly the three first order differential equations satisfied by the components of W(T) mentioned above. In differential geometry the symmetrized derivative of a vector field appears also as the Lie derivative of the metric tensor g with respect to the vector field.
$$(\mathcal{L}_U g)_{ij} = U_{i;j} + U_{j;i},$$

where indices following a semicolon indicate covariant differentiation. The vanishing of W(T) is thus the integrability condition for local existence of U in the Euclidean case.
Generalization to higher rank tensors
Saint-Venant's compatibility condition can be thought of as an analogue, for symmetric tensor fields, of Poincaré's lemma for skew-symmetric tensor fields (differential forms). The result can be generalized to higher rank symmetric tensor fields. Let F be a symmetric rank-k tensor field on an open set in n-dimensional Euclidean space; then the symmetric derivative is the rank k+1 tensor field defined by

$$(dF)_{i_1 \ldots i_k i_{k+1}} = F_{(i_1 \ldots i_k,\, i_{k+1})},$$
where we use the classical notation that indices following a comma indicate differentiation and groups of indices enclosed in brackets indicate symmetrization over those indices. The Saint-Venant tensor W of a symmetric rank-k tensor field T is defined by
with
On a simply connected domain in Euclidean space W = 0 implies that T = dF for some rank k-1 symmetric tensor field F.
Chapter-7
Separation of Variables
In mathematics, separation of variables is any of several methods for solving ordinary and partial differential equations, in which algebra allows one to rewrite an equation so that each of two variables occurs on a different side of the equation.

Ordinary differential equations (ODE)
Suppose a differential equation can be written in the form

$$\frac{d}{dx} f(x) = g(x)\,h(f(x)), \qquad (1)$$

which we can write more simply by letting $y = f(x)$:

$$\frac{dy}{dx} = g(x)\,h(y).$$

As long as $h(y) \neq 0$, we can rearrange terms to obtain:

$$\frac{dy}{h(y)} = g(x)\,dx,$$

so that the two variables x and y have been separated. dx (and dy) can be viewed, at a simple level, as just a convenient notation, which provides a handy mnemonic aid for assisting with manipulations. A formal definition of dx as a differential (infinitesimal) is somewhat advanced.

Alternative notation
Some who dislike Leibniz's notation may prefer to write this as

$$\frac{1}{h(y)}\frac{dy}{dx} = g(x),$$

but that fails to make it quite as obvious why this is called "separation of variables". Integrating both sides of the equation with respect to x, we have

$$\int \frac{1}{h(y)}\frac{dy}{dx}\,dx = \int g(x)\,dx, \qquad (2)$$

or equivalently,

$$\int \frac{1}{h(y)}\,dy = \int g(x)\,dx$$

because of the substitution rule for integrals. If one can evaluate the two integrals, one can find a solution to the differential equation. Observe that this process effectively allows us to treat the derivative $dy/dx$ as a fraction which can be separated. This allows us to solve separable differential equations more conveniently, as demonstrated in the example below. (Note that we do not need to use two constants of integration in equation (2), as in

$$\int \frac{1}{h(y)}\,dy + C_1 = \int g(x)\,dx + C_2,$$

because a single constant $C = C_2 - C_1$ is equivalent.)
Example (I)
The ordinary differential equation

$$\frac{d}{dx} f(x) = f(x)\,(1 - f(x))$$

may be written as

$$\frac{dy}{dx} = y(1 - y).$$

If we let $g(x) = 1$ and $h(y) = y(1 - y)$, we can write the differential equation in the form of equation (1) above. Thus, the differential equation is separable. As shown above, we can treat dy and dx as separate values, so that both sides of the equation may be multiplied by dx. Subsequently dividing both sides by $y(1 - y)$, we have

$$\frac{dy}{y(1 - y)} = dx.$$

At this point we have separated the variables x and y from each other, since x appears only on the right side of the equation and y only on the left. Integrating both sides, we get

$$\int \frac{dy}{y(1 - y)} = \int dx,$$

which, via partial fractions, becomes

$$\int \left(\frac{1}{y} + \frac{1}{1 - y}\right) dy = \int dx,$$

and then

$$\ln|y| - \ln|1 - y| = x + C,$$

where C is the constant of integration. A bit of algebra gives a solution for y:

$$y = \frac{1}{1 + B e^{-x}},$$

where B is an arbitrary constant. One may check the solution by differentiating the function we found with respect to x; the result should reproduce the original equation. (One must be careful with the absolute values when solving the equation above. The two signs of the absolute value contribute the positive and negative values of B, respectively, and the case B = 0 corresponds to the constant solution y = 1, as discussed below.)

Note that since we divided by y and (1 − y) we must check whether the constant functions y(x) = 0 and y(x) = 1 solve the differential equation (in this case they are both solutions).
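Example (II)
Population growth is often modeled by the differential equation

$$\frac{dP}{dt} = kP\left(1 - \frac{P}{K}\right),$$

where P is the population as a function of time t, k is the rate of growth, and K is the carrying capacity of the environment. Separation of variables may be used to solve this differential equation:

$$\int \frac{dP}{P\left(1 - \frac{P}{K}\right)} = \int k\,dt.$$

To evaluate the integral on the left side, we simplify the fraction

$$\frac{1}{P\left(1 - \frac{P}{K}\right)} = \frac{K}{P(K - P)},$$

and then we decompose the fraction into partial fractions

$$\frac{K}{P(K - P)} = \frac{1}{P} + \frac{1}{K - P}.$$

Thus we have

$$\int \left(\frac{1}{P} + \frac{1}{K - P}\right) dP = \int k\,dt,$$

$$\ln|P| - \ln|K - P| = kt + C,$$

$$\ln\left|\frac{K - P}{P}\right| = -kt - C,$$

$$\frac{K - P}{P} = \pm e^{-C} e^{-kt}.$$

Let $A = \pm e^{-C}$. Then

$$\frac{K}{P} - 1 = A e^{-kt}.$$

Therefore, the solution to the logistic equation is

$$P(t) = \frac{K}{1 + A e^{-kt}}.$$

To find A, let t = 0 and $P(0) = P_0$. Then we have

$$P_0 = \frac{K}{1 + A e^{0}}.$$

Noting that $e^0 = 1$, and solving for A we get

$$A = \frac{K - P_0}{P_0}.$$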
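The closed-form logistic solution can be compared with a direct numerical integration of the differential equation. The following sketch is an illustration under stated assumptions: it takes the model dP/dt = kP(1 − P/K) with P(0) = P0, uses the solution P(t) = K/(1 + A e^{−kt}) with A = (K − P0)/P0, and integrates with the classical fourth-order Runge-Kutta scheme. The function names and the sample parameter values are hypothetical.

```python
import math

def logistic_exact(t, P0, k, K):
    """Closed-form logistic solution obtained by separation of variables."""
    A = (K - P0) / P0
    return K / (1.0 + A * math.exp(-k * t))

def logistic_rk4(t_end, P0, k, K, steps=1000):
    """Integrate dP/dt = k P (1 - P/K) from 0 to t_end with classical RK4."""
    def rhs(P):
        return k * P * (1.0 - P / K)

    dt = t_end / steps
    P = P0
    for _ in range(steps):
        k1 = rhs(P)
        k2 = rhs(P + 0.5 * dt * k1)
        k3 = rhs(P + 0.5 * dt * k2)
        k4 = rhs(P + dt * k3)
        P += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return P

if __name__ == "__main__":
    # Illustrative parameters: initial population 10, rate 0.8, capacity 100.
    print(logistic_exact(5.0, 10.0, 0.8, 100.0), logistic_rk4(5.0, 10.0, 0.8, 100.0))
```

Agreement between the two values is a quick way to catch sign errors in the partial-fraction step or in the constant A.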
Partial differential equations
The method of separation of variables is also used to solve a wide range of linear partial differential equations with boundary and initial conditions, such as the heat equation, wave equation, Laplace equation and Helmholtz equation.

Homogeneous case
Consider the one-dimensional heat equation. The equation is

$$\frac{\partial u}{\partial t} - \alpha \frac{\partial^2 u}{\partial x^2} = 0. \qquad (1)$$

The boundary condition is homogeneous, that is

$$u\big|_{x=0} = u\big|_{x=L} = 0. \qquad (2)$$

Let us attempt to find a solution which is not identically zero satisfying the boundary conditions but with the following property: u is a product in which the dependence of u on x, t is separated, that is:

$$u(x,t) = X(x)\,T(t). \qquad (3)$$

Substituting u back into equation (1),

$$\frac{T'(t)}{\alpha T(t)} = \frac{X''(x)}{X(x)}. \qquad (4)$$

Since the right hand side depends only on x and the left hand side only on t, both sides are equal to some constant value −λ. Thus:

$$T'(t) = -\lambda \alpha T(t) \qquad (5)$$

and

$$X''(x) = -\lambda X(x). \qquad (6)$$

−λ here is the eigenvalue for both differential operators, and T(t) and X(x) are corresponding eigenfunctions. We will now show that solutions for X(x) for values of λ ≤ 0 cannot occur:

1. Suppose that λ < 0. Then there exist real numbers B, C such that

$$X(x) = B e^{\sqrt{-\lambda}\,x} + C e^{-\sqrt{-\lambda}\,x}.$$

From (2) we get

$$X(0) = 0 = X(L), \qquad (7)$$

and therefore B = 0 = C, which implies u is identically 0.

2. Suppose that λ = 0. Then there exist real numbers B, C such that X(x) = Bx + C. From (7) we conclude in the same manner as in 1 that u is identically 0.

3. Therefore, it must be the case that λ > 0. Then there exist real numbers A, B, C such that $T(t) = A e^{-\lambda \alpha t}$ and

$$X(x) = B \sin(\sqrt{\lambda}\,x) + C \cos(\sqrt{\lambda}\,x).$$

From (7) we get C = 0 and that for some positive integer n,

$$\sqrt{\lambda} = \frac{n\pi}{L}.$$

This solves the heat equation in the special case that the dependence of u has the special form of (3). In general, the sum of solutions to (1) which satisfy the boundary conditions (2) also satisfies (1) and (2). Hence a general solution can be given as

$$u(x,t) = \sum_{n=1}^{\infty} D_n \sin\frac{n\pi x}{L}\, \exp\left(-\frac{n^2 \pi^2 \alpha t}{L^2}\right),$$

where $D_n$ are coefficients determined by the initial condition. Given the initial condition

$$u\big|_{t=0} = f(x),$$

we can get

$$f(x) = \sum_{n=1}^{\infty} D_n \sin\frac{n\pi x}{L}.$$

This is the sine series expansion of f(x). Multiplying both sides by $\sin\frac{n\pi x}{L}$ and integrating over [0, L] results in

$$D_n = \frac{2}{L} \int_0^L f(x) \sin\frac{n\pi x}{L}\,dx.$$

This method requires that the eigenfunctions of x, here $\left\{\sin\frac{n\pi x}{L}\right\}_{n=1}^{\infty}$, are orthogonal and complete. In general this is guaranteed by Sturm-Liouville theory.
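The series solution can be evaluated directly once the sine coefficients are known. The sketch below is a minimal illustration, assuming the heat equation u_t = α u_xx on [0, L] with zero boundary values, coefficients D_n = (2/L) ∫ f(x) sin(nπx/L) dx, and a truncated sum of `terms` modes. The midpoint rule used for the integral, the truncation level, and the function names are choices made here for illustration only.

```python
import math

def sine_coefficient(f, n, L, samples=2000):
    """Approximate D_n = (2/L) * integral_0^L f(x) sin(n pi x / L) dx (midpoint rule)."""
    h = L / samples
    total = 0.0
    for j in range(samples):
        x = (j + 0.5) * h
        total += f(x) * math.sin(n * math.pi * x / L)
    return 2.0 / L * total * h

def heat_solution(f, x, t, L, alpha, terms=50):
    """Truncated series sum of D_n sin(n pi x / L) exp(-n^2 pi^2 alpha t / L^2)."""
    u = 0.0
    for n in range(1, terms + 1):
        decay = math.exp(-(n * math.pi / L) ** 2 * alpha * t)
        u += sine_coefficient(f, n, L) * math.sin(n * math.pi * x / L) * decay
    return u

if __name__ == "__main__":
    L, alpha = 1.0, 1.0
    initial = lambda x: math.sin(math.pi * x / L)  # a single eigenmode
    print(heat_solution(initial, 0.5, 0.1, L, alpha))
```

For an initial profile that is itself a single eigenfunction, only one coefficient is nonzero and the series reduces to that mode multiplied by its exponential decay factor, which makes a convenient test case.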
Nonhomogeneous case
Suppose the equation is nonhomogeneous,

$$\frac{\partial u}{\partial t} - \alpha \frac{\partial^2 u}{\partial x^2} = h(x,t), \qquad (8)$$

with the boundary condition the same as (2). Expand h(x,t) and u(x,t) into

$$h(x,t) = \sum_{n=1}^{\infty} h_n(t) \sin\frac{n\pi x}{L}, \qquad (9)$$

$$u(x,t) = \sum_{n=1}^{\infty} u_n(t) \sin\frac{n\pi x}{L}, \qquad (10)$$

where $h_n(t)$ can be calculated by integration, while $u_n(t)$ is to be determined. Substituting (9) and (10) back into (8) and using the orthogonality of the sine functions, we get

$$u_n'(t) + \alpha \frac{n^2 \pi^2}{L^2}\, u_n(t) = h_n(t),$$

which is a sequence of linear differential equations that can be readily solved with, for instance, the Laplace transform.

If the boundary condition is nonhomogeneous, then the expansions (9) and (10) are no longer valid. One has to find a function v that satisfies the boundary condition only, and subtract it from u. The function u − v then satisfies the homogeneous boundary condition, and can be found with the above method.

In orthogonal curvilinear coordinates, separation of variables can still be used, but some details differ from those in Cartesian coordinates. For instance, a regularity or periodicity condition may determine the eigenvalues in place of boundary conditions.
Matrices
The matrix form of the separation of variables is the Kronecker sum. As an example we consider the 2D discrete Laplacian on a regular grid:

$$L = D_{xx} \oplus D_{yy} = D_{xx} \otimes I + I \otimes D_{yy},$$

where $D_{xx}$ and $D_{yy}$ are 1D discrete Laplacians in the x- and y-directions, respectively, and $I$ are the identities of appropriate sizes.
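A small pure-Python sketch of the Kronecker sum may make the construction concrete. It is an illustration under assumptions chosen here: unit grid spacing, zero (Dirichlet) values outside the grid, the standard three-point stencil for the 1D Laplacian, and matrices stored as nested lists. The helper names are hypothetical.

```python
def laplacian_1d(n):
    """n-by-n three-point Laplacian (-2 on the diagonal, 1 on the off-diagonals)."""
    return [[-2.0 if i == j else 1.0 if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def kron_sum(A, B):
    """Kronecker sum: A (x) I + I (x) B, with identities of matching sizes."""
    left = kron(A, identity(len(B)))
    right = kron(identity(len(A)), B)
    return [[l + r for l, r in zip(row_l, row_r)]
            for row_l, row_r in zip(left, right)]

if __name__ == "__main__":
    L2d = kron_sum(laplacian_1d(3), laplacian_1d(3))
    for row in L2d:
        print(row)
```

On a 3-by-3 grid the row belonging to the central node shows the familiar five-point stencil: −4 on the diagonal and 1 for each of the four neighbours.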
Chapter-8
Spherical Harmonics
Visual representations of the first few spherical harmonics. Red portions represent regions where the function is positive, and green portions represent regions where the function is negative. In mathematics, spherical harmonics are the angular portion of a set of solutions to Laplace's equation. Represented in a system of spherical coordinates, Laplace's spherical harmonics are a specific set of spherical harmonics that forms an orthogonal system, first introduced by Pierre Simon de Laplace. Spherical harmonics are important in many theoretical and practical applications, particularly in the computation of atomic orbital electron configurations, representation of gravitational fields, geoids, and the magnetic fields of planetary bodies and stars, and characterization of the cosmic microwave background radiation. In 3D computer graphics, spherical harmonics play a special role in a wide variety of topics including indirect lighting (ambient occlusion, global illumination, precomputed radiance transfer, etc.) and recognition of 3D shapes.
History
Spherical harmonics were first investigated in connection with the Newtonian potential of Newton's law of universal gravitation in three dimensions. In 1782, Pierre-Simon de Laplace had, in his Mécanique Céleste, determined that the gravitational potential at a point x associated to a set of point masses $m_i$ located at points $\mathbf{x}_i$ was given by

$$V(\mathbf{x}) = \sum_i \frac{m_i}{|\mathbf{x}_i - \mathbf{x}|}.$$

Each term in the above summation is an individual Newtonian potential for a point mass. Just prior to that time, Adrien-Marie Legendre had investigated the expansion of the Newtonian potential in powers of $r = |\mathbf{x}|$ and $r_1 = |\mathbf{x}_1|$. He discovered that if $r \le r_1$ then

$$\frac{1}{|\mathbf{x}_1 - \mathbf{x}|} = P_0(\cos\gamma)\,\frac{1}{r_1} + P_1(\cos\gamma)\,\frac{r}{r_1^2} + P_2(\cos\gamma)\,\frac{r^2}{r_1^3} + \cdots,$$

where γ is the angle between the vectors $\mathbf{x}$ and $\mathbf{x}_1$. The functions $P_i$ are the Legendre polynomials, and they are a special case of spherical harmonics. Subsequently, in his 1782 memoir, Laplace investigated these coefficients using spherical coordinates to represent the angle γ between $\mathbf{x}_1$ and $\mathbf{x}$.

In 1867, William Thomson (Lord Kelvin) and Peter Guthrie Tait introduced the solid spherical harmonics in their Treatise on Natural Philosophy, and also first introduced the name of "spherical harmonics" for these functions. The solid harmonics were homogeneous solutions of Laplace's equation

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0.$$

By examining Laplace's equation in spherical coordinates, Thomson and Tait recovered Laplace's spherical harmonics. The term "Laplace's coefficients" was employed by William Whewell to describe the particular system of solutions introduced along these lines, whereas others reserved this designation for the zonal spherical harmonics that had properly been introduced by Laplace and Legendre.

The 19th century development of Fourier series made possible the solution of a wide variety of physical problems in rectangular domains, such as the solution of the heat equation and wave equation. This could be achieved by expansion of functions in series of trigonometric functions. Whereas the trigonometric functions in a Fourier series represent the fundamental modes of vibration in a string, the spherical harmonics represent the fundamental modes of vibration of a sphere in much the same way. Many aspects of the theory of Fourier series could be generalized by taking expansions in spherical harmonics rather than trigonometric functions. This was a boon for problems possessing spherical symmetry, such as those of celestial mechanics originally studied by Laplace and Legendre.

The prevalence of spherical harmonics already in physics set the stage for their later importance in the 20th century birth of quantum mechanics. The spherical harmonics are eigenfunctions of the square of the orbital angular momentum operator

$$-i\hbar\,\mathbf{r} \times \nabla,$$

and therefore they represent the different quantized configurations of atomic orbitals.
Laplace's spherical harmonics
Real (Laplace) spherical harmonics for (left to right). The negative order harmonics
to 4 (top to bottom) and m = 0 to 4 are rotated about the z axis by
with respect to the positive order ones. Laplace's equation imposes that the divergence of the gradient of a scalar field f is zero. In spherical coordinates this is:
Consider the problem of finding solutions of the form ƒ(r,θ,φ) = R(r)Y(θ,φ). By separation of variables, two differential equations result by imposing Laplace's equation:
The second equation can be simplified under the assumption that Y has the form Y(θ,φ) = Θ(θ)Φ(φ). Applying separation of variables again to the second equation gives way to the pair of differential equations
for some number m. A priori, m is a complex constant, but because Φ must be a periodic function whose period evenly divides 2π, m is necessarily an integer and Φ is a linear combination of the complex exponentials $e^{\pm im\varphi}$. The solution function Y(θ,φ) is regular at the poles of the sphere, where θ = 0, π. Imposing this regularity in the solution Θ of the second equation at the boundary points of the domain is a Sturm–Liouville problem that forces the parameter λ to be of the form λ = ℓ(ℓ+1) for some non-negative integer ℓ with ℓ ≥ |m|; this is also explained below in terms of the orbital angular momentum. Furthermore, a change of variables t = cos θ transforms this equation into the Legendre equation, whose solution is a multiple of the associated Legendre polynomial $P_\ell^m(\cos\theta)$. Finally, the equation for R has solutions of the form $R(r) = A r^{\ell} + B r^{-\ell-1}$; requiring the solution to be regular throughout $\mathbb{R}^3$ forces B = 0.
Here the solution was assumed to have the special form Y(θ,φ) = Θ(θ)Φ(φ). For a given value of ℓ, there are 2ℓ+1 independent solutions of this form, one for each integer m with −ℓ ≤ m ≤ ℓ. These angular solutions are a product of trigonometric functions, here represented as a complex exponential, and associated Legendre polynomials:
\[ Y_\ell^m(\theta,\varphi) = N e^{im\varphi} P_\ell^m(\cos\theta) \]
which fulfill
\[ r^2\nabla^2 Y_\ell^m(\theta,\varphi) = -\ell(\ell+1)\,Y_\ell^m(\theta,\varphi). \]
Here $Y_\ell^m$ is called a spherical harmonic function of degree ℓ and order m, $P_\ell^m$ is an associated Legendre polynomial, N is a normalization constant, and θ and φ represent colatitude and longitude, respectively. In particular, the colatitude θ, or polar angle, ranges from 0 at the North Pole to π at the South Pole, assuming the value of π/2 at the Equator, and the longitude φ, or azimuth, may assume all values with 0 ≤ φ < 2π.
For a fixed integer ℓ, every solution Y(θ,φ) of the eigenvalue problem
\[ r^2\nabla^2 Y = -\ell(\ell+1)\,Y \]
is a linear combination of the $Y_\ell^m$. In fact, for any such solution, $r^\ell Y(\theta,\varphi)$ is the expression in spherical coordinates of a homogeneous polynomial that is harmonic, and so counting dimensions shows that there are 2ℓ+1 linearly independent such polynomials.
The general solution to Laplace's equation in a ball centered at the origin is a linear combination of the spherical harmonic functions multiplied by the appropriate scale factor $r^\ell$,
\[ f(r,\theta,\varphi) = \sum_{\ell=0}^{\infty}\sum_{m=-\ell}^{\ell} f_\ell^m\, r^\ell\, Y_\ell^m(\theta,\varphi), \]
where the $f_\ell^m$ are constants and the factors $r^\ell Y_\ell^m$ are known as solid harmonics. Such an expansion is valid in the ball
\[ r < R = \frac{1}{\limsup_{\ell\to\infty} |f_\ell^m|^{1/\ell}}. \]
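The statement that $r^\ell Y$ is a harmonic homogeneous polynomial can be checked numerically for ℓ = 2, where the 2ℓ+1 = 5 solid harmonics are simple Cartesian quadratics. The following is a minimal sketch, not part of the original text; the helper names (`laplacian`, `SOLID_HARMONICS_DEG2`, `r_squared`) and the test point are illustrative assumptions, and the Laplacian is only approximated by central differences.

```python
def laplacian(f, x, y, z, h=1e-3):
    """Central-difference estimate of the Laplacian of f at the point (x, y, z)."""
    neighbours = (
        f(x + h, y, z) + f(x - h, y, z)
        + f(x, y + h, z) + f(x, y - h, z)
        + f(x, y, z + h) + f(x, y, z - h)
    )
    return (neighbours - 6.0 * f(x, y, z)) / h**2


# The five degree-2 solid harmonics, written as Cartesian polynomials
# (unnormalized; only harmonicity is being checked here).
SOLID_HARMONICS_DEG2 = {
    "2z^2 - x^2 - y^2": lambda x, y, z: 2 * z * z - x * x - y * y,
    "xz": lambda x, y, z: x * z,
    "yz": lambda x, y, z: y * z,
    "x^2 - y^2": lambda x, y, z: x * x - y * y,
    "xy": lambda x, y, z: x * y,
}


def r_squared(x, y, z):
    """Homogeneous of degree 2 but not harmonic: its Laplacian equals 6."""
    return x * x + y * y + z * z


if __name__ == "__main__":
    point = (0.3, -0.7, 0.5)  # arbitrary test point
    for name, poly in SOLID_HARMONICS_DEG2.items():
        print(name, laplacian(poly, *point))
    print("r^2", laplacian(r_squared, *point))
```

The polynomial r² is included as a contrast: it is homogeneous of degree 2 but its Laplacian is 6, so it does not restrict to a spherical harmonic of degree 2.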
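Orbital angular momentum
In quantum mechanics, Laplace's spherical harmonics are understood in terms of the orbital angular momentum
\[ \mathbf{L} = -i\hbar\,(\mathbf{x}\times\nabla). \]
The ħ is conventional in quantum mechanics; it is convenient to work in units in which ħ = 1. The spherical harmonics are eigenfunctions of the square of the orbital angular momentum
\[ \mathbf{L}^2 = -r^2\nabla^2 + \left(r\frac{\partial}{\partial r} + 1\right) r\frac{\partial}{\partial r} = -\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) - \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}. \]
Laplace's spherical harmonics are the joint eigenfunctions of the square of the orbital angular momentum and the generator of rotations about the azimuthal axis:
\[ L_z = -i\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right) = -i\frac{\partial}{\partial\varphi}. \]
These operators commute, and are densely defined self-adjoint operators on the Hilbert space of functions ƒ square-integrable with respect to the normal distribution on $\mathbb{R}^3$:
\[ \frac{1}{(2\pi)^{3/2}}\int_{\mathbb{R}^3} |f(x)|^2 e^{-|x|^2/2}\,dx < \infty. \]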
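Furthermore, $\mathbf{L}^2$ is a positive operator.
If Y is a joint eigenfunction of $\mathbf{L}^2$ and $L_z$, then by definition
\[ \mathbf{L}^2 Y = \lambda Y, \qquad L_z Y = mY \]
for some real numbers m and λ. Here m must in fact be an integer, for Y must be periodic in the coordinate φ with period a number that evenly divides 2π. Furthermore, since
\[ \mathbf{L}^2 = L_x^2 + L_y^2 + L_z^2 \]
and each of $L_x$, $L_y$, $L_z$ is self-adjoint, it follows that λ ≥ m². Denote this joint eigenspace by $E_{\lambda,m}$, and define the raising and lowering operators by
\[ L_+ = L_x + iL_y, \qquad L_- = L_x - iL_y. \]
Then $L_+$ and $L_-$ commute with $\mathbf{L}^2$, and the Lie algebra generated by $L_+$, $L_-$, $L_z$ is the special linear Lie algebra, with commutation relations
\[ [L_z, L_+] = L_+, \qquad [L_z, L_-] = -L_-, \qquad [L_+, L_-] = 2L_z. \]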
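Thus $L_+ \colon E_{\lambda,m} \to E_{\lambda,m+1}$ (it is a "raising operator") and $L_- \colon E_{\lambda,m} \to E_{\lambda,m-1}$ (it is a "lowering operator"). In particular, $L_+^k \colon E_{\lambda,m} \to E_{\lambda,m+k}$ must be zero for k sufficiently large, because the inequality λ ≥ m² must hold in each of the nontrivial joint eigenspaces. Let $Y \in E_{\lambda,m}$ be a nonzero joint eigenfunction, and let k be the least non-negative integer such that
\[ L_+^{k+1} Y = 0. \]
Then $L_+^k Y$ is a nonzero element of $E_{\lambda,m+k}$, and since
\[ L_- L_+ = \mathbf{L}^2 - L_z^2 - L_z \]
it follows that
\[ 0 = L_- L_+^{k+1} Y = \left(\lambda - (m+k)^2 - (m+k)\right) L_+^k Y. \]
Thus λ = ℓ(ℓ+1) for the non-negative integer ℓ = m+k.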
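The ladder-operator argument can be illustrated with finite matrices. The sketch below is an added illustration, assuming ħ = 1 and the standard matrix elements $L_+|m\rangle = \sqrt{\ell(\ell+1) - m(m+1)}\,|m+1\rangle$ on the basis m = −ℓ, ..., ℓ. All function names are hypothetical. It checks the commutation relations and that $L_-L_+ + L_z^2 + L_z$ equals ℓ(ℓ+1) times the identity.

```python
import math


def ladder_matrices(l):
    """Return (Lz, Lplus, Lminus) as nested lists in the basis m = -l, ..., l (hbar = 1)."""
    ms = list(range(-l, l + 1))
    n = len(ms)
    Lz = [[float(ms[i]) if i == j else 0.0 for j in range(n)] for i in range(n)]
    Lp = [[0.0] * n for _ in range(n)]
    for j, m in enumerate(ms[:-1]):
        # L+ sends the state m to m+1 with weight sqrt(l(l+1) - m(m+1)).
        Lp[j + 1][j] = math.sqrt(l * (l + 1) - m * (m + 1))
    # L- is the adjoint of L+; the entries are real, so this is the transpose.
    Lm = [[Lp[j][i] for j in range(n)] for i in range(n)]
    return Lz, Lp, Lm


def matmul(A, B):
    """Plain matrix product of two nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def combine(A, B, alpha=1.0, beta=1.0):
    """Entrywise alpha*A + beta*B."""
    return [[alpha * a + beta * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def commutator(A, B):
    """[A, B] = AB - BA."""
    return combine(matmul(A, B), matmul(B, A), 1.0, -1.0)


def close(A, B, tol=1e-12):
    """True if all entries of A and B agree to within tol."""
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def casimir(l):
    """L^2 = L- L+ + Lz^2 + Lz, rearranged from L- L+ = L^2 - Lz^2 - Lz."""
    Lz, Lp, Lm = ladder_matrices(l)
    return combine(combine(matmul(Lm, Lp), matmul(Lz, Lz)), Lz)
```

For example, `casimir(2)` should be 6 times the 5×5 identity matrix, matching λ = ℓ(ℓ+1) with ℓ = 2.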
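Conventions
Orthogonality and normalization
Several different normalizations are in common use for the Laplace spherical harmonic functions. Throughout the section, we use the standard convention that
\[ P_\ell^{-m} = (-1)^m \frac{(\ell-m)!}{(\ell+m)!} P_\ell^m \]
which is the natural normalization given by Rodrigues' formula. In physics and seismology, the Laplace spherical harmonics are generally defined as
\[ Y_\ell^m(\theta,\varphi) = \sqrt{\frac{(2\ell+1)}{4\pi}\frac{(\ell-m)!}{(\ell+m)!}}\; P_\ell^m(\cos\theta)\, e^{im\varphi} \]
which are orthonormal
\[ \int_{\theta=0}^{\pi}\int_{\varphi=0}^{2\pi} Y_\ell^m\, Y_{\ell'}^{m'*}\, d\Omega = \delta_{\ell\ell'}\,\delta_{mm'}, \]
where $\delta_{aa} = 1$, $\delta_{ab} = 0$ if a ≠ b, and dΩ = sin θ dφ dθ. This normalization is used in quantum mechanics because it ensures that probability is normalized, i.e. $\int |Y_\ell^m|^2\, d\Omega = 1$. The disciplines of geodesy and spectral analysis use
\[ Y_\ell^m(\theta,\varphi) = \sqrt{(2\ell+1)\frac{(\ell-m)!}{(\ell+m)!}}\; P_\ell^m(\cos\theta)\, e^{im\varphi} \]
which possess unit power
\[ \frac{1}{4\pi}\int_{\theta=0}^{\pi}\int_{\varphi=0}^{2\pi} Y_\ell^m\, Y_{\ell'}^{m'*}\, d\Omega = \delta_{\ell\ell'}\,\delta_{mm'}. \]
The magnetics community, in contrast, uses Schmidt semi-normalized harmonics
\[ Y_\ell^m(\theta,\varphi) = \sqrt{\frac{(\ell-m)!}{(\ell+m)!}}\; P_\ell^m(\cos\theta)\, e^{im\varphi} \]
which have the normalization
\[ \int_{\theta=0}^{\pi}\int_{\varphi=0}^{2\pi} Y_\ell^m\, Y_{\ell'}^{m'*}\, d\Omega = \frac{4\pi}{2\ell+1}\,\delta_{\ell\ell'}\,\delta_{mm'}. \]
In quantum mechanics this normalization is often used as well, and is named Racah's normalization after Giulio Racah. It can be shown that all of the above normalized spherical harmonic functions satisfy
\[ Y_\ell^{m*}(\theta,\varphi) = (-1)^m\, Y_\ell^{-m}(\theta,\varphi), \]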
where the superscript * denotes complex conjugation. Alternatively, this equation follows from the relation of the spherical harmonic functions with the Wigner D-matrix.
Condon–Shortley phase
One source of confusion with the definition of the spherical harmonic functions concerns a phase factor of $(-1)^m$, commonly referred to as the Condon–Shortley phase in the quantum mechanical literature. In the quantum mechanics community, it is common practice to either include this phase factor in the definition of the associated Legendre polynomials, or to append it to the definition of the spherical harmonic functions. There is no requirement to use the Condon–Shortley phase in the definition of the spherical harmonic functions, but including it can simplify some quantum mechanical operations, especially the application of raising and lowering operators. The geodesy and magnetics communities never include the Condon–Shortley phase factor in their definitions of the spherical harmonic functions.
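To make the conventions concrete, the following sketch evaluates the orthonormalized harmonics with the Condon–Shortley phase placed inside the associated Legendre function. It is an added illustration rather than a reference implementation: the function names are hypothetical, the three-term recurrence in the degree is a standard way to evaluate $P_\ell^m$, negative orders are obtained from the conjugation symmetry $Y_\ell^{-m} = (-1)^m Y_\ell^{m*}$, and orthonormality is only checked approximately with a midpoint rule.

```python
import cmath
import math


def assoc_legendre(l, m, x):
    """P_l^m(x) for 0 <= m <= l, including the Condon-Shortley phase (-1)^m."""
    # Seed: P_m^m(x) = (-1)^m (2m-1)!! (1 - x^2)^(m/2)
    s = math.sqrt(max(0.0, 1.0 - x * x))
    p_prev = 1.0
    for k in range(1, m + 1):
        p_prev *= -(2 * k - 1) * s
    if l == m:
        return p_prev
    p_curr = x * (2 * m + 1) * p_prev  # P_{m+1}^m
    for n in range(m + 2, l + 1):
        # (n - m) P_n^m = (2n - 1) x P_{n-1}^m - (n + m - 1) P_{n-2}^m
        p_prev, p_curr = p_curr, ((2 * n - 1) * x * p_curr - (n + m - 1) * p_prev) / (n - m)
    return p_curr


def sph_harmonic(l, m, theta, phi):
    """Orthonormalized Y_l^m; theta is the colatitude and phi the longitude."""
    if m < 0:
        # Conjugation symmetry: Y_l^{-|m|} = (-1)^|m| conj(Y_l^{|m|})
        return (-1) ** abs(m) * sph_harmonic(l, -m, theta, phi).conjugate()
    norm = math.sqrt(
        (2 * l + 1) / (4 * math.pi) * math.factorial(l - m) / math.factorial(l + m)
    )
    return norm * assoc_legendre(l, m, math.cos(theta)) * cmath.exp(1j * m * phi)


def inner_product(l1, m1, l2, m2, n_theta=200, n_phi=32):
    """Midpoint-rule estimate of the integral of Y_{l1}^{m1} conj(Y_{l2}^{m2}) over the sphere."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0 + 0.0j
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            a = sph_harmonic(l1, m1, theta, phi)
            b = sph_harmonic(l2, m2, theta, phi)
            total += a * b.conjugate() * math.sin(theta)
    return total * d_theta * d_phi
```

Dropping the minus sign in the seed loop of `assoc_legendre` would give the convention without the Condon–Shortley phase, which changes the sign of the odd-m harmonics only.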
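Real form
A real basis of spherical harmonics can be defined in terms of their complex analogues by setting
\[ Y_{\ell m} = \begin{cases} \dfrac{1}{\sqrt{2}}\left(Y_\ell^m + (-1)^m\, Y_\ell^{-m}\right) = \sqrt{2}\, N_{(\ell,m)}\, P_\ell^m(\cos\theta)\cos m\varphi & \text{if } m > 0 \\ Y_\ell^0 & \text{if } m = 0 \\ \dfrac{1}{i\sqrt{2}}\left(Y_\ell^{-m} - (-1)^m\, Y_\ell^{m}\right) = \sqrt{2}\, N_{(\ell,|m|)}\, P_\ell^{|m|}(\cos\theta)\sin |m|\varphi & \text{if } m < 0 \end{cases} \]
where $N_{(\ell,m)}$ denotes the normalization constant as a function of ℓ and m. The real form requires only associated Legendre polynomials $P_\ell^{|m|}$ of non-negative |m|. The harmonics with m > 0 are said to be of cosine type, and those with m < 0 of sine type. These real spherical harmonics are sometimes known as tesseral spherical harmonics. These functions have the same normalization properties as the complex ones above.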
Spherical harmonics expansion
The Laplace spherical harmonics form a complete set of orthonormal functions and thus form an orthonormal basis of the Hilbert space of square-integrable functions. On the unit sphere, any square-integrable function can thus be expanded as a linear combination of these:
\[ f(\theta,\varphi) = \sum_{\ell=0}^{\infty}\sum_{m=-\ell}^{\ell} f_\ell^m\, Y_\ell^m(\theta,\varphi). \]
This expansion holds in the sense of mean-square convergence (convergence in L² of the sphere), which is to say that
\[ \lim_{N\to\infty}\int_0^{2\pi}\int_0^{\pi}\left| f(\theta,\varphi) - \sum_{\ell=0}^{N}\sum_{m=-\ell}^{\ell} f_\ell^m\, Y_\ell^m(\theta,\varphi)\right|^2 \sin\theta\, d\theta\, d\varphi = 0. \]
The expansion coefficients are the analogs of Fourier coefficients, and can be obtained by multiplying the above equation by the complex conjugate of a spherical harmonic, integrating over the solid angle Ω, and utilizing the above orthogonality relationships. This is justified rigorously by basic Hilbert space theory. For the case of orthonormalized harmonics, this gives:
\[ f_\ell^m = \int_\Omega f(\theta,\varphi)\, Y_\ell^{m*}(\theta,\varphi)\, d\Omega. \]
If the coefficients decay in ℓ sufficiently rapidly (for instance, exponentially), then the series also converges uniformly to ƒ. A real square-integrable function ƒ can be expanded in terms of the real harmonics $Y_{\ell m}$ above as a sum
\[ f(\theta,\varphi) = \sum_{\ell=0}^{\infty}\sum_{m=-\ell}^{\ell} f_{\ell m}\, Y_{\ell m}(\theta,\varphi). \]
Convergence of the series holds again in the same sense.
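As a small worked example (added here for illustration, assuming the orthonormalized harmonics $Y_0^0 = 1/\sqrt{4\pi}$ and $Y_2^0 = \tfrac{1}{4}\sqrt{5/\pi}\,(3\cos^2\theta - 1)$), the function ƒ = cos²θ has only two non-zero coefficients, and the sum of their squares reproduces the integral of ƒ² over the sphere:

```latex
\begin{aligned}
\cos^2\theta &= \tfrac{1}{3} + \tfrac{1}{3}\left(3\cos^2\theta - 1\right)
             = \frac{2\sqrt{\pi}}{3}\, Y_0^0 + \frac{4}{3}\sqrt{\frac{\pi}{5}}\, Y_2^0, \\
f_0^0 &= \frac{2\sqrt{\pi}}{3}, \qquad f_2^0 = \frac{4}{3}\sqrt{\frac{\pi}{5}}, \\
|f_0^0|^2 + |f_2^0|^2 &= \frac{4\pi}{9} + \frac{16\pi}{45} = \frac{4\pi}{5}
  = \int_0^{2\pi}\!\!\int_0^{\pi} \cos^4\theta\, \sin\theta\, d\theta\, d\varphi .
\end{aligned}
```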
Spectrum analysis
Power spectrum in signal processing
The total power of a function ƒ is defined in the signal processing literature as the integral of the function squared, divided by the area of its domain. Using the orthonormality properties of the real unit-power spherical harmonic functions, it is straightforward to verify that the total power of a function defined on the unit sphere is related to its spectral coefficients by a generalization of Parseval's theorem:
\[ \frac{1}{4\pi}\int_\Omega f(\Omega)^2\, d\Omega = \sum_{\ell=0}^{\infty} S_{ff}(\ell), \]
where
\[ S_{ff}(\ell) = \sum_{m=-\ell}^{\ell} f_{\ell m}^2 \]
is defined as the angular power spectrum. In a similar manner, one can define the cross-power of two functions as
\[ \frac{1}{4\pi}\int_\Omega f(\Omega)\, g(\Omega)\, d\Omega = \sum_{\ell=0}^{\infty} S_{fg}(\ell), \]
where
\[ S_{fg}(\ell) = \sum_{m=-\ell}^{\ell} f_{\ell m}\, g_{\ell m} \]
is defined as the cross-power spectrum. If the functions ƒ and g have a zero mean (i.e., the spectral coefficients $f_{00}$ and $g_{00}$ are zero), then $S_{ff}(\ell)$ and $S_{fg}(\ell)$ represent the contributions to the function's variance and covariance for degree ℓ, respectively. It is common that the (cross-)power spectrum is well approximated by a power law of the form
\[ S_{ff}(\ell) = C\,\ell^{\beta}. \]
When β = 0, the spectrum is "white" as each degree possesses equal power. When β < 0, the spectrum is termed "red" as there is more power at the low degrees with long wavelengths than higher degrees. Finally, when β > 0, the spectrum is termed "blue". The condition on the order of growth of Sƒƒ(ℓ) is related to the order of differentiability of ƒ in the next section.
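The per-degree spectrum and its power-law exponent can be computed directly from a table of coefficients. The sketch below is illustrative only: it assumes real unit-power coefficients stored in a dictionary keyed by (ℓ, m), the function names are hypothetical, and the exponent β is estimated by an ordinary least-squares fit of log S(ℓ) against log ℓ, which is one simple choice among several.

```python
import math


def power_spectrum(coeffs):
    """Return {l: S(l)} with S(l) = sum over m of f_lm**2; coeffs maps (l, m) -> f_lm."""
    spectrum = {}
    for (l, _m), value in coeffs.items():
        spectrum[l] = spectrum.get(l, 0.0) + value * value
    return spectrum


def fit_power_law_exponent(spectrum):
    """Least-squares slope of log S(l) versus log l, using degrees l >= 1 with S(l) > 0."""
    points = [(math.log(l), math.log(s)) for l, s in spectrum.items() if l >= 1 and s > 0.0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def synthetic_coeffs(beta, l_max):
    """Made-up coefficients whose spectrum is exactly S(l) = l**beta for 1 <= l <= l_max."""
    coeffs = {}
    for l in range(1, l_max + 1):
        per_order = math.sqrt(l ** beta / (2 * l + 1))  # equal power in each order m
        for m in range(-l, l + 1):
            coeffs[(l, m)] = per_order
    return coeffs


if __name__ == "__main__":
    spectrum = power_spectrum(synthetic_coeffs(-2.0, 8))
    beta = fit_power_law_exponent(spectrum)
    colour = "white" if abs(beta) < 1e-9 else ("red" if beta < 0 else "blue")
    print(f"beta = {beta:.3f} ({colour} spectrum), total power = {sum(spectrum.values()):.4f}")
```

With the synthetic input above the fitted exponent is −2, a "red" spectrum in the terminology of this section.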
Differentiability properties
One can also understand the differentiability properties of the original function ƒ in terms of the asymptotics of $S_{ff}(\ell)$. In particular, if $S_{ff}(\ell)$ decays faster than any rational function of ℓ as ℓ → ∞, then ƒ is infinitely differentiable. If, furthermore, $S_{ff}(\ell)$ decays exponentially, then ƒ is actually real analytic on the sphere.
The general technique is to use the theory of Sobolev spaces. Statements relating the growth of the $S_{ff}(\ell)$ to differentiability are then similar to analogous results on the growth of the coefficients of Fourier series. Specifically, if
\[ \sum_{\ell=0}^{\infty} (1+\ell^2)^s\, S_{ff}(\ell) < \infty, \]
then ƒ is in the Sobolev space $H^s(S^2)$. In particular, the Sobolev embedding theorem implies that ƒ is infinitely differentiable provided that
\[ S_{ff}(\ell) = O(\ell^{-s}) \quad \text{as } \ell\to\infty \]
for all s.
Algebraic properties
Addition theorem
A mathematical result of considerable interest and use is called the addition theorem for spherical harmonics. This is a generalization of the trigonometric identity cos(θ′ − θ) = cos θ′ cos θ + sin θ sin θ′ in which the role of the trigonometric functions appearing on the right-hand side is played by the spherical harmonics and that of the left-hand side is played by the Legendre polynomials. Consider two unit vectors x and y, having spherical coordinates (θ,φ) and (θ′,φ′), respectively. The addition theorem states
\[ P_\ell(\mathbf{x}\cdot\mathbf{y}) = \frac{4\pi}{2\ell+1}\sum_{m=-\ell}^{\ell} Y_\ell^m(\theta',\varphi')\, Y_\ell^{m*}(\theta,\varphi) \qquad (1) \]
where Pℓ is the Legendre polynomial of degree ℓ. This expression is valid for both real and complex harmonics. The result can be proven analytically, using the properties of the Poisson kernel in the unit ball, or geometrically by applying a rotation to the vector y so that it points along the z-axis, and then directly calculating the right-hand side. In particular, when x = y, this gives Unsöld's theorem
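\[ \sum_{m=-\ell}^{\ell} Y_\ell^{m*}(\theta,\varphi)\, Y_\ell^m(\theta,\varphi) = \frac{2\ell+1}{4\pi} \]
which generalizes the identity cos²θ + sin²θ = 1 to two dimensions. In the expansion (1), the left-hand side $P_\ell(\mathbf{x}\cdot\mathbf{y})$ is a constant multiple of the degree ℓ zonal spherical harmonic. From this perspective, one has the following generalization to higher dimensions. Let $Y_j$ be an arbitrary orthonormal basis of the space $H_\ell$ of degree ℓ spherical harmonics on the n-sphere. Then $Z_{\mathbf{x}}^{(\ell)}$, the degree ℓ zonal harmonic corresponding to the unit vector x, decomposes as
\[ Z_{\mathbf{x}}^{(\ell)}(\mathbf{y}) = \sum_{j=1}^{\dim H_\ell} \overline{Y_j(\mathbf{x})}\, Y_j(\mathbf{y}) \qquad (2) \]
Furthermore, the zonal harmonic $Z_{\mathbf{x}}^{(\ell)}(\mathbf{y})$ is given as a constant multiple of the appropriate Gegenbauer polynomial:
\[ Z_{\mathbf{x}}^{(\ell)}(\mathbf{y}) \propto C_\ell^{((n-1)/2)}(\mathbf{x}\cdot\mathbf{y}) \qquad (3) \]
Combining (2) and (3) gives (1) in dimension n = 2 when x and y are represented in spherical coordinates. Finally, evaluating at x = y gives the functional identity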
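\[ \sum_{j=1}^{\dim H_\ell} |Y_j(\mathbf{x})|^2 = \frac{\dim H_\ell}{\omega_n} \]
where $\omega_n$ is the volume of the n-sphere; for n = 2 this is 4π, which recovers Unsöld's theorem.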
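The case ℓ = 1 of the addition theorem can be verified by hand. This worked check is an added illustration and assumes the orthonormalized harmonics with the Condon–Shortley phase, $Y_1^0 = \sqrt{3/4\pi}\,\cos\theta$ and $Y_1^{\pm 1} = \mp\sqrt{3/8\pi}\,\sin\theta\, e^{\pm i\varphi}$:

```latex
\begin{aligned}
\frac{4\pi}{3}\sum_{m=-1}^{1} Y_1^m(\theta',\varphi')\, Y_1^{m*}(\theta,\varphi)
 &= \cos\theta'\cos\theta
  + \tfrac{1}{2}\sin\theta'\sin\theta
    \left(e^{i(\varphi'-\varphi)} + e^{-i(\varphi'-\varphi)}\right) \\
 &= \cos\theta\cos\theta' + \sin\theta\sin\theta'\cos(\varphi'-\varphi) \\
 &= \mathbf{x}\cdot\mathbf{y} = P_1(\mathbf{x}\cdot\mathbf{y}).
\end{aligned}
```

The middle line is the usual expression for the cosine of the angle between two directions on the sphere, so the ℓ = 1 addition theorem is the spherical law of cosines.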
Clebsch-Gordan coefficients
The Clebsch-Gordan coefficients are the coefficients appearing in the expansion of the product of two spherical harmonics in terms of spherical harmonics themselves. A variety of techniques are available for doing essentially the same calculation, including the Wigner 3-jm symbol, the Racah coefficients, and the Slater integrals. Abstractly, the Clebsch-Gordan coefficients express the tensor product of two irreducible representations of the rotation group as a sum of irreducible representations: suitably normalized, the coefficients are then the multiplicities.
Parity
The spherical harmonics have well defined parity in the sense that they are either even or odd with respect to reflection about the origin. Reflection about the origin is represented by the operator $P\Psi(\mathbf{r}) = \Psi(-\mathbf{r})$. For the spherical angles {θ,φ}, this corresponds to the replacement {π − θ, π + φ}. The associated Legendre polynomial gives a factor $(-1)^{\ell-m}$ and the exponential function gives $(-1)^m$, together giving the spherical harmonics a parity of $(-1)^\ell$:
\[ Y_\ell^m(\theta,\varphi) \to Y_\ell^m(\pi-\theta,\pi+\varphi) = (-1)^\ell\, Y_\ell^m(\theta,\varphi). \]
This remains true for spherical harmonics in higher dimensions: applying a point reflection to a spherical harmonic of degree ℓ changes the sign by a factor of $(-1)^\ell$.
Visualization of the spherical harmonics
Figure: Schematic representation of $Y_{\ell m}$ on the unit sphere and its nodal lines. $\operatorname{Re}[Y_{\ell m}]$ is equal to 0 along m great circles passing through the poles, and along ℓ−m circles of equal latitude. The function changes sign each time it crosses one of these lines.
Figure: 3D color plot of the spherical harmonics of degree n = 5.
The Laplace spherical harmonics $Y_\ell^m$ can be visualized by considering their "nodal lines", that is, the set of points on the sphere where $\operatorname{Re}[Y_\ell^m] = 0$, or alternatively where $\operatorname{Im}[Y_\ell^m] = 0$. Nodal lines of $Y_\ell^m$ are composed of circles: some are latitudes and others are longitudes. One can determine the number of nodal lines of each type by counting the number of zeros of $Y_\ell^m$ in the latitudinal and longitudinal directions independently. For the latitudinal direction, the associated Legendre polynomials possess ℓ−|m| zeros, whereas for the longitudinal direction, the trigonometric sin and cos functions possess 2|m| zeros.
When the spherical harmonic order m is zero (upper-left in the figure), the spherical harmonic functions do not depend upon longitude, and are referred to as zonal. Such spherical harmonics are a special case of zonal spherical functions. When ℓ = |m| (bottom-right in the figure), there are no zero crossings in latitude, and the functions are referred to as sectoral. For the other cases, the functions checker the sphere, and they are referred to as tesseral. More general spherical harmonics of degree ℓ are not necessarily those of the Laplace basis $Y_\ell^m$, and their nodal sets can be of a fairly general kind.
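List of spherical harmonics
Analytic expressions for the first few orthonormalized Laplace spherical harmonics that use the Condon-Shortley phase convention:
\[ Y_0^0(\theta,\varphi) = \frac{1}{2}\sqrt{\frac{1}{\pi}} \]
\[ Y_1^{-1}(\theta,\varphi) = \frac{1}{2}\sqrt{\frac{3}{2\pi}}\,\sin\theta\, e^{-i\varphi}, \qquad Y_1^0(\theta,\varphi) = \frac{1}{2}\sqrt{\frac{3}{\pi}}\,\cos\theta, \qquad Y_1^1(\theta,\varphi) = -\frac{1}{2}\sqrt{\frac{3}{2\pi}}\,\sin\theta\, e^{i\varphi} \]
\[ Y_2^{-2}(\theta,\varphi) = \frac{1}{4}\sqrt{\frac{15}{2\pi}}\,\sin^2\theta\, e^{-2i\varphi}, \qquad Y_2^{-1}(\theta,\varphi) = \frac{1}{2}\sqrt{\frac{15}{2\pi}}\,\sin\theta\cos\theta\, e^{-i\varphi} \]
\[ Y_2^0(\theta,\varphi) = \frac{1}{4}\sqrt{\frac{5}{\pi}}\,(3\cos^2\theta - 1) \]
\[ Y_2^1(\theta,\varphi) = -\frac{1}{2}\sqrt{\frac{15}{2\pi}}\,\sin\theta\cos\theta\, e^{i\varphi}, \qquad Y_2^2(\theta,\varphi) = \frac{1}{4}\sqrt{\frac{15}{2\pi}}\,\sin^2\theta\, e^{2i\varphi} \]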
Higher dimensions
The classical spherical harmonics are defined as functions on the unit sphere $S^2$ inside three-dimensional Euclidean space. Spherical harmonics can be generalized to higher-dimensional Euclidean space $\mathbb{R}^n$ as follows. Let $P_\ell$ denote the space of homogeneous polynomials of degree ℓ in n variables. That is, a polynomial P is in $P_\ell$ provided that
\[ P(\lambda\mathbf{x}) = \lambda^\ell P(\mathbf{x}). \]
Let $A_\ell$ denote the subspace of $P_\ell$ consisting of all harmonic polynomials; these are the solid spherical harmonics. Let $H_\ell$ denote the space of functions on the unit sphere
\[ S^{n-1} = \{\mathbf{x}\in\mathbb{R}^n : |\mathbf{x}| = 1\} \]
obtained by restriction from $A_\ell$. The following properties hold:
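The spaces $H_\ell$ are dense in the set of continuous functions on $S^{n-1}$ with respect to the uniform topology, by the Stone-Weierstrass theorem. As a result, they are also dense in the space $L^2(S^{n-1})$ of square-integrable functions on the sphere.
For all ƒ ∈ $H_\ell$, one has
\[ \Delta_{S^{n-1}} f = -\ell(\ell+n-2)\, f \]
where $\Delta_{S^{n-1}}$ is the Laplace-Beltrami operator on $S^{n-1}$. This operator is the analog of the angular part of the Laplacian in three dimensions; to wit, the Laplacian in n dimensions decomposes as
\[ \nabla^2 = r^{1-n}\frac{\partial}{\partial r}\, r^{n-1}\frac{\partial}{\partial r} + r^{-2}\Delta_{S^{n-1}}. \]
It follows from the Stokes theorem and the preceding property that the spaces $H_\ell$ are orthogonal with respect to the inner product from $L^2(S^{n-1})$. That is to say,
\[ \int_{S^{n-1}} f\,\bar{g}\, d\Omega = 0 \]
for ƒ ∈ $H_\ell$ and g ∈ $H_k$ for k ≠ ℓ.
Conversely, the spaces $H_\ell$ are precisely the eigenspaces of $\Delta_{S^{n-1}}$. In particular, an application of the spectral theorem to the Riesz potential gives another proof that the spaces $H_\ell$ are pairwise orthogonal and complete in $L^2(S^{n-1})$.
Every homogeneous polynomial P ∈ $P_\ell$ can be uniquely written in the form
\[ P(x) = P_\ell(x) + |x|^2 P_{\ell-2}(x) + \cdots + \begin{cases} |x|^{\ell} P_0 & \ell \text{ even} \\ |x|^{\ell-1} P_1(x) & \ell \text{ odd} \end{cases} \]
where $P_j \in A_j$. In particular,
\[ \dim H_\ell = \binom{n+\ell-1}{n-1} - \binom{n+\ell-3}{n-1}. \]
An orthogonal basis of spherical harmonics in higher dimensions can be constructed inductively by the method of separation of variables, by solving the Sturm-Liouville problem for the spherical Laplacian
\[ \Delta_{S^{n-1}} = \sin^{2-n}\varphi\,\frac{\partial}{\partial\varphi}\,\sin^{n-2}\varphi\,\frac{\partial}{\partial\varphi} + \sin^{-2}\varphi\,\Delta_{S^{n-2}} \]
where φ is the axial coordinate in a spherical coordinate system on $S^{n-1}$.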
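The dimension of the space of degree ℓ spherical harmonics on $S^{n-1}$ is the number of homogeneous polynomials of degree ℓ minus the number of degree ℓ−2, that is, C(n+ℓ−1, n−1) − C(n+ℓ−3, n−1). The short sketch below (an added illustration; the function name `dim_harmonic` is hypothetical) evaluates this count and can be used to confirm the familiar value 2ℓ+1 for n = 3.

```python
from math import comb


def dim_harmonic(n, l):
    """Dimension of the degree-l spherical harmonics on S^(n-1) inside R^n.

    Uses dim P_l - dim P_(l-2), where dim P_l = C(n + l - 1, n - 1) counts the
    homogeneous polynomials of degree l in n variables.
    """
    if n < 2 or l < 0:
        raise ValueError("expected n >= 2 and l >= 0")
    all_polys = comb(n + l - 1, n - 1)
    # There are no polynomials of negative degree, so the correction vanishes for l < 2.
    lower_polys = comb(n + l - 3, n - 1) if l >= 2 else 0
    return all_polys - lower_polys


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, [dim_harmonic(n, l) for l in range(6)])
```

For n = 2 the count is 1 for ℓ = 0 and 2 otherwise (cos ℓφ and sin ℓφ), and for n = 4 it is (ℓ+1)².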
Connection with representation theory
The space $H_\ell$ of spherical harmonics of degree ℓ is a representation of the symmetry group of rotations around a point (SO(3)) and its double-cover SU(2). Indeed, rotations act on the two-dimensional sphere, and thus also on $H_\ell$ by function composition
\[ \psi \mapsto \psi\circ\rho^{-1} \]
for ψ a spherical harmonic and ρ a rotation. The representation $H_\ell$ is an irreducible representation of SO(3). The elements of $H_\ell$ arise as the restrictions to the sphere of elements of $A_\ell$: harmonic polynomials homogeneous of degree ℓ on three-dimensional Euclidean space $\mathbb{R}^3$. By polarization of ψ ∈ $A_\ell$, there are coefficients $\psi_{i_1\dots i_\ell}$ symmetric on the indices, uniquely determined by the requirement
\[ \psi(x_1,\dots,x_n) = \sum_{i_1\dots i_\ell} \psi_{i_1\dots i_\ell}\, x_{i_1}\cdots x_{i_\ell}. \]
The condition that ψ be harmonic is equivalent to the assertion that the tensor $\psi_{i_1\dots i_\ell}$ must be trace free on every pair of indices. Thus as an irreducible representation of SO(3), $H_\ell$ is isomorphic to the space of traceless symmetric tensors of degree ℓ. More generally, the analogous statements hold in higher dimensions: the space $H_\ell$ of spherical harmonics on the n-sphere is the irreducible representation of SO(n+1) corresponding to the traceless symmetric ℓ-tensors. However, whereas every irreducible tensor representation of SO(2) and SO(3) is of this kind, the special orthogonal groups in higher dimensions have additional irreducible representations that do not arise in this manner. The special orthogonal groups have additional spin representations that are not tensor representations, and are typically not spherical harmonics. An exception is the spin representation of SO(3): strictly speaking these are representations of the double cover SU(2) of SO(3). In turn, SU(2) is identified with the group of unit quaternions, and so coincides with the 3-sphere. The spaces of spherical harmonics on the 3-sphere are certain spin representations of SO(3), with respect to the action by quaternionic multiplication.
Generalizations
The angle-preserving symmetries of the two-sphere are described by the group of Möbius transformations PSL(2,C). With respect to this group, the sphere is equivalent to the usual Riemann sphere. The group PSL(2,C) is isomorphic to the (proper) Lorentz group, and its action on the two-sphere agrees with the action of the Lorentz group on the celestial sphere in Minkowski space. The analog of the spherical harmonics for the Lorentz group is given by the hypergeometric series; furthermore, the spherical harmonics can be re-expressed in terms of the hypergeometric series, as SO(3) = PSU(2) is a subgroup of PSL(2,C). More generally, hypergeometric series can be generalized to describe the symmetries of any symmetric space; in particular, hypergeometric series can be developed for any Lie group.
Chapter-9
Variational Inequality and Underdetermined System
Variational inequality
In mathematics, a variational inequality is an inequality involving a functional, which has to be solved for all possible values of a given variable, belonging usually to a convex set. The mathematical theory of variational inequalities was initially developed to deal with equilibrium problems, precisely the Signorini problem: in that model problem, the functional involved was obtained as the first variation of the involved potential energy; it therefore has a variational origin, recalled by the name of the general abstract problem. The applicability of the theory has since been expanded to include problems from economics, finance, optimization and game theory.
History
The first problem involving a variational inequality was the Signorini problem, posed by Antonio Signorini in 1959 and solved by Gaetano Fichera in 1963, according to the references (Antman 1983, pp. 282–284) and (Fichera 1995): the first papers of the theory were (Fichera 1963) and (Fichera 1964a), (Fichera 1964b). Later on, Guido Stampacchia proved his generalization of the Lax-Milgram theorem in (Stampacchia 1964) in order to study the regularity problem for partial differential equations, and coined the name "variational inequality" for all the problems involving inequalities of this kind. Georges Duvaut encouraged his graduate students to study and expand on Fichera's work after attending a conference in Brixen in 1965 where Fichera presented his study of the Signorini problem, as Antman (1983, p. 283) reports: thus the theory became widely known throughout France. Also in 1965, Stampacchia and Jacques-Louis Lions extended earlier results of (Stampacchia 1964), announcing them in the paper (Lions & Stampacchia 1965): full proofs of their results appeared later in the paper (Lions & Stampacchia 1967).
Definition
Following Antman (1983, p. 283), the formal definition of a variational inequality is the following one.
Definition 1. Given a Banach space E, a subset K of E, and a functional $F \colon K \to E^*$ from K to the dual space E* of the space E, the variational inequality problem is the problem of solving, with respect to the variable x belonging to K, the following inequality:
\[ \langle F(x), y - x\rangle \ge 0 \qquad \forall y \in K \]
where $\langle\cdot,\cdot\rangle \colon E^*\times E \to \mathbb{R}$ is the duality pairing.
In general, the variational inequality problem can be formulated on any finite- or infinite-dimensional Banach space. The three obvious steps in the study of the problem are the following ones:
1. Prove the existence of a solution: this step implies the mathematical correctness of the problem, showing that there is at least a solution.
2. Prove the uniqueness of the given solution: this step implies the physical correctness of the problem, showing that the solution can be used to represent a physical phenomenon. It is a particularly important step since most of the problems modeled by variational inequalities are of physical origin.
3. Find the solution.
Examples
The problem of finding the minimal value of a real-valued function of real variable
This is a standard example problem, reported by Antman (1983, p. 283): consider the problem of finding the minimal value of a differentiable function f over a closed interval I = [a,b]. Let x* be a point in I where the minimum occurs. Three cases can occur:
1. if a < x* < b, then f′(x*) = 0;
2. if x* = a, then f′(x*) ≥ 0;
3. if x* = b, then f′(x*) ≤ 0.
These necessary conditions can be summarized as the problem of finding x* ∈ I such that
\[ f'(x^*)(y - x^*) \ge 0 \qquad \forall y \in I. \]
The absolute minimum must be sought among the solutions (if more than one) of the preceding inequality: note that the solution is a real number, therefore this is a finite dimensional variational inequality.
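The three cases can be tested mechanically, since the expression f′(x*)(y − x*) is linear in y and it is enough to check the two endpoints y = a and y = b. The sketch below is an added illustration with hypothetical function names; it assumes the interior stationary points of f are supplied by the caller, and uses f(x) = x³ − 3x on [−3, 3] as a made-up example.

```python
def satisfies_vi(df, x_star, a, b, tol=1e-12):
    """True if f'(x*) (y - x*) >= 0 for every y in [a, b].

    The expression is linear in y, so checking y = a and y = b suffices.
    """
    slope = df(x_star)
    return slope * (a - x_star) >= -tol and slope * (b - x_star) >= -tol


def vi_solutions(df, a, b, stationary_points):
    """Endpoints and interior stationary points that solve the variational inequality."""
    candidates = [a, b] + [x for x in stationary_points if a < x < b]
    return sorted(x for x in candidates if satisfies_vi(df, x, a, b))


if __name__ == "__main__":
    f = lambda x: x**3 - 3 * x
    df = lambda x: 3 * x**2 - 3          # stationary points at x = -1 and x = 1
    solutions = vi_solutions(df, -3.0, 3.0, [-1.0, 1.0])
    best = min(solutions, key=f)
    print("solutions of the inequality:", solutions)
    print("absolute minimum at x =", best, "with f =", f(best))
```

In this example the inequality has three solutions (the left endpoint and both stationary points, including the local maximum at x = −1), which shows why the conditions are only necessary and the absolute minimum still has to be selected among them.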
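The general finite dimensional variational inequality
A formulation of the general problem in $\mathbb{R}^n$ is the following: given a subset K of $\mathbb{R}^n$ and a mapping $F \colon K \to \mathbb{R}^n$, the finite-dimensional variational inequality problem associated with K consists of finding an n-dimensional vector x belonging to K such that
\[ \langle F(x), y - x\rangle \ge 0 \qquad \forall y \in K \]
where $\langle\cdot,\cdot\rangle \colon \mathbb{R}^n\times\mathbb{R}^n \to \mathbb{R}$ is the standard inner product on the vector space $\mathbb{R}^n$.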
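One simple way to approximate a solution when K is a box is the fixed-point iteration x ← P_K(x − τF(x)), where P_K is the Euclidean projection onto K. This projection method is not described in the text above; it is a standard technique swapped in here for illustration. It is only expected to converge under extra assumptions (for example F strongly monotone and Lipschitz, with a small enough step τ). The function names, the step size and the example mapping F(x) = x − c are all assumptions of this sketch.

```python
from itertools import product


def project_box(x, lower, upper):
    """Euclidean projection of x onto the box with the given lower and upper corners."""
    return [min(max(xi, lo), hi) for xi, lo, hi in zip(x, lower, upper)]


def solve_vi_box(F, x0, lower, upper, step=0.5, iters=200):
    """Fixed-point iteration x <- P_K(x - step * F(x)) on a box K."""
    x = list(x0)
    for _ in range(iters):
        fx = F(x)
        x = project_box([xi - step * fi for xi, fi in zip(x, fx)], lower, upper)
    return x


def vi_residual(F, x, lower, upper):
    """Smallest value of <F(x), y - x> over the vertices y of the box.

    The pairing is linear in y, so a non-negative minimum over the vertices
    means the inequality holds on the whole box.
    """
    fx = F(x)
    return min(
        sum(fi * (yi - xi) for fi, yi, xi in zip(fx, y, x))
        for y in product(*zip(lower, upper))
    )


if __name__ == "__main__":
    c = [1.5, -0.3, 0.4]                       # made-up data
    F = lambda x: [xi - ci for xi, ci in zip(x, c)]
    lower, upper = [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]
    x = solve_vi_box(F, [0.5, 0.5, 0.5], lower, upper)
    print("approximate solution:", x)
    print("residual:", vi_residual(F, x, lower, upper))
```

For this choice of F the solution is the projection of c onto the box, so the iteration should approach (1, 0, 0.4).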
The variational inequality for the Signorini problem
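Figure: The classical Signorini problem: what will be the equilibrium configuration of the orange spherically shaped elastic body resting on the blue rigid frictionless plane?
In the historical survey (Fichera 1995), Gaetano Fichera describes the genesis of his solution to the Signorini problem: the problem consists in finding the elastic equilibrium configuration of an anisotropic non-homogeneous elastic body that lies in a subset A of the three-dimensional Euclidean space whose boundary is ∂A, resting on a rigid frictionless surface and subject only to its mass forces. The solution u of the problem exists and is unique (under precise assumptions) in the set of admissible displacements $\mathcal{U}_\Sigma$, i.e. the set of displacement vectors satisfying the system of ambiguous boundary conditions, if and only if
\[ B(u, v - u) - F(v - u) \ge 0 \qquad \forall v \in \mathcal{U}_\Sigma \]
where B(u,v) and F(v) are the following functionals, written using the Einstein notation
\[ B(u,v) = -\int_A \sigma_{ik}(u)\,\varepsilon_{ik}(v)\, dx, \qquad F(v) = \int_A v_i f_i\, dx + \int_{\partial A\setminus\Sigma} v_i g_i\, d\sigma, \qquad u, v \in \mathcal{U}_\Sigma \]
where, for all x ∈ A,
• Σ is the contact surface (or more generally a contact set),
• $f(x) = (f_1(x), f_2(x), f_3(x))$ is the body force applied to the body,
• $g(x) = (g_1(x), g_2(x), g_3(x))$ is the surface force applied to ∂A∖Σ,
• $\varepsilon = \varepsilon(u) = (\varepsilon_{ik}(u)) = \left(\tfrac{1}{2}\left(\frac{\partial u_i}{\partial x_k} + \frac{\partial u_k}{\partial x_i}\right)\right)$ is the infinitesimal strain tensor,
• $\sigma = (\sigma_{ik})$ is the Cauchy stress tensor, defined as
\[ \sigma_{ik} = -\frac{\partial W}{\partial\varepsilon_{ik}} \qquad \forall i,k = 1,2,3 \]
where $W(\varepsilon) = a_{ikjh}(x)\,\varepsilon_{ik}\,\varepsilon_{jh}$ is the elastic potential energy and $a(x) = (a_{ikjh}(x))$ is the elasticity tensor.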
Underdetermined system
In mathematics, a system of linear equations is considered underdetermined if there are fewer equations than unknowns. The terminology can be described in terms of the concept of counting constraints. Each unknown can be seen as an available degree of freedom. Each equation introduced into the system can be viewed as a constraint that restricts one degree of freedom. Therefore the critical case occurs when the number of equations and the number of independent variables are equal: for every degree of freedom, there exists a corresponding constraint. The underdetermined case occurs when the system has been underconstrained, that is, when the unknowns outnumber the equations.
Systems of equations
An example in two dimensions
#1 A system of three equations, three lines, three intersections, pairwise linearly independent
#2 A system of three equations, three lines (two parallel), two intersections
#3 A system of three equations, three lines (three parallel), no intersections
#4 A system of three equations, three lines (two collinear), one intersection, one linearly dependent
#5 A system of three equations, three lines, one intersection, one linearly dependent
#6 A system of three equations, three lines, infinite intersection, all linearly dependent
Consider the system of 3 equations and 2 unknowns (x1 and x2):
2x1 + x2 = −1
−3x1 + x2 = −2
−x1 + x2 = 1
Note: the equations above correspond to picture #1, with x1 = x and x2 = y in the Cartesian coordinate plane.
There are three "solutions" to the system, as can be determined from the graph's intersections, one for each pair of linear equations: between one and two (0.2, −1.4), between one and three (−2/3, 1/3), and between two and three (1.5, 2.5). However, there is no solution that satisfies all three simultaneously. Systems of this variety are deemed inconsistent. The only case where the overdetermined system does in fact have a solution is demonstrated in pictures four, five, and six. These exceptions occur when the overdetermined set contains linearly dependent equations. Linear dependence means that some equations of the set can be described in terms of the others. For example, y = x + 1 and 2y = 2x + 2 are linearly dependent equations.
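The three pairwise intersections quoted above can be reproduced exactly with rational arithmetic. The sketch below is an added illustration with hypothetical helper names; it uses Cramer's rule on each pair of equations and then tests whether any intersection point satisfies all three equations. The consistency test assumes that at least one pair of lines is not parallel.

```python
from fractions import Fraction
from itertools import combinations

# Each equation a*x1 + b*x2 = c is stored as the triple (a, b, c).
EQUATIONS = [(2, 1, -1), (-3, 1, -2), (-1, 1, 1)]


def intersect(eq_a, eq_b):
    """Solve a 2x2 system by Cramer's rule; return None if the lines are parallel."""
    a1, b1, c1 = eq_a
    a2, b2, c2 = eq_b
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    return Fraction(c1 * b2 - c2 * b1, det), Fraction(a1 * c2 - a2 * c1, det)


def satisfies(eq, point):
    """True if the point lies on the line described by eq."""
    a, b, c = eq
    x1, x2 = point
    return a * x1 + b * x2 == c


def pairwise_intersections(equations):
    """Map each pair of 1-based equation numbers to its intersection point (or None)."""
    return {
        (i + 1, j + 1): intersect(equations[i], equations[j])
        for i, j in combinations(range(len(equations)), 2)
    }


def is_consistent(equations):
    """True if some pairwise intersection satisfies every equation.

    Assumes at least one pair of lines is not parallel.
    """
    points = [p for p in pairwise_intersections(equations).values() if p is not None]
    return any(all(satisfies(eq, p) for eq in equations) for p in points)


if __name__ == "__main__":
    for pair, point in pairwise_intersections(EQUATIONS).items():
        print(pair, point)
    print("consistent:", is_consistent(EQUATIONS))
```

Replacing one equation by a multiple of another, as in the y = x + 1 and 2y = 2x + 2 example, makes `is_consistent` return True, which corresponds to the exceptional pictures mentioned above.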
Matrix form
Any system of linear equations can be written as a matrix equation. The previous system of equations can be written as follows:
\[ \begin{bmatrix} 2 & 1 \\ -3 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} -1 \\ -2 \\ 1 \end{bmatrix} \]
Notice that the rows outnumber the columns. In linear algebra the concepts of row space, column space and null space are important for determining the properties of matrices. The informal discussion of constraints and degrees of freedom above relates directly to these more formal concepts.
General cases
Homogeneous case
The homogeneous case is always consistent (because there is a trivial, all-zero solution). There are two cases, in rough terms: there is just the trivial solution, or the trivial solution plus an infinite set of solutions, whose nature is determined by the number of linearly dependent equations. Consider the system of linear equations Li = 0 for 1 ≤ i ≤ M, in variables X1, X2, ..., XN; then X1 = X2 = ... = XN = 0 is always a solution. When M < N the system is underdetermined and there are always further solutions. In fact the dimension of the space of solutions is always at least N − M.
For M ≥ N, there may be no solution other than all values being 0. There will be other solutions just when the system of equations has enough dependencies (linearly dependent elements), so that the number of effective constraints is less than the apparent number M; more precisely the system must reduce to at most N − 1 equations. All we can be sure about is that it will reduce to at most N.
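The dimension count for the homogeneous case can be checked by computing the rank of the coefficient matrix: the solution space has dimension N minus the rank, which is at least N − M. The sketch below is an added illustration with hypothetical function names; it uses Gaussian elimination over the rationals so that no rounding is involved.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0  # number of pivot rows found so far
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def nullity(rows):
    """Dimension of the solution space of the homogeneous system with these coefficient rows."""
    return len(rows[0]) - rank(rows)


if __name__ == "__main__":
    print(nullity([[1, 2, 3], [4, 5, 6]]))       # M = 2 < N = 3, independent rows
    print(nullity([[1, 2, 3], [2, 4, 6]]))       # M = 2 < N = 3, dependent rows
    print(nullity([[2, 1], [-3, 1], [-1, 1]]))   # M = 3 > N = 2, only the trivial solution
    print(nullity([[1, 1], [2, 2], [3, 3]]))     # M = 3 > N = 2, enough dependencies
```

The last two calls illustrate the M ≥ N discussion: non-trivial solutions appear only when dependencies reduce the number of effective constraints below N.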
Inhomogeneous case
When studying systems of linear equations, Li = ci for 1 ≤ i ≤ M, in variables X1, X2, ..., XN, the equations Li are sometimes linearly dependent. These dependent equations (often described in terms of vectors) lead to three possible cases for an overdetermined system.
• M equations* and N unknowns*, such that M > N and all M are linearly independent. This case yields no solution.
• M equations and N unknowns, such that M > N but not all M are linearly independent. There exist three possible sub-cases of this:
o When the linearly dependent equations (D) are removed, M − D > N. This case yields no solutions.
o When the linearly dependent equations (D) are removed, M − D = N. This case yields a single solution.
o When the linearly dependent equations (D) are removed, M − D < N. This case yields infinitely many solutions, which can be described as F, which is the entirety of the field in which the equations are operating.
• M equations and N unknowns, such that M < N and all M are linearly dependent. This case yields infinitely many solutions except in some cases where the solution is sparse.
Note: (*Equations and unknowns can correspond to the rows and columns of a matrix respectively.)
The discussion is more convincing, perhaps, when translated into the geometric language of intersecting hyperplanes. The homogeneous case applies to hyperplanes through a given point, taken as origin of coordinates. The inhomogeneous case is for general hyperplanes, which may therefore exhibit parallelism or intersect. A sequence of hyperplanes H1, H2, ..., HM gives rise to intersections of the first k, which are expected to drop in dimension 1 each time. If M > N, the dimension of the ambient space, we expect the intersection to be empty, and this is precisely the overdetermined case.
Exact solutions to some underdetermined systems
While in the general case there are an infinite number of solutions to an underdetermined system, there is a subset of problems fitting the compressed sensing framework, where sparse solutions can be found to be the unique solutions of an underdetermined system. Exact solutions can be found using a variety of solvers implementing a wide range of techniques ranging from linear programming to greedy algorithms.
In general use
Underdetermined systems appear in many problems where there are far fewer equations than unknowns, such as genomics, sub-Nyquist sampling and compressed sensing. The concept can also be applied to more general systems of equations, such as partial differential equations.