[i]
Alan U. Kennington
Differential geometry reconstructed a unified systematic framework
First edition [work in progress]
[Cover illustration. Recoverable labels: Ω1, Ω2, logic, mathematics, physics, chemistry, biology, neuroscience, anthropology]
[ www.topology.org/tex/conc/dg.html ]
[ draft: UTC 2009–3–21 Saturday 11:36 ]
Differential geometry reconstructed. Copyright (C) 2009, Alan U. Kennington. All Rights Reserved. The author hereby grants permission to print this book draft in A4 format. Printing in all other formats is forbidden. You may not charge any fee for copies of this book draft.
[ii]
Mathematics Subject Classification (MSC 2000): 53–01
Library cataloguing data
Kennington, Alan Ulrich (1953–)
Differential geometry reconstructed: a unified systematic framework
516.3
First printing, March 2009
[work in progress]
Copyright © 2009, Alan U. Kennington. All rights reserved.
The author hereby grants permission to print this book draft in A4 format. Printing in all other formats is forbidden. You may not charge any fee for copies of this book draft.
This book was typeset by the author with the plain TeX typesetting system. The illustrations in this book were created with MetaPost.
This book is available on the Internet: http://www.topology.org/tex/conc/dg.html
[iii]
Preface

This book should be suitable for fourth year university mathematics, physics and engineering students, or for anyone who has already learned differential geometry but has an uneasy feeling that they may have skimmed over a few too many fine points. The intention here is to replace intuition and hand-waving with a seamless, systematic exposition. However, this is only a definitions book, not a theorems book. The reader must look elsewhere for serious theorems and serious applications. But understanding definitions is obviously an enormously important part of understanding theorems and applications.

The author wrote the first 112 pages of this book in early 1992 in Bonn on his Atari ST computer. After nine years of neglect, he wrote another 310 pages from August 2001 to November 2002. It is still a scruffy “work in progress” scrapbook, but it may be ready for a first printing some time in 2009. Then there should be one or two printings every year thereafter.

Right now, this book looks more like a construction site than a finished building. With some imagination, you may be able to conjure up a vision of the finished work through the scaffolding. Material is being added and rewritten in many chapters and sections simultaneously.

The creative process for producing this book is illustrated in the following diagram. All processes are happening concurrently.

[Diagram. Recoverable labels: books, brain, desk, workstation, web server, Internet; ideas, notes, TeX files, PostScript files; read, write, type, upload, download]

The current strategy is to first type in all of my hand-written notes during the “ideas capture phase”. Then during the “consolidation phase”, everything will be made neat, tidy, comprehensible and coherent.

The book is being assembled like a jigsaw puzzle. Some of the pieces are fitting together nicely already, but most pieces are in disorganized heaps. Many pieces are still in the box waiting to be thrown on the table. Sometimes new pieces must be crafted by hand. It’s like moving into a new house. First you dump all the boxes on the floor; then you must put everything where it belongs.

Inconsistencies, repetition, self-indulgence and frivolity will be progressively removed. All of the theorems will be proved. All of the exercises will be solved. Formative chaos will yield to serene order. It won’t happen overnight, but it will happen.

March 2009
Dr. Alan U. Kennington
Melbourne, Victoria
Australia
Disclaimer

The author of this book disclaims any express or implied guarantee of the fitness of this book for any purpose. In no event shall the author of this book be held liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this book, even if advised of the possibility of such damage.

Biography

The author was born in England in 1953 to a German mother and Irish father. The family migrated in 1963 to Adelaide, South Australia. The author graduated from the University of Adelaide in 1984 with a Ph.D. in mathematics. He was a tutor at University of Melbourne in 1984, research assistant at the Australian National University (Canberra) in early 1985, Assistant Professor at University of Kentucky for the 1985/86 academic year, and visiting researcher at the University of Heidelberg, Germany, in 1986/87. From 1987 to 2007, the author carried out research and development of communications and information technologies in Australia, Germany and the Netherlands.
[iv]
Chapters

1. Introduction . . . 1

Part I. Preliminary topics
2. Philosophical considerations . . . 17
3. Logic semantics . . . 65
4. Logic methods . . . 113
5. Sets . . . 159
6. Relations and functions . . . 205
7. Order and integers . . . 227
8. Rational and real numbers . . . 249
9. Algebra . . . 261
10. Linear algebra . . . 289
11. Matrix algebra . . . 307
12. Affine spaces . . . 321
13. Tensor algebra . . . 325
14. Topology . . . 347
15. Topology classes and constructions . . . 377
16. Topological curves, paths and groups . . . 397
17. Metric spaces . . . 413
18. Differential calculus . . . 425
19. Diffeomorphisms in Euclidean space . . . 443
20. Measure and integration . . . 459
21. Differential equations . . . 477
22. Non-topological fibre bundles . . . 485
23. Topological fibre bundles . . . 493
24. Parallelism on topological fibre bundles . . . 525

Part II. Differential geometry
25. Overview of differential geometry layers . . . 537
26. Topological manifolds . . . 549
27. Differentiable manifolds . . . 559
28. Tangent bundles on differentiable manifolds . . . 577
29. Tensor bundles and tensor fields on manifolds . . . 607
30. Higher-order tangent vectors . . . 617
31. Differentials on manifolds . . . 629
32. Higher-order differentials . . . 641
33. Vector field calculus . . . 649
34. Differentiable groups . . . 659
35. Differentiable fibre bundles . . . 675
36. Connections on differentiable fibre bundles . . . 687
37. Affine connections and covariant derivatives . . . 705
38. Geodesics, convexity and Jacobi fields . . . 719
39. Riemannian manifolds . . . 727
40. Pseudo-Riemannian manifolds . . . 737
41. Tensor calculus . . . 739
42. Geometry of the 2-sphere . . . 747
43. Examples of manifolds . . . 767
44. Examples of fibre bundles . . . 775
45. Derivations, gradient operators, germs and jets . . . 781
46. History of differential geometry . . . 791
47. Exercise questions . . . 805
48. Exercise answers . . . 811
49. Notations and abbreviations . . . 821
50. Bibliography . . . 831
51. Index . . . 841
[vii]
Table of contents

Chapter 1. Introduction . . . 1
  1.1 Layers of structure of differential geometry . . . 2
  1.2 Topic flow diagram . . . 3
  1.3 Chapter groups . . . 3
  1.4 Objectives and motivations . . . 5
  1.5 Style . . . 9
  1.6 Some minor details of presentation . . . 10
  1.7 Differences from other differential geometry texts . . . 11
  1.8 Acknowledgements . . . 11
  1.9 MSC 2000 subject classification . . . 12
  1.10 How to learn mathematics . . . 13

Part I. Preliminary topics . . . 15

Chapter 2. Philosophical considerations . . . 17
  2.1 The bedrock of mathematics . . . 19
  2.2 Logic, language and tribalism . . . 22
  2.3 Ontology of mathematics . . . 25
  2.4 Plato’s theory of ideas . . . 27
  2.5 Sets as parameters for socio-mathematical network communications . . . 31
  2.6 Sets as parameters for classes of objects . . . 36
  2.7 Extraneous properties of set-constructions in definitions . . . 38
  2.8 Axioms versus constructions for defining mathematical systems . . . 40
  2.9 Some general remarks on mathematics and logic . . . 43
  2.10 Dark sets and dark numbers . . . 47
  2.11 Integers and infinity . . . 52
  2.12 Real numbers and infinitesimality . . . 61

Chapter 3. Logic semantics . . . 65
  3.1 Mathematical logic subject development . . . 65
  3.2 General comments on logic . . . 69
  3.3 Modelling, meta-modelling and recursive modelling . . . 73
  3.4 The universality (or otherwise) of modern logic . . . 77
  3.5 Logic in literature . . . 79
  3.6 Proposition-store versus world-view ontology for logic . . . 85
  3.7 A proposition-store ontology for logic . . . 87
  3.8 Undecidable propositions and incomplete information transfer . . . 93
  3.9 The semantics of truth and falsity . . . 95
  3.10 The semantics of logical negation . . . 97
  3.11 Proof by contradiction . . . 103
  3.12 The moods of logical propositions . . . 106
  3.13 Other remarks on the semantics of logic . . . 107
  3.14 Naive mathematics . . . 110
Chapter 4. Logic methods . . . 113
  4.1 Concrete proposition domains . . . 114
  4.2 Logic operations in concrete proposition domains . . . 118
  4.3 Logical operators and expressions . . . 120
  4.4 Logical expression evaluation and logical argumentation . . . 126
  4.5 Propositional calculus formalization . . . 127
  4.6 Deduction rules . . . 129
  4.7 An implication-based propositional calculus . . . 132
  4.8 Some propositional calculus theorems . . . 133
  4.9 Meta-theorems and the “deduction theorem” . . . 137
  4.10 Further theorems for the implication operator . . . 141
  4.11 Other logical operators . . . 143
  4.12 Parametrized families of propositions . . . 145
  4.13 Logical quantifiers . . . 150
  4.14 Predicate calculus . . . 153
  4.15 Equality . . . 156
  4.16 Uniqueness . . . 157

Chapter 5. Sets . . . 159
  5.1 Zermelo-Fraenkel set theory . . . 161
  5.2 The ZF extension axiom . . . 165
  5.3 The ZF empty set, pair, union and power set axioms . . . 167
  5.4 The ZF replacement axiom . . . 169
  5.5 The ZF regularity axiom . . . 169
  5.6 The ZF infinity axiom . . . 171
  5.7 Russell’s paradox . . . 173
  5.8 ZF set theory definitions and notations . . . 185
  5.9 Axiom of choice . . . 188
  5.10 Axiom of countable choice . . . 194
  5.11 Zermelo set theory . . . 195
  5.12 Bernays-Gödel set theory . . . 195
  5.13 Basic properties of binary set unions and intersections . . . 196
  5.14 Basic properties of general set unions and intersections . . . 198
  5.15 Closure of set unions under arbitrary unions . . . 200
  5.16 Specification tuples . . . 202

Chapter 6. Relations and functions . . . 205
  6.1 Ordered pairs . . . 206
  6.2 Cartesian product of a pair of sets . . . 208
  6.3 Relations . . . 209
  6.4 Equivalence relations and partitions . . . 213
  6.5 Functions . . . 213
  6.6 Function set maps and inverse set maps . . . 216
  6.7 Composition of functions . . . 218
  6.8 Families of sets and functions . . . 219
  6.9 Cartesian products of families of sets and functions . . . 221
  6.10 Partial Cartesian products and identification spaces . . . 222
  6.11 Partially defined functions . . . 223
  6.12 Notations for sets of functions . . . 225
Chapter 7. Order and integers . . . 227
  7.1 Ordered sets . . . 227
  7.2 Ordinal numbers . . . 230
  7.3 Natural numbers . . . 235
  7.4 Unsigned integer arithmetic . . . 237
  7.5 Signed integers . . . 237
  7.6 Extended integers . . . 239
  7.7 Cartesian products of sequences of sets and functions . . . 239
  7.8 Choice functions without the axiom of choice . . . 240
  7.9 Indicator functions and delta functions . . . 241
  7.10 Permutations . . . 242
  7.11 Combinations and ordered selections . . . 244
  7.12 List spaces for general sets . . . 246
  7.13 Reformulation of logic in terms of axiomatic mathematics . . . 248

Chapter 8. Rational and real numbers . . . 249
  8.1 Rational numbers . . . 249
  8.2 Extended rational numbers . . . 250
  8.3 Real numbers . . . 251
  8.4 Extended real numbers . . . 253
  8.5 Real number tuples . . . 254
  8.6 Some useful basic real-valued functions . . . 254
  8.7 Complex numbers . . . 258

Chapter 9. Algebra . . . 261
  9.1 Semigroups . . . 262
  9.2 Groups . . . 263
  9.3 Subgroups . . . 266
  9.4 Left transformation groups . . . 269
  9.5 Right transformation groups . . . 274
  9.6 Mixed transformation groups . . . 276
  9.7 Figures and invariants of transformation groups . . . 278
  9.8 Rings and fields . . . 280
  9.9 Modules . . . 281
  9.10 Associative algebras . . . 284
  9.11 Lie algebras . . . 285
  9.12 List space for sets with algebraic structure . . . 287

Chapter 10. Linear algebra . . . 289
  10.1 Linear spaces . . . 289
  10.2 Linear subspaces and basis vectors . . . 291
  10.3 Linear maps . . . 293
  10.4 Eigenspaces of linear space endomorphisms . . . 294
  10.5 Linear functionals and dual spaces . . . 295
  10.6 Direct sums of linear spaces . . . 298
  10.7 Quotients of linear spaces . . . 299
  10.8 Inner products and norms . . . 300
  10.9 Groups of linear transformations . . . 301
  10.10 Free linear spaces . . . 301
  10.11 Exact sequences of linear maps . . . 302
Chapter 11. Matrix algebra . . . 307
  11.1 Rectangular matrix algebra . . . 307
  11.2 Component matrices of linear maps . . . 311
  11.3 Square matrix algebra . . . 313
  11.4 Real square matrix algebra . . . 315
  11.5 Real symmetric matrix algebra . . . 317
  11.6 Real symmetric definite and semi-definite matrices . . . 318
  11.7 Matrix groups . . . 318

Chapter 12. Affine spaces . . . 321
  12.1 Affine spaces discussion . . . 321
  12.2 Affine space definitions . . . 322
  12.3 Affine transformation groups . . . 323
  12.4 Euclidean spaces . . . 324

Chapter 13. Tensor algebra . . . 325
  13.1 Multilinear maps . . . 326
  13.2 Linear spaces of multilinear maps . . . 329
  13.3 Symmetric and antisymmetric multilinear maps . . . 330
  13.4 Tensor product metadefinition . . . 332
  13.5 Tensor products of linear spaces . . . 334
  13.6 Covariant tensors . . . 336
  13.7 Mixed tensors . . . 337
  13.8 General tensor algebra . . . 338
  13.9 Alternating tensors . . . 342
  13.10 Alternating tensor algebra . . . 344
  13.11 Tensor products defined via free linear spaces . . . 344
  13.12 Tensor products defined via lists of tensor monomials . . . 345

Chapter 14. Topology . . . 347
  14.1 History and generalities . . . 349
  14.2 Topological spaces . . . 351
  14.3 Some simple topologies on finite sets . . . 353
  14.4 Interior and closure of sets . . . 355
  14.5 Exterior and boundary of sets . . . 358
  14.6 Limit points and isolated points . . . 363
  14.7 Some simple topologies on countably infinite sets . . . 364
  14.8 Generation of topologies from collections of sets . . . 367
  14.9 The standard topology for the real numbers . . . 369
  14.10 Open bases and open subbases . . . 369
  14.11 Continuous functions . . . 371
  14.12 Homeomorphisms . . . 374
xi

Chapter 15. Topology classes and constructions . . . 377
15.1 Product and quotient topologies . . . 377
15.2 Separation classes . . . 379
15.3 Separation and disconnection of sets . . . 383
15.4 Connectivity classes . . . 384
15.5 Definition of continuity of functions using connectivity . . . 388
15.6 Open bases, countability classes and separability . . . 390
15.7 Compactness classes . . . 391
15.8 Topological properties of real number intervals . . . 393
15.9 Topological dimension . . . 394
15.10 Set union topology . . . 394
15.11 Topological identification spaces . . . 395

Chapter 16. Topological curves, paths and groups . . . 397
16.1 Curve and path terminology and definition options . . . 397
16.2 Curves . . . 400
16.3 Path-equivalence relations for curves . . . 402
16.4 Paths . . . 404
16.5 Convex curvilinear interpolation . . . 406
16.6 Algebraic topology . . . 408
16.7 Topological groups . . . 408
16.8 Topological transformation groups . . . 409
16.9 Topological vector spaces . . . 410
16.10 Network topology and continuous paths in networks . . . 411

Chapter 17. Metric spaces . . . 413
17.1 Distance functions and balls . . . 414
17.2 Set distance and set diameter . . . 416
17.3 The topology induced by a metric . . . 417
17.4 Continuous functions in metric spaces . . . 421
17.5 Rectifiable sets, curves and paths . . . 422

Chapter 18. Differential calculus . . . 425
18.1 Infinitesimals . . . 425
18.2 Differentiation for one variable . . . 426
18.3 Unidirectional differentiability of real-to-real functions . . . 431
18.4 Higher-order derivatives for real-to-real functions . . . 432
18.5 Differentiation for several variables . . . 435
18.6 Higher-order derivatives for several variables . . . 439
18.7 Some differentiability-based function spaces . . . 440
18.8 Differentiation for abstract linear spaces . . . 440
18.9 Hölder continuity . . . 441

Chapter 19. Diffeomorphisms in Euclidean space . . . 443
19.1 Tangent vectors and diffeomorphisms . . . 443
19.2 Differentials and diffeomorphisms . . . 445
19.3 Second-level tangent vectors and diffeomorphisms . . . 449
19.4 Diffeomorphism pseudogroups . . . 452
19.5 Second-order differential operators and diffeomorphisms . . . 454
19.6 Directionally differentiable homeomorphisms . . . 456

Chapter 20. Measure and integration . . . 459
20.1 Lebesgue measure . . . 459
20.2 Lebesgue integration . . . 460
20.3 Rectangular Stokes theorem in two dimensions . . . 461
20.4 Rectangular Stokes theorem in three dimensions . . . 463
20.5 Differential forms . . . 464
20.6 The exterior derivative . . . 466
20.7 Exterior differentiation using Lie derivatives . . . 469
20.8 Geometric measure theory . . . 470
20.9 Stokes theorem . . . 470
20.10 Radon measures . . . 470
20.11 Some integrability-based function spaces . . . 471
20.12 Logarithmic and exponential functions . . . 471
20.13 Trigonometric functions . . . 473

Chapter 21. Differential equations . . . 477
21.1 Ordinary differential equations . . . 479
21.2 Systems of linear second-order ODEs . . . 481
21.3 Boundary value problems . . . 481
21.4 Initial value problems . . . 481
21.5 Calculus of variations . . . 482
21.6 ODEs for defining exponential and trigonometric functions . . . 482
21.7 Taylor series and exponentials of matrices . . . 483

Chapter 22. Non-topological fibre bundles . . . 485
22.1 Non-topological fibrations . . . 485
22.2 Parallelism for non-topological fibrations . . . 487
22.3 Non-topological fibre bundles . . . 489
22.4 Finite transformation groups as fibre bundles . . . 489

Chapter 23. Topological fibre bundles . . . 493
23.1 History, motivation and overview . . . 493
23.2 Topological fibrations with intrinsic fibre spaces . . . 496
23.3 Topological fibrations and fibre atlases . . . 498
23.4 Fibration identification spaces . . . 502
23.5 Structure groups discussion . . . 503
23.6 Topological fibre bundles . . . 504
23.7 Fibre bundle homomorphisms, isomorphisms and products . . . 508
23.8 Structure-preserving fibre set maps . . . 510
23.9 Topological principal fibre bundles . . . 515
23.10 Associated topological fibre bundles . . . 517
23.11 Construction of associated topological fibre bundles . . . 520
23.12 Construction of associated topological fibre bundles via orbit spaces . . . 521

Chapter 24. Parallelism on topological fibre bundles . . . 525
24.1 Parallelism path classes . . . 525
24.2 Pathwise parallelism on topological fibre bundles . . . 527
24.3 Associated parallelism . . . 530
24.4 Other topics . . . 532

Part II. Differential geometry . . . 535

Chapter 25. Overview of differential geometry layers . . . 537
25.1 Layer 0: Set theory (points) . . . 537
25.2 Layer 1: Topology (connectivity and continuity) . . . 538
25.3 Layer 2: Differentiable structure (charts and vectors) . . . 540
25.4 Tensors and differential forms . . . 542
25.5 Layer 3: Affine connection (parallelism at a distance) . . . 542
25.6 Layer 4: Riemannian metric (distance and angles) . . . 544
25.7 Pseudo-Riemannian metric (hyperbolic distance) . . . 545

Chapter 26. Topological manifolds . . . 549
26.1 Background . . . 549
26.2 Euclidean and locally Euclidean spaces . . . 551
26.3 Topological manifolds . . . 552
26.4 Charts and atlases . . . 553
26.5 Topological manifold constructions, attributes and relations . . . 556
26.6 Topological identification spaces . . . 557

Chapter 27. Differentiable manifolds . . . 559
27.1 Terminology and definition choices . . . 560
27.2 Differentiable manifold atlases . . . 561
27.3 Some standard differentiable manifold atlases . . . 563
27.4 Some basic definitions for differentiable manifolds . . . 563
27.5 Differentiable real-valued functions on differentiable manifolds . . . 565
27.6 Differentiable curves and paths . . . 566
27.7 Differentiable families of differentiable transformations . . . 567
27.8 Differentiable maps between differentiable manifolds . . . 568
27.9 Analytic manifolds . . . 570
27.10 Unidirectionally differentiable manifolds . . . 570
27.11 Lipschitz manifolds and rectifiable curves . . . 571
27.12 Differentiable fibrations . . . 572
27.13 Tangent space building principles . . . 574

Chapter 28. Tangent bundles on differentiable manifolds . . . 577
28.1 Styles of representation of tangent vectors . . . 579
28.2 Tangent bundle metadefinition . . . 583
28.3 Tangent vectors . . . 586
28.4 Computational tangent vectors . . . 587
28.5 Tangent operators . . . 588
28.6 Tagged tangent operators . . . 591
28.7 Pointwise tangent spaces . . . 591
28.8 Tangent bundles . . . 593
28.9 Tangent operator bundles . . . 596
28.10 The tangent bundle of a tangent bundle . . . 596
28.11 Horizontal components and drop functions . . . 600
28.12 Tangent frames and coordinate basis vectors . . . 602
28.13 Tangent space constructions, attributes and relations . . . 604
28.14 Unidirectional tangent bundles . . . 604
28.15 Distributions as representations of tangent bundles . . . 605
28.16 Tangent bundles on infinite-dimensional manifolds . . . 606

Chapter 29. Tensor bundles and tensor fields on manifolds . . . 607
29.1 Contravariant tensors and tensor spaces . . . 608
29.2 Cotangent vectors and cotangent spaces . . . 608
29.3 Covariant and mixed tensors . . . 610
29.4 Double tangent spaces . . . 611
29.5 Vector fields . . . 612
29.6 Tangent operator fields . . . 613
29.7 Tensor fields . . . 614
29.8 Vector fields and tensor fields along curves . . . 614
29.9 Differential forms . . . 615

Chapter 30. Higher-order tangent vectors . . . 617
30.1 Higher-order tangent operators . . . 619
30.2 Tensorization coefficients for second-order tangent operators . . . 621
30.3 Higher-order tangent vectors . . . 623
30.4 Higher-order tangent spaces . . . 626
30.5 Drop functions for second-level tangent vectors . . . 626
30.6 Elliptic second-order operators . . . 627
30.7 Higher-order vector fields . . . 627
30.8 Higher-order vector fields for families of curves . . . 628

Chapter 31. Differentials on manifolds . . . 629
31.1 Pointwise differentials versus induced maps . . . 629
31.2 The differential of a real-valued function . . . 631
31.3 The differential of a differentiable map . . . 633
31.4 The differential of a curve . . . 637
31.5 One-parameter transformation families and vector fields . . . 639

Chapter 32. Higher-order differentials . . . 641
32.1 Higher-order differentials of a real-valued function . . . 641
32.2 Higher-order differentials of a differentiable map . . . 641
32.3 Higher-order differentials of a curve . . . 642
32.4 Higher-order differentials of curve families . . . 644
32.5 Differentials of real-valued functions for higher-order operators . . . 645
32.6 Hessian operators at critical points . . . 646
32.7 Differentials of differentiable maps for higher-order operators . . . 646
32.8 Differentials of curves for higher-order operators . . . 648

Chapter 33. Vector field calculus . . . 649
33.1 Naive vector field derivatives . . . 649
33.2 The Poisson bracket . . . 651
33.3 Vector field derivatives for curve families . . . 653
33.4 Lie derivatives of vector fields . . . 653
33.5 Lie derivatives of tensor fields . . . 658
33.6 The exterior derivative . . . 658

Chapter 34. Differentiable groups . . . 659
34.1 Lie groups . . . 660
34.2 Hilbert’s fifth problem . . . 661
34.3 Left invariant vector fields on Lie groups . . . 662
34.4 Right invariant vector fields on Lie groups . . . 666
34.5 The Lie algebra of a Lie group . . . 668
34.6 Diffeomorphism groups . . . 669
34.7 Lie transformation groups . . . 670
34.8 Infinitesimal transformations . . . 672

Chapter 35. Differentiable fibre bundles . . . 675
35.1 Differentiable fibre bundles with non-Lie structure group . . . 676
35.2 Differentiable fibre bundles with Lie structure group . . . 676
35.3 Vector fields on differentiable fibre bundles . . . 677
35.4 Differentiable principal fibre bundles . . . 679
35.5 Vector fields on differentiable principal fibre bundles . . . 680
35.6 Associated differentiable fibre bundles . . . 680
35.7 Vector bundles . . . 683
35.8 Tangent bundles of differentiable manifolds . . . 683
35.9 Tangent frame bundles . . . 684

Chapter 36. Connections on differentiable fibre bundles . . . 687
36.1 Naming, history and choice of definitions . . . 688
36.2 Differentiation of parallel transport . . . 690
36.3 Horizontal lift functions for ordinary fibre bundles . . . 692
36.4 Curvature of connections on ordinary fibre bundles . . . 695
36.5 Horizontal lift functions for principal fibre bundles . . . 696
36.6 Connection forms for PFB connections . . . 699
36.7 Covariant derivatives for general connections . . . 701
36.8 Parallel displacement for PFB connections . . . 701
36.9 Alternative definitions for general connections . . . 702

Chapter 37. Affine connections and covariant derivatives . . . 705
37.1 Concepts, history and terminology . . . 705
37.2 Motivation for defining connections on manifolds . . . 707
37.3 Affine connections on tangent bundles . . . 708
37.4 Covariant derivatives . . . 709
37.5 Hessian operators . . . 712
37.6 Elliptic second-order operator fields . . . 712
37.7 Curvature and torsion . . . 714
37.8 Affine connections on principal fibre bundles . . . 716
37.9 Coefficients of affine connections on principal fibre bundles . . . 716
37.10 Connections for Lagrangian mechanics . . . 717

Chapter 38. Geodesics, convexity and Jacobi fields . . . 719
38.1 Covariant derivatives of vector fields along curves . . . 719
38.2 Geodesic curves . . . 720
38.3 Jacobi fields . . . 721
38.4 Convex sets . . . 721
38.5 Convex combinations . . . 722
38.6 Families of geodesic interpolations . . . 722
38.7 Exponential maps . . . 724
38.8 Convex functions . . . 724

Chapter 39. Riemannian manifolds . . . 727
39.1 Historical notes on Riemannian geometry . . . 728
39.2 The Riemannian metric . . . 729
39.3 The point-to-point distance function . . . 730
39.4 The Levi-Civita connection . . . 732
39.5 Curvature tensors . . . 734
39.6 Differential operators . . . 734
39.7 Inner product . . . 735
39.8 Embedded Riemannian manifolds . . . 735
39.9 Information geometry . . . 735

Chapter 40. Pseudo-Riemannian manifolds . . . 737
40.1 The pseudo-Riemannian metric . . . 737
40.2 General relativity . . . 737
40.3 Singularities . . . 738
40.4 Global solutions . . . 738

Chapter 41. Tensor calculus . . . 739
41.1 History . . . 740
41.2 Differentiable manifolds . . . 740
41.3 Manifolds with affine connection . . . 740
41.4 Equations of geodesic variation . . . 741
41.5 Riemannian manifolds . . . 743
41.6 Pseudo-Riemannian manifolds . . . 745
41.7 Submanifolds of Euclidean space . . . 745

Chapter 42. Geometry of the 2-sphere . . . 747
42.1 Terrestrial coordinates . . . 747
42.2 Tensor calculus in terrestrial coordinates . . . 749
42.3 Metric tensor calculation from the distance function . . . 751
42.4 The principal fibre bundle in terrestrial coordinates . . . 752
42.5 The Riemannian connection in terrestrial coordinates . . . 753
42.6 Coordinates for polar exponential maps . . . 755
42.7 The global tangent bundle . . . 757
42.8 Isometries of S^2 . . . 758
42.9 Geodesic curves . . . 760
42.10 Affinely parametrized geodesics . . . 762
42.11 Convex sets and functions . . . 763
42.12 Normal coordinates . . . 763
42.13 Jacobi fields . . . 763
42.14 Circles on the sphere . . . 763
42.15 Calculation of the “hours of daylight” . . . 764
42.16 Some standard map projections . . . 764
42.17 Projection of a sphere onto a plane . . . 764

Chapter 43. Examples of manifolds . . . 767
43.1 Topological space examples . . . 767
43.2 Euclidean spaces . . . 768
43.3 Non-Hausdorff locally Euclidean spaces . . . 769
43.4 Hölder-continuous manifolds . . . 769
43.5 Torus . . . 771
43.6 General sphere . . . 771
43.7 Conical coordinates for Euclidean spaces . . . 772
43.8 Hyperboloid . . . 772
43.9 Tractrix . . . 772
43.10 Analysis on Euclidean spaces . . . 773

Chapter 44. Examples of fibre bundles . . . 775
44.1 Euclidean fibre bundles on Euclidean spaces . . . 775
44.2 The Möbius strip as a fibre bundle . . . 776
44.3 The Möbius strip fibre bundle on S^1 . . . 778

Chapter 45. Derivations, gradient operators, germs and jets . . . 781
45.1 Definitions . . . 782
45.2 Some elementary examples . . . 782
45.3 Further elementary examples . . . 785
45.4 Spaces of differentiable functions . . . 785
45.5 Spaces of smooth functions . . . 787
45.6 The space of analytic functions . . . 789
45.7 The Hölder spaces . . . 789
45.8 Further topics on derivations . . . 789
45.9 Germs . . . 789
45.10 Jets . . . 790

Chapter 46. History of differential geometry . . . 791
46.1 Chronology of mathematicians . . . 791
46.2 Origins of words and notations . . . 796
46.3 Etymology of affine spaces . . . 798
46.4 Logical language in ancient literature . . . 801

Chapter 47. Exercise questions . . . 805
47.1 Logic . . . 805
47.2 Sets, relations and functions . . . 806
47.3 Numbers . . . 807
47.4 Algebra . . . 807
47.5 Linear algebra . . . 807
47.6 Tensor algebra . . . 808
47.7 Topology . . . 808
47.8 Topological fibre bundles . . . 808
47.9 Topological manifolds . . . 809
Chapter 48. Exercise answers . . . 811
  48.1 Logic . . . 811
  48.2 Sets, relations and functions . . . 813
  48.3 Numbers . . . 817
  48.4 Algebra . . . 817
  48.5 Linear algebra . . . 818
  48.6 Tensor algebra . . . 818
  48.7 Topology . . . 819
  48.8 Topological fibre bundles . . . 819
  48.9 Topological manifolds . . . 820
Chapter 49. Notations and abbreviations . . . 821
  49.1 Notations . . . 821
  49.2 Abbreviations . . . 829
Chapter 50. Bibliography . . . 831
  50.1 Comments on other people’s books . . . 831
  50.2 Differential geometry introductory texts . . . 832
  50.3 Other differential geometry references . . . 833
  50.4 Other mathematics references . . . 834
  50.5 Physics . . . 836
  50.6 Logic and set theory . . . 836
  50.7 Anthropology and linguistics . . . 837
  50.8 Philosophy and ancient history . . . 837
  50.9 History of mathematics . . . 838
  50.10 Other references . . . 838
Chapter 51. Index . . . 841
Chapter 1 Introduction

  1.1 Layers of structure of differential geometry . . . 2
  1.2 Topic flow diagram . . . 3
  1.3 Chapter groups . . . 3
  1.4 Objectives and motivations . . . 5
  1.5 Style . . . 9
  1.6 Some minor details of presentation . . . 10
  1.7 Differences from other differential geometry texts . . . 11
  1.8 Acknowledgements . . . 11
  1.9 MSC 2000 subject classification . . . 12
  1.10 How to learn mathematics . . . 13

This book is not “Differential Geometry Made Easy”. Differential geometry is not easy. If you think it’s easy, you haven’t understood it! Attempts to make it seem easy give the reader only a superficial understanding. The best that can be hoped for realistically is a systematic, self-consistent presentation of topics so that the ideas can be assimilated by the reader without any more pain and confusion than absolutely necessary. This book aims to be “Differential Geometry Made Crystal Clear”, but enlightenment requires effort.

This is a definitions book, not a theorems book. Definitions introduce you to things and tell you their names. Theorems tell you properties and relations of things. Most mathematical texts give definitions so that they can present their theorems. In this book, theorems are given only when required for the presentation of definitions. If the reader can understand the definitions in the DG literature, that is a good starting point for understanding the theorems. To be meaningful, many definitions do require existence, uniqueness or regularity proofs. So some basic theorems are unavoidable. Also a few theorems are given here to motivate definitions or to clarify the relations between them.

Before studying differential geometry, the reader should have some prior familiarity with set theory, group theory, linear algebra, topology, measure theory and partial differential equations. These prerequisites are presented in preliminary chapters in this book, but it is preferable to have studied these topics beforehand.

Differential geometry is the geometry of manifolds. Coordinate charts are the principal defining characteristic of differential geometry, not an embarrassing nuisance. Therefore coordinate charts are constantly and unashamedly in the foreground in this book. Some authors don’t like coordinates. They can be hidden but not removed. (The coordinates, that is.)

The central concepts of differential geometry are coordinate charts, tangent vectors, the exterior derivative, pathwise parallelism, curvature and metric tensors. The aim of this book is to give the reader a confident understanding of these concepts and the relations between them. The presentation strategy is to stratify all DG concepts according to structural layers.

This book will hopefully fill the role of an illustrated dictionary. It does not try to be a comprehensive encyclopedia. It presents only the dramatis personae, not the complete works of Shakespeare.

The author may use this book as a resource for the creation of other books. When this full version has been released, the author may write a half-length version which omits the less popular technicalities. The shorter version may be titled “Differential Geometry Made Easy”. It might sell a lot of copies!
1.1. Layers of structure of differential geometry

The following table summarizes the progressive build-up of the layers of structure of differential geometry in the chapters of this book.

layer  name                main concept  structure                                       chapters
0      point-set layer     points        set of points with no topological structure     5–13
1      topological layer   connectivity  topological space: open neighbourhoods          14–17, 23–24, 26
2      differential layer  vectors       atlas of differentiable charts; tangent bundle  27–35
3      connection layer    parallelism   affine connection on the tangent bundle         36–38
4      metric layer        distance      Riemannian metric tensor field                  39–40
The following table shows which levels of structure are required by some important concepts. For example, geodesic curves are well defined if an affine connection or Riemannian metric is specified, but not if you only have a differentiable structure.

Structural layers where concepts are meaningful ("-" means not meaningful):

                            layer 0    layer 1   layer 2         layer 3     layer 4
concept                     point set  topology  differentiable  affine      Riemannian
                                                 structure       connection  metric
cardinality of sets         yes        yes       yes             yes         yes
boundaries of sets          -          yes       yes             yes         yes
connectivity of sets        -          yes       yes             yes         yes
continuity of functions     -          yes       yes             yes         yes
tangent vectors             -          -         yes             yes         yes
differentials of functions  -          -         yes             yes         yes
tensor bundles              -          -         yes             yes         yes
differential forms          -          -         yes             yes         yes
vector field algebra        -          -         yes             yes         yes
Lie derivatives             -          -         yes             yes         yes
exterior derivative         -          -         yes             yes         yes
Stokes theorem              -          -         yes             yes         yes
parallel transport          -          -         -               yes         yes
covariant derivatives       -          -         -               yes         yes
geodesic curves             -          -         -               yes         yes
geodesic coordinates        -          -         -               yes         yes
convex sets and functions   -          -         -               yes         yes
Riemann curvature           -          -         -               yes         yes
Ricci curvature             -          -         -               yes         yes
angle between vectors       -          -         -               -           yes
length of vector            -          -         -               -           yes
distance between points     -          -         -               -           yes
normal coordinates          -          -         -               -           yes
sectional curvature         -          -         -               -           yes
scalar curvature            -          -         -               -           yes
Einstein curvature tensor   -          -         -               -           yes
Laplace-Beltrami operator   -          -         -               -           yes

The specification of any structural layer uniquely determines the lower layer structures but not the higher layer structures. When only a point set is specified, there are many possible choices for the topology. When only the topology is specified, there are many choices for the differentiable structure. On a given differentiable structure, many choices of connection are possible. But a Riemannian metric uniquely determines the affine connection, which uniquely determines the differentiable structure, and so forth.

The higher layers are optional. You only need to provide the layers of structure which are required by the concepts you wish to use. More structure gives you more concepts. It is noteworthy that so many concepts are well defined in the absence of a metric, and even in the absence of a connection.
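The layering rule described in this section (a concept is meaningful at its own layer and at every higher layer) can be sketched as a small lookup. This is only an illustrative sketch, not part of the book: the names `Layer`, `MIN_LAYER`, `is_meaningful` and `available_concepts` are hypothetical, and only a few rows of the table are included.

```python
from enum import IntEnum
from typing import List


class Layer(IntEnum):
    """The five structural layers, numbered as in Section 1.1."""
    POINT_SET = 0
    TOPOLOGY = 1
    DIFFERENTIABLE = 2
    CONNECTION = 3
    METRIC = 4


# Lowest layer at which each concept is meaningful (a few rows of the table;
# the dictionary name and its selection of rows are illustrative assumptions).
MIN_LAYER = {
    "cardinality of sets": Layer.POINT_SET,
    "continuity of functions": Layer.TOPOLOGY,
    "tangent vectors": Layer.DIFFERENTIABLE,
    "exterior derivative": Layer.DIFFERENTIABLE,
    "geodesic curves": Layer.CONNECTION,
    "Riemann curvature": Layer.CONNECTION,
    "distance between points": Layer.METRIC,
    "scalar curvature": Layer.METRIC,
}


def is_meaningful(concept: str, provided: Layer) -> bool:
    """Return True if `concept` is well defined given structure up to `provided`."""
    return provided >= MIN_LAYER[concept]


def available_concepts(provided: Layer) -> List[str]:
    """List the concepts (alphabetically) usable with structure up to `provided`."""
    return sorted(c for c, need in MIN_LAYER.items() if need <= provided)
```

For instance, `is_meaningful("geodesic curves", Layer.DIFFERENTIABLE)` is false while the same query with `Layer.CONNECTION` or `Layer.METRIC` is true, which mirrors the geodesic example given with the table.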
1.2. Topic flow diagram

[Figure: topic flow diagram. Nodes: sets and numbers; algebra; topology; linear spaces; calculus; tensor algebra; topological manifolds; differentiable manifolds; topological fibre bundles; Lie groups; differentiable fibre bundles; affine connections; Riemannian manifolds; pseudo-Riemannian manifolds.]

Chapters 2 to 24 present preliminary topics for reference in later chapters. These topics are in the early chapters so that later chapters are not cluttered by the interpolation of prerequisites. The preliminary topics include logic, sets, functions, order and numbers (Chapters 3–8), algebra (Chapter 9), linear algebra (Chapters 10–12), tensor algebra (Chapter 13), topology (Chapters 14–17), calculus (Chapters 18–21) and fibre bundles (Chapters 22–24).

Differential geometry begins with topological manifolds (Chapter 26), followed by differentiable manifolds (Chapter 27), connections (Chapters 36–38) and Riemannian metric spaces (Chapters 39–40).

This book tries to disentangle which concepts and theorems belong to four levels of structure: topological structure (Chapter 26), differentiable structure (Chapters 27–33), affine connection structure (Chapters 36–38), and Riemannian metric structure (Chapters 39–40). Lie (differentiable) groups (Chapter 34) and differentiable fibre bundles (Chapter 35) may be considered as preliminary topics, but like tensor algebra (Chapter 13) and topological fibre bundles (Chapters 23–24), they may be regarded as core topics of differential geometry.

Later chapters deal with tensor calculus (Chapter 41), the 2-sphere (Chapter 42), example geometries and fibre bundles (Chapters 43–44), and alternative tangent space definitions (Chapter 45).

The topic flow diagram shows the progressive build-up of algebraic structure from “sets and numbers” to “tensor algebra”. Then “topology” and “calculus” are combined with “tensor algebra” to define “differentiable manifolds”. Adding “topological fibre bundles” to this yields “differentiable fibre bundles” on which “affine connections” are defined. Adding a metric or pseudo-metric leads to Riemannian or pseudo-Riemannian manifolds. Algebraic and analytical structure are thus developed in two intermingled streams. This modern approach to differential geometry is expressed in the rarified language of fibre bundles. Affine connections may be defined directly on differentiable manifolds, bypassing fibre bundles as suggested by the dashed arrow, like in the olden days.

1.3. Chapter groups

1.3.1 Remark: The page counts for the chapter groups are as follows.

pages  chapter group                                 chapters
   16  introduction                                  1
  520  Part I: preliminary topics                    2–24
         96  philosophy, semantics                   2–3
        148  logic, set theory, numbers              4–8
         86  algebra                                 9–13
         78  topology                                14–17
         60  calculus                                18–21
         52  topological fibre bundles               22–24
  254  Part II: differential geometry                25–45
         12  overview of DG structural layers        25
         10  topological manifolds                   26
        100  differentiable manifolds                27–33
         28  Lie groups, differentiable fibre bundles  34–35
         40  connections                             36–38
         20  Riemannian metric, tensor calculus      39–41
         34  examples                                42–44
         10  derivations                             45
   50  appendices                                    46–50
   44  index                                         51
1.3.2 Remark: The chapters of this book fall more or less naturally into the following groups.

Chapter 1 is a general introduction. This may be safely ignored apart from Section 1.1.
   1. Introduction . . . 1

Part I. Preliminary topics . . . 15

Chapters 2 and 3 discuss philosophy and semantics. These are the most annoying chapters of the book.
   2. Philosophical considerations . . . 17
   3. Logic semantics . . . 65

Chapters 4 to 8 present logic, sets, relations, functions, order and numbers.
   4. Logic methods . . . 113
   5. Sets . . . 159
   6. Relations and functions . . . 205
   7. Order and integers . . . 227
   8. Rational and real numbers . . . 249

Chapters 9 to 13 introduce algebra, especially linear and multilinear (tensor) algebra.
   9. Algebra . . . 261
  10. Linear algebra . . . 289
  11. Matrix algebra . . . 307
  12. Affine spaces . . . 321
  13. Tensor algebra . . . 325

Chapters 14 to 17 introduce topology. Metric spaces are a particular kind of topological space.
  14. Topology . . . 347
  15. Topology classes and constructions . . . 377
  16. Topological curves, paths and groups . . . 397
  17. Metric spaces . . . 413

Chapters 18 to 21 introduce analytical topics, namely the differential calculus and integral calculus. This provides a break from topology before returning to it in the fibre bundle chapters.
  18. Differential calculus . . . 425
  19. Diffeomorphisms in Euclidean space . . . 443
  20. Measure and integration . . . 459
  21. Differential equations . . . 477

Chapters 22 to 24 introduce fibre bundles which have topology but no differentiable structure.
  22. Non-topological fibre bundles . . . 485
  23. Topological fibre bundles . . . 493
  24. Parallelism on topological fibre bundles . . . 525

Part II. Differential geometry . . . 535

Chapter 25 is an overview of a five-layer structure model for differential geometry.
  25. Overview of differential geometry layers . . . 537

Chapter 26 introduces topological manifolds. This is layer 1 in the five-layer DG structure model.
  26. Topological manifolds . . . 549

Chapters 27 to 33 add differentiable structure (i.e. charts) to manifolds. This commences layer 2.
  27. Differentiable manifolds . . . 559
  28. Tangent bundles on differentiable manifolds . . . 577
  29. Tensor bundles and tensor fields on manifolds . . . 607
  30. Higher-order tangent vectors . . . 617
  31. Differentials on manifolds . . . 629
  32. Higher-order differentials . . . 641
  33. Vector field calculus . . . 649

Chapters 34 and 35 introduce Lie (i.e. differentiable) groups. Lie groups are required for the formal definition of differentiable fibre bundles, which are required for defining general connections.
  34. Differentiable groups . . . 659
  35. Differentiable fibre bundles . . . 675
Chapters 36 to 38 introduce connections (i.e. differentiable parallelism) on manifolds. This is layer 3. Connections are required for concepts such as covariant derivatives, geodesics, convexity and curvature.
  36. Connections on differentiable fibre bundles . . . 687
  37. Affine connections and covariant derivatives . . . 705
  38. Geodesics, convexity and Jacobi fields . . . 719

Chapters 39 to 41 introduce Riemannian and pseudo-Riemannian metrics. This is layer 4. Such metrics are required for general relativity. Tensor calculus is a notational system for practical DG calculations.
  39. Riemannian manifolds . . . 727
  40. Pseudo-Riemannian manifolds . . . 737
  41. Tensor calculus . . . 739

Chapters 42 to 44 present numerous examples of manifolds and fibre bundles. A particularly useful and familiar manifold is the 2-sphere S^2.
  42. Geometry of the 2-sphere . . . 747
  43. Examples of manifolds . . . 767
  44. Examples of fibre bundles . . . 775

Chapter 45 is a not very useful set of notes on derivations, germs and jets, which provide some obscure representations of tangent spaces.
  45. Derivations, gradient operators, germs and jets . . . 781

Chapters 46 to 51 are appendices to the main part of the book.
  46. History of differential geometry . . . 791
  47. Exercise questions . . . 805
  48. Exercise answers . . . 811
  49. Notations and abbreviations . . . 821
  50. Bibliography . . . 831
  51. Index . . . 841
1.4. Objectives and motivations

1.4.1 Remark: Between 1986 and 1991, the author was trying to generalize some geometric properties of solutions of second-order boundary and initial value problems from flat space to differentiable manifolds. (In particular, the author needed estimates for parallel transport of second-order partial differential operators along geodesics in terms of bounds on curvature.) More importantly, it seemed a terrible shame that mathematicians had developed such a deep and comprehensive corpus of results for partial differential equations in flat space, particularly for boundary and initial value problems, whereas according to cosmologists, the universe is no longer flat, in which case a vast swathe of the PDE corpus must surely be null and void, being inapplicable to curved space. Many of the techniques of PDE theory are in fact very much dependent on the special properties of Euclidean space.

The five-layer structural organization of differential geometry in Section 1.1 is a direct consequence of the desire to minimize the requirements placed by DG on PDE so that the maximum extent of generalization to curved spaces will be facilitated. The requirements minimization objective is considered in choosing all definitions in this book.

For the task of converting PDE concepts and techniques to curved space, the author could not find differential geometry texts which met the high standards of systematic development and logical rigour of the best analysis texts. The more he read, the more confusing the subject became because of the multitude of contradictory definitions and formalisms. The origins and motivations of fundamental DG concepts are largely submerged under a century of continuous redefinition and rearrangement. The differential geometry literature is plagued by a plethora of mutually incomprehensible formalisms and notations. Writing this book has been like creating a map of the world from a hundred regional maps which use different coordinate systems and languages for locating and naming geographical features.

1.4.2 Remark: Long after initially writing the comments in Remark 1.4.1 regarding a “multitude of contradictory definitions and formalisms” and a “plethora of mutually incomprehensible formalisms and notations”, the author acquired a copy of Michael Spivak’s 5-volume DG book. The first two paragraphs of the preface to the 1970 edition ([42], page ix) contain eerily similar comments.

    [. . . ] no one denies that modern definitions are clear, elegant, and precise; it’s just that it’s impossible to comprehend how any one ever thought of them. And even after one does master a modern treatment of differential geometry, other modern treatments often appear simply to be about totally different subjects.

Since 1970, it seems little has changed. If anything, the literature is now even more confusing.

1.4.3 Remark: The initial strategy of this book was to stitch together a dozen of the differential geometry articles in the Mathematical Society of Japan’s excellent Encyclopedic dictionary of mathematics [33] into a small coherent presentation in a logical order with uniform notation, together with prerequisites and further details from other texts. The original target length of about 50 pages has unfortunately been exceeded! The recursive catchment area of differential geometry prerequisites is a surprisingly large proportion of undergraduate mathematics. The finished product will hopefully achieve a reasonable coherence and harmony between the various perspectives of the subject without becoming encyclopedic.

1.4.4 Remark: The most difficult aspect of differential geometry is the lack of explanation for why definitions are so and not otherwise. There is perhaps an analogy here with the contrast between ancient Egyptian mathematics and classical Greek mathematics. It was said that Thales brought back mathematics from a visit to Egypt in around 600 BC. (See Ball [187], pages 14–19: “Probably it was as a merchant that Thales first went to Egypt, but during his leisure there he studied astronomy and geometry.” There had been very substantial sea trade in the Eastern Mediterranean for centuries. So such contacts were inevitable.)

The Egyptian priest class never gave reasons for why their theorems were true. They simply observed, for example, that a 3/4/5 triangle has a right angle. (See Bell [189], page 40.) The Egyptians just said: “This is how you do it.” The Greeks, by contrast, insisted on trying to find proofs, and by finding proofs, the Greeks were able to enormously expand the body of theorems. Classical Greek mathematics was characterized by the excitement of discovery whereas Egyptian mathematics was static. (It is just possible that the severe limitations of the Egyptian writing system had something to do with this. The Egyptian scribe class had to learn all the symbols by rote, although their writing was partly phonetically based. The Greek writing system was the first fully phonetic system in history, which resulted in high general literacy because only a few simple principles were required to pronounce every word in the language. Axiomatic and deductive thinking were an integral part of Greek culture.)

While some differential geometry books do try very hard to motivate the choices of definitions, there are many definitions for which it is very difficult to find any explanation of how the choice is made. The modern mathematician’s instinct is always to modify and extend definitions to see if something useful arises. In this book, an attempt is made to determine whether many of the apparently arbitrary choices and restrictions in definitions are really necessary. If it turns out that dropping a requirement or extending the domain of an argument results in a useless or meaningless definition, this helps to clarify the meaning. But sometimes the usual way of doing things turns out to be an obstacle in the way of further development of the subject. Therefore this book tries to avoid simply saying: “This is how you do it.”

1.4.5 Remark: A vegetarian cook will generally be better at cooking vegetables than the meat-centric cook who regards vegetables as a necessary but uninteresting background. In the same way, a definition-centric book will generally explain definitions better than a theorem-centric book which regards definitions as a necessary but uninteresting background.

1.4.6 Remark: The principal goal of mathematics teaching is to liberate the student from the teacher. The teacher who explains the motivation and justification of every assumption and assertion assists the student to become independent of the teacher. The teacher who says “this is how you do it” without explanation makes the student a prisoner of dogma, unable to adapt their knowledge to circumstances and make new discoveries. Dogmatically taught students become dogmatic teachers and practitioners because they do not know the reasons for how things are done. The best teachers are happiest when their students correctly challenge assumptions and assertions and have the courage and capability to develop valid extensions, generalizations and alternatives.

1.4.7 Remark: For every mathematical object, one may ask the following questions.
  (1) How do I perform calculations with this object?
  (2) In which space does this object “live”?
  (3) What is the “essential nature” of this object?

Students who need mathematics only as a tool for other subjects may be taught only the answers to question (1). This often leads to incorrect or meaningless calculations because of a lack of knowledge of which kind of space each object “lives” in. Different spaces have different rules. Question (3) is important to help guide one’s intuition to form conjectures and discover proofs and refutations. One must have some kind of mental concept of every class of object.

This book tries to give answers to all three of the above questions. Human beings are half animal, half robot. It is important to satisfy the animal half’s need for motivation and meaning as well as the robot half’s need to do calculations.

If one cannot determine the class of object to which a symbolic expression refers, it may be that the expression is a “pseudo-notation”. That is, it may be a meaningless expression. Such expressions are frequently encountered in differential geometry. They should be replaced with well-defined, meaningful expressions. It sometimes happens that a difficult problem is made easy, trivial or vacuous by carefully elucidating all of the symbolic expressions in the statement of the problem. This book tries to avoid pseudo-notations.

An effective tactic for making sense of difficult ideas in the DG literature is to determine which set each object in each mathematical expression belongs to. Plodding correctness is better than brilliant guesswork. (Best of all, of course, is brilliant guesswork combined with plodding correctness.)
To suppose that mathematics is the art of calculation is like supposing that architecture is the art of bricklaying, or that literature is the art of typing. Calculation in mathematics is necessary, but it is an almost mechanical procedure which can be automated by machines. The mathematician does much more than a computerized mathematics package. The mathematician formulates problems, reduces them to a form which a computer can handle, checks that the results correspond to the original problem, and interprets and applies the answers. Therefore the mathematician must not be too concerned with mere computation. That is the task of the robot. The task of the human is to understand mathematics.

1.4.8 Remark: Mathematical research generally proceeds as two parallel activities. By intuition, the mathematician forms conjectures. Intuition is a forward-looking activity. But every conjecture needs proof by rigorous deduction. Proof is a backward-looking activity which tries to “join up the dots” between assumptions and assertions. (In a sense, one goes into debt when one states a theorem, and the debt is only paid off when the theorem is proved.) Both intuition and deduction are essential in mathematical research. Therefore a good book should assist the reader in both areas. The strict methods of proof must be presented, but a strong intuitive insight must also be communicated. This book tries to be thoroughly rigorous, while also trying to make as many concepts as possible intuitively clear. Therefore many of the fundamental concepts are discussed at much greater length than is usual, and many diagrams are included to assist the intuition. After all, the ability to validate or invalidate a conjecture is of little value if one’s conjectures are formed at random with little idea of what to expect.

Often the discussions in this book state the completely obvious. But there is method in this madness. By making obvious observations explicit, one may question them, and sometimes a statement which is “obviously true” turns out later to be either not so obvious at all, or sometimes partly or completely false. It is normal in lectures to spell out many obvious consequences of definitions and theorems. Publishers, and their reviewers, generally frown upon the inclusion of obvious observations in books. But if one is self-publishing, this constraint on the communication of “the obvious” is removed.

1.4.9 Remark: Most DG texts assume a high degree of smoothness (e.g. $C^\infty$) for functions and manifolds to make their work easier. This hinders applications to functions and manifolds (e.g. the graphs of functions) which arise as solutions of initial and boundary value problems for physical models. Analysts often have to work very hard indeed to prove that a problem has solutions with even limited regularity such as $C^{1,1}$ or $C^2$. So the blanket assumption that everything is $C^\infty$ can be quite irksome. The analyst seeking to apply DG theorems must first investigate whether the $C^\infty$ results can be generalized to weaker regularity. This book tries to keep regularity assumptions in definitions and theorems as weak as possible.

1.4.10 Remark: Just as the general public of the 18th and 19th centuries had a desire to understand the new theories of gravity, the spherical Earth and the solar system, so in our own time the public have the same
desire to understand the new theories of gravity, curved space-time and the universe. Every effort should be made to remove obstacles lying in the way of the non-specialist who wants to better understand the big ideas of our time. There’s no point being in the 21st century if one’s understanding of the universe remains stuck in the 19th. It is a worthy life aim to understand and appreciate the best theories and discoveries of one’s own era; even more worthy is to contribute to them. It is possible that the current orthodoxy in cosmology may require an overhaul some time soon. If Einstein’s equations need to be put on the operating table for major surgery, it will be important to have a deep understanding of the mathematical machinery underlying those equations. This justifies the detailed investigation of the fundamentals of differential geometry in this book. 1.4.11 Remark: Since the scope of this book is quite extensive, from the fundamental concepts of logic to the convoluted structures of differential geometry, sorting the topics into define-before-use order has not been easy. Specialists in each mathematical topic borrow what they need from other specialists in a non-acyclic manner, possibly unaware of the circularity of their definitions. So define-before-use ordering is difficult to achieve. A large part of the author’s motivation for seeking a “bedrock for mathematics” is (or was) the desire to arrange all concepts in define-before-use order. This is still an objective, but it can be achieved only in a best-effort sense. 1.4.12 Remark: The diversity and divergence of definitions in mathematics over time are reminiscent of the diversity and divergence of natural language families during the last few thousand years. (The divergence process for natural languages is well described, both in overview and in detail, by Ostler [175].) The reunification of mathematical definitions is difficult to achieve, but the objective is worthwhile.
1.4.13 Remark: Differential geometry has developed multiple languages and dialects for expressing its extensive network of concepts during the last 150 years. Familiarity with the folklore of each school of thought in this subject sometimes requires a lengthy apprenticeship to learn the language and methods. It is the author’s belief that a competent mathematician should be able to learn differential geometry from books alone without the need for initiation into the mysteries by the knowledgeable ones.

The author is reminded of the foreword to the counterpoint tutorial “Gradus ad Parnassum” ([208], page 17) by Johann Joseph Fux, published in 1725:

“I do not believe that I can call back composers from the unrestrained insanity of their writing to normal standards. Let each follow his own counsel. My object is to help young persons who want to learn. I knew and still know many who have fine talents and are most anxious to study; however, lacking means and a teacher, they cannot realize their ambition, but remain, as it were, forever desperately athirst.”

Joseph Haydn was one of the composers who learned counterpoint from Fux’s book because he “lacked means and a teacher”. This differential geometry book is also aimed at those who lack means and a teacher. The “Gradus ad Parnassum” was much more successful than previous counterpoint tutorials because Fux arranged the ideas in a logical order, starting from the most basic two-part counterpoint (with gradually increasing rhythmic complexity), then three-part counterpoint in the same way, and finally four-part counterpoint in the same graduated manner. Fux said about this systematic approach: “When I used this method of teaching I observed that the pupils made amazing progress within a short time.” ([208], page 18.) This book tries similarly to arrange all of the ideas which are required for differential geometry in a clear systematic order.

1.4.14 Remark: The motivations for this book may be summarized as follows.
(1) Provide a “source book” for the author to create smaller, beginner-friendlier books.
(2) Help the author to understand at least some of the mathematical physics which was incomprehensible to him in the 1970s.
(3) Provide a background resource for mathematics and physics students who are studying differential geometry, particularly those who haven’t two pennies to rub together.
(4) Provide a notations and definitions resource for other authors of differential geometry and physics books.
(5) Provide the necessary background for the author to generalize some work on boundary value problems for elliptic second-order differential equations from flat space to curved space. The analytic approach of this book will hopefully assist analysts in general to extend flat-space results to curved space.
(6) Provide a path to understanding Einstein’s general relativity for the determined adult with much enthusiasm but limited initial mathematics background.
(7) Restructure the subject of differential geometry into a unified, integrated subject which makes sense to pure mathematicians. This requires some research into the motivation of definitions.
(8) Clarify for all differential geometry practitioners which concepts and theorems apply to each of the layers of differential geometry so that they don’t use the wrong formulas in their work.
(9) Provide justifications for each building block in the entire edifice of theory which underlies differential geometry so as to facilitate the reconstruction of physics. Only by understanding in depth why the ziggurat is constructed as it is can one safely reconstruct it.
(10) Provide a framework for differential geometry which is “close to the ground”. High abstractions are avoided. (Some authors represent simple concepts by astonishingly complex definitions with no obvious benefit. This book tries to give the simplest possible formal definition for every concept as if the reader had better things to do than spend hours studying a convoluted definition to determine finally that it is exactly the same as a very simple definition which they knew already.)

1.5. Style
1.5.1 Remark: Some DG books begin with appeals to physical and geometric intuition; some start in the middle layers of the subject with differentiable manifolds and work outwards; others begin with the familiar flat spaces of high school geometry and gradually introduce manifolds and curvature. Here the approach is to systematically build secure foundations in the early chapters so that whenever there is doubt, one may trace any definition back through the progressively built-up layers to the ground levels of mathematical logic and set theory. Although the soil underneath mathematical logic is quite sandy in places, it is probably the best basis on which to build a secure formalism for differential geometry. (See Remark 2.1.4 for this choice of foundation layer.) Many DG books try to build the reader’s understanding on an intuitive foundation, starting with familiar ideas and appealing to intuition at frequent intervals to help make conceptual leaps which don’t seem quite logical. Much of differential geometry is counter-intuitive. So intuition often leads down erroneous paths. In this book, the theory is developed along strictly mathematical lines. Intuition is often helpful, but must always be backed up by rigorous proof. 1.5.2 Remark: In some books, half of the material is relegated to a series of frustrating exercises at the end of each chapter. Putting significant material in the exercises makes the main presentation incomplete. In this book, answers for all exercises will be provided. Trying to understand mathematics is itself a sufficient exercise. Interested readers can always work out their own examples and check the theorems and definitions to make sure that they are correct and self-consistent. The enthusiastic reader can experiment with theorems and definitions to try to improve them or to understand why particular assumptions were chosen. 
Whenever an author says that an assertion is “clear”, “easy to show”, or “blindingly obvious”, this is an invitation to readers to check that they know how to fill in the missing steps. Simmons [139], page xi, makes the following similar comment. (Authors were less gender-neutral in 1963.)

“The serious student will train himself to look for gaps in proofs, and should regard them as tacit invitations to do a little thinking on his own. Phrases like “it is easy to see,” “one can easily show,” “evidently,” “clearly,” and so on, are always to be taken as warning signals which indicate the presence of gaps, and they should put the reader on his guard.”

1.5.3 Remark: There are three kinds of books on any technical subject: (1) a reference book, (2) a tutorial, (3) a ‘cookbook’. The cookbook style of work presents a set of recipes for solving various particular problems from which the reader is supposed to gradually accumulate skills and knowledge or else just solve the particular kinds of problems that they are interested in. The tutorial style of work takes the reader through a subject in a manner which systematically increases their knowledge by starting from easy concepts with which they can be expected to be familiar and adding new concepts which depend only on concepts that
have been presented earlier or else are assumed as prerequisites. The reference style of work is supposed to be complete, systematic and well-indexed so that someone already knowledgeable in the subject can quickly locate the details which they require. This author is against cookbooks. The cookbook style of presentation is best suited to subjects which are so totally incomprehensible that trial-and-error is the only way to make sense of it. Then some ready-made recipes are indispensable. The style here is intended to be a combination of reference and tutorial. In other words, this book should hopefully be complete and systematic, but also readable and digestible in a single linear reading. 1.5.4 Remark: If the reader finds the amount of informal discussion in this book excessive, it is interesting to note that Hermann Weyl frequently entered into philosophical discussion in his classic “Raum, Zeit, Materie” [50]. In 1923, there was more time to think about the meaning of mathematics. At that time, there was less of a boundary between mathematics and philosophy. On the other hand, some authors chatter too much. (For example, the book on π by Beckmann [188] looks more like a political pamphlet than a serious mathematics book.)
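1.6. Some minor details of presentation

1.6.1 Remark: The mathematical literature has a wide range of notations for sets of numbers. This book uses the following notations.

                     | all | positive | non-negative | negative | non-positive
integers             | $\mathbb{Z}$ | $\mathbb{Z}^+$, $\mathbb{N}$ | $\mathbb{Z}_0^+$, $\omega$ | $\mathbb{Z}^-$ | $\mathbb{Z}_0^-$
rationals            | $\mathbb{Q}$ | $\mathbb{Q}^+$ | $\mathbb{Q}_0^+$ | $\mathbb{Q}^-$ | $\mathbb{Q}_0^-$
reals                | $\mathbb{R}$ | $\mathbb{R}^+$ | $\mathbb{R}_0^+$ | $\mathbb{R}^-$ | $\mathbb{R}_0^-$
extended integers    | $\bar{\mathbb{Z}}$ | $\bar{\mathbb{Z}}^+$, $\bar{\mathbb{N}}$ | $\bar{\mathbb{Z}}_0^+$, $\omega^+$ | $\bar{\mathbb{Z}}^-$ | $\bar{\mathbb{Z}}_0^-$
extended rationals   | $\bar{\mathbb{Q}}$ | $\bar{\mathbb{Q}}^+$ | $\bar{\mathbb{Q}}_0^+$ | $\bar{\mathbb{Q}}^-$ | $\bar{\mathbb{Q}}_0^-$
extended reals       | $\bar{\mathbb{R}}$ | $\bar{\mathbb{R}}^+$ | $\bar{\mathbb{R}}_0^+$ | $\bar{\mathbb{R}}^-$ | $\bar{\mathbb{R}}_0^-$

The notation $\omega$ is used for the non-negative integers $\mathbb{Z}_0^+$ when the ordinal number representation is meant. Thus $0 = \emptyset$, $1 = \{0\}$, $2 = \{0, 1\}$ and so forth. Then $\omega^+ = \omega \cup \{\omega\}$. The notation $\mathbb{Z}_0^+$ is preferable when the representation is not important. Note that the plus-symbol in $\mathbb{Z}_0^+$ indicates the inclusion of positive numbers, whereas the plus-symbol in $\omega^+$ means that an extra element is added to the set $\omega$. The notation $\mathbb{N}$ is mnemonic for the “natural numbers” $\mathbb{Z}^+$. (Some authors include zero in the natural numbers.)

The bar over a set of numbers indicates that the set is extended by elements $\infty$ and $-\infty$. So $\bar{\mathbb{Z}} = \mathbb{Z} \cup \{\infty, -\infty\}$ and $\bar{\mathbb{R}} = \mathbb{R} \cup \{\infty, -\infty\}$. Then the positivity and negativity restrictions are applied to these extended sets. So $\bar{\mathbb{Z}}_0^+ = \mathbb{Z}_0^+ \cup \{\infty\}$ and $\bar{\mathbb{R}}^- = \mathbb{R}^- \cup \{-\infty\}$.

The notations $\mathbb{N}_n = \{1, 2, \ldots, n\}$ and $\mathbb{Z}_n = \{0, 1, \ldots, n - 1\}$ are used as index sets for finite sequences. These index sets are often used interchangeably in an informal manner. Thus $\mathbb{R}^n$ usually means $\mathbb{R}^{\mathbb{N}_n}$ in practice.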
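The ordinal representation and the index sets described in Remark 1.6.1 can be made concrete with a short sketch. The following Python fragment is only an illustration and is not part of the book's formal development; the function names `ordinal`, `index_set_N` and `index_set_Z` are hypothetical names chosen here. It builds the finite von Neumann ordinals as nested frozensets, so that 0 is the empty set, 1 = {0} and 2 = {0, 1}, and it treats a point of three-dimensional real space as a function on the index set {1, 2, 3}.

```python
def ordinal(n: int) -> frozenset:
    """Return the finite von Neumann ordinal n = {0, 1, ..., n-1} as nested frozensets."""
    result = frozenset()              # the ordinal 0 is the empty set
    for _ in range(n):
        result = result | {result}    # successor step: k + 1 = k ∪ {k}
    return result


def index_set_N(n: int) -> set:
    """Index set {1, 2, ..., n} (hypothetical helper name)."""
    return set(range(1, n + 1))


def index_set_Z(n: int) -> set:
    """Index set {0, 1, ..., n-1} (hypothetical helper name)."""
    return set(range(n))


zero, one, two, three = (ordinal(k) for k in range(4))
assert zero == frozenset()
assert one == frozenset({zero})
assert two == frozenset({zero, one})
assert len(three) == 3 and two in three   # each ordinal contains all smaller ordinals

# A point of R^3 viewed as a function from the index set {1, 2, 3} to the reals.
point = {1: 0.5, 2: -1.0, 3: 2.0}
assert set(point) == index_set_N(3)
```

The same successor step, applied to the infinite set ω itself, gives ω ∪ {ω}; that set cannot be produced by the finite loop above, which is one reason the sketch stops at finite ordinals.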
1.6.2 Remark: When mathematical symbols appear at the beginning of a sentence, the meaning can become unclear. Therefore a strong effort is made to commence all sentences with natural language words. For similar reasons, end-of-sentence mathematical symbols at the beginning of a line are also avoided where possible. These style considerations sometimes lead to slightly unnatural sentence structure.

1.6.3 Remark: Alternative definitions are sometimes given for the author’s preferred definitions. An alternative definition has an arrow pointing to the corresponding standard definition as in the following.

9.2.4 Definition: This is a definition which is adopted as standard in this book.

9.2.18 Definition (→ 9.2.4): This is an equivalent alternative version of Definition 9.2.4.

1.6.4 Remark: Theorems based on non-standard axioms are tagged. (See Remarks 4.7.2 and 5.0.10.)

1.6.5 Remark: In mathematical definitions in this book, the word “when” is sometimes used where usually the word “if” is used. A definition is not the same as a logical implication or equivalence. A definition is a shorthand for an often-used concept. It is not an axiom or theorem. The popular definition “assignment” notations $\stackrel{\mathrm{def}}{=}$, $\stackrel{\Delta}{=}$, ← and := are clumsy, unattractive and incorrect. Plain language is used instead.
1.6.6 Remark: The modern symbol “ ” is placed at the end of each completed proof instead of the old-fashioned abbreviation QED (Quod Erat Demonstrandum).

1.6.7 Remark: The spelling is mostly British. The suffix spellings -ize and -ization are used in accordance with the excellent discussion in the OED [211], page 1122. However, some of the North American spelling variations devised by Noah Webster in about 1828 ([204], page xxiv) are occasionally used.
1.7. Differences from other differential geometry texts

The following are some of the general differences in presentation between this book and the majority of other differential geometry textbooks.
(1) Differential geometry concepts are presented in a strictly progressive order in terms of structural layers and sublayers. In particular, the Riemannian metric is not introduced until all affine connection topics have been presented, and all concepts which are meaningful without a connection are defined in the differentiable manifold chapters before connections are defined. (See Section 1.1 for further details.)
(2) Substantial preliminary chapters present most of the prerequisites for the book. This avoids having to weave elementary material as needed into the more advanced material as many books do.
(3) Definitions are the main focus rather than theorems. Theorems are presented only to support the presentation of definitions.
(4) Classes of mathematical objects are generally defined semi-formally in terms of specification tuples.
(5) Examples are mostly collected at the end of the book to avoid interrupting the theoretical development.
(6) The exercises are at the end of the book and all answers are given. Therefore no essential result is required to be provided by the reader.
The following are some specific differences between the way concepts are defined in this book and the way they are defined in many or most other differential geometry books.
(i) Paths are defined as equivalence classes of curves, not images of curves.
(ii) Associated fibre bundles are defined in terms of a relation between fibre bundle atlases rather than the customary explicit constructions using orbit spaces.
(iii) Connections on fibre bundles are generalized to topological pathwise parallelism.
(iv) Tangent vectors are defined as equivalence classes of vector components rather than as differential operators or curves.
(v) Higher-order tangent operators are defined for use in analysis of partial differential equations.
(vi) Connections are defined on ordinary fibre bundles rather than the customary principal fibre bundles.
(vii) The covariant derivative is defined directly in terms of an affine connection by using a “drop function”.
(viii) Riemannian metric tensors are defined as half the Hessian of the square of the point-to-point distance function, rather than as ad hoc symmetric covariant tensors of degree 2.

1.8. Acknowledgements

[ Acknowledgements will be given here to particular people who helped with this book. ]

It goes perhaps without saying that the author is greatly indebted to Donald E. Knuth [206] for the gift of the TeX typesetting system, without which this book would not have been created. The author would have been even more grateful if plain TeX didn’t require an IQ of 200 to fully understand it. Thanks also to John D. Hobby for the MetaPost illustration tool which is used for all of the diagrams. The fonts used in this book include cmr (Computer Modern Roman, American Mathematical Society), msam/msbm (AMSFonts), rsfs (Taco Hoekwater) and wncyr Cyrillic (Washington University).
1.9. MSC 2000 subject classification

The following ‘wheel of fortune’ shows where differential geometry fits into the general scheme of things.

[Figure: wheel of the two-digit MSC 2000 subject areas, from 00 (General) to 97 (Mathematics education), with area 53 (Differential geometry) marked “you are here” and additional labels for PDE and general relativity.]
1.9.1 Remark: Differential geometry is one of 63 mathematics subject areas. In the Dewey decimal classification system, mathematics occupies 10 out of 1000 classifications. This suggests that differential geometry constitutes approximately 1/6300 of all human knowledge. This is about 0.016%. It is perhaps a depressing thought that decades of study are required to acquire even a fair understanding of such a small proportion of human knowledge. The human mind is a finite microscope scanning an infinite universe of ideas. No matter how much one learns, one’s world-view will always be woefully incomplete and unrepresentative. The research community is like millions of ants with tiny microscopes, each examining a grain of sugar at a time. Textbook writers try to assemble the grains of sugar into tidy tasty sugar cubes of knowledge. But there is a huge sugar-mountain of knowledge out there to be organized and understood. So it is essential to continually compactify and demystify all areas of mathematics so that valuable human time and energy are not wasted collecting and descrambling scattered and sometimes cryptic material.
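The estimate in Remark 1.9.1 can be checked in one line. The calculation below is only a sketch under the remark's own crude assumptions, namely that the 63 mathematics subject areas are of equal size and that the 1000 Dewey classifications weight all knowledge uniformly.

```latex
\[
  \frac{1}{63} \times \frac{10}{1000}
  \;=\; \frac{1}{6300}
  \;\approx\; 0.000159
  \;\approx\; 0.016\%.
\]
```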
1.10. How to learn mathematics

(1) Find somewhere quiet. Turn off the radio (and your portable digital music player). A library reading room is best if it is a quiet library reading room. Preferably have a large table or desk to work on.
(2) Open the book or lecture notes which you wish to study.
(3) Start copying the relevant part of the book or the lecture notes to a notepad by hand. If you are studying a book, you should be writing a summary or paraphrasing what you are reading. If you are studying lecture notes, you should copy everything and add explanations where required.
(4) Whenever you copy something, ask yourself if you really understand it completely. In other words, you must understand every word in every sentence. As long as you are completely comfortable with what you are copying, keep going.
(5) If you read something which is difficult to understand, stop and think about it until you understand it clearly. If a mathematical expression is unclear, try to determine which set or class each term in the expression belongs to. All sub-expressions in a complex expression must belong to some set or class. Every operator and function in an expression must act only on variables in the domain of definition of the operator or function.
(6) If you find something that you really can’t understand after a long time, copy it to your notebook, but put an asterisk in the margin. This means that you have copied something that you did not understand.
(7) While you continue copying, keep going back to the lines which are marked with an asterisk to see if you can understand them. If you find an explanation later, you can erase the asterisk.
(8) When you have finished copying enough material for one sitting, look over your notes to see if you can understand the lines which still have an asterisk. If you have no asterisks, that means that you have understood everything. So you can progress to the next chapter or section of the text.
(9) If you still have one or more asterisks left in your notes after a day or more, you should keep trying to understand the lines with the asterisks. Whenever you get some spare time and energy, just look at the lines with asterisks on them. These are the lines that need your attention most.
(10) If you discuss your work with other people, especially with teachers or tutors, show them your notes and the lines with the asterisks. Try to get them to explain these lines to you.

If you keep working like this, you will find that your study becomes very efficient. This is because you do not waste your time studying things which you have already understood. I used to notice that I would spend most of my time reading the things which I did understand. To learn efficiently, it is necessary to focus on the difficult things which you do not understand. That’s why it is so important to mark the incomprehensible lines with an asterisk.

Copying material by hand is important because this forces the ideas to go through the mind. The mind is on the path between the eyes and the hands. So when you copy something, it must go through your mind! It is also important to develop an awareness of whether you do or do not really understand something. It is important to remove the mental blind spots which can hide the difficult things.

When copying material, it is important to determine which is the first word where it becomes difficult. Read a difficult sentence until you find the first incomprehensible word. Focus on that word. If a mathematical expression is too difficult to understand, read each symbol one by one and ask yourself if you really know what each symbol means. Make sure you know which set or space each symbol belongs to. Look up the meaning of any symbol which is not clear.

The best way to learn any subject is to write a book on it. But if you’re in a hurry, the asterisk method is a good second best.
1.10.1 Remark: This is the “asterisk method of learning” which I discovered in 1975. It worked!

1.10.2 Remark: It is often said that learning is most effective when the student has an insight into an idea. This assertion is sometimes used as a pretext for the avoidance of “rote learning” because insight must precede the acquisition of ideas. In practice, this leads to inefficient learning. The best way to get insight into the properties and relations of ideas is to first upload them into the mind and then achieve insight into them. For example, a child who has learned the “five times table” will easily notice the redundancy in this table during or after rote learning. The best strategy is to learn first, then understand more deeply.
Part I
Preliminary topics
Chapter 2 Philosophical considerations
2.1 The bedrock of mathematics
2.2 Logic, language and tribalism
2.3 Ontology of mathematics
2.4 Plato’s theory of ideas
2.5 Sets as parameters for socio-mathematical network communications
2.6 Sets as parameters for classes of objects
2.7 Extraneous properties of set-constructions in definitions
2.8 Axioms versus constructions for defining mathematical systems
2.9 Some general remarks on mathematics and logic
2.10 Dark sets and dark numbers
2.11 Integers and infinity
2.12 Real numbers and infinitesimality
The purpose of this chapter is to outline some of the difficulties which are encountered when attempting to find a solid basis for mathematics, in particular for differential geometry. Any resemblance between this chapter and the academic subject “philosophy of mathematics” is purely coincidental and unintentional.
[Figure: “wheel of knowledge” connecting seven disciplines: mathematics, logic, physics, chemistry, biology, neuroscience and anthropology]
2.0.1 Remark: No bedrock of knowledge underlies mathematics. Reductionism ultimately fails. The above medieval-style “wheel of knowledge” shows some interdependencies between seven disciplines. One may seek answers to questions about the foundations of each discipline by following the arrow to a more fundamental discipline, but there seems to be no ultimate “bedrock of knowledge”. The author started writing this book with the intention of putting differential geometry on a firm, reliable footing for the benefit of mathematicians, physicists and engineers so that they could feel complete confidence that their work had a solid basis. When he had read a set theory book by Halmos [159] in 1975, it seemed to him that all of mathematics was reducible to set constructions. But by delving deeper into set theory and logic, the author encountered the opposite result. Thirty years later, the happy illusion faded and crumbled. It seems like a completely reasonable idea to seek the meaning of a mathematical expression in terms of the meanings of individual symbols in the expression. When this is applied recursively, the inevitable result
is either a cyclic definition at some level or a definition which refers outside of mathematics. Part of the philosophy of this book is to avoid external definitions. But cyclic definitions are even more unsatisfying and empty. So the best solution is to admit that expanding the semantic tree of mathematical expressions must ultimately lead to leaf nodes which make references outside mathematics. But these external references do not have to be external to this book. In other words, a book which seeks to give full meaning to mathematical expressions may, and should, include sufficient extra-mathematical context at the lowest levels to ensure that the whole edifice has meaning. The reductionist approach to mathematics is enormously valuable, but cannot be carried to its full conclusion. Just as matter may be reduced to atoms and elementary particles, so all of mathematics may be reduced to sets and logic. It turns out, however, that when elementary particles are split, they are recursively defined in terms of each other. They can be arranged in a network, not in a tree. In the same way, the elements of set theory and logic turn out to be recursively defined in terms of each other.
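The recursive expansion described above can be made concrete with a small sketch. The toy glossary below is purely hypothetical: its terms and dependencies are illustrative assumptions, not definitions taken from this book. The function `expand` follows a term through the terms used in its definition and records how each branch ends, either in a cycle or in a term which the glossary does not define (an external reference).

```python
# Hypothetical toy glossary: each term maps to the terms used in its definition.
# The entries are illustrative assumptions, not definitions from this book.
GLOSSARY = {
    "function": ["set", "ordered pair"],
    "ordered pair": ["set"],
    "set": ["logic"],
    "logic": ["set", "naive sequence"],  # logic leans on sets, and sets on logic
}


def expand(term, path=()):
    """Expand a term recursively and return a sorted list of branch endings.

    Each ending is ("cycle", t) if the branch returns to a term t already
    being expanded, or ("external", t) if t is not defined in the glossary.
    """
    if term in path:
        return [("cycle", term)]
    if term not in GLOSSARY:
        return [("external", term)]
    endings = set()
    for dependency in GLOSSARY[term]:
        endings.update(expand(dependency, path + (term,)))
    return sorted(endings)


result = expand("function")
print(result)
```

In a finite glossary where every definition uses at least one term, each branch of the expansion must end in one of these two ways, which is the dichotomy stated in the remark.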
2.0.2 Remark: Philosophical enquiry is unavoidable when studying meanings of mathematical definitions. This chapter arose naturally from the wide scope and unifying objectives of the book. If one writes about a narrow range of mathematical topics, there is little need for philosophical reflection, but this book attempts to present differential geometry and all of its prerequisites in a unified and systematic manner, tracing all definitions to their origins and incorporating numerous divergent approaches to the subject.
2.0.3 Remark: Some kinds of philosophical questions which arise in a definitions book. Many particular questions of a philosophical nature arose when writing this book. For example:
What are the “correct” or “best” definitions for real numbers, tensor products (of vectors), and tangent vectors (on manifolds)?
Should one accept the “existence” of a set whose contents can never be known? If so, what would such “existence” mean?
How can a secure basis be provided for mathematics if basic logic requires set theory for its establishment and set theory requires basic logic for its establishment?
What is the real “stuff” of mathematics? If the written symbols are not the real “stuff” of mathematics, what is? What do the symbols point to?
Is mathematics universal in the sense that inter-galactic civilisations would discover the same mathematics that Earthlings have? Or is mathematics merely a local culture which is propagated in our civilisation more or less in the manner of natural languages?
Is there any sense in which anything in mathematics can be said to be certainly true in an absolute sense? Or is all mathematics merely a socially defined behaviour?
To what extent is our mathematics a consequence of the peculiar capabilities of the human brain? Would a more advanced species use a totally different (and superior) mathematics?
Since the real number system seems to be a consequence of our system of physical measurements, what is the significance of the absurdly large infinity of elements of the set of real numbers? Are the logical difficulties that arise with the real numbers a problem with the physical universe which we measure, or a problem with the limitations of the human modelling process?
Since we never experience or observe anything truly infinite in the real world, how is it that the most successful applications of mathematics to the physical sciences rely so heavily on infinite and infinitesimal concepts?
These questions and many more are likely to arise spontaneously from any serious attempt to provide a robust, self-consistent, unified, systematic basis for a rich mathematical topic such as differential geometry and its pre-requisites. Therefore some philosophy seems inevitable since philosophy is the study of questions which cannot be answered. (When a question can be answered, it moves from the philosophy department to the science faculty.)
2.0.4 Remark: Philosophy of mathematics is potentially harmless. Richard Feynman is reputed to have said: “Philosophy of science is about as useful to scientists as ornithology is to birds.” Philosophy of mathematics is probably equally useful. It is best for mathematicians to get on with their mathematics and let the philosophers worry about what it all means, if anything. Philosophy of mathematics may or may not be completely harmless. The best attitude to philosophy of mathematics might be to simply read it and forget it. In fact, one may as well just not read it at all. Skipping this chapter would certainly do no harm. This chapter is not much more than a collection of mini-essays and speculations, sometimes presented in a disconnected fashion. The reader should certainly not expect logical self-consistency in this chapter. Philosophy is a great opportunity to relax one’s intellectual rigour for a while and let the mind unwind.
2.1. The bedrock of mathematics
2.1.1 Remark: Rigorous mathematics is boot-strapped from naive mathematics. Although logic is arguably the bedrock of mathematics, it floats on a sea of molten magma, namely the naive notions of logic, sets, functions, order and numbers which are used in the formulation of “rigorous” mathematical logic. This is illustrated in Figure 2.1.1.
[Figure 2.1.1: Relations between naive mathematics and “rigorous mathematics”. Sedimentary layers (rigorous mathematics): functions, order, numbers, set theory. Bedrock layer: mathematical logic. Magma layer (naive mathematics): naive logic, naive sets, naive functions, naive order, naive numbers.]
Mathematics is boot-strapped into existence by first assuming socially and biologically acquired naive notions; then building logical machinery on this wobbly basis; then “rigorously” redefining the naive notions of sets, functions and numbers with the logical machinery. Mathematics and logic are like two snakes swallowing each other by the tail. (A similar problem occurs in natural-language dictionaries. Some minimal vocabulary is required to boot-strap the definitions.) Conjuring up concepts such as metamathematics and metalogic solves nothing. Inventing long names to describe the circular definition problem does not make it go away.
2.1.2 Remark: Etymology and meaning of the word “naive”. The word “naive” is not necessarily pejorative. The French word “naive” comes from the Latin word “nativus”, which means “native”, “innate” or “natural”. These meanings are applicable in the context of logic and set theory. The Latin word “nativus” comes from Latin “natus”, which means “born”. So the word “naive” may be thought of as meaning “inborn”. That is, naive mathematics is a capability which humans are born with. (This is related, of course, to the philosophizing in linguistics about the extent to which natural language is an inborn or learned capability. Much the same arguments apply to mathematical behaviour.)
2.1.3 Remark: Fundamental mathematics is both a-priori and analytic knowledge. The boot-strap notion of logic and mathematics seems to resolve the question of whether logic and arithmetic are a-priori knowledge. It seems that logic and arithmetic are both a-priori and analytic knowledge because they are a-priori during the boot-strap phase and analytic when re-defined “rigorously”. A mathematical education consists of many loops around the snakes and ladders of logic and mathematics until one’s thinking is synchronized with other mathematicians. Philosophers who worry about the true nature of mathematics possibly make the mistake of assuming a static picture. They ignore the cyclic and dynamic nature of all knowledge, which has no ultimate bedrock. Terra firma floats on molten magma.
2.1.4 Remark: Mathematical logic is the best “bedrock layer” for mathematics. One must agree on a starting point for the presentation of a body of mathematics and then proceed in an orderly fashion from that agreed starting point. The starting assumptions cannot be validated in a noncyclic way from more basic “knowledge”. Like all communication of knowledge, one must proceed from an agreed point in an agreed manner. New ideas can be communicated only if old ideas are first agreed.
(Computer communications are similar. A pair of computers generally start any communication session with a “handshake” procedure to synchronize their states so that the rest of the session will be correctly understood by both sides.) At a postgraduate university mathematics level, it makes sense to strive to achieve synchronization of concepts in the mathematical logic layer as a “bedrock” and base everything else on that. Since most of mathematics can be built upon a basis of mathematical logic, it is a suitable starting point. This does not mean that logic is self-evidently true. It is just easier to achieve standardization of language and concepts in the mathematical logic layer in the current century at a particular educational level.
2.1.5 Remark: The bedrock of mathematics may be chosen to be natural or minimal. One may compare the organization of mathematics as a subject with chemistry. One group of chemists could define hydrogen and oxygen as products of the electrolysis of water, while another group could define water as a compound made from hydrogen and oxygen. It is more natural to define the hard-to-produce gases in terms of the ubiquitous liquid. But it is more minimalist to start from 92 elements and define all compounds in terms of them.
In the same way, the basis of each mathematical topic may be chosen to be either natural or minimal. Very often a minimal set of axioms and rules will seem unnatural and difficult to understand, while a natural starting point may suffer from substantial redundancy and untidiness. In chemistry, there is broad agreement that the 92 elements are a suitable basis for understanding all compounds. But in mathematics and logic, and in differential geometry in particular, there are often no clear winners in the competition to be the undisputed basis of particular topics.
2.1.6 Remark: Mathematics bedrock layers in history. The choice of bedrock layer for mathematics seems to be a question of fashion. In ancient Greece, and for a long time after, the most popular choice for the bedrock layer was geometry. For a long time, arithmetic and algebra could be based on the ruler-and-compass geometry of points, lines and circles. From the time of Descartes onwards, geometry could be based on arithmetic and algebra. Then geometry became progressively more secondary, and arithmetic became more primary, because algebra and analysis extended numbers well beyond what the old geometry could deliver. Later on, set theory provided a new, more primary layer upon which arithmetic, algebra, analysis and geometry could be based. Around the beginning of the 20th century, mathematical logic became a yet more primary layer upon which set theory and all of mathematics could be based. Maybe some day a new concept layer will be developed to better underlie logic and mathematics. Such a development is difficult to foresee. The fact that logic and set theory are mutually intertwined in a cyclic manner suggests that a better bedrock layer is needed. But at the current time, logic seems to be the best choice of boot-strap layer.
2.1.7 Remark: The network of mathematical concepts requires coherence. Although the network of mathematical concepts cannot be defined in the sense of deriving all concepts from other concepts in an acyclic manner, the concept network can at least be coherent. Coherence is the best that can be hoped for in the non-acyclic concept network which includes both mathematics and mathematical logic as in Figure 2.1.1. Coherence is not the same as logical self-consistency. The latter means that the concept network is tested with respect to a deductive logic framework which is established external to the network. In the situation considered here, the logical framework is part of the concept network which is being tested. The rules of deduction are themselves defined within the network. In the original Latin, the word “coherent” means “sticking together” whereas “consistent” means “standing firmly” or “remaining firm”. The English meanings of “consistent” and “coherent” are very similar in general usage. In the context of mathematical logic, “consistent” generally means that no propositions will contradict each other whereas “coherent” is a more abstract term referring to the entire framework of concepts.
2.1.8 Remark: The mathematical pre-requisites for mathematical logic. Mendelson [164], pages 5–11, explicitly presents six pages of mathematics which are pre-requisite to his introduction to mathematical logic.
“For the absolute novice a summary will be given here of some of the basic ideas and results used in the text.”
Since Mendelson’s six pages include (very compactly) a big chunk of the basic theory of sets, relations, functions, cardinal numbers and order, this seems to be a quiet confession that mathematical logic cannot be developed without assuming much mathematics which is itself based on mathematical logic. The surprising thing is that this circular dependency is so seldom remarked upon. Maybe the teachers for each topic refer to each other for the pre-requisites of their courses, each believing that their own topic is firmly based on lower-level concepts presented elsewhere. Hopefully, in the end, there are at least no contradictions between the topics. Remark 7.2.5 discusses how the naive notion of a sequence is required as prior knowledge for the development of logic and mathematics, but is then re-defined within ZF set theory. Section 3.14 is an attempt to list some of the naive mathematics required in the set-up of mathematical logic in this book.
2.1.9 Remark: Bertrand Russell’s disillusionment with the non-provability of axioms. The lack of a “bedrock” for mathematics is reminiscent of the experience of Bertrand Russell when he discovered that everything in Euclidean geometry is derived from axioms which must be accepted without proof. The following paragraph (Clark [179], page 34) describes Russell’s disillusionment.
“Lack of an emotional hitching-post was quite clearly a major factor in driving the young Russell out on his quest for an intellectual alternative – for certainty in an uncertain world – a journey which took him first into mathematics and then into philosophy. The expedition had started by 1883 when Frank Russell took his brother’s mathematical training in hand. ‘I gave Bertie his first lesson in Euclid this afternoon’, he noted in his diary on 9 August. ‘He is sure to prove a credit to his teacher. He did very well indeed, and we got half through the Definitions.’ Here there was to be no difficulty. The trouble came with the Axioms. What was the proof of these, the young pupil asked with naive innocence. Everything apparently rested on them, so it was surely essential that their validity was beyond the slightest doubt. Frank’s statement that the Axioms had to be taken for granted was one of Russell’s early disillusionments. ‘At these words my hopes crumbled’, he remembered; ‘. . . why should I admit these things if they can’t be proved?’ His brother warned that unless he could accept them it would be impossible to continue. Russell capitulated – provisionally.”
2.1.10 Remark: Philosophy does not provide a solid bedrock for mathematical logic. Sadly, one cannot find a bedrock underlying logic even in the realm of philosophy because philosophy itself depends heavily on logic, thereby creating a further dependency cycle. At best one can hope to find a logic and philosophy which are coherent with each other. Philosophy does “underlie” logic in some sense, as suggested by the diagram discussed in Remark 2.0.1, but all of the arrows depicted in that diagram are accompanied by thinner arrows pointing in the opposite direction. One would not want to base any subject on philosophy anyway, since philosophy is the woolliest of all disciplines. Every proposition in philosophy can be shown to be simultaneously true, false and meaningless. Philosophy is analogous to the Earth’s core in the magma/bedrock/sedimentary picture in Figure 2.1.1. Everything in the Earth’s core is a matter of conjecture and guesswork because direct observation is not possible.
2.1.11 Remark: Mathematics is psychological and biological in nature, location and origin. Lakoff/Núñez [172], page 49, has the following comment on where the ultimate meaning of mathematics comes from when mathematical symbols are recursively interpreted.
“To understand a mathematical symbol is to associate it with a concept – something meaningful in human cognition that is ultimately grounded in experience and created via neural mechanisms. [. . .] The meaning of mathematical symbols is not in the symbols alone and how they can be manipulated by rule. Nor is the meaning of symbols in the interpretation of the symbols in terms of set-theoretical models that are themselves uninterpreted. Ultimately, mathematical meaning is like everyday meaning. It is part of embodied cognition.”
Texts on mathematical logic typically interpret symbolic logic in terms of set theory, which in turn is defined in terms of mathematical logic. The Lakoff/Núñez claim is that the recursive interpretation does end
somewhere, namely with “embodied cognition”, which means essentially the biological processes of human thinking. In other words, all mathematics is ultimately psychological and biological in its nature, its location and its origin.
2.2. Logic, language and tribalism
2.2.1 Remark: Removing the circularities in logic and mathematics by external references. The attempt to remove the circularities in the structure of mathematics may be compared to a study of how the parts of the human body are supported. One may observe that the head is supported by the neck, which is supported by the shoulders, which are supported by the chest, which is supported by the spine and abdomen and so forth. But the feet are not supported by any part of the body. The feet must be supported by something outside the body. In this sense, the ground underneath the feet is part of the body. In fact, the dirt we walk on is an inseparable part of the body (in the same way that various friendly bacteria which are needed for digestion are an essential part of the human body). Even if we jump in the air, the trajectory of that jump is defined by the point at which we leave the ground and the point at which we return. If we swim in the sea, the sea itself is defined by the ground on which it rests. Swimming is supported by the buoyancy of the water, which is supported by the earth below it. Thus any explanation of the physical support of the components of the human body can only avoid circular dependencies by finally referring to something which is outside the body. In the same way, an explanation of the meaning of mathematics can never avoid circularities unless there is some support point outside mathematics. This is where naive mathematics is required. Naive mathematics is the point of contact between pure mathematics and the nature of human experience itself, including the experience of the arrow of time, which allows us to order events, and this ordering of events allows us to count objects in sets, which gives us cardinality and numbers. The arrow of time is also required for the ordering of logical arguments from assumptions at the beginning to assertions at the end. A mathematics book typically starts with assumptions and works towards conclusions, which also requires the arrow of time to distinguish the future from the past in an asymmetric fashion. The ability to read a mathematical text requires knowledge of left and right, and up and down.
2.2.2 Remark: The membership concept is related to tribalism. The concept of sets comes from the human mind’s ability to group objects together and define boundaries around territories. The very word “membership” in the set theory context is suggestive of tribal membership, which is fundamental to both human cooperation and competition (which in turn are responsible for the majority of human happiness and misery respectively).
2.2.3 Remark: The concept of propositions originates from cooperation and competition. The ability to convert mathematics into propositions is derived from the human ability to convert ideas into words, which is fundamental to the evolution of humans from isolated monkey tribes into communities of minds who build common views of the world and history. Speech allowed the human species to transmit knowledge efficiently and preserve it through replication in the minds of others after individuals who possessed knowledge died. (Speech also permitted more efficient cooperation within tribes to defeat competitors and the environment.) Ultimately the conversion of all mathematics into symbolic logic rests on the speech capability, that is, the ability to convert ideas into words and convert words back into ideas. (See Figure 2.2.1.)
[Figure 2.2.1: Communication of human ideas via speech (ideas, talking/verbalization, speech, hearing/interpretation, ideas)]
The semantics of symbolic logic rests on this human capability to convert back and forth between ideas and words. So it seems that our ability to perform symbolic logic rests heavily on a human capability which evolved in response to a need to cooperate better within tribes to increase survivability.
2.2.4 Remark: The role of language in community formation. The claim in Remark 2.2.3 that language plays an important role in tribe and community formation is supported by the following passage in a book on anthropological linguistics by Foley [171], page 69.
“Aiello and Dunbar [169] and Dunbar [170] [. . .] note a close correlation between social-group size and brain size, and note that both of these were increasing significantly about the time of Homo habilis 2 million years ago. As the size increased, grooming behaviour would have no longer sufficed to ensure group social cohesion. Aiello and Dunbar [169] calculate that Homo erectus (Olduvai 9) with a cranial capacity of 1067 cm³ and a mean group size of 116.39 would have needed to spend 33 percent of its time in grooming activity to promote social bonding. They posit that the function and complexity of vocalizations were extended to take on some of the load previously carried by grooming. They also point out that gelada baboons, living in the largest groups among primates excluding humans, with a mean group size of 115, have vocalization patterns with a number of features once considered unique to human speech: fricatives, stops and nasals, 3 places of articulation (labial, dental and velar), and prosodic melodies (Richman [176,177,178]). These vocal properties seem to supplement grooming as a mechanism for social bonding. Given the grooming time that would have been required for hominids, Aiello and Dunbar [169] propose that vocalizations underwent a similar extension of function in the ancestral Homo lineage. Indeed, the importance of their role in promoting social cohesion has been emphasized continuously by many scholars, going back at least to Malinowski [174]; this, of course, is also central to the view of linguistic practices presented in [the previous chapter]. This is often insufficiently recognized because speakers of written languages (and linguists!) often unduly overemphasize its propositional bearing function.”
It seems somewhat ironic that our modern, sophisticated system of communicating and thinking may have originated not so much in the need to communicate “propositions” or “facts”, but rather in the advantages of forming larger tribal groupings for the sake of cooperation against other animals of the same species. In other words, the original adaptive advantage of language may have been the greater solidarity when competing with other individuals of the same species. The fact that language eventually developed the ability to communicate propositions may have been an incidental by-product of identifying which individuals are in or out of a particular group; in other words, “friend or foe”.
2.2.5 Remark: The anthropological approach to mathematics. The whole of mathematics (and physics and logic) may be approached with an anthropological mind-set. (Anthropology is the study of humans with the same mind-set that zoology applies to animals, but with special emphasis on the differences between humans and other animals.) It is important to study not only what people do in mathematics, but also how they think about what they are doing. When the studied humans say things about their thinking which do not seem to make much sense, one should pay more attention to what they are doing.
2.2.6 Remark: Counting requires language and the arrow of time. The ability to count things comes from the association of an ordered sequence of words with an ordered sequence of observations of objects. Counting (beyond one or two items) would not be possible without a vocabulary of number-words. So numbers arise (probably) from ordered listings of things according to a standard set of number-words. This ability to associate words with lists of objects is clearly useful for communicating whether one’s possessions have increased or decreased, for example lists of tribal members, tools or domesticated animals.
Sometimes it seems sad that geometry has been reduced to numbers (as in numerical coordinate systems), and that numbers have been reduced to symbolic logic. But symbolic logic is more convenient for reliable written communication, which originated in the words and sentences of natural-language speech, which itself arose from the need for efficient communication within human tribes. It seems, then, that there is nothing really fundamental about the reduction of mathematics to symbolic logic. It just happens to be convenient for fast and reliable communication between humans. If evolution had taken another path, the most efficient means of communication might have been more geometrical in character, with no words or sentences at all. Or effective communication could have evolved along lines which are difficult for us to imagine because we are human beings who see everything anthropocentrically. Consequently one should not necessarily expect symbolic logic (and ordered logical arguments) to be the means of expressing the fundamentals of mathematics among extra-terrestrial civilisations. In other words,
one should not be too concerned at the apparent arbitrariness of some aspects of mathematical logic. Many aspects are arbitrary, because they originate in the peculiarities of the human species and our current culture. The purpose of presenting naive mathematics is to identify, in a sense, “where the feet meet the ground”. (See Section 3.14.)
2.2.7 Remark: The normative influence of written language on spoken language. There are some similarities between the disciplines of logic and linguistics. Linguists (including anthropologists and proselytizers) have written dictionaries and grammar books describing the languages of illiterate societies since shortly after the invention of printing in Europe. (For example, see Ostler [175], pages 341–347, for a brief account of the history of dictionaries and grammar books in the Spanish colonies in America from about 1540 onwards.) Then later the studied societies were able to use dictionaries and grammar books to help learn their own languages better and to teach their children. (Even in the modern world, some groups of people learn how to do their own traditional dances and rituals from anthropologists who described these practices before they became extinct in the wild.) In the same way, logical argument used to be an activity in the wild. Logicians studied how that wild logic was happening and developed symbolic logic to describe that behaviour. Gradually the studied populations, particularly mathematicians, have been able to use symbolic logic in their own thinking. The formalization of both language and logic is partly good and partly bad. The linguists end up forcing languages to go down paths that they otherwise would not have. People who learn language from books find that they have to strictly conform to rules which have been inferred (sometimes imperfectly) by the linguists. (See Figure 2.2.2.)
[Figure 2.2.2: Communication of human ideas via speech and literature. Diagram labels: ideas, talking, writing, speech, literature, hearing, reading, ideas.]
This often brings about conflict between prescribed language and colloquial language. In the case of logic, the specified logic of the logicians can force mathematicians to accept propositions against their intuition and will. It is important not to forget where formal logic came from. Formal logic derives its authority from the wild logic from which the formal logic was inferred. Logicians have as much authority to tell mathematicians how to do mathematics as anthropologists have to tell Australian aborigines how to roast a kangaroo.
2.2.8 Remark: Formalist mathematics yields existence assertions which require “faith”. One could view some of the heated controversies about intuitionism and constructivism in the 19th and 20th centuries as resistance to the new formal way of developing mathematics. In particular, the concept of “existence” formerly meant that something could be specifically described, whereas later mathematicians used formal methods to describe an empty concept of existence which did not yield specific objects. The fact that one can write down that something exists, and justify this statement in terms of a logical argument based upon baseless axioms, is nowadays accepted as a proof of existence. The formalist approach to the notion of existence shows how formal grammars can distort the original thinking processes by failing to fully, accurately describe the original processes. In the same way that the real, original speakers of a language may feel they have a right to speak as they see fit rather than following the grammar books and dictionaries of anthropologists and linguists, so also it seems reasonable that mathematicians should feel they have the right to object to axioms which seem to describe something which they do not recognize as authentic mathematics. When one can prove an assertion ∃x, P(x) from axioms, it requires a certain amount of faith to believe that an x really does exist which satisfies P(x).
A sceptical person wishes to see material evidence of the
existence. If someone claims that a set is non-empty, but also states that not a single member of that set can be specified, scepticism would seem to be justified. Even the best-presented argument for the existence of pixies is thrown into doubt by the impossibility of ever seeing one.
2.3. Ontology of mathematics
2.3.1 Remark: Ontology versus “an ontology”. Semantics is the association of meanings with texts. An ontology is a semantics for which the meanings are expressed in terms of real-world models. In philosophy, the subject of “ontology” is generally defined as the study of being or the essence of things. The phrase “an ontology” has a different but related meaning. An ontology, especially in the context of artificial intelligence computer software, is a standardized model to which different languages may refer to facilitate translation between the languages. To achieve this objective, an ontology must contain representations of all the objects, classes, relations, attributes and other things which are signified by the languages in question.
2.3.2 Remark: Real-world models are required for interpreting physical phenomena. Since the “real world” can only be indirectly perceived via physical phenomena (i.e. interactions between human minds and the real world), the closest we can come to the real world is a model of the real world which is consistent with observed phenomena. Real-world models vary between individual people and over time, and according to the tasks to which the models are to be applied. A single individual may simultaneously adopt multiple real-world models. Models may attempt to comprehensively describe the whole universe or only particular aspects of the entire universe. Since the words “whole” and “entire” cannot be validated for the universe, because we cannot know the full extent of the limitations of human perception and observations of the universe, all models must be assumed to be partial at best. More importantly, the limitations on modelling by humans imply that all real-world models have dubious correctness and completeness even under the most optimistic assumptions. The particular difficulty in developing an ontology for mathematics is the non-physical nature of mathematical “phenomena”.
2.3.3 Remark: An ontology for mathematics can be based on mind-states. The view adopted in this book is that a satisfying ontology for mathematics can be built on a real-world model which locates mathematical objects inside human minds, in the minds of some animals, and also in computer systems which execute mathematical software. The key concept in this proposed model is the “socio-mathematical network” which is outlined in Section 2.5. The socio-mathematical network view of mathematics contrasts with the Platonic “Ideal” view in Section 2.4.
2.3.4 Remark: Albert Einstein’s explanation of the ontological problem for mathematics. Albert Einstein wrote the following comments about “the ontological problem” in an essay: “The problem of space, ether, and the field in physics” [181], pages 61–62.
It is the security by which we are so much impressed in mathematics. But this security is purchased at the price of emptiness of content. Concepts can only acquire content when they are connected, however indirectly, with sensible experience. But no logical investigation can reveal this connection; it can only be experienced. And yet it is this connection that determines the cognitive value of systems of concepts. Take an example. Suppose an archaeologist belonging to a later culture finds a text-book of Euclidean geometry without diagrams. He will discover how the words “point,” “straight-line,” “plane” are used in the propositions. He will also see how the latter are deduced from each other. He will even be able to frame new propositions according to the known rules. But the framing of these propositions will remain an empty word-game for him, as long as “point,” “straight-line,” “plane,” etc., convey nothing to him. Only when they do convey something will geometry possess
In the context of mathematics, an ontology must be able to represent all of the ideas which are signified by mathematical language. The big question is: What should an ontology for mathematics contain? In other words, what are the things to which mathematics refers and what are the relations between them? This is very much like asking: What is the essence of mathematical things?
2.3.5 Remark: Mathematics without ontology is empty of meaning. Mathematics is as much about sets as computer data is about zeros and ones. Just as computer data must be brought to life by being interpreted by human beings, so also must set constructions be brought to life by interpretation. Set theory is a language, but this language is no more useful than ancient Mycenaean clay tablets if one has no dictionary and no familiarity with the world model within which the tablets were created. Even if a future archaeologist has the same sort of brain as 21st century mathematicians, however, there is still the difficult question of how 21st century mathematical symbols can be mapped to the concepts inside the mind of the future archaeologist. The minds of mathematicians in this century are formed by a long and intense process of training to “see” concepts which are quite foreign to spontaneous thinking. These concepts require considerable “mind-stretching”. It is not obvious that the minds of people of the future would be sufficiently “stretched” by reading 21st century mathematical literature to be able to synchronize to our mathematical concepts. On the other hand, the mathematical literature of ancient Greece was somehow adequate to stimulate the renaissance of mathematics in Europe after a Dark Age which lasted more than 1500 years. There was, admittedly, some slim continuity in the person-to-person communication of mathematical concepts. So it is difficult to know if the literature alone could have stimulated a mathematical renaissance. Luckily the particular natural language of the texts was not yet extinct when Europe emerged from this Dark Age.
2.3.6 Remark: Mathematical activity may be perceived internally. Einstein is probably not right (in Remark 2.3.4) in saying that all meaning must be connected to “sensible experience” unless one includes internal perception as a kind of sensory experience.
Pure mathematical language refers to activities which are perceived within the human brain.
2.3.7 Remark: The intellectual content of mathematics lies in human ideas. Lakoff/Núñez [172], page xi, makes the following comment on the lack of intellectual content in mathematical symbols.
Mathematics is seen as the epitome of precision, manifested in the use of symbols in calculation and in formal proofs. Symbols are, of course, just symbols, not ideas. The intellectual content of mathematics lies in its ideas, not in the symbols themselves. In short, the intellectual content of mathematics does not lie where the mathematical rigor can be most easily seen—namely, in the symbols. Rather, it lies in human ideas.
2.3.8 Remark: Classification of ontologies for mathematics. Ontologies for mathematics are often categorized in an effort to try to bring order to a chaotic jumble of proposals by numerous authors. Various mathematics ontology category lists may be found in the literature. Among the numerous categories are the following.
any real content for him. The same will be true of analytical mechanics, and indeed of any exposition of the logically deductive sciences. What does this talk of “straight-line,” “point,” “intersection,” etc., conveying something to one, mean? It means that one can point to the parts of sensible experience to which those words refer. This extra-logical problem is the essential problem, which the archaeologist will only be able to solve intuitively, by examining his experience and seeing if he can discover anything which corresponds to those primary terms of the theory and the axioms laid down for them. Only in this sense can the question of the nature of a conceptually presented entity be reasonably raised. With our pre-scientific concepts we are very much in the position of our archaeologist in regard to the ontological problem. We have, so to speak, forgotten what features in the world of experience caused us to frame those concepts, and we have great difficulty in representing the world of experience to ourselves without the spectacles of the old-fashioned conceptual interpretation. There is the further difficulty that our language is compelled to work with words which are inseparably connected with those primitive concepts. Mathematics is not able to define itself in a self-contained way. Even though a machine may perform calculations (like a future archaeologist who does not know what the words and symbols refer to), the machine is not able to solve the ontological problem of mapping words and symbols to their meaning.
(1) Plato’s theory of ideas. Mathematics exists in a “timeless realm of being”. (See Section 2.4.)
(2) Mathematics exists in the machinery of the universe.
(3) Mathematics exists in the human mind.
2.3.9 Remark: The physical-universe-structure mathematics ontology. The ontology category (2) in Remark 2.3.8 is exemplified by the following comments in Lakoff/Núñez [172], page xv. (The authors then proceed to cast scorn on the idea.)
Mathematics is part of the physical universe and provides rational structure to it. There are Fibonacci series in flowers, logarithmic spirals in snails, fractals in mountain ranges, parabolas in home runs, and π in the spherical shape of stars and planets and bubbles.
Later, these same authors say the following (Lakoff/Núñez [172], page 3).
How can we make sense of the fact that scientists have been able to find or fashion forms of mathematics that accurately characterize many aspects of the physical world and even make correct predictions? It is sometimes assumed that the effectiveness of mathematics as a scientific tool shows that mathematics itself exists in the structure of the physical universe. This, of course, is not a scientific argument with any empirical scientific basis. [. . . ] Our argument, in brief, will be that whatever “fit” there is between mathematics and the world occurs in the minds of scientists who have observed the world closely, learned the appropriate mathematics well (or invented it), and fit them together (often effectively) using their all-too-human minds and brains.
The correspondence between models and the universe is pretty good at times, but we view the universe through a cloud of statistical variations in all of our measurements. All observations of the physical universe require statistical inference to discern the noumena underlying the phenomena. This inference may be explicit, with reference to probabilistic models, or it may be implicit, as in the case of the human sensory system.
2.3.10 Remark: The mathematical mind ontology leads to a useful avenue of research. One may look at the options for the ontology of mathematics in Remark 2.3.8 in terms of their consequences for research. If one accepts option (1) (the mathematical heaven) or option (2) (the mathematical universe), the source of mathematics is in both cases inaccessible to practical research. Both are dead ends for research. By contrast, if one accepts option (3) (the mathematical mind), one is led to inquire into the history of the development of mathematical ideas in the human mind. The history of models, propositions, numbers and sets can be inferred to some extent from biology, zoology, palaeontology, archaeology, history and anthropology. In particular, the development of these building blocks of mathematical thought over the last million years can be inferred as we have progressed from monkeys to pre-linguistic humans, to linguistic humans, to neolithic farmers, to the earliest literacy, and so forth. The “deconstruction” of our modern mathematical ideas in this historical fashion brings some credible clarification of their nature.
2.4. Plato’s theory of ideas
2.4.1 Remark: Sets and numbers exist in minds, not in a mathematics-heaven. The Platonic style of ontology is explicitly rejected in this book. Sets and numbers, for example, really exist in the mind-states and communications among human beings (and also in the electronic states and communications among computers and between computers and human beings), but sets and numbers do not exist in any “mathematics heaven” where everything is perfect and eternal. Any such “heaven for ideas” is located in the human mind, if anywhere, and it is neither perfect nor eternal. Plato’s ideal “Forms” really do exist, but only in the human mind.
There is very great certainty that mathematics does exist in the human mind. This mathematics corresponds to observations of, and interactions with, the physical universe. But those observations and interactions are extremely limited by the human sensory system and the human cognitive system. We cannot be at all certain that the mathematics of our physical models is inherent in the observed universe itself. We can only say that our mathematics is highly suited to describing our observations and interactions, which are themselves very limited by the channels through which we make our observations.
The idea that all imperfect physical-world circles are striving towards a single Ideal circle form in a perfect Form-world is quite seductive. (For discussion of Plato’s “theory of ideas”, see for example Russell [185], Chapter XV, pages 135–146; Foley [171], pages 81–83.)
[ Remarks 2.4.2 and 2.4.3 are very similar to Remark 2.11.19. Should combine or at least collocate them. ]
2.4.2 Remark: A plausible argument in favour of a Platonic-style ontology for mathematics. A plausible argument may be made in favour of Platonic ontology for the integers in the following form.
(1) Numbers written on paper must refer to something.
(2) Numbers do not correspond exactly to anything in the sensible world.
(3) Therefore numbers correspond exactly to something which is not in the sensible world.
Here the word “sensible” means “able to be sensed”, e.g. via sight or hearing. The same overall form of argument is applied to geometrical forms such as triangles in this passage from Russell [185], page 139.
In geometry, for example, we say: ‘Let ABC be a rectilinear triangle.’ It is against the rules to ask whether ABC really is a rectilinear triangle, although, if it is a figure that we have drawn, we may be sure that it is not, because we can’t draw absolutely straight lines. Accordingly, mathematics can never tell us what is, but only what would be if. . . There are no straight lines in the sensible world; therefore, if mathematics is to have more than hypothetical truth, we must find evidence for the existence of super-sensible straight lines in a super-sensible world. This cannot be done by the understanding, but according to Plato it can be done by reason, which shows that there is a rectilinear triangle in heaven, of which geometrical propositions can be affirmed categorically, not hypothetically.
(1) If you write down the numbers 1, 2 and 3 on a piece of paper and ask: “Are these numbers?”, many people will say that they are. But if you write down the word “dog” and ask if this is a dog, they will almost certainly say it is not. The word “dog” refers to an external entity. In the same way, the numbers written on paper also refer to entities other than the ink on the paper. So what do written numerals refer to? The written numerals are only ink or graphite smeared onto paper. If billions of people are writing these symbols, they must mean something. They must refer to something in the world of experience of the people who write the symbols to communicate with each other.
(2) Nothing in the observed physical world corresponds exactly to numbers. The number of cows in a field may seem to be 3, for example, but if a cow dies, the number of cows will vary over time. There will be a point in time at which it is not clear whether there are 2 or 3 cows because one of them is undergoing a death process. Even in the case of humans, there is enormous controversy over the many definitions of when death has occurred. If a cow is born, the number of cows increases, but there are times when the cow count is ambiguous. There is frequent animated controversy over the time-point at which a cow (or human) may be said to come into existence. Likewise, there is ambiguity about whether a cow is inside a given field or outside it. If the cow is entering or leaving the field via a hole in the fence, the time of entry or exit is subjective. Quantum mechanics muddies the waters still further.
(3) Since nothing in the physical world corresponds exactly to the idea of a number, and written numbers must refer to something, there must be a non-physical world where numbers exist. This non-physical world must be perceived by all humans because otherwise they could not communicate about numbers with each other.
This proves the existence of a non-physical world where all numbers are perfect and eternal. This world may be referred to as a “mathematics heaven”. This number-world can be perceived by the minds of human beings. Convincing argument, isn’t it? Well, maybe not. The integers may (and probably do) refer to specific kinds of contents of the human mind. But these contents can be sensed by the mind, although they are not sensed
Although microscopy, chemistry and quantum mechanics convince us very easily that there is no such thing as a straight line, it is a little more difficult to make the case that the “cardinality of collections” is ill-defined in the sensible world. Each integer is supposed to correspond to the cardinality of real-world collections of objects, or so we are told in our infancy. Even chimpanzees can count. But serious doubt may be thrown on this idea. So here is the above 3-step argument for integers.
externally. In other words, it is not necessary to invent some sort of spooky, aetherial, all-pervasive meta-universe which can be detected by deep contemplation and meditation. Numbers (and rectilinear triangles) are just part of the normal thought processes of normal human beings living normal lives. Nothing spooky.
2.4.3 Remark: Possible non-existence of sensible objects. It may be remarked that the second part of the argument in Remark 2.4.2 implies in particular that the integers 0 and 1 have no reliable correspondence to the “sensible world” either. So both existence and uniqueness of objects are thrown into doubt. (This is because existence means that the cardinality of a collection is at least 1 whereas uniqueness means that the cardinality is at most 1.) A cow may be defined as a region of space-time which has an approximate boundary, but this kind of definition is not free of ambiguity. In order to count up to 1 cow, the observer must be able to
(i) see a cow as more than just a two-dimensional pattern of light;
(ii) distinguish one cow from another similar-looking cow;
(iii) identify the cow at each point in time as being the same cow;
(iv) distinguish cows from non-cows;
(v) not count the same cow twice by accident.
2.4.4 Remark: Single object with a trajectory over time versus time sequence of objects. Requirement (iii) in Remark 2.4.3 implies that as a cow develops over life, it is supposed to be the same cow. This is despite the fact that the cow does change over that time, and a large proportion of the matter in the cow is exchanged during life. In the case of a car whose parts have all been replaced at least once during its lifetime, one often still regards that car as being the same car, having the same “identity”. The same identification issue arises in the case of nations. It is generally assumed that a nation at some point in history may be identified with the nation of the same name 150 years later, although all of the individuals have been “exchanged”. This is sometimes called the “continuity and succession” issue. This arises especially in discussions of debts and rights. (For example, Russia was regarded as the “successor state” to the Soviet Union in 1991, inheriting its rights and obligations even though the territory and population were significantly altered.) Similarly, companies merge and split, and employees come and go, but somehow the identity of companies must be determined. As another example, if a bacterium divides, it is difficult to say whether it died at that time, or half of its identity is given to each of its offspring, or its full identity is given to each of the offspring. A very similar issue arises in the design of databases. One must often decide what constitutes an “object”. Some attributes determine the identity of an object, whereas other attributes are changeable without changing the identity. Sometimes the determination of object identity is easy. Sometimes it is almost impossible. The implication of these comments for mathematics, and for logic generally, is that one should not take too seriously the idea that numbers and sets have their origins in the perceived universe. Numbers and sets are human concepts for organizing perceptions. 
Numbers and sets may be thought of as part of the “commentary” by the human mind on the world as perceived. When a child is shown 3 objects and is told that this is the number 3, this does not mean that the number 3 is in the real world. The number 3 is something which arises in the child’s mind when 3 objects are shown. And the definition of “3 objects” is whatever triggers the adult’s mental concept of “3 objects”. The adult and the child are sharing the same commentary on the world as perceived. But the fact that two people experience the same thing under the same circumstances does not imply that that experience is “out there” in the environment.
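The database-design point in Remark 2.4.4 (some attributes determine identity, others may change freely) can be illustrated with a small sketch. The class name Car, the field names registration and parts, and the helper replace_all_parts are hypothetical choices made only for this illustration; the decision that the registration alone carries identity is an assumption of the modeller, not a fact about cars.

```python
from dataclasses import dataclass, field


@dataclass
class Car:
    # Identity attribute (an assumed modelling choice): equality uses only this.
    registration: str
    # Changeable attributes: excluded from equality via compare=False.
    parts: dict = field(default_factory=dict, compare=False)


def replace_all_parts(car, new_parts):
    """Return a record with the same identity key but entirely new parts."""
    return Car(car.registration, dict(new_parts))


car_then = Car("ABC-123", {"engine": "E1", "wheels": "W1"})
car_now = replace_all_parts(car_then, {"engine": "E2", "wheels": "W2"})

# The model says "same car" although no attribute value survived.
same_identity = car_then == car_now
same_state = car_then.parts == car_now.parts
print(same_identity, same_state)
```

Under this choice the two records compare as the same object even though every part differs. Moving parts into the comparison would give the opposite answer, which shows that the identity lives in the model rather than in the thing modelled.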
The difficulty of programming computers to perform these tasks shows that even counting to 1 is non-trivial. One may think of the process of counting collections of objects as a kind of ADC (Analogue Digital Conversion). The engineering of systems which perform ADC is a well-developed art, but there are always boundary cases which are difficult to convert. Mathematics and logic are performed on the converted perceptions, not on real things themselves. In particular, the geometrical concept of a “point with no extent” is a “digital” concept which exists only in the mind, but this concept is the result of conversion of “analogue” observations. Likewise, the “digital” or “discrete” nature of human language requires some sort of ADC to convert fuzzy ideas inside minds into particulate language.
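The ADC analogy can be made concrete with a minimal sketch. It assumes, purely for illustration, that each candidate object comes with an analogue “evidence” score between 0 and 1, and that two thresholds (the hypothetical parameters low and high) separate clear non-objects, clear objects and boundary cases. The function name count_objects and the threshold values are not from the text.

```python
def count_objects(scores, low=0.3, high=0.7):
    """Convert analogue evidence scores into a discrete count.

    Returns (count, ambiguous): count is the number of scores at or
    above `high`; ambiguous lists the indices of scores strictly
    between `low` and `high`, which cannot be cleanly converted.
    """
    count = 0
    ambiguous = []
    for i, s in enumerate(scores):
        if s >= high:
            count += 1           # clearly an object
        elif s > low:
            ambiguous.append(i)  # boundary case: neither 0 nor 1
        # scores at or below `low` are treated as clearly not an object
    return count, ambiguous


# Hypothetical scores for four candidate "cows" in a field.
scores = [0.95, 0.88, 0.50, 0.05]
count, ambiguous = count_objects(scores)
print(count, ambiguous)
```

The count produced is a property of the conversion rule as much as of the field: changing the thresholds changes the number, and the boundary cases never disappear entirely.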
2.4.5 Remark: Descartes and Hermite supported the Platonic ontology for mathematics. Descartes and Hermite supported the Platonic Forms view of mathematics. Bell [190], page 457, published the following comment in 1937. Hermite’s number-mysticism is harmless enough and it is one of those personal things on which argument is futile. Briefly, Hermite believed that numbers have an existence of their own above all control by human beings. Mathematicians, he thought, are permitted now and then to catch glimpses of the superhuman harmonies regulating this ethereal realm of numerical existence, just as the great geniuses of ethics and morals have sometimes claimed to have visioned the celestial perfections of the Kingdom of Heaven. It is probably right to say that no reputable mathematician today who has paid any attention to what has been done in the past fifty years (especially the last twenty five) in attempting to understand the nature of mathematics and the processes of mathematical reasoning would agree with the mystical Hermite. Whether this modern skepticism regarding the other-worldliness of mathematics is a gain or a loss over Hermite’s creed must be left to the taste of the reader. What is now almost universally held by competent judges to be the wrong view of “mathematical existence” was so admirably expressed by Descartes in his theory of the eternal triangle that it may be quoted here as an epitome of Hermite’s mystical beliefs.
2.4.6 Remark: Bertrand Russell abandoned the Platonic mathematics ontology. Bertrand Russell once believed the ideal Forms ontology, but later rejected it. Bell [189], page 564, said the following about Bertrand Russell’s change of mind.
In the second edition (1938) of the Principles, he recorded one such change which is of particular interest to mathematicians. Having recalled the influence of Pythagorean numerology on all subsequent philosophy and mathematics, Russell states that when he wrote the Principles, most of it in 1900, he “shared Frege’s belief in the Platonic reality of numbers, which, in my imagination, peopled the timeless realm of Being. It was a comforting faith, which I later abandoned with regret.”
There is no need to find a location for mathematics other than the human mind. Perfectly circular circles, perfectly straight lines and zero-width points all exist in the human mind, which is an adequate location for all mathematical concepts.
2.4.7 Remark: The location of mathematics is the human brain. Lakoff/Núñez [172], page 33, has the following comment on the location of mathematics.
Ideas do not float abstractly in the world. Ideas can be created only by, and instantiated only in, brains. Particular ideas have to be generated by neural structures in brains, and in order for that to happen, exactly the right kind of neural processes must take place in the brain’s neural circuitry.
Lakoff/Núñez [172], page 9, has the following similar comment.
Mathematics as we know it is human mathematics, a product of the human mind. Where does mathematics come from? It comes from us! We create it, but it is not arbitrary—not a mere historically contingent social construction. What makes mathematics nonarbitrary is that it uses the basic conceptual mechanisms of the embodied human mind as it has evolved in the real world.
Mathematics is a product of the neural capacities of our brains, the nature of our bodies, our evolution, our environment, and our long social and cultural history.
“I imagine a triangle, although perhaps such a figure does not exist and never has existed anywhere in the world outside my thought. Nevertheless this figure has a certain nature, or form, or determinate essence which is immutable or eternal, which I have not invented and which in no way depends on my mind. This is evident from the fact that I can demonstrate various properties of this triangle, for example that the sum of its three interior angles is equal to two right angles, that the greatest angle is opposite the greatest side, and so forth. Whether I desire it or not, I recognize very clearly and convincingly that these properties are in the triangle although I have never thought about them before, and even if this is the first time I have imagined a triangle. Nevertheless no one can say that I have invented or imagined them.” Transposed to such simple “eternal verities” as 1 + 2 = 3, 2 + 2 = 4, Descartes’ everlasting geometry becomes Hermite’s superhuman arithmetic.
2.5. Sets as parameters for socio-mathematical network communications
2.5.1 Remark: Ontological questions disappear in the anthropological viewpoint. The “socio-mathematical network” idea referred to in this section is essentially an anthropological view of mathematics. We can study human mathematical behaviour with the same mind-set as when studying animal communication behaviour. In this perspective, most philosophical questions about meanings of mathematical concepts disappear. Within the anthropological perspective, mathematical concepts exist only in models which are built to explain the observed behaviour. In other words, we seek models to explain the externally observed behaviour whereas in the more abstract philosophy of mathematics, one searches for external objects with which the mathematical concepts can be associated.
2.5.2 Remark: Set theory and logic are only languages, not the real stuff of mathematics. The principal assertion of this section is that set theory and mathematical logic are merely languages for the communication of mathematical ideas rather than the true “stuff” of mathematics. Mathematical mind states and activity are the true stuff of mathematics. A mathematical definition of a system such as a tensor space is not really a definition of the system. It is merely a set of construction instructions and interoperability tests which characterize the system to be constructed.
2.5.3 Remark: Ontology of mathematics based on mind-states in a socio-mathematical network. In this section, a model is sketched for an informal ontology of mathematics. According to this “socio-mathematical network” model, mathematical entities (such as numbers and sets) are identified with retrievable data-items in mind-states which result from a process of synchronization of a network of mathematical thinkers. The notations of mathematics are parameters of a communications protocol between participants in the network.
The network protocol also includes indications of object classes.
Whether or not real-life mathematics is done literally in this fashion is not important. An ontological model is only an agreed “reference model” which acts as a target for the semantics of a language.
2.5.4 Remark: There is no standard notation for ontological classes of mathematical objects. As mentioned in Section 2.6, sets alone are not sufficient to convey the full meaning of mathematical entities. Each mathematical object represented by a set must be explicitly or implicitly tagged within each context with an indication of the class to which the object belongs. This raises the difficult question of how to communicate classes of objects. A language mechanism is required, for example, to indicate the difference between a left transformation group and a right transformation group which have identical set representations. (This clash is mentioned in Remarks 5.16.6 and 34.8.8.) How is the difference between the ordered pair (0, 1) ∈ ℝ² and the complex number i = (0, 1) ∈ ℂ to be indicated? Should there be an international standards body to maintain a register of mathematical object classes? Should there be a standardized notation for mathematical class ontology? How would the meaning of such a notation be communicated and standardized?
[ Maybe category theory gives a standardized notation for a large number of mathematical object classes? ]
At present there is (probably) no international standard notation for mathematical class ontology. Any such notation would need to be boot-strapped with recursively lower-level concepts until some sort of “a-priori knowledge” level is reached. The central claim which is made in this section is that mathematical object classes, which provide the missing semantic content for set constructions, are currently standardized informally by means of socio-mathematical network synchronization processes. There is nothing very unusual about these processes. Abstract concepts such as beauty, truth, pain, pleasure, hope and despair are standardized by means of informal social network communication through immersion in the culture.
2.5.5 Remark: Indication of word categories in ancient Egyptian hieroglyphics. It is perhaps noteworthy that in ancient Egyptian hieroglyphics, concrete words were represented by pictures of the things referred to whereas abstract words were written as phonetic symbols followed by a special category symbol (a rolled-up papyrus) which indicated abstract nouns. (For Egyptian abstract determinatives, see [197], pages 5–6; [214], pages 80–83.) The papyrus symbol following a word signified that the idea could not be shown in a picture, but only written in terms of its pronunciation on papyrus. Dogs can be drawn,
but mental states are abstract, invisible entities which can be referred to only by words (for example on papyrus). Some modern languages, such as Vietnamese, have a similar system of category-words which precede ordinary words to disambiguate them. In English, the categories of words are indicated informally. In mathematics, the category is usually indicated informally also.
2.5.6 Remark: Understanding mathematics requires prior capability to represent the ideas. It is an interesting question whether an alien species from another galaxy could decipher and really understand our mathematics written on Earth. Probably alien minds would require similar internal processes to humans in order to understand mathematical texts as anything more than a formal system. The inability to see the similarity between 4 chickens and 4 asteroids (i.e. the concept of cardinal numbers) would probably exclude any understanding of human mathematics. Mathematical notations must stimulate in the human reader some pre-existing capability to internally represent the ideas indicated. Integers on paper must evoke some kind of “essence of integers”. Otherwise mathematics is as meaningless as ancient Minoan tablets. Harmonization of such “essences” requires a social network to make words and symbols correspond to commonly agreed meanings. This process is as reliable, and unreliable, as social agreement on the meanings of natural languages.
2.5.7 Remark: The difference between internal representations and communication languages. Figure 2.5.1 illustrates the role of set theory as a communication protocol between mathematicians, as opposed to a language for referring to mathematical objects. Set theory plays a role similar to the Internet protocols for communicating between computers.
Figure 2.5.1 Commutative diagram for socio-mathematical synchronization [ diagram not reproduced: mind A (3′ + 1′, 2′ + 2′, 4′) and mind B (3′′ + 1′′, 2′′ + 2′′, 4′′) are linked by the set theory protocol (3 + 1, 2 + 2, 4), with arrows labelled construction, rule test, problem and answer ]
Mathematical language steers the reader or listener towards the thoughts which the writer or speaker is trying to communicate. Mathematical language does not describe the destination conclusions, but rather the navigation directions to get to those conclusions. Thus mathematics is more a “steering language” than a “location language”. In other words, mathematical language sends “difference signals” to alter the reader’s or listener’s thoughts rather than absolute signals. It follows that the sender and receiver of this language must have agreed starting points.
By comparison, the languages used for storing data inside computers are quite different to network protocol languages. For example, C++ and Fortran are programming languages which are used inside a single computer, whereas HTTP and SNMP are network languages for communication between computers “on the wire”. The programming languages specify data in a fairly static way whereas the network languages consist of a sequence of instructions which direct the recipient to dynamically change state in some way. The “state transition rules” for network languages steer the receiving computer towards a specified state.
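The need for agreed starting points can be illustrated with a minimal sketch. The function name `steer` and the use of plain integers as "states" are assumptions made only for this illustration: a steering message carries differences rather than a destination, so the receiver arrives at the sender's destination only if both started in the same state.

```python
def steer(start, differences):
    """Apply a sequence of "difference signals" to a starting state."""
    state = start
    for step in differences:
        state += step
    return state


# Steering message: it says "add one", not "the answer is 4".
message = [+1]

# The sender starts at 3 and arrives at 4.
sender_result = steer(3, message)

# A receiver with the agreed starting point arrives at the same state,
receiver_synchronized = steer(3, message)
# but a receiver with a different starting point ends up elsewhere.
receiver_unsynchronized = steer(2, message)
```

An absolute ("location") message would simply transmit the value of `sender_result` and would not depend on the receiver's prior state.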
2.5.8 Remark: Synchronization of different implementations of the integers. Consider the simple situation illustrated in Figure 2.5.1 where two minds A and B have already established an agreement on the nature and usage of the integers 1, 2 and 3. Mind A has internal representations 1′, 2′ and 3′. Mind B has internal representations 1′′, 2′′ and 3′′. Suppose mind A wishes to introduce the concept of the number 4 into the discussion. Then mind A could send a definition of 4 to mind B as 4 = 3 + 1. Although mind B has a different internal representation of the numbers 3 and 1 to mind A, the operation of addition may be carried out to obtain 4′′ from 3′′ + 1′′. To verify that the rules of addition hold, mind A may send a test problem to mind B such as “2 + 2 = ?”. If the result of the addition 2′′ + 2′′ is the same as 3′′ + 1′′, the addition rules are preserved. To confirm this to mind A, the internal number 4′′ is translated to the external number 4 and sent back to mind A, which converts this to internal number 4′.
In a more systematic way, mind A could send the complete set of rules for the number system {1, 2, 3, 4} for verification by mind B. If all rules of the system are verified, the two minds have a consistent definition of the system. They can then proceed to each prove theorems about their number systems, confident that all operations will yield the same result at both ends of the communication path.
The number 4 might be represented by a child as a symbol ‘4’ drawn on paper (or a drawing of 4 chickens), by an ancient Roman as ‘IV’, by a mathematician as 3⁺ or {0, 1, 2, 3} or {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}}, and by a computer as a sequence of voltages of the form 00000100 or 11111011.
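The exchange described in Remark 2.5.8 can be imitated by a short sketch. Everything named here is a hypothetical illustration rather than part of the text: `TallyMind` stores integers as strings of strokes, `OrdinalMind` stores them as nested sets, and `solve` plays the role of receiving a problem "on the wire" and returning the answer in the shared external notation.

```python
class TallyMind:
    """Hypothetical mind A: the integer n is stored as a string of n strokes."""

    def decode(self, n):
        return "|" * n

    def encode(self, x):
        return len(x)

    def add(self, x, y):
        return x + y


class OrdinalMind:
    """Hypothetical mind B: the integer n is stored as the set {0, 1, ..., n-1}."""

    def decode(self, n):
        x = frozenset()
        for _ in range(n):
            x = x | {x}          # successor: x ∪ {x}
        return x

    def encode(self, x):
        return len(x)

    def add(self, x, y):
        for _ in range(len(y)):  # apply the successor once per element of y
            x = x | {x}
        return x


def solve(mind, a, b):
    """Receive the problem "a + b" on the wire and send back the answer."""
    return mind.encode(mind.add(mind.decode(a), mind.decode(b)))


mind_a, mind_b = TallyMind(), OrdinalMind()

# Mind A defines 4 = 3 + 1, then sends the test problem 2 + 2.
definition_agrees = solve(mind_a, 3, 1) == solve(mind_b, 3, 1) == 4
rule_test_passes = solve(mind_b, 2, 2) == solve(mind_b, 3, 1)

# The internal representations never coincide; only the wire format does.
representations_differ = mind_a.decode(4) != mind_b.decode(4)
```

Only the external numerals are ever compared between the two minds, which is the sense in which the notation acts as a protocol rather than as the objects themselves.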
2.5.9 Remark: Mind-dependent addition and equality. Perhaps notations such as +′, +′′, =′ and =′′ should be used in Remark 2.5.8 to emphasize that these operations are mind-dependent. These superscripts have been suppressed in the interests of tidiness.
2.5.10 Remark: The phenomena/noumena and language/representation dichotomies are similar. In the case of mathematics, we can see the inter-person network languages, but not the internal representation “languages”. The distinction between phenomena and noumena is similar to the distinction between observable languages and internal representations. Phenomena are the observations of things, which are an interaction between observers and observed things. Noumena are the things themselves, for which we can create speculative models which match the phenomena. In the same way, the languages which are used for mathematical interactions between individuals are directly observable, whereas the internal representations of mathematical objects within individuals cannot be observed, but we can create models (in our own minds) of the representations of mathematical objects inside other people’s minds.
2.5.11 Remark: The definition of equality is the responsibility of individual applications. The difference between the traffic on the mathematics communication channels and the implementation of mathematical concepts in human minds corresponds fairly well to the difference between abstract mathematical logic and concrete mathematical logic as discussed in Remark 4.12.3. For example, when the equation “x = ∅” is written in abstract logic, this means that the concrete object which is pointed to by the variable name x is the same as the object which is pointed to by the constant name ∅. The meaning of equality is the responsibility of the mind which implements the set concepts. For the purposes of the communications channel, it is only important that whatever equality relation is implemented in the concrete logic inside minds must satisfy the abstract requirements (reflexivity etc.) which are assumed at the linguistic level.
When a concrete equality relation =ᶜ is exported from the concrete level inside the mind as an abstract linguistic-level equality relation =ᵃ “on the wire”, it must possess the properties of an equality relation for any choice of map µVk : NV → Vk from the variable abstract name space NV to the concrete object spaces Vk. (See Figure 2.5.2.) What is called “equality” at the abstract linguistic level is probably more accurately called “identity” at the concrete level. (See Remark 4.15.1 for related comments.)
2.5.12 Remark: Synchronization of different internal representations of tensor products. More complex constructions, such as tensor products of linear spaces or tangent spaces on differentiable manifolds, may follow the same basic pattern as for integers. Mind A may construct a tensor product from two linear spaces which both minds have previously synchronized. Then mind A may send a request to mind B to construct tensor products of individual vector pairs by formally juxtaposing them. It may be that mind B already has a physical intuition for what a tensor product might be, but it is not necessary.
Figure 2.5.2 Abstract and concrete equality/identity relations [ diagram not reproduced: the relation X = Y at the abstract linguistic level (marked a) is mapped into object space V1 of mind 1 as µV1(X) =1 µV1(Y) and into object space V2 of mind 2 as µV2(X) =2 µV2(Y) (both marked c) ]
The vector products may be internally represented by mind B as symbolic juxtapositions of pairs of internal representations of vectors. When this has been done by mind B, mind A should send some tests to determine whether mind B has a consistent definition. For example, mind A could send pairs v1 ⊗ v2 and (2v1) ⊗ (0.5v2) for some vectors v1 and v2. Then mind B should obtain the same answer for each tensor product, namely v1′ ⊗ v2′ = (2v1′) ⊗ (0.5v2′). If all of the rules of a tensor product are satisfied, then all relevant properties and relations of the tensor product space will be the same for both minds because the tensor product rules are a complete characterization (as discussed in Section 13.4).
2.5.13 Remark: Axioms are an efficient way of synchronizing infinitely many objects. The kind of synchronization of definitions which is described in Remark 2.5.8 is clearly not feasible for infinite systems, and is quite inefficient even for finite systems. This is why axioms are a good idea. The purpose of axioms is to synchronize an arbitrarily large number of properties and relations among participants in a socio-mathematical discussion. Axioms are typically templates for an infinite number of properties or relations. Axioms require only finite bandwidth to communicate an infinite amount of information. This gets around the finite bandwidth limitation of human communications. If it can be shown that all systems which obey a particular set of axioms are isomorphic in some sense, this builds confidence that two people are essentially talking about the same thing if their systems obey the particular axioms. This gets around the problem that each human may have a different internal representation of each object or class of object.
For the positive integers, the classic axiomatic system is due to Peano. (See for example Definition 7.3.2.) If two people agree on the Peano axioms and the rules of logic, any deduction made from the axioms and rules by one person must (hopefully) be accepted by anyone else who accepts the axioms and rules. If the axioms and rules imply uniqueness of the system of integers up to isomorphism, one can be fairly certain that any theorem proved by one person will be accepted by all people in the socio-mathematical network. This is the entire motivation behind the axiomatization of mathematical systems. The process is similar to the standardization processes for the Internet and for audio cable connectors.
2.5.14 Remark: Jacquard loom programming analogy for mathematics synchronization. The process of socio-mathematical synchronization may be compared to the synchronization of Jacquard loom designs by exchanging punch cards and loom outputs in the post. (The Jacquard loom was invented in 1790 or 1801. See for example Guinness encyclopedia [203], pages 319, 335.) If loom designer A sends a set of programming cards for the loom through the post to loom designer B, who uses the cards to weave some cloth which is sent back through the post, loom designer A may compare the cloth with what is produced locally. If the result is the same, this increases the level of confidence that the two machines work in the same way. One machine may be wooden and the other metallic, but as long as they produce the same output for the same inputs, both loom designers can be confident that whenever they create a set of programming cards for their own machine, it will produce the same cloth on the other machine a long distance away.
The same synchronization process is involved in mathematics, where each mathematician has a high confidence that their deductions will be accepted by other mathematicians, because they all test their own thinking processes against standard axioms and deductive methods. Mathematical theorems are not absolute knowledge any more (or less) than Jacquard loom punch cards are absolute knowledge. Mathematical theorems apply only to the machines, i.e. the minds, which obey the same calculation rules.
2.5.15 Remark: Inference of virtual machines by observing mathematical communications. By observing the communication of mathematics between people, one can infer something about the nature
of the “virtual machines” which participate in the mathematical network. The inferred virtual machines must be able to store information which is received on the communication channel for later transmission. The machines must be able to somehow represent all of the data structures described in the communications and the parameters of those structures must be modifiable in response to incoming information. One can infer that the stored data structures corresponding to observed communications of mathematics must have at least as much information content as the structures which are observed on the channel. What one cannot infer is that the data representation in the mind has a direct relation to the way in which mathematics is written.
2.5.16 Remark: Mathematics is a craft, not a science. It may be concluded that mathematics is not a science but rather a mental and social craft. The mathematics craft requires both psychological processes and social communications. It is impossible to settle questions about the true nature of mathematics without reference to the human mind and social context. Therefore it is useless to ask what the true tangent vector definition is or which set theory axioms are the true axioms. You choose your axioms and your definitions according to your purposes and the community with which you wish to communicate. Therefore mathematical logic may be regarded as a systematic study of a particular range of human psychological and social behaviour.
2.5.17 Remark: The objects of mathematics are part of anthropology. In terms of the diagram which was discussed in Remark 2.0.1, the socio-mathematical network model lies in the discipline of anthropology. Geometric and arithmetic concepts may thus be explained in terms of neuropsychology and neurophysiology. These may in turn be explained in terms of biology, chemistry and physics.
This completes the cycle of dependencies of the seven disciplines because physical theories are expressed in terms of mathematics.
2.5.18 Remark: Axiom systems are abstracted from experience. Axioms are often presented in textbooks as being the starting point for mathematical definitions. However, axioms may also be regarded as merely a summary of experience. That is, the experience comes first and the axioms are derived from experience. Bell [190], page 356, made the following comment regarding the axioms (or “postulates”) of an abstract algebraic field.
Such a set of postulates may be regarded as a distillation of experience. Centuries of working with numbers and getting useful results according to the rules of arithmetic—empirically arrived at—suggested most of the rules embodied in these precise postulates, but once the suggestions of experience are understood, the interpretation (here common arithmetic) furnished by experience is deliberately suppressed or forgotten, and the system defined by the postulates is developed abstractly, on its own merits, by common logic plus mathematical tact.
The unsystematic synchronization of the idea of a field (or any other mathematical system) amongst mathematicians may be replaced by a compact set of rules, which then becomes the basis of socio-mathematical synchronization. Mathematical systems exist first in human experience, and are later formalized by axioms and constructions for efficient communication with others. (The idea that mathematics is “empirically arrived at” may be compared with the Einstein quotation in Remark 2.3.4 and the comments in Remark 2.3.5.)
Although the most basic systems, such as integers and real numbers, are derived from experience, mathematicians often invent axiomatic systems simply out of curiosity or idle recreation. Such systems sometimes develop a connection with real-world experience at a later time in history, but often these invented-on-paper mathematical systems become sterile exercises, like roads leading into the desert, hotels without customers, trains without passengers. This is not different to the situation of natural languages. The words of natural languages are initially closely connected to experience, but speakers and writers may create copious speech or text with no apparent connection to the real world, grammatically correct yet ontologically empty.
2.5.19 Remark: Axiomatization of mathematical systems discards the ontology. When the “real stuff” of mathematics in the human mind is formalized through axioms and constructions for easy, reliable communication and synchronization, the original experience and meaning of mathematical objects is not communicated through the axioms and constructions. The communicable properties and relations of mathematical objects enable correct computations to be made, but do not communicate what the objects are. Therefore it is important to accompany definitions with humanly meaningful explanations
of their significance. The writer usually knows the significance; the reader sometimes does not know and cannot easily guess.
2.5.20 Remark: Behaviourism versus introspection. The behaviourist movement in psychology, which was dominant in the USA in the second half of the 20th century, claimed that human introspection is an improper subject for serious study. They claimed that only human behaviour which is observable by others can be studied scientifically. (One big hole in their argument was the fact that the reports of the observers of the behaviour are as unreliable as the introspective reports by the subjects being studied.) According to this view, the inner experience of individuals must be ignored and a human being is considered to be no more than the sum total of their observable behaviours.
In a similar way, mathematics could be regarded as no more than an observable human behaviour where streams of symbols pass between people. These symbols could be analysed for patterns without understanding their meaning for the participants. This perspective would be as erroneous as the behaviourist perspective in psychology. It would be like regarding all of human history as no more than a series of texts without regard to the world about which the texts were written.
2.5.21 Remark: Academic philosophy of mathematics. Embodied mind theories. In the serious academic philosophy of mathematics, the viewpoint called “embodied mind theories” seems to be very close to the ideas expressed in this chapter. (See Lakoff/Núñez [172] for this viewpoint.) That view asserts that all of mathematics exists in the human brain, and that the instantiation of mathematics may be explained through cognitive science.
2.6. Sets as parameters for classes of objects
2.6.1 Remark: Sets are merely “secondary keys” for mathematical objects. In this section, it is proposed that sets are merely parameters for classes of mathematical objects, not the objects themselves. The question of what mathematical objects are, and where they reside, is discussed in Section 2.5. Even if we accept that mathematical objects reside in human minds, it does not immediately follow that sets are the primary objects. In fact, it is more likely that classes are primary, and sets provide merely a “secondary key” (in the sense of computer databases) which tells you which object in a class is being indicated. The reason for this proposal is essentially the idea that there are not enough sets to comfortably describe all of the objects which are discussed in mathematics.
Maybe this section is stating the obvious, but the author thought for a long time that sets were the primary objects of mathematics. However, although the map from mathematical objects to sets may be well-defined, the inverse map from sets to mathematical objects is highly non-unique. This is why “tagging” of sets by class and/or context is required.
A student could quite reasonably suppose that sets are the primary mathematical objects because teachers so often make statements like: “We define a group to be an ordered pair (G, σ) where G is a set and the function σ : G × G → G has the following properties.” When a teacher defines a concept to be a set, they are just not stating the “obvious” fact that the definition is context-dependent. Human communication would be very inefficient indeed if the full context was made explicit in all statements.
2.6.2 Remark: Mathematics without semantics is literally meaningless. Just as physical containers (like boxes, cups, saucepans and houses) are very useful in everyday life, but are not so useful without other things to put inside them, so also sets are very useful in mathematics, but trying to build all of mathematics with pure sets only is somewhat extreme. Yes, it can be done. But the semantic content of sets is very limited indeed. Mathematics without semantics is literally meaningless. Giving meaning to mathematical definitions requires something more than just containers. Meaning is normally communicated through informal discussions which frame the set-based definitions. But perhaps definitions can be upgraded and enhanced a little to communicate some semantics themselves from time to time. One should not need to read or listen to a “second channel” to get the semantics of every definition.
The emptiness of pure set definitions suggests that other axiomatically defined systems should be permitted within mathematics. For example, integers and real numbers may be independently axiomatized and introduced into mathematics without being constructed themselves out of sets. Concepts such as tensor
spaces and tangent spaces can be defined axiomatically, and concrete constructions may be imported from any source desired. It should not be forgotten that sets are themselves merely models of some aspects of human perception of the world. Even sets are “imported” into mathematical discussion from “somewhere else”. So it is not a radical step to import other mathematical structures also. The mathematical logic and analysis of tensor spaces and tangent spaces is independent of the source of importation of the structures. That logic and analysis is the primary value in the study of such structures. The particularities of the representation are secondary. Mathematics is more about methods than concrete constructions.
2.6.4 Remark: Ubiquity of the empty set. Consider the example of the empty set ∅. This set is a member of every topology. In a particular topological space (X, T ), the meaning of the element ∅ ∈ T is “the empty subset of X”. In other words, the empty set ∅ indicates which element of a given set T is meant. Consider topologies T1 and T2 for sets X1 and X2 respectively, where X1 ∩ X2 = ∅. The empty set is in both T1 and T2 . So T1 ∩ T2 = {∅} 6= ∅. Intuitively one would consider that T1 ∩ T2 = ∅ because X1 and X2 have no association of any kind. All topologies for all topological spaces contain the empty set as a common element. This is more serious than it seems. Suppose one wishes to define a function f on the set T1 ∪ T2 . Then f must agree on the empty set of each of these two topologies. It is impossible, for example, to define f (Ω1 ) = 1 for Ω1 ∈ T1 and f (Ω2 ) = 2 for Ω2 ∈ T2 . This suggests that the notation “∅” really means “the empty subset of the set under consideration”. In other words, the set should be tagged with the containing set. (And the containing set should be tagged with its object class.) Lakoff/N´ un ˜ez [172], page XIII, makes a related remark: “Why is there a unique empty class and why is it a subclass of all classes? Indeed, why is the empty class a class at all, if it cannot be a class of anything?” 2.6.5 Remark: Ambiguity of zero-valued tangent operators. Context tags are needed. A popular definition of tangent vectors on manifolds has a problem which is similar to the example in Remark 2.6.4. As mentioned in Remark 28.6.1, if each tangent vector v at a point p in a manifold M is represented as a first-order differential operator Dv on the space of continuously differentiable functions f : M → IR, it transpires that the zero vector v = 0 is represented as exactly the same operator Dv at all points p ∈ M . Many authors ignore this inconvenient fact, which implies that the tangent spaces at all points of a manifold have a common element. 
This is an unfortunate by-product of the set construction used for the representation. The zero operator D0 at each point p ∈ M really means “the zero vector in the tangent space at p”. The set construction (i.e. the differential operator) parametrizes the elements of each per-point tangent space, but the set construction on its own is insufficient to convey the intended meaning. 2.6.6 Remark: Clashes between objects represented as tuples. As another example, the set A = {(0, 1)} (where 0 and 1 denote the usual real numbers) may represent where i is the purely imaginary complex number, or a set of real-number pairs A ⊆ IR2 , or A {i} ⊆ may represent a partially defined function from IR to IR which maps 0 to 1. In practice, these kinds of set construction clashes must be resolved by implicit tagging within each context. Functions are usually thought of as a different category of object to sets, and yet for reasons of economy, functions are represented as sets. The sense that a function is some sort of active machine producing outputs
2.6.3 Remark: Sets provide a parameter space for classes of mathematical objects. Sets may be regarded as concept parameters rather than the concepts themselves. A set indicates which object is intended from among a given class of objects. A set on its own is not a full mathematical object. Each object can only be fully specified if its class is indicated. The category of all sets may be regarded as a huge parameter space for the concepts that mathematicians wish to discuss and think about. Just as mathematics provides tools for modelling the physical world, so also set theory provides tools for modelling mathematical thinking. There is more content in mathematical thinking than is captured in set constructions. Like names of people, sets communicate which objects are intended, but do not provide a complete description. Set theory provides a more or less adequate system for pointing to mathematical objects, but it does not answer the question of what those objects really are. Sets are no more the real concepts of mathematics than English-language words are the real things of the world in which we live. (A related comment is made in Remark 2.3.5.)
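The idea that a set is only a parameter within a class can be tried out on the empty-set example of Remark 2.6.4. The following Python sketch is an illustration only: the two small topologies, the dictionary names and the tagging by containing set are assumptions chosen for the demonstration, not constructions taken from this book.

```python
# Two topologies on disjoint underlying sets, modelled with frozensets.
EMPTY = frozenset()
X1 = frozenset({"a", "b"})
X2 = frozenset({1, 2})
T1 = frozenset({EMPTY, X1})   # trivial topology on X1
T2 = frozenset({EMPTY, X2})   # trivial topology on X2

# X1 and X2 are disjoint, yet the two topologies share the element EMPTY.
common_open_sets = T1 & T2

# Attempt to define f = 1 on T1 and f = 2 on T2 as one untagged function.
untagged_f = {}
for open_set in T1:
    untagged_f[open_set] = 1
for open_set in T2:
    untagged_f[open_set] = 2   # silently overwrites the value at EMPTY

# Tagging each open set with its containing set removes the clash.
tagged_f = {}
for open_set in T1:
    tagged_f[(X1, open_set)] = 1
for open_set in T2:
    tagged_f[(X2, open_set)] = 2

print(common_open_sets == frozenset({EMPTY}))         # True
print(untagged_f[EMPTY])                              # 2
print(tagged_f[(X1, EMPTY)], tagged_f[(X2, EMPTY)])   # 1 2
```

The untagged dictionary has only three keys instead of four, which is the computational form of the statement that f must agree on the empty set of both topologies.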
from given inputs, whereas a set is a passive object, must be provided by the person who thinks about the object. Within a particular context, it may be clear whether a set A is intended to be a set or a function, but sometimes it is not, and sometimes there is a clash between identical sets which have different meanings within the same context.
2.6.7 Remark: Pathological clash of definitions for ordinal numbers. Consider the set of integers. The integers 0, 1 and 2 may be defined as ∅, {∅} and {∅, {∅}} respectively. (See Section 7.2.) The idea that the number 1 is a subset of the number 2 seems a little absurd. The best way to think of this set representation of the integers is that it is a parametrization within a limited scope. Otherwise the topological space (∅, {∅}) on the empty set would be identical to the pair (0, 1), which would be perplexing. Clearly the integer 1 and the set {∅} are objects of two different classes, which just happen to be parametrized by the same set within each class. Similarly, consider that an ordered pair such as (0, 1) is defined as {{0}, {0, 1}}. So (0, 1) = {{∅}, {∅, {∅}}}, which also seems absurd. A danger of defining integers to be sets is that if one, for instance, wishes to deal with sets of the form {A, b} such that A ⊆ ℝ and b is an integer, then a set such as {∅, 0} could arise. But this would then satisfy {∅, 0} = {∅} = {0} = 1. So a set which appears to have two elements in fact has only one. When all objects are represented as sets, there is always a danger that these objects will clash. This suggests the idea of “tagging” sets with a category label of some sort. Halmos [159], page 35, refers to such clashes of definitions as “pathological”.
2.6.8 Remark: Disambiguation of sets by implicit class tags in each usage context. The idea of using sets as parameters of objects within mathematical classes is outlined in Section 5.16. In such a framework, each mathematical object is indicated by a class/set pair.
One way to subtly clarify the class to which a set belongs is to refer to it as, for example, “the complex number (0, 1)” or “the ordered pair (0, 1)”. Alternatively, one may write “(0, 1) ∈ ℂ” or “(0, 1) ∈ ℝ × ℝ”. This kind of implicit class tagging is mostly adopted in this book (as in most mathematics books). It is important not to forget that the real-number pair (0, 1) is not a complex number on its own. Explicit tagging would look more like Complex(0, 1) or Real-tuple(0, 1), which would be tedious to write and to read.
2.6.9 Remark: Inheritance of implicit class tags by constructed mathematical objects. The implicit class tags which are described in Remark 2.6.8 for mathematical objects are presumably inherited by objects which are constructed out of them. The two component sets of the cross product ℂ × ℂ still have the quality of being complex numbers even in the cross product, and the cross product ℝ² × ℝ² similarly inherits the implicit class of each set ℝ². Otherwise the sets ℂ × ℂ and ℝ² × ℝ² would be the same. If complex tuples and ℝ² tuples are given explicit tags as in the examples Complex-tuple((0, 1), (0, 2)) or Real-tuple-tuple((0, 1), (0, 2)), one would expect the first-component projections to be Complex(0, 1) or Real-tuple(0, 1) respectively. This suggests that the original tuple-tuples should have been formalized as Complex-tuple(Complex(0, 1), Complex(0, 2)) or Real-tuple-tuple(Real-tuple(0, 1), Real-tuple(0, 2)). This kind of systematic formalization of explicit class tags becomes very tedious very quickly. A similar formalization is routinely implemented in computer programming languages where ambiguity must be avoided. But even in mature “strongly typed” programming languages such as C++, ambiguities and surprising outcomes are often encountered despite serious attempts at standardization. In the case of mathematics, it is important to simply be aware of the pitfalls of “untyped” objects.
As mentioned in Remark 1.4.7, it is important to be able to identify in which space (i.e. class) each mathematical object “lives”. Whenever confusion arises, it is useful to be able to determine the class of every sub-expression of every expression.
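The clashes of Remark 2.6.7 and the explicit tags of Remark 2.6.8 can both be checked mechanically. The Python sketch below is only an illustration: frozensets stand in for pure sets, and the classes Complex and RealTuple are hypothetical stand-ins for the explicit tags Complex(0, 1) and Real-tuple(0, 1), not definitions from this book.

```python
from dataclasses import dataclass

# Von Neumann representation of 0, 1 and 2 as pure sets.
zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})

def kuratowski_pair(a, b):
    """Ordered pair (a, b) represented as the set {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})

# Clashes between untagged set representations.
mixed = frozenset({frozenset(), zero})   # the set {∅, 0} of Remark 2.6.7
pair_clash = kuratowski_pair(zero, one) == frozenset({one, two})
space_clash = (frozenset(), frozenset({frozenset()})) == (zero, one)

# Explicit class tags (hypothetical names): same data, different classes.
@dataclass(frozen=True)
class Complex:
    re: float
    im: float

@dataclass(frozen=True)
class RealTuple:
    first: float
    second: float

print(len(mixed), mixed == one)           # 1 True
print(pair_clash, space_clash)            # True True
print((0, 1) == (0, 1))                   # True: untagged pairs coincide
print(Complex(0, 1) == RealTuple(0, 1))   # False: the tags differ
```

The tagged objects compare as unequal even though they are parametrized by the same pair of numbers, which is the behaviour that implicit class tags are meant to supply in ordinary mathematical writing.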
2.7. Extraneous properties of set-constructions in definitions
2.7.1 Remark: Definitions ensure that people are talking about the same thing. In Section 2.5, it is asserted that mathematical objects are not located on paper (as written symbols) and are not located in some mysterious, mystical, metaphysical, mathematical universe where everything is perfect. Mathematical objects are located inside minds. This ontological picture has implications for how mathematical objects should be defined. This is important in a definitions book such as this.
Mathematical definitions are part of the communications language which may be observed on a socio-mathematical network whereas the real “stuff” of mathematics resides inside the minds of the participants. In this view, definitions are merely signals whose purpose is to ensure that participants in a network are talking about the same thing.
2.7.2 Remark: Internal representations of mathematical objects should not transmit extraneous properties. We may consider two mathematical objects within two different minds to be “the same thing” if all of their properties are identical. This raises the issue of extraneous properties. Sometimes two objects may have identical properties as required by the particular context of a discussion, but they may have additional properties which are different. For example, the number 5 may have the colour green inside one mind and a bright yellow colour in another mind. There is little that can be done to prevent two minds from giving their mathematical object representations extraneous properties which are not specified by definitions communicated over the network. However, it is possible to prevent such extraneous properties from entering into the discussion by listing very precisely those properties which are relevant to the context. Such a listing of relevant properties is similar to an axiomatic specification. One of the purposes of axiomatization is to ensure that a discussion is not confused by the introduction of extraneous properties.
As a practical example, two participants may agree that the real numbers 3 and π satisfy 3 < π, but one participant may claim that 3 ⊊ π whereas the other one claims that 3 ⊋ π. (See Notation 5.1.12.) These set inclusion relations are irrelevant and mutually contradictory, but two different kinds of Dedekind cut definitions do in fact have these properties. (See Section 8.3 for Dedekind cuts.)
Clearly this kind of intrusion of contradictory extraneous properties into a discussion must be excluded. The simplest way to avoid such contradictions is to ban all extraneous properties.
A second cause of extraneous properties is the co-existence on a socio-mathematical network of multiple alternative set-constructions for objects. For example, some network participants may use a Dedekind cut of the form {q ∈ ℚ; q³ < n} to represent the real number ∛n for integers n ∈ ℤ on the network whereas other participants may use a Dedekind cut of the form {q ∈ ℚ; q³ > n} for ∛n on the network. The first representation has the extraneous property that n₁ < n₂ ⇔ ∛n₁ ⊊ ∛n₂. The second representation has the contradictory extraneous property that n₁ < n₂ ⇔ ∛n₁ ⊋ ∛n₂.
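The two contradictory inclusion properties of the lower-cut and upper-cut representations of cube roots can be observed on a finite sample of rational numbers. This Python sketch is a rough illustration under a stated assumption: a genuine Dedekind cut is an infinite set, so the finite set SAMPLE below only approximates the set of rationals, and the function names are invented for the example.

```python
from fractions import Fraction

# Finite sample of rationals in [-3, 3] standing in for the rationals.
# (Assumption: a real Dedekind cut is infinite; this is only a window.)
SAMPLE = frozenset(
    Fraction(p, q) for q in range(1, 8) for p in range(-3 * q, 3 * q + 1)
)

def lower_cut(n):
    """Sampled version of the cut {q; q^3 < n} for the cube root of n."""
    return frozenset(q for q in SAMPLE if q ** 3 < n)

def upper_cut(n):
    """Sampled version of the cut {q; q^3 > n} for the cube root of n."""
    return frozenset(q for q in SAMPLE if q ** 3 > n)

# Compare the representations of the cube roots of 2 and 3.
lower_2, lower_3 = lower_cut(2), lower_cut(3)
upper_2, upper_3 = upper_cut(2), upper_cut(3)

print(lower_2 < lower_3)   # True: proper subset in the first representation
print(upper_2 > upper_3)   # True: proper superset in the second
print(upper_2 < upper_3)   # False: the first property fails for upper cuts
```

Both representations agree on the order relation between the two cube roots, but they disagree on set inclusion, because the rational 4/3 lies in the lower cut for 3 and in the upper cut for 2 only.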
This kind of extraneous property is not a leakage of internal representation properties into the discussion. In this case, the extraneous properties arise from the use of multiple definitions “on the wire”. The above two example definitions provide all intended properties of real numbers such as algebraic operations, order and completeness, but they also provide extraneous properties.
2.7.4 Remark: How to manage extraneous properties of constructional definitions. One may avoid the extraneous properties that arise from a diversity of set-construction representations on the network by simply having no set-construction definitions at all. One could exclusively use axiomatic definitions. However, there are at least two good reasons to make use of set constructions instead of axioms for some definitions.
(i) Some set-construction definitions are precisely what is intended. In other words, all properties of the construction are intended. An example is the representation of real number intervals [r₁, r₂] as sets {x ∈ ℝ; r₁ ≤ x ∧ x ≤ r₂} for r₁, r₂ ∈ ℝ with r₁ ≤ r₂. Absolutely all properties of this representation are intended. (There are alternative representations which have both the intended properties and some unintended extraneous properties.) Real-number intervals could be defined axiomatically, but this would require some tedious work to define the intervals and to verify that the axioms satisfy existence and uniqueness.
(ii) Even when a set construction has undesired extraneous properties, the set construction may have many advantages. For example, a specific construction may have the advantage of automatically satisfying existence and uniqueness whereas an axiomatic definition might require substantial work to demonstrate
2.7.3 Remark: Constructional definitions on the network should not have extraneous properties. The danger of extraneous properties mentioned in Remark 2.7.2 arises because of differing internal representations of mathematical objects. Such differences cannot be avoided. One can only prevent extraneous properties of internal representations from appearing on the communications channel.
existence and uniqueness up to isomorphism. A specific construction may also be useful for visualization or intuitive understanding of the objects. For example, linear spaces of real n-tuples may be represented as functions f : ℕₙ → ℝ, but then (r₁, r₂) = {(1, r₁), (2, r₂)} and (r₁, r₂, r₃) = {(1, r₁), (2, r₂), (3, r₃)} for r₁, r₂, r₃ ∈ ℝ so that (r₁, r₂) ⊆ (r₁, r₂, r₃), which is (most likely) an undesired extraneous property.
In case (i), there is no danger of extraneous properties. So there is no reason to avoid such representations. In case (ii), extraneous properties are typically disregarded in an informal way, but the disregard sometimes fails. It is difficult to resolve this problem. For example, the non-negative integers may be represented as ∅ for 0 and n ∪ {n} for integers n + 1 with n ≥ 0. (See Remark 7.2.4.) This representation has the extraneous property that n ∈ n + 1 and n ⊆ n + 1 for all n ≥ 0. The reader may be instructed to not apply the relations “∈” and “⊆” to pairs of numbers, but it is difficult to make a complete list of all extraneous properties which must be ignored. Case (ii) should be avoided where possible. If such a set construction is given, a set of axioms should be given also, and any particular construction should be presented as merely a representation for the axiom system, not as the real definition.
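The two extraneous properties mentioned for case (ii), namely inclusion between tuples represented as functions and the relations n ∈ n + 1 and n ⊆ n + 1 for the set representation of non-negative integers, may be verified directly. The following Python sketch assumes that frozensets model pure sets; the function names are hypothetical.

```python
def tuple_as_function(*components):
    """Represent (r1, ..., rn) as the set of pairs {(1, r1), ..., (n, rn)}."""
    return frozenset(enumerate(components, start=1))

def von_neumann(n):
    """Build the set representing n: 0 is the empty set, n + 1 is n ∪ {n}."""
    current = frozenset()
    for _ in range(n):
        current = current | frozenset([current])
    return current

pair = tuple_as_function(2.5, 7.0)
triple = tuple_as_function(2.5, 7.0, -1.0)
three, four = von_neumann(3), von_neumann(4)

print(pair <= triple)   # True: (r1, r2) ⊆ (r1, r2, r3)
print(three in four)    # True: n ∈ n + 1
print(three <= four)    # True: n ⊆ n + 1
```

None of these three relations is part of the intended meaning of tuples or integers, yet each one holds for the chosen representation.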
2.8. Axioms versus constructions for defining mathematical systems
2.8.1 Remark: Axiomatic versus constructional methods for defining mathematical systems. Mathematical systems may be brought into a mathematical communication in two ways.
(1) Axiomatic method: Define a system by its properties. (Examples: Peano natural number axioms, real-number axioms.)
(2) Constructional method: Define a system as a specific set construction from previously defined systems. (Examples: von Neumann construction for ordinal numbers, Dedekind cuts and Cantor construction for real numbers.)
(i) Inconsistent axioms. There are no set structures satisfying the axioms.
(ii) Unique representation of a single system. One and only one set structure satisfies the axioms.
(iii) Multiple representations of a single system. Many set structures satisfy the axioms, but they are all equivalent to each other (for some equivalence relation).
(iv) Class of systems. Many set structures satisfy the axioms, and they are not all equivalent to each other (for some equivalence relation).
In case (ii), one may as well use a direct construction. Case (iii) may be said to characterize all representations of a single defined concept. Such a definition may be referred to as a “characterization” or “metadefinition”. In other words, the defined concept is fixed, but the representation is variable. For example, all systems which satisfy Metadefinition 13.4.1 (for the tensor product of linear spaces) are isomorphic. All such systems are equivalent alternative representations of the tensor product idea.
In case (iv), the axioms define a class of systems. For example, Definition 26.3.1 gives a set of rules for an n-dimensional manifold. For each parameter n ∈ ℤ⁺, there are many equivalence classes of n-dimensional manifolds.
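The four cases (i) to (iv) can be illustrated by brute force on a very small carrier set. The Python sketch below enumerates all 16 binary operations on the two-element set {0, 1}, keeps those which satisfy a given list of axioms, and counts them up to isomorphism. The choice of axiom lists (groups, monoids and two artificial variants) and all function names are assumptions made for this illustration; they are not examples taken from this book.

```python
from itertools import permutations, product

ELEMENTS = (0, 1)

def all_tables():
    """Yield every binary operation on ELEMENTS as a dict (a, b) -> a*b."""
    pairs = list(product(ELEMENTS, repeat=2))
    for values in product(ELEMENTS, repeat=len(pairs)):
        yield dict(zip(pairs, values))

def is_associative(op):
    return all(op[op[a, b], c] == op[a, op[b, c]]
               for a, b, c in product(ELEMENTS, repeat=3))

def identities(op):
    """List the two-sided identity elements of op (at most one exists)."""
    return [e for e in ELEMENTS
            if all(op[e, a] == a and op[a, e] == a for a in ELEMENTS)]

def is_monoid(op):
    return is_associative(op) and bool(identities(op))

def is_group(op):
    if not is_monoid(op):
        return False
    e = identities(op)[0]
    return all(any(op[a, b] == e and op[b, a] == e for b in ELEMENTS)
               for a in ELEMENTS)

def idempotent_group(op):
    """Artificial axioms: a group in which every element satisfies a*a = a."""
    return is_group(op) and all(op[a, a] == a for a in ELEMENTS)

def group_with_identity_zero(op):
    """Artificial axioms: a group whose identity is the element 0."""
    return is_group(op) and identities(op) == [0]

def isomorphic(op1, op2):
    """Test whether some bijection of ELEMENTS carries op1 to op2."""
    for perm in permutations(ELEMENTS):
        f = dict(zip(ELEMENTS, perm))
        if all(f[op1[a, b]] == op2[f[a], f[b]] for a, b in op1):
            return True
    return False

def classify(axioms):
    """Return (number of structures, number of isomorphism classes)."""
    models = [op for op in all_tables() if axioms(op)]
    representatives = []
    for op in models:
        if not any(isomorphic(op, rep) for rep in representatives):
            representatives.append(op)
    return len(models), len(representatives)

print(classify(idempotent_group))           # (0, 0): case (i)
print(classify(group_with_identity_zero))   # (1, 1): case (ii)
print(classify(is_group))                   # (2, 1): case (iii)
print(classify(is_monoid))                  # (4, 2): case (iv)
```

On this carrier set the group axioms behave like a metadefinition, since both structures are isomorphic, whereas the monoid axioms define a class containing two inequivalent systems.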
2.8.3 Remark: Modern and classical axiom systems. Shoenfield [168], page 2, calls axiom systems in Remark 2.8.2 case (iv) “modern axiom systems”, whereas he calls axiom systems in case (ii) “classical axiom systems”. But he comments: “Of course, the difference is not really in the axiom system, but in the intentions of the framer of the system.” He defines an axiom system as “the entire edifice [. . . ], consisting of basic concepts, derived concepts, axioms, and theorems”. A classical axiom system has only a single “universe” of objects whereas a modern axiom system may have many alternative “universes” of objects. (See Shoenfield [168], page 10.)
2.8.2 Remark: Further classification of the axiomatic method of definition. The axiomatic method (1) in Remark 2.8.1 may be further classified as follows.
2.8.4 Remark: Further classification of the constructional method of definition. The constructional method (2) in Remark 2.8.1 may be further classified as follows.
(i) The construction method yields no structures.
(ii) The construction method yields exactly one structure.
(iii) The construction method yields more than one structure.
One might think that a construction method should always yield one and only one structure. However, concepts are very often defined as “the solution” of a set of equations, or as the outcome of some maximization or minimization process. For example, the square root of a real number y is defined as “the solution” x of the equation x² = y. Clearly the number of real solutions may be zero, one or two according to the value of y. Since so many concepts are constructed as the solutions of equations, existence and uniqueness proofs are frequently required if the concept being defined is expected to be single-valued. Such proofs are sometimes extremely difficult to provide. Consequently establishing that a concept is well-defined can sometimes require a lot of hard work. So a mere “definitions book”, for example, might not be so straightforward as expected.
Even set constructions which appear to be direct and simple may be regarded as sets of equations which have solutions. Consider, for example, the empty set x = ∅. This is shorthand for “x; ∀y, y ∉ x”. In other words, the empty set is the unique set x which satisfies ∀y, y ∉ x. It happens that this set x does exist and is unique, but this requires proof. Consider also two-element sets {a, b}. For each a and b, {a, b} is defined to be the solution x of the proposition ∀y, (y ∈ x ⇔ (y = a ∨ y = b)). Just like the empty set, this two-element set requires an existence and uniqueness proof.
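The square root example shows all three cases of the classification at once, since the number of real solutions of x² = y depends on y. A minimal Python sketch follows; the function name is hypothetical, and floating-point numbers are assumed to be an adequate stand-in for real numbers here.

```python
import math

def real_square_roots(y):
    """Return the list of all real solutions x of the equation x*x = y."""
    if y < 0:
        return []            # case (i): the construction yields nothing
    if y == 0:
        return [0.0]         # case (ii): exactly one solution
    root = math.sqrt(y)
    return [-root, root]     # case (iii): more than one solution

for y in (-1, 0, 4):
    print(y, real_square_roots(y))
# -1 []
# 0 [0.0]
# 4 [-2.0, 2.0]
```

A single-valued definition of “the square root” therefore needs an extra selection rule, such as restricting both x and y to the non-negative reals.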
Often, however, it is convenient to define many systems or classes of systems according to a variable parameter. Examples of parametrized single systems are the group of permutations of the set ℤₙ for a given positive integer n, the ring C(X, Y) of continuous functions from X to Y for given topological spaces X and Y, and the function space C^{k,α}(X) for a given non-negative integer k, real number α ∈ (0, 1] and differentiable manifold X. Examples of parametrized classes of systems are the linear spaces over a given field K and the vector fields on a given differentiable manifold M.
It is very often impossible to determine whether the parameters for a definition are really parameters or are in fact part of a single non-parametrized definition. For example, one may define n-dimensional topological manifolds for a given parameter n. But one could equally define topological manifolds with the dimension n as part of the definition. For example, consider the following.
(1) Let n be a non-negative integer. An n-dimensional topological manifold is a Hausdorff space M which is everywhere locally homeomorphic to ℝⁿ.
(2) A topological manifold is a Hausdorff space M which is everywhere locally homeomorphic to ℝⁿ for some non-negative integer n.
In case (1), n is clearly a parameter for an infinite set of definitions. In case (2), n is a dummy variable in the proposition “∃n, M is locally homeomorphic to ℝⁿ”. In the latter case, it must be shown that n is unique if it exists. (In fact, it is not unique if M is the empty topological space, and the proof is non-trivial otherwise.) In practice, definitions of the form (1) and (2) are used interchangeably, although in a formal sense they are very different.
2.8.6 Remark: Set constructions have extraneous properties. Axioms require existence/uniqueness. As mentioned in Section 2.7, the set-construction style of definition has the disadvantage of “extraneous properties”. But axiomatic definitions have the difficulty that existence and uniqueness proofs are often required.
2.8.5 Remark: Parametrized families of systems. Sometimes a single mathematical system, or a single class of systems, is defined (by axioms or by construction) in isolation. Examples of a single system are the integers, the rational numbers, the real numbers and the complex numbers. Examples of a single class of systems are the finite groups, general groups, fields, rings, linear spaces and topological spaces.
A useful way to think about set-construction definitions is that every such set-construction is merely a “canonical representation” of an abstract idea. In other words, any construction which has the same information content and properties may be substituted for the given representation. Equivalence of representations may be defined with the aid of an isomorphism test. This isomorphism is often clear from the context. The provision of a canonical representation avoids the requirement to provide an existence proof (which would be necessary for an axiomatic definition). The uniqueness requirement is then fulfilled by the provision of an isomorphism test, because this actually defines uniqueness for representations of the given concept. In this sense, all set-construction definitions are essentially the same thing as axiomatically defined metadefinitions or characterizations. In the former case, the set of permitted representations is determined by an isomorphism test relative to a canonical representation. In the latter case, the set of permitted representations is determined by a set of required attributes and relations. Consequently one should never be too insistent on the adoption of one definition or another. The important thing is to very precisely communicate the concept which is being presented.
2.8.7 Remark: Mathematical classes which may be defined either axiomatically or constructionally. Some mathematical systems are best defined by axioms; others are best defined as constructions; some systems may be conveniently defined either axiomatically or constructionally; and some systems are inconvenient for both methods of definition. For example, a linear space Kⁿ may be constructed from a field K using ordered n-tuples in a way which is well suited to the vast majority of contexts. (See Remark 2.7.4 part (ii) for an example of extraneous properties of the system Kⁿ.)
For such a system, it is superfluous, but not too difficult, to go through the axiomatic approach. One could, in fact, give a metadefinition for the linear space Kⁿ as an abstract linear space V together with families of canonical linear injections iₖ : K → V and canonical linear projections πₖ : V → K such that πₖ ◦ iₖ = id_K and iₖ ◦ πₖ : V → V is idempotent for all k ∈ ℕₙ. If a few more axioms are added, such a metadefinition will guarantee that all such representations V will be isomorphic to each other and to the usual construction of Kⁿ. In this case, the constructional method is more convenient than the axiomatic method and the extraneous properties are not too difficult to ignore.
The non-negative integers may be defined via axioms as in Definition 7.3.2 (similar to the Peano axioms) or via the von Neumann construction from empty sets as in Remark 7.2.4. Probably the axioms provide the most convenient definition for the non-negative integers, but the construction based on ZF set theory provides some useful insights and is more easily extended to general ordinal numbers. Likewise, the real numbers are perhaps slightly more conveniently defined by axioms rather than construction, but more advanced texts prefer constructions, either in the Dedekind or Cantor style. (See Remark 8.3.4 for comments on these two constructions.)
One may regard the constructional approach as merely a short-cut version of the axiomatic approach because a specific system construction conveniently omits the need for axioms and uniqueness and existence proofs. In fact, the constructional approach is always a special case of the axiomatic approach because one may have axioms which precisely specify the construction method. In this book, the vast majority of definitions use the constructional method. The axiomatic method is used only when specific constructions have significant difficulties (as in the case of tensor products).
In particular, a list of axioms specifying properties and relations is preferable when the mathematics literature contains a proliferation of unsatisfying definitions, as is the case for tangent vectors on manifolds for example. The axiomatic method is essential for systems which are highly non-unique, such as the categories of groups, linear spaces and topological manifolds.
2.8.8 Remark: Particular set-constructions for concepts may be too ugly. The set constructions for some concepts, such as for tensor products and tangent spaces in particular, are quite unintuitive and unconvincing because they are much more complicated and untidy than one’s intuitive ideas about these concepts. (For example, the usual von Neumann ordinal number representation is somewhat ugly. There seem to be no graceful representations for tensor products either.) In such cases, there are often multiple alternative definitions, each of which is more or less suitable for particular applications and audiences. The adoption of an axiomatic definition for a concept often avoids the ugliness of particular representations. If no tidy, intuitively appealing set-construction can be found, an axiomatic definition may be preferable.
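The injections and projections in the metadefinition of Kⁿ in Remark 2.8.7 can be checked against the usual tuple construction. In the Python sketch below, K is taken to be the rational numbers and n = 3. The function names are hypothetical, and the final check (that the idempotent maps sum to the identity on V) is an assumed example of the “few more axioms” mentioned in that remark, not an axiom stated in this book.

```python
from fractions import Fraction

N = 3  # the dimension n; V is modelled by tuples of N rational numbers

def inject(k, x):
    """i_k : K -> V, placing x in slot k (slots are numbered 1 to N)."""
    return tuple(x if j == k else Fraction(0) for j in range(1, N + 1))

def project(k, v):
    """pi_k : V -> K, reading slot k of the tuple v."""
    return v[k - 1]

def slot_map(k, v):
    """The composite i_k after pi_k, which maps V to V."""
    return inject(k, project(k, v))

def add(u, v):
    """Componentwise addition in V."""
    return tuple(a + b for a, b in zip(u, v))

v = (Fraction(1, 2), Fraction(-3), Fraction(7, 5))
x = Fraction(4, 9)
slots = range(1, N + 1)

# pi_k after i_k is the identity on K.
left_inverse = all(project(k, inject(k, x)) == x for k in slots)

# i_k after pi_k is idempotent on V.
idempotent = all(
    slot_map(k, slot_map(k, v)) == slot_map(k, v) for k in slots
)

# Assumed extra axiom: the maps i_k after pi_k sum to the identity on V.
total = (Fraction(0),) * N
for k in slots:
    total = add(total, slot_map(k, v))

print(left_inverse, idempotent, total == v)   # True True True
```

The checks use only the maps inject and project, so any other representation of V which supplies maps with the same behaviour would pass them equally well.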
2.8.9 Remark: Axiomatic systems specify interoperability, not implementation. An axiomatic system for a class of mathematical objects is analogous to interoperability and conformance specifications in various engineering fields. In engineering standardization bodies, it is usual to require standards to specify only interfaces, not implementation. Great importance is attached to interoperability so that implementers of interoperable systems are left free to design concrete representations of the abstract specifications as they wish. The philosophical principle behind this is that there should be open competition to build the best implementations of the systems, while maintaining full interoperability between different brands of systems which do not need to know anything about the internal details of particular system designs. In the mathematics field, it is not merely undesirable to know how other people implement mathematical objects in their own minds; it is actually (currently) impossible to know how mathematical objects are implemented. Therefore mathematical definitions can only ever specify interoperability rules. The entire literature of mathematics specifies only what the communications channels between humans must look like, not what the concepts look like inside human minds. This remark has the important consequence that mathematical definitions do not describe mathematical objects. Definitions only describe communications between human beings about mathematical objects. Consequently, one should not seek deep understanding of mathematical objects by examining definitions, which only describe the contents of communications channels. The same mathematical object in a human mind may be communicated in many different ways. Thus many different definitions may actually refer to the same object. For example, five different definitions of tangent vectors on manifolds might refer to the same mental concept.
2.9. Some general remarks on mathematics and logic
2.9.1 Remark: The vast majority of all possible theorems are uninteresting. Symbolic logic seems to reduce all of mathematical deduction to mechanistic computations which may be automated by a computer. In principle, a computer program may generate all possible theorems in mathematics. However, it is equally true that a computer may generate all possible digital images and all possible natural-language texts. Possessing all (256 × 256 × 256)¹⁰⁰⁰⁰⁰⁰ possible 1000-pixel by 1000-pixel images is of very little practical benefit. Someone must still be employed to find the interesting images. In the same way, the ability to generate all possible theorems is of little value. The real business of mathematics is to find and use the interesting theorems. (See Remark 4.8.1 for a similar comment.) The mechanization of mathematics serves only to check that when someone claims to have discovered an interesting theorem, it can be established with great confidence whether the theorem is true or false.
2.9.2 Remark: Set theory and logic define rules for an artificial universe. Mathematics may be regarded as an artificial universe with its own rules or “equations of motion” or “evolution equations”. Within this artificial universe, models may be built for real-world processes, which is why the rules are chosen as they are. (More accurately, the rules of mathematics are chosen as a compromise between what the human brain is capable of and what the real world applications demand.) A computer language similarly creates an abstract semantic space with its own special rules. Each language is best suited to modelling different kinds of situations and problems. Just as computer languages may be executed on computers, so mathematics may be processed in human minds. The rules of mathematics are not provable. They are just currently the best rules for practical applications.
(The recreational applications of mathematics are a by-product of the useful applications.)
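The size of the image count in Remark 2.9.1 may be estimated with a few lines of arithmetic. This Python sketch assumes 24-bit colour, which is how the factor 256 × 256 × 256 is read here, and it counts decimal digits with logarithms rather than by constructing the number.

```python
import math

PIXELS = 1000 * 1000      # a 1000 by 1000 image
BITS_PER_PIXEL = 24       # 256 * 256 * 256 = 2**24 colours per pixel

# The number of distinct images is (2**24)**PIXELS = 2**total_bits.
total_bits = PIXELS * BITS_PER_PIXEL

# Number of decimal digits of 2**total_bits, computed without forming it.
decimal_digits = math.floor(total_bits * math.log10(2)) + 1

print(total_bits)       # 24000000
print(decimal_digits)   # 7224720
```

The count of possible images therefore has more than seven million decimal digits, which gives some idea of how little is gained by being able to enumerate them all.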
This perspective on mathematical definitions is particularly significant for this book because it is concerned predominantly with definitions as the foreground concern, not just as a necessary prelude to presentation of theorems. In order to choose the best definitions, the author needed to know what is being defined. When it is accepted that only the communication of mathematical systems is being defined, not the systems themselves, it follows that definitions should be generally more axiomatic in character rather than in terms of specified set constructions. In other words, mathematical systems should preferably be characterized rather than constructed . Specific constructions should be regarded as mere “demonstration prototypes” rather than “the only way to do it”. In engineering, “reference implementations” play an analogous role. Any system which is a “drop-in replacement” for the reference implementation is considered to be acceptable.
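The engineering analogy of Remark 2.8.9 can be made concrete for the non-negative integers. In the Python sketch below, the interface Naturals plays the role of an interoperability specification, and two unrelated implementations are each a “drop-in replacement” for the other. The interface, the class names and the choice of operations are assumptions made for this illustration only.

```python
from typing import Any, Protocol

class Naturals(Protocol):
    """Interface only: what a representation must offer, not how."""
    def zero(self) -> Any: ...
    def succ(self, n: Any) -> Any: ...
    def is_zero(self, n: Any) -> bool: ...
    def pred(self, n: Any) -> Any: ...   # only called for non-zero n

class IntNaturals:
    """Implementation by machine integers."""
    def zero(self): return 0
    def succ(self, n): return n + 1
    def is_zero(self, n): return n == 0
    def pred(self, n): return n - 1

class VonNeumannNaturals:
    """Implementation by sets: 0 is the empty set, n + 1 is n ∪ {n}."""
    def zero(self): return frozenset()
    def succ(self, n): return n | frozenset([n])
    def is_zero(self, n): return len(n) == 0
    def pred(self, n): return max(n, key=len)   # the largest element of n

def from_int(nat: Naturals, k: int):
    """Build the representation of k using only the interface."""
    n = nat.zero()
    for _ in range(k):
        n = nat.succ(n)
    return n

def to_int(nat: Naturals, n) -> int:
    """Count predecessor steps down to zero using only the interface."""
    count = 0
    while not nat.is_zero(n):
        n, count = nat.pred(n), count + 1
    return count

def add(nat: Naturals, m, n):
    """Addition written against the interface, not the implementation."""
    while not nat.is_zero(n):
        m, n = nat.succ(m), nat.pred(n)
    return m

for nat in (IntNaturals(), VonNeumannNaturals()):
    total = add(nat, from_int(nat, 2), from_int(nat, 3))
    print(type(nat).__name__, to_int(nat, total))
# IntNaturals 5
# VonNeumannNaturals 5
```

The function add never inspects how a number is stored, so it cannot observe extraneous properties such as n ⊆ n + 1, which exist only inside the second implementation.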
2.9.3 Remark: Acceptance of mathematical concepts varies according to personality and history. There have always been disagreements about the fundamentals of mathematics. For much of history, negative numbers were not accepted, and complex numbers were only generally accepted in relatively recent times. Some people might still not accept induction arguments or infinite sets. Some people might not accept that there are more irrationals than rationals. (See Bell [189], page 566.) At various times in history, irrational numbers, transcendental numbers, non-analytic functions and non-Euclidean geometries were rejected. People must have a model in their own mind which corresponds to the elements of mathematics which they accept. A person to whom a negative number seems absurd will have great difficulties with signed arithmetic, no matter how skilled they are in manipulating symbols according to the prescribed rules.
2.9.4 Remark: Arbitrariness of the rules of mathematics. Arguing about the axioms of mathematics is as important, and unimportant, as arguing about the rules of football. There are many “codes” of football rules which co-exist in the world, but within a single code, and for a particular football game, there must be an agreed set of rules. Mathematics is “all in the mind”. So there is no way to test the validity of any set of rules for mathematics.
2.9.5 Remark: Controversies about infinities in the fundamentals of mathematics. When you study mathematics, you are studying one of the capabilities of the human mind. Most of the controversies in the fundamentals of mathematics are concerned with the credibility or otherwise of infinite concepts. Since the human mind is finite, it is not surprising that disagreements occur in this area, because everything that can be said about infinite things is conjectural. Different opinions on set theory are generally distinguished by differing levels of comfort with concepts which are infinite, which are all ultimately extrapolations from what we can think of to what we can’t think of directly. Unfortunately, nearly all of differential geometry is concerned with things which are infinite.
2.9.6 Remark: Mathematics is an art or craft. Mathematics is an art like music or knitting, taught by boot-strapping from the prior skills and naive concepts of the individual. In other words, mathematics is not pure and absolute knowledge. Mathematics is an abstraction from some rather ancient arts, principally the geometric arts (including surveying, navigation, astronomy and carpentry) and the arithmètic arts (including accounting, chronology, and weights and measures). René Descartes established strong links between the geometric and arithmètic arts. Ultimately the numerical view of geometry triumphed over the geometric view of arithmetic because the arithmètic perspective was the more powerful, although the arbitrariness of the choice of numeric coordinates still annoys physicists and mathematicians. (Analysis and algebra are included in the arithmètic arts since they deal with numbers rather than points and lines, although both analysis and algebra were often couched in geometric language until the 19th century.) One of the constant currents in differential geometry is the antagonism between the numerical and geometric points of view. The frequent clamour for coordinate-free differential geometry is an attempt to return to the good old days when geometry had a purely geometric character. This clash of views seems to be inherent in human nature, since we have both geometric and arithmètic modes of thinking, and the geometric mode is more intuitive and direct.
[ Check the roles of the people mentioned in Remark 2.9.7. ]
2.9.7 Remark: Mathematical deduction may be regarded as a typographic art. Around the turn of the 20th century, there were two important developments in relations between the geometric and arithmètic arts. First, there was the set theory associated with names such as Cantor and Weierstraß. Then there was the symbolic logic basis for mathematics developed by Peano, Frege, Russell, Zermelo, Fraenkel, Bernays, Gödel and others. The set theoretic perspective provided a new basis for geometry and numbers. Then symbolic logic provided a systematic basis for set theory. Symbolic logic may be characterized as a typographic art rather than a science. The marshalling of sequences of symbols on lines and the enforcement of strict rules between sequences of lines are reminiscent of lead typography. Symbolic logic requires prior notions of order of symbols on lines, the order of lines within an argument, sets of symbols (the font cases), and relations between sets of symbols and between lines of symbols. Any treatment of symbolic logic freely uses number concepts such as integer arithmetic, and geometric concepts such as left, right, above and below. A related remark is made by Lakoff/Núñez [172], page 86, as follows.
Our mathematics of calculation and the notation we do it in is chosen for bodily reasons—for ease of cognitive processing and because we have ten fingers and learn to count on them. But our bodies enter into the very idea of a linearly ordered symbolic notation for mathematics. Our writing systems are linear partly because of the linear sweep of our arms and partly because of the linear sweep of our gaze. The very idea of a linear symbol system arises from the peculiar properties of our bodies. And linear symbol systems are at the heart of mathematics. Our linear, positional, polynomial-based notational system is an optimal solution to the constraints placed on us by our bodies (our arms and our gaze), our cognitive limitations (visual perception and attention, memory, parsing ability), and possibilities given by conceptual metaphor.
The phrase “linear, positional, polynomial-based notational system” refers to our system for writing numbers. Thus the decimal notation represents a non-negative integer by the sequence of coefficients $a_n a_{n-1} \ldots a_0$ of a polynomial $\sum_{i=0}^n a_i 10^i$. (See Lakoff/Núñez [172], page 82.)
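The polynomial reading of decimal notation described above can be made concrete with a short sketch. The function names `digits_to_int` and `int_to_digits` are hypothetical, chosen only for this illustration, and the code assumes that digits are listed most significant first, as they are written on the page.

```python
def digits_to_int(digits, base=10):
    """Evaluate sum(a_i * base**i) for digits a_n ... a_0 given most significant first."""
    value = 0
    for a in digits:
        if not 0 <= a < base:
            raise ValueError(f"digit {a} is out of range for base {base}")
        # Horner's rule: multiply the running value by the base, then add the next coefficient.
        value = value * base + a
    return value


def int_to_digits(n, base=10):
    """Return the coefficient sequence a_n ... a_0 of a non-negative integer n."""
    if n < 0:
        raise ValueError("only non-negative integers are represented")
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, a = divmod(n, base)  # a is the lowest-order coefficient still unread
        digits.append(a)
    return digits[::-1]  # reverse so the most significant digit comes first
```

For instance, the coefficient sequence 2, 0, 0, 9 stands for the value of the polynomial $2 \cdot 10^3 + 0 \cdot 10^2 + 0 \cdot 10 + 9$. Changing `base` shows that nothing in the scheme depends on having ten fingers.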
2.9.8 Remark: Medieval university subject levels: trivium and quadrivium. It is interesting to compare the topic layering in Figure 2.1.1 (Remark 2.1.1) with the corresponding layering of university subjects in medieval Europe. These subject layers are illustrated in Figure 2.9.1.
[Figure 2.9.1: Relations between trivium and quadrivium in modern mathematics. Quadrivium: arithmetic, geometry, astronomy, music. Trivium: grammar, logic, rhetoric.]
There are four perspectives for mathematics which can each be defined in terms of the others. These are the geometric, numeric, set theoretic and symbolic logic perspectives. The latter two are identifiable with the “trivium” of ancient Greek and medieval university education, namely grammar, rhetoric and logic, whereas the former two are identifiable with the university “quadrivium”, namely arithmetic, geometry, astronomy and music. In a sense, the systematization of mathematics at the turn of the 20th century redefined the mathematics of the quadrivium (the advanced university subjects) in terms of the trivium (the elementary university subjects).
2.9.9 Remark: Roles and interactions of mathematicians and physicists. The interactions between mathematicians, physicists and reality are roughly illustrated in Figure 2.9.2.
[Figure 2.9.2: Interactions between mathematicians, physicists and reality. Labels: real world, experimental physicist, theoretical physicist, mathematical physicist, mathematician.]
There are endless arguments about the differences between theoretical and mathematical physics. The following is an outline of this author’s current perspective on the issue.
(1) The experimental physicist does experiments, including design of experiments, design of apparatus, operating the apparatus, collecting data, recording data, organizing data, making initial interpretations of data, and publishing the data in a meaningful form. The experimental physicist has some contact with the real world.
(2) The theoretical physicist interprets data which is produced by the experimental physicist by comparing data to what the theories predict, attempting to fit data to the theories, attempting to select among competing theories by eliminating hypotheses which are inconsistent with the data, and suggesting future research projects to help select more effectively between competing theories.
(3) The mathematical physicist studies the theories which theoretical physicists work on and proposes new theories, but is not directly involved in trying to harmonize experimental data with theories. The mathematical physicist studies the mathematical properties of physical theories to determine whether they are self-consistent and to determine what predictions can be made from the theories. Thus the mathematical physicist is concerned with the mathematical machinery of physical models, not with their correspondence to experimental data, but the motivation is to assist the theoretical physicist by checking the validity and broad properties of proposed models, improving the models, and providing tools to better utilize the models.
(4) The mathematician studies mathematical machinery. The mathematician is concerned with the self-consistency of mathematical systems and the consequences and interrelationships of mathematical systems. A large proportion of those systems originate in physics and other sciences. Some mathematical systems, however, could be characterized as of recreational or intellectual interest only. Sometimes mathematicians develop systems out of mere curiosity which turn out unintentionally to be useful for modelling physical systems. Mathematicians are usually concerned with logical self-consistency, minimalist clarity and strict correctness more than mathematical physicists are.
The mathematician (4) is also called a pure mathematician to contrast with the applied mathematician (5). But applied mathematicians are often so concerned with their application areas that they are better grouped together with the application discipline since they use mathematics as a tool rather than developing new mathematics.
(5) The applied mathematician is concerned with practical techniques of solution of mathematical problems more than with the formulation of mathematical frameworks for defining classes of problems. Thus applied mathematicians often develop numerical techniques and algorithms for solving problems approximately for use in real-life applications.
2.9.10 Remark: Mathematics as a tool subservient to the interests of physicists. Bell [190], pages 197–198, makes the following comments on the differences between mathematicians and physicists in the context of the examination of a paper on heat theory by Fourier for the Grand Prize of the French Academy in 1812.
Laplace, Lagrange, and Legendre were the referees. While admitting the novelty and importance of Fourier’s work they pointed out that the mathematical treatment was faulty, leaving much to be desired in the way of rigor. Lagrange himself had discovered special cases of Fourier’s main theorem but had been deterred from proceeding to the general result by the difficulties which he now pointed out. These subtle difficulties were of such a nature that their removal at the time would probably have been impossible. More than a century was to elapse before they were satisfactorily met. In passing it is interesting to observe that this dispute typifies a radical distinction between pure mathematicians and mathematical physicists. The only weapon at the disposal of pure mathematicians is sharp and rigid proof, and unless an alleged theorem can withstand the severest criticism of which its epoch is capable, pure mathematicians have but little use for it. The applied mathematician and the mathematical physicist, on the other hand, are seldom so optimistic as to imagine that the infinite complexity of the physical universe can be described fully by any mathematical theory simple enough to be understood by human beings. Nor do they greatly regret that Airy’s beautiful (or absurd) picture of the universe as a sort of interminable, self-solving system of differential equations has turned out to be an illusion born of mathematical bigotry and Newtonian determinism; they have something more real to appeal to at their own back door—the physical universe itself. They can experiment and check the deductions of their purposely imperfect mathematics against the verdict of experience—which, by the very nature of mathematics, is impossible for a pure mathematician. If their mathematical predictions are contradicted by experience they do not, as a mathematician might, turn their backs on the physical evidence, but throw their mathematical tools away and look for a better kit. This indifference of scientists to mathematics for its own sake is as enraging to one type of pure mathematician as the omission of a doubtful iota subscript is to another type of pedant. The result is that but few pure mathematicians have ever made a significant contribution to science—apart, of course, from inventing many of the tools which scientists find useful (perhaps indispensable).
It would be very surprising indeed if the mathematics of which the human mind is capable was adequate to describe all physical phenomena. The human mind is a small machine. The universe is a very large machine. The human mind is implemented as a biological system which is optimized for many purposes other than doing physics. It is truly surprising that human mathematical modelling is as powerful as it is. However, future progress in physics may require new developments in the fundamentals of mathematics. This justifies the strong emphasis in this book on fundamentals. Overhauling a system requires thorough understanding.
2.10. Dark sets and dark numbers
[ Rewrite Sections 2.10, 2.11 and 2.12 in light of Remark 2.10.1. ]
2.10.1 Remark: Dark sets and numbers are sets and numbers which cannot be named. The issues regarding dark sets and numbers, and grey sets and numbers, and concerns about the validity of concepts like infinity and infinitesimality, become quite easy to resolve in terms of the modelling ontology which is outlined in Section 3.3, Section 4.1, Remark 4.12.3, Remark 5.7.25 and elsewhere. Dark numbers (or pixie numbers) are those numbers which are assumed to exist in a concrete object space which is modelled by an abstract language space. Since the span of the abstract language is limited to a countable set of names, it is impossible to give names to all elements of a presumed uncountable space of concrete objects. Therefore almost all elements of a presumed uncountable concrete space must be “dark”. That is, they cannot be given names.
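The claim that a language supplies only countably many names can be illustrated by listing the names explicitly. The following sketch assumes, for illustration only, that names are finite non-empty strings over a finite alphabet. The functions `names` and `index_of` are hypothetical and are not part of this book's notation. Each name receives exactly one natural-number index, which is what it means for the set of names to be countable.

```python
from itertools import count, islice, product


def names(alphabet):
    """Yield every finite non-empty string over `alphabet`, shortest first."""
    for length in count(1):
        # Within one length, itertools.product follows the order of `alphabet`.
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)


def index_of(name, alphabet):
    """Return the 0-based position of `name` in the enumeration produced by `names`."""
    k = len(alphabet)
    # Number of strictly shorter names: k + k**2 + ... + k**(len(name) - 1).
    shorter = sum(k**j for j in range(1, len(name)))
    # Rank of `name` among the names of the same length, read as a base-k numeral.
    rank = 0
    for ch in name:
        rank = rank * k + alphabet.index(ch)
    return shorter + rank


# The first few names over a two-letter alphabet.
first_names = list(islice(names("ab"), 6))
```

Since every name appears at a finite position in this list, no assignment of names to objects can reach more than countably many objects, however the name map is chosen.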
Since in practice only a finite number of names can ever be written down or thought about, almost all of the non-dark concrete objects will never be given names, and never will be given names under any naming system. Since the naming systems for logical and mathematical objects vary over time and can be newly invented or extended at will, it is not possible to talk about the “largest number which can be named”, because obviously this meta-name may be used in a meta-naming-system to include a number which is even larger. Naming systems for logical and mathematical objects are socially defined and highly dynamic. So the classes of “grey numbers” and “grey sets” are also socially defined and dynamic.
Questions regarding infinity concepts are also quite easy to resolve within the names/objects modelling perspective. The elements of presumed infinite sets in modelled systems cannot all be referred to individually by name, as mentioned above. The concrete systems being modelled do not exist in any physical sense. Mathematical models (such as ZF set theory) which supposedly describe infinite sets simply do not model any real system. The system being modelled in this case is an imaginary system, and models which include infinite sets merely state how such a system would behave if it existed. Since such systems can never be observed in nature (due to the finite bandwidth of human perception if for no other reason), mathematics involving infinities is, in a sense, null and void and to no effect. However, the infinitudes of mathematics are never tested in practice (since such testing is impossible). So the infinities of mathematics are not part of the testable component of models. Only the finite parts of mathematics are ever tested, and these finite parts are often found to be enormously useful. Therefore there is no imperative to discard infinite mathematics on the grounds that infinities can neither be named nor observed.
The principle of mathematical induction requires a name to be assigned to an object in a set, and an induction rule is applied to this named object to show that another named object (its successor) is in the set, which then shows that there is no maximum element in the set. This is what we mean by an infinite set. However, since we cannot name all objects, and we cannot explicitly specify an infinite number of naming maps, it is not possible to validate the application of mathematical induction to concrete systems.
The concept of “the largest namable number in a set” is as circular as “the largest number which can be written in ten words”. When one writes N = “the largest namable number in set X”, one must adjust one’s naming system so that the name N points to the largest namable number in X. Thus the name map from abstract to concrete objects is defined in terms of itself. There is an implicit name-to-object map $\mu_V : N_V \to V$ here which maps an abstract name such as N to a concrete number $\mu_V(N)$. (See Remark 4.12.3 for notations.) This is illustrated in Figure 2.10.1.
[Figure 2.10.1: The naming bottleneck and dark numbers. Labels: abstract name space, names $N_V$, name map $\mu_V$, modelled concrete numbers $V$, named numbers, dark numbers.]
The inability to give names to all individual objects does not contradict the fact that we can give names to uncountable aggregates of objects. The set of real numbers can be named. The set of real numbers between 0 and 1 can be named. We just can’t name all of the members of such sets. We “know” that they are “there”, but we can’t name them! But we only know that they are there because the abstract theory claims that this is so. In fact, the truth is that we will never obtain evidence that they are not there. So it is (potentially) harmless to assume that all of the individual real numbers are “there”. But “there” means in the modelled concrete object space, which we can never experience and never test.
One of the great advantages of the human mind is its ability to imagine things which have not yet been experienced. In this case, the human mind imagines real numbers and ZF set theory, which are actually impossible to experience. As long as there are practical advantages to the development of models of imaginary worlds, there is an incentive to continue to develop them. As mentioned in Remark 3.4.2, there are practical reasons why we should continue to accept our current logic and mathematics. It is in our self-interest to do so. Modern logic and mathematics are not “true” in any sense. They are beneficial to our interests. They just work. There is no incentive to interrogate a goose too closely if it continues to lay golden eggs.
The way in which the “naming bottleneck” restricts our ability to give names to all of the real numbers is analogous to the way in which the use of a microscope limits our ability to view every point on Earth. A name map from a countable set of abstract names to a notional uncountably infinite set of concrete objects is like a microscope for viewing the concrete objects. We can only see a countable number (at most) at any one time. We just don’t have the required bandwidth to see (and name) all of the real numbers at the same time. But we can view a satellite image which shows the whole Earth in aggregate. Mathematical induction, for example, names and examines two objects at a time, which is a microscopic operation on an infinite set.
2.10.2 Remark: Finite minds cannot think about non-finitely-representable things.
In symbolic logic, only a countable number of propositions may be written (or thought about or processed in a computer), even in infinite time, since all propositions contain a finite number of symbols chosen from a countable set. Therefore only a countable number of sets may ever be referred to. This leads to some “issues” when formulating a set theory which deals with uncountable numbers of things. There are bound to be difficulties when finite minds try to contemplate infinite concepts.
2.10.3 Remark: Most sets cannot be talked about individually. The sets which one can explicitly refer to in a set theory are like ants in a garden. If a particular set theory has only a finite number of sets in total, we can give them all names and see them individually, like in a formicarium. (A “formicarium” is like an aquarium, but with ants instead of fish.) When the number of sets is infinite, they may be compared to the ants in a big garden. You can, in principle, see any individual ant, but you can’t see them all. That is, you can see any ant, but you can’t see all the ants. This is an important distinction. This is similar to the observation that you can see any person in the world, but you can’t see all 6 billion people in the world. This does not make us doubt that those people exist. Our lives are just not long enough to visit them all. (A hundred Gregorian years equals only 3,155,695,200 seconds.) The fact that we can visit any one of them at will gives us confidence that the rest do exist. Therefore we can refer to some sets, people or ants as individuals, but the rest must be referred to in aggregate.
Suppose a set theory has an uncountably infinite number of sets. You can only refer to a finite selection of sets from a countably infinite menu of these sets. This kind of set theory is like a garden with a vast number of ants, of which any single ant can be seen and referred to, but there are additionally an uncountably infinite number of pixies which can never, ever be seen or referred to in any individual way. We know that all those pixies are out there, and we can refer to them in aggregate. We can classify the pixies according to hair colour, eye colour, gender and shoe size, but we can never, ever see one individually. This is analogous to the situation with infinite sequences of digits in decimal expansions of real numbers. We assume that they all exist although we can never write them down. But we can classify them without referring to them individually.
There is a third level of set theory in which some sets consist entirely of pixies. That is, not a single element of the set may be referred to individually. The elements may be classified and written about in aggregate, but not a single one of them can be referred to individually. This is the kind of set which is brought into existence by the axiom of choice when the Zermelo-Fraenkel set theory axioms do not alone imply its existence.
2.10.4 Remark: Unmentionable sets and numbers are incompressible. Numbers and sets which cannot be referred to individually by any construction or other specific individual definition may be described as “unmentionable” numbers or sets. They could also be described as “incompressible” numbers and sets because the ability to reduce an infinite specification to a finite specification is a kind of compression. For example, the real number π has an infinite number of digits in its decimal expansion, but the expression $\pi = 4 \sum_{n=0}^\infty (-1)^n (2n+1)^{-1}$ is a finite specification for π which uniquely defines it so that it can be distinguished from any other specified real number.
2.10.5 Remark: Unmentionable numbers and sets may be thought of as dark numbers and dark sets. Pixie numbers and sets, the numbers and sets which we know are “out there” but can never see or think about individually, could be described as “dark numbers” and “dark sets” by analogy with “dark matter” and “dark energy”. (The difference, perhaps, is that we don’t really know that dark matter and dark energy are out there.)
The supposed existence of dark matter and dark energy is inferred in aggregate from their effects in the same way that dark numbers may be mentioned in aggregate. (See Remark 2.11.12 for classification of dark numbers into black numbers and grey numbers.)
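The finite specification of π mentioned in Remark 2.10.4 can be unpacked into as many digits as time permits. The sketch below is only an illustration: the function names are hypothetical, and the series $4 \sum_{n=0}^\infty (-1)^n (2n+1)^{-1}$ converges far too slowly for practical use. It relies on the standard fact that consecutive partial sums of an alternating series with decreasing terms bracket the limit.

```python
from fractions import Fraction
import math


def leibniz_partial_sum(terms):
    """Return 4 * sum_{n=0}^{terms-1} (-1)**n / (2n+1) as an exact fraction."""
    total = Fraction(0)
    for n in range(terms):
        total += Fraction((-1) ** n, 2 * n + 1)
    return 4 * total


def pi_bounds(terms):
    """Return (lower, upper) rational bounds on pi from two consecutive partial sums."""
    a = leibniz_partial_sum(terms)
    b = leibniz_partial_sum(terms + 1)
    return min(a, b), max(a, b)


# The gap between the bounds is exactly 4/(2*terms + 1), so it shrinks without limit.
lower, upper = pi_bounds(200)
```

A few lines of text therefore determine every digit of π, even though no finite computation displays them all. This is the sense in which π is "compressible" and hence mentionable.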
[ How does the classification of discomfort levels in Remark 2.10.6 change if ZF is replaced by NBG set theory? Proper classes presumably have even more “issues” than sets. ]
2.10.6 Remark: Ant-and-pixie analogy for discomfort levels with unmentionable sets and numbers. The levels of discomfort with infinite and unreachable sets described in Remark 2.10.3 may be further described as follows.
Discomfort level 0: Ants in a Formicarium. (Finite set.)
All inhabitants are ants. You can see any individual ant. If you wish, you can see all ants as individuals. This is a metaphor for a set theory which has only finite sets. Each set is like a formicarium with only a finite number of ants, which can all be referred to individually within the language of the set theory.
Discomfort level 1: Ants in the Garden. (Countably infinite set.)
In this case, all of the inhabitants are ants, and you can see any ant individually. You can’t see them all individually because there are too many of them. (There are in my garden anyway.) But you can see as many as you like individually. No ant is incapable of being detected if you make enough effort. This is a metaphor for a set theory in which there are infinite sets such as the set ω of finite ordinal numbers. Although we can’t write down all the elements, we can single out any finite subset of them, given a large enough piece of paper to write on or enough time to pronounce all the words of the specification. Even though we don’t have infinite paper or infinite lives, the induction principle is very convincing: No matter how large a number is, we can always append a zero to make it bigger. Virtually everyone is convinced by this argument after about the age of 10. Most people are content to refer to all the integers in aggregate although we can never write them all down individually.
Discomfort level 2: Both Ants and Pixies in the Garden. (Uncountably infinite set.)
In this case, there are infinitely many ants in the garden, but there are infinitely more numerous invisible pixies in the garden. You can see any individual ant, but not all of them individually in one lifetime. However, you can’t see or refer to any of the pixies individually. You can refer to them only in aggregate. This is a metaphor for a set theory in which an “infinity axiom” brings into existence infinite sets such as ω, and a “power set axiom” brings into existence sets such as $\mathbb{P}(\omega)$, which is the set of all subsets of an infinite set. The set $\mathbb{P}(\omega)$ is equivalent to the set $2^\omega$ of all infinite sequences of zeros and ones, which may be thought of as the binary numbers in the real interval [0, 1) with infinite binary digits after the “decimal point”. The set $\mathbb{R}$ of real numbers is therefore also a mixture of mentionable numbers and unmentionable “pixie numbers”.
At this discomfort level, we can individually specify any finite number of elements of the countably infinite subset of mentionable elements of $2^\omega$, but we can’t individually specify any elements of the uncountable subset of unmentionable elements of $2^\omega$. Luckily, the mentionable elements are those of greatest interest because if we can think of an element, we can always write down a description in finitely many words. The unmentionable numbers can be referred to in aggregate. They are like pixies which we will never see, but we “know” they’re there. We can refer, for example, to the set of all sequences of zeros and ones which start with a one. That is an aggregate description, but we can’t write down most of the elements of this set.
So we have lots of ants to work with, and they are the most interesting things because they are the things which we can think of individually. But there are infinitely more pixies than ants. We can never even think about the pixie sets and numbers individually with a finite mind. The number π is a real number which we can think of and specify. (E.g. $\pi = 4 \sum_{n=0}^\infty (-1)^n (2n+1)^{-1}$ is a finite specification of an infinite number of decimal digits. See Remark 2.10.4.) So π is an ant in this metaphor, not a pixie. We can’t see all of the digits in the decimal expansion of π, but at least we can specify this ant unambiguously and see as much of it as we have time for.
It is not possible to give an example of an unmentionable number because, by definition, they cannot be individually mentioned! This is certainly disturbing. We know they exist, but we can’t give any examples.
In real life, when someone claims that things of a particular kind exist, but no examples can ever be given because they are undetectable by any means, such a person would generally be humoured and politely avoided.
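The assertion that the sequences in $2^\omega$ outnumber any countable list of names rests on Cantor's diagonal argument, which is not spelled out in this remark. The following minimal sketch shows the construction under stated assumptions: a sequence is modelled as a function from an index n to a bit, a listing is a function `enum` from i to the i-th listed sequence, and the names `diagonal` and `binary_digits` are hypothetical.

```python
def diagonal(enum):
    """Return a 0/1 sequence which differs from the i-th listed sequence at position i."""
    def d(n):
        # Flip the n-th term of the n-th listed sequence.
        return 1 - enum(n)(n)
    return d


def binary_digits(i):
    """Example listing: sequence i holds the binary digits of i, lowest first, then zeros."""
    return lambda n: (i >> n) & 1


# The diagonal of the example listing cannot equal any listed sequence.
d = diagonal(binary_digits)
first_terms = [d(n) for n in range(5)]
```

Note that the diagonal sequence built here is itself finitely specified, so it is an ant rather than a pixie. The construction shows only that no single countable listing exhausts $2^\omega$. It never exhibits an individual unmentionable sequence, in agreement with the observation above that no example can be given.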
Discomfort level 3: Only Pixies in the Garden. (Set of AC-existence-only sets.)
In this case, a set has elements which are all pixies. We can talk about the set as a whole in aggregate, but we can never see a single individual element because every element is an unmentionable pixie set.
The axiom of choice brings into existence sets which are “constructed” according to some unspecified infinite set of random choices which are unknown and unknowable. For example, a Lebesgue non-measurable subset of the real numbers is guaranteed by AC to exist, but there is no way to know its membership. Invisible sets such as Lebesgue non-measurable sets may be thought of as “dark sets” in a rough analogy with the hypotheses of dark matter and dark energy in cosmology. If at least one Lebesgue non-measurable subset of the real numbers exists, then there are uncountably many such subsets. So we “know” (by the axiom of choice) that there exist an uncountable infinity of pixies in the garden of Lebesgue non-measurable sets, but we can’t see any individual pixie. We can refer to the pixies in the garden only in aggregate. If U denotes the set of Lebesgue non-measurable functions $f : \mathbb{R} \to \mathbb{R}$, then U and the subset $\{f \in U;\ f(0) = 1\}$ of U are well-defined aggregates, but no example elements of either set can be written down. There are no visible ants at all in such a garden, which is very frustrating indeed. We are happy to assume that there are billions of people in the world whom we will never meet because we can meet as many of them as we like and extrapolate from our experiences, but in the case of a garden where all the pixies are invisible, we don’t even have a starting point on which to base the extrapolation of our experience. Absolutely no individual examples of the set’s members can be presented.
A more profoundly worrying aspect of pixies-only sets is that you can make claims about their members without fear that someone will construct a counter-example to your claim! Disproving a conjecture by construction of a counter-example is impossible. People who believe in the existence of sets like this should be humoured and politely avoided. (Admittedly, there are other ways of disproving conjectures than the provision of counterexamples. But it is very disturbing to know that one important method of proof, namely the construction of a counter-example, is removed from the tool-box in advance.)
[ Get a reference for who showed that ZF is consistent with the negative of AC. In that case, Lebesgue non-measurable sets are known to not exist. Find out if Gödel’s theorems have something to do with “goblin” sets which have indeterminate existence status. ]
Discomfort level 4: Only Goblins, and we don’t know if they’re there.
Level 3 might seem like it is the most uncomfortable level, but there are sometimes sets whose existence is not knowable, whereas at level 3, only the membership of the pixie sets is unknowable. For lack of a better word, sets whose existence or non-existence is unknowable may be referred to as “goblins”. For example, Zermelo-Fraenkel set theory without the axiom of choice does not say whether Lebesgue non-measurable sets exist or not. If they do exist, they are unmentionable, but the axioms do not allow us to show that they exist. (We “know” this because someone proved that ZF with no Lebesgue non-measurable sets is a consistent axiomatic system. But presumably, like all metalogic and metamathematics proofs, you have to assume quite a lot of basic logic and mathematics in order to make the proofs work.)
2.10.7 Remark: Informal classification of discomfort levels with unmentionable sets and numbers.
The levels of discomfort in Remark 2.10.6 may be summarized as follows.
level  description                                                                   example
0.     Finite ants. All ants visible in finite time on finite paper.                 $\mathbb{Z}_{1000}$, ZF−∞
1.     Infinite ants. Any single ant visible, but not all in finite viewing time.    ω
2.     Ants and pixies. All ants visible. Pixies invisible, but definitely exist.    $2^\omega$, $\mathbb{R}$
3.     Pixies only. No ants. Pixies invisible, but definitely exist.                 ZF+AC
4.     Goblins. All goblins invisible, and they may or may not exist.                ZF
If the infinity axiom is removed from the Zermelo-Fraenkel set theory axioms ZF, the reduced axioms ZF−∞ would be at discomfort level 0. Probably some subset of the ZF axioms (but including the infinity axiom) would be at level 1. (The power set axiom would have to be removed, for example.) The full ZF set theory is at discomfort level 2 (if you assume that any set not constructible with ZF axioms definitely does not exist). If the axiom of choice is added to ZF, this increases to discomfort level 3. (But see Remark 2.10.8.) There are times when one can tolerate level 3. The reader must then simply be aware that they may encounter a garden full of invisible pixies which can only ever be referred to in aggregate. As long as the limitations are understood, there should be no seriously harmful consequences. ZF set theory has discomfort level 4 if no determination is made on whether sets requiring AC for existence proofs really do or do not exist. Then, for example, Lebesgue unmeasurable sets become goblins whose existence is indeterminate. 2.10.8 Remark: ZF set theory may have the highest level of unmentionability discomfort. In a sense, Zermelo-Fraenkel set theory without the axiom of choice may be even more disturbing than ZF with AC. Although ZF with AC guarantees the existence of the disturbing pixie sets whose membership is unknowable (and remember that membership is the only property that sets can have in ZF set theory), ZF without AC is more disturbing because we don’t even know if these disturbing sets are “out there”. Possibly even more disturbing than this is a ZF set theory where the existence of Lebesgue non-measurable sets is excluded by an axiom. This is, apparently, a logically consistent axiomatic system. It is not clear whether it is better to exclude the most disturbing sets, or learn to live with them, or just adopt a “don’t ask” policy. In this book, the “don’t ask” option is adopted; that is, simple ZF with no axiom of choice. 
In other words, there may be some goblins out there. The situation where the ZF axioms and deduction rules are inadequate to determine whether Lebesgue non-measurable sets exist is illustrated by Figure 3.8.2 in Remark 3.8.2. [ Find a reference for ZF with axiom asserting non-existence of Lebesgue non-measurable sets. ]
2.10.9 Remark: The human mind is a finite machine in a countably infinite playpen.
One may say that the human mind is a finite machine which is confined to a countably infinite playpen in an uncountably infinite world. This is true even in pure mathematics without considering the empirical world at all. The set of mentionable real numbers is a countable playpen within which we can do mathematics. An uncountable infinity of unmentionable numbers are forever out of reach of human thought. It is quite embarrassing to realize that mathematical analysis has such a dubious basis, and that most of physics and the other sciences are expressed in terms of this flimsy foundation. The “countable playpen” is evaded in practice by referring to infinite sets of mathematical objects in aggregate. Otherwise analysis would be impossible. But our real-world experiences develop a strong belief that we can test general propositions by drawing a random sample from a population and applying the test criteria to the sample. But in many areas of mathematics, the “random sample” excludes in advance almost all elements of the general set. Thus all “random samples” are in fact strongly biased. So this practical method for allaying scepticism of general propositions is not applicable, or is seriously flawed at best.
2.10.10 Remark: Very large sets are analogous to dynamically allocated “virtual” computer memory.
The problems with dark sets and dark numbers are alleviated if we take a “dynamic existence” approach to sets and numbers, analogous to the “virtual memory” concept in computer operating systems. In modern computers, a program is allocated a large space of computer addresses which it can access. However, those addresses do not “exist” until the program tries to read or write the contents of the addresses. The operating system dynamically allocates real memory to represent the virtual memory addresses when the program attempts an access. This is a bit like a sandy tennis court which has no hard surface for the ball to bounce on, but just before the ball hits the ground, a small area of hard surface is placed where the ball is about to hit. From the ball’s point of view, the whole tennis court is a hard, flat, continuous area. The advantage of this approach is that it economizes the available pieces of hard court surface if they are in short supply.
Sets and numbers are very much like this. The Zermelo-Fraenkel axioms for sets and the Cantor construction for real numbers guarantee the “existence” of sets and numbers whenever the mathematician wishes them to exist, in accordance with various construction principles. So when they are required, they come into existence. Now when one talks about the existence of, say, all of the real numbers, one should ask whether one is talking about the virtual real numbers which exist in principle (but which are not yet “allocated”), or the real real numbers which have been brought into existence dynamically by some construction operation. If one takes an anthropological view of sets and numbers, one finds that in practice, all sets and numbers are defined by procedures or algorithms.
Apart from the random sets which are brought into existence by the axiom of choice, all sets and numbers are arrived at by a construction of some sort, which can be written as a linear sequence of symbols. This countable space of constructions of sets and numbers yields numbers which are brought into existence as required, if the procedures or algorithms conform to the rules of the system. This “dynamic allocation” picture of sets and numbers relegates the dark sets and numbers to the “virtual memory” part of the system, whereas the “reachable” or “mentionable” sets and numbers are dynamically brought into existence as “real memory” as required. The fact that this picture is dynamic is not a problem. Human mathematical thought is dynamic. So in the anthropological sense, the picture is accurate. The desire for a static picture arises from the fact that mathematicians think of their concepts of sets and numbers as being static.
2.11. Integers and infinity
2.11.1 Remark: Mathematical induction and the infinity of integers.
The principle of mathematical induction is fundamental to all of mathematics. It is probably the most important of all mathematical principles. It is essentially equivalent to the concept of infinity. If this principle is not “true”, then there are no infinite sets. Whether or not the integers are infinite is related to the questions: whether the universe is infinite, and whether time and space are infinitely divisible. The Euclidean geometry model for space is infinite and infinitely divisible, but these properties are never needed in practice. Euclidean geometry was suitable for the surface of the Earth as long as one did not apply the model to unduly large extents. It was a model which was true enough within its range of application. Similarly, whether the integers are infinite is irrelevant if one can never use an infinite number of integers in practice.
2.11.2 Remark: Infinite integers versus the finite human mind and finite universe.
It is very difficult to reconcile the idea of infinitely many integers with the finite human mind. A finite mind or computer can only represent a finite number of numbers. For example, a computer memory with N bits of storage can represent at most $2^N$ different numbers. These do not need to be the integers from 0 up to $2^N − 1$. For example, the number $2^{1,000,000}$ can be represented in very little space. One might refer to such
an expression as a “compressed number representation”. (The familiar floating point representations are only a particular style of number compression.) Although this number has 1,000,000 binary zeros after the initial digit 1, only a few bytes are required for its representation rather than a million bits. But if 1,000,000 binary digits are chosen at random, it is very unlikely that such an integer can be “compressed”. Representation of all integers less than $2^{1,000,000}$ requires at least a million binary bits of storage. This is not difficult to do with modern computers. Now consider the numbers from 0 up to $2^N$ where $N = 10^{1,000,000}$. Representation of all of these numbers requires at least $10^{1,000,000}$ bits of storage. The amount of matter in the Universe is around about $5.68 × 10^{56}$ grams. (See Misner/Thorne/Wheeler [37], box 27.4, page 738.) This corresponds to about $3.40 × 10^{80}$ protons. If each proton stores one bit of information, this is still very much less than the required $10^{1,000,000}$ bits. So even with a brain as big as the Universe, one inevitably finds that there must be integers which cannot be thought about, no matter how big the brain is. There is also a limit to the number of integers that can be represented in a given time because of special relativity and quantum effects. The most serious restriction is the “bandwidth bottleneck” of any mind. There is a very low upper bound on the number of bits of data which can be fed into a mind in a given time. One must conclude that the vast majority of the numbers between 0 and $2^{10^{1,000,000}}$ will never be thought about by any being or system in the Universe. Even if the Universe were infinitely voluminous, it would not be possible to communicate the state of such a Universe to any subset of the Universe (such as a human mind) in finite time due to bandwidth constraints.
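The contrast between a “compressible” integer and a randomly chosen one can be tried out on a computer. The following is only a rough sketch: it uses the general-purpose compressor zlib as a stand-in for “compression” (this is an assumption, not the notion of compression intended in the text), and the proton mass constant is an assumed value which is not taken from the book.

```python
import math
import os
import zlib

# 2**1_000_000 in explicit binary form: a 1 followed by one million zeros.
regular = 1 << 1_000_000
n_bytes = (regular.bit_length() + 7) // 8            # about 125 kB
regular_raw = regular.to_bytes(n_bytes, "big")

# The same number of random bytes stands in for "binary digits chosen at random".
random_raw = os.urandom(n_bytes)

# zlib is only an illustrative compressor (an assumption of this sketch).
regular_packed = len(zlib.compress(regular_raw, 9))   # a few hundred bytes at most
random_packed = len(zlib.compress(random_raw, 9))     # essentially no saving

print(n_bytes, regular_packed, random_packed)

# Number of protons corresponding to the mass quoted from Misner/Thorne/Wheeler.
PROTON_MASS_GRAMS = 1.6726e-24                        # assumed value
protons = 5.68e56 / PROTON_MASS_GRAMS                 # roughly 3.4e80

# Listing all integers up to 2**N with N = 10**1_000_000 needs at least
# 10**1_000_000 bits, so only the base-10 logarithms are compared here.
log10_bits_available = math.log10(protons)            # about 80.5
log10_bits_required = 1_000_000
print(log10_bits_available, log10_bits_required)
```

The random input differs on every run, so the exact compressed sizes vary, but the regular bit pattern should shrink by orders of magnitude while the random one should not shrink appreciably.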
So if we now crank up the number N to much larger values, it is clear that 0% of all integers can ever be thought about if there are an infinite number of them. The situation is much worse than this. Not only will almost all integers not be thought about; almost all integers cannot be thought about. There are only a finite number of methods that human beings will ever devise for “compressing” integers. By writing $10^{10}$ instead of 10,000,000,000 we obtain a compression factor of about 2.5. By writing $10^{1,000,000}$ instead of a million-and-one decimal digit number, we obtain a compression factor of about 100,000. There are only a finite number of ways of compressing integers into such notations.
Consider the sequence of numbers $n_i$ defined by $n_0 = 1$ and $n_{i+1} = 2^{n_i}$ for (finitely many) integers i. These are 1, 2, 4, 16, 65536, $2^{65536}$, and so forth. Even at i = 5, the number $n_i$ is painful to write down as decimal digits. But the numbers become very large very quickly. Suppose we take the $10^{1,000,000}$th element of this sequence. That will be a really huge number. So now use that number $n_i$ for $i = 10^{1,000,000}$ as the number i. This is now monstrously bigger. Now do this feedback process $10^{1,000,000}$ times. The number will be stupendously big. However, no matter what method we specify, the result will take a finite amount of space to represent. The description of any integer, no matter how much “compression” we use, will require a finite amount of time to describe or a finite amount of space on a sheet of paper or in a computer. Even if we use all of the atoms of the Universe, there are still only a finite number of numbers that can be represented, because there are only a finite number of methods of specifying sequences and other ways of describing integers. Therefore some integers cannot be described.
[ At this point, the Kolmogorov-Chaitin complexity might be relevant. See EDM2 [34], 354.D. This might have something to do with the compressibility of numbers. ]
2.11.3 Remark: An argument against the validity of the induction principle.
One may conclude from Remark 2.11.2 that (1) there are only finitely many integers which can be represented in the Universe in which we live, and (2) there is a maximum integer which can be represented. But now the intuitive principle of induction should save us. The intuitive induction principle says: “Whatever the largest integer is, just add one. Then you have a bigger integer. That contradicts the hypothesis of a finite number of integers. Therefore the integers are infinite.” This intuition is so fundamental to modern mathematics that it is almost impossible to reconcile with the above argument for the finity of integers. (It is difficult to let go of a belief which is acquired at a very young age!)
The problem with the induction principle is that you can’t know which integer is the largest. Therefore you cannot add 1 to it. This is related to Zeno’s paradox of Achilles and the tortoise. It is also related to the atomic theory of matter. More precisely, consider the analogy with the finite speed of light. “No matter how fast you travel, you can always travel faster.” Even within special relativity, this is true. But there is a finite limit to speed. The problem is that the amount of energy required for each speed boost increases so that one can never exceed the speed of light. (One would need an infinite amount of fuel, and this would make the vehicle infinitely massive, and so forth.)
In the case of integers, you must be able to represent an integer in order to add 1 to it. No matter which integer you pick, even if you take the millionth power of this integer, you still only have yet another large integer. The finity of integers is related to this kind of paradox: “One more than the largest number which can be described in thirteen words.” This seems to describe a number in 13 words. The problem is that it doesn’t describe a number. It describes a recursive relation which is not soluble. So it is not a paradox. It is similar to “the solution of x = x + 1”, but the contradictory recursion is easier to disguise in plain English. (See Remark 8.4.4 for discussion of x = x + 1.) In the same way, “the largest number which can be represented” is not a representation of a number. The source of the paradox is that a human mind is trying to talk about the largest number which can be represented by a universe which includes that human mind. A system cannot represent all of the states of a system which includes itself. That is as recursive and meaningless as the 13-word pseudo-paradox above. Therefore there is no paradox in assuming that there are only a finite number of integers.
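The rapid growth of the sequence in Remark 2.11.2, defined by n_0 = 1 and n_{i+1} = 2^{n_i}, can be checked directly for the first few terms. The following minimal sketch uses a hypothetical helper named `tower`, which is not part of the book. It stops at n_4 and only counts the digits of n_5, since n_6 would need more binary digits than there are protons in the Universe.

```python
import math


def tower(steps):
    """Return [n_0, ..., n_steps] where n_0 = 1 and n_{i+1} = 2**n_i."""
    values = [1]
    for _ in range(steps):
        values.append(2 ** values[-1])
    return values


first_terms = tower(4)                  # n_0 .. n_4
print(first_terms)

# n_5 = 2**n_4 has n_4 + 1 binary digits.
n5_binary_digits = first_terms[4] + 1

# Its number of decimal digits is floor(n_4 * log10(2)) + 1.
n5_decimal_digits = math.floor(first_terms[4] * math.log10(2)) + 1
print(n5_binary_digits, n5_decimal_digits)

# n_6 = 2**n_5 would have n_5 + 1 binary digits, which is a count that is
# itself a number with n5_decimal_digits decimal digits. No physical memory
# can hold it explicitly, although the three-line definition above describes it.
```

The definition of the sequence is a short finite description, which is the sense in which such a number is “compressed”.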
2.11.4 Remark: We don’t really need an infinite number of numbers.
One should not feel too uncomfortable with a finite set of integers. Euclidean geometry assumes a point set of infinite extent, but we use this for 2-d maps of the Earth and 3-d models of the Universe. We just ignore the useless bit outside the domain of interest. This is as harmless as Newtonian mechanics being applied to everyday situations. We still use Newtonian mechanics for almost all practical purposes, but we ignore the part of the model which we don’t need. In the same way, we should not really care if there are finitely or infinitely many integers in our model for human and mechanical computation processes. We will never need most of the integers in the infinite set. So it is of no importance that most of the integers will never be thought about, and never can be thought about, by any human or computer, or any network of humans and computers.
2.11.5 Remark: The assumption of infinitely many integers avoids the need for boundary conditions.
Although the “real world” set of integers, which we model by the mathematical integers, is fairly clearly finite, there is one excellent reason for assuming that the integers are infinite. That reason is the avoidance of boundary conditions. If we defined the integers to be finite, all theorems and proofs would need to deal with the boundary condition at the “high end” of the set. We would need to say things like: “Let $a_{n+1} = 2a_n$ if n is less than the largest integer; otherwise $a_{n+1}$ is undefined.” Working with such a “closed set” of integers, we would always need to have separate conditions for the “interior” and “boundary” of the set of integers. This is because our logical systems require the total absence of contradictions and errors. (A modified logical system could, in principle, permit fuzziness of the logic at the boundaries of the system. But that would be a revolutionary step backwards which would rob mathematics of any credibility which it currently possesses. For example, proof by contradiction would cease to be valid. See Section 3.11 for discussion of proof by contradiction.)
If we exclude those integers which cannot be represented by the state of a human mind, or by a machine the size of the Universe, we would need to place conditions on integers like: “If n is a representable integer, then. . . ”. This would make all of mathematics very messy indeed. (It is true that some mathematics is nominally finite, but mathematical logic itself makes extensive use of induction. So induction is explicit or implicit in all of modern mathematics.)
The strongest argument for the infinity of the integers and the principle of mathematical induction is convenience. The infinite integers in Definitions 7.3.2 and 7.3.5 are simply the most convenient kinds of integers which include all of the integers that we will ever want to refer to, and we don’t have to extend the set every time someone extends the integers by thinking up a previously unknown method of describing integers. This is like using Euclidean 3-space to describe the Universe so that we don’t have to say how big it is or whether it is infinite. Euclidean 3-space has enough points for our purposes. If we don’t use it all, that’s not a problem. If we run out of numbers or points, that would be a problem. Similarly, in physics and engineering, the behaviour of systems after they achieve equilibrium is routinely studied, although equilibrium requires infinite time. It’s just more convenient to ignore the modelling error and work with the infinite-time solution.
In conclusion, it will be assumed in this book that the integers are infinite and that the principle of mathematical induction is valid. This is a convenient and harmless fiction, as long as you don’t think about it too deeply. It’s just a model. All models are imperfect because models are in the mind, and the world is “out
there”. The mind is too simple to represent the complexity of the world because a mind is a tiny subset of the world. Mathematics is the study of models, not of the real world.
2.11.6 Remark: Practicality of mathematics using only a finite number of integers.
There could be applications for a mathematical theory of integers which explicitly recognizes “compressible integers” and works exclusively within such a framework. Then one could represent a large number of integers using a “compression code book”. Using this code book, very large numbers may be represented by shorter strings of symbols, like $10^{1,000,000} − 1$ to represent a number with many decimal digits. Since the code book must be finite, one must be careful to include in the code book as many important integers as possible, omitting “uninteresting” integers. It would be possible to define large finite sets of integers which have well-defined partial closure properties. Obviously no finite set can be closed under normal addition. But some sort of “weak closure” could be useful. Such almost-closed sets of integers would have the advantage of being implementable in a finite-state machine. Such a finite theory of integers is routinely implemented in computers. The problems which arise from such finite representations are well known in the field of numerical analysis. This topic lies outside the scope of this book.
2.11.7 Remark: The “biggest number” is time-dependent.
A useful way to think of the set of representable integers is to consider the set as time-dependent. The integers which have been represented somewhere on Earth yesterday may be fed into functions which generate new, bigger integers today. Today’s biggest integer will be larger than yesterday’s biggest integer. This helps to explain our intuition that no matter how large an integer is, we can always make it bigger. This is true, but making integers bigger takes time.
One might argue that in the long run, we can generate arbitrarily large integers. But as the old saying goes: “In the long run, we’re all dead.” (This saying is generally credited to John Maynard Keynes.) Just as a pseudo-number “infinity” can be defined as the solution to x = x + 1, interpreted as an iteration formula (as mentioned in Remark 8.4.4), so also we can think of the definition of the “biggest number” as the solution of an iteration process which daily increases the number of integers.
In this perspective, the assumption that the integers are infinite is a way of avoiding the need to keep increasing the “biggest number” every day as people need larger and larger numbers.
One of the first signs that a child is taking an interest in mathematics is when they ask: “What is the biggest number?” Sometimes children think up a really big number and think it is the biggest for a while, and then realize that they can add 1, multiply by 10, or do something else to make it bigger. Eventually they settle on “infinity” as the biggest number, until they find out that this has certain technical difficulties also.
Like most of the ancient riddles and paradoxes of mathematics, the principle of mathematical induction remains incompletely resolved to the present day. Formalization of irresolvable difficulties by axiomatizing them as in Definition 7.3.2 does not solve these problems. Axiomatization is merely a compromise agreement to call a truce so as not to raise philosophical difficulties which get in the way of mathematical productivity. If the agreement is looked at too closely, the compromise may unravel. Axiomatization can only formalize and standardize our current state of ignorance. It does not remove the ignorance.
2.11.8 Remark: Infinite sets can be referred to finitely in aggregate.
Although we can only represent a finite number of integers in a finite time in a finite universe with finite communication channels, it is possible to refer to infinite sets of integers in aggregate. (The words “set” and “aggregate” have very similar meaning.) For example, it is impossible to list all of the odd numbers individually, but it is easy to refer to the set $\{n \in \mathbb{Z};\ \exists m \in \mathbb{Z},\ n = 2m + 1\}$.
The very-large-sets problem also disappears in the case of aggregates. For example, we cannot list all of the integers from 0 up to $2^N$ where $N = 10^{1,000,000}$, as mentioned in Remark 2.11.2, but we can easily refer to the set $\{n \in \mathbb{N};\ 0 \le n \le N\}$.
The set $\{n \in \mathbb{N};\ 0 \le n \le N\}$ is well defined for any representable integer N, even though we cannot list the elements if N is too large. The set is a well-defined aggregate because there is a well-defined procedure to determine if a number is or is not in the set. A set is defined by its membership. So the ability to answer “yes” or “no” to the membership question is the only pre-requisite for meaningfulness of the set. Similarly, the infinite set $\mathbb{N}$ of all natural numbers is a well-defined aggregate because there are well-defined membership determination procedures. Another example is the set of prime integers. (See Definition 7.4.2.) It may be very difficult in practice to determine whether any given integer is a prime, but the procedures are well-defined. So the set is well-defined.
2.11.9 Remark: Operational definitions are required for representable sets and numbers.
A requirement for the practical definition of any aggregate of integers is the availability of a procedure to determine whether any given integer is or is not inside the aggregate. There is a corresponding requirement for the definition of sums of series of real numbers. A number or set is “represented” or “pointed to” or “indicated” by a formula or expression if it is practically possible to determine if this number or set is the same as or different from any number or set that someone else represents. For example, no one can write down all the decimal places of π or e, but it is easy to determine that these numbers are different. Given two different series expressions for π, it is possible to verify that the sums of the series are equal. (See also Remark 2.12.4.) Thus a “representation” of a number or set is the same sort of thing as an “operational definition”. (Lebesgue non-measurable sets, by contrast, are not representable. Nobody knows which numbers are members of such sets.)
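The idea that an aggregate is defined by a membership procedure rather than by a list of its elements can be illustrated as follows. This is only a sketch: the function names are hypothetical, trial division is just one possible primality procedure, and the two series for π (the Leibniz series and the Nilakantha series) are chosen here for illustration and are not taken from the book. Agreement of partial sums is numerical evidence, not a proof that the sums are equal.

```python
def is_odd(n):
    """Membership test for {n in Z; there exists m in Z with n = 2m + 1}."""
    return n % 2 == 1


def in_initial_segment(n, upper):
    """Membership test for {n in N; 0 <= n <= upper}."""
    return 0 <= n <= upper


def is_prime(n):
    """Membership test for the primes by trial division (slow but well defined)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


# The elements up to N cannot be listed, but membership is easy to decide.
N = 10 ** 1_000_000
print(in_initial_segment(12345, N), in_initial_segment(N + 1, N))
print(is_odd(-7), is_odd(10), is_prime(2 ** 31 - 1))


def leibniz_pi(terms):
    """Partial sum of pi = 4 * (1 - 1/3 + 1/5 - ...)."""
    return 4 * sum((-1) ** k / (2 * k + 1) for k in range(terms))


def nilakantha_pi(terms):
    """Partial sum of pi = 3 + 4/(2*3*4) - 4/(4*5*6) + ..."""
    return 3 + 4 * sum(
        (-1) ** k / ((2 * k + 2) * (2 * k + 3) * (2 * k + 4)) for k in range(terms)
    )


# Two different series expressions whose partial sums approach each other.
gap = abs(leibniz_pi(2000) - nilakantha_pi(2000))
print(gap)
```

Each predicate answers “yes” or “no” for any integer that can be written down, which is all that the text requires of a well-defined aggregate.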
2.11.10 Remark: Semi-pixie integers. Analogy to limited resolution of telescopes.
The integers which we can never write down or “indicate”, because they have too many decimal digits and cannot be compressed by any humanly possible algorithm so as to fit into the state space of the total machinery available to humanity, may be termed “semi-pixie integers”. These are the integers which we can refer to in aggregate but never individually. Of course, there must be a grey area between the indicatable and non-indicatable integers. The set of semi-pixie integers is not well-defined. If we could write down one of the semi-pixie integers, it would not be a semi-pixie integer. You know that they are “out there”, but you can never see them. We don’t even know which integers are “visible” and which are not.
This could be thought of as a resolution problem. The resolution of our human ways of indicating things is not fine enough to resolve all individual integers. We can refer to vast sets of integers just as we can see pictures of distant galaxies of which we cannot see any individual atoms. We know that there are atoms out there, but we will never see them. Our inability to resolve individual atoms in distant galaxies does not make us uncomfortable with the assumption that the atoms are indeed there. The non-indicatability of most integers is no more frustrating or bewildering than this.
2.11.11 Remark: The abundance of integers.
We can think of all infinite structures in mathematics as arenas in which mathematical activity can take place. The closure axioms of infinite mathematical structures are sufficient to guarantee that whatever activity takes place will remain within the arena, but this does not imply that such activity can ever reach all parts of the arena. Some parts are unreachable, and most of the theoretically reachable parts of the arena will never be reached at any time in human history. We can think of this as a kind of abundance of integers. Given the choice between scarcity and abundance, probably abundance is the lesser of the two discomforts.
2.11.12 Remark: Classification of dark numbers into black numbers and grey numbers.
Perhaps we could sub-classify dark numbers (mentioned in Remark 2.10.5) into black numbers and grey numbers. The dark real numbers which are essentially random and incompressible could be thought of as black numbers because they absolutely cannot be represented at all in any finite system such as a human mind or digital computer. But an integer with a randomness which cannot be compressed to an expression which is representable in a computer the size of a planet is only a grey number because the incompressible size requires only a finite state space, although that finite state space is far beyond all known technologies.
In this sense, the childish question about the “biggest number” does have a serious answer. There is a biggest number which can be thought about or talked about, but if we knew which number this was, we could add 1 to it and make a bigger number. In other words, the biggest number is only the biggest number until we actively think about it. To be more precise, the biggest number is time-dependent.
2.11.13 Remark: Zermelo-Fraenkel set theory axiom of infinity.
The infinity axiom, Definition 5.1.26 (8), has some difficulties. We will never be able to write down all of the finite ordinal numbers. The main thing which convinces people to accept this axiom is the difficulty in seeing where the numbers would run out. The induction argument is that if there are only finitely many integers then there must be a last integer, and we can add 1 to this to get a larger integer, which is a
contradiction. This is quite convincing. We will never see an infinite number of integers, but “we know they’re there”. Mathematics seems to be full of objects which are invisible and unknowable, but this infinity axiom is perhaps the least disturbing of these mysteries. A more disturbing consequence of the infinity axiom is its combination with the power set axiom, Definition 5.1.26 (5), to show that the set $\mathbb{P}(\omega)$ of all subsets of the non-negative integers is well defined. This can be identified with the set $2^\omega$, which is equivalent to the set of all real numbers in the interval [0, 1) expressed in binary (base-2) form. Since there are only countably many sentences that can be written in ZF theory, almost all elements of $\mathbb{P}(\omega)$ are “unreachable”. That is, they can never be written down. So almost all such sets are eternally beyond individual description by any mathematician. These sets can be used in aggregate, but they cannot be singled out and discussed individually. Whereas we can single out any individual integer in $\omega$, but not all of them, the vast majority of subsets of $\omega$ cannot even be singled out. Nevertheless, most mathematicians nowadays do not worry about this too much.
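The countability claim for written sentences can be illustrated with a small sketch. The Python fragment below enumerates every finite string over a finite alphabet in order of length, so that each string receives a finite index. The two-symbol alphabet and the function names are illustrative assumptions only; they are not part of ZF set theory or of this book's formalism.

```python
from itertools import count, islice, product

def finite_strings(alphabet):
    """Yield every finite string over `alphabet`, shortest strings first."""
    for length in count(0):
        for symbols in product(alphabet, repeat=length):
            yield "".join(symbols)

def index_of(target, alphabet):
    """Return the finite position of `target` in the enumeration."""
    for i, s in enumerate(finite_strings(alphabet)):
        if s == target:
            return i

# The first few strings over the hypothetical alphabet {a, b}.
first_seven = list(islice(finite_strings("ab"), 7))
position = index_of("ba", "ab")
```

Every sentence over a finite alphabet appears at some finite position in such a list, whereas no list of this kind can contain every subset of the non-negative integers.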
By comparison, one may think of all the people on Earth. We may meet any one of them, but we can’t meet them all. However, we are happy to refer to the aggregate of all people on Earth. (Similarly we are happy to accept the number π although we will never see all of its digits.) This is level (1) “discomfort”. We can see many stars, but there are many stars which we will never be able to see even if we try. However, we are happy to talk about all the stars in the universe in aggregate as a category, and we can see a large number of examples, which is reassuring. This is level (2) “discomfort”. We know (more or less) that the inside of the Sun is extremely hot, but we can never go inside with a thermometer to check. However, we are still happy to assume that the inside of the Sun does exist and that it is indeed hot inside there. This is similar to level (3) “discomfort”. It seems that humans can cope with some fairly high levels of abstract thinking about things which will never or can never be observed directly. But there are limits to this kind of abstraction. In this book discomfort up to level (2) will be accepted, but theorems which require level (3) discomfort will be specially marked, for example as Theorem [zf+ac]. (See Theorem 20.1.4 for a real example.)
2.11.14 Remark: The origin of classes which are mentionable in aggregate but not as individuals. The concept of “aggregates” in Remark 2.11.13 arises in real-life set theory from an induction, extrapolation or generalization process. A person sees a rabbit, then another rabbit, then another. And after a while one acquires the concept of a general rabbit class which is defined by the observed attributes of the rabbits which one has seen so far. Then any future rabbits are classified into this class on the basis of the inferred attributes of the class.
Thus the process has the stages: (1) see examples, (2) infer common attributes, (3) define a class in terms of attributes, (4) classify future class members according to attributes. This process of class definition, forming aggregate concepts from individual examples, is built into most animals. (See Figure 2.11.1.) Classification is more or less the same thing as pattern recognition. In the case of infinite classes, for example the integers, the process is the same. One sees many integers, notes their common characteristics, defines a class based on these characteristics, and then classifies future objects as integers on the basis of the characteristics. After a while, the class of objects acquires its own sense of reality. It is this ability to conceptualize classes or aggregates which is the basis of naive (i.e. in-born) set theory. Without this mental skill, humans would not be able to learn set theory. Classes are an important building-block of real-world models. Classes make possible the manipulation of an infinite number of instances based on attributes of classes rather than the attributes or identity of individual instances. Most classes are, in principle, infinite. The word “infinite” means literally “without end”. In other words, there is no “last element” of an infinite class, or at least, one cannot state which element of the class is the last. So, since one cannot see an “end” of the set, one assumes, for intellectual economy, that there is no
So the progression of discomfort here is from level (1): elements of a countably infinite set can be singled out, but they can’t all be given names in finite time; to level (2): infinitely many subsets of a countably infinite set can be singled out, but almost all subsets can never be even referred to individually; and to level (3): none of the elements of a set brought into existence by the axiom of choice can ever be referred to individually. At level (1), each element may be singly referred to or the whole set can be referred to in aggregate. At level (2), essentially any element of interest can be singled out, although the rest of the set may be referred to only in aggregate. At level (3), not even one single element can be referred to as an individual, and therefore the whole set may be referred to only in aggregate. Consequently, these levels of discomfort may be worth accepting if referring to aggregates suffices.
Figure 2.11.1  The development of classes from individual observations [axes: attribute parameter 1, attribute parameter 2; clusters labelled rats, mice, hamsters]
2.11.15 Remark: Mathematical induction, the infinity concept, and straws on camel’s backs. The common argument in favour of the existence of an infinite number of integers is that you can add 1 to the largest integer to get a larger integer. Therefore there is no largest integer. However, if you can’t write down the largest integer, you can’t add 1 to it. The number of atoms in the Universe is of the order of $10^{80}$. Even if we store $10^{10}$ bits of data in each atom, that gives us only $2^{10^{90}}$ numbers which can be represented with all the matter in the universe. So out of the first $2^{10^{100}}$ integers, certainly only at most one billionth of all integers can be represented at one time. With this sort of argument, it becomes clear that there must be a point at which we can no longer keep adding 1 to the integers. This is like the “straws on the camel’s back” argument. We know that a camel (without further breeding advances) cannot carry 100 tons of straw. But we cannot imagine that a single straw will break its back. (See also Remark 4.13.12 regarding camels and straw.) In the same way, there must be a largest integer. We just can’t imagine what it is. Therefore there is no concrete representation of an infinite set of integers, even in human minds. (Minds are, of course, much smaller than the Universe.) Since ZF set theory is only a model, and there is no concrete system which fulfils the ZF infinity axiom (Definition 5.1.26 (8)), it follows that the conclusions of ZF set theory can never be obtained. Since all of mathematical analysis (particularly differential equations) relies heavily on infinite and infinitesimal limits, it follows that mathematical analysis is not validated by ZF set theory. And since almost all models in physics are expressed in terms of differential equations, the vast majority of physics receives no support from ZF set theory. Consequently almost the whole of mathematical physics is essentially baseless.
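The counting estimate in this remark can be checked with exact integer arithmetic. The sketch below takes the figures as $10^{80}$ atoms and $10^{10}$ bits per atom, compared against the first $2^{10^{100}}$ integers; the variable names are illustrative, and the calculation works with exponents only because the numbers themselves are far too large to write out.

```python
import math

ATOMS_LOG10 = 80          # about 10^80 atoms in the Universe (figure from the remark)
BITS_PER_ATOM_LOG10 = 10  # 10^10 bits stored in each atom (figure from the remark)
RANGE_BITS_LOG10 = 100    # compare against the first 2^(10^100) integers

# Total storage in bits: 10^80 * 10^10 = 10^90.
total_bits = 10 ** (ATOMS_LOG10 + BITS_PER_ATOM_LOG10)
range_bits = 10 ** RANGE_BITS_LOG10

# Representable fraction = 2^total_bits / 2^range_bits, so its base-2
# logarithm is the (hugely negative) difference of the two exponents.
log2_fraction = total_bits - range_bits
log10_fraction = log2_fraction * math.log10(2)

# One billionth corresponds to a base-2 logarithm of roughly -29.9.
log2_one_billionth = math.log2(1e-9)
```

Under these assumptions the representable fraction is smaller than one billionth by an enormous margin, so the bound stated in the remark is a very generous one.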
2.11.16 Remark: For large numbers, name-to-number mapping ceases to be a yes/no proposition. Perhaps the best way of thinking about the infinity issue in Remark 2.11.15 is to note that as numbers become larger, the ability to represent the number in a mathematical mind ceases to be a yes/no question. It is the applicability of the concept of a proposition which fails here, more than the concept of a number. In the phrase “let N be the largest integer”, the verb “be” is an implicit instruction to construct a name-to-number map which maps N to the largest integer. It is this constructibility question which changes from a yes/no proposition into a confidence-level estimation which depends upon resource availability and political will. Logic is based upon yes/no questions. The logic model is not applicable otherwise.
2.11.17 Remark: Abstract-to-concrete variable name maps and the cosmic information bottleneck. One might ask why it is that, according to Remark 2.11.15, there cannot be infinitely many integers in any
end. Then literally, the set is “infinite”. (The ZF set theory definition of infinity in fact defines an infinite set to have no “last element”. That is, every element has a “successor”.) Now in terms of this “cognitive theory” for infinite sets, it becomes clear that there is no paradox in stating that a set may be mentioned in aggregate, but no element of that set may be referred to individually. Thus we may talk about the real numbers in aggregate, whereas almost all real numbers are not individually finitely describable in any language, simply because of the finite bandwidth of the human mind and human communications. The vast majority of mathematical models and mathematical concepts are expressed in terms of classes, i.e. aggregates, not in terms of individual things.
concrete system, whereas in ZF set theory, we can confidently assert that if $M$ is the maximum integer, then $M + 1$ is a larger integer, therefore there is no largest integer, hence the integers are infinite. The explanation for this apparent contradiction is the fact that the abstract variable name space in ZF set theory (which contains variable names of sets and numbers) requires a variable name map in order to be applied to a concrete variable space (which contains concrete sets and numbers). (See Remark 4.12.3 and Figure 4.12.1 for abstract-to-concrete name maps for logical variables.) When a variable $M$ is used in symbolic logic, the map $\mu_V : N_V \to V$ from the abstract variable name space $N_V$ to the concrete variable space $V$ is arbitrary and unspecified. The deductions of symbolic logic are independent of the choice of variable name map $\mu_V$. Now when we assert in the abstract variable name space that we can add 1 to $M$, this only has some significance for the concrete variable space if a variable name map $\mu_V$ is specified. Then we can interpret $M + 1$ as addition of $\mu_V(M)$ and $\mu_V(1)$ in the concrete space of integers. However, we cannot even define the map $\mu_V$ if it requires too much information. The cosmic information bottleneck prevents us from writing down the map. Therefore the conclusions from the abstract name space are not applicable. Abstract results from a model are only applicable if a map can be defined from the model to a concrete system.
2.11.18 Remark: Possible unsuitability of infinitely many integers for physics. On a personal note, this author found the huge excess of real numbers quite disturbing in the early 1970s while studying both mathematics and physics. In physics, nothing could be measured to more than about 10 significant figures whereas in mathematics, people took seriously the calculation of hundreds of digits of numbers like π.
There were even contests on the mainframes in those days to calculate vast sequences of decimal digits of π whereas in reality, the ratio of a circle’s diameter and circumference could never be measured to more than 10 figures. The countably infinite set of integers seemed less disturbing than the uncountable set of real numbers. But now, upon further reflection, the stupendously large set of integers is just as disturbing. Even mathematical induction does not bear up well under very close scrutiny, although in younger years it seemed almost self-evidently valid.
2.11.19 Remark: Unsuitability of integers for describing the real world. Taking scepticism of the integers one step further, it is possible to doubt even that small integers are an adequate model for anything in the real world. One may argue that the integers have been natural since human beings were herding cattle 10,000 years ago. Counting cattle was important then. Surely the integers are an adequate model for the counting of cattle. However, there are problems with this. To see the inadequacy of integers to describe even the simplest counting task, suppose a cow is dying. At what point do you say that the cow has ceased to be a cow? Even when high-tech probes are placed in human beings, it is highly disputed when a human being has died. So in order to count cows, one would need an agreed definition of cow death so that the number of cows can be reduced by 1. Consider also births. At what point in time during birth has the new cow come into existence? There must be a time during which there is ambiguity. If the grazing area is large, it may be difficult to know if a cow is in one’s own territory or in the neighbour’s territory. Since one cannot observe all points simultaneously, the cow count could depend on the times and places at which observations are made. There is also a problem with species. At what point does the breeding of cattle (with or without genetic engineering) lead to a new species which is no longer described as cattle? If a mutant cow has two heads, is this two cows or one? Or is it a new species of cow? It is clear that counting cows does not always yield an integer as we understand integers in modern mathematics. Most counting situations have the same kinds of difficulties as counting cows. When the prehistoric problems are combined with quantum mechanics and relativity, the problems become even greater. Simultaneity of existence of cows is not always well defined. All observations have an irreducible uncertainty. 
Clearly the integers provide only an approximate model for real systems. The model is often useful, but only if one does not examine its accuracy too closely. These difficulties with integers apply equally to all sets. The whole of set theory is accurate only in describing the models inside the human brain, not the fuzzy, ambiguous reality which is “out there”.
[ Remark 2.11.19 is very similar to Remarks 2.4.2 and 2.4.3. ]
One might therefore ask whether there are better models for the real world than such crisp concepts as integers and sets. In fact, this is probably not the best question to ask. A better question is whether integers and sets provide a good basis for the formalization of human thinking and models. Mathematics provides tools for thinking and mental models, not for the real world. The mind forms models, and the model can be formalized in terms of mathematics. It is not the job of mathematics to accurately describe the real (i.e. phenomenal) world, but rather to provide a language and tool-box for human thinking about the real world. More accurately, mathematics provides a language framework for the communication of mathematical ideas.
2.11.20 Remark: Infinity and uncountability are negative concepts. An interesting clue to the possible vacuity of concepts of infinity lies in English-language words such as “infinite” and “uncountable”. Both of these words are negatives: “infinite” means “not finite” and “uncountable” means “not countable”. In each case, the word does not actually say what the concept means. Each word only says what the concept is not. When one shows that a set is infinite, one typically invalidates the assumption that the set is finite. That is, one assumes the set to be finite (typically by associating the set with a finite set of integers), and then arrives at a contradiction by showing that the set has an element which is not in the assumed set. To show that the real numbers are not countable, one typically uses a diagonal argument, which starts from the assumption that we have an ordered listing of the elements of the set and then proceeds to construct an element of the set which is not in the listing.
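Both constructions can be sketched as short procedures which take a proposed listing and return an element missing from it. The Python fragment below is only an illustration: the function names and the sample listing are hypothetical, and the diagonal step is applied to a finite prefix of binary digits because no program can output an infinite sequence.

```python
def successor_response(finite_listing):
    """Answer a finite listing of non-negative integers with one it omits."""
    return max(finite_listing, default=-1) + 1

def diagonal_response(listing, n):
    """Return n binary digits which differ from sequence k of the listing at digit k."""
    return [1 - listing(k)(k) for k in range(n)]

def sample_listing(k):
    """Hypothetical listing: sequence k is the binary expansion of the integer k."""
    return lambda j: (k >> j) & 1

missing_integer = successor_response([3, 1, 4])
missing_digits = diagonal_response(sample_listing, 8)
```

In neither case is an infinite or uncountable list ever built. The procedure only shows how to defeat whichever listing is offered.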
2.11.21 Remark: Proofs of infinity and uncountability use a challenge/response method. Both the inductive method of proof of the infinity of a set and the diagonal method of proof of the uncountability of a set may be regarded as “challenge/response” proof methods. In each case, the assertion is challenged by a finite (in the first case) or countable (in the second case) listing of the elements of the set. Then the response to this challenge is the construction of an element of the set which is not in the putative listing. Thus the construction in the proof is the negative response to the attempted challenge to the main assertion. The constructive negative response is effectively an algorithm which can deal with an infinite number of possible challenges. But none of those challenges is presented explicitly. Thus the response is more like a template for an infinite number of responses.
2.11.22 Remark: The difficulty of imagining the termination of an infinite sequence. Saying that “set X is infinite” is a bit like saying: “I am the Ruler of the Universe.” The mere writing of a proposition does not make it true. Within axiomatized set theory, there is usually an axiom which states that infinite sets exist. This does not prove anything at all. Although the integers may seem to be obviously infinite because of the principle of mathematical induction, the inductive method of reasoning is not valid in real life. The fact that we cannot imagine that there is a point at which a sequence terminates does not imply that the sequence does not terminate. Life is full of experiences which one cannot imagine coming to an end. But they do end anyway. Zeno’s paradoxes relied upon the inability to imagine infinite processes. If one cannot imagine an infinite sequence, one cannot imagine how the sequence may eventually terminate. If someone claims to be the Ruler of the Universe, it may not be possible to prove otherwise.
But the inability to disprove an assertion does not imply the validity of the assertion. In particular, it may be that the assertion is meaningless. It is quite meaningful to say that one cannot imagine that there is a greatest integer. This just means that no attempt to utter the largest integer can succeed, or at least we cannot imagine that such an attempt would succeed. But this does not prove that the set of integers is infinite, unless one understands “infinite” in the literal sense that the set has no end.
In each case, when proving that a set is infinite or uncountable, the proof is obtained by challenging the validity of an assumed listing of elements in the set. Thus no infinite or uncountable list is constructed. This is certainly unsatisfying, which explains why concepts of infinity and uncountability were so deeply controversial in the 19th century.
2.12. Real numbers and infinitesimality
2.12.1 Remark: Finite resolution of real-world measurements. The real numbers arose historically from measurements of length, area, volume, weight, angles and other physical observables. So the real numbers may be regarded as the standard mathematical model for physical measurements. This does not mean that physical things themselves have equations of motion which involve real numbers. It is only the measurements which are modelled by real numbers. Measurements are categorized in philosophy as “phenomena” as opposed to the underlying unknowable entities (the “noumena”) which produce the phenomena. Although fractional numbers, whether they are rational integer fractions, decimal expansions or sexagesimal expansions, may be continued without any obvious limit, one can never make a measurement to infinite accuracy. Real-world measurements work with intervals of uncertainty. However, the human mind can perform induction on decimal expansions, for example, to conclude that, since a measurement can always be made more accurate no matter how accurate it already is, there are no limits. Hence we typically assume the existence of infinite decimal expansions although we can never measure or experience them. (We are accustomed to accepting the existence of things which we cannot experience. For example, we cannot meet all 6,000 million people in the world, as mentioned in Remark 2.10.6. But this does not make us seriously doubt their existence.) The infinite resolution of real numbers causes no serious problem unless someone demonstrates that there is a limit to measurement resolution. For example, if it were shown that time and space are particulate or “atomic”, the real numbers would need to be replaced with something else. Then all of analysis and differential geometry would need to be totally revised in such a framework.
So the sceptical mathematician or physicist should be quite alarmed (as the author is) when a textbook starts with the assertion without proof that space is a 3-dimensional real-number manifold. Such assertions should be tagged as axioms or conjectures, not stated as a-priori knowledge. The observed phenomena have finite accuracy of measurements. The infinite accuracy of the real number system is only an assumed model for the underlying unknowable entities. In the case of solids, liquids and gases, we know that the model is wrong. In the case of space and time, we have no evidence against the real-number model – yet! In recent times, there have been increasing indications that space and time may have a granular character, in particular with reference to the Planck length, which is about $10^{-35}$ metres.
[ Write more about the Planck length. Perhaps also mention the holographic geometry idea: Craig J. Hogan, “Measurement of quantum fluctuations in geometry”, Phys. Rev. D 77, 104031 (2008). My personal suspicion is that space-time may be “created” or constructed by vast quantities of gravity particles which cause a “shadow force”. These particles then cause minuscule variations in space-time, in the same way that massive objects like stars cause large-scale curvature of space-time, and perhaps by the identical mechanism. This might then explain the GEO600 noise better than the holographic theory. I’m not a physicist. So I’m just speculating wildly here. But I might be right though! ]
2.12.2 Remark: Questionable validity of real numbers for the physics behind the measurements. During the 19th century, there was much debate about the validity of the real number system. Even in the 20th century, after the logical self-consistency problems had been dealt with, debate has continued on the relevance of the real numbers.
Despite all the apparent logical paradoxes, the real numbers do seem to be at least logically self-consistent, whether or not they correspond to the true nature of physical space. The real numbers do seem to provide a good basic model for physical measurements, although it is still not clear how physical measurements correspond to the underlying “reality”. It is an interesting question whether or not the real numbers will always provide the principal underlying building block for physical models. Since manifolds are built very specifically out of the real numbers, the entire relevance of differential geometry depends crucially on the relevance of the real numbers. Why should we suppose that space and time are infinitely divisible when we can’t even examine them under a microscope? It seems to be merely a collective agreement among scientists that space and time are infinitely divisible and uniform down to any resolution. Coincidentally this is what the real number system is like. So we may be simply projecting our models onto reality. The philosophical problems with the real numbers should not be assumed to be problems with the real world. We cannot be certain that the real numbers are an accurate model for any real-world system at all. Therefore all of differential geometry, which is heavily
[ 2008-5-8: This section has not yet been integrated into this chapter. ]
2.12.3 Remark: Unreachable real numbers. One of the serious philosophical problems with the real numbers is the question of “effective computability”. Since only a countable number of real numbers can be computed within ZF set theory, almost all real numbers must be “ZF-unreachable”. This, unfortunately, means that almost all real numbers are “pixies at the bottom of the garden”. We know they’re there, but we will never be able to write a specification for them. If these numbers are rejected, then there will seem to be a very dense set of “gaps” in the set of real numbers. This could cause big problems for Lebesgue measure theory. But if the existence of unreachable real numbers can be comfortably accepted, this would re-open the question of whether the existence of unreachable choice functions can be accepted with a similar level of comfort. It seems odd, though, to be making statements about objects which can never be written down. It’s a bit like writing the laws of physics for an alternative universe which one has never seen, and provably cannot be seen or experienced in any way. This perspective makes both uncomputable real numbers and axiom-of-choice functions seem distinctly metaphysical. To be told that almost all theorems are unreachable within ZF seems perfectly acceptable. But to be told that almost all real numbers are unreachable within ZF is more disconcerting. One might ask how the physical world copes with these issues. If the real world is somehow operating with real numbers, how does it resolve questions of uncomputability of numbers? Does the real world “skip the gaps” between the effectively computable numbers? There’s obviously something wrong with the real numbers if they have such extraordinary complexity and yet atoms and photons seem to be care-free and nonchalant about it.
Although we can never measure anything to infinite significant figures in the real world, mathematicians fill in the gaps with infinite decimal expansions and such notions. So any infinite sequence of digits is a figment of the imagination anyway. And yet all of analysis is dependent on these infinite expansions. It could be that pixies really do exist, because without them it seems impossible to create a viable analysis and differential geometry. Since analysis is a bedrock underlying almost all physics, and since physics seems to get so many answers right, maybe those pixies at the bottom of the garden really are out there making analysis work correctly. This would take away an argument against the axiom of choice then. All in all, it seems that any infinite sequence of zeros and ones in a binary number should be permitted. They can’t all be reached, but it’s nice to know that they are there. This filling in the gaps with pixie numbers which will never be seen seems to be the lesser of two evils. The fact that we will never be able to write down algorithms to generate most of the real numbers is not sufficient reason to exclude them. The pixie numbers seem to be harmless enough. This contrasts with the situation of the axiom of choice where some important proofs rely upon the throw of the dice to generate choice functions, and then no choice function can be written down. In the case of real numbers, we have a healthy population of reachable real numbers to work with without ever having to postulate existence of unreachable numbers in order to make proofs work. Since we do not rely on the existence of pixie numbers, it will be no great sadness if they are taken away some day, whereas withdrawal of AC from a topic which relies on it causes some pain to discover and weed out the dud theorems.
2.12.4 Remark: The historical motivation for the formalization of real numbers.
In the 21st century, the real numbers seem so natural, it is difficult to understand why there was ever a need to formalize the real numbers and establish the validity of the concept. The following quote from Bell [190], pages 519–520, helps to clarify the problem which needs to be solved.
If two rational numbers are equal, it is no doubt obvious that their square roots are equal. Thus $2 \times 3$ and $6$ are equal; so also then are $\sqrt{2 \times 3}$ and $\sqrt{6}$. But it is not obvious that $\sqrt{2} \times \sqrt{3} = \sqrt{2 \times 3}$, and hence $\sqrt{2} \times \sqrt{3} = \sqrt{6}$. The un-obviousness of this simple assumed equality, $\sqrt{2} \times \sqrt{3} = \sqrt{6}$, taken for granted in school arithmetic, is evident if we visualize what the equality implies: the “lawless” square roots of 2, 3, 6 are to be extracted, the first two of these are then to be multiplied together, and the result is to come out equal to the third. As no one of these three roots can be extracted exactly, no matter to how many decimal places the computation is carried, it is clear that the verification by multiplication as just described will never be complete. The whole human
based on real number systems $\mathbb{R}^n$, must be regarded as only a tool-box for creating models which are suitable until proven unsuitable. It is difficult to re-examine one’s assumptions about numbers after a lifetime immersed in conventional thinking. Since the role of scientists is partly to overthrow orthodoxy in the name of progress, it is important to take an aggressive attitude towards the weak points of conventional thinking.
2.12.5 Remark: The requirement to include irrationals in the real numbers.
The need for irrational numbers was not always accepted. In the 19th century, Kronecker urged mathematicians to build all of mathematics without irrationals. Mathematics could have continued without irrationals, but the physical sciences would probably have been seriously handicapped by their lack. Bell [190], pages 521–522, made the following comment about irrational numbers and infinite concepts.
“It depends upon the individual mathematician’s level of sophistication whether he regards these difficulties as relevant or of no consequence for the consistent development of mathematics. The courageous analyst goes boldly ahead, piling one Babel on top of another and trusting that no outraged god of reason will confound him and all his works, while the critical logician, peering cynically at the foundations of his brother’s imposing skyscraper, makes a rapid mental calculation predicting the date of collapse. In the meantime all are busy and all seem to be enjoying themselves. But one conclusion appears to be inescapable: without a consistent theory of the mathematical infinite there is no theory of irrationals: without a theory of irrationals there is no mathematical analysis in any form even remotely resembling what we now have; and finally, without analysis the major part of mathematics, including geometry and most of applied mathematics, as it now exists would cease to exist. The most important task confronting mathematicians would therefore seem to be the construction of a satisfactory theory of the infinite.”
Mathematics sometimes seems like an exhausting series of mind-stretches. First one has to stretch one’s ideas about integers to accept enormously large integers. Then one must accept negative numbers. Then fractional, algebraic and transcendental numbers. After this, one must accept complex numbers. Then there are real and complex vectors of any dimension, infinite-dimensional vectors and semi-normed topological vector spaces. To add pain to injury, one must somehow accept a dizzying array of transfinite numbers which are “more infinite than infinity”. Beyond this are dark numbers and dark sets which exist but can never be written down.
If there were no benefits to this exhausting series of mind-stretches, only people with a serious personality disorder would study such stuff. Only the amazing success of applications to science and engineering justifies the whole edifice of modern mathematics. The bizarre infinities and abstractions of mathematics cannot be said to be “true” in any sense. Intellectual discomfort is the price of obtaining the analytical power of mathematics.
2.12.6 Remark: Symbolic mathematics software can’t represent all real numbers. Nor can humans.
One of this author’s objections to symbolic mathematics software packages, from the 1970s until recent times, was the fact that such packages could not possibly represent all of the real numbers. Therefore such software could only, at best, deal with a limited range of real mathematics. It took a long time to realize that this limitation applies to human beings also. Anything that a mathematician can write down can be represented and manipulated by computer software. So anything which computer software cannot represent cannot be written down by human beings either.
One might counter-argue that mathematicians can think concepts which are not writable. Well, maybe so, but such concepts cannot be transmitted on the communications channels to other mathematicians. The mathematics which is written in books and papers consists only of the writable component of mathematics. All written and spoken mathematics transmits only a finite amount of information. (Even the source files for this book contain only a finite number of bytes of information!)
Therefore there is no reason to exclude computers from the socio-mathematical network. Computers can express everything that mathematicians can, if they are programmed extensively enough. The problem with computers is only that they do not have the naive ontology for mathematics and logic which humans do. It may be, in fact, impossible to codify human in-built ontology well enough to permit a computer to even pretend to understand the meaning of what it is doing. (But the effort to program computers to pretend to understand might nevertheless be worthwhile!)
race toiling incessantly through all its existence could never prove in this way that √2 × √3 = √6. Closer and closer approximations to equality would be attained as time went on, but finality would continue to recede. To make these concepts of “approximation” and “equality” precise, or to replace our first crude conceptions of irrationals by sharper descriptions which will obviate the difficulties indicated, was the task Dedekind set himself in the early 1870’s; his work on Continuity and Irrational Numbers was published in 1872.
Chapter 3 Logic semantics
3.1 Mathematical logic subject development
3.2 General comments on logic
3.3 Modelling, meta-modelling and recursive modelling
3.4 The universality (or otherwise) of modern logic
3.5 Logic in literature
3.6 Proposition-store versus world-view ontology for logic
3.7 A proposition-store ontology for logic
3.8 Undecidable propositions and incomplete information transfer
3.9 The semantics of truth and falsity
3.10 The semantics of logical negation
3.11 Proof by contradiction
3.12 The moods of logical propositions
3.13 Other remarks on the semantics of logic
3.14 Naive mathematics
[ This chapter and the next are currently in the “ideas capture phase”. The grouping of remarks and definitions between and within the sections is not tidy at all. Some of the subsections of this chapter and the next are like mini-essays or sketch-pad notes. Much of the text is currently repetitive or tedious. During the “consolidation phase”, this chapter (and the next) will be totally rearranged. ]
3.0.1 Remark: Mathematical logic is the foundation layer of mathematics.
The mathematics in this book starts at the beginning with mathematical logic, building up differential geometry from the lowest-level concepts because it is so unsatisfying to be told that one’s questions about the fundamentals of a subject are answered in a course one never took, in a book one never read, or in the realm of the “obvious”. Logic is the foundation layer upon which all of mathematics is built. Therefore this chapter and chapter 4 are the “bedrock” on which the following chapters rest. These are also the weakest chapters because they have nothing to rest on except anthropology and some naive notions of logic and set theory.
3.1. Mathematical logic subject development
3.1.1 Remark: Conceptual frameworks are not necessarily “true”.
The framework for logic which is summarized in Remark 3.1.2 is not necessarily “true” or “correct” in any sense. The objective of any model or conceptual framework is to provide a fairly unified, consistent set of concepts which facilitates the study of a wide range of ideas within a subject. A conceptual framework is not very useful if one must constantly make excuses for things which do not fit, or if the effort to make things fit is disproportionate to the benefit. A framework is useful if one can easily find a place within the framework for every concept. Thus, for example, general relativity provides a useful framework for cosmology and astrophysics, and natural selection provides a useful framework for biology and palaeontology.
With the view in Remark 3.1.2 that the primary concept of logic is the “model”, that propositions are properties of models, and that logical argument is a mere algebra for “solving for unknowns” amongst the propositions, many of the paradoxes and discomforts of logic seem to disappear. Therefore this framework can at least be considered to be useful and comfortable.
3.1.2 Remark: High-level overview of mathematical logic.
Mathematical logic may be conceptualized as the following stages of subject development. This may be regarded as a “tour of mathematical logic”. This summary presents the author’s current understanding of the “true nature” of mathematical logic.
(1) Define truth functions on concrete proposition domains: For any (naive) set 𝒫 of concrete propositions, a function τ : 𝒫 → {F, T} is assumed to be defined. This is merely a model. Like any other model, there may be differences between the model and the system being modelled. If the model has errors, it must be corrected or abandoned. In this model, each proposition P ∈ 𝒫 has a truth value τ(P) which may be true (T) or false (F). As in the case of any model, the values τ(P) may be unknown although some relations between the values may be assumed a-priori as part of the modelling. The objective of the modelling is to solve for unknown truth values in terms of a-priori assumptions.
(2) Define a set of abstract names for concrete propositions: An abstract proposition name space N is defined so that propositions may be referred to conveniently. This is just like using symbols x, y, A, B, f, g etc. to refer to numbers, sets or functions. Thus we may write P = “The Sun rises in the East.” Then P is easier to write than the full sentence. It is understood that the abstract names refer to arbitrary propositions. The proposition name map may be denoted as µ : N → 𝒫. Then t = τ ◦ µ : N → {F, T} maps abstract proposition names to the truth values of the propositions to which they refer.
(3) Define logical expressions: Functions of the truth values on a concrete proposition domain 𝒫 may be written as logical expressions which can be parsed to determine the functions which are intended. For example, the logical expression A ⇒ B refers to the function (A, B) ↦ φ_⇒(t(A), t(B)), where φ_⇒ : {F, T}² → {F, T} is defined by φ_⇒(x, y) = F if (x, y) = (T, F), otherwise φ_⇒(x, y) = T. More generally, truth functions have the form φ : {F, T}ⁿ → {F, T} for non-negative integers n.
(4) Define “logical algebra” problems: Just as in the case of elementary real-number algebra, it is possible to define “logical algebra” problems where some functions (i.e. logical expressions) have known values and the task is to solve for the values of individual propositions, the “unknowns”. For example, given that t(A ⇒ B) = T and t(B ⇒ A) = F, solve for t(A) and t(B). (In this example, the only solution is: t(A) = F and t(B) = T. Note that this does not follow, at this stage, from propositional calculus. The solution follows from the definitions of the functions which are represented by logical expressions.) The “solutions” of logical algebra problems are not necessarily the truth values of all individual propositions. Sometimes the objective is to determine the truth values of particular logical expressions rather than the individual propositions. Solutions of logical algebra problems are called “theorems”. (To make things confusing, theorems are sometimes called “propositions”. Such confusing terminology should be avoided.)
(5) Formalize logical algebra methods as “propositional calculus”: The methods of solving logical algebra problems may be written in a formal language as a sequence of statement lines which use a small range of symbols and follow a small, explicit set of rules. Therefore the methods of logical algebra may be formalized as a strict, explicit set of rules and assumptions. Any such set of rules may be referred to as a “propositional calculus” if it always yields correct conclusions from given assumptions. Thus propositional calculus may be defined as “formalized logical algebra”. The strict formalization of logical algebra removes the need to understand the meanings of statements. If the rules are followed mechanically, true logical expressions always follow from true logical expressions.
(6) Extend propositions to parametrized proposition families: In many applications of logic, it is most efficient to organize very large (or even infinite) sets of propositions into families, where a proposition is obtained for each value of a parameter called a “variable”. The families are called “predicates”. Let Q denote the set of predicates and V the set of variables. Then the expression F(x) ∈ N is a proposition name for every F ∈ Q and x ∈ V. (It is convenient to use a common variable space V for all predicates. So t(F(x)) = T is assumed for all x for which the expression F(x) is undefined.) Predicates may accept multiple variables as parameters. Such predicates have the form F : Vᵏ → N for non-negative integers k. This is formally equivalent to replacing the set V of variables with the set of sequences of variables: V⁰ ∪ V¹ ∪ V² ∪ V³ ∪ …. (Strictly speaking, one should distinguish between variables in V and variable names in N_V, and there should be a variable name map µ_V : N_V → V.)
[ Modify the presentation here to correctly differentiate between variables and variable names. ]
(7) Define quantifiers to represent bulk conjunctions and disjunctions: It often happens that one wishes to form the conjunction of all propositions in a family F ∈ Q. The long way to do this is F(x₁) ∧ F(x₂) ∧ F(x₃) ∧ F(x₄) …, iterating explicitly over all variable values. The short way to do this is ∀x, F(x). This is symbolic notation for “t(F(x)) = T for all x ∈ V”. (This shows the advantage of having a common variable space. The symbolic notation does not need to specify it.) Similarly ∃x, F(x) means “t(F(x)) = T for some x ∈ V”. The long way to do this is F(x₁) ∨ F(x₂) ∨ F(x₃) ∨ F(x₄) …. The sub-expressions “∀x” and “∃x” are called “quantifiers”. The “dummy variable” or “bound variable” x may be any symbol. (Generally these dummy variables are drawn from the same set of variable names N_V as for other variables.) It would be possible to define many other kinds of “bulk logical operators”, but it turns out that two quantifiers are enough. Other bulk logical operators can be defined in terms of these two.
(8) Define logical expressions for parametrized proposition families: Just as in the case of simple propositions, stage (3), a syntax may be defined for the logical expressions which are permitted for parametrized proposition families. Such logical expressions include all of the syntax of simple propositions together with constructions which use quantifiers. As in stage (3), the symbolic expressions are merely notations for functions which must be determined by parsing the expressions. This kind of logic is called “predicate logic”.
(9) Define “logical algebra” problems for parametrized proposition families: Just as in the case of simple propositions, stage (4), “logical algebra” problems may be defined for parametrized proposition families. To distinguish these two cases, one could call the logical algebra in stage (4) “proposition algebra”, whereas the proposition families version could be called “predicate algebra”.
(10) Formalize predicate algebra methods as “predicate calculus”: Just as in the case of simple propositions, stage (5), the methods of solution for parametrized proposition family problems (“predicate algebra”) may be strictly formalized as rules of deduction for sequences of lines of symbolic text with a well-defined syntax. Any such formalization is called a “predicate calculus”. (Sometimes the expression “first order language” is used.) As in stage (5), the strict formalization of predicate algebra removes the need to understand the meanings of statements. If the rules are followed mechanically, true logical expressions always follow from true logical expressions.
(11) Add logical functions to parametrized proposition family logic: This is an extension of the parametrized proposition family logic in stage (6) to include functions of the form f : Vᵏ → V for non-negative integers k. Then the parameter x of any proposition F(x) may be replaced with the value of such a function. For example, the expression F(f(x, y, z)) yields a proposition for f : V³ → V, F ∈ Q and x, y, z ∈ V. Various other such extensions are also possible. The quantifiers in stage (7), the logical expressions in stage (8), the predicate algebra in stage (9) and the predicate calculus in stage (10) may then be defined in the same way as before.
(12) Define set theories in terms of predicate calculus: The framework in stage (11) is now expressive enough to define set theories. A set theory (such as Zermelo-Fraenkel or Neumann-Bernays-Gödel) generally requires particular sets of predicates, functions and variables, together with a set of axioms (which are particular kinds of deduction rules).
(13) Define abstract structures outside set theories: Many kinds of structures, such as number systems and groups, may be defined purely linguistically in a predicate calculus without basing these structures on set theory. Sets are merely one kind of system which can be defined within a predicate calculus. However, a typical set theory has a rich enough structure to enable many other kinds of structures to be defined with it. Therefore many kinds of mathematical structures are defined directly in terms of sets to save the burden of developing a separate language and calculus. Structures which are defined within set theory may be regarded as “representations” of the real structures. But all structures defined within a predicate calculus are merely models anyway. Sometimes it is convenient to define multiple representations of a single structure concept. (Examples of this are tensor algebras and tangent bundles.)
(14) Extend set theories by including outside abstract structures: Structures defined inside and outside set theory may be combined by applying set construction methods to sets of objects which are not strictly set objects. The objects in ZF and NBG set theory are all sets (or classes). But one may define structures outside set theory and insert them into a set-theory structure. For example, one may define integers outside set theory with the Peano axioms. The integers 1, 2, 3, … which are defined in such a non-set-theoretic structure may then be permitted to participate in a set theory. Thus the set {1, 2, 3} would be a valid construction within a set theory even though the symbols 1, 2 and 3 do not represent sets.
Perhaps the most difficult part of arriving at the above overview is forgetting what one “knows” from immersion in the subject. It is generally more difficult to lose a false belief than to acquire it in the first place.
It follows from the above summary that mathematics cannot be identified exactly with any set theory. Set theories and other logical systems are merely models for systems which are assumed to exist outside the logical systems. The formal languages and methods of mathematics may be identified to a great extent with set theories and other logical systems, but the semantics always lies outside. This is true in the same sense that ordinary text and images generally refer to things which lie outside the text and images.
Formal logic does not formalize the concrete objects of mathematics. It only formalizes what people write about mathematical objects. Only the language can be formalized, not the objects which are discussed in that language. Mathematical objects reside inside minds.
3.1.3 Remark: Mathematical logic is an art of modelling, whether the subject matter is concrete or not.
There is a parallel between the history of mathematics and the history of art. In the olden days, paintings were all representational. In other words, paintings were supposed to be modelled on visual perceptions of the physical world. But although the original objective of painting was to represent the world as accurately as possible, or at least convince the viewer that the representation was accurate, the advent of cameras made this representational role almost redundant. Photography produced much more accurate representations for a much lower price than any painter could compete with. So the visual arts moved further and further away from realism towards abstract non-representational art. In non-representational art, the methods and techniques were retained, but the original objective to accurately model the visual world was discarded. With modern computer generated imagery, the capability to portray things which are not physically perceived has increased enormously.
Mathematics was originally supposed to represent the real world, for example in arithmetic (for counting and measuring), geometry (for land management and three-dimensional design) and astronomy (for navigation and curiosity). The demands of science required rapid development of the capabilities of mathematics to represent ever more bizarre conceptions of the physical world. But as in the case of the visual arts, the methods and techniques of mathematics took on a life of their own. Increasingly during the 19th and 20th centuries, mathematics took a turn towards the abstract. It was no longer felt necessary for pure mathematics to represent anything at all. Sometimes fortuitously such non-representational mathematics turned out to be useful for modelling something or other in the real world, and this justified further research into models which modelled nothing.
The fact that a large proportion of the models of logic and mathematics no longer represent anything in the perceived world does not change the fact that the structures of logic and mathematics are indeed models. Just as paintings are not necessarily paintings of physical things, but are still paintings, so also the structures of mathematics are not necessarily models of physical things, but they are still models.
It follows from this that mathematics is an art of modelling, not a science of eternal truth. Therefore it is pointless to ask whether mathematical logic is correct or not. It simply does not matter. Logic provides tools, techniques and methods for building and studying models. Whether those models correspond to anyone’s intuitive idea of “correct” logic is a matter of taste and applicability.
3.1.4 Remark: The excluded middle is non-negotiable. Logical argumentation must guarantee it.
The logic development schema in Remark 3.1.2 guarantees the “excluded middle”. As mentioned in Remark 3.1.3, it does not matter whether logic “in the wild” satisfies the excluded middle. The fundamental starting assumption for logic models is that propositions can be true or false. All of the axioms and rules of deduction in logic are mere “techniques of solution” of “logical equations”. If those techniques arrive at the conclusion that a proposition is both true and false (or neither true nor false), then it is the techniques which are faulty.
Similarly, the logic development schema in Remark 3.1.2 guarantees the validity of “proof by contradiction”. If the adoption of a tentative truth value for a proposition leads to a contradiction, then either the method of argument is faulty, or the proposed axioms are incompatible with any valid logical system model, or the tentative assumption was false. So one may validly infer that if the axioms are self-consistent and the argument is error-free, then the tentative assumption must be false.
The principal assertion of Remark 3.1.2 is that the excluded middle is primary and logical argumentation is secondary. If there is a clash, it is the method of argument which must give way.
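The “logical algebra” of stages (3), (4) and (7) in Remark 3.1.2 can be illustrated by a small computation. The following Python fragment is only a sketch under stated assumptions: the names phi_imp, solve, forall and exists are hypothetical and are not part of this book's notation, truth values T and F are represented by the Python values True and False, and the variable space is assumed to be finite so that quantifiers can be evaluated by explicit iteration. Because every candidate map t takes values in {False, True}, the excluded middle is built into the search space, which is the point made in Remark 3.1.4.

```python
from itertools import product

def phi_imp(x: bool, y: bool) -> bool:
    """Truth function for implication: F only when (x, y) = (T, F)."""
    return not (x and not y)

def solve(names, constraints):
    """Return every truth-value map t : names -> {False, True}
    which satisfies all of the given constraints (brute-force search)."""
    solutions = []
    for values in product([False, True], repeat=len(names)):
        t = dict(zip(names, values))
        if all(check(t) for check in constraints):
            solutions.append(t)
    return solutions

# Stage (4) example: given t(A => B) = T and t(B => A) = F, solve for t(A), t(B).
constraints = [
    lambda t: phi_imp(t["A"], t["B"]) is True,
    lambda t: phi_imp(t["B"], t["A"]) is False,
]
solutions = solve(["A", "B"], constraints)
print(solutions)  # [{'A': False, 'B': True}]

# Stage (7) on a finite variable space: quantifiers as bulk conjunction
# and bulk disjunction. (Hypothetical helper names.)
def forall(F, V):
    return all(F(x) for x in V)

def exists(F, V):
    return any(F(x) for x in V)

V = range(1, 6)                      # assumed finite variable space {1,...,5}
print(forall(lambda x: x > 0, V))    # True
print(exists(lambda x: x > 4, V))    # True
```

The search finds exactly one solution, in agreement with the solution stated in stage (4). No rule of deduction is used. The answer follows from the definition of the truth function alone, and a propositional calculus in the sense of stage (5) would be a formal method of reaching the same answer without enumerating all cases.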
3.2. General comments on logic
3.2.1 Remark: Modern logic arose by necessity, from paradoxes and confusion.
Mendelson [164], pages 1–2, says the following on the historical origins of modern mathematical logic.
“Although logic is basic to all other studies, its fundamental and apparently self-evident character discouraged any deep logical investigations until the late nineteenth century. Then, under the impetus of the discovery of non-Euclidean geometries and of the desire to provide a rigorous foundation for analysis, interest in logic revived. This new interest, however, was still rather unenthusiastic until, around the turn of the century, the mathematical world was shocked by the discovery of the paradoxes, i.e. arguments leading to contradictions.”
One could argue that the deeper questions of logic only arose because mathematicians insisted on pushing their naive logic beyond its limits. If they had restricted themselves to fairly ordinary, non-pathological areas of mathematics, there might have been no need for the deep study of logic. However, they did push the boundaries to try to find more and more pathological and infinite sets with the most extreme characteristics. They tried to define ridiculously infinite sets which were of no practical value, which led to the paradoxes of Burali-Forti (1897), Cantor (1899) and Russell (1902). They tried to calculate the area beneath the nastiest possible curves, which culminated in Lebesgue non-measurable sets (Vitali, 1905), which were nonconstructible. Paradoxes were, in hindsight, inevitable at the boundaries of the scope of naive logic.
It could perhaps be argued that it was the extreme extrapolation of naive logic which led inevitably to the complexities and abstractions of modern mathematical logic. Conversely, one could argue that there is little need for the intellectual intensity of mathematical logic if mathematicians restrict themselves to reasonable constructions and concepts. In particular, the paradoxes and tortuous entanglements of modern logic can probably be safely ignored by physicists and other users of mathematics.
3.2.2 Remark: Symbolic logic is not the basis of all rationality.
Symbolic logic should not be taken to be absolutely true in any sense. The book by Lakoff/Núñez [172], page 8, makes the following comment on this subject.
“Symbolic logic is not the basis of all rationality, and it is not absolutely true. It is a beautiful metaphorical system, which has some rather bizarre metaphors. It is useful for certain purposes but quite inadequate for characterizing anything like the full range of the mechanisms of human reason.”
3.2.3 Remark: Most of mathematics can be reduced to symbolic logic.
Logical argument is the art of convincing people (including oneself) of the validity of propositions. But mathematics delivers more than mere propositions. The outputs or “deliverables” of mathematics also include calculations and diagrams. Calculations are arithmètic in character whereas diagrams (and graphs) are geometric in character. However, diagrams and graphs may be coordinatized with numbers, which are arithmètic, and arithmetic may be reduced to logical propositions. Therefore ultimately all mathematical deliverables would seem to be reducible to logical propositions, although the meaning of such propositions requires semantics which lie outside pure symbolic logic. The process of reformulation and interpretation of geometry, arithmetic and analysis in terms of mathematical logic is illustrated in Figure 3.2.1.
Mathematicians don’t just “deliver” propositions, calculations and diagrams. They back up their deliverables with justifications. The purpose of such justifications is to convince the “client” to accept the deliverables.
[ Figure 3.2.1: Reformulation and interpretation of geometry, arithmetic and analysis. Box labels: geometry questions, geometry answers; arithmetic and analysis questions, arithmetic and analysis answers; symbolic logic assumptions, symbolic logic assertions. Arrow labels: reformulation, interpretation, logical arguments. ]
The mathematician is expected to provide a kind of certification of the correctness of the deliverables. This leads to a heavy emphasis on deciding whether a deliverable is “true”. The word “true” means in practice that the deliverable should be accepted. Thus all mathematical deliverables are divided into true and false, and one should accept the true deliverables and reject the false ones. This is the socio-mathematical significance of truth and falsity. Mathematics does much more than merely decide which propositions are true or false. For example, mathematics includes the reduction of arithmetic and geometry to symbolic logic, and the interpretation of the outcomes of logical analysis back to the tasks in arithmetic and geometry where they should be applied.
However, when logic is being applied to real mathematics, such as geometry, arithmetic and analysis as in Remark 3.2.3, the semantics may be rediscovered by retracing the steps of the formulation of these subjects into symbolic logic. 3.2.5 Remark: Mathematics is an unusually precise application of logic. Logic may be applied to a very wide range of contexts. For example, one may pick up a book at random, highlight several sentences in the book, and hypothesize that those sentences are either correct or incorrect. Then the propositions of this logic could be labelled Ai for i = 1, 2,. . . n, for some non-negative integer n, where each Ai means: “Sentence i is correct.” Each of these propositions has the truth value “true” or “false”. One may then notice that there are dependencies between the sentences resulting from a-priori knowledge (or assumptions) about the subject. Thus it may be that A1 ⇒ A2 , meaning that if sentence 1 is correct, then sentence 2 must be correct. By accumulating such information on relations between the truth values of the propositions, one may gradually determine individual truth values. This is similar to the procedures of algebra, where the numerical variables are solved for in terms of known relations between them. The application of logic to pure mathematics is similar to other applications of logic, but the logic of pure mathematics has two exceptional features. (1) Precision of modelling: The modelling of truth and falsity for pure mathematics is assumed to be perfectly accurate. In other words, all propositions are assumed to be either true or false, and all abstract logical operations are mapped verbatim to the propositions of the concrete logic. (2) Completeness of modelling: The logically deducible propositions are the only true propositions. In other words, only the deducible propositions are to be accepted. Any proposition whose truth value is not deducible is simply unknown. There is no presumption of truth until proven false. 
The entire span of the subject is the set of all deducible propositions. Everything outside that span is unknown territory. [ www.topology.org/tex/conc/dg.html ]
3.2.4 Remark: Abstract mathematics removes the semantics. Pure, abstract mathematical logic often seems to be a meaningless intellectual recreation. This is because the original meaning, the semantics, has been removed during rigorous axiomatization. It often seems that semantics is regarded as an optional extra in mathematical logic. Sometimes it is almost impossible to discover the meaning or origins of mathematical concepts.
This is quite different to the application of logic in science and everyday real-world experience. When applying logic to the real world, one never has certainty of the truth value of any proposition, and only a subset of all possible propositions is subjected to logical analysis. The set of propositions to be analyzed is determined by the context of the discussion and the motives of the participants.

The concepts of truth and falsity in abstract logic have no meaning in isolation from the application context. Abstract logic has a set of rules which must be followed, and the words “true” and “false” are merely abstract labels which must be manipulated according to the rules. However, in the application to pure mathematics, a proposition which is determined to be “true” according to the rules must be accepted in the corpus of mathematical knowledge. A proposition which is determined to be “false” according to the rules must be rejected (or ejected) from the mathematical knowledge corpus. Such acceptance and rejection of propositions is absolute in the pure mathematical realm.

3.2.6 Remark: Logic is the art of deducing propositions from other propositions. Since physics is expressed in terms of mathematics, and mathematics is formally justified in terms of logic, one might reasonably conclude that logic is very deep and important and fundamental. However, logic is very much less significant than it may at first seem. Logic is merely the algebra of logical expressions. The methods of logic allow one to deduce logical expressions from other logical expressions. This is very much like elementary algebra, which is a set of methods for solving for algebraic expressions in terms of given algebraic expressions. This analogy is summarized in the following table.

entities       direct method                  indirect method
numbers        measure numbers directly       solve equations to determine numbers
propositions   assess propositions directly   deduce propositions from axioms
In a similar way, I may wish to determine whether the plants in my garden are wet. I could walk outside and measure this directly. However, I may have heard rain falling heavily for the last 30 minutes. So I know indirectly that the plants must be wet. This is because A ⇒ B, where A = “It has been raining for the last 30 minutes.” and B = “The plants in my garden are wet.” If I apply A and A ⇒ B to conclude B, this does not make the conclusion B any more credible than if I measure B directly.

In the case of mathematical logic, the fact that I can deduce 2 + 2 = 4 from the ZF set theory axioms together with the definition of addition does not make the conclusion 2 + 2 = 4 any more credible because I derived it from axioms. In principle, all of the “facts” of mathematics could be determined directly through deep thinking, or experimental observation, or any other means. Logic is, in this sense, useless. But just as we do not wish to measure all physical phenomena directly, but prefer instead to predict the values of measurements from the application of inferred laws of physics, so also it is more efficient to codify mathematics in terms of small sets of laws from which all other “facts” may be deduced. Thus logical argumentation does not make assertions more valid than if they are examined on a case-by-case basis.

Axiomatic systems play the same role in logic that physical models do in physics. By making calculations from such systems or models, it is possible to reduce the burden of direct measurement or assessment. As in the case of physical models, the validity of an axiomatic system is limited by its ability to pass predictive tests. If a physical model makes a prediction which disagrees with observation, the model must be restricted in scope, modified or discarded. The same applies to axiomatic systems. If a set of axioms and logical rules yields conclusions which are unacceptable, the system must be restricted in scope, modified or discarded.
One cannot say that a proposition is given enhanced validity by being deduced from a set of axioms, no matter how reasonable the axioms may seem at first sight. It is sometimes difficult to explain to non-scientists that science does not claim to deliver truth, and that scientists try to disprove their own theories. Scientists only work with models, and models are frequently
For example, one may measure the height H of a building directly with a tape measure. Alternatively one may time the fall of a stone from the top of the building and solve the equation $\frac{1}{2} g t^2 = H$ for the height H in terms of the measured time t. The indirect method is not inherently superior to the direct method. The indirect method exploits a general rule which is inferred from past observations. It may be more efficient economically to use the indirect method. But the estimate of H does not have any greater credibility because the value was arrived at by using an equation.
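As a small worked instance of the indirect method, the following Python sketch evaluates H from a measured fall time. It rests on assumptions not stated in the text: the value 9.81 m/s^2 for g, the neglect of air resistance, and the hypothetical function name `height_from_fall_time`. The fall time of 2.0 seconds is invented for illustration.

```python
G = 9.81  # assumed gravitational acceleration in m/s^2 (not given in the text)

def height_from_fall_time(t_seconds: float, g: float = G) -> float:
    """Indirect estimate of building height H from fall time t, using H = (1/2) g t^2.

    Air resistance is ignored, so this is only as good as the model behind it.
    """
    return 0.5 * g * t_seconds ** 2

# Hypothetical measurement: the stone takes 2.0 seconds to reach the ground.
estimate = height_from_fall_time(2.0)
print(f"Estimated height: {estimate:.2f} m")
```

With these assumed inputs the estimate is about 19.6 metres. As the remark says, the number carries no more credibility than a tape measurement; it merely trades a direct measurement of H for a measurement of t plus an inferred rule.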
“voted out of office” if they do not deliver the right predictions. The fame and career of a scientist are greatly enhanced by disproving generally accepted models. So most experiments are designed to disprove theories, not prove them.

In the same way, mathematicians and logicians should not place too much value on their methods of argumentation and calculation. The axiomatic systems which they develop, and the deduction rules which they apply, must be evaluated according to their ability to generate assertions which are acceptable. One may say that logical argumentation is merely the art of solving simultaneous logical equations. This enables a very large number of logical assertions to be generated from a compact set of axioms and rules. Therefore formal logic is useful for compactly presenting mathematics. But logic can never guarantee that anything is true. Logic never proves anything. Logic only facilitates the compression of a large number of propositions into a small space. Any “truth” that there is in mathematics comes from somewhere else.

3.2.7 Remark: Axioms do not fall from the sky on golden tablets. Axiomatization was famously successful in the case of Euclid’s geometry. It seems likely that the axiomatic minimalism instinct observable in mathematics and logic since that time has been inspired originally by the success of that project. Perhaps the axiomatization programme has gone too far. A subject may be axiomatized so as to generate the desired range of theorems, but then there may be unintended consequences such as paradoxes, pathological examples and counter-intuitive results. Quite often the undesired consequences of an axiomatic system are held to have credibility because of being deduced by logical arguments. Axioms do not fall from the sky on golden tablets. On the other hand, it does take courage to assert a proposition without the support of a deductive argument of some sort.
3.2.8 Remark: Linguistic logic is a higher brain function than real-world modelling. Logic seems to be a “higher” function than mere arithmetic and other modelling activity in the brain. A principal difference between logic and mere modelling is that in logic, one tolerates not knowing whether a hypothesis is true or false. The logical brain considers a proposition to be an “object” whose truth value is an unknown attribute.
The scientist says: “It may be true or it may be false, and we must wait to find out which.” Pre-scientific thinkers just want to know the answer and cannot tolerate the Schrödinger’s cat scenario where the world could have more than one possibility indefinitely. They don’t want to hear about modalities. It seems likely that there was once a pre-logical era where each human had just one world-view at any point in time, with no IFs and no ORs. That is why the ancient Gilgamesh story has no ORs and only nine IFs in 3000 lines. (See Remark 3.5.2.) Probably all humans still have the yearning for certainty, and maybe all humans feel uncomfortable with not knowing the answers to important questions. But most modern people resist the desire to fill in the gaps of ignorance with myths and ancient, simplistic world-pictures.

A question arose in the writing of this book whether logic should follow or precede the sets and numbers chapters. One could argue that logic should come after arithmetic because all animals can do world-modelling and many can do arithmetic, whereas only humans, who have language, are capable of expressing logical propositions, which themselves are “objects” with attributes. However, logic and mathematics in this book do not have to be in the same order in which logic and maths developed among animals on Earth. Our modern approach to maths (and everything else) is immersed in a framework of logical thinking. Mathematics is not taught as a set of fixed and eternal methods any more, like the ancient Egyptian priests did when Thales visited them in about 600bc. The ancient Greeks introduced logic and deduction to mathematics, and they proved propositions. They also speculated actively and with enthusiasm about propositions which could not be decided to be true or false; in other words, paradoxes. The Greeks introduced IFs and ORs and hypotheses to maths, especially to geometry. That is the modern, logical way of thinking.
Thus it seems right to start this book with our modern methods of thinking, which affect everything that follows in the book, even though historically, logical thinking probably dates only to about 750bc, about the
Pre-scientific people were (and are) not able to accept the not-knowing. They need to know which of any two options is true. They need a definite world-model where everything is very definitely one way or the other.
same time as the writing-down of Homer and the very early Olympics. (The last version of Gilgamesh falls into this early “logical era”.) The anthropology of logic could throw a lot of light on the nature of logical thinking. It would be interesting to know whether the tribes of New Guinea and the Brazilian forests discuss hypotheses and have logical arguments. Anthropologists study language, kinship relations, tools, technology, rituals, music and pictorial art, but they do not seem to research the anthropology of logical thinking. Yet logical thinking is probably the biggest thing to happen to the human species since the advent of language. Logical thinking is what made the modern science-based world possible!
3.3. Modelling, meta-modelling and recursive modelling

3.3.1 Remark: Mathematics is the study of models, including analysis and synthesis of models. Mathematics may be defined as the study of models. This includes the analysis and synthesis of models. The techniques of mathematics may be thought of as a toolbox for creating, studying and applying models. Some mathematical models are intended to correspond to the physical or empirical world. Other models are purely abstract because they correspond to processes inside minds, for example. (By comparison, painting was originally the art of depiction of the visible world, but over time, the subjects of depiction became more and more abstract. A totally abstract painting depicts only itself. In the same way, pure, abstract mathematics models nothing but itself.) Applied mathematics is concerned with the application of mathematical models to real-world systems. (The three modelling stages, abstraction, analysis and application, are discussed in Remark 3.9.6.)

3.3.2 Remark: World models are required in all animals in the sensor/motor feedback loop. Animals have sensor and motor paths which must be coordinated. The manipulation of the environment, and of the animal’s own body, requires a feedback loop to ensure that the motor path sends the right signals to achieve an objective. (See Figure 3.3.1.) It is almost obvious that some sort of model of the world must be implemented within the organism. Minds arise from motion and manipulation.
[ Figure 3.3.1: Modelling of the world by an organism. Diagram labels: noumena, world, organism, model, sensor path, motor path. ]
[ Make a table of definitions of logic “structure”, “interpretation” and “model” by various authors. ]

3.3.3 Remark: The logic literature uses the word “model” differently. The word “model” is widely used in the logic literature in the reverse sense to the meaning in this book. Shoenfield [168], page 18, first defines a “structure” for an abstract (first-order) language to be a set of concrete predicates, functions and variables, together with maps between the abstract and concrete spaces, as outlined in Remark 4.12.3. If such a structure maps true abstract propositions only to true concrete propositions, it is then called a “model”. (Shoenfield [168], page 22.) Mendelson [164], page 49, defines an “interpretation” with the same meaning as a “structure”, just mentioned. Then a “model” is defined as an interpretation with valid maps in the same way as just mentioned. (Mendelson [164], page 51.) Shoenfield [168], pages 61, 62, 260, seems to define an “interpretation” as more or less the same as a “model”.

Here the abstract language is regarded as the model for the concrete logical system which is being discussed. It is perhaps understandable that logicians would regard the abstract logic as the “real thing” and the concrete system as a mere “structure”, “interpretation” or “model”. But in the scientific context, generally the abstract description is regarded as a model for the concrete system being studied. It would be accurate to use the word “application” for a valid “structure” or “interpretation” of a language rather than a “model”.
3.3.4 Remark: Discussion context and discussed context. Logic takes place within a “network of discussions”. Every logic discussion has a “discussion context” A and a “discussed context” B. But the discussed context B may itself be a discussion context which is dealing with a third discussed context C. (See Figure 3.3.2.) The third context may be referred to as a “meta-discussion context”.
[ Figure 3.3.2: Modelling of modelling. Diagram labels: model 1, model 2, machine 1, machine 2, machine 3. ]
Difficulties arise when one attempts to combine multiple discussion contexts in a cyclic fashion. For example, logicians discuss mathematics and comment on its validity, while mathematicians likewise discuss logic and comment on its validity. There is no cycle-free hierarchy of superior intellect here. The fact that one group of people A discuss the activities of another group of people B does not imply that A has superior knowledge above B. It is always “interesting” (and often exasperating) to see one’s own discipline being discussed by another discipline. This kind of interdisciplinary “critique” can lead to hostilities sometimes. (There are many strategies for prevailing in such conflicts. For example, “embrace and extend”. I.e. make “contributions” to the literature of the other discipline. Alternatively just ignore them if you have some serious work to do.) Logic is just one aspect of the way people think when they are discussing any subject. The subject matter may be anything at all, including the written or spoken discussions of other people. Since logic is often part of the discussed context, logic may be part of the subject of discussion as well as being part of the discussion itself. Confusion can be avoided in discussions of logic by distinguishing clearly between multiple contexts. Confusion can be maximized by mixing multiple contexts without indicating which portions of the discussion belong to each context. 3.3.5 Remark: Models and meta-models may co-exist within one logic machine. Meta-modelling is the modelling of a model. Both the model and the meta-model may co-exist within one logic machine. This is illustrated in Figure 3.3.3. Model 1 is a meta-model for machine 3 in this diagram and also in Figure 3.3.2.
[ Figure 3.3.3: Meta-modelling within one logic machine (or mind). Diagram labels: model 1, model 2, machine 1 = machine 2, machine 3. ]
The 5-layer model illustrated in Figure 4.5.1 in Remark 4.5.4 typically refers to three different contexts. Layer 1 belongs to a discussed context. Layers 2 and 3 belong to a discussion context. Layers 4 and 5 belong to a meta-discussion context. As mentioned in Remark 2.2.7, a discussion context often influences the discussed context. (For example, grammar books and dictionaries influence the development of the languages which they describe.) So the boundaries of the contexts are sometimes difficult to determine.
[ Figure 3.3.4: Recursive modelling of one machine by a second machine. Diagram labels: model 1, model 2, machine 1, machine 2, with alternating nested PC and ZF boxes. ]
3.3.6 Remark: Recursive modelling of one machine by a second machine. If two modelling machines are so foolhardy as to attempt to model each other, the result is recursive madness. This is illustrated in Figure 3.3.4. For example, Zermelo-Fraenkel set theory (ZF) may be modelled within propositional calculus (PC), and PC may be modelled within ZF. One should not examine recursive models too closely. It is particularly inadvisable to try to perform recursive modelling inside a single mind as in Figure 3.3.5. (Attempting to do this can adversely affect your sleep pattern!) One can at least hope that the external modelled system, machine 3 in this case, will not be as deeply recursive as the abstract model suggests.

[ Figure 3.3.5: Recursive modelling inside a single machine (or mind). Diagram labels: model 1, model 2, machine 1 = machine 2, machine 3, with a PC + ZF box containing alternating nested PC and ZF boxes. ]
It is difficult to discuss self-consistency for a recursive model because one can never really define anything fully. One can only define things in terms of other things, which are defined in terms of yet other things, and so forth. However, one could perhaps discuss “coherence” as a second best. 3.3.7 Remark: In practice, modelling loops do not become infinite. The recursive modelling situations alluded to in Remark 3.3.6 are not quite as disastrous as they look. The reason for this is the same as the reason why positive feedback in audio systems does not exhibit the theoretical infinite runaway behaviour which the simplest mathematical models predict. The voltages in an audio system do not reach billions of volts. Nor does the sound level destroy the Earth’s atmosphere by over-heating. Audio systems rapidly reach the limits of the simplest kind of model, after which the behaviour is constrained by those limits. In the case of modelling loops, the definitions are never expanded to an infinite extent. For example, a ZF set theory concept may be converted to pure predicate logic (such as a first order language). The pure logic may then be written in terms of the set theory of the concrete propositions domain and the various relations and maps which are required to formalize logic. Those sets, relations and maps may then be converted into pure predicate logic again. In practice, finite machines such as human brains never perform this unbounded expansion very far. The expansion is limited by practical concerns. If the expansion is performed on computers, memory space is generally an effective limitation. All of the infinite aspects of logic and set theory are only potential. The axioms and rules define the range of what can be expressed within logic and set theory. The full range cannot be instantiated in practice. 
Unbounded traversals through set theory and logic, following the recursive definitions of each in terms of the other, only need to find no contradictions in those traversals. One may say that the combined recursive system is “coherent” if no contradictions can happen in any such traversal.
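The bounded expansion of mutually recursive definitions described in this remark can be illustrated with a toy Python sketch. Everything in it is a hypothetical illustration rather than part of the book's framework: the table `DEFINED_IN_TERMS_OF` merely records that ZF is expressed in terms of PC and vice versa, and `max_depth` stands in for the practical limits (time, memory) which stop a finite machine from expanding further.

```python
from typing import Dict, List

# Hypothetical toy table: each system is "defined in terms of" the other.
DEFINED_IN_TERMS_OF: Dict[str, List[str]] = {"ZF": ["PC"], "PC": ["ZF"]}

def expand(term: str, max_depth: int) -> str:
    """Expand a term through its definitions, stopping after max_depth levels."""
    if max_depth == 0:
        # Practical limit reached: leave the term unexpanded.
        return term
    parts = [expand(t, max_depth - 1) for t in DEFINED_IN_TERMS_OF.get(term, [])]
    return f"{term}[{', '.join(parts)}]" if parts else term

expansion = expand("ZF", 3)
print(expansion)
```

Although the definitions are circular, every call terminates because the depth budget is finite. The infinite nesting is only potential, as in the nested boxes of the recursive modelling figures.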
3.3.8 Remark: Logic and set theory do not build anything. Customers must provide everything. In a sense, there is no real problem with the circularity of logic and set theory. Both of these theories merely tell you that if you have a system which may be accurately modelled by the rules and axioms which you propose for them, then there are various consequences (i.e. theorems). The methods and techniques of logic and set theory do not build anything! They only tell you the consequences of rules and axioms which are presumed to be valid for a system which you must provide yourself. If you have no such system to be modelled, then no conclusions can be drawn.

To put it simply, logic and set theory tell you: “Give me a system which satisfies these rules and axioms, and I will tell you some conclusions which can be drawn.” In other words, the spaces of propositions and sets must be built and provided by other means. The apparent mutual circularity of logic and set theory disappears when it is realized that the actual construction of concrete logic and set theory systems is “somebody else’s problem”! Logicians and set theorists are only service providers, not manufacturers.

So it is okay that a logical system requires the a-priori provision of sets of logical propositions, predicates, variables and names. The methods of logic do not have to provide these. They are assumed to satisfy, for example, the rules and axioms of ZF or NBG set theory. It is the customer’s responsibility to provide these sets and ensure that they meet the prerequisites. Likewise it is okay that a set theory requires the a-priori provision of a logical system within which the set theory axioms may be presented and developed. Once again, it is the customer’s responsibility to provide this. Set theory delivers no conclusions at all unless a logical system is provided which meets the service provider’s specifications. Logic and set theory may be combined.
The logic service provider requires you to provide a set theory which meets the requirements of the set theory service provider. And vice versa. If the customers are unable to provide the prerequisites in advance, the service providers will not be able to deliver any valid services at any price. 3.3.9 Remark: Truth-value status of propositions depends on logical system context. Within a logic machine A, all propositions which are proved in A are true and are believed without question. But from the point of view of a machine B which models machine A, all proved propositions in A are completely arbitrary, since the propositions in A are merely entities within a modelled system. If A is also modelling B, the propositions in machine B seem equally arbitrary to A. This kind of situation is familiar when the “logic machines” are human minds. But the situation also arises when two symbolic logic systems are able to model each other. Then, depending on which logical system context one is thinking within, each proposition may seem to be either unquestionably true or else completely arbitrary. Thus the truth-value status of propositions is highly dependent on the discussion context. Consequently one can never say that anything in mathematics is true or false. Everything is context-dependent. 3.3.10 Remark: Mathematical logic may be useful for ethics discussions, subject to conditions. A clear distinction is generally made between empirical propositions and value propositions. The empirical category includes: “You are eating the ice cream.” The value category includes: “You should not eat the ice cream.” This division of propositions into IS-propositions and SHOULD-propositions is not always completely clear. But science is generally held to be in the first category, while ethics is held to be in the second category. Logic may be applied to both the empirical and ethics categories of propositions. 
However, ethics propositions generally depend very much upon empirical propositions whereas empirical propositions are not supposed to depend on value considerations. At least, empirical propositions should not depend on ethical propositions. In practice, the two-state yes/no requirement for logical propositions is easier to satisfy with empirical propositions than with value propositions. Value statements are often subjective, controversial, ambiguous or meaningless. Standard mathematical logic absolutely requires two mutually exclusive states for each
In terms of the proposition-store ontology for logic in Section 3.7, one may say that any logical system is only “finitely populated” at any point in time. In other words, despite the theoretically infinite number of propositions which can be asserted within most logical systems, the proposition store never becomes infinitely populated with propositions. (One might compare this with the Earth, which only has a finite population at any point in time.) The axioms and rules of a logical system determine what propositional offspring is permitted from any group of parent propositions. The expansion of definitions follows similar breeding rules.
proposition. If this requirement can be satisfied, mathematical logic may be a useful model for ethical discussions, and logical tools may be of some value. Laws (for human conduct, not scientific laws) are very often written in the form A ⇒ B, where A is an empirical proposition and B is an imperative proposition. When B is some sort of punishment, it is implicit that A should not be done. In other words, A is then a value proposition. But when the law is applied, A is generally empirical (to be determined by evidence) whereas B is effectively a value statement which says that action B should or must be carried out. There is clearly ample opportunity here for ambiguity. On the other hand, laws are probably among the earliest applications of logic, with a history of thousands of years. The antecedent A of a law A ⇒ B is often a compound logical expression with many component propositions.
3.4. The universality (or otherwise) of modern logic

3.4.1 Remark: The correctness of mathematical logic is proved by robots on Mars. If logic is a mere anthropological observable, one might reasonably ask why our culture has so much certainty in the objectivity and universality of its logical processes. Is there any objective measure of the “correctness” of our culture of logical argument? After all, there have been other systems of logic within recorded history which are different to our current logic.
A person commencing education in our culture must make great efforts to accept the propositions and rules of argumentation of our society. People would presumably not make such effort if there were no incentives, but the incentives are very strong. Human beings obtain great power through their current methods of logical argumentation, mathematics and science. It is quite possible that if, in future, the outputs from science are viewed as bad, our society could abandon the currently fashionable logic and adopt other methods of argumentation. 3.4.2 Remark: Aristotle’s logic may have been popular because of Alexander’s success. The case could be made that Aristotle’s logic was held in high esteem by medieval Europe partly because he was well known to have been a tutor of Alexander the Great. Aristotle became tutor to Alexander in either 343bc (Russell [185], page 173; Barnes [186], page 10) or 342bc (Cotterill [180], page 468). According to Russell, Aristotle tutored Alexander between the ages of 13 and 16 years. The association of Aristotle with Alexander was very well known (Russell [185], page 174). At the death of Alexander, the Athenians rebelled, and turned on his friends, including Aristotle, who was indicted for impiety, but, unlike Socrates, fled to avoid punishment. This is a reference to the execution of Socrates in 399bc, partly on account of his tutoring of such unpopular politicians as Alcibiades. Similarly, Barnes [186], page 11, says: In 323 Alexander died in Babylon. When the news reached Athens Aristotle, unwilling to share the fate of Socrates, left the city lest the Athenians put a second philosopher to death. Russell [185], pages 206–212, is quite critical of Aristotle’s logic and describes numerous errors and points of confusion. On page 206, he says the following about the undeserved influence of Aristotelian logic. Aristotle’s influence, which was very great in many different fields, was greatest of all in logic. 
In late antiquity, when Plato was still supreme in metaphysics, Aristotle was the recognized authority in logic, and he retained this position throughout the Middle Ages. It was not till the thirteenth century that Christian philosophers accorded him supremacy in the field of metaphysics. This supremacy was largely lost after the Renaissance, but his supremacy in logic survived. Even at the present day, all Catholic teachers of philosophy and many others still obstinately reject the discoveries of modern logic, and adhere with a strange tenacity to a system which is as definitely antiquated as Ptolemaic astronomy. This makes it difficult to do historical justice to Aristotle.
An extraterrestrial observer could note that our culture has sent robots to Mars. Previous cultures have not done so. It seems to be a fair statement that the ability to engineer martian robots, and navigate them through the Solar System’s gravitational fields and other obstacles to land them on Mars and transmit close-up photos and measurements back to Earth, is an objective indication that our logic is well attuned to the nature of the Universe. One could point to any of the accomplishments of the last century as supporting evidence for the superiority of our logic, mathematics, physics, chemistry and engineering.
78
3. Logic semantics His present-day influence is so inimical to clear thinking that it is hard to remember how great an advance he made upon all his predecessors (including Plato), or how admirable his logical work would seem if it had been a stage in a continual progress, instead of being (as in fact it was) a dead end, followed by over two thousand years of stagnation.
It seems plausible that the strong, broad acceptance of Aristotelian logic for such a long time was influenced by the very high regard in which Alexander’s empire-building capability was held. This would seem to support the idea that people accept the logic framework which they believe has the most prestige or the greatest power. Therefore our current system of logic may not be so absolute and universal as it seems. One could go further and speculate that all knowledge and “truth” is determined by prestige, power and economic self-interest. If science ever stops delivering concrete benefits, the world could return once more to a pre-scientific dark age. 3.4.3 Remark: Mathematical logic is an imperfect model for the physical world. It is very difficult indeed to see how logic could possibly be wrong. It seems that logic is so fundamental and obvious, it must be beyond doubt. The propositional calculus should be beyond doubt too. How can one find anything dubious in simple, basic logic? As a human calculational activity, as a set of procedures, one can no more cast doubt on simple logic than one can disprove the rules of the game of chess. Logic is a well-defined set of procedures which are agreed upon by large numbers of people. Anyone who adheres to the rules should arrive at the same conclusions in all cases.
The doubtfulness of logic lies in its applicability to real-life propositions. This is analogous to the way in which the integers have well-defined rules, although the applicability of the integers to the real world is dubious. There are no collections of physical objects with 10^1000 elements. So the interpretation of integers as cardinalities of collections of objects is dubious. All but a finite number of integers must be “grey numbers”, as mentioned in Remark 2.11.12. So almost all integers cannot even be thought about, or written down, or represented in computers. Even the counting of small numbers of physical objects is quite dubious, as mentioned in Remark 2.11.19. So the integers are an abstraction which is merely a model for the real world. As the old saying goes, the real world is only an approximation to the models of theoretical physics. Similarly, real-world counting is a mere approximation to the mathematical integers, and real-world logic is only an approximation to the crisp perfection of abstract logic.

Logical propositions are an abstraction from real-world propositions. In the real world, no proposition is ever true or false beyond all doubt. This is partly because of quantum mechanics. But it also follows from the unreliability of human observation. Even the state of a human mind is highly variable. It is impossible to know what anyone means when they say that a proposition is “true”. A person’s beliefs may change over time. So during a time of transition, the truth value is in doubt. Also, most propositions are contingent on other propositions, which are themselves in doubt. All empirical knowledge has a probabilistic or statistical nature. Most propositions are open to a wide range of interpretations. The definition of “truth” is itself very elusive. Truth is such a fundamental concept that it is impossible to define, just as probability cannot ultimately be defined in concrete terms. Truth is a “primitive concept” which is not reducible to other concepts. Thus logic is a perfect representation of itself, but it is dubious that it accurately models real-life propositions which are held by real-life human beings.

The above comments may be summarized by the statement: “Logic is a human procedure.” In other words, logic is not an absolutely accurate model for anything in the real world.

3.4.4 Remark: Logic might not be a-priori absolute and universal. Immanuel Kant asserted that Euclidean geometry was somehow a-priori determined in some sense. Since his time (1724–1804), it has been discovered that real-world physical space does not necessarily follow Euclidean geometry at all. This raises doubts that anything is really a-priori determined, including logic and numbers. To quote Bell [189], page 344, writing in 1945:

  The arbitrary freedom in the mathematical construction of ‘spaces’ and ‘geometries’ at last made it plain that Kant’s a priori space and his whole conception of the nature of mathematics are erroneous. Yet, as late as 1945, students of philosophy were still faithfully mastering Kant’s obsolete ideas under the delusion that they were gaining an insight into mathematics. As Kant appealed
to his mathematical misconceptions in the elaboration of his system, it is just possible that some other parts of his philosophy are exactly as valid as his mathematics.

Russell [185], pages 685–686, gave the following summary of Kant’s assertions about geometry.

  As regards space, the metaphysical arguments are four in number.
  (1) Space is not an empirical concept, abstracted from outer experiences, for space is presupposed in referring sensations to something external, and external experience is only possible through the presentation of space.
  (2) Space is a necessary presentation a priori, which underlies all external perceptions; for we cannot imagine that there should be no space, although we can imagine that there should be nothing in space.
  (3) Space is not a discursive or general concept of the relations of things in general, for there is only one space, of which what we call ‘spaces’ are parts, not instances.
  (4) Space is presented as an infinite given magnitude, which holds within itself all the parts of space; this relation is different from that of a concept to its instances, and therefore space is not a concept but an Anschauung.
  The transcendental argument concerning space is derived from geometry. Kant holds that Euclidean geometry is known a priori, although it is synthetic, i.e. not deducible from logic alone. Geometrical proofs, he considers, depend upon the figures; we can see, for instance, that, given two intersecting straight lines at right angles to each other, only one straight line at right angles to both can be drawn through their point of intersection. This knowledge, he thinks, is not derived from experience. But the only way in which my intuition can anticipate what will be found in the object is if it contains only the form of my sensibility, antedating in my subjectivity all the actual impressions. The objects of sense must obey geometry, because geometry is concerned with our ways of perceiving, and therefore we cannot perceive otherwise. This explains why geometry, though synthetic, is a priori and apodeictic.

Some philosophers have fought back against this attack on Kant’s apparent bungle on geometry. For example, see Palmquist [184]. If logic and numbers are examined in depth, they look less and less a-priori. However, just as the human concept of geometry has evolved to adapt to the evidence of the senses, it seems reasonable to suppose that logic also adapts over time to observations of the physical world.

3.4.5 Remark: Evolution should yield animals whose logic matches the real world. The amazing coincidence that human minds (and a large proportion of other animal minds) are well-prepared for conceptualizing two-dimensional and three-dimensional space is easily explained by adaptation to the environment. Similarly, colour perception is a useful adaptation, not a purely coincidental match between capability and utility. In the 20th century, the concept of geometry was extended from flat spaces to curved spaces in response to physical observations. The concept of colour was extended in the last few centuries from visible light to a vast spectrum of electromagnetic radiation in response to observations also. It would not be too surprising, then, if our concepts of logic needed to be extended in some way over the next few centuries, possibly in response to new discoveries in quantum mechanics or other areas of physics.

3.5. Logic in literature

3.5.1 Remark: Ancient literature may give clues to the meaning of logic. It is difficult to develop mental concepts without language. When concepts are present in a language, the users of that language develop and reinforce those concepts through frequent use. Therefore it seems plausible that one may trace the development of human thinking about particular concepts by studying their appearance in language. For example, one could study the use of logic words in the 6000 or so languages which are currently in use in the world. This might indicate something about the way in which people in different cultures actually think. A society which does not have a word for “if” and “or” would be unlikely to have well-developed capabilities in propositional logic. Written human language dates from about 3500–3250bc. Cuneiform writing on tablets dates from about 3000bc. Therefore we have potentially some access to the development of human language and thought over
the last 5000 years. There are several obstacles in this path. There were not very many humans on Earth 5000 years ago. In 2250bc, there were only about 25 million people on Earth. In 3500bc, there were only about 4 million people in Europe, Western Asia and North Africa. (See McEvedy [183], page 34.) Very few of these people were actually writing anything. And of the tiny amount of literature that was written, only a very tiny proportion has survived to the current time.
3.5.2 Remark: The epic of Gilgamesh contains almost no logic in 3000 lines. The most ancient substantial narration (about 3000 lines) is the Gilgamesh epic, which is set principally in Iraq around 2750bc or 2650bc. It was initially composed orally about 2250bc and was written down in Sumerian and Akkadian in various versions between about 2050bc and 650bc, the standard version being written down around 1100bc. (See Gilgamesh [202].) Since the epic of Gilgamesh is so ancient, one would expect to see less logic in the text than in more modern texts for two main reasons: people probably had not developed logical thinking to a great extent, and any logic which was present in people’s thinking would probably not have been given vocabulary and grammatical structures at that time. This conjecture is supported by a reading of Gilgamesh. The entire epic seems to have no occurrences at all of the word “or”, nor any equivalent grammatical construction. This is not surprising because the word “or” generally indicates uncertainty or ignorance. This is unlikely to be popular in a narration. The word “or” requires the narrator to convey two possibilities, neither of which is certain. The truth of one proposition is conditionally linked to the truth of the other. (The inclusive “or” has the form (¬A) ⇒ B. The exclusive “or” has the form (¬A) ⇔ B.)
By contrast, the word “and” can be omitted because it is the natural way of thinking. (A ∧ B is equivalent to the sequence of propositions A, B, stated one after the other.) So the word “and” conveys certainty of two or more propositions, just like stating the propositions one after the other. The word “not” indicates certainty of the negative proposition, leaving no room for doubt.

Logical expressions of the form ¬A ∧ ¬B also indicate certainty because, in this case, two negative assertions are made with certainty. For example, the following line appears in Gilgamesh [202], page 5.

  he knows not a people, nor even a country.  (I.108)

This is superficially equivalent to the form ¬(A ∨ B), but this is not how the statement is communicated. The following is another case of this logical form.

  ‘O Gilgamesh, there never has been a way across,  (X.79)
  nor since olden days can anyone cross the ocean.’  (X.80)

It could be argued that a statement of the form ¬A ⇒ B is equivalent to A ∨ B. (The quotation of lines VI.96–97 in Remark 46.4.1 is of this form.) But such statements are not expressed as disjunctions in Gilgamesh.

The epic of Gilgamesh has 9 instances of the word “if”. These are listed in Remark 46.4.1. For 3000 lines of text, this is a rather small amount of logical language. This seems to support the idea that logic appeared in language at a late stage, and that quite likely, people did not think in the modern logical fashion very frequently.

The paucity of logic in Gilgamesh contrasts with the abundance of numbers. For example, tablet I has 7 unique numbers in 300 lines (1/2, 3, 2/3, 1/3, 6, 2, 7, in order of appearance). A fuller set of number counts is as follows. (The last column shows the number of IF-sentences.)
tablet  lines  count  numbers
I       300    7      1/2, 3, 2/3, 1/3, 6, 2, 7
II      303    5      2, 7, 60, 1/2, 10
III     287    4      2, 7, 13, 20
IV      260    10     20, 30, 50, 3, 1 1/2, 2, 4, 5, 7, 6
V       302    5      2, 3, 13, 10, 6
VI      183    7      3, 2, 7, 100, 200, 30, 6
VII     267    12     2, 20, 6, 7, 3, 4, 5, 8, 9, 10, 11, 12
VIII    219    4      10, 40, 3, 2
IX      196    13     2/3, 1/3, 12, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
X       322    14     6, 7, 300, 5, 3, 1 1/2, 2, 4, 8, 9, 10, 11, 12, 120
XI      329    17     5, 10, 6, 7, 9, 30,000, 10,000, 20,000, 2/3, 14, 2, 3, 4, 20, 30, 1/2, 3 1/2
XII     153    6      2, 3, 4, 5, 6, 7

if: 1, 3, 1, 4 (the non-empty entries of the IF-sentence column, in order down the column; the tablets they belong to are not indicated here)
This seems to support the view that logic is a later development than numbers in human history. One might argue that narrations are, by their nature, unlikely to include OR and IF logic, because these indicate uncertainty or ignorance, as opposed to AND and NOT logic, which indicate certain knowledge. The readers and listeners of narrations typically just want to know what happened. They don’t want to guess! Therefore it is no coincidence that all of the nine IF-sentences in Remark 46.4.1 are part of quoted dialogues, not part of the main narration.
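The propositional identities used in Remark 3.5.2 (the inclusive “or” as (¬A) ⇒ B, the exclusive “or” as (¬A) ⇔ B, and ¬A ∧ ¬B as ¬(A ∨ B)) can be checked by brute force over the four truth assignments. The following is a minimal sketch in Python; the helper names `implies`, `iff` and `equivalent` are illustrative choices and do not come from the text.

```python
from itertools import product

def implies(p, q):
    """Material implication p ⇒ q."""
    return (not p) or q

def iff(p, q):
    """Biconditional p ⇔ q."""
    return p == q

def equivalent(f, g):
    """True if f and g agree on all four truth assignments of (A, B)."""
    return all(f(a, b) == g(a, b) for a, b in product([True, False], repeat=2))

# Inclusive "or" written as (¬A) ⇒ B.
inclusive_ok = equivalent(lambda a, b: a or b, lambda a, b: implies(not a, b))
# Exclusive "or" written as (¬A) ⇔ B.
exclusive_ok = equivalent(lambda a, b: a != b, lambda a, b: iff(not a, b))
# "Neither A nor B": ¬A ∧ ¬B compared with ¬(A ∨ B).
neither_ok = equivalent(lambda a, b: (not a) and (not b), lambda a, b: not (a or b))

print(inclusive_ok, exclusive_ok, neither_ok)
```

The check only confirms logical equivalence. It says nothing about how such statements are communicated in a narration, which is the point at issue in the remark.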
3.5.3 Remark: The Hammurabi Code of Laws, which dates from about 1860bc, shows logic in almost every law. For example:

  7. If any one buy from the son or the slave of another man, without witnesses or a contract, silver or gold, a male or female slave, an ox or a sheep, an ass or anything, or if he take it in charge, he is considered a thief and shall be put to death.

Presumably there were laws before these were written down, and all laws would have much the same level of logical structure.

3.5.4 Remark: Bēowulf contains more logical language than Gilgamesh. The Anglo-Saxon (Old English) epic Bēowulf [218] is a 3182-line poem composed in England about 840ad. A reading of this work certainly reveals more instances of “or” and “if” logical constructions than Gilgamesh. Many appearances of these words in an English translation [194] turn out to be artefacts of the translation. The first three instances of the word “or” in this translation (in lines 186, 250 and 252) are such artefacts. If one carefully reads the Old English source to eliminate such artefacts, there remain 16 genuine OR-constructions, in lines 283, 437, 635, 637, 693, 1491, 1763–1766 (7 instances), 1848, 2253, 2376, 2434, 2494, 2495, 2536, 2840 and 2922. This may seem a small number, but it contrasts strongly with the zero in Gilgamesh. (See Remark 46.4.2 for examples of logic in Bēowulf.)

The Old English word for “or” is variously written as “oððe”, “oðþe” or “oþðe”. (Pronunciation: “ð” (called “eth”) sounds like “th” in “this”; “þ” (called “thorn”) sounds like “th” in “thick”.) This word is derived from the word meaning “other”, which English retains in the word “otherwise”. This shows that originally the word “or”, which is derived from “oððe”, had the meaning of an implication. In other words, a sentence of the form A ∨ B was effectively written as (¬A) ⇒ B. That is: “A is true, otherwise B is true.”

There are 26 IF-constructions in Bēowulf in lines 272, 280, 346, 442, 447, 452, 527, 593, 684, 1104, 1182, 1185, 1319, 1379, 1382, 1477, 1481, 1822, 1826, 1836, 1852, 2514, 2519, 2637, 2841 and 2870. (The Old English word for “if” is “gif” or “gyf”.) As in the case of OR-constructions, there is a contrast with Gilgamesh, which has only 9 IF-constructions in the same number of lines.

There are also many instances of the word “unless”. (This is “nefne”, “nemne” or “nymþe” in Old English.) The word “unless” is logically equivalent to “or”. This is readily verified by noting that “A unless B” means “if not A, then B”, which means (¬A) ⇒ B, which is equivalent to A ∨ B. There are numerous instances of “unless” in Bēowulf, including at lines 781, 2151, 2533, 2654 and 3054.

Logic in Bēowulf is also expressed with the word “except”, which is “būtan” in Old English. This often means the same as “unless”. In fact, this gives a clue to how logic can enter into natural language. All OR-constructions and IF-constructions have the character of exceptions. This is summarized in the following table.
statement     formula    main claim   second claim
A or B        A ∨ B      A            B
if A then B   A ⇒ B      ¬B           A
A, unless B   ¬A ⇒ B     A            B
Each of these logical expressions has a main claim. But if the main claim fails, there is a back-up claim, the exception. This is why the word “or” in English is so closely related to “otherwise”, and both words come from the Old English “oððe”. The word “otherwise” means “in the exceptional case that the main claim fails”. The OR-construction and IF-construction differ in that the main claim for the OR-construction is the foreground or default claim whereas the main claim for the IF-construction is in the background as the normal case if the exceptional condition does not apply. (See also Remarks 3.6.1 and 3.7.3 for foreground/background propositions.)

It follows that non-atomic logical expressions (i.e. not including AND and NOT expressions, which are really atomic assertions) have the character of afterthoughts. This suggests a state of mind in which one claim is made, followed by another state of mind in which the exception is stated. This may explain why there was little non-atomic logic in very early literature.

3.5.5 Remark: There are essentially only two kinds of binary logical operator. If one considers that propositions and their negatives are of equivalent status (since the negative of every atomic proposition is an atomic proposition), there are in fact only two kinds of non-atomic binary logical operator: inclusive and exclusive. To clarify this, let X_{T,T} X_{T,F} X_{F,T} X_{F,F} denote the sequence of values of a truth function X : {F, T}^2 → {F, T}. For example, TTTF denotes the logical operator (A, B) ↦ A ∨ B. Then the 16 possible truth functions may be expressed as follows.

function  operator   type          sub-type   equivalents           atoms
TTTT      ⊤          always true                                    0
TTTF      ¬A ⇒ B     implication   inclusive  A ∨ B; A ⇐ ¬B
TTFT      ¬A ⇒ ¬B    implication   inclusive  A ∨ ¬B; A ⇐ B
TTFF      A          atomic                                         1
TFTT      A ⇒ B      implication   inclusive  ¬A ∨ B; ¬A ⇐ ¬B
TFTF      B          atomic                                         1
TFFT      A ⇔ B      implication   exclusive  ¬A ⇔ ¬B
TFFF      A ∧ B      conjunction              A, B                  2
FTTT      A ⇒ ¬B     implication   inclusive  ¬A ∨ ¬B; ¬A ⇐ B
FTTF      A ⇔ ¬B     implication   exclusive  ¬A ⇔ B
FTFT      ¬B         atomic                                         1
FTFF      A ∧ ¬B     conjunction              A, ¬B                 2
FFTT      ¬A         atomic                                         1
FFTF      ¬A ∧ B     conjunction              ¬A, B                 2
FFFT      ¬A ∧ ¬B    conjunction              ¬A, ¬B                2
FFFF      ⊥          always false                                   0

The “always true” and “always false” operators convey no information at all. The 4 atomic propositions convey information about only one of the propositions. So these are not, strictly speaking, binary operators. The 4 conjunctions are equivalent to simple lists of two atomic propositions. So the information in these operators can be conveyed by individual propositions, one after the other. This leaves the 4 inclusive and 2 exclusive disjunctions (or implications). The 4 inclusive disjunctions run through the 4 combinations of T and F for the two propositions. They are equivalent to each other under swaps of the proposition truth values. The 2 exclusive disjunctions are similarly equivalent to each other under swaps of proposition truth values. Thus, apart from binary logical expressions which are equivalent to lists of 0, 1 or 2 atomic propositions, there are only two logical operators which are unique under swaps of proposition truth values.
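The classification in Remark 3.5.5 can be reproduced mechanically. The sketch below enumerates all 16 truth functions in the same (T,T), (T,F), (F,T), (F,F) order and sorts them by the number of true rows and by which arguments they actually depend on. The function name `classify` and the decision rules are an illustrative reading of the remark, not the author’s own procedure.

```python
from itertools import product
from collections import Counter

# Argument pairs in the order used by the table: (T,T), (T,F), (F,T), (F,F).
ROWS = [(True, True), (True, False), (False, True), (False, False)]

def classify(values):
    """Classify a truth function given its four values in ROWS order."""
    table = dict(zip(ROWS, values))
    n_true = sum(values)
    if n_true == 4:
        return "always true"
    if n_true == 0:
        return "always false"
    depends_on_a = any(table[(True, b)] != table[(False, b)] for b in (True, False))
    depends_on_b = any(table[(a, True)] != table[(a, False)] for a in (True, False))
    if not (depends_on_a and depends_on_b):
        return "atomic"        # one of A, ¬A, B, ¬B
    if n_true == 1:
        return "conjunction"   # exactly one row is allowed
    if n_true == 3:
        return "inclusive"     # exactly one row is excluded
    return "exclusive"         # two rows allowed: A ⇔ B or A ⇔ ¬B

counts = Counter(classify(v) for v in product([True, False], repeat=4))
for kind, n in sorted(counts.items()):
    print(f"{kind}: {n}")
```

The tallies agree with the remark: one always-true and one always-false function, 4 atomic, 4 conjunctions, 4 inclusive and 2 exclusive operators.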
3.5.6 Remark: Biconditionals are equivalent to lists of conditionals. One may observe that the biconditional A ⇔ B is equivalent to a list of two simple conditionals: A ⇒ B, B ⇒ A. Therefore all non-atomic propositions are conditionals. This re-write of biconditionals as conditionals is fairly close to how people do think about biconditionals colloquially. Alternatively, A ⇔ B is equivalent to the list: A ⇒ B, ¬A ⇒ ¬B. This is perhaps closer to how people really think about biconditionals.

However, disjunctions are not so easily expressible as lists of disjunctions. The exclusive disjunction A △ B is equivalent to the list of disjunctions: A ∨ B, ¬A ∨ ¬B. This might seem convincing to the modern mathematical mind. But it is probably not how people think about the exclusive OR in non-technical contexts. People probably think of A △ B as meaning: (A ∨ B) ∧ ¬(A ∧ B), which is not a list of disjunctions. Another natural way of thinking about A △ B is: (A ∧ ¬B) ∨ (¬A ∧ B), which looks even less like a list of disjunctions. The list A ∨ B, A ⇒ ¬B seems moderately intuitively convincing. So does the list ¬A ⇒ B, A ⇒ ¬B. But these last two lists use implications in place of disjunctions. It can be tentatively concluded that inclusive and exclusive OR-expressions are qualitatively different. Similarly, conditionals and biconditionals are qualitatively different.
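Each of the re-writings listed in Remark 3.5.6 can be confirmed by treating a list of formulas as their joint assertion and comparing it with the target operator on all four truth assignments. This is only a sketch; the helper `list_equivalent` and the dictionary `checks` are hypothetical names introduced for illustration.

```python
from itertools import product

def implies(p, q):
    """Material implication p ⇒ q."""
    return (not p) or q

def iff(a, b):
    """Biconditional A ⇔ B."""
    return a == b

def xor(a, b):
    """Exclusive disjunction A △ B."""
    return a != b

def list_equivalent(target, formulas):
    """True if asserting every formula in `formulas` together is
    equivalent to asserting `target`, over all values of (A, B)."""
    return all(
        target(a, b) == all(f(a, b) for f in formulas)
        for a, b in product([True, False], repeat=2)
    )

checks = {
    "A ⇔ B as the list A ⇒ B, B ⇒ A": list_equivalent(
        iff, [lambda a, b: implies(a, b), lambda a, b: implies(b, a)]),
    "A ⇔ B as the list A ⇒ B, ¬A ⇒ ¬B": list_equivalent(
        iff, [lambda a, b: implies(a, b), lambda a, b: implies(not a, not b)]),
    "A △ B as the list A ∨ B, ¬A ∨ ¬B": list_equivalent(
        xor, [lambda a, b: a or b, lambda a, b: (not a) or (not b)]),
    "A △ B as the list A ∨ B, A ⇒ ¬B": list_equivalent(
        xor, [lambda a, b: a or b, lambda a, b: implies(a, not b)]),
    "A △ B as the list ¬A ⇒ B, A ⇒ ¬B": list_equivalent(
        xor, [lambda a, b: implies(not a, b), lambda a, b: implies(a, not b)]),
    "A △ B as (A ∨ B) ∧ ¬(A ∧ B)": list_equivalent(
        xor, [lambda a, b: (a or b) and not (a and b)]),
    "A △ B as (A ∧ ¬B) ∨ (¬A ∧ B)": list_equivalent(
        xor, [lambda a, b: (a and not b) or ((not a) and b)]),
}

for description, ok in checks.items():
    print(f"{description}: {ok}")
```

All of these forms are logically interchangeable, so the qualitative differences discussed in the remark concern how people think about the expressions, not their truth tables.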
              permissive (one-way)   strict (two-way)
implication   A ⇒ B                  A ⇔ B
disjunction   ¬A ⇒ B                 A ⇔ ¬B

A permissive operator makes a one-way link from one proposition to another. A strict operator makes a two-way link between propositions. But the two two-way operators are not thought of in the same way. The two-way disjunction is thought of as exclusive multiple choices: (A ∧ ¬B) ∨ (¬A ∧ B). The two-way implication is thought of as a choice between “both true” and “neither true”: (A ∧ B) ∨ (¬A ∧ ¬B). In other words, in both cases, a more natural way of thinking is as a disjunction of conjunctions, not as a list of one-way implications or disjunctions. One may therefore conclude that all four non-atomic logical operators are qualitatively different, even though they may all be written as one or two implications.

3.5.7 Remark: Colloquial logic confusions: inclusive versus exclusive disjunctions and implications. In colloquial logic, even in the 21st century, there is frequent confusion between the inclusive OR and the exclusive OR. Very often, it is the exclusive OR which is meant. Often it is not possible for even the speaker to determine whether the inclusive or exclusive OR is meant. Thus A ∨ B is confused with (A ∨ B) ∧ ¬(A ∧ B). (These colloquial logic confusions are also mentioned in Remark 3.13.7.)

Parallel to the inclusive/exclusive OR confusion is the conditional/biconditional confusion where someone says that B is true if A is true, but it is consciously or subconsciously implied that B is true only if A is true. Thus A ⇒ B is confused with A ⇔ B. These colloquial logic confusions are summarized in the following table.

              inclusive   exclusive
implication   A ⇒ B       A ⇔ B
disjunction   A ∨ B       A ⇔ ¬B

As mentioned in Remark 3.5.5, disjunctions and implications are equivalent under swaps of proposition truth values. So there are only two unique, strictly binary logical operators. But these two are confused in colloquial logic. So there is effectively only one unique kind of strictly binary logical operator which is found in general literature.

Further confusions are quite common in ordinary daily life. Some people who hear a sentence of the form “if A, then B” think that this is equivalent to “since A, then B”, which is equivalent to “B is true because A is true”. Such people are used to hearing conditional compound sentences only when the antecedent is true. But even if the probability of this being true in daily life is high, from the purely logical point of view, A ⇒ B is neither an assertion of A nor of B, and yet many people interpret it as an assertion of both!

3.5.8 Remark: Disjunctions are cognitively more complex than conjunctions. The low frequency of disjunctions in literature, relative to conjunctions, is not surprising. The compound proposition “A ∧ B” is equivalent to:
(1) A is true.
(2) B is true.

The compound proposition “A ∨ B” is equivalent to:

(1) A may be true or false.
(2) B may be true or false.
(3) At least one of A and B is true.

Seen from this perspective, the disjunction of two propositions is significantly more difficult to grasp than the conjunction. The disjunction is quite ambiguous about each of the simple propositions A and B. The number of true propositions in the pair must be at least one. This seems to imply that some sort of counting operation is required. This becomes clearer in the case of the triple disjunction A ∨ B ∨ C.

(1) A may be true or false.
(2) B may be true or false.
(3) C may be true or false.
(4) Either one, two or three of the propositions A, B and C are true.

None of the three propositions is known to be true or false. But if two of them are known in some other way to be false, the remaining single proposition will then be known to be true. This can be expressed as follows.

(1) A may be true or false.
(2) B may be true or false.
(3) C may be true or false.
(4) If A and B are false, then C is true.
(5) If A and C are false, then B is true.
(6) If B and C are false, then A is true.

If a person’s objective is to establish that some simple proposition is true when A ∨ B ∨ C is given, the logical path towards this objective is somewhat complex. The triple conjunction has no such complexity. The triple conjunction gives three true propositions with no extra work required.

3.5.9 Remark: Conjunctions communicate more information than disjunctions. From the information perspective, the conjunction of propositions gives the maximum information about them because it specifies all of the truth values, whereas the disjunction gives very little information. The disjunction tells us only that at least one of the propositions is true, and we don’t even know which one it is. So it is no wonder that disjunctions are not favoured in ancient literature. (See Remark 4.13.9 for similar comments regarding universal and existential quantifiers.)

Although conjunctions and disjunctions are superficially similar, since they are in some sense duals of each other, the differences in information content between them demonstrate that they are fundamentally different. If logical expressions are written in disjunctive normal form, it becomes clearer how much information is contained in them. For example, A ∧ B is already in disjunctive normal form, but A ∨ B is equivalent to (A ∧ B) ∨ (A ∧ ¬B) ∨ (¬A ∧ B), which only narrows down the options to 3 out of 4 possibilities for the pair of truth values (t(A), t(B)). The triple conjunction A ∧ B ∧ C specifies a single possibility out of 8 possibilities for the truth-value triple (t(A), t(B), t(C)), whereas A ∨ B ∨ C specifies 7 possibilities out of 8, which is clearly much less specific.

[ In relation to Remark 3.5.9, quantify the information in conjunctions and disjunctions of N propositions. E.g. −Σ N^{−1} ln N, or something like that. ]

3.5.10 Remark: Geographical navigation may be a metaphor for logic. Navigation in the geographical terrain is a quite plausible metaphor for logical expressions. The expression A ⇒ B could be interpreted as: “If you follow path A, you will arrive at point B.” Multiple such implications can be concatenated to make up a path to arrive at a distant destination. The expression (A ∨ B) ⇒ C
could be interpreted as: “If you follow either path A or path B, you will arrive at point C.” The expression ¬A could be interpreted as: “Path A leads nowhere.” The analogy is not perfect, but it is plausible that the mental processes of terrain navigation and propositional calculus have a lot in common. If this point of view is correct, one may regard essentially all animals as being capable of logic. Animals which can navigate in sophisticated ways probably implement a richer range of logical expressions. For example, an animal which knows multiple ways to arrive at a destination can navigate alternative paths if one or more paths are unexpectedly blocked. The generality of logic required for terrain navigation is probably greater than that which is required for simple operant conditioning.
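Returning to the comparison in Remark 3.5.9, one possible way to put numbers on the information content is to count how many of the 2^N truth assignments an expression leaves open. The sketch below assumes, purely for illustration, that all 2^N assignments are equally likely beforehand, so that an expression which allows k of them conveys log2(2^N / k) bits. This uniform prior and the function name `information_bits` are assumptions of the sketch, not a measure proposed in the text.

```python
from itertools import product
from math import log2

def information_bits(formula, n):
    """Bits gained on learning that `formula` holds for n propositions,
    assuming all 2**n truth assignments are equally likely beforehand."""
    assignments = list(product([True, False], repeat=n))
    allowed = sum(1 for values in assignments if formula(values))
    return log2(len(assignments) / allowed)

for n in (2, 3, 10):
    conj = information_bits(all, n)   # A1 ∧ ... ∧ An allows 1 assignment
    disj = information_bits(any, n)   # A1 ∨ ... ∨ An allows 2**n - 1
    print(f"N={n}: conjunction {conj:.3f} bits, disjunction {disj:.3f} bits")
```

Under this assumption the conjunction of N propositions conveys N bits, whereas the disjunction conveys log2(2^N / (2^N − 1)) bits, which shrinks towards zero as N grows. This matches the 1-out-of-8 versus 7-out-of-8 comparison for N = 3.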
3.6. Proposition-store versus world-view ontology for logic

3.6.1 Remark: The world-model ontology for logic explains the double negative with two-way decisions. The acceptance/rejection model for truth and falsity which is presented in Remark 3.7.2 encounters serious difficulties when one considers concepts such as double negation. (See Section 3.10 for the semantics of negation.) Section 3.7 proposes that an ontology for logic can be based on a machine model where the machine accumulates true propositions over time. In this model, a proposition is tagged with “true” to mean that it should be added to the proposition-store whereas a “false” tag means that it should be rejected. But double negation poses difficulties. The rejection of an instruction to reject a proposition A does not imply that A should be accepted. In mathematical logic, however, we do want double negation to imply the assertion of a proposition.

If one spends enough time and energy trying to make an accept/reject model of the true/false concept correspond naturally to a wide variety of logic contexts, one begins to find this approach frustrating and unsatisfying. There seems to be something fundamentally lacking in the proposition-store machine model.

The problems with the proposition-store machine model seem to be fixed by a world-model machine model. In a world-model machine model for logic, each logic machine is thought of as deriving all propositions from a world-model. The world-model may be a model of anything at all, not necessarily within the physical world. The world-model may be pure fantasy. Propositions then arise within the world-model as choices between options for the world-model. For example, one world-model could be a Newtonian picture of planets revolving around a star in elliptical orbits. Such a model has a space of unknown parameters such as the number of planets, their orbital elements, and the mass, radius and temperature of each planet. One may propose that the fourth planet from the star has an orbital period greater than one year.
The assertion or negation of this proposition A constrains the total space of parameters for the model. Such a model is not just a big set of true propositions. The propositions are attributes or properties of the model. The model is more than merely the totality of all true propositions about the model.

If one accepts the world-model machine model for logic, the negation ¬A of a proposition A is always a positive assertion. When one asserts A, one is constraining the parameter space of the model to a particular subset of the total set of parameters. The negative assertion ¬A means that the set of parameters is constrained to the complement of the subset asserted by A.

It follows from this kind of world-model-machine ontology that truth and falsity are not fundamentally different. The truth of a proposition implies going down one path. The falsity of the proposition implies going down some other path. In particular, the assertion that a proposition is false does not imply some sort of vacuum of information. Every negative assertion is effectively a positive assertion. Both positive and negative assertions are branches of a two-way choice. (This is illustrated in Figure 3.6.1 for the case of palaeolithic mammoth-hunters deciding whether there is or is not a mammoth drinking at the local pond. If the news is negative, the state of mind Z does not cease to exist. The state of mind goes down one branch of a two-way decision tree.)

To put it another way, a proposition which relates to a world-model asserts a foreground option if it is true, but there are always one or more other options in the background. The alternative to the truth of a proposition is not a vacuum. To put it simply: For every foreground, there is a background. By contrast with the acceptance/rejection model for truth and falsity, the falsity of a proposition in the world-model machine model is never a mere rejection of a proposition.
The falsity always implies acceptance of a complementary proposition. In view of these considerations, a double negative ¬¬A is always equivalent to the positive assertion A because the complement of a complement equals the original set.

[Figure 3.6.1: Decision-making for mammoth-hunters: Falsity does not imply a vacuum. From state Z0 (“Mammoth? (Go or stay?)”), the “true” branch leads to state Z1 (“Mammoth! (Let’s go!)”) and the “false” branch leads to state Z2 (“No mammoth! (Let’s stay!)”).]

3.6.2 Remark: The observable behaviour of mathematicians suggests a proposition-store model for logic. Mathematics is quite unusual among the applications of logic. In the case of mathematics, it sometimes seems that all of the subject can be represented as a set of propositions. In fact, it seems to be possible to “do” mathematics without having any understanding of the meaning of the symbols at all. A person (or a computer) who knows only the rules of deduction can verify the correctness or incorrectness of any mathematical argument if it is presented formally enough. (Computers don’t seem to be very successful at discovering new, interesting theorems though. Nor are they able to distinguish between superficial and deep theorems.)
It was the apparent theorem-accumulation nature of mathematics which made this author at first incline to the view that a natural ontology for logic might be a machine which accumulates propositions. In fact, mathematics is one of the very few subjects where logical deduction in abstract terms, devoid of semantics, can be so productive. This is why computer software can apparently make sense of mathematics. The software only makes sense of the textual level of the subject, not the semantic level. This is also why computer software is not very successful in discovering new approaches in mathematics. Humans work at both the textual and semantic levels. The propositions of mathematics serve merely to record the conclusions of the mathematician, and to verify the self-consistency of the arguments. Discovery generally requires an understanding of the semantics underlying each mathematical subject. (That’s why this author is strongly in favour of diagrams to illustrate mathematics. Diagrams help to clarify the semantic content of a text.)

The proposition-store-machine ontology in Section 3.7 makes truth and falsity seem to be fundamentally different in terms of their semantics. Similarly, this ontology makes conjunctions and disjunctions seem fundamentally different. (See Remark 3.7.14, for example.) The world-model-machine ontology in Remark 3.6.1 makes truth and falsity seem like duals of each other. This corresponds very well to the calculus of logical operators, which shows duality between true and false, and between conjunctions and disjunctions. Thus the world-model-machine ontology is, ironically, closer to the properties of abstract logical operator calculus than the proposition-store-machine ontology.

In digital electronics, it is well known that the laws of logic are invariant under the swap of high and low voltages. In other words, when “true” and “false” are swapped, all of the logic rules stay the same.
This fits very well with the world-model-machine ontology.

The theorems which are accumulated in any mathematical subject form a proposition-store which seems to represent the accumulated knowledge in that subject. The strong emphasis in pure mathematics on the accumulation of theorems may be partly due to the lack of diagrams in journals in the past. This creates the impression that mathematics is an essentially text-based subject, and that all mathematical ideas can be represented as a sequence of propositions.

3.6.3 Remark: The excluded middle follows easily from a class model for logical propositions. The world-model ontology for propositional logic seems to be supported by the chapter on classes and symbolic logic in Lakoff/Núñez [172], pages 121–139. On page 131, they make the following comment.

“The trick here is to conceptualize propositions in terms of classes, in just such a way that Boole’s algebra of classes will map onto propositions, preserving inferences. This is done in the semantics of propositional logic by conceptualizing each proposition P as being (metaphorically of course) the class of world states in which that proposition P is true. For example, suppose the proposition P is ‘It is raining in Paris’. Then the class of world states A, where P is true, is the class of all the states of the world in which it is raining in Paris. Similarly, if Q is the proposition ‘Harry’s dog is barking’, then B is the class of states of the world in which Harry’s dog is barking. The class A ∪ B is the class of world states where either P or Q is true, that is, where it is raining in Paris or Harry’s dog is barking, or both.”

The authors then proceed to show how the law of the excluded middle follows from the corresponding law for classes. It is only a small intellectual leap to conclude that if all propositions are derived as attributes from world-models, then the excluded middle law must hold for propositional logic. Thus propositions which transmit world-model attributes can be either true or false, and not both, and not anything else. It is difficult to think of practically useful propositions which are not attributes of some sort of model. A completely abstract proposition with no semantics at all would not necessarily obey the excluded middle law, but such an abstract kind of proposition would seem to have only recreational interest.

3.6.4 Remark: Intuitionistic propositional calculus omits the excluded middle axiom. One of the schools of thought in mathematical logic called “intuitionism” claims that the excluded middle rule A ∨ ¬A, for all propositions A, should not be accepted. (See EDM2 [34], 156.C, page 614.) The excluded middle is also known by the Latin name tertium non datur. For a particular propositional calculus, Mendelson [164], page 43, says that intuitionistic propositional calculus is obtained by replacing the axiom ¬¬α ⇒ α with the weaker axiom ¬α ⇒ (α ⇒ β).
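The class model of Remarks 3.6.1 and 3.6.3 can be illustrated with a small sketch in Python. This sketch is not part of the original text. The toy world-model with two yes/no parameters, and the names WORLD_STATES, states_where and negation, are assumptions made purely for illustration. Each proposition is represented by the class of world states in which it is true, and negation is the complement within the total set of states.

```python
from itertools import product

# Hypothetical toy world-model: each world state is a pair
# (raining_in_paris, dog_barking) of yes/no parameters.
WORLD_STATES = frozenset(product([False, True], repeat=2))

def states_where(predicate):
    """Return the class of world states in which the predicate holds."""
    return frozenset(s for s in WORLD_STATES if predicate(s))

def negation(cls):
    """Negation is the complement within the total set of world states."""
    return WORLD_STATES - cls

A = states_where(lambda s: s[0])   # P: "It is raining in Paris"
B = states_where(lambda s: s[1])   # Q: "Harry's dog is barking"

# The complement of a complement equals the original set.
double_negation_holds = negation(negation(A)) == A
# Every world state lies in A or in its complement, and never in both.
excluded_middle_holds = (A | negation(A)) == WORLD_STATES
non_contradiction_holds = (A & negation(A)) == frozenset()
# Duality between conjunction and disjunction under negation.
de_morgan_holds = negation(A | B) == (negation(A) & negation(B))
```

In this finite model, all four checks hold by elementary set algebra. The sketch says nothing about the intuitionistic calculus of Remark 3.6.4, which is not a calculus of classes of this kind.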
3.7. A proposition-store ontology for logic
3.7.1 Remark: A proposition may be true or false. An assertion claims that a proposition is true. The word “proposition” is Latin for “putting forth”. In other words, a proposition is something which one person (or group of persons) puts forward to another person (or group of persons). Thus a proposition is something which is offered, with more or less compulsion. The word “assertion” is Latin for “laying hold of” in the sense of laying claim to something as one’s property. By extension, an assertion may mean a claim that something is true.

In modern mathematical logic, a proposition is a statement which may be true or false, whereas an assertion is a proposition which is claimed to be true. (In former times, a proposition was generally a statement which was claimed to be true.) Thus we distinguish between (1) the mere writing of a proposition, and (2) the assertion that it is true. The assertion symbol “⊢” before a proposition indicates that the writer is asserting that the proposition is true. The mere writing of a proposition means that the writer merely wishes to discuss the proposition, no matter if it be true or false. (This is essentially the same as the distinction between indicative and subjunctive verb moods discussed in Remark 3.12.2.) However, the context of an informal logical argument often indicates that a proposition is being asserted, not just discussed. It is the writer’s responsibility to clarify which meaning is intended.

                 example      description               verb mood      truth value
    proposition  A ∨ B        could be true or false    subjunctive    t(A ∨ B) = T or F
    assertion    ⊢ A ∨ B      is claimed to be true     indicative     t(A ∨ B) = T
[ 2008-12-12: The author concluded a couple of weeks ago that the proposition-store ontology for logic is unsatisfactory. It seems to describe what mathematicians apparently do, but it doesn’t describe what they think. It is also useless for giving meaning to logical expressions, even as simple as logical negation. So the world-model ontology is now being adopted. So this section will be downgraded to a mere side-comment some time soon. ]

3.7.2 Remark: Observed logical behaviour includes offering, accepting and rejecting propositions. In the real world, the practical interpretation of the concepts of truth and falsity by human minds may be thought of as the acceptance and rejection of propositions over time. The human mind may be thought of as starting initially with zero (or very few) propositions about the world, and over time, propositions are added to the individual’s “true proposition set”, or rejected if found to be false. In practice, when a person says that a proposition is true, what they mean is that they will accept or maintain it in their collection of propositions. When they say that a proposition is false, they mean that they will reject it if offered, and will delete it if it was accepted earlier. This is illustrated in Figure 3.7.1.
[Figure 3.7.1: Model for truth in mathematical logic. Diagram labels: universe of propositions; offer of proposition; proposition truth tester; accept (T); reject (F); proposition stores.]
When particular individuals assert the truth of a proposition, they are claiming that it should be accepted by themselves and others. This suggests that assertions have a socially coercive character, as suggested in Remark 3.7.12. There is no absolute meaning for “truth”. Each individual (and perhaps even each state of mind of each individual) has a particular set of true propositions. A proposition is said to be true for a particular speaker if it is accepted by that speaker. It is said to be false if it is rejected by the particular speaker. Acceptance and rejection are states of mind which can be inferred from observable social behaviour. According to this point of view, truth and falsity are subjective.
3.7.3 Remark: The concept of truth often has a multiple-choice character. In practice, a large proportion of logical propositions are of the “multiple choice” type. In other words, there is an a-priori assumption that one of a set of options is true. Then the decision to be made is which one of the options is true. For example, one might ask which direction a path leads in. The “true” direction would be the correct choice of direction. In the minds of people who are trying to answer a question about direction, the question is “which direction?” rather than whether each particular direction is “true”. In other words, instead of having to make a two-way choice between “true” and “false”, the choice is between the many directions of the compass. Thus the concept of truth is often a matter of choosing the right option from a set of options. (See Remark 3.9.5 for similar comments.)

To put it another way, “true” means: “Yes, this is the right choice.” “False” means: “No, this is the wrong choice.” Behind every foreground proposition there is a background proposition. Every foreground has a background. So for every foreground proposition which can be true, there is a background proposition which is true if the foreground proposition is false. The human mind works on a foreground/background basis, not a foreground/vacuum basis. Deciding a proposition always determines which of two or more choices will be accepted. (See also Remark 3.6.1 for foreground/background propositions.)

3.7.4 Remark: Every assertion is effectively a double-negative assertion. To put it yet another way, whenever a proposition is asserted to be true, the negation of the proposition is brought into the picture as the background. Therefore the assertion of a proposition always implies that the proposition is in question, and that the negative of the proposition is also being considered.
In this sense, one may regard all positive assertions as double-negative assertions, because the assertion is a denial of the negative of the assertion. Thus propositions are like the two sides of a coin. There are no single-sided coins. All assertions arise from a two-way choice. (This explains why it is dangerous for a politician to deny anything. The audience immediately gives at least some credence to the negative of the assertion.)

3.7.5 Remark: Deducing new propositions from old propositions can be automated. If a group of individuals happen to adopt the same rules for deriving new “true” (i.e. acceptable) propositions from already accepted propositions, and if they start with the same set of “true” (i.e. accepted) propositions, they can be expected to agree on the new propositions which they derive.
The proposition acceptance/rejection model for logic has some difficulties which become clear in the case of double negative propositions, as mentioned in Remark 3.10.8.
We may regard truth and falsity as being merely a tagging of propositions by machine-like minds which manage their proposition-stores by accepting true and rejecting false propositions. To start the process, mathematicians try to agree on rules of argument and a set of axioms. This is illustrated in Figure 3.7.2.
[Figure 3.7.2: Rules and axioms in logical deduction. Diagram labels: well-formed formula rules; axioms and deduction rules; universe of propositions; offer of proposition; proposition truth tester; proposition re-use; accept (T); reject (F); proposition stores.]
A propositional (or predicate) calculus may be thought of as the design of an abstract machine which has a proposition-store and a process for accumulating “true” propositions. The design of such a calculus is supposed to resemble the way a real-world mathematician works. If the calculus is implemented on a computer, one may hope that the implementation may be faster and less error-prone. In fact, one may regard mathematical logic as the mechanization of human mathematical thinking. In other words, it is an attempt to systematically describe yet another slow, unreliable, tedious human activity so that it may be delegated to machines which are fast, reliable and pain-free.
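The claim of Remark 3.7.5, that machines which start from the same accepted propositions and apply the same rules will agree on what they derive, can be sketched as follows. This is only an illustration, not a construction from the text. It assumes that modus ponens is the only deduction rule, and the names Implies and close_under_modus_ponens are hypothetical.

```python
from typing import NamedTuple

class Implies(NamedTuple):
    """Hypothetical encoding of a conditional 'antecedent => consequent'."""
    antecedent: object
    consequent: object

def close_under_modus_ponens(store):
    """Accept new propositions by modus ponens until nothing more is added."""
    accepted = set(store)
    changed = True
    while changed:
        changed = False
        for p in list(accepted):
            if (isinstance(p, Implies)
                    and p.antecedent in accepted
                    and p.consequent not in accepted):
                accepted.add(p.consequent)   # the proposition is "tagged true"
                changed = True
    return accepted

# Two machines which share the same axioms and the same rule.
axioms = {"alpha", Implies("alpha", "beta"), Implies("beta", "gamma")}
machine_1 = close_under_modus_ponens(axioms)
machine_2 = close_under_modus_ponens(axioms)
```

Both stores end up containing the three axioms together with "beta" and "gamma". The agreement between the two machines is a consequence of the procedure being deterministic, which is the sense in which such deduction can be delegated to a machine.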
3.7.7 Remark: Conjunctions are difficult to give meaning to in the proposition-store logic model. Given any two propositions A and B which are accepted by a “logic machine”, we may form the combined proposition “A and B”. This has a clear meaning with reference to the “logic machine” model. It means that both propositions A and B are accepted by the machine. It follows, then, from our naive (i.e. in-built) knowledge of the world that the meta-proposition “A and B” is necessarily true if both A and B are true. This is because truth of a proposition means that it is accepted by a particular logic machine.

The previous paragraph has a suspicious circularity. We, as observers of a logic machine M, are accepting the propositions “A is accepted by machine M” and “B is accepted by machine M” within our own minds, and we are ourselves “logic machines”. We then form the compound proposition “both A and B are accepted by machine M” in our own minds, and we accept this compound proposition because of our own naive logical processes for the conjunction of propositions. We therefore conclude that machine M should accept the proposition “A and B”. But this is the imposition of our own logical processes on machine M. This is, in fact, exactly what an assertion is. An assertion is a claim by one logical machine that a proposition should be accepted by another machine.

It is difficult to escape the conclusion that the conjunction of propositions (connecting them with the word “and”), and all of the rest of mathematical logic, is merely a behaviour which humans do, and which we believe any other rational machine should accept.

3.7.8 Remark: Logic is the art of leading an audience to conclusions. A logical argument is, broadly speaking, a communication which is intended to lead a reader or listener to accept one or more propositions. (In fact, the words “deduction” and “induction” both come from the Latin word “ducere” which means “to lead”.) A logical argument generally makes use of assumed prior accepted propositions (which correspond to axioms) and agreed rules of argument to lead the reader or listener in the desired direction.
3.7.6 Remark: The tasks of logic include proposition evaluation and “solving logical equations”. The proposition testing units in Remark 3.7.5 must be able to do both evaluation of compound propositions, given the truth values of the component atomic propositions, and the deduction of conclusions from given compound propositions. These two tasks are analogous to the evaluation of algebraic expressions and the solutions of simultaneous algebraic equations respectively. (See Remark 4.4.2.)
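The two tasks distinguished in Remark 3.7.6 may be contrasted in a short sketch. It is offered as an illustration only. The encoding of logical expressions as nested tuples, the function names evaluate and solve, and the use of brute-force enumeration of truth assignments are all assumptions made here, not methods taken from the text.

```python
from itertools import product

def evaluate(expr, values):
    """Evaluate a compound proposition from the truth values of its atoms.

    Assumed encoding: an atom is a string; compounds are ("not", e),
    ("and", e1, e2), ("or", e1, e2) or ("implies", e1, e2).
    """
    if isinstance(expr, str):
        return values[expr]
    op = expr[0]
    if op == "not":
        return not evaluate(expr[1], values)
    left = evaluate(expr[1], values)
    right = evaluate(expr[2], values)
    if op == "and":
        return left and right
    if op == "or":
        return left or right
    if op == "implies":
        return (not left) or right
    raise ValueError(f"unknown operator: {op}")

def solve(constraints, atoms):
    """Return every truth assignment which makes all constraints true."""
    solutions = []
    for combo in product([False, True], repeat=len(atoms)):
        values = dict(zip(atoms, combo))
        if all(evaluate(c, values) for c in constraints):
            solutions.append(values)
    return solutions

# Task 1: evaluation, analogous to evaluating an algebraic expression.
value = evaluate(("or", "A", "B"), {"A": False, "B": True})

# Task 2: deduction, analogous to solving simultaneous equations.
solutions = solve([("or", "A", "B"), ("not", "A")], ["A", "B"])
```

Here the evaluation yields a single truth value, whereas the solving step narrows four candidate assignments down to the single assignment in which A is false and B is true. Enumeration is feasible only for a handful of atoms. It is used here because it makes the analogy with equation solving easy to see.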
3.7.9 Remark: Temporal ordering of logical argumentation. In terms of the “logic machine” model, a logical argument has a temporal parameter. The sequence of steps in an argument is played out over time, inducing a sequence of states in the target logic machine M (i.e. the reader or listener), which finally reaches a state where one or more new propositions are accepted by M. For example, consider the following sketch of a trivial argument.

    Theorem: α, α ⇒ β ⊢ β

         Proposition    Justification
    (1)  α              Assumption 1
    (2)  α ⇒ β          Assumption 2
    (3)  β              MP (1,2)

Such a formalized argument assumes that the machine M already accepts the propositions α and α ⇒ β. Then in step (3), machine M is supposed to insert β into its own set of accepted propositions. So the contents of the proposition-store of machine M are as follows after each line of the argument has been put.

         Proposition    Proposition-store of machine M
    (1)  α              α, α ⇒ β
    (2)  α ⇒ β          α, α ⇒ β
    (3)  β              α, α ⇒ β, β
In serious mathematics, the proposition-store of the reader or listener would, of course, be too large to write out explicitly.

3.7.10 Remark: Static versus dynamic view of proposition validity. The main point of Remark 3.7.8 is that propositions are not true or false in a static sense. True and false propositions are accumulated in proposition stores inside logic machines (of which the human mind is but one example). In other words, truth and falsity are dynamic in character. However, human beings hold their propositions as if they were true in a static sense. Humans like to believe that the conclusions which they arrive at are true in an absolute and eternal sense. Therefore they do not say: “I now accept this proposition until such time as I reject it.” They say: “This proposition is true.” This is perhaps a necessary illusion. If every individual believed that all of their beliefs were arbitrary and temporary, they would not act with confidence on the basis of their beliefs. They would fear that their beliefs could be negated at any time.

One consequence of the desire to maintain the illusion of the static nature of beliefs is that mathematicians try to establish, at least within the scope of a handful of axioms and rules (i.e. an axiomatic system such as the predicate calculus), that there are no possible contradictions between the propositions that may be accumulated by different people starting with the same axioms and rules. Russell’s paradox is just one example of how two different logic machines following the same axioms and rules could correctly deduce that a particular proposition is both true (must be accepted) and false (must be rejected). If one of the rules of argument is RAA (reductio ad absurdum), such a situation leads to the necessary acceptance and rejection of all propositions. (See Section 3.11 for discussion of RAA.)

3.7.11 Remark: Static versus dynamic view of logical expressions. The modus ponens deduction rule may be regarded as defining the implication operator “⇒”. On the other hand, a logical expression such as A ⇒ B may be regarded as a notation for a truth value function φ⇒(t(A), t(B)) in terms of the truth values t(A) and t(B) of the propositions named by A and B. (See for example Remark 3.12.2.) The first of these perspectives is dynamic, the second is static. The modus ponens view of the implication operator shows how implication expressions may be applied in the deduction process, which is what the proposition-store model does. Propositional calculus and predicate calculus are dynamic processes which are fairly well explained by the proposition-store model. The logical function tree view of the implication operator assumes that the component atomic propositions already have known truth values, which are then inserted into the calculation to produce a truth value for the logical expression.

One might reasonably ask which of the two views is more “correct”, or which is more fundamental. The history of logical argumentation, particularly in ancient Greek philosophical speculation, and presumably also in relation to the legal systems which in Mesopotamia are known to date from 2000 BC, has shown a strong bias towards the dynamic view of logic. But this is quite likely because the consequent of an implication is typically an imperative (i.e. an action to be performed) in non-scientific contexts. Ancient Greek logic was mostly in the form of syllogisms, which are dynamic in character. Implications in some non-scientific contexts have a causal character, where the antecedent chronologically precedes the consequent. In most non-scientific contexts, there are subliminal undertones which go beyond cold, bare logic.

In scientific contexts, it seems more natural to think of the static functional view of logical expressions as fundamental, and the dynamic logical argumentation view as derived from this. In scientific applications, one typically argues from assumed, or confirmed, general laws to particular consequences. Thus the static statement of laws or axioms is considered primary, and the dynamic argumentation process is simply a necessary task to recover unknown parameters of models from known parameters. The assertion of a logical expression is therefore seen as a static “fact”, and the argumentation process is seen as a human activity which combines static facts to infer unknown things from known things. In this context, modus ponens may be viewed as merely one among very many “tricks of the trade” for inferring unknowns from knowns.

Most mathematical logic textbooks focus on the implication operator and modus ponens as the basis for their formalizations of logic. This emphasizes the dynamic view of logic. This may create the impression that all logic is fundamentally dynamic in character. But this confuses the methods of solving logic problems with the nature of logical assertions.

3.7.12 Remark: Influencing other individuals’ beliefs may seek to influence behaviour.
In the case of human beings, the ultimate purpose of foisting propositions on other individuals is to modify their behaviour, since human behaviour is affected by beliefs. So logic could plausibly have arisen (some time in the last million years) as a method for social behaviour control through control of beliefs. By understanding the prior beliefs of other individuals, and their algorithms for acceptance or rejection of propositions, it is possible to induce other individuals to accept new propositions (or reject previously accepted propositions) by leading them down logical pathways.
Mathematical logic merely seeks to formalize and standardize the already existing methods of argument used by human beings.

One consequence of the logic machine perspective is the observation that all logical argument, such as in a mathematics paper or book, has an implicit “proposition store” in the background upon which arguments are based. Mathematicians generally use the already-proved propositions of other mathematicians, although the prior assumptions of the arguments of others are not always explicitly verified to be compatible with the assumptions of the author in question. To put it simply, the meaning of an assertion in a theorem is “add this to your proposition-store”. An assertion is essentially a command (or request) from one individual to another to accept a proposition in the proposition-store.

3.7.13 Remark: Utterance of propositions has both a descriptive and prescriptive character. The descriptive and prescriptive interpretations of logical assertions are illustrated in Figure 3.7.3 in the case of a compound proposition A ∧ B.

[Figure 3.7.3: Descriptive versus prescriptive interpretation of proposition A ∧ B. Diagram labels: sender (proposition store 1, containing A and B); receiver (proposition store 2); description: “I accept A and B.”; prescription: “You should accept A and B.”]

In the descriptive case, the sender of the proposition is stating that its proposition store includes both A and B at the time of sending. In the prescriptive case, the sender of the proposition is stating that the recipient’s proposition store should include both A and B shortly after the time of reception. (See Remark 3.12.1 for related comments.)
3.7.14 Remark: The consequent of a conditional has the character of an imperative. If the compound proposition A ∧ B in Remark 3.7.13 is replaced with A ∨ B, the situation is qualitatively different. When the proposition A ∨ B is used in the descriptive sense, the sender merely needs to determine if it believes either A or B, or both. Then the sender calculates whether the compound proposition A ∨ B is true or false, and this truth value is transmitted. However, when the recipient receives A ∨ B, it is usually not possible to determine truth values for A and B. The recipient must store the compound proposition as is. Then it must wait for further information. For example, if the proposition ¬A is received, this can be combined with A ∨ B to infer that B is true. (This is analogous to the situation where the sender transmits an equation such as x + y = 8, and the recipient later combines this with x = 3 to obtain y = 5.) To summarize this situation, if A∧B is received, the truth values of A and B can be immediately determined, and the original compound proposition A ∧ B may be discarded. But with a compound proposition like A ∨ B (or A ⇒ B), there is insufficient information to discard it in favour of the truth values of the atomic propositions A and B. In terms of the proposition insertion notation in Remark 3.10.2, the reception of A ∧ B may be decomposed into the pair of insertions A → T, B → T, whereas the reception of A∨B must be implemented as A∨B → T because it cannot be decomposed. Essentially this same point is made in Remark 3.13.6. (Conversely, if A ∧ B is asserted to be false, the truth values of A and B are unknown, in which case A ∧ B → F must be saved by the recipient rather than individual truth values for A and B.) In the case of sending compound propositions, there is no such qualitative difference between conjunctions and disjunctions. 
For both kinds of compound operators, it may be possible to determine the truth value if only one of the component truth values is known. (For example, if A is false, then A ∧ B is guaranteed false, no matter what B is. If A is true, then A ∨ B is guaranteed true, no matter what B is.) Consequently the sender of a truth value for a compound proposition may not even know the truth values of all the component propositions. So the ambiguity for the recipient is unavoidable. (See Remark 4.2.6 for an example truth table which allows the possibility of unknown truth values.)

Although the sender of a compound proposition may have incomplete knowledge of the truth values of the component propositions, the receiver may have extra knowledge which permits it to deduce more of the component truth values than the sender possessed. These considerations are suggestive of the notion of a “logic machine network”; in other words, a network of logic machines which implement a kind of population dynamics.

3.7.15 Remark: Truth values are attribute tags for propositions. Notation 3.7.16 presents abbreviations for the truth values “true” and “false”. These may be thought of as proposition tags. In other words, they are attributes which are attached to propositions. EDM2 [34], 411.E, page 1552, uses the notation g for “true” and f for “false”. In some computer programming languages (such as C and C++), and in some symbolic logic formalisms, “true” is denoted 1 and “false” is denoted 0, which seems very much easier to remember than g and f. (There is also a very ancient numerological significance to the choice of 1 and 0 for true and false respectively. But this is a “family show”. So the reader will have to look elsewhere for the fine details.)

3.7.16 Notation: T denotes the proposition-tag “true”. F denotes the proposition-tag “false”.

3.7.17 Remark: Conditional propositions and disjunctions may be regarded as “delayed assertions”.
As mentioned in Remark 3.7.14, if the insertion A ∨ B → T occurs in a recipient logic machine, a later insertion ¬A → T permits the deduction that B is true. This kind of delayed insertion of a proposition is particularly intended by the compound proposition A ⇒ B. This proposition permits the later deduction of B to be triggered if A is asserted. The chaining together of multiple such implications is the reason why propositional calculus is often formalized in terms of the implication operator. A disjunction such as A ∨ B is equivalent to each of the two implications (¬A) ⇒ B and (¬B) ⇒ A. Thus the delayed assertion of A is triggered by ¬B, and the delayed assertion of B is triggered by ¬A. (See Remark 3.10.14 for further comments on triggering simple propositions via compound propositions.)
As mentioned in Remark 3.5.4, the word “or” originates from an Anglo-Saxon word which means “other” or “otherwise”. This is suggestive of the idea that if the first proposition is not true, then the other one is true. Thus A ∨ B means: “A is true; otherwise B is true.” This agrees with the idea that a disjunction can be interpreted as a delayed assertion which is triggered by another assertion. In this case, B is triggered by ¬A.

3.7.18 Remark: The prescriptive and imperative character of logic may be abused.
As the word “foisting” in Remark 3.7.12 suggests, logic has a somewhat aggressive character at times. In fact, the ancient Greeks did employ logic as a political weapon to achieve objectives which were often quite questionable ethically. It is particularly noteworthy that the combination of a contradictory set of axioms with the RAA deduction rule (mentioned in Remark 3.7.10) enables the proponent (i.e. the person who puts forward an argument) to deduce that any proposition at all is true or false. (This exploit can be observed frequently in modern politics if one knows what to look for.)

3.7.19 Remark: Generalized truth-value tags to indicate uncertainty.
One could go further in the proposition-store “logic machine” modelling by proposing that all logic machines may have more than one proposition-store. For example, there could be a second store for false propositions. Our understanding of logic in our culture requires that a proposition may not be simultaneously in both the “true-store” and the “false-store”. (In practice, humans do in fact allow this sort of contradictory situation.) In addition to true and false proposition stores, logic machines could have an “undecidable-store” and an “unknown-validity-store” for propositions which are not yet decided. There could also be a “probably-true-store” and a “probably-false-store”, and a very wide range of similar modal indications. As a matter of implementation, it is more efficient to maintain a single store of propositions with tags for each proposition, indicating varying levels of truth or falsity.
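The single tagged store of Remark 3.7.19 and the delayed assertions of Remark 3.7.17 can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the class name `LogicMachine`, its method names, and the choice to store a received disjunction as a pair of atomic proposition names are hypothetical and are not constructions from this book.

```python
class LogicMachine:
    """Toy proposition store: one tag ("T" or "F") per atomic proposition name."""

    def __init__(self):
        self.tags = {}      # atomic proposition name -> "T" or "F"
        self.pending = []   # received disjunctions (a, b) which cannot yet be decomposed

    def insert(self, name, tag):
        """Insertion 'name -> tag' for an atomic proposition."""
        old = self.tags.get(name)
        if old is not None and old != tag:
            raise ValueError(f"contradiction: {name} tagged both {old} and {tag}")
        self.tags[name] = tag
        self._fire()

    def insert_or(self, a, b):
        """Insertion 'a OR b -> T': stored as is, awaiting further information."""
        self.pending.append((a, b))
        self._fire()

    def _fire(self):
        """Trigger delayed assertions: (not a) => b and (not b) => a."""
        progress = True
        while progress:
            progress = False
            for a, b in list(self.pending):
                ta, tb = self.tags.get(a), self.tags.get(b)
                if ta == "T" or tb == "T":
                    self.pending.remove((a, b))   # already satisfied, may be discarded
                    progress = True
                elif ta == "F" and tb is None:
                    self.pending.remove((a, b))
                    self.tags[b] = "T"            # b is triggered by not-a
                    progress = True
                elif tb == "F" and ta is None:
                    self.pending.remove((a, b))
                    self.tags[a] = "T"            # a is triggered by not-b
                    progress = True
                elif ta == "F" and tb == "F":
                    raise ValueError(f"contradiction: {a} OR {b} with both false")


m = LogicMachine()
m.insert_or("A", "B")   # nothing can be deduced yet
m.insert("A", "F")      # the later insertion of not-A triggers B
print(m.tags)           # {'A': 'F', 'B': 'T'}
```

The tags are kept in one dictionary rather than in separate true and false stores, which is the implementation choice suggested at the end of Remark 3.7.19.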
3.8. Undecidable propositions and incomplete information transfer

3.8.1 Remark: Compound propositions communicate partial information.
An extended truth table for the implication operator is presented in Remark 4.2.6. The extended truth table shows the sender’s view of the implication logical expression. The task of the recipient is to invert this truth table to infer the extended truth values of A and B from the extended truth value of A ⇒ B. If the received truth value for A ⇒ B is F, the recipient can say with certainty that the sender believes that A is true and B is false. This gives the maximum information. If the recipient receives the truth value T for A ⇒ B, there are 5 possible combinations of extended truth values for A and B. This gives the least information about the belief state of the sender.
The order of truth values in Figure 3.8.1 is intentionally different to the normal English-language usage. The reason for this is the fact that F and T are represented by the numbers 0 and 1 respectively in some contexts. Therefore increasing numerical order is adopted here.

Figure 3.8.1  Ontology of unknown truth values
(Diagram: machine M1 holds a true/false model in which the map τ assigns F or T to each proposition P1, P2, P3, . . . in P. An incomplete information flow of axioms and rules leads to machine M2, which holds a true/false/unknown model in which the map τ′ assigns F, T or U to the same propositions.)
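The inversion task described in Remark 3.8.1 can be checked by brute-force enumeration. The extended truth table of Remark 4.2.6 is not reproduced in this section, so the sketch below assumes that it agrees with the strong Kleene three-valued table for implication. That assumption is at least consistent with the counts quoted above (one combination for F, five for T). The function names are hypothetical.

```python
from itertools import product

VALUES = ("F", "T", "U")   # F before T, as in Figure 3.8.1, followed by "unknown"


def implies(a, b):
    """Assumed extended truth value of A => B (strong Kleene convention)."""
    if a == "F" or b == "T":
        return "T"
    if a == "T" and b == "F":
        return "F"
    return "U"


def invert(received):
    """All pairs (A, B) which the sender may hold, given the received value of A => B."""
    return [(a, b) for a, b in product(VALUES, repeat=2) if implies(a, b) == received]


for value in VALUES:
    print(value, invert(value))
```

Under this assumption the nine combinations split as 1 + 5 + 3 for the received values F, T and U respectively.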
3.8.2 Remark: Undecidable truth values are a consequence of incomplete information transfer.
When a mathematical model contains undecidable propositions, a pseudo-truth-value tag U may be attached to those propositions which are proved to be undecidable. In this case, a situation arises which is illustrated in Figure 3.8.1. There is a presumed system M1 which is being modelled by system M2. In M1, the truth value map τ is well defined for all propositions P ∈ P.
One way in which the information about the truth values of propositions in the space P could become incomplete is the incomplete transmission of truth values from M1 to M2. In this case, truth values are transmitted one by one. But for large sets of propositions, particularly for infinite sets, one usually transmits information about propositions via axioms and deduction rules. In the axioms-and-rules case in machine M2, incompleteness of the truth value function τ′ relative to machine M1 may arise because the axioms and rules are insufficient to span the whole proposition space P.
In the case of first order languages such as ZF set theory, there are indeed axiomatic systems which are incomplete. One may think of such systems as being imperfect models M2 of perfect models M1. In other words, incomplete axiomatic systems are models for implied complete systems in which all truth values are well defined. The pseudo-truth-value U indicates an incompleteness of information in a model which suffers from incomplete transfer of information.
The question still remains, whether one should add extra truth values to symbolic logic to have the satisfaction of assigning a truth value to every proposition. This would be as useful as adding an unknown pseudo-integer to the integers or an unknown pseudo-real-number to the real numbers. It is somewhat absurd to add such pseudo-values to every set in mathematics and logic. (This kind of absurdity is also discussed in Remark 4.2.8.) It is better to deal with incomplete information meta-logically, in the discussion context.
The danger of inventing pseudo-truth-values for propositions is that one may fall into the error of supposing that the system under study (M1 in this case) possesses these pseudo-truth-values, which is not correct. The discussed system M1 has definite truth values for all propositions. They just happen to be unknown in the discussing system M2.
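The relation between τ in machine M1 and τ′ in machine M2 can be sketched as follows. This is only an illustration under stated assumptions: the four propositions, the single axiom and the single rule are invented for the example, rules are restricted to the form "if all premises are true then the conclusion is true", and U is used here in the weak sense "not derivable from what was transmitted", which is not the same as "proved undecidable".

```python
# Machine M1: a definite truth value for every proposition in the space P.
tau = {"P1": True, "P2": True, "P3": False, "P4": True}

# What is transmitted to M2: axioms, and rules of the form (premises, conclusion).
axioms = {"P1": True}
rules = [(("P1",), "P2")]   # "if P1 is true then P2 is true"


def span(axioms, rules):
    """Truth values which M2 can derive from the axioms by forward chaining."""
    known = dict(axioms)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(known.get(p) is True for p in premises):
                known[conclusion] = True
                changed = True
    return known


def tau_prime(p, known):
    """M2's view of p: "T", "F", or the meta-level marker "U"."""
    if p not in known:
        return "U"          # a statement about M2's information, not about M1
    return "T" if known[p] else "F"


known = span(axioms, rules)
for p in sorted(tau):
    print(p, "M1:", "T" if tau[p] else "F", "M2:", tau_prime(p, known))
```

The marker "U" is produced only by `tau_prime`, that is, in the discussing system. The dictionary `tau` for M1 contains no such value, which is the point made at the end of Remark 3.8.2.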
The situation is perhaps clarified by considering two logic machines M1 and M2 which are modelled by a third machine M3. (See Figure 3.8.2.)

Figure 3.8.2  Ontology of unknown truth values: two modelled logic machines
(Diagram: machines M1 and M2 each hold a true/false model, with maps τ1 and τ2 respectively assigning F or T to the propositions P1, P2, P3, . . . in P. From each of them, an incomplete information flow of axioms and rules leads to machine M3, which holds a true/false/unknown model in which the map τ′ assigns F, T or U.)
Even if the communicated axioms and rules are identical for M1 and M2, the truth value functions τ1 and τ2 may be different for some propositions P ∈ P, but within the span of the axioms and rules, there must be complete agreement between τ1, τ2 and τ′. For the propositions P ∈ P for which τ1(P) and τ2(P) are different, the value of τ′(P) must be U.
Now, if one wishes to model machine M2 by a third machine M3, then it would be sensible to permit three truth values in the model of M2 by M3. This causes a minor blow-out in the number of types of unknowns. (See Figure 3.8.3.) Of course, this procedure becomes more “interesting” if one has a larger network of machines which are all modelling each other. This introduces such concepts as “unknown unknowns”. Such a “network of logic machines” situation does often happen in human literature where multiple contexts refer to each other.

Figure 3.8.3  Ontology of unknown truth values: second-order modelling of unknowns
(Diagram: machine M1 holds a true/false model with values F, T and map τ. Incomplete information flow 1, via axioms and rules, leads to machine M2, which holds a true/false/unknown model with values F, T, U = {F, T} and map τ′. Incomplete information flow 2, via axioms and rules, leads to machine M3, which holds a true/false/unknowns model with values F, T, {F, T}, {F, U}, {T, U}, {F, T, U} and map τ′′.)

3.8.3 Remark: Logical algebra is the “restoration” of lost truth values.
As mentioned in Remark 46.2.4, the original Arabic word “algebra” meant “restoration” in the sense of restoring bones to their original condition after breakage, which is analogous to restoring the unknown values of numbers, given some formulas for the unknowns which are somehow lost. This is fully applicable to “logic algebra” (i.e. propositional calculus), which seeks to “restore” the unknown truth values of propositions, given some logical formulas for their lost values.
This perspective implies that truth values do have values, and the purpose of propositional calculus is to restore those values. Consequently, truth and falsity do not originate in the propositional calculus. Truth and falsity are assumed to be well defined, and the task is merely to discover their values. Hence no discussion of inconsistency-tolerant logic is required, and the excluded middle is guaranteed.

3.9. The semantics of truth and falsity

3.9.1 Remark: Analogy between undefined truth concept and undefined probability concept.
Truth is generally not defined in mathematical logic, although it is the core concept. Truth is left as an undefined concept with well-defined procedures for its calculation, although what it is is not stated. This is analogous to the way in which probability is undefined in probability theory, and set membership is undefined in set theory. The meaning of “truth” must be sought outside the domain of mathematical logic.

3.9.2 Remark: Importance of giving meaning to the concepts of truth and falsity.
The primary objective of logical argumentation is the tagging of propositions with the adjectives “true” or “false”, with the implication that the reader (or listener) should accept the true propositions and reject the false propositions (as discussed in Section 3.7). It is therefore reasonable to commence a treatment of logic by attempting to clarify the concepts of truth and falsity.

3.9.3 Remark: Mathematical logic truth is restricted to two truth values.
The concept of “truth” can be employed in a wide variety of contexts. As mentioned in Remark 4.1.1, the abstract calculus of truth and falsity may be applied to a wide range of concrete systems whose components may be in either of two states. The states may be called “true” and “false”, or “on” and “off”, or “high” and “low”, or any other pair of names.
The concept of “truth” in abstract logic should be carefully distinguished from the related concept of the accuracy of a hypothesis, model or theory. Abstract logic is a well-defined discipline involving truth tables, logical operators, logical quantifiers, deduction rules, axioms, definitions and theorems. In abstract logic, truth and falsity are effectively defined by the logical procedures which manipulate propositions and their truth values. (The methods and procedures of abstract logic are the subject of Chapter 4.)
By contrast, the truth (or falsity) of propositions which refer to the real world is highly subjective, depending on the entire intellectual framework and context in which the modelling is carried out. The notion of “scientific truth” refers generally to the accuracy and suitability of particular models for explaining observations of phenomena. Scientific truth depends on the ambiguities of our observations of the real world whereas logical truth is defined by social conventions alone.
In colloquial language, the real-world kind of “truth” refers to both the lack of intentional deception and the lack of accidental error. The kind of truth referred to in this case is the accuracy of the correspondence between a person’s model of the world and the world itself. This kind of truth is not the subject of mathematical logic. Therefore there is no need for concepts like the confidence level of an assertion in mathematical logic. There are two truth values, and only two truth values. If observations of the real world are subjective and vulnerable to error and uncertainty, that is not an issue which needs to be discussed within mathematical logic.
The distinction between abstract and concrete logic is similar to how the abstract game of chess is distinguished from the concrete pieces of wood or plastic which are moved around on a physical board.

3.9.4 Remark: The concepts of truth and falsity arise from doubt and suspicion.
In human discussions and literature, there is no absolute need for the concepts of truth and falsity to play a role. As long as there is no doubt about the veracity of what is written, there is no need to discuss whether a statement is true or false. However, if any doubt arises, because of the possible dishonesty or ignorance of the speaker or writer, it is necessary to attach some sort of truth value or confidence value to every proposition which is in doubt. Then each proposition P becomes the subject of a meta-assertion, which states that P is true or false. So truth and falsity are always part of a meta-discussion, where a previous discussion is itself the subject of the discussion. In the modern world, we are so familiar with such meta-discussion that we are almost unaware of the transition between assertions and meta-assertions.
The sceptical mind-set doubts all statements by all people, including oneself. But this kind of distrust could perhaps be regarded as pathological, indicating a serious breakdown of trust. If all discussion is met with scepticism, the meta-discussions by the sceptics are likewise subject to doubt, in which case there is little purpose in listening to any discussions at all. In practice, generally only a small proportion of statements are subject to active doubt. When humans are doing practical logic, usually only a small number of propositions are thrown into question.
[ Insert here a comment on the relation of the truth concept to indicator (or characteristic) functions. For example, “true North” means the correct choice of directions among many directions. The “true” North is the direction whose truth value is “true”. This is like superimposing a characteristic function on the set of all directions, such that the correct direction has the value “true” and all other directions have value “false”. A large proportion, maybe all, “truth” contexts are of the nature of a multiple choice, where one choice is “true”. Therefore one may express a proposition in two different but equivalent ways: either as a truth-valued “indicator function”, or as a simple proposition stating that the one and only one correct choice is the “true” choice. ]
[ Remove the redundancy between Remarks 3.7.3 and 3.9.5. ]

3.9.5 Remark: Truth and falsity are not essential in natural language.
Most of the time, people do not consciously think in terms of the two truth values for propositions. Mostly, people think in terms of “multiple choice”. (This is also mentioned in Remark 3.7.3.) For example, people think about which path is best to take when walking to a destination. If a person is thinking about whether the left-most path is the shortest, the person does not think: “Is it true or false that the left-most path is the best?” The conscious choice is not between “true” and “false”, but rather between the various path choices. The concepts of “true” and “false” are not really necessary at all. The concept of which choice to make is sufficient.
Consider the question: “Is option A the correct decision?” This question can be answered with “yes” or “no”. This looks very much like a true/false answer. Instead of replying “yes” or “no” to the above question, one could answer with a different choice: “Option B is the correct decision.” It seems as if the attachment of truth values to propositions is unnecessary. If it is known a-priori that one and only one of a set of options is true, the assertion of a particular option implicitly negates the other options.
There are some languages which lack the words “yes” and “no”, such as Latin and Irish Gaelic. (In Irish Gaelic, the response to a yes/no question is essentially to repeat the question in the indicative or in the negative.) Discussions can clearly be conducted without yes/no and true/false vocabulary.

3.9.6 Remark: Three stages of logic abstraction, analysis and application.
Abstract logic is an attempt to collect and formalize the common elements of practical logical thinking. As for any kind of abstraction, there are three stages in this process. First, the concrete context to which abstract logic is to be applied must be mapped to an abstract logic system. Secondly, some calculations are carried out within the abstract framework. Thirdly, the conclusions from the abstract logic must be mapped back to the application context. (See Figure 3.9.1.) This process can be beneficial if the modelling is accurate.

Figure 3.9.1  Three-stage logic abstraction and application process
(Diagram: concrete propositions are mapped by “abstraction” to abstract propositions. The “logical calculus” maps abstract propositions to abstract propositions. These are mapped by “application” back to concrete propositions.)
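The three stages of Remark 3.9.6 can be made concrete with a deliberately tiny Python sketch. The two English sentences, the names A and B, and the single assumed implication A ⇒ B are all invented for the illustration.

```python
# Stage 1 (abstraction): concrete sentences are given abstract proposition names.
concrete = {
    "A": "It is raining.",
    "B": "The ground is wet.",
}
truth = {"A": True}                  # what is known in the concrete context

# Stage 2 (logical calculus): work with names only; here, modus ponens on A => B.
implications = [("A", "B")]          # the abstract statement A => B is assumed true
for antecedent, consequent in implications:
    if truth.get(antecedent) is True:
        truth[consequent] = True

# Stage 3 (application): map the abstract conclusions back to concrete sentences.
conclusions = [concrete[name] for name, value in truth.items() if value]
print(conclusions)                   # ['It is raining.', 'The ground is wet.']
```

The middle stage never inspects the English sentences. Whether the final list is useful depends entirely on how accurately the first stage models the concrete context.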
3.9.7 Remark: Analytic logic and synthetic logic.
The abstract logic domain is generally called “analytic logic”. The use of logic in concrete contexts is called “synthetic logic”.
[ Remove the definitions of analytic and synthetic logic if no reference can be found for them. ]

3.9.8 Remark: Propositions may be abstracted from any model or observation.
The most basic prerequisite for the applicability of abstract logic to a context is that there must be a well-defined space of propositions. Each proposition must have a truth-value attribute which takes one and only one of the values “true” and “false”. For example, in the case of the Leonardo da Vinci painting known as the “Mona Lisa”, one may ask whether she is smiling or not. If the proposition “she is smiling” can only be true or false, then abstract logic is applicable. If the answer to the question “Is she smiling?” could be “yes and no”, then standard abstract logic is not applicable.
[ Strictly speaking, the proposition symbols in the following should be wff names like α rather than proposition names like A. Maybe this should be clarified. ]

3.10. The semantics of logical negation

3.10.1 Remark: Dynamic interpretation of truth tables for proposition-store logic machines.
It is usual in textbooks to give the following “truth table” for the negation operator.

    A    ¬A
    T    F
    F    T

This is taken to mean that if A is true, then ¬A is false; and if A is false, then ¬A is true. In terms of the logic machine model, the table means that if A is in the true-proposition list of machine M, then ¬A should not be inserted in the list; if ¬A is in the true-proposition list of machine M, then A should not be inserted in the list. In other words, the negation of a proposition blocks the acceptance of that proposition.
However, the truth table does not mean: “If A is not in the true proposition list, then ¬A should be inserted in the list.” If neither A nor ¬A is in the true-proposition list, the truth table tells us nothing about whether ¬A or A respectively should be inserted in the list. The truth table creates the impression that all propositions are either true or false. But in a dynamic system, neither the proposition A nor its negation ¬A may be in the true-proposition list at a given point in time. If a proposition and its negative are both not in the true-proposition list, this means that the proposition is undecided, not necessarily undecidable.

3.10.2 Remark: Interpretation of truth tables as proposition-store state-machine instructions.
The truth table in Remark 3.10.1 may be converted to logic-machine pseudo-instructions as follows.
(i) If A ∈ T, then ¬A → F. (If A is in the T-list, insert ¬A into the F-list.)
(ii) If ¬A ∈ F, then A → T. (If ¬A is in the F-list, insert A into the T-list.)
(iii) If ¬A ∈ T, then A → F. (If ¬A is in the T-list, insert A into the F-list.)
(iv) If A ∈ F, then ¬A → T. (If A is in the F-list, insert ¬A into the T-list.)
The first row of the truth table in Remark 3.10.1 corresponds to rules (i) and (ii). The second row corresponds to rules (iii) and (iv). The ad-hoc notation A → T means “insert proposition A into the T-list”. Note that the phrase “A ∈ T” is a static prior condition which must be satisfied at the time that the rule is executed. The phrase “A → T” is a dynamic action which must be executed when a condition is satisfied. According to this logic-machine view, truth is dynamic. In other words, truth is a process.
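Rules (i) to (iv) of Remark 3.10.2 can be run as a small fixed-point computation. In the following sketch, the function name `apply_negation_rules`, the representation of ¬A as the string "¬A", and the restriction to a finite list of atomic names are assumptions made for the example.

```python
def apply_negation_rules(atoms, true_list, false_list):
    """Apply rules (i)-(iv) of Remark 3.10.2 repeatedly until nothing changes."""
    T, F = set(true_list), set(false_list)
    changed = True
    while changed:
        changed = False
        for a in atoms:
            na = "¬" + a
            steps = [
                (a in T, na, F),    # (i)   A in T   =>  insert ¬A into F
                (na in F, a, T),    # (ii)  ¬A in F  =>  insert A into T
                (na in T, a, F),    # (iii) ¬A in T  =>  insert A into F
                (a in F, na, T),    # (iv)  A in F   =>  insert ¬A into T
            ]
            for condition, prop, target in steps:
                if condition and prop not in target:
                    target.add(prop)
                    changed = True
    return T, F


T1, F1 = apply_negation_rules(["A", "B"], {"A"}, set())
print(sorted(T1), sorted(F1))   # ['A'] ['¬A']
```

In this run, neither B nor ¬B ends up in either list. The proposition B is therefore undecided, which is the situation that the static truth table in Remark 3.10.1 does not describe.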
3.10.3 Remark: Self-consistency rules for proposition-store logic machines.
There is nothing in the insertion rules in Remark 3.10.2 to prevent both A and ¬A being in both the true-list and the false-list. To prevent this situation, there must be an a-priori rule to prevent contradictions. This “self-consistency rule” can have two different forms.
(1) It is forbidden to have an identical proposition A in both the true-list and the false-list.
(2) For any proposition A, it is forbidden to have both A and ¬A in the true-list, or both A and ¬A in the false-list.
Rule (1) is not sufficient to prevent contradictions on its own. But if it is combined with the truth table for the negation operator, contradictions are prevented. For example, suppose A and ¬A are in the true-list. The negation operator rules imply that ¬A must be inserted in the false-list. Then ¬A will be in both lists. So the rule is contravened.
Likewise, rule (2) is not sufficient to prevent contradictions on its own. But if it is combined with the truth table for the negation operator, contradictions are prevented. For example, suppose A is in both the true-list and the false-list. Then the negation operator rules imply that ¬A must be inserted into both lists. So A and ¬A will both be in both lists. So the rule is contravened.

3.10.4 Remark: If a proposition-store logic machine generates a contradiction, it must be rebooted.
Remarks 3.10.1 and 3.10.3 lead to a dilemma if the negation-operator rule somehow clashes with the self-consistency rule according to the example sequences of events described in Remark 3.10.3. Which rule takes precedence? Should the negation rule proceed unhindered, yielding an inconsistent logic machine (which breaks the self-consistency rule)? Or should the negation rule be prevented from acting in such cases, which then yields a system which is still in a self-contradictory state?
In the computer software context, such a situation would lead to a “system hang” where the computer can no longer continue to function. Under such circumstances, one may “reboot” the system and hope that the problem will not occur again for a long time, or that the problem was caused by an “execution error”. Or one may abandon all machines with the same design because they yield contradictions. Alternatively, one may remove one or more of the “boot-time” propositions (called “axioms”) and re-start the machine in the hope that the removed propositions were the sole cause of the problem.
This is more or less what happened with Russell’s paradox. When this was discovered, the axioms of set theory were modified to (hopefully) prevent the same kind of contradiction happening again. Various new set theory designs were proposed, and these are now used instead of the earlier “buggy” set theory. The fixes for Russell’s paradox would be called “bug fixes” or “(software) patches” in the computer software context. The subsequent divergence of set theories into various flavours would be referred to as “forking” of the system into multiple “derived systems”.
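The interplay between the negation rules, the self-consistency rule (1) of Remark 3.10.3 and the “reboot” of Remark 3.10.4 can be sketched as follows. The exception name `Contradiction`, the function names, and the policy of simply dropping one suspect boot-time proposition are assumptions made for the illustration. Choosing which axiom to remove is the hard part, and the sketch does not address it.

```python
class Contradiction(Exception):
    """Raised when the self-consistency rule is broken: the machine cannot continue."""


def opposite(p):
    """Proposition sent to the false-list by rule (i) or (iii) when p is in the true-list."""
    return p[1:] if p.startswith("¬") else "¬" + p


def check_consistency(T, F):
    """Rule (1): no identical proposition may be in both the true-list and the false-list."""
    both = set(T) & set(F)
    if both:
        raise Contradiction(sorted(both))


def boot(axioms):
    """Start a machine from its boot-time propositions and apply the negation rules."""
    T = set(axioms)
    F = {opposite(p) for p in T}
    check_consistency(T, F)
    return T, F


axioms = ["A", "¬A", "B"]
try:
    T, F = boot(axioms)
except Contradiction as err:
    print("contradiction found in:", err.args[0])
    axioms.remove("¬A")        # remove one suspect boot-time proposition
    T, F = boot(axioms)        # and reboot

print(sorted(T), sorted(F))    # ['A', 'B'] ['¬A', '¬B']
```

With A and ¬A both in the true-list, the negation rules place ¬A and A in the false-list, so both propositions appear in both lists and rule (1) is contravened, as in the first example of Remark 3.10.3.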
3.10.5 Remark: Reductio ad absurdum may be interpreted as a local-scope reboot.
The show-stopping (or machine-stopping) contradictions mentioned in Remark 3.10.4 are acted out in miniature in the RAA (reductio ad absurdum) method of deduction. In the RAA method, a tentative assumption A is first made (with the intention of proving it false), and deductions are made from this assumption until a contradiction is encountered. This causes a mini-reboot of the logical system. (In fact, one may think of the logical argument which follows the tentative assumption as a temporary context which is “pushed” onto a “stack” of contexts, in terms of a particular kind of computer programming model.)
When a contradiction is encountered, the RAA method specifies that the original tentative assumption A may be asserted in the negative (¬A), and all deductions from the positive assertion A must be removed. This is like building a mini-logic-machine which includes the tentative assumption A, and this mini-machine is destroyed when a contradiction is found in it. (In a sense, the RAA method is very similar to the “devil’s advocate” method of argument.)
[ Does the regula falsi method have any connection with the RAA method? ]

3.10.6 Remark: The difficulty of expressing compound logical expressions in natural languages.
The negation operator seems straightforward enough when one considers only abstract proposition names. One simply writes the symbolic expression ¬A corresponding to any expression A. However, suppose the name A refers to a concrete proposition such as: “The Earth is round.” How must one interpret the proposition ¬A? The concrete proposition “¬The Earth is round.” does not have any obvious meaning. The symbol “¬” is not part of the natural language of the space of concrete propositions. Since we do know English, we can write: ¬A = “The Earth is not round.”
This form of negation seems plausible enough. We merely need to understand the language in which the concrete proposition (i.e. sentence) is written and then form the negative of that proposition. But this is not always so easy.
As a more complex example, consider the sentence B = “The Sun does not rise in the East.” Perhaps the negative of this sentence is: “The Sun does not not rise in the East.” This is almost acceptable, although it is not a common construction in English. Another candidate for the negative sentence is: “The Sun does not rise not in the East.” This is also unnatural. Since we know English very well, we can guess the correct negative as: “The Sun does rise in the East.” Or more simply: “The Sun rises in the East.”
But the examples can be more complex than that. Let C = “Either the Arctic is melting or the Equator is warming or not becoming less humid, unless the global warming trend is not verified.” It quickly becomes clear that writing propositions such as ¬C in the concrete proposition language requires an unreasonably detailed knowledge of the language.
A clean way to resolve the difficulty of concretely representing negations of propositions is to define ¬A to mean “A is false” or “t(A) = F”, or some such expression in a meta-language. To be more precise, ¬A means: “The proposition to which we give the name A is false.” To be even more precise than this, we could say that ¬A is a compound proposition name expression whose truth value is the negative of the truth value of A, but which does not point to a specified sentence in the concrete language, although there may be a sentence in the concrete language whose truth value is guaranteed to be the negative of the truth value of A also. Only simple proposition names point directly to concrete propositions. All proposition expressions point to simple proposition names, which in turn do point to concrete propositions.
When we say that a proposition is false, the proposition is itself the subject of the assertion. Truth values are not always available in a concrete proposition language. We discuss truth-value proposition-attributes in a “discussion language” or “meta-language”. In natural English, we discuss truth and falsity within the language itself, but this is not necessarily possible in all concrete languages.
For any concrete proposition A, we can fairly plausibly write ¬A as “It is not true that A.” However, in some languages, the concrete proposition A must be converted to a different verb mood and/or word order, and the rules for this may be quite complex, even if the language does possess a sentence construction of this kind. (In the case of transistor circuit voltages or Egyptian hieroglyphics, the map of abstract logic expressions to concrete “sentences” is even more problematic than is described here.)
A classic example of the difficulty of negating a proposition arises from a question such as: “Have you stopped committing crimes?” One possible answer is: “I have stopped committing crimes.” The apparent negative of this answer is: “I have not stopped committing crimes.” Although both of these sentences have a well-defined meaning in English, they are not true negatives of each other. Most people would be reluctant to
answer “yes” or “no” to the question because neither answer would seem correct. This suggests that concrete language is unsuitable for logical operations. If even simple negation is so problematic, there is little hope of performing the full generality of logical operations with crisp precision.
3.10.7 Remark: A negated proposition name is not the same as a meta-sentence.
Remark 3.10.8 discusses the consequences of regarding ¬A as a meta-proposition relative to A. This seems right in the sense that the sentence “A is false.” is a meta-sentence relative to the sentence A. However, a better interpretation is that A is a name of a proposition, and ¬A is a logical expression constructed from the name A, and when asserted, ⊢ ¬A means that the expression φ¬(t(A)) has the value T. That is, ⊢ ¬A is equivalent to φ¬(t(A)) = T. Thus all logical expressions of proposition names, except for atomic proposition names, are proposition pseudo-names which are well-defined only in the abstract logic, not in the concrete logic. Similarly, A ∨ B is a proposition pseudo-name which, when asserted, has a well-defined meaning. Namely, ⊢ A ∨ B means φ∨(t(A), t(B)) = T.
Nevertheless, the approach discussed in Remark 3.10.8 is useful in clarifying the consequences of assuming that non-atomic logical expressions are meta-propositions relative to the concrete proposition space in some sense.
3.10.8 Remark: The difficulty of mapping abstract logical expressions back to concrete propositions.
Even if one accepts that the truth and falsity of concrete propositions must be discussed within a meta-language, the problem of the multiple application of logical operators still arises. As a trivial example, the double negation expression ¬¬A must be given meaning for any proposition A. If ¬A is located in a meta-language, surely ¬¬A is located in a meta-meta-language! If ¬A means “t(A) = F”, then surely ¬¬A means “t(“t(A) = F”) = F”.
    ¬A ≡ “t(A) = F”
    ¬¬A ≡ “t(“t(A) = F”) = F”.
[ Alternatively t(¬A) = φ¬(t(A)). So t(¬A) = F ⇔ t(A) = T? ]
It is not obvious a-priori how propositions in the meta-meta-language can be mapped to the meta-language. After all, mapping propositions from the meta-language to the concrete language is, in general, either very difficult or even impossible. The situation is saved by the fact that the meta-language is a very special kind of language. Unlike the concrete proposition languages, we do know a lot about the syntax and semantics of the meta-language. In particular, we know that if the meta-proposition “t(A) = T” is false, then “t(A) = F” is true, and vice versa. Therefore double negation can be simply and reliably mapped from the meta-meta-language to the meta-language. (Of course, this does not imply that a further map to the concrete language is simple or reliable.)
The meta-meta-proposition “t(“t(A) = F”) = F” can be mapped to the meta-proposition “t(A) = T”. This follows from our knowledge that the “true” and “false” truth-values are mutually exclusive in the meta-language layer. Similarly, the meta-meta-propositions “t(“t(A) = T”) = F” and “t(“t(A) = F”) = T” can both be mapped to the meta-proposition “t(A) = F”.
    “t(“t(A) = T”) = T” ↦ “t(A) = T”
    “t(“t(A) = F”) = F” ↦ “t(A) = T”
    “t(“t(A) = T”) = F” ↦ “t(A) = F”
    “t(“t(A) = F”) = T” ↦ “t(A) = F”.
The meta-proposition “t(A) = F” cannot be mapped straightforwardly to a concrete proposition, as discussed above, but the meta-proposition “t(A) = T” is generally identified with the concrete proposition A.
    “t(A) = T” ↦ A
    “t(A) = F” ↦ ???.
This is an exception. Essentially all logical expressions, apart from atomic expressions A, are problematic to map to the concrete language. (This is illustrated in Figure 3.10.1.)
Figure 3.10.1: Mapping (meta-)meta-propositions to (meta-)propositions. (The figure shows three layers for the concrete proposition A = “Tá an Domhan cothrom.”: the meta-meta-propositions t(“t(A) = T”) = T, t(“t(A) = F”) = F, t(“t(A) = T”) = F and t(“t(A) = F”) = T; the meta-propositions t(A) = T and t(A) = F; and the concrete propositions A and “. . . ???. . . ”.)
In the case of all other logical expressions, the same argument applies as for the double negative. Just as we can say that ¬¬A is fully equivalent to A, if we interpret A as the meta-proposition t(A) = T, so also we can interpret A ∧ A, for example, as fully equivalent to A with the same interpretation. More generally, all equivalences are valid within the meta-proposition layer, but these equivalences are not always valid in the concrete proposition layer.
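Equivalence within the meta-proposition layer can be checked mechanically by comparing truth values for all choices of truth values of the proposition names. The following Python sketch does this by brute force. The tuple encoding of expressions and the function names `names`, `value` and `equivalent` are assumptions made for this illustration only.

```python
from itertools import product

# Hypothetical encoding: a proposition name is a str; compound expressions
# are tuples ("not", e) or ("and", e1, e2).
def names(expr):
    """Return the set of proposition names which occur in an expression."""
    if isinstance(expr, str):
        return {expr}
    return set().union(*(names(sub) for sub in expr[1:]))

def value(expr, assignment):
    """Truth value of an expression for given truth values of the names."""
    if isinstance(expr, str):
        return assignment[expr]
    if expr[0] == "not":
        return not value(expr[1], assignment)
    if expr[0] == "and":
        return value(expr[1], assignment) and value(expr[2], assignment)
    raise ValueError(f"unknown operator: {expr[0]!r}")

def equivalent(e1, e2):
    """True if e1 and e2 have the same truth value for every assignment."""
    atoms = sorted(names(e1) | names(e2))
    rows = product([False, True], repeat=len(atoms))
    return all(
        value(e1, dict(zip(atoms, row))) == value(e2, dict(zip(atoms, row)))
        for row in rows
    )

double_negation = ("not", ("not", "A"))
print(equivalent(double_negation, "A"))    # same truth value in every case
print(equivalent(("and", "A", "A"), "A"))  # likewise for A ∧ A
print(double_negation == "A")              # but the expressions themselves differ
```

The check operates only on truth values. It says nothing about whether a concrete sentence corresponding to ¬¬A or A ∧ A exists.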
[ Find references for the “deflationary” and “redundancy” theories of truth, where apparently a sentence of the form “A is true.” is equivalent or identical to the corresponding sentence A. Of course, I don’t agree with such theories, but it’s good to present opposing viewpoints. ]
3.10.9 Remark: General truth functions.
Having discussed one particular “truth function”, namely the negation operator in Remark 3.10.1, it is straightforward to generalize this concept to general truth functions. (See Definition 4.2.4 for truth functions.)
3.10.10 Remark: Semantics of abstract logical propositions.
The primary application of truth functions is to define logical operators. Each logical operator corresponds to a fixed truth function. The truth table in Remark 3.10.1 defines the unique unary truth function f : {F, T} → {F, T} which satisfies f(T) = F and f(F) = T. This is the truth table of the unary logical operator “¬”, namely the logical negation operator. To be more precise, let t(A) denote the truth value F or T of a proposition A. Then ¬A means a proposition whose truth value t(¬A) satisfies t(¬A) = f(t(A)), where f is the unary truth function defined by the table in Remark 3.10.1.
There is something unsatisfying in this definition of negated propositions ¬A. How do we know that there exists a proposition in the universe of propositions which satisfies t(¬A) = f(t(A))? How do we know that there is at most one such proposition? A simple way to resolve these questions is to simply define ¬A to mean “the proposition which has the opposite truth value to A”. But this is also unsatisfying because meta-language is required to give this proposition meaning. In fact, this is essentially the real situation. In real life, assertions of the form “A is not true” have the character of meta-propositions. A similar comment applies to compound propositions such as “A and B are both true” and “B is true if A is true”. However, in our culture, we accept meta-propositions such as “it is not true that the cat sat on the mat” (t(A) = F) on an equal footing to simple propositions such as “the cat sat on the mat” (A).
3.10.11 Remark: Indicative and subjunctive verb moods for abstract propositions.
The truth value attribute t(A) for propositions A, referred to in Remark 3.10.10, is related to the subjunctive
verb mood which is discussed in Remark 3.7.1. Meta-propositions of the form “t(A) = T” correspond to the indicative mood.
3.10.12 Remark: Semantics of assertion of negative propositions.
In the same way that one defines “⊢ A” to mean “there exists a proof for A” (where “exists” means that the person making the assertion can produce a proof on demand), one might conjecture that “⊢ ¬A” might mean “there exists no proof for A”. This could be interpreted to mean that there is a provable meta-theorem which asserts that no proof for the proposition A is possible. (Of course, this requires an entire meta-logic to be defined and developed.)
One way to prove (in a meta-logic) that no proof for A can be provided would be to show that the result of any such proof would lead to a contradiction. But this would require a powerful general meta-theorem which proves that the original logic is contradiction-free, i.e. consistent. This form of proof (of the impossibility of a proof for A) is essentially the same as the RAA rule. (See Section 3.11.) Other kinds of meta-logical proofs of the impossibility of finding a proof are likely to be much more complicated than this, for example involving concepts like “reachability” and “span”.
Even if a meta-mathematical proof of the non-existence of a proof may be provided, this would not be a convincing meaning for the assertion “⊢ ¬A”. This kind of assertion must mean that the truth value of A is “false”. This, in turn, is only well-defined when interpreted within the concrete context to which the abstract logic is being applied.
The semantics of “⊢ ¬A” may be denoted as “t(A) = F” in the same way that the meaning of “⊢ A” may be denoted as “t(A) = T”. There is no need to discuss proofs or existence of proofs at all. More generally, the assertion of any abstract logical expression has a meaning which can be fully expressed in terms of truth values of the concrete propositions to which the proposition names in the expression refer.
3.10.13 Remark: Equivalent logical expressions versus identical logical expressions.
When two compound logical expressions have the same truth value for all choices of truth values for the atomic propositions, one may ask whether these expressions are the same, and if so, in what sense. For example, are the expressions A and ¬¬A the same? This is similar to asking whether the algebraic expressions x and (x + 1) − 1 are the same. We would generally say that these expressions are different, but they always have the same value. So they are “equivalent”. In the same way, equal-valued logical expressions such as A and ¬¬A are equivalent, but not identical expressions. A logical expression specifies a set of operations which are to be carried out. For example, the expression ¬¬A specifies that the sequence of operations φ¬ ◦ φ¬ must be applied to A. More complex expressions require trees of operations, not just linear sequences of operations.
As mentioned in Remark 3.10.8, a concrete proposition corresponding to the abstract logical expression ¬A does not necessarily exist. The space of concrete propositions is not necessarily closed under the usual logical operations (in Notation 4.3.3) because the concrete proposition language might not have all such operations. In particular, an abstract expression like ¬¬A might not correspond to a concrete proposition. Therefore one certainly cannot say that the propositions A and ¬¬A are “the same”, because ¬¬A might not even be defined as a concrete proposition. It is much tidier to regard simple atomic propositions like A as names of concrete propositions, and then regard all compound logical expressions, including ¬A, as being merely notations for logical function trees in terms of the truth values of the component propositions. (The parsing of logical expressions to produce the corresponding function trees is discussed in Remark 4.3.10.) Thus ¬A, in particular, is not the name of a concrete proposition.
3.10.14 Remark: Interpretation of logical expressions as triggering delayed conclusions.
As mentioned in Remark 3.7.17, compound propositions such as A ⇒ B and A ∨ B may be thought of as triggering future deductions. The proposition B is “triggered” by A if A ⇒ B is true. The proposition B is “triggered” by ¬A if A ∨ B is true.
The triggering concept gives a clue to how one might interpret the negation of a proposition. As observed in Remark 3.12.1, the consequent clause of a conditional is typically in the imperative verb mood in natural languages. (In other words, people rarely say that if A is true, then B is true. More often, people say that if A is true, then action B must be carried out.)
Suppose A ⇒ B is a conditional, where B is in the imperative verb mood. If a person believes that A is true, this implies that they must carry out action B. If A is false, this implies that B does not need to be carried out, at least not on account of the rule A ⇒ B. In ancient history, laws had the form of conditionals with an imperative consequent clause. Finding that the antecedent clause was false meant that the consequent action did not need to be carried out.
As suggested in Remark 3.10.1, the negation of a proposition blocks the acceptance of the proposition. Since a primary reason for wanting to establish the truth value of propositions is to determine consequent actions, the negation of a proposition has the result of blocking the consequences of accepting it. Thus the negation of a proposition is not merely the non-assertion of it, but also an active blocking of its assertion.
3.10.15 Remark: Truth and falsity versus belief and disbelief.
Instead of “truth” and “falsity”, one should perhaps talk of “belief” and “disbelief”. In the English language, the word “disbelief” can mean either “not believing something” or “believing that something is false”. In other words, disbelief can indicate either a scepticism, which could later be converted to belief, or a positive belief in the falsity of a proposition. (This is reminiscent of the Scottish law distinction between “not guilty” and “not proven”.) When we say that a proposition is false, we generally mean that it is not only not provable, but also something stronger than that.
One way to interpret falsity is to consider implications like: “If the food is poisonous, you must not eat it.” If the proposition P = “The food is poisonous.” is true, one must not eat the food. And if there is any chance at all that it is poisonous, one should presumably not eat it. However, if it is positively asserted that the food is not poisonous, it would be safe to eat it.
The mere inability to prove P would not be sufficient grounds for eating the food. But the positive assertion that P is false means that the food is safe to eat. One no longer needs to think at all of the logical implications of P being true.
3.11. Proof by contradiction
3.11.1 Remark: Equivalent and related concepts to proof by contradiction.
The important method of mathematical logic known as “proof by contradiction” is equivalent, or identical, to the concepts and methods of reasoning known as “the excluded middle”, “reductio ad absurdum” (RAA), “Ex contradictione sequitur quodlibet”, “Ex falso sequitur quodlibet” and “the indirect method”. There is a long history of controversy about this method.
[ Find references and translations for Ex contradictione sequitur quodlibet and Ex falso sequitur quodlibet. ]
3.11.2 Remark: Proof by contradiction is based on the excluded middle.
One of the most important tools of mathematical argument is “reductio ad absurdum” (RAA), which is based on the “excluded middle”, which means that all well-formed propositions are either true or false (but not both, and not some other truth value). Bell [189], pages 565–566, quotes Luitzen Egbertus Jan Brouwer as saying that: “A implies the absurdity of the absurdity of A, but the absurdity of the absurdity of A does not imply A.” One may reply to this that: “the absolute impossibility of the absolute impossibility of A does imply A.” Therefore the “excluded middle” rule is accepted here.
Doubt about the validity of the excluded middle law does arise in colloquial, informal argumentation where multiple world-models are mixed indiscriminately. This is due to mixing multiple meanings of the words “true” and “false” in a single context, applying these words to propositions belonging to multiple inconsistent world models without specifying which word instances refer to which models.
3.11.3 Remark: Proof by contradiction assumes that the logical system is contradiction-free.
Logic is a model for human mental processes. The rules of logic must correctly describe how humans think.
By accepting RAA, we are merely incorporating into formal logic the observation that mathematicians do generally argue that if they can obtain both A and ¬A from a tentative assumption that B is true, then B must be false. This is equivalent to expressing the certainty that all of the previous assertions in the logical system that one has built up contain no contradictions.
3.11.4 Remark: Proof by contradiction is potentially dangerous.
The biggest danger of RAA is the fact that if a set of axioms is not self-consistent, then all propositions can be proved to be both true and false. So the whole logical system collapses. This danger may seem to
be an argument in favour of rejecting RAA, but if a logical system does contain self-contradictions, it is probably a good idea to make the system collapse. The RAA deduction method assumes that the logical system is self-consistent. The best cure for a logical system which is not self-consistent is to remove offending components of the system until it is self-consistent, not to remove the RAA deduction method.
3.11.5 Remark: Proof by contradiction is very dangerous if empirical propositions are permitted.
If some of the propositions in a logical system incorporate empirical observations and inferences, there would be a fair likelihood that contradictory assertions could enter into the system. In this case, the RAA method would certainly be unreliable and undesirable. The slightest error in observation or inference could lead to totally arbitrary conclusions. In order to use RAA successfully, one must be totally certain that the logical system has not previously acquired any contradictory assertions. This guarantee is fairly achievable in a purely abstract model, but not in a system which is frequently acquiring assertions from unreliable observations and inferences.
3.11.6 Remark: The excluded middle follows from world-model ontology, not proposition-store ontology.
The excluded middle assertion “¬¬A ⊢ A” seems difficult to justify if one accepts the proposition-store ontology for mathematical logic which is mentioned in Remark 3.7.2. According to this ontology, logical machines (such as human beings) accumulate propositions over time. Truth means that a proposition is accepted. Falsity means that a proposition is rejected. So double negation ¬¬A means that the rejection ¬A is rejected. But this does not imply the acceptance of the proposition A.
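The proposition-store picture can be imitated by a toy program in which propositions are opaque texts held in two sets. This is a deliberately naive sketch. The class `PropositionStore` and its methods are hypothetical and are introduced only to illustrate why rejecting ¬A leaves the status of A open in such a store.

```python
# Hypothetical model of the proposition-store ontology: the machine keeps
# a set of accepted texts and a set of rejected texts, and nothing more.
class PropositionStore:
    def __init__(self):
        self.accepted = set()
        self.rejected = set()

    def accept(self, proposition):
        """Record a proposition text as accepted ("true")."""
        self.rejected.discard(proposition)
        self.accepted.add(proposition)

    def reject(self, proposition):
        """Record a proposition text as rejected ("false")."""
        self.accepted.discard(proposition)
        self.rejected.add(proposition)

store = PropositionStore()
store.reject("¬A")  # the rejection ¬A is itself rejected, i.e. ¬¬A

print("¬A" in store.rejected)  # the text "¬A" has been rejected
print("A" in store.accepted)   # but nothing has put "A" into the accepted set
```

Since the store treats “A” and “¬A” as unrelated texts, no rule connects the rejection of one to the acceptance of the other unless such a rule is added by hand.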
However, if one adopts the world-model ontology which is mentioned in Remark 3.6.1, the assertion “¬¬A ⊢ A” is much easier to justify. According to this ontology, propositions are attributes of models. That is, propositions are either true or false according to whether the logic machine’s world-model has the attribute or does not. The possession or non-possession of any specific attribute partitions all world-model states into two subsets. (This is illustrated in Figure 3.11.1.)
Figure 3.11.1: World-model ontology for logical negation. (The figure shows a sender machine M1 and a receiver machine M2, each with model states z ∈ Z divided into the set Z_P = {z ∈ Z; P(z)}, labelled “P(z) true”, and the complementary set Z̄_P, labelled “other states”. The messages shown are “I assert P.”, for which P is true, and “I do not assert P.” Or: “I assert ¬P.” Or: silence, for which P is false.)
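The partition of model states by an attribute can be made concrete with a small finite model. In the following Python sketch, the state space `Z`, the attribute `P` and the helper names `extent` and `negation` are all hypothetical choices made for illustration.

```python
# Hypothetical finite world-model: each state z is a pair of yes/no attributes.
Z = frozenset(
    (raining, cold) for raining in (False, True) for cold in (False, True)
)

def extent(P):
    """Z_P = {z in Z; P(z)}: the set of states in which attribute P holds."""
    return frozenset(z for z in Z if P(z))

def negation(P):
    """The attribute which holds exactly when P does not hold."""
    return lambda z: not P(z)

def P(z):
    """The attribute "it is raining" (first component of the state)."""
    return z[0]

Z_P = extent(P)
Z_not_P = extent(negation(P))

# The two subsets cover Z and do not overlap, so they partition Z.
assert Z_P | Z_not_P == Z and not (Z_P & Z_not_P)
# Negating the attribute selects the complementary subset.
assert Z_not_P == Z - Z_P
# Negating twice returns the original subset.
assert extent(negation(negation(P))) == Z_P
```

The assertions hold for any attribute on any finite state space of this kind, because they only use the fact that each state either has the attribute or does not.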
So negation of an attribute means that the complementary set is indicated. The double complement of a set is itself. So the general assertion “¬¬A ⊢ A” is guaranteed by the semantics of propositions. Such a guarantee cannot be given in the absence of an ontology.
The discussion in Remark 3.10.8 concludes that if the abstract proposition “t(A) = F” is false then “t(A) = T” must be true. In other words, the meta-meta-proposition “t(“t(A) = F”) = F” can be mapped to the meta-proposition “t(A) = T”. This follows specifically from the level of abstraction of these propositions.
3.11.7 Remark: Propositional calculus neither needs nor permits a third truth value.
One may argue that for every proposition regarding a world model (inside a logic machine), there could be three possibilities for the world-model attribute: true, false or unknown. However, in this case there are three possible propositions: P1 = “attribute A is true”, P2 = “attribute A is false” and P3 = “attribute A is unknown”. Each one of these attributes may be true or false. One of the propositions P1, P2 and P3 is true, and the other two are false. These three propositions each have two possible truth values, “true” and “false”.
If the previous paragraph seems confusing, that may be because the words “true” and “false” are used in two different contexts, with two different sets of meanings. In the first context, these words refer to possible states of the discussed logic machine M1, for which there are three possibilities for world-model attributes.
In the second context, the words “true” and “false” are used by logic machine M2 to describe machine M1 . The logical language of machine M2 conforms to the excluded middle rule, whereas the logical language of machine M1 does not. Ultimately it seems that the excluded middle is valid for all logic machines which have only two options for the truth value of any proposition. If there are more than two options, one is no longer dealing with logic, whose propositions are defined to have exactly two truth value options. Any logic machine which works with more than two truth value options can be accurately modelled by a logic machine which has two-option propositions. That is, in fact, what defines a proposition. A logical proposition always has two possible truth values. Anything else is a generalization of the concept of logical propositions. Such generalizations may be modelled, as outlined above, in terms of a two-option logic. (See Section 4.1 for the fundamental definitions of two-truth-value logics.) It may be concluded that all logic machines either implement a strict two-truth-value logic, or else they can be accurately modelled by a strict two-truth-value logic. Since logic is merely a model, its conclusions are only valid to the extent that the modelled system is accurately modelled. Therefore necessarily some logic machines cannot benefit from the conclusions of logic because the modelling assumptions are not satisfied. One may, however, transform a non-conformant system so that two-truth-value logic is applicable.
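The modelling of a three-option attribute by two-option propositions can be sketched as follows. The enumeration `Attribute` and the function `two_valued_description` are hypothetical names. The three propositions correspond to P1, P2 and P3 in this remark.

```python
from enum import Enum

class Attribute(Enum):
    """Hypothetical three-state attribute inside the described machine M1."""
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"

def two_valued_description(state):
    """Describe a three-state attribute by three two-valued propositions."""
    return {
        "P1: attribute A is true": state is Attribute.TRUE,
        "P2: attribute A is false": state is Attribute.FALSE,
        "P3: attribute A is unknown": state is Attribute.UNKNOWN,
    }

for state in Attribute:
    description = two_valued_description(state)
    # Exactly one of P1, P2, P3 is true, and each is either True or False.
    assert sum(description.values()) == 1
    print(state.name, description)
```

The describing machine M2 uses only the two values True and False, even though the described attribute has three possible states.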
3.11.8 Remark: Interpretation of proof by contradiction in the world-model logic ontology.
The comments in Remarks 3.11.6 and 3.11.7 may now be applied to the issue of proof by contradiction. In terms of the proposition-store ontology, this method of proof is difficult to justify. In this ontology, the RAA method commences with the insertion of a trial proposition A into the proposition-store. Then calculations are performed which show that A is inconsistent with some combination of pre-existing propositions. So A must be removed from the store. This does not imply in any way that its negation ¬A must be inserted in the store. The fact that a proposition cannot be inserted does not automatically imply that its negation must be inserted. It is entirely feasible for the logic machine to continue with neither of the propositions A and ¬A in the store.
In terms of the world-model ontology, all propositions are attributes of the model. Each proposition answers a yes/no question about the model. So if a proposition is found to be inconsistent with other propositions about the model, this does imply that the proposition is necessarily false, which by definition of ¬A, implies that ¬A must be true.
Logical inconsistencies can only arise in the world-model ontology because of an error of logical calculus, which means that the machine does not satisfy the minimum requirements for the logic model, or else one made the wrong guess out of the two options: “true” and “false”.
As mentioned in Section 4.1, the most fundamental assumption of standard two-state logic is that there is a set of concrete propositions, all of which are either true or false. Sometimes one is given the truth values of compound propositions (i.e. logical expressions) instead of atomic propositions. From the truth values of compound propositions, the truth values of atomic propositions can be deduced. (This is similar to solving simultaneous equations in algebra.) If a contradiction arises in this deduction process, this implies that the truth values given for the compound propositions were not all correct.
In the world-model ontology, contradictions always imply that one or more incorrect propositions have been introduced into the logical calculus. Therefore one can validly infer that if all assumptions but one are guaranteed in some way to be valid, the non-guaranteed assumption must be false.
The world-model ontological picture of the RAA rule may be summarized by saying that the truth values of all propositions are “out there”, and we only need to find out what they are. The logical calculus is merely a method of combining the clues which we have about the model under study to arrive at conclusions.
In the proposition-store ontology, propositions are meaningless texts which are subject to arbitrary rules. The fact that a proposition is syntactically correct does not imply that it has a meaning. Therefore it seems that semantics is essential for the RAA method to work. In fact, the existence of an underlying model seems to be required for logic to be anything more than a recreational activity.
3.11.9 Remark: Discomfort with proof by contradiction has a long history.
Bell [190], pages 277–278, makes the following comment about reductio ad absurdum in the context of a disagreement between Malus (1775–1812) and Cauchy (1789–1857).
Étienne-Louis Malus was not a professional mathematician but an ex-officer of engineers in Napoleon’s campaigns in Germany and Egypt, who made himself famous by his accidental discovery of the polarization of light by reflexion. So possibly his objections struck young Cauchy as just the sort of captious criticisms to be expected from an obstinate physicist. In proving his most important theorems Cauchy had used the “indirect method” familiar to all beginners in geometry. It was to this method of proof that Malus objected. In proving a proposition by the indirect method, a contradiction is deduced from the assumed falsity of the proposition; whence it follows, in Aristotelian logic, that the proposition is true. Cauchy could not meet the objection by supplying direct proofs, and Malus gave in—still unconvinced that Cauchy had proved anything.
The “indirect method” of proof is the same as reductio ad absurdum.
3.11.10 Remark: Discomfort with proof by contradiction is still present in the modern world.
Lakoff/Núñez [172], page XIII, make the following remark about the RAA method.
[. . . ] why, in formal logic, does every proposition follow from a contradiction? Why should anything at all follow from a contradiction?
On page 121, they say the following on this subject.
And what about “P and not P, therefore Q”? Why should anything at all follow from a contradiction, much less everything? Generations of students have found these to be disturbing questions. Formal definitions, axioms, and proofs do not answer these questions. They just raise further questions, like “Why these axioms and not others?”
It is incorrect to describe proof by contradiction as: “From a contradiction, any proposition follows.” If no errors are made in logical deduction from self-consistent assumptions, no contradictions will ever arise. So one cannot then make anything follow from a contradiction because there aren’t any! But if one does try to introduce an ad-hoc proposition into an argument, and that proposition is shown to contradict other propositions, one may correctly infer that the ad-hoc proposition is in fact false, assuming that it is at least a well-defined proposition which is within the scope of the logical system being considered. (A well-defined abstract proposition must be a valid logical expression whose individual proposition names refer to propositions in the concrete proposition domain.) It is assumed also that the deduction rules of the system have been shown to yield only true conclusions from true assumptions.
3.12. The moods of logical propositions
Verb moods are also discussed in Remarks 3.7.1, 3.10.6, 3.10.11 and 3.10.14.
3.12.1 Remark: Descriptive and prescriptive moods of propositions.
In terms of the descriptive and prescriptive interpretations of propositions discussed in Remark 3.7.13, the notations A ∈ T and A → T in Remark 3.10.2 may be interpreted as descriptive and prescriptive respectively. That is, A ∈ T means that the proposition is currently in the true proposition list of the sender of the proposition whereas A → T means that the recipient of the proposition should insert A in its true proposition list.
To make the meaning of propositions clearer, one could, in principle, use notations such as A ∈ T and A → T in all mathematical logic literature to distinguish the indicative and imperative mood of the verb. Unfortunately, symbolic logic and mathematical notation generally do not indicate the mood, tense and other inflections of verbs. This information is usually indicated informally in the natural-language context.
3.12.2 Remark: Non-asserted propositions may be regarded as pseudo-propositions.
Propositions can be interpreted as being in either the subjunctive or indicative verb mood. If a proposition P is asserted, for example with the notation “⊢ P”, this corresponds to the indicative verb mood. When a proposition is not asserted, but is merely being discussed, this corresponds to the subjunctive verb mood. (This verb mood distinction is also discussed in Remark 3.7.1.)
When the proposition A ∧ B is being asserted, it is equivalent to the truth value equation t(A ∧ B) = T, which means that t(A) = T and t(B) = T. But when the proposition A ∧ B is merely being discussed, not asserted, it satisfies
    t(A ∧ B) = T ⇔ (t(A) = T and t(B) = T).
Thus the truth value of A ∧ B depends on the truth values of A and B. In other words, in the non-asserted case, we don’t know if the proposition A ∧ B is true or false unless we know the truth values of A and B. In fact, we could say that the non-asserted proposition A ∧ B is a pseudo-proposition whose only attribute is a truth value t(A ∧ B), which is defined to equal φ∧(t(A), t(B)), where the function φ∧ : {F, T}² → {F, T} is defined by
φ∧(x, y) = T if (x, y) = (T, T), and φ∧(x, y) = F otherwise.
By contrast, the asserted proposition A ∧ B means that the truth values of both A and B are known to be T.
3.12.3 Remark: In general literature, the consequents of conditionals are mostly imperatives. In old literature, such as Gilgamesh (Remark 3.5.2) and Bēowulf (Remark 3.5.4), the consequent of a conditional is generally a future-tense or imperative clause. The kind of consequent clause which is of interest for mathematics is the present indicative. In mathematics, one generally wants to express conditionals such as: “If A is true, then B is true.” Pure mathematics is a fairly static system where propositions are either eternally true or eternally false. Computer software often has conditionals of the imperative form: “If A is true, then do B.” But pure mathematics is not like that.
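The distinctions drawn in Remarks 3.12.2 and 3.12.3 can be illustrated by a minimal Python sketch. It is only an illustration under stated assumptions: the names phi_and, implies and do_if are hypothetical, and the truth values F and T are modelled by Python's False and True.

```python
# Hypothetical names; F and T are modelled by Python's False and True.

def phi_and(x: bool, y: bool) -> bool:
    """Truth function for conjunction: T only if (x, y) = (T, T)."""
    return x and y

def implies(a: bool, b: bool) -> bool:
    """Indicative conditional "if A is true, then B is true", as a truth function."""
    return (not a) or b

def do_if(a: bool, action) -> None:
    """Imperative conditional "if A is true, then do B": an effect, not a truth value."""
    if a:
        action()

# Non-asserted case: the truth value of A ∧ B is computed from t(A) and t(B).
table = {(x, y): phi_and(x, y) for x in (False, True) for y in (False, True)}

# Imperative case: the consequent is executed, not evaluated.
log = []
do_if(True, lambda: log.append("B done"))
do_if(False, lambda: log.append("never happens"))
```

The function implies returns a value for every pair of truth values, which is the static, indicative reading. The function do_if returns nothing and merely changes the state of log, which is the imperative reading found in software.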
3.13. Other remarks on the semantics of logic
3.13.1 Remark: Verbatim importation of abstract logical expressions into mathematics. As mentioned in Remark 3.2.5, the application of logic to pure mathematics has the special feature that logical operators are mapped verbatim from the abstract logic to the concrete logic. For example, if A = “x ∈ U” and B = “x ∈ V”, then the abstract proposition A ∧ B maps to the concrete proposition “x ∈ U ∧ x ∈ V”. This map has the property that “x ∈ U ∧ x ∈ V” is true if and only if A ∧ B is true. Mathematical logic is defined in such a way as to ensure that abstract and concrete logic operations give the same result. Many mathematicians are not comfortable with such an application of logic to mathematics. They prefer to communicate logical operations informally in the text of their expositions. However, the adoption of formal logical notations is generally clearer and less ambiguous than natural language, and the use of formal logic notation is not incompatible with the provision of informal clarification in natural language.
3.13.2 Remark: A finite set of propositions implies an infinite set of compound propositions. One problem with the logic machine model for propositional calculus is the question of whether all compound propositions should be automatically inserted into the proposition lists of the machine. For example, suppose the true-list of a machine has n simple propositions A1, A2, …, An. We may then insert the n² compound propositions (Ai ∧ Aj) into the true-list for integers i, j = 1, …, n. Then the 2n³ propositions ((Ai ∧ Aj) ∧ Ak) and (Ai ∧ (Aj ∧ Ak)) may be added. Clearly an infinite number of true compound statements follow from a finite set of simple propositions. This contradicts the initial assumption of a finite machine. An infinite set of propositions would build all of the difficulties of the infinity concept into the design of the machine.
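The counts n² and 2n³ in Remark 3.13.2 can be checked by brute force for a small n. The following Python sketch is only an illustration: the helper name conjunctions and the representation of compound propositions as nested tuples are assumptions made for this example.

```python
from itertools import product

def conjunctions(left, right):
    """All formal conjunctions (p ∧ q) with p from left and q from right, as nested tuples."""
    return [("and", p, q) for p, q in product(left, right)]

n = 3
atoms = [f"A{i}" for i in range(1, n + 1)]

# First level: (Ai ∧ Aj) for i, j = 1, ..., n.
level1 = conjunctions(atoms, atoms)

# Second level: ((Ai ∧ Aj) ∧ Ak) together with (Ai ∧ (Aj ∧ Ak)).
level2 = conjunctions(level1, atoms) + conjunctions(atoms, level1)

print(len(level1), len(level2))
```

Each further level multiplies the number of formally distinct propositions again, so no finite list can hold them all.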
This infinite set of potential true statements which follow from a finite set of simple propositions justifies the interpretation of truth functions as a dynamic permission to assert a proposition as opposed to a static specification of which compound statements are true. (When logical quantifiers are brought into the picture, the infinity of true propositions in the static picture becomes even more difficult to deal with.)
[ This section is a holding pen for remarks not yet allocated to sections. ]
3.13.3 Remark: Use of a limited set theory to underlie mathematical logic. When grappling with the cyclic, intertwined nature of logic and set theory, one thin hope for reprieve is to suppose that a finite naive set theory can be used as a more or less firm basis for logic, and then logic can be a basis for a set theory which encompasses infinite concepts. (See Figure 3.13.1.) If the naive set theory underlying logic is required to provide infinite sets, any hope of using a finite framework to explain infinite concepts is lost. In other words, infinity would need to be input to the framework in order to get infinity as an output. There would be little net profit.
Figure 3.13.1: Finite naive mathematics and infinite “rigorous” mathematics. (Diagram labels: finite naive mathematics, comprising finite logic, finite sets, finite functions, finite order and finite numbers; mathematical logic; infinite “rigorous” mathematics, comprising infinite set theory, infinite functions, infinite order and infinite numbers.)
3.13.4 Remark: Reformulation of logic in terms of set theory. The mechanization of mathematical logic seems to take away the problem of the circularity of mathematics and logic. When mathematics (including set theory and numbers) is used to define logic, this is like running a simulation of computer hardware inside the software which runs on that hardware. (See Figure 3.13.2.) This is clearly possible, but it is obvious that the simulation is not the same thing as the real hardware. In particular, a self-simulation runs much more slowly than the physical hardware. (Computer simulation is also mentioned in Remark 3.13.8.)
Figure 3.13.2: Recursive simulation of a computer by itself. (Diagram labels: hardware, software, hardware simulation, software.)
In the same way, when logic is re-formalized in terms of set theory and numbers, it is not the same logic as the naive logic which we used to formalize mathematical logic (and set theory and numbers) originally. (See Section 7.13 for the reformulation of mathematical logic in terms of axiomatic set theory and numbers.) Another way to look at this is that during mathematical education, we construct in our minds a series of virtual logic machines which are used progressively to define new virtual logic machines. As soon as we have reached a level where set theory and numbers are defined, we can return to our logic and re-formulate it. But this is then not the same logic which we used to formulate logic the first time we learned it. A conclusion from this is that it is somewhat deceptive to claim that mathematical logic is defined in terms of sets and numbers as it sometimes is in advanced logic textbooks. It should always be emphasized that this is a re-formulation in the same way that a simulation of hardware in software is not the same thing as the hardware on which the software runs. In particular, when it is claimed, on the basis of set-theoretic analysis, that some class of mathematical logics is complete, incomplete, or self-consistent, the validity of such claims is open to question. One must be at least sceptical of a “proof” of the correctness of a machine if the “proof” is only verified by the machine itself. 3.13.5 Remark: On-demand construction of compound propositions. The danger of infinite lists of propositions mentioned in Remark 3.13.2 can be avoided by generating derived compound propositions “on the fly”. One may regard compound propositions as meta-propositions which may be tested whenever desired. These meta-propositions about the state of the true proposition list are propositions about the state of the core proposition list. 
Thus there are two proposition stores: a (relatively) static core proposition list and a temporary, derived working meta-proposition list.
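The two-store arrangement of Remark 3.13.5 can be sketched in Python as follows. This is a minimal illustration and not the author's construction: the names core_true, holds, queries and working are hypothetical, compound propositions are represented as nested tuples, and only conjunctions are handled. Absence from the core list is taken to mean “not known to be true”, not “false”.

```python
# Core proposition list: the (relatively) static store of true simple propositions.
core_true = {"A", "B"}

def holds(prop) -> bool:
    """Test a conjunction-only meta-proposition against the core list on demand."""
    if isinstance(prop, str):          # a simple proposition
        return prop in core_true
    op, left, right = prop             # a compound proposition ("and", p, q)
    if op != "and":
        raise ValueError("this sketch handles conjunctions only")
    return holds(left) and holds(right)

# Temporary working list of meta-propositions, generated only when requested.
queries = [("and", "A", "B"), ("and", "A", ("and", "B", "C"))]
working = [q for q in queries if holds(q)]
```

Nothing compound is ever stored in core_true, so the core list stays finite however many compound propositions are tested.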
3.13.6 Remark: Conjunctions are like proposition lists. Disjunctions are not. A meta-proposition of the form A ∧ B is equivalent to the insertion of propositions A and B into the core proposition list. (See Figure 3.13.3.) Thus if T, T′ are the sets of true propositions and true meta-propositions respectively, A ∧ B ∈ T′ is equivalent to {A, B} ⊆ T.
Figure 3.13.3: Interpretation of conjunction of propositions. (Diagram labels: A ∧ B in the true meta-propositions T′ “is equivalent to” A and B in the true propositions T.)
However, there is no such equivalent for a disjunction such as A ∨ B. There is no set of simple propositions which can be inserted into the core proposition list to obtain the same effect as the insertion of A ∨ B in the meta-proposition list. (See Figure 3.13.4.) This suggests that conjunctions and disjunctions are qualitatively different. (This issue is also discussed in Remark 3.7.14.)
Figure 3.13.4: Interpretation of disjunction of propositions? (Diagram labels: A ∨ B in the true meta-propositions T′ “is equivalent to” “? ? ?” in the true propositions T.)
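The qualitative difference between conjunction and disjunction in Remark 3.13.6 can be made concrete by listing the truth assignments which each one permits. The Python sketch below is an illustration only; the function name consistent_states and the representation of an assignment as a dictionary are assumptions of this example.

```python
from itertools import product

def consistent_states(constraint, names=("A", "B")):
    """All truth assignments to the simple propositions which satisfy the constraint."""
    states = []
    for values in product((False, True), repeat=len(names)):
        state = dict(zip(names, values))
        if constraint(state):
            states.append(state)
    return states

conj = consistent_states(lambda s: s["A"] and s["B"])   # assert A ∧ B
disj = consistent_states(lambda s: s["A"] or s["B"])    # assert A ∨ B
```

The conjunction leaves exactly one assignment, which is the one obtained by inserting A and B into the core list. The disjunction leaves three assignments, and no single list of simple propositions describes all three at once.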
3.13.7 Remark: A proposition-store must be able to store both simple and compound propositions. The disjunction of two propositions seems to be qualitatively different from a conjunction. (Essentially this same point is made in Remark 3.7.14.) This is reflected in the use of the word “or” in colloquial English, which has numerous forms such as “exclusive or”, “inclusive or” and “and/or”, whereas there is only one form of “and”. The “if” operator, which is roughly equivalent to a disjunction, is also the subject of much confusion. (Many people find it difficult to accept that A ⇒ B is true if A is false. This subject is also mentioned in Remark 3.5.7.) The “and” operator indicates certainty whereas the “or” and “if” operators indicate uncertainty. Consequently, it is not possible to avoid the inclusion of compound propositions in the semi-static core proposition list of a logic machine. To be able to indicate the uncertainty (or conditionality) of the “or” and “if” operators, these operators must be allowed into the core list. Disjunctions and conditionals cannot be generated from a list of atomic propositions because one does not know if the true proposition list includes all true propositions.
[ Remark 3.13.8 has some overlap with Remark 3.13.4. ]
3.13.8 Remark: Analogy of cyclic logical systems to cyclic computer simulations. When set theory and arithmetic are used in a meta-logical proof of the self-consistency or completeness of a first-order logic of some sort, such a proof inspires as much confidence as a computer software analysis or simulation of the processor hardware which is used by the computer for all of its own calculations. The acceptance of the correctness of the hardware leads to the conclusion that the results of the software are correct, which implies the correctness of the hardware. This is clearly unsafe thinking.
The process of building confidence in computer hardware started historically with computers which were simple enough for humans to analyze in great detail. If enough good designers worked on the task, it could be hoped that the hardware would be fairly bug free. A similar process occurred in the evolution of computer software. The first simulations of the operation of computers were performed by the human brain, which is itself a computer. Later, computers were used in the design and verification of computer designs, on the assumption that no bugs had crept in at the beginning, for example in the brains of the designers. As long as all of the computers are consistent with each other, one can be confident that they are either all right or all wrong. The same observation applies to mathematics, which evolved from naive beginnings into a system sophisticated enough to model itself. Mathematics has neither more nor less credibility than computer hardware designs. Each mathematical system is effectively a machine design. Mathematical systems have evolved in much the same way as computer designs.
3.14. Naive mathematics 3.14.1 Remark: To boot-strap the methods of mathematical logic, we need some “naive mathematics”. This corresponds to the “magma layer” in Figure 2.1.1. Any reader who cannot recognize the concepts of naive mathematics will not be able to understand mathematical logic, set theory and the rest of mathematics anyway; so there seems little harm in assuming that the reader will have acquired naive mathematics from previous experience. The purpose of a naive mathematics definition is to attempt to informally standardize a minimal collection of concepts which can be used as a basis for developing mathematical logic. (Serious formal definitions of naive mathematics are not possible because, of course, there is nothing of a formal nature to base it on!)
3.14.2 Remark: There is no absolute requirement to develop naive mathematics (or any other subject) in an axiomatic manner. As mentioned in Remarks 3.2.6 and 3.2.7, propositions do not magically acquire superior credibility or infallibility because of being deduced from other propositions. Axiomatization has many virtues, for example: compactness, minimalism, greater likelihood of self-consistency, and greater clarity of exposition. But some assertions must be permitted without the benefit of deductive justification.
Nevertheless, naive mathematics does require consensus. In fact, there is only partial consensus on naive mathematics and logic. This is unavoidable. But the effort to achieve consensus is necessary. At the very minimum, each author must attempt to achieve consensus with some proportion of the readers. This book tries more than most to make explicit the underlying naive assumptions. Verification of naive mathematics must necessarily be conducted “by hand”. This is for the same reason that the first computer hardware and software designs needed to be verified by human authors and testers. (See Remarks 3.13.4 and 3.13.8 for comments on recursive computer simulations.)
3.14.3 Remark: There are cyclic dependencies among the concepts defined in Definition 3.14.4. Therefore this naive definition serves only to refine the concepts, not define them.
[ Definition 3.14.4 will present a collection of naive mathematics concepts. This is an extremely rough first hack of a definition which will take a lot of work to make it useful. Please ignore it for now. ]
3.14.4 Definition [naive]:
(1) A name is a sequence of letters (with optional subscripts and optional superscripts, which are themselves either names or numbers).
(2) A (naive) set is a name A together with an effective procedure for determining whether any given object x is in the set A (denoted x ∈ A) or not in the set A (denoted x ∉ A).
(3) An element of a set A is any object x such that x is in A.
(4) A sequence is a set of names which are ordered in some way.
(5) A number sequence is a sequence which is used for counting objects in sets. (??)
(6) Two sets A and B are equinumerous if there is a one-to-one association between the elements of A and the elements of B.
(7) A (naive) cardinal number is a symbol or symbol-sequence n which is associated with sets in such a way that n is associated with both set A and set B if and only if A and B are equinumerous. (?)
(8) The cardinality of a set A is any cardinal number n which is associated with A (according to the definition of a cardinal number). (?)
(9) Two cardinal numbers m and n are called equivalent cardinal numbers if there exists a set A such that m and n are both associated with A (according to the definition of a cardinal number). (?)
(To be continued… maybe!)
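Items (2) and (6) of Definition 3.14.4 may be easier to picture with a small Python sketch. It is only a rough illustration restricted to finite collections: the class NaiveSet, the field contains and the function equinumerous are hypothetical names, and the pairing procedure is one possible reading of “one-to-one association”, not a formalization of it.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class NaiveSet:
    """Item (2): a name together with an effective membership procedure."""
    name: str
    contains: Callable[[object], bool]

evens = NaiveSet("E", lambda x: isinstance(x, int) and x % 2 == 0)

def equinumerous(a, b) -> bool:
    """Item (6), finite case: pair off elements one at a time, without counting."""
    rest_a, rest_b = list(a), list(b)
    while rest_a and rest_b:
        rest_a.pop()   # remove one element from each side,
        rest_b.pop()   # which forms one more associated pair
    return not rest_a and not rest_b
```

The function equinumerous never computes a number, which is in the spirit of defining cardinal numbers in items (7) to (9) only after equinumerosity is available.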
[ Definition 3.14.4, and perhaps all of Section 3.14, could be eliminated without much loss. If the naive mathematics is kept, it should define only those concepts which are actually used in the set-up of logic, set theory and numbers. ]
Chapter 4 Logic methods
4.1 Concrete proposition domains
4.2 Logic operations in concrete proposition domains
4.3 Logical operators and expressions
4.4 Logical expression evaluation and logical argumentation
4.5 Propositional calculus formalization
4.6 Deduction rules
4.7 An implication-based propositional calculus
4.8 Some propositional calculus theorems
4.9 Meta-theorems and the “deduction theorem”
4.10 Further theorems for the implication operator
4.11 Other logical operators
4.12 Parametrized families of propositions
4.13 Logical quantifiers
4.14 Predicate calculus
4.15 Equality
4.16 Uniqueness
The purpose of this chapter is to present the methods and procedures of logic without thinking too much about the semantics, which is the subject of Chapter 3. The semantics of logic resides in the application contexts of logic. Each context has its own semantics. [ 2008-11-13: The split of the old logic chapter into Chapters 3 and 4 is not yet complete. The bulk of semantics should be in Chapter 3, whereas Chapter 4 should simply present the calculus of logic, just the methods and procedures. ] [ 2008-12-6: The two logic chapters still contain a lot of repetition, and a large amount of material still needs to be written. ] 4.0.1 Remark: The methods and procedures of logic come in many styles. These styles include truth tables, propositional calculus, predicate calculus, syllogisms and Boolean algebra. In addition to the textual and tabular methods of logic, it is also possible to make logical calculations by means of diagrams such as the function trees which are discussed in Remark 4.3.10. (There is even a quirky graphical method of syllogisms which was published by Lewis Carroll [158].) All of the different ways of doing logic are essentially equivalent. One chooses the methods and procedures which are best suited to particular applications. Within each style of logic, there are various flavours and sub-flavours which are adopted by different bodies of literature and particular authors. Sometimes the most powerful methods and procedures are not the easiest to apply. In general, it is best to study the most powerful framework for a subject in depth, and then study also some specific techniques which make work easier for particular applications. An objective of this chapter will be to present a broad, powerful framework, together with some useful specific methods for applications which may arise in the rest of the book.
[ Should have a section on truth-table methods, including the abbreviated, syntax-directed truth tables. Also have at least one section on Boolean algebra. Maybe even have a section on syllogisms? ] 4.0.2 Remark: There are two styles of calculation in symbolic logic. (1) The algebraic style: Logical expressions are treated as functions which can be evaluated in terms of their arguments, and deduction is performed in the manner of algebra, solving for the unknown truth values. (2) The linguistic style: Logical expressions are manipulated as lines of text subject to rules of deduction, ignoring even the most basic semantics of the expressions. Style (2) is a rigid formalization (or “mechanization”) of style (1). Logical arguments in the algebraic style are generally informal, as in algebra. The linguistic style removes the need for any semantics at all, which is preferable if one distrusts informality. The disadvantage of text-level linguistic argumentation is that human beings can find it dry and dreary.
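The two styles of Remark 4.0.2 can be contrasted in a short Python sketch. It is an illustration under stated assumptions: the names implies, solutions_for_b and modus_ponens are hypothetical, and the linguistic-style rule assumes that every implication is written as a fully parenthesized line of the form “(P ⇒ Q)”.

```python
def implies(a: bool, b: bool) -> bool:
    """Truth function for the conditional A ⇒ B."""
    return (not a) or b

# (1) Algebraic style: solve for the unknown truth value of B,
#     given that A is true and that A ⇒ B is true.
solutions_for_b = [b for b in (False, True) if implies(True, b)]

# (2) Linguistic style: modus ponens applied to lines of text.
#     The rule inspects only the shape of the strings, never a truth value.
#     Assumption: each implication line has the exact form "(P ⇒ Q)".
def modus_ponens(lines):
    derived = set()
    for p in lines:
        prefix = "(" + p + " ⇒ "
        for line in lines:
            if line.startswith(prefix) and line.endswith(")"):
                derived.add(line[len(prefix):-1])
    return derived

new_lines = modus_ponens({"A", "(A ⇒ B)"})
```

Both calculations reach B, but the first evaluates a function on truth values while the second matches text against a rule.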
4.0.3 Remark: The calculus of logical expressions is essentially void of meaning. It is simply a collection of notations and definitions. Logic only becomes interesting when one must solve “logical equations” for unknown individual propositions. Similarly, the mere differentiation of functions in calculus is of limited interest. The really exciting task is to solve (ordinary or partial) differential equations, where the derivatives are known (or satisfy some equation), and one must invert the differentiation process to determine the original function, i.e. the “solution”. (This is also discussed in Remark 4.4.2.)
4.0.4 Remark: The extent of standardization in mathematical logic seems to be somewhat less than in set theory. In the case of set theory, one may simply say that one wishes to accept the Zermelo-Fraenkel axioms or the Neumann-Bernays-Gödel axioms, and the meaning is well standardized. There are many variants of these set theory axiom sets, but they are equivalent variants. In the case of mathematical logic, there are many alternative non-equivalent axiomatic systems (comprised of syntax rules, deduction rules and axioms), and these systems do not generally even have standard names. (Although the theorems are often equivalent, the proofs are typically very different.) Every author seems to have a different system. This necessitates a complete development of the body of theorems for each system, which wastes much effort. Consequently, the author of this text will try to determine a generally suitable axiomatic system for mathematical logic which combines most features of most of the worthwhile axiomatic systems and exemplifies most of the concepts. However, it is not possible to identify one “standard” logic which is generally accepted and applied. Nor is it possible to combine all, or most, axiomatic frameworks for logic into a single comprehensive framework. In particular, it should be noted that many frameworks for mathematical logic attempt to derive meta-logic “theorems” from some sort of lower-level semantic framework which is based directly on set theory. To this author, it is a great mystery why this kind of brazenly circular form of development is presented with such earnestness by so many authors. Rather than attempting to “prove” assertions about logic using a basis of set theory, one may as well come clean and start the development as a totally ad-hoc deductive system, leaving the reader to decide if they wish to accept the formalism.
4.1. Concrete proposition domains
4.1.1 Remark: The proposition names in abstract propositional calculus may refer to anything at all. One may consider the propositional calculus to be a “functional module” in the engineering sense. Propositional calculus may be applied to any set as the object space which is referred to by the proposition name space. The objects could be, for example, bistable transistor circuits which have two states, ON and OFF. Alternatively, the objects could be English-language sentences. Of greatest interest for mathematics is the case that the proposition object space is a set of symbolic expressions defined within a set theory such as Zermelo-Fraenkel. (See Figure 4.1.1.) The usual caveat emptor warning applies to the use of a particular propositional calculus in combination with a particular proposition object space. It is the responsibility of the user to ensure that the concepts of “true” and “false” have some relevance to the object space, and that the logical operators correspond to something meaningful in the application domain. A propositional calculus has no absolute validity. It is simply a functional unit like an algebraic system. (See Chapter 9 for algebraic systems.) There is no guarantee that a particular algebraic system is applicable to a given set of objects. Any set of objects to which an algebraic system is applied must be tested to see if it satisfies the axioms of the algebraic system. Likewise one must test any intended object space for a propositional calculus to ensure that it satisfies the axioms and rules of that propositional calculus.
Figure 4.1.1: Multiple object spaces for abstract propositional calculus modules. (Diagram labels: abstract logics, with name spaces for propositional calculus 1 and propositional calculus 2, each containing A, B and A ∨ B; concrete logics, with object spaces for transistor voltages V1, V2, …, natural language sentences S1, S2, …, and set theory propositions P1, P2, ….)
[ Mendelson [164], page 32 footnote, uses the word “proposition” to mean a meta-theorem. Make a comparative table of such vocabulary variations of various logic authors. Mendelson [164], page 31 (footnote), uses the term “object language” to mean what I call “abstract logic”. (See Figure 4.1.1.) ]
4.1.2 Remark: The word “domain” in Definition 4.1.3 also has a particular meaning in the theory of functions within set theory. (See Definition 6.3.6.) That meaning, as the source space of a relation or function, is not intended here.
Some plausible alternatives for the word “domain” in Definition 4.1.3 would be “range”, “class”, “universe”, “space” or “zone”.
[ Definition 4.1.3 is tentative. It might need some improvement. It looks like a concrete proposition domain needs to be defined in a set which is defined within a background system, defined a priori. But in a sense, this might not be such a bad thing. It could be explicitly stated that this is so, and maybe it should be stated here. ]
4.1.3 Definition: A truth value map is a map τ : P → {F, T} for any (naive) set P.
The concrete proposition domain of a truth value map τ : P → {F, T} is its domain P.
4.1.4 Remark: The “naive sets” in Definition 4.1.3 are not really sets at all. The English language requires some noun to describe those objects which pass a specific test procedure or operational definition. To be more precise, one should think of “naive sets” not as expressions like “{x; F(x)}”, where F is some test which is applied to things x, but rather as expressions like “x; F(x)”. In other words, one should talk about “those x which pass the test F” rather than “the set of x which pass the test F”. The word “class” can be used for such naive sets, although the word “class” is often used for more specific concepts such as NBG set theory classes. In short, one may say that naive sets or classes are an objectification of properties. There is not necessarily any “object” which corresponds to particular test procedures or operational definitions.
4.1.5 Remark: If the truth value map in Definition 4.1.3 is generalized to permit multiple values for the map, the result is that the propositions in the proposition domain may be true, false or both. In this case, the relation τ ⊆ P × {F, T} has the property ∀x ∈ P, ∃y ∈ {F, T}, (x, y) ∈ τ. Such inconsistency-tolerant or “paraconsistent” logics are not difficult to define, and could have useful applications. (See Mortensen [166], page 2.) However, it seems more economical intellectually to partition the proposition domain P for paraconsistent logics into a domain P0 of single-truth-valued propositions and a domain P1 of double-truth-valued propositions.
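The partition of the proposition domain which is proposed in Remark 4.1.5 can be sketched in Python by treating a multi-valued truth value map as a set of (proposition, truth value) pairs. The sample relation tau and the function name partition are assumptions made for this illustration only.

```python
# A multi-valued "truth value map" as a relation: a set of (proposition, value) pairs.
# The sample data are invented for this sketch.
tau = {("p", True), ("q", False), ("r", True), ("r", False)}

def partition(relation):
    """Split the domain into single-valued (P0) and double-valued (P1) propositions."""
    values = {}
    for prop, value in relation:
        values.setdefault(prop, set()).add(value)
    p0 = {prop for prop, vs in values.items() if len(vs) == 1}
    p1 = {prop for prop, vs in values.items() if len(vs) == 2}
    return p0, p1

P0, P1 = partition(tau)
```

On P0 the relation is an ordinary truth value map in the sense of Definition 4.1.3, so the usual two-valued methods apply there.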
Any logical axioms which are proposed for the combined domain must specify whether they apply to propositions of one or both of the sub-domains. Proof by contradiction only works for propositions which lie in the domain of single-valued propositions. This kind of two-tiered system with normal propositions and abnormal pseudo-propositions is reminiscent of the two-tiered NBG set theory, which has normal sets and abnormal proper classes. In both cases, the functionality for the abnormal tier is severely limited. One may talk about the abnormal things, but one cannot do very much with them.
4.1.6 Remark: The concrete proposition domains in Definition 4.1.3 can be static or dynamic. However, the dynamic case is expressible as the static case in the same way that time is bundled together with space parameters in physics. To be more specific, suppose the truth value function τ has a time parameter: τ : P × ℝ → {F, T}. The space/time domain P × ℝ can be replaced by the concrete proposition domain P′, which makes the function τ : P′ → {F, T} look static.
4.1.7 Remark: One may draw an analogy between truth value maps and the solutions of BVPs (boundary value problems) and IVPs (initial value problems) for PDEs (partial differential equations). BVP solutions are analogous to static truth value maps and IVP solutions are analogous to dynamic truth value maps. PDEs, and boundary or initial conditions, place constraints on the solutions to PDE problems. Very often, it is not possible to solve PDE problems sufficiently to say what the solution u(x) is for a particular x in the domain. One must very often be content to know how the properties of solutions u depend on the constraints of a problem.
In a “logic problem”, existence and uniqueness of truth-value-map “solutions” is very important, just as it is in PDE theory. Roughly speaking, existence for logic problems is the same as “self-consistency” and uniqueness is the same as “completeness”. If a single contradiction is found in a logic system, this means that the number of solutions of the logic problem is zero. Self-consistency is not an optional extra. Without self-consistency, there is no solution to work with. This is the basis of proof by contradiction, which is discussed in Section 3.11. 4.1.8 Remark: The following are some examples of concrete proposition domains. (1) P = the propositions “x < y” for real numbers x and y. The map τ : P → {F, T} is defined so that τ (“x < y”) = T if x < y and τ (“x < y”) = F if x ≥ y in the familiar way. (2) P = the propositions “X ∈ Y ” for sets X and Y in a Zermelo-Fraenkel set theory. The map τ : P → {F, T} is defined so that τ (“X ∈ Y ”) = T if X ∈ Y and τ (“X ∈ Y ”) = F if X ∈ / Y in the familiar way. (One may think of the totality of mathematics as the task of calculating this truth value map.)
(3) P = the propositions “X ∈ Y ” and “X ⊆ Y ” for sets X and Y in a Zermelo-Fraenkel set theory. The map τ : P → {F, T} is defined so that τ (“X ∈ Y ”) = T if X ∈ Y , τ (“X ∈ Y ”) = F if X ∈ / Y, τ (“X ⊆ Y ”) = T if X ⊆ Y , τ (“X ⊆ Y ”) = F if X 6⊆ Y in the familiar way.
(4) P = the well-formed formulas of Zermelo-Fraenkel set theory. The map τ : P → {F, T} is defined so that τ (P ) = T if P is true and τ (P ) = F if P is false.
(5) P = the well-formed formulas of any first-order language. The map τ : P → {F, T} is defined so that τ (P ) = T if P is true and τ (P ) = F if P is false. In case (1), the domain P is a well-defined set in the Zermelo-Fraenkel sense. In the other cases above, P is generally not a ZF set. The following example is not so well defined as the others. (6) P = the propositions P (x, t) defined as “the voltage of transistor x is high at time t” for transistors x in a digital electronics circuit for times t. The truth value τ (P (x, t)) is not usually well defined for all pairs (x, t) for various reasons, such as settling time, related to latching and sampling. Nevertheless, propositional calculus can be usefully applied to this system. [ www.topology.org/tex/conc/dg.html ]
[ draft: UTC 2009–3–21 Saturday 11:36 ]
Differential geometry reconstructed.Copyright (C) 2009, Alan U. Kennington. All Rights Reserved.The author hereby grants permission to print this book draft in A4 format.Printing in all other formats is forbidden.You may not charge any fee for copies of this book draft.
In the same way, very often one finds that the constraints placed on truth value maps for a particular “logic problem” ony permit one to determine various properties of the truth value map “solutions” for the problem. For example, if one could determine the truth value for every proposition “X ∈ Y ” in a Zermelo-Fraenkel set theory, every open problem of mathematics would be solved!
4.1. Concrete proposition domains
117
(7) P = the propositions P (n, t, z) defined as “the voltage Vn (t) at time t equals z” for locations n in a digital electronics circuit, at times t, with logical voltages z equal to 0 or 1. The truth value τ (P (n, t, z)) is usually not well defined at all times t. Propositions P (n, t, z) may be written as “Vn (t) = z”. The idea of concrete proposition domains and truth value maps is illustrated in Figure 4.1.2. concrete proposition domain P1
[Figure 4.1.2: Concrete proposition domains and truth value maps. Three concrete proposition domains, P1 (“π < 3”, “3.1 < π”, “π < 3.2”, . . .), P2 (“∃x, ∀y, y ∉ x”, “∅ ∈ ∅”, “∅ ⊆ ∅”, . . .) and P3 (“V1(0) = 1”, “V2(3.7 µs) = 0”, “V7(7.2 ms) = 1”, . . .), are mapped by the truth value maps τ1, τ2 and τ3 respectively to the truth value space {F, T}.]
4.1.9 Remark: The kinds of sets of propositions and names which appear in the definitions of formal symbolic logic are not the same as the kinds of sets which appear in formal set theories such as Zermelo-Fraenkel (Section 5.1) and Bernays-Gödel (Section 5.12). The sets in abstract symbolic logic are externally defined collections of propositions or names which have very limited membership relations. For example, if P denotes the set of propositions of a propositional calculus, the membership relation x ∈ P is defined for elements x of P, but there is no membership relation x ∈ y for pairs of elements x and y in P. If the external application context does permit such relations, this is not an issue for symbolic logic because such relations are not used at all in symbolic logic. It is meaningless (or irrelevant) to say that one proposition is an “element” of another proposition. Likewise, it is meaningless to say that one name or label is an “element” of another name or label.

This remark is related to the theory of types developed by Russell and Whitehead [167]. (See Mendelson [164], page 4.) By organizing sets into layers called “types” and forbidding the set membership relation within each type, the dangers of Russell’s paradox are avoided. The “sets” required for the formalization of symbolic logic are therefore “naive sets” whose pruned-down properties, relations and axioms effectively avoid the dangers of mathematical set theory. The “naive set” idea also partially avoids the circularity of definitions between logic and set theory.

It may seem at first sight that the membership relation is permitted to operate between elements of the spaces used in ZF set theory. However, the propositions of ZF set theory do not possess any membership relation. The ZF membership relation operates between the parameter names for propositions. The sets of ZF set theory are merely parameters for propositions. Thus M(x, y) = “x ∈ y” is a proposition with parameters x and y. The membership relation M is a family of parametrized propositions. Parametrized propositions belong to predicate calculus, not propositional calculus.

In the case of predicate calculus, the set of proposition parameters can encounter Russell’s paradox, but this is not a problem with the naive set of proposition parameters itself. The problem is simply that the axioms of a set theory can be inconsistent if they are not chosen carefully. To be specific, let U denote the set of all sets and let Q denote the problematic set which satisfies ∀x ∈ U, M(x, Q) ⇔ (M(x, U) ∧ ¬M(x, x)). This leads to the contradiction: M(Q, Q) ∧ ¬M(Q, Q). This is a consequence of the axioms which guarantee the existence of such a set Q. Therefore there cannot be a universal set U which supports a set relation M which satisfies the bad set of axioms. The contradiction lies in the axioms, not in the set of proposition parameters (i.e. the set of sets).

The problem in Russell’s paradox is caused by allowing sets to be elements of themselves. Although it seems plausible that the class of all things should contain the class of all things as an element, the possibility of sets being elements of themselves is inconsistent with the restriction axiom. If the axiom ∀x ∈ U, ¬M(x, x) is accepted instead, we have Q = U and ¬M(Q, Q). So the specification ∀x ∈ U, M(x, Q) ⇔ (M(x, U) ∧ ¬M(x, x)) for Q only yields the consequence ∀x ∈ U, ¬M(x, x), which doesn’t contradict anything. In summary, Russell’s paradox is a show-stopper for cyclic inclusion relations among sets, not for the sets themselves.

To put it another way, let Q = {x ∈ P; x ∉ x}. Then Q = P. So Q ∉ Q because P ∉ P. So Russell’s paradox does not apply.

[ The discussion in Remark 4.1.9 needs to be checked very carefully! ]

4.1.10 Definition: A proposition name map is a map µ : N → P from a (naive) set N to a concrete proposition domain P. The proposition name space of a proposition name map µ : N → P is its domain N.

4.1.11 Remark: The relation between proposition name spaces and concrete proposition domains is illustrated in Figure 4.1.3.
[Figure 4.1.3: Proposition name space and concrete proposition domain. The proposition name map µ sends the proposition name space N (names A, B, C, . . .) to the concrete proposition domain P (“∃x, ∀y, y ∉ x”, “∅ ∈ ∅”, “∅ ⊆ ∅”, . . .), and the truth value map τ sends P to the truth value space {F, T}. The abstract truth value map is the composite t = τ ◦ µ.]
The proposition name map µ in Definition 4.1.10 is quite arbitrary. A particular name space may be mapped to any concrete proposition domain in any way at all. (This is illustrated in Figure 4.1.4.) Abstract logic uses the proposition name space, not the concrete proposition domain. The theorems of abstract logic, which are validated at a linguistic level, may then be applied to any concrete proposition domain at all, with any choice of proposition name map. The theorems of abstract logic are only valid to the extent that the assumptions of the model are valid.
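The composition t = τ ◦ µ described in Remark 4.1.11 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the dictionaries P1 and P3, the sample voltage readings, and all of the function names are hypothetical stand-ins for the domains in the figures, not part of the text's formalism.

```python
import math

# Hypothetical concrete proposition domains in the spirit of P1 and P3.
# Each proposition (a string) is paired with a rule which decides it.
SAMPLE_VOLTAGES = {(1, 0.0): 1, (2, 3.7e-6): 0, (7, 7.2e-3): 1}  # assumed readings

P1 = {
    "pi < 3": lambda: math.pi < 3,
    "3.1 < pi": lambda: 3.1 < math.pi,
    "pi < 3.2": lambda: math.pi < 3.2,
}
P3 = {
    "V1(0) = 1": lambda: SAMPLE_VOLTAGES[(1, 0.0)] == 1,
    "V2(3.7e-6) = 0": lambda: SAMPLE_VOLTAGES[(2, 3.7e-6)] == 0,
    "V7(7.2e-3) = 1": lambda: SAMPLE_VOLTAGES[(7, 7.2e-3)] == 1,
}

def truth_value_map(domain):
    """Return tau for a domain: a function from propositions to 'F' or 'T'."""
    return lambda proposition: "T" if domain[proposition]() else "F"

def name_map(names, domain):
    """An arbitrary name map mu: pair names with propositions in listing order."""
    return dict(zip(names, domain))

def abstract_truth_value_map(mu, tau):
    """The composite t = tau o mu, defined on the proposition name space."""
    return {name: tau(proposition) for name, proposition in mu.items()}

NAMES = ["A", "B", "C"]
t1 = abstract_truth_value_map(name_map(NAMES, P1), truth_value_map(P1))
t3 = abstract_truth_value_map(name_map(NAMES, P3), truth_value_map(P3))
print(t1)
print(t3)
```

The same name space {A, B, C} yields different abstract truth value maps for the two domains, which is the sense in which the choice of name map is arbitrary.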
[Figure 4.1.4: Proposition name space with multiple concrete proposition domains. The name maps µ1, µ2 and µ3 send a single proposition name space N (names A, B, C, . . .) to the concrete proposition domains P1 (“π < 3”, “3.1 < π”, “π < 3.2”, . . .), P2 (“∃x, ∀y, y ∉ x”, “∅ ∈ ∅”, “∅ ⊆ ∅”, . . .) and P3 (“V1(0) = 1”, “V2(3.7 µs) = 0”, “V7(7.2 ms) = 1”, . . .), which are mapped by the truth value maps τ1, τ2 and τ3 to the truth value space {F, T}.]

4.2. Logic operations in concrete proposition domains

4.2.1 Remark: The context of a particular concrete proposition domain may not permit the construction or expression of general logical operations. In particular, many natural languages are limited in their ability to express logical operations. This is discussed in Section 3.10, particularly in Remark 3.10.8.

In the case that a proposition context does support logical operations, the propositions which are formed by those operations may be referred to as “truth-functional combinations”. (See Mendelson [164], page 12.) The truth-functional combinations are straightforward to define in formal set theory because the same logical operations are used as in formal logic. But in other concrete proposition domains, particularly in natural languages, truth-functional combinations are quite difficult to define and are vulnerable to ambiguity. It seems preferable to present the methods of logic at an abstract level, where precision and freedom from ambiguity are easy to achieve, and then import the abstract logic into concrete domains. If this importation map can be shown to be precise and unambiguous, all of the methods of abstract logic will be directly applicable.

4.2.2 Remark: Statements which are purely logical combinations of other sentences are called “truth-functional combinations”. The function f in Definition 4.2.3 has domain Dⁿ and range D, where D is the “compound proposition space” under consideration. (See Remark 4.5.4.) The function f̄ in Definition 4.2.3 is necessarily of the form f̄ : {F, T}ⁿ → {F, T}. There are therefore 2^(2ⁿ) different choices of f̄ for n component sentences.

4.2.3 Definition: A truth-functional combination of sentences A₁, . . . Aₙ, for any non-negative integer n, is a sentence f(A₁, . . . Aₙ) such that its truth-value t(f(A₁, . . . Aₙ)) is a function f̄(t(A₁), . . . t(Aₙ)), where the function f̄ may depend on the truth values t(A₁), . . . t(Aₙ), but is independent of the sentences A₁, . . . Aₙ themselves.

4.2.4 Definition: A truth function is a function f : {F, T}ⁿ → {F, T} for some non-negative integer n. A truth function of n arguments or n-ary truth function, for non-negative integer n, is a truth function with domain {F, T}ⁿ.

4.2.5 Remark: If one or more of the propositions A₁, . . . Aₙ in Definition 4.2.3 does not have a well-defined truth value, the function value f̄(t(A₁), . . . t(Aₙ)) will not necessarily be well defined. But under such circumstances, the sentence f(A₁, . . . Aₙ) may be well defined, and the truth value t(f(A₁, . . . Aₙ)) may also be well defined. This anomaly may be remedied by defining a third pseudo-truth-value U which means “unknown truth value”. Then the domain and range of f̄ may be extended so that f̄ : {F, T, U}ⁿ → {F, T, U}. Propositions with unknown truth values may occur particularly in systems where the truth values are determined dynamically in some way.

4.2.6 Remark: In terms of the “unknown truth value” in Remark 4.2.5, one may write the following extended truth table for the implication operator.
    A    B    A ⇒ B
    T    T      T
    T    F      F
    T    U      U
    F    T      T
    F    F      T
    F    U      T
    U    T      T
    U    F      U
    U    U      U

This truth table is discussed in Remark 3.8.1 in the context of a logic machine network.
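The nine rows of the extended table can be generated from a single rule. The sketch below assumes the ordering F < U < T with negation reversing that order, and computes A ⇒ B as the maximum of "not A" and B. This encoding is an assumption made here for illustration (it coincides with the implication of Kleene's strong three-valued logic); the text itself only gives the table. The names RANK, implies3 and PRINTED are hypothetical.

```python
from itertools import product

# Assumed encoding: rank the three values as F < U < T.
RANK = {"F": 0, "U": 1, "T": 2}
VALUE = {rank: name for name, rank in RANK.items()}

def implies3(a, b):
    """Extended implication: max(not a, b) under the ordering F < U < T."""
    not_a = 2 - RANK[a]          # negation swaps F and T and fixes U
    return VALUE[max(not_a, RANK[b])]

# The table as printed in Remark 4.2.6, keyed by (A, B).
PRINTED = {
    ("T", "T"): "T", ("T", "F"): "F", ("T", "U"): "U",
    ("F", "T"): "T", ("F", "F"): "T", ("F", "U"): "T",
    ("U", "T"): "T", ("U", "F"): "U", ("U", "U"): "U",
}

for a, b in product("TFU", repeat=2):
    print(a, b, implies3(a, b))

matches_printed_table = all(
    implies3(a, b) == PRINTED[(a, b)] for a, b in product("TFU", repeat=2)
)
```

Restricted to the values F and T, the rule reduces to the ordinary two-valued implication.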
4.2.7 Remark: It seems fairly reasonable to add a third truth value “unknown” to the standard two, “true” and “false”. However, the case for this is not overwhelming. Consider, for example, the analogous situation where the real-valued solution of an algebraic or PDE problem is unknown. In this case, we do not add an extra element U to the real numbers to represent unknown values. Thus if a value x in algebra or f(x, t) in PDE theory is unknown, we do not write x = unknown (or x = U) or f(x, t) = unknown (or f(x, t) = U). We simply say that the value is unknown. (In some elementary teaching, however, such monstrosities are sometimes seen!) We generally assume that the value of a solution to an equation is defined, even when we do not know what it is. The equations x = a and f(x, t) = b are satisfied for some a and b, but we just don’t know what they are.

4.2.8 Remark: If we did want to add an unknown element U to the real numbers to represent the state of knowledge about a system A by a mathematical thinking machine M1 (such as a human brain), we would then have only the ordinary real numbers in system A, but this would be modelled by machine M1 using an extended set of real numbers. This is awkward, but not too absurd. The problem comes when a second machine M2 models the state of machine M1. Then machine M2 may not fully know the state of machine M1. Machine M2 may not know if machine M1 is representing a value x as a real number (because it knows the value), or as the pseudo-number U. If machine M2 does not have any information, it must use a second level of unknown real number, for example U2. (This is perhaps one of the infamous “unknown unknowns”!) Machine M2 does not know if machine M1 knows the value of x or not. But the problem is more difficult than that. Machine M2 may know that machine M1 knows the value of x, but may not know what value it knows. This is a different kind of unknown to the original U (which was a “known unknown”). Now machine M2 needs to define an “unknown known”, possibly denoted U1.

The same issues arise in the case of extended truth values. If we add a third truth value U to permit machine M1 to model unknown truth values of propositions about system A, it becomes necessary to invent another two pseudo-truth-values when machine M2 models machine M1. When machine M3 models machine M2, a further four pseudo-truth-values are required. If two logic machines have the great misfortune to be modelling each other’s state of knowledge, an infinite number of pseudo-truth-values will be required to represent all of the possibilities. (Example: “I know that you don’t know that I don’t know that you know that I know whether the Sun rises in the East.”)

The use of pseudo-truth-values to represent states of ignorance may be useful under some circumstances. But it is inconvenient to introduce such monstrosities into the formal theory of logic. (See also Remark 3.8.2 for discussion of this point.)

4.3. Logical operators and expressions

This section is concerned with logical propositions without parameters and without quantifiers. Existential and universal quantifiers, and parametrized families of propositions, are introduced in Sections 4.12 and 4.13.

4.3.1 Remark: The logical operators in Notation 4.3.3 are applied to abstract propositions in a “discussion context”, not necessarily in the “discussed context”. However, in formal set theory, these operators are used in both contexts. One may think of this as the importation of the operators (and compound logical expressions) from the abstract discussion context into the concrete discussed context.
4.3.2 Remark: Although the first three symbols in Notation 4.3.3 are common in logic, they are not so common in mathematics. In differential geometry, the ∧ symbol clashes with the wedge product in exterior algebra. So these three logic symbols are rarely used outside the preliminary chapters.

4.3.3 Notation:
(i) ¬ means “not” (logical negation).
(ii) ∧ means “and” (logical conjunction).
(iii) ∨ means “or” (logical disjunction).
(iv) ⇒ means “implies” (logical implication).
(v) ⇔ means “if and only if” (logical equivalence).
4.3.4 Definition: A propositional connective is any of the symbols in Notation 4.3.3.

4.3.5 Remark: The ∧ and ∨ symbols have some easy mnemonics (in English). The ∧ symbol suggests the letter “A” in the word “and”, but without the horizontal strut. The ∨ symbol is the opposite of this. The ∧ and ∨ symbols for propositions match the corresponding ∩ and ∪ symbols for sets. The ∪ symbol resembles the letter “U” for “union”. (Arguably the ∩ symbol suggests the “A” in the word “all”.)

The ¬ (not) symbol is perhaps inspired by the − (minus) symbol for arithmetic negation. A popular alternative for “not” is the ∼ symbol. However, this is easily confused with other uses of the same symbol. So ¬ is preferable when logic is mixed with mathematics in a single text.
According to Lemmon [163], page 19, the ∨ symbol is actually a letter “v”, which is a mnemonic for the Latin word “vel”, which means the inclusive OR, as opposed to the Latin word “aut”, which means the exclusive OR.
[ Possibly replace the plain TEX ¬ symbol with
or something similar. ]
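4.3.6 Remark: There is a fair amount of variation in notations for logical operators and statements. The following table summarizes some sample notations.

    author             not          and        or      implies     if and only if            prop
    EDM [33]           [?], ∼, ¯    ∧, &, ·    ∨, +    →, ⊃, ⇒     ↔, ⇄, ≡, ∼, ⊃⊂, ⇔         A
    EDM2 [34]          [?], ∼, ¯    ∧, &, ·    ∨, +    →, ⊃, ⇒     ↔, ≡, ∼, ⊃⊂, ⇔            A
    KEM [121]          ¬            ∧          ∨       →           ↔                         p
    Lemmon [163]       −            &          v       →           ↔                         P
    Mendelson [164]    ∼            ∧          ∨       ⊃           ≡                         A
    Reinhardt [134]    ¬            ∧          ∨       ⇒           ⇔                         A
    Shoenfield [168]   [?]          &          ∨       →           ↔                         p
    Szekeres [44]      [?]          and        or      ⇒           ⇔                         P
    Kennington         ¬            ∧          ∨       ⇒           ⇔                         A

    Kennington:        nand |       nor ↓      xor △   wff α

The table also has nand, nor and wff columns for the other authors, with the entries “|”, “|”, “|” (nand), “↓”, “↓”, “▽” (nor) and “A”, “A”, “H”, “A” (wff). The rows to which these entries belong cannot be determined from this copy, and [?] marks a symbol which is missing from it.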
Logic operator notations

The notations for proposition names and well-formed formula (wff) names are indicated in the table as “prop” and “wff” respectively. The NAND operator is also called the “Sheffer stroke” or “alternative denial” operator. (See Mendelson [164], pages 26, 42.) The NOR operator is also called the “Peirce arrow”, “Quine dagger” or “joint denial” operator. (See Mendelson [164], page 26.)

4.3.7 Remark: There is no notation for the exclusive OR operator in Remark 4.3.6 because there seems to be no standard for it in logic and mathematics. The exclusive OR of A and B is (A ∨ B) ∧ ¬(A ∧ B), which is equivalent to A ⇔ ¬B. It also has the useful property that the truth value t(A ⇔ ¬B) equals T if and only if the sum of the truth values t(A) + t(B) is an odd integer (counting F as 0 and T as 1). (This rule applies also to the exclusive OR of any finite set of propositions, which is well defined because the operator is associative and commutative.) So a notation resembling the addition symbol “+” could be suitable, such as “⊕”. This is in fact used in some contexts. (For example, see Lin/Costello [209], pages 16–17.) But this symbol is also used frequently in algebra. The sometimes-used exclusive-OR symbol “≢” also clashes with a frequently used symbol. (For example, see CRC [99], pages 16–21.)

Modified OR symbols such as “∨̇”, “∨̄” or “⊻” could be used because the inclusive OR and exclusive OR are often confused with each other. A fairly rational notation choice would be a superposed OR and AND symbol (“∨” printed over “∧”). This has the same sort of 4-way symmetry as the “⇔” biconditional symbol. But it would be tedious to write this by hand frequently.

It would make some sense to use a superposed X (×) and circle (○) to represent the exclusive OR operator. Such a symbol (if it could be easily produced in TeX) would look similar to the superposed OR and AND symbol. It would also have the same symmetries as the biconditional symbol. It would also suggest the first two letters of the abbreviation XOR.

The triangle notation “△” would make good sense for the exclusive OR because it resembles the Delta symbol ∆ and the corresponding set operation is the symmetric “set difference”. (See Definition 5.13.14 and Notation 5.13.15.) To distinguish this symbol from the set operation, the corresponding small triangle symbol △ could be used instead.

4.3.8 Definition: The alternative denial operator is the map f : {F, T}² → {F, T} satisfying f(v1, v2) = F for (v1, v2) = (T, T), otherwise f(v1, v2) = T.

The joint denial operator is the map f : {F, T}² → {F, T} satisfying f(v1, v2) = T for (v1, v2) = (F, F), otherwise f(v1, v2) = F.

The exclusive-or operator is the map f : {F, T}² → {F, T} satisfying f(v1, v2) = T for (v1, v2) = (T, F) or (F, T), otherwise f(v1, v2) = F.

The alternative denial operator is also known as the NAND operator or the Sheffer stroke. The joint denial operator is also known as the NOR operator or the Peirce arrow or Quine dagger.
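The three operators of Definition 4.3.8, and the parity property of the exclusive OR noted in Remark 4.3.7, can be checked by brute force. The following is a minimal sketch which represents F and T by the Python values False and True; the function names are chosen here for illustration only.

```python
from functools import reduce
from itertools import product

def nand(v1, v2):
    """Alternative denial: F only for (T, T)."""
    return not (v1 and v2)

def nor(v1, v2):
    """Joint denial: T only for (F, F)."""
    return not (v1 or v2)

def xor(v1, v2):
    """Exclusive OR: T exactly for (T, F) and (F, T)."""
    return v1 != v2

# Remark 4.3.7: xor agrees with (A or B) and not (A and B), and with A <=> not B.
equivalences_ok = all(
    xor(a, b) == ((a or b) and not (a and b)) == (a == (not b))
    for a, b in product([False, True], repeat=2)
)

def xor_all(values):
    """Exclusive OR of a finite sequence, folded from the left."""
    return reduce(xor, values, False)

# Parity rule: the xor of n values is T iff an odd number of them are T.
parity_ok = all(
    xor_all(vs) == (sum(vs) % 2 == 1)
    for n in range(1, 5)
    for vs in product([False, True], repeat=n)
)
```

The parity check is run here only for up to four operands; the general statement follows from associativity and commutativity, as the remark says.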
4.3.9 Notation: | denotes the alternative denial operator. ↓ denotes the joint denial operator. △ denotes the exclusive OR operator.

4.3.10 Remark: Propositional calculus automates logic at the symbolic level, ignoring the semantics of logical expressions. Figure 4.3.1 illustrates the difference between the syntactic and semantic levels.

[Figure 4.3.1: Example logical expression tree with syntax and semantics. Left (syntax): the parse tree of (A ∧ ¬B) ∨ C, whose root splits into A ∧ ¬B and C, with A ∧ ¬B splitting into A and ¬B, and ¬B into B. Right (semantics): the function tree with φ∨ at the root acting on φ∧ and t(C), where φ∧ acts on t(A) and φ¬, and φ¬ acts on t(B), so that t((A ∧ ¬B) ∨ C) = φ∨(φ∧(t(A), φ¬(t(B))), t(C)).]

The tree on the left shows how a simple abstract logical expression (A ∧ ¬B) ∨ C may be parsed. At the syntactic level, such parsing is required in order to ensure that the expression satisfies the rules for a well-formed formula. Propositional calculus performs operations on expressions at the syntactic level, using only a simple set of blind rules to determine which deductions may be made from sets of compound propositions.

At the semantic level, the example expression (A ∧ ¬B) ∨ C has a truth value t((A ∧ ¬B) ∨ C) which equals φ∨(φ∧(t(A), φ¬(t(B))), t(C)). The expression (A ∧ ¬B) ∨ C is merely a written representation of the corresponding spoken language. The meaning of the expression is a set of operations which must be carried out as indicated by the function tree in Figure 4.3.1. The function-tree diagram is an attempt to convey the required mental operations in the same way that the expression (A ∧ ¬B) ∨ C is an attempt to convey the same operations in linear written text.

The rules and axioms of propositional calculus cannot be understood in terms of text alone. The origin of the deduction rules and axioms lies in the logical procedures which are summarized by abstract symbolic logic expressions. See Remark 4.13.11 for an analogous attempt to parse a quantified logical expression.

[ Mention syntax-directed translation in connection with syntax and semantics of parse-trees? Also tree decoration? Abbreviated truth tables are also an example of syntax-directed translation? ]

4.3.11 Definition: A conjunct of an expression of the form α ∧ β, for wffs α and β, is either α or β.
The left conjunct of an expression of the form α ∧ β, for wffs α and β, is α.
The right conjunct of an expression of the form α ∧ β, for wffs α and β, is β.
A disjunct of an expression of the form α ∨ β, for wffs α and β, is either α or β. The left disjunct of an expression of the form α ∨ β, for wffs α and β, is α.
The right disjunct of an expression of the form α ∨ β, for wffs α and β, is β.
The principal connective of an expression of the form α◦β, where ◦ is one of the connectives in Notation 4.3.3, and α and β are wffs, is the connective ◦.
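The relation between the parse tree and the function tree in Remark 4.3.10, and the terms of Definition 4.3.11, can be illustrated with a small tree data structure. This is a sketch only: the class names Name, Not and Binary, the table PHI and the helper functions are hypothetical, and treating ¬ as the principal connective of a negation is an extension beyond the binary case covered by the definition.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Name:
    symbol: str              # a proposition name such as "A"

@dataclass(frozen=True)
class Not:
    operand: "Wff"

@dataclass(frozen=True)
class Binary:
    connective: str          # one of "∧", "∨", "⇒", "⇔"
    left: "Wff"
    right: "Wff"

Wff = Union[Name, Not, Binary]

# Truth functions for the binary connectives (the phi maps of the function tree).
PHI = {
    "∧": lambda a, b: a and b,
    "∨": lambda a, b: a or b,
    "⇒": lambda a, b: (not a) or b,
    "⇔": lambda a, b: a == b,
}

def truth_value(wff, t):
    """Evaluate the function tree; t maps proposition names to booleans."""
    if isinstance(wff, Name):
        return t[wff.symbol]
    if isinstance(wff, Not):
        return not truth_value(wff.operand, t)
    return PHI[wff.connective](truth_value(wff.left, t), truth_value(wff.right, t))

def principal_connective(wff):
    """Connective at the root of the parse tree (None for a bare name)."""
    if isinstance(wff, Binary):
        return wff.connective
    if isinstance(wff, Not):
        return "¬"           # extension: the definition only covers binary forms
    return None

# The example expression (A ∧ ¬B) ∨ C as an explicit parse tree.
expr = Binary("∨", Binary("∧", Name("A"), Not(Name("B"))), Name("C"))
left_disjunct, right_disjunct = expr.left, expr.right
```

Because the tree is built explicitly, the pair of disjuncts at the root is unique by construction; ambiguity can only enter when a tree has to be recovered from linear text.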
4.3.12 Remark: A problem with Definition 4.3.11 is the possibility that an expression of the form α ∧ β, for example, may contain an additional operator ∧ in the sub-expression α, or β, or both. Then the choice of the pair of conjuncts would be non-unique. However, the use of strict parenthesizing rules removes the non-uniqueness. Under such rules, each side of a binary operator such as ∧ or ∨ is required to be either a primitive symbol or a parenthesized sub-expression. Then the ambiguity is removed by the parentheses.

If the parsing of a logical expression is ambiguous, the semantics of the expression is ambiguous. Logical expressions are merely notations for function trees. So if the parsing is non-unique, the meaning is non-unique. The fact that the result may be provably the same for every choice of interpretation does not change the fact that the function tree interpretation is ambiguous.

It would probably be more correct to refer to the “conjuncts (or disjuncts) of an operator” rather than the “conjuncts (or disjuncts) of an expression”. Thus if an expression has the form α ∧ (β ∧ γ), for example, the conjuncts of the first ∧-operator would be α and (β ∧ γ), whereas the conjuncts of the second ∧-operator would be β and γ. More generally, one may refer to the “operands” of each operator in a logic expression. Usually the number of operands for any operator is one or two, but it is quite straightforward to generalize the concept to larger numbers of operands. (On the other hand, larger numbers of operands are best expressed in functional notation rather than by operators.)

The associativity of the ∧ and ∨ operators implies that the parenthesization rules may be relaxed. One way to remove the function-tree ambiguity in this case is to decide on a fixed “associativity rule” for all such expressions. For example, a left-associativity rule would interpret α ∧ β ∧ γ as (α ∧ β) ∧ γ whereas right-associativity would interpret it as α ∧ (β ∧ γ). This approach effectively parenthesizes all expressions so that the order of application of operations is unique. (For a typical computer software context for left and right associativity, see for example Kernighan/Ritchie [205], page 200.)

Another way to remove ambiguity in unparenthesized expressions such as α ∧ β ∧ γ is to interpret such expressions in terms of multi-operand functions. For example, α ∧ β ∧ γ could be interpreted as φ∧(α, β, γ), which in turn is defined in terms of the two-operand operators.

Similar comments apply to the definition of the “principal connective”.

4.3.13 Remark: In terms of the semantic-level function trees mentioned in Remark 4.3.12, the principal connective defined in Definition 4.3.11 for logical expressions corresponds to the function at the root of the tree for the parsed expression. The conjuncts (or disjuncts) of a conjunction (or disjunction) are the left and right branches of the root of the parse tree.
4.3.14 Definition: A conditional (expression) is an expression of the form α ⇒ β for wffs α and β. The antecedent (subexpression) of a conditional expression α ⇒ β for wffs α and β is the wff α.
The consequent (subexpression) of a conditional expression α ⇒ β for wffs α and β is the wff β.

A biconditional (expression) is an expression of the form α ⇔ β for wffs α and β.

4.3.15 Remark: Logical operators can be expressed in terms of arithmetic relations among truth values if the function τ from the set of proposition names to the set of integers {0, 1} is defined by τ(P) = 0 if P is false and τ(P) = 1 if P is true. (The symbol “△” means the exclusive OR.)

    logical expression    arithmetic expression
    A                     τ(A) = 1
    ¬A                    τ(A) = 0
    A ∧ B                 τ(A) + τ(B) = 2
    A ∨ B                 τ(A) + τ(B) ≥ 1
    A ⇒ B                 τ(A) ≤ τ(B)
    A ⇔ B                 τ(A) = τ(B)
    A △ B                 τ(A) + τ(B) = 1
    A ∧ B ∧ C             τ(A) + τ(B) + τ(C) = 3
    A ∨ B ∨ C             τ(A) + τ(B) + τ(C) ≥ 1
    A △ B △ C             τ(A) + τ(B) + τ(C) = 1 or 3
As often mentioned in this book, the logic layer is infested with dependencies on set and number concepts. All of logic relies upon the natural integers anyway. So it is not a substantial defeat to be using integers in definitions of logical operators.

Conjunctions and disjunctions are easily expressed in terms of minimum and maximum operators as in the following table.

    logical expression    min/max expression
    A                     min(τ(A)) = 1
    ¬A                    max(τ(A)) = 0
    A ∧ B                 min(τ(A), τ(B)) = 1
    A ∨ B                 max(τ(A), τ(B)) = 1
    A ∧ B ∧ C             min(τ(A), τ(B), τ(C)) = 1
    A ∨ B ∨ C             max(τ(A), τ(B), τ(C)) = 1
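The rows of the two tables in Remark 4.3.15 can be verified exhaustively over all truth assignments. The sketch below is one way to do this, assuming F and T are represented by False and True; the names tau and tables_hold are illustrative only.

```python
from itertools import product

def tau(v):
    """Encode a truth value as the integer 0 (false) or 1 (true)."""
    return 1 if v else 0

def tables_hold():
    """Check each table row for all assignments to A, B and C."""
    for a, b, c in product([False, True], repeat=3):
        ta, tb, tc = tau(a), tau(b), tau(c)
        rows = [
            # arithmetic table
            (a and b, ta + tb == 2),
            (a or b, ta + tb >= 1),
            ((not a) or b, ta <= tb),
            (a == b, ta == tb),
            (a != b, ta + tb == 1),
            (a and b and c, ta + tb + tc == 3),
            (a or b or c, ta + tb + tc >= 1),
            ((a != b) != c, ta + tb + tc in (1, 3)),
            # min/max table
            (a and b, min(ta, tb) == 1),
            (a or b, max(ta, tb) == 1),
            (a and b and c, min(ta, tb, tc) == 1),
            (a or b or c, max(ta, tb, tc) == 1),
        ]
        if any(logical != numeric for logical, numeric in rows):
            return False
    return True

print(tables_hold())
```

For finite families, Python's built-in `all` and `any` play the role of the min and max forms of conjunction and disjunction.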
A substantial advantage of these min/max operators is that they are also valid for infinite sets of propositions. This is important in the predicate calculus, where propositions are organized into parametrized families, and these families are typically countably or uncountably infinite.

4.3.16 Remark: There is a sense in which proposition names and wff names are not an essential component of an axiomatic system for propositional calculus. All such names are arbitrary “dummy variables” whose sole purpose is to indicate which variables are the same. This is essentially the same as the role of pronouns in natural languages. The information contained in dummy variables can be communicated by other means, for example by links in a diagram. However, the scope of these dummy variables could be quite large. If the scope of a proposition name is spread over many pages, it would be inconvenient to use other forms of less arbitrary linkage between the variables in expressions. It is more convenient to simply remember that the particular choice of proposition names is arbitrary. Such arbitrariness of symbols is present in almost all of mathematics anyway.

Parentheses are even more clearly inessential than proposition names because “Polish notation” is a well-defined full substitute for parentheses. The “reverse Polish” notation is not easy for humans to read, but the existence of this option proves that parentheses are not an essential part of the language. (Shoenfield [168], pages 14–16, presents a first-order language notation which is expressed in terms of prefix operators.)
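The claim that parentheses are inessential can be made concrete by evaluating the earlier example (A ∧ ¬B) ∨ C from a parenthesis-free token list. The following sketch shows both a prefix (Polish) evaluator and a postfix (reverse Polish) evaluator; the token lists and the names ARITY, PHI, eval_prefix and eval_postfix are assumptions made for this illustration.

```python
from itertools import product

ARITY = {"¬": 1, "∧": 2, "∨": 2, "⇒": 2, "⇔": 2}
PHI = {
    "¬": lambda a: not a,
    "∧": lambda a, b: a and b,
    "∨": lambda a, b: a or b,
    "⇒": lambda a, b: (not a) or b,
    "⇔": lambda a, b: a == b,
}

def eval_prefix(tokens, t):
    """Evaluate a prefix-notation expression; t maps names to booleans."""
    def read(pos):
        tok = tokens[pos]
        if tok in ARITY:
            args = []
            pos += 1
            for _ in range(ARITY[tok]):
                value, pos = read(pos)   # each operand is a complete sub-expression
                args.append(value)
            return PHI[tok](*args), pos
        return t[tok], pos + 1
    value, end = read(0)
    if end != len(tokens):
        raise ValueError("trailing tokens after a complete expression")
    return value

def eval_postfix(tokens, t):
    """Evaluate a postfix-notation expression with an explicit stack."""
    stack = []
    for tok in tokens:
        if tok in ARITY:
            n = ARITY[tok]
            args = stack[-n:]
            del stack[-n:]
            stack.append(PHI[tok](*args))
        else:
            stack.append(t[tok])
    (value,) = stack                     # exactly one value must remain
    return value

# (A ∧ ¬B) ∨ C without parentheses:
PREFIX = ["∨", "∧", "A", "¬", "B", "C"]
POSTFIX = ["A", "B", "¬", "∧", "C", "∨"]

agree = all(
    eval_prefix(PREFIX, dict(zip("ABC", vs)))
    == eval_postfix(POSTFIX, dict(zip("ABC", vs)))
    == ((vs[0] and not vs[1]) or vs[2])
    for vs in product([False, True], repeat=3)
)
```

Each operator's fixed number of operands is what makes the token sequence uniquely parsable, so no grouping symbols are needed.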
4.3.17 Remark: One might reasonably ask what a zero-operand logical operator looks like. A unary operator acting on an abstract proposition A yields an abstract proposition whose truth value φ(t(A)) is a function φ of a single truth-value variable t(A). There are 2^(2^1) = 4 possible such functions, as mentioned in Remark 4.2.2. In the case of zero-operand operators, there are 2^(2^0) = 2 choices for the operator. These are introduced in Definition 4.3.18 and Notation 4.3.19.

4.3.18 Definition: The true (zero-operand) operator is the zero-operand operator whose expression value is true. The false (zero-operand) operator is the zero-operand operator whose expression value is false.

4.3.19 Notation: ⊤ denotes the true zero-operand logical operator. ⊥ denotes the false zero-operand logical operator.

4.3.20 Remark: The true and false zero-operand operators may be used in a logical expression in the same way as other operators. For example, the expressions ⊤, ¬⊥, ¬(A ∧ ⊥) and A ∨ ⊤ are all valid logical expressions. In fact, they all have the value T for any choice of operands. In other words, for any abstract proposition A, t(⊤) = T, t(¬⊥) = T, t(¬(A ∧ ⊥)) = T and t(A ∨ ⊤) = T.
It follows, therefore, that the expressions ⊤ and ⊥ are valid abstract propositions, whose truth values are always-true and always-false respectively. There are also operators with any positive number of operands which are always-true or always-false. To avoid excessive diversity of notations, the always-true operators with n operands could be denoted as ⊤(), ⊤(·), ⊤(·, ·) and ⊤(·, ·, ·) for n = 0, 1, 2, 3, and so forth. Thus t(⊤(A, B)) = T for all abstract propositions A and B. The always-false operators with one or more operands may be denoted similarly. However, there is very seldom a practical requirement for such operators.
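The operator counts in Remark 4.3.17 and the example expressions in Remark 4.3.20 can be confirmed by enumeration. This is a small Python sketch, assuming that truth values are modelled by Python booleans; the names truth_functions, TOP and BOTTOM are hypothetical stand-ins introduced for the illustration.

```python
from itertools import product

def truth_functions(n: int) -> list:
    """Enumerate every function {F,T}^n -> {F,T} as a dict keyed by argument tuples."""
    args = list(product([False, True], repeat=n))
    return [dict(zip(args, values))
            for values in product([False, True], repeat=len(args))]

# Stand-ins for the zero-operand operators ⊤ and ⊥ (an assumption of this sketch).
TOP, BOTTOM = True, False

# Number of n-operand operators for n = 0, 1, 2, i.e. 2^(2^n).
counts = [len(truth_functions(n)) for n in range(3)]

# The four example expressions ⊤, ¬⊥, ¬(A ∧ ⊥), A ∨ ⊤ for both values of A.
examples_all_true = all(
    TOP and (not BOTTOM) and (not (a and BOTTOM)) and (a or TOP)
    for a in (False, True)
)
```

For n = 0 there is exactly one (empty) argument tuple, so the two functions found correspond to ⊤ and ⊥.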
Note also that ⊤ and ⊥ are not names for concrete propositions. They are not labels for propositions in a concrete proposition domain. However, both ⊤ and ⊥ are abstract compound propositions which are valid logical expressions. Therefore it is incorrect to refer to them as the “always-true proposition” and the “always-false proposition” respectively. (On the other hand, they do happen to be always-true and always-false logical predicates respectively, although strictly speaking, logical predicates are not defined as such in propositional calculus.)

4.3.21 Remark: The author chose the notations ⊤ and ⊥ to represent “always true” and “always false” respectively on 4 July 2008, but only on a temporary basis due to lack of a better symbol. The symbols seemed arbitrary and unlikely to be popular, although the ⊤ symbol does look like a “T”, and the ⊥ symbol is the opposite of this. Unfortunately, the ⊥ symbol is also used in Euclidean geometry to mean “perpendicular”, which is a potential clash. On the other hand, an advantage of these symbols is that the ⊤ symbol suggests the graph of a function whose value is always equal to 1, whereas ⊥ suggests the corresponding function equal to 0 everywhere. This matches the usual arithmetic values of these symbols. However, on 24 January 2009, the author found the exact same notation in use in an article on formal proof (Harrison [160]).

4.3.22 Remark: Definition 4.3.23 introduces “tautological” or “tautologous” compound abstract propositions, also known simply as tautologies. Compound abstract propositions contain zero or more atomic proposition names. The truth value t(α) of a compound proposition α is some function f : {F, T}^n → {F, T} of the truth values of the n component propositions, where n is a non-negative integer. Thus t(α) = f(t(A_1), t(A_2), ..., t(A_n)), where A_1, A_2, ..., A_n are the component propositions. A compound proposition is a tautology if and only if f(t_1, t_2, ..., t_n) = T for all n-tuples (t_1, t_2, ..., t_n) ∈ {F, T}^n.

[ Remark 4.3.22 seems excessively tedious. Fix this. ]
The always-true and always-false logical operators should be distinguished from the always-true and always-false logical predicates. The logical operators take propositions as arguments whereas logical predicates take logical variables as arguments. (See Remark 4.12.9 and Notation 4.12.10 for always-true and always-false logical predicates.)
4.3.23 Definition: A tautology is a compound abstract proposition whose truth value is “true” for all possible combinations of truth values for its atomic propositions. A contradiction is a compound abstract proposition whose truth value is “false” for all possible combinations of truth values for its atomic propositions.
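Definition 4.3.23 translates directly into an exhaustive check over all combinations of truth values. The Python sketch below assumes that a compound proposition is represented by a function of n booleans; the function name classify and the label “contingent” for the remaining case are not taken from the text.

```python
from itertools import product
from typing import Callable

def classify(f: Callable[..., bool], n: int) -> str:
    """Classify a compound proposition given as a truth function of n atomic propositions."""
    values = {bool(f(*args)) for args in product([False, True], repeat=n)}
    if values == {True}:
        return "tautology"
    if values == {False}:
        return "contradiction"
    return "contingent"  # hypothetical label for "neither"

excluded_middle = classify(lambda a: a or not a, 1)
absurdity = classify(lambda a: a and not a, 1)
plain_and = classify(lambda a, b: a and b, 2)
zero_operand_true = classify(lambda: True, 0)
```

The case n = 0 is handled without special treatment because there is exactly one empty combination of truth values.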
4.4. Logical expression evaluation and logical argumentation

4.4.1 Remark: The word “calculus” in “propositional calculus” and “predicate calculus” is perhaps misleading. The word “algebra” might be a more accurate analogy. The proposition names (such as A, B, C) in compound propositions (such as A ∧ (B ∨ C)) are analogous to variables (such as x, y, z) in algebraic expressions (such as x(y + z)). The procedures of propositional calculus to determine truth values of compound propositions are analogous to the procedures of algebra to determine values of algebraic expressions. So probably it would be more helpful to talk about “propositional algebra” and “predicate algebra”. In fact, an early attempt to build symbolic logic in the 19th century is now referred to as “Boolean algebra”.

4.4.2 Remark: In the case of algebra, there are two basic tasks. One is easy. The other is difficult.

(1) Simple calculation: Evaluation of algebraic expressions, given the values of the individual variables. For example, evaluate 2x + 3y, given that x = 10 and y = 15.
(2) Inverse problems: Inference of the values of individual variables, given the values of algebraic expressions. For example: solve for x and y, given that 2x + 3y = 65 and 5x − 2y = 20.

In logic, there are two analogous basic tasks.

(1) Simple calculation: Evaluation of truth-values of logical expressions, given the truth-values of individual logical variables. For example, determine the truth value of A ∧ (B ∨ C), given that A and C are true and B is false.
(2) Inverse problems: Inference of the truth-values of individual logical variables, given the truth-values of logical expressions. For example, determine the possible truth-values of A, B and C, given that A ∧ (B ∨ C) is true and (A ∨ B) ∧ C is false. (See Exercise 47.1.3 for solution.)

The rules of propositional calculus are reminiscent of the manipulation rules of algebra, which gradually reduce given equations to a desired form. Just as the rules of algebra ensure that the transformed equations have the same solutions as the initial equations which are given, so also the rules of propositional calculus ensure that the set of combinations of truth values for logical variables which satisfy the equations is not altered by application of the deduction rules. The rules of propositional calculus are required to verifiably leave the set of truth value combinations unchanged. This is usually expressed as the requirement that the deduction rules must never permit false conclusions from true premisses.

One may make a further analogy with calculus. There are two well-known branches of calculus.

(1) Simple calculation: Differentiation of functional expressions. In other words, the differential calculus. For example, calculate the derivative of exp(−x^2).
(2) Inverse problems: Determination of functional expressions whose derivatives are as specified. In other words, the integral calculus. For example: determine the form of f(x), given that the derivative of f is −2x exp(−x^2) for all x ∈ ℝ.

As in the case of algebra and logic, so also in calculus, case (1) is a straightforward calculation (requiring only a bounded, finite number of steps), whereas case (2) is more problematic, sometimes requiring an unbounded or infinite number of steps, and sometimes being even impossible to solve. In the propositional calculus, the number of steps for case (2) is finite, but the task is potentially quite difficult (although all conjectural theorems can be easily tested with truth tables by virtue of the deduction theorem in Section 4.9). In the predicate calculus, case (2) can sometimes require an unbounded number of steps or could even be impossible to solve. (The predicate calculus is effectively the propositional calculus for infinite families of logical variables. So the difficulty of predicate calculus is not very surprising.)

4.4.3 Remark: As an example of the observation in Remark 4.4.2 that the rules of deduction in propositional calculus may be regarded as techniques of solution of simultaneous logical equations, consider the modus ponens rule described in Remark 4.6.1. This is equivalent to solving for the truth value t(B), given the truth values t(A ⇒ B) = T and t(A) = T. Modus ponens is, in fact, similar to the substitution rule in real-number algebra. Since the truth value of A is T, this may be substituted into the true expression A ⇒ B to yield t(B) = T.

4.4.4 Remark: It may be argued that the propositional calculus is a terrible waste of time and energy because it can be done more simply and quickly with truth tables. It is true that all PC theorems may be easily determined to be provable or not provable by the use of simple algorithms such as truth tables. Every assertion may be easily converted to an equivalent single wff which is a tautology if and only if the assertion is provable. Whether a wff is a tautology can be determined by a simple algorithm which evaluates the truth or falsity of the wff for each of the possible 2^n combinations of truth values of the n proposition names in the wff. (This approach may be given the name “tabular exhaustive enumeration”.) Therefore the propositional calculus could be abandoned in favour of simple calculation.

A counter-argument to this argument is the fact that the predicate calculus (involving existential and universal quantifiers) is not so easy to analyze by simple algorithms. When the axioms of set theory are added to the predicate calculus, there seems to be no kind of “truth table” to facilitate the task of determining which statements are true or false. (A similar comment is made in Remark 4.14.3.) Since the propositional calculus is required as part of both the predicate calculus and set theory, it is beneficial to have a uniform approach to all three topics: propositional calculus, predicate calculus and set theory. It is most economical to use the axiomatic approach for all three levels.
The only reason the truth-table approach is possible for propositional calculus is the fact that PC expressions represent only a finite number of propositions. (In predicate calculus, a single logical expression often signifies an infinite number of propositions involving an infinite number of individual variables.) Truth tables are useful for teaching purposes, but have limited applicability in serious mathematical deduction. Truth tables are more closely associated with the “simple calculation” tasks described in part (1) in Remark 4.4.2 than with the “inverse problems” tasks in part (2).
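For a finite number of proposition names, the “inverse problem” of Remark 4.4.2 can itself be attacked by the tabular exhaustive enumeration of Remark 4.4.4. The Python sketch below makes that concrete for the two examples in the text. The representation of a logical equation as a pair (expression, required truth value) and the function name solve are assumptions of the sketch, not the book's method.

```python
from itertools import product

def solve(constraints, names):
    """Return every assignment of truth values to `names` that satisfies all constraints."""
    solutions = []
    for values in product([False, True], repeat=len(names)):
        env = dict(zip(names, values))
        if all(bool(expr(env)) == target for expr, target in constraints):
            solutions.append(env)
    return solutions

# Example from Remark 4.4.2: A ∧ (B ∨ C) is true and (A ∨ B) ∧ C is false.
abc_solutions = solve(
    [
        (lambda e: e["A"] and (e["B"] or e["C"]), True),
        (lambda e: (e["A"] or e["B"]) and e["C"], False),
    ],
    ["A", "B", "C"],
)

# Example from Remark 4.4.3: t(A ⇒ B) = T and t(A) = T, solve for t(B).
mp_solutions = solve(
    [
        (lambda e: (not e["A"]) or e["B"], True),
        (lambda e: e["A"], True),
    ],
    ["A", "B"],
)
```

Under these assumptions the first system has a single solution, with A and B true and C false, and every solution of the second system has B true. The cost is 2^n evaluations, which is why the same idea does not carry over to infinite families of propositions.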
[ Check whether there might actually be an effective truth-table method of testing all predicate calculus assertions for validity. ]

4.5. Propositional calculus formalization

4.5.1 Remark: Propositional calculus is a formalization of the methods of argument which are used for solving simultaneous logic equations. By formalizing the text-level procedures which are observed in the informal methods of solution, it is possible to dispense with the semantics and perform all calculations without any reference to the meaning of the symbols whatsoever. Thus one may regard propositional calculus as a semantics-free framework for solving logic problems. In other words, propositional calculus is semantics-free logic.

4.5.2 Remark: There are hundreds of reasonable ways to develop the propositional calculus. Axiomatic systems for propositional calculus typically have the following components.

(i) Operators: The set of primitive symbols such as operators. Typical symbols are ⇒, ∧, ∨ and ¬. These symbols are called the “primitive connectives” of the system. Sometimes the parentheses “(” and “)” may be defined as primitive symbols although their function is usually only for the grouping of symbols into sub-formulas to make operands unambiguous.
(ii) Name space: The sets of permitted labels for propositions and well-formed formulas. The labels for propositions are called the “statement names” of the system. They are typically single letters, with or without subscripts. (Presumably the labels for well-formed formulas could be referred to as “wff names”?)
(iii) Syntax rules: The rules which decide the syntactic correctness of logical sentences. A permitted sentence is called a “well-formed formula”, abbreviated to wff or wf. (A wff may also be referred to as a “statement form”.)
(iv) Axioms: The set of axioms or axiom schemas. Axiom schemas are templates into which arbitrary well-formed formulas may be substituted.
(v) Inference rules: The set of rules of inference. Typical inference rules are “modus ponens” and “reductio ad absurdum”. (See Remark 4.6.8 for the significance and origin of the phrase “modus ponens”.)

4.5.3 Remark: It is conventional to typeset Latin phrases in italics, e.g. modus ponens and modus tollens, so that readers will not waste their time looking up these phrases in English dictionaries, although many common Latin words and phrases are defined in English dictionaries.

4.5.4 Remark: The relations between the statements, statement names, statement forms, statement-form names and statement-form-name forms in Remark 4.5.2 are illustrated in Figure 4.5.1.

layer   contents                      examples                                  also called
5       statement form letter forms   α ⇒ β,  β ⇒ γ,  γ ∨ δ,  ¬δ                compound syntactic variables
4       statement form letters        α,  β,  γ,  δ                             syntactic variables
3       statement forms               A ∧ B,  A ∧ (B ∨ C),  B ∨ C,  ¬C          abstract proposition formulas
2       statement letters             A,  B,  C,  D                             abstract propositions
1       statements                    statement 1, statement 2, statement 3, statement 4    concrete propositions

Figure 4.5.1   Statements, names and forms

The statements (or propositions) in layer 1 in Figure 4.5.1 are in some externally defined concrete proposition space. (See Remark 4.1.1 for concrete spaces of propositions.) Concrete atomic propositions may be transistor voltages, natural-language statements, symbolic mathematics statements, or any other kinds of two-state components of systems.

The statement names (or proposition names) in layer 2 are associated in an implementation-dependent fashion with concrete propositions. Two statement names may refer to the same concrete proposition. The association may vary over time. In fact, there is no need to have even an equality relation on the space of concrete propositions.

In the terminology of Remark 3.3.4, the proposition names in layer 2 belong to the “discussion context” whereas the concrete propositions in layer 1 belong to the “discussed context”. Layer 3 also belongs to the discussion context. (Layers 4 and 5 belong to a meta-discussion context, which discusses the layer 2/3 context.)

It is not necessary to have a well-defined concept of truth and falsity in layer 1, the space of concrete statements. In fact, the entities in layer 1 don’t need to have any sort of two-state attributes at all. Layer 2 in Figure 4.5.1 does have a crisp, sharp notion of truth and falsity, but this is not a contradiction. The upper four layers belong to a discussion context for layer 1, which is the discussed context. Conclusions which are arrived at in layer 2 can only be expressed in layer 1 if layer 1 has a full set of two-state attributes and logical expressions.

In layer 3, the atomic abstract proposition names in layer 2 are combined into logical compounds. This is the layer in which propositional calculus does its work. The compound abstract expressions in layer 3 may or may not be associated with equivalent compound expressions in layer 1. Some concrete proposition domains do not support general logical expressions, in which case layer 3 may be regarded as an extension of the concrete proposition domain.
In layer 4, the meta-logical analysis of compound abstract propositions is facilitated by associating wff names with compound propositions in layer 3. This association is arbitrary except to the extent specified in the discussion context. The association is not necessarily fixed. Wff names are also known as “syntactical variables” (Shoenfield [168], page 7), “metalogical variables” (Lemmon [163], page 49) or “statement letters” (Mendelson [164], page 15).

In layer 5, the syntactic variables in layer 4 are combined meta-logically into compounds of compounds. When the values of the syntactic variables are substituted into an expression in layer 5, the result is an expression in layer 3.

Compound logical statements (in layer 3) are often explained as having their truth-values determined by the truth-values of atomic logical statements (in layer 2). In practice, the reverse is very often the case. In other words, the truth-values of compound statements are given, and the task is to solve for the truth-values of either the atomic components or other compound statements. This is analogous to the tasks undertaken in algebra. One is generally given relations between variables, and the task is to solve for the individual variables or other relations between variables. (The daily tasks of PDE analysis are similar, where one is given a PDE plus some boundary and/or initial values, and the task is to find the set of all solutions, except that mostly exact solutions cannot be computed; only the properties of solutions can be determined in most practical cases.)

The names in each of the upper four layers in Figure 4.5.1 refer to individual entities in the corresponding lower layer.
4.5.5 Remark: Truth and falsity do not necessarily have a crisp, well-defined status in layer 1 of Figure 4.5.1 in Remark 4.5.4. If the propositions in layer 1 are defined within a predicate calculus or first-order language (or some other symbolic logic context), then the truth values will be well defined. But if the propositions are in natural English language, truth values will generally be fuzzy and uncertain. Similarly, if the propositions are voltages of electronic logic circuits, the truth values can be indeterminate under some circumstances. Truth values are well defined in the upper four layers of Figure 4.5.1.

4.5.6 Remark: Definition 4.5.7 is fairly woolly because it must be explained in terms of naive logic, naive sets and naive numbers. To put it more briefly, α_1, α_2, ..., α_n ⊢ β means that the conclusion β can be deduced from the assumptions α_1, α_2, ..., α_n for any non-negative integer n. The assertion symbol “⊢” may be given a subscript to indicate which propositional calculus is used for the deduction.

4.5.7 Definition: An assertion in a propositional calculus X is an ordered pair (α, β) such that α is a finite set of zero or more wffs, and β is a wff, and for some logical argument in X, the wff β is the conclusion and the wffs α are the assumptions.

4.5.8 Notation: α_1, α_2, ..., α_n ⊢_X β, for non-negative integers n, denotes the assertion in propositional calculus X of the pair (α, β), where α = {α_1, α_2, ..., α_n}. The subscript X may be omitted. Thus α_1, α_2, ..., α_n ⊢ β, for non-negative integers n, denotes the assertion in a propositional calculus (which is implied in the context) of the pair (α, β), where α = {α_1, α_2, ..., α_n}.

[ The two-way assertion symbol in Notation 4.5.9 looks clumsy. The vertical lines are too close together and the horizontal dashes are too long. See Lemmon [163], page 34. It shouldn’t look like an electronic circuit diagram symbol for a capacitor. ]

4.5.9 Notation: α ⊣⊢ β denotes the assertion of both α ⊢ β and β ⊢ α.

4.6. Deduction rules

[ It might be a good idea to summarize all of the propositional calculus rule-sets and axiom-sets which appear in various textbooks. The different systems should at least be listed in a table. ]

4.6.1 Remark: The modus ponens rule means that whenever a line of the form

(n_1)  α

appears on line (n_1) in an argument, for any wff α, and the line

(n_2)  α ⇒ β

appears in the same argument on line (n_2), for any wff β, then the line

(n_3)  β                                MP (n_1,n_2)

may be written on any line (n_3) in the same argument if n_1 < n_3 and n_2 < n_3. To assist checking for errors, it is conventional to indicate the two inputs to the rule in some such way as “MP (n_1,n_2)” as indicated.

4.6.2 Remark: The modus ponens rule is sufficient for all deduction in the propositional calculus if a small set of axiom schemas is assumed. However, the MP rule gives very strong emphasis to the conditional operator “⇒”. This is an asymmetric operator which is not quite as intuitive in meaning as the and-operator “∧” and the not-operator “¬”. The PC axioms are somewhat unintuitive when expressed in terms of the conditional operator. One may therefore ask whether a rule equivalent to MP may be implemented in terms of the ∧ and ¬ operators. Consider the following form of deduction:

α, ¬(α ∧ ¬β) ⊢ β

(1)  α                              Assumption 1
(2)  ¬β                             Assumption
(3)  α ∧ ¬β                         AND (1,2)
(4)  ¬(α ∧ ¬β)                      Assumption 2
(5)  (α ∧ ¬β) ∧ ¬(α ∧ ¬β)           AND (3,4)
(6)  β                              RAA (2,5)
Thus a combination of the and-introduction rule and reductio ad absurdum yields the same result as MP, because ¬(α ∧ ¬β) has the same meaning as α ⇒ β. The and-introduction rule is really nothing other than the definition of the and-operator. So RAA is effectively of equivalent strength to modus ponens. Therefore the propositional calculus could be written in terms of the and-operator and the RAA rule.
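The two semantic facts used in Remark 4.6.2 can be confirmed with a four-row truth table. The Python sketch below checks only truth-table validity; it does not model the AND and RAA proof steps themselves. The helper name implies is an assumption of the sketch.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Truth function of the conditional operator ⇒."""
    return (not a) or b

VALUATIONS = list(product([False, True], repeat=2))

# ¬(α ∧ ¬β) has the same truth table as α ⇒ β.
same_table = all((not (a and not b)) == implies(a, b) for a, b in VALUATIONS)

# Whenever both assumptions α and ¬(α ∧ ¬β) are true, the conclusion β is true.
rule_is_sound = all(b for a, b in VALUATIONS if a and not (a and not b))
```

Only one of the four valuations satisfies both assumptions, namely α and β both true, which is why the conclusion is forced.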
4.6.3 Remark: Some deduction rules may be interpreted as definitions of particular logical operators. For example, the modus ponens rule is effectively the definition of the logical implication operator “⇒”. The AND-introduction and AND-elimination rules define the conjunction operator “∧”. The RAA rule is effectively a definition of the logical negation operator “¬”.

4.6.4 Remark: The modus ponens inference rule is powerful enough to be used essentially alone as the sole inference rule in part (v) of Remark 4.5.2. Many popular axiomatic systems use only this single rule. Logicians often strive to reduce propositional calculus to a spartan, minimalist set of inference rules, axioms and primitive symbols. The more spartan the axiomatic system is, the more difficult it is to deduce the basic properties of the operators listed in Notation 4.3.3.

The use of a minimal axiomatic system can be justified on the grounds of reliability. The smaller the definition of the system, the easier it should be to verify that the system is valid in terms of one’s idea of how logic should be done. However, the intuitive correctness of the system is difficult to establish if the operators, axioms or rules seem unfamiliar because they are excessively abstracted.

An extreme form of minimalist axiomatic system uses the NAND (not-and) operator (which is equivalent to the “Sheffer stroke” or “alternative denial”) as the only operator symbol, with a single axiom and a single inference rule. It is somewhat burdensome to have to derive the familiar standard logic from such an austere foundation. It also suffers from a serious lack of intuitive comprehensibility. A very similar minimalist axiomatic system uses the NOR (not-or) operator (also known as “joint denial”) as its only operator symbol. (These two operators correspond closely to the way a one-transistor circuit can be made to function as a logic device.)
Since the modus ponens inference rule is so popular, and this rule specifically requires wffs of the form α ⇒ β, it seems sensible for even a minimalist axiomatic system to include the ⇒ symbol as part of its symbol set in part (i) of Remark 4.5.2. The complete set of logical operations cannot be defined in terms of the ⇒ symbol alone. In fact, among the binary operators, only the NAND and NOR operators can generate the full set of logic operations from a single operator. (See Remark 4.11.3 for related comments.)
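The statements about NAND, NOR and ⇒ in Remark 4.6.4 can be explored for two proposition names by computing which truth tables are reachable from a single operator. The following Python sketch is an illustration only: it represents each expression in A and B by its four-entry truth table, and the function name closure is hypothetical.

```python
from itertools import product

ARGS = list(product([False, True], repeat=2))  # valuations of (A, B); (T, T) is last

def nand(a, b): return not (a and b)
def nor(a, b): return not (a or b)
def implies(a, b): return (not a) or b

def closure(op):
    """Truth tables of all expressions in A and B built from the operator `op` alone."""
    tables = {tuple(a for a, _ in ARGS), tuple(b for _, b in ARGS)}
    while True:
        new = {tuple(op(x, y) for x, y in zip(s, t))
               for s in tables for t in tables}
        if new <= tables:        # nothing new was produced, so the set is closed
            return tables
        tables |= new

NOT_A = tuple(not a for a, _ in ARGS)

nand_tables = closure(nand)
nor_tables = closure(nor)
implies_tables = closure(implies)
```

With two proposition names there are 2^(2^2) = 16 possible truth tables. NAND and NOR each reach all 16, whereas every table reachable from ⇒ alone is true when A and B are both true, so the table of ¬A is never reached.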
[ Write out the axioms of propositional calculus in terms of the and-operator and the RAA rule. ]
[ Jean Nicod (1917) showed that the single axiom schema (α|(β|γ))|((δ|(δ|δ))|((ε|β)|((α|ε)|(α|ε)))) is sufficient for generating standard propositional calculus, using the single inference rule α, (α|(β|γ)) ⊢ γ. See Mendelson [164], page 42, where this NAND operator is called “alternative denial”. ]

4.6.5 Remark: The primacy of the implication operator in propositional calculus may seem inevitable, but it is valuable to consider why other operators cannot easily fulfil the same role. Since ancient Greek times, syllogisms such as “all people are mortal; therefore Socrates is mortal” have dominated logic. Such a deduction is equivalent to modus ponens, which is equivalent to the rule “α, α ⇒ β ⊢ β”. But rules and theorems of the form “α_1, α_2, ..., α_n ⊢ β” are equivalent to tautologies of the form “α_1 ⇒ (α_2 ⇒ ... (α_n ⇒ β) ...)”. This seems to support the claim that the implication operator is special in some way for logical deduction. The assertion symbol “⊢” appears to be essentially equivalent to an implication operator.

Our method of logical deduction generally proceeds from a set of pre-established facts or rules, and proceeds to apply these to special circumstances. As a result, logical deduction tends to have the form of substituting special-instance information into general rules, and such deduction looks very much like modus ponens: “α, α ⇒ β ⊢ β”, where α ⇒ β is the general rule, α is the special-instance information, and β is the proposition which is deduced from the rule when it is applied to α.

The possibility of replacing implication with conjunction as the primary operator for propositional calculus is briefly discussed in Remark 4.6.2. However, although this may be technically possible, it would not be a natural model for human deduction. Disjunction would be a much more likely candidate for logical operator primacy, but only because disjunction is very similar to implication. Real-world literature is generally expressed much more in terms of implications than disjunctions. So this seems to tip the balance in favour of the implication operator.
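The correspondence in Remark 4.6.5 between rules of the form “α_1, α_2, ..., α_n ⊢ β” and nested implications can be illustrated at the level of truth tables. This Python sketch compares truth-table entailment with the tautology test for the nested formula. It relies on the text's statement that provability in propositional calculus can be decided by truth tables; the function names entails, curried and is_tautology are hypothetical.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    return (not a) or b

def entails(premises, conclusion, n: int) -> bool:
    """True if the conclusion holds in every valuation where all premises hold."""
    return all(conclusion(*v)
               for v in product([False, True], repeat=n)
               if all(p(*v) for p in premises))

def curried(premises, conclusion):
    """Build α_1 ⇒ (α_2 ⇒ ... (α_n ⇒ β) ...) as a truth function."""
    def formula(*v):
        result = conclusion(*v)
        for p in reversed(premises):   # wrap from the innermost implication outwards
            result = implies(p(*v), result)
        return result
    return formula

def is_tautology(f, n: int) -> bool:
    return all(f(*v) for v in product([False, True], repeat=n))

# Modus ponens: A, A ⇒ B ⊢ B.
mp = ([lambda a, b: a, lambda a, b: implies(a, b)], lambda a, b: b)
# An invalid pattern for contrast: A ⇒ B, B does not yield A.
bad = ([lambda a, b: implies(a, b), lambda a, b: b], lambda a, b: a)

mp_pair = (entails(*mp, 2), is_tautology(curried(*mp), 2))
bad_pair = (entails(*bad, 2), is_tautology(curried(*bad), 2))
```

In both cases the two tests agree: the nested implication is a tautology exactly when the premises force the conclusion.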
Implication rules can be chained together to make new rules. For example, α ⇒ β, β ⇒ γ, γ ⇒ δ ⊢ α ⇒ δ. This cannot be done with conjunctions or disjunctions because they are symmetric. In an aesthetic sense, the use of symmetric operators would be very pleasing. Symmetry is normally a sought-after attribute in mathematical theories. But in this case, the asymmetry is apparently highly desirable for utilitarian reasons. Deduction is itself asymmetric in nature.

4.6.6 Remark: Tautologies such as “(A ∧ (A ⇒ B)) ⇒ B”, which have the ⇒ symbol as the primary operator, are often written interchangeably in the form of a theorem like “(A ∧ (A ⇒ B)) ⊢ B”. The concept of the assertion symbol ⊢ can be guessed by noting that the following four statements are essentially interchangeable.

⊢ (A ∧ (A ⇒ B)) ⇒ B
A ∧ (A ⇒ B) ⊢ B
A, (A ⇒ B) ⊢ B
A ⊢ (A ⇒ B) ⇒ B

A list of zero or more propositions to the left of the assertion symbol is the set of “assumptions”. The single proposition to the right of the assertion symbol is the “assertion”. It is asserted that the assertion can be deduced from the assumptions. In principle, all theorems in mathematics may be written in this way, but such a rigorously correct notation is not popular among mathematicians although strict symbolic logic would make theorem statements much less ambiguous. The assertion symbol “⊢” is rarely used in this book except in the logic and set theory chapters.

4.6.7 Remark: The modus ponens inference rule may be thought of very roughly as replacing the ⇒ operator with the ⊢ symbol. The reverse replacement is known as the “deduction theorem”. According to this metatheorem, the theorem Γ ⊢ A ⇒ B may be proved if the theorem Γ, A ⊢ B can be proved, for any list Γ of wffs. (See Section 4.9.)

[ Maybe arrange the four modes in Remark 4.6.8, and their rough meanings, in a table. ]

4.6.8 Remark: The phrase modus ponens is an abbreviation for the medieval reasoning principle called modus ponendo ponens. This was one of the following four principles of reasoning. (See Lemmon [163], page 61.)
(i) Modus ponendo ponens. Roughly speaking: A, A ⇒ B ⊢ B.
(ii) Modus tollendo tollens. Roughly speaking: ¬B, A ⇒ B ⊢ ¬A.
(iii) Modus ponendo tollens. Roughly speaking: A, ¬(A ∧ B) ⊢ ¬B.
(iv) Modus tollendo ponens. Roughly speaking: ¬A, A ∨ B ⊢ B.
The Latin words in these reasoning-mode names come from the verbs “ponere” and “tollere”. The verb “ponere” (which means “to put”) has the following particular meanings in the context of logical argument. (See White [217], page 474.)

(1) [ In speaking or writing: ] To lay down as true; to state, assert, maintain, allege.
(2) To put hypothetically, to assume, suppose.

Meaning (1) is intended in the word “ponens”. Meaning (2) is intended in the word “ponendo”. Thus the literal meaning of “modus ponendo ponens” is “assertion-by-assumption mode”. In other words, when A is assumed, B may be asserted.

The Latin verb “tollere” (which means “to lift up”) has the following figurative meanings. (White [217], page 612.) To do away with, remove; to abolish, annul, abrogate, cancel.

Mode (ii) (“modus tollendo tollens”) may be translated “negative-assertion-by-negative-assumption mode” or “negation-by-negation mode”. In other words, when B is assumed to be false, A may be asserted to be false. Effectively “tollendo” means “by negative assumption” while “tollens” means “negative assertion”.

Mode (iii) (“modus ponendo tollens”) may be translated “negative-assertion-by-positive-assumption mode” or “negation-by-assumption mode”. When A is assumed to be true, B may be asserted to be false.

Mode (iv) (“modus tollendo ponens”) may be translated “positive-assertion-by-negative-assumption mode” or “assertion-by-negation mode”. In this case, when A is assumed to be false, B may be asserted to be true.

Since modes (iii) and (iv) are rarely used as inference rules, the more popular modes (i) and (ii) are generally abbreviated to simply “modus ponens” and “modus tollens” respectively.
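Each of the four modes in Remark 4.6.8 can be checked for truth-table validity in the rough forms given in items (i) to (iv). The Python sketch below does this by enumeration; the dictionary MODES and the function name entails are illustrative choices and not part of the text.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    return (not a) or b

def entails(premises, conclusion) -> bool:
    """True if the conclusion holds in every valuation of (A, B) satisfying all premises."""
    return all(conclusion(a, b)
               for a, b in product([False, True], repeat=2)
               if all(p(a, b) for p in premises))

# Each entry: (list of premises, conclusion), as truth functions of A and B.
MODES = {
    "modus ponendo ponens":   ([lambda a, b: a,     lambda a, b: implies(a, b)],   lambda a, b: b),
    "modus tollendo tollens": ([lambda a, b: not b, lambda a, b: implies(a, b)],   lambda a, b: not a),
    "modus ponendo tollens":  ([lambda a, b: a,     lambda a, b: not (a and b)],   lambda a, b: not b),
    "modus tollendo ponens":  ([lambda a, b: not a, lambda a, b: a or b],          lambda a, b: b),
}

validity = {name: entails(premises, conclusion)
            for name, (premises, conclusion) in MODES.items()}
```

All four entries of validity come out true, so each mode never leads from true premises to a false conclusion.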
4.7. An implication-based propositional calculus

The subject of this section is the particular formulation of propositional calculus which is described in Definition 4.7.4. This axiomatic system is adopted as the basis for logic theorems which are required in this book.

[ Possibly also have a section which presents a NAND-based propositional calculus. This would be purely recreational. Probably not worth the ink. ]

4.7.1 Remark: Definition 4.7.4 is a compromise between the ascetic, impoverished NAND-based axiom system (mentioned in Remark 4.6.4) and an easy-going 4-symbol, 10-axiom system. Definition 4.7.4 defines a two-operator axiomatic system with ⇒ and ¬ as primitive symbols, together with three axiom schemas, and modus ponens as the sole inference rule. This one-rule, two-symbol, three-axiom system is less tedious to work with than the NAND and NOR logics, but it is still hard work to generate the basic properties of the symbols in Notation 4.3.3 from it. This axiomatic system is attributed to Jan Łukasiewicz. (It is described by Mendelson [164], pages 30–31.)

[ Present a 4-symbol, 10-axiom, 1-rule system and refer to this in Remark 4.7.1. ]

[ The single axiom schema ((((α ⇒ β) ⇒ (¬γ ⇒ ¬δ)) ⇒ γ) ⇒ ε) ⇒ ((ε ⇒ α) ⇒ (δ ⇒ α)) was shown by C. A. Meredith (1953) to be equivalent to the three-axiom set in Definition 4.7.4. This is mentioned in Mendelson [164], page 42. ]

4.7.2 Remark: Logic theorems which are deduced from the propositional calculus axiomatic system in Definition 4.7.4 will be tagged with the abbreviation “PC”. For example, see Theorem 4.8.3. (Theorems which are not tagged are, by default, derived from Zermelo-Fraenkel set theory. See Remark 5.0.10.)

[ Get a reference for who invented the axiom system in Definition 4.7.4. Probably Jan Łukasiewicz. ]
[ One source gives a simpler version of the third axiom schema (PC 3) in Definition 4.7.4: (¬β ⇒ ¬α) ⇒ (α ⇒ β). Check that this is valid and find out why Mendelson gives the more complicated axiom schema. It must be shown that this modified axiom, together with axioms (PC 1) and (PC 2), implies axiom (PC 3) as stated. ]

4.7.3 Remark: Definition 4.7.4 is organized as parts (i) to (v) corresponding to the five parts of Remark 4.5.2.

4.7.4 Definition: The following is an axiom system for propositional calculus.
(i) The primitive connectives are ⇒ (“implies”) and ¬ (“not”). The grouping parentheses are “(” and “)”.
(ii) The statement names are the upper-case letters of the Roman alphabet, A to Z, with or without decimal integer subscripts. The “wff names” are the lower-case letters of the Greek alphabet, with or without decimal integer subscripts.
(iii) Any statement name is a wff. For any wff α, the formula (¬α) is a wff. For any wffs α and β, the formula (α ⇒ β) is a wff. Any formula which cannot be constructed by recursive application of these rules is not a wff. (For clarity, parentheses may be omitted in accordance with precedence rules.)
(iv) The axiom schemas are as follows:
(PC 1) α ⇒ (β ⇒ α).
(PC 2) (α ⇒ (β ⇒ γ)) ⇒ ((α ⇒ β) ⇒ (α ⇒ γ)).
(PC 3) (¬β ⇒ ¬α) ⇒ ((¬β ⇒ α) ⇒ β).
(v) The only inference rule is modus ponens. (See Remark 4.6.1.)
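Definition 4.7.4 is concrete enough to be checked by a short program. The following Python sketch is one possible encoding and is not part of the formal development. Wffs are represented as nested tuples, the three axiom schemas as templates, and a line is accepted only if it is a substitution instance of a schema or follows from two earlier lines by modus ponens. The names `Imp`, `Not`, `match` and `check_proof` are hypothetical. The sample five-line derivation of A ⇒ A uses only (PC 1), (PC 2) and modus ponens.

```python
def Not(w):
    return ("not", w)

def Imp(a, b):
    return ("imp", a, b)

# Axiom schemas of Definition 4.7.4 (iv); the Greek names are template variables.
SCHEMAS = {
    "PC 1": Imp("α", Imp("β", "α")),
    "PC 2": Imp(Imp("α", Imp("β", "γ")), Imp(Imp("α", "β"), Imp("α", "γ"))),
    "PC 3": Imp(Imp(Not("β"), Not("α")), Imp(Imp(Not("β"), "α"), "β")),
}

def match(template, wff, env):
    """Try to extend env so that substituting env into template yields wff."""
    if isinstance(template, str):          # a wff name: bind it or compare
        if template in env:
            return env[template] == wff
        env[template] = wff
        return True
    if isinstance(wff, str) or template[0] != wff[0]:
        return False                       # connectives differ
    return all(match(t, w, env) for t, w in zip(template[1:], wff[1:]))

def check_proof(lines):
    """Accept a list of (wff, justification) pairs.

    A justification is ("PC 1",), ("PC 2",), ("PC 3",) or ("MP", i, j) with
    1-based line numbers, where line j has the form (line i) => (this line).
    """
    for n, (wff, just) in enumerate(lines):
        if just[0] == "MP":
            i, j = just[1] - 1, just[2] - 1
            ok = 0 <= i < n and 0 <= j < n and lines[j][0] == Imp(lines[i][0], wff)
        else:
            ok = match(SCHEMAS[just[0]], wff, {})
        if not ok:
            return False
    return True

A = "A"  # a statement name
proof = [
    (Imp(A, Imp(Imp(A, A), A)), ("PC 1",)),
    (Imp(Imp(A, Imp(Imp(A, A), A)), Imp(Imp(A, Imp(A, A)), Imp(A, A))), ("PC 2",)),
    (Imp(Imp(A, Imp(A, A)), Imp(A, A)), ("MP", 1, 2)),
    (Imp(A, Imp(A, A)), ("PC 1",)),
    (Imp(A, A), ("MP", 4, 3)),
]
proof_ok = check_proof(proof)
```

The checker deliberately has no notion of hypotheses, so it can only verify assertions of the form ⊢ α. Supporting assertions with wffs on the left of the assertion symbol would require an extra justification kind.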
4.7.5 Remark: Axiom schemas (PC 1)–(PC 3) in Definition 4.7.4 are perhaps a little difficult to interpret. Axiom (PC 1) is just one half of the definition of the ⇒ operator. Axiom (PC 2) looks like a “distributivity axiom” or “restriction axiom”, which says how the ⇒ operator distributes over the ⇒ operator.

4.7.6 Remark: Axiom (PC 3) looks very much like a “reductio ad absurdum” axiom. It states that if the assumption ¬β implies both ¬α and α, then the assumption ¬β must be false; in other words β must be true.

4.7.7 Remark: The gradual build-up of theorems in an axiomatic system (such as propositional calculus, predicate calculus or Zermelo-Fraenkel set theory) is analogous to the way programming procedures (also called “functions”) are built up in software libraries. In both cases, there is an attempt to amass a hoard of re-usable “intellectual capital” which can be used in a wide variety of future work. Consequently the work gets progressively easier as “user-friendly” theorems (or programming procedures) accumulate over time.

Accumulation of “theorem libraries” sounds like a good idea in principle, but a single error in a single re-usable item (a theorem or a programming procedure) can easily propagate to a very wide range of applications. In other words, “bugs” can creep into re-usable libraries. It is for this reason that there is so much emphasis on total correctness in mathematical logic. The slightest error could propagate to all of mathematics.

The development of the propositional calculus is also analogous to “boot-strapping” a computer operating system. The propositional calculus is the lowest functional layer of mathematics. Everything else is based on this substrate. Logic and set theory may be thought of as the “operating system” of mathematics. Then differential geometry is a “user application” in this “operating system”.
[ Should prove somewhere that Definition 4.7.4 is consistent with the truth-value function properties of the operators ⇒ and ¬. Also show that all properties of these operators may be derived from Definition 4.7.4. In other words, show that Definition 4.7.4 is equivalent to the operators. ]

4.8. Some propositional calculus theorems

4.8.1 Remark: Theorem 4.8.3 follows the usual informal approach to proofs in logic, which is to find proofs for desired assertions, while building up a small library of useful intermediate assertions along the way. A different approach would be to generate all possible assertions which can be obtained in n deduction steps with all possible combinations of applications of the deduction rules. Then n can be gradually increased to discover all possible theorems. This is analogous to finding the span of a set of vectors in a linear space. However, such a systematic, exhaustive approach is about as useful as generating the set of all possible chess games according to the number of moves n. As n increases, the number of possible games increases very rapidly. But a more serious problem is that the vast majority of such games are worthless. Similarly in logic, the vast majority of true assertions are uninteresting. (A similar comment is made in Remark 2.9.1.)

4.8.2 Remark: The order of assertions in Theorem 4.8.3 is chosen so that earlier assertions assist in the proof of later assertions. Although the proof is long, and quite stressful if you’re out of practice, it does demonstrate some of the flavour of propositional calculus. After reading the proofs of the first one or two parts of the theorem, the reader may like to find proofs for the other parts without looking at the solutions provided here. Another useful exercise is to try to find shorter proofs than those which are given here.
4.8.3 Theorem [pc]: The following assertions follow from Definition 4.7.4.
(i) ⊢ α ⇒ α.
(ii) ⊢ (¬α ⇒ α) ⇒ α.
(iii) α ⇒ β, β ⇒ γ ⊢ α ⇒ γ.
(iv) α ⇒ (β ⇒ γ) ⊢ β ⇒ (α ⇒ γ).
(v) ⊢ (¬β ⇒ ¬α) ⇒ (α ⇒ β).
(vi) ⊢ ¬¬α ⇒ α.
(vii) α ⇒ β ⊢ ¬¬α ⇒ β.
(viii) ⊢ α ⇒ ¬¬α.
(ix) α ⇒ β ⊢ α ⇒ ¬¬β.
(x) α ⇒ β ⊢ ¬¬α ⇒ ¬¬β.
(xi) α ⇒ β ⊢ ¬β ⇒ ¬α.
(xii) α ⇒ ¬β ⊢ β ⇒ ¬α.
(xiii) ¬α ⇒ β ⊢ ¬β ⇒ α.
(xiv) ⊢ ¬α ⇒ (α ⇒ β).
(xv) ⊢ α ⇒ (¬α ⇒ β).
(xvi) ⊢ ¬(α ⇒ ¬β) ⇒ α.
(xvii) ⊢ ¬(α ⇒ ¬β) ⇒ β.

Proof: To prove part (i): ⊢ α ⇒ α
(1) α ⇒ ((α ⇒ α) ⇒ α)    PC 1
(2) (α ⇒ ((α ⇒ α) ⇒ α)) ⇒ ((α ⇒ (α ⇒ α)) ⇒ (α ⇒ α))    PC 2
(3) (α ⇒ (α ⇒ α)) ⇒ (α ⇒ α)    MP (1,2)
(4) α ⇒ (α ⇒ α)    PC 1
(5) α ⇒ α    MP (4,3)

To prove part (ii): ⊢ (¬α ⇒ α) ⇒ α
(1) ¬α ⇒ ¬α    part (i)
(2) (¬α ⇒ ¬α) ⇒ ((¬α ⇒ α) ⇒ α)    PC 3
(3) (¬α ⇒ α) ⇒ α    MP (1,2)

To prove part (iii): α ⇒ β, β ⇒ γ ⊢ α ⇒ γ
(1) α ⇒ β    Hyp
(2) β ⇒ γ    Hyp
(3) (β ⇒ γ) ⇒ (α ⇒ (β ⇒ γ))    PC 1
(4) α ⇒ (β ⇒ γ)    MP (2,3)
(5) (α ⇒ (β ⇒ γ)) ⇒ ((α ⇒ β) ⇒ (α ⇒ γ))    PC 2
(6) (α ⇒ β) ⇒ (α ⇒ γ)    MP (4,5)
(7) α ⇒ γ    MP (1,6)

To prove part (iv): α ⇒ (β ⇒ γ) ⊢ β ⇒ (α ⇒ γ)
(1) α ⇒ (β ⇒ γ)    Hyp
(2) β ⇒ (α ⇒ β)    PC 1
(3) (α ⇒ (β ⇒ γ)) ⇒ ((α ⇒ β) ⇒ (α ⇒ γ))    PC 2
(4) (α ⇒ β) ⇒ (α ⇒ γ)    MP (1,3)
(5) β ⇒ (α ⇒ γ)    part (iii) (2,4)

To prove part (v): ⊢ (¬β ⇒ ¬α) ⇒ (α ⇒ β)
(1) α ⇒ (¬β ⇒ α)    PC 1
(2) (¬β ⇒ ¬α) ⇒ ((¬β ⇒ α) ⇒ β)    PC 3
(3) (¬β ⇒ α) ⇒ ((¬β ⇒ ¬α) ⇒ β)    part (iv) (2)
(4) α ⇒ ((¬β ⇒ ¬α) ⇒ β)    part (iii) (1,3)
(5) (¬β ⇒ ¬α) ⇒ (α ⇒ β)    part (iv) (4)

To prove part (vi): ⊢ ¬¬α ⇒ α
(1) ¬α ⇒ ¬α    part (i)
(2) (¬α ⇒ ¬¬α) ⇒ ((¬α ⇒ ¬α) ⇒ α)    PC 3
(3) (¬α ⇒ ¬α) ⇒ ((¬α ⇒ ¬¬α) ⇒ α)    part (iv) (2)
(4) (¬α ⇒ ¬¬α) ⇒ α    MP (1,3)
(5) ¬¬α ⇒ (¬α ⇒ ¬¬α)    PC 1
(6) ¬¬α ⇒ α    part (iii) (5,4)

To prove part (vii): α ⇒ β ⊢ ¬¬α ⇒ β
(1) α ⇒ β    Hyp
(2) ¬¬α ⇒ α    part (vi)
(3) ¬¬α ⇒ β    part (iii) (2,1)

To prove part (viii): ⊢ α ⇒ ¬¬α
(1) ¬¬¬α ⇒ ¬α    part (vi)
(2) (¬¬¬α ⇒ ¬α) ⇒ (α ⇒ ¬¬α)    part (v)
(3) α ⇒ ¬¬α    MP (1,2)

To prove part (ix): α ⇒ β ⊢ α ⇒ ¬¬β
(1) α ⇒ β    Hyp
(2) β ⇒ ¬¬β    part (viii)
(3) α ⇒ ¬¬β    part (iii) (1,2)

To prove part (x): α ⇒ β ⊢ ¬¬α ⇒ ¬¬β
(1) α ⇒ β    Hyp
(2) ¬¬α ⇒ β    part (vii) (1)
(3) ¬¬α ⇒ ¬¬β    part (ix) (2)

To prove part (xi): α ⇒ β ⊢ ¬β ⇒ ¬α
(1) α ⇒ β    Hyp
(2) ¬¬α ⇒ ¬¬β    part (x) (1)
(3) (¬¬α ⇒ ¬¬β) ⇒ (¬β ⇒ ¬α)    part (v)
(4) ¬β ⇒ ¬α    MP (2,3)

To prove part (xii): α ⇒ ¬β ⊢ β ⇒ ¬α
(1) α ⇒ ¬β    Hyp
(2) ¬¬α ⇒ ¬β    part (vii) (1)
(3) (¬¬α ⇒ ¬β) ⇒ (β ⇒ ¬α)    part (v)
(4) β ⇒ ¬α    MP (2,3)

To prove part (xiii): ¬α ⇒ β ⊢ ¬β ⇒ α
(1) ¬α ⇒ β    Hyp
(2) ¬α ⇒ ¬¬β    part (ix) (1)
(3) (¬α ⇒ ¬¬β) ⇒ (¬β ⇒ α)    part (v)
(4) ¬β ⇒ α    MP (2,3)

To prove part (xiv): ⊢ ¬α ⇒ (α ⇒ β)
(1) ¬α ⇒ (¬β ⇒ ¬α)    PC 1
(2) (¬β ⇒ ¬α) ⇒ (α ⇒ β)    part (v)
(3) ¬α ⇒ (α ⇒ β)    part (iii) (1,2)

To prove part (xv): ⊢ α ⇒ (¬α ⇒ β)
(1) α ⇒ ¬¬α    part (viii)
(2) ¬¬α ⇒ (¬α ⇒ β)    part (xiv)
(3) α ⇒ (¬α ⇒ β)    part (iii) (1,2)

To prove part (xvi): ⊢ ¬(α ⇒ ¬β) ⇒ α
(1) ¬α ⇒ (α ⇒ ¬β)    part (xiv)
(2) ¬(α ⇒ ¬β) ⇒ α    part (xiii) (1)

To prove part (xvii): ⊢ ¬(α ⇒ ¬β) ⇒ β
(1) ¬β ⇒ (α ⇒ ¬β)    PC 1
(2) ¬(α ⇒ ¬β) ⇒ β    part (xiii) (1)

This completes the proof of Theorem 4.8.3.

4.8.4 Remark: It is fair to ask how one might discover the proof which is presented here for part (i) of Theorem 4.8.3. The designer of this axiomatic system chose the axioms so that the propositional calculus could only just be generated from them. The system designer had the unfair advantage of working backwards from the theorems to the axioms. For the non-specialist, the discovery of proofs is initially like solving some of those frustrating recreational puzzles which are sold in puzzle shops. They are designed so that a solution exists, but is very difficult to find.

One might also compare the proof of basic theorems from minimal axiom sets to the deciphering of encrypted messages where you know the answer (or “plain-text”), and you have to descramble the message to arrive at the known answer. When proving logic theorems, one knows the answers, and the axioms are like a compressed, encrypted version of the full set of basic logic theorems. Therefore the main benefit of proving basic logic theorems is the acquisition of decryption skills, which hopefully will be applicable when the “plain-text” is not known in advance.

The first thing one may notice about the first two axioms, (PC 1) and (PC 2), is that the “input” (the left side of the top-level implication) of (PC 2) looks similar to the whole of (PC 1). Recognizing this, one can write the following.
⊢ (α ⇒ β) ⇒ (α ⇒ α)
(1) α ⇒ (β ⇒ α)    PC 1
(2) (α ⇒ (β ⇒ α)) ⇒ ((α ⇒ β) ⇒ (α ⇒ α))    PC 2
(3) (α ⇒ β) ⇒ (α ⇒ α)    MP (1,2)
Now the input of (3) is the same as the “output” (the right side of the top-level implication) of (PC 1) with α and β swapped: β ⇒ (α ⇒ β). However, it is clearly not possible to prove that β is true for any wff β. So this line of attack leads nowhere. But if β is replaced in (3) with (β ⇒ α), the input of (3) becomes the same as axiom (PC 1). This gives a valid result as follows.
⊢ α ⇒ α
(4) α ⇒ (β ⇒ α)    PC 1
(5) (α ⇒ (β ⇒ α)) ⇒ (α ⇒ α)    (from assertion above)
(6) α ⇒ α    MP (4,5)

Now that we know what we’re doing, we can combine the two above arguments into a single proof and pretend that we discovered it exactly as presented.
⊢ α ⇒ α
(1) α ⇒ ((β ⇒ α) ⇒ α)    PC 1
(2) (α ⇒ ((β ⇒ α) ⇒ α)) ⇒ ((α ⇒ (β ⇒ α)) ⇒ (α ⇒ α))    PC 2
(3) (α ⇒ (β ⇒ α)) ⇒ (α ⇒ α)    MP (1,2)
(4) α ⇒ (β ⇒ α)    PC 1
(5) α ⇒ α    MP (4,3)
If β is replaced by α, this is the same as the proof which was given for Theorem 4.8.3 (i).

4.8.5 Remark: A particular difficulty of propositional calculus theorem proving is knowing the best order in which to attack a desired set of assertions. If the order is chosen well, the earlier assertions can be very useful for proving the later assertions. But it is often only after the proofs are found that one can see a better order of attack.

In practice, one typically has a single target assertion to prove and looks (recursively) for assertions which can assist with this. If this backwards-deductive search leads to known results, the target assertion can be proved. The natural order of discovery is typically the opposite of the order of deduction of propositions. This is a constant feature of all mathematics research. Discovery of new results typically starts from the conclusions and works back to old results. (Most of the assertions in Theorem 4.8.3 were proved by the author in order to reach assertion (xvii), which happens to be equivalent to (α ∧ β) ⇒ β, an important property of the ∧ operator. The author first sketched a proof of assertion (xvii) using unproven assertions and then found proofs of these unproven assertions, which in turn required further unproven assertions, until finally all required assertions were proved from axioms.)

4.8.6 Remark: Theorem 4.8.3 (iii) means that the conditional logical operator is transitive. This is equivalent to the “a-fortiori” method of proof.

4.9. Meta-theorems and the “deduction theorem”

[ Possibly some of the discussion in this section should be moved to Chapter 3. ]

[ How many other PC meta-theorems are useful enough to present here? ]

4.9.1 Remark: As mentioned in Remark 4.9.4, all deduction rules may be regarded as meta-theorems. This is because deduction rules are generally proved by meta-logical means to yield only true results from true assumptions. Deduction rules are derived from more fundamental considerations such as the definitions of logical operators. (Some logic textbooks may give the impression that the deduction rules are more fundamental than logical operators.)

One example of a deduction rule which is really a meta-theorem is the substitution rule in propositional calculus. This states that any compound proposition may be substituted for any proposition name in a theorem to yield a new theorem. If the concrete proposition domain is closed under all logical operators, then the validity of the substitution rule is almost obvious. This is because one needs only to define a proposition name which refers to each desired compound proposition and then substitute this name for the generic name which was in the original theorem.
If the concrete proposition domain is not closed under logical operations, no such simple substitution is possible. In this case, it is necessary to determine whether the set of axioms is closed under substitution of compound propositions. In typical axiom sets, this is true. In a propositional calculus, the axioms will typically be templates into which arbitrary logical expressions may be substituted. So the substitution rule is once again valid. If the concrete proposition domain is not closed under logical operations and the axioms are not closed under substitution of logical expressions, it is possible that the substitution rule could be invalid.

4.9.2 Remark: In a formal sense, the assertions (1) “⊢ α ⇒ β” and (2) “α ⊢ β” are different for any wffs α and β, but in practice, they are “equipotent”, more or less. In other words, they both yield the same results if they are applied in a logical argument using modus ponens. If the wff α appears on a line of an argument, assertion (1) may be applied to infer β by modus ponens. Assertion (2) may be applied to infer β by the “theorem application rule”. So it seems to be unimportant whether the theorem is available in one form or the other.

One fly in the ointment here is that an axiom such as (γ ⇒ (δ ⇒ ε)) ⇒ ((γ ⇒ δ) ⇒ (γ ⇒ ε)) (as in Definition 4.7.4 (PC 2)) can only yield a result using MP if a proposition of the form (γ ⇒ (δ ⇒ ε)) is available. It may be that α ⇒ β is of the required form, but α and β individually are not. So such an axiom could then be applied to assertion (1), but not to assertion (2). It is therefore desirable to be able to convert between these two forms of assertion. The conversion from (1) to (2) is discussed in Remark 4.9.3. The reverse conversion is the subject of Theorem 4.9.8.

4.9.3 Remark: It is always possible to convert an assertion of the form “⊢ α ⇒ β” to the assertion “α ⊢ β” using modus ponens. Suppose first that “⊢ α ⇒ β” is a proved theorem. Then the following argument will be valid.
α ⊢ β
(1) α    Hyp
(2) α ⇒ β    Theorem
(3) β    MP (1,2)
The proof is apparently trivial. However, the general rule that the existence of a proof for the assertion “⊢ α ⇒ β” implies the existence of a proof for the assertion “α ⊢ β” is not a theorem. It is a meta-theorem. The reason for this is that α and β are not names of completely general wffs in each assertion. The scope of these wffs includes both of the assertions. These wff names refer to fixed wffs within the combined scope of the two assertions. The meta-theorem is a statement about proofs, not about propositions or wffs. Any such theorem about proofs is a meta-theorem.

4.9.4 Remark: One way to bring the meta-theorem in Remark 4.9.3 into the main stream of argumentation within propositional calculus is to simply declare this meta-theorem to be a deduction rule. In other words, one may declare that whenever there exists a proof of an assertion of the form “⊢ α ⇒ β”, it is permissible to infer the assertion “α ⊢ β”. A deduction rule can permit any kind of inference one desires. In this case, we “know” that this rule will always give true inferences from true assumptions. This follows from the meta-proof in Remark 4.9.3. Since all of the other rules of inference are justified meta-logically in the same way, there is no reason to exclude this kind of deduction rule. In fact, all deduction rules may be regarded as meta-theorems because they can only be justified by meta-proofs.

Probably the principal reason for excluding the declaration of the meta-theorem in Remark 4.9.3 as a deduction rule is the “minimalist principle” which has pervaded all of mathematics and logic for the last hundred years.

Another way to avoid the need for the meta-theorem in Remark 4.9.3 is to forbid all theorems where there are assumptions. Given a theorem of the form “⊢ α ⇒ β”, one may always infer β from α by modus ponens. In other words, there is no need for theorems of the form “α ⊢ β”. This would make the presentation of symbolic logic only slightly more tedious. An assertion of the form α₁, α₂, ... αₙ ⊢ β would need to be presented in the form ⊢ α₁ ⇒ (α₂ ⇒ (... (αₙ ⇒ β) ...)), which is somewhat untidy. For example, α₁, α₂, α₃ ⊢ β becomes ⊢ α₁ ⇒ (α₂ ⇒ (α₃ ⇒ β)).

4.9.5 Remark: After doing a fairly large number of proofs in the style of Theorem 4.8.3, one naturally feels a desire for short-cuts. One of the most significant frustrations is the inability to convert an assertion of the form “α ⊢ β” to an implication of the form “⊢ α ⇒ β”. An assertion of the form “α ⊢ β” means that the writer claims that there exists a proof of the wff β from the assumption α. Therefore if α appears on a line of an argument, then β may be validly written on any later line. It is intuitively clear that this is equivalent to the assertion “⊢ α ⇒ β”. The latter assertion can be used as an input to modus ponens whereas the assertion “α ⊢ β” cannot.

Although it is possible to convert an assertion of the form “⊢ α ⇒ β” to the assertion “α ⊢ β” (as mentioned in Remark 4.9.3), there is no way to make the reverse conversion. A partial solution to this problem is called the “Deduction Theorem”. This is not actually a theorem at all. It is sometimes referred to as a meta-theorem, but if the logical framework for the proof of the meta-theorem is not well defined, it cannot be said to be a real theorem at all. It might be more accurate to call it a “naive theorem” since the proof uses naive logic and naive mathematics.

The proof of this “theorem” requires mathematical induction, which requires the arithmetic of infinite sets, which requires set theory, which requires logic. (This inescapable cycle of dependencies was forewarned in Remark 2.1.1 and illustrated in Figure 2.1.1.) However, all of the propositional calculus requires naive arithmetic and naive set theory already. So one may as well go the whole hog and throw in some naive mathematical induction too. (Mathematical induction is generally taught by the age of 16 years. One cannot generally progress in mathematics without accepting it. A concept which is taught at a young enough age is generally accepted as “obvious”.)

One could call these sorts of “logic theorems” by various names, like “pseudo-theorems”, “meta-theorems” or “fantasy theorems”. In this book, they will be called “naive theorems” since they are proved using naive logic and naive set theory.

4.9.6 Remark: Quite likely the “wffs” in Remark 4.9.7 are really wff-wffs. This is because they are wffs whose names are themselves wffs. [ Check this. ]

4.9.7 Remark: Theorem 4.9.8 is an attempt to provide a converse for the meta-theorem in Remark 4.9.3. To formulate naive Theorem 4.9.8, let W denote the set of all possible wffs in the propositional calculus in Definition 4.7.4, let W^n denote the set of all sequences of n wffs for non-negative integers n, and let List(W) denote the set ⋃_{n=0}^∞ W^n of sequences of wffs with non-negative length. An element of W^n is said to be a wff sequence of length n. The concatenation of two wff sequences Γ₁ and Γ₂ is denoted with a comma as Γ₁, Γ₂.

[ Give a pseudo-theorem which states that quoting a theorem is a valid procedure for proving an assertion. See comment at end of Remark 4.9.10. ]

[ Ideally one should have fully-developed meta-theory with meta-axioms and meta-rules and meta-classes etc. to give some credibility to “theorems” like Theorem 4.9.8. In principle, this is the purpose of Section 3.14. But actually this enterprise is doomed. For example, we should at least be able to say what “provable” means. We could define “proofs” as sequences of “lines”, and define “lines” as sequences of “symbols”. But what is a symbol? The first machines must be created by hand. Only then can we hope that some machines will create other machines. ]

4.9.8 Theorem [naive]: “Deduction Theorem” In the propositional calculus in Definition 4.7.4, let Γ ∈ List(W) be a wff sequence. Let α ∈ W and β ∈ W be wffs for which the assertion Γ, α ⊢ β is provable. Then the assertion Γ ⊢ α ⇒ β is provable.

Proof: Let Γ ∈ List(W) be a wff sequence. Let α ∈ W and β ∈ W be wffs. Let ∆ = (δ₁, ... δₘ) ∈ W^m be a proof of the assertion Γ, α ⊢ β with δₘ = β. First assume that no other theorems are used in the proof ∆. Define the proposition P(k) for integers k with 1 ≤ k ≤ m by P(k) = “the assertion Γ ⊢ α ⇒ δᵢ is provable for all positive i with i ≤ k”.

[ Define concepts like “line” and “proof” and “quoting a theorem” and “deduction rule” using sets and lists and integers so that the next paragraph will make sense. Provide these pseudo-definitions before the statement of the pseudo-theorem. ]
To prove P(1) it must be shown that Γ ⊢ α ⇒ δ₁ is provable by some proof ∆′ ∈ List(W). Every line of the proof ∆ must be either (a) an axiom (possibly with some substitution for the wff names), (b) an element of the wff sequence Γ, (c) the wff α, or (d) the output of a modus ponens rule application. Line δ₁ of the proof ∆ cannot be a modus ponens output. So there are only three possibilities. Suppose δ₁ is an axiom. Then the following argument is valid.
Γ ⊢ α ⇒ δ₁
(1) δ₁    (from some axiom)
(2) δ₁ ⇒ (α ⇒ δ₁)    PC 1
(3) α ⇒ δ₁    MP (1,2)

If δ₁ is an element of the wff sequence Γ, the situation is almost identical.
Γ ⊢ α ⇒ δ₁
(1) δ₁    Hyp (from Γ)
(2) δ₁ ⇒ (α ⇒ δ₁)    PC 1
(3) α ⇒ δ₁    MP (1,2)

If δ₁ equals α, the following proof works.
Γ ⊢ α ⇒ δ₁
(1) α ⇒ α    Theorem 4.8.3 (i)

When a theorem is quoted as above, it means that a full proof could be written out “inline” according to the proof of the quoted theorem. Now the proposition P(1) is proved. So it remains to show that P(k − 1) ⇒ P(k) for all k > 1.

Now suppose that P(k − 1) is true for an integer k with 1 < k ≤ m. That is, assume that the assertion Γ ⊢ α ⇒ δᵢ is provable for all integers i with 1 ≤ i < k. Line δₖ of the original proof ∆ must have been justified by one of the four reasons (a) to (d) given above. Cases (a) to (c) lead to a proof of Γ ⊢ α ⇒ δₖ exactly as for establishing P(1) above.

In case (d), line δₖ of proof ∆ is arrived at by modus ponens from two lines δᵢ and δⱼ with 1 ≤ i < k and 1 ≤ j < k with i ≠ j, where δⱼ has the form δᵢ ⇒ δₖ. By the inductive hypothesis P(k − 1), there are valid proofs ∆′ ∈ W^{m′} for Γ ⊢ α ⇒ δᵢ and ∆′′ ∈ W^{m′′} for Γ ⊢ α ⇒ δⱼ. Then the concatenated argument ∆′, ∆′′ ∈ W^{m′+m′′} has α ⇒ δᵢ on line (m′) and α ⇒ (δᵢ ⇒ δₖ) on line (m′ + m′′). A proof of Γ ⊢ α ⇒ δₖ may then be constructed as an extension of the argument ∆′, ∆′′ as follows.
Γ ⊢ α ⇒ δₖ
(m′) α ⇒ δᵢ    (above lines of ∆′)
(m′ + m′′) α ⇒ (δᵢ ⇒ δₖ)    (above lines of ∆′′)
(m′ + m′′ + 1) (α ⇒ (δᵢ ⇒ δₖ)) ⇒ ((α ⇒ δᵢ) ⇒ (α ⇒ δₖ))    PC 2
(m′ + m′′ + 2) (α ⇒ δᵢ) ⇒ (α ⇒ δₖ)    MP (m′ + m′′, m′ + m′′ + 1)
(m′ + m′′ + 3) α ⇒ δₖ    MP (m′, m′ + m′′ + 2)
(Alternatively one could first prove the theorem α ⇒ β, α ⇒ (β ⇒ γ) ⊢ α ⇒ γ and apply this to lines (m′) and (m′ + m′′).) This establishes P(k). Therefore by mathematical induction, it follows that P(m) is true, which means that Γ ⊢ α ⇒ β is provable.

4.9.9 Remark: For the principle of mathematical induction, see Remark 7.3.4.

4.9.10 Remark: Theorem 4.9.8 is clearly bogus. It is circular, like a thief who sells you an item which they have just stolen from you. However, this kind of pseudo-theorem is common in the mathematical logic literature. The proof is such a total mess, it would be much tidier to simply assume a kind of “reverse modus ponens” rule rather than invoke the machinery of mathematical induction at such a basic level in the development of logic.
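Whatever its logical status, the construction in the proof of Theorem 4.9.8 is entirely mechanical. The following Python sketch is an illustration under stated assumptions and not part of the formal development. It rewrites a proof of Γ, α ⊢ β line by line into a proof of Γ ⊢ α ⇒ β, following cases (a) to (d). The names `Imp`, `deduce` and `check` are hypothetical, wffs are encoded as nested tuples, and `check` verifies only hypothesis and modus ponens lines (axiom lines are trusted). Instead of concatenating complete proofs ∆′ and ∆′′, the sketch re-uses lines already written, which is a shortcut not taken in the text.

```python
def Imp(a, b):
    return ("imp", a, b)

def deduce(proof, alpha):
    """Turn a proof of Γ, α ⊢ β into a proof of Γ ⊢ α ⇒ β.

    proof is a list of (wff, justification). A justification is ("Hyp",),
    ("PC 1",), ("PC 2",), ("PC 3",) or ("MP", i, j) with 1-based line
    numbers, where line j has the form (line i) ⇒ (this line).
    """
    out = []     # lines of the new proof
    where = {}   # old line number k -> new line number holding α ⇒ δ_k

    def emit(wff, just):
        out.append((wff, just))
        return len(out)  # 1-based number of the line just written

    for k, (delta, just) in enumerate(proof, start=1):
        if just[0] == "MP":                          # case (d)
            i, j = just[1], just[2]
            di = proof[i - 1][0]
            a = emit(Imp(Imp(alpha, Imp(di, delta)),
                         Imp(Imp(alpha, di), Imp(alpha, delta))), ("PC 2",))
            b = emit(Imp(Imp(alpha, di), Imp(alpha, delta)), ("MP", where[j], a))
            where[k] = emit(Imp(alpha, delta), ("MP", where[i], b))
        elif just[0] == "Hyp" and delta == alpha:    # case (c): inline α ⇒ α
            aa = Imp(alpha, alpha)
            a = emit(Imp(alpha, Imp(aa, alpha)), ("PC 1",))
            b = emit(Imp(Imp(alpha, Imp(aa, alpha)),
                         Imp(Imp(alpha, aa), aa)), ("PC 2",))
            c = emit(Imp(Imp(alpha, aa), aa), ("MP", a, b))
            d = emit(Imp(alpha, aa), ("PC 1",))
            where[k] = emit(aa, ("MP", d, c))
        else:                                        # cases (a) and (b)
            a = emit(delta, just)
            b = emit(Imp(delta, Imp(alpha, delta)), ("PC 1",))
            where[k] = emit(Imp(alpha, delta), ("MP", a, b))
    return out

def check(proof, hyps):
    """Check Hyp and MP lines only; axiom lines are trusted in this sketch."""
    for n, (wff, just) in enumerate(proof, start=1):
        if just[0] == "Hyp" and wff not in hyps:
            return False
        if just[0] == "MP":
            i, j = just[1], just[2]
            if not (1 <= i < n and 1 <= j < n):
                return False
            if proof[j - 1][0] != Imp(proof[i - 1][0], wff):
                return False
    return True

A, B, C = "A", "B", "C"
gamma = [Imp(A, B), Imp(B, C)]
old = [                      # a proof of  A ⇒ B, B ⇒ C, A ⊢ C
    (A, ("Hyp",)),
    (Imp(A, B), ("Hyp",)),
    (B, ("MP", 1, 2)),
    (Imp(B, C), ("Hyp",)),
    (C, ("MP", 3, 4)),
]
new = deduce(old, A)         # a proof of  A ⇒ B, B ⇒ C ⊢ A ⇒ C
```

The five-line input grows to seventeen lines, which gives some feeling for the cost of eliminating a hypothesis by this construction.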
Another way out of this mess is to do without any so-called “Deduction Theorem”. It is not essential. One work-around is to never, ever have any wffs on the left side of the assertion symbol in theorems. The cost of doing this is simply that in applying theorems, one must have an extra line of modus ponens to apply the theorem for every wff which was eliminated from the left side of the assertion symbol. This is a very tiny cost compared to the intellectual messiness of Theorem 4.9.8. Yet another way out of the “Deduction Theorem” mess is to go back to the proof of any theorems which have wffs “on the left” and apply the imported proofs “inline” in the proof in which the theorem is required. This is more onerous than the total avoidance of wffs “on the left”. [ Implement the idea in the following paragraph. ] The use of theorems is itself a kind of second inference rule in addition to modus ponens. Strictly speaking, there should also be a pseudo-theorem to “prove” by naive logic and naive set theory that the application of previous theorems to the proof of a new theorem is valid. [ Remarks 4.9.11 and 4.9.12 are perhaps too philosophical for this chapter. They could be either moved to the Chapter 2 or deleted. ]
4.9.12 Remark: The unsatisfying cyclic nature of logic and set theory at the core of mathematics is particularly sad for people who wish to take refuge in mathematics to escape the lack of careful logic in physics. On this subject, Bell [190], pages 516–517, says the following about Dedekind’s exit from physics and chemistry into mathematics. But he did not wander long in darkness. By the age of seventeen he had smelt numerous rats in the alleged reasoning of physics and had turned to mathematics for less objectionable logic. [ Since G¨odel’s famous theorems are meta-mathematical, relying on naive mathematics for their proofs, it seems that these theorems may all be bogus, since they rely on the prior establishment of much set theory for the meta-mathematics. Check this. The aspersions cast on mathematicians by logicians may all be baseless since their logical tools are constructed with the very mathematical tools which they seek to undermine. ] 4.9.13 Remark: An assertion of the form α1 , α2 , . . . αn ⊢ β means that there exists a proof of β from the assumptions α1 , α2 , . . . αn . Therefore if the wffs α1 , . . . αn appear on any n lines of an argument, then the wff β may be written on any later line. It is perhaps intuitively clear that this is equivalent to the assertion ⊢ α1 ⇒ (α2 ⇒ (. . . (αn ⇒ β) . . .)). It is also perhaps intuitively clear that a meta-proof of this meta-theorem may be derived from the “Deduction Theorem” (Theorem 4.9.8) with the aid of naive mathematical induction. However, it is not even possible to denote this inductive meta-theorem within the notation framework which has been defined. So it is difficult to see how one can realistically hope to prove an assertion which cannot be clearly written down!
4.9.11 Remark: If the reader concludes from Remark 4.9.10 that all of mathematical logic and mathematics is bogus, that would be fair enough if the expectation was for a directed acyclic graph of concepts, but that is a false expectation. Mathematics is like a natural-language dictionary which defines all words in terms of other words. The recursive look-up of words must eventually arrive at a cyclic dependency. One requires a certain amount of a-priori knowledge. Mathematics has the advantage that all of the cyclic dependencies can be isolated to a well-defined subset of the subject. Logic depends on set theory and integer arithmetic. Integer arithmetic depends on set theory, which depends on logic. If the reader can accept the network of concepts in these elementary topics, the rest of mathematics is “solid”.

4.10. Further theorems for the implication operator

4.10.1 Remark: Using the “deduction theorem”, the assertions in Theorem 4.8.3 which have wffs on the left of the assertion symbol may be converted to equivalent assertions with no wffs on the left, as in Theorem 4.10.2.

4.10.2 Theorem [pc]: The following assertions follow from the propositional calculus in Definition 4.7.4.
(i) ⊢ (α ⇒ β) ⇒ ((β ⇒ γ) ⇒ (α ⇒ γ)).
(ii) ⊢ (α ⇒ (β ⇒ γ)) ⇒ (β ⇒ (α ⇒ γ)).
(iii) ⊢ (α ⇒ β) ⇒ (¬¬α ⇒ β).
(iv) ⊢ (α ⇒ β) ⇒ (α ⇒ ¬¬β).
(v) ⊢ (α ⇒ β) ⇒ (¬¬α ⇒ ¬¬β).
(vi) ⊢ (α ⇒ β) ⇒ (¬β ⇒ ¬α).
(vii) ⊢ (α ⇒ ¬β) ⇒ (β ⇒ ¬α).
(viii) ⊢ (¬α ⇒ β) ⇒ (¬β ⇒ α).
Proof: By the “Deduction Theorem” (Theorem 4.9.8), part (i) follows from Theorem 4.8.3 (iii), part (ii) follows from Theorem 4.8.3 (iv), part (iii) follows from Theorem 4.8.3 (vii), part (iv) follows from Theorem 4.8.3 (ix), part (v) follows from Theorem 4.8.3 (x), part (vi) follows from Theorem 4.8.3 (xi), part (vii) follows from Theorem 4.8.3 (xii) and part (viii) follows from Theorem 4.8.3 (xiii).

4.10.3 Remark: For proof of Theorem 4.10.2 without the “Deduction Theorem”, see Exercise 47.1.4.

4.10.4 Remark: The purpose of Theorem 4.10.5 is to prove Theorem 4.11.7 (viii), which is equivalent to Theorem 4.10.5 (vi).

4.10.5 Theorem [pc]: The following assertions follow from the propositional calculus in Definition 4.7.4.
(i) ⊢ (¬α ⇒ β) ⇒ ((α ⇒ β) ⇒ β).
(ii) ⊢ (α ⇒ β) ⇒ ((¬α ⇒ β) ⇒ β).
(iii) α ⇒ (β ⇒ γ), γ ⇒ δ ⊢ α ⇒ (β ⇒ δ).
(iv) ⊢ (¬α ⇒ β) ⇒ ((β ⇒ γ) ⇒ ((α ⇒ γ) ⇒ γ)).
(v) α ⇒ (β ⇒ (γ ⇒ δ)) ⊢ α ⇒ (γ ⇒ (β ⇒ δ)).
(vi) ⊢ (α ⇒ γ) ⇒ ((β ⇒ γ) ⇒ ((¬α ⇒ β) ⇒ γ)).

Proof: To prove part (i): ⊢ (¬α ⇒ β) ⇒ ((α ⇒ β) ⇒ β)
(1) (¬β ⇒ ¬α) ⇒ ((¬β ⇒ α) ⇒ β)        PC 3
(2) (α ⇒ β) ⇒ (¬β ⇒ ¬α)        Theorem 4.10.2 (vi)
(3) (α ⇒ β) ⇒ ((¬β ⇒ α) ⇒ β)        Theorem 4.8.3 (iii) (2,1)
(4) (¬β ⇒ α) ⇒ ((α ⇒ β) ⇒ β)        Theorem 4.8.3 (iv) (3)
(5) (¬α ⇒ β) ⇒ (¬β ⇒ α)        Theorem 4.10.2 (viii)
(6) (¬α ⇒ β) ⇒ ((α ⇒ β) ⇒ β)        Theorem 4.8.3 (iii) (5,4)

To prove part (ii): ⊢ (α ⇒ β) ⇒ ((¬α ⇒ β) ⇒ β)
(1) (¬α ⇒ β) ⇒ ((α ⇒ β) ⇒ β)        part (i)
(2) (α ⇒ β) ⇒ ((¬α ⇒ β) ⇒ β)        Theorem 4.8.3 (iv) (1)

To prove part (iii): α ⇒ (β ⇒ γ), γ ⇒ δ ⊢ α ⇒ (β ⇒ δ)
(1) α ⇒ (β ⇒ γ)        Hyp
(2) γ ⇒ δ        Hyp
(3) (α ⇒ (β ⇒ γ)) ⇒ ((α ⇒ β) ⇒ (α ⇒ γ))        PC 2
(4) (α ⇒ β) ⇒ (α ⇒ γ)        MP (1,3)
(5) (α ⇒ γ) ⇒ ((γ ⇒ δ) ⇒ (α ⇒ δ))        Theorem 4.10.2 (i)
(6) (α ⇒ β) ⇒ ((γ ⇒ δ) ⇒ (α ⇒ δ))        Theorem 4.8.3 (iii) (4,5)
(7) (β ⇒ γ) ⇒ ((γ ⇒ δ) ⇒ (β ⇒ δ))        Theorem 4.10.2 (i)
(8) α ⇒ ((γ ⇒ δ) ⇒ (β ⇒ δ))        Theorem 4.8.3 (iii) (1,7)
(9) (γ ⇒ δ) ⇒ (α ⇒ (β ⇒ δ))        Theorem 4.8.3 (iv) (8)
(10) α ⇒ (β ⇒ δ)        MP (2,9)

To prove part (iv): ⊢ (¬α ⇒ β) ⇒ ((β ⇒ γ) ⇒ ((α ⇒ γ) ⇒ γ))
(1) (¬α ⇒ β) ⇒ ((β ⇒ γ) ⇒ (¬α ⇒ γ))        Theorem 4.10.2 (i)
(2) (¬α ⇒ γ) ⇒ ((α ⇒ γ) ⇒ γ)        part (i)
(3) (¬α ⇒ β) ⇒ ((β ⇒ γ) ⇒ ((α ⇒ γ) ⇒ γ))        part (iii) (1,2)

To prove part (v): α ⇒ (β ⇒ (γ ⇒ δ)) ⊢ α ⇒ (γ ⇒ (β ⇒ δ))
(1) α ⇒ (β ⇒ (γ ⇒ δ))        Hyp
(2) (β ⇒ (γ ⇒ δ)) ⇒ (γ ⇒ (β ⇒ δ))        Theorem 4.10.2 (ii)
(3) α ⇒ (γ ⇒ (β ⇒ δ))        Theorem 4.8.3 (iii)

To prove part (vi): ⊢ (α ⇒ γ) ⇒ ((β ⇒ γ) ⇒ ((¬α ⇒ β) ⇒ γ))
(1) (¬α ⇒ β) ⇒ ((β ⇒ γ) ⇒ ((α ⇒ γ) ⇒ γ))        part (iv)
(2) (β ⇒ γ) ⇒ ((¬α ⇒ β) ⇒ ((α ⇒ γ) ⇒ γ))        Theorem 4.8.3 (iv) (1)
(3) (β ⇒ γ) ⇒ ((α ⇒ γ) ⇒ ((¬α ⇒ β) ⇒ γ))        part (v) (2)
(4) (α ⇒ γ) ⇒ ((β ⇒ γ) ⇒ ((¬α ⇒ β) ⇒ γ))        Theorem 4.8.3 (iv) (3)
This completes the proof of Theorem 4.10.5.
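As a sanity check on the six assertions of Theorem 4.10.5, one can verify by truth tables that each is semantically valid. The following Python sketch does this. It is an independent check under the usual two-valued interpretation of ⇒ and ¬, not a derivation within the axiom system of Definition 4.7.4, and the tuple representation of wffs and all function names are assumptions made for illustration.

```python
from itertools import product

# A wff is a variable name (str), ("not", f) or ("imp", f, g).
def Not(f):
    return ("not", f)

def Imp(f, g):
    return ("imp", f, g)

def variables(f):
    """Set of variable names occurring in the wff f."""
    if isinstance(f, str):
        return {f}
    return set().union(*(variables(g) for g in f[1:]))

def evaluate(f, valuation):
    """Truth value of f under a dict mapping variable names to booleans."""
    if isinstance(f, str):
        return valuation[f]
    if f[0] == "not":
        return not evaluate(f[1], valuation)
    return (not evaluate(f[1], valuation)) or evaluate(f[2], valuation)

def is_tautology(f):
    names = sorted(variables(f))
    return all(evaluate(f, dict(zip(names, values)))
               for values in product([False, True], repeat=len(names)))

def follows(hypotheses, conclusion):
    """Semantic check of 'hypotheses |- conclusion' via the nested implication."""
    f = conclusion
    for h in reversed(hypotheses):
        f = Imp(h, f)
    return is_tautology(f)

a, b, c, d = "a", "b", "c", "d"
theorem_4_10_5 = {
    "i":   follows([], Imp(Imp(Not(a), b), Imp(Imp(a, b), b))),
    "ii":  follows([], Imp(Imp(a, b), Imp(Imp(Not(a), b), b))),
    "iii": follows([Imp(a, Imp(b, c)), Imp(c, d)], Imp(a, Imp(b, d))),
    "iv":  follows([], Imp(Imp(Not(a), b), Imp(Imp(b, c), Imp(Imp(a, c), c)))),
    "v":   follows([Imp(a, Imp(b, Imp(c, d)))], Imp(a, Imp(c, Imp(b, d)))),
    "vi":  follows([], Imp(Imp(a, c), Imp(Imp(b, c), Imp(Imp(Not(a), b), c)))),
}
```

Every entry of `theorem_4_10_5` should come out true. A false entry would indicate a transcription error in the statement, since the axiom system proves only tautologies.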
4.11. Other logical operators

4.11.1 Remark: The implication-based propositional calculus which is introduced in Section 4.7 offers only two logical operators, ⇒ and ¬. Of the five logical connectives in Notation 4.3.3, the two connectives ⇒ and ¬ are undefined in the axiom system described in Definition 4.7.4 because they are primitive connectives. (All of the operators are fully defined in the semantic context, but the symbols-only language context defines only manipulation rules, not meaning.) The other three connectives are defined in terms of ⇒ and ¬ in Definition 4.11.2.

4.11.2 Definition:
(i) α ∨ β means (¬α) ⇒ β for any wffs α and β.
(ii) α ∧ β means ¬(α ⇒ ¬β) for any wffs α and β.
(iii) α ⇔ β means (α ⇒ β) ∧ (β ⇒ α) for any wffs α and β.

[ Perhaps the NAND (|) and NOR (↓) operators should be defined near here. It would be a good idea to find a standard symbol for XOR too. ]

4.11.3 Remark: Definition 4.11.2 is not how logical connectives are defined in the real world. It just happens that there is a lot of redundancy among the operators in Notation 4.3.3. So it is possible to define the full set of operators in terms of a proper subset. Defining the operators in terms of a minimal set of operators is part of a minimalist mode of thinking which is not necessarily useful or helpful.

[ Maybe there should be a discussion of reductionism, minimalism and naturalism in the philosophy chapter? ]

Reductionism has been enormously successful in the natural sciences in the last couple of centuries. But minimalism is not the same thing as reductionism. Reductionism recursively reduces complex systems to fundamental principles and synthesizes entire systems from these simpler principles. (E.g. Solar System dynamics can be synthesized from Newton’s laws.) However, it cannot be said that the operators ⇒ and ¬ are more “fundamental” than the operators ∧ and ∨. The best way to think of the basic logical connectives is as a network of operators which are closely related to each other.

In many contexts, the set of three operators ∧, ∨ and ¬ is preferred as the “fundamental” operator set. (For example, there are useful decompositions of all truth functions into “disjunctive normal form” and “conjunctive normal form”.) In the context of propositional calculus, the ⇒ operator is more “fundamental” because it is the basis of the modus ponens inference rule. But the modus ponens rule could easily be replaced by an equivalent rule which uses one or more different logical connectives. (The possibility of basing propositional calculus on the AND and NOT operators is mentioned in Remark 4.6.2.)

A propositional calculus based on the single NAND operator with a single axiom and modus ponens is possible, but it requires a lot of work for nothing. The NAND operator is in no way “the fundamental operator” underlying all other operators. It is minimal, not fundamental. (See also Remark 4.6.4 for related comments on NAND-operator logic.)
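The redundancy among the connectives which is discussed in Remark 4.11.3 can be illustrated at the level of truth functions. The following Python sketch assumes the usual two-valued semantics, and all function names are chosen here for illustration. It writes ∨, ∧ and ⇔ in terms of ⇒ and ¬ exactly as in Definition 4.11.2 and compares them with the built-in Boolean operations. It also shows that ¬ and ⇒ can in turn be written with NAND alone, which is the sense in which NAND is minimal.

```python
from itertools import product

def imp(p, q):
    """Primitive connective: implication."""
    return (not p) or q

def neg(p):
    """Primitive connective: negation."""
    return not p

# Definition 4.11.2, read as abbreviations for expressions in imp and neg.
def lor(p, q):
    return imp(neg(p), q)

def land(p, q):
    return neg(imp(p, neg(q)))

def iff(p, q):
    return land(imp(p, q), imp(q, p))

# The two primitives rebuilt from the single NAND operator.
def nand(p, q):
    return not (p and q)

def neg_from_nand(p):
    return nand(p, p)

def imp_from_nand(p, q):
    return nand(p, nand(q, q))

PAIRS = list(product([False, True], repeat=2))

definitions_agree = all(
    lor(p, q) == (p or q) and land(p, q) == (p and q) and iff(p, q) == (p == q)
    for p, q in PAIRS
)
nand_suffices = all(
    neg_from_nand(p) == neg(p) and imp_from_nand(p, q) == imp(p, q)
    for p, q in PAIRS
)
```

Both flags should evaluate to true. This confirms only that the abbreviations have the expected truth tables. It does not address the question in Remark 4.11.3 of which operators deserve to be called fundamental.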
4.11.4 Remark: The ten assertions of Theorem 4.11.7 are the same as the axiom schemas of the propositional calculus described by Mendelson [164], page 40. This axiomatic system is attributed to Kleene [161].

4.11.5 Remark: Theorem 4.11.7 part (ix) seems to be quite difficult to prove from the three axiom schemas in Definition 4.7.4. To simplify the proof, some preliminary assertions are given in Lemma 4.11.6. Theorem 4.11.7 (ix) is a kind of mirror image of axiom (PC 2) in Definition 4.7.4. The former has a γ on the right in each of the three terms whereas the latter has an α on the left in each of the three terms.

4.11.6 Lemma [pc]: The following assertions for wffs α, β and γ follow from the propositional calculus in Definition 4.7.4.
(i) ⊢ ¬(¬β ⇒ ¬(¬α ⇒ β)) ⇒ α.
(ii) ⊢ ((α ⇒ β) ⇒ γ) ⇒ ((¬β ⇒ ¬α) ⇒ γ).

4.11.7 Theorem [pc]: The following assertions for wffs α, β and γ follow from the propositional calculus in Definition 4.7.4.
(i) ⊢ α ⇒ (β ⇒ α).
(ii) ⊢ (α ⇒ (β ⇒ γ)) ⇒ ((α ⇒ β) ⇒ (α ⇒ γ)).
(iii) ⊢ (α ∧ β) ⇒ α.
(iv) ⊢ (α ∧ β) ⇒ β.
(v) ⊢ α ⇒ (β ⇒ (α ∧ β)).
(vi) ⊢ α ⇒ (α ∨ β).
(vii) ⊢ β ⇒ (α ∨ β).
(viii) ⊢ (α ⇒ γ) ⇒ ((β ⇒ γ) ⇒ ((α ∨ β) ⇒ γ)).
(ix) ⊢ (α ⇒ β) ⇒ ((α ⇒ ¬β) ⇒ ¬α).
(x) ⊢ ¬¬α ⇒ α.

Proof: Parts (i) and (ii) are identical to axiom schemas (PC 1) and (PC 2) respectively in Definition 4.7.4.

Part (iii) is the same as ⊢ ¬(α ⇒ ¬β) ⇒ α by Definition 4.11.2 (ii). But this assertion is identical to Theorem 4.8.3 (xvi). Similarly, part (iv) is the same as ⊢ ¬(α ⇒ ¬β) ⇒ β by Definition 4.11.2 (ii), and this assertion is identical to Theorem 4.8.3 (xvii).

To prove part (v): ⊢ α ⇒ (β ⇒ (α ∧ β))
(1) (α ⇒ ¬β) ⇒ (α ⇒ ¬β)        Theorem 4.8.3 (i)
(2) α ⇒ ((α ⇒ ¬β) ⇒ ¬β)        Theorem 4.8.3 (iv) (1)
(3) ((α ⇒ ¬β) ⇒ ¬β) ⇒ (β ⇒ ¬(α ⇒ ¬β))        Theorem 4.10.2 (vii)
(4) α ⇒ (β ⇒ ¬(α ⇒ ¬β))        Theorem 4.8.3 (iii) (2,3)
(5) α ⇒ (β ⇒ (α ∧ β))        Definition 4.11.2 (ii) (4)

To prove part (vi): ⊢ α ⇒ (α ∨ β)
(1) α ⇒ (¬α ⇒ β)        Theorem 4.8.3 (xv)
(2) α ⇒ (α ∨ β)        Definition 4.11.2 (i) (1)

To prove part (vii): ⊢ β ⇒ (α ∨ β)
(1) β ⇒ (¬α ⇒ β)        PC 1
(2) β ⇒ (α ∨ β)        Definition 4.11.2 (i) (1)

Part (viii) is identical to Theorem 4.10.5 (vi).

To prove part (ix): ⊢ (α ⇒ β) ⇒ ((α ⇒ ¬β) ⇒ ¬α)
(1) (α ⇒ β) ⇒ (¬β ⇒ ¬α)        Theorem 4.10.2 (vi)
(2) (¬β ⇒ ¬α) ⇒ ((β ⇒ ¬α) ⇒ ¬α)        Theorem 4.10.5 (i)
(3) (α ⇒ β) ⇒ ((β ⇒ ¬α) ⇒ ¬α)        Theorem 4.8.3 (iii) (1,2)
(4) (β ⇒ ¬α) ⇒ ((α ⇒ β) ⇒ ¬α)        Theorem 4.8.3 (iv) (3)
(5) (α ⇒ ¬β) ⇒ (β ⇒ ¬α)        Theorem 4.10.2 (vii)
(6) (α ⇒ ¬β) ⇒ ((α ⇒ β) ⇒ ¬α)        Theorem 4.8.3 (iii) (5,4)
(7) (α ⇒ β) ⇒ ((α ⇒ ¬β) ⇒ ¬α)        Theorem 4.8.3 (iv) (6)
Part (x) is identical to Theorem 4.8.3 (vi).

[ 2007-1-19: This section is currently being rewritten. So everything after this point does not follow smoothly from the above definitions. ]

[ Make Remarks 4.11.8 and 4.11.9 into theorems. They are actually meta-theorems. ]

4.11.8 Remark: If a wff α is a tautology, it may be combined in a conjunction with any wff β without altering the truth or falsity of the enclosing wff. Thus if β is a sub-wff of a wff, it may be replaced with the combined wff (α ∧ β). This fact follows from Theorem 4.11.7 (iv) and the assertion α ⊢ β ⇒ (α ∧ β), which is obtained from Theorem 4.11.7 (v) by modus ponens. Conversely, a tautology may be removed from a conjunction. Thus α may be substituted for α ∧ β if β is a tautology. An example application of this is line (5.15.2) in the proof of Theorem 5.15.4.

4.11.9 Remark: There is a corresponding observation to Remark 4.11.8 in terms of a contradiction instead of a tautology. (See Definition 4.3.23 for contradictions.) A contradiction β can be combined in a disjunction with any proposition α without altering its truth or falsity. Thus the combined proposition α ∨ β is equivalent to α for any proposition α and contradiction β.

[ Highly desirable for Theorem 4.11.10 would be an “equivalent expression substitution” metatheorem. Then equivalent expressions could be substituted into logical expressions without changing their truth values, just like the substitution methods in number algebra. ]

4.11.10 Theorem [pc]: The following assertions for wffs α, β and γ follow from the propositional calculus in Definition 4.7.4.
(i) ⊢ (α ∨ β) ⇔ (β ∨ α). (Commutativity of disjunction.)
(ii) ⊢ (α ∧ β) ⇔ (β ∧ α). (Commutativity of conjunction.)
(iii) ⊢ (α ∨ α) ⇔ α.
(iv) ⊢ (α ∧ α) ⇔ α.
(v) ⊢ α ∨ (β ∨ γ) ⇔ (α ∨ β) ∨ γ. (Associativity of disjunction.)
(vi) ⊢ α ∧ (β ∧ γ) ⇔ (α ∧ β) ∧ γ. (Associativity of conjunction.)
(vii) ⊢ α ∨ (β ∧ γ) ⇔ (α ∨ β) ∧ (α ∨ γ). (Distributivity of disjunction over conjunction.)
(viii) ⊢ α ∧ (β ∨ γ) ⇔ (α ∧ β) ∨ (α ∧ γ). (Distributivity of conjunction over disjunction.)
(ix) ⊢ α ∨ (α ∧ β) ⇔ α. (Absorption of disjunction over conjunction.)
(x) ⊢ α ∧ (α ∨ β) ⇔ α. (Absorption of conjunction over disjunction.)
4.11.11 Theorem [pc]: The following tautologies hold for propositions A and B. (i) (A ⇒ B) ⇔ (A ⇔ (A ∧ B)). Proof: See Exercise 47.1.5.
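The algebraic laws of Theorem 4.11.10 and the tautology of Theorem 4.11.11 can be spot-checked semantically. In the following Python sketch, each law is stored as a pair of truth functions of three Boolean arguments and the two sides are compared on all eight valuations. The dictionary keys and helper names are hypothetical. This confirms only semantic equivalence, not derivability from Definition 4.7.4.

```python
from itertools import product

# Each law is a pair (left side, right side) of functions of (a, b, c).
laws = {
    "4.11.10 (i)":    (lambda a, b, c: a or b,         lambda a, b, c: b or a),
    "4.11.10 (ii)":   (lambda a, b, c: a and b,        lambda a, b, c: b and a),
    "4.11.10 (iii)":  (lambda a, b, c: a or a,         lambda a, b, c: a),
    "4.11.10 (iv)":   (lambda a, b, c: a and a,        lambda a, b, c: a),
    "4.11.10 (v)":    (lambda a, b, c: a or (b or c),  lambda a, b, c: (a or b) or c),
    "4.11.10 (vi)":   (lambda a, b, c: a and (b and c), lambda a, b, c: (a and b) and c),
    "4.11.10 (vii)":  (lambda a, b, c: a or (b and c), lambda a, b, c: (a or b) and (a or c)),
    "4.11.10 (viii)": (lambda a, b, c: a and (b or c), lambda a, b, c: (a and b) or (a and c)),
    "4.11.10 (ix)":   (lambda a, b, c: a or (a and b), lambda a, b, c: a),
    "4.11.10 (x)":    (lambda a, b, c: a and (a or b), lambda a, b, c: a),
    # Theorem 4.11.11 (i): (A => B) <=> (A <=> (A and B)).
    "4.11.11 (i)":    (lambda a, b, c: (not a) or b,   lambda a, b, c: a == (a and b)),
}

def holds(lhs, rhs):
    """True if both sides agree on every valuation of (a, b, c)."""
    return all(lhs(*v) == rhs(*v) for v in product([False, True], repeat=3))

results = {name: holds(lhs, rhs) for name, (lhs, rhs) in laws.items()}
```

For contrast, a mis-bracketed “law” such as α ∨ (β ∧ γ) ⇔ (α ∨ β) ∧ γ fails the same test, which shows that the check is not vacuous.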
4.12. Parametrized families of propositions

The predicate calculus is introduced in Section 4.13. The basic concepts of predicate calculus include the idea of parametrized families of propositions. The parameters are called “variables”.

4.12.1 Remark: The predicate calculus may be regarded as a management system for very large sets of propositions by grouping them into classes. In the propositional calculus, each proposition is regarded as an individual entity to be managed without any regard to attributes which it may have in common with other propositions. For example, the propositions A = “The Sun is a star.” and B = “Alpha Centauri is a star.” are treated as having nothing in common at all. But each proposition has the form: P(X) = “X is a star.” The propositions A and B are members of a class of propositions P(X) for different values of the variable X. The predicate calculus tries to exploit the redundancy in this class of propositions. Thus one may write ∀X, P(X) to mean that everything is a star. The alternative in pure propositional calculus is to make this assertion for every choice of X, which would be tedious for a large set of X values, or impossible in the case of an infinite set.

4.12.2 Remark: It is often stated that the predicate calculus, which deals with parametrized families of propositions, is different to the propositional calculus in that it permits infinite classes of propositions because the set of predicate parameters is defined as a set which may be infinite and even uncountably infinite. However, there is no real difference. There is nothing to prevent the space of propositions in the propositional calculus from being arbitrarily infinite except the minor inconvenience of defining a naming notation for such a large set of propositions. The predicate calculus provides explicit support for the organization of the space of propositions as a parametrized family.

Since the predicate calculus does typically deal with infinite sets of propositions, infinite conjunctions and disjunctions are provided. The universal and existential quantifiers are simply infinite conjunctions and disjunctions respectively. (See Section 4.13 for logical quantifiers.) They are only required when there are infinitely many propositions, but such infinite proposition sets are not strictly specific to the predicate calculus. Consequently, the propositional calculus and predicate calculus are not actually as different as they may at first seem.
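The view of a predicate as a parametrized family of propositions, and of the quantifiers as conjunctions and disjunctions over that family, can be made concrete for a finite parameter set. The following Python sketch uses a made-up three-element domain. The names `DOMAIN`, `STARS`, `for_all` and `for_some` are illustrative assumptions, and the infinite case cannot be handled by enumeration in this way.

```python
# Hypothetical finite universe of parameter values (illustrative only).
DOMAIN = ["Sun", "Alpha Centauri", "Moon"]
STARS = {"Sun", "Alpha Centauri"}

def P(x):
    """Proposition template P(X) = 'X is a star.'"""
    return x in STARS

# Propositional-calculus view: one separately named proposition per object.
propositions = {f"P({x})": P(x) for x in DOMAIN}

def for_all(pred, domain):
    """Universal quantifier as a conjunction over the whole family."""
    result = True
    for x in domain:
        result = result and pred(x)
    return result

def for_some(pred, domain):
    """Existential quantifier as a disjunction over the whole family."""
    result = False
    for x in domain:
        result = result or pred(x)
    return result

everything_is_a_star = for_all(P, DOMAIN)
something_is_a_star = for_some(P, DOMAIN)
```

Here `everything_is_a_star` is false because of the single counterexample in the domain, whereas `something_is_a_star` is true. The dictionary `propositions` is what the propositional calculus would have to manage one entry at a time.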
4.12.3 Remark: The predicate calculus requires two kinds of names, namely (1) names which refer to propositions and (2) names which refer to parameters for the propositions. Thus the notation P(x) means a proposition template which includes a symbol x. The understanding is that one may substitute for x any other name from the set of parameter names. Similarly, the notation Q(x, y) means a proposition template which has two substitutable names, x and y. And so forth for any number of proposition parameters. In set theory, the parameter names typically represent sets.

The name-to-object maps for a predicate language are summarized in the following table.

    name map    name space    object domain    object type
    µV          NV            V                variables
    µQ          NQ            Q                predicates
    µP          NP            P                propositions

The choice of the words “space” and “domain” here is fairly arbitrary. (See Remark 4.1.2 for discussion of this choice of words.) The spaces and maps in the above tables are illustrated in Figure 4.12.1.

[Figure 4.12.1: Variables, predicates, propositions and truth values. The abstract name spaces NV, NQ and NP are mapped by µV, µQ and µP to the concrete spaces V, Q and P.]
[ Strictly speaking, there should be a map µT : NT → T, where T is the concrete set of truth values, and NT = {F, T} is the abstract set of truth values. Add these to the table and the diagram. The concrete truth values may be voltages, for example. And the truth map may be varied for the same abstract and concrete truth value spaces. The abstract truth value name space NT probably should be a fixed space. On the other hand, other kinds of logic can use more than two truth values, for example. ]

The space Q of predicates may be partitioned according to the number of parameters of each predicate. Thus Q = Q0 ∪ Q1 ∪ Q2 ∪ . . ., where Qk is the space of predicates which have k parameters. The predicates in Qk have the form P : V^k → {F, T}. In terms of the list notation in Section 7.12, the predicates in Q have the form P : List(V) → {F, T}.

4.12.4 Remark: The five layers of linguistic structure for propositional calculus in Figure 4.5.1, in Remark 4.5.4, are applicable also to predicate calculus. The main difference is that the logical operators are extended by the addition of universal and existential quantifiers in the predicate calculus.

4.12.5 Remark: It is noteworthy that although the variables in predicate calculus are typically arbitrarily infinite, the number of predicates is typically a small finite number. For example, in ZF set theory, there are only two predicates, namely the set equality and set membership relations. (In NBG set theory, there is an additional single-parameter predicate which determines whether a class is a set.) This large difference in size between the sets of variables and predicates is completely understandable when one views predicates as families of propositions. Each predicate corresponds to a different concept, typically a fundamental relation or property of objects.

Another noteworthy difference between variables and predicates is that the predicates are usually constant. The axioms of a predicate calculus typically specify the properties of the very small number of predicates. The predicates are imported from the concrete logical system. There is typically no need for variable names for concrete predicates in the axioms. In the case of abstract predicates which are built from the concrete predicates by the use of logical operators and quantifiers, variable predicates are frequently encountered, and they are typically infinite in number.

4.12.6 Remark: The use of the word “set” for the classes of proposition names and parameter names in Remark 4.12.3 seems dangerous because sets have not been defined yet. However, these are “naive sets”. There is no membership relation “∈” on these two particular naive sets. The only membership relation is between the symbols (as elements) and the whole class. I.e. all of the symbols are elements of their class. The specification axiom is useful for indicating subsets of the full symbol sets. But very few set constructions are required. So the danger is not very great.

[ Check Remark 4.12.7 to see if it makes good sense or not. ]

4.12.7 Remark: It seems potentially dangerous that parameter names such as x and y in Remark 4.12.3 could be the names of sets, such as ZF or NBG sets. But this is a semantic issue. Since predicate calculus operates only at the linguistic level, the meaning of the symbols is of no importance to the integrity of the language itself. At the linguistic level, we are only interested in applying rules and axioms to abstract logical expressions. Contradictions can occur if the axioms of the language are not well chosen. But this implies only that the axiomatic system is not self-consistent. It does not imply that the formulation of the system is itself inconsistent. The set of axioms is still a well-defined set of axioms, and the set of deduction rules is still a well-defined set of deduction rules. The fact that a contradiction may be proved within an axiomatic system just means that the system has undesirable characteristics, not that it contains contradictions within the formulation itself.

4.12.8 Remark: The wffs in a propositional calculus are meaningful only if a “domain of interpretation” is specified for its statement variables, relations, functions and constants. According to Mendelson [164], page 49, an “interpretation” requires that the variables be elements of a set D and each logical relation, function and constant must be associated with a relation, function and constant in the set-theory sense within the set D. In other words, the interpretation of a propositional calculus requires some of the basic set theory which is presented in Chapters 5 and 6. (Or alternatively, only some sort of limited naive set theory is required.)
4.12.9 Remark: It is sometimes convenient to have a notation for a predicate which is always true (or always false). Such a predicate does not need parameters. So Notation 4.12.10 introduces the zero-parameter predicates which are always true or always false. This is the same notation as for zero-operand logical operators. (See Notation 4.3.19, which is discussed in Remark 4.3.20.) These predicates are added to the abstract predicate names for any particular predicate calculus. There are not necessarily any concrete predicates for which these abstract predicates are labels.

As an extension of notation, logical predicates which are always true or always false and have one or more parameters may be denoted as for functions. Thus, for example, ⊤(x, y, z) would be a true proposition for any variables x, y and z, and ⊥(a, b, c, d) would be a false proposition for any variables a, b, c and d. Luckily such notation is rarely needed.

4.12.10 Notation:
⊤ denotes the true zero-parameter logical predicate.
⊥ denotes the false zero-parameter logical predicate.

4.12.11 Remark: In addition to predicates and variables, predicate calculus may also have functions. These are naive functions of some kind, not the same as the relational functions in set theory. But they look very similar. A predicate logic function is a map of the form f : V^n → V for some non-negative integer n.

[ It seems like predicate logic functions on the domain of predicates are required. In other words, functions of the form g : Q^n → V for non-negative integers n. For example {x; F(x)} looks like g(F), yielding a set for each predicate. ]

4.12.12 Remark: In addition to the variable predicates, variable functions and individual variables, there is also a requirement for constants in each of these categories. For example, in ZF set theory, “∈” is a constant predicate and “∅” is an individual constant. Figure 4.12.2 illustrates the map µV of variable names for sets and the map µQ for the constant name “∈” for the concrete set membership predicate.

[Figure 4.12.2: Mapping the constant name “∈” to a concrete predicate. The abstract names A, B, C and “∈” are mapped by µV and µQ to the concrete objects µV(A), µV(B), µV(C) and µQ(∈).]
[ There probably should be a separate section on the definition of “constant” in Remark 4.12.13. Probably this remark should be split up into smaller conceptual steps. Although this remark is probably correct in principle, and quite likely useful, it needs a lot of improvement to make it comprehensible. Some of the notation and terminology need fixing too. ]

4.12.13 Remark: The definition of “constant” in Remark 4.12.12, for names of predicates, functions and individuals, only has meaning if there is a definition of equality (or “identity”) on the corresponding concrete object space. Otherwise one cannot know whether two names are pointing to the same object. But definitions of constants require even more than that.

The definition of the word “constant” is not at all obvious. In mathematics presentations, one often hears statements like: “Let C be a constant.” But then it generally transpires that C is quite arbitrary. So it isn’t constant at all, although it will generally be constant with respect to something. Analysis texts often have propositions of the general form: ∀x, ∃C, ∀y, P(x, y, C), where P is some three-parameter predicate. (See for example the epsilon-delta criterion for continuity on metric spaces in Theorem 17.4.3.) In this case, C is constant with respect to y, but not with respect to x. Thus one may typically write C(x) to informally suggest that C may depend on x but not on y. So the question arises in the case of names of predicates, functions and variables as to what a constant name should be constant with respect to. The simple answer is that constant names are constant with respect to name maps, but this requires some explanation.

Consider first the name maps µV : NV → V from the individual name space NV to the individual object space V. (The “individual” spaces are usually called “variable” spaces, but this is confusing when one is talking about “constant variables”. A more accurate term would be “predicate parameter spaces”.) In principle, the language defined in terms of the name space should be invariant under arbitrary choices of the name map µV. That is, all of the axioms and rules should be valid for any given name map. Therefore the axioms and rules should be invariant under any permutation of the elements of the object space V.
If all elements of a concrete space are equivalent, there is no need to be concerned with the choice of names. But this is not usually the case. More typically, each element of a concrete space has a unique character which is of interest in the study of that space. In this case, one often wishes to have fixed names for some concrete objects. Consider the example of the empty set in ZF set theory. Let a ∈ V be the empty set in a concrete ZF set theory (often called a “model” of the theory). Then the name A ∈ NV in the parameter name space NV c c points to a if and only if µV (A) = a, where = : V × V → {F, T} denotes the equality relation on the concrete parameter space V. (See Figure 4.12.3.) A
C
abstract names B
C
NV
NV µ2V
µ1V
µ1V
µ1V
µ2V
µ2V
V
V c
c
c
a = µ1V (A) b = µ1V (B) c = µ1V (C) concrete objects Figure 4.12.3
c
c
c
a = µ2V (B) b = µ2V (A) c = µ2V (C) concrete objects
Definition of a constant name c
A variable name C ∈ NV can be said to be “constant” if ∀µ1V , µ2V ∈ V NV , µ1V (C) = µ2V (C), where V NV denotes the set of functions f : NV → V.
Since the concrete parameter space has unknown elements, one cannot know what a is. (Concrete spaces are an “implementation” matter, which cannot be standardized at the linguistic level in logic.) Therefore the c condition µV (A) = a is not meaningful. However, if C is a name which know (somehow) has the property c c µV (C) = a, we can define define A by the condition µV (A) = µV (C). This is now importable into the a a abstract name space as the condition: A = C, where =: NV × NV → {F, T} is the import of the concrete c relation = to the parameter name space. a
Now the condition A = C is also not meaningful because C is not known. The problem here is that the equality relation is fully democratic. To see how to escape from this problem, denote an ad-hoc abstract a c single-parameter predicate PC by PC (X) = “X = C”. (This is equivalent to PC (X) = “µV (X) = a”.) Then a c clearly PC (X) is true if and only if X = C. That is, ∀X, (PC (X) ⇔ (µV (X) = a)). So the predicate PC [ www.topology.org/tex/conc/dg.html ]
[ draft: UTC 2009–3–21 Saturday 11:36 ]
Differential geometry reconstructed.Copyright (C) 2009, Alan U. Kennington. All Rights Reserved.The author hereby grants permission to print this book draft in A4 format.Printing in all other formats is forbidden.You may not charge any fee for copies of this book draft.
A
abstract names B
150
4. Logic methods
characterizes the required constant C. That is, we can determine if any name X points to the fixed object a by testing it with the predicate PC . a A predicate P ∈ NQ which satisfies a uniqueness rule, namely ∀x ∈ NV , ∀y ∈ NV , (P (x)∧P (y)) ⇒ (x = y) , c characterizes a unique concrete object. The predicate PC defined by PC (X) = “µV (X) = a” does satisfy this uniqueness requirement, but this alone cannot define PC because it does not define the unknown object a ∈ V. c The predicate PC (X) = “µV (X) = a” is specific to a, but there is no way to explicitly specify a. We need to convert PC into a predicate which is meaningful. This can be done by defining P (X) = “∀y, y ∈ / X”. The predicate P does not refer to any a-priori knowledge of the identity of the empty set at all. So this does define a constant name for the empty set. A name X is the name of the constant object (called the “empty set”) if and only if P (X) is true. Next it must be noted that the predicate expression P (X) = “∀y, y ∈ / X” depends on the membership relation “∈”, which is in turn a fixed predicate which is imported from the concrete predicate space. This has shifted the constancy problem from predicate parameters to predicates. It seems that the need to have an a-priori mapping from some constant predicate names to particular concrete predicates is unavoidable. On the other hand, the ZF axioms characterize the membership relation “∈” very precisely. If there are two or more such membership relations on the concrete space, that is not a problem. All of the consequences of ZF set theory will be valid for all maps µQ from the predicate name space NQ to the concrete predicate space Q. The “constant” empty set name is characterized the predicate P (X) = “∀y, y ∈ / X”, which depends on the variable name “∈” for a membership relation which satisfies the ZF axioms. However, this is not a problem which anyone worries about. 
The choice of predicate name map $\mu_Q : N_Q \to Q$ is very strongly constrained by the ZF set axioms. Then in terms of the choice of the membership relation “∈”, the name ∅ is uniquely constrained by the predicate P(∅), which is true only if the name “∅” points to the unique empty set for the particular choice of predicate name map for “∈”. Then we have $a \stackrel{c}{=} \mu_V(\emptyset)$.

In conclusion, the definition of a constant name (for a predicate parameter) requires a definition of equality on the concrete parameter space, which implies that a first order language without equality cannot have constant names, and at least one additional “constant” predicate must be specified in order to distinguish one parameter from another, so that one will know which concrete object is being pointed to by the constant.

4.13. Logical quantifiers

4.13.1 Remark: The subject of this section is predicate calculus, which is the logic of parametrized proposition families with the existential and universal quantifiers. Predicate calculus is an extension of propositional calculus, which is the subject of Section 4.4.

4.13.2 Remark: The quantifier symbols ∀ and ∃ mean “for all” (the universal quantifier) and “for some” (the existential quantifier) respectively. The existential quantifier may be defined in terms of the universal quantifier. Thus ∃x, P(x) is defined to mean ¬(∀x, ¬P(x)). Notation 4.13.3 is an attempt to formalize the meanings of these symbols.

4.13.3 Notation:
∀x, P(x) for any variable name x and predicate name P means “P(x) is true for all variables x”.
∃x, P(x) for any variable name x and predicate name P means “P(x) is true for some variable x”.

4.13.4 Remark: The symbol ∃ does not mean “there exists” as many elementary texts erroneously claim. It is a quantifier, not a verb. Sentences of the form “∃x such that P(x).” are wrong – and very annoying to the cognoscenti. The correct form is “∃x, P(x).” or: “There exists x such that P(x).” Thus in colloquial contexts one may read ∃x as “there exists x such that”, but not “there exists x”.

4.13.5 Remark: Although the ∃ symbol does not mean “there exists”, the symbol is mnemonic for the letter “E” of the word “exists”. Similarly the ∀ symbol is mnemonic for “A” in the word “all”. The rotation of the symbols through 180° is a relic of the olden days when typesetting used lead fonts. Using an existing character presumably saved space in the font drawer (called the “case”).
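The definition of ∃ in terms of ∀ in Remark 4.13.2 can be checked mechanically when the universe of variables is finite. The following Python sketch is an illustration only: the function names forall, exists and exists_via_forall, and the use of a finite Python range as the universe, are assumptions made here and are not part of the formalism of this book.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def forall(universe: Iterable[T], pred: Callable[[T], bool]) -> bool:
    """Truth value of 'forall x, P(x)' over a finite universe."""
    return all(pred(x) for x in universe)


def exists(universe: Iterable[T], pred: Callable[[T], bool]) -> bool:
    """Truth value of 'exists x, P(x)' over a finite universe."""
    return any(pred(x) for x in universe)


def exists_via_forall(universe: Iterable[T], pred: Callable[[T], bool]) -> bool:
    """'exists x, P(x)' defined as 'not (forall x, not P(x))'."""
    return not forall(universe, lambda x: not pred(x))


# Hypothetical finite universe and sample predicates, chosen for illustration.
universe = range(10)
predicates = [lambda x: x > 8, lambda x: x < 0, lambda x: x % 2 == 0]

# The direct and the derived existential quantifier agree on every sample.
agree = all(exists(universe, p) == exists_via_forall(universe, p)
            for p in predicates)
print(agree)
```

Such a check is only possible because the universe is finite. For an infinite universe the two sides cannot be compared by enumeration.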
4.13.6 Remark: The use of commas after every quantifier is intended to remove ambiguity. The comma terminates the quantifier unambiguously. This is a good idea even if the following sub-expression is parenthesized because juxtaposition of two sub-expressions could be confusing. It is better to simply make all expressions easy to parse by using the quantifier-terminating commas.

4.13.7 Remark: There is some variation in notations for logical quantifiers. The following table gives a sample of notations.

author             universal quantifier                  existential quantifier
EDM [33]           ∀xF(x), (x)F(x), ΠxF(x), ⋀xF(x)       ∃xF(x), (Ex)F(x), ΣxF(x), ⋁xF(x)
EDM2 [34]          ∀xF(x)                                ∃xF(x)
KEM [121]          ∀xA(x)                                ∃xA(x)
Lemmon [163]       (x)Px                                 (∃x)Px
Mendelson [164]    (x)A_1^1(x)                           (Ex)A_1^1(x)
Reinhardt [134]    ⋀_x P(x)                              ⋁_x P(x)
Shoenfield [168]   ∀xA                                   ∃xA
Szekeres [44]      ∀x(P(x))                              ∃x(P(x))
Kennington         ∀x, P(x)                              ∃x, P(x)

4.13.8 Remark: The universal and existential quantifiers are sometimes denoted by ⋀ and ⋁ respectively. This choice of symbols may be justified by considering expressions of the form “$P(x_1) \wedge P(x_2) \wedge P(x_3) \wedge \ldots$” and “$P(x_1) \vee P(x_2) \vee P(x_3) \vee \ldots$” respectively. However, these notations would be confusing in the context of tensor algebra. So they are not used in this book. (The subscript version is also inconvenient for the vertical spacing of text, as is evident in Remark 4.13.7.)

4.13.9 Remark: The universal quantifier gives the maximum possible information about the truth values of a predicate P because the statement ∀x, P(x) determines the truth value of the proposition P(x) for all values of x in the universe of the logical system. By contrast, the existential quantifier gives almost the minimum possible information about the truth values of a predicate P because the statement ∃x, P(x) determines the truth value of P(x) for only one value of x in the universe, and we don’t even know which value of x this is. Although universal and existential quantifiers are superficially similar, since they are in some sense duals of each other, the differences in information content between them demonstrate that they are fundamentally different. A similar comment was made in the case of conjuncts and disjuncts of propositions in Remark 3.5.9.

[ The following semantics remarks should be moved to the logic semantics chapter. ]

4.13.10 Remark: It is very difficult to specify notations for semantics. So Notation 4.13.3 necessarily explains the basic quantifier notations in natural language. If one looks too closely at this short explanation of the universal and existential quantifiers, however, some serious difficulties arise. The words “all” and “some” are not easy to explain precisely. In the physical world, it is very often impossible to prove that all the members of a class have a particular property. Even if the class is not infinite, it may still be impossible to test the property for all members. One might think that the word “some” is easier because the truth of the proposition is firmly proved as soon as one example is found. But if ∃x, P(x) is true, it might still not be possible to find the single required example x which proves the proposition. There is thus a clear duality between the two quantifiers.

If the class of variables x is infinite for empirical propositions P(x), it is never possible to prove that ∀x, P(x) is true, and it is never possible to prove that ∃x, P(x) is false. (This duality follows from the fact that “true” and “false” are duals of each other.) The inability to establish quantified predicates by physical testing, in the case of empirical propositions, is not necessarily a show-stopper. Propositions usually do not refer to the “real world” at all. Propositions are usually attributes of models of the real world. And the quantified predicates may not arise from case-by-case testing, but rather from some sort of a-priori assumption about the model. Then if the model does find a contradiction with the real world, the model must be modified or limited in some way. This is how most
universal propositions enter into models. They are adopted because they are initially not contradicted by observations, and the model is maintained as long as it is useful.

In the case of mathematical logic, it is sometimes impossible to find counter-examples to universal propositions, or examples for existential propositions. An important example of this is the idea of a Lebesgue non-measurable set. No examples of these sets can be found in the sense of being able to write down a rule for determining what the elements of such a set are. If the axiom of choice is added to the ZF set theory axioms, it can be proved by means of axioms and deduction rules that such sets must exist. In other words, it is shown that ∃x, P(x) is true for the predicate P(x) = “the set x is Lebesgue non-measurable”. But no examples can be found. This does not prove that the assertion is false, which is very frustrating if one thinks about it too much.

The difficulties with universal and existential quantifiers in mathematics are in one sense the reverse of the difficulties for empirical propositions. The empirical claim that “all swans are white” is always vulnerable to the discovery of new cases. So empirical universals can never be certain. But in the case of mathematics, universal propositions are often quite robust. For example, one may be entirely certain of the proposition: “All squares of even integers are even.” If anyone did find a counter-example, namely an even integer x whose square x² is not even, one could apply a quick deduction (from the definitions) that x is not even if x² is not even, which would be a contradiction. More generally, universal mathematical propositions are usually proved by demonstrating the absurdity of any counter-example. There is no corresponding method of proof to show the total absurdity of non-white swans.
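The deduction behind “All squares of even integers are even” can be written out in one line. The following is a sketch which assumes the usual definition that an integer x is even if x = 2k for some integer k. The symbol k is introduced here only for illustration.

```latex
x = 2k \ \text{for some integer } k
\quad\Longrightarrow\quad
x^2 = (2k)^2 = 4k^2 = 2\,(2k^2).
```

Since 2k² is an integer, x² is even. Any claimed counter-example would therefore contradict the definition of evenness itself, which is what makes this universal proposition robust in a way that no empirical universal can be.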
4.13.11 Remark: In Remark 4.3.10, it is mentioned that unquantified logical expressions are straightforward to parse and attach semantics to. Figure 4.13.1 shows an analogous attempt to parse and attach semantics to a quantified logical expression: ∃b, (b ∈ X ∧ F(a, b)). The use of bound variables implies that the nodes of the parse tree do not have a fixed meaning. The tree must be traversed potentially an infinite number of times to determine the truth value of the entire quantified expression.

[ Figure 4.13.1: Example quantified logical expression tree with syntax and semantics. Left panel (syntax): parse tree of ∃b, (b ∈ X ∧ F(a, b)). Right panel (semantics): function tree with nodes $\phi_\exists$, $\phi_\wedge$, $\phi_\in$ and F, giving $t(\exists b, (b \in X \wedge F(a, b))) = \max_b \phi_\wedge(\phi_\in(b, X), F(a, b))$. ]

4.13.12 Remark: The problem of interpreting infinite sets within set theory is really identical to the infinity problem in logic. The predicate calculus already has all of the infinity difficulties. On the other hand, no infinite set of propositions can ever be written down and checked one by one. So one can never really prove that any universally quantified infinite family of propositions is satisfied. This seems to put infinite conjunctions in the same general category as propositions which can never be proved true or false.

The infinite logic of limits in calculus is so overwhelmingly important in physics that it is difficult to simply ignore infinities. Perhaps it is best to think of the word “infinite” as meaning “finite but unbounded”. Then one can carry through the analysis of limits without fear that the elements of a sequence will vanish into a vacuum while being observed. There are adequate metaphors for infinity concepts. For example, the idea that no matter how large a number is, there will always be a larger number, is very convincing. We simply cannot imagine that any number could be so big that we could not add 1 to it. But this, and any other infinity metaphor, breaks
down when we consider straws on the backs of camels. It is very difficult to imagine that a single straw will break the back of a camel. But we also know that a weight of 100 tons cannot be carried by any camel. So there must be a point at which the camel will collapse. If we replace straws with single hairs or even lighter objects, it is even more difficult to imagine the camel collapsing from such a tiny weight increment. (See also Remark 2.11.15 regarding camels and straw.) As mentioned in Remark 2.11.2, the boundedness of the integers seems entirely plausible. But we just can’t imagine the breaking point where the universe’s ability to represent an integer would break down. (People can’t imagine their own death either, but this does not imply that humans are immortal. People’s minds are simply too limited to imagine some things.)

It should be noted that although mathematical physics does rely very heavily on limits in its formulation, it is the results of differentiation and integration of functions which are useful in physics, not the limiting processes themselves. Limits just provide a basis for developing algorithms which give the right answers. Furthermore, the atomic (or elementary particle) nature of matter and the quantum limits on measurement imply that no differentiation or integration (or any other kind of prediction of models) can ever be measured to infinite accuracy or even unbounded accuracy. Therefore it is not even clear that the laws of physics can be verified in any circumstances at all. Exact differentiation and integration lie in the realm of metaphysics because they are defined within inferred models which underlie phenomena. So in this sense, limits are not directly used in physics, and are therefore not validated by physics.

4.13.13 Remark: The set of proposition parameters in a first order logic should surely be a class in the sense of NBG set theory rather than a set. (See Section 5.12 for NBG set theory.) For example, ZF set theory is supposedly a first order logic, and the “set of all sets” is not a set. (See also Remark 4.1.4 regarding sets and classes of propositions.)

On the other hand, maybe the “set” of all individual variable names need not necessarily be sufficient to represent each element of the proposition parameter “set” by a separate name. So perhaps a ZF-style set suffices for the proposition parameter space.

It’s important to distinguish between the set of variable names in the language and the “set” of individual objects (variables) in the semantic space of the language. (The variable names belong to the discussion context. The concrete variables belong to the discussed context. These two contexts are discussed in Remark 4.3.1.) The set of variable names is limited by the information-carrying ability of the medium in which one writes sentences in the language. This set must therefore presumably be no larger than countably infinite. But the semantic space with which one interprets the individual variable names could be a proper class (in the sense of NBG set theory). The observation that the set of names must be finite (or countably infinite), because of the bandwidth limitations of human writing and talking, whereas the class of objects in the concrete logical system may have a completely arbitrary cardinality, is in essence the reason why pixie (or dark) sets and numbers must exist. Although the ZF axioms imply that the real numbers, for example, are uncountable, the variable name space for the predicate calculus cannot refer to all of them. (For dark sets and numbers, see for example Section 2.10.)

4.14. Predicate calculus

[ This section will present the logical axioms and the deduction rules for general predicate calculus. Then particular first order languages may be derived from this. ]
[ Discuss completeness theorems for predicate calculus near here. For example, in a complete logical system there should be a meta-theorem that if A is not provable, then A is provably false. ]
[ Discuss also first-order and second-order predicate logics (or languages). Second-order logics permit predicate names to be the bound variables in quantifiers. ]
[ Possibly present a version of group theory which is formalized as a first-order language which does not need set theory. Perhaps this could be done just after Section 9.2? ]

4.14.1 Remark: In this book, QC is an abbreviation for predicate calculus. The Q suggests the word “quantifier”. There isn’t just one predicate calculus. But there is a core set of common logical axioms and deduction rules which apply to every predicate calculus.
4.14.2 Remark: There is essentially only one propositional calculus, although there are very numerous formulations of it. But in the case of predicate calculus, there are many very different first order languages because they have a wide variety of sets of predicates, functions and axioms.

[ Most of Remark 4.14.3 almost certainly should be in the logic semantics chapter. ]

4.14.3 Remark: Truth tables are a tabular representation of the method of exhaustive substitution. When there are N propositions in a logical expression, there are 2^N combinations to test. When there are infinitely many propositions, the exhaustive substitution method is not so clearly applicable. (There is a similar comment in Remark 4.4.4.) Mendelson [164], page 56, makes the following comment on the inapplicability of the truth table approach to the evaluation of logical expressions involving parametrized families of propositions.

  In the case of the propositional calculus, the method of truth tables provides an effective test as to whether any given statement form is a tautology. However, there does not seem to be any effective process to determine whether a given wf is logically valid, since, in general, one has to check the truth of a wf for interpretations with arbitrarily large finite or infinite domains. In fact, we shall see later that, according to a fairly plausible definition of “effective”, it may actually be proved that there is no effective way to test for logical validity. The axiomatic method, which was a luxury in the study of the propositional calculus, thus appears to be a necessity in the study of wfs involving quantifiers, and we therefore turn now to the consideration of first-order theories.

This observation requires careful interpretation. It does not imply that parametrized families of propositions can derive their validity only from axioms. The semantics of parametrized propositions is not totally different to the semantics of unparametrized propositions. The change of notation to indicate parameters, and the introduction of universal and existential quantifiers to indicate infinite conjunctions and disjunctions respectively, does not imply that the semantic foundations are irrelevant to the determination of truth and falsity of logical expressions.

It must be remembered that the axiomatization of logic is merely a formalization of a kind of “logical algebra”. (This is summarized in Remark 3.1.2.) There is an analogy here to numerical algebra, where a proposition such as x² + 2x = (x + 1)² − 1 may be shown to be true for all x ∈ ℝ without needing to substitute all values of x to ensure the validity of the formula. One uses algebraic manipulation rules which preserve the validity of propositions, such as “add the same number to both sides of an equation” and “multiply both sides of the equation by a non-zero number”, together with rules of distributivity, commutativity, associativity and so forth. But these rules are based directly on the semantics of addition and multiplication. The rules are valid if and only if they match the semantics of the arithmetic operations.

In the same way, the axioms of predicate calculus are not arbitrary or optional. The predicate calculus axioms and rules derive their validity entirely from the propositional calculus, which derives its axioms and rules from the semantics of propositional logic. Consequently, the validity of predicate calculus is derived entirely from exhaustive substitution in principle. It is not possible to carry out exhaustive substitution for infinite families of propositions. But the rules and axioms are determined from a study of the meanings of logical expressions. Thus predicate calculus is merely a formalization of the “algebra” of parametrized families of propositions, and the objective of this algebra is to “solve” for the truth values of particular propositions and logical expressions, and also to show equality (or other relations) between the truth value functions represented by various logical expressions which may involve quantifiers.

It is too easy to fall into the temptation to regard truth in predicate calculus as arising solely from manipulations of symbols according to apparently somewhat arbitrary rules and axioms. The truth of a proposition does not arise from the hocus-pocus with the line-by-line deductions. The truth arises from the semantics of the proposition, which is merely discovered or inferred by means of line-by-line argument.

The above comment by Mendelson [164], page 56, continues into the following footnote.

  There is still another reason for a formal axiomatic approach. Concepts and propositions which involve the notion of interpretation, and related ideas such as truth, model, etc., are often called semantical to distinguish them from syntactical concepts, which refer to simple relations among symbols and expressions of precise formal languages. Since semantical notions are set-theoretic in character, and since set theory, because of the paradoxes, is considered a rather
  shaky foundation for the study of mathematical logic, many logicians consider a syntactical approach, consisting in a study of formal axiomatic theories using only rather weak number-theoretic methods, to be much safer.

The paradoxes in set theory seem to all arise from self-referential propositions. (This is discussed at length in Section 5.7.) But self-referential propositions are really a logic problem. It might be a little unfair to blame set theory for paradoxes which arise in logic which is imported from the predicate calculus in the first place! The paradoxes disappear when sets are defined non-self-referentially. So it might be safe to base logic on a semantical foundation of set theory after all. More important than this, the validity of predicate calculus is unquestionably derived from the semantics of set theory. This cannot be avoided. In fact, detaching predicate calculus from set theory creates vast opportunities to deviate from the natural meanings of logical propositions and head off into territory populated by ever more bizarre paradoxes of a logical kind. (Examples of this could include multi-valued logic and inconsistency-tolerant logic.) At the very least, the axioms of predicate calculus should be based on a consideration of the underlying semantics. This should prevent wild departures from the original purposes of the study of logic.

4.14.4 Remark: Lemmon [163], page 91, has the following comment on the inapplicability of the truth table approach to the predicate calculus. (The word “sequent” here roughly means a theorem of propositional calculus.)

  At more complex levels of logic, however, such as that of the predicate calculus […], the truth-table approach breaks down; indeed there is known to be no mechanical means for sifting expressions at this level into the valid and the invalid. Hence we are required there to use techniques akin to derivation for revealing valid sequents, and we shall in fact take over the rules of derivation for the propositional calculus, expanding them to meet new needs. The propositional calculus is thus untypical: because of its relative simplicity, it can be handled mechanically – indeed, […] we can even generate proofs mechanically for tautologous sequents. For richer logical systems this aid is unavailable, and proof-discovery becomes, as in mathematics itself, an imaginative process.

The requirement for the use of “the rules of derivation” is not very different to the requirement for deduction and reduction procedures in numerical algebra. The task of predicate calculus is to solve problems in “logical algebra”. If infinite sets of variables are present in any kind of algebra, numerical or logical, it is fairly clear that one must use rules to handle the infinite sets rather than particular instances. But this does not imply that the methods of argument, devoid of semantics, have a life of their own. By analogy, one may specify an arbitrary set of rules for manipulating numerical algebra expressions, equations and relations, but if those rules do not correspond to the underlying meaning of the expressions, equations and relations, those rules will be of recreational interest at best, and a waste of time and resources at worst. To put it simply, semantics does matter!

It is not quite clear that “an imaginative process” is required, whatever that may mean. Perhaps the author meant that semantics has a role in proof discovery. It is remarkable that mathematics, historically speaking, was originally entirely a matter of “imagination”. Then the communications between mathematicians were codified symbolically to the extent that some mechanization and automation was possible. But then, when mechanical methods are inadequate, the application of “imagination” is almost regarded as a necessary evil, whereas it was originally the whole business!

One could mention a parallel here with the industrial revolution, during which a vast proportion of human productive activity was mechanized and automated. The observation that the design, use and repair of machines requires human intervention, including manual dexterity and “an imaginative process”, is perhaps annoying at times, but one should not forget that humans were an endangered species of monkey only a couple of hundred thousand years ago. The automation of a large proportion of human thought by mechanizing logical processes can be as beneficial as the automation of economic goods and services. But semantics defines the meaning and purposes of logical processes. So semantics is as necessary to logic as human needs are to the totality of economic production. (A cynic would possibly comment here that sometimes too many people forget that the economy is supposed to serve humans, not vice versa. By analogy, one should not permit blind methods of logic to force mathematicians to conclusions which seem totally ridiculous. If the conclusions are too bizarre, then maybe it’s the mechanized logic which needs to be fixed, not the minds of mathematicians.)

[ Express the QC axioms in Mendelson [164], page 57, in terms of min/max operators combined with clipping.
Must define “clip” functions like x ↦ max(0, x) somewhere. For example, A ⇒ B(x) is equivalent to max(t(A) − t(B(x)), 0) = 0. This approach is not very satisfying, but it does deliver a language for expressing the semantics of predicate calculus. ]
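The min/max and clipping idea in the note above can be tried out on a finite universe. The following Python sketch is only an illustration under stated assumptions: truth values are encoded as the integers 0 and 1, and the names clip, t_and, t_implies, t_exists, phi_in, UNIVERSE, X and F are hypothetical choices made here, not definitions taken from Mendelson or from this book.

```python
from itertools import product


def clip(x: int) -> int:
    """The clip function x -> max(0, x)."""
    return max(0, x)


def t_and(a: int, b: int) -> int:
    """Conjunction of truth values in {0, 1} as a minimum."""
    return min(a, b)


def t_implies(a: int, b: int) -> int:
    """Implication: true (1) exactly when clip(a - b) == 0."""
    return 1 - clip(a - b)


def t_exists(universe, pred) -> int:
    """Existential quantifier as a maximum over a finite universe."""
    return max((pred(x) for x in universe), default=0)


# Exhaustive substitution over the 2^2 truth value combinations confirms
# that "clip(t(A) - t(B)) == 0" agrees with the usual table for A => B.
implication_ok = all(
    (clip(a - b) == 0) == ((not a) or bool(b))
    for a, b in product((0, 1), repeat=2)
)

# Hypothetical data for the expression: exists b, (b in X and F(a, b)).
UNIVERSE = range(5)
X = {1, 3}


def phi_in(b: int, s: set) -> int:
    return int(b in s)


def F(a: int, b: int) -> int:
    return int(a < b)


def t_example(a: int) -> int:
    """t(exists b, (b in X and F(a, b))) = max_b min(phi_in(b, X), F(a, b))."""
    return t_exists(UNIVERSE, lambda b: t_and(phi_in(b, X), F(a, b)))


print(implication_ok, t_example(2), t_example(3))
```

The maximum over b plays the role of the bound variable. Each value of b requires a fresh evaluation of the inner expression, which is only feasible here because the assumed universe has five elements.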
4.15. Equality

[ See EDM2 [34], 411.J, for predicate calculus with equality. ]

4.15.1 Remark: The concept of uniqueness requires a definition for the equality relation, which is usually denoted as “=”. (See Remark 46.2.6 for the history of this symbol.) A definition of equality must be provided somehow in any set theory, for example in Zermelo-Fraenkel set theory. There are (at least) two ways to introduce equality into predicate logic.

(1) Assume that the concrete logical system which is being modelled has its own equality relation which defines the identity of concrete objects. Import the concrete equality relation into the abstract predicate logic. The imported equality relation then necessarily satisfies the “reflexivity of equality” axiom (4.15.1) and “substitutivity of equality” axiom (4.15.2). (See Mendelson [164], page 75, for these axioms.)

(2) Define the abstract equality relation in terms of some other relation (such as set membership) which is imported from the concrete logical system. Then specify that the reflexivity and substitutivity axioms be satisfied for the defined equality relation.

$$ \forall A,\ A = A \eqno(4.15.1) $$
$$ \hbox{wff } \alpha:\quad (A = B) \Rightarrow (\alpha(A) \Leftrightarrow \alpha(B)). \eqno(4.15.2) $$

In the case of set theory, the equality A = B may be defined for sets A and B to mean the proposition ∀x, ((x ∈ A) ⇔ (x ∈ B)), where “∈” denotes a set membership relation which is imported from a concrete predicate logic. This is an example of approach (2). By contrast, in approach (1), the presumed importation of the equality relation from a concrete predicate logic automatically yields (4.15.1) and (4.15.2). So any two names which refer to the same object may be used interchangeably in any proposition. (See Remark 5.2.3 for more detailed discussion of these two approaches to defining equality.)

4.15.2 Remark: The importation of a concrete equality relation into the abstract name space is achieved by defining A = B to mean $\mu_V(A) \stackrel{c}{=} \mu_V(B)$ for all $A, B \in N_V$, where $\stackrel{c}{=}$ denotes the concrete equality relation $\stackrel{c}{=} : V \times V \to \{F, T\}$. Similarly, the truth value P(A) for any one-parameter predicate name P, and any variable name A, is defined by $P(A) = \mu_Q(P)(\mu_V(A))$.

4.15.3 Remark: Equality may be defined in a logical system which is not a set theory. The minimum requirements for an equality definition are the three well-known conditions for an equivalence relation as presented in Section 6.4, namely reflexivity, symmetry and transitivity. These three conditions are not limited to set theories. But they are not automatically defined in all logical systems.

4.15.4 Remark: There are (at least) two ways of interpreting the equality concept in symbolic logic.

(1) Linguistic level: Regard equality as an abstract relation among symbols without any reference to the meaning of the symbols in some other space of entities.

(2) Semantic level: Regard the symbols as mere temporary labels which refer always to elements of an externally defined set of objects.

In case (1), the symbols are the real “things” which are defined in the logical system. So equality is an abstract relation between symbols. Usually this relation would have some sort of connection through axioms to other predicates in the system. In case (2), the relation of equality is defined by the association of the symbols with underlying objects. Then equality between symbol A and symbol B means that symbols A and B refer to the same object in the set of objects. The equivalence relation conditions (reflexivity, symmetry and transitivity) are automatically
satisfied by importing the concrete equivalence relation from the concrete object space, from which the language acquires its semantics. One might fairly ask how the equality relationship is defined in the semantic space of concrete objects. However, this is an application question. The abstract theory of a predicate calculus with equality is applicable to any concrete object space which satisfies the stated requirements. It is the “user’s” responsibility to ensure that the requirements are met.
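The importation of equality described in Remark 4.15.2 can be sketched for a very small name space. In the following Python illustration, the dictionaries mu_V and mu_Q, the three names "A", "B" and "C", the two predicate names, and the use of Python frozensets as the concrete object space are all assumptions made here for the purpose of the example.

```python
# Hypothetical concrete object space V: a few finite sets as frozensets.
EMPTY = frozenset()

# Assumed name map mu_V from variable names to concrete objects.
mu_V = {"A": EMPTY, "B": frozenset(), "C": frozenset({EMPTY})}

# Assumed name map mu_Q from predicate names to concrete predicates.
mu_Q = {
    "is_empty": lambda obj: len(obj) == 0,
    "has_one_element": lambda obj: len(obj) == 1,
}


def concrete_eq(u, v) -> bool:
    """Stand-in for the concrete equality relation on V."""
    return u == v


def name_eq(a: str, b: str) -> bool:
    """A = B is defined to mean that mu_V(A) and mu_V(B) are concretely equal."""
    return concrete_eq(mu_V[a], mu_V[b])


def holds(p: str, a: str) -> bool:
    """P(A) is defined as mu_Q(P) applied to mu_V(A)."""
    return mu_Q[p](mu_V[a])


names = list(mu_V)
reflexive = all(name_eq(a, a) for a in names)
symmetric = all(name_eq(a, b) == name_eq(b, a) for a in names for b in names)
transitive = all(
    name_eq(a, c)
    for a in names for b in names for c in names
    if name_eq(a, b) and name_eq(b, c)
)
# Substitutivity: equal names give equal truth values for every predicate.
substitutive = all(
    holds(p, a) == holds(p, b)
    for p in mu_Q for a in names for b in names
    if name_eq(a, b)
)
print(reflexive, symmetric, transitive, substitutive)
```

In this sketch the names "A" and "B" are different symbols which point to the same concrete object, so they are equal at the semantic level although they are distinct at the linguistic level. The four checks succeed here only because the assumed concrete equality is itself an equivalence relation, which is the “user’s” responsibility mentioned above.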
4.16. Uniqueness

4.16.1 Remark: There is no absolute reason why there could not be more than one equality relation in a logical system. Then one could define uniqueness for each equality relation. However, logical systems typically have a single equality relation which is thought of as defining the correspondence of names to underlying objects.

4.16.2 Remark: The shorthand “∃′” is often used in mathematics to mean “for some unique”. For example, “∃′ x, P(x)” should mean “for some unique x, P(x) is true”. This may be expanded to the statement “for some x, P(x) is true; and there is at most one x such that P(x) is true”. So it is really shorthand for two statements. This is stated more precisely in Notation 4.16.3.

4.16.3 Notation: ∃′ x, P(x), for a predicate P and variable name x, means:

    (∃x, P(x)) ∧ (∀x, ∀y, ((P(x) ∧ P(y)) ⇒ (x = y))).
4.16.4 Remark: If the predicate P in Notation 4.16.3 is a very complicated expression, possibly possessing several parameters, the longhand proposition could be quite irksome. It is important in practice not to ignore additional predicate parameters because uniqueness is usually conditional upon other parameters being restricted in some way. In colloquial contexts, the “∃′” notation is sometimes used in such a way that the dependencies are ambiguous. This can be avoided by ensuring that all quantifiers and conditions are written out fully and in the correct order. Notation 4.16.5 gives an extension of Notation 4.16.3 to explicitly show other parameters of the predicate P.

4.16.5 Notation: ∃′ x, P(y_1, . . . y_m, x, z_1, . . . z_n), for a predicate P with m + n + 1 parameters for nonnegative integers m and n, and a variable name x, means:

    (∃x, P(y_1, . . . y_m, x, z_1, . . . z_n)) ∧ (∀u, ∀v, ((P(y_1, . . . y_m, u, z_1, . . . z_n) ∧ P(y_1, . . . y_m, v, z_1, . . . z_n)) ⇒ (u = v))).

4.16.6 Remark: In terms of the cardinality notation in set theory, one might write ∃′ x, P(x) informally as “#{x; P(x)} = 1”, meaning that there is precisely one thing x such that P(x) is true. (Existence means that #{x; P(x)} ≥ 1 whereas uniqueness means that #{x; P(x)} ≤ 1.) But the notation “∃′” corresponds to a logic concept which is not restricted to set theory.

[ Formalize the idea that if something exists and is unique, then it may be given a name and can be used in later definitions and theorems. ]

4.16.7 Remark: Much of pure mathematics is concerned with proving that sets (or numbers, or functions) exist and are unique. A function is defined to have a value which exists and is unique. Much of partial differential equations theory is concerned with existence and uniqueness. So the ∃′ quantifier represents an important concept. In particular, if a set exists and is unique, it can be used in a definition. Definitions mostly define things which exist and are unique. One may use the word “the” for unique objects. For example, the Zermelo-Fraenkel empty set axiom (Definition 5.1.26 (2)) states that ∃A, ∀x, x ∉ A. It is easily proved that ∃′ A, ∀x, x ∉ A. Therefore the set A may be given the name “the empty set” and a specific notation “∅”. (See Theorem 5.8.2 and Notation 5.8.4.)

[ Possibly give formal definitions of the “multiplicity quantifiers” in Remark 4.16.8. ]
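The two-part meaning of ∃′ in Notations 4.16.3 and 4.16.5 can be tried out on a finite universe of discourse. The following is only a sketch under that finiteness assumption: the function names (exists, at_most_one, exists_unique, is_square_root) and the choice of domain are illustrative and do not come from the text.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def exists(domain: Iterable[T], pred: Callable[[T], bool]) -> bool:
    """Existence part: ∃x, P(x), over a finite domain."""
    return any(pred(x) for x in domain)


def at_most_one(domain: Iterable[T], pred: Callable[[T], bool]) -> bool:
    """Uniqueness part: ∀x, ∀y, ((P(x) ∧ P(y)) ⇒ (x = y))."""
    items = list(domain)
    return all(x == y for x in items for y in items if pred(x) and pred(y))


def exists_unique(domain: Iterable[T], pred: Callable[[T], bool]) -> bool:
    """∃′x, P(x): the conjunction of the two parts above."""
    items = list(domain)
    return exists(items, pred) and at_most_one(items, pred)


def is_square_root(y: int, x: int) -> bool:
    """A two-parameter predicate P(y, x) in the style of Notation 4.16.5."""
    return x * x == y


# Hypothetical finite universe of discourse {-3, ..., 3}.
DOMAIN = range(-3, 4)

# Uniqueness of x depends on the value of the extra parameter y.
print(exists_unique(DOMAIN, lambda x: is_square_root(0, x)))  # exists and unique
print(exists_unique(DOMAIN, lambda x: is_square_root(4, x)))  # exists, not unique
print(exists_unique(DOMAIN, lambda x: is_square_root(2, x)))  # does not exist
```

The example with the parameter y shows why Remark 4.16.4 insists on writing out the other parameters: the same predicate is uniquely satisfiable for some parameter values and not for others.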
4.16.8 Remark: It would be convenient to have notations for concepts such as “there are at least 2 things x such that P(x) is true”, notated “∃_2 x, P(x)”. More generally, “there are between m and n things x such that P(x) is true” for cardinal numbers m ≤ n, notated “∃_m^n x, P(x)”. Then ∃′_n would be a sensible shorthand for ∃_n^n. Hence the quantifiers ∃′, ∃′_1 and ∃_1^1 are equivalent. The notation “∃^n x, P(x)” should mean: “There are at most n things x such that P(x) is true.” So “∃^n” should mean the same as “∃_0^n”. (See Exercises 47.1.6, 47.1.7, 47.1.8, 47.1.9, 47.1.10, 47.1.11, 47.1.12, 47.1.13 regarding expressions for “∃_m^n”.) These kinds of generalized uniqueness and existence quantifiers could be referred to as “multiplicity quantifiers”.

4.16.9 Remark: Perhaps the generalized existential quantifier notations in Remark 4.16.8 are best written out in longhand. The simple case ∃_2 x, P(x) may be defined as ∃x, ∃y, (P(x) ∧ P(y) ∧ (x ≠ y)), but the more general cases become rapidly more complex. The slightly more complex case ∃′_2 x, P(x) (that is, ∃_2^2 x, P(x)) would mean:

    (∃x, ∃y, (P(x) ∧ P(y) ∧ (x ≠ y))) ∧ (∀x, ∀y, ∀z, ((P(x) ∧ P(y) ∧ P(z)) ⇒ (x = y ∨ y = z ∨ x = z))).
4.16.10 Remark: Just as “unique” means that there is one thing only, the word “duplique” must mean that there are two things only. The adjectives for 3 and 4 might be “triplique” and “quadruplique” respectively. (This terminology is highly conjectural of course.) Just as the noun “existence” means that there exists at least one thing, the word “duplicity” might mean that two things exist. (The corresponding adjective might be “duplicate” or “duplex”.) The nouns for 3 and 4 might be “triplicity” and “quadruplicity” respectively.
4.16.11 Theorem: The statement ∃^n x, P(x) is equivalent to ¬(∃_{n+1} x, P(x)) for non-negative integers n. So ∃_m^n x, P(x) is equivalent to (∃_m x, P(x)) ∧ ¬(∃_{n+1} x, P(x)). The statement ∃_n x, P(x) is equivalent to ¬(∃^{n−1} x, P(x)) for positive integers n.

[ Under what conditions can one say that ∃x, x = x is a tautology? ]

[ According to Mendelson [164], page 47, a proposition of the form “∀y, α” for some wff α is the same as the unquantified wff α if α does not refer to the variable y. This is a definition of the meaning of “∀y, α”. This expression requires definition because the standard definition of “∀y” applies only in the case that the quantified wff refers to y. Formalize these ideas axiomatically. ]

[ Are tautologies defined for predicate calculus? Or are they only defined in propositional calculus? ]

4.16.12 Theorem [qc]: The following statements are tautologies.
(i) ∀x, x = x.
(ii) ∃y, y = x, for any x.
(iii) ∀x, ∃y, y = x.

[ Provide a more formal proof of Theorem 4.16.12. ]

Proof: Part (i) follows from the definition of equality. Part (ii) follows by noting that x = x. Therefore y = x is satisfied by substituting x for y. Part (iii) is essentially the same as part (ii).

[ Need to formalize the idea that a symbol like ∅ can represent a particular individual variable in a predicate logic, for example defined by “x; ∀y, y ∉ x” without the curly brackets. Similarly define notations like ordered pairs {a, b} in terms of an expression which satisfies unique existence. ]
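The multiplicity quantifiers of Remark 4.16.8 and the equivalences claimed in Theorem 4.16.11 can be checked by brute force on a small finite domain. The sketch below assumes a domain of distinct elements and defines the quantifiers by counting; the names at_least, at_most, between and check_theorem are hypothetical. It also compares the counting definitions against the longhand forms of Remark 4.16.9. This is a finite sanity check, not a proof.

```python
from typing import Callable, Iterator, Sequence

Pred = Callable[[int], bool]


def count(domain: Sequence[int], pred: Pred) -> int:
    """Number of x in the domain with P(x) true (elements assumed distinct)."""
    return sum(1 for x in domain if pred(x))


def at_least(m: int, domain: Sequence[int], pred: Pred) -> bool:
    """Lower-bound quantifier: at least m things satisfy P."""
    return count(domain, pred) >= m


def at_most(n: int, domain: Sequence[int], pred: Pred) -> bool:
    """Upper-bound quantifier: at most n things satisfy P."""
    return count(domain, pred) <= n


def between(m: int, n: int, domain: Sequence[int], pred: Pred) -> bool:
    """Two-sided quantifier: between m and n things satisfy P."""
    return at_least(m, domain, pred) and at_most(n, domain, pred)


def at_least_two_longhand(domain: Sequence[int], pred: Pred) -> bool:
    """∃x, ∃y, (P(x) ∧ P(y) ∧ (x ≠ y))."""
    return any(pred(x) and pred(y) and x != y for x in domain for y in domain)


def exactly_two_longhand(domain: Sequence[int], pred: Pred) -> bool:
    """Longhand form of 'exactly two' from Remark 4.16.9."""
    upper = all(
        x == y or y == z or x == z
        for x in domain for y in domain for z in domain
        if pred(x) and pred(y) and pred(z)
    )
    return at_least_two_longhand(domain, pred) and upper


def all_predicates(domain: Sequence[int]) -> Iterator[Pred]:
    """Yield every predicate on the domain, one per subset of the domain."""
    for mask in range(2 ** len(domain)):
        true_set = {x for i, x in enumerate(domain) if (mask >> i) & 1}
        yield lambda x, s=true_set: x in s


def check_theorem(domain: Sequence[int]) -> bool:
    """Check the equivalences of Theorem 4.16.11 for every predicate."""
    size = len(domain)
    for pred in all_predicates(domain):
        for n in range(size + 1):
            if at_most(n, domain, pred) != (not at_least(n + 1, domain, pred)):
                return False
        for n in range(1, size + 2):
            if at_least(n, domain, pred) != (not at_most(n - 1, domain, pred)):
                return False
        if at_least(2, domain, pred) != at_least_two_longhand(domain, pred):
            return False
        if between(2, 2, domain, pred) != exactly_two_longhand(domain, pred):
            return False
    return True


DOMAIN = (0, 1, 2, 3)  # hypothetical four-element universe
print(check_theorem(DOMAIN))
```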
Chapter 5 Sets

5.1 Zermelo-Fraenkel set theory
5.2 The ZF extension axiom
5.3 The ZF empty set, pair, union and power set axioms
5.4 The ZF replacement axiom
5.5 The ZF regularity axiom
5.6 The ZF infinity axiom
5.7 Russell’s paradox
5.8 ZF set theory definitions and notations
5.9 Axiom of choice
5.10 Axiom of countable choice
5.11 Zermelo set theory
5.12 Bernays-Gödel set theory
5.13 Basic properties of binary set unions and intersections
5.14 Basic properties of general set unions and intersections
5.15 Closure of set unions under arbitrary unions
5.16 Specification tuples
5.0.1 Remark: Set theory is a foundation layer upon which most of mathematics may be built. In this book, however, set theory is explained in terms of an even lower foundation layer, mathematical logic, which is presented in Chapters 3 and 4. So it seems reasonable to commence a systematic explanation of modern mathematics with logic, progressing to set theory, followed by the rest of mathematics. (And then mathematical physics rests firmly and comfortably on such a solid, multi-layered foundation of mathematics.) On the other hand, Halmos [159], page vi, said:

    [. . . ] general set theory is pretty trivial stuff really, but, if you want to be a mathematician, you need some, and here it is; read it, absorb it, and forget it.

It is true that mathematics can be done successfully without first studying set theory or mathematical logic. But some people have a strong desire to understand what everything means, not just how to do it. Since this book is strongly oriented towards meaning and understanding, as opposed to the achievement of mere calculational prowess, the presentation commences with the foundations. Some people study set theory just because “you need some”. They prefer to let dedicated specialists worry about the foundations. But the mathematician (and mathematical physicist) who has revolutionary intentions, not content with the gradual evolutionary development of their subject, will desire an in-depth understanding of the fundamentals.

5.0.2 Remark: Mathematics is full of perplexing and incomprehensible ambiguities. So it is always useful to be able to wave one’s hands and claim that everything can be made meaningful using “rigorous set theory”. This is like the ancient Greeks explaining events (e.g. in the Iliad) in terms of the whims of deities on Mount Olympus. As long as no one went there, no one could prove that the whims of deities on Mount Olympus didn’t explain events. Similarly, one should not examine set theory too closely because the myth of a solid basis in rigorous set theory is useful to maintain. Even in times when most logicians thought that set theory was irreparably self-inconsistent, mathematics progressed regardless. Mathematics did not stop for a decade while waiting for paradoxes in set theory to be resolved. So don’t panic!
5.0.3 Remark: It is not a priori obvious that mathematics can be formalized in terms of symbolic logic. The overwhelming majority of mathematical literature is written in terms of informal arguments. Very often, a close examination of even the best-written research papers reveals ambiguities, self-contradictions and meaningless assertions. It is not certain that the ruthless reduction of real-life mathematical literature would not remove some of the mathematical corpus which is considered true by the majority of the best mathematicians. Nevertheless, most mathematicians would accept that any argument which is shown to be either wrong or meaningless when formalized should be rejected from the corpus, no matter how strongly supported the argument may be by “intuition”. The view is taken in this book that any argument which cannot be validated in formal set theory must be rejected from the corpus of mathematical knowledge.

5.0.4 Remark: Constructing all mathematical objects from ZF sets is like building all houses, cars and computers out of empty boxes. Although it is possible to build mathematics from highly convoluted constructions which are defined entirely in terms of the empty set, this is not at all natural. An alternative to the sets-only approach for the definition of mathematical objects is the axiomatic approach, where objects such as integers, real numbers and other low-level structures are defined by axioms. Then the particular set-based constructions of these structures are merely representations for the corresponding axiomatic systems. By allowing structures other than sets to be defined axiomatically, mathematical constructions will not then all be derivable from empty sets as their core underlying object. Some minimalism is thereby sacrificed for the sake of naturalism. However, minimalism in the foundations implies a vast amount of pointless work in the construction phase, constructing all mathematical objects out of the empty set. The fact that it can be done does not imply that it should be done. Suitable candidates for axiomatization (i.e. definition independent of pure set theory) include numbers, groups, fields, linear spaces, tensors and tangent vectors.

Allowing some classes of mathematical objects to be defined independently of set theory could be regarded as a kind of diversification strategy or insurance policy. If it turns out later that there is some fundamental flaw in pure set theory which cannot be fixed, it would be useful to have an independent basis for at least some of the structures which are required for mathematics.

5.0.5 Remark: The origin of modern set theory is generally traced back to Georg Cantor’s work in the last quarter of the 19th century, particularly his paper “Grundlagen einer allgemeinen Mannigfaltigkeitslehre” (“Foundations of a general theory of aggregates”) in 1883. This partially intuitive form of set theory contained paradoxes which were resolved within axiomatic set theory in the early 20th century.

5.0.6 Remark: The use of set theory requires the acceptance of a small number of rules of logical procedure and some basic axioms. These rules and axioms are like building regulations which should prevent the too-creative architect from designing an edifice that may collapse in a storm. Some architects of mathematical structures may choose to build rogue projects outside the building safety regulations, but to have a work certified, it must be made to conform, and fellow architects must inspect it for flaws and design faults. Physicists and other scientists are not bound by the mathematicians’ building regulations. Therefore mathematical models constructed by physicists and others must sometimes be dismantled and reconstructed to fit the current regulations. Part of the work of mathematicians is to reconstruct models which were originally constructed by non-mathematicians.

5.0.7 Remark: The idea of trying to find a minimal set of axioms for set theory, from which one can deduce all of set theory by logical arguments, seems to be inspired by Euclid’s “Elements”, which attempted to achieve this goal for geometry. It is not entirely obvious that such a venture should succeed. It is clear that within almost any subject, it will always be possible to deduce at least some “facts” from other facts. It should perhaps be surprising that so much of mathematics can be deduced without contradictions from a small set of axioms. There are very few subjects which can be so thoroughly axiomatized as mathematics. It is difficult to be certain that nothing has been lost from mathematics in the attempt to reduce the assumptions and rules to a minimum. It is also not clear just how much is gained from this minimalist endeavour.
One difficulty with the axiomatic approach is that one must begin with a knowledge of a large number of facts of a subject, and then work backwards to find a set of axioms which generates the entire subject by means of simple deduction rules. The axioms are justified by the facts which follow from them. But the deductive method justifies conclusions in terms of assumed axioms. In this sense, the axiomatic method is not intuitive.

5.0.8 Remark: Most proofs of theorems in this chapter are given as exercise questions (Chapter 47) because they are more useful for the reader’s active learning than for presenting novel techniques of deduction. Answers will be provided in Chapter 48 for all exercises in case the reader wishes to ensure the validity of proofs. Although most of the theorems are elementary in character, their proofs can be elusive or sometimes even frustrating, especially since the correctness of most of the assertions is “obvious”.

5.0.9 Remark: Abbreviations. The following are abbreviations for some particular set theories. ZF means “Zermelo-Fraenkel”. BG means “Bernays-Gödel”. NBG means “Neumann-Bernays-Gödel”. The abbreviation AC means “axiom of choice”. CC means “axiom of countable choice”. The abbreviation CCω is also popular for the axiom of countable choice.

5.0.10 Remark: Theorems which are based on non-standard axioms (i.e. not Zermelo-Fraenkel set theory) are tagged. For example, a theorem based on the axioms ZF plus AC is written as Theorem [zf+ac]. For example, see Theorems 10.2.25 and 20.1.4. (See also Remark 4.7.2.) Non-standard additions to ZF set theory (“optional extras”) include AC, CC, the axiom of dependent choice and the continuum hypothesis.
5.1. Zermelo-Fraenkel set theory

5.1.1 Remark: An excellent introduction to “naive set theory” is Halmos [159]. For a very concise summary of Zermelo-Fraenkel set theory, see EDM [33], article 35, or the updated version in EDM2 [34], section 33.B. For a much less digestible but more rigorous treatment, see Mendelson [164], page 206, which defines Zermelo-Skolem-Fraenkel set theory in terms of Neumann-Bernays-Gödel set theory, which is presented in Mendelson [164], pages 159–170. (The sceptical reader might question the need for so many different set theories. Why can’t there be just one standard set theory? As someone once famously observed, the good thing about standards is that there are so many to choose from!)

5.1.2 Remark: Sets are not defined. As far as set theory is concerned, sets are merely symbols which satisfy some axioms and rules of deduction. However, mathematicians think of those symbols as referring to mathematical objects. (See Section 2.3 for some comments on the ontology of mathematics.) Although sets are undefined, this is the normal way in which axiomatic systems are treated. For example, in probability theory, probability itself is not defined.

5.1.3 Remark: Set theory defines the membership relation “∈” rather than sets. So it would perhaps be more accurate to call set theory “membership theory”. Since the only relation between sets (apart from equality) is the membership relation, it is reasonable to expect that two sets are the same set if and only if they have the same membership relations with other sets. So it is usual to require that A = B if ∀x, (x ∈ A ⇔ x ∈ B). This requirement can be either an axiom or a definition of set equality. (This is also discussed in Section 5.2.)

5.1.4 Notation: x ∉ y means ¬(x ∈ y).

5.1.5 Remark: When the elements of a set A are themselves sets under consideration, the set A may be referred to as a “collection”. There is really no difference between sets and collections, but it is sometimes useful to have a different word to help clarify an idea. Since essentially all things in set theory are sets, it follows that all members of sets must be sets.

5.1.6 Remark: One very annoying thing about set theory is the fact that an expression like {x; P(x)}, the “set of things satisfying predicate P”, is not guaranteed to define a set. (See Notation 5.8.12.) To have a high likelihood of constructing only well-defined sets, it is necessary to restrict set constructions to a limited set of set-construction rules. One such rule-set is the ZF set of axioms which are outlined in this section. (Set construction techniques are very succinctly explained in EDM2 [34], section 381.G.) Bernays-Gödel (BG) set theory gets around the problem of ill-defined sets (Russell’s paradox) by introducing a second kind of set called a “class”. This extends ZF set theory. (See EDM2 [34], section 33.C, for BG set theory.)

5.1.7 Definition: A subset of a set B is a set A such that ∀x, (x ∈ A ⇒ x ∈ B). A superset of a set A is a set B such that ∀x, (x ∈ A ⇒ x ∈ B).

5.1.8 Notation: A ⊆ B means that A is a subset of B. A ⊇ B means that A is a superset of B.

5.1.9 Remark: The notation ⊂ meaning ⊆ is avoided in this book because although ⊂ mostly means the same thing as ⊆, it very often means strict inclusion. Similarly the notation ⊃ is avoided.

5.1.10 Notation: x ≠ y means ¬(x = y) for any sets x and y.

5.1.11 Definition: A proper subset of a set B is a set A such that A ⊆ B and A ≠ B. A proper superset of a set A is a set B such that A ⊆ B and A ≠ B.

5.1.12 Notation: A ⊊ B means that set A is a proper subset of set B. A ⊋ B means that set A is a proper superset of set B.

[ Consider using some of the msbm symbols. ]

5.1.13 Remark: To be safe, one should use the clumsy-looking strict inclusion notations ⊊ and ⊋ respectively for the proper subset and superset relations. The clumsiness does not matter because these strict inclusion relations are almost never needed. It goes perhaps without saying that the relations ⊊ and ⊋ must not be confused with the relations ⊈ and ⊉ respectively.
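For finite sets, the subset and proper subset relations of Definitions 5.1.7 and 5.1.11 can be written directly from their defining formulas. The following sketch uses Python frozensets as a stand-in for sets; the function names and the sample sets A, B, C are assumptions made for illustration. It also shows concretely why a proper subset relation must not be confused with the negation of the subset relation, as cautioned in Remark 5.1.13.

```python
def is_subset(a: frozenset, b: frozenset) -> bool:
    """A ⊆ B: ∀x, (x ∈ A ⇒ x ∈ B)."""
    return all(x in b for x in a)


def is_superset(b: frozenset, a: frozenset) -> bool:
    """B ⊇ A: the same condition read from the other side."""
    return is_subset(a, b)


def is_proper_subset(a: frozenset, b: frozenset) -> bool:
    """A is a proper subset of B: A ⊆ B and A ≠ B."""
    return is_subset(a, b) and a != b


# Hypothetical sample sets.
A = frozenset({1, 2})
B = frozenset({1, 2, 3})
C = frozenset({4})

# A is a proper subset of B.
print(is_proper_subset(A, B))

# B is a subset of itself, but not a proper subset of itself.
print(is_subset(B, B), is_proper_subset(B, B))

# C is not a subset of B, and it is not a proper subset of B either:
# "not a subset" and "proper subset" are different relations.
print(not is_subset(C, B), is_proper_subset(C, B))

# The hand-written relations agree with Python's built-in operators.
print(is_subset(A, B) == (A <= B), is_proper_subset(A, B) == (A < B))
```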
5.1.14 Notation: A proposition of the form ∀x ∈ S, P(x) for a set S and set-theoretic formula P means ∀x, (x ∈ S ⇒ P(x)). A proposition of the form ∃x ∈ S, P(x) for a set S and set-theoretic formula P means ∃x, (x ∈ S ∧ P(x)).

5.1.15 Remark: It is curious that the meanings of the proposition forms ∀x ∈ S, P(x) and ∃x ∈ S, P(x) have different forms. The former uses “⇒” while the latter uses “∧”. However, this difference ensures that ∃x ∈ S, P(x) is equivalent to ¬(∀x ∈ S, ¬P(x)). This matches the definition of the existential quantifier in Remark 4.13.2. (For proof, see Exercise 47.2.1.) The proposition ∀x, (x ∈ S ⇒ P(x)) is equivalent to ∀x, (x ∉ S ∨ P(x)).

5.1.16 Remark: The definition of a ZF set theory is presented as a set of axioms in Definition 5.1.26 because sets are a basic concept in this book, not derived from other concepts. There is no canonical representation of the system of sets in terms of other systems. Each mathematician is expected to have their own representation of sets; the ZF axioms just provide a set of tests to ensure that all mathematicians are discussing an equivalent system when they discuss sets. A set of axioms such as ZF may be thought of as a set of regulations for an outsourced mathematical system; a conformant system is given a certificate of compliance if it is proved to comply with the axioms. The choice of representation is a mere implementation detail to be decided by the supplier.

Humans may represent sets on paper as lists of symbols between braces, or as regions of a diagram, or in many other ways. Humans usually have some sort of set representation in their own minds also. Computers may represent sets as bit patterns in electrical or magnetic storage. As long as such representations satisfy the axioms, they constitute a valid set theory which must be isomorphic in some sense to any other representation. For example, every ZF set theory representation must have a unique empty set A and sets B = {A}, C = {A, B} etc. In any other representation, the empty set A′ and sets B′ = {A′}, C′ = {A′, B′} etc. must have the same properties A′ ∪ C′ = C′, B′ ∩ C′ = B′ etc. as in the first representation. It should be impossible for one representation to have sets which another does not have because all of the elements of sets are required to be sets themselves.

[ Is this really true? Is it an established fact that any two systems satisfying the ZF axioms are necessarily co-extensive? ]

5.1.17 Remark: Definition 5.1.26 specifies the properties of the membership relation between sets. It also specifies construction methods which yield new sets from given sets. So it provides a combination of membership relation properties and set existence rules. The set existence axioms 2, 3, 4, 5, 6 and 8 guarantee the existence of various kinds of sets. Axioms 1 and 7 specify constraints which limit set theory to safe ground. All of the ZF set existence axioms guarantee the existence of sets which are constructed with the aid of pre-existing sets except for Axioms 2 and 8, which make sets out of nothing. The only non-independent (i.e. redundant) ZF axiom is Axiom 2.

[ There might be a second redundant set theory axiom. Check this. ]

The productive axioms 2, 3, 4, 5, 6 and 8 may be thought of as defining lower bounds on the sets which may exist in ZF set theory. The restrictive axioms 1 and 7 may be thought of as defining upper bounds on set existence.

5.1.18 Remark: ZF set existence axioms 2, 3, 4, 5 and 6 all specify sets in terms of a set-theoretic formula. Therefore by the extension axiom (1), these sets are uniquely defined. The combination of existence and uniqueness is, of course, a pre-requisite for giving a specific name or notation to a set. Axiom 2 (empty set existence) is equivalent to ∃X, ∀z, (z ∈ X ⇔ ⊥(z)), where ⊥ denotes the always-false predicate. (See Notation 4.12.10.)
Therefore all of the ZF set existence axioms yield unique, specified sets. This contrasts with the axiom of choice, which merely claims that a specified class of sets is non-empty. This is why the axiom of choice is so useless. (See also Remark 5.9.2.)

5.1.19 Remark: In a sense, all of the productive ZF axioms mentioned in Remark 5.1.17 make sets bigger with the exception of Axiom 6, which guarantees that you can prune back a set more or less as you wish, using an arbitrary set-theoretic formula. In particular, it allows you to prune back any set to the empty set. Since the infinity axiom (8) guarantees the existence of at least one set, it follows that an empty set exists. In other words, axiom 2 follows from axioms 6 and 8.

5.1.20 Remark: The set membership symbol ∈ is presumably derived from the first letter of the word “element”. This symbol must not be confused with ϵ (epsilon) or ε (variant epsilon).

5.1.21 Remark: Definition 5.1.22 arises from the antediluvian soup of intuitive set theory which is the basis of formal logic. Since a “formula” is not defined here, and nor are variables, logical connectives and relations, a “set-theoretic formula” is not very well defined. (Mendelson [164], page 164, uses the term “predicative well-formed formula”.)

5.1.22 Definition: A set-theoretic formula is a formula which contains only variables, logical connectives and the predicate symbol “∈” (set membership).

5.1.23 Remark: Definition 5.1.24 classifies variables as “bound” or “free”. Essentially a bound variable has local scope as a parameter of a quantifier whereas a free variable has global scope. This implies that a variable may be a free variable for a sub-formula of a formula and simultaneously a bound variable within the full formula. In computer software, a bound variable would be called a “dummy variable”.

[ Definition 5.1.24 needs to be improved a lot. ]

5.1.24 Definition: A bound variable in a set-theoretic formula is a variable x which is within the scope of a universal quantifier “∀x” or existential quantifier “∃x”. A free variable in a set-theoretic formula is a variable which is not a bound variable.
5.1.25 Remark: The function of a free variable in a logical expression is to permit substitution of values from a variable space. Thus a logical expression which contains free variables may be regarded as a template which generates a different particular proposition for each combination of values substituted for the free variables in the template expression. The function of a bound (or dummy) variable is to link two points in a logical expression. It is not permitted to substitute particular values for a dummy variable as one may do for free variables.

The ZF set theory axioms in Definition 5.1.26 are proposition templates. Concrete individuals of the indicated type may be substituted for any of the free variables. All of the quantifiers refer to the set of individuals, not predicates or functions. All individuals are sets. So all quantifiers refer to sets. The equality relation “=” in Definition 5.1.26 is a two-parameter predicate which is imported from the concrete predicate domain. (See Remark 4.12.3 for concrete predicates.)

5.1.26 Definition: The axioms of a Zermelo-Fraenkel set theory are as follows.

(1) The extension axiom: For any sets A and B,
    (∀x, (x ∈ A ⇔ x ∈ B)) ⇒ (A = B).

(2) The empty set axiom:
    ∃A, ∀x, x ∉ A.

(3) The unordered pair axiom: For any sets A and B,
    ∃C, ∀z, (z ∈ C ⇔ (z = A ∨ z = B)).

(4) The union axiom: For any set K,
    ∃X, ∀z, (z ∈ X ⇔ ∃S, (z ∈ S ∧ S ∈ K)).

(5) The power set axiom: For any set X,
    ∃P, ∀A, (A ∈ P ⇔ ∀x, (x ∈ A ⇒ x ∈ X)).

(6) The replacement axiom: For any set-theoretic formula f and set A,
    (∀x, ∀y, ∀z, ((f(x, y) ∧ f(x, z)) ⇒ y = z)) ⇒ ∃B, ∀y, (y ∈ B ⇔ ∃x, (x ∈ A ∧ f(x, y))).

(7) The regularity axiom: For any set-theoretic formula P,
    (∃x, P(x)) ⇒ ∃z, (P(z) ∧ ∀y ∈ z, (¬P(y))).

(8) The infinity axiom:
    ∃X, ∀z, (z ∈ X ⇔ ((∀u, u ∉ z) ∨ ∃y, (y ∈ X ∧ ∀v, (v ∈ z ⇔ (v ∈ y ∨ v = y))))).

[ Shoenfield [168], page 239, has a different form of axiom (7), which seems a little simpler. But it may start from different assumptions. Check this. ]
5.1.27 Remark: The following is a compact summary of the ZF set theory axioms. The variables on the left indicate what kind of free variable or free predicate is to be used in the axiom schema on the right. The word “formula” on the left means “set-theoretic formula”.

(1) sets A, B: (∀x, (x ∈ A ⇔ x ∈ B)) ⇒ (A = B)
(2) ∃A, ∀x, x ∉ A
(3) sets A, B: ∃C, ∀z, (z ∈ C ⇔ (z = A ∨ z = B))
(4) set K: ∃X, ∀z, (z ∈ X ⇔ ∃S, (z ∈ S ∧ S ∈ K))
(5) set X: ∃P, ∀A, (A ∈ P ⇔ ∀x, (x ∈ A ⇒ x ∈ X))
(6) formula f, set A: (∀x, ∀y, ∀z, ((f(x, y) ∧ f(x, z)) ⇒ y = z)) ⇒ ∃B, ∀y, (y ∈ B ⇔ ∃x, (x ∈ A ∧ f(x, y)))
(7) formula P: (∃x, P(x)) ⇒ ∃z, (P(z) ∧ ∀y ∈ z, (¬P(y)))
(8) ∃X, ∀z, (z ∈ X ⇔ ((∀u, u ∉ z) ∨ ∃y, (y ∈ X ∧ ∀v, (v ∈ z ⇔ (v ∈ y ∨ v = y))))).

It is quite awesome that the entire basis of set theory can be written in 8 lines. The rest of mathematics is just definitions, theorems, notations and remarks.
5.2. The ZF extension axiom
165
[ Refer to the set construction “stages” in Shoenfield [168], pages 238–240. ]

5.1.28 Remark: The following is a high-level interpretation of the compact summary of the ZF set theory axioms in Remark 5.1.27. The variables on the left indicate what kind of free variable or free predicate is to be used in the axiom schema on the right. The word “formula” on the left means “set-theoretic formula”.

(1) sets A, B:          (A ⊆ B ∧ B ⊆ A) ⇒ (A = B)
(2)                     ∅ is a set
(3) sets A, B:          {A, B} is a set
(4) set K:              ⋃K is a set
(5) set X:              ℙ(X) is a set
(6) formula f, set A:   f is a function ⇒ f(A) is a set
(7) formula P:          (∃x, P(x)) ⇒ ∃z, (P(z) ∧ ∀y ∈ z, (¬P(y)))
(8)                     ω is a set.
5.2. The ZF extension axiom

5.2.1 Remark: The extension axiom, Definition 5.1.26 (1), is also known as the axiom of extensionality. This axiom may be thought of as the proposition (A ⊆ B ∧ B ⊆ A) ⇒ A = B. That is, if the sets referred to by two abstract set names A and B have the same membership relations “to the left” with all other abstract set names, then the abstract set names must refer to the same concrete object. And if two names refer to the same object, all attributes of that object must be identical. (See Figure 5.2.1.)

[Figure 5.2.1: ZF extension axiom: (∀x, (x ∈ A ⇔ x ∈ B)) ⇒ A = B]
In other words, the only property of a set is what it contains. A set is completely determined by its contents. Consequently if ∀x, (x ∈ A ⇔ x ∈ B) then ∀x, (A ∈ x ⇔ B ∈ x), which means that A and B have the same set membership relations “on the right” if they have the same membership relations “on the left”.

[ In terms of the proposition naming framework in Section 4.1, A = B means that A and B refer to the same object in the concrete proposition domain, since the abstract equality relation is imported from the concrete logic domain. This is mentioned elsewhere already, but perhaps there should be a diagram of the import process near here. ]

5.2.2 Remark: In the presentation of NBG (Neumann-Bernays-Gödel) set theory by Mendelson [164], pages 159–170, the extension axiom is defined differently to Definition 5.1.26 (1). This is because in a purely linguistic, abstract symbolic logic, equality needs to be explicitly defined since the symbols are just symbols which do not refer to anything concrete. In fully abstract logic, the equality relation “=” may be defined in terms of the membership relation “∈” by defining A = B to mean ∀x, (x ∈ A ⇔ x ∈ B). In terms of the defined equality relation “=”, the extension axiom then requires this relation to have the following relation to the set membership relation “on the right”. (See Mendelson [164], page 161.)

    A = B ⇒ ∀x, (A ∈ x ⇔ B ∈ x).

This axiom looks similar to Axiom (1) but is quite different. It means that two sets which are equal are contained in the same sets “on the right”. In other words, if the ∈-relations on the left are the same, then the ∈-relations on the right are the same, in which case all ∈-relations of A and B are the same. So it is reasonable
to think of the two labels A and B as referring to the same object. But set objects are outside the scope of symbolic logic, which only defines the truth and falsity of symbolic expressions.

Maybe this is the difference between “naive” set theory and “real” set theory. In a naive set theory, the concept of set equality is taken for granted, and it is automatically assumed that two symbols which refer to a single set must have the same membership relations, whereas in dinkum set theory, there is no concept of symbols referring to objects outside the symbolic language. Another way to think of this is that in naive set theory, the labels are just names which may refer to the same object, whereas in symbolic algebra, the names are distinguishing attributes of objects; hence two objects may be equal but not the same! This is a bit like two hydrogen atoms being equal but not the same because they have a different location.

In many practical situations in mathematics, it is necessary to think of two copies of the same set as being different objects if they have different labels, for example in the case of a function f between two sets A and B: if A and B happen to be equal as sets, we don’t necessarily want to admit f as a member of the set of maps from A to A. The fact that two sets being used in different parts of a system happen to have the same members shouldn’t necessarily make them the same in all respects. A clearer example is the fact that the right translation group of a group G and the left translation group of G are represented by identical sets although they are different kinds of structure.

It is a philosophical question as to whether two things which are equal in all respects are necessarily the same object. A lot of time and energy can be wasted on this question if sufficient tea and coffee are provided.

5.2.3 Remark: Two ways of introducing the extension axiom are summarized in the following table.

    proposition                               modelling    linguistic
1.  (A = B) ⇒ ∀z, (z ∈ A ⇔ z ∈ B)             FOL + EQ     definition
2.  (A = B) ⇒ ∀z, (A ∈ z ⇔ B ∈ z)             FOL + EQ     axiom
3.  ∀z, (z ∈ A ⇔ z ∈ B) ⇒ (A = B)             axiom        definition
4.  ∀z, (A ∈ z ⇔ B ∈ z) ⇒ (A = B)             theorem      theorem
The modelling approach to logic is adopted in this book and also by Shoenfield [168], pages 238–240, and by EDM2 [34], 33.B, pages 147–148. In this approach, it is assumed that the names in the variable name space N_V of the language refer to objects in a concrete variable space V. (In the case of set theory, this means that the names of sets refer to concrete sets in some externally defined space.) An equality relation is assumed to be defined already on the concrete variable space, and this relation is imported into the abstract space via the variable name map μ_V : N_V → V.
Therefore in the modelling approach, the equality relation “=” is defined as an import of the concrete equality relation. Since it is assumed, furthermore, that the membership relation “∈” is also imported from a concrete membership relation, and the concrete relation is assumed to be well-defined for the concrete variables, it automatically follows that in the abstract name space, (A = B) ⇒ (z ∈ A ⇔ z ∈ B) and (A = B) ⇒ (A ∈ z ⇔ B ∈ z). In fact, B may be substituted for any instance of A like this without changing the truth value of the proposition. That is, (A = B) ⇒ (F(A) ⇔ F(B)) for any predicate F. (This is called the “substitutivity of equality” axiom. See Remark 4.15.1.) So lines (1) and (2) in the above table follow immediately. (“FOL + EQ” abbreviates “first order language with equality”.) In this case, line (3), ∀z, (z ∈ A ⇔ z ∈ B) ⇒ (A = B), is specified as an axiom of extension. This is required because it is not a-priori obvious that the concrete membership relation would be related to the concrete equality relation in this way.

The language approach in the right column of the table is used by Mendelson [164], pages 159–170. In this approach, the equality relation is not imported from a concrete space. In this case, the equality relation needs to be defined in terms of the membership relation, which is the only relation which is imported from the concrete space. In this approach, A = B is defined to mean ∀z, (z ∈ A ⇔ z ∈ B). This yields lines (1) and (3) in the table. In the language approach, since the equality relation is no more than a definition in terms of the membership relation, there is no guarantee at all that this abstract equality of A and B implies that they both refer to the same concrete object, which therefore would have all attributes identical. Therefore this “substitutivity of equality” property must be specified as an axiom of extension. From this, it then follows as a theorem that
this language is a first order language with equality. (See Mendelson [164], Proposition 4.2.) By contrast, the modelling approach assumes a-priori that set theory is a first order language with equality. This explains why line (3) is an axiom in the modelling approach whereas it is given by a definition in the language approach, and the reverse is true for line (1). Line (4) follows in both approaches from Theorem 5.2.4.

5.2.4 Theorem: (∀z, (A ∈ z ⇔ B ∈ z)) ⇒ (A = B).

Proof: Suppose that A and B are (names of) sets such that ∀z, (A ∈ z ⇔ B ∈ z). Let z = {A}. (This is a set by the unordered pair axiom.) Then A ∈ z. Therefore B ∈ z. So B = A.
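As an informal illustration only, the extension axiom and the proof of Theorem 5.2.4 can be replayed on hereditarily finite sets modelled as nested Python frozensets. This modelling is an assumption made for the sketch, not part of the formal development: Python's built-in equality stands in for the imported relation “=”, and the helper names `same_members` and `singleton` are hypothetical.

```python
# Hereditarily finite sets modelled as nested frozensets (an illustrative
# assumption; Python's == plays the role of the imported equality relation).
EMPTY = frozenset()


def same_members(a, b, universe):
    """Check (x in a) <=> (x in b) for every x in a finite universe."""
    return all((x in a) == (x in b) for x in universe)


def singleton(a):
    """The set {a}, i.e. the unordered pair {a, a}."""
    return frozenset([a])


A = frozenset([EMPTY, singleton(EMPTY)])
B = frozenset([singleton(EMPTY), EMPTY, EMPTY])  # same members, listed differently
universe = A | B

# Line (3) of the table: same members "on the left" implies equality.
extension_holds = (not same_members(A, B, universe)) or (A == B)

# Theorem 5.2.4: with z = {A}, the condition B in z forces B = A.
z = singleton(A)
theorem_holds = (B not in z) or (B == A)
```

The order and multiplicity in which members are listed play no role, which is the sense in which a set is determined by its contents.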
5.2.5 Remark: When an a-priori assumption of a first order language with equality is made, a very general axiom of “substitution of equality” is adopted. (See also Remark 5.2.3.) This may be written as follows.

    ∀x, ∀y, ((x = y) ⇒ (α(x, x) ⇒ α(x, y))).

This means that a variable x in any wff α may be substituted with y in all, some or none of the places where x occurs in the formula. (See Mendelson [164], page 75.)

5.2.6 Remark: It should not be forgotten that the extension axiom, like all other axioms, is not intended to be “true”. The fact that this axiom is adopted in ZF set theory is merely intended to focus the mind on those systems which do satisfy the axiom. Thus ZF set theory restricts the focus to those systems which have the special property that the identity of each object is determined by its members. That is, the identity of an object x is determined by the objects y such that y ∈ x. The members of a set determine its identity in a one-to-one fashion. The reverse implication (∀x, (x ∈ A ⇔ x ∈ B)) ⇐ (A = B) follows from the “substitution of equality” property which is required for a first order language with equality.

Consequently, ZF set theory deals very specifically with collections of objects. These collections have no distinguishing attributes other than their membership relations “on the left”. One could almost say that the extension axiom is the only axiom which is required in set theory. This axiom on its own encapsulates the essential nature of sets. The other ZF axioms are required to force the system of sets to have at least a minimum of useful sets (axioms 2, 3, 4, 5, 6 and 8), and to not have sets which could cause trouble (axiom 7). (See also Remark 5.1.17 for related comments.)

It would be possible to develop a set theory in which the sets have other attributes in addition to left-side membership relations. However, such a theory would almost certainly be easier to develop by extending ZF-style set theory with the addition of extra attributes to sets than by developing a new theory from scratch. For example, Section 2.6 discusses the desirability of attaching a “class tag” to each set when using set constructions to represent classes of objects. (Class tags are also discussed in Section 5.16.) Then the empty set in one class could be distinguished from the empty set in another class, for example, and complex numbers (x, y) ∈ ℂ can be distinguished from real number tuples (x, y) ∈ ℝ². Such a system is easily developed in terms of ordered pairs (C, X), where C is an element of a set of class tags and X is a ZF set.

5.2.7 Remark: As a historical note, the issues discussed in Remark 5.2.1 caused the author to spend a lot of time reading about what the symbols in logic books really mean. Like many, many months. It all made sense finally, and this required a major rewrite of much of the fundamental logic and set theory material. When the rewrite is done, the meaning of the logic and set theory will hopefully be clear in the minds of readers also!

5.3. The ZF empty set, pair, union and power set axioms

5.3.1 Remark: The remarks in this section are comments on the individual ZF set theory axioms in Definition 5.1.26. It is important to get comfortable with the axioms, at least to some extent, before proceeding to the systematic development which starts in Section 5.8. The commentary on Russell’s paradox in Section 5.7 refers principally to the regularity axiom (7). This is given its own section because it is such a big issue. The ZF extension axiom (1) is discussed in Section 5.2. The ZF replacement axiom (6) is discussed in Section 5.4. The ZF regularity axiom (7) is discussed in Section 5.5. The ZF infinity axiom (8) is discussed in Section 5.6.
5.3.2 Remark: empty set axiom (2)
The empty set axiom states that there exists an empty set. It follows from Axiom (1) that the empty set is unique. The empty set is denoted as ∅. (See Definition 5.8.3 and Notation 5.8.4 for the empty set.)

5.3.3 Remark: unordered pair axiom (3)
A notation for the set C in the unordered pair axiom is {A, B}. By setting B = A, this axiom also implies the existence of {A}. Any set with one and only one element is called a “singleton”. (See Definition 5.8.9 for singletons.)

5.3.4 Remark: union axiom (4)
The union axiom means that ∀K, ∃X, X = {z; ∃S ∈ K, z ∈ S}. (See Figure 5.3.1.) The notation for this set X is ⋃_{S∈K} S or just ⋃K.

[Figure 5.3.1: ZF union axiom for a general collection of sets: ∃X, ∀z, (z ∈ X ⇔ ∃S, (z ∈ S ∧ S ∈ K))]

If K = {A, B} for sets A and B, the set X = ⋃_{S∈K} S may be denoted as A ∪ B. (See Figure 5.3.2 for the special case of the union of two sets.)

[Figure 5.3.2: ZF union axiom for a collection of two sets: ∃X, ∀z, (z ∈ X ⇔ (z ∈ A ∨ z ∈ B)), where X = A ∪ B]
Together with Axiom (3), this axiom implies that 3-member sets {x, y, z} are well defined by considering K = {{x, y}, {y, z}}. Sets containing any finite number of given elements are similarly well defined. (It might seem that Axiom (3) could be replaced with a weaker “singleton axiom” guaranteeing the existence of the set {x} for any x, which could then be combined with the union axiom to construct pairs {x, y}. But Axiom (3) is not stronger than necessary, because this construction would require K = {{x}, {y}}, which is itself a pair!)

5.3.5 Remark: power set axiom (5)
The power set axiom may be written as ∃P, ∀A, (A ∈ P ⇔ A ⊆ X). That is, P = {A; A ⊆ X} is a set for any given set X. The power set P = {A; A ⊆ X} for any set X will be denoted ℙ(X). (See Notation 5.8.19.)
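The constructions guaranteed by axioms (3), (4) and (5) can be tried out on small examples. The following is a minimal sketch which assumes that hereditarily finite sets are modelled as nested Python frozensets; the function names `pair`, `union` and `power_set` are hypothetical helpers chosen for this illustration, and the sketch says nothing about infinite sets.

```python
from itertools import combinations

EMPTY = frozenset()


def pair(a, b):
    """Unordered pair axiom (3): the set {a, b}."""
    return frozenset([a, b])


def union(k):
    """Union axiom (4): the set of all z with z in S for some S in k."""
    return frozenset(z for s in k for z in s)


def power_set(x):
    """Power set axiom (5): the set of all subsets of x."""
    elems = list(x)
    return frozenset(
        frozenset(c)
        for r in range(len(elems) + 1)
        for c in combinations(elems, r)
    )


# Three distinct sets to use as elements.
x = EMPTY
y = pair(EMPTY, EMPTY)            # the singleton {x}
z = pair(EMPTY, y)                # the pair {x, y}

# The 3-member set {x, y, z} via K = {{x, y}, {y, z}}, as in Remark 5.3.4.
three = union(pair(pair(x, y), pair(y, z)))

# The two-set union A u B with A = {x, y} and B = {z}.
a_union_b = union(pair(pair(x, y), pair(z, z)))

subsets = power_set(three)
```

Here `pair(a, a)` yields the singleton {a}, and the 3-member set is built from pairs and a union exactly as described in Remark 5.3.4.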
5.4. The ZF replacement axiom

This section contains comments on the ZF replacement axiom, Definition 5.1.26 (6).

5.4.1 Remark: The replacement axiom says effectively that for any set-theoretic formula g and any set A, there exists a set B such that B = {g(x); x ∈ A}. This set may be denoted by g(A), although this is not actually unambiguous and the meaning of this notation must always be determined from the context. The axiom as stated allows for the possibility that the function g may only be partially defined on A, which may explain why a two-parameter formula f is used instead of a single-parameter formula g.

The set B whose existence is guaranteed by the replacement axiom for a pre-existing set A and a function-like rule f is not required to be an element of A or any other pre-existing set. The rule f may be thought of as mapping elements of A to sets constructed according to rule f outside A, which are then gathered together in a new set B. (See Figure 5.4.1.)

[Figure 5.4.1: ZF replacement axiom: the rule f maps elements x of A to elements y of B]

If the rule f in Axiom (6) associates elements of A with elements of A, and if for all x ∈ A the predicate f(x, y) is either false for all y or else true for only y = x, then f becomes a simple yes/no function on A. In this case Axiom (6) guarantees the existence of the subset of A which is defined by any single-parameter set-theoretic formula g. This special case is called the specification axiom or the axiom of subsets.
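The relationship between replacement and its special case, specification, can be sketched on finite examples. The sketch below assumes that sets are modelled as nested Python frozensets and that the function-like two-parameter formula f is encoded as a Python function `g` which returns `None` wherever no y satisfies f(x, y). Both this encoding and the names `replacement`, `specification` and `successor` are assumptions made for illustration.

```python
EMPTY = frozenset()


def replacement(a, g):
    """Image set {g(x); x in a}, skipping x where the partial rule g is undefined.

    g returns None where no y satisfies f(x, y); this encodes the
    function-like two-parameter formula f of Axiom (6).
    """
    return frozenset(y for y in map(g, a) if y is not None)


def specification(a, p):
    """Subset {x in a; p(x)}, as the special case g(x) = x if p(x) holds."""
    return replacement(a, lambda x: x if p(x) else None)


def successor(x):
    """The set x u {x}, used here only to build a few sample sets."""
    return x | frozenset([x])


zero = EMPTY
one = successor(zero)
two = successor(one)
A = frozenset([zero, one, two])

# Replacement with the rule x -> {x}: the image need not lie inside A.
image = replacement(A, lambda x: frozenset([x]))

# Specification with the yes/no rule "x is non-empty": always a subset of A.
nonempty = specification(A, lambda x: len(x) > 0)
```

In this example the image contains {1} and {2}, which are not elements of A, whereas the specified set {1, 2} is necessarily a subset of A.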
5.4.2 Remark: If the replacement axiom in ZF set theory is replaced by the slightly weaker specification axiom (6′) in Definition 5.11.1, the resulting set of axioms is called Zermelo set theory. Any theorem which is true in Zermelo set theory is also true in ZF set theory. (See Section 5.11 for Zermelo set theory.)

5.4.3 Remark: The replacement axiom (6) may be replaced by the combination of the separation axiom with a more general replacement axiom. (See EDM2 [34], 33.B, page 147.) The axiom of separation (or of specification, or of comprehension, or of subsets) is as follows, for any set X and predicate P.

    ∃Y, ∀z, (z ∈ Y ⇔ (z ∈ X ∧ P(z))).    (5.4.1)

A more general axiom of replacement is as follows for sets X and predicates R.

    ∃Y, ∀x, ((x ∈ X ∧ ∃a, R(x, a)) ⇒ ∃b, (b ∈ Y ∧ R(x, b))).    (5.4.2)

It is an interesting exercise to show that the combination of the two axioms (5.4.1) and (5.4.2) implies the single ZF replacement axiom. (See Exercise 47.2.2.) It is not difficult to show that the ZF replacement axiom implies (5.4.1). (See Exercise 47.2.3.) However, it seems that the ZF replacement axiom does not imply (5.4.2) unless it is assisted by the axiom of choice.

5.5. The ZF regularity axiom

This section contains comments on the ZF regularity axiom, Definition 5.1.26 (7).

5.5.1 Remark: The regularity axiom is also known as the “axiom of foundation”. The regularity axiom says that if P(x) is true for some x, then there exists a set z such that P(z) is true but P(y) is false for all y ∈ z. In other words, for any predicate P which is not the always-false predicate, there exists a set z for which P(z) is true but none of the elements y of z satisfy P(y). This would imply that a sequence of set memberships must terminate.
For a non-empty set V, define the predicate P by P(x) = “x ∈ V” for sets x. Then the regularity axiom implies ∃z ∈ V, ∀y ∈ z, y ∉ V (or equivalently ∃z ∈ V, ∀y ∈ V, y ∉ z, or ∃z ∈ V, z ∩ V = ∅). In other words, for some z in V, none of the elements of z are in V. The negation of this would be ∀z ∈ V, ∃y ∈ z, y ∈ V. But this would mean that every element z₀ of V contains at least one element z₁ of V, which in turn contains at least one element z₂ of V, and so forth, which would never arrive at a set containing no elements of V. (See Figure 5.5.1.) By taking V to be the set of all terms of such a sequence, the axiom implies that any chain . . . ∈ zₙ ∈ . . . ∈ z₂ ∈ z₁ ∈ z₀ must terminate on the left at some n.

[Figure 5.5.1: Infinite chain of set memberships “on the left”]
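For finite collections, the ∈-minimal element z whose existence is asserted by the regularity axiom can be found by direct search. The sketch below assumes that sets are modelled as nested Python frozensets; the names `minimal_element` and `successor` are hypothetical. Nested frozensets built in this way can never contain themselves, so the sketch only illustrates what the axiom asserts and is not a test of the axiom.

```python
EMPTY = frozenset()


def minimal_element(v):
    """Return some z in v with no member of z in v, or None if there is none."""
    for z in v:
        if not (z & v):
            return z
    return None


def successor(x):
    """The set x u {x}, used here only to build a few sample sets."""
    return x | frozenset([x])


zero = EMPTY
one = successor(zero)
two = successor(one)
three = successor(two)

# A non-empty set V which does not contain 0.
V = frozenset([one, two, three])

# The only member of 1 is 0, which is not in V, so z = 1 is the witness.
z = minimal_element(V)
```

The elements 2 and 3 are not suitable witnesses here because each of them contains 1, which is an element of V.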
It follows that the proposition x ∈ x is false for all sets x. This in turn prevents Russell’s paradox from happening because a set U satisfying ∀x, x ∈ U would contradict Axiom (7), as would also a set Q satisfying ∀x, (x ∈ Q ⇔ x ∉ x), because clearly Q = U since x ∉ x for all sets x. So Russell’s paradox is resolved by forbidding the kinds of sets which cause the embarrassment. (Bernays-Gödel set theory in Section 5.12 gets around the paradox by demoting problematic “sets” such as U and Q to mere “classes” while accepting all of the sets in ZF theory as fully certified sets.)

5.5.2 Remark: Although the regularity axiom forbids an infinite sequence of set memberships on the left, an infinite sequence of such set memberships can be obtained as the union over all “leftward paths”. For example, the set ⋃ω is the union of all second-order members of the set ω. Then ⋃ω = ω. (This is illustrated in Figure 5.5.2. The set ⋃ω equals the set of all numbers in the second row below ω.)

[Figure 5.5.2: Set membership tree for the set of finite ordinal numbers ω = {0, 1, 2, 3, . . . , n, . . .}]
So ⋃⋃ . . . ⋃ω = ω. This only occurs because a union is taken over multiple membership paths. Any particular path downwards from ω in Figure 5.5.2 terminates after a finite number of steps, although this number of steps is unbounded. (This is a clear example of the difference between the words “infinite” and “unbounded”.)

5.5.3 Remark: Although the regularity axiom prevents an infinite chain of set memberships “to the left”, there is nothing to stop an infinite chain of set memberships “to the right”. In fact, the set of ordinal numbers is defined as such an infinite sequence, whose existence is guaranteed by Axiom (8). (See Exercise 47.2.4.) Thus line (5.5.1) is okay.

    x ∈ {x} ∈ {{x}} ∈ {{{x}}} ∈ . . .    (5.5.1)

But line (5.5.2) is not okay.

    . . . ∈ U ∈ U ∈ U ∈ U ∈ U.    (5.5.2)

One might ask why there is an asymmetry here. One way to look at this is to consider that the nature and meaning of a set is determined by what it contains, not by what it is contained in. Thus to determine the nature of a set, one follows the membership relation network to the left. By looking at the members, and the members of the members, and so forth, one should be able to find an end-point of any traversal along
a leftwards path in the membership relation network within a finite number of steps. The regularity axiom guarantees that this is so. Therefore the nature of all sets may be determined in this way. One might ask also why the nature of a set is not determined by the sets which it is contained in. That would seem to not be totally unnatural. However, this approach would require a very large number of top-level sets to be given meaning outside pure set theory. In the ZF approach to set theory, all membership relation traversals to the left ultimately end with the empty set. Therefore one only needs to give meaning to one set, namely the empty set. This does not seem to be very onerous. It is not at all clear how one could start with a single universe set and give meaning to its members, and members of members, and so forth. So the short answer to the question of why the regularity axiom is needed is that it provides certainty that all sets have determinable content. [ Find out if the “pure sets” requirement in Remark 5.5.4 does in fact imply satisfaction of the regularity axiom. See Shoenfield [168], pages 238–240, for set construction “stages”. ] 5.5.4 Remark: If all sets in ZF set theory are “pure sets”, i.e. all sets are built up in a finite number of “stages” from the empty set ∅ (axiom 2) and the finite ordinals set ω (axiom 8), it seems intuitively clear that the ZF regularity axiom should hold automatically. Each of the set construction axioms 3, 4, 5 and 6 seems to build sets which obey the regularity axiom if the source sets out of which they are built obey the regularity axiom.
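The intuition in Remark 5.5.4 can be explored for finitely many stages. The sketch below is an illustration under stated assumptions only: sets are modelled as nested Python frozensets, the stages are taken to be V₀ = ∅ and Vₙ₊₁ = ℙ(Vₙ) (a standard finite construction which may differ in detail from the stages in Shoenfield [168]), the set ω is not covered, and the names `power_set`, `stage` and `rank` are hypothetical.

```python
from itertools import combinations


def power_set(x):
    """The set of all subsets of the finite set x."""
    elems = list(x)
    return frozenset(
        frozenset(c)
        for r in range(len(elems) + 1)
        for c in combinations(elems, r)
    )


def stage(n):
    """Assumed stages: V_0 is the empty set and V_{n+1} is the power set of V_n."""
    v = frozenset()
    for _ in range(n):
        v = power_set(v)
    return v


def rank(x):
    """Length of the longest leftward membership path starting at x."""
    return 0 if not x else 1 + max(rank(y) for y in x)


V4 = stage(4)
ranks = sorted(rank(x) for x in V4)

# Every membership step to the left strictly lowers the rank, so every
# leftward traversal from a set in V4 ends at the empty set.
rank_decreases = all(rank(y) < rank(x) for x in V4 for y in x)
```

Since the rank is a non-negative integer which strictly decreases along each membership step to the left, every leftward path from a set built in these stages ends at the empty set after finitely many steps.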
5.6. The ZF infinity axiom

5.6.1 Remark: The concept of “infinity” is essentially impossible to resolve. One may write down a set of symbols on paper which purport to refer to something which is infinite. But in all of mathematics, it is difficult to know what any symbols are pointing to, other than thoughts or models in people’s minds, and the contents of those minds are inaccessible other than via writing and speech. As discussed in Sections 2.11 and 2.12, there are very serious problems with all infinite concepts.

In the case of the ZF infinity axiom, even though it may look as if infinity has been objectively defined, in fact a set of axioms is only a particular set of propositions within a logical model. (See Section 3.3 for models.) A model has validity only to the extent that it accurately represents a modelled system. In particular, a universal quantifier “∀z” can only range over an infinite set of values if the concrete system being modelled has an infinite number of objects.

The infinity axiom states that the modelled system has no last element in at least one set. So this axiom does encapsulate the idea that no matter how many elements are in the set, there is always one more, because no element is the last element. That is, every member of the set has a successor which differs from all other members of the set. But this merely states a required property of an infinite set. It does not explicitly list the elements of the set, and obviously that is impossible. By comparison, consider that π has a specified set of properties, but we are never given the complete set of decimal digits.

More than anything, it should not be forgotten that the development of ZF set theory only guarantees that various consequences follow from the axioms if there exists a concrete system which is accurately modelled by the axioms. If there is no such concrete system, then the assumptions (i.e. axioms) of ZF are not satisfied, and therefore there will be no consequences whatsoever!

The infinity axiom differs from the other set existence axioms by giving merely a recurrence relation for a set construction, not giving a more or less explicit list of elements.

[ Maybe Remark 5.6.1 is not 100% correct. But there must be some truth in it. ]

[ Also comment on the equivalent axiom of ∈-induction in Remark 5.6.2? ]

5.6.2 Remark: There are many common variants of the infinity axiom, Definition 5.1.26 (8).

    ∃X, ((∅ ∈ X) ∧ (∀y ∈ X, (y ∪ {y} ∈ X))).    (5.6.1)

    ∃X, ((∃u, u ∈ X) ∧ ∀u, (u ∈ X ⇒ ∃v, (v ∈ X ∧ u ⊆ v ∧ v ≠ u))).    (5.6.2)

    ∃X, (∃y, (y ∈ X ∧ ∀z, z ∉ y) ∧ ∀u, (u ∈ X ⇒ ∃v, (v ∈ X ∧ ∀w, (w ∈ v ⇔ (w ∈ u ∨ w = u))))).    (5.6.3)
Axiom (5.6.1) is given in EDM2 [34], 33.B, axiom (5.6.2) is given in EDM [33], 35.B, and Mendelson [164], page 169, and axiom (5.6.3) is given by Shoenfield [168], page 240. The variants of the infinity axiom differ in more than one feature. (1) Some variants are written in high-level set language, using specific set-construction expressions such as ∅ and y ∪ {y}, while others are written in raw, low-level logic language such as ∃y, ∀z, z ∈ / y. (2) Some variants use expressions like ∀y, (y ∈ z ⇔ F (y)) to specify the precise membership of a set z, whereas others use expressions like ∃y, (y ∈ z ∧ (F (y)) to specify only minimum requirements of set membership.
In terms of such an axiom, one may write the set ω of finite ordinal numbers immediately as the unique set X which satisfies (5.6.4). This is almost too easy! Proposition (5.6.4) may be rewritten in low-level pure predicate logic as follows. ∃X, ∀z, z ∈ X ⇔ ((∀u, u ∈ / z) ∨ ∃y, (y ∈ X ∧ ∀v, (v ∈ z ⇔ (v ∈ y ∨ v = y)))) . (5.6.5)
Proposition (5.6.5) is both specific and low-level. It means precisely that the set ω of finite ordinal numbers is well defined. By contrast, a proposition such as (5.6.2) means that “for some set X, X is infinite”. In other words, there exists at least one set which is infinite. They are effectively equivalent axioms when combined with the other axioms. So one may choose the form of the infinity axiom according to one’s objectives and stylistic preferences. [ Rewrite proposition (5.6.2) in low-level logic. ]
5.6.3 Remark: One might reasonably ask why the very general concept of infinite sets is described in ZF set theory in terms of the von Neumann ordinal number construction, where each element in the set ω is constructed as x ∪ {x} from each preceding set x. The construction of an infinite set X requires the specification of a new set which is different to all preceding sets, for each given subset of X. To be able to state that a set X is infinite, we need to be able to assert that no matter how large any subset Y ⊆ X is, we can always generate a set S(Y ) which is different to all elements of Y . So the general requirement is to find a general construction rule which yields a set S(Y ) for any given set Y , such that S(Y ) ∩ Y = ∅ for any set Y . The construction Y 7→ Y ∪ {Y } happens to satisfy the requirement. To see this, let S(Y ) = Y ∪ {Y } and note that Y ∈ S(Y ), but Y ∈ / Y because this is forbidden by the regularity axiom. Therefore S(Y ) 6= Y , by the “substitution of equality” axiom of a first order language with equality. So each generated set is different to the previous set. We also need to prove that S(Y ) is different to all preceding elements of the sequence of sets. To show this, note that Y ⊆ S(Y ). In fact, by mathematical induction, it is clear that all generated elements of the [ www.topology.org/tex/conc/dg.html ]
[ draft: UTC 2009–3–21 Saturday 11:36 ]
Differential geometry reconstructed.Copyright (C) 2009, Alan U. Kennington. All Rights Reserved.The author hereby grants permission to print this book draft in A4 format.Printing in all other formats is forbidden.You may not charge any fee for copies of this book draft.
When stating the ZF axioms initially, it is generally preferable to use only low-level logical language. Notations such as ∅ and y ∪ {y} should be introduced after all of the axioms have been presented. Therefore in (1), preference should be given to pure logical language, although it is very helpful to explain the axioms later in high-level language. In (2), the style of axiom which is adopted affects the amount of work required to develop the axioms into a useful initial corpus of theorems. If the less specific, weaker set-membership requirement is adopted, an axiom of substitution (or specification) must be applied to convert the set into a useful specific form. If the more specific, stronger set-membership requirement is adopted, the work is easier, but it may be more difficult to convince oneself that the more specific requirement is intuitively justifiable. If one’s aim is to convince a sceptical reader of the reasonableness of an axiom, probably the more general style could be preferable. But when a lot of manipulation is required to strengthen this style of axiom to a useful form, some of the credibility is lost. If the weaker, less specific style has no advantages (such as greater generality, for example), it is generally better to be specific, unless this would make the axiom too long or too abstract to easily understand. A style of infinity axiom which is in high-level set language, and which is fully specific about set membership, is as follows. ∃X, ∀z, z ∈ X ⇔ (z = ∅ ∨ ∃y ∈ X, z = y ∪ {y}) . (5.6.4)
sequence include all previous elements of the sequence. In fact, each element Y of the sequence equals the union of all elements in the sequence up to and including Y. Therefore the proposition Y ∉ Y also implies that S(Y) is different to all of the preceding elements of the sequence. The construction S(Y) = {Y} looks simpler, and it easily guarantees S(Y) ≠ Y (by the regularity axiom). One can show that no two elements of the sequence ∅, {∅}, {{∅}}, . . . are the same by applying the regularity axiom inductively.
[ Provide a much briefer, and more rigorous, proof that the finite ordinals in Remark 5.6.3 are all different. ]
[ Find a convincing reason why one should not define S(Y) = {Y} in Remark 5.6.3. ]
5.6.4 Remark: The inductive rule in the infinity axiom is uncomfortably similar to the inductive rules in Remarks 5.7.14 and 5.7.15 which arise from the study of Russell’s paradox for “universe sets”. One might reasonably ask whether the reasons for excluding universe sets from set theory might also be applicable in the case of infinite sets which are guaranteed to “exist” by the ZF infinity axiom. As mentioned in Remarks 5.5.1 and 5.5.3, infinite chains of set memberships “on the left” are forbidden in order to exclude universe-like sets which lead to Russell’s paradox. No logical contradiction arises from the infinity axiom, but it is difficult to see how an infinite set of sets can be a model for a concrete physical system, or even for a system which exists only in human minds. The infinity axiom is best thought of as providing a boundary-less “arena” in which mathematics can be carried out without ever having to worry about where the bounds lie. For comparison, it is useful to have a model of the physical universe which is infinite in time and space, whether or not this is true. This provides an economy of thought. If the model is boundary-less (which is what the induction principle tells us), then the special boundary case never needs to be considered.
No matter where the real boundary is, we can always ignore it because it’s somewhere “over the horizon”.
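The two candidate construction rules in Remark 5.6.3 can be compared on a small scale with a short Python sketch. It models hereditarily finite sets as nested `frozenset` values, which is an assumption of this illustration, as are the hypothetical helper names `successor`, `singleton` and `iterate`. Only finitely many terms are checked, so the sketch illustrates the claims rather than proving them.

```python
def successor(y):
    """Von Neumann rule S(Y) = Y ∪ {Y}."""
    return y | frozenset({y})


def singleton(y):
    """Alternative rule S(Y) = {Y}."""
    return frozenset({y})


def iterate(rule, n):
    """Return the first n+1 terms of the sequence ∅, S(∅), S(S(∅)), ..."""
    seq = [frozenset()]
    for _ in range(n):
        seq.append(rule(seq[-1]))
    return seq


ordinals = iterate(successor, 6)
nested = iterate(singleton, 6)

# No two terms coincide, for either rule.
assert len(set(ordinals)) == len(ordinals)
assert len(set(nested)) == len(nested)

# Under the von Neumann rule, each term is exactly the set of its
# predecessors, and S(Y) is never an element of Y.
for k, term in enumerate(ordinals):
    assert term == frozenset(ordinals[:k])
    assert successor(term) not in term

# Under the singleton rule, every term after the first has one element.
assert all(len(term) == 1 for term in nested[1:])
```

One observable difference, offered only as an observation: the k-th term under the von Neumann rule has exactly k elements, whereas the singleton rule never produces a set with more than one element.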
5.7. Russell’s paradox
[ The section is somewhat repetitive. It needs to be weeded and compressed. ]
Russell’s paradox is so deep (and unpleasant), it deserves its own section. Maybe it deserves its own book!
5.7.1 Remark: The importance of Russell’s paradox lies in the fact that it is simple, and yet it was not known to be a serious problem by the professionals until 1903. If such a fundamental difficulty with the idea of “the set of all sets” can go unnoticed for such a long time, one must seriously wonder whether there are perhaps other such serious problems still waiting to be found. A second good reason to be worried about Russell’s paradox is the fact that the work-around for this problem is the fairly arbitrary-looking axiom of regularity. (See Section 5.5.) This gets around the problem, but it is not instantly clear that it is a natural requirement for sets. The regularity axiom looks like an untidy patch for a fundamentally broken system. Two objectives arise from these observations: (1) to determine restrictions on logical systems (such as a set theory) which will ensure that such a big problem never arises again, and (2) to find a way of viewing the concept of “the set of all sets” so that it seems totally unreasonable to expect such a thing to be accepted. It is important to develop some sort of instinctual recognition which will prevent such things from happening again.
5.7.2 Remark: The term “naive set theory” has many different meanings, including the following.
(1) Halmos [159] wrote a book titled “Naive set theory”, meaning an informal approach to axiomatic set theory. This approach was informal in the sense that there was no formal propositional calculus or predicate calculus.
This view of the infinity axiom is equally applicable to the infinitesimality of the real numbers. Even if real time and space are somehow particulate in nature, at least the model has no bounds on resolution. So limits may be calculated without any concerns about the finite resolution of time or space. Giving our models a higher resolution than the real world presumably does less harm than if the models have an insufficient resolution.
(2) “Naive set theory” may be defined as set theory which is inborn in humans, or which is commonly known by people who study mathematics before encountering axiomatic set theories such as Zermelo-Fraenkel.
(3) A set theory may be called “naive” if it includes the axiom of naive comprehension:
∀F, ∃x, ∀y, (F(y) ⇔ y ∈ x). (5.7.1)
This means that there is a “set” corresponding to each predicate function F .
5.7.4 Remark: The quantifiers ∃x and ∀y in (5.7.1) in Remark 5.7.2 refer to an implied concrete variable domain. The axiom of naive comprehension is supposed to be satisfied by such a domain. But the consequences of this axiom are only valid if a variable domain can be found which satisfies it! It is perhaps not emphasized enough in logic that none of the carefully argued proofs have any validity at all if no system can be found for which the axioms are valid. The discovery of a contradiction in a set of axioms implies, one would assume, that the set of axioms is null and void, and to no further effect. It is a mystery, therefore, why some people would want to persist with the study of axioms which are not valid for any possible system. Since the axiom of naive comprehension seems to give rise to a plethora of contradictions, one would assume that it should be abandoned and ignored. But it continues to occupy the minds of many otherwise serious people. No scientist, engineer or architect could make a serious career from the study of models which contain contradictions which prevent them being applicable in any possible universe. It is one of life’s mysteries that this kind of useless study is regarded as respectable when carried out in the philosophy department.
[ Check what the correspondence is between NBG set theory and the axiom of naive comprehension in Remark 5.7.5. ]
5.7.5 Remark: Case (3) seems to correspond to NBG set theory, where sets co-exist with proper classes. NBG classes correspond to arbitrary predicates on the space of all sets. NBG classes are only permitted to be members of other sets if they are sets according to ZF.
5.7.6 Remark: Theorems which are derived in accordance with the axiom of naive comprehension are marked as [nc].
5.7.7 Theorem [nc]: Russell’s paradox. Let U be a set which satisfies ∀x, x ∈ U. Let Q = {x ∈ U; x ∉ x}. Then Q ∈ Q and Q ∉ Q.
Proof: First show that Q ∈ Q. If Q ∈ Q, then the proposition is obviously true.
So suppose that Q ∉ Q and let x = Q. Then x ∈ U by the definition of U. So x ∈ U and x ∉ x. Therefore x ∈ Q by the definition of Q. In other words, Q ∈ Q, as was to be proven. Now show that Q ∉ Q. If Q ∉ Q then the proposition is obviously true. So suppose that ¬(Q ∉ Q). By the law of the excluded middle, this implies that Q ∈ Q. Then by the definition of Q, Q ∈ U and Q ∉ Q. So Q ∉ Q, as was to be proven.
5.7.8 Remark: Theorem 5.7.9 “proves” that Russell’s paradox is not a problem. It assumes NC, the axiom of naive comprehension. If the proof of Russell’s paradox is applied to a particular subset of the supposed universe set, it follows that the universe set does not exist. Even more satisfying than this “result” is the consequence that there are no sets at all which are elements of themselves. In other words, the axiom of naive comprehension implies a kind of regularity axiom.
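The diagonal step behind Theorem 5.7.7 can be checked by brute force on small finite domains. The following Python sketch is an illustration only, not part of the formal development. It enumerates every binary “membership” relation on a domain of n elements and counts those which contain an element Q satisfying ∀x, (x ∈ Q ⇔ x ∉ x). The names `has_russell_element` and `count_relations_with_russell_element` are hypothetical, and `mem[x][q]` is assumed to encode x ∈ q.

```python
from itertools import product


def has_russell_element(mem, n):
    """True if some q satisfies: for all x, (x ∈ q) ⇔ (x ∉ x)."""
    return any(
        all(mem[x][q] == (not mem[x][x]) for x in range(n))
        for q in range(n)
    )


def count_relations_with_russell_element(n):
    """Count the relations on {0, ..., n-1} that contain a Russell element."""
    count = 0
    for bits in product([False, True], repeat=n * n):
        # Row x of the matrix lists the truth values of x ∈ q for each q.
        mem = [list(bits[x * n:(x + 1) * n]) for x in range(n)]
        if has_russell_element(mem, n):
            count += 1
    return count


for n in (1, 2, 3):
    assert count_relations_with_russell_element(n) == 0
```

The count is zero because the defining condition fails at x = Q for every relation, which is the same step as in the proof of Theorem 5.7.7. So no variable domain of these sizes can satisfy the instance of (5.7.1) with F(y) = “y ∉ y”.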
5.7.3 Remark: Case (3) in Remark 5.7.2 may be thought of as converting adjectives into nouns. If there is a one-to-one correspondence between predicates and sets, it seems almost superfluous to define sets at all. If every set corresponds to a predicate, and vice versa, set theory is really the same as the study of the relations between predicates. In terms of indicator functions, described in Section 7.9, there is a formal correspondence between any single-parameter logical predicate P and the indicator function χ_x, where x is the set whose existence is guaranteed by the NC axiom: ∃x, ∀y, (y ∈ x ⇔ P(y)). This requires the identification of the truth values F and T with the numbers 0 and 1 respectively. Then ∀y, (y ∈ x ⇔ P(y) ⇔ χ_x(y) = 1). (To be continued. . . or deleted!)
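The correspondence between a predicate P, the set x and the indicator function χ_x can be illustrated on a finite domain. The Python sketch below restricts a fixed finite domain by the predicate (that is, it uses specification, not the NC axiom itself). The names `comprehension`, `indicator` and `is_even` are hypothetical choices for this illustration.

```python
def comprehension(domain, predicate):
    """The set {y in domain ; P(y)}: specification on a fixed finite domain."""
    return frozenset(y for y in domain if predicate(y))


def indicator(x):
    """Indicator function of x, with truth values F and T taken as 0 and 1."""
    return lambda y: 1 if y in x else 0


def is_even(y):
    """An example single-parameter predicate P."""
    return y % 2 == 0


domain = range(10)
x = comprehension(domain, is_even)
chi = indicator(x)

# For every y in the domain: y ∈ x  ⇔  P(y)  ⇔  chi(y) = 1.
assert all((y in x) == is_even(y) == (chi(y) == 1) for y in domain)
```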
5.7.9 Theorem [nc]: ∀x, x ∉ x.
5.7.10 Remark: The “proof” of Theorem 5.7.9 gives a hint as to why it is necessary to get serious about the fundamentals of mathematical logic. It is often difficult to keep track of what is being assumed and what is being proved. When the set of accepted assertions is infected by just one meaningless proposition, everything that follows is pure nonsense. On the other hand, Theorem 5.7.9 does give a rather pleasing result. It “proves” that sets cannot be elements of themselves, which makes good sense. But then again, the proof of Theorem 5.7.7 is just as valid, but its result is totally horrible, namely the invalidity of the most fundamental property of truth values, the “excluded middle” rule. Sadly we cannot pick and choose which “theorems” we accept from a tainted batch. The whole batch must be discarded to be safe. In Theorem 5.7.9, an inconsistent set of axioms which permits a universal set U = {x; ⊤} satisfying ∀x, x ∈ U is shown to satisfy U ∉ U, which directly contradicts the definition of the set U. But if the set U contradicts its own definition, this implies that the whole proof must be wrong because it relies totally on the definition of U. This suggests that there is something fundamentally wrong with the proposition ∃U, ∀x, x ∈ U. Perhaps it should be replaced with ∃U, ∀x, (x = U ∨ x ∈ U). Such a proposition is reminiscent of the definition of a maximal element of a partially ordered set S: ∃M ∈ S, ∀x ∈ S, (x = M ∨ x < M). So it does not seem unnatural to define the “most comprehensive set” in a similar fashion. Whether or not the proof of Theorem 5.7.9 is valid, and in which logical frameworks it may be valid or otherwise, the theorem does strongly suggest abandoning the idea that the set U of all sets satisfies U ∈ U. So the regularity axiom (Definition 5.1.26 (7)) is more natural than it may at first seem. In fact, it is natural to require at least that the set membership relation be anti-reflexive (i.e.
∀x, ¬(x ∈ x)) and antisymmetric (i.e. ∀x, ∀y, ¬(x ∈ y ∧ y ∈ x)), and furthermore that there should be no cyclic chains.
5.7.11 Remark: It could be claimed that the regularity axiom (discussed in Remark 5.5.1) is an ad-hoc solution for a deep and real problem with the concept of sets and classes. In fact, Mortensen [166] suggests that Russell’s paradox may be escaped by abandoning the “excluded middle” logical law, adopting instead “paraconsistent logic” or “inconsistency-tolerant logic”. (See Mortensen [166], pages 1–5.) This is an extreme and unnatural escape route. It is, in fact, quite natural to forbid sets to be members of themselves, either directly or via a chain of containments: x ∈ y ∈ z ∈ . . . ∈ x. Set membership is inspired by the metaphor of physical containers, and one never observes in reality that a container is contained in itself, except perhaps in an Escher lithograph. (See for example “Print Gallery”, 1956, in Escher [200], page 127.) The superficially plausible claim that the set of all things should contain the set of all things is as reasonable as trying to assign meaning to a statement like: “This sentence is false.” A similar piece of nonsense is the equation x = x + 1. Consider also that we define “the fastest swimmer” to mean “the swimmer who is faster than all other swimmers”, not “the swimmer who is faster than all swimmers”. Similarly, it is natural to define the universe of all sets to be “the set which contains all other sets”.
[ The axiom template in Remark 5.7.2 case (3) would be more likely to avoid Russell’s paradox if it was replaced with: ∀F, ∃x, ∀y, ((F(y) ∧ y ≠ x) ⇔ y ∈ x). Check whether there may be a completely general such modification to the naive comprehension axiom which generates all NBG classes, or at least guarantees x ∉ x, or guarantees the ZF regularity axiom.
]
5.7.12 Remark: In terms of the world-model ontology for logic described in Section 3.6, the replacement of the excluded middle with tolerance of contradictions means that a logic machine’s world-model is
Proof: Let U = {x; ⊤}, where ⊤ denotes the always-true predicate. (See Notation 4.12.10.) Then U is the universe of all sets. Let V = U \ {U} and Q = {x ∈ V; x ∉ x}. Suppose Q ∈ Q. Then Q ∈ V and Q ∉ Q. This is a contradiction. So the proposition Q ∈ Q must be false. Now suppose that Q ∉ Q. If Q ∈ V, then Q ∈ Q by the definition of Q. This is a contradiction. So the proposition Q ∈ V must be false. Therefore Q ∉ V. The only set which is not a member of V is U itself. So the proposition Q ∉ V implies that Q = U. Then by the definition of Q, it follows that ∀x, (x ∈ U ⇔ (x ∈ V ∧ x ∉ x)). This implies both ∀x, (x ∈ U ⇒ x ∈ V) and ∀x, (x ∈ U ⇒ x ∉ x). So U ⊆ V and ∀x ∈ U, x ∉ x. The proposition U ⊆ V implies that U ∉ U. The proposition ∀x ∈ U, x ∉ x means that there are no sets which are members of themselves.
Inconsistency-tolerant logic with world-model ontology. (Labels in the figure: state z; region Z_P; t(P(z)) = T; t(P(z)) = F.)
simultaneously in two complementary subsets of the world-model state space. (See Figure 5.7.1.) A careful study of the kind of thinking which arrives at inconsistency-tolerant logic reveals context confusion. Logic commences with the assumption that there are propositions which are either true or false, and not both. Then various relations between the truth values of propositions are noticed, and these relations are used to discover the truth values of propositions. But then when these relations yield contradictory results, somehow it is the truth values of the propositions which are held responsible. But it is clear that the relations must be faulty, because the validity of the relations between truth values comes entirely from their ability to correctly determine truth values of propositions. Now suppose it is argued that in reality, propositions can be both true and false. For example, in the quantum mechanical Schrödinger’s cat scenario, the cat may be both alive and dead. But this is actually a third “mixed state”. There are three possible states of the cat in this model: “alive”, “dead” or “mixed”. For each of the three states z, the proposition that the cat is in state z may be either true or false, and precisely one of these three assertions is true according to this model. When one of these three propositions is true, the state z is in the set of world-model states corresponding to that proposition. When it is false, the state is in the complementary set. In the classical model, there are two states. In the quantum model, there are three states. If one interprets the mixed quantum state in terms of the classical model, it looks like a proposition is simultaneously true and false. But this mixed state does not occur in the classical model! So the contradiction occurs only when the two models are confused. Each model-state must be interpreted within its own model, because that is where it is given meaning. Figure 5.7.1 depicts a contradictory situation.
But this does not agree with the definition of the word “false”. A proposition is said to be false when the world-model is not in one of the states where the proposition is true. What is illustrated here is a three-valued truth function. The states in which the proposition P is both true and false need to be given some sort of meaning. It is meaningless to simply say that a proposition is both true and false. Not all sequences of words are meaningful sentences. It is even difficult to give meaning to the individual truth values “true” and “false”. In the case of those who argue for an inconsistency-tolerant logic, there is a belief that methods of argument take priority over the definition of a proposition. But methods of argument are only valid to the extent that they correctly recover the truth values of propositions. If they yield contradictory results, this is an indication that one or more assumptions in the method of argument are invalid. To give an analogy, if one solves algebraic equations for a real number x, and one method of calculation yields x = 3 while another method yields x = 57, one would not say that the correct answer is that x equals both 3 and 57 simultaneously. There is no number which equals both 3 and 57. At the very least, a method of calculation must yield meaningful results. That is, the results must be at least consistent with the model under consideration. Meaningfulness is a prerequisite for correctness. 5.7.13 Remark: To understand why (and how) the “set of all sets” concept is hopelessly self-referential (like the Liar’s paradox), it is amusing to try to construct very simple examples of universe sets and see how Russell’s paradox (Theorem 5.7.7) happens. The simplest case of a universe of sets is U = ∅. In other words, we have a system which has no sets at all. However, U is the set of all sets, which is therefore, apparently, a set. Therefore we must have U ∈ U , apparently. 
So we must correct the definition of U to at least contain U itself. So we must have U = {U}. This seems fine. This universe candidate satisfies U ∈ U. It looks like we should add also the set {U} as a member of U, but this is not necessary because U and {U} are the same set. (If we look inside U, we will see another copy of U, and inside that is yet another copy, and so on ad infinitum. But that is not necessarily
Figure 5.7.1. (Remaining label in the figure: Z \ Z_P.)
5.7.14 Remark: Note that the proposition U ∈ U is in itself highly problematic, even without thinking about Russell’s paradox or any set theory axioms. In the simplest case, if U = {U}, we might like to know what U is. That’s simple! Clearly U equals {U}. So we can express an unknown set in terms of itself. If we ask what is inside U, it is just U. And when we look inside again, we see U again. Therefore U = {U} = {{U}} = {{{U}}} = . . . . So we can know what U is if we know what U is. But knowing that U = {U} is very much like knowing that x = x + 1. (This must be a very big number because whatever x is, x must be 1 bigger than that. So x = (x + 1) + 1 = x + 2. So x is arbitrarily larger than whatever number we can think of. This is actually a rather poetic “definition” for infinity. But we cannot accept this as a serious number.) Some self-referential equations do not have solutions. And U = {U} seems to be such an equation. It is fairly clear that U = {U} really defines nothing at all. So this example of the proposition U ∈ U is unacceptable and meaningless, and should not be permitted to enter into mathematics. It is more suited to a book of riddles and brain-teasers. But suppose there are more elements in U. I.e. suppose that U = {U} ∪ X for some set X which satisfies U ∉ X. Then when we look inside U, we see U together with all of the elements of X. And when we look inside this copy of U, we see another copy of U together with the elements of X. Making the substitution gives U = {{U} ∪ X} ∪ X. The process never ends, and we never find out what U is! So it doesn’t matter how we look at it, we can never know what U is. The self-reference problem is really the core of Russell’s paradox. From a meaningless assumption, any number of contradictions may be derived.
Perhaps the main reason why it is so difficult to see what is wrong with Russell’s paradox is the fact that the core of the problem is hidden in the self-referential requirement U ∈ U which arises from ∀x, x ∈ U. The self-reference gives rise to positive feedback which (surprise, surprise) does not yield a stable solution. (Sounds a bit like my book really!)
5.7.15 Remark: The recursive construction in Remark 5.7.14 uses substitution in each iteration step. That is, the universe set U generates an infinite number of universes by iterating the rule U_{n+1} = {U_n}, where U_0 = U. Another kind of infinite recursion results from the equation U_{n+1} = U_n ∪ {U_n}. (This rule is the same as the successor set construction in Definition 7.2.2 which generates the ordinal numbers, and the Burali-Forti
a serious problem, although it does seem a rather unusual kind of set!) Now with U = {U}, what is Q? By definition, Q = {x ∈ U; x ∉ x} = ∅ because U ∈ U. Then Q is clearly a subset of U, but it is not (currently) an element of U. If Q ∉ U, then we have no problems. We will have Q ∉ Q = ∅, and Q ∉ U = {U} because Q ≠ U, because U ≠ ∅. This seems to have avoided Russell’s paradox. However, the paradox rests heavily on the axiom of specification. It is assumed that the universe U is closed under the restriction {x ∈ U; P(x)} of U by any predicate P. In particular, the always-false predicate yields the empty set. Therefore we must have ∅ ∈ U. So now we have U = {U, ∅}. Therefore Q = {∅} because U ∈ U and ∅ ∉ ∅. Once again, we have Q ∉ Q and Q ∉ U. No problem! The problem, though, is that the specification axiom now requires {x ∈ U; x ∉ x} to be an element of U. So we must once again add Q = {∅} to U. This gives U = {U, ∅, {∅}} and Q = {∅, {∅}}, which does not trigger Russell’s paradox. But repeated applications of the specification axiom give U = {U, ∅, {∅}, {∅, {∅}}, . . .} and Q = {∅, {∅}, {∅, {∅}}, . . .}. The reader who has studied ordinal numbers should recognize that Q = ω at this stage, namely the set of finite ordinal numbers. (See Definition 7.2.12.) The problem does not end there. The requirement Q ∈ U arises once again from the specification axiom, even after we apply mathematical induction to arrive at U = {U} ∪ ω and Q = ω. We must now insert ω into U. This gives U = {U} ∪ ω ∪ {ω} and Q = ω ∪ {ω}. This process does not stop until the entire “set” of ordinal numbers is included in U. From this sort of argument, the ordinal numbers arise in a natural way from the application of the specification axiom to a supposed universe of sets.
Thus when Russell’s paradox is applied to just the empty set, the resulting set construction leads directly to the Burali-Forti paradox, which is dated 1897, 6 years before Russell’s paradox, dated 1903. (For example, see Halmos [159], page 80; EDM2 [34], 319.B, page 1188; Mendelson [164], page 2.) This is because the “final” value of Q in this induction process is the “set” W of all ordinal numbers, and this “set” has the well-known property that W is less than W , which is a logical contradiction.
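The iteration described above, in which Q = {x ∈ U; x ∉ x} is repeatedly computed and then added to the supposed universe U, can be traced for finitely many steps with a short Python sketch. A Python `frozenset` cannot contain itself, so the self-containing U is not represented directly. As an assumption of this illustration, `members` holds the elements of U other than U itself, and U is excluded from Q by fiat because U ∈ U. The function names are hypothetical.

```python
def russell_set(members):
    """Q = {x ∈ U ; x ∉ x}, where U = {U} ∪ members and U ∈ U by assumption."""
    return frozenset(x for x in members if x not in x)


def close_under_russell(steps):
    """Repeatedly compute Q and insert it into U; return the successive Q's."""
    members = set()          # initially U = {U}
    history = []
    for _ in range(steps):
        q = russell_set(members)
        history.append(q)
        members.add(q)       # the specification axiom demands Q ∈ U
    return history


def von_neumann(n):
    """The finite ordinal n, built by the successor rule x ∪ {x}."""
    x = frozenset()
    for _ in range(n):
        x = x | frozenset({x})
    return x


history = close_under_russell(6)

# The k-th value of Q is the k-th finite von Neumann ordinal.
assert all(q == von_neumann(k) for k, q in enumerate(history))
```

The sketch covers only the finite stages. The limit stage Q = ω and the later stages ω ∪ {ω}, . . . are outside what this finite model can represent.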
paradox if taken too far.) This kind of recursion arises when trying to define the universe set U as “the set which contains all other sets”. This definition avoids the self-referential nature of the definition: “the set of all sets”. However, this approach still has serious problems. If we have initially any set U_0 which does not contain the universe U_0, we can add U_0 into U_0 to construct U_1 = U_0 ∪ {U_0}. Then we can add U_1 to this to obtain U_2 = U_1 ∪ {U_1} = U_0 ∪ {U_0, U_1} and so forth. (See Figure 5.7.2.) But when induction is applied to this process to yield U_ω = ⋃_{n=0}^{∞} U_n, we can construct U_{ω+1} = U_ω ∪ {U_ω} and so forth. This gives a class of sets which are the same as the ordinal numbers except that the seed set is U_0 instead of ∅.
Figure 5.7.2. Iteration of universe sets. (Labels in the figure: U_0, U_1, U_2, U_3.)
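The iteration of universe sets in Remark 5.7.15 may be written out as follows. This is a sketch in LaTeX notation, assuming only the rule U_{n+1} = U_n ∪ {U_n} and an arbitrary seed set U_0. The closed forms on the right follow by induction on n.

```latex
\begin{align*}
U_1 &= U_0 \cup \{U_0\}, \\
U_2 &= U_1 \cup \{U_1\} = U_0 \cup \{U_0, U_1\}, \\
U_{n+1} &= U_n \cup \{U_n\} = U_0 \cup \{U_0, U_1, \ldots, U_n\}, \\
U_\omega &= \bigcup_{n=0}^{\infty} U_n = U_0 \cup \{U_n ;\ n \in \omega\}, \\
U_{\omega+1} &= U_\omega \cup \{U_\omega\}.
\end{align*}
```

With the seed U_0 = ∅, these formulas reduce to the usual von Neumann ordinals 1, 2, n + 1, ω and ω + 1.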
5.7.16 Remark: The attempt by the universe set to contain itself leads to a chain of expanding universe instances, and the rate of expansion accelerates as the universe set gets bigger. One might wonder if an expanding physical universe might be related to this. Accelerating expansion suggests a positive feedback loop. But the analogy is almost certainly superficial.
5.7.17 Remark: Within the “cognitive science” perspective, described in Lakoff/Núñez [172], pages 121–131, sets and classes are regarded as an abstraction of physical containers. In other words, the human mind performs manipulations of mathematical sets using the same general apparatus as is used for thinking about physical containers. The human capability to think about containers is clearly advantageous to survival. This capability is present also in many kinds of animals. The claim in the cognitive theory perspective is that it is this mental capability which is being put to work when a mathematician thinks about sets. This seems entirely plausible. The fact that set operations are taught at an elementary level in terms of Venn diagrams adds weight to this metaphorical connection. One might expect some help in resolving the question of universe sets by referring to the container metaphor. One could ask whether a bucket is inside itself or not. (See Figure 5.7.3.) The same ambiguities which arise in abstract set theory arise also in the case of a physical bucket. The words “inside” and “outside” do not seem to apply to the bucket itself. The bucket has a non-zero thickness. Points in the world are divided into “inside”, “outside” and “boundary”. Is the boundary within the set or outside it? In topology, this question is answered in exact and satisfying terms. (The topology of set boundaries relies heavily on infinite limits, which in turn lead to very much the same sorts of problems again. But that’s a story for another day.)
If every bucket is contained inside itself, this suggests that we adopt the axiom ∀x, x ∈ x. This is not a totally impossible axiom to work with. We can define the empty set to be the set ∅ which satisfies x ∈ ∅ ⇔ x = ∅. The rest of set theory can be developed by carefully excluding the case x ∈ x when considering any set x. It would be an “interesting” exercise to re-express the ZF axioms to be consistent with the proposition ∀x, x ∈ x. (See Exercise 47.2.5.) It seems much less confusing to say that a bucket is not contained inside itself. This suggests an axiom such as ∀x, x ∉ x, which excludes the possibility of a universe set. If we imagine a bucket which contains literally
From this kind of argument, it is clear that Russell’s paradox is only a consequence of the Burali-Forti paradox when the successor set construction is applied to any set at all. In this sense, Russell’s paradox is a downstream consequence of the Burali-Forti paradox because the real paradox lies in the universe set definition itself, not in its properties. On the other hand, Russell’s paradox may be presented in a very elementary way, whereas the ordinal numbers are an amazingly convoluted construction.
Figure 5.7.3. A bucket. (Labels in the figure: inside, boundary, outside.)
everything, it must contain itself. But the idea of a bucket containing itself is meaningless, a mere quandary for philosophers to discuss over afternoon tea. Discussion of the “set of all sets” riddle should be kept harmlessly within the bounds of the philosophy department tea room. A third solution is to say that all buckets are on the “boundary” of themselves. This is neither x ∈ x nor x ∉ x. In terms of logic, there cannot be a third possibility if we accept the “excluded middle” principle. So we must either invent a third logical truth value, or else invent a second set relation in addition to ∈. But then we would need to generalize ZF set theory to include this second set relation xBy to mean “x is inside the boundary of y”, which would be a lot of wasted effort, or else express ZF set theory in terms of a third truth value B (in addition to F and T) which would mean “on the boundary of being true and being false”. Luckily such monstrosities are not in current use by serious mathematicians.
Perhaps the most important thing to notice about any attempt to insert a universe set into the ZF sets is that it is really totally useless. The constructional activities within the bounds and constraints of ZF set theory are adequate for mathematics. When very young people want to know what the “biggest number” is, clearly we cannot name a number which is larger than all others. (This is closely related to the question: “What is the solution to the equation x = x + 1?”) We can define a non-number called “infinity” for either the integers or the real numbers, and we can do useful things with such a concept. But we do not try to insert “infinity” into the standard numbers. We define new sets of extended numbers and keep these sets of extended numbers clearly separate in our thinking. In the same way, it is perfectly okay to have a “set_1 of all sets_0” if we keep the level-0 sets separate from the level-1 sets. This sounds a little like the set classes introduced by Russell/Whitehead [167], but all of the useful sets are in this case in the first class U_0. The higher classes are provided for recreational mathematics and logic.
5.7.19 Remark: The difficulties described in Remark 5.7.18 arise in a totally analogous way in computer file systems. For example, in the unix-family operating systems, it is generally not permitted to make a “hard link” to a directory, precisely because this would be disastrous for any software which tries to traverse the file system’s directory tree. (In the olden days, the “super-user” was permitted to make hard links to directories, but ordinary computer users could not be trusted with such a dangerous capability.) Figure 5.7.4 illustrates how the file system “folders” (i.e. directories) would appear in a typical graphical user interface if a “hard link” is made inside a directory “universe” to the directory “universe” itself. (The unix-style command for this would be something like: “ln . universe”.
Do not try this at home unless you have made a complete computer back-up first.) Inside the top-level folder, there would appear to be another folder (i.e. sub-directory) called “universe”. For clarity, the top directory is called U_0 here, and the sub-directory is called U_1. (On a real computer, all of these directories would have the same name.) Then inside the folder U_1 would appear another folder U_2, and so forth. There is actually nothing logically wrong with this hall-of-mirrors situation, apart from the fact that any software which tries to traverse the directory tree will “hang”. (Good software would detect the infinite loop
[ draft: UTC 2009–3–21 Saturday 11:36 ]
Differential geometry reconstructed.Copyright (C) 2009, Alan U. Kennington. All Rights Reserved.The author hereby grants permission to print this book draft in A4 format.Printing in all other formats is forbidden.You may not charge any fee for copies of this book draft.
5.7.18 Remark: The attempt in Remark 5.7.15 to define a universe set would arise naturally if we commence with the ZF-generated sets, say, and try to insert into the ZF sets U0 = “the set of all ZF-generated sets”. Since this modifies the definition of the “the set of all ZF-generated sets” to U1 = U0 ∪ {U0 }, a recursive process arises as described.
and terminate it, but not all software is written so carefully.)

[Figure 5.7.4: Computer “folder” which contains a “hard link” to itself. A folder “universe” (U₀) contains a folder “universe” (U₁), which contains a folder “universe” (U₂), and so forth.]

The main argument against a self-containing universe set is that it is useless, but is also a huge burden to maintain. High price, no benefit.

In mathematics, if the membership relation on sets is not an acyclic graph with a finite depth for traversals “on the left”, it will not be possible to determine the meaning of a set. The most basic level of meaning of a set is determined by its elements. The membership of a set is what distinguishes one set from another. A set’s membership determines its identity since the only property of a set is its membership relations. This is recursively true for each of the members of a set. If the recursive expansion of the elements of a set in terms of further elements does not terminate, then a set’s meaning is unknowable. The finite termination of set membership traversal “on the left” is precisely what is stipulated by the ZF axiom of regularity. (See Section 5.5.)

A different way to construct the directory hierarchy which is illustrated in Figure 5.7.4 is to start with an empty directory called “universe” and copy this empty directory into itself. There will then be two directories, U₀ and U₁ ∈ U₀, both called “universe”. Then U₀ (which now contains U₁) can be copied into U₀ again. This time a new directory U₂ appears inside U₁. The new U₂ ∈ U₁ is a copy of the old U₁. Each time the copy operation is done, a new sub-directory appears as in the diagram. This process yields an unbounded sequence of directories rather than an infinite sequence. But this dynamic process yields the same result as the static hard-link method up to a finite depth.

The problems get serious when one tries to construct a file system directory corresponding to the set Q = {x ∈ U; x ∉ x}. First a directory Q should be constructed inside U. Then there is the question of what to put inside the new directory Q. It is clear that Q is an empty directory. So Q does not contain the element Q. It is also clear that Q ≠ U because Q is (currently) empty and U contains Q. So there is no doubt that Q ∉ Q. But Q ∈ U. So the construction rule Q = {x ∈ U; x ∉ x} requires that Q must be inserted in Q. But now we have Q ∈ Q. So the rule Q = {x ∈ U; x ∉ x} requires that Q must be removed from Q. This process clearly does not converge. It yields a sequence which alternates between the directory trees Q ∈ Q ∈ U and Q ∈ U. (See Figure 5.7.5.)

[Figure 5.7.5: Computer “folder” containing the folder of all non-self-containing folders. Two directory trees are shown: the folder universe U = {Q} containing Q = {x ∈ U; x ∉ x}, and the same tree with a further copy of Q inside Q.]
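The non-convergent directory construction described above can be imitated in a few lines of Python. This is only an illustrative sketch, not part of the original text: the dictionary representation of folders and the function names `apply_rule` and `run` are assumptions made for the example.

```python
def apply_rule(tree):
    """Recompute folder Q from the rule Q = {x in U; x not in x}.

    `tree` maps each folder name to the set of folder names it contains.
    """
    new_q = {x for x in tree["U"] if x not in tree[x]}
    return {**tree, "Q": new_q}


def run(steps):
    """Apply the rule repeatedly and record whether Q contains itself."""
    tree = {"U": {"Q"}, "Q": set()}  # start: Q is an empty folder inside U
    history = ["Q" in tree["Q"]]
    for _ in range(steps):
        tree = apply_rule(tree)
        history.append("Q" in tree["Q"])
    return history


history = run(5)
print(history)
```

Each application of the rule reverses the answer to the question “is Q inside Q?”, so the recorded history alternates and never settles, which matches the alternation between the trees Q ∈ Q ∈ U and Q ∈ U.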
The source of the problem here is very clear. The membership of Q is defined in terms of the membership of Q. It is specified to be the reverse. Therefore the test x ∈ Q ⇔ x ∉ x gives a contradiction when Q is substituted for x. Note particularly that this has nothing at all to do with the special properties of the universe set U. The problem arises in exactly the same way if one specifies merely the simple system:

U = {Q}
Q = {x ∈ U; x ∉ x}.

Such a system of equations has no solution for the membership relation between the sets. This suggests the conclusion that the cause of Russell’s paradox is the form of the equation for Q, not the universe set property U ∈ U. So it appears that Russell’s paradox is itself erroneous. So maybe there is no such paradox at all.

On the other hand, the only reason that one expects a set satisfying Q = {x ∈ U; x ∉ x} to be a member of a set U is the combination of (1) the closure of U with respect to the specification axiom, and (2) the idea that U could be an element of U. This combination leads inevitably to the problematic set Q. Since the specification axiom seems entirely reasonable, this strongly suggests that U ∈ U is totally unreasonable. But it only becomes totally unacceptable when combined with the specification axiom.

Then again, the specification axiom itself is potentially unreasonable. In the case of a fixed set U, it is totally reasonable. But when the set U depends on Q, and Q depends on U, as is the case here, such a dynamic application of the specification axiom becomes unacceptable.

It is also interesting to try to construct a file system directory corresponding to the set W = {x ∈ U; x ∈ x}.

5.7.20 Remark: The equality Q = {x ∈ U; x ∉ x} in Remark 5.7.19 is equivalent to x ∈ Q ⇔ (x ∈ U ∧ x ∉ x). When Q is substituted into this proposition, the result is Q ∈ Q ⇔ (Q ∈ U ∧ Q ∉ Q). It follows directly that Q ∉ U. Therefore U is not a universe set, or else Q exists in a different universe.
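The step “it follows directly that Q ∉ U” in Remark 5.7.20 may be spelled out as a short case analysis. The layout below is only one possible way of writing it and uses no assumptions beyond the equivalence stated in the remark.

```latex
\begin{align*}
&\text{Given:} && Q \in Q \iff (Q \in U \wedge Q \notin Q). \\
&\text{Suppose } Q \in U\text{:} && Q \in Q \iff Q \notin Q, \text{ which is impossible.} \\
&\text{Hence:} && Q \notin U. \\
&\text{Then:} && Q \in Q \iff \bot, \text{ so } Q \notin Q.
\end{align*}
```

So the equivalence is consistent only when Q lies outside U, and in that case Q is not a member of itself.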
5.7.21 Remark: Using the notations ⊤ and ⊥ for the propositions which are respectively always true or always false (as in Notation 4.12.10), one may write ∅ = {x; ⊥} and U = {x; ⊤}. This makes it appear that a universe set is as reasonable an idea as the empty set, which is rarely called into question.

The problem with this argument, however, is that both expressions {x; ⊥} and {x; ⊤} are meta-concepts. Written out fully in terms of the variable space V for the logical system, they signify {x ∈ V; ⊥} and {x ∈ V; ⊤} respectively. But the membership relation “∈” is in this case defined in the meta-set-theory within which the set theory has been defined. This is not the membership relation which is defined in the system’s predicate space Q. (See Remarks 4.12.3 and 3.1.2 for an explanation of these spaces.) Therefore the expression {x ∈ V; ⊤} signifies the set V, which is the concrete variable space which is supplied a-priori for the logical system. This set must be defined in a different set theory, within which the foreground set theory is defined.

It is fairly clear from these observations that Russell’s paradox arises directly from the use of set theory to define set theory! This kind of chicken-and-egg situation is described in Remark 2.1.1 and elsewhere in this book. It is very easy to become confused between set operations in the background and foreground set theories; in other words, between the meta-set-theory and the set theory. If these two layers are confused, the kind of machine modelling loop which is described in Remark 3.3.5 arises.

5.7.22 Remark: In the perspective of Remark 5.7.21, it would be surprising if a paradox like Russell’s paradox did not arise. One can be confident that a paradox has been resolved only when its occurrence seems inevitable rather than surprising or disturbing.

Self-containing sets are no more meaningful than the sentence S₁ = “The sentence S₁ is true.” This may seem to be a very safe sentence because it is true if it is true. But if one looks more carefully, one notices that one can only know that S₁ is true if S₁ is true. One can only say that S₁ is true if it is true. How can one know that it is true? If it is false, one might think that a contradiction occurs. Maybe not so. Suppose S₁ is false. Then the sentence S₁ is false, which is not a contradiction at all. In other words, we observe that S₁ is false if it is false. There is no contradiction here at all! A contradiction only occurs if a statement is both true and false. In this case, we can choose to say that S₁ is true, in which case there is no contradiction, or false, in which case there is no contradiction. If one substitutes S₁ into the right hand side of the equality repeatedly, the same answer “false” appears. Remember that simply stating a proposition is not the same thing as asserting it. (See Remark 3.7.1 for
discussion of this.) So there is no contradiction at all in the sentence S₁ being false, unless one confuses the concepts of statement and assertion.

When one realizes that there is a problem with the sentence S₁ (because there is no contradiction in stating that it is true or false), it is then no great surprise when one considers S₀ = “The sentence S₀ is false.” In this case there is a serious problem whether it is true or false. There is a contradiction in both cases. But this is just the other side of the same coin. In the S₁ case, the sentence can be either true or false. In the S₀ case, the sentence can be neither true nor false.

One may draw an analogy here between S₁ and the real-number equation x = x, and between S₀ and the real-number equation x = x + 1. The first equation permits all solutions for x whereas the second equation permits none. There is no contradiction here. The first equation is merely useless and the second equation is telling us to go home early. These analogies are summarized in the following table.

sets                sentences                       numbers
U = {x; x = x}      S₁ = “Sentence S₁ is true.”     x = x
Q = {x; x ∉ x}      S₀ = “Sentence S₀ is false.”    x = x + 1
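The analogy in the table can be made concrete by treating each sentence or equation as a fixed-point condition and listing the values which satisfy it. The following Python sketch is an illustration only. The helper name `solutions` is an assumption of the example, and the small range of integers merely stands in for the real numbers.

```python
def solutions(equation, candidates):
    """Return the candidates v which satisfy v == equation(v)."""
    return [v for v in candidates if v == equation(v)]


truth_values = [False, True]

# S1 = "S1 is true": the truth value v must satisfy v == v.
s1_solutions = solutions(lambda v: v, truth_values)

# S0 = "S0 is false": the truth value v must satisfy v == (not v).
s0_solutions = solutions(lambda v: not v, truth_values)

# Numerical analogues, checked on a small sample of integers only.
sample_numbers = range(-3, 4)
x_eq_x = solutions(lambda x: x, sample_numbers)
x_eq_x_plus_1 = solutions(lambda x: x + 1, sample_numbers)

print(s1_solutions, s0_solutions, x_eq_x, x_eq_x_plus_1)
```

Both truth values solve the S₁ condition and neither solves the S₀ condition, just as every sampled number solves x = x and none solves x = x + 1.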
In both the liar’s paradox and Russell’s paradox, the problem is resolved by seeing clearly the self-referential nature of the system, which may be handled by distinguishing between a logical system and its meta-logical framework system.

5.7.23 Remark: Suppose the axiom ∀x, x ∉ x is accepted. Then any universe class U satisfies U ∉ U. The Russell’s paradox construction may be applied to U to obtain Q = {x ∈ U; x ∉ x}. Clearly then Q = U. So Q ∉ U. Therefore Q ∉ Q by the definition of Q. It does not matter that Q does satisfy the second proposition “x ∉ x” which defines Q because it does not satisfy the first criterion x ∈ U. Russell’s paradox disappears entirely.

As outlined in Remark 5.7.25 in greater detail, it is okay to think about a set of objects, and then think about a set of sets of objects, and so forth. One may even think unambiguously about a set whose members are all of the sets which are constructed by iterating this procedure. The paradoxes arise only when there is confusion as to which side of the membership relation ∈ is the object and which side is the class. To put it simply, it is okay to think about the universe set, i.e. “the set of all sets”. But the first instance of the word “set” is a meta-set. It exists in a system in which the ordinary sets are being modelled.

The NBG set theory approach tries to include a universe set as a second-class citizen which has no membership rights. (First-class sets may be elements of NBG proper classes, but NBG proper classes are not permitted to be a member of anything at all.) This approach is perhaps not such a good idea. It seems preferable to separate ZF sets from meta-sets which are formed from ZF sets. Then this process may be continued to arbitrary levels of meta-meta-ZF sets and so forth. Thus the ideal solution to Russell’s paradox is probably to define a universe set as “the meta-set of all sets”.

The modern approach to set theory tries to break down the psychological barriers between sets and objects. When one first learns the set concept, one is told that a set is a collection of objects. Then one learns that one can define a set of sets, and so forth. In ZF set theory, all sets are either empty or else a set of sets. The lack of distinction between the different categories of sets, meta-sets, meta-meta-sets and so forth is helpful in reducing the complexity of language required to describe them. But when people use the same name for two different things, they begin to think that those things have more in common than if they had different names.

It is not necessary to put each set in an explicit layer number as proposed by Russell/Whitehead [167]. But it is necessary to keep set membership relations cycle-free. It is not necessary to define a separate object/class model for each level of set membership as outlined in Remark 5.7.25. But it is necessary to place limits on the co-existence of sets in a single logical system so that membership relation cycles cannot occur.

Most importantly, when someone says “all sets”, one must ask: “All sets in which universe?” The word “all” only has meaning if it refers to a specified universe. One cannot define a new set as U = “the set of all sets”. This only defines U to be equal to the pre-defined concrete set domain V of a particular logical system. This universe, being part of the framework (or meta-system) in which the logical system is defined, cannot be inside the universe which it defines. In conclusion, the “set of all sets” is very well defined, but it is defined in a meta-system or supporting framework.
It may seem that the almost endless discussion of Russell’s paradox is tedious and worthless. But it has a very positive value in helping to clarify the entire nature of all of logic and set theory. If Russell’s paradox can be made to “lie flat”, that is a sign that the whole edifice of mathematical logic has been constructed as a stable, robust structure. If the edifice is vulnerable to paradoxes, that is a sign that there is something fundamentally weak in the design.

[ Find out if anyone has already developed the meta-sets alternative to NBG set theory which is suggested in Remark 5.7.23. ]

5.7.24 Remark: Perhaps yet another way to look upon “the set of all sets” is the dynamic perspective. That is, all sets are constructed by some unspecified mental process which actively collects a “collection of objects”. In this dynamic perspective, one should be very surprised indeed if one formed a collection of objects and found that inside that collection was the object which has just been constructed. This would be, at the very least, a violation of causality. One should not expect to be able to make a collection of objects which do not yet exist. That would be as nonsensical as the phrase: “the largest number which can be expressed in ten words”.

As soon as causality is introduced into set construction, the self-referential paradoxes disappear. In this dynamic perspective, “the set of all sets” can only mean “the set of all previously existing sets”. It is not an unrelated observation that the disturbing aspects of the concept of infinity are very much ameliorated by taking a dynamic perspective in which unbounded sets of objects are generated dynamically “on demand”. (This “on-demand” perspective of mathematical modelling is also alluded to in Remark 5.7.25.)
Another way of looking at this is that the “axiom of naive comprehension” is not so much naive as careless or irresponsible. Perhaps it should be renamed the “axiom of reckless comprehension”. Then probably it would receive less respect than it currently does.

5.7.25 Remark: Attractive though the container metaphor in Remark 5.7.17 may seem, it is too simple to explain modern set theory and the way modern mathematicians think. Although sets are generally taught at an elementary level in terms of objects and containers (or boundary curves as in Venn diagrams), modern set theory permits arbitrary nesting of sets within sets, which blurs the distinction between objects and their containers. (In fact, in pure set theory, all objects are classes, and all classes are objects. This is like saying that all objects are buckets.)

To give a credible ontology for modern set theory, it is helpful to make use of the modelling machines which are discussed in Section 3.3. Models or representations of the world are found in very simple animals. So world-modelling is apparently a more basic and ubiquitous activity amongst animals than arithmetic or logic.

If a machine represents objects in a model, it seems safe to suppose that one may specify and name collections of objects based on attributes or enumeration. Thus if U is the total collection of objects in the model, one may specify collections S_F = {x ∈ U; F(x)} for any criterion F. Alternatively, one may explicitly list the elements of a collection S of objects. It is entirely reasonable for a name S to be given to a collection if the machine can effectively determine for each object x ∈ U whether x ∈ S or x ∉ S. Classification of objects into categories S is important for basic survival. So it is not surprising that animals have this capability.

The object/class concept becomes problematic when one attempts to include the collections of modelled objects within the model itself. Generally a model represents some system (either external or internal to the organism), and that system would not generally contain abstract collections as objects. For example, we may see birds in a tree, but we never see the class of all birds as a single object sitting on a branch of a tree. The classes within a model refer to the model’s own object representations. Classes are part of the apparatus of the model, not part of the system which is being modelled. If one tries to insert a model’s classes directly into the model’s own object space, the object space is thereby modified, which consequently modifies the model’s classes, which modifies the object space again. This insertion process is not at all guaranteed to reach a stable state, as discussed in Remarks 5.7.14, 5.7.15 and 5.7.19.

Sets can be included as objects within the world-model ontology (alluded to in Section 3.6) by making a model of a model, where each model has distinct categories of objects and sets. Thus machine M₁ may have distinct objects and classes of objects. Then machine M₂ may model both the objects and classes in machine M₁. (See Figure 5.7.6.)
[Figure 5.7.6: Classes and objects, and meta-classes and meta-objects. Panels labelled modelled system (machine M₀), model (machine M₁) and meta-model (machine M₂), with boxes labelled objects and classes.]
The model in machine M₂ may give names to arbitrary sets of sets of objects which are in the model of machine M₁. In the bird example, if person 2 models the state of mind of person 1, who is modelling the birds in the trees, then the class of all birds may indeed be a valid “object” inside the mind of person 1 which can be modelled by person 2.
This process can be extended any finite number of times. In each case, machine Mₙ₊₁ creates a model of both the objects and classes in machine Mₙ and then constructs arbitrary subsets of the class of objects in this meta-model. (See Figure 5.7.7.)

[Figure 5.7.7: Classes and objects in models, recursive. Panels labelled modelled system (machine M₀), model (machine M₁), meta-model (machine M₂), ..., metaⁿ⁻¹-model (machine Mₙ) and metaⁿ-model (machine Mₙ₊₁), with boxes labelled objects and classes.]

Since humans are able to introspect, they can model their own state of mind. Therefore the modelled system of a person’s modelling activity may be the same person’s past, present, future or potential state of mind. This is quite possibly how nested classes originated in human thinking during the last few thousand years.

To represent an unbounded depth of set nesting in a single model, one may introduce the concept of an induction-capable machine M_ω which can represent any number of the machines Mₙ in its world model. It is not necessary to represent all of the finite-depth machines in the model of machine M_ω. This induction-capable machine only needs to be able to represent any finite-depth machine on demand. Thus machine M_ω does not require infinite modelling capability, only unbounded modelling capability. (See Figure 5.7.8.)

[Figure 5.7.8: Classes and objects in models, infinite. The inductive system of machines M₀, M₁, M₂, ..., Mₙ, Mₙ₊₁ (modelled system, model, meta-model, ..., metaⁿ⁻¹-model, metaⁿ-model) is modelled by machine M_ω (the meta^ω-model).]
Transfinite induction-capable machines may implement models which represent the aggregate of a well-ordered set of individual machines, ordered by the relation that one machine is modelled within another. Thus one can arrive at a machine which, in principle, may represent any ordinal number set. (See Section 7.2 for ordinal numbers.) However, the ordinal numbers are not closed under this process of building new machines.

[ Check Remark 5.7.26 to see if it can be strengthened a little. ]

5.7.26 Remark: Yet another way to look at the “set of all sets” problem is to observe that the inclusion in a universe set U of the set U itself and all of its subsets is formally the requirement ℙ(U) ⊆ U, where ℙ(U) denotes the set of all subsets of U. (See Definition 5.8.18.) The specification axiom implies that all (or almost all) of the subsets of any set must also be sets. (The subsets must have the special property that they are expressible in terms of a set-theoretic formula. This actually places a severe restriction on which subsets can even be written down.) By basic cardinality theory, assuming that U does have a cardinality, if ℙ(U) ⊆ U then #(ℙ(U)) ≤ #(U). But #(ℙ(A)) > #(A) for any set A. (See for example Halmos [159], pages 100-101.) This is a contradiction. This form of argument is not completely secure, but it does suggest that there is a problem with trying to include a set U and its specifiable subsets inside U.

5.7.27 Remark: In conclusion, one may say that the ZF regularity axiom (Definition 5.1.26 (7)) is not an ad-hoc kludge to fix up Russell’s paradox. The regularity axiom merely prevents self-referential monsters from inflicting painful positive feedback noise upon set theory. Mathematicians have enough work to do already without having to fight such hideous monsters.
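The inequality #(ℙ(A)) > #(A) which is quoted in Remark 5.7.26 is usually proved by a diagonal construction which closely resembles the Russell set: for any map f from A to ℙ(A), the set D = {x ∈ A; x ∉ f(x)} is not a value of f. The following Python sketch checks this by brute force for one small finite set. It is an illustration under stated assumptions, not a proof, and the function names are hypothetical.

```python
from itertools import chain, combinations, product


def power_set(universe):
    """Return all subsets of a finite set as a list of frozensets."""
    items = sorted(universe)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


def diagonal_set(universe, f):
    """Return D = {x in universe; x not in f(x)} for a map f given as a dict."""
    return frozenset(x for x in universe if x not in f[x])


def diagonal_always_escapes(universe):
    """Check every map f: universe -> power set; D must never be a value of f."""
    items = sorted(universe)
    subsets = power_set(universe)
    for images in product(subsets, repeat=len(items)):
        f = dict(zip(items, images))
        if diagonal_set(universe, f) in images:
            return False
    return True


small_set = {0, 1, 2}
print(len(power_set(small_set)), diagonal_always_escapes(small_set))
```

For a set of three elements there are 8 subsets and 512 candidate maps, and in every case the diagonal set is missed, so no map from the set onto its power set exists.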
[ Also describe Curry’s paradox briefly. Then reject it for the same reason that Russell’s paradox is rejected, namely because the axiom of naive comprehension is too naive. Just turning any arbitrary adjective into a noun is too permissive. Curry’s paradox shows that A = {x; x ∈ x ⇒ P} for any proposition P leads to a proof of P. So all propositions are true! However, this assumes that A can be a member of A. So this is a self-referential set. ]

5.8. ZF set theory definitions and notations

5.8.1 Remark: Theorem 5.8.2 states that there is only one set which satisfies the empty set axiom (Definition 5.1.26 (2)). In other words, there is one and only one empty set. Therefore this set can be given its own notation and can be called the empty set.

5.8.2 Theorem: Let A₁, A₂ be sets which satisfy ∀x, x ∉ A₁ and ∀x, x ∉ A₂. Then A₁ = A₂.

Proof: Let A₁, A₂ be sets which satisfy ∀x, x ∉ A₁ and ∀x, x ∉ A₂. To prove that A₁ = A₂, it suffices to show that ∀x, (x ∈ A₁ ⇔ x ∈ A₂) (by the extension axiom, Definition 5.1.26 (1)). But

∀x, x ∉ A₁  ⇒  ∀x, (x ∉ A₁ ∨ x ∈ A₂)    (5.8.1)
            ⇔  ∀x, (x ∈ A₁ ⇒ x ∈ A₂).    (5.8.2)

So ∀x, (x ∈ A₁ ⇒ x ∈ A₂). Similarly ∀x, (x ∈ A₂ ⇒ x ∈ A₁). Therefore ∀x, (x ∈ A₂ ⇔ x ∈ A₁). Line (5.8.1) follows from Theorem 4.11.7 (vi). Line (5.8.2) follows from Definition 4.11.2.

5.8.3 Definition: The empty set is the set A which satisfies ∀x, x ∉ A.

5.8.4 Notation: ∅ denotes the empty set.

5.8.5 Remark: From Theorem 5.8.2, it follows that

∀A, ((∀x, x ∉ A) ⇒ (A = ∅)).

Therefore for any concrete proposition domain (or concrete logic domain) which is modelled by ZF set theory, there will be one and only one individual object E such that the variable name map µ_V : N_V → V maps ∅ to µ_V(∅) = E. (See Remark 3.1.2 for terminology and notations.)
5.8.6 Remark: The empty set in ZF set theory is the unique individual concrete set which has no members. The uniqueness follows from the ZF extension axiom as in the proof of Theorem 5.8.2. The existence follows from the ZF empty set axiom.

When a concrete object exists and is unique, it may be given a constant abstract name. Names are generally given to objects in an informal way. However, it is possible to define constant names in a systematic way. If a set S(z₁, z₂, … zₙ) satisfying x ∈ S(z₁, z₂, … zₙ) ⇔ P(x, z₁, z₂, … zₙ) exists and is unique for all n-tuples (z₁, z₂, … zₙ), then S(z₁, z₂, … zₙ) is a well-defined n-parameter name family. This may be regarded as a named abstract n-parameter logical function. This definition may be introduced as follows.

∀z₁, z₂, … zₙ,   S(z₁, z₂, … zₙ) = {x; P(x, z₁, z₂, … zₙ)}.
Using this convention, one may formally define the 0-parameter function ∅ as ∅ = {x; x ≠ x}. This is, however, an artificial way of defining the empty set because the predicate P(x) = “x ≠ x” is merely designed to be false for all x. This predicate has nothing at all to do with the equality relation. It would be more natural to define P(x) = “⊥(x)”, where ⊥ is the always-false predicate. (See Remark 4.12.9 and Notation 4.12.10.) Thus ∅ = {x; ⊥(x)}. But this is also an artificial way to define the empty set.
An axiom (or theorem) which guarantees the existence of an n-parameter family of sets S(z₁, z₂, … zₙ) often has the form:

∀z₁, z₂, … zₙ, ∃X, ∀y, (y ∈ X ⇔ P(y, z₁, z₂, … zₙ)).

In this case, it is possible to notate the set X as S(z₁, z₂, … zₙ) = {x; P(x, z₁, z₂, … zₙ)}. But sometimes the existence axiom (or theorem) may have the following form.

∀z₁, z₂, … zₙ, ∃X, P(X, z₁, z₂, … zₙ).

In this case, it would be desirable to introduce the notation S(z₁, z₂, … zₙ) for X as follows.

∀z₁, z₂, … zₙ,   S(z₁, z₂, … zₙ) = X; P(X, z₁, z₂, … zₙ).

In other words, S(z₁, z₂, … zₙ) is the unique set X for which P(X, z₁, z₂, … zₙ) is true. That is, instead of notating the set of objects which satisfy the predicate P, one notates the unique single object which satisfies P. In the case of the empty set, we could then define the notation ∅ as:

∅ = X; ∀y, y ∉ X,

where the predicate P is defined by P(X) = “∀y, y ∉ X”. This form of notation introduction is easy to convert into plain language. It means that ∅ denotes the unique set X which satisfies ∀y, y ∉ X. In other words, it denotes the unique set which has no members.

Definition 5.8.9 and Notation 5.8.11 introduce singleton sets {a} for any a. These may be formalized as follows.

∀a,   {a} = {x; P(x, a)}
          = {x; x = a},    (5.8.3)

where P is defined by P(x, z) = “x = z”. Alternatively, one may define:

∀a,   {a} = X; ∀x, (x ∈ X ⇔ P(x, a))
          = X; ∀x, (x ∈ X ⇔ x = a).    (5.8.4)

In this case, the “bracket-less” form of definition (5.8.4) seems less natural than the standard bracketed form (5.8.3). It is important to remember that the equality symbols “=” in the above definitions do not represent an equality relation. (This notational issue for definitions is discussed in Remark 1.6.5.)
5.8.7 Remark: The empty set may be thought of as a “universal subset” because ∅ ⊆ A for all sets A. By symmetry, one might expect also the unique existence of a “universal superset”. It turns out that such a concept is as painful as the empty set is painless. There is no universal set in ZF set theory, in order to avoid Russell’s paradox, but Bernays-Gödel set theory does have a “universal class”. (See Section 5.12 for BG set theory.)

5.8.8 Theorem: Let A be a set. Then ∅ ⊆ A.

Proof: Let A be a set. It is to be shown that ∀x, (x ∈ ∅ ⇒ x ∈ A). By Definition 5.8.3, ∀x, x ∉ ∅. Therefore by Theorem 4.11.7 (vi), ∀x, (x ∉ ∅ ∨ x ∈ A). By Definition 4.11.2, this is equivalent to the proposition ∀x, (x ∈ ∅ ⇒ x ∈ A).

5.8.9 Definition: A singleton is a set A which satisfies ∃x, (x ∈ A ∧ (∀y ∈ A, y = x)).

5.8.10 Remark: For any set x, there is a unique set A which satisfies x ∈ A and ∀y ∈ A, y = x. Existence follows from the ZF unordered pair axiom, Definition 5.1.26 (3). An equivalent condition for a set A to be a singleton is ∃′x, x ∈ A. The unique singleton A which contains a set x is characterized by the proposition y ∈ A ⇔ y = x.

5.8.11 Notation: {x} denotes the singleton set A which satisfies x ∈ A ∧ ∀y ∈ A, y = x.

5.8.12 Notation: {x; P(x)}, for any set-theoretic formula P, means the set S which satisfies x ∈ S ⇔ P(x) if it can be proven that such a set exists.
5.8.14 Remark: There are a few popular variants of Notation 5.8.12. For example, Shoenfield [168], page 242, uses [x | P (x)] to mean {x; P (x)}. (See also Remark 5.8.26 for discussion of alternative notations.) 5.8.15 Notation: {x ∈ A; P (x)} means {x; (x ∈ A) ∧ P (x)} for any set A and set-theoretic formula P . 5.8.16 Remark: A difficulty with Notation 5.8.15 is the fact that “x ∈ A” is actually a proposition with two variables x and A. There is no absolute reason why one should assume that x is the bound variable or “dummy variable” and that “x ∈ A” is a condition which is to be combined with P (x) to define the set. One way to avoid such ambiguity would be a notation such as Ex [(x ∈ A) ∧ P (x)] (where E is mnemonic for “ensemble”), which makes it clear that the symbol x is a bound variable and A is a free variable. (See Remark 5.1.23 for free and bound variables.) One possible generalization of Notation 5.8.15 would be to say that {x R y; P (x)} means {x; (x R y) ∧ P (x)} for any relation R. An ambiguity in this sort of generalized notation is exemplified by an expression such as “{A ⊆ X; P (A)}”. One may guess that A is the dummy variable because it is the first variable before the semicolon, and it appears in P (A). But if this set is written as “{X ⊇ A; P (A)}”, the meaning is ambiguous even though the proposition “X ⊇ A” is equivalent to the proposition “A ⊆ X”. Notation 5.8.23 attempts to clarify this situation. 5.8.17 Theorem: (i) A = {x; x ∈ A} = {x ∈ A; x = x} for any set A.
(ii) {x ∈ A; P(x)} ⊆ A, for any set A and any set-theoretic formula P.

5.8.18 Definition: The power set of a set X is the set {A; A ⊆ X} of all subsets of X.

5.8.19 Notation: ℙ(X) for a set X denotes the power set of X.
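For a finite set, the power set of Definition 5.8.18 and the inclusion in Theorem 5.8.17 (ii) can be checked by direct enumeration. This is a sketch under the assumption that frozensets model sets; the names power_set and restrict are hypothetical helpers introduced only for this illustration.

```python
from itertools import combinations

def power_set(X):
    """Return the set of all subsets of the finite set X, each as a frozenset."""
    elems = list(X)
    return {frozenset(c)
            for r in range(len(elems) + 1)
            for c in combinations(elems, r)}

def restrict(A, P):
    """Finite analogue of {x ∈ A; P(x)}: the elements of A which satisfy P."""
    return frozenset(x for x in A if P(x))

X = frozenset({1, 2, 3})
PX = power_set(X)

# A restricted specification always yields a subset of the given set.
evens = restrict(X, lambda x: x % 2 == 0)
print(len(PX), evens <= X, evens in PX)
```

Every element of PX is a subset of X by construction, and every set built by restrict lands inside PX.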
5.8.13 Remark: If it has not been proven in ZF set theory that x ∈ S ⇔ P (x) for some set S, then Notation 5.8.12 denotes merely a conjectural set. The introduction of such conjectural sets is error-prone because sometimes such notations are mistaken for well-defined sets which can be used as such. Therefore all use of this form of notation should be accompanied by a proof that the set exists, or else should be indicated very clearly as conjectural. By contrast, Notation 5.8.15 is safe because the set is known to be well defined by the specification (or replacement) axiom.
5.8.20 Remark: A popular alternative notation for the power set ℙ(X) is 2^X, but this is inconvenient for typing and is not very accurate. The notation 2^X is best reserved for the set of functions from X to 2 = {0, 1}. (See also Notation 6.5.13 and Remark 7.9.6.)

5.8.21 Remark: The power set in Definition 5.8.18 is well defined for any set X by ZF axiom (5) in Definition 5.1.26.

5.8.22 Remark: The power set ℙ(X) is defined by the proposition ∀A, (A ∈ ℙ(X) ⇔ A ⊆ X). Therefore the proposition A ⊆ X can always be replaced by the equivalent proposition A ∈ ℙ(X) and vice versa. This equivalence is frequently useful in applications.

5.8.23 Notation: {A ⊆ X; P(A)} for any set X and set-theoretic formula P means {A ∈ ℙ(X); P(A)}.

5.8.24 Remark: Life gets more confusing with expressions such as {(x, y, z) ∈ ℝ³; y = z} or {f : X → Y; f is one-to-one}. Such expressions can be reduced to Notation 5.8.15, for example {p ∈ ℝ³; ∃x, y, z, (p = (x, y, z) ∧ y = z)} and {f ∈ Y^X; f is one-to-one}. (See Notation 6.5.15 for the latter style of notation for sets of functions.) Usually the context should make clear which variables are the dummy variables.

5.8.25 Remark: Another kind of confusion can arise with expressions such as {f(x); x ∈ A} for a set-theoretic expression f which is a function on A (i.e. f(x) is uniquely defined for all x ∈ A). It follows by ZF Axiom (6) (Definition 5.1.26) that {f(x); x ∈ A} is a set if A is a set and f is a set-theoretic function on A. Notation 5.8.27 formalizes this.

5.8.26 Remark: Some people use the notation {x ∈ A : P(x)} for the set of x ∈ A such that P(x) is true. The colon can be confusing in set constructions such as {f : A → B : P(f)}.
Some people use the notation {x ∈ A | P(x)}, but this can be confusing in probability contexts. In this book, the notation {x ∈ A; P(x)} is used throughout. The semicolon means “such that”. The vertical stroke notation “ | ” for “such that” may be confused with the following notations.

(i) The probability notation “Prob(E | F)” for events E and F, which means the probability of the event E conditioned by the event F.
(ii) The modulus |z| of a complex number z, the norm |v| of a vector v, or the determinant |A| of a matrix A.
(iii) The restriction f|_X of a function f to a set X.
(iv) The divisibility operator. For example, m | n means that m divides n.

5.8.27 Notation: {f(x); x ∈ A} for any set A and set-theoretic function f means {y; ∃x ∈ A, f(x) = y}.

5.8.28 Remark: It is a good habit to avoid “naked dummy variables” when specifying sets. An expression such as {x; P(x)} which does not specify a set which is to be restricted by the proposition P is dangerous because the expression might not specify a valid set. This is no problem in NBG set theory (which is not adopted for this book), but in ZF set theory it is very important to ensure that all set specifications are valid sets so as to avoid Russell’s paradox. The simplest way to achieve this is to use Notation 5.8.15 to specify sets. If A is previously verified to be a set, and P is a valid set-theoretic formula, then {x ∈ A; P(x)} is a set by ZF Axiom (6) in Definition 5.1.26.

5.9. Axiom of choice

[ Like most of the book, this section is in the “ideas capture phase”. Therefore there is currently a lot of repetition and informal discussion. ]

5.9.1 Remark: The author has decided to present ZF as the standard, safe set theory for the differential geometer to work in. Theorems which are valid in ZF without AC (axiom of choice) could be referred to as “ZF-clean” theorems. All “AC-tainted” theorems will be tagged with a warning indication like “Theorem [zf+ac]”. (See for example Theorem 20.1.4.) Mathematicians who are comfortable with AC might call such theorems “AC-enhanced”. The reader can thus be made aware that such theorems are at the border of meaningfulness, perhaps even a little outside the border. Theorems which can be proven with the weaker
axiom of countable choice will be tagged as “Theorem [zf+cc]” and will be called “CC-tainted”. (Examples: Theorem 7.2.26 and 7.2.36.) It is very easy to accidentally use AC in a proof, especially in topology. Theorems claimed as ZF-clean in this book may be tagged as AC-tainted or CC-tainted in a later revision if a dependency is discovered. Caveat lector!
5.9.2 Remark: The ZF existence axioms 2, 3, 4, 5, 6 and 8 in Definition 5.1.26 are essentially of the explicit set-membership form ∃X, ∀z, (z ∈ X ⇔ F(z)) for some predicate F. The axiom of choice is not of this form. The explicit set-membership specification permits the immediate definition of a unique set X whose members satisfy F. (See Remark 5.1.18.) The axiom of choice does not enable any such definition, unique or not. This is why the axiom of choice is useless. AC simply does not tell you what the members of a set X are. One must remember that a set is defined by its members. The membership of a set is its only property. If the membership is not known, the set is not known.

5.9.3 Remark: The axiom of choice may be described as a “pixies at the bottom of the garden” axiom: you know they’re there, but you never see them. Given a set U, a subset X of U which is guaranteed to exist only by the axiom of choice cannot be determined explicitly. That is, there are elements x ∈ U for which it is not possible to determine whether x ∈ X. The set “exists”, but you can’t see it. (A good example of this is a set which is not Lebesgue measurable.) Some mathematicians feel quite comfortable with this while others find the discomfort intolerable. The axiom of choice is rejected in this book. (If the reader sees any explicit or implicit use of the axiom of choice in the book, please notify the author immediately.) Mendelson [164], page 197, says: “The Axiom of Choice is one of the most celebrated and contested statements in the theory of sets.” Struik [193], page 201, says the following about Zermelo’s proof of the well-ordering of real numbers in 1902: Since Zermelo based his proof on the Auswahlaxiom, which states that from each subset of a given set one of its elements can be singled out, mathematicians differed on the acceptability of a proof where no constructive procedure could be given for the finding of such an element. Hilbert and Hadamard were willing to accept it; Poincaré and Borel were not. This kind of disagreement led to a split between “formalism” and “intuitionism”. The battle is still worth fighting. Most scientists would reject a theory that explains events on Earth in terms of an arbitrary theory about goblins in another galaxy. Even though formally the theory might be valid, an untestable theory is useless. The axiom of choice is untestable because it gives you sets which you can never see. Just as Ockham’s razor is invoked in science to eliminate arbitrary and untestable theorizing, so also should the axiom of choice be rejected in mathematics. Real mathematicians solve problems by constructing solutions, not by pulling them out of a hat.

5.9.4 Remark: In the author’s opinion, there is no need for the axiom of choice in serious mathematics. It is an interesting axiom for recreational mathematics. For example, the existence of Lebesgue non-measurable subsets of the real numbers can be proved with the axiom of choice, but you can never know what is in such a set. If X is a non-measurable subset of ℝ, it is not possible to determine for all x ∈ ℝ whether x ∈ X or x ∉ X. In other words, such sets may exist in some abstract sense, but it is not possible to determine the set’s contents. The axiom of choice is the “lucky dip axiom”. With all of the other existential axioms, you choose what is in the set, and the axiom guarantees that your set exists if you obey the rules. The set which the Axiom of Choice delivers to you is not of your choosing and not under your control. The choice set or choice function is an unknowable, random selection of elements from a set of sets. Not only is the set or function determined by roulette wheels and dice, but you don’t even get told what the numbers are after the wheel has been spun and the dice have been rolled. The Axiom of Choice is the roulette wheel of set theory! If a set’s contents can be determined, there must be some sort of rule or procedure which determines which elements are in the set. In other words, the set is constructible according to some procedure or formula. Even though the axiom of choice is intuitively appealing because one feels that choices from non-empty sets must be possible, it causes sets to come into existence which have an indeterminate membership. It is very difficult to say that a set really exists if it has a membership which cannot be determined. Since the whole raison d’être of sets is to specially distinguish members from non-members, a set with undeterminable membership is incapable of fulfilling its only purpose. One can write symbols for such objects and manipulate them with self-consistent logic, but they are of no practical use. Therefore it will be assumed in this book that the axiom of choice is not one of the axioms of mathematics. (In fact, the axiom of choice will be derided and vilified at every opportunity. If a theorem requires AC, tough!)
The author rejects the Axiom of Choice not because it is possibly wrong, but because it is certainly useless.

5.9.5 Remark: Definition 5.9.6 adds the axiom of choice to the ZF set theory in Definition 5.1.26. This can be useful when courage and moral fortitude fail. The set of axioms ZF plus AC is sometimes denoted as ZFC. Although it is now well known that AC is independent of the ZF axioms, there are limited forms of the axiom of choice which do follow from ZF. Some AC-similar ZF theorems are discussed in Section 7.8.

5.9.6 Definition: A Zermelo-Fraenkel set theory with axiom of choice is a set theory which satisfies all of the conditions of Definition 5.1.26 together with Axiom (9).

(9) The choice axiom: For any set X,
((∀A, (A ∈ X ⇒ ∃x, x ∈ A)) ∧ (∀A, ∀B, ((A ∈ X ∧ B ∈ X ∧ A ≠ B) ⇒ ¬∃x, (x ∈ A ∧ x ∈ B))))
⇒ ∃y, (y ⊆ ⋃X ∧ ∀A, (A ∈ X ⇒ ∃z, (z ∈ A ∧ z ∈ y ∧ ∀w, ((w ∈ A ∧ w ∈ y) ⇒ w = z)))).

[ Comment on the individual sub-expressions of Definition 5.9.6, Axiom (9), and axioms (5.9.1) and (5.9.2). Comment on how they are related. ]

5.9.7 Remark: It is difficult to have faith in an axiom which doesn’t fit on a single line, even using the abbreviation ⋃X for the union of X. Using even more abbreviations, Definition 5.9.6, Axiom (9), may be expressed as follows.

((∀A ∈ X, A ≠ ∅) ∧ (∀A, B ∈ X, (A ≠ B ⇒ A ∩ B = ∅))) ⇒ ∃y ⊆ ⋃X, ∀A ∈ X, ∃z, A ∩ y = {z}. (5.9.1)

The set y contains a single choice of an element in each set A in the collection X.

As mentioned in Remark 5.1.18, the set y whose existence is asserted by Definition 5.9.6, Axiom (9), is not specified by a proposition of the form ∃y, ∀x, (x ∈ y ⇔ F(x)) for some predicate F. The axiom of choice does not specify a unique set. The axiom only claims that at least one such set y ⊆ ⋃X exists. There is no indication of how one might determine the membership of the set y. Since the membership of a set is its only property, and this essential property cannot be determined, the axiom of choice is useless for constructing a set.

[ Show the equivalence of the choice axiom (9) and axiom (5.9.2). ]

5.9.8 Remark: EDM2 [34], section 33.B expresses the axiom of choice similarly to the following:

(∀x ∈ X, ∃y, P(x, y)) ⇒ ∃f, (∀x ∈ X, P(x, f(x))), (5.9.2)

for any set X and set-theoretic formula P. (See Definition 5.1.22 for an explanation of set-theoretic formulas.) Here f is a set which happens to be a function with domain including X, not a set-theoretic formula. (See Figure 5.9.1.) This formulation of the axiom is incomplete because it does not include the requirement that f be a function. However, it has the advantage of intuitive clarity.

Figure 5.9.1 Choice function f for a relation P
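The shape of formulation (5.9.2) can be illustrated in the finite case, where no axiom of choice is needed because an explicit rule is available. The sketch below assumes a finite search range Y for the witnesses y (formula (5.9.2) has no such bound) and uses the rule “take the least witness”; the name choice_function and the sample relation P are hypothetical.

```python
def choice_function(X, Y, P):
    """Build f with P(x, f(x)) for every x in X, searching the finite range Y.

    The rule is explicit: f(x) is the least y in Y with P(x, y).
    """
    f = {}
    for x in X:
        witnesses = [y for y in Y if P(x, y)]
        if not witnesses:
            # The hypothesis "for all x in X, there exists y with P(x, y)" fails.
            raise ValueError(f"no witness for x = {x!r}")
        f[x] = min(witnesses)
    return f

X = {1, 2, 3}
Y = range(10)

def P(x, y):
    """Sample relation: y is an even number greater than x."""
    return y > x and y % 2 == 0

f = choice_function(X, Y, P)
print(f)
```

Here the membership of f is fully determined by the rule, so f is of the kind which the other ZF existence axioms deliver. The axiom of choice is concerned with the cases where no such rule can be written down.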
5.9.9 Remark: A simple summary of the axiom of choice is (S ≠ ∅ ∧ (∀i ∈ S, X_i ≠ ∅)) ⇒ ×_{i∈S} X_i ≠ ∅. But this requires the formal definition of arbitrary cross products in Section 7.7. Unlike the ZF existence axioms, the axiom of choice does not specify the elements of a particular set. AC tells you that a particular set of sets is not empty. This is a weak assertion. No new set is brought into existence. One is told by AC only that any direct product of a non-empty collection of non-empty sets contains at least one element. When an ex machina “roulette wheel” has chosen the elements for all sets in a collection, one may make one’s own choices for a finite sub-collection, and let the roulette wheel choose the rest. In this way, many more elements of the direct product can be found.

5.9.10 Remark: It is important to be aware of what one loses by not accepting the axiom of choice. The following assertions require the general axiom of choice.

(i) Every set is equinumerous to an ordinal number. Therefore the cardinality of all sets is well defined. (See Mendelson [164], page 200.)
(ii) Any non-empty direct product of compact topological spaces is a compact topological space. (Tikhonov’s theorem. See Remark 15.7.10.) For a topological space to be compact requires only that a particular kind of subcollection of given collections of sets should exist. Compactness does not require the specification of any method of determining the contents of these subcollections. This makes the axiom of choice very tempting because it asserts the existence of sets, but no method of determining the contents.
(iii) There exists a subset of ℝ which is not Lebesgue measurable. (See Theorem 20.1.4.)
(iv) Any linear space X over a field k has a basis. (See Theorem 10.2.25.)
(v) Cardinal numbers are comparable. [ See EDM2 [34], 34.B. This is probably the same as trichotomy? ]
(vi) The Heine-Borel theorem (Theorem 17.3.27)? [ This must be checked. ]

The following assertions require the axiom of countable choice.

(vii) Any Dedekind-finite set (Definition 7.2.24) is a finite set as defined by equinumerosity to finite ordinal numbers. (See Theorem 7.2.26.)
(viii) Every infinite set has a subset which is equinumerous to the set ω of finite ordinal numbers. That is, if X is an infinite set, then ∃f, ((∀i ∈ ω, f(i) ∈ X) ∧ (∀i, j ∈ ω, (i = j ⇔ f(i) = f(j)))). (See Theorem 7.2.28.)
(ix) The union of a countable number of countable sets is countable. (See Theorem 7.2.36.)

5.9.11 Remark: Concerning the Lebesgue non-measurable sets mentioned in Remark 5.9.10 (iii), an example of a non-measurable set may be constructed by partitioning ℝ into equivalence classes according to whether the difference between two numbers is a rational number. Then these (apart from the equivalence class of zero) are paired into mirror images of each other, and a function f is defined to be 1 on one class of each pair and 0 on the other class. No doubt, such a function can be defined. But without the axiom of choice, such a set cannot be reached by set-theoretic formulas inserted into the ZF axiom templates. The problem seems to be that the set theory language cannot independently specify an uncountable number of function values. More importantly, if humans could do so, then surely this could be expressed in symbolic algebra. Therefore the problem seems to be ultimately a limitation of the human mind. It is not possible to choose a toss of the coin for each member of an uncountable set of equivalence classes which cannot be enumerated in some systematic way. Countable sets can often be enumerated and manipulated according to inductive or iterative rules. (This is without an axiom of countable choice. See Remark 7.8.3.) There is apparently no way to express a rule to deal with the very complex structure of these equivalence class pairs. Since we cannot enumerate them in terms of other sets, we cannot assign explicit values. So AC comes to the rescue by tossing a coin for us an infinite number of times. But since we cannot see the results of the coin tosses, and we would not be able to register the results in our limited minds anyway, we can do absolutely nothing with the resulting non-measurable set except feel comforted (or discomforted) by its existence. We will never be able to draw a graph of it as one can in the case of a nowhere-differentiable continuous function, for example. It is not possible to know what its value f(x) is for x = exp(4.2), except that f(x) = 0 or 1. In short, such a function is a dead end. All sets and functions ‘constructed’ with AC are dead ends. It is not even possible to devise algorithms to numerically approximate the values of such sets and functions.
In the case of countably infinite enumerations, even though we cannot carry out an infinite number of calculations, we can carry out as many as we wish. Given enough time, we can imagine reaching any point in the enumeration. (Some mathematicians have rejected this view though.) In the case of uncountable sets, there is typically no way to convince ourselves that we could get to any point in finite time. In other words, countable induction fails. Since the human mind cannot go beyond this point, it seems pointless to fill in this void in human capability with the Axiom of Choice which provides only dead ends to further investigation. Consider the example of the decimal expansion of π. We could never calculate 10^80 digits of π, but we generally accept that it can be done “in principle”. So it is impossible, but we accept it anyway. The whole concept of the real number system relies upon this kind of acceptance that decimal expansions (or some equivalent) can be continued indefinitely. Even though π is unknowable, we can know as many digits as we wish, without any hard barrier to stop us. AC only tells us that there is something in the void, but we cannot know anything about it. One may accept the Axiom of Choice if one wishes to do so. But a mathematics which has black-box sets which cannot be opened is untidy. This is like receiving gifts of thousands of books in boxes which can never be opened. One might as well refuse the gifts and keep the house tidy. One may never finish reading the infinite book of digits of π, but most people would accept that being able to read several pages is better than nothing. Reality can’t be measured to infinite digits anyway. (And besides, according to general relativity, the circumference/diameter ratio of a circle depends on local space-time curvature and the size and orientation of the circle!)

It seems that the human mind can accept that linear thought patterns can be extended indefinitely, which is what induction is: an indefinite sequence of linear thought. But uncountable sets cannot be reached in such a linear sequence. It is then highly questionable whether a simple linear mind can ever appreciate a universe which is not at all linear. Differential geometry, as a tool-chest for physics, is an attempt to reduce massively uncountable complexity to linear sequences of thought. This is surprisingly feasible, but it must never be forgotten that the world is understood through the extremely narrow funnel of the single-threaded human mind. Even basic measure theory has dead ends which cannot be investigated further. So the full complexity of differential geometry inevitably has a forest of dead ends. The question of the Axiom of Choice gives some idea of the limitations of mathematical thinking and modelling. Mathematics is only neat and tidy if you ignore the voids beyond which one cannot think. This is yet another reason to not view mathematics as a solid bedrock for physics or any other science.

5.9.12 Remark: Mendelson [164], page 201, has the following pro-AC statement: The status of the Axiom of Choice has become less controversial in recent years. To most mathematicians it seems quite plausible and it has so many important applications in practically all branches of mathematics that not to accept it would seem to be a wilful hobbling of the practicing mathematician. So AC makes life easier. This seems a poor excuse for introducing sets with undeterminable membership into set theory, considering that the membership of a set is the only purpose for its definition. The sets which require AC can safely be ignored because one will never arrive at them without AC. No contradictions arise by not admitting the “pixies at the bottom of the garden”. It may feel better to know that all sets have a cardinal number, but the sets which do not have a cardinal number will never be encountered in practice. AC does not give any practical benefit. A good example of the uselessness of the axiom of choice is the existence of a basis for every linear space. Theorem 10.2.25, with the assistance of the axiom of choice, asserts that every linear space has a basis. However, it is not possible to do anything in practice with a linear space which has an intangible basis. So in practice, only those spaces for which a basis can be constructed are useful. Therefore in this book, when a basis is required for a theorem, the existence of the basis is required as a condition. In this way, the axiom of choice is avoided, and no useful theorems are lost. An example is Theorem 10.5.3 which asserts the existence of linear functionals of a particular kind on any linear space which has a basis. Every usable linear space has a basis. So this theorem covers all usable linear spaces. The same general observation applies to all applications of the axiom of choice.

[ See Halmos [159], page 60, for a comment on the history of trying to avoid the axiom of choice. ]

5.9.13 Remark: The proof of the Heine-Borel theorem (Theorem 17.3.27) sometimes uses the Axiom of Choice in the equivalent form of Zorn’s Lemma. (See Simmons [139], p. 110–114.) But it seems that AC is
not required. For example, see Taylor [144], pages 30–31.

[ There’s a problem with using ordered pairs in Remark 5.9.14. They are not introduced until Definition 6.1.3. ]

5.9.14 Remark: In the literature, there are many lists of equivalents for the Axiom of Choice. (See for example EDM2 [34], section 34, Mendelson [164], page 197, Halmos [159], page 60.) Some well-known equivalents for AC, under the assumption of the ZF axioms, are as follows.

(1) Some slightly different formulations of the Axiom of Choice in Remark 5.9.8, equation (5.9.2):

(1.1) For any set X,
(X ≠ ∅ ∧ ∀A ∈ X, A ≠ ∅) ⇒ ∃f, ((∀x ∈ X, ∀y, z, (((x, y) ∈ f ∧ (x, z) ∈ f) ⇒ y = z)) ∧ (∀B ∈ X, f(B) ∈ B)).
In other words, for any non-empty set X of non-empty sets, there is a function f with domain X such that f(B) ∈ B for all B ∈ X. This is perhaps the best-known form of AC.

(1.2) For any set X,
∃f, ∀B, ((B ⊆ X ∧ B ≠ ∅) ⇒ f(B) ∈ B).
The function f is a set which happens to be a function, not a set-theoretic formula. This axiom means that for any set X, there exists a choice function f such that for any non-empty subset B of X, f(B) ∈ B; that is, f chooses a single element of B. (See Mendelson [164], page 197.) This is illustrated in Figure 5.9.2.
Figure 5.9.2 Choice of one element for every subset B_i of a given set X
(1.3) The Multiplicative Axiom. (Mendelson [164], page 198.) This states that for any set X,
((∀A ∈ X, A ≠ ∅) ∧ (∀A, B ∈ X, (A ∩ B = ∅ ∨ A = B))) ⇒ ∃C, ∀A ∈ X, ∃′x ∈ C, x ∈ A.
In other words, if X is a disjoint collection of non-empty sets, then there exists a set C such that for any member A of X, the set A ∩ C contains exactly one element x. So C is a “choice set” which selects exactly one element from each element of a disjoint collection. See Figure 5.9.3.
Figure 5.9.3 Choice of one element x_i from each set A_i of a disjoint collection X
(2) The Well-Ordering Theorem. This theorem states that any set can be well-ordered. It seems that Zermelo formulated the Axiom of Choice in 1904 with the express purpose of proving the well-ordering theorem. [ Define well-ordering. ]

(3) Zorn’s Lemma. According to EDM2 [34], 34.C, Zorn’s lemma has at least the following 5 forms.

(3.1) For any (partially) ordered set X, if every totally ordered subset of X has an upper bound, then X has at least one maximal element. In other words, every inductively ordered set has at least one maximal element.
(3.2) If every well-ordered subset of an ordered set M has an upper bound, then there is at least one maximal element in M.
(3.3) Every ordered set M has a well-ordered subset W such that every upper bound of M belongs to W.
(3.4) For a condition C of finite character for sets, every set X has a maximal (for the relation of inclusion) subset of X that satisfies C.
(3.5) Let C be a condition of finite character for functions from X to Y. Then, in the set of functions that satisfy C, there is a function whose domain is maximal (for the relation of inclusion).

(4) Trichotomy. (See Mendelson [164], page 198.) For any sets X and Y, either X is equinumerous to a subset of Y, or Y is equinumerous to a subset of X. In other words, either there exists a bijection from X to a subset of Y, or vice versa.

A common feature of AC equivalents is that they talk about “any set”. Theorems containing this phrase should always be viewed with suspicion and be scrutinized for dependence on the axiom of choice. The purpose of the above list is to help the reader spot any sneaky use of AC in an equivalent form.

[ There seem to be plenty more equivalents for the Axiom of Choice. See for example Mendelson [164], page 199 (exercises), Taylor [144], pages 19–21, EDM2 [34], 34.A. ]
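For a finite set X, form (3.4) of Zorn’s lemma needs no axiom of choice, because a maximal subset can be built by an explicit greedy rule. The sketch below assumes the usual meaning of “finite character” (a set satisfies C if and only if all of its finite subsets do). The sample condition no_adjacent and the helper greedy_maximal are hypothetical names chosen for this illustration only.

```python
def no_adjacent(S):
    """Sample condition C: no two elements of S differ by exactly 1."""
    return all(abs(a - b) != 1 for a in S for b in S)

def greedy_maximal(X, C):
    """Scan X in increasing order, keeping each element that preserves C.

    Because C is inherited by subsets, an element rejected at some stage
    can never be added later, so the result is maximal for inclusion.
    """
    S = set()
    for x in sorted(X):
        if C(S | {x}):
            S.add(x)
    return S

X = {1, 2, 3, 4}
M = greedy_maximal(X, no_adjacent)
print(sorted(M))
```

The scanning order plays the role of a well-ordering of X. For an arbitrary infinite set no such ordering is available in ZF, and that is the point at which the general statement becomes equivalent to AC.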
5.10. Axiom of countable choice

5.10.1 Remark: This little piece of text appears on the internet somewhere. The Axiom of Countable Choice (CC) is a weak form of the Axiom of Choice. It states that every countable set of nonempty sets has a choice function. ZF+CC (that is, the Zermelo-Fraenkel axioms together with the Axiom of Countable Choice) suffices to prove that the union of countably many countable sets is countable. It also suffices to prove that every infinite set has a countably infinite subset. These consequences seem desirable enough. So probably CC should be invoked to acquire these propositions. It might be very difficult indeed to track down which theorems already use CC without being aware of it.

5.10.2 Remark: The countable version of the axiom of choice is weaker than the standard axiom. The countable version is perhaps more intuitively “obvious” than the full axiom, but it still brings into existence sets whose membership is unknowable.

[ Must find a tidy set-theoretic expression which means “X is countable”. Then use this in Definition 5.10.3 instead of plain English. ]

5.10.3 Definition: A Zermelo-Fraenkel set theory with countable axiom of choice is a set theory which satisfies all of the conditions of Definition 5.1.26 together with Axiom (9′).

(9′) The countable choice axiom: For any countable set X and any set-theoretic formula P,
(∀x ∈ X, ∃y, P(x, y)) ⇒ ∃f, (∀x ∈ X, P(x, f(x))). (5.10.1)

5.10.4 Remark: CC-tainted theorem examples in this book are Theorems 7.2.26, 7.2.28 and 7.2.36.
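The countable-union statement quoted in Remark 5.10.1 needs CC only to select an enumeration of each of the countably many sets. When the enumerations are given explicitly as a single two-index family e(i, j), the union can be enumerated by a definite rule with no choice at all. The following sketch illustrates this under that assumption; the names diagonal_pairs and enumerate_union, and the sample family of multiples, are hypothetical.

```python
from itertools import count, islice

def diagonal_pairs():
    """Yield each pair (i, j) of non-negative integers exactly once,
    walking the anti-diagonals i + j = 0, 1, 2, ..."""
    for n in count():
        for i in range(n + 1):
            yield (i, n - i)

def enumerate_union(e):
    """Enumerate the union of the sets A_i = {e(i, j); j in ω} without repeats."""
    seen = set()
    for i, j in diagonal_pairs():
        value = e(i, j)
        if value not in seen:
            seen.add(value)
            yield value

def multiples(i, j):
    """Hypothetical sample family: A_i is the set of multiples of i + 1."""
    return (i + 1) * j

first = list(islice(enumerate_union(multiples), 8))
print(first)
```

Every element e(i, j) is reached after finitely many steps, since the pair (i, j) lies on anti-diagonal number i + j. The role of CC in the general theorem is only to supply the family e when nothing better than the bare existence of each enumeration is known.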
5.11. Zermelo set theory

Probably there is no use for Zermelo set theory in this book. The difference between Zermelo and ZF set theory is that there are some unusual sets which are guaranteed to exist in ZF which are not guaranteed to exist in Zermelo set theory. Probably these kinds of sets have no practical use in differential geometry. So the slightly weaker Zermelo set theory could (probably) be used for DG instead of ZF. But this book assumes ZF anyway because the replacement axiom does not seem particularly implausible.

[ Should quote some theorems about consistency of ZF and Zermelo axioms. ]

5.11.1 Definition: A Zermelo set theory is a set theory which satisfies all of the conditions of Definition 5.1.26 except that the replacement axiom (6) is replaced with Axiom (6′).

(6′) The specification axiom: For any set x and set-theoretic formula A, ∃y, ∀u, (u ∈ y ⇔ (u ∈ x ∧ A(u))).

5.11.2 Remark: Axiom (6′) is also known as the axiom of separation, the axiom of comprehension or the axiom of subsets.

5.11.3 Remark: The specification axiom is implied by the replacement axiom. Theorem 5.11.4 follows from this. The existence of the set {ω, ℙ(ω), ℙ(ℙ(ω)), ℙ(ℙ(ℙ(ω))), . . .}, where ω is the set of non-negative integers, can be proved in ZF theory but not in Zermelo set theory. (See EDM2 [34], 33.B, page 148, for this statement. See Halmos [159], Section 19, pages 74–77, for constructions using the axiom of substitution.) The difference between the two axiom-sets is not very likely to be an issue for differential geometry. The weaker axiom of specification (6′) is probably sufficient. However, just to be on the safe side, the ZF set theory Axiom (6) is adopted in this book.

[ Theorem 5.11.4 needs to be proved, or else a reference must be found. ]
5.11.5 Remark: There is no axiom-of-choice style of problem here. Axiom (6) constructs sets with definite contents, not some random, unknowable sample of elements of sets. If the specification axiom guarantees the existence of a set X, you can determine whether any given set y is in X or not. This is because the set X is constructed by your own set-theoretic formula which you used in the invocation of the axiom. With AC, by contrast, you never know which elements are in the constructed set. It’s a lucky dip. You get something from the lucky dip, but you don’t know what you get. All of the ZF existential axioms guarantee the existence of sets whose contents you have specified in your choice of the variables (sets or set-theoretic formulas) which you insert into the axiom templates. AC gives you a pig in a poke!
5.12. Bernays-G¨ odel set theory [ This section will contain a presentation of Bernays-G¨odel set theory, not just informal comments. ] [ According to EDM2 [34], 381.G: “A set in the naive sense is a collection {x; C(x)} of all x which satisfy a certain condition C(x).” Does this correspond to the definition of a “class” in Bernays-G¨odel set theory. ] 5.12.1 Remark: ZF set theory may be extended by introducing two types of class. In Bernays-G¨odel set theory, “classes” are defined instead of sets, and a set is defined as a special type of class which is an element of some other class. Thus a class X is said to be a set if there is some class Y such that X ∈ Y . Then “proper classes” are the classes which are not members of any other class. This allows one to define, for example, the class of all sets. The class is not a member of any other class, but it has all sets as members. Russell’s paradox does not happen because the class U , say, of all sets is not a member of itself. The class Q with X ∈ Q ⇔ (X ∈ U ∧ X ∈ / X) gives no paradox because Q ∈ / Q but X ∈ / U . So the Russell’s paradoxical “set” Q is not a set. This sounds a little like an ad-hoc fix, but since BG set theory is reportedly logically self-consistent, it’s difficult to argue against it. [ www.topology.org/tex/conc/dg.html ]
[ draft: UTC 2009–3–21 Saturday 11:36 ]
Differential geometry reconstructed.Copyright (C) 2009, Alan U. Kennington. All Rights Reserved.The author hereby grants permission to print this book draft in A4 format.Printing in all other formats is forbidden.You may not charge any fee for copies of this book draft.
5.11.4 Theorem: Any theorem that can be proved in Zermelo set theory can also be proved in ZermeloFraenkel set theory.
196
5. Sets
BG set theory with its two types of class seems to be a half-way house between ZF set theory which has a single type of class or set and the Russell-Whitehead set theory which had an infinite number of types. A clear presentation of BG set theory is given by Mendelson [164], Chapter 4, pages 159–170. (They call this NBG set theory after John von Neumann who first proposed a similar set of axioms in 1925 and 1928.) 5.12.2 Remark: According to Mendelson [164], page 204, the axiom of choice is consistent and independent with respect to NBG set theory. 5.12.3 Remark: Since Zermelo-Fraenkel set theory is essentially the same thing as Bernays-G¨odel set theory minus the proper classes, one could regard BG set theory as a kind of conceptual layer underlying ZF because ZF is a specialization of BG. Most concepts of ZF theory seem to work very similarly in BG set theory, but classes must always be checked to see if they are sets or proper classes. This increases the amount of work required. Since ZF is embedded inside BG theory, it seems that there is little value in presenting the BG set theory axioms for the purposes of differential geometry. It is more convenient to adopt the ZF axioms, and if one wishes to talk about concepts like the class of all linear spaces or the class of all manifolds, it is sufficient to simply remember to use the word “class” instead of “set” for such generalities so as to avoid Russell’s paradox. Most people just want to do correct calculations anyway. So the philosophical alley-ways into which BG set theory leads are a luxury which one may profitably forego.
5.13. Basic properties of binary set unions and intersections
From this point onwards, the Zermelo-Fraenkel set theory axioms are assumed. Any theorem which requires additional set theory axioms will be tagged with the required axioms.
5.13.1 Remark: The union of any two sets is well defined by the union Axiom (4) in Definition 5.1.26. The intersection of any two sets is well defined by the replacement Axiom (6) or specification Axiom (6′).
5.13.2 Definition: The union of two sets A and B is the set {x; x ∈ A ∨ x ∈ B}. The intersection of two sets A and B is the set {x; x ∈ A ∧ x ∈ B}.
5.13.3 Notation:
A ∪ B for sets A and B denotes the union of A and B.
A ∩ B for sets A and B denotes the intersection of A and B.
5.13.4 Theorem:
(i) A ∪ ∅ = A for any set A.
(ii) A ∩ ∅ = ∅ for any set A.
(iii) A ⊆ B ⇔ ∀z, (z ∈ A ⇔ (z ∈ A ∧ z ∈ B)) for any sets A and B.
Proof: See Exercise 47.2.6.
5.13.5 Remark: Theorem 5.13.4 (iii) is equivalent to the tautology (α ⇒ β) ⇔ (α ⇔ (α ∧ β)) with the meanings α = “z ∈ A” and β = “z ∈ B”.
[ Must also do the families version of Theorem 5.13.6 following Definition 6.8.1. ]
5.13.6 Theorem: The following identities hold for all sets A, B and C.
(i) A ∪ B ⊇ A.
(ii) A ∪ B ⊇ B.
(iii) A ∩ B ⊆ A.
(iv) A ∩ B ⊆ B.
(v) A ∪ B = B ∪ A. (Commutativity of union.)
(vi) A ∩ B = B ∩ A. (Commutativity of intersection.)
(vii) A ∪ A = A.
(viii) A ∩ A = A.
(ix) A ∪ (B ∪ C) = (A ∪ B) ∪ C. (Associativity of union.)
(x) A ∩ (B ∩ C) = (A ∩ B) ∩ C. (Associativity of intersection.)
(xi) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C). (Distributivity of union over intersection.)
(xii) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C). (Distributivity of intersection over union.)
(xiii) A ∪ (A ∩ B) = A. (Absorption of union over intersection.)
(xiv) A ∩ (A ∪ B) = A. (Absorption of intersection over union.)
Proof: These formulas follow from the corresponding formulas for the logical operators ∨ and ∧.
Part (i) follows from Theorem 4.11.7 (vi). Part (ii) follows from Theorem 4.11.7 (vii). Part (iii) follows from Theorem 4.11.7 (iii). Part (iv) follows from Theorem 4.11.7 (iv). Part (v) follows from Theorem 4.11.10 (i). Part (vi) follows from Theorem 4.11.10 (ii). Part (vii) follows from Theorem 4.11.10 (iii). Part (viii) follows from Theorem 4.11.10 (iv). Part (ix) follows from Theorem 4.11.10 (v). Part (x) follows from Theorem 4.11.10 (vi). Part (xi) follows from Theorem 4.11.10 (vii). Part (xii) follows from Theorem 4.11.10 (viii). Part (xiii) follows from Theorem 4.11.10 (ix). Part (xiv) follows from Theorem 4.11.10 (x).
5.13.7 Definition: Two sets A and B are disjoint if A ∩ B = ∅.
5.13.8 Definition: The (relative) complement of a set B within a set A is the set {x ∈ A; x ∉ B}.
5.13.9 Notation: A \ B for sets A and B denotes the complement of B within A.
5.13.10 Theorem: The following identities hold for all sets X and A.
(i) (X \ A) ∩ A = ∅.
(ii) (X \ A) ∪ A = X ∪ A.
(iii) (X \ A) \ A = X \ A.
(iv) X \ (X \ A) = X ∩ A.
Hence the following identities hold if A ⊆ X.
(v) (X \ A) ∩ A = ∅.
(vi) (X \ A) ∪ A = X.
(vii) (X \ A) \ A = X \ A.
(viii) X \ (X \ A) = A.
5.13.11 Theorem (de Morgan’s law): The following identities hold for all sets A, B and X.
(i) X \ (A ∪ B) = (X \ A) ∩ (X \ B).
(ii) X \ (A ∩ B) = (X \ A) ∪ (X \ B).
5.13.12 Theorem: The following statement holds for all sets A, B and X.
(i) A ⊆ B ⇒ (X \ A) ⊇ (X \ B).
5.13.13 Theorem: The following identities hold for all sets A, B, C and D.
(i) (A ∪ B) ∩ (C ∪ D) = (A ∩ C) ∪ (B ∩ C) ∪ (A ∩ D) ∪ (B ∩ D).
(ii) (A ∩ B) ∪ (C ∩ D) = (A ∪ C) ∩ (B ∪ C) ∩ (A ∪ D) ∩ (B ∪ D).
And so forth . . .
[ See Simmons [139], p. 6–14 for more properties of sets. ]
Proof: See Exercise 47.2.7.
5.13.14 Definition: The (symmetric) set difference of two sets A and B is the set (A \ B) ∪ (B \ A).
5.13.15 Notation: A △ B, for sets A and B, denotes the symmetric set difference of A and B.
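The identities for binary unions, intersections and relative complements in Theorems 5.13.6, 5.13.10, 5.13.11, 5.13.12 and 5.13.13 can be spot-checked by brute force on a small finite universe. The following Python sketch does this, assuming that Python's frozenset operators |, & and - are acceptable stand-ins for ∪, ∩ and \. The universe {0, 1, 2} and the function names are illustrative choices, not part of the text. Such a check is of course no substitute for the proofs, since it covers only finitely many finite sets.

```python
from itertools import combinations, product


def subsets(universe):
    """Return every subset of a finite universe as a frozenset."""
    items = list(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def check_binary_identities(universe=frozenset({0, 1, 2})):
    """Exhaustively test selected identities on all subsets of `universe`."""
    subs = subsets(universe)
    empty = frozenset()
    for A, B, X in product(subs, repeat=3):
        # Theorem 5.13.6 (xi)-(xiv), with X playing the role of C.
        assert A | (B & X) == (A | B) & (A | X)
        assert A & (B | X) == (A & B) | (A & X)
        assert A | (A & B) == A
        assert A & (A | B) == A
        # Theorem 5.13.10 (i)-(iv): no assumption that A is a subset of X.
        assert (X - A) & A == empty
        assert (X - A) | A == X | A
        assert (X - A) - A == X - A
        assert X - (X - A) == X & A
        # Theorem 5.13.10 (vi), (viii): only claimed when A is a subset of X.
        if A <= X:
            assert (X - A) | A == X
            assert X - (X - A) == A
        # Theorem 5.13.11 (de Morgan's law).
        assert X - (A | B) == (X - A) & (X - B)
        assert X - (A & B) == (X - A) | (X - B)
        # Theorem 5.13.12 (i).
        if A <= B:
            assert (X - A) >= (X - B)
    for A, B, C, D in product(subs, repeat=4):
        # Theorem 5.13.13 (i), (ii).
        assert (A | B) & (C | D) == (A & C) | (B & C) | (A & D) | (B & D)
        assert (A & B) | (C & D) == (A | C) & (B | C) & (A | D) & (B | D)
    return True
```

Calling check_binary_identities() runs through all triples and quadruples of the 8 subsets of the universe and raises an AssertionError at the first identity which fails.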
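Definition 5.13.14 can be transcribed directly into code. The Python sketch below is a hedged illustration which assumes a small hypothetical universe {0, 1, 2}. It compares the definition with Python's built-in ^ operator on sets, and then checks the properties of △ stated in Theorems 5.13.16 and 5.13.18, with the grouping of operations made explicit by parentheses. It also checks the parity characterization discussed in Remark 5.13.17, namely that a point lies in A △ B △ C if and only if it lies in an odd number of the three sets.

```python
from itertools import combinations, product


def subsets(universe):
    """Return every subset of a finite universe as a frozenset."""
    items = list(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def sym_diff(A, B):
    """Symmetric set difference as in Definition 5.13.14."""
    return (A - B) | (B - A)


def check_symmetric_difference(universe=frozenset({0, 1, 2})):
    """Exhaustively test properties of the symmetric difference."""
    subs = subsets(universe)
    empty = frozenset()
    for A, B, C in product(subs, repeat=3):
        # The built-in operator agrees with the definition.
        assert sym_diff(A, B) == A ^ B
        # Theorem 5.13.16 (i)-(iv).
        assert A ^ A == empty
        assert A ^ empty == A
        assert A ^ B == B ^ A
        assert (A ^ B) ^ C == A ^ (B ^ C)
        # Parity: x is in A △ B △ C iff x is in an odd number of the sets.
        odd = frozenset(x for x in universe
                        if ((x in A) + (x in B) + (x in C)) % 2 == 1)
        assert (A ^ B) ^ C == odd
        # Theorem 5.13.18 (i)-(vi), fully parenthesized.
        assert A & (B ^ C) == (A & B) ^ (A & C)
        assert A | (B ^ C) == (A | B | C) - ((B & C) - A)
        assert A ^ (B | C) == ((A ^ B) | (A ^ C)) - (A & (B ^ C))
        assert A ^ (B | C) == ((A ^ B) & (A ^ C)) | ((B ^ C) - A)
        assert A ^ (B & C) == ((A ^ B) | (A ^ C)) - ((B ^ C) - A)
        assert A ^ (B & C) == ((A ^ B) & (A ^ C)) | (A & (B ^ C))
    return True
```

The parity test is the one which explains the symmetry and associativity rules at a glance: counting memberships does not depend on the order or grouping of the sets.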
5.13.16 Theorem: The symmetric set difference has the following properties. (i) A △ A = ∅.
(ii) A △ ∅ = A.
(iii) A △ B = B △ A.
(iv) (A △ B) △ C = A △ (B △ C).
5.13.17 Remark: The properties of the symmetric set difference in Theorem 5.13.16 are quite satisfyingly simple and symmetric. This is because a symmetric set difference contains a point if and only if the number of sets which contain the point is odd. This observation easily implies both the symmetry rule (iii) and the associativity rule (iv). The set A △ A in part (i) contains no points because all points are either in zero or two of the sets A and A.
Theorem 5.13.18, on the other hand, contains almost no satisfyingly simple and symmetric rules because the symmetric set difference operator, which is arithmetic in nature, is mixed with the union and intersection operators, which are logical in nature. The logical operators have simple relations among themselves, and the arithmetic operator △ has simple relations. But combinations of these two kinds of relations yield very few simple results.
5.13.18 Theorem: The symmetric set difference has the following properties.
(i) A ∩ (B △ C) = (A ∩ B) △ (A ∩ C).
(ii) A ∪ (B △ C) = (A ∪ B ∪ C) \ ((B ∩ C) \ A).
(iii) A △ (B ∪ C) = ((A △ B) ∪ (A △ C)) \ (A ∩ (B △ C)).
(iv) A △ (B ∪ C) = ((A △ B) ∩ (A △ C)) ∪ ((B △ C) \ A).
(v) A △ (B ∩ C) = ((A △ B) ∪ (A △ C)) \ ((B △ C) \ A).
(vi) A △ (B ∩ C) = ((A △ B) ∩ (A △ C)) ∪ (A ∩ (B △ C)).
5.14. Basic properties of general set unions and intersections
5.14.1 Remark: The properties of general unions and intersections are especially applicable in topology.
5.14.2 Notation:
⋃S denotes the set {x; ∃A ∈ S, x ∈ A} for any set S.
⋂S denotes the set {x; ∀A ∈ S, x ∈ A} for any non-empty set S.
5.14.3 Remark: The set ⋃S in Notation 5.14.2 is well defined by the union Axiom (4) in Section 5.1. The set ⋂S for a non-empty set of sets S is well defined by Axiom (6) or (6′).
5.14.4 Theorem:
(i) x ∈ ⋃S ⇔ ∃A ∈ S, x ∈ A ⇔ ∃A, (x ∈ A ∧ A ∈ S) for any set of sets S.
(ii) x ∈ ⋂S ⇔ ∀A ∈ S, x ∈ A ⇔ ∀A, (x ∈ A ∨ A ∉ S) for any non-empty set of sets S.
5.14.5 Theorem:
(i) ⋃∅ = ∅.
(ii) ⋃{A} = A for any set A.
(iii) ⋂{A} = A for any set A.
(iv) ⋃{A, B} = A ∪ B for any sets A and B.
(v) ⋂{A, B} = A ∩ B for any sets A and B.
5.14.6 Remark: Theorem 5.14.7 gives some generalizations to sets of sets of the statements in Theorems 5.13.6 and 5.13.13. (The reader may like to ponder why part (vii) of Theorem 5.14.7 requires S1 and S2 to be non-empty.)
5.14.7 Theorem: Let A be a set. Let S, S1 and S2 be sets of sets. Then the following statements are true.
(i) S1 ⊆ S2 ⇒ ⋃S1 ⊆ ⋃S2.
(ii) S1 ⊆ S2 ⇒ ⋂S1 ⊇ ⋂S2 if S1 ≠ ∅ and S2 ≠ ∅.
(iii) A ∩ ⋃S = ⋃{A ∩ X; X ∈ S}.
(iv) A ∪ ⋂S = ⋂{A ∪ X; X ∈ S} if S ≠ ∅.
(v) ⋃S1 ∩ ⋃S2 = ⋃{X1 ∩ X2; X1 ∈ S1, X2 ∈ S2}.
(vi) ⋂S1 ∪ ⋂S2 = ⋂{X1 ∪ X2; X1 ∈ S1, X2 ∈ S2} if S1 ≠ ∅ and S2 ≠ ∅.
(vii) ⋃S1 ∪ ⋃S2 = ⋃{X1 ∪ X2; X1 ∈ S1, X2 ∈ S2} if S1 ≠ ∅ and S2 ≠ ∅.
(viii) ⋂S1 ∩ ⋂S2 = ⋂{X1 ∩ X2; X1 ∈ S1, X2 ∈ S2} if S1 ≠ ∅ and S2 ≠ ∅.
(ix) ⋃S1 ∪ ⋃S2 = ⋃(S1 ∪ S2).
(x) ⋂S1 ∩ ⋂S2 = ⋂(S1 ∪ S2) if S1 ≠ ∅ and S2 ≠ ∅.
(xi) A ∈ S ⇒ A ⊆ ⋃S.
(xii) A ∈ S ⇒ A ⊇ ⋂S if S ≠ ∅.
(xiii) (∀X ∈ S, X ⊆ A) ⇔ (⋃S ⊆ A).
(xiv) (∀X ∈ S, X ⊇ A) ⇔ (⋂S ⊇ A) if S ≠ ∅.
(xv) (∀X1 ∈ S1, ∀X2 ∈ S2, X1 ∩ X2 = ∅) ⇔ (⋃S1 ∩ ⋃S2 = ∅).
Proof: See Exercise 47.2.8.
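The general union and intersection of Notation 5.14.2, and several parts of Theorem 5.14.7, can be explored on finite sets of sets. The following Python sketch is only an illustration under stated assumptions: the names big_union, big_intersection and pairwise are hypothetical helpers, sets of sets are modelled as frozensets of frozensets, and the universe {0, 1} is chosen to keep the exhaustive search small. The last function exhibits one concrete pair S1, S2 for which the formula in part (vii) fails when the non-emptiness condition is dropped.

```python
from functools import reduce
from itertools import combinations, product


def subsets(items):
    """Return every subset of a finite collection as a frozenset."""
    items = list(items)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def big_union(S):
    """General union: {x; there is A in S with x in A}. Defined for any S."""
    return frozenset(x for A in S for x in A)


def big_intersection(S):
    """General intersection: {x; x in A for all A in S}, for non-empty S only."""
    if not S:
        raise ValueError("intersection of an empty set of sets is undefined")
    return reduce(frozenset.intersection, S)


def pairwise(op, S1, S2):
    """The set {op(X1, X2); X1 in S1, X2 in S2}."""
    return frozenset(op(X1, X2) for X1 in S1 for X2 in S2)


def check_theorem_5_14_7(universe=frozenset({0, 1})):
    """Test parts (i), (v)-(x) of Theorem 5.14.7 on all sets of subsets."""
    families = subsets(subsets(universe))
    for S1, S2 in product(families, repeat=2):
        U1, U2 = big_union(S1), big_union(S2)
        if S1 <= S2:
            assert U1 <= U2                                          # (i)
        assert U1 & U2 == big_union(
            pairwise(frozenset.intersection, S1, S2))                # (v)
        assert U1 | U2 == big_union(S1 | S2)                         # (ix)
        if S1 and S2:
            I1, I2 = big_intersection(S1), big_intersection(S2)
            assert I1 | I2 == big_intersection(
                pairwise(frozenset.union, S1, S2))                   # (vi)
            assert U1 | U2 == big_union(
                pairwise(frozenset.union, S1, S2))                   # (vii)
            assert I1 & I2 == big_intersection(
                pairwise(frozenset.intersection, S1, S2))            # (viii)
            assert I1 & I2 == big_intersection(S1 | S2)              # (x)
    return True


def part_vii_fails_without_nonemptiness():
    """With S1 empty, the two sides of part (vii) can differ."""
    S1 = frozenset()
    S2 = frozenset({frozenset({1})})
    lhs = big_union(S1) | big_union(S2)
    rhs = big_union(pairwise(frozenset.union, S1, S2))
    return lhs != rhs
```

When S1 is empty there are no pairs (X1, X2) at all, so the set of pairwise unions is empty and its general union is ∅, whereas ⋃S1 ∪ ⋃S2 still contains every element of ⋃S2.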
5.14.8 Remark: Propositions (i) and (ii) in Theorem 5.14.9 are generalizations of De Morgan’s law. (See Theorem 5.13.11.)
5.14.9 Theorem: Let A be a set. Let S be a set of sets. Then the following statements are true.
(i) A \ ⋃S = ⋂{A \ X; X ∈ S} if S ≠ ∅.
(ii) A \ ⋂S = ⋃{A \ X; X ∈ S} if S ≠ ∅.
5.14.10 Remark: Notation 5.8.15 introduced the abbreviation {x ∈ A; P(x)} for {x; (x ∈ A) ∧ P(x)}, where A is a set and P is a set-theoretic formula. Theorem 5.14.11 presents some properties of unions and intersections applied to sets of this form. Part (ii) is perhaps more surprising than part (i).
5.14.11 Theorem:
(i) x ∈ ⋃{A ∈ S; P(A)} ⇔ ∃A ∈ S, (x ∈ A ∧ P(A)) ⇔ ∃A, (x ∈ A ∧ A ∈ S ∧ P(A)) for any set of sets S and set-theoretic formula P.
(ii) x ∈ ⋂{A ∈ S; P(A)} ⇔ ∀A ∈ S, (x ∈ A ∨ ¬P(A)) ⇔ ∀A, (x ∈ A ∨ A ∉ S ∨ ¬P(A)) for any non-empty set of sets S and set-theoretic formula P.
Proof: See Exercise 47.2.9.
5.14.12 Remark: Theorem 5.14.13 presents some basic properties of the power set construction in Definition 5.8.18 and Notation 5.8.19.
5.14.13 Theorem: Let A1 and A2 be sets. Let S1 and S2 be sets of sets. Then the following statements are true.
(i) A1 ∈ ℙ(A1).
(ii) A1 ⊆ A2 ⇒ ℙ(A1) ⊆ ℙ(A2).
(iii) ⋃(ℙ(A1)) = A1.
(iv) ∀C ∈ ℙ(S2), ⋃C ∈ ℙ(⋃S2). That is, S1 ⊆ S2 ⇒ ⋃S1 ⊆ ⋃S2.
(v) S1 ⊆ ℙ(⋃S1).
(vi) S1 ⊆ ℙ(S2) ⇒ ⋃S1 ⊆ S2.
Proof: See Exercise 47.2.10.
5.14.14 Remark: The set construction ⋃_{x∈X} f(x) in Notation 5.14.15 (i) is well defined by the replacement axiom, Definition 5.1.26 (6), which guarantees that {y; ∃x ∈ X, y = f(x)} = {f(x); x ∈ X} is a set. Then the union axiom, Definition 5.1.26 (4), guarantees that the union of this set is also a set. The set construction ⋂_{x∈X} f(x) in Notation 5.14.15 (ii) is well defined by the axiom of specification.
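The generalized De Morgan laws of Theorem 5.14.9 and the power set properties in Theorem 5.14.13 (i), (ii), (iii) and (v) can be checked in the same brute-force manner. The Python sketch below is a minimal illustration, assuming a hypothetical universe {0, 1, 2} and helper names (power_set, big_union, big_intersection) which are not taken from the text.

```python
from functools import reduce
from itertools import combinations, product


def subsets(items):
    """Return every subset of a finite collection as a frozenset."""
    items = list(items)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def power_set(A):
    """The power set of A, as a frozenset of frozensets."""
    return frozenset(subsets(A))


def big_union(S):
    """General union of a set of sets S (defined also for empty S)."""
    return frozenset(x for A in S for x in A)


def big_intersection(S):
    """General intersection of a non-empty set of sets S."""
    if not S:
        raise ValueError("intersection of an empty set of sets is undefined")
    return reduce(frozenset.intersection, S)


def check_power_set_and_de_morgan(universe=frozenset({0, 1, 2})):
    """Test Theorem 5.14.9 and parts of Theorem 5.14.13 exhaustively."""
    subs = subsets(universe)
    for A1, A2 in product(subs, repeat=2):
        assert A1 in power_set(A1)                        # 5.14.13 (i)
        if A1 <= A2:
            assert power_set(A1) <= power_set(A2)         # 5.14.13 (ii)
        assert big_union(power_set(A1)) == A1             # 5.14.13 (iii)
    for S in subsets(subs):                               # all sets of subsets
        assert S <= power_set(big_union(S))               # 5.14.13 (v)
        if S:
            for A in subs:
                complements = frozenset(A - X for X in S)
                # Theorem 5.14.9 (i) and (ii), for non-empty S only.
                assert A - big_union(S) == big_intersection(complements)
                assert A - big_intersection(S) == big_union(complements)
    return True
```

The guard "if S" mirrors the hypothesis S ≠ ∅ in Theorem 5.14.9. Without it, big_intersection would be asked for the intersection of an empty set of sets, which Notation 5.14.2 leaves undefined.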
5.14.15 Notation:
(i) ⋃_{x∈X} f(x) means ⋃{f(x); x ∈ X} = {y; ∃x ∈ X, y ∈ f(x)} for any set X and any set-theoretic function f such that f(x) is a set for all x ∈ X.
(ii) ⋂_{x∈X} f(x) means ⋂{f(x); x ∈ X} = {y; ∀x ∈ X, y ∈ f(x)} for any set X and set-theoretic function f such that f(x) is a set for all x ∈ X and f satisfies ∃x ∈ X, ∃y, y ∈ f(x).
5.14.16 Theorem:
(i) ⋃_{y∈Y} {x ∈ X; P(x, y)} = {x ∈ X; ∃y ∈ Y, P(x, y)} for any sets X and Y and set-theoretic formula P.
(ii) ⋂_{y∈Y} {x ∈ X; P(x, y)} = {x ∈ X; ∀y ∈ Y, P(x, y)} for any sets X and Y and set-theoretic formula P which satisfies ∃x ∈ X, ∃y ∈ Y, P(x, y).
Proof: See Exercise 47.2.11.
5.14.17 Remark: The set ⋃_{y∈Y} {x ∈ X; P(x, y)} in Theorem 5.14.16 may be thought of as the domain of the set-theoretic formula P if it is subject to the restrictions x ∈ X and y ∈ Y on P(x, y). The set ⋂_{y∈Y} {x ∈ X; P(x, y)} is perhaps less useful. If the set {(x, y) ∈ X × Y; P(x, y)} is interpreted as the graph of a function f : X → Y, the set ⋂_{y∈Y} {x ∈ X; P(x, y)} is the set of x ∈ X for which f({x}) = Y. Relations and functions are defined in Chapter 6.
[ Give a list of properties of the symmetric set difference operator distributed over arbitrary unions and intersections? For example, try to simplify A △ ⋂S and A △ ⋃S. ]
5.15. Closure of set unions under arbitrary unions
5.15.1 Remark: Theorem 5.15.4 states that for any set of sets S, the set of unions of subsets of S is closed under arbitrary unions. This fact is useful in topology, for example for the proof of Theorem 14.8.8. One may write Theorem 5.15.4 more formally as
(∀U ∈ Q, ∃C ∈ ℙ(S), U = ⋃C) ⇒ (∃C̄ ∈ ℙ(S), ⋃Q = ⋃C̄)
for any Q and S.
5.15.2 Remark: In the statement of Theorem 5.15.4, ⋃C ⊆ ⋃S for all C ⊆ S (by Theorem 5.14.7 (i)). So ⋃C ∈ ℙ(⋃S) (as in Theorem 5.14.13 (iv)). Therefore T ⊆ ℙ(⋃S). This guarantees that T is a well-defined set by the ZF axiom of specification. (Alternatively the replacement axiom may be used.)
5.15.3 Remark: To prove Theorem 5.15.4, it is intuitively clear that one may combine all of the collections C of which the sets U are composed into a collection C̄ ∈ ℙ(S), and this combined collection C̄ will have the property that ⋃Q = ⋃C̄, which immediately implies that ⋃Q ∈ T as claimed. The obstacle here is that it is not always possible to reconstruct the collections C from the unions ⋃C.
Given a set Q ⊆ T, it is only known that each element U of Q has the property: ∃C ∈ ℙ(S), U = ⋃C. The set Q is not equipped with information about how its elements U were constructed from sets C ∈ ℙ(S). Even if Q was actually constructed by someone from sets C ∈ ℙ(S), the construction history is discarded when the set is “handed over” to the “recipient” of the set. The recipient of the set Q only knows what the elements U are, together with the useful hint that each such set U is equal to the union of some subset of S. The recipient is not told which subset! For a very large set S, it could be computationally challenging to determine which combination of elements of S may be combined to produce U. The required combination of elements may be uncountably infinite.
An even more unpleasant fly in the ointment is the possibility that the set Q was never even constructed at all. It may be a completely arbitrary subset of T, and S could be a completely arbitrary general set. In such a situation, there is no history of “construction” of Q to be recovered. Recovering the sets C ∈ ℙ(S) may be equivalent to solving a cryptographic puzzle, for example, or some famous “open problem”. It follows from this discussion that the simple aggregation of the sets C will not deliver the required combined set C̄.
The axiom of choice would bail us out of this quandary by providing, for each set U ∈ Q, a single choice of collection C ∈ ℙ(S) such that U = ⋃C. This might not be the same set C which was used in the construction of U, if it was constructed, but that doesn’t matter. The important thing is to obtain a set C̄ ∈ ℙ(S) such that ⋃Q = ⋃C̄. The choice of set C for each U ∈ Q would be equivalent to constructing a choice function f : Q → ℙ(S) which satisfies ⋃f(U) = U for each set U ∈ Q. Then Theorem 6.8.13 applied to f would yield the desired result. However, since this book is avoiding the axiom of choice whenever possible, this approach is not followed here.
The approach followed in the proof of Theorem 5.15.4 is to use the set C̄ = ⋃{C ∈ ℙ(S); ⋃C ∈ Q} for the combined collection of sets which will yield ⋃Q = ⋃C̄. This choice of C̄ combines all possible collections C which could have been used in the construction of the sets U ∈ Q. So it should hopefully satisfy ⋃C̄ = ⋃Q, and hopefully the choice axiom will not be required to prove this. By using all possible set collections C with U = ⋃C, we avoid making a choice of set collection C, thereby avoiding the axiom of choice.
[ Give an example of a set S in Theorem 5.15.4 for which it is impossible or computationally impractical to reconstruct sets C ∈ ℙ(S) for which U = ⋃C for U ∈ Q. E.g. make the solution of this problem equivalent to solving a crypto puzzle or the axiom of choice. Lebesgue non-measurable sets could be useful for the latter. ]
5.15.4 Theorem: Let S be a set of sets. Define T = {⋃C; C ⊆ S}. Then ∀Q ⊆ T, ⋃Q ∈ T.
Proof: Let S be a set of sets. Define T = {⋃C; C ⊆ S}. Let Q ⊆ T. It is to be shown that ⋃Q ∈ T. The set Q satisfies the proposition ∀U ∈ Q, ∃C ∈ ℙ(S), U = ⋃C. This means that Q is a collection of sets which are unions of subcollections of the collection S.
Let C̄ = ⋃{C ∈ ℙ(S); ⋃C ∈ Q}. It will be shown that ⋃C̄ is an element of T which is equal to ⋃Q. To prove this, it suffices to show that C̄ ⊆ S and ⋃C̄ = ⋃Q.
To show that C̄ ⊆ S, note that C̄ ⊆ ⋃ℙ(S) by Theorem 5.14.7 (i) and ⋃ℙ(S) = S by Theorem 5.14.13 (iii). The equality ⋃C̄ = ⋃Q may be derived as follows.
x ∈ ⋃Q ⇔ ∃U ∈ Q, x ∈ U
⇔ ∃U, (x ∈ U ∧ U ∈ Q)
⇔ ∃U, (x ∈ U ∧ U ∈ Q ∧ U ∈ T) (5.15.1)
⇔ ∃U, (x ∈ U ∧ U ∈ Q ∧ ∃C ∈ ℙ(S), (U = ⋃C))
⇔ ∃U, ∃C, (x ∈ U ∧ U ∈ Q ∧ C ∈ ℙ(S) ∧ U = ⋃C)
⇔ ∃C, ∃U, (C ∈ ℙ(S) ∧ U ∈ Q ∧ x ∈ U ∧ U = ⋃C)
⇔ ∃C, (C ∈ ℙ(S) ∧ ∃U, (U ∈ Q ∧ x ∈ U ∧ U = ⋃C))
⇔ ∃C, (C ∈ ℙ(S) ∧ ∃U, (⋃C ∈ Q ∧ x ∈ ⋃C ∧ U = ⋃C))
⇔ ∃C, (C ∈ ℙ(S) ∧ ⋃C ∈ Q ∧ x ∈ ⋃C ∧ (∃U, U = ⋃C))
⇔ ∃C, (C ∈ ℙ(S) ∧ ⋃C ∈ Q ∧ x ∈ ⋃C) (5.15.2)
⇔ ∃C, (C ∈ ℙ(S) ∧ ⋃C ∈ Q ∧ ∃V, (x ∈ V ∧ V ∈ C))
⇔ ∃V, ∃C, (C ∈ ℙ(S) ∧ ⋃C ∈ Q ∧ x ∈ V ∧ V ∈ C)
⇔ ∃V, (x ∈ V ∧ ∃C, (C ∈ ℙ(S) ∧ V ∈ C ∧ ⋃C ∈ Q))
⇔ ∃V, (x ∈ V ∧ ∃C ∈ ℙ(S), (V ∈ C ∧ ⋃C ∈ Q))
⇔ ∃V, (x ∈ V ∧ V ∈ C̄) (5.15.3)
⇔ ∃V ∈ C̄, x ∈ V
⇔ x ∈ ⋃C̄.
It follows that ⋃Q ∈ T. Line (5.15.1) follows from Theorem 5.13.4 (iii) because Q ⊆ T. Line (5.15.2) follows from Theorem 4.16.12 (ii) and Remark 4.11.8. Line (5.15.3) follows from Theorem 5.14.11 (i).
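The construction used in the proof of Theorem 5.15.4 can be followed step by step on a small finite example. The Python sketch below is an illustration only, with a hypothetical four-element set of sets S. It builds T = {⋃C; C ⊆ S}, forms the choice-free collection C̄ = ⋃{C ∈ ℙ(S); ⋃C ∈ Q} for every Q ⊆ T, and checks that C̄ ⊆ S, ⋃C̄ = ⋃Q and ⋃Q ∈ T. It also counts the subcollections of S whose union is {0, 1}, which shows concretely why a set U ∈ Q does not determine the collection C from which it was "constructed".

```python
from itertools import combinations


def subsets(items):
    """Return every subset of a finite collection as a frozenset."""
    items = list(items)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def big_union(S):
    """General union of a set of sets S."""
    return frozenset(x for A in S for x in A)


def combined_collection(S, Q):
    """C-bar: the union of ALL subcollections C of S whose union lies in Q.

    No choice of a particular C is made for any U in Q.
    """
    return big_union(frozenset(C for C in subsets(S) if big_union(C) in Q))


# Hypothetical example: {0, 1} is a union of members of S in several ways.
S = frozenset({frozenset({0}), frozenset({1}),
               frozenset({0, 1}), frozenset({2, 3})})
T = frozenset(big_union(C) for C in subsets(S))


def check_theorem_5_15_4(S, T):
    """Verify the conclusion of Theorem 5.15.4 for every subset Q of T."""
    for Q in subsets(T):
        C_bar = combined_collection(S, Q)
        assert C_bar <= S                        # C-bar is a subset of S
        assert big_union(C_bar) == big_union(Q)  # same union as Q
        assert big_union(Q) in T                 # closure under unions
    return True


# Subcollections C of S with union {0, 1}: the "construction history"
# of the set {0, 1} is not unique.
preimages = [C for C in subsets(S) if big_union(C) == frozenset({0, 1})]
```

In this example T has 8 elements and there are 5 different subcollections of S whose union is {0, 1}. The function combined_collection simply takes all of them at once, which is the finite analogue of avoiding a choice function.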
[ Is there a similar theorem to 5.15.4 for ⋂C? What about finite unions and intersections? Can this be applicable to showing that a topology is generated by all finite intersections of all unions of a set of sets? ]
5.15.5 Remark: Assertions of the form ∀x ∈ X, ∃y, P(x, y) for a proposition P depending on two variables often lead to the temptation to use the axiom of choice because such assertions intuitively suggest the existence of a function f on the set X such that P(x, f(x)) holds for all x ∈ X. In the particular case of Theorem 5.15.4, ∀U ∈ Q, ∃C ∈ ℙ(S), U = ⋃C, which leads to the temptation to propose a function f : Q → ℙ(S) satisfying ⋃f(U) = U for all U ∈ Q.
5.15.6 Remark: Proving theorems without the axiom of choice is a bit like learning how to cook without using meat, fish or eggs. Just as the lacto-vegetarian must find substitutes for forbidden foods, the mathematician who rejects AC must find alternative paths to achieve desired results. The non-AC mathematician must learn to recognize theorems which are “contaminated” by AC or CC. This would be much easier if everyone provided accurate labelling of theorems which indicated the ingredients used in their manufacture. 5.15.7 Remark: Theorem 5.15.4 is related to the CC-tainted Theorem 7.2.36, which says that the union of a countable set of countable sets is countable.
5.15.8 Remark: The many lines of the logical calculation in the proof of Theorem 5.15.4 may seem excessive, but this is the price of certainty that the logic is correct. By checking that every line is correctly derived from axioms and already-proved lines, the final line of a derivation can be trusted to not depend on intuition or unwarranted assumptions. Most mathematical proofs are presented informally. An informal proof relies on the ability of the reader to “fill the gaps” between the steps of the argument. If the correctness of an argument is in contention, the reader must know how to write out the argument in full, no matter how tedious this may be. In mathematics, occasional tedium is better than occasional incorrectness.
[ The unary ∪ and ∩ operators are too small and are too close to the symbol on the right, but the ⋃ and ⋂ symbols are too large for this purpose when the symbol on the right is a small letter. So what is needed are symbols which are in between these two sizes. Probably should define macros for these symbols and define them later to be some better-sized symbols. ]
5.16. Specification tuples
5.16.1 Remark: The fact that mathematicians do not agree on which sets should represent the concepts of differential geometry shows that the sets are not themselves the objects under consideration, but merely serve to indicate which object is being looked at within a class of objects. Every mathematical object should ideally have not only a set to indicate which object it is, but also an object class, a name tag, a scope identifier and an encoding class.
The encoding class is really necessary for the same reason that it is necessary in computer software. Computers represent all data in terms of zeros and ones, which are the same kinds of zeros and ones whether the data is text, integers, floating-point numbers or images in dozens of formats. Therefore all data in computers has some sort of indication of the encoding rules used for each piece of data. It is also necessary to know what class of object is represented. To distinguish one object from another, identifiers (name tags) are used. Since names from different contexts may clash, scope identifiers are often used implicitly or explicitly.
Strictly speaking, mathematics should also have all five of these components:
(1) a set to indicate which object of a class is indicated;
(2) a class tag to indicate the human significance of the object class;
(3) an encoding tag to indicate the chosen representation;
(4) a name tag to indicate which object of a class is indicated;
(5) a scope tag to remove ambiguity from multiple uses of a name tag.
All of this is done informally in most mathematics, but it is helpful to be aware of the limits of the expressive power of sets. Sets should be thought of as the mathematical equivalent of the zeros and ones of computer data. Either explicitly or implicitly, this raw data must be brought to life.
5.16.2 Remark: Mathematical objects may be organized into classes. The objects in each mathematical class may be indicated by a “specification tuple”, a parameter sequence which uniquely determines a single object in the class.
Names for things are often confused with the things which they refer to. Mathematical names are no exception. A specification tuple is often thought of as the definition of the object itself. For example, a pair (G, σ_G) may be defined to be a group if the function σ_G : G × G → G satisfies the axioms of a group. The trivial group with identity 0 would then be the tuple ({0}, {((0, 0), 0)}). However, given only this pair of sets, it would be difficult to guess that it is supposed to be a group. There is something missing in the bare parameter list. One could think of the missing significance as the “essence of a group”.
5.16.3 Remark: A full specification tuple may be inconveniently long. In this case, the tuple may be abbreviated. For example, a topological group might be referred to as a set G with various operations and a topology, whereas a full specification tuple might be (G, T_G, σ_G) where T_G is the topology and σ_G is the group operation. Informal specifications are fine for simple situations, but as structures become progressively more complex, the burden on the reader’s memory and guesswork becomes excessive. It is best to specify the full set of parameters to avoid ambiguity when introducing a new concept.
5.16.4 Remark: Although one thinks of such notations as G and (G, T_G, σ_G) as referring to the same thing, they cannot be equal. A statement such as “G = (G, T_G, σ_G)” is logical nonsense. It is preferable to use an asymmetric notation such as
G −< (G, T_G, σ_G) or (G, T_G, σ_G) >− G,
to indicate that G is an abbreviation for (G, T_G, σ_G). The non-standard chicken-foot symbols “−<” and “>−” are used frequently in this book.
5.16.5 Remark: In a more formal presentation, one might indicate mathematical classes explicitly. For example, the class of all topological principal G-bundles might be denoted as TPFB[G, T_G, σ_G], where G −< (G, T_G, σ_G) is a topological group. (In this case, there is a different class TPFB[G, T_G, σ_G] for each choice of the triple of parameters (G, T_G, σ_G), where G is a group, T_G is a topology on G and σ_G is an algebraic operation on G.)
An individual member of this class might be denoted as TPFB[G, T_G, σ_G](P, T_P, q, B, T_B, A^G_P, µ^P_G), which indicates both the parameters of the class and the parameters of the particular object. The 3 class parameters are (G, T_G, σ_G). The 7 object parameters are (P, T_P, q, B, T_B, A^G_P, µ^P_G). With such a notation, the reader would see at all times the clear division between class and object parameters.
In practice, such an object is denoted as (P, q, B) or (P, T_P, q, B, T_B, A^G_P, µ^P_G), and the class membership is indicated in the informal context. Although this book is not written in such a formal way, an effort has been made to ensure that most class and object specifications could be formalized if required. Most texts freely mix class and object parameters, which can make it difficult to think clearly about what one is doing.
5.16.6 Remark: Suppose LTG(G, X, σ, µ) denotes a left transformation group G with group operation σ : G × G → G, acting on X with action µ : G × X → X. Let RTG(G, X, σ, µ) denote the corresponding right transformation group. Then for any group GP(G, σ), the parameters of LTG(G, G, σ, σ) and RTG(G, G, σ, σ) are identical. Yet they are respectively the left and right transformation groups of G acting on G. So they are different classes of structure with identical parameters. (This ambiguity is also mentioned in Remark 34.8.8.) This shows that there must be something extra which indicates the class of a specification tuple.
Thus, for example, when this book talks about “the group (G, σ)”, what is really meant is “the structure GP(G, σ)”, where the meaning of GP is explained only in non-technical human-to-human language. The meaning of the class of a structure lies outside formal mathematics in the socio-mathematical context. So the reader should have no illusions that the pair (G, σ) is a group. It is just a pair of parameters.
5.16.7 Remark: The idea that mathematical definitions refer to “classes” or “objects” does not derive from the corresponding terminology in computer programming. For example, Bell [190], page 505, published the following in 1937.
A manifold is a class of objects (at least in common mathematics) which is such that any member of the class can be completely specified by assigning to it certain numbers [. . . ]
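The point of Remark 5.16.6, that a bare parameter tuple does not determine the class of structure which it specifies, has a direct analogue in software. The following Python sketch is a loose illustration, not a formalization used in this book. The Structure type and its field names are hypothetical, and the tags "LTG" and "RTG" are simply the class names from Remark 5.16.6 used as labels. The example uses the trivial group ({0}, {((0, 0), 0)}) from Remark 5.16.2 acting on itself.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class Structure:
    """A specification tuple together with an explicit class tag."""
    class_tag: str            # human-assigned meaning, e.g. "GP", "LTG", "RTG"
    params: Tuple[Any, ...]   # the bare parameter sequence (plain sets)


# The trivial group with identity 0: underlying set and operation graph.
G = frozenset({0})
sigma = frozenset({((0, 0), 0)})

# G acting on itself on the left and on the right: identical parameters.
ltg = Structure("LTG", (G, G, sigma, sigma))
rtg = Structure("RTG", (G, G, sigma, sigma))

# The parameter tuples cannot tell the two structures apart ...
same_parameters = ltg.params == rtg.params
# ... but the tagged structures are different objects.
different_structures = ltg != rtg
```

Here same_parameters and different_structures both evaluate to True. The class tag carries exactly the information which is missing from the bare tuple, and its meaning is fixed only by convention outside the data itself.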
Chapter 6 Relations and functions
6.1 Ordered pairs
6.2 Cartesian product of a pair of sets
6.3 Relations
6.4 Equivalence relations and partitions
6.5 Functions
6.6 Function set maps and inverse set maps
6.7 Composition of functions
6.8 Families of sets and functions
6.9 Cartesian products of families of sets and functions
6.10 Partial Cartesian products and identification spaces
6.11 Partially defined functions
6.12 Notations for sets of functions
6.0.1 Remark: Relations and functions may be represented as sets of ordered pairs. In other words, objects which are related to each other may be explicitly and exhaustively listed. In practice, the ordered pairs are too numerous to list. So the ordered pair lists are specified by compact, finite rules of some sort. Therefore one may regard the set-of-ordered-pairs formalism as merely an abstract conceptualization of relations and functions.

Two rules which specify relations (or functions) are considered to be equal if and only if they generate the same set of ordered pairs. But since any such set of ordered pairs may be impossible to list explicitly, the equality of the generated sets is determined by logical analysis of the rules, not by comparing the ordered pair lists. Suppose relations R1 and R2 are defined by rules R1 = {(x, y); P1(x, y)} and R2 = {(x, y); P2(x, y)}, where P1 and P2 are set-theoretic relations (i.e. predicates with two variables). Then

$$R_1 = R_2 \iff \forall x, \forall y, \bigl(P_1(x, y) \Leftrightarrow P_2(x, y)\bigr).$$

In other words, testing equality of the sets R1 and R2 is equivalent to testing the equivalence of the predicates P1 and P2. Although one is expected to have in mind the sets, in practice one works with the predicates in the logical domain. The two ways of formalizing relations may be referred to as “relation-sets” and “relation-predicates”.

The correspondence between relation-sets and relation-predicates breaks down under some circumstances. For example, when one refers to the set of all relations R between real numbers, it is simply not possible to write all such relations as finite rules. Such “dummy variable” relations must be formalized as sets. Another circumstance is the case of Lebesgue non-measurable functions. These relations are chosen with the Axiom of Choice and cannot be specified by any rule.

Relation-sets have the advantage of avoiding issues such as the existence of finite rules to specify the relations. On the other hand, relation-predicates have the advantage that they are well defined even when the domain and range of the relation are not sets. (For example, the set inclusion relation “⊆” is defined for all sets,
but the domain and range are the “set of all sets”, which is not a set! So this relation-predicate has no corresponding relation-set.) These comments apply to sets in general, not just to relations and functions.

6.0.2 Remark: Although functions are represented as particular kinds of sets, they are generally thought of as being dynamic, in contrast to sets which are static. Functions may be thought of as moving or transforming objects, or associating attributes with objects. Sometimes one thinks of functions as having a temporal or causal quality. Since functions may be applied to functions, it often happens that a function acts on static objects, but the function is itself acted upon by other functions. So the same function may sometimes have the active role, sometimes a passive role. Similarly, relations are thought of as more than just sets of ordered pairs. They may be thought of as indicating associations between things. The meanings of sets, relations and functions are clearly not fully captured in the set theory formalism. The meanings are communicated within the socio-mathematical context.

6.0.3 Remark: Figure 6.0.1 shows some of the relations, er... relationships between relations and functions. (The “relations” between classes of mathematical objects are “naive relations” in the realm of “naive mathematics” or “a-priori mathematics” or “metamathematics”. The “relations” which are defined in Section 6.3 as subsets of Cartesian products of sets constitute a well-defined class of sets within set theory.)

[Figure 6.0.1: Relationships between relations and functions. Node labels: Cartesian product, relation, partially defined function, equivalence relation, function, partial order, total order, injective function, surjective function, bijective function.]
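The distinction drawn in Remark 6.0.1 between relation-predicates and relation-sets can be made concrete on a small finite domain. The following Python sketch is only an illustration: the helper `relation_set` and the two rules `same_parity` and `even_sum` are hypothetical names chosen here, and Python tuples stand in for ordered pairs. Exhaustive comparison of the generated sets is only possible because the domain is finite; in general the equivalence of the two rules must be established by logical analysis.

```python
from itertools import product


def relation_set(predicate, domain, codomain):
    """Build the relation-set {(x, y) in domain x codomain; predicate(x, y)}."""
    return {(x, y) for x, y in product(domain, codomain) if predicate(x, y)}


# Two differently worded rules (hypothetical examples).
def same_parity(x, y):
    return x % 2 == y % 2


def even_sum(x, y):
    return (x + y) % 2 == 0


X = range(-5, 6)
R1 = relation_set(same_parity, X, X)
R2 = relation_set(even_sum, X, X)

# The two rules generate the same set of ordered pairs on this domain.
rules_agree_on_X = (R1 == R2)
```

On an infinite domain such as the integers, no such listing is available, which is why the remark stresses working with the predicates rather than the pair lists.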
6.1. Ordered pairs

In order to define relations and functions, it is necessary to first define ordered pairs.

6.1.1 Remark: The ordered pairs {{a}, {a, b}} in Definition 6.1.3 are well defined by ZF Axiom (3). One peculiarity of this representation is the fact that the ordered pair (a, a) is represented as {{a}}. (Clearly one does not normally think of sets of the form {{a}} as ordered pairs. This shows once again that sets on their own mean very little. One must also know the class of objects to which a set belongs in order to know which mathematical object it points to.)

Another peculiarity is that n-tuples $(a_1, a_2, \ldots, a_n)$ for non-negative integers n are represented as functions with values $a_1, a_2, \ldots, a_n$, but functions are represented as sets of ordered pairs. So a 2-tuple is represented as a function, which is a set of ordered pairs, but a 2-tuple is thought of as being the same thing as an ordered pair and is often called such. The circularity of the definitions here is broken by first defining ordered pairs as in Definition 6.1.3, then representing functions in terms of these ordered pairs, and then defining general n-tuples in terms of functions. This is an example of boot-strapping of definitions in mathematics. Ordered pairs are first defined in a “bare-handed” way and then redefined later using more sophisticated machinery.

Yet another peculiarity is the fact that ordered pairs have two different set representations. In practice, as soon as functions have been defined, one may ignore Definition 6.1.3 and use the n-tuple representation instead.

[ See Halmos [159], section 6, for ordered pairs. ]

Although a kind of function-free n-tuple is discussed in Remark 6.1.13, such function-free n-tuples are not used in practical definitions for n > 2.
6.1.2 Remark: One could axiomatically define a “system of pairs of sets”, called Pairs, say, to contain all of the ordered pairs of sets in a ZF set theory Sets, say. The relations between sets and pairs could be defined in terms of a function P : Sets × Sets → Pairs and functions L : Pairs → Sets and R : Pairs → Sets such that L(P (x, y)) = x, R(P (x, y)) = y and P (L(p), R(p)) = p for sets x and y and pairs p. We could call P (x, y) the ordered pair of sets x and y. Then Definition 6.1.3 would be just one possible representation of the system Pairs in terms of sets. So Definition 6.1.3 should not be taken too seriously. The set expression {{a}, {a, b}} is a mere parametrization of the concept of an ordered pair. The concept of the “order” of two things is something that the reader must already know about and use in the understanding of ordered pairs. 6.1.3 Definition: An ordered pair is a set of the form {{a}, {a, b}} for any sets a and b. 6.1.4 Notation: (a, b) denotes the ordered pair {{a}, {a, b}} for any a and b. 6.1.5 Remark: According to Mendelson [164], page 162, Kazimierz (Casimir) Kuratowski discovered the representation of ordered pairs in Definition 6.1.3. 6.1.6 Remark: The set-theoretic functions P , L and R in Remark 6.1.2 for the set-pair representation in Definition 6.1.3 would be P (a, b) = {{a}, {a, b}}, L({{a}, {a, b}}) = a and R({{a}, {a, b}}) = b. (Of course, one cannot really talk about P mapping the cross-product Sets × Sets to Pairs because Sets is not a set and cross products have not been defined yet. So P cannot be thought of as mapping ordered pairs of sets (a, b) ∈ Sets × Sets to Pairs unless naive notions of cross product and ordered pairs are used. So there are at least three layers of ordered pair definitions here: a logic layer, a basic set-pair layer, and a sequence-indexed-by-integers layer. One could dig more deeply for more definitional layers for ordered pairs!) 
6.1.7 Remark: Theorems 6.1.8 and 6.1.9 show that the left and right elements of an ordered pair can be extracted from the pair.

6.1.8 Theorem: The left element of an ordered pair p is the set a defined by ∀x ∈ p, a ∈ x.

Proof: Let p = (a, b) be an ordered pair. Then p = {{a}, {a, b}}. Therefore ∀x ∈ p, a ∈ x. But it must be shown that this uniquely determines the set a. In other words, it must be shown that if ∀x ∈ p, a′ ∈ x, then a′ = a. Suppose a = b. Then a′ ∈ {a}. Therefore a′ = a. Suppose a ≠ b. Then a′ ∈ {a} and a′ ∈ {a, b}. In particular, a′ ∈ {a} again. So a′ = a.

6.1.9 Theorem: The right element of an ordered pair p is the set b defined by ∃′x ∈ p, b ∈ x.

Proof: The proposition ∃′x ∈ p, b′ ∈ x means that ∃x ∈ p, b′ ∈ x and ∀x, y ∈ p, ((b′ ∈ x ∧ b′ ∈ y) ⇒ x = y). (See Notation 4.16.3 for the unique existence notation.) This clearly holds with b′ = b. So it remains to show that any b′ which satisfies ∃′x ∈ p, b′ ∈ x must equal b. Suppose a = b. Then b′ ∈ {a} = {b}. So b′ = b. Suppose a ≠ b. Then either b′ ∈ {a} or b′ ∈ {a, b}. So either b′ = a or b′ = b. Suppose b′ = a. Then x = {a} and y = {a, b} implies that (b′ ∈ x ∧ b′ ∈ y) and x ≠ y, which contradicts the assumption. Hence b′ = b.

6.1.10 Remark: Theorems 6.1.8 and 6.1.9 are unsatisfying. One would ideally like to have simple logical expressions left and right which yield a and b from p as a = left(p) and b = right(p). There are expressions for {a} and {b}. We may write {a} = ⋂p = {z; ∀x ∈ p, z ∈ x} and {b} = {z; ∃′x ∈ p, z ∈ x}. An alternative for {b} in terms of p = (a, b) would be

$$\{b\} = \begin{cases} \bigcup p \setminus \bigcap p & \text{if } \bigcup p \ne \bigcap p \\ \bigcup p & \text{if } \bigcup p = \bigcap p. \end{cases}$$

These are very clumsy constructions, but there seem to be no expressions for a and b themselves at all. In fact, there seems to be no expression to simply construct a from {a}.

One way out of this difficulty would be to invent a form of expression such as “x; x ∈ X” to extract a from X = {a}. But this logical expression only yields a single thing if the set is a singleton. It is desirable to have an operator which yields “the thing inside the set”. The closest one can come to “x = the thing inside X” is “x ∈ X”. Within the context, one must also then claim that ∃′x, x ∈ X.
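The extraction rules of Theorems 6.1.8 and 6.1.9, and the union-minus-intersection expression for {b} in Remark 6.1.10, can be tried out on small examples. The following Python sketch is an illustration only, under the assumption that hashable Python values stand in for sets and that `frozenset` models the set braces. The names `kpair`, `left`, `right` and `right_singleton` are hypothetical helpers, not notation from this book.

```python
def kpair(a, b):
    """Model the ordered pair (a, b) = {{a}, {a, b}} with nested frozensets."""
    return frozenset({frozenset({a}), frozenset({a, b})})


def left(p):
    """Theorem 6.1.8: the unique a which lies in every member x of p."""
    common = frozenset.intersection(*p)
    (a,) = common  # fails loudly if p is not a pair of the expected form
    return a


def right(p):
    """Theorem 6.1.9: the unique b which lies in exactly one member x of p."""
    candidates = {z for x in p for z in x if sum(z in y for y in p) == 1}
    (b,) = candidates
    return b


def right_singleton(p):
    """Remark 6.1.10: {b} is (union of p) minus (intersection of p) when these
    differ, and the union of p otherwise."""
    union = frozenset.union(*p)
    inter = frozenset.intersection(*p)
    return union - inter if union != inter else union


p = kpair(1, 2)
q = kpair(3, 3)  # collapses to {{3}}
```

Note that `left` and `right` rely on Python's tuple unpacking to pick out "the thing inside" a singleton, which is exactly the operation that Remark 6.1.10 observes has no obvious expression inside the set-theoretic language itself.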
6.1.11 Remark: Theorem 6.1.12 states the property of ordered pairs which motivates the definition. Any definition which had the same property would be equally suitable, but Definition 6.1.3 seems to be universally accepted.

6.1.12 Theorem: Ordered pairs satisfy ∀a, ∀b, ∀c, ∀d, ((a, b) = (c, d) ⇔ ((a = c) ∧ (b = d))).

Proof: See Exercise 47.2.12.

6.1.13 Remark: Ordered triples, quadruples, pentuples and higher-order tuples are defined in Section 7.7 (Definition 7.7.3) in terms of functions (which are in turn defined in terms of ordered pairs in Section 6.5). If ordered triples are required, there will be a cyclic loop of definitions here unless ordered triples can be defined without functions. This cycle of definitions can be broken by Definition 6.1.14.

6.1.14 Definition: An ordered triple is a set of the form ((a, b), c) for any three sets a, b and c. This is denoted as (a, b, c). An ordered quadruple is a set of the form ((a, b, c), d) for any four sets a, b, c and d. This is denoted as (a, b, c, d).

6.1.15 Remark: By naive induction, Definition 6.1.14 may be extended to any order of tuple by the rule $(a_1, a_2, \ldots, a_{n+1}) = ((a_1, a_2, \ldots, a_n), a_{n+1})$. (Of course, this doesn’t make much sense if functions and integers haven’t been defined yet.) To make the induction complete, it is convenient to define a 1-tuple as (a) = a for any set a. This inductive rule cannot be used for the definition of a 0-tuple.
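The inductive rule of Remark 6.1.15 and the characteristic property in Theorem 6.1.12 can be sketched in Python. This is an illustrative model only: `kpair` and `ktuple` are hypothetical helper names, `frozenset` stands in for the set braces, and the exhaustive check over three integer atoms is a sanity test, not a proof.

```python
from functools import reduce
from itertools import product


def kpair(a, b):
    """Model the ordered pair (a, b) = {{a}, {a, b}} with nested frozensets."""
    return frozenset({frozenset({a}), frozenset({a, b})})


def ktuple(*items):
    """Function-free tuple: (a1, ..., a_{n+1}) = ((a1, ..., an), a_{n+1}),
    with the convention (a) = a for a 1-tuple."""
    if not items:
        raise ValueError("the inductive rule does not define a 0-tuple")
    return reduce(kpair, items)


# Check (a, b) = (c, d) <=> (a = c and b = d) on a few atoms.
atoms = [0, 1, 2]
pairs_characterised = all(
    (kpair(a, b) == kpair(c, d)) == (a == c and b == d)
    for a, b, c, d in product(atoms, repeat=4)
)

triple = ktuple(1, 2, 3)  # ((1, 2), 3)
```

The left-nested `reduce` mirrors Definition 6.1.14 directly: a quadruple is a pair whose left element is a triple.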
6.2. Cartesian product of a pair of sets

The Cartesian product of a pair of sets may be defined in terms of ordered pairs. The Cartesian product of more than two sets is generally defined in terms of functions and integers, which are not yet defined in this section. Therefore this section is concerned only with the Cartesian product of two sets.

6.2.1 Definition: The Cartesian product of two sets A and B is the set {(a, b); a ∈ A, b ∈ B} of all ordered pairs (a, b) such that a ∈ A and b ∈ B.

6.2.2 Notation: A × B denotes the Cartesian product of sets A and B.

6.2.3 Remark: The Cartesian product of two sets A and B may be visualized as in Figure 6.2.1.

[Figure 6.2.1: Cartesian product of sets. Labels: A, B, A × B, a, b, (a, b).]

6.2.4 Remark: The Cartesian product A × B is well defined because it is a subset of ℙ(ℙ(A ∪ B)). (To see this, note that {a} and {a, b} are elements of ℙ(A ∪ B) so that (a, b) ∈ ℙ(ℙ(A ∪ B)) and A × B ∈ ℙ(ℙ(ℙ(A ∪ B))).) So the existence follows from Axioms (3), (4) and (5). [ Is Axiom (6′) required here? ]
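The containment of A × B in the double power set of A ∪ B can be checked by brute force for very small sets. The following Python sketch assumes that `frozenset` models sets and that ordered pairs are built as {{a}, {a, b}}; the helper names `powerset`, `kpair` and `cartesian` are hypothetical and introduced only for this illustration.

```python
from itertools import chain, combinations


def powerset(s):
    """All subsets of s, returned as a frozenset of frozensets."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return frozenset(frozenset(c) for c in subsets)


def kpair(a, b):
    """Model the ordered pair (a, b) = {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})


def cartesian(A, B):
    """Cartesian product as a set of Kuratowski pairs."""
    return frozenset(kpair(a, b) for a in A for b in B)


A = frozenset({1, 2})
B = frozenset({2, 3})
PP = powerset(powerset(A | B))  # double power set of a 3-element set

# Every pair (a, b) is a set of subsets of A ∪ B, hence a member of PP.
product_is_subset = cartesian(A, B) <= PP
```

The double power set grows very quickly (here it already has 2^(2^3) members), so this check is feasible only for tiny examples.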
6.2.5 Theorem: The Cartesian product has the following properties for sets A, B, C and D.
(i) (A × B = ∅) ⇔ ((A = ∅) ∨ (B = ∅)).
(ii) If A ≠ ∅ and B ≠ ∅, then (A × B ⊆ C × D) ⇔ ((A ⊆ C) ∧ (B ⊆ D)).
(iii) (A × C) ∪ (B × C) = (A ∪ B) × C and (C × A) ∪ (C × B) = C × (A ∪ B).
(iv) (A × C) ∩ (B × C) = (A ∩ B) × C and (C × A) ∩ (C × B) = C × (A ∩ B).
(v) (A × B) ∩ (C × D) = (A ∩ C) × (B ∩ D).
(vi) (A × B) ∪ (C × D) ⊆ (A ∪ C) × (B ∪ D).
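The properties in Theorem 6.2.5 can be spot-checked on concrete finite sets. The Python sketch below is a sanity check on one arbitrary choice of sets, not a proof. It assumes Python tuples as stand-ins for ordered pairs, and the helper name `cart` and the dictionary `checks` are hypothetical.

```python
from itertools import product


def cart(A, B):
    """Cartesian product A × B as a set of Python tuples."""
    return set(product(A, B))


A, B, C, D = {1, 2}, {2, 3}, {1, 2, 5}, {2, 3, 7}
EMPTY = set()

checks = {
    "i": cart(A, EMPTY) == EMPTY and cart(EMPTY, B) == EMPTY and cart(A, B) != EMPTY,
    "ii": (cart(A, B) <= cart(C, D)) == (A <= C and B <= D),
    "iii": (cart(A, C) | cart(B, C)) == cart(A | B, C)
    and (cart(C, A) | cart(C, B)) == cart(C, A | B),
    "iv": (cart(A, C) & cart(B, C)) == cart(A & B, C)
    and (cart(C, A) & cart(C, B)) == cart(C, A & B),
    "v": (cart(A, B) & cart(C, D)) == cart(A & C, B & D),
    "vi": (cart(A, B) | cart(C, D)) <= cart(A | C, B | D),
}
```

Property (vi) is generally a strict inclusion: for instance (5, 3) lies in (A ∪ C) × (B ∪ D) here but in neither A × B nor C × D when D is replaced by a set not containing 3.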
6.3. Relations

[ See Halmos [159], section 7 for more on relations. ]

6.3.1 Remark: The mathematics literature uses the words “relation”, “function”, “domain” and “range” in confusing and inconsistent ways. Definitions 6.3.2 and 6.3.6 attempt to reduce the confusion a little by insisting that the domain and range must be implicit in the specification of a relation. In other words, a relation is defined to be no more than a set of ordered pairs, whereas a popular alternative formalization includes the domain and range explicitly in the definition of a relation.

6.3.2 Definition: A relation is a set of ordered pairs. A relation between a set A and a set B is a subset of A × B. A relation from a set A to a set B is a subset of A × B. A relation in a set A is a subset of A × A. A pair (a, b) is said to satisfy the relation R if (a, b) ∈ R.

6.3.3 Remark: Remark 6.0.1 mentions the difference between a relation-predicate (defined by a logical predicate expression or “rule”) and a relation-set (defined as a set of ordered pairs of related objects). In this terminology, Definition 6.3.2 defines a relation-set. (See Definition 5.1.22 for a half-hearted attempt to define a “set-theoretic formula”, which is how relation-predicates are defined.)

6.3.4 Remark: The definition of a relation R in Definition 6.3.2 may be decomposed into the following conditions.
(i) R is a set;
(ii) ∀z ∈ R, ∃a, ∃b, z = (a, b).
Condition (i) is significant. One often refers to a predicate with two parameters as a “relation”, but this is not at all the same thing. In ZF set theory, such a relation-predicate applies in principle to all sets, and there is no “set of all sets” in ZF set theory. (This is discussed at length in Section 5.7.) Therefore unrestricted relation-predicates are not relations in the sense of Definition 6.3.2. However, a relation-predicate can be converted into a relation-set by restricting the domain and range to a set. For example, neither the equality relation nor the membership relation for ZF sets is a set. However, for any set X, the sets {(x, y) ∈ X × X; x = y} and {(x, y) ∈ X × X; x ∈ y} are well-defined relations in the sense of Definition 6.3.2.

6.3.5 Example: The following are relations for all sets X and Y.
(1) X × Y.
(2) ∅.
(3) {(x, y) ∈ X × Y; x = y}.
(4) {(x, y) ∈ X × Y; x ∈ y}.
(5) {(x, y) ∈ X × ℙ(X); x ∈ y}.
(6) {(x, y) ∈ ℙ(X) × ℙ(Y); x ⊆ y}.

6.3.6 Definition: The domain of a relation R is the set {a; ∃b, (a, b) ∈ R}. The range (or the image or the codomain) of a relation R is the set {b; ∃a, (a, b) ∈ R}.
6.3.7 Notation: Dom(R) denotes the domain of a relation R. Range(R) denotes the range of a relation R. Im(R) is an alternative notation for Range(R) for relations R.

6.3.8 Remark: Since Definition 6.3.2 specifies that a relation is a set, it follows that the domain and range in Definition 6.3.6 are also sets. It is perhaps not immediately clear why this is so. As is very often the case, it is the ZF replacement axiom (Definition 5.1.26 (6)) which saves the situation. The specification axiom (which is weaker) implies that any expression of the form {x ∈ X; P(x)} is a genuine set if X is a set and P is a set-theoretic function. Hence it is sufficient to find suitable X and P to represent the domain and range of a function. In this case, the set X = ⋃⋃R does the job for any relation R.

6.3.9 Theorem: The domain and range of a relation are sets. For any relation R,

$$\mathrm{Dom}(R) = \bigl\{ a \in \textstyle\bigcup\bigcup R;\ \exists b \in \bigcup\bigcup R,\ (a, b) \in R \bigr\}$$

and

$$\mathrm{Range}(R) = \bigl\{ b \in \textstyle\bigcup\bigcup R;\ \exists a \in \bigcup\bigcup R,\ (a, b) \in R \bigr\}.$$

Proof: Let R be a relation. Let a satisfy ∃b, (a, b) ∈ R. By definition of an ordered pair, this means that ∃b, {{a}, {a, b}} ∈ R. It follows that {a} ∈ (a, b) for some b. Let y = (a, b). Then {a} ∈ y and y ∈ R for some y. That is, ∃y, ({a} ∈ y ∧ y ∈ R). By the definition of set union, this is equivalent to {a} ∈ ⋃R. Let z = {a}. Then a ∈ z and z ∈ ⋃R. So by definition of set union again, this implies a ∈ ⋃⋃R. This argument may be summarized as follows.

$$\begin{aligned}
(a, b) \in R &\Leftrightarrow \{\{a\}, \{a, b\}\} \in R \\
&\Rightarrow \exists y, (\{a\} \in y \wedge y \in R) \\
&\Leftrightarrow \{a\} \in \textstyle\bigcup R \\
&\Rightarrow \exists z, (a \in z \wedge z \in \textstyle\bigcup R) \\
&\Leftrightarrow a \in \textstyle\bigcup\bigcup R.
\end{aligned}$$

An almost identical argument shows that (a, b) ∈ R ⇒ b ∈ ⋃⋃R. By a double application of the ZF union axiom (Definition 5.1.26 (4)), ⋃⋃R is a well defined set. It then follows by the replacement axiom (Definition 5.1.26 (6)) that {a ∈ ⋃⋃R; ∃b ∈ ⋃⋃R, (a, b) ∈ R} and {b ∈ ⋃⋃R; ∃a ∈ ⋃⋃R, (a, b) ∈ R} are both well defined sets. But since (a, b) ∈ R implies that a ∈ ⋃⋃R and b ∈ ⋃⋃R, these sets are the same as {a; ∃b, (a, b) ∈ R} and {b; ∃a, (a, b) ∈ R} respectively, which are simply the definitions of Dom(R) and Range(R). It follows that the domain and range of R are sets for any relation R.

6.3.10 Remark: One may regard the steps in the proof of Theorem 6.3.9 as progressively replacing the set-braces “{” and “}” with the set-union operator “⋃”. The two levels of nesting of a and b in set-braces are replaced by two levels of set-union operators.

6.3.11 Theorem: R ⊆ ⋃⋃R × ⋃⋃R for any relation R.

Proof: The assertion follows immediately from Theorem 6.3.9.

6.3.12 Remark: The word “the” is part of the defined terms in Definition 6.3.6 because generally any set X such that ∀z ∈ R, ∃a ∈ X, ∃b, z = (a, b) may be referred to as “a domain for R”. Similarly, any set Y such that ∀z ∈ R, ∃a, ∃b ∈ Y, z = (a, b) may be referred to as “a range for R”. In other words, any set X with Dom(R) ⊆ X may be called “a domain for R”, and any set Y with Range(R) ⊆ Y may be called “a range for R”. From this terminology, one may say that R is a relation from A to B if and only if R is a relation, and A is a domain for R, and B is a range for R.

6.3.13 Definition: The image of a set A by a relation R is the set {b; ∃a ∈ A, (a, b) ∈ R}. The pre-image (or inverse image) of a set B by a relation R is the set {a; ∃b ∈ B, (a, b) ∈ R}.
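The role of the double union ⋃⋃R in Theorems 6.3.9 and 6.3.11 can be seen on a small example. In the Python sketch below, `frozenset` models sets, ordered pairs are built as {{a}, {a, b}}, and the names `kpair`, `big_union`, `UUR`, `dom` and `rng` are hypothetical helpers chosen for this illustration.

```python
def kpair(a, b):
    """Model the ordered pair (a, b) = {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})


def big_union(s):
    """Union of all members of s (the ZF union of the set s)."""
    return frozenset().union(*s)


R = frozenset({kpair(1, "x"), kpair(2, "y"), kpair(2, "z")})

# Two applications of the union operator strip the two levels of braces.
UUR = big_union(big_union(R))

# Domain and range obtained by specification from the set UUR (Theorem 6.3.9).
dom = frozenset(a for a in UUR if any(kpair(a, b) in R for b in UUR))
rng = frozenset(b for b in UUR if any(kpair(a, b) in R for a in UUR))

# Theorem 6.3.11: R is contained in UUR × UUR.
square = frozenset(kpair(a, b) for a in UUR for b in UUR)
r_in_square = R <= square
```

Here ⋃⋃R mixes the left and right elements together, which is why the specification step is still needed to separate Dom(R) from Range(R).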
6.3.14 Definition: The source set of a “relation from a set A to a set B” is the set A. The target set of a “relation from a set A to a set B” is the set B. 6.3.15 Remark: The terms “source set” and “target set” in Definition 6.3.14 are probably non-standard. These sets are often referred to colloquially as the “domain” and “range” of the relation. If a relation is defined to be merely a set R of ordered pairs, as opposed to the tuple (R, A, B), then the sets A and B are not properties of the set R. These sets are merely part of the context in which the relation is discussed. Thus Dom(R) and Range(R) are well defined, whereas Source(R) and Target(R) are not well defined. Nevertheless it is very often useful to be able to refer to the source and target sets. Therefore they need names which are different to “domain” and “range”. In view of the above comments, it should be noted that Definition 6.3.14 is in fact a pseudo-definition. These sets are properties of the meta-language used in the discussion of a relation, not properties of the relation itself. (Such pseudo-definitions are fairly rare in careful mathematics. An example of a pseudo-notation is the commonly used “M n ” for an n-dimensional manifold. The careful reader will notice such pseudo-notations and pseudo-definitions when they occur, but hopefully not too many in this book!) Definition 6.3.14 may be converted into a valid definition if the word “the” is replaced by “a” as in Definition 6.3.16. 6.3.16 Definition: A source set for a relation R is any set A such that A ⊇ Dom(R). A target set for a relation R is any set B such that B ⊇ Range(R). 6.3.17 Remark: It is clear from Definitions 6.3.2 and 6.3.6 that if R is a relation between sets A and B, then Dom(R) ⊆ A and Range(R) ⊆ B. The word “range” is used in the mathematics literature for two different concepts. 
Sometimes the range of a relation R means the set B (in the case of a relation between sets A and B), but equally often it means the set {b; ∃a, (a, b) ∈ R}. It is the latter definition which is adopted here. This choice of definition is influenced by the meaning of the English-language word “range”, but it also agrees with Halmos [159], page 27.
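The difference between the domain and range of a relation (Definition 6.3.6) and a source or target set for it (Definition 6.3.16) can be illustrated as follows. This Python sketch uses tuples for ordered pairs; all function names are hypothetical and serve only this example.

```python
def dom(R):
    """Dom(R) = {a; exists b, (a, b) in R}."""
    return {a for (a, b) in R}


def rng(R):
    """Range(R) = {b; exists a, (a, b) in R}."""
    return {b for (a, b) in R}


def image(R, A):
    """Image of the set A by R (Definition 6.3.13)."""
    return {b for (a, b) in R if a in A}


def preimage(R, B):
    """Pre-image of the set B by R (Definition 6.3.13)."""
    return {a for (a, b) in R if b in B}


def is_relation_from(R, A, B):
    """A is a source set and B is a target set for R (Definition 6.3.16)."""
    return dom(R) <= A and rng(R) <= B


R = {(1, "x"), (2, "y"), (2, "z")}

# The same set R is a relation from many different source/target pairs.
tight = is_relation_from(R, {1, 2}, {"x", "y", "z"})
loose = is_relation_from(R, {1, 2, 3}, {"w", "x", "y", "z"})
```

Since both `tight` and `loose` hold for the same set R, the source and target sets cannot be recovered from R alone, whereas `dom(R)` and `rng(R)` can.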
6.3.18 Remark: The concept of a relation-set should be distinguished from a relation-predicate or “set-theoretic relation”, which is a logic concept. A set-theoretic relation is a symbol in mathematical logic which is built out of language primitives whereas a relation-set is a particular kind of set within set theory. There is a third concept of “relation” which is the natural language meaning of the word. (To avoid confusion, the word “relationship” could be used for the natural language meaning.) Unfortunately, all three concepts are combined in mathematical writing. The reader must infer from context which of the three meanings is intended.

[ Remark 6.3.18 overlaps a lot with Remark 6.3.4. ]

6.3.19 Remark: A relation-set is not usually thought of as just a set of ordered pairs. Generally a relation will be introduced into a mathematical discussion as a “relation between (sets) A and B” or a “relation from (set) A to (set) B”. A relation is thought of as associating objects with each other. Thus the ordered pair (a, b) associates object a with object b. A relation, being a set of ordered pairs, associates any number of objects in this way. Generally the writer or speaker will have in mind specific sets A and B between which the relation defines associations. Therefore it is common to define a relation as a subset of a Cartesian product A × B. However, it is, strictly speaking, unnecessary to specify the source and target sets A and B respectively. The fact that a relation R is required in Definition 6.3.2 to be a set guarantees that Dom(R) and Range(R) are both sets. From this it follows that R ⊆ A × B if A and B are chosen as A = Dom(R) and B = Range(R).

6.3.20 Remark: In some formalisms, a relation is defined as a triple of sets (R, A, B), where A and B are sets and R ⊆ A × B. This is then generally abbreviated to R. Thus R −< (R, A, B), using the “chicken-foot” notation for abbreviations introduced in Section 5.16. In such a formalism, the reader must decide according to context whether R means the full tuple (R, A, B) or just the set of ordered pairs R ⊆ A × B. The tuple formalism (R, A, B) has the advantage of communicating the intended context to the reader, but it is often clumsy and confusing. The contextual sets A and B are probably best communicated in the surrounding text.
6.3.21 Notation: a R b for a relation R means that (a, b) ∈ R.

6.3.22 Remark: The infix notation a R b is abstracted from the well-known notations for relations such as a = b, a ≠ b, a ∈ b, a < b, a > b, a ≤ b, a ≥ b, a ⊆ b, a ⊇ b, a ≡ b, a ∼ b, a ≃ b, a ≈ b and a ≅ b.

6.3.23 Definition: The composition or composite of relations R1 and R2 is the set {(a, b); ∃c, ((a, c) ∈ R1 ∧ (c, b) ∈ R2)}.

6.3.24 Notation: R2 ◦ R1 denotes the composition of two relations R1 and R2.

6.3.25 Remark: The composite of two relations is a relation. This is almost obvious because a relation is defined as a set of ordered pairs. The only non-obvious assertion is that the composite of two relations is a set. The fact that someone writes down something of the form X = {x; P(x)} does not necessarily imply that X is a set in Zermelo-Fraenkel set theory. Theorem 6.3.26 verifies that the composition of relations yields a set.

6.3.26 Theorem: The composite R2 ◦ R1 of two relations R1 and R2 is a relation which satisfies R2 ◦ R1 ⊆ Dom(R1) × Range(R2).

Proof: Let R1 and R2 be relations. By Theorem 6.3.9, Dom(Ri) and Range(Ri) are sets for i = 1, 2. Thus Ri ⊆ Dom(Ri) × Range(Ri) for i = 1, 2. Let R = R2 ◦ R1 denote the composition of R1 and R2. Let (a, b) ∈ R. Then (a, c) ∈ R1 and (c, b) ∈ R2 for some c. Therefore a ∈ Dom(R1) and b ∈ Range(R2). Hence R ⊆ Dom(R1) × Range(R2).

6.3.27 Definition: The inverse of a relation R between two sets A and B is the relation between B and A specified as the set {(b, a); (a, b) ∈ R} of reversed pairs of R.

6.3.28 Notation: R⁻¹ denotes the inverse of a relation R.

6.3.29 Definition: A reflexive relation in a set X is a relation R in X such that ∀a ∈ X, (a, a) ∈ R. A symmetric relation is a relation R such that ∀a, ∀b, ((a, b) ∈ R ⇒ (b, a) ∈ R). A transitive relation is a relation R such that ∀a, ∀b, ∀c, (((a, b) ∈ R ∧ (b, c) ∈ R) ⇒ (a, c) ∈ R).

[ Definition 6.3.29 should be extended to give a comprehensive list of names of properties of relations. See for example EDM2 [34], 311. Possibly define these properties before this point instead. ]

6.3.30 Theorem:
(i) The inverse (R⁻¹)⁻¹ of the inverse R⁻¹ of a relation R satisfies (R⁻¹)⁻¹ = R.
(ii) The composite R⁻¹ ◦ R of a relation R with its inverse R⁻¹ is a symmetric relation.

Proof: To prove part (ii), let R be a relation. Let S denote the composite R⁻¹ ◦ R of R with its inverse. Suppose (x1, x2) ∈ S = R⁻¹ ◦ R. Then for some y, (x1, y) ∈ R and (y, x2) ∈ R⁻¹. So (x2, y) ∈ R by the definition of R⁻¹. Similarly, (y, x1) ∈ R⁻¹. Hence (x2, x1) ∈ S by the definition of the composite. Therefore S is symmetric.

6.3.31 Definition: An injective relation is a relation R which satisfies ∀x1, ∀x2, ∀y, (((x1, y) ∈ R ∧ (x2, y) ∈ R) ⇒ x1 = x2).

6.3.32 Theorem: The composite of any two injective relations is an injective relation.

Proof: Let R1 and R2 be injective relations. By Definition 6.3.23, the composite of R1 and R2 is the relation R = {(a, b); ∃c, ((a, c) ∈ R1 ∧ (c, b) ∈ R2)}. Suppose (a1, b) ∈ R and (a2, b) ∈ R. Then for some c1 and c2, (a1, c1), (a2, c2) ∈ R1 and (c1, b), (c2, b) ∈ R2. Since R2 is an injective relation, c1 = c2. So a1 = a2 because R1 is an injective relation. Hence R is an injective relation.

6.3.33 Definition: The (domain) restriction of a relation R to a set A is the relation {(x, y) ∈ R; x ∈ A}. The range restriction of a relation R to a set B is the relation {(x, y) ∈ R; y ∈ B}.
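Composition, inversion and restriction of relations, together with Theorems 6.3.26, 6.3.30 and 6.3.32, can be tried out on finite relations. The following Python sketch uses tuples for ordered pairs. The function names and the sample relations `R1`, `R2`, `J1` and `J2` are hypothetical, and the checks are illustrations rather than proofs.

```python
def compose(R2, R1):
    """R2 ◦ R1 = {(a, b); exists c, (a, c) in R1 and (c, b) in R2}."""
    return {(a, b) for (a, c) in R1 for (c2, b) in R2 if c == c2}


def inverse(R):
    """The set of reversed pairs of R."""
    return {(b, a) for (a, b) in R}


def is_symmetric(R):
    return all((b, a) in R for (a, b) in R)


def is_injective(R):
    """(x1, y) in R and (x2, y) in R implies x1 = x2."""
    return all(x1 == x2 for (x1, y1) in R for (x2, y2) in R if y1 == y2)


def restrict(R, A):
    """Domain restriction of R to the set A."""
    return {(x, y) for (x, y) in R if x in A}


R1 = {(1, "a"), (2, "b"), (3, "b")}
R2 = {("a", "X"), ("b", "Y")}
S = compose(inverse(R1), R1)  # symmetric by Theorem 6.3.30 (ii)

# Two injective relations (J2 is injective although it is not a function).
J1 = {(1, "a"), (2, "b")}
J2 = {("a", "X"), ("b", "Y"), ("b", "Z")}
```

Note the argument order of `compose`, which follows Notation 6.3.24: the relation applied first is written on the right.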
6.4. Equivalence relations and partitions 6.4.1 Definition: An equivalence relation in a set X is a relation R in X such that (i) ∀x ∈ X, x R x; (reflexivity)
(ii) ∀x, y ∈ X, (x R y ⇒ y R x); (symmetry)
(iii) ∀x, y, z ∈ X, ((x R y and y R z) ⇒ x R z). (transitivity) 6.4.2 Definition: A partition of a set X is a set S ⊆ IP(X) such that S (i) S = X. (ii) ∀A, B ∈ S, (A = B or A ∩ B = ∅).
A set X is said to equal the disjoint union of S if S is a partition of X. 6.4.3 Remark: There is a slight abuse of language in the termS “disjoint union”. This term creates the impression that there are two kinds of union: and ordinary union S and a disjoint union of S, which many S people denote as ˙ S. However, the phrase “X is the disjoint union of S” means “X is the union of S, and S is a pairwise disjoint collection of sets”. The phrase “disjoint union of S” means “union of S, which happens to be a pairwise disjoint collection”. If the collection S doesn’t happen to be a pairwise disjoint union, then “the disjoint union of S” doesn’t have any meaning. [ Here present the relationship between a partition and an equivalence relation. ] 6.4.4 Definition: The quotient set of a set X with respect to an equivalence relation R is the set X/R defined as the set of equivalence classes of X with respect to R. . .
6.4.5 Remark: The set X/R of equivalence classes of a set X with respect to a relation R may also be called an identification set or identification space. The term “identification space” implies the inheritance by X/R of some structure on the set X, such as a topology or linear space structure. Another name for the quotient set X/R is the classification of X by R. It can also be called simply the partition of X by R.
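The link between Definitions 6.4.1, 6.4.2 and 6.4.4 can be checked on a small finite example. The following Python sketch is only an illustration: the helper names `quotient_set` and `is_partition` are hypothetical, and the relation is assumed to be given as a two-argument predicate which really is an equivalence relation on a finite set.

```python
def quotient_set(X, related):
    """Return X/R as a set of frozensets; related(x, y) decides x R y.

    Assumes that `related` is an equivalence relation on the finite set X.
    """
    return {frozenset(y for y in X if related(x, y)) for x in X}


def is_partition(S, X):
    """Check conditions (i) and (ii) of Definition 6.4.2 for a finite S."""
    covers = set().union(*S) == set(X)                        # (i) union of S is X
    pairwise = all(A == B or not (A & B) for A in S for B in S)  # (ii)
    return covers and pairwise


X = set(range(10))
same_residue = lambda x, y: (x - y) % 3 == 0   # congruence modulo 3
classes = quotient_set(X, same_residue)
print(sorted(sorted(c) for c in classes))
print(is_partition(classes, X))
```

In this example the quotient set consists of the three residue classes modulo 3, and it passes the partition test.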
6.5. Functions
6.5.1 Remark: The word “function” refers to two kinds of mathematical entity:
(i) a rule-based “set-theoretic function” (specified as a “set-theoretic formula” as in Definition 5.1.22);
(ii) a special kind of relation-set (as defined in Definition 6.3.2).
A rule-based (“set-theoretic”) function may be thought of as a procedure or a sequence of operations in the logic layer which yields a new set from a given set. The domain of a set-theoretic function is not necessarily a set. As an example, the operation of constructing the union X ∪ {∅} from any given set X is clearly a well-defined operation on all sets. But the “set of all sets” is not a set. This kind of logic-layer set-theoretic function is not the subject of this section. A reasonable name for the kind of function in part (i) would be a “function-predicate” by analogy with the “relation-predicate” alluded to in Remark 6.3.4. Then a reasonable name for the kind of function in part (ii) would be a “function-set” by analogy with the corresponding “relation-set”.
6.5.2 Remark: Functions are generally represented in set theory as particular kinds of sets, but they are usually thought of as being a separate class of object – something like a machine which produces outputs for given inputs. The representation of functions as sets is an economical measure which keeps the number of object classes low. (Conceptual economy is discussed by Halmos [159], section 6. Dissatisfaction with the passive definition of a function as a set is discussed by Halmos [159], section 8.)
[ For the quotient set, see Halmos [159], page 28. ]
6.5.3 Remark: It may be that the feeling which mathematicians have that a function is different to a set is due to the historical origins of functions. In the olden days, for instance, the square of an integer x was defined by a procedure of multiplication of x by itself, which was an active process of generating one number from another. But the set definition of the “square function” is more like a look-up table in computing. The set definition is a set of ordered pairs (x, x²). So to calculate the square of a number with the set definition, you look up the value in the set of ordered pairs. In a more active definition of functions, you would specify an algorithm or procedure. It seems to have been necessary historically to abandon functions defined as procedures in favour of functions defined as look-up tables in order to remove an arbitrary limitation on the set of functions that one can discuss. The down side of this has been the loss of ‘active mood’ in the definition of functions.
6.5.4 Remark: The nouns “function” and “map” are used synonymously in this book. In many contexts, the word “map” indicates a function between sets which are peers in some sense (such as differentiable manifolds), whereas “function” is then used to indicate a more light-weight function such as a real-valued function. The word “mapping” is synonymous with “map”, and a “family” is really the same thing as a general function except that it is thought about differently. All of these synonyms for “function” are useful for putting the focus on different aspects of functions.
6.5.6 Definition: A function is a relation f such that
(i) ∀x, ∀y₁, ∀y₂, (((x, y₁) ∈ f ∧ (x, y₂) ∈ f) ⇒ y₁ = y₂).
A function on X, for any set X, is a function f such that
(ii) Dom(f) = X.
A function from X to Y, for any sets X and Y, is a function f on X such that
(iii) Range(f) ⊆ Y.
6.5.7 Notation: f : X → Y means that f is a function from X to Y.
6.5.8 Remark: A relation f ⊆ X × Y is a function f : X → Y if and only if ∀x ∈ X, ∃′y ∈ Y, (x, y) ∈ f. This can be expressed in terms of set cardinality as ∀x ∈ X, #{y ∈ Y; (x, y) ∈ f} = 1. In other words, there is one and only one y for each x ∈ X such that (x, y) ∈ f. In terms of the set map f̄ in Definition 6.6.1, the condition for a relation f to be a function f : X → Y may be written as ∀x ∈ X, #(f̄({x})) = 1.
6.5.9 Remark: The domain, range and image of a function are defined exactly as for relations in Definition 6.3.6. The notations Dom(f), Range(f) and Im(f) are defined for functions exactly as for relations in Notation 6.3.7.
6.5.10 Remark: The range of a function f from X to Y is sometimes defined to be the set Y rather than the set {y ∈ Y; ∃x ∈ X, (x, y) ∈ f} ⊆ Y, which is not generally the same set. The term “image”, however, is always the set {y ∈ Y; ∃x ∈ X, (x, y) ∈ f} of values of f. In this book, both the range and image of a function are understood to be the set of values {y ∈ Y; ∃x ∈ X, (x, y) ∈ f} of the function f as in Definition 6.3.6. The term “image” should be preferred because it is less ambiguous. A practical difficulty with the word “image” is the fact that the abbreviation Im clashes with the abbreviation for the imaginary part of a complex number.
6.5.5 Remark: Whereas a relation is introduced in Definition 6.3.2 as a set of ordered pairs without any specification of an explicit domain or range set, most introductory texts do explicitly define a function in terms of a specified domain set and range set. But functions, like relations, do not need the domain or range to be specified in advance. The domain can always be determined from the set of ordered pairs in a function. Similarly, any set which contains all values of the function may be considered to be a range set for it. Therefore Definition 6.5.6 introduces three levels of specification of a function. When neither the domain nor the range is specified, condition (i) requires only that the function must have a unique value wherever it has a value at all. When the domain is specified, it is required by condition (ii) to be equal to the set of arguments for which the relation does have a value. When both the domain and range sets are specified, condition (iii) only requires that the specified range set Y should contain all of the values of the relation f. It does not need to equal the set of values of the relation.
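The three levels of specification in Definition 6.5.6 may be illustrated for finite relations. The following Python sketch assumes that a relation is represented as a set of ordered pairs (tuples); the helper names `dom`, `rng`, `is_function`, `is_function_on` and `is_function_from` are hypothetical and are introduced only for this illustration.

```python
def dom(f):
    """Domain of a relation given as a set of pairs."""
    return {x for (x, _) in f}


def rng(f):
    """Range (image) of a relation given as a set of pairs."""
    return {y for (_, y) in f}


def is_function(f):
    # Condition (i): no argument has two different values.
    return all(y1 == y2 for (x1, y1) in f for (x2, y2) in f if x1 == x2)


def is_function_on(f, X):
    # Conditions (i) and (ii): the domain equals X exactly.
    return is_function(f) and dom(f) == set(X)


def is_function_from(f, X, Y):
    # Conditions (i), (ii) and (iii): Y only needs to contain the range.
    return is_function_on(f, X) and rng(f) <= set(Y)


square = {(x, x * x) for x in range(-2, 3)}   # look-up table for squaring
print(is_function(square))
print(is_function_from(square, range(-2, 3), range(0, 5)))
print(is_function_from(square, range(-2, 3), range(0, 100)))
```

Both target sets are accepted for the same set of pairs, which reflects the arbitrariness of the range set Y in condition (iii).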
For maximum clarity one should use the term “target set” for the set Y in the phrase “function from X to Y”, and “image” for the set of values of f. The arbitrariness of the target set is a nuisance which is accepted for good reasons.
6.5.11 Definition: An argument of a function f is any element of the domain of f. A value of a function f is any element of the range of f. The value of a function f for an argument x of f is the value y of f such that (x, y) ∈ f.
6.5.12 Notation: f(x) denotes the value of a function f for an argument x ∈ Dom(f).
f_x denotes the value of a function f for an argument x ∈ Dom(f).
(f_i)_{i∈X} is an alternative notation for a function f with domain X.
6.5.13 Notation: Y^X, for sets X and Y, denotes the set of functions from X to Y.
6.5.14 Remark: The set Y^X is the set of functions f such that Dom(f) = X and Range(f) ⊆ Y.
6.5.15 Notation: {f : X → Y; P(f)} for sets X and Y and set-theoretic formula P means the set {f ∈ Y^X; P(f)}.
6.5.16 Remark: If ⊤ denotes the always-true set-theoretic formula with zero arguments (as in Notation 4.12.10), then the set Y^X may be written in terms of Notation 6.5.15 as {f : X → Y; ⊤}. But then the dummy variable f does not appear in the expression P(f) = ⊤. So it seems superfluous to write the letter f at all. (One could make use here of the single-parameter always-true logical predicate which is alluded to in Remark 4.12.9.) The set Y^X is sometimes written as {f : X → Y}, which also contains the superfluous dummy variable f. In this book, the notation “X → Y” is proposed as an equivalent for Y^X. (See Notation 6.12.2.) This notation has the advantage that it avoids the superfluous dummy variable, but it is slightly non-standard. One place where the author has found this sort of notation is in a computer software user manual for the Isabelle/HOL “proof assistant for higher-order logic” [165], page 5.
6.5.18 Remark: If f is the empty function f = ∅ then Dom(f) = ∅ and Range(f) = ∅. It follows that f is a function from X to Y if and only if X = ∅. In other words, the target set is arbitrary.
6.5.19 Definition: The identity function on a set X is the function f : X → X with ∀x ∈ X, f(x) = x.
6.5.20 Remark: The identity function on any set X is clearly the same thing as {(x, x); x ∈ X}.
6.5.21 Notation: id_X for any set X denotes the identity function on X.
6.5.22 Remark: Since the identity function id_X is parametrized by a set which is defined in the context where it is used, it is really a kind of “meta-function” or “function template”.
6.5.23 Definition: A function f : X → Y is injective (or an injection) if ∀x₁, x₂ ∈ X, (f(x₁) = f(x₂) ⇒ x₁ = x₂). An injective function is also said to be one-to-one or 1–1.
A function f : X → Y is surjective (or a surjection) if ∀y ∈ Y, ∃x ∈ X, f(x) = y. A surjective function is also said to be onto.
A function f : X → Y is bijective (or a bijection) if it is injective and surjective.
6.5.24 Remark: Whereas the injective property is independent of the choice of range set Y, the surjective property depends completely on the set Y. In fact, since a function f is specified as a set of ordered pairs, it is possible to determine the domain of f as the set Dom(f) = {x; ∃y, (x, y) ∈ f}, but it is not possible to determine the range set Y (the target set) of f from only the ordered pairs of f. That is, the range set of a function is not an attribute of the function as usually specified. One can only determine that Y ⊇ Range(f) = {y; ∃x, (x, y) ∈ f}. Then a function f : X → Y is said to be surjective if Y = Range(f), which means that surjectivity is a relation between a function f and a given set Y, not an attribute of the function as injectivity is. It follows that the target space Y of a function f : X → Y must always be stated when asserting that a function is surjective or onto. It is best to say explicitly something like “f is onto Y” or “f is surjective to Y”.
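The asymmetry described in Remark 6.5.24 is easy to see in a small computation. In the following Python sketch, a function is assumed to be a finite set of ordered pairs, and the names `is_injective` and `is_surjective_onto` are hypothetical helpers. Injectivity is decided from the pairs alone, whereas the surjectivity test requires the target set as an extra argument.

```python
def is_injective(f):
    """Injectivity needs only the set of ordered pairs."""
    return all(x1 == x2 for (x1, y1) in f for (x2, y2) in f if y1 == y2)


def is_surjective_onto(f, Y):
    """Surjectivity is a relation between f and a stated target set Y."""
    return {y for (_, y) in f} == set(Y)


f = {(0, 'a'), (1, 'b'), (2, 'c')}
print(is_injective(f))
print(is_surjective_onto(f, {'a', 'b', 'c'}))
print(is_surjective_onto(f, {'a', 'b', 'c', 'd'}))
```

The same set of pairs is onto {a, b, c} but not onto {a, b, c, d}, so the question “is f surjective?” has no answer until the target set is stated.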
6.5.17 Definition: The empty function is the set ∅.
6.5.25 Theorem: A function f : X → Y is a bijection if and only if the inverse relation f⁻¹ : Y → X is a function.
6.5.26 Remark: It is not necessary to define “inverse function” because it is the same as the “inverse relation” if the “inverse function” is well defined. The inverse of a function is always well defined, but the inverse might not be a function. Thus “the inverse relation f⁻¹” is well defined for any function (or relation) f, but “the inverse function f⁻¹” is not well defined unless f is a bijection. It is important to distinguish between “the inverse of a function f” (which is always defined) and “the inverse function f⁻¹” (which is only defined if f is a bijection).
6.5.27 Definition: A restriction of a function f is any function g such that g ⊆ f.
The restriction to a set A of a function f is the function {(x, y) ∈ f; x ∈ A} = f ∩ (A × Range(f)).
6.5.28 Notation: f|_A for a function f and set A denotes the restriction of f to A.
6.5.29 Remark: For expressions F, the notation F(x)|_{x=a} means F(a). This is useful for substituting a value into a complicated expression as in this example:
∂/∂x^j (ψ_β^i ◦ ψ_α⁻¹(x)) |_{x=ψ_α(p)}.
6.5.30 Remark: It is not necessary for the set A in Definition 6.5.27 to be a subset of the domain X of f. A function g is a restriction of a function f if and only if it is the restriction of f to some set A. Suppose g is a restriction of f. Then g ⊆ f; so g = f|_{Dom(g)}. Conversely, if g = f|_A, then clearly g ⊆ f by the definition of f|_A. Definition 6.5.31 defines a function extension so that g is an extension of f if and only if f is a restriction of g.
6.5.31 Definition: An extension of a function f is any function g such that f ⊆ g.
6.6. Function set maps and inverse set maps
6.6.1 Definition: The set map corresponding to a function f : X → Y is the function f̄ : IP(X) → IP(Y) defined by f̄(A) = {f(x); x ∈ A} for all A ⊆ X.
The inverse set map corresponding to a function f : X → Y is the function f̄⁻¹ : IP(Y) → IP(X) defined by f̄⁻¹(B) = {x ∈ X; f(x) ∈ B} for all B ⊆ Y.
6.6.2 Remark: The set map f̄ in Definition 6.6.1 is illustrated in Figure 6.6.1.
[ Figure 6.6.1: Set map between power sets. The function f : X → Y maps x to f(x); the set map f̄ : IP(X) → IP(Y) maps A to f̄(A) = {f(x); x ∈ A}. ]
This is not the same thing as function restriction. The notation F(x)|_{x=a} denotes substitution of a value a, which gives the unique value of the expression F when it is restricted to the set {a} ⊆ Dom(F). Clearly F(a) is not exactly the same thing as F|_{a}.
The set maps f̄ and f̄⁻¹ in Definition 6.6.1 are usually denoted simply as f and f⁻¹. This notation re-use, although economical, leads to actual contradictions when the domain or range of f has elements which are contained in other elements. Thus for instance, if f : X → Y with ∅ ∈ X, then f(∅) may refer to the original function f or the corresponding set map f̄. If this seems to be a problem for pathological sets X and Y only, consider X = ω, the set of finite ordinal numbers. In this case, every element of ω is also a subset of ω, which makes it absolutely essential to distinguish between a function f : X → Y and its corresponding set map f̄.
Despite the real danger, most authors use such ambiguous notation. For clarity, as in Theorems 6.6.3 and 6.6.4, different notation may be used for set maps. This kind of difficulty could be resolved by placing a tag on each element of X and IP(X) to distinguish their set membership. But this is not the way mathematics is currently done.
6.6.3 Theorem: Let f : X → Y be a function, and let f̄ : IP(X) → IP(Y) denote the set map corresponding to f. Then the following statements are true for any sets A, B ∈ IP(X).
(i) f̄(∅) = ∅ and f̄(X) ⊆ Y.
(ii) A ⊆ B ⇒ f̄(A) ⊆ f̄(B).
(iii) f̄(A ∪ B) = f̄(A) ∪ f̄(B).
(iv) f̄(A ∩ B) ⊆ f̄(A) ∩ f̄(B).
(v) f̄(X \ A) ⊇ f̄(X) \ f̄(A).
[ For properties of set maps and inverse set maps, see EDM2 [34], 381.C. ]
6.6.4 Theorem: Let f : X → Y be a function, and let f̄⁻¹ : IP(Y) → IP(X) denote the inverse set map corresponding to f. Then the following statements are true for any sets A, B ∈ IP(Y).
(i) f̄⁻¹(∅) = ∅ and f̄⁻¹(Y) = X.
(ii) A ⊆ B ⇒ f̄⁻¹(A) ⊆ f̄⁻¹(B).
(v) f̄⁻¹(Y \ A) = f̄⁻¹(Y) \ f̄⁻¹(A).
And so forth . . .
If f is surjective, then (ii) has the stronger form
(ii′) A ⊆ B ⇔ f̄⁻¹(A) ⊆ f̄⁻¹(B).
Therefore A = B ⇔ f̄⁻¹(A) = f̄⁻¹(B).
[ Try to think up some statements here for when f is a bijection. ]
Proof: Part (ii) is elementary. To show (ii′), suppose that f̄⁻¹(A) ⊆ f̄⁻¹(B) and let y ∈ A. Then f(x) = y for some x ∈ X because f is surjective. So x ∈ f̄⁻¹(A). Therefore x ∈ f̄⁻¹(B) and so y = f(x) ∈ B. ...
6.6.5 Remark: The reason for the simplicity of Theorem 6.6.4 relative to Theorem 6.6.3 is the fact that the inverse of a function is automatically one-to-one and onto even though the inverse may not be a function.
6.6.6 Theorem: Let f : X → Y and let f̄ and f̄⁻¹ be as in Theorems 6.6.3 and 6.6.4. Then:
(i) ∀S ∈ IP(Y), f̄(f̄⁻¹(S)) = S ∩ f̄(X) ⊆ S, with equality f̄(f̄⁻¹(S)) = S for all S ∈ IP(Y) if and only if f is surjective onto Y.
(ii) ∀S ∈ IP(X), f̄⁻¹(f̄(S)) ⊇ S.
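The behaviour of the set map and the inverse set map can be explored on a small example. The following Python sketch assumes that a function is stored as a dictionary, so that its domain X is the set of keys; the names `set_map` and `inverse_set_map` are hypothetical stand-ins for f̄ and f̄⁻¹. The example function is chosen to be neither injective nor surjective onto Y, so that the inclusions in the theorems above are visibly strict.

```python
def set_map(f, A):
    """The set {f(x); x in A} for a function f stored as a dict."""
    return {f[x] for x in A}


def inverse_set_map(f, B):
    """The set {x in Dom(f); f(x) in B}."""
    return {x for x in f if f[x] in B}


f = {0: 'a', 1: 'a', 2: 'b'}          # neither injective nor onto Y
X, Y = {0, 1, 2}, {'a', 'b', 'c'}
A, B = {0, 2}, {1, 2}

# The image of an intersection may be strictly smaller than the
# intersection of the images.
print(set_map(f, A & B), set_map(f, A) & set_map(f, B))

# The inverse set map commutes with intersection.
S, T = {'a', 'c'}, {'a', 'b'}
print(inverse_set_map(f, S & T) == inverse_set_map(f, S) & inverse_set_map(f, T))

# Round trips: only inclusions hold in general.
print(set_map(f, inverse_set_map(f, S)))      # S intersected with the image of X
print(inverse_set_map(f, set_map(f, {0})))    # a superset of {0}
```

Making f injective would turn the first inclusion into an equality, and making f surjective onto Y would make the first round trip return S itself.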
6.6.7 Remark: Theorem 6.6.8 extends Theorems 6.6.3 and 6.6.4 to arbitrary sets of sets. This theorem involves the “double set map” f̿ : IP(IP(X)) → IP(IP(Y)) for the function f, which is defined by f̿ : S ↦ {f̄(A); A ∈ S}, and the “double inverse set map” (the set map of the inverse set map) f̿⁻¹ : IP(IP(Y)) → IP(IP(X)) defined by f̿⁻¹ : S ↦ {f̄⁻¹(A); A ∈ S}.
6.6.8 Theorem: Let f : X → Y be a function with set map f̄ : IP(X) → IP(Y) and inverse set map f̄⁻¹ : IP(Y) → IP(X). Then the following statements are true for any S ∈ IP(IP(X)) and S′ ∈ IP(IP(Y)).
(iii) f̄⁻¹(A ∪ B) = f̄⁻¹(A) ∪ f̄⁻¹(B).
(iv) f̄⁻¹(A ∩ B) = f̄⁻¹(A) ∩ f̄⁻¹(B).
(i) f̄(⋃S) = ⋃{f̄(A); A ∈ S} = ⋃f̿(S).
(ii) f̄(⋂S) ⊆ ⋂{f̄(A); A ∈ S} = ⋂f̿(S) if S ≠ ∅.
(iii) f̄⁻¹(⋃S′) = ⋃{f̄⁻¹(A); A ∈ S′} = ⋃f̿⁻¹(S′).
(iv) f̄⁻¹(⋂S′) = ⋂{f̄⁻¹(A); A ∈ S′} = ⋂f̿⁻¹(S′) if S′ ≠ ∅.
And so forth . . .
6.6.9 Theorem: Let X and Y be sets and let f : X → Y be a function from X to Y. Then X is the disjoint union of the sets f⁻¹({y}) for y ∈ Y. That is,
X = ⋃_{y∈Y} f⁻¹({y})
and
∀y₁, y₂ ∈ Y, (y₁ ≠ y₂ ⇒ f⁻¹({y₁}) ∩ f⁻¹({y₂}) = ∅).
6.6.10 Remark: Theorem 6.6.9 is illustrated in Figure 6.6.2.
[ Figure 6.6.2: Partitioning of a set X by an inverse function f⁻¹. The sets f⁻¹({y₁}) and f⁻¹({y₂}) in X are mapped by f to the points y₁ and y₂ in Y. ]
Functions provide a useful tool for partitioning sets. The value of a function f on a set X may be thought of as a tag which identifies elements of X which belong to the same part of the partition. In other words, a function effectively defines an equivalence relation on its domain.
6.6.11 Remark: Theorem 6.6.9 provides the foundation for non-topological fibrations (groupless fibre bundles). The tuple (E, π, B) could be regarded as a fibration if π : E → B is any function from E to B. Then the “total space” E is partitioned by the “fibre sets” π⁻¹({b}) for b ∈ B. In other words, the set {π⁻¹({b}); b ∈ B} is a partition of E. (See Section 22.1 for non-topological fibrations.)
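Theorem 6.6.9 can be checked directly for a finite function. In the Python sketch below, the function is assumed to be stored as a dictionary and the helper name `fibres` is hypothetical. The target set Y is deliberately chosen larger than the image, so that one of the fibre sets is empty.

```python
def fibres(f, Y):
    """The family y -> f^{-1}({y}) for y in Y, with f stored as a dict."""
    return {y: frozenset(x for x in f if f[x] == y) for y in Y}


f = {n: n % 3 for n in range(7)}      # X = {0, ..., 6}
Y = {0, 1, 2, 3}                      # 3 is not a value of f
parts = fibres(f, Y)

covers = set().union(*parts.values()) == set(f)
disjoint = all(not (parts[y1] & parts[y2])
               for y1 in Y for y2 in Y if y1 != y2)
print({y: sorted(p) for y, p in parts.items()})
print(covers, disjoint)
```

The empty fibre over y = 3 does not violate conditions (i) and (ii) of Definition 6.4.2 as stated, since the empty set is disjoint from every set.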
6.7. Composition of functions
6.7.1 Definition: The composition or composite of two functions f : A → B and g : C → D such that B ⊆ C is the function h : A → D defined by h(x) = g(f(x)) for all x ∈ A.
6.7.2 Notation: g ◦ f denotes the composition of two functions f and g.
6.7.3 Remark: The composition of functions f and g in Definition 6.7.1 would be a well-defined function under the weaker assumption that Im(f) ⊆ C. However, it is always possible to define B to be Im(f) anyway. A further inconvenience caused by the arbitrariness of the range set D is that although D may equal Im(g), the image of g ◦ f might be a proper subset of Im(g) (unless Im(f) = C). Thus one may wish to adjust the target set of g ◦ f to a set other than D. In that case the expression h : A → D in Definition 6.7.1 would not hold. This shows the nuisance value of always having to attach a useless target set to every function. Definition 6.3.23 gives an even more general composite g ◦ f of functions f and g which does not even require Range(f) ⊆ Dom(g). However, such a composite is not always defined everywhere on Dom(f). Partially defined functions are presented in Section 6.11.
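The point about images in Remark 6.7.3 can be seen in a two-line example. The following Python sketch assumes that functions are stored as dictionaries and that every value of f is an argument of g; the name `compose` is a hypothetical helper.

```python
def compose(g, f):
    """The composite x -> g(f(x)), assuming every value of f is a key of g."""
    return {x: g[f[x]] for x in f}


f = {0: 'a', 1: 'a'}                  # the image of f is {'a'}, smaller than Dom(g)
g = {'a': 10, 'b': 20}
h = compose(g, f)

print(h)
print(set(h.values()), set(g.values()))
```

Here g is onto {10, 20}, but the composite is not, because the image of f does not exhaust the domain of g.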
6.7.4 Remark: Although (f ◦ g)(x) is the same thing as f(g(x)), f ◦ g is not the same thing as f(g) because f ◦ g is a function constructed from f and g, not the value of f for the argument g.
6.7.5 Remark: It is sometimes desirable to define g ◦ f even if Range(f) ⊈ Dom(g) in Definition 6.7.1. In this case, f must first be restricted to f⁻¹(Dom(g)) before composing f with g. It would be useful to be able to denote this generalized composite also by g ◦ f, but this does not seem to be common practice.
6.7.6 Remark: A function composition of the form g ◦ f⁻¹ : B → C for f : A → B and g : A → C may be defined when f is not injective if the non-invertibility of f is somehow cancelled by the non-invertibility of g. To make sense of this, suppose that f is surjective and that g(x₁) = g(x₂) whenever f(x₁) = f(x₂) for x₁, x₂ ∈ A. (This means that ∀y ∈ B, ∃z ∈ C, f⁻¹({y}) ⊆ g⁻¹({z}), where z is unique for each y.) Then the set g(f⁻¹({y})) must be a singleton for all y ∈ B. Consequently g ◦ f⁻¹ may be defined to map y to the unique element of this singleton. This is formalized in Definition 6.7.7. (It is tempting to go wobbly at the knees here and apply the axiom of choice to construct a right inverse f⁻¹ for f from which g ◦ f⁻¹ may be constructed. Luckily that’s not the only way to do it.) Definition 6.7.7 is illustrated in Figure 6.7.1.
[ Figure 6.7.1: Generalized right inverse function: g = (g ◦ f⁻¹) ◦ f. ]
6.7.7 Definition: The function quotient of a function g : A → C with respect to a surjective function f : A → B such that ∀x₁, x₂ ∈ A, (f(x₁) = f(x₂) ⇒ g(x₁) = g(x₂)) is the function g ◦ f⁻¹ : B → C defined by g ◦ f⁻¹ = {(f(x), g(x)); x ∈ A}.
6.7.8 Remark: The relation h = {(f(x), g(x)); x ∈ A} ⊆ B × C in Definition 6.7.7 is a function from B to C because for all y ∈ B, there is at least one x ∈ A with f(x) = y since f is surjective, and the uniqueness of g(x) for a given f(x) follows from the assumed relation between f and g. The function quotient satisfies g = (g ◦ f⁻¹) ◦ f. This may be regarded as a “generalized right inverse” of some sort.
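The construction in Definition 6.7.7 needs no choice of right inverse, as the following finite computation suggests. This Python sketch assumes that f and g are dictionaries with the same set of keys A; the name `function_quotient` is hypothetical, and the error check is an addition for the illustration which detects when g is not constant on the fibres of f.

```python
def function_quotient(g, f):
    """Build {(f(x), g(x)); x in A} as a dict, for dicts f and g on the same A.

    Raises ValueError if some f(x1) == f(x2) has g(x1) != g(x2),
    since then the set of pairs is not a function.
    """
    pairs = {(f[x], g[x]) for x in f}
    h = dict(pairs)
    if len(h) != len(pairs):
        raise ValueError("g is not constant on the fibres of f")
    return h


A = range(8)
f = {x: x % 4 for x in A}             # surjective onto B = {0, 1, 2, 3}
g = {x: x % 2 for x in A}             # f(x1) == f(x2) implies g(x1) == g(x2)
h = function_quotient(g, f)

print(h)
print(all(h[f[x]] == g[x] for x in A))   # g = (g o f^{-1}) o f
```

Swapping the roles of f and g in this example fails, because x mod 4 is not determined by x mod 2.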
6.8. Families of sets and functions
Families are defined simply as functions. The main difference is in the notation and the focus on the function values. The domain of a family is regarded as a mere index set which only provides tags for the values of the function. A family usually has values that are all sets or all functions. A set of sets or functions is often provided with tags to create a family out of a set.
Sets of things and families of things are often thought of interchangeably since a family can be constructed from a set by providing tags for all elements of the set, and a set can be constructed from a family as the range of the family. An important difference is the fact that the same object may appear twice in the range of a family whereas there cannot be two copies of an object in a set. So if a set is indexed and the indices are removed, the original set is recovered. But if a family has its indices removed, it may not be possible to reconstruct the family from the range of the family.
A family of sets or functions may be thought of as an “indexed set” of sets or functions. But the indexed set is sometimes defined to be the range of the family. Although families have the same set representation as functions, they are thought of as being a different kind of object. This shows once again that mathematical objects have more significance than their set representation alone. A similar observation is that all functions are represented as sets. So a family of functions is also a family of sets, although a family of sets is not generally a family of functions.
6.8.1 Definition: A family of sets with index set I is a function S with domain I such that S(i) is a set for all i ∈ I.
6.8.2 Notation: The value S(i) for a family of sets S is usually denoted as S_i. A family of sets S with index set I may be denoted as (S_i)_{i∈I}.
6.8.3 Remark: It is unnecessary to require all of the sets S_i in a family (S_i)_{i∈I} in Definition 6.8.1 to be subsets of a single set X. The sets S_i are always subsets of the union of all the sets S_i, which is a set by Definition 5.1.26, Axiom (4).
6.8.4 Notation: As a convenience in notation, if two families (A_i)_{i∈I} and (B_i)_{i∈I} have the same index set I, then the family of pairs (A_i, B_i) may be denoted in abbreviated fashion as (A_i, B_i)_{i∈I} instead of the explicit notation ((A_i, B_i))_{i∈I}. An n-tuple of families (A^1_i)_{i∈I}, (A^2_i)_{i∈I}, . . . (A^n_i)_{i∈I} for n ≥ 1 may be denoted as (A^1_i, A^2_i, . . . A^n_i)_{i∈I} instead of the explicit notation ((A^1_i, A^2_i, . . . A^n_i))_{i∈I}.
6.8.5 Definition: The union of a family of sets S = (S_i)_{i∈I} is the union of the range of S. In other words, it is the set ⋃{S_i; i ∈ I}. The union of S may be denoted as ⋃_{i∈I} S_i.
The intersection of a family of sets S = (S_i)_{i∈I} such that I ≠ ∅ is the intersection of the range of S. In other words, it is the set ⋂{S_i; i ∈ I}. The intersection of S may be denoted as ⋂_{i∈I} S_i.
6.8.7 Theorem: Let A = (A_i)_{i∈I} and B = (B_j)_{j∈J} be families of sets and let C be a set. Then the following statements are true.
(i) C ∪ ⋂_{i∈I} A_i = ⋂_{i∈I} (C ∪ A_i) if A ≠ ∅.
(ii) C ∩ ⋃_{i∈I} A_i = ⋃_{i∈I} (C ∩ A_i).
(iii) ⋂_{i∈I} A_i ∪ ⋂_{j∈J} B_j = ⋂_{(i,j)∈I×J} (A_i ∪ B_j) if A ≠ ∅ and B ≠ ∅.
(iv) ⋃_{i∈I} A_i ∩ ⋃_{j∈J} B_j = ⋃_{(i,j)∈I×J} (A_i ∩ B_j).
6.8.8 Remark: Theorem 6.8.9 is an indexed version of Theorem 6.6.8.
6.8.9 Theorem: Let f : X → Y be a function with set map f̄ : IP(X) → IP(Y) and inverse set map f̄⁻¹ : IP(Y) → IP(X). Then the following statements are true for any families of sets A = (A_i)_{i∈I} : I → IP(X) and B = (B_i)_{i∈I} : I → IP(Y).
(i) f̄(⋃_{i∈I} A_i) = ⋃_{i∈I} f̄(A_i).
(ii) f̄(⋂_{i∈I} A_i) ⊆ ⋂_{i∈I} f̄(A_i) if I ≠ ∅.
(iii) f̄⁻¹(⋃_{i∈I} B_i) = ⋃_{i∈I} f̄⁻¹(B_i).
(iv) f̄⁻¹(⋂_{i∈I} B_i) = ⋂_{i∈I} f̄⁻¹(B_i) if I ≠ ∅.
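The four statements of Theorem 6.8.9 can be tested on small indexed families. The Python sketch below assumes that both the function and the families are stored as dictionaries; the helper names `set_map`, `inverse_set_map`, `union` and `intersection` are hypothetical. The intersection helper assumes a non-empty index set, as in the theorem.

```python
def set_map(f, A):
    return {f[x] for x in A}


def inverse_set_map(f, B):
    return {x for x in f if f[x] in B}


def union(family):
    """Union of a family of sets stored as a dict index -> set."""
    return set().union(*family.values())


def intersection(family):
    """Intersection of a family of sets; the index set must be non-empty."""
    sets = list(family.values())
    return set(sets[0]).intersection(*sets[1:])


f = {0: 'a', 1: 'a', 2: 'b', 3: 'c'}
A = {'i': {0, 2}, 'j': {1, 2}, 'k': {0, 1, 2}}          # family in IP(X)
B = {'i': {'a', 'b'}, 'j': {'b', 'c'}}                  # family in IP(Y)

images = {i: set_map(f, A[i]) for i in A}
preimages = {i: inverse_set_map(f, B[i]) for i in B}

print(set_map(f, union(A)) == union(images))                         # (i)
print(set_map(f, intersection(A)), intersection(images))             # (ii)
print(inverse_set_map(f, union(B)) == union(preimages))              # (iii)
print(inverse_set_map(f, intersection(B)) == intersection(preimages))  # (iv)
```

Only statement (ii) is an inclusion, and this example shows that it can be strict when f is not injective.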
[ For more properties of functions for families of sets, see EDM2 [34], 381.D. ] 6.8.10 Remark: Theorems 6.8.11 and 6.8.13 are applicable to topology.
6.8.6 Remark: Theorem 6.8.7 extends Theorem 5.14.7 to general families.
[ Theorems 6.8.11 and 6.8.13 look suspiciously similar to Theorem 5.15.4. Check to ensure that the axiom of choice is not used in Theorems 6.8.11 and 6.8.13. ]
6.8.11 Theorem: Let f : I → IP(IP(X)) for sets I and X. Then ⋃{⋃f(A); A ∈ I} = ⋃⋃{f(A); A ∈ I}.
Proof: Approaching the equation ⋃{⋃f(A); A ∈ I} = ⋃⋃{f(A); A ∈ I} from the left-hand side,
x ∈ ⋃{⋃f(A); A ∈ I} ⇔ ∃A ∈ I, x ∈ ⋃f(A)
⇔ ∃A ∈ I, ∃B ∈ f(A), x ∈ B.
Approaching from the right-hand side,
x ∈ ⋃⋃{f(A); A ∈ I} ⇔ ∃B ∈ ⋃{f(A); A ∈ I}, x ∈ B
⇔ ∃B, (B ∈ ⋃{f(A); A ∈ I} ∧ x ∈ B)
⇔ ∃B, ∃A ∈ I, (B ∈ f(A) ∧ x ∈ B)
⇔ ∃A ∈ I, ∃B ∈ f(A), x ∈ B.
The result follows.
6.8.12 Remark: See Section 5.15 for a version of Theorem 6.8.13 which does not use functions.
6.8.13 Theorem: Let X be a set and Q ∈ IP(IP(X)). Let f : Q → IP(IP(X)) be a function such that A = ⋃f(A) for all A ∈ Q. Then ⋃Q = ⋃{⋃f(A); A ∈ Q} = ⋃⋃{f(A); A ∈ Q}.
Proof: Clearly ⋃Q = ⋃{A; A ∈ Q} = ⋃{⋃f(A); A ∈ Q}, and from Theorem 6.8.11, it follows that ⋃{⋃f(A); A ∈ Q} = ⋃⋃{f(A); A ∈ Q}.
6.8.14 Definition: A family of functions with index set I is a function f with domain I such that f(i) is a function for all i ∈ I.
6.8.15 Notation: The value f(i) for a family of functions f is usually denoted as f_i. A family of functions f with index set I may be denoted as (f_i)_{i∈I}.
6.8.16 Remark: It could be useful to define a notation π^A_B : X^A → X^B for general sets B ⊆ A and X by π^A_B : f ↦ f|_B.
For any sets X, A and B and any function g : B → A, one could usefully define a notation π_g : X^A → X^B for the projection map π_g : f ↦ f ◦ g. Definition 6.9.8 defines specific projection maps for Cartesian products of set families such as ×_{i∈I} S_i.
The elements of the Cartesian product in Definition 6.9.1 may be thought of as either functions or sets of ordered pairs (the graphs of the functions). The perspective may be chosen according to one’s purposes.
6.9.1 Definition: The Cartesian product of a family of sets (S_i)_{i∈I} is the set of functions
{x : I → ⋃_{i∈I} S_i; ∀i ∈ I, x_i ∈ S_i}.
6.9.2 Notation: ×_{i∈I} S_i for a family of sets (S_i)_{i∈I} denotes the Cartesian product of the family of sets according to Definition 6.9.1.
6.9.3 Remark: The Cartesian product in Definition 6.9.1 may be written as:
×_{i∈I} S_i = {f ∈ IP(I × ⋃_{i∈I} S_i); (∀j ∈ I, ∃′x ∈ ⋃_{i∈I} S_i, (j, x) ∈ f) ∧ (∀j ∈ I, ∃x ∈ S_j, (j, x) ∈ f)}.
6.9.4 Remark: If S_i = X for all i ∈ I, then ×_{i∈I} S_i = X^I for any sets X and I. (See Notation 6.5.13 for the set of functions X^I.)
If I = ℕ_n for some n ∈ ℤ⁺₀ and S_i = X for all i ∈ I, then ×_{i∈I} S_i = X^n. (See Notation 7.2.33 for index sets ℕ_n.)
6.9.5 Remark: The Cartesian product in Definition 6.9.1 is a well-defined set because it is a subset of IP(I × ⋃_{i∈I} S_i). But it cannot always be proven to be non-empty in general without an axiom of choice. Unless one is trying very hard to construct pathological sets, it will usually be true that a Cartesian product will be non-empty if all of the member sets are non-empty. Nevertheless, the non-emptiness of a Cartesian product should not be assumed without at least thinking about the issue. Section 7.8 has some ideas on establishing non-emptiness of Cartesian products for readers who dislike the axiom of choice.
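For a finite family of finite sets, the Cartesian product of Definition 6.9.1 can be enumerated explicitly, and no choice axiom is involved. The Python sketch below assumes that the family is stored as a dictionary from indices to sets and that each element of the product is returned as a dictionary (a function on the index set); the name `cartesian_product` is hypothetical.

```python
from itertools import product


def cartesian_product(S):
    """All functions x on the index set of S with x[i] in S[i], as dicts."""
    index = list(S)
    return [dict(zip(index, choice))
            for choice in product(*(S[i] for i in index))]


S = {'p': {0, 1}, 'q': {'a', 'b', 'c'}}
P = cartesian_product(S)

print(len(P))
print(all(x[i] in S[i] for x in P for i in S))
print(cartesian_product({'p': {0, 1}, 'q': set()}))   # one empty member set
print(cartesian_product({}))                          # empty index set
```

A single empty member set makes the product empty, whereas the product of the empty family contains exactly one element, namely the empty function.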
6.9. Cartesian products of families of sets and functions
6.9.6 Remark: The Cartesian product of a family of sets may be interpreted as the set of all tagged sets of elements of the family of sets. In other words, one element is sampled from each of the sets in the family, and the index of each set is tagged onto the chosen element to keep track of where it came from. This is clearer in the case that all of the sets in the family are the same set. Thus the Cartesian product of a family (S_i)_{i∈I} where S_i = X for all i ∈ I may be interpreted as the set of all tagged subsets of X, where one element of X must be chosen for each index i ∈ I. The partial Cartesian product in Definition 6.10.1 differs in that all indices are optional; it is not necessary to choose an element for each index.

6.9.7 Remark: The definition of ×_{i∈I} S_i is reminiscent of the definition of a cross-section of a fibre bundle because of the way that an element of S_i must be chosen for each i ∈ I.

6.9.8 Definition: The projection maps for a Cartesian product S = ×_{i∈I} S_i are the functions π_j : S → S_j defined for j ∈ I by π_j : (x_i)_{i∈I} ↦ x_j for all (x_i)_{i∈I} ∈ S.

[ See Halmos [159], page 36, for set projections. ]

[ Define also projections for arbitrary subsets of the index set. Define cross-sections of arbitrary projection maps as right inverses of surjective functions. ]

6.9.9 Remark: See Remark 6.8.16 for some more general notations for projection maps.

6.9.10 Definition: An n-ary function from a set X to a set Y for n ∈ ℤ₀⁺ is a function f : X^n → Y.
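For finite families, the Cartesian product of Definition 6.9.1 and the projection maps of Definition 6.9.8 can be sketched in Python. This is only an illustration under the assumption that the index set and all member sets are finite. The names cartesian_product and projection are hypothetical, not taken from the text, and each element (x_i)_{i∈I} is represented as a dictionary from indices to values.

```python
from itertools import product


def cartesian_product(family):
    """Return every choice x with x[i] in family[i] for each index i.

    `family` is a dict mapping each index i to a finite set S_i.
    Each element of the product is represented as a dict {i: x_i}.
    """
    indices = list(family)
    return [dict(zip(indices, choice))
            for choice in product(*(family[i] for i in indices))]


def projection(j):
    """Return the projection map pi_j, which sends (x_i) to x_j."""
    return lambda x: x[j]


# A family indexed by I = {"a", "b"} (illustrative data only).
family = {"a": {0, 1}, "b": {"u", "v", "w"}}
S = cartesian_product(family)
pi_b = projection("b")

# If one member set is empty, the product is empty.
empty = cartesian_product({"a": {0, 1}, "b": set()})
```

In this finite setting no axiom of choice is involved, and the product is empty exactly when some member set is empty.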
[ Somewhere near here should be a definition and notation for sets like {X1 × X2; X1 ∈ S1 and X2 ∈ S2}, where S1 and S2 are sets of sets. I don’t remember what such constructions are useful for. ]

6.9.11 Definition: The direct product of any two functions f and g is the function f × g : Dom(f) × Dom(g) → Range(f) × Range(g) defined by (f × g) : (x, y) ↦ (f(x), g(y)) for (x, y) ∈ Dom(f) × Dom(g).

6.9.12 Definition: The pointwise direct product of any two functions f and g is the function f ×̇ g : Dom(f) ∩ Dom(g) → Range(f) × Range(g) defined by (f ×̇ g)(x) = (f(x), g(x)) for all x ∈ Dom(f) ∩ Dom(g).

6.9.13 Remark: Definition 6.9.11 is standard. (E.g. see EDM2 [34], 381.C.) Definition 6.9.12 and its notations are probably non-standard. The pointwise direct product function is the same as the composition of the direct product with a diagonal map. Let X = Dom(f) ∩ Dom(g) in Definition 6.9.12 and define the diagonal map d : X → X × X by d(x) = (x, x) for all x ∈ X. Then (f × g) ∘ d is the same as the pointwise direct product f ×̇ g. In practice, the notation f × g will generally be used instead of f ×̇ g, but the context should always resolve the ambiguity.

6.10. Partial Cartesian products and identification spaces

6.10.1 Definition: The partial Cartesian product of a family of sets (S_i)_{i∈I} is the set of functions ×̊_{i∈I} S_i defined by ×̊_{i∈I} S_i = {x : J → ⋃_{i∈I} S_i; J ⊆ I and ∀i ∈ J, x_i ∈ S_i}. (This is a well-defined set because it is a subset of the power set ℙ(I × ⋃_{i∈I} S_i).)

6.10.2 Remark: If the index set I is a subset of the integers, then the elements of a partial Cartesian product may be referred to as “partial sequences”.

6.10.3 Remark: Definition 6.10.1 is probably non-standard. The idea here is that the functions are not necessarily defined on all of the index set I. The set ×̊_{i∈I} S_i is a superset of the standard Cartesian product set ×_{i∈I} S_i. Elements of ×̊_{i∈I} S_i are restrictions of the elements of ×_{i∈I} S_i to arbitrary subsets of I. (These restrictions are actually projections.) That is, the elements of the partial Cartesian product may be thought of as families (x_i)_{i∈I} of the normal Cartesian product for which some of the elements x_i may be undefined. Unfortunately, set theory does not have a standard symbol or definition for an undefined element.

6.10.4 Remark: Unlike the situation with full Cartesian products, the partial Cartesian products in Definition 6.10.1 are guaranteed to be non-empty, even if some or all of the sets in the family are empty. (The empty function, with J = ∅, is always an element.)
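The partial Cartesian product of Definition 6.10.1 can be enumerated for a finite family in the same spirit. The following Python sketch assumes finite index and member sets. The function name partial_cartesian_product is hypothetical, and each element x : J → ⋃ S_i is represented as a dictionary whose keys form the subset J of the index set.

```python
from itertools import chain, combinations, product


def partial_cartesian_product(family):
    """Return all functions x defined on some J ⊆ I with x[i] in family[i]."""
    indices = list(family)
    # All subsets J of the index set, as tuples of indices.
    subsets = chain.from_iterable(
        combinations(indices, r) for r in range(len(indices) + 1))
    result = []
    for J in subsets:
        # One element per index in J; indices outside J stay undefined.
        for choice in product(*(family[i] for i in J)):
            result.append(dict(zip(J, choice)))
    return result


family = {1: {"x", "y"}, 2: {"z"}}
P = partial_cartesian_product(family)

# The elements defined on all of I form the ordinary Cartesian product.
full = [x for x in P if set(x) == set(family)]

# Even when every member set is empty, the empty function remains.
all_empty = partial_cartesian_product({1: set(), 2: set()})
```

For this family the count is (2 + 1)(1 + 1) = 6, since each index either is omitted or takes one of the values in its set. The last line illustrates the non-emptiness noted in Remark 6.10.4.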
6.10.5 Definition: An identification space of the family of sets (S_i)_{i∈I} is a subset X of the partial Cartesian product set ×̊_{i∈I} S_i such that
(i) ∀x ∈ X, x ≠ ∅,
(ii) ∀x, y ∈ X, ∀i ∈ I, (x_i = y_i ⇒ x = y), and
(iii) ∀i ∈ I, ∀s ∈ S_i, ∃x ∈ X, x_i = s.
In other words, the elements of X are non-empty elements of ×̊_{i∈I} S_i such that every element of each set S_i is the element x_i of one and only one element x of X.
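The three conditions of Definition 6.10.5 can be checked mechanically for finite data. The Python sketch below is an illustration only. The function name is_identification_space is hypothetical, elements of the partial Cartesian product are represented as dictionaries, and condition (ii) is read as applying only where both x_i and y_i are defined. The sample data is loosely modelled on two three-element sets with two pairs of points identified; it is an assumption, not a transcription of any figure.

```python
def is_identification_space(X, family):
    """Check conditions (i)-(iii) for a list X of dicts {i: x_i}."""
    # Every element must lie in the partial Cartesian product.
    in_partial_product = all(
        i in family and s in family[i] for x in X for i, s in x.items())
    # (i) No element of X is the empty function.
    nonempty = all(len(x) > 0 for x in X)
    # (ii) Elements which agree at a common index must be equal.
    separated = all(
        x == y
        for x in X for y in X
        for i in x if i in y and x[i] == y[i])
    # (iii) Every element of every S_i occurs as x_i for some x in X.
    covering = all(
        any(i in x and x[i] == s for x in X)
        for i in family for s in family[i])
    return in_partial_product and nonempty and separated and covering


family = {1: {"w1", "x1", "y1"}, 2: {"x2", "y2", "z2"}}
X = [{1: "w1"}, {1: "x1", 2: "x2"}, {1: "y1", 2: "y2"}, {2: "z2"}]

ok = is_identification_space(X, family)
# Adding {1: "x1"} makes x1 occur in two different elements, violating (ii).
bad = is_identification_space(X + [{1: "x1"}], family)
```

Here x1 is identified with x2 and y1 with y2, while w1 and z2 are left unidentified.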
6.10.6 Remark: Identification spaces are closely related to quotient sets, which are introduced in Definition 6.4.4.

[ Near here, give a commentary on each of the three conditions of Definition 6.10.5. ]

Definition 6.10.5 requires some interpretation. Strictly speaking, a subset X of the set ×̊_{i∈I} S_i is only a representation of an identification space of the family of sets (S_i)_{i∈I}. The identification space of a family of sets is a kind of equivalence class of the sets in the family. It is easier to interpret Definition 6.10.5 if the sets S_i are pairwise disjoint. In this case, there is a canonical map f : ⋃_{i∈I} S_i → X defined so that f(s) is the unique element x of X such that x_i = s for some i ∈ I. (See Figure 6.10.1, where a dot “·” is used for the “undefined” elements of sequences in X.)

Figure 6.10.1: Identification space of sets S1 and S2. (The map f sends S1 ∪ S2 to X ⊆ ×̊_{i∈I} S_i = S1 ×̊ S2, with f(w1) = (w1, ·), f(z2) = (·, z2) and f(x1) = f(x2) = (x1, x2).)

Since f is a well-defined surjective function, the sets of the form f⁻¹({x}) for x ∈ X are non-empty and constitute a partition of ⋃_{i∈I} S_i. The elements of each set f⁻¹({x}) are regarded as identified. In other words, the elements in such a set are regarded as “grafted” onto each other.

If the sets S_i are not disjoint, it is straightforward to attach a “tag” i to elements of each S_i so as to force the sets to be disjoint. If the sets S_i are disjoint, it is possible to formalize the elements of the identification space as a partition of the set ⋃_{i∈I} S_i rather than as a set of tagged elements of the partial Cartesian product ×̊_{i∈I} S_i. In fact, the partial Cartesian product set ×̊_{i∈I} S_i is simply the set of subsets of ⋃_{i∈I} S_i with tags on each element to indicate which set S_i it was drawn from.
The concept of grafting sets onto each other to create identification spaces is fundamental to differential geometry. Differentiable and topological manifolds are generally created by grafting portions of Euclidean spaces onto each other to create more general classes of spaces. The sets (Si )i∈I are in this case the domains of charts in an atlas. The domain of each chart is a “patch”. The patches are grafted together to form an abstract manifold. It is necessary to also define an identification space topology and an identification space differentiable structure, and so forth, in order to build up the structural layers of manifolds.
6.10.7 Remark: The concept of a “direct sum of a family of sets” (see EDM2 [34], 381.E) is essentially equivalent to an identification space for a disjoint family of sets.
6.11. Partially defined functions

6.11.1 Remark: A function is said to be “well defined” if it has a unique value for every element of its domain set. In other words, a function is well defined if and only if it is a function. The reason for the superfluous adjective “well-defined” is the fact that sometimes one wishes to discuss “partially defined functions”, which are not truly functions because they are not necessarily defined for the whole domain. Sometimes “multiple-valued functions” are discussed, especially in the context of complex functions. To avoid woolly thinking, it is best to avoid using the word “function” except when it is well defined.

6.11.2 Remark: Definition 6.11.3 and Notation 6.11.4 are non-standard but often useful. There are many situations in differential geometry and analysis where the functions of interest are not defined everywhere.

6.11.3 Definition: A partially defined function (or partial function or local function) from a set A to a set B is a relation f −< (f, A, B) such that
(i) ∀(a1, b1), (a2, b2) ∈ f, (a1 = a2 ⇒ b1 = b2).
6.11.4 Notation: A →̊ B denotes the set of all local functions from a set A to a set B. f : A →̊ B means that f is a local function from a set A to a set B.

6.11.5 Remark: An alternative notation for A →̊ B would be B^Å. However, it is best to leave the sets A and B unadorned to make way for various kinds of subscripts and superscripts. For example, a partially defined function f from C_0^1(ℝ^n) to ℝ would be of the form f : C_0^1(ℝ^n) →̊ ℝ or f : (ℝ^n → ℝ) →̊ ℝ. It would be difficult to comfortably position a small circle on top of either of the sets C_0^1(ℝ^n) or ℝ^n → ℝ.

6.11.6 Remark: There is a notation Y^X for the set of functions f : X → Y, but there is no simple notation for the set of partially defined functions {f : U → Y; U ⊆ X}. This could be denoted in the following ways.

{f : U → Y; U ⊆ X} = ⋃_{U⊆X} Y^U ≡ (Y ∪ {Y})^X = (Y⁺)^X.

The second notation has the advantage of making it obvious that the cardinality of the set is (#(Y) + 1)^{#(X)}, but it is not good, meaningful set notation.

[ Show that the composite of any two functions is a partially defined function. See Definition 6.3.23 and Remark 6.7.3. ]

6.11.7 Definition: The composition or composite of partially defined functions f1 −< (f1, A1, B1) and f2 −< (f2, A2, B2) is the relation f −< (f, A1, B2) where f = {(a, b); ∃c ∈ B1 ∩ A2, ((a, c) ∈ f1 ∧ (c, b) ∈ f2)}.

6.11.8 Theorem: The composite of any two partially defined functions is a partially defined function.

Proof: Let f1 −< (f1, A1, B1) and f2 −< (f2, A2, B2) be partially defined functions. By Definition 6.3.23, the composite of f1 and f2 is the relation f −< (f, A1, B2) where f = {(a, b); ∃c ∈ B1 ∩ A2, ((a, c) ∈ f1 ∧ (c, b) ∈ f2)}. Suppose (a, b1) ∈ f and (a, b2) ∈ f. Then for some c1, c2 ∈ B1 ∩ A2, (a, c1), (a, c2) ∈ f1 and (c1, b1), (c2, b2) ∈ f2. Since f1 is a partially defined function, c1 = c2. So b1 = b2 because f2 is a partially defined function. Hence f is a partially defined function.

6.11.9 Remark: Theorem 6.11.8 is suspiciously similar to Theorem 6.3.32. Theorem 6.11.8 is illustrated in Figure 6.11.1.

Figure 6.11.1: Composite of two functions is a partially defined function. (The figure shows f1 : A1 → B1, f2 : A2 → B2 and f2 ∘ f1 : A1 ∩ f1⁻¹(A2) → B2.)
6.11.10 Theorem: The composite of any two functions is a partially defined function. Proof: This is an immediate corollary of Theorem 6.11.8.
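The content of Theorems 6.11.8 and 6.11.10 can be illustrated with finite relations. In the Python sketch below, a relation is represented as a set of ordered pairs. The names is_partial_function and compose are hypothetical, and the sample sets are illustrative assumptions. The composite follows the formula {(a, b); ∃c, ((a, c) ∈ f1 ∧ (c, b) ∈ f2)} of Definition 6.11.7.

```python
def is_partial_function(rel):
    """True if no first component is paired with two different values."""
    seen = {}
    for a, b in rel:
        if a in seen and seen[a] != b:
            return False
        seen[a] = b
    return True


def compose(f1, f2):
    """Return {(a, b): (a, c) in f1 and (c, b) in f2 for some c}."""
    return {(a, b) for (a, c) in f1 for (c2, b) in f2 if c == c2}


A1 = {1, 2, 3}
f1 = {(1, "p"), (2, "q"), (3, "r")}      # a function defined on all of A1
f2 = {("q", 10), ("r", 20), ("s", 30)}   # a function on A2 = {"q", "r", "s"}

f = compose(f1, f2)
domain = {a for (a, b) in f}

# A relation which is not a partial function, for contrast.
not_partial = {(1, "p"), (1, "q")}
```

Here f1 and f2 are both functions, but their composite is defined only on {2, 3}, the part of A1 which f1 maps into A2. So the composite is a partially defined function on A1, not a function on A1.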
6.12. Notations for sets of functions

6.12.1 Remark: Let A and B be sets. It is useful to denote by A → B the set of functions f : A → B. Then one may write f ∈ (A → B). This set of functions may also be denoted as B^A. (See Notation 6.5.13.) Thus f ∈ (A → B) and f : A → B mean exactly the same thing as f ∈ B^A. The notation A → B seems useless until one considers function-valued functions. Let C be a set, and let f be a function on A whose values are functions from B to C. One may write this as f : A → C^B or f ∈ (C^B)^A. It is much clearer to write f : A → (B → C) or f ∈ (A → (B → C)). (See Remark 6.5.16 for further comments on this notation.)

6.12.2 Notation: X → Y for sets X and Y denotes the set Y^X.

6.12.3 Remark: For sets A, B and C, a function f : A → (B → C) has a “function transpose” f^t : B → (A → C) defined by f^t(b)(a) = f(a)(b) for a ∈ A and b ∈ B. (One could also refer to f^t as a “re-sequence” of f.) As a practical example, a tangent operator field on a differentiable manifold M is of the form X : M → ((M → ℝ) → ℝ). (Here A = M, B = (M → ℝ) and C = ℝ.) This may be transposed as the function X^t : (M → ℝ) → (M → ℝ) defined by X^t(f)(x) = X(x)(f) for all f ∈ (M → ℝ) and x ∈ M. In fact, whenever a function is function-valued, and the functions all have the same domain, the function may be transposed in this way to make a new function.

It often happens in differential geometry that function-valued functions are transposed in this way for convenience according to context. Noteworthy examples of this are differentials and connections. There are four different ways of representing the information in a function f : A × B → C using transposition and “domain-splitting”. These are as follows.

f : A × B → C
f^t : B × A → C defined by f^t(b, a) = f(a, b)
f̄ : A → (B → C) defined by f̄(a)(b) = f(a, b)
f̄^t : B → (A → C) defined by f̄^t(b)(a) = f(a, b),

where f̄ denotes the “domain-split” of f. Any one of these functions may be defined in terms of any of the others. Therefore they all contain the same information. Functions are often freely converted between these forms with little or no comment. Any function defined on a Cartesian product may be regarded as a function-valued function by fixing the value of one coordinate and constructing a function using the remaining coordinate. The functions f̄ : A → (B → C) and f̄^t : B → (A → C) may be thought of as “projections” of f onto A and B respectively. One could even invent notations such as π1 f for f̄ and π2 f for f̄^t. Obviously there are very many more ways of doing this in the case of a Cartesian product of many sets.

[ Must also discuss here the “circled arrow” or “William Tell” notation “−◦→” for group actions as shown in Figure 23.6.2. Use notation G −◦→ F to denote G → (F → F) or G × F → F. ]
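The four equivalent representations of a function f : A × B → C (the function itself, its transpose, its domain-split and the transposed domain-split) can be sketched with Python closures. This is an illustration only. The helper names transpose, domain_split and function_transpose are hypothetical, and the sample function f is an arbitrary assumption chosen so that the argument order is visible in the result.

```python
def transpose(f):
    """Given f(a, b), return f_t with f_t(b, a) = f(a, b)."""
    return lambda b, a: f(a, b)


def domain_split(f):
    """Given f(a, b), return f_bar with f_bar(a)(b) = f(a, b)."""
    return lambda a: lambda b: f(a, b)


def function_transpose(g):
    """Given g(a)(b), return g_t with g_t(b)(a) = g(a)(b)."""
    return lambda b: lambda a: g(a)(b)


def f(a, b):
    # Sample function on pairs of integers (illustrative only).
    return 10 * a + b


f_t = transpose(f)                     # f_t : B × A → C
f_bar = domain_split(f)                # f_bar : A → (B → C)
f_bar_t = function_transpose(f_bar)    # f_bar_t : B → (A → C)
```

Each of the four forms returns the same value for corresponding arguments, which is the sense in which they contain the same information. In programming terminology, the domain-split is usually called currying.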
Chapter 7 Order and integers
7.1 Ordered sets
7.2 Ordinal numbers
7.3 Natural numbers
7.4 Unsigned integer arithmetic
7.5 Signed integers
7.6 Extended integers
7.7 Cartesian products of sequences of sets and functions
7.8 Choice functions without the axiom of choice
7.9 Indicator functions and delta functions
7.10 Permutations
7.11 Combinations and ordered selections
7.12 List spaces for general sets
7.13 Reformulation of logic in terms of axiomatic mathematics
The concept of order on sets is closely associated with numbers. Historically, the first numbers were ordinal numbers, and order could generally be expressed in terms of numbers. However, both order and numbers have broad generalizations beyond their historical origins.

Order does not require set theory. Definitions of various categories of order relations may be expressed in terms of predicate calculus alone. However, applicable order relations are typically defined on sets. Therefore only order on sets will be presented here.

7.1. Ordered sets

Totally ordered sets may be thought of as abstractions from the order relation of the real numbers. Partially ordered sets may be thought of as an abstraction from the inclusion relations of sets. (See Definition 6.3.2 for relations.) Some useful references for ordered sets are Halmos [159], section 14, Simmons [139], section 8, pages 43–48, and EDM2 [34], article 311.

Ordered sets are particularly useful for indexing sets. A set which is indexed by a totally ordered set is called a “sequence”. The index sets of sequences are most often subsets of the integers, but Definition 7.1.14 generalizes this to any totally ordered set. This generality is applicable to “paths” which are traversals of a given set in a specified order. Paths are usually specified (as in Section 16.4) in terms of a real-number parameter, but a totally ordered parameter is the natural generalization.

7.1.1 Definition: A (partial) order for a set X is a relation R −< (R, X, X) which satisfies:
(i) ∀x ∈ X, x R x; (weak reflexivity)
(ii) ∀x, y ∈ X, (x R y ∧ y R x) ⇒ x = y; (antisymmetry)
(iii) ∀x, y, z ∈ X, (x R y ∧ y R z) ⇒ x R z. (transitivity)
A (partially) ordered set is a pair (X, R) such that X is a set and R −< (R, X, X) is a partial order for X.
7.1.2 Theorem: If a relation R for a set X is a partial order, then the inverse relation R⁻¹ is a partial order for X also.

7.1.3 Example: For any set A, the power set X = ℙ(A) has a partial order R defined by R = {(x, y) ∈ X × X; x ⊆ y}. In other words, x R y ⇔ x ⊆ y for x, y ∈ X. Thus “⊆” is a partial order on ℙ(A) for any set A.

[ Define partial order on X^I for ordered set X and set I as in Remark 16.1.8: x ≤ y ⇔ (∀i ∈ I, x_i ≤ y_i). ]

7.1.4 Definition: A total order for a set X is a relation R −< (R, X, X) which satisfies:
(i) ∀x, y ∈ X, x R y ∨ y R x; (strong reflexivity)
(ii) ∀x, y ∈ X, (x R y ∧ y R x) ⇒ x = y; (antisymmetry)
(iii) ∀x, y, z ∈ X, (x R y ∧ y R z) ⇒ x R z. (transitivity)
A totally ordered set is a pair (X, R) such that X is a set and R −< (R, X, X) is a total order for X.

[ Define lexicographic total order on X^I for ordered sets X and I as in Remark 16.1.8: x ≤ y ⇔ ∀j ∈ ℕ_n, (∀i ∈ ℕ_n, i < j ⇒ x_i = y_i) ⇒ x_j ≤ y_j. Equivalently, x ≤ y ⇔ ∀j ∈ ℕ_n, ((x_j ≤ y_j) ∨ (∃i ∈ ℕ_n, (i < j ∧ x_i ≠ y_i))). (See also Exercise 47.7.10.) ]

7.1.5 Remark: The difference between a partial and total order lies in the stronger reflexivity condition for a total order. Clearly every total order is a partial order. So all theorems and definitions which mention a partial order also apply to a total order.

Either reflexivity condition implies that the domain and range of the relation R are both equal to the set X. So the full relation triple (R, X, X) is redundant. Therefore it is best to specify an order as (X, R) where R is just the set of ordered pairs (the graph) of the order relation.

7.1.6 Theorem: If a relation R for a set X is a total order, then the inverse relation R⁻¹ is a total order for X also.

7.1.7 Definition: The dual order of an order R for a set X is the inverse R⁻¹ of the relation R. In other words, the dual of an order R is the relation R⁻¹ = {(x, y); (y, x) ∈ R}.

7.1.8 Notation: The symbol “≤” is often used for a partial or total order R. Thus x ≤ y means x R y. The symbol “≥” means the dual of the order “≤”. Thus “≥” equals R⁻¹.

The symbol “<” denotes the relation which satisfies x < y ⇔ (x ≤ y ∧ x ≠ y). The symbol “>” denotes the relation which satisfies x > y ⇔ (x ≥ y ∧ x ≠ y).

7.1.9 Remark: In computer programming (in some computer languages), a function f which defines a total order on a set X typically returns a value in the set {−1, 0, 1}, where

f(x, y) = −1 if x < y,
f(x, y) = 0 if x = y,
f(x, y) = +1 if x > y.

However, in mathematics an order relation is represented as a subset of X × X, which is equivalent to a boolean (or indicator) function on X × X, which is a function whose values lie in {0, 1}, where the integer 1 represents “true”. The reason for the difference is the fact that equality of elements x, y ∈ X is taken for granted in mathematics. Therefore only two of the order-function values need to be specified. In computer programming, partial orders are not often returned as functions. For such an order-function, it would be necessary to have a fourth value to represent “unrelated” because it is possible that neither (x, y) nor (y, x) is an element of the order.
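The relationship between an order given as a boolean relation and a three-way comparison function can be sketched in Python, together with a brute-force check of the order axioms on a finite set. This is an illustration under the assumption that the set is finite. The names power_set, is_partial_order, is_total_order and compare are hypothetical, and the value None is one possible choice for the fourth “unrelated” result.

```python
from itertools import combinations


def power_set(A):
    """Return all subsets of the finite set A as frozensets."""
    items = list(A)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_partial_order(X, leq):
    """Check weak reflexivity, antisymmetry and transitivity of leq on X."""
    reflexive = all(leq(x, x) for x in X)
    antisymmetric = all(x == y for x in X for y in X
                        if leq(x, y) and leq(y, x))
    transitive = all(leq(x, z) for x in X for y in X for z in X
                     if leq(x, y) and leq(y, z))
    return reflexive and antisymmetric and transitive


def is_total_order(X, leq):
    """Check strong reflexivity in addition to the partial order axioms."""
    comparable = all(leq(x, y) or leq(y, x) for x in X for y in X)
    return comparable and is_partial_order(X, leq)


def compare(leq, x, y):
    """Three-way comparison derived from leq; None means unrelated."""
    if leq(x, y) and leq(y, x):
        return 0
    if leq(x, y):
        return -1
    if leq(y, x):
        return 1
    return None


def subset(x, y):
    return x <= y          # set inclusion for frozensets


def int_leq(x, y):
    return x <= y          # the usual order on integers


X = power_set({1, 2})
```

For the inclusion order on the power set of {1, 2}, the sets {1} and {2} are unrelated, so the comparison needs the fourth value. For the usual order on integers, only the three values −1, 0 and 1 occur.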
7.1.10 Definition: A minimal element of a subset A of a partially ordered set X is an element x ∈ A such that ∀a ∈ A, ¬(a < x).
A maximal element of a subset A of a partially ordered set X is an element x ∈ A such that ∀a ∈ A, ¬(a > x).
A lower bound of a subset A of a partially ordered set X is an element x ∈ X such that ∀a ∈ A, x ≤ a.
An upper bound of a subset A of a partially ordered set X is an element x ∈ X such that ∀a ∈ A, x ≥ a.
An infimum of a subset A of a partially ordered set X is a lower bound x ∈ X for A such that y ≤ x for all lower bounds y for A. A supremum of a subset A of a partially ordered set X is an upper bound x ∈ X for A such that x ≤ y for all upper bounds y for A. A minimum of a subset A of a partially ordered set X is an element x ∈ A such that ∀y ∈ A, x ≤ y.
A maximum of a subset A of a partially ordered set X is an element x ∈ A such that ∀y ∈ A, x ≥ y.

[ Define chains somewhere in this section. Show that a strict order has no cyclic chains. ]

7.1.11 Remark: There is at most one maximum and one minimum for any subset of a partially ordered set. The infimum of a subset A is a maximum of the set of all lower bounds of A, and the supremum of A is a minimum of the set of all upper bounds of A. Therefore there is at most one infimum and one supremum for any subset of a partially ordered set.

7.1.12 Remark: One may define homomorphisms and isomorphisms with respect to the order structure on a set as in Definition 7.1.13. The definitions apply to both partial and total orders. It is important to remember that the inequality symbol “≤” represents two different order relations according to the ordered set in which the elements are being compared. If the sets X and Y are the same (or X ∩ Y ≠ ∅), one should use distinct notations such as R_X and R_Y to indicate which order is being used, since two different orders may be defined on the same set.
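The notions in Definition 7.1.10 and the observation in Remark 7.1.11 that an infimum is a maximum of the set of lower bounds can be tried out on a small finite example. The Python sketch below uses the divisibility order on the divisors of 12, which is an illustrative assumption. All function names are hypothetical, and None is returned when a minimum or maximum does not exist.

```python
def lower_bounds(X, leq, A):
    """Elements of X which are below every element of A."""
    return [x for x in X if all(leq(x, a) for a in A)]


def upper_bounds(X, leq, A):
    """Elements of X which are above every element of A."""
    return [x for x in X if all(leq(a, x) for a in A)]


def minimum(leq, A):
    """Return the minimum of A, or None if A has no minimum."""
    found = [x for x in A if all(leq(x, y) for y in A)]
    return found[0] if found else None


def maximum(leq, A):
    """Return the maximum of A, or None if A has no maximum."""
    found = [x for x in A if all(leq(y, x) for y in A)]
    return found[0] if found else None


def minimal_elements(leq, A):
    """Elements of A with no strictly smaller element in A."""
    return [x for x in A if not any(leq(a, x) and a != x for a in A)]


def infimum(X, leq, A):
    """Maximum of the set of lower bounds of A."""
    return maximum(leq, lower_bounds(X, leq, A))


def supremum(X, leq, A):
    """Minimum of the set of upper bounds of A."""
    return minimum(leq, upper_bounds(X, leq, A))


def divides(x, y):
    return y % x == 0


X = [1, 2, 3, 4, 6, 12]   # divisors of 12, ordered by divisibility
A = [4, 6]
```

In this example the subset A = {4, 6} has no minimum, because neither element divides the other, but both elements are minimal. Its infimum is 2 and its supremum is 12, which are the greatest common divisor and least common multiple respectively.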
7.1.13 Definition: An order homomorphism between two ordered sets X and Y is a map f : X → Y such that ∀x1, x2 ∈ X, x1 ≤ x2 ⇒ f(x1) ≤ f(x2).

An order isomorphism between two ordered sets X and Y is a bijection f : X → Y such that ∀x1, x2 ∈ X, x1 ≤ x2 ⇔ f(x1) ≤ f(x2).

7.1.14 Definition: A sequence (of sets) is a family of sets (X_i)_{i∈I} such that the index set I is a totally ordered set. A sequence of functions is a family of functions (f_i)_{i∈I} such that the index set I is a totally ordered set.

7.1.15 Remark: Families of sets and functions are defined in Section 6.8. Definition 7.1.14 is not strictly correct because an ordered set is really a pair (I, R) rather than a set I. A family of sets (Definition 6.8.1) is strictly speaking a triple X −< (X, I, J), namely a function X : I → J for some sets I and J. Therefore the order (R, I, I) must be incorporated into the triple (X, I, J) somehow. Since the order is thought of as being tightly attached to the set I, one good possibility would be to specify a sequence as (X, I, R, J), or perhaps as (X, (I, R), J). The author’s preference is to standardize on (X, I, R, J) as the specification tuple for a sequence, although an equally good solution is to simply have two different tuples: (X, I, J) for the sequence and (I, R) for the total order on I.

7.1.16 Remark: A family of sets or functions (with no total order on the index set) may sometimes be referred to casually as a “sequence” of sets or functions. If the index set has no specified order, the word “family” should be used. The word “sequence” comes from the Latin word “sequi” meaning “to follow”. So a sequence must have a specified order.

7.1.17 Definition: An ordered traversal of a set X is a bijection f : I → X for some totally ordered set I.

[ Should also define ordered traversals of ordered traversals. I.e. permit the index set to be a doubly totally ordered set. This is like a database table with a primary key and a secondary key. Also allow an arbitrary number of “keys” for multiply ordered traversals. This concept could be useful for defining multi-parameter paths. See Remark 16.1.8. ]
7.1.18 Remark: The non-standard Definition 7.1.17 is essentially equivalent to Definition 7.1.14. It is intended as a generalization of the concept of a continuous curve in a topological space (as in Section 16.2) to a non-topological curve. The very least that one expects of a curve is that it should have an order in which the points are traversed. If I = IR, then this permits discontinuities which a continuous curve forbids. An equally good term for a traversal would be a “trajectory”, although this word has a more specific meaning in physics. An obvious variation of the ordered traversal idea is to simply transfer the order relation from the ordered set I in Definition 7.1.17 to the set X itself. One can obviously also invert the map f without materially affecting the definition. A much more interesting variation is to replace the total order on I with a general partial order on I. But once again, one may as well consider the order to be defined on the set X itself. The purpose of inventing an “ordered traversal” is to put any superfluous structures on the ordered set I into the background. If one uses real-number intervals for I, the topological, differentiability and algebraic properties of the interval suggest themselves, which is often not desirable.
7.2. Ordinal numbers

[ The number sections are very scrappy and sloppy right now because it’s all very standard material which everybody knows. They will be written properly after the more difficult parts of the book have been written. ]

[ Shoenfield [168], pages 246–252, has an interesting version of ordinals and transfinite induction. ]

7.2.1 Remark: The concept of infinity follows naturally from the concept of extrapolation, which is the ability to infer a rule and apply it to make predictions. No one ever holds an infinite amount of information in the mind because of fundamental limitations on information storage. Even if the human mind could hold infinite information, it would not be possible to communicate that information because of signal processing limitations on information transmission. Therefore notionally infinite concepts are actually rules, and the rules themselves are finite.

Even simple animals are able to recognize patterns and learn rules. Psychologists have a concept of “conditioning”, which means that an animal learns to follow a pattern. The learning animal infers a rule and follows it. The recognition of a rule or pattern is sometimes called the “aha” experience when the individual “gets it”. When a pattern has been recognized, the individual’s perception of a sequence of events changes from surprise to disinterest. The first few times that a human put a seed in the ground and an edible plant grew out of it, this may have been surprising, but after a while recognition must have dawned, after which the inferred rule would become “knowledge”, from which a conceptually infinite number of predictions could be made.

Humans are good at inferring rules from experience. For example, when inference is applied to a sequence such as 1, 2, 4, 8, 16, …, a human can infer the rule f(n) = 2^n, which is a finite rule that in principle specifies an infinite amount of information, but this is only possible because of the redundancy in the sequence. So rule and pattern recognition rely upon redundancy in the sequences of experiences.

It is important to remember that nothing in mathematics is really infinite. Infinite concepts are only infinite in principle. Without infinite concepts, mathematics would have to return to the 16th century or even much earlier. Without infinite concepts, there would certainly be no differential geometry. However, all finite concepts and manipulations are expressed in terms of finitely representable rules.

7.2.2 Definition: The successor set of a set X is the set X ∪ {X}.

[ Check the claim in Remark 7.2.3 that a set which equals its own successor set suffers from the Burali-Forti paradox. ]

7.2.3 Remark: For any set X, the successor set X ∪ {X} is a well-defined set. This follows from the unordered pair axiom and the union axiom in ZF set theory. The successor set construction is used in the definition of the ordinal numbers, both finite and infinite. If it is supposed that a set W is equal to its own successor set, such a “set” suffers from the Burali-Forti paradox.

[ Probably von Neumann’s set construction for ordinal numbers in Remark 7.2.4 was published in 1928? ]
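The successor set construction X ↦ X ∪ {X} of Definition 7.2.2 can be imitated with Python frozensets, starting from the empty set. This is a sketch only. The names successor and ordinal are hypothetical, and the loop counter of course presupposes the ordinary integers of the programming language, so nothing is being bootstrapped here.

```python
def successor(x):
    """Return the successor set x ∪ {x} of a frozenset x."""
    return x | frozenset([x])


def ordinal(n):
    """Build a set from the empty set by n successor steps."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x


zero, one, two, three = (ordinal(k) for k in range(4))

# Membership between these sets mirrors the usual strict order on 0..4.
membership_is_less_than = all(
    (ordinal(m) in ordinal(n)) == (m < n)
    for m in range(5) for n in range(5))
```

With this construction, one equals {zero}, two equals {zero, one}, and each set is both an element and a proper subset of each later set.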
7.2.5 Remark: In real life, people tend to represent the non-negative integers as sequences of decimal digits. Thus 4096 and 99999 would be non-negative an alternative representation as S integers. This suggests n finite sequences (i.e. lists) of the form (di )ki=1 ∈ ∞ ({0, 1, . . . 9}) . Ths fly in the ointment here is that n=0 such sequences require the prior definition of the non-negative integers to provide index sets for sequences of arbitrary length. This raises the question of whether sequences of arbitrary length can be defined without first defining integers. Definition 6.1.14 suggests that this can be done. However, each length of sequence must be defined separately because inductive definition cannot be done without first defining all of the non-negative integers! Thw whole purpose of the apparently absurd von Neumann construction for ordinal numbers is to boot-strap the definitions of all integer-related things starting from the basis of the ZF axioms. The axiom of infinity is the crucial axiom which makes it all work. As soon as the ordinal numbers have been defined, and the basic properties have been established, they may be used to re-define integers as sequences of digits, either decimal or binary, or in another related form. It may seem that the von Neumann construction is hereby shown to be at a more fundamental level than the representation of integers as digit sequences. However, this may be an illusion. All of mathematical logic, upon which the ZF axioms are based, pre-supposes the naive notion of sequences of symbols of arbirary length (in each proposition) and sequences of propositions of arbitrary length (in deductive arguments). If sequences are required already for logic, then surely they may be applied in the representation of integers as digit sequences. This is true in principle, but not in practice. 
In practice, when the ZF axioms have been built up on a basis of propositional and predicate calculus, the underpinnings of the logical calculus are removed and ZF set theory must stand on its own. There is some small justice in this, namely the fact that by discarding naive logic and mathematics when the 8 axioms of ZF set theory have been set up, this minimal set of axioms is easier to examine for validity than an ill-defined body of naive mathematics. On the other hand, the metamathematical examination of the validity of ZF set theory requires a lot of naive mathematics. One may argue back and forth forever in this way. It is better to just not think about it too much.
7.2.6 Remark: It is not possible to use induction to define the finite ordinal numbers in the obvious way because induction requires the prior definition of ordinal numbers. The ZF infinity axiom guarantees the existence of at least one infinite set, but from this we must construct a set of ordinals which is infinite but not too infinite. When the finite ordinals have been rigorously defined, the rest of differential geometry can be based securely on them. For this important leap it is convenient to state a few intuitively obvious properties of the finite ordinal numbers $N \in \omega$ defined informally (by naive induction) as $0 = \emptyset$, $1 = \{0\}$, $2 = \{0, 1\}$ etc. The following intuitive properties of finite ordinal numbers can be used both to define them and to verify that the definition matches the intuitive idea.
(i) For all $N \in \omega$, either $N = \emptyset$ or $\exists m \in \omega,\ N = m \cup \{m\}$. That is, all finite ordinal numbers are either the empty set or can be constructed as the successor of another ordinal number. However, this property does not exclude the possibility of an infinite set.
(ii) For all $N \in \omega$, $\bigcup N \in N$. The element $\bigcup N$ of $N$ is the maximum element of $N$. This property prevents the set $N$ from being infinite because the successor set $N = (\bigcup N) \cup \{\bigcup N\}$ of $\bigcup N$ cannot be a member of $N$.
(iii) For all $N \in \omega$, for all $m \in N$, $m \in \omega$. That is, all elements of finite ordinal numbers are finite ordinal numbers. In other words, $N \subseteq \omega$.
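The informal properties (i) to (iii) can be checked mechanically on the first few finite ordinals. The following Python sketch is only an illustration, not part of the formal development: it models sets as `frozenset` objects, and the names `successor`, `finite_ordinals` and `big_union` are hypothetical helpers introduced here. Property (ii) is checked only for non-empty $N$, since the union of the empty set is empty and so cannot be an element of it.

```python
def successor(n: frozenset) -> frozenset:
    """Return the successor set n ∪ {n}."""
    return n | frozenset([n])


def finite_ordinals(count: int) -> list:
    """Return the first `count` von Neumann ordinals 0, 1, 2, ..."""
    ordinals = []
    current = frozenset()  # 0 is the empty set
    for _ in range(count):
        ordinals.append(current)
        current = successor(current)
    return ordinals


def big_union(n: frozenset) -> frozenset:
    """Return the union of all elements of n."""
    result = frozenset()
    for member in n:
        result |= member
    return result


omega_prefix = finite_ordinals(6)

for N in omega_prefix:
    # (i) N is empty or is the successor of some ordinal m.
    assert N == frozenset() or any(N == successor(m) for m in omega_prefix)
    # (ii) The union of N is an element of N (checked for non-empty N only),
    #      and N is the successor of that element.
    if N:
        assert big_union(N) in N
        assert N == successor(big_union(N))
    # (iii) Every element of N is itself one of the ordinals.
    assert all(m in omega_prefix for m in N)
    # The ordinal N has as many elements as the integer it represents.
    assert len(N) == omega_prefix.index(N)
```

The check covers only a finite prefix of $\omega$, so it says nothing about the existence of $\omega$ itself, which is what the axiom of infinity provides.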
7.2.4 Remark: Roughly speaking, the finite ordinal numbers are defined inductively as $0 = \emptyset$, $1 = \{0\}$, $2 = \{0, 1\}$, $3 = \{0, 1, 2\}$, $n^+ = n \cup \{n\}$, etc. Definition 7.2.12 gives a more rigorous characterization. According to EDM2 [34], 33.B, this successor-set construction for the ordinal numbers is due to John von Neumann. This construction has various useful properties. For example, $0 \in 1 \in 2 \in 3 \ldots$ and $0 \subseteq 1 \subseteq 2 \subseteq 3 \ldots$. But $x \in y \Rightarrow x \ne y$. Therefore $0 \subsetneq 1 \subsetneq 2 \subsetneq 3 \ldots$. The membership relation is transitive, and in fact corresponds exactly to our usual notion of the “<” [...]
$\mathbb{Z}^+ = \{n \in \mathbb{Z};\ n > 0\} = \{1, \ldots\}$ denotes the set of positive integers.
$\mathbb{Z}_0^+ = \{n \in \mathbb{Z};\ n \ge 0\} = \{0, 1, \ldots\}$ denotes the set of non-negative integers.
$\mathbb{Z}^- = \{n \in \mathbb{Z};\ n < 0\} = \{\ldots -3, -2, -1\}$ denotes the set of negative integers.
$\mathbb{Z}_0^- = \{n \in \mathbb{Z};\ n \le 0\} = \{\ldots -3, -2, -1, 0\}$ denotes the set of non-positive integers.
[ Here define non-negative integer powers of non-negative integers. ]
Figure 7.5.1 Construction of signed integers from pairs of unsigned integers (axes: n, p)
7.5.5 Remark: Reinhardt [134], page 8, gives the notations $\mathbb{Z}^+ = \{1, 2, 3 \ldots\}$ and $\mathbb{Z}_0^+ = \{0, 1, 2, 3 \ldots\}$. These do not seem to be in common use in the English-language literature, but they are used in this book for the greatest overall simplicity and consistency.
7.5.6 Notation: $\mathbb{Z}_n$ denotes the set $\{i \in \mathbb{Z};\ 0 \le i < n\} = \{0, 1, \ldots n - 1\}$ for $n \in \mathbb{Z}_0^+$.
7.5.7 Remark: Since the elements of the set $\mathbb{Z}_n$ in Notation 7.5.6 are all non-negative, one could define $\mathbb{Z}_n$ as a subset of the ordinal numbers $\omega$. In fact, the set $\mathbb{Z}_n$ is essentially identical to the ordinal number $n = \{0, 1, \ldots n - 1\}$. However, the notation $\mathbb{Z}_n$ is very commonly used. It is also useful to be able to handle negative numbers in the same context as such sets since the modulo function (Definition 8.6.16) is defined for signed numbers and is closely associated with the sets $\mathbb{Z}_n$.
7.5.8 Remark: The “two’s complement” representation of negative integers in computers, which is the most popular by far in personal computers, is of the form $(-2^n, p)$, where $p \in \mathbb{Z}_0^+$ is a non-negative integer satisfying $2^{n-1} \le p < 2^n$, where $n$ is the number of bits. (Such a bit-pattern $p$ represents the value $p - 2^n$.) The bit-patterns for $0 \le p < 2^{n-1}$ represent themselves. So the value represented by a two’s complement binary number $p$ is $((p + 2^{n-1}) \bmod 2^n) - 2^{n-1}$ in terms of the modulo operator in Definition 8.6.16.
7.5.9 Remark: Figure 7.5.2 illustrates another popular way of representing signed integers.
Figure 7.5.2 Construction of signed integers from unsigned integers and sign (axes: s, v)
This kind of representation is sometimes used in computers. (For example, n-bit “sign-magnitude” representation has one sign bit followed by n − 1 value bits.) In this case, each signed integer corresponds to a unique value-sign pair (v, s), where v is a non-negative integer and s = 0 or 1. The pair (0, 1) (meaning “−0”) may or may not be excluded from the set. If “−0” is included in the set, it is defined to be equal to 0 anyway.
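The two representations in Remarks 7.5.8 and 7.5.9 can be compared with a short Python sketch. This is only an illustration under stated assumptions: the function names `twos_complement_value` and `sign_magnitude_value` are hypothetical, and Python's `%` operator stands in for the modulo function of Definition 8.6.16 (the two are assumed to agree for a positive modulus).

```python
def twos_complement_value(p: int, n: int) -> int:
    """Value of the n-bit pattern p, using the formula of Remark 7.5.8."""
    assert 0 <= p < 2 ** n
    half = 2 ** (n - 1)
    return ((p + half) % 2 ** n) - half


def sign_magnitude_value(v: int, s: int) -> int:
    """Value of a value-sign pair (v, s); the pair (0, 1), i.e. "-0", equals 0."""
    assert v >= 0 and s in (0, 1)
    return -v if s == 1 else v


# All sixteen 4-bit patterns, decoded as two's complement numbers.
values = [twos_complement_value(p, 4) for p in range(16)]
```

For 4 bits, the patterns 0 to 7 represent themselves and the patterns 8 to 15 represent −8 to −1, so every integer in the range has exactly one pattern. In the value-sign form, both (0, 0) and (0, 1) decode to 0.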
7.5.10 Remark: $a = (a_\alpha)_{\alpha \in A}$ denotes a function with domain $A$. If $A$ is a subset $[m, n]$ of $\mathbb{Z}$, then $a$ may be regarded as a sequence and written as $(a_i)_{i=m}^n$. If $A$ is a cross-product of two suitable subsets $\mathbb{Z}_m$ and $\mathbb{Z}_n$ of $\mathbb{Z}$, then $a$ may be regarded as a matrix and written as $[a_{ij}]_{i,j=0}^{m-1,n-1}$.
[ Should define “sequences” as families with integer number domain. But Definition 7.1.14 defines a sequence as a family with a totally ordered index set. ]
[ Must define addition and multiplication on integers. This should maybe wait until the algebra chapter. ]
7.5.11 Remark: The first publication of the modern definition of multiplication for signed numbers is attributed to Rafael Bombelli in 1572. (See Remark 46.2.7.)
[ Define computer representations and arithmetic for signed integers. ]
[ Maybe comment on Gaußian integers near here. Basically that’s $\mathbb{Z}^n$. Also comment on the linear transformations which are permitted for such grids. ]
7.6. Extended integers
[ Also present an axiomatic system for extended signed integers. ]
7.6.1 Remark: It does not matter how the pseudo-numbers $\infty$ and $-\infty$ are represented. The only things that really matter are the order relations $\forall n \in \mathbb{Z},\ n < \infty$ and $\forall n \in \mathbb{Z},\ -\infty < n$, and of course $-\infty < \infty$.
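7.6.2 Definition: The set of extended integers is the set $\mathbb{Z} \cup \{-\infty, \infty\}$.
7.6.3 Notation: $\overline{\mathbb{Z}}$ denotes the set of extended integers $\mathbb{Z} \cup \{-\infty, \infty\}$.
7.6.4 Notation:
$\overline{\mathbb{Z}}^+ = \{n \in \overline{\mathbb{Z}};\ n > 0\} = \{1, 2, \ldots, \infty\}$ denotes the set of positive extended integers.
$\overline{\mathbb{Z}}_0^+ = \{n \in \overline{\mathbb{Z}};\ n \ge 0\} = \{0, 1, \ldots, \infty\}$ denotes the set of non-negative extended integers.
$\overline{\mathbb{Z}}^- = \{n \in \overline{\mathbb{Z}};\ n < 0\} = \{-\infty, \ldots -3, -2, -1\}$ denotes the set of negative extended integers.
$\overline{\mathbb{Z}}_0^- = \{n \in \overline{\mathbb{Z}};\ n \le 0\} = \{-\infty, \ldots -3, -2, -1, 0\}$ denotes the set of non-positive extended integers.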
7.7. Cartesian products of sequences of sets and functions
7.7.1 Notation: $X^n$, for any set $X$ and $n \in \mathbb{Z}_0^+$, denotes the Cartesian product $\times_{i \in \mathbb{N}_n} X_i$ where $X_i = X$ for all $i \in \mathbb{N}_n$.
7.7.2 Remark: Notation 7.7.1 has the interesting consequence that $X^0 = Y^0$ for any sets $X$ and $Y$. This shows how necessary it is to have class tags and even name tags for sets, since the set value of a combination of symbols does not contain the entire meaning of those symbols.
7.7.3 Definition: An n-tuple of elements of a set $X$ for any $n \in \mathbb{Z}_0^+$ is any element of $X^n$. A 2-tuple may be referred to as a duple (rarely) or ordered pair, a 3-tuple may be called a triple, a 4-tuple may be called a quadruple and a 5-tuple may be called a pentuple.
7.7.4 Remark: The phrase “ordered pair” should be used with care because it can also mean Definition 6.1.3. 7.7.5 Notation: The expressions (a, b), (a, b, c), (a, b, c, d) and (a, b, c, d, e) for a, b, c, d, e ∈ X denote respectively a duple, triple, quadruple and pentuple of elements of X. (The ordering on the printed line represents the order within each tuple.) Notations for general n-tuples are defined inductively on n.
7.7.6 Definition: For any set $X$ and $m, n \in \mathbb{Z}_0^+$, the standard identification map $\mathrm{concat} : X^m \times X^n \to X^{m+n}$ is defined for $a \in X^m$ and $b \in X^n$ by $\mathrm{concat} : (a, b) \mapsto c$, where $c \in X^{m+n}$ is defined by $c_i = a_i$ for $i \in \mathbb{N}_m$ and $c_{i+m} = b_i$ for $i \in \mathbb{N}_n$.
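The identification map of Definition 7.7.6 can be sketched in Python with tuples playing the role of elements of $X^m$ and $X^n$. This is a minimal sketch, assuming 0-based tuple indices in place of the index sets of the definition; the index arithmetic $c_{i+m} = b_i$ is otherwise copied directly.

```python
def concat(a: tuple, b: tuple) -> tuple:
    """Concatenate an m-tuple a and an n-tuple b into an (m+n)-tuple c."""
    m, n = len(a), len(b)
    c = [None] * (m + n)
    for i in range(m):
        c[i] = a[i]          # c_i = a_i for the first m indices
    for i in range(n):
        c[i + m] = b[i]      # c_{i+m} = b_i for the remaining n indices
    return tuple(c)


# Example with X a set of strings, m = 2 and n = 1.
example = concat(("x", "y"), ("z",))
```

The empty tuple `()` is the single element of $X^0$ for every $X$, which mirrors the observation in Remark 7.7.2, and concatenating with it leaves a tuple unchanged.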
[ Define Gaußian integers $\mathbb{Z}^n$. ]
[ Also define Cartesian product of sequence of functions corresponding to families of functions as in Definition 6.9.11. ]
7.6.5 Remark: The bar over the symbol for the set of integers in Notation 7.6.3 is inspired by the usual notation for the closure of a set in topology. The rough idea here is that by including the pseudo-numbers ∞ and −∞, the set of numbers is made in some sense “complete” by including the limits of numbers as they get arbitrarily large in either direction. Of course, the notion of “limit” requires topology, which is not defined for a mere algebraic system. (Example 14.7.3 gives a hint of how to give topological meaning to the idea that ∞ is the limit of arbitrarily large integers.) The same over-bar convention is followed consistently (in this book) for the rational and real number systems also, as summarized in Remark 1.6.1.
7.8. Choice functions without the axiom of choice
[ Near here should have a list of ways of avoiding AC by having particular knowledge about a collection of sets. E.g. if all of the sets in a collection are the same, one can choose the same value for each set. In the particular example of the set of all non-empty compact subsets of the real numbers, one can choose the infimum of each subset as the representative. This is an uncountable collection which is quite complex, but choices can easily be made. ]
Despite all that has been said against the Axiom of Choice in Section 5.9, there are often times when it is desirable to be able to say that the cross product of a non-empty family of non-empty sets is non-empty. The elements of any cross product can be thought of as “choice functions” because each element of the cross product singles out an element from each element of the family. As mentioned in Remark 5.9.9, the axiom of choice may be expressed as
$(S \ne \emptyset \wedge (\forall i \in S,\ X_i \ne \emptyset)) \Rightarrow \times_{i \in S} X_i \ne \emptyset$. (7.8.1)
In other words, if a non-empty family $(X_i)_{i \in S}$ consists entirely of non-empty members, then its cross product $\times_{i \in S} X_i$ is non-empty. That is, there exists a function $f : S \to \bigcup_{i \in S} X_i$ (called a “choice function”) such that $\forall i \in S,\ f(i) \in X_i$. Although the general axiom of choice is rejected in this book, there are special conditions under which Equation (7.8.1) follows from the ZF axioms in Section 5.1. Theorem 7.8.1 gives one set of special conditions.
[ There are hopefully some more sets of conditions for AC, perhaps using total orders or well-ordering on the collections of sets. If each set in a collection of sets is nicely enough ordered, then perhaps a unique minimum or maximum of each set can be “chosen”. It seems pretty clear that if $\forall i \in S,\ \exists' x \in X_i,\ \forall y \in X_i,\ x R_i y$, where $R_i$ is an order on $X_i$ for all $i \in S$, then one can “choose” $f(i) = \min_i(X_i)$ for all $i \in S$, where $\min_i$ denotes the minimum with respect to the order $R_i$. But this seems to have nothing to do with order specifically. (See Theorem 7.8.2.) Any uniqueness result for a proposition would do. Alternatively, the set of all choices for finite subcollections of the collection of sets could be given a suitable ordering so that a global choice can be made in terms of the ordering. ]
7.8.1 Theorem: Let the family $(X_i)_{i \in S}$ have non-empty intersection $\bigcap_{i \in S} X_i$. Then $(S \ne \emptyset \wedge (\forall i \in S,\ X_i \ne \emptyset)) \Rightarrow \times_{i \in S} X_i \ne \emptyset$.
Proof: To say that the intersection $\bigcap_{i \in S} X_i$ is non-empty means that $\exists x,\ \forall i \in S,\ x \in X_i$. So define $f = \{(i, x);\ i \in S\}$. Then $f \in \times_{i \in S} X_i$. Therefore $\times_{i \in S} X_i \ne \emptyset$.
7.8.2 Theorem: If the family $(X_i)_{i \in S}$ satisfies $\forall j \in S,\ \exists' x \in X_j,\ P(j, x)$ for some set-theoretic formula $P$, then $(S \ne \emptyset \wedge (\forall i \in S,\ X_i \ne \emptyset)) \Rightarrow \times_{i \in S} X_i \ne \emptyset$.
Proof: Define $f : S \to \bigcup_{i \in S} X_i$ by $f = \{(j, x) \in S \times \bigcup_{i \in S} X_i;\ P(j, x) \wedge x \in X_j\}$. The unique-existence hypothesis makes $f$ a function with $f(j) \in X_j$ for all $j \in S$, so $f \in \times_{i \in S} X_i$.
7.8.3 Remark: It seems reasonable that an “axiom of countable choice” should follow from the ZF axioms using induction. However it appears that this is not so. What must be proven is Equation (7.8.1) for countable $S$.
Let $(X_i)_{i \in S}$ be a non-empty countable family of non-empty sets. Define $Y = \times_{i \in S} (X_i \cup \{\emptyset\})$. Then $Y \ne \emptyset$ because the function $z = \{(i, \emptyset);\ i \in S\}$ is an element of $Y$. It may be assumed that the index set $S$ is the set $\omega$ of finite ordinal numbers by composing the map $X$ with a surjective map $g : \omega \to S$ to obtain $X \circ g : \omega \to \mathrm{Range}(X)$. Let $P(n)$ be the proposition: $\exists x \in Y,\ \forall j < n,\ x_j \in X_j$. This means that there exists a sequence of choices $x_j \in X_j$ for $j < n$, although the remaining $x_j$ for $j \ge n$ might all equal the empty set. Clearly $P(0)$ is true because the function $z$ as above satisfies the proposition. Now suppose that $P(k)$ is true for some $k \ge 0$. Then $\exists x \in Y,\ \forall j < k,\ x_j \in X_j$. But $\exists y,\ y \in X_k$ because $X_k \ne \emptyset$. So define $\bar{x} \in Y$ by
$\bar{x}_j = \begin{cases} x_j & \text{for } j \ne k \\ y & \text{for } j = k \end{cases}$
for all $j \in \omega$. Then $\forall j < k + 1,\ \bar{x}_j \in X_j$. So $P(k + 1)$ is true. Therefore $P(n)$ is true for all $n$ by induction. That is, $\forall n \in \omega,\ \exists x \in Y,\ \forall j < n,\ x_j \in X_j$.
Unfortunately, it does not seem to be possible to swap the universal and existential quantifiers here. (This is a very common frustration indeed in mathematics. Swapping the quantifiers without sufficient justification is the cause of many errors, especially when one “knows” that the desired result is true. A similar kind of error is swapping intersections and unions as in $\bigcup_i \bigcap_j S_{ij}$ and $\bigcap_j \bigcup_i S_{ij}$.) Several lines of attack here all lead back subtly to the axiom of choice which one is trying to avoid. For example, let $Y_n = \{x \in Y;\ \forall i < n,\ x_i \in X_i\}$. Then there is surely a function $h : \omega \to Y$ such that $h_n \in Y_n$ for all $n \in \omega$. If so, then it is simple to construct a function $\bar{h} \in \bigcap_{n \in \omega} Y_n$ from the sequence $(h_n)_{n \in \omega}$. But the existence of such a function requires the axiom of countable choice!
There is an interesting comment by Taylor [144], page 19, regarding the axiom of countable choice.
“Any non-empty set A contains at least one element x, and in the ordinary process of logic one can choose a particular element from a non-empty set. By using the principle of induction it follows that one can choose an element from each of a sequence of non-empty sets, but difficulty arises if one has to make the simultaneous choice of an element from each set of a non-countable class C.”
This comment seems to assume the axiom of countable choice. If so, it shows how easy it is to use AC without knowing it.
7.8.4 Remark: This whole subject of axioms of choice and pixies at the bottom of the garden has become so deeply convoluted that the author will have to make some hard choices in order to avoid spending the next 10 years in deep meditation on set theory. The choice will probably be to accept the ZF axioms, but to tag all theorems requiring the axiom of general or countable choice so that the reader will be able to assess just what they will lose or gain by rejecting or accepting such axioms. The AC-sceptic may simply ignore all AC-tainted results.
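The two AC-free constructions of Theorems 7.8.1 and 7.8.2 have a simple computational shape, which the following Python sketch tries to convey. It is only an illustration: for a finite family, as here, choice is never in doubt, so the sketch shows the form of the explicit rule rather than any set-theoretic content. The names `choice_by_common_element` and `choice_by_rule` are hypothetical, and the "minimum" rule is one example of a uniqueness property $P(j, x)$.

```python
def choice_by_common_element(family: dict, x) -> dict:
    """Theorem 7.8.1: if x lies in every X_i, the constant map i -> x is a choice function."""
    assert all(x in X_i for X_i in family.values())
    return {i: x for i in family}


def choice_by_rule(family: dict, rule) -> dict:
    """Theorem 7.8.2: for each j, pick the unique x in X_j for which rule(j, x) holds."""
    chosen = {}
    for j, X_j in family.items():
        candidates = [x for x in X_j if rule(j, x)]
        assert len(candidates) == 1  # unique existence is required
        chosen[j] = candidates[0]
    return chosen


family = {"a": {3, 5, 8}, "b": {5, 1}, "c": {9, 5, 2}}

# The element 5 belongs to every member of the family.
f = choice_by_common_element(family, 5)

# P(j, x) says "x is the minimum of X_j", which singles out exactly one element.
g = choice_by_rule(family, lambda j, x: x == min(family[j]))
```

In both cases the choice function is written down by a single formula, so no separate act of choosing is needed for each index.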
7.9. Indicator functions and delta functions
7.9.1 Remark: The numbers 0 and 1 in Definition 7.9.2 may be unsigned integers, signed integers, real numbers, complex numbers, or elements of any unitary ring or field. The range of the function depends on the context. So the indicator function is actually a kind of “meta-function”. The same observation applies also to the Kronecker delta function and the Levi-Civita alternating symbol. The symbols “0” and “1” are to be interpreted according to context.
7.9.2 Definition: The indicator function of a subset $A$ of a set $S$ is the function $\chi_A : S \to \{0, 1\}$ defined by
$\chi_A(x) = \begin{cases} 1, & x \in A \\ 0, & x \in S \setminus A. \end{cases}$
7.9.3 Notation: $\chi_A$ for a subset $A$ of a set $S$ denotes the indicator function of $A$.
7.9.4 Remark: Figure 7.9.1 illustrates an indicator function $\chi_A$ where $A = \{a\}$ is a subset of S = Z2. The ambient set $S$ is usually clear from the context.
Figure 7.9.1 Function $\chi_{\{a\}}$ for S = Z2
An indicator function is also sometimes called a characteristic function, but this can be confusing, especially in probability contexts. However, the use of the Greek letter χ for the indicator function is no doubt derived from the first letter of χαρακτήρ (which means “stamp, mark, characteristic trait, character, token”).
7.9.5 Theorem: For any set $S$, the sets $\mathbb{P}(S)$ and $2^S$ are equinumerous.
Proof: Define $f : \mathbb{P}(S) \to 2^S$ by $f : A \mapsto \chi_A$ for all $A \subseteq S$, where $\chi_A$ denotes the indicator function for $A$ as a subset of $S$. Then $f$ is a bijection.
7.9.6 Remark: The sets $2^S$ and $\mathbb{P}(S)$ are quite frequently identified with each other. The inverse of the canonical map in the proof of Theorem 7.9.5 is the map $f^{-1} : g \mapsto \{x \in S;\ g(x) = 1\}$. In the case $S = \omega$, there is a map $\phi : 2^\omega \to [0, 1]$ which maps infinite sequences of zeros and ones to the real numbers which have those sequences as binary expansions, namely $\phi : g \mapsto \sum_{n \in \omega} g(n) 2^{-n-1}$. This map is surjective but not injective, because each dyadic rational number in the open interval $(0, 1)$ has two binary expansions.
7.9.7 Definition: The (integer) power-of-two function is the function $f : \mathbb{Z}_0^+ \to \mathbb{Z}^+$ which satisfies $f(0) = 1$ and $f(n + 1) = 2f(n)$ for $n \in \mathbb{Z}_0^+$.
7.9.8 Notation: $2^n$ for $n \in \mathbb{Z}_0^+$ denotes the integer power-of-two function for argument $n \in \mathbb{Z}_0^+$.
7.9.9 Theorem: For any finite set $S$, $\#(\mathbb{P}(S)) = \#(2^S) = 2^{\#(S)}$.
7.9.10 Definition: The Kronecker delta function on a set $S$ is the function $\delta : S \times S \to \{0, 1\}$ which is defined so that $\delta(i, j) = 1$ if and only if $i = j$.
7.9.11 Notation: The Kronecker delta expression $\delta(i, j)$ may be denoted as $\delta_{ij}$, $\delta^j_i$, $\delta^i_j$ or $\delta^{ij}$.
7.9.12 Remark: The Kronecker delta function in Definition 7.9.10 is often applied to sets $S$ which are subsets of the integers. The Kronecker delta function for $S = \mathbb{Z}$ is illustrated in Figure 7.9.2.
Figure 7.9.2 Kronecker delta function on the integers (axes: i, j)
As mentioned in Remark 7.9.1, the range of the Kronecker delta function must be interpreted according to context. The symbols “0” and “1” must be defined within the context. Thus the Kronecker delta function is a kind of meta-function. Theorem 7.9.13 shows that the indicator function and the Kronecker delta function for any set $S$ are closely related.
7.9.13 Theorem: For any set $S$, $\forall i, j \in S,\ \chi_{\{i\}}(j) = \delta_{ij}$.
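The statements in this section can be tested on a small finite set. The Python sketch below is an illustration only, with hypothetical helper names (`indicator`, `support`, `kronecker_delta`, `power_set`): an indicator function is modelled as a dictionary from elements of $S$ to 0 or 1, and `support` plays the role of the inverse map $g \mapsto \{x \in S;\ g(x) = 1\}$ of Remark 7.9.6.

```python
from itertools import combinations


def indicator(A: frozenset, S: frozenset) -> dict:
    """Indicator function of the subset A of S, as a dict x -> 0 or 1."""
    assert A <= S
    return {x: (1 if x in A else 0) for x in S}


def support(g: dict) -> frozenset:
    """Inverse of the map A -> indicator(A, S): the set where g equals 1."""
    return frozenset(x for x, value in g.items() if value == 1)


def kronecker_delta(i, j) -> int:
    """Kronecker delta: 1 if i equals j, otherwise 0."""
    return 1 if i == j else 0


def power_set(S: frozenset) -> list:
    """All subsets of the finite set S."""
    items = list(S)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


S = frozenset({"a", "b", "c"})
subsets = power_set(S)

# Theorem 7.9.9: a set with 3 elements has 2**3 subsets.
assert len(subsets) == 2 ** len(S)
# Theorem 7.9.5: A -> indicator(A, S) is inverted by `support`.
assert all(support(indicator(A, S)) == A for A in subsets)
# Theorem 7.9.13: the indicator of {i} evaluated at j is the Kronecker delta.
assert all(indicator(frozenset({i}), S)[j] == kronecker_delta(i, j)
           for i in S for j in S)
```

Here 0 and 1 are ordinary Python integers, which is one of the contextual interpretations allowed by Remark 7.9.1.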
7.10. Permutations
7.10.1 Remark: A permutation is usually defined only for finite sets. Permutations are useful for defining symmetries and for constructing examples of finite groups.
7.10.2 Remark: The word “permutation” is used for two different but related concepts. The first kind of permutation, given in Definition 7.10.3, is a map from an entire set to itself. This may be thought of as an ordering of the elements of the set although the set does not need to have a defined order. A permutation $P$ of a set $X$ is typically used in conjunction with a function $g : X \to Y$ or a function $h : Y \to X$ for some set $Y$. Since $P : X \to X$ is a bijection, the functions $g \circ P : X \to Y$ and $P \circ h : Y \to X$ are well defined. The function $g \circ P$ is injective if and only if $g$ is injective. Similarly, $g \circ P$ is surjective if and only if $g$ is surjective. Corresponding relations hold between $P \circ h$ and $h$. Thus permutations are useful for modifying functions by rearranging the elements of the domain and/or range.
The second kind of permutation is an “ordered selection” or “ordered subset” of a given set $X$. Thus this kind of permutation of the elements of a set $X$ is a subset $S$ of $X$ which has a total order structure added to the subset $S$. The simplest way to add an order structure to a set $S$ is to induce the order from a set which has a well-known pre-defined order, such as a set of integers for example. Thus an injective function $\psi : S \to \mathbb{Z}$ induces an order on a set $S$ by defining $x \le y$ for $x, y \in S$ if and only if $\psi(x) \le \psi(y)$. For a finite set $X$, it is customary to use the inverse of such a map $\psi$ and require the domain of this inverse to be a contiguous set such as $\mathbb{N}_k$ for some $k \in \mathbb{Z}_0^+$. Thus a permutation could be defined as a map $p : \mathbb{N}_k \to S$.
This kind of permutation is called a “k-permutation” by EDM2 [34], article 330, to distinguish it from the set-bijection style of permutation. In the special case that $X = \mathbb{N}_k$, the two concepts are the same. Feller [106], page 28, uses the term “ordered sample” for an ordered selection and uses the word “permutation” for the special case $\#(X) = k$.
Set-bijection permutations are defined in this section. Ordered-selection “permutations” are defined in Section 7.11. 7.10.3 Definition: A permutation of a set X is a bijection from X to X. 7.10.4 Notation: perm(X) for a set X denotes the set of permutations of X. 7.10.5 Remark: Notation 7.10.4 is probably non-standard.
7.10.6 Remark: A permutation is a transposition if and only if exactly two elements of the set are swapped. Definition 7.10.7 is related to the “swap” operator in Definition 7.12.2. If the set $X$ in Definition 7.10.7 has fewer than two elements, there are no transpositions on $X$.
[ Probably should define the “swap” operator for general sets. Some other list operators are similarly general in nature. ]
7.10.7 Definition: A transposition of a set $X$ is a bijection $f : X \to X$ such that for some $i, j \in X$ with $i \ne j$,
$\forall x \in X,\quad f(x) = \begin{cases} j & \text{if } x = i \\ i & \text{if } x = j \\ x & \text{otherwise.} \end{cases}$
7.10.8 Remark: Definition 7.10.9 assumes that a total order exists on the domain of a permutation. However, this is not really necessary. The parity of a permutation can be defined in terms of the number of transpositions that the permutation is equivalent to.
[ Show that the parity of the number of transpositions of a permutation is equal to the parity in Definition 7.10.9. Determine necessary and sufficient structures and conditions on the domain to make this true more generally. ]
7.10.9 Definition: The parity of a permutation $f : X \to X$ of a totally ordered set $X$ is the integer $\mathrm{parity}(f) = (-1)^k$ where $k = \#\{(i, j) \in X \times X;\ i < j \wedge f(i) > f(j)\}$.
7.10.10 Definition: An even permutation of a totally ordered set $X$ is a permutation $f$ of $X$ such that $\mathrm{parity}(f) = 1$. An odd permutation of a totally ordered set $X$ is a permutation $f$ of $X$ such that $\mathrm{parity}(f) = -1$.
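The inversion-counting definition of parity (Definition 7.10.9) is easy to try out on small sets. In the Python sketch below, a permutation of $\{0, \ldots, n-1\}$ is stored as a tuple `f` with `f[x]` the image of `x`; this encoding and the helper names `parity`, `compose` and `transposition` are assumptions made for the illustration. The two checks at the end correspond to the multiplicative property and the transposition property stated in Theorem 7.10.12, verified here only for $n = 4$.

```python
from itertools import permutations


def parity(f: tuple) -> int:
    """Return (-1)**k, where k counts pairs i < j with f[i] > f[j]."""
    n = len(f)
    k = sum(1 for i in range(n) for j in range(i + 1, n) if f[i] > f[j])
    return (-1) ** k


def compose(f: tuple, g: tuple) -> tuple:
    """Return the composition f ∘ g, i.e. x -> f[g[x]]."""
    return tuple(f[g[x]] for x in range(len(g)))


def transposition(n: int, i: int, j: int) -> tuple:
    """Return the permutation of {0, ..., n-1} which swaps i and j."""
    f = list(range(n))
    f[i], f[j] = f[j], f[i]
    return tuple(f)


perms = list(permutations(range(4)))

# Parity is multiplicative under composition.
assert all(parity(compose(f, g)) == parity(f) * parity(g)
           for f in perms for g in perms)
# Every transposition is odd.
assert all(parity(transposition(4, i, j)) == -1
           for i in range(4) for j in range(4) if i != j)
```

Exactly half of the 24 permutations of a 4-element set turn out to be even, as one would expect.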
7.10.11 Remark: The parity of a permutation is also called the sign or the index of the permutation. Parity is multiplicative with respect to function composition, as stated in Theorem 7.10.12 (i). The Levi-Civita symbol in Definition 7.10.20 is essentially the same thing as the parity function.
7.10.12 Theorem:
(i) If $f$ and $g$ are permutations of a set $X$, then $\mathrm{parity}(f \circ g) = \mathrm{parity}(f)\,\mathrm{parity}(g)$.
(ii) $\mathrm{parity}(f) = -1$ for any transposition $f$ of any set $X$.
(iii) A permutation $f$ of a set $X$ is even if and only if $f$ is equal to the composition of an even number of transpositions of $X$.
7.10.13 Definition: The factorial function is the map $f : \mathbb{Z}_0^+ \to \mathbb{Z}_0^+$ defined inductively by $f(0) = 1$ and $\forall n \in \mathbb{Z}^+,\ f(n) = n f(n - 1)$.
7.10.14 Notation: $n!$ denotes the value of the factorial function for argument $n$.
7.10.15 Remark: The factorial function may be expressed as $n! = \prod_{i=1}^n i$.
7.10.16 Definition: The Jordan factorial function is the map $f : \mathbb{Z}_0^+ \times \mathbb{Z}_0^+ \to \mathbb{Z}_0^+$ defined inductively by $\forall n \in \mathbb{Z}_0^+,\ f(n, 0) = 1$ and $\forall n, k \in \mathbb{Z}_0^+,\ f(n, k + 1) = (n - k) f(n, k)$.
7.10.17 Notation: $(n)_k$ denotes the value of the Jordan factorial function for argument $(n, k)$.
7.10.18 Remark: The Jordan factorial function may be expressed as $(n)_k = \prod_{i=0}^{k-1} (n - i)$. When $n < k$, $(n)_k = 0$. When $n = k$, $(n)_k = n!$. When $n \ge k$, $(n)_k = n!/(n - k)!$. The Jordan factorial function is also defined (and very useful) for a general real or complex argument $n$. Notation 7.10.17 is not very safe in differential geometry, which is infested with parentheses and subscripts.
7.10.19 Remark: The number of permutations of a set with $n$ elements is $n!$ for $n \in \mathbb{Z}_0^+$.
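The inductive definitions of the factorial and Jordan factorial functions translate directly into recursive code. The Python sketch below follows Definitions 7.10.13 and 7.10.16 literally and then checks the identities of Remarks 7.10.18 and 7.10.19 for small arguments. The function names are hypothetical, and the final check, that $(n)_k$ counts the ordered selections of $k$ distinct elements from an $n$-element set, is a standard fact which is assumed here rather than taken from the text.

```python
from itertools import permutations


def factorial(n: int) -> int:
    """f(0) = 1 and f(n) = n * f(n - 1) for n >= 1."""
    return 1 if n == 0 else n * factorial(n - 1)


def jordan_factorial(n: int, k: int) -> int:
    """f(n, 0) = 1 and f(n, k + 1) = (n - k) * f(n, k)."""
    return 1 if k == 0 else (n - (k - 1)) * jordan_factorial(n, k - 1)


for n in range(6):
    # Remark 7.10.19: an n-element set has n! permutations.
    assert len(list(permutations(range(n)))) == factorial(n)
    for k in range(8):
        if n < k:
            assert jordan_factorial(n, k) == 0
        else:
            # Integer division is exact here because (n - k)! divides n!.
            assert jordan_factorial(n, k) == factorial(n) // factorial(n - k)
        # Assumed standard fact: (n)_k counts ordered selections of k elements.
        assert len(list(permutations(range(n), k))) == jordan_factorial(n, k)
```

Once $k$ exceeds $n$, the factor $n - k$ in the recursion passes through zero, which is why all later values vanish.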
7.10.20 Definition: The Levi-Civita (alternating) symbol or Levi-Civita tensor (with n indices) is the function $\epsilon : (\mathbb{N}_n \to \mathbb{N}_n) \to \{-1, 0, 1\}$, for some $n \in \mathbb{Z}_0^+$, defined by
$\epsilon(f) = \begin{cases} -1 & \text{if } f \text{ is an odd permutation of } \mathbb{N}_n \\ 0 & \text{if } f \text{ is not a permutation of } \mathbb{N}_n \\ +1 & \text{if } f \text{ is an even permutation of } \mathbb{N}_n, \end{cases}$
for all $f : \mathbb{N}_n \to \mathbb{N}_n$.
7.10.21 Notation: $\epsilon_{i_1, \ldots i_n}$ for $n \in \mathbb{Z}_0^+$ denotes the value $\epsilon(i)$ of the Levi-Civita symbol for $i = (i_k)_{k=1}^n$. $\epsilon^{i_1, \ldots i_n}$ is an alternative notation for $\epsilon_{i_1, \ldots i_n}$.
7.10.22 Remark: See Definition 7.10.3 for permutations. The Levi-Civita symbol is essentially the same thing as the parity function in Definition 7.10.9. Only the notation is different.
7.11. Combinations and ordered selections
7.11.1 Remark: An ordered selection is also known as an “ordered sample”, a “k-permutation” or simply a “permutation”. This is explained in Remark 7.10.2. A combination is called a “k-combination” by EDM2 [34], article 330, but it could also be referred to as an “unordered selection” or “unordered sample”. Although these concepts are important in probability theory, they are even more important in analysis, particularly as coefficients of Taylor series.
7.11.2 Definition: The combination symbol is the function $C : \mathbb{Z}_0^+ \times \mathbb{Z}_0^+ \to \mathbb{Z}_0^+$ defined by
$\forall n, r \in \mathbb{Z}_0^+,\quad C(n, r) = \#\{x \in \mathbb{P}(n);\ \#(x) = r\}$.
7.11.3 Notation: $C^n_r$ for $n, r \in \mathbb{Z}_0^+$ denotes the value of the combination symbol $C(n, r)$.
7.11.4 Remark: Definition 7.11.2 means that $C^n_r$ is the number of distinct r-element subsets of an n-element set. The notation $\binom{n}{r}$ is a popular alternative to $C^n_r$. However, when combination symbols are mixed with other areas of mathematics, the notation $\binom{n}{r}$ tends to be ambiguous.
7.11.5 Theorem: $\forall n, r \in \mathbb{Z}_0^+,\quad C^n_r = \prod_{i=0}^{r-1} \frac{n - i}{1 + i}$.
7.11.6 Remark: When the combination symbol values are arranged in a triangle, this is called “Pascal’s triangle”, although Pascal called it an “arithmetic triangle” or “triangle arithmétique”. Pascal’s triangle has so many interesting properties that whole books have been written about it. In particular, Blaise Pascal wrote a small treatise “Traité du triangle arithmétique” [129] in 1654, published posthumously in 1665. He wrote the following preface to the chapter on applications.
DIVERS VSAGES DV TRIANGLE ARITHMETIQVE. Dont le generateur est l’Vnité
Apres auoir donné les proportions qui se rencontrent entre les cellules & les rangs des Triangles Arithmetiques, ie passe à diuers vsages de ceux dont le generateur est l’vnité; c’est ce qu’on verra dans les traictez suiuans. Mais i’en laisse bien plus que ie n’en donne; c’est vne chose estrange combien il est fertile en proprietez, chacun peut s’y exercer; I’auertis seulement icy, que dans toute la suite, ie n’entends parler que des Triangles Arithmetiques, dont le generateur est l’vnité.
This may be translated into English as follows.
Various uses of the arithmetic triangle with unit generator
After having given the proportions which are encountered between the cells and the rows of arithmetic triangles, I pass to various uses of those of which the generator is unity; that is what one will see in the following tracts. But I have left out many more of them than I have given; it is a strange thing how fertile in properties it is, everyone may exercise himself on it; I mention here only that in all of the following, I intend only to talk of arithmetic triangles of which the generator is unity.
The comment about fertile properties is more often translated as: “It is extraordinary how fertile in properties this triangle is. Everyone can try his hand.” The conclusion one may draw from this is that it is pointless to try to make a comprehensive list of properties of the combination symbol (i.e. Pascal’s triangle).
Pascal’s stipulation that “the generator is unity” is equivalent to part (i) of Theorem 7.11.9.
7.11.7 Remark: Struik [193], page 74, notes that Pascal’s triangle was published by Chinese mathematicians Yang Hui (about 1238–1298 AD) and Zhu Shijie (about 1260–1320 AD) during the Sung dynasty (960–1279 AD). Yang Hui presents us with the earliest extant representation of the Pascal triangle, which we find again in a book c. 1303 written by Zhu Shijie (Chu Shi-chieh). However, it seems these Chinese authors merely wrote out six or eight rows of the triangle without systematically investigating its properties as Pascal did.
7.11.8 Remark: To establish that the “cells” of Pascal’s triangle are equal to the combination symbol in Definition 7.11.2, it is necessary to show that the combination symbol satisfies the induction rule in Theorem 7.11.9.
7.11.9 Theorem:
(i) $\forall r \in \mathbb{Z},\ C^0_r = \delta_{r0}$.
(ii) $\forall n \in \mathbb{Z}_0^+,\ \forall r \in \mathbb{Z},\ C^{n+1}_r = C^n_{r-1} + C^n_r$.
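The three descriptions of the combination symbol in this section (subset counting in Definition 7.11.2, the product formula in Theorem 7.11.5, and the Pascal induction rule in Theorem 7.11.9) can be compared numerically. The Python sketch below is an illustration with hypothetical function names `C_count` and `C_product`. It uses exact rational arithmetic for the product, and it checks the induction rule only for $r \ge 1$, because the subset-counting function used here is not defined for negative $r$.

```python
from fractions import Fraction
from itertools import combinations


def C_count(n: int, r: int) -> int:
    """Number of r-element subsets of the n-element set {0, ..., n-1}."""
    return sum(1 for _ in combinations(range(n), r))


def C_product(n: int, r: int) -> int:
    """Product of (n - i)/(1 + i) for i = 0, ..., r-1, computed exactly."""
    result = Fraction(1)
    for i in range(r):
        result *= Fraction(n - i, 1 + i)
    return int(result)


for n in range(8):
    for r in range(10):
        # Theorem 7.11.5: the product formula agrees with subset counting.
        assert C_count(n, r) == C_product(n, r)

# Theorem 7.11.9 (i): row 0 of the triangle is the Kronecker delta in r.
assert all(C_count(0, r) == (1 if r == 0 else 0) for r in range(5))

# Theorem 7.11.9 (ii): the Pascal induction rule, checked for r >= 1.
assert all(C_count(n + 1, r) == C_count(n, r - 1) + C_count(n, r)
           for n in range(7) for r in range(1, 9))
```

When $r > n$, one factor of the product is zero, which matches the fact that an n-element set has no subsets with more than n elements.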
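7.11.10 Notation:
$I^n_r$ for $n, r \in \mathbb{Z}_0^+$ denotes the set $\{f : \mathbb{N}_r \to \mathbb{N}_n;\ \forall j, k \in \mathbb{N}_r,\ j < k \Rightarrow f(j) < f(k)\}$.
$J^n_r$ for $n, r \in \mathbb{Z}_0^+$ denotes the set $\{f : \mathbb{N}_r \to \mathbb{N}_n;\ \forall j, k \in \mathbb{N}_r,\ j \le k \Rightarrow f(j) \le f(k)\}$.
7.11.11 Theorem:
(i) $\forall n, r \in \mathbb{Z}_0^+,\ \#(I^n_r) = C^n_r$.
(ii) $\forall n, r \in \mathbb{Z}_0^+,\ \#(J^n_r) = C^{n+r-1}_r$.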
Proof: [ A proof of the symmetric case (ii) is given by Feller [106], section II.5, page 38. (The antisymmetric case is there too! But that’s much simpler.) ]
7.11.12 Remark: The notations $I^n_r$ and $J^n_r$ in Notation 7.11.10 are non-standard. Federer [105], 1.3.2, p.15, gives the notation $\Lambda(n, r)$ for $I^n_r$. Sets of increasing and non-decreasing sequences of indices $(i_1, \ldots i_r)$ selected from a complete set $(1, \ldots n)$ are frequently encountered in tensor algebra. One often wishes to sum an expression for index sequence values which have a unique representative for each subset of distinct index values. It is perhaps noteworthy that Theorem 7.11.11 is equivalent to the following assertions.
$\forall n, r \in \mathbb{Z}_0^+,\quad \#(I^n_r) = \prod_{i=0}^{r-1} \frac{n - i}{1 + i}$.
$\forall n, r \in \mathbb{Z}_0^+,\quad \#(J^n_r) = \prod_{i=0}^{r-1} \frac{n + i}{1 + i}$.
These are more similar to each other than suggested by parts (i) and (ii) of Theorem 7.11.11.
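The counts in Theorem 7.11.11 can be confirmed by brute-force enumeration for small $n$ and $r$. In the Python sketch below, a map $f : \mathbb{N}_r \to \mathbb{N}_n$ is represented by the tuple $(f(1), \ldots f(r))$ with entries in $\{1, \ldots n\}$, and the standard library function `math.comb` (Python 3.8 or later) is assumed to agree with the combination symbol $C^n_r$. The helper name `monotone_maps` is hypothetical. The enumeration starts at $n = 1$ so that the upper index $n + r - 1$ is never negative.

```python
from itertools import product
from math import comb


def monotone_maps(n: int, r: int, strict: bool) -> list:
    """All tuples (f(1), ..., f(r)) with values in {1, ..., n} which are
    strictly increasing (strict=True) or non-decreasing (strict=False)."""
    def is_monotone(f: tuple) -> bool:
        # Comparing neighbouring entries suffices for monotonicity.
        if strict:
            return all(f[j] < f[j + 1] for j in range(r - 1))
        return all(f[j] <= f[j + 1] for j in range(r - 1))

    return [f for f in product(range(1, n + 1), repeat=r) if is_monotone(f)]


for n in range(1, 6):
    for r in range(5):
        # Theorem 7.11.11 (i): strictly increasing maps.
        assert len(monotone_maps(n, r, strict=True)) == comb(n, r)
        # Theorem 7.11.11 (ii): non-decreasing maps.
        assert len(monotone_maps(n, r, strict=False)) == comb(n + r - 1, r)
```

Each strictly increasing tuple corresponds to exactly one r-element subset of $\{1, \ldots n\}$, which is why part (i) reproduces the combination symbol.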
7.11.13 Notation: $f_i$ for a function $f : \mathbb{N}_n \to A$ and $i \in I^n_m$ for any set $A$ means the sequence $f \circ i = (f(i(j)))_{j=1}^m = (f_{i(j)})_{j=1}^m = (f_{i_1}, \ldots f_{i_m}) : \mathbb{N}_m \to A$.
7.11.14 Notation: $f^i$ for a function $f : \mathbb{N}_n \to A$ and $i \in I^n_m$ for any set $A$ means the sequence $f \circ i = (f(i(j)))_{j=1}^m = (f^{i(j)})_{j=1}^m = (f^{i_1}, \ldots f^{i_m}) : \mathbb{N}_m \to A$.
7.11.15 Remark: It may seem that Notations 7.11.13 and 7.11.14 are contradictory. Notation 7.11.14 often arises in contexts where the Einstein index convention (Remark 13.8.18) is used. According to this convention, raised indices are a mnemonic which indicate that the sequence is contravariant in some sense (whereas lowered indices are a mnemonic for covariance). Notation 7.11.14 simply preserves the choice of subscript or superscript. The set $A$ in Notations 7.11.13 and 7.11.14 is typically the field of numbers which is used for coordinates and coefficients of vectors or a set of vectors in a linear space.
[ Here insert ordered selections. See for example EDM2 [34], article 330. ]
7.12. List spaces for general sets
7.12.1 Remark: Lists are finite sequences of arbitrary length. A list of items in a set $X$ is an element of the set union $\bigcup_{i=0}^\infty X^i$. The set $X^i$ may be taken literally as the set of functions $f : i = \{0, 1, \ldots i - 1\} \to X$ or the set of functions $f : \mathbb{N}_i = \{1, 2, \ldots i\} \to X$. The initial index 0 is used in this section for maximum convenience.
Although lists are familiar in computer software and everyday life, they are rarely formalized in mathematics texts. The items in a list may be referred to also as components, elements or members. One application for lists is in the definition of the exterior derivative of a differential form. (See Definition 20.6.2.) List spaces are also useful in homology theory.
7.12. List spaces for general sets
247
[ Consider the space List(X) to be embedded in X^∞ ? ]
[ Should indicate for all list operations what the length of the resulting list is. ]
[ Many of the following list operators are well-defined for general functions. Make a separate definition for such any-function operators. ]
7.12.2 Definition: For any set X, the list space List(X) on X is the set
    List(X) = ⋃_{i ∈ ℤ^+_0} X^i = ⋃_{i=0}^∞ X^i,
together with the following operations:
(i) The length function length : List(X) → ℤ^+_0 defined by length : x ↦ #(Dom(x)).
(ii) The canonical injection i : X → List(X) defined by i : x ↦ (x), where (x) is the one-element list containing the single element x.
(iii) The concatenation function concat : List(X) × List(X) → List(X), defined by
    ∀x, y ∈ List(X), concat(x, y) ∈ X^{length(x)+length(y)} and
    ∀i ∈ {0, . . . length(x) + length(y) − 1},
    concat(x, y)_i = x_i                if i < length(x)
                     y_{i−length(x)}    if i ≥ length(x).
(iv) The “restriction to n items” function. . .
(v) The “omit item j” function omit_j : List(X) → List(X) defined by:
    omit_j(ℓ)(i) = ℓ_i        if i < j
                   ℓ_{i+1}    if j ≤ i < length(ℓ) − 1.
That is, omit_j(ℓ) = (ℓ_0, . . . , ℓ_{j−1}, ℓ_{j+1}, . . . , ℓ_{length(ℓ)−1}). Note that omit_j acts as the identity on lists which have length less than or equal to j. Otherwise, omit_j maps X^k to X^{k−1}, where k = length(ℓ).
(vi) The “omit items j and k” function omit_{j,k} : List(X) → List(X) is defined for j, k ∈ ℤ^+_0 such that j ≠ k by
    omit_{j,k}(ℓ)(i) = ℓ_i        if i < min(j, k)
                       ℓ_{i+1}    if min(j, k) ≤ i < max(j, k) − 1
                       ℓ_{i+2}    if max(j, k) − 1 ≤ i < length(ℓ) − 2.
Note that omit_{j,k} acts as the identity on lists which have length less than or equal to min(j, k).
(vii) The “swap items j and k” operator swap_{j,k} : List(X) → List(X) is defined for j, k ∈ ℤ^+_0 by
    swap_{j,k}(ℓ)(i) = ℓ_k    if i = j and j ∈ Dom(ℓ)
                       ℓ_j    if i = k and k ∈ Dom(ℓ)
                       ℓ_i    if i ∉ {j, k}.
The operator swap_{j,k} equals the identity function if j = k. (This operator is well-defined for any function, not just for lists.)
(viii) The “substitute x at position j” operator subs_{j,x} : List(X) → List(X) defined by:
    subs_{j,x}(ℓ)(i) = x      if i = j
                       ℓ_i    if i ≠ j.
The operator subs_{j,x} equals the identity on lists ℓ for which j ∉ Dom(ℓ). (This operator is well-defined for any function, not just for lists.)
(ix) The “insert x at position j” function insert_{j,x} : List(X) → List(X) defined by:
    insert_{j,x}(ℓ)(i) = ℓ_i        if i < j
                         x          if i = j
                         ℓ_{i−1}    if j < i < length(ℓ) + 1.
That is, insert_{j,x}(ℓ) = (ℓ_0, . . . , ℓ_{j−1}, x, ℓ_j, . . . , ℓ_{length(ℓ)−1}).
The operator insert_{j,x} acts as the identity on lists which have length less than or equal to j.
(x) The “subsequence of length n starting at k” function subseq_{k,n} : List(X) → List(X) is defined by
    subseq_{k,n}(ℓ)(i) = ℓ_{k+i}    for 0 ≤ i < min(n, length(ℓ) − k)
                         0          otherwise.
This function maps X^j to X^m, where m = min(n, j − k).
7.12.3 Remark: It is often convenient to use the initial index 1 instead of 0 in Definitions 7.12.2, 9.12.1 and 9.12.2. Then all operations are the same except that the indices are shifted. It may be useful to define list spaces with a variable starting index or even general indices instead of integer indices.
7.12.4 Remark: It is often useful to allow a mixture of finite and infinite sequences as in Definition 7.12.5.
[ Should define list operations for Definition 7.12.5. ]
7.12.5 Definition: For any set X, the extended list space \overline{List}(X) on X is the set
    \overline{List}(X) = X^ω ∪ ⋃_{i=0}^∞ X^i = ⋃{X^i; i ∈ \overline{ℤ}^+_0} = ⋃{X^i; i ∈ ω^+} = List(X) ∪ X^ω.
7.12.6 Remark: The notation \overline{List}(X) for an extended list space is supposed to suggest an analogy with \overline{ℤ}^+_0 = ℤ^+_0 ∪ {∞} and other such extended number sets. (See Notation 7.6.4 for \overline{ℤ}^+_0.) A suitable alternative would be List^∗(X), which suggests completion of a topological space, as for example \overline{IR}^+_0 = IR^+_0 ∪ {∞}.
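Several of the list operations in Definition 7.12.2 can be sketched in software, with 0-based Python tuples standing in for elements of List(X). This is only an illustration under stated assumptions: the function names and argument order are hypothetical, indices are assumed non-negative, and out-of-range positions are treated as the identity, which is one reading of the notes in the definition.

```python
def concat(x, y):
    """Concatenation: the result has length len(x) + len(y)."""
    return tuple(x) + tuple(y)


def omit(l, j):
    """Omit item j; acts as the identity when len(l) <= j."""
    return l[:j] + l[j + 1:]


def omit2(l, j, k):
    """Omit items j and k (j != k); identity when len(l) <= min(j, k)."""
    return tuple(v for i, v in enumerate(l) if i not in (j, k))


def swap(l, j, k):
    """Swap items j and k. Assumption: identity unless both positions exist."""
    if j >= len(l) or k >= len(l):
        return l
    out = list(l)
    out[j], out[k] = out[k], out[j]
    return tuple(out)


def subs(l, j, x):
    """Substitute x at position j; identity when j is not in the domain."""
    return l[:j] + (x,) + l[j + 1:] if j < len(l) else l


def insert(l, j, x):
    """Insert x at position j; identity when len(l) <= j, as the definition states."""
    return l[:j] + (x,) + l[j:] if j < len(l) else l


def subseq(l, k, n):
    """Subsequence of at most n items starting at position k."""
    return l[k:k + n]


if __name__ == "__main__":
    items = ("a", "b", "c", "d")
    print(omit(items, 1), omit2(items, 1, 2), insert(items, 1, "x"))
```

Tuples are used rather than Python lists so that each operation returns a new value, which matches the functional style of the definition.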
7.13. Reformulation of logic in terms of axiomatic mathematics
As mentioned in Remarks 3.13.4 and 3.13.8, in order to avoid circularity in definitions, it is necessary to first define symbolic logic in terms of naive mathematics, and then later return to make a reformulation of mathematical logic in terms of the axiomatic mathematics which has been built on the logical foundations. Since mathematical logic seems to require only sets, order, numbers, relations, functions and other such elementary concepts, it seems that this section of the book is a suitable location for a systematic reformulation of mathematical logic in terms of formal set theory and numbers.
Chapter 8 Rational and real numbers
8.1 Rational numbers
8.2 Extended rational numbers
8.3 Real numbers
8.4 Extended real numbers
8.5 Real number tuples
8.6 Some useful basic real-valued functions
8.7 Complex numbers
8.1. Rational numbers
[ Give sets of axioms for the rational numbers here. One set in terms of the integers ℤ. The other set specifying the rationals from scratch. ]
8.1.1 Definition: The set of rational numbers is the set {[(n, d)]; n ∈ ℤ, d ∈ ℤ \ {0}}, where [(n, d)] denotes the equivalence class {(n′, d′) ∈ ℤ^2; d′ ≠ 0 and nd′ = n′d} for all pairs (n, d) ∈ ℤ × (ℤ \ {0}).
8.1.2 Notation: ℚ denotes the set of rational numbers.
8.1.3 Remark: Definition 8.1.1 is a popular style of representation for the set of rational numbers in terms of equivalence classes of pairs of integers with equal ratios. This is illustrated in Figure 8.1.1. (This diagram is reminiscent of some constructions in projective geometry.)
Figure 8.1.1 Construction of rational numbers from integers (grid of integer pairs with axes n and d, labelled by ratios n/d)
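The equivalence relation in Definition 8.1.1 can be tested without any division, since (n, d) and (n′, d′) represent the same rational number exactly when nd′ = n′d. The sketch below is illustrative only: the function names are hypothetical, and the choice of the lowest-terms pair with positive denominator as the representative of each class is an assumption, not something the definition requires.

```python
from fractions import Fraction
from math import gcd


def equivalent(p, q):
    """True if the integer pairs p = (n, d) and q = (n2, d2) have equal ratios."""
    (n, d), (n2, d2) = p, q
    if d == 0 or d2 == 0:
        raise ValueError("denominators must be non-zero")
    return n * d2 == n2 * d


def canonical(p):
    """Representative of the class of (n, d): lowest terms, positive denominator."""
    n, d = p
    if d == 0:
        raise ValueError("denominator must be non-zero")
    g = gcd(n, d)          # positive because d != 0
    if d < 0:
        g = -g             # dividing by a negative g makes the denominator positive
    return (n // g, d // g)


if __name__ == "__main__":
    print(equivalent((1, 2), (3, 6)), canonical((3, -6)), canonical((6, 4)))
    # The standard library class Fraction normalises pairs in the same way.
    print(Fraction(3, -6), Fraction(6, 4))
```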
[ Present floating-point representations of rational numbers, such as 2 × (2 × 2^M) × 2^N, where the first component is the sign bit, the second component is the sign of the exponent, the third component is the absolute value of the exponent (or something like that, e.g. 2’s complement representation), and the last component is the “fractional part” of the number. ]
[ Present a list of reasonable representations for rational numbers similar to the list in Remark 8.3.3. ]
[ Define a canonical embedding of the integers ℤ inside ℚ, and also all of the order, arithmetic and metric structures which ℚ has. Also the subtraction, reciprocal and division operations, and the lowest common denominator, the norm/modulus, and normalized “mixed fractions”. Also define decimal and binary expansions, and floating point representations. ]
[ Define intervals and interval notations [q_1, q_2] etc. Present the use of the ∞ symbol to represent unbounded intervals. ]
8.1.4 Notation:
ℚ^+ denotes the set of positive rational numbers {q ∈ ℚ; q > 0}.
ℚ^+_0 denotes the set of non-negative rational numbers {q ∈ ℚ; q ≥ 0}.
ℚ^− denotes the set of negative rational numbers {q ∈ ℚ; q < 0}.
ℚ^−_0 denotes the set of non-positive rational numbers {q ∈ ℚ; q ≤ 0}.
[ Define the arithmetic of intervals and general subsets of ℚ. For example [a, b] + [c, d] and q.[a, b]. ]
[ Do a version of Theorem 8.3.11 for rational numbers. Define inf and sup for subsets of ℚ. ]
[ Present arithmetics here for computer representations of rational numbers. In particular, define floating point representations and arithmetic. ]
[ Define Dedekind cuts and Cauchy sequences for ℚ and show non-closure. ]
[ Rational number tuples should be presented somewhere, maybe in the matrix algebra chapter, Chapter 11. Show closure under linear transformations with rational number matrix coefficients. Although matrix multiplication belongs in the matrix chapter, tuples and non-algebraic operations on tuples probably should be presented near here. ]
[ Bresenham’s line algorithm is used for the rasterisation of lines with rational parameters, using an integer grid. Perhaps such algorithms should be discussed near here. They are used in practice for the implementation of rational and real numbers on real-world systems. ]
8.2. Extended rational numbers
[ Give an axiomatization or two for the extended rational numbers. ]
8.2.1 Definition: The set of extended rational numbers is the set ℚ ∪ {−∞, ∞}.
8.2.2 Notation: \overline{ℚ} denotes the set of extended rational numbers ℚ ∪ {−∞, ∞}.
[ Define arithmetic, order, metric, etc. on \overline{ℚ}. ]
8.2.3 Remark: The “infinite rational numbers” −∞ and ∞ are not easy to represent. It is possible to artificially define −∞ = {(n, 0); n ∈ ℤ^−} and ∞ = {(n, 0); n ∈ ℤ^+} by analogy with the ordered pair equivalence classes in Definition 8.1.1. However, the arithmetic rules will not then be consistent with the finite rationals. Since the rules for order and arithmetic must be specified separately for the infinite rationals anyway, one may as well leave their representation unspecified. The choice of representation is of purely academic interest!
8.2.4 Notation:
\overline{ℚ}^+ denotes the set of positive extended rational numbers {q ∈ \overline{ℚ}; q > 0}.
\overline{ℚ}^+_0 denotes the set of non-negative extended rational numbers {q ∈ \overline{ℚ}; q ≥ 0}.
\overline{ℚ}^− denotes the set of negative extended rational numbers {q ∈ \overline{ℚ}; q < 0}.
\overline{ℚ}^−_0 denotes the set of non-positive extended rational numbers {q ∈ \overline{ℚ}; q ≤ 0}.
[ Give a comparative table of (extended) rational number notations for various authors, including the sets of positive and negative rational numbers. ]
[ Define general extended real number intervals. Define arithmetic for extended real number intervals. ]
8.3. Real numbers
[ Refer to some numerical analysis concepts near here. Computers use rational numbers which approximate real numbers. ]
8.3.1 Remark: The “floating-point numbers” which are implemented in digital computer hardware represent rational numbers, not real numbers, although the term “real numbers” is often used incorrectly for computer numbers. However, rational numbers are often used in practice to represent intervals of real numbers. For example, a decimal floating-point number like 3.141592654 is understood to indicate an interval such as [3.1415926535, 3.1415926545]. Whether computer hardware number implementations are interpreted as rational numbers or intervals of real numbers, their operations are generally only approximate at best. The purpose of mathematical definitions of real numbers is to remove the ambiguity and imprecision of the informal everyday usage. The idea of number intervals as an interpretation for the common usage of numbers provides a natural basis for a rigorous definition of real numbers.
8.3.2 Remark: The real numbers sometimes seem to have some surprising properties, and one may wonder whether they are an accurate model for the universe. However, one must always remember that mathematical systems only give conclusions if the assumptions are true. In the case of real numbers, it is highly unlikely that any physical system would satisfy the axioms. It is dubious that even the rational numbers could be a correct model for any physical system. Certainly measurements in physics do not provide unbounded significant figures of accuracy. All physical measurements have a limited resolution.
Although physical phenomena (the appearances of things) do not offer unbounded resolution, physical noumena (the underlying true natures of things) could possibly offer unbounded or infinite resolution in some systems. But there is no way of knowing the true natures of things. When real numbers are used to model noumena, one can only say that the finite-resolution approximations which are provided by phenomena seem to match very well with many of the models in physics. If the underlying things behind phenomena are not in fact modelled correctly by the real numbers, this does not really matter. Physics never says what a thing is, only how it appears.
The real number system only tells us the properties of a modelled system if it precisely matches the specification. So if the properties of real numbers seem a little odd, and do not match exactly our experience of the phenomena of the universe, it doesn’t matter because we will never be able to check. Therefore the real number system should not be taken too seriously. It’s only a model.
[ Insert here a set of axioms for the real numbers. Maybe give one axiomatization from scratch, and one based on ℚ. Then give the Cantor representation of real numbers as equivalence classes of Cauchy sequences. Probably some work is required to show that the definition satisfies the axioms. See EDM2 [34], section 294.E for both Dedekind and Cantor representations of real numbers. See Spivak [143], pages 487–512, for axiomatic definition of complete ordered fields (i.e. the real numbers), the Dedekind and Cantor constructions, and proof of uniqueness of real numbers up to isomorphism. ]
8.3.3 Remark: Representations of the real numbers include the following.
(1) Dedekind cuts.
(2) Cauchy sequences.
(3) ℤ × 2^ω. The first component represents the floor of the real number. (See Definition 8.6.8.) The second component represents the fractional part of the number. (See Definition 8.6.13.)
(4) 2 × ω × 2^ω. The first component represents the sign of the real number. (See Definition 8.6.2.) The second component represents the floor of the absolute value of the number. (See Definition 8.6.1.) The third component represents the fractional part of the absolute value of the number. (See Definition 8.6.13.)
(5) 2 × ℤ × 2^ω. The first component represents the sign of the real number. The second component represents the shift of the binary “decimal point” of the absolute value of the number. The third component represents the fractional part of the absolute value of the number.
(6) ℤ × {x ∈ 2^ω; #{i ∈ ω; x(i) = 0} = ∞}. The first component represents the floor of the real number. (See Definition 8.6.8.) The second component represents the fractional part of the number. By contrast with option (3), there is no need to apply an equivalence class to this representation.
(7) {u ∈ 2^ω; #{i ∈ ω; u(i) = 1} < ∞} × {x ∈ 2^ω; #{i ∈ ω; x(i) = 0} = ∞}. This is the same as option (6) except that the first component (the floor of the real number) is represented as a finite sequence of binary ones.
Other representations may be based, for example, on the usual floating point representations for rational numbers, or could replace the base 2 with other integers which are greater than 1.
[ Define some or all of the real-number representations in Remark 8.3.3. Show the correspondences between them. Maybe show how arithmetic is done in some or all of these representations. ]
[ ℚ is a set of equivalence classes on ℤ × (ℤ \ {0}). Define IR as a set of equivalence classes on {f ∈ ℚ^ω; f is a Cauchy sequence}, or Dedekind cuts. Define IR^n as the set of functions from n to IR. Make all vectors have indices which start at 0. ]
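The observation in Remark 8.3.1 that hardware floating-point numbers are rational numbers, and that a decimal numeral is commonly read as an interval, can be made concrete with exact rational arithmetic. This is a sketch only: the function names are illustrative, and the half-unit-in-the-last-digit reading of a decimal numeral is an assumption which matches the example interval in that remark.

```python
from fractions import Fraction


def exact_rational(x):
    """Exact rational value stored in a binary floating-point number."""
    return Fraction(x)


def decimal_interval(text):
    """Interval of reals conventionally indicated by a decimal numeral.

    Assumption: the numeral stands for all reals within half a unit of its
    last decimal digit.
    """
    digits = len(text.split(".")[1]) if "." in text else 0
    centre = Fraction(text)
    half_unit = Fraction(1, 2 * 10**digits)
    return centre - half_unit, centre + half_unit


if __name__ == "__main__":
    stored = exact_rational(0.1)
    # The stored value is a dyadic rational, not exactly 1/10.
    print(stored, stored == Fraction(1, 10))
    print(decimal_interval("3.141592654"))
```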
8.3.4 Remark: The most popular representations for the real numbers in pure mathematics are the Dedekind representation and the Cantor representation. The former defines real numbers as semi-infinite intervals of the rational numbers called Dedekind cuts. The latter defines real numbers as equivalence classes of Cauchy sequences. The Dedekind representation exploits the total order on the rationals to “fill the gaps”. The Cantor representation exploits the distance function on the rational numbers to “fill the gaps”. The total order and the distance function on the rational numbers are not unrelated. However, the distance function approach is much more generally applicable. Any metric space may be completed (i.e. the “gaps” can be filled in) using the Cauchy sequence approach. On the other hand, the total order approach requires much less convoluted logic for its construction. In applied mathematics and the sciences, the most popular representations of the real numbers are as finite and infinite decimal or binary expansions. However, irrational numbers cannot be represented precisely by such expansions, since they require an infinite amount of information.
8.3.5 Definition: A Cauchy sequence of rational numbers is a sequence x : ω → ℚ such that
    ∀ε ∈ ℚ^+, ∃n ∈ ω, ∀i, j ≥ n, |x_i − x_j| < ε.
[ Definition 8.3.6 is only one representation of the real numbers. There should be a definition of “a system of real numbers” rather than “the system of real numbers”. This should be followed by several representations. Define the Dedekind, Cauchy sequence, decimal and binary representations. The decimal representation is perhaps closest to how people really think about the real numbers? Finite decimal expansions are what we typically get from measurements of the real world because that’s how measurement equipment is designed. Rational numbers are a consequence of dividing up units into equal portions. Then it is only reasonable to assume that the gaps between rationals are filled in somehow. The Cantor representation is closely related to the concept of “successive approximation”. ]
8.3.6 Definition: The set of real numbers is the set of equivalence classes of all Cauchy sequences of rational numbers, where two Cauchy sequences x = (x_i)_{i∈ω} and y = (y_i)_{i∈ω} are considered equivalent if the combined sequence (z_i)_{i∈ω} defined by z_{2i} = x_i and z_{2i+1} = y_i for all i ∈ ω is a Cauchy sequence.
8.3.7 Notation: IR denotes the set of real numbers.
[ Define order on IR. ]
8.3.8 Notation:
IR^+ denotes the set of positive real numbers {r ∈ IR; r > 0}.
IR^+_0 denotes the set of non-negative real numbers {r ∈ IR; r ≥ 0}.
IR^− denotes the set of negative real numbers {r ∈ IR; r < 0}.
IR^−_0 denotes the set of non-positive real numbers {r ∈ IR; r ≤ 0}.
8.3.9 Remark: Some authors use IR^+ for non-negative real numbers, but then there would be no obvious notation for positive reals. It would be tedious to have to write IR^+ \ {0} for the positive reals. Some authors use notations such as IR_+ and ℤ_+ instead of the superscript versions. The advantage of this would be that you could write IR^n_+ for the Cartesian product of n copies of IR_+. However, this would cause ambiguity for notations like IR^0_+ for the non-negative real numbers.
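The Cauchy sequence construction of Definitions 8.3.5 and 8.3.6 can be illustrated with two different rational sequences which both approximate the square root of 2. The sketch below is not a proof, since only finitely many terms can be inspected. The function names are illustrative, and the choice of Newton iteration and decimal truncation as the two example sequences is an assumption made for this example.

```python
from fractions import Fraction
from math import isqrt


def newton_sqrt2(count):
    """First `count` terms of the rational sequence x_{k+1} = (x_k + 2/x_k)/2, x_0 = 1."""
    x = Fraction(1)
    terms = []
    for _ in range(count):
        terms.append(x)
        x = (x + 2 / x) / 2
    return terms


def decimal_sqrt2(count):
    """First `count` decimal truncations of sqrt(2): 1, 1.4, 1.41, ..."""
    return [Fraction(isqrt(2 * 10 ** (2 * k)), 10**k) for k in range(count)]


def interleave(x, y):
    """Combined sequence z with z_{2i} = x_i and z_{2i+1} = y_i."""
    z = []
    for a, b in zip(x, y):
        z.extend([a, b])
    return z


def tail_spread(seq, n):
    """Largest |z_i - z_j| over the available terms with i, j >= n."""
    tail = seq[n:]
    return max(tail) - min(tail)


if __name__ == "__main__":
    z = interleave(newton_sqrt2(8), decimal_sqrt2(8))
    for n in (2, 6, 10):
        print(n, float(tail_spread(z, n)))
```

The spread of the tail of the interleaved sequence shrinks as n grows, which is the finite evidence one would expect if the two sequences are equivalent in the sense of Definition 8.3.6.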
8.3.10 Definition: An open interval of the real numbers is any set of the form (a, b) = {x ∈ IR; a < x < b}, (a, ∞) = {x ∈ IR; a < x}, (−∞, b) = {x ∈ IR; x < b} or IR, for some a, b ∈ IR with a ≤ b.
A closed interval of the real numbers is any set of the form [a, b] = {x ∈ IR; a ≤ x ≤ b}, [a, ∞) = {x ∈ IR; a ≤ x}, (−∞, b] = {x ∈ IR; x ≤ b} or IR, for some a, b ∈ IR with a ≤ b. A closed-open interval of the real numbers is any set of the form [a, b) = {x ∈ IR; a ≤ x < b} for some a, b ∈ IR with a ≤ b.
An open-closed interval of the real numbers is any set of the form (a, b] = {x ∈ IR; a < x ≤ b} for some a, b ∈ IR with a ≤ b.
A semi-closed interval or semi-open interval of the real numbers is any open-closed or closed-open interval of the real numbers. An interval of the real numbers is any open, closed or semi-closed interval of the real numbers.
The closed unit interval of the real numbers is the set [0, 1]. The open unit interval of the real numbers is the set (0, 1).
8.3.11 Theorem: A non-empty subset I of IR is an interval if and only if I ⊇ (inf I, sup I).
Proof: It is easy to show that all non-empty intervals satisfy I ⊇ (inf I, sup I). So suppose that I is a non-empty subset of IR which satisfies I ⊇ (inf I, sup I). (To be continued. . . )
8.3.12 Remark: In terms of the standard topology on the real numbers, a subset of IR is connected if and only if it is an interval. (See Definition 14.9.1 for the standard topology on the real numbers.) Note that an open interval may be the empty set. In this case, inf I = ∞ and sup I = −∞. So (inf I, sup I) = ∅.
[ Show that the real numbers are uncountable by using a diagonal construction from an ordering of a supposedly countable set of real numbers. Maybe give two or more proofs of the uncountability of the real numbers. Comment on the constructibility issues for this proof. ]
[ Even more than in the case of integers, the concept of “compressibility” of real numbers may be a useful concept. (See comment at end of Remark 2.11.1.) Whereas all integers have a finite representation, almost all real numbers have no finite representation, whether compressed or not. Therefore one could usefully introduce a “code book” concept to select a finite or countable subset of the real numbers with partial or full closure under arithmetic operations. Such a space of compressible numbers could provide a limited universe of real numbers for doing arithmetic for some purposes. Some simple examples of this are the algebraic real numbers and the sub-algebras of the real numbers generated by solutions of algebraic equations. ]
8.3.13 Remark: If space and time turn out to be discrete in some sense in some future fundamental theory of the universe, it may be necessary to abandon the real numbers. It would be interesting to try to develop various potential replacements for the real numbers in anticipation of future developments.
[ Near here define sup and inf for subsets of IR. Show existence of these for bounded sets etc. Is this the Heine-Borel theorem? How does this task differ for the Cantor and Dedekind definitions of real numbers? Since the supremum and infimum may be infinite, this requires the definition of positive and negative infinities before extended real numbers are defined. ]
8.4. Extended real numbers
[ Give an axiomatization or two for the extended real numbers. ]
8.4.1 Definition: The extended real number system is. . .
8.4.2 Notation: \overline{IR} denotes the set of extended real numbers IR ∪ {−∞, ∞}.
[ Give a comparative table of (extended) real number notations for various authors, including the sets of positive and negative real numbers. ]
8.4.3 Remark: The notation \overline{IR} for the extended real numbers is used by very few authors. (Federer [105], 2.1.1, page 51, does use this over-bar notation.) The principal advantage of the over-bar notation is the ability to add superscripts, which is difficult for the more popular asterisk superscript notation IR^∗. The asterisk superscript has the advantage of suggesting the usual notation for topological compactification of a set. (See for example Taylor [144], page 34.) However, the topological space asterisk usually denotes a one-point compactification, whereas \overline{IR} is a two-point compactification. So there is not much loss in abandoning the asterisk superscript.
[ Give a full set of arithmetic operations for the real numbers. ]
8.4.4 Remark: The “number” infinity, denoted ∞, may be thought of as the solution of the equation:
    x = x + 1.        (8.4.1)
This is not much more absurd than defining the imaginary number i as the solution of x^2 = −1. The equation x = x + 1 may be “solved” by iteration. Start with x = 1 and substitute this into the right hand side. This gives x = 2. Now substitute again to obtain x = 3. By continuing this an infinite number of times, the solution x converges to infinity! Another way to solve Equation (8.4.1) is to divide both sides by x, which gives 1 = (x + 1)/x, which means that the ratio of x + 1 to x equals 1, which is a more and more accurate approximation as x tends to infinity.
8.4.5 Notation:
\overline{IR}^+ denotes the set of positive extended real numbers {r ∈ \overline{IR}; r > 0}.
\overline{IR}^+_0 denotes the set of non-negative extended real numbers {r ∈ \overline{IR}; r ≥ 0}.
\overline{IR}^− denotes the set of negative extended real numbers {r ∈ \overline{IR}; r < 0}.
\overline{IR}^−_0 denotes the set of non-positive extended real numbers {r ∈ \overline{IR}; r ≤ 0}.
[ Define intervals of extended real numbers. Give notations. ]
8.5. Real number tuples
[ There is perhaps not much that can be said about rational and real number tuples without getting into matrix algebra. However, it is useful to give some brief notations and definitions here. There could be some “issues” arising from the superscripts on the positive/negative subsets. ]
8.5.1 Definition: IR^n denotes the set of real n-tuples for n ∈ ℤ^+_0.
8.5.2 Remark: See Notation 7.7.1 for the definition of X^n for general sets X. The conventional index set of all n-tuples in IR^n is the set ℕ_n = {1, 2, . . . , n}, although there are strong arguments in favour of the index set n = {0, 1, . . . n − 1}.
8.5.3 Definition: The m, n-concatenation operator for real (number) tuples for m, n ∈ ℤ^+_0 is the function Q_{m,n} : IR^m × IR^n → IR^{m+n} defined by
    Q_{m,n} : ((x_1, . . . x_m), (y_1, . . . y_n)) ↦ (x_1, . . . x_m, y_1, . . . y_n).
8.6. Some useful basic real-valued functions
[ See the document fund.tex for more basic function definitions. ]
8.6.1 Definition: The absolute value function | · | : IR → IR is defined by
    ∀x ∈ IR, |x| = x     if x ≥ 0
                   −x    if x ≤ 0.
8.6.2 Definition: The sign function of a real variable is defined by
    ∀x ∈ IR, sign(x) = 1     if x > 0
                       0     if x = 0
                       −1    if x < 0.
8.6.3 Remark: The sign function is also called the signum function (from the Latin word “signum”). Some authors use the notation sgn(x) for sign(x). The absolute value function is sometimes known as the “modulus”. The absolute value and sign functions are illustrated in Figure 8.6.1.
Figure 8.6.1 Absolute value and sign functions (panels |x| and sign(x), each plotted against x)
8.6.4 Theorem: ∀x ∈ IR, sign(x).|x| = x.
Proof: sign(x).|x| equals 1.x = x for x > 0, (−1).(−x) = x for x < 0, and 0.x = x for x = 0.
8.6.5 Remark: The Heaviside function H : IR → IR is variously defined with H(0) equal to 0, 1 or 1/2. In generalized function and transform contexts where it usually appears, the value of H(0) often has no significance. As a Fourier transform, it usually is best to set H(0) = 1/2, but for simplicity in calculations, H(0) equal to 0 or 1 is more convenient. It is often advantageous for functions to be right-continuous, which suggests that H(0) should equal 1. Generally the value is chosen to suit the application.
[ Give a comparative table of conventions for the Heaviside function’s notation and value at 0. ]
EDM2 [34] gives H(0) = 1 in section 125.E and appendix A, 12.II, but H(0) = 1/2 in section 306.B. Rudin [137], page 180, exercise 24 gives H(0) = 0. Treves [146], pages 26 and 240, leaves H(0) undefined. Definition 8.6.6 gives the Heaviside function in terms of the signum function, which forces H(0) = 1/2. Some authors give the notation ε(t) for the Heaviside function of t. (E.g. see CRC [155], page F-180.) The Heaviside function is also called the unit step function.
8.6.6 Definition: The Heaviside function H : IR → IR is defined by ∀x ∈ IR, H(x) = (1 + sign(x))/2.
8.6.7 Remark: As illustrated in Figure 8.6.2, the Heaviside function value H(0) is not the same as H(0)0^0. This could be regarded as a bad thing. Generally the preferred value for 0^0 would be 0. More serious, perhaps, is the fact that H(x)^p ≠ H(x) for x = 0 and integers p > 1. The equality H(x)^p = H(x) does hold if H(0) equals 0 or 1 (but not for any other values).
Figure 8.6.2 Heaviside function multiplied by monomial (panels H(x) ≢ H(x)x^0, H(x)x^1 and H(x)x^2, each plotted against x)
The value H(0) = 0 would have the advantage that H(x) = lim_{p→0+} x^p for all x ≥ 0. But the value H(0) = 1/2 has the advantage that H(x) = 1 − H(−x) for all x ∈ IR.
8.6.8 Definition: The floor function floor : IR → ℤ is defined by
    ∀x ∈ IR, floor(x) = ⌊x⌋ = sup{i ∈ ℤ; i ≤ x}.
8.6.9 Definition: The ceiling function ceiling : IR → ℤ is defined by
    ∀x ∈ IR, ceiling(x) = ⌈x⌉ = inf{i ∈ ℤ; i ≥ x}.
Figure 8.6.3: The floor and ceiling functions. [Plots of floor(x) and ceiling(x) against x.]
8.6.10 Theorem: ∀x ∈ ℝ, ceiling(x) = − floor(−x).
Proof: See Exercise 47.3.4.
8.6.11 Remark: Some authors use the notation ent(x) for floor(x). (E.g. see CRC [155], page F-180.)
8.6.12 Remark: Two useful functions which can be easily expressed in terms of the floor function are the “fractional part” and “round” functions.
8.6.13 Definition: The fractional part function frac : ℝ → [0, 1) is defined by ∀x ∈ ℝ, frac(x) = x − floor(x).
8.6.14 Definition: The round function round : ℝ → ℤ is defined by ∀x ∈ ℝ, round(x) = sign(x). floor(|x| + 1/2).
This may also be called the nearest integer function.
Figure 8.6.4: Fractional part and rounding functions. [Plots of round(x) and frac(x) against x.]
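The definitions of this section so far can be tried out directly. The following is a minimal sketch, not part of the original text, which assumes exact rational sample points (Python's `Fraction`) so that the identities can be tested with exact equality. The function names `heaviside`, `frac` and `round_away` are chosen here for illustration only.

```python
import math
from fractions import Fraction

def sign(x):
    """Signum function: -1, 0 or 1."""
    return (x > 0) - (x < 0)

def heaviside(x):
    """H(x) = (1 + sign(x))/2 as in Definition 8.6.6, so H(0) = 1/2."""
    return Fraction(1 + sign(x), 2)

def frac(x):
    """Fractional part x - floor(x) of Definition 8.6.13, a value in [0, 1)."""
    return x - math.floor(x)

def round_away(x):
    """round(x) = sign(x).floor(|x| + 1/2) as in Definition 8.6.14."""
    return sign(x) * math.floor(abs(x) + Fraction(1, 2))

samples = [Fraction(k, 4) for k in range(-12, 13)]  # -3, -11/4, ..., 3
for x in samples:
    assert sign(x) * abs(x) == x                # Theorem 8.6.4
    assert math.ceil(x) == -math.floor(-x)      # Theorem 8.6.10
    assert 0 <= frac(x) < 1
    assert heaviside(x) == 1 - heaviside(-x)    # relies on H(0) = 1/2

print(round_away(Fraction(5, 2)), round_away(Fraction(-5, 2)))  # 3 -3
print(heaviside(0) ** 2 == heaviside(0))                        # False
```

The last line illustrates the point of Remark 8.6.7 that H(x)^p = H(x) fails at x = 0 when H(0) = 1/2. Note that Python's built-in `round` uses a different tie-breaking rule (round half to even), which is why a separate `round_away` is written out here.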
8.6.15 Remark: There are several common variants of the rounding function in Definition 8.6.14. Probably the most popular version rounds to the nearest integer, with half-integers rounded “away from zero”, as defined here.
8.6.16 Definition: The modulo function mod : ℝ × (ℝ \ {0}) → ℝ_0^+ is defined by ∀x ∈ ℝ, ∀m ∈ ℝ \ {0}, x mod m = inf{y ∈ ℝ_0^+; ∃i ∈ ℤ, x = i.m + y}.
8.6.17 Remark: The modulo function satisfies
$$\begin{aligned}
\forall x \in \mathbb{R},\ \forall m \in \mathbb{R} \setminus \{0\},\quad
x \bmod m &= \inf\bigl(\{x + i.m;\ i \in \mathbb{Z}\} \cap \mathbb{R}_0^+\bigr) \\
&= \inf\{x - k|m|;\ k \in \mathbb{Z} \text{ and } x \ge k|m|\} \\
&= x - |m| \lfloor x/|m| \rfloor \\
&= |m| \operatorname{frac}(x/|m|).
\end{aligned}$$
Figure 8.6.5 illustrates the modulo function and a shifted modulo function x ↦ (x + a) mod 2a − a. Such functions are familiar in electronics as “relaxation oscillator” waveforms.
Figure 8.6.5: Modulo functions. [Plots of x mod a and (x + a) mod 2a − a against x.]
8.6.18 Theorem: The absolute value, floor and modulo functions satisfy the following:
(i) ∀a ∈ ℝ, ∀b ∈ ℝ \ {0}, a mod b = a mod (−b) = a mod |b|.
(ii) ∀a ∈ ℝ, ∀b ∈ ℝ \ {0}, 0 ≤ a mod b < |b|.
(iii) ∀a ∈ ℝ, ∀b ∈ ℝ \ {0}, a mod b = a − |b|. floor(a/|b|).
Proof: See Exercise 47.3.5.
8.6.19 Remark: The sawtooth functions shown in Figure 8.6.6 are useful for defining the sine and cosine functions in Section 20.13.
Figure 8.6.6: Sawtooth functions. [Plots of |(x + a) mod 2a − a| and |(x + 3a) mod 4a − 2a| − a against x.]
A particular sawtooth function is the “distance to nearest integer”, defined by inf{|x − k|; k ∈ ℤ}. The difference between x and the rounded value round(x) (which equals ceiling(x − 1/2) except when x is a positive half-integer) is
$$x - \operatorname{sign}(x).\operatorname{floor}(|x| + \tfrac12)
= \operatorname{sign}(x).\bigl(|x| - \operatorname{floor}(|x| + \tfrac12)\bigr)
= \operatorname{sign}(x).\bigl(\operatorname{frac}(|x| + \tfrac12) - \tfrac12\bigr).$$
Therefore the distance from a real number x to the nearest integer equals the absolute value of this, which is
$$\bigl|\operatorname{frac}(|x| + \tfrac12) - \tfrac12\bigr|
= \bigl|(|x| + \tfrac12) \bmod 1 - \tfrac12\bigr|
= \bigl|(x + a) \bmod 2a - a\bigr|$$
with a = 1/2, which is illustrated in Figure 8.6.6.
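The modulo function and the “distance to nearest integer” sawtooth can be checked against brute-force computations. The following sketch is not from the original text. It assumes exact rational arguments, and the names `mod` and `dist_to_nearest_integer` are illustrative.

```python
import math
from fractions import Fraction

def mod(x, m):
    """x mod m = x - |m| floor(x/|m|), a value in [0, |m|); requires m != 0."""
    m = abs(m)
    return x - m * math.floor(x / m)

def dist_to_nearest_integer(x):
    """Sawtooth |(x + a) mod 2a - a| with a = 1/2."""
    a = Fraction(1, 2)
    return abs(mod(x + a, 2 * a) - a)

xs = [Fraction(k, 8) for k in range(-40, 41)]   # -5, -39/8, ..., 5
for x in xs:
    for b in (Fraction(3), Fraction(-3), Fraction(5, 2)):
        assert 0 <= mod(x, b) < abs(b)          # Theorem 8.6.18 (ii)
        assert mod(x, b) == x % abs(b)          # agrees with % for a positive modulus
    # Brute-force distance to the nearest integer; candidates -6..6 cover [-5, 5].
    assert dist_to_nearest_integer(x) == min(abs(x - k) for k in range(-6, 7))

print(mod(Fraction(1), Fraction(-3)))   # 1
print(Fraction(1) % Fraction(-3))       # -2
```

The last two lines show a design difference worth keeping in mind: Python's `%` operator gives a result with the sign of the modulus, whereas the modulo function of Definition 8.6.16 is always non-negative.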
8.6.20 Remark: Figure 8.6.7 shows some basic square functions. The functions floor(x) and floor(x/a) − 2 floor(x/(2a)) are right-continuous. The functions ceiling(x) and 2 ceiling(x/(2a)) − ceiling(x/a) are left-continuous.
Figure 8.6.7: Square functions. [Plots of floor(x/a) − 2 floor(x/(2a)) and 2 ceiling(x/(2a)) − ceiling(x/a) against x.]
8.7. Complex numbers
[ Give an axiomatisation for the complex numbers. ]
8.7.1 Notation: ℂ denotes the set of complex numbers.
[ Must define a representation of the complex numbers here. ] [ Present complex number arithmetic, order(?), intervals(??), metric, norm, modulus, Cauchy sequences and embedding in ℝ. Also present complex number tuples? ]
8.7.2 Remark: Complex numbers are avoided as far as possible in this book. Life is already difficult enough without them.
8.7.3 Remark: There is a strong argument for the reality of complex numbers in the properties of Taylor series. For example, consider the function f : ℝ → ℝ defined by f : x ↦ (1 + x^2)^{−1}. This function is clearly analytic, but the Taylor series for f at a point x_0 ∈ ℝ has convergence radius (1 + x_0^2)^{1/2}, which is coincidentally the distance from the point (x_0, 0) to (0, 1) in the complex plane, and (0, 1) happens to be a pole of the analytic extension of f to the complex numbers. (See Figure 8.7.1.)
Figure 8.7.1: Convergence radii suggesting complex pole for real-analytic function. [Diagram of the radius of convergence of (1 + x^2)^{−1} at points x_1, x_2, x_3, x_4 on the real axis.]
This kind of rule holds generally for real-analytic functions, which suggests that even if the complex numbers are ignored, they still have their unavoidable effect on the real line. This is reminiscent of the way in which geophysicists detect minerals a long distance under the ground using gravimetric, electromagnetic and other detection methods. The existence and nature of minerals deep underground may be inferred from evidence obtained entirely above ground. In the same way, the poles of real-analytic functions “deep in the complex plane” may be detected by their influence on the radius of convergence of Taylor series for real-valued functions evaluated purely within the real line.
This kind of argument only shows the “reality” of the complex numbers if one accepts the “reality” of power series. But there is no reason in physics why power series should have a special significance, apart from the fact that addition and multiplication are programmed into calculators. It is very unusual for a physical system to have state functions which are truly real analytic except in mathematical models, because an analytic function is determined everywhere from any infinitesimal region. At the very microscopic level at which quantum theory is effective, probably the real functions are indeed real analytic; so one would expect complex numbers to be relevant. The usefulness in quantum theory of complex numbers is therefore not surprising.
There seems to be no argument for the “reality” of quaternions similar to the argument for the reality of complex numbers.
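The convergence radius claimed in Remark 8.7.3 for the Taylor series of (1 + x^2)^{−1} can be estimated numerically. The sketch below is an illustration added here, not the author's method. It assumes the standard partial-fraction identity 1/(1 + x^2) = Im(1/(x − i)) for real x in order to obtain the Taylor coefficients, and it uses a crude root-test estimate over a finite window of indices, so the result is only approximate. The names `taylor_coefficient` and `estimated_radius` and the window 200 to 240 are arbitrary choices.

```python
import math

def taylor_coefficient(n, x0):
    """n-th Taylor coefficient of 1/(1 + x^2) at the real point x0, using
    1/(x - i) = sum_n (-1)^n (x - x0)^n / (x0 - i)^(n+1) and taking imaginary parts."""
    return ((-1) ** n * (x0 - 1j) ** (-(n + 1))).imag

def estimated_radius(x0, n_min=200, n_max=240):
    """Root-test estimate 1 / max |c_n|^(1/n) over a window of large indices n."""
    largest = max(abs(taylor_coefficient(n, x0)) ** (1.0 / n)
                  for n in range(n_min, n_max + 1))
    return 1.0 / largest

for x0 in (0.0, 1.0, 2.0):
    # Compare the estimate with the distance from (x0, 0) to (0, 1).
    print(x0, estimated_radius(x0), math.sqrt(1 + x0 * x0))
```

The two printed columns should agree to within about one percent, which is the behaviour described in the remark: the distance to the pole at (0, 1) is visible from data computed entirely on the real line.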
Chapter 9 Algebra
9.1 Semigroups
9.2 Groups
9.3 Subgroups
9.4 Left transformation groups
9.5 Right transformation groups
9.6 Mixed transformation groups
9.7 Figures and invariants of transformation groups
9.8 Rings and fields
9.9 Modules
9.10 Associative algebras
9.11 Lie algebras
9.12 List space for sets with algebraic structure
Figure 9.0.1 shows the family relations between several of the algebraic classes in this chapter. (Another family tree is shown in Figure 9.9.1.)
Figure 9.0.1: Family tree of semigroups, groups, rings and fields. [Diagram relating the following classes: semigroup (G, σ_G); group (G, σ_G); commutative group (G, σ_G); transformation semigroup (G, X, σ_G, µ); transformation group (G, X, σ_G, µ); effective transformation group (G, X, σ_G, µ); free transformation group (G, X, σ_G, µ); left module over a group (G, M, σ_G, σ_M, µ); ring (R, σ_R, τ_R); unitary ring (R, σ_R, τ_R); commutative ring (R, σ_R, τ_R); commutative unitary ring (R, σ_R, τ_R); field (K, σ_K, τ_K).]
Only the basic facts are given about most of these classes, although transformation groups are presented in more detail because of their importance in differential geometry. Linear spaces are even more important; so they have their own chapter: Chapter 10.
As explained in Section 5.16, specification tuples for mathematical objects may be abbreviated. For example, a group may either be fully specified as a tuple (G, σ), where G is the set and σ : G × G → G is the operation on G, or else the group may be abbreviated to just G. The adoption of an abbreviation is indicated by a notation such as G − < (G, σ), which means that “G is an abbreviation for (G, σ)”. The meaning of G is ambiguous because it represents both the basic set and the full tuple, which cannot be equal, but this kind of symbol re-use is standard practice. The chicken-foot notation “− n in Definition 11.1.25.
11.1.19 Theorem: For any field K, ∀m, n, p ∈
11.1.27 Theorem: Let A ∈ M_{m,n}(K) with m, n ∈ ℤ_0^+ for some field K. If m > n then A is not orthogonal.
[ Define row rank and column rank of rectangular matrices. Also define nullity. Define the determinant of a rectangular matrix as a matrix. Define cofactors of elements of a matrix. These definitions make much more sense after the determinant of a square matrix has been defined. So maybe delay these topics? ]
11.1.28 Remark: In the olden days, matrix algebra was performed by hand with pen (or pencil) on paper. Hand-written matrix algebra is most convenient when the vector components are depicted as column vectors. I.e. the elements of the n-tuples are listed in increasing order down the page. This gives the best layout when such a vector is multiplied by a matrix as in the following.
$$\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix}
= \begin{pmatrix}
a_{11} & a_{12} & \ldots & a_{1n} \\
a_{21} & a_{22} & \ldots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \ldots & a_{mn}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$$
The use of vertical (i.e. column) vectors keeps the handwritten calculations compact in the horizontal direction.
A further advantage of column vectors is that the composition of multiple linear transformations gives the same order for matrices. Thus if y = f(x) = Ax and z = g(y) = By for matrices A and B, the composition g(f(x)) equals BAx. Thus in both notations, the order is the reverse of the temporal (or causal) order which intuitively underlies the composition of functions. An m × n matrix over a field K is used for linear maps f : K^n → K^m, which has m and n in the reverse order.
An advantage of multiplying a vector by a matrix from the left is that the indices are contiguous. For example, y_i = Σ_{j=1}^n a_{ij} x_j. The two instances of the index j are close to each other.
In Markov chain theory, the reverse convention is adopted. There the vectors are row vectors, which makes the matrix multiplication order the reverse of the standard function composition notation order. The advantage in this case is that the matrix multiplication order matches the temporal order since the initial state of the system is a row vector which is multiplied on the right by state transition matrices. For example, in the equation w_k = Σ_{i=1}^n Σ_{j=1}^n v_i P_{ij} Q_{jk}, the initial state is v and one or more matrices on the right modify the state, first P then Q.
11.1.29 Remark: Theorem 11.1.30 is a generalized multinomial formula. For m = n = 2, this theorem states that (a_{11} + a_{12})(a_{21} + a_{22}) = a_{11} a_{21} + a_{11} a_{22} + a_{12} a_{21} + a_{12} a_{22}.
11.1.30 Theorem: Let A ∈ M_{m,n}(K) be a matrix over a field K with m, n ∈ ℤ_0^+. Then
$$\prod_{i=1}^m \sum_{j=1}^n a_{ij} = \sum_{J \in (\mathbb{N}_n)^{\mathbb{N}_m}} \prod_{i=1}^m a_{i,J(i)}.$$
Proof: This theorem may be proved by a double application of induction to the distributive law for a field. (See Exercise 47.5.2 for proof.)

11.2. Component matrices of linear maps
The principal application of rectangular matrix algebra is to linear maps between finite-dimensional linear spaces. This section presents the correspondence between linear maps and matrix algebra.
The principal property of the component matrices of linear maps is given in Theorem 11.2.5, namely that the matrix of a composition of linear maps is the product of the matrices of the linear maps. Therefore calculations may be performed on linear maps in terms of their component matrices. Linear maps are often specified in terms of component matrices.
The principal drawback of component matrices is the fact that they depend on the choice of basis for both the source and target space. Conversion between matrices with respect to different bases is tedious and error-prone.
The decision to work with bases and matrices or directly with linear spaces and linear maps is a tradeoff between their advantages and disadvantages in each application context. Differential geometers who proclaim the virtues of “coordinate-free” methods are advocating the avoidance of bases and matrices. The “coordinates” they refer to are the components of vectors and matrices (and tensors) with respect to particular choices of bases. Abstract theorems can mostly be written in a coordinate-free style, but practical calculations mostly require the use of vector component tuples and matrices with respect to bases.
Matrices are also useful for representing bilinear maps with respect to a linear space basis. This is discussed in Remark 13.1.14.
11.2.1 Definition: The (component) matrix of a linear map φ : V → W with respect to bases (e_j)_{j=1}^n ∈ V^n and (f_i)_{i=1}^m ∈ W^m for linear spaces V and W over a field K is the matrix α ∈ M_{m,n}(K) such that φ(e_j) = Σ_{i=1}^m f_i α_{ij} for all j ∈ ℕ_n.
11.2.2 Remark: The matrix α in Definition 11.2.1 is well defined because by definition of a basis, each vector φ(e_j) has a unique m-tuple (α_{ij})_{i=1}^m ∈ K^m of components with respect to the basis (f_i)_{i=1}^m. In terms of the component map in Definition 10.2.14, α_{ij} = κ_W(φ(e_j))_i, where κ_W : W → K^m is the component map with respect to the basis (f_i)_{i=1}^m.
11.2.3 Definition: The linear map for a (component) matrix α ∈ M_{m,n}(K) with respect to bases (e_j)_{j=1}^n ∈ V^n and (f_i)_{i=1}^m ∈ W^m for linear spaces V and W over a field K is the linear map given by φ : v ↦ Σ_{i=1}^m Σ_{j=1}^n f_i α_{ij} κ_V(v)_j for v ∈ V, where κ_V : V → K^n is the component map for the basis (e_j)_{j=1}^n.
11.2.4 Remark: Definition 11.2.3 specifies a unique linear map φ ∈ Lin(V, W) for each matrix α ∈ M_{m,n}(K) for given bases for V and W. Conversely, Definition 11.2.1 specifies a unique matrix α ∈ M_{m,n}(K) for each linear map φ ∈ Lin(V, W) for the same bases. This establishes a bijection between the linear maps and component matrices. Thus one may freely choose whether to work “coordinate-free”, in terms of linear maps, or “using coordinates”, in terms of matrices.
11.2.5 Theorem: Let U, V, W be linear spaces over K with bases a ∈ U^ℓ, b ∈ V^m and c ∈ W^n respectively. Let φ ∈ Hom(U, V) and ψ ∈ Hom(V, W), with matrices α ∈ M_{m,ℓ}(K) and β ∈ M_{n,m}(K) respectively. Then the matrix of ψ ◦ φ is the product matrix βα.
11.2.6 Remark: Whereas the basis vectors are transformed by multiplying the sequence of basis vectors on the right by the matrix α in Definition 11.2.1, component tuples are multiplied on the left by α.
Let v = Σ_{j=1}^n v_j e_j be a vector in V with components (v_j)_{j=1}^n ∈ K^n. (Here v denotes both the vector and its component tuple.) Let φ : V → W be the linear map for the component matrix α ∈ M_{m,n}(K). Then by Definition 11.2.3, φ(v) = Σ_{i=1}^m Σ_{j=1}^n f_i α_{ij} v_j. So the component tuple for φ(v) with respect to the basis (f_i)_{i=1}^m ∈ W^m is (w_i)_{i=1}^m, where w_i = Σ_{j=1}^n α_{ij} v_j for i ∈ ℕ_m. This is multiplication of the component tuple of v on the left by the matrix α. This is illustrated in Figure 11.2.1.
Figure 11.2.1: Matrix multiplication corresponding to a linear map. [Diagram: the linear map φ : V → W between linear spaces V and W with bases (e_j)_{j=1}^n and (f_i)_{i=1}^m; the component maps κ_V : V → K^n and κ_W : W → K^m; and the matrix multiplication w_i = Σ_{j=1}^n α_{ij} v_j from the tuple space K^n, containing (v_j)_{j=1}^n, to the tuple space K^m, containing (w_i)_{i=1}^m.]
[ Present the special case of linear spaces V = K^n and W = K^m. ]
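The correspondence between linear maps and component matrices can be illustrated in the special case of tuple spaces with their standard bases. The following sketch is an illustration added here and assumes this special case. The matrices `alpha` and `beta`, the tuple `v`, and the helper names `mat_mul` and `apply` are hypothetical examples. Here φ is applied first and ψ second, so the composite has the matrix β times α.

```python
def mat_mul(a, b):
    """Product of an m x n matrix a and an n x p matrix b (lists of rows)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def apply(a, v):
    """Component tuple w_i = sum_j a_ij v_j (multiplication of v on the left by a)."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

alpha = [[1, 2], [0, 1], [3, -1]]   # matrix of phi : K^2 -> K^3
beta = [[2, 0, 1], [1, 1, 0]]       # matrix of psi : K^3 -> K^2
v = [5, -2]                         # component tuple of a vector in K^2

composite = mat_mul(beta, alpha)    # matrix of psi after phi
assert apply(composite, v) == apply(beta, apply(alpha, v))
print(composite)                    # [[5, 3], [1, 3]]
print(apply(composite, v))          # [19, -1]
```

The product in the opposite order, `mat_mul(alpha, beta)`, is a 3 × 3 matrix and does not represent the composite, which is one way to see that the order of the factors matters.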
[ Should also discuss matrices for changes of basis in a single space. ]
[ Present the relation between the matrix transpose and the transpose of a linear map. See Definition 10.5.23. ]
[ Maybe have a new section for the inner product component matrices which are mentioned in Remark 11.2.7. ]
11.2.7 Remark: Since an inner product on a linear space over the real number field is a symmetric bilinear map on the linear space, it is possible to represent the inner product as a matrix multiplied by the components of vectors with respect to a basis.

11.3. Square matrix algebra
Whereas general rectangular matrices correspond to general linear homomorphisms φ : V → W, square matrices are required for the components of linear endomorphisms φ : V → V. The component matrices of linear automorphisms are invertible square matrices.
11.3.1 Notation: M_n(K) denotes M_{n,n}(K) for n ∈ ℤ_0^+ and field K.
11.3.2 Definition: The trace of a matrix A = (a_{ij})_{i,j=1}^n ∈ M_{n,n}(K) for n ∈ ℤ_0^+ is the sum Σ_{i=1}^n a_{ii}.
11.3.3 Notation: Tr(A) denotes the trace of a matrix A ∈ M_{n,n}(K) for n ∈ ℤ_0^+ and a field K.
11.3.4 Theorem: The trace of square matrices has the following properties for all n ∈ ℤ_0^+ and fields K.
(i) ∀λ ∈ K, ∀A ∈ M_{n,n}(K), Tr(λA) = λ Tr(A).
(ii) ∀A, B ∈ M_{n,n}(K), Tr(A + B) = Tr(A) + Tr(B).
(iii) ∀A ∈ M_{n,n}(K), Tr(A^T) = Tr(A).
11.3.5 Remark: Determinants are defined in terms of permutations and parity, which are defined in Section 7.10. The set perm(ℕ_n) is the set of n! permutations of ℕ_n = {1, 2, . . . n}.
11.3.6 Definition: The determinant of a matrix A = (a_{ij})_{i,j=1}^n ∈ M_{n,n}(K) for n ∈ ℤ_0^+ and field K is the element of K given by Σ_{P∈perm(ℕ_n)} parity(P) ∏_{i=1}^n a_{i,P(i)}.
11.3.7 Notation: det(A) denotes the determinant of a matrix A ∈ M_{n,n}(K) for n ∈ ℤ_0^+ and a field K.
$$\det(A) = \sum_{P \in \operatorname{perm}(\mathbb{N}_n)} \operatorname{parity}(P) \prod_{i=1}^n a_{i,P(i)}.$$
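The permutation-sum formula for the determinant translates almost literally into code. The following is a minimal sketch added for illustration. It assumes 0-based indices in place of ℕ_n = {1, . . . n}, computes the parity of a permutation by counting inversions (one of several equivalent ways), and uses a hypothetical integer matrix `a`. Since the sum has n! terms, this is only practical for small n.

```python
from itertools import permutations

def parity(p):
    """+1 or -1 for a permutation given as a tuple of 0-based images,
    computed by counting inversions."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
                     if p[i] > p[j])
    return -1 if inversions % 2 else 1

def det(a):
    """Sum over all permutations P of parity(P) * prod_i a[i][P(i)]."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        term = parity(p)
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total

a = [[2, -1, 0], [1, 3, 4], [0, 5, -2]]
print(det(a))                   # -54
print(det([[1, 2], [3, 4]]))    # -2, i.e. 1*4 - 2*3
```

For n = 0 the only permutation is the empty one and the empty product is 1, so `det([])` returns 1 under this formula.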
11.3.8 Remark: If the field K in Definition 11.3.6 is ℤ_2 = {0, 1}, the parity values 1 and −1 are the same. The matrix A has only zeros and ones as elements. Since the determinant can only be zero or one, it is interesting to ask what kinds of matrices have the non-zero determinant value. The diagonal matrix with a_{ij} = δ_{ij} for i, j ∈ ℕ_n certainly has det(A) = 1. It is also clear that det(A) = 1 for any lower triangular matrix with a_{ij} = 1 for i = j and a_{ij} = 0 for i < j. (This is true for any field.)
11.3.9 Theorem: Let A ∈ M_{n,n}(K) for n ∈ ℤ_0^+ and a field K. Then det(A^T) = det(A).
Proof: For any n ∈ ℤ_0^+, the set {P^{−1}; P ∈ perm(ℕ_n)} is equal to perm(ℕ_n). This is because P^{−1} ∈ perm(ℕ_n) if and only if P ∈ perm(ℕ_n). Since also parity(P^{−1}) = parity(P),
$$\det(A) = \sum_{P \in \operatorname{perm}(\mathbb{N}_n)} \operatorname{parity}(P) \prod_{i=1}^n a_{i,P^{-1}(i)}.$$
But ∏_{i=1}^n a_{i,P^{−1}(i)} = ∏_{i=1}^n a_{P(i),P(P^{−1}(i))} = ∏_{i=1}^n a_{P(i),i} for any P ∈ perm(ℕ_n) because a permutation of the factors in a product in a field does not affect the value of the product. (According to Definition 9.8.8, the product operation of a field is commutative.) Therefore
$$\det(A) = \sum_{P \in \operatorname{perm}(\mathbb{N}_n)} \operatorname{parity}(P) \prod_{i=1}^n a_{P(i),i},$$
which equals det(A^T).
11.3.10 Theorem: Let A ∈ M_{n,n}(K) for n ∈ ℤ_0^+ and a field K. Then det(λA) = λ^n det(A) for λ ∈ K.
11.3.11 Remark: For Theorem 11.3.13, it is useful to first prove Lemma 11.3.12. The parity function is extended by parity(Q) = 0 for functions Q : ℕ_n → ℕ_n which are not permutations (i.e. not bijections).
[ Try to find a shorter proof for Lemma 11.3.12. It can’t be so difficult! ]
11.3.12 Lemma: Let A ∈ M_{n,n}(K) for n ∈ ℤ_0^+ and a field K. Let Q : ℕ_n → ℕ_n. Then
$$\sum_{P \in \operatorname{perm}(\mathbb{N}_n)} \operatorname{parity}(P) \prod_{i=1}^n a_{Q(i),P(i)} = \operatorname{parity}(Q) \det(A). \tag{11.3.1}$$
Proof: Suppose first that Q is not a permutation of ℕ_n. (This implies, incidentally, that n ≥ 2.) Then Q(k) = Q(ℓ) for some k, ℓ ∈ ℕ_n with k ≠ ℓ. Let S ∈ perm(ℕ_n) be the permutation which swaps k and ℓ. (That is, S(k) = ℓ and S(ℓ) = k; otherwise S(i) = i.) Then Q ◦ S = Q ◦ S^{−1} = Q and parity(S) = −1. Substitute P = T ◦ S in the left-hand side of (11.3.1) for permutations T ∈ perm(ℕ_n). Then
$$\begin{aligned}
\sum_{P \in \operatorname{perm}(\mathbb{N}_n)} \operatorname{parity}(P) \prod_{i=1}^n a_{Q(i),P(i)}
&= \sum_{T \in \operatorname{perm}(\mathbb{N}_n)} \operatorname{parity}(T \circ S) \prod_{i=1}^n a_{Q(i),TS(i)} \\
&= \sum_{T \in \operatorname{perm}(\mathbb{N}_n)} \operatorname{parity}(T) \operatorname{parity}(S) \prod_{i=1}^n a_{QS^{-1}(i),T(i)} \\
&= \operatorname{parity}(S) \sum_{T \in \operatorname{perm}(\mathbb{N}_n)} \operatorname{parity}(T) \prod_{i=1}^n a_{Q(i),T(i)} \\
&= 0,
\end{aligned}$$
because x = −x for x ∈ K implies that x = 0. [ The Lemma must be true for a field which contains an element x with x + x = 0. Must find a different kind of proof in this case? ] Since parity(Q) = 0 for a non-permutation, the Lemma is verified in this case.
If Q is a permutation of ℕ_n, substitution of P = T ◦ Q in (11.3.1) yields
$$\begin{aligned}
\sum_{P \in \operatorname{perm}(\mathbb{N}_n)} \operatorname{parity}(P) \prod_{i=1}^n a_{Q(i),P(i)}
&= \sum_{T \in \operatorname{perm}(\mathbb{N}_n)} \operatorname{parity}(T \circ Q) \prod_{i=1}^n a_{Q(i),TQ(i)} \\
&= \sum_{T \in \operatorname{perm}(\mathbb{N}_n)} \operatorname{parity}(T) \operatorname{parity}(Q) \prod_{i=1}^n a_{QQ^{-1}(i),T(i)} \\
&= \operatorname{parity}(Q) \sum_{T \in \operatorname{perm}(\mathbb{N}_n)} \operatorname{parity}(T) \prod_{i=1}^n a_{i,T(i)} \\
&= \operatorname{parity}(Q) \det(A),
\end{aligned}$$
which is as claimed.
11.3.13 Theorem: Let A, B ∈ M_{n,n}(K) for n ∈ ℤ_0^+ and a field K. Then det(AB) = det(A) det(B).
Proof: Let N_n = (ℕ_n)^{ℕ_n} and P_n = perm(ℕ_n) ⊆ N_n. Then
$$\begin{aligned}
\det(AB)
&= \sum_{P \in P_n} \operatorname{parity}(P) \prod_{i=1}^n \sum_{k=1}^n a_{ik} b_{k,P(i)} \\
&= \sum_{P \in P_n} \operatorname{parity}(P) \sum_{Q \in N_n} \prod_{i=1}^n a_{i,Q(i)} b_{Q(i),P(i)} && (11.3.2) \\
&= \sum_{Q \in N_n} \sum_{P \in P_n} \operatorname{parity}(P) \prod_{i=1}^n a_{i,Q(i)} b_{Q(i),P(i)} \\
&= \sum_{Q \in N_n} \prod_{i=1}^n a_{i,Q(i)} \sum_{P \in P_n} \operatorname{parity}(P) \prod_{j=1}^n b_{Q(j),P(j)} \\
&= \sum_{Q \in N_n} \prod_{i=1}^n a_{i,Q(i)} \operatorname{parity}(Q) \sum_{P \in P_n} \operatorname{parity}(P) \prod_{j=1}^n b_{j,P(j)} && (11.3.3) \\
&= \sum_{Q \in P_n} \operatorname{parity}(Q) \prod_{i=1}^n a_{i,Q(i)} \sum_{P \in P_n} \operatorname{parity}(P) \prod_{j=1}^n b_{j,P(j)} \\
&= \det(A) \det(B).
\end{aligned}$$
Line (11.3.2) follows from Theorem 11.1.30. Line (11.3.3) follows from Lemma 11.3.12.
11.3.14 Remark: Left and right inverses of matrices are given by Definition 11.1.24. In the case of square matrices, a matrix B is a left inverse of a matrix A if and only if B is a right inverse of A. This common left/right inverse is a unique matrix which is called the inverse of A if such a matrix exists.
11.3.15 Definition: An invertible matrix is a matrix A ∈ M_{n,n}(K) for n ∈ ℤ_0^+ such that for some B ∈ M_{n,n}(K), B is the inverse of A.
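Lemma 11.3.12 and Theorem 11.3.13 can be checked by brute force for small matrices. The sketch below is an added illustration, not a proof. It assumes 0-based indices, integer entries, and two hypothetical 3 × 3 matrices `a` and `b`. The function `parity` implements the extended parity of Remark 11.3.11, which is 0 for functions which are not permutations.

```python
from itertools import permutations, product

def parity(q):
    """Extended parity: 0 if q is not a bijection of {0, ..., n-1}, else +1 or -1."""
    n = len(q)
    if sorted(q) != list(range(n)):
        return 0
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if q[i] > q[j])
    return -1 if inversions % 2 else 1

def signed_sum(a, q):
    """Left-hand side of (11.3.1): sum over P of parity(P) * prod_i a[Q(i)][P(i)]."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        term = parity(p)
        for i in range(n):
            term *= a[q[i]][p[i]]
        total += term
    return total

def det(a):
    """Determinant as the special case where Q is the identity map."""
    return signed_sum(a, tuple(range(len(a))))

def mat_mul(a, b):
    """Product of two n x n matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

a = [[2, -1, 0], [1, 3, 4], [0, 5, -2]]
b = [[1, 0, 2], [-3, 1, 1], [4, 2, 0]]

for q in product(range(3), repeat=3):        # all 27 functions Q
    assert signed_sum(a, q) == parity(q) * det(a)   # Lemma 11.3.12
assert det(mat_mul(a, b)) == det(a) * det(b)        # Theorem 11.3.13
print(det(a), det(b), det(mat_mul(a, b)))           # -54 -22 1188
```

Of the 27 functions Q tested, only 6 are permutations. For the other 21 the signed sum vanishes, which is the case that reduces the sum over N_n to a sum over P_n in the proof of Theorem 11.3.13.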
11.3.16 Notation: A^{−1} denotes the inverse of a square matrix A.
11.3.17 Theorem: The set M^{inv}_{n,n}(K) of invertible n × n matrices over a field K for n ∈ ℤ_0^+ has the following properties.
(i) ∀λ ∈ K \ {0_K}, ∀A ∈ M^{inv}_{n,n}(K), (λA)^{−1} = λ^{−1} A^{−1}.
(ii) ∀A, B ∈ M^{inv}_{n,n}(K), (AB)^{−1} = B^{−1} A^{−1}.
(iii) ∀A ∈ M^{inv}_{n,n}(K), det(A^{−1}) = det(A)^{−1}.
Proof: Property (iii) follows from Theorem 11.3.13 and the definition of an inverse matrix.
[ Define eigenvalues and eigenvectors of square matrices. Give relations between invertibility of a matrix and its set of eigenvalues. ]
11.3.18 Remark: It turns out that the trace and determinant functions, and the inverse matrix operation, are given a lot more meaning when viewed within the framework of eigenspaces. The concept of eigenspaces for linear space endomorphisms is discussed in Section 10.4. The one-to-one onto correspondence between matrices and linear maps is discussed in Section 11.2. The eigenspace concept for linear spaces may be applied to square matrices via this correspondence. This is because linear space endomorphisms correspond to square matrices.
11.3.19 Definition: A symmetric n × n matrix over a field K for n ∈ ℤ_0^+ is a matrix (a_{ij})_{i,j=1}^n ∈ M_{n,n}(K) such that ∀i, j ∈ ℕ_n, a_{ij} = a_{ji}.

11.4. Real square matrix algebra
[ Generalize the definitions in this section to fields other than ℝ. ]
11.4.1 Remark: The real-number field ℝ has a standard total order. This enables the definition of various norms on matrices over ℝ.
11.4.2 Definition: The upper norm for real square matrices is the function λ^+ : M_{n,n}(ℝ) → ℝ defined for n ∈ ℤ_0^+ by
$$\forall A \in M_{n,n}(\mathbb{R}),\quad \lambda^+(A) = \sup\Bigl\{ \sum_{i,j=1}^n a_{ij} v_i v_j ;\ v \in \mathbb{R}^n \text{ and } \sum_{i=1}^n v_i^2 = 1 \Bigr\}.$$
The lower norm for real square matrices is the function λ^− : M_{n,n}(ℝ) → ℝ defined for n ∈ ℤ_0^+ by
$$\forall A \in M_{n,n}(\mathbb{R}),\quad \lambda^-(A) = \inf\Bigl\{ \sum_{i,j=1}^n a_{ij} v_i v_j ;\ v \in \mathbb{R}^n \text{ and } \sum_{i=1}^n v_i^2 = 1 \Bigr\}.$$
11.4.3 Remark: The terms “upper norm” and “lower norm” in Definition 11.4.2 are probably nonstandard. These functions are useful for putting upper and lower limits on the coefficients of second-order partial derivatives in partial differential equations. 11.4.4 Remark: In the case n = 0 in Definition 11.4.2, λ+ (A) = −∞ and λ− (A) = ∞ for all A ∈ Mn,n (IR). Therefore these functions are of dubious value for n = 0. 11.4.5 Theorem:
Z+0, ∀A ∈ Mn,n (IR), λ+( 12 (A + AT )) = λ+(A). T − − 1 (ii) ∀n ∈ Z+ 0 , ∀A ∈ Mn,n (IR), λ ( 2 (A + A )) = λ (A). + 1 T − 1 T (iii) ∀n ∈ Z+ 0 , ∀A ∈ Mn,n (IR), λ ( 2 (A − A )) = λ ( 2 (A − A )) = 0. (i) ∀n ∈
11.4.6 Remark: In the study of second-order partial differential equations, the order properties of the coefficient matrix for the second-order derivatives are important for the classification of operators and equations. An elliptic operator, for example, requires a second-order coefficient matrix which is positive definite. A weakly elliptic operator requires a positive semi-definite second-order coefficient matrix. In curved space, similar classifications are applicable, although the second-order derivatives in this case should be covariant and the coefficient matrix is replaced by a second-order contravariant tensor. Then the (semi)definiteness properties are applied to the components of this contravariant tensor with respect to a basis. Of course, the (semi)definiteness properties must be chart-independent in order to ensure that the properties are well defined. 11.4.7 Definition: A real square matrix (aij )ni,j=1 ∈ Mn,n (IR) for n ∈ Pn positive semi-definite if ∀v ∈ IRn , i,j=1 aij vi vj ≥ 0, Pn negative semi-definite if ∀v ∈ IRn , i,j=1 aij vi vj ≤ 0, Pn positive definite if ∀v ∈ IRn \ {0}, i,j=1 aij vi vj > 0, Pn n negative definite if ∀v ∈ IR \ {0}, i,j=1 aij vi vj < 0. 11.4.8 Theorem: Let n ∈
Z+0 is said to be
Z+ and A ∈ Mn,n (IR). Then
(i) A is positive semi-definite if and only if λ− (A) ≥ 0;
(ii) A is negative semi-definite if and only if λ+ (A) ≤ 0.
Z
Proof: To show (i), let n ∈ + and suppose that A is positive A ∈P Mn,n (IR) and Pnsemi-definite. Then n n Pn n 2 ∀v ∈ IR , i,j=1 aij vi vj ≥ 0. By Definition 11.4.2, λ− (A) = inf a v v ; v ∈ IR and ij i j i,j=1 i=1 vi = 1 . Pn But i,j=1 aij vi vj ≥ 0 for any v ∈ IRn . Therefore λ− (A) ≥ 0 as claimed. P Now suppose that A ∈ Mn,n (IR) and λ− (A) ≥ 0. If v ∈ IRn is the zero vector v = 0, then ni,j=1 aij vi vj = 0. Pn Pn 2 So a v v ≥ 0 as required. If v 6= 0, define w ∈ IRn by w = k−1/2 v, where k = i,j=1 i=1 vi . Pn ij 2i j Pn P n − Then definition of λ (A). Therefore i=1 wi = 1. So i,j=1 aij wi wj ≥ 0 be the i,j=1 aij vi vj = Pn Pn n k i,j=1 aij wi wj ≥ 0 since k ≥ 0. It follows that i,j=1 aij vi vj ≥ 0 for all v ∈ IR . Hence A is positive semi-definite. The proof of (ii) follows by suitable changes of sign. [ www.topology.org/tex/conc/dg.html ]
[ draft: UTC 2009–3–21 Saturday 11:36 ]
Differential geometry reconstructed.Copyright (C) 2009, Alan U. Kennington. All Rights Reserved.The author hereby grants permission to print this book draft in A4 format.Printing in all other formats is forbidden.You may not charge any fee for copies of this book draft.
P P Proof: Parts (i) and (ii) follow from ni,j=1 ( 12 (aij + aji ))vi vj = ni,j=1 aij vi vj , which follows from comP mutativity of real number multiplication. Part (iii) follows from ni,j=1 ( 21 (aij − aji ))vi vj = 0.
11.4.9 Remark: Definite and semi-definite matrices are generally defined only for the special case of real symmetric or complex hermitian matrices. (See for example EDM2 [34], 269.I.) Whether a real matrix $A$ has a definiteness or semi-definiteness property depends only on the symmetric part $\tfrac12(A + A^T)$. (This is stated in Theorem 11.4.5.) The main applications of (semi)definiteness are to eigenspaces for real symmetric (or complex hermitian) matrices, which are guaranteed to have real eigenvalues. Such matrices therefore have well-defined ordering properties when applied to vectors. Nevertheless, Definition 11.4.7 defines (semi)definiteness for all real square matrices. Notation 11.6.1 specializes the concept to real symmetric matrices.
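The dependence of the quadratic form on the symmetric part alone can be checked numerically. The following Python sketch is only an illustration: the helper names `quad_form`, `symmetric_part` and `antisymmetric_part`, and the sample matrix and vector, are hypothetical and do not come from the text.

```python
def quad_form(a, v):
    """Return the quadratic form sum_{i,j} a[i][j] * v[i] * v[j]."""
    n = len(v)
    return sum(a[i][j] * v[i] * v[j] for i in range(n) for j in range(n))


def symmetric_part(a):
    """Return (A + A^T) / 2 as a nested list."""
    n = len(a)
    return [[(a[i][j] + a[j][i]) / 2 for j in range(n)] for i in range(n)]


def antisymmetric_part(a):
    """Return (A - A^T) / 2 as a nested list."""
    n = len(a)
    return [[(a[i][j] - a[j][i]) / 2 for j in range(n)] for i in range(n)]


# Hypothetical sample data: a non-symmetric matrix and a test vector.
A = [[2, 5], [-1, 3]]
v = [1, -2]

full_value = quad_form(A, v)                      # form of A itself
sym_value = quad_form(symmetric_part(A), v)       # form of the symmetric part
anti_value = quad_form(antisymmetric_part(A), v)  # form of the antisymmetric part
print(full_value, sym_value, anti_value)
```

For this sample the first two values agree and the third vanishes, which is what the symmetric-part statement predicts for every choice of `A` and `v`.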
11.5. Real symmetric matrix algebra
[ Probably discuss in this section the basic facts about symmetric matrices, orthogonal eigenspaces, preservation of matrix symmetry under linear transformations, and similar useful things. Define also characteristic polynomials. ]
[ It is usual to discuss real symmetric matrices in the context of hermitian complex matrices. The usual sets of complex matrices will be presented if I can find a use for them. There should be a new section for complex matrices. ]

11.5.1 Remark: Symmetric matrices arise constantly in partial differential equations as the second derivatives of $C^2$ real-valued functions on manifolds. Since the matrices $U = (u_{ij})_{i,j=1}^n$ of second derivatives of such functions are necessarily symmetric, the coefficient matrices $A = (a_{ij})_{i,j=1}^n$ of second-order terms such as $\mathrm{Tr}(A^T U) = \sum_{i,j=1}^n a_{ij} u_{ij}$ in PDEs may be assumed symmetric also, since the anti-symmetric component has no effect. The following notations and theorems are motivated by PDE applications.
11.5.3 Notation: $\mathrm{Sym}(n, \mathbb{R})$ denotes the set of real symmetric $n \times n$ matrices for $n \in \mathbb{Z}_0^+$.
11.5.4 Remark: It is not generally true that $AB \in \mathrm{Sym}(n, \mathbb{R})$ if $A, B \in \mathrm{Sym}(n, \mathbb{R})$. By Theorem 11.5.5, $(AB)^T = BA$ if $A, B \in \mathrm{Sym}(n, \mathbb{R})$, which is not quite the same thing. Theorem 11.1.19 implies that $(AB)^T = B^T A^T$ for any matrices $A, B \in M_{n,n}(\mathbb{R})$. So $(AB)^T = BA = B^T A^T = (A^T B^T)^T$ for all $A, B \in \mathrm{Sym}(n, \mathbb{R})$. It is not possible to conclude the general equality of $(AB)^T$ and $AB$ from this. However, $AB + BA$ is symmetric if $A$ and $B$ are symmetric, as shown in Theorem 11.5.6.

11.5.5 Theorem: Let $n \in \mathbb{Z}_0^+$ and $A, B \in \mathrm{Sym}(n, \mathbb{R})$. Then $(AB)^T = BA$.
Proof: See Exercise 47.5.3.

11.5.6 Theorem: Let $n \in \mathbb{Z}_0^+$ and $A, B \in \mathrm{Sym}(n, \mathbb{R})$. Then $AB + BA \in \mathrm{Sym}(n, \mathbb{R})$.

Proof: Let $n \in \mathbb{Z}_0^+$ and $A, B \in \mathrm{Sym}(n, \mathbb{R})$. Then by Theorem 11.5.5 and Remark 11.1.15, $(AB + BA)^T = (AB)^T + (BA)^T = BA + AB = AB + BA$. So $AB + BA \in \mathrm{Sym}(n, \mathbb{R})$ as claimed.
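A small numerical instance may make the difference between Theorems 11.5.5 and 11.5.6 concrete. The sketch below uses hypothetical helper functions and sample matrices chosen only for illustration; it checks that the product of two symmetric matrices need not be symmetric, whereas its transpose equals the reversed product and the symmetrized sum is symmetric.

```python
def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    """Return the transpose of a matrix given as a nested list."""
    return [list(row) for row in zip(*a)]


def add(a, b):
    """Add two matrices of the same shape entry by entry."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def is_symmetric(a):
    """Return True if the matrix equals its transpose."""
    return a == transpose(a)


# Hypothetical sample data: two real symmetric 2 x 2 matrices.
A = [[1, 2], [2, 3]]
B = [[0, 1], [1, 4]]

AB = matmul(A, B)
BA = matmul(B, A)

print(is_symmetric(AB))           # the product alone is not symmetric here
print(transpose(AB) == BA)        # (AB)^T = BA, as in Theorem 11.5.5
print(is_symmetric(add(AB, BA)))  # AB + BA is symmetric, as in Theorem 11.5.6
```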
[ Possibly Remark 11.5.7 applies to general symmetric matrices for any field? ]

11.5.7 Remark: Theorem 11.5.8 gives a spectral decomposition for a symmetric matrix $A$ using a style of inductive proof which does not seem to rely on solving the polynomial equation $\det(\lambda I - A) = 0$.

The eigenvalues and eigenvectors that arise from Theorem 11.5.8 derive their name from the German adjective “eigen”, which means “own”, “typical” or “particular”. The reason for this choice of word is the fact that eigenvalues and eigenvectors are invariant under orthogonal transformations. To see this, if $A = \sum_{i=1}^n \lambda_i e_i e_i^T$ is transformed by the orthogonal matrix $B$, then the transformed matrix is $B^{-1}AB = \sum_{i=1}^n \lambda_i e'_i (e'_i)^T$ with $e'_i = B^{-1} e_i = B^T e_i$. So the eigenvalues and eigenvectors may be thought of as being “attached” to these matrices under any orthogonal transformation.

[ Prove the assertion about $B^{-1}AB = \sum_{i=1}^n \lambda_i e'_i (e'_i)^T$ in Remark 11.5.7. This should be in a separate section on eigenspaces. ]
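11.5.2 Definition: A real symmetric $n \times n$ matrix for $n \in \mathbb{Z}_0^+$ is a matrix $(a_{ij})_{i,j=1}^n \in M_{n,n}(\mathbb{R})$ such that $\forall i, j \in \mathbb{N}_n$, $a_{ij} = a_{ji}$.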
11.5.8 Theorem: For any symmetric matrix $A \in \mathrm{Sym}(n, \mathbb{R})$, there are numbers $\lambda_1, \ldots \lambda_n \in \mathbb{R}$ and orthonormal vectors $e_1, \ldots e_n \in \mathbb{R}^n$ such that $Ax = \sum_{i=1}^n \lambda_i (x^T e_i) e_i$ for all $x \in \mathbb{R}^n$. Consequently, $A = \sum_{i=1}^n \lambda_i e_i e_i^T$.
Proof: [ Try to prove this spectral decomposition theorem using the induction method. ]
[ Quote proof of orthogonality of eigenspaces of self-adjoint operator in Rudin [137], page 313, thm 12.12. ]
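The two formulas in Theorem 11.5.8 can be compared on a small example. The sketch below does not prove the theorem or compute eigenvectors; it starts from an assumed orthonormal pair and assumed eigenvalues (hypothetical sample data), builds $A = \sum_i \lambda_i e_i e_i^T$, and checks that $Ax$ agrees with $\sum_i \lambda_i (x^T e_i) e_i$. Exact rational arithmetic is used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction as F


def dot(u, v):
    """Return the Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def mat_vec(a, x):
    """Apply a matrix (nested list) to a vector."""
    return [dot(row, x) for row in a]


# Assumed sample data: eigenvalues and an orthonormal basis of R^2.
eigenvalues = [F(2), F(-1)]
eigenvectors = [[F(3, 5), F(4, 5)], [F(-4, 5), F(3, 5)]]
n = 2

# A = sum_i lambda_i * e_i * e_i^T, built entry by entry.
A = [[sum(lam * e[i] * e[j] for lam, e in zip(eigenvalues, eigenvectors))
      for j in range(n)] for i in range(n)]

# Compare A x with sum_i lambda_i (x^T e_i) e_i for an arbitrary vector x.
x = [F(1), F(7)]
lhs = mat_vec(A, x)
rhs = [sum(lam * dot(x, e) * e[i] for lam, e in zip(eigenvalues, eigenvectors))
       for i in range(n)]
print(lhs == rhs)
```

With this construction each assumed vector $e_i$ satisfies $A e_i = \lambda_i e_i$, which is why the numbers $\lambda_i$ are called eigenvalues in Remark 11.5.7.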
11.6. Real symmetric definite and semi-definite matrices

11.6.1 Notation:
$\mathrm{Sym}_0^+(n, \mathbb{R})$ denotes the set of real positive semi-definite symmetric $n \times n$ matrices for $n \in \mathbb{Z}_0^+$.
$\mathrm{Sym}_0^-(n, \mathbb{R})$ denotes the set of real negative semi-definite symmetric $n \times n$ matrices for $n \in \mathbb{Z}_0^+$.
$\mathrm{Sym}^+(n, \mathbb{R})$ denotes the set of real positive definite symmetric $n \times n$ matrices for $n \in \mathbb{Z}_0^+$.
$\mathrm{Sym}^-(n, \mathbb{R})$ denotes the set of real negative definite symmetric $n \times n$ matrices for $n \in \mathbb{Z}_0^+$.
11.6.2 Remark: Theorem 11.5.6 states that $AB + BA \in \mathrm{Sym}(n, \mathbb{R})$ if $A, B \in \mathrm{Sym}(n, \mathbb{R})$, for all $n \in \mathbb{Z}_0^+$. However, it is not generally true that $AB + BA \in \mathrm{Sym}_0^+(n, \mathbb{R})$ if $A, B \in \mathrm{Sym}_0^+(n, \mathbb{R})$. Example 11.6.3 is a counterexample to this hypothesis. Roughly speaking, a positive semi-definite real symmetric matrix corresponds to a linear transformation which rotates vectors by no more than 90°. But a composition of two such transformations may rotate vectors by up to 180°. Hence positive semi-definiteness is not transitive, even when restricted to real symmetric matrices.
[ Rudin [137], page 330, has some useful results on positive operators on Hilbert spaces which are related to Remark 11.6.2. ]
11.6.3 Example: Let $A = \begin{pmatrix} a & 0 \\ 0 & 1 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & b \\ b & 1 \end{pmatrix}$ with $a = 2$, $b = 1$ and $v = \begin{pmatrix} 2 \\ -3 \end{pmatrix}$. Then $AB + BA = \begin{pmatrix} 2a & b(a+1) \\ b(a+1) & 2 \end{pmatrix} = \begin{pmatrix} 4 & 3 \\ 3 & 2 \end{pmatrix}$. So $v^T (AB + BA) v = -2 \not\ge 0$. Hence $AB + BA \notin \mathrm{Sym}_0^+(2, \mathbb{R})$ although $A, B \in \mathrm{Sym}_0^+(2, \mathbb{R})$.

11.6.4 Remark: Theorem 11.6.5 is important in the theory of elliptic partial differential operators.
[ One way of proving Theorem 11.6.5 would be to diagonalize either A or U. But that requires some theorems about diagonalization of matrices and invariance of the set of eigenvalues under orthogonal transformations. ]

11.6.5 Theorem: If $A, U \in \mathrm{Sym}_0^+(n, \mathbb{R})$ for some $n \in \mathbb{Z}_0^+$, then $\mathrm{Tr}(AU) \ge 0$.
[ See Rudin [136], pages 186–188, for properties of products of matrices. ]
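The arithmetic of Example 11.6.3 can be replayed mechanically. The Python sketch below is illustrative only: the helper names are hypothetical, and the test `is_psd_2x2` relies on the standard criterion that a symmetric 2 × 2 matrix is positive semi-definite exactly when its diagonal entries and its determinant are non-negative. The last line evaluates the trace of the product for the same pair of matrices, as one instance of Theorem 11.6.5.

```python
def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Add two matrices of the same shape entry by entry."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def quad_form(a, v):
    """Return v^T a v."""
    n = len(v)
    return sum(a[i][j] * v[i] * v[j] for i in range(n) for j in range(n))


def trace(a):
    """Return the sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))


def is_psd_2x2(s):
    """Positive semi-definiteness test for a symmetric 2 x 2 matrix."""
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    return s[0][0] >= 0 and s[1][1] >= 0 and det >= 0


# Data of Example 11.6.3 with a = 2 and b = 1.
A = [[2, 0], [0, 1]]
B = [[1, 1], [1, 1]]
v = [2, -3]

S = add(matmul(A, B), matmul(B, A))  # the symmetrized product AB + BA
print(S, quad_form(S, v))
print(is_psd_2x2(A), is_psd_2x2(B), is_psd_2x2(S))
print(trace(matmul(A, B)))
```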
11.7. Matrix groups
[ In this section, define matrix groups such as GL(n), SL(n), O(n) and SO(n). These are related to the corresponding automorphism groups $GL(\mathbb{R}^n)$ and so forth in Section 10.9. ]
[ Also define the groups of Euclidean transformations and affine transformations on $\mathbb{R}^n$. Might as well do projective and conformal groups too. Also add translation groups to these groups. And just for good measure, do Lorentz transformations too. ]
[ Also make sure to include groups which preserve a hyperbolic norm like the Minkowski and Lorentz groups. ]

11.7.1 Remark: The classical matrix groups include the following.
[ Give a theorem that positive semi-definite matrices do not rotate vectors by more than 90°. ]
notation   name                       definition
GL(n)      general linear group
SL(n)      special linear group
O(n)       orthogonal group
SO(n)      special orthogonal group
U(n)       unitary group
SU(n)      special unitary group
Chapter 12 Affine spaces
12.1 Affine spaces discussion
12.2 Affine space definitions
12.3 Affine transformation groups
12.4 Euclidean spaces
12.0.1 Remark: The word “affine” is used in many ways, some of which are mutually contradictory. An “affine transformation” of a linear space is a combination of a translation and an invertible linear transformation. An “affine space” is a combination of a space of points with a linear space of vectors at each point. An “affine connection” is really a linear connection. It is possible to justify the name “affine connection”, but it is confusing.
12.1. Affine spaces discussion
The concepts of affine spaces and affine transformations were apparently first published by Euler [103] in 1748 in a superficial way, and were first dealt with in a serious manner by Möbius [75] in 1827. Euler chose the name “affine” for the concept, but it was Möbius who correctly defined the general affine transformation and proved some non-trivial properties and invariants of affine transformations, such as the invariance of convex combinations of points. In choosing the name “affine”, Euler needed something to describe a geometric relation which was weaker than similarity, and since “affine” means “related”, he thought it seemed appropriate. (See Section 46.3 for details.)

It is important to have some understanding of affine spaces in order to better understand the concept of an affine connection on a differentiable manifold. (Affine connections are presented in Chapter 37.) An affine space is analogous to an affine connection in the same sense that a Euclidean space is analogous to a Riemannian metric. In an affine linear space, straight lines and parallelism are defined, but not distance or angles. An affine connection defines geodesics and parallel transport, but not distance or angles.

The summary of classical point-and-line geometry in Reinhardt [134], pages 128–169, is very enlightening.
[ Defines analytic (coordinate) and synthetic (axiomatic point/line) geometry. ]
[ The Euclidean and affine groups will be defined in Section 12.3. ]

An affine space is a linear space such as $\mathbb{R}^n$ with no special zero vector and no scalar multiplication or vector addition operations, although vector subtraction is permitted. An affine space is a democratic space
An affine space is not the same as a linear space. A linear space has only vectors. An affine space has both points and vectors. In practical calculations, points and vectors may be confused. It is important to clearly distinguish between points and vectors in the theoretical development of the subject.

Affine spaces provide a familiar model which can assist understanding of parallel transport and affine connections on general manifolds. An affine space may be thought of as a manifold with absolute (i.e. path-independent) parallelism. In other words, affine spaces are examples of “flat spaces”. The point set of an affine space may be identified with the familiar Euclidean or Cartesian spaces of arbitrary dimension. An obstacle to axiomatizing a very familiar model is the difficulty of turning off one’s intuition.
where all points are equal. To avoid confusion, Definition 12.2.1 removes the unwanted concepts from $\mathbb{R}^n$ by starting with a bare set $X$ and adding only those properties of $\mathbb{R}^n$ which are required. Clearly any linear space could be used in place of $\mathbb{R}^n$ to give other kinds of affine spaces. Definition 12.2.1 removes the zero vector from an affine space, and removes concepts such as angles or distance by using an abstract linear space instead of $\mathbb{R}^n$.

Another construction to represent an affine space would be an equivalence class of linear maps. This is the approach taken for defining manifolds and it is just as applicable here. Thus an affine space could be defined as a set $X$ together with a bijection $\psi : X \to \mathbb{R}^n$. The pair $(X, \psi)$ is then taken to be equivalent to the pair $(X', \psi')$ if and only if $X = X'$ and $\psi' \circ \psi^{-1} : \mathbb{R}^n \to \mathbb{R}^n$ is an affine transformation, namely the composition of an element of GL(n) with a translation. This is closer to “the truth” than Definition 12.2.1.

There is a third possibility for defining affine spaces which seems to be much better than even the atlas approach. Affine spaces can be defined in a natural, intrinsic manner by constructing a four-way equivalence relation on the points of a set. Thus given a set $X$, a parallelism relation on the set $X \times X$ is defined to mean that $(P, Q) \sim (R, S)$ whenever the vector $\overrightarrow{PQ}$ is parallel to $\overrightarrow{RS}$.

The following table shows which kinds of transformations leave which relations invariant in point-and-line geometries.

transformation   preserved relations               group             DG concept
projective       coincidence of lines and points   rational linear   projective connection?
affine           parallelism of lines              general linear    affine connection
similarity       angles                            conformal         conformal connection?
congruence       distances                         orthogonal        Riemannian metric
Definition 12.2.1 follows EDM2 [34], article 7, by using a completely general linear space. It is similar to the definition by Weyl [50], section 2, which uses a general real linear space. Greenberg/Harper [113], section 8, page 41, use the linear space $\mathbb{R}^n$, which is probably not a good idea.

The important thing to note about Definition 12.2.1 is the fact that the linear space $V$ has no specified basis and no inner product or metric. The base space $X$ is given only algebraic structure by the linear space $V$. If the linear space $\mathbb{R}^n$ had been chosen instead of a general linear space, then there would have been ambiguity as to which properties were supposed to be inherited by the base space.

The approach taken in Definition 12.2.1 represents the parallelism structure of an affine space in terms of an associated linear space. It is also possible to take a more intrinsic approach whereby the parallelism is represented by an equivalence relation. There seem to be four sensible ways to define an affine space.
(i) An abstract set $X$ with vectors in an abstract linear space $V$ as in Definition 12.2.1.
(ii) The same approach as in Definition 12.2.1 except that instead of a single difference operation $\delta$, the set of all difference operations which are related by affine transformation to this is used.
(iii) Start with an abstract linear space $V$ or a concrete linear space such as $\mathbb{R}^n$ and remove unwanted properties by insisting on invariance under the group of affine transformations. This is the point of view of the Erlanger Programm.
(iv) An equivalence relation for point pairs in an abstract set $X$. This is a synthetic geometry approach.

12.2.1 Definition: An affine space over a linear space $V$ with field $K$ is a non-empty set $X$ together with a function $\delta : X \times X \to V$, denoted as the binary operation “ − ”, such that
(i) for all $Q \in X$, the function $\delta_1^Q : X \to V$ defined by $P \mapsto \delta(P, Q) = P - Q$ is a bijection,
(ii) for all $P \in X$, the function $\delta_2^P : X \to V$ defined by $Q \mapsto \delta(P, Q) = P - Q$ is a bijection,
(iii) for all $P, Q, R \in X$, if $P - Q = v$ and $Q - R = w$, then $P - R = v + w$.
The function $\sigma : X \times V \to X$, denoted as the binary operation “+”, is defined by
(iv) $\forall P, Q \in X$, $\sigma(Q, \delta(P, Q)) = P$. (I.e. $Q + (P - Q) = P$.)
The function $\sigma$ is called the affine structure for the affine space.
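A concrete model may help to separate the roles of points and vectors in Definition 12.2.1. The following Python sketch is a hedged illustration, not part of the definition: the classes `Point` and `Vector` are hypothetical, the point set is modelled by coordinate pairs, and only the operations permitted by the definition are implemented (point minus point, point plus vector, vector plus vector). Adding two points is deliberately not defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vector:
    """An element of the linear space V (modelled here by coordinate pairs)."""
    dx: float
    dy: float

    def __add__(self, other: "Vector") -> "Vector":
        # Vector addition in V.
        return Vector(self.dx + other.dx, self.dy + other.dy)


@dataclass(frozen=True)
class Point:
    """An element of the point set X (modelled here by coordinate pairs)."""
    x: float
    y: float

    def __sub__(self, other: "Point") -> Vector:
        # The difference operation delta(P, Q) = P - Q.
        return Vector(self.x - other.x, self.y - other.y)

    def __add__(self, v: Vector) -> "Point":
        # The affine structure sigma(Q, v) = Q + v.
        return Point(self.x + v.dx, self.y + v.dy)


# Hypothetical sample points.
P, Q, R = Point(5, 1), Point(2, 2), Point(-1, 4)

print((P - Q) + (Q - R) == P - R)  # condition (iii)
print(Q + (P - Q) == P)            # condition (iv)
```

In this model an expression such as `P + Q` raises an error, which mirrors the statement that an affine space has no addition of points.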
12.2. Affine space definitions
12.2.2 Remark: The transitivity condition (iii) means that $(P - Q) + (Q - R) = (P - R)$ for all $P, Q, R \in X$. (This is illustrated in Figure 12.2.1.)

[ Figure 12.2.1: Transitivity of affine space vectors, showing points P, Q, R with v = P − Q, w = Q − R and v + w = P − R. ]
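It follows that $P - P = 0$ for all $P \in X$. This can be seen by noting that $P - Q = (P - Q) + (Q - P) + (P - Q)$ by transitivity. Therefore $P - Q = -(Q - P)$. Consequently $P - P = -(P - P)$. (More directly, setting $Q = R = P$ in condition (iii) gives $(P - P) + (P - P) = P - P$, so $P - P = 0$.)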
Condition (iv) implies that the addition operation “ + ” is just a way of providing a convenient shorthand for the “ − ” operation. Thus Q + v really means “the unique point P ∈ X such that P − Q = v”.
The spaces $X$ and $V$ are the point space and the vector space respectively for the affine space $(X, V)$. This contrasts with the situation in linear spaces where points and vectors are the same thing. The distinction between points and vectors in affine spaces is useful as a precedent for dealing with tangent vectors for differentiable manifolds, where points and vectors are even more distinct than in the case of affine spaces.

12.2.3 Remark: The maps $\delta_1^Q$ and $\delta_2^P$ in Definition 12.2.1 are effectively manifold charts with a linearity constraint.

12.2.4 Notation: As a further convenience of notation, whenever the $r + 1$ numbers $a_0, a_1 \ldots a_r \in \mathbb{R}$ sum to 1 (i.e. $\sum_{i=0}^r a_i = 1$), where $r \in \mathbb{Z}_0^+$, and the points $P_0, P_1 \ldots P_r \in X$, the expression $\sum_{i=0}^r a_i P_i$ (or equivalently $a_0 P_0 + a_1 P_1 + \ldots a_r P_r$) is defined as $P_0 + \sum_{i=1}^r a_i (P_i - P_0)$.
In the special case r = 1, these convex combinations take the form (1 − t)P0 + tP1 = P0 + t(P1 − P0 ), where a0 = 1 − t, a1 = t and t ∈ IR.
12.2.5 Definition: The line through points P, Q ∈ X is the set {(1 − t)P + tQ; t ∈ IR}. The line segment through points P, Q ∈ X is the set {(1 − t)P + tQ; t ∈ [0, 1]}.
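The combination in Notation 12.2.4 and the line in Definition 12.2.5 can be evaluated directly from the formula $P_0 + \sum_{i=1}^r a_i (P_i - P_0)$. The sketch below is an illustration under assumptions: points are modelled by coordinate tuples, the function name `affine_combination` is hypothetical, and exact fractions are used so that the weights sum to 1 without rounding. It also checks on one example that the result does not depend on which point is listed first.

```python
from fractions import Fraction as F


def affine_combination(weights, points):
    """Return P0 + sum_{i>=1} a_i (P_i - P0) for weights a_i summing to 1."""
    if sum(weights) != 1:
        raise ValueError("weights must sum to 1")
    base = points[0]
    return tuple(
        base[k] + sum(a * (p[k] - base[k])
                      for a, p in zip(weights[1:], points[1:]))
        for k in range(len(base))
    )


# Hypothetical sample points in a two-dimensional affine space.
P0, P1, P2 = (0, 0), (4, 0), (0, 4)

centre = affine_combination([F(1, 2), F(1, 4), F(1, 4)], [P0, P1, P2])
# The same weights attached to the same points, listed in another order.
reordered = affine_combination([F(1, 4), F(1, 2), F(1, 4)], [P1, P0, P2])

# A point on the line through P0 and P1 with parameter t = 2,
# which lies outside the line segment since t is not in [0, 1].
t = 2
on_line = affine_combination([1 - t, t], [P0, P1])

print(centre, reordered, on_line)
```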
[ Should give some examples of affine spaces which are not linear spaces. A good example could be the set of points in IRn+1 which lie in any plane which does not pass through the zero vector. ] 12.2.6 Remark: EDM2 [34], article 7, summarizes the entire subject of affine geometry and affine spaces. Greenberg/Harper [113], section 8, page 41, gives a useful 2-page overview of affine space definitions. Weyl [50], section 2, gives a useful 6-page account in German. [ Define affine transformations and groups of affine transformations. These may be expressed in terms of the tangent bundle on IRn . Give all affine space definitions also in terms of the tangent bundle T (IRn ). ]
12.3. Affine transformation groups
[ This section should present the standard “inhomogeneous” linear transformation groups which permit a translation of some sort. These should be presented both as abstract linear space groups and as matrix groups. Include inhomogeneous Lorentz transformations in this section. Maybe projective or rational linear groups could be included also? Both affine and rational linear transformations may be represented as square matrices with an additional row or column which is treated differently. ]
12.4. Euclidean spaces

12.4.1 Remark: Euclidean spaces are defined in many different ways including the following. In parentheses are possible names for some of these concepts and constructions. Some specification tuples could be given for all of these, but that would be taking it too seriously. The point being made here is that there is no single “standard definition” of a Euclidean space. There are many standards to choose from.
(i) The linear space $\mathbb{R}^n$ with no metric or inner product. (Euclidean linear space?)
(ii) The linear space $\mathbb{R}^n$ with a metric but no inner product. (Euclidean metric space?)
As a result of the wide variety of meanings in the literature, it is difficult to say exactly how much algebraic, geometric or analytic structure is referred to when someone says “Euclidean space”. But ambiguity turns to absurdity when a notation such as “$E^n$” is proposed for one or more of these Euclidean spaces. When the notation “$E^n$” is used in elementary texts it is supposedly to remind the reader to think of $\mathbb{R}^n$ with some range of usual “Euclidean” structures attached to it. However, such an “$E^n$” notation is not the nth power of anything, and certainly not the nth power of $E$, whatever $E$ is. This kind of pseudo-notation should be replaced with meaningful notation wherever it is found. (An even more absurd notation is “$M^n$” for an n-dimensional manifold. Teachers do more harm than good by using such meaningless notation because it forces students to abandon logical thinking.)
[ Maybe present here some standardized definitions of Euclidean space? Could maybe define many kinds of Euclidean space, such as “Euclidean affine spaces”, “Euclidean linear spaces” and “Euclidean topological spaces”. ]
(iii) The linear space $\mathbb{R}^n$ with a norm and an inner product. (Euclidean inner product space?)
(iv) A linear space of n dimensions, namely a linear space with the absolute origin and axis directions removed. (This can be done by starting with an abstract set and adding to this an n-dimensional chart and the set of all n-dimensional charts which are related to this by isometries.) This is the space of Euclidean geometry.
(v) A linear space of n dimensions together with a tangent bundle.
(vi) A linear space of n dimensions together with the usual tangent bundle, Riemannian metric and Levi-Civita connection.
(vii) Affine space of n dimensions. (Euclidean affine space?)
(viii) The set $\mathbb{R}^n$ together with the usual flat affine connection.
(ix) The set $\mathbb{R}^n$ together with its usual topology. (Euclidean topological space?)
(x) Any topological space which is homeomorphic to $\mathbb{R}^n$ with the usual topology.
(xi) The set $\mathbb{R}^n$ together with its usual differentiable structure.
(xii) Any topological space which is diffeomorphic to $\mathbb{R}^n$ with the usual differentiable structure.
(xiii) The set $\mathbb{R}^n$ together with its usual Lebesgue measure.
Chapter 13 Tensor algebra
13.1 Multilinear maps
13.2 Linear spaces of multilinear maps
13.3 Symmetric and antisymmetric multilinear maps
13.4 Tensor product metadefinition
13.5 Tensor products of linear spaces
13.6 Covariant tensors
13.7 Mixed tensors
13.8 General tensor algebra
13.9 Alternating tensors
13.10 Alternating tensor algebra
13.11 Tensor products defined via free linear spaces
13.12 Tensor products defined via lists of tensor monomials
13.0.1 Remark: A tensor is the “multilinear effect” of a sequence of vectors. For example, if $v_1 \in V_1$ and $v_2 \in V_2$ are vectors in real linear spaces $V_1$ and $V_2$, the tensor product $v_1 \otimes v_2$ is defined as the map $v_1 \otimes v_2 : f \mapsto f(v_1, v_2)$ evaluated on the space of bilinear functions $f : V_1 \times V_2 \to \mathbb{R}$. In other words, $v_1 \otimes v_2$ is the effect of the pair $(v_1, v_2)$ on bilinear functions $f : V_1 \times V_2 \to \mathbb{R}$. Thus $v_1 \otimes v_2$ is the “bilinear effect” of the pair $(v_1, v_2)$.

More generally, if $(v_i)_{i=1}^m \in \times_{i=1}^m V_i$ is a sequence of $m$ vectors in linear spaces $V_i$ over a field $K$, the tensor product $v_1 \otimes \ldots v_m$ is defined as the effect $f \mapsto f(v_1, \ldots v_m)$ of the sequence $(v_1, \ldots v_m)$ on multilinear functions $f : \times_{i=1}^m V_i \to K$.
More precisely, a tensor is an arbitrary sum of multilinear effects of sequences of vectors. Each individual sequence of vectors $(v_1, \ldots v_m)$ has a multilinear effect which is a map from functions $f : \times_{i=1}^m V_i \to K$ to values $f(v_1, \ldots v_m) \in K$. But the set of these maps forms a linear space over $K$ with respect to pointwise addition and scalar multiplication. It is this linear space of arbitrary sums of maps which is called a “tensor space”. The map for any individual sequence of vectors is called a “tensor monomial”.

13.0.2 Remark: The construction of tensors from linear spaces is similar to the way a (finite-dimensional) linear space may be reconstructed from its dual space. A vector $v$ in a linear space $V$ over a field $K$ may be thought of as the “linear effect” of $v$ on the dual space $V^*$ of linear functionals on $V$. That is, the value $\lambda(v)$ for $v \in V$ and $\lambda \in V^*$ may be regarded as a map $\mu(v) : V^* \to K$ defined by $\mu(v) : \lambda \mapsto \lambda(v)$. Thus $\mu : V \to V^{**}$ is a linear space isomorphism.

It may seem useless to replace a linear space with the dual of its dual space. Nothing is gained in the case of linear functionals. But in the case of multilinear functionals $f : \times_{i=1}^m V_i \to \mathbb{R}$ on more than one linear space, the “dual of the dual” is not isomorphic to the original space $V$, nor to the Cartesian product $\times_{i=1}^m V_i$. The construction method for tensors is the same as for the dual of the dual, but the constructed space has dimension $\prod_{i=1}^m \dim(V_i)$, which is in general different to the dimension
This chapter presents the algebra of tensor spaces. Topics such as differential forms (Section 20.5) and the exterior derivative (Section 20.6) require differential and integral calculus for their treatment. Therefore these analytic topics are delayed until Chapter 20. This chapter is limited to abstract linear spaces.
$\sum_{i=1}^m \dim(V_i) = \dim(\times_{i=1}^m V_i)$. Therefore the tensor product space is not isomorphic to the Cartesian product. It is a very different kind of product with a very different algebra.
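The difference between the two dimension counts can be seen by listing basis elements. The Python sketch below is illustrative only: it assumes $V_1 = \mathbb{R}^2$ and $V_2 = \mathbb{R}^3$, represents a tensor monomial by its array of components $u_i v_j$ relative to the standard bases, and relies on the standard fact that the monomials of basis vectors form a basis of the tensor product space. The function names are hypothetical.

```python
def standard_basis(n):
    """Return the standard basis of R^n as a list of tuples."""
    return [tuple(1 if i == k else 0 for i in range(n)) for k in range(n)]


def tensor_monomial(u, v):
    """Return the component array (u_i * v_j) of the monomial u (x) v."""
    return tuple(tuple(ui * vj for vj in v) for ui in u)


basis_v1 = standard_basis(2)  # V1 = R^2 (assumed for illustration)
basis_v2 = standard_basis(3)  # V2 = R^3 (assumed for illustration)

# One monomial for each pair of basis vectors (e_i, f_j).
tensor_basis = [tensor_monomial(e, f) for e in basis_v1 for f in basis_v2]

tensor_dim = len(tensor_basis)                 # product of the dimensions
cartesian_dim = len(basis_v1) + len(basis_v2)  # sum of the dimensions
print(tensor_dim, cartesian_dim)
```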
13.0.3 Remark: Tensors are not quite as perplexing as they seem at first. We are used to thinking of the ratios 3/4 and 6/8 as exactly the same thing. In the same way, the tensor products $(0, 2) \otimes (3, 4)$ and $(0, 1) \otimes (6, 8)$ are exactly the same thing. Any bilinear function acting on these vector pairs will give the same result. This is like saying that if you multiply by 3 and divide by 4, you get the same “effect” as multiplying by 6 and dividing by 8. We could say that 3/4 and 6/8 have the same “fractional effect”.

13.0.4 Remark: An alternating tensor is the “antisymmetric multilinear effect” of a sequence of vectors. Alternating tensors arise naturally as line elements, area elements and volume elements in multi-dimensional integration theory. A symmetric tensor is the symmetric multilinear effect of a sequence of vectors. However, there is (probably) no real need to develop the theory of symmetric tensors because the algebra of symmetric tensors is not very interesting.

13.0.5 Remark: The set of “multilinear effects” of vector sequences is not closed under addition. The closure of this set under addition is called a “tensor (product) space”.

13.0.6 Remark: The subject of tensors may be thought of as “multilinear algebra”. It seems to have started with Hermann Günther Grassmann in the middle of the 19th century. The topic of alternating tensors is called exterior algebra, exterior calculus or Grassmann algebra. (See Federer [105], page 8 and Frankel [18], page 66 for notes on Grassmann as the originator of the exterior calculus.)

13.0.7 Remark: Three ways of defining tensor products are presented in this chapter.
(i) Characterization in terms of a multilinear map (Metadefinition 13.4.1).
(ii) The dual of the linear space of multilinear maps (Definition 13.5.2).
Metadefinition 13.4.1 defines tensor space representations in general. Definitions 13.5.2 (dual of multilinear maps) and 13.11.1 (quotient of free linear space) are particular tensor space representations.
All tensor space representations (for a particular sequence of linear spaces) are related by a unique isomorphism. Therefore calculations in all representations give the same answers. The appearance of multiple representations in the literature is not peculiar to tensors. For example, the real numbers may be represented as either Dedekind cuts, equivalence classes of Cauchy sequences, or binary expansions. Tangent vectors on manifolds may be represented as coordinates, differential operators or equivalence classes of differentiable curves. The acceptance of multiple representations may be thought of as “outsourcing” the construction of mathematical systems to multiple definition providers subject to interoperability standards. [ The word “tensor” was possibly introduced by Hamilton in 1846. Check this. ] 13.0.8 Remark: Some useful references for this chapter are Frankel [18], sections 2.4–2.6, Federer [105], chapter 1, Warner [49], chapter 2, EDM2 [34], article 256.I–O, Gallot/Hulin/Lafontaine [19], page 36, and Crampin/Pirani [11], page 101.
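The statements in Remarks 13.0.3 and 13.0.5 can be tried out on component arrays. The sketch below is a hedged illustration: a tensor monomial $u \otimes v$ on $\mathbb{R}^2 \times \mathbb{R}^2$ is represented by its components $u_i v_j$, the bilinear function is given by an arbitrarily chosen coefficient matrix `M`, and all function names are hypothetical. The last part uses the fact that a 2 × 2 component array of a single monomial always has zero determinant, so an array with non-zero determinant cannot be a single monomial.

```python
def tensor_monomial(u, v):
    """Return the component array (u_i * v_j) of the monomial u (x) v."""
    return [[ui * vj for vj in v] for ui in u]


def bilinear(m, u, v):
    """Evaluate the bilinear function f(u, v) = sum_{i,j} m[i][j] u_i v_j."""
    return sum(m[i][j] * u[i] * v[j]
               for i in range(len(u)) for j in range(len(v)))


def add(a, b):
    """Add two component arrays entry by entry."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def det2(a):
    """Return the determinant of a 2 x 2 array."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


# The two monomials of Remark 13.0.3.
t1 = tensor_monomial((0, 2), (3, 4))
t2 = tensor_monomial((0, 1), (6, 8))

# An arbitrarily chosen bilinear function (hypothetical coefficients).
M = [[1, -2], [5, 7]]
f1 = bilinear(M, (0, 2), (3, 4))
f2 = bilinear(M, (0, 1), (6, 8))
print(t1 == t2, f1 == f2)

# A sum of two monomials which is not itself a monomial (Remark 13.0.5).
s = add(tensor_monomial((1, 0), (1, 0)), tensor_monomial((0, 1), (0, 1)))
print(det2(t1), det2(s))
```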
13.1. Multilinear maps
A multilinear map is a vector-valued function of multiple vector variables which depends linearly on each variable individually.

13.1.1 Remark: Definition 13.1.3 means that $f : \times_{\alpha \in A} V_\alpha \to U$ is multilinear if $f$ is linear with respect to each space $V_\beta$ individually for fixed values of $v_\alpha \in V_\alpha$ for indices $\alpha \in A$ with $\alpha \ne \beta$. This requirement for linearity with respect to each variable for fixed values of other variables is similar to the definition of partial derivatives, which also require all but one of the variables to be fixed.
(iii) The quotient of a free linear space by a set of multilinear identities (Definition 13.11.1).
13.1. Multilinear maps
327
13.1.2 Remark: Definition 13.1.3 uses the “substitution operator” notation which defines $E(x)\big|_{x=V}$ to mean the substitution of expression $V$ into expression $E(x)$ wherever the variable $x$ occurs. This is a text-level metanotational convention. Probability theory uses a similar convention as in $\mathrm{Prob}(X|E)$ for logical expressions $E$. The notation $\{x \in S; P(x)\}$ for the restriction of a set $S$ by a condition $P$ is a comparable text-level metanotation in which a proposition appears.

13.1.3 Definition: A multilinear map from a Cartesian product $\times_{\alpha\in A} V_\alpha$ of linear spaces $(V_\alpha)_{\alpha\in A}$ over a field $K$ to a linear space $U$ over $K$ is a map $f : \times_{\alpha\in A} V_\alpha \to U$ such that
\[
\forall v \in \times_{\alpha\in A} V_\alpha,\ \forall \beta \in A,\ \forall w_1, w_2 \in V_\beta,\ \forall \lambda_1, \lambda_2 \in K,\quad
f\bigl(v\big|_{v_\beta = \lambda_1 w_1 + \lambda_2 w_2}\bigr)
= \lambda_1 f\bigl(v\big|_{v_\beta = w_1}\bigr) + \lambda_2 f\bigl(v\big|_{v_\beta = w_2}\bigr).
\tag{13.1.1}
\]
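The substitution form of condition (13.1.1) can be tested numerically. The following is a minimal sketch, assuming one-dimensional spaces $V_\alpha = \mathbb{Q}$ represented by Python `Fraction` values; the helper names `substitute` and `is_multilinear_on_samples` are hypothetical and are not part of this book. A finite sample check can refute multilinearity but cannot prove it.

```python
from fractions import Fraction
from itertools import product


def substitute(v, beta, w):
    """Return the tuple v with component number beta replaced by w."""
    return v[:beta] + (w,) + v[beta + 1:]


def is_multilinear_on_samples(f, arity, samples):
    """Test condition (13.1.1) for every slot beta and all sample values."""
    for v in product(samples, repeat=arity):
        for beta in range(arity):
            for w1, w2, l1, l2 in product(samples, repeat=4):
                lhs = f(substitute(v, beta, l1 * w1 + l2 * w2))
                rhs = (l1 * f(substitute(v, beta, w1))
                       + l2 * f(substitute(v, beta, w2)))
                if lhs != rhs:
                    return False
    return True


samples = [Fraction(-2), Fraction(0), Fraction(1, 3), Fraction(5)]


def trilinear(v):
    """A trilinear map on Q x Q x Q (illustrative choice)."""
    return 7 * v[0] * v[1] * v[2]


def additive(v):
    """Linear on the Cartesian product, but not multilinear."""
    return v[0] + v[1] + v[2]


result_trilinear = is_multilinear_on_samples(trilinear, 3, samples)
result_additive = is_multilinear_on_samples(additive, 3, samples)
print(result_trilinear, result_additive)
```

The second function is included to show that linearity on the Cartesian product and multilinearity are different conditions.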
13.1.4 Remark: The linearity of a multilinear map with respect to all component spaces of its domain is illustrated in Figure 13.1.1.

[Figure 13.1.1: Linearity of a multilinear function with respect to domain components. The values $f(w_1, v_2, v_3)$, $f(v_1, w_2, v_3)$ and $f(v_1, v_2, w_3)$ in $U$ each depend linearly on the argument varied in $V_1$, $V_2$ and $V_3$ respectively.]
When $\#(A) = 1$ in Definition 13.1.3, the map $f$ is linear. When $\#(A) = 2$, the map is said to be bilinear. When $\#(A) = 3$, the map is said to be trilinear, and so forth. Figure 13.1.2 gives an alternative style of visualization for the definition of multilinear functions. It is important to keep in mind that the “axes” for the Cartesian product $V_1 \times V_2 \times V_3$ represent the entire spaces $V_1$, $V_2$ and $V_3$, not one-dimensional subspaces of these spaces.

[Figure 13.1.2: Definition of multilinear function. A trilinear map $f \in \mathscr{L}(V_1, V_2, V_3; U)$, $f : V_1 \times V_2 \times V_3 \to U$, is linear along each of the three “axes” through a point $(v_1, v_2, v_3) \in V_1 \times V_2 \times V_3$.]
13.1.5 Notation: $\mathscr{L}((V_\alpha)_{\alpha\in A}; U)$ for a family $(V_\alpha)_{\alpha\in A}$ of linear spaces over a field $K$ and a linear space $U$ over $K$ denotes the set of multilinear maps from $\times_{\alpha\in A} V_\alpha$ to $U$.

13.1.6 Notation: $\mathscr{L}(V_1, \ldots V_m; U)$ for a sequence of $m \in \mathbb{Z}_0^+$ linear spaces $(V_i)_{i=1}^m$ and a linear space $U$ over a field $K$ denotes the set $\mathscr{L}((V_\alpha)_{\alpha\in A}; U)$ with $A = \mathbb{N}_m = \{1, \ldots m\}$.

13.1.7 Notation: $\mathscr{L}_m(V, U)$ for $m \in \mathbb{Z}_0^+$ and linear spaces $V$ and $U$ over a field $K$ denotes the set of multilinear maps $\mathscr{L}((V_\alpha)_{\alpha\in A}; U)$ with $A = \mathbb{N}_m$ and $V_\alpha = V$ for all $\alpha \in A$.
13.1.8 Remark: It follows from Definition 13.1.3 that the value of a multilinear function is zero if any of its arguments is zero. (See Theorem 13.1.9 (i).) This is a consequence of the multiplicative nature of the tensor product space in contrast to the additive nature of the Cartesian product space.
The set of multilinear maps $\mathscr{L}(V; U)$ is the same thing as the set of linear space homomorphisms $\mathrm{Hom}(V, U)$ defined in Section 10.3. If $U$ is the field $K$ regarded as a linear space over $K$, then $\mathscr{L}(V; K) = \mathrm{Hom}(V, K)$ is the linear space dual $V^*$ of $V$ as in Section 10.5.

13.1.9 Theorem: Let $(V_\alpha)_{\alpha\in A}$ be a finite family of linear spaces over a field $K$ and $U$ be a linear space over $K$. Then:
(i) $\forall f \in \mathscr{L}((V_\alpha)_{\alpha\in A}; U),\ \forall \beta \in A,\ \forall v \in \times_{\alpha\in A} V_\alpha,\ (v_\beta = 0 \Rightarrow f(v) = 0)$.

Proof: For (i), let $v = (v_\alpha)_{\alpha\in A}$ with $v_\beta = 0$ for some $\beta \in A$. In Definition 13.1.3, let $w_1 = w_2 = 0$ and $\lambda_1 = \lambda_2 = 0$. Then line (13.1.1) gives $f(v) = 0.f(v) + 0.f(v) = 0$.

13.1.10 Example: Figure 13.1.3 illustrates a bilinear function $f$ from linear spaces $V_1 = V_2 = \mathbb{R}^1$ to the linear space $U = \mathbb{R}^1$.

[Figure 13.1.3: Bilinear function $f : \mathbb{R}^1 \times \mathbb{R}^1 \to \mathbb{R}^1$. Contour curves of $f(x_1, x_2) = x_1 x_2$ in the $(x_1, x_2)$ plane for $-2 \le x_1, x_2 \le 2$, including the levels $f(x_1, x_2) = -1$, $f(x_1, x_2) = 0$ and $f(x_1, x_2) = 1$.]
All multilinear functions from $\mathbb{R}^1 \times \mathbb{R}^1$ to $\mathbb{R}^1$ are of the form $f(x_1, x_2) = k x_1 x_2$ for some $k \in \mathbb{R}$. The diagram uses $k = 1$. It is clear from the contour curves that $f$ is not linear. A linear function of two variables would have straight lines for contours. However, the value of $f(x_1, x_2)$ for a constant value of $x_1$ clearly varies linearly with respect to $x_2$.

13.1.11 Remark: Theorem 13.1.12 re-expresses Definition 13.1.3 in terms of a logical expression which does not use the substitution metanotation in Remark 13.1.2. However, expression (13.1.2) is probably less easy to interpret than (13.1.1).

13.1.12 Theorem: A map $f$ from a Cartesian product $\times_{\alpha\in A} V_\alpha$ of linear spaces $(V_\alpha)_{\alpha\in A}$ over a field $K$ to a linear space $U$ over $K$ is multilinear if and only if
\[
\forall \beta \in A,\ \forall \lambda_1, \lambda_2 \in K,\ \forall u, v, w \in \times_{\alpha\in A} V_\alpha,\quad
\bigl(u_\beta = \lambda_1 v_\beta + \lambda_2 w_\beta \text{ and } \forall \alpha \in A \setminus \{\beta\},\ u_\alpha = v_\alpha = w_\alpha\bigr)
\Rightarrow f(u) = \lambda_1 f(v) + \lambda_2 f(w).
\tag{13.1.2}
\]
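Condition (13.1.2) and the observation in Example 13.1.10 that a bilinear function is not linear can both be checked on sample points. This is a rough sketch only, assuming rational sample values in place of $\mathbb{R}$ and $k = 1$; the function names are illustrative and do not come from this book.

```python
from fractions import Fraction
from itertools import product


def f(x, k=1):
    """The bilinear function f(x1, x2) = k*x1*x2 of Example 13.1.10."""
    return k * x[0] * x[1]


def satisfies_13_1_2(func, samples):
    """Test (13.1.2): u, v, w agree outside slot beta, u_beta = l1*v_beta + l2*w_beta."""
    for beta in (0, 1):
        for l1, l2, vb, wb, shared in product(samples, repeat=5):
            def point(xb):
                # Put xb in slot beta and the shared value in the other slot.
                return (xb, shared) if beta == 0 else (shared, xb)
            u, v, w = point(l1 * vb + l2 * wb), point(vb), point(wb)
            if func(u) != l1 * func(v) + l2 * func(w):
                return False
    return True


def is_additive_on_pairs(func, samples):
    """Test joint additivity func(x + y) == func(x) + func(y) on Q^2."""
    points = list(product(samples, repeat=2))
    return all(func((x[0] + y[0], x[1] + y[1])) == func(x) + func(y)
               for x in points for y in points)


samples = [Fraction(-1), Fraction(0), Fraction(2), Fraction(1, 2)]
bilinear_ok = satisfies_13_1_2(f, samples)
jointly_linear = is_additive_on_pairs(f, samples)
print(bilinear_ok, jointly_linear)
```

The failure of joint additivity corresponds to the curved contours in Figure 13.1.3.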
13.1.13 Remark: Linear homomorphisms are defined for a broader class than linear spaces. For example, maps between modules over a commutative ring have a well-defined concept of linearity. (For such maps, see Definition 9.9.25.) Multilinearity of maps is well-defined for these more general classes. So tensor spaces are well-defined for these classes also. (See for example Bump [96], chapter 9.)

13.1.14 Remark: In Section 11.2, the correspondence between matrices and linear maps was presented. There is also a useful correspondence between matrices and bilinear functions on a linear space.
Let $V$ be an $n$-dimensional linear space over a field $K$ with basis $(e_i)_{i=1}^n \in V^n$. Let $\alpha : V \times V \to K$ be a bilinear map on $V$. For $i, j \in \mathbb{N}_n$, let $a_{ij} = \alpha(e_i, e_j)$. The matrix $a \in M_{n,n}(K)$ then satisfies
\[
\forall v, w \in V,\quad
\alpha(v, w) = \alpha\Bigl(\sum_{i=1}^n v_i e_i,\ \sum_{j=1}^n w_j e_j\Bigr)
= \sum_{i=1}^n \sum_{j=1}^n v_i w_j\, \alpha(e_i, e_j)
= \sum_{i,j=1}^n a_{ij} v_i w_j,
\]
where $v = \sum_{i=1}^n v_i e_i$ and $w = \sum_{j=1}^n w_j e_j$ express $v$ and $w$ in terms of the given basis.
Conversely, for any matrix $a \in M_{n,n}(K)$, the map $(v, w) \mapsto \sum_{i,j=1}^n a_{ij} v_i w_j$ for $v, w \in V$ defines a bilinear map on $V$. The correspondence between the matrices and the bilinear maps is one-to-one and onto. Consequently matrices offer an equivalent way of expressing bilinear maps. (This is illustrated in Figure 13.1.4, which is similar to Figure 11.2.1 but not the same.)

[Figure 13.1.4: Components of a bilinear map with respect to a basis. The component map $\kappa : V \to K^n$ sends vectors $v$, $w$ in the linear space to tuples $(v_i)_{i=1}^n$, $(w_j)_{j=1}^n$ in the tuple space. The bilinear map $\beta : V \times V \to K$ then corresponds to the bilinear form $\sum_{i,j=1}^n b_{ij} v_i w_j$ on $K^n \times K^n$, with value $\beta(v, w)$ in the field $K$.]
The matrix representation requires a basis and the matrix is different for each basis. Nevertheless the matrix representation is the most usual way of specifying bilinear maps in practical applications.
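The correspondence of Remark 13.1.14 between bilinear maps and matrices can be sketched as follows. The sketch assumes $V = \mathbb{Q}^3$ with the standard basis, vectors stored as coordinate tuples, and an arbitrarily chosen bilinear map `alpha`; the names `matrix_of` and `bilinear_from` are hypothetical helpers, not notation from this book.

```python
def matrix_of(alpha, basis):
    """Matrix a with entries a[i][j] = alpha(e_i, e_j)."""
    return [[alpha(ei, ej) for ej in basis] for ei in basis]


def bilinear_from(a):
    """Bilinear map (v, w) -> sum over i, j of a[i][j] * v[i] * w[j]."""
    n = len(a)

    def alpha(v, w):
        return sum(a[i][j] * v[i] * w[j] for i in range(n) for j in range(n))

    return alpha


standard_basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]


def alpha(v, w):
    """An illustrative bilinear map on Q^3, given directly by a formula."""
    return 2 * v[0] * w[0] - v[0] * w[2] + 5 * v[1] * w[2] + 3 * v[2] * w[1]


a = matrix_of(alpha, standard_basis)
rebuilt = bilinear_from(a)

v, w = (1, -2, 4), (3, 0, 7)
print(a)
print(alpha(v, w), rebuilt(v, w))
```

Passing a different basis to `matrix_of` would yield a different matrix for the same map, which is the basis dependence noted above. In that case `bilinear_from` must be applied to coordinates with respect to that basis.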
13.2. Linear spaces of multilinear maps

13.2.1 Remark: The pointwise addition and scalar multiplication operations for $\mathscr{L}((V_\alpha)_{\alpha\in A}; U)$ are defined so that $(\lambda_1 f_1 + \lambda_2 f_2)((v_\alpha)_{\alpha\in A}) = \lambda_1 f_1((v_\alpha)_{\alpha\in A}) + \lambda_2 f_2((v_\alpha)_{\alpha\in A})$ for all $f_1, f_2 \in \mathscr{L}((V_\alpha)_{\alpha\in A}; U)$, $\lambda_1, \lambda_2 \in K$ and $(v_\alpha)_{\alpha\in A} \in \times_{\alpha\in A} V_\alpha$.

13.2.2 Theorem: The set $\mathscr{L}((V_\alpha)_{\alpha\in A}; U)$ is closed under pointwise addition and scalar multiplication.

Proof: See Exercise 47.6.1.
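13.2.3 Definition: The linear space of multilinear maps from the family $(V_\alpha)_{\alpha\in A}$ of linear spaces over a field $K$ to a linear space $U$ over $K$ is the set $\mathscr{L}((V_\alpha)_{\alpha\in A}; U)$ together with the operations of pointwise vector addition and scalar multiplication.

13.2.4 Remark: Strictly speaking, the linear space of multilinear maps in Definition 13.2.3 is the tuple $\mathscr{L}((V_\alpha)_{\alpha\in A}; U)$ −< $(K, \bar V, \sigma_K, \tau_K, \sigma_{\bar V}, \mu_K)$, where $K$ −< $(K, \sigma_K, \tau_K)$ is the field, $\bar V = \mathscr{L}((V_\alpha)_{\alpha\in A}; U)$ is the set of maps, $\sigma_{\bar V} : \bar V \times \bar V \to \bar V$ denotes pointwise addition on $\bar V$, and $\mu_K : K \times \bar V \to \bar V$ denotes pointwise multiplication of elements of $\bar V$ by elements of $K$. (See Definition 10.1.2 for linear spaces.)

13.2.5 Remark: Let $(e_{\alpha,i})_{i=1}^{n_\alpha}$ be a basis for the linear space $V_\alpha$ for $\alpha \in A$, so that $n_\alpha = \dim(V_\alpha)$ for $\alpha \in A$. Let $c : \times_{\alpha\in A} \mathbb{N}_{n_\alpha} \to K$ be a family of coefficients in the field $K$. (Thus $c(j) \in K$ for families of indices $j = (j_\alpha)_{\alpha\in A} \in \times_{\alpha\in A} \mathbb{N}_{n_\alpha}$.)
Let $(v_\alpha^i)_{i=1}^{n_\alpha}$ be the coefficients of $v_\alpha$ with respect to $(e_{\alpha,i})_{i=1}^{n_\alpha}$ for $\alpha \in A$. Thus $v_\alpha = \sum_{i=1}^{n_\alpha} v_\alpha^i e_{\alpha,i}$ for $\alpha \in A$. Define the map $f_c : \times_{\alpha\in A} V_\alpha \to K$ by
\[
f_c : (v_\alpha)_{\alpha\in A} \mapsto \sum_{j \in \times_{\alpha\in A} \mathbb{N}_{n_\alpha}} c(j) \prod_{\alpha\in A} v_\alpha^{j_\alpha}.
\]
Then $f_c \in \mathscr{L}((V_\alpha)_{\alpha\in A}; K)$.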
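The construction of $f_c$ in Remark 13.2.5 can be tried out on a small case. This is a minimal sketch, assuming two spaces of dimensions 2 and 3 with standard bases, 0-based indices in place of $\mathbb{N}_{n_\alpha}$, and an arbitrarily chosen coefficient family `c`; the helper name `make_fc` is hypothetical. It checks numerically that $f_c$ applied to basis vectors returns the coefficients, which is not a proof of that property.

```python
from itertools import product
from math import prod


def make_fc(c, dims):
    """Build f_c from a coefficient dict c keyed by 0-based index tuples."""
    def fc(vectors):
        return sum(
            c.get(j, 0) * prod(vectors[a][j[a]] for a in range(len(dims)))
            for j in product(*(range(n) for n in dims))
        )
    return fc


dims = (2, 3)                                  # n_alpha for the two spaces
c = {(0, 0): 1, (0, 2): -4, (1, 1): 5}         # all other coefficients are zero
fc = make_fc(c, dims)

# Standard basis vectors e_{alpha,k} as coordinate tuples.
basis = [[tuple(int(i == k) for i in range(n)) for k in range(n)] for n in dims]

# Evaluate f_c on every family of basis vectors and compare with c.
recovered = {
    j: fc(tuple(basis[a][j[a]] for a in range(len(dims))))
    for j in product(*(range(n) for n in dims))
}
print(fc(((2, 1), (1, 0, 3))))
print(all(recovered[j] == c.get(j, 0) for j in recovered), len(recovered))
```

The number of index families, here $2 \times 3 = 6$, is the number of independent coefficients available for such maps.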
[ Maybe show that $f_c((e_{\alpha,I(\alpha)})_{\alpha\in A}) = c((I(\alpha))_{\alpha\in A})$ for $I \in \times_{\alpha\in A} \mathbb{N}_{n_\alpha}$? ]
[ Show how Remark 13.2.5 specializes to $A = \{1, 2\}$. Also comment on how this embeds the dual of the dual in the primal space. ]
[ Can Remark 13.2.5 be extended to target spaces $U \ne K$? Show that the above gives a basis for the space of multilinear maps. Deduce the dimension $\dim(\mathscr{L}((V_\alpha)_{\alpha\in A}; \mathbb{R})) = \prod_{\alpha\in A} \dim(V_\alpha)$ from this (or otherwise). ]
[ Theorem 13.2.6 is closely related to Theorem 13.5.17 for the tensor product space. ]

13.2.6 Theorem: Let $(V_\alpha)_{\alpha\in A}$ be a finite family of finite-dimensional linear spaces over a field $K$. Then
\[
\dim \mathscr{L}((V_\alpha)_{\alpha\in A}; K) = \prod_{\alpha\in A} \dim(V_\alpha).
\]

[ Show a canonical isomorphism $\mathscr{L}((V_\alpha^*)_{\alpha\in A}; K)^* \cong \mathscr{L}((V_\alpha)_{\alpha\in A}; K)$. Frankel [18] gives $\mathscr{L}((V_\alpha^*)_{\alpha\in A}; K)$ as the definition of the space of contravariant tensors and $\mathscr{L}((V_\alpha)_{\alpha\in A}; K)$ as the space of covariant tensors. See comment after Theorem 13.6.3. ]

13.3. Symmetric and antisymmetric multilinear maps

13.3.1 Remark: In Definitions 13.3.2 and 13.3.3 for symmetric and antisymmetric multilinear maps, recall that a permutation of a finite set $A$ is any bijection $P : A \to A$. (See Definition 7.10.3.) When requiring a symmetry for all permutations, all linear spaces in the product must be the same.

13.3.2 Definition: A symmetric multilinear map from a Cartesian product $V^m$ of a linear space $V$ over a field $K$, for $m \in \mathbb{Z}_0^+$, to a linear space $U$ over $K$ is a multilinear map $f \in \mathscr{L}_m(V, U)$ such that $f(\mathrm{swap}_{j,k}(v)) = f(v)$ for all $v = (v_i)_{i=1}^m \in V^m$ and $j, k \in \mathbb{N}_m$.

13.3.3 Definition: An antisymmetric multilinear map from a Cartesian product $V^m$ of a linear space $V$ over a field $K$, for $m \in \mathbb{Z}_0^+$, to a linear space $U$ over $K$ is a multilinear map $f \in \mathscr{L}_m(V, U)$ such that $f(\mathrm{swap}_{j,k}(v)) = -f(v)$ for all $v = (v_i)_{i=1}^m \in V^m$ and $j, k \in \mathbb{N}_m$ such that $j \ne k$.
An antisymmetric multilinear map is also called an alternating multilinear map.

13.3.4 Notation: $\mathscr{L}_m^+(V, U)$ for $m \in \mathbb{Z}_0^+$ and linear spaces $U$ and $V$ denotes the set of symmetric multilinear maps from $V^m$ to $U$. Thus
\[
\mathscr{L}_m^+(V, U) = \{f \in \mathscr{L}_m(V, U);\ \forall j, k \in \mathbb{N}_m,\ f \circ \mathrm{swap}_{j,k} = f\}.
\]

13.3.5 Notation: $\mathscr{L}_m^-(V, U)$ for $m \in \mathbb{Z}_0^+$ and linear spaces $U$ and $V$ denotes the set of antisymmetric multilinear maps from $V^m$ to $U$. Thus
\[
\mathscr{L}_m^-(V, U) = \{f \in \mathscr{L}_m(V, U);\ \forall j, k \in \mathbb{N}_m,\ (j \ne k \Rightarrow f \circ \mathrm{swap}_{j,k} = -f)\}.
\]
13.3.6 Theorem: The set $\mathscr{L}_m^+(V; U)$ is closed under pointwise addition and scalar multiplication.

Proof: See Exercise 47.6.2.

13.3.7 Theorem: The set $\mathscr{L}_m^-(V; U)$ is closed under pointwise addition and scalar multiplication.

Proof: See Exercise 47.6.3.

13.3.8 Theorem: Let $f \in \mathscr{L}_m^+(V, U)$ for $m \in \mathbb{Z}_0^+$ and linear spaces $V$ and $U$ over a field $K$. Then $f((v_{P(i)})_{i=1}^m) = f((v_i)_{i=1}^m)$ for all permutations $P : \mathbb{N}_m \to \mathbb{N}_m$ and $v = (v_i)_{i=1}^m \in V^m$. In other words,
\[
\forall f \in \mathscr{L}_m^+(V, U),\ \forall P \in \mathrm{perm}(\mathbb{N}_m),\ \forall v \in V^m,\quad f(v \circ P) = f(v).
\]

13.3.9 Theorem: Let $f \in \mathscr{L}_m^-(V, U)$ for $m \in \mathbb{Z}_0^+$ and linear spaces $V$ and $U$ over a field $K$. Then $f((v_{P(i)})_{i=1}^m) = \mathrm{parity}(P) f((v_i)_{i=1}^m)$ for all permutations $P : \mathbb{N}_m \to \mathbb{N}_m$ and $v = (v_i)_{i=1}^m \in V^m$. In other words,
\[
\forall f \in \mathscr{L}_m^-(V, U),\ \forall P \in \mathrm{perm}(\mathbb{N}_m),\ \forall v \in V^m,\quad f(v \circ P) = \mathrm{parity}(P) f(v).
\]
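Theorems 13.3.8 and 13.3.9 can be spot-checked for $m = 3$. The sketch below assumes $V = \mathbb{Q}^3$, uses the $3 \times 3$ determinant of the rows as an example of an antisymmetric trilinear map, and uses the product of first coordinates as an example of a symmetric one. Both examples and all function names are illustrative choices, and permutations are 0-based tuples rather than maps on $\mathbb{N}_m$.

```python
from itertools import permutations


def parity(p):
    """Sign of a permutation p (tuple of 0-based images), by counting inversions."""
    sign = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                sign = -sign
    return sign


def det3(v):
    """Determinant of three row vectors: an antisymmetric trilinear map."""
    (a, b, c), (d, e, f), (g, h, i) = v
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def sym3(v):
    """Product of the first coordinates: a symmetric trilinear map."""
    return v[0][0] * v[1][0] * v[2][0]


v = ((1, 2, 3), (0, 1, 4), (5, 6, 0))

checks = []
for p in permutations(range(3)):
    v_after_p = tuple(v[p[i]] for i in range(3))      # the sequence v o P
    checks.append((det3(v_after_p) == parity(p) * det3(v),
                   sym3(v_after_p) == sym3(v)))

print(det3(v), checks)
```

The sample sequence is chosen so that `det3(v)` is nonzero, since otherwise the antisymmetry check would hold trivially.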
13.3.10 Theorem: Let $V$ be a finite-dimensional linear space over a field $K$, and $m \in \mathbb{Z}_0^+$. Then
\[
\dim \mathscr{L}_m(V, K) = \dim(V)^m, \qquad
\dim \mathscr{L}_m^-(V, K) = C^{\dim(V)}_m, \qquad
\dim \mathscr{L}_m^+(V, K) = C^{\dim(V)+m-1}_m.
\]

Proof: Let $(e_i)_{i=1}^n$ be a basis for $V$. All elements $\lambda$ of $\mathscr{L}_m(V, K)$ have the form $\lambda : v \mapsto \sum_{I \in \mathbb{N}_n^{\mathbb{N}_m}} a_I v_I$ for $v \in V^m$. Since there are no constraints on the coefficients $a_I$, it follows that $\dim(\mathscr{L}_m(V, K)) = \dim(V)^m$.
In the case of $\mathscr{L}_m^-(V, K)$, the coefficients $a_I$ are constrained by the antisymmetry rule $a_I = \mathrm{parity}(P)\, a_{I \circ P}$ for all permutations $P \in \mathrm{perm}(\mathbb{N}_m)$. From this it follows that $a_I = 0$ for index sequences $I$ with any two indices equal. The remaining index sequences may be partitioned according to the equivalence relation $I \equiv J$ if and only if $\exists P \in \mathrm{perm}(\mathbb{N}_m),\ I = J \circ P$. A unique representative may be chosen from each equivalence class by sorting into increasing order. There is one and only one increasing index sequence in each equivalence class, and there is one and only one equivalence class for each increasing index sequence. It follows that the number of equivalence classes equals the number of increasing index sequences in $I^n_m$. But this is equal to $C^n_m$ by Theorem 7.11.11 (i).
In the case of $\mathscr{L}_m^+(V, K)$, the symmetry rule implies that $a_I = a_J$ whenever $I = J \circ P$ for some $P \in \mathrm{perm}(\mathbb{N}_m)$. Index sequences may be partitioned into equivalence classes as in the antisymmetric case, but coefficients with repeated indices are not set to zero. A unique representative for each equivalence class may be obtained by sorting into non-increasing order. Since $a_I = a_J$ if and only if the sorted index sequences $I$ and $J$ are equal, it follows that the number of independent coefficients is equal to the cardinality of the set $J^n_m$ of non-increasing maps from $\mathbb{N}_m$ to $\mathbb{N}_n$. By Theorem 7.11.11 (ii), this equals $C^{n+m-1}_m$.
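The counting argument in the proof of Theorem 13.3.10 can be checked by brute force for small $n$ and $m$. This is a sketch under the assumption that $C^n_m$ denotes the binomial coefficient “$n$ choose $m$”, computed here with `math.comb`; the function name `count_sequences` is hypothetical.

```python
from itertools import product
from math import comb


def count_sequences(n, m):
    """Count all, strictly increasing, and non-increasing maps from N_m to N_n."""
    all_seq = increasing = non_increasing = 0
    for index_seq in product(range(1, n + 1), repeat=m):
        all_seq += 1
        if all(index_seq[k] < index_seq[k + 1] for k in range(m - 1)):
            increasing += 1
        if all(index_seq[k] >= index_seq[k + 1] for k in range(m - 1)):
            non_increasing += 1
    return all_seq, increasing, non_increasing


for n, m in [(3, 2), (4, 3), (2, 3), (3, 0)]:
    expected = (n ** m, comb(n, m), comb(n + m - 1, m))
    print((n, m), count_sequences(n, m), expected)
```

The case $(n, m) = (2, 3)$ shows that there are no nonzero antisymmetric maps when $m$ exceeds $\dim(V)$, while the symmetric count remains positive.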
[ The proof of Theorem 13.3.10 is a bit too intuitive. It should be possible to do elementary combinatorics a bit more rigorously than this. ]

13.3.11 Remark: A different way to specify the antisymmetry in Definition 13.3.3 is to require $f(v) = 0$ for all $v = (v_i)_{i=1}^m \in V^m$ such that $v_i = v_j$ for some $i, j \in \mathbb{N}_m$ with $i \ne j$.
Suppose that the alternative condition is true. Let $i, j \in \mathbb{N}_m$ with $i \ne j$. Let $w_1, w_2 \in V$. By the multilinearity of $f \in \mathscr{L}_m(V, U)$,
\[
\begin{aligned}
0 &= f\bigl(v\big|_{v_i = w_1 + w_2,\, v_j = w_1 + w_2}\bigr) \\
&= f\bigl(v\big|_{v_i = w_1,\, v_j = w_1}\bigr) + f\bigl(v\big|_{v_i = w_1,\, v_j = w_2}\bigr) + f\bigl(v\big|_{v_i = w_2,\, v_j = w_1}\bigr) + f\bigl(v\big|_{v_i = w_2,\, v_j = w_2}\bigr) \\
&= 0 + f\bigl(v\big|_{v_i = w_1,\, v_j = w_2}\bigr) + f\bigl(v\big|_{v_i = w_2,\, v_j = w_1}\bigr) + 0.
\end{aligned}
\]
Hence $f\bigl(v\big|_{v_i = w_1,\, v_j = w_2}\bigr) = -f\bigl(v\big|_{v_i = w_2,\, v_j = w_1}\bigr)$. Since $f$ is antisymmetric for a simple two-parameter transposition, the antisymmetry holds for all permutations of parameters. A similar proof yields the converse, at least when $1 + 1 \ne 0$ in $K$, because antisymmetry gives $f(v) = -f(v)$ whenever $v_i = v_j$.
13.4. Tensor product metadefinition

In Sections 13.4 and 13.5, tensors are defined by removing from sequences of vectors any information which disappears in multilinear maps on those sequences. In other words, a tensor is constructed from sequences of vectors by equating those sequences of vectors for which any multilinear map gives the same value.
Tensor spaces are first characterized or specified by Metadefinition 13.4.1, which is based on Federer [105], section 1.1.1, page 9. (See also Gallot/Hulin/Lafontaine [19], page 36.)

13.4.1 Metadefinition: A tensor product space for a finite family $(V_\alpha)_{\alpha\in A}$ of linear spaces over a field $K$ is a pair $(W, \mu)$ such that
(i) $W$ is a linear space over $K$,
(ii) $\mu : \times_{\alpha\in A} V_\alpha \to W$ is multilinear, and
(iii) for any pair $(U, \nu)$ where $U$ is a linear space and $\nu : \times_{\alpha\in A} V_\alpha \to U$ is multilinear, there exists a unique linear map $g : W \to U$ such that $\nu = g \circ \mu$.
The map $\mu$ is referred to as the canonical multilinear map.

13.4.2 Remark: Metadefinition 13.4.1 is illustrated in Figure 13.4.1. There is a different map $g$ for each multilinear map $\nu$, but the map $\mu$ is unique to the particular tensor product space definition and there is a unique function $g$ for each pair $(\mu, \nu)$.

[Figure 13.4.1: Metadefinition of tensor spaces. The canonical map $\mu : \times_{\alpha\in A} V_\alpha \to W$, multilinear maps $\nu_1$, $\nu_2$ from $\times_{\alpha\in A} V_\alpha$ to $U_1$, $U_2$, and the induced linear maps $g_1 : W \to U_1$, $g_2 : W \to U_2$.]

13.4.3 Remark: EDM2 [34], section 256.I, calls the canonical multilinear map $\mu$ in Metadefinition 13.4.1 the “canonical bilinear mapping” in the case that $\#(A) = 2$, but the noun “mapping” is generally avoided in this book. The noun “map” is used here instead.
Usually the map $\mu$ is not surjective. The tensors are an extension of the range of the canonical map $\mu$. The extension $W$ (minimally) closes the tensor product space with respect to linear operations. That’s the whole point of tensor spaces. Tensor spaces would be useless if they were nothing more than the image of a multilinear map.
Both Bump [96], page 50, and Fulton/Harris [108], page 471, call the map $\mu$ in Metadefinition 13.4.1 the “universal” bilinear map in the case $\#(A) = 2$.

13.4.4 Remark: Condition (iii) in Metadefinition 13.4.1 may be interpreted as saying that all of the information in any multilinear map from the product $\times_{\alpha\in A} V_\alpha$ to any linear space $U$ is contained in the representation $W$, because after “filtering” the product through the map $\mu$, it is still possible to reconstruct any multilinear map $\nu$ from the space $W$ via a map $g : W \to U$. The requirement that $g$ should be unique means that no unnecessary information remains in the space $W$ following the application of the map $\mu$. So the pair $(W, \mu)$ keeps all relevant information and removes all irrelevant information for the construction of multilinear maps on the cross product. This justifies the claim that tensors are the “multilinear quintessence” of sequences of vectors.

13.4.5 Remark: All tensor product definitions are equivalent because any two tensor product definitions will yield isomorphic tensor spaces. It follows from condition (iii) that if two tensor product definitions yield pairs $(W_1, \mu_1)$ and $(W_2, \mu_2)$ for a family $(V_\alpha)_{\alpha\in A}$, then there exist linear maps $g_{12} : W_1 \to W_2$ and $g_{21} : W_2 \to W_1$ such that $\mu_2 = g_{12} \circ \mu_1$ and $\mu_1 = g_{21} \circ \mu_2$. Then $\mu_1 = (g_{21} \circ g_{12}) \circ \mu_1$. Since the identity map on $W_1$ also satisfies $\mu_1 = \mathrm{id}_{W_1} \circ \mu_1$, the uniqueness in condition (iii) implies that $g_{21} \circ g_{12} = \mathrm{id}_{W_1}$, and similarly $g_{12} \circ g_{21} = \mathrm{id}_{W_2}$. Therefore $g_{12}$ and $g_{21}$ are linear isomorphisms between $W_1$ and $W_2$. This is illustrated in Figure 13.4.2.

[Figure 13.4.2: Uniqueness of tensor space up to isomorphism. The canonical maps $\mu_1$, $\mu_2$ from $\times_{\alpha\in A} V_\alpha$ to $W_1$, $W_2$, and the maps $g_{12} : W_1 \to W_2$, $g_{21} : W_2 \to W_1$.]
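Condition (iii) of Metadefinition 13.4.1 can be illustrated in coordinates for $\#(A) = 2$. The sketch below assumes the familiar coordinate model in which $W$ is the space of $n_1 \times n_2$ arrays and $\mu$ is the outer product. This model is an assumption made for illustration and is not the representation adopted in this book. The bilinear map `nu` is an arbitrary example, and `induced_linear_map` is a hypothetical helper which builds the linear map $g$ with $\nu = g \circ \mu$.

```python
def mu(v1, v2):
    """Canonical bilinear map of the coordinate model: the outer product."""
    return [[x * y for y in v2] for x in v1]


def unit(n, i):
    """Standard basis vector number i (0-based) of an n-dimensional space."""
    return tuple(int(k == i) for k in range(n))


def induced_linear_map(nu, n1, n2):
    """Linear map g on arrays with g(w) = sum over i, j of w[i][j] * nu(e_i, e_j)."""
    coeff = [[nu(unit(n1, i), unit(n2, j)) for j in range(n2)] for i in range(n1)]

    def g(w):
        return sum(w[i][j] * coeff[i][j] for i in range(n1) for j in range(n2))

    return g


def nu(v1, v2):
    """An illustrative bilinear map from Q^2 x Q^3 to Q."""
    return 3 * v1[0] * v2[1] - v1[1] * v2[0] + 2 * v1[1] * v2[2]


g = induced_linear_map(nu, 2, 3)
v1, v2 = (4, -1), (2, 5, 7)
print(nu(v1, v2), g(mu(v1, v2)))
```

In this model $g$ is determined by its values on the arrays $\mu(e_i, e_j)$, which form a basis of $W$. This is why there is only one candidate for $g$.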
13.4.6 Remark: Very importantly, the isomorphism in Remark 13.4.5 between any two representations of a tensor product is unique. This ensures that any individual tensor in one representation may be identified with one and only one particular tensor in the other representation. So there is no ambiguity when one asks: “Which tensor is this?” The same considerations apply to Metadefinition 28.2.1 for tangent bundles to manifolds.

13.4.7 Theorem: For any two representations $(W_1, \mu_1)$ and $(W_2, \mu_2)$ of the tensor product of a sequence of linear spaces, there exists a unique isomorphism between the representations which commutes with the canonical maps $\mu_1$ and $\mu_2$.

[ Federer proves Theorem 13.4.7. Must rewrite its statement more precisely. ]

13.4.8 Notation: The tensor product space $W$ in Metadefinition 13.4.1 is denoted as $\otimes_{\alpha\in A} V_\alpha$. If $A = \mathbb{N}_m$ for $m \in \mathbb{Z}_0^+$, then the tensor product space may be denoted as $\otimes_{i=1}^m V_i$ or $V_1 \otimes \ldots V_m$.

13.4.9 Theorem: Let $(V_\alpha)_{\alpha\in A}$ and $(V_\alpha')_{\alpha\in A}$ be families of linear spaces with the same index set. Then for any family of linear maps $(f_\alpha)_{\alpha\in A}$ such that $f_\alpha : V_\alpha \to V_\alpha'$ for all $\alpha \in A$, there is a unique linear map $\otimes_{\alpha\in A} f_\alpha : \otimes_{\alpha\in A} V_\alpha \to \otimes_{\alpha\in A} V_\alpha'$ such that
\[
\forall (v_\alpha)_{\alpha\in A} \in \times_{\alpha\in A} V_\alpha,\quad
\Bigl(\otimes_{\alpha\in A} f_\alpha\Bigr)\Bigl(\otimes_{\alpha\in A} v_\alpha\Bigr) = \otimes_{\alpha\in A} f_\alpha(v_\alpha).
\]

Proof: . . .

13.4.10 Remark: Theorem 13.4.9 is illustrated in Figure 13.4.3. The direct product function $\times_{\alpha\in A} f_\alpha$ is introduced in Definition 6.9.11.

[Figure 13.4.3: Unique lift from a family of linear maps to a single tensor space map. A square diagram with $\times_{\alpha\in A} f_\alpha : \times_{\alpha\in A} V_\alpha \to \times_{\alpha\in A} V_\alpha'$ along the bottom, the canonical maps $\mu$ and $\mu'$ on the sides, and $\otimes_{\alpha\in A} f_\alpha : \otimes_{\alpha\in A} V_\alpha \to \otimes_{\alpha\in A} V_\alpha'$ along the top.]

[ Maybe also define (anti)symmetric tensors in the definition context. ]
[ When proving properties of tensors, some properties are valid for any representation consistent with the general definition. Other properties are representation-dependent. Should clearly distinguish between these. In particular, should try to prove as much as possible in terms of the general definition before presenting the specific representation. ]
13.5. Tensor products of linear spaces

13.5.1 Remark: The standard definition for tensor products in this book is the dual of the linear space of multilinear maps from a linear space family $(V_\alpha)_{\alpha\in A}$ to the field $K$ of the linear spaces. This definition is simpler, clearer and more economical than the representation in Section 13.11 as the quotient space of a free linear space on $\times_{\alpha\in A} V_\alpha$ with respect to the subspace generated by the set of multilinear equivalence rules.

13.5.2 Definition: The tensor product (space) of a finite family $(V_\alpha)_{\alpha\in A}$ of linear spaces over a field $K$ is the dual $\mathrm{Hom}(\mathscr{L}((V_\alpha)_{\alpha\in A}; K), K)$ of the linear space of multilinear maps $\mathscr{L}((V_\alpha)_{\alpha\in A}; K)$.
A tensor space is the tensor product of any finite family of linear spaces.
A tensor is any element of a tensor space.

[ Must show that Definition 13.5.2 satisfies Metadefinition 13.4.1. Make this a theorem. ]

13.5.3 Notation: $\otimes_{\alpha\in A} V_\alpha$ for a finite family $(V_\alpha)_{\alpha\in A}$ of linear spaces over a field $K$ denotes the tensor product space of $(V_\alpha)_{\alpha\in A}$. Thus
\[
\otimes_{\alpha\in A} V_\alpha = \mathscr{L}((V_\alpha)_{\alpha\in A}; K)^* = \mathrm{Hom}(\mathscr{L}((V_\alpha)_{\alpha\in A}; K), K).
\]

13.5.4 Definition: The canonical multilinear map of a tensor space $\otimes_{\alpha\in A} V_\alpha$ is the map $\mu : \times_{\alpha\in A} V_\alpha \to \otimes_{\alpha\in A} V_\alpha$ defined by
\[
\mu\bigl((v_\alpha)_{\alpha\in A}\bigr)(\lambda) = \lambda\bigl((v_\alpha)_{\alpha\in A}\bigr),
\]
for all $(v_\alpha)_{\alpha\in A} \in \times_{\alpha\in A} V_\alpha$ and $\lambda \in \mathscr{L}((V_\alpha)_{\alpha\in A}; K)$.

13.5.5 Remark: Figure 13.5.1 illustrates the relations between spaces of multilinear maps and tensor product spaces of linear space families. The abbreviation “iso” means “isomorphism”, and “m-dual” means the “multilinear dual” or “space of multilinear maps”.

[Figure 13.5.1: Multilinear spaces and tensor products of linear space families. The families $(V_\alpha)_{\alpha\in A}$ and $(V_\alpha^*)_{\alpha\in A}$ and the spaces $\mathscr{L}((V_\alpha)_{\alpha\in A}, K)$, $\mathscr{L}((V_\alpha^*)_{\alpha\in A}, K)$, $\otimes_{\alpha\in A} V_\alpha$, $\otimes_{\alpha\in A} V_\alpha^*$, $(\otimes_{\alpha\in A} V_\alpha)^*$ and $(\otimes_{\alpha\in A} V_\alpha^*)^*$, connected by arrows labelled “dual”, “m-dual” and “iso”.]

The families of linear spaces $(V_\alpha)_{\alpha\in A}$ and $(V_\alpha^*)_{\alpha\in A}$ are shown in Figure 13.5.1 rather than the Cartesian products $\times_{\alpha\in A} V_\alpha$ and $\times_{\alpha\in A} V_\alpha^*$ for two reasons. First, the dual $(\times_{\alpha\in A} V_\alpha)^*$ of the space $\times_{\alpha\in A} V_\alpha$ is not equal to the Cartesian product $\times_{\alpha\in A} V_\alpha^*$ of the dual spaces. Secondly, the multilinear spaces $\mathscr{L}((V_\alpha)_{\alpha\in A}; K)$ and $\mathscr{L}((V_\alpha^*)_{\alpha\in A}; K)$ require the full linear space structures of the spaces $V_\alpha$ and $V_\alpha^*$ respectively for their definition, not just the point sets. So to be precise in the diagram, it is best to show the families of linear spaces rather than the Cartesian products.

13.5.6 Definition: A simple tensor in a tensor product space $\otimes_{\alpha\in A} V_\alpha$ is any element of the image of its canonical multilinear map $\mu$.

13.5.7 Notation: $\otimes_{\alpha\in A} v_\alpha$ for an element $(v_\alpha)_{\alpha\in A}$ of the Cartesian product $\times_{\alpha\in A} V_\alpha$ of a finite family $(V_\alpha)_{\alpha\in A}$ of linear spaces over a field $K$ denotes the value $\mu\bigl((v_\alpha)_{\alpha\in A}\bigr)$, where $\mu$ is the canonical multilinear map for the tensor product $\otimes_{\alpha\in A} V_\alpha$ in Definition 13.5.4.

13.5.8 Notation: $\otimes_{i=1}^m V_i$ for a sequence of linear spaces $(V_i)_{i=1}^m$ over a field $K$, where $m \in \mathbb{Z}_0^+$, denotes the tensor product space $\otimes_{\alpha\in A} V_\alpha$ with index set $A = \mathbb{N}_m$.
13.5.9 Notation: $V_1 \otimes \ldots V_m$ for a sequence of linear spaces $(V_i)_{i=1}^m$ over a field $K$, where $m \in \mathbb{Z}_0^+$, denotes the tensor product space $\otimes_{i=1}^m V_i$.

[ Maybe say something near here about $\mathrm{Hom}(\mathscr{L}((V_\alpha)_{\alpha\in A}; W_1), W_2)$. Is this good for anything? Is it isomorphic to something? ]

13.5.10 Remark: In terms of Notations 13.5.8 and 13.5.9, one may write
\[
V_1 \otimes \ldots V_m = \otimes_{i=1}^m V_i = \mathscr{L}((V_i)_{i=1}^m; K)^* = \mathscr{L}(V_1, \ldots V_m; K)^*.
\]

13.5.11 Notation: $\otimes_{i=1}^m v_i$ for a sequence of vectors $(v_i)_{i=1}^m$ in linear spaces $(V_i)_{i=1}^m$ over a field $K$, where $m \in \mathbb{Z}_0^+$, denotes the simple tensor $\mu\bigl((v_i)_{i=1}^m\bigr)$.

13.5.12 Notation: $v_1 \otimes \ldots v_m$ for a sequence of vectors $(v_i)_{i=1}^m$ in linear spaces $(V_i)_{i=1}^m$ over a field $K$, where $m \in \mathbb{Z}_0^+$, denotes the simple tensor $\otimes_{i=1}^m v_i$.

13.5.13 Remark: In terms of Notations 13.5.11 and 13.5.12, one may write
\[
v_1 \otimes \ldots v_m = \otimes_{i=1}^m v_i : \lambda \mapsto \lambda\bigl((v_i)_{i=1}^m\bigr) = \lambda(v_1, \ldots v_m)
\]
for all $\lambda \in \mathscr{L}((V_i)_{i=1}^m; K)$.

13.5.14 Notation: $\otimes^m V$ denotes the tensor product of $m$ copies of a linear space $V$ for any $m \in \mathbb{Z}_0^+$. In other words, $\otimes^m V = \otimes_{i=1}^m V_i$, where $V_i = V$ for $i \in \mathbb{N}_m$. Thus $\otimes^m V = \mathscr{L}_m(V, K)^*$.
13.5.15 Remark: The canonical map $\mu$ in Definition 13.5.4 is not injective. For example, if $K = \mathbb{R}$, $A = \{1, 2\}$ and $V_1 = V_2 = \mathbb{R}^3$, then $\mu\bigl((v_1, v_2)\bigr) = \mu\bigl((t v_1, t^{-1} v_2)\bigr)$ for any $t \in \mathbb{R} \setminus \{0\}$.
In general, the canonical map $\mu$ is not surjective either. For example, tensors of the form $\mu\bigl((v_1, v_2)\bigr) + \mu\bigl((v_3, v_4)\bigr) \in \otimes^2 \mathbb{R}^3$ for $v_k \in \mathbb{R}^3$ are usually not expressible as $\mu\bigl((v_5, v_6)\bigr)$ for $v_5, v_6 \in \mathbb{R}^3$.
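Both observations in Remark 13.5.15 can be seen in coordinates. The sketch below assumes the coordinate model in which $\otimes^2 \mathbb{R}^3$ is identified with $3 \times 3$ arrays and $\mu$ with the outer product, so that simple tensors correspond to arrays of rank at most 1. That identification is a standard fact used here as an assumption and is not derived in this section. Rational numbers stand in for reals, and all function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def outer(v, w):
    """Coordinate array of the simple tensor of v and w."""
    return [[x * y for y in w] for x in v]


def add(a, b):
    """Entrywise sum of two arrays of the same shape."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def is_simple(mat):
    """True if every 2x2 minor vanishes, that is, if the rank is at most 1."""
    rows, cols = range(len(mat)), range(len(mat[0]))
    return all(mat[i][k] * mat[j][l] == mat[i][l] * mat[j][k]
               for i, j in combinations(rows, 2)
               for k, l in combinations(cols, 2))


# Non-injectivity: scaling v1 by t and v2 by 1/t gives the same array.
v1, v2 = (1, 2, 3), (4, 5, 6)
t = Fraction(7, 2)
scaled = outer([t * x for x in v1], [y / t for y in v2])
same = scaled == outer(v1, v2)

# Non-surjectivity: a sum of two simple tensors need not be simple.
e1, e2 = (1, 0, 0), (0, 1, 0)
total = add(outer(e1, e1), outer(e2, e2))
print(same, is_simple(outer(v1, v2)), is_simple(total))
```

The word “usually” in the remark matters: a sum such as `add(outer(e1, e1), outer(e1, e2))` is again simple, because it equals `outer(e1, (1, 1, 0))`.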
13.5.16 Remark: To interpret Definitions 13.5.2 and 13.5.4, consider the case $A = \{1, 2\}$ with $V_1 = V_2 = \mathbb{R}^n$. Let $v_1 \in V_1$ and $v_2 \in V_2$. Then $\mu\bigl((v_1, v_2)\bigr)$ is the linear function on the linear space $\mathscr{L}(V_1, V_2; \mathbb{R})$ which maps every multilinear function $\lambda \in \mathscr{L}(V_1, V_2; \mathbb{R})$ to $\lambda\bigl((v_1, v_2)\bigr)$. In other words, $\mu\bigl((v_1, v_2)\bigr) = v_1 \otimes v_2 : \lambda \mapsto \lambda(v_1, v_2)$ for all $(v_1, v_2) \in V_1 \times V_2$.
For example, denote a real $n \times n$ matrix as $(a_{ij})_{i,j=1}^n$, and define $\lambda : V_1 \times V_2 \to \mathbb{R}$ by $\lambda(v_1, v_2) = \sum_{i,j=1}^n a_{ij} v_1^i v_2^j$ for $(v_1, v_2) \in V_1 \times V_2$. Then $\lambda$ is clearly bilinear. Therefore Definition 13.5.2 implies that $(v_1 \otimes v_2)(\lambda) = \mu\bigl((v_1, v_2)\bigr)(\lambda) = \sum_{i,j=1}^n a_{ij} v_1^i v_2^j$ for all $v_1 \in V_1$ and $v_2 \in V_2$. The value of $\sum_{i,j=1}^n a_{ij} v_1^i v_2^j$ clearly does not change if $v_1$ and $v_2$ are scaled by inverse factors. The “tensor quality” of a tensor product $v_1 \otimes v_2$ is whatever quality makes a difference to such quadratic-looking expressions. Any other quality doesn’t count.

[ Show the dimension of the explicit tensor definition in addition to proving it from the general definition? ]

13.5.17 Theorem: If $\#(A) < \infty$ and $\dim V_\alpha < \infty$ for all $\alpha \in A$, then
\[
\dim \otimes_{\alpha\in A} V_\alpha = \prod_{\alpha\in A} \dim V_\alpha,
\]
and a basis for the tensor product is. . .

Proof: [ See Federer [105], 1.1.2, etc. ]

13.5.18 Remark: The dimension $\prod_{\alpha\in A} \dim(V_\alpha)$ of the tensor product $\otimes_{\alpha\in A} V_\alpha$ may be compared with the dimension $\sum_{\alpha\in A} \dim(V_\alpha)$ of the direct product $\oplus_{\alpha\in A} V_\alpha$, which is also known as the “direct sum” of the family of linear spaces. This shows clearly how different the two concepts are. It also explains why the word “sum” is used for the direct sum of linear spaces.

[ Define the degree of a tensor near here. ]
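The interpretation in Remark 13.5.16 of a simple tensor as a function on bilinear maps can be written down directly. The following sketch assumes $V_1 = V_2 = \mathbb{Q}^2$ and revisits the pair of tensors from Remark 13.0.3. The helper names are hypothetical, and the four elementary matrices are used as test maps because the bilinear maps they define span all bilinear maps on this pair of spaces.

```python
def simple_tensor(v1, v2):
    """The functional lam -> lam(v1, v2), as in Definition 13.5.4."""
    return lambda lam: lam(v1, v2)


def bilinear_from_matrix(a):
    """Bilinear map (v, w) -> sum over i, j of a[i][j] * v[i] * w[j]."""
    return lambda v, w: sum(a[i][j] * v[i] * w[j]
                            for i in range(len(v)) for j in range(len(w)))


# Elementary matrices E_ij give the bilinear maps (v, w) -> v[i] * w[j].
elementary = [[[int((i, j) == (p, q)) for q in range(2)] for p in range(2)]
              for i in range(2) for j in range(2)]
test_maps = [bilinear_from_matrix(a) for a in elementary]

s = simple_tensor((0, 2), (3, 4))
t = simple_tensor((0, 1), (6, 8))

values_s = [s(lam) for lam in test_maps]
values_t = [t(lam) for lam in test_maps]
print(values_s, values_t)
```

Since the two functionals agree on a spanning set of bilinear maps and are linear, they agree on every bilinear map. This is the sense in which the two tensor products are “exactly the same thing”.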
13.5.19 Remark: When $m = 1$ in Notation 13.5.14, the space $\otimes^m V = \otimes^1 V$ may be identified with the linear space $V$ although it is not represented by the same set. The space $\otimes^1 V$ is represented as $\mathscr{L}_1(V, K)^*$, which is the dual of the dual of $V$. The difference between this and the space $V$ is generally ignored. This is not a serious embarrassment because the number 2 is represented differently as an element of $\mathbb{Z}$ and $\mathbb{R}$, and no one worries about that, even though strictly $2_{\mathbb{Z}} \ne 2_{\mathbb{R}}$. For any set $X$, the Cartesian product $X^m$ of $m$ copies of the set $X$ is identified with $X$ when $m = 1$, even though $X^1$ is really a set of functions which are valued in $X$. As long as the user of a definition knows how to convert the informal “equalities” into strictly correct identification maps, there is no problem. A similar situation alluded to in Remark 27.2.9 is the identification of topological manifolds with $C^0$ manifolds.

[ Also mention equivalents, isomorphisms and duals such as $\mathscr{L}((V_\alpha)_{\alpha\in A}; W) \cong \mathrm{Hom}(\otimes_{\alpha\in A} V_\alpha, W)$. ]
13.6. Covariant tensors
[ Rewrite this section. ]
13.6.2 Remark: The words “covariant” and “contravariant” often seem to be defined with the reverse meanings to what is expected. The terminology may be justified by noting that the coefficients of contravariant vectors vary as inverses of the transformations of basis vectors. However, contravariant vectors themselves are the primal vectors whereas covariant vectors are the dual vectors. In a comment about the transformation rule for contravariant vector coefficients with respect to a transformed basis, Szekeres [44], page 84, says: “This law of transformation of components of a vector v is sometimes called the contravariant transformation law of components, a curious and somewhat old-fashioned terminology that possibly defies common sense.” (See also a related discussion of the confusing terminology for the “covariant derivative” in Remark 37.4.3.) [ Must look at how to interpret the dual of spaces such as L (V1 , V2 ; U ) for linear spaces U 6= IR as contravariant tensors of some sort. ] 13.6.3 Theorem: (⊗α∈A Vα )∗ is canonically isomorphic to L ((Vα )α∈A ; K).
Proof: Since ⊗α∈A Vα = (L ((Vα )α∈A ; K))∗ by Definition 13.5.2, this theorem follows from the general fact that the dual of the dual of any linear space is canonically isomorphic to the primal. (See Theorem 10.5.17.) In this case, define the linear map h : L ((Vα )α∈A ; K) → (⊗α∈A Vα )∗ by h(λ)(w) = w(λ) for all λ ∈ L ((Vα )α∈A ; K) and w = ⊗α∈A vα ∈ ⊗α∈A Vα . 13.6.4 Remark: Theorem 13.6.3 is illustrated in Figure 13.6.1. [ Show something like ⊗α∈A Vα∗ ∼ = L ((Vα )α∈A ; K) in some canonical sense. Similarly, show that ⊗α∈A Vα ∼ = ∗ L ((Vα )α∈A ; K). EDM2 [34], 256.I, says that Hom(V1 ⊗ V2 , W ) ∼ = L (V1 , V2 ; W ), for instance. Since linear spaces are isomorphic if they have the same dimension, these isomorphisms should be natural in some sense. Note that Frankel [18], p. 59, section 2.4b uses L ((Vα∗ )α∈A ; K) as the definition of ⊗α∈A Vα . ] 13.6.5 Remark: Figure 13.6.2 illustrates the relations between spaces of multilinear maps and tensor product spaces of a single linear space. As in Figure 13.5.1, the abbreviation “iso” means “isomorphism”, and “m-dual” means the “multilinear dual” or “space of multilinear maps”. [ www.topology.org/tex/conc/dg.html ]
13.6.1 Remark: There are many choices for representation of all kinds of tensors. The simplest kind of representation for covariant tensors is as the multilinear functionals Lm (V, IR) on Cartesian products of a linear space. But the representation chosen for contravariant tensors is the dual space Lm (V, IR)∗ . So to be consistent, it would make sense to represent covariant tensors as the space Lm (V ∗ , IR)∗ , which is the linear dual of the m-linear dual of the linear dual of V . In a choice between the symmetry of the space Lm (V ∗ , IR)∗ and the simplicity of the space Lm (V, IR), it seems best to favour simplicity. This is particularly because of the heavy use of covariant tensors in applications. In many contexts, it becomes clear that contravariant and covariant vectors are not exact mirror images of each other. The notations of tensor calculus give the illusion of this mirror-image equivalence by using upper and lower indices for contravariant and covariant vector coordinates. It would be self-defeating to insist on a mirror-image symmetry in tensor representations when this symmetry is in fact not always valid.
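The inverse relationship mentioned in Remark 13.6.2 can be checked numerically. The following Python sketch is an illustration only: the basis-change matrix, the sample vector and the helper names `mat_vec` and `inverse_2x2` are assumptions made for this example and do not come from the text. It expresses one fixed vector of IR^2 in a new basis and shows that its components transform by the inverse of the matrix which transforms the basis vectors.

```python
from fractions import Fraction

def mat_vec(m, v):
    """Multiply a square matrix (list of rows) by a column vector."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def inverse_2x2(m):
    """Exact inverse of a 2x2 matrix with non-zero determinant."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

# Column k of `change` holds the old-basis components of the new basis vector e'_k.
# Hypothetical example data: e'_1 = 2 e_1 and e'_2 = e_1 + e_2.
change = [[2, 1],
          [0, 1]]

old_components = [4, 2]  # v = 4 e_1 + 2 e_2

# The basis vectors transform by `change`; the components transform by its inverse.
new_components = mat_vec(inverse_2x2(change), old_components)

# Reassembling v from the new components and the new basis recovers the old components.
rebuilt = mat_vec(change, new_components)
print(new_components, rebuilt)
```

Exact rational arithmetic is used so that the comparison of components does not depend on floating-point rounding.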
Figure 13.6.1: Maps for the dual of a tensor product space
Figure 13.6.2: Linear and multilinear duals of a linear space
13.7. Mixed tensors
13.7.1 Remark: EDM2 [34], 256.J, mentions a natural isomorphism $(\otimes^{s,r} V)^* \cong L((V)_{i=1}^s, (V^*)_{j=1}^r; K)$. This agrees with Theorem 13.6.3. They obtain a natural isomorphism $\otimes^{r,s} V \cong L((V)_{i=1}^s, (V^*)_{j=1}^r; K)$ by combining this with the duality $(\otimes^{s,r} V)^* = \otimes^{r,s} V$.
13.7.2 Definition: The mixed tensor space $\otimes^{r,s} V$ is defined for linear spaces $V$ and $r, s \in \mathbb{Z}_0^+$ as the tensor product $\otimes_{i=1}^{r+s} V_i$, where $V_i = V$ for $i \le r$ and $V_i = V^*$ for $i > r$. An element of $\otimes^{r,s} V$ is said to be a mixed tensor of type $(r, s)$.
[ Define various kinds of tensor degree near here. Maybe r is the “contravariant degree”, s is the “covariant degree” and r + s is the “total degree”? ]
13.7.3 Remark: It is tedious to have to always specify that $(r, s) \in \mathbb{Z}_0^+ \times \mathbb{Z}_0^+$ or $r, s \in \mathbb{Z}_0^+$. Therefore the type $(r, s)$ of a tensor is always assumed to lie in $\mathbb{Z}_0^+ \times \mathbb{Z}_0^+$ unless explicitly restricted in some way.
13.7.4 Remark: The mixed tensors in Definition 13.7.2 are required to have a sequence of primal spaces followed by a sequence of dual spaces, but clearly this could be generalized so that the primal and dual spaces are mixed up in any order. It is unclear why this is not generally done. It is possible to construct a closed algebra of mixed tensors by always keeping the primal and dual spaces grouped together, but in physics, the full generality is required. (See Remark 29.3.1 for further comment on this.) This issue is briefly mentioned by Misner/Thorne/Wheeler [37], section 3.5, page 84. They say: “Because the frame-independent geometric notation is somewhat ambiguous . . . , one often uses component notation to express coordinate-independent, geometric relations between geometric objects.” In other words, the order is important and the standard tensor notations do not permit one to express this order adequately.
As an example, a possible notation for the space of tensors whose components are written as $K_i{}^{jk}{}_\ell{}^m$ would be $\otimes^{0,1,2,1,1} \mathbb{R}^n$. More generally, the notation would be $\otimes^{u_1,d_1,u_2,d_2,\ldots} V$ with contravariant and covariant degrees respectively equal to $u_1, u_2, \ldots$ and $d_1, d_2, \ldots$. It would then be necessary to develop a set of notations for arbitrary contractions and products of such tensors and spaces. This seems to be an area where the physicists’ component notations are better developed than the pure mathematical notations.
[ Rewrite this section. ]
13.7.5 Remark: [ This remark is a bit conjectural, like Remark 13.7.4. ] It might be that the kludgy component notations lead one astray. In the context of a manifold without metric and without connection, there is no way to raise or lower indices in an arbitrary fashion. So a mixed tensor in $\otimes^{r,s} V$ can only mean an $s$-linear map on a tensor product of $r$ copies of a linear space $V$. There is actually no difference between $e_1 \otimes e_2 \ldots e_r \otimes e^1 \otimes e^2 \ldots e^s$ and $e^1 \otimes e^2 \ldots e^s \otimes e_1 \otimes e_2 \ldots e_r$. In the absence of a metric, there is no way to arbitrarily raise and lower any coefficients of such tensor products. The meaning of a simple tensor $a^I{}_J\, e_1 \otimes e_2 \ldots e_r \otimes e^1 \otimes e^2 \ldots e^s$ for multi-indexes $I \in \mathbb{N}_n^r$ and $J \in \mathbb{N}_n^s$ is a map in $\mathrm{Lin}(\otimes^r V, \otimes^s V) \equiv L_s(\otimes^r V, V)$. This seems to indicate that the representation of mixed tensors as a tensor product of a mixture of copies of the linear space $V$ and its dual $V^*$ is purely for convenience to save having to discuss $\mathrm{Lin}(\otimes^r V, \otimes^s V)$ or $L_s(\otimes^r V, V)$. The coefficients $a^I{}_J$ are thus really of the form $(a^I)_J$, which more clearly suggests the space $\mathrm{Lin}(\otimes^r V, \otimes^s V)$. From this we may conclude that mixed tensors should be defined as $\mathrm{Lin}(\otimes^r V, \otimes^s V)$ or $L_s(\otimes^r V, V)$. One may then note in passing that there is an equivalent space $\otimes^r V \otimes \otimes^s V^*$. The latter space is how mixed tensors are usually defined.
It may be concluded that any attempt to define spaces like $\otimes^{r_1,s_1,r_2,s_2,r_3,s_3} V$ would only yield permutations of coefficient arrays of the form $(a^I)_J$ for $I \in \mathbb{N}_n^{r_1+r_2+r_3}$ and $J \in \mathbb{N}_n^{s_1+s_2+s_3}$. Therefore definitions of such spaces would be worthless. However, it is important to ask the question and obtain the answer.
In the case of spaces with a metric tensor, however, the picture is different. In this case, there is an additional Einstein index convention that multiplying by suitable copies of the metric tensor $g$ and its inverse yields a tensor with the same symbol but different location of indices (i.e. raised or lowered). Thus coefficients $R^i{}_{jk\ell}$ may be converted to $R_{ijk}{}^\ell$, and other similar variations. In this case, the index locations keep track of the metric tensor multiplications which have been applied.
13.7.6 Notation: $\otimes^s V$ denotes the tensor product of $s$ copies of a linear space $V$, for any $s \in \mathbb{Z}_0^+$. Thus $\otimes^s V = \otimes^{s,0} V$.
13.7.7 Notation: $V^{r,s}$ denotes the sequence of linear spaces $(U_i)_{i=1}^{r+s}$ for non-negative integers $r, s \in \mathbb{Z}_0^+$, where $U_i = V$ for $i = 1 \ldots r$ and $U_i = V^*$ for $i = r + 1, \ldots r + s$.
13.7.8 Notation: $L_{r,s}(V, W)$ denotes the space $L(V^{r,s}, W)$ of multilinear maps from $V^{r,s}$ to $W$ for any linear spaces $V$ and $W$, and non-negative integers $r, s \in \mathbb{Z}_0^+$. In particular, $W$ may be the field $K$ of $V$.
13.7.9 Remark: Notations 13.7.7 and 13.7.8 may be non-standard, but they seem reasonable enough. The mixed tensor space $\otimes^{r,s}(V)$ is, by Definition 13.7.2, the dual of $L_{r,s}(V, K)$, where $K$ is the field of the linear space $V$, for any $r, s \in \mathbb{Z}_0^+$. That is, $\otimes^{r,s}(V) = L_{r,s}(V, K)^*$.
[ Must determine whether T 0,2 (M ) is the dual of T 2,0 (M ) in some sense, etc. etc. ] [ Also see EDM2 [34], 256.I, for tensor products of functions f : M1 → M2 or f : V1 → V2 . These should be useful for defining T r,s (M1 , M2 ). ] [ Must define non-degenerate (0, 2) tensors around here somewhere. ] [ Introduce a bilinear inner product or metric on these mixed spaces. Federer [105] introduces inner products in section 1.7, page 27. ] [ Define contractions and traces here. For example, Cℓk would denote the contraction of the kth contravariant index with the ℓth covariant index. This assumes a certain amount of orderliness in the sets of indices. Note that contractions are not well-defined on tensor algebras. They must be defined on tensors with a well-defined type. ]
13.8. General tensor algebra
[ Rewrite this section. ]
Tensor spaces have operations of vector addition and scalar multiplication, but they have no vector product operation. In order to accommodate such a product operation, tensor algebras are built from an infinite sequence of tensor spaces. The tensor spaces are not individually closed under the tensor product operation.
[ It is possible to define tensor product operations for mixed spaces $V_\alpha$. Of course, this is a little untidy. Even in the case of two kinds of spaces $V_\alpha$, such as a primal space $V$ and dual space $V^*$, the requirement to keep
track of order is untidy. Maybe this can be done by always ignoring order in such products? But then again, in tensor calculus, contravariant and covariant indices are often mixed in arbitrary order. So probably this is a useful thing to define here! Alternating wedge-style products probably wouldn’t make sense though. Definitely must do a general version of Definition 13.8.2 for an arbitrary pair of tensor spaces. ]
[ The numbers r and s are called the “degrees” of tensors if they are not mixed. See EDM2 [34], 256.J. The pair (r, s) is called the “type” of the tensor. ]
13.8.1 Remark: Definition 13.8.2 uses the extended canonical map $\mu$ in Definition 13.12.5. The expression $\lambda(v_i, w_j)$ in Definition 13.8.2 means the value of $\lambda$ for the sequence of $r + s$ vectors formed by concatenating the $r$ vectors $v_{i,k}$ with the $s$ vectors $w_{j,\ell}$. The term $\mu(v)$ means the tensor of degree $r$ defined by $\mu(v)(\lambda) = \sum_{i=1}^{m_1} \lambda(v_i)$ for all $\lambda \in L_r(V, \mathbb{R})$, and similarly $\mu(w)(\lambda) = \sum_{j=1}^{m_2} \lambda(w_j)$ for all $\lambda \in L_s(V, \mathbb{R})$.
[ Must try to find a more abstract definition of tensor product operation which does not use polynomial representations as in Definition 13.8.2. ]
13.8.2 Definition: The tensor product operation for a linear space $V$ is the operation $\otimes : \otimes^r V \times \otimes^s V \to \otimes^{r+s} V$ which is defined for all $r, s \in \mathbb{Z}_0^+$ by
$$\forall \lambda \in L_{r+s}(V, \mathbb{R}), \qquad (\mu(v) \otimes \mu(w))(\lambda) = \sum_{i=1}^{m_1} \sum_{j=1}^{m_2} \lambda(v_i, w_j)$$
for all tensor polynomials $v = ((v_{i,k})_{k=1}^r)_{i=1}^{m_1} \in (V^r)^{m_1}$ and $w = ((w_{j,\ell})_{\ell=1}^s)_{j=1}^{m_2} \in (V^s)^{m_2}$.
13.8.3 Theorem: The product in Definition 13.8.2 is independent of tensor polynomial representations. In other words, if $\mu(v) = \mu(v')$ and $\mu(w) = \mu(w')$, then $\mu(v) \otimes \mu(w) = \mu(v') \otimes \mu(w')$.
Proof: Let $v' = ((v'_{i,k})_{k=1}^r)_{i=1}^{m'_1} \in (V^r)^{m'_1}$ and $w' = ((w'_{j,\ell})_{\ell=1}^s)_{j=1}^{m'_2} \in (V^s)^{m'_2}$ be alternative representations for $v$ and $w$ respectively so that $\mu(v) = \mu(v')$ and $\mu(w) = \mu(w')$. Then for all $\lambda \in L_{r+s}(V, \mathbb{R})$,
$$
\begin{aligned}
\sum_{i=1}^{m_1} \sum_{j=1}^{m_2} \lambda(v_i, w_j) - \sum_{i=1}^{m'_1} \sum_{j=1}^{m'_2} \lambda(v'_i, w'_j)
&= \sum_{i=1}^{m_1} \sum_{j=1}^{m_2} \lambda^L_{w_j}(v_i) - \sum_{i=1}^{m'_1} \sum_{j=1}^{m'_2} \lambda^L_{w'_j}(v'_i) && (13.8.1) \\
&= \sum_{j=1}^{m_2} \mu(v)(\lambda^L_{w_j}) - \sum_{j=1}^{m'_2} \mu(v')(\lambda^L_{w'_j}) && (13.8.2) \\
&= \mu(v)\Bigl( \sum_{j=1}^{m_2} \lambda^L_{w_j} - \sum_{j=1}^{m'_2} \lambda^L_{w'_j} \Bigr), && (13.8.3)
\end{aligned}
$$
where the $r$-linear functions $\lambda^L_y \in L_r(V, \mathbb{R})$ are defined by $\lambda^L_y : x \mapsto \lambda(x, y)$ for $x \in V^r$ and $y \in V^s$. The $r$-linear map $\sum_{j=1}^{m_2} \lambda^L_{w_j} - \sum_{j=1}^{m'_2} \lambda^L_{w'_j} \in L_r(V, \mathbb{R})$ on line (13.8.3) may be evaluated for any $x \in V^r$ as follows:
$$
\begin{aligned}
\Bigl( \sum_{j=1}^{m_2} \lambda^L_{w_j} - \sum_{j=1}^{m'_2} \lambda^L_{w'_j} \Bigr)(x)
&= \sum_{j=1}^{m_2} \lambda(x, w_j) - \sum_{j=1}^{m'_2} \lambda(x, w'_j) && (13.8.4) \\
&= \sum_{j=1}^{m_2} \lambda^R_x(w_j) - \sum_{j=1}^{m'_2} \lambda^R_x(w'_j) && (13.8.5) \\
&= \mu(w)(\lambda^R_x) - \mu(w')(\lambda^R_x) = 0, && (13.8.6)
\end{aligned}
$$
where the $s$-linear functions $\lambda^R_x \in L_s(V, \mathbb{R})$ are defined by $\lambda^R_x : y \mapsto \lambda(x, y)$ for $x \in V^r$ and $y \in V^s$. Since the $r$-linear map on line (13.8.3) is the zero map, it follows that the expression on the left of line (13.8.1) is zero, which means that $\mu(v) \otimes \mu(w) = \mu(v') \otimes \mu(w')$ as claimed.
13.8.4 Remark: The proof of Theorem 13.8.3 is perhaps not instantly comprehensible. Line (13.8.1) highlights the fact that a multilinear function of $r + s$ vectors is in particular multilinear with respect to the first $r$ vectors. Thus by fixing the last $s$ vectors $(w_{j,\ell})_{\ell=1}^s$, the $(r + s)$-linear function $\lambda$ becomes the $r$-linear function $\lambda^L_{w_j}$ of the remaining $r$ vectors $(v_{i,k})_{k=1}^r$. This is exactly what the tensor $\mu(v) \in \otimes^r V$ of degree $r$ needs as an argument. Therefore line (13.8.2) applies the definition of $\mu(v)$, which gives $\mu(v)(\lambda^L_{w_j}) = \sum_{i=1}^{m_1} \lambda^L_{w_j}(v_i)$. (This shows the convenience of defining tensors as the dual of a space of multilinear maps rather than the stodgy old quotient space of a free linear space in Section 13.11.) Line (13.8.3) uses the fact that $\mu(v) = \mu(v')$ (because $v$ and $v'$ represent the same tensor) and the linearity of $\mu(v)$ with respect to $r$-linear maps. The coefficient of $\mu(v)$ in line (13.8.3) is an $r$-linear map which is a linear combination of $r$-linear maps.
In line (13.8.4), this $r$-linear map is applied to a sequence $x \in V^r$ of $r$ vectors and expanded according to the definitions of $\lambda^L_{w_j}$ and $\lambda^L_{w'_j}$. Line (13.8.5) uses the reverse trick to line (13.8.1) by fixing the first $r$ arguments of $\lambda$ to construct an $s$-linear map $\lambda^R_x$. This is what the tensors $\mu(w)$ and $\mu(w')$ act on. So line (13.8.6) uses the definitions of $\mu(w)$ and $\mu(w')$ to simplify the expression. Since the tensors $\mu(w)$ and $\mu(w')$ are the same tensor with two different representations, the result is zero. By the linearity of $\mu(v)$ in line (13.8.3), this implies that Definition 13.8.2 gives the same result, no matter which representations are used.
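The representation independence stated in Theorem 13.8.3 can be tried out on a small case. The Python sketch below is only an illustration under stated assumptions: V = IR^2, tensor polynomials are modelled as lists of tuples of vectors, and the helper names `multilinear`, `mu` and `tensor_product` as well as the coefficient formula are invented for this example. It evaluates the product of Definition 13.8.2 on one trilinear map for two different polynomial representations of the same pair of tensors.

```python
from itertools import product as index_tuples

def multilinear(coeff, degree, dim):
    """Return the multilinear map lam(x_1, ..., x_degree) with coefficient function coeff."""
    def lam(*vectors):
        total = 0
        for idx in index_tuples(range(dim), repeat=degree):
            term = coeff(idx)
            for vec, i in zip(vectors, idx):
                term *= vec[i]
            total += term
        return total
    return lam

def mu(poly):
    """Tensor represented by a tensor polynomial: lam -> sum over terms of lam(term)."""
    return lambda lam: sum(lam(*term) for term in poly)

def tensor_product(poly_v, poly_w):
    """Definition 13.8.2: lam -> sum_i sum_j lam(v_i, w_j) with concatenated arguments."""
    return lambda lam: sum(lam(*(tv + tw)) for tv in poly_v for tw in poly_w)

e1, e2 = (1, 0), (0, 1)
v = [(e1, e2), (e1, e1)]   # e1 (x) e2 + e1 (x) e1, degree r = 2
v_alt = [(e1, (1, 1))]     # e1 (x) (e1 + e2), the same tensor
w = [((0, 2),)]            # 2 e2, degree s = 1
w_alt = [(e2,), (e2,)]     # e2 + e2, the same tensor

# Hypothetical test maps: arbitrary coefficient arrays for a bilinear and a trilinear map.
lam2 = multilinear(lambda idx: 1 + 3 * idx[0] + idx[1], degree=2, dim=2)
lam3 = multilinear(lambda idx: 1 + idx[0] + 2 * idx[1] + 4 * idx[2], degree=3, dim=2)

same_v = mu(v)(lam2) == mu(v_alt)(lam2)
value = tensor_product(v, w)(lam3)
value_alt = tensor_product(v_alt, w_alt)(lam3)
print(same_v, value, value_alt)
```

A single test map cannot prove the theorem, of course. The sketch only shows how the definition is evaluated and that the two representations agree on the chosen map.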
13.8.5 Example: Let $v = ((v_1, v_2), (v_3, v_4)) \in (V^2)^2$ and $w = ((w_1, w_2), (w_3, w_4)) \in (V^2)^2$ in Definition 13.8.2. These represent the tensors $\mu(v) = v_1 \otimes v_2 + v_3 \otimes v_4$ and $\mu(w) = w_1 \otimes w_2 + w_3 \otimes w_4$. Therefore
$$
\begin{aligned}
\mu(v) \otimes \mu(w) &= (v_1 \otimes v_2 + v_3 \otimes v_4) \otimes (w_1 \otimes w_2 + w_3 \otimes w_4) \\
&= (v_1 \otimes v_2 \otimes w_1 \otimes w_2) + (v_1 \otimes v_2 \otimes w_3 \otimes w_4) + (v_3 \otimes v_4 \otimes w_1 \otimes w_2) + (v_3 \otimes v_4 \otimes w_3 \otimes w_4).
\end{aligned}
$$
13.8.6 Definition: The tensor algebra of a linear space V −< $(K, V, \sigma_K, \tau_K, \sigma_V, \mu^V_K)$ is the tuple A −< $(K, A, \sigma_K, \tau_K, \sigma_A, \tau_A, \mu^A_K)$ where:
(i) $(K, A, \sigma_K, \tau_K, \sigma_A, \mu^A_K) = \oplus_{r=0}^\infty \otimes^r V$ is the direct sum of the linear spaces $\otimes^r V$;
(ii) $\tau_A : A \times A \to A$, denoted as the binary operation $\otimes$, is defined by
$$u \otimes v = \sum_{i=0}^{k} \sum_{j=0}^{\ell} u_i \otimes v_j = \sum_{n=0}^{k+\ell} \sum_{i=0}^{n} (u_i \otimes v_{n-i})$$
for $u, v \in A$ with $u = \sum_{i=0}^k u_i$ and $v = \sum_{j=0}^\ell v_j$ with $u_i \in \otimes^i V$ and $v_j \in \otimes^j V$ for $i = 0 \ldots k$ and $j = 0 \ldots \ell$ (with $u_i = 0$ for $i > k$ and $v_j = 0$ for $j > \ell$ in the last sum), and the tensor space product $\otimes : \otimes^i V \times \otimes^j V \to \otimes^{i+j} V$ is as in Definition 13.8.2.
13.8.7 Notation: $\otimes^* V$ denotes the tensor algebra A −< $(K, A, \sigma_K, \tau_K, \sigma_A, \tau_A, \mu^A_K)$ in Definition 13.8.6.
13.8.8 Remark: The direct sum $\oplus_{r=0}^\infty \otimes^r V$ in Definition 13.8.6 (i) is defined as the set of all almost-all-zero infinite sequences of elements from the different degrees of tensor spaces. Thus $a = (a_r)_{r=0}^\infty \in \oplus_{r=0}^\infty \otimes^r V$ if $a_r \in \otimes^r V$ for all $r \in \mathbb{Z}_0^+$ and $a_r = 0$ for all but finitely many $r$. Thus part (ii) means that $a \otimes b = \bigl( \sum_{s=0}^r a_s \otimes b_{r-s} \bigr)_{r=0}^\infty$.
13.8.9 Remark: Most textbooks say very little indeed about product operations of tensor algebras for fully general sums of tensors with different degrees. The tensor product in Definition 13.8.2 applies only to pairs of tensors where each element of the pair is in a single tensor space. Kobayashi/Nomizu [26], page 22, define the general product, and EDM2 [34], section 256.O, talks about the direct sum of tensor spaces, which implies that the tensor algebra contains general sums. It seems that there is not much interest in such sums of tensors of mixed degree in applications. Therefore the general mixed product operations are defined here for logical completeness only. For example, consider $5 + v_1 + v_1 \otimes v_2 + v_3 \otimes v_4 \otimes v_5$. This is an element of $\oplus_{r=0}^3 \otimes^r V$ since $5 \in \otimes^0 V$. There is no way to simplify the sum of $v_1$ and $v_1 \otimes v_2$ since they are in different components of the direct sum of tensor spaces.
13.8.10 Remark: Since any linear space may be inserted into Definition 13.8.6 to construct a tensor algebra, certainly the dual of any linear space may also be turned into a tensor algebra. If $V^*$ is the dual of a linear space $V$, then $\otimes^* V^*$ is a well-defined tensor algebra. For a given linear space $V$, the tensor algebra $\otimes^* V$ is called the contravariant tensor algebra of $V$ whereas $\otimes^* V^*$ is called the covariant tensor algebra of $V$. It is not quite so easy to construct the tensor algebra of mixed contravariant and covariant tensors. (For mixed tensors, see Definition 13.7.2.)
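The graded product of Definition 13.8.6 (ii) can be sketched in coordinates. The following Python fragment is an illustration under assumptions which are not in the text: a basis of V is fixed, a tensor of degree r is stored as a dictionary from index tuples of length r to coefficients, and an element of the tensor algebra is stored as a dictionary from degrees to such tensors. The names `tensor_product` and `algebra_product` are invented for the example.

```python
from collections import defaultdict

def tensor_product(a, b):
    """Product of two homogeneous tensors given as {index tuple: coefficient}."""
    out = defaultdict(int)
    for idx_a, coeff_a in a.items():
        for idx_b, coeff_b in b.items():
            # Concatenating index tuples corresponds to e_I (x) e_J = e_(I,J).
            out[idx_a + idx_b] += coeff_a * coeff_b
    return dict(out)

def algebra_product(u, v):
    """Product of mixed-degree sums: the degree-n part collects all u_i (x) v_(n-i)."""
    result = {}
    for i, u_i in u.items():
        for j, v_j in v.items():
            component = result.setdefault(i + j, defaultdict(int))
            for idx, coeff in tensor_product(u_i, v_j).items():
                component[idx] += coeff
    return {degree: dict(t) for degree, t in result.items()}

# Hypothetical sample elements, with basis vectors e_1, e_2 indexed by 0 and 1.
u = {0: {(): 5}, 1: {(0,): 1}}        # u = 5 + e_1
v = {1: {(1,): 1}, 2: {(0, 1): 1}}    # v = e_2 + e_1 (x) e_2

uv = algebra_product(u, v)
print(uv)
```

The degree 2 component receives contributions from both 5 ⊗ (e_1 ⊗ e_2) and e_1 ⊗ e_2, which is the regrouping by total degree n = i + j in part (ii) of the definition.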
13.8.11 Definition: The mixed tensor product operation for a linear space $V$ is the operation $\otimes : \otimes^{r_1,s_1} V \times \otimes^{r_2,s_2} V \to \otimes^{r_1+r_2,s_1+s_2} V$ which is defined for all $r_1, r_2, s_1, s_2 \in \mathbb{Z}_0^+$ by
$$\forall \lambda \in L_{r_1+r_2,s_1+s_2}(V, \mathbb{R}), \qquad (\mu(v) \otimes \mu(w))(\lambda) = \sum_{i=1}^{m_1} \sum_{j=1}^{m_2} \lambda(v^1_i, w^1_j, v^2_i, w^2_j)$$
for all $v = ((v^1_{i,k})_{k=1}^{r_1}, (v^2_{i,k})_{k=1}^{s_1})_{i=1}^{m_1} \in (V^{r_1,s_1})^{m_1}$ and $w = ((w^1_{j,\ell})_{\ell=1}^{r_2}, (w^2_{j,\ell})_{\ell=1}^{s_2})_{j=1}^{m_2} \in (V^{r_2,s_2})^{m_2}$.
13.8.12 Remark: The mixed linear spaces $V^{r_1,s_1}$ and $V^{r_2,s_2}$ in Definition 13.8.11 are defined in Notation 13.7.7. The mixed multilinear space $L_{r_1+r_2,s_1+s_2}(V, \mathbb{R})$ is defined in Notation 13.7.8.
13.8.13 Theorem: The product in Definition 13.8.11 is independent of tensor polynomial representations. In other words, if $\mu(v) = \mu(v')$ and $\mu(w) = \mu(w')$, then $\mu(v) \otimes \mu(w) = \mu(v') \otimes \mu(w')$.
Proof: A proof along the lines of the proof of Theorem 13.8.3 probably works. Trust me, I’m a mathematician.
13.8.14 Definition: The mixed tensor algebra of a linear space V −< $(K, V, \sigma_K, \tau_K, \sigma_V, \mu^V_K)$ is the tuple $(K, A, \sigma_K, \tau_K, \sigma_A, \tau_A, \mu^A_K)$ where:
(i) $(K, A, \sigma_K, \tau_K, \sigma_A, \mu^A_K) = \oplus_{r,s=0}^\infty \otimes^{r,s} V$ is the direct sum of the linear spaces $\otimes^{r,s} V$;
(ii) $\tau_A : A \times A \to A$, denoted as the binary operation $\otimes$, is defined by
$$u \otimes v = \sum_{r_1=0}^{k_1} \sum_{s_1=0}^{\ell_1} \sum_{r_2=0}^{k_2} \sum_{s_2=0}^{\ell_2} (u_{r_1,s_1} \otimes v_{r_2,s_2}) = \sum_{r=0}^{k_1+k_2} \sum_{s=0}^{\ell_1+\ell_2} \sum_{i=0}^{r} \sum_{j=0}^{s} (u_{i,j} \otimes v_{r-i,s-j})$$
for all $u, v \in A$, where $u = \sum_{r_1=0}^{k_1} \sum_{s_1=0}^{\ell_1} u_{r_1,s_1}$ and $v = \sum_{r_2=0}^{k_2} \sum_{s_2=0}^{\ell_2} v_{r_2,s_2}$ for some $u_{r_1,s_1} \in \otimes^{r_1,s_1} V$ and $v_{r_2,s_2} \in \otimes^{r_2,s_2} V$ for $r_1 = 0 \ldots k_1$, $s_1 = 0 \ldots \ell_1$, $r_2 = 0 \ldots k_2$, and $s_2 = 0 \ldots \ell_2$ (with components outside these ranges taken to be zero in the last sum), and the tensor space product $\otimes : \otimes^{r_1,s_1} V \times \otimes^{r_2,s_2} V \to \otimes^{r_1+r_2,s_1+s_2} V$ is given by Definition 13.8.11.
13.8.15 Notation: $\otimes^{*,*} V$ denotes the tensor algebra $(K, A, \sigma_K, \tau_K, \sigma_A, \tau_A, \mu^A_K)$ in Definition 13.8.14.
13.8.16 Remark: Notation 13.8.15 is admittedly not standard although it is perfectly logical. EDM2 [34], section 256.K, calls it T(V). Kobayashi/Nomizu [26], page 22, call this space T.
13.8.17 Remark: It can be seen from Definitions 13.8.11 and 13.8.14 in particular that tensors are an index management nightmare. Luckily, tensors of mixed degree are rarely seen. Gallot/Hulin/Lafontaine [19], page 36, say about the tensors: “It is one of the unpleasant tasks for the differential geometers to define them!” They’re not exaggerating!!
[ Try to define tensor product and algebra for $\otimes^{r_1,s_1,r_2,s_2,\ldots} V$. ]
13.8.18 Remark: The “Einstein index convention” is a very useful collection of pseudo-notations. The meaning of expressions which use this convention depends to some extent on context. So it is important to always provide sufficient context to make the meaning clear. The convention includes the following rules.
(i) The vectors in a basis for a linear space $V$ use subscript indices. Example: $(e_i)_{i=1}^n$.
(ii) The coordinates of a vector $v$ with respect to a basis $(e_i)_{i=1}^n$ use superscript indices. Example: $v = \sum_{i=1}^n v^i e_i$.
(iii) When subscript and superscript indices (which follow the convention) are summed over the linear space basis index set, the sum symbol may be omitted. Example: $v = v^i e_i$ means $v = \sum_{i=1}^n v^i e_i$.
(iv) When a “metric tensor” $(g_{ij})_{i,j=1}^n$ is provided for a linear space, and $(v^i)_{i=1}^n$ is the sequence of vector coordinates with respect to a basis $(e_i)_{i=1}^n$, the sequence of coordinates $(v_i)_{i=1}^n$ is defined by $v_i = g_{ij} v^j = \sum_{j=1}^n g_{ij} v^j$ for $i \in \mathbb{N}_n$. This is called “lowering the indices”.
(v) etc. etc. etc.
It is important to ensure that the implicit linear space basis and metric tensor are defined clearly in the context of expressions which use the Einstein index convention. This is particularly important when multiple bases and multiple metric tensors are used in a single context. The meaning of the choice of subscript or superscript for a particular sequence of objects depends on what kind of object it is. For example, tensor coefficients and tensor basis elements have opposite choices. It is also important to remember that some sequences with subscripts and superscripts are not tensors of any kind at all, although they do use the Einstein index convention. For example, the Christoffel symbol is not a tensor. The individual terms in the exterior derivative are also not generally tensors. Remark 7.11.15 mentions some index convention rules for multi-index subscripts and superscripts.
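Rule (iv) of Remark 13.8.18 can be spelled out with the summation written explicitly. The Python sketch below is an illustration only: the metric matrix, the sample coordinates and the function names `lower_index` and `contract` are assumptions made for this example. It lowers the index of a coordinate sequence and then contracts the result with the original coordinates.

```python
def lower_index(g, v_up):
    """Compute v_i = sum_j g[i][j] * v^j, the explicit form of v_i = g_ij v^j."""
    n = len(v_up)
    return [sum(g[i][j] * v_up[j] for j in range(n)) for i in range(n)]

def contract(v_down, v_up):
    """Compute v_i v^i with the implied sum over i written out."""
    return sum(a * b for a, b in zip(v_down, v_up))

# Hypothetical symmetric positive definite metric on a 2-dimensional space.
g = [[2, 1],
     [1, 3]]
v_up = [1, 2]                      # coordinates v^1, v^2

v_down = lower_index(g, v_up)      # coordinates v_1, v_2
norm_sq = contract(v_down, v_up)   # g_ij v^i v^j
print(v_down, norm_sq)
```

The explicit loops make visible the context which the convention leaves implicit, namely the index range and the particular metric used.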
13.9. Alternating tensors
Alternating covariant tensors are also known as exterior forms, skew-symmetric forms, skew-symmetric tensors and antisymmetric tensors. Alternating tensors are motivated by integration over submanifolds of flat spaces or manifolds. This is the subject of geometric measure theory. The familiar Lebesgue measure is used for integration with respect to volume elements. The region of integration for Lebesgue measure typically has the same dimension as the ambient space. But in geometric measure theory, the region of integration typically has dimension less than the ambient space dimension. Examples are integration over surfaces and curves in 3-dimensional space.
13.9.1 Remark: The m-area spanned by a sequence of m tangent vectors at a point in a manifold is an antisymmetric multilinear function of the tangent vectors. Therefore alternating tensors of degree m contain exactly the right amount of information for integrating over an m-submanifold.
13.9.2 Remark: The lowered index in the notation $\Lambda_m V$ for covariant antisymmetric tensor spaces is inherited from the $L^-_m(V, K)$ notation. Conveniently this matches the “Einstein convention” for lowered indices. The raised index in the notation $\bigwedge^m V$ for the contravariant space (“wedge product”) is inherited from the corresponding $\otimes^m V$ notation. Once again, the raised index matches the convention for contravariant tensors. Notations 13.9.3 and 13.9.5 denote the linear spaces of antisymmetric multilinear maps which were introduced in Theorem 13.3.7. The relations between many of the antisymmetric tensor spaces in this section are illustrated in Figure 13.9.1.
Figure 13.9.1: Linear and antisymmetric multilinear duals of a linear space
13.9.3 Notation: $\Lambda_m(V, W)$ for a linear space $V$ with field $K$, where $m \in \mathbb{Z}_0^+$, denotes the set $L^-_m(V, W)$ together with its pointwise vector addition and scalar multiplication operations.
13.9.4 Definition: The elements of $\Lambda_m(V, W)$ for a linear space $V$ with field $K$, where $m \in \mathbb{Z}_0^+$, are called (alternating) m-forms.
13.9.5 Notation: $\Lambda_m V$ for a linear space $V$ over a field $K$, where $m \in \mathbb{Z}_0^+$, is an abbreviation for $\Lambda_m(V, K)$, where $K$ is identified with the linear space $K$ over the field $K$.
13.9.6 Definition: The alternating tensor product of m copies of a linear space $V$ for any $m \in \mathbb{Z}_0^+$ is the dual linear space of the linear space of antisymmetric multilinear forms $\Lambda_m(V, K)$.
13.9.7 Notation: $\bigwedge^m V$ for $m \in \mathbb{Z}_0^+$ and a linear space $V$ over a field $K$ denotes the alternating tensor product of m copies of $V$. Thus $\bigwedge^m V = \Lambda_m(V, K)^*$.
13.9.8 Definition: An m-covector for a linear space $V$ and $m \in \mathbb{Z}_0^+$ is any element of $\Lambda_m V$.
13.9.9 Definition: An m-vector for a linear space $V$ and $m \in \mathbb{Z}_0^+$ is any element of $\bigwedge^m V$.
13.9.10 Definition: A simple m-vector in an alternating tensor product $\bigwedge^m V$ for a linear space $V$ and $m \in \mathbb{Z}_0^+$ is any $f \in \bigwedge^m V$ of the form $f : \lambda \mapsto \lambda(v)$ for some $v \in V^m$.
13.9.11 Notation: $\wedge_{i=1}^m v_i$ for a sequence $v \in V^m$, where $V$ is a linear space and $m \in \mathbb{Z}_0^+$, denotes the simple m-vector $f \in \bigwedge^m V$ defined by $f(\lambda) = \lambda(v)$. $v_1 \wedge \ldots \wedge v_m$ is an alternative notation for $\wedge_{i=1}^m v_i$.
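Definition 13.9.10 can be illustrated by letting simple 2-vectors act on an alternating 2-form. The Python sketch below is an illustration under assumptions which are not in the text: V = IR^2, the form `area_form` is the signed area of the parallelogram spanned by two vectors, and the helper name `simple_m_vector` is invented for the example.

```python
def simple_m_vector(*vectors):
    """The simple m-vector f : lam -> lam(v_1, ..., v_m) of Definition 13.9.10."""
    return lambda lam: lam(*vectors)

def area_form(x, y):
    """An alternating bilinear form on IR^2: signed area spanned by x and y."""
    return x[0] * y[1] - x[1] * y[0]

v1, v2 = (3, 0), (1, 2)

f = simple_m_vector(v1, v2)
f_swapped = simple_m_vector(v2, v1)
# Adding a multiple of v1 to v2 shears the parallelogram without changing its area.
f_sheared = simple_m_vector(v1, (v2[0] + v1[0], v2[1] + v1[1]))
f_degenerate = simple_m_vector(v1, v1)

values = [g(area_form) for g in (f, f_swapped, f_sheared, f_degenerate)]
print(values)
```

Since the space of alternating 2-forms on IR^2 is one-dimensional, agreement of `f` and `f_sheared` on this single form already means that the two vector pairs determine the same 2-vector, while swapping the pair reverses the sign.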
13.9.12 Remark: Definition 13.9.6 and Notation 13.9.7 imply that for $m \ge 2$ and $\dim(V) \ge 1$,
$$
\begin{aligned}
\textstyle\bigwedge^m V &= \{ \lambda|_{\Lambda_m(V,K)} ; \lambda \in L_m(V, K)^* \} \\
&= \{ \lambda|_{\Lambda_m(V,K)} ; \lambda \in \otimes^m V \} \\
&\not\subseteq \otimes^m V = L_m(V, K)^*.
\end{aligned}
$$
That is, the individual tensors in $\bigwedge^m V$ are subsets (or function subgraphs) of the individual tensors in $\otimes^m V$, but the alternating tensor space as a whole is not a subset of the tensor space $\otimes^m V$. This observation scarcely rises above the pedantic. The reader (and writer) may safely ignore it.
general tensor contravariant covariant ⊗m V — — — — — ⊗mQV , T0m (V ), ⊗m VQ∗ , Tm0 (V), m ∗ m L V , IR L V, IR Federer [105] ⊗m V — Flanders [65] — — m m ∗ Frankel [18] ⊗ V ⊗ V Fulton/Harris [108] V ⊗m — Gallot/Hulin/Lafontaine [19] ⊗m V ⊗m V ∗ Kobayashi/Nomizu [26] Tm T0m (V ) 0 (V ) Lang [30] — Lm (V, IR) m Lee [32] T (V ) Tm (V ) Malliavin [35] — Lm (V, IR) Spivak [42] Tm (V ) T m (V ) Szekeres [44] — —
reference Bump [96] Crampin/Pirani [11] Darling [13] EDM2 [34]
Kennington
⊗m V
Lm (V, IR),
⊗m V ∗
antisymmetric tensor contravariant covariant m ∧ V — Vm Vm V∗ Vm V Vm ∗ V ,V Am (V → IR) Vm V m ∗ V V V V Vm m V V— m V — V— m V — — — Λm (V ) Vm V
Vm
V — Vm ∗ Vm V ∗ Vm V ∗ V — Vm ∧ m V , La (V, IR) Λm (V ) Lm,a (V, IR) Ωm (V ) Λm (V ∗ ), Λ∗m (V ) V Λm (V, IR), m V ∗
[ Present a basis for $\Lambda_r V$. Then show how things work in terms of this basis. ]
[ Demonstrate how alternating tensors are related to area and volume. ]
[ Give a diagram showing how vectors u and v are related to the directed area u ∧ v spanned by the vectors. Show sets of equivalent vector pairs. See Figure 25.4.1. ]
13.9.14 Theorem: Let $V$ be a linear space and $m \in \mathbb{Z}_0^+$. Then $\dim(\bigwedge^m V) = \dim(\Lambda_m V) = C^{\dim(V)}_m$.
Proof: It follows from Theorem 13.3.10 and Notation 13.9.3 that $\dim(\Lambda_m V) = C^n_m$ for $n = \dim(V)$. The dimensionality of $\bigwedge^m V$ then follows from Theorem 10.5.10.
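The count in Theorem 13.9.14 can be reproduced by enumeration. The Python sketch below assumes the standard fact that a basis of the space of alternating m-forms on an n-dimensional space can be indexed by the strictly increasing index sequences of length m drawn from {1, …, n}. The function name `increasing_index_sequences` is invented for this example.

```python
from itertools import combinations
from math import comb

def increasing_index_sequences(n, m):
    """All strictly increasing sequences of m indices chosen from 1..n."""
    return list(combinations(range(1, n + 1), m))

# Compare the number of index sequences with the binomial coefficient for small n and m,
# including the cases m = 0 (one empty sequence) and m > n (no sequences).
counts = {(n, m): len(increasing_index_sequences(n, m))
          for n in range(5) for m in range(6)}
all_match = all(counts[(n, m)] == comb(n, m) for (n, m) in counts)

print(increasing_index_sequences(3, 2), all_match)
```

For n = 3 and m = 2 the three sequences correspond to the three independent coordinate planes in 3-dimensional space.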
13.9.13 Remark: The following table summarizes the general and antisymmetric tensor space notations of a selection of authors. Although there is clearly much agreement, there is also significant diversity.
[ See Notation 7.11.10 for index sets $I^n_r$ with $C^n_r$ elements. ]
[ Also mention equivalents, isomorphisms and duals such as Hom(Λm V, W) ≅ Λm (V, W). See Federer [105], page 17. Also Hom(Λm V∗, W) ≅ Λm (V∗, W). ]
13.10. Alternating tensor algebra
Alternating tensors have a much more interesting algebra than general tensors.
13.10.1 Definition: The algebra Λ∗ (V, W ) of alternating forms over the linear space V is. . . [ See Federer [105] 1.4.2. ]
13.10.2 Definition: The exterior algebra Λ∗ V over the linear space V is. . . [ See Federer [105] 1.3.1. ]
13.10.3 Definition: The alternating algebra Λ∗ (V, W ) over the linear space V with coefficients in the alternating algebra W is. . . [ See Federer [105] 1.4.2. ]
[ After definition of Λr V , must also define the exterior product ∧ : Λr V × Λs V → Λr+s V . See Frankel [18], pages 67–68. ]
[ Maybe have a tiny section on symmetric tensors near here? Maybe not interesting enough. Could use $\bigodot^m V$ notation for the space of symmetric tensors and $\odot_{i=1}^m v_i = v_1 \odot v_2, \ldots v_m$ for symmetric tensors. Use $\bigodot_m(V, W)$ for the space $L^+_m(V, W)$ of symmetric multilinear maps on V? See Federer [105], page 41. ]
13.11. Tensor products defined via free linear spaces
This section presents an alternative approach to defining tensor products of linear spaces. A simpler approach is presented in Section 13.5. In this section, tensor products are carved out of free linear spaces. Free linear spaces are defined in Section 10.10.
[ The map $u \mapsto e_u$ in Definition 13.11.1 is called the “canonical projection” in EDM2 [34], 256.I. The standard immersion is called the “canonical multilinear map”. ]
13.11.1 Definition: The tensor product $\otimes_{\alpha\in A} V_\alpha$ of a family $(V_\alpha)_{\alpha\in A}$ of linear spaces over a field $K$ is the quotient linear space $F/G$ of the free linear space $F$ on the set $\times_{\alpha\in A} V_\alpha$ with respect to the subspace $G$ of $F$ generated by the set
$$
\begin{aligned}
&\{ e_u + e_v - e_w ; u, v, w \in \times_{\alpha\in A} V_\alpha \text{ and } \exists \beta \in A, (u_\beta + v_\beta = w_\beta \text{ and } \forall \alpha \in A \setminus \{\beta\}, u_\alpha = v_\alpha = w_\alpha) \} \\
&\cup \{ e_u - c e_v ; u, v \in \times_{\alpha\in A} V_\alpha, c \in K, \text{ and } \exists \beta \in A, (u_\beta = c v_\beta \text{ and } \forall \alpha \in A \setminus \{\beta\}, u_\alpha = v_\alpha) \},
\end{aligned}
$$
where $e_u \in F$ denotes the function $e_u = \chi_{\{u\}} : \times_{\alpha\in A} V_\alpha \to K$.
The standard immersion of a product $\times_{\alpha\in A} V_\alpha$ of linear spaces over a field $K$ into the tensor product $\otimes_{\alpha\in A} V_\alpha$ is the function $\mu : \times_{\alpha\in A} V_\alpha \to \otimes_{\alpha\in A} V_\alpha$ defined by
$$\forall v \in \times_{\alpha\in A} V_\alpha, \qquad \mu(v) = e_v + G.$$
That is, each element $v$ of $\times_{\alpha\in A} V_\alpha$ is mapped onto the coset in $F/G$ of the characteristic function of $\{v\}$.
13.11.2 Remark: Definition 13.11.1 may be interpreted as representing each symbol of the form $v$ as $e_v$. (See Figure 13.11.1.)
Figure 13.11.1: Function $e_u = \chi_{\{u\}}$ for α = 2, V1 = V2 = K = IR, u = (3, 4)
13.11.3 Remark: The tensor concept seems to have been invented by physicists. It sometimes happens that physicists represent concepts by symbols without regard to the representation of those symbols in terms of set-theoretic constructions. It is easy to write the symbol v ⊗ w for any vectors v and w, but due to the equivalence rules, it is difficult to give this a unique set theory representation, particularly for practical computation requirements. This is not difficult if using a fixed basis, but this has the disadvantage of dependence on the basis.
13.11.4 Remark: An example which demonstrates how Definition 13.11.1 works is the equality
$$(2, 4) \otimes (3, 5) = 2(1, 2) \otimes (3, 5) = (1, 2) \otimes (6, 10) = (1, 2) \otimes (1, 9) + (1, 2) \otimes (5, 1)$$
in the tensor product space $\mathbb{R}^2 \otimes \mathbb{R}^2$. Essentially, when the index set is of the form $A = \mathbb{N}_k = \{1, \ldots k\}$, Definition 13.11.1 creates a tensor space from a free linear space by applying any number of rules of the form $(u_1, \ldots u_\beta, \ldots u_k) + (u_1, \ldots v_\beta, \ldots u_k) = (u_1, \ldots u_\beta + v_\beta, \ldots u_k)$ and $c(u_1, \ldots u_\beta, \ldots u_k) = (u_1, \ldots c u_\beta, \ldots u_k)$ to determine equivalence classes. Thus, for instance, $(2, 4) \otimes (3, 5)$ really means $[((2, 4), (3, 5))]$, the equivalence class of the pair $((2, 4), (3, 5)) \in \mathbb{R}^2 \times \mathbb{R}^2$.
[ Near here, show how tensors may be defined in terms of a basis. ]
13.11.5 Remark: Let $e : \times_{\alpha\in A} V_\alpha \to F$ be the standard immersion of Definition 13.11.1. Let $j : F \to F/G$ be the standard map of Definition 10.7.1. Then the map $\mu = j \circ e : \times_{\alpha\in A} V_\alpha \to F/G$ satisfies $\mu(v) = \chi_{\{v\}} + G$. The subspace $G$ of $F$ corresponds to the relation that two vectors in $F$ are equivalent if and only if they are equal except for. . .
13.11.7 Theorem: The tensor product immersion $\mu : \times_{\alpha\in A} V_\alpha \to \otimes_{\alpha\in A} V_\alpha$ satisfies
(i) $\mu$ is multilinear,
(ii) for any linear space $W$ (over the same field as for the $V_\alpha$) and multilinear map $\psi : \times_{\alpha\in A} V_\alpha \to W$, there exists a unique linear map $f : \otimes_{\alpha\in A} V_\alpha \to W$ such that $\psi = f \circ \mu$.
Therefore Definition 13.11.1 satisfies Metadefinition 13.4.1.
13.11.8 Remark: All of the definitions, notations and theorems for the tensor space definition in Section 13.5 apply also to the free linear space definition.
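The chain of equalities in Remark 13.11.4 can be checked numerically once a basis is fixed. The following Python sketch is an illustration only, not part of the formal development. It assumes the standard basis of $\mathbb{R}^2$, in which a monomial $u \otimes w$ is represented by the $2\times2$ array of products $u_i w_j$. The helper names `outer`, `add` and `scale` are hypothetical. As noted in Remark 13.11.3, such a representation depends on the choice of basis.

```python
def outer(u, w):
    """Components u_i * w_j of the monomial u (x) w in the standard basis."""
    return [[ui * wj for wj in w] for ui in u]

def add(s, t):
    """Componentwise sum of two tensors stored as nested lists."""
    return [[a + b for a, b in zip(rs, rt)] for rs, rt in zip(s, t)]

def scale(c, t):
    """Scalar multiple of a tensor stored as a nested list."""
    return [[c * a for a in row] for row in t]

# The four expressions of Remark 13.11.4, in order.
lhs = outer((2, 4), (3, 5))
step1 = scale(2, outer((1, 2), (3, 5)))
step2 = outer((1, 2), (6, 10))
step3 = add(outer((1, 2), (1, 9)), outer((1, 2), (5, 1)))

print(lhs == step1 == step2 == step3)  # True
```

All four expressions have the same component array, which is what membership of the same equivalence class in $F/G$ amounts to in this basis.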
13.12. Tensor products defined via lists of tensor monomials
13.12.1 Remark: Many important definitions for tensors are stated in terms of tensor monomials and then extended by linearity to all tensors. Therefore it is important to establish the relation between tensor monomials and tensors. It is also important to ensure that all definitions expressed in terms of tensor monomials are independent of the polynomial representation. The “polynomial representation” in Definition 13.12.3 is not really a polynomial; it is a sum of monomials. A slightly more general-looking expression would have an arbitrary field element as a factor in front of each monomial term $\otimes_{\alpha\in A} v_{i,\alpha}$, but such factors can be absorbed into the monomials.
13.12.2 Definition: A tensor monomial in a tensor space $\otimes_{\alpha\in A} V_\alpha$ is a tensor $\otimes_{\alpha\in A} v_\alpha = \mu\big((v_\alpha)_{\alpha\in A}\big)$ for some $(v_\alpha)_{\alpha\in A} \in \times_{\alpha\in A} V_\alpha$, where $\mu$ is the canonical map of the tensor space.
[ A tensor monomial is called a “simple tensor” by Federer [105]? See Definition 13.5.6. ]
13.11.6 Remark: A metadefinition for the tensor product of linear spaces is given in Section 13.4. Definitions of tensor product spaces are “characterized” by a set of properties which any definition of tensor products must satisfy. These properties are asserted in Theorem 13.11.7 for the free linear space definition of tensor product given here.
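13.12.3 Definition: A tensor polynomial representation of a tensor $w$ in a tensor space $\otimes_{\alpha\in A} V_\alpha$ is a sequence $v = \big((v_{i,\alpha})_{\alpha\in A}\big)_{i=1}^{k} \in (\times_{\alpha\in A} V_\alpha)^k$, for some $k \in \mathbb{Z}_0^+$, such that $w = \sum_{i=1}^{k} \otimes_{\alpha\in A} v_{i,\alpha} = \sum_{i=1}^{k} \mu\big((v_{i,\alpha})_{\alpha\in A}\big)$. In other words, $w(\lambda) = \sum_{i=1}^{k} \lambda(\times_{\alpha\in A} v_{i,\alpha})$ for all $\lambda \in \mathcal{L}((V_\alpha)_{\alpha\in A}; K)$.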
13.12.4 Remark: The set of all tensor polynomial representations for a tensor space $\otimes_{\alpha\in A} V_\alpha$ in Definition 13.12.3 is a “list space” of the form $\mathrm{List}(\times_{\alpha\in A} V_\alpha) = \bigcup_{k=0}^{\infty} (\times_{\alpha\in A} V_\alpha)^k$. (See Sections 7.12 and 9.12 for list spaces. Although it is more logical to begin list indices at 0, it will be assumed here that monomial indices start at 1.) The list space consists of finite lists of indexed sets of vectors. The concatenation of two such lists yields another list in the same list space. Other useful operations on these lists include the omission of one or more elements and the insertion of one or more elements. Such operations are used extensively in tensor algebra, often in an informal manner which can be confusing or ambiguous. The list representation in Definition 13.12.3 is very close to the way one would represent tensors in symbolic algebra software. It is also very similar to the free linear space quotient style of definition of tensor spaces in Section 13.11.
13.12.5 Definition: The extended canonical map for a tensor space $\otimes_{\alpha\in A} V_\alpha$ is the map $\mu : \mathrm{List}(\times_{\alpha\in A} V_\alpha) \to \otimes_{\alpha\in A} V_\alpha$ defined so that $\mu(v)$ is the tensor represented by the polynomial $v$.
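Remark 13.12.4 compares the list representation to the way tensors might be stored in software. The following Python sketch illustrates that idea under stated assumptions: the index set is finite, each $V_\alpha$ is $\mathbb{R}^{n_\alpha}$ with its standard basis, and a tensor is identified with its array of components. The function names `monomial_components` and `extended_canonical_map` are hypothetical choices made for this illustration.

```python
from itertools import product
from math import prod

def monomial_components(vectors):
    """Components of the monomial (x)_a v_a in the standard bases, keyed by multi-index."""
    ranges = [range(len(v)) for v in vectors]
    return {idx: prod(v[i] for v, i in zip(vectors, idx)) for idx in product(*ranges)}

def extended_canonical_map(poly, dims):
    """Sum the monomials in the list `poly`; the empty list gives the zero tensor."""
    total = {idx: 0 for idx in product(*(range(n) for n in dims))}
    for vectors in poly:
        for idx, c in monomial_components(vectors).items():
            total[idx] += c
    return total

dims = (2, 2)
p = [((1, 2), (1, 9))]   # list with one monomial (1,2) (x) (1,9)
q = [((1, 2), (5, 1))]   # list with one monomial (1,2) (x) (5,1)
r = [((2, 4), (3, 5))]   # list with one monomial (2,4) (x) (3,5)

# Concatenating lists corresponds to adding the tensors which they represent.
print(extended_canonical_map(p + q, dims) == extended_canonical_map(r, dims))  # True
```

The lists `p + q` and `r` are different elements of the list space, but they are mapped to the same tensor. This is the non-uniqueness of polynomial representations which makes independence of representation worth checking.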
13.12.6 Remark: Although it is clear that all tensor polynomial representations in Definition 13.12.3 specify a tensor, it is not so obvious that all tensors have finite polynomial representations. Theorem 13.12.7 states that the extended canonical map in Definition 13.12.5 is surjective.
13.12.7 Theorem: All tensors in a tensor space $\otimes_{\alpha\in A} V_\alpha$ with finite-dimensional linear spaces $V_\alpha$ for a finite index set $A$ have a finite tensor polynomial representation.
Proof: For each $\alpha \in A$, let $(v_{\alpha,i})_{i=1}^{n_\alpha}$, where $n_\alpha = \dim(V_\alpha)$, be a basis for the linear space $V_\alpha$. ...
Chapter 14 Topology
14.1 History and generalities
14.2 Topological spaces
14.3 Some simple topologies on finite sets
14.4 Interior and closure of sets
14.5 Exterior and boundary of sets
14.6 Limit points and isolated points
14.7 Some simple topologies on countably infinite sets
14.8 Generation of topologies from collections of sets
14.9 The standard topology for the real numbers
14.10 Open bases and open subbases
14.11 Continuous functions
14.12 Homeomorphisms
14.0.2 Remark: Topology is the study of the connectivity of sets and the continuity of functions. So: Topology is the study of connectivity and continuity. Roughly speaking, continuous functions are those which preserve the connectivity of sets. (See Section 15.5 for details.) So continuity may be defined in terms of connectivity. But connectivity and continuity may both be defined in terms of the concepts of interior, exterior and boundary of sets. Thus: Topology is the study of the interior, exterior and boundary of sets. The interior Int(S) and exterior Ext(S) of a set S in a topological space X may be defined in terms of the boundary Bdy(S) as Int(S) = S \ Bdy(S) and Ext(S) = (X \ S) \ Bdy(S). So everything in topology may be defined in terms of boundaries of sets. Hence: Topology is the study of boundaries. The modern technical specification of a topology is expressed in terms of open sets. (An open set is a set which contains none of its boundary points.) The interior, exterior and boundary of a set are then defined in terms of these open sets. However, in terms of the meaning of a topology, boundaries are more fundamental than open sets. Both concepts contain the same information in a technical sense, but boundaries have a much stronger intuitive appeal than open sets.
14.0.1 Remark: Chapters 14 to 17 are not a full introduction to general topology. Only those aspects of topology which are needed for later chapters are presented here. Apart from the basic definitions, the most important aspects of topology for differential geometry are topological space construction techniques (such as product and quotient topologies), classes of topological spaces, continuous curves and paths, topological transformation groups, and metric spaces.
14.0.3 Remark: The concepts of the interior, exterior and boundary of sets are familiar from many real-life contexts, including the following.
(1) Biological cells and the bodies of animals and plants.
(2) Enclosed vehicles such as cars, aeroplanes, spacecraft and ships.
(3) Oceans, seas and lakes, water droplets, icebergs and glaciers.
(4) Nations, territories and continents.
(5) Planets, stars, asteroids.
14.0.4 Remark: A topology for a set $X$ defines the interior $\mathrm{Int}(S)$, exterior $\mathrm{Ext}(S)$ and boundary $\mathrm{Bdy}(S)$ of every subset $S$ of $X$. These three sets form a partition of $X$ for each set $S$. The technical specification of a topology uses open neighbourhoods to define these three sets. The interior of a set $S$ consists of the points $x_1$ which have at least one neighbourhood which is entirely inside the set $S$. The exterior consists of the points $x_3$ which have at least one neighbourhood which is entirely outside the set $S$. The boundary consists of the points $x_2$ which are neither interior nor exterior. In other words, every neighbourhood of a boundary point contains at least one point inside and one point outside $S$. (See Figure 14.0.1.)
[Figure 14.0.1: Interior, boundary and exterior of a set. The figure shows a set $S$ and its complement $X \setminus S$, with neighbourhoods of points $x_1 \in \mathrm{Int}(S)$, $x_2 \in \mathrm{Bdy}(S)$ and $x_3 \in \mathrm{Ext}(S)$.]
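For a finite topological space, the three sets in Remark 14.0.4 can be computed directly from the open sets. The following Python sketch is an illustration under stated assumptions: the space $X = \{1, 2, 3\}$ and the topology `T` are arbitrary examples chosen here, and the function names are hypothetical. The interior is computed as the union of all open sets contained in $S$, which agrees with the neighbourhood description above.

```python
def interior(S, T):
    """Union of all open sets which are contained in S."""
    return frozenset().union(*(U for U in T if U <= S))

def exterior(S, X, T):
    """The exterior of S is the interior of the complement of S."""
    return interior(X - S, T)

def boundary(S, X, T):
    """Points which are neither interior nor exterior."""
    return X - interior(S, T) - exterior(S, X, T)

X = frozenset({1, 2, 3})
T = {frozenset(), frozenset({1}), frozenset({1, 2}), X}  # an example topology on X
S = frozenset({1, 3})

int_S, ext_S, bdy_S = interior(S, T), exterior(S, X, T), boundary(S, X, T)
print(sorted(int_S), sorted(ext_S), sorted(bdy_S))  # [1] [] [2, 3]

# The three sets partition X, and the identities of Remark 14.0.2 hold.
print(int_S | ext_S | bdy_S == X)   # True
print(int_S == S - bdy_S)           # True
print(ext_S == (X - S) - bdy_S)     # True
```

In this example the point 3 lies in $S$ but not in its interior, because the only open set containing 3 is $X$ itself, which also meets $X \setminus S$.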
One way to think about set interiors and boundaries is to recall Zeno’s paradox of Achilles and the tortoise. Imagine the tortoise walking towards the boundary. Whenever Achilles gets to where the tortoise was, he still has the tortoise between him and the boundary. So Achilles always has a neighbour inside the set.
14.0.5 Remark: The pure mathematical concept of a zero-width boundary of a set (in the case of a metric space) will not match anything in the physical world if space (or space-time) turns out to be granular in nature. Even if space and time really are infinitely divisible, no physical measurement can be made to an infinite number of significant figures. However, the exact topological concepts of interior, exterior and boundary are applied only within pure mathematical models. The correspondence between the physical world and mathematical models is an application issue.
It is not totally implausible that the earliest vertebrate animals on Earth implemented the concepts of interior, exterior and boundary in the world-models by which they navigated their environments. These topological concepts are arguably more fundamental than numbers or propositional logic. The logical concept of a class of objects implies both an interior and an exterior of the class. So there is apparently a close fundamental relation between set theory and topology. Mathematical models of the physical world are very often expressed in terms of the interior, exterior and boundary of sets. For example, the Stokes theorem (in Section 20.9) expresses integrals over a region in terms of integrals over its boundary. Conservation equations for physical flows are expressed in terms of boundary integrals and interior integrals. Solutions of boundary value problems are typically expressed as integrals over boundaries and interior regions. A large proportion of complex analysis is concerned with integrals over boundary curves and their relations to integrals over interior regions bounded by curves. Perhaps most importantly, all of mathematical analysis is expressed in terms of limits, which are effectively equivalent to boundaries of sets. Thus analysis is based upon the boundary concept, which is the core concept of topology.
14.1. History and generalities
14.1.1 Remark: The word “topology” was introduced in 1847 by Johann Benedict Listing in “Vorstudien in Topologie”. This Greek-derived word (from “τόπος”) corresponds roughly to the earlier Latin phrase “analysis situs” for the same subject. The Latin word “situs” is defined by White [217], page 572, as: “The manner of lying; the situation, local position, site of a thing”. It is not really clear how this word is related to the subject matter of topology. Bell [190], page 492, suggested (in 1937) that the word “topology” was earlier than the term “analysis situs”.
    [. . . ] topology (now called analysis situs) as first developed bore but little resemblance to the elaborate theory which today absorbs all the energies of a prolific school [. . . ]
14.1.2 Remark: Most topology can be divided into two flavours: global connectivity classification and local continuity analysis.
The difference between these two flavours of topology is well described by Simmons [139], page viii, as follows.
    Historically speaking, topology has followed two principal lines of development. In homology theory, dimension theory, and the study of manifolds, the basic motivation appears to have come from geometry. In these fields, topological spaces are looked upon as generalized geometric configurations, and the emphasis is placed on the structure of the spaces themselves. In the other direction, the main stimulus has been analysis. Continuous functions are the chief objects of interest here, and topological spaces are regarded primarily as carriers of such functions and as domains over which they can be integrated. These ideas lead naturally into the theory of Banach and Hilbert spaces and Banach algebras, the modern theory of integration, and abstract harmonic analysis on locally compact groups.
Within the context of differential geometry, one could say, broadly speaking, that topology flavour (1) is principally the concern of the pure mathematicians whereas topology flavour (2) is of more interest to physicists. A prime example of the pure mathematical focus in differential geometry is the Poincaré conjecture. Physicists tend to be more interested in infinite-dimensional linear spaces of functions (such as vector and tensor fields) which are defined on topologically well-understood spaces such as $\mathbb{R}^k$ or $S^k$.
14.1.3 Remark: In popular presentations, topology is often explained as the study of properties of topological spaces which are invariant under homeomorphisms. (This is topology flavour (1) in Remark 14.1.2.) One could call this “tea-cup and dough-nut topology”. This kind of homeomorphism-invariant focus of topology is more or less in line with Felix Klein’s 1872 “Erlanger Programm”. The set of homeomorphisms may be regarded as the structure group for the geometry of topological spaces. This does not quite fit, however.
Klein’s definition of a geometry has a fixed set with a fixed set of automorphisms, not an infinite number of point sets. (See Remark 19.4.2 for related comments on pseudogroups of homeomorphisms.)
14.1.4 Remark: There is a close correspondence between set connectivity and function continuity. It follows from Theorem 15.5.8 that a bijection between two open subsets of $\mathbb{R}^n$ is a homeomorphism if and only if the image (and pre-image) of any disconnected pair of sets is a corresponding disconnected pair of sets. (This applies more generally to normal topological spaces.) In this sense, the connectivity of sets is a maximal invariant of the set of homeomorphisms of $\mathbb{R}^n$, and the set of homeomorphisms is the maximal pseudogroup which preserves connectivity. Thus one may say that connectedness is the fundamental invariant of homeomorphisms in the same sense that tangent vectors are (more or less) the fundamental invariant of $C^1$ diffeomorphisms.
(1) Global connectivity classification. This flavour of topology is concerned with classifying topological spaces into equivalence classes according to their connectivity properties. Algebraic topology, for example, attaches homeomorphism-invariant algebraic structures to topological spaces to assist in their classification.
(2) Local continuity analysis. This flavour of topology is mostly concerned with the pointwise continuity of maps between topological spaces for which the global connectivity properties are either trivial or of no interest. Topological vector spaces, for example, are generally trivially connected according to all of the connectivity algorithms defined in algebraic topology.
14.1.5 Remark: The local continuity analysis flavour of topology is closely associated with the limit concepts of analysis. In every interesting topology, each point x ∈ X has an infinite number of neighbourhoods which get closer and closer to x. (If there were only finitely many neighbourhoods, a single innermost neighbourhood could be used instead of an infinite set of them.) In this way, the notion of a “limit” is defined. A point x is a limit of a sequence of points y ∈ X if there is at least one point y in each of the neighbourhoods of x.
The idea of a “limit” has been psychologically troubling ever since the famous limit paradoxes of Zeno. Even in the 18th and 19th centuries, limits were found to be philosophically troubling. Topology is the subject which is supposed to resolve all of the issues regarding limits in a logically self-consistent fashion. However, since the concept of “infinity” is itself difficult to grasp and accept, it is never possible to fully resolve all difficulties. Topology can only ensure that the formalism is logically self-consistent. Topology cannot take away the essential discomfort of infinite and infinitesimal concepts.
14.1.6 Remark: Topology has numerous levels of structure within numerous classes. For example, separation classes are discussed in Section 15.2, connectivity classes in Section 15.4, separability classes in Section 15.6, and compactness classes in Section 15.7. It is very important to take note of the levels of structure required for each definition and theorem. Equally, it is important to clearly state the assumptions upon which all theorems and definitions are based. The full statement of assumptions is sometimes tedious both to write and read, but this is less painful in the long run than the occasional false application of theorems and definitions. (It is a common source of error in mathematics to apply a theorem when its assumptions are not satisfied, particularly in applications outside pure mathematics.) A similar hazard of false application is present in differential geometry. The level of structure required for definitions and theorems in differential geometry is often implied in the context, not stated explicitly in the statement of the definition or theorem. (This is the motivation for presenting the five-level DG structure model in Section 1.1.)
14.1.7 Remark: It could be argued that the scope of general topology is too wide. The most general definition of topology (Definition 14.2.3) permits absurd extremes of scope which are difficult to find applications for. Topological concepts have been generalized enormously in the last 200 years, far away from the original focus on Euclidean spaces and metric spaces. Most useful topology is carried out with topological spaces which are constrained in various ways to make them applicable, for example by requiring the spaces to fall within one or more of the classes referred to in Remark 14.1.6. In particular, most of the topology examples in Sections 14.3 and 14.7 are of very dubious applicability. Unbridled generalization often makes paths into barren deserts with oases few and far between.
[ Find a reference for the statement in Remark 14.1.8 that Cantor’s set theory was motivated by the desire to fill the gaps between the algebraic real numbers. This seems to be implicitly justified by Bell [189], pages 273–278, but an explicit reference would be better. ]
14.1.8 Remark: Set theory, as introduced by Georg Cantor, arose during the last quarter of the 19th century out of a need to fill the gaps between the algebraic real numbers. In topological language, this is a question about the completeness of the real numbers. So set theory, which is now the framework of almost all mathematics, may be said to have arisen from a question in topology. Therefore it is no surprise that so much basic set algebra is required for topology.
14.1.9 Remark: In topology, as in measure theory, many definitions are expressed in terms of arbitrary sets and vast collections of subsets. This often leads one into the temptation to invoke the axiom of choice as a convenient way of bringing sets and functions into existence to simplify proofs of theorems. The mathematician who is against the axiom of choice must be constantly on guard.
[ Apparently a topological space was originally defined by Hausdorff in 1914 with the Hausdorff space requirement. Then Kuratowski generalized topological spaces in 1922 to the modern definition. Find a printed reference for this in Remark 14.1.7. ]
14.2. Topological spaces
14.2.1 Remark: There are numerous equivalent formalisms for topological spaces. A topology may be formalized, for example, in terms of open sets (Definition 14.2.3), closed sets (Notation 14.2.15), interior operators (Definition 14.4.1), closure operators (Definition 14.4.4) or per-point neighbourhoods. (These are described and formalized axiomatically in EDM2 [34], 425.B, page 1606.) The most popular formalism defines a topology to be its set of open sets as in Definition 14.2.3. Simmons [139], page 98, says the following on this subject.
    A good deal of research was done along these lines in the early days of topology. It was found that there are many different ways of defining a topological space, all of which are equivalent to one another. Several decades of experience have convinced most mathematicians that the open set approach is the simplest, the smoothest, and the most natural.
From the point of view of doing the practical analysis, undoubtedly the open-set formalization is the best. However, it does lack direct intuitive appeal, as mentioned in Remark 14.0.2.
14.2.2 Remark: There is some redundancy in the specification tuple for a topological space. The set $X$ in a tuple $(X, T)$ always satisfies $X = \bigcup T$. But in usage, the set $X$ is usually in the foreground and the topology $T$ is in the background. Therefore the pair $(X, T)$ is often abbreviated to just $X$.
14.2.3 Definition: A topology for a set $X$ is a set $T \subseteq \mathbb{P}(X)$ such that
(i) $\{\emptyset, X\} \subseteq T$,
(ii) $\forall \Omega_1, \Omega_2 \in T,\ \Omega_1 \cap \Omega_2 \in T$,
(iii) $\forall C \subseteq T,\ \bigcup C \in T$.
14.2.4 Definition: A topological space is a pair $X$ −< $(X, T)$ such that $T$ is a topology for the set $X$. A point in a topological space $(X, T_X)$ is an element of $X$. The point set of a topological space $(X, T_X)$ is the set $X$. A set in a topological space $(X, T_X)$ is a subset of $X$.
14.2.5 Notation: $\mathrm{Top}(X)$ denotes the topology $T$ on a topological space $X$ −< $(X, T)$ when the choice of topology $T$ on $X$ is implicit in a particular context. That is, $\mathrm{Top}(X) = T$.
14.2.6 Remark: For topological spaces $(X, T)$, specification of the set $X$ is redundant because $X = \bigcup T$. So it is perhaps perplexing that the pair $(X, T)$ is usually abbreviated as $X$. The set $X$ certainly does not contain the full information in the pair $(X, T)$, but the set $T$ does contain the full information.
14.2.7 Remark: Since Theorem 14.2.8 (ii′) implies condition (ii) of Definition 14.2.3, the finite intersection condition (ii′) may be substituted for the two-set intersection condition (ii) without changing the definition. The two-set intersection rule (ii) in Definition 11.6.5 is chosen to facilitate the proof of validity of topologies. But by Theorem 14.2.8, the two-set rule always implies the more powerful finite intersection rule.
14.2.8 Theorem: Let $(X, T)$ be a topological space. Then
(ii′) $\forall C \subseteq T,\ (1 \le \#(C) < \infty) \Rightarrow \bigcap C \in T$.
Proof: To show (ii′), let $(X, T)$ be a topological space and let $C \subseteq T$ satisfy $1 \le \#(C) < \infty$. If $\#(C) = 1$, then $C = \{\Omega\}$ for some set $\Omega \in T$. So $\bigcap C = \Omega \in T$ as claimed. If $\#(C) = 2$, then $C = \{\Omega_1, \Omega_2\}$ for some sets $\Omega_1, \Omega_2 \in T$. So $\bigcap C = \Omega_1 \cap \Omega_2 \in T$ by Definition 14.2.3 (ii). The result $\bigcap C \in T$ for general $\#(C)$ may be proved by an induction argument. Let $n = \#(C) > 1$. Then there is a bijection $f : \mathbb{N}_n \to C$. So $\bigcap C = \bigcap_{i=1}^{n} f(i) = \big(\bigcap_{i=1}^{n-1} f(i)\big) \cap f(n)$. This is an element of $T$ by Definition 14.2.3 (ii) if (ii′) is valid for $\#(C) = n - 1$. Therefore by induction on $n$, (ii′) is valid for all $n = \#(C)$.
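For a finite set $X$, the conditions of Definition 14.2.3 can be checked mechanically. The following Python sketch is an illustration only, and the function name `is_topology` is a hypothetical choice. It relies on an assumption which is valid only for finite families: closure under arbitrary unions then follows by induction from closure under two-set unions together with $\emptyset \in T$, in the same way that Theorem 14.2.8 obtains finite intersections from two-set intersections.

```python
from itertools import combinations

def is_topology(X, T):
    """Check conditions (i)-(iii) of Definition 14.2.3 for a finite family T of subsets of X."""
    X = frozenset(X)
    T = {frozenset(U) for U in T}
    if not all(U <= X for U in T):
        return False                       # T must be a subset of the power set of X
    if frozenset() not in T or X not in T:
        return False                       # condition (i)
    for U, V in combinations(T, 2):
        if U & V not in T:
            return False                   # condition (ii): two-set intersections
        if U | V not in T:
            return False                   # condition (iii), reduced to two-set unions (finite case)
    return True

print(is_topology({1, 2, 3}, [set(), {1}, {1, 2}, {1, 2, 3}]))  # True
print(is_topology({1, 2, 3}, [set(), {1}, {2}, {1, 2, 3}]))     # False: {1, 2} is missing
print(is_topology(set(), [set()]))                              # True
```

The second family fails condition (iii) because the union $\{1\} \cup \{2\}$ is not a member. The third call accepts the pair $(\emptyset, \{\emptyset\})$ as a topological space.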
14.2.9 Remark: A topology for a set $X$ is a set of subsets of $X$ which contains the empty set, the set $X$, and any finite intersection or arbitrary union of sets in the topology. One might reasonably ask why there is an asymmetry between set intersections and set unions in Definition 14.2.3. A simple answer is that topology would be very boring if closure under arbitrary intersections were required. In that case, every topology on a set $X$ would be the set of all unions of a partition of $X$. If the topology had the ability to separate pairs of points at all (in the sense of the very weak $T_1$ separation class in Definition 15.2.4), the topology would be the power set $\mathbb{P}(X)$. Then the only connected sets would be singletons.
A better way to answer the question is to consider the intuitive idea of the interior of an open set. A set is intuitively defined to be “open” if every point in the set is in the interior of the set. If a point $x$ is in the interior of an open set $\Omega_1$ and also in the interior of $\Omega_2$, then we would expect $x$ to be in the interior of $\Omega_1 \cap \Omega_2$ although the “walls” of the set would be a little “closer”. In the case of a union $\Omega_1 \cup \Omega_2$, the “walls” of the union will be either the same “distance” away or further away. Since the union operation makes sets bigger, we are guaranteed to always be in the interior of a set no matter how many open sets are in a union, even an infinite or uncountably infinite number of open sets. The shrinkage of a set, on the other hand, has the danger that eventually we might not have enough room to move. Being in the “interior” of a set intuitively means that we have at least some “space” between each point and the “walls”. The amount of space for the intersection $\Omega_1 \cap \Omega_2$ should be the minimum of the space for the individual sets $\Omega_1$ and $\Omega_2$. But the minimum of a “small space” and a “small space” is still a “small space”. So by naive induction, we expect that the intersection of any finite number of sets will leave us at least some “room” around each point which is still in the intersection. Therefore a topological space requires closure under finite intersections and arbitrary unions.
14.2.11 Notation: $\mathrm{Top}_x(X)$ denotes the set of neighbourhoods of a point $x \in X$ −< $(X, T)$ for an implicit topology $T$ on $X$. That is, $\mathrm{Top}_x(X) = \{\Omega \in \mathrm{Top}(X);\ x \in \Omega\} = \{\Omega \in T;\ x \in \Omega\}$.
14.2.12 Definition: An open set in a topological space $(X, T_X)$ is any set $\Omega \in T_X$. A closed set in a topological space $(X, T_X)$ is any set $K \subseteq X$ such that $X \setminus K \in T_X$.
14.2.13 Remark: In the English-language literature, the letter G is often used for open sets and F is often used for closed sets. Maybe this is because the German word for “open” is “geöffnet” and the French word for “closed” is “fermé”. The letter K tends to be used for compact sets (Definition 15.7.4), but is also commonly used for closed sets. The letter Ω is very popular for open sets because “open”, “offen” and “ouvert” all start with “o” and in Greek, O-mega means “big O”. (The Greek letter ο has the name “o-micron”, which means “small O”.)
14.2.14 Remark: The set of closed sets for a topology is rarely given its own notation. EDM2 [34], 425.B, page 1606, uses $\mathfrak{O}$ (Fraktur O) for the set of open sets and $\mathfrak{F}$ (Fraktur F) for the set of closed sets. But the Fraktur font is difficult to write and read. The non-standard Notation 14.2.15 uses an over-bar to indicate the set of closed sets of a topological space $(X, T)$ by analogy with Notations 14.2.5 and 14.4.5.
14.2.15 Notation: $\overline{\mathrm{Top}}(X)$ denotes the set of closed sets in a topological space $X$ −< $(X, T)$. That is, $\overline{\mathrm{Top}}(X) = \{F \in \mathbb{P}(X);\ X \setminus F \in \mathrm{Top}(X)\} = \{X \setminus \Omega;\ \Omega \in \mathrm{Top}(X)\} = \{X \setminus \Omega;\ \Omega \in T\}$. $\overline{\mathrm{Top}}_x(X)$ denotes the set of closed sets in a topological space $X$ −< $(X, T)$ which contain a point $x \in X$. That is, $\overline{\mathrm{Top}}_x(X) = \{F \in \overline{\mathrm{Top}}(X);\ x \in F\}$.
14.2.10 Remark: Some texts (e.g. Simmons [139], section 16, p.92) do not permit a topology to be defined on an empty set X. However, this would be inconvenient for the statement of some kinds of general rules. It is tedious to have to always specially exclude cases where a set is empty. Therefore the pair (X, T ) where X = ∅ and T = {∅} will be regarded as a valid topology in this text. (See Example 14.3.2 for details.)
14.2.16 Theorem:
(1) The union of any finite set of closed sets in a topological space is closed.
(2) The intersection of any non-empty set of closed sets in a topological space is closed.
Proof: To prove part (1), let $C$ be a non-empty finite set of closed sets in a topological space $X$. Let $C' = \{X \setminus K;\ K \in C\}$. By Definition 14.2.12, $\forall S \in C',\ S \in \mathrm{Top}(X)$. By Theorem 14.2.8, $\bigcap C' \in \mathrm{Top}(X)$. But $\bigcup C = X \setminus (\bigcap C')$. So the union of $C$ is closed by Definition 14.2.12. If $C$ is empty, $\bigcup C = \emptyset$, which is a closed set.
For part (2), let $C$ be a non-empty set of closed sets in a topological space $X$. Let $C' = \{X \setminus K;\ K \in C\}$. By Definition 14.2.12, $\forall S \in C',\ S \in \mathrm{Top}(X)$. By Definition 14.2.3 (iii), $\bigcup C' \in \mathrm{Top}(X)$. But $\bigcap C = X \setminus (\bigcup C')$. So the intersection of $C$ is closed by Definition 14.2.12.
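The complement identities used in the proof of Theorem 14.2.16 can be observed on a small example. The following Python sketch is an illustration only. The three-point space and the topology `T` are arbitrary assumptions chosen here, and only two-set unions and intersections are tested, which suffices for a finite family.

```python
from itertools import combinations

X = frozenset({1, 2, 3})
T = {frozenset(), frozenset({1}), frozenset({1, 2}), X}  # an example topology on X

# Closed sets are the complements of the open sets (Definition 14.2.12).
closed = {X - U for U in T}

# De Morgan identities for pairs of open sets, as used in the proof.
de_morgan_ok = all(
    X - (U & V) == (X - U) | (X - V) and X - (U | V) == (X - U) & (X - V)
    for U, V in combinations(T, 2)
)

# Closure of the closed sets under two-set unions and intersections.
closure_ok = all(F | G in closed and F & G in closed for F, G in combinations(closed, 2))

print(de_morgan_ok, closure_ok)  # True True
```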
14.2.17 Remark: For any set X, both {∅, X} and the powerset $\mathbb{P}(X)$ are valid topologies on X. (See Exercise 47.7.1.) Note that the trivial topology {∅, X} contains only one element if X = ∅. Note also that the trivial topology and the discrete topology are the same if X is empty or contains only one element. It is clear that $\{\emptyset, X\} \subseteq T \subseteq \mathbb{P}(X)$ for any topology T on any set X. So the trivial topology is the smallest topology on a set, and the discrete topology is the largest topology (in the sense of the partial order on sets defined by set inclusion).

14.2.18 Definition: The trivial topology on a set X is the set $T_X = \{\emptyset, X\}$.
A trivial topological space is a topological space (X, T) such that T is the trivial topology on X.

14.2.19 Definition: The discrete topology on a set X is the set $T_X = \mathbb{P}(X)$.
A discrete topological space is a topological space (X, T) such that T is the discrete topology on X.
14.2.20 Remark: When more than one topology is being discussed on a single set, one often compares the relative “strength” or “weakness” of the topologies. It is often true that if a given topology has a property, then all stronger, or weaker, topologies also have that property. Therefore knowing the relative strength or weakness of topologies is often useful for proving properties more easily. In contradiction to standard English, topologies $T_1$ and $T_2$ such that $T_1 = T_2$ are said to be both weaker and stronger than each other. It is simpler to adopt this convention than to say, for instance, that “$T_1$ is weaker than or equal to $T_2$”. For any set X, the trivial topology {∅, X} is clearly weaker than all topologies on X, and the discrete topology $\mathbb{P}(X)$ is stronger than all topologies on X.

14.2.21 Definition: A topology $T_1$ on a set X is said to be weaker than a topology $T_2$ on X if $T_1 \subseteq T_2$. $T_1$ is said to be stronger than $T_2$ if $T_1 \supseteq T_2$.
14.3. Some simple topologies on finite sets

14.3.1 Remark: It is a useful familiarization exercise to determine in full generality the set of all topologies on the most trivial sets. On the other hand, the subject of topology is not directed at finite sets. Topology is principally concerned with limiting processes and continuity, and these concepts are only non-trivial for infinite sets. However, connectivity does make good sense in a finite set. Connectivity for finite sets is sometimes referred to as “network topology”, which is discussed in Section 16.10. The smallest topology Top(X) = {∅, X} on any set X is the trivial topology in Definition 14.2.18. The largest topology $\mathrm{Top}(X) = \mathbb{P}(X) = \{S;\ S \subseteq X\}$ on any set X is the discrete topology in Definition 14.2.19. The interesting thing is to determine what the other possibilities are for the topology on any given set X.

14.3.2 Example: The only possible topology on the set X = ∅ is Top(X) = {∅}. This gives the empty topology (X, T) = (∅, {∅}) which is mentioned in Remark 14.2.10.

14.3.3 Remark: When considering the possible topologies on countable sets of points, it is convenient to consider only sets whose “points” are integers. Only the cardinality of the point set matters for this task.
[ The terms “coarse” and “fine” are sometimes used instead of weak and strong. Define these. ]
14.3.4 Example: On a single-element set X = {1}, the only possible topology is Top(X) = {∅, X}. In this case, the trivial topology and the discrete topology are the same.

14.3.5 Example: On a two-element set X = {1, 2}, there are four possible topologies.

    label   topology                abbreviation
    a       {∅, X}                  0
    b       {∅, {1}, X}             1
    c       {∅, {2}, X}             2
    d       {∅, {1}, {2}, X}        1, 2
Topology a is the trivial topology. Topology d is the discrete topology. Topologies b and c are equivalent under a permutation of the point set. So there is only one “interesting” topology on a two-point set. The four possible topologies (amongst the $2^{(2^2)} = 16$ subsets of $\mathbb{P}(\{1, 2\})$) are illustrated in Figure 14.3.1.

Figure 14.3.1  All topologies on the set {1, 2} (panels a, b, c, d)
14.3.6 Example: On a three-element set X = {1, 2, 3}, there are 9 topologies which are unique up to permutation of the point set. The other topologies are obtained by permuting the point set.

    topology   abbreviation           multiplicity
    a          0                      1
    b          1                      3
    c          12                     3
    d          1, 12                  6
    e          1, 2, 12               3
    f          2, 12, 23              3
    g          1, 2, 12, 23           6
    h          1, 2, 3, 12, 13, 23    1
    i          1, 23                  3
    total                             29

Topology a is the trivial topology. Topology h is the discrete topology. Topology i, namely {∅, {1}, {2, 3}, X}, is listed last so that the labels a to h keep their meaning in the text which follows. Including all of the permutations of the point set, there are 29 possible topologies amongst the $2^{(2^3)} = 256$ subsets of $\mathbb{P}(\{1, 2, 3\})$. (Combinatorics enthusiasts may like to amuse themselves by trying to find a general formula for the number of possible topologies for each point-set cardinality.) Topologies a to g are illustrated in Figure 14.3.2.
Figure 14.3.2  Unique topologies on the set {1, 2, 3} (panels a to g)
The determination of the set of all valid topologies on a four-element set is left to the interested reader. (See Exercise 47.7.2.) Since ∅ and X are always elements of a topology on $X = \mathbb{N}_n$ for $n \in \mathbb{Z}^+$, the number of subsets of $\mathbb{P}(X)$ which must be checked to find valid topologies is $2^{2^n - 2}$ for n ≥ 1. This number increases rather rapidly with increasing n.
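The counts in Examples 14.3.5 and 14.3.6 can be reproduced by exhaustive search. The following is a minimal sketch, not part of the original text: the function names `powerset`, `topologies` and `canonical` are hypothetical, and the search simply tests each of the $2^{2^n - 2}$ candidate families containing ∅ and X for closure under pairwise unions and intersections (which is sufficient on a finite set).

```python
from itertools import combinations, permutations

def powerset(items):
    """All subsets of `items`, each returned as a frozenset."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def topologies(n):
    """Brute-force list of all topologies on X = {1, ..., n}."""
    X = frozenset(range(1, n + 1))
    # Only proper non-empty subsets are optional members of a topology.
    middle = [s for s in powerset(X) if s and s != X]
    found = []
    for extra in powerset(middle):      # 2**(2**n - 2) candidates for n >= 1
        T = extra | {frozenset(), X}
        # On a finite set, pairwise unions and intersections suffice.
        if all(a | b in T and a & b in T for a in T for b in T):
            found.append(T)
    return found

def canonical(T, n):
    """Smallest relabelling of T over all permutations of the point set."""
    points = range(1, n + 1)
    images = []
    for p in permutations(points):
        relabel = dict(zip(points, p))
        images.append(sorted(sorted(relabel[x] for x in U) for U in T))
    return min(images)

for n in range(4):
    all_T = topologies(n)
    classes = {repr(canonical(T, n)) for T in all_T}
    print(n, len(all_T), len(classes))
```

Under these assumptions the loop prints the pairs (total, classes up to permutation) as 1 and 1 for n = 0, 1 and 1 for n = 1, 4 and 3 for n = 2, and 29 and 9 for n = 3. The same search is still feasible for n = 4, but the doubly exponential candidate count makes it impractical soon after.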
Since ∅ and X are always elements of a topology on a set X, it makes sense to ignore them. Similarly, the set brackets are a distraction. So the topologies may be abbreviated as in the right column of the table.
14.3.7 Remark: In the case of finite point sets X, there is a simple duality between the open sets and closed sets because an “arbitrary union” of sets in a topology on X means a “finite union” of sets when X is finite. Let T be a topology on a finite set X. Then $\tilde{T} = \{X \setminus \Omega;\ \Omega \in T\}$ is a topology on X. The topology $\tilde{T}$ is a kind of “dual topology” of T. The closure of $\tilde{T}$ under set union follows from the fact that $(X \setminus \Omega_1) \cup (X \setminus \Omega_2) = X \setminus (\Omega_1 \cap \Omega_2)$ for all $\Omega_1, \Omega_2 \in T$. Closure under intersections follows from $(X \setminus \Omega_1) \cap (X \setminus \Omega_2) = X \setminus (\Omega_1 \cup \Omega_2)$. Of course the dual of the dual $\tilde{T}$ is the same as the original topology T.

In Example 14.3.5, topology b is the dual of topology c. Both topologies a and d are self-dual. In Example 14.3.6, the dual of topology b is the same as a permutation of topology c and the dual of topology e is the same as a permutation of topology f. Topologies d and g are equivalent (under set permutations) to their own dual topologies. Topologies a and h are self-dual.

14.3.8 Example: In practice, one usually wants topologies which have some sort of uniformity or symmetry. For example, a topology on sets like $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$ and $\mathbb{R}^n$ for $n \in \mathbb{Z}^+_0$ would generally be expected to be invariant under arbitrary translations. Translation invariance is a very strong constraint which greatly reduces the set of possible topologies on a set.

If X is a finite set, the only topologies on X which are invariant under all permutations of X are the trivial and discrete topologies. To see this, suppose {x} ∈ Top(X) for some x ∈ X. Then permutation invariance implies that {x} ∈ Top(X) for all x ∈ X. Since all unions of elements of Top(X) are elements of Top(X), this implies that $\mathrm{Top}(X) = \mathbb{P}(X)$. Now suppose that Top(X) contains a non-empty set A such that A ≠ X. Then there are elements x, y ∈ X such that x ∈ A and y ∉ A. So by permutation invariance of Top(X), the set $B_z = \mathrm{swap}_{y,z}(A)$ is an element of Top(X) for all $z \in Z = X \setminus \{x\}$. (See Definition 7.12.2 (vii) for the swap function.) Therefore the finite intersection $B = \bigcap_{z \in Z} B_z$ must be an element of Top(X). But B = {x}. So by the first argument, Top(X) is once again the discrete topology. Note that the finiteness of X was an essential step in this proof. It follows that there are no interesting uniform topologies on finite sets with respect to the set of all permutations of the point set.
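The duality of Remark 14.3.7 and the permutation-invariance claim of Example 14.3.8 can both be checked by brute force on a three-point set. This is only an illustrative sketch: the helper names `topologies`, `dual` and `image` are hypothetical, and the check covers a single small set rather than the general statement.

```python
from itertools import combinations, permutations

def powerset(items):
    """All subsets of `items`, each returned as a frozenset."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def topologies(X):
    """Brute-force list of all topologies on the finite set X."""
    middle = [s for s in powerset(X) if s and s != X]
    found = []
    for extra in powerset(middle):
        T = extra | {frozenset(), X}
        if all(a | b in T and a & b in T for a in T for b in T):
            found.append(T)
    return found

def dual(T, X):
    """The family of complements of the members of T."""
    return frozenset(X - U for U in T)

def image(T, relabel):
    """The family T after relabelling the points of X."""
    return frozenset(frozenset(relabel[x] for x in U) for U in T)

X = frozenset({1, 2, 3})
all_T = topologies(X)

# Remark 14.3.7: the dual of a topology is a topology, and the dual of the dual is T.
assert all(dual(T, X) in all_T for T in all_T)
assert all(dual(dual(T, X), X) == T for T in all_T)

# Example 14.3.8: only the trivial and discrete topologies are invariant
# under every permutation of the point set.
perms = [dict(zip(sorted(X), p)) for p in permutations(sorted(X))]
invariant = [T for T in all_T if all(image(T, r) == T for r in perms)]
print(sorted(len(T) for T in invariant))   # [2, 8]
```

The two surviving topologies have 2 and 8 members, which are the sizes of {∅, X} and the powerset of a three-point set.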
14.3.9 Example: If the set of all permutations of a finite set is replaced by the set of rotations, the situation in Example 14.3.8 is slightly different. A rotation $R_d : \mathbb{N}_n \to \mathbb{N}_n$ by distance d for $n, d \in \mathbb{Z}^+_0$ with n ≥ 1 and d < n is defined by $R_d : x \mapsto 1 + ((x + d - 1) \bmod n)$. Suppose n is a composite integer with n = kℓ and $k, \ell \in \mathbb{Z}^+ \setminus \{1\}$. (See Definition 7.4.2 for composite integers.) Let T be the set of subsets of $\mathbb{N}_n$ which have periodicity k. Then T is a topology on $\mathbb{N}_n$. This follows from the fact that the union of any set of subsets with period k also has period k. The same is true for intersections. This topology is then the same as ℓ copies of the discrete topology on $\mathbb{N}_k$.
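The construction in Example 14.3.9 can be tried out for n = 6 and k = 2. The sketch below assumes that “periodicity k” means invariance under the rotation $R_k$; that reading, and the names `rotate`, `subsets` and `T`, are assumptions made for illustration.

```python
from itertools import combinations

def rotate(S, d, n):
    """Image of S under R_d : x -> 1 + ((x + d - 1) mod n) on {1, ..., n}."""
    return frozenset(1 + (x + d - 1) % n for x in S)

n, k = 6, 2                      # n = k * l with l = 3
points = range(1, n + 1)
subsets = [frozenset(c) for r in range(n + 1) for c in combinations(points, r)]

# Assumed reading of "periodicity k": the subset is unchanged by rotation R_k.
T = {S for S in subsets if rotate(S, k, n) == S}

# T is a topology on {1, ..., n}.
assert frozenset() in T and frozenset(points) in T
assert all(a | b in T and a & b in T for a in T for b in T)

# T is invariant under every rotation R_d, not only under R_k.
assert all(rotate(S, d, n) in T for S in T for d in range(n))

print(sorted(sorted(S) for S in T))
# [[], [1, 2, 3, 4, 5, 6], [1, 3, 5], [2, 4, 6]]
```

Here T has $2^k = 4$ members, so it is strictly between the trivial and discrete topologies on a six-point set while still being rotation-invariant.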
14.4. Interior and closure of sets

14.4.1 Definition: The (topological) interior of a set S in a topological space (X, T) is the union of all open sets in (X, T) which are included by the set S. In other words, the interior of S is the set
$\bigcup_{\Omega \in T,\ \Omega \subseteq S} \Omega = \bigcup \{\Omega \in T;\ \Omega \subseteq S\}$.

14.4.2 Notation: Int(S) denotes the interior of a set S with respect to an implicit topological space (X, T).

14.4.3 Remark: The intersection expression in Definition 14.4.4 is well-defined because the set $\{X \setminus \Omega;\ \Omega \in T \text{ and } S \subseteq X \setminus \Omega\}$ is non-empty since ∅ ∈ T and $S \subseteq X \setminus \emptyset = X$.

14.4.4 Definition: The (topological) closure of a set S in a topological space (X, T) is the intersection of all closed sets in (X, T) which include the set S. In other words, the closure of S is the set
$\bigcap_{\Omega \in T,\ X \setminus \Omega \supseteq S} (X \setminus \Omega) = \bigcap \{X \setminus \Omega;\ \Omega \in T \text{ and } S \subseteq X \setminus \Omega\}$
$= \bigcap \{K \in \mathbb{P}(X);\ X \setminus K \in T \text{ and } S \subseteq K\}$
$= \bigcap \{K \in \overline{\mathrm{Top}}(X);\ S \subseteq K\}$.

14.4.5 Notation: $\bar{S}$ denotes the closure of a set S with respect to an implicit topology.
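Definitions 14.4.1 and 14.4.4 translate directly into code when the topological space is finite. The following sketch uses hypothetical function names `interior` and `closure`, and takes topology g of Example 14.3.6 as sample data.

```python
def interior(S, T):
    """Union of all open sets included in S (Definition 14.4.1)."""
    return frozenset().union(*(U for U in T if U <= S))

def closure(S, X, T):
    """Intersection of all closed sets which include S (Definition 14.4.4)."""
    # The list is never empty because X = X \ {} is always a closed superset of S.
    closed_supersets = [X - U for U in T if S <= X - U]
    return X.intersection(*closed_supersets)

X = frozenset({1, 2, 3})
# Topology g of Example 14.3.6, abbreviated there as "1, 2, 12, 23".
T = {frozenset(s) for s in [(), (1,), (2,), (1, 2), (2, 3), (1, 2, 3)]}

for S in [frozenset({1, 3}), frozenset({2}), frozenset({3})]:
    print(sorted(S), sorted(interior(S, T)), sorted(closure(S, X, T)))
# [1, 3] [1] [1, 3]
# [2] [2] [2, 3]
# [3] [] [3]
```

Each printed row lists S, Int(S) and the closure of S. For instance {2} is open but not closed in this topology, so its interior is {2} while its closure is {2, 3}.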
14.4.6 Remark: The notation $\bar{S}$ for the closure of a set S is used by Simmons [139], page 68; Rudin [136], page 39; Rudin [137], page 7; Taylor [144], page 26; Robertson/Robertson [135], page 6; Treves [146], page xv; Gilbarg/Trudinger [110], page 9; Adams [92], page 9; Helms [116], page 1; Darling [13], page 114; Malliavin [35], page 122; and Reinhardt [134], page 214. The notation $S^a$ is used by EDM2 [34], 425.B, page 1607; and Yosida [150], page 3. (The letter “a” may be mnemonic for “adherent set”. But Yosida [150] says that it comes from the German phrase for closure: “abgeschlossene Hülle”.) The notations $S^-$ and Cl S are given by Ahlfors [93], page 53.

14.4.7 Remark: Notations 14.4.2 and 14.4.5 require the topology to be determined by the context. So there can be confusion if there is more than one topology under consideration. As mentioned in Remark 14.2.6, a topological space (X, T) is fully determined by the set T. Therefore Notation 14.4.8 fully determines the implicit topological space (X, T) by naming only the topology T because $X = \bigcup T$.

14.4.8 Notation: $\mathrm{Int}_T(S)$ denotes the interior of a set S with respect to a topological space (X, T).
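14.4.9 Theorem: Let S be a subset of a topological space X. Then:
(1) $\mathrm{Int}(S) \in \mathrm{Top}(X)$.
(2) $\mathrm{Int}(S) \subseteq S$.
(3) $\bar{S} \in \overline{\mathrm{Top}}(X)$.
(4) $S \subseteq \bar{S}$.
(5) $X \setminus \bar{S} = \mathrm{Int}(X \setminus S)$.
(6) $\bar{S} = X \setminus \mathrm{Int}(X \setminus S)$.
(7) $\mathrm{Int}(S) \subseteq \bar{S}$.
(8) $\mathrm{Int}(S) = X \setminus \overline{(X \setminus S)}$.
(9) $\overline{X \setminus S} = X \setminus \mathrm{Int}(S)$.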
Let $S_1$ and $S_2$ be subsets of a topological space X. Then:
(10) $S_1 \subseteq S_2 \Rightarrow \mathrm{Int}(S_1) \subseteq \mathrm{Int}(S_2)$.
(11) $S_1 \subseteq S_2 \Rightarrow \bar{S}_1 \subseteq \bar{S}_2$.

Proof: Part (1) follows from Definition 14.2.3 because the interior of a set is the union of a set of open sets by Definition 14.4.1.

For part (2), let $C = \{\Omega \in \mathrm{Top}(X);\ \Omega \subseteq S\}$. Then $\forall z \in C,\ z \subseteq S$. So $\bigcup C \subseteq S$ by Theorem 5.14.7 (xiii). That is, $\mathrm{Int}(S) \subseteq S$.

Part (3) follows from Theorem 14.2.16 because the closure of a set is the intersection of a non-empty set of closed sets by Definition 14.4.4. (The set C of closed supersets of a set S ⊆ X is non-empty because X is a closed set in any topological space X. Therefore X ∈ C. See also Remark 14.4.3.)

For part (4), let $C = \{K \in \overline{\mathrm{Top}}(X);\ S \subseteq K\}$. Then $\forall z \in C,\ z \supseteq S$. So $\bigcap C \supseteq S$ by Theorem 5.14.7 (xiv). That is, $\bar{S} \supseteq S$.

Part (5) follows from $\mathrm{Int}(X \setminus S) = \bigcup \{\Omega \in \mathrm{Top}(X);\ \Omega \subseteq X \setminus S\} = \bigcup \{\Omega;\ \Omega \in \mathrm{Top}(X) \wedge S \subseteq X \setminus \Omega\} = X \setminus \bigcap \{X \setminus \Omega;\ \Omega \in \mathrm{Top}(X) \wedge S \subseteq X \setminus \Omega\}$, which equals $X \setminus \bar{S}$ by Definition 14.4.4.

Part (6) follows from part (5) and Theorem 5.13.10 (viii). Part (7) follows from parts (2) and (4). Part (8) follows from part (5) by substituting X \ S for S. Part (9) follows from part (8).

Part (10) follows from Theorem 5.14.7 (i) because $S_1 \subseteq S_2$ implies that $\{\Omega \in \mathrm{Top}(X);\ \Omega \subseteq S_1\} \subseteq \{\Omega \in \mathrm{Top}(X);\ \Omega \subseteq S_2\}$ (by the transitivity of the set inclusion relation).

Part (11) follows from Theorem 5.14.7 (ii) because $S_1 \subseteq S_2$ implies that $\{K \in \overline{\mathrm{Top}}(X);\ K \supseteq S_1\} \supseteq \{K \in \overline{\mathrm{Top}}(X);\ K \supseteq S_2\}$ (by the transitivity of the set inclusion relation).
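As a sanity check rather than a proof, the identities of Theorem 14.4.9 can be tested on every subset of a small space. The sketch below uses topology d of Example 14.3.6 as sample data; the function names are hypothetical.

```python
from itertools import combinations

def interior(S, T):
    """Union of all open sets included in S."""
    return frozenset().union(*(U for U in T if U <= S))

def closure(S, X, T):
    """Intersection of all closed sets which include S."""
    return X.intersection(*[X - U for U in T if S <= X - U])

X = frozenset({1, 2, 3})
# Topology d of Example 14.3.6, abbreviated there as "1, 12".
T = {frozenset(s) for s in [(), (1,), (1, 2), (1, 2, 3)]}
subsets = [frozenset(c) for r in range(4) for c in combinations(sorted(X), r)]

for S in subsets:
    I, C = interior(S, T), closure(S, X, T)
    assert I in T and (X - C) in T              # parts (1) and (3)
    assert I <= S <= C                          # parts (2), (4) and (7)
    assert X - C == interior(X - S, T)          # part (5)
    assert C == X - interior(X - S, T)          # part (6)
    assert I == X - closure(X - S, X, T)        # part (8)
    assert closure(X - S, X, T) == X - I        # part (9)

for A in subsets:
    for B in subsets:
        if A <= B:                              # parts (10) and (11)
            assert interior(A, T) <= interior(B, T)
            assert closure(A, X, T) <= closure(B, X, T)

print("checked", len(subsets), "subsets")       # checked 8 subsets
```

A passing run only shows that the statements hold for this one topology. Swapping in another family for `T` is an easy way to see which hypotheses matter.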
14.4.10 Theorem: Let S be a subset of a topological space X. Then:
(1) $S \in \mathrm{Top}(X) \Leftrightarrow \mathrm{Int}(S) = S$.
(2) $S \in \overline{\mathrm{Top}}(X) \Leftrightarrow \bar{S} = S$.
(3) $\mathrm{Int}(\mathrm{Int}(S)) = \mathrm{Int}(S)$.
(4) $\bar{\bar{S}} = \bar{S}$.
(5) $\overline{\mathrm{Int}(S)} \subseteq \bar{S}$.
(6) $\mathrm{Int}(\bar{S}) \supseteq \mathrm{Int}(S)$.
Let $S_1$ and $S_2$ be subsets of a topological space X. Then:
(7) $\mathrm{Int}(S_1) \cup \mathrm{Int}(S_2) \subseteq \mathrm{Int}(S_1 \cup S_2)$.
(8) $\mathrm{Int}(S_1) \cap \mathrm{Int}(S_2) \subseteq \mathrm{Int}(S_1 \cap S_2)$.
(9) $\bar{S}_1 \cup \bar{S}_2 \supseteq \overline{S_1 \cup S_2}$.
(10) $\bar{S}_1 \cap \bar{S}_2 \supseteq \overline{S_1 \cap S_2}$.

Proof: For part (1), let S be an open set in X. That is, S ∈ Top(X). Then $S \in \{\Omega \in \mathrm{Top}(X);\ \Omega \subseteq S\}$. So $S \subseteq \bigcup \{\Omega \in \mathrm{Top}(X);\ \Omega \subseteq S\} = \mathrm{Int}(S)$ by Definition 14.4.1. But $\mathrm{Int}(S) \subseteq S$ by Theorem 14.4.9 (2). So Int(S) = S. Now suppose that S ⊆ X and Int(S) = S. Then S is open by Theorem 14.4.9 (1). So part (1) is verified.

For part (2), let S be a closed set in X. That is, $S \in \overline{\mathrm{Top}}(X)$. Then $S \in \{K \in \overline{\mathrm{Top}}(X);\ K \supseteq S\}$. So $S \supseteq \bigcap \{K \in \overline{\mathrm{Top}}(X);\ K \supseteq S\} = \bar{S}$ by Definition 14.4.4. But $\bar{S} \supseteq S$ by Theorem 14.4.9 (4). So $\bar{S} = S$. Now suppose that S ⊆ X and $\bar{S} = S$. Then S is closed by Theorem 14.4.9 (3). So part (2) is verified.

Part (3) follows from part (1) and Theorem 14.4.9 (1). Part (4) follows from part (2) and Theorem 14.4.9 (3). Part (5) follows from Theorem 14.4.9 parts (2) and (11). Part (6) follows from Theorem 14.4.9 parts (4) and (10).

For part (7), note that $\mathrm{Int}(S_1) \cup \mathrm{Int}(S_2)$ is open by Theorem 14.4.9 (1) and Definition 14.2.3 (iii). So $\mathrm{Int}(S_1) \cup \mathrm{Int}(S_2) = \mathrm{Int}(\mathrm{Int}(S_1) \cup \mathrm{Int}(S_2))$ by part (1). This is a subset of $\mathrm{Int}(S_1 \cup S_2)$ by Theorem 14.4.9 parts (2) and (10). Part (8) may be proved as for part (7) except that Definition 14.2.3 part (ii) is used instead of part (iii).

For part (9), note that $\bar{S}_1 \cup \bar{S}_2$ is closed by Theorem 14.4.9 (3) and Theorem 14.2.16 (1). So $\bar{S}_1 \cup \bar{S}_2 = \overline{\bar{S}_1 \cup \bar{S}_2}$ by part (2). This is a superset of $\overline{S_1 \cup S_2}$ by Theorem 14.4.9 parts (4) and (11). Part (10) may be proved as for part (9) except that Theorem 14.2.16 part (2) is used instead of part (1).

14.4.11 Theorem: The closure of any set S in a topological space X is equal to the complement of the interior of the complement of S in X. In other words, $\bar{S} = X \setminus \mathrm{Int}(X \setminus S)$. The interior of S is the complement of the closure of the complement of S. That is, $\mathrm{Int}(S) = X \setminus \overline{X \setminus S}$.

Proof: To prove the first part, note that
$\bar{S} = \bigcap \{X \setminus \Omega;\ \Omega \in \mathrm{Top}(X) \wedge S \subseteq X \setminus \Omega\}$
$= X \setminus \bigcup \{\Omega;\ \Omega \in \mathrm{Top}(X) \wedge S \subseteq X \setminus \Omega\}$
$= X \setminus \bigcup \{\Omega;\ \Omega \in \mathrm{Top}(X) \wedge \Omega \subseteq X \setminus S\}$
$= X \setminus \mathrm{Int}(X \setminus S)$.
The second part follows by substituting X \ S for S.

14.4.12 Theorem:
(1) An element x of a set S in a topological space X is an element of the interior of S in X if and only if $\exists \Omega \in \mathrm{Top}_x(X),\ \Omega \subseteq S$.
(2) An element x of a set S in a topological space X is an element of the closure of S in X if and only if $\forall \Omega \in \mathrm{Top}_x(X),\ \Omega \cap S \neq \emptyset$.
Proof: To show part (1), assume that $\exists \Omega \in \mathrm{Top}_x(X),\ \Omega \subseteq S$. Let $G \in \mathrm{Top}_x(X)$ satisfy G ⊆ S. Then $x \in G \in \{\Omega \in \mathrm{Top}(X);\ \Omega \subseteq S\}$. So by Definition 14.4.1, x is in the interior of S. Conversely, assume that x ∈ Int(S). Then $x \in \bigcup \{\Omega \in \mathrm{Top}(X);\ \Omega \subseteq S\}$. Therefore $\exists \Omega \in \mathrm{Top}(X),\ x \in \Omega \subseteq S$. It follows that x ∈ Ω and so $\Omega \in \mathrm{Top}_x(X)$. Hence $\exists \Omega \in \mathrm{Top}_x(X),\ x \in \Omega \subseteq S$, as was to be shown.

Part (2) follows easily from part (1) by the application of Theorem 14.4.11. But it is instructive to prove it directly. For any x ∈ X,
$x \in \bigcap \{X \setminus \Omega;\ \Omega \in T \text{ and } S \subseteq X \setminus \Omega\}$
$\Leftrightarrow x \notin \bigcup \{\Omega;\ \Omega \in T \text{ and } S \subseteq X \setminus \Omega\}$
$\Leftrightarrow \neg\, \exists \Omega \in \mathrm{Top}(X),\ (x \in \Omega \wedge S \subseteq X \setminus \Omega)$
$\Leftrightarrow \neg\, \exists \Omega \in \mathrm{Top}(X),\ (x \in \Omega \wedge S \cap \Omega = \emptyset)$
$\Leftrightarrow \forall \Omega \in \mathrm{Top}(X),\ (x \notin \Omega \vee S \cap \Omega \neq \emptyset)$
$\Leftrightarrow \forall \Omega \in \mathrm{Top}(X),\ (x \in \Omega \Rightarrow S \cap \Omega \neq \emptyset)$
$\Leftrightarrow \forall \Omega \in \mathrm{Top}_x(X),\ S \cap \Omega \neq \emptyset$.
In other words, x is in the closure of S if and only if $\forall \Omega \in \mathrm{Top}_x(X),\ \Omega \cap S \neq \emptyset$, as claimed.

14.4.13 Remark: Definitions 14.4.1 and 14.4.4, and Theorem 14.4.12 may be summarized as follows.
$\mathrm{Int}(S) = \bigcup \{\Omega \in \mathrm{Top}(X);\ \Omega \subseteq S\} = \{x \in X;\ \exists \Omega \in \mathrm{Top}_x(X),\ \Omega \subseteq S\}$
$\bar{S} = \bigcap \{X \setminus \Omega;\ \Omega \in \mathrm{Top}(X) \text{ and } S \subseteq X \setminus \Omega\} = \bigcap \{K \in \overline{\mathrm{Top}}(X);\ S \subseteq K\} = \{x \in X;\ \forall \Omega \in \mathrm{Top}_x(X),\ \Omega \cap S \neq \emptyset\}$,
for any subset S of a topological space X.

14.4.14 Remark: Theorem 14.4.15 uses Notation 14.4.8 for the interior $\mathrm{Int}_T(S)$ of a set S with respect to a topology T. Unfortunately, the notation $\bar{S}$ for the closure of a set does not easily permit the addition of a subscript to indicate the implied topology explicitly. Therefore the ad-hoc notation $\mathrm{Closure}_T(S)$ may be used for the closure of a set S with respect to a topology T.

14.4.15 Theorem: Let $T_1$ and $T_2$ be topologies on a set X such that $T_1 \subseteq T_2$. Then:
(1) $\mathrm{Int}_{T_1}(S) \subseteq \mathrm{Int}_{T_2}(S)$ for all $S \in \mathbb{P}(X)$.
(2) $\mathrm{Closure}_{T_1}(S) \supseteq \mathrm{Closure}_{T_2}(S)$ for all $S \in \mathbb{P}(X)$.

Proof: For part (1), note that $\{\Omega \in T_1;\ \Omega \subseteq S\} \subseteq \{\Omega \in T_2;\ \Omega \subseteq S\}$. Therefore by Theorem 5.14.7 (i) and Definition 14.4.1, $\mathrm{Int}_{T_1}(S) = \bigcup \{\Omega \in T_1;\ \Omega \subseteq S\} \subseteq \bigcup \{\Omega \in T_2;\ \Omega \subseteq S\} = \mathrm{Int}_{T_2}(S)$. Part (2) follows from part (1) and Theorem 14.4.9 (6).

14.4.16 Remark: Theorem 14.4.15 says that strengthening the topology on a fixed set X makes interiors of sets larger, and makes closures of sets smaller. (The inequalities in Theorem 14.4.15 are illustrated in Figure 14.5.2.) In the extreme case of the strongest topology on X, namely the discrete topology $\mathbb{P}(X)$, both the interior and closure of S equal S itself. (This is discussed in more detail in Remark 14.7.10.) In the opposite extreme of the trivial topology T = {∅, X} on X, the interior of any set S in $\mathbb{P}(X) \setminus \{\emptyset, X\}$ equals ∅, and the closure of such a set S equals X. (See Remark 14.5.13 for similar comments on the exterior and boundary of sets.)

14.5. Exterior and boundary of sets

14.5.1 Remark: It seems reasonable to define the exterior of a set analogously to the interior. This is done in Definition 14.5.2. The exterior turns out to be the same as the complement of the closure given in Definition 14.4.4. A reasonable notation for the exterior of a set S would be Ext(S). It would perhaps also be reasonable to define the “exterior closure” of a set as the complement of the interior. But there seems to be little demand for this.
14.5.2 Definition: The (topological) exterior of a set S in a topological space X is the union of all open sets in X which are included in the set X \ S. In other words, the exterior of S is the set
$\bigcup_{\Omega \in \mathrm{Top}(X),\ \Omega \subseteq X \setminus S} \Omega = \bigcup \{\Omega \in \mathrm{Top}(X);\ \Omega \cap S = \emptyset\}$.

14.5.3 Notation: Ext(S) denotes the exterior of a set S in an implicit topological space.

14.5.4 Definition: The (topological) boundary of a set S in a topological space X is the complement of the interior of S within the closure of S; in other words, $\bar{S} \setminus \mathrm{Int}(S)$.

14.5.5 Notation: Bdy(S) denotes the boundary of a set S with respect to an implicit topology. ∂S is an alternative notation for Bdy(S).

14.5.6 Remark: Combining Definition 14.5.4 and Notation 14.5.5 gives $\partial S = \mathrm{Bdy}(S) = \bar{S} \setminus \mathrm{Int}(S)$ for subsets S of a topological space X. The notation ∂S is much more common than Bdy(S) for the boundary of a set S. However, the curly-dee symbol ∂ is also used extensively for denoting partial derivatives. In fact, the two concepts are closely related within the context of the theory of distributions. (The gradient of the indicator function of the set S is related to the boundary of S.)
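Definitions 14.5.2 and 14.5.4 can be evaluated on a finite space in the same way as the interior and closure. The sketch below, with hypothetical function names and topology g of Example 14.3.6 as sample data, also checks for every subset that Int(S), Bdy(S) and Ext(S) are pairwise disjoint and cover X, and that a set and its complement have the same boundary.

```python
from itertools import combinations

def interior(S, T):
    """Union of all open sets included in S."""
    return frozenset().union(*(U for U in T if U <= S))

def closure(S, X, T):
    """Intersection of all closed sets which include S."""
    return X.intersection(*[X - U for U in T if S <= X - U])

def exterior(S, T):
    """Union of all open sets disjoint from S (Definition 14.5.2)."""
    return frozenset().union(*(U for U in T if not (U & S)))

def boundary(S, X, T):
    """Closure of S with the interior of S removed (Definition 14.5.4)."""
    return closure(S, X, T) - interior(S, T)

X = frozenset({1, 2, 3})
# Topology g of Example 14.3.6, abbreviated there as "1, 2, 12, 23".
T = {frozenset(s) for s in [(), (1,), (2,), (1, 2), (2, 3), (1, 2, 3)]}
subsets = [frozenset(c) for r in range(4) for c in combinations(sorted(X), r)]

for S in subsets:
    I, B, E = interior(S, T), boundary(S, X, T), exterior(S, T)
    assert I | B | E == X                          # the three sets cover X
    assert not (I & B) and not (I & E) and not (B & E)   # pairwise disjoint
    assert B == boundary(X - S, X, T)              # same boundary as complement

S = frozenset({1, 3})
print(sorted(interior(S, T)), sorted(boundary(S, X, T)), sorted(exterior(S, T)))
# [1] [3] [2]
```

For S = {1, 3} the point 1 is interior, the point 2 is exterior, and the point 3 is a boundary point which happens to belong to S.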
14.5.7 Theorem: Let S be a subset of a topological space X. Then:
(1) $\mathrm{Ext}(S) = \mathrm{Int}(X \setminus S)$.
(2) $\mathrm{Ext}(S) \in \mathrm{Top}(X)$.
(3) $\mathrm{Ext}(S) = X \setminus \bar{S}$.
(4) $\mathrm{Ext}(S) \cap \bar{S} = \emptyset$.
(5) $\mathrm{Ext}(S) \subseteq X \setminus S$.
(6) $\mathrm{Bdy}(S) = \bar{S} \cap \overline{X \setminus S}$.
(7) $\mathrm{Bdy}(S) \in \overline{\mathrm{Top}}(X)$.
(8) $\mathrm{Bdy}(S) = (X \setminus \mathrm{Int}(S)) \setminus \mathrm{Ext}(S)$.
(9) $\mathrm{Bdy}(S) = \mathrm{Bdy}(X \setminus S)$.
(10) $X = \mathrm{Int}(S) \cup \mathrm{Bdy}(S) \cup \mathrm{Ext}(S)$.
(11) $(\mathrm{Int}(S) \cap \mathrm{Bdy}(S) = \emptyset) \wedge (\mathrm{Int}(S) \cap \mathrm{Ext}(S) = \emptyset) \wedge (\mathrm{Bdy}(S) \cap \mathrm{Ext}(S) = \emptyset)$.
(12) $\overline{X \setminus S} = \mathrm{Bdy}(S) \cup \mathrm{Ext}(S)$.
(13) $\mathrm{Int}(S) = \bar{S} \setminus \mathrm{Bdy}(S)$.
(14) $\mathrm{Int}(S) = S \setminus \mathrm{Bdy}(S)$.
(15) $\mathrm{Ext}(S) = (X \setminus S) \setminus \mathrm{Bdy}(S)$.
(16) $\bar{S} = \mathrm{Int}(S) \cup \mathrm{Bdy}(S)$.
(17) $\bar{S} = S \cup \mathrm{Bdy}(S)$.
(18) $\mathrm{Ext}(\mathrm{Ext}(S)) = \mathrm{Int}(\bar{S})$.
(19) $\mathrm{Int}(\mathrm{Ext}(S)) = \mathrm{Ext}(S)$.
(20) $\mathrm{Bdy}(\mathrm{Int}(S)) \subseteq \mathrm{Bdy}(S)$.
(21) $\mathrm{Bdy}(\bar{S}) \subseteq \mathrm{Bdy}(S)$.
(22) $\mathrm{Bdy}(\mathrm{Ext}(S)) \subseteq \mathrm{Bdy}(S)$.
(23) $\overline{\mathrm{Bdy}(S)} = \mathrm{Bdy}(S)$.
(24) $\mathrm{Bdy}(\mathrm{Bdy}(S)) \subseteq \mathrm{Bdy}(S)$.
Let $S_1$ and $S_2$ be subsets of a topological space X. Then:
(25) $S_1 \subseteq S_2 \Rightarrow \mathrm{Ext}(S_1) \supseteq \mathrm{Ext}(S_2)$.
Proof: Part (1) follows from $\mathrm{Ext}(S) = \bigcup \{\Omega \in \mathrm{Top}(X);\ \Omega \cap S = \emptyset\} = \bigcup \{\Omega \in \mathrm{Top}(X);\ \Omega \subseteq X \setminus S\}$, which equals $\mathrm{Int}(X \setminus S)$ by Definition 14.4.1.

Part (2) follows from Theorem 14.4.9 (1) because $\mathrm{Ext}(S) = \mathrm{Int}(X \setminus S)$ by part (1). Part (3) follows from part (1) and Theorem 14.4.9 (5). Part (4) follows from part (3) and Theorem 5.13.10 (v). Part (5) follows from part (1) and Theorem 14.4.9 (2). Part (6) follows from Definition 14.5.4 and Theorem 14.4.9 (8). Part (7) follows from part (6), Theorem 14.4.9 (3) and Theorem 14.2.16 (2).

For part (8), note that $\mathrm{Bdy}(S) = \bar{S} \setminus \mathrm{Int}(S) = (X \setminus \mathrm{Int}(X \setminus S)) \setminus \mathrm{Int}(S) = (X \setminus \mathrm{Int}(S)) \setminus \mathrm{Ext}(S)$ by part (1).

To show part (9), note that by part (8), $\mathrm{Bdy}(X \setminus S) = (X \setminus \mathrm{Int}(X \setminus S)) \setminus \mathrm{Ext}(X \setminus S) = (X \setminus \mathrm{Ext}(S)) \setminus \mathrm{Int}(S)$ by part (1) and Theorem 5.13.10 (viii). This equals Bdy(S) by part (8).

Part (10) follows from part (8) because $\mathrm{Bdy}(S) = (X \setminus \mathrm{Int}(S)) \setminus \mathrm{Ext}(S) = X \setminus (\mathrm{Int}(S) \cup \mathrm{Ext}(S))$.

Part (11) follows from part (8) because $\mathrm{Int}(S) \cap \mathrm{Ext}(S) = \emptyset$ follows from part (4) and Theorem 14.4.9 (7).

For part (12), note that $\mathrm{Bdy}(S) \cup \mathrm{Ext}(S) = ((X \setminus \mathrm{Int}(S)) \setminus \mathrm{Ext}(S)) \cup \mathrm{Ext}(S) = X \setminus \mathrm{Int}(S)$ by part (11). But by Theorem 14.4.9 (9), $\overline{X \setminus S} = X \setminus \mathrm{Int}(S)$.

Part (13) follows from Definition 14.5.4 and Theorem 14.4.9 (7). Part (14) follows from part (13) and Theorem 14.4.9 (2). Part (15) follows from parts (1) and (9). Part (16) follows from Definition 14.5.4 and Theorem 14.4.9 (7). Part (17) follows from part (16) and Theorem 14.4.9 parts (2) and (4).

Part (18) follows from $\mathrm{Ext}(\mathrm{Ext}(S)) = \mathrm{Int}(X \setminus \mathrm{Ext}(S))$ by part (1), which equals $\mathrm{Int}(\bar{S})$ by part (3).

Part (19) follows from part (2) and Theorem 14.4.10 (1).

Part (20) follows from $\mathrm{Bdy}(\mathrm{Int}(S)) = \overline{\mathrm{Int}(S)} \setminus \mathrm{Int}(\mathrm{Int}(S))$ by Definition 14.5.4, which equals $\overline{\mathrm{Int}(S)} \setminus \mathrm{Int}(S)$ by Theorem 14.4.10 (3), which is a subset of $\bar{S} \setminus \mathrm{Int}(S)$ by Theorem 14.4.10 (5), which equals Bdy(S).

For part (21), note that $\mathrm{Bdy}(\bar{S}) = \bar{\bar{S}} \setminus \mathrm{Int}(\bar{S})$ by Definition 14.5.4, which equals $\bar{S} \setminus \mathrm{Int}(\bar{S})$ by Theorem 14.4.10 (4), which is a subset of $\bar{S} \setminus \mathrm{Int}(S)$ by Theorem 14.4.10 (6), which equals Bdy(S).

For part (22), note that $\mathrm{Bdy}(\mathrm{Ext}(S)) = \mathrm{Bdy}(X \setminus \bar{S})$ by part (3), which equals $\mathrm{Bdy}(\bar{S})$ by part (9), and this is a subset of Bdy(S) by part (21).

Part (23) follows from part (7) and Theorem 14.4.10 (2).

For part (24), note that $\mathrm{Bdy}(\mathrm{Bdy}(S)) \subseteq \overline{\mathrm{Bdy}(S)}$ by Definition 14.5.4, but $\overline{\mathrm{Bdy}(S)} = \mathrm{Bdy}(S)$ by part (23).

Part (25) follows from part (1), Theorem 14.4.9 (10) and Theorem 5.13.12 (i).

14.5.8 Remark: Parts (10) and (11) of Theorem 14.5.7 imply that the set {Int(S), Bdy(S), Ext(S)} is a partition of the topological space X for any subset S of X.

14.5.9 Remark: For any set S in a topological space X, the points in Int(S) are always elements of S (by Theorem 14.4.9 (2)), and the points in Ext(S) are always elements of X \ S (by Theorem 14.5.7 (5)). But the points of Bdy(S) may belong to either S or X \ S. (See Figure 14.5.1.)

If the set S is open, all points of Bdy(S) belong to X \ S, whereas if the set S is closed, all points of Bdy(S) belong to S. Therefore one may refer to the points of Bdy(S) which are elements of X \ S as the “open portions” of the boundary, and the points of Bdy(S) which are elements of S as the “closed portions” of the boundary.

14.5.10 Theorem: Let X be a topological space. Then the following propositions are true.
(1) Int(∅) = ∅, Bdy(∅) = ∅ and Ext(∅) = X.
(2) Int(X) = X, Bdy(X) = ∅ and Ext(X) = ∅.

14.5.11 Remark: Notation 14.5.12 gives topology-dependent notations for the exterior and boundary of sets analogous to Notation 14.4.8 for the interior of sets.
Figure 14.5.1  Open and closed portions of boundary of a set S (labels: S, X \ S, open portion with Bdy(S) ⊆ X \ S, closed portion with Bdy(S) ⊆ S)
14.5.12 Notation: $\mathrm{Ext}_T(S)$ denotes the exterior of a set S with respect to a topological space (X, T). $\mathrm{Bdy}_T(S)$ denotes the boundary of a set S with respect to a topological space (X, T).

14.5.13 Remark: Theorem 14.5.14 says that strengthening the topology on a fixed set X makes set exteriors larger and set boundaries smaller. In the extreme case of the strongest topology on X, namely the discrete topology $\mathbb{P}(X)$, the exterior of S is equal to X \ S and the boundary of S is empty. In the opposite extreme of the trivial topology T = {∅, X} on X, the exterior of any set S in $\mathbb{P}(X) \setminus \{\emptyset, X\}$ equals ∅, and the boundary of such a set S equals X. (See Remark 14.4.16 for similar comments on the interior and closure of sets.)

14.5.14 Theorem: Let $T_1$ and $T_2$ be topologies on a set X such that $T_1 \subseteq T_2$. Then:
(1) $\mathrm{Ext}_{T_1}(S) \subseteq \mathrm{Ext}_{T_2}(S)$ for all $S \in \mathbb{P}(X)$.
(2) $\mathrm{Bdy}_{T_1}(S) \supseteq \mathrm{Bdy}_{T_2}(S)$ for all $S \in \mathbb{P}(X)$.

Proof: Part (1) follows from Theorem 14.4.15 (1) because $\mathrm{Ext}_T(S) = \mathrm{Int}_T(X \setminus S)$ for $T = T_1$ and $T_2$ by Theorem 14.5.7 (1). Part (2) follows from Theorem 14.4.15 and Theorem 14.5.7 (8).

14.5.15 Remark: The inequalities in Theorems 14.4.15 and 14.5.14 are illustrated in Figure 14.5.2.

Figure 14.5.2  Relation between topology strength and interior/boundary/exterior of sets (panels: weaker topology $T_1$, stronger topology $T_2$; regions: Int(S), Bdy(S), Ext(S), $\bar{S}$, S, X \ S, X)

The decreasing (actually non-increasing) boundary set Bdy(S) for a fixed set S with respect to the strength of the topology may be thought of as a process of nibbling away of the boundary by the interior Int(S) and the exterior Ext(S) as more open sets are added to the topology. When the topology is weak, many points have “undecided status”. That is, they are neither in the interior nor in the exterior. (Figure 14.5.1 in Remark 14.5.9 illustrates the “undecided status” of points in Bdy(S) which are in S or X \ S, but which are not allocated to either Int(S) or Ext(S).)
As the topology is strengthened, more and more points are decided as either interior or exterior points. The extreme cases, the trivial and discrete topologies, are illustrated in Figure 14.5.3, where almost all sets S in the trivial topology have Bdy(S) = X because no points in X are “decided”, whereas all sets S in the discrete topology have Bdy(S) = ∅ because all points in X are “decided”. 14.5.16 Remark: It is reasonable to seek additional inequalities resembling Theorems 14.4.15 and 14.5.14. For example, let S1 , S2 ∈ IP(X) for a topological space X. Then by Theorems 14.4.9 (10), 14.4.9 (11), and 14.5.7 (25), it follows that if S1 ⊆ S2 , then Int(S1 ) ⊆ Int(S2 ), S¯1 ⊆ S¯2 and Ext(S1 ) ⊇ Ext(S2 ), but there is no such general inequality relating Bdy(S1 ) to Bdy(S2 ). There seem to be no such inequalities for a fixed set S in the intersection X1 ∩ X2 of two topological spaces (X1 , T1 ) and not possible for two different sets X1 and X2 to have the same topology T (X2 , T2 ). It isT (because X1 = T1 and X2 = T2 ). But a reasonable correspondence between T1 and T2 would be {Ω ∩ X1 ; Ω ∈ T2 } = {Ω ∩ X2 ; Ω ∈ T1 }. (This means that the relative topologies of T1 and T2 on X1 ∩ X2 ¯ Bdy(S) are the same. See Definition 14.10.13 for relative topology.) With such a correspondence, Int(S), S, and Ext(S) are all independent of the choice of topology. So no interesting inequalities seem to be available for the condition X1 ⊆ X2 . 14.5.17 Theorem: Let X be a discrete topological space. Then the following propositions are true. ∀S ∀S ∀S ∀S ∀S ∀S
∈ IP(X), ∈ IP(X), ∈ IP(X), ∈ IP(X), ∈ IP(X), ∈ IP(X),
Proof: Part (2) Part (3) Part (4) Part (5) Part (6)
S ∈ Top(X). S ∈ Top(X). Int(S) = S. Ext(S) = X \ S. Bdy(S) = ∅. S¯ = S.
By Definition 14.2.19 for a discrete topological space, Top(X) = IP(X). This implies part (1). follows from part (1) and Definition 14.2.12. follows from part (1) and Theorem 14.4.10 (1). follows from part (3) and Theorem 14.5.7 (1). follows from parts (3) and (4) and Theorem 14.5.7 (8). follows from part (2) and Theorem 14.4.10 (2). (Or from part (4) and Theorem 14.5.7 (3).)
14.5.18 Theorem: Let X be a trivial topological space. Then the following propositions are true. (1) (2) (3) (4) (5) (6)
∀S ∀S ∀S ∀S ∀S ∀S
∈ IP(X), (S ∈ Top(X) ⇔ (S = ∅ ∨ S = X)). ∈ IP(X), (S ∈ Top(X) ⇔ (S = ∅ ∨ S = X)). ∈ IP(X) \ {X}, Int(S) = ∅. ∈ IP(X) \ {∅}, Ext(S) = ∅. ∈ IP(X) \ {∅, X}, Bdy(S) = X. ∈ IP(X) \ {∅}, S¯ = X.
Proof: Part (1) is equivalent to Definition 14.2.18 for the trivial topology. Part (2) follows from part (1) and the definition of closed sets. For part (3), note that if S ∈ IP(X) \ {X} and Ω ∈ Top(X) and Ω ⊆ S, then Ω = ∅. Part (4) follows from part (3) and Theorem 14.5.7 (1). Part (5) follows from parts (3) and (4). Part (6) follows from part (4) and Theorem 14.5.7 (3). 14.5.19 Remark: Theorems 14.5.17 and 14.5.18 are illustrated in Figure 14.5.3. Note that for the trivial topology on a non-empty set X, the label Int(S) = ∅ applies only if S 6= X; the label Ext(S) = ∅ applies only if S 6= ∅; the label Bdy(S) = ∅ applies only if S ∈ / {∅, X}; and the label S¯ = X applies only if S 6= ∅. (See Remark 14.5.15 for related comments.) [ www.topology.org/tex/conc/dg.html ]
[Figure 14.5.3: Interior/boundary/exterior of sets including trivial and discrete extremes. Left panel (weakest topology): the trivial topology {∅, X}, with Int(S) = ∅, Ext(S) = ∅, Bdy(S) = X and S̄ = X. Right panel (strongest topology): the discrete topology IP(X), with Int(S) = S, Ext(S) = X \ S, Bdy(S) = ∅ and S̄ = S.]
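The two extreme cases in Theorems 14.5.17 and 14.5.18 can be checked by brute force on a small finite set. The following is only a sketch: the function names are hypothetical, and the operators are computed from the assumed relations Ext(S) = Int(X \ S), Bdy(S) = X \ (Int(S) ∪ Ext(S)) and S̄ = X \ Ext(S), with a topology represented as a list of frozensets.

```python
from itertools import combinations


def powerset(X):
    """Return all subsets of the finite set X as frozensets."""
    items = list(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def interior(S, top):
    """Int(S): the union of all open sets which are included in S."""
    return frozenset().union(*(G for G in top if G <= S))


def exterior(S, X, top):
    """Ext(S), computed as Int(X minus S)."""
    return interior(X - S, top)


def boundary(S, X, top):
    """Bdy(S): the points of X which are neither interior nor exterior."""
    return X - interior(S, top) - exterior(S, X, top)


def closure(S, X, top):
    """Closure of S, computed as X minus Ext(S)."""
    return X - exterior(S, X, top)


X = frozenset({0, 1, 2})
trivial = [frozenset(), X]    # weakest topology on X
discrete = powerset(X)        # strongest topology on X

for S in powerset(X):
    print(sorted(S),
          "trivial Bdy:", sorted(boundary(S, X, trivial)),
          "discrete Bdy:", sorted(boundary(S, X, discrete)))
```

For every proper non-empty subset S, the trivial topology reports the whole of X as boundary, while the discrete topology reports an empty boundary.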
14.5.20 Remark: Figure 14.5.4 illustrates the way in which boundary thickness of a fixed set S in a topological space X decreases as the topology is strengthened. It is notable that both the weakest (i.e. trivial) topology and the strongest (i.e. discrete) topology contain no information. The extremes of topology strength add no information to the set S.

[Figure 14.5.4: The influence of topology strength on boundary thickness. Four panels show a set S in X with its boundary ∂S: weakest topology (Bdy(S) = X), weak topology (thick boundary), strong topology (thin boundary), and strongest topology (Bdy(S) = ∅).]
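The shrinking of the boundary under strengthening of the topology can be observed on a four-point set. The chain of topologies below is a hypothetical example chosen for illustration; each topology includes the previous one, and the boundary is computed as X \ (Int(S) ∪ Int(X \ S)).

```python
def interior(S, top):
    """Union of all open sets included in S."""
    return frozenset().union(*(G for G in top if G <= S))


def boundary(S, X, top):
    """Points which are neither interior nor exterior points of S."""
    return X - interior(S, top) - interior(X - S, top)


fs = frozenset
X = fs({0, 1, 2, 3})
S = fs({0, 1})

# An increasing chain of topologies on X, from weakest to stronger.
chain = [
    [fs(), X],
    [fs(), fs({0}), X],
    [fs(), fs({0}), fs({0, 1}), X],
    [fs(), fs({0}), fs({0, 1}), fs({2, 3}), fs({0, 2, 3}), X],
]

boundaries = [boundary(S, X, top) for top in chain]
for top, bdy in zip(chain, boundaries):
    print(len(top), "open sets, Bdy(S) =", sorted(bdy))
```

Each step adds open sets, so Int(S) and Ext(S) can only grow and Bdy(S) can only shrink.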
14.6. Limit points and isolated points

14.6.1 Definition: A limit point of a set S in a topological space X is a point x ∈ X which satisfies ∀Ω ∈ Top_x(X), Ω ∩ (S \ {x}) ≠ ∅. A limit point is also known as an accumulation point or a cluster point. The limit set of a set S in a topological space X is the set of limit points of S.

14.6.2 Remark: If a point x has only a finite number of neighbourhoods Ω ∈ Top_x(X), the point x can be a limit point only if there is at least one point y distinct from x which is in the intersection ⋂Top_x(X) of all neighbourhoods of x. (See Figure 14.6.1.) But then G = ⋂Top_x(X) must be an element of Top_x(X) (i.e. an open neighbourhood of x) if the number of neighbourhoods is finite. It would therefore follow that {x, y} ⊆ G ⊆ Ω for all Ω ∈ Top_x(X). This would imply that the topology is extremely weak. Such a topology would not even have the extremely weak T₁ separation property in Definition 15.2.4. Therefore limit points are of real interest only when there are infinitely many neighbourhoods at each point.

[Figure 14.6.1: Limit point x of a set S. Two panels, labelled “limit point x” and “not a limit point x”, each showing a neighbourhood Ω of x and the set S.]
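14.6.3 Remark: The limit set of a set S in a topological space X may be written as {x ∈ X; ∀Ω ∈ Top_x(X), Ω ∩ (S \ {x}) ≠ ∅}.

14.6.4 Theorem: A point x is a limit point of a set S in a topological space X if and only if $x \in \overline{S \setminus \{x\}}$.

Proof: The proof follows from Definitions 14.6.1 and 14.4.1. Let x ∈ X. Then

$$
\begin{aligned}
x \text{ is a limit point of } S
&\iff \forall \Omega \in \mathrm{Top}_x(X),\ \Omega \cap (S \setminus \{x\}) \ne \emptyset \\
&\iff \neg\, \exists \Omega \in \mathrm{Top}_x(X),\ \Omega \cap (S \setminus \{x\}) = \emptyset \\
&\iff \neg\, \exists \Omega \in \mathrm{Top}_x(X),\ \Omega \subseteq X \setminus (S \setminus \{x\}) \\
&\iff x \notin \mathrm{Int}(X \setminus (S \setminus \{x\})) \\
&\iff x \in \overline{S \setminus \{x\}}.
\end{aligned}
$$

The last line follows from Theorem 14.4.11.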
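Theorem 14.6.4 can be tested exhaustively on a finite topological space. The sketch below uses hypothetical function names and a hypothetical three-point example; the closure is computed as the complement of the union of all open sets which are disjoint from the given set.

```python
from itertools import combinations


def subsets(X):
    """All subsets of the finite set X as frozensets."""
    items = list(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def closure(S, X, top):
    """Complement of the union of all open sets disjoint from S."""
    return X - frozenset().union(*(G for G in top if not (G & S)))


def is_limit_point(x, S, top):
    """Every open neighbourhood of x meets S in a point other than x."""
    return all(G & (S - {x}) for G in top if x in G)


def limit_set(S, X, top):
    return frozenset(x for x in X if is_limit_point(x, S, top))


def isolated_points(S, X, top):
    """Points of S which are not limit points of S."""
    return S - limit_set(S, X, top)


def theorem_holds(X, top):
    """Check: x is a limit point of S iff x lies in the closure of S minus {x}."""
    return all(is_limit_point(x, S, top) == (x in closure(S - {x}, X, top))
               for S in subsets(X) for x in X)


X = frozenset({0, 1, 2})
top = [frozenset(), frozenset({0}), frozenset({0, 1}), X]
S = frozenset({0})
print(sorted(limit_set(S, X, top)), sorted(isolated_points(S, X, top)))
```

In this example the point 0 is isolated in S = {0}, whereas the points 1 and 2 are limit points of S because every neighbourhood of either point contains 0.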
14.6.5 Definition: An isolated point of a set S in a topological space X is a point x ∈ S which is not a limit point of S.

[ Show the relations between limit points and the continuity of functions. In some sense, continuous functions are those which preserve limit points. That is, the limit set of the image of a set by a continuous function includes the image of the limit set, roughly speaking. Also show the relation between limits of sets and limits of functions. And also show the exact general relation between boundaries and limit sets. Continuous functions preserve boundaries. So they should preserve limits also. Boundaries can usually be thought of as the simultaneous limits of interior and exterior points. Limits of sequences have something to do with this too. ]

14.7. Some simple topologies on countably infinite sets

14.7.1 Remark: Topology on finite sets is not very useful. It is an interesting exercise to check one’s understanding of the axioms of topology in finite-set situations which are simple enough to analyze completely. But the real value of topology is the ability to study limiting processes. Limits of finite sequences are essentially devoid of interest. Topology (in the small) arose historically from the study of limits of points and functions.

Limits are the essence of analysis. The word “ana-lysis” itself means breaking up something into tiny bits, from the Greek word “λύσις” (“loosing, dissolution, separation”) and “ἀνά” (“up, upwards”). Thus “ἀνάλυσις” means “dissolution” in the sense that a salt may be dissolved or loosened into ions by being immersed in water. (Related words are “cata-lysis”, “dia-lysis”, “electro-lysis”, “hydro-lysis”, “para-lysis” and “photo-lysis”.) The English word “solve” comes from the Latin word “solvere” (“to loosen”), which comes from Latin “seluo” (reflexive of “luo”). The Latin word “luo” (“loosen”) comes from Greek “λύω” (“to loosen, set free, release, dissolve, sever, destroy”), which is where the word “λύσις” comes from.

A finite set cannot be broken up into ever tinier bits. The task of topology is to assist the study of limiting processes when sets become infinitesimal (i.e. arbitrarily small). Consequently, topology on finite sets is purely recreational and educational.
14.7.2 Remark: The task of a topology is to separate points from each other (and sets from each other). A stronger topology is better at separating points. A weaker topology has less ability to separate points. (See Definition 14.2.21 for weaker and stronger topologies.) When a set is finite, the ability to separate points from each other implies that all singletons are open sets, which implies that the topology is the discrete topology. (See Theorem 15.2.10.) When a set is infinite, the ability to separate all pairs of singletons does not necessarily imply that the topology is discrete. This fact is demonstrated in Example 14.7.3.

14.7.3 Example: Figure 14.7.1 illustrates a topology with imperfect separation on a countably infinite set X = ℤ₀⁺. In other words, the topology is not the discrete topology. Even though the point x = 0 is separated from any other point y ∈ X by a set Ω_y ∈ Top(X), it is not possible to construct {x} as the intersection of a finite set of open sets Ω_y.

Figure 14.7.1 illustrates the set T = {∅} ∪ {Ω_y; y ∈ ℤ₀⁺} where Ω_y = {i ∈ ℤ₀⁺; i = 0 ∨ i > y} for y ∈ ℤ₀⁺.

[Figure 14.7.1: Topology with poor separation on a countably infinite set. The diagram shows the points 0, 1, 2, …, 9 and the nested open sets Ω₄, Ω₅, Ω₆ and Ω₇.]

It is easily verified that the range of any non-increasing sequence of elements of IP(X) (with respect to the set-inclusion partial order) is closed under finite intersection and arbitrary union. So if ∅ and X are added, the result is a valid topology. It follows that T is a valid topology on X.

Although x = 0 is separated from y by the open set Ω_y for all y ∈ ℤ⁺ (because 0 ∈ Ω_y and y ∉ Ω_y), the set {0} is clearly not equal to the intersection of any finite number of sets Ω_y. So {0} ∉ T although 0 is separated from all other individual elements of X.

The topology T may be extended to include all sets {y} for y ∈ ℤ⁺. Let T′ = {G₁ ∪ G₂; G₁ ∈ T and G₂ ∈ IP(ℤ⁺)}. (This is the same as T′ = IP(ℤ⁺) ∪ {Ω ∈ IP(ℤ₀⁺); #(ℤ₀⁺ \ Ω) < ∞}.) Then it is (fairly) clear that T′ is also a valid topology on X. This larger topology completely separates all points x ∈ ℤ⁺ from other elements y ∈ X \ {x}. The set {x} is in T′ for all x ∈ ℤ⁺, but {0} ∉ T′.
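The failure of finite intersections to isolate the point 0 in Example 14.7.3 can be made concrete. Since the sets Ω_y are infinite, the sketch below represents them by membership tests rather than by explicit sets; the function names are hypothetical.

```python
def in_omega(y, i):
    """Membership test for Omega_y = {i >= 0 : i == 0 or i > y}."""
    return i == 0 or i > y


def in_finite_intersection(ys, i):
    """Membership test for the intersection of Omega_y over the finite list ys."""
    return all(in_omega(y, i) for y in ys)


def witness(ys):
    """A non-zero element of the intersection of Omega_y over a non-empty list ys."""
    return max(ys) + 1


ys = [3, 1, 7]
w = witness(ys)
# The intersection contains 0 and also the non-zero point w, so it is not {0}.
print(in_finite_intersection(ys, 0), in_finite_intersection(ys, w), w)
```

The intersection of finitely many sets Ω_y is simply Ω_m with m the largest index, which is why no finite intersection can equal {0}.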
14.7.4 Example: There is now the question of whether infinite sets such as ℤ have interesting topologies which are invariant under various groups of permutations of the point set. Suppose a topology T on ℤ is invariant under all translations of ℤ and contains at least one non-empty finite set. Then T = IP(ℤ). To show this, let Ω ∈ T be a non-empty finite set. Let d = max(Ω) − min(Ω). Then Ω ∩ (Ω + d) = {max(Ω)} contains exactly one element of ℤ, where Ω + d denotes the translate of Ω by a distance d. It follows that {x} ∈ T for all x ∈ ℤ. From this, every subset of ℤ can be constructed as a union of singleton sets.

The set T_k of all subsets of ℤ with period k ∈ ℤ⁺ is a translation-invariant topology on ℤ. These topologies are simply infinite copies of the discrete topology on ℕ_k. The case k = 1 is the trivial topology.
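The key step of Example 14.7.4, namely that Ω ∩ (Ω + d) is a singleton when d = max(Ω) − min(Ω), is easy to verify numerically. This is a minimal sketch with hypothetical function names, using finite sets of integers.

```python
def translate(omega, d):
    """The translate of the set omega by the distance d."""
    return frozenset(x + d for x in omega)


def singleton_from_finite_open_set(omega):
    """Intersect omega with its translate by d = max(omega) - min(omega)."""
    d = max(omega) - min(omega)
    return omega & translate(omega, d)


def singleton_at(omega, x):
    """Translate the singleton obtained from omega so that it becomes {x}."""
    return translate(singleton_from_finite_open_set(omega), x - max(omega))


omega = frozenset({-3, 0, 2, 7})
print(sorted(singleton_from_finite_open_set(omega)), sorted(singleton_at(omega, 100)))
```

Only intersections and translations are used, so any translation-invariant topology which contains omega must also contain every singleton produced in this way.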
[ Give a general theorem about the consequences of invariance of a topology under all permutations of the point set. General permutations include all bijections. Maybe also examine a sub-class of permutations which only permute at most a finite set of points, or a countably infinite set of points, and so forth. ]

14.7.5 Example: The topologies in Example 14.7.4 do not exhaust all possibilities for translation-invariant topologies on ℤ. A topology can be defined on X = ℤ by

Top(X) = {∅} ∪ {Ω ∈ IP(X); #(X \ Ω) < ∞}.    (14.7.1)

This set of subsets of X = ℤ is closed under finite intersection because ℤ \ (Ω₁ ∩ Ω₂) = (ℤ \ Ω₁) ∪ (ℤ \ Ω₂), which is a finite set if Ω₁, Ω₂ ∈ Top(ℤ) are non-empty. Closure under union is equally clear. This topology is invariant under all permutations of the point set ℤ.
The set defined by (14.7.1) is a valid topology for any set X, and it is always permutation-invariant. On the downside, it is not very useful in applications. (However, Remark 15.2.15 does use this topology as a borderline example.)

[ It would be interesting to apply the style of analysis in this section to uniform or translation-invariant topologies on ℚ, IR and IRⁿ for n ∈ ℤ₀⁺. ]
14.7.6 Definition: The trivial closed-point topology on a set X is the set

T = {∅} ∪ {Ω ∈ IP(X); #(X \ Ω) < ∞}.    (14.7.2)

A trivial closed-point topological space is a topological space (X, T) such that T is the trivial closed-point topology on X.

14.7.7 Remark: Definition 14.7.6 is almost certainly non-standard. The “trivial closed-point topology” on a set X is the smallest topology on X for which singletons {x} are closed sets for all x ∈ X. (This property of a topology is defined as the T₁ separation property in Section 15.2. So a less clumsy name for this concept would be a “trivial T₁ topology”.)

14.7.8 Theorem: The trivial closed-point topology on a set X is a topology on X.

Proof: Let X be a set. Define T as in equation (14.7.2). Then clearly T ⊆ IP(X) and {∅, X} ⊆ T. So Definition 14.2.3 (i) is satisfied.

Let Ω₁, Ω₂ ∈ T. If Ω₁ = ∅ or Ω₂ = ∅, then Ω₁ ∩ Ω₂ = ∅ ∈ T. So suppose that Ω₁ ≠ ∅ and Ω₂ ≠ ∅. Then Ω₁, Ω₂ ∈ IP(X), and #(X \ Ω₁) < ∞ and #(X \ Ω₂) < ∞. But X \ (Ω₁ ∩ Ω₂) = (X \ Ω₁) ∪ (X \ Ω₂) by Theorem 5.13.11. So #(X \ (Ω₁ ∩ Ω₂)) < ∞ and therefore Ω₁ ∩ Ω₂ ∈ T, which satisfies Definition 14.2.3 (ii).

Let C ⊆ T and let Ω = ⋃C. If Ω = ∅ then Ω ∈ T. So suppose that Ω ≠ ∅. Then ∃G ∈ C, G ≠ ∅. Therefore ∃G ∈ C, (G ⊆ X ∧ #(X \ G) < ∞). But by Theorem 5.14.7 (xi), G ∈ C implies that G ⊆ ⋃C. Therefore ∃G ∈ C, (G ⊆ Ω ∧ #(X \ G) < ∞). Hence #(X \ Ω) ≤ #(X \ G) < ∞. So ⋃C ∈ T, which satisfies Definition 14.2.3 (iii).

14.7.9 Theorem: The trivial closed-point topology on a set X is the smallest topology on X for which {x} is a closed set for all x ∈ X.

Proof: To show that {x} is closed for all x ∈ X for the trivial closed-point topology on the set X, let Ω = X \ {x} and note that #(X \ Ω) = 1 < ∞.

Let T be the trivial closed-point topology on a set X. To show that T ⊆ T′ for all topologies T′ on X such that {x} is a closed subset of X with respect to T′ for all x ∈ X, let T′ be such a topology. Let Ω ∈ T. If Ω = ∅, then Ω ∈ T′ because T′ is a topology. So let Ω be a subset of X such that #(X \ Ω) < ∞. Let C = {X \ {x}; x ∈ X \ Ω}. Then #(C) < ∞ and C ⊆ T′ because every singleton {x} is closed with respect to T′. If #(C) = 0 then Ω = X, and so Ω ∈ T′ because T′ is a topology on X. So assume that 1 ≤ #(C) < ∞. Then ⋂C ∈ T′ because T′ is a topology. Thus Ω = ⋂C ∈ T′. Hence T ⊆ T′. That is, T is smaller than (or equal to) every topology on X for which {x} is a closed set for all x ∈ X.

14.7.10 Remark: Let X be a topological space with the trivial closed-point topology. If X is a finite set, then Top(X) = IP(X). In other words, the topology is the same as the discrete topology in Definition 14.2.19. So for any subset S of X, the interior, boundary and exterior are as follows.

Int(S)    Bdy(S)    Ext(S)
S         ∅         X \ S

If the set X is infinite, the interior, boundary and exterior are as follows.

cardinality of S              Int(S)    Bdy(S)    Ext(S)
#(S) < ∞                      ∅         S         X \ S
#(S) = ∞, #(X \ S) = ∞        ∅         X         ∅
#(X \ S) < ∞                  S         X \ S     ∅
These tables suggest that the trivial closed-point topologies are rather uninteresting. When X is finite, there are no boundary points for any set, and the interior and exterior are simply S and X \ S for any set S ∈ IP(X). So the topology gives no information about a set S other than its set of elements. When X is infinite, any set S for which #(S) = ∞ and #(X \ S) = ∞ has Bdy(S) = X. In other words, such sets have no interior and no exterior. So the topology says very little of interest about sets S, other than whether they (and their complements) are finite or infinite.
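The first and third rows of the table for infinite X can be reproduced symbolically. In the sketch below, which uses a hypothetical encoding, a subset of an infinite set X is represented either as ("finite", F), meaning the finite set F, or as ("cofinite", F), meaning X \ F for a finite set F. Sets which are both infinite and co-infinite (the middle row) are not representable in this encoding.

```python
EMPTY = ("finite", frozenset())


def complement(s):
    """Complement in X: swaps the finite and cofinite encodings."""
    kind, elems = s
    return ("cofinite" if kind == "finite" else "finite", elems)


def union(a, b):
    """Union of two encoded sets."""
    (ka, ea), (kb, eb) = a, b
    if ka == "finite" and kb == "finite":
        return ("finite", ea | eb)
    if ka == "cofinite" and kb == "cofinite":
        return ("cofinite", ea & eb)
    if ka == "finite":
        return ("cofinite", eb - ea)
    return ("cofinite", ea - eb)


def is_open(s):
    """Open sets are the empty set and the sets with finite complement."""
    return s[0] == "cofinite" or s == EMPTY


def interior(s):
    """A finite set has empty interior because X is infinite."""
    return s if is_open(s) else EMPTY


def exterior(s):
    return interior(complement(s))


def boundary(s):
    return complement(union(interior(s), exterior(s)))


finite_set = ("finite", frozenset({1, 2}))
cofinite_set = ("cofinite", frozenset({1, 2}))
for s in (finite_set, cofinite_set):
    print(s[0], "Int:", interior(s), "Bdy:", boundary(s), "Ext:", exterior(s))
```

In both representable cases the boundary is the finite set F itself, which matches the entries S and X \ S in the Bdy(S) column.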
It can be hoped that this simple class of topologies may be of some use in providing pathological examples to disprove false conjectures. In particular, various trivial classes of topologies show that Definition 14.2.3 for a general topology is perhaps overly broad. This motivates the introduction of the classes of topologies in Chapter 15, which add various sets of extra axioms to topologies to make them more likely to be useful. Sections 14.8 and 14.10 introduce methods of generating topologies which are more interesting than the trivial closed-point topologies because the interior, boundary and exterior of sets can be made to contain much more information when topologies are built up in more sophisticated ways.
14.8. Generation of topologies from collections of sets

14.8.1 Theorem: Let T be a non-empty set of topologies on a set X. Then ⋂T is a topology on X.

14.8.2 Theorem: Let (T_i)_{i∈I} be a non-empty family of topologies on a set X. Then T = ⋂_{i∈I} T_i is a topology on X.

14.8.3 Remark: On any set X, a topology may be generated on X from any given subset S of IP(X). (Note that two distinct subsets S₁ ≠ S₂ of IP(X) may generate the same topology on X.) This is a very general and useful procedure for specifying topologies on sets.

14.8.4 Definition: The topology generated by S on X, for any sets X and S such that S ⊆ IP(X), is the intersection of all topologies T on X such that S ⊆ T.

14.8.5 Theorem: Let X and S be sets which satisfy S ⊆ IP(X). Then the topology generated by S on X is a valid topology on X.

Proof: Let 𝒯 = {T ∈ IP(IP(X)); S ⊆ T and T is a topology on X}, where X and S are sets which satisfy S ⊆ IP(X). Then the topology generated by S on X is equal to ⋂𝒯. The set 𝒯 is non-empty because IP(X) ∈ 𝒯 for any set X. (See Definition 14.2.19 for the discrete topology IP(X) on X.) So ⋂𝒯 is a valid topology on X by Theorem 14.8.1.

14.8.6 Remark: In the special case that S = ∅, the topology generated by S on any set X is the trivial topology {∅, X} on X. This is still true under the slightly weaker condition that S ⊆ {∅, X}.

14.8.7 Remark: Theorem 14.8.8 shows a method of constructing the topology generated by a set-collection S on a set X. The condition {∅, X} ⊆ S ensures that the constructed set T is a valid topology on X. The construction for the set T in Theorem 14.8.8 may be combined in a single line as follows:

T = { ⋃Q; ∀U ∈ Q, ∃C ∈ IP(S), (U = ⋂C and 1 ≤ #(C) < ∞) }.

14.8.8 Theorem: Let {∅, X} ⊆ S ⊆ IP(X) for sets X and S. Define

T′ = { ⋂C; C ∈ IP(S) and 1 ≤ #(C) < ∞ }
and
T = { ⋃Q; Q ∈ IP(T′) }.

Then T is the topology generated by S on X.
Proof: First show that T is a topology according to Definition 14.2.3. To show that ∅ ∈ T, let Q = ∅. Then ∅ ⊆ T′ (by Theorem 5.8.8). So ∅ = ⋃Q ∈ T. To show that X ∈ T, let C = {X} ⊆ S. Then ⋂C = X ∈ T′. So Q = {X} ⊆ T′. Therefore X = ⋃Q ∈ T. This establishes Definition 14.2.3 (i).

Let A₁, A₂ ∈ T. Then A_i = ⋃Q_i for some Q_i ⊆ T′ for i = 1, 2. Hence A₁ ∩ A₂ = (⋃Q₁) ∩ (⋃Q₂) = ⋃{U₁ ∩ U₂; U₁ ∈ Q₁, U₂ ∈ Q₂} (by Theorem 5.14.7 (v)). For i = 1, 2, U_i ∈ Q_i implies that U_i ∈ T′ and so U_i = ⋂C_i for some collection C_i ⊆ S with 1 ≤ #(C_i) < ∞. Therefore U₁ ∩ U₂ = (⋂C₁) ∩ (⋂C₂) = ⋂(C₁ ∪ C₂) (by Theorem 5.14.7 (x)). Then C₁ ∪ C₂ ⊆ S and 1 ≤ #(C₁ ∪ C₂) < ∞. So ⋂(C₁ ∪ C₂) ∈ T′. That is, U₁ ∩ U₂ ∈ T′. Therefore A₁ ∩ A₂ ∈ T. This proves part (ii) of Definition 14.2.3.

The closure of T under arbitrary unions is guaranteed by Theorem 5.15.4. So T satisfies all of the conditions of Definition 14.2.3 for a topology.

To prove that T is the topology generated by S on X, show first that S ⊆ T. Note that S ⊆ T′. (This is because C = {U} ⊆ S and #(C) = 1 for all U ∈ S. So U = ⋂C ∈ T′.) Similarly, T′ ⊆ T. (This is because Q = {V} ⊆ T′ for all V ∈ T′. So V = ⋃Q ∈ T.) It follows that S ⊆ T.

Let T̄ be a topology on X which satisfies S ⊆ T̄. Then T′ ⊆ T̄ because T̄ is closed under finite intersection and T′ is the closure of S under finite intersection. Similarly, T ⊆ T̄ because T̄ is closed under arbitrary unions and T is the closure of T′ under arbitrary unions. Therefore T is included in the intersection of all topologies T̄ on X which include S. Since T is itself such a topology, it follows that T is equal to the intersection of all such topologies. Therefore T satisfies Definition 14.8.4 for the topology generated by S on X.

[ Try to show a reverse of Theorem 14.8.8, namely that the set of all finite intersections of arbitrary unions of a set of sets is a topology. If this is not true, investigate why not. If it is true, this may need some support from a theorem which looks like Theorem 5.15.4. ]

14.8.9 Remark: The proof of Theorem 14.8.8 does not use the axiom of choice because it was carefully avoided in the proof of the closure of T under arbitrary unions in Theorem 5.15.4. Avoiding the axiom of choice in topology is difficult because topology deals with such general sets and collections of sets. Measure theory is another subject which tempts one to use the axiom of choice because of the enormous generality of the sets.

14.8.10 Remark: It is a useful exercise to verify theorems in topology for trivial cases. See Exercise 47.7.3 for verification of Theorem 14.8.8 for X = ∅.

14.8.11 Remark: The topology generated on a set X by a set S of subsets of X is the unique topology on X which includes S and is weaker than all other topologies on X which include S.

14.8.12 Remark: Theorem 14.8.13 is a version of Theorem 14.8.8 which assumes that the set S is closed under finite intersections. Then only half of the construction work is required to build the topology generated by S on X. To facilitate comparisons, the set S is denoted as T′. In fact, the proof of Theorem 14.8.8 could have been shortened by first proving Theorem 14.8.13 and then applying it to the set T′ in Theorem 14.8.8.

14.8.13 Theorem: Let X and T′ be sets such that {∅, X} ⊆ T′ ⊆ IP(X) and T′ is closed under finite intersections. Define

T = { ⋃Q; Q ∈ IP(T′) }.

Then T is the topology generated by T′ on X.

Proof: To show that ∅ ∈ T, let Q = ∅ ∈ IP(T′). Then ∅ = ⋃Q ∈ T. To show that X ∈ T, let Q = {X} ∈ IP(T′). Then X = ⋃Q ∈ T. So {∅, X} ⊆ T.

To show that T is closed under finite intersections, let A₁, A₂ ∈ T. Then A_i = ⋃Q_i for some Q_i ⊆ T′ for i = 1, 2. Hence A₁ ∩ A₂ = (⋃Q₁) ∩ (⋃Q₂) = ⋃Q with Q = {U₁ ∩ U₂; U₁ ∈ Q₁, U₂ ∈ Q₂} (by Theorem 5.14.7 (v)). For i = 1, 2, U_i ∈ Q_i implies that U_i ∈ T′ and so U₁ ∩ U₂ ∈ T′ by the closure of T′ under finite intersections. So Q ∈ IP(T′). Therefore ⋃Q ∈ T. That is, A₁ ∩ A₂ ∈ T. Hence T is closed under finite intersections.

The closure of T under arbitrary unions is guaranteed by Theorem 5.15.4. So T satisfies all of the conditions of Definition 14.2.3 for a topology on X.

To show that T is the topology generated by T′ on X, first show that T′ ⊆ T. Let U ∈ T′. Then {U} ∈ IP(T′). So U = ⋃{U} ∈ T. Hence T′ ⊆ T. Let T̄ be a topology on X which satisfies T′ ⊆ T̄. Then T ⊆ T̄ because T̄ is closed under arbitrary unions and T is the closure of T′ under arbitrary unions. Therefore T is included in the intersection of all topologies T̄ on X which include T′. Since T is itself such a topology, it follows that T is equal to the intersection of all such topologies. Therefore T satisfies Definition 14.8.4 for the topology generated by T′ on X.
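For a finite set X, the two-stage construction of Theorem 14.8.8 (finite intersections, then arbitrary unions) can be compared directly with Definition 14.8.4 (the intersection of all topologies which include S). The following brute-force sketch uses hypothetical function names and is practical only for very small sets, since it enumerates all collections of subsets of X.

```python
from itertools import combinations


def powerset(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def finite_intersections(S):
    """T' = set of intersections of non-empty finite sub-collections of S."""
    S = list(S)
    return {frozenset.intersection(*C)
            for r in range(1, len(S) + 1) for C in combinations(S, r)}


def arbitrary_unions(Tp):
    """T = set of unions of all sub-collections of T' (including the empty one)."""
    return {frozenset().union(*Q) for Q in powerset(Tp)}


def generated_topology(X, S):
    """Construction of Theorem 14.8.8, after adding the empty set and X to S."""
    S = set(S) | {frozenset(), frozenset(X)}
    return arbitrary_unions(finite_intersections(S))


def is_topology(X, T):
    """Finite-set check: contains the empty set and X, closed under pairwise meet and join."""
    return (frozenset() in T and frozenset(X) in T
            and all(A & B in T and A | B in T for A in T for B in T))


def generated_by_definition(X, S):
    """Definition 14.8.4: intersect all topologies on X which include S."""
    S = set(S)
    candidates = [set(T) for T in powerset(powerset(X))
                  if S <= T and is_topology(X, T)]
    return set.intersection(*candidates)


X = frozenset({0, 1, 2})
S = [frozenset({0, 1}), frozenset({1, 2})]
print(sorted(sorted(G) for G in generated_topology(X, S)))
```

The two computations agree for this example, and the set {1} appears in the result only because it is the intersection of the two generating sets.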
14.9. The standard topology for the real numbers

[ Maybe also present other “standard topologies” in this section. For example, standard topologies for the integers, rational numbers and complex numbers, and products of these spaces. Could also present some basic properties of the standard topologies in this section. ]

14.9.1 Definition: The usual topology for the real numbers is the topology generated by the set of all real open intervals.

14.9.2 Remark: Definition 14.9.1 is not as circular as it looks. In Definition 8.3.10, an “open interval” is defined as a set of the form (a, b), where $a, b \in \overline{\mathbb{R}}$ with a ≤ b. It turns out that open intervals are indeed open sets in the usual topology of IR as one would expect.

14.9.3 Definition: The usual topology for IRⁿ for n ∈ ℤ⁺ is the topology generated by the set of all products of real open intervals.

14.9.4 Remark: The products of real open intervals in Definition 14.9.3 are sets of the form ∏_{i=1}^{n} (a_i, b_i), where (a_i)_{i=1}^{n}, (b_i)_{i=1}^{n} ∈ IRⁿ are sequences of real numbers such that a_i ≤ b_i for all i = 1 … n. The usual topology on IRⁿ is the same as the product topology for the set product IRⁿ = ∏_{i=1}^{n} IR. (See Definition 15.1.1 for general product topologies.)

14.9.5 Remark: Topological properties of real number intervals are presented in Section 15.8.

[ Also present the standard topology on the complex numbers here. ]

14.10. Open bases and open subbases

14.10.1 Remark: In practice, people do not specify all of the open sets in a topology. A topology is most conveniently generated from an open base or open subbase. This is similar to the way a linear space is generated from a basis. Many operations on linear spaces can be specified for a basis, from which the operations on the whole space follow. In the same way, many definitions and calculations for topological spaces may be specified for an open base or open subbase, from which the corresponding operations follow for the full topology.

14.10.2 Definition: An open base for a topological space (X, T) is a set S ⊆ IP(X) such that T = { ⋃D; D ⊆ S }.

14.10.3 Definition: An open subbase for a topological space (X, T) is a subset S of IP(X) such that the set { ⋂C; C ⊆ S and 1 ≤ #(C) < ∞ } of finite intersections of sets in S is an open base for (X, T).

14.10.4 Theorem: Let X and S be sets satisfying {∅, X} ⊆ S ⊆ IP(X). Let T be a topology on X. Then S is an open subbase for the topological space (X, T) if and only if T is the topology generated by S on X.

14.10.5 Theorem: Let (Y, T_Y) be a topological space and let f : X → Y be any function from X to Y. Then T_X = {f⁻¹(Ω); Ω ∈ T_Y} is a topology for X.

Proof: ∅, X ∈ T_X since ∅ = f⁻¹(∅) and X = f⁻¹(Y). If G₁, G₂ ∈ T_X, then G₁ = f⁻¹(Ω₁) and G₂ = f⁻¹(Ω₂) for some Ω₁, Ω₂ ∈ T_Y. So by Theorem 6.6.4 (iv), G₁ ∩ G₂ = f⁻¹(Ω₁ ∩ Ω₂) ∈ T_X. Let (G_i)_{i∈I} be a family of sets G_i ∈ T_X such that G_i = f⁻¹(Ω_i) for some sets Ω_i ∈ T_Y. Then by Theorem 6.6.8 (iii), ⋃_{i∈I} G_i = f⁻¹(⋃_{i∈I} Ω_i) ∈ T_X because ⋃_{i∈I} Ω_i ∈ T_Y. So T_X satisfies the conditions of Definition 14.2.3 for a topology.
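Theorem 14.10.5 can be illustrated on finite sets, together with a contrasting example in which the forward images of open sets fail to form a topology. The maps, sets and function names below are hypothetical; functions are represented as Python dictionaries.

```python
def preimage(f, X, omega):
    """Inverse image of omega under the map f, which is given as a dict on X."""
    return frozenset(x for x in X if f[x] in omega)


def induced_topology(f, X, top_Y):
    """The set of inverse images of the open sets of Y."""
    return {preimage(f, X, omega) for omega in top_Y}


def forward_images(f, top_X):
    """The set of forward images of the open sets of X."""
    return {frozenset(f[x] for x in omega) for omega in top_X}


def is_topology(X, T):
    """Finite-set check of the topology axioms."""
    return (frozenset() in T and frozenset(X) in T
            and all(A & B in T and A | B in T for A in T for B in T))


fs = frozenset

# Inverse images: always a topology on the domain.
X = fs({0, 1, 2, 3})
Y = fs({"a", "b", "c"})
top_Y = {fs(), fs({"a"}), fs({"a", "b"}), Y}
f = {0: "a", 1: "a", 2: "b", 3: "c"}
T_X = induced_topology(f, X, top_Y)

# Forward images: may fail, even for a surjective map.
top_X = {fs(), fs({0, 1}), fs({2, 3}), X}
g = {0: "a", 1: "b", 2: "b", 3: "c"}
images = forward_images(g, top_X)

print(is_topology(X, T_X), is_topology(Y, images))
```

In the second example the images {a, b} and {b, c} are both present, but their intersection {b} is not the image of any open set, so closure under finite intersection fails.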
14.10.6 Remark: Theorem 14.10.5 does not have a forward analogue which states that (Y, T_Y) is a topological space with T_Y = {f(Ω); Ω ∈ T_X} for a topological space (X, T_X) with f : X → Y. The reason the inverse map f⁻¹ works so well is that an inverse function is always one-to-one and onto, which causes the set map f⁻¹ : IP(Y) → IP(X) to send intersections to intersections and unions to unions. (See Theorems 6.6.3, 6.6.4 and 6.6.8.)

14.10.7 Remark: Theorem 14.10.5 is generalized in Theorem 14.10.8 to an arbitrary family of functions and target topologies.

14.10.8 Theorem: Let (Y_i, T_i)_{i∈I} be a family of topological spaces for some non-empty index set I. Let X be a set and let f_i : X → Y_i be a function for all i ∈ I. Define

T′_X = { ⋂_{j∈J} f_j⁻¹(Ω_j); J ⊆ I, 1 ≤ #(J) < ∞, and ∀j ∈ J, Ω_j ∈ T_j }
and
T_X = { ⋃S; S ⊆ T′_X }.

Then T_X is a topology for X.

Proof: ∅, X ∈ T_X because ∅ = ⋃∅ and X = f_i⁻¹(Y_i) for any i ∈ I. Let S = ⋃_{i∈I} {f_i⁻¹(Ω_i); Ω_i ∈ T_i} = {f_i⁻¹(Ω_i); i ∈ I, Ω_i ∈ T_i} and S′ = { ⋂C; C ⊆ S, 1 ≤ #(C) < ∞ }. Clearly T′_X ⊆ S′. Since any element of S′ is a finite intersection of sets f_i⁻¹(Ω_i), there must be only a finite number of these sets for each i ∈ I. The intersection ⋂_{k∈K} f_i⁻¹(Ω_{i,k}) is equal to f_i⁻¹(Ω_i) with Ω_i = ⋂_{k∈K} Ω_{i,k}. So T′_X = S′. Therefore T_X is a topology on X by Theorem 14.8.8.

14.10.9 Definition: The weak topology on a set X generated by a non-empty family of maps (f_i)_{i∈I} and topological spaces (X_i, T_i)_{i∈I} is the set T_X = { ⋃_{i∈I} f_i⁻¹(Ω_i); ∀i ∈ I, Ω_i ∈ T_i }.

14.10.10 Remark: Theorem 14.10.8 shows that the set T_X in Definition 14.10.9 is a topology on X. It is also true that any topology on X which contains all of the sets f_i⁻¹(Ω_i) must include the weak topology on X.

[ Should also define the strong topology on the range of a function or family of functions. This would be useful for Definition 24.4.3. ]

[ Do a version of Theorem 14.10.8 for f_i partially defined on X and X = ⋃_i Dom(f_i). ]

14.10.11 Remark: Theorem 14.10.15 is referred to in Remark 28.8.10. It is applicable to the construction of a topology on a set from an atlas.

14.10.12 Definition: An open cover of a subset S of a topological space X −< (X, T) is a set C ⊆ T such that S ⊆ ⋃C. A finite open cover of a set S is an open cover C of S such that #(C) < ∞.

14.10.13 Definition: The relative topology on a subset S of a topological space (X, T) is the set of intersections of sets of T with S. Thus Top(S) = {G ∩ S; G ∈ T}.

14.10.14 Remark: The fact that {G ∩ S; G ∈ T} is a topology for S in Definition 14.10.13 follows easily from the distributivity of set intersection with respect to set union.

14.10.15 Theorem: Let (X, T) be a topological space and let C be an open cover of X. Then for any set S ⊆ X, S ∈ T if and only if Ω ∩ S ∈ T for all Ω ∈ C. Hence T = ⋃_{Ω∈C} T_Ω, where T_Ω denotes the relative topology on Ω.

Proof: Suppose S ∈ T. Then Ω ∩ S ∈ T for all Ω ∈ C by the closure of T under finite intersection. Suppose Ω ∩ S ∈ T for all Ω ∈ C. Then S = X ∩ S = (⋃C) ∩ S = ⋃_{Ω∈C} (Ω ∩ S) ∈ T by the closure of T under union. Since Ω ∩ S ∈ T if and only if Ω ∩ S ∈ T_Ω, it follows that T = ⋃_{Ω∈C} T_Ω.
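The first assertion of Theorem 14.10.15, that openness can be tested piecewise on an open cover, can be checked exhaustively on a small example, along with the relative topology of Definition 14.10.13. The topology, cover and function names below are hypothetical choices for illustration.

```python
from itertools import combinations


def powerset(X):
    """All subsets of the finite set X as frozensets."""
    items = list(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def relative_topology(S, T):
    """Top(S) = {G ∩ S : G in T}."""
    return {G & S for G in T}


def is_open_cover(C, S, T):
    """C consists of open sets whose union includes S."""
    return all(G in T for G in C) and S <= frozenset().union(*C)


def open_on_each_piece(S, C, T):
    """The criterion of the theorem: omega ∩ S is open for every omega in C."""
    return all((omega & S) in T for omega in C)


fs = frozenset
X = fs({0, 1, 2, 3})
T = {fs(), fs({0}), fs({0, 1}), fs({2, 3}), fs({0, 2, 3}), X}
C = [fs({0, 1}), fs({2, 3})]

agrees = all((S in T) == open_on_each_piece(S, C, T) for S in powerset(X))
print(is_open_cover(C, X, T), agrees, sorted(sorted(G) for G in relative_topology(fs({0, 1}), T)))
```

The check runs over all sixteen subsets of X, so it confirms both directions of the equivalence for this particular cover.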
14.11. Continuous functions

14.11.1 Remark: According to Bynum et alia [191], page 14, continuity of functions was first defined by Cauchy in 1821. Definition 14.11.2 is the standard modern definition for continuous functions.

14.11.2 Definition: A continuous function from a topological space X to a topological space Y is a function f : X → Y such that ∀Ω ∈ Top(Y), f⁻¹(Ω) ∈ Top(X).

14.11.3 Theorem: All constant functions are continuous.

Proof: Let f : X → Y for topological spaces X and Y. Then f is constant if and only if ∃a ∈ Y, ∀x ∈ X, f(x) = a. Let a ∈ Y satisfy ∀x ∈ X, f(x) = a. Let Ω ∈ Top(Y). Then either a ∈ Ω or a ∉ Ω. So either f⁻¹(Ω) = X or f⁻¹(Ω) = ∅. In either case, f⁻¹(Ω) ∈ Top(X) by Definition 14.2.3 (i).

14.11.4 Theorem: Let X be a set, Y be a topological space, and f : X → Y. Then {f⁻¹(Ω); Ω ∈ Top(Y)} is a topology on X.

Proof: Let T = {f⁻¹(Ω); Ω ∈ Top(Y)} for a function f : X → Y for a topological space Y. Then {∅, X} ⊆ T because ∅ = f⁻¹(∅) and X = f⁻¹(Y), since {∅, Y} ⊆ Top(Y). So T satisfies Definition 14.2.3 (i) for a topology on X.

Let S₁, S₂ ∈ T. Then S₁ = f⁻¹(Ω₁) and S₂ = f⁻¹(Ω₂) for some Ω₁, Ω₂ ∈ Top(Y). But Ω₁ ∩ Ω₂ ∈ Top(Y) by Definition 14.2.3 (ii). So by Theorem 6.6.4 (iv), S₁ ∩ S₂ = f⁻¹(Ω₁) ∩ f⁻¹(Ω₂) = f⁻¹(Ω₁ ∩ Ω₂) ∈ T. So T satisfies Definition 14.2.3 (ii).

Let C ⊆ T. Then ∀S ∈ C, ∃Ω_S ∈ Top(Y), S = f⁻¹(Ω_S). But ⋃_{S∈C} Ω_S ∈ Top(Y) by Definition 14.2.3 (iii). So by Theorem 6.6.8 (iii), ⋃C = ⋃{f⁻¹(Ω_S); S ∈ C} = f⁻¹(⋃_{S∈C} Ω_S) ∈ T. So T satisfies Definition 14.2.3 (iii). Hence T is a topology on X.

14.11.5 Definition: The topology induced on a set X by a function f : X → Y from a topological space Y is the topology {f⁻¹(Ω); Ω ∈ Top(Y)} on X.

14.11.6 Remark: The inverse-image topology in Theorem 14.11.4 is given a name in Definition 14.11.5. This is quite possibly not a standard name for this topology.

If f : X → Y is a constant function, the topology induced by f on X is the trivial topology. Since the trivial topology on a set X is weaker than any other topology on X, it follows from Theorem 14.11.7 that all constant functions are continuous, as already stated in Theorem 14.11.3.

14.11.7 Theorem: For topological spaces X and Y, let f : X → Y. Then f is continuous if and only if the topology {f⁻¹(Ω); Ω ∈ Top(Y)} induced by f on X is weaker (i.e. not stronger) than Top(X).

14.11.8 Remark: Theorem 14.11.9 is a specialization of Theorem 14.11.11 to injective functions. However, for the purposes of proof discovery, it is helpful to first prove the injective version before the general version.

14.11.9 Theorem: For topological spaces X and Y, let f : X → Y be injective. Then the following statements are equivalent.
(1) f is continuous.
(2) ∀S ∈ IP(X), f(Int(S)) ⊇ Int(f(S)).
(3) ∀S ∈ IP(X), f(Ext(S)) ⊇ Int(f(X)) ∩ Ext(f(S)).
(4) ∀S ∈ IP(X), f(Bdy(S)) ⊆ Bdy(f(S)).

Proof: To show that part (1) implies part (2), let f : X → Y be an injective, continuous function, and let S ∈ IP(X). Then Int(f(S)) ⊆ f(S) by Theorem 14.4.9 (2). So f⁻¹(Int(f(S))) ⊆ f⁻¹(f(S)) by Theorem 6.6.4 (ii). But f⁻¹(f(S)) = S because f is injective. So f⁻¹(Int(f(S))) ⊆ S. Therefore Int(f⁻¹(Int(f(S)))) ⊆ Int(S) by Theorem 14.4.9 (10). But Int(f⁻¹(Int(f(S)))) = f⁻¹(Int(f(S))) since f⁻¹(Int(f(S))) ∈ Top(X) because f is continuous and Int(f(S)) ∈ Top(Y). Therefore f⁻¹(Int(f(S))) ⊆ Int(S). By Theorem 6.6.3 (ii), it then follows that f(f⁻¹(Int(f(S)))) ⊆ f(Int(S)). So Int(f(S)) ⊆ f(Int(S)) by Theorem 6.6.6 (i).
To show that part (2) implies part (1), let f : X → Y be injective and suppose that f(Int(S)) ⊇ Int(f(S)) for all S ∈ ℙ(X). Let S′ ∈ Top(Y). Then f⁻¹(S′) ∈ ℙ(X). So f(Int(f⁻¹(S′))) ⊇ Int(f(f⁻¹(S′))). But Int(f(f⁻¹(S′))) = Int(S′) by Theorem 6.6.6 (i), and Int(S′) = S′ because S′ ∈ Top(Y). So f(Int(f⁻¹(S′))) ⊇ S′. Therefore Int(f⁻¹(S′)) = f⁻¹(f(Int(f⁻¹(S′)))) ⊇ f⁻¹(S′) by the injectivity of f and Theorem 6.6.4 (ii). Therefore Int(f⁻¹(S′)) = f⁻¹(S′) by the ZF axiom of extension (Definition 5.1.26 (1)). So f⁻¹(S′) ∈ Top(X). Hence f is continuous.

To show that part (2) implies part (3), note that Int(f(X)) ∩ Ext(f(S)) ⊆ Int(f(X) ∩ (Y \ f(S))) by Theorem 14.5.7 (1) and Theorem 14.4.10 (8). This equals Int(f(X) \ f(S)) because f(X) ⊆ Y. This equals Int(f(X \ S)) because f is injective. Then by part (2), Int(f(X \ S)) ⊆ f(Int(X \ S)) = f(Ext(S)). So Int(f(X)) ∩ Ext(f(S)) ⊆ f(Ext(S)) as claimed. (To be continued . . . )

To show the equivalence of parts (2) and (4), note that Int(S) = S \ Bdy(S) by Theorem 14.5.7 (14). So f(Int(S)) = f(S) \ f(Bdy(S)). Therefore for a fixed set S ∈ ℙ(X) . . . (To be continued . . . )

14.11.10 Remark: To show that the injectivity condition in Theorem 14.11.9 cannot be discarded, consider Y = {a} with the discrete topology on Y. Then all subsets of Y are open. Therefore Int(A) = A and Bdy(A) = ∅ for all A ∈ ℙ(Y). By Theorem 14.11.3, all functions f : X → Y are continuous (because they are necessarily constant). Let X be a topological space which does not have the discrete topology. Then there is a set S ∈ ℙ(X) such that Int(S) ≠ S and Bdy(S) ≠ ∅. Let A = f(S). Then (To be continued . . . )

14.11.11 Theorem: For topological spaces X and Y, let f : X → Y. Then the following statements are equivalent.
(1) f is continuous.
(2) ∀S ∈ ℙ(Y), f(Int(f⁻¹(S))) ⊇ Int(S).
(3) ∀S ∈ ℙ(Y), f(Bdy(f⁻¹(S))) ⊆ Bdy(S).

Proof: To show that part (1) implies part (2), let f : X → Y be continuous and S ∈ ℙ(Y). Then Int(S) ⊆ S by Theorem 14.4.9 (2). So f⁻¹(Int(S)) ⊆ f⁻¹(S) by Theorem 6.6.4 (ii). Therefore Int(f⁻¹(Int(S))) ⊆ Int(f⁻¹(S)) by Theorem 14.4.9 (10). So f(Int(f⁻¹(Int(S)))) ⊆ f(Int(f⁻¹(S))) by Theorem 6.6.3 (ii). But f⁻¹(Int(S)) ∈ Top(X) because Int(S) ∈ Top(Y) and f is continuous. So Int(f⁻¹(Int(S))) = f⁻¹(Int(S)) by Theorem 14.4.10 (1). Therefore f(Int(f⁻¹(Int(S)))) = f(f⁻¹(Int(S))), which equals Int(S) by Theorem 6.6.6 (i). So Int(S) ⊆ f(Int(f⁻¹(S))) as claimed.

To show that part (2) implies part (1), suppose that f : X → Y satisfies f(Int(f⁻¹(S))) ⊇ Int(S) for all S ∈ ℙ(Y). Let Ω ∈ Top(Y). Then f(Int(f⁻¹(Ω))) ⊇ Int(Ω) = Ω. However, Int(f⁻¹(Ω)) ⊆ f⁻¹(Ω) by Theorem 14.4.9 (2). So f(Int(f⁻¹(Ω))) ⊆ f(f⁻¹(Ω)) = Ω by Theorem 6.6.3 (ii) and Theorem 6.6.6 (i). Therefore f(Int(f⁻¹(Ω))) = Ω by the ZF axiom of extension (Definition 5.1.26 (1)). . . . (To be continued . . . )

14.11.12 Notation: C(X, Y) denotes the set of continuous functions from X to Y, for topological spaces X and Y. C⁰(X, Y) is an alternative notation for C(X, Y) for topological spaces X and Y.

14.11.13 Remark: The alternative notation C⁰(X, Y) in Notation 14.11.12 is often used in contexts where the sets Cᵏ(X, Y) of k-times continuously differentiable functions from X to Y for k ∈ ℤ⁺₀ are also defined. Even if differentiability of functions from X to Y is not defined, the C⁰ notation is a convenient abbreviation for the word “continuous” because the notation “C” on its own is ambiguous, whereas “C⁰” strongly suggests the idea of continuity. It is often said that a “function is C⁰”, but rarely that a “function is C”. (See for example Notation 18.4.10 for Cᵏ function spaces.) Notation 14.11.14 defines the default range Y of C(X, Y) as the real numbers. In other words, C⁰(X) is defined to equal C⁰(X, ℝ).
14.11.14 Notation: C⁰(X) denotes the set of continuous functions from X to ℝ for a topological space X, using the standard topology on ℝ.

[ Near here, define the limit of a function at a point and define continuity in terms of this (as the limit everywhere equalling the value of the function). Then compare this definition with Definition 14.11.2. ]

[ Check that Definition 14.11.15 is correct. Prove that it is the same as more usual definitions. Maybe the closure operations are superfluous, or even wrong. Most importantly, check that Theorem 14.11.20 is correct. ]

14.11.15 Definition: The limit set of a function f at a point x, for topological spaces X and Y, function f : X → Y and x ∈ X, is the set

$$\bigcap_{\Omega \in \mathrm{Top}_x(X)} \overline{f(\Omega \setminus \{x\})} = \{ y \in Y ;\ \forall \Omega \in \mathrm{Top}_x(X),\ y \in \overline{f(\Omega \setminus \{x\})} \}.$$

The limit of a function f at a point x, for topological spaces X and Y, function f : X → Y and x ∈ X, is the unique element in the limit set of f at x if the limit set contains one and only one element.

14.11.16 Notation: lim_{z→x} f(z), for a function f : X → Y, for topological spaces X and Y with x ∈ X, denotes the limit of f at x if the limit is well defined.

14.11.17 Remark: Although Notation 14.11.16 is commonly used for the limit of a function f, it contains a superfluous variable z. Notation 14.11.18 is preferable, but is probably non-standard.

14.11.18 Notation: lim_x f, for a function f : X → Y, for topological spaces X and Y with x ∈ X, denotes the limit of f at x if the limit is well defined.
14.11.19 Remark: The variable z in Notation 14.11.16 is not superfluous when the function f is defined “inline”; in other words, when f(z) is replaced by an expression in which z is a dummy variable. For example, z is a dummy variable in the expression

lim_{z→p} |g(z) − g(p) − v(z − p)| / |z − p|.

The above expression implicitly defines a function f inline by

f : z ↦ |g(z) − g(p) − v(z − p)| / |z − p|.

Using Notation 14.11.16, it is not necessary to explicitly define the function f first before evaluating its limit.

14.11.20 Theorem: A function f : X → Y for topological spaces X and Y is continuous at x ∈ X if and only if the limit lim_x f of f at x is well defined and f(x) = lim_x f.

[ Must fix up Definition 14.11.21, which is based on the direct product topology of an arbitrary cross product which has not yet been defined! See EDM2 [34], 435.B, for the equivalence of pointwise convergence and the product topology. ]

14.11.21 Definition: The topology of pointwise convergence on the set C(X, Y) for topological spaces X −< (X, T_X) and Y −< (Y, T_Y) is the restriction to ℙ(C(X, Y)) of the direct product topology on Y^X.

14.11.22 Remark: See Definition 15.7.9 for the compact-open topology on C(X, Y).

14.11.23 Theorem: The weak topology T in Definition 14.10.9 is weaker than all other topologies on the set X for which all functions f_i : X → X_i are continuous.

[ See EDM2 [34], article 84, for more material on continuous functions. ]

14.11.24 Theorem: Let X be a topological space and let (f_i)_{i=1}^m be a finite sequence of continuous functions f_i : X → ℝ. Then the functions min_{i=1}^m f_i and max_{i=1}^m f_i are both continuous functions on X.
14.11.25 Remark: Sequences are a special kind of function whose domain is a totally ordered set. When the domain is the set of non-negative integers, the only limiting process that can occur is the limit as a point in the domain “tends to infinity”. Since such a sequence has no value at infinity, it is not possible to define continuity “at infinity”. However, limits do still make good sense, as they do for functions on the real numbers and other topological spaces.

14.11.26 Remark: The term “limit point” for sets has a slightly different meaning from a “limit point” of a sequence of points in a topological space. If all of the points of a sequence are the same, then the limit or limit point of that sequence is the constant value of that sequence, but that point is not a limit point of the range of (i.e. the set of points in) the sequence. But if all elements of a sequence are different, a limit point of the sequence is a limit point of the range of the sequence.

Sequences may be defined as functions (or families) whose domain is a totally ordered set. This set is not necessarily a subset of the integers. In Definition 14.11.27, it is clear that there can be no limit point of a finite sequence. The generality of the definition of a sequence permits functions of real numbers, for example, to be regarded as sequences which may have limit points in accordance with Definition 14.11.27.

14.11.27 Definition: A limit (point) of a sequence (x_i)_{i∈I} of points in a topological space X is a point y ∈ X such that ∀Ω ∈ Top_y(X), ∃n ∈ I, ∀i ≥ n, x_i ∈ Ω.
A convergent sequence in a topological space X is a sequence of points in X which has a limit point in X. A divergent sequence in a topological space X is a sequence of points in X which is not convergent in X.
14.12. Homeomorphisms

14.12.1 Definition: A homeomorphism between two topological spaces (X₁, T₁) and (X₂, T₂) is a bijection f : X₁ → X₂ such that both f and f⁻¹ are continuous. Two topological spaces (X₁, T₁) and (X₂, T₂) are said to be homeomorphic if there exists a homeomorphism f : X₁ → X₂. A (topological) automorphism on a topological space X −< (X, T_X) is a homeomorphism from X to X.
14.12.2 Notation: The notation X₁ ≈ X₂ means that the topological spaces X₁ and X₂ are homeomorphic with respect to topologies which are implicit in the context. The notation f : X₁ ≈ X₂ means that f is a homeomorphism between the topological spaces X₁ and X₂ with implicit topologies. Similarly, this notation may be used for explicit topologies; for example, (X₁, T₁) ≈ (X₂, T₂) and f : (X₁, T₁) ≈ (X₂, T₂).

14.12.3 Notation: Iso(X, Y) for topological spaces X −< (X, T_X) and Y −< (Y, T_Y) denotes the set of all topological isomorphisms from X to Y. Aut(X) for a topological space X −< (X, T_X) denotes the set of all topological automorphisms on X.

14.12.4 Remark: The notation “Iso” in Notation 14.12.3 seems to be non-standard. But it is a kind of space which is often used in differential geometry; so it should have its own notation. The space Aut(X) is the same as Iso(X, X). Spaces of morphisms apply to a very wide range of classes of structures. In a narrow context, there is no danger of confusion, but in differential geometry, isomorphisms are often intermingled for different classes within a single context. Therefore a subscript may sometimes be used to distinguish structure classes.

14.12.5 Remark: It follows from Definition 14.12.1 that the set map f : ℙ(X₁) → ℙ(X₂) of f is a bijection between the open sets in T₁ and T₂. In other words, there is a one-to-one correspondence between the topologies of the two sets. This implies that all topological properties of the two spaces are identical. The homeomorphism relation is clearly reflexive, symmetric and transitive, but the class of all topological spaces is not a set. So this is not an equivalence relation in the sense of a set of ordered pairs as in Definition 6.3.2.

14.12.6 Theorem: Let f : (X₁, T₁) ≈ (X₂, T₂) be a homeomorphism. Let S₁ be a subset of X₁ with the relative topology from T₁. Let S₂ = f(S₁) have the relative topology from T₂. Then f|_{S₁} : S₁ → S₂ is a homeomorphism.
Proof: Clearly the restriction of a bijection is always a bijection. Denote by T₁′ and T₂′ the relative topologies for S₁ and S₂ from (X₁, T₁) and (X₂, T₂) respectively. To show the continuity of f|_{S₁}, note that if G₂′ ∈ T₂′, then G₂′ = G₂ ∩ S₂ for some G₂ ∈ T₂. Since f is continuous, f⁻¹(G₂) ∈ T₁. It follows that (f|_{S₁})⁻¹(G₂′) = f⁻¹(G₂) ∩ S₁ ∈ T₁′. So f|_{S₁} is continuous with respect to the relative topologies. The continuity of the inverse follows symmetrically. Therefore f|_{S₁} is a homeomorphism with respect to the relative topologies on S₁ and f(S₁). That is, f|_{S₁} : (S₁, T₁′) ≈ (S₂, T₂′).
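The restriction argument can be tried out on a small finite example. The following Python sketch assumes finite topologies stored as sets of frozensets and functions stored as dicts. The helper names and the two three-point spaces are hypothetical and serve only as an illustration of Definition 14.12.1 and Theorem 14.12.6.

```python
def preimage(f, subset):
    """Return f^{-1}(subset) for a function f given as a dict."""
    return frozenset(x for x in f if f[x] in subset)


def is_continuous(f, top_dom, top_cod):
    """Every open set of the codomain has an open preimage in the domain."""
    return all(preimage(f, g) in top_dom for g in top_cod)


def is_homeomorphism(f, X1, top1, X2, top2):
    """Definition 14.12.1: a bijection f with f and its inverse continuous."""
    values = list(f.values())
    if set(f) != set(X1) or set(values) != set(X2) or len(set(values)) != len(values):
        return False
    inverse = {y: x for x, y in f.items()}
    return is_continuous(f, top1, top2) and is_continuous(inverse, top2, top1)


def relative_topology(top, S):
    """Relative topology on S: intersections of the open sets with S."""
    return {g & S for g in top}


# Hypothetical example spaces and homeomorphism.
X1 = frozenset({1, 2, 3})
T1 = {frozenset(), frozenset({1}), frozenset({1, 2}), X1}
X2 = frozenset({"a", "b", "c"})
T2 = {frozenset(), frozenset({"a"}), frozenset({"a", "b"}), X2}
f = {1: "a", 2: "b", 3: "c"}

# Restrict f to S1 and compare the relative topologies on S1 and f(S1).
S1 = frozenset({2, 3})
S2 = frozenset(f[x] for x in S1)
f_S1 = {x: f[x] for x in S1}
T1_rel = relative_topology(T1, S1)
T2_rel = relative_topology(T2, S2)

print(is_homeomorphism(f, X1, T1, X2, T2))
print(is_homeomorphism(f_S1, S1, T1_rel, S2, T2_rel))
```

A bijection which does not respect the open sets, such as the map sending 1 to "b" and 2 to "a", fails the same check.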
[ The composite of two homeomorphisms is a homeomorphism. Use general composition of relations for this. ]
Chapter 15 Topology classes and constructions
15.1 Product and quotient topologies
15.2 Separation classes
15.3 Separation and disconnection of sets
15.4 Connectivity classes
15.5 Definition of continuity of functions using connectivity
15.6 Open bases, countability classes and separability
15.7 Compactness classes
15.8 Topological properties of real number intervals
15.9 Topological dimension
15.10 Set union topology
15.11 Topological identification spaces
15.1. Product and quotient topologies

15.1.1 Definition: The (direct) product topology for two topological spaces (X₁, T₁) and (X₂, T₂) is the topology T for the set product X₁ × X₂ which is the set of all unions of sets of the form G₁ × G₂ such that G₁ ∈ T₁ and G₂ ∈ T₂. That is,

T = { ⋃A ; A ⊆ {G₁ × G₂ ; G₁ ∈ T₁ and G₂ ∈ T₂} }.
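For two finite spaces, the set T in Definition 15.1.1 can be computed directly by closing the set of open rectangles G₁ × G₂ under unions. The following Python sketch assumes finite topologies stored as sets of frozensets. The function names and the two-point example topology are illustrative assumptions, not part of the text.

```python
from itertools import product


def union_closure(sets):
    """Close a finite family of sets under unions (the empty union gives the empty set)."""
    closed = set(sets) | {frozenset()}
    frontier = list(closed)
    while frontier:
        a = frontier.pop()
        for b in list(closed):
            u = a | b
            if u not in closed:
                closed.add(u)
                frontier.append(u)
    return closed


def product_topology(top1, top2):
    """Definition 15.1.1: all unions of rectangles G1 x G2 with G1, G2 open."""
    rectangles = {frozenset(product(g1, g2)) for g1 in top1 for g2 in top2}
    return union_closure(rectangles)


# Hypothetical two-point space with open sets {}, {1} and {0, 1}.
S = frozenset({0, 1})
top_S = {frozenset(), frozenset({1}), S}

top_SxS = product_topology(top_S, top_S)
L_shape = frozenset({(1, 0), (0, 1), (1, 1)})

print(len(top_SxS))
print(L_shape in top_SxS)
```

The set `L_shape` is open in the product although it is not itself a rectangle G₁ × G₂. This is why the definition takes unions of rectangles rather than the rectangles alone.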
15.1.2 Remark: The product topology T in Definition 15.1.1 is the weakest topology for which the projection maps f₁ : X₁ × X₂ → X₁ and f₂ : X₁ × X₂ → X₂ are continuous. The fact that the set T is a topology in both Definitions 15.1.1 and 15.1.4 follows from Theorem 14.10.8. (See Definition 6.9.8 for projection maps. See Definition 14.11.2 for continuity.)

15.1.3 Theorem: Let X, Y be topological spaces. Let f : X → Y be a continuous function. Then the map g : X → X × Y defined by g(x) = (x, f(x)) is continuous.

Proof: Let Ω ∈ Top(X × Y). It must be shown that g⁻¹(Ω) ∈ Top(X). Let x ∈ g⁻¹(Ω). Then (x, f(x)) ∈ Ω. So G₁ × G₂ ⊆ Ω for some G₁ ∈ Top_x(X) and G₂ ∈ Top_{f(x)}(Y) by Definition 15.1.1. Since f is continuous, f⁻¹(G₂) ∈ Top_x(X). So G = G₁ ∩ f⁻¹(G₂) ∈ Top_x(X), and g(G) ⊆ G₁ × G₂ ⊆ Ω. Therefore x ∈ G ⊆ g⁻¹(Ω). It follows that x is in the interior of g⁻¹(Ω). Hence g⁻¹(Ω) ∈ Top(X).

[ Must define product topology for arbitrary Cartesian products ×_{i∈I} X_i. ]

15.1.4 Definition: The (direct) product topology for the set product X = ×_{i∈I} X_i of a family of topological spaces (X_i, T_i)_{i∈I} is the set of all unions of sets ×_{i∈I} Ω_i ∈ ×_{i∈I} T_i such that #{i ∈ I; Ω_i ≠ X_i} < ∞. In other words, the (direct) product topology on X is the weak topology on X generated by the projection maps π_i : X → X_i of X onto the sets X_i as in Definition 14.10.9. That is, it is the set T defined by

T′ = { ×_{i∈I} Ω_i ∈ ×_{i∈I} T_i ; #{i ∈ I; Ω_i ≠ X_i} < ∞ }

and

T = { ⋃S ; S ⊆ T′ }.

[ The terms “product topology”, “direct product topology” may be synonymous. See EDM2 [34], 425.K. ]
15.1.5 Remark: There are some Axiom of Choice issues with defining direct products of arbitrary sets of topological spaces. There are even more significant AC issues with theorems about arbitrary products of topological spaces, in particular for compactness classes. In the unusual (or even pathological) case where the product of a non-empty family of non-empty topologies turns out to be empty, many claims about the properties of the product topology would be true simply because it is empty.

Definition 15.1.4 does not seem to have any AC issues. The set ×_{i∈I} T_i is guaranteed non-empty because it contains at least the function f : I → ⋃_{i∈I} T_i defined by f : i ↦ ∅ for i ∈ I. (Clearly the function g : I → ⋃_{i∈I} T_i defined by g : i ↦ X_i for i ∈ I is equally well-defined.) The set products ×_{i∈I} Ω_i ∈ ×_{i∈I} T_i are all well-defined when only a finite number of the Ω_i are not equal to X_i. There is no difficulty in “choosing” such sets of sets. Therefore the product topology has an abundance of open sets to work with. An AC problem certainly would have arisen if the set {Ω_i; Ω_i ∉ {∅, X_i}} was required to be always infinite.

15.1.6 Definition: The quotient topology on a set Y induced by a surjective map f : X → Y, where (X, T_X) is a topological space, is the set T_Y = {G ⊆ Y; f⁻¹(G) ∈ T_X}.
15.1.7 Remark: The quotient topology on Y is the strongest topology for which f is continuous. (See Definition 14.11.2 for continuity.) The fact that f is continuous with respect to T_X and T_Y follows directly from the definition of continuity. If any more open sets were added to T_Y, then the preimage of an added set would not be in T_X, and so f would no longer be continuous.
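The quotient topology of Definition 15.1.6 can be computed by brute force when X and Y are finite. The following Python sketch assumes finite topologies stored as sets of frozensets and a surjection stored as a dict. The names `powerset`, `preimage` and `quotient_topology`, and the example map, are hypothetical.

```python
from itertools import combinations


def powerset(Y):
    """All subsets of a finite set Y, as frozensets."""
    items = list(Y)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def preimage(f, subset):
    """Return f^{-1}(subset) for a function f given as a dict."""
    return frozenset(x for x in f if f[x] in subset)


def quotient_topology(f, top_X, Y):
    """Definition 15.1.6: the subsets of Y whose preimage under f is open in X."""
    return {G for G in powerset(Y) if preimage(f, G) in top_X}


# Hypothetical example: X = {1, 2, 3} with the points 2 and 3 identified.
X = frozenset({1, 2, 3})
top_X = {frozenset(), frozenset({1}), frozenset({1, 2}), X}
Y = frozenset({"p", "q"})
f = {1: "p", 2: "q", 3: "q"}

top_Y = quotient_topology(f, top_X, Y)
print(top_Y == {frozenset(), frozenset({"p"}), Y})

# The only subset of Y missing from top_Y is {"q"}. Its preimage {2, 3} is not
# open in X, so adding it to top_Y would make f discontinuous.
print(preimage(f, frozenset({"q"})) in top_X)
```

By construction, every set in `top_Y` has an open preimage, and every subset of Y outside `top_Y` does not, which is the sense in which the quotient topology is the strongest topology for which f is continuous.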
15.1.8 Definition: The quotient topology for the quotient set Y = X/R is the quotient topology on the set Y induced by the quotient projection f : X → Y.

15.1.9 Theorem: Let X, A and B be topological spaces. Suppose f : X → A and g : X → B are functions such that f × g : X → A × B is a homeomorphism for the standard set-product topology on A × B. Then for all a ∈ A, the restricted map (f × g)|_{f⁻¹({a})} : f⁻¹({a}) → ({a} × B) is a homeomorphism with respect to the relative topologies on f⁻¹({a}) and {a} × B, and f⁻¹({a}) ≈ B. (See Figure 15.1.1. See Definition 6.9.12 for the pointwise direct product function f × g. See Definition 14.10.13 for relative topology.)

Figure 15.1.1: Homeomorphism for restriction of a product map (f × g : X ≈ A × B, showing the fibre f⁻¹({a}) ⊆ X and the slice {a} × B ⊆ A × B).

Proof: It follows from Theorem 14.12.6 that for any a ∈ A, the map (f × g)|_{f⁻¹({a})} : f⁻¹({a}) → {a} × B is a homeomorphism with respect to the relative topologies on f⁻¹({a}) ⊆ X and ({a} × B) ⊆ (A × B). Now define h : {a} × B → B by h(a, b) = b for all b ∈ B. Then h is clearly a bijection, and h and h⁻¹ are continuous. To show that h is continuous, note that if G ∈ Top(B) then h⁻¹(G) = {a} × G ∈ Top({a} × B). To show that h⁻¹ is continuous, let H ∈ Top({a} × B) and note that H = {a} × G for some G ∈ Top(B) and therefore h(H) = G ∈ Top(B). So h is a homeomorphism and g|_{f⁻¹({a})} = h ∘ ((f × g)|_{f⁻¹({a})}) : f⁻¹({a}) ≈ B.

[ Should simplify and shorten the proof of Theorem 15.1.9. Should provide a diagram for clarification. ]
15.2. Separation classes

15.2.1 Remark: Very confusingly, “separation” is not closely related to “separability”. A separation property of a topological space tells you how easy it is to disconnect subsets of the space from each other. (Connectivity is defined in Section 15.4.) Two subsets of a space are separated by covering them with a pair of disjoint open sets. In the case of T₀ and T₁ spaces, only a single open set is involved.

15.2.2 Remark: The most fundamental question one can ask about a topology T on a set X is whether it at least has a set Ω ∈ T for each pair of points x, y ∈ X such that x ∈ Ω and y ∉ Ω. If such a set Ω does not exist for a point pair, the pair of points might as well be considered to be a single point. If x and y are always in the same open set or outside it, the topology has no capability at all to separate the points. Therefore it is not surprising that this fundamental property of separation is called the T₁ property. On the other hand, the T₀ property is weaker than the T₁ property, and it does have some applications. The T₀ property is a kind of semi-T₁ property which guarantees only that one point of each pair of points has a neighbourhood which excludes the other point.

15.2.3 Definition: A T₀ (topological) space is a topological space X such that

∀x₁ ∈ X, ∀x₂ ∈ X \ {x₁}, (∃Ω₁ ∈ Top_{x₁}(X), x₂ ∉ Ω₁) ∨ (∃Ω₂ ∈ Top_{x₂}(X), x₁ ∉ Ω₂).

15.2.4 Definition: A T₁ (topological) space is a topological space X such that

∀x₁ ∈ X, ∀x₂ ∈ X \ {x₁}, ∃Ω₁ ∈ Top_{x₁}(X), x₂ ∉ Ω₁.
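The quantifier structure of Definitions 15.2.3 and 15.2.4 translates almost literally into code for finite spaces. The following Python sketch assumes finite topologies stored as sets of frozensets. The function names `is_T0` and `is_T1` are illustrative, and the four topologies listed are all of the topologies on a two-point set {1, 2}.

```python
def is_T0(X, top):
    """Definition 15.2.3: for each pair of distinct points, some open set
    contains exactly one of the two points."""
    return all(
        any((x in g) != (y in g) for g in top)
        for x in X for y in X if x != y
    )


def is_T1(X, top):
    """Definition 15.2.4: for each ordered pair of distinct points (x, y),
    some open set contains x but not y."""
    return all(
        any(x in g and y not in g for g in top)
        for x in X for y in X if x != y
    )


X = frozenset({1, 2})
topologies = {
    "a": {frozenset(), X},
    "b": {frozenset(), frozenset({1}), X},
    "c": {frozenset(), frozenset({2}), X},
    "d": {frozenset(), frozenset({1}), frozenset({2}), X},
}

for name, top in topologies.items():
    print(name, is_T0(X, top), is_T1(X, top))
# a False False
# b True False
# c True False
# d True True
```

The only difference between the two functions is that `is_T0` accepts an open set which separates the pair in either direction, whereas `is_T1` requires a separating open set in each direction.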
15.2.5 Remark: The T₁ property means that for every pair of distinct points x₁ and x₂, there is an open set Ω₁ which contains x₁ but does not contain x₂. Since x₁ and x₂ may be swapped, this implies that ∃Ω₂ ∈ Top_{x₂}(X), x₁ ∉ Ω₂ also. Therefore

∀x₁ ∈ X, ∀x₂ ∈ X \ {x₁}, (∃Ω₁ ∈ Top_{x₁}(X), x₂ ∉ Ω₁) ∧ (∃Ω₂ ∈ Top_{x₂}(X), x₁ ∉ Ω₂).

The T₀ property in Definition 15.2.3 is weaker than this because it guarantees only the disjunction of these two propositions rather than the conjunction. A topological space which is not T₀ has two (or more) distinct points which are always both either inside or outside any given open set. In other words, a topological space X is non-T₀ if and only if

∃x₁, x₂ ∈ X, (x₁ ≠ x₂ ∧ ∀Ω ∈ Top(X), (x₁ ∈ Ω ⇔ x₂ ∈ Ω)).

15.2.6 Remark: It follows from the T₁ property in Definition 15.2.4 that there exist open sets Ω₁ and Ω₂ such that x₁ ∈ Ω₁ \ Ω₂ and x₂ ∈ Ω₂ \ Ω₁, but it does not follow that the sets Ω₁ and Ω₂ can be chosen such that Ω₁ ∩ Ω₂ = ∅. (This is illustrated in Figure 15.2.1.) A useful way of thinking of the T₁ separation property is that it guarantees that all single-point sets are closed.

Figure 15.2.1: T₁ separation does not require disjoint covering pairs (left panel: x₁ ∈ Ω₁, x₂ ∉ Ω₁; right panel: x₁ ∈ Ω₁, x₁ ∉ Ω₂, x₂ ∈ Ω₂, x₂ ∉ Ω₁).
15.2.7 Remark: The two-point topologies in Example 14.3.5 have the following T₀ and T₁ properties.

     topology               T₀     T₁
  a  {∅, X}
  b  {∅, {1}, X}            yes
  c  {∅, {2}, X}            yes
  d  {∅, {1}, {2}, X}       yes    yes

The three-point topologies in Example 14.3.6 have the following T₀ and T₁ properties.

     topology               T₀     T₁
  a  0
  b  1
  c  12
  d  1, 12                  yes
  e  1, 2, 12               yes
  f  2, 12, 23              yes
  g  1, 2, 12, 23           yes
  h  1, 2, 3, 12, 13, 23    yes    yes
As mentioned in Remark 14.7.10, the only T₁ topology on a finite set is the discrete topology.

15.2.8 Theorem: Let X be a T₁ topological space. Then the following propositions are true.
(1) ∀x ∈ X, $\{x\} \in \overline{\mathrm{Top}}(X)$.
(2) ∀x ∈ X, $\overline{\{x\}} = \{x\}$.
(3) ∀x ∈ X, Ext({x}) = X \ {x}.

Proof: For part (1), let x ∈ X. Define C = {Ω ∈ Top(X); x ∉ Ω}. Then ∀y ∈ X \ {x}, ∃Ω ∈ C, y ∈ Ω because X has the T₁ property. Therefore ∀y ∈ X \ {x}, y ∈ ⋃C, by the definition of ⋃C. So X \ {x} ⊆ ⋃C. But x ∉ ⋃C. So X \ {x} = ⋃C. But ⋃C ∈ Top(X) because C is a set of open sets. Therefore $\{x\} = X \setminus \bigcup C \in \overline{\mathrm{Top}}(X)$. Part (2) follows from part (1) and Theorem 14.4.10 (2). Part (3) follows from part (2) and Theorem 14.5.7 (3).

15.2.9 Theorem: A topological space X has the T₁ property if and only if {x} is closed in X for all x ∈ X.

Proof: It follows from Theorem 15.2.8 (1) that {x} is closed in X for all x ∈ X if X is a T₁ space. So suppose that X is a topological space such that {x} is closed in X for all x ∈ X. Then Ω_x = X \ {x} is open for all x ∈ X. So y ∈ Ω_x for any y ∈ X \ {x}, but x ∉ Ω_x. This implies the T₁ property for X.

15.2.10 Theorem: Let X be a finite T₁ topological space. Then Top(X) = ℙ(X).

Proof: See Exercise 47.7.4.

15.2.11 Remark: Theorem 15.2.10 means that if a finite set has the T₁ property, then all single-point sets are open. In the case of countably infinite sets, this is not true. This is demonstrated by Example 14.7.3. Trivial T₁ topologies are mentioned in Definition 14.7.6 and Remark 14.7.7.

15.2.12 Remark: The existence of a disjoint covering pair is guaranteed by the T₂ separation property, which is also known as “Hausdorff separation”. This is given by Definition 15.2.13 and illustrated in Figure 15.2.2.

15.2.13 Definition: A Hausdorff space is a topological space X such that

∀x₁, x₂ ∈ X, x₁ ≠ x₂ ⇒ ∃Ω₁ ∈ Top_{x₁}(X), ∃Ω₂ ∈ Top_{x₂}(X), Ω₁ ∩ Ω₂ = ∅.    (15.2.1)

A Hausdorff space is also known as a T₂ space.
Figure 15.2.2: Hausdorff (T₂) space requires disjoint covering for a pair of points (x₁ ∈ Ω₁, x₂ ∈ Ω₂, Ω₁ ∩ Ω₂ = ∅).
15.2.14 Remark: The Hausdorff property (15.2.1) in Definition 15.2.13 means that every pair of distinct points has a pair of disjoint neighbourhoods. A useful way of thinking of the Hausdorff property is that it guarantees that every two-point set is disconnected in the sense of Definition 15.4.3.

15.2.15 Remark: For any infinite set X, the topology defined on X in Example 14.7.5 is T₁ but not Hausdorff.

15.2.16 Remark: Non-Hausdorff topological spaces are of very limited use in differential geometry. (In fact, they’re useless for most applications.) When topological manifolds are defined in Section 26.3, non-Hausdorff topologies are explicitly excluded. Example 43.3.2 is a non-trivial non-Hausdorff space which gives a hint of why one would want to exclude them. They’re just more bother than they’re worth. Differential geometry is bothersome enough already. Theorem 15.2.17 is an immediate application of the Hausdorff class.

15.2.17 Theorem: Let f : X → Y be a continuous function, where X is a topological space and Y is a Hausdorff space. Then the graph of f is a closed subset of the product topological space X × Y.

Proof: See Exercise 47.7.9.
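The role of the Hausdorff hypothesis in Theorem 15.2.17 can be seen on a two-point set. The following Python sketch assumes finite topologies stored as sets of frozensets, and it computes the product topology by closing the open rectangles under unions. All function names and the two example topologies are illustrative assumptions. The identity map is used as the continuous function in both cases.

```python
from itertools import product


def is_hausdorff(X, top):
    """Condition (15.2.1): distinct points have disjoint open neighbourhoods."""
    return all(
        any(x in g1 and y in g2 and not (g1 & g2) for g1 in top for g2 in top)
        for x in X for y in X if x != y
    )


def product_topology(top1, top2):
    """All unions of open rectangles G1 x G2 (finite case)."""
    closed = {frozenset(product(g1, g2)) for g1 in top1 for g2 in top2}
    closed.add(frozenset())
    frontier = list(closed)
    while frontier:
        a = frontier.pop()
        for b in list(closed):
            if (a | b) not in closed:
                closed.add(a | b)
                frontier.append(a | b)
    return closed


def graph_is_closed(f, X, top_X, Y, top_Y):
    """The graph of f is closed if its complement in X x Y is open."""
    graph = frozenset((x, f[x]) for x in X)
    complement = frozenset(product(X, Y)) - graph
    return complement in product_topology(top_X, top_Y)


Y = frozenset({0, 1})
non_hausdorff = {frozenset(), frozenset({1}), Y}               # T0 but not T1
discrete = {frozenset(), frozenset({0}), frozenset({1}), Y}    # Hausdorff
identity = {0: 0, 1: 1}  # continuous from (Y, top) to (Y, top) for any top

print(is_hausdorff(Y, discrete), graph_is_closed(identity, Y, discrete, Y, discrete))
print(is_hausdorff(Y, non_hausdorff),
      graph_is_closed(identity, Y, non_hausdorff, Y, non_hausdorff))
```

With the discrete topology the graph of the identity map (the diagonal) is closed, as the theorem asserts. With the non-Hausdorff topology the complement of the diagonal is not open in the product, so the conclusion of the theorem fails.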
15.2.18 Definition: A completely regular (topological) space is a T₁ topological space X such that

$$\forall F \in \overline{\mathrm{Top}}(X),\ \forall x \in X \setminus F,\ \exists f \in C(X, [0,1]),\ f(x) = 0 \text{ and } \forall y \in F,\ f(y) = 1.$$

15.2.19 Remark: Definitions 15.2.18 and 15.2.20 use Notation 14.2.15 for the set $\overline{\mathrm{Top}}(X)$ of closed subsets of X. Definition 15.2.18 is illustrated in Figure 15.2.3.

Figure 15.2.3: Completely regular space, existence of continuous function (level sets f(y) = 0 at the point x, f(y) = 0.3, f(y) = 0.7, and f(y) = 1 on the closed set F).

For any finite set of points x₁, . . . x_m ∈ X \ F in Definition 15.2.18, suitable continuous functions f₁, . . . f_m are guaranteed to exist if the topological space is completely regular. So one may construct f : X → [0, 1] with f(y) = min_{i=1}^m f_i(y) for all y ∈ X. Then f is continuous on X and has the value 1 on F and the value 0 on all points x₁, . . . x_m.

15.2.20 Definition: A T₄ (topological) space is a topological space X such that for every disjoint pair of closed sets K₁ and K₂ in X, there are disjoint open sets Ω₁ and Ω₂ such that K₁ ⊆ Ω₁ and K₂ ⊆ Ω₂. In other words,

$$\forall K_1, K_2 \in \overline{\mathrm{Top}}(X),\ K_1 \cap K_2 = \emptyset \Rightarrow \exists \Omega_1, \Omega_2 \in \mathrm{Top}(X),\ (K_1 \subseteq \Omega_1 \text{ and } K_2 \subseteq \Omega_2 \text{ and } \Omega_1 \cap \Omega_2 = \emptyset).$$

15.2.21 Definition: A normal (topological) space is a topological space which is both T₁ and T₄.
15.2.22 Remark: Definition 15.2.20 is illustrated in Figure 15.2.4. A T4 space has the property that every disjoint pair of non-empty closed sets is disconnected. (See Theorem 15.4.11.)
Figure 15.2.4: T₄ space requires disjoint covering for a pair of closed sets (K₁ ⊆ Ω₁, K₂ ⊆ Ω₂, Ω₁ ∩ Ω₂ = ∅).
15.2.23 Remark: The T₄ space condition is not very different from the Hausdorff condition. This can be made clearer by extending the notation Top_x(X) = {Ω ∈ Top(X); x ∈ Ω} for points x ∈ X to a notation Top_S(X) = {Ω ∈ Top(X); S ⊆ Ω} for sets S ⊆ X. With this notation, the T₄ condition becomes

$$\forall K_1, K_2 \in \overline{\mathrm{Top}}(X),\ K_1 \cap K_2 = \emptyset \Rightarrow \exists \Omega_1 \in \mathrm{Top}_{K_1}(X),\ \exists \Omega_2 \in \mathrm{Top}_{K_2}(X),\ \Omega_1 \cap \Omega_2 = \emptyset.$$

This is similar to the Hausdorff space condition (15.2.1).

[ Check if the axiom of choice is required to prove any of the relations between separation classes in Figure 15.2.5. If so, try to provide AC-free versions of the affected relations. See EDM2 [34], 425.Q, pages 1612–1613, for numerous statements of theorems regarding the separation classes. ]

15.2.24 Remark: Relations between topological space separation classes are illustrated in Figure 15.2.5.
Figure 15.2.5: Family tree for separation classes of topological spaces. The classes shown are: T₋₁ topological space; T₀ Kolmogorov space; T₁ Kuratowski space; T₂ Hausdorff space; T₃ Vietoris axiom and T₀+T₃ regular space; T₃½ Tikhonov axiom and T₁+T₃½ completely regular space; T₄ Tietze’s first axiom and T₁+T₄ normal space; T₅ Tietze’s second axiom and T₁+T₅ completely normal space; T₆ Vedenisov axiom and T₁+T₆ perfectly normal space; metrizable space.
15.3. Separation and disconnection of sets
[ For every implication which is not indicated in Figure 15.2.5, give a counterexample to show why the implication is not generally valid. ]

[ According to EDM2 [34], 425.Q, page 1612, a T5 space is defined by the property that ∀A, B ∈ ℙ(X), (A ∩ B̄ = ∅ ∧ Ā ∩ B = ∅) ⇒ (∃Ω1, Ω2 ∈ Top(X), (A ⊆ Ω1 ∧ B ⊆ Ω2 ∧ Ω1 ∩ Ω2 = ∅)). In other words, if A ⊆ Ext(B) and B ⊆ Ext(A), then A and B are disconnected. I thought that this could be proved in a normal space. Check this. ]

15.2.25 Theorem: Let T1 and T2 be topologies on a set X such that T1 ⊆ T2. Then:
(1) If (X, T1) is T0 then (X, T2) is T0.
(2) If (X, T1) is T1 then (X, T2) is T1.

[ Add assertions for many more separation classes in Theorem 15.2.25. Also do such monotonicity results with respect to topology strength for other kinds of topology classes. ]

15.2.26 Remark: The Tietze extension theorem is useful when one wishes to specify the value of a function on a subset of a topological space but the value elsewhere doesn’t matter.

15.2.27 Theorem: [ Tietze extension theorem - see Simmons [139], section 28, page 135. The proof of this theorem does not seem to require the axiom of choice, but if it does, modify it to remove this dependence. ]

15.2.28 Remark: It is notable that a locally Euclidean space is not necessarily Hausdorff, although all Euclidean topological spaces are clearly Hausdorff. This is explained in Remark 26.3.2. (See Definition 26.2.1 for Euclidean topological spaces, Definition 26.2.3 for locally Euclidean spaces.)
15.3.1 Remark: An important role of a topology on a set X is to determine the boundary of each subset of X. The boundary of a set may be thought of as the separation barrier between its interior and exterior. As illustrated in Figure 14.5.1 (in Remark 14.5.9), the boundary Bdy(S) of a set S consists of elements of both the set S and the set X \ S. The topology determines which elements of S and X \ S belong to Bdy(S). The interior Int(S) and the exterior Ext(S) of a set S in a topological space X are both open sets in X. We can say, then, that these two open sets are separated by the boundary Bdy(S) of S. Since Int(S) = Ext(X \ S) and Ext(S) = Int(X \ S), it follows that Bdy(S) = Bdy(X \ S). So Bdy(S) symmetrically separates Ext(X \ S) from Int(X \ S). More generally, we can then say that any two open sets S1 and S2 are separated from each other if S1 ⊆ Ext(S2) and S2 ⊆ Ext(S1). (The conditions S1 ⊆ Ext(S2) and S2 ⊆ Ext(S1) are equivalent to S1 ∩ S̄2 = ∅ and S2 ∩ S̄1 = ∅ respectively.) If these two conditions are satisfied, then either Bdy(S1) or Bdy(S2) may be chosen as the separating boundary between S1 and S2.
This naturally leads to the question of how a topology on X can separate two general subsets S1 and S2 of X from each other. An obvious idea is to define topological separation of S1 and S2 by the same pair of conditions S1 ⊆ Ext(S2) and S2 ⊆ Ext(S1). It turns out that this is equivalent to the standard definition of the “disconnection” of a pair of sets in a normal topological space, but it is not the same as the standard definition in a non-normal topological space. (See Definition 15.2.21 for normal topological spaces.) If S1 is entirely included in the exterior of S2, the boundary of S2 separates S1 from the interior of S2. If S2 is entirely included in the exterior of S1, the boundary of S1 separates S2 from the interior of S1. In other words, neither set strays into the boundary of the other. So it seems that these conditions are sufficient to guarantee some intuitive idea of separation between two sets. To check if the conditions are necessary, suppose there is a point x1 ∈ S1 ∩ S̄2. Then all neighbourhoods of x1 contain one or more elements of S2. This does not match the idea of x1 being separated from S2. (To be continued . . . )

15.3.2 Remark: The words “disconnected” and “connected” are used in topology to mean, respectively, that two sets are or are not separated. However, the word “connection” is used in the theory of parallelism to denote differential parallel transport of vectors from one place to another in a differentiable manifold.
(See for example Chapters 36 and 37.) The use of the word “connection” for parallelism is unfortunate. (See Remark 37.1.3.) A better term would have been “differential parallelism” or just “parallelism”. But this terminology is unlikely to change. On the other hand, the word “separation” already has two different meanings in general topology. (See Sections 15.6 and 15.2.) So the words “connected” and “connectivity” are probably the best choices here. But they must be carefully distinguished from the differential manifold concept of a “connection”.

15.3.3 Definition: Non-empty sets S1, S2 are said to be separated in a topological space X if S1 ∩ S̄2 = ∅ and S2 ∩ S̄1 = ∅. Non-empty sets S1, S2 are said to be non-separated in a topological space X if S1 ∩ S̄2 ≠ ∅ or S2 ∩ S̄1 ≠ ∅.
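The closure-based test in Definition 15.3.3 can be tried out on a small finite topological space. The following Python sketch is only an illustration: the four-point space, its topology and the helper names (interior, exterior, closure, boundary, are_separated) are assumptions made for this example and do not come from the text. It uses Ext(S) = Int(X \ S), takes the closure of S to be the complement of Ext(S), and takes Bdy(S) to be the set of points which lie in neither Int(S) nor Ext(S), as in Remark 15.3.1.

```python
# Illustrative finite topological space (an assumption for this sketch).
X = frozenset({1, 2, 3, 4})
TOPOLOGY = [frozenset(s) for s in ([], [1], [1, 2], [3, 4], [1, 3, 4], [1, 2, 3, 4])]


def interior(S):
    """Int(S): the union of all open sets which are included in S."""
    return frozenset().union(*(U for U in TOPOLOGY if U <= S))


def exterior(S):
    """Ext(S): the interior of the complement of S."""
    return interior(X - S)


def closure(S):
    """Closure of S: the complement of Ext(S)."""
    return X - exterior(S)


def boundary(S):
    """Bdy(S): the points which are in neither Int(S) nor Ext(S)."""
    return X - interior(S) - exterior(S)


def are_separated(S1, S2):
    """Definition 15.3.3: neither set meets the closure of the other."""
    return not (S1 & closure(S2)) and not (S2 & closure(S1))


S = frozenset({2, 3})
print(sorted(interior(S)), sorted(exterior(S)), sorted(boundary(S)))
print(are_separated(frozenset({1}), frozenset({3})))
print(are_separated(frozenset({1}), frozenset({2})))
```

In this space the boundary of {2, 3} is {2, 3, 4}, which contains points of both the set and its complement. The sets {1} and {2} are disjoint but not separated, because 2 lies in the closure of {1}. So disjointness alone is not enough for separation.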
15.4. Connectivity classes

[ Possibly could define pathwise connectivity in this section. But this might be better defined in a section on curves (like Section 16.2). ]

15.4.1 Definition: A connected topological space is a topological space (X, T) such that
∀Ω1, Ω2 ∈ T, (X = Ω1 ∪ Ω2 and Ω1 ∩ Ω2 = ∅) ⇒ (Ω1 = ∅ or Ω2 = ∅).
In other words, X cannot be partitioned into two non-empty open sets.

15.4.2 Remark: Any set with fewer than two elements must be connected because it cannot be partitioned into two non-empty subsets. Therefore the empty set and all singletons are connected.

15.4.3 Definition: A connected subset of a topological space (X, T) is a set Y ⊆ X such that
∀Ω1, Ω2 ∈ T, (Y ⊆ Ω1 ∪ Ω2 and Ω1 ∩ Ω2 = ∅) ⇒ (Ω1 ∩ Y = ∅ or Ω2 ∩ Y = ∅). (15.4.1)
In other words, Y cannot be partitioned by two non-empty open sets of X. A disconnected subset of a topological space (X, T) is a subset Y which is not connected. In other words,
∃Ω1, Ω2 ∈ T, Y ⊆ Ω1 ∪ Ω2 and Ω1 ∩ Ω2 = ∅ and Ω1 ∩ Y ≠ ∅ and Ω2 ∩ Y ≠ ∅. (15.4.2)
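When the topology is finite, condition (15.4.2) can be tested by exhaustive search over pairs of open sets. The Python sketch below is a hedged illustration only: the four-point space and the function names find_disconnection and is_connected_subset are assumptions made for the example. Only pairs of distinct open sets are searched, because a pair (Ω, Ω) is disjoint only if Ω = ∅, and the empty set cannot meet Y.

```python
from itertools import combinations

# Illustrative finite topological space (an assumption for this sketch).
X = frozenset({1, 2, 3, 4})
TOPOLOGY = [frozenset(s) for s in ([], [1], [1, 2], [3, 4], [1, 3, 4], [1, 2, 3, 4])]


def find_disconnection(Y):
    """Return open sets (U, V) which satisfy condition (15.4.2) for Y, or None."""
    for U, V in combinations(TOPOLOGY, 2):
        covers_Y = Y <= (U | V)
        disjoint = not (U & V)
        both_meet_Y = bool(U & Y) and bool(V & Y)
        if covers_Y and disjoint and both_meet_Y:
            return U, V
    return None


def is_connected_subset(Y):
    """Definition 15.4.3: Y is connected if no such pair of open sets exists."""
    return find_disconnection(Y) is None


for Y in ([1, 2], [3, 4], [2, 3], [1, 2, 3, 4], [2], []):
    print(sorted(Y), is_connected_subset(frozenset(Y)))
```

In this example the subsets {1, 2} and {3, 4} are connected, while {2, 3} and the whole space are disconnected by the open sets {1, 2} and {3, 4}. The empty set and the singleton {2} are connected, in agreement with Remark 15.4.2.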
15.4.4 Remark: Condition (15.4.1) in Definition 15.4.3 may be expressed in the following equivalent way.
∀Ω1, Ω2 ∈ T, (Y ⊆ Ω1 ∪ Ω2 and Ω1 ∩ Y ≠ ∅ and Ω2 ∩ Y ≠ ∅) ⇒ Ω1 ∩ Ω2 ≠ ∅. (15.4.3)
In other words, if the set Y is covered by two open sets Ω1 and Ω2, and each of these open sets contains at least one point of Y, then Ω1 and Ω2 must have at least one point in common. This is probably closer to one’s intuition of connectivity. Condition (15.4.3) is illustrated in Figure 15.4.1.

Figure 15.4.1: Definition of connectivity of a set. Left panel (disconnected): Y ⊆ Ω1 ∪ Ω2, Y ∩ Ω1 ≠ ∅, Y ∩ Ω2 ≠ ∅, Ω1 ∩ Ω2 = ∅. Right panel (connected): Y ⊆ Ω1 ∪ Ω2, Y ∩ Ω1 ≠ ∅, Y ∩ Ω2 ≠ ∅, Ω1 ∩ Ω2 ≠ ∅.
Intuitively speaking, one finds the “gap” between two portions of a set and covers each portion with an open set. If it is not possible to find any gap, the set must be connected. Difficulties arise, however, when the “gap” has zero width, which demands great skill to position the covering sets accurately. There is no margin for error! The more open sets you have, the more pairs of sets you can separate. So bigger topologies have more disconnected sets, which implies fewer connected sets.

15.4.5 Definition: A disconnection of a topological space X −< (X, T) is a pair (Ω1, Ω2) such that Ω1, Ω2 ∈ T, X = Ω1 ∪ Ω2, Ω1 ≠ ∅, Ω2 ≠ ∅ and Ω1 ∩ Ω2 = ∅. In other words, a disconnection of X is a partition of X into a pair of disjoint, non-empty open subsets.

15.4.6 Remark: In terms of Definition 15.4.5, one can say that a topological space is connected if and only if there does not exist any disconnection of the topological space.

15.4.7 Definition: A disconnection of a subset S of a topological space X −< (X, T) is a pair (K1, K2) such that S = K1 ∪ K2, K1 ≠ ∅, K2 ≠ ∅, K1 ∩ K2 = ∅, and there are open sets Ω1, Ω2 ∈ T such that K1 ⊆ Ω1, K2 ⊆ Ω2 and Ω1 ∩ Ω2 = ∅. In other words, a disconnection of S is a partition of S into a pair of disjoint non-empty subsets which are covered by a corresponding pair of disjoint open sets.

15.4.8 Remark: In terms of Definition 15.4.7, one can say that a subset of a topological space is connected if and only if there does not exist any disconnection of the subset.

15.4.9 Definition: A disconnected set of sets in a topological space X is a set C of non-empty sets in X which are pairwise disconnected.

15.4.10 Definition: A disconnection of a set of non-empty sets C in a topological space X is a pairwise disjoint set D of open sets in X which satisfies ∀S ∈ C, ∃Ω ∈ D, S ⊆ Ω.
15.4.11 Theorem: Let X be a normal topological space. Let K1 and K2 be disjoint non-empty closed subsets of X. Then (K1, K2) is a disconnection of K1 ∪ K2.

[ Check whether Theorem 15.4.12 is valid if S1 = ∅ or S2 = ∅. Also check generally the extent to which the non-emptiness of the sets has to be stated explicitly for definitions of separation and disconnectedness. To the extent that the non-empty conditions are not required, they should be dropped. ]

15.4.12 Theorem: Let S1, S2 be sets in a normal topological space X. Then
(S1 ∩ S̄2 = ∅) ∧ (S̄1 ∩ S2 = ∅) ⇔ ∃Ω1, Ω2 ∈ Top(X), (S1 ⊆ Ω1 ∧ S2 ⊆ Ω2 ∧ Ω1 ∩ Ω2 = ∅).

Proof: Let S1, S2 be sets in a topological space X and suppose that (S1 ∩ S̄2 = ∅) ∧ (S̄1 ∩ S2 = ∅). Let Ω1 = Ext(S2) and Ω2 = Ext(S1). Then (To be continued . . . )

15.4.13 Remark: Separation of a pair of non-empty sets may be expressed as follows in a normal topological space X.
S1 and S2 are separated
⇔ (S1 ∩ S̄2 = ∅) ∧ (S2 ∩ S̄1 = ∅)
⇔ (S1 ⊆ Ext(S2)) ∧ (S2 ⊆ Ext(S1))
⇔ ∃Ω1, Ω2 ∈ Top(X), (S1 ⊆ Ω1 ∧ S2 ⊆ Ω2 ∧ Ω1 ∩ Ω2 = ∅)
⇔ S1 and S2 are disconnected.
Non-separation of a pair of non-empty sets may be expressed as follows in a normal topological space X.
S1 and S2 are non-separated
⇔ (S1 ∩ S̄2 ≠ ∅) ∨ (S2 ∩ S̄1 ≠ ∅)
⇔ (S1 ⊈ Ext(S2)) ∨ (S2 ⊈ Ext(S1))
⇔ ∀Ω1, Ω2 ∈ Top(X), (S1 ⊈ Ω1 ∨ S2 ⊈ Ω2 ∨ Ω1 ∩ Ω2 ≠ ∅)
⇔ S1 and S2 are connected.
The definitions of separation and disconnection seem more natural than connection and non-separation. Connectedness is therefore generally defined as the logical negative of disconnectedness. In other words, connectedness is defined as a double negative. This suggests that disconnectedness is the primary concept. In plain English, the words “connected” and “separated” are the logical negatives of each other. Therefore the word “separation” does seem to be the natural word to use for this concept, despite the clash with other terminology in general topology. The difference between separation and disconnection of a pair of sets S1, S2 is illustrated in Figure 15.4.2.

Figure 15.4.2: Difference between separation and disconnection of sets. Left panel (separated sets): S1 ⊆ Ω1, S1 ∩ Ω2 = ∅, S2 ⊆ Ω2, S2 ∩ Ω1 = ∅. Right panel (disconnected sets): S1 ⊆ Ω1, S2 ⊆ Ω2, Ω1 ∩ Ω2 = ∅.
15.4.14 Definition: A connected component of a topological space (X, T) is a non-empty open set Ω ∈ T such that X \ Ω ∈ T is non-empty and Ω is a connected subset of X. The connected component of a point x in a topological space (X, T) is the connected component of (X, T) which contains x.

[ In Remark 15.4.15, must prove that the set of connected components in a topological space is a partition. Make this an exercise? ]

15.4.15 Remark: The set of connected components of a topological space (X, T) is a partition of X. So the connected component for a given point is well-defined. Notation 15.4.16 implies that Top^c_x(X) ∈ Top_x(X). Thus the connected component of x is the unique open neighbourhood of x which is a connected component of X.

15.4.16 Notation: Top^c_x(X) for a topological space X and x ∈ X denotes the connected component of X which contains x.

[ Try to find a better notation for connected components. ]

15.4.17 Theorem: Let X and Y be topological spaces, f : X → Y be continuous and A ⊆ X be connected. Then f(A) is connected. In other words, the continuous image of a connected set is connected.

Proof: Suppose f(A) is not connected. Then there are Ω1, Ω2 ∈ Top(Y) such that f(A) ⊆ Ω1 ∪ Ω2, f(A) ∩ Ω1 ≠ ∅, f(A) ∩ Ω2 ≠ ∅ and Ω1 ∩ Ω2 = ∅. But A ⊆ f⁻¹(f(A)) (for any function f and set A), and f⁻¹ maps disjoint sets to disjoint sets (because the inverse relation of any function is one-to-one). By Definition 14.11.2, f⁻¹ maps open sets in Y to open sets in X. Therefore f⁻¹(Ω1), f⁻¹(Ω2) ∈ Top(X), A ⊆ f⁻¹(f(A)) ⊆ f⁻¹(Ω1 ∪ Ω2) = f⁻¹(Ω1) ∪ f⁻¹(Ω2), A ∩ f⁻¹(Ω1) ≠ ∅, A ∩ f⁻¹(Ω2) ≠ ∅ and f⁻¹(Ω1) ∩ f⁻¹(Ω2) = ∅. By Definition 15.4.3, this implies that A is not connected, which contradicts the assumption on A. Hence f(A) is connected.

15.4.18 Remark: Theorem 15.4.19 is an immediate corollary of Theorem 15.4.17.

15.4.19 Theorem: Let X and Y be topological spaces. Let f : X → Y be continuous and A ⊆ X be connected. Then the graph of the restriction f|A of f to A is a connected subset of X × Y with the product topology.

Proof: By Theorem 15.1.3, the map g : X → X × Y defined by g : x ↦ (x, f(x)) is continuous if f is continuous. Therefore the image g(A), which is the graph of f|A, is connected by Theorem 15.4.17.

15.4.20 Definition: A locally connected topological space is a topological space X such that ∀x ∈ X, ∀Ω ∈ Top_x(X), ∃Ω′ ∈ Top_x(X), Ω′ ⊆ Ω and Ω′ is connected.
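Theorem 15.4.17 can be checked by brute force on small finite spaces. The Python sketch below is an illustration under stated assumptions: the two finite spaces, and the helper names is_connected_subset, is_continuous, image and subsets, are invented for this example. It enumerates all functions from a four-point space to a two-point discrete space, selects the continuous ones by the open pre-image condition, and checks that each of them maps every connected subset to a connected subset.

```python
from itertools import combinations, product

# Two illustrative finite topological spaces (assumptions for this sketch).
X = (1, 2, 3, 4)
TOP_X = {frozenset(s) for s in ([], [1], [1, 2], [3, 4], [1, 3, 4], [1, 2, 3, 4])}
Y = (0, 1)
TOP_Y = {frozenset(s) for s in ([], [0], [1], [0, 1])}  # discrete topology on Y


def is_connected_subset(A, topology):
    """Definition 15.4.3: no pair of disjoint open sets splits A."""
    return not any(
        A <= (U | V) and not (U & V) and bool(U & A) and bool(V & A)
        for U, V in combinations(topology, 2)
    )


def is_continuous(f):
    """The pre-image of every open set of Y is an open set of X."""
    return all(frozenset(x for x in X if f[x] in V) in TOP_X for V in TOP_Y)


def image(f, A):
    """The image f(A) of a subset A of X."""
    return frozenset(f[x] for x in A)


def subsets(points):
    """All subsets of a finite tuple of points."""
    return [frozenset(c) for r in range(len(points) + 1) for c in combinations(points, r)]


all_functions = [dict(zip(X, values)) for values in product(Y, repeat=len(X))]
continuous = [f for f in all_functions if is_continuous(f)]
connected_sets = [A for A in subsets(X) if is_connected_subset(A, TOP_X)]

preserves = all(
    is_connected_subset(image(f, A), TOP_Y) for f in continuous for A in connected_sets
)
print(len(continuous), preserves)

# A discontinuous function which maps the connected set {1, 2} onto a disconnected set.
g = {1: 0, 2: 1, 3: 0, 4: 0}
print(is_continuous(g), is_connected_subset(image(g, frozenset({1, 2})), TOP_Y))
```

A finite check of this kind proves nothing in general, but it makes the roles of the pre-images f⁻¹(Ω1) and f⁻¹(Ω2) in the proof concrete.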
15.4.21 Remark: A connected space is not necessarily locally connected, and vice versa. Examples of connected spaces which are not locally connected are the “comb space” in EDM2 [34], section 79.A, and the sine-of-reciprocal function in Example 15.4.23, which is presented by Simmons [139], section 34, page 151.

15.4.22 Remark: Some relations between connectivity classes are illustrated in Figure 15.4.3.

Figure 15.4.3: Family tree for connectivity and separation classes. Classes shown: topological space, Hausdorff space, locally connected space, connected space, locally Euclidean space, topological manifold, each denoted (X, T_X).
15.4.23 Example: Define a topological space X as the set
X = {(x, y) ∈ ℝ²; (x = 0 ∧ y ∈ [−1, 1]) ∨ (x ∈ (0, 1] ∧ y = sin(π/(2x)))}
with the relative topology of ℝ². (This is illustrated in Figure 15.4.4.)

Figure 15.4.4: Connected set which is not locally connected. The figure plots y = f(x) against x, where f(x) = sin(π/(2x)) for x > 0 and f(x) = 0 for x = 0, with the region near x = 0 marked “not locally connected”.
X is connected because it is the closure of the graph of sin(π/(2x)) for x ∈ (0, 1]. (The graph is connected by Theorem 15.4.17, and the closure of a connected set is connected.) Every sufficiently small neighbourhood in X of a point in {0} × [−1, 1] has an infinite number of connected components. Therefore X is not locally connected at any of these points.

[ Define simply connected topologies near here? ]
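The claim that X is the closure of the graph can be illustrated numerically. The Python sketch below is only an illustration, and the function name curve_points_at_height is an assumption made for this example. For a given height y ∈ [−1, 1], it solves sin(π/(2x)) = y by taking π/(2x) = arcsin(y) + 2πk for k = 1, 2, ..., which gives points (x_k, y) on the graph with x_k → 0. So every point (0, y) of the segment {0} × [−1, 1] is a limit of points of the graph.

```python
import math


def curve_points_at_height(y, count):
    """Return x_1 > x_2 > ... > x_count in (0, 1] with sin(pi / (2 * x_k)) = y.

    Each x_k solves pi / (2 * x_k) = asin(y) + 2 * pi * k for k = 1, ..., count.
    """
    base = math.asin(y)  # principal solution, in the interval [-pi/2, pi/2]
    return [math.pi / (2 * (base + 2 * math.pi * k)) for k in range(1, count + 1)]


for y in (-1.0, 0.0, 0.5, 1.0):
    xs = curve_points_at_height(y, 4)
    print(y, [round(x, 5) for x in xs])
```

Since the graph returns to every height in [−1, 1] arbitrarily close to x = 0, a small ball around a point (0, y) meets infinitely many separate arcs of the curve, which is the reason for the failure of local connectedness.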
15.4.24 Remark: In algebraic topology (Section 16.6), there is a very wide range of definitions of connectivity of a topological manifold. These are mostly based on properties of curves and families of curves in the manifold, which are in turn defined in terms of intervals, which are the connected subsets of the real numbers.
15.5. Definition of continuity of functions using connectivity

[ Show how other classes of separation between sets are related to continuity. This is done in Theorems 14.11.9 and 14.11.11. ]

15.5.1 Remark: Usually the continuity of a function is defined in terms of the action of the function (or its inverse) on sets. But there are two ways of thinking about a function: either as an active map from one set to another, or as a static set of ordered pairs, namely the “graph” of the function. Theorem 15.2.17 is an example of the graph view of continuity of a function. Theorem 15.2.17 shows that if a function is continuous (and its range is a Hausdorff space), then the graph is closed. (This is similar to, but not the same as, the closed graph theorem for Banach space operators.) Definition 14.11.2 is succinct and convenient for applications. From the technical point of view, it is a good definition. But it does not clearly correspond to the intuitive concept of continuity in everyday life. To understand the “true nature” of continuity, it is desirable to find an alternative definition. In elementary introductory courses, continuity is explained in terms of the continuity of the graph, meaning the connectedness of the graph. Theorem 15.4.19 states that the graph of a continuous function is connected. But the converse does not hold. Theorem 15.5.8 shows that if the range of a function is a normal space, the function is continuous if and only if the inverse function maps disconnected sets to correspondingly disconnected sets.

15.5.2 Remark: It is useful to introduce here some non-standard definitions of connectivity properties of functions. Definitions 15.5.3 and 15.5.4 are useful for presenting some relations between connectivity and continuity properties of functions.

15.5.3 Definition: A weakly connected function from topological space X to topological space Y is a function f : X → Y such that:
∀C ⊆ f(X), f⁻¹(C) is connected ⇒ C is connected.
In other words, the image of a connected pre-image is connected.

15.5.4 Definition: A strongly connected function from topological space X to topological space Y is a function f : X → Y such that:
∀C ⊆ f(X), ∀K1, K2 ⊆ Y, (K1, K2) is a disconnection of C ⇒ (f⁻¹(K1), f⁻¹(K2)) is a disconnection of f⁻¹(C).
In other words, the pre-images of a disconnection pair are a disconnection pair for the pre-image.

15.5.5 Remark: It follows from Theorem 15.4.17 that any continuous function is weakly connected. This can be seen by setting A = f⁻¹(C) and noting that f(A) = f(f⁻¹(C)) = C for any function f and set C ⊆ f(X). It follows from the proof of Theorem 15.4.17 that a continuous function is also strongly connected, as stated in Theorem 15.5.6.

15.5.6 Theorem: If X and Y are topological spaces and f : X → Y is continuous, then f is strongly connected.

Proof: See proof of Theorem 15.4.17.

15.5.7 Remark: Theorem 15.5.8 means that a function f : X → Y is continuous if and only if a subset B of Y is connected whenever the pre-image f⁻¹(B) is connected, and if B is disconnected, f⁻¹(B) is disconnected by the inverse images of the partition for B. Figure 15.5.1 illustrates this idea.
Figure 15.5.1: Continuous function pre-images of disconnected sets. Left panel (f is continuous): B = B1 ∪ B2 is disconnected and f⁻¹(B) = f⁻¹(B1) ∪ f⁻¹(B2) is disconnected. Right panel (f is discontinuous): B = B1 ∪ B2 is disconnected but f⁻¹(B) = f⁻¹(B1) ∪ f⁻¹(B2) is connected.
[ Try to weaken the normal space condition in Theorem 15.5.8 to the Hausdorff condition using a construction such as in the proof of Theorem 15.2.17 which is given in exercise answer 48.7.3. If Hausdorff doesn’t work, try to replace the normal space with a completely regular space. In fact, a regular space does seem to be adequate. Also see Theorems 14.11.9 and 14.11.11 for similar equivalences for a general topological space with a weaker notion of set separation. ]
15.5.8 Theorem: Let X be a topological space and Y be a normal space. Then f : X → Y is continuous if and only if f is strongly connected.

Proof: The forward implication of the theorem follows from Theorem 15.5.6. It remains to show that the function f is continuous if it is strongly connected. Let f : X → Y be strongly connected. If X = ∅, then f is trivially continuous. So assume that X ≠ ∅. Let y ∈ f(X) and Ω ∈ Top_y(Y). To prove continuity of f, it must be shown that for any x ∈ X such that y = f(x), there is an open neighbourhood G ∈ Top_x(X) such that f(G) ⊆ Ω. (In other words, it must be shown that f⁻¹(Ω) is an open subset of X.) Let K1 = {y} and K2 = Y \ Ω. Then K1 is closed because Y is a T1 space. (See Definition 15.2.4.) K2 is closed because Ω is open. Since K1 and K2 are disjoint closed sets and Y is a normal space, there exist disjoint G1, G2 ∈ Top(Y) such that K1 ⊆ G1 and K2 ⊆ G2. So (K1, K2 ∩ f(X)) is a disconnection of K = (K1 ∪ K2) ∩ f(X) = K1 ∪ (K2 ∩ f(X)). By the strong connectivity of f, the pair (f⁻¹(K1), f⁻¹(K2)) must be a disconnection of f⁻¹(K). Therefore there are open sets H1, H2 ∈ Top(X) such that x ∈ H1, f⁻¹(K2) ⊆ H2 and H1 ∩ H2 = ∅. So H1 ⊆ X \ H2 ⊆ X \ f⁻¹(K2) = X \ f⁻¹(Y \ Ω) = f⁻¹(Ω). Thus f(H1) ⊆ Ω. Hence (or otherwise), f is continuous.

15.5.9 Remark: An immediate corollary of Theorem 15.5.8 is Theorem 15.5.10 for one-to-one functions.

15.5.10 Theorem: Let X be a topological space and Y be a normal space. If f : X → Y is one-to-one, then f is continuous if and only if
∀A, B ⊆ X, (f(A), f(B)) is a disconnection of f(A) ∪ f(B) ⇒ (A, B) is a disconnection of A ∪ B.
In other words, f is continuous if and only if the image of a pair which is not a disconnection is a pair which is not a disconnection.

15.5.11 Remark: Theorem 15.5.10 means that a one-to-one function f : X → Y is continuous if and only if the image set f(A) is connected for any connected subset A of X, and if f(A) is disconnected, A is disconnected by the inverse images of the partition for f(A). Figure 15.5.2 illustrates this idea.

Figure 15.5.2: Continuous one-to-one function pre-images of disconnected sets. Left panel (f is continuous): f(A) = f(A1) ∪ f(A2) is disconnected and A = A1 ∪ A2 is disconnected. Right panel (f is discontinuous): f(A) = f(A1) ∪ f(A2) is disconnected but A = A1 ∪ A2 is connected.

15.5.12 Remark: Very roughly speaking, a function is said to be continuous if it maps connected sets to connected sets. In other words, the function must preserve connectedness. That is, if there is no gap in a subset of the domain of the function, there must be no gap in the image of that subset by the function. If there is a gap in the range of the function, there must be a corresponding gap in the domain. “Corresponding” means that if a subset C of the range is split into A and B, then the inverse image f⁻¹(C) is split into f⁻¹(A) and f⁻¹(B). It follows from this that the concept of continuity can be presented in terms of the simple intuitive concept of connectivity instead of the convoluted ε-δ definitions which are used for metric spaces or the open-inverse-function definition for general topologies. Although connectivity is technically defined in terms of open sets, people have a strong intuition of connectivity without mentioning open sets. Connectivity is generally thought of as the absence of a “gap”, which is much easier to grasp than the usual universal/existential quantifier combination. Maybe continuity should be defined in elementary texts as: “A continuous function makes no new gaps.” A continuous function may close up some gaps, but it never opens up new ones. Therefore: “Continuous functions are the functions which preserve connectivity.” Alternatively: “Continuous functions are the functions whose inverses preserve disconnection.” To put it simply, f is continuous ⇔ f⁻¹ preserves gaps.
15.5.13 Remark: Theorem 15.5.8 does not imply that f is continuous if the image of any connected set is connected. A counterexample to this is the function f : [0, ∞) → ℝ defined by f(x) = sin(π/(2x)) for x > 0 and f(0) = 0. For any connected subset C of [0, ∞) which does not contain 0, f(C) is connected because f is continuous on (0, ∞). If C = {0}, then f(C) = {0}, which is connected. If 0 ∈ C and C ≠ {0} is connected, then f(C) = [−1, 1], which is connected. But f is not strongly connected. It is easy to show that if K1 = {y}, K2 = {1} and C = K1 ∪ K2, then (K1, K2) is a disconnection of C for y ∈ [−1, 1), but (f⁻¹(K1), f⁻¹(K2)) is not a disconnection of f⁻¹(C) for y = 0. This is because it is not possible to find a neighbourhood of 0 ∈ [0, ∞) which does not intersect f⁻¹(K2).
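The last step of Remark 15.5.13 can be made explicit. Since sin(π/(2x)) = 1 whenever π/(2x) = π/2 + 2πk, the points x_k = 1/(4k + 1) all belong to f⁻¹({1}), and x_k → 0. The Python sketch below is a hedged illustration of this; the function name witness_in_preimage_of_one is an assumption made for the example. For any δ > 0, it returns a point of f⁻¹({1}) in the neighbourhood [0, δ) of 0.

```python
import math


def f(x):
    """The function of Remark 15.5.13 on [0, infinity)."""
    return math.sin(math.pi / (2 * x)) if x > 0 else 0.0


def witness_in_preimage_of_one(delta):
    """Return a point x with 0 < x < delta and f(x) = 1 (up to rounding error).

    The point is x = 1 / (4k + 1), with k chosen so that 4k + 1 > 1 / delta.
    """
    k = math.floor(1 / (4 * delta)) + 1
    return 1 / (4 * k + 1)


for delta in (1.0, 0.1, 0.001, 1e-6):
    x = witness_in_preimage_of_one(delta)
    print(delta, x, f(x))
```

So every neighbourhood of 0 contains points which f maps to 1, although f(0) = 0. No open set which contains 0 can therefore be disjoint from an open set which includes f⁻¹({1}).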
15.6. Open bases, countability classes and separability

15.6.1 Remark: In computer representations, it would be very inconvenient to represent a topology as its complete set of open sets. There are more efficient representations. In the case of metric spaces (Definition 17.1.2), the topological specification is fully contained in the distance function. The corresponding structure for a general topology is an “open base” or an “open subbase”. The entire topology can be generated from such sets by operations such as union and finite intersection. For computation, such bases should not be too large. If a topology has a countable open base, it is said to be “second countable”. (See Simmons [139], pages 99–100.) Since computers can’t even cope with countably infinite sets, topologies are more likely to be dealt with in computers by providing a test function which examines a representation of a set to see if it can be proven by a set of rules to be open or not. For instance, sets might be represented as unions and intersections of sets satisfying various constraints in terms of functions which are themselves represented in symbolic algebra. Thus it is more likely that the topological nature of sets would be determined in a rule-based manner.
[ Open bases and subbases are already defined in Section 14.10. Therefore should remove them from here? ]

15.6.2 Definition: An open base for a topological space (X, T) is a set S of subsets of X such that ∀Ω ∈ T, ∃S′ ⊆ S, Ω = ⋃S′. In other words, every open set is the union of some subset of the open base.

15.6.3 Definition: A second countable (topological) space is a topological space (X, T) for which there exists a countable open base.

15.6.4 Definition: An open subbase of a topological space (X, T) is . . .

15.6.5 Definition: A dense subset of a topological space (X, T) is a set K ⊆ X such that the closure of K is X in the topology T.

15.6.6 Definition: A separable topological space is a topological space (X, T) such that X has a countable dense subset.

[ 2009-3-3: Determine the relations between density, separability and decomposition of topological spaces into components. The relations between interior/exterior separation and disconnectedness are already under close study. So bring in also the density and countable dense subset ideas. ]
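For a finite set, the generation of a topology “by operations such as union and finite intersection”, as mentioned in Remark 15.6.1, can be sketched directly. The Python code below is only an illustration under stated assumptions. Definition 15.6.4 is left unfinished in this draft, so the code assumes the usual meaning of an open subbase: the finite intersections of subbase elements form an open base, and the unions of base elements form the topology. The function names topology_from_subbase and is_open_base and the example sets are hypothetical.

```python
from itertools import combinations


def topology_from_subbase(X, subbase):
    """Return (base, topology) generated by a finite subbase on a finite set X."""
    X = frozenset(X)
    subbase = [frozenset(s) for s in subbase]
    # Finite intersections of subbase elements; the empty intersection is taken to be X.
    base = {X}
    for r in range(1, len(subbase) + 1):
        for chosen in combinations(subbase, r):
            base.add(frozenset.intersection(*chosen))
    # Unions of subsets of the base; the empty union is the empty set.
    topology = set()
    for r in range(len(base) + 1):
        for chosen in combinations(base, r):
            topology.add(frozenset().union(*chosen))
    return base, topology


def is_open_base(S, topology):
    """Definition 15.6.2: every open set is the union of some subset of S."""
    return all(
        frozenset().union(*(B for B in S if B <= U)) == U for U in topology
    )


base, topology = topology_from_subbase({1, 2, 3, 4}, [{1, 2}, {1, 3, 4}, {3, 4}])
print(sorted(sorted(U) for U in topology))
print(is_open_base(base, topology))
print(is_open_base([{1, 2}, {3, 4}], topology))
```

The search over all subsets of the base grows exponentially, so this is workable only for very small examples. This is consistent with the comment in Remark 15.6.1 that a rule-based test for openness is more practical than storing every open set.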
15.7. Compactness classes

EDM2 [34], 273.F, suggests that the term “compact” was introduced in 1906 by René Maurice Fréchet.

[ Is topological dimension a kind of compactness? If so, include it in this section. ]

15.7.1 Definition: A cover or covering of a subset A in a set X is a family (B_i)_{i∈I} of sets B_i ⊆ X for i ∈ I such that A ⊆ ⋃_{i∈I} B_i. An open cover or open covering of a subset A ⊆ X in a topological space (X, T) is a covering (B_i)_{i∈I} of A in X such that B_i ∈ T for all i ∈ I.

15.7.2 Definition: A refinement of a covering (B_i)_{i∈I} of a subset A in a set X is a covering C = (C_j)_{j∈J} of A in X such that
∀j ∈ J, ∃i ∈ I, C_j ⊆ B_i.
An open refinement of an open covering (B_i)_{i∈I} of a subset A ⊆ X in a topological space (X, T) is a refinement C = (C_j)_{j∈J} of (B_i)_{i∈I} in the set X such that C is an open covering of A in the topological space (X, T).

15.7.3 Definition: A locally finite subset of a topology T on a set X is a set C ⊆ T such that each point of X has a neighbourhood which intersects at most a finite number of elements of C. That is,
∀x ∈ X, ∃G ∈ Top_x(X), #{Ω ∈ C; Ω ∩ G ≠ ∅} < ∞.

[ Need to find out under what assumptions on the topology a locally finite set of open sets is the same thing as a pointwise finite set of open sets. ]

15.7.4 Definition: A compact set in a topological space (X, T_X) is a subset K of X such that every open covering of K has a finite subcover; that is,
∀C ⊆ T_X, K ⊆ ⋃C ⇒ ∃C′ ⊆ C, (K ⊆ ⋃C′ and #(C′) < ∞).
A compact topological space is a topological space (X, T_X) such that X is a compact set in X.

15.7.5 Remark: It seems that the characterization of compactness in Definition 15.7.4 is called the Heine-Borel condition or Heine-Borel compactness.

15.7.6 Theorem: If F is a closed subset of a compact set K, then F is compact.
Proof: Let C be an open cover for a closed subset F of a compact set K in a topological space (X, T). Define C′ = C ∪ {X \ F}. Then C′ is an open cover of K. So there is a finite subset C1 of C′ which covers K. Define C2 = C ∩ C1. Then C2 is a finite subset of C. Since C1 ⊆ C ∪ {X \ F}, C2 must equal either C1 or C1 \ {X \ F}. But C1 covers F because F ⊆ K, and C1 \ {X \ F} also covers F because (X \ F) ∩ F = ∅. So C2 ⊆ C is a finite subcover of F. The theorem follows.

15.7.7 Theorem: If X and Y are topological spaces, f : X → Y is continuous and X is compact, then f(X) is compact. That is, the image of a compact set under a continuous map is compact.

[ Define compact-open topology for C(X, Y). See EDM2 [34], 202.C, “mapping spaces”. EDM2 [34], 435.D, states many useful properties of the compact-open topology. ]

15.7.8 Remark: The set of continuous functions from X to Y is denoted as C(X, Y) in Notation 14.11.12. Definition 15.7.9 defines a stronger topology on C(X, Y) than the pointwise convergence topology in Definition 14.11.21.

15.7.9 Definition: The compact-open topology on the set of functions C(X, Y) for topological spaces X and Y is the topology generated by {Ω_{K,U}; K is a compact subset of X and U ∈ Top(Y)} on C(X, Y), where Ω_{K,U} = {f : X → Y; f(K) ⊆ U} for all compact K ⊆ X and open U ⊆ Y.

[ Prove that the compact-open topology is stronger than the pointwise topology (if true). ]

15.7.10 Remark: Tikhonov’s Theorem states that the direct product of an arbitrary set of compact topological spaces is a compact topological space. (See Definition 15.1.4 for direct products of topological spaces.) However, this theorem requires the axiom of choice. Therefore it is not quoted here.

[ Must present a form of Tikhonov’s Theorem which does not require AC. The full name of Tikhonov is Андрей Николаевич Тихонов. Give Tikhonov’s theorem here, but tag it as AC-tainted. ]

15.7.11 Definition: A sequentially compact set in a topological space (X, T_X) is a subset K of X such that every infinite sequence in K has a convergent subsequence.

15.7.12 Definition: A locally compact topology on a set X is a topology on X such that every point of X has a compact neighbourhood. That is,
∀x ∈ X, ∃Ω ∈ Top_x(X), Ω̄ is compact,
where Ω̄ means the closure of Ω in Top(X).

15.7.13 Remark: If X is a compact topological space, then X is locally compact.

15.7.14 Definition: A paracompact topology on a set X is a topology on X such that every open covering of X has a locally finite open refinement.

15.7.15 Remark: If X is a compact topological space, then X is paracompact.

[ Should give here an example of a paracompact set which is not locally compact. ]

Any metrizable topological space is paracompact. So paracompactness is a fairly weak compactness property. (See Chapter 17 for metric spaces.) The usefulness of paracompactness in differential geometry is that it guarantees the existence of partitions of unity, according to Warner [49], page 8. Warner [49], lemma 1.9, page 9, says that a topological space is paracompact if it is locally compact, Hausdorff and second countable, and since manifolds are second countable, this means that all manifolds are paracompact. In fact, they are all metrizable. (Proof in Kelley [117].)

[ Must check that none of the results in Remark 15.7.15 depend on the axiom of countable choice. Is the axiom of countable choice required? ]
Figure 15.7.1  Family tree of compactness classes
(Diagram not reproduced. Its nodes, each a class of spaces (X, TX), are: topological space; Hausdorff space; locally compact space; paracompact space; locally Euclidean space; locally compact, Hausdorff, second countable space; compact space; topological manifold.)
15.7.16 Remark: The above classes of topological spaces are summarized in the family tree in Figure 15.7.1. The fact that the Hausdorff property is not implied by the compactness properties is proven by the trivial topology {∅, IR} for IR. This is clearly not Hausdorff, but it is compact because all open covers are finite. A topological manifold is defined as a locally Euclidean Hausdorff space in Definition 26.3.1. See EDM2 [34], article 425, page 1618, for a comprehensive set of family trees for topological spaces. [ Near here, present complete spaces, sequential compactness, etc. ]
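The observation in Remark 15.7.16 that a trivial topology is compact but not Hausdorff can be checked mechanically on a small finite stand-in. The following is only a sketch: the three-point set replaces IR, and the helper names `is_topology` and `is_hausdorff` are illustrative assumptions, not notation from this book. On a finite set every topology has finitely many open sets, so every open cover is finite, which mirrors the compactness argument in the remark.

```python
from itertools import combinations

def is_topology(points, opens):
    """Check the open-set axioms for a finite family of subsets of `points`."""
    whole = frozenset(points)
    if frozenset() not in opens or whole not in opens:
        return False
    # For a finite family, pairwise closure implies closure under all unions.
    return all((u & v) in opens and (u | v) in opens
               for u in opens for v in opens)

def is_hausdorff(points, opens):
    """True if every two distinct points have disjoint open neighbourhoods."""
    for x, y in combinations(points, 2):
        separated = any(x in u and y in v and not (u & v)
                        for u in opens for v in opens)
        if not separated:
            return False
    return True

# Hypothetical finite stand-in for IR: a three-point set.
points = {0, 1, 2}
trivial = {frozenset(), frozenset(points)}
discrete = {frozenset(s) for r in range(4) for s in combinations(points, r)}

print(is_hausdorff(points, trivial))   # the trivial topology fails to separate points
print(is_hausdorff(points, discrete))  # the discrete topology separates all points
```

The discrete topology is included only for contrast; it shows that the failure of the Hausdorff property comes from the scarcity of open sets, not from the underlying point set.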
15.8. Topological properties of real number intervals
15.8.1 Remark: For intervals of real numbers in Theorem 15.8.2, see Definition 8.3.10. For the topology on IR, see Definition 14.9.1.
15.8.2 Theorem: A set I ⊆ IR is connected in the usual topology on IR if and only if I is an interval.
Proof: The empty set and all singletons are real intervals and are connected. So assume that a set I ⊆ IR contains at least two elements. A non-empty subset I of IR is an interval if and only if I ⊇ (inf I, sup I). (See Theorem 8.3.11.) Since I has two or more elements, inf I < sup I. So if I is not an interval, then there must be elements x, z ∈ I and y ∈ IR \ I such that x < y < z. (Otherwise I ⊇ (inf I, sup I) would hold.) Hence I is disconnected by the open sets Ω1 = (−∞, y) and Ω2 = (y, ∞), which cover I. Now suppose that I is not connected. . . .
[ Simmons [139], page 143, has a proof. Should produce a more satisfying version of that here. ]
15.8.3 Remark: From the topological point of view, two real intervals are equivalent if they are homeomorphic, but there is a further distinction according to whether the homeomorphism is order-preserving. The following table classifies intervals into equivalence classes with respect to order-preserving (increasing) homeomorphisms. It is assumed that a, b ∈ IR with a < b.
      type            intervals                              properties of intervals
0.    empty           ∅                                      compact, closed, bounded
1.    single-point    [a, a]                                 compact, closed, bounded
2.    closed          [a, b]                                 compact, closed, bounded
3a.   closed-open     [a, b), [a, ∞)                         neither open nor closed
3b.   open-closed     (a, b], (−∞, b]                        neither open nor closed
4.    open            (a, b), (a, ∞), (−∞, b), (−∞, ∞)       open
There are 6 equivalence classes of intervals for oriented homeomorphisms and 5 classes for unoriented homeomorphisms because (3a) and (3b) are equivalent if reversals are permitted. Since all topological properties of the image of a curve are invariant under homeomorphisms of the parameter interval, one may represent all possibilities in terms of bounded intervals, and one may reduce these intervals to the canonical case that a = 0 and b = 1.
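The reduction of intervals to the canonical case a = 0 and b = 1 can be made concrete with explicit increasing homeomorphisms. The sketch below is illustrative only: the function names and the particular formulas (an affine map, the map t ↦ (t − a)/(1 + t − a), and an arctangent map) are choices made here, not maps specified in this book. Many other increasing homeomorphisms would serve equally well.

```python
import math

def affine_to_unit(a, b):
    """Increasing homeomorphism of [a, b] onto [0, 1]; it also maps
    (a, b), [a, b) and (a, b] onto the corresponding unit intervals."""
    return lambda t: (t - a) / (b - a)

def ray_to_unit(a):
    """Increasing homeomorphism of [a, oo) onto [0, 1)."""
    return lambda t: (t - a) / (1.0 + (t - a))

def line_to_unit():
    """Increasing homeomorphism of (-oo, oo) onto (0, 1)."""
    return lambda t: 0.5 + math.atan(t) / math.pi

def reversal(a, b):
    """Order-reversing homeomorphism swapping [a, b) and (a, b]."""
    return lambda t: a + b - t

def is_increasing(values):
    """True if the sampled values are strictly increasing."""
    return all(u < v for u, v in zip(values, values[1:]))

f = affine_to_unit(2.0, 5.0)
g = ray_to_unit(3.0)
h = line_to_unit()
samples = [3.0, 3.5, 4.0, 10.0, 1000.0, 1.0e6]
print([g(t) for t in samples])  # increasing values in [0, 1)
```

The map `reversal` illustrates why classes (3a) and (3b) merge when orientation-reversing homeomorphisms are permitted.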
15.9. Topological dimension
Topological dimension has some relevance to the determination of sufficient conditions for a locally compact transformation group to be a Lie group. (See Section 34.1.)
[ See EDM2 [34], section 117.B. ]
15.9.1 Definition: The Lebesgue dimension of a normal topological space (X, T) is the smallest n ∈ ℤ_0^+ such that for any finite open covering G of X (i.e. a collection G = (G_i)_{i=1}^s such that G_i ∈ T for all i = 1 . . . s and X = ⋃_{i=1}^s G_i), for some open refinement H of G (i.e. a collection H = (H_i)_{i=1}^s such that H_i ∈ T and H_i ⊆ G_i for all i = 1 . . . s and X = ⋃_{i=1}^s H_i), for all sets I ⊆ {1 . . . s} such that #(I) = n + 2, the set ⋂_{i∈I} H_i is empty. The Lebesgue dimension of X is denoted as dim X or dim(X). If no such n ∈ ℤ_0^+ exists, then the dimension of (X, T) is infinite. This is denoted dim X = ∞.
[ You can’t be serious! Definition 15.9.1 is way too convoluted. What does it really mean? Examples are required. Whose big idea was it to invent this definition anyway? ]
[ Must define cardinality of sets in the numbers section. ]
15.9.2 Remark: Definition 15.9.1 obviously requires an example or two for clarification.
[ Find out what the relation is between Lebesgue dimension and Euclidean dimension. Also find out what this has to do with n-cells. ]
[ Near here, there could be a short section stating a fairly complete set of properties of the empty set topology and the topologies on sets with one or two elements, even maybe with 3 elements. Alternatively, the properties of such discrete topologies could be set as easy exercises. ]
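As one possible illustration of Definition 15.9.1, consider the unit interval [0, 1] covered by finitely many open intervals. The condition in the definition says that no n + 2 sets of the refinement have a common point. The sketch below is an assumption-laden example, not a computation of dimension: it inspects a single hand-picked cover G and a single hand-picked refinement H, and the helper names (`common_part`, `cover_order`, `is_shrinking`, `covers_unit_interval`) are hypothetical. It shows that a cover in which three sets share a point can be shrunk so that at most two sets overlap anywhere, which is the situation required for n = 1.

```python
from itertools import combinations

def common_part(intervals):
    """Intersection of open intervals (lo, hi), or None if it is empty."""
    lo = max(a for a, _ in intervals)
    hi = min(b for _, b in intervals)
    return (lo, hi) if lo < hi else None

def cover_order(intervals):
    """Largest k such that some k of the intervals have a common point."""
    best = 0
    for k in range(1, len(intervals) + 1):
        if any(common_part(c) is not None
               for c in combinations(intervals, k)):
            best = k
    return best

def is_shrinking(H, G):
    """True if H_i is contained in G_i for each index i."""
    return len(H) == len(G) and all(
        g[0] <= h[0] and h[1] <= g[1] for h, g in zip(H, G))

def covers_unit_interval(intervals):
    """True if the open intervals together cover the closed interval [0, 1]."""
    reach = 0.0
    while reach <= 1.0:
        # The point `reach` must lie strictly inside some open interval.
        candidates = [b for a, b in intervals if a < reach < b]
        if not candidates:
            return False
        reach = max(candidates)
    return True

G = [(-0.1, 0.7), (0.3, 1.1), (0.2, 0.8)]   # all three sets share points
H = [(-0.1, 0.4), (0.6, 1.1), (0.3, 0.7)]   # shrunk so triples are disjoint

print(cover_order(G), cover_order(H))
```

Here every three sets of H have empty intersection, so this particular cover satisfies the condition of the definition with n + 2 = 3. Showing that the dimension of [0, 1] is exactly 1 would require an argument over all finite open covers, which this sketch does not attempt.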
15.10. Set union topology
15.10.1 Theorem: Let (X1, T1) and (X2, T2) be topological spaces such that X1 ∩ X2 = ∅. Then the set T = {Ω1 ∪ Ω2; Ω1 ∈ T1 and Ω2 ∈ T2} is the weakest topology on X1 ∪ X2 such that T ⊇ T1 ∪ T2.
Proof: The closure of T with respect to pairwise intersection follows from the identity ⋂_i (Ω_1^i ∪ Ω_2^i) = (⋂_i Ω_1^i) ∪ (⋂_i Ω_2^i), which holds if Ω_1^i ⊆ X1 and Ω_2^i ⊆ X2 for all i, and X1 ∩ X2 = ∅. The closure with respect to arbitrary unions follows from the corresponding identity for unions. The minimality of the topology T follows from the fact that any topology T′ on X1 ∪ X2 with T′ ⊇ T1 ∪ T2 must contain at least all of the unions of elements of T1 ∪ T2.
15.10.2 Definition: The disjoint set union topology for disjoint sets X1 and X2 with topologies T1 and T2 respectively is the topology T = {Ω1 ∪ Ω2; Ω1 ∈ T1 and Ω2 ∈ T2}.
15.10.3 Remark: Definition 15.10.6 introduces a sort of ‘graft’ of two or more topologies. The idea here is to try to define a topology on the set X1 ∪ X2, especially in the case that X1 ∩ X2 is non-empty. The topologies of manifolds and fibre spaces are often defined as a “graft of patches”. This is a very general mechanism for creating topologically interesting spaces out of flat, boring spaces without having to resort to induced topologies of embeddings or various quotient topologies. The identification topology can probably be expressed as the quotient of the “union topology” (of two nominally non-intersecting topological spaces) with respect to an appropriate relation on the base set union. A more general form of “identification topological space” can be defined with the aid of Definition 6.10.5, which defines identification spaces of arbitrary families of sets. If the topologies on each member of such a family are consistent, then a topological identification space is well-defined. This is presented in Definition 15.11.2.
[ Theorem 15.10.5 is clearly superseded by Theorem 15.10.7. ]
[ See EDM2 [34], section 425.M for discussion of “topological sums”, which seem to be related to the set-union topology. ]
15.10.4 Remark: The conditions ∀Ω1 ∈ T1, Ω1 ∩ X2 ∈ T2 and ∀Ω2 ∈ T2, Ω2 ∩ X1 ∈ T1 in Theorem 15.10.5 are equivalent to requiring the identity map id_{X1∩X2} to be a homeomorphism for X1 ∩ X2 with the relative topologies from T1 and T2 respectively.
15.10.5 Theorem: Suppose two topological spaces (X1, T1) and (X2, T2) are such that Ω1 ∩ X2 ∈ T2 for all Ω1 ∈ T1 and Ω2 ∩ X1 ∈ T1 for all Ω2 ∈ T2. Then the set T = {Ω1 ∪ Ω2; Ω1 ∈ T1 and Ω2 ∈ T2} is the weakest topology on X1 ∪ X2 such that T ⊇ T1 ∪ T2.
Proof: It must be shown that the set T is a topology for X1 ∪ X2. As in the proof of Theorem 15.10.1, it is sufficient to show that T is closed under pairwise intersections and arbitrary unions. (See Definition 14.2.3.) So consider Ω_1^1, Ω_1^2 ∈ T1 and Ω_2^1, Ω_2^2 ∈ T2. It must be shown that Ω = (Ω_1^1 ∪ Ω_2^1) ∩ (Ω_1^2 ∪ Ω_2^2) ∈ T. By distributivity,
Ω = (Ω_1^1 ∩ Ω_1^2) ∪ (Ω_1^1 ∩ Ω_2^2) ∪ (Ω_2^1 ∩ Ω_1^2) ∪ (Ω_2^1 ∩ Ω_2^2).
Clearly Ω_1^1 ∩ Ω_1^2 ∈ T1 and Ω_2^1 ∩ Ω_2^2 ∈ T2. Since Ω_1^1 ∩ Ω_2^2 = (Ω_1^1 ∩ X2) ∩ Ω_2^2, it follows that Ω_1^1 ∩ Ω_2^2 ∈ T2 by the assumptions of the theorem. Similarly Ω_2^1 ∩ Ω_1^2 ∈ T2. So Ω is a union of an element of T1 with three elements of T2. From the closure of T2 under unions, it follows that Ω ∈ T. The closure of T under arbitrary unions follows trivially from the associativity and commutativity of set unions.
15.10.6 Definition: Suppose two topological spaces (X1, T1) and (X2, T2) are such that Ω1 ∩ X2 ∈ T2 for all Ω1 ∈ T1 and Ω2 ∩ X1 ∈ T1 for all Ω2 ∈ T2. (This is the same as requiring the identity map id_{X1∩X2} to be a homeomorphism for X1 ∩ X2 with the relative topologies from T1 and T2.) Then the set union topology of these two topological spaces is the topological space (X1 ∪ X2, T), where T = {Ω1 ∪ Ω2; Ω1 ∈ T1 and Ω2 ∈ T2}.
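The hypotheses and conclusion of Theorem 15.10.5 can be tried out on small finite spaces. The following sketch uses hypothetical two-point patches overlapping in one point; the sets, topologies and helper names are illustrative assumptions chosen here, and a finite check of this kind is no substitute for the proof.

```python
def is_topology(space, opens):
    """Check the open-set axioms for a finite family of subsets of `space`."""
    if frozenset() not in opens or space not in opens:
        return False
    # For a finite family, pairwise closure implies closure under all unions.
    return all((u & v) in opens and (u | v) in opens
               for u in opens for v in opens)

def is_compatible(X1, T1, X2, T2):
    """The two conditions assumed in Theorem 15.10.5."""
    return (all((w & X2) in T2 for w in T1)
            and all((w & X1) in T1 for w in T2))

def union_topology(T1, T2):
    """The family {W1 | W2 : W1 in T1, W2 in T2} of Definition 15.10.6."""
    return {w1 | w2 for w1 in T1 for w2 in T2}

# Hypothetical overlapping patches X1 = {1, 2} and X2 = {2, 3}.
X1, X2 = frozenset({1, 2}), frozenset({2, 3})
T1 = {frozenset(), frozenset({2}), X1}
T2 = {frozenset(), frozenset({2}), X2}
T = union_topology(T1, T2)

# A variant of T1 whose open sets do not restrict consistently to the overlap.
T1_bad = {frozenset(), frozenset({1}), X1}

print(is_compatible(X1, T1, X2, T2), is_topology(X1 | X2, T))
print(is_compatible(X1, T1_bad, X2, T2))
```

In the compatible case the resulting family has five open sets and contains both T1 and T2. The second pair fails the hypothesis because the open set {2} of T2 meets X1 in a set which is not open in the variant topology.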
15.10.7 Theorem: Suppose (X_i, T_i)_{i∈I} is a family of topological spaces such that for all i, j ∈ I, Ω_i ∩ X_j ∈ T_j for all Ω_i ∈ T_i. Then the set T = {⋃_{i∈I} Ω_i; ∀i ∈ I, Ω_i ∈ T_i} is the weakest topology on ⋃_{i∈I} X_i such that T ⊇ ⋃_{i∈I} T_i.
Proof: It is sufficient to show that T is a topology for ⋃_{i∈I} X_i. Since T is clearly closed under arbitrary unions, it is sufficient to show that it is closed under pairwise intersection. Consider two families of open sets (Ω_i^1)_{i∈I} and (Ω_i^2)_{i∈I}, where Ω_i^1, Ω_i^2 ∈ T_i for all i ∈ I. It must be shown that Ω = (⋃_{i∈I} Ω_i^1) ∩ (⋃_{j∈I} Ω_j^2) is in T. By distributivity of union and intersection operations, Ω = ⋃_{i,j∈I} (Ω_i^1 ∩ Ω_j^2). For any i, j ∈ I, Ω_i^1 ∩ Ω_j^2 = Ω_i^1 ∩ (Ω_j^2 ∩ X_i), and Ω_j^2 ∩ X_i ∈ T_i by the assumptions of the theorem. So Ω_i^1 ∩ Ω_j^2 ∈ T_i. It follows that Ω ∈ T.
15.10.8 Definition: Suppose (X_i, T_i)_{i∈I} is a family of topological spaces such that for all i, j ∈ I, Ω_i ∩ X_j ∈ T_j for all Ω_i ∈ T_i. Then the set union topology of the family (X_i, T_i)_{i∈I} is the topological space (⋃_{i∈I} X_i, T), where T = {⋃_{i∈I} Ω_i; ∀i ∈ I, Ω_i ∈ T_i}.
15.11. Topological identification spaces
“Topological identification spaces” are topological spaces constructed by grafting together patches of topological spaces. This is how topological manifolds are often defined in practice. [ See EDM2 [34], 425.L, for identification spaces. ] (If you have something better to do, like making a cup of tea or feeding the hamster, this would be a good time to skip a section and come back later. This is one of the less interesting sections.)
15.11.1 Remark: Definition 15.11.2 requires some explanation. It is effectively the same as Definition 15.10.8 except that the sets (X_i)_{i∈I} are first grafted into a set X before the tests are applied to the topologies on the pairwise intersections of the sets X_i. In other words, the identification topology is really the same thing as a set union topology except that an equivalence relation must first be applied to the sets in the family to determine which points in the sets are supposed to be identified. A useful mental image for the definition of a topological identification space is a football made out of many patches of material sewn together with overlapping flaps.
[ Maybe graft spaces could give a useful perspective on the way tangent bundles are defined by identifying point/vector pairs according to transformation rules. ]
15.11.2 Definition: Let (X_i, T_i)_{i∈I} be a family of topological spaces. Let X ⊆ ×̊_{i∈I} X_i be an identification space of the sets (X_i)_{i∈I}. (See Definition 6.10.5.) The family (X_i, T_i)_{i∈I} is said to be (topologically) consistent with the identification space X if for all i, j ∈ I, for all Ω_i ∈ T_i, {x_j ∈ X_j; (x_k)_{k∈K} ∈ X and x_i ∈ Ω_i} ∈ T_j.
When the family of topological spaces (X_i, T_i)_{i∈I} is consistent with the identification space X ⊆ ×̊_{i∈I} X_i, the identification topology on X is the set T defined by
T = {⋃_{i∈I} f_i(Ω_i); ∀i ∈ I, Ω_i ∈ T_i},
where f_i : X_i → X is defined so that for all y ∈ X_i, f_i(y) is the unique x ∈ X such that y = x_i. (See Remark 6.10.6.) The topological space (X, T) may be called a topological identification space of the family (X_i, T_i)_{i∈I}.
15.11.3 Remark: The ‘topological consistency’ in Definition 15.11.2 simply means that the identification space transition functions are continuous. The topology T of an identification space X is the weakest topology for which projections from X to the patch topologies are continuous.
15.11.4 Theorem: Let X be a topological space, and let τ : Y → X be a map from a set Y onto X. Then τ⁻¹(Top(X)) = {τ⁻¹(G); G ∈ Top(X)} is a topology on Y, and (Y, τ⁻¹(Top(X))) ≈ (X, Top(X)).
Proof: [ See EDM 376.D for the relevant properties of f⁻¹. ]
15.11.5 Remark: Theorem 15.11.6 is used in Section 44.3. [ Explain the motivation. ]
15.11.6 Theorem: Let X be a topological space, and let τ : X → Y be a map from X onto a set Y. If τ⁻¹ ◦ τ(Ω) ∈ Top(X) for all Ω ∈ Top(X), then τ(Top(X)) is a topology on Y. Also τ⁻¹ ◦ τ(Top(X)) is a topology on X and (X, τ⁻¹ ◦ τ(Top(X))) ≈ (Y, τ(Top(X))).
Proof: Let X be a topological space, and τ : X → Y a map from X onto Y. Suppose τ⁻¹ ◦ τ maps open sets of X to open sets of X. Let T = τ(Top(X)). Then
(i) ∅ = τ(∅) and Y = τ(X). So ∅ and Y are in T.
(ii) Let Ω1, Ω2 ∈ T. Then Ω1 = τ(G1) and Ω2 = τ(G2) for some G1, G2 ∈ Top(X). Let G′1 = τ⁻¹(Ω1) and G′2 = τ⁻¹(Ω2). Then G′1, G′2 ∈ Top(X). So G′1 ∩ G′2 ∈ Top(X). Therefore Ω1 ∩ Ω2 = τ(G′1 ∩ G′2) ∈ T.
(iii) Let C ⊆ T. Then for some S ⊆ Top(X), ⋃C = ⋃_{G∈S} τ(G) = τ(⋃_{G∈S} G) ∈ T.
This shows that T is a topology on Y. The remaining statements of the theorem follow immediately from Theorem 15.11.4.
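The hypothesis of Theorem 15.11.6, that τ⁻¹ ◦ τ maps open sets to open sets, can be explored on a small finite example. This is a sketch under stated assumptions: the four-point space, the two-point target, the map `tau` and the helper names are all hypothetical choices made for illustration.

```python
def image(tau, subset):
    """The image of a subset of X under the map tau (given as a dict)."""
    return frozenset(tau[x] for x in subset)

def preimage(tau, subset):
    """The preimage of a subset of Y under the map tau."""
    return frozenset(x for x in tau if tau[x] in subset)

def is_topology(space, opens):
    """Check the open-set axioms for a finite family of subsets of `space`."""
    if frozenset() not in opens or space not in opens:
        return False
    return all((u & v) in opens and (u | v) in opens
               for u in opens for v in opens)

def saturation_holds(tau, top_x):
    """True if preimage(image(G)) is open for every open set G."""
    return all(preimage(tau, image(tau, g)) in top_x for g in top_x)

# Hypothetical surjection from X = {0, 1, 2, 3} onto Y = {"a", "b"}.
tau = {0: "a", 1: "a", 2: "b", 3: "b"}
X = frozenset(tau)
Y = frozenset(tau.values())

top_good = {frozenset(), frozenset({0, 1}), frozenset({2, 3}), X}
top_bad = {frozenset(), frozenset({0}), X}

pushed = {image(tau, g) for g in top_good}
print(saturation_holds(tau, top_good), is_topology(Y, pushed))
print(saturation_holds(tau, top_bad))
```

In the first topology every open set is a union of fibres of `tau`, so the hypothesis holds and the image family is a topology on Y. In the second topology the open set {0} has τ⁻¹ ◦ τ({0}) = {0, 1}, which is not open, so the hypothesis of the theorem fails.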
Chapter 16 Topological curves, paths and groups
16.1 Curve and path terminology and definition options
16.2 Curves
16.3 Path-equivalence relations for curves
16.4 Paths
16.5 Convex curvilinear interpolation
16.6 Algebraic topology
16.7 Topological groups
16.8 Topological transformation groups
16.9 Topological vector spaces
16.10 Network topology and continuous paths in networks
16.1. Curve and path terminology and definition options
There is some variation among authors in the use of the words “curve” and “path”. This section is a discussion of the options for defining these concepts.
16.1.1 Remark: The candidates for definitions of curves and paths include the following styles of structures, where X is a topological space of points and I is a real-number interval. (The various kinds of real intervals are presented in Definition 8.3.10 and Remark 15.8.3.)
(1) A map γ : I → X for intervals I ⊆ IR.
(2) An image set S = γ(I) of a map γ : I → X.
Most differential geometry texts define a “curve” in style (1). This is sometimes referred to as a “parametrized curve” or “continuous curve”. (Examples are Ahlfors [93], Crampin/Pirani [11], Darling [13], do Carmo [16], Frankel [18], Gallot/Hulin/Lafontaine [19], Lang [30], Lee [32], Misner/Thorne/Wheeler [37], Rudin [136], Spivak [42] and Szekeres [44].) Some texts (like Greenberg/Harper [113]) use the word “path” for structure style (1). Some texts (like Lang [30]) use both words “curve” and “path” for style (1). A minority of texts use the word “curve” for structure style (2). (Examples are EDM2 [34] and Simmons [139].) The pattern which seems to emerge is as follows.
(i) Structure style (1) is the most popular by far. Style (2) is much less popular.
(ii) The word “curve” is used by most DG texts for structure style (1). A small number of authors use the word “path” for style (1), either exclusively or in addition to the word “curve”.
(iii) The word “path” is used only by a small minority of DG texts.
(iv) The texts which do not use the word “curve” for structure (1) are predominantly on non-DG subjects such as real analysis, complex analysis and topology.
The most popular names for structures (1) and (2) are “curve”, “path”, “arc”, “locus” and “contour”. Other plausible names are “trajectory”, “track”, “route”, “journey” and “traversal”. Styles (3) and (4) are not found at all in the author’s survey of texts, but they do have some benefits.
(3) A map γ : I → X from which some redundant information has been removed.
(4) A set S ⊆ X to which some extra information has been added.
Structure style (1) has the most information. Style (2) has the least information. Styles (3) and (4) have intermediate amounts of information. The removal of information in case (3) may be achieved by replacing the map γ in case (1) with an equivalence class of such maps. The addition of information in case (4) may be achieved by attaching extra structures (such as start and end points, the direction of traversal, or an order structure) to the image set S of case (2).
16.1.2 Remark: In the mathematical theory of networks (sometimes called “graph theory”), a “path” generally means an ordered sequence of links (or “edges”) in the network. Although the phrase “path from A to B” has a sense of directionality, this suggests only a sequence of traversal, not the time at which each point should be reached. So the function concept (1) in Remark 16.1.1 has far too much information for a path. Perhaps a total order on the path would be the best way of representing the English-language idea of a path from one point to another.
16.1.3 Remark: In this book, the word “curve” (Definition 16.2.4) signifies the map structure (1) in Remark 16.1.1 and “path” signifies an equivalence-class structure (3), whereas concept (2) is called simply the “image” of the curve or path. In other words, a curve will be a map from an interval to a topological space whereas a “path” means a curve from which some or all information about the method of traversal has been removed.
16.1.5 Remark: What is often wanted is a curve definition with the redundant information removed. Curves which have the same image are sometimes regarded as equivalent, but the image alone does not usually contain all of the desired information. It is helpful to look at some examples. Consider the maps γ1 : [0, π] → IR2 with γ1 : t → (cos t, 0) and γ2 : [0, π] → IR2 with γ2 : t → (cos 3t, 0). (See Figure 16.1.1.) Both maps have the image set [−1, 1] × {0}, and they have the same start and end points. Given the image of a path with self-intersections, it is impossible to determine the order in which the image is traversed.
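Figure 16.1.1  Curves with same image set and end-points
(Diagram not reproduced. Left panel: γ1 : t ↦ (cos t, 0), with marked parameter values t = 0 at the point 1 and t = π at the point −1. Right panel: γ2 : t ↦ (cos 3t, 0), with marked parameter values t = 0, t = π/3, t = 2π/3 and t = π.)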
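The two maps γ1 and γ2 above can be sampled numerically to confirm that they share an image set and end-points while differing in how the image is traversed. This is only a sketch: the sampling density, the tolerance used in comparisons and the helper names (`sample`, `x_range`, `direction_reversals`) are choices made here for illustration.

```python
import math

def gamma1(t):
    """The curve t -> (cos t, 0) on [0, pi]."""
    return (math.cos(t), 0.0)

def gamma2(t):
    """The curve t -> (cos 3t, 0) on [0, pi]."""
    return (math.cos(3.0 * t), 0.0)

def sample(curve, n=3001):
    """Sample a curve at n equally spaced parameter values in [0, pi]."""
    return [curve(math.pi * k / (n - 1)) for k in range(n)]

def x_range(points):
    """Smallest and largest first coordinate among the sampled points."""
    xs = [p[0] for p in points]
    return (min(xs), max(xs))

def direction_reversals(points):
    """Count the changes of sign of the x-increment along the samples."""
    reversals = 0
    last_sign = 0
    for (x0, _), (x1, _) in zip(points, points[1:]):
        step = x1 - x0
        sign = (step > 0) - (step < 0)
        if sign != 0:
            if last_sign != 0 and sign != last_sign:
                reversals += 1
            last_sign = sign
    return reversals

p1, p2 = sample(gamma1), sample(gamma2)
print(x_range(p1), x_range(p2))                          # both close to (-1, 1)
print(direction_reversals(p1), direction_reversals(p2))  # traversal differs
```

The sampled image of each curve fills out the same segment, but the second curve turns back on itself twice, which is information that the image set and end-points alone do not record.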
In the case of a space-filling curve, essentially all parametrization information is lost. (See Example 43.1.2 for space-filling curves.) Therefore even if a curve is known to be one-to-one, the traversal sequence cannot be determined from only the image set and the start and end points. Therefore it is better to either specify the traversal order explicitly as an abstract order relation, or to specify a single map γ together with an equivalence relation on the set of all curves.
16.1.4 Remark: There may be many different path definitions according to the choice of equivalence relation. This is very much analogous to the way in which there is a multiplicity of definitions of manifolds – topological, C^k differentiable, analytic, and so forth. Concept (1) in Remark 16.1.1 is similar to a manifold chart (defined in the inverse direction), whereas concept (3) resembles a maximal atlas for a manifold. Concept (2) resembles the base set of a manifold without the charts or topology. (See Remark 16.1.8 for comments on inverse atlases for curves and paths.)
16.1.6 Remark: The difference between sets and curves may be compared with the difference between sets and sequences. Sets are often indexed for convenience even when the choice of index is irrelevant. Sometimes
the order of a sequence of objects is important, sometimes not. The parametrization of paths is analogous to the indexing of sets. The choice of index map for the set doesn’t matter as long as the order is right. The parametrization of a curve is often of little importance apart from its order. Similarly, the choice of charts for a manifold is often of little importance as long as the topology and differentiable structure are as intended. One can remove the superfluous details of the choice of parametrization by defining an equivalence class of parametrizations or by simply declaring parametrizations to be equivalent with respect to some specified equivalence relation. These are the usual ways of doing things when a mathematical structure contains superfluous information which should be ignored.
16.1.7 Remark: Despite some similarities between curves and 1-dimensional manifolds, they are not really the same thing. The charts of curves map from IR to the point set, whereas 1-manifold charts are from the manifold to IR. (See Figure 16.1.2.)
Figure 16.1.2  Contrast between curve and one-manifold
(Diagram not reproduced. Left panel: a curve γ : I → M with image S = γ(I) ⊆ M and parameter interval I = Dom(γ) ⊆ IR. Right panel: a one-manifold with charts ψ1 : S →̊ IR and ψ2 : S →̊ IR, where Dom(ψ1), Dom(ψ2) ⊆ S ⊆ M and ψ1(S), ψ2(S) ⊆ IR.)
This is a necessary difference because curves may self-intersect. Therefore the map for a curve may not be injective. A curve is not just a point-set with a given topology. A curve is a parametrized trajectory within a topological space. A manifold structure would be more suitable for level curves of a real-valued function. Strictly speaking, “level curves” should perhaps be called “level manifolds”.
16.1.8 Remark: A path could be fully defined by analogy with manifolds as a pair (S, A) where S ⊆ M is a subset of a topological space M, and A is a set of continuous maps γ : Iγ → S for intervals Iγ ⊆ IR. (See Figure 16.1.3.)
Figure 16.1.3  Atlas of curves for a path
(Diagram not reproduced. The path set S = ⋃_{γ∈A} Im(γ) ⊆ M is shown with three curves γ1, γ2, γ3 whose domains Dom(γ1), Dom(γ2), Dom(γ3) are subsets of IR.)
From this perspective, the curve map in concept (1) in Remark 16.1.1 is a chart γ ∈ A, the image set in concept (2) is the set S, and the equivalence class suggested by concept (3) is the atlas A. For any two maps γ1, γ2 ∈ A, one may construct monotonic surjective continuous functions β1 : I → Iγ1 and β2 : I → Iγ2 such that γ1 ◦ β1 = γ2 ◦ β2. (The technicalities here are explained in Section 16.4.) If the reparametrization functions β1 and β2 are non-decreasing for all pairs of maps, the path is oriented. Other constraints could be put on reparametrization maps. For instance, they could be required to be affine,
C^k, analytic or an isometry. (This is related to the concept of a pseudogroup. See Section 19.4.) Each transition map constraint would yield a different class of path. One could even have transition maps which are of different regularity classes on different subsets of the path. In the case of rectifiable curves, Lipschitz transition maps might be appropriate. One could now proceed to develop all of the concepts of topological and differentiable manifolds for paths of the form (S, A). After defining paths, one could define a family of paths as a pair (S, A) where each γ ∈ A is a multi-parameter map γ : Iγ → S with Iγ ⊆ IR^n for n ∈ ℤ_0^+. The advantage of this kind of inverse atlas construction for n-manifolds is that it can represent in a natural way surfaces which have complex self-intersections. The important point to note is that the charts are only required to be continuous, not necessarily homeomorphisms.
If one expands an atlas of curves by completing the atlas with respect to equivalence of direction only, then the information left in the atlas corresponds to a total order on the path set. In this case, one may as well use instead the concept of an “ordered traversal” which was introduced in Definition 7.1.17.
A simple kind of path-chart equivalence would be to declare a set A of curves for a path set to be equivalent if γ1⁻¹ ◦ γ2 : IR →̊ IR is continuous for all γ1 and γ2 in A. These curve transition maps may also be required to have various regularity properties. Perhaps a much more interesting concept of “path-atlas” would generalize the index set I to an open subset of IR^n for general n ∈ ℤ_0^+. As in Definition 7.1.17, the analytical structure on the parameter set I could be replaced with an order structure, because sometimes one is only interested in order of traversal, not in the analytical properties of the traversal. In the interests of minimalism, the structure on I = IR^n, for example, could be a partial order such as x ≤ y ⇔ (∀i ∈ ℕ_n, x_i ≤ y_i) or the lexicographic total order x ≤ y ⇔ ∀j ∈ ℕ_n, ((∀i ∈ ℕ_n, i < j ⇒ x_i = y_i) ⇒ x_j ≤ y_j). (This lexicographic total order may also be expressed as x ≤ y ⇔ ∀j ∈ ℕ_n, ((x_j ≤ y_j) ∨ (∃i ∈ ℕ_n, (i < j ∧ x_i ≠ y_i))), as may be readily verified by the curious reader. See Exercise 47.7.10 and Section 7.1.)
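The claimed equivalence of the two expressions for the lexicographic order can be spot-checked by brute force on a small grid. This sketch is not a proof: it compares the two formulas with each other and with Python's built-in tuple comparison (which is lexicographic) over all pairs of points in {0, 1, 2}^3. The function names are hypothetical, and indices run from 0 rather than 1 as is usual in Python.

```python
from itertools import product

def lex_le_implication(x, y):
    """For all j: (x_i = y_i for all i < j) implies x_j <= y_j."""
    return all(
        (not all(x[i] == y[i] for i in range(j))) or x[j] <= y[j]
        for j in range(len(x))
    )

def lex_le_disjunction(x, y):
    """For all j: x_j <= y_j, or x_i != y_i for some i < j."""
    return all(
        x[j] <= y[j] or any(x[i] != y[i] for i in range(j))
        for j in range(len(x))
    )

def product_le(x, y):
    """The componentwise partial order: x_i <= y_i for all i."""
    return all(a <= b for a, b in zip(x, y))

grid = list(product(range(3), repeat=3))
agree = all(
    lex_le_implication(x, y) == lex_le_disjunction(x, y) == (x <= y)
    for x in grid for y in grid
)
print(agree)
```

The componentwise order `product_le` is included for contrast. For example, the pair x = (0, 2, 0), y = (1, 1, 0) satisfies x ≤ y in the lexicographic order but is not comparable in that direction in the componentwise order.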
[ Foliations might have some relevance or relation to curve families. ]
16.2. Curves 16.2.1 Remark: The customary use of the symbol γ for curves is possibly due to the fact that γ is the third Greek letter, corresponding to the Latin third letter ‘c’ for “curve”. 16.2.2 Remark: Curves are of fundamental importance to both differential geometry and physics. Therefore they deserve careful study. Much of physics is expressed in terms of the effect of fields (electric, magnetic, gravitational, etc.) on “test particles”, which are abstract infinitesimal particles which follow continuous trajectories in some space, usually with a time parameter of some kind. Much of differential geometry is expressed in terms of parallel transport along curves or paths. Curves and paths are similar to 1-manifolds (defined in Chapter 26), but have important differences as discussed in Remark 16.1.8. Curves and paths are also of fundamental importance in the study of connectivity in general topological spaces because curves are continuous maps whose domain is an interval, and intervals are precisely those subsets of IR which are connected. Curves are therefore used for defining pathwise connectivity. Families of curves are also a basic tool in algebraic topology. 16.2.3 Remark: The definitions of curves in this section are meaningful in general topological spaces although they are typically used in topological and differentiable manifolds. It is assumed that all curves are continuous because it is difficult to think of a useful class of curves with a weaker condition than continuity. (A curve with discontinuous jumps is probably better described as a set or sequence of curves, or an “ordered traversal”.) However, a curve may be referred to as a “continuous curve” to emphasize that no stronger regularity properties are expected from it. 16.2.4 Definition: A (continuous) curve in a topological space M is a continuous map γ : I → M for some interval I ⊆ IR. [ www.topology.org/tex/conc/dg.html ]
[ draft: UTC 2009–3–21 Saturday 11:36 ]
Differential geometry reconstructed.Copyright (C) 2009, Alan U. Kennington. All Rights Reserved.The author hereby grants permission to print this book draft in A4 format.Printing in all other formats is forbidden.You may not charge any fee for copies of this book draft.
[ Maybe try to generalize ordered traversals to multi-parameter ordered traversals. This would be analogous to a multi-parameter family of curves, but with the parametrization information removed if the multiple total order on the index set is induced onto the set itself. ]
16.2.5 Remark: Intervals of ℝ are defined in Definition 8.3.10.
16.2.6 Remark: There is an interesting question here as to whether empty curves should be permitted in Definition 16.2.4. Real intervals are characterized as the connected subsets of ℝ. The set of all real intervals is closed under intersection if empty intervals are permitted. So it is desirable to permit empty curves. If I = ∅, then γ = ∅, namely the empty function.
16.2.7 Remark: Since the parameter interval I of a curve in Definition 16.2.4 is connected, it follows that the image γ(I) is connected in the topology of the target space M. The interval I may be open, closed or semi-closed. It may also be classed as bounded, singly infinite or doubly infinite. If I is compact (i.e. closed and bounded), then the image γ(I) is compact. The non-empty compact intervals are the most useful for defining parallelism on fibre bundles because the end-points are required. In algebraic topology, it is usual to normalize a positive-length compact parameter interval of a curve to the set [0, 1], but in differential geometry, general parameter intervals are required. The parameter may represent, for example, the time of passing a point or the distance measured along the curve.
16.2.8 Remark: It is important to distinguish two different ways of using the words “open” and “closed”. In the theory of curves, these words are often applied to the connectivity properties of the curve rather than the general topological properties defined in Section 14.2. For instance, a curve γ : [a, b] → X might be called “closed” if γ(a) = γ(b). This is an unfortunate re-use of words. Because of these multiple meanings of “closed” and “open”, it is sometimes a matter of guesswork to interpret them. Perhaps a better word for a curve which ends where it starts would be a “loop” or a “closed loop”. The word “arc” in Definition 16.2.9 is sometimes a synonym for “curve” and sometimes not, which makes things even more confusing.
Historically this confusion seems to be due to the divergent usage in complex analysis and topology. Arcs are found mostly in complex analysis. The word “arc” is avoided in this book. The term “compact-domain curve” is probably non-standard, but is often useful.
16.2.9 Definition: Let M be a topological space. An open curve or open arc in M is a curve γ : I → M such that I is an open interval. A compact-domain curve in M is a curve γ : I → M such that I is a compact interval.
16.2.10 Remark: Whereas Definition 16.2.9 classifies curves in a topological sense, Definition 16.2.11 classifies curves according to the injectivity or lack of injectivity of the curve.
16.2.11 Definition: Let M be a topological space. A closed curve in M is a curve γ : [a, b] → M such that γ(a) = γ(b). A simple curve or Jordan arc in M is an injective curve, i.e. a curve γ such that γ(t1) = γ(t2) ⇒ t1 = t2. A Jordan curve or simple closed curve in M is a curve with a non-empty compact domain which is injective except that the end-points coincide; in other words, it is a curve γ : [a, b] → M such that γ(t1) = γ(t2) ⇔ (t1 = t2 or {t1, t2} = {a, b}). A constant curve in M is a curve γ : I → M such that ∀s, t ∈ I, γ(s) = γ(t).
16.2.12 Remark: It is not at all guaranteed that a curve γ : (a, b) → M can be continuously extended to a curve γ : [a, b] → M for a, b ∈ ℝ with a < b. Therefore initial and terminal points in Definition 16.2.13 and Notation 16.2.14 assume a non-empty compact parameter interval [a, b]. Since concatenation of curves is defined in terms of initial and terminal points, concatenation in Definitions 16.2.15 and 16.2.17 also assumes non-empty compact parameter intervals.
16.2.13 Definition: Let M be a topological space. The initial point of a curve γ : [a, b] → M is γ(a). The terminal point of a curve γ : [a, b] → M is γ(b). A multiple point of a curve γ : I → M is a point x ∈ M such that γ(t1) = γ(t2) = x for some t1, t2 ∈ I with t1 ≠ t2.
16.2.14 Notation: The initial and terminal points of a curve γ : [a, b] → M may be denoted as S(γ) = γ(a) and T(γ) = γ(b) respectively. (S is mnemonic for “source” or “start”. T is mnemonic for “terminal” or “target”.)
16.2.15 Definition: The concatenation of two curves γ1 : [a1, b1] → M and γ2 : [a2, b2] → M in a topological space M with b1 = a2 and γ1(b1) = γ2(a2) is the curve γ : [a1, b2] → M defined for t ∈ [a1, b2] by
$$ \gamma(t) = \begin{cases} \gamma_1(t) & \text{for } t \in [a_1, b_1] \\ \gamma_2(t) & \text{for } t \in [a_2, b_2]. \end{cases} $$
16.2.16 Remark: The concatenation of two continuous curves is a continuous curve. Pairwise concatenation generalizes easily to sets, families and sequences of curves. The concatenation of an unordered family of curves is given in Definition 16.2.17.
16.2.17 Definition: The concatenation of a sequence of curves (γj)j∈J with non-empty compact domains Ij = [aj, bj] for j ∈ J such that I = ⋃j∈J Ij is an interval and #(Ij ∩ Ik) ≤ 1 for all j, k ∈ J with j ≠ k, and γj(t) = γk(t) for all t ∈ Ij ∩ Ik for all j, k ∈ J, is the map γ : I → M defined by γ(t) = γj(t) for all t ∈ Ij, for all j ∈ J.
[ Define single-parameter and multi-parameter families of curves here. One-parameter families of curves are important for homotopy in algebraic topology. ]
[ Define topological foliations near here. See EDM2 [34], section 154. ]
16.2.18 Notation: Denote by C0(M) the set of all continuous curves in M.
16.2.19 Remark: Notation 16.2.18 is experimental. To be useful, it may need some indication of the nature of the parameter interval.
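The pairwise concatenation of Definition 16.2.15 can be illustrated by a minimal Python sketch. The names Curve and concatenate are hypothetical and are not part of this book. The target space is taken to be the plane for concreteness, the continuity of the two input curves is assumed rather than checked, and the end-point condition is tested by exact equality, which is adequate only for simple examples such as the one shown.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Curve:
    """A curve gamma : [a, b] -> R^2 with non-empty compact domain (a <= b)."""
    a: float
    b: float
    f: Callable[[float], Point]

    def __call__(self, t: float) -> Point:
        if not (self.a <= t <= self.b):
            raise ValueError(f"parameter {t} is outside [{self.a}, {self.b}]")
        return self.f(t)


def concatenate(g1: Curve, g2: Curve) -> Curve:
    """Concatenate g1 and g2, requiring b1 = a2 and g1(b1) = g2(a2)."""
    if g1.b != g2.a:
        raise ValueError("parameter intervals must abut: b1 = a2")
    if g1(g1.b) != g2(g2.a):
        raise ValueError("terminal point of g1 must equal initial point of g2")

    def f(t: float) -> Point:
        # Both branches agree at t = b1 = a2, so either may be used there.
        return g1.f(t) if t <= g1.b else g2.f(t)

    return Curve(g1.a, g2.b, f)


# Example: a horizontal segment followed by a vertical segment.
gamma1 = Curve(0.0, 1.0, lambda t: (t, 0.0))
gamma2 = Curve(1.0, 2.0, lambda t: (1.0, t - 1.0))
gamma = concatenate(gamma1, gamma2)
```

Here gamma has parameter interval [0, 2], with initial point S(γ) = (0, 0) and terminal point T(γ) = (1, 1) in the sense of Notation 16.2.14.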
16.3. Path-equivalence relations for curves
Some texts define any two curves in a topological space M to have the same path if they have the same image. Such a definition discards information about the direction of the curve and the exact process of traversal in the case of self-intersections or retracing of the image set. On the other hand, if two curves are defined to be path-equivalent when they are related by a homeomorphism between the parameter intervals, not enough information is discarded. This is because a curve which is constant on some sub-interval actually traces the same path as if there had been no such constant sub-interval. To give a real-life example, if you travel by train from Paris to Moscow, your path is the same whether or not your train stops in Berlin for 5 minutes. But parameter homeomorphisms are unable to remove such pauses in journeys. Therefore in this section a more precise concept of path equivalence is defined. To put it simply, intervals where a curve is constant (called “constant stretches”) are ignored when comparing curves. In particular, this implies that a constant curve has the same path as a curve with a one-point parameter interval, which is as one would expect.
Information about the direction of a curve is not discarded because this information is needed in most applications in differential geometry. Unoriented paths are useful for defining pathwise connectivity in general topology, but oriented paths can do that job too. So the default definition for a path is oriented. Alternative terms for “oriented” would be “directed” or “ordered”.
16.3.1 Remark: In the study of pathwise parallelism, it is assumed that no change of orientation of a fibre occurs if the curve is stationary for a while. Thus if a curve γ : I → M satisfies γ(s) = γ(t) for all s, t ∈ [a, b] ⊆ I for some a < b, then the curve could be said to be stationary on the interval [a, b]. But the word “stationary” is usually associated with functions of one or more real or complex variables whose derivatives vanish at a point. Therefore the more generic term “constant” is preferable.
If reparametrizations are permitted to be non-decreasing continuous maps rather than increasing homeomorphisms, then all constant stretches of curves may be removed. Thus a curve γ1 which is constant on the interval [a, b] may have this constant stretch removed by expressing it as γ1 = γ2 ◦ β, where β : I → ℝ is defined by β(t) = min(t, a + max(0, t − b)) for all t ∈ I, and
$$ \gamma_2(t) = \begin{cases} \gamma_1(t) & t \le a \\ \gamma_1(t + b - a) & t \ge a \end{cases} $$
for t ∈ β(I) = Dom(γ2). A reparametrization such as this is clearly not a homeomorphism, but the curves γ1 and γ2 trace out the same set of points in the same order. Therefore they are equivalent as far as representing a traversal of points is concerned. When curves are used for parallel transport, there is supposed to be no change in the orientation of a fibre when the curve is constant.
There are two obvious ways to deal with sometimes-constant curves. Either they can be simply removed from consideration, or else they can be “equivalenced out”, which means that they can be collected together in equivalence classes which effectively ignore the constant stretches of curves. If the latter approach is used, it will be convenient to always be able to select a never-constant representative for each equivalence class. In practice, this would be the same as just ignoring sometimes-constant curves completely. There remains, therefore, the question of whether there is any use in permitting sometimes-constant curves to be members of curve classes. The formalism chosen here uses unrestricted continuous curves, and imposes an equivalence relation which makes all curves equivalent to some never-constant curve. Therefore all sometimes-constant curves may be ignored since their paths are represented by never-constant curves.
16.3.2 Definition: A constant stretch of a curve γ : I → M in a topological space M is an interval [a, b] ⊆ I with a < b and γ(s) = γ(t) for all s, t ∈ [a, b].
16.3.3 Definition: A never-constant curve in a topological space M is a curve γ : I → M such that ∀a, b ∈ I, (a < b ⇒ (∃c ∈ [a, b], γ(c) ≠ γ(a))). A sometimes-constant curve in a topological space M is a curve γ : I → M which has a constant stretch.
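The removal of a single constant stretch described in Remark 16.3.1 can be checked numerically with a short Python sketch. The function names make_beta and remove_stretch are hypothetical. The formula for β is read here with t as its argument throughout, which is an assumption about the intended formula, and the example curve (a journey along the real line which pauses on [1, 2]) is invented for the illustration.

```python
from typing import Callable


def make_beta(a: float, b: float) -> Callable[[float], float]:
    """Non-decreasing reparametrization which collapses [a, b] to the point a."""
    if not a < b:
        raise ValueError("a constant stretch requires a < b")
    return lambda t: min(t, a + max(0.0, t - b))


def remove_stretch(gamma1: Callable[[float], float], a: float, b: float):
    """Return gamma2 with gamma1 = gamma2 o beta.

    Assumes (without checking) that gamma1 is constant on [a, b].
    """
    def gamma2(t: float) -> float:
        return gamma1(t) if t <= a else gamma1(t + b - a)
    return gamma2


def gamma1(t: float) -> float:
    """A curve on I = [0, 3] in the real line which is constant on [1, 2]."""
    if t <= 1.0:
        return t
    if t <= 2.0:
        return 1.0
    return t - 1.0


beta = make_beta(1.0, 2.0)
gamma2 = remove_stretch(gamma1, 1.0, 2.0)

# Sample parameters in I = [0, 3]; beta maps I onto Dom(gamma2) = [0, 2].
samples = [0.0, 0.5, 1.0, 1.25, 1.5, 2.0, 2.5, 3.0]
agrees = all(gamma1(t) == gamma2(beta(t)) for t in samples)
```

In this example γ2 is the identity map on [0, 2], which has no constant stretch, while β is constant on [1, 2] and so is not a homeomorphism.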
16.3.4 Remark: A curve is a never-constant curve if and only if it has no constant stretches. It is not necessarily true that a never-constant curve is injective if restricted to small enough sub-intervals. That is, a restriction such as γ|[t−ε, t+ε] may be non-injective for all ε > 0. But a curve which does have this local injectivity property is necessarily never-constant in the sense of Definition 16.3.3.
16.3.5 Theorem: If a curve γ is constant and never-constant, then either γ = ∅ or #(Dom(γ)) = 1.
16.3.6 Theorem: For any curve γ1 : I → M in a topological space M, there exists a never-constant curve γ2 : J → M and a non-decreasing continuous surjection β : I → J such that γ1 = γ2 ◦ β.
Proof: . . .
16.3.7 Definition: Curves γ1 and γ2 in a topological space M are path-equivalent if there are surjective non-decreasing continuous functions β1 : I → Dom(γ1) and β2 : I → Dom(γ2) for some interval I ⊆ ℝ such that γ1 ◦ β1 = γ2 ◦ β2.
16.3.8 Example: The curves in ℝ² defined by γ1 : [0, π] → ℝ² with γ1 : t ↦ (cos t, 0) and γ2 : [0, π] → ℝ² with γ2 : t ↦ (cos 3t, 0) have the same image set [−1, 1] × {0} and the same start and finish points, but according to Definition 16.3.7, γ1 and γ2 are not path-equivalent. This follows from Theorem 16.3.12.
16.3.9 Remark: In terms of Definition 16.3.7, Theorem 16.3.6 means that every curve in a topological space M is path-equivalent to a never-constant curve in M.
16.3.10 Remark: The reparametrization functions β1 and β2 in Definition 16.3.7 modify the corresponding curves γ1 and γ2 so that they have the same parameter interval I. They also insert constant stretches into the curves so that they match correctly. Thus if p ∈ M is a point such that γ1(t) = p for t in some positive-length interval but γ2 does not have such a constant stretch, then β2 must insert a constant stretch with value p into the curve γ2. In other words, the maps β1 and β2 do not remove constant stretches – they insert constant stretches in each curve to match the other curve.
Instead of inserting constant stretches into curves γ1 and γ2, it would be much more satisfying to somehow remove them. This is not possible with single-valued functions, but it can be done with the function quotient in Definition 6.7.7. Surjective functions β1 : Dom(γ1) → I and β2 : Dom(γ2) → I may be chosen so that the constant stretches of β1 and β2 match the constant stretches of γ1 and γ2 respectively.
Then in terms of Definition 6.7.7, γ1 ◦ β1⁻¹ : I → M and γ2 ◦ β2⁻¹ : I → M will be well-defined functions, and β1 and β2 may be chosen so that γ1 ◦ β1⁻¹ = γ2 ◦ β2⁻¹ = γ3 for some never-constant curve γ3. Hence γ1 = γ3 ◦ β1 and γ2 = γ3 ◦ β2 as in Theorem 16.3.11. This is illustrated in Figure 16.3.1. (Any resemblance between Figure 16.3.1 and the Millikan oil drop experiment – or an overhead view of a starship – is purely coincidental.)
Figure 16.3.1   Equivalence of two curves to a never-constant curve (diagram of the maps β1 : I1 → I, β2 : I2 → I, γ1 : I1 → M, γ2 : I2 → M and γ3 : I → M)
[ Show somewhere that for any continuous function f : I → M for an interval I ⊆ ℝ and M a topological space, the set A = ⋃{[a, b] ⊆ I; a < b and #(f([a, b])) = 1} is a countable union of disjoint closed intervals of ℝ. This may be useful for proving Theorem 16.3.6. ]
16.3.11 Theorem: Two curves γ1 and γ2 in a topological space M are path-equivalent in M if and only if they are both path-equivalent to some never-constant curve γ3 in M.
16.3.12 Theorem: If two never-constant curves γ1 and γ2 in a topological space M are path-equivalent, then the sets γ1⁻¹({S}) and γ2⁻¹({S}) are equipotent for all sets S ⊆ M; that is, #(γ1⁻¹({S})) = #(γ2⁻¹({S})). (See Definition 7.2.18 for equipotent sets.)
[ Should also list other properties of curves which are homogeneous within an equivalence class, such as the initial and terminal points and hopefully the compactness of the parameter interval. To get this, must probably restrict attention to never-constant curves. ]
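The use of Theorem 16.3.12 in Example 16.3.8 can be illustrated numerically. The following Python sketch counts how often each of the two curves of that example passes through the point p = (0, 0), by counting strict sign changes of the first coordinate on a uniform grid. The function name count_crossings and the grid size are hypothetical choices, the theorem is read here as a statement about preimages of single points, and the count is a numerical illustration rather than a proof. The number of grid steps is chosen so that no grid point coincides with a zero of cos t or cos 3t.

```python
import math
from typing import Callable


def count_crossings(x: Callable[[float], float], level: float,
                    lo: float, hi: float, steps: int = 10007) -> int:
    """Count strict sign changes of x(t) - level on a uniform grid over [lo, hi].

    Assumes every crossing is transversal and that no sample hits the
    level exactly; tangential contacts would not be detected.
    """
    count = 0
    prev = x(lo) - level
    for i in range(1, steps + 1):
        cur = x(lo + (hi - lo) * i / steps) - level
        if prev * cur < 0:
            count += 1
        prev = cur
    return count


# First coordinates of gamma1 : t -> (cos t, 0) and gamma2 : t -> (cos 3t, 0).
n1 = count_crossings(math.cos, 0.0, 0.0, math.pi)
n2 = count_crossings(lambda t: math.cos(3.0 * t), 0.0, 0.0, math.pi)
```

The curve γ1 passes through p once (at t = π/2), whereas γ2 passes through p three times (at t = π/6, π/2 and 5π/6). Both curves are never-constant, so the preimages of {p} are not equipotent, which is the obstruction to path-equivalence referred to in Example 16.3.8.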
16.3.14 Definition: Corresponding parameters of curves γ1 : I1 → M and γ2 : I2 → M in a topological space M are parameters t1 ∈ I1 and t2 ∈ I2 such that t1 = β1(t) and t2 = β2(t) for some t ∈ I for some reparametrizations β1 : I → I1 and β2 : I → I2 with γ1 ◦ β1 = γ2 ◦ β2.
[ Must show that “corresponding parameters” are well-defined. ]
16.3.15 Example: The correspondence of parameters in Definition 16.3.14 is not always unique. For example, consider the curve γ : I → ℝ² for an interval I ⊆ ℝ where γ : t ↦ (cos t, sin t). Define γ1 = γ2 = γ with I = ℝ. Then suitable reparametrizations are β1 : ℝ → ℝ and β2 : ℝ → ℝ with β1 : t ↦ t + 2n1π and β2 : t ↦ t + 2n2π for any n1, n2 ∈ ℤ. It follows that t and t + 2(n2 − n1)π are corresponding parameters. So each parameter t1 for γ1 has an infinite number of corresponding parameters t2 = t1 + 2(n2 − n1)π for γ2, even though γ1 and γ2 are both never-constant.
[ Show that two never-constant compact-domain curves have unique corresponding parameters. ] 16.3.16 Remark: The non-uniqueness of corresponding parameters in Example 16.3.15 is not a problem for defining parallelism. Open curves may sometimes be invariant under a reparametrization, but the parallelism carried on such curves is independent of parametrization.
16.3.13 Remark: When defining parallel transport on curves, it is necessary that the transport be the same for all path-equivalent curves. That is, the parallelism should depend only on the path, not on the particular choice of curve to represent the path. If the parallel transport is defined between two points along equivalent curves, then the choice of curves should be irrelevant. This requirement may be stated by specifying that the parallel transport between “corresponding points” on two curves must be the same. Definition 16.3.14 defines the notion of “corresponding parameters” of curves for this purpose.
16.4. Paths
In this section, paths are defined in general topological spaces as equivalence classes of curves with respect to the “path equivalence” of curves defined in Section 16.3. The notation chosen for paths here is [γ]0 for the equivalence class of any given curve γ. Then [γ1]0 = [γ2]0 if and only if γ1 and γ2 are path-equivalent curves. One may say that [γ]0 is “the path of γ”, so that any two
curves are equivalent if and only if they “have the same path”. So every curve is associated with a unique (oriented continuous) path. This path structure determines the order and general manner of traversal of points in the image set.
16.4.1 Notation: [γ]0 denotes the set of curves in a topological space M which are path-equivalent to a given curve γ in M.
16.4.2 Definition: A path in a topological space M is an equivalence class [γ]0 of curves which are path-equivalent to a given curve γ in M. A path may also be called an (oriented) (continuous) path or an (oriented) C⁰ path, and the words directed or ordered may be used instead of “oriented”. For any curve γ, the path of γ is the equivalence class [γ]0. The set ⋃{Range(γ1); γ1 ∈ [γ]0} is called the image of the path [γ]0. Any curve in a path Q = [γ]0 may be referred to as a path representative or representative curve for the path Q.
16.4.3 Remark: For the empty curve γ = ∅, the equivalence class [γ]0 is not empty. So it cannot be called literally the “empty path”. But it could accurately be called the “empty curve path” or the “path of the empty curve”. This logically correct usage is too clumsy. So the terms “empty path” and “non-empty path” will refer to the curve map, not the equivalence class. The emptiness or non-emptiness also corresponds to the corresponding property of the image of the path.
Another moderately interesting trivial-curve issue is that of constant curves. For a fixed p ∈ M, the constant curves γ1 : {0} → M with γ1 : t ↦ p and γ2 : [0, 1] → M with γ2 : t ↦ p are path-equivalent although their parameter intervals are not homeomorphic. So these curves have the same path. In fact, all constant curves with the same value are path-equivalent, for all of the topologically different kinds of non-empty oriented intervals in the table in Section 16.2. The only kind of constant curve which is never-constant is a curve with a singleton domain.
16.4.4 Notation: Denote by P0(M) the set of all continuous paths in M.
16.4.5 Remark: Notation 16.4.4 is experimental. There are some standard notations for sets of curves and paths, but this is probably not one of them. In terms of the corresponding Notation 16.2.18 for curves, P0(M) = {[γ]0; γ ∈ C0(M)}. Hence P0(M) is a partition of C0(M); so C0(M) = ⋃P0(M) and for all γ1, γ2 ∈ C0(M), either [γ1]0 = [γ2]0 or [γ1]0 ∩ [γ2]0 = ∅.
16.4.6 Remark: Definitions 16.4.7 and 16.4.8 are based on curve classes in Definitions 16.2.9 and 16.2.11.
16.4.7 Definition: Let M be a topological space. The empty path in M is the path [γ]0 in M such that γ = ∅ is the empty curve. An open path in M is a path Q in M such that γ is an open curve in M for some never-constant curve γ ∈ Q. A closed path in M is a path [γ]0 such that γ is a closed curve in M.
16.4.8 Definition: Let M be a topological space. A simple path in M is a path [γ]0 such that γ is a simple curve in M. A simple closed path or Jordan path in M is a path [γ]0 such that γ is a Jordan curve in M. A constant path in M is a path [γ]0 in M such that ∃p ∈ M, ∀t ∈ Dom(γ), γ(t) = p.
[ There are perhaps some problems with Definitions 16.4.7 and 16.4.8 because not all properties of curves are homogeneous within equivalence classes. For example, the domain of a constant curve is an arbitrary interval even though all constant curves with the same value are path-equivalent. Must fix this. Probably have to specify non-constant representative curves. ]
16.4.9 Definition: The reversal of a path [γ]0 in a topological space M is the path [−γ]0 where −γ denotes the curve −γ : t ↦ γ(−t). The reversal of a path Q = [γ]0 may be denoted as −Q or −[γ]0.
16.4.10 Remark: The definitions of initial point, terminal point and multiple point in Definition 16.4.11 are independent of the choice of path representative.
16.4.11 Definition: The initial point and terminal point of a path Q in a topological space M with non-empty compact domain are the initial point and terminal point respectively of some representative of Q. A multiple point of a path Q in a topological space M is a point x ∈ M such that x is a multiple point of some path representative of Q.
16.4.12 Notation: The initial and terminal points of a path Q with non-empty compact domain may be denoted as S(Q) = S(γ) and T(Q) = T(γ) respectively for any path representative γ of Q.
16.4.13 Definition: The concatenation of two paths Q1 and Q2 in a topological space M with non-empty compact domains such that T(Q1) = S(Q2) is the concatenation of any representatives γ1 of Q1 and γ2 of Q2 such that T(γ1) = S(γ2).
[ Should show here that the concatenation of two paths is well-defined. ]
[ Define sums of paths so that common stretches of paths going in opposite directions can be cancelled. This is important for stating that pathwise curvature is additive with respect to “addition” of curves in some simplicial complex sense. I guess that means I’ve got to read up on algebraic topology now. ]
16.4.14 Remark: Definition 16.4.2 for a path removes information about the choice of parametrization of a curve except for the direction. The information which is preserved is the order in which the points of the image are traversed, possibly multiple times. Definition 16.4.15 removes slightly more information because the direction of traversal is also removed.
16.4.15 Definition: An unoriented (continuous) path in a topological space M is the set Q ∪ (−Q) for any path Q in M.
16.4.16 Remark: Other possible terms for “unoriented” are “disoriented”, “undirected”, “unordered”, “disordered”, and so forth.
16.4.17 Remark: As is the case of all equivalence class constructions in mathematics, an equivalence class of curves (such as the paths in Definitions 16.4.2 and 16.4.15) may be represented in practical applications by a single curve of the class. In practice, one need not be fastidious about the distinction between curves and paths, as long as it is clear which equivalence relation is being used in each context.
16.4.18 Remark: The purpose of the parametrization of paths is to indicate the order and manner of traversal of points in a topological space, although most of the information in the parametrization is irrelevant. The alternative of defining some sort of total ordering on the image is too clumsy in practice. This is analogous to the issue of families of atlases versus sets of atlases. In practice, very little analysis of paths can be done without parametrization, just as very little differential geometry can be done without coordinate charts.
[ Define concatenation of paths. Show that concatenation is independent of the choice of path maps. ]
[ Define pathwise connectivity near here. ]
16.5. Convex curvilinear interpolation
While preparing figures for this book, the author needed some formulas for convex interpolation between curves in a linear space. An example is shown in Figure 16.5.1.
16.5.1 Remark: The formula the author finally settled on for interpolation between curves is as follows.
$$ \begin{aligned} z(s,t) &= (1-t)\gamma_1(s) + t\gamma_3(s) + (1-s)\gamma_2(t) + s\gamma_4(t) \\ &\qquad - \bigl((1-t)(1-s)q_0 + (1-t)s\,q_1 + t(1-s)q_2 + ts\,q_3\bigr) \\ &= c_t(\gamma_1(s), \gamma_3(s)) + c_s(\gamma_2(t), \gamma_4(t)) - c_t(c_s(q_0, q_1), c_s(q_2, q_3)), \end{aligned} \eqno(16.5.1) $$
where cλ(x, y) = (1 − λ)x + λy for λ, x, y ∈ ℝ.
The functions γi represent curves in ℝ² which have a single real parameter in [0, 1]. The points qk are the intersection points of the curves. Then the point z(s, t) is a point which interpolates between the four curves.
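Formula (16.5.1) can be tried out with a minimal Python sketch. The names convex and interpolate are hypothetical, and the four example curves are invented for the illustration. The sketch assumes that the corner points are q0 = γ1(0) = γ2(0), q1 = γ1(1) = γ4(0), q2 = γ3(0) = γ2(1) and q3 = γ3(1) = γ4(1), and it applies cλ componentwise to points of the plane.

```python
from typing import Callable, Tuple

Point = Tuple[float, ...]
Curve = Callable[[float], Point]


def convex(lam: float, x: Point, y: Point) -> Point:
    """c_lambda(x, y) = (1 - lambda) x + lambda y, applied componentwise."""
    return tuple((1.0 - lam) * xi + lam * yi for xi, yi in zip(x, y))


def interpolate(g1: Curve, g2: Curve, g3: Curve, g4: Curve):
    """Return z(s, t) as in formula (16.5.1) for curves parametrized on [0, 1].

    Assumes the corner compatibility conditions stated in the lead-in.
    """
    q0, q1, q2, q3 = g1(0.0), g1(1.0), g3(0.0), g3(1.0)

    def z(s: float, t: float) -> Point:
        horiz = convex(t, g1(s), g3(s))    # c_t(gamma1(s), gamma3(s))
        vert = convex(s, g2(t), g4(t))     # c_s(gamma2(t), gamma4(t))
        corner = convex(t, convex(s, q0, q1), convex(s, q2, q3))
        return tuple(h + v - c for h, v, c in zip(horiz, vert, corner))

    return z


# Four bounding curves of a distorted unit square (illustrative only).
def g1(s): return (s, 0.5 * s * (1.0 - s))           # bottom, from q0 to q1
def g2(t): return (-0.3 * t * (1.0 - t), t)          # left, from q0 to q2
def g3(s): return (s, 1.0 + 0.5 * s * (1.0 - s))     # top, from q2 to q3
def g4(t): return (1.0 + 0.3 * t * (1.0 - t), t)     # right, from q1 to q3


z = interpolate(g1, g2, g3, g4)


def close(p: Point, q: Point, tol: float = 1e-12) -> bool:
    """Componentwise comparison up to floating-point rounding."""
    return all(abs(a - b) <= tol for a, b in zip(p, q))
```

With these curves, z(s, 0) reproduces γ1(s), z(0, t) reproduces γ2(t), z(s, 1) reproduces γ3(s) and z(1, t) reproduces γ4(t), up to rounding error.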
Figure 16.5.1   Curvilinear interpolation between 4 curves (bounding curves γ1, γ2, γ3, γ4 with corner points q0, q1, q2, q3)
When s = 0, this interpolation matches γ2. To be precise, z(0, t) = γ2(t) for t ∈ [0, 1]. The complete set of boundary conditions is as follows.
$$ z(s,t) = \begin{cases} \gamma_1(s) & \text{if } t = 0 \\ \gamma_2(t) & \text{if } s = 0 \\ \gamma_3(s) & \text{if } t = 1 \\ \gamma_4(t) & \text{if } s = 1 \end{cases} $$
$$ \phantom{z(s,t)} = \begin{cases} q_0 & \text{if } s = t = 0 \\ q_1 & \text{if } s = 1,\ t = 0 \\ q_2 & \text{if } s = 0,\ t = 1 \\ q_3 & \text{if } s = t = 1. \end{cases} $$
16.5.2 Remark: The intention of the curve family (16.5.1) is that the horizontal curves should “morph” from curve γ1 into curve γ3. At the same time, the vertical curves should “morph” from γ2 to γ4. The two sets of curves should be consistent with each other in the sense that the point with parameter t on the s-th curve t ↦ z(s, t) interpolating γ2 to γ4 should be the same as the point with parameter s on the t-th curve s ↦ z(s, t) interpolating γ1 to γ3. The interpolation should also be a polynomial of the lowest possible order with respect to s and t. (In this case, it turned out to be bilinear.)
16.5.3 Remark: The chosen formula (16.5.1) has some interesting properties. For instance, the curve t ↦ z(s, t) is a linear interpolation of a simple linear interpolation of γ2 and γ4 with the simple linear interpolation of the points γ1(s) and γ3(s).
16.5.4 Remark: It does not necessarily follow that if the four bounding curves are non-self-intersecting then the interpolated curves will be non-self-intersecting. Figure 16.5.2 illustrates a counter-example.
16.5.5 Remark: The curvilinear interpolation formula (16.5.1) was derived by the author as follows. First, regard the location of the point z(s, t) for each s, t ∈ [0, 1] as a distortion of a regular rectangular grid. The distortion at the boundary is given, and it is only necessary to find a distortion of the rectangular grid which matches up with the given distortions at the boundary. Thus this is a kind of boundary value problem, and the solution inside the grid should be a multinomial of the lowest possible degree, preferably bilinear.
Assuming that γ2 is the curve on the left and γ4 is the curve on the right as shown in Figure 16.5.1, define a convex combination αs of γ2 and γ4: αs(t) = (1 − s)γ2(t) + sγ4(t). This is asymmetric with respect to s and t. This family is curvilinear in t but linear in s. When s ∈ {0, 1}, this family matches the curves γ2 and γ4.
This family of curves αs(t) does not match the curves γ1 and γ3 when t ∈ {0, 1} as desired. Therefore consider the deviation or “error” between the curves αs(0) and γ1 and between the curves αs(1) and γ3. This “error” should look like γ1(s) − αs(0) for t = 0 and like γ3(s) − αs(1) for t = 1. Therefore define
$$ z(s,t) = \alpha_s(t) + (1-t)\bigl(\gamma_1(s) - \alpha_s(0)\bigr) + t\bigl(\gamma_3(s) - \alpha_s(1)\bigr). $$
Figure 16.5.2   Curvilinear interpolation between 4 curves (bounding curves γ1, γ2, γ3, γ4 with corner points q0, q1, q2, q3)
This has the form of the “erroneous” curve αs(t) plus a convex combination with respect to t of the “error corrections” for αs(t) for t = 0 and t = 1. By substituting the formula for α into this equation, the result is
$$ \begin{aligned} z(s,t) &= (1-s)\gamma_2(t) + s\gamma_4(t) + (1-t)\bigl(\gamma_1(s) - (1-s)\gamma_2(0) - s\gamma_4(0)\bigr) + t\bigl(\gamma_3(s) - (1-s)\gamma_2(1) - s\gamma_4(1)\bigr) \\ &= (1-s)\gamma_2(t) + s\gamma_4(t) + (1-t)\bigl(\gamma_1(s) - (1-s)q_0 - s\,q_1\bigr) + t\bigl(\gamma_3(s) - (1-s)q_2 - s\,q_3\bigr) \\ &= (1-s)\gamma_2(t) + s\gamma_4(t) + (1-t)\gamma_1(s) + t\gamma_3(s) - (1-t)\bigl((1-s)q_0 + s\,q_1\bigr) - t\bigl((1-s)q_2 + s\,q_3\bigr). \end{aligned} $$
This agrees with equation (16.5.1).
16.5.6 Remark: Convex curvilinear interpolation may be generalized to general differentiable manifolds (with an affine connection) using geodesics and convex combinations.
16.6. Algebraic topology
Some basic definitions for topics such as homotopy groups and singular homology theory will be summarized here.
[ For homology groups, see EDM2 [34], article 201, and Federer [105], page 463. For homotopy theory, see EDM2 [34], article 202. ]
[ See EDM2 [34], section 148.C, for the compact-open topology on the set Ω(X; x0, x1) of curves from x0 to x1 in a topological space X. This is supposed to have something to do with fibre spaces. ]
[ Exact sequences of linear spaces are defined in Section 10.11. But algebraic topology requires only exact sequences of group homomorphisms, or something like that. ]
[ Define sheaves (EDM2 [34], 383) and sheaf cohomology (EDM2 [34], 383.E). ]
16.7. Topological groups
Topological groups are required for the specification of structure groups for fibre bundles in Chapter 23. Topological groups are related to Lie groups. (See Chapter 34.) Transformation groups are discussed in Section 9.4. This section deals with topological groups.
[ Near here, should refer to a family tree for topological groups and Lie groups. ]
16.7.1 Definition: A topological group is a tuple (G, TG, σG) such that
(i) (G, σG) is a group,
(ii) (G, TG) is a topological space,
(iii) the group operation σG : G × G → G is continuous with respect to TG, and
(iv) the map g ↦ g⁻¹ from G to G is continuous with respect to TG.
[ See EDM [33], section 406.A. Is the g⁻¹ condition in Definition 16.7.1 superfluous? ]
16.8. Topological transformation groups 16.8.1 Remark: The difference between Definitions 16.8.2 and 16.8.3 is the extra requirement of the topological transformation group that the action map be continuous with respect to elements of the group. In both cases, the action of each element of the group G is a topological automorphism of the set X. In Definition 16.8.2, the group is not a topological group. It is just a group of automorphisms. In Definition 16.8.3, the group is a topological group whose action map is continuous with respect to both the group topology and the point-set topology. 16.8.2 Definition: A (left) transformation group of a topological space X is a tuple (G, X, TX , σG , µ) such that (G, X, σG , µ) is a (left) transformation group of the set X, and Lg : X → X is a homeomorphism from (X, TX ) to (X, TX ) for all g ∈ G. 16.8.3 Definition: A topological (left) transformation group of a topological space (X, TX ) is a tuple (G, TG , X, TX , σG , µ) such that (G, TG , σG ) is a topological group and the action map µ : G × X → X is continuous with respect to the topologies TG and TX . 16.8.4 Remark: Definitions 16.8.7 and 16.8.8 are the same as Definitions 16.8.2 and 16.8.3 respectively, except that the action is required to be effective. (See Definition 9.4.14 for the concept of effective group action.)
16.8.5 Remark: Homomorphisms of topological transformation groups in Definition 16.8.6 are based on Definition 9.4.9 for general transformation groups.
16.8.6 Definition: A topological (left) transformation group homomorphism from a topological left transformation group (G1, X1) −< (G1, TG1, X1, TX1, σ1, µ1) to a topological left transformation group (G2, X2) −< (G2, TG2, X2, TX2, σ2, µ2) is a pair of maps (φ̂, φ) with φ̂ : G1 → G2 and φ : X1 → X2 such that
(i) the pair (φ̂, φ) is a left transformation group homomorphism;
(ii) φ̂ and φ are continuous.
16.8.7 Definition: An effective (left) transformation group of a topological space X is a (left) transformation group G −< (G, X, TX, σG, µ) of the topological space X such that G acts effectively on X.
16.8.8 Definition: An effective topological (left) transformation group of a topological space X is a topological (left) transformation group G −< (G, TG, X, TX, σG, µ) of X such that G acts effectively on X.
16.8.9 Remark: On the subject of specification tuples, the rule for choosing the listing order for the components of tuples is that all algebraic operations (such as sums σ and products µ) are placed at the end of the tuple, whereas attributes of sets such as topologies and atlases are placed immediately after the sets they belong to, as for example the topology TX for X in Definition 16.8.8. These style rules are followed throughout this book.
16.8.10 Remark: The following definitions are for the “right” versions of the above “left” transformation groups. The non-topological versions of these topological right transformation groups are in Section 9.4.
16.8.11 Definition: A right transformation group of a topological space X is a tuple (G, X, TX, σG, µ) such that (G, X, σG, µ) is a right transformation group of the set X, and Rg : X → X is a homeomorphism from (X, TX) to (X, TX) for all g ∈ G.
16.8.12 Definition: A topological right transformation group of a topological space (X, TX) is a tuple (G, TG, X, TX, σG, µ) such that (G, TG, σG) is a topological group and the action map µ : X × G → X is continuous with respect to the topologies TG and TX.
16.8.13 Definition: An effective right transformation group of a topological space X is a right transformation group G of the topological space X such that G acts effectively on X.
16.8.14 Definition: An effective topological right transformation group of a topological space X is a topological right transformation group G of X such that G acts effectively on X.
[ Put a family tree here for topological groups, although it’s very simple. This should be similar to Figure 34.7.1. ]
16.8.15 Remark: Theorem 16.8.16 is the topological version of Theorem 9.4.24. See Remark 9.4.20.
16.8.16 Theorem: Let G −< (G, TG, σG) be a topological group. Define the action map µ : G × G → G by µ : (g1, g2) ↦ σG(g1, g2). Then the tuple (G, TG, G, TG, σG, µ) is an effective topological left transformation group of (G, TG). The tuple (G, TG, G, TG, σG, µ) is also an effective topological right transformation group of (G, TG).
Proof: For a left transformation group, the action map µ : G × X → X must satisfy the associativity rule µ(σG(g1, g2), x) = µ(g1, µ(g2, x)) for all g1, g2 ∈ G and x ∈ X. If the formula for µ in the theorem is substituted into this rule with X = G, it follows easily from the associativity of σG. The continuity of µ follows from the continuity of σG. For a right transformation group, the action map µ : X × G → X must satisfy the associativity rule µ(x, σG(g1, g2)) = µ(µ(x, g1), g2) for all g1, g2 ∈ G and x ∈ X. This follows in exactly the same way from the group associativity. These transformation groups are effective because the identity element of G is unique.
16.8.17 Definition: The topological left transformation group (G, G) −< (G, TG, G, TG, σG, σG) is called the topological (left) transformation group of G acting on G by left translation, or the topological left translation group of G (on itself). The topological right transformation group (G, G) −< (G, TG, G, TG, σG, σG) is called the topological right transformation group of G acting on G by right translation, or the topological right translation group of G (on itself).
[ Define pseudogroups of transformations of topological spaces near here. See Kobayashi and Nomizu [26], page 1. Is there such a thing as a non-topological pseudogroup of transformations? ]
16.9. Topological vector spaces
16.9.1 Remark: In this book, the term “linear space” is used in preference to “vector space” because the word “vector” suggests an arrow with both a specified start point and a specified end point whereas a linear space is simply an algebraic structure. However, the term “topological vector space” is the standard terminology in the mathematics literature. So the more accurate terminology “topological linear space” is not used in this book.
16.9.2 Remark: Definition 16.9.3 combines Definition 10.1.2 for a linear space with Definition 14.2.4 for a topological space.
[ Check what the correct definition is for a topological vector space for a general field. It’s not quite clear what to do with the topology on the field. The usual idea is to just use the standard topology on the field, but what should one do if the field is not ℝ or ℂ? ]
16.9.3 Definition: A topological vector space is a tuple V −< (V, TV) −< (K, V, σK, τK, σV, µ, TK, TV) such that
(i) K = (K, σK, τK) is a field;
(ii) V −< (K, V, σK, τK, σV, µ) is a linear space over K;
(iii) (V, TV) is a topological space;
(iv) the vector addition function σV : V × V → V is continuous with respect to the topology TV on V and the corresponding product topology on V × V;
(v) the scalar multiplication function µ : K × V → V is continuous with respect to the topology TV on V and the product topology on K × V.
[ Check if the continuity of µ in Definition 16.9.3 can be dropped because of the linearity of the map. If not, give a counterexample. Using only the linearity of µ, it is probably possible to demonstrate continuity on a dense subspace of K, or something like that. ] [ In this section, give only the most basic general properties and definitions for topological vector spaces. Specific examples, such as Banach spaces, Hilbert spaces, distributions and Sobolev spaces should be given elsewhere. ]
16.10. Network topology and continuous paths in networks 16.10.1 Remark: This section is a mere twig on the concept tree of the book. It may be skipped with no harmful side-effects. On the other hand, it is referred to in Chapter 22, which is almost as pointless as this section. However, it may turn out some day that some significant world-models in physics require network topology. 16.10.2 Remark: The word “topology” is often applied to the connectivity properties of networks, also known as graphs. Network topology is not the same thing as topology on discrete sets. Network topology does not satisfy the conditions for a standard analytical topology, but there are many similarities. Continuous curves can be defined. Therefore notions such as pathwise parallelism and even curvature can be defined.
16.10.3 Remark: Given a set $N$, neighbourhoods $\Omega_x \subseteq N$ can be defined for each $x \in N$ such that
(i) $\forall x \in N,\ x \in \Omega_x$, and
(ii) $\forall x \in N,\ \forall y \in \Omega_x,\ x \in \Omega_y$.
Unfortunately, the intersections of neighbourhoods are not generally neighbourhoods, but continuous curves may be defined in a network $N$ as maps $\gamma : I \to N$ such that $I$ is a contiguous subset of the integers $\mathbb{Z}$ and
(i) $\forall i \in I,\ (i + 1 \in I \Rightarrow \gamma(i + 1) \in \Omega_{\gamma(i)})$, and
(ii) $\forall i \in I,\ (i - 1 \in I \Rightarrow \gamma(i - 1) \in \Omega_{\gamma(i)})$.
A metric $d : N \times N \to \bar{\mathbb{Z}}_0^+$ is usually defined recursively on a network $N$ in terms of balls $B_{x,r}$ with centre $x \in N$ and radius $r \in \mathbb{Z}_0^+$ as follows.
(i) $\forall x \in N,\ B_{x,0} = \{x\}$,
(ii) $\forall x \in N,\ \forall r \in \mathbb{Z}_0^+,\ B_{x,r+1} = \{z \in N;\ \exists y \in B_{x,r},\ z \in \Omega_y\} = \bigcup_{y \in B_{x,r}} \Omega_y$.
Then the metric $d(x, y)$ is defined as $\min\{r \in \mathbb{Z}_0^+;\ y \in B_{x,r}\}$ if $y \in \bigcup_{r \in \mathbb{Z}_0^+} B_{x,r}$; otherwise $d(x, y) = \infty$. The network $N$ is said to be connected if $d(x, y) < \infty$ for all $x, y \in N$.
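The recursive ball construction in Remark 16.10.3 amounts to a breadth-first search, which may be sketched as follows. This is only an illustrative sketch: the dictionary representation of the neighbourhoods, the function names `network_distance` and `is_connected`, and the four-point example network are assumptions made here and are not part of the book.

```python
from collections import deque
import math


def network_distance(neighbourhoods, x, y):
    """Return the least r with y in B_{x,r}, or math.inf if there is none."""
    if x == y:
        return 0  # B_{x,0} = {x}
    reached = {x}              # points already known to lie in some B_{x,r}
    queue = deque([(x, 0)])    # pairs (point, radius at which it was reached)
    while queue:
        point, r = queue.popleft()
        for z in neighbourhoods[point]:   # z ranges over Omega_point
            if z in reached:
                continue
            if z == y:
                return r + 1
            reached.add(z)
            queue.append((z, r + 1))
    return math.inf


def is_connected(neighbourhoods):
    """Check that d(x, y) is finite for all pairs of points."""
    points = list(neighbourhoods)
    return all(
        network_distance(neighbourhoods, x, y) < math.inf
        for x in points
        for y in points
    )


# Hypothetical four-point network: a chain a - b - c plus an isolated point d.
# Each neighbourhood contains its own centre, and membership is symmetric.
network = {
    "a": {"a", "b"},
    "b": {"a", "b", "c"},
    "c": {"b", "c"},
    "d": {"d"},
}
```

In this example the point c is first reached in the ball of radius 2 about a, whereas d is never reached from a, so the network is not connected in the sense of the remark.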
16.10.4 Remark: A fibre bundle may be defined on a network with a network topology by attaching a copy of a fibre space to each element of the network. Then parallelism may be defined as symmetric, transitive, path-dependent maps between the fibres. In the context of a network topology, a connection may be defined as the parallelism relations between neighbouring points. From the connection, parallel transport for general continuous paths may be generated by applying the transitivity rule.
[ Since continuity of functions may be redefined (as in Section 15.5) in terms of preservation of disconnectedness, it may be possible to use such an approach for network topology. Give definitions for this approach. ]
Chapter 17 Metric spaces
17.1 Distance functions and balls
17.2 Set distance and set diameter
17.3 The topology induced by a metric
17.4 Continuous functions in metric spaces
17.5 Rectifiable sets, curves and paths
17.0.1 Remark: Metric spaces may be regarded as being in a higher concept layer than topological spaces. This is because every metric space determines a unique corresponding topological space, whereas any metrizable topological space generally corresponds to an infinite number of metric spaces. Since a metric space induces a unique topology, a metric space may freely import the definitions and theorems of general topology, but not vice versa.
Since metric spaces are very close to human intuition, many textbooks introduce metric spaces before topological spaces to make life easier for the reader. In the long term, however, this is confusing because it is difficult to forget the “facts” of metric spaces when learning the more general subject of topological spaces. One’s intuition for metric spaces tends to impose itself on topological spaces, leading to many false assumptions and much confusion.
One of the principal objectives of this book is to present the foundations of differential geometry in a disciplined systematic manner in order to discourage the incorrect application of specific facts to more general contexts. A similar danger of false generalization occurs in many differential geometry books which present Riemannian manifolds before general differentiable manifolds and manifolds which have only an affine connection. Commencing a book in the middle of a subject (between the low-level foundations and the high-level applications) may be more popular in the short term, but it leads to confusion in the long term.
17.0.2 Remark: Metric spaces (in the context of topology) should not be confused with the concept of a Riemannian metric (in the context of differential geometry). A Riemannian manifold is a particular kind of differentiable metric space, and a Riemannian metric is a particular kind of differential of a two-point metric on a manifold. In terms of conceptual layering, metric spaces are lower than Riemannian manifolds. Metric spaces are a more general concept which is closely associated with general topology.
17.0.3 Remark: Just as the Riemannian metric tensor field simplifies many differential geometry definitions, so also a two-point metric function simplifies many topology definitions. Many distinct concepts in general topology become equivalent concepts when the topology is induced by a metric. A good example of this is compactness. Several different definitions of compactness for general topological spaces are equivalent for metric spaces.
17.0.4 Remark: Some of the topics in this chapter, such as Lipschitz functions and rectifiable curves, are more closely related to calculus than topology. It turns out that Lipschitz functions and rectifiable curves are differentiable almost everywhere when the metric space is a manifold. Differentiability is a calculus concept and “almost everywhere” is a measure theory concept, but the Lipschitz and rectifiability conditions may be stated in the absence of such higher-layer concepts.
17.1. Distance functions and balls
17.1.1 Definition: A metric (function) or distance function for a set $M$ is a function $d : M \times M \to \mathbb{R}_0^+$ such that
(i) $\forall x, y \in M,\ d(x, y) = 0 \Leftrightarrow x = y$, (identity)
(ii) $\forall x, y \in M,\ d(x, y) = d(y, x)$, (symmetry)
(iii) $\forall x, y, z \in M,\ d(x, y) + d(y, z) \ge d(x, z)$. (triangle inequality)
17.1.2 Definition: A metric space is a pair $(M, d)$ where $M$ is a set and $d$ is a metric function on $M$. A metric space $(M, d)$ may be abbreviated as $M$.
17.1.3 Remark: The Riemannian metric tensor in Section 39.2 may also be called simply “the metric”, which could be confused with the metric in Definition 17.1.1. When there is a possibility of confusion, the metric in Definition 17.1.1 may be referred to as a “point-to-point metric”, “two-point metric” or “distance function”. The Riemannian concept may be called a “Riemannian metric” or “metric tensor”. The two concepts are closely related, as discussed in Section 39.3. It turns out that the Riemannian metric is actually a differential of the two-point distance function.
17.1.4 Definition: The usual metric on $\mathbb{R}^n$ for $n \in \mathbb{Z}_0^+$ is the function $d : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_0^+$ defined by $d(x, y) = \bigl(\sum_{i=1}^n (x_i - y_i)^2\bigr)^{1/2}$ for $x, y \in \mathbb{R}^n$.
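The identity, symmetry and triangle conditions of Definition 17.1.1 can be checked by brute force on any finite sample of points. The following sketch does this for the usual metric of Definition 17.1.4. The names `usual_metric` and `is_metric_on`, the tolerance parameter and the sample points are illustrative assumptions, and a check on a finite sample does not, of course, prove anything about all of the space.

```python
import itertools
import math


def usual_metric(x, y):
    """Usual metric on R^n: square root of the sum of squared differences."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def is_metric_on(points, d, tol=1e-12):
    """Check the three metric conditions for d on a finite list of points.

    The tolerance tol only absorbs floating-point rounding error.
    """
    for x, y in itertools.product(points, repeat=2):
        if d(x, y) < 0:
            return False                      # values must be non-negative
        if (d(x, y) == 0) != (x == y):
            return False                      # (i) identity
        if abs(d(x, y) - d(y, x)) > tol:
            return False                      # (ii) symmetry
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, y) + d(y, z) < d(x, z) - tol:
            return False                      # (iii) triangle inequality
    return True


# Hypothetical sample of points in R^2.
sample = [(0.0, 0.0), (1.0, 1.0), (3.0, 4.0), (-2.0, 0.5)]


def squared_distance(x, y):
    """Squared usual metric, which fails the triangle inequality."""
    return usual_metric(x, y) ** 2
```

On this sample the usual metric passes all three checks, whereas the squared distance fails condition (iii): for the points (0, 0), (1, 1) and (3, 4), the squared distances 2 and 13 sum to less than 25.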
17.1.5 Theorem: In a metric space (M, d), d(x, z) ≥ |d(x, y) − d(y, z)| for all x, y, z ∈ M . Proof: See Exercise 47.7.11.
17.1.6 Remark: Theorem 17.1.5 puts a lower bound on distances corresponding to the upper bound in the triangle inequality. Combining the bounds gives ∀x, y, z ∈ M,
|d(x, y) − d(y, z)| ≤ d(x, z) ≤ d(x, y) + d(y, z).
17.1.7 Definition: The open ball in a metric space $(M, d)$ with centre $x \in M$ and radius $r \in \bar{\mathbb{R}}_0^+$ is the set $\{y \in M;\ d(x, y) < r\}$.
The closed ball in a metric space $(M, d)$ with centre $x \in M$ and radius $r \in \bar{\mathbb{R}}_0^+$ is the set $\{y \in M;\ d(x, y) \le r\}$.
17.1.8 Notation:
$B_{x,r}$, for a metric space $(M, d)$, $x \in M$ and $r \in \bar{\mathbb{R}}_0^+$, denotes the open ball $\{y \in M;\ d(x, y) < r\}$ in $M$ with centre $x$ and radius $r$.
$\bar{B}_{x,r}$, for a metric space $(M, d)$, $x \in M$ and $r \in \bar{\mathbb{R}}_0^+$, denotes the closed ball $\{y \in M;\ d(x, y) \le r\}$ in $M$ with centre $x$ and radius $r$.
$B_r(x)$, for a metric space $(M, d)$, $x \in M$ and $r \in \bar{\mathbb{R}}_0^+$, is an alternative notation for $B_{x,r}$.
$\bar{B}_r(x)$, for a metric space $(M, d)$, $x \in M$ and $r \in \bar{\mathbb{R}}_0^+$, is an alternative notation for $\bar{B}_{x,r}$.
17.1.9 Remark: Definition 17.1.7 is illustrated in Figure 17.1.1. General metric spaces are quite different to $\mathbb{R}^2$, but such diagrams are often helpful for understanding general theorems.
[ Figure 17.1.1: Open and closed balls in a metric space. Panels: $y \in B_r(x)$ and $y \in \bar{B}_r(x)$. ]
One may reconcile oneself to such inequalities by drawing lines and circles on paper. Theorems 17.1.12 and 17.1.14 suggest how to draw such lines and circles.
17.1.10 Remark: The words “open” and “closed” in Definition 17.1.7 are only loosely related to the concepts of open and closed sets in the topology induced by the metric in Definition 17.3.3. Likewise, the use of the bar in the notation for a closed ball does not generally mean that it is the topological closure of the corresponding open ball. Even in the spaces ℝⁿ, closed balls with zero radius are not topological closures of the corresponding open balls. In general metric spaces, the relation between open and closed balls is even looser. Finite metric spaces provide ample examples of open balls which are strictly included in the corresponding closed balls for positive radius.
[ Should have numerous exercises for finite metric spaces and other metric spaces where properties expected from ℝⁿ do not hold. ]
17.1.11 Remark: There are many ways to express the triangle inequality in terms of balls. Some examples are given in Theorem 17.1.12. Parts (1) and (3) are illustrated in Figure 17.1.2.
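[ Figure 17.1.2: Triangle inequality equivalents in Theorem 17.1.12 (1) and (3). Panel for part (1): $y \in \bar{B}_{r_1}(x) \Rightarrow \bar{B}_{r_2}(y) \subseteq \bar{B}_{r_1+r_2}(x)$. Panel for part (3): $\bar{B}_{r_1}(x) \cap \bar{B}_{r_2}(y) \ne \emptyset \Rightarrow y \in \bar{B}_{r_1+r_2}(x)$. ]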
17.1.12 Theorem: The following statements are equivalent to the triangle inequality for a function $d : M \times M \to \mathbb{R}_0^+$ on a set $M$ which satisfies the identity and symmetry conditions (i) and (ii) in Definition 17.1.1.
(1) $\forall x, y \in M,\ \forall r_1, r_2 \in \mathbb{R}_0^+,\ y \in \bar{B}_{r_1}(x) \Rightarrow \bar{B}_{r_2}(y) \subseteq \bar{B}_{r_1+r_2}(x)$.
(2) $\forall x, y \in M,\ \forall r_1, r_2 \in \mathbb{R}_0^+,\ \bigcup_{y \in \bar{B}_{r_1}(x)} \bar{B}_{r_2}(y) \subseteq \bar{B}_{r_1+r_2}(x)$.
(3) $\forall x, y \in M,\ \forall r_1, r_2 \in \mathbb{R}_0^+,\ \bar{B}_{r_1}(x) \cap \bar{B}_{r_2}(y) \ne \emptyset \Rightarrow y \in \bar{B}_{r_1+r_2}(x)$.
(4) $\forall x, y \in M,\ \forall r_1, r_2 \in \mathbb{R}_0^+,\ y \notin \bar{B}_{r_1+r_2}(x) \Rightarrow \bar{B}_{r_1}(x) \cap \bar{B}_{r_2}(y) = \emptyset$.
Proof: See Exercise 47.7.12.
17.1.13 Remark: There are many ways to express the triangle inequality lower bound in Theorem 17.1.5 in terms of balls. Some examples are given in Theorem 17.1.14. Parts (1) and (3) are illustrated in Figure 17.1.3.
[ Figure 17.1.3: Triangle inequality equivalents in Theorem 17.1.14 (1) and (3). Panel for part (1): $y \notin B_{r_1}(x) \Rightarrow \bar{B}_{r_2}(y) \cap B_{r_1-r_2}(x) = \emptyset$. Panel for part (3): $\bar{B}_{r_2}(y) \not\subseteq B_{r_1}(x) \Rightarrow y \notin B_{r_1-r_2}(x)$. ]
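17.1.14 Theorem: The following statements are equivalent to the triangle inequality for a function $d : M \times M \to \mathbb{R}_0^+$ on a set $M$ which satisfies the identity and symmetry conditions (i) and (ii) in Definition 17.1.1.
(1) $\forall x, y \in M,\ \forall r_1 \in \mathbb{R}_0^+,\ \forall r_2 \in [0, r_1],\ y \notin B_{r_1}(x) \Rightarrow \bar{B}_{r_2}(y) \cap B_{r_1-r_2}(x) = \emptyset$.
(2) $\forall x, y \in M,\ \forall r_1 \in \mathbb{R}_0^+,\ \forall r_2 \in [0, r_1],\ B_{r_1-r_2}(x) \cap \bigcup_{y \notin B_{r_1}(x)} \bar{B}_{r_2}(y) = \emptyset$.
(3) $\forall x, y \in M,\ \forall r_1 \in \mathbb{R}_0^+,\ \forall r_2 \in [0, r_1],\ \bar{B}_{r_2}(y) \not\subseteq B_{r_1}(x) \Rightarrow y \notin B_{r_1-r_2}(x)$.
(4) $\forall x, y \in M,\ \forall r_1 \in \mathbb{R}_0^+,\ \forall r_2 \in [0, r_1],\ y \in B_{r_1-r_2}(x) \Rightarrow \bar{B}_{r_2}(y) \subseteq B_{r_1}(x)$.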
Proof: See Exercise 47.7.13. 17.1.15 Remark: The inequalities and propositions in Definition 17.1.1 and Theorems 17.1.5, 17.1.12 and 17.1.14 may seem shallow. They become more interesting, however, when they are compared to the corresponding inequalities and propositions for a pseudo-metric (or “hyperbolic metric”). 17.1.16 Remark: It is sometimes useful to define an open and closed annulus as in Definition 17.1.17 corresponding to the open and closed ball in Definition 17.1.7. The special cases where the inner radius is 0 may be referred to as a “punctured” open or closed ball. Notation 17.1.18 is (probably) non-standard.
17.1.17 Definition: The open annulus in a metric space $(M, d)$ with centre $x \in M$ and radius pair $r_1, r_2 \in \bar{\mathbb{R}}_0^+$ is the set $\{y \in M;\ r_1 < d(x, y) < r_2\}$.
The closed annulus in a metric space $(M, d)$ with centre $x \in M$ and radius pair $r_1, r_2 \in \bar{\mathbb{R}}_0^+$ is the set $\{y \in M;\ r_1 \le d(x, y) \le r_2\}$.
The punctured open ball in a metric space $(M, d)$ with centre $x \in M$ and radius $r \in \bar{\mathbb{R}}_0^+$ is the set $\{y \in M;\ 0 \ne d(x, y) < r\}$.
The punctured closed ball in a metric space $(M, d)$ with centre $x \in M$ and radius $r \in \bar{\mathbb{R}}_0^+$ is the set $\{y \in M;\ 0 \ne d(x, y) \le r\}$.
17.1.18 Notation:
$B_{x,r_1,r_2}$, for a metric space $(M, d)$, $x \in M$ and $r_1, r_2 \in \bar{\mathbb{R}}_0^+$, denotes the open annulus $\{y \in M;\ r_1 < d(x, y) < r_2\}$ in $M$ with centre $x$ and radius pair $r_1, r_2$.
$\bar{B}_{x,r_1,r_2}$, for a metric space $(M, d)$, $x \in M$ and $r_1, r_2 \in \bar{\mathbb{R}}_0^+$, denotes the closed annulus $\{y \in M;\ r_1 \le d(x, y) \le r_2\}$ in $M$ with centre $x$ and radius pair $r_1, r_2$.
$\dot{B}_{x,r}$, for a metric space $(M, d)$, $x \in M$ and $r \in \bar{\mathbb{R}}_0^+$, denotes the punctured open ball $\{y \in M;\ 0 \ne d(x, y) < r\}$ in $M$ with centre $x$ and radius $r$.
$\bar{\dot{B}}_{x,r}$, for a metric space $(M, d)$, $x \in M$ and $r \in \bar{\mathbb{R}}_0^+$, denotes the punctured closed ball $\{y \in M;\ 0 \ne d(x, y) \le r\}$ in $M$ with centre $x$ and radius $r$.
$B_{r_1,r_2}(x)$, for a metric space $(M, d)$, $x \in M$ and $r_1, r_2 \in \bar{\mathbb{R}}_0^+$, is an alternative notation for $B_{x,r_1,r_2}$.
$\bar{B}_{r_1,r_2}(x)$, for a metric space $(M, d)$, $x \in M$ and $r_1, r_2 \in \bar{\mathbb{R}}_0^+$, is an alternative notation for $\bar{B}_{x,r_1,r_2}$.
$\dot{B}_r(x)$, for a metric space $(M, d)$, $x \in M$ and $r \in \bar{\mathbb{R}}_0^+$, is an alternative notation for $\dot{B}_{x,r}$.
$\bar{\dot{B}}_r(x)$, for a metric space $(M, d)$, $x \in M$ and $r \in \bar{\mathbb{R}}_0^+$, is an alternative notation for $\bar{\dot{B}}_{x,r}$.
17.1.19 Remark: There are, of course, many interrelationships between the various definitions of balls and annuli. For example, $B_{x,r_1,r_2} = B_{x,r_2} \setminus \bar{B}_{x,r_1}$ and $\dot{B}_{x,r} = B_{x,r} \setminus \bar{B}_{x,0}$.
17.1.20 Remark: There is some ambiguity between Notations 17.1.8 and 17.1.18. For example, $B_{x,r}$ and $B_{r_1,r_2}$ may be confused, particularly if $M$ is the metric space of real numbers. Such clashes are usually easy to clarify within the application context.
17.2. Set distance and set diameter
17.2.1 Definition: The distance between two sets $A$ and $B$ in a metric space $(M, d)$ is the non-negative extended real number $d(A, B) \in \mathbb{R}_0^+ \cup \{\infty\}$ defined by $d(A, B) = \inf\{d(x, y);\ x \in A,\ y \in B\}$.
The distance between a point and a set in a metric space $(M, d)$ is the non-negative extended real number $d(x, A) \in \mathbb{R}_0^+ \cup \{\infty\}$ defined by $d(x, A) = \inf\{d(x, y);\ y \in A\}$ for points $x \in M$ and $A \subseteq M$ with $A \ne \emptyset$.
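For finite sets the infimum in the set distance of Definition 17.2.1 is a minimum, so the definition can be sketched directly in code. The function names `set_distance` and `point_set_distance` are illustrative assumptions. The convention that the infimum over an empty family is infinite is adopted here as an assumption of the sketch.

```python
import math


def set_distance(d, A, B):
    """Distance d(A, B) between finite sets: the minimum of d(x, y) over all pairs.

    Returns math.inf when A or B is empty (the infimum of an empty set).
    """
    return min((d(x, y) for x in A for y in B), default=math.inf)


def point_set_distance(d, x, A):
    """Distance d(x, A) from a point to a finite set."""
    return set_distance(d, [x], A)


def real_line_metric(x, y):
    """Usual metric on the real numbers."""
    return abs(x - y)
```

For example, with the usual metric on the real numbers, the sets {0, 1} and {4, 7} are at distance 3, attained by the pair (1, 4).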
17.2.2 Remark: If $A = \emptyset$ or $B = \emptyset$ in Definition 17.2.1, then $d(A, B) = \inf \emptyset = \infty$, which is okay. Similarly, $d(x, A) = \infty$ if $A = \emptyset$. Otherwise, all distances are non-negative (finite) real numbers.
17.2.3 Theorem: Let $A_1$ and $A_2$ be subsets of a metric space $(M, d)$ with $A_1 \subseteq A_2$. Then $d(x, A_1) \ge d(x, A_2)$ for all $x \in M$. So $\{x \in M;\ d(x, A_1) < r\} \subseteq \{x \in M;\ d(x, A_2) < r\}$. Let $B_1, B_2$ be subsets with $B_1 \subseteq B_2$. Then $d(A_1, B_1) \ge d(A_2, B_2)$.
17.2.4 Definition: The diameter of a set $S \ne \emptyset$ in a metric space $(M, d)$ is the non-negative extended real number $\mathrm{diam}(S) = \sup\{d(x, y);\ x, y \in S\}$.
17.2.5 Remark: In Definition 17.2.4, the diameter of the empty set would be $\sup \emptyset = -\infty$, which would probably be annoying. Therefore it is not defined. Any non-empty set with a finite number of elements must have a finite diameter. The diameter of an infinite set may be infinite.
17.2.6 Remark: It follows from the triangle inequality that $\mathrm{diam}(B_{x,r}) \le 2r$ for all $x \in M$ and $r \in \mathbb{R}_0^+$. Corresponding to $\mathrm{diam}(S)$ in Definition 17.2.4, one could also define a radius as $\mathrm{radius}(S) = \inf\{r \in \mathbb{R} \cup \{\infty\};\ \exists x \in S,\ S \subseteq B_{x,r}\}$. However, although $\mathrm{diam}(S) \le 2\,\mathrm{radius}(S)$, the equality $\mathrm{diam}(S) = 2\,\mathrm{radius}(S)$ does not hold for all metric spaces.
17.2.7 Definition: A bounded subset of a metric space is a subset whose diameter is finite.
17.3. The topology induced by a metric
17.3.1 Remark: Out of all of the possible topologies which could be attached to a metric space, there is a single canonical topology which is generated by the set of all open balls in the metric space. This is the “topology induced by the metric”. Historically, metric spaces preceded topological spaces. But the history of mathematics has countless cases of “re-founding” old concepts on the basis of new concepts which are more general. This embedding of specific concepts in a more general framework is an integral part of the way mathematicians think. It often happens that there are benefits from the embedding of concepts in more general frameworks. The general frameworks suggest asking questions that one might otherwise not have asked. However, it sometimes happens that embedding in more general frameworks just makes the original concepts less comprehensible without any benefit. Generality for its own sake is sometimes a burden imposed by well-meaning mathematicians who want to create a new territory. But sometimes the new territory is barren. In the case of topology though, the more general framework has been enormously fruitful. There are many theorems for metric spaces which are even more useful when applied to more general topological spaces. The difficult thing, however, is to keep the theorems clear in one’s own mind. There are many theorems for metric spaces which do not generalize much or at all to topological spaces. It is very important to state clearly the assumptions upon which each theorem is based. 17.3.2 Remark: Just as a point-to-point metric (Definition 17.1.1) induces a canonical topology (Definition 17.3.3), so also a differential metric (the Riemannian metric) induces both a canonical topology and a canonical parallelism. 
It turns out that a point-to-point metric on a differentiable manifold (Definition 27.2.6) induces a Riemannian metric (under some reasonable assumptions), and this Riemannian differential metric therefore induces a canonical topology and a canonical parallelism. Fortunately it turns out that the induced topology and induced parallelism are the same no matter which path you arrive at them by (under some reasonable assumptions).
17.3.3 Definition: The topology induced by a metric $d$ on a set $M$ is the topology generated by the set of all open balls with positive radius in the metric space $(M, d)$. The topology of a metric space $(M, d)$ is the topology induced by $d$ on $M$. This topology may be denoted as $\mathrm{Top}(M)$ when the metric $d$ is implicit.
17.3.4 Remark: The topology on a metric space $(M, d)$ can be written explicitly as
$\mathrm{Top}(M) = \bigl\{ \bigcup_{i \in S} B_{x_i,r_i};\ x : S \to M \text{ and } r : S \to \mathbb{R}^+ \bigr\}$,
where the open balls $B_{x_i,r_i}$ are as in Notation 17.1.8.
17.3.5 Remark: The topology in Definition 17.3.3 is well-defined because an arbitrary set of subsets of any set $M$ will always generate a topology on $M$. This does not necessarily mean that the topology will have any nice properties. For example, if $d(x, y) = 0$ for all $x, y \in M$, the topology will be trivial. (See Definition 14.2.18 for the trivial topology.)
17.3.6 Remark: When there are two or more choices of a metric on a given set $M$, one could use a notation such as $\mathrm{Top}_d(M)$ to indicate the topology on $M$ for a particular metric $d$. But this could be confused with the notation $\mathrm{Top}_x(M)$ for the set of open neighbourhoods of $x$ in $\mathrm{Top}(M)$. A better notation for the topology induced by $(M, d)$ would be $\mathrm{Top}(M, d)$.
17.3.7 Remark: All of the definitions for topological spaces apply also to metric spaces by referring to the induced topology. Thus a metric space is said to be paracompact if the induced topology is paracompact, and so forth. Similarly, continuity between metric spaces, and between metric spaces and topological spaces, is defined as if the metric function were replaced with the induced topology, as superfluously presented in Definition 17.4.1.
17.3.8 Remark: Open balls in a metric space are automatically open sets by Definition 17.3.3.
17.3.9 Theorem: Closed balls in a metric space are closed sets in the induced topology.
Proof: Consider the closed ball $S = \{y \in M;\ d(x, y) \le r\}$. If $r = \infty$, then $S = M$, which is closed. So assume that $r < \infty$. If $z \in M \setminus S$, then $d(x, z) > r$. Therefore by Theorem 17.1.5, $B_{z,\varepsilon} \subseteq M \setminus S$ with $\varepsilon = d(x, z) - r$. But $B_{z,\varepsilon}$ is an open set. So $M \setminus S$ is an open set. Therefore $S$ is closed.
17.3.10 Remark: A closed ball in Definition 17.1.7 includes, but is not necessarily equal to, the closure of the corresponding open ball. Since a closed ball $\{y \in M;\ d(x, y) \le r\}$ is a closed set, it follows from Definition 14.4.4 for the closure of a set that $\overline{B_{x,r}} \subseteq \{y \in M;\ d(x, y) \le r\}$, where $\overline{B_{x,r}}$ denotes the topological closure of the open ball $B_{x,r}$. A discrete space such as the integers with the usual metric provides ample counterexamples to the converse.
17.3.11 Theorem: Let $(M, d)$ be a metric space. Then for any $A \subseteq M$, $x \in M$, and $r \in \bar{\mathbb{R}}_0^+$,
(1) $B_{x,r} \cap A = \emptyset \Leftrightarrow d(x, A) \ge r$;
(2) $B_{x,r} \cap A \ne \emptyset \Leftrightarrow d(x, A) < r \Leftrightarrow \exists y \in A,\ d(x, y) < r$.
Hence (or otherwise), $\bigcup_{x \in A} B_{x,r} = \{x \in M;\ d(x, A) < r\}$ for any set $A \subseteq M$ and $r \in \bar{\mathbb{R}}_0^+$.
17.3.12 Theorem: For points $x$ and closed sets $K$ in a metric space $(M, d)$, $x \in K \Leftrightarrow d(x, K) = 0$. In other words, if $K$ is closed then $K = \{x \in M;\ d(x, K) = 0\}$.
Proof: Since $K$ is closed, $M \setminus K$ is open. Therefore
$x \notin K \Leftrightarrow x \in M \setminus K$
$\Leftrightarrow \exists r > 0,\ B_{x,r} \subseteq M \setminus K$
$\Leftrightarrow \exists r > 0,\ B_{x,r} \cap K = \emptyset$
$\Leftrightarrow \exists r > 0,\ d(x, K) \ge r$
$\Leftrightarrow d(x, K) > 0$.
The theorem follows immediately.
17.3.13 Theorem: The interior of any set $S$ in a metric space $(M, d)$ is equal to the union of all open balls which are included in $S$. In other words, $\mathrm{Int}(S) = \bigcup \{B_{x,r};\ x \in S,\ r > 0,\ B_{x,r} \subseteq S\}$.
17.3.14 Remark: Definitions and properties of the interior, closure, exterior and boundary of sets in a general topological space are presented in Sections 14.4 and 14.5. The correspondence between the metric on a metric space $M$ and these set components in the induced topology on $M$ is expressed in Theorem 17.3.15 in terms of the set distance function.
17.3. The topology induced by a metric
419

17.3.15 Theorem: Let S be a subset of a metric space M with metric d. Then:
(1) Int(S) = {x ∈ M; d(x, M \ S) > 0}.
(2) S̄ = {x ∈ M; d(x, S) = 0}.
(3) Ext(S) = {x ∈ M; d(x, S) > 0}.
(4) Bdy(S) = {x ∈ M; d(x, S) = 0 ∧ d(x, M \ S) = 0}.

17.3.16 Theorem: The closure Ā of a set A in a metric space (M, d) satisfies
Ā = {x ∈ M; d(x, A) = 0} = ⋂_{r>0} ⋃_{x∈A} B_{x,r}.

Proof: It follows from Theorems 17.3.12 and 17.2.3 and the fact that Ā is closed, that Ā = {x ∈ M; d(x, Ā) = 0} ⊇ {x ∈ M; d(x, A) = 0}. For any A ⊆ M, the set {x ∈ M; d(x, A) > 0} is open because for any x ∈ M with d(x, A) > 0, B_{x,d(x,A)} ⊆ {x ∈ M; d(x, A) > 0}. So {x ∈ M; d(x, A) = 0} = M \ {x ∈ M; d(x, A) > 0} is closed. Since this closed set includes A, it follows that Ā ⊆ {x ∈ M; d(x, A) = 0}. This verifies the first equality of the theorem.

To show the second equality, let S_r = ⋃_{x∈A} B_{x,r} for r > 0. Then Ā ⊆ {x ∈ M; d(x, A) < r} = S_r by Theorem 17.3.11. So Ā ⊆ ⋂_{r>0} S_r. Now suppose that y ∉ Ā. Then d(y, A) > 0. Let r = d(y, A). Then y ∉ S_r = {x ∈ M; d(x, A) < r}. So y ∉ ⋂_{r>0} S_r. Hence ⋂_{r>0} S_r ⊆ Ā. This verifies the second equality.
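
The identity ⋃_{x∈A} B_{x,r} = {x ∈ M; d(x, A) < r} from Theorem 17.3.11, which drives the second equality of Theorem 17.3.16, can be checked by brute force on a small example. The following Python sketch is only an illustration: the grid space M, the taxicab metric and all function names are assumptions chosen for the example, not notation from this book.

```python
from itertools import product

# Hypothetical finite metric space: a 7x7 integer grid with the taxicab metric.
M = list(product(range(-3, 4), repeat=2))


def d(p, q):
    """Taxicab distance between two grid points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def set_distance(x, A):
    """d(x, A) = inf{d(x, y); y in A}; the infimum of the empty set is infinity."""
    return min((d(x, y) for y in A), default=float("inf"))


def open_ball(x, r):
    """Open ball B_{x,r} = {y in M; d(x, y) < r}."""
    return {y for y in M if d(x, y) < r}


def union_of_balls(A, r):
    """Union of the open balls B_{x,r} over all centres x in A."""
    result = set()
    for x in A:
        result |= open_ball(x, r)
    return result


def thickening(A, r):
    """The set {x in M; d(x, A) < r}."""
    return {x for x in M if set_distance(x, A) < r}


A = {(0, 0), (2, 1)}
for r in (0.5, 1, 2, 3.5):
    # Theorem 17.3.11: both descriptions give the same set for every radius.
    assert union_of_balls(A, r) == thickening(A, r)

# In this discrete space every set is closed, so the zero set of d(., A) is A itself.
closure = {x for x in M if set_distance(x, A) == 0}
```

Because the space is finite, every infimum is attained, so the sketch cannot exhibit the limiting behaviour that makes the closure strictly larger than A in spaces such as ℝ.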

[ Originally I proved half of the first equality of Theorem 17.3.16 as in Remark 17.3.17. But then I realized that it can be done in a single line. I’ll delete this remark as soon as I’m totally certain that it’s a total waste of space. ]

17.3.17 Remark: By Definition 14.4.4, Ā = ⋂{K ∈ ℙ(M); K is closed and A ⊆ K}. Let K be closed in M with A ⊆ K. Then K = {x ∈ M; d(x, K) = 0} by Theorem 17.3.12. By Theorem 17.2.3, d(x, A) ≥ d(x, K) for all x ∈ M. Therefore {x ∈ M; d(x, A) = 0} ⊆ {x ∈ M; d(x, K) = 0} = K. So {x ∈ M; d(x, A) = 0} ⊆ Ā.

The following calculation is some sort of alternative proof of the second equality in Theorem 17.3.16.
x ∈ S̄ ⇔ d(x, S) = 0
⇔ inf{d(x, y); y ∈ S} = 0
⇔ ∀r > 0, ∃y ∈ S, d(x, y) < r
⇔ ∀r > 0, ∃y ∈ S, x ∈ B_{y,r}
⇔ ∀r > 0, x ∈ ⋃_{y∈S} B_{y,r}
⇔ x ∈ ⋂_{r>0} ⋃_{y∈S} B_{y,r}.

[ I’ll get Remark 17.3.17 sorted out when I get a bit of spare time. ]

17.3.18 Theorem: A subset of a metric space is compact if and only if it is sequentially compact.

Proof: See Simmons [139], page 123, for a proof that sequentially compact implies compact, and pages 120–124 for a proof of the converse.

[ Make sure the proofs of equivalence of compact and sequentially compact do not use the axiom of choice. ]

17.3.19 Theorem: All compact subsets of a metric space are closed and bounded.

17.3.20 Remark: The converse of Theorem 17.3.19 is not true. A closed, bounded subset of a metric space is not necessarily compact. (See Simmons [139], page 115.)

17.3.21 Theorem: A subset of ℝ^n (with the usual metric) is compact if and only if it is closed and bounded.

17.3.22 Definition: A Lebesgue number for an open cover (Ω_i)_{i∈I} of a set X in a metric space (M, d) is a positive real number λ such that
∀A ⊆ X, (diam(A) < λ) ⇒ (∃i ∈ I, A ⊆ Ω_i);
in other words, every subset of X with diameter less than λ is fully included within at least one of the covering sets Ω_i. Thus an open cover (Ω_i)_{i∈I} of a set X in a metric space (M, d) is said to have a Lebesgue number if
∃λ > 0, ∀A ⊆ X, (diam(A) < λ) ⇒ (∃i ∈ I, A ⊆ Ω_i).
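
The condition in Definition 17.3.22 can be tested exhaustively when the set X is finite. The following Python sketch is a hedged illustration only: the ten-point set X, the two covering sets and the function names are assumptions made for this example. The covering sets are open in the discrete topology which the usual metric induces on this finite set.

```python
from itertools import combinations

# Hypothetical example: X = {0, ..., 9} with d(x, y) = |x - y|,
# covered by two overlapping sets.
X = list(range(10))
cover = [set(range(0, 6)), set(range(4, 10))]


def diam(A):
    """Diameter of a finite set of numbers; the empty set is given diameter 0."""
    return max((abs(a - b) for a in A for b in A), default=0)


def is_lebesgue_number(lam, X, cover):
    """Check: every subset A of X with diam(A) < lam lies inside some covering set."""
    for size in range(len(X) + 1):
        for subset in combinations(X, size):
            A = set(subset)
            if diam(A) < lam and not any(A <= omega for omega in cover):
                return False
    return True


works = is_lebesgue_number(3, X, cover)       # sets of diameter at most 2 fit in one piece
fails = is_lebesgue_number(3.5, X, cover)     # {3, 6} has diameter 3 and fits in neither
```

The overlap {4, 5} of the two covering sets is what makes a positive Lebesgue number possible here; the number found by this search depends on the width of that overlap.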

17.3.23 Remark: Since compactness and sequential compactness are equivalent in a metric space, Theorem 17.3.24 implies that all open covers of compact sets in metric spaces have Lebesgue numbers. This is useful for showing that rectifiability for compact-domain paths in general topological manifolds is well-defined.

17.3.24 Theorem: Every open cover of a sequentially compact set in a metric space has a Lebesgue number.

Proof: See Simmons [139], page 122.

17.3.25 Theorem: All metric spaces are paracompact.

[ Prove this. ]

17.3.26 Remark: The Heine-Borel theorem requires the definition of bounded sets, which requires a metric. Therefore it cannot be presented in an earlier chapter, even though the topology on the real numbers is defined in Section 14.9. The Heine-Borel theorem is sometimes proved with the aid of the Axiom of Choice. (See, for example, Simmons [139], pages 113–119.) It does not seem that this is necessary. Theorem 17.3.27 is proved here (hopefully) without any use of AC. (See Taylor [144], page 30.)

[ Unfortunately, I have read that Heine-Borel cannot be proved in ZF without AC. Bother! On the other hand, probably that refers to general metric spaces, not Euclidean spaces. Get references for this. ]

17.3.27 Theorem (Heine-Borel): All bounded, closed subsets of ℝ are compact.

Proof: First show that any bounded, closed interval [a, b] is compact. Let C be an open cover of [a, b]. Define S = {x ∈ [a, b]; ∃C1 ⊆ C, C1 is finite and C1 covers [a, x]}. Then S ≠ ∅ since a ∈ S, and S ⊆ [a, b]. So c = sup(S) is well defined and c ∈ [a, b]. Since C covers [a, b], c ∈ G for some G ∈ C. By definition of the topology on ℝ, there is an open interval (d, e) ⊆ ℝ such that c ∈ (d, e) ⊆ G. Since d < c = sup(S), the interval [a, d′] is covered by a finite subset C1 of C, where d′ = max(a, d). Let C2 = C1 ∪ {G}. Then C2 is a finite subset of C which covers [a, x] for every x ∈ [a, b] with x < e. If c < b, this implies that S contains points greater than c, which contradicts the definition of c. So c = b, and C2 is a finite subcover of C for [a, b]. So the assertion that [a, b] is compact follows. (See Figure 17.3.1.)

[Figure 17.3.1: Proof of Heine-Borel Theorem 17.3.27. The interval [a, b] ⊆ ℝ with cover C, the point c in the interval (d, e) ⊆ G, the finite subset C1, and C2 = C1 ∪ {G}.]

In the case of a general bounded, closed subset K of ℝ, there is a closed interval [a, b] ⊆ ℝ with K ⊆ [a, b]. By Theorem 15.7.6, any closed subset of a compact set is compact. So the theorem follows.

17.3.28 Theorem: A subset of ℝ^n with the usual topology is compact if and only if it is closed and bounded.

17.3.29 Remark: The definition of compactness in terms of the existence of finite subcovers of open covers (Definition 15.7.4) is equivalent in a metric space to sequential compactness (Definition 15.7.11), which is also equivalent to the Bolzano-Weierstraß property. The statement that the Euclidean metric spaces ℝ^n have the Bolzano-Weierstraß property is called the Bolzano-Weierstraß Theorem.

17.3.30 Definition: A metric space is said to have the Bolzano-Weierstraß property if every infinite subset has a limit point.

[ It seems like the Bolzano-Weierstraß theorem probably does not need AC. Must check. ]

17.4. Continuous functions in metric spaces

17.4.1 Definition: A continuous function from a metric space (M, d) to a topological space (X, T_X) is a function f : M → X which is continuous with respect to the topologies Top(M) and T_X. Continuity of a function f : X → M is defined with respect to T_X and Top(M) respectively.
A function f : M1 → M2 for metric spaces (M1, d1) and (M2, d2) is said to be continuous if f is continuous with respect to Top(M1) and Top(M2).

17.4.2 Remark: Theorem 17.4.3 states that in a metric space, ε-δ continuity is the same as continuity with respect to the induced topology of the metric space. The ε-δ definition of limits and continuity is attributed to Karl Theodor Wilhelm Weierstraß by Bell [189], page 294 and Bynum et alia [191], page 15. In an earlier time, continuity was defined intuitively. This made it difficult to arrive at consensus on which kinds of functions were continuous. But more importantly, the absence of an objective definition of continuity made deductive arguments impossible. As soon as a purely logical expression for continuity was discovered and agreed upon, rapid progress in the subject was possible. This shows the importance of replacing intuition with objective definitions.

17.4.3 Theorem: A function f : M1 → M2 between metric spaces (M1, d1) and (M2, d2) is continuous if and only if
∀x ∈ M1, ∀ε > 0, ∃δ > 0, ∀y ∈ M1, d1(x, y) < δ ⇒ d2(f(x), f(y)) < ε. (17.4.1)

17.4.4 Remark: Condition (17.4.1) in Theorem 17.4.3 is expressed in Theorem 17.4.5 in terms of open balls. The ball notation requires superscripts to indicate in which metric space the balls are defined, although these can generally be guessed from the context.

17.4.5 Theorem: A function f : M1 → M2 between metric spaces (M1, d1) and (M2, d2) is continuous if and only if
∀x ∈ M1, ∀ε > 0, ∃δ > 0, f(B^1_{x,δ}) ⊆ B^2_{f(x),ε}. (17.4.2)

[ Near here, specialize Definition 14.11.15 to metric spaces, using distances to define limits instead of neighbourhoods. Possibly also specialize Theorem 14.11.20 to metric spaces similarly. ]

17.4.6 Remark: For any continuous function f : M1 → M2, for metric spaces (M1, d1) and (M2, d2), one may construct a function E_f : M1 × ℝ^+_0 → ℝ^+_0 ∪ {∞} by E_f(x, δ) = inf{r ∈ ℝ^+_0 ∪ {∞}; f(B^1_{x,δ}) ⊆ B^2_{f(x),r}}. The characteristics of this function are the basis of various specialized definitions of continuity such as Hölder continuity.

17.4.7 Remark: Uniform continuity is specified in Definition 17.4.8 by swapping some of the quantifiers in Theorem 17.4.3. The condition is equivalent to ∀ε > 0, ∃δ > 0, ∀x ∈ M1, f(B_{x,δ}) ⊆ B_{f(x),ε}, which is equivalent to requiring that the function E_f in Remark 17.4.6 satisfies ∀ε > 0, ∃δ > 0, ∀x ∈ M1, E_f(x, δ) ≤ ε.

[ Define modulus of continuity. See EDM2 [34], page 317, section 84.A. ]

17.4.8 Definition: A uniformly continuous function f : M1 → M2 from a metric space (M1, d1) to a metric space (M2, d2) is a function f : M1 → M2 which satisfies
∀ε > 0, ∃δ > 0, ∀x, y ∈ M1, d1(x, y) < δ ⇒ d2(f(x), f(y)) < ε.

17.4.9 Theorem: For any metric spaces (M1, d1) and (M2, d2), if K is a compact subset of M1 and f : K → M2 is continuous, then f is uniformly continuous.

Proof: See Simmons [139], page 124, or Taylor [144], page 37.
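
The difference between the quantifier order in Theorem 17.4.3 and in Definition 17.4.8 can be seen in a concrete case. The following Python sketch is an illustration under stated assumptions: the function f(x) = x² on ℝ, the choice δ = min(1, ε/(2|x| + 1)) and the function names are all chosen for this example. The choice of δ is justified by |y² − x²| = |y − x||y + x| < δ(2|x| + 1) ≤ ε whenever |y − x| < δ ≤ 1.

```python
def f(x):
    """Example function: continuous on the real line, but not uniformly continuous."""
    return x * x


def delta_for(x, eps):
    """A delta which satisfies condition (17.4.1) for f at the point x (example choice)."""
    return min(1.0, eps / (2.0 * abs(x) + 1.0))


def check_eps_delta(x, eps, samples=1000):
    """Sample points y with |x - y| < delta and test that |f(x) - f(y)| < eps."""
    delta = delta_for(x, eps)
    for k in range(-samples + 1, samples):
        y = x + delta * k / samples
        if not abs(f(y) - f(x)) < eps:
            return False
    return True


all_pass = all(
    check_eps_delta(x, eps)
    for x in (0.0, 1.0, -10.0, 100.0)
    for eps in (1.0, 1e-3)
)

# The delta which works depends on x: it shrinks as |x| grows.
delta_near = delta_for(1.0, 1e-3)
delta_far = delta_for(100.0, 1e-3)
```

The dependence of δ on x is exactly what Definition 17.4.8 forbids. A sampled check of this kind is evidence, not a proof; the inequality in the lead-in is the actual argument.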

17.4.10 Definition: A Lipschitz (continuous) or Lipschitzian function from a metric space (M1, d1) to a metric space (M2, d2) is a function f : M1 → M2 such that
∃K ∈ ℝ^+_0, ∀x, y ∈ M1, d2(f(x), f(y)) ≤ K d1(x, y). (17.4.3)
A Lipschitz constant for a Lipschitz function f is any K ∈ ℝ^+_0 such that (17.4.3) holds.

17.4.11 Notation: Lip(f) denotes the infimum of all Lipschitz constants for a Lipschitz function f.

[ Define modulus of continuity and show how this relates to Lipschitz and uniform continuity. ]

17.4.12 Theorem: For any two metric spaces (M1, d1) and (M2, d2), all Lipschitz functions f : M1 → M2 are uniformly continuous, and all uniformly continuous functions f : M1 → M2 are continuous.

17.4.13 Remark: Theorem 17.4.14 specializes Theorem 17.4.3 to the case where the metric spaces are Euclidean tuple spaces with the usual metric specified in Definition 17.1.4.

[ Should generalize Theorem 17.4.14 to domains which are open subsets of ℝ^n. This requires the induced metric and induced topology on open subsets. This theorem is used in the proof of Theorem 18.2.15. ]

17.4.14 Theorem: A function f : ℝ^n → ℝ^m for m, n ∈ ℤ^+_0 is continuous if and only if
∀x ∈ ℝ^n, ∀ε > 0, ∃δ > 0, ∀y ∈ ℝ^n, |x − y| < δ ⇒ |f(x) − f(y)| < ε.
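
The quantity Lip(f) in Notation 17.4.11 can be bounded from below by sampling. The following Python sketch assumes an example function (the sine function on an interval of ℝ with the usual metric) and a hypothetical helper name; it computes the largest quotient d2(f(x), f(y))/d1(x, y) over a finite set of sample pairs, which is a lower bound for every Lipschitz constant of f.

```python
import math
from itertools import combinations


def lipschitz_lower_bound(f, points):
    """Largest quotient |f(x) - f(y)| / |x - y| over all pairs of distinct sample points.

    Every Lipschitz constant K in (17.4.3) must be at least this large.
    """
    return max(abs(f(x) - f(y)) / abs(x - y) for x, y in combinations(points, 2))


# Sample points in [-3, 3] with spacing 0.01 (all distinct).
points = [k * 0.01 for k in range(-300, 301)]
estimate = lipschitz_lower_bound(math.sin, points)
```

For the sine function the mean value theorem gives Lip(sin) = 1, so the sampled estimate should approach 1 from below as the sample spacing shrinks. Sampling can never establish an upper bound for Lip(f).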

17.5. Rectifiable sets, curves and paths

17.5.1 Remark: Rectifiable paths are a natural basis for the definition of parallelism in differentiable manifolds. The parallel transport of fibres between base points of a fibre bundle is closely related to integration. Therefore the paths which are used for parallel transport must be almost everywhere differentiable, and that is exactly what rectifiable paths provide.

17.5.2 Remark: Rectifiable paths are well defined in a general metric space. A more general concept is the k-rectifiable set introduced in Definition 17.5.3.

17.5.3 Definition: A k-rectifiable set in a metric space (M, d) for k ∈ ℤ^+_0 is a set X ⊆ M such that f(S) = X for some bounded set S ⊆ ℝ^k and Lipschitz function f : S → M.

17.5.4 Remark: The k-rectifiable sets in Definition 17.5.3 are clearly arbitrary subsets of images of k-rectangles under Lipschitz maps from ℝ^k to M. So all subsets of a k-rectifiable set are also k-rectifiable. It is equally clear that the union of any finite number of k-rectifiable sets must be a k-rectifiable set. (See Federer [105], page 251, for further related definitions.)
The map f in Definition 17.5.3 may be thought of as rectifying the set X, because under the inverse map f^{−1}, the set X is mapped to a flat set, which in some sense “rectifies” it; that is, it is “straightened” by the inverse map.

17.5.5 Definition: A rectifiable curve in a metric space (M, d) is a curve γ : I → M such that γ ∘ β^{−1} is Lipschitz continuous for some homeomorphism β : I → J, for some bounded interval J ⊆ ℝ.

17.5.6 Remark: The reparametrization β of a curve γ in Definition 17.5.5 is illustrated in Figure 17.5.1. The requirement that the interval J be bounded is essential.

[Figure 17.5.1: Reparametrization of rectifiable curve. The maps γ : I → M, β : I → J and γ ∘ β^{−1} : J → M.]

17.5.7 Theorem: Any subset of the image of a rectifiable curve in a metric space is a 1-rectifiable set.

Proof: The image of a rectifiable curve γ : I → M in a metric space (M, d) is the image of a Lipschitz map γ ∘ β^{−1} : J → M for some homeomorphism β : I → J and bounded interval J ⊆ ℝ. So Range(γ) = Range(γ ∘ β^{−1}) is a 1-rectifiable set. Any subset of a 1-rectifiable set is 1-rectifiable. So the theorem follows.

17.5.8 Remark: The converse to Theorem 17.5.7 is not generally true. If a set X ⊆ M is 1-rectifiable, then there is a bounded set S ⊆ ℝ such that f(S) = X for some Lipschitz function f : S → M. Let I = [inf S, sup S] ⊆ ℝ be the smallest closed interval which includes S. To show the converse of Theorem 17.5.7, one must show that f can be extended as a Lipschitz function to all of I. An obvious counterexample is the metric space (M, d) where M = ℤ and d : (x, y) ↦ |x − y|. Let S = X = {0, 1} and f : x ↦ x. Then I = [0, 1], but there is no Lipschitz extension f : I → M. The converse would be true if there exists a Lipschitz curve connecting every pair of points in the metric space.

17.5.9 Definition: The length of a curve γ ≠ ∅ in a metric space (M, d) is L(γ) defined by
L(γ) = sup{ Σ_{i=1}^{n} d(γ(x_i), γ(x_{i−1})); x = (x_i)_{i=0}^{n} ∈ S_γ } ∈ $\bar{\mathbb{R}}^+_0$,
where S_γ = {(a_i)_{i=0}^{n}; n ∈ ℤ^+, Range(a) ⊆ Dom(γ), a is non-decreasing}.
The length of an ordered traversal γ ≠ ∅ in (M, d) is defined identically to the length of a curve.

17.5.10 Theorem: The length of a curve is invariant under reparametrization. That is, for any metric space (M, d), for any curve γ : I → M, for any interval J ⊆ ℝ, for any homeomorphism β : I → J, the curve γ ∘ β^{−1} : J → M satisfies L(γ ∘ β^{−1}) = L(γ).

17.5.11 Remark: The curve length in Definition 17.5.9 uses only the total order on the domain of the curve γ. Therefore the length may be generalized to general “ordered traversals”, which were introduced in the non-standard Definition 7.1.17. See Definition 7.1.13 for order isomorphisms.

17.5.12 Theorem: The length of an ordered traversal is invariant under order isomorphisms. That is, for any metric space (M, d), for any totally ordered sets I and J, for any ordered traversal γ : I → M, for any order isomorphism β : I → J, L(γ) = L(γ ∘ β^{−1}).

17.5.13 Theorem: A curve γ in a metric space (M, d) is rectifiable if and only if L(γ) < ∞.

Proof: First let γ : I → M be a Lipschitz map from a bounded interval I to a metric space (M, d). Then for every sequence of points (a_i)_{i=0}^{n} in the set S_γ for γ in Definition 17.5.9, Σ_{i=1}^{n} d(γ(a_i), γ(a_{i−1})) ≤ Σ_{i=1}^{n} Lip(γ)(a_i − a_{i−1}) ≤ Lip(γ) diam(I). So L(γ) < ∞.

Now assume that a curve γ : I → M is a rectifiable curve in a metric space (M, d). Then by Definition 17.5.5, there exists a bounded interval J ⊆ ℝ and a homeomorphism β : I → J such that γ ∘ β^{−1} : J → M is a Lipschitz map. Let γ̃ = γ ∘ β^{−1}. Then for every sequence of points (a_i)_{i=0}^{n} in S_γ, Σ_{i=1}^{n} d(γ(a_i), γ(a_{i−1})) = Σ_{i=1}^{n} d(γ̃(β(a_i)), γ̃(β(a_{i−1}))) ≤ Σ_{i=1}^{n} Lip(γ̃)|β(a_i) − β(a_{i−1})| ≤ Lip(γ̃) diam(J), where the last inequality holds because a homeomorphism between intervals is monotonic. So L(γ) < ∞ again.

To show the converse, it is necessary for any γ : I → M with L(γ) < ∞ to construct a homeomorphism β : I → J such that γ ∘ β^{−1} : J → M is Lipschitz continuous and J is bounded. To do this, note that the length L(γ) is a non-decreasing function of the curve γ in the sense that L(γ|_K) ≤ L(γ) for any restriction γ|_K of γ to a subinterval K of I. For t ∈ I, define K = {x ∈ I; x ≤ t}. Then L(γ|_K) is non-negative, finite and non-decreasing with respect to t. Define J = [0, L(γ)], and define β : I → J by β : t ↦ L(γ|_K). This function β does not quite complete the proof because it may not be a homeomorphism. The curve may be constant on subintervals of I. This can be fixed by adding a function such as tanh(t) to the function β. Alternatively, the interval I may be mapped into a bounded interval such as [0, 1] and a term such as kt may then be added to β for some small positive k.
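
The supremum in Definition 17.5.9 can be approached from below by evaluating the sum for finer and finer non-decreasing sequences. The following Python sketch is only an illustration: the unit circle in the Euclidean plane, the uniform sequences and the function names are assumptions made for the example. Each sum is the perimeter of an inscribed polygon, so it cannot exceed the length 2π of the circle.

```python
import math


def polygonal_length(gamma, sequence, d):
    """The sum in Definition 17.5.9 for one non-decreasing sequence in Dom(gamma)."""
    return sum(
        d(gamma(sequence[i]), gamma(sequence[i - 1]))
        for i in range(1, len(sequence))
    )


def circle(t):
    """Example curve: the unit circle in the Euclidean plane, for t in [0, 2*pi]."""
    return (math.cos(t), math.sin(t))


def uniform_sequence(a, b, n):
    """The n + 1 equally spaced points a = x_0 <= x_1 <= ... <= x_n = b."""
    return [a + (b - a) * i / n for i in range(n + 1)]


# Doubling n refines the sequence, so the sums are non-decreasing
# by the triangle inequality.
lengths = [
    polygonal_length(circle, uniform_sequence(0.0, 2.0 * math.pi, n), math.dist)
    for n in (4, 8, 16, 64, 1024)
]
```

Since every such sum is bounded by 2π, the curve has finite length, which by Theorem 17.5.13 means that it is rectifiable. A finite computation only gives lower bounds for L(γ); finiteness of the supremum needs an argument such as the Lipschitz bound in the proof of Theorem 17.5.13.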

17.5.14 Remark: Theorem 17.5.13 shows that Definition 17.5.5, which is a topological definition of a rectifiable curve, is equivalent to the finite-length condition based on Definition 17.5.9, which is apparently entirely non-topological. The definition of length requires only the total order on the domain of the curve γ. This suggests that the length and rectifiability of a curve may be generalized to the class of all maps from totally ordered sets to the metric space. (Similar generalizations are presumably possible for k-rectifiability and families of curves.) However, since the metric structure is required on the target space for curves, rectifiability cannot be defined for general topological spaces.

17.5.15 Remark: Since the equivalent condition for rectifiability of a curve in Theorem 17.5.13 is clearly independent of the parametrization (and incidentally the orientation also) of a curve, it follows that curves γ with the same path [γ]_0 must either be all rectifiable or all not rectifiable. Therefore one may define rectifiable paths unambiguously as in Definition 17.5.16. The length of a curve is independent of parametrization. So the length of a path in Definition 17.5.17 is well-defined.

17.5.16 Definition: A rectifiable path in a metric space (M, d) is a path Q in M whose representative curves γ ∈ Q are all rectifiable curves in M.

17.5.17 Definition: The length of a rectifiable path Q in a metric space (M, d) is the length of any representative curve γ ∈ Q. The length of a path Q may be denoted as L(Q).

17.5.18 Remark: The rectifiability and length properties of a curve depend only on the sequence of points traversed and do not really require any continuous map. If the curve is simple, then a total order may be defined on the image of the curve to specify the traversal, and the rectifiability and length may be recovered from this. This follows from the fact that Theorem 17.5.13 uses the curve map only to determine the ordering of points. For simple curves, this implies that the curve map could be replaced in definitions of rectifiability and length with a total order. This could then be used in a parametrization-free definition of a directed path. This is not done here because it only works for simple paths (which is inconvenient for defining parallelism), and because it is too difficult to specialize to differentiable paths.

[ Define a distance parametrization for all curves, and normalize it to start at 0 if it is closed on the left. Show that diam(Im(γ)) ≤ L(γ). ]

[ Define pseudo-metrics as hyperbolic versions of two-point metrics. ]

Chapter 18
Differential calculus

18.1 Infinitesimals
18.2 Differentiation for one variable
18.3 Unidirectional differentiability of real-to-real functions
18.4 Higher-order derivatives for real-to-real functions
18.5 Differentiation for several variables
18.6 Higher-order derivatives for several variables
18.7 Some differentiability-based function spaces
18.8 Differentiation for abstract linear spaces
18.9 Hölder continuity

Some basic calculus topics are summarized here for quick reference. 18.0.1 Remark: Calculus is a set of calculation rules for differentiating and integrating functions. Analysis deals with limits, convergence, existence, regularity and other more advanced topics. The topics “calculus” and “analysis” have a lot of overlap. Both words suffer some ambiguity because they are used with different meanings in different contexts.

18.1. Infinitesimals

18.1.1 Remark: A vector may represent either a two-point displacement or a single-point infinitesimal displacement. The application of linear spaces to two-point displacements goes back to Descartes. (See EDM2 [34], section 101.) This involved labelling each point in space with numerical coordinates. Geometry before the time of Descartes was mostly expressed in terms of compass and ruler constructions.

Although Cartesian coordinates are very useful for describing the trajectories of objects in 3-dimensional space, the development of physical laws required infinitesimal displacements as introduced by Newton, who called them “fluxions”. A two-point space vector (x, x + ∆x) for a non-zero ∆x may be divided by a two-point time vector (t, t + ∆t) to provide an estimate ∆x/∆t of velocity, but this two-point average velocity does not have a simple algebraic relation to forces acting on an object. For the development of theories of motion, it was necessary to develop the concept of a limit of a quotient of two quantities to arrive at measures such as velocity or acceleration.

Whenever one analyses the rate of change of one variable with respect to another, which is how most physical theories are expressed, one looks at limits such as lim_{∆t→0} (∆x)/(∆t). In this case, one usually denotes the “limits” of ∆x and ∆t as dx and dt. Then the limit of the quotient is written as dx/dt. However, as everyone knows, neither dx nor dt is well defined in isolation. The concept of a limit may be expressed more precisely as ∀ε > 0, ∃δ > 0, |∆t| < δ ⇒ |∆x/∆t − dx/dt| < ε.

[ Should refer to the discussion of infinitesimals in Bell [189], pages 142–153, especially on the difficulties of Newton and Leibniz in thinking about derivatives and velocities. ]

18.1.2 Remark: It is entirely understandable that it is so difficult to find an exact mathematical representation of an infinitesimal. It is one of the deepest and most significant concepts in the history of science. The
discovery of this concept marked the break between descriptive science and predictive science. By applying Newton’s mechanical and gravitational laws, it was possible to determine on paper how a machine would work before it was built. But the laws could only be understood with the aid of the concepts of derivatives and integrals, both of which require the concept of infinitesimals.

18.1.3 Remark: If one thinks about velocity from the information point of view, it seems that the velocity (or momentum) of an object or particle stores information. When an object moves in a vacuum, the information encoded in its velocity is maintained indefinitely, and very precisely. One might ask how an object or particle “knows” what velocity to maintain. If this information were “copied” inaccurately from second to second, one would expect that velocity would follow a random walk, but this does not seem to be observed.

The classical notion of infinitely well maintained velocity of a particle, taking into account various forces such as gravity, is in opposition to some more recent ideas regarding the granularity of space and time. If space-time does indeed turn out to be granular in nature, the ε-δ limit notion of velocity may turn out to be inapplicable because for small enough δ, the velocity estimator ∆x/∆t may become rather fuzzy. One may then ask what the forces (such as gravity) acting on a particle are modifying exactly, since the instantaneous velocity may not be well defined. Then it could be better to formulate velocity in terms of a probabilistic estimator which is not so sensitive to fuzziness in either space or time.

In applications of analysis to engineering, it is often noted that integrals smooth out the errors or noise, whereas differentiation exacerbates errors or noise. So integral expressions are preferred, for example, in feedback systems. It may be, in fact, that differentiation is not an applicable model for the real physical world, whereas integration may be entirely suitable. It is probably wise to not take infinitesimal limit processes too literally. As noted elsewhere in this book, any concept which relies heavily on infinity concepts is liable to be inapplicable in detail, although models reliant on limiting processes may be accurate enough in realistic application scenarios. Concepts such as temperature and pressure of gases have undoubted applicability and reality, despite the “granular” nature of these concepts. It is probably a good idea to define derivatives, if possible, in such a way that they may easily be generalized to a corresponding statistical concept. In other words, it would be desirable to develop a “noise-tolerant definition” of differentiation.

18.2. Differentiation for one variable

18.2.1 Remark: Definition 18.2.2 seeks to adapt the ε-δ style of continuity condition in Theorem 17.4.5 to derivatives of real-valued functions. In the case of continuity at a point p ∈ ℝ, the function f must satisfy f(B_{p,δ}) ⊆ B_{f(p),ε} for some δ > 0. In the case of differentiability at a point p ∈ ℝ, the function f must satisfy (f(x) − f(p))/(x − p) ∈ B_{v,ε} for x ∈ B_{p,δ} for some δ > 0, but the point p must be excluded from the open ball B_{p,δ} to ensure that the quotient is well defined.

18.2.2 Definition: A function f : U → ℝ for U ∈ Top(ℝ) is said to be differentiable at p for p ∈ U if
∃v ∈ ℝ, ∀ε > 0, ∃δ > 0, ∀x ∈ (p − δ, p + δ) ∩ (U \ {p}), (f(x) − f(p))/(x − p) ∈ B_{v,ε}. (18.2.1)
A function f : U → ℝ for U ∈ Top(ℝ) is said to be differentiable on U if it is differentiable at p for all p ∈ U. In other words,
∀p ∈ U, ∃v ∈ ℝ, ∀ε > 0, ∃δ > 0, ∀x ∈ (p − δ, p + δ) ∩ (U \ {p}), (f(x) − f(p))/(x − p) ∈ B_{v,ε}. (18.2.2)

18.2.3 Remark: Condition (18.2.2) may be rewritten as follows.
∀p ∈ U, ∃v ∈ ℝ, ∀ε > 0, ∃δ > 0, Q_{f,p}(Ḃ_{p,δ}) ⊆ B_{v,ε}, (18.2.3)
where the function Q_{f,p} : U \ {p} → ℝ is defined by Q_{f,p} : x ↦ (f(x) − f(p))/(x − p), and Ḃ_{p,δ} denotes a punctured open ball as in Definition 17.1.17. The quotient Q_{f,p}(x) = (f(x) − f(p))/(x − p) is an estimator for the gradient parameter of the “best-fit” straight line through the point (p, f(p)) of the graph of f. Condition (18.2.3) says that this estimator must be arbitrarily close to a real number v if the domain is restricted narrowly enough around the point p.

Since there are so many families of curves to choose from for fitting to a given function f, one might ask why a linear function is chosen. Historically, this is almost certainly because of the long history of straight line Euclidean geometry which was taught in the universities of Europe in the 17th century, and because in Cartesian coordinates, the graph of a line is expressed in terms of a very small set of add-and-multiply operations. In fact, the general ubiquity of straight lines in mathematics is surely for the same two reasons.

The derivative of a function may be regarded as a local linearization of the function. The “goodness of fit” improves to any specified accuracy if the “δ microscope” is zoomed in sufficiently closely on any given point of the graph.

18.2.4 Remark: The ε-δ style of continuity condition in Theorem 17.4.3 is adapted for differentiability in Definition 18.2.5, which is an alternative for Definition 18.2.2. In the case of continuity, the difference |f(x) − f(p)| must be bounded by ε for arbitrarily small ε. For differentiability, |f(x) − f(p) − v(x − p)| must be bounded by ε|x − p| for arbitrarily small ε. This is illustrated in Figure 18.2.1.

[Figure 18.2.1: Definition of derivative of real-valued function of a real variable. For x between p − δ and p + δ, the graph of f(x) lies between the lines f(p) + v(x − p) − ε|x − p| and f(p) + v(x − p) + ε|x − p|.]

Since the vast majority of fundamental physics is written in terms of derivatives, one might ask which of Definitions 18.2.2 and 18.2.5 is preferable for conveying the right meaning. A derivative is generally a model for a velocity or the rate of change of some parameter. Definition 18.2.2 suggests that we are making estimates of a rate-of-change parameter. Definition 18.2.5 suggests that we are determining the “goodness of fit” of a straight line to the graph of the variation of one parameter with respect to another. There are some rate-of-change concepts which are not easily represented as fitting two graphs to each other. For example, the second derivative of a function may be represented as a limit of the quotient h^{−2}(f(p − h) − 2f(p) + f(p + h)). Limiting this quotient to a ball B_{a,ε} limits the graph of f to a set of parabolas. These parabolas can be depicted on a graph of f, but they are somewhat untidy.
Graphs generally do not have any physical reality. They are merely convenient for humans to represent what is happening. It seems therefore preferable to define a derivative as the limit of a parameter estimator rather than as the parameter of a “best-fit curve”. In other words, Definition 18.2.2 is preferable to Definition 18.2.5.

18.2.5 Definition (→ 18.2.2): A function f : U → IR for U ∈ Top(IR) is said to be differentiable at p for p ∈ U if

∃v ∈ IR, ∀ε > 0, ∃δ > 0, ∀x ∈ (p − δ, p + δ) ∩ (U \ {p}), |f(x) − f(p) − v(x − p)| < ε|x − p|. (18.2.4)

A function f : U → IR for U ∈ Top(IR) is said to be differentiable on U if it is differentiable at p for all p ∈ U.
18.2.6 Remark: The number v in Definition 18.2.5 is unique for a given function f and point p. This is shown in Theorem 18.2.7. The difference between equations (18.2.4) and (18.2.5) is the use of the unique existence quantifier “∃′” in (18.2.5).

18.2.7 Theorem: Let U ∈ Top(IR) and let f : U → IR be a function which is differentiable at p ∈ U. Then

∃′v ∈ IR, ∀ε > 0, ∃δ > 0, ∀x ∈ (p − δ, p + δ) ∩ (U \ {p}), |f(x) − f(p) − v(x − p)| < ε|x − p|. (18.2.5)
Proof 1: The uniqueness of v follows from a straightforward modus tollens. Let v₁, v₂ ∈ IR satisfy equation (18.2.4) for some p ∈ U. Suppose v₁ ≠ v₂ and let ε = |v₁ − v₂|/2. Then ε > 0. Therefore

∃δ₁ > 0, ∀x ∈ (p − δ₁, p + δ₁) ∩ (U \ {p}), |f(x) − f(p) − v₁(x − p)| < ε|x − p|

and

∃δ₂ > 0, ∀x ∈ (p − δ₂, p + δ₂) ∩ (U \ {p}), |f(x) − f(p) − v₂(x − p)| < ε|x − p|.

Let δ = min(δ₁, δ₂). Then

∀x ∈ (p − δ, p + δ) ∩ (U \ {p}), |f(x) − f(p) − v₁(x − p)| < ε|x − p| and |f(x) − f(p) − v₂(x − p)| < ε|x − p|.

Since (p − δ, p + δ) ∩ (U \ {p}) ≠ ∅, it follows that for some x ∈ (p − δ, p + δ) ∩ (U \ {p}),

|v₁ − v₂|·|x − p| = |(v₁ − v₂)(x − p)|
= |(f(x) − f(p) − v₁(x − p)) − (f(x) − f(p) − v₂(x − p))|
≤ |f(x) − f(p) − v₁(x − p)| + |f(x) − f(p) − v₂(x − p)|
< 2ε|x − p|
= |v₁ − v₂|·|x − p|.

This is impossible if v₁ ≠ v₂ and x ≠ p. Therefore v₁ = v₂. In other words, v is unique as claimed.

Proof 2: Let v₁, v₂ ∈ IR satisfy equation (18.2.4) for some p ∈ U. Let ε > 0. Then

∃δ₁ > 0, ∀x ∈ (p − δ₁, p + δ₁) ∩ (U \ {p}), |f(x) − f(p) − v₁(x − p)| < ε|x − p|

and

∃δ₂ > 0, ∀x ∈ (p − δ₂, p + δ₂) ∩ (U \ {p}), |f(x) − f(p) − v₂(x − p)| < ε|x − p|.

Let δ = min(δ₁, δ₂). Then since (p − δ, p + δ) ∩ (U \ {p}) ≠ ∅, for some x ∈ (p − δ, p + δ) ∩ (U \ {p}),

|v₁ − v₂| = |(v₁ − v₂)(x − p)|/|x − p|
= |(f(x) − f(p) − v₁(x − p)) − (f(x) − f(p) − v₂(x − p))|/|x − p|
≤ (|f(x) − f(p) − v₁(x − p)| + |f(x) − f(p) − v₂(x − p)|)/|x − p|
< 2ε|x − p|/|x − p|
= 2ε.

Since this holds for all ε > 0, it follows that |v₁ − v₂| = 0. So v₁ = v₂. In other words, v is unique as claimed.
18.2.8 Remark: Proofs of uniqueness in analysis often follow the modus tollens pattern as in line (18.2.6). This is very similar to the reductio ad absurdum pattern of argument in line (18.2.7). In the 19th century, these kinds of “indirect” methods of argument were often criticised for assuming the “excluded middle” notion of truth and falsity, although as argued in Remark 3.1.4 and Section 3.11, adoption of a sensible ontology for logic ensures the validity of “proof by contradiction”.

A ⇒ B, ¬B ⊢ ¬A (18.2.6)
A ⇒ B, A ⇒ ¬B ⊢ ¬A. (18.2.7)
Proof 1 of Theorem 18.2.7 uses the modus tollens approach. Proof 2 of Theorem 18.2.7 apparently does not.

18.2.9 Remark: The uniqueness of the number v in Definition 18.2.5 implies that the word “the” can be used instead of “a” for this value. Thus it may be given a name such as “the derivative of f at p”. (If the number v were not unique, we would have to call it “a derivative of f at p”.) Hence Definition 18.2.10 makes good sense. If the function f is differentiable on U, there is a unique well-defined function which maps each p ∈ U to a corresponding unique value v at p. This is called “the derivative of f”.

18.2.10 Definition: The derivative of f at p, for a function f : U → IR which is differentiable at a point p ∈ U for U ∈ Top(IR), is the unique number v ∈ IR which satisfies

∀ε > 0, ∃δ > 0, ∀x ∈ (p − δ, p + δ) ∩ (U \ {p}), (f(x) − f(p))/(x − p) ∈ B_{v,ε}.

The derivative of f, for a function f : U → IR which is differentiable on U ∈ Top(IR), is the function f′ : U → IR defined so that for all p ∈ U, f′(p) is the derivative of f at p.

18.2.11 Remark: EDM2 [34], 106.A, gives the notations dy/dx, y′, ẏ, df(x)/dx, (d/dx)f(x), f′(x) and D_x f(x) for the derivative of y = f(x). In the olden days, the dot notation ẏ usually meant the derivative with respect to time, whereas the dashed notation y′ meant the derivative with respect to a space variable. Leibniz created the dy/dx notation whereas Newton created the ẋ notation.

Some authors attribute the slow progress of analysis in England after Newton to the use of his notation. For example, Ball [187], page 439, wrote the following in about 1893.

    Towards the beginning of the last century the more thoughtful members of the Cambridge school of mathematics began to recognize that their isolation from their continental contemporaries was a serious evil. The earliest attempt in England to explain the notation and methods of the calculus as used on the continent was due to Woodhouse, who stands out as the apostle of the new movement.

Ball [187], page 441, wrote the following about Charles Babbage, who was also a member of the Cambridge Analytical Society.

    It was he who gave the name to the Analytical Society, which, as he stated, was formed to advocate “the principles of pure d-ism as opposed to the dot-age of the university.”

The d and dot here refer to the continental and Newtonian notations respectively. Struik [193], page 171, wrote the following. (He gives the reference Dubbey [192] for the Babbage quote.)

    The names of Hamilton and Cayley show that by 1840 English-speaking mathematicians had at last begun to catch up with their continental colleagues. Until well into the nineteenth century, the Cambridge and Oxford dons regarded any attempt at improvement of the theory of fluxions as an impious revolt against the sacred memory of Newton. The result was that the Newtonian school of England and the Leibnitzian school of the continent drifted apart to such an extent that Euler, in his integral calculus (1768), considered a union of both methods of expression as useless. The dilemma was broken in 1812 by a group of young mathematicians at Cambridge who, under the inspiration of the older Robert Woodhouse, formed an Analytical Society to propagate the differential notation. Its leaders were George Peacock, Charles Babbage, and John Herschel. They tried, in Babbage’s words to advocate “the principles of pure d-ism as opposed to the dot-age of the university.” [. . . ] The new generation in England now began to participate in modern mathematics.
Ball [187], pages 361–362, also wrote the following about Newton’s “fluxions” method as opposed to the European “differential” method.

    The controversy with Leibnitz was regarded in England as an attempt by foreigners to defraud Newton of the credit of his invention, and the question was complicated on both sides by national jealousies. It was therefore natural, though it was unfortunate, that in England the geometrical and fluxional methods as used by Newton were alone studied and employed. For more than a century the English school was thus out of touch with continental mathematicians. The consequence was that, in spite of the brilliant band of scholars formed by Newton, the improvements in the methods of analysis gradually effected on the continent were almost unknown in Britain. It was not until 1820 that the value of analytical methods was fully recognized in England, and that Newton’s countrymen again took any large share in the development of mathematics.

(See also some related comments in Remark 46.1.5.)

18.2.12 Remark: Equation (18.2.4) is the same as (18.2.8).

∃v ∈ IR, lim_{x→p} |f(x) − f(p) − v(x − p)| / |x − p| = 0. (18.2.8)
(See Definition 14.11.15 for the definition of the limit of a function at a point.)

[ Definition 14.11.15 is quite unsatisfactory. This must be fixed. ]

18.2.13 Remark: It is tempting to seek a generalization of condition (17.4.2) in Theorem 17.4.5, which expresses continuity in a general metric space in terms of open balls, from continuity to differentiability. One may define sets of the form K_{p,a,v,ε} = {(x, y) ∈ IR × IR; |y − a − v(x − p)| < ε|x − p|} for (p, a) ∈ IR × IR, v ∈ IR and ε ∈ IR⁺. Then one may re-express (18.2.4) as:

∃v ∈ IR, ∀ε > 0, ∃δ > 0, f|_{(p−δ, p+δ)\{p}} ⊆ K_{p,f(p),v,ε}.

This is illustrated in Figure 18.2.2. The cone-shaped neighbourhoods K_{p,f(p),v,ε} are related to the definitions of α-Hölder continuity in Section 18.9, particularly the case α = 1.

[Figure 18.2.2: Cone-shaped neighbourhood for defining differentiability. The restriction of f to (p − δ, p + δ) \ {p} lies inside the cone K_{p,f(p),v,ε} bounded by the lines f(p) + v(x − p) − ε|x − p| and f(p) + v(x − p) + ε|x − p|, with the central line f(p) + v(x − p) through (p, f(p)).]

One may also note that the definition of differentiability of f at p is equivalent to:

∃v ∈ IR, ∀ε > 0, ∃δ > 0, ∀x ∈ (p − δ, p + δ) ∩ (U \ {p}), f(x) ∈ B_{f(p)+v(x−p), ε|x−p|}.

This means that f(x) is required to lie inside the open ball B_{f(p)+v(x−p), ε|x−p|} for each x, but this ball depends on x. This “moving ball” concept seems to capture the meaning of the derivative even less well than the “tangent cone” concept. All things considered, the “estimator convergence” concept in Definition 18.2.2 seems to express the derivative idea in the most natural way.
18.2.14 Remark: Differentiability is generally thought of as a stronger condition than continuity. However, in the case of multiple independent variables, Examples 18.5.6 and 18.5.11 show that partial and directional differentiability do not imply continuity. Therefore it is prudent to prove that differentiability for a single independent variable does imply continuity.

18.2.15 Theorem: Let U ∈ Top(IR) and p ∈ U. Suppose a function f : U → IR is differentiable at p. Then f is continuous at p.

Proof: Suppose U ∈ Top(IR) and f : U → IR is differentiable at p ∈ U. By Theorem 17.4.14, f is continuous at p if

∀ε > 0, ∃δ > 0, ∀x ∈ U, |x − p| < δ ⇒ |f(x) − f(p)| < ε.

Let ε > 0 and let v = f′(p) be the derivative of f at p. Let ε′ = ε/2. Then Definition 18.2.5 implies

∃δ′ > 0, ∀x ∈ (p − δ′, p + δ′) ∩ (U \ {p}), |f(x) − f(p) − v(x − p)| < ε′|x − p|.

Let δ = min(1, δ′, ε/(2|v|)) if v ≠ 0, and δ = min(1, δ′) if v = 0. Then |x − p| < 1 for all x ∈ (p − δ, p + δ) and |v|δ ≤ ε/2. So

∀x ∈ (p − δ, p + δ) ∩ (U \ {p}),
|f(x) − f(p)| ≤ |f(x) − f(p) − v(x − p)| + |v(x − p)|
< ε′|x − p| + |v|·|x − p|
≤ (ε/2)|x − p| + ε/2
< ε.

The inequality |f(x) − f(p)| < ε holds trivially for x = p. Therefore f is continuous at p.

[ Show that a Lipschitz function is differentiable almost everywhere. This really requires measure theory, but Lipschitz functions are non-differentiable only on countable sets, which don’t need any measure theory. Also present functions of bounded variation. ]

18.2.16 Example: [ Here give the example of a function h : IR → IR which is everywhere continuous but differentiable nowhere. For example, see Rudin [136], theorem 7.18, page 141. This requires the use of infinite series. ]

18.3. Unidirectional differentiability of real-to-real functions

18.3.1 Definition: A right-open subset of IR is a set U ⊆ IR which satisfies

∀p ∈ U, ∃δ > 0, (p, p + δ) ⊆ U.

18.3.2 Definition: A left-open subset of IR is a set U ⊆ IR which satisfies

∀p ∈ U, ∃δ > 0, (p − δ, p) ⊆ U.

18.3.3 Definition: A right-differentiable function at a point p ∈ U for a right-open set U ⊆ IR is a function f : U → IR which satisfies

∃v ∈ IR, ∀ε > 0, ∃δ > 0, ∀x ∈ (p, p + δ) ∩ U, |f(x) − f(p) − v(x − p)| ≤ ε|x − p|.

A right-differentiable function on a right-open set U ⊆ IR is a function f : U → IR which is right-differentiable at p for all p ∈ U.

[Figure 18.3.1: Right differentiability of real-valued function of real variable. The graph of f(x) against x on the interval from p to p + δ, shown together with the lines f(p) + v(x − p) − ε|x − p|, f(p) + v(x − p) and f(p) + v(x − p) + ε|x − p|.]

18.3.4 Definition: A left-differentiable function at a point p ∈ U for a left-open set U ⊆ IR is a function f : U → IR which satisfies

∃v ∈ IR, ∀ε > 0, ∃δ > 0, ∀x ∈ (p − δ, p) ∩ U, |f(x) − f(p) − v(x − p)| ≤ ε|x − p|.

A left-differentiable function on a left-open set U ⊆ IR is a function f : U → IR which is left-differentiable at p for all p ∈ U.

18.3.5 Definition: A unidirectionally differentiable function at a point p ∈ U for an open set U ⊆ IR is a function f : U → IR which is both right-differentiable and left-differentiable at p.

A unidirectionally differentiable function on an open set U ⊆ IR is a function f : U → IR which is both right-differentiable and left-differentiable on U.
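A standard illustration of unidirectional differentiability is the absolute value function at p = 0, which is right-differentiable with v = 1 and left-differentiable with v = −1, but not differentiable there. The Python sketch below is an illustration only; the helper name `one_sided_quotients` is an assumption and does not appear in the text.

```python
def one_sided_quotients(f, p, h):
    """Return (left, right) difference quotients of f at p for a step h > 0.

    left  uses the point x = p - h, as in the left-differentiability condition;
    right uses the point x = p + h, as in the right-differentiability condition.
    """
    if h <= 0:
        raise ValueError("h must be positive")
    left = (f(p - h) - f(p)) / (-h)
    right = (f(p + h) - f(p)) / h
    return left, right

if __name__ == "__main__":
    for h in (1e-1, 1e-3, 1e-6):
        print(h, one_sided_quotients(abs, 0.0, h))
```

The two quotients stay at −1 and 1 for every h > 0, so the two one-sided conditions are satisfied with different values of v, and no single v can satisfy (18.2.4).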
[ Insert tree diagram for various classes of first-order differentiability at a point, and in a given domain. ]

18.4. Higher-order derivatives for real-to-real functions

18.4.1 Remark: It is tempting to define higher order differentiability of real-valued functions of a real variable inductively as follows.

    A function f : U → IR for U ∈ Top(IR) is said to be k-times differentiable at p for k ∈ ℤ⁺ and p ∈ U if f is (k − 1)-times differentiable in a neighbourhood of p and the (k − 1)th derivative of f is differentiable at p.

Such inductive definitions are somewhat dangerous in general. It is best to play it safe by first demonstrating the existence of the sequences of objects which are to be defined and then giving them names in a definition.

An “inductive definition” is an infinite sequence of definitions, each of which requires the validation of all earlier definitions in the sequence in order to be validated. This is different from a “template definition” which is parametrized by an object from a specified class, but where there is no dependency between the definitions for each member of the class. The list space List(X) for an arbitrary set X in Definition 7.12.2 is a typical example of a template definition. There are infinitely many list space definitions, but none of them depend on each other, unless of course the space X is itself defined in terms of a list space. A space List(List(X)) is defined by the double application of Definition 7.12.2.

An inductive sequence of definitions is a template definition where there are dependencies between the individual definitions for particular parameter values. Although such definitions may appear fairly safe for simple induction situations, the danger is much greater in the case of multiple induction, transfinite induction and induction with respect to more general ordered sets of definition parameter values.

18.4.2 Remark: Since the kth derivative of a function f at a point x ∈ IR depends on two variables, k and x, one must make a choice of order for the formalisation of higher-order derivatives. One may think of these derivatives as a sequence of partially defined functions or a sequence-valued function. In the former case, the function is of the form ℤ₀⁺ → (IR →˚ IR). In the latter case, the form is IR → (ℤ₀⁺ →˚ IR), or more precisely, IR → List(IR). (See Definition 6.11.3 and Notation 6.11.4 for partially defined functions. See Definition 7.12.5 for extended list spaces.)

A sequence-valued function representation of higher-order derivatives is clumsy. In this representation, a sequence of higher-order derivatives is attached to each point of IR. This is unnatural in the sense that each derivative f^{(k)}(x) is defined in terms of the function values f^{(k−1)}(t) for t in a neighbourhood of x, not just the value f^{(k−1)}(x). On the other hand, the sequence-valued function representation has the advantage that it forces the values f^{(ℓ)}(x) to be defined for ℓ < k if f^{(k)}(x) is defined. However, the trade-off seems to favour the “sequence of partially defined functions” style of representation as in Definition 18.4.4. (A “list-valued function” style of representation is presented in Remark 18.4.9.)

18.4.3 Remark: Definition 18.4.4 is expressed in terms of partially defined functions. This is because the set of partially defined functions is closed under differentiation, whereas a set of functions with a fixed domain is not. (This is illustrated by the example in Figure 18.4.1.)

[Figure 18.4.1: Higher derivatives may be partially defined. Four panels show f(x), f′(x), f″(x) and f‴(x), each plotted against x.]
18.4.4 Definition: The sequence of higher order derivatives of a partially defined real function f : IR →˚ IR is the sequence D̄f : ℤ₀⁺ → (IR →˚ IR) which is defined inductively by the rules:

(i) (D̄f)₀ = f.
(ii) For all k ∈ ℤ⁺, for all x ∈ IR, (D̄f)_k(x) = (d/dt)(D̄f)_{k−1}(t)|_{t=x} if x ∈ Int(Dom((D̄f)_{k−1})) and (D̄f)_{k−1} is differentiable at x; otherwise (D̄f)_k(x) is undefined.

18.4.5 Remark: Before using the sequences in Definition 18.4.4 to define individual higher derivatives and differentiability, it’s a good idea to make sure that the sequences are always well-defined. The partially defined functions f : IR →˚ IR include the functions f : U → IR for open sets U ∈ Top(IR) as special cases. But Dom(f) is not necessarily an open set for all f : IR →˚ IR.

It is clear that condition (i) is always well-defined. The zeroth element of the sequence D̄f is simply the function f itself.

The value (D̄f)_k(x) is defined in condition (ii) if and only if

(1) (D̄f)_{k−1}(t) is defined for t ∈ (x − δ, x + δ) for some δ > 0 and
(2) (D̄f)_{k−1} is differentiable at x.

Condition (1) is the same as saying that x is in the topological interior Int(Dom((D̄f)_{k−1})) of the domain Dom((D̄f)_{k−1}) of the partially defined function (D̄f)_{k−1}. (See Definition 14.4.1 for the topological interior of a set.)

Conditions (1) and (2) are unambiguous propositions which are either true or false. So the inductive rules (i) and (ii) in Definition 18.4.4 are well-defined propositions. In other words, if the partially defined function (D̄f)_{k−1} is well defined, then the partially defined function (D̄f)_k is also well defined. It follows that the infinite sequence D̄f is well defined for any partially defined function f : IR →˚ IR. Therefore Definition 18.4.4 may be safely used in other definitions.

18.4.6 Remark: Definition 18.4.4 implies that Dom((D̄f)_k) ⊆ Int(Dom((D̄f)_{k−1})) for all k ∈ ℤ⁺. Hence Dom((D̄f)_k) ⊆ Int(Dom(f)) for all k ∈ ℤ⁺. In other words, positive orders of differentiability are defined on the interior of the domain of f at most. So one may as well restrict the definitions of derivatives to functions whose domains are open sets. This is exactly what many textbooks do. However, generalizing the definitions of derivatives to arbitrary domains sometimes saves a lot of tedious formal argument in applications.
18.4.7 Definition: A function f : IR →˚ IR is k-times differentiable at x for x ∈ Dom(f) when x ∈ Dom((D̄f)_k), where D̄f is the sequence of higher-order derivatives in Definition 18.4.4.

If f is k-times differentiable at x ∈ Dom(f), then the kth derivative of f at x is the value (D̄f)_k(x). Otherwise the kth derivative of f at x is said to be “undefined”.

A function f : IR →˚ IR is k-times differentiable when Dom(f) ⊆ Dom((D̄f)_k).

18.4.8 Remark: Definition 18.4.7 is equivalent to the more usual definition, which is that f is k-times differentiable when all derivatives f^{(j)}(t) are defined for j < k for t in a neighbourhood of x and f^{(k−1)} is differentiable at x.

The set inclusion condition Dom(f) ⊆ Dom((D̄f)_k) for the k-times differentiability of f is, of course, the same as the equality Dom(f) = Dom((D̄f)_k) because the reverse inclusion is automatic.

18.4.9 Remark: As mentioned in Remark 18.4.2, the sequence of higher-order derivatives of a real function may also be represented as a list-valued function of the real numbers, namely as a function D*f : IR → List(IR), where the list space List(IR) is given by Definition 7.12.5:

List(IR) = IR^ω ∪ ⋃_{k∈ℤ₀⁺} IR^k.

This representation is the transpose (in the sense of Remark 6.12.3) of the sequence-of-functions representation in Definition 18.4.4.

∀x ∈ IR, ∀k ∈ ℤ₀⁺, (D*f)(x)_k = (D̄f)_k(x). (18.4.1)

Equation (18.4.1) is to be understood in the “partially defined” sense. In other words, the right-hand side is undefined if and only if the left-hand side is undefined. By Remark 18.4.6, it follows that the domains of the sequences (D*f)(x) are contiguous subsets of ℤ₀⁺ which include 0 for all x ∈ IR. In other words, (D*f)(x) ∈ List(IR) for all x ∈ IR.
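The two representations of higher-order derivatives can be compared on a small hand-worked case. The Python sketch below is an illustration under assumptions which are not in the text: the example function f(x) = x|x| is chosen here (it is not claimed to be the function in Figure 18.4.1), its derivatives are coded by hand only up to order 3, and `None` is used to mean “undefined”. For this f, the first derivative is 2|x| on all of IR, the second derivative is ±2 away from 0 and undefined at 0, and the third derivative is 0 away from 0.

```python
from typing import Callable, List, Optional

# A partially defined real function returns None where it is undefined.
PartialFunction = Callable[[float], Optional[float]]

def d0(x: float) -> Optional[float]:
    return x * abs(x)                      # f itself, defined everywhere

def d1(x: float) -> Optional[float]:
    return 2.0 * abs(x)                    # first derivative, defined everywhere

def d2(x: float) -> Optional[float]:
    if x == 0.0:
        return None                        # 2|x| is not differentiable at 0
    return 2.0 if x > 0.0 else -2.0

def d3(x: float) -> Optional[float]:
    return None if x == 0.0 else 0.0       # 0 is not in the domain of d2

# Sequence-of-partial-functions representation, truncated at order 3.
SEQUENCE: List[PartialFunction] = [d0, d1, d2, d3]

def derivative_list(x: float) -> List[float]:
    """List-valued representation at x: values of orders 0, 1, 2, ... until undefined."""
    values: List[float] = []
    for dk in SEQUENCE:
        value = dk(x)
        if value is None:
            break
        values.append(value)
    return values

if __name__ == "__main__":
    for x in (-1.0, 0.0, 1.0):
        print(x, derivative_list(x))
```

The list attached to x = 0 is shorter than the lists attached to other points, which reflects the shrinking domains of the successive partial functions.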
18.4.10 Notation: C^k(U) for an open subset U of IR and k ∈ ℤ₀⁺ denotes the set of functions f : U → IR for which the rth derivative d^r f(x)/dx^r is well-defined for all r ≤ k and the kth derivative function x ↦ d^k f(x)/dx^k is continuous for x ∈ U.

C^∞(U) for an open subset U of IR denotes the set ⋂_{k=0}^∞ C^k(U).

18.4.11 Theorem: Let U ∈ Top(IR) and g ∈ C¹(U). Let f ∈ C¹(V) for some set V ∈ Top(IR) which satisfies Range(g) ⊆ V. Define h : U → IR by h : x ↦ f(g(x)). Then h ∈ C¹(U) and h′(x) = f′(g(x))g′(x) for all x ∈ U.

18.4.12 Remark: The higher-order composition rules for differentiation with a single variable may be generalized from Theorem 18.4.11 as follows.

h″ = f″(g′)² + f′g″
h‴ = f‴(g′)³ + 3f″g″g′ + f′g‴
h⁽⁴⁾ = f⁽⁴⁾(g′)⁴ + 6f‴g″(g′)² + 3f″(g″)² + 4f″g‴g′ + f′g⁽⁴⁾,

and so forth, where the derivatives of f are evaluated at g(x) and the derivatives of g and h are evaluated at x. It is assumed that g ∈ C^k(U) and f ∈ C^k(V), where k is the order of the derivative of h.

[ Show that the second order derivative can be calculated as lim_{h→0} h⁻²(f(x − h) + f(x + h) − 2f(x)), or something like that. Also provide such expressions for higher-order derivatives. This kind of calculation is very much in the style of the “parameter estimator” convergence notion discussed in Remarks 18.2.3 and 18.2.4. It might even be more natural to define higher-order derivatives as these kinds of “parameter estimators” than to take derivatives of derivatives. One can even visualize these estimators in terms of parabolas etc. It is also possible to change the weights for points in the estimates to make them lop-sided, and to parametrize with respect to the independent variable differently. In particular, one-sided derivatives of any order may be defined like this. ]
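The second-order composition rule h″ = f″(g′)² + f′g″ of Remark 18.4.12 can be checked numerically on a simple case. The Python sketch below is an illustration only: the choices f = exp and g = sin, the function names, and the step size are assumptions made here. It compares the value given by the rule with a symmetric second difference quotient of h = f ∘ g.

```python
import math

def h_second_by_rule(x):
    """h'' at x from h'' = f''(g')^2 + f' g'' with f = exp and g = sin."""
    gx = math.sin(x)
    f1 = math.exp(gx)      # f'(g(x)), since exp is its own derivative
    f2 = math.exp(gx)      # f''(g(x))
    g1 = math.cos(x)       # g'(x)
    g2 = -math.sin(x)      # g''(x)
    return f2 * g1 ** 2 + f1 * g2

def h_second_by_differences(x, step=1e-4):
    """Symmetric second difference quotient of h(t) = exp(sin(t)) at t = x."""
    def h(t):
        return math.exp(math.sin(t))
    return (h(x - step) - 2.0 * h(x) + h(x + step)) / (step * step)

if __name__ == "__main__":
    for x in (0.0, 0.7, 2.0):
        print(x, h_second_by_rule(x), h_second_by_differences(x))
```

The two columns are expected to agree to several decimal places; the residual difference is a mixture of truncation error and floating-point cancellation in the difference quotient.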
18.5. Differentiation for several variables

18.5.1 Remark: The definitions of differentiability for functions of several variables are not a straightforward generalization from the single variable situation. In fact, there is some substantial complexity. This complexity is not of purely technical interest. Models which arise in physics often have solutions which are functions of several variables with limited differentiability. In other words, not all orders of derivatives are defined. Therefore one may ask which order of derivative is the highest order which is defined. If this order is k, say, then clearly the order k + 1 derivative is not defined. Numerous definitions of limited differentiability may be interpolated between order k and order k + 1. Many apparently paradoxical relations occur between these definitions. For example, a function whose partial derivatives are defined everywhere in an open domain might not be continuous at all points of the domain.

In practical situations in physics, especially for boundary value problems, the domain is very often not an open set. In this case, differentiability on the boundary of the set must be defined. In PDE analysis, this question is avoided by requiring a function to be extendable to an open superset of the domain. This does not make sense in many physics models. Therefore differentiability on boundary points must be defined. In this case, the situation becomes very much more complicated than even the interior point case.

It follows from these considerations that the technical difficulties of differentiating functions of several variables cannot be avoided if applicability of differential geometry to practical problems in physics is desired.

[ Should define limits for real functions on IRⁿ? ]

18.5.2 Remark: Figure 18.5.1 shows some relations between differentiability properties for real-valued functions of several real variables.

[Figure 18.5.1: Relations between differentiability properties for f : Ω → IR, Ω ∈ Top(IRⁿ), n ∈ ℤ⁺. The diagram relates the properties: f continuous in Ω; f partially differentiable in Ω; f directionally differentiable in Ω; f totally differentiable in Ω; f continuously partially differentiable in Ω; f continuously directionally differentiable in Ω; f continuously totally differentiable in Ω.]

18.5.3 Remark: Definition 18.2.5 for differentiability of a real-valued function of a real variable may be generalized in many different ways to functions from IRⁿ to IRᵐ. The most restrictive natural generalization requires the existence of a “total differential” at a point in IRⁿ. This is given by Definition 18.5.17. A much easier test to verify is the partial differentiability property in Definition 18.5.4.

18.5.4 Definition: A function f : U → IRᵐ for U ∈ Top(IRⁿ) with m, n ∈ ℤ₀⁺ is said to be partially differentiable at p ∈ U if

∀i ∈ ℕ_n, ∃w ∈ IRᵐ, ∀ε > 0, ∃δ > 0, ∀t ∈ (−δ, δ), p + te_i ∈ U ⇒ |f(p + te_i) − f(p) − tw| ≤ ε|t|. (18.5.1)

A function f : U → IRᵐ for U ∈ Top(IRⁿ) is said to be partially differentiable on U if it is partially differentiable at all points p ∈ U.

[ Definition of partial derivatives ∂f(x)/∂x_i. ]

[ Show that the partial derivative values are unique at each point. ]

18.5.5 Remark: Partially differentiable functions are not necessarily continuous. This is proved by Examples 18.5.6 and 18.5.7. Since all totally differentiable functions are continuous, it follows that the discontinuous Examples 18.5.6 and 18.5.7 are not totally differentiable. Hence partial differentiability everywhere does not imply total differentiability.
18.5.6 Example: Define f : IR² → IR by

f(x) = 2x₁x₂/(x₁² + x₂²) if x ≠ (0, 0),
f(x) = 0 if x = (0, 0).

Then f is partially differentiable on IR², but f is not continuous at (0, 0) ∈ IR². This function is not directionally differentiable at (0, 0). The level curves of f are illustrated in Figure 18.5.2. Note that |f(x)| ≤ 1 for all x ∈ IR².

[Figure 18.5.2: Level curves of f(x) = 2x₁x₂/(x₁² + x₂²), f : IR² → IR, in the (x₁, x₂) plane, with level values between −1 and 1.]
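The behaviour of the function f in Example 18.5.6 near the origin can be inspected numerically. The Python sketch below is illustrative only; the function and variable names are choices made here. It shows that the difference quotients along both coordinate axes at (0, 0) vanish, whereas f equals 1 at every point of the diagonal x₁ = x₂ other than the origin, so f(x) does not tend to f(0, 0) = 0 along the diagonal.

```python
def f(x1, x2):
    """f(x) = 2 x1 x2 / (x1^2 + x2^2) away from the origin, and 0 at the origin."""
    if x1 == 0.0 and x2 == 0.0:
        return 0.0
    return 2.0 * x1 * x2 / (x1 * x1 + x2 * x2)

def axis_quotient(axis, t):
    """Difference quotient of f at the origin along coordinate axis 1 or 2, for t != 0."""
    point = (t, 0.0) if axis == 1 else (0.0, t)
    return (f(*point) - f(0.0, 0.0)) / t

if __name__ == "__main__":
    for t in (1e-1, 1e-4, 1e-8):
        # Axis quotients vanish, but the diagonal values do not approach f(0, 0).
        print(t, axis_quotient(1, t), axis_quotient(2, t), f(t, t))
```

The diagonal quotient (f(t, t) − f(0, 0))/t = 1/t is unbounded as t → 0, which is consistent with the statement that f is not directionally differentiable at (0, 0).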
18.5.7 Example: Let h : ℤ⁺ → ℚ² be an enumeration of the set ℚ² of elements of IR² which have rational components. Define φ : IR² → IR by

φ(x) = Σ_{k∈ℤ⁺} 2^{−k} f(x − h(k))

for all x ∈ IR², where f is as defined in Example 18.5.6 and x − a denotes the element (x₁ − a₁, x₂ − a₂) of IR² for all a ∈ IR². Then φ is well defined and partially differentiable on IR², but φ is discontinuous at all points a ∈ ℚ². Thus φ is discontinuous on a dense subset of its domain IR². Note that |φ(x)| ≤ 1 for all x ∈ IR². (For all points a ∈ ℚ², the function φ is also not directionally differentiable at a.)

[ There are some minor technicalities in Example 18.5.7. If 3^{−k} is used instead of 2^{−k}, the proof of non-continuity at all points of ℚ² is a little easier. The proof of the properties of Examples 18.5.6 and 18.5.7 should be given in exercises. Similar comments apply to Examples 18.5.11 and 18.5.12. ]

[ There should be a whole section on enumerations of sets like ℚⁿ for n ∈ ℤ⁺ and subsets of these sets. ]
18.5.8 Definition: A function f : U → IRm for U ∈ Top(IRn ) with m, n ∈ differentiable at p ∈ U if
∀v ∈ IRn , ∃w ∈ IRm , ∀ε > 0, ∃δ > 0, ∀t ∈ (−δ, δ), p + tv ∈ U ⇒ |f (p + tv) − f (p) − tw| ≤ ε|t|. m
Z+0 is said to be directionally
(18.5.2)
n
A function f : U → IR for U ∈ Top(IR ) is said to be directionally differentiable on U if it is directionally differentiable at all points p ∈ U .
[ Definition of directional derivatives ∂v f = lima→0 (f (x + av) − f (x))/a for v ∈ IRn . Also define one-sided directional derivatives. See Rudin [136], 9.10, page 188–192 and 5.16, page 96. See also EDM2 [34], 106.G, page 396 for directional derivatives. See also Section 19.6. ] [ Show that w in line (18.5.2) is unique for each p and v. Show the relation between the directional and partial derivatives, namely that the partial derivatives are special cases of the directional derivatives. Given an example to show that w is not necessarily a linear function of v. Give an example to show that w may be an unbounded function of v for a given fixed p. ] [ www.topology.org/tex/conc/dg.html ]
[ draft: UTC 2009–3–21 Saturday 11:36 ]
Differential geometry reconstructed.Copyright (C) 2009, Alan U. Kennington. All Rights Reserved.The author hereby grants permission to print this book draft in A4 format.Printing in all other formats is forbidden.You may not charge any fee for copies of this book draft.
Figure 18.5.2
18.5.9 Remark: It is easy to show that the $m$-tuple $w$ in Definition 18.5.8 is uniquely determined by the $n$-tuple $v$ for each point $p \in U$. Therefore $w \in \mathbb{R}^m$ is a well-defined function of $v \in \mathbb{R}^n$ for each fixed $p \in U$. Denote this well-defined function by $\phi_p : \mathbb{R}^n \to \mathbb{R}^m$. It is immediately clear from Definition 18.5.8 that $\phi_p(\lambda v) = \lambda \phi_p(v)$ for any $v \in \mathbb{R}^n$ and $\lambda \in \mathbb{R} \setminus \{0\}$. Thus each function $\phi_p$ is linear on one-dimensional linear subspaces of the linear space $\mathbb{R}^n$. However, it is certainly not true in general that $\phi_p$ is linear on the whole of $\mathbb{R}^n$. (Remark 18.5.15 is the unidirectional analogue of this remark.)
18.5.10 Remark: Examples 18.5.11 and 18.5.12 give discontinuous functions which are everywhere directionally differentiable. These are analogous to Examples 18.5.6 and 18.5.7 respectively, which are merely everywhere partially differentiable. Since all totally differentiable functions are continuous, it follows that the discontinuous Examples 18.5.11 and 18.5.12 are not totally differentiable. Hence directional differentiability everywhere does not imply total differentiability.
18.5.11 Example: Define $g : \mathbb{R} \to \mathbb{R}$ by $g(t) = t \exp\bigl((1 - t^2)/2\bigr)$ for all $t \in \mathbb{R}$. (See Figure 18.5.3.) Then $g$ is a $C^\infty$ function on $\mathbb{R}$, $g(1) = 1$ and $|g(t)| \le 1$ for all $t \in \mathbb{R}$.
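Define $f : \mathbb{R}^2 \to \mathbb{R}$ by
$$\forall x \in \mathbb{R}^2,\quad f(x) = \begin{cases} g(x_1^2 / x_2) & x_2 \ne 0 \\ 0 & x_2 = 0 \end{cases} \;=\; \begin{cases} x_1^2 x_2^{-1} \exp\bigl((1 - x_1^4 x_2^{-2})/2\bigr) & x_2 \ne 0 \\ 0 & x_2 = 0. \end{cases}$$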
Then $f$ is directionally differentiable on $\mathbb{R}^2$, but $f$ is not continuous at $(0, 0) \in \mathbb{R}^2$, because $f(s, s^2) = g(1) = 1$ for all $s \ne 0$ whereas $f(0, 0) = 0$. Consequently $f$ is not totally differentiable at $(0, 0)$. Note that $|f(x)| \le 1$ for all $x \in \mathbb{R}^2$. The level curves of this function are illustrated in Figure 18.5.4.
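The behaviour of the function $f$ of Example 18.5.11 near the origin can be probed numerically. The following is only a rough sketch, not part of the original text: the helper names are hypothetical, and the predicted value $\sqrt{e}\,v_1^2/v_2$ for the directional derivative at the origin (for $v_2 \ne 0$) comes from the first-order approximation $g(u) \approx \sqrt{e}\,u$ for small $u$.

```python
import math

def g(t):
    # g(t) = t exp((1 - t^2)/2) as in Example 18.5.11
    return t * math.exp((1.0 - t * t) / 2.0)

def f(x1, x2):
    # f(x) = g(x1^2 / x2) for x2 != 0, and 0 on the line x2 = 0
    if x2 == 0.0:
        return 0.0
    return g(x1 * x1 / x2)

# Along the parabola x2 = x1^2 the value stays at g(1) = 1,
# although f(0, 0) = 0, so f cannot be continuous at the origin.
parabola_values = [f(s, s * s) for s in (1e-1, 1e-3, 1e-6)]

def directional_quotient(v1, v2, t):
    # difference quotient (f(t v) - f(0)) / t at the origin
    return (f(t * v1, t * v2) - f(0.0, 0.0)) / t

v = (1.0, 2.0)
quotient = directional_quotient(v[0], v[1], 1e-6)
# assumed closed form for the limit, valid when v2 != 0
predicted = math.sqrt(math.e) * v[0] ** 2 / v[1]

print(parabola_values, quotient, predicted)
```

If the predicted formula is right, the directional derivative at the origin is not a linear function of $v$ and is unbounded on the unit circle as $v_2 \to 0$, which is consistent with the failure of total differentiability.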
18.5.12 Example: As in Example 18.5.7, let $h : \mathbb{Z}^+ \to \mathbb{Q}^2$ be an enumeration of the set $\mathbb{Q}^2$ of elements of $\mathbb{R}^2$ which have rational components. Define $\phi : \mathbb{R}^2 \to \mathbb{R}$ by
$$\phi(x) = \sum_{k \in \mathbb{Z}^+} 2^{-k} f(x - h(k))$$
for all $x \in \mathbb{R}^2$, where $f$ is as defined in Example 18.5.11 and $x - a$ denotes the element $(x_1 - a_1, x_2 - a_2)$ of $\mathbb{R}^2$ for all $a \in \mathbb{R}^2$. Then $\phi$ is well defined and directionally differentiable on $\mathbb{R}^2$, but $\phi$ is discontinuous at all points $a \in \mathbb{Q}^2$. Thus $\phi$ is discontinuous on a dense subset of its domain $\mathbb{R}^2$. Note that $|\phi(x)| \le 1$ for all $x \in \mathbb{R}^2$. (For all points $a \in \mathbb{Q}^2$, the function $\phi$ is also not totally differentiable at $a$.)
18.5.13 Remark: Definition 18.5.14 is a one-sided derivative version of Definition 18.5.8.
18.5.14 Definition: A function $f : U \to \mathbb{R}^m$ for $U \in \mathrm{Top}(\mathbb{R}^n)$ with $m, n \in \mathbb{Z}^+_0$ is said to be unidirectionally differentiable at $p \in U$ if
$$\forall v \in \mathbb{R}^n,\ \exists w \in \mathbb{R}^m,\ \forall \varepsilon > 0,\ \exists \delta > 0,\ \forall t \in (0, \delta),\quad p + tv \in U \Rightarrow |f(p + tv) - f(p) - tw| \le \varepsilon |t|. \qquad (18.5.3)$$
A function $f : U \to \mathbb{R}^m$ for $U \in \mathrm{Top}(\mathbb{R}^n)$ is said to be unidirectionally differentiable on $U$ if it is unidirectionally differentiable at all points $p \in U$.
Figure 18.5.3: The function $g : \mathbb{R} \to \mathbb{R}$ with $g : t \mapsto t \exp\bigl((1 - t^2)/2\bigr)$
Figure 18.5.4: Level curves of $f(x) = x_1^2 x_2^{-1} \exp\bigl((1 - x_1^4 x_2^{-2})/2\bigr)$, $f : \mathbb{R}^2 \to \mathbb{R}$, shown for the values $g(\pm\tfrac12)$, $g(\pm 1)$, $g(\pm 2)$ and $0$
18.5.15 Remark: Just as in Remark 18.5.9, it is easy to show that the $m$-tuple $w$ in Definition 18.5.14 is uniquely determined by the $n$-tuple $v$ for each point $p \in U$. Therefore $w \in \mathbb{R}^m$ is a well-defined function of $v \in \mathbb{R}^n$ for each fixed $p \in U$. Denote this well-defined function by $\phi_p : \mathbb{R}^n \to \mathbb{R}^m$. It is immediately clear from Definition 18.5.14 that $\phi_p(\lambda v) = \lambda \phi_p(v)$ for any $v \in \mathbb{R}^n$ and $\lambda \in (0, \infty)$. Thus each function $\phi_p$ is linear on one-dimensional directed linear subspaces $S_v^+ = \{\lambda v;\ \lambda \in \mathbb{R}^+_0\}$ of the linear space $\mathbb{R}^n$ for $v \in \mathbb{R}^n$. However, it is not true in general that $\phi_p$ is linear on linear subspaces $S_v = \{\lambda v;\ \lambda \in \mathbb{R}\}$ of $\mathbb{R}^n$. It is a fortiori not true in general that $\phi_p$ is linear on the whole of $\mathbb{R}^n$.
[ Show the uniqueness of unidirectional derivatives. Show that the directional derivative in a particular direction is well defined if and only if the two corresponding unidirectional derivatives are well defined and equal. ]
18.5.16 Remark: Since a directionally differentiable function is necessarily unidirectionally differentiable, it follows that Examples 18.5.11 and 18.5.12 demonstrate the existence of discontinuous functions which are everywhere unidirectionally differentiable.
18.5.17 Definition: A function $f : U \to \mathbb{R}^m$ for $U \in \mathrm{Top}(\mathbb{R}^n)$ with $m, n \in \mathbb{Z}^+_0$ is said to be totally differentiable at $p \in U$ if
$$\exists L \in \mathrm{Lin}(\mathbb{R}^n, \mathbb{R}^m),\ \forall \varepsilon > 0,\ \exists \delta > 0,\ \forall v \in B_{0,\delta},\quad p + v \in U \Rightarrow |f(p + v) - f(p) - L(v)| \le \varepsilon |v|. \qquad (18.5.4)$$
A function $f : U \to \mathbb{R}^m$ for $U \in \mathrm{Top}(\mathbb{R}^n)$ is said to be totally differentiable on $U$ if it is totally differentiable at all points $p \in U$.
If a function $f : U \to \mathbb{R}^m$ is totally differentiable at a point $p \in U$, the total differential of $f$ at $p$ is the linear map $L$ in equation (18.5.4).
If a function $f : U \to \mathbb{R}^m$ is totally differentiable on $U \in \mathrm{Top}(\mathbb{R}^n)$, the total differential of $f$ is the function $df : U \to \mathrm{Lin}(\mathbb{R}^n, \mathbb{R}^m)$ defined so that for all $p \in U$, $df(p)$ is the total differential of $f$ at $p$.
[ Show the uniqueness of the total differential. Give an example where the total differential is not continuous with respect to the base point $p$. ]
18.5.18 Remark: The total differential of a function $\gamma : \mathbb{R} \to \mathbb{R}^m$ may be visualized as a tangent vector to a curve. In the case of a function $f : \mathbb{R}^n \to \mathbb{R}$, the total differential may be visualized as a tangent plane to constant-value contours of $f$. Figure 19.2.4 gives a rough impression of such tangent vectors and tangent planes.
18.5.19 Theorem: If a function $f : U \to \mathbb{R}^m$ for $U \in \mathrm{Top}(\mathbb{R}^n)$ with $m, n \in \mathbb{Z}^+_0$ is totally differentiable at $p \in U$, then $f$ has two-sided directional derivatives in every direction at $p$.
18.5.20 Theorem: If a function $f : U \to \mathbb{R}^m$ for $U \in \mathrm{Top}(\mathbb{R}^n)$ with $m, n \in \mathbb{Z}^+_0$ has continuous partial derivatives on $U$, then $f$ is totally differentiable on $U$.
[ Theorem 18.5.20 implies that the directional derivatives can be replaced with a linear combination of partial derivatives. I.e. $\partial_v f = v^i \partial_i f$. Present an example which shows that the continuity is required. ]
[ Show that if the total differential is defined at a point, then the directional derivatives (and unidirectional derivatives) are well defined and are linear functions of the direction vector. Give an example to show that the converse is not true. In other words, linearity of directional derivatives does not guarantee total differentiability. Examples 18.5.11 and 18.5.12 demonstrate this in fact. ]
[ Give the example of $f(x, y) = x^3 (x^2 + y^2)^{-1}$ which has directional derivatives everywhere but is not differentiable in the limiting linear sense. ]
[ Should include the implicit function theorem and the inverse function theorem here for use in Section 34.1. ]
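The example $f(x, y) = x^3 (x^2 + y^2)^{-1}$ suggested in the notes above can be explored with a few lines of code. This is a minimal sketch under the assumption that $f$ is extended by $f(0, 0) = 0$, which the note does not state; the function names are hypothetical. Since $f(ta, tb) = t\,a^3/(a^2 + b^2)$, the directional derivative at the origin in the direction $(a, b)$ should be $a^3/(a^2 + b^2)$, which is not linear in $(a, b)$.

```python
def f(x, y):
    # f(x, y) = x^3 / (x^2 + y^2), extended by f(0, 0) = 0 (an assumption)
    if x == 0.0 and y == 0.0:
        return 0.0
    return x ** 3 / (x * x + y * y)

def directional_derivative_at_origin(a, b, t=1e-6):
    # symmetric difference quotient along the direction (a, b)
    return (f(t * a, t * b) - f(-t * a, -t * b)) / (2.0 * t)

d_e1 = directional_derivative_at_origin(1.0, 0.0)  # partial derivative in x
d_e2 = directional_derivative_at_origin(0.0, 1.0)  # partial derivative in y
d_v = directional_derivative_at_origin(1.0, 1.0)   # direction v = (1, 1)

# If the directional derivative were linear in v, it would equal this sum.
linear_prediction = 1.0 * d_e1 + 1.0 * d_e2

print(d_e1, d_e2, d_v, linear_prediction)
```

The mismatch between `d_v` and `linear_prediction` is the numerical signature of a function whose directional derivatives exist in every direction at a point without combining into a linear map there.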
18.6. Higher-order derivatives for several variables
[ The definitions for multi-indices and factorial functions should be in the number chapters. ]
18.6.1 Definition: A multi-index is an element of $(\mathbb{Z}^+)^k$ for some $k \in \mathbb{Z}^+_0$.
An addition operation is defined on multi-indices in $(\mathbb{Z}^+)^k$ by $\alpha + \beta = (\alpha_1 + \beta_1, \ldots, \alpha_k + \beta_k)$, where $\alpha = (\alpha_1, \ldots, \alpha_k)$ and $\beta = (\beta_1, \ldots, \beta_k)$.
A length function is defined on $(\mathbb{Z}^+)^k$ as $[\alpha] = \sum_{i=1}^k \alpha_i$ for all $\alpha \in (\mathbb{Z}^+)^k$.
A factorial function is defined on $(\mathbb{Z}^+)^k$ by $\alpha! = \prod_{i=1}^k \alpha_i!$ for all $\alpha \in (\mathbb{Z}^+)^k$.
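The three operations on multi-indices in Definition 18.6.1 are simple enough to spell out in code. The sketch below represents a multi-index as a Python tuple of integers; the function names are illustrative assumptions and do not appear in the text.

```python
from math import factorial, prod

def multi_index_add(alpha, beta):
    # componentwise sum; both multi-indices must have the same number of components
    if len(alpha) != len(beta):
        raise ValueError("multi-indices must have the same number of components")
    return tuple(a + b for a, b in zip(alpha, beta))

def multi_index_length(alpha):
    # length function: the sum of the components
    return sum(alpha)

def multi_index_factorial(alpha):
    # factorial function: the product of the component factorials
    return prod(factorial(a) for a in alpha)

alpha = (2, 1, 3)
beta = (1, 1, 1)
total = multi_index_add(alpha, beta)
length = multi_index_length(alpha)
fact = multi_index_factorial(alpha)
print(total, length, fact)
```

For $\alpha = (2, 1, 3)$ and $\beta = (1, 1, 1)$ this gives $\alpha + \beta = (3, 2, 4)$, length $6$ and $\alpha! = 2! \cdot 1! \cdot 3! = 12$.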
18.6.2 Definition: The $\alpha$th derivative of a function $f : U \to \mathbb{R}$ for an open set $U \subseteq \mathbb{R}^n$ and $\alpha \in (\mathbb{N}_n)^k$ for $k \in \mathbb{Z}^+_0$ is. . .
[ Define $C^r$, $C^\infty$ and analytic functions of several variables here. For $C^k$ functions, for example, want something like $\forall \alpha \in (\mathbb{Z}^+)^n,\ |\alpha| \le k \Rightarrow D^\alpha f \in C^0(U)$. Could maybe recursively define $C^{k+1}(\Omega) = \{f \in C^k(\Omega);\ \forall i \in \mathbb{N}_n,\ \partial_i f \in C^k(\Omega)\}$. See EDM2 [34], 106.K, page 397. ]
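18.6.3 Theorem: Let $f \in C^1(\Omega)$ for some $\Omega \in \mathrm{Top}(\mathbb{R}^n)$ and $n \in \mathbb{Z}^+_0$. Then
$$\forall x \in \Omega,\ \forall v \in \mathbb{R}^n,\quad \sum_{i=1}^n v^i \frac{\partial f(x)}{\partial x^i} = \lim_{a \to 0} \frac{f(x + av) - f(x)}{a}.$$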
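The identity in Theorem 18.6.3 can be checked numerically for a particular smooth function. The following sketch uses an arbitrarily chosen test function and finite differences in place of exact derivatives; both choices are assumptions made only for illustration, and the agreement is approximate.

```python
import math

def f(x):
    # an arbitrary smooth test function on R^2 (chosen only for illustration)
    return math.sin(x[0]) * math.exp(x[1]) + x[0] * x[1] ** 2

def partial(func, x, i, h=1e-6):
    # central-difference approximation of the i-th partial derivative at x
    xp = list(x)
    xm = list(x)
    xp[i] += h
    xm[i] -= h
    return (func(xp) - func(xm)) / (2.0 * h)

def directional_quotient(func, x, v, a):
    # the quotient (f(x + a v) - f(x)) / a from the right-hand side
    shifted = [xi + a * vi for xi, vi in zip(x, v)]
    return (func(shifted) - func(x)) / a

x = [0.3, -0.7]
v = [2.0, 1.5]
lhs = sum(v[i] * partial(f, x, i) for i in range(len(x)))
rhs = directional_quotient(f, x, v, 1e-6)
print(lhs, rhs)
```

The two printed values should agree to several decimal places; the residual difference reflects the finite step size rather than any failure of the theorem.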
[ Discuss examples where the partial derivatives exist but do not commute. E.g. see Rudin [136], page 221. ]
[ Show that $u \in C^2(\Omega)$ and $p \in \Omega$ a local maximum of $u$ implies that $a^{ij} u_{ij} \le 0$ at $p$. ]
18.6.4 Theorem: The composition rules for differentiation with several variables are as follows.
$$h_i = f_j \phi^j{}_i$$
$$h_{ij} = f_{k\ell}\, \phi^k{}_i \phi^\ell{}_j + f_k \phi^k{}_{ij}$$
$$h_{ijk} = f_{\ell m n}\, \phi^\ell{}_i \phi^m{}_j \phi^n{}_k + f_{\ell m} (\phi^\ell{}_{ik} \phi^m{}_j + \phi^\ell{}_{jk} \phi^m{}_i + \phi^\ell{}_{ij} \phi^m{}_k) + f_\ell \phi^\ell{}_{ijk},$$
and so forth, where $h(x) = f(\phi(x))$ for $x \in \mathbb{R}^d$, and $\phi : \mathbb{R}^d \to \mathbb{R}^d$ is a one-to-one differentiable function. It is assumed that $f$ and $\phi$ are $C^r$, where $r$ is the order of derivative of $h$.
18.6.5 Definition: A $C^k$ (differentiable) curve in $\mathbb{R}^m$ for $m \in \mathbb{Z}^+_0$, $k \in \bar{\mathbb{Z}}^+_0$ and an interval $I \subseteq \mathbb{R}$ is a map $\gamma \in C^0(I, \mathbb{R}^m)$ such that $\gamma\big|_{\mathrm{Int}(I)}$ is of class $C^k$.
18.6.6 Remark: Definition 18.6.5 means that a map γ : I → IRm is a C k curve if and only if it is continuous on the whole interval I and C k on the interior of I.
18.7. Some differentiability-based function spaces
18.7.1 Remark: The linear spaces which are most used in differential geometry are tuple spaces and function spaces. The tuple spaces are mostly finite-dimensional, typically real-valued. The functions in the function spaces are typically real-valued and defined on finite-dimensional tuple space domains. Notation 18.7.2 is an example of these kinds of function spaces.
18.7.2 Notation: $C^k(\Omega)$ denotes $\{f : \Omega \to \mathbb{R};\ f \text{ is } C^k\}$ for $k \in \bar{\mathbb{Z}}^+_0$, $\Omega \in \mathrm{Top}(\mathbb{R}^n)$ and $n \in \mathbb{Z}^+_0$.
18.7.3 Notation: $C^k(\Omega, \mathbb{R}^m)$ denotes $\{f : \Omega \to \mathbb{R}^m;\ f \text{ is } C^k\}$ for $\Omega \in \mathrm{Top}(\mathbb{R}^n)$, $m, n \in \mathbb{Z}^+_0$ and $k \in \bar{\mathbb{Z}}^+_0$.
[ Also define such spaces as $\mathring{C}^i(\Omega, \mathbb{R}^n)$. See Definition 18.8.1. ]
18.7.4 Remark: The notation $C^\omega$ is not used for analytic functions in this book because of the ambiguity caused by the notation $\omega$ for the set of finite ordinal numbers, which is sometimes taken as the definition of the countable infinity $\infty$. Analytic functions are not very useful in applications of differential geometry to general relativity. Analytic functions are completely determined globally by values on an arbitrarily small neighbourhood of an arbitrary point. This is not realistic for macroscopic physical systems. Therefore analytic functions are not emphasized in this book.
[ Maybe could invent some notation $A(\mathbb{R}^n)$ or $\mathbb{A}(\mathbb{R}^n)$ for the analytic functions? ]
18.7.5 Theorem: Let $n \in \mathbb{Z}^+_0$, $k \in \mathbb{Z}^+_0$, $\Omega \in \mathrm{Top}(\mathbb{R}^n)$, $f \in C^k(\Omega)$, $p \in \Omega$ and $\alpha \in (\mathbb{N}_n)^k$. Then $\partial_\alpha f(p) = \partial_{P(\alpha)} f(p)$ for all permutations $P : \mathbb{N}_k \to \mathbb{N}_k$. In other words, the value of the derivative is independent of the order of differentiation.
18.7.6 Remark: See Notation 7.2.33 for the sets of integers $\mathbb{N}_k$.
18.7.7 Remark: There are so many function spaces in analysis, it is sometimes useful to invent a metanotation for them such as $K$. Then one could create templates for definitions such as “manifolds of class $K$” or “class $K$ manifolds”, where $K$ might mean $C^k$, $C^{k,\alpha}$ or analytic. A related generalization of regularity classes is the notion of a “pseudogroup of diffeomorphisms” between open subsets of $\mathbb{R}^n$. This is defined in Section 19.4.
Most of this book is written in terms of $C^k$ manifolds because this level of refinement of regularity is almost always an adequate starting point. Such spaces are readily refined further, for example to Hölder regularity classes $C^{k,\alpha}$ if required. The important thing is to avoid the blanket use of $C^\infty$ spaces which remove all motivation to bring differentiability into consideration. It should not be forgotten that differential geometry is an extension of analysis from flat space to curved space. Analysis is not a minor extension topic of differential geometry. The real business of differential geometry is to solve differential equations. Determining whether a manifold can be stretched into a donut or a sphere is a minor recreational consideration.
[ Define notations for alternating tensor tangent bundles of the form $\Lambda_m T(V, U)$. Also define sets of cross-sections like $X^k(T^{(r,s)}(V))$ and $X^k(\Lambda_m T(V, U))$. ]
[ Define Lie derivatives for flat space somewhere near here. ]
18.8. Differentiation for abstract linear spaces
[ This section needs to be fixed. Maybe it could be generalized to infinite-dimensional linear spaces. ]
Although all finite-dimensional real linear spaces are isomorphic to $\mathbb{R}^n$ for some $n$, there is sometimes a need for differential calculus to be defined for general finite-dimensional spaces.
[ See Malliavin [35], section 2.2 for $C^k(V, W)$ etc. ]
[ Maybe Definition 18.8.1 is not a good definition. Should base Definition 18.8.1 on Definition 18.2.5? ]
18.8.1 Definition: A differentiable function from a normed linear space $V$ to a normed linear space $W$ is a function $f : \Omega \to W$ for an open set $\Omega \in \mathrm{Top}(V)$ such that
$$\forall x \in \Omega,\ \exists \phi \in \mathrm{Lin}(V, W),\ \forall \varepsilon > 0,\ \exists \delta > 0,\ \forall x' \in \Omega,\quad |x' - x|_V < \delta \Rightarrow |f(x') - f(x) - \phi(x' - x)|_W \le \varepsilon |x' - x|_V.$$
[ Is the derivative of a function $f : A \to B$ equivalent to a special case of the covariant derivative on a manifold with a connection? ]
[ Define $C^k$ diffeomorphisms in flat space. See Malliavin [35], section 2.3. ]
[ Follow the notation of Federer here. See Federer [105] 3.1.11 and 3.1.1. ]
[ Define also $C^r(V, W)$. Then get $D^2 f(a) \in \mathrm{Lin}(V, \mathrm{Lin}(V, W))$, etc. ]
[ Also define $\mathring{C}^r(V, W)$. ]
[ Show the relations between the abstract linear space definitions and the $n$-tuple linear space definitions for differentiability. ]
18.9. Hölder continuity
[ Hölder continuity could be defined for general metric spaces, somewhere near the definition of uniformly continuous functions. ]
Hölder continuous functions are defined in terms of the standard norm on $\mathbb{R}^n$ for $n \in \mathbb{Z}^+_0$.
[ Also define Hölder continuity for the general domains $\mathbb{R}^n$. Perhaps generalize also to general ranges $\mathbb{R}^m$. Maybe should even generalize to arbitrary metric spaces for both domain and range. ]
[ Carefully distinguish Hölder continuity definitions according to whether they are local or global, and whether they are uniform or pointwise. ]
18.9.1 Definition: A Hölder-continuous function with exponent $\alpha$ or $\alpha$-Hölder (continuous) function at $x \in S$ for a set $S \subseteq \mathbb{R}^n$ for $n \in \mathbb{Z}^+_0$ and $\alpha \in (0, 1]$ is a function $f : S \to \mathbb{R}$ such that
$$\exists K \in \mathbb{R},\ \forall y \in S,\quad |f(y) - f(x)| \le K |y - x|^\alpha.$$
A uniformly Hölder-continuous function with exponent $\alpha$ or uniformly $\alpha$-Hölder (continuous) function on a set $S \subseteq \mathbb{R}^n$ for $n \in \mathbb{Z}^+_0$ and $\alpha \in (0, 1]$ is a function $f : S \to \mathbb{R}$ such that
$$\exists K \in \mathbb{R},\ \forall x, y \in S,\quad |f(y) - f(x)| \le K |y - x|^\alpha.$$
A locally Hölder-continuous function with exponent $\alpha$ or locally $\alpha$-Hölder (continuous) function on a set $S \subseteq \mathbb{R}^n$ for $n \in \mathbb{Z}^+_0$ and $\alpha \in (0, 1]$ is a function $f : S \to \mathbb{R}$ such that $f$ is uniformly $\alpha$-Hölder continuous on all bounded subsets of $S$.
18.9.2 Remark: As suggested in Figure 18.9.1, the $\alpha$-Hölder continuity conditions with larger values of $\alpha$ are the more restrictive.
Figure 18.9.1: Fractional powers $f(x) = |x|^\alpha$, $\alpha \in (0, 1]$, showing the graphs of $f(x) = \pm|x|^{0.25}$, $\pm|x|^{0.5}$, $\pm|x|^{0.75}$ and $\pm x$
So, for example, all 0.75-Hölder continuous functions are also 0.25-Hölder continuous, but not vice versa. In terms of Notation 18.9.3, this means that $C^{0,3/4}(U) \subseteq C^{0,1/4}(U)$ for any open set $U \subseteq \mathbb{R}^n$. The most restrictive $\alpha$-Hölder condition (i.e. the condition with the greatest regularity) is the 1-Hölder condition, which is the same as the Lipschitz property on bounded sets.
[ Check the distinction between local and global Hölder continuity definitions in Remark 18.9.2 and Notation 18.9.3. ]
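The asymmetry between the exponents 0.75 and 0.25 can be seen by sampling Hölder quotients near the origin. The sketch below is only a numerical illustration, not a proof: it samples the quotient $|f(y) - f(x)| / |y - x|^\alpha$ at $x = 0$ for a few small values of $y$, and the helper name is hypothetical.

```python
def holder_ratio(f, alpha, x, y):
    # the quotient |f(y) - f(x)| / |y - x|^alpha from Definition 18.9.1
    return abs(f(y) - f(x)) / abs(y - x) ** alpha

def f_075(x):
    return abs(x) ** 0.75

def f_025(x):
    return abs(x) ** 0.25

steps = [10.0 ** (-j) for j in range(1, 9)]

# |x|^0.75 tested against exponent 0.25: the quotient is y^0.5, which stays bounded.
ratios_ok = [holder_ratio(f_075, 0.25, 0.0, y) for y in steps]

# |x|^0.25 tested against exponent 0.75: the quotient is y^(-0.5), which blows up.
ratios_bad = [holder_ratio(f_025, 0.75, 0.0, y) for y in steps]

print(max(ratios_ok), ratios_bad[-1])
```

The bounded quotients are consistent with $|x|^{0.75}$ being 0.25-Hölder near the origin, while the unbounded quotients show that no constant $K$ can make $|x|^{0.25}$ satisfy the 0.75-Hölder condition at $x = 0$.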
18.9.3 Notation: $C^{k,\alpha}(\Omega)$ for an open subset $\Omega$ of $\mathbb{R}^n$, $n, k \in \mathbb{Z}^+_0$ and $\alpha \in (0, 1]$ denotes the set of functions $f \in C^k(\Omega)$ such that the function $x \mapsto d^k f(x)/dx^K$ is locally $\alpha$-Hölder continuous for all $x \in \Omega$ and multi-indices $K \in (\mathbb{Z}^+)^n$ with $|K| = k$.
18.9.4 Remark: It seems reasonable that $C^{\infty,\alpha}(\Omega)$ for an open subset $\Omega$ of $\mathbb{R}^n$ with $n \in \mathbb{Z}^+_0$ and $\alpha \in (0, 1]$ would denote the set $\bigcap_{k=0}^\infty C^{k,\alpha}(\Omega)$. However, this is identical to $C^\infty(\Omega) = \bigcap_{k=0}^\infty C^k(\Omega)$. Therefore the $C^{\infty,\alpha}$ spaces are generally not defined.
[ Give a family tree of Hölder continuity properties. ]
[ This section should also define one-sided derivatives and upper and lower derivatives, and give theorems about these for general functions, monotone functions, Lipschitz functions, etc. ]
[ It should also be possible to define derivatives for functions on a dense subset of $\mathbb{R}$, such as $\mathbb{Q}$ for example. ]
Chapter 19 Diffeomorphisms in Euclidean space
19.1 Tangent vectors and diffeomorphisms
19.2 Differentials and diffeomorphisms
19.3 Second-level tangent vectors and diffeomorphisms
19.4 Diffeomorphism pseudogroups
19.5 Second-order differential operators and diffeomorphisms
19.6 Directionally differentiable homeomorphisms
Diffeomorphisms between subsets of $\mathbb{R}^n$ are the foundation of differentiable structures on $n$-dimensional manifolds. In fact, the differentiable manifold concept may be thought of as being nothing more or less than an abstraction of the properties of diffeomorphisms.
19.1. Tangent vectors and diffeomorphisms
The concept of a tangent vector arises naturally as an invariant property of differentiable functions and curves under diffeomorphisms. Both the gradient of a real-valued function and the tangent to a curve are invariant under diffeomorphisms whereas higher-order differential operators are not invariant in a comparable way.
19.1.1 Definition: For $n \in \mathbb{Z}^+_0$ and $r \in \bar{\mathbb{Z}}^+_0$, a $C^r$ diffeomorphism on $\mathbb{R}^n$ is a homeomorphism $\phi : \Omega_1 \approx \Omega_2$ for open subsets $\Omega_1, \Omega_2$ of $\mathbb{R}^n$ such that both $\phi$ and $\phi^{-1}$ are $C^r$ differentiable maps. The sets $\Omega_1$ and $\Omega_2$ are said to be $C^r$-diffeomorphic if there exists a $C^r$ diffeomorphism from $\Omega_1$ to $\Omega_2$.
19.1.2 Remark: The general composition of functions given by Definition 6.3.23 may be applied to the $C^r$ diffeomorphisms on $\mathbb{R}^n$ in Definition 19.1.1 to give a closed set of maps between open subsets of $\mathbb{R}^n$. This follows immediately from the fact that the composite of two homeomorphisms is a homeomorphism and the fact that the composite of $C^r$ functions is $C^r$. The inverse of a $C^r$ diffeomorphism is also clearly a $C^r$ diffeomorphism. A homeomorphism on $\mathbb{R}^n$ is also clearly of class $C^r$ if and only if its restriction to any open subset of its domain is of class $C^r$.
19.1.3 Remark: Figure 19.1.1 illustrates the transformation of a tangent vector of a curve when the curve lies within the domain of a diffeomorphism from $\mathbb{R}^n$ to $\mathbb{R}^n$. ($C^r$ curves in $\mathbb{R}^n$ are given by Definition 18.6.5.)
Figure 19.1.1: Transformation of tangent vector of a curve under a diffeomorphism. A curve $\gamma$ through $p = \gamma(t)$ with tangent components $v^i = \partial_t \gamma^i(t)$ is mapped by $\phi$ to the curve $\tilde\gamma = \phi \circ \gamma$ through $\tilde p = \phi(p)$ with tangent components $\tilde v^i = \phi^i{}_{,k}(p)\, v^k$.
As illustrated in Figure 19.1.2, the derivative (i.e. tangent vector) of a differentiable curve $\gamma : \mathbb{R} \to \mathbb{R}^n$ is transformed according to only the first derivatives (the Jacobian matrix) of a diffeomorphism. One may write the transformation rule more compactly as $\partial_t (\phi \circ \gamma(t))^i = \phi^i{}_{,k}\, \partial_t \gamma^k(t)$.
Figure 19.1.2: Transformation of tangent to a curve under a diffeomorphism. The derivative $V = \gamma'(t)$ of $\gamma$ is mapped to the derivative $\tilde V = (\phi \circ \gamma)'(t)$ of $\tilde\gamma = \phi \circ \gamma$ by $V \mapsto \tilde V = (\tilde v^i)_{i=1}^n$ with $\tilde v^i = \dfrac{\partial \phi^i(x)}{\partial x^j}\Big|_{x = \gamma(t)} v^j$.
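The transformation rule $\tilde v^i = \phi^i{}_{,k}(p)\, v^k$ can be compared with a direct numerical derivative of the transformed curve. The sketch below uses a sample curve and a sample diffeomorphism $\phi(x, y) = (2x,\ y + x^2)$ of $\mathbb{R}^2$, both chosen purely for illustration and not taken from the text.

```python
import math

def gamma(t):
    # sample curve in R^2 (an illustrative choice)
    return (math.cos(t), math.sin(t))

def gamma_dot(t):
    # exact tangent vector of the sample curve
    return (-math.sin(t), math.cos(t))

def phi(p):
    # sample diffeomorphism of R^2 with inverse (u, w) -> (u/2, w - u^2/4)
    x, y = p
    return (2.0 * x, y + x * x)

def jacobian_phi(p):
    # matrix of first derivatives of phi at p
    x, _ = p
    return ((2.0, 0.0), (2.0 * x, 1.0))

def push_forward(p, v):
    # apply the Jacobian of phi at p to the tangent vector v
    J = jacobian_phi(p)
    return tuple(sum(J[i][k] * v[k] for k in range(2)) for i in range(2))

def numerical_tangent(t, h=1e-6):
    # central-difference derivative of the transformed curve phi(gamma(t))
    a = phi(gamma(t + h))
    b = phi(gamma(t - h))
    return tuple((ai - bi) / (2.0 * h) for ai, bi in zip(a, b))

t0 = 0.4
v_tilde = push_forward(gamma(t0), gamma_dot(t0))
v_check = numerical_tangent(t0)
print(v_tilde, v_check)
```

Here `v_tilde` is computed without differentiating the transformed curve at all, which is the practical point of the transformation rule; `v_check` differentiates the transformed curve directly and should agree up to discretization error.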
19.1.4 Remark: In practical terms, the transformation rule for tangents to curves means that you don’t have to re-calculate tangent vectors when the curves are subjected to diffeomorphisms. The tangent vector of a transformed curve may be calculated from the untransformed curve using only the derivatives of the diffeomorphism, without needing to differentiate the curve again. This suggests that a tangent vector has some sort of existence independent of the particular curve from which it is constructed.
19.1.5 Remark: The transformation rule for tangent vectors in Remark 19.1.3 is obtained by comparing the derivative of a curve with the derivatives of the transformed curve. This suggests that derivatives and curves are part of the “essence” of tangent vectors. This is more or less true. However, for $C^1$ functions $f : \mathbb{R}^n \to \mathbb{R}$, the following identity holds for any $v \in \mathbb{R}^n$.
$$\sum_{i=1}^n v^i \frac{\partial f}{\partial x^i}(x) = \lim_{a \to 0} \frac{f(x + av) - f(x)}{a}. \qquad (19.1.1)$$
The left-hand side of (19.1.1) is a convenient method of calculating the right-hand side, but it is the righthand side which is closer to the true “meaning” of the vector v ∈ IRn . The left-hand side uses a particular orientation of axes for calculating partial derivatives whereas the right-hand side is effectively written in terms of the vector from x to x + av, in other words the vector av at the point x. The function f has the role of a “test function” to help express the idea of the limit of the vector av as a tends to zero. Both the C 1 functions f and the C 1 curves γ serve to clarify the meaning of an infinitesimal vector. The limit operation serves to neutralize the curvature effect of non-linear diffeomorphisms so that the transformation rules will not be erroneous. Ultimately, it does seem that the only way to derive the properties of tangent vectors (and other tangential objects) is to evaluate the effect of diffeomorphisms on them as in Remarks 19.1.3 and 19.3.2. Thus every class of tangential object on a manifold is a generalization of some differential operator in a Euclidean space together with its transformation rules under diffeomorphisms. However, as soon as the calculus has been performed on the concrete differential operator, it is best to abstract the transformation rules from this and define the class of tangential object purely in terms of its transformation rules rather than in terms of test functions or sample curves. 19.1.6 Remark: It is important to distinguish between “point transformations” and “coordinate transformations” because they are opposites even though they may have the same equations. The diffeomorphism φ in Remark 19.1.3 is a point transformation in flat space IRn , but when the points of IRn are used as mere coordinates for a manifold (as in Definition 27.2.2), the diffeomorphism is a coordinate transformation. 
If $\psi_1$ and $\psi_2$ are charts for an $n$-dimensional manifold, the composite map $\psi_2 \circ \psi_1^{-1}$ is a local diffeomorphism of $\mathbb{R}^n$ which plays the role of a coordinate transformation for the manifold, but is a point transformation in the space $\mathbb{R}^n$ of coordinates.
[ Define the tangent bundle on $\mathbb{R}^n$. Use this for Metadefinition 28.2.1. Definition 19.1.7 needs to be made more rigorous. Need a notation for the space of all $C^r$ diffeomorphisms on $\mathbb{R}^n$. ]
19.1.7 Definition: The tangent bundle on $\mathbb{R}^n$ for $n \in \mathbb{Z}^+_0$ is the set $T(\mathbb{R}^n) = \mathbb{R}^n \times \mathbb{R}^n$ together with the transformation rule $\hat\psi : T(\mathbb{R}^n) \mathrel{\mathring{\to}} T(\mathbb{R}^n)$ defined by $(p, v) \mapsto \bigl(\phi(p), (\phi^i{}_{,j}(p)\, v^j)_{i=1}^n\bigr)$ for $C^1$ diffeomorphisms $\phi : \mathbb{R}^n \mathrel{\mathring{\to}} \mathbb{R}^n$.
[ Give the rule for the Jacobian of the composite of two diffeomorphisms. ]
19.1.8 Notation: $T_x(\mathbb{R}^n)$ for $n \in \mathbb{Z}^+_0$ and $x \in \mathbb{R}^n$ denotes the subset $\{x\} \times \mathbb{R}^n$ of the tangent bundle $T(\mathbb{R}^n)$ on $\mathbb{R}^n$.
19.1.9 Definition: A cross-section of the tangent bundle on $\mathbb{R}^n$ for $n \in \mathbb{Z}^+_0$ is a function $f : \mathbb{R}^n \to T(\mathbb{R}^n)$ such that $f(x) \in T_x(\mathbb{R}^n)$ for all $x \in \mathrm{Dom}(f)$.
19.1.10 Definition: A $C^r$ cross-section of the tangent bundle on $\mathbb{R}^n$ for $n \in \mathbb{Z}^+_0$ and $r \in \bar{\mathbb{Z}}^+_0$ is a cross-section $f : \mathbb{R}^n \to T(\mathbb{R}^n)$ such that $f$ is of class $C^r$.
19.1.11 Notation: $X^r(T(\mathbb{R}^n))$ for $n \in \mathbb{Z}^+_0$ and $r \in \bar{\mathbb{Z}}^+_0$ denotes the set of $C^r$ cross-sections of $T(\mathbb{R}^n)$.
19.1.12 Remark: Diffeomorphisms between differentiable manifolds are defined by Definition 27.8.8.
19.2. Differentials and diffeomorphisms
19.2.1 Remark: The differential (or gradient) of a function is a kind of transpose of the derivative of a function. A derivative such as $\sum_{i=1}^n v^i \frac{\partial f}{\partial x^i}(x)$ depends on two inputs, namely the sequence of $n$ components $(v^i)_{i=1}^n$ and the sequence of $n$ partial derivatives $\bigl(\frac{\partial f}{\partial x^i}(x)\bigr)_{i=1}^n$. The sequence of partial derivatives of $f$ is called the “differential of $f$”. This is often denoted as $df$, but in applied mathematics, the usual notation is $\nabla f$. (The nabla symbol $\nabla$ is just the Delta symbol $\Delta$ upside down. The $\Delta$ symbol is the Greek letter “D”, which is an abbreviation for the word “derivative”.)
19.2.2 Example: Figure 19.2.1 shows the “vectors” $df$ (or $\nabla f$) for a quadratic function on $\mathbb{R}^2$. The function $f$ is defined by $f(x, y) = x^2/4 + y^2$ for $(x, y) \in \mathbb{R}^2$. This has the differential $df(x, y) = (x/2, 2y)$. (See also Example 43.10.1.)
Figure 19.2.1: Differential of quadratic function on $\mathbb{R}^2$, showing the level curves $f(x, y) = 0.9$ and $f(x, y) = 1.0$ of $f(x, y) = x^2/4 + y^2$ together with $(f_x, f_y) = (x/2, 2y)$
19.2.3 Remark: It turns out that the gradient is not a tangent vector. It does not obey the right transformation rules for a tangent vector under diffeomorphisms. To see this, consider the point diffeomorphism $\phi : (x, y) \mapsto (2x, y)$ which is illustrated in Figure 19.2.2. The point $(0.5, 0)$ is mapped to $(1, 0)$. If the function $f : \mathbb{R}^2 \to \mathbb{R}$ is transformed, point for point, by the diffeomorphism, the value of $\tilde f = f \circ \phi^{-1}$ satisfies $\tilde f(1, 0) = f \circ \phi^{-1}(1, 0) = f(0.5, 0) = 0.25$ if $f$ is defined by $f : (x, y) \mapsto x^2 + y^2$. The gradient of $f$ satisfies $df(x, y) = (2x, 2y)$. Similarly, $d\tilde f(x, y) = (x/2, 2y)$. So $df(0.5, 0) = (1, 0)$ and $d\tilde f(1, 0) = (0.5, 0)$. However, the image (or “push-forth”) $\phi_*$ of $\phi$ maps the vector $v = (1, 0)$ at $p = (0.5, 0)$ to $\phi_*(p, v) = (\tilde p, \tilde v)$ where $\tilde p = (1, 0)$ and $\tilde v = (2, 0)$. In other words, the tangent vector $(1, 0)$ at $p$ is doubled in length by $\phi$ whereas the gradient $df$ at $p$ is halved in length. It seems that the gradient is scaled exactly the opposite to a tangent vector. This is true in general because the gradient is a “cotangent vector”, not a tangent vector. To be more precise, the gradient is a linear form on the space of tangent vectors. Alternatively, a cotangent vector may be called a “covector” for short. Just because something has $n$ real components doesn’t mean it’s a vector. It is important to test “things with $n$ components” to ensure that they transform correctly under diffeomorphisms before accepting them as tangent vectors.
Figure 19.2.2: Effect of diffeomorphism on a differential in $\mathbb{R}^2$. The map $\phi : (x, y) \mapsto (2x, y)$ takes the level curves $f(x, y) = 0.75$ and $f(x, y) = 1.0$ of $f(x, y) = x^2 + y^2$, with $(f_x, f_y) = (2x, 2y)$, to the corresponding level curves of $\tilde f(x, y) = x^2/4 + y^2$, with $(\tilde f_x, \tilde f_y) = (x/2, 2y)$, and takes the vector $v$ at $p$ to $\tilde v = \phi_*(v)$ at $\tilde p$.
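The numbers in Remark 19.2.3 can be reproduced in a few lines, which also makes visible what is preserved: the pairing of the gradient with the tangent vector. The sketch below hard-codes the map $\phi(x, y) = (2x, y)$ and the two gradients from the remark; the function names are illustrative assumptions.

```python
def phi(p):
    # the point diffeomorphism (x, y) -> (2x, y) of Remark 19.2.3
    x, y = p
    return (2.0 * x, y)

def push_forward(v):
    # the Jacobian of phi is diag(2, 1) at every point
    return (2.0 * v[0], v[1])

def grad_f(p):
    # f(x, y) = x^2 + y^2
    x, y = p
    return (2.0 * x, 2.0 * y)

def grad_f_tilde(q):
    # f~(x, y) = f(phi^{-1}(x, y)) = x^2/4 + y^2
    x, y = q
    return (x / 2.0, 2.0 * y)

def pairing(w, v):
    # action of the covector w on the tangent vector v
    return sum(wi * vi for wi, vi in zip(w, v))

p = (0.5, 0.0)
v = (1.0, 0.0)
p_tilde = phi(p)
v_tilde = push_forward(v)
w = grad_f(p)
w_tilde = grad_f_tilde(p_tilde)

pairing_before = pairing(w, v)
pairing_after = pairing(w_tilde, v_tilde)
print(w, w_tilde, v_tilde, pairing_before, pairing_after)
```

The tangent vector doubles, the gradient halves, and the number obtained by applying the gradient to the tangent vector is unchanged, which is what one expects of a linear form acting on tangent vectors.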
The orthogonality is an illusion. True orthogonality would be lost under a non-orthogonal linear transformation, whereas the pseudo-orthogonality between tangent vectors and differentials seems to hold despite any diffeomorphisms at all. In fact, the “vector” representing the differential is merely the set of n components of the linear form which defines a tangent plane through a point. It is this tangent plane which transforms like a true tangent vector, not the “normal vector” to that plane. If the “normal vector” is transformed like a true tangent vector, it is transformed to a vector which is no longer normal to the tangent plane of the contour curves (or contour surfaces for n > 2). This gives a clue as to how to visualize and think about differentials (and gradients, differential forms and cotangent vectors). Differentials are best thought of as a family of parallel tangent planes with a parameter t ∈ IR which equals zero at the point where the differential is attached. These tangent planes may be thought of as the contour curves or surfaces of a real-valued function. This is illustrated in Figure 19.2.3.
H−2 H−1
H2 H3 H0 H1
p
Ht = p + v;
Pn
i=1
wi vi = t
t=−2
Figure 19.2.3
t=−1
w
v∈H1
t=0
t=1
t=2
t=3
Visualization of a cotangent vector as a family of hyperplanes
[ www.topology.org/tex/conc/dg.html ]
[ draft: UTC 2009–3–21 Saturday 11:36 ]
Differential geometry reconstructed.Copyright (C) 2009, Alan U. Kennington. All Rights Reserved.The author hereby grants permission to print this book draft in A4 format.Printing in all other formats is forbidden.You may not charge any fee for copies of this book draft.
19.2.4 Remark: The gradient of a C 2 real-valued function on a Euclidean space is everywhere orthogonal to the level curves (if the gradient is non-zero). The “length” is inversely related to the spacing between level curves. (This idea of “length” is not a real metric. It does not transform like a metric under transformations.)
This view of cotangent vectors is the inverse of the “velocity of a curve” view of tangent vectors. In the same way that a vector $v$ at $p$ can be visualized as the velocity of a $C^1$ function $\gamma : \mathbb{R} \to \mathbb{R}^n$ with $\gamma(0) = p$, so also can a cotangent vector $\lambda$ at $p$ be visualized as the gradient of a $C^1$ function $f : \mathbb{R}^n \to \mathbb{R}$ such that $f(p) = 0$.
The $n$-tuple $w$ in Figure 19.2.3 is only a parameter for the linear functional $\lambda_w : T_p(\mathbb{R}^n) \to \mathbb{R}$ defined by $\lambda_w : v \mapsto \sum_{i=1}^n w_i v^i$. One may equivalently represent the linear functional by the family $(H_t)_{t\in\mathbb{R}}$ defined by $H_t = \{p + v;\ \lambda_w(v) = t\} = \{p + v;\ \sum_{i=1}^n w_i v^i = t\}$. It is sufficient to specify only the plane $H_1$ (and the point $p$), from which the other planes are easily calculated. Alternatively, one could represent the linear functional as the function $f_{p,w} : \mathbb{R}^n \to \mathbb{R}$ defined by $f_{p,w} : x \mapsto \lambda_w(x - p)$.
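The three equivalent representations just described (the linear functional, the family of hyperplanes, and the function whose level sets are those hyperplanes) can be sketched in a few lines of Python. This is only an illustration: the function names and the sample values of p and w are hypothetical and do not come from the text.

```python
# Sketch: a covector parameterized by an n-tuple w, viewed three equivalent ways.

def linear_form(w):
    """Return the linear functional lambda_w : v -> sum_i w_i v^i."""
    return lambda v: sum(wi * vi for wi, vi in zip(w, v))

def on_hyperplane(p, w, t, x, tol=1e-12):
    """Test whether the point x lies on H_t = {p + v ; lambda_w(v) = t}."""
    v = [xi - pi for xi, pi in zip(x, p)]
    return abs(linear_form(w)(v) - t) <= tol

def function_form(p, w):
    """Return f_{p,w} : x -> lambda_w(x - p), whose level sets are the H_t."""
    lam = linear_form(w)
    return lambda x: lam([xi - pi for xi, pi in zip(x, p)])

# Hypothetical sample data (not taken from the text).
p = [1.0, 2.0]
w = [2.0, 0.0]

f = function_form(p, w)
x0 = p               # the base point lies on H_0
x1 = [1.5, 7.0]      # w . (x1 - p) = 2 * 0.5 = 1, so x1 lies on H_1
x3 = [2.5, -4.0]     # w . (x3 - p) = 2 * 1.5 = 3, so x3 lies on H_3
```

Doubling w in this sketch moves each hyperplane H_t to half its former distance from p, which is the sense in which the “length” of a covector is inversely related to the spacing of the level sets.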
In effect, a cotangent vector does not have a direction. The direction is illusory since it does not transform like a true tangent vector. It is the hyperplanes which have a well-defined direction. The tangent plane H0 does transform correctly under diffeomorphisms.
An advantage of the representation as a family of hyperplanes $(H_t)_{t\in\mathbb{R}}$ is that the transformation rule is trivial: the family is simply transformed by the point transformation $\phi : \mathbb{R}^n \to \mathbb{R}^n$. Since the hyperplanes are transformed to curved hypersurfaces in general, these do need to be flattened out. The function representation $f_{p,w}$ is similarly easy to transform. After the transformation, the transformed $n$-tuple $\tilde w$ can be recalculated from the transformed function $f_{p,w} \circ \phi$. All things considered, the $n$-tuple together with the cotangent vector transformation rule is the most convenient way of doing calculations. One must, however, always remember that the $n$-tuple is not a real tangent vector. It is only a parameter for a linear functional.

19.2.5 Remark: Figure 19.2.4 illustrates the similarity and difference between the methods of visualizing tangent vectors and cotangent vectors in flat space. The duality between the two is clear. In physics language one may summarize vectors and cotangent vectors as follows.
– Tangent vectors are infinitesimal displacements in the point space.
– Cotangent vectors are infinitesimal gradients of potential energy fields.
Figure 19.2.4   Visualization of tangent vector $v$ and cotangent vector $w$ (left: a curve $x = \gamma(t)$ through $p$ with $v = \gamma'(0)$; right: level sets $t = f(x)$ for $t = -2, \ldots, 2$ with $w = df(p)$)
19.2.6 Remark: Since the gradient of a real-valued function at a point has a transformation rule which is entirely determined by the Jacobian of the transformation at the point, it makes sense to define a class of object in which gradients of functions can “live”. The derivatives of functions $\gamma : \mathbb{R} \to \mathbb{R}^n$ “live” in the tangent bundle of $\mathbb{R}^n$, which is the set of all tangent vectors at all points in $\mathbb{R}^n$. The gradient of a function $f : \mathbb{R}^n \to \mathbb{R}$ obeys a different transformation rule to the tangent bundle. So a different kind of vector bundle is required. The name for this is the “cotangent (vector) bundle”. The cotangent vector transformation rule is the opposite to the tangent vector rule.

19.2.7 Remark: As illustrated in Figure 19.2.5, the differential of a differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ is also transformed according to only the first derivatives of a diffeomorphism. However, in this case, the inverse linear transformation is used. Consequently the (contraction) products $w_i v^i$ and $\tilde w_i \tilde v^i$ are equal. Hence the differential of a real-valued function on $\mathbb{R}^n$ with respect to the tangent vector along a curve in $\mathbb{R}^n$ is independent of the choice of local coordinates. One may write the transformation rule for differentials of real-valued functions more compactly as $(f \circ \phi)_{,i} = f_{,k}\,\phi^k{}_{,i}$.
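Figure 19.2.5   Transformation of differential of a function under a diffeomorphism. The differential $W = df(x) = (w_i)_{i=1}^n$ of $f$ is mapped by $\phi$ to the differential $\tilde W = d(f \circ \phi^{-1})(\phi(x)) = (\tilde w_i)_{i=1}^n$ of $\tilde f = f \circ \phi^{-1}$, where $w_i = \tilde w_j\, \partial \phi^j(x) / \partial x^i$.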
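19.2.8 Definition: The cotangent bundle on $\mathbb{R}^n$ for $n \in \mathbb{Z}^+_0$ is the set $T^*(\mathbb{R}^n) = \mathbb{R}^n \times \mathbb{R}^n$ together with the transformation rule $\hat\psi : T^*(\mathbb{R}^n) \mathrel{\mathring{\to}} T^*(\mathbb{R}^n)$ defined by $(p, w) \mapsto (\phi(p), (w_j \bar\phi^j{}_{,i}(p))_{i=1}^n)$ for $C^1$ diffeomorphisms $\phi : \mathbb{R}^n \mathrel{\mathring{\to}} \mathbb{R}^n$, where $[\bar\phi^j{}_{,i}]_{i,j=1}^n$ denotes the inverse of the Jacobian of $\phi$.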
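The contrast between the tangent and cotangent transformation rules, and the resulting invariance of the contraction $w_i v^i$ noted in Remark 19.2.7, can be checked numerically. The following is a minimal sketch in two dimensions. The diffeomorphism phi chosen here is a hypothetical example (it is not the map used in the text), and the helper names are illustrative only.

```python
# Sketch: tangent vectors transform with the Jacobian, covectors with its inverse.

def phi(p):
    """Hypothetical C^1 diffeomorphism of the plane: (x, y) -> (2x + y^2, y)."""
    x, y = p
    return (2.0 * x + y * y, y)

def jacobian(p):
    """Matrix [phi^i_{,j}(p)] of first derivatives of phi."""
    x, y = p
    return [[2.0, 2.0 * y],
            [0.0, 1.0]]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix with non-zero determinant."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det],
            [-c / det, a / det]]

def push_tangent(p, v):
    """(p, v) -> (phi(p), (phi^i_{,j}(p) v^j)_i)."""
    J = jacobian(p)
    return phi(p), [sum(J[i][j] * v[j] for j in range(2)) for i in range(2)]

def push_cotangent(p, w):
    """(p, w) -> (phi(p), (w_j phibar^j_{,i}(p))_i), phibar = inverse Jacobian."""
    Jinv = inverse_2x2(jacobian(p))
    return phi(p), [sum(w[j] * Jinv[j][i] for j in range(2)) for i in range(2)]

def contract(w, v):
    """The contraction w_i v^i."""
    return sum(wi * vi for wi, vi in zip(w, v))

# Hypothetical sample data.
p = (0.5, 0.0)
v = [1.0, 0.0]
w = [1.0, 3.0]
_, v_new = push_tangent(p, v)
_, w_new = push_cotangent(p, w)
```

At this sample point the first component of v is doubled while the first component of w is halved, so that contract(w, v) and contract(w_new, v_new) agree.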
[ The $\bar\phi$ pseudo-notation in Definition 19.2.8 for the inverse Jacobian is very dubious. It should be replaced with a correct notation. ]

19.2.9 Notation: $T_x^*(\mathbb{R}^n)$ for $n \in \mathbb{Z}^+_0$ and $x \in \mathbb{R}^n$ denotes the subset $\{x\} \times \mathbb{R}^n$ of the cotangent bundle $T^*(\mathbb{R}^n)$ on $\mathbb{R}^n$.
19.2.10 Definition: A cross-section of the cotangent bundle on $\mathbb{R}^n$ for $n \in \mathbb{Z}^+_0$ is a function $f : \mathbb{R}^n \to T^*(\mathbb{R}^n)$ such that $f(x) \in T_x^*(\mathbb{R}^n)$ for all $x \in \mathrm{Dom}(f)$.

19.2.11 Definition: A $C^r$ cross-section of the cotangent bundle on $\mathbb{R}^n$ for $n \in \mathbb{Z}^+_0$ and $r \in \bar{\mathbb{Z}}^+_0$ is a cross-section $f : \mathbb{R}^n \to T^*(\mathbb{R}^n)$ such that $f$ is of class $C^r$.

19.2.12 Notation: $X^r(T^*(\mathbb{R}^n))$ for $n \in \mathbb{Z}^+_0$ and $r \in \bar{\mathbb{Z}}^+_0$ denotes the set of $C^r$ cross-sections of $T^*(\mathbb{R}^n)$.
19.2.13 Definition: The differential of a function $f \in C^1(\mathbb{R}^n)$ at $x \in \mathbb{R}^n$ is the element $(x, (\partial_k f(x))_{k=1}^n)$ of the cotangent bundle $T^*(\mathbb{R}^n)$.

19.2.14 Definition: The differential of a function $f \in C^1(\mathbb{R}^n)$ is the cross-section $\{(x, (\partial_k f(x))_{k=1}^n);\ x \in \mathbb{R}^n\} \in X^0(T^*(\mathbb{R}^n))$ of the cotangent bundle $T^*(\mathbb{R}^n)$.

19.2.15 Remark: It is convenient to define a basis of unit vectors for both the tangent bundle and cotangent bundle in flat space. The unit basis for the linear space of real-valued $n$-tuples is given by Definition 10.2.21. In the context of diffeomorphisms, it is usual to define the unit tangent vectors $\partial/\partial x^i$ and cotangent vectors $dx^i$ for $i \in \mathbb{N}_n$. The tangent vector $\partial/\partial x^i$ is really just a cute way of referring to the tangent vector of the curve $\gamma : \mathbb{R} \to \mathbb{R}^n$ defined by $\gamma : t \mapsto t e_i$, where $e_i = (\delta_i^j)_{j=1}^n$ is the usual unit $n$-tuple of real numbers. Thus $\partial/\partial x^i$ means the tangent vector $v \in \mathbb{R}^n$ given by

$$v^j = \frac{\partial \gamma^j(t)}{\partial t} = \frac{\partial (t \delta_i^j)}{\partial t} = \delta_i^j = e_i^j.$$

In other words, $v = e_i$, which is not very useful. The real value of the pseudo-notation $\partial/\partial x^i$ for $e_i$ is that it facilitates the calculation of transformation matrices under diffeomorphisms and it makes a clear distinction between tangent and cotangent vectors.

The unit cotangent vectors $dx^i$ are supposed to represent the differentials of the functions $f : \mathbb{R}^n \to \mathbb{R}$ defined by $f : x \mapsto x^i$. The differential $df$ is clearly equal to the unit $n$-tuple $e_i$ again, which is apparently not very useful. The $dx^i$ pseudo-notation is useful as a mnemonic for the cotangent vector transformation rule under a diffeomorphism.

The superscript $i$ on $dx^i$ is a mnemonic for a covariant vector or cotangent vector whereas the subscript $i$ of $\partial/\partial x^i$ indicates a contravariant vector or tangent vector. (The indices on unit vectors have the opposite location to their corresponding coefficients.) The $\partial/\partial x^i$ and $dx^i$ mnemonic abbreviations may be even further abbreviated to $\partial_i$ and $d^i$ respectively.
Although these pseudo-notations are attractive, simple, useful and popular, they are not always helpful. In some situations, they can lead to incorrect calculations. They often suggest the right calculations, but it is difficult to know when their mnemonic value is beneficial and when it is harmful. Therefore it is best to use these abbreviations only for presentation purposes. For reliable calculations, it is best to use explicit notations.

19.2.16 Remark: Figure 19.2.6 illustrates how the vector/covector dual relationship is inherent in the combination of a curve $\gamma : \mathbb{R} \to \mathbb{R}^n$ and a real-valued function $f : \mathbb{R}^n \to \mathbb{R}$. Since the function loop starts and ends in $\mathbb{R}$, it is inevitable that the derivative of $f$ will be a dual of the derivative of $\gamma$. Even if the space “in the middle” has no differentiable structure, there will be some sort of dual relationship in the differential behaviour of such maps. In this case, we have $\partial_t (f \circ \gamma)(t) = \gamma'(t) \cdot \nabla f(p)$ with $p = \gamma(t)$.

Figure 19.2.6   The vector/covector loop from $\mathbb{R}$ to $\mathbb{R}$ via $\mathbb{R}^n$

19.2.17 Remark: In conservative force fields, the potential energy of a particle is the integral of the product of the force vector with the space displacement vector. Since the energy at each point is invariant under diffeomorphisms, it follows that the force field vectors must be cotangent vectors. To put it another way, a force vector may be calculated as the gradient of a scalar potential energy field. In very simple terms, force × displacement = energy. Therefore force is a covariant vector if displacement is a contravariant vector.

19.3. Second-level tangent vectors and diffeomorphisms
A first-level tangent vector may be thought of as an infinitesimal variation of a point. Similarly, a second-level tangent vector may be thought of as an infinitesimal variation of a tangent vector. Since a tangent vector is always attached to a unique point, it is the point/vector pair which is being varied. Whereas a first-level tangent vector is a variation in the point space, a second-level tangent vector is a variation in a point/vector space. In the language of manifolds, this combined point/vector space will be called a “tangent bundle”.

A useful way to think about infinitesimal variations of points and vectors is to vary the points and vectors with respect to a time parameter $t$. This is illustrated in Figure 19.3.1, where the point $p(t)$ and vector $v(t)$ are varied with respect to $t$.

Figure 19.3.1   Second-level tangent vector in flat space (vectors $v(t_1)$ at $p(t_1)$ and $v(t_2)$ at $p(t_2)$, with horizontal component $w_H$ and vertical component $w_V$)
The rate of change of the point $p(t)$ with respect to $t$ is a vector which is denoted here by $w_H$. The rate of change of the vector $v(t)$ with respect to $t$ is a vector which is denoted here by $w_V$. These are called respectively the “horizontal” and “vertical” components of the rate of change of the point/vector pair $(p(t), v(t))$. Thus the picture here is of a vector moving with variable direction and variable base point. The first task here is to provide a suitable space of objects which describe the velocity of motion of such a vector, analogous to the velocity of the base point on its own. The second task is to determine how these second-level velocity objects transform with respect to $C^2$ diffeomorphisms from $\mathbb{R}^n$ to $\mathbb{R}^n$.
19.3.1 Remark: Second-level velocity vectors in $\mathbb{R}^n$ can be described by tuples of the form $(x, v, w) \in \mathbb{R}^n \times \mathbb{R}^n \times (\mathbb{R}^n \times \mathbb{R}^n)$, where $x$ is the location of the base point, $v$ is the initial value of a vector with base point $x$, and $w = (w_H, w_V) \in \mathbb{R}^n \times \mathbb{R}^n$ is the pair of horizontal and vertical components $w_H \in \mathbb{R}^n$ and $w_V \in \mathbb{R}^n$. It would seem reasonable, then, to identify the space of second-level tangent vectors $T(T(\mathbb{R}^n)) = T^{(2)}(\mathbb{R}^n)$ with a set such as $\mathbb{R}^n \times \mathbb{R}^n \times (\mathbb{R}^n \times \mathbb{R}^n)$.

19.3.2 Remark: The transformation rule for second-level tangent vectors under a $C^2$ diffeomorphism $\phi$ may be determined by differentiating the first-level transformation rule. The point/vector pair $(p(t), v(t))$ is transformed to $(\tilde p(t), \tilde v(t))$ where $\tilde p(t) = \phi(p(t))$ and $\tilde v^i(t) = \phi^i{}_{,j}(p(t))\, v^j(t)$. (See Figure 19.3.2.)

Figure 19.3.2   Transformation of second-level tangent vector in flat space (the pairs $(p(t_k), v(t_k))$ with components $w_H$, $w_V$ are mapped by $\phi$ to $(\tilde p(t_k), \tilde v(t_k))$ with components $\tilde w_H$, $\tilde w_V$)
Therefore the rate of change $(w_H, w_V) = \partial_t (p(t), v(t))$ is transformed to

$$\begin{aligned}
(\tilde w_H, \tilde w_V) &= \partial_t (\tilde p(t), \tilde v(t)) \\
&= \bigl(\phi^i{}_{,j}(p(t))\,\partial_t p^j(t),\ \phi^i{}_{,jk}(p(t))\,\partial_t p^k(t)\,v^j(t) + \phi^i{}_{,j}(p(t))\,\partial_t v^j(t)\bigr) \\
&= \bigl(\phi^i{}_{,j}(p(t))\,w_H^j,\ \phi^i{}_{,jk}(p(t))\,w_H^k\,v^j(t) + \phi^i{}_{,j}(p(t))\,w_V^j\bigr).
\end{aligned}$$

This is summarized by the following matrix equation.

$$\begin{pmatrix} \tilde w_H \\ \tilde w_V \end{pmatrix} = \begin{pmatrix} A & 0 \\ B & A \end{pmatrix} \begin{pmatrix} w_H \\ w_V \end{pmatrix},$$

where the $n \times n$ matrices $A$ and $B$ are defined by

$$A = \left[ \frac{\partial \phi^i(x)}{\partial x^j} \bigg|_{x=p(t)} \right]_{i,j=1}^n \quad\text{and}\quad B = \left[ \frac{\partial^2 \phi^i(x)}{\partial x^j \partial x^k} \bigg|_{x=p(t)} v^k(t) \right]_{i,j=1}^n.$$

The horizontal component of a second-level tangent vector transforms like a first-level tangent vector, but the vertical component has an extra term which depends on the matrix of second derivatives of the diffeomorphism. It turns out that this term is where differential geometry becomes “interesting”. Simple second-level tangent vectors do not have the kinds of invariance properties which are required for physics. To construct well-defined tensors from second-level tangent vectors, it is necessary to add an additional term to represent “parallel transport”. This additional term compensates for the second-order derivatives of the diffeomorphism so that second-level tangent objects can be made to transform purely according to the first-order derivatives of diffeomorphisms.

It is very inconvenient that second-level tangent vectors are in a different space to first-level tangent vectors. However, second-level tangent vectors can be “dropped” into the first-level tangent vector space by defining an affine connection on the tangent bundle. An affine connection specifies a parallelism relation in each direction of the horizontal component of a second-level tangent vector. This enables the vertical component to be adjusted for parallel transport in the horizontal direction, which converts the vertical component to a net rate of change relative to parallel transport in the horizontal direction. This is, in fact, the purpose of defining an affine connection on a differentiable manifold. If no affine connection is defined, there are infinitely many equivalent linear maps to “drop” the vertical component of a second-level tangent vector. An affine connection makes a particular choice of linear map to use as the drop function.
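19.3.3 Definition: The second-level tangent bundle on $\mathbb{R}^n$ for $n \in \mathbb{Z}^+_0$ is the set $T^{(2)}(\mathbb{R}^n) = \mathbb{R}^n \times \mathbb{R}^n \times (\mathbb{R}^n \times \mathbb{R}^n)$ together with the transformation rule $\hat\psi^{(2)} : T^{(2)}(\mathbb{R}^n) \mathrel{\mathring{\to}} T^{(2)}(\mathbb{R}^n)$ defined by $\hat\psi^{(2)} : (p, v, w) \mapsto (\phi(p), \tilde v, \tilde w)$ for $C^2$ diffeomorphisms $\phi : \mathbb{R}^n \mathrel{\mathring{\to}} \mathbb{R}^n$, where $w = (w_H, w_V)$, $\tilde w = (\tilde w_H, \tilde w_V)$ and

$$\begin{aligned}
\tilde v &= (\phi^i{}_{,j}(p)\, v^j)_{i=1}^n, \\
\tilde w_H &= (\phi^i{}_{,j}(p)\, w_H^j)_{i=1}^n, \\
\tilde w_V &= (\phi^i{}_{,jk}(p)\, w_H^k v^j + \phi^i{}_{,j}(p)\, w_V^j)_{i=1}^n.
\end{aligned}$$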
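The rule in Definition 19.3.3 can be compared with a direct finite-difference derivative of the transformed vector along a path. The following sketch does this in two dimensions. The map phi is a hypothetical example which is a diffeomorphism only near the origin, the path is taken to be a straight line purely for convenience, and all names are illustrative.

```python
# Sketch: check the second-level rule against a finite-difference derivative.

def phi(p):
    """Hypothetical C^2 local diffeomorphism near the origin of the plane."""
    x, y = p
    return (x + y * y, y + x * y)

def jac(p):
    """A = [phi^i_{,j}(p)]."""
    x, y = p
    return [[1.0, 2.0 * y],
            [y, 1.0 + x]]

# Constant second derivatives HESS[i][j][k] = phi^i_{,jk} for this phi.
HESS = [[[0.0, 0.0], [0.0, 2.0]],
        [[0.0, 1.0], [1.0, 0.0]]]

def matvec(m, u):
    return [sum(m[i][j] * u[j] for j in range(2)) for i in range(2)]

def transform_second_level(p, v, wH, wV):
    """(p, v, (wH, wV)) -> (phi(p), v~, wH~, wV~) following Definition 19.3.3."""
    A = jac(p)
    # B[i][j] = phi^i_{,jk}(p) v^k
    B = [[sum(HESS[i][j][k] * v[k] for k in range(2)) for j in range(2)]
         for i in range(2)]
    v_new = matvec(A, v)
    wH_new = matvec(A, wH)
    wV_new = [b + a for b, a in zip(matvec(B, wH), matvec(A, wV))]
    return phi(p), v_new, wH_new, wV_new

def transformed_vector_along_path(p, v, wH, wV, t):
    """v~(t) = A(p(t)) v(t) for the path p(t) = p + t wH, v(t) = v + t wV."""
    pt = [p[i] + t * wH[i] for i in range(2)]
    vt = [v[i] + t * wV[i] for i in range(2)]
    return matvec(jac(pt), vt)

# Hypothetical sample data.
p, v = [0.2, 0.1], [1.0, -0.5]
wH, wV = [0.3, 0.4], [-1.0, 2.0]
h = 1e-3
plus = transformed_vector_along_path(p, v, wH, wV, h)
minus = transformed_vector_along_path(p, v, wH, wV, -h)
wV_numeric = [(a - b) / (2 * h) for a, b in zip(plus, minus)]
_, _, wH_new, wV_new = transform_second_level(p, v, wH, wV)
```

Here wV_numeric, the central-difference derivative of the transformed vector at t = 0, should agree with wV_new up to rounding. Dropping the B term from transform_second_level breaks the agreement, which is the extra second-derivative contribution discussed above.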
19.3.4 Remark: First-level tangent vectors may be applied to real-valued functions as in Section 19.2. Similarly, second-level tangent vectors may be applied to tangent-vector-valued functions. Such functions are called “vector fields”. Suppose $X$ and $Y$ are vector fields on $\mathbb{R}^n$. In other words, for all $p \in \mathbb{R}^n$, both $X(p)$ and $Y(p)$ are elements of $T(\mathbb{R}^n) = \mathbb{R}^n \times \mathbb{R}^n$ such that $X(p) = (p, u(p))$ and $Y(p) = (p, v(p))$ for some $u(p), v(p) \in \mathbb{R}^n$. (See Definition 19.1.7 for the tangent space $T(\mathbb{R}^n)$.) Define a function $Z$ on $\mathbb{R}^n$ by

$$Z(p) = (p, w(p)) = \bigl(p,\ u^i(p)\,\partial_i v^j(p) - v^i(p)\,\partial_i u^j(p)\bigr).$$

The transformation rules of Definition 19.1.7 require that $(p, u(p)) \mapsto (\phi(p), (\phi^i{}_{,j}(p)\,u^j(p))_{i=1}^n)$ and $(p, v(p)) \mapsto (\phi(p), (\phi^i{}_{,j}(p)\,v^j(p))_{i=1}^n)$ for $C^1$ diffeomorphisms $\phi$. If $Z$ is calculated in transformed coordinates, the value becomes

$$\begin{aligned}
\bar u^i \bar\partial_i \bar v^j - \bar v^i \bar\partial_i \bar u^j
&= \phi^i{}_k u^k \bar\phi^\ell{}_i \partial_\ell(\phi^j{}_m v^m) - \phi^i{}_k v^k \bar\phi^\ell{}_i \partial_\ell(\phi^j{}_m u^m) \\
&= \phi^i{}_k u^k \bar\phi^\ell{}_i \phi^j{}_m \partial_\ell v^m - \phi^i{}_k v^k \bar\phi^\ell{}_i \phi^j{}_m \partial_\ell u^m
 + \phi^i{}_k u^k \bar\phi^\ell{}_i \phi^j{}_{m\ell} v^m - \phi^i{}_k v^k \bar\phi^\ell{}_i \phi^j{}_{m\ell} u^m \\
&= \phi^i{}_k \bar\phi^\ell{}_i \phi^j{}_m (u^k \partial_\ell v^m - v^k \partial_\ell u^m)
 + (\phi^i{}_k \bar\phi^\ell{}_i \phi^j{}_{m\ell} - \phi^i{}_m \bar\phi^\ell{}_i \phi^j{}_{k\ell})\, u^k v^m \\
&= \phi^j{}_m (u^k \partial_k v^m - v^k \partial_k u^m) + (\phi^j{}_{mk} - \phi^j{}_{km})\, u^k v^m \\
&= \phi^j{}_m (u^k \partial_k v^m - v^k \partial_k u^m).
\end{aligned}$$

(Here $\phi^i{}_k$ and $\phi^j{}_{m\ell}$ abbreviate the first and second derivatives $\phi^i{}_{,k}(p)$ and $\phi^j{}_{,m\ell}(p)$, and $\bar\phi$ denotes the inverse Jacobian as in Definition 19.2.8, so that $\phi^i{}_k \bar\phi^\ell{}_i = \delta^\ell_k$. The last step uses the symmetry of second derivatives.)

This means that the $n$-tuple $(u^k \partial_k v^m - v^k \partial_k u^m)_{m=1}^n$ has the same transformation rules as a tangent vector in $\mathbb{R}^n$. It follows that this is a chart-independent true vector. It is given the name “Poisson bracket” (it is also called the “Lie bracket”). The standard notation for this is $[X, Y]$.

The motivation for introducing the Poisson bracket is to define a form of differentiation which is chart-independent. In other words, the same vector is calculated regardless of diffeomorphisms of the underlying space. The antisymmetric quantity $[X, Y]$ may be expressed as $D_X Y - D_Y X$, where $D_X$ represents the derivative operation $u^i \partial_i$ applied to a vector field. Although $D_X Y$ is not a vector, the antisymmetrant $D_X Y - D_Y X$ is a vector. If this kind of differentiation is generalized to general tensors, the result is the “Lie derivative”, which achieves the same objective, namely to differentiate tensor fields in a manner which commutes with diffeomorphisms.

[ Remarks 19.3.4 and 19.3.5 require work to convert the sloppy tensor calculus to authentic mathematics. ]

19.3.5 Remark: Remark 19.3.4 shows that anti-symmetrization can make the partial derivative of a contravariant vector (which is not a true vector) into a true contravariant vector. A similar kind of antisymmetrization can be applied to the partial derivative of a covariant vector. Let $(a_i)_{i=1}^n$ be the $n$-tuple of coefficients of a covariant vector in $T^*(\mathbb{R}^n)$. Then

$$\begin{aligned}
\bar\partial_i \bar a_j - \bar\partial_j \bar a_i
&= \bar\phi^k{}_i \partial_k(\bar\phi^\ell{}_j a_\ell) - \bar\phi^k{}_j \partial_k(\bar\phi^\ell{}_i a_\ell) \\
&= \bar\phi^k{}_i \bar\phi^\ell{}_j \partial_k a_\ell - \bar\phi^k{}_j \bar\phi^\ell{}_i \partial_k a_\ell + \bar\phi^\ell{}_{ji} a_\ell - \bar\phi^\ell{}_{ij} a_\ell \\
&= \bar\phi^k{}_i \bar\phi^\ell{}_j (\partial_k a_\ell - \partial_\ell a_k).
\end{aligned}$$

It follows that the construction $\partial_i a_j - \partial_j a_i$ is a true covariant tensor of degree 2. This is, in fact, the “exterior derivative” of the covariant vector field $a = (a_i)_{i=1}^n$.
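The chart-independence of the bracket in Remark 19.3.4 can also be checked numerically: the bracket computed in transformed coordinates should equal the original bracket multiplied by the Jacobian. The sketch below uses central differences in two dimensions. The shear map phi and the two vector fields are hypothetical choices made only so that the inverse map is explicit. None of them come from the text.

```python
# Sketch: the bracket [X, Y] transforms like a tangent vector.

H = 1e-4  # step for central differences

def phi(p):
    """Hypothetical global C^2 diffeomorphism of the plane (a shear)."""
    x, y = p
    return (x, y + x * x)

def phi_inv(q):
    x, y = q
    return (x, y - x * x)

def jac(p):
    """[phi^i_{,j}(p)]."""
    x, _ = p
    return [[1.0, 0.0],
            [2.0 * x, 1.0]]

def matvec(m, a):
    return [sum(m[i][j] * a[j] for j in range(2)) for i in range(2)]

def u(p):
    """Coefficients of a hypothetical vector field X(p) = (p, u(p))."""
    x, y = p
    return [y, x * y]

def v(p):
    """Coefficients of a hypothetical vector field Y(p) = (p, v(p))."""
    x, y = p
    return [x * x, 1.0]

def pushed(field):
    """Field in transformed coordinates: q -> A(p) field(p) with p = phi_inv(q)."""
    def new_field(q):
        p = phi_inv(q)
        return matvec(jac(p), field(p))
    return new_field

def directional(field, p, a):
    """Central-difference approximation of a^i d_i field(p)."""
    fwd = field([p[i] + H * a[i] for i in range(2)])
    bwd = field([p[i] - H * a[i] for i in range(2)])
    return [(f - b) / (2 * H) for f, b in zip(fwd, bwd)]

def bracket(f, g, p):
    """(f^i d_i g^j - g^i d_i f^j) evaluated at p."""
    d1 = directional(g, p, f(p))
    d2 = directional(f, p, g(p))
    return [a - b for a, b in zip(d1, d2)]

p = [0.7, -0.3]
lhs = bracket(pushed(u), pushed(v), list(phi(p)))  # bracket in the new chart
rhs = matvec(jac(p), bracket(u, v, p))             # old bracket pushed forward
```

The two results lhs and rhs agree up to the finite-difference error, whereas the individual terms directional(pushed(v), ...) and the pushed-forward directional(v, ...) do not, which reflects the statement that $D_X Y$ alone is not a vector.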
19.3.6 Remark: While it is very valuable to construct true vectors and tensors using antisymmetrization to cancel out the second-order derivatives of a diffeomorphism, which yields the Poisson bracket, Lie derivatives and exterior derivative, it is also highly desirable to construct straightforward derivatives without antisymmetrization. To achieve chart-independent differentiation without antisymmetrization requires the introduction of differentiable parallelism, which is also called an “affine connection”.

An affine connection is an extra structure on a point space which maps the vertical components of second-level tangent vectors to horizontal components. This map must commute with diffeomorphisms in order to give well-defined vector and tensor constructs. This chart-independence is achieved by ensuring that the definition of differential parallelism, if integrated along a path to yield a pathwise parallelism, yields the same parallelism relation between the tangent spaces at distant points with respect to each chart.

The Lie derivative is defined in terms of the integrated flow of a first-level contravariant vector field. This defines a limited form of parallelism along the integral curves of the vector field. This parallelism along integral curves is chart-independent, which is why the Lie derivative and Poisson bracket are chart-independent. What an affine connection offers is a full parallelism along all differentiable paths.

19.3.7 Remark: Second-level tangent bundles (which have $n + n + 2n$ coefficients) should not be confused with second-order differential operator bundles (which have $n + n + n^2$ coefficients). The latter are defined in Section 19.5 in terms of second-order differentials of real-valued functions, whereas the former (in this section) represent first-order differentials of first-level vector fields. Roughly speaking, a second-order differential operator looks like $f \mapsto b^i \partial_i f(x) + a^{ij} \partial_{ij} f(x)$.
(See also Remark 28.10.5 for comments on the words “degree”, “order” and “level” for tangent objects.) [ Should give some examples and an illustration to clarify the differences described in Remark 19.3.7. Also need to clarify the terminology and make the names of the different kinds of tangent objects easier to interpret. ]
19.4. Diffeomorphism pseudogroups

19.4.1 Remark: A small number of authors give the name “pseudogroup” to sets of diffeomorphisms which are closed under function composition. The name seems to have originated with Marius Sophus Lie (1842–1899). A pseudogroup of diffeomorphisms on a set $X$ does not generally have a single identity element. There is, instead, an identity element for each open subset of the set $X$. In other words, there are only local identity elements. The diffeomorphisms in a pseudogroup have left and right inverses relative to local identity maps. In addition to closure of a pseudogroup under composition, one may also require closure under such operations as function domain restriction.

Pseudogroups are defined by Kobayashi/Nomizu [26], page 1 and EDM2 [34], 90.D, page 337. (This definition should not be confused with the pseudogroup definition by Malliavin [35], Section 2.4, pages 99–102, which is actually a semigroup.)

A pseudogroup of transformations seems very similar to a group at first. The composition of elements is closed and associative, and all elements have an inverse. However, the inverse does not use a single global identity. There is a separate identity operation for each of the domains of the transformations in the pseudogroup.

19.4.2 Remark: As mentioned in Remark 14.1.3, the set of homeomorphisms between subsets of a Euclidean space has some similarity to the concept of Felix Klein’s Erlanger Programm. (See EDM2 [34], article 137, page 546.) But the homeomorphisms constitute a pseudogroup rather than a group because of the lack of a global identity function. Klein claimed that geometry consists principally of the study of congruence and invariance under groups of transformations. Thus topology can be thought of as the study of the invariants of homeomorphisms. He also stated that further geometries can be generated from subgroups of a transformation group.

Since $C^r$ diffeomorphisms constitute a sub-pseudogroup of homeomorphisms, one may consider the “programme” of differential geometry to be the study of the invariant properties of $C^r$ diffeomorphisms. To some extent, this is true. Tangent vectors, for example, may be thought of as invariant properties of curves under diffeomorphisms. (See also Remark 37.1.4 regarding invariants of manifolds with an affine connection.)

The invariance of tangent vectors is sketched in Figure 19.4.1. The set of $C^1$ curves in $\mathbb{R}^n$ is denoted by $C_1(\mathbb{R}^n)$. The diagram is commutative because the same tangent vector $\tilde\gamma'(0)$ is arrived at by either transforming the curve with $\phi$ first and then differentiating or by differentiating first and then transforming with the push-forth linear map $\phi_*$. (See also the similar diagram in Figure 19.1.2.)

Figure 19.4.1   Invariance of tangent vector to curves under $C^1$ diffeomorphisms (commutative diagram: $\phi$ maps $\gamma \in C_1(\mathbb{R}^n)$ to $\tilde\gamma \in C_1(\mathbb{R}^n)$, $\phi_*$ maps $\gamma'(0) \in T(\mathbb{R}^n)$ to $\tilde\gamma'(0) \in T(\mathbb{R}^n)$, and the vertical maps are $\partial/\partial t|_{t=0}$)
19.4.3 Remark: One may ask what properties of figures are preserved by the pseudogroup of $C^{0,1}$ diffeomorphisms. In the case of $C^1$ diffeomorphisms, tangent vectors are preserved, and the largest pseudogroup which preserves tangent vectors is (more or less) the pseudogroup of $C^1$ diffeomorphisms. The $C^{0,1}$ diffeomorphisms (probably) preserve tangent cones rather than tangent vectors. Then one might ask whether the largest pseudogroup of homeomorphisms which preserve tangent cones is essentially the $C^{0,1}$ class.

[ This requires further study. ]

19.4.4 Remark: There does not seem to be a rich literature on pseudogroups. A separate definition seems to be quite unnecessary. (Not all clusters of attributes define an interesting class of objects.) It happens that the transition maps of a manifold constitute a pseudogroup of homeomorphisms. This pseudogroup is not necessarily closed under restriction and extension of maps. (Closure under restriction/extension operations is easily obtained by completion of the manifold atlas with respect to restriction and extension.)

Classes of differentiable manifolds are specified in terms of a regularity requirement for the transition maps between manifold charts. (For example, see Definition 27.2.6.) There is a one-to-one correspondence between classes of diffeomorphism pseudogroups and classes of differentiable manifolds. One may think of the pseudogroup concept as derived from the manifold atlas concept or vice versa.

This is related to the general philosophical question of whether real objects lie “behind” perceptions. A manifold may be thought of as the “real object” which lies behind the pseudogroup of diffeomorphisms in the flat coordinate spaces where we do concrete calculations. Or we can regard the flat coordinate spaces as the “real object” and the points of the manifold as an abstraction from the transformation pseudogroup. In this book, the author is thinking of manifolds as being the real objects “out there” in the “real world”, while coordinates for manifolds are part of the mental models “in here” inside the brain. But the distinction is quite arbitrary.

19.4.5 Definition: A pseudogroup of homeomorphisms $\Gamma$ on a topological space $X$ is a set $\Gamma$ such that:
(i) $\forall \phi \in \Gamma$, ($\mathrm{Dom}(\phi) \in \mathrm{Top}(X)$ and $\mathrm{Range}(\phi) \in \mathrm{Top}(X)$).
(ii) $\forall \phi \in \Gamma$, $\phi$ is a homeomorphism.
(iii) $\forall \phi \in \Gamma$, $\phi^{-1} \in \Gamma$.
(iv) $\forall \phi_1, \phi_2 \in \Gamma$, $\phi_2 \circ \phi_1 \in \Gamma$. (Note that $\phi_2 \circ \phi_1 : \phi_1^{-1}(\mathrm{Dom}(\phi_2)) \approx \phi_2(\mathrm{Range}(\phi_1))$.)

19.4.6 Remark: The conditions of Definition 19.4.5 are a mixture of algebraic and topological requirements. Condition (i) is topological and apparently unnecessary. The reason for restricting consideration to open domains and ranges is to avoid all of the hard work that arises from questions about boundary behaviour.

Conditions (iii) and (iv) are purely algebraic. The existence of inverses in condition (iii) is purely local. The composition of a local homeomorphism with its inverse is a local identity map on the domain or range of the function. In fact, $\phi^{-1} \circ \phi = \mathrm{id}_{\mathrm{Dom}(\phi)}$ and $\phi \circ \phi^{-1} = \mathrm{id}_{\mathrm{Range}(\phi)}$ for any bijection $\phi$.

The combination of conditions (iii) and (iv) implies that the pseudogroup $\Gamma$ contains the identity map on all domains and ranges of homeomorphisms in $\Gamma$. A pseudogroup is not generally a group because the local identity maps do not meet the requirement of a single global identity element. It is true that the local identity maps could be extended to the global identity map and identified as a single identity map. However, the homeomorphisms of the pseudogroup would then need to be extended likewise to the whole space $X$, which would usually contradict condition (ii). In order to even maintain the bijection property for all maps in $\Gamma$, it would be necessary to transform the points of the function range to other points. So it is not easy to automatically convert a pseudogroup into a group. But a pseudogroup is automatically a group if the homeomorphisms are required to be global, i.e. with both domain and range equal to $X$.

The function composition in condition (iv) uses the general Definition 6.11.7 for partially defined functions. Consequently, $\mathrm{Dom}(\phi_2 \circ \phi_1)$ is a subset of $\mathrm{Dom}(\phi_1)$.

19.4.7 Definition: A complete pseudogroup of homeomorphisms $\Gamma$ on a topological space $X$ is a pseudogroup of homeomorphisms $\Gamma$ on $X$ such that:
(i) $\forall \phi \in \Gamma$, $\forall \Omega \in \mathrm{Top}(X)$, $\phi|_\Omega \in \Gamma$.
(ii) For any family $(\Omega_\alpha)_{\alpha \in A}$ of open subsets of $X$, if $\phi : \Omega \approx G$ is a homeomorphism from $\Omega = \bigcup_{\alpha \in A} \Omega_\alpha$ to an open subset $G \in \mathrm{Top}(X)$ and $\phi|_{\Omega_\alpha} \in \Gamma$ for all $\alpha \in A$, then $\phi \in \Gamma$.
[ Is condition (i) a special case of condition (ii) in Definition 19.4.7? ]
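The purely algebraic part of Definition 19.4.5, namely conditions (iii) and (iv) together with the local identity maps of Remark 19.4.6, can be illustrated with partial bijections on a finite set. The following sketch assumes the discrete topology, so that every subset is open and every bijection between subsets is a homeomorphism. The representation of partial maps as Python dicts and the sample maps are hypothetical.

```python
# Sketch: partial bijections on a finite set, stored as dicts {point: image}.

def compose(phi2, phi1):
    """phi2 o phi1 as a partially defined function.

    The domain is the set of x in Dom(phi1) with phi1(x) in Dom(phi2),
    so it is always a subset of Dom(phi1).
    """
    return {x: phi2[y] for x, y in phi1.items() if y in phi2}

def inverse(phi):
    """Inverse of an injective partial map."""
    return {y: x for x, y in phi.items()}

def identity_on(points):
    """The local identity map on a subset."""
    return {x: x for x in points}

# Two hypothetical local maps on X = {0, 1, 2, 3, 4}.
phi1 = {0: 1, 1: 2}   # Dom = {0, 1}, Range = {1, 2}
phi2 = {2: 4, 3: 0}   # Dom = {2, 3}, Range = {4, 0}

left_identity = compose(inverse(phi1), phi1)    # identity on Dom(phi1)
right_identity = compose(phi1, inverse(phi1))   # identity on Range(phi1)
composite = compose(phi2, phi1)                 # defined only where the chain works
```

In this sketch the two local identities differ from each other, and the composite is defined only at the single point 1, which illustrates why there is no single global identity element unless all maps are global.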
19.4.8 Remark: In Definition 19.4.7, both conditions (i) and (ii) imply a kind of locality of the pseudogroup membership property. One may restrict the function’s domain or extend it (by merging any family of homeomorphisms) without losing pseudogroup membership. Conditions (i) and (ii) seem to be quite useless in practice. The closure under restriction and extension can easily be generated from any set of homeomorphisms satisfying the conditions of Definition 19.4.5.

19.4.9 Remark: A pseudogroup of homeomorphisms would usually be constructed by sculpting the pseudogroup of all homeomorphisms on a topological space. This is done by restricting the full pseudogroup to those homeomorphisms which satisfy a set of conditions such that the restricted set still satisfies the closure conditions of a pseudogroup. For example, requiring the homeomorphisms to be $C^k$ diffeomorphisms for some $k \in \bar{\mathbb{Z}}^+_0$ yields a closed subset of the pseudogroup of all homeomorphisms of $\mathbb{R}^n$. This is presented as Definition 19.4.10. Some useful properties of this pseudogroup are presented in Section 19.1. This pseudogroup is implicit in Definitions 27.2.4 and 27.2.6 for the differentiable structure of a differentiable manifold.

19.4.10 Definition: The (complete) pseudogroup of $C^k$ diffeomorphisms on $\mathbb{R}^n$ for $n \in \mathbb{Z}^+_0$ and $k \in \bar{\mathbb{Z}}^+_0$ is the set $\{\phi : \Omega_1 \approx \Omega_2;\ \Omega_1, \Omega_2 \in \mathrm{Top}(\mathbb{R}^n) \text{ and } \phi, \phi^{-1} \text{ are of class } C^k\}$.

19.4.11 Notation: $\Gamma^k(\mathbb{R}^n)$ for $n \in \mathbb{Z}^+_0$ and $k \in \bar{\mathbb{Z}}^+_0$ denotes the complete pseudogroup of $C^k$ diffeomorphisms on $\mathbb{R}^n$.
19.4.12 Remark: It is not necessary to specify the pseudogroup action in Definition 19.4.10 because the action of a pseudogroup of homeomorphisms is always simply the composition operation for functions in the set $\Gamma^k(\mathbb{R}^n)$. However, it is necessary to verify that the set satisfies all of the conditions of Definition 19.4.7 for a complete pseudogroup of homeomorphisms. This follows readily from Remark 19.1.2.

19.4.13 Remark: One may also define the complete pseudogroup $\Gamma^{k,\alpha}(\mathbb{R}^n)$ of $C^{k,\alpha}$ diffeomorphisms whose maps have $k$th-order derivatives which are $\alpha$-Hölder continuous. (See Definition 18.9.1.)
19.5. Second-order differential operators and diffeomorphisms 19.5.1 Remark: In the same way that tangent vectors may be thought of as differential operator maps ˚1 (IRn ), x ∈ Dom(f ) and v ∈ IRn , it is possible to define second-order maps like f 7→ v i ∂i f (x) for f ∈ C i ij f 7→ b ∂i f (x) + a ∂ij f (x) for b ∈ IRn and matrices a = (aij )ni,j=1 ∈ IRn×n . Invariance rules for these tangent objects can be defined analogously to first-order tangent vectors. Then the transformation rules can be used in the definition of higher-order differential operator tangent bundles on C 2 manifolds. It turns out that higher-order differential operators in a space with arbitrary diffeomorphisms but no affine connection are not very useful. One may certainly define the operators and transform them under diffeomorphisms, but one must keep track of not only the first-order derivatives of the diffeomorphisms (the Jacobian [ www.topology.org/tex/conc/dg.html ]
matrix), but also the higher-order derivatives up to the order of the operator. This is simply too inconvenient for most people. In effect, one must specify the chart together with the operator. People prefer operators which require only pointwise tensorial transformations for the coefficients, and which have the same apparent form independent of the choice of chart. Thus the first-order derivative $v^i \partial_i f$ of a real-valued function $f$ is okay, even though the coefficients $v = (v^i)_{i=1}^n$ are chart-dependent. The dependency is simple, and in practice the $n$-tuple $v \in \mathbb{R}^n$ has some kind of physical or other interpretation in the model which is under consideration. In the case of second-order operators, a $C^2$ change of coordinates causes the first-order term to be contaminated by an extra term which depends on the curvature of the transformation. (This problem does not occur with linear or affine transformations.) It is very inconvenient to have to explain such chart-curvature terms. In practice, the charts in which a second-order operator seems simplest and easiest to explain are aligned with some kind of parallelism or metric structure on the space. (In normal coordinates, for example, the second-order operators in some models may have a simpler form.) Then when other charts are used, correction terms may be applied to take into account curvature and distortion of the coordinates. Covariant derivatives are designed to automatically correct curved charts for deviation from parallelism. The metric tensor can be used in differential operators to correct the coordinate charts for non-orthogonality. Therefore although higher-order operators are well-defined in the absence of a connection or metric, in practice it is highly preferable to incorporate the connection or metric structure into the operators so that they are easier to work with in diverse coordinate charts.
In this section, the transformation rules for second-order operators are presented, but they are of limited usefulness in the differential layer of manifolds.

19.5.2 Remark: The observations in this section are useful for defining spaces of higher-order operators on differentiable manifolds in Chapter 32. In that context, the chart transition maps $\psi_\alpha \circ \psi_\beta^{-1}$ satisfy the requirements of the diffeomorphism $\phi$ in Remark 19.5.3 if the manifold is $C^2$.

19.5.3 Remark: For any open set $\Omega \subseteq \mathbb{R}^n$, symmetric matrix functions $a \in C^0(\Omega, \mathrm{Sym}(n, \mathbb{R}))$ and vector functions $b \in C^0(\Omega, \mathbb{R}^n)$, define $L_{a,b} : C^2(\Omega) \to C^0(\Omega)$ by

$$\forall f \in C^2(\Omega),\ \forall x \in \Omega,\qquad L_{a,b}(f)(x) = a^{ij}(x) f_{ij}(x) + b^i(x) f_i(x).$$

Define $D(\Omega) = \{L_{a,b};\ a \in C^0(\Omega, \mathrm{Sym}(n, \mathbb{R})),\ b \in C^0(\Omega, \mathbb{R}^n)\}$. For open sets $\Omega, \tilde\Omega \subseteq \mathbb{R}^n$ and homeomorphisms $\phi : \Omega \to \tilde\Omega$, define the pull-back $T_\phi : C^0(\tilde\Omega) \to C^0(\Omega)$ by

$$\forall f \in C^0(\tilde\Omega),\qquad T_\phi(f) = f \circ \phi.$$

Define $\hat T_\phi : D(\Omega) \to D(\tilde\Omega)$ for $C^2$ diffeomorphisms $\phi : \Omega \to \tilde\Omega$ by

$$\forall L_{a,b} \in D(\Omega),\ \forall f \in C^2(\tilde\Omega),\ \forall x \in \tilde\Omega,\qquad \hat T_\phi(L_{a,b})(f)(x) = L_{a,b}(T_\phi(f))(\phi^{-1}(x)),$$

or more concisely,

$$\forall L_{a,b} \in D(\Omega),\qquad \hat T_\phi(L_{a,b}) = T_{\phi^{-1}} \circ L_{a,b} \circ T_\phi.$$

It follows from Theorem 18.6.4 that $\hat T_\phi(L_{a,b}) \in D(\tilde\Omega)$. For $f \in C^2(\tilde\Omega)$, $x \in \Omega$ and $\tilde x = \phi(x)$,

$$\begin{aligned}
T_{\phi^{-1}}(L_{a,b}(T_\phi(f)))(\tilde x) &= L_{a,b}(f \circ \phi)(x) \\
&= a^{ij}(x) \bigl( f_{k\ell}(\tilde x)\, \phi^k_i(x)\, \phi^\ell_j(x) + f_k(\tilde x)\, \phi^k_{ij}(x) \bigr) + b^i(x) f_k(\tilde x)\, \phi^k_i(x) \\
&= \tilde a^{k\ell}(\tilde x) f_{k\ell}(\tilde x) + \tilde b^k(\tilde x) f_k(\tilde x) \\
&= L_{\tilde a,\tilde b}(f)(\tilde x),
\end{aligned}$$

where

$$\tilde a^{k\ell} = (\phi^k_i\, \phi^\ell_j\, a^{ij}) \circ \phi^{-1} \qquad (19.5.1)$$

and

$$\tilde b^k = (\phi^k_{ij}\, a^{ij} + \phi^k_i\, b^i) \circ \phi^{-1}.$$

Clearly $\tilde a \in C^0(\tilde\Omega, \mathrm{Sym}(n, \mathbb{R}))$ and $\tilde b \in C^0(\tilde\Omega, \mathbb{R}^n)$. Therefore $L_{\tilde a,\tilde b}(f) \in C^0(\tilde\Omega)$, from which it follows that $\hat T_\phi(L_{a,b})(f) = T_{\phi^{-1}}(L_{a,b}(T_\phi(f))) \in C^0(\tilde\Omega)$. So $\hat T_\phi(L_{a,b}) = L_{\tilde a,\tilde b} \in D(\tilde\Omega)$. This is summarized in Figure 19.5.1.

Figure 19.5.1 Action of a diffeomorphism on a space of differential operators (diagram relating $L_{a,b} \in D(\Omega)$ and $\hat T_\phi(L_{a,b}) = L_{\tilde a,\tilde b} \in D(\tilde\Omega)$ through $T_\phi(f) = f \circ \phi$, with $\phi : x \mapsto \phi(x)$ from $\Omega$ to $\tilde\Omega$)
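The transformation rule of Remark 19.5.3 can be sanity-checked numerically. The following is only a minimal sketch for the one-dimensional case $n = 1$; the particular functions `phi`, `a`, `b` and `f` are arbitrary illustrative choices and do not come from the text. It compares $L_{a,b}(f \circ \phi)(x)$, computed by finite differences, with $L_{\tilde a,\tilde b}(f)(\phi(x))$, computed from the transformed coefficients.

```python
import math

# Hypothetical one-dimensional test data (n = 1), chosen only for illustration.
def phi(x):      # a C^2 diffeomorphism of the real line
    return x + x**3

def dphi(x):     # first derivative of phi (the role of phi^k_i)
    return 1.0 + 3.0 * x**2

def d2phi(x):    # second derivative of phi (the role of phi^k_ij)
    return 6.0 * x

def a(x):        # second-order coefficient a^{ij}
    return 1.0 + x**2

def b(x):        # first-order coefficient b^i
    return x

def f(y):        # test function on the target domain
    return math.sin(y)

def df(y):
    return math.cos(y)

def d2f(y):
    return -math.sin(y)

def lhs(x, h=1e-4):
    """L_{a,b}(f o phi)(x), differentiating f o phi by central differences."""
    g = lambda t: f(phi(t))
    g1 = (g(x + h) - g(x - h)) / (2.0 * h)
    g2 = (g(x + h) - 2.0 * g(x) + g(x - h)) / h**2
    return a(x) * g2 + b(x) * g1

def rhs(x, include_second_order_term=True):
    """L_{a~,b~}(f)(phi(x)) using the transformed coefficients."""
    xt = phi(x)
    a_t = dphi(x)**2 * a(x)                 # one-dimensional form of (19.5.1)
    b_t = dphi(x) * b(x)
    if include_second_order_term:
        b_t += d2phi(x) * a(x)              # the chart-curvature term
    return a_t * d2f(xt) + b_t * df(xt)

if __name__ == "__main__":
    for x in (-0.5, 0.0, 0.7):
        print(x, lhs(x), rhs(x))
```

Dropping the term `d2phi(x) * a(x)` makes the two sides disagree away from $x = 0$, which is the numerical counterpart of the remark that the first-order coefficient does not transform tensorially.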
The term $\phi^k_{ij} a^{ij}$ in $\tilde b^k$ is very inconvenient. It is important to note that the operator $L_{a,b}$ is perfectly well-defined, but to calculate it in all coordinate charts, it is necessary to introduce the term $\phi^k_{ij} a^{ij}$ which depends on the second-order derivatives of the transformation. This just does not fit in with the tensorial way of thinking. The solution to this problem in practice is to introduce structures such as a connection into the operator itself so that it has the illusion of being a simple tensorial object because the curvature of the charts is incorporated into the operator. This is the motivation behind covariant differentiation.

It is clear from equation (19.5.1) that ellipticity is preserved by $C^2$ diffeomorphisms. Hence the class of elliptic operators as a whole is preserved under $C^2$ diffeomorphisms. If the diffeomorphisms are restricted to affine transformations, the troublesome term $\phi^k_{ij} a^{ij}$ disappears. But this restriction is incompatible with the required generality of chart transition maps for differentiable manifolds.

The Laplacian $\Delta$ is a well-defined elliptic operator, but it is not invariant under diffeomorphisms. The Laplacian stays the Laplacian under orthogonal transformations in $O(n)$ and translations. Similarly, the Gaussian curvature operator is invariant under volume-preserving transformations in $SL(n)$. [ Check this. ]

In summary, second-order operators can be defined in a chart-independent manner only with the assistance of a connection so that covariant derivatives are defined. The Laplacian operator can only be defined chart-independently with the assistance of a metric tensor. [ Here do the general-order case of Remark 19.5.3 as definitions and a theorem? ]

19.6. Directionally differentiable homeomorphisms

19.6.1 Remark: Even the most regular of the $C^{0,\alpha}$ classes, namely the $C^{0,1}$ (Lipschitz) class, does not guarantee existence of even unidirectional derivatives. So tangent vectors are difficult to define. However, if the existence of unidirectional derivatives everywhere is the regularity test for a class of homeomorphisms, the resulting class can support meaningful tangent vectors. Unidirectional tangent bundles are defined in Section 28.14.

19.6.2 Definition: The unidirectional derivative of a function $f : U \to \mathbb{R}^n$ at a point $x \in U$ in a direction $v \in \mathbb{R}^m$ for an open set $U \subseteq \mathbb{R}^m$ with $m, n \in \mathbb{Z}^+_0$ is the limit $\lim_{a\to0^+} (f(x + av) - f(x))/a$ if this limit is well-defined.

The (bi)directional derivative of a function $f : U \to \mathbb{R}^n$ at a point $x \in U$ in a direction $v \in \mathbb{R}^m$ for an open set $U \subseteq \mathbb{R}^m$ with $m, n \in \mathbb{Z}^+_0$ is the limit $\lim_{a\to0} (f(x + av) - f(x))/a$ if this limit is well-defined.

19.6.3 Definition: A unidirectionally differentiable function is a function $f : U \to \mathbb{R}^n$ for an open set $U \subseteq \mathbb{R}^m$ with $m, n \in \mathbb{Z}^+_0$ such that $\lim_{a\to0^+} (f(x + av) - f(x))/a$ is well-defined for all $x \in U$ and $v \in \mathbb{R}^m$.

A (bi)directionally differentiable function is a function $f : U \to \mathbb{R}^n$ for an open set $U \subseteq \mathbb{R}^m$ with $m, n \in \mathbb{Z}^+_0$ such that $\lim_{a\to0} (f(x + av) - f(x))/a$ is well-defined for all $x \in U$ and $v \in \mathbb{R}^m$.
19.6.4 Theorem: For all $n \in \mathbb{Z}^+_0$, the set of all homeomorphisms $f : U \approx V$ between open subsets $U, V \subseteq \mathbb{R}^n$ for which both $f$ and $f^{-1}$ are unidirectionally differentiable is a complete pseudogroup of homeomorphisms on $\mathbb{R}^n$.

Proof: Let $f : U \approx V$ and $g : V \approx W$ be unidirectionally differentiable homeomorphisms, where $U$, $V$ and $W$ are open sets in $\mathbb{R}^m$, $\mathbb{R}^n$ and $\mathbb{R}^p$ respectively. Let $h = g \circ f : U \approx W$. Let $x \in U$, $v \in \mathbb{R}^m$ and $a \in \mathbb{R}$. Let $f_*^+(x, v)$ denote the unidirectional derivative $\lim_{a\to0^+} (f(x + av) - f(x))/a$ for $x \in U$ and $v \in \mathbb{R}^m$. Similarly, denote $g_*^+(y, w) = \lim_{a\to0^+} (g(y + aw) - g(y))/a$ for $y \in V$ and $w \in \mathbb{R}^n$. Then

$$\begin{aligned}
\lim_{a\to0^+} \frac{h(x + av) - h(x)}{a}
&= \lim_{a\to0^+} \frac{g(f(x + av)) - g(f(x))}{a} \\
&= \lim_{a\to0^+} \frac{g(f(x) + a f_*^+(x, v)) - g(f(x))}{a} + \lim_{a\to0^+} \frac{g(f(x + av)) - g(f(x) + a f_*^+(x, v))}{a} \\
&= \lim_{a\to0^+} \frac{g(f(x) + a f_*^+(x, v)) - g(f(x))}{a} \\
&= g_*^+(f(x), f_*^+(x, v)).
\end{aligned}$$

This follows from the observation that $\lim_{a\to0^+} \bigl(f(x + av) - f(x) - a f_*^+(x, v)\bigr)/a = 0$ by definition of $f_*^+(x, v)$. Hence $\lim_{a\to0^+} \bigl(g(f(x + av)) - g(f(x) + a f_*^+(x, v))\bigr)/a = 0$ by the uniform continuity of $g$ on compact sets. This establishes Definition 19.4.5 (iv) for a pseudogroup. The other conditions follow without difficulty.

19.6.5 Remark: Unidirectionally differentiable functions have the property that the directional derivatives $f_*^+(x, v) = \lim_{a\to0^+} (f(x + av) - f(x))/a$ satisfy $f_*^+(x, kv) = k f_*^+(x, v)$ for all $k \ge 0$. [ Investigate the continuity properties which follow automatically from unidirectional differentiability. ]

19.6.6 Remark: A useful short notation for $\lim_{a\to0^+} (f(x + av) - f(x))/a$ could be $\partial_a^+ f(x + av)$. [ This might not make good sense at all. See Metadefinition 28.14.2 for an application of this (pseudo)notation. ]

19.6.7 Definition: The (complete) pseudogroup of unidirectionally differentiable homeomorphisms on $\mathbb{R}^n$ for $n \in \mathbb{Z}^+_0$ is the set $\{\phi : \Omega_1 \approx \Omega_2;\ \Omega_1, \Omega_2 \in \mathrm{Top}(\mathbb{R}^n) \hbox{ and } \phi, \phi^{-1} \hbox{ are unidirectionally differentiable}\}$. [ Show that one-sided tangent vectors of unidirectionally differentiable curves are invariant under the pseudogroup of unidirectionally differentiable homeomorphisms. Also show that unidirectional differentials of unidirectionally differentiable real-valued functions are invariant. ]

19.6.8 Remark: The transition maps for the Lipschitz manifold in Example 43.4.1 are a pseudogroup of unidirectionally differentiable homeomorphisms on $\mathbb{R}^n$ according to Definition 19.4.5, but do not meet the requirements for closure under restriction and extension in Definition 19.4.7 for a complete pseudogroup. [ Show that a rectifiable curve in $\mathbb{R}^n$ is differentiable almost everywhere. ]
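The composition formula $h_*^+(x, v) = g_*^+(f(x), f_*^+(x, v))$ in the proof of Theorem 19.6.4 and the positive homogeneity in Remark 19.6.5 can be illustrated on a simple case. The sketch below assumes two piecewise-linear homeomorphisms of $\mathbb{R}$ with a kink at the origin; these maps and the helper name `one_sided` are hypothetical choices for illustration only.

```python
def one_sided(fn, x, v, a=1e-6):
    """Difference quotient (fn(x + a*v) - fn(x)) / a for one small a > 0.

    For the piecewise-linear maps below, evaluated at their kink x = 0,
    this quotient already equals the one-sided limit (up to rounding).
    """
    return (fn(x + a * v) - fn(x)) / a

def f(x):
    """Piecewise-linear homeomorphism of R: slope 2 on the right, 1 on the left."""
    return 2.0 * x if x >= 0 else x

def g(y):
    """Piecewise-linear homeomorphism of R: slope 3 on the right, 1/2 on the left."""
    return 3.0 * y if y >= 0 else 0.5 * y

def h(x):
    """The composite h = g o f."""
    return g(f(x))

if __name__ == "__main__":
    for v in (1.0, -1.0, 2.5):
        direct = one_sided(h, 0.0, v)
        chained = one_sided(g, f(0.0), one_sided(f, 0.0, v))
        print(v, direct, chained)
```

At the kink the map $v \mapsto f_*^+(0, v)$ is positively homogeneous but not linear, since the derivative in the direction $-1$ is not the negative of the derivative in the direction $+1$. This is the behaviour that distinguishes unidirectional from (bi)directional differentiability in Definition 19.6.2.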
Chapter 20 Measure and integration
20.1 Lebesgue measure
20.2 Lebesgue integration
20.3 Rectangular Stokes theorem in two dimensions
20.4 Rectangular Stokes theorem in three dimensions
20.5 Differential forms
20.6 The exterior derivative
20.7 Exterior differentiation using Lie derivatives
20.8 Geometric measure theory
20.9 Stokes theorem
20.10 Radon measures
20.11 Some integrability-based function spaces
20.12 Logarithmic and exponential functions
20.13 Trigonometric functions
Measure and integration are required for Section 20.5 on differential forms and for the definition of function spaces such as Sobolev spaces.
20.1. Lebesgue measure

20.1.1 Remark: For real functions of real variables, the Lebesgue integral is essentially the most general integral definition possible. The “exhaustion method” of integration, which is very similar to the informal presentation of integration in modern applied disciplines, was used in a limited way by classical Greek mathematicians such as Eudoxus in the 4th century BC. In the 3rd century BC, Archimedes used the exhaustion method in a much more general way. Newton’s integral was published in 1687 in his “Philosophiae naturalis principia mathematica”, usually referred to as the Principia mathematica or the Principia. (Bell [189], page 132, gives the dates 1666 and 1684 for Newton and the dates 1673 and 1675 for Leibniz.) Struik [193], page 111, says that the integral symbol “$\int$” was introduced by Leibniz in 1686. Leibniz invented the term “calculus integralis” (integral calculus). According to Bell [189], page 480, Cauchy’s integral definition dates from 1823. The Riemann generalization of the Cauchy integral was introduced in about 1850. (Bell [189] gives the date 1854.) Lebesgue introduced his integral in 1902.

20.1.2 Remark: Integral calculus is, in a sense, infinitely more difficult than differential calculus. It is not generally possible to find closed-form integrals. In fact, the integration of many simple-looking functions requires the invention of “special functions”. Bell [190], page 101, says the following about the difficulty of closed-form integration.

[. . . ] the problem of evaluating $\int f(x)\,dx$ for comparatively innocent-looking functions $f(x)$ may be beyond our powers. It does not follow that an “answer” exists at all in terms of known functions when an $f(x)$ is chosen at random---the odds against such a chance are an infinity of the worst sort (“non-denumerable”) to one. When a physical problem leads to one of these nightmares approximate methods are applied which give the result within the desired accuracy.
[ See Taylor [144] for Lebesgue measure theory. Should also deal with integrals for distributions. But distributions require the definition of integrals. ]

20.1.3 Remark: It is not possible to prove the existence of a subset of $\mathbb{R}$ which is not Lebesgue measurable using only the ZF axioms. With the addition of the axiom of choice, the existence may be asserted but not demonstrated by any examples. This implies that no one will ever know what non-measurable sets and functions look like. Sets which are not Lebesgue measurable are “dark sets”. The believers in AC “know” that they are there. But there will never be any pictures. Theorem 20.1.4 is perhaps one of the most useless theorems in mathematics, especially considering all the heated controversy and hard work it generates. The real value of Theorem 20.1.4 is metamathematical. It shows that the combination of the axioms, definitions and theorems of mathematics yields troubling objects which are “unforeseen consequences”. When this happens, it could be an indication that this kind of mathematics is not ideally suited to the modelling of the physical world. People who are impressed by the amazing ability of mathematics to model the physical world should think also about the quirky, embarrassing outcomes from mathematical models.

20.1.4 Theorem [zf+ac]: There exists a subset of $\mathbb{R}$ which is not Lebesgue measurable.

20.1.5 Remark: The usual set “construction” which is used to prove Theorem 20.1.4 first partitions all elements of $\mathbb{R}$ according to an equivalence relation $E \subseteq \mathbb{R} \times \mathbb{R}$ defined by

$$(x, y) \in E \iff x - y \in \mathbb{Q}.$$

Let $[x]$ denote the equivalence class of $x$ with respect to $E$ for any $x \in \mathbb{R}$. Then $[x] = \mathbb{Q}$ for all $x \in \mathbb{Q}$, and $[x] \cap [-x] = \emptyset$ for all $x \in \mathbb{R} \setminus \mathbb{Q}$. The set of equivalence classes is uncountable.

Now “construct” the set $Y$ of equivalence classes by requiring that $[0] \in Y$ and choosing one and only one of $[x]$ and $[-x]$ to be an element of $Y$ for each $x \in \mathbb{R}$. The existence of such a set of choices is guaranteed by the Axiom of Choice. Define $U = \bigcup Y$. Then $U$ is a Lebesgue unmeasurable subset of $\mathbb{R}$.

The Lebesgue unmeasurability of $U$ relies upon the fact that there is no way to list the set of equivalence classes $[x]$. There is no countable list because the set is uncountable. But there is also no uncountable listing, e.g. by assigning one and only one equivalence class to each $x \in \mathbb{R}$. Therefore there is no way to specify for each equivalence class whether it is or is not in the set $Y$.

20.2. Lebesgue integration

20.2.1 Remark: Theorem 20.2.3 is known as the fundamental theorem of calculus. In fact, Theorem 20.2.3 can be made stronger, but it shows the general idea.

20.2.2 Remark: The non-standard abbreviation FTOC may sometimes be used for “fundamental theorem of calculus”. [ Replace Theorem 20.2.3 with a sharper version. See Rudin [136], page 115 and EDM2 [34], 216.C, page 821. ]

20.2.3 Theorem: Let $f : [a, b] \to \mathbb{R}$ be a Lebesgue integrable function for $a, b \in \mathbb{R}$ with $a < b$, and let $F : [a, b] \to \mathbb{R}$ be a continuous function on $[a, b]$ such that $F$ is differentiable on $(a, b)$ and $F'(x) = f(x)$ for all $x \in (a, b)$. Then

$$\int_a^b f(x)\,dx = F(b) - F(a).$$

[ State the generalization of Theorem 20.2.3 to $C^1$ curves. ]

20.2.4 Remark: Theorem 20.2.3 may be easily generalized to the integral of the differential of a real-valued function along an arbitrary $C^1$ curve.
20.3. Rectangular Stokes theorem in two dimensions

20.3.1 Remark: It turns out that the fundamental theorem of calculus is just the tip of an impressive iceberg. It is a special case of the Gauß-Green-Stokes family of theorems, which can be generalized in many ways. A simple consequence of Theorem 20.2.3 is the corresponding Theorem 20.3.2 for two variables. The theorem holds for very general regions, but it is illuminating to first prove it for a rectangular region.

20.3.2 Theorem: Let $a_1, a_2, b_1, b_2 \in \mathbb{R}$ with $a_1 < b_1$ and $a_2 < b_2$, and let $A : [a_1, b_1] \times [a_2, b_2] \to \mathbb{R}^2$ be a continuous function such that $A$ has partial differentials on $(a_1, b_1) \times (a_2, b_2)$. Then

$$\int_\Omega (\partial_1 A_2 - \partial_2 A_1)\,dx^1\,dx^2 = \int_{\partial\Omega} A.ds, \qquad (20.3.1)$$

where $\Omega = [a_1, b_1] \times [a_2, b_2]$ and $ds$ denotes the anti-clockwise line integral around $\partial\Omega$.

Proof: The first term $\partial_1 A_2$ may be integrated by Theorem 20.2.3 with respect to $x^1$ for fixed $x^2$. This gives $A_2(b_1, x^2) - A_2(a_1, x^2)$. (See Figure 20.3.1.) So

$$\begin{aligned}
\int_\Omega \partial_1 A_2\,dx^1\,dx^2
&= \int_{a_2}^{b_2} \int_{a_1}^{b_1} \partial_1 A_2(x^1, x^2)\,dx^1\,dx^2 \\
&= \int_{a_2}^{b_2} \bigl( A_2(b_1, x^2) - A_2(a_1, x^2) \bigr)\,dx^2.
\end{aligned}$$

Figure 20.3.1 Integration of exterior derivative in a rectangle (the rectangle $[a_1, b_1] \times [a_2, b_2]$ with sides $\gamma_b$, $\gamma_r$, $\gamma_t$, $\gamma_\ell$ and a horizontal integration path for $\int_{a_1}^{b_1} \partial_1 A_2(x^1, x^2)\,dx^1$)

Similarly

$$\begin{aligned}
\int_\Omega -\partial_2 A_1\,dx^1\,dx^2
&= \int_{a_1}^{b_1} \int_{a_2}^{b_2} -\partial_2 A_1(x^1, x^2)\,dx^2\,dx^1 \\
&= \int_{a_1}^{b_1} \bigl( A_1(x^1, a_2) - A_1(x^1, b_2) \bigr)\,dx^1.
\end{aligned}$$

So the left-hand side of (20.3.1) becomes the sum of anti-clockwise line integrals:

$$\int_{\gamma_r} A.ds + \int_{\gamma_\ell} A.ds + \int_{\gamma_b} A.ds + \int_{\gamma_t} A.ds,$$

where $\gamma_r$, $\gamma_\ell$, $\gamma_b$ and $\gamma_t$ denote respectively the right, left, bottom and top sides of $[a_1, b_1] \times [a_2, b_2]$.

[ Both Theorems 20.2.3 and 20.3.2 need a lot of improvement. Also give an n-dimensional version of the rectangle versions of the Stokes formula. ]
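Formula (20.3.1) is easy to test numerically on a rectangle. The sketch below is only an illustration under stated assumptions: the polynomial field $A = (A_1, A_2)$, the rectangle $[0,1] \times [0,2]$ and the midpoint quadrature rule are arbitrary choices which are not taken from the text.

```python
def A1(x, y):
    """First component of a hypothetical test field."""
    return -y**3

def A2(x, y):
    """Second component of a hypothetical test field."""
    return x**3 + x * y

def curl(x, y):
    """d1 A2 - d2 A1 for the field above, computed by hand."""
    return (3.0 * x**2 + y) - (-3.0 * y**2)

def midpoint(fn, lo, hi, n=200):
    """Midpoint-rule approximation of the integral of fn over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(fn(lo + (k + 0.5) * h) for k in range(n))

def area_integral(a1, b1, a2, b2):
    """Left-hand side of (20.3.1) as an iterated integral."""
    return midpoint(lambda y: midpoint(lambda x: curl(x, y), a1, b1), a2, b2)

def boundary_integral(a1, b1, a2, b2):
    """Right-hand side of (20.3.1): anti-clockwise line integral of A."""
    bottom = midpoint(lambda x: A1(x, a2), a1, b1)    # left to right
    right = midpoint(lambda y: A2(b1, y), a2, b2)     # upward
    top = -midpoint(lambda x: A1(x, b2), a1, b1)      # right to left
    left = -midpoint(lambda y: A2(a1, y), a2, b2)     # downward
    return bottom + right + top + left

if __name__ == "__main__":
    print(area_integral(0.0, 1.0, 0.0, 2.0), boundary_integral(0.0, 1.0, 0.0, 2.0))
```

For this field both sides equal 12 when evaluated by hand, so the two printed numbers should agree with that value up to quadrature error. The minus signs on the top and left edges encode the anti-clockwise orientation, matching the signs of the two boundary terms in the proof.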
20.3.3 Remark: Some elementary scaling tests can be applied to Theorem 20.3.2 to ensure that it is at least plausible. The left-hand side of equation (20.3.1) has the terms $\partial_1 A_2 - \partial_2 A_1$ in the integrand. These both look like covariant tensors of degree 2. So if the coordinates are multiplied by 2, they will be multiplied by 1/4. The differential form of the integral is $dx^1\,dx^2$, which looks like a contravariant tensor of degree 2. So under any simple coordinate scaling, the left-hand side should remain constant. In fact, this is true for any $C^1$ transformation. The right-hand side of equation (20.3.1) has an integrand which looks like a covariant tensor of degree 1 and a differential form $ds$ which looks like a contravariant tensor of degree 1. So this is also invariant under coordinate scaling. This kind of “dimensional analysis” is useful as a basic sanity check for equations and expressions in differential geometry.

20.3.4 Remark: The proof of Theorem 20.3.2 gives a clue for how to remember the form of the exterior derivative. The term $\partial_1 A_2$, if integrated on its own, yields the difference between $A_2$ on the right and left sides of the integration region, which is just like in the fundamental theorem of calculus. All that is happening here is that the partial derivative $\partial_1$ is cancelled by the integration $\int \ldots\,dx^1$. Then the integral over $x^2$ sums this right-left difference over the whole right and left edges of the region. The term $\partial_2 A_1$ may be understood similarly, except that a minus-sign is required because the top edge integral is going in a negative direction, i.e. in the direction of decreasing $x^1$. Thus one may think of $\partial_1 A_2 - \partial_2 A_1$ as “the right-left difference of $A_2$ minus the top-bottom difference of $A_1$”.
20.3.5 Remark: If the vector field $A$ in Theorem 20.3.2 is replaced with the gradient $(\partial_1 f, \partial_2 f)$ of a $C^2$ function $f : \Omega \to \mathbb{R}$, the left-hand integral has a value equal to zero because $\partial^2 f/\partial x^1 \partial x^2 = \partial^2 f/\partial x^2 \partial x^1$. So the theorem implies that the integral of the gradient $df$ around the boundary $\partial\Omega$ is zero. This is not surprising because this integral of $df$ signifies the difference in “height” of the function $f$ as a point completes a loop around the boundary. This must be zero so that the value of $f$ comes back to where it started. This is, in fact, a consequence of the fundamental theorem of calculus applied to the boundary curve. (This pattern continues for higher dimensions.)

From this observation, one may draw an interpretation of the expression $\partial_1 A_2 - \partial_2 A_1$ as the “deviation of the vector field $A$ from the differential of a function $f$”. In fact, if this integral is always zero, it follows that the boundary curve integral of $A$ is independent of path, in which case, an integral $f$ may be determined simply by integration of $A$ along non-closed curves. If the vector field $A$ represents a physical force field, the integrals in Theorem 20.3.2 may be thought of as the energy gained by one rotation around the boundary curve. So a zero value implies that the field is conservative.

20.3.6 Remark: Roughly speaking, Theorem 20.3.2 suggests that the exterior derivative $\partial_1 A_2 - \partial_2 A_1$ may be thought of as “curl per unit area”. The kind of directional boundary path integral in Theorem 20.3.2 has an interesting additive property. If two rectangular regions are placed side by side, the common boundary segments cancel each other. The arrows cannot have the same direction on a common segment if they are always oriented counterclockwise as indicated. If a region is partitioned into many rectangles, the integral of the curl operator over the entire region may be calculated by integrating around its boundary, ignoring all of the internal line segments where the component rectangles coincide. In fact, this can be generalized to almost any region at all. This is not a surprising result when it is considered that the differential operator in the interior is equal to the limit of the per-area boundary line integral for vanishingly small rectangles. Thus, roughly speaking, one may write:

$$\begin{aligned}
(\partial_i A_j - \partial_j A_i)(p)
&= \lim_{\Omega\to\{p\}} \frac{1}{\mu(\Omega)} \int_\Omega (\partial_i A_j - \partial_j A_i)\,d\mu \\
&= \lim_{\Omega\to\{p\}} \frac{1}{\mu(\Omega)} \int_{\partial\Omega} A.ds,
\end{aligned}$$

for $p \in \mathbb{R}^2$, where $\mu$ is the Lebesgue measure in $\mathbb{R}^2$ and the expression “$\lim_{\Omega\to\{p\}}$” can be made precise in terms of the diameter of $\Omega$. This interpretation may be compared with the corresponding formula in $\mathbb{R}^1$:

$$\begin{aligned}
\partial_i f(p)
&= \lim_{I\to\{p\}} \frac{1}{\mathrm{length}(I)} \int_I \partial_i f\,d\mu \\
&= \lim_{I\to\{p\}} \frac{1}{\mathrm{length}(I)} \int_{\partial I} f\,d\mu \\
&= \lim_{a,b\to p} \frac{f(b) - f(a)}{b - a},
\end{aligned}$$

where $\partial I$ denotes the “signed boundary” of the interval $I = [a, b]$. This “signed boundary” is positive at $b$ and negative at $a$. Hence the boundary integral is the integral of $f$ multiplied by the 0-form which equals 1 at $b$ and $-1$ at $a$. The Stokes Theorem follows very naturally from the way the exterior derivative (in this case the curl operator) is defined. The reason for the name “exterior derivative” is clear from the “limit of per-area boundary integral” interpretation. (See also Remark 20.6.1 for this interpretation.) [ Define the physics version of “curl” and define its precise relation to the exterior derivative. ]

20.3.7 Remark: See Remark 19.3.5 for the calculation of the transformation rule for the exterior derivative of a covariant vector field. It transforms like a covariant tensor of degree 2. This would seem to imply that Theorem 20.3.2 holds when the point space is subjected to an arbitrary $C^1$ diffeomorphism. [ Explain why the situation is not so good when the vector field is contravariant. Using the transformation rules for the differential forms in Theorem 20.3.2 under general diffeomorphisms (even to spaces with more than 2 dimensions), it should be possible to generate a useful class of generalizations using only diffeomorphisms. Such generalizations should give exactly the same answer as the standard general Stokes theorem. ]
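The “curl per unit area” interpretation in Remark 20.3.6 can be observed numerically by shrinking a square around a point. The sketch below assumes a hypothetical polynomial field and uses squares centred at $p$; the function names and the choice of field are illustrative assumptions only.

```python
def A1(x, y):
    """First component of a hypothetical test field."""
    return -y**3

def A2(x, y):
    """Second component of a hypothetical test field."""
    return x**3 + x * y

def curl(x, y):
    """d1 A2 - d2 A1 for the field above, computed by hand."""
    return 3.0 * x**2 + y + 3.0 * y**2

def midpoint(fn, lo, hi, n=200):
    """Midpoint-rule approximation of the integral of fn over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(fn(lo + (k + 0.5) * h) for k in range(n))

def circulation_per_area(p, r):
    """Anti-clockwise line integral of A around the square of half-width r
    centred at p, divided by the area of the square."""
    x0, y0 = p
    a1, b1, a2, b2 = x0 - r, x0 + r, y0 - r, y0 + r
    total = (midpoint(lambda x: A1(x, a2), a1, b1)      # bottom, left to right
             + midpoint(lambda y: A2(b1, y), a2, b2)    # right, upward
             - midpoint(lambda x: A1(x, b2), a1, b1)    # top, right to left
             - midpoint(lambda y: A2(a1, y), a2, b2))   # left, downward
    return total / ((b1 - a1) * (b2 - a2))

if __name__ == "__main__":
    p = (0.5, 0.25)
    for r in (0.1, 0.01, 0.001):
        print(r, circulation_per_area(p, r), curl(*p))
```

As the half-width `r` decreases, the printed ratio should approach the pointwise value `curl(*p)`, which is the limit statement of the remark restricted to squares.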
20.4. Rectangular Stokes theorem in three dimensions

This section extends Section 20.3 to rectangular solids in $\mathbb{R}^3$.

20.4.1 Remark: Stokes theorem can be extended from a rectangle to a rectangular solid. Consider first the surface integral for $S_A = \{x^1\} \times [x^2, x^2 + \Delta x^2] \times [x^3, x^3 + \Delta x^3]$. The integral of the 2-form $\lambda \in X^1(\Lambda_2 T(\mathbb{R}^3))$ is $\int_{S_A} \lambda(x)(e_2, e_3)\,dx^2\,dx^3$. In terms of coordinates, let $\lambda(x)(e_2, e_3) = a_{23}(x)$. (See Figure 20.4.1.)

Figure 20.4.1 Stokes theorem in $\mathbb{R}^3$ (the opposite faces $S_A$ and $S_B$ of a rectangular solid, with the integrals $\int_{S_A} \lambda(x)(e_2, e_3)\,dx^2\,dx^3$ and $\int_{S_B} \lambda(x)(e_2, e_3)\,dx^2\,dx^3$)

Then by subtracting the integral over $S_A$ from the integral over the opposite face $S_B$ and dividing by $\Delta x^1 \Delta x^2 \Delta x^3$ and taking the limit, the result is $\partial_1 a_{23}(x) = (\partial/\partial x^1) \lambda(x^1, x^2, x^3)(e_2, e_3)$. When the other two surface pairs are added, the result is $\partial_1 a_{23}(x) - \partial_2 a_{13}(x) + \partial_3 a_{12}(x)$. By integrating this over a non-infinitesimal rectangular solid as for a 2-dimensional rectangle, the result is

$$\int_\Omega \bigl( \partial_1 a_{23}(x) - \partial_2 a_{13}(x) + \partial_3 a_{12}(x) \bigr)\,dx^1\,dx^2\,dx^3 = \int_{\partial\Omega} \lambda(x)(dA).$$
This suggests that $d\lambda(x)$ should be defined as $\partial_1 a_{23}(x) - \partial_2 a_{13}(x) + \partial_3 a_{12}(x)$ in order to make the Stokes formula valid. The Stokes formula for a rectangular solid is then

$$\int_\Omega (\partial_1 a_{23} - \partial_2 a_{13} + \partial_3 a_{12})\,dx^1\,dx^2\,dx^3 = \int_{\partial\Omega} a.dA.$$

The integral over the surface could be called an “exterior integral”, while the integral over the rectangular region could be called an “internal integral”. The name of the “exterior derivative” is clearly related to the external nature of the surface integral. An easy proof of the Stokes formula for a rectangular solid follows the same method as for Theorem 20.3.2. Each of the three terms of the integrand may be integrated along lines as in Figure 20.4.2 to get rid of the partial derivative. Each of these line integrals may be integrated over the corresponding faces of the rectangular solid. These solid integrals may be added to give the surface integral.

Figure 20.4.2 Stokes theorem integration paths in $\mathbb{R}^3$ (a line of integration for $\int_{a_1}^{b_1} \partial_1 a_{23}(x)\,dx^1$ through the rectangular solid)

20.4.2 Remark: As for the rectangular Stokes theorem in $\mathbb{R}^2$ (Remark 20.3.6), one may motivate the definition of the exterior derivative by the (very rough) expression:

$$(\partial_1 a_{23} - \partial_2 a_{13} + \partial_3 a_{12})(p) = \lim_{\Omega\to\{p\}} \frac{1}{\mu(\Omega)} \int_{\partial\Omega} a.dA,$$

for $p \in \mathbb{R}^3$, where $\mu$ is the Lebesgue measure in $\mathbb{R}^3$. [ Maybe derive the expression for the exterior derivative for general dimensions by using rectangular regions. Then show the Stokes theorem for $C^1$ regions. Show how only the antisymmetric part of a multilinear function of a set of n vectors is significant. This motivates the definition of alternating tensors. ]
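The Stokes formula for a rectangular solid in Remark 20.4.1 can also be checked numerically. The sketch below is an illustration only: the components `a23`, `a13`, `a12` are hypothetical polynomial choices, the solid is the unit cube, and the sign of each face pair follows the alternating pattern $+\partial_1 a_{23} - \partial_2 a_{13} + \partial_3 a_{12}$ of the integrand.

```python
def a23(x1, x2, x3):
    """Hypothetical component lambda(e2, e3)."""
    return x1**2 * x2

def a13(x1, x2, x3):
    """Hypothetical component lambda(e1, e3)."""
    return x1 * x2**2

def a12(x1, x2, x3):
    """Hypothetical component lambda(e1, e2)."""
    return x3**3 + x1

def d_lambda(x1, x2, x3):
    """d1 a23 - d2 a13 + d3 a12 for the components above, computed by hand."""
    return 2.0 * x1 * x2 - 2.0 * x1 * x2 + 3.0 * x3**2

def grid(n):
    """Midpoints of n equal subintervals of [0, 1]."""
    return [(k + 0.5) / n for k in range(n)]

def volume_integral(n=20):
    """Midpoint-rule integral of d_lambda over the unit cube."""
    pts = grid(n)
    return sum(d_lambda(u, v, w) for u in pts for v in pts for w in pts) / n**3

def surface_integral(n=20):
    """Sum of the three signed pairs of face integrals over the unit cube."""
    pts = grid(n)
    faces1 = sum(a23(1.0, v, w) - a23(0.0, v, w) for v in pts for w in pts)
    faces2 = sum(a13(u, 1.0, w) - a13(u, 0.0, w) for u in pts for w in pts)
    faces3 = sum(a12(u, v, 1.0) - a12(u, v, 0.0) for u in pts for v in pts)
    return (faces1 - faces2 + faces3) / n**2

if __name__ == "__main__":
    print(volume_integral(), surface_integral())
```

For these components both integrals equal 1 when evaluated by hand, so the two printed values should agree with that up to quadrature error.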
20.5. Differential forms

The term “differential form” originates from expressions like “$dx\,dy$” which appear in integrals, especially with respect to two or more independent variables. Just as directional derivatives may be identified with tangent vectors, so also the differentials which appear in integrals may be identified with cotangent vectors, namely the duals of tangent vectors.

20.5.1 Remark: This section introduces differential forms in flat space. Differential forms in physics represent densities of physical quantities which may be scalars, vectors or tensors. The density may be linear density along a curve, per-area density on a surface or per-volume density, and so forth for higher dimensions. Exterior calculus may be thought of as the calculus of integration on embedded submanifolds.
A differential form represents something that one may want to integrate. A differential form φ of degree m is an alternating multilinear function of sequences of m tangent vectors in some space, which may be a flat space or a manifold. If m = 1, then φ would typically be integrated along a curve and the single tangent vector would be the velocity vector of the curve. If m = 2, then φ would typically be integrated over a 2-dimensional surface, and the two tangent vectors would be tangential to the surface at each point. Differential forms φ of degree m tell you how much of something there is for a given amount of m-area. Since they are alternating multilinear with respect to sequences of m tangent vectors, they are therefore linear with respect to m-area. So for m = 1, φ depends linearly on the length of a curve. For m = 2, φ represents the amount of something per unit area at a given point. And so forth. Linearity with respect to m-area ensures that the integral is independent of the way in which a region is subdivided. So linearity seems to be an inescapable part of the definition. [ How do differential forms relate to general spaces of differentials such as T (M1 , M2 )? ] [ See notes I. See also Federer [105] 4.1.6, EDM2 [34] 105.Q, and Malliavin [35], Section 7.5, page 71. See Chapter 4 of the second part of Malliavin [35] (p.112). ] [ Should either define “flat space” very carefully somewhere or else not mention it at all. The same goes for “Euclidean spaces”. Particular difficulties arise with infinite-dimensional spaces. It should be pointed out which things are okay in infinite-dimensional space and which are not. Topology gets tricky, for instance. ]
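The description of a differential form as an alternating multilinear function of tangent vectors can be made concrete at a single point. The sketch below assumes a 2-form on $\mathbb{R}^3$ given by three hypothetical coefficients; the name `two_form` and the coefficient values are illustrative assumptions, not notation from the text.

```python
def two_form(a12, a13, a23):
    """Return the alternating bilinear map
    (u, v) -> sum over i < j of a_ij * (u_i v_j - u_j v_i)
    on pairs of vectors in R^3."""
    def phi(u, v):
        return (a12 * (u[0] * v[1] - u[1] * v[0])
                + a13 * (u[0] * v[2] - u[2] * v[0])
                + a23 * (u[1] * v[2] - u[2] * v[1]))
    return phi

def add(u, v):
    """Componentwise sum of two vectors."""
    return tuple(ui + vi for ui, vi in zip(u, v))

def scale(k, u):
    """Scalar multiple of a vector."""
    return tuple(k * ui for ui in u)

if __name__ == "__main__":
    phi = two_form(1.0, -2.0, 4.0)          # hypothetical coefficients
    u, v, w = (1.0, 2.0, 0.0), (0.0, 1.0, 3.0), (2.0, 0.0, 1.0)
    print(phi(u, v), phi(v, u))             # alternating: opposite signs
    print(phi(u, add(v, w)), phi(u, v) + phi(u, w))   # additive in each slot
```

Additivity in the second argument is what makes the value assigned to a parallelogram equal to the sum of the values assigned to the pieces when one edge is subdivided, and the alternating property gives zero for a degenerate parallelogram with two equal edges.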
20.5.2 Definition: The (alternating) m-form bundle on $\mathbb{R}^n$ for $m, n \in \mathbb{Z}^+_0$ is the set $\Lambda_m T(\mathbb{R}^n) = \mathbb{R}^n \times \Lambda_m(\mathbb{R}^n)$ together with the transformation rule $\hat\psi : \Lambda_m T(\mathbb{R}^n) \mathrel{\mathring{\to}} \Lambda_m T(\mathbb{R}^n)$ for alternating m-forms under $C^1$ diffeomorphisms $\phi : \mathbb{R}^n \mathrel{\mathring{\to}} \mathbb{R}^n$.
[ Must write out the full transformation rules for m-forms in Definition 20.5.2 somewhere. ]

20.5.3 Notation: $\Lambda_m T_x(\mathbb{R}^n)$ for $m, n \in \mathbb{Z}^+_0$ and $x \in \mathbb{R}^n$ denotes the subset $\{x\} \times \Lambda_m(\mathbb{R}^n)$ of the m-form bundle $\Lambda_m T(\mathbb{R}^n)$ on $\mathbb{R}^n$.

20.5.4 Definition: A cross-section of the (alternating) m-form bundle on $\mathbb{R}^n$ for $m, n \in \mathbb{Z}^+_0$ is a function $f : \mathbb{R}^n \to \Lambda_m T(\mathbb{R}^n)$ such that $f(x) \in \Lambda_m T_x(\mathbb{R}^n)$ for all $x \in \mathrm{Dom}(f)$.
A differential form of degree m on $\mathbb{R}^n$ for $m, n \in \mathbb{Z}^+_0$ is a cross-section of the m-form bundle on $\mathbb{R}^n$.

20.5.5 Definition: A $C^r$ cross-section of the m-form bundle on $\mathbb{R}^n$ for $m, n \in \mathbb{Z}^+_0$ and $r \in \overline{\mathbb{Z}}{}^+_0$ is a cross-section $f : \mathbb{R}^n \to \Lambda_m T(\mathbb{R}^n)$ such that f is of class $C^r$.
A $C^r$ differential form of degree m on $\mathbb{R}^n$ for $m, n \in \mathbb{Z}^+_0$ and $r \in \overline{\mathbb{Z}}{}^+_0$ is a $C^r$ cross-section of the m-form bundle on $\mathbb{R}^n$.

20.5.6 Notation: $X^r(\Lambda_m T(\mathbb{R}^n))$ denotes the set of $C^r$ cross-sections of $\Lambda_m T(\mathbb{R}^n)$ for $m, n \in \mathbb{Z}^+_0$ and $r \in \overline{\mathbb{Z}}{}^+_0$.

[ Remark 20.5.7 needs to be fixed. It shouldn’t refer to manifolds. See Federer [105], page 352. ]
20.5.7 Remark: The historical origin of the term “differential form” is the use in classical differential geometry of abstract expressions such as “$f(x_1, x_2)\,dx^1 + g(x_1, x_2)\,dx^2$” to represent differential forms. If abstract differentials such as $dx^i$ are given a concrete representation such as linear maps $d\psi^i : T(M) \to \mathbb{R}$ with $d\psi^i : V \mapsto v^i$ (where $v^i$ is the $i$th component of $V \in T(M)$), then $dx^i = d\psi^i$ is a differential form of degree 1. Products of differential forms such as $dx^1\,dx^2$ can be similarly interpreted.
[ Define exterior product. See Crampin/Pirani [11], pages 91, 95, 97, 99, 104, 258. And interior product? ]
[ Must also define differential forms which are valued in spaces of infinitesimal translations of a fibre space F or fibre sets $E_b$. These are vector fields rather than vectors. Maybe do this in the corresponding fibre space chapter/section instead. ]
20.6. The exterior derivative
The exterior derivative could not be defined in Chapter 13 on tensors because it requires differential calculus for its definition, and its motivation comes from integral calculus for multiple variables. For example, Theorem 20.3.2 gives a motivation for the exterior derivative of a 1-form.
[ For exterior derivative, see also Crampin/Pirani [11], pages 120–125, 259. ]

20.6.1 Remark: The exterior derivative dω of a form ω of degree m may be interpreted as the infinitesimal limit of the integral of ω over the m-dimensional boundary of an (m + 1)-dimensional submanifold divided by the (m + 1)-dimensional area of the submanifold as it shrinks to a point. The Stokes theorem supports this interpretation. The definition of exterior derivative is designed to make the Stokes theorem valid. One may regard the Stokes theorem as the definition of the exterior derivative. Defining the exterior derivative in this way is more satisfying than pulling it out of a hat. (See Section 20.3 for motivation of the exterior derivative in terms of the Stokes formula.)

20.6.2 Definition: The exterior derivative $d : X^1(\Lambda_m T(\mathbb{R}^n)) \to X^0(\Lambda_{m+1} T(\mathbb{R}^n))$ for $n, m \in \mathbb{Z}^+_0$ is defined by
$$\forall \phi \in X^1(\Lambda_m T(\mathbb{R}^n)),\ \forall x \in \mathbb{R}^n,\ \forall i \in I^n_{m+1},\quad d\phi(x)(e_i) = \sum_{r=1}^{m+1} (-1)^{r-1}\, \partial_{i_r} \phi(x)(\mathrm{omit}_r\, e_i), \qquad (20.6.1)$$
where $e(x) \in T_x(\mathbb{R}^n)^n$ is a constant sequence of basis vectors for $\mathbb{R}^n$. Here $\partial_{i_r}$ denotes the partial derivative with respect to the coordinate whose index $i_r$ is omitted in the $r$th term.

20.6.3 Remark: Definition 20.6.2 expresses the exterior derivative in terms of basis vectors $e_1, \ldots, e_n$ in the tangent space of the point set $\mathbb{R}^n$. It is also possible to express the exterior derivative in terms of basis vectors $e^j = e^{j_1} \wedge \ldots \wedge e^{j_m}$ of the alternating m-form space $\Lambda_m T(\mathbb{R}^n)$ for $j \in I^n_m$, where the sequence $e^1, \ldots, e^n$ is the dual basis of the sequence $e_1, \ldots, e_n$. This kind of definition has the advantage of being easier to motivate in terms of the Stokes formula, but is less concise to state.

A function $f \in X^1(\Lambda_0(\mathbb{R}^n)) = C^1(\mathbb{R}^n, \mathbb{R})$ is simply a $C^1$ real-valued function on $\mathbb{R}^n$. The exterior derivative of such a function f is the ordinary differential $df = \sum_{r=1}^n \partial_r f.e^r$. (See Definition 19.2.14.) The reason for choosing to define the exterior derivative to equal the differential for m = 0 is to make the fundamental theorem of calculus (Theorem 20.2.3) valid. (See Remark 20.2.4 for the FTOC on curves.)

For $j \in I^n_m$, the function $g(x) = \bigwedge_{k=1}^m e^{j_k}$ in $X^\infty(\Lambda_m(\mathbb{R}^n))$ is constant. So it is assumed that it has exterior derivative equal to zero. The m-linear function g(x) has the value
$$g(x)(e_i) = \Big(\bigwedge_{k=1}^m e^{j_k}\Big)(e_i) = \delta^i_j = \delta^{i_1, i_2 \ldots i_m}_{j_1, j_2 \ldots j_m}$$
for all $x \in \mathbb{R}^n$ and $i \in I^n_m$. Since f is a 0-form and g is an m-form, the exterior product $f \wedge g$ is a well-defined m-form because 0 + m = m. (See Section 9.9.15 for the exterior product.) In this case, the exterior product is the same as the pointwise product. That is, $(f \wedge g)(x) = f(x)g(x)$ for all $x \in \mathbb{R}^n$. The exterior derivative of this m-form $f \wedge g$ is defined to be $d(f \wedge g) = (df) \wedge g$ because g is constant. Thus $(d(f \wedge g))(x) = (df)(x) \wedge g(x)$ for all $x \in \mathbb{R}^n$. It follows immediately that
$$d(f(x).e^{j_1} \wedge \ldots \wedge e^{j_m}) = \sum_{r=1}^n \partial_r f(x).e^r \wedge e^{j_1} \wedge \ldots \wedge e^{j_m}.$$
A general m-form $\alpha \in X^1(\Lambda_m(\mathbb{R}^n))$ may be written as a sum of products of real-valued functions $f \in X^1(\Lambda_0(\mathbb{R}^n))$ with constant simple m-forms $g^j = \bigwedge_{k=1}^m e^{j_k} \in X^1(\Lambda_m(\mathbb{R}^n))$ of the basis vectors $e^1, \ldots, e^n$. Due to the antisymmetry rules, it is sufficient to use only increasing index sequences $j \in I^n_m$. Thus
$$\alpha = \sum_{j \in I^n_m} f_j\, g^j = \sum_{j \in I^n_m} f_j \bigwedge_{k=1}^m e^{j_k}$$
for some functions $f_j \in X^1(\Lambda_0(\mathbb{R}^n))$ for $j \in I^n_m$. By linearity of the exterior derivative, it follows then that
$$d\alpha = \sum_{r=1}^n \sum_{j \in I^n_m} (\partial_r f_j).e^r \wedge \bigwedge_{k=1}^m e^{j_k} \qquad (20.6.2)$$
$$\phantom{d\alpha} = \sum_{r=1}^n \sum_{j \in I^n_m} (\partial_r f_j).e^r \wedge e^{j_1} \wedge \ldots \wedge e^{j_m}.$$
Finally the general (m + 1)-form dα may be evaluated for a sequence $e_i \in (\mathbb{R}^n)^{m+1}$ for $i \in I^n_{m+1}$ to give
$$d\alpha(e_i) = \sum_{r=1}^n \sum_{j \in I^n_m} (\partial_r f_j).\Big(e^r \wedge \bigwedge_{k=1}^m e^{j_k}\Big)(e_i) = \sum_{r=1}^n \sum_{j \in I^n_m} (\partial_r f_j).\delta^{r,j}_i,$$
where the notation “r, j” means the concatenation of the one-element sequence (r) with the m-element sequence j to give the (m + 1)-element sequence $(r, j_1, \ldots, j_m)$. The term $\delta^{r,j}_i$ vanishes unless the coordinate index r is one of the entries $i_1, \ldots, i_{m+1}$ of i, so only these m + 1 values of r contribute. Writing such an entry as $i_r$, with r now denoting its position in i,
$$\Big(\bigwedge_{k=1}^m e^{j_k}\Big)(\mathrm{omit}_r\, e_i) = \delta^j_{\mathrm{omit}_r(i)} = \delta^{i_r, j}_{i_r, \mathrm{omit}_r(i)} = (-1)^{r-1}\, \delta^{i_r, j}_i.$$
Therefore
$$d\alpha(e_i) = \sum_{r=1}^{m+1} \sum_{j \in I^n_m} (-1)^{r-1} (\partial_{i_r} f_j).\Big(\bigwedge_{k=1}^m e^{j_k}\Big)(\mathrm{omit}_r\, e_i) = \sum_{r=1}^{m+1} (-1)^{r-1}\, \partial_{i_r} \alpha(\mathrm{omit}_r\, e_i)$$
for all $i \in I^n_{m+1}$. This agrees with Definition 20.6.2.

In summary, equation (20.6.1) is applicable when an m-form is thought of as a general m-linear map on the tangent spaces of the tangent bundle, whereas equation (20.6.2) is better suited to an m-form which is expressed as the sum of simple m-covectors of basis vectors.

20.6.4 Theorem: The exterior derivative in Definition 20.6.2 is independent of the choice of basis vectors.

20.6.5 Theorem: For all $m \in \mathbb{Z}^+_0$, the exterior derivative of an m-form is an (m + 1)-form.
Proof: It must be shown that the exterior derivative of an m-form transforms under a diffeomorphism as a covariant tensor of degree m + 1, and that antisymmetry holds.

20.6.6 Theorem: For all $m \in \mathbb{Z}^+_0$ and $r \in \overline{\mathbb{Z}}{}^+_0$, the exterior derivative of a $C^{r+1}$ m-form is a $C^r$ (m + 1)-form.

20.6.7 Remark: Theorems 20.6.4, 20.6.5 and 20.6.6 verify that Definition 20.6.2 yields a well-defined $C^0$ (m + 1)-form. It follows from Theorem 20.6.6 that the exterior derivative of a $C^\infty$ m-form is also of class $C^\infty$. Theorem 20.6.8 refers to the linear structure of pointwise addition and scalar multiplication of m-covector tangent bundle cross-sections. Theorem 20.6.9 shows that the exterior derivative is an extension of the definition of the differential of a real-valued function.

20.6.8 Theorem: For all $m, n \in \mathbb{Z}^+_0$, the exterior derivative in Definition 20.6.2 is a linear map $d : X^1(\Lambda_m T(\mathbb{R}^n)) \to X^0(\Lambda_{m+1} T(\mathbb{R}^n))$.
20.6.9 Theorem: For all $n \in \mathbb{Z}^+_0$, for all $f \in X^1(\Lambda_0 T(\mathbb{R}^n))$, the exterior derivative $df \in X^0(\Lambda_1 T(\mathbb{R}^n))$ is the same thing as the differential $df \in X^0(T^*(\mathbb{R}^n))$.

20.6.10 Theorem: For all $n, m, q \in \mathbb{Z}^+_0$, for all $f \in X^1(\Lambda_m T(\mathbb{R}^n))$ and $g \in X^1(\Lambda_q T(\mathbb{R}^n))$,
$$d(f \wedge g) = (df) \wedge g + (-1)^m f \wedge (dg).$$

20.6.11 Theorem: For all $n, m \in \mathbb{Z}^+_0$, for all $f \in X^2(\Lambda_m T(\mathbb{R}^n))$, $d(df) = 0$.

20.6.12 Remark: Theorem 20.6.11 may be thought of as: “The boundary of the boundary is empty.” Alternatively: “The exterior of the exterior is zero.” Since the exterior derivative is the limit of the integral of a differential form over the boundary of a region, it makes sense that repeating this operation yields zero.
Another way to think of Theorem 20.6.11 is: “The curl of a conservative field is zero.” In physics, conservative force fields (i.e. fields in which a particle, loop or surface acquires no net energy when it moves in a closed path) are often represented as the exterior derivative of a differential form representing energy. It then follows that the exterior derivative of the force field must be zero.
[ See Frankel [18], p. 73, thm. 2.53 for a proof of Theorem 20.6.13. ]

20.6.13 Theorem: For $n \in \mathbb{Z}^+_0$, suppose that $(\mu_m)_{m=0}^\infty$ is a sequence of maps which satisfy
(i) $\mu_m : X^1(\Lambda_m T(\mathbb{R}^n)) \to X^0(\Lambda_{m+1} T(\mathbb{R}^n))$ for all $m \in \mathbb{Z}^+_0$;
(ii) $\mu_m$ is linear for all $m \in \mathbb{Z}^+_0$;
(iii) $\mu_0 f$ equals the differential df of f for all $f \in X^1(\Lambda_0 T(\mathbb{R}^n))$;
(iv) $\mu_{m+q}(f \wedge g) = \mu_m(f) \wedge g + (-1)^m f \wedge \mu_q(g)$ for all $f \in X^1(\Lambda_m T(\mathbb{R}^n))$, $g \in X^1(\Lambda_q T(\mathbb{R}^n))$ and $m, q \in \mathbb{Z}^+_0$;
(v) $\mu_{m+1}(\mu_m(f)) = 0$ for all $f \in X^2(\Lambda_m T(\mathbb{R}^n))$, for all $m \in \mathbb{Z}^+_0$.
Then $\mu_m(f) = df$ for all $f \in X^1(\Lambda_m T(\mathbb{R}^n))$, for all $m \in \mathbb{Z}^+_0$, where d is the exterior derivative in Definition 20.6.2. In other words, the exterior derivative is uniquely determined by the above conditions.

20.6.14 Remark: The usual basis e to use in Definition 20.6.2 is the sequence $(e_1, \ldots, e_n)$ of unit vectors in $T_x(\mathbb{R}^n)$. (For notational convenience, the dependence on $x \in \mathbb{R}^n$ is suppressed.)
The set $I^n_m$ denotes the set of indices $\{i : \mathbb{N}_m \to \mathbb{N}_n;\ \forall j, k \in \mathbb{N}_m,\ j < k \Rightarrow i(j) < i(k)\}$. This set contains $\#(I^n_m) = C^n_m$ sequences. (See Notation 7.11.10.) The notation $f_i$ for a function $f : \mathbb{N}_n \to A$ and $i \in I^n_m$ for any set A means the sequence $f \circ i = (f(i(j)))_{j=1}^m = (f_{i(j)})_{j=1}^m = (f_{i_1}, \ldots, f_{i_m}) : \mathbb{N}_m \to A$. (See Notation 7.11.13.)
Denote the value $\phi(x)(e_J)$ by $a_{j_1, \ldots j_m}$ . . .

20.6.15 Theorem: Let $m, n \in \mathbb{Z}^+_0$. Let $f \in X^1(\Lambda_m T(\mathbb{R}^n))$ be defined by $f(x) = \sum_{i \in I^n_m} a_i(x) \bigwedge_{k=1}^m e^{i_k}$, where $a_i \in C^1(\mathbb{R}^n)$ for all $i \in I^n_m$. Then the exterior derivative $df \in X^0(\Lambda_{m+1} T(\mathbb{R}^n))$ satisfies
$$(df)(x) = \sum_{i \in I^n_{m+1}} \sum_{r=1}^{m+1} (-1)^{r-1}\, \partial_{i_r} a_{\mathrm{omit}_r(i)}(x) \bigwedge_{k=1}^{m+1} e^{i_k} \qquad (20.6.3)$$
for all $x \in \mathbb{R}^n$.

20.6.16 Remark: The expression (20.6.3) in Theorem 20.6.15 is sometimes used as the definition of the exterior derivative.

20.6.17 Example: To show how Theorem 20.6.15 works in practice, let n = 4 and m = 2. Then $I^4_2 = \{(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)\}$.
Therefore
$$f(x) = a_{12}(x)\, e^1 \wedge e^2 + a_{13}(x)\, e^1 \wedge e^3 + a_{14}(x)\, e^1 \wedge e^4 + a_{23}(x)\, e^2 \wedge e^3 + a_{24}(x)\, e^2 \wedge e^4 + a_{34}(x)\, e^3 \wedge e^4,$$
where $a_{ij} \in C^1(\mathbb{R}^n)$ for all $(i, j) \in I^4_2$. Then
$$df(x) = (\partial_1 a_{23}(x) - \partial_2 a_{13}(x) + \partial_3 a_{12}(x))\, e^1 \wedge e^2 \wedge e^3$$
$$\phantom{df(x)} + (\partial_1 a_{24}(x) - \partial_2 a_{14}(x) + \partial_4 a_{12}(x))\, e^1 \wedge e^2 \wedge e^4$$
$$\phantom{df(x)} + (\partial_1 a_{34}(x) - \partial_3 a_{14}(x) + \partial_4 a_{13}(x))\, e^1 \wedge e^3 \wedge e^4$$
$$\phantom{df(x)} + (\partial_2 a_{34}(x) - \partial_3 a_{24}(x) + \partial_4 a_{23}(x))\, e^2 \wedge e^3 \wedge e^4.$$

20.6.18 Remark: Curvature of a manifold is defined as the exterior derivative of parallel transport because curvature is the deviation of parallelism around a closed path which bounds a disc. The laws of motion in physics are generally defined in terms of some kind of curvature. Therefore exterior derivatives of differential forms are important in applications of differential geometry to physics.
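The coefficient pattern of Example 20.6.17 can be reproduced mechanically. The following Python sketch computes the coefficients of df for n = 4 and m = 2 by summing over the omitted positions r and differentiating with respect to the omitted coordinate, which is the pattern visible in the example (for instance the term ∂4 a12 in the e1 ∧ e2 ∧ e4 coefficient). The functions `partial` and `exterior_derivative`, the finite-difference step and the particular polynomial coefficients are assumptions chosen only for illustration.

```python
from itertools import combinations

def partial(func, x, k, h=1e-5):
    """Central-difference approximation to the k-th partial derivative (0-based k)."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    return (func(xp) - func(xm)) / (2.0 * h)

def exterior_derivative(a, n, m, x):
    """Coefficients of df at x for f = sum_i a[i] e^{i_1} ^ ... ^ e^{i_m}.

    `a` maps increasing 1-based index tuples to coefficient functions.
    The result is keyed by increasing (m+1)-tuples.
    """
    df = {}
    for i in combinations(range(1, n + 1), m + 1):
        total = 0.0
        for r in range(1, m + 2):
            omitted = i[:r - 1] + i[r:]            # omit_r(i)
            # differentiate with respect to the omitted coordinate i_r
            total += (-1) ** (r - 1) * partial(a[omitted], x, i[r - 1] - 1)
        df[i] = total
    return df

# Hypothetical polynomial coefficients a_ij on IR^4 (x[0] is the first coordinate).
a = {
    (1, 2): lambda x: x[2] * x[3],
    (1, 3): lambda x: x[1] ** 2,
    (1, 4): lambda x: x[0] * x[1],
    (2, 3): lambda x: x[0] * x[3],
    (2, 4): lambda x: x[2],
    (3, 4): lambda x: x[0] * x[1] * x[2],
}
x0 = [1.0, 2.0, 3.0, 4.0]
df = exterior_derivative(a, 4, 2, x0)
for index in sorted(df):
    print(index, round(df[index], 6))
```

Evaluating the four bracketed expressions of the example by hand at the same point gives 4, 2, 6 and 3, which is what the sketch is expected to reproduce up to rounding.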
20.7. Exterior differentiation using Lie derivatives

20.7.1 Remark: The purpose of defining the exterior derivative in terms of Lie derivatives is to hide coordinates. When the exterior derivative is defined in terms of a constant basis as in Definition 20.6.2, it is not necessary to be concerned with the variation of the basis with respect to displacement in the point space. When differentiating an expression $\phi(x)(e_{i_1}, \ldots, e_{i_m})$ for $x \in \mathbb{R}^n$ and $\phi \in X^1(\Lambda_m T(\mathbb{R}^n, \mathbb{R}))$, the variation of the basis vectors $e_{i_1}, \ldots, e_{i_m}$ contributes no extra terms. However, if the basis vectors do vary with respect to x, this variation must be subtracted out.
In the Lie derivative versions of exterior derivative definitions, variable basis vectors $X_1, \ldots, X_{m+1}$ are used instead of constant basis vectors. To compensate for this variability, Lie derivatives (which follow the “flow”) are used instead of simple partial derivatives, and there are some compensatory terms involving Poisson brackets of the variable basis vectors. All in all, the simplicity of the constant-basis definition seems to be preferable. However, the Lie derivative versions can be useful when Lie derivatives are easier to calculate than simple partial derivatives.

20.7.2 Remark: Lie derivatives (and the Poisson bracket in particular) are known to yield tensors from tensors. This is an advantage relative to partial derivatives with respect to point coordinates. The partial derivatives must be carefully balanced by antisymmetrizing so as to cancel out the non-tensorial terms which transform according to higher derivatives (than the first derivative) of point transformations. Differentiating with respect to a chart does not generally yield tensors from tensors because a chart is not an intrinsic structure of a manifold.
[ A lot of things in Remark 20.7.3 need clarification. For example, the Lie derivative must be defined and differential forms with vector fields as arguments must be defined. They must also be extended from tensor monomials to general tensors. ]
[ Somewhere in this chapter, define vector field algebra and Lie derivatives in flat space. See Section 33.4 for Lie derivatives in curved space. ]
[ In Remark 20.7.3, expand the Lie derivatives $L_{X_i}$ to see what terms arise. They should cancel the undifferentiated φ terms. ]

20.7.3 Remark: For any sequence of p + 1 vector fields $X = (X_i)_{i=1}^{p+1}$,
$$d\phi(X) = d\phi(X_1, \ldots, X_{p+1})$$
$$= \sum_{i=1}^{p+1} (-1)^{i-1} L_{X_i}\big(\phi(\mathrm{omit}_i(X))\big) + \sum_{1 \le i < j \le p+1} (-1)^{i+j}\, \phi\big(\mathrm{insert}_{1,[X_i,X_j]}(\mathrm{omit}_{i,j}(X))\big)$$
$$= \sum_{i=1}^{p+1} (-1)^{i-1} L_{X_i}\big(\phi(X_1, \ldots, \hat{X}_i, \ldots, X_{p+1})\big) + \sum_{1 \le i < j \le p+1} (-1)^{i+j}\, \phi([X_i, X_j], X_1, \ldots, \hat{X}_i, \ldots, \hat{X}_j, \ldots, X_{p+1}),$$
where $L_X$ denotes the Lie derivative with respect to a vector field X.
[ Why is dφ(X) in Definition 20.6.2 defined for fields X instead of vectors V ? ]

20.7.4 Remark: The pointwise tensor style in Definition 20.6.2 is as in Federer [105] 4.1.6. The tensor field style is as in EDM2 [34] 105.Q(2), Gallot/Hulin/Lafontaine [19], pp. 43, 74, and Crampin/Pirani [11], page 125. Malliavin [35], page 117, gives the following for a vector field sequence $X = (X_i)_{i=1}^{p+1}$:
$$d\phi(X) = d\phi(X_1, \ldots, X_{p+1}) = \sum_{i=1}^{p+1} (-1)^{i-1} L_{X_i}\big(\phi(X_1, \ldots, \hat{X}_i, \ldots, X_{p+1})\big) + \sum_{1 \le i < j \le p+1} (-1)^{i+j}\, \phi(X_1, \ldots, \hat{X}_i, [X_i, X_j], \ldots, \hat{X}_j, \ldots, X_{p+1}),$$
which is apparently different to Definition 20.6.2.
[ It is clear from Crampin/Pirani [11], page 128, that $L_{X_i}(\phi(X_1, \ldots, \hat{X}_i, \ldots, X_{p+1}))$ means
$$(L_{X_i}\phi)(X_1, \ldots, \hat{X}_i, \ldots, X_{p+1}) + \sum_{j=1,\ j \ne i}^{p+1} \phi(X_1, \ldots, [X_i, X_j], \ldots, \hat{X}_i, \ldots, X_{p+1}).$$
I.e. must apply $L_{X_i}$ to both φ and the parameters of φ. ]
20.8. Geometric measure theory
[ Present the Gauß-Green theorem. See Federer [105] (4.5.6, page 478), EDM2 [34] (94.F, page 355, 105.U, page 390) and Frankel [18] (3.3b, page 111, 5.1, page 155). The EDM calls it the Stokes formula or the Green-Stokes formula. Frankel calls it Stokes’s Theorem but attributes it to Ampère, Kelvin, Green, Gauß and others. ]
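The planar case of the Gauß-Green (Green-Stokes) formula mentioned in the note above can be checked numerically on the unit square. The following Python sketch compares the integral of ∂Q/∂x − ∂P/∂y over the square with the integral of P dx + Q dy around its anticlockwise boundary. The 1-form coefficients P and Q, the hand-computed function `curl` and the midpoint-rule helper are assumptions chosen for this illustration only.

```python
def midpoint(func, a, b, steps=400):
    """Composite midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

# Hypothetical 1-form omega = P dx + Q dy on the unit square.
def P(x, y):
    return -x * y * y

def Q(x, y):
    return x * x * y

def curl(x, y):
    """dQ/dx - dP/dy, computed by hand: 2xy + 2xy."""
    return 4.0 * x * y

# Integral of d(omega) over the square [0, 1] x [0, 1].
interior = midpoint(lambda x: midpoint(lambda y: curl(x, y), 0.0, 1.0), 0.0, 1.0)

# Integral of omega over the boundary, traversed anticlockwise.
boundary = (midpoint(lambda x: P(x, 0.0), 0.0, 1.0)     # bottom edge, x increasing
            + midpoint(lambda y: Q(1.0, y), 0.0, 1.0)   # right edge, y increasing
            - midpoint(lambda x: P(x, 1.0), 0.0, 1.0)   # top edge, x decreasing
            - midpoint(lambda y: Q(0.0, y), 0.0, 1.0))  # left edge, y decreasing

print(interior, boundary)
```

For this choice of P and Q both integrals equal 1 when evaluated by hand, so the two printed values should agree up to rounding error.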
20.9. Stokes theorem

20.9.1 Remark: Theorem 20.9.2 is known as the Stokes theorem. The Stokes theorem may be regarded as a natural generalization of the fundamental theorem of calculus. (See also Section 20.2.)

20.9.2 Theorem:
$$\int_C d\omega = \int_{\partial C} \omega,$$
where C is a singular r-chain.

20.9.3 Remark: Theorem 20.9.2 is probably the deepest and most important statement in the differential layer of differential geometry. Although it is stated here in Euclidean space, the differential layer of a differentiable manifold is merely a topological generalization of Euclidean space, but is indistinguishable locally. So the theorem is valid also on general differentiable manifolds.
Theorem 20.9.2 combines multi-variable differentiation, geometric measure theory and algebraic topology in a single statement. It gives the motivation for much of the differential layer and is the basis of the important facts about the higher layers. It is a combination of local and global concepts. The Stokes Theorem is important enough to deserve its own chapter!

20.10. Radon measures
Radon measures are particularly useful for the study of hyperbolic first order systems of partial differential equations.
[ Radon measures are very important in the study of various dynamic systems and some areas of probability theory. Measures in general, and Radon measures in particular, have a non-trivial extension to differentiable manifolds. Radon measures are defined as duals of $C^0$ function spaces. ]
[ Must also have a section or chapter on the use of Radon measures in the solution of a wide variety of hyperbolic first-order partial differential equations, including certain “fluid flow models” which are useful in some teletraffic research. This might be best placed in another book. ]
20.11. Some integrability-based function spaces

20.11.1 Notation: Let $n \in \mathbb{Z}^+$, $\Omega \in \mathrm{Top}(\mathbb{R}^n)$, $k \in \mathbb{Z}^+_0$ and $p \in [1, \infty]$. Then $W^{k,p}(\Omega)$ denotes. . .
[ See Adams [92] for Sobolev space definitions. ]
20.12. Logarithmic and exponential functions
In terms of Taylor series, the exponential function is more natural than the logarithm function. But in terms of integrals, the logarithm is more natural. The exponential is not an integral of some simpler function such as a quotient of polynomials, whereas the logarithm arises naturally as the integral of $x^{-1}$. Therefore in this section, the exponential is defined in terms of the logarithm.

20.12.1 Definition: The logarithm function is the function $\ln : (0, \infty) \to \mathbb{R}$ defined by
$$\ln(x) = \int_1^x t^{-1}\, dt$$
for $x \in \mathbb{R}$ with $x > 0$.
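The integral definition of the logarithm, and the definition of the exponential function as its inverse (Definition 20.12.2), can be sketched numerically as follows. The logarithm is approximated by composite Simpson quadrature of 1/t, and exp(x) is obtained by bisection on the equation ln(y) = x, which relies only on the monotonicity of ln. The step counts, the bracketing strategy and the function names are assumptions of this sketch, and `math.log` and `math.exp` are used purely for comparison.

```python
import math

def ln(x, steps=2000):
    """Composite Simpson approximation to the integral of 1/t from 1 to x (x > 0)."""
    if x <= 0.0:
        raise ValueError("ln is defined only for x > 0")
    h = (x - 1.0) / steps            # negative when x < 1, giving a negative value
    total = 1.0 + 1.0 / x            # integrand at the two end points
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) / (1.0 + k * h)
    return total * h / 3.0

def exp(x, iterations=60):
    """Solve ln(y) = x for y > 0 by bisection."""
    lo = hi = 1.0
    while ln(hi) < x:                # expand the bracket upwards
        hi *= 2.0
    while ln(lo) > x:                # expand the bracket downwards
        lo /= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if ln(mid) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(ln(2.0), math.log(2.0))
print(exp(1.0), math.e)
print(ln(exp(0.5)), exp(ln(3.0)))
```

Bisection is slower than Newton's method but cannot leave the domain (0, ∞) of ln, which is why it is used here.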
20.12.2 Definition: The exponential function is the function $\exp : \mathbb{R} \to \mathbb{R}$ defined as the inverse of the logarithm function.

20.12.3 Remark: Definition 20.12.2 means that $\forall x \in \mathbb{R},\ \ln(\exp(x)) = x$. In other words, $\ln \circ \exp = \mathrm{id}_{\mathbb{R}}$. The equation $\ln(y) = x$ has one and only one solution $y \in (0, \infty)$ for each $x \in \mathbb{R}$ because the logarithm function is one-to-one and its range is $\mathbb{R}$. Equivalently, exp may be expressed as the left inverse of ln. That is, $\exp \circ \ln = \mathrm{id}_{(0,\infty)}$. In other words, $\forall x \in (0, \infty),\ \exp(\ln(x)) = x$.
It may seem a little troubling that such a basic function as the logarithm is defined as an integral. Integrals are cumbersome to calculate in practice. Defining the exponential function as the inverse of an integral means that it is defined as the solution of an integral equation. Thus $y = \exp(x)$ is defined as the solution of $\int_1^y t^{-1}\, dt = x$. The logarithm and exponential functions are often defined in terms of Taylor series which are better suited to computers which primarily offer addition and multiplication operations.
As always, one’s choice of definition can be optimized for a given range of applications. Integrals have some advantages for defining transcendental functions. Integrals don’t require convergence tests to ensure that they are well-defined. It is easier to show that integral-defined functions are solutions to differential equations. So for many analysis purposes, integral definitions are better. Series expansions are usually quite easy to derive from integral definitions.

20.12.4 Theorem: $\forall x \in \mathbb{R},\ \frac{d}{dx} \exp(x) = \exp(x)$.
Proof: By Definition 20.12.2, the exponential function satisfies the equation $\ln(\exp(x)) = x$ for all $x \in \mathbb{R}$. By Definition 20.12.1 and the chain rule for differentiation, $\exp(x)^{-1} \frac{d}{dx} \exp(x) = 1$. Therefore $\frac{d}{dx} \exp(x) = \exp(x)$ as claimed.

20.12.5 Theorem: The function $f : \mathbb{R} \to \mathbb{R}$ defined by
$$f(x) = \begin{cases} 0 & x \le 0 \\ \exp(-x^{-1}) & x > 0 \end{cases}$$
is a $C^\infty$ function on $\mathbb{R}$. (See Figure 20.12.1.)

Figure 20.12.1: $C^\infty$ functions $f(x) = \exp(-x^{-1})$ and $g_R(x) = \exp((x^2 - R^2)^{-1})$; R = 1
20.12.6 Theorem: The function $g_R : \mathbb{R}^n \to \mathbb{R}$ defined for R > 0 by
$$g_R(x) = \begin{cases} \exp\big((|x|^2 - R^2)^{-1}\big) & |x| < R \\ 0 & |x| \ge R \end{cases}$$
is a $C^\infty$ function on $\mathbb{R}^n$. (See Figure 20.12.1.)

20.12.7 Theorem: The function $f : \mathbb{R} \to \mathbb{R}$ defined by
$$f(x) = \begin{cases} 0 & x \le 0 \\ \big(1 + \exp(x^{-1} - (1 - x)^{-1})\big)^{-1} & x \in (0, 1) \\ 1 & x \ge 1 \end{cases}$$
is a $C^\infty$ function on $\mathbb{R}$. (See Figure 20.12.2.)

Figure 20.12.2: $C^\infty$ function which is constant outside [0, 1]. The curves shown are $f(x) = \big(1 + \exp(\frac{1}{x} - \frac{1}{1-x})\big)^{-1}$ and $1 - f(x) = \big(1 + \exp(\frac{1}{1-x} - \frac{1}{x})\big)^{-1}$.

20.12.8 Remark: Theorem 20.12.7 was arrived at by first finding a function tanh(x) which is $C^\infty$ and bounded between two finite values, and then finding another function $(1 - x)^{-1} - x^{-1}$ which maps the finite interval (0, 1) to the doubly infinite interval $(-\infty, \infty)$. When these two functions are composed, the result is a function which has the desired properties but whose range lies in the interval [−1, 1]. This was adjusted by noting that $(\tanh(x/2) + 1)/2 = (1 + e^{-x})^{-1}$. (A very similar function construction is described in Warner [49], Lemma 1.10, page 10.)

20.12.9 Theorem: The function $g_{r,R} : \mathbb{R}^n \to \mathbb{R}$ defined for $n \in \mathbb{Z}^+$ and $r, R \in \mathbb{R}$ with $0 \le r < R$ by
$$g_{r,R}(x) = \begin{cases} 1 & |x| \le r \\ \big(1 + \exp((R - |x|)^{-1} - (|x| - r)^{-1})\big)^{-1} & |x| \in (r, R) \\ 0 & |x| \ge R \end{cases}$$
is a $C^\infty$ function on $\mathbb{R}^n$ which is zero outside $B_{0,R}$ and equal to 1 inside $B_{0,r}$. (See Figure 20.12.3.)

Figure 20.12.3: $C^\infty$ function which is zero outside $B_{0,R}$; cross-section $x_2, \ldots, x_n = 0$; r = 1, R = 2
20.12.10 Remark: Theorem 20.12.9 is useful for constructing $C^\infty$ functions with compact support with prescribed properties within a given region. For example, if the function $g_{r,R}$ is multiplied by a general polynomial $P(x) = \sum_{\alpha \in \omega^n} c_\alpha \prod_{i=1}^n x_i^{\alpha_i}$ in n variables, then all of the derivatives of the pointwise product function $P.g_{r,R}$ are arbitrarily determined at x = 0 by the choice of coefficients $c_\alpha$.
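The cut-off function of Theorem 20.12.9 is straightforward to evaluate. The following Python sketch implements the three cases for a point of IR^n given as a tuple; the default radii r = 1 and R = 2 match Figure 20.12.3, while the function name `g` and the overflow guard are assumptions of this sketch rather than part of the theorem.

```python
import math

def g(x, r=1.0, R=2.0):
    """Evaluate the cut-off function g_{r,R} at a point x of IR^n (a sequence of floats)."""
    norm = math.sqrt(sum(c * c for c in x))
    if norm <= r:
        return 1.0
    if norm >= R:
        return 0.0
    exponent = 1.0 / (R - norm) - 1.0 / (norm - r)
    if exponent > 700.0:
        # math.exp would overflow here; the quotient is 0 to double precision.
        return 0.0
    return 1.0 / (1.0 + math.exp(exponent))

for t in (0.0, 1.0, 1.25, 1.5, 1.75, 2.0, 3.0):
    print(t, g((t, 0.0)))
```

At |x| = (r + R)/2 the two reciprocals cancel, so the value there is exactly 1/2, and the sampled values decrease from 1 to 0 across the annulus r < |x| < R.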
20.13. Trigonometric functions
Trigonometric functions are needed for the study of spheres, which provide important examples of most things in differential geometry.

20.13.1 Remark: In terms of Taylor series, the trigonometric functions sin, cos and tan are more natural than the inverse trigonometric functions. But in terms of integrals, the inverse functions are more natural. The sin, cos and tan functions cannot be constructed as integrals of simpler functions such as algebraic functions. The inverse trigonometric functions do arise naturally as integrals of simple algebraic functions. Therefore in this section, the trigonometric functions are defined in terms of their inverses.
[ Present all known properties of the trigonometric functions. For the trig functions, see CRC [155], pages A-2 to A-7. Also see CRC [99], pages 133–148. Also possibly slightly useful is Reinhardt [134], volume 1, pages 178–181. See also Gradstein/Ryzhik [112], pages 50–80, and Spiegel [142], pages 11–20. ]

20.13.2 Definition: The one-argument inverse trigonometric functions are defined as follows.
$$\forall x \in \overline{\mathbb{R}},\quad \arctan(x) = \int_0^x (1 + t^2)^{-1}\, dt$$
$$\forall x \in [-1, 1],\quad \arcsin(x) = \int_0^x (1 - t^2)^{-1/2}\, dt$$
$$\forall x \in [-1, 1],\quad \arccos(x) = \int_x^1 (1 - t^2)^{-1/2}\, dt$$
20.13.3 Remark: The inverse trigonometric functions are illustrated in Figure 20.13.1. It is clear from the definitions that these functions are one-to-one. The arctangent, arcsine and arccosine functions are often abbreviated to atan, asin and acos respectively.
Figure 20.13.1: The atan, asin and acos functions
20.13.4 Definition: $\pi = 4 \arctan(1)$.

20.13.5 Remark: The number π in Definition 20.13.4 also satisfies $\pi = 2 \arctan(\infty) = \int_{-\infty}^{\infty} (1 + t^2)^{-1}\, dt$. Then $\arccos(x) = \pi/2 - \arcsin(x)$ for all $x \in [-1, 1]$. Note also that $\pi = 2 \arcsin(1) = \arccos(-1) = \int_{-1}^{1} (1 - t^2)^{-1/2}\, dt$.
[ There are a zillion such formulas for π. ]
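Definition 20.13.4 can be turned directly into a computation by evaluating the arctangent integral numerically. The following Python sketch does this with composite Simpson quadrature and also evaluates the arcsine integral at x = 1/2, away from the end-point singularity of its integrand. The helper `simpson`, the step count and the comparison with `math.pi` are assumptions added for illustration.

```python
import math

def simpson(func, a, b, steps=2000):
    """Composite Simpson rule for the integral of func over [a, b] (steps must be even)."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * func(a + k * h)
    return total * h / 3.0

def arctan(x):
    """Integral of (1 + t^2)^(-1) from 0 to x, for finite x."""
    return simpson(lambda t: 1.0 / (1.0 + t * t), 0.0, x)

def arcsin(x):
    """Integral of (1 - t^2)^(-1/2) from 0 to x; this sketch assumes |x| < 1."""
    return simpson(lambda t: 1.0 / math.sqrt(1.0 - t * t), 0.0, x)

pi_estimate = 4.0 * arctan(1.0)
print(pi_estimate, math.pi)
print(6.0 * arcsin(0.5))      # arcsin(1/2) is pi/6
```

The arcsine integral at x = 1 itself would need a quadrature rule that copes with the integrable singularity at t = 1, which is outside the scope of this sketch.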
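20.13.6 Definition: A two-parameter arctangent function $\arctan : \mathbb{R}^2 \to (-\pi, \pi]$ may be defined in terms of the standard single-parameter version as follows.
$$\forall (x, y) \in \mathbb{R}^2,\quad \arctan(x, y) = \begin{cases} \arctan(y/x) & \text{if } x > 0 \\ \pi + \arctan(y/x) & \text{if } x < 0 \text{ and } y \ge 0 \\ -\pi + \arctan(y/x) & \text{if } x < 0 \text{ and } y < 0 \\ \pi/2 & \text{if } x = 0 \text{ and } y > 0 \\ -\pi/2 & \text{if } x = 0 \text{ and } y < 0 \\ 0 & \text{if } x = y = 0. \end{cases}$$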
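The case analysis of Definition 20.13.6 can be written out directly. The following Python sketch implements it with the argument order (x, y) of the definition and compares the result with the standard library function `math.atan2`, which takes its arguments in the opposite order (y, x). The function name `arctan2` and the sample points are assumptions of this sketch.

```python
import math

def arctan2(x, y):
    """Two-parameter arctangent with the argument order (x, y) of Definition 20.13.6."""
    if x > 0:
        return math.atan(y / x)
    if x < 0:
        return math.atan(y / x) + (math.pi if y >= 0 else -math.pi)
    if y > 0:
        return math.pi / 2
    if y < 0:
        return -math.pi / 2
    return 0.0

points = [(1.0, 1.0), (-1.0, 0.0), (-1.0, -1.0), (0.0, 2.0),
          (0.0, -3.0), (0.0, 0.0), (-2.0, 5.0), (3.0, -4.0)]
for x, y in points:
    print((x, y), arctan2(x, y), math.atan2(y, x))
```

The value at (−1, 0) is π rather than −π, in agreement with the half-open range (−π, π] stated in the definition.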
[ See Crampin/Pirani [11], page 41, for this arctangent function. ] [ Define two-parameter versions of arcsin, arccos etc. also. ]
20.13.7 Remark: The 2-parameter arctan function seems to be the best basis for deriving the other trigonometric functions. For instance,
$$\forall x \in [-1, 1],\quad \arcsin(x) = \arctan\big(\sqrt{1 - x^2},\ x\big)$$
$$\forall x \in [-1, 1],\quad \arccos(x) = \arctan\big(x,\ \sqrt{1 - x^2}\big).$$
The 2-parameter arctan function satisfies the following.
$$\forall (x, y) \in \mathbb{R}^2,\quad \arctan(x, y) = \begin{cases} \arccos\big(x (x^2 + y^2)^{-1/2}\big) & x^2 + y^2 > 0,\ y \ge 0 \\ -\arccos\big(x (x^2 + y^2)^{-1/2}\big) & x^2 + y^2 > 0,\ y < 0 \\ 0 & x = y = 0. \end{cases}$$
[ Here define sin, cos and tan in terms of the inverse functions by way of second-order ODEs or first-order systems. Note that these abbreviated notations seem to be due to Euler (1748). See EDM2 [34], 432.C. ]

20.13.8 Remark: Definition 20.13.9 expresses the sine, cosine and tangent functions in terms of the arcsin, arccos and arctan functions. The sawtooth functions used in this definition are discussed in Remark 8.6.19.
20.13.9 Definition: The functions sin, cos and tan are defined as follows.
$$\forall x \in \mathbb{R},\quad \cos(x) = \arccos^{-1}\big(|(x + \pi) \bmod 2\pi - \pi|\big)$$
$$\forall x \in \mathbb{R},\quad \sin(x) = \arcsin^{-1}\big(|(x + 3\pi/2) \bmod 2\pi - \pi| - \pi/2\big)$$
$$\forall x \in \mathbb{R} \setminus \{(k + 1/2)\pi;\ k \in \mathbb{Z}\},\quad \tan(x) = \arctan^{-1}\big((x + \pi/2) \bmod \pi - \pi/2\big).$$

20.13.10 Theorem: The sum and difference rules are:
$$\forall a, b \in \mathbb{R},\quad \sin(a + b) = \sin a \cos b + \cos a \sin b$$
$$\forall a, b \in \mathbb{R},\quad \cos(a + b) = \cos a \cos b - \sin a \sin b.$$

20.13.11 Theorem: The product rules are:
$$\forall a, b \in \mathbb{R},\quad \sin a \sin b = \tfrac{1}{2}\big(\cos(a - b) - \cos(a + b)\big)$$
$$\forall a, b \in \mathbb{R},\quad \cos a \cos b = \tfrac{1}{2}\big(\cos(a - b) + \cos(a + b)\big)$$
$$\forall a, b \in \mathbb{R},\quad \cos a \sin b = \tfrac{1}{2}\big(\sin(a + b) - \sin(a - b)\big)$$
$$\forall a, b \in \mathbb{R},\quad \sin a \cos b = \tfrac{1}{2}\big(\sin(a + b) + \sin(a - b)\big).$$

20.13.12 Theorem: The double-angle rules are:
$$\forall \theta \in \mathbb{R},\quad \cos 2\theta = \cos^2\theta - \sin^2\theta = 2\cos^2\theta - 1 = 1 - 2\sin^2\theta$$
$$\forall \theta \in \mathbb{R},\quad \sin 2\theta = 2 \sin\theta \cos\theta.$$

20.13.13 Theorem: The half-angle rules are:
$$\forall \theta \in \mathbb{R},\quad \sin^2 \tfrac{1}{2}\theta = \tfrac{1}{2}(1 - \cos\theta)$$
$$\forall \theta \in \mathbb{R},\quad \cos^2 \tfrac{1}{2}\theta = \tfrac{1}{2}(1 + \cos\theta).$$
and so forth. . .
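The sawtooth arguments in Definition 20.13.9 can be spot-checked numerically. The following Python sketch evaluates the three sawtooth functions and confirms that applying the library functions cos, sin and tan to them reproduces cos(x), sin(x) and tan(x) at a few sample points. It assumes that the cosine sawtooth is |(x + π) mod 2π − π|, which is the function arccos(cos x) with values in [0, π]; the function names and sample points are likewise assumptions of this sketch.

```python
import math

def arccos_of_cos(x):
    """Sawtooth with period 2*pi and values in [0, pi] (assumed form, see lead-in)."""
    return abs((x + math.pi) % (2 * math.pi) - math.pi)

def arcsin_of_sin(x):
    """Sawtooth with period 2*pi and values in [-pi/2, pi/2]."""
    return abs((x + 1.5 * math.pi) % (2 * math.pi) - math.pi) - math.pi / 2

def arctan_of_tan(x):
    """Sawtooth with period pi and values in [-pi/2, pi/2)."""
    return (x + math.pi / 2) % math.pi - math.pi / 2

samples = [-7.3, -2.0, -0.4, 0.0, 0.9, 2.5, 4.0, 11.1]
max_err = max(
    max(abs(math.cos(arccos_of_cos(x)) - math.cos(x)),
        abs(math.sin(arcsin_of_sin(x)) - math.sin(x)),
        abs(math.tan(arctan_of_tan(x)) - math.tan(x)))
    for x in samples)
print(max_err)
```

The check uses the library trigonometric functions in place of the inverses of the integral-defined functions, so it tests only the sawtooth reduction to the principal ranges and not the integral definitions themselves.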
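20.13.14 Theorem: Some rules relating the trigonometric functions to each other are:
$$\forall \theta \in [0, \pi/2),\quad \sin\theta = (1 - \cos^2\theta)^{1/2} = \tan\theta\, (1 + \tan^2\theta)^{-1/2} = (\operatorname{cosec}\theta)^{-1} = (\sec\theta)^{-1} (\sec^2\theta - 1)^{1/2} = (1 + \cot^2\theta)^{-1/2}$$
$$\forall \theta \in \mathbb{R},\quad \cos\theta = \begin{cases} (1 + \tan^2\theta)^{-1/2} & \text{if } \theta \bmod 2\pi \in [0, \pi/2) \cup (3\pi/2, 2\pi) \\ -(1 + \tan^2\theta)^{-1/2} & \text{if } \theta \bmod 2\pi \in (\pi/2, 3\pi/2) \\ 0 & \text{if } \theta \bmod 2\pi \in \{\pi/2, 3\pi/2\}. \end{cases}$$

20.13.15 Theorem: Some useful combinations are:
$$\forall \theta \in (-\pi/2, \pi/2),\quad \sin\theta \cos\theta = \tan\theta\, (1 + \tan^2\theta)^{-1}$$
and so forth. . .

20.13.16 Theorem: The relations between trigonometric functions can also be expressed as follows:
$$\forall x \in \mathbb{R},\quad \sin(\arctan x) = x (1 + x^2)^{-1/2}$$
$$\forall x \in \mathbb{R},\quad \cos(\arctan x) = (1 + x^2)^{-1/2}$$
and so forth. . .

20.13.17 Theorem: Some useful relations between the inverse trigonometric functions are:
$$\forall x \in [0, 1],\quad \arcsin(x) = \tfrac{1}{2} \arccos(1 - 2x^2)$$
$$\forall x \in [0, 1],\quad \arcsin(x^{1/2}) = \tfrac{1}{2} \arccos(1 - 2x)$$
and so forth. . .

20.13.18 Theorem: The angle translation rules are:
$$\forall \theta \in \mathbb{R},\quad \cos\theta = \sin(\theta + \pi/2)$$
and so forth. . .

20.13.19 Theorem: The derivatives of the trigonometric functions are:
$$\forall \theta \in \mathbb{R} \setminus (\pi/2 + \pi\mathbb{Z}),\quad \frac{d}{d\theta} \tan\theta = \sec^2\theta$$
and so forth. . .

20.13.20 Theorem: The derivatives of the inverse trigonometric functions are:
$$\forall x \in (-1, 1),\quad \frac{d}{dx} \arccos x = -(1 - x^2)^{-1/2}$$
$$\forall x \in \mathbb{R} \setminus \{0\},\ \forall y \in \mathbb{R},\quad \frac{\partial}{\partial x} \arctan(x, y) = -y/(x^2 + y^2)$$
$$\forall x \in \mathbb{R} \setminus \{0\},\ \forall y \in \mathbb{R},\quad \frac{\partial}{\partial y} \arctan(x, y) = x/(x^2 + y^2)$$
and so forth. . .
[ Must check the quantifiers in Theorem 20.13.20. ]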
∀x ∈ IR
20.13.21 Theorem: The equation $a\cos\theta + b\sin\theta = c$ for $\theta \in \mathbb{R}$, $(a, b) \ne (0, 0)$ and $c^2 \le a^2 + b^2$ has the solutions $\theta = \arctan(a, b) \pm \arccos(c(a^2 + b^2)^{-1/2}) + 2n\pi$ for $n \in \mathbb{Z}$.

Proof: This follows from the formula $a\cos\theta + b\sin\theta = \sqrt{a^2 + b^2}\,\cos(\theta - \arctan(a, b))$.
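The solution formula of Theorem 20.13.21 can be tried out directly. The Python sketch below is an illustration under the assumption that arctan(a, b) denotes the angle of the point (a, b), so that it is computed as `math.atan2(b, a)`; the function name `solve_trig_equation` and the sample coefficients are hypothetical.

```python
import math

def solve_trig_equation(a, b, c, n=0):
    """Return the two solutions of a*cos(t) + b*sin(t) = c for a given integer n."""
    r = math.hypot(a, b)
    if r == 0.0:
        raise ValueError("(a, b) must not be (0, 0)")
    if abs(c) > r:
        raise ValueError("no real solution: c^2 > a^2 + b^2")
    # Assumption: arctan(a, b) in the text is the angle of the point (a, b).
    phase = math.atan2(b, a)
    # Clamp to [-1, 1] to guard against rounding when c^2 == a^2 + b^2.
    offset = math.acos(max(-1.0, min(1.0, c / r)))
    return (phase + offset + 2 * n * math.pi,
            phase - offset + 2 * n * math.pi)

a, b, c = 3.0, -4.0, 2.5
residuals = [abs(a * math.cos(t) + b * math.sin(t) - c)
             for n in (-1, 0, 2)
             for t in solve_trig_equation(a, b, c, n)]
print(max(residuals))
```

Each returned angle is substituted back into the left-hand side, so the printed maximum residual indicates how well the formula reproduces c at these sample values.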
20.13.22 Remark: It may seem a little ludicrous to define the inverse trigonometric functions first and then patch them together to create the familiar trigonometric functions sin, cos and tan. However, Bell [190], page 324, said the following in 1937.

    Now, in the integral calculus, the inverse trigonometric functions present themselves naturally as definite integrals of simple algebraic irrationalities (second degree); such integrals appear when we seek to find the length of an arc of a circle by means of the integral calculus. Suppose the inverse trigonometric functions had first presented themselves this way. Would it not have been “more natural” to consider the inverses of these functions, that is, the familiar trigonometric functions themselves as the given functions to be studied and analyzed? Undoubtedly; but in shoals of more advanced problems, the simplest of which is that of finding the length of the arc of an ellipse by the integral calculus, the awkward inverse “elliptic” (not “circular,” as for the arc of a circle) functions presented themselves first. It took Abel to see that these functions should be “inverted” and studied, precisely as in the case of $\sin x$, $\cos x$ instead of $\sin^{-1} x$, $\cos^{-1} x$. Simple, was it not? Yet Legendre, a great mathematician, spent more than forty years over his “elliptic integrals” (the awkward “inverse functions” of his problem) without ever once suspecting that he should invert. This extremely simple, uncommonsensical way of looking at an apparently simple but profoundly recondite problem was one of the greatest mathematical advances of the nineteenth century.

Well, maybe it wasn’t so great an advance as that. But the importance of transforming problems to facilitate their solution is clear.
Chapter 21 Differential equations
21.1 Ordinary differential equations
21.2 Systems of linear second-order ODEs
21.3 Boundary value problems
21.4 Initial value problems
21.5 Calculus of variations
21.6 ODEs for defining exponential and trigonometric functions
21.7 Taylor series and exponentials of matrices
21.0.1 Remark: The subject of differential equations, both ordinary and partial, may be regarded as an extension of the subject of measure and integration.

The fundamental theorem of calculus states that the simple differential equation $\forall t \in \mathbb{R}$, $F'(t) = f(t)$ has solutions of the form $\forall t \in \mathbb{R}$, $F(t) = \int^t f(s)\,ds$. The integral on the real number line may thus be regarded as a method of solution of a simple class of differential equations. In other words, the FTOC states that the integral and the anti-derivative are the same thing.

The subject of differential equations may be regarded as a generalization of the simple differential equation $F'(t) = f(t)$. The solutions of the more general differential equations may be regarded as integrals in some sense. In other words, solutions of general differential equations may be regarded as generalized “antiderivatives”. Therefore it is not surprising that integration is an important tool in the solution of differential equations. Consequently, the treatment of differential equations comes logically after the presentation of measure and integration.

21.0.2 Remark: DE is an abbreviation for “differential equation(s)”. ODE is an abbreviation for “ordinary differential equation(s)”. PDE is an abbreviation for “partial differential equation(s)”. PDO is an abbreviation for “partial differential operator(s)”. BVP is an abbreviation for “boundary value problem(s)”. IVP is an abbreviation for “initial value problem(s)”. It is customary to append an “s” to the above abbreviations to indicate the plural because this “s” is often present in the spoken language. However, the suffix “s” is optional. Thus “partial differential equations” may be abbreviated to either PDE or PDEs. The abbreviation PDO is not very common.

21.0.3 Remark: Ordinary differential equations are expressed in terms of a single independent variable. Partial differential equations are expressed in terms of any number of independent variables. Therefore ordinary differential equations are a special case of partial differential equations. When the number of dependent variables is more than one, the equations are referred to as a “system” of equations. This basic classification of differential equations is illustrated in Figure 21.0.1.

[Figure 21.0.1: Basic classification of differential equations. Boxes: ordinary differential equation; partial differential equation; system of ordinary differential equations; system of partial differential equations.]

21.0.4 Remark: The vast majority of laws and models of physics are expressed as differential equations. So most of the mathematical work in physics consists of solving differential equations. For example, Einstein’s gravity equations are a system of partial differential equations. Newton’s gravity law applied to a single object falling vertically is a single ordinary differential equation. Bell [190], pages 103–104, says the following.

    The great majority of the important equations of mathematical physics are partial differential equations.

21.0.5 Remark: The subject of differential equations is truly vast. This is not surprising because of the very broad applicability to all of the sciences, technology and engineering. But there is a second reason for the vastness of the mathematical subject of differential equations. Although most laws and models in applications are expressed as DEs, most of the work lies in the solution of these equations. Solving DEs turns out to be a much deeper subject than the solution of algebraic equations. For example, it is very rare for PDEs (in more than one variable) to have explicit solutions. Generally the best that can be achieved is the development of approximation techniques together with an analysis of error bounds. Since typically one can never “see” the solutions, it is necessary to develop tools for the analysis of properties of PDE solutions in terms of the specifications of models without seeing the solutions themselves. Thus, for example, there is a very large literature concerned with merely demonstrating the existence and regularity of solutions of PDEs.

Most of the vastness of the PDE subject is due to the difficulty of solving the equations. The mere formulation of PDE models is not so difficult. Therefore luckily this book does not need to provide a complete summary of PDE solution techniques as a prerequisite for differential geometry. Such a summary of PDE solution techniques would be on an encyclopedic scale. Nevertheless some basic PDE analysis techniques such as maximum principles are presented here. (Recall that this is a definitions book, not a theorems and techniques book. So this book is concerned only with the formulation of models, not the methods of solution. The author’s project to include all DG prerequisites is only feasible because this is merely a definitions book.)

21.0.6 Remark: People who study differential equations only so that they can solve them in applications often find incomprehensible the huge research effort devoted to proving merely the existence of solutions. Bell [190], page 528, makes the following comment on this issue.

    Of what immediate use is it to a working physicist to know that a particular differential equation occurring in his work is solvable, because some pure mathematician has proved that it is, when neither he nor the mathematician can perform the Herculean labor demanded by a numerical solution capable of application to specific problems?

However, the methods of existence proof often strongly suggest methods of solution. They also give a-priori bounds which are useful for checking that numerical approximations are credible. Knowing the space in which a solution exists is important for determining the representation framework to be used for numerical approximations. And when serious difficulties arise in finding solutions, it is reassuring to know that the reason for the difficulties is not the non-existence or non-uniqueness of solutions.

21.0.7 Remark: Differential geometry generalizes the flat Euclidean space (for the independent variables) of the classical PDE literature of the 19th century to various classes of curved spaces. Thus PDEs in a curved space constitute an extension of the already difficult subject of flat-space PDEs. This, however, is the framework for models in general relativity and numerous other areas of physics. This helps to explain why DG is such a difficult subject.
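The observation in Remark 21.0.1 that integration solves the simple differential equation F′(t) = f(t) can be illustrated numerically. The following Python sketch is a minimal example under stated assumptions: the integrand f(s) = 1/(1 + s²), the lower limit 0, Simpson's rule and the step sizes are all arbitrary illustrative choices, and the helper names are hypothetical.

```python
import math

def integrate(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def f(s):
    # Illustrative integrand; its antiderivative vanishing at 0 is atan.
    return 1.0 / (1.0 + s * s)

def F(t):
    # Candidate solution of F'(t) = f(t), defined as an integral from 0 to t.
    return integrate(f, 0.0, t)

def derivative(g, t, h=1e-4):
    # Symmetric finite-difference approximation of g'(t).
    return (g(t + h) - g(t - h)) / (2 * h)

residuals = [abs(derivative(F, t) - f(t)) for t in (-2.0, -0.5, 0.0, 1.0, 3.0)]
print(max(residuals), abs(F(1.0) - math.atan(1.0)))
```

The first printed number measures how closely the numerically defined integral satisfies the differential equation at the sample points; the second compares it with the known antiderivative.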
21.1. Ordinary differential equations

[ This section deals with existence and uniqueness for ODEs. This is required for constructing geodesic curves from connections. Curves are maps γ : I → X from a totally ordered set I to a set X. A path is defined as an equivalence class of curves. ]

21.1.1 Remark: One of the most important tools in ordinary and partial differential equations is the “maximum principle”. The purpose of maximum principles is to determine bounds on functions which satisfy ODEs and PDEs. Such bounds are very useful for proving existence, uniqueness and regularity, which are the three primary tasks to be carried out for any class of ODEs or PDEs.

21.1.2 Remark: Many questions in PDE can be reduced to ODE questions. So it is important to understand ODE before studying PDE. A particular example of this general idea is the topic of maximum principles for elliptic second-order differential operators. This topic is fairly simple to deal with in a single independent variable. A maximum principle for an ODE might say, for example, that if a real-valued function $u \in C^0(\bar I) \cap C^2(I)$ for a bounded open interval $I \subseteq \mathbb{R}$ satisfies $a(x)u''(x) + b(x)u'(x) = 0$ for $x \in I$, then $u$ has no interior minimum or maximum in $I$, under suitable conditions on $a$ and $b$. Typically it will be assumed that $a(x) > 0$ for all $x \in I$. So the equation can be normalized so that $a(x) = 1$ for all $x \in I$. An interesting question is then to ask what conditions on $b$ will prevent an interior minimum or maximum. As an example, let $u(x) = \exp(-x^{-2}/2)$ for $x \ne 0$ and $u(0) = 0$. Then $u \in C^\infty(\mathbb{R})$ and

$$u''(x) + b(x)u'(x) = \begin{cases} x^{-6}\bigl(1 - 3x^2 + b(x)x^3\bigr)u(x) & \text{for } x \ne 0 \\ 0 & \text{for } x = 0. \end{cases}$$

This equals zero for all $x \in \mathbb{R}$ if

$$b(x) = \begin{cases} (3x^2 - 1)/x^3 & \text{for } x \ne 0 \\ 0 & \text{for } x = 0. \end{cases}$$

Hence $u''(x) + b(x)u'(x) = 0$ for all $x \in \mathbb{R}$ for this choice of $b$. (See Figure 21.1.1.)

[Figure 21.1.1: $C^\infty$ counterexample for naive maximum principle. Graphs of $u(x) = \exp(-x^{-2}/2)$ and $b(x) = -u''(x)/u'(x) = (3x^2 - 1)/x^3$ against $x$, where $u''(x) + b(x)u'(x) = 0$.]

But the function $u$ has a local minimum at $x = 0$. Similarly, $-u$ satisfies the same equation and has a local maximum at $x = 0$. So clearly there is no maximum principle in this case. This is perhaps a little disturbing because the operator $L = \partial_x^2 + b(x)\partial_x$ is clearly uniformly elliptic. The missing ingredient in this maximum principle is a bound on the first-order coefficient $b$. If $b$ is bounded, Theorem 21.1.3 is obtained.

21.1.3 Theorem: Let $I$ be a non-empty bounded open real-number interval. Let $u \in C^0(\bar I) \cap C^2(I)$ be a real-valued function on $I$ which satisfies $Lu(x) = u''(x) + b(x)u'(x) = 0$ in $I$, where $b : I \to \mathbb{R}$ is a bounded function on $I$. Then $\sup_I(u) = \sup_{\partial I}(u)$ and $\inf_I(u) = \inf_{\partial I}(u)$.
Proof: Let $k_b = \sup_I(|b|)$. Define $v \in C^0(\bar I) \cap C^2(I)$ by $v(x) = \exp((k_b + 1)x)$ for $x \in \bar I$. Then $Lv(x) = (k_b + 1)(k_b + 1 + b(x))v(x) \ge (k_b + 1)v(x) > 0$ for all $x \in I$. Let $w = u + \alpha v$ for $\alpha \in \mathbb{R}$. Then for $\alpha > 0$, $Lw(x) = Lu(x) + \alpha Lv(x) > 0$ for all $x \in I$. Suppose $x \in I$ is a local maximum of $w$. Then $w'(x) = 0$ and $w''(x) \le 0$. So $Lw(x) = w''(x) \le 0$. This is a contradiction. So $w$ has no local maximum in $I$. Therefore $\sup_I(w) = \sup_{\partial I}(w)$. But $u$ is the uniform limit of $w$ as $\alpha \to 0^+$. So $\sup_I(u) = \sup_{\partial I}(u)$. Negating $u$ gives $\inf_I(u) = \inf_{\partial I}(u)$.

21.1.4 Remark: A merely $C^2$ function $u$ which is also a counterexample to a naive maximum principle (with unbounded coefficient of the first-order derivative) is the function $u(x) = |x|^k$ for $x \in \mathbb{R}$ and $k > 2$. In this case, the equation $u''(x) + b(x)u'(x) = 0$ is satisfied for $b(x) = (1 - k)/x$ for $x \ne 0$ and $b(0) = 0$. This is illustrated in Figure 21.1.2.

[Figure 21.1.2: $C^2$ counterexample for naive maximum principle. Graphs of $u(x) = |x|^k$ with $k = 2.1$ and $b(x) = -u''(x)/u'(x) = (1 - k)/x$ against $x$, where $u''(x) + b(x)u'(x) = 0$.]
It seems that functions like $b(x) = |x|^{-1}$ are close to the boundary of what makes a maximum principle of this sort work. But this function is also at the boundary of the set of $L^1_{\mathrm{loc}}(\mathbb{R})$ functions. In fact, any solution $u$ of equation $u''(x) + b(x)u'(x) = 0$ must satisfy $|u'(x)| = \exp\bigl(-\int^x b(x)\,dx\bigr)$, which is never equal to zero if $b \in L^1_{\mathrm{loc}}(\mathbb{R})$. Therefore no interior maximum or minimum is possible. This suggests the possibility of a strengthened version of Theorem 21.1.3. Such a maximum principle is Theorem 21.1.5.

21.1.5 Theorem: Let $I$ be a non-empty bounded open real-number interval. Let $u \in C^0(\bar I) \cap W^{2,1}(I)$ be a real-valued function on $I$ which satisfies $Lu(x) = u''(x) + b(x)u'(x) \ge 0$ for almost all $x \in I$, where $b \in L^1_{\mathrm{loc}}(\mathbb{R})$. Then $\sup_I(u) = \sup_{\partial I}(u)$. Similarly, if $Lu(x) \le 0$ for almost all $x \in I$, then $\inf_I(u) = \inf_{\partial I}(u)$.

Proof: Define $v \in C^0(\bar I) \cap W^{2,1}(I)$ by $v(x) = \int^x \exp\bigl(\int^x (1 - b(x))\,dx\bigr)\,dx$ for $x \in \bar I$. Then $Lv(x) = v''(x) + b(x)v'(x) = v'(x) > 0$ for all $x \in I$. Let $w = u + \alpha v$ for $\alpha \in \mathbb{R}$. Then for $\alpha > 0$, $Lw(x) = Lu(x) + \alpha Lv(x) > 0$ for all $x \in I$. Suppose $x \in I$ is a local maximum of $w$. Then $w'(x) = 0$ and $w''(x) \le 0$. So $Lw(x) \le 0$. This is a contradiction. So $w$ has no local maximum in $I$. Therefore $\sup_I(w) = \sup_{\partial I}(w)$. But $u$ is the uniform limit of $w$ as $\alpha \to 0^+$. So $\sup_I(u) = \sup_{\partial I}(u)$.

21.1.6 Remark: Many analysis texts do not provide examples to demonstrate the necessity of some of the odd-looking assumptions which are placed on theorems. It is important to provide such examples so that the reader can more readily accept some of the more technical assumptions, but also to establish the “sharpness” of results. A “sharp theorem” is a theorem whose assumptions cannot be significantly weakened and whose assertions cannot be significantly strengthened without significantly increasing the theorem’s complexity.
As a trivial example, the bound $\cos(x) \ge 1 - x^2$ for all $x \in \mathbb{R}$ is not sharp because the bound can easily be improved to $\cos(x) \ge 1 - x^2/2$, which is a sharp bound because the coefficient of $x^2$ cannot be improved.
If a theorem is not sharp, the values and even the form of bounds and conditions in the theorem will quite likely be artefacts of the method of proof. By establishing sharpness, it is made clear that the bounds and conditions are attributes of the system under study rather than the technicalities of the proof method. In a sense, a sharp bound “hugs” the envelope of possibilities of the things which are bounded. No better bound can be interposed between the bound and the things which are bounded. A further practical advantage of sharp bounds is that it will not be possible for an adversary to publish a better bound and take a share of the glory. To establish sharpness of a theorem, it is necessary to conjecture that each assumption and assertion can be individually improved by varying the parameters. Each such conjecture must be disproved. This can often be achieved by the use of counterexamples, as has been done in Remarks 21.1.2 and 21.1.4.

21.1.7 Remark: It is clear that for second-order linear equations with a single independent variable (i.e. for ordinary differential equations), explicit calculations can quickly yield maximum principles without resort to complicated constructions. The situation is not quite so simple for multiple independent variables (i.e. partial differential equations). It is, however, useful to study the single-variable case first to obtain some intuition so that the higher-dimensional cases are easier to understand. (For further information on maximum principles, see Miranda [126], section 3 and Gilbarg/Trudinger [109], chapter 3.)
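The two counterexamples of Remarks 21.1.2 and 21.1.4 can be checked numerically. The Python sketch below is only a spot check at a few sample points, using finite differences with an arbitrarily chosen step size; the function names and the choice k = 2.1 (taken from Figure 21.1.2) are illustrative assumptions. It confirms that each u approximately satisfies u″ + b u′ = 0 away from x = 0, that u has an interior minimum on the interval (−1, 1), and that the coefficient b is large near x = 0.

```python
import math

K = 2.1  # exponent used for the C^2 counterexample (illustrative choice)

def u_smooth(x):
    return math.exp(-0.5 / (x * x)) if x != 0 else 0.0

def b_smooth(x):
    return (3 * x * x - 1) / x ** 3 if x != 0 else 0.0

def u_power(x):
    return abs(x) ** K

def b_power(x):
    return (1 - K) / x if x != 0 else 0.0

def residual(u, b, x, h=1e-4):
    # Finite-difference approximation of u''(x) + b(x) u'(x).
    d1 = (u(x + h) - u(x - h)) / (2 * h)
    d2 = (u(x + h) - 2 * u(x) + u(x - h)) / (h * h)
    return d2 + b(x) * d1

points = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
max_residual = max(abs(residual(u, b, x))
                   for u, b in [(u_smooth, b_smooth), (u_power, b_power)]
                   for x in points)

# Interior minimum at x = 0 on I = (-1, 1): the value at 0 is strictly below
# both boundary values, so inf_I(u) = inf_{boundary}(u) fails for these u.
interior_min = all(u(0.0) < min(u(-1.0), u(1.0)) for u in (u_smooth, u_power))

# The first-order coefficient is unbounded near 0 in both examples.
b_near_zero = (abs(b_smooth(1e-2)), abs(b_power(1e-6)))
print(max_residual, interior_min, b_near_zero)
```

The large values of |b| near zero are consistent with the role of the boundedness assumption in Theorem 21.1.3.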
21.2. Systems of linear second-order ODEs

This section is relevant to Jacobi fields. In geodesic coordinates, Jacobi fields are solutions of a system of linear ordinary differential equations of second order with respect to the affine parameter along the geodesic. Of particular interest are estimates of the solutions of boundary value problems.
21.3. Boundary value problems

A boundary value problem, in the subject of differential equations, means a problem where the values of solutions are specified on the boundary of a region and a differential equation is required to be satisfied by solutions on the interior of the region. The most interesting boundary value problems are those for which the solutions exist and are unique in some sense. Then the solution may be regarded as a function of the boundary and interior conditions.

In physical models, a BVP typically describes a situation where the values of a field are known on the boundary of a region and one wishes to know what is happening inside the region. The boundary values may be known either because they are passively measured or because they are actively controlled in some way. The solutions of a BVP are typically static, particularly if uniqueness is guaranteed. Generally there is no time parameter to give the solutions a dynamic character.

Second-order PDEs for which BVP existence and uniqueness are guaranteed in bounded domains are typically elliptic. When the value of a BVP solution is specified on the boundary, this is called a Dirichlet problem. When the gradient of a BVP solution is specified on the boundary, this is called a Neumann problem.

21.3.1 Remark: Bell [190], page 105, makes the following comment about the ubiquity of boundary value problems in physics.

    In a sense mathematical physics is co-extensive with the theory of boundary-value problems.

21.4. Initial value problems

An initial value problem, in the subject of differential equations, means a problem where the values of solutions are specified at some initial time and on the (possibly empty) boundary of a region, and a differential equation is required to be satisfied by solutions on the interior of the region at all times after the initial time. In some initial value problems, the region under consideration has no boundary because the region occupies the entire space. The most interesting IVPs are those for which the solutions exist and are unique in some sense. Then the solution may be regarded as a function of the initial, boundary and interior conditions.

In physical models, an IVP typically describes a situation where the values of a field are known at some initial time and on the (possibly empty) boundary of a region, and one wishes to know what is happening inside the region after the initial time. The initial and boundary values may be known either because they are passively measured or because they are actively controlled in some way.

Second-order PDEs for which IVP existence and uniqueness are guaranteed in bounded domains are typically parabolic or hyperbolic.
21.5. Calculus of variations

This section deals with existence and uniqueness for calculus of variations. This is required for constructing geodesic curves from metrics.
21.6. ODEs for defining exponential and trigonometric functions

The integral expressions for logarithmic and exponential functions in Section 20.12 have the advantage of avoiding the need to deal with the existence and uniqueness questions for the differential expressions. However, the motivation for the differential expressions is stronger.

21.6.1 Remark: Consider Definition 20.12.1 for the logarithm function: $\ln(x) = \int_1^x t^{-1}\,dt$. The exponential function is defined as the inverse of the logarithm. This leads to Theorem 20.12.4, which states that $\forall x \in \mathbb{R}$, $(d/dx)\exp(x) = \exp(x)$. Since $\exp(0) = 1$, this implies that the exponential function satisfies the following ODE boundary value problem for a function $f : \mathbb{R} \to \mathbb{R}$.

$$\forall x \in \mathbb{R},\quad f'(x) = f(x)$$
$$f(0) = 1.$$

The equation $f(0) = 1$ is the boundary condition of the BVP. It turns out that the exponential function $\exp : \mathbb{R} \to \mathbb{R}$ is the unique function which satisfies this BVP.

21.6.2 Remark: The trigonometric functions sin and cos in Section 20.13 arise naturally from second-order ODEs with constant coefficients. The sine function is the unique solution of the following BVP.

$$\forall x \in \mathbb{R},\quad f''(x) = -f(x)$$
$$f(0) = 0$$
$$f'(0) = 1.$$

If one ignores the effort required to prove existence and uniqueness, this seems like a much simpler characterization than Definition 20.13.9. When the sine function is characterized by a BVP, the solution is most readily obtained from a Taylor series expansion. This yields a non-algebraic function. (See EDM2 [34], article 11 for algebraic functions and article 10 for algebraic equations.) Although power series can be manipulated to obtain a vast array of properties, it is necessary to always be on guard with respect to convergence issues. The integral expression $\forall x \in [-1, 1]$, $\arcsin(x) = \int_0^x (1 - t^2)^{-1/2}\,dt$ for the inverse arcsin of the sine function provides a much safer way to derive properties.

21.6.3 Remark: The cosine function is the unique solution of the following BVP.

$$\forall x \in \mathbb{R},\quad g''(x) = -g(x)$$
$$g(0) = 1$$
$$g'(0) = 0.$$

From this BVP, it is possible to determine that the cosine is the derivative everywhere of the sine function. The development of properties and relations of trigonometric functions usually starts with basic relations such as $(d/dx)\sin(x) = \cos(x)$ and $(d/dx)\cos(x) = -\sin(x)$, and the initial values $\sin(0) = 0$ and $\cos(0) = 1$. These two equations and initial values actually constitute a system of two coupled ODEs of first order, as follows.

$$\forall x \in \mathbb{R},\quad f'(x) = g(x)$$
$$\forall x \in \mathbb{R},\quad g'(x) = -f(x)$$
$$f(0) = 0$$
$$g(0) = 1.$$

Such a system is a more natural starting point for development of properties than the second-order ODE, which itself follows trivially from the coupled first-order ODEs. It is readily shown that the inverses of the arcsin and arccos functions given by the integral expressions in Definition 20.13.2 satisfy the above system of equations for f and g respectively.
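The characterizations in Section 21.6 of exp by f′ = f, f(0) = 1, and of sin and cos by the coupled system f′ = g, g′ = −f, f(0) = 0, g(0) = 1, can be illustrated by integrating the equations numerically. The Python sketch below uses the classical fourth-order Runge–Kutta method, which is a choice made here for illustration and is not a method taken from the text; the function name `rk4`, the step count and the end points are assumptions.

```python
import math

def rk4(rhs, y0, x_end, steps=1000):
    """Integrate y' = rhs(x, y) from x = 0 to x = x_end with classical RK4."""
    h = x_end / steps
    x, y = 0.0, list(y0)
    for _ in range(steps):
        k1 = rhs(x, y)
        k2 = rhs(x + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = rhs(x + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = rhs(x + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        x += h
    return y

# f' = f, f(0) = 1: the value at x = 1 approximates e.
(exp_1,) = rk4(lambda x, y: [y[0]], [1.0], 1.0)

# f' = g, g' = -f, f(0) = 0, g(0) = 1: the values at x = 2 approximate
# sin(2) and cos(2).
sin_2, cos_2 = rk4(lambda x, y: [y[1], -y[0]], [0.0, 1.0], 2.0)

print(exp_1 - math.e, sin_2 - math.sin(2.0), cos_2 - math.cos(2.0))
```

The printed differences compare the numerical solutions with the library functions `math.exp`, `math.sin` and `math.cos`, which serve here only as reference values.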
21.7. Taylor series and exponentials of matrices
[ Probably should express all the special functions in previous sections as Taylor series here. ]

This section deals with Taylor series for analytic functions, and exponentials of matrices. These matrices turn out to be solutions of systems of ODEs with constant coefficients, which are useful for studying Lie groups.

[ Exponentials of matrices are covered in Warner [49], pages 102–107. ]

[ Somewhere in this chapter should be the basic facts of Fourier series and transforms. It would be ridiculous to try to include all analysis that could possibly be useful in this book. But Fourier series and transforms are fairly fundamental. It would be nice to include a proof of the completeness of Fourier series and transforms. Perhaps one way to do this would be to show that all simple square functions $H_{a,b} : \mathbb{R} \to \mathbb{R}$ with $H_{a,b}(x) = 1$ for $x \in (a, b)$ are almost everywhere approximated, and then argue that if $\int H_{a,b} f\,dx = 0$ for all $a, b \in \mathbb{R}$, then such a continuous function $f$ must equal zero. ]
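As an illustration of how a matrix exponential solves a constant-coefficient system of ODEs, the following Python sketch sums a truncated Taylor series for exp(tJ), where J is the coefficient matrix of the coupled system f′ = g, g′ = −f. The truncation at 30 terms, the value of t and the helper names `mat_mul` and `mat_exp` are assumptions made for this example; a plain truncated series is adequate only for matrices of small norm.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, terms=30):
    """Truncated Taylor series I + A + A^2/2! + ... with the given number of terms."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # current term A^k / k!
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, A)]
        result = [[r + s for r, s in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    return result

# Coefficient matrix of the system (f, g)' = J (f, g), i.e. f' = g, g' = -f.
J = [[0.0, 1.0], [-1.0, 0.0]]
t = 0.75
E = mat_exp([[t * v for v in row] for row in J])

# Applying exp(tJ) to the initial vector (f(0), g(0)) = (0, 1).
f_t = E[0][0] * 0.0 + E[0][1] * 1.0
g_t = E[1][0] * 0.0 + E[1][1] * 1.0
print(E, f_t - math.sin(t), g_t - math.cos(t))
```

Since J² = −I, the series gives exp(tJ) = cos(t) I + sin(t) J, so the computed pair (f_t, g_t) can be compared with (sin t, cos t).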
Chapter 22 Non-topological fibre bundles
22.1 Non-topological fibrations
22.2 Parallelism for non-topological fibrations
22.3 Non-topological fibre bundles
22.4 Finite transformation groups as fibre bundles
This chapter may be regarded as a recreational prelude to Chapters 23 and 24 which deal with the serious business of topological fibre bundles. The non-topological fibre bundle definitions in this chapter are (probably) non-standard, but they are a plausible reconstruction (or deconstruction) of minimal fibre bundles from standard fibre bundles. The purpose of this chapter is to investigate the extent to which topological and differentiable fibre bundle definitions can be meaningful in non-topological fibre bundles, especially in the case of discrete or finite spaces. (If the reader would like to skip this chapter, the author will not object at all. More productive uses of the reader’s time would include sleeping, eating, and staring at the floor.)
The core topic of differential geometry is curvature, and curvature is defined in terms of parallelism. Fibre bundles are the natural structure on which parallelism is defined because if one asks what kind of structure the most general notion of parallelism could apply to, it must be a set of entities attached at different points of a base set. Parallelism specifies associations between objects which are attached to different points of a base set. A fibre bundle specifies how these objects are attached to the base set.

This chapter deals with non-topological fibre bundles and parallelism. Topological fibre bundles are defined in Chapters 23–24. Differentiable fibre bundles are defined in Chapter 35. Differential parallelism (“connections”) is presented in Chapters 36–38.

22.1. Non-topological fibrations

The non-uniform non-topological fibrations in Definition 22.1.1 may be the most general form of fibration on which any sort of parallelism may be defined. This minimalist definition is based on Theorem 6.6.9, which shows that any set is partitioned by a function on that set. A slightly more useful structure would be a fibre bundle which requires all sets in the partition to be equipotent, so that there would be bijections between each “fibre”. These uniform non-topological fibrations are introduced in Definition 22.1.3. There is an implicit structure group for this fibre bundle definition: the symmetric group of all permutations of the fibre set.

Parallelism may be defined on such fibre bundles. If all fibres have the same cardinality, absolute (path-independent) parallelism may be defined as bijections between the fibres. In the case of non-uniform cardinality of fibres, parallelism can be defined by more general relations than bijections. Path-dependent parallelism may be defined in terms of fibre bijections which are a symmetric, transitive function of a specified set of permitted paths. The set of paths should be closed under concatenation and reversal. If the base space had a topology, continuous paths could be used. For non-topological fibre bundles, the path space must be chosen in other ways.

22.1.1 Definition: A non-uniform non-topological fibration is a tuple $(E, \pi, B)$ such that $E$ and $B$ are sets and $\pi$ is a function $\pi : E \to B$.
The set $E$ is called the total space of $(E, \pi, B)$.
The set $B$ is called the base space of $(E, \pi, B)$.
The function $\pi$ is called the projection map of $(E, \pi, B)$.
The set $E_b = \pi^{-1}(\{b\})$ for each $b \in B$ is called the fibre (set) at $b$.
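For finite sets, Definition 22.1.1 can be made concrete in a few lines of Python. The sketch below is illustrative only: the sets E and B, the projection `pi` and the helper name `fibre_sets` are hypothetical example data, not taken from the text. It computes the fibre sets, checks that the non-empty fibres partition the total space, and tests whether all fibres have the same cardinality.

```python
def fibre_sets(E, pi, B):
    """Return the fibre pi^{-1}({b}) for each b in B as a dict of sets."""
    fibres = {b: set() for b in B}
    for e in E:
        fibres[pi(e)].add(e)
    return fibres

# Hypothetical example data: points of E are (base point, label) pairs.
B = {"b1", "b2", "b3"}
E = {("b1", 0), ("b1", 1), ("b1", 2), ("b2", 0)}
pi = lambda e: e[0]   # projection map E -> B

fibres = fibre_sets(E, pi, B)

# The non-empty fibres are pairwise disjoint and cover E.
nonempty = [s for s in fibres.values() if s]
is_partition = (set().union(*nonempty) == E
                and sum(len(s) for s in nonempty) == len(E))

# For finite sets, equipotent means equal cardinality.
is_uniform = len({len(s) for s in fibres.values()}) == 1

print(fibres, is_partition, is_uniform)
```

In this example the fibre at b3 is empty because the projection is not required to be surjective, and the fibres at b1 and b2 have different sizes, so the fibration is non-uniform.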
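22.1.2 Remark: Definition 22.1.1 is illustrated in Figure 22.1.1.

[Figure 22.1.1: Non-uniform non-topological fibrations. The total space $E$, with fibres $E_{b_1} = \pi^{-1}(\{b_1\})$ and $E_{b_2} = \pi^{-1}(\{b_2\})$, is projected by $\pi$ onto the base space $B$, which contains the points $b_1$ and $b_2$.]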
If the fibre sets π^{-1}({b}) are not equipotent (do not have the same cardinality), then it is not possible to define parallelism between fibre sets in terms of bijections. Weaker, non-bijective parallelism relations satisfying symmetry and transitivity could be defined, but this does not seem very useful for differential geometry, although one could put a contrary argument. Bijections are not necessarily the only relations of interest between objects which are attached to different points of a base set.

If the fibre sets π^{-1}({b}) are equipotent, then there is a fixed set F such that for all b ∈ B, there exists a bijection φ : π^{-1}({b}) → F. This could be thought of as a sort of “uniform non-topological fibration with extrinsic fibre F” as presented in Definition 22.1.3. In physical models, one does not expect to see an extrinsic “fibra ex machina”, but it is usual to define fibre bundles in this way, the extrinsic fibre space being unique up to bijection.

22.1.3 Definition: A (uniform) non-topological fibration is a tuple (E, π, B) such that E and B are sets, π is a function π : E → B, and
(i) ∀b1, b2 ∈ B, ∃φ : π^{-1}({b1}) → π^{-1}({b2}), φ is a bijection.
A set F is said to be a fibre space for (E, π, B) if
(ii) ∀b ∈ B, ∃φ : π^{-1}({b}) → F, φ is a bijection.

22.1.4 Remark: Definition 22.1.3 (i) means that all sets π^{-1}({b}) are equipotent.

22.1.5 Definition: A fibre chart with fibre F for a fibre bundle (E, π, B) is a map φ : π^{-1}(U) → F for some set U ⊆ B such that
(i) π × φ : π^{-1}(U) → U × F is a bijection.

22.1.6 Remark: Definition 22.1.5 is illustrated in Figure 22.1.2 using the notation E_b = π^{-1}({b}) for b ∈ B. Condition (i) is equivalent to requiring that the restriction φ|_{π^{-1}({b})} : π^{-1}({b}) → F be a bijection for all b ∈ U. It is clear from the diagram that a fibre chart is analogous to a projection map in that they both project the total space down to another space, either B or F.

22.1.7 Definition: A fibre atlas with fibre space F for a fibre bundle (E, π, B) is a set A^F_E of functions φ : π^{-1}(U) → F for U ⊆ B such that π × φ : π^{-1}(U) → U × F is a bijection.
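The definitions of this section can be tried out on finite sets. The following is a minimal sketch, not part of the original text: the function names, the dictionary representation of π and φ, and the example data are all assumptions made for illustration. For finite sets, "equipotent" simply means "of equal cardinality", which is what the uniformity check relies on.

```python
from itertools import product


def fibre_sets(E, pi, B):
    """Return the fibre set pi^{-1}({b}) for each b in B (pi is a dict E -> B)."""
    return {b: {x for x in E if pi[x] == b} for b in B}


def is_uniform(E, pi, B):
    """Definition 22.1.3 (i) for finite sets: all fibre sets have equal size."""
    sizes = {len(fibre) for fibre in fibre_sets(E, pi, B).values()}
    return len(sizes) <= 1


def is_fibre_chart(E, pi, phi, U, F):
    """Definition 22.1.5 (i): pi x phi is a bijection from pi^{-1}(U) to U x F."""
    domain = {x for x in E if pi[x] in U}
    if set(phi) != domain:
        return False  # phi must be defined exactly on pi^{-1}(U)
    pairs = {(pi[x], phi[x]) for x in domain}
    # Injective (no collisions) and surjective (hits all of U x F).
    return len(pairs) == len(domain) and pairs == set(product(U, F))


# Hypothetical example data: two base points, three-element fibre space.
B = {"b1", "b2"}
F = {0, 1, 2}
E = set(product(B, F))
pi = {x: x[0] for x in E}
# A chart that relabels the fibre over b1 cyclically and the fibre over b2 trivially.
phi = {(b, k): ((k + 1) % 3 if b == "b1" else k) for (b, k) in E}

# Removing one point gives a non-uniform fibration; a constant map is not a chart.
E_bad = E - {("b2", 2)}
pi_bad = {x: x[0] for x in E_bad}
phi_const = {x: 0 for x in E}
```

In this sketch, `is_uniform(E, pi, B)` and `is_fibre_chart(E, pi, phi, B, F)` both hold, whereas `is_uniform(E_bad, pi_bad, B)` and `is_fibre_chart(E, pi, phi_const, B, F)` fail.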
[Figure 22.1.2: Uniform non-topological fibration with fibre chart. The projection map π sends the total space E, with fibre sets E_{b1}, E_{b2} and E_b, to the base space B with points b1, b2 and b. The fibre chart φ, with restriction φ|_{E_b} = φ|_{π^{-1}({b})}, sends the fibre sets to the fibre space F.]
22.1.8 Remark: Since the base set B is unstructured, there is no difference between a chart φ : π^{-1}(U) → F for b ∈ U and a set of individual pointwise charts φ : π^{-1}({b}) → F for all b ∈ U. Only when structures such as a topology or a differentiable manifold atlas are introduced on B is it worthwhile to define non-pointwise charts. An unstructured set is effectively equivalent to a topological space with the discrete topology. (See Definition 14.2.19 for discrete topology.) Since the fibre space F is also unstructured, even pointwise fibre charts contain no information, because all bijections between the fibre space and a given fibre set are equivalent. The value of defining structures such as charts and atlases for unstructured fibre bundles is to highlight the information that is contained in them when structure is present.

22.1.9 Definition: A uniform non-topological fibration with fibre space F is a tuple (E, π, B, A^F_E) such that
(i) (E, π, B) is a uniform non-topological fibration;
(ii) F is a fibre space for (E, π, B);
(iii) A^F_E is a fibre atlas with fibre space F for (E, π, B).

22.2. Parallelism for non-topological fibrations

22.2.1 Remark: In the interests of pointless minimalism, it is desirable to define parallelism on non-topological fibre bundles. Parallelism associates elements of the fibre set π^{-1}({b}) at different points b of the base space B of a fibre bundle (E, π, B). A simple global or absolute parallelism would identify an element f1 ∈ π^{-1}({b1}) in the fibre set at b1 ∈ B corresponding to each fibre f0 ∈ π^{-1}({b0}) at b0 ∈ B. For notation, write f1 ∥ f2 when f1 and f2 are parallel fibres. The association rule for an absolute parallelism should be transitive between base points; that is, (f1 ∥ f2 and f2 ∥ f3) ⇒ f1 ∥ f3. Symmetry and idempotence conditions should also hold. The relation should also be a bijection between the fibre sets at any two points. This implies that the fibre bundle must be uniform, of course, although a generalized form of parallelism could be defined for non-uniform fibre bundles by using more general relations than bijections.

In the case of a uniform non-topological fibre space, an absolute parallelism corresponds to a simple equivalence relation on the total space such that each fibre is equivalent to precisely one fibre in the fibre set at each other base point. Absolute parallelism is illustrated in Figure 22.2.1. In this case, a bijection θ_{b b′} : E_b → E_{b′} defines parallelism for each pair of fibre sets E_b = π^{-1}({b}). These must obey the transitivity rule θ_{b1 b3} = θ_{b2 b3} ∘ θ_{b1 b2} for all b1, b2, b3 ∈ B. Clearly the function φ|_{E_{b′}} ∘ θ_{b b′} ∘ (φ|_{E_b})^{-1} : F → F is a bijection on F for all b, b′ ∈ B.

[Figure 22.2.1: Non-topological fibration with absolute parallelism. The fibre sets E_{b1}, E_{b2} and E_{b3} over the base points b1, b2 and b3 are related by the bijections θ_{b1 b2}, θ_{b2 b1}, θ_{b2 b3}, θ_{b3 b2}, θ_{b1 b3} and θ_{b3 b1}. The fibre chart restrictions φ|_{E_{b1}}, φ|_{E_{b2}} and φ|_{E_{b3}} map the fibre sets to the fibre space F, and the projection map π maps the total space to the base space B.]

In the case of non-absolute parallelism, it is usual to talk of “parallel transport”, which means that parallelism between fibres at two base points in a fibre bundle depends on the choice of path between the points. This is a kind of “pathwise parallelism”. For this to be meaningful, there must be a definition of permitted paths. For topological fibre bundles, the special paths are chosen to be continuous. It is difficult to specify pathwise parallelism on a fibre bundle without a topology or differentiable structure because all paths are equal in an unstructured set. (An unstructured base set is equivalent to using the discrete topology on the base set. All paths are continuous in the discrete topology.) If the base set B is finite, the number of non-intersecting paths is finite. So it is then not too difficult to formalize pathwise parallelism. When the base set is countable, questions of limits for infinite paths arise, which brings in questions about the topology. When the base set is uncountable, it is difficult to make any sensible formalism without a topology.

One way to deal with this problem is to define an arbitrary set P_{b1 b2} of subsets of the base space to be the permitted paths from a point b1 ∈ B to b2 ∈ B, subject to conditions of symmetry and transitivity. (The concatenation of two permitted paths should be a permitted path, but that is not essential.) Another approach is to define some notion of locality in the base space. In a discrete base space, the locality could specify a graph indicating which points are neighbours, and permitted paths could be composed of sequences of neighbouring points. This would then be the discrete version of a topology, known as a “network topology”. (For a network topology, the “tangent space” at each point could be simply the set of neighbours of the point!)

If the set of permitted paths P_{b1 b2} has been defined in some way on a fibre bundle for b1, b2 ∈ B, one may denote pathwise parallelism as f1 ∥_Q f2 if f1 is parallel to f2 along the path Q ∈ P_{b1 b2}. This is then the definition of “parallel transport” of the fibre f1 ∈ π^{-1}({b1}) from b1 to b2 along the path Q. One could also denote the parallelism as f2 = Θ_Q(f1) when f1 ∥_Q f2. Then a minimum requirement of a parallelism function Θ should be that Θ_{Q1+Q2} = Θ_{Q2} ∘ Θ_{Q1}, where Q1 + Q2 denotes the concatenation of paths Q1 and Q2. (This is illustrated in Figure 22.2.2.) It is even possible to define notions such as curvature in such a framework.

[Figure 22.2.2: Non-topological fibration with pathwise parallelism. The fibre sets E_{b1}, E_{b2} and E_{b3} over the base points b1, b2 and b3 are related by the parallel transport maps Θ_{Q1}, Θ_{Q2}, Θ_{Q3} and Θ_{Q4} along the paths Q1, Q2, Q3 and Q4 in the base space B. The fibre chart restrictions map the fibre sets to the fibre space F, and the projection map π maps the total space to the base space.]

[ The discussion of paths in Remark 22.2.1 should be updated in view of the new definitions of paths and curves. For discrete space, could use never-constant curves as an equivalent definition of paths. ]

22.2.2 Remark: Paths are defined in this book as equivalence classes of curves for some suitable equivalence relation. The most important attribute of curves which should be retained in an equivalence class is the direction of the curves. So curves which go in opposite directions must always be regarded as not equivalent. Details of the parametrization of curves may be regarded as redundant information which may be ignored in paths. Network topology is discussed in Section 16.10. Continuous curves may be defined with respect to network topologies. Then continuous directed paths may be defined as equivalence classes of continuous curves which have the same direction.

22.2.3 Remark: Non-topological fibre bundles with structure groups are discussed in Section 22.3. Topological fibre bundles are presented in Chapters 23–24. Differentiable fibre bundles are presented in Chapter 35.
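The composition rule Θ_{Q1+Q2} = Θ_{Q2} ∘ Θ_{Q1} of Remark 22.2.1 can be illustrated on a small network topology. The following is only a sketch under stated assumptions: the three-point base set, the fibre space {0, 1, 2}, the particular bijections attached to the edges, and the convention that a reversed edge carries the inverse bijection are all hypothetical choices, not taken from the text.

```python
def compose(p, q):
    """Return the bijection 'q after p' on {0, 1, 2}; bijections are tuples of images."""
    return tuple(q[p[i]] for i in range(len(p)))


def invert(p):
    """Return the inverse of the bijection p."""
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)


IDENTITY = (0, 1, 2)

# Hypothetical network topology on B = {0, 1, 2}: every directed edge between
# neighbouring base points carries a bijection between the two fibre sets,
# expressed through a chart as a bijection of the fibre space F = {0, 1, 2}.
EDGE_TRANSPORT = {
    (0, 1): (1, 0, 2),
    (1, 2): (0, 2, 1),
    (0, 2): (0, 1, 2),
}
# Assumption: traversing an edge backwards uses the inverse bijection.
for (a, b), perm in list(EDGE_TRANSPORT.items()):
    EDGE_TRANSPORT[(b, a)] = invert(perm)


def transport(path):
    """Theta_Q for a path Q given as a list of successive neighbouring base points."""
    theta = IDENTITY
    for a, b in zip(path, path[1:]):
        theta = compose(theta, EDGE_TRANSPORT[(a, b)])
    return theta


def concatenate(q1, q2):
    """Q1 + Q2 for paths where Q1 ends at the point where Q2 starts."""
    if q1[-1] != q2[0]:
        raise ValueError("paths cannot be concatenated")
    return q1 + q2[1:]


Q1, Q2 = [0, 1], [1, 2]
via_b1 = transport(concatenate(Q1, Q2))  # transport from 0 to 2 through 1
direct = transport([0, 2])               # transport from 0 to 2 along the direct edge
loop = transport([0, 1, 2, 0])           # transport around a closed path
```

With these choices the two routes from 0 to 2 give different bijections, so the parallelism is pathwise rather than absolute, and transport around the closed path is not the identity, which is the discrete analogue of non-zero curvature.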
22.3. Non-topological fibre bundles

This section presents fibre bundles which have structure groups but no topological structure. It is particularly interesting to see how parallelism is defined on such fibre bundles and their associated principal fibre bundles. (See Remark 6.6.11 for some comments on non-topological fibrations.)

22.3.1 Definition: A non-topological (G, F) fibre bundle for an effective left transformation group (G, F) −< (G, F, σ_G, µ) is a tuple (E, π, B, A^F_E) such that
(i) E and B are sets and π is a function π : E → B;
(ii) A^F_E is a set of functions φ : π^{-1}(U) → F for sets U ⊆ B such that each π × φ : π^{-1}(U) → U × F is a bijection;
(iii) ∀φ1, φ2 ∈ A^F_E, ∀b ∈ Dom(φ1) ∩ Dom(φ2), ∃g ∈ G, β_{b,φ1} ∘ (β_{b,φ2})^{-1} = L_g, where β_{b,φ} denotes the function φ|_{π^{-1}({b})} for all b ∈ B and φ ∈ A^F_E.

B is called the base space of the fibre bundle. E is called the total space. (Kobayashi/Nomizu [26], page 50, call the total space also the bundle space.) π is called the projection map. G is called the structure group. F is called the fibre space. A^F_E is called the fibre atlas. The functions φ ∈ A^F_E are called the fibre charts of the fibre bundle. The sets π^{-1}({b}) may be called the fibres or fibre sets of the fibre bundle.

22.3.2 Remark: Definition 22.3.1 is an extension of Definition 22.1.9. In fact, Definition 22.1.9 is equivalent to the special case that the group (G, F) in Definition 22.3.1 is the permutation group of F. (See Example 9.4.16 for permutation groups.) As mentioned in Remark 22.1.8, charts and atlases contain no real information if the sets B, E and F are unstructured. (E is structured by the projection map π, but is otherwise unstructured.) In this section, the fibre space is structured by the structure group G, but the base space B is still unstructured. Therefore the fibre charts do contain real information about the structuring of the individual fibre sets π^{-1}({b}), but the information is the same as if the charts were specified pointwise, i.e. one chart for each fibre set π^{-1}({b}).

[ Here define parallelism on non-topological fibre bundles with structure groups. See Remark 22.2.1 for ideas on how to do this. ]
[ Here present non-topological principal fibre bundles, with the motivation coming from parallelism for associated principal fibre bundles. Discuss invariants and covariants. Define porting of parallelism between associated fibre bundles and define curvature. ]
[ A non-topological version of the Stokes Theorem could be presented here if it is well-defined. Some sort of generalized definition of curvature could be put here too. ]

22.4. Finite transformation groups as fibre bundles

The only non-trivial example of a non-topological fibre bundle which the author can think of is the case of finite transformation groups. To be interesting, a class of fibre bundles must have the possibility of non-zero curvature because curvature is the core concept of differential geometry. Therefore an interesting class of fibre bundles must have a natural definition of parallel transport which is not trivial on closed paths.

Let (G, F) be a finite left transformation group. (It doesn’t really need to be finite, but that makes things simpler initially. Left transformation groups are defined in Section 9.4.) Define the base space for a fibre bundle by B = ℤ^G, the set of all integer-valued functions on the set G. The set B may be given an addition operation σ_B : B × B → B defined by σ_B : (b1, b2) ↦ (g ↦ b1(g) + b2(g)). Then (B, σ_B) is a commutative group with identity 0_B : g ↦ 0.

A distance function d : B × B → ℤ^+_0 may be defined on B by d : (b1, b2) ↦ Σ_{g∈G} |b2(g) − b1(g)|. (This is related to the Hamming distance function in signal processing theory. See Proakis [212], page 415.) This actually yields a discrete topology (or network topology) on B by defining the set of neighbourhoods of points b ∈ B as those points with distance 0 or 1 from b.

A local difference operation ∆ may be defined for pairs of elements of B with distance not exceeding 1. For b1, b2 ∈ B with d(b1, b2) = 1, define ∆(b1, b2) = h ∈ G if b1(g) − b2(g) = δ(g, h) for all g ∈ G, and ∆(b1, b2) = h^{-1} ∈ G if b1(g) − b2(g) = −δ(g, h) for all g ∈ G, where δ denotes the Kronecker delta function for G. If b1 = b2, define ∆(b1, b2) = e. (The set of values of ∆(b1, b2) may be interpreted as the tangent space at the point b1.)

Let the fibre space of the fibre bundle be F, and let the total space be E = B × F. Define the projection map π : E → B with π : (b, x) ↦ b, and define a single chart φ : E → F by φ : (b, x) ↦ x. Define the fibre atlas as A^F_E = {φ}.

A set of curves may now be defined in the base set B as the set C of all functions γ : [a, b] → B for intervals [a, b] = {t ∈ ℤ; a ≤ t ≤ b} of the integers, with the constraint that d(γ(t), γ(t + 1)) ≤ 1 for all t ∈ [a, b) = {t ∈ ℤ; a ≤ t < b}. (This means that the curve is continuous with respect to the discrete topology on B.) Never-constant curves are curves in C which satisfy d(γ(t), γ(t + 1)) = 1 for all t ∈ [a, b). The initial and terminal points of a curve γ : [a, b] → B are the points S(γ) = γ(a) and T(γ) = γ(b) respectively.

Now parallel transport may be defined on the (G, F) fibre bundle (E, π, B, A^F_E) as a map Θ^γ_{s,t} : E_{γ(s)} → E_{γ(t)} for all never-constant curves γ ∈ C and s, t ∈ Dom(γ) with s ≤ t by Θ^γ_{s,t} : (γ(s), f) ↦ (γ(t), µ(g^γ_{s,t}, f)) = (γ(t), g^γ_{s,t} f), where g^γ_{s,t} is defined inductively by the rules

$$\forall s \in \mathrm{Dom}(\gamma), \qquad g^\gamma_{s,s} = e,$$
$$\forall s, t \in \mathrm{Dom}(\gamma), \qquad s < t \;\Rightarrow\; g^\gamma_{s,t} = \Delta(\gamma(t), \gamma(t-1))\, g^\gamma_{s,t-1}.$$

In other words, Θ^γ_{s,t} performs a left action on the fibre element f of (γ(s), f) ∈ E_{γ(s)} by a group element g^γ_{s,t} = ∆(γ(t), γ(t − 1)) ∆(γ(t − 1), γ(t − 2)) . . . ∆(γ(s + 1), γ(s)) ∈ G. So

$$\Theta^\gamma_{s,t} : (\gamma(s), f) \mapsto \Bigl(\gamma(t),\ \Bigl(\prod_{u=s}^{t-1} \Delta(\gamma(u+1), \gamma(u))\Bigr) f\Bigr),$$

where the factors of the product are written from left to right in order of decreasing u, since G need not be commutative.

The curvature of the parallelism Θ may be defined as the map κ : C̄ → G, where C̄ = {γ ∈ C; S(γ) = T(γ)} is the set of closed curves in B, and κ : γ ↦ g^γ_{a,b} for γ : [a, b] → B.

Consider the example of the symmetric group G = S_3 of degree 3 acting on the set F = {1, 2, 3}. The 6 elements of this group are e_G, g1 = (12), g2 = (23), g3 = (31), g4 = (123) and g5 = (321). Consider the path γ : [0, 4] → B = ℤ^G defined by γ(0) = 0_B, γ(1) = e1, γ(2) = e1 + e2, γ(3) = e2 and γ(4) = 0_B, where e_k denotes the element e_k : g ↦ δ(g, g_k) of B. Then κ(γ) = g2^{-1} g1^{-1} g2 g1 = [g2, g1] = g4, the commutator of g2 and g1. For general finite groups, similar minimal cycles in the vicinity of any point in the base space yield all of the commutators of the group. Thus curvature for this fibre bundle is very closely related to the commutator operation of the group. (These minimal cycle curvatures seem to form some sort of Lie algebra.) In the special case of a commutative group, the curvature is everywhere equal to the identity of the structure group. Hence the fibre bundle is flat if and only if the group is commutative.

All products of elements of the group may be regarded as continuous paths in this fibre bundle. The difference between any two products which have the same terms in a different order may be calculated by a kind of discrete Stokes Theorem in terms of the curvature on minimal cycles which fill in the concatenation of one path with the reverse of the other.

The base space B could presumably be constructed by grafting together patches from the set ℤ^G. That is, manifolds with interesting topologies could be constructed from patches of ℤ^G via identification spaces. For such manifolds, one could study analogues of exterior derivatives. One could possibly investigate the solutions of boundary value problems in these kinds of discrete fibre bundles. For example, one might study equations of prescribed curvature. Initial value problems may also be of some interest.
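The S_3 example can be checked by direct computation. The sketch below is an illustration only, under assumptions that are not in the text: permutations are stored as tuples of images of (1, 2, 3), the product g h is taken to mean "apply h first, then g" (matching a left action), and an element of B = ℤ^G is stored as a tuple of six integers indexed in the order e_G, g1, ..., g5.

```python
def mul(g, h):
    """Group product g h for a left action: (g h)(x) = g(h(x))."""
    return tuple(g[h[i] - 1] for i in range(3))


def inv(g):
    """Inverse permutation."""
    out = [0, 0, 0]
    for i, y in enumerate(g):
        out[y - 1] = i + 1
    return tuple(out)


# Elements of S_3 as tuples (image of 1, image of 2, image of 3).
E_G = (1, 2, 3)
G1 = (2, 1, 3)  # (12)
G2 = (1, 3, 2)  # (23)
G3 = (3, 2, 1)  # (31)
G4 = (2, 3, 1)  # (123): 1 -> 2, 2 -> 3, 3 -> 1
G5 = (3, 1, 2)  # (321): 3 -> 2, 2 -> 1, 1 -> 3
GROUP = [E_G, G1, G2, G3, G4, G5]

ZERO = (0,) * 6


def unit(k):
    """The base point e_k : g -> delta(g, g_k), with k an index into GROUP."""
    return tuple(1 if i == k else 0 for i in range(6))


def add(b1, b2):
    """The addition operation on B, applied pointwise."""
    return tuple(x + y for x, y in zip(b1, b2))


def delta(b1, b2):
    """Local difference Delta(b1, b2) for base points at distance at most 1."""
    nonzero = [(k, x - y) for k, (x, y) in enumerate(zip(b1, b2)) if x != y]
    if not nonzero:
        return E_G
    if len(nonzero) != 1 or abs(nonzero[0][1]) != 1:
        raise ValueError("base points are not neighbours")
    k, d = nonzero[0]
    return GROUP[k] if d == 1 else inv(GROUP[k])


def transport_element(curve):
    """g_{a,b} for the whole curve, via g_{s,t} = Delta(gamma(t), gamma(t-1)) g_{s,t-1}."""
    g = E_G
    for prev, cur in zip(curve, curve[1:]):
        g = mul(delta(cur, prev), g)
    return g


# The closed curve of the example: 0, e1, e1 + e2, e2, 0.
gamma = [ZERO, unit(1), add(unit(1), unit(2)), unit(2), ZERO]
kappa = transport_element(gamma)

# The same kind of minimal cycle built from the commuting elements g4 and g5.
gamma_flat = [ZERO, unit(4), add(unit(4), unit(5)), unit(5), ZERO]
kappa_flat = transport_element(gamma_flat)
```

Under these conventions `kappa` agrees with the commutator g2^{-1} g1^{-1} g2 g1 and equals g4 = (123), while the cycle built from the commuting pair g4, g5 has trivial curvature. Reading products left to right instead would give g5, so the stated value g4 depends on the left-action convention.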
The curvature for a differentiable fibre bundle is calculated as the limit for a sequence of curves which converge to a point while staying (roughly speaking) within a fixed tangent plane. This then yields a generator (or ‘infinitesimal translation’) of the structure group for each pair of tangent directions rather than an element of the group.

Further examples of non-trivial discrete fibre bundles with non-zero curvature may be obtained by tessellating any Lie group or any differentiable fibre bundle with a connection. Thus by embedding a network of points and curves joining points inside the base space of a differentiable fibre bundle, the connection on the differentiable fibre bundle may be approximated by a discrete fibre bundle. (This seems to be related to the homology of simplicial complexes or polyhedra.) This kind of construction would typically yield a set of discrete curvature values which is not a finite subgroup of the differentiable structure group, but it would yield a discrete subgroup. In very special cases, the subgroup could be finite.
Chapter 23 Topological fibre bundles
23.1 History, motivation and overview
23.2 Topological fibrations with intrinsic fibre spaces
23.3 Topological fibrations and fibre atlases
23.4 Fibration identification spaces
23.5 Structure groups discussion
23.6 Topological fibre bundles
23.7 Fibre bundle homomorphisms, isomorphisms and products
23.8 Structure-preserving fibre set maps
23.9 Topological principal fibre bundles
23.10 Associated topological fibre bundles
23.11 Construction of associated topological fibre bundles
23.12 Construction of associated topological fibre bundles via orbit spaces
This chapter deals with topological fibre bundles. Non-topological fibre bundles are defined in Chapter 22. Differentiable fibre bundles are defined in Chapter 35.
Although this chapter may be regarded as a subset of Chapter 35, the relative simplicity of topological fibre bundles makes it easier to examine the fine detail of definitions in depth without the distractions of topics such as Lie groups and vector fields which arise in the case of differentiable fibre bundles. Therefore many issues are investigated in this chapter which are skipped over in Chapter 35.

There are broadly three species of fibre bundles: fibrations (also called groupless fibre bundles), ordinary fibre bundles and principal fibre bundles. Fibrations are ordinary fibre bundles which have no structure group. Principal fibre bundles are ordinary fibre bundles for which the structure group and the fibre space are the same. (These concepts are more fully explained in Section 23.1.) Additional topics include parallelism on fibre bundles and associations between ordinary and principal fibre bundles.

23.0.1 Remark: For brevity, OFB is short for “ordinary fibre bundle”, and PFB is short for “principal fibre bundle”. These are non-standard abbreviations.

23.1. History, motivation and overview

23.1.1 Remark: It seems that fibre bundles as a distinct concept were first defined by Eduard Ludwig Stiefel in 1936 in the context of pathwise parallelism in manifolds. (See EDM2 [34], 147.A.)

[ Should be able to dig up more history than this! ]

23.1.2 Remark: Fibre bundles are designed to support parallel transport in the same sense that roads are designed to support car, truck and bus transport.
The fundamental motivation for fibre bundles is to provide a structure on which parallelism may be defined. Parallelism is defined on topological fibre bundles in Chapter 24. Differentiable parallelism on differentiable fibre bundles, known as a “connection”, is presented in Chapters 36–38.
23.1.3 Remark: The “elevated” language of fibre bundles provides a level of abstraction which can be quite annoying to people who want to understand differential geometry with the minimum of fuss. Unfortunately, the fibre bundle language is widely used and cannot be avoided.

Roughly speaking, a fibre bundle is an attachment of multiple copies of a “fibre space” at the points of a manifold, one copy being attached to each point in the space. (See Figure 23.2.1.) The aggregate set of pointwise copies of the fibre space may be given structures such as a topology, a differentiable structure and a transformation group structure. Examples of fibre bundles include tangent spaces, tensor spaces and spaces of k-tuples of tangent vectors on differentiable manifolds.

The relevance of fibre bundles in physics arises from the fact that field theories attach various scalar, vectorial and tensorial objects to each point in space-time. (The word “field” in physics means a function of space or space-time. This is quite different to the mathematical “field” in Definition 9.8.8, which is an algebraic structure.) Physical fields are “cross-sections” of fibre bundles. Thus physical fields live inside fibre bundles. A short, snappy slogan to explain fibre bundles is that “fibre bundles are the spaces where physical fields live”.

Some physics-oriented texts on differential geometry mention no sets at all. It is possible to do mathematics without identifying which set each object “lives in”. But clear, unambiguous thinking is very much assisted by associating each mathematical object with a containing set.

The structures of fibre bundles are used in the formulation of the equations of motion of the fields. These structures add a “horizontal” component to fibre bundles in addition to the “vertical” structure within each pointwise fibre set. (See Figure 23.2.3.) The whole point of fibre bundles is to specify the horizontal structure so that the fibre elements at nearby points in the manifold are related in some way instead of being simply isolated copies of the fibre space with no relation to neighbouring fibres. The horizontal structure is essential for defining “covariant derivatives” of physical fields. These are derivatives which, roughly speaking, measure the rate of change over space or space-time of a field with respect to the horizontal structure on the fibre bundle.

In summary, a fibre bundle may be characterized as “one copy of a fibre space F at each point of a base space B, plus some horizontal structure to specify relations between the copies of F at different points of B”. The horizontal relations may be visualized as a kind of glue attaching fibres at neighbouring points.

23.1.4 Remark: A fibration has a “base set” B to which is attached a “fibre set” E_b at each b ∈ B. These fibre sets are pairwise disjoint. Their union E = ⋃_{b∈B} E_b is called the “total space” of the fibration. The function π : E → B which maps points in each fibre set E_b to its base point b is called the “projection map” of the fibration. It follows that E_b = π^{-1}({b}) for all b ∈ B. (See Figure 22.1.1.) The customary notation “E” for the total space is most likely mnemonic for the French “entier” which means “whole”, “entire” or “total”.

A typical fibre bundle is the set of tangent vectors at points of a manifold. (See Section 28.8 for total tangent spaces.) The set of tangent vectors is the total space of a fibration because each tangent vector is attached to a unique point of the manifold. (Crampin/Pirani [11], chapter 14, page 353, give a succinct motivation for fibre bundles as a generalization of tangent bundles.)

A topological fibration has a total space E and base space B which are topological spaces such that the projection map π : E → B is continuous. The fibre sets E_b = π^{-1}({b}) are required to be homeomorphic to each other in the relative topology of E. The full definition of fibrations (or “groupless fibre bundles”) is given in Sections 23.2 and 23.3.

23.1.5 Remark: Definitions of parallelism are required to preserve the structure of fibre sets at different points of a base space. The structure and symmetry of fibre sets are specified in terms of a “structure group”. A fibre bundle is a fibration for which a structure group is specified. Fibre bundles with structure groups (or “ordinary fibre bundles”) are defined in Section 23.6. Unless otherwise stated, a fibre bundle will usually mean an ordinary fibre bundle, but sometimes it means any of the three species of fibre bundles.

23.1.6 Remark: Principal fibre bundles, defined in Section 23.9, are fibre bundles for which the structure group and fibre space are the same, but additionally a “right action” map is defined. This is used in definitions of parallelism.

23.1.7 Remark: Figure 23.1.1 is a comparative sketch of the maps and spaces which are required in the definitions of fibrations, fibre bundles and principal fibre bundles.
[Figure 23.1.1: Fibrations, fibre bundles and principal fibre bundles. Three panels. Fibration: the maps π, φ and π × φ relate π^{-1}(U) ⊆ E to U ⊆ B, F and U × F ⊆ B × F. (Ordinary) fibre bundle: the same spaces and maps, together with the structure group G and the maps L_g and L_{g,φ}. Principal fibre bundle: the maps q, φ and q × φ relate q^{-1}(U) ⊆ P to U ⊆ B, G and U × G ⊆ B × G, together with the maps L_g, L_{g,φ} and the right action R_g.]
Each of these three species of fibre bundle is built on a base space B with a fibre space F (or G). There are projection maps π or q and local charts φ mapping the total space E or P to the fibre space so that π × φ or q × φ is a local homeomorphism. A fibration has no structure group G, whereas the fibre space F of a principal fibre bundle is the same as the structure group G. The principal fibre bundle also has a right action Rg on the total space P for each g ∈ G. (Any similarity between the two fibre bundle figures on the right and the Trojan rabbit in Scene 9 of “Monty Python and the Holy Grail” (1974) is purely coincidental. See Chapman et alia [196], page 27.)
23.1.8 Remark: There are many similarities between the definitions for topological manifolds and topological fibre bundles because their global structure may be expressed in terms of an atlas. Charts and atlases are optional for topological manifolds and topological fibrations, but they are required for topological fibre bundles which have a structure group. The following table summarizes the relations between some basic definitions for topological manifolds and fibre bundles.

concept            topological manifold    topological fibration    topological fibre bundle
basic definition   Definition 26.3.1       Definition 23.2.1        Definition 23.6.4
chart              Definition 26.4.2       Definition 23.3.4        (Definition 23.3.4)
compatible chart   (Definition 26.4.2)     (Definition 23.3.4)      Definition 23.6.14
atlas              Definition 26.4.6       Definition 23.3.14       (Definition 23.6.4)
equivalent atlas   (Definition 26.4.6)     (Definition 23.3.14)     Definition 23.6.16

The author has tried as much as possible to harmonize the definitions for manifolds and fibre spaces.

23.1.9 Remark: Concerning the specification and parametrization of fibre bundles, one may compare the usual practice for fibre bundles with that for manifolds. One does not usually include in the specification tuple for a topological n-dimensional manifold the set ℝ^n and the topology on ℝ^n. In the case of a C^k differentiable manifold, one does not include the pseudogroup of local diffeomorphisms of ℝ^n as part of the specification tuple. These structures are assumed to exist outside the manifold. An atlas specifies the relation of the manifold to the external Euclidean space which possesses the topological and/or differentiable structure. Oddly, in the case of fibre bundles, the usual practice is to include the structure group and fibre space in the specification tuple.

23.1.10 Remark: On the subject of terminology, it would be sensible to call a fibre set E_b a “fibre” at b, but some texts use the word “fibre” for an extrinsic fibre space, which is defined in Definition 23.3.3. It is tempting to think of each member of the fibre set E_b as a fibre, but the English-language words “thread” and “filament” perhaps better describe elements of the fibre set, while the fibre (set) can be thought of as consisting of many individual filaments spun together. So a fibre is more like rope or knitting wool.
23.2. Topological fibrations with intrinsic fibre spaces

Fibrations are groupless fibre bundles, but topological fibrations are the same as fibre bundles which have a minimal structure group, namely the group of topological automorphisms of the fibre space. Therefore fibrations have an implicit structure group. However, topological fibre bundles without an explicit structure group are used so frequently that it is useful to have a special word for them.

[ A candidate topology for the automorphism group might be the direct product topology, which corresponds to pointwise convergence. ]

Groupless fibre bundles are called “fibrations” by Crampin/Pirani [11], page 353, although they define only C^∞ differentiable fibrations. The Gallot/Hulin/Lafontaine [19] definition 1.91 of a topological fibre bundle is groupless, but the authors immediately proceed to rename such fibre bundles to “fibrations” without warning. The definition of a fibration in EDM2 [34], 148.B, requires the projection map to have a “covering homotopy property” (and they define “fibration” and “fibre space” to be synonyms). The EDM2 [34] fibration definition is an algebraic topology concept which is totally different to the definition in this book.

It would probably be safest to always use the term “groupless fibre bundle” instead of “fibration”. But the former term is too long. Therefore the terminology of Crampin/Pirani [11] is adopted in this book: “fibration” will mean a groupless fibre bundle, either non-topological (Section 22.1), topological or differentiable. The algebraic topology usage will be ignored.

A fibration (Definition 23.2.1) is specified by a triple (E, π, B), where E and B are topological spaces and π : E → B is a continuous function. For each b ∈ B, the set E_b = π^{-1}({b}) is thought of as “the fibre at b”. Each fibre may be pictured as attached to b as its base point. (See Figure 23.2.1.)

[Figure 23.2.1: Partition of total space into fibres by projection map. The total space E is partitioned into fibres such as π^{-1}({b2}) and π^{-1}({b}), and the projection map π sends them to the points b1, b2 and b of the base space B.]
The fibres $\pi^{-1}(\{b\})$ are clearly disjoint subsets of $E$. In fact, they constitute a partition of $E$. (This partitioning is a general property of inverses of functions which has nothing to do with continuity or topology, of course. See Remark 6.9.7 for non-topological fibre bundles.) Although $\pi$ is defined from $E$ to $B$, it should be thought of as a one-to-many map from $B$ to $E$, defined as the inverse of a many-to-one function. But one should think in terms of a set of disjoint subsets of $E$, one attached to each point of $B$.

The space $B$ is called the “base space” of the fibre bundle, $E$ is called the “total space”, and $\pi$ is called the “projection map”. The fibres $\pi^{-1}(\{b\})$ are required to be topologically equivalent (i.e. homeomorphic) to each other, and they must vary continuously in some sense with respect to the base point $b$. This continuity is expressed through local “fibre charts”.

Since the fibres at all points of $B$ are topologically equivalent, it is usual to specify a separate topological space $F$ such that $\pi^{-1}(\{b\}) \approx F$ for all $b \in B$, and $F$ is nominated as “the” fibre space for the fibre bundle. The choice of $F$ is clearly arbitrary up to homeomorphism. For clarity, a space of the form $\pi^{-1}(\{b\})$ may be referred to as an “intrinsic fibre space”, whereas a space such as $F$ may be referred to as an “extrinsic fibre space”.

Definition 23.2.1 is a slightly unconventional definition for a fibre bundle. The standard definitions have an extrinsic fibre space $F$ such that each point of $B$ has an open neighbourhood $U$ for which $\pi^{-1}(U)$ is homeomorphic to the direct product $U \times F$. Given a topological fibration $(E,\pi,B)$ with an intrinsic fibre space according to Definition 23.2.1, the choice of an extrinsic fibre space $F$ is arbitrary up to homeomorphism. Therefore it is best not to include a particular choice of $F$ in the definition. However, in practice most people want a particular choice of fibre space.
Therefore topological fibre bundles are formulated here in the following three steps.
(1) Definition 23.2.1 defines a fibration with an intrinsic fibre space;
(2) Definition 23.3.7 defines a fibration with an extrinsic fibre space;
(3) Definition 23.3.14 defines fibre atlases for fibrations with an extrinsic fibre space.

For practical purposes a fibre atlas is very useful, but for the formal definition it is not essential except when a structure group is specified.

23.2.1 Definition: A topological fibration with intrinsic fibre space is a tuple $(E,\pi,B) -\!\!< (E,T_E,\pi,B,T_B)$ such that
(i) $E -\!\!< (E,T_E)$ and $B -\!\!< (B,T_B)$ are topological spaces,
(ii) $\pi : E \to B$ is continuous,
(iii) $\forall b \in B,\ \exists U \in \mathrm{Top}_b(B),\ \exists \phi : \pi^{-1}(U) \to \pi^{-1}(\{b\}),\ \pi \times \phi : \pi^{-1}(U) \approx U \times \pi^{-1}(\{b\})$,
(iv) $\forall b_1, b_2 \in B,\ \pi^{-1}(\{b_1\}) \approx \pi^{-1}(\{b_2\})$.

$E$ is called the total space of $(E,\pi,B)$. $\pi$ is called the projection map of $(E,\pi,B)$. $B$ is called the base space of $(E,\pi,B)$. For any $b \in B$, the set $\pi^{-1}(\{b\})$ is called the fibre (set) of $(E,\pi,B)$ at $b$. The maps $\phi$ are called intrinsic fibre charts for $(E,\pi,B)$.

23.2.2 Remark: The spaces and maps in Definition 23.2.1 are illustrated in Figure 23.2.2. Recall that $\mathrm{Top}_b(B)$ denotes the set of all open neighbourhoods of $b \in B$. (See Notation 14.2.11.) See Definition 6.9.12 for pointwise direct products of functions such as $\pi \times \phi$.

[Figure 23.2.2: A map $\phi$ for an intrinsic fibre space $\pi^{-1}(\{b\})$. Diagram of the maps $\phi : \pi^{-1}(U) \to \pi^{-1}(\{b\})$, $\pi : \pi^{-1}(U) \to U$ and $\pi \times \phi : \pi^{-1}(U) \to U \times \pi^{-1}(\{b\})$, where $\pi^{-1}(U) \subseteq E$, $U \subseteq B$ and $U \times \pi^{-1}(\{b\}) \subseteq B \times \pi^{-1}(\{b\})$.]
Definition 23.2.1 (iii) implies by Theorem 15.1.9 that $\pi^{-1}(\{b_1\}) \approx \pi^{-1}(\{b_2\})$ for all $b_1, b_2 \in U$. In the case of a connected topology on $B$, this implies condition (iv). So condition (iv) is superfluous in the case that $B$ is connected.

Since the fibres $\pi^{-1}(\{b\})$ are pairwise homeomorphic, any topological space $F$ which is homeomorphic to one such fibre is homeomorphic to all of them. The space $F$ is therefore uniquely defined up to homeomorphism and is uniform over all of $B$.

Definition 23.2.1 (iv) means that the fibres of $(E,\pi,B)$ are globally uniform. This is reminiscent of the requirement that a manifold have the same dimension at all points. Just as a manifold could be alternatively defined to have different dimensions on different components, so a fibration or fibre bundle could be defined to have different fibre spaces on different components of the base space. But the benefits of the increased generality would not outweigh the nuisance.

23.2.3 Remark: Trivial examples are often useful for checking the basic sanity of definitions and theorems. The topological space $(B,T_B)$ in Definition 23.2.1 could be as trivial as the space $(\emptyset,\{\emptyset\})$, which implies that $E = \emptyset$ and $\pi = \emptyset$. This satisfies all of the conditions of Definition 23.2.1. This example could be referred to as the trivial or empty topological fibration.

Another kind of triviality occurs if $(E,T_E) = (\emptyset,\{\emptyset\})$ for an arbitrary topological space $(B,T_B)$. Then $\pi = \emptyset$, and condition (iii) is satisfied by $U = B$ and $\phi = \emptyset$ for all $b \in B$.

A slightly less trivial example is where $(E,T_E) = (B,T_B)$ for any topological space $(B,T_B)$ and $\pi$ is the identity $\mathrm{id}_E$ on $E$. Then $\pi^{-1}(\{b\}) = \{b\}$ for all $b \in B$ and condition (iii) is satisfied by $U = B$ and $\phi : b' \mapsto b$
for all $b' \in B$. (The discovery of further trivial examples is left as an exercise for the interested student. Everybody else may go home when they’ve tidied their desks.)

23.2.4 Remark: A map of the form $\phi : \pi^{-1}(U) \to \pi^{-1}(\{b\})$ maps the fibres at all points in $U$ to the fibre at $b$. This map is continuous. To clarify this, let $F = E_b = \pi^{-1}(\{b\})$ and consider the homeomorphism $\psi = (\pi \times \phi)^{-1} : U \times F \approx \pi^{-1}(U)$ restricted to a fixed $f \in F$. Define $\psi_f : U \to \pi^{-1}(U) \subseteq E$ by $\psi_f(b') = \psi(b', f) = (\pi \times \phi)^{-1}(b', f)$ for $b' \in U$. This is illustrated in Figure 23.2.3.

[Figure 23.2.3: Continuity of local chart with respect to base point. The points $\psi_f(b_1)$ and $\psi_f(b_2)$ in the fibres $\pi^{-1}(\{b_1\})$ and $\pi^{-1}(\{b_2\})$ of $E$ correspond to the point $f$ in the fibre $\pi^{-1}(\{b\})$, over the base points $b_1$, $b_2$ and $b$ in $B$.]
The function $\psi_f$ is continuous because $\pi \times \phi$ is a homeomorphism. So the point in each fibre $\pi^{-1}(\{b'\})$ which corresponds to the fixed $f \in F$ varies continuously with respect to $b' \in U \subseteq B$. This may be thought of as a “gluing together” of the fibre sets by the maps $\phi$. There are many ways to glue the fibre sets together, and this is what makes a fibration more than a mere set product of two topological spaces. One might say that the fibre sets are “continuously connected”. (This should not be confused with connections on differentiable fibre bundles, which are definitions of differential parallelism. In fact, the maps $\phi$ could define a kind of local parallelism, but this is usually completely irrelevant to any actual connection which is defined.)

The way in which a fibre bundle provides “glue” for the fibre sets is particularly clear in the case of a Möbius strip, where it is not possible to construct two disjoint continuous closed curves which each pass through each fibre set exactly once. So the curves $\psi_f$ for different $f \in F$ cannot all be defined globally. This implies that it is not possible to cover a Möbius strip with a single fibre chart.

23.2.5 Remark: Definition 23.2.1 of a fibration as a projection map $\pi : E \to B$ for topological spaces $E$ and $B$ is concise and tidy, but it has a problem: in many important cases, the set $E$ is difficult to specify directly. For example, consider the tangent fibration of a sphere. The tangent fibration of $S^n$ is not globally homeomorphic, as a fibration, to the topological product space $S^n \times \mathbb{R}^n$ for $n = 2$ (nor for any other $n \geq 2$ except $n = 3$ and $n = 7$). It is hard work determining constructions for tangent fibrations from standard topological spaces in terms of product topologies, quotient topologies, relative topologies and so forth. In fact, the simplest and most general way to construct tangent fibrations out of standard topologies (such as $\mathbb{R}^n$, $S^n$ and projective spaces) is to use “grafting”. This construction technique is justified by Theorem 15.10.5 and Definition 15.10.6. The reason that this kind of grafting of patches produces tangent fibrations so easily is the fact that a manifold is defined to be locally homeomorphic to a Euclidean space, which implies that manifolds are globally homeomorphic to a graft of Euclidean space patches.

23.2.6 Remark: A non-trivial example of a topological fibration is the set of local coordinates on $S^2$. In a neighbourhood of each point on a 2-sphere, it is straightforward to set up a homeomorphism between the local coordinates at each point in the neighbourhood and those at the given point. However, this map cannot be extended as a homeomorphism to the tangent space for the whole of the sphere. Therefore the tangent space of the sphere is not globally homeomorphic to the simple topological product of $S^2$ with the fibre space at any point.

23.3. Topological fibrations and fibre atlases

This section adds fibre spaces, fibre charts and fibre atlases to the fibrations with intrinsic fibre spaces defined in Section 23.2. The specification of an extrinsic fibre space for an intrinsic fibration (Definition 23.2.1) yields the topological fibration with extrinsic fibre space in Definition 23.3.7. The addition of a fibre atlas to Definition 23.3.7 yields the topological fibration with an atlas in Definition 23.3.17.
23.3.1 Remark: Contrary to traditional practice, the fibre space for a fibration is not included in specification tuples in this book. Just as the set $\mathbb{R}^n$ is not usually included in the specification tuple for a manifold, so it is also undesirable to include fibre spaces (and structure groups) in fibre bundle specification tuples. This is partly because inclusion of fibre spaces (and structure groups) makes the tuples excessively long, but mainly because these parameters are part of the class specification rather than the object specification. The principle here is that class specifiers should not be included in object specifiers.

23.3.2 Remark: The spartan simplicity of topological fibrations with intrinsic fibre spaces in Section 23.2 is inconvenient in practice. Extrinsic fibre spaces are introduced in Definition 23.3.3. By Definition 23.2.1, a fibre space is homeomorphic to all fibre sets $\pi^{-1}(\{b\})$ of a topological fibration if and only if it is homeomorphic to just one such fibre set. (A trivial exception is the empty fibration alluded to in Remark 23.2.3 where $B = \emptyset$ and therefore $(F,T_F) = (\emptyset,\{\emptyset\})$ is the fibre space.)

23.3.3 Definition: A fibre space or standard fibre for a topological fibration (with intrinsic fibre space) $(E,\pi,B)$ is any topological space $F$ such that $F \approx \pi^{-1}(\{b\})$ for all $b \in B$.

23.3.4 Definition: A fibre chart for a fibre space $F$ for a topological fibration (with intrinsic fibre space) $(E,\pi,B)$ is any function $\phi : \pi^{-1}(U) \to F$ such that $\pi \times \phi : \pi^{-1}(U) \approx U \times F$ for some $U \in \mathrm{Top}(B)$. (See Figure 23.3.1.)

[Figure 23.3.1: Fibre chart $\phi$ for an extrinsic fibre space $F$. Diagram of the maps $\phi : \pi^{-1}(U) \to F$, $\pi : \pi^{-1}(U) \to U$ and $\pi \times \phi : \pi^{-1}(U) \to U \times F$, where $\pi^{-1}(U) \subseteq E$, $U \subseteq B$ and $U \times F \subseteq B \times F$.]

23.3.5 Remark: Many texts, such as EDM2 [34], 147.B, define fibre charts for fibrations and fibre bundles in the inverse direction to that specified in Definition 23.3.4. They define fibre charts to have the form $\psi : U \times F \approx \pi^{-1}(U)$. The relations between the two styles of definitions are $\pi \times \phi = \psi^{-1}$ and $\phi = \pi_2 \circ \psi^{-1}$, where $\pi_2 : B \times F \to F$ denotes the projection $\pi_2 : (b,f) \mapsto f$. The fibre-to-total-space form of chart $\psi$ is less economical in the sense that an additional condition is required, namely that $\pi(\psi(b,f)) = b$ for all $b \in U$ and $f \in F$. Both Kobayashi/Nomizu [26], page 50, and Crampin/Pirani [11], page 354, define fibre charts in the same direction (total space to standard fibre) as here. [ See also Gallot/Hulin/Lafontaine [19], 1.91, page 31. ]
23.3.6 Remark: There is no distinction between a fibre chart and a compatible fibre chart in Definition 23.3.4 because any chart which is compatible with the topology on $E$ and $B$ must also be compatible with all other such charts, since the requirement for chart compatibility is purely topological. (This is explained in more detail in Remark 23.3.15.) Therefore all fibre atlases for topological fibrations are equivalent. So the specification of a fibre atlas is optional for topological fibrations. Definition 23.3.7 shows that topological fibrations may be defined in terms of local homeomorphisms rather than a global atlas. This is completely analogous to Definition 26.3.1 which defines a topological manifold without an atlas.

23.3.7 Definition: A topological fibration with fibre space $F$ for a topological space $F -\!\!< (F,T_F)$ is a tuple $(E,\pi,B) -\!\!< (E,T_E,\pi,B,T_B)$ such that
(i) $E -\!\!< (E,T_E)$ and $B -\!\!< (B,T_B)$ are topological spaces,
(ii) $\pi : E \to B$ is continuous,
(iii) $\forall b \in B,\ \exists U \in \mathrm{Top}_b(B),\ \exists \phi : \pi^{-1}(U) \to F,\ \pi \times \phi : \pi^{-1}(U) \approx U \times F$.
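As a basic sanity check of Definition 23.3.7, the following sketch verifies the three conditions for a product fibration. The symbols $\mathrm{pr}_1$ and $\mathrm{pr}_2$ for the coordinate projections are notations assumed only for this illustration; they do not appear in the text.

```latex
% Product fibration over an arbitrary topological space B with fibre space F.
% E carries the product topology, so pr_1 is continuous: conditions (i), (ii).
\[
  E = B \times F, \qquad
  \pi = \mathrm{pr}_1 : (b,f) \mapsto b, \qquad
  \phi = \mathrm{pr}_2 : (b,f) \mapsto f .
\]
% Condition (iii) holds with the single neighbourhood U = B for every b:
\[
  \pi^{-1}(B) = E, \qquad
  (\pi \times \phi)(b,f) = (\pi(b,f), \phi(b,f)) = (b,f),
\]
% so pi x phi is the identity map on B x F, which is a homeomorphism.
% Each fibre is a copy of F:
\[
  \pi^{-1}(\{b\}) = \{b\} \times F \approx F \qquad \text{for all } b \in B .
\]
```

In this case one fibre chart covers the whole total space, which is exactly what fails for examples such as the Möbius strip.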
23.3.8 Definition: A cross-section over a set $B' \subseteq B$ for a topological fibration $(E,\pi,B)$ is a continuous function $f : B' \to E$ such that $\pi \circ f = \mathrm{id}_{B'}$. A cross-section of a topological fibration $(E,\pi,B)$ is a cross-section over $B$.
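Definition 23.3.8 can be illustrated with a small finite model. The sketch below assumes discrete topologies on finite sets, so that every function is continuous and a cross-section is simply a right inverse of the projection. The names `fibre`, `cross_sections` and the toy sets `B`, `F`, `E` are hypothetical and chosen only for this example.

```python
from itertools import product


def fibre(E, pi, b):
    """Return the fibre pi^{-1}({b}) as a sorted list of elements of E."""
    return sorted(e for e in E if pi(e) == b)


def cross_sections(E, pi, B):
    """Yield every f: B -> E (as a dict) satisfying pi(f(b)) == b for all b.

    Assumption: E and B are finite with discrete topologies, so each such
    right inverse of pi is automatically continuous.
    """
    base = sorted(B)
    # One independent choice of fibre element for each base point.
    for choice in product(*(fibre(E, pi, b) for b in base)):
        yield dict(zip(base, choice))


# Toy product fibration: E = B x F with projection onto the first coordinate.
B = {0, 1, 2}
F = {"u", "v"}
E = {(b, f) for b in B for f in F}
pi = lambda e: e[0]

sections = list(cross_sections(E, pi, B))
```

Each cross-section is one choice of fibre element per base point, so the number of cross-sections here is the product of the fibre sizes. If some fibre is empty, no cross-section exists.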
23.3.9 Notation: $X(E,\pi,B)$ for a topological fibration $(E,\pi,B)$ denotes the set of all cross-sections of $(E,\pi,B)$.

23.3.10 Remark: In Definition 23.3.8, a cross-section over a subset $B'$ of the base space $B$ of a fibration $(E,\pi,B)$ represents a choice of fibre element $f(b) \in E_b = \pi^{-1}(\{b\})$ for each $b \in B'$ because $\pi \circ f = \mathrm{id}_{B'}$.
More generally, a right inverse $f$ of any surjective function $g : X \to Y$ represents the choice of an element $f(y) \in g^{-1}(\{y\})$ for all $y \in Y$. A cross-section is really just a continuous right inverse of the projection map. This has nothing at all to do with any structure group. The choice of fibre space for the fibration is not directly involved either.

It turns out that cross-sections are very important in the study of fibre bundles and differential geometry in general. For example, parallel transport is represented as cross-sections over paths in the base space $B$, and all of the fields of physics are represented as cross-sections over regions of the base space. Cross-sections are also an important tool for investigating the global topology of the base space. A useful way to think of cross-sections is as “fibre fields” by analogy with vector fields in physics.

23.3.11 Remark: Terminology for cross-sections is varied. Some authors (e.g. Crampin/Pirani [11]) hyphenate to “cross-section” and give “section” as an alternative. Other authors (e.g. Darling [13], Gallot/Hulin/Lafontaine [19], Lang [30], Lee [32]) use only “section”. Some (e.g. EDM2 [34], Kobayashi/Nomizu [26]) use only “cross section”. Some (e.g. Frankel [18], Spivak [42]) use principally “section”, but sometimes “cross section”.

The best practical choice of name may be “cross-section”. This avoids confusion with the word “section” as in “chapters and sections”. It also avoids the line breaks that may occur with the two words “cross section”, which would make string search on computer files difficult. In casual writing and discussion, probably “section” has no disadvantages, but in book-writing, the hyphenated “cross-section” seems best.

23.3.12 Remark: Since fibre charts $\phi : \pi^{-1}(U) \to F$ for a fibre space $F$ define homeomorphisms $\pi \times \phi : \pi^{-1}(U) \to U \times F$, the continuity of a cross-section in Definition 23.3.8 is equivalent to continuity “through the fibre charts”. In other words, a cross-section $f : B' \to E$ is continuous if and only if the map $\phi \circ f\big|_U : B' \cap U \to F$ is continuous for all fibre charts $\phi$. However, if the set of fibre charts is constrained to a specified atlas (as in the case of a fibre bundle with a structure group), this does not in any way constrain the class of continuous cross-sections.

23.3.13 Remark: The fibre charts in Definition 23.3.4 are expressed in terms of the arbitrary extrinsic fibre spaces in Definition 23.3.3. Since the choice of fibre space is arbitrary up to homeomorphism, fibre charts could in principle be targeted at more than one fibre space. To avoid such chaos, fibre atlases in Definition 23.3.14 are required to be targeted at a fixed choice of fibre space.

Although atlases are defined here in terms of a pre-defined topology on the fibration, this logic is the reverse of usual practice. It is more usual to define the atlas first and then define the topology so that all of the charts in the atlas are homeomorphisms.

23.3.14 Definition: A fibre atlas for a fibre space $F$ for a topological fibration $(E,\pi,B)$ is a set $A_E^F$ of fibre charts for the fibre space $F$ for $(E,\pi,B)$ such that $\bigcup_{\phi \in A_E^F} \mathrm{Dom}(\phi) = E$.

An indexed fibre atlas for fibre space $F$ for a topological fibration $(E,\pi,B)$ is a family $(\phi_i)_{i \in I}$ of fibre charts for fibre space $F$ for $(E,\pi,B)$ such that $\bigcup_{i \in I} \mathrm{Dom}(\phi_i) = E$. (That is, the range of the family $\phi = (\phi_i)_{i \in I}$ is a fibre atlas for $F$.)

23.3.15 Remark: The chart transition function for charts $\phi_i$ and $\phi_j$ in an indexed fibre atlas in Definition 23.3.14, with $\mathrm{Dom}(\phi_i) = \pi^{-1}(U_i)$ and $\mathrm{Dom}(\phi_j) = \pi^{-1}(U_j)$ for $U_i, U_j \in \mathrm{Top}(B)$, is
$$ (\pi \times \phi_i) \circ (\pi \times \phi_j)^{-1}\big|_{(U_i \cap U_j) \times F} : (U_i \cap U_j) \times F \to (U_i \cap U_j) \times F. $$
This is necessarily continuous by Definition 23.3.4. (See Figure 23.3.2.) Therefore it is unnecessary to add any subsidiary condition on the regularity of transition functions to Definitions 23.3.14 and 23.3.17. As in the case of manifolds, it is necessary to equip a fibration or ordinary fibre bundle with an atlas with prescribed regularity only in the case of higher classes of regularity than mere continuity. Similarly, it is not necessary to define the notion of equivalent atlases, since all fibre charts on a fixed fibration are automatically compatible.
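[Figure 23.3.2: Chart transition map for a fibration with fibre space $F$. The charts $\phi_i$ and $\phi_j$ map $\pi^{-1}(U_i)$ and $\pi^{-1}(U_j)$ in $E$ to $F$. The maps $\pi \times \phi_i$ and $\pi \times \phi_j$ send $\pi^{-1}(U_i \cap U_j)$ to the copies of $(U_i \cap U_j) \times F$ inside $U_i \times F \subseteq B \times F$ and $U_j \times F \subseteq B \times F$ respectively, and these two copies are related by the transition map $(\pi \times \phi_i) \circ (\pi \times \phi_j)^{-1}\big|_{(U_i \cap U_j) \times F}$.]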
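A worked instance of the chart transition map may help here. The sketch below uses the open Möbius strip with fibre space $F = \mathbb{R}$. The model $M$ as a quotient of $[0,1] \times \mathbb{R}$, the bracket notation $[t,f]$ for equivalence classes, and the particular choice of the two charts are all assumptions made only for this illustration.

```latex
% Total space, base space and projection (quotient topologies assumed).
\[
  M = ([0,1] \times \mathbb{R}) / \{(0,f) \sim (1,-f)\}, \qquad
  B = [0,1] / \{0 \sim 1\} \approx S^1, \qquad
  \pi : [t,f] \mapsto [t].
\]
% Two fibre charts for the fibre space F = R.
\[
  U_1 = \{[t] : 0 < t < 1\}, \qquad \phi_1 : [t,f] \mapsto f,
\]
\[
  U_2 = \{[t] : t \neq \tfrac12\}, \qquad
  \phi_2 : [t,f] \mapsto
  \begin{cases} f & \text{if } 0 \le t < \tfrac12, \\
                -f & \text{if } \tfrac12 < t \le 1. \end{cases}
\]
% phi_2 is well defined at the seam: phi_2[0,f] = f and phi_2[1,-f] = -(-f) = f.
% The overlap U_1 \cap U_2 has two components, and the transition map is
\[
  (\pi \times \phi_1) \circ (\pi \times \phi_2)^{-1} : ([t], f) \mapsto
  \begin{cases} ([t], f) & \text{if } 0 < t < \tfrac12, \\
                ([t], -f) & \text{if } \tfrac12 < t < 1. \end{cases}
\]
```

The transition map is continuous because the two components of the overlap are disjoint open sets, and on each component it is either the identity or the reflection $f \mapsto -f$ on the fibre.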
23.3.16 Remark: Definition 23.3.17 is an alternative to Definition 23.3.7. For the theoretical development, the atlas-free Definition 23.3.7 is often preferable, whereas for practical applications it is often preferable to have the atlas as in Definition 23.3.17.

23.3.17 Definition: A topological fibration with a fibre atlas for a fibre space $F$, where $F -\!\!< (F,T_F)$ is a topological space, is a tuple $(E,\pi,B) -\!\!< (E,T_E,\pi,B,T_B,A_E^F)$ such that
(i) $E -\!\!< (E,T_E)$ and $B -\!\!< (B,T_B)$ are topological spaces and $\pi : E \to B$ is continuous,
(ii) $\forall \phi \in A_E^F,\ \exists U_\phi \in T_B,\ \pi \times \phi : \pi^{-1}(U_\phi) \approx U_\phi \times F$,
(iii) $\bigcup_{\phi \in A_E^F} U_\phi = B$.

23.3.18 Remark: Since the maps $\pi \times \phi$ for the charts $\phi \in A_E^F$ in Definition 23.3.17 are homeomorphisms, the topologies $T_E$ and $T_B$ may be discarded without losing information. This is analogous to the fact that the atlas on a differentiable manifold makes the specification of the topology unnecessary. So a topological fibration could be specified as $(E,\pi,B,A_E^F)$ without loss of information. In fact, the sets $E$ and $B$ (and the sets $G$ and $F$) can be discarded too.

23.3.19 Remark: Numerous standard constructions may be defined for topological fibrations. For example, the direct product fibration in Definition 23.3.20 is based on Definition 23.3.7.

23.3.20 Definition: The direct product of two topological fibrations $(E_1,\pi_1,B_1) -\!\!< (E_1,T_{E_1},\pi_1,B_1,T_{B_1})$ and $(E_2,\pi_2,B_2) -\!\!< (E_2,T_{E_2},\pi_2,B_2,T_{B_2})$ with fibre spaces $F_1 -\!\!< (F_1,T_{F_1})$ and $F_2 -\!\!< (F_2,T_{F_2})$ respectively is the topological fibration $(E,\pi,B) -\!\!< (E,T_E,\pi,B,T_B)$ with fibre space $F -\!\!< (F,T_F)$ which satisfies the following.
(i) $(E,T_E) = (E_1,T_{E_1}) \times (E_2,T_{E_2})$, $(B,T_B) = (B_1,T_{B_1}) \times (B_2,T_{B_2})$ and $(F,T_F) = (F_1,T_{F_1}) \times (F_2,T_{F_2})$ are direct product topological spaces. (See Definition 15.1.1.)
(ii) $\pi = \pi_1 \times \pi_2 : E \to B$ is the direct product of $\pi_1$ and $\pi_2$. (See Definition 6.9.11.)

23.3.21 Theorem: The tuple $(E,\pi,B) -\!\!< (E,T_E,\pi,B,T_B)$ in Definition 23.3.20 is a topological fibration with fibre space $F$.
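Proof: For any $b = (b_1,b_2) \in B$, there are fibre charts $\phi_1 : \pi_1^{-1}(U_1) \to F_1$ and $\phi_2 : \pi_2^{-1}(U_2) \to F_2$ for $U_1 \in \mathrm{Top}_{b_1}(B_1)$ and $U_2 \in \mathrm{Top}_{b_2}(B_2)$ for the two respective fibrations. The direct product function $\phi = \phi_1 \times \phi_2 : \pi_1^{-1}(U_1) \times \pi_2^{-1}(U_2) \to F$ is a fibre chart for $(E,\pi,B)$ with $\mathrm{Dom}(\phi) = \pi^{-1}(U_1 \times U_2)$ and $b \in U_1 \times U_2 \in \mathrm{Top}_b(B)$.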
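The step which the proof leaves implicit is that $\pi \times \phi$ is a homeomorphism. A possible way to fill this in is sketched below. The symbol $\sigma$ for the coordinate-swapping map and the notation $\otimes$ for the Cartesian product of maps, $(g \otimes h)(x,y) = (g(x),h(y))$, are introduced here only for this illustration, to distinguish it from the pointwise product $(\pi \times \phi)(e) = (\pi(e),\phi(e))$.

```latex
% For e = (e_1, e_2) in pi_1^{-1}(U_1) x pi_2^{-1}(U_2):
\[
  (\pi \times \phi)(e_1,e_2)
  = \bigl( (\pi_1(e_1), \pi_2(e_2)),\ (\phi_1(e_1), \phi_2(e_2)) \bigr)
  \in (U_1 \times U_2) \times (F_1 \times F_2).
\]
% Rearranging coordinates with the map
\[
  \sigma : (U_1 \times F_1) \times (U_2 \times F_2) \to (U_1 \times U_2) \times (F_1 \times F_2),
  \qquad
  \sigma : ((b_1,f_1),(b_2,f_2)) \mapsto ((b_1,b_2),(f_1,f_2)),
\]
% gives the factorisation
\[
  \pi \times \phi = \sigma \circ \bigl( (\pi_1 \times \phi_1) \otimes (\pi_2 \times \phi_2) \bigr).
\]
% sigma is a homeomorphism of product spaces (a permutation of coordinates),
% and a Cartesian product of two homeomorphisms is a homeomorphism, hence
\[
  \pi \times \phi : \pi^{-1}(U_1 \times U_2) \approx (U_1 \times U_2) \times F .
\]
```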
23.4. Fibration identification spaces

The term “identification spaces” is used in EDM2 [34], 147.B to describe spaces which are constructed from patches by identifying the overlaps. Atlases are not provided for the fibrations constructed in this section, but in a sense, the constructed fibration is really an atlas itself. Structure groups are also not provided in this section because they are really a property of a fibration with a particular choice of atlas.

23.4.1 Remark: The construction, or reconstruction, of a topological fibration from charts is discussed by Crampin/Pirani [11], page 355. (See also Kobayashi/Nomizu [26], page 52.) There is more than one way to construct a fibration as a graft of cross products of a base space with a fibre space. In Definition 23.4.2, a fixed pre-defined base space $B$ is assumed. But it is also possible to construct the base space via the graft at the same time as constructing the fibre space structure on top of that base space. Set identification spaces are defined in Section 6.10.

23.4.2 Definition: Let $(U_i)_{i \in I}$ be a family of open subsets of a topological space $B$. Let $F$ be a topological space. A topological fibration identification space with base space $B$ and fibre space $F$ is a set graft $X \subseteq \mathring{\times}_{i \in I} (U_i \times F)$ together with a topology $T$, where $X$ satisfies
(i) $\forall (b_i,f_i)_{i \in J} \in X,\ \forall i,j \in J,\ b_i = b_j$,
(ii) $\forall i,j \in I,\ \forall b \in U_i \cap U_j,\ \exists x \in X,\ \{i,j\} \subseteq \mathrm{Dom}(x)$, and
(iii) the family of topological spaces $(U_i \times F, \mathrm{Top}(U_i \times F))_{i \in I}$ is topologically consistent with the graft $X$.
The topology $T$ on $X$ is the graft topology on $X$ derived from the patch topologies $(\mathrm{Top}(U_i \times F))_{i \in I}$.

23.4.3 Remark: Definition 23.4.2 is illustrated in Figure 23.4.1. Set grafts are introduced in Definition 6.10.5. Topological grafts are presented in Definition 15.11.2.
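[Figure 23.4.1: Element $((b,f_1),(b,f_2),(b,f_3))$ of graft set $X$. Three copies of the fibre space $F$ over the overlapping patches $U_1$, $U_2$ and $U_3$, which all contain the base point $b$, with matched fibre elements $f_1$, $f_2$ and $f_3$.]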
From Definition 6.10.5 (iii), it follows that all elements of all sets $U_i \times F$ are represented in the set graft $X$. In other words,
$$ \forall i \in I,\ \forall b \in U_i,\ \forall f \in F,\ \exists x \in X,\ x_i = (b,f). $$
It follows from Definition 6.10.5 (ii) that an element of a set $U_i \times F$ matches at most one element of each set $U_j \times F$ for $i \neq j$. Hence each element in each $U_i \times F$ is in precisely one element of $X$.

Definition 23.4.2 (i) means that only pairs $(b,f) \in B \times F$ with the same base point $b$ are permitted to be matched up in the set graft. Definition 23.4.2 (ii) then implies that each element $(b,f_i)$ of each set $U_i \times F$ must be matched to precisely one element $(b,f_j)$ in each other set $U_j \times F$ such that $b \in U_j$. (Condition (i) only says that if there is a matching element in $U_j \times F$, then it must have the same base point, whereas condition (ii) means that there is at least one such point.)

The purpose of the graft is simply to specify which fibre elements within one patch of the graft are matched up to which fibre elements for the same base point in different patches. This means that the set graft $X$ does not do any grafting of the base set $B$. The same result could have been achieved by defining a graft of a family of copies of $F$, one copy for each $i \in I$, at each point in $B$. But that would be quite clumsy to define. The set $B$ could itself be constructed as a graft of patches. It could be convenient, in fact, to do the grafting of both the base space and the fibres at the same time, but this is not done in Definition 23.4.2.

The topological consistency of the product topologies of the sets $U_i \times F$ with the graft $X$ means that the graft transition maps $h_{ij} : (U_i \cap U_j) \times F \to (U_i \cap U_j) \times F$ are continuous with respect to these product topologies for all $i,j \in I$. The graft maps $h_{ij}$ are defined so that $h_{ij}(b,f_i) = (b,f_j)$ if and only if there is some $x \in X$ such that $x_i = (b,f_i)$ and $x_j = (b,f_j)$. The continuity of a graft map $h_{ij}$ is equivalent to the continuity of the projection of $h_{ij}$ onto the fibre space $F$ plus the continuity of this automorphism with respect to the point $b \in B$. [ This needs to be checked and explained better. ]

23.4.4 Theorem: A topological fibration identification space is a topological fibration.

[ Here define combined set+fibre topological grafts. These are probably more useful in practice. ]
[ Somewhere define topological fibre bundle identification spaces with a structure group? Also define topological principal fibre bundle identification spaces? ]
[ Perhaps should define an atlas for the fibration in Definition 23.4.2? ]
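The matching of fibre elements across patches described in Remark 23.4.3 can be tried out on a finite toy model. The sketch below assumes finite sets with discrete topologies, so continuity is automatic and only the set-graft conditions are checked. The function names, the representation of graft elements as dictionaries from patch index to a pair `(b, f)`, and the reading of condition (ii) as requiring the matched element to lie over the base point `b` are all assumptions of this illustration, not constructions from the text.

```python
def same_base_point(X):
    """Condition (i): all entries of each graft element share one base point."""
    return all(len({b for (b, _f) in x.values()}) == 1 for x in X)


def covers_overlaps(X, U):
    """Condition (ii), read as: every b in U[i] & U[j] lies under some
    graft element which is defined on both patches i and j."""
    for i in U:
        for j in U:
            for b in U[i] & U[j]:
                if not any(i in x and j in x and x[i][0] == b for x in X):
                    return False
    return True


def transition_map(X, i, j):
    """Graft map h_ij as a dict: h[(b, f_i)] == (b, f_j) exactly when some
    x in X has x[i] == (b, f_i) and x[j] == (b, f_j)."""
    h = {}
    for x in X:
        if i in x and j in x:
            if x[i] in h and h[x[i]] != x[j]:
                raise ValueError("element matched to two different elements")
            h[x[i]] = x[j]
    return h


# Discrete "circle" B = {0, 1, 2, 3} covered by two patches, fibre F = {-1, 1}.
F = {-1, 1}
U = {1: {0, 1, 2}, 2: {2, 3, 0}}
X = []
for f in sorted(F):
    X.append({1: (1, f)})               # b = 1 lies only in patch 1
    X.append({2: (3, f)})               # b = 3 lies only in patch 2
    X.append({1: (2, f), 2: (2, f)})    # over b = 2: fibres glued by identity
    X.append({1: (0, f), 2: (0, -f)})   # over b = 0: fibres glued with a flip

h12 = transition_map(X, 1, 2)
h21 = transition_map(X, 2, 1)
```

The gluing is the identity over one overlap point and a sign flip over the other, which is a discrete analogue of a twisted two-point fibration over a circle. The maps `h12` and `h21` are mutually inverse, as graft transition maps should be.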
23.5. Structure groups discussion

Fibre bundles are similar to topological manifolds in many ways, but fibre bundles have a kind of structure constraint imposed by a “structure group” which has no exact analogue for manifolds. However, the “pseudogroup” of local diffeomorphisms of a Euclidean space (Kobayashi/Nomizu [26], page 1) is a kind of structure group which represents regularity for a manifold. Such structure groups or pseudogroups constrain the notion of a “compatible chart” in both cases.

The use of structure groups in the definition of fibre bundles helps to align differential geometry with Felix Klein’s Erlanger Programm (1872), which proposed that each geometry should have an associated transformation group under which all properties and relations of figures are invariant. Even though there are generally no such global groups for a differentiable manifold, they are present at each point of a fibre bundle such as a tangent bundle. (For at least 50 years after 1872, it was the fashion for every mathematician to discover that their subject of study was invariant under some group or other so as to increase the status of their work. See Bell [189], pages 445–6.)

The definition of a “fibre bundle with structure group” requires some group membership and continuity constraints on the overlaps of fibre charts. The “structure group” is an effective topological group $G$ of left transformations on the fibre space $F$.

In the case of a fibration or groupless fibre bundle $(E,\pi,B)$ with fibre space $F$, there is an implicit structure group $G$, namely the group $\mathrm{Aut}(F)$ of topological automorphisms of the fibre space $F$. Thus the fibres $E_b = \pi^{-1}(\{b\})$ for $b \in B$ are topological copies of $F$ which vary continuously in $E$ with respect to $b$. In the overlap of fibre charts, the transition maps are required only to be homeomorphisms of $F$, i.e. elements of the topological automorphism group of $F$.

One fly in the ointment here is the task of determining what topology to put on the automorphism group for a general topological space $F$. This may be an unavoidable difference between a topological fibration with fibre $F$ and a topological $(G,F)$ fibre bundle whose structure group $G$ is the group of automorphisms of $F$.

[ Should try to resolve the issue of the topology of implicit structure groups. One candidate for the “natural” topology on the set of topological automorphisms of $F$ might be the topology of pointwise convergence, which is the same as the product topology. See EDM2 [34], 435.B. Another good candidate is the compact-open topology. (See Definition 15.7.9.) ]

In the case of a fibre bundle with a structure group, the structure group $G$ is some subgroup of the topological automorphism group of $F$. The pointwise transition maps between fibre charts are required to be elements of $G$, which will usually be a proper subgroup of the topological automorphism group of $F$. These pointwise group elements are required to vary continuously with respect to position in the base space.

The structure group of a fibre bundle is utilized not only in determining allowable transition maps for fibre chart overlaps but also in determining whether two charts are compatible, and this in turn determines whether two atlases are equivalent. Thus if, for example, $F$ is a vector space and $G$ is the group of invertible linear transformations of $F$, then any two fibre charts must match in their overlap region in a pointwise linear fashion. If $G$ consists of orthogonal transformations, the chart overlaps must be related pointwise orthogonally, and so forth. The important point here is that the fibre $\pi^{-1}(\{b\})$ at each point $b \in B$ is thought of as a full copy of the fibre space $F$ together with all of the algebraic and other structures of $F$. These structures are preserved under transformations specified by the structure group $G$.

If the maximal fibre atlas is constructed from a given fibre atlas $A$ for a $(G,F)$ fibre bundle $(E,\pi,B)$, this maximal atlas will generally not be a valid atlas if $G$ is replaced by a subgroup of $G$. Thus maximal fibre bundle atlases grow and shrink together with the structure group in the sense of the partial order of set inclusion. The smaller the structure group, the smaller the maximal atlas, and vice versa. This is analogous to the situation with regularity of manifolds, where the higher the regularity of the diffeomorphism pseudogroup on $\mathbb{R}^n$, the smaller the corresponding maximal atlas, and vice versa.

Instead of dealing in maximal atlases (which are very large infinite sets), it is probably better to use language such as “the $(G,F)$ fibre bundle generated by atlas $A$ on $(E,\pi,B)$”. This then leaves open the question of whether one defines the generated fibre bundle as an equivalence class of atlases, a maximal atlas, or some other set-theoretic construction. If a set must be decided upon as the “essence of the fibre bundle”, then probably an equivalence class of atlases is preferable. But this equivalence class varies according to one’s choice of structure group and regularity class. So it is best to resist the temptation to construct maximal atlases.

For any given topological fibration $(E,\pi,B)$, the fibre space $F$ is uniquely determined up to homeomorphism. Given both $(E,\pi,B)$ and a particular choice of $F$, the group $G$ must be the group of all topological automorphisms of $F$ because all possible fibre charts are considered to be compatible fibre charts. If the set of fibre charts is constrained to lie in a particular atlas, then $G$ is constrained only to be a superset of the set of pointwise transition maps of the charts in the atlas. Thus the choice of atlas constrains the choice of structure group $G$. Conversely (actually contrapositively), the choice of structure group constrains the atlas.

23.6. Topological fibre bundles

23.6.1 Remark: A small family tree for topological fibre bundles is shown in Figure 23.6.1.

[Figure 23.6.1: Family tree for topological fibre bundles. The nodes are: topological space $(B,T_B)$; non-topological fibration $(E,\pi,B)$; topological fibration $(E,T_E,\pi,B,T_B)$; topological fibre bundle $(E,T_E,\pi,B,T_B,A_E^F)$.]

23.6.2 Remark: To distinguish a manifold atlas from a fibre atlas, the notation $A_M$ will be used for a manifold atlas whereas the fibre space $F$ will be indicated explicitly in the notation $A_E^F$ for a fibre atlas on a total space $E$. Then differentiable fibre bundles (in Chapter 35) can be specified by tuples such as $(E,A_E,\pi,B,A_B,A_E^F)$ where $A_E$ and $A_B$ are manifold atlases for $E$ and $B$ respectively.

23.6.3 Remark: For transformation group elements $g \in G -\!\!< (G,F,\sigma_G,\mu)$ in Definition 23.6.4, $L_g$ denotes the function $L_g : F \to F$ satisfying $L_g : f \mapsto \mu(g,f)$. The spaces and maps of Definition 23.6.4 are illustrated in Figure 23.6.2. See Definition 16.8.7 for “effective topological left transformation groups”. The map $\bar\mu$ is the function-valued function $\bar\mu : G \to (F \to F)$ defined by $\bar\mu(g)(f) = \mu(g,f)$, where $\mu : G \times F \to F$ is the group operation of $G$ on $F$. The circle on the arrow for $\bar\mu$ in Figure 23.6.2 indicates that the map is of the form $\bar\mu : G \to (F \to F)$,
23.6. Topological fibre bundles µ ¯
φ
π −1 (U ) ⊆ E
π ×
F
G
φ
π
U ⊆B Figure 23.6.2
505
U ×F ⊆B×F
Coordinate map for fibre bundle with structure group
not µ ¯ : G → F . That is, for all g ∈ G, µ ¯(g) is a map µ ¯(g) : F → F . In this case, the circled arrow µ ¯ is the action of a transformation group G on F . 23.6.4 Definition: A topological (G, F ) fibre bundle for an effective topological left transformation group (G, F ) − < (G, TG , F, TF , σG , µ) is a tuple (E, π, B) − < (E, TE , π, B, TB , AF E ) such that:
(i) $(E, T_E)$ and $(B, T_B)$ are topological spaces, and $\pi : E \to B$ is continuous;
(ii) $\forall \phi \in A^F_E, \exists U_\phi \in T_B$, $\phi : \pi^{-1}(U_\phi) \to F$ is continuous and $\pi \times \phi : \pi^{-1}(U_\phi) \approx U_\phi \times F$;
(iii) $\bigcup_{\phi \in A^F_E} U_\phi = B$;
(iv) $\forall \phi_1, \phi_2 \in A^F_E, \forall b \in U_{\phi_1} \cap U_{\phi_2}, \exists g \in G, \beta_{b,\phi_1} \circ \beta_{b,\phi_2}^{-1} = L_g$, where $\beta_{b,\phi} = \phi|_{\pi^{-1}(\{b\})} : \pi^{-1}(\{b\}) \approx F$ for $\phi \in A^F_E$ and $b \in U_\phi$;
(v) $\forall \phi_1, \phi_2 \in A^F_E$, the function $g_{\phi_1,\phi_2} : U_{\phi_1} \cap U_{\phi_2} \to G$ defined by $L_{g_{\phi_1,\phi_2}(b)} = \beta_{b,\phi_1} \circ \beta_{b,\phi_2}^{-1}$ is continuous.

The transformation group G −< $(G, T_G, F, T_F, \sigma_G, \mu)$ is called the structure group of the fibre bundle.
The set $A^F_E$ is called a (G, F) fibre atlas for the fibration (E, π, B).
The set $E_b = \pi^{-1}(\{b\})$ is called the fibre (set) of (E, π, B) at $b \in B$.
The map $\beta_{b,\phi}$ may be called the per-fibre chart, fibre-set chart or pointwise fibre chart for the fibre bundle (E, π, B) at $b \in B$.
The functions $g_{\phi_1,\phi_2}$ are called fibre atlas transition functions for (E, π, B).
The function $g_{\phi_1,\phi_2}|_{\pi^{-1}(\{b\})}$ may be called the per-fibre (fibre atlas) transition function or pointwise (fibre atlas) transition function for (E, π, B) at $b \in B$.

23.6.5 Remark: The transition map $L_g$ for the per-fibre charts $\beta_{b,\phi_1}$ and $\beta_{b,\phi_2}$ in Definition 23.6.4 (iv) is illustrated in Figure 23.6.3.

[Figure 23.6.3: Fibre bundle transition maps are group elements. Shows the per-fibre charts $\beta_{b,\phi_1} = \phi_1|_{\pi^{-1}(\{b\})}$ and $\beta_{b,\phi_2} = \phi_2|_{\pi^{-1}(\{b\})}$ from the fibre $\pi^{-1}(\{b\}) \subseteq E$ over $b \in B$ to two copies of $F$, related by $\beta_{b,\phi_1} \circ \beta_{b,\phi_2}^{-1} = L_g$, $\exists g \in G$.]

Conditions (i), (ii) and (iii) mean that (E, π, B) −< $(E, T_E, \pi, B, T_B)$ is a topological fibration with fibre space F according to Definition 23.3.7 and $A^F_E$ is a fibre atlas for (E, π, B) with fibre space F according to Definition 23.3.14. So the first three conditions of Definition 23.6.4 are the same as the conditions of Definition 23.3.17 for a topological fibration with a fibre atlas. The extra conditions (iv) and (v) specialize the fibre atlas $A^F_E$ so that the charts are related to each other by continuous left group actions on the fibre space.
23.6.6 Remark: Topological fibre bundles may be empty. (This is mentioned for the case of topological fibrations in Remark 23.2.3.) The general empty topological (G, F) fibre bundle for non-empty F has the form $(E, T_E, \pi, B, T_B, A^F_E) = (\emptyset, \{\emptyset\}, \emptyset, \emptyset, \{\emptyset\}, A^G_P)$, where $A^G_P = \emptyset$ or $\{\emptyset\}$. (For proof, see Exercise 47.8.1.) A special case of this is $(E, T_E, \pi, B, T_B, A^F_E) = (\emptyset, \{\emptyset\}, \emptyset, \emptyset, \{\emptyset\}, \emptyset)$.

23.6.7 Remark: The group element $g_{\phi_1,\phi_2}(b)$ in Definition 23.6.4 (v) may be thought of as a transformation rule from $\phi_2$ coordinates to $\phi_1$ coordinates. So indices tend to match up as for standard matrix notation (as opposed to the reverse convention for Markov process matrices). An example of this is Theorem 23.8.11 (ii).

23.6.8 Remark: Definition 23.6.4 (iv) may also be expressed as $\forall \phi_1, \phi_2 \in A^F_E, \forall b \in U_{\phi_1} \cap U_{\phi_2}, \exists g \in G, \forall y \in F, \phi_1(\beta_{b,\phi_2}^{-1}(y)) = g.y$. By putting $z = \beta_{b,\phi_2}^{-1}(y)$, the condition becomes

$$\forall \phi_1, \phi_2 \in A^F_E, \forall b \in U_{\phi_1} \cap U_{\phi_2}, \exists g \in G, \forall z \in E_b, \quad \phi_1(z) = g\phi_2(z). \qquad (23.6.1)$$

Since G acts effectively on F, the group element g is uniquely determined by the transition map $\beta_{b,\phi_1} \circ \beta_{b,\phi_2}^{-1}$. This is because the map $L_g : F \to F$ uniquely determines g in an effective group. Therefore the function $g_{\phi_1\phi_2}$ in condition (v) is well-defined. It is not well-defined if (G, F) is not effective. Using $g_{\phi_1\phi_2}$, equation (23.6.1) becomes:
$$\forall \phi_1, \phi_2 \in A^F_E, \forall b \in U_{\phi_1} \cap U_{\phi_2}, \forall z \in E_b, \quad \phi_1(z) = g_{\phi_1\phi_2}(b)\phi_2(z) = g_{\phi_1\phi_2}(\pi(z))\phi_2(z),$$

from which it follows that:

$$\forall \phi_1, \phi_2 \in A^F_E, \forall z \in \mathrm{Dom}(\phi_1) \cap \mathrm{Dom}(\phi_2), \quad \phi_1(z) = g_{\phi_1\phi_2}(\pi(z))\phi_2(z).$$

The group elements $g_{\phi_i,\phi_j}(b)$ can be defined without the restricted β maps by noting that $L_{g_{\phi_i,\phi_j}(b)} : y \mapsto P_2\bigl((\pi \times \phi_i) \circ (\pi \times \phi_j)^{-1}(b, y)\bigr)$ for $y \in F$, where $P_2 : B \times F \to F$ denotes the projection map $P_2 : (b, y) \mapsto y$. This is illustrated in Figure 23.6.4.

[Figure 23.6.4: Projection maps and fibre charts for transition maps. Shows $\pi^{-1}(U_i)$, $\pi^{-1}(U_j)$ and $\pi^{-1}(U_i \cap U_j)$ in $E$, the maps $\pi \times \phi_i$ and $\pi \times \phi_j$ to $U_i \times F$, $U_j \times F$ and $(U_i \cap U_j) \times F$ in $B \times F$, the charts $\phi_i$, $\phi_j$ to $F$, and the projections $P_1$, $P_2$ and $\pi$.]

The base point b in this expression may be thought of as a tag for the fibre element y which is thrown away by $P_2$ when it has been used in the chart transition map. This expression for $g_{\phi_i,\phi_j}$ may be further simplified to $L_{g_{\phi_i,\phi_j}(b)} : y \mapsto \phi_i \circ (\pi \times \phi_j)^{-1}(b, y)$.

23.6.9 Notation: atlas(E, π, B) for a (G, F) fibre bundle $(E, \pi, B, A^F_E)$ denotes the fibre atlas $A^F_E$. Then $\mathrm{atlas}_b(E, \pi, B)$ for $b \in B$ denotes the subset $\{\phi \in \mathrm{atlas}(E, \pi, B);\ b \in \pi(\mathrm{Dom}(\phi))\}$ of atlas(E, π, B). $A^F_{E,b}$ for $b \in B$ also denotes the set $\{\phi \in \mathrm{atlas}(E, \pi, B);\ b \in \pi(\mathrm{Dom}(\phi))\}$.
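The transition rule $\phi_1(z) = g_{\phi_1\phi_2}(\pi(z))\phi_2(z)$ of Remark 23.6.8 can be checked on a small finite model. The following is only a sketch under stated assumptions: the two-point base, the three-point fibre, the product total space and the chart names `phi1`, `phi2` with their `twist` table are all hypothetical choices made for illustration, and every topology is discrete so that the continuity conditions hold trivially. The group is taken to be the full permutation group of the fibre, which acts effectively, so the search for the transition element finds exactly one candidate.

```python
from itertools import permutations

F = (0, 1, 2)                      # fibre space (discrete, hypothetical)
G = list(permutations(F))          # permutation group of F; g acts by y -> g[y]
B = (0, 1)                         # two-point base space (discrete, hypothetical)


def act(g, y):
    """Left action mu(g, y) of the permutation g on the fibre element y."""
    return g[y]


# Two hypothetical fibre charts on the product total space E = B x F.
# phi1 reads off the fibre coordinate; phi2 twists it by a permutation
# which depends on the base point.
twist = {0: (1, 0, 2), 1: (1, 2, 0)}


def pi(z):
    """Projection map from E to B."""
    return z[0]


def phi1(z):
    return z[1]


def phi2(z):
    b, y = z
    return twist[b][y]


def transition(b):
    """Return the unique g in G with phi1(z) = g.phi2(z) on the fibre over b."""
    fibre = [(b, y) for y in F]
    matches = [g for g in G
               if all(phi1(z) == act(g, phi2(z)) for z in fibre)]
    assert len(matches) == 1       # uniqueness relies on the action being effective
    return matches[0]


E = [(b, y) for b in B for y in F]
rule_holds = all(phi1(z) == act(transition(pi(z)), phi2(z)) for z in E)
```

In this model the transition element over each base point is simply the inverse of the twisting permutation, which is what the search in `transition` recovers.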
[ Check if Remark 23.6.10 is really true. Maybe the atlas is uniquely determined by the tuple $(E, T_E, \pi, B, T_B)$. It’s probably okay as it is. ]

23.6.10 Remark: The fibre atlas $A^F_E$ is an essential parameter in the specification of a topological fibre bundle with a structure group in Definition 23.6.4. A triple (E, π, B) may be given two different atlases to specify two different, incompatible fibre bundles with respect to the pair (G, F). A triple (E, π, B) with a single atlas is a (G, F) fibre bundle for a range of choices of the group G. But this set of groups depends on the choice of atlas $A^F_E$.

In general, the larger the atlas, the smaller the set of allowable structure groups G. A fibre bundle with a single-chart atlas will have the widest range of possibilities for the group G, since any topological transformation group G on F will be consistent with the atlas. If a quadruple $(E, \pi, B, A^F_E)$ is a (G, F) fibre bundle, then it is a $(G', F)$ fibre bundle for any topological left transformation group $G'$ of F which is a supergroup of G.

The purpose of the fibre atlas $A^F_E$ on each fibre set is to specify algebraic structure. The atlas is constrained by the topology of the underlying fibration (E, π, B), although one may think of the atlas as determining the topologies on E and B. The set of cross-sections of a fibre bundle is constrained by these topologies. If the topologies are induced by the atlas, then one may say that the atlas determines the set of cross-sections also. However, the choice of structure group does not directly constrain the set of cross-sections of the fibre bundle.

To summarize, the topology of the underlying fibration constrains the atlas, and the atlas constrains the structure group; in the other direction, the structure group constrains the atlas, and the atlas determines the topology of the underlying fibration. The minimal structure group is a property of the atlas, not of the underlying fibration. The maximal atlas is a property of the structure group, not of the underlying fibration.

23.6.11 Remark: It is tempting to conjecture that Definition 23.6.4 (v) could be a direct consequence of the other conditions. To try to prove this conjecture, note that for $b \in U_{\phi_1} \cap U_{\phi_2}$ and $f \in F$,

$$(b, g_{\phi_1\phi_2}(b)(f)) = (b, \beta_{b,\phi_1}(\beta_{b,\phi_2}^{-1}(f)))$$
$$= (b, \beta_{b,\phi_1}((\pi \times \phi_2)^{-1}(b, f)))$$
$$= (\pi \times \phi_1)\bigl((\pi \times \phi_2)^{-1}(b, f)\bigr),$$

which is clearly a continuous function from $B \times F$ to $B \times F$. So $g_{\phi_1\phi_2}$ is the projection onto F of the homeomorphism $(\pi \times \phi_1) \circ (\pi \times \phi_2)^{-1}$. Hence $g_{\phi_1\phi_2} : B \times F \to F$ is continuous. The group action $\mu : G \times F \to F$ is continuous by Definition 16.8.3.

[ There must be some natural topology on the function space $F \to F$. Probably this could be chosen so that a function $f : G \times F \to F$ is continuous if and only if the corresponding function $f : G \to (F \to F)$ is continuous. See Definition 14.11.21 for pointwise convergence topology, which is the same as the direct product topology. This might be a suitable topology, especially for the implicit automorphism group for topological fibrations. ]

The function $\mu : G \times F \to F$ may be regarded as $\bar\mu : G \to (F \to F)$. Then both $g_{\phi_1\phi_2}$ and $\bar\mu$ are maps whose range is the set of maps from F to F. It is tempting to consider the function $\bar\mu^{-1} \circ g_{\phi_1\phi_2}$. In fact, by condition (iv), the range of $g_{\phi_1\phi_2}$ is a subset of the range of $\bar\mu$. So $\bar\mu^{-1} \circ g_{\phi_1\phi_2} : B \to G$ is a well-defined function. But the continuity of $\bar\mu$ does not imply the continuity of $\bar\mu^{-1}$. ...

[ Under what conditions or circumstances would Definition 23.6.4 (v) follow from the other conditions? ]

23.6.12 Remark: Fibre bundles have many similarities to manifolds. Both fibre bundles and manifolds have topological definitions which require no extra structure such as an atlas, and more regular definitions such as differentiable fibre bundles or manifolds which do require extra structure in their specification such as an atlas. The specification of a (G, F) fibre bundle requires an atlas. In a sense, the structure group G is analogous to the local diffeomorphism group (called a “pseudogroup” in Kobayashi/Nomizu [26]) which defines $C^k$ and other regularity for differentiable manifolds and fibre spaces.

23.6.13 Remark: For any $b \in B$ and fibre chart φ such that $b \in \pi(\mathrm{Dom}(\phi))$, define the function $\beta_{b,\phi} : E_b \approx F$ by $\beta_{b,\phi}(x) = \phi(x)$. In other words, $\beta_{b,\phi} = \phi|_{E_b}$. Then the map $L_g \circ \beta_{b,\phi} : E_b \approx F$ is also a valid pointwise identification of $E_b$ with the extrinsic fibre space F. Consequently, the identification of each fibre set with the extrinsic fibre is determined only up to a group element $g \in G$. Therefore the map $\beta_{b,\phi}$ should be thought of not as a fixed identification with the fibre space F but rather an identification with F in some indeterminate “orientation” of F. The set of all possible valid pointwise identifications of $E_b$ with F is the set $\{L_g \circ \beta_{b,\phi};\ g \in G\}$.

23.6.14 Definition: A compatible fibre chart for a (G, F) fibre bundle $(E, \pi, B, A^F_E)$ is a fibre chart φ for (E, π, B) with fibre space F such that $(E, \pi, B, A^F_E \cup \{\phi\})$ is a (G, F) fibre bundle.

23.6.15 Remark: Another way of stating Definition 23.6.16 is that two atlases $A_1$ and $A_2$ are said to be equivalent if every fibre chart in $A_1$ is compatible with every fibre chart in $A_2$.

23.6.16 Definition: Topological (G, F) fibre atlases $A_1$ and $A_2$ for a topological fibration (E, π, B) are said to be equivalent topological (G, F) fibre atlases for (E, π, B) if $A_1 \cup A_2$ is a topological (G, F) fibre atlas for (E, π, B).

[ Define a maximal (G, F) fibre atlas, and emphasize that it is optional. ]
[ Define cross-sections of fibre bundles. ]

23.6.17 Example: The Möbius strip provides probably the simplest non-trivial example of a fibre bundle. (See Section 44.2.) For the Möbius strip, the only possible structure group is $G = \{I, J\}$ where I is the identity on $F = \{-1, 1\}$ and $J : F \to F$ swaps the elements of F. The trivial group $\{I\}$ would not be an effective group since it leaves two elements of F fixed. In this example, both the fibre space F and the structure group G are completely determined by the triple (E, π, B). In general, the structure group G may not be uniquely determined by this triple.

[ Refer to the example of the tangent bundle of a differentiable manifold. ]
[ Present a worked example for $G = SO(2)$, $F = S^1$, $B = S^2$. Also give the example $B = S^1$, $F = S^0$, $G = \{I, J\}$, where J swaps 1 with −1, which should be a Möbius strip. ]
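For the case $B = S^1$, $F = \{-1, 1\}$, $G = \{I, J\}$ of Example 23.6.17, the transition function between two charts can be computed explicitly. The sketch below is an illustration under assumptions which are not taken from the text: the total space is modelled as the connected double cover $\mathbb{R}/4\pi\mathbb{Z}$ of the circle $\mathbb{R}/2\pi\mathbb{Z}$, and the two charts `phi1` and `phi2` (over the circle minus the angle 0 and minus the angle π respectively) are hypothetical choices. It shows that the transition element is I on one component of the chart overlap and J on the other.

```python
import math

TWO_PI = 2 * math.pi


def pi_proj(t):
    """Projection of the double cover E = R/(4 pi Z) onto the circle B = R/(2 pi Z)."""
    return t % TWO_PI


def phi1(t):
    """Hypothetical chart over U1 = B minus the angle 0, with values in F = {-1, 1}."""
    t %= 2 * TWO_PI
    return 1 if 0 < t < TWO_PI else -1


def phi2(t):
    """Hypothetical chart over U2 = B minus the angle pi, with values in F = {-1, 1}."""
    t %= 2 * TWO_PI
    return -1 if math.pi < t < 3 * math.pi else 1


def transition(theta):
    """Return 'I' or 'J' so that phi1 = g.phi2 on the fibre over theta in the overlap."""
    fibre = (theta, theta + TWO_PI)        # the two points of E lying over theta
    if all(phi1(t) == phi2(t) for t in fibre):
        return "I"                         # identity on F
    if all(phi1(t) == -phi2(t) for t in fibre):
        return "J"                         # swap of -1 and 1
    raise ValueError("charts are not related by an element of G here")
```

Calling `transition` at an angle between 0 and π and at an angle between π and 2π gives the two different group elements, so in this model no choice of the two charts makes the transition function trivial on the whole overlap.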
23.7. Fibre bundle homomorphisms, isomorphisms and products

A fibre bundle homomorphism is called simply a “fibre bundle map” in EDM2 [34], 147.B. The style of homomorphism in Definition 23.7.1 maps only the total spaces because the rest of the structure more or less follows from this. (This is not surprising. The total space is the central component of a fibre bundle.) The atlases are not required to be completely equivalent. They are only required to be $C^0$ equivalent. Thus conditions (ii) and (iii) specify that the map must be consistent with the charts on each fibre bundle and the topology of the structure group. But the fibre atlas on $(E_1, \pi_1, B_1, A_1)$ could have regularity or group invariance properties which are not present for $(E_2, \pi_2, B_2, A_2)$.

[ May also define “exact equivalence” of fibre bundles which have an exact match of atlases. These could be called “atlas-equivalent” fibre bundles. These would guard all regularity and group invariance properties of the fibre bundles. ]

23.7.1 Definition: A topological (G, F) fibre bundle homomorphism between two topological (G, F) fibre bundles $(E_1, \pi_1, B_1, A_1)$ and $(E_2, \pi_2, B_2, A_2)$ is a continuous map $f : E_1 \to E_2$ such that:

(i) $\pi_2 \circ f = \tilde f \circ \pi_1$ for some continuous function $\tilde f : B_1 \to B_2$;
(ii) $\forall \phi_1 \in A_1, \forall \phi_2 \in A_2, \forall b \in U_1 \cap \tilde f^{-1}(U_2), \exists g \in G, \beta_{\tilde f(b),\phi_2} \circ f \circ \beta_{b,\phi_1}^{-1} = L_g$, where $\beta_{b_i,\phi_i} = \phi_i|_{\pi_i^{-1}(\{b_i\})}$ for $\phi_i \in A_i$ and $b_i \in U_i = \pi_i(\mathrm{Dom}(\phi_i))$ for $i = 1, 2$;
(iii) $\forall \phi_1 \in A_1, \forall \phi_2 \in A_2$, $\rho_{\phi_1\phi_2} : U_1 \cap \tilde f^{-1}(U_2) \to G$ is continuous, where $\rho_{\phi_1\phi_2}$ is defined by $L_{\rho_{\phi_1\phi_2}(b)} = \beta_{\tilde f(b),\phi_2} \circ f \circ \beta_{b,\phi_1}^{-1}$.

[ Condition (iii) in Remark 23.7.2 seems to include condition (ii). So maybe these should be combined. ]

23.7.2 Remark: Definition 23.7.1 (i) is equivalent to requiring that the relation $\tilde f = \pi_2 \circ f \circ \pi_1^{-1}$ be a well-defined continuous function. If this function exists, it is unique. The continuity of $\tilde f$ means roughly that f continuously maps sets of the form $\pi_1^{-1}(\{b_1\})$ for $b_1 \in B_1$ onto sets of the form $\pi_2^{-1}(\{b_2\})$ for $b_2 \in B_2$.
Condition (ii) can be made to look more similar to condition (i) by writing it as follows:

$$\forall \phi_1 \in A_1, \forall \phi_2 \in A_2, \forall b \in U_1 \cap \tilde f^{-1}(U_2), \exists g \in G, \quad \beta_{\tilde f(b),\phi_2} \circ f|_{\pi_1^{-1}(\{b\})} = L_g \circ \beta_{b,\phi_1} : \pi_1^{-1}(\{b\}) \to \pi_2^{-1}(\{\tilde f(b)\}).$$

Thus $L_g : F \approx F$ in (ii) is analogous to the base space map $\tilde f : B_1 \to B_2$ in (i). Condition (iii) is illustrated in Figure 23.7.1. It may be written equivalently as

$$\phi_2 \circ f|_{\pi_1^{-1}(U_1 \cap \tilde f^{-1}(U_2))} : z \mapsto L_{\rho_{\phi_1,\phi_2}(\pi_1(z))} \circ \phi_1(z)$$

for $z \in \pi_1^{-1}(U_1 \cap \tilde f^{-1}(U_2))$. This shows that $f(z)$ is equivalent “through the charts” to a group action which depends only on $\pi_1(z)$ for each fixed choice of charts.

[Figure 23.7.1: Topological (G, F) fibre bundle homomorphism. Shows $f$ mapping $\pi_1^{-1}(\{b\}) \subseteq \mathrm{Dom}(\phi_1) \subseteq E_1$ to $\pi_2^{-1}(\{\tilde f(b)\}) \subseteq \mathrm{Dom}(\phi_2) \subseteq E_2$, the base map $\tilde f$ from $b \in U_1 = \pi_1(\mathrm{Dom}(\phi_1)) \subseteq B_1$ to $\tilde f(b) \in U_2 = \pi_2(\mathrm{Dom}(\phi_2)) \subseteq B_2$, the charts $\phi_1$, $\phi_2$ and per-fibre charts $\beta_{b,\phi_1}$, $\beta_{\tilde f(b),\phi_2}$ into two copies of $F$, and the group element $\rho_{\phi_1,\phi_2}(b) \in G$ relating them.]

23.7.3 Remark: If $\tilde f$ in Definition 23.7.1 is a homeomorphism, then f is also a homeomorphism.

[ This should probably be a theorem. ]

23.7.4 Definition: A topological (G, F) fibre bundle isomorphism between topological (G, F) fibre bundles $(E_1, \pi_1, B_1, A_1)$ and $(E_2, \pi_2, B_2, A_2)$ is a map $f : E_1 \to E_2$ such that f and $f^{-1}$ are topological (G, F) fibre bundle homomorphisms.

23.7.5 Definition: Equivalent topological (G, F) fibre bundles are topological (G, F) fibre bundles between which there is a topological (G, F) fibre bundle isomorphism.

23.7.6 Remark: The style of fibre bundle homomorphism in Definition 23.7.1 assumes the same topological transformation group (G, F) for the two fibre bundles. Definition 23.7.7 removes this restriction. (See Definition 16.8.6 for topological transformation group homomorphisms.)

23.7.7 Definition: A topological fibre bundle homomorphism between a topological $(G_1, F_1)$ fibre bundle $(E_1, \pi_1, B_1, A_1)$ and a topological $(G_2, F_2)$ fibre bundle $(E_2, \pi_2, B_2, A_2)$ is a tuple of maps $(\hat f, \check f, f)$ such that:

(i) $(\hat f, \check f)$ is a topological transformation group homomorphism with $\hat f : G_1 \to G_2$ and $\check f : F_1 \to F_2$;
(ii) $f : E_1 \to E_2$ is continuous;
(iii) $\pi_2 \circ f = \tilde f \circ \pi_1$ for some continuous function $\tilde f : B_1 \to B_2$;
(iv) $\forall \phi_1 \in A_1, \forall \phi_2 \in A_2, \forall b \in U_1 \cap \tilde f^{-1}(U_2), \exists g \in G_1, \beta_{\tilde f(b),\phi_2} \circ f \circ \beta_{b,\phi_1}^{-1} = \check f \circ L_g$, where $\beta_{b_i,\phi_i} = \phi_i|_{\pi_i^{-1}(\{b_i\})}$ for $\phi_i \in A_i$ and $b_i \in U_i = \pi_i(\mathrm{Dom}(\phi_i))$ for $i = 1, 2$;
(v) $\forall \phi_1 \in A_1, \forall \phi_2 \in A_2$, $\rho_{\phi_1\phi_2} : U_1 \cap \tilde f^{-1}(U_2) \to G_1$ is continuous, where $\rho_{\phi_1\phi_2}$ is defined by $\check f \circ L_{\rho_{\phi_1\phi_2}(b)} = \beta_{\tilde f(b),\phi_2} \circ f \circ \beta_{b,\phi_1}^{-1}$.
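[Figure 23.7.2: General topological fibre bundle homomorphism. Shows the maps $\hat f : G_1 \to G_2$, $\check f : F_1 \to F_2$, $f : E_1 \to E_2$ and $\tilde f : B_1 \to B_2$, together with the group actions $\mu_1$, $\mu_2$, the charts $\phi_1$, $\phi_2$ and the projections $\pi_1$, $\pi_2$.]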
23.7.8 Remark: Definition 23.7.7 is illustrated in Figure 23.7.2. If the equation $\beta_{\tilde f(b),\phi_2} \circ f \circ \beta_{b,\phi_1}^{-1} = \check f \circ L_g$ in condition (iv) is replaced with the less restrictive $\exists g_2 \in G_2, \beta_{\tilde f(b),\phi_2} \circ f \circ \beta_{b,\phi_1}^{-1} = L_{g_2} \circ \check f \circ L_g$, which would be more symmetric, it would no longer be required in general that $\beta_{\tilde f(b),\phi_2} \circ f \circ \beta_{b,\phi_1}^{-1}(F_1) \subseteq \check f(F_1)$.

23.7.9 Remark: Definitions 23.7.10 and 23.7.11 permit the structure groups $(G_1, F_1)$ and $(G_2, F_2)$ to be different but equivalent whereas Definitions 23.7.4 and 23.7.5 require identical structure groups.

23.7.10 Definition: A topological fibre bundle isomorphism between topological fibre bundles $(E_1, \pi_1, B_1)$ and $(E_2, \pi_2, B_2)$ is a map $f : E_1 \to E_2$ such that f and $f^{-1}$ are topological fibre bundle homomorphisms.

23.7.11 Definition: Equivalent topological fibre bundles are topological fibre bundles between which there is a topological fibre bundle isomorphism.

23.7.12 Remark: The term “product bundle” is generally used for trivial bundles which are constructed as the direct product of two topological spaces. Therefore the term “direct product bundle” should be used for the direct product of two topological fibre bundles.

23.7.13 Definition: [Definition of product bundle.]

[ See EDM2 [34], 52.G, for a definition of fibre product. See also EDM2 [34], 200.D, for the definition of Tor. ]
[ Define “Whitney sum” of fibre bundles. See Crampin/Pirani [11], pages 358–9. ]

23.7.14 Definition: [Definition of reduced bundle.]

[ Give definitions of cross-section and smooth cross-section. See Gallot/Hulin/Lafontaine [19], 1.34, page 17; EDM2 [34], 147.L; Malliavin [35], 7.4.4, page 69. ]
[ May introduce “double fibre bundles” such as $T(M_1, M_2) = \bigcup_{p \in M_1, q \in M_2} \mathrm{Lin}(T_p(M_1), T_q(M_2))$, the “double tangent space”, with projections $\pi_j : T(M_1, M_2) \to M_j$ and $\pi_{12} : T(M_1, M_2) \to M_1 \times M_2$. ]
[ Maybe have a new section here on “double fibre bundles”, including non-topological and topological. Define structure groups on these. Try to construct double fibre bundles out of single fibre bundles. Double fibre bundles are related to double tangent spaces. ]

23.8. Structure-preserving fibre set maps

Maps which “preserve structure” between fibre sets $E_b$ with $b \in B$ for (G, F) fibre bundles $(E, \pi, B, A^F_E)$ are important for defining parallelism because parallelism must preserve structure. A map which preserves structure means a map which is equivalent to the action of an element of the structure group “through the charts”.
23.8.1 Remark: It is tempting to try to transfer the action of the group G from the fibre F to the total space E. One motivation for this might be for the definition of connections on tangent bundles. This idea doesn’t seem to work though. Initially we have a fibre bundle (E, π, B) −< $(E, T_E, \pi, B, T_B, A^F_E)$ and a topological transformation group (G, F) −< $(G, T_G, F, T_F, \sigma_G, \mu_G)$. Let $\phi \in A^F_E$ and define $\beta_{b,\phi} = \phi|_{\pi^{-1}(\{b\})}$ for $b \in \pi(\mathrm{Dom}(\phi))$ as in Definition 23.6.4 so that $\beta_{b,\phi} : \pi^{-1}(\{b\}) \approx F$. The most obvious way to transfer the action of $g' \in G$ from F to $\pi^{-1}(\{b\})$ is as follows.

$$\forall z \in \mathrm{Dom}(\phi), \quad z \mapsto \beta_{b,\phi}^{-1} \circ L_{g'} \circ \beta_{b,\phi}(z),$$

where $b = \pi(z)$. This must be tested for fibre chart independence. From Definition 23.6.4, for any fixed $b \in \pi(\mathrm{Dom}(\phi_1)) \cap \pi(\mathrm{Dom}(\phi_2))$ there is a $g \in G$ such that $\beta_{b,\phi_1} \circ \beta_{b,\phi_2}^{-1} = L_g$. From this it follows that $\beta_{b,\phi_1} = L_g \circ \beta_{b,\phi_2}$ and $\beta_{b,\phi_1}^{-1} = \beta_{b,\phi_2}^{-1} \circ L_{g^{-1}}$. This implies that

$$\beta_{b,\phi_1}^{-1} \circ L_{g'} \circ \beta_{b,\phi_1} = \beta_{b,\phi_2}^{-1} \circ L_{g^{-1} g' g} \circ \beta_{b,\phi_2}.$$

This is only equal to the desired $\beta_{b,\phi_2}^{-1} \circ L_{g'} \circ \beta_{b,\phi_2}$ if g and $g'$ commute. So in general, the definition of the action of $g'$ on E is not chart-independent. This seems to be the motivation for the introduction of principal fibre bundles and associated fibre bundles. In the case of principal fibre bundles, the action on the total space is on the right, and the action on the right commutes with the action of chart transition functions on the left.
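The chart dependence described in Remark 23.8.1 can be seen concretely in a finite model. In the following sketch, which is an illustration and not part of the text's development, a single fibre set $E_b$ and the fibre space F are both modelled as the set {0, 1, 2}, the group is the permutation group of F, and the per-fibre charts `beta1`, `beta2` and the group elements `g`, `g_prime` are hypothetical choices with g and g′ not commuting.

```python
def compose(p, q):
    """Composition 'p after q' of permutations stored as tuples of images."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """Inverse of a permutation stored as a tuple of images."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


# Per-fibre charts beta1, beta2 : E_b -> F (hypothetical choices).
beta2 = (0, 1, 2)
g = (1, 0, 2)                      # transition element, so that beta1 = L_g o beta2
beta1 = compose(g, beta2)
g_prime = (0, 2, 1)                # the element whose action is transferred to E_b


def transferred(beta, h):
    """The map beta^{-1} o L_h o beta : E_b -> E_b."""
    return compose(inverse(beta), compose(h, beta))


via_chart1 = transferred(beta1, g_prime)
via_chart2 = transferred(beta2, g_prime)
# The same map expressed through chart 2 uses the conjugate g^{-1} g' g instead of g'.
conjugated = transferred(beta2, compose(inverse(g), compose(g_prime, g)))
```

Here `via_chart1` agrees with `conjugated` but differs from `via_chart2`, which is the failure of chart independence for non-commuting g and g′.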
23.8.2 Definition: A structure-preserving fibre set map for a (G, F) topological fibre bundle $(E, \pi, B, A^F_E)$ is a homeomorphism $f : E_{b_1} \approx E_{b_2}$ for $b_1, b_2 \in B$ such that

$$\forall \phi_1 \in A^F_{E,b_1}, \forall \phi_2 \in A^F_{E,b_2}, \exists g \in G, \forall z \in E_{b_1}, \quad \phi_2(f(z)) = g\phi_1(z).$$

23.8.3 Remark: The group element $g \in G$ in Definition 23.8.2 and Notation 23.8.4 may be thought of as the coordinates of the map $f : E_{b_1} \approx E_{b_2}$. This is clearer if g is written as $g_{\phi_1,\phi_2}(f; b_1, b_2) = \beta_{b_2,\phi_2} \circ f \circ \beta_{b_1,\phi_1}^{-1}$, where g and $L_g$ are equated.

23.8.4 Notation: $\mathrm{Iso}_G(E_{b_1}, E_{b_2})$ for a (G, F) topological fibre bundle $(E, \pi, B, A^F_E)$ denotes the set of all topological isomorphisms from $E_{b_1}$ to $E_{b_2}$ for $b_1, b_2 \in B$ which are equivalent to left translations by group elements “through the charts”. In other words,

$$\mathrm{Iso}_G(E_{b_1}, E_{b_2}) = \{f \in \mathrm{Iso}(E_{b_1}, E_{b_2});\ \forall \phi_1 \in A^F_{E,b_1}, \forall \phi_2 \in A^F_{E,b_2}, \exists g \in G, \forall z \in E_{b_1}, \phi_2(f(z)) = g\phi_1(z)\}.$$

[ Possibly define some sort of structure on the $\mathrm{Iso}_G(E_{b_1}, E_{b_2})$ spaces? There could be some sort of topological or algebraic structure. Maybe there could be a “dual isomorphism bundle” structure which looks like a fibre bundle of some sort. At least it should be given a name. In the case of differentiable fibre bundles, there should be a differentiable structure. ]
[ It would be useful to have a notation such as $\mathrm{Iso}_G(E)$ for the “double fibre bundle” of all structure-preserving fibre set maps on (E, π, B). Thus $\mathrm{Iso}_G(E) = \bigcup_{p,q \in B} \mathrm{Iso}_G(E_p, E_q)$. ]

23.8.5 Notation: $\mathrm{Aut}_G(E_b)$ for a (G, F) topological fibre bundle $(E, \pi, B, A^F_E)$ denotes the set of all topological automorphisms of $E_b$ for $b \in B$ which are equivalent to left translations by group elements “through the charts”. That is, $\mathrm{Aut}_G(E_b) = \mathrm{Iso}_G(E_b, E_b)$.

23.8.6 Remark: Notation 23.8.4 is based on the Notation 14.12.3 by which $\mathrm{Iso}(E_{b_1}, E_{b_2})$ denotes the set of homeomorphisms from $E_{b_1}$ to $E_{b_2}$. However, it is superfluous to require that $f \in \mathrm{Iso}(E_{b_1}, E_{b_2})$ for all $f \in \mathrm{Iso}_G(E_{b_1}, E_{b_2})$ because (G, F) is a continuous transformation group. So

$$\mathrm{Iso}_G(E_{b_1}, E_{b_2}) = \{f : E_{b_1} \to E_{b_2};\ \forall \phi_1 \in A^F_{E,b_1}, \forall \phi_2 \in A^F_{E,b_2}, \exists g \in G, \forall z \in E_{b_1}, \phi_2(f(z)) = g\phi_1(z)\}.$$

In terms of the per-base-point chart notation $\beta_{b,\phi}$ in Definition 23.6.4, one may write

$$\mathrm{Iso}_G(E_{b_1}, E_{b_2}) = \{\beta_{b_2,\phi_2}^{-1} \circ L_g \circ \beta_{b_1,\phi_1};\ \phi_1 \in A^F_{E,b_1}, \phi_2 \in A^F_{E,b_2}, g \in G\}.$$

All maps of the form $\beta_{b_2,\phi_2}^{-1} \circ L_g \circ \beta_{b_1,\phi_1}$ are automatically elements of $\mathrm{Iso}(E_{b_1}, E_{b_2})$ by the conditions of Definition 23.6.4.
23.8.7 Definition: A (fibre set) automorphism through the charts on a (G, F) fibre bundle $(E, \pi, B, A^F_E)$ is a map of the form $\beta_{b,\phi}^{-1} \circ L_g \circ \beta_{b,\phi} : E_b \approx E_b$ for some $g \in G$, $b \in B$ and $\phi \in A^F_{E,b}$.

23.8.8 Notation: $L^b_{g,\phi}$ for $b \in B$, $g \in G$ and $\phi \in A^F_{E,b}$ for a topological (G, F) fibre bundle $(E, \pi, B, A^F_E)$ denotes the automorphism through the charts $L^b_{g,\phi} = \beta_{b,\phi}^{-1} \circ L_g \circ \beta_{b,\phi} : E_b \approx E_b$.
$L_{g,\phi}$ for $g \in G$ and $\phi \in A^F_E$ for a topological (G, F) fibre bundle $(E, \pi, B, A^F_E)$ denotes the map $z \mapsto \beta_{\pi(z),\phi}^{-1} \circ L_g \circ \beta_{\pi(z),\phi}(z)$ for $z \in \mathrm{Dom}(\phi)$.

23.8.9 Remark: The map $L_{g,\phi}$ in Notation 23.8.8 is an automorphism $L_{g,\phi} : \mathrm{Dom}(\phi) \approx \mathrm{Dom}(\phi)$.

[ Probably get induced vector fields on differentiable fibre bundles by differentiating $L^b_{g,\phi}$ and $L_{g,\phi}$ with respect to g via the atlas on G. ]

23.8.10 Remark: Theorem 23.8.11 is illustrated in Figure 23.8.1, particularly the proof of part (ii).

[Figure 23.8.1: Fibre set automorphisms. Shows the fibre charts $\phi_k$ from the total space E to the fibre space F, the projection map π from E to the base space B with base point b, the per-fibre charts $\beta_{b,\phi_1}$, $\beta_{b,\phi_2}$ on $E_b$, the left translations $L_{g_1}$, $L_{g_2}$ on F, and the automorphisms $L^b_{g_1,\phi_1}$, $L^b_{g_2,\phi_2}$ on $E_b$.]

23.8.11 Theorem: The fibre set automorphisms $L^b_{g,\phi}$ for a (G, F) topological fibre bundle $(E, \pi, B, A^F_E)$ for $g \in G$, $\phi \in A^F_E$ and $b \in U_\phi = \pi(\mathrm{Dom}(\phi))$ have the following properties:

(i) $\forall \phi \in A^F_E, \forall b \in U_\phi, \forall g_1, g_2 \in G, L^b_{g_1,\phi} \circ L^b_{g_2,\phi} = L^b_{g_1 g_2,\phi}$;
(ii) $\forall \phi_1, \phi_2 \in A^F_E, \forall b \in U_{\phi_1} \cap U_{\phi_2}, \forall g_1, g_2 \in G, L^b_{g_1,\phi_1} \circ L^b_{g_2,\phi_2} = L^b_{g_1 h_{12} g_2 h_{21},\phi_1} = L^b_{h_{21} g_1 h_{12} g_2,\phi_2}$, where $h_{ij} = g_{\phi_i,\phi_j}(b)$ for $i, j = 1, 2$, and $L_{g_{\phi_i,\phi_j}(b)} = \beta_{b,\phi_i} \circ \beta_{b,\phi_j}^{-1}$ as in Definition 23.6.4 (v);
(iii) $\forall \phi_1, \phi_2 \in A^F_E, \forall b \in U_{\phi_1} \cap U_{\phi_2}, L^b_{h_{12},\phi_1} = L^b_{h_{12},\phi_2} = \beta_{b,\phi_2}^{-1} \beta_{b,\phi_1}$ for $h_{12}$ as in (ii);
(iv) $\forall \phi_1, \phi_2, \phi_3 \in A^F_E, \forall b \in U_{\phi_1} \cap U_{\phi_2} \cap U_{\phi_3}, L^b_{h_{12},\phi_1} = L^b_{h_{12},\phi_2} = L^b_{h_{32} h_{13},\phi_3}$ for $h_{ij}$ as in (ii);
(v) $\forall \phi_1, \phi_2 \in A^F_E, \forall b \in U_{\phi_1} \cap U_{\phi_2}, \forall g \in G, L^b_{g,\phi_1} = L^b_{h_{21},\phi_2} L^b_{g,\phi_2} L^b_{h_{12},\phi_2} = L^b_{h_{21} g h_{12},\phi_2}$.

Proof: Part (i) follows from $L^b_{g_1,\phi} L^b_{g_2,\phi} = (\beta_{b,\phi}^{-1} L_{g_1} \beta_{b,\phi})(\beta_{b,\phi}^{-1} L_{g_2} \beta_{b,\phi}) = \beta_{b,\phi}^{-1} L_{g_1} L_{g_2} \beta_{b,\phi} = \beta_{b,\phi}^{-1} L_{g_1 g_2} \beta_{b,\phi}$. (For simplicity, juxtaposition is used instead of the “◦” symbol to indicate function composition here.)

Part (ii) may be calculated as follows. (See Figure 23.8.1.)

$$L^b_{g_1,\phi_1} L^b_{g_2,\phi_2} = (\beta_{b,\phi_1}^{-1} L_{g_1} \beta_{b,\phi_1})(\beta_{b,\phi_2}^{-1} L_{g_2} \beta_{b,\phi_2})$$
$$= \beta_{b,\phi_1}^{-1} L_{g_1} L_{h_{12}} L_{g_2} \beta_{b,\phi_2} (\beta_{b,\phi_1}^{-1} \beta_{b,\phi_1})$$
$$= \beta_{b,\phi_1}^{-1} L_{g_1 h_{12} g_2} L_{h_{21}} \beta_{b,\phi_1}$$
$$= L^b_{g_1 h_{12} g_2 h_{21},\phi_1}.$$

The equality to $L^b_{h_{21} g_1 h_{12} g_2,\phi_2}$ follows similarly. (Note how the indices match up nicely. Spooky, huh?)

Part (iii) follows from $L^b_{h_{12},\phi_1} = \beta_{b,\phi_1}^{-1} L_{h_{12}} \beta_{b,\phi_1} = \beta_{b,\phi_1}^{-1} (\beta_{b,\phi_1} \beta_{b,\phi_2}^{-1}) \beta_{b,\phi_1} = \beta_{b,\phi_2}^{-1} \beta_{b,\phi_1}$. Similarly, $L^b_{h_{12},\phi_2} = \beta_{b,\phi_2}^{-1} (\beta_{b,\phi_1} \beta_{b,\phi_2}^{-1}) \beta_{b,\phi_2} = \beta_{b,\phi_2}^{-1} \beta_{b,\phi_1}$.

Part (iv) follows from the calculation:

$$L^b_{h_{12},\phi_1} = \beta_{b,\phi_1}^{-1} L_{h_{12}} \beta_{b,\phi_1}$$
$$= (\beta_{b,\phi_3}^{-1} \beta_{b,\phi_3}) \beta_{b,\phi_1}^{-1} L_{h_{12}} \beta_{b,\phi_1} (\beta_{b,\phi_3}^{-1} \beta_{b,\phi_3})$$
$$= \beta_{b,\phi_3}^{-1} L_{h_{31}} L_{h_{12}} L_{h_{13}} \beta_{b,\phi_3}$$
$$= \beta_{b,\phi_3}^{-1} L_{h_{31} h_{12} h_{13}} \beta_{b,\phi_3}$$
$$= L^b_{h_{31} h_{12} h_{13},\phi_3} = L^b_{h_{32} h_{13},\phi_3}.$$

The other half of part (iv) follows from part (iii).

Part (v) follows from the calculation

$$L^b_{g,\phi_1} = \beta_{b,\phi_1}^{-1} L_g \beta_{b,\phi_1} = \beta_{b,\phi_1}^{-1} (\beta_{b,\phi_2} \beta_{b,\phi_2}^{-1}) L_g (\beta_{b,\phi_2} \beta_{b,\phi_2}^{-1}) \beta_{b,\phi_1} = L^b_{h_{21},\phi_2} L^b_{g,\phi_2} L^b_{h_{12},\phi_2} = L^b_{h_{21} g h_{12},\phi_2}.$$

23.8.12 Remark: Part (i) of Theorem 23.8.11 means that the composition rules for the automorphisms $L_{g,\phi}$ are the same as for the left translation operators of the transformation group (G, F). The general composition rule $L^b_{g_1,\phi_1} \circ L^b_{g_2,\phi_2} = L^b_{g_1 h_{12} g_2 h_{21},\phi_1} = L^b_{h_{21} g_1 h_{12} g_2,\phi_2}$ for different charts in part (ii) involves some sort of conjugation of group elements with the coordinate transition group elements $h_{12}$ and $h_{21} = h_{12}^{-1}$.

Part (iii) in reverse gives a chart transition rule for elements of a fibre set. Thus $\beta_{b,\phi_1}^{-1} \beta_{b,\phi_2}(z) = L^b_{h_{21},\phi_1}(z)$ for all $z \in E_b$. Hence $\beta_{b,\phi_2}(z) = \beta_{b,\phi_1} L^b_{h_{21},\phi_1}(z)$ for all $z \in E_b$. In terms of Notation 23.8.15, one may write also $L^{b,b}_{e,\phi_1,\phi_2} = L^b_{h_{12},\phi_1} = L^b_{h_{12},\phi_2} = \beta_{b,\phi_2}^{-1} \beta_{b,\phi_1}$.
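The composition rules of Theorem 23.8.11 are purely algebraic, so they can be checked exhaustively in a small model. The sketch below makes assumptions which are not in the text: one fibre set $E_b$ and the fibre space F are both modelled as {0, 1, 2}, G is the permutation group of F with each g identified with $L_g$, and the per-fibre charts `beta1` and `beta2` are arbitrary hypothetical bijections. It tests parts (i), (ii), (iii) and (v) for every choice of group elements.

```python
from functools import reduce
from itertools import permutations


def compose(*perms):
    """compose(a, b, c) is the composition a o b o c of permutations stored as tuples."""
    return reduce(lambda p, q: tuple(p[q[i]] for i in range(len(q))), perms)


def inverse(p):
    """Inverse of a permutation stored as a tuple of images."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


G = list(permutations(range(3)))       # each g is identified with L_g : F -> F

# Hypothetical per-fibre charts E_b -> F at one base point b.
beta1 = (1, 2, 0)
beta2 = (0, 2, 1)
h12 = compose(beta1, inverse(beta2))   # L_{h12} = beta1 o beta2^{-1}
h21 = compose(beta2, inverse(beta1))   # L_{h21} = beta2 o beta1^{-1}


def L(g, beta):
    """Automorphism through the charts: beta^{-1} o L_g o beta on E_b."""
    return compose(inverse(beta), g, beta)


part_i = all(compose(L(g1, beta1), L(g2, beta1)) == L(compose(g1, g2), beta1)
             for g1 in G for g2 in G)
part_ii = all(compose(L(g1, beta1), L(g2, beta2))
              == L(compose(g1, h12, g2, h21), beta1)
              == L(compose(h21, g1, h12, g2), beta2)
              for g1 in G for g2 in G)
part_iii = L(h12, beta1) == L(h12, beta2) == compose(inverse(beta2), beta1)
part_v = all(L(g, beta1) == L(compose(h21, g, h12), beta2) for g in G)
```

All four flags evaluate to true for this model. Replacing `beta1` and `beta2` by other bijections gives the same outcome, since only the group axioms are used.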
23.8.13 Remark: For any topological $(G, F)$ fibre bundle $(E, \pi, B, A^F_E)$, it is tempting to believe that:
\[
\forall b_1, b_2 \in B,\ \forall \phi_1 \in A^F_{E,b_1},\ \forall \phi_2 \in A^F_{E,b_2},\ \forall f \in \mathrm{Iso}_G(E_{b_1}, E_{b_2}),\ \forall g \in G,\ \forall z \in E_{b_1},\quad
f(L^{b_1}_{g,\phi_1}(z)) = L^{b_2}_{g,\phi_2}(f(z)).
\]
In fact, this is false in general. One would think that if you carry out a transformation $g$ on the domain of an isomorphism, then this should commute with the same transformation on the range. The reason it doesn’t work is the way in which the fibre charts interact with the isomorphism. Consider for example the parallelism on $\mathbb{R}^2$ with the group $O(2)$ of rotations and reflections. If the charts at two base points are orientation-preserving, then the above rule is true for rotations $g$. But if one of the charts has the opposite orientation to the other, then the rotation $L_{g,\phi_1}$ on one fibre set will have the opposite effect to the rotation $L_{g,\phi_2}$ on the other fibre set.

The essence of the problem is the fact that all elements of $\mathrm{Iso}_G(E_{b_1}, E_{b_2})$ are themselves left actions on the fibre space, and these cannot be expected to commute with the left actions (automorphisms) through the charts on each fibre set.

It turns out that in the case of principal fibre bundles, this problem can be resolved by using a combination of right actions and left actions. Thus a rule of the form $f(R_g(z)) = R_g(f(z))$ will be obtained.

[ The following paragraph doesn’t make any sense until you’ve read the sections on tangent bundles. So it really shouldn’t be here at all! ]

This issue is related to the concept of group invariants because the fibre set isomorphisms preserve all invariants of the structure group, which is because the isomorphisms are themselves left group actions. In the case of coordinate bundles for differentiable manifolds, it will turn out that the left action of the structure group preserves linear combinations. That is, linear combinations are an invariant of the general linear group. Therefore matrices which represent linear combinations of tangent vectors must be invariant. But such matrices act on vectors from the right. And this is why coordinate frame bundles are used as principal fibre bundles. Coordinate frames are really right actions on the tangent vector space.
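The $O(2)$ example in Remark 23.8.13 can be made concrete with a short worked calculation. The following is an illustrative sketch under assumptions which are not in the text: both fibre sets are identified with the fibre space $\mathbb{R}^2$, the chart at $b_1$ is the identity, the chart at $b_2$ is a fixed reflection $S$, and $f$ is the identity map of $\mathbb{R}^2$ (absolute parallelism). It is also assumed that $\mathrm{Iso}_G(E_{b_1}, E_{b_2})$ consists of the isomorphisms through the charts.

```latex
% Assumed setup (illustrative only): G = O(2), F = E_{b_1} = E_{b_2} = \mathbb{R}^2,
% \beta_{b_1,\phi_1} = \mathrm{id}, \beta_{b_2,\phi_2} = S = \mathrm{diag}(1,-1), so S^{-1} = S.
\[
f = \mathrm{id}_{\mathbb{R}^2} = \beta_{b_2,\phi_2}^{-1} \circ L_S \circ \beta_{b_1,\phi_1},
\]
% so f is an isomorphism through the charts, with group element S.
% For a rotation g = R_\theta through the angle \theta:
\[
L^{b_1}_{g,\phi_1} = R_\theta, \qquad
L^{b_2}_{g,\phi_2} = S^{-1} R_\theta S = R_{-\theta},
\]
\[
f(L^{b_1}_{g,\phi_1}(z)) = R_\theta z, \qquad
L^{b_2}_{g,\phi_2}(f(z)) = R_{-\theta} z .
\]
% The two sides agree for all z only if R_{2\theta} is the identity,
% that is, only if \theta is an integer multiple of \pi.
```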
In Theorem 23.8.11, the fact that $L^b_{h_{12},\phi_1} = L^b_{h_{12},\phi_2}$ in part (iii) suggests that $L^b_{h_{12},\phi}$ is independent of the chart $\phi$. Part (iv) shows that this is not true in general. Part (v) is a general chart transition rule. This rule indicates a fundamental problem with the transfer of group actions from the fibre space $F$ to the fibre sets $E_b$. The problem is that parallelism cannot be specified by associating a group element $g \in G$ with each point on a curve to indicate a left action on the fibre set at that point. The left actions $L^b_{g,\phi}$ on $E_b$ are chart-dependent. So the orientation of the fibre sets must be indicated in general by a different group element for each chart. The purpose of principal fibre bundles is to try to remove this problem.
23.8.14 Definition: A (fibre set) isomorphism through the charts on a $(G, F)$ fibre bundle $(E, \pi, B, A^F_E)$ is a map of the form $\beta_{b_2,\phi_2}^{-1} \circ L_g \circ \beta_{b_1,\phi_1} : E_{b_1} \approx E_{b_2}$ for some $g \in G$, $b_1, b_2 \in B$, $\phi_1 \in A^F_{E,b_1}$ and $\phi_2 \in A^F_{E,b_2}$.

23.8.15 Notation: $L^{b_1,b_2}_{g,\phi_1,\phi_2}$ for $b_1, b_2 \in B$, $g \in G$, $\phi_1 \in A^F_{E,b_1}$ and $\phi_2 \in A^F_{E,b_2}$ for a topological $(G, F)$ fibre bundle $(E, \pi, B, A^F_E)$ denotes the isomorphism through the charts $L^{b_1,b_2}_{g,\phi_1,\phi_2} = \beta_{b_2,\phi_2}^{-1} \circ L_g \circ \beta_{b_1,\phi_1} : E_{b_1} \approx E_{b_2}$.
23.8.16 Remark: Notation 23.8.15 is illustrated in Figure 23.8.2.

Figure 23.8.2   Fibre set isomorphisms through the charts
(Diagram: the fibre space $F$ with the left translation $L_g$; the fibre charts $\beta_{b_1,\phi_1}$, $\beta_{b_2,\phi_2}$, $\beta_{b_3,\phi_3}$ from the fibre sets $E_{b_1}$, $E_{b_2}$, $E_{b_3}$ of the total space $E$ to $F$; the isomorphisms $L^{b_i,b_j}_{g,\phi_i,\phi_j}$ between the fibre sets; and the projection map $\pi$ from $E$ to the base space $B$, which contains $b_1$, $b_2$, $b_3$.)
If the fibre set isomorphisms $L^{b_1,b_2}_{g,\phi_1,\phi_2}$ for base point pairs $(b_1, b_2) \in U_{\phi_1} \times U_{\phi_2}$, where $U_\phi = \pi(\mathrm{Dom}(\phi))$, are considered as the maps $L_{g,\phi_1,\phi_2}$ for variable $(b_1, b_2)$, the result is not a single-valued function because each element of the fibre set $E_{b_1}$ is mapped to a single element in every fibre set $E_{b_2}$ for $b_2 \in U_{\phi_2}$. Therefore the symbol $L_{g,\phi_1,\phi_2}$ is better thought of as an equivalence relation than as a function. For each choice of $g \in G$ and $\phi_1, \phi_2 \in A^F_E$, the relation $L_{g,\phi_1,\phi_2}$ sets up an equivalence between one point in each fibre set of $\mathrm{Dom}(\phi_1)$ and one point in each fibre set of $\mathrm{Dom}(\phi_2)$. In practice, the superscripts on the notations $L^{b_1,b_2}_{g,\phi_1,\phi_2}$ and $L^b_{g,\phi}$ are tedious to write; so they may be omitted.
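23.8.17 Theorem: The fibre set isomorphisms $L^{b_1,b_2}_{g,\phi_1,\phi_2}$ for a $(G, F)$ topological fibre bundle $(E, \pi, B, A^F_E)$ for $g \in G$, $\phi_k \in A^F_E$ and $b_k \in U_{\phi_k} = \pi(\mathrm{Dom}(\phi_k))$ for $k = 1, 2$ have the following properties:

(i) $\forall \phi_1, \phi_2, \phi_3 \in A^F_E$, $\forall b_1 \in U_{\phi_1}$, $\forall b_2 \in U_{\phi_2}$, $\forall b_3 \in U_{\phi_3}$, $\forall g_1, g_2 \in G$, $L^{b_2,b_3}_{g_2,\phi_2,\phi_3} \circ L^{b_1,b_2}_{g_1,\phi_1,\phi_2} = L^{b_1,b_3}_{g_2 g_1,\phi_1,\phi_3}$;

(ii) $\forall \phi_1, \phi_2, \phi_3, \phi_4 \in A^F_E$, $\forall b_1 \in U_{\phi_1}$, $\forall b_2 \in U_{\phi_2} \cap U_{\phi_3}$, $\forall b_3 \in U_{\phi_4}$, $\forall g_1, g_2 \in G$, $L^{b_2,b_3}_{g_2,\phi_3,\phi_4} \circ L^{b_1,b_2}_{g_1,\phi_1,\phi_2} = L^{b_1,b_3}_{g_2 h_{32} g_1,\phi_1,\phi_4}$, where $h_{ij} = g_{\phi_i,\phi_j}(b_2)$ for $i, j = 2, 3$, and $L_{g_{\phi_i,\phi_j}(b)} = \beta_{b,\phi_i} \circ \beta_{b,\phi_j}^{-1}$ as in Definition 23.6.4 (v);

(iii) $\forall \phi_1, \phi'_1, \phi_2, \phi'_2 \in A^F_E$, $\forall b_1 \in U_{\phi_1} \cap U_{\phi'_1}$, $\forall b_2 \in U_{\phi_2} \cap U_{\phi'_2}$, $\forall g, g' \in G$, $L^{b_1,b_2}_{g,\phi_1,\phi_2} = L^{b_1,b_2}_{g',\phi'_1,\phi'_2} \Leftrightarrow g' = g_{\phi'_2,\phi_2}(b_2) \, g \, g_{\phi_1,\phi'_1}(b_1)$;

(iv) $\forall \phi_1, \phi_2 \in A^F_E$, $\forall b_1 \in U_{\phi_1}$, $\forall b_2 \in U_{\phi_2}$, $L^{b_1,b_2}_{e,\phi_1,\phi_2} = \beta_{b_2,\phi_2}^{-1} \beta_{b_1,\phi_1}$;

(v) $\forall \phi_1, \phi_2 \in A^F_E$, $\forall b_1 \in U_{\phi_1}$, $\forall b_2 \in U_{\phi_2}$, $\forall g \in G$, $L_g = \beta_{b_2,\phi_2} L^{b_1,b_2}_{g,\phi_1,\phi_2} \beta_{b_1,\phi_1}^{-1} : F \to F$.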
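Proof: Part (i) follows by simple calculation (indicating composition by juxtaposition):
\[
\begin{aligned}
L^{b_2,b_3}_{g_2,\phi_2,\phi_3} L^{b_1,b_2}_{g_1,\phi_1,\phi_2}
&= (\beta_{b_3,\phi_3}^{-1} L_{g_2} \beta_{b_2,\phi_2})(\beta_{b_2,\phi_2}^{-1} L_{g_1} \beta_{b_1,\phi_1}) \\
&= \beta_{b_3,\phi_3}^{-1} L_{g_2 g_1} \beta_{b_1,\phi_1} \\
&= L^{b_1,b_3}_{g_2 g_1,\phi_1,\phi_3}.
\end{aligned}
\]
Part (ii) may be calculated similarly as follows:
\[
\begin{aligned}
L^{b_2,b_3}_{g_2,\phi_3,\phi_4} L^{b_1,b_2}_{g_1,\phi_1,\phi_2}
&= (\beta_{b_3,\phi_4}^{-1} L_{g_2} \beta_{b_2,\phi_3})(\beta_{b_2,\phi_2}^{-1} L_{g_1} \beta_{b_1,\phi_1}) \\
&= \beta_{b_3,\phi_4}^{-1} L_{g_2} L_{h_{32}} L_{g_1} \beta_{b_1,\phi_1} \\
&= L^{b_1,b_3}_{g_2 h_{32} g_1,\phi_1,\phi_4}.
\end{aligned}
\]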
[ When “double fibre bundles” have been defined, Remark 23.8.16 can be phrased in such terms. ]
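The rule for changes of fibre charts in part (iii) follows from the calculation:
\[
\begin{aligned}
L^{b_1,b_2}_{g,\phi_1,\phi_2}
&= \beta_{b_2,\phi_2}^{-1} L_g \beta_{b_1,\phi_1} \\
&= \beta_{b_2,\phi'_2}^{-1} (\beta_{b_2,\phi'_2} \beta_{b_2,\phi_2}^{-1}) L_g (\beta_{b_1,\phi_1} \beta_{b_1,\phi'_1}^{-1}) \beta_{b_1,\phi'_1} \\
&= \beta_{b_2,\phi'_2}^{-1} L_{g_{\phi'_2,\phi_2}(b_2)} L_g L_{g_{\phi_1,\phi'_1}(b_1)} \beta_{b_1,\phi'_1}.
\end{aligned}
\]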
Parts (iv) and (v) follow trivially from Definition 23.8.14.

23.8.18 Remark: Theorem 23.8.17 (v) implies that any fibre set isomorphism $f : E_{b_1} \to E_{b_2}$ may be converted to a unique $g \in G$ by calculating $L_g = \beta_{b_2,\phi_2} \circ f \circ \beta_{b_1,\phi_1}^{-1}$. Any isomorphism $f = L^{b_1,b_2}_{g,\phi_1,\phi_2}$ may be regarded as a parallelism relation between the fibre sets at $b_1$ and $b_2$ in $B$. To specify absolute (path-independent) parallelism, these fibre set isomorphisms are completely adequate. But for pathwise (path-dependent) parallelism, the definitions of Section 24.2 are required.

Looking ahead to differentiable fibre bundles, the strategy will be to try to differentiate the parameter $g$ for the map $f$ with respect to the point $b_2$ as it varies along a curve. This derivative will be called a “connection”. One may think of $g$ as a function $g_{f,\phi_1,\phi_2}(b_1, b_2)$ which is a coordinatization of the map $f(b_1, b_2) : E_{b_1} \to E_{b_2}$. Thus $L_{g_{f,\phi_1,\phi_2}(b_1, b_2)} = \beta_{b_2,\phi_2} f(b_1, b_2) \beta_{b_1,\phi_1}^{-1}$.

Looking even further ahead, one may also try to calculate the exterior derivative of the derivative of $g_{f,\phi_1,\phi_2}(b_1, b_2)$ with respect to $b_2$ to obtain a measure of the curvature of the parallelism. This is not totally unlike the situation in elementary calculus where the curvature of a curve in flat 2-space $\mathbb{R}^2$ is related to the second derivative of the curve regarded as a graph.
23.9. Topological principal fibre bundles

23.9.1 Remark: A topological principal fibre bundle is a particular kind of topological fibre bundle, namely a topological $(G, F)$ fibre bundle $(P, \pi, B)$ such that $F = G$. (Topological fibre bundles were defined in Section 23.6.) More precisely, the structure group $(G, F)$ −< $(G, T_G, F, T_F, \sigma_G, \mu)$ has $F = G$, $T_F = T_G$ and $\mu = \sigma_G$. So the structure group for a principal fibre bundle is the topological left transformation group $(G, G)$ −< $(G, T_G, G, T_G, \sigma_G, \sigma_G)$. The corresponding topological group is $G$ −< $(G, T_G, \sigma_G)$. A principal fibre bundle with structure group $G$ is usually called a “principal $G$-bundle” or just a “$G$-bundle”.

With an ordinary fibre bundle, the group $G$ acts on the left on the fibre space $F$. But when the fibre space $F$ equals the group $G$, the group can also act on the right on the fibre space. It is possible to extend the right action of $G$ from the fibre space to the total space $P$ via the fibre charts in a chart-independent manner. This makes possible the construction of a topological right transformation group $(G, P)$ −< $(G, T_G, P, T_P, \sigma_G, \mu^P_G)$ in terms of a right action $\mu^P_G : P \times G \to P$. This right transformation group is uniquely determined by the fibre bundle $(P, \pi, B)$ −< $(P, T_P, \pi, B, T_B, A^G_P)$.

Some authors define principal fibre bundles in reverse. They start with the right transformation group $(G, P)$ and build the $(G, G)$ fibre bundle $(P, \pi, B)$ from this. (E.g. Kobayashi/Nomizu [26], page 50, construct the base space of a principal fibre bundle as the quotient of a $G$-space $P$ by the right action of a group $G$. Then they construct ordinary fibre bundles as associated bundles for left transformation groups $(G, F)$ as in Section 23.12.)

Most authors define connections (differential parallelism) on principal fibre bundles rather than ordinary fibre bundles. Then they copy connections from PFBs to associated OFBs. (They often use constructions in terms of “lifts” and “pullbacks” which are unnecessarily convoluted and indirect.) So principal fibre bundles customarily serve as a structure on which to define connections and parallelism. Since connections and parallelism are defined on OFBs instead of PFBs in this book, the motivation for PFBs is not so strong. They are, however, important for understanding the literature. (Although PFBs are ubiquitous in the mathematical DG literature, they are seldom seen in the physics literature.)

It is perhaps interesting to ask whether principal fibre bundles could be defined for $F \ne G$. For this to make sense, $G$ would need to have two actions on $F$, a left action and a right action, and these two actions must commute with each other. However, this might not be very useful.

Definition 23.9.2 is the topological version of Definition 35.4.1 for differentiable principal fibre bundles.
23.9.2 Definition: A topological principal (fibre) bundle with structure group $G$ for a topological group $G$ −< $(G, T_G, \sigma_G)$ is a topological $(G, G)$ fibre bundle $(P, \pi, B)$ −< $(P, T_P, \pi, B, T_B, A^G_P)$ with $(G, G)$ −< $(G, T_G, G, T_G, \sigma_G, \sigma_G)$.

The right action of $G$ on $P$ is the operation $\mu^P_G : P \times G \to P$ defined by $\mu^P_G(z, g) = \beta_{\pi(z),\phi}^{-1}(\sigma_G(\phi(z), g))$ for $(z, g) \in P \times G$ for any $\phi \in A^G_P$ with $z \in \mathrm{Dom}(\phi)$, where $\beta_{b,\phi} = \phi\big|_{\pi^{-1}(\{b\})}$ for $b \in \pi(\mathrm{Dom}(\phi))$.

The right transformation group of $G$ on $P$ is the topological right transformation group $(G, P)$ −< $(G, T_G, P, T_P, \sigma_G, \mu^P_G)$.

A topological principal fibre bundle with structure group $G$ is also called a topological principal $G$-bundle or a topological $G$-bundle.
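23.9.3 Remark: Figure 23.9.1 illustrates the spaces and maps in Definition 23.9.2. The map $R_g : P \to P$ in Figure 23.9.1 is defined in Notation 23.9.4, and $L_g : G \to G$ is the left translation map $L_g : g' \mapsto gg'$.

Figure 23.9.1   Principal fibre bundle spaces and maps
(Diagram: $R_g$ acting on $\pi^{-1}(U) \subseteq P$; $L_g$ acting on $G$; the chart $\phi$ from $\pi^{-1}(U)$ to $G$; the projection $\pi$ from $\pi^{-1}(U)$ to $U \subseteq B$; and the product map $\pi \times \phi$ from $\pi^{-1}(U)$ to $U \times G \subseteq B \times G$.)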
[ Kobayashi/Nomizu [26], page 50, says that M = P/G and π is the projection map of this quotient. ]
23.9.5 Theorem: The right action $\mu^P_G$ in Definition 23.9.2 is fibre chart independent.

Proof: For any fibre charts $\phi, \phi' \in A^G_P$ with $z \in \mathrm{Dom}(\phi) \cap \mathrm{Dom}(\phi')$ and $b = \pi(z)$, let $g' \in G$ be such that $\beta_{b,\phi'} \circ \beta_{b,\phi}^{-1} = L_{g'}$. Then
\[
\begin{aligned}
\beta_{b,\phi}^{-1}(\sigma_G(\beta_{b,\phi}(z), g))
&= \beta_{b,\phi'}^{-1} \circ L_{g'}(\sigma_G(L_{(g')^{-1}} \circ \beta_{b,\phi'}(z), g)) \\
&= \beta_{b,\phi'}^{-1}(\sigma_G(\beta_{b,\phi'}(z), g)).
\end{aligned}
\]
The theorem follows from this.

23.9.6 Remark: This chart independence in Theorem 23.9.5 is due to the fact that the fibre space is a group whose elements can be acted on from both the left and the right. One could say that group elements are “two-port” objects. Since groups are associative, the right and left actions commute with each other. The elements of ordinary fibre spaces are then “single-port” objects. As noted in Remark 23.8.1, it is apparently not possible to define a chart-independent right action for ordinary fibre bundles.

If the expression $\beta_{\pi(z),\phi}^{-1}(\sigma_G(\phi(z), g))$ for $\mu^P_G(z, g)$ in Definition 23.9.2 seems untidy, it may also be written as $\mu^P_G(z, g) = (\pi \times \phi)^{-1}(\pi(z), \phi(z)g)$.

23.9.7 Remark: The right action $R_g : P \to P$ of $G$ on $P$ in Notation 23.9.4 may be expressed on each fibre set $\pi^{-1}(\{b\})$ in terms of the right action $\bar{R}_g : G \to G$ of $G$ on $G$ as $R_g = \beta_{b,\phi}^{-1} \circ \bar{R}_g \circ \beta_{b,\phi}$ for $g \in G$ and $b \in \pi(\mathrm{Dom}(\phi))$. In other words, the action of $G$ on $P$ is the same as the action of $G$ on $G$ through the charts. Therefore the right action $\mu^P_G$ does not provide any information that is not already in the fibre bundle.

23.9.8 Remark: Condition (ii) of Theorem 23.9.9 is often given as part of the definition of a principal fibre bundle, whereas here it is presented as a consequence of the definition of a right action which is expressed in terms of the components of the standard fibre bundle tuple in Definition 23.9.2.
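The cancellation in the proof of Theorem 23.9.5 rests on associativity, as Remark 23.9.6 points out. The lines below are a sketch of that step, written with juxtaposition for the group operation. The shorthand $x = \beta_{b,\phi'}(z)$ is introduced here for illustration and is not in the text. The second line shows, for contrast, what the same chart change does to a left action by $g$, which suggests why no chart-independent left action is obtained in this way.

```latex
% Shorthand (not in the text): x = \beta_{b,\phi'}(z) \in G,
% where \beta_{b,\phi'} \circ \beta_{b,\phi}^{-1} = L_{g'}.
\begin{align*}
% Right action by g: the chart transition element g' cancels by associativity.
L_{g'}\bigl(\sigma_G(L_{(g')^{-1}}(x), g)\bigr)
  &= g'\bigl(((g')^{-1} x)\, g\bigr) = (g' (g')^{-1})\, x\, g = x g = \sigma_G(x, g). \\
% Left action by g through the chart \phi, re-expressed in the chart \phi':
L_{g'}\bigl(\sigma_G(g, L_{(g')^{-1}}(x))\bigr)
  &= g'\, g\, (g')^{-1} x ,
  \quad\text{which equals } g x \text{ only if } g' g = g g' .
\end{align*}
```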
23.9.4 Notation: $R_g$ for $g \in G$ for a principal fibre bundle $(P, \pi, B, A^G_P)$ denotes the map $R_g : P \to P$ defined by $R_g : z \mapsto \mu^P_G(z, g)$, where $\mu^P_G$ is the right action of $G$ on $P$.
23.9.9 Theorem: The right action $\mu^P_G$ on a topological principal fibre bundle $(P, \pi, B, A^G_P)$ is the unique map $\mu^P_G : P \times G \to P$ which satisfies:

(i) $\forall z \in P$, $\forall g \in G$, $\pi(\mu^P_G(z, g)) = \pi(z)$; (That is, $\pi(zg) = \pi(z)$.)

(ii) $\forall \phi \in A^G_P$, $\forall z \in \mathrm{Dom}(\phi)$, $\forall g \in G$, $\phi(\mu^P_G(z, g)) = \sigma_G(\phi(z), g)$. (That is, $\phi(zg) = \phi(z)g$.)

[ Show explicitly that actions $L^b_{g,\phi}$ and $R_g$ commute on $P$. ]

23.9.10 Remark: In Definition 23.6.4 for ordinary fibre bundles, the structure group $G$ is required to act effectively on the fibre space $F$. Definition 23.9.2 does not need to specify that the right action of $G$ on $G$ is effective because a group always acts freely on itself, and a free action on a non-empty set is necessarily effective, as mentioned in Remark 9.4.22.

The right action $\mu^P_G$ of $G$ acts freely on $P$. To see this in terms of the conditions in Theorem 23.9.9, suppose that $z \in P$ and $g \in G \setminus \{e\}$ satisfy $\mu^P_G(z, g) = z$; that is, $zg = z$. Then by (i), $\pi(zg) = \pi(z)$. So $zg$ and $z$ are in the same fibre set of $P$; that is, they have the same base point $b = \pi(zg) = \pi(z)$. From (ii) it follows that $\phi(zg) = \phi(z)g$ for any chart $\phi \in A^G_P$ such that $b \in U_\phi = \pi(\mathrm{Dom}(\phi))$. Since $G$ acts freely on $G$, this implies that $\phi(zg) \ne \phi(z)$. But $\pi \times \phi : \pi^{-1}(U_\phi) \approx U_\phi \times G$ is a bijection. Therefore $zg = (\pi \times \phi)^{-1}(b, \phi(zg)) \ne (\pi \times \phi)^{-1}(b, \phi(z)) = z$, which contradicts the assumption. This means that non-identity elements of $G$ have no fixed points, and $G$ therefore acts freely on $P$.

23.9.11 Remark: Theorem 23.9.9 (ii) means that all fibre charts are equivariant maps between the right transformation groups $(G, P)$ and $(G, G)$. (See Definition 9.5.8 for equivariant maps.) Condition (i) means that the action of $G$ on $P$ is “vertical”. Without this condition, the action of $G$ on $P$ would not be uniquely determined by the fibre charts as a result of condition (ii).
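The note before Remark 23.9.10 asks for an explicit check that the actions $L^b_{g,\phi}$ and $R_{g'}$ commute on $P$. A sketch of such a check, on a single fibre set $\pi^{-1}(\{b\})$ with $b \in \pi(\mathrm{Dom}(\phi))$, is given below. It uses the chart expressions $L^b_{g,\phi} = \beta_{b,\phi}^{-1} L_g \beta_{b,\phi}$ from Section 23.8 and $R_{g'} = \beta_{b,\phi}^{-1} \bar{R}_{g'} \beta_{b,\phi}$ from Remark 23.9.7. It is offered as an illustration and is not necessarily the argument which the author intended.

```latex
% On the fibre set \pi^{-1}(\{b\}); composition is written by juxtaposition.
\begin{align*}
L^b_{g,\phi}\, R_{g'}
  &= (\beta_{b,\phi}^{-1} L_g \beta_{b,\phi})(\beta_{b,\phi}^{-1} \bar{R}_{g'} \beta_{b,\phi})
   = \beta_{b,\phi}^{-1} L_g \bar{R}_{g'} \beta_{b,\phi} \\
  % Associativity in G: L_g \bar{R}_{g'}(x) = g(xg') = (gx)g' = \bar{R}_{g'} L_g(x) for x \in G.
  &= \beta_{b,\phi}^{-1} \bar{R}_{g'} L_g \beta_{b,\phi}
   = (\beta_{b,\phi}^{-1} \bar{R}_{g'} \beta_{b,\phi})(\beta_{b,\phi}^{-1} L_g \beta_{b,\phi})
   = R_{g'}\, L^b_{g,\phi} .
\end{align*}
```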
23.9.12 Remark: In contrast to the ease of constructing $\mu^P_G$ from the atlas $A^G_P$ in Remark 23.9.7, it is not possible to construct the atlas from the right action $\mu^P_G$. It is true that if $\phi(z)$ is known for one value of $z \in \pi^{-1}(\{b\})$ for a given $b \in \pi(\mathrm{Dom}(\phi))$, then $\phi(zg)$ is uniquely defined for all $g \in G$ as $\phi(zg) = \phi(z)g$, and this means (by Definition 23.6.4 (ii)) that $\phi(z)$ is uniquely defined for all $z \in \pi^{-1}(\{b\})$. Thus the fibre charts are fully defined if they are defined for just one value of $z$ in each fibre $\pi^{-1}(\{b\})$. However, this single value cannot be obtained from $\mu^P_G$. Therefore the information in the fibre charts is partly but not fully redundant.

To specify a fibre chart $\phi$, it is sufficient to specify the set $\phi^{-1}(\{e\})$, where $e \in G$ is the identity of $G$. The rest of the fibre chart is uniquely determined by this. One may think of the unique point $\beta_{b,\phi}^{-1}(e) \in \phi^{-1}(\{e\}) \cap \pi^{-1}(\{b\})$ as the “identity point” in each fibre set $\pi^{-1}(\{b\})$. For a tangent bundle, this gives a kind of “coordinate frame field” on $B$.

[ Kobayashi/Nomizu [26], page 50, construct $B = P/G$. They then derive $\beta_{b,\phi_1} \circ \beta_{b,\phi_2}^{-1} = L_g$ from the right action of $G$ on $P$ on page 51. Present this construction here also. ]
23.9.13 Remark: Structure-preserving fibre set maps are defined for principal fibre bundles in exactly the same way as for ordinary fibre bundles in Section 23.8. As mentioned in Remark 23.8.1, left actions on fibre sets by the structure group do not generally commute with the fibre chart transition maps. Therefore the structure group element $g \in G$ in Definition 23.8.2 is chart-dependent. This is reflected in Notation 23.8.8 where $L_{g,\phi}$ denotes left translation of fibre sets by $g$ with respect to a fibre chart $\phi$. This is still true for left actions on principal fibre bundles, but for a PFB it is possible to define right actions also (as in Notation 23.9.4), and right actions do commute with fibre chart transition maps.

[ Near here, replicate Section 23.8 using right actions $R_g$ on principal fibre bundles instead of left actions? Then show how left and right actions interact? E.g. $R^G_g \circ \phi = \phi \circ R^P_g$ for charts $\phi : P \mathrel{\mathring{\to}} G$? ]
[ No textbooks seem to mention that the group action on the total space in Theorem 23.9.9 must satisfy condition (i): $\pi(zg) = \pi(z)$. Check that this is not implied by other conditions. ]

23.10. Associated topological fibre bundles

Two fibre bundles are said to be “associated” if they have the same fibre chart transition maps. It follows that associated fibre bundles must have the same structure group because fibre chart transition maps are equivalent to left translations $L_g$ of the fibre space $F$ by elements $g$ of the structure group $G$, but the fibre spaces $F$ may be different. Fibre chart transition maps describe how patches of the total space are glued together. Therefore associated fibre bundles must be glued together in the same way, even though they may have a different fibre space. The equality of fibre chart transition maps also implies that the fibre bundle base spaces $B$ must be the same.

Prime examples of associated fibre bundles are tensor bundles $T^{r,s}(M)$ of different types $(r, s)$ on the same differentiable manifold $M$. (See Section 29.3 for tensors.) These have the same structure group $G$ (typically $GL(n)$ for an $n$-dimensional manifold) and the same fibre chart transition rules. Tensors of different type have different fibre spaces $F$ and transformation rules $\mu : G \times F \to F$, but the coordinate charts $\phi$ themselves are related to each other by the same transition rules $g_{\phi_1,\phi_2}$ for each type of tensor. Thus the group elements are the same, but the actions of the group elements on the fibre spaces are different because the fibre spaces are different.

Associated fibre bundles could be defined to have different base spaces, but usually they have the same base space as in Definition 23.10.5. This is understandable because the main purpose of defining them is to transfer parallelism from one fibre bundle to another under the assumption that they are based on the same underlying fibre structure. Thus, for instance, all the different kinds of tensor bundles on a differentiable manifold are based on a single tangent bundle. If the base spaces are different, one may transfer parallelism between fibre bundles using fibre bundle isomorphisms.

[ Maybe could transfer parallelism between fibre bundles by using the inverse of a fibre bundle homomorphism! ]

Fibre bundle associations are quite different to fibre bundle homomorphisms (which are defined in Section 23.7). A fibre bundle association specifies a map between the fibre atlases of the fibre bundles.
This fibre atlas map is required to preserve the chart transition maps, but there is no specified map between the total spaces of the associated bundles. A fibre bundle association may be thought of as an isomorphism of the fibre atlases.

[ Investigate how a homomorphism $h_2 : \xi_2 \to \eta$ can be defined in terms of a homomorphism $h_1 : \xi_1 \to \eta$ if $\xi_1$ and $\xi_2$ are associated fibre bundles. ]

23.10.1 Remark: If $(G, F)$ and $(G, \tilde{F})$ are effective left transformation groups, then the fibre chart transition functions uniquely determine group elements. These group elements can then be compared with each other. If they are equal for some bijection between the atlases, then the fibre bundles are said to be associated. This shows a very good reason why the transformation groups should be effective.

Since parallelism is determined by group elements at each base point, it follows that parallelism can be uniquely transferred from any fibre bundle to any associated fibre bundle. In the case of differentiable fibre bundles, this means that connections may be ported between different types of fibre bundles if they have the same base space and structure group. In particular, affine connections on tangent bundles can be copied to all types of tensor bundles.

23.10.2 Remark: Definitions 23.10.3 and 23.10.5 are pleasantly simple compared to the definitions of associated fibre bundles in most textbooks. The usual definitions are actually particular methods of construction of associated fibre bundles, and it is the methods of construction which are complicated. Definition 23.10.3 says only that associated fibre bundles must have exactly matching domains for all charts in the respective atlases, and that all chart transition functions must be the same.

Definition 23.10.3 is illustrated in Figure 23.10.1. Note how the correspondence between the fibre bundles involves only a map $h$ between the charts. There are no maps between the spaces of the two fibre bundles.
This should be contrasted with fibre bundle homomorphisms as illustrated in Figure 23.7.2, which has maps between all components of the two fibre bundles. Note also that although the spaces “on the outside” ($G$ and $B$) are the same, the spaces “on the inside” ($F$ and $E$) are different. Therefore the group action maps $\mu$ and $\tilde{\mu}$ are different and the projection maps $\pi$ and $\tilde{\pi}$ are different. In the middle, the charts $\phi_i$ and $\tilde{\phi}_i = h(\phi_i)$ are related by the fibre bundle association map $h$.
Specific methods for constructing associated fibre bundles are presented in Sections 23.11 and 23.12.

Figure 23.10.1   Associated topological fibre bundles
(Diagram: two fibre bundles side by side with the same group $G$ and the same base space $B$; on the left the action $\mu$, the fibre space $F$, the charts $\phi_1$, $\phi_2$, the total space $E$ and the projection $\pi$; on the right the action $\tilde{\mu}$, the fibre space $\tilde{F}$, the charts $h(\phi_1)$, $h(\phi_2)$, the total space $\tilde{E}$ and the projection $\tilde{\pi}$; the map $h$ connects the charts only.)

23.10.3 Definition: A topological fibre bundle association is a bijection $h : A^F_E \to A^{\tilde{F}}_{\tilde{E}}$ between the fibre atlases of topological $(G, F)$ and $(G, \tilde{F})$ fibre bundles $(E, \pi, B, A^F_E)$ and $(\tilde{E}, \tilde{\pi}, B, A^{\tilde{F}}_{\tilde{E}})$ respectively such that:

(i) $\forall \phi \in A^F_E$, $\pi(\mathrm{Dom}(\phi)) = \tilde{\pi}(\mathrm{Dom}(h(\phi)))$;

(ii) $\forall \phi_1, \phi_2 \in A^F_E$, $\forall b \in U_{\phi_1} \cap U_{\phi_2}$, $g_{\phi_1,\phi_2}(b) = \tilde{g}_{h(\phi_1),h(\phi_2)}(b)$, where $U_\phi$ denotes $\pi(\mathrm{Dom}(\phi))$, and $g$, $\tilde{g}$ denote the fibre chart transition functions for the respective fibre bundle atlases.

[ What can be said about the relation between the topologies on $E$ and $\tilde{E}$ in Definition 23.10.3? ]

23.10.4 Remark: Perhaps a clearer way of presenting Definition 23.10.3 (ii) is:
\[
\forall \phi_1, \phi_2 \in A^F_E,\ \forall b \in U_{\phi_1} \cap U_{\phi_2},\ \forall g \in G,\quad
\bigl( \forall z \in \pi^{-1}(\{b\}),\ \phi_1(z) = g \phi_2(z) \bigr)
\ \Leftrightarrow\
\bigl( \forall \tilde{z} \in \tilde{\pi}^{-1}(\{b\}),\ h(\phi_1)(\tilde{z}) = g\, h(\phi_2)(\tilde{z}) \bigr).
\tag{23.10.1}
\]
23.10.5 Definition: Associated topological fibre bundles are topological $(G, F)$ and $(G, \tilde{F})$ fibre bundles $(E, \pi, B, A^F_E)$ and $(\tilde{E}, \tilde{\pi}, B, A^{\tilde{F}}_{\tilde{E}})$ for which a topological fibre bundle association $h : A^F_E \to A^{\tilde{F}}_{\tilde{E}}$ is specified.

23.10.6 Remark: Recall from Definition 23.6.4 that the chart transition functions for Definition 23.10.3 are defined by $L_{g_{\phi_1,\phi_2}(b)} = \beta_{b,\phi_1} \circ \beta_{b,\phi_2}^{-1}$ and $L_{\tilde{g}_{h(\phi_1),h(\phi_2)}(b)} = \tilde{\beta}_{b,h(\phi_1)} \circ \tilde{\beta}_{b,h(\phi_2)}^{-1}$, where $\beta$ and $\tilde{\beta}$ are notations which are defined by $\beta_{b,\phi} = \phi\big|_{\pi^{-1}(\{b\})}$ and $\tilde{\beta}_{b,\tilde{\phi}} = \tilde{\phi}\big|_{\tilde{\pi}^{-1}(\{b\})}$. Definition 23.10.3 (ii) means that the group elements $g$ and $\tilde{g}$ are equal, but $L_g : F \approx F$ and $L_{\tilde{g}} : \tilde{F} \approx \tilde{F}$ are not equal in general. Even if the spaces $F$ and $\tilde{F}$ are the same, the group actions $\mu : G \times F \to F$ and $\tilde{\mu} : G \times \tilde{F} \to \tilde{F}$ might be different.
23.10.7 Example: The fibre bundle association for fibre bundles in Definition 23.10.5 is generally not unique even if the atlases of the fibre bundles are fixed. As a trivial example, consider $F = \tilde{F} = (-1, 1) \subseteq \mathbb{R}$ with the relative topology from $\mathbb{R}$. Let $B = \{0\} \subseteq \mathbb{R}$, let $E = \tilde{E} = F$, and let $G = \mathrm{Aut}(F)$, the group of homeomorphisms from $F$ to $F$. Define charts $\phi_1 : E \to F$ and $\phi_2 : E \to F$ by $\phi_1 : x \mapsto x$ and $\phi_2 : x \mapsto -x$, and let $A^F_E = A^{\tilde{F}}_{\tilde{E}} = \{\phi_1, \phi_2\}$. Of course, $\pi : x \mapsto 0$ and $\tilde{\pi} : x \mapsto 0$.

Since $\xi = (E, \pi, B, A^F_E)$ and $\tilde{\xi} = (\tilde{E}, \tilde{\pi}, B, A^{\tilde{F}}_{\tilde{E}})$ are identical fibre bundles, the identity map $h_1 : A^F_E \to A^{\tilde{F}}_{\tilde{E}}$ is a topological fibre bundle association according to Definition 23.10.3. But another topological fibre bundle association is the map $h_2 : A^F_E \to A^{\tilde{F}}_{\tilde{E}}$ defined by $h_2 : \phi_1 \mapsto \phi_2$ and $h_2 : \phi_2 \mapsto \phi_1$. To show that Definition 23.10.3 (ii) is satisfied, note that $\tilde{g}_{h_2(\phi_1),h_2(\phi_2)}(b) = \tilde{g}_{\phi_2,\phi_1}(0) : x \mapsto -x$. But $g_{\phi_1,\phi_2}(b) : x \mapsto -x$ also. Condition (ii) follows for other chart combinations similarly. So $h_1$ and $h_2$ are topological fibre bundle associations between $\xi$ and $\tilde{\xi}$, and $h_1 \ne h_2$.

Since a fibre bundle may be associated with itself via an atlas bijection which is not the identity, such a self-association may be composed with associations to different fibre bundles to also make associations between different fibre bundles non-unique. It follows that when parallelism is being copied between associated fibre bundles, it is essential to specify which association map is being used between the atlases.
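The statement in Example 23.10.7 that condition (ii) holds for the other chart combinations can be checked case by case. The table below is a sketch of that check for $h_2$ at $b = 0$. Each transition element is written as the homeomorphism of $F = (-1, 1)$ by which it acts, using $L_{g_{\phi_i,\phi_j}(b)} = \beta_{b,\phi_i} \circ \beta_{b,\phi_j}^{-1}$.

```latex
% Check of Definition 23.10.3 (ii) for h_2 : \phi_1 \mapsto \phi_2, \phi_2 \mapsto \phi_1 at b = 0.
\[
\begin{array}{c|c|c}
(\phi_i, \phi_j) & g_{\phi_i,\phi_j}(0) & \tilde{g}_{h_2(\phi_i),h_2(\phi_j)}(0) \\ \hline
(\phi_1, \phi_1) & x \mapsto x  & \tilde{g}_{\phi_2,\phi_2}(0) : x \mapsto x \\
(\phi_1, \phi_2) & x \mapsto -x & \tilde{g}_{\phi_2,\phi_1}(0) : x \mapsto -x \\
(\phi_2, \phi_1) & x \mapsto -x & \tilde{g}_{\phi_1,\phi_2}(0) : x \mapsto -x \\
(\phi_2, \phi_2) & x \mapsto x  & \tilde{g}_{\phi_1,\phi_1}(0) : x \mapsto x
\end{array}
\]
```

The two columns agree in every row, and condition (i) holds trivially because every chart domain projects onto $B = \{0\}$.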
Equation (23.10.1) in Remark 23.10.4 is similar to equation (24.3.1) in Definition 24.3.2 for associated parallelism. (For proof of Remark 23.10.4, see Exercise 47.8.2.)
23.10.8 Remark: Definition 23.10.5 is very strict. It requires the atlases of associated topological fibre bundles to have closely matched domains and identical fibre chart transition maps. In practice, one may be content to call a pair of fibre bundles associated if they are merely $C^0$ equivalent to a pair of strictly associated fibre bundles. This is presented in Definition 23.10.9.

[ In Remark 23.10.8, could talk about “strongly associated” and “weakly associated” fibre bundles. ]

[ Must define $C^0$ equivalent topological fibre bundles in Section 23.7. ]

23.10.9 Definition: $C^0$ associated topological fibre bundles are topological fibre bundles $\xi_1$, $\tilde{\xi}_1$ with the same base space which are $C^0$ equivalent to associated topological fibre bundles $\xi_2$, $\tilde{\xi}_2$ respectively.

23.10.10 Remark: Definition 23.10.9 is illustrated in Figure 23.10.2. Definition 23.10.9 means that if $\xi_2$ and $\tilde{\xi}_2$ are associated topological fibre bundles according to Definition 23.10.5, and $\xi_1$ and $\xi_2$ are equivalent, and $\tilde{\xi}_1$ and $\tilde{\xi}_2$ are equivalent, and they all have the same base space, then $\xi_1$ and $\tilde{\xi}_1$ are $C^0$ associated topological fibre bundles. (See Definition 23.7.11 for $C^0$ equivalent topological fibre bundles.) Definition 23.10.9 permits the $C^0$ associated fibre bundles $\xi_1$, $\tilde{\xi}_1$ to have different but equivalent structure groups $(G_1, F_1)$ and $(\tilde{G}_1, \tilde{F}_1)$, but the base spaces are required to be identical. This is somewhat arbitrary and may be changed in light of the requirements for any applications.
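Figure 23.10.2   $C^0$ associated topological fibre bundles, Definition 23.10.9
(Diagram: four fibre bundles over the same base space $B$ in a row, $\xi_1$, $\xi_2$, $\tilde{\xi}_2$, $\tilde{\xi}_1$, with structure groups $G_1$, $G_2$, $\tilde{G}_2 = G_2$, $\tilde{G}_1$, actions $\mu_1$, $\mu_2$, $\tilde{\mu}_2$, $\tilde{\mu}_1$, fibre spaces $F_1$, $F_2$, $\tilde{F}_2$, $\tilde{F}_1$, charts $\phi_1$, $\phi_2$, $\tilde{\phi}_2 = h(\phi_2)$, $\tilde{\phi}_1$, total spaces $E_1$, $E_2$, $\tilde{E}_2$, $\tilde{E}_1$ and projections $\pi_1$, $\pi_2$, $\tilde{\pi}_2$, $\tilde{\pi}_1$; $\xi_1$ and $\xi_2$ are related by an isomorphism, $\xi_2$ and $\tilde{\xi}_2$ by the association $h$, and $\tilde{\xi}_2$ and $\tilde{\xi}_1$ by an isomorphism.)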
23.11. Construction of associated topological fibre bundles

Associated fibre bundles are constructed in this section as identification spaces. These are similar to the identification spaces in Section 23.4.

23.11.1 Remark: Definition 23.11.2 constructs an associated $(G, \tilde{F})$ fibre bundle from a given $(G, F)$ fibre bundle. The only information that the associated bundle inherits from the given bundle is the set of transition maps $g_{\phi_1 \phi_2}$ of Definition 23.6.4 for topological fibre bundles. (See Kobayashi/Nomizu [26], Prop. 5.2, page 52, for the related result that a principal fibre bundle may be constructed from any set of transition maps covering the base space and satisfying a transitivity rule. An almost identical result is at the end of EDM2 [34], 147.B.)

23.11.2 Definition: The associated topological $(G, \tilde{F})$ fibre bundle (identification space method) of a given topological $(G, F)$ fibre bundle $(E, \pi, B)$ −< $(E, T_E, \pi, B, T_B, A^F_E)$, for topological left transformation groups $(G, F)$ −< $(G, T_G, F, T_F, \sigma, \mu^F_G)$ and $(G, \tilde{F})$ −< $(G, T_G, \tilde{F}, T_{\tilde{F}}, \sigma, \mu^{\tilde{F}}_G)$, is the topological $(G, \tilde{F})$ fibre bundle $(\tilde{E}, \tilde{\pi}, B)$ −< $(\tilde{E}, T_{\tilde{E}}, \tilde{\pi}, B, T_B, A^{\tilde{F}}_{\tilde{E}})$ defined by:
23.12. Construction of associated topological fibre bundles via orbit spaces
521
˜ = [(b, y, φ)]; b ∈ B, y ∈ F˜ , φ ∈ AF , where [(b, y, φ)] = {(b, gφ′ φ (b)y, φ′ ); φ′ ∈ AF }, the transition (i) E E,b E,b maps gφ1 φ2 : Uφ1 ∩ Uφ2 → G are defined by Lgφ1 ,φ2 (b) = φ1 ◦ φ2 π−1 ({b}) −1 , and Dom(φ) = π −1 (Uφ )
for φ ∈ AF E; ˜ (ii) π ˜ : E → B is defined by π ˜ : [(b, y, φ)] 7→ b; F˜ F ˜ ˜ (iii) AE˜ = {φ; φ ∈ AE }, where φ˜ : π ˜ −1 (Uφ ) → F˜ is defined for φ ∈ AF E by φ : [(b, y, φ)] 7→ y; S ˜ −1 (Ωφ ); Ω : AF → IP(B × F˜ ) and ∀φ ∈ AF , Ωφ ∈ Top(Uφ × F˜ ) . (iv) TE˜ = π × φ) E E φ∈AF (˜ E
[ Create a diagram for Definition 23.11.2. ]
˜ π 23.11.3 Remark: The fibre bundle (E, ˜ , B) which is constructed in Definition 23.11.2 is well defined and satisfies Definition 23.10.3 for a fibre bundle association because the chart transition maps satisfy φ˜2 : [(b, y1 , φ1 )] 7→ gφ2 ,φ1 (b)y1 by conditions (i) and (iii). It would perhaps be more logical to use the charts φ˜ as tags for triples (b, y, φ) instead of φ, but then they would be used in (i) although they are not defined until (iii). Besides, there is a one-to-one map between them anyway. 23.11.4 Remark: If the fibre space F˜ in Definition 23.11.2 is the structure group G, then the associated ˜ π fibre bundle (E, ˜ , B) = (P, q, B) is a principal In this case, the right action µP G : P × G → P of G-bundle. P ′ ′ the G-bundle is defined by µG : [(b, g, φ)], g 7→ [(b, gg , φ)].
[ Refer to the example of the tangent bundle of a differentiable manifold. ] [ Try to show relations between actions Rg and Rf for associated ordinary and principal fibre bundles. ] [ Here present the 4-tuple equivalence class construction for associated fibre bundles. ]
The “orbit space method” of defining associated fibre bundles constructs associated ordinary fibre bundles from a given principal fibre bundle. The orbit space method is less general than the identification space method in Section 23.11 because the given fibre bundle must be a PFB, but it is the method most often encountered in textbooks.

The author has spent a tremendous amount of time and energy attempting to find natural generalizations or a deeper theoretical context for the popular “orbit space method”. It seems, however, that the orbit space construction is a red herring which leads to nothing of interest. Therefore it is de-emphasized in this book. Differential geometry probably does not need it.

In practice, no one ever seems to use the orbit-space construction directly because it is too abstract. Instead they show that some more concrete construction is isomorphic to the orbit-space associated fibre bundle, and they use the more concrete construction instead. So the orbit-space construction is perhaps one of those things which one should just learn and forget.

23.12.1 Remark: The identification space method uses fibre charts $\phi$ as tags for pairs $(b, y) \in B \times F$, for a base space $B$ and fibre space $F$, to make tagged tuples $(b, y, \phi)$. The fibre chart tags determine the required transformation of the fibre space element $y$ when changing the fibre chart. So for an arbitrary chart $\phi'$, the fibre space element for $(b, y, \phi)$ is calculated as $g_{\phi',\phi}(b)y$ in terms of the chart transition map $g_{\phi',\phi}(b)$ for the given fibre bundle.

The orbit space method, on the other hand, uses tuples of the form $(z, y) \in P \times F$, for a given PFB total space $P$. The component $z \in P$ contains the same information as the combination of the base point $b$ and fibre chart $\phi$ in the identification space method. The fibre space element corresponding to a pair $(z, y)$ is easily calculated as $\phi'(z)y$ for an arbitrary chart $\phi'$. This gives exactly the same answer as in the identification space method because $\phi'(z) = g_{\phi',\phi}(b)\phi(z)$. The orbit space method looks simpler, but in practice exactly the same calculation is required.

In summary, the tuple $(b, y, \phi)$ in the identification space method carries around a copy of the base point $b$ and fibre chart $\phi$ so that the fibre chart transition map $g_{\phi',\phi}(b)$ can be applied correctly to change $y$ to $y' = g_{\phi',\phi}(b)y$, whereas the tuple $(z, y)$ in the orbit space method carries around a PFB total space element $z$ so that the group elements $\phi(z)$ and $\phi'(z)$ may be applied to $y$ to change it to $y' = \phi'(z)\phi(z)^{-1}y$ when the fibre chart is changed to $\phi'$. This works because $\phi'(z)\phi(z)^{-1} = g_{\phi',\phi}(b)$, which is the relation $\phi'(z) = g_{\phi',\phi}(b)\phi(z)$ rearranged. (Neat, huh?)
The orbit space method does not work if the given fibre bundle is not a PFB because the product $\phi(z)y$ is only defined if $\phi(z)$ is an element of a transformation group acting on $y \in F$, and this implies that the fibre bundle is a PFB. The orbit space method could be generalized to given fibre bundles which are not PFBs if their fibre space contains enough information to make the correct transition maps. In other words, it must be possible to extract the group element $g_{\phi',\phi}(b)$ somehow. In fact, this can be done in the case of coordinate $n$-frames for tangent bundles of $n$-dimensional manifolds, because there is a one-to-one map between transition maps and transformations of $n$-frames. This is used in the popular definition of principal tangent bundles. It can also be done for the $(n + 1)$-frame bundle, but not for the $(n - 1)$-frame bundle. However, it seems that the effort required to construct such intellectual curiosities does not yield any worthwhile benefits. (See Remark 35.9.4 for further comment on this.)

23.12.2 Remark: Definition 23.12.3 constructs a topological $(G, F)$ fibre bundle $(E, \pi, B)$ from a given topological principal $G$-bundle $(P, q, B)$, where $(G, F)$ is an effective topological left transformation group. The total space $E$ of the ordinary fibre bundle $(E, \pi, B)$ is constructed as a set of equivalence classes of pairs in $P \times F$ which is locally homeomorphic to $B \times F$. The $(G, F)$ fibre atlas $A^F_E$ for $(E, \pi, B)$ is constructed from the atlas $A^G_P$ for $(P, q, B)$ by applying the group operation of $G$. The topology $T_E$ for $E$ is defined so that the maps $\pi \times \tilde\phi$ will be homeomorphisms for $\tilde\phi \in A^F_E$.

23.12.3 Definition: The associated topological $(G, F)$ fibre bundle (orbit space method) with structure group $(G, F)$ −< $(G, T_G, F, T_F, \sigma, \mu^F_G)$ for a given topological $G$-bundle $(P, q, B)$ −< $(P, T_P, q, B, T_B, A^G_P)$ is the topological $(G, F)$ fibre bundle $(E, \pi, B)$ −< $(E, T_E, \pi, B, T_B, A^F_E)$ defined by:
(i) $E = \{[(z, y)];\ z \in P,\ y \in F\}$, where $[(z, y)] = \{(z', y') \in P \times F;\ q(z') = q(z),\ \exists \phi \in A^G_P,\ \phi(z')y' = \phi(z)y\}$;
(ii) $\pi : E \to B$ is defined by $\pi : [(z, y)] \mapsto q(z)$;
(iii) $A^F_E = \{\tilde\phi;\ \phi \in A^G_P\}$, where $\tilde\phi : \pi^{-1}(U_\phi) \to F$ is defined for $\phi \in A^G_P$ by $\tilde\phi : [(z, y)] \mapsto \phi(z)y$;
(iv) $T_E = \bigl\{\bigcup_{\phi \in A^G_P} (\pi \times \tilde\phi)^{-1}(\Omega_\phi);\ \Omega : A^G_P \to \mathbb{P}(B \times F) \text{ and } \forall \phi \in A^G_P,\ \Omega_\phi \in \mathrm{Top}(U_\phi \times F)\bigr\}$.
23.12.4 Notation: $(P \times F)/G$ denotes the orbit-space version of the associated topological fibre bundle in Definition 23.12.3. Thus one may write $E = (P \times F)/G$. (See Notation n53 for another popular notation.)

23.12.5 Remark: There is no axiom of choice problem with Definition 23.12.3 (iv) because it is clear that the maps $\Omega : A^G_P \to \mathbb{P}(B \times F)$ defined by $\Omega : \phi \mapsto \emptyset$ and $\Omega : \phi \mapsto U_\phi \times F$ are both well-defined. So $T_E$ is non-empty if $P$ is non-empty. (As mentioned in Remark 23.2.3, fibre bundles may be empty.)

23.12.6 Theorem: The associated $(G, F)$ fibre bundle of a principal $G$-bundle satisfies the conditions for the definition of a topological $(G, F)$ fibre bundle.

Proof: From the one-to-one correspondence between $[(z, y)] \in E_b = \pi^{-1}(\{b\})$ and $\phi(z)y \in F$, it follows that the maps $\beta_{b,\tilde\phi} = \tilde\phi\big|_{E_b}$ are bijections, where $\tilde\phi$ is defined by condition (iii). Therefore the maps $\pi \times \tilde\phi : \pi^{-1}(U_\phi) \to U_\phi \times F$ are bijections.

[ Show that $\phi_1(z) = g\phi_2(z) \Leftrightarrow \tilde\phi_1(\tilde z) = g\tilde\phi_2(\tilde z)$ if $\pi(z) = \pi(\tilde z)$. ]
[ Must show that $T_E$ is a topology etc. Use a general theorem about weak topology for partial functions? ]
...
[ Show that $T_E$ is the only possible topology for $E$ in Definition 23.12.3. ]
[ Prove here that Definition 23.12.3 satisfies the conditions of Definition 23.10.5 for an associated fibre bundle. ]

23.12.7 Remark: The relation $\phi(z')y' = \phi(z)y$ in Definition 23.12.3 (i) is independent of the choice of $\phi \in A^G_P$. (For proof, see Exercise 47.8.3.) If $P$ is non-empty, then $A^G_P$ is non-empty; so the set of $(z', y')$ satisfying “$\exists \phi \in A^G_P,\ \phi(z')y' = \phi(z)y$” is the same as the set of $(z', y')$ satisfying “$\forall \phi \in A^G_P,\ \phi(z')y' = \phi(z)y$”. Thus two pairs $(z, y)$ and $(z', y')$ are considered equivalent in Definition 23.12.3 (i) if they have the same base point in $B$ and the action of $z$ on $y$ through a chart $\phi$ is the same as the action of $z'$ on $y'$ through the same chart $\phi$. Hence the elements $[(z, y)]$ of $E$ are the same as orbits of the action of $P$ on $F$ except that the action is indirect via one or more charts $\phi \in A^G_P$. (For comparison, see Definition 9.4.28 for the orbit space of a general left transformation group.)
[ Create a diagram for Definition 23.12.3. ]
23.12.8 Remark: Definition 23.12.3 (i) may be expressed explicitly in terms of orbit spaces by noting that

$q(z') = q(z)$ and $\phi(z')y' = \phi(z)y$
$\Leftrightarrow\ q(z') = q(z)$ and $y' = \phi(z')^{-1}\phi(z)y$
$\Leftrightarrow\ q(z') = q(z)$ and $\exists g \in G,\ y' = gy$ and $g = \phi(z')^{-1}\phi(z)$
$\Leftrightarrow\ q(z') = q(z)$ and $\exists g \in G,\ y' = gy$ and $\phi(z') = \phi(zg^{-1})$
$\Leftrightarrow\ \exists g \in G,\ y' = gy$ and $z' = zg^{-1}$
$\Leftrightarrow\ \exists g \in G,\ (z', y') = (zg^{-1}, gy)$
$\Leftrightarrow\ \exists g \in G,\ (z', y') = (zg, g^{-1}y)$.

So $[(z, y)]$ is the same thing as $\{(zg, g^{-1}y);\ g \in G\}$, which happens to be the orbit of $(z, y) \in P \times F$ under the right action $((z, y), g) \mapsto (zg, g^{-1}y)$ of $G$ on $P \times F$. Since $\tilde\phi\bigl([(zg, g^{-1}y)]\bigr) = \phi(zg)g^{-1}y = \phi(z)gg^{-1}y = \phi(z)y = \tilde\phi\bigl([(z, y)]\bigr)$ for any $g \in G$, a fibre chart $\tilde\phi$ maps all representatives of an orbit $[(z, y)]$ to the same element of $F$. The set $E = (P \times F)/G$ is defined as a “right inside skew product” of transformation groups in Definition 9.6.5.

The construction of $(P \times F)/G$ is similar to the construction of a tensor product of two spaces. It is particularly similar to the tensor product of two modules over a ring. (See EDM2 [34], 277.J.) For any ring $R$ and a right $R$-module $X$ and left $R$-module $Y$, the tensor product $X \otimes_R Y$ is defined so as to satisfy $(xa) \otimes y = x \otimes (ay)$ for $a \in R$, $x \in X$ and $y \in Y$, and some other conditions. The projection map $f : P \times F \to E$ is “$G$-balanced” in the sense that $[(zg, y)] = [(z, gy)]$ for all $z \in P$, $g \in G$ and $y \in F$.

[ Tensor products of modules should be defined more tidily and moved to Section 9.9. ]

As mentioned in Remark 23.9.7, the information in the right action map $\mu^P_G$ is redundant because this information may be recovered from the fibre charts. Consequently, an associated OFB constructed in Definition 23.12.3 from a PFB may be defined without reference to $\mu^P_G$. To be specific, the expression $\mu^P_G(z, g)$ in condition (i) may be replaced with $\beta^{-1}_{\pi(z),\phi}(\phi(z).g)$ for $\phi \in A^G_P$. This shows that the associated OFB is constructed essentially in terms of only the fibre charts of the PFB.

23.12.9 Notation: The total space $E = (P \times F)/G$ may be denoted as $P \times_G F$, and the fibre bundle $(E, \pi, B)$ is often denoted as $\eta \times_G F$, where $\eta = (P, q, B)$. That is, $(E, \pi, B) = (P, q, B) \times_G F$.

23.12.10 Remark: If the identification space method of Definition 23.11.2 is applied to a $(G, F)$ fibre bundle $(E, \pi, B, A^F_E)$ to construct a $G$-bundle $(P, q, B, A^G_P)$, and then the orbit space method of Definition 23.12.3 is applied to $(P, q, B)$ to construct a $(G, \tilde F)$ fibre bundle $(\tilde E, \tilde\pi, B, A^{\tilde F}_{\tilde E})$, the result is a total space $\tilde E$ which consists of equivalence classes which look like $\bigl[([(b, g, \phi)], y)\bigr]$, where $b \in B$, $g \in G$, $\phi \in A^F_E$ and $y \in \tilde F$. This suggests that one should combine the four components of the tuples into a single tuple to give an equivalence class such as $[(b, g, y, \phi)]$. This would be defined by $[(b, g, y, \phi)] = \{(b, g', y', \phi');\ g_{\phi\phi'}(b)(g'y') = gy\}$. (Of course, the full definition of this set requires also the constraints $g' \in G$, $y' \in \tilde F$ and $\phi' \in A^F_{E,b}$. These boring constraints are suppressed here. The group element $g_{\phi\phi'}(b)$ is the usual chart transition rule.) The group element $g$ in $[(b, g, y, \phi)]$ may be chosen to be equal to the identity $e \in G$ and can then be removed. This yields $[(b, y, \phi)]$ as in Definition 23.11.2.
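The orbit description of $[(z, y)]$ in Remark 23.12.8 can be checked by brute force on a finite toy model. The following is only a sketch under stated assumptions, not part of the book's construction: the group $G = S_3$ acting on $F = \{0, 1, 2\}$, the two-point base space, the trivial bundle $P = B \times G$, the transition function `c` and all function names are hypothetical choices made for illustration. It enumerates the orbits $\{(zg, g^{-1}y);\ g \in G\}$ and confirms that each induced chart $[(z, y)] \mapsto \phi(z)y$ takes a single value on every orbit, and that two charts differ by the transition element.

```python
from itertools import permutations

# Toy model (all names hypothetical): G = S_3 acting on F = {0, 1, 2},
# and P = B x G is a trivial principal bundle over a two-point base space.
G = list(permutations(range(3)))   # group elements as tuples
F = range(3)
B = ["b0", "b1"]

def mul(g, h):
    """Group product gh, meaning 'apply h first, then g'."""
    return tuple(g[h[i]] for i in range(3))

def inv(g):
    """Inverse permutation."""
    out = [0] * 3
    for i, gi in enumerate(g):
        out[gi] = i
    return tuple(out)

def act(g, y):
    """Left action of G on F."""
    return g[y]

def right(z, g):
    """Right action of G on P: (b, h).g = (b, hg)."""
    b, h = z
    return (b, mul(h, g))

def phi(z):
    """Fibre chart phi : P -> G, which satisfies phi(zg) = phi(z)g."""
    return z[1]

# A second chart phi2 = c(b).phi for an arbitrarily chosen transition function c.
c = {"b0": G[3], "b1": G[4]}

def phi2(z):
    return mul(c[z[0]], z[1])

def orbit(z, y):
    """The class [(z, y)] = {(zg, g^{-1}y); g in G} as a frozenset."""
    return frozenset((right(z, g), act(inv(g), y)) for g in G)

def chart_value(chart, cls):
    """Values of (z, y) -> chart(z)y over all representatives of a class."""
    return {act(chart(z), y) for (z, y) in cls}

P = [(b, h) for b in B for h in G]
classes = {orbit(z, y) for z in P for y in F}

for cls in classes:
    v1, v2 = chart_value(phi, cls), chart_value(phi2, cls)
    # Each induced chart is well defined on the orbit (one value only).
    assert len(v1) == 1 and len(v2) == 1
    # The two chart values differ by the transition element c(b).
    b = next(iter(cls))[0][0]
    assert act(c[b], next(iter(v1))) == next(iter(v2))
```

In this model the number of orbits equals the number of points of $B \times F$, which matches the expectation that $(P \times F)/G$ is locally a copy of $B \times F$.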
Chapter 24
Parallelism on topological fibre bundles

24.1 Parallelism path classes
24.2 Pathwise parallelism on topological fibre bundles
24.3 Associated parallelism
24.4 Other topics
Parallelism on topological fibre bundles is a natural generalization of the connections on differentiable fibre bundles in Chapter 36.
24.1. Parallelism path classes
24.1.1 Remark: One may ask why parallelism is defined for paths and not, say, for parametric families of paths or some other structure. If there were absolute parallelism within general 1-parameter families of paths (essentially 2-dimensional submanifolds), then the pathwise curvature for all paths would be zero, and this would (apparently) imply that the parallelism is absolute. For 1-dimensional submanifolds, disconnected paths (such as ordered traversals as in Definition 7.1.17) would yield “parallelism at a distance”, which would also seem to imply absolute parallelism. So the only way to develop a non-trivial parallelism is apparently with connected paths, and connectedness implies that the curve domains are intervals.

One may also ask whether pathwise parallelism has any applications, and the answer is that all Riemannian manifolds yield a well-defined parallelism via the Levi-Civita connection, and many areas of mechanics and field theory also yield non-trivial parallelisms. So it turns out that pathwise parallelism is a very applicable generalization of absolute parallelism, and any further generalization beyond paths is probably not very useful.

[ In Remark 24.1.1, try to prove that sheetwise parallelism always implies zero curvature and absolute parallelism. ]

24.1.2 Remark: The fibre atlas on a topological fibre bundle uniquely determines the topology, which in turn determines which cross-sections along paths are continuous. A definition of parallelism, on the other hand, determines which continuous cross-sections along paths correspond to parallel translation. Since parallel translation must satisfy a group invariance property, the structure group plays a role in defining parallelism but not in defining the set of all continuous cross-sections.
As discussed in Section 16.1, the terminology adopted for curves and paths in this book is that “curves” are maps γ : I → M for intervals I and topological spaces M , whereas “paths” are equivalence classes of curves. Two curves are considered equivalent if they are related to each other by an increasing parameter homeomorphism. So a path is a set of curves which all start at the same point and take the same route to the end point, passing every point in the same order. Parallel transport depends only on the path, not on the particular choice of curve which represents the path.
Parallelism in flat spaces is absolute parallelism, which means a global equivalence relation between elements of fibre sets at each base point. By contrast, pathwise parallelism means parallelism which is absolute only within each path. For all the points along each path, there is an equivalence relation between elements of fibre sets on the points of the path. (Self-intersections of paths are dealt with by treating multiple crossings of a single base point as different points.)
In the case of differentiable fibre bundles, parallelism is defined on rectifiable paths because a connection can only be integrated along a path if the tangent to the path exists almost everywhere. (See Sections 17.5 and 27.11 for rectifiable paths.) It does not seem to be possible to generalize parallelism in a satisfactory way to all continuous paths in a topological space or topological manifold. (The set of all continuous paths in $M$ is denoted $\mathcal{P}_0(M)$ in Notation 16.4.4.) This is not surprising because the transitivity property of parallelism along paths gives parallelism the character of an integral, and integrals are not usually defined for completely general functions. In this case, the integral is the kind of direction-dependent path integral that appears in the Stokes theorem, which requires a locally rectifiable curve.

Therefore it seems natural and unavoidable that the most general definition of pathwise parallelism on a topological fibre bundle will require the specification of a class $\mathcal{P}$ of paths on which parallelism may be defined. Since the path class $\mathcal{P}$ will typically be defined in terms of a differentiable structure, it will not generally be definable in terms of the topological structure alone. This is a kind of ex machina path class which requires some external structure for its definition, and which therefore must be specified as an ad hoc set which has certain closure and continuity properties.

The paths in the class $\mathcal{P}$ could be thought of as being “wormholes” through which parallelism is carried between the fibre sets at different points in the base space. This is illustrated in Figure 24.1.1.
Figure 24.1.1: Parallelism “wormholes” (paths carrying fibre set orientation information)
24.1.3 Remark: It is not guaranteed that every kind of transformation of fibre sets along paths in a fibre bundle will satisfy the criteria for a definition of parallelism. In a physical system, one may imagine that one sends test particles out into the state space of the system to measure the transformations that occur in the fibre sets along the paths of the particles. Only if these transformations satisfy a definition such as Definition 24.2.2 can the transformations be thought of as a kind of parallelism. In other words, a parallelism is something that one must discover. It turns out that many mathematical models, in particular all Riemannian metric spaces, have a natural and useful parallelism. If the parallelism can be differentiated, then a connection is defined. If the connection has a well-defined exterior derivative, then the curvature may be defined, and curvature is what makes differential geometry different to flat-space geometry.

24.1.4 Remark: Parallelism along paths is essential in physics for the support of polarization of light and conservation of momentum. Since Mach's principle (a very reasonable principle) says that momentum must be related to the rest of the matter of the universe (and inertial frames just “coincidentally” are those which have constant velocity relative to the “fixed stars”), one might ask if there is some causal relation between the “fixed stars” and the parallelism or affine connection on physical space. Even though there is no luminiferous aether in the 19th century sense, there still seems to be some sort of structure in the vacuum which defines parallelism so that momentum and polarization are meaningful. Space seems to require a connection in addition to mere differentiable structure, and one might reasonably ask what this structure is composed of. It seems to obey equations which are related to gravity, but it is not clear what the “parallelism wormholes” are. It seems almost as if space has tramlines laid down for matter and energy to flow along, and the tramlines have some sort of “roll control” which maintains parallel transport.

One could go further and ask the more fundamental question of why physical space (or space-time) seems to have a differentiable structure. How are nearby points “glued” together to make a smoothish manifold? How do nearby points “know” that they are near each other? How does space know that it must let light pass through it at the speed of light and not some other speed? Why doesn't light travel in a randomly crooked path or go round in circles? The anthropic principle does not tell us how these things happen. It only tells us that we would not be observing the world if it were otherwise.

24.1.5 Remark: Definitions 24.1.6 and 24.2.2 are, presumably, non-standard. It is reasonable to expect that a parallelism path class will be closed under concatenation, restriction and reversal. Closure under continuous reparametrization is taken care of by the definition of a path as an equivalence class of curves.

A class $\mathcal{P}$ of paths in a base space $B$ for defining parallelism on a fibre bundle $(E, \pi, B)$ must be a partition of some set $\mathcal{C}$ of curves in $B$ such that the curves in any path are path-equivalent according to Definition 16.3.7. In other words, every path in $\mathcal{P}$ must be a non-empty set of path-equivalent curves in $\mathcal{C}$ and the paths must be pairwise disjoint. In particular, $\mathcal{C} = \bigcup\mathcal{P}$.
24.1.6 Definition: A parallelism path class for a topological fibre bundle $(E, \pi, B)$ is a set $\mathcal{P}$ of paths in $B$ such that
(i) all constant paths in $B$ are in $\mathcal{P}$; (idempotence)
(ii) for all paths $Q = [\gamma]_0 \in \mathcal{P}$, the reverse $-Q = [-\gamma]_0$ is in $\mathcal{P}$; (symmetry)
(iii) for all $Q_1, Q_2 \in \mathcal{P}$ with $T(Q_1) = S(Q_2)$, the concatenation of $Q_1$ with $Q_2$ is in $\mathcal{P}$. (transitivity)
A parallelism curve class is the set of curves $\mathcal{C} = \bigcup\mathcal{P}$ in a parallelism path class $\mathcal{P}$.

24.1.7 Remark: Examples of suitable parallelism path classes are the set of rectifiable paths in a Lipschitz manifold, the set of piecewise $C^k$ paths in a differentiable manifold for $k \ge 1$, and the set of piecewise linear paths in an affine space.

24.1.8 Remark: Just as a connection is defined on the differentiable structure of a differentiable fibre bundle (i.e. the differentiable atlas of the fibre bundle) rather than on the differentiable manifold itself, so a “parallelism” on a topological fibre bundle is defined on a “parallelism path class” rather than on the underlying topological space.

24.1.9 Definition: A (topological) fibre set parallelism for base points $p, q \in B$ in a topological $(G, F)$ fibre bundle $(E, \pi, B, A^F_E)$ is a structure-preserving fibre set map $\alpha : E_p \approx E_q$ of the form $\alpha = \beta^{-1}_{q,\phi_2} \circ L_g \circ \beta_{p,\phi_1}$ for some $g \in G$, $\phi_1 \in A^F_{E,p}$ and $\phi_2 \in A^F_{E,q}$. (That is, $\alpha \in \mathrm{Iso}_G(E_p, E_q)$ in terms of Notation 23.8.4.)

The (topological) fibre set parallelism space for a topological $(G, F)$ fibre bundle $(E, \pi, B, A^F_E)$ is the set $\bigcup_{p,q \in B} \mathrm{Iso}_G(E_p, E_q)$ of all topological fibre set parallelisms for the fibre bundle $(E, \pi, B, A^F_E)$.

24.1.10 Remark: Definition 24.2.2 is necessarily a little convoluted. In plain language, it means that a pathwise parallelism is a set of structure-preserving maps between the fibre sets of pairs of points of curves in the specified curve class $\mathcal{C}$. These parallelism maps are equivalent for curves which are path-equivalent, and they obey some basic rules of transitivity and symmetry.

Although general parallelism is not an absolute (i.e. path-independent) map between fibre sets, the restriction to paths is absolute. Within a path, every point in every fibre set has a unique association with a point in each other fibre set. (Recall that intersection points of paths are regarded as different points.) Therefore parallelism along a path may be formalized as an equivalence relation rather than as the maps of Definition 24.2.2. The functional representation is probably better for such tasks as differentiation though.

24.2. Pathwise parallelism on topological fibre bundles

24.2.1 Remark: See Notation 23.8.4 for the isomorphism sets $\mathrm{Iso}_G(E_p, E_q)$. Figure 24.2.1 illustrates some of the structures involved in the pathwise parallelism in Definition 24.2.2.
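Figure 24.2.1: Pathwise parallelism structure. (Labels: $\Theta^\gamma_{s,t} \in \mathrm{Iso}_G(E_p, E_q)$ between the fibre sets $E_p$ and $E_q$, with $p = \gamma(s)$, $q = \gamma(t)$, $\mathrm{Range}(\gamma) \subseteq B$ and $s, t \in I_\gamma \subseteq \mathbb{R}$.)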
24.2.2 Definition: A (topological) (pathwise) parallelism on a parallelism path class $\mathcal{P}$ for a topological $(G, F)$ fibre bundle $(E, \pi, B, A^F_E)$ is a map

$\Theta : \mathcal{C} \to \bigcup_{\gamma \in \mathcal{C}} \Bigl( I_\gamma \times I_\gamma \to \bigcup_{p,q \in B} \mathrm{Iso}_G(E_p, E_q) \Bigr)$,

where $\mathcal{C} = \bigcup\mathcal{P}$ and $I_\gamma = \mathrm{Dom}(\gamma)$ for $\gamma \in \mathcal{C}$, which satisfies the following:
(i) $\forall \gamma \in \mathcal{C},\ \mathrm{Dom}(\Theta^\gamma) = I_\gamma \times I_\gamma$;
(ii) $\forall \gamma \in \mathcal{C},\ \forall s, t \in I_\gamma,\ \Theta^\gamma_{s,t} \in \mathrm{Iso}_G(E_{\gamma(s)}, E_{\gamma(t)})$;
(iii) $\forall Q \in \mathcal{P},\ \forall \gamma \in Q,\ \forall I \subseteq \mathbb{R},\ \forall \beta \in C(I, I_\gamma),\ \forall s, t \in I,\ \gamma \circ \beta \in Q \Rightarrow \Theta^{\gamma\circ\beta}_{s,t} = \Theta^\gamma_{\beta(s),\beta(t)}$; (parametrization independence)
(iv) $\forall \gamma \in \mathcal{C},\ \forall s, t, u \in I_\gamma,\ \Theta^\gamma_{t,u} \circ \Theta^\gamma_{s,t} = \Theta^\gamma_{s,u}$; (transitivity)
(v) $\forall \gamma \in \mathcal{C},\ \forall s, t \in I_\gamma,\ \Theta^{-\gamma}_{-s,-t} = \Theta^\gamma_{s,t}$; (reversibility)
(vi) $\forall \gamma_1, \gamma_2 \in \mathcal{C},\ \gamma_1 \subseteq \gamma_2 \Rightarrow \Theta^{\gamma_1} \subseteq \Theta^{\gamma_2}$; (monotonicity)
(vii) $\forall \gamma \in \mathcal{C},\ \forall \phi_1, \phi_2 \in A^F_E$, $g^\gamma_{\phi_1,\phi_2}$ is continuous, where $g^\gamma_{\phi_1,\phi_2} : I_1 \times I_2 \to G$ with $I_k = I_\gamma \cap \gamma^{-1}(U_{\phi_k})$ for $k = 1, 2$ is defined by $\phi_2 \circ \Theta^\gamma_{s,t} = L_{g^\gamma_{\phi_1,\phi_2}(s,t)} \circ \phi_1\big|_{E_{\gamma(s)}}$ for all $\phi_1, \phi_2 \in A^F_E$ and $(s, t) \in I_1 \times I_2$. (continuity)

The notation $\Theta^\gamma_{s,t}$ means $\Theta(\gamma)(s, t)$, and $\Theta^\gamma$ means $\Theta(\gamma)$.

[ Would it be better to call the monotonicity condition (vi) in Definition 24.2.2 a “restriction independence” condition or something similar? ]
[ Try to show that for a PFB, $R_{g^\gamma_{\phi_1,\phi_2}(s,t)^{-1}}$ is a parallelism if $L_{g^\gamma_{\phi_1,\phi_2}(s,t)}$ is. See manuscript notes for proof. ]
[ Write out what the conditions of Definition 24.2.2 mean in terms of “coordinates” $g_{\gamma(s),\gamma(t)}$ with $\Theta^\gamma_{s,t} = L^{\gamma(s),\gamma(t)}_{g,\phi_1,\phi_2}$? ]

24.2.3 Remark: Definition 24.2.2 (iii) has the consequence that the parallelism map is the identity map along constant stretches of curves. That is, if $\beta : I \to I_\gamma$ is constant on $[a, b]$, then $\Theta^{\gamma\circ\beta}_{s,t} = \mathrm{id}_{E_{\gamma(\beta(s))}}$ for all $s, t \in [a, b]$.

If (iii) is applied twice in the case of a curve equivalence $\gamma_1 \circ \beta_1 = \gamma_2 \circ \beta_2 = \gamma_3$ with $\gamma_1, \gamma_2, \gamma_3 \in Q$, the result is $\Theta^{\gamma_1}_{\beta_1(s),\beta_1(t)} = \Theta^{\gamma_2}_{\beta_2(s),\beta_2(t)} = \Theta^{\gamma_3}_{s,t}$. This means that the definition of parallelism is independent of the curve used to represent a path. So parallelism depends only on the path, not on the parametrization.

24.2.4 Remark: The transitivity rule Definition 24.2.2 (iv) with $u = t$ implies an idempotence rule, namely that $\Theta^\gamma_{t,t} = \mathrm{id}_{E_{\gamma(t)}}$ for any $\gamma \in \mathcal{C}$ and $t \in I_\gamma$. (This uses the fact that $\Theta^\gamma_{s,t}$ is a bijection, so it may be cancelled from $\Theta^\gamma_{t,t} \circ \Theta^\gamma_{s,t} = \Theta^\gamma_{s,t}$.) Similarly, (iv) implies the rule $\Theta^\gamma_{t,s} = (\Theta^\gamma_{s,t})^{-1}$. These look like semigroup properties, but the maps $\Theta^\gamma_{s,t}$ are only isomorphisms, not automorphisms.

[ Since the maps $\Theta^\gamma$ are not semigroups, what are they? Is there a name for this sort of thing? ]

If the fibre set isomorphisms $\Theta^\gamma_{s,t}$ are known for a fixed $s \in I_\gamma$, then the isomorphisms for all other pairs may be calculated, since (iv) and the inverse rule give $\Theta^\gamma_{t,u} = \Theta^\gamma_{s,u} \circ (\Theta^\gamma_{s,t})^{-1}$ for all $t, u \in I_\gamma$.
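The last observation of Remark 24.2.4, that the maps $\Theta^\gamma_{s,t}$ for one fixed $s$ determine all the others, can be illustrated numerically. The following is a minimal sketch under stated assumptions: the bundle is taken to be trivial with a single chart, so that every $\Theta^\gamma_{s,t}$ can be represented by a $2 \times 2$ rotation matrix, and the base data $\Theta^\gamma_{s_0,t} = R(t^2)$ is an arbitrary hypothetical choice. None of the function names come from the book.

```python
import math

def rot(a):
    """2x2 rotation matrix by angle a, as nested tuples."""
    c, s = math.cos(a), math.sin(a)
    return ((c, -s), (s, c))

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def transpose(A):
    """For an orthogonal matrix, the transpose is the inverse."""
    return tuple(zip(*A))

S0 = 0.0  # the fixed base parameter s_0

def theta_from_base(t):
    """Hypothetical given data Theta_{S0,t} = R(t^2); the identity at t = S0."""
    return rot(t * t)

def theta(s, t):
    """Theta_{s,t} = Theta_{S0,t} o (Theta_{S0,s})^{-1}, from transitivity (iv)."""
    return matmul(theta_from_base(t), transpose(theta_from_base(s)))

def close(A, B, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

# Idempotence, transitivity and the inverse rule for the reconstructed maps.
assert close(theta(0.3, 0.3), rot(0.0))
assert close(matmul(theta(0.7, 1.1), theta(0.3, 0.7)), theta(0.3, 1.1))
assert close(theta(1.1, 0.3), transpose(theta(0.3, 1.1)))
```

The reconstructed family satisfies transitivity automatically, whatever base data is supplied, because the inner factors cancel in the product $\Theta_{t,u} \circ \Theta_{s,t}$.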
24.2.5 Remark: The reversibility rule, Definition 24.2.2 (v), together with the transitivity rule (iv), implies that the parallelism on a path is absolute. That is, it doesn't matter how a curve gets from one point of the path to another, it will always give the same parallelism from one point on the path to another. (When there are self-intersections, the different traversals through the same point are regarded as different points of the path.) In particular, if a curve starts at a point on a path and comes back to the same point, the result is the identity map. So the parallelism is “flat” because there are no closed paths for which the parallelism is a non-identity fibre set map. This absolute parallelism implies that the function $\Theta^\gamma$ may be replaced with a simple equivalence relation on the fibre sets over the base points of the curve $\gamma$.

24.2.6 Example: Condition (v) for Definition 24.2.2 does not follow from the other conditions. As a counterexample, consider a trivial $(G, F)$ fibre bundle $(E, \pi, B, A^F_E)$ with $G = O(2)$, $F = \mathbb{R}^2$, $B = \mathbb{R}$, $E = B \times F$, $\pi : (x, z) \mapsto x$, and $A^F_E = \{\phi\}$ with $\phi : (x, z) \mapsto z$. Define $\mathcal{C}$ to be the set of rectifiable curves $\gamma : I \to B$. Define a map $\Theta$ for this fibre bundle by $\Theta^\gamma_{s,t} = R(\alpha^\gamma_{s,t})$, where $R(\alpha^\gamma_{s,t})$ denotes rotation of the fibre sets (through the chart) by angle $\alpha^\gamma_{s,t} = \int_s^t |\gamma'(u)|\,du$ for $\gamma \in \mathcal{C}$. The interested student may verify that all of Definition 24.2.2 except condition (v) is satisfied by $\Theta$. (It's about time the other students did some work too. The interested student can't be expected to do everything!) If the rotation angles are replaced with $\alpha^\gamma_{s,t} = \int_s^t \gamma'(u)\,du = \gamma(t) - \gamma(s)$, then $\Theta$ satisfies all of the conditions of Definition 24.2.2. These examples are illustrated in Figure 24.2.2 for $\gamma : u \mapsto \sin u$, $s = 0$ and $t = \pi$.
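The two rotation-angle rules of Example 24.2.6 can be compared numerically for the curve $\gamma : u \mapsto \sin u$ with $s = 0$ and $t = \pi$. The following is a rough sketch, not a proof: the midpoint-rule integrator and all function names are illustrative assumptions, and the reversed curve is taken to be $(-\gamma)(u) = \gamma(-u)$, which is an assumption about the notation $-\gamma$ in condition (v).

```python
import math

def integrate(f, a, b, n=20000):
    """Signed midpoint-rule integral of f from a to b (negative if b < a)."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def dgamma(u):
    """Derivative of gamma(u) = sin(u)."""
    return math.cos(u)

def dgamma_rev(u):
    """Derivative of the reversed curve (-gamma)(u) = gamma(-u)."""
    return -math.cos(-u)

def angle_arclength(dg, s, t):
    """alpha_{s,t} = integral of |gamma'(u)| du (rule without reversibility)."""
    return integrate(lambda u: abs(dg(u)), s, t)

def angle_signed(dg, s, t):
    """alpha_{s,t} = integral of gamma'(u) du = gamma(t) - gamma(s)."""
    return integrate(dg, s, t)

s, t = 0.0, math.pi

a_arc = angle_arclength(dgamma, s, t)            # about 2
a_sgn = angle_signed(dgamma, s, t)               # about 0 = sin(pi) - sin(0)

# Condition (v) compares Theta^{-gamma}_{-s,-t} with Theta^{gamma}_{s,t}.
a_arc_rev = angle_arclength(dgamma_rev, -s, -t)  # about -2, so (v) fails
a_sgn_rev = angle_signed(dgamma_rev, -s, -t)     # about 0, so (v) holds

# Transitivity (iv) holds for both rules because the angles are additive.
mid = 1.0
additive_gap = abs(
    angle_arclength(dgamma, s, mid) + angle_arclength(dgamma, mid, t) - a_arc
)
```

With the arc-length rule the curve rotates the fibre by about 2 radians although it returns to its starting point, and the reversed curve rotates by about $-2$ radians, which is a different rotation. With the signed rule both angles vanish, in agreement with the example.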
x ∈ IR
γ αs,t =
t
s
|γ ′ (u)| du
γ αs,t =
Z
s
t
γ ′ (u) du = γ(t) − γ(s)
Definition of parallelism without and with reversibility
24.2.7 Remark: Definition 24.2.2 (vi) says that if γ1 is a subcurve of γ2 , then Θγ1 is a restriction of Θγ2 . γ1 γ2 Therefore Θ = Θ I ×I . γ1
γ1
If two curves γ1 and γ2 have a common portion γ0 so that γ0 ⊆ γ1 and γ0 ⊆ γ2 , then condition (vi) implies that the parallelism of γ1 and γ2 will be the same along the common portion γ0 . This means that any two curves passing through the same points will experience the same parallelism transformation, no matter how the curves differ elsewhere. This may be thought of as a “memoriless” property. The transformation experienced by a test particle moving along any portion of a curve is independent of anything that happens before (or after) it passes through that portion. This can be looked at in reverse. The opposite of a function restriction is a function extension. If a curve γ3 is the concatenation of curves γ1 and γ2 , then γ1 ⊆ γ3 and γ2 ⊆ γ3 . So both curves are subcurves of γ3 . (It follows by Definition 24.1.6 that γ3 ∈ C .) Therefore the parallelism map Θγs,t3 for s ∈ Iγ1 and t ∈ Iγ2 is 1 1 obtained as the composition of Θγs,b and Θγa22 ,t , where Iγk = [ak , bk ] for k = 1, 2. Thus Θγs,t3 = Θγs,b ◦ Θγa22 ,t . 1 1 So Definition 24.2.2 (vi) may be thought of as a concatenation rule. 24.2.8 Remark: The group element gφγ1 ,φ2 (s, t) in Definition 24.2.2 (vii) generally depends on the fibre charts φ1 and φ2 . If b = γ(s) = γ(t), one may choose a single chart φ = φ1 = φ2 . For such a closed curve γ portion, Θγs,t = Lbg,φ ∈ AutG (Eγ(s) ) with g = gφ,φ (s, t). Unfortunately, by Theorem 23.8.11 (v), the group element g depends on φ. A simple example of this is the tangent bundle of the sphere S 2 with the orthogonal group G = (O)(2) as the structure group. The parallel transport around a closed curve (with the standard parallelism definition) results in a rotation of the tangent space common initial and terminal point through some angle, α ∈ IR say. This angle is the same for all charts which have the same orientation, but the rotation angle is −α for a chart with the opposite orientation. [ www.topology.org/tex/conc/dg.html ]
[ Figure 24.2.2. Labels: Z, x ∈ ℝ. ]
[ Construct equivalence relations for fibre sets over a path from Definition 24.2.2. One can similarly construct a map $(t, y) \mapsto \Theta^\gamma_{s,t}(\beta^{-1}_{\gamma(s),\phi}(y))$ for $(t, y) \in I_\gamma \times F$ which yields a single “lift” of γ for each y ∈ F. Each lift is a curve in E. ]

[ Since (G, F) is an effective group, there is a one-to-one correspondence between the group elements and the parallelism maps between fibre sets at different points of a (G, F) fibre bundle. The group element is uniquely determined by the bijection through the charts. ]
24.3. Associated parallelism

24.3.1 Remark: Definition 24.3.2 shows how parallelism can be “ported” between a topological (G, F) fibre bundle and an associated topological $(G, \tilde F)$ fibre bundle. This concept is illustrated in Figure 24.3.1.
[ Figure 24.3.1. Porting parallelism between associated fibre bundles. Panels: (G, F) fibre bundle; $(G, \tilde F)$ fibre bundle. ]
This must be the real reason for defining associated fibre bundles. The idea is to achieve economy of specifications of parallelism by specifying it just once for one fibre bundle and then copying it to all associated fibre bundles. The prime example of this is where the parallelism on the tangent bundle of a differentiable manifold is re-used for all of the different kinds of tensor bundles on that manifold. Most of this chapter is intended as preparation for Definition 24.3.2.

24.3.2 Definition: The associated (topological) (pathwise) parallelism from a topological (G, F) fibre bundle $\xi = (E, \pi, B, A^F_E)$ to an associated topological $(G, \tilde F)$ fibre bundle $\tilde\xi = (\tilde E, \tilde\pi, B, A^{\tilde F}_{\tilde E})$ for a given topological pathwise parallelism Θ on a parallelism curve class C is the topological pathwise parallelism $\tilde\Theta$ on C which is defined by

$$\forall g \in G,\ \forall \gamma \in C,\ \forall s, t \in I_\gamma,\ \forall \phi_1 \in A^F_{E,\gamma(s)},\ \forall \phi_2 \in A^F_{E,\gamma(t)},\quad \Theta^\gamma_{s,t} = L^{\gamma(s),\gamma(t)}_{g,\phi_1,\phi_2} \iff \tilde\Theta^\gamma_{s,t} = L^{\gamma(s),\gamma(t)}_{g,\tilde\phi_1,\tilde\phi_2}, \qquad (24.3.1)$$

where $\tilde\phi_1 = h(\phi_1)$ and $\tilde\phi_2 = h(\phi_2)$ are the charts for $\tilde\xi$ which are associated with the charts φ1 and φ2 respectively for ξ via a topological fibre bundle association $h : A^F_E \to A^{\tilde F}_{\tilde E}$.

[ An alternative to (24.3.1) in Definition 24.3.2 might be $L_g \equiv R^{\gamma(s),\gamma(t)}_{g^{-1},\tilde\phi_1,\tilde\phi_2}$? ]
24.3.3 Remark: Definition 24.3.2 is illustrated in Figure 24.3.2. The most important thing to focus on in this cluttered diagram is the equality $\tilde g = g$. This means that for matching (i.e. associated) charts, the parallelism is “coordinatized” by the same group element, regarding the fibre charts as a kind of coordinatization of the space of all permitted isomorphisms of the fibre space.

[ Maybe do another diagram like Figure 24.3.2 with $R_{\tilde g}$, $\tilde g = g^{-1}$? ]
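[ Figure 24.3.2. Associated topological pathwise parallelism. The diagram shows the curve γ in B from γ(s) to γ(t), the fibre sets $E_{\gamma(s)}$ and $E_{\gamma(t)}$ with projection π, the fibre sets $\tilde E_{\gamma(s)}$ and $\tilde E_{\gamma(t)}$ with projection $\tilde\pi$, the fibre charts φ1, φ2 with the associated charts $\tilde\phi_1 = h(\phi_1)$, $\tilde\phi_2 = h(\phi_2)$, and the maps
$\Theta^\gamma_{s,t} = L^{\gamma(s),\gamma(t)}_{g,\phi_1,\phi_2} = \beta^{-1}_{\gamma(t),\phi_2} \circ L_g \circ \beta_{\gamma(s),\phi_1}$ and
$\tilde\Theta^\gamma_{s,t} = L^{\gamma(s),\gamma(t)}_{\tilde g,\tilde\phi_1,\tilde\phi_2} = \tilde\beta^{-1}_{\gamma(t),\tilde\phi_2} \circ L_{\tilde g} \circ \tilde\beta_{\gamma(s),\tilde\phi_1}$, with $\tilde g = g$. ]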
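Recall from Notation 23.8.15 that $L^{\gamma(s),\gamma(t)}_{g,\phi_1,\phi_2} = \beta^{-1}_{\gamma(t),\phi_2} \circ L_g \circ \beta_{\gamma(s),\phi_1}$ : F ≈ F, and $L^{\gamma(s),\gamma(t)}_{g,\tilde\phi_1,\tilde\phi_2} = \tilde\beta^{-1}_{\gamma(t),\tilde\phi_2} \circ L_g \circ \tilde\beta_{\gamma(s),\tilde\phi_1}$ : $\tilde F \approx \tilde F$, where $\beta_{b,\phi}$ denotes $\phi\big|_{\pi^{-1}(\{b\})}$ and so forth.

Using the notation of Definition 24.2.2 (vii), the parallelism association in expression (24.3.1) may be formulated as the equation $g^\gamma_{\phi_1,\phi_2} = \tilde g^\gamma_{\tilde\phi_1,\tilde\phi_2}$ for all associated charts $\phi_1 \leftrightarrow \tilde\phi_1$ and $\phi_2 \leftrightarrow \tilde\phi_2$. The group elements $g^\gamma_{\phi_1,\phi_2}(s,t)$ and $\tilde g^\gamma_{\tilde\phi_1,\tilde\phi_2}(s,t)$ may be thought of as the coordinates in G of the fibre set isomorphisms $\Theta^\gamma_{s,t}$ and $\tilde\Theta^\gamma_{s,t}$ respectively with respect to the corresponding fibre charts.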
[ Could perhaps also define associated parallelism when the charts of the fibre bundles are not exactly associated? $\tilde\Theta^\gamma_{s,t} = L^{\gamma(s),\gamma(t)}_{\tilde g,\tilde\phi_1,\tilde\phi_2} = L^{\gamma(s),\gamma(t)}_{\tilde g',\tilde\phi_1',\tilde\phi_2'}$, where $\tilde g'$ = g. . . Use Theorem 23.8.17 (iii). ]
24.3.4 Remark: There are many things which are the same in the two associated fibre bundles in Definition 24.3.2. These include the structure group G, the base space B, and the curve class C. Thus both parallelisms Θ and $\tilde\Theta$ are defined for the same values of γ, s and t, and their values are left translations through the charts by the same group element g ∈ G for each curve γ ∈ C. The difference is that these left translations are for different fibre spaces.

24.3.5 Remark: It doesn’t seem to be possible to define associated parallelism without the use of fibre charts because fibre bundle associations can only be defined in terms of fibre charts. Therefore, as with all definitions which are constructed with charts, it must be verified that the definition is chart-independent. This is done in Theorem 24.3.6.

[ Must also show that $\tilde\Theta$ is a parallelism. ]

[ See manuscript notes about Theorem 24.3.6. ]

24.3.6 Theorem: The associated parallelism in Definition 24.3.2 is chart-independent.

Proof: It must be shown that the isomorphism $\tilde\Theta^\gamma_{s,t} = L^{\gamma(s),\gamma(t)}_{g,\tilde\phi_1,\tilde\phi_2}$ with $g = \tilde g^\gamma_{\tilde\phi_1,\tilde\phi_2}(s,t)$ is independent of the choice of fibre charts. The original parallelism Θ is automatically chart-independent because the group elements g are defined in terms of Θ rather than the other way around. Chart-independence for Θ means that the group elements $g^\gamma_{\phi_1,\phi_2}(s,t) \in G$ obey the rule

$$g^\gamma_{\phi_1',\phi_2'}(s,t) = \bar g_{\phi_2',\phi_2}(\gamma(t))\, g^\gamma_{\phi_1,\phi_2}(s,t)\, \bar g_{\phi_1,\phi_1'}(\gamma(s)),$$

where the functions $\bar g_{\phi,\phi'} : U_\phi \cap U_{\phi'} \to G$ denote the fibre chart transition functions for charts $\phi, \phi' \in A^F_E$. This follows from the rules for change of fibre charts for the fibre set isomorphisms $L^{b_1,b_2}_{g,\phi_1,\phi_2}$. (See Definition 23.8.17 (iii).) The same rule must be shown to apply for $\tilde\Theta$; that is, it must be shown that

$$\tilde g^\gamma_{\tilde\phi_1',\tilde\phi_2'}(s,t) = \bar g_{\tilde\phi_2',\tilde\phi_2}(\gamma(t))\, \tilde g^\gamma_{\tilde\phi_1,\tilde\phi_2}(s,t)\, \bar g_{\tilde\phi_1,\tilde\phi_1'}(\gamma(s)).$$
[ Show how to define associated parallelism for orbit-space associated fibre bundles? ]
By Definition 24.3.2, $\tilde g^\gamma_{\tilde\phi_1',\tilde\phi_2'}(s,t) = g^\gamma_{\phi_1',\phi_2'}(s,t)$ and $\tilde g^\gamma_{\tilde\phi_1,\tilde\phi_2}(s,t) = g^\gamma_{\phi_1,\phi_2}(s,t)$. From Definition 23.10.3 (ii) for fibre bundle associations, it follows that $\bar g_{\tilde\phi_2',\tilde\phi_2}(\gamma(t)) = \bar g_{\phi_2',\phi_2}(\gamma(t))$ and $\bar g_{\tilde\phi_1,\tilde\phi_1'}(\gamma(s)) = \bar g_{\phi_1,\phi_1'}(\gamma(s))$. Substituting these four equalities into the rule for Θ yields the required rule for $\tilde\Theta$. So everything works out very nicely. (This is not a coincidence. It’s all been rigged!)

[ Parallelism can also be ported between OFBs and PFBs as a natural extension of Definition 24.3.2. Show how parallelism is ported to associated fibre bundles which are constructed by the orbit space (skew product) and identification space constructions. ]

[ Should also make some comments on parallelism on topological PFBs. ]

[ Define a kind of “covariant derivative” as the change of fibre minus the parallel change. ]

[ Show the relation to right translations on PFBs. Maybe $L^b_{g,\phi} \equiv R^b_g$ for PFBs? (Closed curves??) And perhaps $L^{b_1,b_2}_{g,\phi_1,\phi_2} \equiv R^{b_1,b_2}_{g,?,?}$? ]

[ If (E, π, B) is a PFB, try to show that $R^{\gamma(s),\gamma(t)}_{g^{-1},\tilde\phi_1,\tilde\phi_2}$ gives an associative parallelism. ]
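The chart-change rule in the proof of Theorem 24.3.6 can be tried out on a small integer model. This is a hedged sketch rather than the book's construction. It assumes that each fibre chart restricted to a fibre is an integer 2×2 matrix with determinant ±1, that the transition function is new chart · old chart⁻¹, and that the associated fibre space is the space of 2×2 tensors on which g acts by T ↦ g T gᵀ. All function and variable names are hypothetical.

```python
def mul(a, b):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inv(a):
    """Exact inverse of a 2x2 integer matrix with determinant +1 or -1."""
    (p, q), (r, s) = a
    det = p * s - q * r
    if det not in (1, -1):
        raise ValueError("determinant must be +1 or -1")
    return ((s * det, -q * det), (-r * det, p * det))

def transpose(a):
    return tuple(tuple(a[j][i] for j in range(2)) for i in range(2))

def group_element(chart_s, chart_t, theta):
    """Coordinates of theta in G: chart_t . theta . chart_s^{-1} (assumed)."""
    return mul(mul(chart_t, theta), inv(chart_s))

def act(m, tensor):
    """Assumed action on the associated fibre space: T -> m T m^T."""
    return mul(mul(m, tensor), transpose(m))

def transport_tilde(chart_s, chart_t, g, tensor):
    """Associated transport: chart down, left-translate by g, chart back up."""
    return act(inv(chart_t), act(g, act(chart_s, tensor)))

theta = ((2, 1), (1, 1))     # transport between the two fibres (made up)
beta1 = ((1, 1), (0, 1))     # chart at gamma(s)
beta2 = ((1, 0), (2, 1))     # chart at gamma(t)
h1 = ((0, -1), (1, 0))       # transition to a new chart at gamma(s)
h2 = ((1, 3), (0, 1))        # transition to a new chart at gamma(t)

beta1_new = mul(h1, beta1)
beta2_new = mul(h2, beta2)

g = group_element(beta1, beta2, theta)
g_new = group_element(beta1_new, beta2_new, theta)
rule = mul(mul(h2, g), inv(h1))   # right-hand side of the chart-change rule

tensor = ((1, 2), (3, 4))
old_charts = transport_tilde(beta1, beta2, g, tensor)
new_charts = transport_tilde(beta1_new, beta2_new, g_new, tensor)
```

Under these assumptions g_new agrees with rule, and the associated transport of the sample tensor is the same whichever pair of charts is used, which is the chart-independence that the theorem asserts.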
24.4. Other topics

This section is a holding bay for some other topological parallelism topics, such as holonomy groups and topological generalizations of curvature and Stokes Theorems.

[ Remark 24.4.1 about “homotopy continuity” (not a standard definition) of pathwise parallelism is purely experimental. ]
[ Does “homotopy continuity” follow automatically in the compact-open topology (or some other natural topology) from the continuity condition in Definition 24.2.2? ]

[ Should try to put a topology on $\mathrm{Iso}_G(E) = \bigcup_{p,q \in E} \mathrm{Iso}_G(E_p, E_q)$ so that continuity can be defined for variations of γ where the endpoints are variable. This is related somehow to the task of differentiating a connection twice on geodesics. So maybe will need a differentiable structure on $\mathrm{Iso}_G(E)$. This space could be interpreted as some sort of “double fibre bundle”. ]

[ Define “pathwise curvature” as the map from the set of paths which start and finish at a common point p ∈ M to the set of automorphisms of $E_p$. The value of the curvature is the map $\Theta^a_b : E_p \to E_p$, where $\Theta^s_t$ is the “pathwise parallelism” on the fibre bundle. Then must show that curvature is additive in the sense that if two closed paths are added (in some algebraic topology sense – see Ahlfors [93], page 137, for formal sums of curves; see EDM2 [34], 94.D, for additivity of integrals on contours; see EDM2 [34], 80.D, for holonomy groups) to make another closed path, then the curvature value for the path sum is the sum of the curvature values. This kind of additivity has some relation to the Stokes Theorem. Consider the case of complex analysis, where paths not enclosing poles have absolute parallelism whereas with a pole in the middle, the pathwise curvature is non-zero. ]

[ It seems clear that the pathwise curvature will be independent of the choice of start/end point on a path. Will have a curvature map κ : C → G or κ : P → G. Then g = κ(γ) will be the solution of $\Theta^\gamma_{s,s} = \beta^{-1}_{b,\phi} \circ L_g \circ \beta_{b,\phi}$. Unfortunately, this will probably depend on the fibre chart φ. Must look into using a principal fibre bundle and then writing $\Theta^\gamma_{s,s} = R_g : E_b \approx E_b$, or something like that. ]

[ Try to use the chart-invariance of fibre coordinates/components of Θ for a closed path to show that κ(γ) is a chart-independent group element $g^\gamma_{\phi_1,\phi_1}(s,t)$. This is not true, but try to prove something like this anyway. ]

[ Definition 24.4.3 has been temporarily abandoned. It may be revived some day. The idea is that parallelism is an equivalence relation defined on a “path bundle”, which is a sort of wormhole through the fibre bundle which carries fibre orientation from one point to another. It doesn’t seem to be needed for defining anything yet, but there could be some use for it. ]
24.4.1 Remark: It is reasonable to expect that as a path is continuously varied, the parallel transport along that path should vary continuously. This can be made meaningful in terms of homotopy. Let (E, π, B) be a topological fibre bundle. A homotopy from a curve γ0 to a curve γ1 may be defined as a map γ : [0, 1] × [0, 1] → B such that γ(s, 0) = γ(0, 0) and γ(s, 1) = γ(0, 1) for all s ∈ [0, 1]. Denote by $\gamma_s : [0, 1] \to B$ the curve defined by $\gamma_s : t \mapsto \gamma(s, t)$. If each curve $\gamma_s$ is a suitable curve for a pathwise parallelism Θ on (E, π, B), then Θ may be said to be “homotopy continuous” if the map $s \mapsto \Theta^{\gamma_s}_{0,1}(z)$ is a continuous map from [0, 1] to $E_{\gamma(0,1)}$ for each $z \in E_{\gamma(0,0)}$.
24.4.2 Remark: The “curve bundle” and “path bundle” definitions are non-standard. They may be thought of as wormholes transporting fibre orientation from point to point of a fibre bundle. Definition 24.4.3 defines curve bundles as parametrized families of points in the fibre bundle. The parametrization distinguishes multiple points of self-intersecting curves. Because of possible self-intersections, a curve bundle is not quite the same thing as a subbundle constructed by restricting the base space to the image of the curve.

24.4.3 Definition: The curve bundle over a never-constant curve γ : I → B in a (G, F) topological fibre bundle $(E, \pi, B, A^F_E)$ is the (G, F) topological fibre bundle $(E_\gamma, \pi_\gamma, B_\gamma, A^F_{E_\gamma})$ defined by
(i) $E_\gamma = \{(t, z);\ t \in I,\ z \in E_{\gamma(t)}\} \subseteq I \times E$;
(ii) $\pi_\gamma : (t, z) \mapsto (t, \pi(z)) = (t, \gamma(t))$ for t ∈ I;
(iii) $B_\gamma = \{(t, \gamma(t));\ t \in I\} = \gamma \subseteq I \times B$;
(iv) $A^F_{E_\gamma} = \{\tilde\phi;\ \phi \in A^F_E\}$, where the chart $\tilde\phi : U_{\tilde\phi} \to F$ is defined for each $\phi \in A^F_E$ by $U_{\tilde\phi} = \{(t, z) \in E_\gamma;\ \gamma(t) \in \mathrm{Dom}(\phi)\}$ and $\tilde\phi : (t, z) \mapsto \phi(z)$;
(v) The topology $T_{E_\gamma}$ on $E_\gamma$ is the weak topology induced by the maps $\pi_\gamma \times \tilde\phi$ for $\tilde\phi \in A^F_{E_\gamma}$.
(vi) The topology $T_{B_\gamma}$ on $B_\gamma$ is the strong topology induced by the map $\pi_\gamma$.
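The way the parametrization in Definition 24.4.3 (iii) separates the two passes of a curve through a self-intersection can be seen in a small sketch. The curve used here, a nodal cubic in the plane, and the sample parameter values are assumptions chosen for illustration only. The names gamma, image and base are hypothetical.

```python
from fractions import Fraction

def gamma(t):
    """Nodal cubic: a never-constant curve with gamma(-1) == gamma(1)."""
    return (t * t - 1, t * (t * t - 1))

# A finite sample of the parameter interval I = [-2, 2], in exact arithmetic.
params = [Fraction(k, 4) for k in range(-8, 9)]

# Sampled image of the curve in B: the two passes through (0, 0) merge.
image = {gamma(t) for t in params}

# Sampled base space of the curve bundle: pairs (t, gamma(t)) stay distinct.
base = {(t, gamma(t)) for t in params}
```

The sampled image has one point fewer than the sampled base space, because the image identifies the two parameter values t = −1 and t = 1 while the pairs (t, γ(t)) do not.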
24.4.4 Remark: It is difficult to give a simple definition of curve bundles for sometimes-constant curves because constant stretches of curves are really a kind of self-intersection, and the topology and charts for intersections should distinguish between separate intersections over a point, but constant stretches should be regarded as a single point. The curve bundle topology and charts for sometimes-constant curves may be defined with reference to equivalent never-constant curves. The topology defined for curve bundles in Definition 24.4.3 is equivalent to that of the trivial bundle I × F.
[ Define a “parallel cross-section” along a path bundle or curve bundle. Also define parallel transport and the lift of a path or curve through a given z ∈ Ep . Parallel cross-sections may also be expressed as equivalence relations which contain the same information. ]
Part II
Differential geometry
Chapter 25 Overview of differential geometry layers
25.1 Layer 0: Set theory (points)
25.2 Layer 1: Topology (connectivity and continuity)
25.3 Layer 2: Differentiable structure (charts and vectors)
25.4 Tensors and differential forms
25.5 Layer 3: Affine connection (parallelism at a distance)
25.6 Layer 4: Riemannian metric (distance and angles)
25.7 Pseudo-Riemannian metric (hyperbolic distance)
[ This chapter has not yet been written. This is just a preliminary sketch. It can only be written when the main part of the book is finished. ] 25.0.1 Remark: This chapter will be a preview of the principal concepts and buzz-words of differential geometry. By taking this quick tour or “executive summary”, the reader will not need to wade through several hundred pages to discover the highlights of the book. The five layers of differential geometry are outlined in Section 1.1. (See also Hermann Weyl’s “three-storey” model, Remark 26.1.9.)
25.1. Layer 0: Set theory (points)

25.1.1 Remark: In layer 0, there are points or events. A point is a location in space. An event is a combination of a point and a time which locates the event in time. Points (and events) are represented mathematically as elements of sets. (In the following, the word “point” includes the meaning “event”.) The only property of a point is its location. There is no association at all between points in layer 0. The points are independent of each other. The only relation between points is the equality relation. In other words, given any two points P and Q, it is possible to say if the points are equal (P = Q) or not equal (P ≠ Q). There is no distance relation. So there is no distinction between points which are close to P and points which are distant from P. A set of points can be counted because the equality relation is well defined on any set. Counting a set or subset can be achieved by labelling points with numbers as illustrated in Figure 25.1.1.

[ Figure 25.1.1. Layer 0: Points (or events). Ten scattered points labelled 0 to 9. ]

Although the points in this figure are drawn on 2-dimensional paper, pure points do not have coordinates of any kind. All we can do with pure points is label and group them. Despite this lack of attributes, a lot of mathematics requires no point attributes at all. Basic set theory is presented in Chapters 3 to 8.
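Counting by labelling, as described in Remark 25.1.1, can be sketched in a few lines. This is an illustration under assumptions, not part of the book's development: points are modelled as Python strings, the labelling relies only on equality (through hashing), and the standard n-element comparison set is taken to be the von Neumann ordinal n, which is the set quoted for n = 3 in Remark 25.2.9. The names ordinal and label are hypothetical.

```python
def ordinal(n):
    """Von Neumann ordinal n, built from the empty set by successor steps."""
    result = frozenset()
    for _ in range(n):
        result = result | frozenset([result])   # successor: x union {x}
    return result

def label(points):
    """Attach labels 0, 1, 2, ... to distinct points; repeats get no new label."""
    labels = {}
    for p in points:
        if p not in labels:        # only the equality relation is consulted
            labels[p] = len(labels)
    return labels

points = ["P", "Q", "R", "Q"]      # the point Q is listed twice
labels = label(points)
three = ordinal(3)
```

Here the labelling is a bijection from the three distinct points to the labels 0, 1, 2, and ordinal(3) has the same number of elements, so the set of points has cardinality 3.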
25.2. Layer 1: Topology (connectivity and continuity)

25.2.1 Remark: Layer 1 adds topological structure to simple point sets. Topology may be thought of as a kind of glue which holds neighbouring points together. Otherwise points would have no association with their neighbours at all.

25.2.2 Remark: The fundamental concepts of topology are connectivity of sets and continuity of functions. The simpler concept to understand intuitively is connectivity of sets. Topological structure is defined in terms of “open neighbourhoods” of points.

25.2.3 Remark: An “open set” is a set which has no boundary. This means that all points of an open set are “interior points”, which means that they are surrounded by other points of the same set. You can get an intuitive idea of what this means by removing the national border lines from a map of the world, leaving only a zero-width space between countries. Then all the remaining points inside countries are interior points surrounded only by other points of the same country. You might say that the remaining points inside the boundaries still have an “edge”, but this is only because the map has finite resolution. If you consider the boundary to have a thickness of exactly zero metres, then no matter how close you get to the boundary, you will still be within one country or another. Only when you are exactly on the boundary will you have “neighbours” in more than one country. (The idea of an open set is related to Zeno’s paradox of Achilles and the tortoise.)

25.2.4 Remark: In topology, it is considered that every point x is in the interior of one or more neighbourhoods Nx of x. Figure 25.2.1 illustrates two sets of points. Each point x ∈ S1 and y ∈ S2 is surrounded by a single circular neighbourhood Nx or Ny in this diagram, but points may have any number of neighbourhoods of any size and shape. The fact that the two sets of points S1 and S2 in Figure 25.2.1 are disconnected from each other is clear from the fact that each point-set S1 and S2 can be covered by neighbourhoods which exclude the neighbourhoods of the other set of points. That is, the intersection Nx ∩ Ny of neighbourhoods Nx and Ny is empty for all x ∈ S1 and y ∈ S2.

[ Figure 25.2.1. Disjoint neighbourhoods of individual points of disconnected sets. Labels: x, Nx, y, Ny, Nx ∩ Ny = ∅. ]

The neighbourhoods of the sets S1 and S2 may be joined into combined neighbourhoods Ω1 and Ω2 which contain all of the respective points in the interiors. This is illustrated in Figure 25.2.2. The combined neighbourhoods are $\Omega_1 = \bigcup_{x \in S_1} N_x$ and $\Omega_2 = \bigcup_{y \in S_2} N_y$ respectively. Clearly S1 is in the interior of Ω1 and S2 is in the interior of Ω2, and the intersection Ω1 ∩ Ω2 is empty because all of the neighbourhood pairs Nx and Ny have an empty intersection.

[ Figure 25.2.2. Disjoint combined neighbourhoods of disconnected sets. Labels: $\Omega_1 = \bigcup_{x \in S_1} N_x$, $\Omega_2 = \bigcup_{y \in S_2} N_y$, Ω1 ∩ Ω2 = ∅. ]
Note that the sets Ω1 and Ω2 are not associated with any particular individual point. The subject of topology is much simpler as a logical discipline if the open sets are not associated with particular points. Instead of dealing with an infinite number of little neighbourhoods around an infinite number of points, we can deal instead with a much smaller number of combined neighbourhoods around combined sets of points. This leads to a way of thinking that looks more like Figure 25.2.3.

25.2.5 Remark: The “open set” concept is an efficient abstraction of the per-point “open neighbourhood” concept. The gain in logical efficiency has the disadvantage of a loss of intuitive directness. In fact, in applications one generally focuses on per-point neighbourhoods rather than abstract open sets. In this book, Top(X) denotes the set of open sets of a set X and $\mathrm{Top}_x(X) = \{\Omega \in \mathrm{Top}(X);\ x \in \Omega\}$ denotes the set of open neighbourhoods of a point x ∈ X. In practice, the pointwise sets $\mathrm{Top}_x(X)$ are generally more useful than the abstract set Top(X). It is important to be able to change focus easily between the per-point neighbourhoods in $\mathrm{Top}_x(X)$ and the per-set neighbourhoods in Top(X). In terms of abstract open sets, a set is defined to be disconnected if it can be covered by two disjoint open sets Ω1 and Ω2 which each contain at least one point of the set. Otherwise the set is connected. In other words, a set is connected if and only if it has no “gaps” which separate the set into two or more components. The set illustrated in Figure 25.2.3 is disconnected.

[ Figure 25.2.3. Definition of disconnectedness in terms of disjoint open set covering. Labels: Ω1, Ω2. ]

This fact is determined by first specifying the set of all open neighbourhoods in the topological structure of the point space. (A different choice of neighbourhoods would give a different classification of sets into connected and disconnected.) In Figure 25.2.3, there are two sets Ω1 and Ω2 which “cover” the set of points. In other words, all points of the set are inside at least one of the two sets. In this case, the neighbourhoods are disjoint. This is the definition of disconnectedness of a set. A set X is disconnected if there are two disjoint neighbourhoods which cover X and there is at least one point of X in each of the two neighbourhoods. These neighbourhoods effectively “disconnect” the set into two non-empty components. (See also Figure 15.4.1.) The ability to disconnect points and sets of points from each other is the fundamental task of open sets. A topology is formally defined to be a set of open sets.

25.2.6 Remark: Chapter 14 presents the basic concepts of topology. One might ask how the set of open sets is chosen. The choice is quite arbitrary, but must satisfy some rules to keep the definitions self-consistent. In practice, each commonly used space has a small number of usual topologies which are commonly defined on that space. If the set of open neighbourhoods is reduced by removing some open neighbourhoods, this tends to reduce the ability to disconnect sets into two components. It follows that more sets are then defined to be connected as the set of neighbourhoods is reduced. Similarly, if the set of neighbourhoods is augmented by adding new neighbourhood sets, this makes it easier to disconnect sets into two portions. Then you tend to have fewer connected sets and more disconnected sets. Roughly speaking, the bigger the topology is, the smaller the set of connected sets is.

25.2.7 Remark: Continuity of a function may be defined in terms of connectivity. A function is continuous if and only if its inverse preserves disconnectedness. (There is a minor technicality that the range of the function must be within a “normal space”, but as the name suggests, this is a fairly weak requirement. Normal spaces are defined in Section 15.2.) Most texts do not define continuity of functions in terms of connectedness, but this style of definition is presented in Section 15.5 as an alternative. It is useful to just know that continuity can be defined as the preservation of set disconnectedness by the inverse of a function rather than preservation of set openness. Continuous functions can be used to define connectedness of sets. Thus the concepts of connectivity and continuity are “equipotent”. If you know which sets are connected, you can determine which functions are continuous. If you know which functions are continuous, you can determine which sets are connected.
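The definition of disconnectedness by a pair of disjoint open sets, and the observation that a bigger topology leaves fewer connected sets, can be tried out on a finite example. The sketch below is an illustration only. It assumes a four-point space with two hand-picked topologies, and the function name is_disconnected is hypothetical.

```python
from itertools import combinations

def is_disconnected(subset, topology):
    """True if two disjoint open sets cover the subset and both meet it."""
    s = frozenset(subset)
    for u, v in combinations(topology, 2):
        if u & v:
            continue                      # the two open sets must be disjoint
        if s.issubset(u | v) and (s & u) and (s & v):
            return True
    return False

X = frozenset({0, 1, 2, 3})

# The smallest topology on X: only the empty set and X itself are open.
coarse_topology = [frozenset(), X]

# A bigger topology on X, obtained by adding two more open sets.
fine_topology = [frozenset(), frozenset({0, 1}), frozenset({2, 3}), X]

S = {0, 2}
connected_in_coarse = not is_disconnected(S, coarse_topology)
connected_in_fine = not is_disconnected(S, fine_topology)
```

With the coarse topology the set S cannot be split, so it counts as connected. After the open sets {0, 1} and {2, 3} are added, the same set S becomes disconnected, while a set such as {0, 1} remains connected.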
25.2.8 Remark: It is disconcerting that real physical geometry does not correspond to the properties of topological spaces. For example, if a zero-diameter single point is removed from a physical plane figure, it is impossible to detect this. No measurement apparatus has a zero resolution. If a point is removed from the set $\mathbb{R}^2$, the topology is significantly altered. For a physical set, there is no way to determine whether the set is open or closed because a zero-thickness boundary is undetectable. All physical sets have fuzzy boundaries. However, this is not a serious problem. It must never be forgotten that mathematical models and physical systems occupy different universes. Topology applies to abstract mathematical models, not to observations of real physical systems. Real points have positive width. Real boundaries have positive thickness. The advantage of zero-diameter points and zero-thickness boundaries is that they avoid all arguments about the minimum discernible point width or boundary thickness, which depend very much on the quality of the measuring equipment and the nature of the experiment.

25.2.9 Remark: In layer 0, it is possible to count sets. The cardinality of sets is determined by defining an equivalence relation on all sets; then saying that two sets have the same cardinality if they are equivalent. Two sets A and B have the same cardinality (i.e. number of elements) if and only if there is a bijection h : A → B. By analogy, in layer 1, two sets A and B are said to have the same topology if there is a homeomorphism h : A → B. A homeomorphism is defined as a bijection h : A → B with the extra restriction that both h and its inverse $h^{-1}$ : B → A map open sets to open sets. In other words, a homeomorphism preserves not only the number of elements in subsets, but also the property of openness of subsets. Since all topological properties, such as connectivity and continuity, are defined in terms of the topology, two sets with equivalent topology must also have equivalent connectivity and continuity properties. This reduces work because every topological fact that is known about one topological space is automatically true for all equivalent topological spaces. In layer 0, we can classify all sets in equivalence classes according to their numbers of elements. This is achieved in practice by defining a wide class of sets called “ordinal numbers” which are used as standard sets for comparison with other sets. Thus a set A has 3 elements if and only if there is a bijection h : A → B, where B is the standard 3-element set (which happens to be the set 3 = {∅, {∅}, {∅, {∅}}}). In layer 1, the classification of sets according to their topology is one of the greatest preoccupations. It is not possible to classify topologies in a totally ordered sequence as in the case of cardinality. Algebraic topology is concerned with the calculation of algebraic topological attributes of topological spaces. Considerable research is concerned with determining sufficient conditions which guarantee the topological equivalence (i.e. the existence of a homeomorphism) between sets. The Poincaré conjecture is just one famous example of a research question which attempts to uniquely determine the topology of a set in terms of specified attributes.

25.2.10 Remark: If the reader has the impression that Figures 25.2.1, 25.2.2 and 25.2.3 resemble diagrams of small multi-celled animals, this would not be an entirely unjustified impression. Sets of points are analogous to single-celled organisms whereas multi-celled organisms have a topological structure which indicates the connectivity of the cells. When large numbers of cells are present, it makes better sense to focus on the whole organism rather than the individual cells.

25.3. Layer 2: Differentiable structure (charts and vectors)

25.3.1 Remark: Vectors specify direction at each point of a set. This enables you to determine the rates of change of functions in various directions. Vectors are defined with the aid of local coordinate charts as illustrated in Figure 25.3.1. There is a lot of freedom in the choice of the local charts. For example, if the charts are rotated through any angle, the coordinates attached to points are changed, but the property of differentiability of a function with respect to points “under the chart” is not changed by such a rotation. Any local diffeomorphism of a local chart leaves the differentiability properties unaltered.
[ Figure 25.3.1. Layer 2: Charts and vectors. ]
25.3.2 Remark: There is no correspondence between the direction of a vector at one point and a vector at any other point. This is because of the total freedom of choice of local coordinate orientation. Even if two vectors at different points seem to have the same direction for a particular choice of coordinates, the directions will generally be different for other orientations of the charts. 25.3.3 Remark: The local charts in Layer 2 define a canonical topology on the set of points. This is why charts are in a higher layer than the topology. The choice of a set of local charts for a set is called a “differentiable structure”.
25.3.5 Remark: A “differential” at a point on a manifold is a map from the linear space of vectors at that point to some other linear space. The simplest kind of differential map maps the tangent vectors at a point to the 1-dimensional linear space $\mathbb{R}$. Differentials are useful for defining the rates of change of functions on a manifold. The rate of change is a linear function of the tangent vectors.

25.3.6 Remark: The core fact about the differential layer is the Gauß-Green theorem, which is also known as the Stokes theorem, the Stokes formula and the Green-Stokes formula. This theorem combines local and global concepts, multi-variable derivatives, geometric measure theory and algebraic topology. An operator called the “exterior derivative” maps the differential forms used for integration (line elements, area elements and volume elements) to the differential forms for their boundaries; hence the adjective “exterior”. The integral of the exterior derivative of a differential form over a region equals the integral of the form itself over the boundary of the region. This provides a very powerful tool for converting between pointwise equations in the interior of a region and integrals over the boundary of the region. This is very important, for example, in electromagnetism (Maxwell’s equations). The Stokes theorem is usually written in the following deceptively simple way:

$$\int_C d\omega = \int_{\partial C} \omega,$$

for any singular $r$-chain $C$ and differential form $\omega$ of degree $r - 1$. Unravelling the meaning of this formula reveals a vast network of concepts. The Stokes theorem is the culmination of the development of the “mathematical machinery” in the differential layer.

25.3.7 Remark: Differentiable structures are defined in Chapter 27. Vectors on manifolds are defined in Chapters 28, 29 and 30. Differentials on manifolds are defined in Chapters 31 and 32.
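The Stokes formula of Remark 25.3.6 can be checked numerically in the simplest planar case. The following is only a sketch under stated assumptions: the region C is taken to be the unit square, the 1-form ω = P dx + Q dy with P = −y² and Q = xy is an arbitrary illustrative choice, and the function names are hypothetical. For this ω, the exterior derivative is dω = (∂Q/∂x − ∂P/∂y) dx ∧ dy = 3y dx ∧ dy.

```python
# Numerical check of the Stokes formula on the unit square [0,1] x [0,1].
# Assumed example form: omega = P dx + Q dy with P = -y^2 and Q = x*y.

def P(x, y):
    return -y * y

def Q(x, y):
    return x * y

def d_omega(x, y):
    # Coefficient of dx ^ dy in d(omega): dQ/dx - dP/dy = y + 2y.
    return 3.0 * y

def interior_integral(n=200):
    """Midpoint-rule integral of d(omega) over the square."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += d_omega((i + 0.5) * h, (j + 0.5) * h) * h * h
    return total

def boundary_integral(n=200):
    """Midpoint-rule integral of omega around the boundary, anticlockwise."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += P(t, 0.0) * h   # bottom edge, x increasing
        total += Q(1.0, t) * h   # right edge, y increasing
        total -= P(t, 1.0) * h   # top edge, x decreasing
        total -= Q(0.0, t) * h   # left edge, y decreasing
    return total

# Both values should be close to 3/2.
print(interior_integral(), boundary_integral())
```

The interior integral uses only the pointwise derivative of ω, whereas the boundary integral uses only the values of ω on the four edges. Their agreement is the content of the formula in this special case.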
25.3.4 Remark: Vectors at points of a manifold are called “tangent vectors” for historical reasons. They originated in the study of $n$-dimensional surfaces in $\mathbb{R}^{n+1}$ for $n = 1$ or 2, regarded as real functions of $n$ variables. In this case, the vectors can be interpreted as tangent line segments which are familiar from Euclidean geometry.
25.4. Tensors and differential forms

25.4.1 Remark: Simple vectors and differential forms are not sufficient to do a full range of algebra and calculus in manifolds. The derivative of a vector field is not a vector field. It is a new kind of object. Similarly, the products of differentials which are required for integration are not simple differentials. These new kinds of objects are “tensors”. A tensor is a kind of product of vectors or differentials or mixtures of vectors and differentials.

25.4.2 Remark: A tensor may be characterized as “the multilinear effect of a sequence of vectors”. This is not easy to understand. It is easier to first understand alternating tensors, which may be thought of as “the antisymmetric multilinear effect of a sequence of vectors”. This sounds even more complex. But consider the example of a pair of vectors $v_1$ and $v_2$. The “antisymmetric multilinear effect” of this pair of vectors is just the area spanned by the two vectors. The word “area” here means “directed area”, not just the amount of area. This directed area is denoted $v_1 \wedge v_2$. (The symbol “∧” is pronounced “wedge”.) Since this area has a direction, reversing one of the vectors changes the direction to the opposite: $v_1 \wedge (-v_2) = -(v_1 \wedge v_2) = (-v_1) \wedge v_2$. When the order of multiplication is swapped, the resulting area has the opposite direction. So $v_2 \wedge v_1 = -(v_1 \wedge v_2)$. It follows that $v_2 \wedge v_1 + v_1 \wedge v_2 = 0$. As illustrated in Figure 25.4.1, the antisymmetric multilinear effect of the pair of vectors $v_1$ and $v_1 + v_2$ is the same as for the pair $v_1$ and $v_2$. This is because $v_1 \wedge (v_1 + v_2) = v_1 \wedge v_1 + v_1 \wedge v_2$ by linearity with respect to the second factor and $v_1 \wedge v_1 = -(v_1 \wedge v_1)$ by antisymmetry. So $v_1 \wedge v_1 = 0$. Hence $v_1 \wedge (v_1 + v_2) = v_1 \wedge v_2$. Similarly, $0.5(v_1 - v_2) \wedge (v_1 + v_2) = (v_1 - 0.5(v_1 + v_2)) \wedge (v_1 + v_2) = v_1 \wedge (v_1 + v_2) - 0.5(v_1 + v_2) \wedge (v_1 + v_2) = v_1 \wedge (v_1 + v_2)$.
[Figure 25.4.1. Equivalent antisymmetric multilinear effect of vector pairs: $v_1 \wedge v_2 = v_1 \wedge (v_1 + v_2) = 0.5(v_1 - v_2) \wedge (v_1 + v_2) = (2v_1) \wedge (0.5v_2)$]
By a happy coincidence, the parallelograms subtended by these vector pairs have the same area. This coincidence holds generally for vector sequences with the same “antisymmetric multilinear effect”. So alternating tensors are used for integration of functions on curves, surfaces and volumes. The line, area and volume elements for (standard) integration are antisymmetric multilinear products of vectors. This explains the importance of alternating tensors in the differential geometry literature.
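The equalities in Remark 25.4.2 can be verified by direct calculation in the plane. The sketch below assumes that the vectors lie in ℝ² with the standard basis, where the single component of the directed area of a pair of vectors is the 2×2 determinant of their components. The particular vectors and the helper names `wedge`, `add`, `sub` and `scale` are illustrative choices only.

```python
from fractions import Fraction

def wedge(a, b):
    """Directed area of the parallelogram spanned by the plane vectors a and b."""
    return a[0] * b[1] - a[1] * b[0]

def add(a, b):
    return (a[0] + b[0], a[1] + b[1])

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1])

def scale(c, a):
    return (c * a[0], c * a[1])

# Arbitrary example vectors; exact rational arithmetic avoids rounding error.
v1 = (Fraction(3), Fraction(1))
v2 = (Fraction(1), Fraction(2))
half = Fraction(1, 2)

areas = [
    wedge(v1, v2),
    wedge(v1, add(v1, v2)),
    wedge(scale(half, sub(v1, v2)), add(v1, v2)),
    wedge(scale(2, v1), scale(half, v2)),
]
print(areas)           # four equal directed areas
print(wedge(v2, v1))   # the same area with the opposite sign
print(wedge(v1, v1))   # zero, by antisymmetry
```

All four vector pairs span parallelograms with the same directed area, which is what it means for them to have the same “antisymmetric multilinear effect”.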
25.5. Layer 3: Affine connection (parallelism at a distance)

25.5.1 Remark: The first obstacle to understanding affine connections is the unfortunate choice of name. The words “affine” and “connection” are both confusing. A better name would be “differentiable parallelism” or “parallelism differential”. Parallel transport on a manifold may be defined as an integral of an affine connection. But that is getting ahead of the story. So let’s start from the beginning...

25.5.2 Remark: The set of second derivatives of a real-valued function on a manifold is, unhappily, not a tensorial object. The tuple of first-order derivatives is tensorial, namely a first-degree tensor. But the second-order derivatives require a “tensorization” term to counteract the second-order derivatives which enter into the transformation rules for transition maps between charts. This tensorization requirement leads directly to the definition of an affine connection. The overwhelming importance of second-order derivatives in physics makes it essential to be able to define tensorial second derivatives. Derivatives which are corrected by “tensorization terms” are called “covariant derivatives”.

25.5.3 Remark: Parallelism connects up the definitions of direction at different points in a space by “carrying” vectors from one place to another. Layer 2 does not have any definition of parallel motion. This is very unfortunate for physics. Momentum, for example, requires parallel motion of objects to be defined. Newton’s first law says that in the absence of forces, an object’s momentum does not change. It travels in a straight line at an even speed. But if you apply a diffeomorphism to the space coordinates, a straight line may become curved and vice versa. If the notion of “straight ahead” is not well defined, an object does not know which direction to travel in. The games of cricket and billiards are not possible in such a world. An affine connection must be added to the differentiable structure in order to define parallelism at a distance. It is sufficient to define parallel translation along curves. This permits objects to determine how curved their path is.

25.5.4 Remark: The observant reader will notice that the parallel transport of chart axes from point A to point B in Figure 25.5.1 does not result in agreement at B.
[Figure 25.5.1. Layer 3: Path-dependent parallel transport of vectors]
The reader may think that this is most unfortunate and may wish that things were defined differently so as to exclude this inconvenience. However, the difference in parallel transport between two paths with the same pair of end-points is called “curvature”. The curvature of a manifold equals zero everywhere if and only if all paths between each pair of end-points give the same parallel transport. A space where curvature is everywhere zero is called “flat”. The flat spaces are the ones which Euclid wrote about. There is no need of differential geometry when curvature is zero. The curvature of space-time in general relativity is zero only if there is no gravitational field. Flat space-time corresponds to an empty universe. Therefore, far from being an inconvenient nuisance, path-dependent parallelism is the core idea which lies at the heart of differential geometry. It is the raison d’être of this book. It is the engine inside the chassis. Curvature is the DNA which defines the organism. It is the Penelope which brings Odysseus home. The entire subject of differential geometry exists to make sense of curvature. The reader who does not wish to accept path-dependent parallelism is studying the wrong subject.

25.5.5 Remark: The most important point about pathwise parallelism in Layer 3 is not whether it shows zero or non-zero curvature. The important thing is that it must be defined. When the differentiable structure is already defined in Layer 2, and the parallelism is differentiable with respect to that differentiable structure, the parallelism is called an “affine connection”. The word “affine” arose from a thinking error by Euler when he was writing about similarity transformations. It is related to the word “affinity”. The choice of the word “affine” is unfortunate, but we are stuck with it. A better terminology would have been “linear connection”. (See Section 46.3 for history of the word “affine”.)
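The path-dependence of parallel transport described in Remark 25.5.4 can be seen concretely on the unit sphere. The sketch below is an illustration under stated assumptions, not a general algorithm: it uses the sphere with its usual (Levi-Civita) parallelism, for which transport of a tangent vector along a great-circle arc coincides with the rigid rotation of ℝ³ that slides the arc along itself. A vector is carried around a closed geodesic triangle from the north pole to the equator, a quarter of the way around the equator, and back to the pole. The function name `rotate` is hypothetical.

```python
import math

def rotate(v, axis, angle):
    """Rotate the 3D vector v about the coordinate axis 'x', 'y' or 'z'."""
    x, y, z = v
    c, s = math.cos(angle), math.sin(angle)
    if axis == "x":
        return (x, y * c - z * s, y * s + z * c)
    if axis == "y":
        return (x * c + z * s, y, -x * s + z * c)
    if axis == "z":
        return (x * c - y * s, x * s + y * c, z)
    raise ValueError("axis must be 'x', 'y' or 'z'")

quarter_turn = math.pi / 2

# Three great-circle legs of a closed triangle covering one octant:
# pole -> (1,0,0) about y, (1,0,0) -> (0,1,0) about z, (0,1,0) -> pole about x.
legs = ["y", "z", "x"]

point = (0.0, 0.0, 1.0)    # start at the north pole
vector = (1.0, 0.0, 0.0)   # tangent vector at the north pole

for axis in legs:
    point = rotate(point, axis, quarter_turn)
    vector = rotate(vector, axis, quarter_turn)

# Angle between the initial vector (1,0,0) and the transported vector,
# measured in the tangent plane at the north pole.
turn = math.atan2(vector[1], vector[0])
print(point)    # back at the north pole, up to rounding
print(turn)     # approximately pi/2
```

The vector returns to its starting point rotated by a quarter turn, which equals the area π/2 of the enclosed octant of the unit sphere. In a flat space the same construction would return the vector unchanged.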
25.5.6 Remark: Affine connections are defined on differentiable manifolds in Chapter 37. More general connections are defined in Chapter 36. A very general kind of parallelism is defined in Chapter 24, but the full generality of parallelism is not required for differential geometry.

25.5.7 Remark: The natural, most general kind of mathematical structure on which one can define parallelism is the “fibre bundle”. This concept tends to make differential geometry quite confusing because the concept is very general and abstract. Since much DG literature is written in the language of fibre bundles, it is necessary to present fibre bundle definitions in this book. Roughly speaking, fibre bundles are just structures which are attached at points of a manifold. Different textbooks define fibre bundles differently. Therefore a wide range of definitions is presented in this book. The best approach to understanding fibre bundles is to think always about particular examples. The most basic non-trivial example of a fibre bundle is the space of all tangent vectors on a manifold. Fibre bundles are defined in Layer 2, but they fulfil their purpose only when parallelism is defined on them in Layer 3.
25.6. Layer 4: Riemannian metric (distance and angles)

25.6.1 Remark: A Riemannian metric makes possible the comparison of lengths and angles at different points in a space, as illustrated in Figure 25.6.1.

[Figure 25.6.1. Layer 4: Riemannian metric defines distance and angles globally]
The distance and angle comparisons are global, not path-dependent. In physics, it is unusual to have an absolutely synchronized quantity throughout the universe for all time. But this is what the Riemannian metric offers.

25.6.2 Remark: The Riemannian metric tensor’s component matrix $g = [g_{ij}]_{i,j=1}^n$ and the two-point metric function $d$, familiar from topological metric spaces, are related by the following formula.

$$g_{ij}(x) = \frac{1}{2} \left. \frac{\partial^2 d(x, y)^2}{\partial y^i \, \partial y^j} \right|_{y=x}. \qquad (25.6.1)$$

This may be abbreviated to $g_{ij}(x) = \frac{1}{2} \partial_{ij} d^2(x, \cdot)$. That is, the Riemannian metric tensor’s component matrix $g$ at a point $x$ equals half the matrix of second-order derivatives of the square of the two-point distance function $d$. (To avoid technicalities, points and coordinates are used interchangeably here.) A Riemannian manifold is a combined differentiable manifold and metric space such that the square of the distance function is twice differentiable. Hence a Riemannian metric is a sub-class of the familiar class of two-point metrics. If the distance function has the right kind of differentiability, half the Hessian of its square is a Riemannian metric. (An interesting technicality here is the fact that the Hessian of a function at a stationary point is tensorial even in the absence of a connection.) It follows that Riemannian manifolds are a sub-class of the familiar topological metric spaces, assuming that a differentiable structure is specified on the metric space. Conversely, the two-point distance function is recovered by integrating the metric tensor.

$$d(x, y) = \min_{\gamma \in C^1_{x,y}} \int_\gamma \sqrt{g_{ij} \, dx^i \, dx^j}.$$

In other words, the distance from $x$ to $y$ equals the minimum integral of the metric tensor over all differentiable curves from $x$ to $y$. Contrary to popular belief, a Riemannian manifold may be defined in terms of either a metric tensor $g$ or a two-point distance function $d$.
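Formula (25.6.1) can be tried out on a simple flat example. The sketch below assumes a constant symmetric positive definite matrix A and the distance function given by d(x, y)² = (y − x)ᵀ A (y − x), which is an illustrative choice and not a definition taken from this book. Half the Hessian of d(x, ·)² at y = x, approximated here by central differences, should then reproduce the matrix A. The function names and the step size h are hypothetical.

```python
def dist_sq(x, y, A):
    """Squared distance d(x, y)^2 = (y - x)^T A (y - x) for a constant matrix A."""
    d = [yi - xi for xi, yi in zip(x, y)]
    n = len(d)
    return sum(A[i][j] * d[i] * d[j] for i in range(n) for j in range(n))

def metric_from_distance(x, A, h=1e-3):
    """Approximate g_ij(x) = (1/2) d^2/dy^i dy^j of d(x, y)^2 at y = x."""
    n = len(x)
    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                # Evaluate d(x, y)^2 at y = x + si*h*e_i + sj*h*e_j.
                y = list(x)
                y[i] += si * h
                y[j] += sj * h
                return dist_sq(x, y, A)
            second = (shifted(1, 1) - shifted(1, -1)
                      - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * h * h)
            g[i][j] = 0.5 * second
    return g

A = [[2.0, 0.5],
     [0.5, 1.0]]
g = metric_from_distance([1.0, 2.0], A)
for row in g:
    print(row)   # close to the rows of A
```

In this flat case the metric tensor is the same at every point. On a curved manifold the same recipe would be applied with the true two-point distance function, and the result would vary with x.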
25.6.3 Remark: In layer 4, a Riemannian metric is a kind of differential of a distance function. In layer 3, an affine connection is a kind of differential of the parallel transport function. In both layers, the global point-to-point function is obtained by integrating the differential. Parallel transport is path-dependent. The distance function is path-dependent too, but the distance of greatest interest is the minimal distance obtained by extremizing the integral over a class of paths. In layer 2, a tangent vector is the differential of displacement along a curve. So the principal structure in each layer is the differential of something which varies along a curve.

    layer   pathwise concept     differential concept
    2       displacement         tangent vector
    3       parallel transport   affine connection
    4       distance             metric tensor

Fundamental physics is written mostly in terms of differential equations. So it is the differential specification of structures in layers 2, 3 and 4 which is of greatest relevance to the formulation of physics theories.

25.6.4 Remark: Although orientations at different points are not comparable, the ability to unambiguously compare angles and local distances is a very tight constraint on the manifold. Clearly a Riemannian metric is a big step beyond what an affine connection offers. A conformal metric specifies angles globally but not distances. So a conformal metric lies between layers 3 and 4.
25.6.6 Remark: Since the Riemannian metric determines a canonical parallelism one may use this parallelism to define geodesic curves. A geodesic curve, also called simply a “geodesic”, is a curve whose direction is parallel at all points. In other words, if you transport the direction of the curve at any point to a second point, then the direction at the second point is the same as what you transported. The geodesic curves can be combined with the metric to determine distance. Thus the distance between two points A and B in a Riemannian manifold is obtained by adding the distances along a geodesic curve from A to B. This is illustrated in Figure 25.6.2.
[Figure 25.6.2. Levi-Civita connection determines geodesic curves and long distances]
This is how the local Riemannian distance function is extended to measure distances globally. (It turns out that the extremal paths with respect to distance are the geodesics. So the “self-parallel” paths are the ones which determine the point-to-point distance function.)
25.6.5 Remark: A Riemannian metric induces a canonical affine connection. This is called the Levi-Civita connection. (There is also a more general class of affine connections called “metric connections” which are more weakly consistent with a Riemannian metric. This is mentioned in Section 39.4.) The Levi-Civita connection is an orthogonal connection. This means that the connection preserves angles and length.

25.7. Pseudo-Riemannian metric (hyperbolic distance)

25.7.1 Remark: As in the case of the Riemannian manifold, a pseudo-Riemannian metric tensor field may be defined either explicitly or as half the Hessian of the square of a two-point distance function as in equation (25.6.1). In the pseudo-Riemannian case, the distance is hyperbolic rather than elliptic. Thus the symmetric metric tensor component matrix has eigenvalues with mixed sign whereas the eigenvalues are all positive in the case of a Riemannian manifold.

25.7.2 Remark: Special relativity is formulated in terms of Minkowski space-time, which is a hyperbolic version of Euclidean space. When flat Minkowski space-time is generalized to manifolds, the corresponding concept is a pseudo-Riemannian metric. This is the mathematical framework of general relativity. A pseudo-Riemannian metric permits distances to be negative. Distances are determined globally with a pseudo-Riemannian metric as described in Remark 25.6.6. Geodesic curves do not necessarily minimize distance. They are defined by parallelism, not by minimizing or maximizing the length of the curve.

25.7.3 Remark: Pseudo-Riemannian geometry is the final stage in the presentation of differential geometry. The most important concept to generalize to a pseudo-Riemannian metric (for the purposes of general relativity) is the Riemann curvature tensor. When all of the machinery of differential geometry has been generalized from the Riemannian metric to the pseudo-Riemannian metric, the framework is then finally ready for the presentation of Einstein’s equations and other physics models which are expressed in terms of Riemannian and pseudo-Riemannian geometry.
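The contrast between the elliptic and hyperbolic cases in Remarks 25.7.1 and 25.7.2 can be illustrated with constant metric component matrices in two dimensions. In the sketch below, the sign convention (−, +) for the Minkowski matrix, the coordinate order (t, x), units with the speed of light equal to 1, and the function name `interval_sq` are all assumptions made for illustration. The quantity computed is the squared interval $g_{ij} \, dx^i \, dx^j$, which is the sense in which a hyperbolic “distance” can be negative.

```python
def interval_sq(g, dx):
    """Squared interval g_ij dx^i dx^j for a constant component matrix g."""
    n = len(dx)
    return sum(g[i][j] * dx[i] * dx[j] for i in range(n) for j in range(n))

# Diagonal matrices, so the eigenvalues are just the diagonal entries.
EUCLIDEAN = [[1, 0],
             [0, 1]]    # both eigenvalues positive (Riemannian case)
MINKOWSKI = [[-1, 0],
             [0, 1]]    # eigenvalues of mixed sign (pseudo-Riemannian case)

displacements = {
    "timelike": (2, 1),    # more time than space
    "spacelike": (1, 2),   # more space than time
    "lightlike": (1, 1),   # equal amounts of time and space
}

for name, dx in displacements.items():
    print(name, interval_sq(EUCLIDEAN, dx), interval_sq(MINKOWSKI, dx))
```

With the Euclidean matrix every non-zero displacement has a positive squared length. With the Minkowski matrix the squared interval is negative, positive or zero according to whether the displacement is timelike, spacelike or lightlike in the assumed convention.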
25.7.4 Remark: General relativity is defined in terms of a generalized Riemannian metric called a “pseudo-Riemannian metric”. The absolute global metric in general relativity is based on the assumption that all of time-space is uniform. In particular, the speed of light is assumed to be constant at all points, in all directions, and for all time. However, observational evidence in 2002 suggested that the speed of light is not constant after all. This may imply that a Riemannian (or pseudo-Riemannian) metric is not a suitable basis for geometry and gravity in cosmology. If this is so, physics which requires only layer 3 may survive while the global metric in layer 4 does not. Therefore it is important to express geometry and physics as much as possible in terms of layer 3. This implies a need to keep very clear which definitions require a metric and which definitions require only an affine connection. Surprisingly, most of differential geometry does not require a Riemannian metric. This may turn out to be a good thing for cosmology if it turns out that a path-independent Riemannian metric is not a valid model for global geometry. Alternatively, the relative lack of necessity of a metric may be a hint that the metric is indeed not valid in cosmology. There is a precedent for this in the discovery that Euclid’s fifth postulate did not follow from the other axioms. The abandonment of the equidistance of parallel lines facilitated the development of geometries such as Riemannian manifolds. If the Riemannian and pseudo-Riemannian metrics have to be abandoned, it will be important to base as much geometry as possible on connections. That is why this book introduces the Riemannian metric as late as possible. (The Riemannian metric is defined in Chapter 39. The pseudo-Riemannian metric is in Chapter 40.)

25.7.5 Remark: Since distance and angles are determined globally in a Riemannian or pseudo-Riemannian manifold by the transport of the local metric tensor along geodesic curves which are determined by the Levi-Civita connection, it would be tempting to think that a universe model based on such geometry has a clear means of synchronizing the laws of physics at different points. However, the metric tensor is assumed a priori to be a global structure in space-time. It would be much more satisfying if the universe somehow “knew” how to calculate the global Riemannian metric using parallel transport of the laws of physics along geodesics. Since it seems now from experimental evidence that the laws of physics may indeed vary in space and time, it may be that in future, the laws of gravity will be expressed in terms of layer 3 affine connections, which just happen to give an illusion of the existence of a global Riemannian metric. If there is some deviation of the transport mechanisms of physics from an ideal Levi-Civita connection, then one would expect the global Riemannian metric to be replaced by a metric which is path-dependent. Therefore the length of a distant object may depend on which path its length was communicated to you along. Thus the illusion of a global Riemannian metric may be an artefact of a connection which is approximately a Levi-Civita connection. The connection may determine the metric instead of vice versa. This could help account for an apparent variable speed of light. It could also help to bring gravity theory into conformance with Mach’s principle. This remark is, of course, pure conjecture.

25.7.6 Remark: General relativity was the principal driving force for differential geometry in the early 20th century, particularly for Riemannian geometry. This is a case where a body of mathematics was first developed for mathematical reasons, then became useful – or indispensable – for physics, and then was rapidly and richly developed to meet the needs of applications. Bell [189], page 370, says the following.

    General relativity [...] was directly responsible for the direction taken by differential geometry about 1920. This newer geometry might have been developed almost forty years earlier. All the necessary technique was available; but it was not until the successes of relativity showed that Riemannian space and the tensor calculus were of more than mathematical interest that differential geometers noticed what they had been overlooking.

Riemann lived from 1826 to 1866. Tensor calculus was developed about 1890 by Ricci-Curbastro. Einstein’s general relativity was published in 1915 and 1916.
Chapter 26 Topological manifolds
26.1 Background
26.2 Euclidean and locally Euclidean spaces
26.3 Topological manifolds
26.4 Charts and atlases
26.5 Topological manifold constructions, attributes and relations
26.6 Topological identification spaces
26.0.1 Remark: In this chapter, topological manifolds are introduced as topological spaces which happen to be locally homeomorphic to Euclidean spaces. Alternatively, topological manifolds may be thought of as a “patchwork quilt”, consisting of patches of Euclidean space sewn together somehow seamlessly. In practice, the patchwork quilt model is how manifolds are studied, but the point set of the manifold is primary in the way a manifold is visualized. 26.0.2 Remark: A topological space (X, T ) requires no extra structure in addition to the topology T in order to be declared to be a topological manifold. Additional structure such as an atlas is optional. Differentiable manifolds, introduced in Chapter 27, do need extra structure (such as an atlas) for their specification.
26.1. Background

26.1.1 Remark: Manifolds have been a big success story of mathematical generalization. Much of modern physics is written in the language of manifolds, perhaps because no one really believes any more that the universe is flat. Even if the universe turns out to be homeomorphic to a simple Euclidean space, the general manifold point of view is still a useful tool for forcing one to throw away preconceived notions of flat space.

26.1.2 Remark: Probably the term “manifold” arose to describe the generalization of 1-parameter curves to parametrized surfaces. OED [211], page 1272, gives the year 1855 for the earliest use of the word “manifold” in Kantian philosophy. Riemann’s Über die Hypothesen welche der Geometrie zu Grunde liegen appeared in 1854, but the OED gives the year 1890 for the first use of “manifold” in mathematics.

26.1.3 Remark: There was a time when mathematicians would define a manifold as any set which could be parametrized by a finite set of real parameters. Nowadays mathematicians are more precise about the conditions for such parametrizations. In the olden days, manifolds were pictured as having a grid of coordinate curves on a surface similar to longitude and latitude curves on a globe of the Earth. Points were labelled with coordinates $(x_1, \ldots, x_n)$ and these were used as a substitute for the familiar Cartesian coordinates for flat space, the difference being that the coordinate curves on a manifold were themselves variable instead of the more familiar rigid Cartesian grid lines.

26.1.4 Remark: Although nowadays the topological structure on Euclidean spaces $\mathbb{R}^n$ is regarded as the lowest-level structure which a manifold should have local equivalences to at every point of the manifold, Bell [190], page 505, suggests that the existence of local maps from a set to subsets of $\mathbb{R}^n$ was formerly (at least in 1937) considered sufficient to call the set an $n$-dimensional manifold.

    A manifold is a class of objects (at least in common mathematics) which is such that any member of the class can be completely specified by assigning to it certain numbers, in a definite order, corresponding to “numerable” properties of the elements, the assignment in the given order corresponding to a preassigned ordering of the “numerable” properties.
The existence of space-filling curves implies that local bijections do not ensure sufficient structural equivalence to subsets of $\mathbb{R}^n$. The topological structure must be equivalent also. In other words, the bijections must be homeomorphisms.

26.1.5 Remark: A topological manifold is defined as a Hausdorff locally Euclidean topological space with constant dimension. In other words, topological manifolds are just patched-together pieces of a Euclidean space – up to a homeomorphism. The concept of a manifold is yet another method for creating new spaces from old. The language used to describe manifolds clearly shows the subject’s origin in the mapping of the Earth, which is everywhere locally sort-of-Euclidean but not Euclidean globally. Hence early attempts to create charts of the Earth on a single sheet of flat paper were doomed to failure. The compromise was to create an atlas which would cover the whole Earth with multiple overlapping charts. This principle of defining a curved space in terms of an atlas of charts is capable of wide generalization.

26.1.6 Remark: Although a topological manifold is defined as being locally homeomorphic everywhere to open sets in a fixed Euclidean space, it is interesting to speculate on the usefulness of a class of topological spaces which is simply uniform in the sense that a fibre space is uniform in Definition 23.2.1. In other words, one could require every two points $p$ and $q$ in a topological space to have open neighbourhoods, $\Omega_p$ and $\Omega_q$ respectively, with a homeomorphism $\phi : \Omega_p \approx \Omega_q$ such that $\phi(p) = q$. For such a space, if one point was locally Euclidean, then all points would be locally Euclidean. A somewhat absurd notion of “discrete manifold” is suggested by Bell [190], page 505. Such a manifold $M$ would have local bijections everywhere from subsets of $M$ to subsets of $\mathbb{Z}^n$. If this proposal was serious, it would mean that all sets are “discrete manifolds”, even if the bijections are required to be homeomorphisms (under very weak assumptions on the topology). However, if some sort of order structure on the manifolds was required to correspond to order on $\mathbb{Z}^n$, perhaps some non-trivial definition could be built from such an idea.
26.1.7 Remark: A minor distinction in terminology is introduced in this chapter. If a topological manifold is specified by its topology, namely a set TM of open sets, then the pair (M, TM ) is called simply a “topological manifold”. But if the manifold’s topology is specified by an atlas AM , then the pair (M, AM ) is called a “C 0 manifold”, which fits in nicely with the definitions of differentiable manifolds of class C r for general integers r. This is discussed further in Remark 26.4.1. 26.1.8 Remark: In terms of the mathematics, probably 99% of the definitions in differential geometry make perfect sense in a single chart of a manifold. It is really only questions of global topology which require an atlas of charts covering the whole manifold. Pure mathematicians who work in differential geometry are indeed strongly interested in questions relating to the global topology of manifolds, but the definitions which they use are generally of two kinds: (1) purely topological definitions (layer 1 in Section 1.1) relating to connectivity categories (like homotopy and homology) and (2) single-chart definitions in the higher layers of the manifold structure (layers 2 to 4). In other words, the full atlas is required only for the topology, which is not, strictly speaking, a geometric structure. The DG definitions which clearly do require more than a very local chart for their definition relate to geodesics and the convexity of sets. A convex set, however, can always be defined in a single chart, and shortest-path geodesics can usually be fitted into a single chart. A particular situation where a geodesic might not fit in a single chart is where the image of the geodesic is locally dense in the manifold. This can happen with chaotic or fractal orbits, for example. In the case of integrals over the whole manifold, clearly a full atlas of charts is required. 
Even though single-chart differential geometry may seem to be an appealing simplification, when global questions hit the fan, a full atlas is required. So it’s best to anticipate this by working with atlases as soon as possible.

26.1.9 Remark: Hermann Weyl [50, page 104], writing in 1918–1922, strongly hinted at a three-layer model of differential geometry structure in the following passage which appears at the end of a philosophical discussion of the nature of physical space.

Die gedanklichen Grundlagen sind gelegt, und wir dürfen jetzt nicht länger säumen, mit dem systematischen Aufbau der »reinen Infinitesimalgeometrie« zu beginnen, der sich naturgemäß in drei Stockwerken vollziehen wird; vom jeder näheren Bestimmung baren Kontinuum über die affin zusammenhängende Mannigfaltigkeit zum metrischen Raum.

This may be translated into English as follows.

The intellectual foundations are laid, and we must now no longer tarry to begin with the systematic construction of the “pure infinitesimal geometry” which will be carried out, appropriately to its nature, in three storeys; from the continuum which is bare of every qualification, via the affinely connected manifold to the metric space.

Another passage [50, page 78] makes it clear that Weyl’s “continuum” means a locally Euclidean topological space. (The word “Raum” means “space”, but it also means a “room” as in a house. Probably this was intended humorously! The word “gedanklich” can be translated as “intellectual” or “imaginary”.) Thus Weyl was evidently proposing a “three-storey” model as follows.

storey/floor                       structure
3  metric space                    Riemannian metric
2  affinely connected manifold     affine connection
1  continuum                       continuous charts

This may be compared to the five-layer model in Section 1.1. Weyl misses out the differential layer, although throughout his book, he does require sufficient differentiability of the chart transition maps for his purposes. Since not much can be done without differentiability (i.e. with only a pure topological manifold “bare of any qualification”), it is quite understandable that he does not split his “continuum” layer into a topological continuum and differentiable continuum.

The zeroth layer (a set without topology) is not defined by Weyl. This is also quite understandable because differential geometry really only starts when you get to charts in the “continuum” layer. What goes on in the zeroth layer “cellar” was quite rightly not Weyl’s concern in a book intended for physicists around 1920.

26.2. Euclidean and locally Euclidean spaces

Product topologies are defined in Section 15.1. The most important examples of product topologies are the n-fold products of sets such as $\mathbb{R}$. The standard topology on $\mathbb{R}^n$ is the product topology of the factors in the set product.

[ Define the standard metric space $(\mathbb{R}^n, d)$. ] [ Section 43.2 covers Euclidean spaces with a focus on tangent bundles. ]

26.2.1 Definition: A Euclidean topological space is a Cartesian product set $\mathbb{R}^n$ for some $n \in \mathbb{Z}^+_0$ together with the product topology on $\mathbb{R}^n$, where the topology on each set $\mathbb{R}$ is generated by the set of intervals of the form $(a, b)$ for $a, b \in \mathbb{R}$ with $a < b$.

[ Obviously topological bases must be defined for this, and a couple of other things, like the topology on $\mathbb{R}$. ] [ Give the example here of the product of the topologies of $[0, 1] \subseteq \mathbb{R}$ and $\{0, 1\} \subseteq \mathbb{R}$ and similar examples. Also give the example of the product of the $S^1$ and $S^0$ topologies. ]

26.2.2 Remark: Unless otherwise stated, the topology on $\mathbb{R}^n$ is assumed to be as in Definition 26.2.1.

26.2.3 Definition: A locally Euclidean space is a topological space $X$ such that
$\forall x \in X,\ \exists \Omega \in \mathrm{Top}_x(X),\ \exists n \in \mathbb{Z}^+_0,\ \exists G \in \mathrm{Top}(\mathbb{R}^n),\ \Omega \approx G$.

26.2.4 Remark: Warner [49] defines a locally Euclidean space to require the Hausdorff property, but most authors seem to agree with Definition 26.2.3. The Hausdorff property is not implied by Definition 26.2.3. See Section 43.3 for examples of non-Hausdorff locally Euclidean topological spaces.
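Remark 26.2.4 states that the Hausdorff property is not implied by Definition 26.2.3. A standard illustration is the real line with a doubled origin. This is an assumption here: the examples of Section 43.3 are not reproduced in this chapter and may differ. The following Python sketch uses a hypothetical encoding of points as tagged pairs. It checks that every basic neighbourhood of one origin meets every basic neighbourhood of the other, although each origin has a neighbourhood which is mapped bijectively onto an interval of $\mathbb{R}$.

```python
# Hypothetical encoding of the line with two origins:
# ("a", 0.0) and ("b", 0.0) are the two origins; every other point is
# ("x", t) with t != 0.

def in_basic_nbhd(point, origin, eps):
    """True if `point` lies in the basic open neighbourhood of radius eps
    around the origin tagged `origin` ("a" or "b")."""
    tag, t = point
    if tag == "x":
        return 0.0 < abs(t) < eps
    return tag == origin  # an origin lies only in its own neighbourhoods


def chart(point):
    """Coordinate of a point.  Restricted to a basic neighbourhood of one
    origin, this is a bijection onto the interval (-eps, eps)."""
    _, t = point
    return t


def common_point(eps_a, eps_b):
    """A point lying in both the eps_a-neighbourhood of origin "a" and the
    eps_b-neighbourhood of origin "b", so the two cannot be separated."""
    return ("x", min(eps_a, eps_b) / 2.0)
```

The function `common_point` only exhibits a witness for each pair of radii. It is an illustration of the failure of the Hausdorff property for this space, not a proof.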
26.2.5 Remark: The concept of a homeomorphism may be thought of as a way of transferring tasks from a general space to a special kind of space where the work is easier. This reduces duplication of effort. In the case of Definition 26.2.3, tasks are transferred from a locally Euclidean space to open subsets of Euclidean spaces. Thus many already-proven results in Euclidean spaces are “portable” to locally Euclidean spaces.

26.2.6 Remark: A locally Euclidean space is both locally connected and locally compact. See Exercises 47.9.1 and 47.9.2.
26.3. Topological manifolds

Subsets of topological spaces are implicitly considered to have the relative topologies from those spaces. (See Definition 14.10.13 for relative topology.) This applies particularly to subsets of $\mathbb{R}^n$ and topological manifolds.

Most texts define a manifold in terms of an atlas. A more intrinsic approach is taken here. Manifolds are defined in terms of the existence everywhere of a local homeomorphism to a Euclidean space. Definition 26.3.1 assumes that a set $M$ has a pre-defined topology and a test is applied to this topology to determine whether it satisfies the conditions for a topological manifold. It seems that Malliavin [35] actually creates the topology on a set via an atlas as the weak topology induced by the coordinate maps. Although this may be closer to the practical way of working, it would introduce superfluous structure into the mathematical definition.

Manifolds could be generalized from the “locally Euclidean topological space” definition presented here to a manifold which is locally homeomorphic to any topological space at all. This would yield a general “patching together” definition for pieces of any given topology. In particular, spaces which are locally homeomorphic to $\mathbb{Q}^n$ or $\mathbb{C}^n$ could be studied. Kobayashi/Nomizu [26], page 2, define manifolds more generally to be topological spaces which are locally homeomorphic to spaces other than $\mathbb{R}^n$. Such generalizations are not considered here, even to complex spaces.

[ Could define here a locally Euclidean topological space with variable dimension, and then define a manifold as such a space with constant dimension. On the other hand, a fibre bundle always has a constant fibre space at all points. This raises the question of whether it would be useful to define a fibre bundle with non-constant fibre space. ]
26.3.1 Definition: Let $n \in \mathbb{Z}^+_0$. An n-dimensional (topological) manifold is a Hausdorff space $M$ such that every point has an open neighbourhood which is homeomorphic to an open subset of $\mathbb{R}^n$. In other words, $M$ is said to be an n-dimensional (topological) manifold if
(i) $M$ is a Hausdorff space (see Definition 15.2.13), and
(ii) $\forall x \in M,\ \exists \Omega \in \mathrm{Top}_x(M),\ \exists G \in \mathrm{Top}(\mathbb{R}^n),\ \Omega \approx G$.

[ Try to show that a topological manifold is more than just $T_2$, i.e. Hausdorff. Probably the extra locally Euclidean property combined with $T_2$ would imply something stronger than that. See Section 15.2 for topological separation classes. ]

26.3.2 Remark: The Hausdorff condition is not superfluous. (See [19], definition 1.6, page 6.) [ See Malliavin [35], proposition I.1.4.1. ] It is not immediately obvious why the Hausdorff property does not follow from the local homeomorphisms to $\mathbb{R}^n$, but Example 43.3.2 confirms that the Hausdorff condition is not superfluous.

26.3.3 Remark: Warner [49] defines a locally Euclidean space to require the Hausdorff condition, which makes his definition the same as Definition 26.3.1. He then defines a manifold to have a maximal atlas, and requires the topology to be second countable and Hausdorff. This seems to be a rare choice of conditions. But Crampin/Pirani [11], page 238, require a countable basis.

26.3.4 Remark: Notation 26.3.5 is a good example of a style of definition which is almost ubiquitous in mathematics literature. Taken at face value, it suggests that $\dim(M)$ is a function of the manifold. In fact, as noted in Remark 26.3.7, $\dim(M)$ has an infinite number of values for the empty manifold. So it cannot possibly be an inferrable property of the set $M$ together with its topology. In the case of non-empty manifolds, the calculation of the dimension is non-trivial. One is not supposed to understand the notation $\dim(M)$ as requiring a calculation from the set and topology. The notation $\dim(M)$ actually means “the integer $n$ which was used in the definition of $M$ in Definition 26.3.1”. This is an ill-defined concept. But it is also what the definition means in most textbooks. Such “properties of definitions” should always be replaced by a “property of defined object” if one wishes to be moderately rigorous.

26.3.5 Notation: $\dim(M)$ denotes the dimension of a manifold $M$.

26.3.6 Example: A trivial example of a topological manifold is the set $\mathbb{R}^n$ with its standard topology for $n \in \mathbb{Z}^+_0$. For any $x \in \mathbb{R}^n$, the sets $\Omega$ and $G$ in Definition 26.3.1 may be taken as the whole set $\mathbb{R}^n$, and the homeomorphism is the identity map.
26.3.7 Remark: It could be argued that the case of dimension $n = 0$ is of little interest. There are, however, some redundant cases which make use of this. A 0-dimensional manifold is just a set with the discrete topology; in other words, an arbitrary set $M$ with the maximal topology $T = \mathbb{P}(M)$. The empty topological space $(M, T) = (\emptyset, \{\emptyset\})$ is a topological manifold for all dimensions $n \in \mathbb{Z}^+_0$.
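The statements in Remark 26.3.7 can be checked by brute force on small finite sets. The following Python sketch is an illustration only, and all function names in it are hypothetical. It builds the discrete topology as the power set, tests the Hausdorff property, and tests that every singleton is open, which is what a 0-dimensional chart requires because $\mathbb{R}^0$ is a single point.

```python
from itertools import combinations


def power_set(points):
    """The discrete (maximal) topology on a finite set: all of its subsets."""
    pts = list(points)
    return {frozenset(c) for r in range(len(pts) + 1)
            for c in combinations(pts, r)}


def is_topology(points, opens):
    """Finite check: contains the empty set and the whole set, and is closed
    under pairwise unions and intersections (enough for a finite family)."""
    whole = frozenset(points)
    if frozenset() not in opens or whole not in opens:
        return False
    return all((a | b) in opens and (a & b) in opens
               for a in opens for b in opens)


def is_hausdorff(points, opens):
    """Every pair of distinct points has disjoint open neighbourhoods."""
    pts = list(points)
    return all(
        any(x in u and y in v and not (u & v) for u in opens for v in opens)
        for x in pts for y in pts if x != y
    )


def has_singleton_charts(points, opens):
    """R^0 is a single point, so a 0-dimensional chart at x has domain {x}."""
    return all(frozenset([x]) in opens for x in points)
```

For the empty set, all three tests hold vacuously with the topology $\{\emptyset\}$. This matches the observation that the empty space passes the manifold test for every dimension.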
[ Define “manifolds with boundary”. E.g. see Lang [30], pages 38–41. ]

26.3.8 Notation: $C(M, \mathbb{R})$ denotes, for any topological manifold $M$, the set of all continuous real-valued functions on $M$ together with the operations of pointwise addition and multiplication by elements of $\mathbb{R}$. To be precise in terms of Definition 10.1.2, $C(M, \mathbb{R})$ is an abbreviation for the tuple $(\mathbb{R}, V, \sigma_{\mathbb{R}}, \tau_{\mathbb{R}}, \sigma_V, \mu)$, where
(i) $\mathbb{R} \mathrel{-\!\!<} (\mathbb{R}, \sigma_{\mathbb{R}}, \tau_{\mathbb{R}})$ denotes the field of real numbers,
(ii) $V = \{f : M \to \mathbb{R};\ f \text{ is continuous}\}$,
(iii) $\sigma_V : V \times V \to V$ is the pointwise addition on $V$, and
(iv) $\mu : \mathbb{R} \times V \to V$ is the pointwise multiplication by $\mathbb{R}$ on $V$.
$C(M, \mathbb{R})$ may also be denoted as $C(M)$, $C^0(M, \mathbb{R})$ or $C^0(M)$.

26.3.9 Notation: $\mathring{C}(M, \mathbb{R})$ for a manifold $M$ denotes the set of continuous partially defined real-valued functions on $M$.

26.3.10 Notation: $\mathring{C}_p(M, \mathbb{R})$ for a manifold $M$ and $p \in M$ denotes the set of continuous partially defined real-valued functions on $M$ whose domains contain $p$.

26.3.11 Remark: Notations 26.3.8, 26.3.9 and 26.3.10 are well-defined for general topological spaces, not just topological manifolds. (See Section 6.11 for partially defined functions.) Other structures are frequently added to $C(M, \mathbb{R})$ in Notation 26.3.8 in a fairly standard fashion, such as topological or metric space structure for compact $M$.

[ Here give notation $C(M, \mathbb{R}^m)$ for continuous maps between manifolds etc. ] [ Here give notation $C(M_1, M_2)$ for continuous maps between manifolds etc. ]

26.3.12 Remark: The definitions of curves and paths in topological manifolds are inherited from the topological space definitions in Sections 16.2 and 16.4.

26.4. Charts and atlases

26.4.1 Remark: A topological manifold is just a topological space $M$ which happens to be Hausdorff and everywhere locally homeomorphic to a Euclidean space $\mathbb{R}^n$. In other words, a topological manifold is a locally Euclidean Hausdorff topological space with constant dimension. No extra structure, such as an atlas, needs to be specified since this is implicit in the topological structure on $M$. This contrasts with the case of a differentiable manifold which does require extra structure. (See Notation 27.4.4 and Remark 27.2.8.)

Since an atlas is a very common way of indicating the topology on a manifold, even if the manifold is not differentiable, it is convenient to accept an alternative specification tuple $(M, A_M)$ for a manifold $(M, T_M)$, where $A_M$ is an atlas according to Definition 26.4.6 and $T_M$ is the topology implied on $M$ by $A_M$. The tuples $(M, T_M)$ and $(M, A_M)$, specifying the topology or an atlas respectively, will be used almost interchangeably. Both specifications are used by a large number of authors as the standard for a “topological manifold”. To distinguish between them when necessary, the topological form $(M, T_M)$ will be referred to as a “topological manifold” (Definition 26.3.1), and the atlas form $(M, A_M)$ will be called a “$C^0$ manifold” (Definition 26.4.11). This fits conveniently with the definition for a $C^r$ differentiable manifold (Definition 27.2.6).

26.4.2 Definition: A chart for an n-dimensional topological manifold $M$ for $n \in \mathbb{Z}^+_0$ is a homeomorphism $\psi : U \to G$ such that $U \in \mathrm{Top}(M)$ and $G \in \mathrm{Top}(\mathbb{R}^n)$. A manifold chart is also called a coordinate map or a coordinate function.

26.4.3 Remark: Definition 26.4.2 is illustrated in Figure 26.4.1.

[Figure 26.4.1: Coordinate map $\psi : U \to G \in \mathrm{Top}(\mathbb{R}^n)$ with $U \in \mathrm{Top}(M)$]
26.4.4 Remark: In this book, the symbol $\psi$ usually hints at a manifold chart (i.e. a coordinate map). Many books use the symbol $\phi$ for manifold charts. However, in this book, $\phi$ hints at a function from one manifold to another. The author’s mnemonic for this is that $\psi$ (“psi”) suggests the last two letters of “maps”, whereas $\phi$ (“phi”) is the first letter of the word “function” (a Latin word which does not seem to be of Greek origin).

26.4.5 Remark: Since the map $\psi$ in Definition 26.4.2 is a homeomorphism, so is its inverse $\psi^{-1}$. Therefore it would be possible to define charts as maps from the Euclidean space $\mathbb{R}^n$ to the manifold. Many texts do this. However, it is better to regard coordinates as tags on geometrical points. The points are the primary entity and the coordinates are mere labels for the points. On the other hand, in the case of curves and families of curves, points are given as a function of real n-tuples because curves and families of curves represent a kind of possibly self-intersecting (non-injective) motion within the manifold. Curves may self-intersect; manifolds do not. A curve gives you a unique point for each parameter value. A manifold has a unique set of parameters for each point.

Another way to see that it is more natural to define charts as functions from the point set to the coordinate set than vice versa is to think of how people make real maps of the Earth. The usual procedure is to choose which points are of interest, such as towns and mountains, and then determine the coordinates (e.g. longitude and latitude) of these points. In other words, the coordinates are attributes of the point, not vice versa. One does not choose a set of coordinates and then go out and see what is at those coordinates. Perhaps an exception to this would be aerial and satellite photography where data is organized as a set of pixels, and the points on the Earth must be determined as a function of the pixel row and column in the matrix. However, such images are usually calibrated by identifying points on Earth which have known coordinates and then determining the Earth-to-pixels map by interpolation.

Similarly in the case of fibre bundles, fibre charts are expressed as functions from the fibre bundle to the fibre space rather than vice versa. On the other hand, when defining charts for particular embedded manifolds, it is generally easier to define the inverse chart, i.e. from a Euclidean space to the manifold. (See for instance spherical coordinates in Section 42.1.)

26.4.6 Definition: An atlas for an n-dimensional topological manifold $M$ is a set $S$ of charts for $M$ such that $\bigcup_{\psi \in S} \mathrm{Dom}(\psi) = M$.
An indexed atlas for an n-dimensional topological manifold $M$ is a family $(\psi_i)_{i \in I}$ of charts for $M$ such that $\bigcup_{i \in I} \mathrm{Dom}(\psi_i) = M$. (See Figure 26.4.2.)

[Figure 26.4.2: Atlas for a topological manifold $M$, showing charts $\psi_1$, $\psi_2$ with domains $U_1$, $U_2 \subseteq M$, the overlap $U_1 \cap U_2$, and images $\psi_1(U_1)$, $\psi_2(U_2) \subseteq \mathbb{R}^n$]
26.4.7 Remark: An atlas is given two alternative formalizations in Definition 26.4.6: with and without an index. In practical applications, the charts are usually indexed. For convenience, the family of charts is usually referred to interchangeably as a set of charts, which then means the set of charts which is indexed by the family. As always with indexed families, the indexing may be implicit or explicit, according to the requirements of the context. The same issue arises for fibre bundle atlases in Definition 23.3.14.

The arguments in favour of defining an atlas as a set of charts rather than an indexed family are much stronger than the counterarguments. In the case of an indexed atlas:
(1) it is difficult to choose an index set for a complete atlas other than the set of charts themselves, which is rather clumsy;
(2) when merging two atlases, it is difficult to choose an index for the union of the atlases, particularly if the atlases are infinite;
(3) when restricting a manifold to a subset, the restricted atlas uses a subset of the index set of the full atlas;
(4) since the content of an atlas is independent of the indexing, the extraneous index map must be ignored when comparing atlases.
All in all, it is best to simply add an index set whenever it is convenient for applications.

[ As in Definition 23.3.17 for fibre bundles, define a “topological manifold with atlas” even though the atlas is optional. ]

26.4.8 Notation: An atlas for a manifold $M$ which is implicit in a particular context may be denoted by $\mathrm{atlas}(M)$. Then $\mathrm{atlas}_p(M) = \{\psi \in \mathrm{atlas}(M);\ p \in \mathrm{Dom}(\psi)\}$ denotes the set of charts in $\mathrm{atlas}(M)$ whose domain contains a given point $p \in M$. Another notation will be $A_M$ for an atlas for $M$, and $A^p_M$ for $\mathrm{atlas}_p(M)$.

26.4.9 Remark: It is not necessary to impose any additional continuity condition on the transition maps $\psi_j \circ (\psi_i|_{U_i \cap U_j})^{-1} : \psi_i(U_i \cap U_j) \to \psi_j(U_i \cap U_j)$ because all charts are homeomorphisms by Definition 26.4.2. Therefore any atlas for $M$ is a $C^0$ atlas according to Definition 27.2.2.

Every topological manifold $M$ possesses an atlas. If $M$ is compact, then any atlas on $M$ has a finite subset which is also an atlas. (This may be referred to as a sub-atlas.) Therefore every compact topological manifold has a finite atlas. But a topological manifold with uncountably many connected components cannot have a finite atlas, because each chart domain is homeomorphic to an open subset of $\mathbb{R}^n$, which has at most countably many connected components.

[ Should show that a simply connected topological manifold must have a finite atlas, but only if it is true. ]

26.4.10 Theorem: For any given atlas $S$ on a topological manifold $M$, there is one and only one topology $T$ on $M$ such that $S$ is an atlas for the topological space $(M, T)$. In other words, the atlas uniquely determines the topology. Conversely, the topology determines the set of all possible charts, and hence the set of all possible atlases.

26.4.11 Definition: A $C^0$ manifold is a pair $(M, A_M)$ such that $A_M$ is an atlas for the topological manifold $(M, T_M)$ for some topology $T_M$ on $M$.
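Definitions 26.4.2 and 26.4.6, Notation 26.4.8 and the transition maps of Remark 26.4.9 can be made concrete for the circle $S^1 \subseteq \mathbb{R}^2$. The following Python sketch is an illustration under stated assumptions: the choice of the two stereographic charts, the index names `"north"` and `"south"`, and all function names are hypothetical and are not taken from this book. The transition map from the north chart to the south chart works out as $u \mapsto 1/u$ for $u \neq 0$, which is a homeomorphism of $\mathbb{R} \setminus \{0\}$, as Remark 26.4.9 says it must be.

```python
def psi_north(p):
    """Stereographic chart on S^1 minus the north pole (0, 1)."""
    x, y = p
    return x / (1.0 - y)


def psi_south(p):
    """Stereographic chart on S^1 minus the south pole (0, -1)."""
    x, y = p
    return x / (1.0 + y)


def psi_north_inv(u):
    """Inverse of psi_north: maps u in R back to a point of the circle."""
    d = 1.0 + u * u
    return (2.0 * u / d, (u * u - 1.0) / d)


# Indexed atlas: index -> (domain test, chart).  The two domains cover S^1.
ATLAS = {
    "north": (lambda p: p[1] < 1.0, psi_north),
    "south": (lambda p: p[1] > -1.0, psi_south),
}


def atlas_at(p):
    """Indices of the charts whose domain contains p (compare atlas_p(M))."""
    return [name for name, (in_dom, _) in ATLAS.items() if in_dom(p)]


def transition(u):
    """The transition map psi_south o psi_north^{-1}, defined for u != 0."""
    return psi_south(psi_north_inv(u))
```

Each pole lies in the domain of exactly one of the two charts, and every other point of the circle lies in both, so `atlas_at` returns one or two indices.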
26.4.12 Remark: By Theorem 26.4.10, the topology $T_M$ in Definition 26.4.11 is uniquely determined by the atlas $A_M$. The atlas $A_M$ is uniquely determined up to an atlas equivalence by the topology $T_M$. See Remark 26.4.1 for related discussion.

26.4.13 Theorem: If $S$ is an atlas for the topological space $(M, T)$, and $\psi$ is a chart for $(M, T)$, then $S \cup \{\psi\}$ is an atlas for $(M, T)$. The set $\{\psi;\ \psi \text{ is a chart for } M\}$ is an atlas for $M$.

[ See Malliavin [35], proposition I.1.3.1 for Theorem 26.4.13. ]

Proof: See Exercise 47.9.3.

26.4.14 Remark: All pairs of charts for a topological manifold are automatically consistent on their intersection. This is because the topological manifold structure is entirely determined by the topological structure. This is different to the case of a differentiable manifold, where an atlas is required to indicate which structure is intended. An atlas for a topological manifold is fairly superfluous unless the manifold is actually defined in terms of an atlas using the concept of a topological graft, as in Theorem 26.6.4.

26.4.15 Definition: The maximal atlas of a topological space $(M, T)$ is the atlas consisting of all charts for $(M, T)$, namely $\mathring{C}(M, \mathbb{R}^n) = \{\psi : U \to \mathbb{R}^n;\ U \in \mathrm{Top}(M) \text{ and } \psi \text{ is a homeomorphism}\}$.

[ Define a “complete atlas” and show that the maximal atlas is complete? ]

26.4.16 Remark: Although the structure of a topological manifold may equally be described by its topology or by an atlas, it seems that a differentiable manifold requires the atlas, unless there’s some sort of set of subsets or something on a differentiable atlas from which the differentiable structure can be recovered.

26.4.17 Remark: If one wished to generalize the concept of a topological manifold in a manner similar to the intrinsic definition of a topological fibre bundle (Definition 23.2.1), then one could define a manifold to be a topological space $(M, T)$ such that for all $x_1, x_2 \in M$, there exist neighbourhoods $U_1, U_2 \in T$ of $x_1$ and $x_2$ respectively such that $U_1 \approx U_2$. In other words, the topological space is locally homogeneous. This also would imply the existence of a fixed extrinsic topological space $(V, T')$ such that for all $x \in M$, there is a neighbourhood $U$ of $x$ such that $U \approx V$. This would mean that $(M, T)$ is a space which is patched together from patches of the space $(V, T')$. This is not very useful for differential geometry because only spaces which are locally Euclidean are relevant.

26.4.18 Remark: In topological manifolds, a curve may be tested for continuity by mapping it through the charts as in Theorem 26.4.19.

26.4.19 Theorem: If $\gamma : I \to M$ is a map from an interval $I \subseteq \mathbb{R}$ to an n-dimensional topological manifold $M$, then $\gamma$ is continuous if and only if $\psi \circ \gamma|_{\gamma^{-1}(U)} : \gamma^{-1}(U) \to \mathbb{R}^n$ is continuous for all continuous charts $\psi : U \to \mathbb{R}^n$ for $M$.

Proof: See Exercise 47.9.4.

26.5. Topological manifold constructions, attributes and relations

26.5.1 Theorem: A function $f : M_1 \to M_2$ is continuous if and only if continuous through the charts etc.

[ See Malliavin [35], proposition I.1.5.3. ]

Proof: See Exercise 47.9.5.

[ Define here the restriction of a manifold to an open subset. See Malliavin [35], section I.6, for submanifolds. ]

26.5.2 Remark: The direct product functions $\psi_1 \times \psi_2$ in Definition 26.5.3 are given by Definition 6.9.11, which defines $\psi_1 \times \psi_2 : (p_1, p_2) \mapsto (\psi_1(p_1), \psi_2(p_2))$ for $p_1 \in \mathrm{Dom}(\psi_1)$ and $p_2 \in \mathrm{Dom}(\psi_2)$. If $n_1 = \dim(M_1)$ and $n_2 = \dim(M_2)$, then the usual identification of $\mathbb{R}^{n_1} \times \mathbb{R}^{n_2}$ with $\mathbb{R}^{n_1 + n_2}$ by concatenation is assumed. (See Definition 7.7.6 for concatenation.)
26.5.3 Definition: The (direct) product atlas of two atlases $S_1$ and $S_2$ on topological manifolds $M_1$ and $M_2$ respectively is the atlas $S$ for the product topological space $M_1 \times M_2$ given by $S = \{\psi_1 \times \psi_2;\ \psi_1 \in S_1 \text{ and } \psi_2 \in S_2\}$.

26.5.4 Theorem: If $(M_1, S_1)$ and $(M_2, S_2)$ are topological manifolds, then $(M_1 \times M_2, S_1 \times S_2)$ is a topological manifold with the direct product atlas $S_1 \times S_2$ as in Definition 26.5.3.

26.5.5 Definition: The (direct) product manifold of two topological manifolds $(M_1, S_1)$ and $(M_2, S_2)$ is the manifold $(M_1 \times M_2, S_1 \times S_2)$.

26.5.6 Theorem: The topology induced on the product of two topological manifolds by a product atlas is the same as the product of the topologies induced by the respective atlases on the manifolds.

Proof: See Exercise 47.9.6.

26.5.7 Remark: Generally the direct product atlas of two maximal atlases for topological manifolds is not itself maximal. This is another good reason to not work with maximal atlases.

[ Should say something about manifolds with boundaries here. ] [ Also present here the quotient of a manifold with respect to an equivalence relation. ]
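The product atlas of Definition 26.5.3, with the concatenation convention of Remark 26.5.2, can be sketched in Python as follows. The two small atlases, the chart `shear` and all function names are hypothetical choices made for illustration. Charts are modelled as functions which return coordinate tuples, and chart domains are taken to be the whole space for simplicity. The chart `shear` illustrates Remark 26.5.7: it is a global chart for $\mathbb{R} \times \mathbb{R}^2$ whose first coordinate depends on the second factor, so it is not of the form $\psi_1 \times \psi_2$.

```python
def product_chart(psi1, psi2):
    """Return psi1 x psi2, which sends (p1, p2) to the concatenation of the
    coordinate tuples psi1(p1) and psi2(p2)."""
    def psi(p):
        p1, p2 = p
        return tuple(psi1(p1)) + tuple(psi2(p2))
    return psi


def product_atlas(atlas1, atlas2):
    """All products psi1 x psi2 with psi1 in atlas1 and psi2 in atlas2."""
    return [product_chart(a, b) for a in atlas1 for b in atlas2]


# Hypothetical C^0 atlases: two global charts for M1 = R, one for M2 = R^2.
# The map x -> x^3 is a homeomorphism of R, so it is a valid C^0 chart.
atlas_m1 = [lambda x: (x,), lambda x: (x ** 3,)]
atlas_m2 = [lambda q: (q[0], q[1])]

atlas_product = product_atlas(atlas_m1, atlas_m2)


def shear(p):
    """A global chart for R x R^2 which is not of product form, because its
    first coordinate depends on the second factor."""
    x, (y, z) = p
    return (x + y, y, z)
```

With these choices the product atlas has $2 \times 1 = 2$ charts, each taking values in $\mathbb{R}^{1+2} = \mathbb{R}^3$.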
26.6. Topological identification spaces

26.6.1 Remark: Although the formal definition of a topological manifold is expressed in terms of a topological space with the property that charts exist everywhere, in practice, the manifold itself is often defined in terms of charts. In other words, the manifold is actually constructed from the grafting together of an atlas of charts. So the fact that an atlas of charts exists is a direct consequence of the method of construction.

The graft of a family of topological spaces is presented in Definition 15.11.2. If the charts in a graft of a family of topological spaces are locally Euclidean and related to each other by homeomorphisms on their intersections, the resulting graft will be a manifold. This is stated more precisely in Theorem 26.6.4. Conversely, a topological manifold is homeomorphic to a topological graft of the chart spaces, as stated in Theorem 26.6.3.

26.6.2 Remark: Theorem 26.6.3 is illustrated in Figure 26.6.1.

[Figure 26.6.1: Topological graft of charts of a manifold, showing a point $p \in M$, charts $\psi_1$, $\psi_2$ with $U_i = \mathrm{Dom}(\psi_i)$ and $\psi_i(U_i) = \mathrm{Range}(\psi_i) \subseteq \mathbb{R}^n$, coordinates $x_1$, $x_2$, and identification maps $f_1$, $f_2$ to the point $x$ of the graft $X$]

26.6.3 Theorem: Let $(\psi_i)_{i \in I}$ be an indexed atlas for an n-dimensional topological manifold $M$. Let $X_i = \mathrm{Range}(\psi_i)$ for $i \in I$. Define $X \subseteq \mathop{\mathring{\times}}_{i \in I} X_i$ by
$$X = \bigl\{ (x_i)_{i \in J} \in \mathop{\mathring{\times}}_{i \in I} X_i ;\ \exists p \in M,\ (\forall i \in J,\ x_i = \psi_i(p) \text{ and } \forall i \in I \setminus J,\ p \notin \mathrm{Dom}(\psi_i)) \bigr\}$$
$$\phantom{X} = \bigl\{ (x_i)_{i \in J} \in \mathop{\mathring{\times}}_{i \in I} X_i ;\ \exists p \in M,\ \forall i \in I,\ ((i \in J \text{ and } x_i = \psi_i(p)) \text{ or } (i \notin J \text{ and } p \notin \mathrm{Dom}(\psi_i))) \bigr\}.$$
For all $i \in I$, define the topology $T_i$ on $X_i$ to be the relative topology from $\mathbb{R}^n$. The family $(X_i, T_i)_{i \in I}$ is topologically consistent with the graft $X$. (See Definition 15.11.2.) Let $T$ be the graft topology on $X$. Then $(X, T) \approx (M, \mathrm{Top}(M))$.
Proof: It is evident from Definition 6.10.5 that $X$ is a set graft of the family $(X_i)_{i \in I}$. Condition (i) of Definition 6.10.5 requires that there be no null families in $X$. This follows from the fact that the atlas covers $M$. The remaining conditions follow equally straightforwardly.

The fact that the family $(X_i, T_i)_{i \in I}$ is topologically consistent with the graft $X$ follows directly from the topological consistency of all charts in the atlas. Since all chart transition maps are homeomorphisms, the image of an open set under any chart transition map is an open set.

To show the topological equivalence of $(X, T)$ and $(M, \mathrm{Top}(M))$, define the identification map $f_i : X_i \to X$ as in Definition 15.11.2 so that $f_i(x_i) = x$ for all $i \in I$ and $x \in X$. In other words, $f_i$ maps each element of $X_i$ to the corresponding element of the graft $X$. Define the function $f : M \to X$ so that $f(p) = f_i(\psi_i(p))$ for some $i \in I$. This is well-defined because the functions $f_i$ are defined so that $f_i(x_i) = f_j(x_j)$ if and only if $\psi_i^{-1}(x_i) = \psi_j^{-1}(x_j)$ for all $i, j \in I$.

To show that $f$ is a homeomorphism, first let $\Omega \in \mathrm{Top}(M)$ and note that $\psi_i(\Omega) \in T_i$ for all $i \in I$. Therefore $f(\Omega) = \bigcup_{i \in I} f_i(\psi_i(\Omega)) \in T$. Similarly, any open set of $(X, T)$ is of the form $\bigcup_{i \in I} f_i(\Omega_i)$ for some open sets $\Omega_i \in T_i$ for $i \in I$, by Definition 15.11.2. Each set $\psi_i^{-1}(\Omega_i)$ is open in $(M, \mathrm{Top}(M))$. So $f^{-1}(\bigcup_{i \in I} f_i(\Omega_i)) = \bigcup_{i \in I} f^{-1}(f_i(\Omega_i)) = \bigcup_{i \in I} \psi_i^{-1}(\Omega_i) \in \mathrm{Top}(M)$. Hence $f : (M, \mathrm{Top}(M)) \approx (X, T)$.
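The set graft in Theorem 26.6.3 can be illustrated with a very small indexed atlas. The following Python sketch makes several assumptions which are not in the text: the manifold is $M = \mathbb{R}$, the index set is $I = \{1, 2\}$, the two charts are $\psi_1(p) = p$ on $\{p < 1\}$ and $\psi_2(p) = 2p$ on $\{p > -1\}$, and a partial family $(x_i)_{i \in J}$ is encoded as a sorted tuple of pairs $(i, x_i)$. All names are hypothetical. The sketch only models the point sets and the identification maps, not the graft topology.

```python
# Hypothetical indexed atlas (psi_i)_{i in I} for M = R with I = {1, 2}.
CHARTS = {
    1: {"dom": lambda p: p < 1.0, "psi": lambda p: p, "inv": lambda x: x},
    2: {"dom": lambda p: p > -1.0, "psi": lambda p: 2.0 * p,
        "inv": lambda x: x / 2.0},
}


def graft_point(p):
    """f(p): the partial family (x_i)_{i in J} with x_i = psi_i(p), where
    J is the set of indices i such that p lies in Dom(psi_i)."""
    return tuple(sorted((i, c["psi"](p))
                        for i, c in CHARTS.items() if c["dom"](p)))


def identify(i, x_i):
    """f_i : X_i -> X, sending x_i to the graft element which contains it.
    The caller must supply x_i in X_i = Range(psi_i); this is not checked."""
    return graft_point(CHARTS[i]["inv"](x_i))
```

A point in the overlap of the two chart domains yields a family with $J = \{1, 2\}$, and the coordinates $x_1 = 0.5$ and $x_2 = 1.0$ of the same point $p = 0.5$ are identified, in line with the condition $f_i(x_i) = f_j(x_j)$ if and only if $\psi_i^{-1}(x_i) = \psi_j^{-1}(x_j)$ in the proof above.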
26.6.4 Theorem: Let $n \in \mathbb{Z}^+$, and let $(X, T)$ be a topological graft of a family $(X_i, T_i)_{i \in I}$ of topological spaces which are all homeomorphic to some open subset of $\mathbb{R}^n$. Then $(X, T)$ is an n-dimensional topological manifold.

Proof: See Exercise 47.9.7.
Chapter 27
Differentiable manifolds

27.1 Terminology and definition choices
27.2 Differentiable manifold atlases
27.3 Some standard differentiable manifold atlases
27.4 Some basic definitions for differentiable manifolds
27.5 Differentiable real-valued functions on differentiable manifolds
27.6 Differentiable curves and paths
27.7 Differentiable families of differentiable transformations
27.8 Differentiable maps between differentiable manifolds
27.9 Analytic manifolds
27.10 Unidirectionally differentiable manifolds
27.11 Lipschitz manifolds and rectifiable curves
27.12 Differentiable fibrations
27.13 Tangent space building principles
27.0.1 Remark: This chapter introduces the “differential layer” of differential geometry. (The structural layers of differential geometry are summarized in Sections 1.1 and 25.3.) No connection or metric is assumed to be defined. Only a differentiable structure is defined. Therefore geodesics, covariant derivatives, distances and angles are meaningless here. The subject matter of Chapters 27 to 33 is referred to as “differential topology” by Misner/Thorne/Wheeler [37], chapter 9. Figure 27.0.1 illustrates the relations between the various kinds of manifolds according to the amount of structure which is defined on them.

[Figure 27.0.1: Family tree of manifolds according to structures defined on them. Nodes: topological manifold; differentiable manifold; differentiable manifold with connection; Riemannian manifold; pseudo-Riemannian manifold]

27.0.2 Remark: Apart from global considerations, a differentiable manifold is the same as a flat space which is subject to arbitrary differentiable changes of coordinates. Only concepts which retain their meaning under local diffeomorphisms of the underlying space are meaningful for differentiable manifolds. For example, the concept of a straight line is meaningless.

27.0.3 Remark: Differentiable manifold topics are distributed among the chapters as follows.

chapter                          topics
27. differentiable manifolds     differentiability tests for manifolds, functions, maps and curves
28. tangent bundles              tangent vectors; tangent operators
29. tensors, tensor fields       covariant vectors; tensors; vector fields; tensor fields; differential forms
30. higher-order vectors         higher-order tangent vectors and operators
31. differentials                differentials of functions, maps and curves
32. higher-order differentials   higher-order differentials; differentials for higher-order operators
33. vector field calculus        Poisson bracket; Lie derivatives; exterior derivative
27.1. Terminology and definition choices
27.1.1 Remark: The central concept of the differential layer of a manifold is the tangent space. Tangent spaces are defined in Chapter 28. The reason for using regularity classes such as $C^r$ rather than $r$-times differentiability is to simplify the definitions of derivatives. For instance, the directional derivative $\lim_{a\to 0} (f(x + av) - f(x))/a$ for $v \in \mathbb{R}^n$ may be expressed as $\sum_{i=1}^n v^i\, \partial f(x)/\partial x^i$ if $f$ is $C^1$.

27.1.2 Remark: The concept of a “differentiable structure” on a manifold is an abstract concept or “essence” which does not need to be represented by any particular set construction such as the $C^r$ maximal atlas which many authors use. The best way to think of the differentiable structure is as a set of methods and procedures for answering questions about differentiability for a manifold, such as which functions are differentiable and which curves are differentiable. Such questions can be answered in terms of a finite atlas or in other ways.

For computational applications, it is desirable to define a manifold in terms of a finite atlas rather than a maximal atlas. The desire to have a manifold definition which is independent of a particular finite atlas should be resisted because the completion of any atlas is highly dependent on the level of regularity specified. A maximal atlas also discards possibly valuable information, such as, for example, a varying level of regularity in different regions of the manifold. Therefore a $C^r$ differentiable structure is defined in this book as a $C^r$ differentiable atlas.

The term “differentiable structure” seems wrong although it is common usage. The expression “differential structure” seems much more logical since the structure itself is not differentiable. Expressions such as “metric structure”, “algebraic structure” and “topological structure” certainly suggest “differential structure” as the preferred term. It is likely that “differentiable structure” is really a contraction of “differentiable manifold structure”.

Some authors use the term “differentiable manifold” to mean a $C^\infty$ manifold. This is an unfortunate practice. When a reader refers to a book for particular results, it is very easy to make serious errors by thinking that the assumptions or assertions of a theorem are weaker than they really are. It is an unnecessary practice, given that the term $C^\infty$ is easier to write and has the same number of syllables to pronounce as “differentiable”. So no effort is economized and great harm can be done in terms of wasted time and energy for the reader. Redefining standard terms to mean something else is nearly always a nuisance for the reader. (A similar practice is the habit of using the term “map” to mean a continuous map. This is dangerous if the reader is not reading a book linearly.)

Neither the $C^1$ nor the $C^\infty$ interpretation of the word “differentiable” agrees with the standard elementary calculus definition, which is a weaker notion than $C^1$. However, the $C^1$ interpretation will be used for the word “differentiable” in the manifold context in this book, although this is, strictly speaking, incorrect terminology. (See Remark 27.2.7, for instance.)

27.1.3 Remark: Many differential geometers claim to use “coordinate-free” definitions and notations. These just hide the coordinates so that you don’t see them. There is an analogy here to computer operating systems which offer a point-and-click interface so that you never have to type textual commands on an old-fashioned command line. What really happens is that someone sets up buttons and menus so that the novice user can make complex commands happen with the click of a mouse, but when the experienced user finds that the pre-programmed command sequences are insufficient, they must use a text command line. In the same way, differential geometry can be done in a coordinate-free manner with notations set up to hide the coordinates for common situations. Then when you want to do something unusual which is not on the menu, you must do the hard work and go back to coordinates. Practical calculations almost always require detailed low-level work with coordinates. Anyone who wants to take differential geometry seriously should not avoid learning the low-level coordinate versions of everything. This does not mean that one should use tensor calculus at all times, but one should always know how to convert all “coordinate-free” expressions into the coordinate expressions which they hide.
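The formula in Remark 27.1.1 can be checked numerically. The following is only a minimal sketch: the test function `f`, its hand-computed gradient `grad_f`, the point `x`, the direction `v` and the step size are all hypothetical choices made for illustration, not taken from the text.

```python
import math


def f(x):
    # Hypothetical C^1 test function on IR^2, chosen only for illustration.
    return math.sin(x[0]) * x[1] + x[0] ** 2


def grad_f(x):
    # Partial derivatives of f, worked out by hand.
    return [math.cos(x[0]) * x[1] + 2.0 * x[0], math.sin(x[0])]


def difference_quotient(func, x, v, a):
    # (f(x + a v) - f(x)) / a, whose limit as a -> 0 is the directional derivative.
    shifted = [xi + a * vi for xi, vi in zip(x, v)]
    return (func(shifted) - func(x)) / a


def partial_derivative_sum(grad, x, v):
    # Sum over i of v^i * (partial f / partial x^i)(x).
    return sum(vi * gi for vi, gi in zip(v, grad(x)))


x = [0.7, -1.3]
v = [2.0, 0.5]
quotient = difference_quotient(f, x, v, 1e-6)
formula = partial_derivative_sum(grad_f, x, v)
print(quotient, formula)
```

The two printed numbers should agree to several decimal places. The small residual comes from the finite step size, not from the formula.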
27.2. Differentiable manifold atlases

27.2.1 Remark: It is a source of some frustration that a geometric object as symmetric as the 2-sphere $S^2$ in $\mathbb{R}^3$, which is everywhere smooth and uniform, must be described in differential geometry with charts which have edges. It is not possible to cover all of $S^2$ with a single chart. This is the same annoyance that occurs when trying to print a map of the world on a single sheet of flat paper. The cause of the problem is the fact that Cartesian coordinates, which are so appropriate for parametrising a space such as $\mathbb{R}^3$, simply cannot cope with a sphere.

A similar problem arises even in flat spaces such as $\mathbb{R}^3$, because in physical space we do not see any grid lines. Any set of Cartesian coordinates for $\mathbb{R}^3$ will make some sets easier to describe than others. So we accept that any translation and orientation of Cartesian coordinates is valid. But annoyingly, there is no single “correct” translation and orientation. Physical 3-dimensional space is apparently everywhere uniform and smooth, but Cartesian coordinates have special points and directions, so that there are an infinite number of ways of parametrising any physical system. The fact that the cross-product of two intervals of real numbers cannot parametrise a 2-sphere is an additional annoyance.

Euclid’s geometry, in which all relations between figures are relative, does not suffer from these annoyances, but for the sake of analysis, it is necessary to work with coordinates and charts. Maybe some day, someone will discover how to analyse the world without recourse to grids and charts. For the present, though, coordinates are the only practical way to do differential geometry. After all, everyone seems to accept that sentences are broken in the middle when the text flows into a new line. There is no such break in human speech, and yet we accept these breaks in printed books. No one would seriously suggest that books should be written on long paper rolls so as to avoid line breaks and page breaks. Similarly we must accept that manifold charts have edges and non-uniformities.
27.2.2 Definition: A $C^r$ atlas for a topological manifold $M \mathrel{-\!\!<} (M, T_M)$ for $r \in \bar{\mathbb{Z}}_0^+$ is a topological atlas $A_M$ for $(M, T_M)$ such that $\psi_2 \circ \psi_1^{-1} : \psi_1(U_1 \cap U_2) \to \psi_2(U_1 \cap U_2)$ is $C^r$ for all $\psi_1, \psi_2 \in A_M$, where $U_\alpha = \mathrm{Dom}(\psi_\alpha)$ for $\alpha = 1, 2$. (See Figure 27.2.1.)

An indexed $C^r$ atlas for a topological manifold $M$ is a family $(\psi_\alpha)_{\alpha\in I}$ such that $\{\psi_\alpha;\ \alpha \in I\}$ is a $C^r$ atlas for $M$. A $C^r$ atlas for a topological manifold $M$ is also called a $C^r$ differentiable structure on $M$.

Figure 27.2.1 Transition map $\psi_2 \circ \psi_1^{-1}\big|_{\psi_1(U_1 \cap U_2)}$ (the chart domains $U_1, U_2 \subseteq M$ are mapped by $\psi_1, \psi_2$ to $\psi_1(U_1), \psi_2(U_2) \subseteq \mathbb{R}^n$, $n = \dim(M)$; the overlap $U_1 \cap U_2$ is mapped to $\psi_1(U_1 \cap U_2)$ and $\psi_2(U_1 \cap U_2)$)
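Definition 27.2.2 can be illustrated with a two-chart atlas for the sphere $S^2$ of Remark 27.2.1. The sketch below assumes the two stereographic projections from the north and south poles as charts; this choice of charts, and the function names, are illustrative assumptions rather than anything specified in the text. It evaluates the transition map $\psi_S \circ \psi_N^{-1}$ on a sample point and compares it with the closed form $x \mapsto x/|x|^2$, which is $C^\infty$ on $\mathbb{R}^2 \setminus \{0\}$.

```python
def psi_north(p):
    # Stereographic projection from the north pole (0, 0, 1).
    # Its domain is S^2 with the north pole removed.
    return (p[0] / (1.0 - p[2]), p[1] / (1.0 - p[2]))


def psi_south(p):
    # Stereographic projection from the south pole (0, 0, -1).
    # Its domain is S^2 with the south pole removed.
    return (p[0] / (1.0 + p[2]), p[1] / (1.0 + p[2]))


def psi_north_inv(x):
    # Inverse of psi_north: maps a point of IR^2 back onto S^2.
    s = x[0] ** 2 + x[1] ** 2
    return (2.0 * x[0] / (s + 1.0), 2.0 * x[1] / (s + 1.0), (s - 1.0) / (s + 1.0))


def transition(x):
    # psi_south o psi_north^{-1}, defined on IR^2 minus the origin.
    return psi_south(psi_north_inv(x))


x = (0.3, -1.2)
p = psi_north_inv(x)
on_sphere = p[0] ** 2 + p[1] ** 2 + p[2] ** 2      # should be 1 up to rounding
y = transition(x)
s = x[0] ** 2 + x[1] ** 2
expected = (x[0] / s, x[1] / s)                     # closed form x / |x|^2
print(on_sphere, y, expected)
```

Each chart on its own misses one pole, so at least two charts are needed, in line with Remark 27.2.1.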
27.2.3 Remark: Definition 27.2.2 defines a $C^r$ atlas as a special kind of topological atlas which satisfies a differentiability condition. This assumes that the set $M$ has a pre-defined topology of a topological manifold. (Topological atlases are defined in Section 26.3.) An alternative approach is taken in Definition 27.2.4 where no pre-existing topology is assumed. In this case, a unique topology is induced on the set $M$ by the atlas. It is not really important whether the topology comes from the atlas or the atlas comes from the topology, as long as they agree with each other. Definition 27.2.4 is used in Definition 27.2.6.

27.2.4 Definition: For $n \in \mathbb{Z}_0^+$ and $r \in \bar{\mathbb{Z}}_0^+$, an $n$-dimensional $C^r$ atlas for a set $M$ is a set $A_M$ of bijections $\psi : U \to \mathbb{R}^n$ from subsets $U$ of $M$ to open subsets $\psi(U)$ of $\mathbb{R}^n$ such that
(i) $\forall \psi_1, \psi_2 \in A_M$, $\psi_2 \circ \psi_1^{-1} : \psi_1(U_1 \cap U_2) \to \psi_2(U_1 \cap U_2)$ is $C^r$, where $U_\alpha = \mathrm{Dom}(\psi_\alpha)$ for $\alpha = 1, 2$;
(ii) $\bigcup_{\psi\in A_M} \mathrm{Dom}(\psi) = M$.

An indexed $C^r$ atlas for a set $M$ is a family $(\psi_\alpha)_{\alpha\in I}$ such that $\{\psi_\alpha;\ \alpha \in I\}$ is a $C^r$ atlas for $M$.

The topology induced by a $C^r$ atlas $A_M$ on a set $M$ is the topology $T_M$ on $M$ for which all of the maps in $A_M$ are homeomorphisms.

The atlas $A_M$ is called a $C^r$ manifold atlas for the set $M$ if the topology induced by $A_M$ on $M$ is Hausdorff. The indexed atlas $(\psi_\alpha)_{\alpha\in I}$ is called an indexed $C^r$ manifold atlas for the set $M$ if the topology induced by $A_M = \{\psi_\alpha;\ \alpha \in I\}$ on $M$ is Hausdorff.

[ Must show that the domains of the charts in Definition 27.2.4 are open in the induced topology. Must also define the topology more succinctly and precisely. ]

27.2.5 Remark: The topology $T_M$ in Definition 27.2.4 is uniquely determined by the atlas $A_M$ and is the weak topology induced by the charts $\psi \in A_M$. If the topology $T_M$ on $M$ is not Hausdorff, then the atlas $A_M$ may be thought of as a “$C^r$ locally Euclidean space atlas”.
27.2.6 Definition: For $n \in \mathbb{Z}^+$ and $r \in \bar{\mathbb{Z}}_0^+$, an $n$-dimensional $C^r$ (differentiable) manifold is a pair $(M, A_M)$ such that $M$ is a set and $A_M$ is a $C^r$ manifold atlas for the set $M$. The topology $T_M$ induced by the atlas $A_M$ on $M$ is called the underlying topology of the differentiable manifold $(M, A_M)$. The topological space $(M, T_M)$ is called the underlying topological space of the differentiable manifold $(M, A_M)$.

27.2.7 Remark: If the regularity class $C^r$ of a differentiable manifold is not specified, it is assumed to be $C^1$. A “differentiable manifold” means a $C^1$ manifold unless otherwise indicated. Some authors, unfortunately, say “differentiable manifold” when they mean “$C^\infty$ manifold”. It is better to use the term “smooth manifold” to mean “$C^\infty$ manifold”. Best of all is to avoid the ambiguous terms “differentiable manifold” and “smooth manifold” and always state the regularity class explicitly as $C^1$ or $C^\infty$ or some other class.

A very wide range of regularity classes could be useful instead of just “$C^r$” in Definition 27.2.2. However, only the $C^r$ classes are specified here because they are adequate for most purposes of this book, and because it is easy to substitute some other regularity class such as “analytic” or “$C^{k,\alpha}$” for “$C^r$”. This is also mentioned in Remark 18.7.7 for flat spaces. In the case of differentiable manifolds, one would need to define a class of maps from $\mathbb{R}^n$ to $\mathbb{R}^n$, such as a pseudogroup, for each regularity class. (See Section 19.4 for diffeomorphism pseudogroups.)

27.2.8 Remark: Whereas a topological manifold may be defined without an atlas (see Definition 26.3.1), it is necessary to specify an atlas for a differentiable manifold in order to indicate the choice of differentiable structure. For any $r \in \bar{\mathbb{Z}}^+$, a single topological space may support an infinite number of incompatible $C^r$ atlases. By contrast, all $C^0$ atlases for a topological manifold are automatically compatible. So a $C^0$ manifold is just a topological manifold with an arbitrary atlas. (See Remarks 26.4.14 and 27.4.11.)

27.2.9 Remark: A $C^0$ manifold is not, strictly speaking, a differentiable manifold. The case $r = 0$ is included for notational convenience, and also to provide an alternative specification tuple for a topological manifold. The term “$C^0$ manifold” will be used for any tuple $(M, A_M)$ such that $A_M$ is a $C^0$ atlas for $M$, whereas if $T_M$ is the topology induced on $M$ by the atlas $A_M$, the pair $(M, T_M)$ is referred to as a “topological manifold”. (See Remark 26.4.1 for related discussion.)
27.3. Some standard differentiable manifold atlases
27.3.1 Definition: The usual atlas for an open subset $M$ of $\mathbb{R}^n$ for $n \in \mathbb{Z}_0^+$ is the atlas $A_M = \{\psi\}$ where $\psi = \mathrm{id}_M$. The usual atlas for an open set $M \subseteq \mathbb{R}^n$ is also called the standard atlas for $M$ or the usual differentiable structure for $M$ or the standard differentiable structure for $M$. The pair $(\mathbb{R}^n, A_{\mathbb{R}^n})$ is called the differentiable manifold $\mathbb{R}^n$ for $n \in \mathbb{Z}_0^+$, where $A_{\mathbb{R}^n} = \{\mathrm{id}_{\mathbb{R}^n}\}$.

27.3.2 Remark: The differentiable manifold $\mathbb{R}^n \mathrel{-\!\!<} (\mathbb{R}^n, A_{\mathbb{R}^n})$ in Definition 27.3.1 is a $C^\infty$ manifold for all $n \in \mathbb{Z}_0^+$. It is also an analytic manifold.

27.3.3 Remark: The set $M = \{x \in \mathbb{R}^{n+1};\ x^{n+1} = f(x^1, \ldots x^n)\} \subseteq \mathbb{R}^{n+1}$ for any $C^{0,1}$ function $f : \mathbb{R}^n \to \mathbb{R}$ can be given manifold charts in a natural way by projecting points from $M$ to $\{x \in \mathbb{R}^{n+1};\ x^{n+1} = 0\}$, which may be identified with $\mathbb{R}^n$. For any $w \in \mathbb{R}^n$, a function $\psi_w : M \to \mathbb{R}^n$ may be defined by $\psi_w : (x^1, \ldots x^{n+1}) \mapsto (x^1 - w^1 x^{n+1}, \ldots x^n - w^n x^{n+1})$. This function is one-to-one if $|w| < \mathrm{Lip}(f)^{-1}$, where $\mathrm{Lip}(f)$ is the Lipschitz constant of $f$ in Definition 17.4.10. An example of this is illustrated in Figure 43.4.1.

Manifolds which are embedded in the flat spaces $\mathbb{R}^n$ are said to be “regularly” embedded if they are everywhere locally projectable in this one-to-one manner onto hyperplanes of the ambient space. If only a single projection of a manifold onto a hyperplane is used in an atlas, the atlas will be of class $C^\infty$ because the only transition map is then the identity map of the chart with itself. Therefore it is important to include enough charts in the atlas to accurately describe the inherent regularity (or irregularity) of the manifold. In $\mathbb{R}^n$, $n$ projections with $n$ independent projection directions at each point should be adequate to fully describe the manifold’s regularity.
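The injectivity condition $|w| < \mathrm{Lip}(f)^{-1}$ in Remark 27.3.3 can be explored on a small example. The sketch below assumes $n = 1$ and the Lipschitz function $f(t) = |t|$, for which $\mathrm{Lip}(f) = 1$; both are illustrative assumptions. Checking monotonicity on a finite sample is only evidence of injectivity, not a proof.

```python
def f(t):
    # Hypothetical Lipschitz function with Lip(f) = 1; its graph has a corner at 0.
    return abs(t)


def psi_w(point, w):
    # Oblique projection (x^1, x^2) -> x^1 - w * x^2 of Remark 27.3.3 for n = 1.
    x, height = point
    return x - w * height


def chart_values(w, samples):
    # Values of psi_w along the graph M = {(t, f(t))}.
    return [psi_w((t, f(t)), w) for t in samples]


def strictly_increasing(values):
    return all(a < b for a, b in zip(values, values[1:]))


samples = [k / 10.0 for k in range(-30, 31)]

# |w| = 0.5 < 1 / Lip(f): the chart is strictly increasing along the sample.
small_w_injective = strictly_increasing(chart_values(0.5, samples))

# |w| = 2.0 > 1 / Lip(f): two different graph points get the same chart value.
first = psi_w((1.0, f(1.0)), 2.0)
second = psi_w((-1.0 / 3.0, f(-1.0 / 3.0)), 2.0)
print(small_w_injective, first, second)
```

With the larger value of `w`, the projection direction is shallower than the slope of the graph, so the projection folds the graph over itself.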
27.4. Some basic definitions for differentiable manifolds
27.4.1 Remark: Although an atlas is defined in Definition 27.2.2 as a set $A_M$ of charts, an atlas is sometimes formalized as a family $(\psi_\alpha)_{\alpha\in I}$ such that $A_M = \{\psi_\alpha;\ \alpha \in I\}$. The conversion between these two set constructions is usually handled informally. Each form has its own advantages according to context.

27.4.2 Remark: The differentiability property of any function $f : \mathbb{R}^n \to \mathbb{R}^n$ is a kind of “locally affine” condition. It is therefore not surprising that affine connections are defined on manifolds satisfying Definition 27.2.2, which requires differentiable chart transition functions as indicated in Definition 27.4.7.

27.4.3 Remark: Kobayashi/Nomizu [26], page 1, give a very general class of regularity definitions for atlases through the concept of a “pseudogroup of transformations”. Thus an atlas is said to be compatible with a particular pseudogroup $\Gamma$ of transformations if its transition maps are all elements of $\Gamma$.

27.4.4 Notation: A differentiable manifold $(M, A_M)$ is written as $M$ if the atlas $A_M$ is implicit in the context. Then $\mathrm{atlas}(M)$ denotes the atlas $A_M$, and $\mathrm{atlas}_p(M)$ denotes $\{\psi \in \mathrm{atlas}(M);\ p \in \mathrm{Dom}(\psi)\}$, the subset of charts $\psi$ in $A_M$ whose domains contain a particular point $p \in M$. An alternative notation for $\mathrm{atlas}_p(M)$ is $A^p_M$.

27.4.5 Theorem: Let $(M, A_M)$ be a $C^r$ manifold for $r \in \bar{\mathbb{Z}}_0^+$. Then $(M, A_M)$ is a $C^s$ manifold for all $s \in \mathbb{Z}_0^+$ with $s \le r$.

27.4.6 Definition: A $C^r$ chart for a $C^r$ manifold $(M, A_M)$ is a topological chart $\psi$ for $M$ such that $A_M \cup \{\psi\}$ is a $C^r$ atlas for $M$. Such a chart $\psi$ is said to be $C^r$ compatible with $(M, A_M)$.

27.4.7 Definition: The coordinate transition matrix for charts $\psi_\alpha$ and $\psi_\beta$ in an indexed atlas $(\psi_\alpha)_{\alpha\in I}$ for a differentiable manifold $M$ at a point $p \in \mathrm{Dom}(\psi_\alpha) \cap \mathrm{Dom}(\psi_\beta)$ is the matrix $Z_{\beta\alpha}(p) \in GL(n)$ defined by

$$ Z_{\beta\alpha}(p)^i{}_j = \frac{\partial}{\partial x^j} \Bigl( \psi_\beta^i \circ \psi_\alpha^{-1}(x) \Bigr) \Big|_{x=\psi_\alpha(p)}. \tag{27.4.1} $$

The function $Z_{\beta\alpha} : \mathrm{Dom}(\psi_\alpha) \cap \mathrm{Dom}(\psi_\beta) \to GL(n)$ is called the coordinate transition matrix map for charts $\psi_\alpha, \psi_\beta \in \mathrm{atlas}(M)$.
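The coordinate transition matrix of equation (27.4.1) is simply the Jacobian matrix of the transition map in the coordinates of the first chart. As a hedged illustration, the sketch below takes $M$ to be the open right half-plane in $\mathbb{R}^2$ with a Cartesian chart $\psi_\alpha$ and a polar chart $\psi_\beta$; the manifold, the charts and the finite-difference step `h` are all assumptions made for this example. The numerical matrix is compared with the well-known closed form for the polar Jacobian.

```python
import math


def psi_alpha_inv(x):
    # The Cartesian chart is the identity map, so its inverse is too.
    return (x[0], x[1])


def psi_beta(p):
    # Polar chart (r, theta) on the open right half-plane.
    return (math.hypot(p[0], p[1]), math.atan2(p[1], p[0]))


def transition_matrix(x, h=1e-6):
    # Central-difference approximation to the matrix in equation (27.4.1):
    # entry [i][j] is the derivative of component i of psi_beta o psi_alpha^{-1}
    # with respect to coordinate j, evaluated at x = psi_alpha(p).
    n = len(x)
    matrix = [[0.0] * n for _ in range(n)]
    for j in range(n):
        forward = list(x)
        backward = list(x)
        forward[j] += h
        backward[j] -= h
        g_forward = psi_beta(psi_alpha_inv(forward))
        g_backward = psi_beta(psi_alpha_inv(backward))
        for i in range(n):
            matrix[i][j] = (g_forward[i] - g_backward[i]) / (2.0 * h)
    return matrix


x = (1.0, 2.0)
Z = transition_matrix(x)
r = math.hypot(x[0], x[1])
exact = [[x[0] / r, x[1] / r], [-x[1] / r ** 2, x[0] / r ** 2]]
determinant = Z[0][0] * Z[1][1] - Z[0][1] * Z[1][0]
print(Z, exact, determinant)
```

The determinant is close to $1/r$, which is nonzero, so the matrix lies in $GL(2)$ as Definition 27.4.7 requires.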
27.4.8 Theorem: Using the notation of Definition 27.4.7, $Z_{\beta\alpha} \in \mathring{C}^{r-1}(M, GL(n))$ if $M$ is $C^r$ and $r \ge 1$. If $M$ is $C^\infty$, then $Z_{\beta\alpha}$ is $C^\infty$.

[ This also uses Notation 27.5.5! Must fix this. ]

27.4.9 Remark: A useful mnemonic shorthand for equation (27.4.1) is

$$ Z_{\beta\alpha}(p)^i{}_j = \frac{\partial \psi_\beta^i}{\partial \psi_\alpha^j}(p). $$

This is similar to Remark 28.5.16.

27.4.10 Definition: Two $C^r$ manifold atlases $A^1_M$ and $A^2_M$ for a set $M$ are said to be $C^r$ equivalent atlases on $M$ if $A^1_M \cup A^2_M$ is a $C^r$ atlas for $M$. Then $(M, A^1_M)$ and $(M, A^2_M)$ are said to be $C^r$ equivalent manifolds.

27.4.11 Remark: Clearly $(M, A^1_M \cup A^2_M)$ in Definition 27.4.10 is $C^r$ equivalent to both $(M, A^1_M)$ and $(M, A^2_M)$.

[ See Malliavin [35], definition I.1.5.1, section I.3.1. ]

All atlases on a given topological manifold are $C^0$ equivalent. This follows from Theorem 26.4.13.

[ See Malliavin [35], proposition I.1.4.1. Need the Hausdorff condition to get this equivalence? ]

27.4.12 Definition: The restriction of a differentiable manifold $(M, A_M)$ to an open subset $\Omega$ of $M$ is the differentiable manifold $(\Omega, A_\Omega)$, where $A_\Omega = \{\psi\big|_{\Omega\cap\mathrm{Dom}(\psi)};\ \psi \in A_M \text{ and } \Omega \cap \mathrm{Dom}(\psi) \ne \emptyset\}$. The atlas $A_\Omega$ may be called the relative atlas or relative differentiable structure of the subset $\Omega$ of $M$.
27.4.13 Remark: If $(M, A_M)$ is a $C^r$ manifold, then the restricted manifold $(\Omega, A_\Omega)$ is $C^r$.

[ Prove this? ]

The exclusion of empty charts from the definition of the atlas $A_\Omega$ is quite arbitrary.

[ Define submanifolds near here. Do this also for topological manifolds. ]

27.4.14 Remark: A $C^r$ maximal atlas can be constructed for a $C^r$ manifold $(M, A_M)$ as the set of all $C^r$ compatible charts for $(M, A_M)$ according to Definition 27.4.6. This $C^r$ maximal atlas is sometimes called the “$C^r$ differentiable structure” on $M \mathrel{-\!\!<} (M, A_M)$. Maximal atlases are not very useful in practice. For example, Theorem 27.4.5 would not be valid if differentiable manifolds were required to have maximal atlases.

Kobayashi/Nomizu [26], page 2, call a maximal atlas a “complete” atlas. A $C^r$ maximal atlas could be referred to as a “$C^r$-maximal differentiable structure”. But it is clearer to call it a $C^r$-maximal atlas.

27.4.15 Definition: The product of two differentiable manifolds $(M_1, A_1)$ and $(M_2, A_2)$ is the differentiable manifold $(M_1 \times M_2, A)$, where $A$ is the product atlas of $A_1$ and $A_2$. (See Definition 26.5.3.)

[ See Malliavin [35], proposition I.3.2.7. ]

27.4.16 Theorem: Let $r \in \bar{\mathbb{Z}}_0^+$. Let $M_1$ and $M_2$ be $C^r$ manifolds. Then the product manifold of $M_1$ and $M_2$ is $C^r$.

[ Probably a large number of topological categories of $C^r$ manifolds could be collected together into a single definition instead of the following multiple categories. ]

27.4.17 Definition: A compact $C^r$ manifold is a $C^r$ manifold $(M, A_M)$ whose underlying topological space $M$ is compact.

27.4.18 Definition: A paracompact $C^r$ manifold is a $C^r$ manifold $(M, A_M)$ whose underlying topological space $M$ is paracompact.

[ Define foliations. Also for topological manifolds. See EDM2 [34], section 154. ]
27.5. Differentiable real-valued functions on differentiable manifolds

The continuity of a real-valued function on a topological manifold $M$ depends only on the topological structure. It has nothing to do with the choice of atlas, but differentiability of a real-valued function $f : M \to \mathbb{R}$ is meaningless if the only structure on a topological manifold $M$ is the topology. The same topological space $M$ can have many incompatible differentiable structures such that $f$ is differentiable in some but not in others. (See Remark 27.2.8.)

The $C^r$ differentiability of a function whose domain is an open subset of $\mathbb{R}^n$ and range is $\mathbb{R}^m$ for some $m, n \in \mathbb{Z}_0^+$ is (or will be) defined in Section 18.5. Thus the $C^r$ differentiability of functions on manifolds is defined in terms of the same property for flat spaces.
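27.5.1 Definition: A (real-valued) $C^r$ (differentiable) function for $r \in \bar{\mathbb{Z}}_0^+$ in an open subset $\Omega$ of a $C^r$ manifold $M \mathrel{-\!\!<} (M, A_M)$ is a function $f : \Omega \to \mathbb{R}$ such that $f \circ \psi^{-1} : \psi(\Omega \cap \mathrm{Dom}(\psi)) \to \mathbb{R}$ is of class $C^r$ for all charts $\psi \in A_M$. (See Figure 27.5.1.)

Figure 27.5.1 Differentiability test for $f \circ \psi^{-1} : \psi(\Omega \cap \mathrm{Dom}(\psi)) \to \mathbb{R}$ (the chart $\psi$ maps $\mathrm{Dom}(\psi) \subseteq M$ to $\mathrm{Range}(\psi) \subseteq \mathbb{R}^n$; $f$ maps $\Omega$ to $\mathbb{R}$; $f \circ \psi^{-1}$ maps $\psi(\Omega \cap \mathrm{Dom}(\psi))$ to $\mathbb{R}$)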
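The dependence of Definition 27.5.1 on the atlas can be seen in a one-dimensional example. The sketch below assumes $M = \mathbb{R}$ with two single-chart atlases, $\{x \mapsto x\}$ and $\{x \mapsto x^3\}$, and the function $f(p) = p$; these choices are illustrative assumptions and not from the text. Each atlas is $C^\infty$ on its own, but $f \circ \psi^{-1}$ is the identity for the first chart and the cube root for the second, and the cube root has no derivative at $0$. The code compares difference quotients at $0$ for the two local representations.

```python
import math


def cbrt(y):
    # Real cube root, valid for negative arguments as well.
    return math.copysign(abs(y) ** (1.0 / 3.0), y)


def f(p):
    # Hypothetical function on M = IR, chosen only for illustration.
    return p


def chart_id_inv(x):
    # Inverse of the chart psi(p) = p.
    return x


def chart_cube_inv(y):
    # Inverse of the chart psi(p) = p^3.
    return cbrt(y)


def quotient_at_zero(local_rep, h):
    # Difference quotient of a local representation f o psi^{-1} at 0.
    return (local_rep(h) - local_rep(0.0)) / h


steps = [10.0 ** (-k) for k in (2, 4, 6, 8)]
quotients_id = [quotient_at_zero(lambda x: f(chart_id_inv(x)), h) for h in steps]
quotients_cube = [quotient_at_zero(lambda y: f(chart_cube_inv(y)), h) for h in steps]
print(quotients_id)
print(quotients_cube)
```

The first list stays at 1, whereas the second grows without bound as the step shrinks. The two atlases are therefore not $C^1$ equivalent, which is the situation described in Remark 27.2.8.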
27.5.2 Definition: An ($\mathbb{R}^m$-valued) $C^r$ (differentiable) function for $m \in \mathbb{Z}_0^+$ and $r \in \bar{\mathbb{Z}}_0^+$ in an open subset $\Omega$ of a $C^r$ manifold $M \mathrel{-\!\!<} (M, A_M)$ is a function $f : \Omega \to \mathbb{R}^m$ such that $f \circ \psi^{-1} : \psi(\Omega \cap \mathrm{Dom}(\psi)) \to \mathbb{R}^m$ is of class $C^r$ for all charts $\psi \in A_M$.

27.5.3 Notation: For $r \in \bar{\mathbb{Z}}_0^+$ and $C^r$ manifolds $M \mathrel{-\!\!<} (M, A_M)$, denote by $C^r(M) = C^r(M, \mathbb{R})$ the real linear space of all $C^r$ functions on $M$ with the usual pointwise addition and real scalar multiplication operations for function spaces.

For $m \in \mathbb{Z}_0^+$, denote by $C^r(M, \mathbb{R}^m)$ the real linear space of all $\mathbb{R}^m$-valued functions of class $C^r$ on $M$, with the usual pointwise addition and real scalar multiplication operations for function spaces.

27.5.4 Theorem: The set $C^r(M, \mathbb{R})$ of $C^r$ real functions on a $C^r$ differentiable manifold $M$ is a ring under the operations of pointwise addition and multiplication.

27.5.5 Notation: For $r \in \bar{\mathbb{Z}}_0^+$ and $C^r$ manifolds $M$, denote by $\mathring{C}^r(M) = \mathring{C}^r(M, \mathbb{R})$ the set of all $C^r$ functions $f : \Omega \to \mathbb{R}$ on open sets $\Omega \in \mathrm{Top}(M)$.
27.5.6 Notation: For $r \in \bar{\mathbb{Z}}_0^+$ and points $p$ in a $C^r$ manifold $M$, denote by $\mathring{C}^r_p(M) = \mathring{C}^r_p(M, \mathbb{R})$ the set of all $C^r$ functions on open sets $\Omega \in \mathrm{Top}_p(M)$.

27.5.7 Remark: The spaces $C^r(M)$ contain functions which match any specified derivatives of order up to $r$ at any given point $p$ in the manifold. For instance, let $\psi \in A_M$ be a chart for $(M, A_M)$ such that $p \in \mathrm{Dom}(\psi)$, and let $R \in \mathbb{R}^+$ be a positive number such that the ball $B_{\psi(p),R} \subseteq \mathrm{Range}(\psi)$. Let $x_0 = \psi(p)$. Define $\phi : \mathrm{Range}(\psi) \to \mathbb{R}$ by $\phi(x) = 1 - 1/(1 + \exp(1/|x - x_0| - 1/(R - |x - x_0|)))$ for $0 < |x - x_0| < R$, $\phi(x_0) = 1$, and $\phi(x) = 0$ for $|x - x_0| \ge R$. Then $\phi \in C^\infty(\mathbb{R}^n)$ and all derivatives of $\phi$ equal zero for $x = x_0$ and $|x - x_0| \ge R$. (See Theorem 20.12.7.) Therefore if $\phi$ is multiplied by any polynomial function $P : \mathbb{R}^n \to \mathbb{R}$, the pointwise product $P\phi$ is $C^\infty$, has the same derivatives as the polynomial $P$ at $x = x_0$, and vanishes completely outside $B_{x_0,R}$. Then the function $(P\phi) \circ \psi : \mathrm{Dom}(\psi) \to \mathbb{R}$ may be extended with the value zero outside $\mathrm{Dom}(\psi)$ to yield a function in $C^r(M)$ with any specified values of derivatives up to order $r$ at the point $p$.

[ Define compact-open topology for differentiable manifolds? See EDM2 [34], section 279.C. ]

27.5.8 Definition: A local maximum of a function $u : M \to \mathbb{R}$ on a $C^0$ manifold $M$ is a point $p \in M$ such that for some open neighbourhood $\Omega$ of $p$, $u(q) \le u(p)$ for all $q \in \Omega$.

A local minimum of a function $u : M \to \mathbb{R}$ on a $C^0$ manifold $M$ is a point $p \in M$ such that for some open neighbourhood $\Omega$ of $p$, $u(q) \ge u(p)$ for all $q \in \Omega$.

27.5.9 Theorem: If $p \in M$ is a local maximum of a function $u \in C^1(M)$, where $M \mathrel{-\!\!<} (M, A_M)$ is a $C^1$ $n$-dimensional manifold, then $(\partial/\partial x^i)(u \circ \psi^{-1}(x))\big|_{x=\psi(p)} = 0$ for all $i = 1, \ldots n$ and $\psi \in A_M$.

If $M$ is $C^2$ and $u \in C^2(M)$, then the matrix of second derivatives $\bigl[(\partial^2/\partial x^i \partial x^j)(u \circ \psi^{-1}(x))\big|_{x=\psi(p)}\bigr]_{i,j=1}^n$ is negative semi-definite for all $\psi \in A_M$.

[ Must prove the “negative semi-definite” assertion in Remark 27.5.10. ]
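The cut-off function $\phi$ of Remark 27.5.7 can be evaluated directly. The following sketch works in $\mathbb{R}^2$ with $x_0 = 0$ and $R = 1$, and uses a hypothetical polynomial `P`; these values are assumptions for illustration only. The formula is rewritten as a logistic function of $t = 1/|x - x_0| - 1/(R - |x - x_0|)$, which is algebraically the same expression as in the remark but avoids floating-point overflow in `exp` near $x_0$ and near the boundary of the ball.

```python
import math


def phi(x, x0, R):
    # Cut-off function of Remark 27.5.7: 1 at x0, 0 outside the ball of radius R.
    d = math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, x0)))
    if d == 0.0:
        return 1.0
    if d >= R:
        return 0.0
    t = 1.0 / d - 1.0 / (R - d)
    # 1 - 1/(1 + exp(t)) equals exp(t)/(1 + exp(t)); choose the stable form.
    if t >= 0.0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def P(x):
    # Hypothetical polynomial whose derivatives at x0 are to be matched.
    return 3.0 + 2.0 * x[0] - x[1] ** 2


def localized(x, x0, R):
    # Pointwise product P * phi, which vanishes outside the ball.
    return P(x) * phi(x, x0, R)


x0 = (0.0, 0.0)
R = 1.0
at_centre = phi(x0, x0, R)
half_way = phi((0.5, 0.0), x0, R)
outside = phi((2.0, 0.0), x0, R)
near_centre_gap = abs(localized((0.01, 0.0), x0, R) - P((0.01, 0.0)))
print(at_centre, half_way, outside, near_centre_gap)
```

Near $x_0$ the product `localized` is numerically indistinguishable from `P`, which reflects the statement that $P\phi$ has the same derivatives as $P$ at $x_0$.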
27.5.10 Remark: The matrix $(u_{ij})_{i,j=1}^n = \bigl[(\partial^2/\partial x^i \partial x^j)(u \circ \psi^{-1}(x))\big|_{x=\psi(p)}\bigr]_{i,j=1}^n$ in Theorem 27.5.9 is negative semi-definite if and only if $\sum_{i,j=1}^n a^{ij} u_{ij} \le 0$ for all positive semi-definite real symmetric matrices $(a^{ij})_{i,j=1}^n$. This makes maximum principles for boundary value problems possible. This helps to explain why second-order elliptic partial differential equations are an important class of equations.

Since the first-order derivatives of the function $u$ in Theorem 27.5.9 are zero, the second-order derivatives of chart transition maps in Remark 19.5.3 do not play a role. So the second-order derivative matrix in Theorem 27.5.9 is in fact tensorial. In other words, it transforms entirely according to the first-order derivatives (the Jacobian matrix) of chart transition maps. Of course, whether or not a point is a maximum for a $C^2$ function is independent of the choice of chart. So, as expected, the criteria for a maximum in Theorem 27.5.9 are chart-independent. More than this, the criteria do not require any connection or metric, even though second derivatives are involved here.

27.6. Differentiable curves and paths

This section uses the definitions in Sections 16.2 and 16.4 for curves, paths and parametric families of curves in general topological spaces. Curves and families of curves are assumed to be continuous by definition.

27.6.1 Definition: A $C^r$ (differentiable) curve in a $C^r$ manifold $M \mathrel{-\!\!<} (M, A_M)$ for $r \in \bar{\mathbb{Z}}_0^+$ is an open curve $\gamma : I \to M$ in $M$ such that $\psi \circ \gamma : \gamma^{-1}(\mathrm{Dom}(\psi)) \to \mathbb{R}^n$ is of class $C^r$ for all $\psi \in A_M$, where $n = \dim(M)$.

27.6.2 Remark: Open curves are introduced in Definition 16.2.9. $C^r$ curves in $\mathbb{R}^n$ are given by Definition 18.6.5.

[ Define $C^k$ paths as equivalence classes $[\gamma]_k$ of $C^k$ curves $\gamma$. ]

[ Should extend Definition 27.6.1 to non-open intervals by requiring one-sided derivatives to be well-defined at the end-points of the interval. The boundary conditions get trickier in the case of families of curves. ]

[ Define a nonsingular curve/path. E.g. Greene/Wu [67], page 6. ]

[ Define a piecewise $C^r$ curve/path. State that piecewise $C^r$ curves/paths are closed under concatenation. Define rectifiable curves/paths. Show that a rectifiable curve in a $C^1$ manifold is differentiable almost everywhere. ]

[ Define tangent bundles and fibrations for paths, maybe not in this section. This definition should use curves/paths to cope with self-intersections of paths. The tangent vector at a point on a curve should be in the path tangent bundle. So the tangent vectors for a curve should be a cross-section of the path tangent bundle. ]
27.6.3 Theorem: If $p \in M$ is a local maximum of a function $u \in C^1(M)$, where $M$ is a $C^1$ manifold, and $\gamma : \mathbb{R} \to M$ is a $C^1$ curve in $M$, then $(d/dt)(u \circ \gamma(t))\big|_{t=x} = 0$ for all $x \in \mathbb{R}$ such that $p = \gamma(x)$.

If $M$ is $C^2$, $u \in C^2(M)$ and $\gamma$ is $C^2$, then $(d^2/dt^2)(u \circ \gamma(t))\big|_{t=x} \le 0$ for all $x \in \mathbb{R}$ such that $p = \gamma(x)$.
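Theorem 27.6.3 can be illustrated numerically. The sketch below assumes $M = \mathbb{R}^2$ with its usual atlas, a hypothetical function `u` with a maximum at the origin, and a hypothetical $C^2$ curve `gamma` passing through the origin at $t = 0$; none of these come from the text. The first and second derivatives of $u \circ \gamma$ at $t = 0$ are approximated by central differences.

```python
import math


def u(p):
    # Hypothetical C^2 function on IR^2 with a maximum at p = (0, 0).
    return -(p[0] ** 2 + 2.0 * p[1] ** 2)


def gamma(t):
    # Hypothetical C^2 curve with gamma(0) = (0, 0).
    return (math.sin(t), t + t ** 2)


def u_along_curve(t):
    # The composite u o gamma, a real function of a real variable.
    return u(gamma(t))


def first_derivative(g, x, h=1e-4):
    # Central difference approximation to g'(x).
    return (g(x + h) - g(x - h)) / (2.0 * h)


def second_derivative(g, x, h=1e-4):
    # Central difference approximation to g''(x).
    return (g(x + h) - 2.0 * g(x) + g(x - h)) / (h * h)


d1 = first_derivative(u_along_curve, 0.0)
d2 = second_derivative(u_along_curve, 0.0)
print(d1, d2)
```

For these choices $u \circ \gamma(t) = -3t^2 + O(t^3)$, so the first derivative at $0$ vanishes and the second derivative is $-6$, consistent with the theorem.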
[ Somewhere, but not in this section, give a definition of tangent vectors in terms of curves through a point. Call these “tangent curve classes”. (See Darling [13], section 7.2.1, page 147.) Then refer to this from Section 28.3. ]

[ Should define tangent vectors as Cauchy sequence classes somewhere. ]

27.6.4 Remark: The following table summarizes how curve and path topics have been distributed within this book.

section  topics
16.1     terminology for curves and paths                        paths are equivalence classes $[\gamma]_0$ of curves $\gamma$
16.2     curves                                                  definitions for curves in topological spaces
16.3     path-equivalence of curves                              definition of curves which have the same path
16.4     paths                                                   definitions for paths
17.5     rectifiable curves and paths                            almost-everywhere differentiable curves
24.2     pathwise topological parallelism                        parallelism $\Theta^\gamma_{s,t}$ for paths $\gamma$
27.6     differentiable curves and paths
27.11    rectifiable curves
29.8     vector fields along curves
30.8     higher-order vector fields for families of curves
31.4     differential of a curve                                 $d\gamma = \gamma'$
32.3     higher-order differentials of curves and families       $d^k\gamma$, $k \ge 1$; $\partial_{ijk\ldots}\gamma$
32.8     differentials of curves for higher-order operators
38.1     covariant derivatives of vector fields along curves     $D\gamma$; $D^k\gamma$, $k \ge 1$
38.2     geodesic curves                                         $D\gamma = 0$

27.7. Differentiable families of differentiable transformations

Differentiable families of diffeomorphisms are required for the analysis of connections on differentiable fibre bundles.

27.7.1 Definition: A $C^1$ one-parameter family of diffeomorphisms of a $C^1$ manifold $M$ is a map $\phi : \mathbb{R} \to C^1(M, M)$ such that
(i) $\forall t \in \mathbb{R}$, $\phi(t)$ is a diffeomorphism (automorphism) from $M$ to $M$,
(ii) $\phi \in C^1(\mathbb{R} \times M, M)$.

27.7.2 Definition: A $C^1$ one-parameter group of diffeomorphisms of a $C^1$ manifold $M$ is a map $\phi : \mathbb{R} \to C^1(M, M)$ such that
(i) $\forall t \in \mathbb{R}$, $\phi(t)$ is a diffeomorphism (automorphism) from $M$ to $M$,
(ii) $\forall s, t \in \mathbb{R}$, $\phi(s) \circ \phi(t) = \phi(s + t)$,
(iii) $\phi \in C^1(\mathbb{R} \times M, M)$.

27.7.3 Remark: The function $\phi : \mathbb{R} \to C^1(M, M)$, and its transpose $\bar\phi : \mathbb{R} \times M \to M$, defined by $\bar\phi(s, x) = \phi(s)(x)$ for $(s, x) \in \mathbb{R} \times M$, are regarded as interchangeable in Definition 27.7.2.

27.7.4 Definition: A local $C^1$ one-parameter group of local transformations of a $C^1$ manifold $M$ is a map $\phi : (a, b) \to C^1(\Omega, M)$ for some $a, b \in \mathbb{R}$ such that $a < 0 < b$, $\Omega$ is an open subset of $M$, and
(i) $\forall t \in (a, b)$, $\phi(t)$ is injective,
(ii) $\forall s, t \in (a, b)$, $(s + t \in (a, b)$ and $\phi(t) \in \Omega) \Rightarrow \phi(s) \circ \phi(t) = \phi(s + t)$,
(iii) $\phi \in C^1((a, b) \times \Omega, M)$.
[ The above definition derives from EDM [33], 108.L, and Gallot/Hulin/Lafontaine [19], pages 23–26. ]
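The following worked example is a sketch of Definitions 27.7.2 and 27.7.4 under assumptions which are not taken from the text: $M = \mathbb{R}$ with the identity chart, the scaling flow $e^t x$ for the global group, and the flow of $dx/dt = x^2$ for the local group. The condition $\phi(t) \in \Omega$ in Definition 27.7.4 (ii) is read here as $\phi(t)(x) \in \Omega$ for the point $x$ in question, which is an interpretation rather than something stated in the definition.

```latex
% Requires amsmath and amssymb.
% Illustrative assumption: M = \mathbb{R} with the identity chart.
% Global group (Definition 27.7.2): the scaling flow.
\[
  \phi(t)(x) = e^{t}x, \qquad
  \bigl(\phi(s)\circ\phi(t)\bigr)(x) = e^{s}e^{t}x = \phi(s+t)(x), \qquad
  \phi(t)^{-1} = \phi(-t).
\]
% Local group (Definition 27.7.4): (a,b) = (-1,1) and \Omega = (-1,1),
% so that 1 - tx > 0 on (a,b) \times \Omega.
\[
  \phi(t)(x) = \frac{x}{1-tx}, \qquad
  \bigl(\phi(s)\circ\phi(t)\bigr)(x)
    = \frac{x/(1-tx)}{1 - s\,x/(1-tx)}
    = \frac{x}{1-(s+t)x}
    = \phi(s+t)(x).
\]
% The second identity needs s+t \in (-1,1) and \phi(t)(x) \in \Omega.
% For instance \phi(0.9)(0.9) = 0.9/0.19 lies outside \Omega.
```

Each $\phi(t)$ in the second family is injective on $\Omega$ because $x \mapsto x/(1-tx)$ has derivative $1/(1-tx)^2 > 0$ there. The example suggests why the group law can only be required conditionally in the local case.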
27.8. Differentiable maps between differentiable manifolds

Differentiable maps include diffeomorphisms as a special case. A diffeomorphism is a differentiable map which is also a homeomorphism whose inverse is a differentiable map. Differentiable maps may be defined between manifolds of equal or unequal dimension.

Note that even though it is straightforward to define whether a map between manifolds is of class $C^r$, it is not so easy to state just what the $r$th derivative is. This contrasts with real functions of a real variable, where the derivative of a function resides in the same space as the function being differentiated. In differential geometry, derivatives frequently fall into a completely different space from the original function. Each order of derivative may require a separate space to be constructed for it. The derivative of a differentiable map, called a “differential”, is presented in Section 31.3. In this section, the question of what a differential is can be avoided by defining differentiability in terms of charts, which moves the question into flat space.

27.8.1 Definition: A $C^r$ (differentiable) map from a $C^r$ manifold $M_1$ to a $C^r$ manifold $M_2$ for $r \in \bar{\mathbb{Z}}^+_0$ is a map $\phi : M_1 \to M_2$ such that

$\forall \psi_1 \in A_{M_1}$, $\forall \psi_2 \in A_{M_2}$, $\psi_2 \circ \phi \circ \psi_1^{-1}$ is of class $C^r$.

[ See Malliavin [35], proposition I.1.5.3 for $C^0$ maps. ]

27.8.2 Remark: When the regularity class $C^r$ of a differentiable map in Definition 27.8.1 is not stated explicitly, some authors assume this to mean $C^1$ while others assume it means $C^\infty$. It is generally best to be explicit to avoid misunderstanding. The spaces and maps in Definition 27.8.1 are illustrated in Figure 27.8.1.

Figure 27.8.1: $C^r$ differentiable map “through the charts” (labels: $p \in M_1$, $\phi(p) \in M_2$, charts $\psi_1 : M_1 \to \mathbb{R}^{n_1}$ and $\psi_2 : M_2 \to \mathbb{R}^{n_2}$, composite $\psi_2 \circ \phi \circ \psi_1^{-1}$)
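Definition 27.8.1 can be checked by hand in simple cases. The following sketch assumes (none of this is from the text) that $M_1 = \mathbb{R}$ carries the single identity chart $\psi_1$, that $M_2$ is the unit circle $S^1$ with the two angle charts written below, and that $\phi$ wraps the line around the circle. The point of the example is that the composite through the charts is only defined on $\psi_1(\phi^{-1}(\mathrm{Dom}(\psi_2)))$, which here is a union of open intervals.

```latex
% Requires amsmath and amssymb.
% Assumptions: M_1 = \mathbb{R} with atlas \{\psi_1\}, \psi_1 = identity;
% M_2 = S^1 = \{(u,v) \in \mathbb{R}^2 : u^2+v^2 = 1\} with the two angle charts below.
\begin{align*}
  \psi_2^{-1}(\theta) &= (\cos\theta, \sin\theta), & \theta &\in (-\pi, \pi), \\
  \tilde\psi_2^{-1}(\theta) &= (\cos\theta, \sin\theta), & \theta &\in (0, 2\pi), \\
  \phi(t) &= (\cos t, \sin t), & t &\in \mathbb{R}.
\end{align*}
% Through the charts, for each integer m:
\begin{align*}
  (\psi_2\circ\phi\circ\psi_1^{-1})(t) &= t - 2\pi m, & t &\in ((2m-1)\pi, (2m+1)\pi), \\
  (\tilde\psi_2\circ\phi\circ\psi_1^{-1})(t) &= t - 2\pi m, & t &\in (2m\pi, (2m+2)\pi).
\end{align*}
```

Both composites are affine on each interval of their domains, so under these assumptions $\phi$ is of class $C^r$ for every $r$.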
27.8.3 Remark: The $C^r$ regularity of maps between manifolds is often defined in terms of real-valued test functions. This gives a correct test which is neat and tidy. In this test-function regularity definition, a map $\phi : M_1 \to M_2$ is said to be $C^r$ if $f \circ \phi$ is in $C^r(M_1)$ for all $f \in C^r(M_2)$.

This kind of definition has a few difficulties despite its formal neatness. The set of test functions $f$ in this definition is uncountably infinite, whereas for finite atlases, Definition 27.8.1 requires only a finite number of functions to be tested for $C^r$ regularity. Another difficulty is that in practice, testing $f \circ \phi$ for regularity requires the use of charts on each manifold, which implies that the test-function definition turns into Definition 27.8.1 anyway. So the apparent chart-free status of the test-function approach is actually an illusion. The equivalence of the definitions is demonstrated in Theorem 27.8.4.

The function $f \circ \phi$ in Theorem 27.8.4 is a kind of “pull-back” of $f$ from $M_2$ to $M_1$. This results in a “push-forth” of tangent vectors from $M_1$ to $M_2$ in Definitions 31.3.1 and 31.3.18. The function $\psi_2^k \circ \phi \circ \psi_1^{-1}$ in the proof of Theorem 27.8.4 could be thought of as a push-forth of the function $\phi$ from the manifold to the coordinate space via the charts, thereby making the function $\phi$ act on the coordinate space instead of the points of the manifold.
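27.8.4 Theorem: Let $M_1$ and $M_2$ be $C^r$ manifolds for some $r \in \bar{\mathbb{Z}}^+_0$. Then a function $\phi : M_1 \to M_2$ is of class $C^r$ if and only if

$\forall f \in C^r(M_2)$, $f \circ \phi \in C^r(M_1)$. (27.8.1)

Proof: Suppose that $\phi : M_1 \to M_2$ is of class $C^r$. Let $\psi_1 \in \mathrm{atlas}(M_1)$ and $\psi_2 \in \mathrm{atlas}(M_2)$. Then by Definition 27.8.1, $\psi_2 \circ \phi \circ \psi_1^{-1}$ is of class $C^r$. A function $f : M_2 \to \mathbb{R}$ is of class $C^r$ (by Definition 27.5.1) if and only if $f \circ \psi_2^{-1}$ is $C^r$ for all $\psi_2 \in A_{M_2}$. Let $f : M_2 \to \mathbb{R}$ be $C^r$. Then $f \circ \psi_2^{-1}$ is $C^r$. Therefore $(f \circ \psi_2^{-1}) \circ (\psi_2 \circ \phi \circ \psi_1^{-1})$ is $C^r$. But this equals $f \circ \phi \circ \psi_1^{-1}$ on its domain. Since $\psi_1$ and $\psi_2$ are arbitrary, it follows that $f \circ \phi$ is $C^r$.

To show the converse, suppose that $\phi : M_1 \to M_2$ satisfies line (27.8.1). The $k$th component of $\psi_2 \circ \phi \circ \psi_1^{-1}$ is $\psi_2^k \circ \phi \circ \psi_1^{-1}$. Define $f : \mathrm{Dom}(\psi_2) \to \mathbb{R}$ by $f : p \mapsto \psi_2^k(p)$. This is of class $C^\infty$ because $f \circ \psi_2^{-1} = \psi_2^k \circ \psi_2^{-1} : x \mapsto x^k$ for $x \in \mathrm{Range}(\psi_2)$. (Strictly speaking, $f$ is defined only on $\mathrm{Dom}(\psi_2)$ rather than on all of $M_2$, so line (27.8.1) should be applied to a $C^r$ function on $M_2$ which agrees with $\psi_2^k$ near the point of interest, such as one obtained with a cut-off function.) Therefore $f \circ \phi$ is $C^r$, which means that $f \circ \phi \circ \psi_1^{-1}$ is $C^r$ for $\psi_1 \in \mathrm{atlas}(M_1)$. So $\psi_2^k \circ \phi \circ \psi_1^{-1}$ is $C^r$ for $k = 1, \ldots, n_2$. Hence $\phi$ is of class $C^r$.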
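As a small illustration of the test-function criterion (27.8.1), the following sketch assumes $M_1 = \mathbb{R}$ and $M_2 = \mathbb{R}^2$ with identity charts and a hypothetical map $\phi$ with a corner. It shows that a single badly chosen test function can miss the corner, whereas a coordinate function detects it, in line with the role of $\psi_2^k$ in the proof.

```latex
% Requires amsmath and amssymb.
% Assumptions: M_1 = \mathbb{R}, M_2 = \mathbb{R}^2, identity charts,
% coordinates (y^1, y^2) on M_2.
\[
  \phi(t) = (t, |t|).
\]
% A test function which does not detect the corner:
\[
  f(y^1, y^2) = (y^2)^2, \qquad (f\circ\phi)(t) = t^2 \quad (C^\infty).
\]
% The coordinate function y^2 used as a test function:
\[
  f(y^1, y^2) = y^2, \qquad (f\circ\phi)(t) = |t| \quad (\text{not } C^1 \text{ at } t = 0).
\]
```

So in this example condition (27.8.1) fails for every $r \ge 1$ although it holds for $r = 0$, which agrees with the chart test of Definition 27.8.1 because the second component of $\psi_2 \circ \phi \circ \psi_1^{-1}$ is $|t|$.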
27.8.5 Remark: Figure 27.8.2 illustrates the sets and maps in Theorem 27.8.4.

Figure 27.8.2: $C^r$ differentiable map via test functions (labels: $p \in M_1$, $\phi(p) \in M_2$, $\phi : M_1 \to M_2$, $f : M_2 \to \mathbb{R}$, $f \circ \phi : M_1 \to \mathbb{R}$)

The proof of Theorem 27.8.4 exemplifies how attempts to do differential geometry in a coordinate-free fashion really only hide the coordinates. To be precise, whenever one invokes the space of $C^r$ functions $f : M \to \mathbb{R}$ as a test space on which to work “coordinate-free”, the real-valued functions $f$ are themselves coordinates. There is very little difference indeed between the individual coordinates $\psi^k$ of a $C^r$ chart $\psi$ and a $C^r$ real-valued function.

This thinking may be applied similarly to tangent operators as defined in Section 28.5. These are defined on $C^1$ test functions $f : M \to \mathbb{R}$, but this is equivalent to defining operators on chart coordinates $\psi^k : M \to \mathbb{R}$. In fact, a tangent operator (in Definition 28.5.1) of the form $\partial_{p,v,\psi}$ acting on a function $f = \psi^k$ yields $\partial_{p,v,\psi}(f) = v^k$. This shows again the equivalence of chart coordinates and test functions.

27.8.6 Notation: For $r \in \bar{\mathbb{Z}}^+_0$, denote by $C^r(M_1, M_2)$ the set of all $C^r$ maps from a $C^r$ manifold $M_1$ to a $C^r$ manifold $M_2$.

27.8.7 Notation: Denote by $\mathring{C}^r(M_1, M_2)$ the set of all $C^r$ maps from open sets $\Omega \subseteq M_1$ to $M_2$.

27.8.8 Definition: For $r \in \bar{\mathbb{Z}}^+_0$, a $C^r$ diffeomorphism from $M_1$ to $M_2$ is a homeomorphism $\phi : M_1 \approx M_2$ such that both $\phi$ and $\phi^{-1}$ are $C^r$ differentiable maps.
The $C^r$ manifolds $M_1$ and $M_2$ are said to be $C^r$-diffeomorphic if there exists a $C^r$ diffeomorphism from $M_1$ to $M_2$.

27.8.9 Remark: If the regularity class $C^r$ is not specified, $r = \infty$ is often assumed. Thus two differentiable manifolds are often said to be diffeomorphic when they are both $C^\infty$ manifolds and are $C^\infty$-diffeomorphic. It is preferable to state the regularity class of a diffeomorphism explicitly.

The flat-space version of Definition 27.8.8 is Definition 19.1.1.
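The requirement in Definition 27.8.8 that the inverse also be $C^r$ is not automatic. The following sketch uses two hypothetical manifold structures on the set $\mathbb{R}$ (the names $\iota$, $\chi$ and the chart $\psi$ are assumptions made for this illustration): the identity map is a $C^\infty$ homeomorphism whose inverse is not $C^1$, and yet the two manifolds are $C^\infty$-diffeomorphic by a different map.

```latex
% Requires amsmath and amssymb.
% Assumptions: M_1 = (\mathbb{R}, \{\mathrm{id}\}), M_2 = (\mathbb{R}, \{\psi\}), \psi(x) = x^3.
% The identity map \iota : M_1 \to M_2 through the charts, and its inverse:
\[
  (\psi\circ\iota\circ\mathrm{id}^{-1})(x) = x^3 \quad (C^\infty), \qquad
  (\mathrm{id}\circ\iota^{-1}\circ\psi^{-1})(u) = u^{1/3} \quad (\text{not } C^1 \text{ at } u = 0).
\]
% The real cube-root map \chi : M_1 \to M_2, \chi(x) = x^{1/3}, and its inverse:
\[
  (\psi\circ\chi\circ\mathrm{id}^{-1})(x) = (x^{1/3})^3 = x, \qquad
  (\mathrm{id}\circ\chi^{-1}\circ\psi^{-1})(u) = (u^{1/3})^3 = u.
\]
```

Thus $\iota$ is not a $C^1$ diffeomorphism from $M_1$ to $M_2$, while $\chi$ is a $C^\infty$ diffeomorphism because both composites are the identity on $\mathbb{R}$.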
27.9. Analytic manifolds

[ This section may be combined with Section 27.2 some day using a variable regularity class. ]

Analytic manifolds have no special properties that are useful in this book, except that they are used in defining Lie groups in Chapter 34. The analytic manifold definitions are the same as the $C^\infty$ definitions except for a simple text substitution.

27.9.1 Definition: An analytic atlas for an $n$-dimensional topological manifold $M$ is an atlas $S$ for $M$ such that $\psi_2 \circ \psi_1^{-1}$ is analytic for all $\psi_1, \psi_2 \in S$.

27.9.2 Definition: An $n$-dimensional analytic manifold is a pair $(M, S)$ such that $M$ is an $n$-dimensional topological manifold and $S$ is an analytic atlas for $M$. Then $M$ is called the underlying topological space of $(M, S)$.

27.9.3 Definition: An analytic chart for an analytic manifold $(M, S)$ is a chart $\psi$ for $M$ such that $S \cup \{\psi\}$ is also an analytic atlas for $M$.

27.9.4 Definition: Analytic equivalent atlases on a topological manifold $M$ are analytic atlases $A_1$ and $A_2$ on $M$ such that $A_1 \cup A_2$ is an analytic atlas on $M$. Then $(M, A_1)$ and $(M, A_2)$ are said to be analytic equivalent manifolds. (Clearly $(M, A_1 \cup A_2)$ is then also analytic equivalent to both $(M, A_1)$ and $(M, A_2)$.)

27.9.5 Definition: A compact analytic manifold is an analytic manifold $(M, A_M)$ whose underlying topological space $M$ is compact.

27.9.6 Definition: A paracompact analytic manifold is an analytic manifold $(M, A_M)$ whose underlying topological space $M$ is paracompact.

27.9.7 Remark: It seems that all analytic manifolds should be paracompact. If so, Definition 27.9.6 would be superfluous. [ Check this. ]
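As a sketch of how Definition 27.9.1 is verified in practice, the following example assumes the unit circle with its two stereographic projection charts (the names $\psi_N$, $\psi_S$, $N$ and $S$ are chosen for this illustration and do not appear in the text). The only non-trivial transition map turns out to be $x \mapsto 1/x$, whose analyticity is visible from an explicit power series.

```latex
% Requires amsmath and amssymb.
% Assumptions: M = S^1 = \{(u,v) : u^2+v^2 = 1\}, N = (0,1), S = (0,-1).
\[
  \psi_N(u,v) = \frac{u}{1-v} \ \text{on } S^1\setminus\{N\}, \qquad
  \psi_S(u,v) = \frac{u}{1+v} \ \text{on } S^1\setminus\{S\}.
\]
% On the overlap S^1 \setminus \{N,S\} one has u \ne 0 and
\[
  \psi_N(u,v)\,\psi_S(u,v) = \frac{u^2}{1-v^2} = 1,
  \qquad\text{so}\qquad
  (\psi_S\circ\psi_N^{-1})(x) = \frac{1}{x}, \quad x \in \mathbb{R}\setminus\{0\}.
\]
% 1/x is analytic on \mathbb{R}\setminus\{0\}: for a \ne 0 and |x-a| < |a|,
\[
  \frac{1}{x} = \sum_{n=0}^{\infty} \frac{(-1)^n}{a^{n+1}}\,(x-a)^n .
\]
```

By symmetry $\psi_N \circ \psi_S^{-1}$ is also $x \mapsto 1/x$, so under these assumptions $\{\psi_N, \psi_S\}$ is an analytic atlas for the circle.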
[ Note that a differentiable manifold with only one chart in its atlas is automatically analytic. ]

27.10. Unidirectionally differentiable manifolds

27.10.1 Remark: For many purposes, including analysis on the graphs of solutions of boundary value problems, it would be valuable to be able to define various generalizations of differentiable manifolds with weaker regularity than $C^1$. For example, Lipschitz and Hölder continuity would be useful, as would various Sobolev spaces and distribution spaces.

In the case of Lipschitz continuity, it is well known that the first derivative exists almost everywhere. This could be translated into the intrinsic manifold context without great difficulty. In the case of Lipschitz manifolds, instead of a tangent bundle with copies of $\mathbb{R}^n$ attached to each point in the manifold, there would be a tangent cone at each point. This cone would be a linear tangent space almost everywhere.

Lipschitz manifolds are useful for defining rectifiable curves, which are the natural kind of curve for defining parallelism. This subject is presented in Section 27.11.

For the subject of second-order boundary value problems, the case of $C^{k,\alpha}$ manifolds for general $k$ would be more useful than the $k = 0$ case. Then the tangent bundle would be the same as for $C^1$ manifolds, but higher-order structures would require generalizations.

[ It may be that for Lipschitz surfaces, the triples $(p, v, \psi)$ could be easier to generalize than the tangent operators $\partial_{p,v,\psi}$. Attempt here to make these generalizations. ]

27.10.2 Remark: Definitions 27.10.3 and 27.10.4 are the unidirectional analogues of the $C^r$ manifold Definitions 27.2.2 and 27.2.6.

27.10.3 Definition: A unidirectionally differentiable atlas for a topological manifold $M$ −< $(M, T_M)$ is a topological atlas $A_M$ for $(M, T_M)$ such that $\psi_2 \circ \psi_1^{-1} : \psi_1(U_1 \cap U_2) \to \psi_2(U_1 \cap U_2)$ is unidirectionally differentiable for all $\psi_1, \psi_2 \in A_M$, where $U_\alpha = \mathrm{Dom}(\psi_\alpha)$ for $\alpha = 1, 2$.

27.10.4 Definition: For $n \in \mathbb{Z}^+$, an $n$-dimensional unidirectionally differentiable manifold is a pair $(M, A_M)$ such that $M$ is a set and $A_M$ is a unidirectionally differentiable manifold atlas for $M$.
27.11. Lipschitz manifolds and rectifiable curves

[ This section has been taking up too much time. The author will continue work on it later. Please ignore it for now, because it is not yet ready to read. ]

[ Logically speaking, this section probably belongs at the beginning of Chapter 27. But weak regularity is more difficult to understand than $C^k$ regularity. That’s why it’s here. ]

Rectifiable curves are a minimum requirement for defining parallelism and connections (and therefore covariant derivatives and curvature) on a manifold, and a Lipschitz atlas is a minimum requirement for defining rectifiable curves. Therefore Lipschitz manifolds are a minimum requirement for much of differential geometry. This is the motivation for this section.

27.11.1 Remark: Example 27.11.2 shows that it is not possible to define rectifiability of curves in topological manifolds in terms of the curve map “through the charts”, because even if a curve is rectifiable with respect to one chart, there will certainly be other charts for which rectifiability does not hold.

[ If a topological manifold is metrizable, rectifiability is well-defined for each choice of metric, but the lengths of curves will depend on the choice of metric, and the rectifiability property for each curve may depend on the choice of metric. ]
27.11.2 Example: Define a 2-dimensional topological manifold $(M, T_M)$ by $M = \mathbb{R}^2$ with the usual topology $T_M$ on $\mathbb{R}^2$. For any continuous function $h : \mathbb{R} \to \mathbb{R}$ define the map $\psi_h : M \to \mathbb{R}^2$ by $\psi_h : (x_1, x_2) \mapsto (x_1, x_2 + h(x_1))$. Then $\psi_h$ is a continuous map for any continuous $h$, and the inverse of $\psi_h$ is $\psi_{-h}$, which is also continuous. Therefore $\psi_h$ is a homeomorphism and so a valid continuous chart for $(M, T_M)$ for any continuous $h : \mathbb{R} \to \mathbb{R}$. For the zero function $h = 0$, the map $\psi_h = \psi_0$ is the identity map on $\mathbb{R}^2$. This is illustrated in Figure 27.11.1.

Figure 27.11.1: Chart-dependence of rectifiability of a curve (labels: interval $I$, curve $\gamma$ with image $\gamma(I) \subseteq M$, charts $\psi_0$ and $\psi_h$ from $M$ to $\mathbb{R}^2$)

Now define a curve $\gamma : I \to M$ by $\gamma : x \mapsto (x, 0)$ for some interval $I \subseteq \mathbb{R}$. Consider the map “through the charts” defined by $\psi_h \circ \gamma : I \to \mathbb{R}^2$. This must be continuous because $\psi_h$ and $\gamma$ are continuous. It is clear that when $h = 0$, the map $\psi_0 \circ \gamma = \gamma$ is $C^\infty$ and certainly a Lipschitz function. But if $h$ is chosen to be a continuous-everywhere, differentiable-nowhere function, then $\psi_h \circ \gamma$ is continuous but not Lipschitz continuous. (See Example 18.2.16 for a nowhere-differentiable function.)

27.11.3 Remark: One could define Lipschitz manifolds whose transition functions have the corresponding regularity. On such manifolds, rectifiable curves and sets would be well-defined.

Rectifiable curves and paths are defined for metric spaces in Section 17.5. Manifold charts induce a chart-dependent metric structure on a manifold. Lengths of curves cannot be defined in a chart-independent manner in the absence of a true metric. But the rectifiability property can be defined in a chart-independent manner if the manifold is a Lipschitz manifold.

[ Maybe should define locally Lipschitz manifolds instead of Lipschitz. ]

[ Define a locally Lipschitz atlas. ]
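The chart-dependence in Example 27.11.2 can be made quantitative. The following sketch assumes that $I = [a, b]$ is compact and that rectifiability means boundedness of the polygonal sums over all partitions $a = t_0 < t_1 < \cdots < t_m = b$, as is usual for curves in metric spaces. The specific function $h(x) = x \sin(1/x)$ is a hypothetical choice which is not taken from the text; it shows that nowhere-differentiability is more than is needed to destroy rectifiability.

```latex
% Requires amsmath and amssymb.
% Assumptions: I = [a,b], a = t_0 < t_1 < \dots < t_m = b, and rectifiability means
% that the polygonal sums below are bounded over all partitions.
\[
  (\psi_h\circ\gamma)(x) = (x, h(x)),
\]
\[
  \sum_{i=1}^{m} |h(t_i)-h(t_{i-1})|
  \;\le\; \sum_{i=1}^{m} \bigl|(\psi_h\circ\gamma)(t_i) - (\psi_h\circ\gamma)(t_{i-1})\bigr|
  \;\le\; (b-a) + \sum_{i=1}^{m} |h(t_i)-h(t_{i-1})| .
\]
% Hence \psi_h\circ\gamma is rectifiable if and only if h has bounded variation on I.
% Hypothetical choice on I = [0,1]: h(x) = x\sin(1/x), h(0) = 0, x_n = 1/((n+\tfrac12)\pi):
\[
  h(x_n) = (-1)^n x_n, \qquad
  \sum_{n=1}^{N} |h(x_n)-h(x_{n+1})| = \sum_{n=1}^{N} (x_n + x_{n+1})
  \;\ge\; \sum_{n=1}^{N} \frac{2}{(n+\tfrac32)\pi} \;\to\; \infty .
\]
```

With this $h$, the curve $\gamma$ is rectifiable through the chart $\psi_0$ but not through the chart $\psi_h$, even though $h$ is differentiable everywhere except at $0$.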
27.11.4 Definition: A Lipschitz (continuous) manifold is a topological manifold equipped with a locally Lipschitz atlas.

[ In Definition 27.11.5, shouldn’t need to restrict $\mathrm{Dom}(\psi)$ to a smaller neighbourhood? ]

27.11.5 Definition: A rectifiable compact-domain curve in a Lipschitz manifold $(M, A_M)$ is a compact-domain curve $\gamma : I \to M$ such that for all $t \in I$, for some Lipschitz chart $\psi \in A_{M,\gamma(t)}$, for some $\delta > 0$, $\gamma(B_{t,\delta}) \subseteq \mathrm{Dom}(\psi)$ and the map $\psi \circ \gamma|_{B_{t,\delta}} : I \cap B_{t,\delta} \to \mathbb{R}^n$ is a rectifiable curve in $\mathbb{R}^n$, where $n = \dim(M)$.

[ In Definition 27.11.5, by Lebesgue covering lemma, can always take a finite number of intervals $K$ to cover $I$. ]

27.11.6 Remark: Since the parameter interval $I$ in Definition 27.11.5 is compact, the cover of $I$ by open balls $B_{t,\delta} \subseteq \mathbb{R}$ may be replaced by a finite cover.

For an open set $U \subseteq M$, the set $\gamma^{-1}(U)$ is relatively open in $I$, but is not generally an interval. In general, $\gamma^{-1}(U)$ is a countable disjoint union of relatively open intervals.
27.11.7 Remark: Theorem 27.11.8 implies that a curve in a Lipschitz manifold $(M, A_M)$ may be tested for rectifiability by breaking up the domain of the curve into a finite number of compact subintervals, each of whose images fits within the domain of at least one chart in the atlas. Most importantly, the test gives the same result independent of the choice of atlas, as long as the atlases are Lipschitz equivalent.

[ Maybe instead of Theorem 27.11.8 should just state that Definition 27.11.5 is independent of the choice of Lipschitz atlas for Lipschitz equivalent atlases. ]

27.11.8 Theorem: For any rectifiable compact-domain curve $\gamma : I \to M$ in a Lipschitz manifold $(M, A_M)$ with $n = \dim(M)$, there is a finite family $(K_j)_{j=1}^m$ of compact subsets of $I$ such that $I = \bigcup_{j=1}^m K_j$ and for all $j = 1, \ldots, m$, for some $\psi_j \in A_M$, $\gamma(K_j) \subseteq \mathrm{Dom}(\psi_j)$, and the map $\psi_j \circ \gamma|_{K_j} : K_j \to \mathbb{R}^n$ is a rectifiable curve.
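The chart-independence claimed in Remark 27.11.7 rests on a standard estimate for the composition of a curve with a Lipschitz map. The following is a sketch under stated assumptions: the symbols $c$, $F$, $L$ and $\ell$ are introduced here for illustration and are not notation from the text, and it uses the standard fact that a locally Lipschitz map on an open subset of $\mathbb{R}^n$ is Lipschitz on each compact subset.

```latex
% Requires amsmath and amssymb.
% Assumptions: c : K \to \mathbb{R}^n is a curve on a compact interval K,
% F is Lipschitz on c(K) with constant L (all names are illustrative),
% and \ell(\cdot) denotes the supremum of polygonal sums over partitions
% t_0 < t_1 < \dots < t_m of K.
\[
  \sum_{i=1}^{m} \bigl|F(c(t_i)) - F(c(t_{i-1}))\bigr|
  \;\le\; L \sum_{i=1}^{m} \bigl|c(t_i) - c(t_{i-1})\bigr|
  \qquad\Longrightarrow\qquad
  \ell(F\circ c) \le L\,\ell(c).
\]
% Applied with c = \psi_j\circ\gamma|_{K_j} and F = \psi'\circ\psi_j^{-1}
% for a second chart \psi' whose domain contains \gamma(K_j):
\[
  \ell(\psi'\circ\gamma|_{K_j}) \le L\,\ell(\psi_j\circ\gamma|_{K_j}) < \infty .
\]
```

So rectifiability through one chart implies rectifiability through any other chart of a Lipschitz equivalent atlas, although the lengths themselves may differ by the factor $L$.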
[ The following proof is work in progress. It isn’t ready to be read yet. It’s actually the proof of a different (deleted) theorem! ]

Proof: Let $\psi' : U' \to \mathbb{R}^n$ be a chart for $(M, T_M)$. Let $K' \subseteq I$ be a compact interval such that $\gamma(K') \subseteq U'$. It must be shown that $\psi' \circ \gamma|_{K'} : K' \to \mathbb{R}^n$ is a rectifiable curve in $\mathbb{R}^n$.

Let $A_M = \{\psi_j ; j \in J\}$ be the atlas in the statement of Theorem 27.11.8, and let $U_j = \mathrm{Dom}(\psi_j)$ for $j \in J$. Then $\{\gamma^{-1}(U_j) ; j \in J\}$ is an open cover of $K'$. Since $K'$ is a compact subset of $I$, the Lebesgue covering lemma implies that $K'$ may be divided into a finite number of compact subintervals $K_\ell$ such that each $K_\ell$ is included entirely in one set $\gamma^{-1}(U_j)$. By the property of the atlas $A_M$, $\psi_j \circ (\gamma|_{K_\ell})$ is a rectifiable curve in $\mathbb{R}^n$. Since $\psi_j(\gamma(K_\ell))$ is compact, it follows that $(\psi' \circ \psi_j^{-1})(\psi_j(\gamma(K_\ell))) = \psi'(\gamma(K_\ell))$ is compact and $\psi' \circ (\gamma|_{K_\ell})$ is a rectifiable curve in $\mathbb{R}^n$. Therefore the property is satisfied for the chart $\psi'$. It follows that the property holds for all atlases for $M$.

[ Maybe, for the above proof, may need a theorem that the concatenation of rectifiable curves is rectifiable? ]

[ Is rectifiability only well-defined for compact-domain paths? Is local rectifiability well-defined for all paths? Could use compact subsets of the parameter interval for this. ]

27.12. Differentiable fibrations

This section used to be in Chapter 35, but the concepts of differentiable fibrations are used too frequently in Chapters 28 to 33 to be deferred until then. However, all other definitions of differentiable fibre bundles remain in Chapter 35 because they require Lie groups, which are not defined until Chapter 34.

As illustrated in Figure 23.1.1, there are broadly three species of topological fibre bundles. Differentiable fibre bundles have the same three species: fibrations, ordinary fibre bundles, and principal fibre bundles. (Recall that fibrations are groupless fibre bundles.)

Topological fibrations with an intrinsic fibre space were defined in Section 23.2. As mentioned in Remark 23.2.4, a topological fibration defines how the fibre sets at each point of the base space are “glued
together”. In the case of differentiable fibrations, the fibre sets are glued together in a differentiable fashion.

Definition 27.12.1 is a differentiable version of the topological fibration with intrinsic fibre space in Definition 23.2.1. The topologies $T_E$ and $T_B$ are replaced with manifold atlases $A_E$ and $A_B$.

The differentiable fibration in Definition 27.12.1 may seem to be just a differentiable map between manifolds which happens to satisfy the special conditions (iii) and (iv). But the purpose of the map is to partition the total space $E$ into fibres $\pi^{-1}(\{b\})$ at each of the points $b$ of the base space $B$, and conditions (iii) and (iv) constrain how the pointwise fibres are related to each other.

27.12.1 Definition: A $C^k$ (differentiable) fibration with intrinsic fibre space for $k \in \bar{\mathbb{Z}}^+_0$ is a tuple $(E, \pi, B)$ −< $(E, A_E, \pi, B, A_B)$ such that
(i) $E$ −< $(E, A_E)$ and $B$ −< $(B, A_B)$ are $C^k$ manifolds,
(ii) $\pi : E \to B$ is of class $C^k$,
(iii) $\forall b \in B$, $\exists U \in \mathrm{Top}_b(B)$, $\exists \phi : \pi^{-1}(U) \to \pi^{-1}(\{b\})$, $\pi \times \phi : \pi^{-1}(U) \to U \times \pi^{-1}(\{b\})$ is a $C^k$ diffeomorphism,
(iv) $\forall b_1, b_2 \in B$, $\pi^{-1}(\{b_1\})$ is $C^k$-diffeomorphic to $\pi^{-1}(\{b_2\})$.
$E$ is called the total space of $(E, \pi, B)$. $\pi$ is called the projection map of $(E, \pi, B)$. $B$ is called the base space of $(E, \pi, B)$. For any $b \in B$, the set $\pi^{-1}(\{b\})$ is called the fibre set of $(E, \pi, B)$ at $b$.
If the regularity class $C^k$ is not stated, it is assumed to be $C^1$.

27.12.2 Remark: If the structure group of a fibre bundle is not specified (as in the groupless fibre bundles in Definitions 27.12.1 and 27.12.3), it seems reasonable that the group should be defined implicitly as the largest group that makes sense for the given class of fibre bundle. In the case of topological fibre bundles in Section 23.3, this implicit group is the group of topological automorphisms of the fibre space. Thus if the $C^k$ differentiable fibre bundle in Definition 35.2.2 had no explicitly defined group $G$, the implied structure group should be the group of all $C^k$ diffeomorphisms of the fibre space $F$.

One difficulty with this idea is the fact that the group of all diffeomorphisms of a given differentiable manifold is generally not a finite-dimensional differentiable manifold. The diffeomorphism group is usually infinite-dimensional. By contrast, the topological automorphism group of a fibre space can be given a reasonable natural topology.

27.12.3 Definition: A $C^k$ (differentiable) fibration with fibre space $F$ for a $C^k$ differentiable manifold $F$ −< $(F, A_F)$ for $k \in \bar{\mathbb{Z}}^+_0$ is a tuple $(E, \pi, B)$ −< $(E, A_E, \pi, B, A_B)$ such that
(i) $E$ −< $(E, A_E)$ and $B$ −< $(B, A_B)$ are $C^k$ manifolds,
(ii) $\pi : E \to B$ is a $C^k$ map,
(iii) $\forall b \in B$, $\exists U \in \mathrm{Top}_b(B)$, $\exists \phi : \pi^{-1}(U) \to F$, $\pi \times \phi : \pi^{-1}(U) \to U \times F$ is a $C^k$ diffeomorphism.
If the regularity class $C^k$ is not stated, it is assumed to be $C^1$.

27.12.4 Remark: The fibre charts $\phi$ in Definition 27.12.3 are automatically pointwise $C^k$-consistent on their overlaps. So it is not necessary to require regularity of the transition maps in this definition. In fact, since $(\pi \times \phi_1) \circ (\pi \times \phi_2)^{-1} : (U_{\phi_1} \cap U_{\phi_2}) \times F \approx (U_{\phi_1} \cap U_{\phi_2}) \times F$ is a $C^k$ diffeomorphism which maps $(b, y)$ to $(b, g_{\phi_1,\phi_2}(b)(y))$ for some function $g_{\phi_1,\phi_2} : U_{\phi_1} \cap U_{\phi_2} \to C^k(F, F)$, it follows that $g_{\phi_1,\phi_2}$ is of class $C^k$ in the weak sense that $g_{\phi_1,\phi_2}(\cdot)(y) : b \mapsto g_{\phi_1,\phi_2}(b)(y) \in F$ is a $C^k$ map for each fixed $y \in F$. Without a $C^k$ structure on $C^k(F, F)$, it is not possible to assert differentiability in the stronger sense of $g_{\phi_1,\phi_2}$ being itself of class $C^k$.

A $C^0$ fibration is equivalent to a topological fibration except that the topologies on the spaces are specified via charts rather than sets of open sets.

An advantage of the absence of a fibre atlas in Definition 27.12.3 is the fact that a $C^k$ fibration $(E, \pi, B)$ for a fibre space $F$ is also a $C^k$ fibration for any $C^k$ manifold $F'$ which is diffeomorphic to $F$. This is as one would intuitively expect. Definition 27.12.7, on the other hand, specifies a particular atlas for a particular fibre space $F$. It is straightforward to define equivalence relations among atlases so that this dependency on a particular space $F$ is removed.
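A minimal worked example of Definition 27.12.3 and of the transition function in Remark 27.12.4 is sketched below. It assumes the trivial fibration of the plane over the line; the two fibre charts $\phi_1$ and $\phi_2$ are hypothetical choices made only for this illustration.

```latex
% Requires amsmath and amssymb.
% Assumptions: B = F = \mathbb{R} with identity charts, E = B \times F = \mathbb{R}^2.
\[
  \pi(b,y) = b, \qquad \phi_1(b,y) = y, \qquad (\pi\times\phi_1)(b,y) = (b,y).
\]
% A second (hypothetical) fibre chart on the same open set U = B:
\[
  \phi_2(b,y) = e^{b}y, \qquad (\pi\times\phi_2)^{-1}(b,y) = (b, e^{-b}y).
\]
% Transition map and the function g of Remark 27.12.4:
\[
  \bigl((\pi\times\phi_1)\circ(\pi\times\phi_2)^{-1}\bigr)(b,y) = (b, e^{-b}y),
  \qquad g_{\phi_1,\phi_2}(b)(y) = e^{-b}y .
\]
```

Here $\pi \times \phi_1$ is the identity and $\pi \times \phi_2$ is a $C^\infty$ diffeomorphism of $\mathbb{R}^2$, so both charts satisfy condition (iii), and $b \mapsto g_{\phi_1,\phi_2}(b)(y)$ is $C^\infty$ for each fixed $y$, which is the weak sense of regularity described in the remark.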
27.12.5 Definition: A $C^k$ (differentiable) fibre chart for $k \in \bar{\mathbb{Z}}^+_0$ for a $C^k$ fibration $(E, \pi, B)$ with fibre space $F$ is any function $\phi : \pi^{-1}(U) \to F$ such that $\pi \times \phi : \pi^{-1}(U) \to U \times F$ is a $C^k$ diffeomorphism for some $U \in \mathrm{Top}(B)$.

27.12.6 Definition: A $C^k$ fibre atlas for $k \in \bar{\mathbb{Z}}^+_0$ for a fibre space $F$ for a $C^k$ fibration $(E, \pi, B)$ is a set $A^F_E$ of $C^k$ fibre charts for fibre space $F$ for $(E, \pi, B)$ such that $\bigcup_{\phi \in A^F_E} \mathrm{Dom}(\phi) = E$.
An indexed $C^k$ fibre atlas for $k \in \bar{\mathbb{Z}}^+_0$ for fibre space $F$ for a $C^k$ fibration $(E, \pi, B)$ is a family $(\phi_i)_{i \in I}$ of $C^k$ fibre charts for fibre space $F$ for $(E, \pi, B)$ such that $\bigcup_{i \in I} \mathrm{Dom}(\phi_i) = E$. (That is, the range $\{\phi_i ; i \in I\}$ of the family is a $C^k$ fibre atlas for $F$.)

27.12.7 Definition: A $C^k$ (differentiable) fibration with a fibre atlas for the fibre space $F$ for $k \in \bar{\mathbb{Z}}^+_0$ for a $C^k$ manifold $F$ is a tuple $(E, \pi, B)$ −< $(E, A_E, \pi, B, A_B, A^F_E)$ such that
(i) $E$ −< $(E, A_E)$ and $B$ −< $(B, A_B)$ are $C^k$ manifolds and $\pi : E \to B$ is $C^k$,
(ii) $\forall \phi \in A^F_E$, $\exists U_\phi \in T_B$, $\pi \times \phi : \pi^{-1}(U_\phi) \to U_\phi \times F$ is a $C^k$ diffeomorphism,
(iii) $\bigcup_{\phi \in A^F_E} U_\phi = B$.
27.12.8 Definition: The horizontal component of a vector $W \in T(E)$ for a $C^1$ fibration $(E, \pi, M)$ is the vector $\pi_*(W) \in T(M)$.

27.12.9 Definition: A vertical vector on the total space $E$ of a $C^1$ fibration $(E, \pi, M)$ is a vector $W \in T(E)$ whose horizontal component is zero.
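Definitions 27.12.8 and 27.12.9 can be illustrated in coordinates. The sketch below assumes $E = \mathbb{R}^2$ and $M = \mathbb{R}$ with identity charts and writes a tangent vector at $z = (x, y)$ in components as $W = a\,\partial_x + b\,\partial_y$; this component notation and both projection maps are assumptions made for the illustration, since tangent vectors are only constructed in Chapter 28.

```latex
% Requires amsmath and amssymb.
% Assumptions: E = \mathbb{R}^2 and M = \mathbb{R} with identity charts,
% W = a\,\partial_x + b\,\partial_y at z = (x,y) (component notation is illustrative).
% Linear projection \pi(x,y) = x:
\[
  \pi_*(W) = a\,\partial_x\big|_{\pi(z)}, \qquad W \text{ vertical} \iff a = 0 .
\]
% Curved fibres, \pi(x,y) = x + y^2 (the fibres are the parabolas x + y^2 = \text{const}):
\[
  (d\pi)_z = \begin{pmatrix} 1 & 2y \end{pmatrix}, \qquad
  \pi_*(W) = (a + 2yb)\,\partial_x\big|_{\pi(z)}, \qquad
  W \text{ vertical} \iff a + 2yb = 0 .
\]
```

In the second case the vertical vectors at $z$ are the multiples of $(-2y, 1)$, which is tangent to the fibre through $z$. Both maps are fibrations with fibre space $\mathbb{R}$ because $(x, y) \mapsto (\pi(x, y), y)$ is a diffeomorphism of $\mathbb{R}^2$.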
27.12.10 Remark: The set $\ker((d\pi)_z) = \{W \in T_z(E) ; (d\pi)_z(W) = 0\}$ of vertical vectors in Definition 27.12.9 for a fixed point $z \in E$ is a subspace of $T_z(E)$, namely the kernel of the pointwise differential map $(d\pi)_z : T_z(E) \to T_{\pi(z)}(M)$. The set $\bigcup_{z \in E_p} \ker(d\pi)_z$ for a fixed $p \in M$ is not a linear space, but it is essentially the same as the total tangent space of $E_p$. The set $\bigcup_{z \in E} \ker(d\pi)_z$ of all vertical vectors on a fibration $(E, \pi, M)$ is also not a linear space, but it is useful in the case of tangent fibrations (see Section 28.8) because the vertical vectors may be identified with the space $T(M)$ via a “drop” function.

27.12.11 Remark: For a very good reason, there is no definition of vertical components or horizontal vectors for general differentiable fibre bundles. This is because these concepts are not meaningful (i.e. chart-independent) in the absence of a connection. It is the purpose of connections in Chapter 36 to give these concepts meaning. Very limited definitions of vertical components and horizontality are possible in the context of Lie derivatives. (See Section 33.4 for Lie derivatives.)

27.12.12 Remark: Cross-sections of topological fibrations are defined in Definition 23.3.8. Cross-sections of differentiable fibre bundles are the same except that differentiability is required. Definition 27.12.13 and Notation 27.12.14 apply also to $C^k$ fibre bundles. In general, any definition for a fibration applies to the underlying fibration of a fibre bundle.

27.12.13 Definition: A $C^k$ cross-section of a $C^k$ fibration $(E, \pi, B)$ for $k \in \bar{\mathbb{Z}}^+_0$ is a $C^k$ map $X : B \to E$ such that $\pi \circ X = \mathrm{id}_B$.

27.12.14 Notation: $X^k(E, \pi, B)$ for $k \in \bar{\mathbb{Z}}^+_0$ and a $C^k$ fibration $(E, \pi, B)$ denotes the set of all $C^k$ cross-sections of $(E, \pi, B)$.

27.13. Tangent space building principles

There is a bewildering array of mechanisms for building spaces from a differentiable manifold. These are presented in Chapters 28 to 33. These spaces may be grouped together under the general title “tangent spaces”. The following is a summary of these building principles.

[ The following table is not finished yet. Some inaccuracies need to be fixed, and there should be references to the sections and subsections where the methods are defined. Should also indicate which kinds of differentials live in each kind of construction. Are there meaningful constructions such as $T^{r,s}(M_1, M_2)$ or $P^r(M_1, M_2)$? Maybe could even include concepts like direct product manifolds near here, and quotient manifolds? But these are not really tangent space constructions. ]
(1) Tangent vectors. Input: $C^k$ manifold. Output: Vector space or $C^{k-1}$ vector bundle.
For any $C^1$ manifold $M$, construct the pointwise tangent vector space $T_p(M)$ for all $p \in M$ and the total tangent space $T(M)$.

(2) Tangent operators. Input: $C^k$ manifold. Output: Vector space or set.
For any $C^1$ manifold $M$, construct the pointwise tangent operator space $\mathring{T}_p(M)$ for all $p \in M$ and the set of tangent operators $\mathring{T}(M)$. (The set $\mathring{T}(M)$ is not given structure such as an atlas or topology.)

(3) Tagged tangent operators. Input: $C^k$ manifold. Output: Vector space or $C^{k-1}$ vector bundle.
For any $C^1$ manifold $M$, construct the pointwise tagged tangent operator space $\hat{T}_p(M)$ for all $p \in M$ and the total tagged tangent operator space $\hat{T}(M)$.

(4) Coordinate frame space. Input: $C^k$ manifold. Output: $C^{k-1}$ fibre bundle.
For any $C^1$ manifold $M$ and $r = 1, \ldots, \dim(M)$, construct the pointwise tangent $r$-frame space $P^r_p(M)$ for all $p \in M$, and the total tangent $r$-frame space $P^r(M)$. If $r = \dim(M) = n$, these are called the pointwise tangent $n$-frame space $P_p(M)$ for all $p \in M$, and the total tangent $n$-frame space $P(M)$.

(5) Cotangent vectors. Input: $C^k$ manifold. Output: Vector space or $C^{k-1}$ vector bundle.
For any $C^1$ manifold $M$, construct the pointwise cotangent vector space $T^*_p(M)$ for all $p \in M$ and the total cotangent space $T^*(M)$.

(6) Tensors. Input: $C^k$ manifold. Output: Tensor space or $C^{k-1}$ vector bundle.
For any $C^1$ manifold $M$, construct the pointwise tensor space $T^{r,s}_p(M)$ for all $p \in M$, and the total tensor space $T^{r,s}(M)$, where $r, s \in \mathbb{Z}^+_0$.

(7) Vector fields. Input: $C^k$ vector bundle. Output: Vector field algebra.
For any total space $\mathcal{T}$ on a $C^k$ manifold $M$, construct the space of vector fields $X^k(\mathcal{T})$ for integers $k \in \mathbb{Z}^+_0$. This is the space of $C^k$ cross-sections of $\mathcal{T}$.

(8) Map forms. Input: Two $C^k$ manifolds. Output: $C^{k-1}$ vector bundle.
For any two $C^1$ manifolds $M_1$ and $M_2$, construct the pointwise space of $T(M_2)$-valued forms $T^*_p(M_1, M_2)$ at $p \in M_1$, and the total space of $T(M_2)$-valued forms $T^*(M_1, M_2)$ on $M_1$.

(9) Map vectors. Input: Two $C^k$ manifolds. Output: $C^{k-1}$ vector bundle.
For any two $C^1$ manifolds $M_1$ and $M_2$, construct the pointwise space of $T^*(M_2)$-valued tangent vectors $T_p(M_1, M_2)$ at $p \in M_1$, and the total space of $T^*(M_2)$-valued tangent vectors $T(M_1, M_2)$ on $M_1$. (Note that $T(M_1, M_2) = T^*(M_2, M_1)$, roughly speaking.)

(10) Higher-order tangent vectors. Input: $C^k$ manifold. Output: $C^{k-\ell}$ vector bundle.
For any $C^\ell$ manifold $M$, construct the pointwise space of $\ell$th-order tangent vectors $T^{[\ell]}_p(M)$ at $p \in M$, and the total space of $\ell$th-order tangent vectors $T^{[\ell]}(M)$.

(11) Higher-order map vectors. Input: Two $C^k$ manifolds. Output: $C^{k-\ell}$ vector bundle.
For any $C^\ell$ manifolds $M_1$ and $M_2$, construct the pointwise space of $\ell$th-order $T^*(M_2)$-valued tangent vectors $T^{[\ell]}_p(M_1, M_2)$ at $p \in M_1$, and the total space of $\ell$th-order $T^*(M_2)$-valued tangent vectors $T^{[\ell]}(M_1, M_2)$.

The above construction methods may be applied recursively. The output from each principle may be an input to other building principles. Also, each $C^k$ vector bundle is also a $C^k$ manifold. So for example, $T(T(M))$ results from applying method (1) twice to a $C^k$ manifold to yield a $C^{k-2}$ vector bundle, because the $C^{k-1}$ vector bundle from the first step is also a $C^{k-1}$ manifold which can be input to the second step. Another typical example is $X^k(T^*(M))$, which results from applying method (5) followed by method (7).

The main motivation and application of all tangent space constructions is to provide a “place to live” for differentials of various kinds. For example, the differential $d\phi$ of a $C^1$ map $\phi : M_1 \to M_2$ for $C^1$ manifolds $M_1$ and $M_2$ looks like a cotangent vector at each point of $M_1$ and like a tangent vector at each point of $M_2$ in the range of $\phi$. This kind of differential requires a tangent space $T^*(M_1, M_2)$.

[ There should be a section, maybe near here, in which there is a summary of all the different kinds of spaces, including spaces of tangent vectors, tangent operators, higher-order tangent vectors, (mixed) tensors, differentials, cotangents, vector fields and differential forms. There should be at least a table summarising these, and maybe one or more diagrams too. This plethora of spaces is available even without bringing in connections or metrics. ]
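The loss of one order of regularity at each step of the recursion can be seen from the transition maps. The following sketch assumes the usual induced charts on $T(M)$, which send a tangent vector to the coordinates of its base point together with its components; the book's own construction is given in Chapter 28 and may differ in detail, and the symbol $\tau$ is introduced here only for illustration.

```latex
% Requires amsmath and amssymb.
% Assumptions: \psi_1, \psi_2 are C^k charts of M with k \ge 1, \tau = \psi_2\circ\psi_1^{-1},
% and T(M) carries the usual induced charts (p, v) \mapsto (\psi(p), v^1, \dots, v^n).
\[
  (x, v) \;\longmapsto\; \bigl(\tau(x),\, D\tau(x)\,v\bigr),
  \qquad
  \bigl(D\tau(x)\,v\bigr)^i = \sum_{j=1}^{n} \frac{\partial\tau^i}{\partial x^j}(x)\, v^j .
\]
% \tau is C^k, D\tau is C^{k-1}, and the map is linear in v, so the transition
% map of T(M) is C^{k-1}. Repeating the construction on T(M):
\[
  M \text{ is } C^k
  \;\Longrightarrow\; T(M) \text{ is } C^{k-1}
  \;\Longrightarrow\; T(T(M)) \text{ is } C^{k-2}
  \qquad (k \ge 2).
\]
```

The same count explains the outputs listed as $C^{k-1}$ in methods (1), (3), (5) and (6), since in each case the fibre coordinates transform by expressions built from the first derivatives of $\tau$.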
Chapter 28 Tangent bundles on differentiable manifolds
28.1 Styles of representation of tangent vectors
28.2 Tangent bundle metadefinition
28.3 Tangent vectors
28.4 Computational tangent vectors
28.5 Tangent operators
28.6 Tagged tangent operators
28.7 Pointwise tangent spaces
28.8 Tangent bundles
28.9 Tangent operator bundles
28.10 The tangent bundle of a tangent bundle
28.11 Horizontal components and drop functions
28.12 Tangent frames and coordinate basis vectors
28.13 Tangent space constructions, attributes and relations
28.14 Unidirectional tangent bundles
28.15 Distributions as representations of tangent bundles
28.16 Tangent bundles on infinite-dimensional manifolds
The following table summarizes some tangent space concepts which are introduced in this chapter.

| reference | concept | symbol | comments |
|---|---|---|---|
| 27.2.6 | differentiable manifold | $(M, A_M)$ | $M$ a set, $A_M = \mathrm{atlas}(M)$ a $C^r$ atlas on $M$, $r \ge 1$ |
| 28.2.1 | abstract tangent bundle | $(T, \pi, \hat{A}_T, \Phi)$ | total space $T$, projection $\pi$, bundle atlas $\hat{A}_T$, lift $\Phi$ |
| 28.3.2 | tangent coordinate triple | $(p, v, \psi)$ | $p \in M$, $v \in \mathbb{R}^n$, $\psi \in \mathrm{atlas}_p(M)$ |
| 28.3.3 | tangent vector | $t_{p,v,\psi}$ | equivalence class of $(p, v, \psi)$ for fixed $p \in M$ |
| 28.7.1 | tangent (vector) space | $T_p(M)$ | $\{t_{p,v,\psi};\ p \in M,\ v \in \mathbb{R}^n,\ \psi \in \mathrm{atlas}_p(M)\}$ |
| 28.8.1 | tangent bundle | $(T(M), A_{T(M)})$ | $T(M) = \bigcup_{p \in M} T_p(M)$ with $C^{r-1}$ atlas $A_{T(M)}$ |
| 28.10.13 | tang. bundle tang. space | $T_z(T(M))$ | tangent space of a tangent bundle, $z \in T(M)$ |
| 28.10.14 | tang. bundle tang. bundle | $T(T(M))$ | has $C^{r-2}$ atlas $A_{T(T(M))}$, $r \ge 2$ |
| 28.5.1 | tangent operator | $\partial_{p,v,\psi}$ | $\partial_{p,v,\psi} : f \mapsto v^i \partial_i^{p,\psi} f$ for $f \in C^r(M, \mathbb{R})$ |
| 28.6.2 | tagged tangent operator | $(p, \partial_{p,v,\psi})$ | tangent operator $\partial_{p,v,\psi}$ with tag $p \in M$ |
| 28.7.6 | tangent operator space | $\mathring{T}_p(M)$ | $\{\partial_{p,v,\psi};\ p \in M,\ v \in \mathbb{R}^n,\ \psi \in \mathrm{atlas}_p(M)\}$; untagged |
| 28.7.9 | tagged tang. operator sp. | $\hat{T}_p(M)$ | $\{(p, \partial_{p,v,\psi});\ p \in M,\ v \in \mathbb{R}^n,\ \psi \in \mathrm{atlas}_p(M)\}$ |
| 28.9.2 | tangent operator bundle | $(\hat{T}(M), A_{\hat{T}(M)})$ | $\bigcup_{p \in M} \hat{T}_p(M)$ with $C^{r-1}$ atlas $A_{\hat{T}(M)}$ |
| 28.12.1 | tangent $k$-frame | $(t_{p,v_j,\psi})_{j=1}^k$ | sequence of $k$ independent vectors in $T_p(M)$ |
| 28.12.2 | tangent $n$-frame set | $P_p(M)$ | set of $n$-frames at $p \in M$ |
| 28.12.8 | tangent $n$-frame bundle | $(P(M), A_{P(M)})$ | $P(M) = \bigcup_{p \in M} P_p(M)$ with $C^{r-1}$ atlas |
28.0.1 Remark: A tangent bundle is not a fibre bundle. The tangent bundle concept in this chapter is
not a sub-species of the differentiable fibre bundles defined in Chapter 35. A tangent bundle is a stand-alone concept which is closely related to topological fibre bundles and differentiable fibre bundles. (The relations between tangent bundles and fibre bundles are roughly summarized in Figure 28.0.1.)
[Figure 28.0.1: Family tree for fibre bundles and tangent bundles. Nodes: non-topological fibre bundle, topological fibre bundle, tangent bundle, differentiable fibre bundle.]
Differentiable fibre bundles require differentiable groups which in turn require tangent bundles on differentiable manifolds. So to avoid cyclic definitions, tangent bundles cannot be defined as a sub-species of differentiable fibre bundles. 28.0.2 Remark: Tangent vectors are the quintessence of directionality. The word “tangent” is derived from the Latin word “tangens” which means “touching”. A tangent vector is a vector which touches a curve at a point. Tangent vectors in flat space have been well known since classical Greek mathematics. Tangent vectors for manifolds (curved space) are defined in terms of flat-space tangent vectors via differentiable charts. Since tangent vectors are well-defined and familiar for flat spaces IRn (which are the ranges of manifold charts), it makes sense to exploit flat-space vectors to define tangent vectors for differentiable manifolds. Charts on manifolds map points to coordinates. In the reverse direction, tangent vectors in IRn are mapped back onto the manifold.
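[Figure 28.0.2: Exploitation of flat-space tangent bundle to define tangent bundles on manifolds. The diagram shows a tangent vector $V \in T(M)$ with base point $p = \pi(V) \in M$, a chart $\psi \in A_M$ with $x = \psi(p) \in \mathbb{R}^n$, and the induced chart $\Psi(\psi) \in A_{T(M)}$ with $(x, v) = \Psi(\psi)(V) \in T(\mathbb{R}^n) \equiv \mathbb{R}^n \times \mathbb{R}^n \equiv \mathbb{R}^{2n}$, together with the flat-space projection $(x, v) \mapsto x$.]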
28.0.3 Remark: Figure 28.0.2 outlines the core idea for defining tangent vectors on manifolds in this book. The idea is to use the flat-space tangent bundle $T(\mathbb{R}^n)$ alluded to in Definition 19.1.7 to define the tangent bundle $T(M)$ on a manifold $M$. Then the charts $\Psi(\psi)$ for the tangent bundle’s total space $T(M)$ are required to have the same transformation rules as the chart-transition diffeomorphisms $\phi = \psi_2 \circ \psi_1^{-1}$ for charts $\psi_1, \psi_2 \in \mathrm{atlas}_p(M) \subseteq A_M$.
28.0.4 Remark: Many differential geometry textbooks adopt tangent operators as the fundamental definition of a tangent vector. Tangent operators have some difficulties which must be swept under the carpet. So an old-fashioned and pedestrian definition of tangent vectors is adopted here, namely equivalence classes of coordinate triples $(p, v, \psi) \in M \times \mathbb{R}^n \times A_M$ for $p \in M$, $v \in \mathbb{R}^n$ and $\psi \in A_M$, where $M$ is the manifold and $A_M$ is the atlas on $M$. The space of these tangent vectors is given the symbol $T(M)$. In second place, the set of tagged tangent operators $(p, \partial_{p,v,\psi})$ is denoted $\hat{T}(M)$. In third place, tangent operators $\partial_{p,v,\psi}$ are given the symbol $\mathring{T}(M)$. The less common definition of tangent vectors as equivalence classes of $C^1$ curves passing through a point $p \in M$ will be referred to as “tangent curve classes”.
28.0.5 Remark: The word ‘vector’ is Latin meaning ‘carrier’. So a vector carries something from one point to another. According to Struik [193], page 175, the word “vector” was introduced into mathematics by William Rowan Hamilton (1805–1865) in the context of quaternions. The OED [211], page 2456, gives the date 1865 as the first recorded occurrence of the word “vector” in the mathematical sense of a quantity having both magnitude and direction, although it was used as early as 1796 in the sense of the straight line joining a planet to the focus of its orbit.
28.0.6 Remark: The words “coordinate”, “component” and “coefficient” are often used interchangeably, but they have different meanings. A coordinate is a number which tells you the location of a point in a grid, for example the numbers $x$ and $y$ in Cartesian coordinates $(x, y)$ for the plane. A component is an element of a list or array of numbers, for example the numbers $x_i$ in a vector $(x_1, \ldots, x_n)$ or the numbers $a_{ij}$ in a matrix $[a_{ij}]_{i,j=1}^n$. A coefficient is typically a constant multiplier of a term in an expression, for example the numbers $a$, $b$ and $c$ in the expression $ax^2 + bx + c$.
The numbers $v^i$ in Definition 28.5.1 for tangent operators may be described in all three ways, but with slightly different interpretations. They are coordinates of operators $\partial_{p,v,\psi}$ in the natural atlas on the total tangent operator space $\hat{T}(M)$. They are components of the $n$-tuple $v \in \mathbb{R}^n$. And they are coefficients of the first-order derivatives. These three words each suggest different relations of the numbers to their mathematical objects rather than attributes of the numbers themselves.
In the case of Definition 28.3.3 for tangent vectors, it is not strictly correct to talk of the numbers $v^i$ as coefficients, because they are not multipliers for terms in an expression. But they certainly are components of an $n$-tuple. They may also be thought of as coordinates of tangent vectors with respect to an atlas for the total space of a tangent bundle.
For both tangent vectors and tangent operators, it is probably preferable to refer to the numbers $v^i$ as coordinates. In the case of tangent operators (Definition 28.5.1), the term “coefficients” may also be used. In the case of tangent vectors (Definition 28.3.3), the term “components” may be used for the coordinates.
28.1. Styles of representation of tangent vectors
This section is a discussion of the meaning of tangent vectors, some popular styles of representation of tangent vectors, and the advantages and disadvantages of the various styles. A particular manifold may have many tangent bundles, but they are related by unique isomorphisms.
The classical meaning of a vector is an oriented segment from a given point $p$ to another point $q$. (This is explained in the article on vectors in EDM2 [34], 442.A.) Vectors arose in physics to describe velocities and forces. A velocity is typically a derivative $\gamma'(t)$ of a curve $\gamma$. A force is typically a gradient $\nabla E$ of a potential function $E$. Both of these are defined as limits of rates of variation of functions from point to point.
In flat space, there is a unique straight line joining one point to another, but in a differentiable manifold there is generally no uniquely specified shortest path between two points. There is no such thing as a geodesic in a manifold without a connection. A specification of two end-points is insufficient to specify a unique, chart-invariant shortest path in a curved space without a metric.
The closest thing to a “real meaning” of a tangent vector in a differentiable manifold $M$ is an infinitesimal translation of a point $p \in M$. This philosophical concept cannot be directly represented as a set construction. (See Section 18.1 for the problem of “infinitesimals” in the linear space case, which has a long history.) But a generator of small movements can be defined. This gives a motivation for preferring the differential operator (case (ii) in Remark 28.1.1).
In the case of embedded manifolds, it is possible to use extrinsic tangent vectors, such as are indicated in equations (42.2.1) and (42.2.2) for $S^2$ embedded in $\mathbb{R}^3$. This is useless in general, for instance for cosmology, but extrinsic tangent vectors do meet the requirements for a tangent vector object when they are available.
When a manifold is embedded in a Euclidean space, the tangent plane at any point of a smooth enough manifold is well defined. The vectors in this tangent plane may be used to describe the velocity of a curve in
the manifold, the gradient of a real-valued function on the manifold, and various other analytic concepts. The problem with non-embedded manifolds is that there is no ambient space in which to construct a tangent plane. One could artificially construct an ambient space locally for any smooth enough non-embedded manifold and use that as a tangent space. But that would be clumsy, unnatural and restrictive. (Frankel [18], section 1.3, page 23, mentions that Hassler Whitney showed that an n-dimensional manifold can always be embedded in some 2n-dimensional manifold.) Many approaches may be taken to construct intrinsic tangent vectors, such as using the coordinates from local charts, using curves within the manifold to substitute for straight-line tangents, or using gradient operators. The problem of defining tangent vectors directly in terms of the points of a manifold arises because the only thing known about the points of a manifold is that they are elements of a set. It is not at all clear how to construct well-defined tangent vectors for abstract points. When presented with a manifold, the tangent space is not usually part of the given structure. One must construct a tangent space somehow from the points. (For further discussion of this issue, see Remark 28.1.11.)
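The extrinsic picture described above can be checked numerically. The following is a minimal sketch, assuming the unit sphere $S^2$ embedded in $\mathbb{R}^3$ and a hypothetical curve `curve_on_sphere` chosen only for illustration; neither the curve nor the helper names come from the book. It approximates the extrinsic velocity by a central difference and confirms that this velocity has no radial component, which means that it lies in the tangent plane at the base point.

```python
import math

def curve_on_sphere(t):
    """Hypothetical curve on the unit sphere S^2 in R^3, in spherical angles."""
    theta = 1.0 + 0.5 * t  # colatitude
    phi = 0.3 + 2.0 * t    # longitude
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def velocity(curve, t, h=1e-6):
    """Central-difference approximation of the extrinsic velocity in R^3."""
    a, b = curve(t + h), curve(t - h)
    return tuple((ai - bi) / (2 * h) for ai, bi in zip(a, b))

def dot(u, w):
    """Euclidean inner product in the ambient space R^3."""
    return sum(ui * wi for ui, wi in zip(u, w))

p = curve_on_sphere(0.0)            # base point on the sphere
v = velocity(curve_on_sphere, 0.0)  # extrinsic tangent vector at p
radial_component = dot(p, v)        # vanishes for a vector tangent to S^2
speed = math.sqrt(dot(v, v))        # nonzero, so v is a genuine direction
```

This test uses the ambient inner product of $\mathbb{R}^3$, which is exactly the structure that is missing for a non-embedded manifold.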
28.1.1 Remark: The coordinate transition matrices for charts are presented in Definition 27.4.7. Any object which transforms according to these matrices may be accepted as a valid tangent vector representation. This idea is implemented in Metadefinition 28.2.1. Some examples of intrinsic tangent vector constructions which satisfy this metadefinition are as follows.
(i) Coordinates. Almost all real-life computations use this representation. A tangent vector may be represented as an equivalence class of triples $(p, v, \psi)$, where $p \in M$ is the vector’s base point, $v \in \mathbb{R}^n$ is the set of tangent vector coordinates and $\psi$ is a chart. (See Definition 28.3.3.) Alternatively, tangent vectors may be represented by triples $(x, v, \alpha)$ where $x$ is the set of coordinates of $p \in M$ and $\alpha$ is the index of $\psi$ in an indexed atlas. (See Definition 28.4.2. This kind of definition is almost identical to Darling [13], page 135, section 6.6.2 and Frankel [18], page 23, section 1.3a.)
(ii) A differential operator. This is popular for analytical and theoretical applications. The operator has the form $f \mapsto \sum_i v_\psi^i \, (\partial/\partial x^i)(f \circ \psi^{-1}(x)) \big|_{x=\psi(p)}$ for $v_\psi \in \mathbb{R}^n$. This style of definition requires a space of functions to differentiate. (See comments in Remark 28.1.2.)
(iii) An equivalence class of curves. This representation defines a tangent vector at $p \in M$ to be an equivalence class of curves $\gamma : \mathbb{R} \to M$ which have the same velocity vector $\gamma'(0)$ at $p = \gamma(0)$, where $\gamma'(0) = (d/dt)\,\psi(\gamma(t)) \big|_{t=0}$ for charts $\psi : M \to \mathbb{R}^n$. (See comments in Remark 28.1.3.) [ For equivalence classes of curves, see Malliavin [35], proposition I.4.1. ] (See for instance Gallot/Hulin/Lafontaine [19], page 14, section 1.25. See also Crampin/Pirani [11], page 247.)
(iv) A derivation. This representation is based on linear functionals $L : C^\infty(M) \to \mathbb{R}$ which obey the Leibniz rule. (See Section 45.1.) These are essentially the same as the differential operators in (ii), but are defined more algebraically. They don’t work correctly for $C^k(M)$ spaces with $k < \infty$. (This form of tangent vector definition appears in EDM2 [34], section 105.F, and Gallot/Hulin/Lafontaine [19], section 1.45–1.53, pages 20–22.)
(v) A generalized function. Schwartz distributions, for example, represent generalized functions as elements of the duals of spaces such as $C_0^\infty(M)$. In particular, points may be represented as Dirac delta functions and tangent vectors may be represented as directional derivatives of delta functions. This is a broad extension of the differential operator and derivation styles of definition. This style of representation is restricted to $C^\infty$ manifolds.
It is straightforward to invent other reasonable definitions of tangent vectors, such as an equivalence class of local diffeomorphisms or an equivalence class of sequences of points (converging with a specified velocity to a point).
28.1.2 Remark: The differential operator representation of tangent vectors in part (ii) of Remark 28.1.1 is often claimed to be chart-independent and “coordinate-free”, although clearly in order to specify which vector one is talking about, one must give the vector components. The operator definition has the advantage that the transformation rules follow automatically from its form. It is probably the style of representation which is most widely regarded as the essence of tangent vectors in differential geometry texts. One seldom-mentioned disadvantage of this representation is the “zero vector ambiguity problem”: the zero vectors at all points in the manifold are represented by the same operator. (See Remark 28.5.12, Definition 28.6.2 and Remark 28.6.1.)
Another problem is that the space of differentiable functions on a manifold must be defined beforehand. A differentiable function $f$ is defined as one for which the derivatives $(\partial/\partial x^i)(f \circ \psi^{-1}(x))$ are well-defined for charts $\psi$. This is uncomfortably close to being a circular definition. The space $C^1(M)$ is a very large set of functions. Defining a tangent vector as a linear functional on an infinite-dimensional linear space of functions is certainly not conceptually economical. In fact, the $n$ coordinate functions $p \mapsto \psi^i(p)$ for $i = 1 \ldots n$ (for a fixed chart $\psi$ whose domain contains $p$) suffice as test functions to fully determine the operator, but this is a reversion to coordinates again, which the operator definition is supposed to avoid. Clearly the claims that differential operators provide the best, simplest or most natural representation of tangent vectors are vulnerable to scrutiny. (See also Remark 28.6.6.)
One should regard tangent operators as being abstract differentiation procedures rather than actions on a particular class of test functions, because otherwise the chosen class will always be too large or small for some applications.
28.1.3 Remark: Since the curves in part (iii) of Remark 28.1.1 are embedded within the point space of the manifold, this definition has a strong intuitive appeal which seems to be coordinate-free, but this is illusory because the choice of curves depends heavily on the coordinate maps. The representation is quite uneconomical in practice because a single vector is represented as an infinite number of curves. The curves themselves must be tested to ensure that the expression $(d/dt)\,\psi^i(\gamma(t))$ is well-defined for all components $\psi^i$ of the coordinate charts of the manifold. So the coordinates are not excluded from the definition. When one wishes to indicate a particular tangent vector, in practice one must specify the components of the derivatives.
So, like the operator in part (ii), this style of definition has intuitive appeal but no practical value.
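The point of Remark 28.1.3 can be illustrated with a small numerical sketch. It assumes that $M$ is the upper half of the plane, with a Cartesian chart and a polar chart, and it uses two hypothetical curves `gamma1` and `gamma2` through the same point; all of these names are illustrative assumptions and not part of the text. The two curves lie in the same curve class, but naming that class in practice means quoting velocity components with respect to some chart, and the components differ from chart to chart.

```python
import math

def gamma1(t):
    """Straight-line curve through p = (0, 2) at t = 0."""
    return (t, 2.0 + 2.0 * t)

def gamma2(t):
    """A different curve through p with the same first-order behaviour at t = 0."""
    return (t + t * t, 2.0 + 2.0 * t + math.sin(t) ** 2)

def cartesian_chart(q):
    """Identity chart on the plane."""
    return q

def polar_chart(q):
    """Polar chart (r, theta), valid on the upper half-plane used here."""
    x, y = q
    return (math.hypot(x, y), math.atan2(y, x))

def chart_velocity(chart, curve, h=1e-6):
    """Central-difference approximation of (d/dt) chart(curve(t)) at t = 0."""
    a, b = chart(curve(h)), chart(curve(-h))
    return tuple((ai - bi) / (2 * h) for ai, bi in zip(a, b))

cart1 = chart_velocity(cartesian_chart, gamma1)
cart2 = chart_velocity(cartesian_chart, gamma2)
pol1 = chart_velocity(polar_chart, gamma1)
pol2 = chart_velocity(polar_chart, gamma2)
```

Here `cart1` agrees with `cart2` and `pol1` agrees with `pol2`, so the equivalence of the two curves does not depend on the chart, whereas the component pairs themselves are chart-dependent.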
28.1.4 Remark: All of the constructions in Remark 28.1.1 have the correct transformation rules. It is difficult to select one representation as morally superior to all others. The approach here will be to say that a tangent space is any construction which has the right properties and relations to the corresponding manifold. All such representations and constructions are isomorphic, equivalent and interchangeable. However, option (i) is chosen here as the preferred representation because it is the best starting point for deriving all of the other representations. (See Definition 28.3.3.)
For comparison, the positive integers may be represented as Babylonian, Greek, Roman or Arabic numerals, in binary, octal, hexadecimal or sexagesimal. The best choice depends on context. The same is true for tangent vector representations.
28.1.5 Remark: Tangent vectors are used to define tangents of curves ($\gamma'$ for $\gamma : \mathbb{R} \to M$), gradients of real-valued functions ($\nabla_v f$ for $f : M \to \mathbb{R}$), differentials of functions ($d\phi$ for $\phi : M_1 \to M_2$), affine connections ($\rho : T(M) \to T(P(M))$), vector fields ($X : M \to T(M)$), and many other analytic concepts. For the development of such definitions, the most convenient tangent vector definition is arguably the differential operator option (ii), but this convenience can be obtained with option (i) by using the notation $D_V$ to distinguish a tangent operator from its corresponding component-based vector $V$. Thus component vectors and tangent operators may happily co-exist.
28.1.6 Remark: One might ask why $C^1$ test functions are used instead of general differentiable functions. The reason is to guarantee the equality of the expressions $\lim_{a \to 0} (f(x + av) - f(x))/a$ and $\sum_{i=1}^n v^i \, \partial f(x)/\partial x^i$. [ Must give a reference for this. ]
28.1.7 Remark: It is reasonable to ask whether the existence of unidirectional derivatives everywhere on a manifold would be sufficient to define meaningful tangent vectors like $\lim_{a \to 0^+} (f(x + av) - f(x))/a$ which are unidirectional. The examples in Section 43.4 give some hints on this subject. Example 43.4.4 shows that $C^{0,1}$ regularity is not sufficient to guarantee existence everywhere of directional derivatives. However, if the chart transition maps have well-defined unidirectional derivatives, it should be possible to define unidirectional tangent vectors. This could be useful for some applications. (Such generalizations are discussed in Section 28.14.)
28.1.8 Remark: In a sense, the tangent vector representations (ii) and (iii) are inverses or duals of each other. Just as a test function $f : M \to \mathbb{R}$ is one coordinate of a coordinate chart $\psi : M \to \mathbb{R}^n$, each curve $\gamma : \mathbb{R} \to M$ is one inverse coordinate of an inverse coordinate chart $\psi^{-1} : \mathbb{R}^n \to M$. Both the differential operator and curve representations of tangent vectors are pruned-down versions of coordinate
charts. Therefore one may as well go the whole hog and use full coordinate charts. The moral of this story is that there is no such thing as a coordinate-free tangent vector. Test functions and curves are thinly disguised coordinate charts. 28.1.9 Remark: Sometimes vectors do not transform as they should. Some quantities in physics do not vary under all transformations in GL(n) according to the standard matrix rules. In such cases, a different invariance group may be required. So, all things considered, it may be best to define a vector as an equivalence class of triples (p, v, ψ), where p is a point in a space, v is a set of coordinates for the vector, ψ is an element of the permitted set of charts for the space, and a set of transformation rules is supplied for determining the equivalence class. The point/coordinates/chart form of vector definition is unsatisfying. Since the coordinates must be the coordinates of something, it leaves open the question of what that something is. But then, one can equally observe that the Cartesian coordinates for a point in space are just numbers which depend upon the coordinate frame in a specified way. The coordinates are certainly not points. Nor is any equivalence class of coordinate/chart pairs (p, ψ) a point. A point is really something outside the scope of pure set theory. But it is not really an empirical construct either. A point is a psychological construct within the minds of mathematicians. This construct is useful for modelling the real world, and it may be given coordinates. But the point itself is undefined, just as in the case of classical Euclidean geometry.
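The claim in Remark 28.1.8 that test functions are thinly disguised coordinate charts can be made concrete. The following is a minimal sketch, assuming the right half-plane with a polar chart `psi` and a hypothetical helper `tangent_operator` which approximates $f \mapsto \sum_i v^i (\partial/\partial x^i)(f \circ \psi^{-1}(x))|_{x=\psi(p)}$ by central differences; the names and the choice of chart are assumptions made only for this illustration. Applying the operator to the $n$ coordinate functions of the chart returns exactly the components $v^i$.

```python
import math

def psi(q):
    """Polar chart (r, theta) on the right half-plane."""
    x, y = q
    return (math.hypot(x, y), math.atan2(y, x))

def psi_inv(c):
    """Inverse of the polar chart."""
    r, th = c
    return (r * math.cos(th), r * math.sin(th))

def tangent_operator(p, v, chart, chart_inv, h=1e-6):
    """Return f -> sum_i v^i d/dx^i (f o chart_inv)(x) at x = chart(p)."""
    x0 = chart(p)
    def apply(f):
        total = 0.0
        for i, vi in enumerate(v):
            up, dn = list(x0), list(x0)
            up[i] += h
            dn[i] -= h
            # Central difference of f o chart_inv in the i-th chart coordinate.
            total += vi * (f(chart_inv(tuple(up))) - f(chart_inv(tuple(dn)))) / (2 * h)
        return total
    return apply

p = (1.0, 1.0)
v = (0.5, -2.0)  # components with respect to the polar chart
D = tangent_operator(p, v, psi, psi_inv)

# The chart's own coordinate functions, used as test functions, recover v.
recovered = (D(lambda q: psi(q)[0]), D(lambda q: psi(q)[1]))

# Any other test function is differentiated using the same components.
value_xy = D(lambda q: q[0] * q[1])
```

The pair `recovered` reproduces `v` up to discretisation error, so the operator carries no information beyond the triple $(p, v, \psi)$ from which it was built.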
28.1.10 Remark: A point is defined as something which has position but no extent. But this is not very illuminating. Since points cannot ultimately be defined within mathematics, it is no surprise that vectors are not defined either. So if it is good enough to define a point as somehow underlying the sets of coordinates that describe it, then surely this must be good enough for vectors. One may as well define them as equivalence classes of triples $(p, v, \psi)$ for $p, v \in \mathbb{R}^n$ just as points are really equivalence classes of pairs $(x, \psi)$ for $x \in \mathbb{R}^n$. If coordinates are good enough for points, surely they are good enough for vectors too. It follows that for consistency one should define vectors by coordinates rather than by differential operators on function spaces.
28.1.11 Remark: After ten years of occasional meditation on the question, the author finally decided on 22 October 2001 that tangent vectors are none of the usual candidates. In fact, tangent vectors are a class of object rather than a particular set or function, even if the manifold is a particular given construction. To be precise, a tangent vector is any mathematical object which has the correct transformation rules. It is not necessarily a differential operator, an equivalence class of curves, an equivalence class of coordinates, a derivation, a germ, a jet or a distribution. All of these may qualify as tangent vectors as long as they obey the specified transformation rules. The following comment was written by this author shortly after arriving at this conclusion.
Solved! I’ve finally on the morning of 22 October 2001, ten years after I started trying to decide the issue, discovered a good solution to the question of how to represent tangent vectors. The answer is that one should not select any particular set structure as being the tangent vector structure. Instead, one should provide a test which any given set structure must pass in order to qualify as a tangent vector. This is what mathematical physicists do anyway. They define an object, and then they test it to see if it qualifies as a vector or tensor of some kind. There is nothing radical about this approach. It is exactly what was done in the case of defining a manifold. It was nowhere said that a manifold is a member of some specified set of objects, such as the set of topological grafts of Euclidean spaces with various properties, although that could have been done. In fact, manifolds can be anything at all that has a suitable topology. A differentiable manifold is just a manifold with an atlas having the right sort of property. If the points of a manifold are just anything you like, as long as the structure has the right properties, then why shouldn’t the tangent bundle also be any set structure which you want to put on a manifold which happens to pass a qualification test? Therefore from now on, I intend to refer to “a” tangent bundle rather than “the” tangent bundle of a given manifold. ‘Real tangent vectors’ lie outside the scope of set theory, just as ‘real points’ do.
Probably the majority of mathematical classes are defined in terms of satisfying specified conditions rather than as specific set constructions. Thus linear spaces have points in any set at all, with operations satisfying various conditions, whereas the space $\mathbb{R}^n$ is a specific set construction, as long as a particular representation of the real numbers is decided on. But even the real numbers
or the integers may be regarded as classes of objects rather than particular set constructions. Any system which is isomorphic to a given system may then be regarded as representing the same object.
28.2. Tangent bundle metadefinition
This section presents a specification, or metadefinition, for tangent bundles. The choice of representation for a tangent bundle is “outsourced”. That is, any representation may be freely chosen within the specified rules. Whenever you outsource anything, you must be careful to specify exactly what you expect to get. This is essentially an axiomatic approach as opposed to a specific representation.
The set of tangent vectors on a manifold is so important that it deserves its own name: “tangent bundles”. Although the word “bundle” is used, this does not imply that it is a kind of fibre bundle. The word “space” is not quite right because that would suggest a linear space, which the set of all tangent vectors on a manifold certainly is not.
A tangent bundle requires additional structure to qualify for the name “fibre bundle”. With the addition of a topology on the total space and a suitable topological group structure on $\mathbb{R}^n$, a tangent bundle qualifies as a topological fibre bundle. (See Section 23.6.) If the base manifold $M$ is $C^2$, the addition of an induced differentiable structure on a tangent bundle for $M$ qualifies it as a differentiable fibre bundle. (See Chapter 35.) Since differentiable fibre bundles are defined in terms of tangent bundles, it is important to avoid an infinite cycle of definitions.
[ Try to adapt of Metadefinition 28.2.1 to define pseudo-vectors (cross-product in IR3 ) and spinors. Also try to adapt it for single-sided tangent vectors analogous to unidirectional derivatives. ] 28.2.1 Metadefinition: A tangent bundle for an n-dimensional C 1 manifold (M, AM ) must provide a tuple (T , π, AˆT , Φ) − > (T , AˆT ) − > T which satisfies the following conditions. (i) π : T → M is a surjective map. (ii) Φ : AM → AˆT is a bijection.
(iii) ∀ψ ∈ AM , Φ(ψ) : π −1 (Dom(ψ)) → IRn . (iv) ∀p ∈ M, ∀ψ ∈ atlasp (M ), Φ(ψ) π−1 ({p}) : π −1 ({p}) → IRn is a bijection. (v) ∀p ∈ M, ∀ψ1 , ψ2 ∈ atlasp (M ), ∀V ∈ π −1 ({p}), ∀i = 1 . . . n, Φ(ψ2 )(V )i =
n X ∂ −1 i (ψ ◦ ψ (x)) Φ(ψ1 )(V )j . 2 1 j ∂x x=ψ1 (p) j=1
(28.2.1)
T is called the total space of the tangent bundle. An element of T is called a tangent vector. π is called the projection map of the tangent bundle. AˆT is called the tangent bundle atlas of the tangent bundle. The maps φψ ∈ AˆT are called tangent bundle charts. Φ is called the lift function of the tangent bundle. A tangent vector at p ∈ M is any element of π −1 ({p}). The tangent vector at p ∈ M with coordinates v ∈ IRn with respect to chart ψ ∈ AM is the tangent vector (Φ(ψ) π−1 ({p}) )−1 (v) ∈ T . [ www.topology.org/tex/conc/dg.html ]
[ draft: UTC 2009–3–21 Saturday 11:36 ]
Differential geometry reconstructed.Copyright (C) 2009, Alan U. Kennington. All Rights Reserved.The author hereby grants permission to print this book draft in A4 format.Printing in all other formats is forbidden.You may not charge any fee for copies of this book draft.
Metadefinition 28.2.1 provides an acceptance test for any proposed tangent bundle definition for a differentiable manifold. To pass the test, a proposed tangent bundle construction must provide a set $T$ (the “total tangent space”) whose elements will be called the “tangent vectors” of the manifold. A projection function $\pi$ must be provided so that each vector $V \in T$ can be associated with a unique base point $p = \pi(V) \in M$. It must be possible to determine the components in $\mathbb{R}^n$ of any vector $V \in T$ with respect to any given chart for the manifold. That is, given any chart $\psi \in A_M$, it must be possible to determine the components $\phi_\psi(V) \in \mathbb{R}^n$ of any vector $V \in \pi^{-1}(\mathrm{Dom}(\psi))$. The tangent bundle (total space) atlas must be consistent with the manifold atlas.
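The acceptance test described here can be sketched numerically. The following Python fragment is only an illustration under stated assumptions, not part of the formal development: the manifold is taken to be the open right half-plane with a Cartesian chart and a polar chart, the candidate bundle consists of pairs (p, w) with w a Cartesian velocity, and every function name is hypothetical. The check approximates the Jacobian of the transition map by central differences and compares both sides of condition (v).

```python
import math

# Assumed toy manifold: M = {(X, Y) : X > 0}, with two charts.
def psi1(p):                      # Cartesian chart (identity)
    return (p[0], p[1])

def psi1_inv(x):
    return (x[0], x[1])

def psi2(p):                      # polar chart (r, theta)
    return (math.hypot(p[0], p[1]), math.atan2(p[1], p[0]))

def psi2_inv(x):
    return (x[0] * math.cos(x[1]), x[0] * math.sin(x[1]))

# Hypothetical candidate bundle: V = (p, w), w = Cartesian velocity at p.
def proj(V):                      # projection map pi
    return V[0]

def lift_psi1(V):                 # Phi(psi1): Cartesian components
    return V[1]

def lift_psi2(V):                 # Phi(psi2): polar components (dr, dtheta)
    (X, Y), (wx, wy) = V
    r = math.hypot(X, Y)
    return ((X * wx + Y * wy) / r, (X * wy - Y * wx) / (r * r))

def jacobian(g, x, h=1e-6):
    """Central-difference Jacobian J[i][j] = d g^i / d x^j at x."""
    n = len(x)
    cols = []
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        cols.append([(a - b) / (2 * h) for a, b in zip(g(xp), g(xm))])
    return [[cols[j][i] for j in range(n)] for i in range(len(cols[0]))]

def satisfies_condition_v(V, chart_a, chart_a_inv, lift_a, chart_b, lift_b,
                          tol=1e-6):
    """Compare lift_b(V) with the transition Jacobian applied to lift_a(V)."""
    p = proj(V)
    J = jacobian(lambda x: chart_b(chart_a_inv(x)), chart_a(p))
    va, vb = lift_a(V), lift_b(V)
    predicted = [sum(J[i][j] * va[j] for j in range(len(va)))
                 for i in range(len(J))]
    return all(abs(a - b) <= tol for a, b in zip(predicted, vb))
```

A candidate that returned Cartesian components for the polar chart would fail this check, which is the sense in which the metadefinition acts as a test rather than a construction.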
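Figure 28.2.1   Tangent bundle metadefinition “lift” function $\Phi$ and “anchor” charts $\phi_\psi = \Phi(\psi)$. (Diagram labels: a tangent vector $V \in T$ with base point $p = \pi(V) \in M$; a chart $\psi \in A_M$ with $x = \psi(p) \in \mathbb{R}^n$; the tangent bundle chart $\phi_\psi \in \hat{A}_T$ with $v = \phi_\psi(V) \in \mathbb{R}^n \equiv T_{\psi(p)}(\mathbb{R}^n)$.)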
28.2.2 Remark: Metadefinition 28.2.1 is illustrated in Figure 28.2.1. The range set $\mathbb{R}^n$ for the functions $\Phi(\psi)\big|_{\pi^{-1}(\{p\})}$ should be thought of as the tangent space $T_{\psi(p)}(\mathbb{R}^n)$ of $\mathbb{R}^n$ at $\psi(p)$. In other words, the maps $\Phi(\psi)$ associate tangent vectors of the manifold $M$ at $p$ with tangent vectors of $\mathbb{R}^n$ at $\psi(p)$.

Condition (iii) implies that the functions $(\psi \circ \pi) \times \Phi(\psi)$ for $\psi \in A_M$ map from subsets of $T$ to $\mathbb{R}^n \times \mathbb{R}^n$, which may be identified with the set of tangent vectors of $\mathbb{R}^n$. The set of the functions $(\psi \circ \pi) \times \Phi(\psi)$ will be a $C^0$ atlas for $T$.

Note that upper-case $V$ is used for tangent vectors in $T$ whereas lower-case $v$ is used for vector components in $\mathbb{R}^n$. Recall also (Section 5.16) that “$A \mathrel{>\!\!-} B$” means that expression $B$ is an abbreviation for expression $A$.

28.2.3 Remark: Metadefinition 28.2.1 (v) implies that $\phi_\psi(V)$ is uniquely determined for all charts $\psi \in \mathrm{atlas}_{\pi(V)}(M)$ if the value of $\phi_\psi(V)$ is given for any particular chart $\psi$.

28.2.4 Remark: The tangent bundle charts $\phi_\psi = \Phi(\psi) \in \hat{A}_T$ in Metadefinition 28.2.1 “anchor” the vectors of the tangent bundle to particular tangent vectors in the already-defined tangent bundle of $\mathbb{R}^n$. The familiar, well-understood tangent vectors in flat space are “leveraged” to define tangent vectors on differentiable manifolds. The tangent bundle charts $\phi_\psi$ answer the question: “Which direction does each tangent vector have?” Condition (v) ensures that the answer to this question is independent of the choice of chart.

By anchoring tangent vectors on manifolds to tangent vectors on $\mathbb{R}^n$, each tangent vector has well-defined coordinates for each manifold chart because flat-space tangent vectors have well-defined coordinates. The projection map $\pi$ similarly anchors tangent vectors to their base points. The combination of the projection map and tangent bundle charts completely determines the base point $p$ and the coordinates $v$ for each chart $\psi$. This leads naturally to an equivalence class of triples $(p, v, \psi)$ describing each tangent vector. This is the basis of Definition 28.3.3.

The inverses of tangent bundle charts induce tangent space structure from $\mathbb{R}^n$ onto the total space $T$. The inverses of charts also induce a topology and manifold atlas on the tangent bundle’s total space. This automatically gives the total space a topological manifold structure. If this structure is differentiable, it is possible to build a tangent bundle on the tangent bundle. This is essential for defining the higher-order derivatives which are required in physical models.

There are many different definitions of tangent vectors on manifolds, but as long as they are anchored to the flat-space tangent vectors, the existence of unique isomorphisms between the various tangent bundle definitions is guaranteed. Therefore there is no ambiguity. All tangent bundle definitions are equivalent and interchangeable. (The same requirement for unique isomorphisms is discussed in Remark 13.4.6 for the flat-space tensor product metadefinition.)

28.2.5 Remark: The form of Metadefinition 28.2.1 would be easy to generalize by replacing the “fibre space” $\mathbb{R}^n$ with some other space and replacing the general linear group $GL(n, \mathbb{R})$ which is implicit in the transition map rule with some other group. In fact, the tangent bundle metadefinition may be thought of as a template for all fibre bundle definitions. In other words, fibre bundles are merely generalizations of tangent bundles.
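Figure 28.2.2   Tangent bundle metadefinition transition maps. (Diagram labels: a tangent vector $V \in T$ with base point $p = \pi(V) \in M$; charts $\psi_1, \psi_2$ with $x_1 = \psi_1(p)$, $x_2 = \psi_2(p)$ and transition map $\psi_2 \circ \psi_1^{-1}$; tangent bundle charts $\phi_{\psi_1} = \Phi(\psi_1)$, $\phi_{\psi_2} = \Phi(\psi_2)$ with $v_1 \in T_{\psi_1(p)}(\mathbb{R}^n) \equiv \mathbb{R}^n$, $v_2 \in \mathbb{R}^n \equiv T_{\psi_2(p)}(\mathbb{R}^n)$ and transition map $\phi_{\psi_2} \circ \bigl(\phi_{\psi_1}\big|_{\pi^{-1}(\{p\})}\bigr)^{-1}$.)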
28.2.6 Remark: The transition rule condition (v) in Metadefinition 28.2.1 is illustrated in Figure 28.2.2. The transition map $\phi_{\psi_2} \circ \bigl(\phi_{\psi_1}\big|_{\pi^{-1}(\{p\})}\bigr)^{-1}$ in Metadefinition 28.2.1 (v) is a linear bijection from $\mathbb{R}^n$ to $\mathbb{R}^n$. In other words, it is an element of $GL(n, \mathbb{R})$. For $p \in M$, let $x_1 = \psi_1(p)$, $x_2 = \psi_2(p)$, $V \in \pi^{-1}(\{p\})$, $v_1 = \phi_{\psi_1}(V)$ and $v_2 = \phi_{\psi_2}(V)$. Then

$$x_2 = \psi_2 \circ \psi_1^{-1}(x_1), \qquad v_2 = \phi_{\psi_2} \circ \bigl(\phi_{\psi_1}\big|_{\pi^{-1}(\{p\})}\bigr)^{-1}(v_1)$$

and

$$\forall i = 1 \ldots n, \qquad v_2^i = \sum_{j=1}^n \frac{\partial}{\partial x^j} \bigl(\psi_2 \circ \psi_1^{-1}(x)\bigr)^i \Big|_{x=\psi_1(p)} v_1^j.$$

28.2.7 Remark: Metadefinition 28.2.1 (iii) implies that $\forall \psi \in A_M$, $\mathrm{Dom}(\Phi(\psi)) = \pi^{-1}(\mathrm{Dom}(\psi))$. This constraint could be relaxed to permit a tangent space atlas with a very general set of chart domains. But this book is following a policy of minimal structures. If a manifold has only 2 charts, then the tangent space gets only 2 charts and they match domain for domain. If a particular tangent space implementation has more charts, that could reduce the regularity of the tangent space. That would be a different differentiable structure. The approach taken here maximizes the regularity of the tangent space.

[ This remark is similar to Remark 28.8.9. ]

28.2.8 Remark: Metadefinition 28.2.1 (v) says effectively that anything is a tangent space if it transforms like a tangent space between charts on the manifold. This kind of outsource-and-test approach permits the coexistence of a large number of different definitions of tangent spaces. This fits well with general practice in physics, which is to accept anything as a vector which transforms like a vector. A good example of accepting anything as a tangent vector which transforms like a tangent vector is Frankel [18], section 1.3a, page 23.

However, physicists typically test mathematical expressions to determine if they are vectors whereas mathematicians test mathematical set constructions to see if they are correct representations of the class of tangent bundles. In the former case, the set construction is generally fixed. In the latter case, it is the set construction which is to be tested, not the form of an expression which determines the numbers which populate the set construction.

[ Define the topology induced on a tangent bundle by tangent bundle charts. If a tangent bundle definition has a topology, it must be the same as this induced topology. Also define a differentiable structure on the tangent bundle induced by the tangent bundle charts. Show some basic properties of the induced topology and induced differentiable structure. ]
28.2.9 Theorem: The set of maps

$$A_{T(M)} = \{ Q_{n,n} \circ ((\psi \circ \pi) \times \Phi(\psi));\ \psi \in A_M \},$$

where $Q_{n,n} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^{2n}$ is the concatenation map in Definition 8.5.3, is a $2n$-dimensional differentiable structure on any tangent bundle defined under Metadefinition 28.2.1 for an $n$-dimensional manifold.

[ Definition 28.8.4 defines the standard atlas on a concrete tangent bundle as in Theorem 28.2.9. ]
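The chart construction in Theorem 28.2.9 is easy to picture in code. The sketch below is a hedged illustration only: it assumes the half-plane $X > 0$ with a single polar chart and a hypothetical bundle of pairs (p, w), and all names are invented here. The concatenation map $Q_{n,n}$ is modelled by tuple concatenation.

```python
import math

def psi(p):
    """Polar chart on the half-plane X > 0 (an assumed example chart)."""
    return (math.hypot(p[0], p[1]), math.atan2(p[1], p[0]))

def proj(V):
    """Projection map pi: V = (p, w) is sent to its base point p."""
    return V[0]

def lift_psi(V):
    """Phi(psi): polar components of the Cartesian velocity w at p."""
    (X, Y), (wx, wy) = V
    r2 = X * X + Y * Y
    return ((X * wx + Y * wy) / math.sqrt(r2), (X * wy - Y * wx) / r2)

def bundle_chart(V):
    """Q_{n,n} o ((psi o pi) x Phi(psi)): a point of R^{2n} for each V."""
    return psi(proj(V)) + lift_psi(V)   # tuple concatenation plays Q_{n,n}
```

The first $n$ entries of the result locate the base point and the last $n$ entries give the vector components, so one chart on $M$ yields exactly one chart on $T$, as in Remark 28.2.7.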
28.3. Tangent vectors

28.3.1 Remark: The two main forms of tangent vector definition used in this book are Definition 28.3.3 (component form) and Definition 28.6.2 (differential operator form). For computational purposes, it is best to work with coordinates only. Definition 28.4.2 (computational form) contains coordinate numbers only. Since Metadefinition 28.2.1 requires tangent vector components to be specified for all charts, Definition 28.3.3 offers an excellent combination of simplicity and versatility. It is very easy to convert a component-triple vector into a tangent operator, but not vice versa.

28.3.2 Definition: A tangent coordinate triple for an $n$-dimensional $C^1$ manifold $(M, A_M)$ is a triple $(p, v, \psi) \in M \times \mathbb{R}^n \times A_M$ such that $p \in \mathrm{Dom}(\psi)$.

28.3.3 Definition: A tangent vector for a $C^1$ manifold $M \mathrel{-\!\!<} (M, A_M)$ with $n = \dim(M)$ is an equivalence class $[(p, v, \psi)]$ of tangent coordinate triples in $\bigcup_{\psi \in A_M} (\mathrm{Dom}(\psi) \times \mathbb{R}^n \times \{\psi\})$, where the triples $(p_1, v_1, \psi_1)$ and $(p_2, v_2, \psi_2)$ for $\psi_1, \psi_2 \in A_M$ are said to be equivalent whenever $p_1 = p_2 = p$ and

$$\forall i = 1 \ldots n, \qquad v_2^i = \sum_{j=1}^n \frac{\partial}{\partial x^j} \bigl(\psi_2 \circ \psi_1^{-1}(x)\bigr)^i \Big|_{x=\psi_1(p)} v_1^j. \tag{28.3.1}$$

28.3.4 Notation: $t_{p,v,\psi}$ for $p \in M$, $v \in \mathbb{R}^n$, $\psi \in \mathrm{atlas}_p(M)$ and $n \in \mathbb{Z}_0^+$ denotes the equivalence class $[(p, v, \psi)]$ in Definition 28.3.3. In other words, $t_{p,v,\psi} = [(p, v, \psi)]$.

28.3.5 Remark: The set of triples $\bigcup_{\psi \in A_M} (\mathrm{Dom}(\psi) \times \mathbb{R}^n \times \{\psi\})$ in Definition 28.3.3 is the same as the set $\bigcup_{p \in M} (\{p\} \times \mathbb{R}^n \times \mathrm{atlas}_p(M))$.

28.3.6 Remark: It is convenient to think of the equivalence class $t_{p,v,\psi} = [(p, v, \psi)]$ in Definition 28.3.3 as the “tangent vector at $p$ with components $v$ with respect to the chart $\psi$”. If the reader chooses to use a differential operator representation of tangent vectors instead of components, the expression $t_{p,v,\psi} = [(p, v, \psi)]$ can be identified with that representation, or any other representation.

Definition 28.3.3 is almost the same as the tangent vector definition by Frankel [18], section 1.3a, page 23, which is very much in the style favoured by physicists. Frankel essentially defines a tangent vector at $p$ as an equivalence class of the form $[(\psi, v)]$. (This equivalence class just happens to be the graph of a function with domain $\mathrm{atlas}_p(M)$.) The Frankel definition requires the association of a coordinate $n$-tuple $v$ with every chart $\psi$ so that equation (28.3.1) is satisfied. One may similarly regard an equivalence class $[(p, v, \psi)]$ as a function with domain $\{p\} \times \mathrm{atlas}_p(M)$ and range $\mathbb{R}^n$ by identifying each triple $(p, v, \psi)$ with the pair $((p, \psi), v)$.

The main problem with regarding the components $v$ as a function of a set of charts is that the function’s domain is $\mathrm{atlas}_p(M)$, a set of charts which varies according to the point $p \in M$. Therefore this representation would define the set of vectors on a manifold to be a set of functions with an untidy, variable set of point-dependent domains. Therefore the “function-of-charts” representation is not used in this book.

28.3.7 Remark: There could be benefits in changing the order of tangent component triples from $(p, v, \psi)$ to $(p, \psi, v)$. This would help to clarify that the components $v$ are a function of the set of pairs $(p, \psi)$, as noted in Remark 28.3.6. The altered order would also be better for generalizations to higher-order tangent objects which would then be specified by quadruples like $(p, \psi, v, a)$, where $v$ and $a$ are the first-order and second-order parts respectively, and so forth for progressively higher-order tangent objects.

An argument in favour of putting the chart $\psi$ at the end of a tangent vector component triple is that generally one wants to focus on the pair $(p, v)$. Having the chart $\psi$ in the middle is a distraction. In practical applications, the chart is mostly fixed while the points and vectors vary. It is for this reason that the order in Definition 28.3.3 was chosen.
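The equivalence relation of Definition 28.3.3 can be made concrete with a small sketch. The code below is an illustration under assumptions that are not in the text: the manifold is the half-plane $X > 0$, the atlas has two named charts ("cart" and "polar"), triples are stored as (p, v, chart_name), and the Jacobian in (28.3.1) is approximated by central differences. The helper `change_chart` returns the representative of the class $t_{p,v,\psi}$ in another chart.

```python
import math

# Assumed atlas: chart name -> (chart, inverse chart).
CHARTS = {
    "cart": (lambda p: (p[0], p[1]),
             lambda x: (x[0], x[1])),
    "polar": (lambda p: (math.hypot(p[0], p[1]), math.atan2(p[1], p[0])),
              lambda x: (x[0] * math.cos(x[1]), x[0] * math.sin(x[1]))),
}

def transition_jacobian(src, dst, p, h=1e-6):
    """Approximate J[i][j] = d(psi_dst o psi_src^{-1})^i / dx^j at psi_src(p)."""
    psi_src, psi_src_inv = CHARTS[src]
    psi_dst, _ = CHARTS[dst]
    x = psi_src(p)
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        gp = psi_dst(psi_src_inv(xp))
        gm = psi_dst(psi_src_inv(xm))
        for i in range(n):
            J[i][j] = (gp[i] - gm[i]) / (2 * h)
    return J

def change_chart(triple, dst):
    """Return the equivalent triple (p, w, dst) according to rule (28.3.1)."""
    p, v, src = triple
    J = transition_jacobian(src, dst, p)
    n = len(v)
    w = tuple(sum(J[i][j] * v[j] for j in range(n)) for i in range(n))
    return (p, w, dst)

def equivalent(t1, t2, tol=1e-6):
    """Test whether two coordinate triples represent the same tangent vector."""
    if t1[0] != t2[0]:            # base points must coincide exactly here
        return False
    _, w, _ = change_chart(t1, t2[2])
    return all(abs(a - b) <= tol for a, b in zip(w, t2[1]))
```

For example, the Cartesian triple ((2, 0), (0, 1), "cart") and the polar triple ((2, 0), (0, 0.5), "polar") represent the same tangent vector, because a unit velocity in the $Y$ direction at radius 2 is an angular velocity of one half.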
28.3.8 Notation: $T_p(M)$ denotes the set of tangent vectors $t_{p,v,\psi}$ at a point $p$ in a $C^1$ manifold $M$. In other words, $T_p(M) = \{t_{p,v,\psi};\ v \in \mathbb{R}^n, \psi \in \mathrm{atlas}_p(M)\}$.

28.3.9 Notation: $T(M) = \bigcup_{p \in M} T_p(M)$ denotes the set of all tangent vectors on a $C^1$ manifold $M$. In other words, $T(M) = \{t_{p,v,\psi};\ p \in M, v \in \mathbb{R}^n, \psi \in \mathrm{atlas}_p(M)\}$.

28.3.10 Remark: Notations 28.3.8 and 28.3.9 demonstrate an annoying difficulty with the way sets are notated. It would have been fairly logical to write

$$T_p(M) = \{t_{p,v,\psi};\ v \in \mathbb{R}^n, \psi \in A_M, p \in \mathrm{Dom}(\psi)\}$$

and

$$T(M) = \{t_{p,v,\psi};\ p \in M, v \in \mathbb{R}^n, \psi \in A_M, p \in \mathrm{Dom}(\psi)\},$$

which is apparently the same as in Notations 28.3.8 and 28.3.9. The problem is that this would define $T_p(M)$ and $T(M)$ to be the same thing. In the above specification for $T_p(M)$, however, the predicate $p \in \mathrm{Dom}(\psi)$ is supposed to indicate a condition to be satisfied, whereas the predicate $p \in M$ for $T(M)$ indicates that $p$ is a free (or “dummy”) variable. What is really required is a way of indicating that $p$ is fixed on the right-hand side for $T_p(M)$, but is a free variable for $T(M)$.

[ This would be a good location for the definition of a tangent bundle. ]

28.3.11 Remark: A useful mnemonic for equations (28.3.1) and (28.4.1) is

$$v_\beta^i = \frac{\partial \psi_\beta^i}{\partial \psi_\alpha^j}(p)\, v_\alpha^j.$$

See Remark 28.5.16 for further comment on this.

28.4. Computational tangent vectors

28.4.1 Definition: A computational tangent coordinate triple for an $n$-dimensional $C^1$ manifold $(M, A_M)$ with indexed atlas $(\psi_\alpha)_{\alpha \in I}$ is a triple $(x, v, \alpha) \in \mathbb{R}^n \times \mathbb{R}^n \times I$ such that $x \in \mathrm{Range}(\psi_\alpha)$.

28.4.2 Definition: A computational tangent vector for a $C^1$ manifold $(M, A_M)$ with $n = \dim(M)$ and indexed atlas $(\psi_\alpha)_{\alpha \in I}$ is an equivalence class of coordinate triples in $\bigcup_{\alpha \in I} (\mathrm{Range}(\psi_\alpha) \times \mathbb{R}^n \times \{\alpha\})$, where the triples $(x_\alpha, v_\alpha, \alpha)$ and $(x_\beta, v_\beta, \beta)$ for $\alpha, \beta \in I$ are said to be equivalent whenever $\psi_\alpha^{-1}(x_\alpha) = \psi_\beta^{-1}(x_\beta)$ and

$$\forall i = 1 \ldots n, \qquad v_\beta^i = \sum_{j=1}^n \frac{\partial}{\partial x^j} \bigl(\psi_\beta \circ \psi_\alpha^{-1}(x)\bigr)^i \Big|_{x=x_\alpha} v_\alpha^j. \tag{28.4.1}$$

28.4.3 Remark: The set $\bigcup_{\psi \in A_M} (\mathrm{Dom}(\psi) \times \mathbb{R}^n \times \{\psi\})$ in Definition 28.3.3 is a subset of $M \times \mathbb{R}^n \times A_M$. The equivalence classes of component triples effectively constitute a “graft” of the product sets $\mathrm{Dom}(\psi_\alpha) \times \mathbb{R}^n$. (See Definition 6.10.5 for graft sets.)

The set $\bigcup_{\alpha \in I} (\mathrm{Range}(\psi_\alpha) \times \mathbb{R}^n \times \{\alpha\})$ in Definition 28.4.2 is a subset of $\mathbb{R}^n \times \mathbb{R}^n \times I$. The equivalence classes of component triples effectively constitute a “graft” of the product sets $\mathrm{Range}(\psi_\alpha) \times \mathbb{R}^n$.

Darling [13], page 135, section 6.6.2, gives a definition of tangent vectors which is almost identical to Definitions 28.3.3 and 28.4.2. Darling uses equivalence classes of the form $[(p, \alpha, v)]$ where $p \in M$, $\alpha \in I$ and $v \in \mathbb{R}^n$. For such a definition, the $\alpha \in I$ is convenient for computation as in Definition 28.4.2, but the $p \in M$ is an abstract point as in Definition 28.3.3. It is, in fact, probably more logical to make the chart index precede the vector components, since then the index stays in the same place if higher-order objects are represented in a similar way.

[ This remark is related to Remark 28.3.7. ]
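Definition 28.4.2 stores only numbers and a chart index, which is what software would typically hold. The following sketch shows one possible encoding, assuming (for illustration only) the unit circle with two stereographic charts indexed by "N" and "S"; the dictionary names and the tolerance are hypothetical. Here $n = 1$, so the Jacobian in (28.4.1) is a single derivative, approximated by a central difference.

```python
# Assumed indexed atlas on the unit circle: stereographic projection from
# the north pole (0, 1) and from the south pole (0, -1).
CHART = {
    "N": lambda P: P[0] / (1 - P[1]),
    "S": lambda P: P[0] / (1 + P[1]),
}
INV = {
    "N": lambda x: (2 * x / (x * x + 1), (x * x - 1) / (x * x + 1)),
    "S": lambda x: (2 * x / (x * x + 1), (1 - x * x) / (x * x + 1)),
}

def transition(alpha, beta, x):
    """psi_beta o psi_alpha^{-1}, valid only on the overlap of the charts."""
    return CHART[beta](INV[alpha](x))

def equivalent(t_a, t_b, h=1e-6, tol=1e-6):
    """Test equivalence of computational triples (x, v, index)."""
    (xa, va, a), (xb, vb, b) = t_a, t_b
    Pa, Pb = INV[a](xa), INV[b](xb)
    # First condition: both coordinate values name the same point of M.
    if max(abs(Pa[0] - Pb[0]), abs(Pa[1] - Pb[1])) > tol:
        return False
    # Second condition: rule (28.4.1) with a one-by-one Jacobian.
    d = (transition(a, b, xa + h) - transition(a, b, xa - h)) / (2 * h)
    return abs(d * va - vb) <= tol
```

With these charts the transition map is $x \mapsto 1/x$, so the triple (2, 1, "N") is equivalent to (0.5, -0.25, "S"): the derivative of $1/x$ at $x = 2$ is $-1/4$.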
28.5. Tangent operators

A tangent operator is an action of a tangent vector on (real-valued) differentiable functions on a manifold. There is an almost exact correspondence between a tangent vector defined in terms of components and the differential operator which is constructed from this. Many authors use tangent operators as their primary definition of tangent vectors, but as discussed in Section 28.1, tangent operators have both advantages and disadvantages as a primary definition. The disadvantages seem to outweigh the advantages. Therefore Definition 28.3.3 is adopted as the basic definition and Definition 28.5.1 is a secondary definition.

28.5.1 Definition: A tangent operator on a $C^1$ manifold $M$ is a function $\partial_{p,v,\psi} : C^1(M) \to \mathbb{R}$ which is defined for $p \in M$, $v \in \mathbb{R}^n$ and $\psi \in \mathrm{atlas}_p(M)$ by

$$\forall f \in C^1(M), \qquad \partial_{p,v,\psi}(f) = \sum_{i=1}^n v^i \frac{\partial}{\partial x^i} (f \circ \psi^{-1})(x) \Big|_{x=\psi(p)}.$$
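The $n$-tuple $v$ is called the coefficient vector or coordinate vector of the tangent operator $\partial_{p,v,\psi}$ with respect to the chart $\psi$. The triple $(p, v, \psi)$ is called the coordinate triple for the tangent operator $\partial_{p,v,\psi}$. The point $p$ is called the base point of the tangent operator $\partial_{p,v,\psi}$.

28.5.2 Remark: Definition 28.5.1 is illustrated in Figure 28.5.1. The symbol $\partial_i$ is shorthand for $\partial/\partial x^i$.

Figure 28.5.1   Tangent operator definition: $\partial_{p,v,\psi}(f) = \sum_{i=1}^n v^i \partial_i (f \circ \psi^{-1})(\psi(p))$. (Diagram labels: a point $p \in U = \mathrm{Dom}(\psi) \subseteq M$; the chart $\psi$ onto $\psi(U) \subseteq \mathbb{R}^n$; the components $v^i$ at $\psi(p)$; the functions $f : M \to \mathbb{R}$ and $f \circ \psi^{-1}$; level curves of $f \circ \psi^{-1}$; the derivatives $\partial_i(f \circ \psi^{-1})$.)

28.5.3 Notation: $\mathring{T}_p(M)$ denotes the set of tangent operators with base point $p \in M$ in a $C^1$ manifold $M$.

28.5.4 Notation: $\mathring{T}(M)$ denotes the set $\bigcup_{p \in M} \mathring{T}_p(M)$ of all tangent operators on a $C^1$ manifold $M$.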
28.5.5 Remark: One of the difficulties with the action space $C^1(M)$ for tangent operators is that very often one wishes to apply tangent operators to functions which are not globally $C^1$ or not defined on the whole manifold $M$. A tangent operator $\partial_{p,v,\psi}$ is well-defined for any function $f \in \mathring{C}^1_p(M)$. (See Notation 27.5.6.) In fact, the function $f$ only needs to be differentiable at the point $p$. Even if $f$ is not differentiable at $p$, it could be differentiated as a generalized function or distribution. Therefore it is usually “understood” that tangent operators are actually a formal or symbolic operation or procedure, not a function with a predetermined domain and range. So the expression $\partial_{p,v,\psi}$ should be thought of as the abstract operator

$$f \mapsto \lim_{a \to 0} a^{-1} \bigl( f(\psi^{-1}(\psi(p) + av)) - f(p) \bigr),$$

where $f$ is any sort of thing at all. This is generally accepted in practice.

In this way of thinking, a tangent operator isn’t a function. It’s a procedure. The procedure requires test functions and a differentiable structure. The main problem with the “differentiation procedure” view of tangent operators is the hidden dependency of the limit procedure on a differentiable structure.

The choice of $C^1(M)$ as the action space for tangent operators is a reasonable compromise. It is always well-defined on $C^1$ manifolds. And in practice the choice matters little, because it is only the symbolic form of the operator that matters.
28.5.6 Remark: One of the most unsatisfying aspects of tangent operators as a primary definition for tangent vectors is the difficulty of differentiating such operators. The ability to differentiate vector fields is essential for most applications. But to differentiate an operator-valued function requires some sort of differentiable structure on the set of tangent operators. How does one coordinatize the set of all tangent operators on a manifold? The obvious way to do this is to use the coefficient $n$-tuples $v$ of operators $\partial_{p,v,\psi}$. In other words, one arrives back at the coordinate representation of tangent vectors.

Differentiating in the general space of operators on spaces like $C^1(M)$ leads down complicated, confusing pathways from which it is difficult to return. If most common operations on tangent vectors require a reversion to coordinates anyway, one might as well use coordinates as the primary definition. That is precisely the thinking behind the primary definition choice in this book.

28.5.7 Notation: $D_V$ for any tangent vector $V = t_{p,v,\psi} \in T(M)$ for a $C^1$ manifold $M$ denotes the corresponding tangent operator. In other words, $D_V = \partial_{p,v,\psi} \in \mathring{T}(M)$.

28.5.8 Remark: The $D$-notation for tangent operators in Notation 28.5.7 is quite standard, but it can be ambiguous. In Notation 28.5.7, $D$ is presented as a function $D : T(M) \to \mathring{T}(M)$. But an operator such as $D_V$ may be applied to the $C^\infty(M)$ functions or the $C^1(M)$ functions, or it may be applied to Schwartz distributions or tempered distributions on the manifold. $D_V$ may also apply to vector or tensor fields rather than real-valued functions, or it may apply to differential forms of various kinds. (In these cases, $D_V$ signifies a covariant derivative with respect to an affine connection.) The subscript $V$ may refer to a tangent vector at just one point $p \in M$, or it may refer to a vector field, in which case $D_V$ would yield a function on a whole region of definition of its argument.

It follows from this remark that one must be very careful to determine which kind of derivative is indicated by the notation $D_V$ in each context and what the domain and range of the operator are. Mostly, when the domain of definition of the operator is known, the appropriate definition may be determined.

[ The $D_V$ notation is used as here in Frankel [18], section 1.3b, page 24. ]

28.5.9 Example: The linear space $\mathbb{R}^n$ is easily given a $C^\infty$ manifold structure by defining an atlas with a single chart, namely the identity map on $\mathbb{R}^n$. (See Definition 27.3.1.) Tangent vectors for the $C^\infty$ manifold $M = \mathbb{R}^n$ with this standard differentiable structure are of the form $t_{p,v,\psi_0}$, where $p, v \in \mathbb{R}^n$ and $\psi_0 : \mathbb{R}^n \to \mathbb{R}^n$ is the identity map. The equivalence class $t_{p,v,\psi_0} = [(p, v, \psi_0)]$ is the singleton set $\{(p, v, \psi_0)\}$. The set of tangent vectors $T(M) = \{t_{p,v,\psi};\ p \in M, v \in \mathbb{R}^n, \psi \in \mathrm{atlas}_p(M)\}$ in Notation 28.3.9 is $T(\mathbb{R}^n) = \{\{(p, v, \psi_0)\};\ p, v \in \mathbb{R}^n\}$.

The set $\mathring{T}(M)$ of tangent operators for this manifold is the set $\mathring{T}(\mathbb{R}^n) = \{\sum_{i=1}^n v^i \partial_i^p;\ p, v \in \mathbb{R}^n\}$, where $\sum_{i=1}^n v^i \partial_i^p$ is shorthand for the map $f \mapsto \sum_{i=1}^n v^i (\partial/\partial x^i) f(x) \big|_{x=p}$ for $f \in C^1(\mathbb{R}^n)$.

The tangent bundle total space $T(\mathbb{R}^n)$ for the differentiable manifold $\mathbb{R}^n$ is not the same thing as the tangent space $T(\mathbb{R}^n)$ of the linear space $\mathbb{R}^n$. However, it is mostly harmless to think of these two structures as being the same thing.

28.5.10 Remark: Theorem 28.5.11 means that Definitions 28.3.3 and 28.5.1 are consistent with each other. If the tangent vectors are equal, then the operators are equal. In other words, the choice of representative of the equivalence class of coordinate triples is immaterial. So the $D$-operation in Notation 28.5.7 commutes with changes of chart. Theorem 30.1.3 is similar to Theorem 28.5.11.

28.5.11 Theorem: If the tangent vectors $t_{p_1,v_1,\psi_1}$ and $t_{p_2,v_2,\psi_2}$ of a $C^1$ manifold are equal then the tangent operators $\partial_{p_1,v_1,\psi_1}$ and $\partial_{p_2,v_2,\psi_2}$ are equal.

Proof: Suppose $t_{p_1,v_1,\psi_1} = t_{p_2,v_2,\psi_2}$. By Definition 28.3.3, $p_1 = p_2$ and $v_2^j = \sum_{i=1}^n v_1^i \partial_i (\psi_2^j \circ \psi_1^{-1})(\psi_1(p_1))$ for all $j = 1, \ldots n$. So by Definition 28.5.1 and the chain rule applied to $f \circ \psi_1^{-1} = (f \circ \psi_2^{-1}) \circ (\psi_2 \circ \psi_1^{-1})$, for all $f \in C^1(M)$,

$$\begin{aligned}
\partial_{p_1,v_1,\psi_1}(f)
&= \sum_{i=1}^n v_1^i \frac{\partial}{\partial x^i} (f \circ \psi_1^{-1})(x) \Big|_{x=\psi_1(p_1)} \\
&= \sum_{i,j=1}^n v_1^i \frac{\partial}{\partial \tilde{x}^j} (f \circ \psi_2^{-1})(\tilde{x}) \Big|_{\tilde{x}=\psi_2(p_2)} \frac{\partial}{\partial x^i} (\psi_2^j \circ \psi_1^{-1})(x) \Big|_{x=\psi_1(p_1)} \\
&= \sum_{j=1}^n v_2^j \frac{\partial}{\partial \tilde{x}^j} (f \circ \psi_2^{-1})(\tilde{x}) \Big|_{\tilde{x}=\psi_2(p_2)} \\
&= \partial_{p_2,v_2,\psi_2}(f),
\end{aligned}$$

where $v_2^j = \sum_{i=1}^n v_1^i \partial_i (\psi_2^j \circ \psi_1^{-1})$ for $j = 1, \ldots n$.
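Definition 28.5.1 and Theorem 28.5.11 can be illustrated numerically. The sketch below is not the book's construction: it assumes the half-plane $X > 0$ with a Cartesian and a polar chart, approximates the partial derivatives by central differences, and uses invented function names. It builds $\partial_{p,v,\psi}$ as a Python closure and lets one compare the operators obtained from two equivalent coordinate triples.

```python
import math

def cart(p):
    return (p[0], p[1])

def cart_inv(x):
    return (x[0], x[1])

def polar(p):
    return (math.hypot(p[0], p[1]), math.atan2(p[1], p[0]))

def polar_inv(x):
    return (x[0] * math.cos(x[1]), x[0] * math.sin(x[1]))

def tangent_operator(p, v, chart, chart_inv, h=1e-6):
    """Return f -> sum_i v^i d/dx^i (f o chart^{-1})(x) at x = chart(p)."""
    x = chart(p)

    def apply(f):
        total = 0.0
        for i, vi in enumerate(v):
            xp, xm = list(x), list(x)
            xp[i] += h
            xm[i] -= h
            # Central difference for the i-th partial derivative.
            total += vi * (f(chart_inv(xp)) - f(chart_inv(xm))) / (2 * h)
        return total

    return apply

def f(P):
    """A test function on M, here f(X, Y) = X^2 Y + Y."""
    return P[0] * P[0] * P[1] + P[1]

p = (2.0, 0.0)
D_cart = tangent_operator(p, (0.0, 1.0), cart, cart_inv)     # v in chart 1
D_polar = tangent_operator(p, (0.0, 0.5), polar, polar_inv)  # same vector
```

The two coefficient vectors are related by rule (28.3.1) at $p = (2, 0)$, so `D_cart(f)` and `D_polar(f)` should agree up to discretization error. For this $f$ both equal $\partial f/\partial Y = X^2 + 1 = 5$ at $p$.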
28.5.12 Remark: Theorem 28.5.11 has a partial converse. The converse is true if at least one of the tangent operators is non-zero. The proof of this converse (Theorem 28.5.13) requires the construction of an appropriate pair (or set?) of separating test functions.

28.5.13 Theorem: If tangent operators $\partial_{p_1,v_1,\psi_1}$ and $\partial_{p_2,v_2,\psi_2}$ on a $C^1$ manifold are equal and non-zero, then $t_{p_1,v_1,\psi_1} = t_{p_2,v_2,\psi_2}$.

28.5.14 Theorem: Let $M$ be an $n$-dimensional $C^1$ manifold with indexed atlas $(\psi_\alpha)_{\alpha \in I}$. Let $p \in M$, $\alpha, \beta \in I$, and $v_\alpha, v_\beta \in \mathbb{R}^n$ be such that $\partial_{p,v_\alpha,\psi_\alpha} = \partial_{p,v_\beta,\psi_\beta}$. Then $v_\beta = Z_{\beta\alpha}(p) v_\alpha$, where the matrix $Z_{\beta\alpha}(p) \in GL(n)$ for $p \in \mathrm{Dom}(\psi_\alpha) \cap \mathrm{Dom}(\psi_\beta)$ is as in Definition 27.4.7.

28.5.15 Remark: The matrix $Z_{\beta\alpha}(p)$ in Theorem 28.5.14 may be thought of as the contravariant transformation rule from $\alpha$-coordinates to $\beta$-coordinates. The right-to-left order is due to the use of matrix multiplication of column vectors from the left, which is the usual order in finite linear space theory. (The reverse convention is followed in Markov probability theory.) Thus $Z_{\gamma\alpha}(p) = Z_{\gamma\beta}(p) Z_{\beta\alpha}(p)$ for all $\alpha, \beta, \gamma \in I$ such that $p \in \mathrm{Dom}(\psi_\alpha) \cap \mathrm{Dom}(\psi_\beta) \cap \mathrm{Dom}(\psi_\gamma)$.

28.5.16 Remark: The tangent operator $\partial_{p,v,\psi}$ in Definition 28.5.1 may be written more colloquially as

$$\partial_{p,v,\psi} = \sum_{i=1}^n v^i \frac{\partial}{\partial x^i} \Big|_{x=\psi(p)} \qquad\text{or}\qquad \sum_{i=1}^n v^i \frac{\partial}{\partial \psi^i}(p) \qquad\text{or}\qquad v^i \partial_i^{p,\psi}.$$

These useful mnemonic notations are similar to the notations in Remark 27.4.9. When $p$ and $\psi$ are clear from the context, the simpler notation $v^i \partial_i$ may be used.

28.5.17 Remark: Figure 28.5.2 shows the relations between tangent operators, real-valued functions, and points in a manifold. In this diagram, $f$ appears twice, first as a function from points $p \in M$ to $\mathbb{R}$, and then as a point $f \in C^1(M)$ being mapped to $\mathbb{R}$ by the operator $\partial_{p,v,\psi}$. In one context, $f$ is a function, whereas in the other it is a ‘point’ in the domain of the operator $\partial_{p,v,\psi}$.

Figure 28.5.2   Tangent operator map $\partial_{p,v,\psi} : C^1(M) \to \mathbb{R}$. (Diagram labels: $p \in M$ mapped to $\mathbb{R}$ by $f$; $f \in C^1(M)$ mapped to $\mathbb{R}$ by $\partial_{p,v,\psi}$.)

28.5.18 Remark: Theorem 27.5.9 is rewritten in the language of tangent operators in Theorem 28.5.19. The expression $(d/dt)(u \circ \gamma(t)) \big|_{t=x} = 0$ in Theorem 27.6.3 for a $C^1$ curve $\gamma : \mathbb{R} \to M$, where $\gamma(x) = p$, equates to the expression $D_{\gamma'(x)} u = 0$, where $\gamma'(x)$ is as in Definition 31.4.1.

28.5.19 Theorem: If $p \in M$ is a local maximum of a function $u \in C^1(M)$, where $M$ is a $C^1$ manifold, then $D_V(u) = 0$ for all $V \in T_p(M)$.
28.6. Tagged tangent operators

28.6.1 Remark: A “tagged” tangent operator on a manifold $M$ is a pair of the form $(p, \partial_{p,v,\psi})$, where $\partial_{p,v,\psi}$ is a tangent operator as in Definition 28.5.1. The reason for tagging is to resolve the ambiguity of operators $\partial_{p,0,\psi}$ with zero coefficients. (See Remark 28.5.12.) It is not possible to construct a tangent bundle from tangent operators without such tagging. Although non-zero tangent operators can be disambiguated in principle, in practice it is very burdensome to determine the base point $p$ and coefficients $v$ (for a given chart $\psi$) from the action of an operator on test functions. So even without the zero-vector ambiguity, it would be a good idea to use tagging. (Malliavin [35], section I.7.1, page 64, also comments on the topic of ambiguity of untagged tangent operators.)

28.6.2 Definition: A tagged tangent operator on a $C^1$ manifold $(M, A_M)$ is a pair $(p, \partial_{p,v,\psi})$ such that $p \in M$ and $\partial_{p,v,\psi} : C^1(M) \to \mathbb{R}$ is a tangent operator at $p$. The vector $v$ is called the coefficient vector, coordinate vector, coefficients or coordinates of the tagged tangent operator $(p, \partial_{p,v,\psi})$ with respect to the chart $\psi$.

28.6.3 Notation: $\hat{T}_p(M)$ for a manifold $M$ and point $p \in M$ denotes the set of tagged tangent operators $(p, \partial_{p,v,\psi})$ at $p$.

28.6.4 Notation: $\hat{T}(M)$ for a manifold $M$ denotes the set $\bigcup_{p \in M} \hat{T}_p(M)$ of all tagged tangent operators on $M$.

[ Near here, define the standard atlas on $\hat{T}(M)$. ]

28.6.5 Notation: $\hat{D}_V$ for a tangent vector $V = t_{p,v,\psi} \in T(M)$ on a $C^1$ manifold $M$ denotes the tagged tangent operator $(p, D_V) = (p, \partial_{p,v,\psi}) \in \hat{T}(M)$.

28.6.6 Remark: A tangent bundle based on the tagged tangent operator in Definition 28.6.2 satisfies Metadefinition 28.2.1, but it is certainly not an economical tangent vector definition. The space $C^1(M)$ is a very large space of functions. If $p$ is specified, it is actually sufficient to consider only the functions $f^k \in C^1(M)$ defined by $f^k(q) = \psi(q)^k$ for $q \in \mathrm{Dom}(\psi)$ and $k = 1 \ldots n$ (with appropriate smoothing at the boundary of $\mathrm{Dom}(\psi)$ to ensure extendability to all of $M$). Then $\partial_{p,v,\psi}(f^k) = v^k$ for all $v \in \mathbb{R}^n$ and $k = 1 \ldots n$. In other words, the action of $\partial_{p,v,\psi}$ on a well-chosen set of $n$ functions contains the same information as the action on the entire space $C^1(M)$. But of course, the main objective of this representation is usefulness in theoretical applications, not economy for data structures in computer software.

It is possible to recover the base point $p$ and the component vector $v \in \mathbb{R}^n$ from a tangent operator $\partial_{p,v,\psi}$ if the point $p$ is not known, but it is more difficult. If the operator is non-zero, it is sufficient to apply the operator to the functions $f_k$ and $f_{k\ell}$ defined by $f_k : q \mapsto \psi(q)^k$ and $f_{k\ell} : q \mapsto \psi(q)^k \psi(q)^\ell$ for $q \in \mathrm{Dom}(\psi)$. Let $\alpha_k = \partial_{p,v,\psi}(f_k)$ and $\beta_{k\ell} = \partial_{p,v,\psi}(f_{k\ell})$, and let $x = \psi(p)$. Then $\alpha_k = v^k$ and, by the product rule, $\beta_{k\ell} = v^k x^\ell + x^k v^\ell$. So clearly $v^i = \alpha_i$ for all $i = 1 \ldots n$, and $x^i = \beta_{ii} / 2\alpha_i$ for all $i$ such that $\alpha_i \neq 0$. Choose $j$ such that $\alpha_j \neq 0$. Then $x^i = \beta_{ij} / \alpha_j$ for all $i$ such that $\alpha_i = 0$. Thus both the point $p = \psi^{-1}(x)$ and $v \in \mathbb{R}^n$ are determined from the action of $\partial_{p,v,\psi}$ on at most $n + n^2$ test functions.

28.6.7 Remark: The set of tagged tangent operators for the manifold $\mathbb{R}^n$ with the standard differentiable structure in Example 28.5.9 is $\hat{T}(\mathbb{R}^n) = \{(p, \sum_{i=1}^n v^i \partial_i^p);\ p, v \in \mathbb{R}^n\}$.

28.6.8 Remark: Definition 28.5.1 was the first definition to be written in this book. The whole book started with defining tangent vectors and then moving forwards and backwards from there. There is a good reason for this. The concept of an intrinsic tangent vector is the fundamental leap from a point space into another kind of space. In a linear space, there is little difference between a point and a vector.

28.7. Pointwise tangent spaces

This section presents the linear space structure which is induced on the set of tangent vectors at a fixed point of a differentiable manifold by a tangent bundle atlas. Definition 28.7.1 formally defines the tangent space at a point of a manifold as a linear space in the sense of Definition 10.1.2.
28.7.1 Definition: The tangent space at a point $p$ in a $C^1$ manifold $(M, A_M)$ is the linear space tuple $T_p(M) \mathrel{-\!\!<} (\mathbb{R}, T_p(M), \sigma_{\mathbb{R}}, \tau_{\mathbb{R}}, \sigma_{T_p(M)}, \mu)$, where

(i) $T_p(M) = \{t_{p,v,\psi};\ v \in \mathbb{R}^n, \psi \in \mathrm{atlas}_p(M)\}$ as in Notation 28.3.8;
(ii) $\sigma_{\mathbb{R}}$ and $\tau_{\mathbb{R}}$ are the addition and multiplication operations of $\mathbb{R}$;
(iii) $\sigma_{T_p(M)} : T_p(M) \times T_p(M) \to T_p(M)$ is the vector addition operation on $T_p(M)$ defined by $(t_{p,v_1,\psi}, t_{p,v_2,\psi}) \mapsto t_{p,v_1+v_2,\psi}$;
(iv) $\mu : \mathbb{R} \times T_p(M) \to T_p(M)$ is the scalar multiplication operation defined by $(\lambda, t_{p,v,\psi}) \mapsto t_{p,\lambda v,\psi}$.

28.7.2 Remark: The vector addition and scalar multiplication in Definition 28.7.1 are independent of equivalence class representatives because the equivalence relation (28.3.1) in Definition 28.3.3 is linear.

28.7.3 Remark: The linear space $T_p(M)$ in Definition 28.7.1 can be its own tangent space. Instead of constructing a separate tangent space for $T_p(M)$, the limit of an expression such as $\gamma'(0) = \lim_{t \to 0} t^{-1} (\gamma(t) - \gamma(0))$ for maps $\gamma : \mathbb{R} \to T_p(M)$ may be regarded as an element of $T_p(M)$ rather than some abstract tangent space of the tangent space $T_p(M)$. The map which sends an abstract tangent vector $\gamma'(0)$ to the corresponding concrete element of $T_p(M)$ will be called a “drop function”. It turns out that this is important in defining Lie derivatives and covariant derivatives, especially because most textbooks “drop” abstract tangents to concrete tangents without comment.

28.7.4 Definition: A coordinate basis vector at a point $p \in M$ for a chart $\psi \in A_M$ in a $C^1$ manifold $(M, A_M)$ with $n = \dim(M)$ is a tangent vector $e_i^{p,\psi} = t_{p,v_i,\psi} \in T_p(M)$, where $v_i \in \mathbb{R}^n$ is defined for $i = 1 \ldots n$ by $v_i^j = \delta_i^j$.
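Remark 28.7.2 can be checked on a concrete example. The sketch below assumes, purely for illustration, the half-plane $X > 0$ with Cartesian and polar charts, and uses the closed-form Jacobian of the polar chart; the function names are hypothetical. It confirms that forming a linear combination in one chart and then changing chart gives the same components as changing chart first, which is what makes the operations in Definition 28.7.1 well defined.

```python
import math

def cart_to_polar_components(p, w):
    """Apply rule (28.3.1) from the Cartesian chart to the polar chart at p."""
    X, Y = p
    r2 = X * X + Y * Y
    return ((X * w[0] + Y * w[1]) / math.sqrt(r2),
            (X * w[1] - Y * w[0]) / r2)

def add(v1, v2):
    return tuple(a + b for a, b in zip(v1, v2))

def scale(lam, v):
    return tuple(lam * a for a in v)

def linear_combination_agrees(p, lam, v1, v2, tol=1e-12):
    """Compare (combine, then change chart) with (change chart, then combine)."""
    combined_first = cart_to_polar_components(p, add(scale(lam, v1), v2))
    converted_first = add(scale(lam, cart_to_polar_components(p, v1)),
                          cart_to_polar_components(p, v2))
    return all(abs(a - b) <= tol
               for a, b in zip(combined_first, converted_first))

# Polar components of the Cartesian coordinate basis vectors at p = (2, 0):
# these are the columns of the transition Jacobian at that point.
e1_polar = cart_to_polar_components((2.0, 0.0), (1.0, 0.0))
e2_polar = cart_to_polar_components((2.0, 0.0), (0.0, 1.0))
```

The last two lines relate to Definition 28.7.4: a coordinate basis vector has Kronecker-delta components only in its own chart, and in any other chart its components form a column of the transition matrix.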
˚p (M ) of 28.7.6 Definition: The tangent operator space at a point p in a C 1 manifold M is the set T tangent operators at p ∈ M together with the operations of addition and multiplication by real numbers. ˚p (M ) is defined by Thus the linear combination λ1 L1 + λ2 L2 : C 1 (M ) → IR of tangent operators L1 , L2 ∈ T ∀f ∈ C 1 (M ),
(λ1 L1 + λ2 L2 )(f ) = λ1 L1 (f ) + λ2 L2 (f ).
˚p (M ) is a linear subspace of the global function space 28.7.7 Remark: Each pointwise tangent space T ˚(M ) = S ˚ Lin(C 1 (M ), IR) of linear functionals on C 1 (M ). The union T p∈M Tp (M ) of these subspaces is not 1 a linear subspace of Lin(C (M ), IR) because the sum of non-zero tangent vectors at two different points of a manifold will not be a tangent vector at any point of the manifold. ˚p (M ) is restricted to the space C 1 (M ) 28.7.8 Remark: Although the action of the operators in spaces T ˚p (M ) so that operators at all points can act on the same space, it is tacitly assumed that every operator in T 1 1 ˚ is extended to the space Cp (M ) of C functions which are defined in a neighbourhood of p. In fact, the tangent operators are assumed to act on any reasonable kind of function on M or a subset of M , whether the function is classically differentiable or not. [ This remark looks similar to Remark 28.5.5 and some other remarks. ] 28.7.9 Definition: The tagged tangent operator space at a point p in a C 1 manifold M is the set Tˆp (M ) of ˚p (M ), together with the operations of pointwise addition and multiplication by all pairs (p, L) such that L ∈ T real numbers on the operator component. Thus the linear combination λ1 (p, L1 ) + λ2 (p, L2 ) of two tangent vectors (p, L1 ), (p, L2 ) ∈ Tˆp (M ) is defined by λ1 (p, L1 ) + λ2 (p, L2 ) = (p, λ1 L1 + λ2 L2 ). 28.7.10 Remark: The chart basis operators in Definition 28.7.11 provide a natural basis for the linear space of tangent operators at a fixed point of a differentiable manifold. [ www.topology.org/tex/conc/dg.html ]
[ draft: UTC 2009–3–21 Saturday 11:36 ]
Differential geometry reconstructed.Copyright (C) 2009, Alan U. Kennington. All Rights Reserved.The author hereby grants permission to print this book draft in A4 format.Printing in all other formats is forbidden.You may not charge any fee for copies of this book draft.
28.7.5 Remark: The coordinate basis vectors in Definition 28.7.4 are a basis for the linear space of tangent vectors at a fixed point of a differentiable manifold. The symbol δij is the Kronecker delta presented in Pn Definition 7.9.10. Thus tp,v,ψ = i=1 v i ep,ψ i .
28.8. Tangent bundles
593
28.7.11 Definition: A tangent operator basis vector at a point $p \in M$ with respect to a chart $\psi \in A_M$ of a $C^1$ manifold $(M, A_M)$ is a tangent operator $\partial_i^{p,\psi} \in \mathring{T}_p(M)$ defined for $i = 1 \ldots n = \dim(M)$ by
$$\forall f \in C^1(M), \qquad \partial_i^{p,\psi}(f) = \frac{\partial (f \circ \psi^{-1}(x))}{\partial x^i}\bigg|_{x=\psi(p)}.$$
A tagged tangent operator basis vector is a tagged tangent operator $\hat\partial_i^{p,\psi} = (p, \partial_i^{p,\psi})$ such that $\partial_i^{p,\psi}$ is a tangent operator basis vector.

28.7.12 Remark: The equation $\partial_{p,v,\psi} = \sum_{i=1}^n v^i \partial_i^{p,\psi}$ is satisfied for all tangent operators $\partial_{p,v,\psi} \in \mathring{T}(M)$. The right-hand side of this equation is a linear combination in the linear space structure of $\mathring{T}_p(M)$ for each $p \in M$.

28.7.13 Theorem: Let $\psi_\alpha, \psi_\beta \in \mathrm{atlas}(M)$ be two charts for a $C^1$ manifold $M$ with $n = \dim(M)$. Then the tangent operator basis vectors $(\partial_i^{p,\psi_\alpha})_{i=1}^n$ and $(\partial_i^{p,\psi_\beta})_{i=1}^n$ at any point $p \in \mathrm{Dom}(\psi_\alpha) \cap \mathrm{Dom}(\psi_\beta)$ satisfy
$$\forall i = 1 \ldots n, \qquad \partial_i^{p,\psi_\beta} = \sum_{j=1}^n \partial_j^{p,\psi_\alpha}\, Z_{\alpha\beta}(p)^j{}_i. \tag{28.7.1}$$
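[ The Jacobian matrices $Z_{\alpha\beta}$ are defined where? ]

28.7.14 Remark: A useful mnemonic for $\partial_i^{p,\psi}$ is $\partial/\partial\psi^i(p)$. (See Remark 28.5.16.) Then equation (28.7.1) may be expressed as
$$\begin{aligned}
\forall i = 1 \ldots n, \qquad \partial_i^{p,\psi_\beta} = \frac{\partial}{\partial\psi_\beta^i}(p)
&= \sum_{j=1}^n \frac{\partial}{\partial\psi_\alpha^j}(p)\, Z_{\alpha\beta}(p)^j{}_i \\
&= \sum_{j=1}^n \frac{\partial}{\partial\psi_\alpha^j}(p)\, \frac{\partial\psi_\alpha^j}{\partial\psi_\beta^i}(p) \\
&= \sum_{j=1}^n \partial_j^{p,\psi_\alpha}\, \frac{\partial\psi_\alpha^j}{\partial\psi_\beta^i}(p)
 = \sum_{j=1}^n \partial_j^{p,\psi_\alpha}\, Z_{\alpha\beta}(p)^j{}_i.
\end{aligned}$$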
A useful shorthand for the symbol $\partial_i^{p,\psi}$ is $\partial_i$ when the point $p$ and chart $\psi$ are implied. (See Remark 19.2.15 for the corresponding flat-space abbreviations for unit vectors.)

28.7.15 Remark: In Theorem 28.7.13, the unit basis vectors are multiplied by the matrix $Z_{\alpha\beta}(p)$ on the right whereas in Theorem 28.5.14, the components are multiplied by the inverse matrix $Z_{\beta\alpha}(p)$ on the left. There is no contradiction here. In Theorem 28.5.14, the objects being transformed are contravariant components for tangent vectors, whereas in Theorem 28.7.13, the objects are sequences of tangent vectors, which are essentially covariant in nature. This ensures that the linear combination $\sum_{i=1}^n \partial_i^{p,\psi_\alpha} v_\alpha^i$ is independent of the chart $\psi_\alpha$. Whenever a coordinate change occurs, the components $v_\alpha^i$ of a tangent operator $\partial_{p,v,\psi}$ (or vector $t_{p,v,\psi}$) are transformed according to a linear transformation with matrix $Z_{\beta\alpha}(p) \in GL(n)$ on the left, which is just what is required for the definition of a fibre bundle in Definition 23.6.4, condition (iv). [ Why isn’t the symbol for Jacobian matrices a letter J? ]
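The interplay between Theorem 28.7.13 and the component rule quoted in Remark 28.7.15 can be checked numerically. The following is only a sketch under stated assumptions: $M$ is taken to be a region of the plane, $\psi_\alpha$ is the Cartesian (identity) chart, $\psi_\beta$ is the polar chart, the test function `f` and the sample values are hypothetical, and the basis operators are approximated by central differences.

```python
import math

def partials(g, x, h=1e-6):
    """Central-difference partial derivatives of g at the coordinate tuple x."""
    out = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        out.append((g(xp) - g(xm)) / (2 * h))
    return out

def f(point):
    """A hypothetical test function on M, written in Cartesian coordinates."""
    x, y = point
    return x * x * y + math.sin(y)

def polar_inverse(c):
    """Inverse of the polar chart psi_beta: (r, theta) -> point of M."""
    r, th = c
    return [r * math.cos(th), r * math.sin(th)]

r, th = 2.0, 0.7
p = polar_inverse([r, th])

# Basis operators applied to f: d_alpha[j] ~ partial_j^{p,psi_alpha}(f), and
# d_beta[i] ~ partial_i^{p,psi_beta}(f) = d(f o psi_beta^{-1})/dx^i at psi_beta(p).
d_alpha = partials(f, p)
d_beta = partials(lambda c: f(polar_inverse(c)), [r, th])

# Z[j][i] = d(psi_alpha^j)/d(psi_beta^i) at p, i.e. the matrix Z_{alpha beta}(p).
Z = [[math.cos(th), -r * math.sin(th)],
     [math.sin(th),  r * math.cos(th)]]
# Its inverse, playing the role of Z_{beta alpha}(p).
Z_inv = [[ math.cos(th),      math.sin(th)],
         [-math.sin(th) / r,  math.cos(th) / r]]

# Right-hand side of (28.7.1): sum_j partial_j^{alpha}(f) Z^j_i.
rhs = [sum(d_alpha[j] * Z[j][i] for j in range(2)) for i in range(2)]

# Components transform with the inverse matrix on the left.
v_alpha = [0.3, -1.1]
v_beta = [sum(Z_inv[i][j] * v_alpha[j] for j in range(2)) for i in range(2)]

# The operator sum_i partial_i v^i applied to f, computed in each chart.
op_alpha = sum(d_alpha[i] * v_alpha[i] for i in range(2))
op_beta = sum(d_beta[i] * v_beta[i] for i in range(2))
```

Under these assumptions `d_beta` agrees with `rhs` up to finite-difference error, and `op_alpha` agrees with `op_beta`, which is the chart-independence of the linear combination described in the remark.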
28.8. Tangent bundles

This section presents a natural atlas for the set of tangent vectors at all points of a differentiable manifold. This atlas induces a topology on the set of tangent vectors. If the manifold is $C^2$, the atlas induces a differentiable manifold structure on the bundle of tangent vectors. Whereas pointwise tangent spaces $T_p(M)$ are linear spaces, the tangent bundle total space $T(M) = \bigcup_{p\in M} T_p(M)$ is not a linear space although it does have a natural topological structure.
Differentiable fibre bundles are defined in terms of differentiable (Lie) groups, which are defined in terms of differentiable manifolds. Therefore tangent bundles cannot logically be defined as differentiable fibre bundles until these two other topics have been presented. The full differentiable tangent bundle definition is in Section 35.8.

The tangent bundle representation adopted for Definition 28.8.1 uses coordinates and components. The representation in Definition 28.9.2 uses tagged differential operators $(p, \partial_{p,v,\psi})$.

28.8.1 Definition: The tangent bundle of an $n$-dimensional $C^1$ manifold $(M, A_M)$ is the tuple $(T(M), \pi, \hat{A}_{T(M)}, \Phi) \mathrel{-\!\!>} (T(M), \hat{A}_{T(M)}) \mathrel{-\!\!>} T(M)$ where:
(i) $T(M) = \bigcup_{p\in M} T_p(M) = \{t_{p,v,\psi};\ p \in M,\ v \in \mathbb{R}^n,\ \psi \in \mathrm{atlas}_p(M)\}$.
(ii) $\pi : T(M) \to M$ is defined by $\pi : t_{p,v,\psi} \mapsto p$.
(iii) $\Phi$ is the function with domain $A_M$ which is defined so that for all $\psi \in A_M$, $\Phi(\psi)$ is the map $\Phi(\psi) : \pi^{-1}(\mathrm{Dom}(\psi)) \to \mathbb{R}^n$ defined by $\Phi(\psi) : t_{p,v,\psi} \mapsto v$.
(iv) $\hat{A}_{T(M)} = \{\Phi(\psi);\ \psi \in A_M\}$.

28.8.2 Theorem: Definition 28.8.1 satisfies Metadefinition 28.2.1.

28.8.3 Remark: Some of the maps and spaces in Definition 28.8.1 are illustrated in Figure 28.8.1. The tangent bundle atlas $\hat{A}_{T(M)}$ (combined with the maps $\psi \circ \pi$ for $\psi \in A_M$) induces a topological and differentiable structure on the total tangent vector set $T(M)$. This agrees with the atlas definition in EDM2 [34], section 147.F.

Figure 28.8.1   Spaces and maps for a tangent bundle [diagram: spaces $\pi^{-1}(U) \subseteq T(M)$, $U = \mathrm{Dom}(\psi) \subseteq M$ and $\mathbb{R}^n$; maps $\phi_\psi$, $\pi$ and $\psi$]
28.8.4 Definition: The tangent bundle total space atlas for the tangent bundle $(T(M), \pi, \hat{A}_{T(M)}, \Phi)$ of an $n$-dimensional $C^1$ manifold $M$ is the atlas $A_{T(M)}$ defined by
$$A_{T(M)} = \{Q_{n,n} \circ ((\psi \circ \pi) \times \Phi(\psi));\ \psi \in A_M\},$$
where $Q_{n,n} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^{2n}$ is the concatenation map in Definition 8.5.3.

28.8.5 Remark: The tangent bundle charts in Definition 28.8.4 are illustrated in Figure 28.8.2.
Figure 28.8.2   Standard atlas for a tangent bundle [diagram: spaces $\pi^{-1}(U) \subseteq T(M)$, $U = \mathrm{Dom}(\psi) \subseteq M$, $\mathbb{R}^n$ and $\mathbb{R}^{2n}$; maps $\phi_\psi = \Phi(\psi)$, $\pi$, $\psi$, $Q_{n,n}$ and $Q_{n,n} \circ ((\psi \circ \pi) \times \phi_\psi)$]
The atlas for $T(M)$ in Definition 28.8.4 may be written as
$$A_{T(M)} = \{\Psi(\psi);\ \psi \in A_M\},$$
where $\Psi(\psi) = Q_{n,n} \circ ((\psi \circ \pi) \times \Phi(\psi))$ for all $\psi \in A_M$. The bijection $\Psi : A_M \to A_{T(M)}$ associates tangent bundle (total space) charts with (base space) manifold charts. (It is convenient to abbreviate each chart $\Psi(\psi) : \pi^{-1}(\mathrm{Dom}(\psi)) \to \mathbb{R}^{2n}$ to $\hat\psi \in A_{T(M)}$.) The atlas in Definition 28.8.4 is related to the flat-space tangent bundle as in Figure 28.0.2.

28.8.6 Definition: The tangent bundle total space manifold of a $C^1$ manifold $M$ is the differentiable manifold $(T(M), A_{T(M)})$, where the atlas $A_{T(M)}$ is given by Definition 28.8.4.

28.8.7 Remark: The tangent fibration in Definition 28.8.8 is essentially a differentiable fibration as its name suggests. (See Section 35.1.) General differentiable fibrations are defined in Section 27.12. Differentiable tangent bundles are defined in Section 35.8.

28.8.8 Definition: The tangent fibration of a $C^1$ manifold $M \mathrel{-\!\!<} (M, A_M)$ is the tuple $(T(M), \pi, M) \mathrel{-\!\!<} (T(M), A_{T(M)}, \pi, M, A_M)$ where $A_{T(M)}$ is the tangent bundle total space atlas for $M$ and $\pi$ is the projection map for $T(M)$ as in Definition 28.8.1.
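The chart construction $\Psi(\psi) = Q_{n,n} \circ ((\psi \circ \pi) \times \Phi(\psi))$ of Definition 28.8.4 can be illustrated by a small sketch. Everything concrete here is an assumption made for illustration: $M$ is a region of the plane with two hypothetical charts named `"cart"` and `"polar"`, a tangent vector is stored as a pair `(p, v)` whose components `v` refer to the Cartesian chart, and the components for another chart are obtained with the Jacobian of the chart transition map.

```python
import math

def cart_chart(p):
    """psi for the Cartesian chart: the identity on coordinates."""
    return (p[0], p[1])

def polar_chart(p):
    """psi for the polar chart: p -> (r, theta)."""
    return (math.hypot(p[0], p[1]), math.atan2(p[1], p[0]))

def identity_jacobian(p):
    return ((1.0, 0.0), (0.0, 1.0))

def polar_jacobian(p):
    """Jacobian of polar_chart o cart_chart^{-1} at p (rows: r, theta)."""
    x, y = p
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return ((x / r, y / r), (-y / r2, x / r2))

# Hypothetical atlas A_M: chart name -> (chart map, Jacobian relative to "cart").
CHARTS = {"cart": (cart_chart, identity_jacobian),
          "polar": (polar_chart, polar_jacobian)}

def proj(t):
    """The projection pi: tangent vector -> base point."""
    return t[0]

def Phi(name):
    """Phi(psi): the component map t -> v for the chart psi."""
    _, jac = CHARTS[name]
    def phi(t):
        p, v = t
        J = jac(p)
        return tuple(sum(J[i][j] * v[j] for j in range(2)) for i in range(2))
    return phi

def Psi(name):
    """Psi(psi): the total space chart t -> (psi(p), v) in R^{2n}."""
    chart, _ = CHARTS[name]
    phi = Phi(name)
    def psi_hat(t):
        # Tuple concatenation plays the role of Q_{n,n}.
        return chart(proj(t)) + phi(t)
    return psi_hat

t = ((1.0, 1.0), (0.0, 2.0))      # base point (1, 1), Cartesian components (0, 2)
in_cart = Psi("cart")(t)
in_polar = Psi("polar")(t)
```

With these choices `in_cart` is the 4-tuple of the point coordinates followed by the components, and `in_polar` lists $(r, \theta)$ followed by the polar components of the same tangent vector.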
28.8.9 Remark: It is tempting to think that a tangent bundle should not be defined to have a specific atlas. Definitions 28.8.1 and 28.9.2 seem perhaps to be overly restrictive in prescribing specific atlases. However, it would not be strictly correct to define the differentiable structure on $T(M)$ as some equivalence class of $A_{T(M)}$ or some sort of maximal atlas because that would weaken the regularity of the specific atlas $\{\Psi(\psi);\ \psi \in A_M\}$. There are infinitely many levels and gradations of regularity between classes $C^r$ and $C^{r+1}$. Any addition to the differentiable structure would weaken the regularity, possibly losing some property of interest. The regularity inherited from the base space $M$ might not be analytic. For instance, the atlas for $M$ may satisfy some group property such as local orthogonality or conformality of the transition maps. [ See Remark 28.2.7. ] [ Define the standard topology on the tangent operator bundle. ]

28.8.10 Remark: The definition of the induced topology on a tangent bundle total space is related to Theorem 14.10.15, which expresses the topology on a set in terms of the topology on each member of an open covering of that set. [ See Malliavin [35], section I.1.2 regarding the “natural topology” of a $C^0$ atlas. ]

28.8.11 Theorem: The topology defined for a tangent bundle in Definition 28.8.1 depends on the topology on the base space, but is independent of the choice of atlas for the base space.

28.8.12 Theorem: The tuple $(T(M), T_{T(M)}, \pi, M, T_M)$, where $(T(M), T_{T(M)})$ is the topological total tangent space for a $C^1$ $n$-dimensional manifold $M \mathrel{-\!\!<} (M, A_M)$, is a topological fibre bundle with fibre space $\mathbb{R}^n$. (See Definitions 23.2.1 and 23.3.7.) [ Check this theorem. ]

28.8.13 Theorem: For $r \in \overline{\mathbb{Z}}{}^+_0$, the pair $(T(M), A_{T(M)})$ in Definition 28.8.1 is a $C^r$ manifold if $(M, A_M)$ is a $C^{r+1}$ manifold.
[ Maybe should have a table of the various tangent bundle total spaces, fibrations and fibre bundles near here? ]
28.9. Tangent operator bundles

28.9.1 Remark: Definition 28.9.2 is an operator version of Definition 28.8.1. [ Convert Definition 28.9.2 to make it look like Metadefinition 28.2.1. ]

28.9.2 Definition: The tangent operator bundle of a $C^1$ manifold $M \mathrel{-\!\!<} (M, A_M)$ is the $C^0$ manifold $\hat{T}(M) \mathrel{-\!\!<} (\hat{T}(M), A_{\hat{T}(M)})$, where
(i) $\hat{T}(M) = \bigcup_{p\in M} \hat{T}_p(M) = \{(p, \partial_{p,v,\psi});\ \psi \in A_M,\ p \in \mathrm{Dom}(\psi),\ v \in \mathbb{R}^n\}$, where $n = \dim(M)$, and
(ii) $A_{\hat{T}(M)} = \{\hat\psi;\ \psi \in A_M\}$, where for any chart $\psi \in A_M$, the chart $\hat\psi : \pi^{-1}(\mathrm{Dom}(\psi)) \to \mathbb{R}^{2n}$ is defined by $\hat\psi : (p, \partial_{p,v,\psi}) \mapsto (\psi(p), v)$, where $\pi : \hat{T}(M) \to M$ is defined by $\pi : (p, \partial_{p,v,\psi}) \mapsto p$.

The function $\pi : \hat{T}(M) \to M$ is called the projection map of the tangent operator bundle $\hat{T}(M)$. [ Show that Definition 28.9.2 satisfies Metadefinition 28.2.1. ] [ Show that Definition 28.9.2 is well-defined. See Malliavin [35], lemma I.7.2.4. ]
28.10. The tangent bundle of a tangent bundle
28.10.1 Remark: Tangent vectors are mathematical objects which correspond to first-order derivatives. But a very large proportion of physics is expressed in terms of second-order differential equations. So tangent vectors must be extended somehow to represent second-order derivatives if differential geometry is to be useful in physics. However, the transition from first-order to second-order objects is precisely where differential geometry becomes significantly more complicated than Euclidean space.

28.10.2 Remark: A second-order derivative is usually thought of as the first-order derivative of the first-order derivative. This is fine as long as the output from the first-order derivative of a function space is the same space you started with. For example, the first-order derivative of a $C^\infty$ function $f : \mathbb{R} \to \mathbb{R}$ is a $C^\infty$ function from $\mathbb{R}$ to $\mathbb{R}$. So you can keep taking derivatives up to any order in the same way. But with a function $f \in C^\infty(\mathbb{R}^n, \mathbb{R})$ for integer $n \ge 2$, the first-order derivative will typically be the sequence of $n$ first-order partial derivatives $(\partial_i f)_{i=1}^n \in C^\infty(\mathbb{R}^n, \mathbb{R}^n)$ (or some similar representation). So even in this very basic flat-space case, the second-order derivative is not the double application of the first-order derivative.

The situation is much more difficult when defining a second-order derivative on a differentiable manifold. In this case, the first-order derivative of a real-valued function yields a cross-section of the cotangent bundle, which is a linear functional on the tangent vector space at each point of the manifold. The purpose of this section is to define the first-order derivative of such cross-sections so that second-order derivatives will be meaningful. However, not all cross-sections of first-order tangent bundles are obtained as the differential of a real-valued function. So the applicability of the tangent bundle of a tangent bundle is much broader than just second-order derivatives of real-valued functions. [ Should also present the chart transition map equations for $T(T^*(M))$ and $T^*(T^*(M))$ in the hope that these may be closely related to the equations for second-order tangent operators. ]

28.10.3 Remark: One obvious way to try to define mathematical objects to represent second-order derivatives would be to construct a second layer of tangent vectors on top of the bundle of first-order tangent vectors. But tangent vectors are defined with a differentiable manifold $M$ as input and a tangent bundle $T(M)$ as output. It would be possible to construct a second-level tangent object $T(T(M))$ if the tangent bundle $T(M)$ was a manifold. It happens that the tangent bundle total space $T(M)$ does have a natural $C^{r-1}$ differentiable structure if $M$ is $C^r$. This can be used as the manifold to construct a second-level tangent bundle $T(T(M))$ of class $C^{r-2}$. So the steps to construct a second-level tangent bundle are as follows. (These steps are sketched in Figure 28.10.1.)
(i) Define a tangent bundle $T(M)$ on a $C^r$ manifold $M \mathrel{-\!\!<} (M, A_M)$.
(ii) Define a $C^{r-1}$ atlas (i.e. differentiable structure) $A_{T(M)}$ on the total space of $T(M)$.
(iii) Define a second-level tangent bundle $T^{(2)}(M) = T(T(M))$ on the total space of $T(M) \mathrel{-\!\!<} (T(M), A_{T(M)})$.
(iv) Define a $C^{r-2}$ atlas $A_{T(T(M))}$ on the total space of $T(T(M))$.
Figure 28.10.1   Building tangent bundles from manifolds and atlases [diagram: the spaces $M$, $T(M)$ and $T(T(M))$ with their atlases $A_M$, $A_{T(M)}$ and $A_{T(T(M))}$]
The differentiable structure on the tangent bundle of a differentiable manifold is presented in Definition 28.8.4. The existence of a differentiable structure on the tangent bundle implies that tangent vectors can be defined with base points in the tangent bundle $T(M)$ just as they were on the base space $M$ in Definition 28.3.3. The manifold of these tangent vectors forms a new tangent bundle, namely $T(T(M))$. [ There is some discussion of $T(T(M))$ in Malliavin [35], chapter II.4, page 112. ]

28.10.4 Remark: When the time comes to construct the second-level tangent bundle, coordinate vectors have a clear advantage compared to differential operator vectors. Tangent vectors are defined in terms of coordinate charts on a manifold. Therefore second-level tangent vectors must be defined in terms of coordinate charts on the first-level tangent bundle. It is easier to construct $T(T(M))$ in terms of coordinate-based tangent vectors (Definition 28.3.3) than tangent operators (Definition 28.6.2).

28.10.5 Remark: The following concepts are easily confused.
(i) The tangent bundle of a tangent bundle: $T^{(2)}(M) = T(T(M))$. (This section.)
(ii) The space $\mathring{T}^{[2]}(M)$ of second-order derivative operators $f \mapsto v^i w^j (\partial^2 f(x)/\partial x^i \partial x^j)\big|_{x=p}$. (Section 32.5.)
(iii) The space of tensors of degree 2: $T^2(M) = T(M) \otimes T(M)$. (See Section 29.3.)
(iv) The set of ordered pairs of tangent vectors: $T(M)^2 = T(M) \times T(M)$. (See Section 28.12 for 2-frames.)

The words “degree”, “order” and “level” are used in this book with the following meanings.

  word     meaning                               example                                                                                          tangent spaces
  degree   the multiplicity of a tensor          $\otimes^1 V$, $\otimes^2 V$, $\otimes^3 V$                                                      $T^{(r,0)}(M)$
  order    the order of derivatives              $\partial/\partial x^i$, $\partial^2/\partial x^i\partial x^j$, $\partial^3/\partial x^i\partial x^j\partial x^k$   $T^{[r]}(M)$, $\mathring{T}^{[r]}(M)$
  level    recursive level of tangent bundles    $T(M)$, $T(T(M))$, $T(T(T(M)))$                                                                  $T^{(r)}(M)$

28.10.6 Remark: In the same way that first-level tangent bundles on manifolds are defined in terms of flat-space tangent bundles (as mentioned in Remark 28.0.3), second-level tangent bundles are defined in terms of the corresponding flat-space second-level tangent bundles. This is summarized in Figure 28.10.2.

Figure 28.10.2   Defining second-level tangent bundle in terms of flat space [diagram: $W \in T^{(2)}(M)$ with $(x, v, w) = \Psi^{(2)}(\psi)(W)$; chart $\Psi^{(2)}(\psi) \in A_{T^{(2)}(M)}$ into $T^{(2)}(\mathbb{R}^n) \equiv \mathbb{R}^n \times \mathbb{R}^n \times (\mathbb{R}^n \times \mathbb{R}^n) \equiv \mathbb{R}^{4n}$; projections $\pi^{(2)}$ with $p = \pi(W)$ and $(x, v, w) \mapsto x$; chart $\psi \in A_M$ into $\mathbb{R}^n$ with $x = \psi(p)$]
28.10.7 Definition: A tangent vector on the total tangent space $(T(M), A_{T(M)})$ of a $C^2$ $n$-dimensional manifold $(M, A_M)$ at $z = t_{p,v,\psi} \in T(M)$ is an equivalence class $[(t_{p,v,\psi}, w, \hat\psi)]$, where $p \in M$, $v \in \mathbb{R}^n$, $\psi \in A_M$, $w \in \mathbb{R}^{2n}$ and $\hat\psi \in A_{T(M)}$. Thus $T_z(T(M)) = \{[(z, w, \hat\psi)];\ w \in \mathbb{R}^{2n},\ \hat\psi \in \mathrm{atlas}_z(T(M))\}$.

A reasonable sort of notation for $[(z, w, \hat\psi)]$ might be $t^{(2)}_{z,w,\hat\psi}$.
28.10.8 Remark: Theorem 28.10.9 gives chart transition rules for tangent vectors to the total tangent space $T(M)$ which are analogous to the chart transition rules in Definition 28.3.3 for tangent vectors to the manifold $M$. These transition rules show that verticality of vectors in $T(T(M))$ is chart-independent, whereas horizontality is not. In other words, if the components $w \in \mathbb{R}^{2n}$ of a vector $t^{(2)}_{z,w,\hat\psi} \in T_z(T(M))$ satisfy $w^i = 0$ for $i = 1 \ldots n$ for one chart, then this holds for all charts. On the other hand, if $w$ satisfies $w^i = 0$ for $i = n+1, \ldots 2n$ for one chart, this equality does not generally hold for all charts. But if $z$ is a zero vector (that is, $z = t_{p,v,\psi}$ with $v = 0$), then horizontality is chart-independent.
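28.10.9 Theorem: Let $M$ be an $n$-dimensional $C^2$ manifold. Let $p \in M$ and $z = t_{p,v,\psi} \in T_p(M)$. Let $\psi_1, \psi_2 \in \mathrm{atlas}_p(M)$, and let $\hat\psi_1, \hat\psi_2 \in \mathrm{atlas}_z(T(M))$ be the corresponding tangent bundle total space charts in Definition 28.8.4. Then for $w_1, w_2 \in \mathbb{R}^{2n}$, $t^{(2)}_{z,w_1,\hat\psi_1} = t^{(2)}_{z,w_2,\hat\psi_2}$ if and only if for all $i = 1 \ldots n$,
$$w_2^i = \sum_{j=1}^n \partial_{x^j}(\psi_2^i \circ \psi_1^{-1}(x))\Big|_{x=\psi_1(p)}\, w_1^j \tag{28.10.1}$$
and
$$w_2^{n+i} = \sum_{j,k=1}^n \partial_{x^j}\partial_{x^k}(\psi_2^i \circ \psi_1^{-1}(x))\Big|_{x=\psi_1(p)}\, v_1^k w_1^j + \sum_{j=1}^n \partial_{x^j}(\psi_2^i \circ \psi_1^{-1}(x))\Big|_{x=\psi_1(p)}\, w_1^{n+j}. \tag{28.10.2}$$
Here $v_1 \in \mathbb{R}^n$ denotes the component tuple of $z$ with respect to the chart $\psi_1$, so that $z = t_{p,v_1,\psi_1}$ as in Figure 28.10.3.

Proof: By Definition 28.3.3 applied to the manifold $T(M)$, $t^{(2)}_{z,w_1,\hat\psi_1} = t^{(2)}_{z,w_2,\hat\psi_2}$ if and only if
$$\forall i = 1, \ldots 2n, \qquad w_2^i = \sum_{j=1}^{2n} \partial_{y^j}(\hat\psi_2^i \circ \hat\psi_1^{-1}(y))\Big|_{y=\hat\psi_1(z)}\, w_1^j. \tag{28.10.3}$$
The variable $y = \hat\psi_1(\breve z) \in \mathbb{R}^{2n}$ has the form $(\breve x, \breve v) = (\psi_1(\breve p), \breve v)$, where $\breve z = t_{\breve p,\breve v,\psi_1}$ is a point in $T(M)$. So $y^j = \breve x^j = \psi_1^j(\breve p)$ for $j = 1 \ldots n$ and $y^j = \breve v^{j-n}$ for $j = n+1, \ldots 2n$. Thus
$$\hat\psi_2^i \circ \hat\psi_1^{-1}(y) = \begin{cases}
\psi_2^i \circ \psi_1^{-1}(\breve x) & \text{for } i = 1 \ldots n \\
\sum_{k=1}^n \partial_{x^k}(\psi_2^{i-n} \circ \psi_1^{-1}(x))\big|_{x=\breve x}\, \breve v^k & \text{for } i = n+1, \ldots 2n.
\end{cases} \tag{28.10.4}$$
Substitution of (28.10.4) into the expression $\partial_{y^j}(\hat\psi_2^i \circ \hat\psi_1^{-1}(y))$ in (28.10.3) for $j = 1 \ldots n$ gives
$$\begin{aligned}
\partial_{y^j}(\hat\psi_2^i \circ \hat\psi_1^{-1}(y))
&= \begin{cases}
\partial_{\breve x^j}(\psi_2^i \circ \psi_1^{-1}(\breve x)) & \text{for } i = 1 \ldots n \\
\partial_{\breve x^j} \sum_{k=1}^n \partial_{x^k}(\psi_2^{i-n} \circ \psi_1^{-1}(x))\big|_{x=\breve x}\, \breve v^k & \text{for } i = n+1, \ldots 2n
\end{cases} \\
&= \begin{cases}
\partial_{x^j}(\psi_2^i \circ \psi_1^{-1}(x))\big|_{x=\breve x} & \text{for } i = 1 \ldots n \\
\sum_{k=1}^n \partial_{x^j}\partial_{x^k}(\psi_2^{i-n} \circ \psi_1^{-1}(x))\big|_{x=\breve x}\, \breve v^k & \text{for } i = n+1, \ldots 2n.
\end{cases}
\end{aligned}$$
A similar substitution for $j = n+1, \ldots 2n$ gives
$$\begin{aligned}
\partial_{y^j}(\hat\psi_2^i \circ \hat\psi_1^{-1}(y))
&= \begin{cases}
\partial_{\breve v^{j-n}}(\psi_2^i \circ \psi_1^{-1}(\breve x)) & \text{for } i = 1 \ldots n \\
\partial_{\breve v^{j-n}} \sum_{k=1}^n \partial_{x^k}(\psi_2^{i-n} \circ \psi_1^{-1}(x))\big|_{x=\breve x}\, \breve v^k & \text{for } i = n+1, \ldots 2n
\end{cases} \\
&= \begin{cases}
0 & \text{for } i = 1 \ldots n \\
\partial_{x^{j-n}}(\psi_2^{i-n} \circ \psi_1^{-1}(x))\big|_{x=\breve x} & \text{for } i = n+1, \ldots 2n.
\end{cases}
\end{aligned}$$
Substitution of these expressions into (28.10.3) gives (28.10.1) and (28.10.2).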
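Figure 28.10.3   Chart transition rule for tangent vectors to $T(M)$ [diagram: the vector $t^{(2)}_{z,w_1,\hat\psi_1} = t^{(2)}_{z,w_2,\hat\psi_2} \in T_z(T(M))$ at $z = t_{p,v_1,\psi_1} = t_{p,v_2,\psi_2} \in T(M)$; charts $\hat\psi_1, \hat\psi_2$ from $T(M)$ to $\mathbb{R}^{2n}$ with $z \mapsto (x_1, v_1)$ and $z \mapsto (x_2, v_2)$; charts $\psi_1, \psi_2$ from $M$ to $\mathbb{R}^n$ with $x_1 = \psi_1(p)$ and $x_2 = \psi_2(p)$]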
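28.10.10 Remark: The sets, points and maps in Theorem 28.10.9 are illustrated in Figure 28.10.3. The transformation rules (28.10.1) and (28.10.2) may be summarized as the following matrix equation. (This is similar to Remark 19.3.2.)
$$\begin{bmatrix} w_2^H \\ w_2^V \end{bmatrix} = \begin{bmatrix} A & 0 \\ B & A \end{bmatrix} \begin{bmatrix} w_1^H \\ w_1^V \end{bmatrix},$$
where $w_i^H$ and $w_i^V$ are respectively the horizontal and vertical parts of the component vector $w_i \in \mathbb{R}^{2n}$ and
$$A = \left[ \frac{\partial\psi_2^i}{\partial\psi_1^j} \right]_{i,j=1}^n, \qquad B = \left[ \frac{\partial^2\psi_2^i}{\partial\psi_1^j\,\partial\psi_1^k}\, v^k \right]_{i,j=1}^n,$$
with summation over the repeated index $k$ implied in $B$, in agreement with (28.10.2).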
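The block matrix form of Remark 28.10.10 can be compared with a direct numerical differentiation of the lifted transition map $\hat\psi_2 \circ \hat\psi_1^{-1} : (x, v) \mapsto (\phi(x), D\phi(x)v)$, where $\phi = \psi_2 \circ \psi_1^{-1}$. The sketch below assumes a specific hypothetical pair of charts on a region of the plane ($\psi_1$ polar, $\psi_2$ Cartesian) and hypothetical sample values; the helper names are not from the text.

```python
import math

def phi(x):
    """Transition map psi_2 o psi_1^{-1}: (r, theta) -> (r cos theta, r sin theta)."""
    r, th = x
    return [r * math.cos(th), r * math.sin(th)]

def A_matrix(x):
    """A[i][j] = first derivative of phi^i with respect to x^j."""
    r, th = x
    return [[math.cos(th), -r * math.sin(th)],
            [math.sin(th),  r * math.cos(th)]]

def B_matrix(x, v):
    """B[i][j] = sum over k of (second derivative of phi^i in x^j, x^k) * v[k]."""
    r, th = x
    s, c = math.sin(th), math.cos(th)
    hess = [[[0.0, -s], [-s, -r * c]],
            [[0.0,  c], [ c, -r * s]]]
    return [[sum(hess[i][j][k] * v[k] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transition(w1, x, v):
    """Apply the block matrix [[A, 0], [B, A]] to w1 = (w1^H, w1^V)."""
    A, B = A_matrix(x), B_matrix(x, v)
    wH, wV = w1[:2], w1[2:]
    new_H = [sum(A[i][j] * wH[j] for j in range(2)) for i in range(2)]
    new_V = [sum(B[i][j] * wH[j] + A[i][j] * wV[j] for j in range(2))
             for i in range(2)]
    return new_H + new_V

def lifted(y):
    """Lifted transition map on R^{2n}: (x, v) -> (phi(x), D phi(x) v)."""
    x, v = y[:2], y[2:]
    A = A_matrix(x)
    return phi(x) + [sum(A[i][k] * v[k] for k in range(2)) for i in range(2)]

def numeric_jacobian(F, y, h=1e-6):
    """Central-difference Jacobian J[i][j] = dF^i/dy^j."""
    cols = []
    for j in range(len(y)):
        yp, ym = list(y), list(y)
        yp[j] += h
        ym[j] -= h
        cols.append([(a - b) / (2 * h) for a, b in zip(F(yp), F(ym))])
    return [[cols[j][i] for j in range(len(y))] for i in range(len(y))]

x, v = [2.0, 0.7], [0.5, -0.3]          # chart-1 coordinates of p and of z
w1 = [1.0, -2.0, 0.4, 0.9]

w2_block = transition(w1, x, v)
J = numeric_jacobian(lifted, x + v)
w2_numeric = [sum(J[i][j] * w1[j] for j in range(4)) for i in range(4)]

vertical = transition([0.0, 0.0, 1.0, 2.0], x, v)         # stays vertical
horizontal = transition([1.0, 2.0, 0.0, 0.0], x, v)       # picks up a vertical part
horizontal_at_zero = transition([1.0, 2.0, 0.0, 0.0], x, [0.0, 0.0])
```

For this choice of charts, `w2_block` and `w2_numeric` agree up to finite-difference error. The three sample vectors illustrate Remark 28.10.8: the vertical vector remains vertical, the horizontal one does not remain horizontal when $v \ne 0$, and it does when $v = 0$.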
28.10.11 Remark: The transformation rules for $T^{(2)}(M) = T(T(M))$ in Theorem 28.10.9 are eerily similar to the rules for $T^{[2]}(M)$ in Definition 30.3.3. This is probably not a coincidence. [ Compare the transition map rule in Theorem 28.10.9 with the rule for $\Gamma^i_{jk}$ in EDM2 [34], 80.L? ]

28.10.12 Remark: The way in which atlases are used to mechanically arrive at the chart transition rules in Theorem 28.10.9 shows the advantage of working with coordinate-based tangent vectors in $T(T(M))$ rather than tangent operators in spaces such as $\mathring{T}(\mathring{T}(M))$ or $\hat{T}(\hat{T}(M))$. It was precisely the difficulties of working with these higher-level tangent spaces which prompted the author to abandon differential operators as the primary basis for tangent spaces. Such operators are good for intuition and bad for precise calculations with higher-level concepts.

[ Definition 28.10.13 is probably a theorem, not a definition. ]

28.10.13 Definition: The tangent space of a tangent bundle total space $T(M)$ at a point $z = t_{p,v,\psi} \in T(M)$ for a $C^2$ $n$-dimensional manifold $M$ is the set $T_z(T(M))$ of all tangent vectors on $T(M)$ at $z$ together with the linear space operations of pairwise addition and scalar multiplication defined by
$$\lambda_1 t^{(2)}_{z,w_1,\hat\psi_j} + \lambda_2 t^{(2)}_{z,w_2,\hat\psi_j} = t^{(2)}_{z,\lambda_1 w_1 + \lambda_2 w_2,\hat\psi_j}$$
for all $\lambda_1, \lambda_2 \in \mathbb{R}$, $w_1, w_2 \in \mathbb{R}^{2n}$ and $\hat\psi_j \in \mathrm{atlas}_z(T(M))$.
28.10.14 Definition: The total space of the second-level tangent bundle of a $C^2$ $n$-dimensional manifold $M \mathrel{-\!\!<} (M, A_M)$ is the differentiable manifold tuple $T(T(M)) \mathrel{-\!\!<} (T(T(M)), A_{T(T(M))})$, where
(i) $T(T(M)) = \bigcup_{z\in T(M)} T_z(T(M))$ is the union of the pointwise tangent spaces of $(M, A_M)$;
(ii) $A_{T(T(M))} = \{\check\psi;\ \psi \in A_M\}$, where for all $\psi \in A_M$, the chart $\check\psi : \hat\pi^{-1}(\mathrm{Dom}(\psi)) \to \mathbb{R}^{4n}$ is defined by $\check\psi : t^{(2)}_{t_{p,v,\psi},w,\hat\psi} \mapsto (\psi(p), v, w)$, where $\hat\pi : T(T(M)) \to T(M)$ is defined by $\hat\pi = \pi_* : t^{(2)}_{t_{p,v,\psi},w,\hat\psi} \mapsto t_{p,\pi_1(w),\psi}$, where $\pi_1 : \mathbb{R}^{2n} \to \mathbb{R}^n$ is the projection map with $\pi_1 : (x_1, \ldots x_{2n}) \mapsto (x_1, \ldots x_n)$.
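28.10.15 Remark: Definition 28.10.14 is illustrated in Figure 28.10.4.

Figure 28.10.4   Charts and maps for total tangent space of total tangent space [diagram: charts $\check\psi : T(T(M)) \to \mathbb{R}^{4n}$, $\hat\psi : T(M) \to \mathbb{R}^{2n}$ and $\psi : M \to \mathbb{R}^n$; projections $\hat\pi : T(T(M)) \to T(M)$ and $\pi : T(M) \to M$]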
28.10.16 Remark: It is clumsy to have to refer to a space such as $T(T(T(M)))$ as the “tangent bundle of the tangent bundle of the tangent bundle of $M$”. Since there is probably no standard term for this, it is convenient to refer to such a space as the “third-level tangent bundle of $M$”. Then a linear space such as $T_z(T(T(M)))$ for $z \in T(T(M))$ may be referred to as a “third-level tangent space of $M$ at $z$”. It is not obvious what notation to use for $k$th-level tangent spaces for arbitrary $k \in \mathbb{Z}^+$. Notation 28.10.17 will be used in this book. The parentheses in the superscript are supposed to be a mnemonic for the parentheses in the recursive definition.

28.10.17 Notation: $T^{(k)}(M)$ for $k \in \mathbb{Z}^+$ denotes the tangent bundle $T(T^{(k-1)}(M))$ of a $C^k$ manifold $M$, where $T^{(0)}(M) = M$.

28.10.18 Definition: The $k$th-level tangent bundle of a $C^k$ manifold $M$ is the tangent bundle $T^{(k)}(M)$ in Notation 28.10.17.
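As a small worked consequence of the recursion in Notation 28.10.17 (a sketch only, not a statement made in the text): the charts for $T(M)$ in Definition 28.8.4 take values in $\mathbb{R}^{2n}$ and the charts for $T(T(M))$ in Definition 28.10.14 take values in $\mathbb{R}^{4n}$, so each level doubles the chart dimension. Assuming that this pattern continues at every level, and assuming as in Remark 28.10.3 that each level costs one degree of regularity, induction on $k$ suggests the following for an $n$-dimensional $C^r$ manifold $M$ with $k \le r$.

```latex
\dim T^{(0)}(M) = n, \qquad
\dim T^{(k)}(M) = 2 \dim T^{(k-1)}(M)
\quad\Longrightarrow\quad
\dim T^{(k)}(M) = 2^k n,
\qquad
T^{(k)}(M) \text{ is of class } C^{r-k}.
```

For example, $k = 2$ gives the chart space $\mathbb{R}^{4n}$ and the regularity class $C^{r-2}$, which matches Definition 28.10.14 and step (iv) of Remark 28.10.3.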
28.11. Horizontal components and drop functions

28.11.1 Remark: A connection (i.e. parallelism) is constructed on a differentiable manifold $M$ by defining, for each direction of motion $v$ in the tangent space at a point $p \in M$, the rate of adjustment of each vector $w$ in the tangent space to keep the vector $w$ moving in a parallel fashion. For example, a point $q \in M$ may move along a curve $\gamma : \mathbb{R} \to M$ with velocity vector $v$ as the curve passes through the point $p$. The vector $w(q)$ always lies in the tangent space at the point $q$. So it is necessary to be able to define the velocity of a vector in the tangent bundle with a variable base point $q$. This in turn requires the definition of tangent vectors on the tangent bundle, and more importantly, the tangent vectors on the tangent bundle must be given a differentiable manifold structure so that all of the conditions in the definition of a connection will be meaningful. (See Definition 36.5.3.)

Parallelism is defined for a particular path between two points $p_1, p_2 \in M$ as a linear map between the tangent spaces $T_{p_1}(M)$ and $T_{p_2}(M)$. When this is differentiated, it leads naturally to a definition of affine connections in terms of the tangent space of the tangent space of $M$. This seems to imply that a manifold must be $C^2$ in order to have a (differential) connection on it. Probably this constraint could be weakened somewhat.
28.11.2 Definition: The horizontal component of a vector $W$ in a total tangent space $T(T(M))$ for a $C^2$ manifold $M$ is the vector $\pi_*(W)$ in $T(M)$, where $\pi$ is the projection map of $T(M)$.

28.11.3 Definition: A vertical vector in a total tangent space $T(T(M))$ for a $C^2$ manifold $M$ is a vector $W \in T(T(M))$ whose horizontal component is zero.

28.11.4 Remark: The horizontal component of a vector in a total tangent space is chart-independent. Therefore the verticality property is also chart-independent. Definitions 28.11.2 and 28.11.3 follow the terminology for general differentiable fibre bundles in Section 27.12. Vertical components and horizontal vectors are not defined here because there is no “connection” between tangent spaces at different points of a manifold.

28.11.5 Remark: Definition 28.11.6 is very important in constructing a covariant derivative out of an affine connection. It may be compared with the much simpler “drop” function for a linear space such as $\mathbb{R}^n$. The tangent space $T_p(\mathbb{R}^n)$ at a point $p \in \mathbb{R}^n$ is usually identified with $\mathbb{R}^n$ without comment. Thus if $\gamma : \mathbb{R} \to \mathbb{R}^n$ is a differentiable curve in $\mathbb{R}^n$, the derivative of $\gamma$ at $p = \gamma(t)$ is defined as $\lim_{h\to 0} (\gamma(t+h) - \gamma(t))/h$, which is the limit of a vector $(\gamma(t+h) - \gamma(t))/h$ that just happens to be in $\mathbb{R}^n$ for $h \ne 0$. Since $\gamma$ is differentiable, this limit exists, and the completeness of $\mathbb{R}^n$ ensures that it is an element of $\mathbb{R}^n$. Alternatively, one could think of $\mathbb{R}^n$ as a manifold with an atlas of one chart, namely the identity map. Then the limit could be thought of as an element of $T_p(\mathbb{R}^n)$.

In the case of a general manifold, there is no linear space structure, and therefore only the tangent space version of a vector can be defined. But if the manifold happens to be a linear space also, then there are two possible definitions. This is exactly what occurs when one constructs tangent vectors in $T_z(T(M))$ for a $C^2$ manifold with $z \in T(M)$. Any vertical vector in $T_z(T(M))$ may be constructed either within the linear space of vertical vectors or in the manifold structure of $T(M)$. Definition 28.11.6 gives the canonical map from the latter to the former. The verticality is important because a vertical vector $y \in T_z(T(M))$ represents a tangent vector with a constant base point $p = \pi(z)$, and if the base point is not changing, the rest of the manifold might as well not exist, and the vector therefore exists entirely within a simple linear space.

28.11.6 Definition: The drop function at a point $z \in T(M)$ for a $C^2$ manifold $M$ is the linear map $\varpi_z : \ker((d\pi)_z) \to T(M)$ defined by $\varpi_z : (z, w, \hat\psi) \mapsto (\pi(z), \pi_2(w), \psi)$ for $w \in \mathbb{R}^{2n}$ and $\psi \in \mathrm{atlas}_{\pi(z)}(M)$, where $\pi_2 : \mathbb{R}^{2n} \to \mathbb{R}^n$ is the projection map $\pi_2 : w \mapsto (w^{n+1}, \ldots w^{2n})$, and $\pi : T(M) \to M$ is the standard projection map for $T(M)$.

28.11.7 Theorem: The drop function $\varpi_z$ in Definition 28.11.6 is a chart-independent linear isomorphism.

Proof: Let $z \in T(M)$, $p = \pi(z) \in M$ and $\psi_1, \psi_2 \in \mathrm{atlas}_p(M)$. It must be shown that $t_{p,\pi_2(w_1),\psi_1} = t_{p,\pi_2(w_2),\psi_2}$ if $t^{(2)}_{z,w_1,\hat\psi_1} = t^{(2)}_{z,w_2,\hat\psi_2} \in \ker((d\pi)_z)$. The components $w_1^1, \ldots w_1^n$ and $w_2^1, \ldots w_2^n$ are all zero. So by Theorem 28.10.9,

$$ w_2^{n+i} = \sum_{j=1}^n \partial_{x^j}\bigl(\psi_2^i \circ \psi_1^{-1}(x)\bigr)\Big|_{x=\psi_1(p)} \, w_1^{n+j}, $$

from which it follows that $\pi_2(w_1)$ and $\pi_2(w_2)$ satisfy the same chart transition rule, which happens to be the right transition rule for vectors in $T_p(M)$. The fact that $\varpi_z$ is an isomorphism follows from the fact that $\pi_2$ is surjective and $(d\pi)_z : (z, w, \hat\psi) \mapsto (p, \pi_1(w), \psi)$ for $\psi \in \mathrm{atlas}_p(M)$.

28.11.8 Remark: The map $\varpi_z$ in Definition 28.11.6 is a bijection from the subspace $\ker((d\pi)_z)$ of vertical vectors in $T_z(T(M))$ to $T_p(M)$. Note that $\varpi_z$ cannot be extended to a chart-independent linear map on all of $T_z(T(M))$ without some means of removing the arbitrary definition of horizontality. This is precisely why connections are required. A connection on a $C^2$ manifold is equivalent to extending $\varpi_z$ to all of $T_z(T(M))$ for all $z \in T(M)$ in a way which is chart-independent. This requires some way of adjusting the map for each chart so as to remove the horizontal component in equation (28.10.2) in Definition 28.10.9. In fact, a connection may be thought of as a horizontal component removal rule: the connection determines a horizontal vector which is subtracted from an element of $T_z(T(M))$ to give a vertical vector. This is the basis of covariant differentiation.

Definition 28.11.6 is an essential ingredient in defining covariant derivatives from connections, although most texts do not mention this explicitly. The vertical vectors constructed from connections must be “dropped” from the vertical space $\ker((d\pi)_z)$ down to the base tangent space $T_p(M)$. (This is the procedure for covariant differentiation of vector fields in $X^0(T(M))$. Similar procedures are followed for general tensor fields. If the field to be differentiated is valued in a linear space, a similar drop function can generally be defined.)

Maybe the total drop function $\varpi = \bigcup_{z\in T(M)} \varpi_z$ could be regarded as a differentiable map between differentiable manifolds if the domain $\bigcup_{z\in T(M)} \ker((d\pi)_z)$ (i.e. the set of all vertical vectors) can be regarded as a $C^1$ manifold.

[ Must try to put a sensible differentiable structure on the set of all vertical vectors. And maybe a fibration and fibre bundle structure too. Maybe even give it a notation, both for total tangent spaces and for general $C^1$ fibre bundles. ]

28.11.9 Remark: It is sometimes useful to work with functions $f : M \to T(T(M))$ such that $f(p) \in T(T(M))$ and $\pi(\pi_*(f(p))) = p$ for all $p \in M$, where $M$ is a $C^2$ manifold. Such functions are cross-sections of the fibration $(T(T(M)), \pi \circ \pi_*, M)$. It would be useful to have a notation for the space of these functions. One possibility is the notation “$T(T(M))/M$”, although this is ambiguous because the projection map could be something other than $\pi \circ \pi_*$.
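As a minimal sketch of how the drop function of Definition 28.11.6 acts, consider the flat case $M = \mathbb{R}^n$ with the identity chart $\psi$, so that $T(M)$ may be identified with $\mathbb{R}^n \times \mathbb{R}^n$. The vectors $u$, $a$ and the curve $\zeta$ below are illustrative names which do not appear in the text above.

```latex
% Sketch only (assumes amsmath). M = R^n, psi = identity chart, z = (p,u) in T(M).
% zeta, u and a are hypothetical names introduced for this illustration.
\begin{align*}
  \zeta(s) &= (p,\; u + s a), \qquad a \in \mathbb{R}^n
    && \text{(a curve in $T(M)$ with constant base point $p$)} \\
  w &= \partial_s \hat\psi(\zeta(s))\big|_{s=0}
     = (0,\ldots,0,\; a^1,\ldots,a^n) \in \mathbb{R}^{2n}
    && \text{(so $\pi_1(w) = 0$: the vector is vertical)} \\
  \varpi_z\bigl(t^{(2)}_{z,w,\hat\psi}\bigr) &= t_{p,\pi_2(w),\psi} = t_{p,a,\psi} \in T_p(M)
    && \text{(the drop keeps only the last $n$ components)}
\end{align*}
```

In this flat case the dropped vector $a$ agrees with the elementary limit $\lim_{h\to 0} ((u + ha) - u)/h$, which is the comparison made in Remark 28.11.5.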
28.12. Tangent frames and coordinate basis vectors
This section presents the natural topological and atlas structures for the space of all sequences of linearly independent tangent vectors of a differentiable manifold.

A tangent frame is a sequence of linearly independent vectors in the tangent vector space $T_p(M)$ at a single point $p$ of a $C^1$ manifold $M$. Tangent frames with $r$ vectors for $r \le n$ are elements of the direct product set $T_p(M)^r$ at each $p \in M$. Tangent frames are useful for the definition of parallelism (connections) on manifolds and also for representing forms on manifolds. The set of all tangent $r$-frames of a given manifold has a natural manifold structure. It is shown in Section 35.9 that the set of all tangent $n$-frames of a manifold may be regarded as a principal fibre bundle.

28.12.1 Definition: A tangent $r$-frame at a point $p \in M$ of an $n$-dimensional $C^1$ manifold $M$ for $r \in \mathbb{Z}^+_0$ with $r \le n = \dim(M)$ is any linearly independent sequence of $r$ elements in $T_p(M)$.
A tangent basis at a point $p \in M$ is any tangent $n$-frame at a point $p$ of an $n$-dimensional manifold $M$.

28.12.2 Notation:
$P^r_p(M)$ denotes the set of all tangent $r$-frames at a point $p$ in a $C^1$ manifold $M$. (Then $P^r_p(M)$ is a subset of the set $T_p(M)^r$ of $r$-tuples of elements of $T_p(M)$.)
$P_p(M)$ denotes the set of all $n$-frames at $p \in M$, where $n = \dim(M)$. (Then $P_p(M) = P^n_p(M)$.)
$P^r(M)$ denotes the set $\bigcup_{p\in M} P^r_p(M)$ of all $r$-frames of the manifold $M$.
$P(M)$ denotes the set $\bigcup_{p\in M} P_p(M)$ of all coordinate frames of the manifold $M$.

28.12.3 Remark: The set $P^r_p(M)$ may be expressed symbolically as follows:

$$ P^r_p(M) = \Bigl\{ (z_i)_{i=1}^r \in T_p(M)^r ;\ \forall (\lambda_i)_{i=1}^r \in \mathbb{R}^r,\ \sum_{i=1}^r \lambda_i z_i = 0 \Rightarrow (\forall i = 1, \ldots r,\ \lambda_i = 0) \Bigr\}. $$

28.12.4 Definition: A tangent operator r-frame is. . .

28.12.5 Remark: An atlas is introduced for the set of coordinate frames in Theorem 28.12.6 and Definition 28.12.8 using the coordinate basis vectors presented in Definition 28.7.11.

[ Give also a version of Theorem 28.12.6 using tangent component vectors rather than tangent operators. In this case, the basis vectors will be denoted as $e^{p,\psi}_i \in T_p(M)^n$. The proof is identical to that for Theorem 28.12.6. It’s best to just prove the theorem for component vectors and write the operator case as a corollary. ]
28.12.6 Theorem: Let $(M, A_M)$ be an $n$-dimensional $C^1$ manifold. Then for any $p \in M$ and any chart $\psi \in A_M$, the sequence $(\partial^{p,\psi}_i)_{i=1}^n$ is a basis for $\mathring{T}_p(M)$. Therefore $\forall p \in M,\ \dim(\mathring{T}_p(M)) = \dim(T_p(M)) = \dim(M)$. [ Is this really worth stating? ]
Hence (or otherwise) for any $p \in M$, $\psi \in A_M$ and $n$-frame $B \in \mathring{P}_p(M)$, there is a unique matrix $b = (b^i{}_j)_{i,j=1}^n \in GL(n)$ such that

$$ B = \Bigl( \sum_{i=1}^n \partial^{p,\psi}_i \, b^i{}_j \Bigr)_{j=1}^n . $$

If this sequence is denoted as $B_{p,b,\psi}$, then

$$ \mathring{P}_p(M) = \{ B_{p',b,\psi} ;\ b \in GL(n),\ \psi \in A_M,\ p' \in \mathrm{Dom}(\psi),\ p' = p \} $$

and

$$ \mathring{P}(M) = \{ B_{p,b,\psi} ;\ b \in GL(n),\ \psi \in A_M,\ p \in \mathrm{Dom}(\psi) \}. $$

The $n$-frames $B_{p,b,\psi}$ obey the following coordinate transformation rule for $\psi_\alpha, \psi_\beta \in A_M$: $B_{p,b_\alpha,\psi_\alpha} = B_{p,b_\beta,\psi_\beta}$ if and only if

$$ (b_\beta)^i{}_j = \sum_{k=1}^n (Z_{\beta\alpha}(p))^i{}_k \, (b_\alpha)^k{}_j , $$

where $Z_{\beta\alpha}(p)$ is as in Definition 27.4.7.

Proof: A straightforward calculation using $\partial^{p,\psi_\alpha}_i = \sum_{k=1}^n (\partial\psi_\beta^k / \partial\psi_\alpha^i)\, \partial^{p,\psi_\beta}_k = \sum_{k=1}^n \partial^{p,\psi_\beta}_k \, Z_{\beta\alpha}(p)^k{}_i$ yields the following:

$$
\begin{aligned}
B_{p,b_\alpha,\psi_\alpha}
 &= \Bigl( \sum_{i=1}^n \partial^{p,\psi_\alpha}_i \, (b_\alpha)^i{}_j \Bigr)_{j=1}^n \\
 &= \Bigl( \sum_{i=1}^n \sum_{k=1}^n \partial^{p,\psi_\beta}_k \, Z_{\beta\alpha}(p)^k{}_i \, (b_\alpha)^i{}_j \Bigr)_{j=1}^n \\
 &= \Bigl( \sum_{k=1}^n \partial^{p,\psi_\beta}_k \sum_{i=1}^n Z_{\beta\alpha}(p)^k{}_i \, (b_\alpha)^i{}_j \Bigr)_{j=1}^n \\
 &= \Bigl( \sum_{k=1}^n \partial^{p,\psi_\beta}_k \, (b_\beta)^k{}_j \Bigr)_{j=1}^n \\
 &= B_{p,b_\beta,\psi_\beta} .
\end{aligned}
$$

28.12.7 Remark: In contrast to the situation for tangent vector spaces, there are no algebraic operations on the sets of tangent $r$-frames and coordinate frames. This is because these sets of frames are not closed under simple vector addition. However, the right action of the group $G = GL(n)$ on the set of frames is of some interest. The group actions $\mu : P^r_p(M) \times G \to P^r_p(M)$ for $p \in M$ and $\mu : P^r(M) \times G \to P^r(M)$ turn out to be the actions of topological transformation groups. (See Section 16.8 for topological transformation groups.) When the set $P(M)$ is regarded as a fibre bundle over the base space $M$, it turns out that $(P(M), \pi, M)$ is a principal $G$-bundle.

Although the treatment of the tangent frame bundle is delayed until Section 35.9 so as to use the differentiable fibre bundle definitions of Chapter 35, it is possible to introduce here the differentiable manifold structure on the sets of tangent $r$-frames. This is presented in Definition 28.12.8.

28.12.8 Definition: The total (tangent) $r$-frame space of a $C^1$ $n$-dimensional manifold $(M, A_M)$ for $r \in \mathbb{Z}^+_0$ with $r \le \dim(M)$ is the tuple $(P^r(M), A_{P^r(M)})$, where
(i) $P^r(M) = \{ (p, z) \in M \times T(M)^r ;\ z \in T_p(M)^r$ and the vectors $(z_1, \ldots z_r)$ are linearly independent$\}$;
(ii) $A_{P^r(M)} = \{ \hat\psi ;\ \psi \in A_M \}$, where for any chart $\psi \in A_M$, the chart $\hat\psi : q^{-1}(\mathrm{Dom}(\psi)) \to \mathbb{R}^{n+rn}$ is defined by $\hat\psi : (p, z) \mapsto (\psi(p), v_1, \ldots v_r)$, where $z_j = t_{p,v_j,\psi}$ for $j = 1, \ldots r$ and $q : P^r(M) \to M$ satisfies $q : (p, z) \mapsto p$.
The map $q : P^r(M) \to M$ is called the projection map of the total tangent $r$-frame space. The total tangent $n$-frame space $P(M)$ −< $(P(M), A_{P(M)}) = (P^n(M), A_{P^n(M)})$ of the manifold $M$ is also called the total (tangent) coordinate frame space.

28.12.9 Remark: The sets and functions in Definition 28.12.8 are illustrated in Figure 28.12.1.

[Figure 28.12.1: Total tangent r-frame space. Diagram with the chart $\tilde\psi : q^{-1}(U) \to \mathbb{R}^{n+rn}$ for $q^{-1}(U) \subseteq P^r(M)$ in the top row, the chart $\psi : U \to \mathbb{R}^n$ for $U \subseteq M$ in the bottom row, and the projection $q$ from $q^{-1}(U)$ down to $U$.]
[ Define total tangent r-frame fibration, as for the tangent fibration. See Definition 35.9.1. ] [ Show that P (M ) is a topological principal fibre bundle, or at least that the operation Rg on P (M ) transforms as it should. ] [ Must also define T (P (M )) here for use in connections. ]
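The transformation rule for $n$-frames in Theorem 28.12.6 can be checked on a small case. The following sketch assumes $M = \mathbb{R}^2 \setminus \{0\}$ with $\psi_\alpha$ the Cartesian chart $(x, y)$ and $\psi_\beta$ the polar chart $(r, \theta)$ on a suitable open set, and it assumes that $Z_{\beta\alpha}(p)^k{}_i = \partial\psi_\beta^k / \partial\psi_\alpha^i$ as used in the proof. The choice of charts and of the point $p$ is purely illustrative.

```latex
% Sketch only (assumes amsmath). Charts and the point p = (0,2) are illustrative choices.
% psi_alpha = (x,y), psi_beta = (r,theta) with r = sqrt(x^2+y^2), theta = angle of (x,y).
\begin{align*}
  Z_{\beta\alpha}(p)
   &= \begin{pmatrix}
        \partial r/\partial x & \partial r/\partial y \\
        \partial\theta/\partial x & \partial\theta/\partial y
      \end{pmatrix}
    = \begin{pmatrix} x/r & y/r \\ -y/r^2 & x/r^2 \end{pmatrix}
    = \begin{pmatrix} 0 & 1 \\ -\tfrac12 & 0 \end{pmatrix}
      \quad\text{at } p = (0,2), \\
  b_\alpha &= I
    \quad\Longrightarrow\quad
  b_\beta = Z_{\beta\alpha}(p)\, b_\alpha
    = \begin{pmatrix} 0 & 1 \\ -\tfrac12 & 0 \end{pmatrix}, \\
  B_{p,b_\alpha,\psi_\alpha} &= (\partial_x,\ \partial_y)
    = \bigl( -\tfrac12\,\partial_\theta,\ \partial_r \bigr)
    = B_{p,b_\beta,\psi_\beta} .
\end{align*}
```

This agrees with the direct computation $\partial_\theta = -y\,\partial_x + x\,\partial_y = -2\,\partial_x$ and $\partial_r = (x\,\partial_x + y\,\partial_y)/r = \partial_y$ at $p = (0, 2)$.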
28.13. Tangent space constructions, attributes and relations

This section collects together various constructions, attributes and relations for tangent spaces which are of a general nature. They are generally lacking in any great theoretical interest. These things have been moved out of the earlier sections of this chapter because they tend to be a bit of a yawn.

28.13.1 Theorem: Let $M_1$ and $M_2$ be $C^1$ manifolds. Let $n_1 = \dim(M_1)$ and $n_2 = \dim(M_2)$. Let $p_1 \in M_1$ and $p_2 \in M_2$. Let $v_1 \in \mathbb{R}^{n_1}$ and $v_2 \in \mathbb{R}^{n_2}$. Let $\psi_1 \in \mathrm{atlas}_{p_1}(M_1)$ and $\psi_2 \in \mathrm{atlas}_{p_2}(M_2)$. Then the tangent operator $\partial_{(p_1,p_2),\,v_1\oplus v_2,\,\psi_1\oplus\psi_2} \in T_{(p_1,p_2)}(M_1 \times M_2)$ satisfies

$$ \partial_{(p_1,p_2),\,v_1\oplus v_2,\,\psi_1\oplus\psi_2}(f) = \sum_{i=1}^{n_1} v_1^i \, \frac{\partial}{\partial x_1^i} f \circ \psi_1^{-1}(x_1) \Big|_{x_1=\psi_1(p_1)} + \sum_{j=1}^{n_2} v_2^j \, \frac{\partial}{\partial x_2^j} f \circ \psi_2^{-1}(x_2) \Big|_{x_2=\psi_2(p_2)} $$

for all $f \in \mathring{C}^1_{(p_1,p_2)}(M_1 \times M_2, \mathbb{R})$. (Here $f \circ \psi_1^{-1}(x_1)$ is understood to mean $f(\psi_1^{-1}(x_1), p_2)$, and $f \circ \psi_2^{-1}(x_2)$ is understood to mean $f(p_1, \psi_2^{-1}(x_2))$.)

28.13.2 Notation: The tangent operator $\partial_{(p_1,p_2),\,v_1\oplus v_2,\,\psi_1\oplus\psi_2} \in T_{(p_1,p_2)}(M_1 \times M_2)$ will also be denoted by $\partial_{p_1,v_1,\psi_1} \oplus \partial_{p_2,v_2,\psi_2}$.

28.14. Unidirectional tangent bundles

A unidirectional tangent bundle looks like a cone at each point of the manifold whereas the usual definition of a tangent bundle looks like a tangent plane at each point. The natural invariant class of curves for unidirectionally differentiable manifolds is the class of curves with one-sided derivatives everywhere. The set of one-sided derivatives at a point constitutes a cone of tangent vectors.

The examples in Section 43.4 give some idea of the issues which arise for manifolds with Lipschitz transition maps. The Lipschitz (i.e. $C^{0,1}$) condition guarantees the existence of derivatives almost everywhere, but does not guarantee even the existence of unidirectional derivatives everywhere, as shown by Example 43.4.4. Existence of tangent vectors almost everywhere may be adequate in contexts where vector fields are to be integrated, for example. So the manifold classes $C^{k,1}$ for integer $k$ may be useful for some applications. In other applications, directional derivatives may be required. Such manifolds may arise from differential equations whose force functions have well-defined one-sided limits everywhere instead of full continuity.

28.14.1 Remark: Metadefinition 28.14.2 is the unidirectional version of Metadefinition 28.2.1. Unidirectionally differentiable homeomorphisms are defined in Section 19.6. Unidirectionally differentiable manifolds are defined in Section 27.10.
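The difference between a cone and a linear space of tangent vectors can be seen in one dimension. The sketch below assumes a hypothetical transition map $g = \psi_2 \circ \psi_1^{-1}$ on $\mathbb{R}$ which is piecewise linear with a kink at $0$. The map $g$ is an illustrative choice, and it is only assumed (not checked against Section 19.6) that it qualifies as a unidirectionally differentiable homeomorphism.

```latex
% Sketch only (assumes amsmath). g is a hypothetical transition map, not from the text.
\begin{align*}
  g(x) &= \begin{cases} x & \text{if } x \ge 0, \\ 2x & \text{if } x < 0, \end{cases}
  \qquad
  \partial_a^+ g(0 + a v) = \lim_{a \to 0^+} \frac{g(a v) - g(0)}{a}
   = \begin{cases} v & \text{if } v \ge 0, \\ 2v & \text{if } v < 0. \end{cases}
\end{align*}
% The coordinate map v -> d_a^+ g(av) is positively homogeneous:
%   lambda v -> lambda * (image of v) for lambda >= 0,
% but it is not linear: 1 -> 1 whereas -1 -> -2, not -1.
```

Thus the coordinates of one-sided tangent vectors transform by a positively homogeneous map which need not be linear, which is why only a cone structure, and not a linear space structure, is chart-independent in this setting.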
[ See Remark 19.6.6 for the $\partial_a^+$ notation. It is probably a pseudo-notation. ]

28.14.2 Metadefinition: A unidirectional tangent bundle for an $n$-dimensional unidirectionally differentiable manifold $(M, A_M)$ must provide a tuple $(T, \pi, \hat{A}_T, \Phi)$ −> $(T, \hat{A}_T)$ −> $T$ which satisfies the following conditions.
(i) $\pi : T \to M$ is a surjective map.
(ii) $\Phi : A_M \to \hat{A}_T$ is a bijection.
(iii) $\forall \psi \in A_M,\ \Phi(\psi) : \pi^{-1}(\mathrm{Dom}(\psi)) \to \mathbb{R}^n$.
(iv) $\forall p \in M,\ \forall \psi \in \mathrm{atlas}_p(M),\ \Phi(\psi)\big|_{\pi^{-1}(\{p\})} : \pi^{-1}(\{p\}) \to \mathbb{R}^n$ is a bijection.
(v) $\forall p \in M,\ \forall \psi_1, \psi_2 \in \mathrm{atlas}_p(M),\ \forall V \in \pi^{-1}(\{p\})$,

$$ \Phi(\psi_2)(V) = \partial_a^+ \psi_2(\psi_1^{-1}(x + av)) \Big|_{x=\psi_1(p),\ v=\Phi(\psi_1)(V)} . \qquad (28.14.1) $$

$T$ is called the total space of the unidirectional tangent bundle.
An element of $T$ is called a unidirectional tangent vector.
$\pi$ is called the projection map of the unidirectional tangent bundle.
$\hat{A}_T$ is called the unidirectional tangent bundle atlas of the unidirectional tangent bundle.
The maps $\phi_\psi \in \hat{A}_T$ are called unidirectional tangent bundle charts.
$\Phi$ is called the lift function of the unidirectional tangent bundle.
A unidirectional tangent vector at $p \in M$ is any element of $\pi^{-1}(\{p\})$.
The unidirectional tangent vector at $p \in M$ with coordinates $v \in \mathbb{R}^n$ with respect to chart $\psi \in A_M$ is the unidirectional tangent vector $\bigl(\Phi(\psi)\big|_{\pi^{-1}(\{p\})}\bigr)^{-1}(v) \in T$.

28.14.3 Remark: Equation (28.14.1) may be written more fully as follows.
$$
\begin{aligned}
\Phi(\psi_2)(V)
 &= \lim_{a\to 0^+} \frac{\psi_2(\psi_1^{-1}(x + av)) - \psi_2(\psi_1^{-1}(x))}{a} \Big|_{x=\psi_1(p),\ v=\Phi(\psi_1)(V)} \\
 &= \lim_{a\to 0^+} \frac{\psi_2(\psi_1^{-1}(\psi_1(p) + a\Phi(\psi_1)(V))) - \psi_2(p)}{a} \\
 &= \partial_a^+ \psi_2(\psi_1^{-1}(\psi_1(p) + a\Phi(\psi_1)(V))).
\end{aligned}
$$

Apart from condition (v), Metadefinition 28.14.2 is the same as Metadefinition 28.2.1. If the manifold $M$ is $C^1$, the metadefinitions are the same.

28.14.4 Remark: It seems fairly clear that any two representations of Metadefinition 28.14.2 must be related by a unique isomorphism. This is required if the metadefinition is to be a complete characterization of unidirectional tangent bundles.

[ Provide a theorem to confirm the unique isomorphism property. ]

[ Here might be a good place to insert a section which generalizes tangent vectors to $C^{0,1}$ manifolds, $C^{0,\alpha}$ manifolds, rectifiable manifolds, and manifolds which are simply differentiable (as opposed to continuously differentiable). For $C^{0,1}$ (Lipschitz) manifolds, the manifold would be differentiable almost everywhere. So the tangent bundle would presumably be an almost-everywhere tangent bundle. This suggests possibly using distributions instead of functions. $C^{0,1}$ and $W^{1,\infty}$ regularity are essentially equivalent. This suggests a motivation for using Sobolev spaces and weakly differentiable functions. ]

28.15. Distributions as representations of tangent bundles

[ For completeness, there should be a section on distributions as a framework for defining tangent vectors as in Remark 28.1.1, case (v). This could be useful for partial differential equations and field theories. It would be interesting to see how the higher derivatives of delta functions are related to higher order tangent vectors, tensors and differentials on a differentiable manifold. Maybe a whole chapter would be needed for this. Manifolds with Sobolev space regularity could be combined with the distribution representations. This means that distributions must be fully defined, probably somewhere near Section 20.10. ]
28.16. Tangent bundles on infinite-dimensional manifolds [ Maybe should present infinite-dimensional manifolds some day. Infinite dimensional manifolds have relevance to calculus of variations, Jacobi fields on geodesics, motion of manifolds by mean curvature, vibrations of strings and membranes and other applications. For example, the infinitesimal variations of a curve or membrane may be regarded as elements of a tangent space at a point in the space of curves or membranes. ]
28.16.1 Remark: If infinite-dimensional manifolds were within the scope of this book, it would be important to choose the definition of tangent vectors to maximize the range of possible manifolds on which tangent vectors can be defined. An example of a very infinite-dimensional manifold is the set M of all C 1 diffeomorphisms of a given manifold S. In this case, the tangent vectors of M would be the vector fields generated on S by families of diffeomorphisms. Such a family is a differentiable curve in M . This suggests that the most general form of tangent vector definition might be the set of all derivatives γ ′ (0) of curves γ : IR → M . This still raises the question of what space the derivatives γ ′ (0) should live in. This could be some sort of Hilbert space, Banach space or topological vector space. It is best to use differential operators for motivation only; then use coordinates for the formal definition.
Chapter 29
Tensor bundles and tensor fields on manifolds

29.1 Contravariant tensors and tensor spaces
29.2 Cotangent vectors and cotangent spaces
29.3 Covariant and mixed tensors
29.4 Double tangent spaces
29.5 Vector fields
29.6 Tangent operator fields
29.7 Tensor fields
29.8 Vector fields and tensor fields along curves
29.9 Differential forms
29.0.1 Remark: This chapter presents covariant vectors, tensors, vector fields, tensor fields and differential forms. Differentiable manifolds were defined in Chapter 27. Several kinds of tangent spaces were defined in Chapter 28.

[ The following table will give an overview of tangent spaces, vector fields and differentials. It isn’t ready to look at yet. Tangent space building principles are summarized in Section 27.13. ]

type | TS | TTS | vector fields | PD | GD | references
-----|----|-----|---------------|----|----|-----------
real function | $C^k_p(M)$ | $C^k(M)$ | | $f(p)$ | $f$ | 27.5
manifold map | $C^k_p(M_1, M_2)$ | $C^k(M_1, M_2)$ | | $\phi(p)$ | $\phi$ | 27.8
tangent | $T_p(M)$ | $T(M)$ | $X^r(M)$ | $\gamma'(t)$ | $\gamma'$ | 28.7, 28.8, 29.5
tangent operator | $\mathring{T}_p(M)$ | $\mathring{T}(M)$ | | $\gamma'(t)$ | $\gamma'$ |
tagged tangent operator | $\hat{T}_p(M)$ | $\hat{T}(M)$ | $X^r(\hat{T}(M))$ | $\gamma'(t)$ | $\gamma'$ |
tangent frame | $P_p(M)$ | $P(M)$ | $X^r(P(M))$ | | |
cotangent | $T^*_p(M)$ | $T^*(M)$ | $X^r(T^*(M))$ | $(df)_p$ | $df$ | 29.2
type (r, s) tensor | $T^{r,s}_p(M)$ | $T^{r,s}(M)$ | $X^r(T^{r,s}(M))$ | | | 29.3, 29.7
map cotangent | $T^*_p(M_1, M_2)$ | $T^*(M_1, M_2)$ | $X^r(T^*(M_1, M_2))$ | | | 29.4
map tangent | $T_p(M_1, M_2)$ | $T(M_1, M_2)$ | $X^r(T(M_1, M_2))$ | $(d\phi)_p$ | $d\phi$ | 29.4
order-k tangent | $T^{[k]}_p(M)$ | $T^{[k]}(M)$ | $X^r(T^{[k]}(M))$ | $(d\phi)_p$ | $d\phi$ | 30.7
order-k tangent operator | $\mathring{T}^{[k]}_p(M)$ | $\mathring{T}^{[k]}(M)$ | | $(d\phi)_p$ | $d\phi$ |
tagged order-k tang. op. | $\hat{T}^{[k]}_p(M)$ | $\hat{T}^{[k]}(M)$ | $X^r(\hat{T}^{[k]}(M))$ | $(d\phi)_p$ | $d\phi$ | 29.5

[ Missing from this table are semi-groups like $\phi : \mathbb{R} \times M \to M$, which generate vector fields, and higher-order differentials like $d^2 f$ and $d^2\phi$. Another missing class is curves and families of curves. Families of diffeomorphisms are related to curves. Could also present multi-parameter families of differentiable maps. Such maps are not necessarily diffeomorphisms. ]

29.0.2 Remark: As in the case of tangent bundles, it is important to remember that tensor bundles are not defined as special cases of differentiable fibre bundles. A tensor bundle does not have the differentiable structure group of the fully-defined differentiable fibre bundles which are introduced in Chapter 35. The rough interrelationships are sketched in Figure 29.0.1.
[Figure 29.0.1: Family tree for fibre bundles and tensor bundles. Nodes: non-topological fibre bundle, topological fibre bundle, tangent bundle, tensor bundle, differentiable fibre bundle, differentiable tangent fibre bundle, differentiable tensor fibre bundle.]
29.1. Contravariant tensors and tensor spaces Contravariant tensors are constructed from ordinary tangent spaces.
29.2. Cotangent vectors and cotangent spaces Cotangent spaces are defined to give differentials a “place to live”. Differentials are linear functionals on the space of tangent vectors. The properties of linear functionals on a finite-dimensional linear space are well-known. (See Section 10.5.) In particular, the dual space has the same dimension as the original space. (See EDM2 [34], 256.G, for dual spaces.)
29.2.1 Remark: There are many ways to define cotangents on differentiable manifolds. Cotangents at a point $p$ in a manifold $M$ may be defined as the dual of the linear space $T_p(M)$ of tangent vectors at $p$. This dual can be defined either (1) in terms of the standard (induced) linear structure on $T_p(M)$ or (2) as cotangent triple equivalence classes $t^*_{p,w,\psi} = [(p, w, \psi)]$ defined so that $t^*_{p,w,\psi} . t_{p,v,\psi} = w_i v^i$ for $t_{p,v,\psi} \in T_p(M)$. In case (1), the cotangent vectors are linear maps from $T_p(M)$ to $\mathbb{R}$.

Cotangents can also be defined in terms of flat-space cotangents just as tangent vectors are. Thus a cotangent would be a triple $(p, w, \psi)$ where $w \in \mathbb{R}^n$ is the sequence of components of a cotangent at $\psi(p) \in \mathbb{R}^n$ for manifold charts $\psi \in \mathrm{atlas}_p(M)$.

Another way to define a cotangent vector at $p \in M$ would be as an equivalence class of functions $f \in C^1(M)$ which have the same gradient. Thus two functions $f, g \in C^1(M)$ would be considered equivalent if $D_V f = D_V g$ for all $V \in T_p(M)$. (Notation 28.5.7 defines $D_V$.) A cotangent would then be an equivalence class $[f]$ of $f \in C^1(M)$ with respect to this equivalence relation. This construction for cotangent vectors is an analogue of the construction of contravariant vectors on manifolds as equivalence classes of curves with the same derivative. (See Remark 28.1.1, paragraphs (iii) and (iv).)

Cotangent vectors can also be defined as maps from the set of $C^1$ curves $\gamma : \mathbb{R} \to M$ with $\gamma(0) = p$ to the real numbers. This is a dual of the curve-equivalence-class representation of tangent vectors. (See Remark 28.1.1 (iii).) Every function $f \in C^1(M)$ corresponds to such a cotangent vector. The map $\gamma \mapsto \partial_t (f(\gamma(t)))\big|_{t=0}$ has the same value for all curves $\gamma$ in a curve equivalence class at $p \in M$.

Just as in the case of (contravariant) tangent vectors, one really wants to be able to use all representations of cotangent vectors interchangeably according to the application du jour. But for practical reasons, it is necessary to choose one of the representations as the standard and derive all of the others from the standard version. This book tries to use the “economy principle” to help choose the best standard representation of each class of object. This eliminates some of the more artistically pleasing representations involving curves, smooth functions and linear functionals. The most economical representation is in terms of equivalence classes $[(p, w, \psi)]$ for $w \in \mathbb{R}^n$ and charts $\psi \in \mathrm{atlas}_p(M)$. These are the same kinds of triples as for tangent vectors, but the computation rules are different.

[ For better consistency and more efficient calculations, should use covectors of the form $t^*_{p,w,\psi} = [(p, w, \psi)]$ for $T^*(M) \equiv T^{0,1}(M)$, where $p \in M$, $\psi \in A_{M,p}$, $w \in (\mathbb{R}^n)^* = \mathrm{Lin}(\mathbb{R}^n, \mathbb{R})$. Then define $\mu : T^{0,1}(M) \times T^{1,0}(M) \to T^{0,0}(M)$ by $\mu : ([(p, w, \psi)], [(p, v, \psi)]) \mapsto [(p, w(v), \psi)]$. But $T^{0,0}_p(M)$ is equivalent to $\mathbb{R}$. So $[(p, w(v), \psi)] \equiv w(v) \in \mathbb{R}$. Then for general tensors, likewise use $[(p, K, \psi)]$ with $K$ a flat-space tensor in $\otimes^{r,s}\, \mathbb{R}^n$. Really need $\mu : T^{0,1}(M) \times T^{1,0}(M) \to \mathbb{R}$. Is there any real use for $T^{0,0}(M)$ with real numbers attached at points of $M$? ]

[ See Crampin/Pirani [11], page 76, regarding “covector fields and the Lie derivative”. On page 37, they call cotangent vectors “covectors”. ]

29.2.2 Definition: The cotangent (vector) space of a $C^1$ manifold $M$ at a point $p \in M$ is the dual linear space $\mathrm{Lin}(T_p(M), \mathbb{R})$ of $T_p(M)$, denoted as $T^*_p(M)$. Any element of $T^*_p(M)$ is called a cotangent (vector) or covector of $M$ at $p$.

29.2.3 Remark: The pointwise cotangent space $T^*_p(M)$ in Definition 29.2.2 is strictly speaking an abbreviation for the tuple $(\mathbb{R}, T^*_p(M), \sigma_{\mathbb{R}}, \tau_{\mathbb{R}}, \sigma, \mu)$, where $\sigma_{\mathbb{R}}$ and $\tau_{\mathbb{R}}$ are respectively the addition and multiplication operations on $\mathbb{R}$, $\sigma$ is the addition operation on $T^*_p(M)$, and $\mu$ is the scalar multiplication operation of $\mathbb{R}$ on $T^*_p(M)$. (See Definition 10.1.2 for linear spaces.) It is usual to assume that the space $T^*_p(M)$ has its standard topology, but not the standard metric, inner product or norm, because these are not invariant under chart transitions.

[ Near here, define $\mathring{T}^*_p(M)$ as the dual of $\mathring{T}_p(M)$ (or otherwise). The elements of $\mathring{T}^*_p(M)$ should map operators $\partial_{p,v,\psi}$ to $\mathbb{R}$. ]

[ Probably should make the following remark into a formal definition. ]

29.2.4 Remark: Cotangents may be coordinatized in terms of coordinates on the tangent space $T_p(M)$. If $\psi$ is a chart for $M$ at $p$, and $\omega \in T^*_p(M)$, define $\Psi : T^*_p(M) \to \mathbb{R}^n$ by $\Psi : \omega \mapsto \bigl(\omega(t_{p,e_i,\psi})\bigr)_{i=1}^n$, where for each $i = 1 \ldots n$, $e_i \in \mathbb{R}^n$ has components $(e_i)^j = \delta_i^j$. If $w = \Psi(\omega)$, then $w_i = \omega(t_{p,e_i,\psi})$ for $i = 1 \ldots n$, and $\omega(V) = \sum_{i=1}^n w_i v^i$ for all $V = t_{p,v,\psi} \in T_p(M)$.
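The coordinatization in Remark 29.2.4 can be illustrated by a small worked case. The sketch assumes $M = \mathbb{R}^2$ with the identity chart $\psi$, and the function $f$, the point $p$ and the vector $v$ are arbitrary illustrative choices. It compares the component formula $\omega(V) = \sum_i w_i v^i$ with the curve representation $\gamma \mapsto \partial_t f(\gamma(t))|_{t=0}$ mentioned in Remark 29.2.1.

```latex
% Sketch only (assumes amsmath). f, p and v are hypothetical choices for illustration.
\begin{align*}
  f(x,y) &= x^2 y, \qquad p = (1,2), \qquad \omega = (df)_p, \\
  w &= \Psi(\omega) = \bigl( \partial_x f(p),\ \partial_y f(p) \bigr) = (2xy,\ x^2)\big|_{(1,2)} = (4,\ 1), \\
  \omega(V) &= \sum_{i=1}^2 w_i v^i = 4 \cdot 3 + 1 \cdot (-1) = 11
     \qquad\text{for } V = t_{p,v,\psi},\ v = (3,-1), \\
  \partial_t f(\gamma(t))\big|_{t=0}
   &= \partial_t \bigl( (1+3t)^2 (2-t) \bigr)\big|_{t=0} = 6 \cdot 2 - 1 = 11
     \qquad\text{for } \gamma(t) = p + t v .
\end{align*}
```

Both representations give the same number, as they must if they are to be used interchangeably.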
29.2.5 Definition: A coordinate basis covector at a point $p \in M$ for a chart $\psi \in A_M$ in a $C^1$ manifold $(M, A_M)$ with $n = \dim(M)$ is the vector $e^i_{p,\psi} \in T^*_p(M)$ defined for $i = 1 \ldots n$ by $e^i_{p,\psi}(t_{p,v,\psi}) = v^i$ for all $v \in \mathbb{R}^n$.

29.2.6 Remark: The unit basis cotangent vectors $e^i_{p,\psi}$ in Definition 29.2.5 are equivalent to the differentials $(d\psi^i)_p \in \mathring{T}^*_p(M)$. One could also use the simplified notation $d^i_{p,\psi}$ or $d^i$. Then the identity $d^i_{p,\psi}(e^{p,\psi}_j) = \delta^i_j$ holds in terms of the unit vector notation $e^{p,\psi}_j$ in Definition 28.7.4. (See also Remark 31.2.8 for similar comments on notation.) The cotangent vectors $(d\psi^i)_p$ may also be written as $d\psi^i(p)$.

29.2.7 Theorem: The unit vectors $d^i_{p,\psi} = (d\psi^i)_p$ for $i \in \mathbb{N}_n$ at $p \in M$ on an $n$-dimensional $C^1$ manifold $M$ satisfy the transformation rule:

$$ d^i_{p,\tilde\psi} = \sum_{j=1}^n \frac{\partial \tilde\psi^i(\psi^{-1}(x))}{\partial x^j} \Big|_{x=\psi(p)} \, d^j_{p,\psi} = \sum_{j=1}^n \frac{\partial \tilde\psi^i}{\partial \psi^j} \, d^j_{p,\psi} $$

for charts $\psi, \tilde\psi \in \mathrm{atlas}_p(M)$.

29.2.8 Definition: The total cotangent space of a $C^1$ $n$-dimensional manifold $M$ −< $(M, A_M)$ is the $C^0$ manifold $T^*(M)$ −< $(T^*(M), A_{T^*(M)})$, where
(i) $T^*(M) = \bigcup_{p\in M} T^*_p(M)$, and
(ii) $A_{T^*(M)} = \{ \hat\psi^* ;\ \psi \in A_M \}$, where for any chart $\psi \in A_M$, the chart $\hat\psi^* : (\pi^*)^{-1}(\mathrm{Dom}(\psi)) \to \mathbb{R}^{2n}$ is defined by $\hat\psi^* : \omega \mapsto \bigl(\psi(p), (\omega(t_{p,e_i,\psi}))_{i=1}^n\bigr)$ with $p = \pi^*(\omega)$, where $\pi^* : T^*(M) \to M$ is the projection map for $T^*(M)$, and $e_i$ denotes the $i$th unit vector of $\mathbb{R}^n$ for $i = 1, \ldots n$.

29.2.9 Remark: The charts and projection maps for Definition 29.2.8 are illustrated in Figure 29.2.1.
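[Figure 29.2.1: Charts and projection maps for tangent and cotangent spaces. Diagram with the chart $\tilde\psi : T(M) \to \mathbb{R}^{2n}$ and projection $\pi : T(M) \to M$ above, the chart $\psi : M \to \mathbb{R}^n$ in the middle, and the projection $\pi^* : T^*(M) \to M$ and chart $\tilde\psi^* : T^*(M) \to \mathbb{R}^{2n}$ below.]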
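The transformation rule for coordinate basis covectors can be checked in a simple case. The sketch below assumes the standard rule $(d\tilde\psi^i)_p = \sum_j (\partial\tilde\psi^i / \partial\psi^j)\,(d\psi^j)_p$ on $M = \mathbb{R}^2 \setminus \{0\}$, with $\psi$ the Cartesian chart $(x, y)$ and $\tilde\psi$ the polar chart $(r, \theta)$. The charts and the point $p$ are illustrative choices.

```latex
% Sketch only (assumes amsmath). Charts and the point p = (0,2) are illustrative choices.
\begin{align*}
  (dr)_p &= \frac{x}{r}\,(dx)_p + \frac{y}{r}\,(dy)_p = (dy)_p ,
  &
  (d\theta)_p &= -\frac{y}{r^2}\,(dx)_p + \frac{x}{r^2}\,(dy)_p = -\tfrac12\,(dx)_p
  && \text{at } p = (0,2), \\
  (dr)_p(\partial_r) &= (dy)_p(\partial_y) = 1 ,
  &
  (d\theta)_p(\partial_\theta) &= -\tfrac12\,(dx)_p(-2\,\partial_x) = 1
  && \text{using } \partial_r = \partial_y,\ \partial_\theta = -2\,\partial_x .
\end{align*}
```

The covectors transform with the Jacobian matrix of $\tilde\psi \circ \psi^{-1}$, whereas the coordinate basis vectors transform with its inverse, so the duality identity $d^i(e_j) = \delta^i_j$ holds in both charts.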
29.3. Covariant and mixed tensors

The space $T^*(M)$ is defined in Section 31.2.

[ Must define the full topological and differentiable structures for all tensor spaces in this section. ]

29.3.1 Remark: In Notation 29.3.2, it would probably be better to define $T^{r,s}_p(M) = \otimes^{r,s}\, T_p(M)$. In Remark 13.7.4, it was mentioned that there is a difficulty here in notating the fully general ordering of mixed tensors. One could use a notation such as $T^{1,-1,1,1}_p(M) = T_p(M) \otimes T^*_p(M) \otimes T_p(M) \otimes T_p(M)$. More generally, $T^J_p(M) = \otimes_{i=1}^n V_i$, where $J \in \{-1, 1\}^n$, and for $i = 1, \ldots n$, $V_i = T_p(M)$ if $J(i) = 1$, and $V_i = T^*_p(M)$ if $J(i) = -1$. There doesn’t seem to be any standard notation such as this.

[ The tensor spaces in Notation 29.3.2 are constructed pointwise from multilinear functions of the pointwise tangent vector spaces $T_p(M)$. It could be simpler to re-use the flat-space definitions as in $[(p, K, \psi)]$. ]

29.3.2 Notation: $T^{r,s}_p(M)$ denotes the tensor product of $r$ copies of the tangent space $T_p(M)$ of the $C^1$ manifold $M$ at the point $p \in M$, and $s$ copies of the dual space $T^*_p(M)$:

$$ T^{r,s}_p(M) = \Bigl( \bigotimes_{i=1}^r T_p(M) \Bigr) \otimes \Bigl( \bigotimes_{j=1}^s T^*_p(M) \Bigr), $$

where $\otimes$ denotes the tensor product of linear spaces. (See Section 13.11.)

[ Define coordinates for $T^{r,s}_p(M)$. ]

29.3.3 Notation: $T^{r,s}(M)$ denotes the set $\bigcup_{p\in M} T^{r,s}_p(M)$.

[ Define the atlas for $T^{r,s}(M)$. ]

29.3.4 Definition: A tensor of type $(r, s)$ at a point $p$ in a $C^1$ manifold $M$ is any element of $T^{r,s}_p(M)$.

[ Here the definitions of tensor addition and product should be imported from Chapter 13. ]

29.3.5 Theorem: For any tensor $K$ of type $(r, s)$ at a point $p$ in a $C^1$ manifold $M$, there are unique numbers $(K^i_j)_{i\in\mathbb{N}_n^r,\, j\in\mathbb{N}_n^s}$ such that

$$ K = \sum_{i\in\mathbb{N}_n^r,\ j\in\mathbb{N}_n^s} K^i_j \, \Bigl( \bigotimes_{k=1}^r \partial^{p,\psi}_{i_k} \Bigr) \otimes \Bigl( \bigotimes_{\ell=1}^s (dx^{j_\ell})_{p,\psi} \Bigr). $$

[ This theorem has to be tidied up a bit to take care of the chart dependence properly. Are the $dx^{j_\ell}$ differentials of real-valued functions? ]

29.3.6 Definition: The numbers $(K^i_j)_{i\in\mathbb{N}_n^r,\, j\in\mathbb{N}_n^s}$ in Theorem 29.3.5 are the components of the tensor $K$ with respect to the chart $\psi$.
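The component expansion in Theorem 29.3.5 may be made concrete with a small case. The sketch below assumes $n = 2$ and type $(r, s) = (1, 1)$, writes $\partial_i$ for $\partial^{p,\psi}_i$ and $dx^j$ for $(dx^j)_{p,\psi}$, and uses an illustrative tensor $K$ which does not appear in the text. It also assumes the usual identification of $T_p(M) \otimes T^*_p(M)$ with bilinear functions on $T^*_p(M) \times T_p(M)$.

```latex
% Sketch only (assumes amsmath). K is a hypothetical example tensor of type (1,1), n = 2.
\begin{align*}
  K &= \partial_1 \otimes dx^2 - \partial_2 \otimes dx^1
     = \sum_{i=1}^2 \sum_{j=1}^2 K^i_j \, \partial_i \otimes dx^j ,
  \qquad
  (K^i_j) = \begin{pmatrix} K^1_1 & K^1_2 \\ K^2_1 & K^2_2 \end{pmatrix}
          = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \\
  K(\omega, V) &= \sum_{i,j} K^i_j \, \omega(\partial_i) \, dx^j(V)
     = \omega_1 v^2 - \omega_2 v^1 .
\end{align*}
```

In general a tensor of type $(r, s)$ at a point of an $n$-dimensional manifold has $n^{r+s}$ components, which is $2^2 = 4$ here.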
29.3.7 Theorem: [ The coordinate transformation rule for tensors. ] [ Define tensor bundles in two ways: as vector bundles of tensors and as the tensor product of vector bundles? ] 29.3.8 Definition: [Definition of tensor bundle. Define the atlas for T r,s (M ).] [ Must do “double fibre bundles” in fibre.tex and lie.tex. ]
29.4. Double tangent spaces

This section deals with tangent spaces such as $T(M_1, M_2)$ and $T^*(M_1, M_2)$ for $C^1$ manifolds $M_1$ and $M_2$. Such spaces are useful for representing differentials of maps between manifolds.

[ Must define canonical atlases for these double tangent spaces. Show how the special cases $M_1 = \mathbb{R}$ or $M_2 = \mathbb{R}$ lead to spaces $T(M)$ and $T^*(M)$ respectively. Thus $T^*(M)$ is associated with real-valued functions in the same way that $T(M)$ is associated with differentiable curves. Discuss the fact that $T_p(\mathbb{R})$ can be identified with $\mathbb{R}$ for all $p \in \mathbb{R}$ because $\mathbb{R}$ has an absolute parallelism. Therefore $T_p(M)$ may be identified with $T_0(M)$, and so forth. ]

29.4.1 Definition: The (pointwise) double tangent space of a pair of $C^1$ manifolds $M_1$ and $M_2$ at points $p_1 \in M_1$ and $p_2 \in M_2$ is the linear space $T_{p_1,p_2}(M_1, M_2) = \mathrm{Hom}(T_{p_1}(M_1), T_{p_2}(M_2))$ of linear homomorphisms between the tangent spaces $T_{p_1}(M_1)$ and $T_{p_2}(M_2)$.

29.4.2 Remark: It is quite easy to define coordinates for spaces $T_{p_1,p_2}(M_1, M_2) = \mathrm{Hom}(T_{p_1}(M_1), T_{p_2}(M_2))$ in Definition 29.4.1. Let $\psi_1 \in \mathrm{atlas}_{p_1}(M_1)$ and $\psi_2 \in \mathrm{atlas}_{p_2}(M_2)$. Basis vectors $e_i^{p_1,\psi_1} = t_{p_1,e_i,\psi_1} \in T_{p_1}(M_1)$ and $e_j^{p_2,\psi_2} = t_{p_2,e_j,\psi_2} \in T_{p_2}(M_2)$ for $i = 1, \ldots, n_1 = \dim(M_1)$ and $j = 1, \ldots, n_2 = \dim(M_2)$ are provided by Definition 28.7.4. The basis covectors $e^i_{p_1,\psi_1} \in T^*_{p_1}(M_1)$ are defined in Definition 29.2.5. Basis vectors $e^{p_2,\psi_2,i}_{p_1,\psi_1,j} \in \mathrm{Hom}(T_{p_1}(M_1), T_{p_2}(M_2))$ may be defined so that for all $V = \sum_{k=1}^{n_1} v^k e_k^{p_1,\psi_1} \in T_{p_1}(M_1)$, $e^{p_2,\psi_2,i}_{p_1,\psi_1,j}(V) = v^i e_j^{p_2,\psi_2}$. A mnemonic for this is $e^i_j = e_j \cdot e^i$, where the operation “$\cdot$” represents the scalar multiplication on $T_{p_2}(M_2)$. Thus $e^i_j(V) = e_j \cdot e^i(V)$ for $V \in T_{p_1}(M_1)$. [ This implies $\dim(T_{p_1,p_2}(M_1, M_2)) = n_1 n_2$? ]

[ Define components $c^j_i$ so that $L = c^j_i e^i_j$ for $L \in \mathrm{Hom}(\ldots, \ldots)$, or something like that. Relate this to Jacobian matrices as in Definition 31.3.15? ]

29.4.3 Definition: The total double tangent space of a pair of $C^1$ manifolds $M_1$ and $M_2$ is the space $T(M_1, M_2) = \bigcup_{p_1 \in M_1, p_2 \in M_2} T_{p_1,p_2}(M_1, M_2)$ of all linear homomorphisms between tangent spaces $T_{p_1}(M_1)$ and $T_{p_2}(M_2)$.

[ Must also define an atlas and fibration structure etc. in Definition 29.4.3? Is this a fibration over $M_1 \times M_2$? Maybe also a fibration over $M_1$ and $M_2$? $\pi_j : T(M_1, M_2) \to M_j$ etc. ] [ Define an atlas for $T(M_1, M_2)$ and define tangent space $T(T(M_1, M_2))$. Atlas $\hat\psi(\ldots) = (\psi_1(p_1), \psi_2(p_2), \ldots)$. Bring in a Jacobian matrix. ]

29.4.4 Remark: Corresponding to the double tangent spaces are the double cotangent spaces. Clearly $T^*(M_1, M_2) = T(M_2, M_1)$ for any $C^1$ manifolds $M_1$ and $M_2$.

29.4.5 Definition: The (pointwise) double cotangent space of a pair of $C^1$ manifolds $M_1$ and $M_2$ at points $p_1 \in M_1$ and $p_2 \in M_2$ is the space $T^*_{p_1,p_2}(M_1, M_2) = \mathrm{Hom}(T_{p_2}(M_2), T_{p_1}(M_1))$ of linear homomorphisms between the tangent spaces $T_{p_2}(M_2)$ and $T_{p_1}(M_1)$. [ Must also define an atlas and linear space operations etc. ]

29.4.6 Definition: The total double cotangent space of a pair of $C^1$ manifolds $M_1$ and $M_2$ is the space $T^*(M_1, M_2) = \bigcup_{p_1 \in M_1, p_2 \in M_2} T^*_{p_1,p_2}(M_1, M_2)$ of all linear homomorphisms between tangent spaces $T_{p_2}(M_2)$ and $T_{p_1}(M_1)$.

[ Must also define an atlas and fibration structure etc. Or basis and coordinates? ]
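The bracketed dimension question in Remark 29.4.2 can be checked with a short calculation. The following is only a sketch: it uses the maps $e^i_j$ of that remark with the chart labels suppressed, and the coefficient symbol $c^j_i$ (with this index placement) is an assumption borrowed from the author's note about components.

```latex
% Sketch: expansion of an arbitrary L in Hom(T_{p_1}(M_1), T_{p_2}(M_2))
% in terms of the maps e^i_j(V) = v^i e_j of Remark 29.4.2.
% The coefficients c^j_i are illustrative notation, defined by the first line.
\begin{align*}
  L(e_i) &= \sum_{j=1}^{n_2} c^j_i \, e_j
      && \text{(expand $L(e_i)$ in the basis of $T_{p_2}(M_2)$)} \\
  L(V) &= \sum_{i=1}^{n_1} v^i L(e_i)
        = \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} c^j_i \, v^i e_j
        = \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} c^j_i \, e^i_j(V)
      && \text{(linearity of $L$)}
\end{align*}
% Hence L = \sum_{i,j} c^j_i e^i_j. The coefficients are unique: applying
% \sum_{i,j} c^j_i e^i_j = 0 to V = e_k gives \sum_j c^j_k e_j = 0,
% so c^j_k = 0 for all j and k.
```

Under these assumptions the $n_1 n_2$ maps $e^i_j$ span the space and are linearly independent, which suggests that the answer to the bracketed question is yes.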
29. Tensor bundles and tensor fields on manifolds
29.5. Vector fields

Vector fields on manifolds are vector-valued functions. The value of a vector field at any point of a manifold is a tangent vector at that point. The symbol X is often used to denote vector fields and spaces of vector fields. Since vector fields may be interpreted as cross-sections of tangent bundles, a useful mnemonic for the letter X is the word “cross”. Figure 29.5.1 is an artist’s impression of a vector field.

[ Figure 29.5.1: A vector field on a manifold, showing the value $X(p) \in T_p(M)$ at a point $p$. ]
29.5.1 Definition: A (tangent) vector field on a $C^1$ manifold $M$ is a function $X : M \to T(M)$ such that $X(p) \in T_p(M)$ for all $p \in M$.
A (tangent) vector field on a subset $S \subseteq M$ of a $C^1$ manifold $M$ is a function $X : S \to T(M)$ such that $X(p) \in T_p(M)$ for all $p \in S$.

29.5.2 Remark: A vector field on a subset $S$ of a manifold $M$ in Definition 29.5.1 is the same thing as a vector field on the manifold $S$ with the relative differentiable structure. (See Definition 27.4.12 for relative differentiable structures.) It is generally assumed that definitions for whole manifolds also apply to open subsets with the relative differentiable structure.

29.5.3 Remark: The atlas $A_{T(M)}$ of the total tangent space $T(M)$ of a $C^1$ manifold $M$ is defined in Definition 28.8.1. A chart $\hat\psi \in A_{T(M)}$ is associated with each chart $\psi \in A_M$. By Theorem 28.8.13, $(T(M), A_{T(M)})$ is a $C^{r-1}$ manifold if $(M, A_M)$ is a $C^r$ manifold with $r \ge 1$.

29.5.4 Remark: It follows from Definition 27.8.1 and Theorem 28.8.13 that a vector field $X$ is of class $C^r$ on a $C^{r+1}$ manifold $M$ for $r \in \bar{\mathbb{Z}}^+_0$ if and only if the map $\hat\psi \circ X \circ \psi^{-1}$ is of class $C^r$ for all charts $\psi \in \mathrm{atlas}(M)$. The function $\hat\psi \circ X \circ \psi^{-1}$ may be tested for regularity in terms of a coordinate representation of the vector field $X$. Suppose $\psi \in A_M$ and $X(p) = t_{p,\xi(\psi(p)),\psi}$ for $p \in \mathrm{Dom}(\psi)$, where $\xi : \mathrm{Range}(\psi) \to \mathbb{R}^n$ with $n = \dim(M)$. Then $\hat\psi \circ X \circ \psi^{-1}(y) = (y, \xi(y))$ for all $y \in \mathrm{Range}(\psi)$. Therefore $X$ is a $C^r$ vector field if and only if the components $\xi$ of $X$ are of class $C^r$ for all charts $\psi \in A_M$. From the formula $\hat\psi \circ X \circ \psi^{-1}(y) = (y, \xi(y))$, it follows that $\hat\psi \circ X \circ \psi^{-1} = (\mathrm{id}_{\mathrm{Range}\,\psi}) \times \xi$, and therefore $\pi_2 \circ \hat\psi \circ X \circ \psi^{-1} = \xi$, where $\pi_2 : \mathbb{R}^{2n} \to \mathbb{R}^n$ is the projection map $\pi_2 : (x_1, \ldots, x_{2n}) \mapsto (x_{n+1}, \ldots, x_{2n})$.

29.5.5 Definition: The component function of a vector field $X \in X(M)$ on a $C^1$ $n$-dimensional manifold $M$ for a chart $\psi \in \mathrm{atlas}(M)$ is the function $\xi : \mathrm{Range}(\psi) \to \mathbb{R}^n$ defined by $\xi = \pi_2 \circ \hat\psi \circ X \circ \psi^{-1}$, where $\hat\psi \in \mathrm{atlas}(T(M))$ is the chart for $T(M)$ corresponding to $\psi$ and $\pi_2 : \mathbb{R}^{2n} \to \mathbb{R}^n$ is the projection map $\pi_2 : (x_1, \ldots, x_{2n}) \mapsto (x_{n+1}, \ldots, x_{2n})$.

29.5.6 Remark: Notation 29.5.7 seems to be slightly non-standard. Some authors (e.g. Malliavin [35], 7.4.1, page 69, Gallot/Hulin/Lafontaine [19], 1.38, page 18, Darling [13], 7.1.1, page 144) use notations such as $\Gamma T(M)$, $\Gamma(TM)$ or $\Gamma TM$ instead of $X(M)$. Crampin/Pirani [11], page 252, uses a notation like $X(M)$. EDM2 [34], section 105.M, and Kobayashi/Nomizu [26], page 5, use the notation $X(M)$.

29.5.7 Notation: $X(T(M))$ for a $C^1$ manifold $M$ denotes the set of vector fields on $M$.
$X^r(T(M))$ for a $C^{r+1}$ manifold $M$ for $r \in \bar{\mathbb{Z}}^+_0$ denotes the set of $C^r$ vector fields on $M$.
$X(M)$ is an abbreviation for $X(T(M))$, and $X^r(M)$ is an abbreviation for $X^r(T(M))$.
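As a concrete illustration of the component function in Definition 29.5.5, the following sketch takes $M = \mathbb{R}^2$ with the identity chart and a rotation field. The manifold, the chart and the field are illustrative assumptions, not taken from the text.

```latex
% Assumptions (illustrative): M = R^2, \psi = id_{R^2}, n = 2, and
% X(p) = t_{p,\xi(\psi(p)),\psi} with components \xi(y) = (-y^2, y^1).
% Superscripts on y are coordinate indices, not powers.
\begin{align*}
  \hat\psi \circ X \circ \psi^{-1}(y) &= (y, \xi(y)) = (y^1, y^2, -y^2, y^1), \\
  \pi_2 \circ \hat\psi \circ X \circ \psi^{-1}(y) &= (-y^2, y^1) = \xi(y).
\end{align*}
% \xi is linear, hence of class C^r for every r, so X is a C^r vector
% field by the criterion of Remark 29.5.4 (for this single-chart atlas).
```

The projection $\pi_2$ simply discards the base-point coordinates, so regularity of $X$ reduces to regularity of the $\mathbb{R}^n$-valued function $\xi$.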
29.5.8 Definition: The linear space of $C^r$ vector fields on a $C^{r+1}$ manifold $M$ for $r \in \bar{\mathbb{Z}}^+_0$ is the tuple $X^r(M)$ −< $(\mathbb{R}, X^r(M), \sigma_{\mathbb{R}}, \tau_{\mathbb{R}}, \sigma_{X^r(M)}, \mu)$, where $\mathbb{R}$ −< $(\mathbb{R}, \sigma_{\mathbb{R}}, \tau_{\mathbb{R}})$ is the usual field of real numbers, $\sigma_{X^r(M)} : X^r(M) \times X^r(M) \to X^r(M)$ is the operation of pointwise addition on $X^r(M)$ (using the linear space addition of $T_p(M)$ for all $p \in M$), and $\mu : \mathbb{R} \times X^r(M) \to X^r(M)$ is the pointwise product operation of $\mathbb{R}$ on $X^r(M)$ (also using the linear space structure of $T_p(M)$ for $p \in M$).

29.5.9 Remark: The linear space of $C^r$ vector fields in Definition 29.5.8 can be defined also for general vector fields in $X(T(M))$, although this is not often useful.

29.5.10 Notation: $D_X$ for a vector field $X \in X(T(M))$ for a $C^1$ manifold $M$ denotes the map $D_X : C^1(M) \to (M \to \mathbb{R})$ defined by $(D_X f)(p) = D_{X(p)} f$ for all $f \in C^1(M)$ and $p \in M$.

29.5.11 Theorem: For all $r \in \mathbb{Z}^+_0$, the space $X^r(M)$ of $C^r$ vector fields on a $C^{r+1}$ manifold $M$ is a unitary left module over the ring $C^{r+1}(M)$ with the pointwise product $(f.X)_p = f(p)X_p$.

29.5.12 Definition: The coordinate basis vector fields of a $C^1$ manifold $M$ with respect to a $C^1$ chart $\psi$ for $M$ are the maps $e_i^\psi : \mathrm{Dom}(\psi) \to T(M)$ defined for $i = 1, \ldots, n = \dim(M)$ by
$$\forall p \in \mathrm{Dom}(\psi), \qquad e_i^\psi(p) = e_i^{p,\psi}.$$

29.5.13 Remark: See Definition 28.7.4 for the chart basis vectors $e_i^{p,\psi} = t_{p,v_i,\psi}$ referred to in Definition 29.5.12.

29.5.14 Theorem: The coordinate basis vector fields of a $C^{k+1}$ manifold for a $C^{k+1}$ chart for $k \in \bar{\mathbb{Z}}^+_0$ are of class $C^k$.

[ Around about here, should define n-frame fields, and then define the field of coordinate bases. This particular n-frame field is apparently called the Gauß frame with respect to a given chart. ]

29.6. Tangent operator fields

Many texts use the same notation for vector fields $X$ and the action $D_X$ of vector fields. Since this text is being more careful about such definitions, a separate set of notations is defined in this section for tangent operator fields.

29.6.1 Remark: In Notations 29.5.7 and 29.6.3, $X(T(M))$ and $X(\hat T(M))$ are spaces of cross-sections of the corresponding tangent fibrations $T(M)$ and $\hat T(M)$, but $X(\mathring{T}(M))$ in Notation 29.6.2 is not a space of cross-sections because the set $\mathring{T}(M)$ of untagged tangent operators is not a fibration. However, all of the spaces in Notations 29.6.2 and 29.6.3 may be regarded as linear spaces with respect to pointwise vector addition and scalar multiplication similarly to Definition 29.5.8.

29.6.2 Notation: $X(\mathring{T}(M))$ for a $C^1$ manifold $M$ denotes the set of maps $X : M \to \mathring{T}(M)$ such that $X(p) \in \mathring{T}_p(M)$ for all $p \in M$.
$X^r(\mathring{T}(M))$ for a $C^{r+1}$ manifold $M$ with $r \in \bar{\mathbb{Z}}^+$ denotes the set $\{X \in X(\mathring{T}(M));\ X \hbox{ is } C^r\}$ of $C^r$ tangent operator fields in $X(\mathring{T}(M))$.
$\mathring{X}(M)$ is an abbreviation for $X(\mathring{T}(M))$, and $\mathring{X}^r(M)$ is an abbreviation for $X^r(\mathring{T}(M))$.
29.6.3 Notation: $X(\hat T(M))$ for a $C^1$ manifold $M$ denotes the set of maps $X : M \to \hat T(M)$ such that $X(p) \in \hat T_p(M)$ for all $p \in M$.
$X^r(\hat T(M))$ for a $C^{r+1}$ manifold $M$ with $r \in \bar{\mathbb{Z}}^+$ denotes the set $\{X \in X(\hat T(M));\ X \hbox{ is } C^r\}$ of $C^r$ tagged tangent operator fields in $X(\hat T(M))$.
$\hat X(M)$ is an abbreviation for $X(\hat T(M))$, and $\hat X^r(M)$ is an abbreviation for $X^r(\hat T(M))$.

29.6.4 Definition: The coordinate basis operator fields in a subset $U$ of a $C^1$ manifold $M$ with respect to a chart $\psi$ are the maps $\partial_i^\psi : \mathrm{Dom}(\psi) \to T(M)$ defined by $\partial_i^\psi(p) = \partial_i^{p,\psi}$ for all $p \in \mathrm{Dom}(\psi)$. (See Definition 28.7.11 for the chart basis operators $\partial_i^{p,\psi}$.)

29.6.5 Theorem: The coordinate basis operator fields are $C^0$ vector fields.

[ The following definition is probably a trivial application of the definition of the chart on the tangent space. ] [ Use the charts $\hat\psi$ in Definition 29.6.6. ]

29.6.6 Definition: The component function (or the set of components) of a vector field $X$ on a $C^1$ manifold $M$ with respect to the chart $\psi \in \mathrm{atlas}(M)$ is the function from $\mathrm{Dom}(\psi)$ to $\mathbb{R}^n$ which maps $p \in \mathrm{Dom}(\psi)$ to $\xi_\psi(p) = \phi_\psi(p, X_p)$, so that
$$D_{X(p)} = \sum_{i=1}^n (\phi_\psi(p, X_p))^i \, \partial_i^{p,\psi}.$$

29.6.7 Theorem: Let $r \in \bar{\mathbb{Z}}^+_0$. Then a vector field $X$ in a $C^r$ manifold $M$ is a $C^r$ vector field if and only if for all charts $\psi$ of $M$ the component function $\xi_\psi$ satisfies $\xi_\psi \in C^r(\mathrm{Dom}(\psi), \mathbb{R}^n)$. That is,
$$X \in C^r(M, T(M)) \iff \forall \psi \in \mathrm{atlas}(M),\ \xi_\psi \in C^r(\mathrm{Dom}(\psi), \mathbb{R}^n).$$
29.7. Tensor fields

29.7.1 Definition: A tensor field of type $(r, s)$ in a subset $S$ of a $C^1$ manifold $M$ for $r, s \in \mathbb{Z}^+_0$ is a map $Y : S \to T^{r,s}(M)$ such that $Y(p) \in T_p^{r,s}(M)$ for all $p \in S$.

29.7.2 Definition: A $C^k$ differentiable tensor field of type $(r, s)$ in an open subset $\Omega$ of a $C^{k+1}$ manifold $M$ for $r, s \in \mathbb{Z}^+_0$ is a tensor field $Y$ of type $(r, s)$ in $\Omega$ such that $Y : \Omega \to T^{r,s}(M)$ is of class $C^k$ with respect to the $C^k$ differentiable structures on $M$ and $T^{r,s}(M)$.

29.7.3 Notation: $X^k_{r,s}(\Omega)$ will denote the set of $C^k$ tensor fields of type $(r, s)$ in the open subset $\Omega$ of a $C^{k+1}$ manifold $M$.

[ Define “naive” differentials in $T(T^{r,s}(M))$ of tensor fields in $X^k_{r,s}(M)$. ]
29.8. Vector fields and tensor fields along curves

A vector field along a curve is defined on the domain of the curve, not on the range. This is convenient in the case of non-self-intersecting curves, but essential for curves which do have self-intersections. The phrase “along a curve” is used rather than “on a curve” because vector fields are defined on the parameter interval of the curve, not on the image set. Some important vector fields along curves in differentiable manifolds are the velocity field of a curve (the tangent to the curve at each point) and transversal vector fields induced by embedding the curve in a family of curves.

29.8.1 Definition: A vector field along a curve $\gamma : I \to M$ in a $C^1$ manifold $M$ is a function $Y : I \to T(M)$ such that $\forall t \in I,\ Y(t) \in T_{\gamma(t)}(M)$.

29.8.2 Definition: A continuous vector field along a curve $\gamma : I \to M$ in a $C^1$ manifold $M$ is a vector field $Y : I \to T(M)$ along $\gamma$ such that $Y$ is a continuous function with respect to the usual topology on intervals $I \subseteq \mathbb{R}$ and the standard topology on $T(M)$.

29.8.3 Remark: Differentiable vector fields can be defined along differentiable curves. See Definition 27.6.1 for differentiable curves. Definition 29.8.4 uses the standard $C^r$ differentiable structure on the total tangent space $T(M)$ of a $C^{r+1}$ manifold $M$.

29.8.4 Definition: A $C^r$ (differentiable) vector field along an open $C^r$ curve $\gamma : I \to M$ in a $C^{r+1}$ manifold $M$ for $r \in \bar{\mathbb{Z}}^+_0$ is a vector field $Y : I \to T(M)$ along $\gamma$ such that $Y$ is of class $C^r$ with respect to the usual differentiable structure on open intervals $I \subseteq \mathbb{R}$ and the standard $C^r$ differentiable structure on $T(M)$.

[ Must define $C^r$ vector fields on manifolds for $r \ge 2$, and then specialize to vector fields along curves. ] [ Define vector fields on families of curves. ]
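The reason why a vector field along a curve must be defined on the parameter interval can be seen in a small example. The following sketch uses a figure-eight curve in $\mathbb{R}^2$ with the identity chart; the particular curve is an illustrative assumption, not taken from the text.

```latex
% Illustrative curve (an assumption, not from the text):
% \gamma : \mathbb{R} \to \mathbb{R}^2, \gamma(t) = (\sin t, \sin 2t).
\begin{align*}
  \gamma'(t) &= (\cos t,\ 2\cos 2t), \\
  \gamma(0) &= \gamma(\pi) = (0,0), \\
  \gamma'(0) &= (1, 2), \qquad \gamma'(\pi) = (-1, 2).
\end{align*}
% The velocity field Y(t), with components \gamma'(t) at the base point \gamma(t),
% takes two different values over the single image point (0,0).
```

So the velocity field is a well-defined function of $t \in I$ in the sense of Definition 29.8.1, but it cannot be written as a function on the image set of $\gamma$.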
29.9. Differential forms

[ See EDM 108.Q, Malliavin, Section I.7.5, page 71. ]

29.9.1 Definition: A differential form of degree $m$ with coefficients in a linear space $W$ on a subset $N$ of a $C^\infty$ manifold $M$ is a function $\omega : N \to \bigcup_{p \in M} \Lambda_m(T_p(M), W)$ such that $\omega(p) \in \Lambda_m(T_p(M), W)$ for all $p \in N$, and. . .
A differential form of degree $m$ on a subset $N$ of a $C^\infty$ manifold $M$ is a differential form of degree $m$ with coefficients in $\mathbb{R}$ on $N$. That is, if the space of coefficients is not mentioned, then it is implicitly $\mathbb{R}$.

29.9.2 Notation: The set of differential forms of degree $m$ with coefficients in $W$ on a subset $N$ of a $C^\infty$ manifold $M$, together with the operations of pointwise addition and pointwise multiplication by elements of the field of $W$, may be denoted by $\Lambda_m(N, W)$. [ Must check that this notation is okay. ]
[ Also need to deal with the case that W is an (alternating) algebra, so that the space of differential forms can be made into an (alternating) algebra. ] [ Additionally must deal with the spaces of C r differential forms. ]
Chapter 30
Higher-order tangent vectors

30.1 Higher-order tangent operators
30.2 Tensorization coefficients for second-order tangent operators
30.3 Higher-order tangent vectors
30.4 Higher-order tangent spaces
30.5 Drop functions for second-level tangent vectors
30.6 Elliptic second-order operators
30.7 Higher-order vector fields
30.8 Higher-order vector fields for families of curves
Higher-order tangent vectors are extensions of ordinary tangent vectors to represent higher-order derivatives. These should be distinguished from higher-order differentials of functions and maps. Higher-order tangent vectors are dealt with separately in their own chapter because they are not commonly dealt with in differential geometry textbooks at all.

Very roughly speaking, first-order derivatives are of interest to geometers whereas second-order derivatives are of interest to physicists. First-order operators are related to the geometry of a manifold, whereas second-order operators are related to forces acting on fields.

In the same way that tangent operators are defined in Section 28.5 (Definition 28.5.1) as first-order derivatives of test functions, it is also possible to define higher-order operators. The transformation rules for these operators then yield similar coordinate-based tangent vector definitions.

30.0.1 Remark: It is difficult to assign a meaning to second-order tangent vectors in the sense of the interpretation of first-order tangent vectors as geometric infinitesimal vectors. The difficulty of geometric interpretation is perhaps one reason why it is rarely dealt with in differential geometry texts, although second-order operators are of the greatest importance in physics.

30.0.2 Remark: It is noteworthy that higher-order tangent vectors do not require a metric, a connection, or even the second tangent space $T(T(M))$. Second-order tangent vectors have chart transition rules which fulfil the requirements of symmetry and transitivity for a well-defined differential geometric object because they are based on second-order tangent operators, which themselves are defined as maps from $C^2(M)$ functions to $\mathbb{R}$. Although the Laplacian operator uses a metric for its definition, it is in fact a second-order tangent operator. This may seem a little paradoxical. The Laplacian needs the metric for the choice of its coefficients, but it resides in a space which does not require a metric for its construction.

30.0.3 Remark: The reason for normally not defining second-order tangent operators becomes clear when one asks what the second derivative of a real-valued function is in a particular direction $V \in T_p(M)$ in the tangent space at a point. The first-order derivative is chart-independent (because the chart transition rules for the vector $V$ make the vector “follow” the chart transition map). But the second-order derivative is chart-dependent. One way to deal with this is to define a new class of second-order tangent objects which “follow” the chart transition maps.

This is not usually done because it does not correspond to physical intuition. We like to think of space as being essentially Euclidean, Cartesian, Galilean, Newtonian or even Lorentzian. Physicists do not want to be bothered with second-order transformation rules. The first-order
rules are burdensome enough already. There is nothing to intuitively identify with locally parabolic-looking curves in space, whereas locally linear-looking curves may be identified with velocities and forces. We are accustomed to dealing with Galilean transformations of coordinates every day. Galilean relativity is part of ordinary life. So we find it easy to accept the need for such transformations in physics. An object which transforms according to a translation combined with a rotation seems intuitively real to us. But everyday life does not usually require us to accept curved, accelerating or rotating coordinates on equal terms with affine static coordinate frames. If that were part of everyday life, then differential geometry could have been formulated in such terms. We need to remove second-order derivatives from coordinate transformation rules in differential geometry simply because we are not accustomed to them. There is no a-priori necessity to reject mathematical objects which transform according to higher-order derivatives of the chart transition maps. 30.0.4 Remark: Second-order operators are so quintessential to physics that they must be dealt with somehow. In practice, the second-order terms in diffeomorphisms are dealt with by making them disappear by effectively normalizing the coordinates at each point before doing calculations. The use of covariant derivatives and Christoffel symbols provides compensation terms which always bring back second-order calculations to a standard “uncurved” chart. If the chart is chosen to make these compensation mechanisms disappear, such a chart is called “normal coordinates”. This means that space is assumed to have a special class of charts at each point which are somehow physically different to the others. Relativity in physics does not mean relativity with respect to all diffeomorphisms. There are, in effect, “grooves” in space which define parallel translation. 
These geodesic curves must be used in order to define second-order derivatives.
In this chapter, connections are not yet defined. Therefore the “grooves in space” which enable chart-independent second-order derivatives to be calculated are not available. So the values of second-order derivatives are chart-dependent. However, this does not mean that they are ill-defined. They just have burdensome transformation rules.

30.0.5 Remark: Elliptic second-order boundary value problems in flat space are essentially identical to the same analysis on a single patch of a differentiable manifold. The arbitrariness of the chart implies that the same problem may be attacked in a wide variety of coordinate systems. It may be that a BVP is classified as linear or semilinear in some choices of coordinate system but not in others. The ellipticity property for second-order equations is invariant under diffeomorphisms. Regularity classes are also essentially unchanged by smooth enough diffeomorphisms. Therefore the existence, uniqueness and regularity theory for elliptic second-order equations should be portable from flat space. When more than one coordinate patch is required, the analysis of elliptic BVPs starts to look different to the situation in flat space.

30.0.6 Remark: The analysis of partial differential equations is significantly different in a differentiable manifold (by contrast to flat space) if convexity properties are studied because it is generally not possible to choose a coordinate system in which the geodesics are all straight lines. Similarly, a difference arises if one wishes to define operators such as the Laplacian, which depends strongly on the metric structure of a manifold. The Laplacian needs not only chart-independent second derivatives, as all chart-independent second-order operators do, but also chart-independent distance and angle specifications so that the second-order derivatives may be correctly scaled and orthogonalized. (In other words, orthonormal coordinates are required at each point.)
To achieve this, the metric tensor is required. Although one of the main aims of this book is to present differential geometry in such a way that geometric properties of solutions of partial differential equations can be studied, a good starting point is to study how much analysis can be done in the total absence of a connection or metric structure.
The second-order derivative of a real-valued function in a direction V can be defined in a chart-dependent manner by only calculating the second-order derivative in a special subset of all charts, and then using tensorization terms (such as Christoffel symbols) to convert the calculation in general charts to the value in a special subset of normalized charts. A reference chart for such a second-order derivative definition could be thought of as an “anchor chart”.
30.1. Higher-order tangent operators

Higher-order tangent vectors are an abstraction of higher-order tangent operators. So the operators are defined first in order to determine the transformation rules.

30.1.1 Definition: A second-order tangent operator on an $n$-dimensional $C^2$ manifold $M$ is any function $\partial^{[2]}_{p,a,b,\psi} : C^2(M) \to \mathbb{R}$ defined for $p \in M$, $a \in \mathrm{Sym}(n, \mathbb{R})$, $b \in \mathbb{R}^n$ and $\psi \in \mathrm{atlas}_p(M)$ by
$$\forall p \in M,\ \forall a \in \mathrm{Sym}(n, \mathbb{R}),\ \forall b \in \mathbb{R}^n,\ \forall \psi \in \mathrm{atlas}_p(M),\ \forall f \in C^2(M),$$
$$\partial^{[2]}_{p,a,b,\psi}(f) = \sum_{i,j=1}^n a^{ij} \frac{\partial^2}{\partial x^i \partial x^j} (f \circ \psi^{-1})(x) \Big|_{x=\psi(p)} + \sum_{i=1}^n b^i \frac{\partial}{\partial x^i} (f \circ \psi^{-1})(x) \Big|_{x=\psi(p)}.$$
The pair $(a, b)$ is called the component pair for, or the components of, the second-order tangent operator $\partial^{[2]}_{p,a,b,\psi}$ with respect to the chart $\psi$ at $p$.
The tuple $(p, a, b, \psi)$ is called the coefficient tuple for the second-order tangent operator $\partial^{[2]}_{p,a,b,\psi}$.

30.1.2 Notation: $\mathring{T}^{[2]}_p(M)$ denotes the set of all second-order tangent operators $\partial^{[2]}_{p,a,b,\psi}$ at a fixed point $p \in M$ in Definition 30.1.1.
$\mathring{T}^{[2]}(M) = \bigcup_{p \in M} \mathring{T}^{[2]}_p(M)$ denotes the set of all second-order tangent operators $\partial^{[2]}_{p,a,b,\psi}$ in Definition 30.1.1.
30.1.3 Theorem: Second-order tangent operators $\partial^{[2]}_{p_1,a_1,b_1,\psi_1}$ and $\partial^{[2]}_{p_2,a_2,b_2,\psi_2}$ on an $n$-dimensional $C^2$ manifold $M$ are equal if $p_1 = p_2 = p$ and
$$\forall i, j = 1 \ldots n, \quad a_2^{ij} = \sum_{k,\ell=1}^n \frac{\partial}{\partial x^k} (\psi_2 \circ \psi_1^{-1}(x))^i \Big|_{x=\psi_1(p)} \frac{\partial}{\partial x^\ell} (\psi_2 \circ \psi_1^{-1}(x))^j \Big|_{x=\psi_1(p)} a_1^{k\ell}, \eqno(30.1.1)$$
$$\forall i = 1 \ldots n, \quad b_2^i = \sum_{k,\ell=1}^n \frac{\partial^2}{\partial x^k \partial x^\ell} (\psi_2 \circ \psi_1^{-1}(x))^i \Big|_{x=\psi_1(p)} a_1^{k\ell} + \sum_{k=1}^n \frac{\partial}{\partial x^k} (\psi_2 \circ \psi_1^{-1}(x))^i \Big|_{x=\psi_1(p)} b_1^k. \eqno(30.1.2)$$

Proof: Suppose that $p_1 = p_2 \in M$. Then for all $f \in C^2(M)$,
$$\partial^{[2]}_{p_1,a_1,b_1,\psi_1}(f) = \sum_{i,j=1}^n a_1^{ij} \frac{\partial^2}{\partial x^i \partial x^j} (f \circ \psi_1^{-1})(x) \Big|_{x=\psi_1(p_1)} + \sum_{i=1}^n b_1^i \frac{\partial}{\partial x^i} (f \circ \psi_1^{-1})(x) \Big|_{x=\psi_1(p_1)}$$
$$= \sum_{i,j,k,\ell=1}^n a_1^{ij} \frac{\partial^2}{\partial \tilde x^k \partial \tilde x^\ell} (f \circ \psi_2^{-1})(\tilde x) \Big|_{\tilde x=\psi_2(p_2)} \frac{\partial}{\partial x^i} (\psi_2^k \circ \psi_1^{-1})(x) \Big|_{x=\psi_1(p_1)} \frac{\partial}{\partial x^j} (\psi_2^\ell \circ \psi_1^{-1})(x) \Big|_{x=\psi_1(p_1)}$$
$$\quad + \sum_{i,j,k=1}^n a_1^{ij} \frac{\partial}{\partial \tilde x^k} (f \circ \psi_2^{-1})(\tilde x) \Big|_{\tilde x=\psi_2(p_2)} \frac{\partial^2}{\partial x^i \partial x^j} (\psi_2^k \circ \psi_1^{-1})(x) \Big|_{x=\psi_1(p_1)}$$
$$\quad + \sum_{i,k=1}^n b_1^i \frac{\partial}{\partial \tilde x^k} (f \circ \psi_2^{-1})(\tilde x) \Big|_{\tilde x=\psi_2(p_2)} \frac{\partial}{\partial x^i} (\psi_2^k \circ \psi_1^{-1})(x) \Big|_{x=\psi_1(p_1)}$$
$$= \sum_{k,\ell=1}^n a_2^{k\ell} \frac{\partial^2}{\partial \tilde x^k \partial \tilde x^\ell} (f \circ \psi_2^{-1})(\tilde x) \Big|_{\tilde x=\psi_2(p_2)} + \sum_{k=1}^n b_2^k \frac{\partial}{\partial \tilde x^k} (f \circ \psi_2^{-1})(\tilde x) \Big|_{\tilde x=\psi_2(p_2)}$$
$$= \partial^{[2]}_{p_2,a_2,b_2,\psi_2}(f),$$
where $a_2$ and $b_2$ are as in equations (30.1.1) and (30.1.2).

30.1.4 Remark: To show the converse of Theorem 30.1.3 is not so easy. It must be shown first that $p_1 = p_2$ and $v_2^i = v_1^j \partial_j(\psi_2^i \circ \psi_1^{-1})$ for all $i = 1, \ldots, n$, and then that the symmetric parts of the second-order coefficients are the same. This requires the use of test functions.
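A one-dimensional instance makes the transformation rules (30.1.1) and (30.1.2) concrete. The sketch below assumes $M = (0,\infty)$ with the charts $\psi_1(p) = p$ and $\psi_2(p) = p^2$; the manifold, the charts and the coefficient values are illustrative choices, not taken from the text.

```latex
% Assumptions (illustrative): n = 1, M = (0,\infty), \psi_1(p) = p, \psi_2(p) = p^2,
% so that \phi = \psi_2 \circ \psi_1^{-1} is x \mapsto x^2, with \phi' = 2x, \phi'' = 2.
% Take a_1 = 1, b_1 = 0, i.e. the operator f \mapsto (f \circ \psi_1^{-1})''(p).
\begin{align*}
  a_2 &= \phi'(p)^2 \, a_1 = 4p^2
      && \text{by (30.1.1)} \\
  b_2 &= \phi''(p) \, a_1 + \phi'(p) \, b_1 = 2
      && \text{by (30.1.2)}
\end{align*}
% Direct check with g = f \circ \psi_2^{-1}, so that f(x) = g(x^2) in the chart \psi_1:
\begin{align*}
  \frac{d}{dx} g(x^2) &= 2x \, g'(x^2), \\
  \frac{d^2}{dx^2} g(x^2) \Big|_{x=p} &= 4p^2 \, g''(p^2) + 2 \, g'(p^2)
      = a_2 \, g''(\psi_2(p)) + b_2 \, g'(\psi_2(p)).
\end{align*}
```

In this example the first-order coefficient $b_2$ is non-zero although $b_1 = 0$, so a pure second-derivative operator in one chart acquires a first-order term in another chart.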
30.1.5 Remark: The set of operators $\partial^{[2]}_{p,0,b,\psi}$ is a closed subset of the set of operators $\partial^{[2]}_{p,a,b,\psi}$ in $\mathring{T}^{[2]}_p(M)$ under chart transitions. The set of tangent operators $\partial^{[2]}_{p,a,0,\psi}$ is not closed.
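30.1.6 Remark: More colloquially, the second-order tangent operator $\partial^{[2]}_{p,a,b,\psi}$ may be written as
$$\partial^{[2]}_{p,a,b,\psi} = \sum_{i,j=1}^n a^{ij} \frac{\partial^2}{\partial x^i \partial x^j} \Big|_{x=\psi(p)} + \sum_{i=1}^n b^i \frac{\partial}{\partial x^i} \Big|_{x=\psi(p)}$$
or
$$\sum_{i,j=1}^n a^{ij} \frac{\partial^2}{\partial \psi^i \partial \psi^j}(p) + \sum_{i=1}^n b^i \frac{\partial}{\partial \psi^i}(p),$$
or just
$$\partial^{[2]}_{p,a,b,\psi} = a^{ij} \frac{\partial^2}{\partial x^i \partial x^j} + b^i \frac{\partial}{\partial x^i}$$
or even
$$\partial^{[2]}_{p,a,b,\psi} = a^{ij} \partial_{ij} + b^i \partial_i.$$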
30.1.7 Remark: Differential geometry is infested with the tedious kinds of expressions and calculations seen in Theorem 30.1.3 and its proof. One has the choice between writing the full tedious details or using ambiguous abbreviations. Generally the abbreviations, such as in Remark 30.1.6, are to be preferred, if one does not forget how to write down the full details. In abbreviated form, Theorem 30.1.3 becomes:
$$a_1^{ij} \partial_{ij} + b_1^i \partial_i = a_2^{k\ell} \tilde\partial_{k\ell} + b_2^k \tilde\partial_k$$
where
$$a_2^{ij} = \phi^i_{,k} \phi^j_{,\ell} a_1^{k\ell} \qquad\hbox{and}\qquad b_2^i = \phi^i_{,k\ell} a_1^{k\ell} + \phi^i_{,k} b_1^k.$$
Therefore
$$a_1^{ij} \partial_{ij} + b_1^i \partial_i = \phi^i_{,k} \phi^j_{,\ell} a_1^{k\ell} \tilde\partial_{ij} + (\phi^i_{,k\ell} a_1^{k\ell} + \phi^i_{,k} b_1^k) \tilde\partial_i$$
$$= a_1^{k\ell} (\phi^i_{,k} \phi^j_{,\ell} \tilde\partial_{ij} + \phi^i_{,k\ell} \tilde\partial_i) + b_1^k (\phi^i_{,k} \tilde\partial_i)$$
$$= a_1^{ij} (\phi^k_{,i} \phi^\ell_{,j} \tilde\partial_{k\ell} + \phi^k_{,ij} \tilde\partial_k) + b_1^i (\phi^k_{,i} \tilde\partial_k). \eqno(30.1.3)$$
Here $\phi = \psi_2 \circ \psi_1^{-1}$.

30.1.8 Remark: The expression (30.1.3) looks simple enough. It suggests that $\partial_{ij} = \phi^k_{,i} \phi^\ell_{,j} \tilde\partial_{k\ell} + \phi^k_{,ij} \tilde\partial_k$ and $\partial_i = \phi^k_{,i} \tilde\partial_k$, which is true and interesting. However, it is much more useful to rearrange (30.1.3) so that $a$ and $b$ transform like tensors. Thus, writing $\tilde\phi = \phi^{-1}$,
$$a_1^{ij} \partial_{ij} + b_1^i \partial_i = (a_1^{ij} \phi^k_{,i} \phi^\ell_{,j}) (\tilde\partial_{k\ell} + \phi^m_{,rs} \tilde\phi^r_{,k} \tilde\phi^s_{,\ell} \tilde\partial_m) + (b_1^i \phi^k_{,i}) \tilde\partial_k$$
$$= \tilde a_1^{k\ell} (\tilde\partial_{k\ell} + \phi^m_{,rs} \tilde\phi^r_{,k} \tilde\phi^s_{,\ell} \tilde\partial_m) + \tilde b_1^k \tilde\partial_k$$
where
$$\tilde a_1^{k\ell} = a_1^{ij} \phi^k_{,i} \phi^\ell_{,j} \qquad\hbox{and}\qquad \tilde b_1^k = b_1^i \phi^k_{,i}.$$
This gives us a nice tensorial form for the coefficients. It follows that the operators $\tilde\partial_{k\ell} + \phi^m_{,rs} \tilde\phi^r_{,k} \tilde\phi^s_{,\ell} \tilde\partial_m$ and $\tilde\partial_k$ are also tensorial. So we have constructed a tensorial kind of second-order derivative. The problem with this is that the second-order derivative operator must be calculated in terms of a single special chart (or a special subset of atlas-compatible charts).

An interesting question to ask now is how to define a second-order operator on any $C^2$ manifold so that it looks like $\tilde\partial_{k\ell} + \phi^m_{,rs} \tilde\phi^r_{,k} \tilde\phi^s_{,\ell} \tilde\partial_m$ when transformed. This question leads to Theorem 30.2.1.

30.1.9 Definition: A tagged second-order tangent operator for a $C^2$ manifold $(M, A_M)$ is a pair $(p, \partial^{[2]}_{p,a,b,\psi})$ such that $p \in M$ and $\partial^{[2]}_{p,a,b,\psi} : C^2(M) \to \mathbb{R}$ is a second-order tangent operator at $p$.
The tuple $(p, a, b, \psi)$ is called the coefficient tuple for the tagged second-order tangent operator $(p, \partial^{[2]}_{p,a,b,\psi})$.
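The rearrangement in Remark 30.1.8 relies on the Jacobian matrices of $\phi$ and its inverse being mutually inverse. The following check is a sketch which assumes that $\tilde\phi$ denotes $\phi^{-1}$ (as in the statement of Theorem 30.2.1) and that derivatives of $\tilde\phi$ are evaluated at the image point.

```latex
% Assumption: \tilde\phi = \phi^{-1}, so \phi^k_{,i} \tilde\phi^r_{,k} = \delta^r_i
% (chain rule applied to \tilde\phi \circ \phi = id).
\begin{align*}
  (a_1^{ij} \phi^k_{,i} \phi^\ell_{,j}) \,
  \phi^m_{,rs} \tilde\phi^r_{,k} \tilde\phi^s_{,\ell} \, \tilde\partial_m
  &= a_1^{ij} \, (\phi^k_{,i} \tilde\phi^r_{,k}) (\phi^\ell_{,j} \tilde\phi^s_{,\ell}) \,
     \phi^m_{,rs} \, \tilde\partial_m \\
  &= a_1^{ij} \, \delta^r_i \delta^s_j \, \phi^m_{,rs} \, \tilde\partial_m
   = a_1^{ij} \phi^m_{,ij} \, \tilde\partial_m .
\end{align*}
```

This recovers the first-order term $a_1^{ij} \phi^k_{,ij} \tilde\partial_k$ of (30.1.3), so the tensorial form agrees with (30.1.3) term by term.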
30.1.10 Notation: $\hat\partial^{[2]}_{p,a,b,\psi}$ for $p \in M$, $a \in \mathrm{Sym}(n, \mathbb{R})$, $b \in \mathbb{R}^n$ and $\psi \in \mathrm{atlas}_p(M)$ for an $n$-dimensional $C^2$ manifold $M$ denotes the ordered pair $(p, \partial^{[2]}_{p,a,b,\psi})$.

30.1.11 Notation: $\hat T^{[2]}_p(M)$ denotes the set of all tagged second-order tangent operators $(p, \partial^{[2]}_{p,a,b,\psi})$ at a fixed point $p \in M$ in Definition 30.1.9.
$\hat T^{[2]}(M) = \bigcup_{p \in M} \hat T^{[2]}_p(M)$ denotes the set of all tagged second-order tangent operators $(p, \partial^{[2]}_{p,a,b,\psi})$ in Definition 30.1.9.
[ Define higher-order tangent operator bundles near here? ] [ Must also define higher-order operators of order greater than 2. ] [ Show how second-order operators and vectors are related to spaces like T (T (M )), T (T ∗ (M )) and T ∗ (T ∗ (M )), or something like that. ]
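30.2. Tensorization coefficients for second-order tangent operators

[ The function spaces $C^2(M)$ and $C^0(M)$ in Theorem 30.2.1 should be restricted to the domain of the chart $\psi$. ]

30.2.1 Theorem: For an $n$-dimensional $C^2$ manifold $M$ and charts $\psi \in \mathrm{atlas}(M)$, let $L_{ij}(\psi) : C^2(M) \to C^0(M)$ be the second-order operator fields on the domain of $\psi$ defined by $L_{ij}(\psi) = \partial_{ij} - \omega^k_{ij}(\psi)\,\partial_k$ for $i, j \in \mathbb{N}_n$. Then the matrix $L_{ij}(\psi)(f)$ transforms like the coefficients of a $T^{(0,2)}(M)$ tensor for $f \in C^2(M, \mathbb{R})$ if and only if $\omega$ satisfies

$$\omega^\ell_{ij}(\tilde\psi) = \omega^k_{rs}(\psi)\,\phi^r{}_{,i}\,\phi^s{}_{,j}\,\tilde\phi^\ell{}_{,k} + \phi^k{}_{,ij}\,\tilde\phi^\ell{}_{,k} \qquad (30.2.1)$$

for all $\psi, \tilde\psi \in \mathrm{atlas}(M)$, where $\phi = \psi \circ \tilde\psi^{-1}$ and $\tilde\phi = \tilde\psi \circ \psi^{-1} = \phi^{-1}$.

Proof: It must be shown that $L_{ij}(\tilde\psi)(f) = \phi^k{}_{,i}\,\phi^\ell{}_{,j}\,L_{k\ell}(\psi)(f)$ for $i, j \in \mathbb{N}_n$, $\psi, \tilde\psi \in \mathrm{atlas}(M)$ and $f \in C^2(M)$. For $p \in \mathrm{Dom}(\psi) \cap \mathrm{Dom}(\tilde\psi)$,

$$\begin{aligned}
L_{ij}(\tilde\psi)(f)(p)
&= \frac{\partial^2 (f \circ \tilde\psi^{-1}(\tilde x))}{\partial \tilde x^i\,\partial \tilde x^j}\bigg|_{\tilde x = \tilde\psi(p)}
 - \omega^k_{ij}(\tilde\psi)(p)\,\frac{\partial (f \circ \tilde\psi^{-1}(\tilde x))}{\partial \tilde x^k}\bigg|_{\tilde x = \tilde\psi(p)} \\
&= \frac{\partial^2 (f \circ \psi^{-1} \circ \psi \circ \tilde\psi^{-1}(\tilde x))}{\partial \tilde x^i\,\partial \tilde x^j}\bigg|_{\tilde x = \tilde\psi(p)}
 - \omega^k_{ij}(\tilde\psi)(p)\,\frac{\partial (f \circ \psi^{-1} \circ \psi \circ \tilde\psi^{-1}(\tilde x))}{\partial \tilde x^k}\bigg|_{\tilde x = \tilde\psi(p)} \\
&= \phi^k{}_{,i}\,\phi^\ell{}_{,j}\,\frac{\partial^2 (f \circ \psi^{-1}(x))}{\partial x^k\,\partial x^\ell}\bigg|_{x = \psi(p)}
 + \phi^k{}_{,ij}\,\frac{\partial (f \circ \psi^{-1}(x))}{\partial x^k}\bigg|_{x = \psi(p)}
 - \omega^k_{ij}(\tilde\psi)(p)\,\phi^\ell{}_{,k}\,\frac{\partial (f \circ \psi^{-1}(x))}{\partial x^\ell}\bigg|_{x = \psi(p)} \\
&= \phi^k{}_{,i}\,\phi^\ell{}_{,j}\,L_{k\ell}(\psi)(f)(p)
 + \phi^r{}_{,i}\,\phi^s{}_{,j}\,\omega^k_{rs}(\psi)(p)\,\frac{\partial (f \circ \psi^{-1}(x))}{\partial x^k}\bigg|_{x = \psi(p)}
 + \bigl(\phi^k{}_{,ij} - \omega^\ell_{ij}(\tilde\psi)(p)\,\phi^k{}_{,\ell}\bigr)\,\frac{\partial (f \circ \psi^{-1}(x))}{\partial x^k}\bigg|_{x = \psi(p)}.
\end{aligned}$$

This is equal to $\phi^k{}_{,i}\,\phi^\ell{}_{,j}\,L_{k\ell}(\psi)(f)(p)$ for all $f \in C^2(M, \mathbb{R})$ if and only if

$$\phi^r{}_{,i}\,\phi^s{}_{,j}\,\omega^k_{rs}(\psi)(p) + \phi^k{}_{,ij} - \omega^\ell_{ij}(\tilde\psi)(p)\,\phi^k{}_{,\ell} = 0.$$

This is easily rearranged to give

$$\omega^\ell_{ij}(\tilde\psi)(p) = \phi^r{}_{,i}\,\phi^s{}_{,j}\,\tilde\phi^\ell{}_{,k}\,\omega^k_{rs}(\psi)(p) + \phi^k{}_{,ij}\,\tilde\phi^\ell{}_{,k},$$

which agrees perfectly with (30.2.1).

30.2.2 Remark: Condition (30.2.2) in Definition 30.2.3 ensures that the second-order operator matrix $L_{ij}(\psi) = \partial_{ij} - \omega^k_{ij}(\psi)\,\partial_k$ yields the coefficients of a type $(0, 2)$ tensor when it is applied to any real-valued function $f \in C^2(M, \mathbb{R})$. In other words, the matrix of values $L_{ij}(\psi)(f) \in C^0(\mathrm{Dom}(\psi), \mathbb{R})$ for all $f \in C^2(M, \mathbb{R})$ must satisfy

$$L_{ij}(\tilde\psi)(f) = \phi^k{}_{,i}\,\phi^\ell{}_{,j}\,L_{k\ell}(\psi)(f)$$

on $\mathrm{Dom}(\psi) \cap \mathrm{Dom}(\tilde\psi)$ for all $f \in C^2(M, \mathbb{R})$ and $\psi, \tilde\psi \in \mathrm{atlas}(M)$.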
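30.2.3 Definition: Tensorization coefficients for an $n$-dimensional $C^2$ manifold $M$ are functions $\omega(\psi) = (\omega^k_{ij}(\psi))_{i,j,k=1}^n \in C^0(\mathrm{Dom}(\psi), \mathbb{R}^{n \times n \times n})$ for $\psi \in \mathrm{atlas}(M)$ such that

$$\forall \psi, \tilde\psi \in \mathrm{atlas}(M),\ \forall i, j, k \in \mathbb{N}_n, \qquad
\omega^k_{ij}(\tilde\psi) = \tilde\phi^k{}_{,\ell}\,\phi^r{}_{,i}\,\phi^s{}_{,j}\,\omega^\ell_{rs}(\psi) + \tilde\phi^k{}_{,\ell}\,\phi^\ell{}_{,ij} \qquad (30.2.2)$$

where $\phi = \psi \circ \tilde\psi^{-1}$ and $\tilde\phi = \tilde\psi \circ \psi^{-1} = \phi^{-1}$.

30.2.4 Remark: Condition (30.2.2) may be written out more fully as follows.

$$\begin{aligned}
&\forall p \in M,\ \forall \psi, \tilde\psi \in \mathrm{atlas}_p(M),\ \forall i, j, k \in \mathbb{N}_n, \\
\omega^k_{ij}(\tilde\psi)(p)
&= \tilde\phi^k{}_{,\ell}(p)\,\phi^r{}_{,i}(p)\,\phi^s{}_{,j}(p)\,\omega^\ell_{rs}(\psi)(p) + \tilde\phi^k{}_{,\ell}(p)\,\phi^\ell{}_{,ij}(p) \\
&= \frac{\partial}{\partial x^\ell}\bigl(\tilde\psi \circ \psi^{-1}(x)\bigr)^k\bigg|_{x = \psi(p)}
   \frac{\partial}{\partial x^i}\bigl(\psi \circ \tilde\psi^{-1}(x)\bigr)^r\bigg|_{x = \tilde\psi(p)}
   \frac{\partial}{\partial x^j}\bigl(\psi \circ \tilde\psi^{-1}(x)\bigr)^s\bigg|_{x = \tilde\psi(p)}
   \omega^\ell_{rs}(\psi)(p) \\
&\quad + \frac{\partial}{\partial x^\ell}\bigl(\tilde\psi \circ \psi^{-1}(x)\bigr)^k\bigg|_{x = \psi(p)}
   \frac{\partial^2}{\partial x^i\,\partial x^j}\bigl(\psi \circ \tilde\psi^{-1}(x)\bigr)^\ell\bigg|_{x = \tilde\psi(p)}. \qquad (30.2.3)
\end{aligned}$$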
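As a sketch of what condition (30.2.2) says in the simplest case, take $n = 1$, so that every index equals 1 and may be dropped. The shorthand $\omega(x)$ and $\tilde\omega(\tilde x)$ for the coefficient expressed in the coordinates of $\psi$ and $\tilde\psi$, and the sample transition map $\phi(\tilde x) = e^{\tilde x}$, are illustrative assumptions and not notation from the text.

```latex
% One-dimensional case of (30.2.2), with \phi = \psi \circ \tilde\psi^{-1}
% and \tilde\phi = \phi^{-1}, so that \tilde\phi'(\phi(\tilde x)) = 1/\phi'(\tilde x).
\[
  \tilde\omega(\tilde x)
  = \frac{1}{\phi'(\tilde x)}\,\phi'(\tilde x)^2\,\omega(\phi(\tilde x))
    + \frac{\phi''(\tilde x)}{\phi'(\tilde x)}
  = \phi'(\tilde x)\,\omega(\phi(\tilde x)) + \frac{\phi''(\tilde x)}{\phi'(\tilde x)} .
\]
% Check of the tensorial rule: write F = f \circ \psi^{-1} and \tilde F = F \circ \phi.
\[
  \tilde F'' - \tilde\omega\,\tilde F'
  = (\phi')^2 F'' + \phi'' F'
    - \Bigl(\phi'\,\omega + \frac{\phi''}{\phi'}\Bigr)\,\phi' F'
  = (\phi')^2\,\bigl(F'' - \omega F'\bigr) .
\]
% Assumed example: \omega = 0 in the chart \psi and \phi(\tilde x) = e^{\tilde x}
% give \tilde\omega = 1.  For F(x) = x^2 one has \tilde F(\tilde x) = e^{2\tilde x} and
\[
  \tilde F'' - \tilde F' = 4e^{2\tilde x} - 2e^{2\tilde x} = 2e^{2\tilde x}
  = \phi'(\tilde x)^2\,F''(\phi(\tilde x)) .
\]
```

In this one-dimensional sketch the inhomogeneous term reduces to $\phi''/\phi'$, so coefficients which vanish in one chart are generally non-zero in another.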
30.2.5 Remark: The tensorization coefficients $\omega^k_{ij}(\psi)$ in Definition 30.2.3 are completely arbitrary for any fixed chart $\psi$, but then the transformation rules completely determine $\omega^k_{ij}(\tilde\psi)$ for all other charts $\tilde\psi$. The values of $\omega^k_{ij}(\psi)(p)$ are completely independent at all points $p \in M$. In particular, the values may not be continuous, bounded or even integrable.
The choice of tensorization coefficients at a point is equivalent to choosing normalized coordinates at that point. (More precisely, it is equivalent to choosing an equivalence class of normalized coordinates. Here “normalized” means that the tensorization coefficients vanish in normalized coordinates at the given point.)

30.2.6 Remark: Although the operators $\partial_{ij}$ yield a symmetric matrix of values $\partial_{ij} f$ when applied to any function $f \in C^2(M)$, the tensorization coefficients $\omega^k_{ij}(\psi)$ are not necessarily symmetric. Therefore the operators $L_{ij}(\tilde\psi)(f)$ in Theorem 30.2.1 do not necessarily yield a symmetric matrix. Although $\partial_{ij} f(p)$ is necessarily continuous and symmetric, $L_{ij}(\tilde\psi)(f)(p)$ may be neither continuous nor symmetric. Asymmetry of the “tensorization coefficients” turns out to be interesting enough to have its own name: “torsion”.

30.2.7 Remark: The Christoffel symbol $\Gamma^k_{ij} = \frac{1}{2}\,g^{kl}\bigl(\partial g_{li}/\partial x^j + \partial g_{lj}/\partial x^i - \partial g_{ij}/\partial x^l\bigr)$ in Section 39.4 satisfies the requirements of Theorem 30.2.1 if $\omega^k_{ij} = \Gamma^k_{ij}$. (See Theorem 39.4.4.) Therefore second covariant derivatives which are based on the Levi-Civita connection in a Riemannian space are tensorial. One can say more than this. Since the Christoffel symbol for the Levi-Civita connection in a Riemannian space provides a valid “tensorization term” for second-order operators, one may add any tensor of type $(1, 2)$ to the Christoffel symbol and still have a valid tensorization term. This gives some idea of the wide range of choice available for defining an affine connection.

30.2.8 Remark: Although tensorization coefficients are apparently quite arbitrary, being defined independently at each point of a manifold, an affine connection is not independent at each point. An affine connection is defined as the differential of a parallelism. This constrains an affine connection to have more properties than are imposed by equation (30.2.2) in Definition 30.2.3.
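The claim in Remark 30.2.7 that a type $(1, 2)$ tensor may be added to a valid tensorization term can be checked by substituting into (30.2.2). The following is a sketch only; the symbols $S$ (an arbitrary type $(1, 2)$ tensor field) and $\omega'$ (the modified coefficients) are names introduced here for illustration.

```latex
% Assumption: S transforms as a type (1,2) tensor, i.e. homogeneously.
\[
  S^k_{ij}(\tilde\psi)
  = \tilde\phi^k{}_{,\ell}\,\phi^r{}_{,i}\,\phi^s{}_{,j}\,S^\ell_{rs}(\psi) .
\]
% Adding this to (30.2.2) shows that \omega' = \omega + S satisfies (30.2.2) again.
\[
  \omega'^k_{ij}(\tilde\psi)
  = \tilde\phi^k{}_{,\ell}\,\phi^r{}_{,i}\,\phi^s{}_{,j}\,
      \bigl(\omega^\ell_{rs}(\psi) + S^\ell_{rs}(\psi)\bigr)
    + \tilde\phi^k{}_{,\ell}\,\phi^\ell{}_{,ij} .
\]
% Conversely, the inhomogeneous term is symmetric in i and j because
% \phi^\ell_{,ij} = \phi^\ell_{,ji}, so it cancels from the antisymmetric part:
\[
  \omega^k_{ij}(\tilde\psi) - \omega^k_{ji}(\tilde\psi)
  = \tilde\phi^k{}_{,\ell}\,\phi^r{}_{,i}\,\phi^s{}_{,j}\,
      \bigl(\omega^\ell_{rs}(\psi) - \omega^\ell_{sr}(\psi)\bigr) .
\]
```

The last line is consistent with Remark 30.2.6: the asymmetric part of the coefficients transforms like a type $(1, 2)$ tensor even though the coefficients themselves do not.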
[ Determine a set of necessary and sufficient conditions for a set of tensorization coefficients to be the differential of a parallelism. ]
The summation symbols in (30.2.3) have been omitted due to lack of space. Despite this, (30.2.3) is still considerably more difficult to write and read than (30.2.2). However, sometimes it is desirable to ensure that abbreviated notations such as those in (30.2.2) do have the intended, well-defined meaning. In particular, it often happens that there is confusion between functions which are valued in the manifold’s point space and functions which are valued in the chart’s range space. This is usually not too dangerous when there is only one chart in a given context, but in the presence of multiple charts, hidden errors can arise which are difficult to identify and remove.
[ Express the tensorization coefficients concept in terms of something more abstract to make it more similar to connection forms and covariant derivatives. ]

30.2.9 Remark: One might well ask why there is a negative sign in the operator $L_{ij}(\psi) = \partial_{ij} - \omega^k_{ij}(\psi)\,\partial_k$ in Theorem 30.2.1. The sign is chosen to match the standard definition for the Christoffel symbol, but the Christoffel symbol’s sign is itself chosen so that it is positive for transforming contravariant tensors and negative for transforming covariant tensors. It happens that the first-order operator $\partial_i$ is covariant. So it gets a negative sign for the coefficients when subjected to a covariant derivative. It is a very general problem that contravariant tensors are regarded as “ordinary” while covariant tensors are regarded as “opposite” in some sense. It is too late to reverse the long course of history in differential geometry. (See also Remark 13.6.2 on this subject.) So it is necessary to simply tolerate some of the perplexing ways that signs appear in tensor calculus.

[ Near here, present the $\mathbb{R}^1$ and $\mathbb{R}^2$ versions of tensorization coefficients in explicit detail. In $\mathbb{R}^1$, there is a single term $\phi''/\phi'$, or something like that. This just happens to be the curvature of the function $\phi$. Coincidence or not? You be the judge! ] [ Present the third-order tangent operator tensorization. ]

30.2.10 Remark: In a later chapter, differential operators $L_{a,b}$ or $t^{[2]}_{p,a,b,\psi}$ will be defined with the assistance of a connection to make the operators construct objects which have tensorially transforming coefficients. The actual tangent objects will be the same as without a connection, but the coefficients will transform tensorially because the second-order parts of the transformation rules will be incorporated into the object’s construction algorithm. (This is a bit like incorporating a motion-compensation gyro unit in a camera to correct for attitude variation.)

30.3. Higher-order tangent vectors

[ The plethora of tangent object classes is a bit overwhelming. They should be summarized in a table and dealt with in a very systematic and digestible fashion. Each class gets a bundle, a space, an object notation, a space notation, an operator notation, a transformation rule, an atlas, and so forth. All of this should be in a neat and tidy table with references to sections where they are defined. ]

In the following, $\mathrm{Sym}(n, \mathbb{R})$ means the set of real symmetric $n \times n$ matrices. (See Notation 11.5.3.)

30.3.1 Definition: A second-order tangent (component) tuple for an $n$-dimensional $C^2$ manifold $(M, A_M)$ is a tuple $(p, a, b, \psi) \in \bigcup_{\psi \in A_M} \bigl(\mathrm{Dom}(\psi) \times \mathrm{Sym}(n, \mathbb{R}) \times \mathbb{R}^n \times \{\psi\}\bigr)$.

A computational second-order tangent (component) tuple for an $n$-dimensional $C^2$ manifold $(M, A_M)$ with indexed atlas $(\psi_\alpha)_{\alpha \in I}$ is a tuple $(x, a, b, \alpha) \in \bigcup_{\alpha \in I} \bigl(\mathrm{Range}(\psi_\alpha) \times \mathrm{Sym}(n, \mathbb{R}) \times \mathbb{R}^n \times \{\alpha\}\bigr)$.
30.3.2 Remark: The second-order tangent vector component-tuple equivalence rules in Definition 30.3.3 are based on the corresponding rules for second-order operators in Theorem 30.1.3. Although these vector transformation rules, and probably all others, are defined to match the corresponding rules for differential operators, it does not follow that all tangent objects are differential operators. What does follow is that all differential operators on function spaces on manifolds yield tangent spaces which have transformation rules in terms of chart transition maps. We use the differential operators as a quick way to determine the transformation rules. But then the corresponding tangent spaces are general mathematical classes which can be used for a wide variety of purposes, to represent the results of a wide variety of constructions and calculations.

By comparison, consider ordinary vector fields $X \in X^1(T(M))$ on manifolds. (See Section 29.5 for vector fields.) At each point $p \in M$, the vector $X(p)$ is an element of $T_p(M)$, but $X(p)$ is not necessarily a differential operator. Similarly, if $W \in X^1(T^*(M))$ is a covector field, the covector $W(p) \in T^*(M)$ is not necessarily the gradient $W(p) = df(p)$ of some real-valued function $f \in C^1(M)$. This would be a very strong constraint on the field $W$. However, first-order operators and differentials are used to conveniently determine the transformation rules. Then vectors like $X(p)$ are thought of simply as indicating a direction, not an operator. It just happens that a first-order operator has a direction, but other things have a direction too. In the same way, all directional objects on a manifold use differential operators for their transformation rules without actually being operators.
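30.3.3 Definition: A second-order tangent vector for an $n$-dimensional $C^2$ manifold $M \mathrel{-\!\!<} (M, A_M)$ is an equivalence class $[(p, a, b, \psi)]$ of tuples $(p, a, b, \psi) \in \bigcup_{\psi \in A_M} \bigl(\mathrm{Dom}(\psi) \times \mathrm{Sym}(n, \mathbb{R}) \times \mathbb{R}^n \times \{\psi\}\bigr)$, where the tuples $(p_1, a_1, b_1, \psi_1)$ and $(p_2, a_2, b_2, \psi_2)$ for $\psi_1, \psi_2 \in A_M$ are said to be equivalent whenever $p_1 = p_2 = p$ and

$$\forall i, j = 1 \ldots n, \qquad
a_2^{ij} = \sum_{k,\ell=1}^n
  \frac{\partial}{\partial x^k}\bigl(\psi_2 \circ \psi_1^{-1}(x)\bigr)^i\bigg|_{x = \psi_1(p)}
  \frac{\partial}{\partial x^\ell}\bigl(\psi_2 \circ \psi_1^{-1}(x)\bigr)^j\bigg|_{x = \psi_1(p)}
  a_1^{k\ell}, \qquad (30.3.1)$$

$$\forall i = 1 \ldots n, \qquad
b_2^i = \sum_{k,\ell=1}^n
  \frac{\partial^2}{\partial x^k\,\partial x^\ell}\bigl(\psi_2 \circ \psi_1^{-1}(x)\bigr)^i\bigg|_{x = \psi_1(p)} a_1^{k\ell}
  + \sum_{k=1}^n \frac{\partial}{\partial x^k}\bigl(\psi_2 \circ \psi_1^{-1}(x)\bigr)^i\bigg|_{x = \psi_1(p)} b_1^k. \qquad (30.3.2)$$

30.3.4 Notation: $t^{[2]}_{p,a,b,\psi}$ for $p \in M$, $a \in \mathrm{Sym}(n, \mathbb{R})$, $b \in \mathbb{R}^n$ and $\psi \in \mathrm{atlas}_p(M)$ for an $n$-dimensional $C^2$ manifold $M$ with $n \in \mathbb{Z}^+_0$ denotes the equivalence class $[(p, a, b, \psi)]$ in Definition 30.3.3. In other words, $t^{[2]}_{p,a,b,\psi} = [(p, a, b, \psi)]$.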
30.3.5 Remark: Notation 30.3.6 gives notations for sets of second-order tangent vectors. Definition 30.3.7 specifies the obvious linear structure on the pointwise set of second-order tangent vectors. Definition 30.3.10 gives the obvious tangent bundle structure for second-order tangent vectors.

30.3.6 Notation: $T^{[2]}_p(M)$ denotes the set of second-order tangent vectors $t^{[2]}_{p,a,b,\psi}$ at a point $p$ in a $C^2$ manifold $M$. That is,

$$T^{[2]}_p(M) = \bigl\{ t^{[2]}_{p,a,b,\psi};\ \psi \in \mathrm{atlas}_p(M),\ a \in \mathrm{Sym}(n, \mathbb{R}),\ b \in \mathbb{R}^n \bigr\}.$$

$T^{[2]}(M) = \bigcup_{p \in M} T^{[2]}_p(M)$ denotes the set of all second-order tangent vectors for a $C^2$ manifold $M$. That is,

$$T^{[2]}(M) = \bigl\{ t^{[2]}_{p,a,b,\psi};\ p \in M,\ \psi \in \mathrm{atlas}_p(M),\ a \in \mathrm{Sym}(n, \mathbb{R}),\ b \in \mathbb{R}^n \bigr\}.$$
30.3.7 Definition: The second-order tangent space at a point $p$ in a $C^2$ manifold $(M, A_M)$ is the set $T^{[2]}_p(M)$, where $n = \dim(M)$ and $t^{[2]}_{p,a,b,\psi} = [(p, a, b, \psi)]$ denotes the equivalence class of $(p, a, b, \psi)$ with respect to the equivalence relation in Definition 30.3.3, together with the linear space operations inherited from $\mathrm{Sym}(n, \mathbb{R})$ and $\mathbb{R}^n$.

30.3.8 Remark: More precisely, the second-order tangent space at $p \in M$ in Definition 30.3.7 is the tuple $(\mathbb{R}, T^{[2]}_p(M), \sigma_{\mathbb{R}}, \tau_{\mathbb{R}}, \sigma_{T^{[2]}_p(M)}, \mu)$, where $T^{[2]}_p(M)$ is as above, $\sigma_{\mathbb{R}}$ and $\tau_{\mathbb{R}}$ are the standard operations of addition and multiplication for $\mathbb{R}$, $\sigma_{T^{[2]}_p(M)} : T^{[2]}_p(M) \times T^{[2]}_p(M) \to T^{[2]}_p(M)$ is the addition operation on $T^{[2]}_p(M)$ defined by $t^{[2]}_{p,a_1,b_1,\psi} + t^{[2]}_{p,a_2,b_2,\psi} \mapsto t^{[2]}_{p,a_1+a_2,b_1+b_2,\psi}$, and $\mu : \mathbb{R} \times T^{[2]}_p(M) \to T^{[2]}_p(M)$ is the scalar multiplication operation $(\lambda, t^{[2]}_{p,a,b,\psi}) \mapsto t^{[2]}_{p,\lambda a,\lambda b,\psi}$.
30.3.9 Remark: Just as in Definition 28.7.1, the definitions of vector addition and scalar multiplication in Definition 30.3.7 are independent of the choice of coordinates. This is because the chart transition rule in equations (30.3.1) and (30.3.2) is linear with respect to the components $a$ and $b$.

[ Define higher-order tangent bundles as for Definition 28.8.1. Definitions 30.3.10 and 30.3.11 need to be fixed. ]

30.3.10 Definition: The second-order tangent bundle of a $C^2$ manifold $(M, A_M)$ is the $C^0$ manifold $(T^{[2]}(M), A_{T^{[2]}(M)})$, where
(i) $T^{[2]}(M) = \bigcup_{p \in M} T^{[2]}_p(M) = \{ t^{[2]}_{p,a,b,\psi};\ \psi \in A_M,\ p \in \mathrm{Dom}(\psi),\ a \in \mathrm{Sym}(n, \mathbb{R}),\ b \in \mathbb{R}^n \}$, where $n = \dim(M)$, and
(ii) $A_{T^{[2]}(M)} = \{ \tilde\psi;\ \psi \in A_M \}$, where for any chart $\psi \in A_M$, the chart $\tilde\psi : \pi^{-1}(\mathrm{Dom}(\psi)) \to \mathbb{R}^{n+n^2+n}$ is defined by $\tilde\psi : t^{[2]}_{p,a,b,\psi} \mapsto (\psi(p), a, b)$, where $\pi : T^{[2]}(M) \to M$ is defined by $\pi : [(p, a, b, \psi)] \mapsto p$.

The function $\pi : T^{[2]}(M) \to M$ is called the projection map of the total tangent space $T^{[2]}(M)$.
30.3.11 Definition: The topological second-order tangent bundle of a $C^2$ manifold $(M, A_M)$ is the topological space $(T^{[2]}(M), \mathcal{T}_{T^{[2]}(M)})$, where $\mathcal{T}_{T^{[2]}(M)}$ is the topology induced on $T^{[2]}(M)$ by the second-order total tangent space atlas $A_{T^{[2]}(M)}$.

30.3.12 Notation: $D_W$ for any second-order tangent vector $W = t^{[2]}_{p,a,b,\psi} \in T^{[2]}(M)$ denotes the corresponding tangent operator $\partial^{[2]}_{p,a,b,\psi} \in \mathring{T}^{[2]}(M)$.

30.3.13 Notation: $\hat D_W$ for a second-order tangent vector $W = t^{[2]}_{p,a,b,\psi} \in T^{[2]}(M)$ denotes the corresponding tagged second-order tangent operator $(p, D_W) = (p, \partial^{[2]}_{p,a,b,\psi}) \in \hat T^{[2]}(M)$. Thus $\hat D_W = (p, D_W) = \hat\partial^{[2]}_{p,a,b,\psi} = (p, \partial^{[2]}_{p,a,b,\psi})$.
30.3.14 Remark: Theorem 30.1.3 implies that Definitions 30.3.3 and 30.1.1 are consistent with each other and the $D$-operation in Notation 30.3.12 commutes with changes of chart.

30.3.15 Remark: A useful mnemonic for equations (30.3.1) and (30.3.2) is

$$a_2^{ij} = \frac{\partial \psi_2^i}{\partial \psi_1^k}\,\frac{\partial \psi_2^j}{\partial \psi_1^\ell}\,a_1^{k\ell}, \qquad
b_2^i = \frac{\partial^2 \psi_2^i}{\partial \psi_1^k\,\partial \psi_1^\ell}\,a_1^{k\ell} + \frac{\partial \psi_2^i}{\partial \psi_1^k}\,b_1^k.$$

The equations are reduced to the first-order tangent vector transformation rule if $a_1$ is zero. If the first-order component is ignored, the transformation rule reduces to that for contravariant coefficients of 2-tensors in $T_p^{2,0}(M)$. The rules can be further abbreviated as follows.

$$\bar a^{ij} = \bar x^i{}_k\,\bar x^j{}_\ell\,a^{k\ell}, \qquad
\bar b^i = \bar x^i{}_{k\ell}\,a^{k\ell} + \bar x^i{}_k\,b^k.$$

Equations (30.3.1) and (30.3.2) are reminiscent of the equations in Theorem 28.10.9.

30.3.16 Remark: It is essentially true that $T(M) \subseteq T^{[2]}(M)$, and precisely true that $\mathring{T}(M) \subseteq \mathring{T}^{[2]}(M)$ and $\hat T(M) \subseteq \hat T^{[2]}(M)$, since the second-order tangent vectors and operators reduce to the corresponding first-order vectors and operators when the second-order coefficient matrix $a$ is zero. The higher-order operators of order $k > 2$ may also be defined along the same pattern as the second-order operators. The notations for these spaces would then be $T^{[k]}(M)$ and so forth. Each one of these spaces has a corresponding total tangent space and can have a higher-order tangent bundle defined for it in Chapter 35.

[ Should determine the relation between second-order tangent vectors and degree 2 tensors in $T_p^{2,0}(M)$. Is $T^{[2]}_p(M)$ possibly some sort of extension of $T_p^{2,0}(M)$? Answer: It seems like the components for the second-order operator are tensorial (i.e. they are equal to the components of tensor objects) if they are (1) defined in a fixed chart and (2) transformed between charts according to covariant transformation rules using an affine connection. The Hessian of a real-valued function at a stationary point (probably) does not need such covariant transformation to correct for 2nd order diffeomorphism derivatives. ]

30.3.17 Remark: There are good arguments against the component order $(p, a, b, \psi)$ in Definition 30.3.3. The ordering $(\psi, p, b, a)$ is preferable in some ways. For example, it is sometimes useful to consider all spaces $T^{[k]}(M)$ together in one combined set. It is useful to be able to say that $(\psi, p, a_1, a_2, a_3) \in T^{[3]}(M)$ and $(\psi, p, a_1, a_2) \in T^{[2]}(M)$ are equivalent when the symmetric $n \times n \times n$ array $a_3$ equals zero. This could be done by considering all such sequences to be elements of a set of infinite sequences for which all but a finite number of components equal zero. This is much easier to do if the components have increasing order. Nevertheless, decreasing order is used here for the time being because it matches the decreasing left to right order in which polynomials are written. (See Remark 33.2.6 for similar discussion.)
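The transformation rules (30.3.1) and (30.3.2) can be illustrated by a one-dimensional sketch. The transition map $g = \psi_2 \circ \psi_1^{-1}$, the sample choice $g(x) = x^2$, and the chart expressions $F_1 = f \circ \psi_1^{-1}$ and $F_2 = f \circ \psi_2^{-1}$ are names assumed here for illustration. It is also assumed, in line with Section 30.1, that the operator with components $(a, b)$ acts on a chart expression $F$ as $aF'' + bF'$.

```latex
% One-dimensional case of (30.3.1) and (30.3.2), with x = \psi_1(p):
\[
  a_2 = g'(x)^2\,a_1, \qquad b_2 = g''(x)\,a_1 + g'(x)\,b_1 .
\]
% Assumed example: g(x) = x^2 at x = 1, so g(1) = 1, g'(1) = 2, g''(1) = 2.
% The tuple (a_1, b_1) = (1, 0) is then equivalent to (a_2, b_2) = (4, 2).
% Consistency with the operator action, using F_1 = F_2 \circ g:
\[
  F_1''(x) = g'(x)^2\,F_2''(g(x)) + g''(x)\,F_2'(g(x)),
\]
\[
  a_1 F_1''(1) + b_1 F_1'(1) = 4\,F_2''(1) + 2\,F_2'(1)
  = a_2 F_2''(1) + b_2 F_2'(1) .
\]
```

The example shows that a tuple with vanishing first-order component in one chart generally acquires a non-zero first-order component in another chart, which is why the pair $(a, b)$ has to be transformed as a unit.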
30.4. Higher-order tangent spaces

[ Maybe should present higher-order cotangent spaces here, and higher-order tensor spaces? What do the linear and multilinear duals of higher-order tangent spaces look like? Are they useful? What’s needed is a big classification diagram to interrelate all of the spaces! ]

These spaces include pointwise higher-order tangent spaces and total tangent spaces for both vectors and operators. Of particular interest are the elliptic second-order operators. Higher-order operators were defined in Section 30.1.

30.4.1 Definition: The second-order tangent operator space at a point $p$ in a $C^2$ manifold $M$ is the set $\mathring{T}^{[2]}_p(M)$ of all tangent operators at $p \in M$, together with the operations of pointwise addition and multiplication by real numbers. Thus the linear combination $\lambda_1 L_1 + \lambda_2 L_2 : C^2(M) \to \mathbb{R}$ of two second-order tangent operators $L_1, L_2 \in \mathring{T}^{[2]}_p(M)$ is defined by

$$\forall L_1, L_2 \in \mathring{T}^{[2]}_p(M),\ \forall \lambda_1, \lambda_2 \in \mathbb{R},\ \forall f \in C^2(M), \qquad
(\lambda_1 L_1 + \lambda_2 L_2)(f) = \lambda_1 L_1(f) + \lambda_2 L_2(f).$$
30.4.2 Remark: Although the action of the operators in spaces $\mathring{T}^{[2]}_p(M)$ is restricted to the space $C^2(M)$ so that operators at all points can act on the same space, it is tacitly assumed that every operator in $\mathring{T}^{[2]}_p(M)$ is also defined on the space $\mathring{C}^2_p(M)$ of $C^2$ functions which are defined in a neighbourhood of $p$. In fact, the tangent operators are assumed to act on any reasonable kind of function on $M$ or a subset of $M$, whether the function is classically differentiable or not.

30.4.3 Definition: The tagged second-order tangent operator space at a point $p$ in a $C^2$ manifold $M$ is the set $\hat T^{[2]}_p(M)$ of all pairs $(p, L)$ such that $L \in \mathring{T}^{[2]}_p(M)$, together with the operations of pointwise addition and multiplication by real numbers on the operator component (as in Definition 30.4.1). Thus the linear combination $\lambda_1 (p, L_1) + \lambda_2 (p, L_2)$ of two tagged tangent operators $(p, L_1), (p, L_2) \in \hat T^{[2]}_p(M)$ is defined by

$$\forall L_1, L_2 \in \mathring{T}^{[2]}_p(M),\ \forall \lambda_1, \lambda_2 \in \mathbb{R}, \qquad
\lambda_1 (p, L_1) + \lambda_2 (p, L_2) = (p, \lambda_1 L_1 + \lambda_2 L_2).$$

[ Here could present coordinate basis vectors analogous to Definition 28.7.4 etc. ]

30.4.4 Remark: Second-order total tangent vector and operator spaces may be defined as in Section 28.8 by defining a standard atlas for the sets $T^{[2]}(M)$ and $\hat T^{[2]}(M)$.

30.5. Drop functions for second-level tangent vectors

[ This section must be checked and rewritten. ]

30.5.1 Remark: In Definition 30.5.2, $(uv + vu)/2$ means the matrix $(a^{ij})_{i,j=1}^n$ with $a^{ij} = (u^i v^j + v^i u^j)/2$.

30.5.2 Definition: The drop function from $T^{(2)}(M)$ to $T^{[2]}(M)$ for a $C^2$ manifold $M$ is the function $\varpi : T^{(2)}(M) \to T^{[2]}(M)$ defined by $\varpi\bigl(t^{(2)}_{V,(u,w),\tilde\psi}\bigr) = t^{[2]}_{p,(uv+vu)/2,w,\psi}$ for all $p \in M$, $\psi \in \mathrm{atlas}(M)$, $V = t_{p,v,\psi} \in T_p(M)$, and $u, v, w \in \mathbb{R}^n$ with $n = \dim(M)$, where $\tilde\psi$ is the chart for $T^{(2)}(M)$ corresponding to $\psi$.

30.5.3 Theorem: The drop function in Definition 30.5.2 is chart-independent.

Proof: It must be shown that the transformation rules for $T^{(2)}(M)$ in Theorem 28.10.9 match the rules for $T^{[2]}(M)$ in Definition 30.3.3. Let $\psi_1, \psi_2 \in \mathrm{atlas}(M)$ and $p \in \mathrm{Dom}(\psi_1) \cap \mathrm{Dom}(\psi_2)$, and let $u_1, v_1, w_1 \in \mathbb{R}^n$ and $u_2, v_2, w_2 \in \mathbb{R}^n$ be the respective components $u, v, w \in \mathbb{R}^n$ in Definition 30.5.2 for $\psi_1$ and $\psi_2$. In abbreviated notation, the transformation rules for $T^{(2)}(M)$ on lines (28.10.1) and (28.10.2) may be written as

$$u_2^i = \frac{\partial \psi_2^i}{\partial \psi_1^j}\,u_1^j \qquad (30.5.1)$$

and

$$w_2^i = \frac{\partial^2 \psi_2^i}{\partial \psi_1^j\,\partial \psi_1^k}\,u_1^j v_1^k + \frac{\partial \psi_2^i}{\partial \psi_1^j}\,w_1^j. \qquad (30.5.2)$$
If line (30.5.1) is combined with the transformation rule $v_2^i = (\partial \psi_2^i / \partial \psi_1^j)\,v_1^j$ for $v_1$ and $v_2$, the result is

$$u_2^i v_2^j = \frac{\partial \psi_2^i}{\partial \psi_1^k}\,\frac{\partial \psi_2^j}{\partial \psi_1^\ell}\,u_1^k v_1^\ell.$$

This yields $a_2^{ij} = (\partial \psi_2^i / \partial \psi_1^k)(\partial \psi_2^j / \partial \psi_1^\ell)\,a_1^{k\ell}$ by symmetrizing over $i$ and $j$, where $a_1, a_2$ are the respective matrices in Definition 30.3.3. This together with line (30.5.2), symmetrized over $j$ and $k$, gives an exact match with the abbreviated transformation rule in Remark 30.3.15.

30.5.4 Remark: The drop function in Definition 30.5.2 is an extension of the drop function for vertical vectors in Definition 28.11.6. To see this, note that $t^{(2)}_{V,(u,w),\tilde\psi}$ in Definition 30.5.2 is vertical if and only if $u = 0$. Then $\varpi\bigl(t^{(2)}_{V,(u,w),\tilde\psi}\bigr) = t^{[2]}_{p,0,w,\psi} \equiv t_{p,w,\psi}$. This agrees with $\varpi_V\bigl(t^{(2)}_{V,(0,w),\tilde\psi}\bigr)$ in Definition 28.11.6. Thus the natural extension of the drop function to general vectors in $T(T(M))$ yields second-order tangent vectors when the (chart-dependent) horizontal component is non-zero.

[ Try to extend the (vertical) drop function from $T^{(2)}(M)$ to general $T^{(k)}(M)$. ]
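The two symmetrization steps in the proof of Theorem 30.5.3 can be written out as follows. This is a sketch in the abbreviated notation of Remark 30.3.15; the shorthand $J^i{}_k$ and $H^i{}_{jk}$ for the first and second derivatives of the transition map is introduced here for illustration only.

```latex
% Shorthand (assumed): J^i_k = \partial\psi_2^i/\partial\psi_1^k,
% H^i_{jk} = \partial^2\psi_2^i/\partial\psi_1^j\partial\psi_1^k = H^i_{kj}.
% Second-order component: with a^{ij} = (u^i v^j + v^i u^j)/2 in each chart,
\[
  a_2^{ij} = \tfrac12\bigl(u_2^i v_2^j + v_2^i u_2^j\bigr)
  = J^i{}_k\,J^j{}_\ell\,\tfrac12\bigl(u_1^k v_1^\ell + v_1^k u_1^\ell\bigr)
  = J^i{}_k\,J^j{}_\ell\,a_1^{k\ell} .
\]
% First-order component: since H^i_{jk} is symmetric in j and k, only the
% symmetric part of u_1^j v_1^k contributes to (30.5.2), so with b = w,
\[
  b_2^i = w_2^i
  = H^i{}_{jk}\,\tfrac12\bigl(u_1^j v_1^k + v_1^j u_1^k\bigr) + J^i{}_j\,w_1^j
  = H^i{}_{jk}\,a_1^{jk} + J^i{}_j\,b_1^j .
\]
```

Both lines have the same form as the mnemonic rules for $a_2^{ij}$ and $b_2^i$ in Remark 30.3.15.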
30.6. Elliptic second-order operators
30.6.1 Remark: The Hessian operator in flat space is a second-order operator, but there are two kinds of Hessian for differentiable manifolds. The Hessian at a critical point of a function $f$, namely a point $p \in M$ where $(df)_p = 0$, turns out to be a covariant tensor in $\mathring{T}^{[2]}_p(M)$ which does not require a connection for its definition. However, the Hessian at a non-critical point of a real-valued function does require a connection for its definition. (See Section 29.3 for discussion of the Hessian at critical points.) See Greene/Wu [67], page 7, for the Hessian for general functions using a connection. Since $D^2 f(X, Y) = X(Y f) - (D_X Y) f$, it seems that $(D_X Y) f = 0$ if $(df)_p = 0$. So the connection doesn’t come into it.

[ This must be checked since they have fields $X$ and $Y$ rather than vectors at a point. ]

30.6.2 Remark: Definite and semi-definite matrices were introduced in Definition 11.4.7. Elliptic second-order operators have a second-order component which is positive semi-definite or positive definite. The chart transition rules in Definitions 30.3.3 and 30.1.1 guarantee that this property is chart-independent.

30.6.3 Definition: A (weakly) elliptic second-order tangent operator at a point $p \in M$ of a $C^2$ manifold $M$ is an operator $\partial^{[2]}_{p,a,b,\psi} \in \mathring{T}^{[2]}_p(M)$ such that the matrix $a \in \mathrm{Sym}(n, \mathbb{R})$ is positive semi-definite.

A strictly elliptic second-order tangent operator at a point $p \in M$ of a $C^2$ manifold $M$ is an operator $\partial^{[2]}_{p,a,b,\psi} \in \mathring{T}^{[2]}_p(M)$ such that the matrix $a \in \mathrm{Sym}(n, \mathbb{R})$ is positive definite.

30.6.4 Theorem: If $p \in M$ is a local maximum of a function $u \in C^2(M)$, where $M$ is a $C^2$ manifold, then $D_W(u) \le 0$ for all positive semi-definite operators $W \in T^{[2]}_p(M)$.

[ Near here maybe do some vector field versions of Theorem 30.6.4 etc. ]
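The chart-independence asserted in Remark 30.6.2 follows directly from rule (30.3.1). The computation below is a sketch; the shorthand $J^i{}_k$ for the Jacobian of the transition map and the test covector $\xi$ are symbols introduced here for illustration.

```latex
% Shorthand (assumed): J^i_k = \partial(\psi_2\circ\psi_1^{-1})^i/\partial x^k at x = \psi_1(p).
% For any \xi \in \mathbb{R}^n, put \eta_k = \xi_i J^i_k.  Then by (30.3.1),
\[
  \xi_i\,\xi_j\,a_2^{ij}
  = (\xi_i J^i{}_k)\,(\xi_j J^j{}_\ell)\,a_1^{k\ell}
  = \eta_k\,\eta_\ell\,a_1^{k\ell} \;\ge\; 0
  \quad\text{whenever } a_1 \text{ is positive semi-definite.}
\]
% The matrix J is invertible, so \eta \ne 0 if and only if \xi \ne 0.
% Hence a_2 is positive definite whenever a_1 is positive definite.
```

The first-order component $b$ plays no role in this computation, since only (30.3.1) is used and (30.3.1) does not involve $b$.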
30.7. Higher-order vector fields

This section deals with functions on $C^k$ manifolds $M$ which are valued in spaces such as $T^{[k]}_p(M)$ (introduced in Definition 30.3.7) at points $p \in M$. In other words, these higher-order vector fields, which may be vector-valued or operator-valued, are cross-sections of fibrations such as $T^{[k]}(M)$ and $\hat T^{[k]}(M)$. The definitions follow the pattern of Section 29.5.

30.7.1 Definition: A second-order vector field in a subset $S$ of a $C^2$ manifold $M$ is a map $X : S \to T^{[2]}(M)$ such that $\pi(X(p)) = p$ for all $p \in S$, where $\pi : T^{[2]}(M) \to M$ is the standard projection map.

[ Mention near here that the Laplacian operator at a point $p$ in a $C^2$ manifold resides in $\mathring{T}^{[2]}_p(M)$. Quote the operator here when it has been properly defined in a later chapter. ]
30.7.2 Definition: The action of a second-order vector field $X$ in the manifold $M$ on a function $f \in C^2(M)$ is the map $D_X f : M \to \mathbb{R}$ defined by $D_X f : p \mapsto D_{X(p)} f$.

[ Define a vector field of order 0 for simple multiplication by a real-valued function? ]

30.7.3 Definition: A vector field $X$ of order $k \in \mathbb{Z}^+_0$ in a $C^{r+k}$ manifold $M$ is said to be of class $C^r$ for $r \in \overline{\mathbb{Z}}{}^+_0$ if

$$\forall f \in C^{r+k}(M), \qquad D_X f \in C^r(M).$$

30.7.4 Notation: $X^r(T^{[k]}(M))$ denotes the set of $C^r$ vector fields of order $k$ on a $C^{r+k}$ manifold $M$, where $k \in \mathbb{Z}^+_0$ and $r \in \overline{\mathbb{Z}}{}^+_0$.

30.7.5 Remark: The invariance of the class of $C^r$ vector fields of order 2 on a $C^{s+2}$ manifold is discussed in Remark 19.5.3.

30.7.6 Definition: An elliptic second-order vector field in a $C^2$ manifold $M$ is a second-order vector field $X \in X^0(T^{[2]}(M))$ such that the operator $X(p)$ is an elliptic second-order operator for all $p \in M$.

[ Give maximum principles for weakly and strongly elliptic vector fields. ]

30.7.7 Remark: Even if a metric or connection is needed for the calculation of some kinds of higher-order vector fields, they are still well-defined vector fields in spaces $X^r(T^{[k]}(M))$ without the metric or connection. For example, operators such as the Laplacian acting on real-valued functions require a metric to determine the components of the Laplacian vector field, but the field “lives” in a space $X^r(T^{[2]}(M))$, which does not itself involve any metric in its definition. It is important to distinguish between the structures required for constructing an object and the structures required for “housing” the object.
30.8. Higher-order vector fields for families of curves

[ Must define higher-order vector fields along curves and families of curves. These are very important for 3-point convexity maximum principles. ]

[ Apply $d^{[2]}\phi$ to families of curves such as in Figure 30.8.1, where $\phi$ is the map from $\gamma(1, t)$ to $\gamma(s, t)$. This will be applicable to convexity maximum principles. Should also say exactly what the relation is between such differentials and vector fields along curves and families of curves. ]

Figure 30.8.1 Family of curves [ Diagram labels: curves $\gamma(0, t)$ and $\gamma(1, t)$; points $\gamma(0, 0)$, $\gamma(0, 1)$, $\gamma(s, 0)$, $\gamma(s, 1)$, $\gamma(1, 0)$, $\gamma(1, 1)$. ]
Chapter 31 Differentials on manifolds

31.1 Pointwise differentials versus induced maps
31.2 The differential of a real-valued function
31.3 The differential of a differentiable map
31.4 The differential of a curve
31.5 One-parameter transformation families and vector fields

In the olden days, differentials were thought of as the $df$ and $dx$ in expressions such as $df/dx$ or $\int f\,dx$. As mentioned in EDM2 [34], 106.B, these differentials are not well-defined, but may be thought of as the limits of infinitesimal quantities. In differential geometry, there are well-defined mathematical objects which are analogous to the ill-defined differentials of the 18th century. For a function $f : M \to \mathbb{R}$ and point $p \in M$ for a $C^1$ manifold $M$, the differential $(df)_p$ is defined as the map from differential operators at $p$ to derivatives. Then for a chart $\psi \in \mathrm{atlas}(M)$, the object $d\psi^i$ is well-defined for $i = 1 \ldots n$, where $n = \dim(M)$. Similarly, objects of the form $\partial/\partial\psi^i$ are well-defined as differential operators. Neither $d\psi^i$ nor $\partial/\partial\psi^i$ is well-defined in 18th century calculus. It is important not to confuse the well-defined concepts of differential geometry with the abstract differentials of elementary calculus.
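The well-definedness of $d\psi^i$ and $\partial/\partial\psi^i$ can be made concrete with a short computation. The following sketch assumes that “the map from differential operators at $p$ to derivatives” means $(df)_p(L) = L(f)$, and that a first-order operator at $p$ is written $L = v^j\,\partial/\partial\psi^j|_p$ with components $v \in \mathbb{R}^n$; the precise definitions are those of the later sections of this chapter, not this sketch.

```latex
% Assumed meaning of the coordinate operator at p:
\[
  \frac{\partial}{\partial\psi^j}\bigg|_p (f)
  = \frac{\partial (f\circ\psi^{-1}(x))}{\partial x^j}\bigg|_{x=\psi(p)},
  \qquad (df)_p(L) = L(f) .
\]
% Applying this to the chart component f = \psi^i, where \psi^i\circ\psi^{-1}(x) = x^i:
\[
  (d\psi^i)_p\Bigl(v^j\,\frac{\partial}{\partial\psi^j}\bigg|_p\Bigr)
  = v^j\,\frac{\partial x^i}{\partial x^j} = v^j\,\delta^i_j = v^i .
\]
% Hence, for any f \in C^1(M),
\[
  (df)_p = \frac{\partial (f\circ\psi^{-1}(x))}{\partial x^i}\bigg|_{x=\psi(p)}\,(d\psi^i)_p .
\]
```

Under these assumptions the objects $(d\psi^i)_p$ simply read off the components of an operator with respect to the chart $\psi$.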
31.1. Pointwise differentials versus induced maps

31.1.1 Remark: Small differences in the way first-order differentials are defined have a significant effect on how higher-order differentials are defined. There seem to be essentially two ways of defining differentials of functions and maps. These approaches may be called “pointwise differentials” and “induced maps”. In the first approach, the differential is calculated at each point of the domain manifold first, and this is then aggregated to construct a function whose domain is the same as the original function. In the second approach, both the domain and range of the function are regarded as differentiable manifolds with tangent vector fibrations, and the differential function is defined between those two fibrations. This second form is a kind of “lift” or “push-forth” of the original function. A variant of the second approach is the “pull-back”, which is some kind of dual or inverse of the push-forth.

31.1.2 Remark: The notation commonly used for the differential of a function $f$ at a point $p$ is “$(df)_p$”. This may be interpreted as either the value $(df)(p)$ of a function $df$ at $p$ or the restriction $(df)\big|_{T_p(M)}$ to the tangent space $T_p(M)$. The first interpretation leads to pointwise differentials whereas the second leads to induced maps. These two approaches to defining differentials are not greatly different in the case of first-order differentials, but they are significantly different when higher-order differentials are constructed. In the case of real-valued functions on manifolds, the pointwise style of differential is generally adopted, whereas the “induced map” style is more usual for the differential of a map between two manifolds. In this chapter, an attempt will be made to disentangle the various ways in which differentials of various orders may be defined.

31.1.3 Remark: Some texts are unclear as to what the domains and ranges of the functions in their definitions are. However, the following are rough indications of the approaches to defining differentials by some authors. Examples of the pointwise differential approach are EDM2 [34], sections 105.I–J and Kobayashi/Nomizu [26], page 7. Examples of the induced map approach are Crampin/Pirani [11], section 10.3, pages 250–251, Darling [13], section 2.8, page 43, and Gallot/Hulin/Lafontaine [19], section 1.64. Examples of the pull-back approach to differentials are Malliavin [35], page 90, and Gallot/Hulin/Lafontaine [19], page 38.

31.1.4 Remark: Differentials of real-valued functions may be regarded as a special case of differentials of maps between manifolds because $\mathbb{R}$ may be regarded as a 1-dimensional manifold whose tangent space at each point is a copy of $\mathbb{R}$. (See Darling [13], pages 40–41, for a comment on this.)

31.1.5 Remark: The naming of “induced” maps is quite appropriate because they are reminiscent of the way electrical current is induced in a wire by a neighbouring wire via a magnetic field. The induced map between tangent spaces is in some sense parallel to the map between the point spaces. This is illustrated in Figure 31.1.1.
Figure 31.1.1 Induced maps for a manifold map. (The diagram shows the maps $\phi : M_1 \to M_2$, $\phi_* : T(M_1) \to T(M_2)$ and $\phi_{**} : T(T(M_1)) \to T(T(M_2))$, the projections $\pi_1$, $\pi_2$, $\tilde\pi_1$, $\tilde\pi_2$, and the sample elements $p \mapsto \phi(p)$, $L \mapsto \phi_*(L)$ and $z \mapsto \phi_{**}(z)$.)

31.1.6 Remark: The following table indicates the domains and ranges of some forms of differentials and induced maps.

$$\begin{array}{lll}
\text{manifold map } \phi : M_1 \to M_2 & \text{real-valued function } f : M \to \mathbb{R} & \text{curve } \gamma : \mathbb{R} \to M \\
d\phi : M_1 \to T(M_1, M_2) & df : M \to T^*(M) & d\gamma : \mathbb{R} \to T(\mathbb{R}, M) \\
\phi_* : T(M_1) \to T(M_2) & f_* : T(M) \to \mathbb{R} & \gamma_* : \mathbb{R} \to T(M) \\
d^2\phi : M_1 \to T(M_1, T(M_1, M_2)) & d^2f : M \to T(M, T^*(M)) & d^2\gamma : \mathbb{R} \to T(\mathbb{R}, T(\mathbb{R}, M)) \\
(d\phi)_* : T(M_1) \to T(T(M_1, M_2)) & (df)_* : T(M) \to T(T^*(M)) & (d\gamma)_* : \mathbb{R} \to T(T(\mathbb{R}, M)) \\
d(\phi_*) : T(M_1) \to T(T(M_1), T(M_2)) & d(f_*) : T(M) \to T^*(T(M)) & d(\gamma_*) : \mathbb{R} \to T(\mathbb{R}, T(M)) \\
\phi_{**} : T(T(M_1)) \to T(T(M_2)) & f_{**} : T(T(M)) \to \mathbb{R} & \gamma_{**} : \mathbb{R} \to T(T(M))
\end{array}$$

[ Probably will interpret $T(\mathbb{R}, M)$ in this table as equivalent to $T(M)$. Should mention this in Section 29.4. Also try to find a simplification of spaces like $T(M, T^*(M))$. Also must deal near here with differentials of diffeomorphism families $\phi : \mathbb{R} \times M \to M$. Also deal with path families $\gamma : \mathbb{R}^m \mathrel{\mathring{\to}} M$. ]

31.1.7 Remark: Broadly speaking, the $d$ operator yields a function of points on manifolds, which therefore emphasizes the dependence on the base point alone, whereas the induced map operator yields a function on the combined base point and tangent vector. However, the double application of the $d$ operator does not yield a function with a simple dependence on pairs of manifold points as might have been desired. In all cases, there is a need to construct differentiable structures on top of other differentiable structures. This makes the situation inconveniently complicated for all second-order differentials.

31.1.8 Remark: The relations between the two styles of first-order differentials of manifold maps are as follows.

$$\begin{aligned}
\forall V \in T(M),\qquad & \phi_*(V) = (d\phi)_{\pi(V)}(V) \\
& \phi_* = \bigcup_{p\in M} (d\phi)_p \\
\forall p \in M,\ \forall V \in T_p(M),\qquad & (d\phi)_p(V) = \phi_*(V) \\
\forall p \in M,\qquad & (d\phi)_p = \phi_*\big|_{\pi^{-1}(\{p\})},
\end{aligned}$$

where $\pi : T(M) \to M$ is the projection map for $T(M)$. The corresponding relations for real-valued functions are the following.

$$\begin{aligned}
\forall V \in T(M),\qquad & f_*(V) = (df)_{\pi(V)}(V) \\
& f_* = \bigcup_{p\in M} (df)_p \\
\forall p \in M,\ \forall V \in T_p(M),\qquad & (df)_p(V) = f_*(V) \\
\forall p \in M,\qquad & (df)_p = f_*\big|_{\pi^{-1}(\{p\})}.
\end{aligned}$$
31.1.9 Remark: Although the differentials $df$ of real-valued functions $f : M \to \mathbb{R}$ may be interpreted as cross-sections of the cotangent fibration, $df \in X^0(T^*(M))$, such an interpretation does not seem to be possible in the case of the differential $d\phi$ of a manifold map. In this case, the range of $d\phi$ is not the entire total tangent space $T(M_1, M_2)$. The range of $d\phi$ is in fact $\bigcup_{p\in M_1} T_{p,\phi(p)}(M_1, M_2) \ne \bigcup_{p\in M_1,\,q\in M_2} T_{p,q}(M_1, M_2)$. It is just possible, though, that this could be regarded as a cross-section of some sort of fibration over $M_1$ alone if $T(M_1, M_2)$ can be regarded as a fibration over $M_1$.

31.1.10 Remark: The simplification of the form of the maps $df$ and $f_*$ relative to $d\phi$ and $\phi_*$ has the consequence that information is lost. The maps $df$ and $f_*$ both lose the information about the values of the function $f$. (Here $\phi$ denotes $f$ regarded as a manifold map $\phi : M \to \mathbb{R}$.) In the case of $df$, the manifold map $d\phi : M \to T(M, \mathbb{R})$ is replaced with $df : M \to T^*(M)$, and covectors $(df)_p \in T^*_p(M)$ contain no information on the value of $f(p)$. Similarly, $\phi_* : T(M) \to T(\mathbb{R})$ is replaced with $f_* : T(M) \to \mathbb{R}$, and $f_*(V) \in \mathbb{R}$ for $V \in T_p(M)$ contains no information about $f(p)$. A value such as $\phi_*(V) \in T_{\phi(p)}(\mathbb{R})$, however, contains the value $\phi(p)$.
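The loss of information noted in Remark 31.1.10 can be made concrete with a small worked example. The sketch below assumes $M = \mathbb{R}$ with the identity chart $\psi_0$; the functions $f$ and $g$ are illustrative choices which do not appear in the text.

```latex
% Illustrative assumptions: M = \mathbb{R}, identity chart \psi_0,
% f(x) = x^2 and g(x) = x^2 + 1.
\begin{align*}
(df)_p(t_{p,v,\psi_0}) &= v\,\partial_x(x^2)\big|_{x=p} = 2pv, \\
(dg)_p(t_{p,v,\psi_0}) &= v\,\partial_x(x^2+1)\big|_{x=p} = 2pv.
\end{align*}
% Hence df = dg and f_* = g_*, although f(p) \ne g(p) for all p.
% Regarding f and g as manifold maps \phi_f, \phi_g : M \to \mathbb{R},
% the induced maps retain the base point of the image:
\begin{align*}
(\phi_f)_*(t_{p,v,\psi_0}) &= t_{p^2,\,2pv,\,\psi_0} \in T_{p^2}(\mathbb{R}), \\
(\phi_g)_*(t_{p,v,\psi_0}) &= t_{p^2+1,\,2pv,\,\psi_0} \in T_{p^2+1}(\mathbb{R}).
\end{align*}
```

So the real-valued forms cannot distinguish $f$ from $g$ in this example, whereas the manifold-map forms can.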
31.2. The differential of a real-valued function

The derivative of a real-valued function of several real variables is sometimes specified as the sequence of partial derivatives of the function with respect to the independent variables. These partial derivatives have simple transformation rules under changes of coordinates if the function is $C^1$. Another approach is to specify the directional derivative of the function in every direction at every point. In the case of real-valued functions on differentiable manifolds, this second approach is generally adopted. This associates with each $C^1$ function $f$ and direction $V$ at each point $p$ a directional derivative $D_V f(p)$. If $f$ and $p$ are fixed, then $D_V f(p)$ is a function of the direction $V$. This is the “differential” of the function $f$ at $p$.

The differential of a function is a map from the set of all vectors $V$ at a point to the real-valued derivative of the function in the direction $V$. This is clearly linear with respect to $V$. Therefore a differential is a linear functional on the set of vectors. In other words, it is a member of the dual linear space.

31.2.1 Definition: For any function $f \in C^1(M)$ and point $p$ in a $C^1$ manifold $M$, the differential of $f$ at $p$ is the map $(df)_p : T_p(M) \to \mathbb{R}$ defined by

$$\forall V \in T_p(M),\qquad (df)_p(V) = D_V f.$$

The differential of $f$ at $p$ (for tangent operators) is the map $(df)_p : \mathring{T}_p(M) \to \mathbb{R}$ defined by

$$\forall L \in \mathring{T}_p(M),\qquad (df)_p(L) = L(f).$$

31.2.2 Remark: Some spaces and maps in Definition 31.2.1 are illustrated in Figure 31.2.1. The same notation $(df)_p$ is used for both the tangent vector and operator versions of differentials to economize on notations. (A notation such as $(\mathring{df})_p$ would have been the natural choice for the operator version, but this would look silly. It increases the line spacing too much. But with such a notation, one could write $(df)_p = (\mathring{df})_p \circ D$, where $D$ is the map $V \mapsto D_V$ as in Figure 31.2.1.)

Figure 31.2.1 Differential of a real-valued function at a point for vectors and operators. (Diagram labels: the map $D : V \mapsto D_V$ from $T_p(M)$ to $\mathring{T}_p(M)$, the differentials $(df)_p$ on both spaces with values in $\mathbb{R}$, and $f \in C^1(M)$, $p \in M$.)
31.2.3 Remark: For any $V = t_{p,v,\psi} \in T_p(M)$, the value of $(df)_p(V)$ may be written explicitly as

$$(df)_p(t_{p,v,\psi}) = \sum_{i=1}^n v^i\,\partial_{x^i}\bigl(f \circ \psi^{-1}(x)\bigr)\Big|_{x=\psi(p)} = \partial_{p,v,\psi}(f).$$

31.2.4 Remark: Since $(df)_p = \{(V, D_V f);\ V \in T_p(M)\}$, whereas $D_V = \{(f, D_V f);\ f \in C^1(M)\}$, it seems that $D_V$ and $(df)_p$ are projections of the map $(f, V) \mapsto D_V f$ onto $f$ and $V$ respectively. So the derivative and the differential are just different ways of viewing the same map.

31.2.5 Remark: It is often desirable to define $(df)_p$ to act on functions in $\mathring{C}^1_p(M, \mathbb{R})$ instead of $C^1(M)$, because then it is not necessary to extend local functions to global functions with desired local properties. Another advantage is the ability to write expressions such as $(df)_p = \sum_{i=1}^n \partial_i^{p,\psi}(f)\,(d\psi^i)_p$ for $f \in \mathring{C}^1_p(M, \mathbb{R})$, with $n = \dim(M)$, as in Theorem 31.2.9. It is not true in general that the coordinate functions $\psi^i$ are in $C^1(M)$. For simplicity, Definition 28.5.1 was written in terms of $C^1(M)$, and Definition 31.2.1 follows this pattern. (See Remark 28.5.5 for discussion of this.) It will be understood that the differentials in Definition 31.2.1 apply to functions in any natural extension of $C^1(M)$ which yields linear functionals on the tangent vector or tangent operator space.

31.2.6 Remark: The cotangent set $T^*_p(M)$ is equal to $\{(df)_p : T_p(M) \to \mathbb{R};\ f \in C^1(M)\}$ because the set of differentials $(df)_p$ spans $T^*_p(M)$. This can be shown with functions $f$ which are chart component functions $\psi^i$ for $\psi \in \mathrm{atlas}(M)$ multiplied by suitable functions with compact support. (See Remark 20.12.10 for smooth functions with compact support.)

If the differentials $(df)_p$ are defined in terms of tangent operators in $\mathring{T}_p(M)$ rather than $T_p(M)$, then the cotangent space may be defined as $\mathring{T}^*_p(M) = \{(df)_p : \mathring{T}_p(M) \to \mathbb{R};\ f \in C^1(M)\}$. The difference between these cotangent spaces (probably) may be safely glossed over in most situations.

31.2.7 Theorem: For any $C^1$ manifold $M$ and chart $\psi \in \mathrm{atlas}_p(M)$, the sequence of vectors $((d\psi^i)_p)_{i=1}^n$ is a basis for $T^*_p(M)$, where $n = \dim(M)$.

31.2.8 Remark: Theorem 31.2.9 expresses the differential $(df)_p$ of a $C^1$ real-valued function in terms of the unit basis vectors in Theorem 31.2.7. The cotangent vectors $(d\psi^i)_p$ may be abbreviated to $d^i_{p,\psi}$ or $d^i$. (See Remark 28.5.16 for the corresponding tangent operator abbreviations. See Remark 29.2.6 for the unit cotangent vector notation.)

31.2.9 Theorem: For any $f \in \mathring{C}^1_p(M, \mathbb{R})$, $(df)_p = \sum_{i=1}^n \partial_i^{p,\psi}(f)\,(d\psi^i)_p$.
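The coordinate expansion of Theorem 31.2.9 may be easier to follow with numbers substituted. The following is a minimal sketch, assuming $M = \mathbb{R}^2$ with the identity chart; the function $f$, the point $p$ and the component vector $v$ are hypothetical choices made only for illustration.

```latex
% Illustrative assumptions: M = \mathbb{R}^2, \psi = \mathrm{id} with components
% \psi^1 = x, \psi^2 = y, f(x,y) = x^2 y, p = (1,2), v = (3,5).
\begin{align*}
\partial_1^{p,\psi}(f) &= \partial_x(x^2y)\big|_{(1,2)} = 2xy\big|_{(1,2)} = 4, \\
\partial_2^{p,\psi}(f) &= \partial_y(x^2y)\big|_{(1,2)} = x^2\big|_{(1,2)} = 1, \\
(df)_p &= 4\,(d\psi^1)_p + 1\,(d\psi^2)_p, \\
(df)_p(t_{p,v,\psi}) &= 4\cdot 3 + 1\cdot 5 = 17.
\end{align*}
```

The last line uses $(d\psi^i)_p(t_{p,v,\psi}) = v^i$, which follows from the formula in Remark 31.2.3 applied to $f = \psi^i$.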
31.2.10 Remark: The pointwise differentials $(df)_p$ of a function $f \in C^1(M)$ at points $p \in M$ are combined to construct a cotangent vector field $df \in X(T^*(M))$ in Definition 31.2.11. There are two ways to proceed depending on whether $(df)_p$ is interpreted as $(df)(p)$ or $(df)|_{T_p(M)}$. As mentioned in Section 31.1, the former yields differentials whereas the latter yields induced maps.

[ For “coordinate differentials” such as $(d\psi^i)_p$, see Crampin/Pirani [11], pages 37 and 249. ]
31.2.11 Definition: For any function $f \in C^1(M)$ for a $C^1$ manifold $M$, the differential of $f$ is the cotangent vector field $df \in X(T^*(M))$ defined by

$$\forall p \in M,\ \forall V \in T_p(M),\qquad (df)(p)(V) = (df)_p(V).$$

The differential of $f$ (for tangent operators) is the function $df \in X(\mathring{T}^*(M))$ defined by

$$\forall p \in M,\ \forall L \in \mathring{T}_p(M),\qquad (df)(p)(L) = (df)_p(L).$$

The differential of $f$ (for tagged tangent operators) is the function $df \in X(\hat{T}^*(M))$ defined by

$$\forall p \in M,\ \forall (p, L) \in \hat{T}_p(M),\qquad (df)(p)(p, L) = (df)_p(L).$$
[ Near here, show that $df \in X^k(T^*(M))$ if $f \in C^{k+1}(M)$, or something like that. Crampin/Pirani [11], pages 76 and 249, say that $df \in X^0(T^*(M))$. ]

31.2.12 Remark: If $f$ is defined only on an open subset of $M$ in Definition 31.2.11, then the domain of $df$ is correspondingly restricted. Thus $df : T(\mathrm{Dom}(f)) \to \mathbb{R}$, where $\mathrm{Dom}(f)$ is understood to have the appropriate restricted atlas.

31.2.13 Theorem: If $f \in C^{k+1}(M)$ for a $C^{k+1}$ manifold $M$ with $k \in \mathbb{Z}^+_0$, then $df \in X^k(T^*(M))$.

Proof: . . .

[ Should find out if $df$ is a special case of the exterior derivative. I think it is. ]

31.2.14 Remark: Higher-order differentials (such as differentials of differentials) of real-valued functions are defined in Section 32.1. Differentials of real-valued functions for higher-order operators are defined in Section 32.5.

[ Look at the action of differentials and induced maps on vector fields in $X^k(T(M))$. ]
31.3. The differential of a differentiable map

A differentiable map is a $C^1$ map $\phi : M_1 \to M_2$ for $C^1$ manifolds $M_1$ and $M_2$. The differential of a differentiable map is covariant with respect to the source manifold $M_1$ and contravariant with respect to the target manifold $M_2$. That is, the differential behaves like a cotangent vector in $T^*(M_1)$ and like a tangent vector in $T(M_2)$. Differentials of real-valued functions and curves may be thought of as special cases of differentials of maps between manifolds.

Differentiable maps are defined in Section 27.8. The first-order differential of a differentiable map is defined in this section both in terms of tangent vectors (Definition 31.3.1) and tangent operators (Definition 31.3.18). The operator form is clearly the simpler of the two. This shows the value of tangent operators for presenting and motivating definitions, but for practical calculations, the tangent vector version (defined in terms of components) is required.

31.3.1 Definition: The differential at a point $p \in M_1$ of a $C^1$ map $\phi : M_1 \to M_2$, for $C^1$ manifolds $M_1$ and $M_2$ with $n_1 = \dim(M_1)$ and $n_2 = \dim(M_2)$, is the linear map $(d\phi)_p : T_p(M_1) \to T_{\phi(p)}(M_2)$ defined by

$$\forall \psi_1 \in \mathrm{atlas}_p(M_1),\ \forall v_1 \in \mathbb{R}^{n_1},\ \forall \psi_2 \in \mathrm{atlas}_{\phi(p)}(M_2),\qquad (d\phi)_p(t_{p,v_1,\psi_1}) = t_{\phi(p),v_2,\psi_2}, \tag{31.3.1}$$

where $v_2 \in \mathbb{R}^{n_2}$ is defined by

$$\forall k = 1, \ldots n_2,\qquad v_2^k = \sum_{i=1}^{n_1} v_1^i\,\partial_{x^i}\bigl(\psi_2^k \circ \phi \circ \psi_1^{-1}(x)\bigr)\Big|_{x=\psi_1(p)} = \partial_{p,v_1,\psi_1}(\psi_2^k \circ \phi).$$

31.3.2 Remark: The pointwise differential in Definition 31.3.1 is extended to the whole manifold $M_1$ in Definition 31.3.4. The result is a map $d\phi : M_1 \to T(M_1, M_2)$ such that $(d\phi)(p) \in T_{p,\phi(p)}(M_1, M_2)$ for all $p \in M_1$. The pointwise and total double tangent spaces $T_{p,q}(M_1, M_2)$ and $T(M_1, M_2)$ are given by Definitions 29.4.1 and 29.4.3 respectively.
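The component formula in Definition 31.3.1 can be illustrated by a worked example. This is only a sketch under stated assumptions: the manifolds, the identity charts and the polar-coordinate map $\phi$ below are chosen for illustration and are not taken from the text.

```latex
% Illustrative assumptions: M_1 = (0,\infty)\times\mathbb{R} and M_2 = \mathbb{R}^2,
% both with identity charts \psi_1, \psi_2, and
% \phi(r,\theta) = (r\cos\theta, r\sin\theta).
\begin{align*}
v_2^1 &= v_1^1\,\partial_r(r\cos\theta) + v_1^2\,\partial_\theta(r\cos\theta)
       = v_1^1\cos\theta - v_1^2\,r\sin\theta, \\
v_2^2 &= v_1^1\,\partial_r(r\sin\theta) + v_1^2\,\partial_\theta(r\sin\theta)
       = v_1^1\sin\theta + v_1^2\,r\cos\theta.
\end{align*}
% At p = (2, \pi/2) with v_1 = (1,1): \phi(p) = (0,2) and v_2 = (-2, 1), so
\[
(d\phi)_p(t_{p,(1,1),\psi_1}) = t_{(0,2),\,(-2,1),\,\psi_2}.
\]
```

The induced map $\phi_*$ of Definition 31.3.6 takes the same value on this vector, since it is assembled from the pointwise differentials.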
31.3.3 Remark: In Gallot/Hulin/Lafontaine [19], 1.36, page 17, the notation $T_p f$ is used for $(d\phi)_p$. In Federer [105], the notation $D$ is used. The $d$ notation has the advantage that $dx^i$ etc. is given real meaning by the definition. An added advantage would be if the differential coincides with the exterior derivative for scalar fields. (This coincidence probably does not occur!)

[ An attempt should be made to check whether such formulas as $ds^2 = \cos^2\theta\,d\phi^2 + d\theta^2$ are made meaningful by the definition of differential of a real function. ]

31.3.4 Definition: The differential of a $C^1$ map $\phi : M_1 \to M_2$ for $C^1$ manifolds $M_1$ and $M_2$ is the map $d\phi : M_1 \to T(M_1, M_2)$ with $(d\phi)(p) = (d\phi)_p$ for all $p \in M_1$.

[ Should have an explicit definition of the application of tangent operators and general operators $\partial_{x^k}$ etc. to $\mathbb{R}^n$-valued functions. ]

31.3.5 Remark: The tangent operator $\partial_{p,v_1,\psi_1}$ in Definition 31.3.1 is as defined in Definition 28.5.1. It is applied to each of the $n_2$ components of the function $\psi_2 \circ \phi : M_1 \to \mathbb{R}^{n_2}$. As a shorthand, one could write $v_2 = \partial_{p,v_1,\psi_1}(\psi_2 \circ \phi)$. In fact, this extension of differential operators to $\mathbb{R}^n$-valued functions will be adopted for convenience. Then one could write instead of equation (31.3.1):

$$\forall \psi_1 \in \mathrm{atlas}_p(M_1),\ \forall v_1 \in \mathbb{R}^{n_1},\ \forall \psi_2 \in \mathrm{atlas}_{\phi(p)}(M_2),\qquad (d\phi)_p(t_{p,v_1,\psi_1}) = t_{\phi(p),\,\partial_{p,v_1,\psi_1}(\psi_2 \circ \phi),\,\psi_2}.$$
31.3.6 Definition: The induced map of a map $\phi \in C^1(M_1, M_2)$ for $C^1$ manifolds $M_1$ and $M_2$ is the map $\phi_* : T(M_1) \to T(M_2)$ defined by

$$\forall z \in T(M_1),\qquad \phi_*(z) = (d\phi)_{\pi_1(z)}(z),$$

where $(d\phi)_p$ is as in Definition 31.3.1 with $p = \pi_1(z)$, and $\pi_1$ is the projection map of $T(M_1)$.
31.3.7 Remark: Definition 31.3.6 joins together the pointwise differentials $(d\phi)_p$ of Definition 31.3.1 for all points $p \in M_1$. In other words, $\phi_* = \bigcup_{p\in M_1} (d\phi)_p$. The corresponding Definition 31.3.24 for tangent operators is a little less tidy because of the requirement to add tags to the operators.

[ See Malliavin [35], proposition I.3.5 for definitions of $\phi_*$ and $\phi^*$. ]

[ The induced map of a differentiable map between differentiable manifolds must be a fibre bundle map. This should be a theorem in the differentiable fibre bundles chapter. See Malliavin [35], proposition I.7.1.6. ]

[ Is the map $\phi_*$ a tangent bundle map? See Malliavin [35], proposition I.7.1.3. ]

[ Should define here the extension of the induced map $\phi_*$ to $n$-frames. Since $\phi_*$ sends tangent vectors to tangent vectors, this clearly induces a unique corresponding map of the $n$-tuples of tangent vectors. Therefore there must be a map $\phi_* : P(M_1) \to P(M_2)$ for the principal tangent bundles. This probably has something to do with connections on principal fibre bundles. ]

31.3.8 Theorem: If $\phi : M_1 \to M_2$ is a $C^r$ map between $C^r$ manifolds $M_1$ and $M_2$ for $r \in \overline{\mathbb{Z}}^+$, then the induced map $\phi_*$ is of class $C^{r-1}$.

Proof: [ See Malliavin [35], proposition I.7.2.8 for proof of Theorem 31.3.8. Only need to show that $v_2$ in Definition 31.3.1 is $C^{r-1}$. ]

31.3.9 Definition: A $C^1$ map $\phi : M_1 \to M_2$ is said to be regular at $p \in M_1$ if $(d\phi)_p$ is injective.

31.3.10 Definition: An immersion of a $C^1$ manifold $M_1$ into a $C^1$ manifold $M_2$ is a differentiable map from $M_1$ to $M_2$ which is regular at all points of $M_1$.

31.3.11 Definition: An embedding of a $C^1$ manifold $M_1$ into a $C^1$ manifold $M_2$ is an immersion of $M_1$ into $M_2$ which is injective.

[ Must also define pull-back $\phi^*$ and push-forth for tangent vectors and vector fields. ]
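Definitions 31.3.9 to 31.3.11 can be compared on three simple maps. The sketch below assumes $M_1 = \mathbb{R}$ and $M_2 = \mathbb{R}^2$ with identity charts; the maps $\phi_a$, $\phi_b$, $\phi_c$ are hypothetical examples introduced here, and $\phi'(t)$ denotes the componentwise derivative.

```latex
% Illustrative assumptions: M_1 = \mathbb{R} with identity chart \psi_0,
% M_2 = \mathbb{R}^2 with identity chart \psi.
% By Definition 31.3.1, (d\phi)_t(t_{t,\alpha,\psi_0}) = t_{\phi(t),\,\alpha\phi'(t),\,\psi}.
\begin{align*}
\phi_a(t) &= (t^2, t^3), & \phi_a'(t) &= (2t, 3t^2), \\
\phi_b(t) &= (\cos t, \sin t), & \phi_b'(t) &= (-\sin t, \cos t), \\
\phi_c(t) &= (t, t^2), & \phi_c'(t) &= (1, 2t).
\end{align*}
% \phi_a: \phi_a'(0) = (0,0), so (d\phi_a)_0 is not injective; \phi_a is not regular at 0.
% \phi_b: \phi_b'(t) \ne (0,0) for all t, so \phi_b is an immersion, but
%         \phi_b(0) = \phi_b(2\pi), so \phi_b is not injective and not an embedding.
% \phi_c: \phi_c'(t) \ne (0,0) for all t and \phi_c is injective, so \phi_c is an
%         embedding in the sense of Definition 31.3.11.
```

Note that $\phi_a$ is injective as a map but fails to be an immersion, so injectivity of $\phi$ and injectivity of $(d\phi)_p$ are independent conditions.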
[ There might be other ‘standard’ definitions of submanifolds than Definition 31.3.12. ]

31.3.12 Definition: A submanifold of a manifold $M_1$ is a manifold $M_2$ such that $M_2 \subseteq M_1$ and the identity map $i : M_2 \to M_1$ is an embedding of $M_2$ into $M_1$.

31.3.13 Remark: The induced map $\phi_*$ of a $C^r$ diffeomorphism $\phi : M_1 \to M_2$ between $C^r$ manifolds $M_1$ and $M_2$ for $r \in \overline{\mathbb{Z}}^+$ is (probably) a $C^{r-1}$ diffeomorphism from $T(M_1)$ to $T(M_2)$. This has a natural extension to a map from $X^{r-1}(M_1)$ to $X^{r-1}(M_2)$.

[ Define here the image of a vector field under a diffeomorphism. See Gallot et alia 1.63, p.24. ]

31.3.14 Definition: The induced map for vector fields of a $C^1$ diffeomorphism $\phi : M_1 \to M_2$, where $M_1$ and $M_2$ are $C^1$ manifolds, is the map $\phi_* : X^0(M_1) \to X^0(M_2)$ defined by

$$\forall X \in X^0(M_1),\ \forall p \in M_2,\qquad \phi_*(X)(p) = \phi_*(X(\phi^{-1}(p))),$$

where $\phi_* : T(M_1) \to T(M_2)$ denotes the induced map of $\phi$. That is,

$$\forall X \in X^0(M_1),\qquad \phi_*(X) = \phi_* \circ (X \circ \phi^{-1}).$$
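Definition 31.3.14 may be checked on a one-dimensional example. This is a minimal sketch, assuming $M_1 = M_2 = \mathbb{R}$ with the identity chart $\psi_0$; the diffeomorphism $\phi$ and the vector field $X$ are illustrative choices, not taken from the text.

```latex
% Illustrative assumptions: M_1 = M_2 = \mathbb{R} with identity chart \psi_0,
% \phi(x) = 2x (so \phi^{-1}(y) = y/2), and X(x) = t_{x,\,x^2,\,\psi_0}.
\begin{align*}
\phi_*(X)(y) &= \phi_*\bigl(X(\phi^{-1}(y))\bigr)
              = \phi_*\bigl(t_{y/2,\;y^2/4,\;\psi_0}\bigr) \\
             &= t_{\phi(y/2),\;\phi'(y/2)\,y^2/4,\;\psi_0}
              = t_{y,\;y^2/2,\;\psi_0}.
\end{align*}
```

The inverse $\phi^{-1}$ is needed to find which point of $M_1$ is carried to $y$, which is why the definition is stated for diffeomorphisms.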
[ The notation $J^\phi_{\psi_1,\psi_2}$ will be used for the Jacobian matrix of a map from one manifold to another, where $\psi_1$ and $\psi_2$ are charts for the two different manifolds. The notation $Z_{\beta\alpha}$ will be used for a change of coordinates at a single point $p$ in a single manifold $M$, where $\psi_\alpha, \psi_\beta \in \mathrm{atlas}_p(M)$. ]

31.3.15 Definition: Let $\phi : M_1 \to M_2$ be a $C^1$ map between $C^1$ manifolds $M_1$ and $M_2$ with $n_1 = \dim(M_1)$ and $n_2 = \dim(M_2)$. Then the Jacobian matrix of $\phi$ at $p \in M_1$ with respect to charts $\psi_1 \in \mathrm{atlas}_p(M_1)$ and $\psi_2 \in \mathrm{atlas}_{\phi(p)}(M_2)$ is the matrix $J^\phi_{\psi_1,\psi_2}(p) \in M_{n_2 n_1}(\mathbb{R})$ defined by

$$J^\phi_{\psi_1,\psi_2}(p)^i{}_j = \frac{\partial}{\partial x^j}\Bigl(\psi_2^i \circ \phi \circ \psi_1^{-1}(x)\Bigr)\Big|_{x=\psi_1(p)}.$$

31.3.16 Theorem: Let $M, M_1, M_2$ be $C^1$ manifolds. Let $p \in M$. Let $\phi_1 \in \mathring{C}^1_p(M, M_1)$ and $\phi_2 \in \mathring{C}^1_p(M, M_2)$. Let $\psi \in \mathrm{atlas}_p(M)$, $\psi_1 \in \mathrm{atlas}_{\phi_1(p)}(M_1)$ and $\psi_2 \in \mathrm{atlas}_{\phi_2(p)}(M_2)$. Then

$$\begin{aligned}
J^{\phi_1\times\phi_2}_{\psi,\psi_1\oplus\psi_2}(p)^i{}_j
 &= \frac{\partial}{\partial x^j}\Bigl(\bigl((\psi_1 \circ \phi_1) \oplus (\psi_2 \circ \phi_2)\bigr) \circ \psi^{-1}(x)\Bigr)^i\Big|_{x=\psi(p)} \\
 &= \mathrm{concat}\bigl(J^{\phi_1}_{\psi,\psi_1}(p)^{\cdot}{}_j,\ J^{\phi_2}_{\psi,\psi_2}(p)^{\cdot}{}_j\bigr)^i,
\end{aligned}$$

where the Jacobian matrices $J^{\phi_1}_{\psi,\psi_1}(p) \in M_{n_1 n}(\mathbb{R})$ and $J^{\phi_2}_{\psi,\psi_2}(p) \in M_{n_2 n}(\mathbb{R})$ are as in Definition 31.3.15, where $n = \dim(M)$, $n_1 = \dim(M_1)$ and $n_2 = \dim(M_2)$.

[ Give an example to clarify Theorem 31.3.16. ]

[ See Definition 7.7.6 for the concatenation operator. ]

31.3.17 Remark: By putting $L = \partial_{p,v_1,\psi_1}$ for $t_{p,v_1,\psi_1} \in T_p(M_1)$ and $f = \psi_2^k \circ \phi$ in Definition 31.3.1, the operator form of the differential is constructed in Definition 31.3.18. These are equivalent definitions. Definition 31.3.18 is simpler whereas Definition 31.3.1 is more useful for computation. The operator definition provides a convenient shorthand and mnemonic for the component version of the differential, which is defined so as to be consistent with the operator version.

31.3.18 Definition: The differential (for tangent operators) at a point $p \in M_1$ of a map $\phi \in C^1(M_1, M_2)$, where $M_1$ and $M_2$ are $C^1$ manifolds, is the linear map $(d\phi)_p : \mathring{T}_p(M_1) \to \mathring{T}_{\phi(p)}(M_2)$ defined by

$$\forall L \in \mathring{T}_p(M_1),\ \forall f \in C^1(M_2),\qquad \bigl((d\phi)_p(L)\bigr)(f) = L(f \circ \phi).$$
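The operator form $((d\phi)_p(L))(f) = L(f \circ \phi)$ can be compared with the component form on a small example. The following sketch assumes identity charts on $\mathbb{R}$ and $\mathbb{R}^2$; the map $\phi$ and the operator $L$ are hypothetical choices made for illustration.

```latex
% Illustrative assumptions: M_1 = \mathbb{R} with identity chart \psi_0,
% M_2 = \mathbb{R}^2 with identity chart \psi_2, \phi(t) = (t, t^2),
% and L = \partial_{p,1,\psi_0}, that is, L(g) = g'(p).
\begin{align*}
\bigl((d\phi)_p(L)\bigr)(f) &= L(f\circ\phi)
  = \frac{d}{dt} f(t, t^2)\Big|_{t=p} \\
 &= \partial_{y^1} f(y)\big|_{y=(p,p^2)} + 2p\,\partial_{y^2} f(y)\big|_{y=(p,p^2)}
  = \partial_{\phi(p),\,(1,2p),\,\psi_2}(f).
\end{align*}
% The component version (Definition 31.3.1) gives v_2 = (1, 2p) directly, since
% \psi_2^1\circ\phi(t) = t and \psi_2^2\circ\phi(t) = t^2.
```

Both routes give the component vector $(1, 2p)$ at $\phi(p) = (p, p^2)$, which is the consistency asserted in Theorem 31.3.20 for this particular case.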
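Figure 31.3.1 The differential of a map for first-order operators. (Diagram labels: $L \in \mathring{T}_p(M_1)$ and $(d\phi)_p(L) \in \mathring{T}_{\phi(p)}(M_2)$, the map $(d\phi)_p$, the projections $\pi_1$, $\pi_2$, the map $\phi : M_1 \to M_2$ with $p \mapsto \phi(p)$, the charts $\psi_1$, $\psi_2$ with values in $\mathbb{R}^{n_1}$, $\mathbb{R}^{n_2}$, and the functions $f$ and $f \circ \phi$ with values in $\mathbb{R}$.)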
31.3.19 Remark: Figure 31.3.1 shows spaces and maps relevant to Definition 31.3.18, and the projection maps $\pi_k : \mathring{T}(M_k) \to M_k$ for $k = 1, 2$.

Theorem 31.3.20 gives the relation between the component version of the map differential in Definition 31.3.1 and the operator version in Definition 31.3.18. The same notation $(d\phi)_p$ is used for both versions, but this should not cause confusion.

31.3.20 Theorem: Definitions 31.3.1 and 31.3.18 are consistent with each other. That is, $D_{(d\phi)_p(V)} = (d\phi)_p(D_V)$ for all $V \in T_p(M_1)$ and $p \in M_1$.

Proof: Let $V = t_{p,v_1,\psi_1}$ and $f \in C^1(M_2)$. Then $D_V = \partial_{p,v_1,\psi_1}$ and so by Definition 31.3.18,

$$\begin{aligned}
(d\phi)_p(D_V)(f) &= \partial_{p,v_1,\psi_1}(f \circ \phi) \\
 &= \sum_{i=1}^{n_1} v_1^i\,\partial_{x^i}\bigl(f \circ \phi \circ \psi_1^{-1}(x)\bigr)\Big|_{x=\psi_1(p)} \\
 &= \sum_{i=1}^{n_1} v_1^i\,\partial_{x^i}\bigl(f \circ \psi_2^{-1} \circ \psi_2 \circ \phi \circ \psi_1^{-1}(x)\bigr)\Big|_{x=\psi_1(p)} \\
 &= \sum_{j=1}^{n_2} \partial_{y^j}\bigl(f \circ \psi_2^{-1}(y)\bigr)\Big|_{y=\psi_2(\phi(p))}\ \sum_{i=1}^{n_1} v_1^i\,\partial_{x^i}\bigl(\psi_2^j \circ \phi \circ \psi_1^{-1}(x)\bigr)\Big|_{x=\psi_1(p)} \\
 &= \partial_{\phi(p),v_2,\psi_2}(f) \\
 &= D_{(d\phi)_p(V)}(f),
\end{aligned}$$

where $v_2$ is as in Definition 31.3.1. Since this holds for all $f \in C^1(M_2)$, the result follows.

31.3.21 Remark: Definition 31.3.18 maps the “ubiquitous zero vector” of $M_1$ to the corresponding vector in $M_2$. The tangent operator $L \in \mathring{T}_p(M_1)$ such that $L : f \mapsto 0$ for all $f \in C^1(M_1)$ is the same map independent of $p \in M_1$, which justifies the name “ubiquitous zero vector”. Luckily, this vector is mapped to the corresponding zero vector for $M_2$, no matter which point $p$ it is attached to. Therefore the union of the differential maps $(d\phi)_p$ is a well-defined function $\phi_* = \bigcup_{p\in M_1} (d\phi)_p$ from $\mathring{T}(M_1)$ to $\mathring{T}(M_2)$. This is the induced map given in Definition 31.3.23.

31.3.22 Definition: The differential (for tangent operators) of a map $\phi \in C^1(M_1, M_2)$ for $C^1$ manifolds $M_1$ and $M_2$ is the map $d\phi : M_1 \to \mathring{T}(M_1, M_2)$ defined by

$$\forall p \in M_1,\ \forall L \in \mathring{T}_p(M_1),\ \forall f \in C^1(M_2),\qquad \bigl((d\phi)(p)(L)\bigr)(f) = L(f \circ \phi).$$

31.3.23 Definition: The induced map (for tangent operators) of a map $\phi \in C^1(M_1, M_2)$ for $C^1$ manifolds $M_1$ and $M_2$ is the map $\phi_* : \mathring{T}(M_1) \to \mathring{T}(M_2)$ defined by $\phi_* = \bigcup_{p\in M_1} (d\phi)_p$. That is,

$$\forall L \in \mathring{T}(M_1),\ \forall f \in C^1(M_2),\qquad \bigl(\phi_*(L)\bigr)(f) = L(f \circ \phi).$$
31.3.24 Definition: The induced map (for tagged tangent operators) of a map $\phi \in C^1(M_1, M_2)$ for $C^1$ manifolds $M_1$ and $M_2$ is the map $\phi_* : \hat{T}(M_1) \to \hat{T}(M_2)$ defined by

$$\forall (p, L) \in \hat{T}(M_1),\qquad \phi_*(p, L) = (\phi(p), (d\phi)_p(L)),$$

where $(d\phi)_p$ is the operator version of the differential given in Definition 31.3.18.
31.3.25 Remark: In the study of partial differential equations, a typical equation would be $a^{ij}(x)u_{ij}(x) + b^i(x)u_i(x) + c(x)u(x) = f(x)$ for $u \in C^2(\Omega)$, $c, f \in C^0(\Omega)$, $b \in C^0(\Omega, \mathbb{R}^n)$ and $a \in C^0(\Omega, \mathrm{Sym}(n, \mathbb{R}))$ for some open subset $\Omega$ of $\mathbb{R}^n$. In a differential geometry context, such equations must be given a chart-independent meaning. A term such as $b^i(x)u_i(x)$ may be replaced by $\partial_{p,b(p),\psi}(u)$ for functions $u \in C^2(M)$ and $b : M \to \mathbb{R}^n$. It is of interest to know how such expressions are transformed under differentiable maps between manifolds. Theorem 31.3.26 shows the correspondence between first order partial differential equations in diffeomorphic open subsets of manifolds. The close relation between partial differential equations and tangent vectors reveals the true analytical nature of differential geometry.

[ Maybe should do a version of Theorem 31.3.26 for second-order operators in Section 32.2. Also give the transformations for chart transitions. Should check Theorem 31.3.26. See Section 19.5. ]

31.3.26 Theorem: Let $M_1, M_2$ be $C^1$ $n$-dimensional manifolds and let $\phi : \Omega_1 \to \Omega_2$ be a diffeomorphism between open sets $\Omega_1 \subseteq M_1$ and $\Omega_2 \subseteq M_2$. Let $b \in \mathbb{R}^n$ and $u \in C^1(\Omega_1, \mathbb{R})$ and $c, f \in C^0(\Omega_1, \mathbb{R})$. If the equation $\partial_{p,b,\psi_1}(u) + c(p)u(p) = f(p)$ is satisfied for some $p \in \Omega_1$ and $\psi_1 \in \mathrm{atlas}_p(M_1)$, then the equation

$$\partial_{q,\tilde b,\psi_2}(\tilde u) + \tilde c(q)\tilde u(q) = \tilde f(q)$$

is satisfied for any $\psi_2 \in \mathrm{atlas}_q(M_2)$, where $q = \phi(p)$, $\tilde c = c \circ \phi^{-1}$, $\tilde f = f \circ \phi^{-1}$, $\tilde u = u \circ \phi^{-1}$, and

$$\tilde b^i = b^j\,\partial_j(\psi_2 \circ \phi \circ \psi_1^{-1})^i\big|_{\psi_1(p)}$$

for $i = 1 \ldots n$.

Proof: All terms in the two equations are equal. The equality of the first-order terms is a simple consequence of the definition of the differential of the map $\phi$ for operators.

31.4. The differential of a curve

The differential of a curve is the same thing as the tangent vector or velocity vector of the curve. Differentials of curves are contravariant vectors because the range of a curve is the manifold itself. This may be contrasted with differentials of real-valued functions, which are covector fields because the manifold is the domain of the map. The differential of a map between two manifolds is covariant with respect to the map’s domain and contravariant with respect to the map’s range.

Differentiability of curves in a differentiable manifold is defined in Section 27.6. Differentiable vector fields along curves are defined in Section 29.8.

31.4.1 Definition: The tangent vector field of a $C^1$ open curve $\gamma : I \to M$ in a $C^1$ manifold $M \mathrel{-\!\!<} (M, A_M)$ is the map $\gamma' : I \to T(M)$ defined by

$$\forall t \in I,\qquad \gamma'(t) = t_{\gamma(t),\ \left(\frac{d}{du}\psi^i(\gamma(u))\big|_{u=t}\right)_{i=1}^n,\ \psi} = t_{\gamma(t),\,\partial_t(\psi\circ\gamma(t)),\,\psi}$$

for any chart $\psi \in A_M$ such that $\gamma(t) \in \mathrm{Dom}(\psi)$.
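Definition 31.4.1 can be made concrete with a familiar curve. This is a minimal sketch, assuming $M = \mathbb{R}^2$ with the identity chart $\psi$; the circle parametrization is an illustrative choice which does not appear in the text.

```latex
% Illustrative assumptions: M = \mathbb{R}^2 with identity chart \psi,
% I = \mathbb{R}, \gamma(t) = (\cos t, \sin t).
\begin{align*}
\partial_t(\psi\circ\gamma(t)) &= (-\sin t,\ \cos t), \\
\gamma'(t) &= t_{(\cos t,\,\sin t),\ (-\sin t,\,\cos t),\ \psi}.
\end{align*}
% For example, \gamma'(0) = \gamma'(2\pi) = t_{(1,0),\,(0,1),\,\psi}:
% the same tangent vector arises at two different parameter values.
```

Because $\gamma'$ is a function of the parameter $t$ rather than of the point $\gamma(t)$, it remains well-defined even where the curve revisits a point.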
31.4.2 Remark: The fact that γ ′ (t) in Definition 31.4.1 is a well-defined tangent vector is easily verified in relation to Definition 28.3.3. Let ψ1 , ψ2 ∈ atlasγ(t) (M ). Then ∂t (ψ2 ◦ γ(t))
i
= ∂t (ψ2i ◦ ψ1−1 ◦ ψ1 ◦ γ(t)) n X j = ∂xj (ψ2 ◦ ψ1−1 (x)) ∂t (ψ1 ◦ γ(t)) , j=1
31.4.3 Remark: The tangent vector γ ′ (t) in Definition 31.4.1 is really only a substitute for a limit of the form limh→0 (γ(t + h) − γ(t))/h. If M has no linear space (or affine space) structure, then the sum and product in the expression (γ(t + h) − γ(t))/h simply do not make sense. This is precisely why tangent vectors for manifolds were invented. They provide a substitute for a real differential of a curve by first overlaying a coordinate chart and then differentiating the coordinates instead of the curve itself. Then to get rid of the arbitrariness of this procedure, equivalence classes of these derivatives are used. The situation becomes more interesting when the manifold M does have a linear space (or affine space) structure. In this case, the expression (γ(t + h) − γ(t))/h is well-defined, and if the limit exists, it would be interesting to compare the result with the corresponding tangent vector to the manifold. (It is assumed here that all finite-dimensional linear spaces are given the standard topology which makes them homeomorphic under a linear map to a space IRn , which is topologically complete.) The results should be equivalent if the coordinate charts match up in a differentiable manner with the linear space structure. Suppose that M is a linear space with the standard topology, and that ψ ∈ atlas(M ) is C 1 with respect to the linear space structure of M in the sense that for all p, v ∈ M , the derivative bp,v = ∂t (ψ(p + tv)) t=0 ∈ IRn is well-defined and continuous with respect to v. Then define Vp,v = tp,bp,v ,ψ ∈ Tp (M ). The map L : M → Tp (M ) defined by L : v 7→ Vp,v is a linear isomorphism. Therefore the inverse linear map ̟p = L−1 : Tp (M ) → M is well-defined. The map ̟p might be referred to as the “drop” of Tp (M ) onto M (analogous to “lift” functions). A special case of this kind of drop-function is the canonical identification of Tp (IRn ) with IRn for all p ∈ IRn . 
Such drop-functions arise in the calculation of covariant derivatives with respect to affine connections in Section 37.4. (See Definition 28.11.6 for drop functions.) [ The following paragraph needs to be sorted out a bit. ] Curves and families of curves are maps from spaces IRm to manifolds. Since the domains of such maps may be regarded as either a manifold or a linear space, there are two possible definitions for a differential of the map. For a curve, the value of γ ′ (t) is in a tangent space Tγ(t) (M ) in the manifold case, and in IRm in the linear space case. A similar issue arises when the range of the map is IRm . The differential of a real-valued function may be regarded as valued in some dual space Tp∗ (M ) or else a linear map between Tp (M ) and a tangent space of IR. 31.4.4 Remark: It is interesting to compare Definition 31.4.1 with the corresponding definitions for the differential of a differential map in Section 31.3. The set IR may be regarded as a differentiable manifold (IR, AIR ) with AIR = {ψ0 }, where ψ0 = idIR is the identity chart on IR. Then the differential dγ : IR → T (IR, M ) and induced map γ∗ : T (IR) → T (M ) are defined as follows. ∀t ∈ IR, ∀α ∈ IR, (dγ)t tt,α,ψ0 = αγ ′ (t) ∀t ∈ IR, ∀α ∈ IR, γ∗ tt,α,ψ0 = αγ ′ (t), where αγ ′ (t) = tγ(t),α∂t (ψ◦γ(t)),ψ for ψ ∈ AM . Thus (dγ)t = γ∗ Tt (IR) ∈ Hom(Tt (IR), Tγ(t) (M )) for t ∈ IR. Conversely, the value γ ′ (t) can be expressed in terms of dγ and γ∗ as ∀t ∈ IR, γ ′ (t) = (dγ)(t) tt,1,ψ0 = γ∗ tt,1,ψ0 .
It is clear that γ ′ contains all of the information in the maps dγ and γ∗ . Since IR has such an obvious choice of chart and the pointwise linear spaces of IR are 1-dimensional, it seems quite unnecessary to give the full differentials. It is entirely sensible to define γ ′ (t) as γ∗ tt,1,ψ0 , since the number 1 and chart ψ0 are implicit. [ www.topology.org/tex/conc/dg.html ]
which verifies equation (28.3.1). Since the tangent vector field is tagged by the curve parameter $t \in I$, there are no ambiguities at self-intersections of the curve. However, $\gamma'(t)$ is customarily said to be “the tangent vector at $\gamma(t)$”, which is ambiguous if the curve parameter is discarded.
31.4.5 Notation: $d\gamma$ denotes the tangent vector field $\gamma'$ in Definition 31.4.1.

31.4.6 Remark: In contradiction to Remark 31.4.4, Notation 31.4.5 defines the differential $d\gamma$ of a $C^1$ curve to be the same as the tangent vector field of the curve. This is based on the identification of the tangent space at a point of a real interval with the set $\mathbb{R}$.

31.4.7 Definition: The tangent operator of a $C^1$ open curve $\gamma : I \to M$ in a $C^1$ manifold $M$ for a parameter $t \in I$ is the tangent operator $D_{\gamma'(t)} \in \mathring{T}_{\gamma(t)}(M)$, where $\gamma'(t)$ is the tangent vector in Definition 31.4.1.

31.4.8 Theorem: If $r \ge 1$ and $\gamma$ is a $C^r$ non-self-intersecting curve, then the tangent vector field $\gamma'$ is a $C^{r-1}$ vector field on the image of the restriction of $\gamma$ to the interior of its domain.

[ Theorem 31.4.8 needs to be tidied up by extending the tangent vector field to the whole of the domain of $\gamma$. ]

31.4.9 Definition: A one-parameter family of curves induced by a vector field on a curve is ...

[ Cover definition, existence and uniqueness of integral curves of vector field $X$ with $\gamma'(t) = X(\gamma(t))$. ]
31.4.10 Definition: A partial tangent vector field of a $C^1$ family of curves $\gamma : \mathbb{R}^m \to M$ with $m \in \mathbb{Z}^+$ in a $C^1$ manifold $M$ is a vector field $\partial_k\gamma : \mathbb{R}^m \to T(M)$ defined for $k = 1, \ldots, m$, $t \in \mathbb{R}^m$ and $\psi \in \mathrm{atlas}_{\gamma(t)}(M)$ by $\partial_k\gamma(t) = t_{\gamma(t),\,\partial_{t^k}(\psi\circ\gamma(t)),\,\psi}$.

[ Should have a diagram here showing transversal vector fields. ]

31.4.11 Remark: The differential $d\gamma$ for a $C^1$ family of curves $\gamma : \mathbb{R}^m \to M$ is not quite the same thing as the sequence $(\partial_k\gamma)_{k=1}^m$, but it does contain essentially the same information.

31.4.12 Remark: The differentials in Definition 31.4.10 may be thought of as “transversal vector fields”. For example, when $m = 2$, the vector $\gamma_2(u^1, u^2)$ is transverse to the curve $t^1 \mapsto \gamma(t^1, u^2)$ at the point $\gamma(u^1, u^2)$ for $u = (u^1, u^2) \in \mathbb{R}^2$.
31.5. One-parameter transformation families and vector fields

[ Probably the one-parameter and multi-parameter groups of diffeomorphisms should be defined near Section 27.6 on differentiable curves. Then the generated vector fields should be presented in this section. ]

Vector fields generated by families of diffeomorphisms are important because these are exactly what is generated on fibre spaces by a connection for motion along a curve in a base space. Parallelism is always defined as a bijection between fibre sets at points of a base space, but in the case of a connection (differential parallelism), the fibre sets are differentiable manifolds, and the pathwise parallelism for a given curve in the base space generates a one-parameter family of diffeomorphisms of the fibre space (through the charts). Generally diffeomorphisms of differentiable manifolds are not of great importance or interest, but in the case of fibre spaces, they are the essence of parallelism and connections. The Poisson bracket of the vector fields generated by base space curves corresponds to curvature. The curvature of a connection may be defined in general, even if the structure group is not a finite-dimensional manifold, as the Poisson bracket of the fields generated by infinitesimal motions in the base space.

[ Should present here especially the vector fields generated by families of diffeomorphisms on fibre spaces. This is also covered in the differentiable groups chapter, but the group doesn’t have to be a manifold. ]

[ See Crampin/Pirani [11], page 251, for one-parameter transformation groups and vector fields. ]

In this section, the Poisson bracket is given an alternative definition via local one-parameter groups of transformations. A canonical example of this sort of correspondence between vector fields and one-parameter transformation groups is the Taylor series expansion $f(x+\varepsilon) = \exp(\varepsilon(\partial/\partial x))(f)(x)$ for analytic functions $f$.
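As a hedged illustration of the Taylor-series correspondence, the following worked expansion assumes that $f$ is analytic and that $\exp(\varepsilon\,\partial/\partial x)$ is read as the formal power series in the operator $\partial/\partial x$ (an interpretation which has not yet been defined carefully in the text). The test function $f(x) = x^2$ is chosen here only for illustration.

```latex
\begin{align*}
\exp\Bigl(\varepsilon\frac{\partial}{\partial x}\Bigr)(f)(x)
  &= \sum_{k=0}^\infty \frac{\varepsilon^k}{k!}\,
     \frac{\partial^k f}{\partial x^k}(x)
   = f(x+\varepsilon), \\
\text{for } f(x) = x^2:\quad
\exp\Bigl(\varepsilon\frac{\partial}{\partial x}\Bigr)(f)(x)
  &= x^2 + \varepsilon\cdot 2x + \frac{\varepsilon^2}{2}\cdot 2
   = (x+\varepsilon)^2 .
\end{align*}
```

In this reading, the one-parameter group is the family of translations $\phi(\varepsilon) : x \mapsto x + \varepsilon$, and the generating vector field is $\partial/\partial x$.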
[ The symbol $\partial/\partial x$ is a sort of slang term, which should be defined carefully in general somewhere. Should also carefully define $\exp(\varepsilon(\partial/\partial x))$. ]

31.5.1 Remark: It follows (hopefully) from Definition 27.7.2 (i) that $\phi_t$ is a diffeomorphism from $\Omega$ to $\phi_t(\Omega)$.
[ Present here the almost-everywhere differentials of rectifiable curves and families of curves. ]
[ The following definition is actually only a motivational definition for the material in the following comment. ]

31.5.2 Definition: The vector field generated by a $C^1$ one-parameter group $\phi$ of transformations of a $C^1$ manifold $M$ is the vector field $X_\phi \in X^0(M)$ defined by
$$\forall f \in C^1(M),\ \forall p \in M,\qquad X_\phi(f)(p) = \lim_{t\to0} t^{-1}\bigl(f(\phi(t)(p)) - f(p)\bigr).$$

[ This may be equivalent to the image under the differential of $\phi$ of the vector field $\partial/\partial t$ on $\mathbb{R}\times M$. Of course, this operator should be defined somewhere. Then $d\phi : T(\mathbb{R}\times M) \to T(M)$ maps $(\partial/\partial t)_{(t,p)}$ for $(t,p) \in \mathbb{R}\times M$ to $(d\phi)(\partial/\partial t)_{(t,p)} \in T_{\phi(t,p)}(M)$.
$$(d\phi)\frac{\partial}{\partial t} : f \mapsto \lim_{t\to0} t^{-1}\bigl(f(\phi(t)(p)) - f(p)\bigr).$$
That is, $X(\phi)(f)(p) = (d\phi)\frac{\partial}{\partial t}(f)(p)$. That is, $X(\phi) = (d\phi)\frac{\partial}{\partial t}$. That is, the vector field $\partial/\partial t$ on $\mathbb{R}\times M$ is mapped to $X(\phi)$ on $M$ by $d\phi$. In reverse, $\phi$ is generated by $X(\phi)$. For each $X \in T(M)$, want existence and uniqueness of $\phi : \mathbb{R}\times M \to M$ such that $(d\phi)(\partial/\partial t) = X$. Notation $\exp(tX) = \phi(t)$. The vector field generated by a one-parameter group $(\phi_t)_{t\in\mathbb{R}}$ is in a sense the derivative of the group at $t = 0$. ]

[ Should the expression $X(\phi)$ in the above comment be replaced by $X_\phi$? ]
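A minimal worked example of Definition 31.5.2 may help fix ideas. It assumes that $M = \mathbb{R}^2$ with the identity chart and that $\phi$ is the group of rotations about the origin; both choices are made here for illustration and are not taken from the text.

```latex
\begin{align*}
\phi(t)(x,y) &= (x\cos t - y\sin t,\; x\sin t + y\cos t), \\
X_\phi(f)(x,y)
  &= \lim_{t\to0} t^{-1}\bigl(f(\phi(t)(x,y)) - f(x,y)\bigr)
   = \partial_t f(\phi(t)(x,y))\Big|_{t=0} \\
  &= -y\,\frac{\partial f}{\partial x}(x,y)
     + x\,\frac{\partial f}{\partial y}(x,y).
\end{align*}
```

So in this case $X_\phi = -y\,\partial/\partial x + x\,\partial/\partial y$, which vanishes only at the fixed point of the rotations.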
[ Define a one-parameter group of transformations of a manifold generated by $X$, and a local one-parameter group of local transformations around $p$ generated by $X$. ]

[ Should mention the EDM claim that the local group generated by a vector field always exists. ]

[ Will define $d$ so that $((d\phi_t)Y)(p) = (d\phi_t)_{\phi_t^{-1}(p)}\,Y(\phi_t^{-1}(p))$ for an extension of $d$ from $C^\infty(M)$ to $X^\infty(M)$. ]
[ Must define vector fields for multi-parameter groups of transformations. This has some relevance to curvature and connections and fibre spaces. ]
31.5.3 Theorem: The function $X_\phi$ in Definition 31.5.2 is a vector field for all one-parameter groups $\phi$ of transformations of a $C^1$ manifold $M$.
Chapter 32 Higher-order differentials
32.1 Higher-order differentials of a real-valued function
32.2 Higher-order differentials of a differentiable map
32.3 Higher-order differentials of a curve
32.4 Higher-order differentials of curve families
32.5 Differentials of real-valued functions for higher-order operators
32.6 Hessian operators at critical points
32.7 Differentials of differentiable maps for higher-order operators
32.8 Differentials of curves for higher-order operators
This chapter presents two different topics: higher-order differentials and differentials for higher-order operators. These are distinct topics. For example, a second-order differential is the differential of a differential, whereas the differential for a second-order operator is the “push-forth” of the operator. These topics are presented together to (hopefully) help to clarify the difference between them.
32.1. Higher-order differentials of a real-valued function

This section is about differentials of differentials $(d^2f)_p$ of real-valued functions $f$, and higher-order versions of this. By Definition 31.2.11, $df \in X(T^*(M))$, and by Theorem 31.2.13, $df \in X^1(T^*(M))$ if $f \in C^2(M)$. Thus $df$ is a $C^1$ map from $M$ to $T^*(M)$, which both have a $C^1$ manifold structure if $M$ is a $C^2$ manifold. Therefore $df : M \to T^*(M)$ is a $C^1$ map which can be differentiated. Its differential must be a map of the form $d^2f : T(M) \to T(T^*(M))$ such that $(d^2f)_p : T_p(M) \to T_{(df)_p}(T^*(M))$.

32.2. Higher-order differentials of a differentiable map

[ See Malliavin [35], proposition I.7.3 for higher order derivatives. ]

There are at least two classes of differential maps which may be described as second-order differentials between manifolds $M_1$ and $M_2$. One is the differential of a differential, which is a map $d^2\phi$ from $T(T(M_1))$ to $T(T(M_2))$; the other is a differential map $d^{[2]}\phi$ from $T^{[2]}(M_1)$ to $T^{[2]}(M_2)$, where $T^{[2]}(M)$ denotes the space of second-order partial differential operators on a $C^2$ manifold $M$. Differentials $d^{[k]}\phi$ for higher-order operators are described in Section 32.5. This section is about differentials of differentials $d^2\phi$ etc.

[ $d^2\phi$ is really $\phi_{**}$. ]

The second-order differential of a diffeomorphism is of interest because an affine connection on a manifold is a special case of this. Parallelism on a manifold is then generated by integrating such a differential.

[ This gives some sort of motivation for connections? ]

[ $\phi$ maps $M_1$ to $M_2$. Then $\phi_* : T(M_1) \to T(M_2)$. Must calculate coefficients of $\phi_{**}$? ]
32.3. Higher-order differentials of a curve

Just as the first-order differential of a curve (Section 31.4) represents the velocity of a trajectory (if the parameter is interpreted as time), so the second-order differential represents the acceleration. Usually the second and higher order differentials are presented in the context of an affine connection, but in this section, the differentials are more abstract. In essence, higher order differentials may be specified in terms of their effect on real-valued functions, just as first-order differentials of curves are. This guarantees chart-independence. In the context of physics, chart-independence may be interpreted as observer-independence. The facts should be objective. That is, facts about objects should be independent of the coordinate charts used to describe them.

This section defines connection-free higher-order differentials of curves and verifies that they are chart-independent.

[ The differential in Definition 32.3.1 is really $\gamma_{**}$ with some sort of simplification of $T(T(\mathbb{R}))$? ]

32.3.1 Definition: The second-order tangent vector field on a $C^2$ curve $\gamma : \mathbb{R} \to M$ in a $C^2$ manifold $M$ with $n = \dim(M)$ is the map $\gamma'' : \mathbb{R} \to T(T(M))$ defined for $\psi \in \mathrm{atlas}_{\gamma(t)}(M)$ by
$$\forall t \in \mathbb{R},\qquad \gamma''(t) = t_{\gamma'(t),\,w_\psi(t),\,\tilde\psi},$$
where $w_\psi : \mathbb{R} \to \mathbb{R}^{2n}$ is defined by
$$\forall t \in \mathbb{R},\qquad w_\psi^i(t) = \begin{cases} \partial_t(\psi^i\circ\gamma(t)) & \text{for } i = 1, \ldots, n \\ \partial_t^2(\psi^{i-n}\circ\gamma(t)) & \text{for } i = n+1, \ldots, 2n \end{cases}$$
and $\gamma'$ is the first-order tangent vector field in Definition 31.4.1.

32.3.2 Remark: The expression for $\gamma''(t)$ in Definition 32.3.1 may be written out more fully as follows:
$$\forall t \in \mathbb{R},\qquad \gamma''(t) = t_{\,t_{\gamma(t),\,\partial_t(\psi\circ\gamma(t)),\,\psi},\;(\partial_t(\psi\circ\gamma(t)),\,\partial_t^2(\psi\circ\gamma(t))),\;\tilde\psi},$$
where $(\partial_t(\psi\circ\gamma(t)), \partial_t^2(\psi\circ\gamma(t))) \in \mathbb{R}^{2n}$ represents the concatenation of the vectors $\partial_t(\psi\circ\gamma(t)) \in \mathbb{R}^n$ and $\partial_t^2(\psi\circ\gamma(t)) \in \mathbb{R}^n$. Definition 32.3.1 is constructed from the double application of Definition 31.4.1. When the coordinates $(\psi(\gamma(t)), \partial_t(\psi\circ\gamma(t))) \in \mathbb{R}^{2n}$ of $\gamma'(t)$ in the total tangent space $T(M)$ are differentiated with respect to $t$, the result is simply $\partial_t\bigl(\psi(\gamma(t)), \partial_t(\psi\circ\gamma(t))\bigr) = \bigl(\partial_t(\psi\circ\gamma(t)), \partial_t^2(\psi\circ\gamma(t))\bigr)$.

It is perhaps interesting to note that the standard coordinates in $\mathbb{R}^{4n}$ for $\gamma''(t)$ with respect to the chart $\psi$ may be written as the quadruple concatenation $\bigl(\psi(\gamma(t)), \partial_t(\psi\circ\gamma(t)), \partial_t(\psi\circ\gamma(t)), \partial_t^2(\psi\circ\gamma(t))\bigr)$. The first derivatives appear twice in this coordinate vector.

32.3.3 Theorem: The vector field $\gamma''$ in Definition 32.3.1 is chart-independent. In other words, for all $t \in \mathbb{R}$ and $\psi_1, \psi_2 \in \mathrm{atlas}_{\gamma(t)}(M)$, $t_{\gamma'(t),\,w_{\psi_1}(t),\,\tilde\psi_1} = t_{\gamma'(t),\,w_{\psi_2}(t),\,\tilde\psi_2}$, where $\tilde\psi_\alpha$ denotes the chart for $T(M)$ corresponding to the chart $\psi_\alpha$ for $M$ for $\alpha = 1, 2$.

Proof: By Theorem 28.10.9, $t_{\gamma'(t),\,w_{\psi_1}(t),\,\tilde\psi_1} = t_{\gamma'(t),\,w_{\psi_2}(t),\,\tilde\psi_2}$ if and only if equations (28.10.1) and (28.10.2) are satisfied. Thus it must be shown that for all $i = 1, \ldots, n$,
$$w^i_{\psi_2}(t) = \sum_{j=1}^n \partial_{x^j}(\psi_2^i\circ\psi_1^{-1}(x))\Big|_{x=\psi_1(\gamma(t))}\, w^j_{\psi_1}(t)$$
and
$$w^{n+i}_{\psi_2}(t) = \sum_{j,k=1}^n \partial_{x^j}\partial_{x^k}(\psi_2^i\circ\psi_1^{-1}(x))\Big|_{x=\psi_1(\gamma(t))}\, v^k_{\psi_1}(t)\, w^j_{\psi_1}(t) + \sum_{j=1}^n \partial_{x^j}(\psi_2^i\circ\psi_1^{-1}(x))\Big|_{x=\psi_1(\gamma(t))}\, w^{n+j}_{\psi_1}(t),$$
where $v_{\psi_1}(t) \in \mathbb{R}^n$ is the component vector for $\gamma'(t)$ defined by $\gamma'(t) = t_{\gamma(t),\,v_{\psi_1}(t),\,\psi_1}$. The first equation follows as for the first-order tangent field from the calculation
$$\begin{aligned}
w^i_{\psi_2}(t) &= \partial_t(\psi_2^i\circ\gamma(t)) \\
&= \partial_t(\psi_2^i\circ\psi_1^{-1}\circ\psi_1\circ\gamma(t)) \\
&= \sum_{j=1}^n \partial_{x^j}(\psi_2^i\circ\psi_1^{-1}(x))\Big|_{x=\psi_1(\gamma(t))}\, \partial_t(\psi_1^j\circ\gamma(t)) \\
&= \sum_{j=1}^n \partial_{x^j}(\psi_2^i\circ\psi_1^{-1}(x))\Big|_{x=\psi_1(\gamma(t))}\, w^j_{\psi_1}(t).
\end{aligned}$$
The second equation follows similarly from the calculation
$$\begin{aligned}
w^{n+i}_{\psi_2}(t) &= \partial_t^2(\psi_2^i\circ\gamma(t)) \\
&= \partial_t^2(\psi_2^i\circ\psi_1^{-1}\circ\psi_1\circ\gamma(t)) \\
&= \sum_{j,k=1}^n \partial_{x^j}\partial_{x^k}(\psi_2^i\circ\psi_1^{-1})\Big|_{x=\psi_1(\gamma(t))}\, \partial_t(\psi_1^k\circ\gamma(t))\,\partial_t(\psi_1^j\circ\gamma(t)) + \sum_{j=1}^n \partial_{x^j}(\psi_2^i\circ\psi_1^{-1})\Big|_{x=\psi_1(\gamma(t))}\, \partial_t^2(\psi_1^j\circ\gamma(t)) \\
&= \sum_{j,k=1}^n \partial_{x^j}\partial_{x^k}(\psi_2^i\circ\psi_1^{-1})\Big|_{x=\psi_1(\gamma(t))}\, w^j_{\psi_1}(t)\,w^k_{\psi_1}(t) + \sum_{j=1}^n \partial_{x^j}(\psi_2^i\circ\psi_1^{-1})\Big|_{x=\psi_1(\gamma(t))}\, w^{n+j}_{\psi_1}(t).
\end{aligned}$$
This is the correct answer because $w^k_{\psi_1}(t) = v^k_{\psi_1}(t)$ for $k = 1, \ldots, n$.

32.3.4 Remark: The fact that $w^k_{\psi_1}(t) = v^k_{\psi_1}(t)$ for $k = 1, \ldots, n$ in the proof of Theorem 32.3.3 means that the horizontal component of $\gamma''$ is the same as $\gamma'$. Thus $\pi_*(\gamma''(t)) = \gamma'(t)$, where $\pi : T(M) \to M$ is the projection map for $T(M)$. In other words, the vector $\gamma''(t) \in T_{\gamma'(t)}(T(M))$ carries inside it a copy of the vector $\gamma'(t) \in T_{\gamma(t)}(M)$. This is because the vector $\gamma'(t) = t_{p,v,\psi}$ contains a copy of the base point $p$. Variation of $p$ is regarded as horizontal whereas variation of $v$ is regarded as vertical.

Figure 32.3.1 illustrates the four parts of the second-order tangent field of a curve. The first two parts are the point $\gamma(t)$ and the component vector $v \in \mathbb{R}^n$ which are combined as $\gamma'(t) = t_{\gamma(t),v,\psi}$. The third part is the sequence of $n$ horizontal components $w^j$ of the vector $\gamma''(t) \in T_{\gamma'(t)}(T(M))$. The fourth part is the sequence of $n$ vertical components $w^{n+j}$ of $\gamma''(t)$, which indicate the rate of change of the components $v$ with respect to $t$. In the limit, the third part is the same as the second part. Only the fourth part is really a second differential. All the rest is just book-keeping.

[Figure 32.3.1: Components of second-order tangent field of a curve. Labelled parts: 1, the point $\gamma(t)$; 2, the components $v^k$; 3, the horizontal components $w^j$; 4, the vertical components $w^{n+j}$.]

32.3.5 Remark: The dotted arrow in Figure 32.3.1 represents a kind of parallel-translated copy of the vector $v^k$. However, it is important to remember that this translated vector depends on the choice of coordinates. The dotted arrow is constructed by copying the coordinates $v^k$ from $\gamma(t)$ to a nearby point on the curve. But the transition rules are different at different points of the manifold. So these copied coordinates do not transform correctly at any point except $\gamma(t)$. The purpose of an affine connection is to construct a true vector at each point of the curve which transforms correctly at each point and which may be thought of as the parallel translate of $\gamma'(t)$ to each point.
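The chart-dependence of the bare second-derivative coordinates can be checked on a simple case. The following sketch assumes that $M = \mathbb{R}^2 \setminus \{0\}$, that $\psi_1$ is the Cartesian chart and $\psi_2 = (r, \theta)$ is the polar chart near the point $(1,0)$, and that $\gamma$ is the straight line $\gamma(t) = (1, t)$ in Cartesian coordinates. These choices are illustrative assumptions, not part of the text.

```latex
\begin{align*}
\psi_1\circ\gamma(t) &= (1,\,t), &
\partial_t(\psi_1\circ\gamma)(0) &= (0,1), &
\partial_t^2(\psi_1\circ\gamma)(0) &= (0,0), \\
\psi_2\circ\gamma(t) &= \bigl((1+t^2)^{1/2},\,\arctan t\bigr), &
\partial_t(\psi_2\circ\gamma)(0) &= (0,1), &
\partial_t^2(\psi_2\circ\gamma)(0) &= (1,0).
\end{align*}
```

The second-derivative components vanish in one chart but not in the other, so on their own they cannot transform like a tangent vector. The difference is accounted for by the quadratic term in the second transformation equation in the proof of Theorem 32.3.3. With $r = ((x^1)^2 + (x^2)^2)^{1/2}$, one has $\partial_{x^2}\partial_{x^2} r = 1$ at $(1,0)$, and multiplying by $w^2_{\psi_1} w^2_{\psi_1} = 1$ gives the radial component 1.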
32.3.6 Remark: If $\gamma'(t) = 0$, then $\gamma''(t)$ is a vertical vector. This implies that the vertical part of $\gamma''(t)$ transforms in the same way as vectors in $T_p(M)$ under changes of chart, where $p = \gamma(t)$. If the “drop” function $\varpi_{\gamma'(t)}$ in Definition 28.11.6 is applied to $\gamma''(t) \in \ker((d\pi)_{\gamma'(t)})$, the result is a well-defined vector in $T_p(M)$. This means that even in the absence of a connection, the acceleration of a trajectory is chart-independent if the velocity is zero, and the acceleration may then be interpreted as a true tangent vector to the manifold. One way of thinking about this is to observe that the error introduced by chart variation (as in Remark 32.3.5) into the limiting process when differentiating $\gamma'(t)$ with respect to $t$ converges to zero faster than $t$ converges.
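The claim in Remark 32.3.6 can be read off from the second transformation equation in the proof of Theorem 32.3.3. A short derivation follows, together with an illustrative check which assumes the Cartesian chart $\psi_1$ and polar chart $\psi_2 = (r, \theta)$ on $\mathbb{R}^2 \setminus \{0\}$ and the curve $\gamma(t) = (2, t^2)$ in Cartesian coordinates. These choices are made here for illustration and are not in the text.

```latex
\begin{align*}
\gamma'(t) = 0
  &\;\Longrightarrow\; w^j_{\psi_1}(t) = 0 \quad\text{for } j = 1,\ldots,n \\
  &\;\Longrightarrow\; w^{n+i}_{\psi_2}(t)
   = \sum_{j=1}^n \partial_{x^j}(\psi_2^i\circ\psi_1^{-1}(x))
     \Big|_{x=\psi_1(\gamma(t))}\, w^{n+j}_{\psi_1}(t).
\end{align*}
% Check at t = 0 for gamma(t) = (2, t^2), where gamma'(0) = 0:
\begin{align*}
\partial_t^2(\psi_1\circ\gamma)(0) &= (0,2), \\
\psi_2\circ\gamma(t) &= \bigl((4+t^4)^{1/2},\,\arctan(t^2/2)\bigr), \qquad
\partial_t^2(\psi_2\circ\gamma)(0) = (0,1), \\
\begin{pmatrix}
  \partial_{x^1} r & \partial_{x^2} r \\
  \partial_{x^1}\theta & \partial_{x^2}\theta
\end{pmatrix}_{(2,0)}
\begin{pmatrix} 0 \\ 2 \end{pmatrix}
  &= \begin{pmatrix} 1 & 0 \\ 0 & 1/2 \end{pmatrix}
     \begin{pmatrix} 0 \\ 2 \end{pmatrix}
   = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.
\end{align*}
```

In this instance the vertical components transform by the first-order Jacobian alone, which is the same rule as for vectors in $T_{\gamma(t)}(M)$.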
32.3.7 Notation: $d^k\gamma$ for $k \in \mathbb{Z}^+$ for a $C^k$ open curve $\gamma : I \to M$ in a $C^k$ manifold $M$ recursively denotes the differential $d(d^{k-1}\gamma)$, where $d^0\gamma = \gamma$.

32.3.8 Remark: Notation 32.3.7 implies that $d^k\gamma : I \to T^{(k)}(M)$, where the higher tangent spaces $T^{(k)}(M)$ are defined in Notation 28.10.17. For example, $d^2\gamma : I \to T(T(M))$ is defined by
$$\forall t \in I,\qquad (d^2\gamma)(t) = t_{\,t_{\gamma(t),\,\partial_t(\psi\circ\gamma(t)),\,\psi},\;(\partial_t(\psi\circ\gamma(t)),\,\partial_t^2(\psi\circ\gamma(t))),\;\tilde\psi},$$
for all $\psi \in \mathrm{atlas}_{\gamma(t)}(M)$, where $\tilde\psi$ is the chart for $T(M)$ corresponding to $\psi$ as in Definition 28.8.1.
[ Similarly to Remark 31.4.4, compare $\gamma''$ with 2nd-order differentials $d^2\gamma$ or $\gamma_{**}$ for $\gamma : \mathbb{R}^2 \to M$. Relate this to $T(T(\mathbb{R}))$. ]

[ Define $D_{\gamma''(t)}$. Relate $D_{\gamma''(t)}$ to basis tangent operators $\partial_i^{\gamma(t),\psi}$ and $\partial_{ij}^{\gamma(t),\psi}$. ]

[ Show that $\gamma''$ is $C^r$ if $\gamma$ is $C^{r+2}$. ]

[ Distinguish between simple second derivatives of curves and covariant derivatives of the simple first derivative. The first must be in $T(T(M))$ whereas the second may be in $T(M)$. ]

[ Must look at the second-order tangent vector interpretation of $\gamma''(t)$. In other words, $\gamma''(t)$ may be thought of as a second-order operator acting on real functions. Yet another possibility is to regard $\gamma''(t)$ as a chart-dependent vector $\partial^\psi_{tt}(\gamma) \in T_{\gamma(t)}(M)$. This happens to be the first-order part of the second-order tangent vector! The vector $\partial^\psi_{tt}(\gamma)$ could be combined with a connection to give a chart-independent tangent vector. ]
32.4. Higher-order differentials of curve families

The second-order differential of a curve may be interpreted either as the differential of the differential in $T(T(M))$ or as a second-order operator in $T^{[2]}(M)$. The $T(T(M))$ interpretation is presented in this section.

32.4.1 Remark: When derivatives do not commute, it is important to get the order right. The notation $\gamma_{k\ell}$ means a derivative with respect to $k$ then $\ell$. Thus in Definition 32.4.2, $\gamma_{k\ell}(t)$ means $\partial_{t^\ell}(\partial_{t^k}\gamma(t))$, which is a vector with base point $\partial_{t^k}\gamma(t) = \gamma_k(t)$.

32.4.2 Definition: A partial second-order tangent vector field of a $C^2$ map $\gamma : \mathbb{R}^m \to M$ in a $C^2$ manifold $M$ with $n = \dim(M)$ and $m \in \mathbb{Z}^+$ is a map $\gamma_{k\ell} : \mathbb{R}^m \to T(T(M))$ defined for $k, \ell = 1, \ldots, m$ and $\psi \in \mathrm{atlas}_{\gamma(t)}(M)$ by
$$\forall t \in \mathbb{R}^m,\qquad \gamma_{k\ell}(t) = t_{\gamma_k(t),\,w_\psi(t),\,\tilde\psi},$$
where $w_\psi : \mathbb{R}^m \to \mathbb{R}^{2n}$ is defined by
$$\forall t \in \mathbb{R}^m,\qquad w_\psi^i(t) = \begin{cases} \partial_{t^\ell}(\psi^i\circ\gamma(t)) & \text{for } i = 1, \ldots, n \\ \partial_{t^\ell}\partial_{t^k}(\psi^{i-n}\circ\gamma(t)) & \text{for } i = n+1, \ldots, 2n \end{cases}$$
and $\gamma_k$ is the first-order tangent vector field for $\gamma$ as in Definition 31.4.10.

32.4.3 Remark: The expression for $\gamma_{k\ell}(t)$ in Definition 32.4.2 may be written out more fully as:
$$\forall t \in \mathbb{R}^m,\qquad \gamma_{k\ell}(t) = t_{\,t_{\gamma(t),\,\partial_{t^k}(\psi\circ\gamma(t)),\,\psi},\;(\partial_{t^\ell}(\psi\circ\gamma(t)),\,\partial_{t^\ell}\partial_{t^k}(\psi\circ\gamma(t))),\;\tilde\psi},$$
where $(\partial_{t^\ell}(\psi\circ\gamma(t)), \partial_{t^\ell}\partial_{t^k}(\psi\circ\gamma(t))) \in \mathbb{R}^{2n}$ represents the concatenation of the vectors $\partial_{t^\ell}(\psi\circ\gamma(t)) \in \mathbb{R}^n$ and $\partial_{t^\ell}\partial_{t^k}(\psi\circ\gamma(t)) \in \mathbb{R}^n$. Definition 32.4.2 is constructed from the double application of Definition 31.4.1. When the coordinates $(\psi(\gamma(t)), \partial_{t^k}(\psi\circ\gamma(t))) \in \mathbb{R}^{2n}$ of $\gamma_k(t)$ in the total tangent space $T(M)$ are differentiated with respect to $t^\ell$, the result is $\partial_{t^\ell}\bigl(\psi(\gamma(t)), \partial_{t^k}(\psi\circ\gamma(t))\bigr) = \bigl(\partial_{t^\ell}(\psi\circ\gamma(t)), \partial_{t^\ell}\partial_{t^k}(\psi\circ\gamma(t))\bigr)$.

[ Do a diagram for second-order differentials of curve families similar to Figure 32.3.1. ]

32.4.4 Remark: The second-order tangent vector $\gamma_{k\ell}(t)$ in Definition 32.4.2 is a chart-independent vector in $T_{\gamma_k(t)}(T(M))$. The proof of this is the same as for Theorem 32.3.3. In this case, though, the second and third parts $v^k$ and $w^k$ of the vector are not generally the same unless $k = \ell$. That is, the horizontal part of the second-order differential is not generally the same as the first-order differential. Related to this is the fact that the partial tangent vector fields do not generally satisfy $\gamma_{k\ell} = \gamma_{\ell k}$ when $k \ne \ell$. Such fields are not even comparable because they are in different tangent spaces $T_{\gamma_k(t)}(T(M))$ and $T_{\gamma_\ell(t)}(T(M))$.
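A small worked instance of Definition 32.4.2 and Remark 32.4.4 may make the non-comparability concrete. It assumes $M = \mathbb{R}$ with the identity chart $\psi$ (so $n = 1$), $m = 2$, and the family $\gamma(t^1, t^2) = t^1 t^2$; these choices are illustrative only.

```latex
\begin{align*}
\gamma_1(t) &= t_{\,t^1t^2,\;t^2,\;\psi}, &
\gamma_{12}(t) &= t_{\,\gamma_1(t),\;(t^1,\,1),\;\tilde\psi}, \\
\gamma_2(t) &= t_{\,t^1t^2,\;t^1,\;\psi}, &
\gamma_{21}(t) &= t_{\,\gamma_2(t),\;(t^2,\,1),\;\tilde\psi}.
\end{align*}
```

The base points $\gamma_1(t)$ and $\gamma_2(t)$ differ unless $t^1 = t^2$, so $\gamma_{12}(t)$ and $\gamma_{21}(t)$ lie in different tangent spaces of $T(M)$ even though the mixed second derivatives (the final component 1) agree. For $\gamma_{12}$ the second part is $v = t^2$ while the third part is $w = t^1$.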
32.5. Differentials of real-valued functions for higher-order operators

Higher-order tangent vectors and tangent operators are defined in Section 30.3. First-order differentials of real-valued functions are defined for first-order tangent vectors and operators in Section 31.2.

32.5.1 Definition: For any function $f \in C^2(M)$ and point $p$ in a $C^2$ manifold $M$, the differential of $f$ at $p$ for second-order tangent operators is the linear map $(d^{[2]}f)_p : \mathring{T}_p^{[2]}(M) \to \mathbb{R}$ defined by
$$\forall L \in \mathring{T}_p^{[2]}(M),\qquad (d^{[2]}f)_p(L) = L(f).$$
The differential of $f$ at $p$ for second-order tangent vectors is the linear map $(d^{[2]}f)_p : T_p^{[2]}(M) \to \mathbb{R}$ defined by
$$\forall W \in T_p^{[2]}(M),\qquad (d^{[2]}f)_p(W) = D_W f.$$

32.5.2 Remark: Some of the spaces and maps in Definition 32.5.1 are shown in Figure 32.5.1.

[Figure 32.5.1: Differential of a real-valued function for second-order operators at a point]

These pointwise differentials may now be extended to the whole second-order tangent space.
32.5.3 Definition: For a function $f \in C^2(M)$ on a $C^2$ manifold $M$, the differential of $f$ (for second-order tangent operators) is the map $d^{[2]}f : \mathring{T}^{[2]}(M) \to \mathbb{R}$ defined by
$$\forall L \in \mathring{T}^{[2]}(M),\qquad (d^{[2]}f)(L) = L(f).$$
The differential of $f$ (for tagged second-order tangent operators) is the map $d^{[2]}f : \hat{T}^{[2]}(M) \to \mathbb{R}$ defined by
$$\forall (p, L) \in \hat{T}^{[2]}(M),\qquad (d^{[2]}f)\bigl((p, L)\bigr) = L(f).$$
The differential of $f$ (for second-order tangent vectors) is the map $d^{[2]}f : T^{[2]}(M) \to \mathbb{R}$ defined by
$$\forall W \in T^{[2]}(M),\qquad (d^{[2]}f)(W) = D_W f.$$

[ Define the “generalized Hessian” in a $C^2$ manifold as the dual of $T_p^{[2]}(M)$. I.e. this Hessian is the linear map $t^{[2]}_{p,a,b,\psi} \mapsto \partial^{[2]}_{p,a,b,\psi}(f)$. ]
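For a concrete reading of these definitions, the following sketch assumes $M = \mathbb{R}^2$ with the identity chart $\psi$, the function $f(x^1, x^2) = x^1 x^2$, and the coordinate form $\partial^{[2]}_{p,a,b,\psi} = a^{ij}\partial_{x^i}\partial_{x^j} + b^i\partial_{x^i}$ (evaluated at $x = \psi(p)$, with symmetric $a$ and summation over repeated indices) for a second-order tangent operator. The example is chosen here for illustration and is not taken from the text.

```latex
\begin{align*}
(d^{[2]}f)_p\bigl(\partial^{[2]}_{p,a,b,\psi}\bigr)
  &= \partial^{[2]}_{p,a,b,\psi}(f) \\
  &= a^{12} + a^{21} + b^1 p^2 + b^2 p^1 \\
  &= 2a^{12} + b^1 p^2 + b^2 p^1 .
\end{align*}
```

At the critical point $p = (0,0)$ of $f$, the value reduces to $2a^{12}$, which depends only on the second-order coefficients $a$.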
32.6. Hessian operators at critical points

[ Should determine whether “stationary” might be a more accurate word than “critical”. ]

32.6.1 Remark: It is perhaps surprising that the Hessian of a real-valued function is, under the right circumstances, a well-defined tensor even in the absence of a connection. The usual definition of the Hessian operator incorporates a connection. (See Remark 30.6.1 for a comment on general Hessians. See Greene/Wu [67], page 7, for Hessians with a connection.) It will be shown here that the Hessian of a real-valued function $f \in C^2(M)$ for a $C^2$ manifold $M$, at a critical point of $f$, namely a point $p \in M$ such that $(df)_p = 0$, is a well-defined tensor in $\mathring{T}_p^{0,2}(M)$. Obviously, since this “connectionless” version of the Hessian is restricted to critical points, it is not very effective at generating interesting vector fields from real-valued functions in the way that the first-order differential $df$ does.

It is tempting to use a notation such as $(d^2f)_p$ for the Hessian of $f$ at $p$, but this has some difficulties. The differential of $df$ should be a map from $T(T(M))$ to $\mathbb{R}$, as discussed in Section 32.1. In the presence of a connection, it is usual to write $D^2f$ for the Hessian of $f$. This is the same thing as the Hessian in Theorem 32.6.2 if $(df)_p = 0$. It is probably more accurate to denote the Hessian as $(d^{[2]}f)_p$, using the notation of Section 32.5.

32.6.2 Theorem: Let $M$ be a $C^2$ manifold. Then for any real-valued function $f \in C^2(M)$ and point $p \in M$ such that $(df)_p = 0$, the map $H_{f,p} : T_p(M) \times T_p(M) \to \mathbb{R}$ defined for all $t_{p,u,\psi}, t_{p,v,\psi} \in T_p(M)$ by
$$H_{f,p}\bigl(t_{p,u,\psi}, t_{p,v,\psi}\bigr) = \sum_{i,j=1}^n u^i v^j\, \frac{\partial^2}{\partial x^i \partial x^j}(f\circ\psi^{-1})(x)\Big|_{x=\psi(p)} \tag{32.6.1}$$
is a tensor in $T_p^{0,2}(M)$ which is independent of the chart $\psi \in \mathrm{atlas}_p(M)$.

Proof: [ See Theorem 30.1.3 for transformation rules for second-order operators. ] [ Must show that $H_{f,p}$ is a tensor and that it is chart-independent. Should also show that when $(df)_p \ne 0$, it is a tensor but is chart-dependent. ] For a fixed chart $\psi$, the function $H_{f,p}$ is clearly bilinear with respect to $u, v \in \mathbb{R}^n$. Therefore for this fixed chart, $H_{f,p} \in \mathcal{L}^2(T_p(M), \mathbb{R}) \cong \mathcal{L}^2(T_p^*(M), \mathbb{R})^* = T_p^{0,2}(M)$.

32.6.3 Remark: For $f$ and $\psi$ as in Theorem 32.6.2, define $f_{ij} = \partial^2/\partial x^i \partial x^j\,(f\circ\psi^{-1})(x)\big|_{x=\psi(p)}$ for $i, j = 1, \ldots, n$. Then $H_{f,p} = \sum_{i,j=1}^n f_{ij}\, e^i \otimes e^j$, where the cotangent vectors $e^i = (d\psi^i)_p \in T_p^*(M)$ are combined to obtain $e^i \otimes e^j \in T_p^{0,2}(M)$, and the right-hand side of equation (32.6.1) may be written as $\sum_{i,j=1}^n f_{ij} u^i v^j$.

[ Interpret Hessians in the context of contravariant tensors $a^{ij} e_i \otimes e_j \in T_p^{2,0}(M)$. ]

32.7. Differentials of differentiable maps for higher-order operators

Differentials for second-order operators should be useful for the second-order analysis of “geodesic leverage” or “pantograph” maps in Section 38.8.

Differentiable maps are defined in Section 27.8. Higher-order tangent vectors and tangent operators are defined in Section 30.3. First-order differentials of differentiable maps are defined for first-order tangent vectors and operators in Section 31.3. The first-order differential of a differentiable map for second-order tangent vectors and operators is defined in this section in terms of tangent vector coordinates (Definition 32.7.1) and then in terms of tangent operators (Definition 32.7.3).

32.7.1 Definition: The differential for second-order tangent vectors at a point $p \in M_1$ of a differentiable map $\phi \in C^2(M_1, M_2)$, where $M_1$ and $M_2$ are $C^2$ manifolds with $m = \dim(M_1)$ and $n = \dim(M_2)$, is the map $(d^{[2]}\phi)_p : T_p^{[2]}(M_1) \to T_{\phi(p)}^{[2]}(M_2)$ defined by
$$\forall t^{[2]}_{p,a,b,\psi_1} \in T_p^{[2]}(M_1),\ \forall \psi_2 \in \mathrm{atlas}_{\phi(p)}(M_2),\qquad (d^{[2]}\phi)_p\bigl(t^{[2]}_{p,a,b,\psi_1}\bigr) = t^{[2]}_{\phi(p),\tilde a,\tilde b,\psi_2},$$
where $\tilde a \in \mathrm{Sym}(n, \mathbb{R})$ is the real symmetric $n \times n$ matrix defined by
$$\tilde a^{k\ell} = a^{ij}\, \partial_{x^i}\bigl(\psi_2^k\circ\phi\circ\psi_1^{-1}(x)\bigr)\Big|_{x=\psi_1(p)}\, \partial_{x^j}\bigl(\psi_2^\ell\circ\phi\circ\psi_1^{-1}(x)\bigr)\Big|_{x=\psi_1(p)}$$
for all $k, \ell = 1, \ldots, n$, and $\tilde b \in \mathbb{R}^n$ is defined by
$$\begin{aligned}
\tilde b^k &= a^{ij}\, \partial_{x^i}\partial_{x^j}\bigl(\psi_2^k\circ\phi\circ\psi_1^{-1}(x)\bigr)\Big|_{x=\psi_1(p)} + b^i\, \partial_{x^i}\bigl(\psi_2^k\circ\phi\circ\psi_1^{-1}(x)\bigr)\Big|_{x=\psi_1(p)} \\
&= \partial^{[2]}_{p,a,b,\psi_1}(\psi_2^k\circ\phi)
\end{aligned}$$
for all $k = 1, \ldots, n$.

32.7.2 Remark: As in the case of first-order tangent operators, the differential of a map for second-order operators is determined in Definition 32.7.3 by simply passing the test function from $M_2$ to $M_1$ via the map. Once again, the form of the definition appears much simpler for operators than for coordinate vectors.

The expression for $\tilde a^{k\ell}$ looks like $\sum_{i,j=1}^m a^{ij}\, \partial_{p,e_i,\psi_1}(\psi_2^k\circ\phi)\, \partial_{p,e_j,\psi_1}(\psi_2^\ell\circ\phi)$, where $e_i \in \mathbb{R}^m$ are the standard orthonormal basis vectors. This expression resembles a $T^{2,0}(M_1)$ tensor. [ Must check whether there is any truth in this. ]

32.7.3 Definition: The differential for second-order operators at a point $p \in M_1$ of a differentiable map $\phi \in C^2(M_1, M_2)$, for $C^2$ manifolds $M_1$ and $M_2$, is the map $(d^{[2]}\phi)_p : \mathring{T}_p^{[2]}(M_1) \to \mathring{T}_{\phi(p)}^{[2]}(M_2)$ defined by
$$\forall L \in \mathring{T}_p^{[2]}(M_1),\ \forall f \in C^2(M_2),\qquad \bigl((d^{[2]}\phi)_p(L)\bigr)(f) = L(f\circ\phi).$$
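As a hedged illustration of Definitions 32.7.1 and 32.7.3, consider the simplest non-trivial case. The sketch assumes $M_1 = M_2 = \mathbb{R}$ with identity charts (so $m = n = 1$ and the index sums have a single term), the map $\phi(x) = x^2$, and a second-order operator $L = a\,\partial_x^2 + b\,\partial_x$ at the point $p$. All of these choices are made here for illustration.

```latex
\begin{align*}
\tilde a &= a\,(\phi'(p))^2 = 4p^2 a, \qquad
\tilde b = a\,\phi''(p) + b\,\phi'(p) = 2a + 2pb, \\
L(f\circ\phi)
  &= \bigl(a\,\partial_x^2 + b\,\partial_x\bigr) f(x^2)\Big|_{x=p} \\
  &= 4p^2 a\, f''(p^2) + (2a + 2pb)\, f'(p^2) \\
  &= \tilde a\, f''(\phi(p)) + \tilde b\, f'(\phi(p)).
\end{align*}
```

In this case the coordinate definition and the operator definition give the same second-order operator at $\phi(p) = p^2$. Note that the first-order coefficient $\tilde b$ picks up the contribution $a\,\phi''(p)$ from the second-order part of $L$.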
(d[2] φ)p (L) ˚p[2] (M1 ) T
L
(d[2] φ)p
˚[2] (M2 ) T φ(p)
π1 ψ1
π2
p
φ
M1
IRm
ψ2
φ(p) M2
f ◦φ
f
IRn
IR Figure 32.7.1
The differential of a map for second-order operators
32.7.5 Theorem: Definitions 32.7.1 and 32.7.3 are consistent with each other. That is, D(d[2] φ)p (W ) = [2]
(d[2] φ)p (DW ) for all W ∈ Tp (M1 ) and p ∈ M1 . [2]
Proof: (d[2] φ)p (W ) is the result of the differential of φ applied to W ∈ Tp (M1 ), whereas (d[2] φ)p (DW ) ˚p[2] (M1 ). To prove Theorem 32.7.5, it is necessary to is the result of the differential of φ applied to DW ∈ T calculate (d[2] φ)p (L) (f ) with L = DW . [2] [2] [2] ˚p[2] (M1 ). So for ψ2 ∈ atlasφ(p) (M2 ) and f ∈ Let W = tp,a,b,ψ1 ∈ Tp (M1 ). Then L = DW = ∂p,a,b,ψ1 ∈ T [ www.topology.org/tex/conc/dg.html ]
[ draft: UTC 2009–3–21 Saturday 11:36 ]
Differential geometry reconstructed.Copyright (C) 2009, Alan U. Kennington. All Rights Reserved.The author hereby grants permission to print this book draft in A4 format.Printing in all other formats is forbidden.You may not charge any fee for copies of this book draft.
32.7.4 Remark: The map (d[2] φ)p is obviously linear. Figure 32.7.1 shows spaces and maps relevant to ˚(Mk ) → Mk for k = 1, 2. Definition 32.7.3, and the projection maps πk : T
648
32. Higher-order differentials
C 2 (M1 ), [2] (d[2] φ)p (L) (f ) = ∂p,a,b,ψ1 (f ◦ φ) = (aij ∂xi ∂xj + bi ∂xi ) (f ◦ ψ2−1 ) ◦ (ψ2 ◦ φ ◦ ψ1−1 (x)) x=ψ1 (p) −1 −1 ij k = a ∂x˜k ∂x˜ℓ f ◦ ψ2 (˜ x) ∂xi ψ2 ◦ φ ◦ ψ1 (x) ∂xj ψ2ℓ ◦ φ ◦ ψ1−1 (x) x ˜=ψ2 (φ(p)) x=ψ1 (p) x=ψ1 (p) −1 −1 ij k + a ∂x˜k f ◦ ψ2 (˜ x) ∂xi ∂xj ψ2 ◦ φ ◦ ψ1 (x) x ˜=ψ2 (φ(p)) x=ψ1 (p) −1 −1 i k + b ∂x˜k f ◦ ψ2 (˜ x) ∂xi ψ2 ◦ φ ◦ ψ1 (x) x ˜=ψ2 (φ(p)) x=ψ1 (p) =a ˜kℓ ∂x˜k ∂x˜ℓ f ◦ ψ2−1 (˜ + ˜bk ∂x˜k f ◦ ψ2−1 (˜ x) x) x ˜=ψ2 (φ(p))
=
x ˜=ψ2 (φ(p))
[2] ∂φ(p),˜a,˜b,ψ (f ) 2
= Dt[2]
(f )
φ(p),˜ a,˜ b,ψ2
= D(d[2] φ)p (W ) (f ), where a ˜ and ˜b are as in Definition 32.7.1. The theorem follows immediately. [ Must find or make a macro to permit page-splitting in the above equation display. ] 32.7.6 Remark: Theorem 32.7.5 leads to the commutative diagram illustrated in Figure 32.7.2. (d[2] φ)p
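Figure 32.7.2 Differential map for second-order vectors and operators
[ Commutative diagram with corners $T(M_1)$, $\mathring{T}(M_1)$, $T(M_2)$ and $\mathring{T}(M_2)$: the maps $D$ send $T(M_k)$ to $\mathring{T}(M_k)$ for $k = 1, 2$, and the maps $(d^{[2]}\phi)_p$ send $T(M_1)$ to $T(M_2)$ and $\mathring{T}(M_1)$ to $\mathring{T}(M_2)$. ]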
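The coefficient transformation that appears in the proof of Theorem 32.7.5 can be checked by hand in one dimension. The following sketch assumes $M_1 = M_2 = \mathbb{R}$ with identity charts and the illustrative map $\phi(x) = x^2$, none of which come from the text. The expressions for $\tilde a$ and $\tilde b$ are read off from the chain rule as in the proof, not quoted from Definition 32.7.1.

```latex
% Hypothetical example: \phi(x) = x^2 on \mathbb{R}, identity charts,
% L = a\,\partial_x^2 + b\,\partial_x evaluated at the point p.
\begin{align*}
L(f\circ\phi)
  &= a\,\big(f''(\phi(p))\,\phi'(p)^2 + f'(\phi(p))\,\phi''(p)\big)
     + b\,f'(\phi(p))\,\phi'(p) \\
  &= \underbrace{4ap^2}_{\tilde a}\,f''(p^2)
     + \underbrace{(2a + 2bp)}_{\tilde b}\,f'(p^2).
\end{align*}
```

In this example the second-order coefficient transforms by the square of $\phi'(p)$, while the first-order coefficient picks up the extra term $a\,\phi''(p)$. The first-order part therefore behaves like an ordinary tangent vector only when $a = 0$.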
32.8. Differentials of curves for higher-order operators

This section deals with the push-forth of higher-order operators according to curves and families of curves.

[ For $\gamma : \mathbb{R} \to M$, should get $(d^{[2]}\gamma)(t) \in T^{[2]}_{\gamma(t)}(M)$. ]
Chapter 33 Vector field calculus

33.1 Naive vector field derivatives
33.2 The Poisson bracket
33.3 Vector field derivatives for curve families
33.4 Lie derivatives of vector fields
33.5 Lie derivatives of tensor fields
33.6 The exterior derivative
33.1. Naive vector field derivatives

Naive vector field derivatives are derivatives with respect to vector fields on manifolds which do not use any definition of parallelism. This may be contrasted with Lie derivatives, which use parallel transport with respect to the vector field, and covariant derivatives, which use parallel transport with respect to a connection. When there is some definition of parallel transport, it is possible to use some notion of “pull-back” to force the function being differentiated to stay inside the fibre set at a fixed point of a manifold so that the derivative of the function can remain in the same class of functions. However, in the absence of any notion of parallel transport, the function and its derivative are generally in different function classes. Therefore naive vector field derivatives are often of very limited utility. On the other hand, they are important for the definition of other derivatives such as the Lie and covariant derivatives.

33.1.1 Remark: The action $\partial_V$ in Definition 33.1.2 is the same as the action $D_V$ in Notation 28.5.7. The differential action with respect to vector fields in Definition 33.1.3 is defined in terms of this pointwise differential operator $\partial_V = D_V$ for real-valued functions. The vector field spaces $X(M)$ and $X^r(M)$ for $r \in \bar{\mathbb{Z}}^+_0$ are defined in Notation 29.5.7.

33.1.2 Definition: The action of a vector $V \in T(M)$ on $C^1$ real-valued functions on a $C^1$ manifold $M$ is the map $\partial_V : C^1(M) \to \mathbb{R}$ where $\partial_V f = D_V f$ for all $f \in C^1(M)$.

33.1.3 Definition: The action of a vector field $X \in X(M)$ on $C^1$ real-valued functions for a $C^1$ manifold $M$ is the map $\partial_X : C^1(M) \to (M \to \mathbb{R})$ where $\partial_X f : M \to \mathbb{R}$ is defined by $(\partial_X f)(p) = \partial_{X(p)} f$ for all $f \in C^1(M)$ and $p \in M$.

33.1.4 Remark: The action $\partial_X$ in Definition 33.1.3 is the same as the action $D_X$ in Notation 29.5.10. It is redefined here with a new notation to emphasize that it is a member of a family of naive differential actions, whereas the notation $D_X$ emphasizes membership of the family of covariant derivatives. These derivatives just happen to be identical in the case of derivatives of real-valued functions.

33.1.5 Theorem: A vector field $X$ in a $C^{k+1}$ manifold $M$ is of class $C^k$ for $k \in \bar{\mathbb{Z}}^+_0$ if and only if

$$\forall f \in C^{k+1}(M), \quad \partial_X f \in C^k(M).$$

33.1.6 Remark: Theorem 33.1.5 suggests that one may compose two vector field actions $\partial_X$ and $\partial_Y$ to construct an action $\partial_X \circ \partial_Y : C^{k+2}(M) \to C^k(M)$, although this is clearly not a simple vector field action. This is discussed in Section 33.2. It turns out that $\partial_X \circ \partial_Y \in \mathring{T}^{[2]}(M)$.
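The actions in Definition 33.1.3 and the composition mentioned in Remark 33.1.6 can be tried out on a small example. The following sketch assumes $M = \mathbb{R}^2$ with the identity chart and coordinates $(x, y)$. The component functions $\xi$, $\eta$ and the test function $f$ are illustrative choices that do not appear in the text.

```latex
% Hypothetical example: X has components \xi(x, y) = (-y, x),
% Y has components \eta(x, y) = (1, 0), and f(x, y) = xy.
\begin{align*}
(\partial_X f)(x, y) &= \xi^i\,\partial_i f = -y\cdot y + x\cdot x = x^2 - y^2, \\
(\partial_Y f)(x, y) &= \eta^i\,\partial_i f = y, \\
(\partial_X\partial_Y f)(x, y) &= \xi^i\,\partial_i(y) = x, \\
(\partial_Y\partial_X f)(x, y) &= \partial_x(x^2 - y^2) = 2x.
\end{align*}
```

The two compositions disagree for this $f$, and in general each of them involves second derivatives of $f$, which suggests why neither can be the action of a single vector field.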
33.1.7 Remark: Corresponding to the vector field action on real-valued functions in Definition 33.1.3 is the vector field action on vector fields in Definition 33.1.11. This is defined in terms of the action of a vector on a vector field in Definition 33.1.8. The partial derivative notation $\partial_X Y$ emphasizes that this derivative is not the same as the covariant derivative, which is denoted as $D_X Y$, or the Lie derivative, which is denoted as $L_X Y$. Both $D_X Y$ and $L_X Y$ are constructed from $\partial_X Y$ by subtracting suitable correction terms.

33.1.8 Definition: The action of a vector $V \in T(M)$ on $C^1$ vector fields on a $C^2$ manifold $M$ is the map $\partial_V : X^1(M) \to T(T(M))$ where $\partial_V Y \in T_{Y(p)}(T(M))$ is defined for $V \in T_p(M)$ and $Y \in X^1(M)$ by $\partial_V Y = Y_*(V)$, where $Y_* : T(M) \to T(T(M))$ is the induced map of $Y$.

33.1.9 Remark: The differential action $\partial_V Y$ in Definition 33.1.8 is the rate of change of the vector field $Y$ in the direction $V$. This rate of change is not in the tangent space $T(M)$, but it is in the second tangent space $T^{(2)}(M) = T(T(M))$. Therefore it is a well-defined vector, although as all texts point out, its components do not transform like a vector in $T(M)$. The vector $Y_*(V)$ is also denoted as $(dY)_p(V)$.

33.1.10 Theorem: Let $M$ be a $C^2$ manifold. Let $V = t_{p,v,\psi} \in T(M)$ and $Y \in X^1(M)$. Then the action $\partial_V Y$ in Definition 33.1.8 satisfies

$$\partial_V Y = \Big[\Big(Y(p),\ \big(v,\ v^k\,\partial_{x^k}\eta(x)\big|_{x=\psi(p)}\big),\ \tilde\psi\Big)\Big],$$

where $\eta = \pi_2 \circ \tilde\psi \circ Y \circ \psi^{-1}$ is the component function of $Y$ for the chart $\psi$.

33.1.11 Definition: The action of a vector field $X \in X(M)$ on $C^1$ vector fields for a $C^2$ manifold $M$ is the map $\partial_X : X^1(M) \to (M \to T(T(M)))$ where $\partial_X Y : M \to T(T(M))$ is defined by $(\partial_X Y)(p) = \partial_{X(p)} Y$ for all $Y \in X^1(M)$ and $p \in M$.

33.1.12 Remark: Unfortunately, the result of differentiating a vector field $Y$ with respect to a vector field $X$ in Definition 33.1.11 is not a vector field in a space such as $X^0(T(T(M)))$ as one would hope. Instead, $\partial_X Y \in X^0(T(T(M)), \pi \circ \pi_*, M)$, where $(T(T(M)), \pi \circ \pi_*, M)$ is the fibration of double tangents over the base space $M$ instead of the usual $T(M)$. The map $\pi \circ \pi_*$ is the composition of the projection maps $\pi : T(M) \to M$ and $\pi_* : T(T(M)) \to T(M)$. Thus $\partial_X Y$ is a cross-section of this space of second tangents over $M$ as a base space. The reason for this is that $\partial_X Y$ only specifies a single vector in $T(T(M))$ for each point $p \in M$ rather than a vector at all points of $T(M)$. Theorem 33.1.13 is written in terms of the set $X^k(T(T(M)), \pi \circ \pi_*, M)$ of $C^k$ cross-sections of the fibration of the second tangent space $T(T(M))$ of $M$ over the base space $M$. The general notation $X^k(E, \pi, B)$ for the set of $C^k$ cross-sections of a fibration $(E, \pi, B)$ is defined in Notation 27.12.14.

33.1.13 Theorem: Let $M$ be a $C^{k+2}$ manifold for some $k \in \bar{\mathbb{Z}}^+_0$. Then

$$\forall X \in X^k(M),\ \forall Y \in X^{k+1}(M), \quad \partial_X Y \in X^k\big(T(T(M)), \pi \circ \pi_*, M\big),$$

where $\pi : T(M) \to M$ and $\pi_* : T(T(M)) \to T(M)$ are the usual projection maps.

[ The notation $X^k(E, \pi, M)$ for differentiable fibrations $(E, \pi, M)$ in Theorems 33.1.13 and 33.1.18 is defined in Chapter 35. ]

33.1.14 Remark: The differential action $\partial_X Y$ in Definition 33.1.11 satisfies $(\partial_X Y)(p) = Y_*(X(p))$ for all $p \in M$. In other words, $\partial_X Y = Y_* \circ X$. Since $Y$ is a cross-section of $T(M)$, it satisfies $\pi \circ Y = \mathrm{id}_M$, where $\pi : T(M) \to M$ is the projection map of $T(M)$ and $\mathrm{id}_M$ is the identity on $M$. Therefore $\pi_* \circ Y_* = \mathrm{id}_{T(M)}$, from which it follows that $\pi_*(\partial_V Y) = V$ for all $V \in T_p(M)$. Hence $\pi_*((\partial_X Y)(p)) = X(p)$ for all $p \in M$. In other words, $\pi_* \circ (\partial_X Y) = X$.

33.1.15 Remark: Strictly speaking, there should be different notations for the action of vectors on real-valued functions and vector fields. The notation $\partial_V$ refers to different operators in expressions such as $\partial_V f$ for $f \in C^1(M)$ and $\partial_V X$ for $X \in X^1(M)$. Naive vector field derivatives may be generalized to differentiable cross-sections of any differentiable fibre bundle. Of special interest are tensor fields of general type.

The space $X^k_{r,s}(M) = X^k(T^{r,s}(M))$ for $k \in \bar{\mathbb{Z}}^+_0$ consists of all $C^k$ tensor fields of type $(r, s)$ on a $C^{k+1}$ manifold $M$. If a field $K \in X^1_{r,s}(M)$ is differentiated in a naive manner with respect to a vector $V \in T(M)$, the result is a vector $\partial_V K$ in the tangent space $T_{K(p)}(T^{r,s}(M))$ of the total tensor space $T^{r,s}(M)$, where $p = \pi(V)$. Therefore this differential operator has domain and range $\partial_V : X^1_{r,s}(M) \to T(T^{r,s}(M))$. It seems reasonable to label the operator with the tensor type. Thus $\partial^{r,s}_V : X^1_{r,s}(M) \to T(T^{r,s}(M))$. From this perspective, the operator on real-valued functions is notated as $\partial^{0,0}_V f = \partial_V f = D_V f$, and the operator on vector fields is notated as $\partial^{1,0}_V X = \partial_V X$.

33.1.16 Definition: The action of a vector $V \in T(M)$ on $C^1$ tensor fields of type $(r, s)$ on a $C^2$ manifold $M$ is the map $\partial^{r,s}_V : X^1(T^{r,s}(M)) \to T(T^{r,s}(M))$ where $\partial^{r,s}_V K \in T_{K(p)}(T^{r,s}(M))$ is defined for $V \in T_p(M)$ and $K \in X^1(T^{r,s}(M))$ by $\partial^{r,s}_V K = K_*(V)$, where $K_* : T(M) \to T(T^{r,s}(M))$ is the induced map of $K$.

33.1.17 Definition: The action of a vector field $X \in X(M)$ on $C^1$ tensor fields of type $(r, s)$ on a $C^2$ manifold $M$ is the map $\partial^{r,s}_X : X^1(T^{r,s}(M)) \to (M \to T(T^{r,s}(M)))$ where $\partial^{r,s}_X K : M \to T(T^{r,s}(M))$ is defined by $(\partial^{r,s}_X K)(p) = \partial^{r,s}_{X(p)} K$ for all $K \in X^1(T^{r,s}(M))$ and $p \in M$.

33.1.18 Theorem: Let $M$ be a $C^{k+2}$ manifold for some $k \in \bar{\mathbb{Z}}^+_0$. Then for all tensor types $(r, s)$,

$$\forall X \in X^k(M),\ \forall K \in X^{k+1}(T^{r,s}(M)), \quad \partial_X K = \partial^{r,s}_X K \in X^k\big(T(T^{r,s}(M)), \pi \circ \hat\pi, M\big),$$

where $\pi : T^{r,s}(M) \to M$ and $\hat\pi : T(T^{r,s}(M)) \to T^{r,s}(M)$ are the usual projection maps.
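The component formula in Theorem 33.1.10 and the projection identity in Remark 33.1.14 can be checked on a concrete field. The following sketch assumes $M = \mathbb{R}^2$ with the identity chart $\psi$ and coordinates $(x, y)$. The field $\eta$, the point $p$ and the vector components $v$ are illustrative choices, and the bracketed tuple follows the $[(\ldots)]$ style of Remark 33.2.6, which is an assumption about the intended display.

```latex
% Hypothetical example: \eta(x, y) = (x^2, xy), p = (1, 2), v = (3, -1).
\begin{align*}
\partial_x\eta\big|_p &= (2x,\ y)\big|_p = (2, 2), \qquad
\partial_y\eta\big|_p = (0,\ x)\big|_p = (0, 1), \\
v^k\,\partial_{x^k}\eta(x)\big|_{x=\psi(p)} &= 3\,(2, 2) + (-1)\,(0, 1) = (6, 5), \\
\partial_V Y &= \big[\big(Y(p),\ ((3, -1),\ (6, 5)),\ \tilde\psi\big)\big],
\qquad Y(p) = t_{p,\,(1, 2),\,\psi}.
\end{align*}
```

The first block of components, $(3, -1)$, is just $v$, in agreement with $\pi_*(\partial_V Y) = V$. Only the second block, $(6, 5)$, depends on the derivatives of $Y$.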
33.2. The Poisson bracket

The Poisson bracket is applicable to the definition of curvature of a connection for a general differentiable fibre bundle, with or without a Lie structure group. An “infinitesimal curve” in the base space of a differentiable fibre bundle induces a parallel motion of the fibre set over the points of the curve, and if this parallel motion is viewed through a fibre chart, the motion is a global diffeomorphism of the fibre space, which is a differentiable manifold. Therefore all connections are equivalent to vector fields which are generated by diffeomorphisms. The most important characteristic of a connection is its curvature, which is defined as the commutator of these vector fields. For example, if $X_k$ is the field generated by the motion of a base point with velocity $V_k$ for $k = 1, 2$, then the curvature in the “plane” spanned by the ordered pair $(V_1, V_2)$ is the commutator $[X_1, X_2]$. Thus the fibre space vector field algebra corresponding to connections gives a very general definition of curvature even when the customary connection form, which is defined in terms of the Lie algebra of the structure group, is not itself defined.

In order to present the Poisson bracket of vector fields in a logically correct fashion, it is necessary to make use of the second-order tangent vector space $T^{[2]}(M)$ (Section 30.3) and the second-order tangent operator space $\mathring{T}^{[2]}(M)$ (Section 30.1).

[ Present commuting vector fields on two-parameter curve families here. Try to show that the fields commute if and only if they are the tangent fields for the parameters of a two-parameter curve family, or something like that. Comment on the relation of this to “zero torsion”. ]

[ See EDM2 [34], 82.B, for canonical transformations and the Poisson bracket. See also EDM2 [34], 271.F, for the relation to Hamilton-Jacobi equations, and 324.C,D for the relation to first-order PDEs. ]

33.2.1 Remark: Vector fields $X, Y \in X^1(T(M))$ for a $C^2$ manifold $M$ can be combined in the sense of differential operator fields $\partial_X, \partial_Y \in X^1(\mathring{T}(M))$ acting on the space of real-valued functions $C^2(M)$. Thus $\partial_X(\partial_Y f) \in C^0(M)$ for $f \in C^2(M)$. The composition $\partial_X \circ \partial_Y\big|_{C^2(M)}$ of these differential operators is in the tangent operator field space $X^0(\mathring{T}^{[2]}(M))$ which is defined in Section 30.7. This corresponds to the component-oriented tangent vector field $XY$ in $X^0(T^{[2]}(M))$ given by Definition 33.2.2.

33.2.2 Definition: The composition of vector fields $X \in X(M)$ and $Y \in X^1(M)$ for a $C^2$ manifold $M$ is the second-order vector field $XY \in X^0(T^{[2]}(M))$ given by

$$\forall p \in M,\ \forall \psi \in \mathrm{atlas}_p(M), \quad (XY)(p) = t^{[2]}_{p,\,a(\psi(p)),\,b(\psi(p)),\,\psi} \in T^{[2]}_p(M)$$

with $a = (a^{ij})_{i,j=1}^n : \mathrm{Range}(\psi) \to \mathrm{Sym}(n, \mathbb{R})$ and $b = (b^j)_{j=1}^n : \mathrm{Range}(\psi) \to \mathbb{R}^n$ for $n = \dim(M)$ defined by

$$
\begin{aligned}
\forall x \in \mathrm{Range}(\psi),\ \forall i, j = 1 \ldots n, \quad a^{ij}(x) &= \tfrac12\big(\xi^i(x)\eta^j(x) + \eta^i(x)\xi^j(x)\big) \\
\forall x \in \mathrm{Range}(\psi),\ \forall j = 1 \ldots n, \quad b^j(x) &= \xi^i(x)\,\partial_{x^i}\eta^j(x),
\end{aligned}
$$

where $\xi : \mathrm{Range}(\psi) \to \mathbb{R}^n$ and $\eta \in C^1(\mathrm{Range}(\psi), \mathbb{R}^n)$ are component functions for $X$ and $Y$ respectively for the chart $\psi$, so that $X(p) = t_{p,\xi(\psi(p)),\psi}$ and $Y(p) = t_{p,\eta(\psi(p)),\psi}$ for $p \in \mathrm{Dom}(\psi)$.

33.2.3 Remark: Theorem 33.2.4 uses the drop function $\varpi : T^{(2)}(M) \to T^{[2]}(M)$ in Definition 30.5.2. This theorem follows easily by comparing the expression for $b^j(x)$ in Definition 33.2.2 with the chart-dependent vertical component of $\partial_V Y$ in Theorem 33.1.10. The identity $\varpi \circ (\partial_X Y) = XY$ can be interpreted by applying both sides as differential operators to a quadratic polynomial. Then the left side can be thought of as the divergence of the gradient whereas the right side can be thought of as some sort of Hessian operator. Theorem 33.2.4 is a generalization of the identity $L_X Y = [X, Y]$ in Theorem 33.4.13. The generalization probably isn’t very useful, but you never can tell with these things. Waste not, want not!

33.2.4 Theorem: $\varpi \circ (\partial_X Y) = XY$ for any $X \in X(M)$ and $Y \in X^1(M)$ for a $C^2$ manifold $M$.

33.2.5 Remark: From Definition 33.2.2, the commutator $[X, Y] = XY - YX$ for $X, Y \in X^1(M)$ for a $C^2$ manifold $M$ satisfies:

$$\forall p \in M,\ \forall \psi \in \mathrm{atlas}_p(M), \quad [X, Y](p) = t^{[2]}_{p,\,0,\,(\xi^i\partial_i\eta^j - \eta^i\partial_i\xi^j)_{j=1}^n,\,\psi} \equiv t_{p,\,(\xi^i\partial_i\eta^j - \eta^i\partial_i\xi^j)_{j=1}^n,\,\psi} \in T_p(M),$$

where 0 denotes the $n \times n$ zero matrix. A second-order tangent vector is equivalent to a first-order tangent vector if the second-order component is zero. This vector $[X, Y](p)$ is chart-independent.
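The component tuples in Definition 33.2.2 and Remark 33.2.5 can be written out for a specific pair of fields. The following sketch assumes $M = \mathbb{R}^2$ with the identity chart and coordinates $(x, y)$. The component functions are illustrative choices, and the free index of $b$ is taken to be $j$, as in $b = (b^j)_{j=1}^n$.

```latex
% Hypothetical example: \xi(x, y) = (1, 0) and \eta(x, y) = (0, x).
\begin{align*}
XY:\quad & a = \tfrac12\begin{pmatrix} 0 & x \\ x & 0 \end{pmatrix},
  & b &= \xi^i\,\partial_i\eta = \partial_x(0, x) = (0, 1), \\
YX:\quad & a = \tfrac12\begin{pmatrix} 0 & x \\ x & 0 \end{pmatrix},
  & b &= \eta^i\,\partial_i\xi = x\,\partial_y(1, 0) = (0, 0), \\
[X, Y]:\quad & a = 0,
  & b &= (0, 1).
\end{align*}
```

The symmetric second-order parts of $XY$ and $YX$ are identical and cancel in the difference. What survives is the first-order vector with components $(0, 1)$, which matches the familiar operator computation $[\partial_x, x\,\partial_y] = \partial_y$.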
33.2.6 Remark: The calculation in Remark 33.2.5 shows that the tuple format $t^{[2]}_{p,a,b,\psi} = [(p, a, b, \psi)]$ for second-order tangent vectors is inconvenient. It would be more logical to present the components in increasing order, such as $[(p, b, a, \psi)]$. Even more logical would be an ordering such as $[(\psi, x, b, a)]$ with $x = \psi(p)$. Then vectors of all orders could be regarded as infinite tuples for which only a finite number of components are non-zero. Then an equivalence such as $[(\psi, x, b, a)] \equiv [(\psi, x, b)]$ would seem quite natural. Probably in computer software, things would be done this way. (Remark 30.3.17 is similar.)

[ Have a theorem stating that $[X, Y]$ in Remark 33.2.5 is chart-independent? ]

33.2.7 Definition: The composition of operator fields $\partial_X$ and $\partial_Y$ for $X \in X^0(M)$ and $Y \in X^1(M)$ for a $C^2$ manifold $M$ is the second-order operator field $\partial_X\partial_Y \in X^0(\mathring{T}^{[2]}(M))$ given by

$$\forall p \in M,\ \forall f \in C^2(M), \quad (\partial_X\partial_Y)(p)(f) = (\partial_X\partial_Y f)(p).$$

33.2.8 Theorem: Definitions 33.2.2 and 33.2.7 are equivalent. In other words, $\partial_{XY} = \partial_X\partial_Y$.

33.2.9 Remark: Theorem 33.2.8 may be compared with the calculation $\partial_{XY} f = \partial_X\partial_Y f = \partial_X(Y^i\partial_i f) = (\partial_X Y)^i\partial_i f + X^iY^j\partial_i\partial_j f$ for functions $f \in C^2(M)$. In other words, $\partial_{XY} = (\partial_X Y)^i\partial_i + X^iY^j\partial_i\partial_j$, where $X^i$ means $\xi^i$ and $Y^j$ means $\eta^j$. This informal calculation shows how the action of $XY$ is related to the naive derivative $\partial_X Y$.

33.2.10 Definition: The Poisson bracket on a $C^2$ manifold $M$ is the operation $[\cdot, \cdot] : X^1(M) \times X^1(M) \to X^0(M)$ defined by $[X, Y] = XY - YX$ for vector fields $X, Y \in X^1(M)$.

33.2.11 Theorem: For any $C^\infty$ manifold $M$, the tuple $X^\infty(M) \mathrel{-\!\!<} (\mathbb{R}, X^\infty(M), \sigma_{\mathbb{R}}, \tau_{\mathbb{R}}, \sigma_A, \tau_A, \mu)$ is a Lie algebra, where $\tau_A$ is the Poisson bracket on $M$ and $(\mathbb{R}, X^\infty(M), \sigma_{\mathbb{R}}, \tau_{\mathbb{R}}, \sigma_A, \mu)$ is the real linear space with pointwise addition and multiplication.

Proof: See Definition 9.11.1 for Lie algebras. This theorem follows by Theorem 9.11.9.

[ Determine how much regularity the Poisson bracket inherits from the fields $X$ and $Y$. ]

[ Gallot/Hulin/Lafontaine [19] has some material for this section? Proof of lemma 1.52 shows that vector fields are closed under the Poisson bracket; defn. after lemma 1.52 for Poisson bracket; theorem 1.63 and defn 1.64 on Lie derivative of the “push-forth”. ]
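The informal calculation in Remark 33.2.9 can be carried one step further to show why the commutator in Definition 33.2.10 is a first-order object. The following lines are a sketch in a single chart, writing $\xi$ and $\eta$ for the component functions of $X$ and $Y$ and assuming $f \in C^2(M)$ so that mixed partial derivatives commute.

```latex
% Sketch in one chart; \xi, \eta are the component functions of X, Y and f \in C^2.
\begin{align*}
\partial_X\partial_Y f &= \xi^i\,\partial_i(\eta^j\,\partial_j f)
  = \xi^i\eta^j\,\partial_i\partial_j f + \xi^i(\partial_i\eta^j)\,\partial_j f, \\
\partial_Y\partial_X f &= \eta^i\,\partial_i(\xi^j\,\partial_j f)
  = \eta^i\xi^j\,\partial_i\partial_j f + \eta^i(\partial_i\xi^j)\,\partial_j f, \\
\partial_X\partial_Y f - \partial_Y\partial_X f
  &= \big(\xi^i\,\partial_i\eta^j - \eta^i\,\partial_i\xi^j\big)\,\partial_j f.
\end{align*}
```

The second-order terms cancel because $\partial_i\partial_j f$ is symmetric in $i$ and $j$. The remaining coefficient is the component expression that appears in Remark 33.2.5.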
33.3. Vector field derivatives for curve families

The Poisson bracket $[X, Y]$ has some special properties if the vector fields $X$ and $Y$ are the vector fields generated by a two-parameter curve family.

33.3.1 Remark: Suppose that $\gamma : \mathbb{R}^2 \to M$ is a $C^2$ curve family in a $C^2$ manifold $M$, and define $X : \mathbb{R}^2 \to T(M)$ by $X(s, t) = \partial_s\gamma(s, t) \in T_{\gamma(s,t)}(M)$. Similarly define $Y : \mathbb{R}^2 \to T(M)$ by $Y(s, t) = \partial_t\gamma(s, t) \in T_{\gamma(s,t)}(M)$. These fields may be expressed more precisely as

$$
\begin{aligned}
\forall s, t \in \mathbb{R},\ \forall \psi \in \mathrm{atlas}_{\gamma(s,t)}(M), \quad X(s, t) &= \big[\big(\gamma(s, t),\ \partial_s(\psi(\gamma(s, t))),\ \psi\big)\big] \\
\forall s, t \in \mathbb{R},\ \forall \psi \in \mathrm{atlas}_{\gamma(s,t)}(M), \quad Y(s, t) &= \big[\big(\gamma(s, t),\ \partial_t(\psi(\gamma(s, t))),\ \psi\big)\big].
\end{aligned}
$$

For brevity, one may write simply $X = \gamma_s$ and $Y = \gamma_t$. It is useful to calculate various definitions of derivatives for such vector fields. For example, the action $\partial_X Y$ of one vector field on another in Definition 33.1.11 may be calculated as

$$(\partial_X Y)(p) = \Big[\Big(Y(s, t),\ \big(\partial_s\psi(\gamma(s, t)),\ \partial_s\partial_t\psi(\gamma(s, t))\big),\ \tilde\psi\Big)\Big],$$

for all $p = \gamma(s, t)$, where $\tilde\psi$ is the chart for $T(M)$ corresponding to $\psi \in \mathrm{atlas}_p(M)$. As expected from Theorem 33.1.10 and Remark 33.1.14, $(\partial_X Y)(p) \in T_{Y(p)}(T(M))$ and the horizontal component of $(\partial_X Y)(p)$ is $\pi_*((\partial_X Y)(p)) = X(p)$. A fly in the ointment here is the fact that $Y$ is not always a vector field in the sense required by Definition 33.1.11 because $\mathrm{Dom}(Y) = \mathrm{Dom}(\gamma)$ may not include a neighbourhood of $p$. A second fly in the ointment is the fact that $\gamma$ may not be injective and so $Y(s, t)$ may have two different values for $(s, t)$ such that $\gamma(s, t) = p$. Since $X(p)$ may also be multi-valued, the derivative $\partial_{X(p)} Y$ may become very ambiguous indeed. When there are so many flies in the ointment, it is best to throw away the ointment and buy a new jar.
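A concrete curve family makes the expressions in Remark 33.3.1 easier to follow. The following sketch assumes $M = \mathbb{R}^2$ with the identity chart $\psi$ and uses polar coordinates as the family. This choice is purely illustrative and does not appear in the text.

```latex
% Hypothetical example: \gamma(s, t) = (s\cos t,\ s\sin t) for s > 0,
% with \psi the identity chart on \mathbb{R}^2.
\begin{align*}
\partial_s(\psi(\gamma(s,t))) &= (\cos t,\ \sin t),
  & \partial_t(\psi(\gamma(s,t))) &= (-s\sin t,\ s\cos t), \\
\partial_s\partial_t(\psi(\gamma(s,t))) &= (-\sin t,\ \cos t),
  & \partial_t\partial_s(\psi(\gamma(s,t))) &= (-\sin t,\ \cos t).
\end{align*}
```

The two mixed derivatives agree, as they must for a $C^2$ family, so the vertical components of $\partial_X Y$ and $\partial_Y X$ coincide. Restricting $t$ to an open interval of length less than $2\pi$ makes $\gamma$ injective onto an open sector, which avoids both of the difficulties raised in the remark for this particular family.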
33.3.2 Theorem: Let $M$ be a $C^2$ manifold with $n = \dim(M) \ge 2$. Let $\psi$ be a $C^2$ chart for $M$. Define the vector fields $X$ and $Y$ on $\mathrm{Dom}(\psi)$ by $X = e^\psi_1$ and $Y = e^\psi_2$, where the basis vector fields $e^\psi_k : \mathrm{Dom}(\psi) \to T(M)$ are as in Definition 29.5.12. Then the action of $X$ on $Y$ satisfies

$$\forall p \in \mathrm{Dom}(\psi), \quad (\partial_X Y)(p) = \big[\big(Y(p),\ ((1, 0, \ldots 0),\ (0, \ldots 0)),\ \tilde\psi\big)\big].$$

The composition $XY$ of $X$ and $Y$ satisfies

$$\forall p \in \mathrm{Dom}(\psi), \quad (XY)(p) = \big[\big(p,\ a(p),\ b(p),\ \psi\big)\big],$$

where $a : \mathrm{Dom}(\psi) \to \mathrm{Sym}(n, \mathbb{R})$ is given by $a(p)^{ij} = \tfrac12(\delta^i_1\delta^j_2 + \delta^i_2\delta^j_1)$ for $i, j = 1, \ldots n$ and $b(p) = 0 \in \mathbb{R}^n$ for all $p \in \mathrm{Dom}(\psi)$. The Poisson bracket $[X, Y]$ satisfies $[X, Y](p) = 0$ for all $p \in \mathrm{Dom}(\psi)$.

[ Interpret Theorem 33.2.4 for $X = \gamma_s$, $Y = \gamma_t$. Also interpret this for contours etc. for functions $f \in C^2(\mathbb{R}^2, \mathbb{R})$. ]

33.4. Lie derivatives of vector fields

In this section, the Lie derivative is defined for vector fields acting on vector fields. This is shown to be equal to the Poisson bracket. In Section 33.5, the Lie derivative of general tensor fields is defined using the association between tensor spaces and tangent vector spaces.

33.4.1 Remark: The first task for defining Lie derivatives of vector fields is to define the differential of a vector under the flow generated by a vector field. This must then be subtracted from the actual differential of the vector field in the direction of the flow. The difference between the actual differential and the “with the flow” differential is defined as the Lie derivative of the vector field. This principle applies also to general tensor fields, but for such fields it is more difficult to define the “with the flow” differential.
33.4.2 Remark: The “flow” of one vector field $Y \in X^1(M)$ on a $C^2$ manifold with respect to a second vector field $X \in X^1(M)$ is defined in terms of a local family of differentiable maps $\phi : I \to (M \to M)$ which is constructed from the field $X$. This family $\phi$ does not necessarily have to exist or be unique. It is only used as a construct to motivate the definition of the notion of flow. The first field $Y$ may be thought of as the passive field which is acted on by the active field $X$. Thus the passive field $Y$ is made to flow according to the active field $X$, which may be thought of as the velocity field of a fluid flow.

Now suppose that for all $t \in I$, for some open interval $I \subseteq \mathbb{R}$ with $0 \in I$, there is a $C^2$ diffeomorphism $\phi(t) : M \to M$ and $\phi(0) = \mathrm{id}_M$ is the identity map on $M$. Suppose also that for all $p \in M$, the map $t \mapsto \phi(t)(p)$ is a $C^2$ map (actually a curve) from $I$ to $M$, and that this map satisfies $\partial_t(\phi(t)(p)) = X(\phi(t)(p))$ for all $p \in M$, which reduces to $X(p)$ at $t = 0$. (If such a family of diffeomorphisms exists, it is said to be generated by the vector field $X$.)

For a fixed $t \in I$, the map $\phi(t)$ can be differentiated at each point $p \in M$. This gives the differential $(d\phi(t))_p : T_p(M) \to T_{\phi(t)(p)}(M)$ of $\phi(t)$ at $p$. For any tangent vector $V = t_{p,v,\psi} \in T_p(M)$, the tangent vector $(d\phi(t))_p(V)$ represents the result of making $V$ flow under the influence of the field $X$. The vector $V$ may be thought of as the velocity of a curve $\gamma$ which passes through the point $p$. Under the diffeomorphism $\phi(t)$, this curve is transported to a new curve $\bar\gamma = \phi(t) \circ \gamma$ which passes through $\phi(t)(p)$. The velocity of $\bar\gamma$ as it passes through $\bar p = \phi(t)(p)$ is then $\bar V = (d\phi(t))_p(V)$. Let $n = \dim(M)$. Then $(d\phi(t))_p(V) = t_{\bar p,\bar v,\psi_2}$ for any $\psi_2 \in A_{M,\bar p}$, where

$$\forall j = 1 \ldots n, \quad \bar v^j = \sum_{i=1}^n v^i\,\partial_{y^i}\big(\psi_2^j \circ \phi(t) \circ \psi^{-1}(y)\big)\Big|_{y=\psi(p)}.$$

The quantity of interest is the differential of $(d\phi(t))_p(V)$ with respect to $t$ when $t = 0$, because this is the velocity of the motion of $V$ when it is made to flow according to the field $X$. The picture to have in mind is a vector $V$ attached to the point $p$ which moves according to the vector field $X$ along with all other points in a neighbourhood of $p$. The desired differential with respect to $t$ is the vector $\theta_X V \in T_V(T(M))$ defined as

$$
\begin{aligned}
\theta_X V &= \partial_t\big((d\phi(t))_p(V)\big)\Big|_{t=0} \\
&= \Big[\Big((d\phi(t))_p(V),\ \Big(\partial_t(\psi \circ \phi(t)(p)),\ \partial_t\big(v^i\,\partial_{y^i}(\psi \circ \phi(t) \circ \psi^{-1}(y))\big|_{y=\psi(p)}\big)\Big),\ \tilde\psi\Big)\Big]\Big|_{t=0} \qquad (33.4.1) \\
&= \Big[\Big(V,\ \big(\xi(\psi(p)),\ v^i\,\partial_{y^i}(\xi(y))\big|_{y=\psi(p)}\big),\ \tilde\psi\Big)\Big], \qquad (33.4.2)
\end{aligned}
$$

where $\psi_2$ is chosen equal to $\psi$ because $\phi(t)(p) = p$ when $t = 0$, and $\tilde\psi \in A_{T(M)}$ is the chart for $T(M)$ corresponding to $\psi \in A_M$ as in Definition 28.8.1. (If you think the plethora of parentheses is confusing, you obviously haven’t done much Lisp programming!)

The above calculation is possibly not instantly clear. The expression on line (33.4.1) has the form $t^{(2)}_{V,(w_1,w_2),\tilde\psi}$ for some $w_1, w_2 \in \mathbb{R}^n$. This is the usual way in which vectors in the tangent space $T(T(M))$ are expressed. (See Definitions 28.8.4 and 28.10.13 for details. Recall that vectors in $T(T(M))$ have $2n$ components.) In this case, $w_1 = \partial_t(\psi \circ \phi(t)(p))$ is the rate of change of the $n$ base space coordinates with respect to $t$. As expected, this is equal to $\xi(\psi(p)) \in \mathbb{R}^n$, which is the sequence of components of $X(p)$. The second sequence of $n$ components of $\partial_t(d\phi(t))_p(V)$ is $w_2 = \partial_t(v^i\,\partial_{y^i}(\psi \circ \phi(t) \circ \psi^{-1}(y)))$. The important step here is to swap the differential operators $\partial_t$ and $\partial_{y^i}$. This is okay because $v^i$ is constant and $\psi \circ \phi(t) \circ \psi^{-1}(y)$ is $C^2$ with respect to $t \in I$ and $y \in \mathbb{R}^n$. The result is then $w_2 = v^i\,\partial_{y^i}\partial_t(\psi \circ \phi(t) \circ \psi^{-1}(y))$. But $\partial_t(\psi \circ \phi(t) \circ \psi^{-1}) = \xi$ at $t = 0$. Line (33.4.2) follows from this.

33.4.3 Remark: Although the ‘flow velocity’ $\theta_X V$ of a vector $V$ for a vector field $X$ in Remark 33.4.2 is defined in terms of a family $\phi$ of diffeomorphisms which may or may not exist, the flow $\theta_X V$ is calculated to be the vector on line (33.4.2) which does not require any such diffeomorphisms. Therefore this may be taken as a general definition of the parallel translation velocity of a vector $V$ for the vector field $X$. The terminology “Lie connection” in Definition 33.4.4 is non-standard, but it seems reasonable enough because the vector $(d\phi(t))_p(V)$ in Remark 33.4.2 is called the “Lie transport” of the vector $V$ by the vector field $X$. (See for example Crampin/Pirani [11], pages 64–69, for Lie transport.) The ‘Lie connection’ is the differential of a parallel transport. So it is in fact a connection of a limited kind. However, instead of being a per-path connection, it is a per-vector-field connection. Hence it is labelled with the vector field rather than a curve velocity vector as is the case for an affine connection.
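The calculation in Remark 33.4.2 can be followed step by step when the flow is known explicitly. The following sketch assumes $M = \mathbb{R}^2$ with the identity chart $\psi$ and takes $X$ to be the rotation field, whose flow is rotation by the angle $t$. The field, the point $p = (p^1, p^2)$ and the components $v = (v^1, v^2)$ are illustrative choices. Superscripts are indices, not powers.

```latex
% Hypothetical example: \xi(x, y) = (-y, x), with flow
% \phi(t)(x, y) = (x\cos t - y\sin t,\ x\sin t + y\cos t).
\begin{align*}
(d\phi(t))_p(V) &= t_{\phi(t)(p),\ (v^1\cos t - v^2\sin t,\ v^1\sin t + v^2\cos t),\ \psi}, \\
\partial_t(\psi\circ\phi(t)(p))\big|_{t=0} &= (-p^2,\ p^1) = \xi(\psi(p)), \\
\partial_t\big(v^1\cos t - v^2\sin t,\ v^1\sin t + v^2\cos t\big)\big|_{t=0}
  &= (-v^2,\ v^1) = v^i\,\partial_{y^i}\xi(y)\big|_{y=\psi(p)}, \\
\theta_X V &= \big[\big(V,\ ((-p^2, p^1),\ (-v^2, v^1)),\ \tilde\psi\big)\big].
\end{align*}
```

Differentiating the explicit flow and evaluating the chart formula on line (33.4.2) give the same tuple here. The second block $(-v^2, v^1)$ records that the transported vector rotates at the same rate as the points of the plane.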
33.4.4 Definition: The Lie connection with respect to a $C^1$ vector field $X$ on a $C^2$ manifold $M$ with $n = \dim(M)$ is the map $\theta_X : T(M) \to T(T(M))$ defined by

$$\forall p \in M,\ \forall v \in \mathbb{R}^n,\ \forall \psi \in \mathrm{atlas}_p(M), \quad \theta_X V = \Big[\Big(V,\ \big(\xi(\psi(p)),\ v^i\,\partial_{x^i}(\xi(x))\big|_{x=\psi(p)}\big),\ \tilde\psi\Big)\Big],$$

where $V = t_{p,v,\psi} \in T(M)$, $X(q) = t_{q,\xi(\psi(q)),\psi}$ for all $q \in \mathrm{Dom}(\psi)$, and $\tilde\psi$ is the $T(T(M))$ chart corresponding to $\psi$.

33.4.5 Definition: A Lie transport of a vector $V \in T(M)$ of a $C^2$ manifold $M$ by a vector field $X \in X^1(M)$ is a curve $\gamma : I \to T(M)$ for some interval $I \subseteq \mathbb{R}$ such that $0 \in I$, $\gamma\big|_{\mathrm{Int}(I)}$ is $C^1$, and
(i) $\gamma(0) = V$;
(ii) $\forall t \in \mathrm{Int}(I)$, $\gamma'(t) = \theta_X(\gamma(t))$,
where $\theta_X$ is the Lie connection with respect to $X$.

33.4.6 Remark: The Lie transport curve $\gamma$ in Definition 33.4.5 is a lift of the curve $\pi \circ \gamma : I \to M$ in the same way that curves are lifted by affine connections. (Here $\pi : T(M) \to M$ is the projection map of $T(M)$.) An important difference is that the base point curve $\pi \circ \gamma$ cannot be freely chosen. The base point curve is generally determined by the starting point $p = \pi(V)$ and the field $X$. So parallelism is defined by this “Lie connection” only along integral curves of $X$. Definition 33.4.5 does not require existence or uniqueness of integral curves. But if existence and uniqueness are guaranteed, then one may refer to the Lie transport of a vector by a vector field. The Lie connection in Definition 33.4.4 is well-defined even if integral curves of $X$ are non-existent or non-unique.

33.4.7 Remark: The vector $\theta_X V$ in Definition 33.4.4 looks very similar to $\partial_V X$. In fact,

$$\partial_V X = \Big[\Big(X(p),\ \big(v,\ v^i\,\partial_{y^i}(\xi(y))\big|_{y=\psi(p)}\big),\ \tilde\psi\Big)\Big]. \qquad (33.4.3)$$

The vectors $X(p)$ and $V$ in lines (33.4.2) and (33.4.3) appear in different positions. Thus in line (33.4.3), $\partial_V X \in T_{X(p)}(T(M))$ and $\pi_*(\partial_V X) = V$, where $\pi_*$ is the induced map or push-forth of $\pi$, whereas in line (33.4.2), $\theta_X(V) \in T_V(T(M))$ and $\pi_*(\theta_X(V)) = X(p)$. The vectors $\partial_V X$ and $\theta_X(V)$ have the same vertical component $v^i\,\partial_{y^i}(\xi(y))\big|_{y=\psi(p)}$, which is, of course, chart-dependent. This vertical component may be interpreted as the difference between the parallel motion of $V$ with respect to the vector field and the parallel motion of $V$ with respect to the coordinate system at $p$.

A problem with the parallel translation vector $\theta_X(V)$ is the fact that it is an element of $T_V(T(M))$ rather than the tangent space $T_p(M)$ where $p = \pi(V)$. The set $T_V(T(M)) \subseteq T^{(2)}(M) = T(T(M))$ contains “tangents to tangents” as defined in Section 28.10. These vectors are chart-independent because they are defined in terms of the charts of $T(T(M))$, not $T(M)$. Just like the affine connections in Chapter 37, these vectors should be thought of as “tensorization terms” which are applied to the actual rate of change of vector fields. The difference between them is then a “vertical vector” which can be converted to a vector in $T(M)$ by using a “drop function”.

Figure 33.4.1 shows the flow velocities $\theta_X(V_1)$ and $\theta_X(V_2)$ for vectors $V_1, V_2 \in T_p(M)$ for a vector field $X \in X^1(T(M))$. This emphasizes that the flow velocities are not elements of the tangent space $T(M)$.

33.4.8 Remark: The Lie derivative of a vector field $Y$ with respect to a vector field $X$ on a differentiable manifold $M$ is defined as the difference between the actual rate of change of $Y$ and the rate of change it would have if it were transported in a parallel fashion by the flow of $X$. This difference is $\partial_{X(p)} Y - \theta_X(Y(p))$ for $p \in M$. This is a vertical vector in $T_{Y(p)}(T(M))$, which means that $(d\pi)_{Y(p)}\big(\partial_{X(p)} Y - \theta_X(Y(p))\big) = 0$, where $(d\pi)_{Y(p)} : T_{Y(p)}(T(M)) \to T_p(M)$ is the differential at $Y(p) \in T_p(M)$ of the projection map $\pi : T(M) \to M$. Therefore a drop function $\varpi : T(T(M)) \to T(M)$ may be applied to the difference vector to give $L_X(Y)(p) = \varpi\big(\partial_{X(p)} Y - \theta_X(Y(p))\big) \in T_p(M)$ for all $p \in M$. This is illustrated in Figure 33.4.2.
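In a single chart, Definitions 33.4.4 and 33.4.5 reduce to a pair of ordinary differential equations for the base point and for the components of the transported vector. The following sketch assumes $M = \mathbb{R}$ with the identity chart and the illustrative field $\xi(x) = x^2$, which is not taken from the text. The transport curve is written as $\gamma(t) = t_{x(t),\,v(t),\,\psi}$.

```latex
% Hypothetical example: M = \mathbb{R}, identity chart, \xi(x) = x^2,
% initial data x(0) = x_0, v(0) = v_0, valid while 1 - x_0 t > 0.
\begin{align*}
x'(t) &= \xi(x(t)) = x(t)^2,
  & v'(t) &= v(t)\,\xi'(x(t)) = 2\,x(t)\,v(t), \\
x(t) &= \frac{x_0}{1 - x_0 t},
  & v(t) &= \frac{v_0}{(1 - x_0 t)^2}.
\end{align*}
```

For $x_0 > 0$ the solution exists only for $t < 1/x_0$, which illustrates why Definition 33.4.5 is stated for an interval $I$ and does not presuppose global integral curves. The factor $(1 - x_0 t)^{-2}$ is the derivative of the flow map $x_0 \mapsto x_0/(1 - x_0 t)$, in agreement with the expression for $(d\phi(t))_p(V)$ in Remark 33.4.2.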
Figure 33.4.1 Flow velocities for vectors $V_1$, $V_2$ for a vector field $X$

Figure 33.4.2 Lie derivative of vector field $Y$ with respect to $X$
33.4.9 Definition: The Lie derivative of a $C^1$ vector field $Y$ with respect to a $C^1$ vector field $X$ for a $C^2$ manifold $M$ is the vector field $L_X Y \in X^0(M)$ defined by

$$\forall p \in M,\qquad (L_X Y)(p) = \varpi_{Y(p)}\bigl(\partial_{X(p)} Y - \theta_X(Y(p))\bigr),$$

where $\theta_X$ is the Lie connection on $M$ with respect to $X$ and $\varpi$ is the drop function given by Definition 28.11.6.

33.4.10 Remark: In terms of components, $(L_X Y)(p) = t_{p,\,\xi^i\partial_i\eta - \eta^i\partial_i\xi,\,\psi} \in T_p(M)$ for all $p \in M$ in Definition 33.4.9 if $X(p) = t_{p,\xi,\psi}$ and $Y(p) = t_{p,\eta,\psi}$ for charts $\psi \in \mathrm{atlas}_p(M)$. This follows from the calculation $\varpi\bigl(t^{(2)}_{V,(0,\,\xi^i\partial_i\eta - \eta^i\partial_i\xi),\tilde\psi}\bigr) = t_{p,\,\xi^i\partial_i\eta - \eta^i\partial_i\xi,\,\psi}$.

33.4.11 Remark: The notation $L_X Y$ for Lie derivatives could be confused with the notation $L_g X$ for the left translation of vector fields in Section 34.3. The former has a vector field as a subscript whereas the latter has a group element. This will usually remove any ambiguity.

33.4.12 Remark: The hypothetical family of transformations $\phi$ in Remark 33.4.2 is more than needed. It is sufficient to define a two-parameter curve family whose tangent vectors in one direction equal the vector field $Y$ at $p \in M$ and equal $X$ in the other direction. The differential $\partial_X Y$ for such a curve family is the desired parallel flow $\theta_X Y$ of $Y$ with respect to the field $X$, which must then be subtracted from the actual rate of change $\partial_X Y$ of $Y$. It just happens that the vertical components of $\partial_Y X$ and $\partial_X Y$ are the same for the parallel translated field $Y$. This leads to the temptation to claim that $L_X Y = \partial_X Y - \partial_Y X$. But this is a misleading claim in some ways.

[ In Remark 33.4.12, put $(\partial_s\gamma)(0,0) = Y(p)$ and $\partial_t\gamma = X$. ]

[ Try to interpret the 'equality' of $\theta_X Y$ and $\theta_Y X$ in the case that $X = \gamma_s$ and $Y = \gamma_t$ by using $T^{[2]}(M)$ vectors or some sort of $[(p, v, w_1, w_2, \psi)]$ vectors. ]

33.4.13 Theorem: $L_X Y = [X, Y]$ for any $C^1$ vector fields $X$ and $Y$ on a $C^2$ manifold.
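As a concrete check of the component expression $\xi^i\partial_i\eta - \eta^i\partial_i\xi$ for the Lie derivative of one vector field with respect to another, the following worked example may help. It is an added illustration, not taken from the text: the manifold $M = \mathbb{R}^2$ with the identity chart and the particular fields $X = -y\,\partial_x + x\,\partial_y$ and $Y = \partial_x$ are assumptions chosen only to keep the arithmetic short.

```latex
% Illustrative example only; M = R^2 with the identity chart is an assumption.
% Components: X = \xi^i\partial_i with \xi = (-y, x); Y = \eta^i\partial_i with \eta = (1, 0).
\begin{align*}
\xi^i\partial_i\eta &= -y\,\partial_x(1,0) + x\,\partial_y(1,0) = (0,0), \\
\eta^i\partial_i\xi &= \partial_x(-y,\,x) = (0,1), \\
\xi^i\partial_i\eta - \eta^i\partial_i\xi &= (0,-1),
\qquad\text{so } L_X Y = -\partial_y .
\end{align*}
% Check against the commutator acting on a test function f \in C^2(R^2):
\begin{align*}
X(Yf) - Y(Xf)
 &= (-y\,\partial_x + x\,\partial_y)(\partial_x f) - \partial_x(-y\,\partial_x f + x\,\partial_y f) \\
 &= -y f_{xx} + x f_{xy} - (-y f_{xx} + f_y + x f_{xy}) = -\partial_y f .
\end{align*}
```

Both routes give $-\partial_y$, in agreement with the identity $L_X Y = [X, Y]$ for this pair of fields.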
33.4.14 Remark: The Lie derivative with respect to a field $X$ may be thought of as the covariant derivative for a parallelism which is defined for a curve class $C$ which is the set of integral curves of $X$. This is a very limited kind of parallelism because it only sets up isomorphisms between tangent spaces at points which are connected by these integral curves.

33.4.15 Remark: The identity $L_X Y = [X, Y]$ hides the different ways in which these two expressions are constructed. The expression $L_X Y$ for vector fields $X, Y \in X^1(M)$ for a $C^2$ manifold $M$ is formed by subtracting a flow-parallelism connection term $\theta_X Y$ from the naive derivative $\partial_X Y$ to give a $T(T(M))$-valued field $\partial_X Y - \theta_X Y$, so that $(\partial_X Y - \theta_X Y)(p) \in T(T(M))$ for $p \in M$. It happens that the horizontal component $\pi_*(\partial_X Y - \theta_X Y)(p)$ equals zero, and therefore a drop function $\varpi$ may be applied to this vertical vector to give a vector field $\varpi \circ (\partial_X Y - \theta_X Y) \in X^0(M)$.

On the other hand, the expression $[X, Y]$ is constructed by first subtracting the combined vectors $XY, YX \in X^0(T^{[2]}(M))$ to give the commutator $[X, Y] = XY - YX \in X^0(T^{[2]}(M))$. In this case, it happens that the second-order component of this commutator equals zero. Therefore it is equivalent to a first-order vector field $[X, Y] \in X^0(T(M))$ by "dropping" the second-order vector into $T(M)$. Thus $L_X Y$ and $[X, Y]$ are constructed in different higher spaces but are then dropped into the same space $X^0(M)$. It happens that they have the same value for all fields $X, Y \in X^1(M)$.
33.4.16 Remark: A third expression which evaluates to the same result as $L_X Y$ and $[X, Y]$ in Remark 33.4.15 is "$\partial_X Y - \partial_Y X$". This is not really well defined because $(\partial_X Y)(p) \in T_{Y(p)}(T(M))$ and $(\partial_Y X)(p) \in T_{X(p)}(T(M))$. But if the chart-dependent vertical components of these two expressions are subtracted, the chart-independent result is the same as for $L_X Y$ and $[X, Y]$. More precisely, $\varpi(\partial_X Y) - \varpi(\partial_Y X) = XY - YX$ is well-defined, but $\varpi(\partial_X Y - \partial_Y X)$ is meaningless in general because $\partial_X Y - \partial_Y X$ is meaningless in general.

To see why $\partial_X Y - \partial_Y X$ and $XY - YX$ are similar but different, note that for $f \in C^2(M)$ and component functions $\xi$ and $\eta$ for $X$ and $Y$ respectively, $\partial_{XY} f = \partial_X\partial_Y f = (\xi^i\partial_i)(\eta^j\partial_j)f = \xi^i(\partial_i\eta^j)\partial_j f + \xi^i\eta^j\partial_i\partial_j f$. By contrast, $\partial_X Y = \xi^i(\partial_i\eta^j)$ looks like the first term in the component expression $\partial_X\partial_Y = \xi^i(\partial_i\eta^j)\partial_j + \xi^i\eta^j\partial_i\partial_j$. Therefore $\partial_{\partial_X Y} \not\equiv \partial_X\partial_Y$.

This implies that $\partial_X Y$ must not be confused with $XY$ or $\partial_{XY}$. The vector field $XY$ and the operator field $\partial_{XY}$ may be regarded as interchangeable, but $\partial_X Y$ is not equivalent to the other two. In fact, whereas $XY \in X^0(T^{[2]}(M))$ and $\partial_{XY} \in X^0(\mathring{T}^{[2]}(M))$, the field $\partial_X Y$ is in the second tangent vector field space $X^0(T(T(M)), \pi\circ\pi_*, M)$. In other words, $XY$ is valued in $T^{[2]}(M)$ whereas $\partial_X Y$ is valued in $T^{(2)}(M)$.
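The component expression for $\partial_X\partial_Y f$ can be used to write out the commutator explicitly. The following short derivation is an added sketch using only the component functions $\xi$ and $\eta$ of $X$ and $Y$ and the symmetry of second partial derivatives of a $C^2$ function $f$; it is not part of the original text.

```latex
\begin{align*}
\partial_X\partial_Y f &= \xi^i(\partial_i\eta^j)\,\partial_j f + \xi^i\eta^j\,\partial_i\partial_j f, \\
\partial_Y\partial_X f &= \eta^i(\partial_i\xi^j)\,\partial_j f + \eta^i\xi^j\,\partial_i\partial_j f, \\
(\partial_X\partial_Y - \partial_Y\partial_X) f
 &= \bigl(\xi^i\partial_i\eta^j - \eta^i\partial_i\xi^j\bigr)\partial_j f
  + \xi^i\eta^j\bigl(\partial_i\partial_j f - \partial_j\partial_i f\bigr) \\
 &= \bigl(\xi^i\partial_i\eta^j - \eta^i\partial_i\xi^j\bigr)\partial_j f .
\end{align*}
```

The second-order terms cancel only after the subtraction, which is why the commutator is a first-order operator even though neither $\partial_X\partial_Y$ nor $\partial_Y\partial_X$ is.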
Despite the clear difference between $\partial_{XY}$ and $\partial_X Y$, it happens that $\partial_{XY} - \partial_{YX} = \partial_{[X,Y]}$ and $\partial_X Y - \partial_Y X$ both give the same result when applied to a $C^2$ real-valued function. The reason for this is that the second term $\xi^i\eta^j\partial_i\partial_j$ in the expression for $\partial_{XY}$ is cancelled in the commutator $\partial_X\partial_Y - \partial_Y\partial_X$ when it is applied to a real-valued function. The author would like to take this opportunity to apologize for the banality of Remark 33.4.16.

33.4.17 Theorem: $L_X Y = \varpi(\partial_X Y) - \varpi(\partial_Y X) = XY - YX$ for all vector fields $X, Y \in X^1(M)$ for a $C^2$ manifold $M$.

33.4.18 Remark: Continuing from Section 33.3, the Lie derivative $L_X Y$ may be calculated for vector fields $X$ and $Y$ as in Theorem 33.3.2. Within the domain $\mathrm{Dom}(\psi)$ of a $C^2$ chart $\psi$, $L_X Y = \varpi(\partial_X Y - \theta_X Y)$ with $(\partial_X Y)(p) = t^{[2]}_{Y(p),((1,0,\ldots 0),(0,\ldots 0)),\tilde\psi} \in T^{[2]}(M)$ from Theorem 33.3.2, and $\theta_X(Y(p)) = t^{[2]}_{Y(p),((1,0,\ldots 0),(0,\ldots 0)),\tilde\psi}$ from line (33.4.2), for all $p \in \mathrm{Dom}(\psi)$. Hence $L_X Y = 0$. This is as expected from the identity $L_X Y = [X, Y]$.

[ See Crampin/Pirani [11], page 77, for comparison of Lie derivative with covariant derivative. ]

[ Must do all of the Lie derivatives in this section also in the flat-space tensor chapter. ]
The fact that two calculations return the same answer does not mean that one is necessarily calculating the same thing. The fact that LX Y = [X, Y ] does not automatically imply that a Lie derivative and a Poisson bracket have the same meaning. The Lie derivative expression LX Y is just one of a family of expressions LX K for tensor fields K ∈ X 1 (T r,s (M )) of general type (r, s). The Poisson bracket expression [X, Y ] does not seem to generalize to tensors of general type. The Poisson bracket is related to curvature concepts since it typically measures the strength of interaction between families of transformations, whereas the Lie derivative LX K typically measures the variation of any tensor field K with respect to a given flow field X.
33.5. Lie derivatives of tensor fields

Lie derivatives are an extension of the Poisson bracket from vector fields to general tensor fields. Or perhaps the Poisson bracket of vector fields is a special case of the Lie derivative. More precisely, the Poisson bracket is a particular construction from two vector fields, whereas the Lie derivative is a family of operations on fibre bundles which are associated with the tangent bundle of a manifold, and in the special case of the Lie derivative of a vector field (a cross-section of the tangent bundle), the value of the Lie derivative just happens to be the same as the Poisson bracket.

In most texts, the Lie derivative is expressed in a rather abstract form in terms of a family of local diffeomorphisms. In practice, it is essential to have a concrete expression for Lie derivatives acting on any given kind of vector, covector or tensor field. To obtain an explicit expression for the Lie derivative for general tensors and other fibre bundles on a given base manifold, it is necessary to use the concept of "associated parallelism", which is defined in Section 24.3.

The Lie derivative is straightforward to define for the tangent bundle of a differentiable manifold. The Lie derivatives for cross-sections of all other fibre bundles on the same manifold are defined in terms of fibre bundle associations, which are defined in Section 23.10.3.

Just as covariant derivatives may be extended from cross-sections of tangent bundles (i.e. vector fields) to cross-sections of arbitrary associated tangent fibre bundles (such as tensor fields and differential forms) by using the concept of associated parallelism (and associated connections), so also Lie derivatives may be extended from vector fields to general associated fibre bundles. Textbooks often perform this extension by differentiating various contractions of vector and tensor fields, but this is just one way of defining the association between different types of tensor fields.

[ Gallot/Hulin/Lafontaine [19], defn 1.109 gives a definition of Lie derivative for general types of tensor fields. Must give an explicit formula for Lie derivatives of general tensor field types in terms of derivatives of K and X. Will use associated fibre bundles for this. Will this require differentiable fibre bundles? ]

[ Maybe use a notation such as $L^{r,s}_X : X^1_{r,s}(M) \to X^0_{r,s}(M)$ for Lie derivatives of tensors of general type. ]
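For orientation, the component formula which is commonly quoted in the literature for the Lie derivative of a tensor field of type $(r, s)$ is sketched below. It is not derived here from associated parallelism, and it is not the author's construction. The index placement and the symbols $K^{i_1\ldots i_r}{}_{j_1\ldots j_s}$ and $\xi^k$ (components of $K$ and $X$ in a chart) are notational assumptions made for this sketch.

```latex
% Standard chart-component expression (textbook convention, stated as an assumption here).
\begin{align*}
(L_X K)^{i_1\ldots i_r}{}_{j_1\ldots j_s}
 &= \xi^k\,\partial_k K^{i_1\ldots i_r}{}_{j_1\ldots j_s} \\
 &\quad - \sum_{a=1}^{r} K^{i_1\ldots k\ldots i_r}{}_{j_1\ldots j_s}\,\partial_k\xi^{i_a}
        \qquad (k \text{ in the } a\text{th upper slot}) \\
 &\quad + \sum_{b=1}^{s} K^{i_1\ldots i_r}{}_{j_1\ldots k\ldots j_s}\,\partial_{j_b}\xi^{k}
        \qquad (k \text{ in the } b\text{th lower slot}).
\end{align*}
% Special cases:
%   vector field Y = \eta^i\partial_i:   (L_X Y)^i = \xi^k\partial_k\eta^i - \eta^k\partial_k\xi^i
%   covector field \omega = \omega_j dx^j: (L_X\omega)_j = \xi^k\partial_k\omega_j + \omega_k\partial_j\xi^k
```

The vector-field special case reproduces the component expression $\xi^i\partial_i\eta - \eta^i\partial_i\xi$ for $L_X Y$, so the sign conventions are at least consistent with Section 33.4.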
33.6. The exterior derivative
33.6.1 Definition: The exterior derivative on the space (algebra?) Λ∗ (N, W ) (?) of C r differential forms with coefficients in W on a subset N of a C ∞ manifold M is the map d : Λ∗ (N, W ) → Λ∗ (N, W ) (effectively) defined by ∀ω ∈ Λm (N, W ),
dω = . . .
[ See Malliavin, defn. I.4.4, p.117, Gallot et alia, and EDM 108.Q(2) for definitions of the exterior derivative. ] [ Give the definition of Lie derivative of a differential form. ] [ Here must present measure and integration theory for differential forms on differentiable manifolds. Present the Gauß-Green theorem, the Stokes theorem, etc. etc. ]
[ Try here to motivate or justify the definition of exterior derivative in terms of its superior properties. ]
Chapter 34 Differentiable groups
34.1 Lie groups
34.2 Hilbert's fifth problem
34.3 Left invariant vector fields on Lie groups
34.4 Right invariant vector fields on Lie groups
34.5 The Lie algebra of a Lie group
34.6 Diffeomorphism groups
34.7 Lie transformation groups
34.8 Infinitesimal transformations
34.0.1 Remark: Three kinds of group are presented in this chapter: (1) Lie group: a group which is a differentiable manifold; (2) Lie transformation group: a Lie group which is a group of diffeomorphisms of a differentiable manifold; (3) diffeomorphism group: a topological group of diffeomorphisms of a differentiable manifold.
34.0.2 Remark: Lie groups may be viewed as groups which happen to be differentiable or as differentiable manifolds which happen to have a group structure. In the latter perspective, the group structure puts a very strong constraint on the manifold. In fact, the manifold is in some senses almost completely determined by a finite set of parameters which determine the Lie algebra at the origin. From this point of view, the relation of Lie groups to general differentiable manifolds may be regarded as analogous to the relation of polynomials to general differentiable functions of several real variables. In other words, the Lie groups are a (relatively) tiny subclass of general differentiable manifolds. Nevertheless, just like polynomials, Lie groups have an enormous range of applications and have therefore been very intensively studied.

Lie groups enter into differential geometry in two distinct ways: as structure groups for fibre bundles and as examples of differentiable manifolds. In this chapter, Lie groups are considered only in their role as structure groups. The reason for this is the cyclical nature of the definitions for Lie groups. Differentiable manifolds provide a substrate for differentiable fibre bundles, which require Lie groups for their full definition. But Lie groups are themselves differentiable manifolds which have differentiable fibre bundles defined on them. So there is an infinite tree of manifolds supporting fibre bundles, which include Lie groups which are manifolds with fibre bundles defined on them, and so forth. To break this infinite cycle, tangent bundles were defined on differentiable manifolds in Chapter 28 so that they are not a sub-class of differentiable fibre bundles. The resulting tree of definition dependencies is illustrated in Figure 34.0.1.
34.0.3 Remark: The definitions for differentiable groups require the differentiable manifold definitions of Chapter 27, and are in turn required by the definitions of differentiable fibre bundles in Chapter 35 and connections in Chapters 36 and 37.
Diffeomorphism groups which are not necessarily differentiable manifolds (case (3)) are perhaps not, strictly speaking, “differentiable groups”, but it is convenient to present them together with Lie groups and Lie transformation groups. Since a group may always be regarded as a transformation group of itself, (1) may be considered as a special case of (2) while (2) is a special case of (3).
[Figure 34.0.1: Family tree of Lie groups and differentiable fibre bundles. Nodes: tangent bundle; Lie group; differentiable fibre bundle; Lie group with differentiable fibre bundle; connection; Lie group with connection.]
34.0.4 Remark: The core concept of differential geometry is curvature. The most general concept of curvature on a differentiable fibre bundle is defined in terms of the exterior derivative of vector fields (infinitesimal left actions) on the fibre space which are induced by motion along curves in the base space. Such an exterior derivative is equal to a Poisson bracket of vector fields. Therefore curvature and Lie algebras of Lie groups are closely related. Since curvature in physical models is very often identified with the force acting on a system, equations of motion and Lie algebras of Lie groups are also closely related. Similarly, parallelism is closely related to least action principles. It is not surprising, therefore, that Lie groups and Lie algebras are of such importance in physics.
34.0.5 Remark: It seems that the development of Lie groups (1869–99) arose out of a desire to apply the transformation group approach of Galois theory (1832–46) to differential equations. Lie groups are also used as structure groups for principal fibre bundles, associated with an attempt to make differential geometry correspond in some way to the Erlanger Programm (1872) where geometries are thought of as the study of invariants of transformation groups. David Hilbert's fifth problem (1900) asked whether a topological group is always a Lie group. This was followed by over 50 years of effort to show finally (1952–53) that the answer is yes, more or less.

Marius Sophus Lie (1842–1899) defined Lie groups to require first and second order differentiability. Hilbert asked whether this requirement could be weakened. By 1952–53, it was shown that continuity and some weak conditions would guarantee analyticity. It is therefore customary to define Lie groups as being analytic, and the analytic condition has been adopted in Section 34.1.

In the case of Lie transformation groups, if the space acted upon by the group is $C^k$ for some $k \in \overline{\mathbb{Z}}{}^+_0$ and the group action is likewise $C^k$ for a fixed group element, then the group action can be proved to be $C^k$ with respect to the group elements also. It does not seem that analyticity follows in this case. Therefore the group actions of Lie transformation groups (Section 34.7) are defined here as having only $C^k$ regularity for some $k \in \overline{\mathbb{Z}}{}^+_0$, although the groups themselves are defined to be analytic.

34.1. Lie groups

Malliavin [35], page 156, defines Lie groups to have a manifold structure of class $C^3$ and group action and inverse map of class $C^3$. Gallot et alia [19], page 27, require class $C^\infty$. EDM2 [34], section 249.A, requires an analytic manifold and analytic group operations. Kobayashi and Nomizu [26], page 38, require $C^\infty$ regularity, but they say on page 43 that the $C^\infty$ condition may be replaced with analyticity.

This very confusing range of definitions for Lie groups seems to arise from the fact that they are all really the same, if some fairly weak conditions are assumed. The simplest way to deal with this confusion is to define Lie groups with the strongest reasonable assumptions, and then provide the theorems which guarantee such strong assumptions in terms of weaker assumptions. Therefore Definition 34.1.1 is phrased in terms of analyticity.

[ Define here a differentiable group of class $C^k$. Then show that for such a group, there is an analytic atlas for which the group is a Lie group. ]

34.1.1 Definition: A Lie group is a triple $(G, A_G, \sigma)$ such that
(i) $(G, \sigma)$ is a group;
(ii) $(G, A_G)$ is an analytic manifold;
(iii) the operation $\sigma : G \times G \to G$ is analytic with respect to $A_G$;
(iv) the inversion map $f : G \to G$ with $f : g \mapsto g^{-1}$ is analytic with respect to $A_G$.

34.1.2 Remark: If $T_G$ denotes the topology induced by the atlas $A_G$ in Definition 34.1.1, then $(G, T_G, \sigma)$ is a locally Euclidean topological group. See Definition 16.7.1 for topological groups and Definition 26.2.3 for locally Euclidean topology.

EDM2 [34], section 249.A, requires paracompactness of the manifold $(G, A_G)$ in Definition 34.1.1. This seems to be superfluous. According to Warner [49], lemma 1.9, page 9, any locally compact, Hausdorff, second countable topological space is paracompact. He says that all manifolds are second countable. So it seems that there are adequate conditions to guarantee paracompactness without it being explicitly stated. (See Definition 15.7.14 for paracompactness.)

34.1.3 Remark: Many texts require the group operation $(x, y) \mapsto \sigma(x, y^{-1}) = xy^{-1}$ from $G \times G \to G$ to be analytic for a Lie group. The purpose of this is to imply that the map $y \mapsto y^{-1}$ is analytic. However, this is superfluous since, as remarked in Montgomery/Zippin [76], page 49, the analyticity of the map $g \mapsto g^{-1}$ follows from the analyticity of $\sigma$ by the implicit function theorem. So condition (iv) of Definition 34.1.1 is superfluous.

[ Prove that $g \mapsto g^{-1}$ is analytic. Then remove this from Definition 34.1.1. ]

34.1.4 Remark: Lie groups, regarded as transformation groups of themselves, are particularly useful for defining differentiable principal fibre bundles.

[ Near here, present Lie group homomorphisms, isomorphisms and automorphisms. Also define "inner automorphisms" $g' \mapsto gg'g^{-1}$ or $g' \mapsto g^{-1}g'g$. These are also called conjugation maps. See Definition 9.3.22. ]
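The implicit function argument mentioned in Remark 34.1.3 can be sketched as follows. This is an outline under stated assumptions, not a proof taken from Montgomery/Zippin: the chart $\psi$ at the identity with $\psi(e) = 0$, the local map $F$, the local solution $\iota$, and the right translation $R_h : x \mapsto xh$ are names introduced only for this sketch, and the analytic version of the implicit function theorem is assumed.

```latex
% Sketch: analyticity of g -> g^{-1} from analyticity of \sigma (assumptions as in the lead-in).
% Step 1: local expression of the group operation near (e, e).
\[
F(x, y) = \psi\bigl(\sigma(\psi^{-1}(x), \psi^{-1}(y))\bigr), \qquad F(0, y) = y, \qquad F(x, 0) = x .
\]
% Step 2: since F(0, y) = y, the partial derivative in y at (0, 0) is the identity matrix.
\[
\partial_y F(0, 0) = I \quad\Longrightarrow\quad
\exists\ \text{analytic } \iota \text{ near } 0 \text{ with } F(x, \iota(x)) = 0 = \psi(e).
\]
% Step 3: h = \psi^{-1}(\iota(\psi(g))) satisfies gh = e, so h = g^{-1} by uniqueness of
% inverses in a group. Hence inversion is analytic on a neighbourhood of e.
% Step 4: for g near an arbitrary g_0, translate back to a neighbourhood of e:
\[
g^{-1} = (g_0^{-1} g)^{-1}\, g_0^{-1}, \qquad\text{i.e.}\qquad
\mathrm{inv} = R_{g_0^{-1}} \circ \mathrm{inv} \circ L_{g_0^{-1}} \ \text{near } g_0 ,
\]
% where L_{g_0^{-1}} and R_{g_0^{-1}} are analytic because they are restrictions of \sigma.
```

If this outline is accepted, condition (iv) of Definition 34.1.1 follows from conditions (i) to (iii).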
34.2. Hilbert's fifth problem

34.2.1 Remark: Apparently continuity can be substituted for analyticity in Definition 34.1.1, and the analyticity then follows. (See Sulanke and Wintgen [85].) Kobayashi and Nomizu [26], page 38, require a Lie group to be a $C^\infty$ manifold and state that it follows that the manifold is real analytic (page 43). EDM2 [34], section 423.N, says that Hilbert's fifth problem (in 1900) posed precisely this question, and it was resolved in the positive in 1952. "It was proved [in 1952] that any locally connected finite-dimensional locally compact group is a Lie group."

Of course, since a topological group does not have a differentiable structure, it must be provided. So what this means is that there exists a differentiable atlas $A_G$, compatible with the topology $T_G$ on the topological group, for which the group $(G, A_G, \sigma)$ is a Lie group.

The real reference for this problem, known as Hilbert's 5th problem, is the book by Montgomery/Zippin [76]. In section 4.10, they give some theorems on this subject, which unfortunately are a little perplexing. Two of their theorems are quoted here as Theorems 34.2.2 and 34.2.4.

34.2.2 Theorem: A locally euclidean group has no small subgroups and is isomorphic to a Lie group.

34.2.3 Remark: Theorem 34.2.2 is stated in Montgomery/Zippin [76], pages 70 and 184. They attribute this result to Gleason [66] and Montgomery/Zippin [77].

34.2.4 Theorem: A locally compact group which is finite-dimensional and locally-connected is a Lie group.

34.2.5 Remark: Theorem 34.2.4 is stated in Montgomery/Zippin [76], page 185. A difficulty with Theorem 34.2.4 is the apparent fact that any finite-dimensional topological group must also be locally compact and locally connected. This raises the question of why such superfluous conditions are included. One possibility is that the authors did not know that they were superfluous. Another possibility is that they required the locally compact and locally connected conditions principally, and added the finite-dimensional condition as an afterthought without thinking about the dependencies among the conditions. It is also possible that their conditions are non-standard and therefore are not interdependent somehow.
The definition of finite-dimensional used by Montgomery/Zippin [76] is in terms of homeomorphisms to Euclidean spaces, not the more abstract topological definition of Lebesgue dimension. (See Section 15.9 for topological dimension.) [ See EDM2 [34], section 117.B. ] They go on to say that for locally compact metric groups, they would use the more general topological dimension, except that this would be equivalent to using "cells" to define dimension. Since they do not define "cells", it is difficult to know exactly what they mean. Perhaps their page 184 clarifies this a little. This seems to imply that their n-cell is an n-cube or similar structure, or possibly a higher-dimensional tetrahedron. They also seem to be indicating that by "dimension" they do mean some sort of generalized topological dimension from dimension theory, rather than euclidean dimension.

Another difficulty with Theorems 34.2.2 and 34.2.4 is the fact that one of them talks about being isomorphic to a Lie group whereas the other says that the group is a Lie group. The latter is probably just an informal way of saying that the group is isomorphic under some change of coordinates.

All in all, it seems that Theorems 34.2.2 and 34.2.4 may be equivalent. If the generalized topological dimension is intended in Theorem 34.2.4, then certainly Theorem 34.2.2 would follow from it, since a locally Euclidean space is locally compact, locally connected and finite dimensional according to any reasonable definition of topological dimension.

Figure 34.2.1 shows some family relations between topological groups and Lie groups.

[Figure 34.2.1: Family tree of topological and Lie groups. Nodes: topological group $(G, T_G, \sigma_G)$; locally compact top. group $(G, T_G, \sigma_G)$; locally connected top. group $(G, T_G, \sigma_G)$; locally Euclidean group $(G, T_G, \sigma_G)$ (EDM2, 423.N); Lie group $(G, A_G, \sigma_G)$.]
34.3. Left invariant vector fields on Lie groups

General invariants of transformation groups are discussed in Section 9.7.

34.3.1 Remark: The left translation operators in Definition 34.3.2 are required for the definition of the Lie algebra of a Lie group. The superscripts on the symbols for these operators distinguish the kinds of spaces which they operate on: G for group elements, C for continuous real-valued functions, T for tangent vectors, and F for fields. The superscript is omitted when the application space is obvious. As one would expect, all of the left translation operators in Definition 34.3.2 become identity maps when $g$ equals the identity $e$ of $G$.

34.3.2 Definition: Let $G$ be a Lie group. The left translation operator (for group elements) $L^G_g : G \to G$ is defined for $g \in G$ by

$$\forall x \in G,\qquad L^G_g(x) = gx.$$

The left translation operator (for real-valued functions) $L^C_g : C^0(G) \to C^0(G)$ is defined for $g \in G$ by

$$\forall \phi \in C^0(G),\qquad L^C_g(\phi) = \phi \circ L^G_{g^{-1}}.$$

That is,

$$\forall \phi \in C^0(G),\ \forall x \in G,\qquad L^C_g(\phi)(x) = \phi(g^{-1}x).$$

The left translation operator (for tangent operators) $\mathring{L}^T_g : \mathring{T}(G) \to \mathring{T}(G)$ is defined for $g \in G$ by

$$\forall V \in T(G),\qquad \mathring{L}^T_g(D_V) = D_V \circ L^C_{g^{-1}}.$$

That is,

$$\forall V \in T(G),\ \forall \phi \in C^1(G),\qquad \mathring{L}^T_g(D_V)(\phi) = D_V(L^C_{g^{-1}}\phi) = D_V(\phi \circ L^G_g).$$

The left translation operator (for tangent operator fields) $\mathring{L}^F_g : \mathring{X}^0(G) \to \mathring{X}^0(G)$ is defined for $g \in G$ by

$$\forall X \in X^0(G),\qquad \mathring{L}^F_g(D_X) = \mathring{L}^T_g \circ (D_X \circ L^G_{g^{-1}}).$$

That is,

$$\forall X \in X^0(G),\ \forall x \in G,\qquad \mathring{L}^F_g(D_X)(x) = \mathring{L}^T_g(D_X(g^{-1}x)).$$

That is,

$$\forall X \in X^0(G),\ \forall x \in G,\ \forall \phi \in C^1(G),\qquad \mathring{L}^F_g(D_X)(x)(\phi) = D_X(L^G_{g^{-1}}x)(L^C_{g^{-1}}\phi) = D_X(g^{-1}x)(\phi \circ L^G_g).$$
[ Should extend Definition 34.3.2 to all tensor spaces $T^{r,s}(G)$ and some other spaces. ]

34.3.3 Remark: The interpretation of Definitions 34.3.2 and 34.3.7 may be assisted by Figure 34.3.1.

[Figure 34.3.1: Left translation operators for Lie groups. The panels illustrate $L^G_g$, $L^C_g$, $L^T_g$ and $L^F_g$ acting on points $x$, $gx$, $g^{-1}x$, test functions $\phi$, tangent vectors $V$ and vector fields $X$.]
34.3.4 Remark: Definition 34.3.2 shows the convenience of using tangent operators $D_V \in \mathring{T}(M)$ instead of tangent vectors $V \in T(M)$. Just as in distribution theory, the translation of differential operators may be easily expressed in terms of the reverse translation of test functions. For differentiable manifolds, it is necessary to convert the tangent operator back into a tangent vector. This conversion is discussed in Remark 28.6.6 and elsewhere.

In the case of tangent vector fields $X \in X^0(G)$, the notation $D_X$ means the pointwise assignment of an operator to a vector. In other words, $D_X \in \mathring{X}^0(G)$ is defined by $D_X(x) = D_{X(x)}$ for all $x \in G$ and $X \in X^0(G)$.

The left translation operators $\mathring{L}^T_g$ and $\mathring{L}^F_g$ apply to tangent operators $D_V$ and tangent operator fields $D_X$ respectively, whereas $L^T_g$ and $L^F_g$ apply to non-operator tangent vectors $V$ and vector fields $X$ respectively.
34.3.5 Remark: In Definition 34.3.2, $\mathring{L}^T_g(D_V) \in \mathring{T}_{gx}(G)$ for all $V \in T_x(G)$ (i.e. for all $D_V \in \mathring{T}_x(G)$). So $\mathring{L}^T_g$ does not define an automorphism of $\mathring{T}_x(G)$, but it is an automorphism of the total tangent space $\mathring{T}(G)$. The left translation operator for vector fields $\mathring{L}^F_g$ is defined by taking the value of the field $X$ at $g^{-1}x$, but the test function $\phi$ must then also be moved back to $g^{-1}x$ in order to apply the vector $X(g^{-1}x)$ to it.

34.3.6 Remark: Definition 34.3.7 gives the non-operator tangent vector version of Definition 34.3.2 for the left translation functions $L^T_g$ and $L^F_g$. Whereas the tangent operators in Definition 34.3.2 act on test functions, the tangent vectors in Definition 34.3.7 use the induced map $(L^G_g)_*$.

The induced map $(L^G_g)_*$ in Definition 34.3.7 may be written in terms of the pointwise differential. For all $V \in T(G)$, $(L^G_g)_*(V) = (dL^G_g)_{\pi(V)}(V)$ and $(L^G_g)_*(X(g^{-1}x)) = (dL^G_g)_{g^{-1}x}(X(g^{-1}x))$.

34.3.7 Definition: Let $G$ be a Lie group. The left translation operator (for tangent vectors) $L^T_g : T(G) \to T(G)$ is defined for $g \in G$ by $L^T_g = (L^G_g)_*$:

$$\forall V \in T(G),\qquad L^T_g(V) = (L^G_g)_*(V).$$

The left translation operator (for vector fields) $L^F_g : X^0(G) \to X^0(G)$ is defined for $g \in G$ by

$$\forall X \in X^0(G),\qquad L^F_g(X) = L^T_g \circ X \circ L^G_{g^{-1}} = (L^G_g)_* \circ X \circ L^G_{g^{-1}}.$$

That is,

$$\forall X \in X^0(G),\ \forall x \in G,\qquad L^F_g(X)(x) = L^T_g(X(g^{-1}x)) = (L^G_g)_*(X(g^{-1}x)).$$
34.3.8 Theorem: Definition 34.3.7 is consistent with Definition 34.3.2. That is, $D_{L^T_g(V)} = \mathring{L}^T_g(D_V)$ for all $g \in G$ and $V \in T(G)$; and $D_{L^F_g(X)} = \mathring{L}^F_g(D_X)$ for all $g \in G$ and $X \in X^0(G)$.

Proof: For $g \in G$, $V \in T(G)$ and $\phi \in C^1(G)$, it follows from Definition 34.3.2 that $\mathring{L}^T_g(D_V)(\phi) = D_V(\phi \circ L^G_g)$. By Definition 31.3.23, this equals $(L^G_g)_*(D_V)(\phi)$. So $\mathring{L}^T_g(D_V) = (L^G_g)_*(D_V)$. By Theorem 31.3.20, this equals $D_{(L^G_g)_*(V)}$. By Definition 34.3.7, $(L^G_g)_*(V) = L^T_g(V)$. Therefore $\mathring{L}^T_g(D_V) = D_{L^T_g(V)}$ as claimed.

The equivalence $D_{L^F_g(X)} = \mathring{L}^F_g(D_X)$ follows similarly from the calculation $\mathring{L}^F_g(D_X)(x)(\phi) = D_X(g^{-1}x)(\phi \circ L^G_g) = (L^G_g)_*(D_X(g^{-1}x))(\phi)$, so that $\mathring{L}^F_g(D_X)(x) = (L^G_g)_*(D_X(g^{-1}x)) = D_{(L^G_g)_*(X(g^{-1}x))}$ for $x, g \in G$ and $X \in X^0(G)$. But by Definition 34.3.7, $(L^G_g)_*(X(g^{-1}x)) = L^F_g(X)(x)$ for all $x \in G$, and the result follows from this.
34.3.9 Remark: In Definition 34.3.7, $L^T_g(V) \in T_{gx}(G)$ for all $V \in T_x(G)$. So $L^T_g$ does not define an automorphism of $T_x(G)$, but it is an automorphism of the total tangent space $T(G)$ since $L^T_g = (L^G_g)_*$.

34.3.10 Theorem: For all elements $g_1, g_2 \in G$ of a Lie group $G$, $L^G_{g_1} \circ L^G_{g_2} = L^G_{g_1 g_2}$, $L^C_{g_1} \circ L^C_{g_2} = L^C_{g_1 g_2}$, $\mathring{L}^T_{g_1} \circ \mathring{L}^T_{g_2} = \mathring{L}^T_{g_1 g_2}$, $\mathring{L}^F_{g_1} \circ \mathring{L}^F_{g_2} = \mathring{L}^F_{g_1 g_2}$, $L^T_{g_1} \circ L^T_{g_2} = L^T_{g_1 g_2}$ and $L^F_{g_1} \circ L^F_{g_2} = L^F_{g_1 g_2}$.
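The composition rules for left translation operators are stated without proof. The following lines sketch how four of them can be checked directly from the formulas $L^G_g(x) = gx$, $L^C_g(\phi) = \phi \circ L^G_{g^{-1}}$, $\mathring{L}^T_g(D_V) = D_V \circ L^C_{g^{-1}}$ and $L^T_g = (L^G_g)_*$. The last line assumes the chain rule $(f \circ h)_* = f_* \circ h_*$ for induced maps, which is not quoted from the text.

```latex
\begin{align*}
(L^G_{g_1}\circ L^G_{g_2})(x) &= g_1(g_2 x) = (g_1 g_2)x = L^G_{g_1 g_2}(x), \\
L^C_{g_1}(L^C_{g_2}\phi) &= (\phi\circ L^G_{g_2^{-1}})\circ L^G_{g_1^{-1}}
 = \phi\circ L^G_{g_2^{-1}g_1^{-1}} = \phi\circ L^G_{(g_1 g_2)^{-1}} = L^C_{g_1 g_2}\phi, \\
\mathring{L}^T_{g_1}\bigl(\mathring{L}^T_{g_2}(D_V)\bigr)
 &= (D_V\circ L^C_{g_2^{-1}})\circ L^C_{g_1^{-1}}
 = D_V\circ L^C_{g_2^{-1}g_1^{-1}} = D_V\circ L^C_{(g_1 g_2)^{-1}}
 = \mathring{L}^T_{g_1 g_2}(D_V), \\
L^T_{g_1}\circ L^T_{g_2} &= (L^G_{g_1})_*\circ (L^G_{g_2})_*
 = (L^G_{g_1}\circ L^G_{g_2})_* = (L^G_{g_1 g_2})_* = L^T_{g_1 g_2}.
\end{align*}
```

Note the order reversal $(g_1 g_2)^{-1} = g_2^{-1}g_1^{-1}$ in the second and third lines: the inverse in the definition of $L^C_g$ is what makes $g \mapsto L^C_g$ a left action rather than a right action. The two field versions follow in the same way from the pointwise rules.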
34.3.11 Definition: A left invariant vector field on a Lie group $G$ is a vector field $X \in X^0(G)$ such that

$$\forall g \in G,\qquad L^F_g(X) = X.$$

34.3.12 Theorem: Let $G$ be a Lie group. Then $X(g) = (L^G_g)_*(X(e)) = (dL^G_g)_e(X(e))$ for any left invariant vector field $X$ on $G$ and any $g \in G$, where $e$ is the identity of $G$.

Proof: Let $X$ be a left invariant vector field on $G$. Then by Definitions 34.3.11 and 34.3.7, $X(g) = (L^G_h)_*(X(h^{-1}g))$ for any $g, h \in G$. Put $h = g$. Then $X(g) = (L^G_g)_*(X(e))$ for all $g \in G$.

34.3.13 Remark: As mentioned in Remark 34.8.8, left invariant vector fields are infinitesimal right actions.
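A small worked example may make left invariance more concrete. The group chosen here, the positive reals $G = (0,\infty)$ under multiplication with the single identity chart, is an assumption made for illustration and does not appear in the text. Tangent vectors are written as pairs (base point, component) for brevity.

```latex
% Illustrative example: G = (0, \infty) with \sigma(x, y) = xy, identity e = 1, chart \psi = id.
% Left translation and its differential:
\[
L^G_g(x) = gx, \qquad (dL^G_g)_x : (x, v) \mapsto (gx,\ gv).
\]
% The field generated by V = (1, v_0) \in T_e(G):
\[
X_V(g) = (dL^G_g)_e(V) = (g,\ g v_0), \qquad\text{i.e.}\qquad X_V = v_0\, x\,\partial_x .
\]
% Left invariance check, using L^F_g(X)(x) = (L^G_g)_*(X(g^{-1}x)):
\[
L^F_g(X_V)(x) = (dL^G_g)_{g^{-1}x}\bigl(g^{-1}x,\ g^{-1}x\,v_0\bigr) = (x,\ x v_0) = X_V(x).
\]
```

In this example the left invariant fields are exactly the constant multiples of $x\,\partial_x$, so they are determined by the single number $v_0$, the component of the field at the identity.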
34.3.14 Theorem: Let $G$ be a Lie group with identity $e$. Then for all $V \in T_e(G)$ there is a unique left invariant vector field $X_V \in X^0(G)$ such that $X_V(e) = V$. Moreover, $X_V \in X^\infty(G)$ and $X_V(g) = L^T_g V$ for all $g \in G$. Conversely, any left invariant vector field on $G$ is of the form $X_V$ for some $V \in T_e(G)$.

Proof: Define $X_V : G \to T(G)$ for $V \in T_e(G)$ by $X_V(g) = L^T_g V$ for all $g \in G$. Then $X_V(g) \in T_g(G)$ for all $g \in G$. So $X_V \in X(G)$. To show that $X_V$ is left invariant, let $x, g \in G$. Then

$$L^F_g(X_V)(x) = L^T_g(X_V(g^{-1}x)) = L^T_g L^T_{g^{-1}x} V = L^T_x V = X_V(x).$$

So $X_V$ is left invariant by Definition 34.3.11. (This is the same as Theorem 34.3.12.)

To prove that $X_V$ is $C^\infty$, it must be shown (by Remark 29.5.4) that $\tilde\psi \circ X_V \circ \psi^{-1}$ is $C^\infty$ for all $\psi \in \mathrm{atlas}(G)$, where $\tilde\psi \in \mathrm{atlas}(T(G))$ denotes the tangent space chart corresponding to each manifold chart $\psi \in \mathrm{atlas}(G)$. (See Definition 28.8.1 for tangent space charts.) By Definition 31.3.1 for the differential of a map, $X_V(g) = (dL^G_g)_e(V) = t_{g,w,\psi_2}$ for $\psi_2 \in \mathrm{atlas}_g(G)$, where $w \in \mathbb{R}^n$ is defined by

$$\forall k = 1 \ldots n,\qquad w^k = \sum_{i=1}^n v^i\, \partial_{x^i}\bigl(\psi_2^k \circ L^G_g \circ \psi_1^{-1}(x)\bigr)\Big|_{x=\psi_1(e)} = \sum_{i=1}^n v^i\, \partial_{x^i}\, \psi_2^k\bigl(\sigma(g, \psi_1^{-1}(x))\bigr)\Big|_{x=\psi_1(e)}.$$

Define $f : U \to \mathbb{R}^n$ by $f(x, y) = \psi_2(\sigma(\psi_2^{-1}(y), \psi_1^{-1}(x)))$ for $(x, y) \in U = \mathrm{Dom}(\psi_1) \times \mathrm{Dom}(\psi_2) \subseteq \mathbb{R}^{2n}$. Then by the definition of a $C^\infty$ manifold map $\sigma : G \times G \to G$, $f$ is a $C^\infty$ function. Therefore the derivative function $(x, y) \mapsto h(x, y) = \sum_{i=1}^n v^i \partial_{x^i} f(x, y)$ is also $C^\infty$. It follows that $w$ is $C^\infty$ with respect to $g$, since $w = h(\psi_1(e), \psi_2(g))$. But for all $y \in \mathrm{Dom}(\psi_2)$, $\tilde\psi_2(X_V(\psi_2^{-1}(y))) = (y, h(\psi_1(e), y)) \in \mathbb{R}^{2n}$, which is now obviously $C^\infty$ with respect to $y$. Therefore $X_V$ is $C^\infty$ as claimed.

The uniqueness and the converse follow from Definition 34.3.7 with $x = g$ and $X(e) = V$.

34.3.15 Remark: The proof of Theorem 34.3.14 shows that left invariant vector fields $X_V$ are analytic.

34.3.16 Remark: One may try to prove the $C^\infty$ regularity in Theorem 34.3.14 using the test-function version of the left translation operator $\mathring{L}^T_g$. In this case, the left invariant tangent operator field $X_V \in \mathring{X}^0(G)$ would be constructed as $X_V(g) = \mathring{L}^T_g(D_V)$ for $V \in T_e(G)$. In this case, the test for $C^\infty$ regularity would require $X_V(\phi) \in C^\infty(G)$ for all $\phi \in C^\infty(G)$. For any $x \in G$, $X_V(\phi)(x) = X_V(x)(\phi) = \mathring{L}^T_x(D_V)(\phi) = D_V(\phi \circ L^G_x)$. It would be argued that $\phi \circ L^G_x$ is $C^\infty$ with respect to $x$, and therefore $D_V(\phi \circ L^G_x)$ must be $C^\infty$ with respect to $x$, and therefore the $C^\infty$ regularity of $X_V(\phi)$ is proved. However, although it is "obvious" that $x \mapsto D_V(\phi \circ L^G_x)$ is $C^\infty$, it is not obvious how one would prove it. To show that $D_V(\phi(\sigma_G(x, \cdot)))$ is $C^\infty$ would probably require some analysis using charts as in the proof of Theorem 34.3.14.
[ Is XL∞ (G) in Theorem 34.3.17 the same thing as XL (G)? ] 34.3.17 Theorem: The set XL∞ (G) of left invariant C ∞ vector fields on G is a subalgebra of the Lie algebra X ∞ (G) of C ∞ vector fields on G. Proof: It must be shown that XL∞ (G) is closed under both linear combinations and the Poisson bracket. ... 34.3.18 Remark: The tangent bundle of a Lie group G is trivial. A chart which demonstrates this is φ : T (G) → Te (G) defined by φ : z 7→ (dLπ(z)−1 )π(z) (z), where e is the identity of G, π is the projection map of T (G), and dLπ(z)−1 is the differential of the map Lπ(z)−1 : G → G. [ www.topology.org/tex/conc/dg.html ]
[ draft: UTC 2009–3–21 Saturday 11:36 ]
Differential geometry reconstructed.Copyright (C) 2009, Alan U. Kennington. All Rights Reserved.The author hereby grants permission to print this book draft in A4 format.Printing in all other formats is forbidden.You may not charge any fee for copies of this book draft.
where v ∈ IRn satisfies V = te,v,ψ1 for ψ1 ∈ atlase (G), and σ : G × G → G is the C ∞ group action of G.
34.3.19 Remark: Left translation of vectors defines an absolute parallelism on a Lie group. More precisely, vectors $V_1 \in T_{g_1}(G)$ and $V_2 \in T_{g_2}(G)$ for $g_1, g_2$ in a Lie group may be said to be parallel if $V_2 = L^T_{g_2 g_1^{-1}}(V_1)$. Of course, right translation also defines an absolute parallelism.

[ Try to define left translations of general tensors, and left-invariant tensor fields. ]
34.4. Right invariant vector fields on Lie groups

In the interests of fair play, this section replicates Section 34.3 for the case of right invariant vector fields.

34.4.1 Remark: Right invariant vector fields are important for defining connections on principal fibre bundles. Such connections must leave invariant the structure of fibre sets; therefore the infinitesimal transformations specified by connections must be right invariant vector fields because these are equivalent to infinitesimal left actions of the group on itself, as explained in Remark 34.8.8.

34.4.2 Definition: Let G be a Lie group.

The right translation operator (for group elements) $R^G_g : G \to G$ is defined for $g \in G$ by

$$\forall x \in G, \qquad R^G_g(x) = xg.$$

The right translation operator (for real-valued functions) $R^C_g : C^0(G) \to C^0(G)$ is defined for $g \in G$ by

$$\forall \phi \in C^0(G), \qquad R^C_g(\phi) = \phi \circ R^G_{g^{-1}}.$$

That is,

$$\forall \phi \in C^0(G),\ \forall x \in G, \qquad R^C_g(\phi)(x) = \phi(xg^{-1}).$$

The right translation operator (for tangent operators) $\mathring{R}^T_g : \mathring{T}(G) \to \mathring{T}(G)$ is defined for $g \in G$ by

$$\forall V \in T(G), \qquad \mathring{R}^T_g(D_V) = D_V \circ R^C_{g^{-1}}.$$

That is,

$$\forall V \in T(G),\ \forall \phi \in C^1(G), \qquad \mathring{R}^T_g(D_V)(\phi) = D_V(R^C_{g^{-1}}\phi) = D_V(\phi \circ R^G_g).$$

The right translation operator (for tangent operator fields) $\mathring{R}^F_g : \mathring{X}^0(G) \to \mathring{X}^0(G)$ is defined for $g \in G$ by

$$\forall X \in X^0(G), \qquad \mathring{R}^F_g(D_X) = \mathring{R}^T_g \circ (D_X \circ R^G_{g^{-1}}).$$

That is,

$$\forall X \in X^0(G),\ \forall x \in G, \qquad \mathring{R}^F_g(D_X)(x) = \mathring{R}^T_g(D_X(xg^{-1})).$$

That is,

$$\forall X \in X^0(G),\ \forall x \in G,\ \forall \phi \in C^1(G), \qquad \mathring{R}^F_g(D_X)(x)(\phi) = D_X(R^G_{g^{-1}}x)(R^C_{g^{-1}}\phi) = D_X(xg^{-1})(\phi \circ R^G_g).$$

34.4.3 Definition: Let G be a Lie group.

The right translation operator (for tangent vectors) $R^T_g : T(G) \to T(G)$ is defined for $g \in G$ by $R^T_g = (R^G_g)_*$. That is,

$$\forall V \in T(G), \qquad R^T_g(V) = (R^G_g)_*(V).$$

The right translation operator (for vector fields) $R^F_g : X^0(G) \to X^0(G)$ is defined for $g \in G$ by

$$\forall X \in X^0(G), \qquad R^F_g(X) = R^T_g \circ X \circ R^G_{g^{-1}} = (R^G_g)_* \circ X \circ R^G_{g^{-1}}.$$

That is,

$$\forall X \in X^0(G),\ \forall x \in G, \qquad R^F_g(X)(x) = R^T_g(X(xg^{-1})) = (R^G_g)_*(X(xg^{-1})).$$
34.4.4 Remark: The interpretation of Definitions 34.4.2 and 34.4.3 may be helped by Figure 34.4.1.

[ Figure 34.4.1: Right translation operators for Lie groups. The diagram shows the actions of $R^G_g$, $R^C_g$, $R^T_g$ and $R^F_g$ on the points $xg^{-1}$, $x$, $xg$, a function $\phi$, a vector $V$ and a field value $X(xg^{-1})$. ]
34.4.5 Theorem: Definition 34.4.2 is consistent with Definition 34.4.3. That is, $D_{R^T_g(V)} = \mathring{R}^T_g(D_V)$ for all $g \in G$ and $V \in T(G)$; and $D_{R^F_g(X)} = \mathring{R}^F_g(D_X)$ for all $g \in G$ and $X \in X^0(G)$.

34.4.6 Theorem: For all elements $g_1, g_2 \in G$ of a Lie group G, $R^G_{g_1} \circ R^G_{g_2} = R^G_{g_2 g_1}$, $R^C_{g_1} \circ R^C_{g_2} = R^C_{g_2 g_1}$, $\mathring{R}^T_{g_1} \circ \mathring{R}^T_{g_2} = \mathring{R}^T_{g_2 g_1}$, $\mathring{R}^F_{g_1} \circ \mathring{R}^F_{g_2} = \mathring{R}^F_{g_2 g_1}$, $R^T_{g_1} \circ R^T_{g_2} = R^T_{g_2 g_1}$ and $R^F_{g_1} \circ R^F_{g_2} = R^F_{g_2 g_1}$.

34.4.7 Definition: A right invariant vector field on a Lie group G is a vector field $X \in X^0(G)$ such that

$$\forall g \in G, \qquad R^F_g(X) = X.$$
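The order reversal $g_2 g_1$ in Theorem 34.4.6 is easy to get wrong, so a small numerical check may help. The following is only a sketch under stated assumptions: the group is modelled by invertible 2×2 rational matrices (a convenient stand-in for a general Lie group), and the helper names `mat`, `mul`, `inv`, `right_translate`, `right_translate_fn` and the test function `phi` are hypothetical, introduced here for illustration.

```python
from fractions import Fraction

def mat(a, b, c, d):
    """Build a 2x2 matrix with exact rational entries."""
    return ((Fraction(a), Fraction(b)), (Fraction(c), Fraction(d)))

def mul(p, q):
    """Matrix product p q."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inv(p):
    """Inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = p
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def right_translate(g):
    """R^G_g : x -> x g, as in Definition 34.4.2."""
    return lambda x: mul(x, g)

def right_translate_fn(g):
    """R^C_g : phi -> phi o R^G_{g^{-1}}, as in Definition 34.4.2."""
    g_inv = inv(g)
    return lambda phi: (lambda x: phi(mul(x, g_inv)))

def phi(m):
    """Arbitrary test function on matrices (hypothetical, for illustration)."""
    return m[0][0] + 2 * m[0][1] - m[1][0] * m[1][1]

g1 = mat(1, 2, 0, 1)
g2 = mat(2, 0, 1, 1)
x = mat(3, 1, 4, 2)

# Points: R_{g1}(R_{g2}(x)) compared with R_{g2 g1}(x).
lhs_point = right_translate(g1)(right_translate(g2)(x))
rhs_point = right_translate(mul(g2, g1))(x)

# Functions: R^C_{g1}(R^C_{g2}(phi)) compared with R^C_{g2 g1}(phi), at x.
lhs_fn = right_translate_fn(g1)(right_translate_fn(g2)(phi))(x)
rhs_fn = right_translate_fn(mul(g2, g1))(phi)(x)

# The opposite order g1 g2 gives a different map for this choice of g1, g2.
wrong_order = right_translate(mul(g1, g2))(x)

print(lhs_point == rhs_point, lhs_fn == rhs_fn, wrong_order != lhs_point)
```

Exact rational arithmetic is used so that the comparisons are equalities rather than tolerance checks. The check covers only one choice of $g_1$, $g_2$ and $x$; it illustrates the statement and is not a proof.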
34.4.8 Theorem: Let G be a Lie group. Then $X(g) = (R^G_g)_*(X(e)) = (dR^G_g)_e(X(e))$ for any right invariant vector field X on G and any $g \in G$, where e is the identity of G.

Proof: The proof is the same as for Theorem 34.3.12.

34.4.9 Theorem: Let G be a Lie group with identity e. Then for all $V \in T_e(G)$ there is a unique right invariant vector field $X_V \in X^0(G)$ such that $X_V(e) = V$. Moreover, $X_V \in X^\infty(G)$ and $X_V(g) = R^T_g V$ for all $g \in G$. Conversely, any right invariant vector field on G is of the form $X_V$ for some $V \in T_e(G)$.

Proof: The proof is the same as for Theorem 34.3.14.

[ The formula $X_V(g) = R^T_g V$ in Theorem 34.4.9 is the same as Theorem 34.4.8. ]

34.4.10 Theorem: The set $X^\infty_R(G)$ of right invariant $C^\infty$ vector fields on G is a subalgebra of the Lie algebra $X^\infty(G)$ of $C^\infty$ vector fields on G.

Proof: The proof is the same as for Theorem 34.3.8.

[ Must show the relation between left and right invariant fields. They're probably conjugate or something. ]
34.4.11 Remark: Group elements may be thought of as “two-port” objects because they can be multiplied both on the left and on the right, whereas the elements of the passive set of a Lie transformation group are “one-port” objects because they can only be multiplied from the left. This turns out to be the fundamental reason why principal fibre bundles have some advantages over ordinary fibre bundles. Theorem 34.4.12 is an example of why the two-port property is useful: the left and right actions commute. This follows from associativity.

34.4.12 Theorem: Let G be a Lie group. Then

(i) $L^G_g R^G_h = R^G_h L^G_g$ for all $g, h \in G$.

(ii) $L^C_g R^C_h = R^C_h L^C_g$ for all $g, h \in G$.

(iii) $L^T_g R^T_h = R^T_h L^T_g$ for all $g, h \in G$.

Proof: Part (i) follows from the calculation $L^G_g R^G_h x = g(xh) = (gx)h = R^G_h L^G_g x$ for all $g, h, x \in G$. Part (ii) follows from the calculation $L^C_g R^C_h(\phi) = \phi \circ L^G_{g^{-1}} \circ R^G_{h^{-1}} = \phi \circ R^G_{h^{-1}} \circ L^G_{g^{-1}} = R^C_h L^C_g(\phi)$ for all $g, h \in G$, which follows from part (i). Part (iii) has two forms depending on whether vector fields or operator fields are acted upon. ...
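The three commutation rules of Theorem 34.4.12 can be spot-checked in a matrix model. This is a sketch under assumptions that are not in the text: group elements are invertible 2×2 rational matrices, and tangent vectors are represented by 2×2 matrices on which $L^T_g$ and $R^T_h$ act by left and right matrix multiplication (which is valid for matrix groups because the translations are restrictions of linear maps). All helper names are hypothetical.

```python
from fractions import Fraction

def mat(a, b, c, d):
    """Build a 2x2 matrix with exact rational entries."""
    return ((Fraction(a), Fraction(b)), (Fraction(c), Fraction(d)))

def mul(p, q):
    """Matrix product p q."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inv(p):
    """Inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = p
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def LC(g):
    """L^C_g : phi -> phi o L^G_{g^{-1}}."""
    g_inv = inv(g)
    return lambda f: (lambda y: f(mul(g_inv, y)))

def RC(h):
    """R^C_h : phi -> phi o R^G_{h^{-1}}."""
    h_inv = inv(h)
    return lambda f: (lambda y: f(mul(y, h_inv)))

def phi(m):
    """Arbitrary test function on matrices (hypothetical, for illustration)."""
    return m[0][0] * m[1][1] + 3 * m[0][1] - m[1][0]

g = mat(1, 1, 0, 1)
h = mat(2, 1, 1, 1)
x = mat(1, 2, 3, 5)
V = mat(0, 1, 2, 3)  # a tangent vector in the matrix model

# (i) points: L_g(R_h(x)) versus R_h(L_g(x)).
points_commute = mul(g, mul(x, h)) == mul(mul(g, x), h)

# (ii) functions: L^C_g(R^C_h(phi)) versus R^C_h(L^C_g(phi)), evaluated at x.
functions_commute = LC(g)(RC(h)(phi))(x) == RC(h)(LC(g)(phi))(x)

# (iii) tangent vectors: L^T_g(R^T_h(V)) versus R^T_h(L^T_g(V)).
vectors_commute = mul(g, mul(V, h)) == mul(mul(g, V), h)

print(points_commute, functions_commute, vectors_commute)
```

In this model all three checks reduce to associativity of matrix multiplication, which mirrors the remark that the commutation of left and right actions follows from associativity.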
34.5. The Lie algebra of a Lie group

The Lie algebra of a Lie group is generally defined in two different ways: as tangent vectors in the tangent space at the identity of the group (as in Definition 34.5.1) or as left invariant vector fields (as in Definition 34.5.2). By Theorem 34.3.14, these definitions are effectively equivalent.

[ Probably could use Theorem 9.11.9 to show that $X^\infty(M)$ is a Lie algebra with the commutator operation. ]

34.5.1 Definition: The (tangent-space version) Lie algebra of a Lie group G with identity e is the set $T_e(G)$ together with the operation $[\cdot, \cdot] : T_e(G) \times T_e(G) \to T_e(G)$ defined by

$$\forall v, w \in T_e(G), \qquad [v, w] = [X_v, X_w](e),$$

where $X_v, X_w$ are the left invariant vector fields in G defined in Theorem 34.3.14.

[ Must define the exponential map on a Lie group by elements of the Lie algebra: $\exp tA$, for $t \in \mathbb{R}$ and $A \in T_e(G)$. Prove that the exponential map exists and is unique. See EDM2 [34] 249.Q. ]

[ Present the full specification tuples for Definitions 34.5.1 and 34.5.2. ]

34.5.2 Definition (→ 34.5.1): The (vector-field version) Lie algebra of a Lie group G is the set $X^\infty_L(G)$ of all left invariant vector fields in G together with the operation $[\cdot, \cdot] : X^\infty_L(G) \times X^\infty_L(G) \to X^\infty_L(G)$ defined by

$$\forall X, Y \in X^\infty_L(G), \qquad [X, Y] = XY - YX.$$

[ Show that $\exp(f(x)) = f(\exp(x))$ for automorphisms f. See Crampin/Pirani [11], page 313. ]

34.5.3 Theorem: Definitions 34.5.1 and 34.5.2 for the Lie algebra of a Lie group are equivalent.

[ That is, there is some sort of Lie algebra isomorphism $X^\infty_L(G) \simeq T_e(G)$. Define this isomorphism. Mention the exponential map. ]

[ Define linear representations of Lie algebras. See Remark 9.11.16. Also define irreducible representations, adjoint representations and Killing forms. See EDM2 [34], section 248.B. For adjoint representations, see EDM2 [34], 249.P, Crampin/Pirani [11], page 314, Fulton/Harris [108], page 106. Define ad(g) as the induced map of the inner automorphism by g. ]

[ Theorem: If $f : G \to G$ is an automorphism, then $f_* : T_e(G) \to T_e(G)$ is a Lie algebra automorphism. Similar theorems for homomorphisms etc. ]

[ Define general linear Lie algebras. See EDM2 [34] 248.A. ]
34.6. Diffeomorphism groups

The structure groups for parallelism on differentiable fibre bundles are groups of diffeomorphisms of a differentiable manifold. Lie transformation groups (in Section 34.7) are diffeomorphism groups for the special case that the group itself is a differentiable manifold. A general diffeomorphism group can be very infinite-dimensional. So strictly speaking, diffeomorphism groups are not generally differentiable groups themselves.

Of special interest for the analysis of parallelism on differentiable fibre bundles are families of diffeomorphisms, generators of diffeomorphisms, and vector fields generated by families of diffeomorphisms. In the analysis of curvature, the Lie algebra of such vector fields plays an important role.

34.6.1 Definition: The $C^r$ diffeomorphism group of a $C^r$ manifold M for $r \in \bar{\mathbb{Z}}^+_0$ is the transformation group $G -\!\!< (G, M) -\!\!< (G, M, A_M, \sigma_G, \mu)$ of all $C^r$ diffeomorphisms from M to M.

The $C^r$ diffeomorphism group of M may also be called the $C^r$ automorphism group of M.

[ Should define a standard topology on diffeomorphism groups, probably the topology of pointwise convergence. ]

34.6.2 Remark: In Definition 34.6.1, the group elements are identified with their action on the manifold M. This implies automatically that the group acts effectively on M because any two group elements which have the same action on M are the same element by their definition. Since the identity function on M is always a $C^r$ diffeomorphism, the $C^r$ diffeomorphism group is well defined for any $C^r$ manifold. If the class is not specified, it is assumed to be $C^1$. The $C^0$ diffeomorphism group of a $C^0$ manifold M is the same thing as the topological automorphism group of M, although the topology on M is represented by an atlas instead of open sets. In general, there is no standard differentiable structure (such as a differentiable atlas) on diffeomorphism groups, although the space of all vector fields on M which are generated by one-parameter families of diffeomorphisms of M could be regarded as a kind of tangent space for the group.

34.6.3 Definition: A group of $C^r$ diffeomorphisms of a $C^r$ manifold M for $r \in \bar{\mathbb{Z}}^+_0$ is a transformation group $G -\!\!< (G, M) -\!\!< (G, M, A_M, \sigma_G, \mu)$ of $C^r$ diffeomorphisms from M to M.

A group of $C^r$ diffeomorphisms of M may also be called a group of $C^r$ automorphisms of M.

34.6.4 Remark: A group of $C^r$ diffeomorphisms of a $C^r$ manifold in Definition 34.6.3 is necessarily a subgroup of the $C^r$ diffeomorphism group of M according to Definition 34.6.1. If the group G has a topology $T_G$, and the group action is continuous with respect to $T_G$ and the topology on M, then this kind of group may be called a topological group of $C^r$ diffeomorphisms as in Definition 34.6.5. The topological group of all $C^r$ diffeomorphisms of a manifold M is required in Definition 34.6.6 to have the compact-open topology.

34.6.5 Definition: A topological group of $C^r$ diffeomorphisms of a $C^r$ manifold M for $r \in \bar{\mathbb{Z}}^+_0$ is a topological transformation group $G -\!\!< (G, M) -\!\!< (G, T_G, M, A_M, \sigma_G, \mu)$ of $C^r$ diffeomorphisms from M to M such that $\mu : G \times M \to M$ is continuous.

A topological group of $C^r$ diffeomorphisms of M may also be called a topological group of $C^r$ automorphisms of M.

34.6.6 Definition: The topological $C^r$ diffeomorphism group of a $C^r$ manifold M for $r \in \bar{\mathbb{Z}}^+_0$ is the topological transformation group $G -\!\!< (G, M) -\!\!< (G, T_G, M, A_M, \sigma_G, \mu)$ of all $C^r$ diffeomorphisms from M to M, where the topology $T_G$ is the compact-open topology on G.

The topological $C^r$ diffeomorphism group of M may also be called the topological $C^r$ automorphism group of M.

[ Check that the compact-open topology is the most suitable topology for Definition 34.6.6. ]

34.6.7 Definition: A $C^r$ family of diffeomorphisms for $r \in \bar{\mathbb{Z}}^+_0$ of a $C^r$ manifold M is a map $\gamma : I \to (M \to M)$ for some interval $I \subseteq \mathbb{R}$ such that $\gamma(t) : M \to M$ is a $C^r$ diffeomorphism of M for all $t \in I$, and the map $t \mapsto \gamma(t)(z)$ is a $C^r$ map from I to M for all $z \in M$.

34.6.8 Definition: The vector field generated by a $C^r$ family $\gamma$ of diffeomorphisms of a $C^r$ manifold M for a parameter $t_0 \in I$ is the map $X : M \to T(M)$ defined by $X : z \mapsto \partial_t(\gamma(t)(z))\big|_{t=t_0}$ for $z \in M$.

34.6.9 Theorem: For all $r \in \bar{\mathbb{Z}}^+$, the vector field generated by any $C^r$ family of diffeomorphisms of a $C^r$ manifold is of class $C^{r-1}$ for all parameter values of the family.
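As a concrete instance of Definition 34.6.8, consider the family of scalings $\gamma(t)(z) = e^t z$ of the manifold $M = \mathbb{R}$, for which the generated vector field at parameter $t_0$ is $z \mapsto e^{t_0} z$. The sketch below approximates the $t$-derivative by a central finite difference. The family, the function names and the step size `dt` are choices made here for illustration and do not come from the text.

```python
import math

def gamma(t):
    """A family of diffeomorphisms of M = R: gamma(t) scales by exp(t)."""
    return lambda z: math.exp(t) * z

def generated_field(family, t0, z, dt=1e-6):
    """Central-difference estimate of d/dt (family(t)(z)) at t = t0."""
    return (family(t0 + dt)(z) - family(t0 - dt)(z)) / (2 * dt)

t0, z = 0.5, 2.0
numeric = generated_field(gamma, t0, z)   # finite-difference estimate
exact = math.exp(t0) * z                  # closed form for this family

print(abs(numeric - exact) < 1e-6)
```

Since $M = \mathbb{R}$ has a single global chart, the tangent vector at $z$ is represented here by its one real component.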
34.7. Lie transformation groups

Lie transformation groups are particularly useful for defining differentiable ordinary fibre bundles and connections on differentiable fibre bundles.

[ See Sulanke and Wintgen [85], section I.7. ]

[ Refer back to topological transformation groups and Euclidean topological transformation groups. ]

34.7.1 Definition: A $C^r$ Lie transformation group for $r \in \bar{\mathbb{Z}}^+_0$ is a tuple $(G, M) -\!\!< (G, A_G, M, A_M, \sigma_G, \mu)$ such that

(i) $(G, A_G, \sigma_G)$ is a Lie group,
(ii) $(M, A_M)$ is a $C^r$ manifold,
(iii) $(G, T_G, M, T_M, \sigma_G, \mu)$ is a topological transformation group with topologies $T_G$ and $T_M$ induced by the atlases $A_G$ and $A_M$ respectively,
(iv) the map $\mu : G \times M \to M$ is $C^r$ with respect to the product differentiable structure on $G \times M$ and the differentiable structure on M; i.e. $\mu \in C^r(G \times M, M)$.

34.7.2 Remark: A Lie transformation group is also known as a differentiable transformation group. The manifold M in Definition 34.7.1 is sometimes called a G-manifold or G-space. In this book, it will be called a Lie (transformation) group space, especially when the group G is not specified.

If the regularity class $C^r$ of a Lie transformation group is unspecified, it is assumed to be $C^1$. (Some authors may assume that it is $C^\infty$ or analytic.)

34.7.3 Remark: This is a minimal-regularity definition as in EDM2 [34], section 431.C, which corresponds to Montgomery/Zippin [76], page 195. Malliavin [35], page 240 defines Lie groups of transformations as $C^\infty$ transformations on the right of a $C^\infty$ manifold. Kobayashi and Nomizu [26], page 41 also define Lie transformation groups to have a $C^\infty$ action on the right on a $C^\infty$ manifold.

34.7.4 Remark: Figure 34.7.1 shows some of the relations between topological transformation groups and Lie transformation groups. The symbols $A_G$ and $A_M$ refer to atlases on sets G and M respectively. The atlases imply corresponding topologies for the respective sets.

[ Figure 34.7.1: Family tree of topological and Lie transformation groups. Nodes: transformation group $(G, X, \sigma_G, \mu)$; topological group $(G, T_G, \sigma_G)$; Lie group $(G, A_G, \sigma_G)$; transformation group of a topological space $(G, X, T_X, \sigma_G, \mu)$; topological transformation group of a topological space $(G, T_G, X, T_X, \sigma_G, \mu)$; $C^k$ Lie transformation group $(G, A_G, M, A_M, \sigma_G, \mu)$. Edge labelled EDM2, 431.H(10): transformation group of a topological space, M locally compact, $T_G$ = compact-open topology, M locally connected or a uniform topological space, G acts equicontinuously on M. Edge labelled EDM2, 431.H(11): effective topological transformation group, M is $C^k$, G locally compact, $\mu$ is $C^k$ with respect to M. ]

According to EDM2 [34], section 431.H: “Suppose that M is a $C^1$ manifold and G is a topological transformation group of M acting effectively on M. If G is locally compact and the map $x \mapsto g(x)$ of M is of class $C^1$ for each element g of G, then G is a Lie transformation group of M.” This seems to cover all of the cases of interest. Thus if the group, the manifold and the group action are $C^1$, then the group is a Lie transformation group, which implies that it is analytic. This implies that there is not much point in defining $C^r$ transformation groups for $r > 1$. However, it seems clear that there is a point in defining varying levels of regularity for the manifold M and for the action $\mu : G \times M \to M$. Of particular relevance to this is Theorem 34.7.5, which is paraphrased from Montgomery/Zippin [76], page 212.

[ What is the relation between Theorem 34.7.5 (and Remark 34.7.4) and Hilbert's fifth problem? ]

34.7.5 Theorem: Let $(G, M, \sigma_G, \mu)$ be a Lie transformation group of a manifold M. Let $k \in \bar{\mathbb{Z}}^+_0$. Suppose that M is a $C^k$ manifold and $L_g : M \to M$ is $C^k$ (i.e. $L_g \in C^k(M, M)$) for all $g \in G$. Then $\mu \in C^k(G \times M, M)$. If M is analytic and $L_g : M \to M$ is analytic, then the group action $\mu$ is analytic.
34.7.6 Notation: $GL(n, \mathbb{R})$ denotes the Lie group of general linear transformations of $\mathbb{R}^n$. That is, it is the group of real invertible $n \times n$ matrices. An abbreviated notation is $GL(n)$.

[ Present examples here, such as SO(2) and SO(3). SO(3) is discussed a little in Section 42.8. These are defined in Section 10.9. Should present here the full specification tuples of the classical groups. ]

34.7.7 Remark: Definition 34.7.8 is a generalization to Lie transformation groups of Definition 34.3.2 for Lie groups. The left translation operators in Definition 34.7.8 use the same notations as in Definition 34.3.2, but all act on the passive space M rather than G. A similar comment applies to Definition 34.7.9.

34.7.8 Definition: Let $(G, M)$ be a Lie transformation group.

The left translation operator (for group elements) $L^G_g : M \to M$ is defined for $g \in G$ by

$$\forall x \in M, \qquad L^G_g(x) = gx.$$

The left translation operator (for real-valued functions) $L^C_g : C^0(M) \to C^0(M)$ is defined for $g \in G$ by

$$\forall \phi \in C^0(M), \qquad L^C_g(\phi) = \phi \circ L^G_{g^{-1}}.$$

That is,

$$\forall \phi \in C^0(M),\ \forall x \in M, \qquad L^C_g(\phi)(x) = \phi(g^{-1}x).$$

The left translation operator (for tangent operators) $\mathring{L}^T_g : \mathring{T}(M) \to \mathring{T}(M)$ is defined for $g \in G$ by

$$\forall V \in T(M), \qquad \mathring{L}^T_g(D_V) = D_V \circ L^C_{g^{-1}}.$$

That is,

$$\forall V \in T(M),\ \forall \phi \in C^1(M), \qquad \mathring{L}^T_g(D_V)(\phi) = D_V(L^C_{g^{-1}}\phi) = D_V(\phi \circ L^G_g).$$

The left translation operator (for tangent operator fields) $\mathring{L}^F_g : \mathring{X}^0(M) \to \mathring{X}^0(M)$ is defined for $g \in G$ by

$$\forall X \in X^0(M), \qquad \mathring{L}^F_g(D_X) = \mathring{L}^T_g \circ (D_X \circ L^G_{g^{-1}}).$$

That is,

$$\forall X \in X^0(M),\ \forall x \in M, \qquad \mathring{L}^F_g(D_X)(x) = \mathring{L}^T_g(D_X(g^{-1}x)).$$

That is,

$$\forall X \in X^0(M),\ \forall x \in M,\ \forall \phi \in C^1(M), \qquad \mathring{L}^F_g(D_X)(x)(\phi) = D_X(L^G_{g^{-1}}x)(L^C_{g^{-1}}\phi) = D_X(g^{-1}x)(\phi \circ L^G_g).$$

34.7.9 Definition: Let $(G, M)$ be a Lie transformation group.

The left translation operator (for tangent vectors) $L^T_g : T(M) \to T(M)$ is defined for $g \in G$ by

$$\forall V \in T(M), \qquad L^T_g(V) = (L^G_g)_*(V).$$

The left translation operator (for vector fields) $L^F_g : X^0(M) \to X^0(M)$ is defined for $g \in G$ by

$$\forall X \in X^0(M), \qquad L^F_g(X) = (L^G_g)_* \circ X \circ L^G_{g^{-1}}.$$

That is,

$$\forall X \in X^0(M),\ \forall x \in M, \qquad L^F_g(X)(x) = (L^G_g)_*(X(g^{-1}x)).$$

[ Define also left invariant vector fields etc. See Definition 34.3.11. Remark that except for $(G, G)$, left invariant fields on M are not very practical. E.g. SO(2) on $\mathbb{R}^2$ has no non-constant left invariant vector fields? ]
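The vector-field operator $L^F_g(X)(x) = (L^G_g)_*(X(g^{-1}x))$ of Definition 34.7.9 can be tried out on the example raised in the note above, SO(2) acting on $\mathbb{R}^2$ by rotations. The following is an exploratory sketch, not a statement from the text. It assumes the standard linear action, for which $(L^G_g)_*$ acts on vector components by the same rotation matrix, and it tests the condition $L^F_g(X) = X$ only on a finite sample of angles and points. The fields `radial`, `rotational` and `constant` and all helper names are hypothetical.

```python
import math

def rot(theta):
    """Rotation matrix for the element g = rot(theta) of SO(2)."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))

def apply(m, v):
    """Apply a 2x2 matrix to a vector in R^2."""
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])

def left_translate_field(theta, field):
    """L^F_g(X) = (L^G_g)_* o X o L^G_{g^{-1}} for g = rot(theta).

    The action is linear, so the differential acts by the same matrix.
    """
    g, g_inv = rot(theta), rot(-theta)
    return lambda x: apply(g, field(apply(g_inv, x)))

def radial(x):
    return (x[0], x[1])

def rotational(x):
    return (-x[1], x[0])

def constant(x):
    return (1.0, 0.0)

def is_invariant(field, thetas, points, tol=1e-12):
    """Test L^F_g(X)(p) == X(p) on the given sample, up to rounding error."""
    for theta in thetas:
        moved = left_translate_field(theta, field)
        for p in points:
            a, b = moved(p), field(p)
            if abs(a[0] - b[0]) > tol or abs(a[1] - b[1]) > tol:
                return False
    return True

thetas = (0.3, 1.1, 2.5)
points = ((1.0, 0.0), (0.5, -2.0), (-3.0, 4.0))

radial_ok = is_invariant(radial, thetas, points)
rotational_ok = is_invariant(rotational, thetas, points)
constant_ok = is_invariant(constant, thetas, points)

print(radial_ok, rotational_ok, constant_ok)
```

On this sample, the radial and rotational fields pass the invariance test while the nonzero constant field fails it, because a rotation moves the vector $(1, 0)$. This suggests that the question in the note deserves a closer look under this reading of invariance, although a finite sample settles nothing by itself.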
34.8. Infinitesimal transformations

[ This topic is related to the exponential map and the “local Lie group of local transformations”. ]

In this section, vector fields on a G-manifold M are generated by differentiating transformations in a Lie transformation group G. These vector fields may be thought of as “infinitesimal transformations” or “differential actions” of the group on the manifold, or “generators” of actions on the manifolds. These are used for defining connections on ordinary fibre bundles because differential parallelism is represented as differential actions on tangent fibre bundles.

34.8.1 Remark: Corresponding to left and right invariant vector fields on a Lie group, as outlined in Sections 34.3 and 34.4, one may define vector fields on a G-manifold M with some similar properties. An example of this for $G = SO(3)$ and $M = S^2$ is presented in Remark 42.8.4.

34.8.2 Remark: Definition 34.8.4 defines an infinitesimal transformation $Y_V$ on a G-manifold corresponding to each element V of the Lie algebra $T_e(G)$. To motivate Definition 34.8.4, consider a curve $\gamma : I \to G$ for some open interval $I \subseteq \mathbb{R}$ such that $0 \in I$ and $\gamma(0) = e \in G$. For each $p \in M$, one may define a curve $\gamma_p : I \to M$ by $\gamma_p : t \mapsto \gamma(t)p$. Since M is a manifold, differentiability of such curves is well-defined. So assume that $\gamma_p$ is a $C^1$ curve in M for all $p \in M$. If G is a Lie group, one may differentiate both $\gamma$ and $\gamma_p$. Then

$$\gamma_p'(0) = \partial_t(\gamma(t)p)\big|_{t=0} = \partial_t(R_p(\gamma(t)))\big|_{t=0} = (dR_p)_e(\gamma'(0)),$$

where $R_p$ is the right action map in Definition 34.8.3. Thus if G is a Lie group, then the infinitesimal action of the curve $\gamma$ on M is of the form $(dR_p)_e(V)$ with $V = \gamma'(0)$. However, even if G is not a Lie group, it may be that the derivatives $\gamma_p'(0)$ exist for all $p \in M$, in which case one may generalize the definition to infinitesimal transformations of the form $Y_\gamma \in X(M)$, where $Y_\gamma : p \mapsto \gamma_p'(0)$. This more general definition is applicable to connections on fibrations, whereas the Lie group version is applicable to connections on differentiable fibre bundles.

34.8.3 Definition: The right action on a Lie transformation group $G -\!\!< (G, M, \sigma_G, \mu)$ by a point $p \in M$ is the map $R_p : G \to M$ defined by $R_p : g \mapsto g.p = \mu(g, p)$.

34.8.4 Definition: An infinitesimal transformation of a $C^1$ Lie transformation group G acting on a $C^1$ manifold M is a vector field $Y_V \in X(M)$ defined for $V \in T_e(G)$ by

$$\forall p \in M, \qquad Y_V(p) = (dR_p)_e(V),$$

where $R_p : G \to M$ is the right action of p on G.

34.8.5 Remark: Since the right action $R_p$ on G by each element $p \in M$ is a $C^1$ map, the differential $dR_p$ is well-defined. So $Y_V(p) \in T_p(M)$ is well defined for all $p \in M$. The regularity of $Y_V$ follows from the regularity of the group action in a similar way to the regularity of the left invariant vector field $X_V \in X(G)$ in Theorem 34.3.14.

34.8.6 Theorem: [ Have a theorem to say that $Y_V$ is $C^k$ if $(G, M)$ is $C^{k+1}$, or something like that. The proof is probably similar to that of Theorem 34.3.14. ]
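The motivating formula $\gamma_p'(0) = (dR_p)_e(\gamma'(0))$ behind Definition 34.8.4 can be illustrated for $G = SO(2)$ acting on $M = \mathbb{R}^2$ by rotations. The sketch below takes the curve $\gamma(t) = \mathrm{rot}(st)$ for a real number $s$, so that in the usual matrix picture $V = \gamma'(0)$ is $s$ times the matrix with rows $(0, -1)$ and $(1, 0)$ and $Y_V(p) = Vp = (-s p_2, s p_1)$. The matrix picture, the function names and the finite-difference step are assumptions made for this illustration.

```python
import math

def rotate(theta, p):
    """Action mu(g, p) of the rotation g = rot(theta) in SO(2) on p in R^2."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])

def infinitesimal_transformation(speed, p, dt=1e-6):
    """Estimate Y_V(p) = d/dt (gamma(t) p) at t = 0 for gamma(t) = rot(speed * t).

    A central finite difference stands in for the differential (dR_p)_e(V).
    """
    forward = rotate(speed * dt, p)
    backward = rotate(-speed * dt, p)
    return tuple((f - b) / (2 * dt) for f, b in zip(forward, backward))

def closed_form(speed, p):
    """V p with V = speed * [[0, -1], [1, 0]], the derivative of gamma at 0."""
    return (-speed * p[1], speed * p[0])

p = (3.0, 4.0)
numeric = infinitesimal_transformation(2.0, p)
exact = closed_form(2.0, p)

print(all(abs(n - e) < 1e-6 for n, e in zip(numeric, exact)))
```

The resulting field vanishes at the origin, which is the fixed point of the action, and is tangent to the circles centred there.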
34.8.7 Remark: In principle, one could try to invert the construction in Definition 34.8.4 to obtain a vector field on the group G. This does not seem to be useful, however. For each $p \in S^2$ and $W \in T_p(S^2)$, one may attempt to construct a vector field from the inverse of the map $(dR_p)_g$ at each $g \in G$, such as $(dR_p)_g^{-1}(W)$ or $\ker((dR_p)_g)$. The problem with $(dR_p)_g^{-1}(W)$ is the fact that the linear map $(dR_p)_g$ is not generally injective. Therefore $(dR_p)_g^{-1}(\{W\})$ would be some sort of hyperplane in $T_g(G)$. The subspace $\ker((dR_p)_g)$ of $T_g(G)$ could possibly hold some interest, but it is difficult to see any immediate application. This situation contrasts with the situation where $\sigma_G : G \times G \to G$ yields useful left and right invariant vector fields on Lie groups G.

34.8.8 Remark: The form $(dR_p)_e$ for infinitesimal transformations in Definition 34.8.4 may seem a little surprising, but in the case $M = G$, so that G acts on G by left translation, Theorem 34.4.8 shows that a right-invariant vector field X on G satisfies $X(p) = (dR_p)_e(X(e))$ for all $p \in G$, which matches perfectly with Definition 34.8.4. Therefore an infinitesimal transformation is a generalization of right-invariant vector fields from Lie left transformation groups of the special form $(G, G)$ to general Lie left transformation groups $(G, M)$.

It may seem odd that infinitesimal transformations of a left transformation group are so closely related to right invariant vector fields. The reason for this is that left and right actions of groups commute, as stated in Theorem 34.4.12. Therefore an infinitesimal left action by a group is invariant under right actions of the group; so an infinitesimal left action is a right invariant vector field when the G-manifold M is the group G itself.

Similarly, in the case of the Lie right transformation group $(G, G) -\!\!< (G, G, \sigma_G, \sigma_G)$ (same tuple as the left transformation group but with a different object class, as mentioned in Remark 5.16.6), infinitesimal transformations are infinitesimal right actions, and these are left invariant vector fields on G. In the case of general Lie right transformation groups $(G, M)$, the infinitesimal transformations are of the form $g \mapsto (dL_g)_e(V)$ for $V \in T_e(G)$.

Summarizing this, one may say that on a Lie group, left invariant vector fields are infinitesimal right actions, and right invariant vector fields are infinitesimal left actions.

34.8.9 Definition: The left action on a Lie right transformation group $G -\!\!< (G, M, \sigma_G, \mu)$ by a point $p \in M$ is the map $L_p : G \to M$ defined by $L_p : g \mapsto pg = \mu(p, g)$.

34.8.10 Definition: An infinitesimal transformation of a $C^1$ Lie right transformation group G acting on a $C^1$ manifold M is a vector field $Y_V \in X(M)$ defined for $V \in T_e(G)$ by

$$\forall p \in M, \qquad Y_V(p) = (dL_p)_e(V),$$

where $L_p : G \to M$ is the left action of p on G.

[ Have a definition of infinitesimal transformations for non-Lie transformation groups. Give conditions for meaningfulness. ]

[ Look at composition and commutators of infinitesimal transformations. Should get some sort of Lie algebra out of this, probably corresponding to $T_e(G)$ etc. ]

[ An infinitesimal right action is left invariant and vice versa. ]

[ In the case $M = G$, try to show that $(dR_p)_e(v)$ is left or right invariant, but not usually otherwise. Try to show that for G “bigger” than M, all left invariant vector fields are constant. If M is “bigger” than G, can get infinitely many left invariant vector fields with the same value at a given point. But with $(G, G)$, one field value determines all values everywhere. ]
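The claim in Remark 34.8.8 that an infinitesimal left action is a right invariant vector field when $M = G$ can be spot-checked in a matrix model. This sketch assumes that G is modelled by invertible 2×2 rational matrices, that tangent vectors are matrices, and that $L^T_h$ and $R^T_h$ act by left and right matrix multiplication, so that $Y_V(p) = (dR_p)_e(V) = Vp$. These modelling choices and all names in the code are assumptions made here, not part of the text.

```python
from fractions import Fraction

def mat(a, b, c, d):
    """Build a 2x2 matrix with exact rational entries."""
    return ((Fraction(a), Fraction(b)), (Fraction(c), Fraction(d)))

def mul(p, q):
    """Matrix product p q."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inv(p):
    """Inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = p
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

V = mat(0, 1, 0, 0)  # element of T_e(G), modelled as a matrix
h = mat(1, 0, 1, 1)
x = mat(2, 1, 1, 1)

def Y(p):
    """Y_V(p) = (dR_p)_e(V) = V p for the left action of G on itself."""
    return mul(V, p)

def right_translate_field(h, field):
    """R^F_h(X)(x) = R^T_h(X(x h^{-1})), with R^T_h W = W h in this model."""
    return lambda x: mul(field(mul(x, inv(h))), h)

def left_translate_field(h, field):
    """L^F_h(X)(x) = L^T_h(X(h^{-1} x)), with L^T_h W = h W in this model."""
    return lambda x: mul(h, field(mul(inv(h), x)))

right_invariant = right_translate_field(h, Y)(x) == Y(x)
left_invariant = left_translate_field(h, Y)(x) == Y(x)

print(right_invariant, left_invariant)
```

For this choice of $V$ and $h$ the field is right invariant but not left invariant, because $hVh^{-1} \ne V$. A single sample point illustrates the pattern described in the remark and is not a proof.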
Chapter 35 Differentiable fibre bundles

35.1 Differentiable fibre bundles with non-Lie structure group
35.2 Differentiable fibre bundles with Lie structure group
35.3 Vector fields on differentiable fibre bundles
35.4 Differentiable principal fibre bundles
35.5 Vector fields on differentiable principal fibre bundles
35.6 Associated differentiable fibre bundles
35.7 Vector bundles
35.8 Tangent bundles of differentiable manifolds
35.9 Tangent frame bundles

Chapter 23 on topological fibre bundles should be read before this chapter. Differentiable fibre bundles require also the definitions of differentiable manifolds (Chapters 27–33) and Lie groups (Chapter 34). Differentiable fibrations are defined in Section 27.12. The following overview table compares the regularity conditions for definitions of topological and differentiable fibre bundles.

component         symbol                              topological fibre bundle   C^k differentiable fibre bundle   analytic fibre bundle
total space       $E$                                 topological space          C^k differentiable manifold       analytic manifold
base space        $B$                                 topological space          C^k differentiable manifold       analytic manifold
fibre space       $F$                                 topological space          C^k differentiable manifold       analytic manifold
structure group   $G$                                 topological group          Lie group                         Lie group
projection map    $\pi : E \to B$                     continuous                 C^k                               analytic
fibre charts      $\phi : E \mathrel{\mathring{\to}} F$   continuous             C^k                               analytic
group operation   $\sigma : G \times G \to G$         continuous                 analytic                          analytic
left action       $\mu : G \times F \to F$            continuous                 C^k                               analytic

Whereas a topological fibre bundle specifies a topology for each space, a differentiable fibre bundle specifies an atlas for each space. Any level of regularity beyond continuity requires atlases. These regularity-indicating atlases must not be confused with fibre atlases which indicate global fibre structure.

Although pathwise parallelism may be defined on topological fibre bundles, the definitions of connections in Chapter 36 require differentiable fibre bundles because a connection is a differential representation of pathwise parallelism. Pathwise parallelism is then calculated by integrating a connection along a path. Just as topological fibre bundles (Chapter 23) are the natural structure for defining parallelism (Chapter 24), differentiable fibre bundles (Chapter 35) are the natural structure for defining connections (Chapters 36–38).

[ For differentiable fibre bundles, see Sulanke and Wintgen [85], II.1, and Choquet-Bruhat [60], I.12. ]
676
35. Differentiable fibre bundles
35.1. Differentiable fibre bundles with non-Lie structure group

Differentiable fibrations (fibre bundles without a structure group) are presented in Section 27.12. This section presents differentiable fibre bundles with a non-Lie structure group, which means that the structure group is not a differentiable manifold. It could be useful to examine differentiable fibre bundles whose base space is a finite-dimensional manifold but whose total space and fibre space are more general structures. Such generality is not (currently) presented here. Thus in this section, the base space, total space and fibre space are assumed to be manifolds, but the structure group is assumed to be only a topological space, which may be the space of all diffeomorphisms of the fibre space. Definition 35.1.1 is essentially identical to Definition 27.12.7.

35.1.1 Definition: A $C^k$ (differentiable) fibration with fibre space $F$ for a $C^k$ differentiable manifold $F \mathrel{-\!\!<} (F, A_F)$ and $k \in \bar{\mathbb{Z}}^+_0$ is a tuple $(E, \pi, B) \mathrel{-\!\!<} (E, A_E, \pi, B, A_B, A^F_E)$ which satisfies:
(i) $(E, A_E)$ and $(B, A_B)$ are $C^k$ manifolds and $\pi : E \to B$ is $C^k$;
(ii) $\forall \phi \in A^F_E$, $\exists U_\phi \in \mathrm{Top}(B)$, $\phi : \pi^{-1}(U_\phi) \to F$ is $C^k$ and $\pi \times \phi : \pi^{-1}(U_\phi) \to U_\phi \times F$ is a $C^k$ diffeomorphism;
(iii) $\bigcup_{\phi \in A^F_E} U_\phi = B$.
35.1.2 Remark: Definition 35.1.3 defines differentiable fibre bundles in terms of topological transformation groups of $C^k$ automorphisms $(G, F)$, where the group $G$ does not have a differentiable structure but the fibre space $F$ does. This kind of transformation group is given by Definition 34.6.5. A possibly suitable name for such a fibre bundle would be a “semi-differentiable fibre bundle”.
35.1.3 Definition: A $C^k$ (differentiable) $(G, F)$ fibre bundle with non-Lie structure group for an effective topological $C^k$ left transformation group $(G, F) \mathrel{-\!\!<} (G, T_G, F, A_F, \sigma_G, \mu^F_G)$ for $k \in \bar{\mathbb{Z}}^+_0$ is a tuple $(E, \pi, B) \mathrel{-\!\!<} (E, A_E, \pi, B, A_B, A^F_E)$ which satisfies:
(i) $(E, A_E)$ and $(B, A_B)$ are $C^k$ manifolds and $\pi : E \to B$ is $C^k$;
(ii) $\forall \phi \in A^F_E$, $\exists U_\phi \in \mathrm{Top}(B)$, $\phi : \pi^{-1}(U_\phi) \to F$ is $C^k$ and $\pi \times \phi : \pi^{-1}(U_\phi) \to U_\phi \times F$ is a $C^k$ diffeomorphism;
(iii) $\bigcup_{\phi \in A^F_E} U_\phi = B$;
(iv) $\forall \phi_1, \phi_2 \in A^F_E$, $\forall b \in U_{\phi_1} \cap U_{\phi_2}$, $\exists g \in G$, $\phi_2 \circ (\phi_1|_{E_b})^{-1} = L_g$.
35.2. Differentiable fibre bundles with Lie structure group
The fibre bundles presented in this section are assumed to have structure groups which are finite-dimensional differentiable manifolds. Therefore all of the relevant spaces are manifolds: the base space, total space, fibre space and structure group. Although this is very convenient for analysis, the more general case of structure groups which are not finite-dimensional manifolds may be useful in applications. This more general case is presented in Section 35.1.

35.2.1 Remark: The definition of a topological fibre bundle in Section 23.3 involves four topological spaces: the total space $E$, the base space $B$, the fibre space $F$, and the structure group $G$. The definition also specifies three maps: the projection map $\pi : E \to B$, the group operation $\sigma : G \times G \to G$ and the group action $\mu : G \times F \to F$. Additionally there is an atlas $A^F_E$ of fibre maps $\phi : \pi^{-1}(U) \to F$ for open sets $U \in \mathrm{Top}(B)$.

A differentiable fibre bundle is the same except that the topologies are replaced with atlases and continuity is replaced with differentiability. The additional structures to be specified are differentiable manifold atlases for all four spaces, and the maps are required to be suitably differentiable.

The differentiable $(G, F)$ fibre bundle in Definition 35.2.2 satisfies the conditions for a topological $(G, F)$ fibre bundle in Definition 23.6.4 if the atlases $A_E$, $A_B$, $A_G$ and $A_F$ are replaced with the corresponding induced topologies $T_E = \mathrm{Top}(E)$, $T_B = \mathrm{Top}(B)$, $T_G = \mathrm{Top}(G)$ and $T_F = \mathrm{Top}(F)$. The other elements of the specification tuple stay the same.

35.2.2 Definition: A $C^k$ (differentiable) $(G, F)$ fibre bundle for an effective $C^k$ Lie left transformation group $(G, F) \mathrel{-\!\!<} (G, A_G, F, A_F, \sigma_G, \mu^F_G)$ for $k \in \bar{\mathbb{Z}}^+_0$ is a tuple $(E, \pi, B) \mathrel{-\!\!<} (E, A_E, \pi, B, A_B, A^F_E)$ which satisfies:
(i) $(E, A_E)$ and $(B, A_B)$ are $C^k$ manifolds and $\pi : E \to B$ is $C^k$;
(ii) $\forall \phi \in A^F_E$, $\exists U_\phi \in \mathrm{Top}(B)$, $\phi : \pi^{-1}(U_\phi) \to F$ is $C^k$ and $\pi \times \phi : \pi^{-1}(U_\phi) \to U_\phi \times F$ is a $C^k$ diffeomorphism;
(iii) $\bigcup_{\phi \in A^F_E} U_\phi = B$;
(iv) $\forall \phi_1, \phi_2 \in A^F_E$, $\forall b \in U_{\phi_1} \cap U_{\phi_2}$, $\exists g \in G$, $\beta_{b,\phi_2} \circ \beta_{b,\phi_1}^{-1} = L_g$, where $\beta_{b,\phi} = \phi|_{\pi^{-1}(\{b\})} : \pi^{-1}(\{b\}) \approx F$ for $\phi \in A^F_E$ and $b \in U_\phi$;
(v) $\forall \phi_1, \phi_2 \in A^F_E$, the function $g_{\phi_1,\phi_2} : U_{\phi_1} \cap U_{\phi_2} \to G$ defined by $L_{g_{\phi_1,\phi_2}(b)} = \beta_{b,\phi_1} \circ \beta_{b,\phi_2}^{-1}$ is of class $C^k$.
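As a worked consequence of conditions (iv) and (v) of Definition 35.2.2, the transition functions satisfy the familiar cocycle identity. The following sketch assumes that $L_g$ denotes the left action map $f \mapsto \mu^F_G(g, f)$ on $F$, so that $L_g \circ L_h = L_{gh}$, and it uses the effectiveness of $(G, F)$ to pass from maps back to group elements. The third chart $\phi_3$ is introduced here only for illustration.

```latex
% Assumption: L_g(f) = \mu^F_G(g, f), hence L_g \circ L_h = L_{gh}.
% For \phi_1, \phi_2, \phi_3 \in A^F_E and b \in U_{\phi_1} \cap U_{\phi_2} \cap U_{\phi_3}:
\begin{align*}
L_{g_{\phi_1,\phi_2}(b)} \circ L_{g_{\phi_2,\phi_3}(b)}
  &= \bigl(\beta_{b,\phi_1} \circ \beta_{b,\phi_2}^{-1}\bigr)
     \circ \bigl(\beta_{b,\phi_2} \circ \beta_{b,\phi_3}^{-1}\bigr) \\
  &= \beta_{b,\phi_1} \circ \beta_{b,\phi_3}^{-1}
   = L_{g_{\phi_1,\phi_3}(b)} .
\end{align*}
% Effectiveness (L_g = L_h implies g = h) then gives
\[
g_{\phi_1,\phi_2}(b)\, g_{\phi_2,\phi_3}(b) = g_{\phi_1,\phi_3}(b),
\qquad
g_{\phi_1,\phi_1}(b) = e,
\qquad
g_{\phi_2,\phi_1}(b) = g_{\phi_1,\phi_2}(b)^{-1} .
\]
```

The second and third identities follow from the first by taking $\phi_2 = \phi_1$ and $\phi_3 = \phi_1$ respectively.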
35.2.3 Remark: The Lie transformation group in Definition 35.2.2 is defined in Section 34.7. Although the group itself is assumed to be analytic, the action on the fibre $F$ is only assumed to be $C^k$. It turns out that if the group is assumed to be only $C^k$, it will be analytic anyway, although this is non-trivial to prove. It also turns out that if the action $\mu^F_G : G \times F \to F$ of $G$ on $F$ is $C^k$ with respect to $F$, then it is also $C^k$ with respect to $G$. (See Remark 34.2.1.)

The structure group $G$ for a fibre bundle has the role of classifying parallelism according to how much structure is preserved. If the structure group is large, not much fibre structure is preserved under parallel translations. A small structure group ensures that more structure is preserved.

[ Should have a theorem which can be used to prove Remark 35.2.4. ]

35.2.4 Remark: The per-fibre-set charts $\beta_{b,\phi} : \pi^{-1}(\{b\}) \approx F$ in Definition 35.2.2 must be of class $C^k$ because $\pi \times \phi$ is of class $C^k$.

35.2.5 Definition: An analytic $(G, F)$ fibre bundle for an effective analytic Lie left transformation group $(G, F) \mathrel{-\!\!<} (G, A_G, F, A_F, \sigma_G, \mu^F_G)$ is a differentiable $(G, F)$ fibre bundle $(E, \pi, B) \mathrel{-\!\!<} (E, A_E, \pi, B, A_B, A^F_E)$ such that
(i) $(E, A_E)$ and $(B, A_B)$ are analytic manifolds;
(ii) the projection map $\pi$ is analytic;
(iii) for all $\phi \in A^F_E$, $\phi : \pi^{-1}(U_\phi) \to F$ is analytic and $\pi \times \phi : \pi^{-1}(U_\phi) \to U_\phi \times F$ is analytic;
(iv) the transition maps $g_{\phi_1,\phi_2}$ for $\phi_1, \phi_2 \in A^F_E$ are analytic.
35.2.6 Remark: The analyticity of the map $\mu^F_G$ in Definition 35.2.5 is implied by the weaker condition that the action map $\mu^F_G : G \times F \to F$ be analytic with respect to $F$ only. (See Remark 34.7.4.) But since this implication is non-trivial, it is best to state the analyticity requirement explicitly for both $G$ and $F$.

[ This is really a comment on Lie transformation groups. Should move this remark to the relevant section! ]

[ Near here define differentiable fibre bundle homomorphisms, diffeomorphisms/isomorphisms, direct products etc. as in Section 23.7. Also maybe define $C^k$ compatible charts, $C^k$ equivalent atlases etc. See Definition 23.6.14. ]
35.3. Vector fields on differentiable fibre bundles

[ Show relations of these vector fields to automorphisms $L^b_{g,\phi}$ etc. Can something be done with $L^{b_1,b_2}_{g,\phi_1,\phi_2}$? ]

This section deals with vector fields on the total space of a differentiable fibre bundle, both globally and locally on fibre sets $E_p$ at individual points $p \in M$. Also discussed is the relation of such fields to the structure group action on the total space via the fibre charts.

Vector fields may be generated by a Lie group on the individual fibres at points of a base space as in Section 34.8. These may be thought of as “differential actions” of the group. The term “infinitesimal transformations” is used by EDM2 [34], 431.G in the context of Lie transformation groups. Such vector fields may sometimes be extended to cover an entire fibre bundle total space by using an atlas of fibre charts. At each point the field is generated by some element of the Lie algebra of the Lie group, but this element will generally depend on the choice of chart and the point in the base space.

If the vector field generated by a structure group is parametrized by vectors in the tangent space of the base space, then one may define a special class of such families for which the vector field depends linearly on the base space vector. These kinds of vector field families are used for defining connections on differentiable fibre bundles.
[ All vector field concepts for differentiable fibre bundles should be generalized to topological fibre bundles in the sense of transformations, as opposed to infinitesimal transformations. ]

[ Here define and discuss vector fields whose domain is $E_p$ and range is $T(E)$ from some total space $E$. These vector fields are not valued in $T(E_p)$ because the vectors might not be vertical. In particular, consider vector fields on $E_p$ with a constant horizontal component. ]

35.3.1 Remark: Vector fields may be defined on the differentiable manifolds $E$, $B$, $G$ and $F$ of a differentiable fibre bundle $(E, \pi, B) \mathrel{-\!\!<} (E, A_E, \pi, B, A_B, A^F_E)$ with structure group $(G, F) \mathrel{-\!\!<} (G, A_G, F, A_F, \sigma_G, \mu^F_G)$. Various kinds of invariant fields and infinitesimal transformations are of special interest on these manifolds. On the group $G$, left and right invariant vector fields are defined in Sections 34.3 and 34.4. On the fibre space $F$, infinitesimal transformations are defined in Section 34.8. In this section, similar fields are defined on the total space $E$. Related fields are defined for principal fibre bundles in Section 35.5.

Some special kinds of vector fields (or vector bundle “cross-sections”) on ordinary fibre bundles are summarized in the following table.

field                       parameters                                             formula                        description
$X^L_u \in X(G)$            $u \in T_e(G)$                                         $g \mapsto (dL^G_g)_e(u)$      left invariant vector field
$X^R_u \in X(G)$            $u \in T_e(G)$                                         $g \mapsto (dR^G_g)_e(u)$      right invariant vector field
$X^F_u \in X(F)$            $u \in T_e(G)$                                         $f \mapsto (dR_f)_e(u)$        infinitesimal transformation
$X^E_{u,\phi} \in X(E)$     $u \in T_e(G)$, $\phi \in A^F_E$                       $\beta^*_{b,\phi} X^F_u$       infinitesimal transformation via charts
$X^E_{u,v,\phi} \in X(E)$   $u \in T_e(G)$, $v \in T_b(B)$, $\phi \in A^F_E$                                      non-vertical infinitesimal transformation
35.3.2 Remark: The fields $X^L_u$ and $X^R_u$ in Remark 35.3.1 are specific to Lie groups and have nothing to do with the fibre space or the differentiable fibre bundle. The field $X^F_u$ is specific to the Lie transformation group $(G, F)$ and has nothing to do with the differentiable fibre bundle. Only the fields $X^E_{u,\phi}$ and $X^E_{u,v,\phi}$ are specific to the differentiable fibre bundle. However, all of these fields are related.

35.3.3 Remark: The vector $u \in T_e(G)$ in Remark 35.3.1 may be set equal to the derivative $\gamma'(0) \in T_e(G)$ of a differentiable curve $\gamma : \mathbb{R} \to G$ with $\gamma(0) = e \in G$. This suggests the natural generalization to a differentiable fibre bundle with non-Lie structure group. In this way, the vector field $X^E_{u,\phi}$ may be generalized to the vector field $X$ on $\mathrm{Dom}(\phi)$ defined by

$$\forall z \in \mathrm{Dom}(\phi), \quad X(z) = \partial_t \bigl( L^{\pi(z)}_{\gamma(t),\phi}(z) \bigr) \Big|_{t=0},$$

where the left action $L^b_{g,\phi} : E_b \to E_b$ is defined by $L^b_{g,\phi} : z \mapsto (\phi|_{E_b})^{-1}(g\phi(z))$. The corresponding generalization of the infinitesimal left action $X^F_u$ to non-manifold groups $G$ is discussed in Remark 34.8.2. The corresponding left and right invariant fields $X^L_u$ and $X^R_u$, which do not require curves for their definition, are discussed in Sections 34.3 and 34.4 respectively.
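To see how the curve-based field $X$ of Remark 35.3.3 relates to the chart-based field in the table of Remark 35.3.1, one may push it forward through the chart. The following is only a sketch. It assumes that $G$ is a Lie group, that $\gamma'(0) = u$, and that $R_f : G \to F$ denotes the map $g \mapsto gf$, which is the notation used for $X^F_u$ in the table.

```latex
% For z \in E_b with b = \pi(z), the definition of L^b_{g,\phi} gives
\[
\phi\bigl(L^{b}_{\gamma(t),\phi}(z)\bigr) = \gamma(t)\,\phi(z) = R_{\phi(z)}(\gamma(t)),
\qquad
\pi\bigl(L^{b}_{\gamma(t),\phi}(z)\bigr) = b .
\]
% Differentiating both identities at t = 0, using \gamma(0) = e and \gamma'(0) = u:
\[
(d\phi)_z\bigl(X(z)\bigr) = (dR_{\phi(z)})_e(u) = X^F_u(\phi(z)),
\qquad
(d\pi)_z\bigl(X(z)\bigr) = 0 .
\]
% So X is vertical, and its image under the chart is the
% infinitesimal transformation X^F_u evaluated at \phi(z).
```

Under these assumptions, $X$ therefore agrees with the pull-back field $X^E_{u,\phi}$ on $\mathrm{Dom}(\phi)$.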
[ In the following, must clarify exactly what $T(E_b)$ means in relation to $T(E)$. A few other things need detailed justification here too. ]

The field $X^E_{u,\phi}$ on $E$ is defined as the pull-back via a chart $\phi \in A^F_E$ of the field $X^F_u$ on $F$. This field on $E$ is generally fibre chart dependent. It is constructed from $\beta_{b,\phi} = \phi|_{E_b}$, where $E_b = \pi^{-1}(\{b\})$. Since $\beta_{b,\phi} : E_b \to F$ is a diffeomorphism, the pull-back map $\beta^*_{b,\phi} : T(F) \to T(E_b)$ is well-defined, and $X^E_{u,\phi} = (b \mapsto \beta^*_{b,\phi}(X^F_u))$ is a well-defined vector field on $U_\phi = \pi(\mathrm{Dom}(\phi))$ which is vertical in the sense that $\pi_*(X^E_{u,\phi})$ is the zero vector field on $U_\phi$.

35.3.4 Remark: The vector field $X^E_{u,\phi}$ in Remark 35.3.1 is entirely vertical by virtue of its construction. If this vector field is restricted to a single fibre set $E_b$, it generates a family of automorphisms of $E_b$, which are in the set $\mathrm{Aut}_G(E_b)$ introduced in Notation 23.8.5. A natural generalization is to vary the base point $b$. Then it is possible to generate a non-vertical vector field via 1-parameter families of isomorphisms in the sets $\mathrm{Iso}_G(E_{b_1}, E_{b_2})$ for $b_1, b_2 \in B$ which were introduced in Notation 23.8.4.

Consider a curve $\gamma : I \to B \times G$ for some open interval $I \subseteq \mathbb{R}$ with $0 \in I$ and $\gamma(0) = (b, e)$. A vector field may be generated from this curve if the derivative

$$X(z) = \partial_t \bigl( L^{b,\gamma_1(t)}_{\gamma_2(t),\phi,\phi}(z) \bigr) \Big|_{t=0}$$

is well-defined for all $z \in E_b$, where $\gamma_1(t)$ and $\gamma_2(t)$ are the $B$ and $G$ components of $\gamma(t)$ respectively, and the fibre set isomorphisms $L^{b_1,b_2}_{g,\phi_1,\phi_2}$ are defined in Notation 23.8.15. Then the curve $\gamma$ generates a vector field on $E_b$ which is not necessarily vertical. In fact, $\pi_*(X(z)) = \gamma_1'(0)$ for all $z \in E_b$. This means that the horizontal component of $X(z)$ has the same value $\gamma_1'(0) \in T_b(B)$ for all $z \in E_b$. This kind of vector field is well-defined even for differentiable fibrations and fibre bundles with non-Lie structure group.
When $G$ is a Lie group, the vector field $X$ defined here is given the notation $X^E_{u,v,\phi}$, where $u \in T_e(G)$ and $v \in T_b(B)$. The vectors $u$ and $v$ may be thought of as the vertical and horizontal components of the vector field respectively. For a non-manifold structure group $G$, a more general notation is required.

Vector fields of the form $X^E_{u,v,\phi}$ are exactly what are required for defining connections on general differentiable fibre bundles. Since parallelism is defined in terms of the isomorphism maps $L^{b_1,b_2}_{g,\phi_1,\phi_2}$ in Section 23.8, it is not at all surprising that connections are defined in terms of vector fields such as $X^E_{u,v,\phi}$, which are differentials of such isomorphisms.
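The description of $u$ and $v$ as vertical and horizontal components can be made concrete in the local trivialization $\pi \times \phi$. The sketch below rests on an assumption about Notation 23.8.15, which is not reproduced in this chapter: it takes the fibre set isomorphism to be $L^{b_1,b_2}_{g,\phi_1,\phi_2} = \beta_{b_2,\phi_2}^{-1} \circ L_g \circ \beta_{b_1,\phi_1}$. It also assumes a curve $\gamma = (\gamma_1, \gamma_2)$ with $\gamma(0) = (b, e)$, $\gamma_1'(0) = v$ and $\gamma_2'(0) = u$.

```latex
% Assumption (hypothetical reading of Notation 23.8.15):
%   L^{b_1,b_2}_{g,\phi_1,\phi_2} = \beta_{b_2,\phi_2}^{-1} \circ L_g \circ \beta_{b_1,\phi_1} .
% Then for z \in E_b and z(t) = L^{b,\gamma_1(t)}_{\gamma_2(t),\phi,\phi}(z):
\[
\pi\bigl(z(t)\bigr) = \gamma_1(t),
\qquad
\phi\bigl(z(t)\bigr) = \gamma_2(t)\,\phi(z) = R_{\phi(z)}(\gamma_2(t)) .
\]
% Differentiating at t = 0 with X(z) = z'(0):
\[
(d\pi)_z\bigl(X(z)\bigr) = v,
\qquad
(d\phi)_z\bigl(X(z)\bigr) = (dR_{\phi(z)})_e(u),
\]
% so that in the trivialization \pi \times \phi,
\[
\bigl(d(\pi \times \phi)\bigr)_z\bigl(X^E_{u,v,\phi}(z)\bigr)
  = \bigl(v,\ (dR_{\phi(z)})_e(u)\bigr) .
\]
```

Under these assumptions the chart image of the field is linear in the pair $(u, v)$, and setting $v = 0$ recovers the vertical field $X^E_{u,\phi}$.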
35.4. Differentiable principal fibre bundles

Differentiable principal fibre bundles are the customary structure on which to define a connection. The reason for this is that parallel transport for all associated fibre bundles of a given principal fibre bundle (PFB) may be defined in terms of such a connection. However, this advantage of PFBs relative to ordinary fibre bundles (OFBs) is partly illusory. Connections may be defined on any OFB, and parallel transport is then defined on associated fibre bundles by copying the fibre chart transition maps. This all becomes clearer in Chapter 36.

A principal fibre bundle with structure group $G$ is the same thing as a $(G, G)$ ordinary fibre bundle. It is always possible to define a right action $\mu^P_G : P \times G \to P$ by group elements in $G$ acting on the total space $P$. This right action adds no new information because it is defined in terms of the other components of the definition.

In the following definitions, the notations $L_g$ and $R_g$ are shorthand for the left and right actions of group elements respectively. So for $g$ in a group $G$, $L_g : g' \mapsto gg'$ and $R_g : g' \mapsto g'g$.
35.4.1 Definition: A $C^k$ (differentiable) principal (fibre) bundle with structure group $G$ for $k \in \bar{\mathbb{Z}}^+_0$ and a Lie group $G \mathrel{-\!\!<} (G, A_G, \sigma_G)$ is a $C^k$ $(G, G)$ fibre bundle $(P, q, B) \mathrel{-\!\!<} (P, A_P, q, B, A_B, A^G_P)$ for $(G, G) \mathrel{-\!\!<} (G, A_G, G, A_G, \sigma_G, \sigma_G)$.

The right action of $G$ on $P$ is the operation $\mu^P_G : P \times G \to P$ defined by $\mu^P_G(z, g) = (\phi|_{P_{q(z)}})^{-1}(\sigma_G(\phi(z), g))$ for $(z, g) \in P \times G$ for any $\phi \in A^G_P$ with $z \in \mathrm{Dom}(\phi)$.

The right transformation group of the principal fibre bundle $(P, q, B)$ is the $C^k$ Lie right transformation group $(G, P) \mathrel{-\!\!<} (G, A_G, P, A_P, \sigma_G, \mu^P_G)$.

A $C^k$ principal fibre bundle with structure group $G$ is also called a $C^k$ (differentiable) principal $G$-bundle or a $C^k$ (differentiable) $G$-bundle. If the regularity class $C^k$ is not specified, it is assumed to be $C^1$.

35.4.2 Remark: Definition 35.4.1 is illustrated in Figure 35.4.1. As usual, the notations $P_b = q^{-1}(\{b\})$ and $\beta_{b,\phi} = \phi|_{q^{-1}(\{b\})} = \phi|_{P_b}$ for $b \in q(\mathrm{Dom}(\phi))$ are adopted here for Definition 35.4.1.

[Figure 35.4.1: Principal fibre bundle with $\mathrm{Dom}(\phi) = q^{-1}(U)$. Labelled items: $q^{-1}(U) \subseteq P$ with $R_g$; $G$ with $L_g$; $U \subseteq B$; $U \times G \subseteq B \times G$; maps $q$, $\phi$ and $q \times \phi$.]
35.4.3 Theorem: The right action $\mu^P_G$ of a $C^k$ principal fibre bundle $(P, q, B, A^G_P)$ for $k \in \bar{\mathbb{Z}}^+_0$ is the unique map $\mu^P_G : P \times G \to P$ which satisfies:
(i) $\forall z \in P$, $\forall g \in G$, $q(\mu^P_G(z, g)) = q(z)$; (That is, $q(zg) = q(z)$.)
(ii) $\forall \phi \in A^G_P$, $\forall z \in \mathrm{Dom}(\phi)$, $\forall g \in G$, $\phi(\mu^P_G(z, g)) = \sigma_G(\phi(z), g)$. (That is, $\phi(zg) = \phi(z)g$.)

35.4.4 Remark: The analogous left action to the right action $\mu^P_G$ in Definition 35.4.1 is the map $L^P_{G,\phi} : G \times q^{-1}(U_\phi) \to q^{-1}(U_\phi)$ defined by $L^P_{G,\phi} : (g, z) \mapsto \beta_{q(z),\phi}^{-1}(\sigma_G(g, \phi(z)))$. This left action is chart-dependent.
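The phrase “for any $\phi$” in Definition 35.4.1 presupposes that the right action does not depend on the chart, in contrast to the left action of Remark 35.4.4. A short check may be sketched as follows. It assumes that two charts $\phi_1, \phi_2 \in A^G_P$ are related on the fibre $P_b$ by a left translation, $\phi_2|_{P_b} = L_h \circ \phi_1|_{P_b}$ for some $h \in G$, which is condition (iv) of Definition 35.2.2 with $F = G$. The name $h$ is introduced only for this illustration.

```latex
% Assumption: \phi_2(z) = h\,\phi_1(z) for z \in P_b, hence
% (\phi_2|_{P_b})^{-1}(k) = (\phi_1|_{P_b})^{-1}(h^{-1} k) for k \in G.
\begin{align*}
(\phi_2|_{P_b})^{-1}\bigl(\phi_2(z)\,g\bigr)
  &= (\phi_1|_{P_b})^{-1}\bigl(h^{-1}\,(h\,\phi_1(z))\,g\bigr) \\
  &= (\phi_1|_{P_b})^{-1}\bigl(\phi_1(z)\,g\bigr)
  \qquad \text{(associativity in $G$)} .
\end{align*}
% The same computation for the left action gives
\[
(\phi_2|_{P_b})^{-1}\bigl(g\,\phi_2(z)\bigr)
  = (\phi_1|_{P_b})^{-1}\bigl(h^{-1} g\, h\,\phi_1(z)\bigr),
\]
% which equals (\phi_1|_{P_b})^{-1}(g\,\phi_1(z)) only when g commutes with h.
```

So right multiplication passes through a change of chart unchanged, whereas left multiplication is conjugated by the transition element.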
35.4.5 Remark: The requirements for principal fibre bundles are summarized in the following table. Topological principal fibre bundles are defined in Section 23.9.

component        symbol                                topological principal G-bundle   $C^k$ principal G-bundle   analytic principal G-bundle
total space      $P$                                   topological space                $C^k$ manifold             analytic manifold
base space       $B$                                   topological space                $C^k$ manifold             analytic manifold
structure group  $G$                                   topological group                Lie group                  Lie group
projection map   $\pi : P \to B$                       continuous                       $C^k$                      analytic
fibre charts     $\phi : P \mathrel{\mathring\to} G$   continuous                       $C^k$                      analytic
group operation  $\sigma : G \times G \to G$           continuous                       analytic                   analytic
right action     $\mu : P \times G \to P$              continuous                       $C^k$                      analytic
A rough and simple family tree for differentiable fibre bundles is illustrated in Figure 35.4.2.

[Figure 35.4.2: family tree for differentiable fibre bundles; nodes include differentiable manifold $(X, A_X)$ and differentiable transformation group $(G, F)$.]

[...] $0$ and $z \in E_{\gamma(t)}$,

$$\partial_t \Theta^\gamma_{s,t}(z)
  = \partial_t \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix}
    \begin{pmatrix} z^1 \\ z^2 \end{pmatrix}
  = \frac{\partial\alpha}{\partial t}
    \begin{pmatrix} -\sin\alpha & -\cos\alpha \\ \cos\alpha & -\sin\alpha \end{pmatrix}
    \begin{pmatrix} z^1 \\ z^2 \end{pmatrix},$$

where $\partial_t \alpha = V^1$ and $V = (V^1, V^2) = \gamma'(t)$, assuming the identity chart on $F$ and its tangent space $T(F)$. It follows that $\theta_V(z) = \partial_t^+ \Theta^\gamma_{s,t}(z)$ depends only on $V = \gamma'(t)$ for $x^1 > 0$ and is linear with respect to $\gamma'(t)$. Therefore all three two-sided assumptions (1), (2) and (3) above are satisfied for $x^1 > 0$. For $x^1 < 0$, $\partial_t \Theta^\gamma_{s,t}(z)$ satisfies the same equation as above with $\partial_t \alpha = -V^1$. However, when $x^1 = 0$, $\partial_t^+ \alpha = |V^1|$ and $\theta_V(z) = \partial_t^+ \Theta^\gamma_{s,t}(z)$ is well-defined for $V = \partial_t^+ \gamma(t)$, but $\partial_t \Theta^\gamma_{s,t}(z)$ is not defined for $V^1 \neq 0$. Therefore the three one-sided assumptions (1′), (2′) and (3′) above are satisfied for $x^1 = 0$, but not the two-sided assumptions because $\partial_t \Theta^\gamma_{s,t}(z)$ is not defined when $\gamma(t)^1 = 0$ and $\gamma'(t)^1 \neq 0$.

36.2.7 Theorem: [ If the conditions in Remark 36.2.3 hold, then the parallel transport $\Theta$ equals the integral of the connection $\theta$ which is the differential of the parallel transport $\Theta$. ]

[ Show (?) how differential parallelism can be derived from pathwise parallelism using additivity of pathwise curvature. Then use the Stokes Theorem and exterior derivatives in a differentiable manifold to derive the Riemann curvature tensor. Try to get an explicit formula for the horizontal lift functions as derivatives of the group element $g_{f,\phi_1,\phi_2}(b_1, b_2)$ for a parallelism $f$. It should then be possible to express curvature directly as some sort of second-order derivative of the parallelism. ]

[ Must also present differentiability of parallelism for differentiable fibrations or differentiable fibre bundles with non-Lie structure group. ]
36.2.8 Remark: [ Since the connection is the derivative of parallel transport, try to generalize connections to almost-everywhere function classes like $L^1$. Then use some sort of Sobolev space for the parallelism and see if the conversions between connections and parallelisms give back the same structures for such generalized function classes. Similar considerations arise for conversions between the two-point metric and Riemannian metric (tensor field) in Section 39.3, particularly Remark 39.3.1. ]
[ Put a new section here for “Horizontal lift functions for differentiable fibrations”? ]

36.3. Horizontal lift functions for ordinary fibre bundles

36.3.1 Remark: Section 34.8 discusses vector fields (infinitesimal actions) generated on a $G$-manifold by the action of a Lie group $G$. The extension of this concept to differentiable fibre bundles is discussed in Section 35.3. A connection may be represented as a family of vector fields on the total space of a differentiable fibre bundle, which are generated by structure group actions and parametrized by base space direction vectors.

[ Near here, define families of vector fields on the total space of a differentiable fibre bundle which depend linearly on tangent vectors in the base space. Then one may define a special class of these which are generated by the structure group. Condition (ii) requires that $(V \mapsto \theta_V(z)) \in \mathrm{Lin}(T_{\pi(z)}(M), T_z(E))$ for all $z \in E$, but the map $V \mapsto \theta_V$ for fixed $p \in M$ is from $T_p(M)$ to a space of vector fields whose domain is $E_p$ and range is $T(E)$. So this is a family of fields which depend linearly on the family parameter. See Section 35.3. ]

36.3.2 Remark: Definition 36.3.5 defines connections on ordinary fibre bundles instead of principal fibre bundles. This makes it possible, for example, to define affine connections on tangent bundles of differentiable manifolds rather than the more usual coordinate frame bundles. The connection introduced in Definitions 36.3.5 and 36.3.12 could be called an “OFB connection” to distinguish it from the “PFB connection” in Definition 36.5.3.

[Figure 36.3.1: Horizontal lift function on an ordinary fibre bundle. Labelled items: total space $E$ with points $z_1$, $z_2$; vectors $\theta_V(z_1) \in T_{z_1}(E)$ and $\theta_V(z_2) \in T_{z_2}(E)$; projection $\pi_E$; base space $M$ with point $p$ and vector $V \in T_p(M)$.]

36.3.3 Remark: Definition 36.3.5 specifies a connection by fixing a vector $V$ in the base space and stating how all points in the fibre attached to the base point $p$ move in a parallel fashion when the point $p$ is moved
in the direction $V$. This is illustrated in Figure 36.3.1. This may be thought of as a vector field on the total space for each base space velocity. Definition 36.3.12 does the reverse. In Definition 36.3.12, an element of the fibre at a point in the base space is fixed, and then the parallel motion of that element is specified for each velocity of the base point. This may be thought of as a linear map from the space of base space velocities to the space of fibre velocities for each fibre element.

36.3.4 Remark: Part (iv) of Definition 36.3.5 is specified in terms of the differential of the right action $R_f$ of points $f \in F$ on elements of the group $G$. For all $f \in F$, $R_f : G \to F$ is defined by $R_f : g \mapsto gf$. Such “differential actions” $dR_f$ are discussed more fully in Section 34.8.

[ $(dR_{\phi(z)})_e(u)$ is right-invariant. Therefore it is an infinitesimal left action!? ]

36.3.5 Definition: A (horizontal) lift function on a $C^1$ $(G, F)$ fibre bundle $(E, \pi_E, M)$ is a function $\theta : T(M) \to \bigcup_{p \in M} (E_p \to T(E))$ which satisfies
(i) $\forall p \in M$, $\forall V \in T_p(M)$, $\forall z \in E_p$, $\theta_V(z) \in T_z(E)$,
(ii) $\forall z \in E$, $(V \mapsto \theta_V(z)) \in \mathrm{Lin}(T_{\pi_E(z)}(M), T_z(E))$,
(iii) $\forall p \in M$, $\forall V \in T_p(M)$, $\forall z \in E_p$, $(d\pi_E)_z(\theta_V(z)) = V$,
(iv) $\forall p \in M$, $\forall V \in T_p(M)$, $\forall \phi \in A^F_{E,p}$, $\exists u \in T_e(G)$, $\forall z \in E_p$, $(d\phi)_z(\theta_V(z)) = (dR_{\phi(z)})_e(u)$.

[ Re-express (iv) as an invariance rule for vector fields rather than in terms of a specific choice of $u \in T_e(G)$, which is unnecessarily restrictive. ]

[ How does $(dR_{\phi(z)})_e(u)$ in (iv) compare to infinitesimal left actions $\gamma'(0)$? ]

36.3.6 Remark: The maps and spaces in Definition 36.3.5 are illustrated in Figure 36.3.2.

[Figure 36.3.2: Maps and spaces for a horizontal lift function on an ordinary fibre bundle. Labelled items: $G$ with $e$ and $T_e(G) \ni u$; $F$ with $\phi(z)$ and $T_{\phi(z)}(F) \ni (d\phi)_z(\theta_V(z))$; $E_p$ with $z$ and $T_z(E) \ni \theta_V(z)$; $M$ with $p$ and $T_p(M) \ni V$; maps $R_{\phi(z)}$, $(dR_{\phi(z)})$, $\phi$, $(d\phi)_z$, $\pi_E$, $(d\pi_E)_z$, $\theta_V$, and the tangent bundle projections $\pi_{T(G)}$, $\pi_{T(F)}$, $\pi_{T(E)}$, $\pi_{T(M)}$.]

36.3.7 Remark: Condition (ii) means that for fixed $p \in M$, $\theta|_{T_p(M)}$ is a linear map.

[ Section 35.3 will discuss this kind of field on $E_p$ with values in $T(E)$. It is not entirely clear why this map should be linear. Should give some explanation of why this condition is required. ]

Condition (iii) means that the horizontal component of $\theta_V$ is $V$. This is because a fibre moving in a parallel fashion along a path must move so that the base point of the fibre follows the path.

Condition (iv) of Definition 36.3.5 is a kind of invariance condition. It means that the vertical component (with respect to a particular choice of fibre chart $\phi$) of the connection value $\theta_V(z)$ for fixed $V$ is the action of an element of the Lie algebra $T_e(G)$ on the fibre space $F$. The connection is not really invariant under the group action. The connection is a group action. So the fibre structure is invariant under the action of the connection. Although the connection $\theta$ is chart-independent, the vector $u \in T_e(G)$ depends on the choice of fibre chart.

[ There should be some sort of transition rule for calculating $u$ for all fibre charts if it is given for only one fibre chart. ]

In a sense, elements of the Lie algebra of a Lie group are really members of the group itself. This may be compared with distribution theory and Radon measures, where continuous density functions and point masses are considered to be members of the same space. Similarly, the elements of the Lie algebra may be considered as the continuous part whereas the group elements are the discrete part of a combined space.

The linear map $(d\phi)_z$ in condition (iv) takes the vertical component of $\theta_V(z)$ relative to a fibre chart $\phi$. (More objectively, $(d\phi)_z$ removes the horizontal component. The vertical component depends on the fibre chart.) Each fibre chart may define a different vertical component. The map $(d\phi)_z$ is “orthogonal” with respect to $(d\pi_E)_z$ in the sense that $\ker(d\phi)_z \cap \ker(d\pi_E)_z = \{0\}$. (This follows from the $C^1$ diffeomorphism $\pi_E \times \phi : \pi_E^{-1}(U) \approx U \times F$ for $U = \pi_E(\mathrm{Dom}(\phi))$.) The map $(d\pi_E)_z$ removes the vertical component of vectors in $T_z(E)$.
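Remark 36.3.7 observes that the vector $u \in T_e(G)$ in condition (iv) of Definition 36.3.5 depends on the fibre chart. One possible form of the transition rule can be sketched as follows. This is not taken from the text and should be read as a tentative derivation under stated assumptions: $g = g_{\phi_2,\phi_1}$ is the $C^1$ transition function of Definition 35.2.2 (v), so that $\phi_2(z) = g(\pi_E(z))\,\phi_1(z)$; $r_h : G \to G$ denotes right translation $k \mapsto kh$; $c_h : G \to G$ denotes conjugation $k \mapsto hkh^{-1}$ with differential $\mathrm{Ad}_h$ at $e$. The symbols $u_1$, $u_2$, $h$, $r_h$ and $c_h$ are introduced only for this illustration.

```latex
% Notation for this sketch: f = \phi_1(z), h = g(p), and
% (d\phi_1)_z(\theta_V(z)) = (dR_f)_e(u_1) by condition (iv) for \phi_1.
% Condition (iii) gives (d\pi_E)_z(\theta_V(z)) = V.
% Product rule for \phi_2 = \mu^F_G(g \circ \pi_E, \phi_1):
\[
(d\phi_2)_z(\theta_V(z))
  = (dR_f)_h\bigl((dg)_p(V)\bigr) + (dL_h)_f\bigl((dR_f)_e(u_1)\bigr) .
\]
% Since R_f = R_{hf} \circ r_{h^{-1}} and L_h \circ R_f = R_{hf} \circ c_h:
\[
(dR_f)_h = (dR_{hf})_e \circ (dr_{h^{-1}})_h ,
\qquad
(dL_h)_f \circ (dR_f)_e = (dR_{hf})_e \circ \mathrm{Ad}_h .
\]
% Hence, with hf = \phi_2(z),
\[
(d\phi_2)_z(\theta_V(z)) = (dR_{\phi_2(z)})_e(u_2),
\qquad
u_2 = \mathrm{Ad}_{g(p)}\, u_1 + (dr_{g(p)^{-1}})_{g(p)}\bigl((dg)_p(V)\bigr) .
\]
```

Under these assumptions $u_2$ does not depend on $z \in E_p$, which is consistent with the order of quantifiers in condition (iv).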
36.3.8 Remark: The role of structure groups of fibre bundles is shown by the definition of a connection. A connection is required to preserve the structure of the fibre which is indicated by the structure group. Therefore the value of the connection is the differential of a group action acting on the fibre bundle.

[ Must define the lift function for vector fields. Definition 36.3.9 should not be expressed in terms of a vector field lift. There should be a more basic way of doing this. It should be a theorem that a $C^k$ connection will lift a $C^k$ vector field to a $C^k$ vector field on the total space. ]

36.3.9 Definition: A connection of class $C^k$ for $k \in \bar{\mathbb{Z}}^+_0$ on a $C^{k+1}$ differentiable $(G, F)$ fibre bundle $\xi = (E, \pi_E, M, A^F_E)$ is a connection $\theta$ on $\xi$ such that $\mathrm{lift}_\theta(X) \in X^k(E)$ for all $X \in X^k(M)$.

36.3.10 Remark: If a connection is $C^k$, then it seems reasonable that the integral of the connection along a $C^{k+1}$ path should be a $C^{k+1}$ function of the base point. In particular, if the connection is continuous, then the parallel transport along a $C^1$ path should be $C^1$.

[ Should express this more precisely in a theorem. ]

[ Define the lift of a path. Maybe do this in Section 36.8. ]

36.3.11 Remark: The connection in Definition 36.3.5 is transposed in the equivalent Definition 36.3.12. If $\theta$ satisfies Definition 36.3.5, then $\bar\theta$ defined by $\bar\theta_z(V) = \theta_V(z)$ satisfies Definition 36.3.12 and vice versa. Definition 36.3.12 uses the same notation $R_f$ for the right action of an element of the fibre space $F$ as Definition 36.3.5.

36.3.12 Definition: A (transposed) (horizontal) lift function on a $C^1$ $(G, F)$ fibre bundle $(E, \pi_E, M)$ is a map $\bar\theta : E \to \bigcup_{z \in E} \mathrm{Lin}(T_{\pi_E(z)}(M), T_z(E))$ which satisfies
(i) $\forall z \in E$, $\bar\theta_z \in \mathrm{Lin}(T_{\pi_E(z)}(M), T_z(E))$,
(ii) $\forall z \in E$, $(d\pi_E)_z \circ \bar\theta_z = \mathrm{id}_{T_{\pi_E(z)}(M)}$,
(iii) $\forall p \in M$, $\forall V \in T_p(M)$, $\forall \phi \in A^F_E$, $\exists u \in T_e(G)$, $\forall z \in \pi_E^{-1}(\{p\})$, $(d\phi)_z(\bar\theta_z(V)) = (dR_{\phi(z)})_e(u)$.

[ Should $u$ in (iii) be notated as $u_{V,\phi}$? ]

36.3.13 Remark: The vector $u$ in Definition 36.3.12 (iii) is (probably) not defined in case $G$ is the group of all diffeomorphisms. But $\gamma'(0)$ is defined for such a group.

36.3.14 Definition: A connection $\bar\theta$ on a $C^{k+1}$ $(G, F)$ fibre bundle $(E, \pi_E, M)$ for $k \in \bar{\mathbb{Z}}^+_0$ is said to be of class $C^k$ if $\mathrm{lift}_{\bar\theta}(X) \in X^k(E)$ for all $X \in X^k(M)$.

[ Must define $C^k$ regularity of a connection $\bar\theta$ first in terms of a chart, and then give a theorem relating this to the effect on vector fields. ]
36.4. Curvature of connections on ordinary fibre bundles

36.4.1 Remark: Curvature is, broadly speaking, the deviation of parallelism from flatness. In the case of affine connections and tangent bundles of manifolds, the curvature may be measured with the Riemann curvature tensor. In the case of connections on general fibre bundles, a more general measure of curvature is required. If a connection is defined on a general differentiable fibration whose structure group is not a finite-dimensional Lie group, then curvature must be defined in terms of vector fields on the fibre space because then there is no finite-dimensional Lie algebra which can be used for defining connection forms and curvature forms.

36.4.2 Remark: Suppose $\gamma : \mathbb{R}^2 \to M$ is a 2-parameter $C^2$ family of curves in the base space $M$ of a $C^2$ differentiable $(G, F)$ fibre bundle $\xi = (E, \pi, M, A_E^F)$. Suppose that $\theta$ is a $C^1$ horizontal lift function on $\xi$. Then one may define parallel transport on $\xi$ along two different paths starting at $\gamma(0, 0)$ and ending at $\gamma(s, t)$, the first path $\gamma_1$ initially following the $s$ axis, the other path $\gamma_2$ initially following the $t$ axis. Let $\tilde\gamma_k^z : \mathbb{R}^2 \to E$ denote the lift function with value $\tilde\gamma_k^z(0, 0) = z \in E_{\gamma(0,0)}$ along path $\gamma_k$ for $k = 1, 2$. Then (probably)

$$\tilde\gamma_1^z(s, t) = z + \int_0^s \theta_{V(u,0)}(\tilde\gamma_1^z(u, 0))\,du + \int_0^t \theta_{W(s,u)}(\tilde\gamma_1^z(s, u))\,du$$

and

$$\tilde\gamma_2^z(s, t) = z + \int_0^t \theta_{W(0,u)}(\tilde\gamma_2^z(0, u))\,du + \int_0^s \theta_{V(u,t)}(\tilde\gamma_2^z(u, t))\,du,$$

where $V(s, t) = \partial_s \gamma(s, t)$ and $W(s, t) = \partial_t \gamma(s, t)$ for $(s, t) \in \mathbb{R}^2$. The difference between $\tilde\gamma_1^z$ and $\tilde\gamma_2^z$ is a measure of curvature. (This is roughly illustrated in Figure 36.4.1.) Using the Stokes Theorem, it is possible to express this difference in terms of a suitable differential of the horizontal lift function.

[Figure 36.4.1: Curvature concept for connection on ordinary fibre bundle. The figure shows the fibres $E_{\gamma(0,0)}$, $E_{\gamma(s,0)}$, $E_{\gamma(0,t)}$ and $E_{\gamma(s,t)}$ over the corresponding base points, the initial point $z$, and the two lifted end points $\tilde\gamma_1^z(s, t)$ and $\tilde\gamma_2^z(s, t)$.]
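The path dependence described in Remark 36.4.2 can be checked numerically in a toy case. The following is only a sketch under assumptions that are not in the text: the bundle is the trivial bundle over the plane with fibre the real line, only the fibre coordinate of the lift is tracked, and the functions `A_s` and `A_t` are hypothetical connection coefficients chosen so that the "curvature" $\partial_s A_t - \partial_t A_s$ equals 1. The two lifts then differ by the area $st$ of the parameter rectangle, which is what the Stokes Theorem predicts for this choice.

```python
def integrate(f, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Hypothetical connection coefficients (an assumption for illustration):
# the fibre coordinate changes at rate A_s per unit of s and A_t per unit of t.
def A_s(s, t):
    return 0.0

def A_t(s, t):
    return s

def lift_path1(z, s, t):
    """Fibre coordinate after following the s axis first, then the t direction."""
    return (z
            + integrate(lambda u: A_s(u, 0.0), 0.0, s)
            + integrate(lambda u: A_t(s, u), 0.0, t))

def lift_path2(z, s, t):
    """Fibre coordinate after following the t axis first, then the s direction."""
    return (z
            + integrate(lambda u: A_t(0.0, u), 0.0, t)
            + integrate(lambda u: A_s(u, t), 0.0, s))

def holonomy_defect(z, s, t):
    """Difference between the two lifts; a crude measure of curvature."""
    return lift_path1(z, s, t) - lift_path2(z, s, t)

print(holonomy_defect(0.0, 2.0, 3.0))  # close to 2.0 * 3.0 for this choice of A_s, A_t
```

With `A_s` and `A_t` both constant, the defect vanishes for every rectangle, which corresponds to a flat connection in this toy model.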
36.5. Horizontal lift functions for principal fibre bundles

[ Can define curvature by fixing an initial $z$, then taking the Poisson bracket of vector fields of $f$ in directions $V_1$, $V_2$. ]
[ Must derive all PFB definitions from OFB definitions. This may be done by converting between OFBs and PFBs using associated fibre bundles, or by defining PFBs as a special case of an OFB, with the additional right group action on the total space. Should then re-derive the OFB definitions from PFB definitions. It is likely that the Lie algebra element $u$ for a PFB will be exactly the same as for the OFB. ]
[ Define PFB connections in terms of OFB connections and vice versa via the definition of “parallel fibre transports”, i.e. parallel cross-sections. ]

36.5.1 Remark: Three alternative definitions of a connection on a principal fibre bundle are given in this chapter: Definitions 36.5.3, 36.9.2 and 36.9.4. They each have advantages and disadvantages.

36.5.2 Remark: Suppose $(P, \pi_P, M)$ is a $C^1$ principal $G$-bundle. (See Definition 35.2.2 for “differentiability of a fibre bundle”. See Definition 35.4.1 for “principal $G$-bundle”.) Then the map $\pi_P : P \to M$ is a $C^1$ map between the $C^1$ manifolds $P$ and $M$. Hence the differential $(d\pi_P)_z : T_z(P) \to T_{\pi_P(z)}(M)$ of the map $\pi_P$ is well-defined for every $z \in P$. Similarly, if $R_g$ denotes the action of a group element $g \in G$ on the manifold $P$, then $R_g$ is a $C^1$ map from $P$ to $P$. Hence the differential $(dR_g)_z : T_z(P) \to T_{zg}(P)$ of the map $R_g$ is well-defined for every $z \in P$ and $g \in G$.

[ Do a transposed version of Definition 36.5.3 with $\rho : T(M) \to (P \mathrel{\mathring{\to}} T(P))$. ]

36.5.3 Definition: A (transposed) (horizontal) lift function on a $C^1$ principal $G$-bundle $(P, \pi_P, M)$ is a map $\bar\rho : P \to \bigcup_{z \in P} \mathrm{Lin}(T_{\pi_P(z)}(M), T_z(P))$ which satisfies
(i) $\forall z \in P,\ \bar\rho_z \in \mathrm{Lin}(T_{\pi_P(z)}(M), T_z(P))$,
(ii) $\forall z \in P,\ (d\pi_P)_z \circ \bar\rho_z = \mathrm{id}_{T_{\pi_P(z)}(M)}$,
(iii) $\forall z \in P,\ \forall g \in G,\ \bar\rho_{zg} = (dR_g)_z \circ \bar\rho_z$.

36.5.4 Remark: The maps in Definition 36.5.3 are illustrated in Figures 36.5.1 and 36.5.2. The map $R_g : P \to P$ is defined for $g \in G$ by $R_g : z \mapsto z.g = \mu^P_G(z, g)$.

[Figure 36.5.1: Maps for transposed horizontal lift function on a PFB]

[ In Figure 36.5.1, $X$ is no longer so relevant? ]
[ In Definition 36.5.3 (iii), try to get also ρ¯gz = (dLg )ρ¯z ◦ ρ¯z or something similar. ]
[ Show that Definition 36.5.3 (iii) implies that $\rho_\cdot(v)$ is an infinitesimal left translation. ]
[ The lift function in Definition 36.5.3 is equivalent to a right invariant vector field through the charts. Therefore equivalent to an infinitesimal left action. ]
[Figure 36.5.2: Right actions and the lift function of a connection]
36.5.5 Definition: A connection $\bar\rho$ on a $C^{k+1}$ fibre bundle $(P, \pi_P, M)$ is said to be of class $C^k$ for $k \in \overline{\mathbb{Z}}{}^+_0$ if $\mathrm{lift}_{\bar\rho}(X) \in X^k(P)$ for all $X \in X^k(M)$.
36.5.6 Remark: The conditions of Definition 36.5.3 may be interpreted as follows:
(i) the velocity of change of $n$-frame is a linear function of the velocity of the base point;
(ii) the “horizontal component” of the connection is the identity;
(iii) the velocity of transformation of the tangent space is the differential of a group action on the fibre bundle.

36.5.7 Remark: If Definition 36.5.3 is studied line by line, it is less complex than it seems at first. In part (i), $\bar\rho_z(v)$ means the direction in which to move $z$ relative to the “coordinates” when moving the point $p = \pi_P(z)$ in the direction $v \in T_{\pi_P(z)}(M)$. In other words, $\bar\rho_z(v)$ means the rate of change of $z$ required in the direction $v$ in order to keep $z$ parallel to the starting value. A practical example of this would be an airplane flying along a geodesic from New York to Paris. To keep the airplane in an orientation parallel to the initial orientation, it is necessary to adjust the bearing of the airplane relative to the longitude/latitude coordinate system. In this case, $z$ would be the airplane’s orientation, $v$ is the direction of travel of the airplane, and $\bar\rho_z(v)$ is the rate at which the orientation must be changed to keep the airplane moving in a parallel fashion. Figure 36.5.3 illustrates roughly how the connection $\bar\rho$ determines parallelism at a short distance from a point $z \in P$.

[Figure 36.5.3: Local parallelism determined by a connection $\bar\rho$]

The projection of $z$ onto $M$ is $\pi_P(z) \in M$. The vector $v \in T_{\pi_P(z)}(M)$ is a tangent vector at $\pi_P(z)$ which indicates a direction of movement for the point $p = \pi_P(z)$. If this point $p$ is moved by the amount $v$ to $p + v$, then the vector in $\pi_P^{-1}(\{p + v\})$ which is parallel to $z$ will look like $z + \bar\rho_z(v) + o(v)$. In other words, the vector $z \in P$ is moved by the small amount $\bar\rho_z(v) + o(v) \in T_z(P)$. (This interpretation is not very rigorous. It is only intended to give an intuitive interpretation of the connection.) The term $o(v)$ is of order smaller than $v$ as $v \to 0$. Note that $\bar\rho_z(v)$ has horizontal and vertical components. The horizontal component is the displacement horizontally by the vector $v$, which implies that it contains no real information. The vertical component is the deviation of $z$ away from the point it would have been at $p + v$ if the vertical component of $z$ had been left unchanged relative to the coordinate system. Similarly, when the point $p$ is moved in the direction $-v$ to $p - v$, the vertical component of $z$ is translated by $\bar\rho_z(-v) + o(-v)$ to $z + \bar\rho_z(-v) + o(-v)$. The dashed vertical lines at $p \pm v$ represent the parallel transport of $z$ in the case that $\bar\rho$ has no vertical component. The horizontal dashed lines represent the parallel transportation of the vector $z$ under translations $\pm v$ of $p$, which consist of a horizontal component (the straight portion) and a vertical component (the curved portion). This explains part (i) of Definition 36.5.3.
The fact that condition (i) requires linearity of the connection with respect to tangent vectors in $T_{\pi_P(z)}(M)$ implies a reduced amount of information in the connection. The whole map $\bar\rho_z$ is fully specified on any $T_{\pi_P(z)}(M)$ if it is known for any $n$ linearly independent vectors in that space.

Part (ii) of Definition 36.5.3 says that $(d\pi_P)_z(\bar\rho_z(v)) = v$ for all $v \in T_{\pi_P(z)}(M)$. This just means that if $p$ is translated to $p + v$, then the value $z + \bar\rho_z(v) + o(v)$ has a horizontal component equal to $v$. In other words, $\bar\rho_z$ always translates $z$ to a point in $P$ which has the same base point as the point $p + v$. (Once again, this is a very rough first-order description. This description can be made precise, but it is more useful to think here in the language of small displacements which is usual in physics texts.) This condition implies that the horizontal component of the connection actually contains no information.

Part (iii) of Definition 36.5.3 states that the parallel transport vector $\bar\rho_{zg}$ can be obtained from $\bar\rho_z$ by applying the linear transformation $(dR_g)_z$, which is (very roughly speaking) a vertical linear transformation factor of $g$. In other words, all vectors $zg + \bar\rho_{zg}(v) + o(v)$ can be obtained from $z + \bar\rho_z(v) + o(v)$ by applying the transformation $g$. The situation when a group element $g$ is applied is illustrated in Figure 36.5.4.

[Figure 36.5.4: Local parallelism under group action]

Elements $g$ of the group $G$ act on the set $P$. The vector $z.g$ is therefore an element of $P$ such that $\pi_P(z.g) = \pi_P(z) = p$, as shown. The figure shows how the connection $\bar\rho$ transports a vector $zg$ in a parallel fashion for small displacements $\pm v$ from the base point $p$. For instance, $zg$ is transported to $zg + \bar\rho_{zg}(v) + o(v)$ for a base point translation $v$. Just as for parallel transport of the vector $z$, the parallel transport of $zg$ has a horizontal and a vertical component. The vertical component of the connection is invariant under the group $G$. This implies a high level of redundancy of the information in the connection.

Condition (iii) specifies not so much invariance as conservation or preservation. It specifies that the structure of the fibre space must be preserved under parallel translations. The connection $\bar\rho$ is not really invariant in any sense, but it does guarantee that the invariants of the structure group are preserved. This is analogous to the fact that the Christoffel symbols for a connection are not themselves tensors, but their use does preserve the tensor property of tensors which are differentiated with them.

[ To define differentiability of the connections $\bar\rho$, $h$ and $Q$ in Definition 36.5.5, could say that $\bar\rho$ is $C^k$ when $\forall X \in X^k(M),\ \mathrm{lift}_{\bar\rho}(X) \in X^k(P)$. Or could use the condition $\forall X \in X^k(P),\ h_z \circ X \in X^k(P)$. The latter form looks the most natural. ]

36.5.8 Definition: The lift of a vector field $X \in X(M)$ by a connection $\bar\rho$ on a principal $G$-bundle $(P, \pi_P, M)$ is the vector field $X^* = \mathrm{lift}_{\bar\rho}(X) \in X(P)$ defined by
$$\forall z \in P,\quad X^*(z) = \mathrm{lift}_{\bar\rho}(X)(z) = \bar\rho_z(X(\pi_P(z))).$$

36.5.9 Remark: Definition 36.5.8 is illustrated in Figure 36.5.5.

[ The lift is defined after it is used in Definition 36.5.3. Should fix this. ]

36.5.10 Definition: A vertical vector at $z \in P$, in a $C^1$ principal $G$-bundle $P -\!\!< (P, \pi_P, M)$, is a vector $v \in T_z(P)$ which satisfies $(d\pi_P)_z(v) = 0$.
[Figure 36.5.5: Definition 36.5.8 of lift of a vector field]
36.5.11 Notation: $V_z(P)$ denotes the set of vertical vectors at $z$, for a $C^1$ principal $G$-bundle $P -\!\!< (P, \pi_P, M)$ and $z \in P$.

36.5.12 Remark: $V_z(P) = \ker((d\pi_P)_z) \subseteq T_z(P)$ for any $z \in P$, for any $C^1$ principal $G$-bundle $P -\!\!< (P, \pi_P, M)$.

[ Possibly insert a new section on extension of connections to associated fibre bundles near here. ]
[ Insert new section here on connection forms for differentiable fibrations and ordinary fibre bundles? ]
36.6. Connection forms for PFB connections
36.6.1 Remark: A connection form tells you how much a given motion deviates from parallel. In other words, the connection form is the difference between the actual velocity of a curve in the total space and the velocity that the curve should have for parallel motion. Therefore the connection form is zero for parallel motion. Connection forms may be calculated from horizontal lift functions by simply subtracting such functions from an identity function. (Specifically, the vector field dRg in Definition 36.5.3 should be subtracted from the actual fibre motion.)
36.6.2 Remark: A connection form may be interpreted as a measure of angular velocity because it measures the deviation of the rate of change of orientation of fibres relative to parallel motion. Therefore it is not at all surprising that the symbol $\omega$ is commonly used both for connection forms and for angular velocity in mechanics. The analogy is accurate when the structure group is an orthogonal group. For other kinds of groups, the analogy is still useful.

36.6.3 Definition: The connection form on a principal $G$-bundle $(P, \pi_P, M)$ corresponding to a connection $\bar\rho$ on $(P, \pi_P, M)$ is the map $\omega : P \to \bigcup_{z \in P} \mathrm{Lin}(T_z(P), T_e(G))$ such that for all $z \in P$,
(i) $\omega_z \in \mathrm{Lin}(T_z(P), T_e(G))$,
(ii) $\ker \omega_z = \bar\rho_z(T_{\pi_P(z)}(M))$, and
(iii) $\omega_z \circ (dL_z)_e = \mathrm{id}_{T_e(G)}$,
where $(dL_z)_e \in \mathrm{Lin}(T_e(G), T_z(P))$ is the differential at $e \in G$ of the map $L_z : g \mapsto z.g$ from $G$ to $P$.

[ Show the precise relation between the connection form and horizontal lift functions. Probably get $\omega = \mathrm{id} - \bar\rho \circ (d\pi_P)$ or something. This should be the definition of $\omega$, and most of Definition 36.6.3 should then be a theorem. ]
[ The conditions in Definition 36.6.3 should be expressed in plain English also. ]

36.6.4 Remark: In Definition 36.6.3, condition (iii) means that the vertical component of the connection form is the identity.

[ Note that $\bar\rho_z$ is not a differential form on $M$ or $P$. ]
[ Should do absolutely everything for OFB connections first, including this section. Then do everything for PFB connections. Theorem 36.6.5 should generalize to OFBs. The OFB connection form should be of the form $\omega : E \to \bigcup_{z \in E} \mathrm{Lin}(T_z(E), T_e(G))$, or something like that. ]
36.6.5 Theorem: Let $\mu_z = (dL_z)_e$. Then the diagram

$$0 \rightleftarrows T_{\pi_P(z)}(M) \underset{(d\pi_P)_z}{\overset{\bar\rho_z}{\rightleftarrows}} T_z(P) \underset{\mu_z}{\overset{\omega_z}{\rightleftarrows}} T_e(G) \rightleftarrows 0,$$

or equivalently

$$0 \rightleftarrows T_e(G) \underset{\omega_z}{\overset{\mu_z}{\rightleftarrows}} T_z(P) \underset{\bar\rho_z}{\overset{(d\pi_P)_z}{\rightleftarrows}} T_{\pi_P(z)}(M) \rightleftarrows 0,$$

is exact in both directions. (The arrows to and from 0 are zero maps.) That is,
$\bar\rho_z$ is injective,
$\bar\rho_z(T_{\pi_P(z)}(M)) = \ker \omega_z$,
$\omega_z$ is surjective,
$\mu_z$ is injective,
$\mu_z(T_e(G)) = \ker((d\pi_P)_z)$, and
$(d\pi_P)_z$ is surjective.
In addition, $(d\pi_P)_z \circ \bar\rho_z = \mathrm{id}_{T_{\pi_P(z)}(M)}$ and $\omega_z \circ \mu_z = \mathrm{id}_{T_e(G)}$.

$h_z = \bar\rho_z \circ (d\pi_P)_z$ is the horizontal component operator on $T_z(P)$, and $\mu_z \circ \omega_z$ is the vertical component operator. $\mathrm{Im}(\bar\rho_z) = \ker(\omega_z)$ is the space of horizontal vectors, and $\mathrm{Im}(\mu_z) = \ker((d\pi_P)_z)$ is the space of vertical vectors at $z$.

[ $\omega_z$ has been made independent of the choice of fibre chart by shifting it to $e \in G$. The action of $G$ on $P$ gives a fibre-chart-free association between $T_e(G)$ and $T_z(P)$ for any $z \in P$. ]
[ In the following comment, could write $\bar\rho_z(T_{\pi_P(z)}(M)) = Q_z = \ker(\omega_z)$. See Spivak [42], book II, page 336. $Q_z$ is the horizontal space at $z$. Also give a formula for $\rho$ in terms of $\omega$? ]
[ Should try to show that $\omega_z$ is uniquely determined by
$$\omega_z \circ \mu_z = \mathrm{id}_{T_e(G)} \quad\text{and}\quad \bar\rho_z(T_{\pi_P(z)}(M)) \subseteq \ker \omega_z.$$
This should follow from Section 10.11 on exact sequences of linear maps. It should therefore be possible to write $\omega_z$ in terms of $\mu_z$ and $\bar\rho_z$, like for instance
$$\omega_z(y) = \mu_z^{-1}(y - h_z(y)) = \mu_z^{-1}(y - \bar\rho_z \circ (d\pi_P)_z(y))$$
for $y \in T_z(P)$. (This looks very much like the definition of a covariant derivative!) Something similar should be possible in the reverse direction, like $h_z(y) = y - \mu_z \circ \omega_z(y)$, and some expression for $\bar\rho_z$. These expressions could be added to those in Remark 36.9.6. This existence and uniqueness result is necessary for showing that the connection form contains the same information as a connection. ]
[ Mention somewhere the connection of exact sequences with algebraic topology. See for instance Greenberg/Harper [113], page 54, and EDM2 [34] 277.E. ]
[ Mention the relation between the connection form and the exponential map of elements of $T_e(G)$ acting on $P$. Define this action to be $R_{\exp tA} : P \to P$. This is a 1-parameter group of transformations on $P$, which therefore induces a vector field $A^*$ on $P$. Then can replace condition (iii) of Definition 36.6.3 by $\forall A \in T_e(G),\ \omega_z(A^*(z)) = A$. See EDM2 [34] 105.N. ]
[ Interpret the connection form as the rate of change of (position + coordinates) in a manifold to a rate of change of coordinates. The “output” rate of change of coordinates is in fact a correction term which equals zero if the “input” rate of change of coordinates corresponds to parallel (motion + rotation). ]
[ Here define the relation of Christoffel symbols to the connection form. See EDM2 [34], page 1573, 417.B. Probably $\omega^i{}_j = \Gamma^i_{kj}\,dx^k$. ]
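The splitting asserted in Theorem 36.6.5 can be made concrete in the smallest non-trivial case. The sketch below is an illustration only, under assumptions that are not in the text: $T_z(P)$ is modelled by $\mathbb{R}^3$, $T_{\pi_P(z)}(M)$ by $\mathbb{R}^2$, $T_e(G)$ by $\mathbb{R}$, and the pair `A` of numbers is a hypothetical choice of connection coefficients at the single point $z$. The functions `dpi`, `rho`, `mu` and `omega` play the roles of $(d\pi_P)_z$, $\bar\rho_z$, $\mu_z$ and $\omega_z$.

```python
# Hypothetical connection coefficients at one point z (an assumption).
A = (0.5, -2.0)

def dpi(y):
    """Model of (d pi_P)_z : R^3 -> R^2, forgetting the fibre coordinate."""
    return (y[0], y[1])

def rho(v):
    """Model of the horizontal lift rho_z : R^2 -> R^3."""
    return (v[0], v[1], A[0] * v[0] + A[1] * v[1])

def mu(a):
    """Model of mu_z = (dL_z)_e : R -> R^3, the inclusion of vertical vectors."""
    return (0.0, 0.0, a)

def omega(y):
    """Model of the connection form omega_z : R^3 -> R."""
    return y[2] - A[0] * y[0] - A[1] * y[1]

def horizontal(y):
    """h_z = rho_z o (d pi_P)_z, the horizontal component operator."""
    return rho(dpi(y))

def vertical(y):
    """mu_z o omega_z, the vertical component operator."""
    return mu(omega(y))

y = (1.0, 2.0, 3.0)
v = (4.0, -1.0)
print(dpi(rho(v)) == v)        # (d pi_P)_z o rho_z is the identity
print(omega(mu(7.0)) == 7.0)   # omega_z o mu_z is the identity
print(omega(rho(v)) == 0.0)    # horizontal vectors lie in the kernel of omega_z
print(tuple(a + b for a, b in zip(horizontal(y), vertical(y))) == y)
```

In this model the formula $\omega_z(y) = \mu_z^{-1}(y - h_z(y))$ from the author's note reduces to reading off the third coordinate of `y` minus `horizontal(y)`.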
36.7. Covariant derivatives for general connections

[ See manuscript notes. Get $\gamma''(t) - \theta_{\gamma'(t)}(\gamma'(t))$ etc. ]

36.7.1 Remark: It is possible to define a covariant derivative for general connections which is completely analogous to the standard covariant derivative for tangent bundles. The difference is that instead of the covariant derivative being valued in the tangent space of the base manifold, it is valued in the tangent bundle of the fibre space. It just happens in the case of tangent bundles that this fibre space is $\mathbb{R}^n$, which makes it easy to identify the covariant derivative with the tangent space of the base manifold. In the case of a principal fibre bundle, the fibre is the structure group, which happens to be a Lie group. Therefore the covariant derivative for a principal fibre bundle is valued in the tangent space of the Lie group, and this tangent space is identified with the Lie algebra of the group.

[ Can some sort of Riemann curvature tensor be defined for general connections? ]
36.8. Parallel displacement for PFB connections

[ In this section cover parallel displacement as in EDM2 [34] 82.C. This is meaningful for general connections. ]
[ Fibre-to-fibre homeomorphisms should be in the topological fibre bundle chapter maybe. Then cover only the case of connections on differentiable fibre bundles in this section. ]

36.8.1 Remark: The set $\{f : \pi_E^{-1}(\{b_1\}) \approx \pi_E^{-1}(\{b_2\});\ b_1, b_2 \in B\}$ is useful for the discussion of parallel displacement, where $(E, \pi_E, B)$ is a topological fibre bundle of some sort. Since $\pi_E^{-1}(\{b_1\}) \approx \pi_E^{-1}(\{b_2\})$ holds for all $b_1, b_2 \in B$, the set is clearly nonempty for all pairs $(b_1, b_2)$.

[ The set $H(E, \pi_E, B) = \bigcup_{b_1, b_2 \in B} H_{b_1,b_2}(E, \pi_E, B)$ could be some sort of “double fibre bundle” analogous to double tangent bundles $T(M_1, M_2)$. ]

Let $H_{b_1,b_2}(E, \pi_E, B)$ denote the set $\{f : \pi_E^{-1}(\{b_1\}) \approx \pi_E^{-1}(\{b_2\})\}$ of homeomorphisms between the fibres at $b_1$ and $b_2$, and define $H(E, \pi_E, B) = \bigcup_{b_1, b_2 \in B} H_{b_1,b_2}(E, \pi_E, B) = \{f : \pi_E^{-1}(\{b_1\}) \approx \pi_E^{-1}(\{b_2\});\ b_1, b_2 \in B\}$. Define $\bar\pi_E : H(E, \pi_E, B) \to B_1 \times B_2$ by $\bar\pi_E : f \mapsto (b_1, b_2)$ if $f : \pi_E^{-1}(\{b_1\}) \approx \pi_E^{-1}(\{b_2\})$. Then the triple $\eta = (H(E, \pi_E, B), \bar\pi_E, B_1 \times B_2)$ looks a little like a fibre bundle. In fact, $\bar\pi_E^{-1}(\{(b_1, b_2)\}) = H_{b_1,b_2}(E, \pi_E, B)$ for all $b_1, b_2 \in B$. It would be straightforward to construct a natural topology for $H(E, \pi_E, B)$ so that $\eta$ is a topological fibre bundle. Definition 36.8.3 is an experimental definition of this sort of thing.

[ It’s pretty clear that fully general continuous connections are not possible. Must have rectifiable curves. ]

36.8.2 Remark: A definition of parallel translation should associate an element of the homeomorphism space $H_{b_1,b_2}(E, \pi_E, B)$ with every continuous curve from $b_1$ to $b_2$ for $b_1, b_2 \in B$. The set of all curves in $B$ may be represented as the set $C^0([0, 1], B)$ of continuous maps from $[0, 1] \subseteq \mathbb{R}$ to $B$. Then a definition of parallel transport could be represented as a map $\alpha : C^0([0, 1], B) \to H(E, \pi_E, B)$ such that $\alpha(\gamma) \in H_{b_1,b_2}(E, \pi_E, B)$ whenever $\gamma \in C^0([0, 1], B)$ satisfies $\gamma(0) = b_1$ and $\gamma(1) = b_2$. For this definition to be satisfactory, it should satisfy a transitivity rule, namely that if $\gamma$ is the concatenation of two curves $\gamma_1$ and $\gamma_2$ such that $\gamma_1(0) = b_1$, $\gamma_1(1) = \gamma_2(0)$ and $\gamma_2(1) = b_2$, then $\alpha(\gamma) = \alpha(\gamma_2) \circ \alpha(\gamma_1)$.

It would be interesting to know whether mere differentiability and transitivity for a pathwise parallelism as above would suffice to give the complete standard definition for a connection on a differentiable principal fibre bundle. Perhaps the group invariance properties would force the parallelism into the right form.

[ In the case of a vector bundle, Definition 36.8.3 should be replaced with $\bigcup_{b_1, b_2} \mathrm{Hom}(E_{b_1}, E_{b_2})$. ]

36.8.3 Definition: The fibre-to-fibre homeomorphism space of a topological fibre bundle $(E, \pi_E, B)$ is the set $H(E, \pi_E, B)$ defined by
$$H(E, \pi_E, B) = \{f : \pi_E^{-1}(\{b_1\}) \approx \pi_E^{-1}(\{b_2\});\ b_1, b_2 \in B\}.$$
The fibre-to-fibre homeomorphism bundle of the topological fibre bundle $(E, \pi_E, B) -\!\!< (E, T_E, \pi_E, B, T_B)$ is the tuple $(H(E, \pi_E, B), \bar\pi_E, B_1 \times B_2) -\!\!< (H(E, \pi_E, B), T_H, \bar\pi_E, B, T_B)$, where $\bar\pi_E : H(E, \pi_E, B) \to B_1 \times B_2$ is defined by $\bar\pi_E : f \mapsto (b_1, b_2)$ if $f : \pi_E^{-1}(\{b_1\}) \approx \pi_E^{-1}(\{b_2\})$, and the topology $T_H$ is defined by . . .
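The transitivity rule $\alpha(\gamma) = \alpha(\gamma_2) \circ \alpha(\gamma_1)$ of Remark 36.8.2 can be tried out on a toy pathwise parallelism. Everything specific in the sketch below is an assumption made for illustration: the base space is the plane, each fibre is a copy of the real line, curves are polylines given as lists of points, and the fibre map attached to a curve is translation by the line integral of the 1-form $x\,dy$ along it. The names `segment_shift`, `transport` and `concatenate` are hypothetical.

```python
def segment_shift(p, q):
    """Exact integral of the 1-form x dy along the straight segment from p to q."""
    (x0, y0), (x1, y1) = p, q
    return 0.5 * (x0 + x1) * (y1 - y0)

def transport(path):
    """Toy alpha(path): a map from the fibre over path[0] to the fibre over path[-1]."""
    shift = sum(segment_shift(path[i], path[i + 1]) for i in range(len(path) - 1))
    return lambda z: z + shift

def concatenate(path1, path2):
    """Concatenation of two polylines; path1 must end where path2 starts."""
    if path1[-1] != path2[0]:
        raise ValueError("the end of path1 must equal the start of path2")
    return path1 + path2[1:]

gamma1 = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0)]
gamma2 = [(2.0, 1.0), (2.0, 3.0), (0.0, 3.0)]
gamma = concatenate(gamma1, gamma2)

z = 5.0
lhs = transport(gamma)(z)                       # alpha(gamma)
rhs = transport(gamma2)(transport(gamma1)(z))   # alpha(gamma2) o alpha(gamma1)
print(lhs, rhs)

# A different curve with the same end points gives a different fibre map,
# so this parallelism is path-dependent.
other = [(0.0, 0.0), (0.0, 3.0)]
print(transport(other)(z))
```

The translations used here commute, so this toy model cannot show the role of the order of composition; a non-abelian structure group would be needed for that.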
36.9. Alternative definitions for general connections

36.9.1 Remark: Definitions 36.9.2 and 36.9.4 are non-standard. The author prefers Definition 36.5.3.

36.9.2 Definition (→ 36.5.3): A connection on a $C^\infty$ principal $G$-bundle $(P, \pi_P, M)$ is a map $h : P \to \bigcup_{z \in P} \mathrm{Lin}(T_z(P), T_z(P))$ such that
(i) $\forall z \in P,\ h_z \in \mathrm{Lin}(T_z(P), T_z(P))$,
(ii) $\forall z \in P,\ (d\pi_P)_z \circ h_z = (d\pi_P)_z$,
(iii) $\forall z \in P,\ \forall g \in G,\ h_{zg} = (dR_g)_z \circ h_z \circ (dR_g)_z^{-1}$.
The function $h$ is called the horizontal map function of the connection.

[ The conditions in Definition 36.9.2 should also be expressed in plain English. Define $C^k$ regularity of horizontal component function. ]

36.9.3 Theorem: Definitions 36.5.3 and 36.9.2 contain equivalent information under the correspondence $h_z = \bar\rho_z \circ (d\pi_P)_z$. More precisely. . .

36.9.4 Definition (→ 36.5.3): A connection on a $C^\infty$ principal $G$-bundle $(P, \pi_P, M)$ is a map $Q : P \to \mathbb{P}(T(P))$ such that
(i) $\forall z \in P$, $Q_z$ is a subspace of $T_z(P)$,
(ii) $\forall z \in P,\ T_z(P) = V_z(P) \oplus Q_z$,
(iii) $\forall z \in P,\ \forall g \in G,\ (dR_g)_z Q_z = Q_{zg}$,
(iv) the map $z \mapsto Q_z$ is $C^\infty$ in some sense.
The vectors of $Q_z$ are said to be horizontal at $z$.

[ The conditions in Definition 36.9.4 should be expressed in plain English also. Must define $C^k$ regularity of the horizontal subspace function. ]

36.9.5 Theorem: Definitions 36.5.3 and 36.9.4 contain equivalent information under the correspondence $Q_z = \bar\rho_z(T_{\pi_P(z)}(M))$. More precisely, given a map $\bar\rho$ which satisfies Definition 36.5.3, the function $Q : P \to \mathbb{P}(T(P))$ defined by $Q_z = \bar\rho_z(T_{\pi_P(z)}(M))$ for all $z \in P$ satisfies Definition 36.9.4. Conversely, if the map $Q$ satisfies Definition 36.9.4, then the map $\bar\rho : P \to \bigcup_{z \in P} \mathrm{Lin}(T_{\pi_P(z)}(M), T_z(P))$, where for each $z \in P$, $\bar\rho_z$ is the unique right inverse of $(d\pi_P)_z$ such that $\bar\rho_z(T_{\pi_P(z)}(M)) = Q_z$ (guaranteed by Theorem 10.11.10), satisfies Definition 36.5.3.

Proof: First assume that $\bar\rho : P \to \bigcup_{z \in P} \mathrm{Lin}(T_{\pi_P(z)}(M), T_z(P))$ satisfies Definition 36.5.3, and define $Q_z$ for each $z \in P$ by $Q_z = \bar\rho_z(T_{\pi_P(z)}(M))$. Then clearly $Q_z$ is a subspace of $T_z(P)$ for all $z \in P$. So $Q : P \to \mathbb{P}(T(P))$, and condition (i) of Definition 36.9.4 is satisfied. From Theorem 10.11.9, it follows that $T_z(P) = V_z(P) \oplus Q_z$ for all $z \in P$, which verifies condition (ii). The transformation rule $(dR_g)_z Q_z = Q_{zg}$ of Definition 36.9.4 follows from the corresponding rule $\bar\rho_{zg} = (dR_g)_z \circ \bar\rho_z$ of Definition 36.5.3. Indeed
$$(dR_g)_z Q_z = (dR_g)_z(\bar\rho_z(T_{\pi_P(z)}(M))) = ((dR_g)_z \circ \bar\rho_z)(T_{\pi_P(z)}(M)) = \bar\rho_{zg}(T_{\pi_P(z)}(M)) = Q_{zg},$$
where the third equality follows from the fact that $\bar\rho_{zg} = (dR_g)_z \circ \bar\rho_z$, and the last equality uses $T_{\pi_P(zg)}(M) = T_{\pi_P(z)}(M)$.

To show that $Q_z$ is differentiable if $\bar\rho_z$ is differentiable, it is necessary to define clearly what differentiability means for each of these.

Now assume that a connection is given according to Definition 36.9.4. For $z \in P$, define $\bar\rho_z : T_{\pi_P(z)}(M) \to T_z(P)$ to be a linear function such that $(d\pi_P)_z \circ \bar\rho_z = \mathrm{id}_{T_{\pi_P(z)}(M)}$ and $\bar\rho_z(T_{\pi_P(z)}(M)) \subseteq Q_z$. The existence and uniqueness of $\bar\rho_z$ are guaranteed by Theorem 10.11.10. Thus (i) and (ii) of Definition 36.5.3 are satisfied.

To demonstrate condition (iii) of Definition 36.5.3, let $z \in P$, $g \in G$ and $v \in T_{\pi_P(z)}(M)$. Then $\bar\rho_{zg}(v) \in Q_{zg}$ . . .

[ Show that $\bar\rho_{zg} = (dR_g)_z \circ \bar\rho_z$. ]
[ To generalize Definition 36.5.5, need to define the derivatives of $Q_z$. ]

36.9.6 Remark: The relations between the definitions of connection are:

$\bar\rho_z \ \leftrightarrow\ h_z \circ (d\pi_P)_z^{-1} \ \leftrightarrow\ \ldots$
$\bar\rho_z \circ (d\pi_P)_z \ \leftrightarrow\ h_z \ \leftrightarrow\ \text{proj. of } T_z(P) \text{ onto subspace } Q_z$
$\bar\rho_z(T_{\pi_P(z)}(M)) \ \leftrightarrow\ h_z(T_z(P)) \ \leftrightarrow\ Q_z$

Note that $Q$ is more easily derived from $\bar\rho$ or $h$ than vice versa, and between $\bar\rho$ and $h$, $h$ is more easily derived from $\bar\rho$. So $\bar\rho$ gives the best definition. But $h$ is in some senses more natural. For instance, differentiability is probably easier to define in terms of $h$.

[ Should try to include the connection form $\omega$ in the above table. ]

36.9.7 Remark: Definition 36.6.3 is equivalent to Definitions 36.5.3, 36.9.2 and 36.9.4 in the sense that a connection form contains the same information as each of the three definitions of a connection. Thus a connection form may be regarded as an alternative definition for a connection.
[ Give a fourth or fifth definition of connection: see Gallot/Hulin/Lafontaine [19], Definition 2.49, page 69. ]
Chapter 37 Affine connections and covariant derivatives
37.1 Concepts, history and terminology
37.2 Motivation for defining connections on manifolds
37.3 Affine connections on tangent bundles
37.4 Covariant derivatives
37.5 Hessian operators
37.6 Elliptic second-order operator fields
37.7 Curvature and torsion
37.8 Affine connections on principal fibre bundles
37.9 Coefficients of affine connections on principal fibre bundles
37.10 Connections for Lagrangian mechanics
37.0.1 Remark: Affine connections are a special case of the general connections in Chapter 36. For affine connections, the structure group is a general linear group $GL(n)$, and the differentiable fibre bundle is the tangent bundle of an $n$-dimensional manifold or some fibre bundle associated with the tangent bundle. So affine connections are effectively defined on tangent vector bundles of differentiable manifolds.

37.0.2 Remark: If one had to explain differential geometry to a non-mathematician, a simple phrase to summarize the subject might be: “The geometry of curvature.” One might also say: “The geometry of curved spaces.” Curvature is really the core concept of differential geometry which distinguishes it from “normal geometry”. Curvature in turn may be defined as “the deviation of parallel lines from parallel”. In other words, curvature is the tendency of parallel lines to get closer or further away from each other as they are extended. So curvature requires parallelism as a prior concept. In the olden days of Euclidean geometry, parallel lines stayed the same distance apart no matter how far they were extended. (This is related to Euclid’s fifth postulate. See EDM2 [34], 139.A and 285.A.) In a space of positive curvature, parallel lines get closer together. In a space of negative curvature, they move apart. Since parallelism clearly cannot be defined in terms of equidistance in a curved space, alternative definitions are required which do not rely on the definition of distance.

37.0.3 Remark: In this chapter, there are no definitions for distance on spaces. Parallelism is defined here in terms of “connections”, which are an abstraction from the familiar concept of parallelism in flat Euclidean space. It turns out that curvature and parallelism can be very comprehensively defined with no definition of distance at all. Curvature in the absence of a metric is the subject of Section 37.7. (Some minimalist definitions of parallelism and curvature are discussed in Chapter 22 for non-topological manifolds, and in Chapter 23 for topological manifolds.)

37.1. Concepts, history and terminology

37.1.1 Remark: An affine connection defines differential pathwise parallelism for a differentiable manifold. Whereas parallelism for a flat vector space is a simple equivalence relation between the vectors at all points in the space, parallelism for curved spaces is path-dependent. Given any two points in a differentiable manifold, there is generally no absolute relation of parallelism between the tangent vectors at the two points.
706
37. Affine connections and covariant derivatives
An example of this is illustrated in Figure 37.1.1. If a vector is transported in a parallel fashion from the equator to the north pole of the earth, the vector will not be parallel to the same vector transported along a different path. Finish here
Finish here
H` a Nˆ o.i
M¨ unchen
M¨ unchen
Start here Figure 37.1.1
H` a Nˆ o.i
Start here Path-dependent parallel transport on a 2-sphere
If you stand at longitude 0◦ , latitude 0◦ , on the Earth on a surfboard pointed northwards and surf towards the north pole in a parallel fashion, the surfboard will point towards longitude 180◦ . But if you move in a parallel fashion first towards longitude 90◦ and then to the north pole, the surfboard will be pointing towards longitude −90◦ when you arrive at the north pole. (This proves that the Earth is not flat.)
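The surfboard example can be checked numerically. The following is a minimal sketch, not taken from the book, which assumes the Earth is a unit sphere embedded in $\mathbb{R}^3$ with its standard (Levi-Civita) parallelism. Under that assumption, parallel transport along a great-circle arc from $p$ to $q$ is the rotation about the axis $p \times q$ which carries $p$ to $q$. The helper names (`lonlat`, `rotate`, `transport`) are hypothetical.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rotate(v, axis, angle):
    """Rotate v about a unit axis by the given angle (Rodrigues' formula)."""
    c, s = math.cos(angle), math.sin(angle)
    kxv, kdv = cross(axis, v), dot(axis, v)
    return tuple(v[i]*c + kxv[i]*s + axis[i]*kdv*(1 - c) for i in range(3))

def transport(v, p, q):
    """Transport the tangent vector v at p to q along the shorter great-circle
    arc. Assumes p and q are unit vectors which are neither equal nor antipodal."""
    n = cross(p, q)
    norm = math.sqrt(dot(n, n))
    axis = tuple(x / norm for x in n)
    return rotate(v, axis, math.atan2(norm, dot(p, q)))

def lonlat(lon_deg, lat_deg):
    """Unit vector for a point given by longitude and latitude in degrees."""
    lon, lat = math.radians(lon_deg), math.radians(lat_deg)
    return (math.cos(lat)*math.cos(lon), math.cos(lat)*math.sin(lon), math.sin(lat))

start, east, pole = lonlat(0, 0), lonlat(90, 0), lonlat(0, 90)
north = (0.0, 0.0, 1.0)  # the surfboard direction at the starting point

direct = transport(north, start, pole)                           # straight up the meridian
via_east = transport(transport(north, start, east), east, pole)  # equator first, then north

cos_angle = max(-1.0, min(1.0, dot(direct, via_east)))
angle_deg = math.degrees(math.acos(cos_angle))
print(direct, via_east, angle_deg)
```

At the pole, `direct` should point towards longitude 180° (roughly the direction $(-1,0,0)$) and `via_east` towards longitude −90° (roughly $(0,-1,0)$), so the two transported vectors differ by about a quarter turn.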
37.1.2 Remark: The notion of a connection formalizes the path-dependence of “parallelism at a distance” in terms of infinitesimal coordinate frame motions for infinitesimal paths. In other words, a connection specifies parallelism in a differential sense. The notion of a connection was abstracted from Riemannian spaces, but is meaningful for a larger class of differentiable manifolds. The standard connection on a Riemannian manifold is called a Levi-Civita connection, which is defined in Section 39.4.

37.1.3 Remark: The name “affine” comes from affine spaces in which parallelism is an invariant of the affine transformation group. This is a misnomer. Many authors call an affine connection more correctly a linear connection. Affine spaces are presented in Sections 12.1 and 12.2. The historical origin of the term “affine” is discussed in Section 46.3. As mentioned in Remark 15.3.2, the choice of the word “connection” for differential parallelism is also unfortunate. So the term “affine connection” is doubly unfortunate, and perplexing.

Another misfortune of terminology in the topic of affine connections is the lack of a standard term for “differentiable manifolds with an affine connection”. Terms such as “affine space” and “affine geometry” refer to flat-space concepts. The term “affine manifold” apparently refers to manifolds with a torsion-free flat affine connection. Maybe the shortest term which can be used safely is an “affine connection manifold”. In Misner/Thorne/Wheeler [37], chapter 10, the subject of geometry with affine connections but no metric is referred to as “affine geometry”.

[ Should have a table showing conversions between definitions of connections, or maybe later in chapter. See Spivak [42], book II, page 336 diagram. ]

37.1.4 Remark: As mentioned in Remark 19.4.2, tangent vectors may be regarded as invariants of the “figures” (e.g. curves) with respect to $C^1$ diffeomorphisms. When an affine connection is defined on a $C^2$ manifold, the definition of parallelism is the same for all charts in the manifold’s atlas. An affine connection is transformed in such a way that parallel transport works the same regardless of the chart transition diffeomorphisms applied to the manifold. Hence parallelism may be regarded as an invariant of the pseudogroup of chart transition maps of a $C^2$ manifold which has a well-defined affine connection. Since curvature is determined by parallel transport, it follows that curvature is then a chart invariant also.
37.2. Motivation for defining connections on manifolds

[ This section probably belongs in Chapter 36, but needs to be adapted for general connections. ]

37.2.1 Remark: One way to clearly motivate connections on differentiable manifolds is to try to calculate the second derivative of a $C^2$ map $\gamma : \mathbb{R}^2 \to M$ for a $C^2$ manifold $M$. The first derivative with respect to the first parameter is easily expressed in terms of tangent vectors as a map $\gamma_1 : \mathbb{R}^2 \to T(M)$ such that $\gamma_1(s,t) \in T_{\gamma(s,t)}(M)$ for all $(s,t) \in \mathbb{R}^2$. If this is differentiated with respect to the second parameter, the result is a map $\gamma_{12} : \mathbb{R}^2 \to T(T(M))$ such that $\gamma_{12}(s,t) \in T_{\gamma_1(s,t)}(T(M))$ for $(s,t) \in \mathbb{R}^2$. If the derivatives are performed in the opposite order, the result is a map $\gamma_{21} : \mathbb{R}^2 \to T(T(M))$ such that $\gamma_{21}(s,t) \in T_{\gamma_2(s,t)}(T(M))$ for $(s,t) \in \mathbb{R}^2$. The problem with this is that the derivatives $\gamma_{12}(s,t)$ and $\gamma_{21}(s,t)$ are not even in the same tangent space unless $\gamma_1(s,t) = \gamma_2(s,t)$. Therefore there is no way of comparing these quantities. This is not a problem of simple non-commutativity. The problem is that the two second derivatives are not comparable at all.

The core of this problem is the fact that the space $T(T(M))$ consists merely of formal ‘infinitesimals’ of motions within the manifold structure of $T(M)$. A second derivative of the form $\gamma_{12}$ says nothing about the rate of change of $\gamma_1$ with respect to the second parameter because there is no way of comparing vectors $\gamma_1(s,t)$ for different parameter values, in particular for values $(s,t)$ and $(s,t+h)$ for small $h \in \mathbb{R}$. It is clearly desirable to have $\gamma_1(s,t)$ and $\gamma_1(s,t+h)$ in the same vector space so that they can be subtracted to yield some sort of derivative. The obvious way to solve this problem is to define a linear isomorphism between $T_{\gamma(s,t)}(M)$ and $T_{\gamma(s,t+h)}(M)$ for all $h \in \mathbb{R}$ so that $\gamma_1(s,t)$ and $\gamma_1(s,t+h)$ can be subtracted. If this plan is carried through, the result in the limit as $h \to 0$ is a linear map from $T_{\gamma_1(s,t)}(T(M))$ to $T_{\gamma(s,t)}(M)$. This maps $\gamma_{12}(s,t)$ into $T_{\gamma(s,t)}(M)$. If the derivatives are done in the reverse order, then $\gamma_{21}(s,t)$ is also mapped into $T_{\gamma(s,t)}(M)$. In this way, the two second-order derivatives can be compared.

A further difficulty now arises. There is no obvious way to define the linear maps $\alpha_{\gamma_2(s,t)} : T_{\gamma_1(s,t)}(T(M)) \to T_{\gamma(s,t)}(M)$ and $\alpha_{\gamma_1(s,t)} : T_{\gamma_2(s,t)}(T(M)) \to T_{\gamma(s,t)}(M)$. (The subscripts for $\alpha$ are chosen to indicate the direction of the second derivative.) Therefore the difference between the two second-order derivatives has an arbitrary value, $\alpha_{\gamma_2(s,t)}(\gamma_{12}(s,t)) - \alpha_{\gamma_1(s,t)}(\gamma_{21}(s,t)) \in T_{\gamma(s,t)}(M)$.

Although the differential parallelism along curves thus defined is arbitrary in the mathematical sense, the situation is not hopeless because in a flat space, an absolute parallelism is defined independent of paths, whereas in general relativity and mechanics, there are definitions of pathwise parallelism which arise from the physics being modelled. In mechanics, parallelism within a system state space may arise from a least action principle, whereas parallelism is derived from the metric structure in general relativity.

From the above discussion, it is clear that connections arise naturally as a way of giving meaning to second (and higher) order derivatives. A very large number (probably the vast majority) of physical models require second order derivatives. Such derivatives are meaningless if vectors at different points cannot be compared. (More precisely, second order derivatives are defined in the absence of a connection, but they lie in the space $T(T(M))$, which is useless because this is a space of abstract derivatives, and the derivatives in an equation would then all be in different, incomparable spaces $T_z(T(M))$.) So “parallelism at a distance” is a prerequisite for defining the second-order derivatives which are required by most physical models.

When working in flat space, it is difficult to be aware of the role of parallelism in defining second-order derivatives. Curved space exposes this role. The term “curvature” is applied to the extent to which derivatives do not commute. If the curvature is non-zero, the order of applying derivatives becomes important, which is not so in flat space if the function is twice continuously differentiable. Figure 37.2.1 illustrates path-dependent parallelism.

[Figure 37.2.1: Path-dependent parallel transport of a tangent vector. A vector $w \in T_p(M)$ is carried to $w_1 = w + u_1 \in T_{p_1}(M)$ and to $w_2 = w + u_2 \in T_{p_2}(M)$, with chart basis vectors $e_1^{p,\psi}, e_2^{p,\psi}$ drawn at each point and displacements $v_1$, $v_2$, $v_2 - v_1$.]

37.2.2 Remark: As mentioned in Section 25.6, in the connection layer (layer 3), the affine connection is a differential of parallel transport and the parallel transport is an integral of the affine connection. Similarly, in the metric layer (layer 4), the Riemannian metric tensor is a differential of the pointwise distance function and the distance function is an integral of the metric tensor. However, the pointwise distance function is calculated by extremizing the path integral over all paths between two points, whereas the parallel transport is calculated by simply integrating the affine connection over a single path. This raises the question of whether something interesting and useful is obtained by extremizing the path integral of the affine connection. Certainly specially distinguished paths are available in layer 3, namely the geodesic paths. In layer 4, the geodesics which follow the Levi-Civita connection are paths which extremize the point-to-point distance. So is there something special in the parallel transport which is carried by a geodesic curve? If not, why not?
37.3. Affine connections on tangent bundles

[ Define affine connections on Lie transformation groups and on differentiable fibre bundles? ]
[ Is it true that $\theta_v \in X^0(T(\pi^{-1}(\{p\})))$? Maybe $\theta_v(z) \in T_z(E)$, linear with respect to $v$. ]

37.3.1 Remark: If the tangent bundle $T(M)$ is substituted for the total space $E$ in Definition 36.3.12, the result is Definition 37.3.2. In fact, a vector bundle structure is used here. (See Section 35.8 for vector bundles.)
[ Change Definition 37.3.2 to use vector bundles. ]
[ Also do Definition 37.3.2 first for $\theta$? ]

37.3.2 Definition: An affine connection on a $C^1$ tangent bundle $(T(M), \pi, M)$ for a $C^2$ manifold $M$ is a map $\bar\theta : T(M) \to \bigcup_{z \in T(M)} \mathrm{Lin}(T_{\pi(z)}(M), T_z(T(M)))$ such that
(i) $\forall z \in T(M)$, $\bar\theta_z \in \mathrm{Lin}(T_{\pi(z)}(M), T_z(T(M)))$,
(ii) $\forall z \in T(M)$, $(d\pi)_z \circ \bar\theta_z = \mathrm{id}_{T_{\pi(z)}(M)}$,
(iii) $\forall p \in M$, $\forall V \in T_p(M)$, $\forall \phi \in A^{\mathbb{R}^n}_{T(M)}$, $\exists u \in gl(n)$, $\forall z \in T_p(M)$, $(d\phi)_z(\bar\theta_z(V)) = (dR_{\phi(z)})_e(u)$,
where for all $y \in \mathbb{R}^n$, $R_y : GL(n) \to \mathbb{R}^n$ is defined by $R_y : g \mapsto gy$.
The function $\bar\theta$ is called the lift function of the connection.

37.3.3 Remark: The conditions of Definition 37.3.2 may be summarized as (i) linearity with respect to the translation vector, (ii) equality of the horizontal component of the connection to the translation vector, (iii) preservation of fibre space structure under the connection.
[ Must define the lift of a vector field by an affine connection. Should find a way to define regularity of affine connections without using the lift of vector fields. Then have a theorem that a $C^k$ connection lifts $C^k$ fields to $C^k$ fields. ]

37.3.4 Definition: An affine connection of class $C^k$ on a $C^{k+1}$ manifold $M$ for $k \in \bar{\mathbb{Z}}^+_0$ is an affine connection $\bar\theta$ on $M$ such that $\mathrm{lift}_{\bar\theta}(X) \in X^k(T(M))$ for all $X \in X^1(M)$.
[ Should regenerate parallel displacement from the OFB connection. ]

37.3.5 Definition: [Definition of parallel displacement of a tangent space along a path.]
[ See EDM2 [34] 80.C and 80.H for parallel displacement. ]
[ Must show the existence and uniqueness of the parallel transport $\phi_t$ along a path map $\gamma$ in $M$ mapping $\phi_t : T_{\gamma(0)}(M) \to T_{\gamma(t)}(M)$. This leads to a map $\Phi : \Gamma(M) \to G$, where $\Gamma(M)$ is the set of piecewise smooth paths in $M$. ]
[ Give formula for connection form in terms of connection lift functions. ]
[ Somewhere define general “associated connections” for associated fibre bundles of the initial OFB or PFB, using associated parallelism. ]
37.4. Covariant derivatives

37.4.1 Remark: Simple derivatives of vectors with respect to the coordinates of a manifold are not invariant under changes of coordinates. Derivatives of vectors cannot be made coordinate-independent in the same way as derivatives of real-valued functions by applying the coordinate transformation rule in Definition 27.4.7. Naive derivatives of vectors have coordinates which depend on the second-order derivatives of chart transition maps. This is the principal motivation for defining a connection. Parallel transport is defined in order to be able to define covariant derivatives rather than because parallelism is of interest in itself.
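The chart-dependence of naive derivatives can be seen in a small numerical experiment. This is a sketch under stated assumptions, not the book's construction: the manifold is the flat plane with polar coordinates $(r, \theta)$, the connection is the usual flat one, and its Christoffel symbols in the polar chart are the standard values $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$. The component formula $\partial_i v^j + \Gamma^j_{ik} v^k$ is the one quoted in Remark 37.4.3. The function names are hypothetical.

```python
import math

def v(r, th):
    """Polar components (v^r, v^theta) of the constant Cartesian field e_x."""
    return (math.cos(th), -math.sin(th) / r)

def gamma(j, i, k, r):
    """Christoffel symbol Gamma^j_{ik} of the flat plane in polar coordinates.
    Index convention: 0 = r, 1 = theta."""
    if (j, i, k) == (0, 1, 1):
        return -r
    if (j, i, k) in ((1, 0, 1), (1, 1, 0)):
        return 1.0 / r
    return 0.0

def naive(i, j, r, th, h=1e-6):
    """Central-difference estimate of the coordinate derivative d_i v^j."""
    xp, xm = [r, th], [r, th]
    xp[i] += h
    xm[i] -= h
    return (v(*xp)[j] - v(*xm)[j]) / (2 * h)

def covariant(i, j, r, th):
    """Component d_i v^j + Gamma^j_{ik} v^k of the covariant derivative."""
    return naive(i, j, r, th) + sum(gamma(j, i, k, r) * v(r, th)[k] for k in range(2))

r, th = 2.0, 0.7
for i in range(2):
    for j in range(2):
        print(i, j, naive(i, j, r, th), covariant(i, j, r, th))
```

The field is constant in Cartesian coordinates, so it is parallel in the flat sense. Its naive polar derivatives are nevertheless non-zero in general, while the connection-corrected components should vanish up to finite-difference error.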
37.4.2 Remark: Although the covariant derivative is often used as a definition of a connection, it is applicable only to affine connections. Although a covariant derivative does yield an affine connection on the tangent bundle, it is in a sense the opposite of a connection because the covariant derivative gives the deviation of a vector field from parallelism, whereas a connection gives a definition of parallelism itself. Covariant derivatives are defined by subtracting the connection from naive derivatives of vectors.

37.4.3 Remark: The term “covariant derivative” is yet another unfortunate choice of words in the subject of differential geometry. On a differentiable manifold without any connection, a covariant vector transforms as the dual of contravariant vectors, and the contravariant vectors are the ordinary tangent vectors of the manifold’s tangent bundle. Thus “covariant” and “contravariant” vectors are duals of each other. It happens that the differential of a real-valued function on a manifold is a covariant vector. But this has nothing at all to do with affine connections or parallel transport. The “covariant derivative” of a real-valued function, when an affine connection is defined, is identical to the differential of the real-valued function in the differential layer 2 with no connection. But the differential of a vector field (in the absence of a connection) is a mixed covariant/contravariant tensor which is chart-dependent (unless it is “anchored” to a particular chart as suggested in Remark 30.0.4).

When an affine connection is available, the “covariant derivative” of a contravariant vector field on a manifold has a value which is a contravariant vector or vector field. Thus the so-called “covariant derivative” yields a contravariant object, which seems somewhat confusing. However, the value of the covariant derivative of a contravariant vector field varies as the dual of the vector (or vector field) which is used for the differentiating. In this sense, the value of the covariant derivative does vary “covariantly”. Unfortunately there is a further confusion of terminology here because a “covariant vector” varies as the dual of the “contravariant vectors”, and the contravariant vectors are the ordinary tangent vectors of the tangent bundle. So “covariant” really means “contravariant” and vice versa. (This source of confusion is also discussed in Remark 13.6.2.)

In tensor calculus, the term “covariant” is applied to anything which uses a subscript index and the term “contravariant” corresponds to superscript indices. Since the application of the “covariant derivative” adds a subscript index to the components of the object being differentiated, the word “covariant” does seem appropriate. But the important attribute of the “covariant derivative” is that it takes into account the affine connection. Thus $\partial_i v^j$ is not the covariant derivative of $v$, whereas the components $\partial_i v^j + \Gamma^j_{ik} v^k$ do correspond to the covariant derivative of a vector field $v$.

37.4.4 Remark: The covariant derivative of a vector field is defined in terms of an affine connection $\theta$ on a $C^2$ manifold in Definition 37.4.5.
[ Regarding $\bar\theta_{X(\pi(V))}(V)$ in Definition 37.4.5, see Spivak [42], II. page 317 for $h(Y) = Y - \sum_j \omega^j(Y)$ etc. Instead of $\bar\theta_{X(\pi(V))}(V)$, could write $\bar\theta_V(X(\pi(V)))$ etc.? The vector field $X \in X^1(M)$ can probably be a general diff. cross-section of a diff. fibre bundle? Perhaps the drop function $\varpi$ should be a different letter rather than $\omega$? Must summarize the relations between $\theta$, $\rho$, $\omega$ (connection form), $\Gamma^i_{jk}$ and $D_V X$. ]

37.4.5 Definition: The covariant derivative of a vector field $X \in X^1(M)$ on a $C^2$ manifold $M$ with respect to a connection $\theta$ on $M$ is the map $D : T(M) \times X^1(M) \to T(M)$ defined by
$\forall V \in T(M), \forall X \in X^1(M), \quad D_V X = \varpi_{X(\pi(V))}(d_V X - \bar\theta_{X(\pi(V))}(V))$,
where $\varpi$ is the “drop” function for $T(M)$ and $\pi$ is the projection map for $T(M)$.
[ Must remember to define $d_V X$ in Definitions 37.4.5 and 37.4.7. ]

37.4.6 Remark: Definition 37.4.5 is probably clearer in terms of a fixed $p = \pi(V)$.
$\forall V \in T_p(M), \forall X \in X^1(M), \quad D_V X = \varpi_{X(p)}(d_V X - \bar\theta_{X(p)}(V))$.
This is illustrated in Figure 37.4.1. It certainly is not necessary that $V$ be a vector field as many texts require. Definition 37.4.7 gives the case that $V$ is a vector field.

[Figure 37.4.1: Covariant derivative of a vector field by a fixed vector. The figure shows $d_V X \in T_{X(p)}(T(M))$, the vertical difference $d_V X - \theta_V(X(p)) \in \ker((d\pi)_{X(p)})$, and $D_V X = \varpi_{X(p)}(d_V X - \theta_V(X(p))) \in T_p(M)$, with $\pi$ projecting $T(M)$ onto $M$.]
37.4.7 Definition: The covariant derivative of a vector field $X \in X^1(M)$ on a $C^2$ manifold $M$ with respect to a connection $\theta$ on $M$ is the map $D : X^0(M) \times X^1(M) \to X^0(M)$ defined by
$\forall V \in X^0(M), \forall X \in X^1(M), \forall p \in M, \quad (D_V X)(p) = \varpi_{X(p)}(d_V X(p) - \bar\theta_{X(p)}(V(p)))$.

37.4.8 Definition: [Definition of covariant differential of a tensor field.]

37.4.9 Remark: This simple relation between the abstract definition of an affine connection and the concrete definition of a covariant derivative is not shown in any text consulted by the author. The texts which do give a relation all explain it in terms of integral curves and parallel transport. (Note also that the OFB connection is used here, not the PFB connection.) The negative sign in the above equations for the covariant derivative is due to the fact that covariant derivatives measure deviation from parallelism whereas the connection defines parallelism itself. Therefore they must be in an opposite relation to each other.
[ Give the rule for explicitly converting a connection $\theta$ into $\Gamma^i_{jk}$ via the covariant derivative $D_V$. In other words, find the relation between the components of $C$ and $\Gamma^i_{jk}$ for any given chart. (See EDM2 [34], page 1573, section 417.B.) The connection form $\omega$ is just the identity map minus the connection $\theta$ or $\rho$. The Christoffel symbol is just the matrix of components of the connection form $\omega$. Should have a table showing how to convert between all formulations of an affine connection. ]
37.4.10 Remark: The following definition is a little abstract. However, it should be possible to use this high-level definition to develop an explicit definition of covariant derivatives in terms of the connection $\bar\rho$. In fact, it should be possible to derive the covariant derivatives of all kinds of tensor fields from this kind of high-level definition. The problem here is to somehow generate the transport function $\phi_{t,h}$ out of $\bar\rho$ and then use $\phi_{t,h}$ to develop $D_X Y$.

37.4.11 Definition: The covariant derivative of a vector field $Y$ along a curve $\gamma$ in a $C^\infty$ manifold $M$ is the map $Y' : \mathrm{Int}(\mathrm{Dom}(\gamma)) \to T(M)$ such that $Y'_t \in T_{\gamma(t)}(M)$ is defined for $t \in \mathrm{Int}(\mathrm{Dom}(\gamma))$ by
$Y'_t = \lim_{h \to 0} h^{-1}(\phi_{t,h}^{-1}(Y_{t+h}) - Y_t)$,
where $\phi_{t,h}$ is the “parallel displacement” along $\gamma$ from $T_{\gamma(t)}(M)$ to $T_{\gamma(t+h)}(M)$. The limit is taken in $T_{\gamma(t)}(M)$.

37.4.12 Definition: The covariant derivative of a vector field $Y$ in the direction of a vector field $X$ is the vector field $D_X Y$ defined at a point $p \in M$ by
$(D_X Y)_p = \lim_{t \to 0} t^{-1}(\phi_t^{-1}(Y_{\gamma(t)}) - Y_{\gamma(0)})$,
where $\gamma$ is an integral curve of $X$ such that $\gamma(0) = p$, and $\phi$ is the function which effects parallel transport along $\gamma$.
[ Definition 37.4.12 is very incomplete. It needs a definition of integral curve, parallel transport, and limits in $T_p(M)$. Also required is a proof of independence with respect to the integral curve if it is not unique, plus a proof of existence of the integral curve etc. Also need proof of existence and differentiability of the connection. ]
[ The following definition looks suspiciously like the curvature form on a principal bundle. ]

37.4.13 Definition: [Define covariant differential of a differential form on a principal $G$-bundle $(P, \pi, M)$. Get something like $(D\alpha)(X_1, \ldots, X_{k+1}) = (d\alpha)(hX_1, \ldots, hX_{k+1})$, where $h : X^\infty(P) \to X^\infty(P)$ is one of the three definitions of a connection.]
[ Try to re-express the covariant derivative in terms of the lift function $\bar\rho$ and in terms of the other equivalent forms of connection. See EDM2 [34] 80.G. In the above definition for covariant differential of a differential form on a principal fibre bundle, have: $F$ is a finite-dimensional vector space, $\alpha \in X^\infty_{0,k}(P, F)$ (that is, $\alpha$ is a $C^\infty$ (alternating) $k$-form on the $C^\infty$ manifold $P$ with coefficients in $F$), for $i = 1, \ldots, k+1$, $X_i$ is a vector field on $P$ (that is, $X_i \in X^\infty(P)$), $h : X^\infty(P) \to X^\infty(P)$ is the horizontal projection operator, and so $h(X) \in X^\infty(P)$ and $h(X)(z) = h_z(X(z)) \in T_z(P)$. Could use the identity $h_z = \bar\rho_z \circ (d\pi)_z$ (or $h = \bar\rho \circ d\pi$) to simplify the expression. Thus $hX_i = h(X_i) = \bar\rho \circ (d\pi)(X_i)$. For $z \in P$, have $\alpha \in \Lambda_k(T_z(P), F)$. ]
[ See notes J for covariant derivatives. Also see EDM2 [34] 80.I and 417.B, Gallot/Hulin/Lafontaine [19] 2.58, 2.60, and 2.68. ]

37.4.14 Remark: [The motivation for the definition of covariant derivative: $(d/dt)(\phi^{-1}_{t,h}(\ldots) - \ldots)$ etc. Make a comparison with the Lie derivative.]
[ Is it possible to define covariant derivatives on $T(T(M))$? ]
[ Define covariant derivatives of general-type tensors. Maybe use notations $D_X^{r,s}$ and $D_V^{r,s}$. ]
[ Do the covariant derivative of general fibre bundle cross-sections using associated fibre bundles and associated parallelism and associated connections. ]
37.5. Hessian operators

37.5.1 Remark: As mentioned in Section 30, second-order tangent operators are well-defined in the absence of a connection, but they have a first-order term which depends on the choice of coordinate chart because the transition rules for second-order operators have a term depending on the second derivatives of the chart transition maps. When a connection is defined, however, second-order derivatives may be defined with reference to local parallel transport so that the chart-dependent first-order component of the operator is hidden. If a second-order derivative is written in terms of coordinate derivatives instead of covariant derivatives, the first-order term reappears. Thus covariant second-order operators are really special cases of the differentiable manifold operators in Section 30.1, but they are written with the assistance of the connection to hide the first-order derivatives.

37.5.2 Remark: As mentioned in Section 32.6, the first-order derivative term in a second-order operator at a critical point of a real-valued function is zero with respect to one chart of a $C^2$ manifold if and only if it is zero with respect to all charts. Therefore the Hessian operator is well-defined at a critical point even in the absence of a connection, but when the derivative of a real-valued function is non-zero, the Hessian is chart-dependent unless some arbitrary choice of special coordinates is made. A connection effectively selects geodesic coordinates at each point as the “right” coordinates. The Hessian is calculated with respect to these coordinates, and the Hessian is calculated in all other coordinates by including correction terms to make the calculation agree with geodesic coordinates.
[ The product $D_{V_1} D_{V_2}$ in Definition 37.5.3 doesn’t commute? Probably the product commutes if the connection is symmetric. ]

37.5.3 Definition: The Hessian operator at a point $p$ in a $C^2$ manifold $M$ with a $C^1$ connection is the map $H_p : T_p(M) \times T_p(M) \to \mathring{T}^{[2]}_p(M)$ defined by $H_p : (V_1, V_2) \mapsto (f \mapsto D_{V_1} D_{V_2} f)$, for $V_1, V_2 \in T_p(M)$ and $f \in C^2(M)$, where $D$ denotes the covariant derivative on $M$.
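As a hedged numerical illustration of Definition 37.5.3 in chart components, the sketch below uses the Christoffel form $\partial_{ij} - \Gamma^k_{ij}\partial_k$ stated in Remark 37.5.4. It is not the book's construction. It assumes the flat plane in polar coordinates with the standard symbols $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$, and it estimates derivatives by central differences. The test function $f = r\cos\theta$ is the Cartesian coordinate $x$, which is affine, so its connection-corrected Hessian should vanish even though its naive second derivative with respect to $\theta$ does not. All function names are hypothetical.

```python
import math

def gamma(k, i, j, r):
    """Gamma^k_{ij} of the flat plane in polar coordinates (0 = r, 1 = theta)."""
    if (k, i, j) == (0, 1, 1):
        return -r
    if (k, i, j) in ((1, 0, 1), (1, 1, 0)):
        return 1.0 / r
    return 0.0

def d1(f, k, x, h=1e-4):
    """Central-difference estimate of d_k f at the chart point x = [r, theta]."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    return (f(*xp) - f(*xm)) / (2 * h)

def d2(f, i, j, x, h=1e-4):
    """Central-difference estimate of the naive second derivative d_ij f."""
    xp, xm = list(x), list(x)
    xp[j] += h
    xm[j] -= h
    return (d1(f, i, xp, h) - d1(f, i, xm, h)) / (2 * h)

def hessian(f, x):
    """Matrix of components d_ij f - Gamma^k_{ij} d_k f at x."""
    r = x[0]
    return [[d2(f, i, j, x) - sum(gamma(k, i, j, r) * d1(f, k, x) for k in range(2))
             for j in range(2)] for i in range(2)]

def f_x(r, th):
    """The Cartesian coordinate x expressed in the polar chart."""
    return r * math.cos(th)

x0 = [2.0, 0.7]
naive_tt = d2(f_x, 1, 1, x0)   # naive second theta-derivative, roughly -r cos(theta)
H = hessian(f_x, x0)           # all entries should be close to zero
print(naive_tt, H)
```

The first-order correction term is what removes the chart-dependent part: without it, the polar chart would report a non-zero second derivative for a function which is affine in Cartesian coordinates.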
37.5.4 Remark: In terms of Christoffel symbols, the Hessian operator looks like $\partial_{ij} - \Gamma^k_{ij}\partial_k$. Therefore $H_p(V_1, V_2) = v_1^i v_2^j (\partial_{ij} - \Gamma^k_{ij}(\psi)\partial_k)$ for vectors $V_m = t_{p,v_m,\psi} \in T_p(M)$ for $m = 1, 2$.

[ Show how to derive the Hessian at $p \in M$ from an arbitrary linear combination of second derivatives of a function $f \in C^2(M)$ via $C^2$ curves $\gamma : \mathbb{R} \to M$ passing through $p$. Define $D_\gamma f = \partial_s^2 f(\gamma(s))\big|_{s=0}$ for $p = \gamma(0)$. The natural bilinear extension of such derivatives is the Hessian. The same approach can be used for maps $\phi$. ]
[ See Greene/Wu [67], page 7 for a vector-field based definition of Hessian. ]
[ Must also define Hessian of maps $\phi : M \to M$ ($\phi \in C^2(M, M)$) and maps $\phi \in C^2(M_1, M_2)$ for $C^2$ manifolds with $C^1$ affine connections. ]

37.6. Elliptic second-order operator fields

37.6.1 Remark: In flat space, the theory of boundary value problems for elliptic second-order partial differential equations (as described in Gilbarg/Trudinger [109]) is concerned with the “actors”, namely the BVP solution $u$, the coefficients of the PDE $a$, $b$, $c$ and right-hand side $f$ and the boundary conditions. An important actor in this scenario is the domain $\Omega$ itself, and its boundary $\partial\Omega$. The flat-space “theatre” of action is a space $\mathbb{R}^n$, which is always firmly in the background. In differential geometry on the other hand, the “theatre” is of very great significance. The theatre is then the manifold $M$ together with an optional affine connection and (even more) optional Riemannian metric. All of the assumptions in Gilbarg/Trudinger [109] must be re-examined and re-worked to adapt them to a curved theatre of action.

37.6.2 Remark: The Hessian is clearly a bilinear function on the tangent space $T_p(M)$. This suggests that it may be contracted with contravariant degree-2 tensors to yield a well-defined scalar object. Tensorial second-order operators may be defined in $\mathring{T}^{[2]}_p(M)$ by contracting the second-order covariant derivative with symmetric second-degree contravariant tensors. (The antisymmetric component has no effect. So it is best to eliminate it from consideration.) For any symmetric tensor $A \in T^{2,0}_p(M)$, with coefficients $[a^{ij}]_{i,j=1}^n$, the operator $a^{ij}(\partial_{ij} - \Gamma^k_{ij}(\psi)\partial_k)$ gives a chart-independent number when applied to a particular function $f \in C^2(M)$. This is the contraction of the type $(2,0)$ tensor $A$ with the type $(0,2)$ tensor $D^2 f(p)$.

As noted in Remark 30.6.2, the positive definite and semi-definite properties of degree-2 tensors are chart-independent. Therefore the classification of second-order operators as elliptic, weakly elliptic, hyperbolic etc. is chart-independent.

37.6.3 Definition: A (weakly) elliptic second-order operator at a point $p \in M$ of a $C^2$ manifold $M$ with an affine connection is an operator $A D^2 f(p)$ such that the tensor $A \in T^{2,0}_p(M)$ is positive semi-definite.

37.6.4 Remark: In terms of tensor components with respect to a chart $\psi \in \mathrm{atlas}_p(M)$, the tensor $A D^2 f(p)$ in Definition 37.6.3 may be written as $a^{ij}(p)(\partial_{ij} f(p) - \Gamma^k_{ij}(\psi)\partial_k f(p))$.

37.6.5 Remark: Whereas it was not possible to distinguish between pure second-order operators and mixed first-and-second-order operators in Definition 30.6.3, when an affine connection is available, pure second-order operators as in Definition 37.6.3 are well-defined.

37.6.6 Remark: The “lower norm” matrix function $\lambda^- : M_{n,n}(\mathbb{R})$ in Theorem 37.6.7 is given by Definition 11.4.2.
[ 2007-3-21: Theorem 37.6.7 is work in progress. Please ignore it. ]

37.6.7 Theorem: Let $M$ be an $n$-dimensional $C^2$ manifold with an affine connection. Assume:
(i) $\Omega \subseteq M$ is an open subset of $M$.
(ii) The contravariant tensor field $A \in X(T^{(2,0)}(\Omega))$ satisfies $\forall \psi \in \mathrm{atlas}(M), \exists k_a \in \mathbb{R}^+, \forall x \in \psi(\Omega), \ \lambda^-(a(\psi)(x)) \ge k_a$.
(iii) The contravariant vector field $B \in X(T(\Omega))$ satisfies $\forall \psi \in \mathrm{atlas}(M), \exists k_b \in \mathbb{R}^+, \forall x \in \psi(\Omega), \forall i \in \mathbb{N}_n, \ |b^i(\psi)(x)| \le k_b$.
(iv) The Christoffel symbol $\Gamma$ for the affine connection on $M$ satisfies $\forall \psi \in \mathrm{atlas}(M), \exists k_c \in \mathbb{R}^+, \forall x \in \psi(\Omega), \forall i, j, k \in \mathbb{N}_n, \ |\Gamma^i_{jk}(\psi)(x)| \le k_c$.
(v) $L : C^2(\Omega, \mathbb{R}) \to (\Omega \to \mathbb{R})$ is an operator defined by $Lu(p) = A(p) D^2 u(p) + B(p) Du(p)$ for $p \in \Omega$.
(vi) $u \in C^0(\bar\Omega) \cap C^2(\Omega)$ satisfies $Lu(p) \ge 0$ for all $p \in \Omega$.
Then $\sup_\Omega(u) \le \sup_{\partial\Omega}(u)$.

37.6.8 Remark: Theorem 37.6.7 is an example of how the connection on a differentiable manifold becomes an actor in PDE problems. The “coordinate-free” expressions for partial differential equations hide the coordinates and the connection, but when the analysis starts, the hidden terms and factors must be brought out into the open and pinned against the wall.

37.6.9 Remark: The Christoffel symbol $\Gamma$ in Theorem 37.6.7 may be replaced by arbitrary tensorization coefficients as defined in Section 30.2. An affine connection is not necessary to make the theorem valid. The connection (or tensorization coefficient field) is required only in order to allow the differential operator to be expressed in terms of a tensorial second-order coefficient field $A$.

37.6.10 Remark: General tensorization coefficients (and general connection coefficients) are not necessarily symmetric. (In other words, they may not be torsion-free.) In this case, the antisymmetric component of the tensor $A(p)$ in Theorem 37.6.7 cannot be automatically disregarded. If $A(p)$ is required to be symmetric, then the anti-symmetric component of $\Gamma$ may be disregarded. If $\Gamma$ is required to be symmetric, then the anti-symmetric component of $A(p)$ may be disregarded. It is not clear whether the case where both are anti-symmetric would have interesting applications. If a connection is not torsion-free, it cannot be reconstructed from the set of geodesics using the Schild’s ladder approach. Second-order equations which depend on the torsion of the underlying space seem unlikely to be interesting. The Levi-Civita connection is, of course, torsion-free.
714
37. Affine connections and covariant derivatives
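The component expression in Remark 37.6.4 can be checked numerically in a familiar special case. The following is a minimal sketch, assuming the flat connection on the punctured plane in polar coordinates (r, θ), with the coefficient tensor a taken to be the inverse Euclidean metric, so that the operator should reduce to the ordinary Laplacian. The function names and the finite-difference step are illustrative choices, not part of the text.

```python
import math

def d1(f, x, i, h):
    """Central-difference estimate of the first partial derivative of f at x."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return (f(*xp) - f(*xm)) / (2.0 * h)

def d2(f, x, i, j, h):
    """Central-difference estimate of the second partial derivative of f at x."""
    if i == j:
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        return (f(*xp) - 2.0 * f(*x) + f(*xm)) / (h * h)
    def shifted(si, sj):
        y = list(x)
        y[i] += si * h
        y[j] += sj * h
        return f(*y)
    return (shifted(1, 1) - shifted(1, -1)
            - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * h * h)

def polar_coefficients(r):
    """Assumed data: a[i][j] = inverse Euclidean metric, gamma[k][i][j] = Γ^k_ij
    of the flat connection, both in polar coordinates (index 0 = r, 1 = θ)."""
    a = [[1.0, 0.0], [0.0, 1.0 / (r * r)]]
    gamma = [[[0.0, 0.0], [0.0, -r]],            # Γ^r_θθ = -r
             [[0.0, 1.0 / r], [1.0 / r, 0.0]]]   # Γ^θ_rθ = Γ^θ_θr = 1/r
    return a, gamma

def apply_operator(f, x, h=1e-3):
    """Evaluate a^ij (∂_ij f - Γ^k_ij ∂_k f) at x = (r, θ)."""
    a, gamma = polar_coefficients(x[0])
    grad = [d1(f, x, k, h) for k in range(2)]
    total = 0.0
    for i in range(2):
        for j in range(2):
            hess = d2(f, x, i, j, h) - sum(gamma[k][i][j] * grad[k] for k in range(2))
            total += a[i][j] * hess
    return total

# x^2 - y^2 = r^2 cos(2θ) is harmonic; x^2 + y^2 = r^2 has Laplacian 4.
print(apply_operator(lambda r, th: r * r * math.cos(2.0 * th), (1.5, 0.7)))
print(apply_operator(lambda r, th: r * r, (1.5, 0.7)))
```

The first-order Γ term is what makes the result agree with the Cartesian Laplacian; dropping it would leave only the chart-dependent second partial derivatives.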
37.6.11 Remark: It often happens that the assumptions, the input conditions of a theorem, are difficult to provide. The output results, the assertions, are only obtained when the input conditions are satisfied. This raises the question of how one controls the input parameters. Sometimes this control is achieved by using another theorem B whose outputs give the required inputs for theorem A. In the case of second-order PDE on differentiable manifolds, one of the required inputs is a bound on the geometry itself. For Theorem 37.6.7, bounds on the Christoffel symbol Γ are required. Such a bound is not usually available. Bounds on a geometry may be given as, for example, a constant, positive or negative sectional curvature, bounds on the Ricci curvature, and so forth. It is then necessary to find relations between the bounds which are provided and the bounds which are required.
37.7. Curvature and torsion
37.7.1 Remark: The central concepts of differential geometry are manifolds, tangent vectors, pathwise parallelism, curvature and metric tensors. The core concept is curvature. If the curvature is everywhere zero in a manifold, then there is little of real geometric interest. A manifold with zero curvature may have topological and analytical interest, but geometrically speaking, such a manifold is not much different from Euclidean space with curvilinear coordinates and possibly some topological complexity.
37.7.2 Remark: The curvature of an affine connection is something like the “curl” of the connection. That is, it measures how much the parallel transport varies around the boundary of an infinitesimal area element divided by the area of the element. This is what the exterior derivative calculates.
[ For both curvature and torsion, see Crampin/Pirani [11], pages 268 and 273. ]
[ Note that geodesics don’t belong in this section because they are defined in Section 38.2. ]
[ It is very likely that curvature and torsion are meaningful for a general connection. If so, then there should be a separate section on this before affine connections are defined. ]
[ See EDM2 [34] 80.G, 80.J (and 80.E ?) for curvature form, and 364.D for curvature form on Riemannian manifolds. The curvature form should look something like Ω = Dω, which requires first the general definition of covariant derivative. ]
[ Should try to define a generalized curvature for the case that the connection is weakly differentiable. ]
37.7.3 Definition: The curvature form on a $C^2$ manifold with a $C^1$ connection is . . .
[ Should consider as an example $(D\alpha)(X_1, X_2) = (d\alpha)(hX_1, hX_2)$, where α is a 1-form and $h_z(X) = (\bar\rho_z \circ (d\pi)_z)(X)$. Also consider the case of a 0-form α, so that $(D\alpha)(X) = (d\alpha)(hX)$. See EDM2 [34] 417.B for clues. ]
37.7.4 Definition: The canonical 1-form is . . .
37.7.5 Definition: The torsion form of a $C^2$ manifold with a $C^1$ connection is . . .
37.7.6 Definition: The curvature tensor of a $C^2$ manifold with a $C^1$ connection is . . .
37.7.7 Definition: The torsion tensor of a $C^2$ manifold with a $C^1$ connection is . . .
[ Give the significance of the torsion tensor in great detail. See in particular Weyl [50], page 113. ]
[ Probably the Schild’s ladder construction deserves its own section. ]
37.7.8 Remark: Note that the torsion of an affine connection has no influence on the geodesics of the manifold. So if the connection is regarded as a way of generating geodesics, then the torsion component of the connection is superfluous. Thus the connection, apart from its torsion component, can be regenerated from knowledge of just the geodesics. If the torsion is required to be zero, then the connection is uniquely determined by the set of geodesics.
However, knowledge of the covariant derivative determines the whole connection, including the torsion component. If the geodesics (together with affine parameters on all geodesics) are known, then the Schild’s ladder approach regenerates the definition of parallelism. The minimal unit of such a ladder is a “cross-brace”. A cross-brace is a pair of geodesics which meet at their mid-points. Then two pairs of opposite ends of the resulting “X” are joined by geodesics, giving a bow tie shape: ⊲⊳. In the limit of a small cross-brace, the opposing geodesics are supposed to be parallel. If the torsion is zero, then this is so. But otherwise, the torsion causes such pairs of geodesics to be non-parallel to first order with respect to the distance between them. The first-order rate of deviation of the two geodesics from parallel transport is in fact equal to the torsion in the plane of the cross-brace.
[ Maybe (and hopefully) there is some sort of relation between the cross-brace method of measuring torsion and the triangle method of measuring the sectional curvature. But the sectional curvature requires a metric, doesn’t it? ]
If the torsion is non-zero, then a Schild construction will generate a twisting (or expanding etc.) ladder which may resemble a DNA double helix or may simply expand or compress without rotating. It would be interesting to know how the behaviour of the double helix changes with respect to orientation. A better way of thinking of the Schild’s ladder is as an extendable handle ⊲⊳⊲⊳⊲⊳⊲⊳, similar to the type used in old-fashioned lift doors (as in the Hôtel des Vosges in Strasbourg).
[ Give a detailed account of Schild’s ladder near here. Show how the geodesics with affine parameters can be used for successive approximation to parallel transport. ]
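The description of curvature in Remark 37.7.2, as the variation of parallel transport around a boundary divided by the enclosed area, can be illustrated numerically. The sketch below assumes the unit 2-sphere with the Levi-Civita connection of the round metric (a metric is used here only to supply a concrete connection and an area). A vector is transported once around the latitude circle of colatitude θ0, and the resulting rotation angle is compared with the area of the enclosed polar cap. The function names and step counts are illustrative.

```python
import math

def transport_around_latitude(theta0, steps=4000):
    """Parallel-transport the unit vector e_θ once around the circle θ = θ0.

    Coordinates are (θ, φ) with assumed nonzero Christoffel symbols
    Γ^θ_φφ = -sin θ cos θ and Γ^φ_θφ = Γ^φ_φθ = cot θ.
    Returns the final components in the orthonormal frame (e_θ, e_φ/sin θ).
    """
    c, s = math.cos(theta0), math.sin(theta0)

    def rhs(X):
        # dX^i/dt = -Γ^i_jk v^j X^k along the curve φ = t, so v = (0, 1).
        Xt, Xp = X
        return (s * c * Xp, -(c / s) * Xt)

    X = (1.0, 0.0)
    dt = 2.0 * math.pi / steps
    for _ in range(steps):
        # Classical fourth-order Runge-Kutta step.
        k1 = rhs(X)
        k2 = rhs((X[0] + 0.5 * dt * k1[0], X[1] + 0.5 * dt * k1[1]))
        k3 = rhs((X[0] + 0.5 * dt * k2[0], X[1] + 0.5 * dt * k2[1]))
        k4 = rhs((X[0] + dt * k3[0], X[1] + dt * k3[1]))
        X = (X[0] + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             X[1] + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
    return (X[0], s * X[1])

def holonomy_angle(theta0):
    """Rotation angle of the transported vector, reduced to [0, 2π)."""
    a, b = transport_around_latitude(theta0)
    return math.atan2(b, a) % (2.0 * math.pi)

def cap_area(theta0):
    """Area of the polar cap of the unit sphere bounded by θ = θ0."""
    return 2.0 * math.pi * (1.0 - math.cos(theta0))

for theta0 in (0.1, 0.5, 1.0):
    print(theta0, holonomy_angle(theta0) / cap_area(theta0))
```

Under these assumptions the ratio of rotation angle to enclosed area stays at the constant curvature of the unit sphere, also for loops that are not small.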
[ Here have a diagram showing how the standard basis at the origin is parallel displaced as the origin is moved in each of 8 compass directions away from the origin. More than one example should be given, corresponding to different possibilities for the Christoffel symbol. ]
[ Should show the relation between the torsion tensor and the asymmetry of the Christoffel symbol. The Christoffel symbol is symmetric if the connection is torsion-free. ]
[ There is still the question of the relation between the parallelogram law in linear spaces and the bow tie construction. In linear spaces, the inner product can be reconstructed from the norm if the parallelogram law is assumed. Is this equivalent to saying that a connection can be reconstructed out of the geodesics? If not, then what is the corresponding thing in connections to the parallelogram law? ]
[ Also the important question must be dealt with of whether or not zero torsion is equivalent to the existence locally of a vector field such that the Lie derivative with respect to this field is equal to the covariant derivative. It could be that all Lie derivatives have zero torsion, or else that they all generate a local connection, with or without zero torsion, or else that there is no particularly neat relation between Lie derivatives and covariant derivatives. After all, the relation $D_X Y - D_Y X = L_X Y$ suggests that one should not try to look for such correspondences between covariant and Lie derivatives. ]
[ Should have a section with some theorems giving necessary or sufficient intrinsic conditions on a set of (parametrized) curves for these curves to be suitable for use with the Schild’s ladder construction to generate an affine connection. It is clear that the construction will yield a torsion-free affine connection if and only if the set of geodesics was originally constructed from such a connection, but this condition is not intrinsic to the set of geodesics. A first step towards this kind of investigation might be to determine intrinsic conditions on the set of geodesics which make the set self-consistent with respect to a metric. ]
37.7.9 Theorem: [Bianchi identities. Give geometric significance. Misner/Thorne/Wheeler [37], chapter 15, make a big fuss over these. So does everybody else. They must be important!]
[ Give the components of the torsion tensor, curvature tensor and covariant differential. ]
A useful way to demonstrate torsion is to change coordinates around a point in a 2-manifold so that the Christoffel symbol is antisymmetric. Then a pair of basis vectors at the origin gets distorted in a clear way as it is moved around under parallel transport. A bow tie can be formed in geodesic coordinates from the four points (±1, ±1) or from the four points (1, 0), (1, 1), (−1, 0) and (−1, −1). If the torsion is non-zero, then a bow tie with parallel ends can be formed by varying the cross-over ratio from 0.5 to something else which varies linearly with the distance apart of the end lines. But in more than 2 dimensions, this variation of the cross-over ratio does not suffice. It is useful to demonstrate explicitly how the geodesics are unaffected by the torsion in the case that all Christoffel symbol entries are zero except for $\Gamma^1_{12} = -\Gamma^1_{21} = \alpha$. Then a geodesic in the direction (1, 1) is still a straight line in geodesic coordinates. All geodesics through the origin are still radial.
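The antisymmetric example above can be made concrete with a few lines of arithmetic. The following sketch assumes that the coefficients $\Gamma^1_{12} = -\Gamma^1_{21} = \alpha$ are constant on a neighbourhood of the origin, and uses the component convention of Definition 37.9.5, so that parallel transport along a curve with velocity v satisfies $dX^i/dt = -\Gamma^i_{jk} v^j X^k$. The value of α and the function names are hypothetical. Indices are 0-based in the code, so the text's index 1 is index 0 here.

```python
ALPHA = 0.3  # hypothetical value of the constant α

def gamma(i, j, k):
    """Γ^i_jk with all entries zero except Γ^1_12 = -Γ^1_21 = α (0-based indices)."""
    if (i, j, k) == (0, 0, 1):
        return ALPHA
    if (i, j, k) == (0, 1, 0):
        return -ALPHA
    return 0.0

def geodesic_term(v):
    """The quadratic term Γ^i_jk v^j v^k of the geodesic equation."""
    n = len(v)
    return [sum(gamma(i, j, k) * v[j] * v[k] for j in range(n) for k in range(n))
            for i in range(n)]

def torsion(i, j, k):
    """Torsion components T^i_jk = Γ^i_jk - Γ^i_kj."""
    return gamma(i, j, k) - gamma(i, k, j)

def parallel_transport(X, v, steps=1000):
    """Euler integration of dX^i/dt = -Γ^i_jk v^j X^k for t in [0, 1],
    along the straight line with constant coordinate velocity v."""
    n = len(X)
    X = list(X)
    dt = 1.0 / steps
    for _ in range(steps):
        dX = [-sum(gamma(i, j, k) * v[j] * X[k] for j in range(n) for k in range(n))
              for i in range(n)]
        X = [X[i] + dt * dX[i] for i in range(n)]
    return X

print(geodesic_term([1.0, 1.0]))                   # quadratic term vanishes
print(torsion(0, 0, 1))                            # but the torsion does not
print(parallel_transport([0.0, 1.0], [1.0, 0.0]))  # second basis vector is sheared
```

The quadratic term vanishes for every velocity because an antisymmetric array is contracted with the symmetric product $v^j v^k$, so straight lines remain geodesics, whereas the transported basis vector is visibly distorted.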
[ This is a very interesting passage from EDM2 [34], section 80.K: “For a Riemannian metric g on M , there exists a unique affine connection on M such that (i) ∇g = 0, and (ii) the torsion tensor T vanishes. This connection is called the ‘Riemannian connection’ corresponding to g.” The rest of what they say here is equally useful. ]
37.8. Affine connections on principal fibre bundles
[ The n-frame bundle is defined in Section 35.9. ]
37.8.1 Definition: Let M be a $C^1$ n-dimensional manifold. Let (P(M), π, M) be the bundle of tangent n-frames over M with structure group GL(n) and the standard fibre atlas. An affine connection on M is any connection on the tangent n-frame bundle (P(M), π, M).
37.8.2 Definition: [Definition of affine transformation. See EDM2 [34] 80.J.]
37.8.3 Definition: [Definition of the product of manifolds with affine connections.]
[ See notes J. ]
[ Is it possible to replace GL(n) with O(n) and so forth, and then get a different sort of connection etc.? ]
[ Should regenerate parallel displacement from the PFB connection. ]
[ Maybe have a section on orthogonal and conformal connections here. Maybe projective connections too. ]
37.9. Coefficients of affine connections on principal fibre bundles
37.9.1 Remark: Suppose that M is a $C^1$ n-dimensional manifold with an affine connection $\bar\rho$. Then for all p ∈ M and $z \in \pi^{-1}(\{p\})$, $\bar\rho_z : T_p(M) \to T_z(P(M))$ is a linear map between tangent spaces of M at p and P(M) at z, where P(M) is the n-frame bundle of M, and π : P(M) → M is the standard projection map for P(M). Let $\psi \in \mathring{C}^1(M, \mathbb{R}^n)$ be a $C^1$ chart for M, and let $\hat\psi \in \mathring{C}^1(P(M), \mathbb{R}^n)$ be the corresponding chart for P(M).
[ Note that it is not necessarily true that ψ ∈ atlas(M). It is only required that ψ be $C^1$-compatible with the atlas on M. The reason for this weak condition on ψ is not actually very good. It was introduced here out of a fear that smooth n-frame fields might not exist on chart domains that are too large. But actually this is not a problem, because the domain of a chart is always $C^1$-diffeomorphic to an open subset of $\mathbb{R}^n$. ]
Then $\hat\psi : \pi^{-1}(\mathrm{Dom}(\psi)) \to \mathbb{R}^n \times GL(n)$. Let e : Dom(ψ) → P(M) be a $C^1$ n-frame field on M. Then for all p ∈ Dom(ψ), the map $(de)_p : T_p(M) \to T_{e(p)}(P(M))$ can be expressed in coordinates as follows. For $v \in T_p(M)$, $v = v^i \partial_i^{p,\psi}$, and . . .
For p ∈ M, the n-frame $e(p) = (e(p)_i)_{i\in\mathbb{N}_n} \in P_p(M)$ may be expressed in terms of the basis $(\partial_i^{p,\psi})_{i\in\mathbb{N}_n}$ by $e_i(p) = \partial_j^{p,\psi}\, a(p)^j{}_i$ for a unique map a : Dom(ψ) → GL(n). Thus $a = \hat\psi \circ e$. The n-frame field e is parallel at p in the direction $v \in T_p(M)$ if and only if $(de)_p(v) = \bar\rho_{e(p)}(v)$. That is, the lift function $\bar\rho$ specifies the rate of change of an n-frame in a given direction which keeps that n-frame parallel. Now for p ∈ Dom(ψ) and $v \in T_p(M)$, $(de)_p(v)$ can be expressed in terms of the chart $\hat\psi$ for P(M) by $(de)_p(v) = \ldots$
Define the vectors $\partial_i^{z,\hat\psi}$ and $\partial_{j,k}^{z,\hat\psi}$ for $i, j, k \in \mathbb{N}_n$ by
$\partial_i^{z,\hat\psi}(f) = \frac{\partial}{\partial x^i}\, (f \circ \hat\psi^{-1}(x, a)) \Big|_{(x,a)=\hat\psi(z)}$
and
$\partial_{j,k}^{z,\hat\psi}(f) = \frac{\partial}{\partial a^j{}_k}\, (f \circ \hat\psi^{-1}(x, a)) \Big|_{(x,a)=\hat\psi(z)}$
for $f \in \mathring{C}^1(P(M), \mathbb{R})$. This gives a set of $n + n^2$ basis vectors for $T_z(P(M))$.
37.9.2 Theorem: Let M −< (M, $A_M$) be a $C^1$ n-dimensional manifold. Let p ∈ M, $z \in P_p(M)$ and $w \in T_z(P(M))$. Then for all $\psi \in A_M$, for some $b \in \mathbb{R}^n$ and a ∈ GL(n),
$w = b^i\, \partial_i^{z,\hat\psi} + a^j{}_k\, \partial_{j,k}^{z,\hat\psi}.$
In particular, for any affine connection $\bar\rho$ on M, there are components $\bar\rho^{\hat\psi}_z(v) \in \mathbb{R}^{n\times n}$ such that
$\bar\rho_z(v) = v^i\, \partial_i^{z,\hat\psi} + \bar\rho_z(v)^j{}_k\, \partial_{j,k}^{z,\hat\psi}$
for all $v \in \mathbb{R}^n$.
37.9.3 Theorem: If ρ is an affine connection on M, then for all p ∈ M and $v \in T_p(M)$,
$(dq)_z(b^i\, \partial_i^{z,\hat\psi} + a^j{}_k\, \partial_{j,k}^{z,\hat\psi}) = b^i\, \partial_i^{p,\psi}$
and
$\bar\rho_z(v) = v^i\, \partial_i^{z,\hat\psi} + \bar\rho_z(v)^j{}_k\, \partial_{j,k}^{z,\hat\psi},$
for some $\bar\rho^{\hat\psi}_z(v) \in \mathbb{R}^{n\times n}$.
[ This theorem is possibly meaningless. Check this. ]
37.9.4 Theorem: Let ρ be an affine connection on the tangent n-frame bundle P(M) of a $C^1$ n-dimensional manifold M. Then there exists a map $\Gamma : \mathring{C}(\mathbb{R}, M) \times \mathrm{atlas}(M) \to \mathbb{R}^{n\times n\times n}$ such that . . .
[ Probably Definition 37.9.5 of Γ in terms of D will eventually be replaced by a definition of D in terms of Γ, which will in turn be defined in terms of $\bar\rho$. Alternatively, D could be expressed directly in terms of $\bar\rho$. But the main task of this section will be to express Γ in terms of $\bar\rho$. ]
[ In Definition 37.9.5, must derive Γ directly from ρ and θ too. ]
37.9.5 Definition: Let ψ be a chart for a $C^\infty$ manifold M with a connection, and let $(\partial_i)_{i=1}^n = (\partial_i^{p,\psi})_{i=1}^n$ denote the coordinate basis vectors at p ∈ Dom(ψ). Then the Christoffel symbol for ψ with respect to the given connection is defined to be the set of components of the tangent vectors $(D_{\partial_j}(\partial_k))_{j,k=1}^n$ in the coordinate basis:
$D_{\partial_j}(\partial_k) = \sum_{i=1}^n \Gamma^i_{jk}\, \partial_i.$   (37.9.1)
[ To give meaning to Definition 37.9.5, must have a definition for $D_X Y$. ]
[ Equation (37.9.1) in Definition 37.9.5 could be $D_{\partial_j(p)}(\partial_k) = \sum_{i=1}^n \Gamma^i_{jk}(p)\, \partial_i(p)$? This is because the subscript of D would then be a vector and $D_{\partial_j(p)}$ would act on the vector field $\partial_k$. ]
[ Try to express the connection form ω in terms of the Christoffel symbol. This might be possible by firstly expressing the connection $\bar\rho$ in terms of the Christoffel symbol and then using the relation between ω and $\bar\rho$. In fact, the whole range of expressions for the connection could be expressed somehow in terms of coordinates, maybe. ]
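A worked instance of equation (37.9.1) may help fix the index convention. The following sketch assumes the flat connection D on the punctured plane, for which the Cartesian coordinate fields $\partial_x$, $\partial_y$ are parallel, and computes the Christoffel symbol for the polar chart $\psi = (r, \theta)$. The choice of this example is an illustration only and is not taken from the text.

```latex
% Polar coordinate basis in terms of the parallel Cartesian fields:
\partial_r = \cos\theta\,\partial_x + \sin\theta\,\partial_y, \qquad
\partial_\theta = -r\sin\theta\,\partial_x + r\cos\theta\,\partial_y .

% Since D\partial_x = D\partial_y = 0, only the coefficient functions are differentiated:
D_{\partial_r}(\partial_r) = 0, \qquad
D_{\partial_r}(\partial_\theta) = D_{\partial_\theta}(\partial_r)
  = -\sin\theta\,\partial_x + \cos\theta\,\partial_y = \tfrac{1}{r}\,\partial_\theta, \qquad
D_{\partial_\theta}(\partial_\theta) = -r\cos\theta\,\partial_x - r\sin\theta\,\partial_y = -r\,\partial_r .

% Reading off components from D_{\partial_j}(\partial_k) = \sum_i \Gamma^i_{jk}\,\partial_i :
\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = \tfrac{1}{r}, \qquad
\Gamma^r_{\theta\theta} = -r, \qquad
\text{all other } \Gamma^i_{jk} = 0 .
```

The symmetry $\Gamma^i_{jk} = \Gamma^i_{kj}$ in this example reflects the fact that the flat connection is torsion-free.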
37.10. Connections for Lagrangian mechanics
[ See Crampin/Pirani [11], page 345, section 13.8, and EDM2 [34], 271.F. ]
37.10.1 Remark: Least action principles yield shortest paths in phase spaces. These are geodesics from which a torsion-free connection may be derived.
[ Show how to derive/calculate connections in terms of Lagrangians and/or least action principles. Also present Hamiltonian mechanics and phase space. Maybe the origin of the term “phase space” is the fact that the trajectories of an oscillatory system in Lagrangian coordinates sometimes look like circles or ellipses, and the motion goes through phases like sine/cosine pairs. ]
Chapter 38 Geodesics, convexity and Jacobi fields
38.1 Covariant derivatives of vector fields along curves
38.2 Geodesic curves
38.3 Jacobi fields
38.4 Convex sets
38.5 Convex combinations
38.6 Families of geodesic interpolations
38.7 Exponential maps
38.8 Convex functions
38.0.1 Remark: This chapter defines geodesic curves, Jacobi fields, convex sets and convex functions for a manifold with an affine connection. The author’s current research interests are related to these topics. The requirements of this chapter have provided the motivation to write the rest of the book. The calculation of the parallel transport of second-order differential operators along geodesic curves has required a thorough analysis of the concepts of tangent spaces, differentials, connections and curvature. It is fortuitous that the development of definitions related to second-order Jacobi fields requires the prior clarification of such a wide range of concepts in differential geometry. Almost the whole book has been generated by recursively developing the definitions required for second-order Jacobi fields. Therefore this is in a sense the core chapter of the book, from the author’s perspective.
38.0.2 Remark: A curve is defined in Section 16.2 as a map γ : I → M from a real interval to a topological space M. A (continuous oriented) path is defined in Section 16.4 as an equivalence class of curves which are related by oriented homeomorphisms of the parameter interval. In the context of geodesic curves, two curves are considered equivalent only if they are related by an affine parameter transformation. Therefore it may be convenient to define an “affine path” as an equivalence class of curves which are affinely related. Then two curves may be said to “have the same affine path” if they are in the same equivalence class.
38.1. Covariant derivatives of vector fields along curves
[ Also try to define covariant derivatives along curves in general differentiable fibre bundles. ]
38.1.1 Remark: This section presents covariant derivatives along curves and families of curves. The phrase “along curves” is used instead of “on curves” to emphasize that covariant derivatives are not defined on the image set of a curve. Covariant derivatives are well-defined at self-intersections of curves because they are defined in terms of the parameter interval, not the image set.
38.1.2 Definition: The covariant derivative of a $C^1$ vector field $X : \mathbb{R} \to T(M)$ along a curve $\gamma : \mathbb{R} \to M$ in a $C^1$ manifold M with respect to an affine connection θ on M is the function $t \mapsto (D_{\gamma'(t)}X)(t)$ defined by
$(D_{\gamma'(t)}X)(t) = \varpi_{X(t)}(X'(t) - \theta_{\gamma'(t)}(X(t))).$
[ The notation $(D_{\gamma'(t)}X)(t)$ in Definition 38.1.2 needs to be fixed. The RHS is okay though. Maybe a better notation would be just DX(t), since the curve is implicit in the function X. (γ(t) = π(X(t)) for all $t \in \mathbb{R}$.) ]
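Figure 38.1.1   Covariant derivative of a vector field along a curve
(Diagram labels: the tangent bundle T(M) above the manifold M with projection π; the point γ(t) ∈ M and the vector X(t) ∈ T(M); the vectors $X'(t) \in T_{X(t)}(T(M))$ and $\theta_{\gamma'(t)}(X(t)) \in T_{X(t)}(T(M))$; their difference $X'(t) - \theta_{\gamma'(t)}(X(t)) \in \ker((d\pi)_{X(t)})$; and $DX(t) = \varpi_{X(t)}(X'(t) - \theta_{\gamma'(t)}(X(t))) \in T_{\gamma(t)}(M)$.)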
38.1.3 Remark: Definition 38.1.2 is illustrated in Figure 38.1.1.
[ Must settle on some standard notations. Might use $\gamma_{**}$ instead of $\gamma''$ for the abstract (connection-free) second-order tangent vector field. Might use something like $D^2\gamma$ instead of $D_{\gamma'(t)}\gamma'(t)$. Might add the connection θ to the D-symbol, as for example $D^{\theta}_{\gamma'(t)}$. Can then compare $D^{\theta_1}_{\gamma'(t)}$ with $D^{\theta_2}_{\gamma'(t)}$. ]
38.1.4 Definition: The (covariant) acceleration of a $C^2$ open curve γ : I → M in a $C^2$ manifold M with respect to an affine connection θ on M is the map $D_{\gamma'}(\gamma') : \mathrm{Int}(I) \to T(M)$ defined by:
$\forall t \in I, \quad D_{\gamma'(t)}\gamma'(t) = \varpi_{\gamma'(t)}(\partial_t^2\gamma(t) - \theta_{\gamma'(t)}(\gamma'(t))),$
where $\partial_t^2\gamma : I \to T(T(M))$ satisfies $\partial_t^2\gamma(t) \in T_{\gamma'(t)}(T(M))$ for t ∈ I, and $\theta_{\gamma'(t)} \in T_{\gamma'(t)}(T(M))$ also. The function $\varpi_z : \ker((d\pi)_z) \to T_{\gamma(t)}(M)$ for z ∈ T(M) is the drop-function for T(M). (See Definition 28.11.6.)
38.2. Geodesic curves
38.2.1 Remark: Differentiable curves are defined in Section 27.6. Curves and families of curves are always assumed to be continuous unless otherwise stated.
38.2.2 Remark: The geodesic curves in Definition 38.2.3 are affine-parametrized. It is understood that a $C^2$ curve I → M means a curve which is $C^2$ on the interior of I and continuous on all of I.
38.2.3 Definition: A geodesic curve in a $C^2$ manifold M is a $C^2$ curve γ : I → M such that $(D_{\gamma'(t)}\gamma')(t) = 0$ for all t ∈ I, where I is an interval of $\mathbb{R}$.
[ Also define geodesic curves in terms of θ, ρ, ω etc. ]
38.2.4 Definition: A geodesic path in a $C^2$ manifold M is the equivalence class of any geodesic curve in M with respect to affine reparametrization.
38.2.5 Theorem: In terms of local coordinates, have
$\frac{d^2x^i}{dt^2} + \Gamma^i_{jk}\, \frac{dx^j}{dt}\, \frac{dx^k}{dt} = 0,$
for a geodesic γ, in terms of x = ψ ◦ γ.
[ Note that this sort of geodesic is an affinely parametrized geodesic. There’s probably a simple formula also for a generally parametrized geodesic. ]
[ For subscripts and superscripts as in the derivatives of $x^k$ above, should have some macros to make the scripts extend over the side of the fraction. For instance, \rlapsuper could mean a superscript which overlaps to the right. ]
[ Show near here a precise way of constructing the connection from affine-parametrized geodesics. ]
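The coordinate form of the geodesic equation in Theorem 38.2.5 can be integrated numerically. The sketch below assumes the flat connection on the punctured plane in polar coordinates (r, θ), whose nonzero Christoffel symbols are $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$, and checks that the computed geodesic is a uniformly traversed straight line in Cartesian coordinates. The integrator and the function names are illustrative choices.

```python
import math

def geodesic_rhs(state):
    """First-order form of d²x^i/dt² + Γ^i_jk (dx^j/dt)(dx^k/dt) = 0 in polar coordinates.

    state = [r, θ, dr/dt, dθ/dt]; the assumed Christoffel symbols are
    Γ^r_θθ = -r and Γ^θ_rθ = Γ^θ_θr = 1/r.
    """
    r, th, dr, dth = state
    return [dr, dth, r * dth * dth, -2.0 * dr * dth / r]

def rk4(f, state, t_end, steps):
    """Classical fourth-order Runge-Kutta integration from t = 0 to t = t_end."""
    dt = t_end / steps
    for _ in range(steps):
        k1 = f(state)
        k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
        k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
        k4 = f([s + dt * k for s, k in zip(state, k3)])
        state = [s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return state

def geodesic_endpoint_cartesian(t_end, steps=1000):
    """Start at (x, y) = (1, 0) with Cartesian velocity (0, 1), which is
    (r, θ) = (1, 0) with (dr/dt, dθ/dt) = (0, 1), and return (x, y) at t_end."""
    r, th, _, _ = rk4(geodesic_rhs, [1.0, 0.0, 0.0, 1.0], t_end, steps)
    return (r * math.cos(th), r * math.sin(th))

x, y = geodesic_endpoint_cartesian(1.0)
print(x, y)  # should be close to the straight-line point (1, t) = (1, 1)
```

Because the parameter is affine, the point moves at constant speed along the line x = 1. A non-affine reparametrization would trace the same path but would not satisfy this form of the equation.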
38.3. Jacobi fields
[ See especially Klingenberg [25], section 1.12, for Jacobi fields. ]
38.3.1 Remark: The most important motivation for writing this book is to obtain some really good estimates for the Jacobi field and its first and second derivatives. Of interest are not only the magnitudes of the derivatives, but also the amount of rotation and skewing of the Jacobi field relative to parallel transport along a path. In this section, derivative estimates should be found for the Jacobi field in terms of the curvature, without using the metric. Then in the next chapter, the metric should be used to get better estimates.
[ Here define families of geodesic curves. Then find the derivatives in terms of the equations for geodesics. Use this to derive the equations for Jacobi fields. Do this both with the covariant derivative D and with Christoffel symbols. ]
38.3.2 Definition: A Jacobi field along a geodesic curve γ in a $C^\infty$ manifold M is a vector field Y along the curve γ such that $Y'' + R(\gamma', Y)\gamma' = 0$.
[ Each part of the equation in Definition 38.3.2 should be checked to see if it is sensible. There are also the questions of existence and uniqueness. The equation might be related to $D_{\gamma'}\gamma' = 0$ etc. ]
[ The curvature tensor R of M is defined in EDM2 [34] 80.J. The covariant derivative Y′ of Y along γ and the tangent vector γ′ will hopefully also be defined somewhere some day. Jacobi fields are in EDM [33] 48.C and EDM2 [34] 178. ]
[ Mention the equation of geodesic variation. See Greene/Wu [67], page 6. The coverage by Gallot/Hulin/Lafontaine [19], pages 118–124 is very useful for Jacobi fields in the context of Riemannian manifolds. Should cover the equation of geodesic variation thoroughly, especially covering the case of $C : \bar\Omega \times \bar\Omega \times [0, 1] \to \bar\Omega$ in Section 38.6. ]
[ Estimates for the Jacobi field should go here! ]
[ Give examples here of exact solutions of Jacobi field equations for “constant curvature” connections. Do this for various dimensions of manifolds. ]
38.4. Convex sets
[ See EDM2 [34], 364.C, for convex neighbourhoods. ]
38.4.1 Definition: A convex subset of a $C^1$ manifold M with an affine connection is a subset K of M such that for all x, y ∈ K, there is a unique geodesic in M which has endpoints x and y and lies in K.
[ See Klingenberg [25], definition 1.9.9, p.84. ]
38.4.2 Definition: A starlike subset of a $C^1$ manifold M with an affine connection is a subset K of M such that for some x ∈ K, for all y ∈ K, there is a unique geodesic in M which has endpoints x and y and lies in K.
38.4.3 Remark: An open hemisphere of a sphere is convex, but its closure is not convex. This is because a closed hemisphere contains pairs of conjugate points. A whole sphere, minus a single point, is starlike, centred on the point opposite the excluded point.
[ This example must be in the section on $S^n$ also. Refer from here to that much fuller treatment. ]
38.4.4 Theorem: If a subset of a $C^1$ manifold with an affine connection is convex, then it is also starlike.
[ Is it true that the closure of a convex set is starlike? And is it true that a closed convex subset is covered by a single map? And what happens in manifolds with boundaries? ]
38.4.5 Theorem: Let K be an open starlike subset of a $C^1$ manifold with an affine connection. Then K is homeomorphic to the ball $B_1(0)$ of $\mathbb{R}^n$.
[ Use normal coordinates? See Klingenberg [25] around about Definition 1.9.9. ]
38.4.6 Theorem: Let K be a starlike subset of a $C^1$ manifold with an affine connection. Then there exists a chart $\psi \in \mathring{C}(M, \mathbb{R}^n)$ such that ψ is $C^1$-compatible with atlas(M) and K ⊆ Dom(ψ).
Proof: [ This requires a construction of some sort, presumably from some sort of gluing together of charts which cover K. ]
38.5. Convex combinations
[ A better term for convex combinations might be “geodesic interpolations”? ]
[ For parametrized families of geodesics, see Klingenberg [25], section 1.9. ]
[ The image of a family of geodesic curves parametrized by endpoints in convex subsets $K_1$ and $K_2$ of a manifold M with an affine connection may be a convex subset of M under some circumstances. Determine what these are. ]
38.5.1 Theorem: Let Ω be an open set in a $C^\infty$ manifold M such that $\bar\Omega$ is convex in M. Then there exists a unique family $\gamma : \bar\Omega \times \bar\Omega \times [0, 1] \to \bar\Omega$ of geodesic curves parametrized by their endpoints in $\bar\Omega$. The restriction of this family to Ω × Ω × [0, 1] is $C^\infty$.
[ Near here have a graphic showing two points $x_1$ and $x_2$ near each other, and two points $y_1$ and $y_2$ also near each other. Show the geodesics joining each x-y pair, and some representative point γ(0.4), say, on each geodesic. ]
38.5.2 Definition: Let M be a $C^\infty$ manifold with an affine connection. Let N denote the subset of M × M consisting of those pairs of points in M for which there exists a unique geodesic joining the points in M. Then the convex combination function $C_M : N \times [0, 1] \to M$ is defined for (x, y) ∈ N by setting $C_M(x, y, \lambda)$ equal to γ(λ) for the unique geodesic curve γ : [0, 1] → M with γ(0) = x and γ(1) = y. The notation C may be used for $C_M$ when the manifold is implicit in the context.
38.5.3 Theorem: Let Ω be an open subset of a $C^\infty$ manifold M such that $\bar\Omega$ is a convex subset of M. Then $\bar\Omega \times \bar\Omega \subseteq N$, and the restriction $C_{\bar\Omega}$ of $C_M$ to $\bar\Omega \times \bar\Omega \times [0, 1]$ is the unique family of geodesic curves parametrized by their endpoints in $\bar\Omega$. Moreover, $C_M$ is continuous, and probably $C^1$, and quite likely $C^\infty$, if the connection is $C^\infty$.
[ This differentiability question deserves to be looked into. Find out at least some sufficient conditions for the continuity of C. ]
[ Should try to define here the centroid of a set of points, and more generally, a convex combination of a set of points, and the convex hull of a set. Quite likely this is not so simple as in flat space, because the order in which convex combinations are performed may have an influence. So the curvature might cause a blowing out of the set of combinations from a submanifold into a set of variable thickness. ]
38.6. Families of geodesic interpolations
38.6.1 Remark: This section deals with families of geodesic curves which are parametrized by their endpoints. Of special interest is the way in which first and second order derivatives are transmitted along the family of curves.
38.6.2 Remark: Of special interest is the differential of the map $x \mapsto C(x, y, \lambda)$ from M to M. This probably should be analyzed in terms of the full map from M × M × [0, 1] to M. This generates a whole bunch of Jacobi fields.
[ What does ⊕ mean in $T_y(M) \oplus T_\lambda(\mathbb{R})$ in the following? ]
If C is $C^\infty$, then the differential
$(dC)_{x,y,\lambda} : T_x(M) \oplus T_y(M) \oplus T_\lambda(\mathbb{R}) \to T_{C(x,y,\lambda)}(M)$
is a well-defined linear map for all (x, y, λ) ∈ Ω × Ω × (0, 1). Let z = C(x, y, λ). Then for each (x, y, λ), the map $(dC)_{x,y,\lambda}$ may be decomposed into three components as follows:
$\frac{\partial z}{\partial x}(x, y, \lambda) = (dC)_{x,y,\lambda}(\cdot, 0, 0)$
$\frac{\partial z}{\partial y}(x, y, \lambda) = (dC)_{x,y,\lambda}(0, \cdot, 0)$
$\frac{\partial z}{\partial \lambda}(x, y, \lambda) = (dC)_{x,y,\lambda}(0, 0, \cdot),$
so that
$\frac{\partial z}{\partial x}(x, y, \lambda) : T_x(M) \to T_z(M)$
$\frac{\partial z}{\partial y}(x, y, \lambda) : T_y(M) \to T_z(M)$
$\frac{\partial z}{\partial \lambda}(x, y, \lambda) : T_\lambda(\mathbb{R}) \to T_z(M).$
For bases $(e^w_i)_{i=1}^n$ at w = x, y and z, will then have $(\partial z/\partial x)\, e^x_i = a^j{}_i\, e^z_j$, $(\partial z/\partial y)\, e^y_i = b^j{}_i\, e^z_j$, and $(\partial z/\partial\lambda) = h^j e^z_j$, for some arrays $[a^j{}_i]_{i,j=1}^n$, $[b^j{}_i]_{i,j=1}^n$, and $[h^j]_{j=1}^n$.
[ In above “decompositions into three components”, find Jacobi fields J(v, w, λ) for C(x, y, λ). Check the component equations too. ]
[ Under what conditions is it guaranteed that $C(x,y,\lambda)$ is a $C^k$ function of $x, y \in M$? ]

38.6.3 Definition: A family of geodesic curves parametrized by their endpoints is a continuous map $\gamma : S_1 \times S_2 \times [0,1] \to M$ such that $S_1$ and $S_2$ are subsets of a $C^\infty$ manifold $M$ and
(i) $\forall x \in S_1$, $\forall y \in S_2$, $\forall\lambda \in [0,1]$, $\gamma(x,y,0) = x$ and $\gamma(x,y,1) = y$,
(ii) $\forall x \in S_1$, $\forall y \in S_2$, $\gamma_{x,y}$ is a geodesic curve in $M$,
where $\gamma_{x,y} : [0,1] \to M$ is defined for $x \in S_1$ and $y \in S_2$ by $\gamma_{x,y}(\lambda) = \gamma(x,y,\lambda)$ for $\lambda \in [0,1]$. The family of geodesics is said to be $C^r$ when $S_1$ and $S_2$ are open subsets of $M$, and the restriction of $\gamma$ to $S_1 \times S_2 \times (0,1)$ is a $C^r$ map.

[ The derivatives $d$ in the following remark should be partial derivatives. ]

38.6.4 Remark: In terms of local coordinates, one has
$$\begin{aligned}
\frac{d^2 z^i}{d\lambda^2} + \Gamma^i_{jk}\frac{dz^j}{d\lambda}\frac{dz^k}{d\lambda} &= 0, \\
g_{ij}\frac{dz^i}{d\lambda}\frac{dz^j}{d\lambda} &= d(x,y)^2
\end{aligned}$$
etc. etc. Clearly one also has
$$\begin{aligned}
\left.\frac{\partial z^i}{\partial x^j}\right|_{\lambda=0} &= \delta^i_j &
\left.\frac{\partial z^i}{\partial x^j}\right|_{\lambda=1} &= 0 \\
\left.\frac{\partial z^i}{\partial y^j}\right|_{\lambda=0} &= 0 &
\left.\frac{\partial z^i}{\partial y^j}\right|_{\lambda=1} &= \delta^i_j,
\end{aligned}$$
if the same chart is used over a neighbourhood of the geodesic joining $x$ and $y$.
[ In a space of constant sectional curvature, the matrix ∂z/∂x is effectively diagonal, with diagonal elements α, . . . , α, (1 − λ). This puts limits on the maximum “magnification factor” along the geodesic, depending on what the sectional curvature actually is. It should be checked whether the sectional curvature is meaningful in this way with just an affine connection. Maybe not! ] [ Jacobi fields should be defined near here in terms of the C map. This sort of Jacobi field should be just Yy,z (λ) = (dC)(y, z, λ)(Ly , Lz , Lλ ), where Ly ∈ Ty (M ), Lz ∈ Tz (M ) and Lλ = 0 ∈ Tλ (IR), and y and z are fixed. In other words, the Jacobi field for given y, z ∈ M and Ly ∈ Ty (M ) and Lz ∈ Tz (M ) is the image under dC of the vector (Ly , Lz , 0) ∈ T(y,z,λ) (M × M × [0, 1]). This could be made the primary definition, and the definition in terms of the Riemann curvature tensor should be presented as a mere property of the Jacobi field. A difficulty with this is that the geodesics must exist and be uniquely defined in a neighbourhood of (y, z) for the Jacobi field to be defined, whereas the Riemann tensor definition is a local definition. Perhaps the convex combination definition should be used only as a motivational definition. Should have a theorem that Yy,z (·) is a Jacobi field. ]
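The constant-curvature note above can be explored numerically in the simplest case. The following Python sketch assumes the unit sphere in $\mathbb{R}^3$ with great-circle geodesics as the model space; this choice, the function names and the sample values are illustrative assumptions, not part of the text. It estimates by finite differences how far $z = C(x,y,\lambda)$ moves when $x$ is displaced, once perpendicular to the geodesic and once along it. On the unit sphere the perpendicular factor should be close to $\sin((1-\lambda)d)/\sin d$, where $d$ is the distance from $x$ to $y$, and the tangential factor should be close to $1-\lambda$.

```python
import math

def normalize(v):
    """Scale a vector in R^3 to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def geodesic_point(x, y, lam):
    """Assumed model of C(x, y, lam): great-circle interpolation on the unit sphere.

    Valid when x and y are unit vectors that are neither equal nor antipodal.
    """
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(x, y))))
    d = math.acos(dot)  # geodesic distance between x and y
    w0 = math.sin((1.0 - lam) * d) / math.sin(d)
    w1 = math.sin(lam * d) / math.sin(d)
    return tuple(w0 * a + w1 * b for a, b in zip(x, y))

def magnification(x, x_moved, y, lam, eps):
    """Finite-difference displacement of z per unit displacement of x."""
    z0 = geodesic_point(x, y, lam)
    z1 = geodesic_point(x_moved, y, lam)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(z0, z1))) / eps

d, lam, eps = 1.0, 0.3, 1e-6
x = (1.0, 0.0, 0.0)
y = (math.cos(d), math.sin(d), 0.0)

# Move x by about eps perpendicular to the geodesic, then along it (away from y).
x_perp = normalize((1.0, 0.0, eps))
x_tang = (math.cos(eps), -math.sin(eps), 0.0)

perp_ratio = magnification(x, x_perp, y, lam, eps)
tang_ratio = magnification(x, x_tang, y, lam, eps)

print(perp_ratio, math.sin((1.0 - lam) * d) / math.sin(d))
print(tang_ratio, 1.0 - lam)
```

The two printed pairs compare the finite-difference estimate with the expected factor. Only the sphere is covered here; the hyperbolic case would need a different model of $C$.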
38.7. Exponential maps

[ Must clarify the difference between geodesic coordinates and the sorts of coordinates you get by parallel transporting vectors along a geodesic. ]

38.7.1 Remark: Exponential maps may also be called normal coordinates or geodesic coordinates. For geodesic coordinates, see EDM2 [34], section 80.J. For normal coordinates, see Gallot/Hulin/Lafontaine [19], 2.86, page 84, and section II.C (page 80) generally.

[ In this section, must cover families of geodesics, in particular. Also vector fields on families of curves. This section is intended as preparation for Jacobi fields. ]

38.7.2 Definition: [Definition of exponential map.]

38.7.3 Definition: [Definition of geodesic coordinates. See end of EDM2 [34] 80.J.]

38.7.4 Definition: [Definition of conjugate point.]

38.7.5 Definition: [Definition of convex neighbourhood. This might have something to do with conjugate points. That is, a convex set should have no pairs of conjugate points in it, maybe.]

38.8. Convex functions

[ The concepts of $\alpha$-concavity and harmonic concavity should be given here. Must also replicate all of the properties of $\alpha$-concavity in Kennington [119]. ]

38.8.1 Definition: A convex function on a convex subset $K$ of a $C^\infty$ manifold is a function $f : K \to \mathbb{R}$ such that
$$\forall x, y \in K,\ \forall\lambda \in [0,1],\quad f(C(x,y,\lambda)) \le (1-\lambda)f(x) + \lambda f(y).$$

[ See Greene/Wu [67], page 7, for a definition of a convex function in terms of the Hessian. Also should define “strictly convex” here. ]

38.8.2 Definition: The convexity test function on a manifold $M$ with affine connection is the function $c : \mathbb{R}^M \to \mathbb{R}^{N \times [0,1]}$ defined by
$$\forall u \in \mathbb{R}^M,\ \forall (x,y,\lambda) \in N \times [0,1],\quad c(u)(x,y,\lambda) = (1-\lambda)u(x) + \lambda u(y) - u(C_M(x,y,\lambda)),$$
where $N$ is the set of pairs of non-conjugate points in $M$. (This definition is non-standard.)
38.8.3 Remark: The convexity test function in Definition 38.8.2 may be viewed as a “geodesic leverage map” because a variation of either $x$ or $y$ causes a variation in $c(u)(x,y,\lambda)$ which is reminiscent of the action of a lever. The design of a pantograph is similar. So it could also be called a “pantograph map”. (In the 1960s, there was a cheap kind of plastic pantograph called a “sketchagraph”. The author received one as a birthday present and was not made happy by it!)

The convexity test function seems to have been first used to prove geometric properties of solutions of BVPs by Korevaar [122]. A generalization to the $\alpha$-convexity test function $c_\alpha : \mathbb{R}^M \to \mathbb{R}^{N \times [0,1]}$ is defined and used in Kennington [118]:
$$\forall u \in \mathbb{R}^M,\ \forall (x,y,\lambda) \in N \times [0,1],\quad c_\alpha(u)(x,y,\lambda) = (1-\lambda)u^\alpha(x) + \lambda u^\alpha(y) - u^\alpha(C_M(x,y,\lambda)),$$
for $\alpha > 0$. In fact, this may be generalized to $\alpha \in [-\infty, \infty]$.

38.8.4 Theorem: Let $c$ be the convexity test function on a manifold $M$ with affine connection, and let $u \in \mathbb{R}^K$ be a real-valued function on a convex subset $K$ of $M$. Then $u$ is convex in $K$ if and only if $c(u)$ is non-negative in $K \times K \times [0,1]$. Similarly, $u$ is concave in $K$ if and only if $c(u)$ is non-positive in $K \times K \times [0,1]$.

38.8.5 Theorem: Let $c$ be the convexity test function on a manifold $M$ with affine connection, and let $u \in C^1(\Omega)$ be a function on a convex open subset $\Omega$ of $M$. Then
$$\begin{aligned}
\frac{\partial c(z)}{\partial x}(x,y,\lambda) &= (1-\lambda)\frac{\partial u}{\partial x} - \frac{\partial u}{\partial z}\frac{\partial z}{\partial x} \\
\frac{\partial c(z)}{\partial y}(x,y,\lambda) &= \lambda\frac{\partial u}{\partial y} - \frac{\partial u}{\partial z}\frac{\partial z}{\partial y} \\
\frac{\partial c(z)}{\partial \lambda}(x,y,\lambda) &= (1-\lambda)\frac{\partial u}{\partial \lambda} - \frac{\partial u}{\partial z}\frac{\partial z}{\partial \lambda}
\end{aligned}$$
where. . .

[ The last formula above is missing at least one term! ]

[ Clearly it would be interesting to know if $\partial z/\partial x$ and $\partial z/\partial y$ are invertible, and to know also something about their eigenvalues (if eigenvalues are of relevance). Also of great interest are the second derivatives of $z$ with respect to $x$, $y$ and $\lambda$. Then should get
$$\frac{\partial^2 c}{\partial x\,\partial x} = (1-\lambda)\frac{\partial^2 u(z)}{\partial x\,\partial x} \ldots$$
and so forth. All of these ideas on transport of second order differential operators should be worked out properly. In particular, work out what kind of object a second order operator is. Calculate all second order derivatives of the $C$ map. ]
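As a concrete check of Definition 38.8.2 and Theorem 38.8.4, the following Python sketch evaluates the convexity test function in flat $\mathbb{R}^n$, where the geodesics are straight lines and $C(x,y,\lambda) = (1-\lambda)x + \lambda y$. The flat setting, the test function $u(x) = |x|^2$ and all function names are assumptions made for illustration. In this flat case, differentiating the definition with respect to $\lambda$ gives $u(y) - u(x) - \nabla u(z)\cdot(y - x)$, which may be relevant to the note about the $\lambda$-derivative formula.

```python
def convex_combination(x, y, lam):
    """C(x, y, lam) in flat R^n: the straight-line geodesic from x to y."""
    return tuple((1.0 - lam) * a + lam * b for a, b in zip(x, y))

def convexity_test(u, x, y, lam):
    """c(u)(x, y, lam) = (1 - lam) u(x) + lam u(y) - u(C(x, y, lam))."""
    return (1.0 - lam) * u(x) + lam * u(y) - u(convex_combination(x, y, lam))

def squared_norm(x):
    """A convex sample function u(x) = |x|^2."""
    return sum(a * a for a in x)

def squared_norm_gradient(x):
    """Gradient of u(x) = |x|^2."""
    return tuple(2.0 * a for a in x)

def dc_dlambda_flat(u, grad_u, x, y, lam):
    """lambda-derivative of c(u) in flat space: u(y) - u(x) - grad u(z).(y - x)."""
    z = convex_combination(x, y, lam)
    return u(y) - u(x) - sum(g * (b - a) for g, a, b in zip(grad_u(z), x, y))

x, y, lam = (0.0, 0.0), (2.0, 0.0), 0.25
value = convexity_test(squared_norm, x, y, lam)
slope = dc_dlambda_flat(squared_norm, squared_norm_gradient, x, y, lam)
print(value, slope)
```

For $u(x) = |x|^2$ the test function reduces to $\lambda(1-\lambda)|x-y|^2$, which is non-negative, in agreement with the convexity of $u$.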
Chapter 39
Riemannian manifolds

39.1 Historical notes on Riemannian geometry
39.2 The Riemannian metric
39.3 The point-to-point distance function
39.4 The Levi-Civita connection
39.5 Curvature tensors
39.6 Differential operators
39.7 Inner product
39.8 Embedded Riemannian manifolds
39.9 Information geometry
39.0.1 Remark: The Riemannian metric has been banned from all chapters prior to this to avoid confusing pre-metric geometry with the convenient special formulas of Riemannian spaces. Too many textbooks begin with Riemannian spaces and then ask readers to forget what they have just learned so that metric-free affine connections can be presented. In this book, Riemannian manifolds are presented after the chapters on general connections. All results on general connections may be applied without restraint to the special kinds of connections on Riemannian manifolds, including Levi-Civita connections and metric connections.

39.0.2 Remark: A Riemannian manifold is essentially a metric space for which the Hessian of the square of the distance function is defined everywhere. (The conditions can probably be made weaker than this.) The Riemannian metric is defined to be half the Hessian of the square of the distance function. Thus a Riemannian manifold is the same as the familiar metric space studied in topology, with the constraint that it must have a differentiable structure, and its distance function must satisfy a differentiability condition with respect to its charts.

[ This chapter covers all of EDM2 [34] 364, and a small section of EDM2 [34] 105. Federer [105], section 5.4.12, page 634, has an interesting instant summary of Riemannian geometry for submanifolds of $\mathbb{R}^n$. ]

[ The most urgent thing to calculate in this chapter is a set of bounds on the length of Jacobi fields. In the particular case of constant sectional curvature, it is well known that the lengths of Jacobi fields vary with the sine or hyperbolic sine of the distance along the geodesic. The second most urgent thing is probably to find out what happens to the inner product of any pair of Jacobi fields on a single geodesic. In the case of constant sectional curvature, the fields are parallel transported along the geodesic so that angle is preserved. ]

[ Probably the conformal sublayer should be treated at the end of this chapter or after the pseudo-Riemannian space chapter. ]

39.0.3 Remark: A “conformal sublayer” may be interpolated between the connection and metric layers. In this sublayer, angles between vectors are defined globally but distances are not.
39.1. Historical notes on Riemannian geometry

39.1.1 Remark: The Riemannian style of differential geometry commenced with the systematic study of quadratic differential forms by Gauß in “Disquisitiones generales circa superficies curvas” (1827). These were line elements on 2-dimensional surfaces embedded in $\mathbb{R}^3$. This was generalized to intrinsic geometry of arbitrary dimensions by Riemann in “Über die Hypothesen welche der Geometrie zu Grunde liegen” (1854). Christoffel invented covariant differentiation in 1869, and Ricci-Curbastro called it “covariant differentiation” in his development of tensor calculus, published 1887–88. (See Bell [189], pages 210, 354, 358–360.)

39.1.2 Remark: Bell [190], pages 263–264, said the following in 1937 about the early origins of differential geometry.

    During the period 1821–1848 Gauss was scientific adviser to the Hanoverian (Göttingen was then under the government of Hanover) and Danish governments in an extensive geodetic survey. Gauss threw himself into the work. His method of least squares and his skill in devising schemes for handling masses of numerical data had full scope but, more importantly, the problems arising in the precise survey of a portion of the earth’s surface undoubtedly suggested deeper and more general problems connected with all curved surfaces. These researches were to beget the mathematics of relativity. The subject was not new; several of Gauss’ predecessors, notably Euler, Lagrange, and Monge, had investigated geometry on certain types of curved surfaces, but it remained for Gauss to attack the problem in all its generality, and from his investigations the first great period of differential geometry developed.

    Differential geometry may be roughly described as the study of properties of curves, surfaces, etc., in the immediate neighbourhood of a point, so that higher powers than the second of distances can be neglected. Inspired by this work, Riemann in 1854 produced his classic dissertation on the hypotheses which lie at the foundations of geometry, which, in its turn, began the second great period of differential geometry, that which is today of use in mathematical physics, particularly in the theory of general relativity.

39.1.3 Remark: The Riemannian metric arose as an attempt to give differential geometry an intrinsic framework as opposed to the extrinsic treatment of embedded manifolds. Gauß had found that the Gaußian curvature of a manifold was independent of embedding. (The date 1827 is given by do Carmo [16], pages 36–37, for this.) The Riemannian metric made this possible by abstracting the metric properties inherited by a surface from its embedding within a Euclidean space. It turns out that by knowing only the metric tensor, it is possible to derive all other aspects of the intrinsic geometry of a manifold. It follows, therefore, that the embedding can be dispensed with. The Riemannian metric framework was then the obvious candidate for the basis of the curved space-time generalization of Einstein’s special relativity to general relativity.

As soon as the extrinsic framework of an embedded manifold is discarded, however, many questions arise as to how to redefine all extrinsic geometric objects in terms of the Riemannian metric. A useful picture to have in mind is that of a population of geometers who are trapped inside a surface embedded in a Euclidean space, but who cannot experience anything outside the surface at all.

39.1.4 Remark: The historical origin of the Riemannian metric seems to be the first fundamental form
$$\sqrt{E\,dp^2 + 2F\,dp \cdot dq + G\,dq^2}$$
arising out of curvature calculations by Gauß for a 2-manifold embedded in $\mathbb{R}^3$. For this, see Gauß: “Disquisitiones generales circa superficies curvas” (1827), translated in Spivak [42] book II, pages 55–111, particularly pages 87–95. The Riemannian metric is a generalization of this first fundamental form for such embedded surfaces. The key passage for this is in article 12 of the “Disquisitiones” as follows (Spivak [42], book II, pages 91–93).

    Since we always have
    $$dx^2 + dy^2 + dz^2 = E\,dp^2 + 2F\,dp \cdot dq + G\,dq^2$$
    it is clear that $\sqrt{(E\,dp^2 + 2F\,dp \cdot dq + G\,dq^2)}$ is the general expression for the linear element on the curved surface. The analysis developed in the preceding article thus shows us that for finding the measure of curvature there is no need of finite formulæ, which express the coordinates $x, y, z$
    as functions of the indeterminants $p, q$; but that the general expression for the magnitude of any linear element is sufficient. Let us proceed to some applications of this very important theorem.

In other words, the quadratic form contains all of the information required for analysis of intrinsic geometric properties of the surface. This is the fundamental idea behind the Riemannian metric, namely that the (quadratic) Riemannian metric tensor contains all of the information required about distance elements for the surface. Therefore the particular choice of embedding does not need to be known.

39.1.5 Remark: There are many questions which naturally arise from the definition of the Riemannian metric (Definition 39.2.3). For example, why is it quadratic rather than cubic or linear, or some more general form of function? Why is it a tensor? Why is it symmetric? Why is it positive definite? Why is the Riemannian metric not path-dependent like parallelism? In the physical world, how do the distances and angles at all points in space-time get synchronized to give a single universal standard of length?

Even to this day, it is not at all clear why space should have a quadratic character. Lengths add linearly in a straight line and quadratically at right angles. It is difficult to experience how surprising this fact is because human beings have never existed in a space which does not obey the Pythagoras theorem. As someone once observed, whoever discovered water was not a fish. (Albert Einstein [182] is supposed to have said: “Was weiß der Fisch von dem Wasser, in dem er sein Leben lang herumschwimmt?” In other words: “What does the fish know about the water in which it swims around all its life?”) It must have been very mysterious to the first discoverers of the $5^2 = 4^2 + 3^2$ rule for a right angle that these numbers should have such a strange relation to each other, and that there are so few of these integer formulas.
Struik [193], page 28, says the following regarding Babylonian geometry around 1750 BC.

    The texts show that the Babylonian geometry of the Semitic period was in possession of formulas for the areas of simple rectilinear figures and for the volumes of simple solids, although the volume of a truncated pyramid had not yet been found. The so-called theorem of Pythagoras was known, not only for special cases, but in full generality, as a numerical relation between the sides of a right triangle. This led to the discovery of “Pythagorean triples” such as (3, 4, 5), (5, 12, 13) etc.

The Pythagoras theorem was therefore one of the earliest non-trivial discoveries of the arithmetic nature of space. It is still at the core of our mathematical representation of space-time.

The concept of the Riemannian metric is a generalization of the Pythagoras theorem. The Riemannian metric effectively sets up a Pythagoras law at every point in space. Unlike the definition of parallelism over vectors at a distance, the lengths of vectors at different points are assumed to have an absolute relation to each other, independent of the path taken for the comparison. One might reasonably ask why vector length is not also path-dependent. A physical mechanism for parallel transport can be imagined, like photons carrying information about orientation, but it is not clear how physical space would transport vector length information between points to keep the definitions synchronized throughout a universe.

However, just as Bertrand Russell had to reluctantly accept Euclid’s axioms when he was young (in 1883 as mentioned in Remark 2.1.9), one must (at least temporarily) accept Riemannian geometry’s global and quadratic assumptions if one is to understand much of modern physics, either willingly or reluctantly. Of course, it turned out 30 years later that Euclid’s assumptions were wrong and Russell was right to be sceptical! (Some recent observations suggesting that the speed of light may not be constant could conceivably be related to the question of globality of the pseudo-Riemannian metric of physical space-time. Perhaps some notion of “metric transport” could be more realistic than an absolute global Riemannian metric. Then one would need to determine how a metric “transport mechanism” could yield a Riemannian metric as an approximation at short to medium distances.)

39.2. The Riemannian metric

[ Maybe could make this section work for $L^1$ functions or some sort of Sobolev function classes for the metric tensor. Alternatively try Lipschitz or rectifiable functions. ]

39.2.1 Remark: At each point $p \in M$ of a $C^1$ manifold $M$, a covariant tensor field $g \in T^{0,2}(M)$ evaluated at $p$ is a bilinear form $g_p : T_p(M) \times T_p(M) \to \mathbb{R}$. The tensor field $g$ is said to be “positive definite” when $g_p$ is positive definite for all $p \in M$.
39.2.2 Remark: The tensor field in Definition 39.2.3 is not necessarily continuous. However, continuity guarantees that the tensor field is equal to the derivative of the integral of itself, which is highly desirable. If a Riemannian metric were defined always as the derivative of a distance function, this would ensure an adequate level of continuity.

39.2.3 Definition: A Riemannian metric (tensor field) on a $C^1$ manifold $M$ is any positive-definite symmetric covariant tensor field of degree 2 on $M$.

39.2.4 Definition: A Riemannian metric (tensor field) of class $C^r$ on a $C^{r+1}$ manifold $M$ is a Riemannian metric $g$ on $M$ such that $g$ is of class $C^r$ on $M$.

39.2.5 Remark: The fact that the word “covariant” appears in Definition 39.2.3 does not imply that a metric tensor requires the prior definition of a connection. It has nothing to do with covariant derivatives. The $C^1$ condition on $M$ could possibly be weakened to a Hölder $C^{0,1}$ condition with the metric tensor being defined almost everywhere. This would still permit distance to be calculated.

39.2.6 Definition: A $C^r$ Riemannian manifold (or Riemannian space) for $r \in \mathbb{Z}^+$ is a pair $(M, g)$ such that $M$ is a $C^r$ differentiable manifold and $g$ is a $C^{r-1}$ Riemannian metric on $M$. The tensor $g$ is called the metric tensor or fundamental tensor of $M$.

39.2.7 Definition: The length $\|L\|$ of any vector $L \in T_p(M)$ for any point $p$ in a Riemannian manifold $M$ is defined by
$$\|L\| = \sqrt{g_p(L, L)}.$$
39.3. The point-to-point distance function

[ This section is very woolly right now. ]

[ Maybe could make this section work for $L^1$ functions or some sort of Sobolev function classes for the metric tensor. Alternatively try Lipschitz or rectifiable functions. ]

[ Maybe weak necessary conditions and sufficient conditions for a two-point distance function to determine a Riemannian metric can be discovered by using normal coordinates. ]

39.3.1 Remark: The objective of this section is to determine conditions, preferably necessary and sufficient, to place on a distance function $d$ for a manifold so that the manifold has a Riemannian metric $g$ with the same point-to-point distance function. It is apparently sufficient that $d$ be $C^2$ on a suitably differentiable manifold and satisfy some constraints on the second derivatives. It seems to be necessary that $d$ should be at least $C^{0,1}$.

Function classes should be determined for the distance function and metric tensor so that the distance-to-tensor and tensor-to-distance conversions are inverses of each other. (This is similar to the parallelism-to-connection and connection-to-parallelism conversions mentioned in Remark 36.2.8.)

39.3.2 Remark: The task here is to determine the precise relation between the Riemannian metric and the point-to-point distance function of topological space theory. The latter is defined as a function $d : M \times M \to \mathbb{R}^+_0$ on a set $M$ which satisfies the three conditions of identity $d(x,x) = 0$, symmetry $d(x,y) = d(y,x)$ and the triangle inequality $d(x,y) + d(y,z) \ge d(x,z)$. This may be referred to as a “two-point distance function”.

[ Check and justify conditions (39.3.1) and (39.3.2). ]

When the set $M$ is a manifold, it is possible to regard the distance function as a function $\bar d : \mathrm{Range}(\psi) \times \mathrm{Range}(\psi) \to \mathbb{R}^+_0$ defined by $\bar d : (x,y) \mapsto d(\psi^{-1}(x), \psi^{-1}(y))$, where $\psi$ is a chart for $M$. Then for the distance function $d$ to correspond with a Riemannian metric $g = (g_{ij})_{i,j=1}^n$ in terms of local coordinates, one would expect equation (39.3.1) to be satisfied.
$$\forall x, y \in \mathrm{Range}(\psi),\quad \bar d(x,y) = \sqrt{g_{ij}(x)(y^i - x^i)(y^j - x^j)} + o(|y - x|) \text{ as } y \to x. \tag{39.3.1}$$
This implies, and (probably) is implied by, equation (39.3.2).
$$\forall x, y \in \mathrm{Range}(\psi),\quad \bar d(x,y)^2 = g_{ij}(x)(y^i - x^i)(y^j - x^j) + o(|y - x|^2) \text{ as } y \to x. \tag{39.3.2}$$
Here $|y - x|$ denotes the standard Euclidean norm in $\mathbb{R}^n$. It is probably true that the manifold $M$ is a Riemannian manifold whose metric tensor is $g$ if and only if equation (39.3.2) is satisfied for all $x$ in every chart and $g$ is continuous and positive definite. It may follow from (39.3.2) that the second derivatives of $d(x,y)^2$ with respect to $y$ exist for all $x$ by using the continuity of $g$. With a bit of luck, these derivatives might be continuous with respect to $x$ too.

The above equations may be equivalent to the following.
$$\partial_v^+(\bar d(x, \cdot)) = \sqrt{g_{ij}(x)v^i v^j},$$
which means
$$\lim_{t \to 0^+} t^{-1}\,\bar d(x, x + tv) = \sqrt{g_{ij}(x)v^i v^j},$$
for all $v \in \mathbb{R}^n$. It may be that if all conditions are taken together, such as continuity of $d$ and the triangle inequality, then the second derivatives of $\bar d^2$ may exist and be continuous.

In the other direction, the distance function can be generated from the Riemannian metric by minimization over curves as follows.
$$d(x,y) = \min_{\gamma \in C^1_{x,y}} \int_\gamma \sqrt{g_{ij}\,dx^i\,dx^j},$$
where the minimization is over suitable $C^1$ curves from $x$ to $y$ in the usual way. If $g$ is continuous, then for $y$ in a small enough neighbourhood of $x$, the minimising curve $\gamma$ should be unique and $C^1$. By aligning the minimising curves with the radial directions out of $x$ (normal coordinates), equations (39.3.1) and (39.3.2) should be recovered.
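The integrand in the curve-length formula can be made concrete with a small numerical sketch. The Python code below approximates $\int_\gamma \sqrt{g_{ij}\,dx^i\,dx^j}$ for one given curve by a midpoint rule. It does not perform the minimization, so the result is only an upper bound for $d(x,y)$. The example metric $g = \mathrm{diag}(1, \sin^2\theta)$ (the unit sphere in coordinates $(\theta, \varphi)$), the two sample curves and the function names are assumptions chosen for illustration.

```python
import math

def sphere_metric(x):
    """Assumed example metric: unit sphere in coordinates x = (theta, phi)."""
    theta = x[0]
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def curve_length(metric, curve, steps=2000):
    """Midpoint-rule approximation of the length of curve : [0, 1] -> chart."""
    total = 0.0
    for k in range(steps):
        x0 = curve(k / steps)
        x1 = curve((k + 1) / steps)
        mid = [(a + b) / 2.0 for a, b in zip(x0, x1)]
        dx = [b - a for a, b in zip(x0, x1)]
        g = metric(mid)
        n = len(dx)
        total += math.sqrt(sum(g[i][j] * dx[i] * dx[j]
                               for i in range(n) for j in range(n)))
    return total

def meridian(t):
    """Segment of a meridian from theta = 0.5 to theta = 1.5 at phi = 0."""
    return (0.5 + t, 0.0)

def latitude_circle(t):
    """Full circle of latitude at theta = 1."""
    return (1.0, 2.0 * math.pi * t)

meridian_length = curve_length(sphere_metric, meridian)
circle_length = curve_length(sphere_metric, latitude_circle)
print(meridian_length, circle_length)
```

The meridian segment is a geodesic of length 1, while the circle of latitude has length $2\pi\sin 1$ and is not a geodesic.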
39.3.3 Remark: One problem with this is that if the Riemannian manifold is not connected, then there will be no curves at all between some pairs of points. Even if the manifold is connected, there may be no minimum-length geodesic. An example of this is a Euclidean space with a closed subset removed from it.

[ Should have a preliminary section on calculus of variations so as to be able to analyse the distance function $d$ obtained from a Riemannian metric $g$. ]

39.3.4 Example: A simple example shows that a Riemannian distance function is a very special kind of distance function. Consider the set $\mathbb{R}^n$ for $n \ge 2$ with the distance function $d : (x,y) \mapsto |y - x|_p$, where the $p$-norm is defined as usual by $|x|_p = \bigl(\sum_{i=1}^n |x^i|^p\bigr)^{1/p}$ for $1 \le p < \infty$, and $|x|_\infty = \max_{i=1}^n |x^i|$ in the $p = \infty$ case. Clearly $d$ corresponds to a Riemannian metric if and only if $p = 2$. (See Definition 10.8.1 for the $p$-norm.)

Consider the value of $d(0,y)$ for $n = 2$. The value is $\bigl(|y^1|^p + |y^2|^p\bigr)^{1/p}$. For a Riemannian metric, the squared distance $d(0,y)^2$ must converge to a quadratic function of $y$ as $y \to 0$. This can only happen for $p = 2$.

A change of coordinates can remove the problem at a single point, but not at all points in $\mathbb{R}^n$. This kind of example makes it clear that distance functions can only be Riemannian if they are in some sense locally affine distortions of Euclidean space with the 2-norm.

39.3.5 Remark: Theorem 39.3.6, which may not be perfectly correct in detail, is an attempt to determine the relation between two-point distance functions and Riemannian metrics. It seems that the Riemannian metric tensor is simply half the Hessian of the square of the distance function. When this Hessian exists, it is a well-defined tensor in $T^{0,2}(M)$.

Theorem 39.3.6 requires twice differentiability of the distance function. This is probably much stronger than is required. It is quite possible that the manifold only needs to be $C^{0,1}$.

[ Although Theorem 39.3.6 may be almost right if a topological metric space is assumed, in the case of a given Riemannian space it is necessary to consider various aspects of pathwise connectivity. ]

[ Maybe should split Theorem 39.3.6 into two theorems. (1) If $d$ is derived from $g$, then $g$ can be recovered from $d$. (2) If $g$ is derived from $d$, then $d$ can be recovered from $g$. Weak conditions should be given for each of these theorems. For example, it may be that $g$ and the Hessian of $d^2$ only need to be defined almost everywhere in the sense of Lebesgue measure or something like that. ]
39.3.6 Theorem: Suppose $M$ is a metric space with distance function $d : M \times M \to \mathbb{R}^+_0$, and that $M$ has a $C^2$ atlas $A_M$. Then $(M, A_M)$ is a Riemannian manifold with distance function $d$ if and only if the matrix of second derivatives
$$\left[\left.\frac{\partial^2 d(p,q)^2}{\partial\psi(q)^i\,\partial\psi(q)^j}\right|_{q=p}\right]_{i,j=1}^n \tag{39.3.3}$$
exists and is invertible for all $p \in M$ and $\psi \in \mathrm{atlas}_p(M)$.

39.3.7 Theorem: If condition (39.3.3) in Theorem 39.3.6 holds, then for all $p \in M$, the metric tensor $[g_{ij}(p)]_{i,j=1}^n$ of $M$ for each chart $\psi \in \mathrm{atlas}_p(M)$ is given by
$$\forall i, j = 1 \ldots n,\quad g_{ij}(p) = \frac{1}{2}\left.\frac{\partial^2 d(p,q)^2}{\partial\psi(q)^i\,\partial\psi(q)^j}\right|_{q=p}.$$
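The half-Hessian formula of Theorem 39.3.7 can be tested numerically in a case where both the distance function and the metric are known. The Python sketch below assumes the unit sphere with chart coordinates $(\theta, \varphi)$ and great-circle distance, for which the metric is known to be $\mathrm{diag}(1, \sin^2\theta)$. It approximates the second derivatives of $d(p,q)^2$ at $q = p$ by central differences. The choice of example, the step size and the function names are illustrative assumptions.

```python
import math

def embed(theta, phi):
    """Point of the unit sphere in R^3 with chart coordinates (theta, phi)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def sphere_distance(p, q):
    """Great-circle distance, computed from the chord for accuracy near q = p."""
    P, Q = embed(*p), embed(*q)
    chord = math.sqrt(sum((a - b) ** 2 for a, b in zip(P, Q)))
    return 2.0 * math.asin(min(1.0, chord / 2.0))

def half_hessian_of_squared_distance(dist, p, h=1e-3):
    """Approximate g_ij(p) = (1/2) d^2/dq^i dq^j of dist(p, q)^2 at q = p."""
    n = len(p)

    def f(i, si, j, sj):
        q = list(p)
        q[i] += si * h
        q[j] += sj * h
        return dist(p, tuple(q)) ** 2

    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Four-point central difference; for i == j the step is 2h.
            second = (f(i, 1, j, 1) - f(i, 1, j, -1)
                      - f(i, -1, j, 1) + f(i, -1, j, -1)) / (4.0 * h * h)
            g[i][j] = 0.5 * second
    return g

p = (1.0, 0.5)
g = half_hessian_of_squared_distance(sphere_distance, p)
print(g, math.sin(p[0]) ** 2)
```

The computed matrix should be close to $\mathrm{diag}(1, \sin^2 1)$ at the sample point, up to discretization error.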
[ Since the first derivative of $d^2$ with respect to $q$ is zero at $q = p$, it follows that the tensor $g = g_{ij}\,e^i \otimes e^j$ is the same tensor for all charts. See Section 32.6. ]

39.3.8 Remark: Note that $g_{ij}$ has not been defined to be $\frac{1}{2}\bigl((\partial/\partial q^i)d(p,q)\,(\partial/\partial q^j)d(p,q)\bigr)\big|_{q=p}$ because $d(p, \cdot)$ is generally not differentiable at $p$.

[ Remark 39.3.8 should be clarified considerably. ]

[ In this section, must show that the definition of a Riemannian manifold somewhere in the concavity book is equivalent to the one given in this section. ]

[ In the metric space on $\mathbb{R}^n$ using the $p$-norm, it looks like the geodesics would be the same as with the 2-norm. Does this imply that the Riemannian connection is the same? ]
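The claim in Example 39.3.4 that the $p$-norm distance is Riemannian only for $p = 2$ can be illustrated with the parallelogram law. Since $d(0,y)^2 = |y|_p^2$ is homogeneous of degree 2 in $y$, it can only be asymptotically quadratic at $0$ if it is exactly a quadratic form, and every quadratic form $Q$ satisfies $Q(u+v) + Q(u-v) = 2Q(u) + 2Q(v)$. The Python sketch below evaluates the defect in this identity for a few values of $p$. The function names and the choice of test vectors are assumptions for illustration.

```python
def p_norm(v, p):
    """The p-norm on R^n for 1 <= p < infinity."""
    return sum(abs(c) ** p for c in v) ** (1.0 / p)

def parallelogram_defect(u, v, p):
    """|u+v|^2 + |u-v|^2 - 2|u|^2 - 2|v|^2, which is zero for a quadratic form."""
    plus = tuple(a + b for a, b in zip(u, v))
    minus = tuple(a - b for a, b in zip(u, v))
    return (p_norm(plus, p) ** 2 + p_norm(minus, p) ** 2
            - 2.0 * p_norm(u, p) ** 2 - 2.0 * p_norm(v, p) ** 2)

e1, e2 = (1.0, 0.0), (0.0, 1.0)
defects = {p: parallelogram_defect(e1, e2, p) for p in (1.0, 2.0, 4.0)}
print(defects)
```

For these test vectors the defect equals $2 \cdot 2^{2/p} - 4$, which vanishes only for $p = 2$ up to rounding.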
39.4. The Levi-Civita connection

39.4.1 Remark: This is the section where the affine connection on a manifold is generated uniquely from the Riemannian metric. The connection, of course, does not uniquely determine the metric tensor. This is why the Riemannian metric is in a higher layer than the affine connection.

[ If you construct the Levi-Civita connection from a Riemannian metric, there is an infinite set of Riemannian metrics which have the same Levi-Civita connection. For example, Riemannian metrics which differ by a constant multiplier have the same Levi-Civita connection. Determine the full generality of Riemannian metrics which can share the same Levi-Civita connection. This will then answer the question of whether it is possible to reconstruct the Riemannian metric from its Levi-Civita connection. For example, it would be interesting to know if the Riemannian metric is uniquely defined globally if you specify the metric tensor at one point and the affine connection globally. ]

[ Given a distance function $d : M \times M \to \mathbb{R}$, is it possible to determine unique shortest paths for all point pairs and therefore generate an affine connection out of this? The differentiability conditions on a distance function to generate a Riemannian metric are quite strong, but maybe much weaker conditions could yield an affine connection. ]

[ Define a “length-parametrized geodesic”. See Gallot et alia, p. 116, 3.34. This is called a “normal geodesic” in Greene and Wu, page 6. ]

[ For the following 4 definitions of connections, see the EDM2 [34] 80. ]

39.4.2 Definition: The Levi-Civita connection for a $C^2$ manifold $M$ with a $C^1$ Riemannian metric tensor field $g$ is the affine connection on $M$ which has Christoffel symbol given by
$$\Gamma^k_{ij} = \frac12\, g^{kl} \Big( \frac{\partial g_{li}}{\partial x^j} + \frac{\partial g_{lj}}{\partial x^i} - \frac{\partial g_{ij}}{\partial x^l} \Big).$$

[ The Levi-Civita connection is the unique torsion-free connection which makes geodesics length-minimizing? See Gallot/Hulin/Lafontaine [19], page 70, sections 2.51 and 2.53. ]
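As a hedged illustration of the Christoffel symbol formula in Definition 39.4.2, the following worked computation evaluates it for the unit 2-sphere. The choice of example, the coordinate names $(\theta, \varphi)$ and the metric $g = d\theta^2 + \sin^2\theta\, d\varphi^2$ are assumptions made only for this sketch; they do not appear in the draft.

```latex
% Assumed example: unit 2-sphere with coordinates (x^1, x^2) = (\theta, \varphi).
g_{\theta\theta} = 1, \quad g_{\varphi\varphi} = \sin^2\theta, \quad g_{\theta\varphi} = 0,
\qquad
g^{\theta\theta} = 1, \quad g^{\varphi\varphi} = \frac{1}{\sin^2\theta}, \quad g^{\theta\varphi} = 0 .
% The only non-zero metric derivative is
\frac{\partial g_{\varphi\varphi}}{\partial\theta} = 2 \sin\theta \cos\theta .
% Substituting into \Gamma^k_{ij} = (1/2) g^{kl} (g_{li,j} + g_{lj,i} - g_{ij,l}):
\Gamma^\theta_{\varphi\varphi}
  = \tfrac12\, g^{\theta\theta} \Big( - \frac{\partial g_{\varphi\varphi}}{\partial\theta} \Big)
  = - \sin\theta \cos\theta ,
\qquad
\Gamma^\varphi_{\theta\varphi} = \Gamma^\varphi_{\varphi\theta}
  = \tfrac12\, g^{\varphi\varphi}\, \frac{\partial g_{\varphi\varphi}}{\partial\theta}
  = \cot\theta ,
% and all other components vanish.
% Constant rescaling g -> c g with c > 0 replaces g^{kl} by c^{-1} g^{kl} and each
% derivative of g by c times itself, so \Gamma^k_{ij} is unchanged.
```

The last comment is the computation behind the statement that metrics differing by a constant multiplier share the same Levi-Civita connection.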
39.4.3 Remark: Although the Christoffel symbol is not a tensor, it is required to transform in a specified way under changes of chart $\psi$. According to Theorem 30.2.1, the differential of a parallelism must satisfy equation (30.2.1) in order for the operator $L(\psi) = \partial_{ij} - \Gamma^k_{ij}(\psi)\partial_k$ to be a second-degree covariant tensor operator when applied to functions $f \in C^2(M)$.

39.4.4 Theorem: The Christoffel symbol $\Gamma^k_{ij}$ in Definition 39.4.2 satisfies condition (30.2.1) for the second-order operator $\partial_{ij} - \Gamma^k_{ij}(\psi)\partial_k$ to be a second-degree covariant tensor in Theorem 30.2.1.

Proof: Let $\psi, \tilde\psi \in \mathrm{atlas}_p(M)$ be charts at $p \in M$ for a $C^2$ Riemannian manifold $M$. Denote the respective Christoffel symbols by $\Gamma$ and $\tilde\Gamma$. Then
$$\begin{aligned}
\Gamma^k_{ij} &= \frac12\, g^{k\ell} \Big( \frac{\partial g_{\ell i}}{\partial x^j} + \frac{\partial g_{\ell j}}{\partial x^i} - \frac{\partial g_{ij}}{\partial x^\ell} \Big), \\
\tilde\Gamma^k_{ij} &= \frac12\, \tilde g^{k\ell} \Big( \frac{\partial \tilde g_{\ell i}}{\partial \tilde x^j} + \frac{\partial \tilde g_{\ell j}}{\partial \tilde x^i} - \frac{\partial \tilde g_{ij}}{\partial \tilde x^\ell} \Big),
\end{aligned}$$
where $\tilde g^{k\ell} = \tilde\phi^k_{,i} \tilde\phi^\ell_{,j}\, g^{ij}$, $\tilde g_{ij} = \phi^k_{,i} \phi^\ell_{,j}\, g_{k\ell}$, $\phi = \psi \circ \tilde\psi^{-1}$ and $\tilde\phi = \tilde\psi \circ \psi^{-1} = \phi^{-1}$. Then
$$\begin{aligned}
\frac{\partial \tilde g_{\ell i}}{\partial \tilde x^j}
&= \frac{\partial}{\partial \tilde x^j} \big( \phi^m_{,i} \phi^k_{,\ell}\, g_{mk} \big) \\
&= \phi^m_{,ij} \phi^k_{,\ell}\, g_{mk} + \phi^k_{,\ell j} \phi^m_{,i}\, g_{mk} + \phi^m_{,i} \phi^k_{,\ell} \phi^r_{,j} \frac{\partial g_{mk}}{\partial x^r} .
\end{aligned}$$
Therefore
$$\begin{aligned}
\tilde\Gamma^k_{ij}
&= \frac12\, \tilde\phi^k_{,t} \tilde\phi^\ell_{,u}\, g^{tu} \Big(
   \phi^m_{,ij} \phi^s_{,\ell}\, g_{ms} + \boxed{\phi^s_{,\ell j} \phi^m_{,i}\, g_{ms}} + \phi^m_{,i} \phi^s_{,\ell} \phi^r_{,j} \frac{\partial g_{ms}}{\partial x^r} \\
&\qquad\qquad + \phi^m_{,ji} \phi^s_{,\ell}\, g_{ms} + \boxed{\phi^s_{,\ell i} \phi^m_{,j}\, g_{ms}} + \phi^m_{,j} \phi^s_{,\ell} \phi^r_{,i} \frac{\partial g_{ms}}{\partial x^r} \\
&\qquad\qquad - \boxed{\phi^m_{,i\ell} \phi^s_{,j}\, g_{ms}} - \boxed{\phi^s_{,j\ell} \phi^m_{,i}\, g_{ms}} - \phi^m_{,i} \phi^s_{,j} \phi^r_{,\ell} \frac{\partial g_{ms}}{\partial x^r} \Big) \\
&= \frac12\, \tilde\phi^k_{,t} \tilde\phi^\ell_{,u}\, g^{tu} \Big(
   2 \phi^m_{,ij} \phi^s_{,\ell}\, g_{ms}
   + \phi^m_{,i} \phi^s_{,\ell} \phi^r_{,j} \frac{\partial g_{ms}}{\partial x^r}
   + \phi^m_{,j} \phi^s_{,\ell} \phi^r_{,i} \frac{\partial g_{ms}}{\partial x^r}
   - \phi^m_{,i} \phi^s_{,j} \phi^r_{,\ell} \frac{\partial g_{ms}}{\partial x^r} \Big) \\
&= \phi^m_{,ij} \tilde\phi^k_{,m} + \frac12\, \tilde\phi^k_{,t} \tilde\phi^\ell_{,u}\, g^{tu} \Big(
   \phi^m_{,i} \phi^s_{,\ell} \phi^r_{,j} \frac{\partial g_{ms}}{\partial x^r}
   + \phi^m_{,j} \phi^s_{,\ell} \phi^r_{,i} \frac{\partial g_{ms}}{\partial x^r}
   - \phi^m_{,i} \phi^s_{,j} \phi^r_{,\ell} \frac{\partial g_{ms}}{\partial x^r} \Big) \\
&= \phi^m_{,ij} \tilde\phi^k_{,m} + \frac12\, \tilde\phi^k_{,t} \phi^m_{,i} \phi^r_{,j}\, g^{ts} \Big(
   \frac{\partial g_{ms}}{\partial x^r} + \frac{\partial g_{rs}}{\partial x^m} - \frac{\partial g_{mr}}{\partial x^s} \Big) \\
&= \tilde\phi^k_{,t} \phi^m_{,i} \phi^r_{,j}\, \Gamma^t_{mr} + \phi^m_{,ij} \tilde\phi^k_{,m} .
\end{aligned}\tag{39.4.1}$$
(The boxed terms cancel to zero.) This matches equation (30.2.1).

39.4.5 Remark: It is the term $\phi^m_{,ij} \tilde\phi^k_{,m}$ in equation (39.4.1) which makes the Christoffel symbol a non-tensorial object. That is, the symbol does not correspond to the coefficients of a tensor of any type. The Christoffel symbol is in fact a family of “tensorization coefficients”. (See Definition 30.2.3 for general tensorization coefficients.) The non-tensorial Christoffel symbol may be combined with non-tensorial higher-order derivatives to produce tensorial objects. The second-order derivative term in (39.4.1) is what allows the Christoffel symbol to convert non-tensorial partial derivatives into tensorial covariant derivatives.

Equation (30.2.1) does not uniquely determine the Levi-Civita connection. In fact, equation (30.2.1) places a fairly weak constraint on the tensorization coefficients which specify a connection.

39.4.6 Definition: A metric connection on a Riemannian manifold $M$ is a connection for which $\nabla g = 0$, which means that parallel transport maps the tangent bundle between points in $M$ so that orthogonal frames are mapped to orthogonal frames.

39.4.7 Remark: A Riemannian connection is a metric connection such that the torsion is everywhere zero.
39.4.8 Definition: [Definition of coefficients of the Riemannian connection.]

[ Give the components of the torsion tensor, curvature tensor and covariant differential. Also give the coordinate equation for a geodesic. ]

39.4.9 Definition: [Definition of tangent n-frame orthogonal bundle.]

39.4.10 Definition: [Definition of normal coordinates.]

[ Show that all the ways of expressing the connection in a Riemannian space are equivalent. For example, show that Schild’s ladder (using geodesics) reconstructs the same parallelism from which the geodesics are constructed. ]

39.5. Curvature tensors

39.5.1 Remark: This section includes Riemann curvature, Ricci curvature, sectional curvature and Gauß curvature. Actually, the Riemann curvature tensor is uniquely defined in terms of the affine connection. The Gauß curvature is just the two-dimensional version of the sectional curvature.

39.5.2 Definition: [Definition of curvature form.]

39.5.3 Definition: [Definition of curvature tensor.]

[ Some questions to be answered near here. Is the Riemann curvature for a Riemannian manifold in some sense an element of the Lie algebra of SO(n)? Is the Riemann curvature the exterior derivative of the connection form in some sense? The exterior derivative of a differential form is some sort of commutator of Lie derivatives? ]

39.5.4 Definition: [Definition of sectional curvature.]

39.5.5 Definition: [Definition of Ricci tensor.]

39.5.6 Definition: [Definition of Ricci curvature.]

39.5.7 Definition: [Definition of scalar curvature.]

[ Present the second variation formula for the length of a family of geodesics. See Greene/Wu [67], page 6. ]

39.5.8 Definition: An Einstein space is a Riemannian manifold in which the Ricci tensor is a scalar multiple of the metric tensor.

[ According to EDM2 [34], 364.D, the scalar multiple for an Einstein space must be constant if $\dim(M) \ge 3$. ]
[ Around here, define spaces of constant sectional curvature, and classify such spaces. Probably the spaces of positive constant sectional curvature are isometric to a sphere. Define the negative equivalent too. ]

39.6. Differential operators

[ Especially do the Laplacian and the modulus of the gradient. See EDM2 [34] 194.B. Probably could use the conservation of mass to determine the correct form of the Laplacian, in particular in the context of the heat equation. Must show that a conservation law applies to the heat equation. The Laplacian is nicely defined by Frankel [18], pages 305 and 93. ]

[ Hodge theory operators $*$ and $\delta$. See Greene/Wu [67], pages 7 and 8. Also present harmonic, subharmonic and superharmonic functions, page 8. Also see Warner [49], chapter 6, for Hodge theory, including the $*$ and Laplace-Beltrami operators. ]

[ Here define elliptic, hyperbolic and parabolic operators. Show that the Laplacian is elliptic. ]

39.6.1 Remark: The Laplacian operator on an $n$-dimensional $C^2$ manifold $M$ calculates the sum of second derivatives of a real-valued function in $n$ orthogonal directions at a point $p \in M$. The individual second derivatives are effectively calculated along geodesic curves passing through $p$. The term $-\Gamma^k_{ij}\partial_k$ in Definition 39.6.2 ensures that the second derivatives “follow the geodesics”. The factor $g^{ij}$ takes care of orthogonality and scaling of the second derivatives. Clearly the Laplacian operator requires both the affine connection and the Riemannian metric for its definition, whereas general elliptic operators require only the affine connection.

39.6.2 Definition: The Laplacian (operator) on a $C^2$ manifold with Riemannian metric $g$ is the operator $\Delta : C^2(M, \mathbb{R}) \to C^0(M, \mathbb{R})$ defined by
$$\Delta = g^{ij} \big( \partial_{ij} - \Gamma^k_{ij} \partial_k \big),$$
where $\Gamma$ is the Levi-Civita connection for $g$.

39.6.3 Remark: The Laplace-Beltrami operator generalizes the flat-space Laplace operator not only to Riemannian manifolds but also to differential forms of general degree.

[ Define Ricci flow. I used to do research, which I did not publish except in seminars, on flow by mean curvature for embedded manifolds. I should dig up my old research on this and define the relevant concepts here. ]
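A short worked instance of the formula in Definition 39.6.2 may help to show how the Christoffel term enters. The sketch below assumes the unit 2-sphere with coordinates $(\theta, \varphi)$ and metric $g = d\theta^2 + \sin^2\theta\, d\varphi^2$; this example and its notation are assumptions introduced here, not part of the draft.

```latex
% Assumed example: unit 2-sphere, g = d\theta^2 + \sin^2\theta \, d\varphi^2 .
% Inverse metric: g^{\theta\theta} = 1, g^{\varphi\varphi} = 1/\sin^2\theta, g^{\theta\varphi} = 0.
% Non-zero Levi-Civita Christoffel symbols:
%   \Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta,
%   \Gamma^\varphi_{\theta\varphi} = \Gamma^\varphi_{\varphi\theta} = \cot\theta .
\begin{aligned}
\Delta f
&= g^{\theta\theta} \big( \partial_{\theta\theta} f - \Gamma^k_{\theta\theta} \partial_k f \big)
 + g^{\varphi\varphi} \big( \partial_{\varphi\varphi} f - \Gamma^k_{\varphi\varphi} \partial_k f \big) \\
&= \partial_{\theta\theta} f
 + \frac{1}{\sin^2\theta} \big( \partial_{\varphi\varphi} f + \sin\theta \cos\theta \, \partial_\theta f \big) \\
&= \partial_{\theta\theta} f + \cot\theta \, \partial_\theta f + \frac{1}{\sin^2\theta} \, \partial_{\varphi\varphi} f .
\end{aligned}
```

The first-order term $\cot\theta\,\partial_\theta f$ comes entirely from $-g^{\varphi\varphi}\Gamma^\theta_{\varphi\varphi}\partial_\theta f$, which is the sense in which the connection term makes the second derivatives follow the geodesics.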
39.7. Inner product

[ Should cover orthogonality here, and also the dot product of two vectors, like $\nabla u \cdot \dot\gamma(s)$. Also cover the lengths of vectors here, like the length of the gradient vector $|\nabla u|$ and $|\dot\gamma(s)|$. ]

[ Definition of trace. E.g. $\Delta f = \mathrm{trace}\, D^2 f$. See Greene/Wu [67], page 7. ]

[ There should be a section on Finsler metrics somewhere near here. ]

39.8. Embedded Riemannian manifolds

39.8.1 Definition (→ 39.2.6): [Alternative definition of Riemannian manifold.]

[ Here do a definition of the inherited metric tensor for embedded manifolds in a Riemannian manifold. See notes G, page 3. Also define normal vectors to tangent spaces of submanifolds. See EDM2 [34] 364.A. ]

39.9. Information geometry

[ The particular benefit of the differential geometry perspective is apparently the fact that the point-to-point distance function is an invariant under changes of coordinates. I have been told that the affine connections which are used for information geometry are not the same as the Levi-Civita connection. ]

39.9.1 Remark: A Riemannian metric arises out of statistics as the Fisher information matrix. (See Amari/Nagaoka [53]. See also EDM2 [34], 399.D, page 1489.) The Fisher information matrix is defined by:
$$\forall \theta \in M, \qquad g_{ij}(\theta) = \int \frac{\partial \log f(x, \theta)}{\partial \theta^i}\, \frac{\partial \log f(x, \theta)}{\partial \theta^j}\, f(x, \theta)\, dx, \tag{39.9.1}$$
where $f : S \times M \to \mathbb{R}$ is a family of probability densities on the set $S$ with parameters in the manifold $M$. Thus $\int f(x, \theta)\, dx = 1$ for all $\theta \in M$.

[ In equation (39.9.1), the coordinates $\theta^i$ and points $\theta$ of the manifold $M$ are mixed up in the colloquial fashion. This must be fixed. ]
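To make the Fisher information formula (39.9.1) concrete, here is a minimal worked sketch for a one-parameter family. The choice of family (normal densities on $S = \mathbb{R}$ with unknown mean $\theta$ and fixed standard deviation $\sigma > 0$) is an assumption made here for illustration and does not come from the draft.

```latex
% Assumed example: S = \mathbb{R}, M = \mathbb{R}, one parameter \theta (the mean),
% with fixed standard deviation \sigma > 0.
f(x, \theta) = \frac{1}{\sigma \sqrt{2\pi}} \exp\Big( - \frac{(x - \theta)^2}{2 \sigma^2} \Big),
\qquad
\frac{\partial \log f(x, \theta)}{\partial \theta} = \frac{x - \theta}{\sigma^2} .
% Substituting into (39.9.1), and using the variance \int (x - \theta)^2 f(x, \theta) dx = \sigma^2 :
g_{11}(\theta)
  = \int_{\mathbb{R}} \frac{(x - \theta)^2}{\sigma^4} \, f(x, \theta) \, dx
  = \frac{\sigma^2}{\sigma^4}
  = \frac{1}{\sigma^2} .
```

In this assumed family the metric is constant in $\theta$, so the one-dimensional parameter manifold is flat with distance $|\theta_1 - \theta_2| / \sigma$ between parameter values.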
Chapter 40 Pseudo-Riemannian manifolds

40.1 The pseudo-Riemannian metric
40.2 General relativity
40.3 Singularities
40.4 Global solutions
[ This is currently just a place-holder for a chapter which I will write some time soon. ] Pseudo-Riemannian manifolds are a generalization of Riemannian manifolds. Therefore it makes sense to deal with the special case of Riemannian manifolds first (Chapter 39). [ This chapter should particularly deal with the case of (1, 3) manifolds. ] [ Another particular topic of interest is the way in which (1, n) manifolds (i.e. manifolds with a Minkowski metric) can be characterized locally by inequalities for d(x, y) analogous to the triangle inequality for Riemannian manifolds. There are notes on this somewhere. This sort of result should be generalized to general (m, n) manifolds. Then hopefully pseudo-Riemannian manifolds can also be given a one-sentence definition as for Riemannian manifolds. ]
40.1. The pseudo-Riemannian metric

40.1.1 Definition: A pseudo-Riemannian metric on a $C^1$ manifold $M$ is any continuous non-degenerate symmetric covariant tensor field of degree 2 on $M$. A pseudo-Riemannian metric of class $C^r$ on a $C^{r+1}$ manifold $M$ is a pseudo-Riemannian metric $g$ on $M$ such that $g$ is of class $C^r$ on $M$.

40.1.2 Remark: The fact that the word “covariant” appears in Definition 40.1.1 does not imply that a metric tensor requires the prior definition of a connection. The $C^1$ condition could possibly be weakened to a Hölder $C^{0,1}$ condition with the metric tensor being defined almost everywhere. This would still permit distance to be calculated.

[ For definition 40.1.1, the concept of a $C^\infty$ symmetric tensor of type (0, 2) is already defined. Still to be defined is non-degeneracy. This can be done in terms of contraction of tensors by requiring that $g.x = 0 \Rightarrow x = 0$ for contravariant vectors $x$. Should define non-degeneracy in the linear chapter. ]

40.2. General relativity

40.2.1 Remark: Bell [190], pages 503–504, quotes a fascinating 1870 paper by Clifford, “On the space-theory of matter”, which seems very close indeed to the central idea of general relativity, namely that matter and the curvature of space are connected.

Riemann has shown that as there are different kinds of lines and surfaces, so there are different kinds of space of three dimensions; and that we can only find out by experience to which of these kinds the space in which we live belongs. In particular, the axioms of plane geometry are true within the limits of experiment on the surface of a sheet of paper, and yet we know that the sheet is really covered with a number of small ridges and furrows, upon which (the total curvature being not zero) these axioms are not true. Similarly, he says, although the axioms of solid geometry are true within the limits of experiment for finite portions of our space, yet we have no reason to conclude that they are true for very small portions; and if any help can be got thereby for the explanation of physical phenomena, we may have reason to conclude that they are not true for very small portions of space.

I wish here to indicate a manner in which these speculations may be applied to the investigation of physical phenomena. I hold in fact
(1) That small portions of space are in fact of a nature analogous to little hills on a surface which is on average flat; namely, that the ordinary laws of geometry are not valid in them.
(2) That this property of being curved or distorted is continually being passed on from one portion of space to another after the manner of a wave.
(3) That this variation of the curvature of space is what really happens in that phenomenon which we call the motion of matter, whether ponderable or ethereal.
(4) That in the physical world nothing else takes place but this variation, subject (possibly) to the law of continuity.

Unfortunately, Clifford died at the age of 33 in 1879 (apparently from over-work), whereas Riemann died aged 39 in 1866 (from tuberculosis). If they had lived long enough to learn about the Michelson-Morley experiment [157] in 1881–1887, they might have developed a space-time theory of gravity in the 19th century, possibly quite different to Einstein’s popular theory. At the very least, it is clear that very many of the ideas of Einstein’s gravity theory were “in the air” long before the publication of his general relativity.

[ Present here Einstein’s equations, including the Einstein fudge factor $\Lambda$. Present a formulation of a push theory of gravity in a different section. ]

40.2.2 Remark: Einstein’s equations look something like
$$R_{\mu\nu} - \frac12 R\, g_{\mu\nu} + \Lambda g_{\mu\nu} = \kappa_0 T_{\mu\nu},$$
where $\Lambda$ is the famous fudge factor which “explains” the discrepancy between theory and observations by invoking an ad-hoc variation in the cosmological expansion rate.

40.3. Singularities

[ Present here the basic definitions for black hole solutions. ]

40.4. Global solutions

[ Present here the big bang hypothesis. ]
Chapter 41 Tensor calculus

41.1 History
41.2 Differentiable manifolds
41.3 Manifolds with affine connection
41.4 Equations of geodesic variation
41.5 Riemannian manifolds
41.6 Pseudo-Riemannian manifolds
41.7 Submanifolds of Euclidean space
[ This is currently just a place-holder for a chapter which I will write some time soon. Please ignore this chapter for now. ]

41.0.1 Remark: Tensor calculus is a set of practical notations and methods for differential geometry calculations. The previous chapters have been concerned mainly with meaning. Obviously it is essential to have both.

In order to do the right calculations, one must always understand what the symbols refer to. The main down-side of tensor calculus is that it is too easy to lose track of what everything means. Everything in the earlier chapters is to a great extent motivated by the desire to give meaning to all of the expressions and equations in tensor calculus.

The symbols of tensor calculus refer to arrays of numbers. In an $n$-dimensional manifold, an array of $n$ numbers may be the coordinates of a tangent vector with respect to some implicit basis, or the coordinates of a cotangent vector. An $n \times n$ array could be the matrix of coordinates of a tangent tensor or a cotangent tensor, or the coefficients of a linear transformation of point coordinates or its inverse transformation, or the coefficients of a linear transformation of basis vectors or its inverse. There is enormous room for confusion.

Tensor calculus is principally concerned with pointwise or local analysis on differentiable manifolds in terms of coordinate charts. When multiple coordinate charts are being used for a single problem, there is a further danger of losing track of which chart is being used for each expression. This creates opportunities for confusion as to which basis is being used for each tensor. Clearly every array in every equation can be the source of ambiguity and error. But despite all the pitfalls and hazards of tensor calculus, it is difficult to do serious practical calculation without it. So the best policy is to use it, but continually check each expression to ensure that its meaning is clear.

[ In this chapter, everything in the previous chapters on differential geometry should be done purely in terms of coordinates in a naive sort of way. That is, a manifold should be defined to be simply a set of open subsets of $\mathbb{R}^n$ together with transition functions. Then a geodesic is defined to be any curve which satisfies the appropriate equations. Everything else is similarly defined purely in terms of coordinate equations and components. ]

[ Some references for tensor calculus are Misner/Thorne/Wheeler [37], pages 223–224, and EDM2 [34], pages 1730–1733 (Appendix A, Table 4) and article 417. ]
41.1. History

41.1.1 Remark: According to Bell [189], page 360, tensor calculus was developed by Eugenio Beltrami and Gregorio Ricci-Curbastro around the year 1890, in particular in a publication by Ricci in 1887 or 1888. This implies that tensor calculus was developed after Riemannian geometry, but before the development of affine connections and fibre bundles.

41.1.2 Remark: Einstein reportedly had serious difficulties understanding tensor calculus. He had to understand it in order to formulate his gravity theory. Bell [190], page 256, made the following comment on how Einstein arrived at general relativity.

What gave Einstein his idea was the hard labor he expended for several years mastering the tensor calculus of two Italian mathematicians, Ricci and Levi-Civita, themselves disciples of Riemann and Christoffel, both of whom in their turn had been inspired by the geometrical work of Gauss.

41.2. Differentiable manifolds

[ In this section, deal with those parts of tensor calculus which are valid in the absence of a connection and Riemannian metric. ]
41.3. Manifolds with affine connection

[ It is assumed here that $\Gamma$ is symmetric. Does this require the connection to be torsion-free? Should express $\Gamma$ in terms of the connection form $\omega$. See EDM2 [34], page 1732, App. A.4. ]

41.3.1 Theorem: Let $x : [a, b] \to M$ be a $C^2$ curve in a $C^2$ manifold $M$ with a $C^1$ affine connection. Let $\Gamma^k_{ij}$ be the Christoffel symbol of the connection with respect to a coordinate map $\psi : \Omega \to \mathbb{R}^n$ for some open subset $\Omega$ of $M$ such that $x([a, b]) \subseteq \Omega$. Then

(i) $x$ is an affinely parametrized geodesic curve in $M$ if and only if
$$\forall t \in (a, b), \qquad \frac{d^2 x^i}{dt^2}(t) + \Gamma^i_{jk}(x(t))\, \frac{dx^j}{dt}(t)\, \frac{dx^k}{dt}(t) = 0. \tag{41.3.1}$$

(ii) $x$ is a freely parametrized geodesic curve in $M$ if and only if
$$\forall u \in (a, b),\ \exists k(u) \in \mathbb{R}, \qquad \frac{d^2 x^i}{du^2}(u) + \Gamma^i_{jk}(x(u))\, \frac{dx^j}{du}(u)\, \frac{dx^k}{du}(u) = k(u)\, \frac{dx^i}{du}(u).$$
Furthermore, $k$ is $C^1$ if $x$ is $C^2$, and the re-parametrization $u = f(t)$ makes $t$ an affine parameter for the curve if $f$ satisfies $f''(u) + k(u) f'(u)^2 = 0$. This equation has solution
$$f(u) = \int^u \Big( \int^y k(z)\, dz \Big)^{-1} dy.$$

(iii) If $\dim(M) = 2$, and the curve can be locally expressed as a graph with respect to $x^1$, then the function $x^2 = h(x^1)$ satisfies
$$h'' - \Gamma^1_{22} (h')^3 + (\Gamma^2_{22} - \Gamma^1_{12} - \Gamma^1_{21}) (h')^2 + (\Gamma^2_{12} + \Gamma^2_{21} - \Gamma^1_{11}) h' + \Gamma^2_{11} = 0.$$

Proof: Part (ii) follows from part (i) on substitution of $u = f(t)$, for any $C^2$ function $f$ for which $f'(t) \ne 0$ for all $t \in (a, b)$. This gives
$$\forall t \in (a, b), \qquad \frac{d^2 x^i(t)}{dt^2} + \Gamma^i_{jk}(x(t))\, \frac{dx^j(t)}{dt}\, \frac{dx^k(t)}{dt} = - \frac{f''(t)}{f'(t)^2}\, \frac{dx^i(t)}{dt}.$$
This may be interpreted as the parallelism of $D_{\dot x} \dot x$ and $\dot x$. In other words, the covariant derivative of the tangent vector along the curve is parallel to the tangent vector.

Part (iii) follows from a comparison of the two component equations ($i = 1$ and $i = 2$) specified in part (ii). Since they both must have the same value for $k(u)$ for each $u$, $k(u)$ may be eliminated between the two equations to solve for $x^2$ in terms of $x^1$. Indeed, if $x^1 = u$ and $x^2 = h(u)$ are substituted into the two equations in part (ii), they simplify to
$$\begin{aligned}
\Gamma^1_{22} (h')^2 + (\Gamma^1_{12} + \Gamma^1_{21}) h' + \Gamma^1_{11} &= k(u), \\
h'' + \Gamma^2_{22} (h')^2 + (\Gamma^2_{12} + \Gamma^2_{21}) h' + \Gamma^2_{11} &= k(u)\, h'.
\end{aligned}$$
Now substituting $k(u)$ from the first equation into the second gives
$$h'' - \Gamma^1_{22} (h')^3 + (\Gamma^2_{22} - \Gamma^1_{12} - \Gamma^1_{21}) (h')^2 + (\Gamma^2_{12} + \Gamma^2_{21} - \Gamma^1_{11}) h' + \Gamma^2_{11} = 0.$$

41.3.2 Theorem: The curvature tensor components $R^i{}_{jk\ell}$ are given by
$$\begin{aligned}
R^i{}_{jk\ell} &= \frac{\partial \Gamma^i_{j\ell}}{\partial x^k} - \frac{\partial \Gamma^i_{jk}}{\partial x^\ell} + \Gamma^i_{mk} \Gamma^m_{j\ell} - \Gamma^i_{m\ell} \Gamma^m_{jk} \\
&= \Gamma^i_{j\ell,k} - \Gamma^i_{jk,\ell} + \Gamma^i_{mk} \Gamma^m_{j\ell} - \Gamma^i_{m\ell} \Gamma^m_{jk}.
\end{aligned}\tag{41.3.2}$$
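As a hedged check of the index conventions in formula (41.3.2), the following sketch evaluates one curvature component for the unit 2-sphere. The example, the coordinates $(\theta, \varphi)$ and the Christoffel symbols quoted for the metric $g = d\theta^2 + \sin^2\theta\, d\varphi^2$ are assumptions introduced here for illustration.

```latex
% Assumed example: unit 2-sphere with Levi-Civita Christoffel symbols
%   \Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta,
%   \Gamma^\varphi_{\theta\varphi} = \Gamma^\varphi_{\varphi\theta} = \cot\theta,
% and all other components zero.
% Formula (41.3.2) with (i, j, k, \ell) = (\theta, \varphi, \theta, \varphi):
\begin{aligned}
R^\theta{}_{\varphi\theta\varphi}
&= \Gamma^\theta_{\varphi\varphi,\theta} - \Gamma^\theta_{\varphi\theta,\varphi}
 + \Gamma^\theta_{m\theta} \Gamma^m_{\varphi\varphi}
 - \Gamma^\theta_{m\varphi} \Gamma^m_{\varphi\theta} \\
&= (\sin^2\theta - \cos^2\theta) - 0 + 0
 - \Gamma^\theta_{\varphi\varphi} \Gamma^\varphi_{\varphi\theta} \\
&= \sin^2\theta - \cos^2\theta + \sin\theta\cos\theta \cdot \cot\theta \\
&= \sin^2\theta .
\end{aligned}
```

Lowering the first index with $g_{\theta\theta} = 1$ gives $R_{\theta\varphi\theta\varphi} = \sin^2\theta = g_{\theta\theta} g_{\varphi\varphi} - g_{\theta\varphi}^2$, which is the familiar statement that the unit sphere has constant curvature 1.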
41.4. Equations of geodesic variation

[ This section is at the core of the author’s interest in differential geometry. The rest of the book has been written to try to make sense of these formulas. The author got completely stuck on these calculations. Then he decided to go back and do all of differential geometry at a pure mathematical level of correctness so that he never gets stuck in this way again. ]

This section presents some derivations of first and second order equations of geodesic variation in a manifold with an affine connection.

For any $C^1$ curve $\gamma : \mathbb{R} \to M$ in a $C^1$ manifold $M$, the velocity of the curve at a point $\gamma(t) \in M$ is strictly defined as the vector $t_{\gamma(t),\partial_t(\psi\circ\gamma(t)),\psi} \in T_{\gamma(t)}(M)$ for any chart $\psi \in \mathrm{atlas}_{\gamma(t)}(M)$. In tensor calculus, the coordinates $\partial_t(\psi^i(\gamma(t)))$ for $i = 1 \ldots n = \dim(M)$ are brought into the foreground. For simplicity, $\psi^i \circ \gamma$ is written as $\gamma^i$ for $i = 1 \ldots n$, and $\partial_t(\psi^i \circ \gamma)$ is written as $\gamma^i_t$. So strictly speaking, $\gamma^i_t$ represents $t_{\gamma(t),\partial_t(\psi\circ\gamma(t)),\psi}$.

For a $C^2$ curve $\gamma$ in a $C^2$ manifold $M$, the derivatives $\gamma^i_{tt}$ are well defined, but they are not components of a vector in $T_{\gamma(t)}(M)$.

[ Theorem 41.4.1 should also be proved somewhere without tensor calculus, preferably both in affine connections and in some sort of generalization to general connections. It should be possible to reduce the manifold regularity, maybe from $C^3$ to $C^{2,1}$. ]

41.4.1 Theorem: Let $\gamma : \mathbb{R}^2 \to M$ be a one-parameter family of geodesics in a $C^3$ manifold with a torsion-free $C^2$ connection with Christoffel symbol $\Gamma$. Then the transverse field $\gamma_t$ satisfies
$$D^2_{\gamma_s} \gamma_t = R(\gamma_s, \gamma_t) \gamma_s, \tag{41.4.1}$$
where $\gamma_s$ denotes $\partial_s \gamma(s, t) \in T_{\gamma(s,t)}(M)$ and $\gamma_t$ denotes $\partial_t \gamma(s, t) \in T_{\gamma(s,t)}(M)$.

Proof: A family of geodesics satisfies $\gamma^i_{ss} + \Gamma^i_{jk} \gamma^j_s \gamma^k_s = 0$. The derivative with respect to $t$ is:
$$\gamma^i_{sst} + \Gamma^i_{jk,\ell}\, \gamma^j_s \gamma^k_s \gamma^\ell_t + 2 \Gamma^i_{jk}\, \gamma^j_{st} \gamma^k_s = 0. \tag{41.4.2}$$
For a general one-parameter $C^3$ family of curves in a $C^3$ manifold,
$$(D^2_{\gamma_s} \gamma_t)^i = \gamma^i_{sst} + \Gamma^i_{jk,\ell}\, \gamma^j_s \gamma^k_t \gamma^\ell_s + \Gamma^i_{jk} (\gamma^j_{ss} \gamma^k_t + 2 \gamma^j_s \gamma^k_{st}) + \Gamma^i_{\ell m} \Gamma^m_{jk}\, \gamma^j_s \gamma^k_t \gamma^\ell_s.$$
Substitution of $\gamma^i_{sst}$ from (41.4.2) into this gives:
$$(D^2_{\gamma_s} \gamma_t)^i = (\Gamma^i_{jk,\ell} - \Gamma^i_{j\ell,k})\, \gamma^j_s \gamma^k_t \gamma^\ell_s + \Gamma^i_{jk}\, \gamma^j_{ss} \gamma^k_t + \Gamma^i_{\ell m} \Gamma^m_{jk}\, \gamma^j_s \gamma^k_t \gamma^\ell_s.$$
Substitution of $\gamma^j_{ss} = -\Gamma^j_{\ell m} \gamma^\ell_s \gamma^m_s$ into this and swapping $j$ with $m$ gives:
$$\begin{aligned}
(D^2_{\gamma_s} \gamma_t)^i
&= (\Gamma^i_{jk,\ell} - \Gamma^i_{j\ell,k} + \Gamma^i_{\ell m} \Gamma^m_{jk} - \Gamma^i_{mk} \Gamma^m_{\ell j})\, \gamma^j_s \gamma^k_t \gamma^\ell_s \\
&= (\Gamma^i_{\ell k,j} - \Gamma^i_{\ell j,k} + \Gamma^m_{\ell k} \Gamma^i_{mj} - \Gamma^m_{\ell j} \Gamma^i_{mk})\, \gamma^j_s \gamma^k_t \gamma^\ell_s.
\end{aligned}$$
This is the same as the expression
$$\begin{aligned}
(R(\gamma_s, \gamma_t) \gamma_s)^i
&= R^i{}_{\ell jk}\, \gamma^j_s \gamma^k_t \gamma^\ell_s \\
&= (\Gamma^i_{\ell k,j} - \Gamma^i_{\ell j,k} + \Gamma^m_{\ell k} \Gamma^i_{mj} - \Gamma^m_{\ell j} \Gamma^i_{mk})\, \gamma^j_s \gamma^k_t \gamma^\ell_s
\end{aligned}$$
obtained from Theorem 41.3.2.

[ Look at simplification of equations for Jacobi fields and their derivatives by using normal coordinates to parallel-translate vectors along the geodesic. Then most Christoffel symbol values become zero. ]

[ Next look at the transmission of first derivatives along a family of geodesics when minimizing $c(u)$. This transmission may be called “leverage”. The map from one point to another along a geodesic is a “leverage map”. Using the properties of systems of linear second-order ODEs, it should be possible to deduce some useful estimates for $\gamma_t$ in terms of the Riemannian curvature. ]

[ The first equation in Remark 41.4.2 is the Hessian of $\phi_\lambda : M \to M$? Looks like $(D^2 \phi_\lambda)_{jk}\, \gamma^j_t \gamma^k_u$. Something to do with $H_p(\ldots)$? ]

[ What seems to be needed for convexity theory is the “generalized Hessian” of $\gamma$, or the “covariant Hessian”. ]

41.4.2 Remark: For a general $C^4$ family of curves $\gamma : \mathbb{R}^3 \to M$,
$$(D_{\gamma_u} \gamma_t)^i = \gamma^i_{tu} + \Gamma^i_{jk}\, \gamma^j_t \gamma^k_u$$
j k ℓ j k ℓ i i k ℓ ℓ i γs γs ) γu γs + 2γtj γus γs + γtj γuk γss + γtu (Dγ2s Dγu γt )i = γtuss (2γts + ℓmΓjk, γtj γuk γsℓ γsm + ℓΓjk, j j k j k j k i γsk + γtu γss ) + 2γtus + Γjk (γtss γuk + 2γts γus + γtj γuss
(41.4.3)
ℓ i i ℓ j k m n nΓjk, (2γtj γuk γsm γsn ) + nΓℓm, Γjk γt γu γs γs + Γℓm j k m j k m i n ℓ j k m p m k m ℓ i Γjk γt γu γs γs , Γℓm γs γs ) + Γnp + γtu γs + γtj γuk γss Γjk (2γts γu γs + 2γtj γus + Γℓm
where the curve family parameters are (s, t, u) ∈ IR3 . If the family is geodesic with respect to the first parameter s, all double-s derivative terms may be substituted from the geodesic equation and its derivatives: i i j k γss = −Γjk γs γs
i i i j k γtss = −ℓΓjk, γsj γsk γtℓ − 2Γjk γst γs
i i i j γuss = −ℓΓjk, γsj γsk γuℓ − 2Γjk γsu γsk
j k ℓ j i i i j ℓ i j k γtuss = −ℓmΓjk, γsj γsk γtℓ γum − ℓΓjk, (2γsu γsk γtℓ + γsj γsk γtu + 2γst γs γu ) − Γjk (2γstu γsk + 2γsu γst ). i i i [ In the above equation, maybe the formula γtuss = −ℓmΓjk, γsj γsk γuℓ γtm − 2ℓΓjk, could be useful? ] i Substitution of γtuss into (41.4.3) gives
i i i ℓ (Dγ2s Dγu γt )i = (ℓmΓjk, − jkΓℓm, )γtj γuk γsℓ γsm + ℓΓjk, γtj γuk γss
j k ℓ i i j k ℓ + (ℓΓjk, − kΓjℓ, )(2γts γu γs + 2γus γtk γsℓ + γsj γtu γs )
j j k i k + Γjk (γtss γuk + γtj γuss + γtu γss )
+ +
i ℓ ℓ j k m n i nΓℓm, Γjk γt γu γs γs + Γℓm nΓjk, (2γtj γuk γsm γsn ) j k m j k m i ℓ k m m Γℓm Γjk (2γts γu γs + 2γtj γus γs + γtj γuk γss + γtu γs γs )
[ www.topology.org/tex/conc/dg.html ]
(41.4.4) i n ℓ j k m p + Γnp Γℓm Γjk γt γu γs γs . [ draft: UTC 2009–3–21 Saturday 11:36 ]
Differential geometry reconstructed.Copyright (C) 2009, Alan U. Kennington. All Rights Reserved.The author hereby grants permission to print this book draft in A4 format.Printing in all other formats is forbidden.You may not charge any fee for copies of this book draft.
j k j k i i ℓ j k m k i i γt γu γs γs ) + Γℓm Γjk + γtu γu + γtj γus (γts + ℓΓjk, γtj γuk γsℓ + Γjk (Dγs Dγu γt )i = γtus
41.5. Riemannian manifolds
743
j k Substitution of γtss and γuss into (41.4.4) gives i i i ℓ (Dγ2s Dγu γt )i = (ℓmΓjk, − jkΓℓm, )γtj γuk γsℓ γsm + ℓΓjk, γtj γuk γss
j k ℓ i i j k ℓ i j k + (ℓΓjk, − kΓjℓ, )(2γts γu γs + 2γus γtk γsℓ + γsj γtu γs ) + Γjk γtu γss
ℓ i ℓ ℓ ℓ j k m n i i + nΓℓm, Γjk γt γu γs γs + (2Γℓm nΓjk, − Γℓk jΓmn, − Γℓji kΓmn, )γtj γuk γsm γsn
+ +
(41.4.5)
j k m i ℓ i ℓ i ℓ ℓ k m 2(Γℓm Γjk − Γℓk Γjm )γts γu γs + 2(Γℓm Γkj − Γℓji Γkm )γtj γus γs j k m j k m i ℓ i n ℓ j k m p Γℓm Γjk (γt γu γss + γtu γs γs ) + Γnp Γℓm Γjk γt γu γs γs .
Substitution of γss into (41.4.5) gives j k ℓ i i i i k ℓ j γs ) − kΓjℓ, )(2γts γu γs + 2γus γtk γsℓ + γsj γtu (Dγ2s Dγu γt )i = (ℓmΓjk, − jkΓℓm, )γtj γuk γsℓ γsm + (ℓΓjk, i ℓ i ℓ ℓ ℓ ℓ i i + (nΓℓm, Γjk − ℓΓjk, Γmn + 2Γℓm nΓjk, − Γℓk jΓmn, − Γℓji kΓmn, )γtj γuk γsm γsn j k m i ℓ i ℓ i ℓ ℓ k m γu γs + 2(Γℓm Γkj − Γℓji Γkm )γtj γus γs + 2(Γℓm Γjk − Γℓk Γjm )γts
(41.4.6)
j k m i n ℓ j k m p i n ℓ ℓ i γt γu γs γs . Γmℓ − Γnℓ Γmp )Γjk )γtu γs γs + (Γnp + (Γℓm Γkj − Γℓji Γkm
Recognition of the Riemann curvature tensor in (41.4.6) leads to the following: i i ℓ j k m n (Dγ2s Dγu γt )i = (ℓmΓjk, − jkΓℓm, )γtj γuk γsℓ γsm + Ri mnℓ Γjk γt γu γs γs
i i ℓ ℓ ℓ ℓ ℓ i i + (ℓΓmn, Γjk − ℓΓjk, Γmn + 2Γℓm nΓjk, − Γℓk jΓmn, − Γℓji kΓmn, )γtj γuk γsm γsn
+R
i
j k ℓ kℓj γtu γs γs
+ 2R
i
j k ℓ jℓk γts γu γs
+ 2R
i
(41.4.7)
j k ℓ jℓk γus γt γs .
m i i m m m m i i i [ From notes: mΓℓn, Γjk − mΓjk, Γℓn + 2Γmℓ nΓjk, − Γmk jΓℓn, − Γmj kΓℓn, ?]
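The sign and index conventions of equation (41.4.1) in Theorem 41.4.1 can be sanity-checked on a simple family of geodesics. The sketch below assumes the unit 2-sphere with coordinates $(\theta, \varphi)$ and the family of meridians $\gamma(s, t) = (\theta, \varphi) = (s, t)$; the example and its notation are assumptions made here for illustration only.

```latex
% Assumed example: unit 2-sphere, with
%   \Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta,
%   \Gamma^\varphi_{\theta\varphi} = \Gamma^\varphi_{\varphi\theta} = \cot\theta,
% and the family of meridians \gamma(s, t) = (\theta, \varphi) = (s, t), so that
%   \gamma_s = \partial_\theta, \quad \gamma_t = \partial_\varphi,
% and all second and higher coordinate derivatives of \gamma vanish.
% Each curve s -> \gamma(s, t) is a geodesic because \Gamma^i_{\theta\theta} = 0.
%
% Left side of (41.4.1), from the general formula for (D^2_{\gamma_s} \gamma_t)^i:
(D^2_{\gamma_s} \gamma_t)^i
  = \Gamma^i_{\theta\varphi,\theta} + \Gamma^i_{\theta m} \Gamma^m_{\theta\varphi},
\qquad
(D^2_{\gamma_s} \gamma_t)^\theta = 0,
\qquad
(D^2_{\gamma_s} \gamma_t)^\varphi = -\frac{1}{\sin^2\theta} + \cot^2\theta = -1 .
% Right side of (41.4.1), from (41.3.2) with (R(\gamma_s, \gamma_t) \gamma_s)^i = R^i{}_{\theta\theta\varphi}:
R^\varphi{}_{\theta\theta\varphi}
  = \Gamma^\varphi_{\theta\varphi,\theta} - \Gamma^\varphi_{\theta\theta,\varphi}
  + \Gamma^\varphi_{m\theta} \Gamma^m_{\theta\varphi} - \Gamma^\varphi_{m\varphi} \Gamma^m_{\theta\theta}
  = -\frac{1}{\sin^2\theta} + \cot^2\theta = -1,
\qquad
R^\theta{}_{\theta\theta\varphi} = 0 .
```

Both sides equal $-\gamma_t$ in this assumed example, which agrees with the expected behaviour of the transverse field on a unit sphere: its length is $\sin s$, whose second derivative is $-\sin s$.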
[ This derivation of second-order equations of geodesic variation will be continued some time soon. These calculations are the prime motivation for the whole book. Around 2002-6-20, I finally got a “handle” on this problem. So I’m expecting to get this all worked out soon. ]

41.5. Riemannian manifolds
[ This section is very old and totally useless. It’s just a place-holder for future work. ]
A Riemannian manifold is a (topological) metric space which can be locally coordinatized in such a way that the square of the distance between two points is twice continuously differentiable with respect to the coordinates of the points and the matrix of second derivatives is invertible.
[ Should non-degeneracy be used here instead of invertibility? ]
Let $d : M \times M \to \mathbb{R}^+_0 = [0, \infty)$ be the distance function on an $n$-dimensional Riemannian manifold $M$. Then the metric tensor on $M$ at a point $p \in M$ (for a particular local coordinate map $(U, \psi)$) is the matrix $[g_{ij}(x)]_{i,j=1}^n$ defined by
$$g_{ij}(x) = \frac{1}{2} \left. \frac{\partial^2 d(x, y)^2}{\partial y^i \, \partial y^j} \right|_{y=x},$$
where $x = \psi(p)$.
[ It should be explained why the obvious simplification does not apply. The obvious simplification is to say that $(d^2/dy^2) f(y)^2 = 2(f f'' + (f')^2)$, and that since $d(x, y) = 0$ at $y = x$, the formula for $g_{ij}$ must reduce to $((d/dy) d(x, y)|_{y=x})^2$. However, the derivative of $d(x, y)$ generally does not exist in a neighbourhood of $y = x$. But then again, maybe the right derivative of $d(x, y)$ would do the job. ]
[ It should be possible to show that $g$ is $C^k$ if $d$ has some regularity property. ]
For integer $r \ge 0$, if the matrix elements $g_{ij}$ are $C^r$ with respect to the coordinates, then $M$ is said to be $C^{r+1}$.
The Christoffel symbol $\Gamma^k_{ij}$ in a Riemannian metric space satisfies
$$\Gamma^k_{ij} = \frac{1}{2} g^{kl} \left( \frac{\partial g_{li}}{\partial x^j} + \frac{\partial g_{lj}}{\partial x^i} - \frac{\partial g_{ij}}{\partial x^l} \right),$$
where the matrix $[g^{ij}(x)]_{i,j=1}^n$ is the inverse of the matrix $[g_{ij}(x)]_{i,j=1}^n$.
If $M$ is $C^2$, then a continuously differentiable curve $\gamma : [0, L] \to M$, mapping $s \mapsto (x^i(s))_{i=1}^n$ (in terms of local coordinates), is a normal geodesic (or a length-parametrized geodesic) if $L \ge 0$ and for all $s \in (0, L)$,
$$\frac{d^2 x^i}{ds^2} + \Gamma^i_{jk}(x(s)) \frac{dx^j}{ds} \frac{dx^k}{ds} = 0$$
as in an affine connection space, and
$$g_{ij}(x(s)) \frac{dx^i}{ds} \frac{dx^j}{ds} = 1.$$
Then $L = L(\gamma)$ is the length of the curve $\gamma$. The image of a normal geodesic is a geodesic. If $x, y \in M$ are points such that there is a unique geodesic $\gamma_{x,y}$ in $M$ with endpoints $x$ and $y$ satisfying $L(\gamma_{x,y}) = d(x, y)$, then the convex combination $C(x, y, \lambda)$ of $x$ and $y$ is uniquely defined for $\lambda \in [0, 1]$ by
$$C(x, y, \lambda) = \gamma_{x,y}(\lambda d(x, y)).$$
A subset $K$ of a Riemannian manifold $M$ is convex if for all $x, y \in K$, there exists a unique geodesic $\gamma_{x,y}$ with endpoints $x$ and $y$ such that $L(\gamma_{x,y}) = d(x, y)$. Then convex combinations are well-defined in a convex set $K$, and so $C : K \times K \times [0, 1] \to K$ is a well-defined function. A function $f : K \to \mathbb{R}$ on a convex set $K$ is a convex function if for all $x, y \in K$ and $\lambda \in [0, 1]$, $f(C(x, y, \lambda)) \le (1 - \lambda) f(x) + \lambda f(y)$. A function $f : K \to \mathbb{R}$ is said to be concave if $-f$ is convex.
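The Christoffel symbol formula of Section 41.5 can be checked numerically. The following is only a sketch under stated assumptions: the function names, the nested-list interface for the metric, the restriction to two dimensions and the use of central finite differences are all choices made here for illustration, not part of the text. The test metric is the Euclidean plane in polar coordinates, chosen because its symbols are well known.

```python
def christoffel_2d(metric, x, h=1e-5):
    """Approximate Gamma[k][i][j] = Gamma^k_ij at x for a 2-dimensional metric.

    metric(x) returns the matrix [g_ij(x)] as nested lists (an assumed
    interface, not notation from the text).
    """
    n = 2
    # dg[l][i][j] approximates the partial derivative of g_ij with respect
    # to x^l, using central finite differences with step h.
    dg = []
    for l in range(n):
        xp, xm = list(x), list(x)
        xp[l] += h
        xm[l] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(n)]
                   for i in range(n)])
    # Invert the 2x2 matrix [g_ij] to obtain [g^ij].
    g = metric(x)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    g_inv = [[g[1][1] / det, -g[0][1] / det],
             [-g[1][0] / det, g[0][0] / det]]
    # Gamma^k_ij = (1/2) g^{kl} (d_j g_li + d_i g_lj - d_l g_ij).
    return [[[0.5 * sum(g_inv[k][l] * (dg[j][l][i] + dg[i][l][j] - dg[l][i][j])
                        for l in range(n))
              for j in range(n)]
             for i in range(n)]
            for k in range(n)]


def polar_metric(x):
    """Euclidean metric of the plane in polar coordinates x = (r, angle)."""
    r = x[0]
    return [[1.0, 0.0], [0.0, r * r]]


gamma = christoffel_2d(polar_metric, [2.0, 0.3])
print(gamma[0][1][1])  # approximates Gamma^r_{angle angle} = -r
print(gamma[1][0][1])  # approximates Gamma^angle_{r angle} = 1/r
```

The same function accepts any other 2-dimensional metric with this interface, so it could also be pointed at a metric given in other coordinates.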
[ Define harmonic concavity around here somewhere. ]
The gradient $\nabla f$, the length of the gradient $|\nabla f|$, and the Laplacian $\Delta f$, of a twice differentiable function $f : S \to \mathbb{R}$ on a subset $S$ of a Riemannian manifold $M$ are defined by
$$\begin{aligned}
(\nabla f)_i &= \frac{\partial f}{\partial x^i}, \\
|\nabla f| &= \left( g^{ij} \frac{\partial f}{\partial x^i} \frac{\partial f}{\partial x^j} \right)^{1/2}, \text{ and} \\
\Delta f &= g^{ij} \left( \frac{\partial^2 f}{\partial x^i \partial x^j} - \Gamma^k_{ij} \frac{\partial f}{\partial x^k} \right)
= \frac{1}{\sqrt{g}} \frac{\partial}{\partial x^i} \left( \sqrt{g} \, g^{ij} \frac{\partial f}{\partial x^j} \right),
\end{aligned}$$
where $g = \det([g_{ij}]_{i,j=1}^n)$. If $M$ is a $C^{2+k}$ manifold and $g \in X^1(T^{0,2}(M))$, then (probably) $\Delta \in X^k(T^{[2]}(M))$.
[ The following two definitions should only be presented if they are needed for results. ] [ Definition of parallelism. ] [ Definition of normal coordinates. ]
Two vectors $X_{(1)}, X_{(2)} \in T_x(M)$ are said to be orthonormal at $x \in M$ if $g_{ij}(x) X^i_{(k)} X^j_{(l)} = \delta_{kl}$.
If $M$ is $C^3$, then the sectional curvature of $M$ at $x \in M$ in the “plane” of the orthonormal vectors $X, Y \in \mathbb{R}^n$ is defined to be
$$K_x(X, Y) = g_{im}(x) R^m{}_{jkl}(x) X^i Y^j X^k Y^l,$$
where the curvature tensor $R^i{}_{jkl}$ is given by Theorem 41.3.2. A Riemannian manifold of constant sectional curvature is one for which $\kappa = K_x(X, Y)$ is independent of the point $x$ and the pair $(X, Y)$ of orthonormal vectors.
41.6. Pseudo-Riemannian manifolds [ This section should include tensor calculus for special and general relativity. ]
41.7. Submanifolds of Euclidean space
[ This should, in particular, deal with Gaußian curvature and mean curvature. Also, a whole bunch of things have simplified formulas when the manifold is embedded in $\mathbb{R}^n$. And the book by Gallot/Hulin/Lafontaine [19] has lots of material on this subject. It seems like in a 2-dimensional embedded manifold, the Gauß curvature is the same as sectional curvature. The mean curvature just doesn’t mean anything in an intrinsic geometry. ]
41.7.1 Definition: A subset $M \subseteq \mathbb{R}^n$ is said to be a $C^r$ $k$-dimensional submanifold of $\mathbb{R}^n$ if
$$\forall x \in M,\ \exists U \in \mathrm{Top}(\mathbb{R}^n),\ x \in U \text{ and } \exists f \in C^r(U, \mathbb{R}^{n-k}),\ U \cap M = f^{-1}(0) \text{ and } f \text{ is a submersion}.$$
(A submersion is a map whose differential is everywhere surjective.)
[ See Gallot/Hulin/Lafontaine [19], page 2. ]
Chapter 42 Geometry of the 2-sphere
42.1 Terrestrial coordinates
42.2 Tensor calculus in terrestrial coordinates
42.3 Metric tensor calculation from the distance function
42.4 The principal fibre bundle in terrestrial coordinates
42.5 The Riemannian connection in terrestrial coordinates
42.6 Coordinates for polar exponential maps
42.7 The global tangent bundle
42.8 Isometries of $S^2$
42.9 Geodesic curves
42.10 Affinely parametrized geodesics
42.11 Convex sets and functions
42.12 Normal coordinates
42.13 Jacobi fields
42.14 Circles on the sphere
42.15 Calculation of the “hours of daylight”
42.16 Some standard map projections
42.17 Projection of a sphere onto a plane
42.0.1 Remark: The 2-sphere $S^2$ is among the simplest non-trivial geometries to study from the differential geometry perspective. It demonstrates a large number of features which are generalized through differential geometry to more general spaces. Although the 2-sphere has been studied for thousands of years, it still has a rich variety of properties which provide a testing ground for theoretical concepts. (Of course, there was a break in its study in Europe during the Dark Ages when people were told that the Earth was flat.) Understanding the 2-sphere is fundamental to many areas of physics from the quantum mechanics of atoms and elementary particles to astrophysics and cosmology. Therefore the 2-sphere deserves its own chapter, or maybe a whole book of its own. After all, the word “geometry” does mean the measurement of the Earth, and the Earth is a 2-sphere (approximately). The manifold $S^2$ is a good test of the practicality of all differential geometry definitions. Any definition which cannot be applied to $S^2$ probably should be changed.
[ See CRC [99], page 312 for spherical coordinates. See also Cohen-Tannoudji/Diu/Laloë [154], volume 1, especially the end-papers. ]
[ Many of the statements in the chapter would make nice exercises! ]
42.1. Terrestrial coordinates
[ Probably should use $(x_1, x_2, x_3)$ instead of $(x, y, z)$ in the following. ]
42.1.1 Remark: Coordinates really are necessary for differential geometry. How else would one indicate points in manifolds? One could use the methods of synthetic geometry, such as “the point on the intersection of this line and that”. But after a while, it becomes clear that these ways of indicating points are really coordinate systems in disguise. So those people who think that coordinates are bad are really fooling themselves. The “coordinate-free” concepts in differential geometry always turn out to contain coordinates in disguise.
42.1.2 Remark: Define the 2-sphere to be the subset $S^2 = \{(x, y, z) \in \mathbb{R}^3;\ x^2 + y^2 + z^2 = 1\}$ of $\mathbb{R}^3$. Define terrestrial spherical coordinates for $S^2$ by
$$\begin{aligned}
\psi &: S^2 \to \bigl( (-\pi, \pi] \times (-\pi/2, \pi/2) \bigr) \cup \{ (0, -\pi/2), (0, \pi/2) \} \\
\psi &: (x, y, z) \mapsto (\phi, \theta) = (\arctan(x, y), \arcsin(z)).
\end{aligned} \tag{42.1.1}$$
(See Section 20.13 for definitions of trigonometric functions.) The range of $\psi$ is illustrated in Figure 42.1.1. $\phi$ is called the longitude and $\theta$ is called the latitude.
[ Figure 42.1.1: Range of terrestrial coordinates for $S^2$. Horizontal axis $\phi$ from $-\pi$ to $\pi$; vertical axis $\theta$ from $-\pi/2$ to $\pi/2$. Labelled points: North Pole, München, Melbourne. ]
42.1.3 Remark: The map $\psi$ has a left inverse $\bar\psi$ defined by
$$\begin{aligned}
\bar\psi &: \mathbb{R}^2 \to S^2 \\
\bar\psi &: (\phi, \theta) \mapsto (x, y, z) = (\cos\theta \cos\phi, \cos\theta \sin\phi, \sin\theta).
\end{aligned} \tag{42.1.2}$$
It is noteworthy that this kind of inverse chart $\bar\psi$ does not have the “seams” which necessarily appear in the forward chart $\psi$.
[ Maybe the kind of multiple-covering chart in Remark 42.1.3 should be formalized somehow in general? ]
42.1.4 Remark: Figure 42.1.2 illustrates lines of constant longitude and latitude for a 2-sphere. The curves $\phi = 0$ and $\theta = 0$ are emphasized. The longitude intervals are 15°. The latitude intervals are 10°.
[ Figure 42.1.2: Domain of terrestrial coordinates for $S^2$. Labelled points: Palo Alto, München, Rām Allāh, Ouagadougou, La Habana, Kinshasa. ]
42.1.5 Remark: The terrestrial coordinates presented here do not constitute a chart for the manifold $S^2$, but they can be made into a chart by restricting $\bar\psi : \mathbb{R}^2 \to S^2$ to the set $V_0 = (-\pi, \pi) \times (-\pi/2, \pi/2) = \mathrm{Int}(\mathrm{Range}(\psi))$, or equivalently, by restricting $\psi : S^2 \to \mathbb{R}^2$ to
$$U_0 = \bigl\{ (x, y, z) \in S^2;\ x > -\sqrt{1 - z^2} \bigr\} = \{ (x, y, z) \in S^2;\ x > 0 \text{ or } y \ne 0 \} = \psi^{-1}(V_0),$$
which removes the poles and the international dateline from the domain of $\psi$. Define $\psi_0 = \psi|_{U_0}$ and $\bar\psi_0 = \bar\psi|_{V_0}$. Then $\psi_0$ is a chart for $S^2$.
42.1.6 Remark: Astronomical spherical coordinates are obtained by forcing $\phi$ into the range $[0, 2\pi)$. This is done by adding $2\pi$ to $\phi$ when $y < 0$. In quantum mechanics, it is customary to define $\theta = \arccos(z)$, so that $\theta \in [0, \pi]$. (See Cohen-Tannoudji/Diu/Laloë [154].) Terrestrial-style coordinates with $\phi \in (-\pi, \pi]$ and $\theta \in [-\pi/2, \pi/2]$ are assumed in this chapter unless otherwise indicated.
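The chart of equation (42.1.1) and its left inverse from equation (42.1.2) can be tried out numerically. This is a minimal sketch, assuming that the two-argument arctangent written $\arctan(x, y)$ in the text corresponds to Python's `math.atan2(y, x)`; the function names `psi` and `psi_bar` and the sample points are chosen here for illustration.

```python
import math


def psi(x, y, z):
    """Terrestrial chart: point of S^2 -> (longitude phi, latitude theta)."""
    # math.atan2(y, x) returns values in [-pi, pi]; the signed-zero edge
    # case on the dateline (y = -0.0, x < 0) is ignored in this sketch.
    phi = math.atan2(y, x)
    theta = math.asin(max(-1.0, min(1.0, z)))
    return phi, theta


def psi_bar(phi, theta):
    """Inverse map: (phi, theta) in R^2 -> point (x, y, z) of S^2."""
    return (math.cos(theta) * math.cos(phi),
            math.cos(theta) * math.sin(phi),
            math.sin(theta))


# Left inverse: psi_bar(psi(p)) recovers p for a point p on the sphere.
p = (0.36, -0.48, 0.8)
p_again = psi_bar(*psi(*p))

# psi_bar is defined on all of R^2, so psi(psi_bar(phi, theta)) only
# recovers (phi, theta) up to the seam: longitude 4.0 comes back as 4.0 - 2*pi.
phi_back, theta_back = psi(*psi_bar(4.0, 0.3))
print(p_again, phi_back, theta_back)
```

At the poles `math.atan2(0.0, 0.0)` returns 0, which agrees with the two extra points $(0, \pm\pi/2)$ in the range of $\psi$.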
42.2. Tensor calculus in terrestrial coordinates
[ Should split this section into those things which can be defined with an affine connection but no metric, versus those things which require a metric. Maybe also have a section on those things which don’t even require a connection. ]
The extrinsic tangent vectors generated by the parameters $\phi$ and $\theta$ are as follows.
$$\frac{\partial}{\partial\phi} = \begin{pmatrix} -\cos\theta \sin\phi \\ \cos\theta \cos\phi \\ 0 \end{pmatrix}, \qquad
\frac{\partial}{\partial\theta} = \begin{pmatrix} -\sin\theta \cos\phi \\ -\sin\theta \sin\phi \\ \cos\theta \end{pmatrix}. \tag{42.2.1}$$
The lengths of these vectors satisfy $|\partial/\partial\phi| = \cos\theta$ and $|\partial/\partial\theta| = 1$. A convenient abbreviation for these tangent vectors is $\partial_\phi = \partial/\partial\phi$ and $\partial_\theta = \partial/\partial\theta$. Strictly speaking, these vectors are defined by $\partial_\phi = \partial_{p,e_1,\psi}$ and $\partial_\theta = \partial_{p,e_2,\psi}$, where $\partial_{p,v,\psi} : f \mapsto \partial_v (f \circ \psi^{-1}) \big|_{\psi(p)}$ for $v \in \mathbb{R}^2$, $f \in C^1(S^2)$ and $p \in S^2$. Tangent vectors are always attached to some base point in the manifold, in this case $p \in S^2$. For convenience, the base point is omitted in the notation when there is no confusion.
The vectors in equation (42.2.1) are associated with the unit vectors in equation (42.2.2).
$$e_\phi(\phi, \theta) = (\cos\theta)^{-1} \frac{\partial}{\partial\phi} = \begin{pmatrix} -\sin\phi \\ \cos\phi \\ 0 \end{pmatrix}, \qquad
e_\theta(\phi, \theta) = \frac{\partial}{\partial\theta} = \begin{pmatrix} -\sin\theta \cos\phi \\ -\sin\theta \sin\phi \\ \cos\theta \end{pmatrix}. \tag{42.2.2}$$
The metric for the 2-sphere is inherited from its embedding in $\mathbb{R}^3$.
[ Show that $\bar d(p_1, p_2)^2 \le (\theta_1 - \theta_2)^2 + (\phi_1 - \phi_2)^2$ in Theorem 42.2.1. ]
42.2.1 Theorem: The distance $\bar d(p_1, p_2)$ within $\mathbb{R}^3$ between two points $p_1$ and $p_2$ in $S^2$, whose spherical coordinates are respectively $(\phi_1, \theta_1)$ and $(\phi_2, \theta_2)$, is given by
$$\begin{aligned}
\bar d(p_1, p_2)^2 &= 2 \bigl( 1 - \cos(\theta_1 - \theta_2) + \cos\theta_1 \cos\theta_2 (1 - \cos(\phi_1 - \phi_2)) \bigr) \\
&= 4 \sin^2\bigl( \tfrac12 (\theta_1 - \theta_2) \bigr) + 4 \cos\theta_1 \cos\theta_2 \sin^2\bigl( \tfrac12 (\phi_1 - \phi_2) \bigr).
\end{aligned}$$
The distance $d(p_1, p_2)$ between the same two points within the sphere surface $S^2$ is given by
$$\begin{aligned}
d(p_1, p_2) &= \arccos\bigl( \cos(\theta_1 - \theta_2) + \cos\theta_1 \cos\theta_2 (\cos(\phi_1 - \phi_2) - 1) \bigr) \\
&= \arccos\bigl( \sin\theta_1 \sin\theta_2 + \cos\theta_1 \cos\theta_2 \cos(\phi_1 - \phi_2) \bigr) \\
&= 2 \arcsin\bigl( \tfrac12 \bar d(p_1, p_2) \bigr) \\
&= 2 \arcsin\Bigl( \bigl( \sin^2( \tfrac12 (\theta_1 - \theta_2) ) + \cos\theta_1 \cos\theta_2 \sin^2( \tfrac12 (\phi_1 - \phi_2) ) \bigr)^{1/2} \Bigr).
\end{aligned}$$
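The two distance formulas of Theorem 42.2.1 can be compared with a direct calculation in $\mathbb{R}^3$. The sketch below is illustrative only: the function names and the two sample points are assumptions made here, and the check is numerical rather than a proof.

```python
import math


def psi_bar(phi, theta):
    """Point of S^2 with longitude phi and latitude theta, as in (42.1.2)."""
    return (math.cos(theta) * math.cos(phi),
            math.cos(theta) * math.sin(phi),
            math.sin(theta))


def chord_distance(phi1, theta1, phi2, theta2):
    """Distance within R^3, from the half-angle form in Theorem 42.2.1."""
    s = (math.sin(0.5 * (theta1 - theta2)) ** 2
         + math.cos(theta1) * math.cos(theta2)
         * math.sin(0.5 * (phi1 - phi2)) ** 2)
    return 2.0 * math.sqrt(s)


def surface_distance(phi1, theta1, phi2, theta2):
    """Distance within S^2, as 2 arcsin of half the chord distance."""
    half_chord = 0.5 * chord_distance(phi1, theta1, phi2, theta2)
    return 2.0 * math.asin(min(1.0, half_chord))  # clamp guards rounding


a = (0.3, 0.5)     # (phi1, theta1), arbitrary sample point
b = (-1.2, -0.4)   # (phi2, theta2), arbitrary sample point
p1, p2 = psi_bar(*a), psi_bar(*b)

# Direct calculations: Euclidean norm of p1 - p2, and arccos of p1 . p2.
direct_chord = math.dist(p1, p2)
direct_arc = math.acos(sum(u * v for u, v in zip(p1, p2)))

print(chord_distance(*a, *b), direct_chord)
print(surface_distance(*a, *b), direct_arc)
```

The half-angle form is used for the arcsine because it behaves better numerically than the arccosine form when the two points are close together.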
Proof: [ The distance $d(p_1, p_2)$ may be calculated as either $2 \arcsin( \tfrac12 \bar d(p_1, p_2) )$ or $\arccos(p_1 \cdot p_2)$. ]
42.2.2 Theorem: With respect to the coordinate map assumed in this chapter for $S^2$, namely the terrestrial coordinates in Section 42.1, the components $g_{ij}$ of the metric tensor, $\Gamma^k_{ij}$ of the Christoffel symbol for the Levi-Civita connection, and $R^i{}_{jkl}$ of the Riemann curvature tensor, satisfy
$$\begin{aligned}
g_{ij} &= \cos^2\theta \, \delta_i^1 \delta_j^1 + \delta_i^2 \delta_j^2 \\
g^{ij} &= \sec^2\theta \, \delta^i_1 \delta^j_1 + \delta^i_2 \delta^j_2 \\
\Gamma^k_{ij} &= -\tan\theta \, (\delta_i^1 \delta_j^2 + \delta_i^2 \delta_j^1) \delta^k_1 + \sin\theta \cos\theta \, \delta_i^1 \delta_j^1 \delta^k_2 \\
\Gamma^k_{ij,l} &= \bigl( -\sec^2\theta \, (\delta_i^1 \delta_j^2 + \delta_i^2 \delta_j^1) \delta^k_1 + (\cos^2\theta - \sin^2\theta) \delta_i^1 \delta_j^1 \delta^k_2 \bigr) \delta_l^2 \\
R^i{}_{jkl} &= (\delta^i_1 \delta_j^2 - \cos^2\theta \, \delta^i_2 \delta_j^1)(\delta_k^1 \delta_l^2 - \delta_k^2 \delta_l^1) \\
R_{ijkl} &= \cos^2\theta \, (\delta_i^1 \delta_j^2 - \delta_i^2 \delta_j^1)(\delta_k^1 \delta_l^2 - \delta_k^2 \delta_l^1) \\
R_{ij} &= g_{ij} \\
R &= 2 \\
G_{ij} &= 0 \\
K(X, Y) &= \cos^2\theta \, (X^1 Y^2 - X^2 Y^1)^2.
\end{aligned}$$
[ Also calculate sectional curvature in Theorem 42.2.2. ]
[ For $R^i{}_{jkl}$ in Theorem 42.2.2, note that $R_{jk} = R^l{}_{jkl} = -\delta_j^2 \delta_k^2 - \cos^2\theta \, \delta_j^1 \delta_k^1 = -g_{jk}$ !? ]
Proof: The formula for $g_{ij}$ can be determined directly from the metric function on $S^2$ from Theorem 42.2.1, or else from the general formula for embedded manifolds. [ This should be done explicitly for at least one of the methods of proof. ]
To prove the formula for $R^i{}_{jkl}$, note that
$$\Gamma^i_{jl,k} - \Gamma^i_{jk,l} = \bigl( \sec^2\theta \, \delta^i_1 \delta_j^2 - (\cos^2\theta - \sin^2\theta) \delta^i_2 \delta_j^1 \bigr) (\delta_k^1 \delta_l^2 - \delta_k^2 \delta_l^1)$$
and
$$\Gamma^m_{jl} \Gamma^i_{mk} - \Gamma^m_{jk} \Gamma^i_{ml} = -(\tan^2\theta \, \delta^i_1 \delta_j^2 + \sin^2\theta \, \delta^i_2 \delta_j^1) (\delta_k^1 \delta_l^2 - \delta_k^2 \delta_l^1).$$
The formula for $R^i{}_{jkl}$ follows immediately.
[ Also must calculate the operators $\nabla$ and $\Delta$ in spherical coordinates. And also the equation of geodesic variation, and the Jacobi fields, and the second variation of length and of energy. Also calculate the sectional curvature, the Ricci curvature, and other curvatures. Also must calculate the sum of the angles of an arbitrary triangle, and relate this to the area of triangle and the sectional curvature. Also must calculate the area and circumference of an arbitrary circle, and the ratio of area to radius, etc. And also must calculate operators such as $\nabla(\nabla u |\nabla u|^p)$. And it would be nice to have the solution of such equations as $\Delta u + u^\gamma = 0$ in a circle, with zero Dirichlet data, for $0 \le \gamma \le 1$. ]
[ Here must state a theorem on the slope $m = \tan\beta$ of a curve, namely that it is given by
$$m = \tan\beta = \sec\theta \, \frac{d\theta}{d\phi}.$$
The proof should be done in terms of the inner product $g_{ij} X^i Y^j$. ]
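The curvature formula of Theorem 42.2.2 can be cross-checked numerically from the Christoffel symbols, using the combination $\Gamma^i_{jl,k} - \Gamma^i_{jk,l} + \Gamma^m_{jl}\Gamma^i_{mk} - \Gamma^m_{jk}\Gamma^i_{ml}$ that appears in the proof. The following is a sketch under assumptions made here: index 0 stands for $\phi$ and index 1 for $\theta$, derivatives are approximated by central differences, and the Ricci tensor is taken as the contraction of the first and third indices, which is the contraction that reproduces $R_{ij} = g_{ij}$.

```python
import math


def gamma(theta):
    """Christoffel symbols G[k][i][j] = Gamma^k_ij from Theorem 42.2.2."""
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[0][0][1] = G[0][1][0] = -math.tan(theta)
    G[1][0][0] = math.sin(theta) * math.cos(theta)
    return G


def riemann(theta, h=1e-5):
    """R[i][j][k][l] assembled from the symbols and their derivatives."""
    G = gamma(theta)
    Gp, Gm = gamma(theta + h), gamma(theta - h)

    def dG(i, j, k, l):
        # Derivative of Gamma^i_jk with respect to coordinate l. The
        # symbols depend only on theta (index 1), so index 0 gives zero.
        if l == 0:
            return 0.0
        return (Gp[i][j][k] - Gm[i][j][k]) / (2 * h)

    rng = range(2)
    return [[[[dG(i, j, l, k) - dG(i, j, k, l)
               + sum(G[m][j][l] * G[i][m][k] - G[m][j][k] * G[i][m][l]
                     for m in rng)
               for l in rng] for k in rng] for j in rng] for i in rng]


def riemann_closed_form(theta):
    """The closed-form expression for R^i_jkl stated in Theorem 42.2.2."""
    c2 = math.cos(theta) ** 2
    rng = range(2)
    return [[[[((i == 0 and j == 1) - c2 * (i == 1 and j == 0))
               * ((k == 0 and l == 1) - (k == 1 and l == 0))
               for l in rng] for k in rng] for j in rng] for i in rng]


theta = 0.7  # arbitrary sample latitude
R = riemann(theta)
Rc = riemann_closed_form(theta)
max_error = max(abs(R[i][j][k][l] - Rc[i][j][k][l])
                for i in range(2) for j in range(2)
                for k in range(2) for l in range(2))

# Ricci tensor (first and third indices contracted) and scalar curvature.
ricci = [[sum(R[k][j][k][l] for k in range(2)) for l in range(2)]
         for j in range(2)]
scalar = ricci[0][0] / math.cos(theta) ** 2 + ricci[1][1]
print(max_error, ricci, scalar)
```

Contracting the first and last indices instead flips the sign of the Ricci tensor, which is the discrepancy flagged in the bracketed note after the theorem.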
42.3. Metric tensor calculation from the distance function
This section deals with the correspondence between the standard point-to-point metric (the distance function) and the standard Riemannian metric on $S^2$. It is well-known how the point-metric $d : M \times M \to \mathbb{R}^+_0$ is derived from the Riemannian metric $g \in T^{0,2}(M)$ for any Riemannian manifold $M$. The reverse calculation is not often shown in textbooks. (See Theorem 39.3.6.)
It follows from Theorem 42.2.1 that the point-to-point distance function $d : S^2 \times S^2 \to [0, \pi]$ satisfies
$$\sin^2\bigl( \tfrac12 d(p_1, p_2) \bigr) = \sin^2\bigl( \tfrac12 (\theta_1 - \theta_2) \bigr) + \cos\theta_1 \cos\theta_2 \sin^2\bigl( \tfrac12 (\phi_1 - \phi_2) \bigr), \tag{42.3.1}$$
or equivalently,
$$\cos(d(p_1, p_2)) = \cos(\theta_1 - \theta_2) + \cos\theta_1 \cos\theta_2 (\cos(\phi_1 - \phi_2) - 1), \tag{42.3.2}$$
where $\psi_0 : U_0 \to V_0 \subseteq \mathbb{R}^2$ is the terrestrial coordinates chart defined in Section 42.1, and $p_1, p_2 \in U_0 \subseteq S^2$ are points in $S^2$ with coordinates $(\phi_1, \theta_1) = \psi_0(p_1)$ and $(\phi_2, \theta_2) = \psi_0(p_2)$.
Very unfortunately, as is usual with distance functions, the function $d(p_1, p_2)$ is not at all differentiable at $p_2 = p_1$. Therefore the elementary rule for differentiating the square of a differentiable function is of little use.
It follows from equation (42.3.1) that $\lim_{p_2 \to p_1} d(p_1, p_2) = 0$. So $d(p_1, p_2)$ is continuous with respect to $p_2$ in a neighbourhood of $p_1$. Since $\sin^2 u \le u^2$ and $\cos\theta_1 \cos\theta_2 \le 1$, it also follows that $d(p_1, p_2) \le 2 \arcsin\bigl( \tfrac12 ((\phi_1 - \phi_2)^2 + (\theta_1 - \theta_2)^2)^{1/2} \bigr)$ when $p_2$ is close enough to $p_1$ for the arcsine to be defined. So all directional derivatives of $d^2$ with respect to $\phi_2$ and $\theta_2$ are zero at $p_2 = p_1$.
The second derivatives of $d(p_1, p_2)^2$ with respect to the coordinates of $p_2$ for fixed $p_1$ may be determined from the following calculations.
$$\begin{aligned}
\sin d \, \frac{\partial d}{\partial\phi_2} &= -\cos\theta_1 \cos\theta_2 \sin(\phi_1 - \phi_2) \\
\sin d \, \frac{\partial d}{\partial\theta_2} &= -\sin(\theta_1 - \theta_2) + \cos\theta_1 \sin\theta_2 (\cos(\phi_1 - \phi_2) - 1) \\
\sin d \, \frac{\partial^2 d}{\partial\phi_2^2} + \cos d \left( \frac{\partial d}{\partial\phi_2} \right)^2 &= \cos\theta_1 \cos\theta_2 \cos(\phi_1 - \phi_2) \\
\sin d \, \frac{\partial^2 d}{\partial\phi_2 \partial\theta_2} + \cos d \, \frac{\partial d}{\partial\phi_2} \frac{\partial d}{\partial\theta_2} &= \cos\theta_1 \sin\theta_2 \sin(\phi_1 - \phi_2) \\
\sin d \, \frac{\partial^2 d}{\partial\theta_2^2} + \cos d \left( \frac{\partial d}{\partial\theta_2} \right)^2 &= \cos(\theta_1 - \theta_2) + \cos\theta_1 \cos\theta_2 (\cos(\phi_1 - \phi_2) - 1).
\end{aligned}$$
It follows that
$$\begin{aligned}
\lim_{\phi_2 \to \phi_1} \frac{\partial d^2}{\partial\phi_2} &= 2 \lim_{\phi_2 \to \phi_1} d \, \frac{\partial d}{\partial\phi_2} \\
&= 2 \lim_{\phi_2 \to \phi_1} \frac{d}{\sin d} \bigl( -\cos\theta_1 \cos\theta_2 \sin(\phi_1 - \phi_2) \bigr) \\
&= 0.
\end{aligned}$$
This shows that $\partial d^2 / \partial\phi_2$ is continuous for $p_2$ in a neighbourhood of $p_1$. The same is true for $\partial d^2 / \partial\theta_2$. It follows that $d^2$ is $C^1$ with respect to $p_2$ in a neighbourhood of $p_1$. The second derivative with respect to $\phi_2$ may then be calculated as follows.
$$\begin{aligned}
\frac{\partial^2 d^2}{\partial\phi_2^2} &= \lim_{\phi_2 \to \phi_1} \frac{1}{\phi_2 - \phi_1} \, 2d \, \frac{\partial d}{\partial\phi_2} \\
&= 2 \lim_{\phi_2 \to \phi_1} \frac{d}{\sin d} \cdot \frac{-\cos\theta_1 \cos\theta_2 \sin(\phi_1 - \phi_2)}{\phi_2 - \phi_1} \\
&= 2 \cos\theta_1 \cos\theta_2,
\end{aligned}$$
which equals $2\cos^2\theta_1$ for $\theta_2 = \theta_1$. This agrees with the limit of $\partial^2 d^2 / \partial\phi_2^2$ as $p_2 \to p_1$. So this partial derivative is continuous at $p_2 = p_1$. It may similarly be shown that $\partial^2 d^2 / \partial\phi_2 \partial\theta_2 = 0$ and $\partial^2 d^2 / \partial\theta_2^2 = 2$ at $p_2 = p_1$, which also agree with the limits of the corresponding second derivatives. Therefore $d^2$ is a $C^2$ function for $p_2$ in a neighbourhood of $p_1$, and the Hessian matrix for $d^2(p_1, p_2)$ with respect to $p_2$ satisfies
$$\frac{1}{2} \begin{pmatrix} \dfrac{\partial^2 d^2}{\partial\phi_2^2} & \dfrac{\partial^2 d^2}{\partial\phi_2 \partial\theta_2} \\[2ex] \dfrac{\partial^2 d^2}{\partial\theta_2 \partial\phi_2} & \dfrac{\partial^2 d^2}{\partial\theta_2^2} \end{pmatrix}
= \begin{pmatrix} \cos^2\theta_1 & 0 \\ 0 & 1 \end{pmatrix},$$
which agrees with the standard Riemannian metric tensor in accordance with Theorem 39.3.6.
[ There should be a general theorem to guarantee that the square $d^2$ of a distance function $d$ is $C^2$ at the origin if $d$ has the form $d(p_1, p_2) = f(g_1(p_1, p_2) h_1(x_2 - x_1) + g_2(p_1, p_2) h_2(y_2 - y_1))$, where $(x_k, y_k) = \psi(p_k)$ for $k = 1, 2$, the functions $f$, $g_1$, $g_2$, $h_1$ and $h_2$ are $C^\infty$ with $f'(0) = h_1'(0) = h_2'(0)$, $g_1$ and $g_2$ are non-negative, and $f(z) \ge 0$ for all $z \ge 0$. Something along these lines might be useful for dealing with metrics in Riemannian spaces. ]
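The conclusion of Section 42.3, that half the Hessian of $d^2$ at $p_2 = p_1$ reproduces the metric tensor, can be checked numerically. The sketch below assumes second-order central differences with a step size chosen here by hand; the function names and the sample point are illustrative, and the result is an approximation, not a proof.

```python
import math


def dist(phi1, theta1, phi2, theta2):
    """Surface distance on S^2 from the half-angle formula (42.3.1)."""
    s = (math.sin(0.5 * (theta1 - theta2)) ** 2
         + math.cos(theta1) * math.cos(theta2)
         * math.sin(0.5 * (phi1 - phi2)) ** 2)
    return 2.0 * math.asin(math.sqrt(min(1.0, s)))


def half_hessian_d2(phi1, theta1, h=1e-3):
    """Approximate (1/2) Hessian of d(p1, p2)^2 in p2, evaluated at p2 = p1."""
    def f(dphi, dtheta):
        return dist(phi1, theta1, phi1 + dphi, theta1 + dtheta) ** 2

    f0 = f(0.0, 0.0)
    f_pp = (f(h, 0.0) - 2.0 * f0 + f(-h, 0.0)) / h ** 2
    f_tt = (f(0.0, h) - 2.0 * f0 + f(0.0, -h)) / h ** 2
    f_pt = (f(h, h) - f(h, -h) - f(-h, h) + f(-h, -h)) / (4.0 * h ** 2)
    return [[0.5 * f_pp, 0.5 * f_pt], [0.5 * f_pt, 0.5 * f_tt]]


theta1 = 0.6  # arbitrary sample latitude
g = half_hessian_d2(1.1, theta1)
print(g)                      # approximately [[cos(theta1)**2, 0], [0, 1]]
print(math.cos(theta1) ** 2)
```

Only $d^2$ is differenced here, never $d$ itself, which matches the point made in the text that $d$ is not differentiable at $p_2 = p_1$.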
42.4. The principal fibre bundle in terrestrial coordinates
Affine connections are defined on principal fibre bundles. (See Definition 23.9.2 for topological PFBs.) The PFB of interest is the tangent $n$-frame bundle of an $n$-dimensional manifold, acted on by the group $GL(n)$.
Three sets of manifold charts are required for the PFB $(P, \pi, B)$ for $B = S^2$. These charts are of the form $\psi_B : B \to \mathbb{R}^2$ for the manifold $B$, $\psi_P : P \to \mathbb{R}^6$ for the total space $P$, and $\psi_G : G \to \mathbb{R}^4$ for the structure group $G$.
The chart $\psi_B$ is defined in equation (42.1.1). Its inverse $\bar\psi_B$ is defined by equation (42.1.2). This chart leaves gaps at the poles, but this is not a serious problem because this section is only intended to demonstrate the basic principles.
[ Maybe in the following should use $\alpha$, $\beta$, $\gamma$ instead of $u$, $v$ and $w$. Then use $u$, $v$ instead of current $\alpha$ and $\beta$. ]
The set $P$ of coordinate frames for the manifold $B$ is the set of ordered pairs of independent tangent vectors at each point of $B$. This set may be expressed as
$$P = \bigl\{ (v_{11} e_\phi(b) + v_{21} e_\theta(b),\ v_{12} e_\phi(b) + v_{22} e_\theta(b));\ v \in M_{2,2}(\mathbb{R}),\ \det(v) \ne 0,\ b \in B \bigr\},$$
where $e_\phi(b)$ and $e_\theta(b)$ are the unit tangent vectors defined in equation (42.2.2) for terrestrial coordinates at $b \in B$. Note that the coefficients $v_{21}$ and $v_{12}$ are intentionally ‘out of order’, because row vectors and matrix multiplication on the right are assumed. Thus
$$(v_{11} e_\phi(b) + v_{21} e_\theta(b),\ v_{12} e_\phi(b) + v_{22} e_\theta(b)) = (e_\phi(b), e_\theta(b)) \begin{pmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{pmatrix}.$$
$P$ is a 6-dimensional manifold which can be coordinatized by the chart $\psi_P : P \to \mathbb{R}^6$ defined by
$$\psi_P : (v_{11} e_\phi(b) + v_{21} e_\theta(b),\ v_{12} e_\phi(b) + v_{22} e_\theta(b)) \mapsto (\phi, \theta, v_{11}, v_{12}, v_{21}, v_{22}).$$
Standard lexicographic ordering is used for the elements of the matrix $v$. The projection map $\pi : P \to B$ satisfies
$$\pi : (v_{11} e_\phi(b) + v_{21} e_\theta(b),\ v_{12} e_\phi(b) + v_{22} e_\theta(b)) \mapsto b.$$
The structure group $G$ is $GL(2) = \{ v \in M_{2,2}(\mathbb{R});\ \det(v) \ne 0 \}$. This set is adequately coordinatized by the matrix elements. Thus $\psi_G : v \mapsto (v_{11}, v_{12}, v_{21}, v_{22})$.
The action $\mu : G \times P \to P$ is defined with matrix multiplication. For $g \in G$ and $p \in P$, define
$$\mu(g, p) = p.g = (w_{11} e_\phi(b) + w_{21} e_\theta(b),\ w_{12} e_\phi(b) + w_{22} e_\theta(b)) = (e_\phi(b), e_\theta(b)) \begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix},$$
where
$$p = (e_\phi(b), e_\theta(b)) \begin{pmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{pmatrix}, \qquad g = \begin{pmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{pmatrix},$$
and
$$w = \begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix} = \begin{pmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{pmatrix} \cdot \begin{pmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{pmatrix} = v \cdot u.$$
Additionally, the principal fibre bundle $(P, \pi, B)$ requires a fibre atlas $A_{P,G}$, each of whose charts maps the total space $P$ to the structure group $G$. The single fibre bundle chart $\psi_{PG} : P \to G$ may be defined so that
$$\psi_{PG} : (v_{11} e_\phi(b) + v_{21} e_\theta(b),\ v_{12} e_\phi(b) + v_{22} e_\theta(b)) \mapsto v,$$
where $v = \psi_G^{-1}(v_{11}, v_{12}, v_{21}, v_{22}) \in GL(2)$.
The tangent bundle on $B$ can be coordinatized with $\psi_{T(B)} : T(B) \to \mathbb{R}^4$, where for $b \in B$ and $\alpha \in T_b(B)$, $\psi_{T(B)} : (b, \alpha) \mapsto (\phi, \theta, \alpha_\phi, \alpha_\theta)$, where $(\alpha_\phi, \alpha_\theta)$ is a tuple of contravariant coordinates for $\alpha$ so that $\alpha = \alpha_\phi e_\phi(b) + \alpha_\theta e_\theta(b)$.
The definition of an affine connection on the principal fibre bundle $(P, \pi, B)$ will require a tangent bundle on $P$. Since there is a differentiable manifold chart $\psi_P : P \to \mathbb{R}^6$, it is easy to construct a tangent bundle $T(P)$ of $P$ as the set of $(p, \beta)$ such that $p \in P$ and $\beta \in T_p(P)$. This tangent bundle $T(P)$ must now be coordinatized with a chart.
[ Maybe put $e_\phi$, $e_\theta$ instead of $\partial_\phi$, $\partial_\theta$ in the next paragraph? ]
The sequence of six vectors $(\partial_\phi(p), \partial_\theta(p), \partial_{v_{11}}(p), \partial_{v_{12}}(p), \partial_{v_{21}}(p), \partial_{v_{22}}(p))$ is a basis for $T_p(P)$ for each $p \in P$. For brevity, this basis will be denoted $(\partial_\phi, \partial_\theta, \partial_{v_{11}}, \partial_{v_{12}}, \partial_{v_{21}}, \partial_{v_{22}})$, when the point $p$ is implicit. Each vector $\beta \in T_p(P)$ may be mapped to $\mathbb{R}^6$ by $\beta \mapsto (\beta_\phi, \beta_\theta, \beta_{11}, \beta_{12}, \beta_{21}, \beta_{22})$, where these contravariant coordinates are chosen so that
$$\beta = \beta_\phi \partial_\phi(p) + \beta_\theta \partial_\theta(p) + \beta_{11} \partial_{v_{11}}(p) + \beta_{12} \partial_{v_{12}}(p) + \beta_{21} \partial_{v_{21}}(p) + \beta_{22} \partial_{v_{22}}(p).$$
The coordinatization of $P$ is provided already by $\psi_P$. So now this can be combined with the above coordinates for $T_p(P)$ to give the combined chart $\psi_{T(P)} : T(P) \to \mathbb{R}^{12}$ defined by
$$\psi_{T(P)} : (p, \beta) \mapsto (\phi, \theta, v_{11}, v_{12}, v_{21}, v_{22}, \beta_\phi, \beta_\theta, \beta_{11}, \beta_{12}, \beta_{21}, \beta_{22}).$$
42.4.1 Remark: A rather amusing thing about the approach to connections taken in this section is the fact that this is the coordinate-free way of doing it. But it involves heaps more coordinates than the traditional coordinate method using the $\Gamma$ symbol as in Section 42.2. This just shows that the obsession that some people have with “coordinate-free” methods actually makes things worse in practice. It seems like the popular “coordinate-free” philosophy arose from the 19th century battle between the synthetic geometers (coordinate-free) and the analytical geometers (using coordinates). Some people are still trying to do differential geometry along 19th century synthetic lines. It’s a kind of nostalgia maybe. But it’s all educational.

42.5. The Riemannian connection in terrestrial coordinates
42.5.1 Remark: The Riemannian connection for $S^2$ defines parallel transport of tangent vectors on $S^2$. Figure 42.5.1 shows parallel transport of a vector along two paths in $S^2$. Note that the region bounded by the paths has area $\pi/4$, which equals the difference in orientation of the axes at the end-points.
[ Figure 42.5.1: Parallel transport in $S^2$ along boundary of $\phi \in (0, \frac{\pi}{2})$, $\theta \in (\frac{\pi}{6}, \frac{\pi}{2})$. Labels: “Start here”, “Finish here”, Hamburg, Hà Nội. ]
This section uses the notation of Section 42.4 for the principal fibre bundle $(P, \pi, B)$ for $B = S^2$ with structure group $G = GL(2)$.
42.5.2 Remark: An affine connection $\bar\rho$ is a function defined on the 6-dimensional total space $P$ so that for each $p \in P$, $\bar\rho_p$ is a map from the 2-dimensional tangent space $T_{\pi(p)}(B)$ to the 6-dimensional tangent space $T_p(P)$.
The components of $\bar\rho_p$ in the horizontal direction are quite easy to determine. As a point $b$ moves in $B$, the corresponding element $p \in P$ must move so that $\pi(p) = b$. Therefore the partial derivative of the $\phi$ component of the 6-tuple $(\phi, \theta, v_{11}, v_{12}, v_{21}, v_{22})$ for a point in $P$ with respect to the $\phi$ component of $(\phi, \theta)$ for a point in $B$ must equal 1. The derivative with respect to $\theta$ must be 0. The derivatives of the $\theta$ component are similar.
It remains to determine the derivatives of the coordinates $(v_{11}, v_{12}, v_{21}, v_{22})$ of elements in $P$ with respect to coordinates $(\phi, \theta)$ of points in $B$. The $v$-coordinates must change in such a way that the vectors $(v_{11}, v_{12})$ and $(v_{21}, v_{22})$ maintain parallelism as the base point $(\phi, \theta)$ changes. In the case of the Levi-Civita connection, these vectors will remain orthogonal if they are initially orthogonal. This implies that the derivative of $(v_{11}, v_{12}, v_{21}, v_{22})$ is an element of the Lie algebra of the Lie group $SO(2)$. In other words, the derivative must be an anti-symmetric matrix.
[ Use superscripts for $\alpha$ and $\beta$ in the following? The map $\bar\rho_p$ should be from $\mathbb{R}^2$ to $\mathbb{R}^6$. Must use the coordinate vectors $\partial_\phi$ and $\partial_\theta$ so that the matrices are as follows.
$$\alpha_\phi \begin{pmatrix} 0 & -\tan\theta \\ \sin\theta \cos\theta & 0 \end{pmatrix} + \alpha_\theta \begin{pmatrix} -\tan\theta & 0 \\ 0 & 0 \end{pmatrix}.$$
]
42.5.3 Remark: The Riemannian connection $\bar\rho$ for $S^2$ satisfies the following.
$$\bar\rho_p\bigl( \psi_{T(B)}^{-1}(\phi, \theta, \alpha_\phi, \alpha_\theta) \bigr) = \psi_{T(P)}^{-1}(\phi, \theta, v_{11}, v_{12}, v_{21}, v_{22}, \alpha_\phi, \alpha_\theta, \beta_{11}, \beta_{12}, \beta_{21}, \beta_{22}),$$
where $p = \psi_P^{-1}((\phi, \theta, v_{11}, v_{12}, v_{21}, v_{22})) \in P$ and
$$\begin{pmatrix} \beta_{11} & \beta_{12} \\ \beta_{21} & \beta_{22} \end{pmatrix} = \left( \alpha_\phi \begin{pmatrix} 0 & -\sin\theta \\ \sin\theta & 0 \end{pmatrix} + \alpha_\theta \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \right) \cdot \begin{pmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{pmatrix}. \tag{42.5.1}$$
The $\beta$ matrix is clearly linear with respect to the tangent vector coordinates $\alpha$, and invariant under right action by the structure group.
[ Really should do the map $\bar\rho_p$ only from one tangent space to the other and ignore the point coordinates. So it should map a 2-d space to a 6-d space. ]
The fact that this connection is an orthogonal connection is due to the fact that the matrices in equation (42.5.1) are real and anti-symmetric. Anti-symmetric $2 \times 2$ matrices are in fact the generators of $SO(2)$, which follows from the well-known formulas from linear ODE systems theory:
$$\exp\left( \lambda \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \right) = \begin{pmatrix} \cos\lambda & -\sin\lambda \\ \sin\lambda & \cos\lambda \end{pmatrix}$$
and
$$\frac{d}{d\lambda} \begin{pmatrix} \cos\lambda & -\sin\lambda \\ \sin\lambda & \cos\lambda \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} \cos\lambda & -\sin\lambda \\ \sin\lambda & \cos\lambda \end{pmatrix}.$$
With the constraint that the connection be orthogonal, there are at most 2 independent parameters in the expression for the $\beta$ matrix. In the case of general affine connections, there are clearly 8 independent parameters.
[ Should cover linear ODE systems theory in the Lie groups chapter. ]
With the above connection $\bar\rho$, it is possible to generate parallel transport along paths. For example, consider the curve $\gamma : \mathbb{R} \to B$ defined by $\gamma : t \mapsto \psi_B^{-1}(t, \theta_1)$. This is a curve around the latitude line $\theta = \theta_1$ for $\theta_1 \in (-\frac{\pi}{2}, \frac{\pi}{2})$. As discussed in Section 36.8, it should be possible to define a suitable function
$$\hat\gamma : \mathbb{R}^2 \to \{ f : \pi^{-1}(\{b_1\}) \approx \pi^{-1}(\{b_2\});\ b_1, b_2 \in B \},$$
such that $\mathrm{Dom}(\hat\gamma_{s,t}) = \pi^{-1}(\{\gamma(s)\})$ and $\mathrm{Range}(\hat\gamma_{s,t}) = \pi^{-1}(\{\gamma(t)\})$ for all $s, t \in \mathbb{R}$. The parallel transport function $\hat\gamma$ will be such that for $s = 0$, each frame in $\pi^{-1}(\{\gamma(0)\})$ is rotated by angle $t \sin\theta_1$. So the difference in angle for any given $s, t \in \mathbb{R}$ will be $(t - s) \sin\theta_1$. Must show that this satisfies the equations for parallel transport in terms of $\bar\rho$. In terms of $\psi_B$, obtain $\gamma'(t) = \cos\theta_1 \, e_\phi(\gamma(t))$. This will justify the parallel translation in Figure 42.5.1. The matrix for the parallel translation operation from $\gamma(s)$ to $\gamma(t)$ is
$$\begin{pmatrix} \cos\lambda & -\sin\lambda \\ \sin\lambda & \cos\lambda \end{pmatrix} = \begin{pmatrix} \cos((t - s) \sin\theta_1) & -\sin((t - s) \sin\theta_1) \\ \sin((t - s) \sin\theta_1) & \cos((t - s) \sin\theta_1) \end{pmatrix}.$$
[ Must show that this satisfies the parallel transport equation for given $\bar\rho$. ]
[ Putting $s = 0$, $t = \phi$ above gives $\lambda = \phi \sin\theta$. ]
[ The map for a given fixed initial $p \in \pi^{-1}(\{\gamma(s)\})$ for a given $s \in \mathbb{R}$ is called a “lift” in EDM2 [34], section 80.C. This is denoted $\gamma_p^*$ and is a curve in $P$ such that $\gamma_p^*(s) \in \pi^{-1}(\{\gamma(s)\})$ for all $s \in \mathrm{Dom}(\gamma)$. ]
[ Also deal here with the “horizontal subspace” style of connection definition $Q_x$. ]
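The rotation angle $t \sin\theta_1$ for transport along a latitude line can be checked against the classical parallel transport equation $dV^k/dt + \Gamma^k_{ij} \, \dot\gamma^i V^j = 0$, using the Christoffel symbols of Theorem 42.2.2. This is a sketch only: it works with the $\Gamma$ symbols rather than the map $\bar\rho$, the Runge-Kutta integrator and all names are choices made here, and only the magnitude of the rotation relative to the frame $(e_\phi, e_\theta)$ is compared, since the sign depends on orientation conventions.

```python
import math


def transport_along_latitude(theta1, t_end, v, steps=2000):
    """Transport V = (V^phi, V^theta) along gamma(t) = (t, theta1).

    With gamma' = (1, 0), the equation dV^k/dt = -Gamma^k_{1j} V^j becomes
    dV^phi/dt = tan(theta1) V^theta and
    dV^theta/dt = -sin(theta1) cos(theta1) V^phi.
    Integrated with the classical fourth-order Runge-Kutta method.
    """
    tan1 = math.tan(theta1)
    sc1 = math.sin(theta1) * math.cos(theta1)

    def rhs(w):
        return (tan1 * w[1], -sc1 * w[0])

    dt = t_end / steps
    for _ in range(steps):
        k1 = rhs(v)
        k2 = rhs((v[0] + 0.5 * dt * k1[0], v[1] + 0.5 * dt * k1[1]))
        k3 = rhs((v[0] + 0.5 * dt * k2[0], v[1] + 0.5 * dt * k2[1]))
        k4 = rhs((v[0] + dt * k3[0], v[1] + dt * k3[1]))
        v = (v[0] + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             v[1] + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
    return v


theta1 = math.pi / 6      # latitude of the lower edge in Figure 42.5.1
t_end = math.pi / 2       # longitude interval covered
v0 = (1.0 / math.cos(theta1), 0.0)   # the unit vector e_phi at the start

v_phi, v_theta = transport_along_latitude(theta1, t_end, v0)

# Components relative to the orthonormal frame (e_phi, e_theta).
a, b = math.cos(theta1) * v_phi, v_theta
angle = math.atan2(b, a)
print(abs(angle), t_end * math.sin(theta1))   # both close to pi/4
```

For these sample values the rotation has magnitude $\pi/4$, and the transported vector keeps unit length, as expected for an orthogonal connection.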
42.6. Coordinates for polar exponential maps

Terrestrial coordinates have bad discontinuities at the poles which are difficult to remove. Therefore to construct a 2-chart atlas for $S^2$, it is necessary to introduce charts which are better behaved at the poles. Define the 2-chart atlas $(\psi_1, \psi_2)$ for $S^2$, where $\psi_1 : S^2 \setminus \{(0, 0, -1)\} \to \mathbb{R}^2$ and $\psi_2 : S^2 \setminus \{(0, 0, 1)\} \to \mathbb{R}^2$ are defined by
$$\psi_1(x, y, z) = \begin{pmatrix} x (x^2 + y^2)^{-1/2} \arccos(z) \\ y (x^2 + y^2)^{-1/2} \arccos(z) \end{pmatrix} = \begin{pmatrix} x \arccos(z) (1 - z^2)^{-1/2} \\ y \arccos(z) (1 - z^2)^{-1/2} \end{pmatrix}$$
$$\psi_2(x, y, z) = \begin{pmatrix} x (x^2 + y^2)^{-1/2} (\pi - \arccos(z)) \\ y (x^2 + y^2)^{-1/2} (\pi - \arccos(z)) \end{pmatrix} = \begin{pmatrix} x (\pi - \arccos(z)) (1 - z^2)^{-1/2} \\ y (\pi - \arccos(z)) (1 - z^2)^{-1/2} \end{pmatrix}$$
for $|z| \ne 1$, and $\psi_1(0, 0, 1) = \psi_2(0, 0, -1) = (0, 0)$. These are exponential maps radiating from $(0, 0, 1)$ and $(0, 0, -1)$ respectively. The ranges of the charts $\psi_1$ and $\psi_2$ are illustrated in Figure 42.6.1.

[ Figure 42.6.1. Ranges of charts $\psi_1$ and $\psi_2$ for $S^2$. Two panels with axes $\xi = \psi_1^1(x, y, z)$, $\eta = \psi_1^2(x, y, z)$ and $\xi = \psi_2^1(x, y, z)$, $\eta = \psi_2^2(x, y, z)$, each running from $-3$ to $3$. Marked are the North Pole ($\theta = \pi/2$), the South Pole ($\theta = -\pi/2$), the equator ($\theta = 0$) and the images of the points $(1, 0, 0)$ and $(0, 1, 0)$. ]

The inverses of the charts $\psi_1$ and $\psi_2$ are as follows.
$$\bar\psi_1(\xi, \eta) = \psi_1^{-1}(\xi, \eta) = \begin{pmatrix} \xi (\xi^2 + \eta^2)^{-1/2} \sin((\xi^2 + \eta^2)^{1/2}) \\ \eta (\xi^2 + \eta^2)^{-1/2} \sin((\xi^2 + \eta^2)^{1/2}) \\ \cos((\xi^2 + \eta^2)^{1/2}) \end{pmatrix}$$
$$\bar\psi_2(\xi, \eta) = \psi_2^{-1}(\xi, \eta) = \begin{pmatrix} \xi (\xi^2 + \eta^2)^{-1/2} \sin((\xi^2 + \eta^2)^{1/2}) \\ \eta (\xi^2 + \eta^2)^{-1/2} \sin((\xi^2 + \eta^2)^{1/2}) \\ -\cos((\xi^2 + \eta^2)^{1/2}) \end{pmatrix}.$$
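The chart formulas of this section can be checked numerically. The following is only a minimal sketch, not part of the book's own tooling: the function names are hypothetical, and `atan2(rho, z)` is used in place of $\arccos(z)$ on the assumption that the input point lies on the unit sphere, where the two agree.

```python
import math

def psi1(x, y, z):
    """Chart psi_1 on the unit sphere minus the South Pole."""
    rho = math.hypot(x, y)
    if rho == 0.0:
        if z > 0.0:
            return (0.0, 0.0)            # psi_1(0, 0, 1) = (0, 0)
        raise ValueError("psi1 is undefined at the South Pole")
    r = math.atan2(rho, z)               # equals arccos(z) on the unit sphere
    return (x * r / rho, y * r / rho)

def psi2(x, y, z):
    """Chart psi_2 on the unit sphere minus the North Pole."""
    rho = math.hypot(x, y)
    if rho == 0.0:
        if z < 0.0:
            return (0.0, 0.0)            # psi_2(0, 0, -1) = (0, 0)
        raise ValueError("psi2 is undefined at the North Pole")
    r = math.atan2(rho, -z)              # equals pi - arccos(z)
    return (x * r / rho, y * r / rho)

def psi1_inv(xi, eta):
    """Inverse of psi_1; the limit sin(r)/r = 1 is used at r = 0."""
    r = math.hypot(xi, eta)
    s = math.sin(r) / r if r > 0.0 else 1.0
    return (xi * s, eta * s, math.cos(r))

def psi2_inv(xi, eta):
    """Inverse of psi_2; note the sign of the third component."""
    r = math.hypot(xi, eta)
    s = math.sin(r) / r if r > 0.0 else 1.0
    return (xi * s, eta * s, -math.cos(r))

def terrestrial_point(phi, theta):
    """Point of the unit sphere with longitude phi and latitude theta."""
    return (math.cos(theta) * math.cos(phi),
            math.cos(theta) * math.sin(phi),
            math.sin(theta))
```

For a point with latitude $\theta$, the chart values have norms $\pi/2 - \theta$ and $\pi/2 + \theta$ respectively, so the two chart ranges are open discs of radius $\pi$.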
The maps $\tau_i = \psi_i \circ \psi_0^{-1}$, where the terrestrial coordinate chart $\psi_0$ is as defined in Section 42.1, are as follows. It is convenient to add subscripts to $\xi$ and $\eta$ to indicate which chart $\psi_i$ they belong to.
$$\begin{pmatrix} \xi_1 \\ \eta_1 \end{pmatrix} = \tau_1(\phi, \theta) = (\psi_1 \circ \psi_0^{-1})(\phi, \theta) = \begin{pmatrix} (\pi/2 - \theta) \cos\phi \\ (\pi/2 - \theta) \sin\phi \end{pmatrix}$$
$$\begin{pmatrix} \xi_2 \\ \eta_2 \end{pmatrix} = \tau_2(\phi, \theta) = (\psi_2 \circ \psi_0^{-1})(\phi, \theta) = \begin{pmatrix} (\pi/2 + \theta) \cos\phi \\ (\pi/2 + \theta) \sin\phi \end{pmatrix}.$$
The maps $\tau_i^{-1} = \psi_0 \circ \psi_i^{-1}$, where the map $\psi_0$ is as defined in Section 42.1, are as follows.
$$\begin{pmatrix} \phi \\ \theta \end{pmatrix} = \tau_1^{-1}(\xi_1, \eta_1) = (\psi_0 \circ \psi_1^{-1})(\xi_1, \eta_1) = \begin{pmatrix} \arctan(\xi_1, \eta_1) \\ \pi/2 - (\xi_1^2 + \eta_1^2)^{1/2} \end{pmatrix}$$
$$\begin{pmatrix} \phi \\ \theta \end{pmatrix} = \tau_2^{-1}(\xi_2, \eta_2) = (\psi_0 \circ \psi_2^{-1})(\xi_2, \eta_2) = \begin{pmatrix} \arctan(\xi_2, \eta_2) \\ -\pi/2 + (\xi_2^2 + \eta_2^2)^{1/2} \end{pmatrix}.$$

For embedded manifolds, it is not necessary to introduce the concept of a set graft (as in Definitions 6.10.5 and 15.11.2) to construct the set of points in the manifold. It is, however, generally very useful to use set grafts to construct tangent bundles for such spaces. Nevertheless, it is instructive to construct $S^2$ here from the graft of the two polar exponential maps $\psi_1$ and $\psi_2$.

A set graft indicates which points on two patch sets are to be identified and regarded as the same point. In this case, the set graft must be a subset $X$ of the partial Cartesian product $\mathring{\times}_{i=1,2} V_i$, where $V_i = \mathrm{Range}(\psi_i)$ for $i = 1, 2$. (See Definition 6.10.1 for ‘partial Cartesian product’.) Let $(\xi_1, \eta_1) \in V_1 \setminus \{(0, 0)\}$. The corresponding point in $V_2$ is $(\xi_2, \eta_2) = \alpha_{12}(\xi_1, \eta_1) \in V_2 \setminus \{(0, 0)\}$, where $\alpha_{12} : V_1 \setminus \{(0, 0)\} \to V_2 \setminus \{(0, 0)\}$ is defined by
$$\alpha_{12}(\xi_1, \eta_1) = (\pi - |(\xi_1, \eta_1)|) \frac{(\xi_1, \eta_1)}{|(\xi_1, \eta_1)|}. \tag{42.6.1}$$
Let $\zeta_1 = (\xi_1, \eta_1)$ and $\zeta_2 = (\xi_2, \eta_2)$. Then
$$\zeta_2 = \alpha_{12}(\zeta_1) = (\pi - |\zeta_1|) \frac{\zeta_1}{|\zeta_1|}.$$
Coincidentally, it happens that $\alpha_{12}^{-1} = \alpha_{12}$. The graft set $X$ is defined by
$$X = \{ ((0, 0), \cdot), (\cdot, (0, 0)) \} \cup \{ ((\xi_1, \eta_1), \alpha_{12}(\xi_1, \eta_1)); (\xi_1, \eta_1) \in V_1 \setminus \{(0, 0)\} \},$$
where the dot ‘$\cdot$’ represents an undefined value in a partial sequence in the partial Cartesian product $\mathring{\times}_{i=1,2} V_i$. (See Definition 6.10.5.) If $X$ is given the graft topology from $V_1$ and $V_2$, then $X \approx S^2$.

This graft set $X$ may be used as the base set for the manifold $S^2$, but in practice it is too clumsy. It is unnecessary because we already have the embedded subset of $\mathbb{R}^3$ to focus on as the set of points in the manifold. However, it is not quite so clear what one should take to be the set of points in the tangent bundle. The bundle itself may be defined as a set of ordered pairs $(p, \partial_{p,v,\psi})$ where $\partial_{p,v,\psi}$ is the tangent vector at $p$ with coordinates $v \in \mathbb{R}^2$ for chart $\psi$. But when it is time to talk about the tangent bundle of the tangent bundle, things become somewhat muddled. Then it is better to have a concrete set to point to as representing the tangent bundle in a coordinate-independent sense, since the form $(p, \partial_{p,v,\psi})$ can be valid only in the context of a single chart.

[ Near here, should show that $\psi_2 \circ \psi_1^{-1}$ is $C^\infty$ etc., and calculate the derivatives of these chart transition functions. ]
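The grafting rule (42.6.1) can be checked against the maps $\tau_1$ and $\tau_2$: a point with terrestrial coordinates $(\phi, \theta)$ should satisfy $\alpha_{12}(\tau_1(\phi, \theta)) = \tau_2(\phi, \theta)$, and applying the same formula twice should return the starting point. The sketch below uses hypothetical function names and is offered only as a sanity check under those assumptions.

```python
import math

def tau1(phi, theta):
    """Terrestrial coordinates to chart psi_1 coordinates."""
    r = math.pi / 2 - theta
    return (r * math.cos(phi), r * math.sin(phi))

def tau2(phi, theta):
    """Terrestrial coordinates to chart psi_2 coordinates."""
    r = math.pi / 2 + theta
    return (r * math.cos(phi), r * math.sin(phi))

def alpha12(xi, eta):
    """Grafting map (42.6.1) on the punctured open disc of radius pi."""
    r = math.hypot(xi, eta)
    if not 0.0 < r < math.pi:
        raise ValueError("alpha12 needs 0 < |(xi, eta)| < pi")
    s = (math.pi - r) / r
    return (s * xi, s * eta)
```

Since the formula for $\alpha_{12}$ only replaces the radius $r$ by $\pi - r$ along the same ray, applying it twice restores the original radius, which is the involution property noted in the text.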
42.7. The global tangent bundle
This section uses the 2-chart atlas $\{\psi_1, \psi_2\}$ for $S^2$ which is defined in Section 42.6. The manifold charts $\psi_1$ and $\psi_2$ may be extended to tangent bundle charts which look like $B^2_{0,\pi} \times \mathbb{R}^2$ for $\psi_1$ and $\psi_2$ with $G = GL(2)$ or $O(2)$. (The groups $SL(2)$ and $SO(2)$ are unsuitable unless the orientation of $\psi_2$ is flipped.) For $C^1$ functions $f : S^2 \to \mathbb{R}$, consider the derivatives $v^i (\partial/\partial\zeta_j^i)(f \circ \psi_j^{-1}(\zeta_j))$, where $\zeta_j = (\xi, \eta) \in \mathrm{Range}(\psi_j)$.

There is a serious problem here. The general definition of a tangent vector regards the points of a manifold as being abstract points. But when calculating derivatives such as $(\partial/\partial\zeta_j^i)(f \circ \psi_j^{-1}(\zeta_j))$, it is only possible to apply rules such as the composition rule for derivatives if the points of the manifold, such as $S^2$, are coordinatized. In the case of an embedded manifold, the coordinates of the ambient space, such as $\mathbb{R}^3$, are not completely adequate. This is especially true in the case of derivatives of order greater than 1, but even first order derivatives have the difficulty that they must first be extended from the manifold into the ambient space.

The requirement for coordinates to be defined on a manifold in order to define tangent vectors as derivative operators means that a chart is required. In other words, there is in practice really no such thing as abstract points. For practical calculations, it is always necessary to work within particular charts. Even defining the value of a function on a manifold requires coordinates, unless it is some simple form of function, such as “distance from a given point”, for instance. Thus everything is ultimately reduced to Cartesian coordinates in any real calculations. All of the theoretical definitions for manifolds in terms of abstract points are valuable only for clear thinking, not for any practical calculations.
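Using the chart $\psi$ in Section 42.1, the tangent vectors from the chart $\psi_1$ may be calculated by elementary calculus from the formula $f \circ \psi_1^{-1} = (f \circ \psi^{-1}) \circ (\psi \circ \psi_1^{-1})$ as follows.
$$\begin{aligned} v^1 \frac{\partial}{\partial\xi_1} + v^2 \frac{\partial}{\partial\eta_1} &= v^1 \left( \frac{\partial\phi}{\partial\xi} \frac{\partial}{\partial\phi} + \frac{\partial\theta}{\partial\xi} \frac{\partial}{\partial\theta} \right) + v^2 \left( \frac{\partial\phi}{\partial\eta} \frac{\partial}{\partial\phi} + \frac{\partial\theta}{\partial\eta} \frac{\partial}{\partial\theta} \right) \\ &= v^1 \left( \frac{-\eta}{\xi^2 + \eta^2} \frac{\partial}{\partial\phi} + \frac{-\xi}{(\xi^2 + \eta^2)^{1/2}} \frac{\partial}{\partial\theta} \right) + v^2 \left( \frac{\xi}{\xi^2 + \eta^2} \frac{\partial}{\partial\phi} + \frac{-\eta}{(\xi^2 + \eta^2)^{1/2}} \frac{\partial}{\partial\theta} \right) \\ &= v^1 \left( \frac{-\sin\phi}{\pi/2 - \theta} \frac{\partial}{\partial\phi} - \cos\phi \frac{\partial}{\partial\theta} \right) + v^2 \left( \frac{\cos\phi}{\pi/2 - \theta} \frac{\partial}{\partial\phi} - \sin\phi \frac{\partial}{\partial\theta} \right). \end{aligned}$$
Similarly, differentiation of $f \circ \psi_2^{-1}$ yields
$$v^1 \frac{\partial}{\partial\xi_2} + v^2 \frac{\partial}{\partial\eta_2} = v^1 \left( \frac{-\sin\phi}{\pi/2 + \theta} \frac{\partial}{\partial\phi} + \cos\phi \frac{\partial}{\partial\theta} \right) + v^2 \left( \frac{\cos\phi}{\pi/2 + \theta} \frac{\partial}{\partial\phi} + \sin\phi \frac{\partial}{\partial\theta} \right).$$
Subscripts have been added for the vectors $\partial/\partial\xi$ and $\partial/\partial\eta$ to indicate which chart they belong to. These tangent vectors may be expressed in terms of the terrestrial chart tangent vectors as follows.
$$\frac{\partial}{\partial\xi_1} = \frac{-\sin\phi}{\pi/2 - \theta} \frac{\partial}{\partial\phi} - \cos\phi \frac{\partial}{\partial\theta}, \qquad \frac{\partial}{\partial\eta_1} = \frac{\cos\phi}{\pi/2 - \theta} \frac{\partial}{\partial\phi} - \sin\phi \frac{\partial}{\partial\theta}$$
$$\frac{\partial}{\partial\xi_2} = \frac{-\sin\phi}{\pi/2 + \theta} \frac{\partial}{\partial\phi} + \cos\phi \frac{\partial}{\partial\theta}, \qquad \frac{\partial}{\partial\eta_2} = \frac{\cos\phi}{\pi/2 + \theta} \frac{\partial}{\partial\phi} + \sin\phi \frac{\partial}{\partial\theta}.$$
These can be solved for $\partial/\partial\phi$ and $\partial/\partial\theta$ as follows.
$$\frac{\partial}{\partial\phi} = (\pi/2 - \theta) \left( -\sin\phi \frac{\partial}{\partial\xi_1} + \cos\phi \frac{\partial}{\partial\eta_1} \right), \qquad \frac{\partial}{\partial\theta} = -\cos\phi \frac{\partial}{\partial\xi_1} - \sin\phi \frac{\partial}{\partial\eta_1}$$
$$\frac{\partial}{\partial\phi} = (\pi/2 + \theta) \left( -\sin\phi \frac{\partial}{\partial\xi_2} + \cos\phi \frac{\partial}{\partial\eta_2} \right), \qquad \frac{\partial}{\partial\theta} = \cos\phi \frac{\partial}{\partial\xi_2} + \sin\phi \frac{\partial}{\partial\eta_2}.$$
These formulas tell you which tangent vectors in the two charts must be identified in the set graft of the two charts. Equivalent formulas in terms of the $V_1$ and $V_2$ coordinates are as follows.
$$\frac{\partial}{\partial\phi} = -\eta_1 \frac{\partial}{\partial\xi_1} + \xi_1 \frac{\partial}{\partial\eta_1}, \qquad \frac{\partial}{\partial\theta} = \frac{-\xi_1}{(\xi_1^2 + \eta_1^2)^{1/2}} \frac{\partial}{\partial\xi_1} + \frac{-\eta_1}{(\xi_1^2 + \eta_1^2)^{1/2}} \frac{\partial}{\partial\eta_1}$$
$$\frac{\partial}{\partial\phi} = -\eta_2 \frac{\partial}{\partial\xi_2} + \xi_2 \frac{\partial}{\partial\eta_2}, \qquad \frac{\partial}{\partial\theta} = \frac{\xi_2}{(\xi_2^2 + \eta_2^2)^{1/2}} \frac{\partial}{\partial\xi_2} + \frac{\eta_2}{(\xi_2^2 + \eta_2^2)^{1/2}} \frac{\partial}{\partial\eta_2}.$$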
[ To construct the tangent bundle topological space $E = T(S)$ for $M = S^2$, use Theorem 15.10.5, Definition 15.10.6 (set union topology), and Definition 15.1.1 (product topology) to construct $T(M)$ from patches $U_i \times \mathbb{R}^2$. ]

The base-point grafting equivalence rule for the atlas $(\psi_1, \psi_2)$ is shown in Equation (42.6.1). The grafting rule for tangent vectors is obtained from this by differentiating Equation (42.6.1).
$$\frac{\partial}{\partial\xi_2} = \left( \frac{\pi \eta_2^2}{|\zeta_2|^3} - 1 \right) \frac{\partial}{\partial\xi_1} - \frac{\pi \xi_2 \eta_2}{|\zeta_2|^3} \frac{\partial}{\partial\eta_1}$$
$$\frac{\partial}{\partial\eta_2} = -\frac{\pi \xi_2 \eta_2}{|\zeta_2|^3} \frac{\partial}{\partial\xi_1} + \left( \frac{\pi \xi_2^2}{|\zeta_2|^3} - 1 \right) \frac{\partial}{\partial\eta_1}.$$
The matrix for the basis transformation is then as follows.
$$\begin{pmatrix} \partial/\partial\xi_2 \\ \partial/\partial\eta_2 \end{pmatrix} = \begin{pmatrix} \pi \eta_2^2 |\zeta_2|^{-3} - 1 & -\pi \xi_2 \eta_2 |\zeta_2|^{-3} \\ -\pi \xi_2 \eta_2 |\zeta_2|^{-3} & \pi \xi_2^2 |\zeta_2|^{-3} - 1 \end{pmatrix} \cdot \begin{pmatrix} \partial/\partial\xi_1 \\ \partial/\partial\eta_1 \end{pmatrix}. \tag{42.7.1}$$
To construct the tangent bundle graft of the two charts, it is now sufficient to identify point-vector pairs which correspond to the same base point and tangent vector.

The determinant of the matrix in Equation (42.7.1) is $1 - \pi (\xi_2^2 + \eta_2^2)^{-1/2}$. This is negative because $|\zeta_2| < \pi$. So the group for the tangent bundle cannot be $SL(2)$. However, replacing chart $\psi_2$ with a mirror image of itself makes the determinant positive. It is interesting to note that if the tangent space of the mirror image of chart $\psi_2$ is given appropriate basis vectors, the transition matrix above becomes a simple rotation in $SO(2)$ with angle $2\phi$.

[ Clearly it is now required to have a general definition in the differentiable structure chapter for the tangent bundle constructed from charts, in particular in the case of embedded manifolds. ]
[ Here there should be a treatment of $T(T(S^2))$, the tangent space of the tangent space of $S^2$. This should be followed by the Riemannian connection on $S^2$. Also of interest would be the space $T^2(M)$ of second-order derivatives on $S^2$. ]
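The transition matrix (42.7.1) and its determinant can be compared with a finite-difference derivative of the grafting map. This is a rough numerical sketch with hypothetical function names; it assumes that the same formula as (42.6.1) maps $\zeta_2$ back to $\zeta_1$, as noted in Section 42.6.

```python
import math

def graft(xi, eta):
    """The map zeta -> (pi - |zeta|) zeta / |zeta| of Equation (42.6.1)."""
    r = math.hypot(xi, eta)
    s = (math.pi - r) / r
    return (s * xi, s * eta)

def transition_matrix(xi2, eta2):
    """Matrix of Equation (42.7.1) at the point zeta_2 = (xi2, eta2)."""
    r3 = math.hypot(xi2, eta2) ** 3
    off = -math.pi * xi2 * eta2 / r3
    return [[math.pi * eta2 ** 2 / r3 - 1.0, off],
            [off, math.pi * xi2 ** 2 / r3 - 1.0]]

def numerical_matrix(xi2, eta2, h=1e-6):
    """Central differences: row i holds d(xi1, eta1)/d(zeta_2^i)."""
    row_xi = [(a - b) / (2 * h)
              for a, b in zip(graft(xi2 + h, eta2), graft(xi2 - h, eta2))]
    row_eta = [(a - b) / (2 * h)
               for a, b in zip(graft(xi2, eta2 + h), graft(xi2, eta2 - h))]
    return [row_xi, row_eta]

def det2(m):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]
```

At any point with $0 < |\zeta_2| < \pi$ the determinant should come out as $1 - \pi/|\zeta_2|$, which is negative, in line with the orientation remark above.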
42.8. Isometries of $S^2$

This section presents rotations of the 2-sphere in $\mathbb{R}^3$ about various axes. These rotations are elements of the classical group $SO(3)$.
[ This subject is related to spinors. See Misner/Thorne/Wheeler [37], chapter 41, pages 1135–1165. ]

Since the Riemannian manifold $S^2$ is symmetric with respect to elements of the orthogonal group $O(3)$, this group can be used for generating new geodesics out of simple geodesics. For instance, the equator of $S^2$ is $E = \{(x, y, z) \in \mathbb{R}^3; x^2 + y^2 = 1 \text{ and } z = 0\}$, which is the image set of a geodesic. Therefore the image $gE$ of $E$ for any group element $g \in O(3)$ is also a geodesic of $S^2$.

42.8.1 Remark: Define three generators of $SO(3)$ to be the antisymmetric matrices
$$A_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}; \quad A_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}; \quad A_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$
Define the corresponding single parameter families of rotation matrices $R_1$, $R_2$ and $R_3$ by
$$R_1(\alpha_1) = \exp(\alpha_1 A_1) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha_1 & -\sin\alpha_1 \\ 0 & \sin\alpha_1 & \cos\alpha_1 \end{pmatrix}$$
$$R_2(\alpha_2) = \exp(\alpha_2 A_2) = \begin{pmatrix} \cos\alpha_2 & 0 & \sin\alpha_2 \\ 0 & 1 & 0 \\ -\sin\alpha_2 & 0 & \cos\alpha_2 \end{pmatrix}$$
$$R_3(\alpha_3) = \exp(\alpha_3 A_3) = \begin{pmatrix} \cos\alpha_3 & -\sin\alpha_3 & 0 \\ \sin\alpha_3 & \cos\alpha_3 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
For $k = 1, 2, 3$, let $R_k(t)$ denote the corresponding linear transformations on $\mathbb{R}^3$, defined by $x \mapsto R_k(t) x$. Then $R_k(t)(S^2) = S^2$ for all $t \in \mathbb{R}$ and $k = 1, 2, 3$.
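The relation $R_k(\alpha) = \exp(\alpha A_k)$ of Remark 42.8.1 can be confirmed numerically by summing the exponential series. The sketch below is only illustrative: it uses plain nested lists and a truncated power series (both choices are assumptions made here for self-containedness, not a recommended way to compute matrix exponentials in general).

```python
import math

# Generators A_1, A_2, A_3 of SO(3) as in Remark 42.8.1.
A = {
    1: [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    2: [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    3: [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
}

def rotation(k, t):
    """Closed-form rotation matrix R_k(t)."""
    c, s = math.cos(t), math.sin(t)
    if k == 1:
        return [[1, 0, 0], [0, c, -s], [0, s, c]]
    if k == 2:
        return [[c, 0, s], [0, 1, 0], [-s, 0, c]]
    if k == 3:
        return [[c, -s, 0], [s, c, 0], [0, 0, 1]]
    raise ValueError("k must be 1, 2 or 3")

def matmul(a, b):
    """Product of two 3 x 3 matrices."""
    return [[sum(a[i][m] * b[m][j] for m in range(3)) for j in range(3)]
            for i in range(3)]

def expm(m, terms=40):
    """Truncated series exp(m) = sum of m^n / n! for n < terms."""
    result = [[float(i == j) for j in range(3)] for i in range(3)]
    term = [row[:] for row in result]
    for n in range(1, terms):
        term = [[x / n for x in row] for row in matmul(term, m)]
        result = [[r + t for r, t in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    return result

def max_abs_diff(a, b):
    """Largest entrywise difference between two 3 x 3 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

def apply(m, x):
    """Matrix-vector product m x."""
    return tuple(sum(m[i][j] * x[j] for j in range(3)) for i in range(3))
```

Applying `rotation(k, t)` to a unit vector leaves its norm unchanged, which is the statement $R_k(t)(S^2) = S^2$ in coordinates.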
42.8.2 Remark: Euler’s angles $\theta$, $\phi$ and $\psi$ are the parameters of transformations of $\mathbb{R}^3$ of the form $e'_i = R_3(\phi) R_2(\theta) R_3(\psi) e_i$, where the $e_i$ are orthonormal basis vectors of $\mathbb{R}^3$. The coordinate transformation has the matrix $R_3(-\psi) R_2(-\theta) R_3(-\phi)$. Therefore the coordinates $(x', y', z')$ with respect to the modified basis vectors are related to the original coordinates $(x, y, z)$ by
$$\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = R_3(-\psi) R_2(-\theta) R_3(-\phi) \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} \cos\psi \cos\theta \cos\phi - \sin\psi \sin\phi & \sin\psi \cos\phi + \cos\psi \cos\theta \sin\phi & -\cos\psi \sin\theta \\ -\sin\psi \cos\theta \cos\phi - \cos\psi \sin\phi & \cos\psi \cos\phi - \sin\psi \cos\theta \sin\phi & \sin\psi \sin\theta \\ \sin\theta \cos\phi & \sin\theta \sin\phi & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}.$$
This transformation is reputed to be useful for analysing spinning tops. The idea is to align the $z'$ coordinate (unit vector $e'_3$) with the axis of the top.

Although Euler’s angles are useful for mechanics, they are not at all useful as coordinates for the differentiable manifold structure of the group $SO(3)$ in a neighbourhood of the identity. For that purpose, the three parameters $\alpha_1$, $\alpha_2$ and $\alpha_3$ are perfectly suited. For instance, the transformation $R_3(\alpha_3) R_1(\alpha_1) R_2(\alpha_2) \in SO(3)$ of unit vectors for $\alpha \in \mathbb{R}^3$ provides useful coordinates in a neighbourhood of the identity of $SO(3)$. This particular coordinatization may be thought of in terms of an aeroplane flying in the direction of the X-axis: $\alpha_2$ is the forward-tilt of the plane (the “pitch”), $\alpha_1$ is the right-roll of the plane (the “roll”), and $\alpha_3$ is the left-drift of the plane (the “yaw”). Each of the 6 orderings of the 3 rotations gives a different but analytic-compatible chart in a neighbourhood of the identity.
42.8.3 Remark: It is tempting to look for better parameters than Euler’s angles. One interesting possibility would be to define two parameters as the location of the fixed point in $S^2$ of a group element, with the third parameter being the amount of rotation around that fixed point. The calculations for this are not completely trivial. For instance, even if the Euler angle $\psi$ is set to zero, it may be shown (matrix diagonalization) that the fixed points of the transformation $R_3(-\psi) R_2(-\theta) R_3(-\phi)$ satisfy
$$x = -\frac{\sin\theta}{1 + \cos\theta} \, z \qquad\text{and}\qquad y = \frac{\sin\phi \sin\theta}{(1 - \cos\phi)(1 + \cos\theta)} \, z.$$
This leads to complicated expressions for the terrestrial coordinates of the fixed point. This does not seem to be a useful way to parametrize rotations, although it does have the advantage of removing the special significance of the axial directions.

[ It would be useful to have some idea of how charts such as $\psi_{123}$ should be restricted so that they are well-defined. Specifically, $\alpha = (\alpha_1, \alpha_2, \alpha_3)$ must be restricted. What are the conditions for two values of $\alpha$ to map to the same rotation? The robotics literature should have some information on this. ]

42.8.4 Remark: For every vector $V \in T_e(G)$ for $G = SO(3)$, a vector field is induced on $S^2$ by the map $p \mapsto (dR_p)_e(V)$ for $p \in S^2$, where $R_p : G \to S^2$ is defined by $R_p : g \mapsto g.p$ for $g \in G$ and $p \in S^2$. This map is chart-independent. Define a chart $\psi_{123} : G \mathrel{\mathring{\to}} \mathbb{R}^3$ so that $\psi_{123} : R_3(\alpha_3) R_2(\alpha_2) R_1(\alpha_1) \mapsto (\alpha_1, \alpha_2, \alpha_3)$ in some neighbourhood of $e \in G$. The rotation matrix is:
$$R_3(\alpha_3) R_2(\alpha_2) R_1(\alpha_1) = \begin{pmatrix} c_2 c_3 & s_1 s_2 c_3 - c_1 s_3 & c_1 s_2 c_3 + s_1 s_3 \\ c_2 s_3 & s_1 s_2 s_3 + c_1 c_3 & c_1 s_2 s_3 - s_1 c_3 \\ -s_2 & s_1 c_2 & c_1 c_2 \end{pmatrix},$$
where $s_k = \sin\alpha_k$ and $c_k = \cos\alpha_k$ for $k = 1, 2, 3$. Let $V = t_{e,v,\psi_{123}} \in T_e(G)$ for $v \in \mathbb{R}^3$. Define the chart $\psi_0 : S^2 \mathrel{\mathring{\to}} \mathbb{R}^2$ for terrestrial coordinates as in Section 42.1. Then to calculate $(dR_p)_e(V)$, it is necessary to differentiate the position of a point in $S^2$ with respect to coordinates of $G$. To be precise,
$$\big( (dR_p)_e(V) \big)^i = \sum_j V^j \, \partial_{x^j} \big( \psi_0^i \circ R_p \circ \psi_{123}^{-1}(x) \big) \Big|_{x = \psi_{123}(e)},$$
where $\psi_0^1 = \phi$ and $\psi_0^2 = \theta$. When $v = (1, 0, 0)$, $(dR_p)_e(V) = -\cos\phi \tan\theta \, e^p_\phi + \sin\phi \, e^p_\theta$, where $e^p_\phi$ and $e^p_\theta$ are the coordinate basis vectors at $p \in S^2$ with respect to $\psi_0$. Similarly for $v = (0, 1, 0)$, $(dR_p)_e(V) = -\sin\phi \tan\theta \, e^p_\phi - \cos\phi \, e^p_\theta$, and for $v = (0, 0, 1)$, $(dR_p)_e(V) = e^p_\phi$. Therefore for general $V \in T_e(G)$,
$$(dR_p)_e(V) = \begin{pmatrix} e^p_\phi & e^p_\theta \end{pmatrix} \begin{pmatrix} -\cos\phi \tan\theta & -\sin\phi \tan\theta & 1 \\ \sin\phi & -\cos\phi & 0 \end{pmatrix} \begin{pmatrix} v^1 \\ v^2 \\ v^3 \end{pmatrix}.$$
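The component formula for the induced vector fields in Remark 42.8.4 can be checked by differentiating the terrestrial coordinates of $R_k(t)p$ at $t = 0$ with central differences. This is a hedged sketch with hypothetical function names; it relies on the observation that, at the identity, the derivative of $R_3(\alpha_3)R_2(\alpha_2)R_1(\alpha_1)$ with respect to $\alpha_k$ is the generator $A_k$, so each one-parameter family $R_k(t)$ can be tested separately.

```python
import math

def point(phi, theta):
    """Point of the unit sphere with terrestrial coordinates (phi, theta)."""
    return (math.cos(theta) * math.cos(phi),
            math.cos(theta) * math.sin(phi),
            math.sin(theta))

def terrestrial(p):
    """Terrestrial coordinates (phi, theta) of a unit vector p."""
    return (math.atan2(p[1], p[0]), math.asin(p[2]))

def rotate(k, t, p):
    """Apply the rotation R_k(t) of Remark 42.8.1 to the vector p."""
    x, y, z = p
    c, s = math.cos(t), math.sin(t)
    if k == 1:
        return (x, c * y - s * z, s * y + c * z)
    if k == 2:
        return (c * x + s * z, y, -s * x + c * z)
    if k == 3:
        return (c * x - s * y, s * x + c * y, z)
    raise ValueError("k must be 1, 2 or 3")

def induced_field(phi, theta, v):
    """(phi, theta) components of (dR_p)_e(V) from the matrix formula."""
    v1, v2, v3 = v
    t = math.tan(theta)
    return (-math.cos(phi) * t * v1 - math.sin(phi) * t * v2 + v3,
            math.sin(phi) * v1 - math.cos(phi) * v2)

def induced_field_numeric(phi, theta, k, h=1e-6):
    """Central-difference derivative of psi_0(R_k(t) p) at t = 0."""
    p = point(phi, theta)
    fwd = terrestrial(rotate(k, h, p))
    bwd = terrestrial(rotate(k, -h, p))
    return tuple((a - b) / (2 * h) for a, b in zip(fwd, bwd))
```

The check is only meaningful away from the poles and from the cut $\phi = \pm\pi$ of `atan2`, where the terrestrial chart itself is badly behaved.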
[ Calculate induced fields on $S^2$ for $\alpha$ coordinates. Also do this with the Euler pseudo-chart? Do a diagram of the induced fields, one sphere showing the induced field for each axis of rotation. ]
[ Give the general formula for rotations around axes through arbitrary points $(\phi_0, \theta_0)$. These can be expressed in terms of the rotations $R_k(t)$. See Notes Q. ]
[ Define exponential map and semigroups in $SO(3)$. ]
42.9. Geodesic curves

A curve is defined as a map $\gamma : I \to X$. A path is defined as an equivalence class of curves.

This section deals with geodesic curves which can be expressed in the form $\theta = g(\phi)$ for some function $g$. The next section deals with affinely parametrized curves.
[ Have a graphic here of the image set of the general geodesic curve, showing the intersection point $\phi = \phi_0$ with the equator, and the angle $\beta = \arctan(k)$. ]

42.9.1 Theorem: If a geodesic curve in $S^2$ is locally expressible as a function $\theta = g(\phi)$, then for some $k, \phi_0 \in \mathbb{R}$, the function satisfies $\theta = \arctan(k \sin(\phi - \phi_0))$. Conversely, any function of this form is the graph of a geodesic curve.

Proof: It follows from the equations for a freely parametrized geodesic curve in 2-dimensional manifolds with affine connection (Theorem 41.3.1) that
$$\theta'' - \Gamma^1_{22} (\theta')^3 + (\Gamma^2_{22} - 2\Gamma^1_{12}) (\theta')^2 + (2\Gamma^2_{12} - \Gamma^1_{11}) \theta' + \Gamma^2_{11} = 0.$$
On substituting the Christoffel symbol $\Gamma^k_{ij} = -\tan\theta \, (\delta_{i1} \delta_{j2} + \delta_{i2} \delta_{j1}) \delta^k_1 + \sin\theta \cos\theta \, \delta_{i1} \delta_{j1} \delta^k_2$, the differential equation for $\theta$ becomes
$$\theta'' + 2 \tan\theta \, (\theta')^2 + \sin\theta \cos\theta = 0.$$
However, $\theta'' + 2 \tan\theta \, (\theta')^2 = \cos^2\theta \, (\tan\theta)''$. So $(\tan\theta)'' + \tan\theta = 0$, which has the general solution $\tan\theta = k \sin(\phi - \phi_0)$ as claimed.

42.9.2 Remark: Another way to calculate the coordinates of great circle lines is to consider a great circle through $(\phi, \theta) = (0, 0)$ with inclination $\alpha$ to the equator to be the intersection of the sphere $S^2$ with the plane $\{(x, y, z); z \cos\alpha = y \sin\alpha\}$. Then the equation in terms of $(\phi, \theta)$ must be $\tan\theta = \sin\phi \tan\alpha$. That is, $\theta = \arctan(\sin\phi \tan\alpha)$. This agrees with the particular case $(\phi_1, \theta_1) = (0, 0)$ in the following theorem.
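The claim of Theorem 42.9.1 can be tested numerically by substituting $\theta = \arctan(k \sin(\phi - \phi_0))$ into the differential equation $\theta'' + 2 \tan\theta \, (\theta')^2 + \sin\theta \cos\theta = 0$. The following sketch uses central differences for the derivatives; the function names and the step size are choices made here for illustration only.

```python
import math

def g(phi, k, phi0):
    """Candidate geodesic graph theta = arctan(k sin(phi - phi0))."""
    return math.atan(k * math.sin(phi - phi0))

def geodesic_residual(phi, k, phi0, h=1e-4):
    """Left-hand side of the geodesic equation for theta = g(phi)."""
    th = g(phi, k, phi0)
    up = g(phi + h, k, phi0)
    dn = g(phi - h, k, phi0)
    d1 = (up - dn) / (2 * h)             # theta'
    d2 = (up - 2 * th + dn) / (h * h)    # theta''
    return d2 + 2 * math.tan(th) * d1 ** 2 + math.sin(th) * math.cos(th)
```

The residual is not exactly zero because of the finite-difference error, but it should be small compared with the individual terms of the equation.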
42.9.3 Theorem: The geodesic curve through the point $(\phi_1, \theta_1)$ with slope $\alpha$ has the equation
$$\tan\theta = \frac{\tan\alpha}{\cos\theta_1} \sin(\phi - \phi_1) + \tan\theta_1 \cos(\phi - \phi_1).$$
[ The point on this geodesic which has distance $r$ from $(\phi_1, \theta_1)$ is. . . ]
[ In the above equation, solve for $\phi$ to get $\phi = \phi_1 + \ldots$. ]
This curve passes through the equator at points $(\phi_0, 0)$ with slope $\beta$, where $\phi_0$ and $\beta$ satisfy
$$\tan(\phi_0 - \phi_1) = -\sin\theta_1 \cot\alpha, \qquad \tan\beta = \pm \frac{\sqrt{\sin^2\theta_1 + \tan^2\alpha}}{\cos\theta_1}.$$
[ It is necessary to say which $\beta$ goes with which $\phi_0$. Must also deal with special cases of the pair $(\theta_1, \alpha)$. ]
Hence the geodesic curve can also be expressed as $\tan\theta = \tan\beta \sin(\phi - \phi_0)$. The slope at any point $(\phi, \theta)$ of the geodesic is $\beta$ such that
$$\tan\beta = \mathrm{sign}(\cos(\phi - \phi_0^+)) \cos\theta \sqrt{\tan^2\alpha \sec^2\theta_1 + \tan^2\theta_1 - \tan^2\theta},$$
where $\phi_0^+$ is a zero of $\theta$ at which the slope of the geodesic is non-negative.

Proof: By Theorem 42.9.1, the equation for $\theta$ describes a geodesic curve, since $\tan\theta$ is a linear combination of $\sin\phi$ and $\cos\phi$. Clearly $\theta = \theta_1$ when $\phi = \phi_1$. To show that the coefficients are correct for the given slope $\alpha$ at $(\phi_1, \theta_1)$, note that
$$\sec^2\theta \, \frac{d\theta}{d\phi} = \frac{\tan\alpha}{\cos\theta_1} \cos(\phi - \phi_1) - \tan\theta_1 \sin(\phi - \phi_1).$$
Hence at $\phi = \phi_1$, the slope of the geodesic is
$$\sec\theta \, \frac{d\theta}{d\phi} \Big|_{\phi = \phi_1} = \tan\alpha,$$
as claimed. The geodesic curve passes through the equator when $\theta = 0$, which occurs for $\phi = \phi_0$, where $\tan(\phi_0 - \phi_1) = -\sin\theta_1 \cot\alpha$. (If $\alpha = 0$ and $\theta_1 \ne 0$, this may be interpreted to mean that $(\phi_0 - \phi_1) \bmod 2\pi = \pi/2$ or $3\pi/2$. If $\alpha = 0$ and $\theta_1 = 0$, the value of $\phi_0$ is indeterminate, and may be taken to have any real value.) When $\theta_1 \ne 0$, there are clearly two points $(\phi_0, 0)$ on the equator through which the geodesic curve passes. The slope $m = \tan\beta = \sec\theta \, d\theta/d\phi$ of the curve at any point $(\phi, \theta)$ on the curve satisfies
$$m = \cos\theta \left( \frac{\tan\alpha}{\cos\theta_1} \cos(\phi - \phi_1) - \tan\theta_1 \sin(\phi - \phi_1) \right).$$
Hence at a point $(\phi_0, 0)$, the slope $m$ satisfies
$$m = \sec\theta \, \frac{d\theta}{d\phi} \Big|_{\phi = \phi_0} = \frac{\tan\alpha}{\cos\theta_1} \cos(\phi_0 - \phi_1) - \tan\theta_1 \sin(\phi_0 - \phi_1) = \mathrm{sign}(\cos(\phi_0 - \phi_1)) \frac{\sqrt{\sin^2\theta_1 + \tan^2\alpha}}{\cos\theta_1}.$$
The slope $m = \tan\beta$ at a general given point on the geodesic satisfies
$$\begin{aligned} \tan\beta &= \frac{1}{\cos\theta} \frac{d\theta}{d\phi} \\ &= \cos\theta \left( \frac{\tan\alpha}{\cos\theta_1} \cos(\phi - \phi_1) - \tan\theta_1 \sin(\phi - \phi_1) \right) \\ &= \mathrm{sign}(\cos(\phi - \phi_0^+)) \cos\theta \sqrt{\tan^2\alpha \sec^2\theta_1 + \tan^2\theta_1 - \tan^2\theta}, \end{aligned}$$
where $\phi_0^+$ is a zero of $\theta$ at which the slope of the geodesic is non-negative.
42.9.4 Theorem: The geodesic through two points $(\phi_1, \theta_1)$ and $(\phi_2, \theta_2)$ which are neither coincident nor antipodal has the equation
$$\tan\theta = \frac{\tan\theta_2 \sin(\phi - \phi_1) + \tan\theta_1 \sin(\phi_2 - \phi)}{\sin(\phi_2 - \phi_1)}. \tag{42.9.1}$$
This geodesic passes through the equator at points $(\phi_0, 0)$ at slope $\beta_0$, with
$$\tan(\phi_0) = \frac{\tan\theta_2 \sin\phi_1 - \tan\theta_1 \sin\phi_2}{\tan\theta_2 \cos\phi_1 - \tan\theta_1 \cos\phi_2}, \qquad \tan\beta_0 = \ldots$$
The slope of the geodesic at $\phi = \phi_1$ is
$$\beta_1 = \arctan\left( \frac{\cos\theta_1 \tan\theta_2}{\sin(\phi_2 - \phi_1)} - \frac{\sin\theta_1}{\tan(\phi_2 - \phi_1)} \right).$$
The slope of the geodesic at $\phi = \phi_2$ is
$$\beta_2 = \arctan\left( \frac{\sin\theta_2}{\tan(\phi_2 - \phi_1)} - \frac{\cos\theta_2 \tan\theta_1}{\sin(\phi_2 - \phi_1)} \right).$$

Proof: Equation (42.9.1) defines a geodesic curve since $\tan\theta$ is expressed as a linear combination of translated sines of $\phi$. The fact that the curve passes through both specified points follows immediately by substitution. The formula for $\phi_0$ follows by setting $\theta = 0$ in the formula for $\tan\theta$ and using the difference rule for the sine function, then solving for $\phi$.
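Equation (42.9.1) can be checked by confirming that the curve passes through both given points and that every point of the curve is coplanar with the two given points and the centre of the sphere, which is what it means to lie on their great circle. The sketch below uses hypothetical function names. It also assumes $\sin(\phi_2 - \phi_1) \ne 0$, which the formula needs in addition to the hypotheses stated in the theorem.

```python
import math

def geodesic_theta(phi, p1, p2):
    """Latitude on the geodesic (42.9.1) through p1 and p2 at longitude phi."""
    (phi1, th1), (phi2, th2) = p1, p2
    den = math.sin(phi2 - phi1)
    if den == 0.0:
        raise ValueError("formula needs sin(phi2 - phi1) != 0")
    t = (math.tan(th2) * math.sin(phi - phi1)
         + math.tan(th1) * math.sin(phi2 - phi)) / den
    return math.atan(t)

def point(phi, theta):
    """Point of the unit sphere with terrestrial coordinates (phi, theta)."""
    return (math.cos(theta) * math.cos(phi),
            math.cos(theta) * math.sin(phi),
            math.sin(theta))

def triple_product(a, b, c):
    """Scalar triple product a . (b x c); zero when a, b, c are coplanar."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            + a[1] * (b[2] * c[0] - b[0] * c[2])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))
```

A vanishing triple product of the two end points and a third point on the curve shows that all three vectors lie in one plane through the origin.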
42.10. Affinely parametrized geodesics
[ Here have theorems on affine parametrized and length-parametrized geodesics. This may be tricky, because when solving for $f(u)$ with $k(u) = -2 \tan\theta \, g'$ and $g(u) = \arctan(c \sin(\phi - \phi_0))$, $f(u)$ comes out like
$$f(u) = \int^u \big( c_1 - \ln|1 + c_2 \sin^2(\phi - \phi_0)| \big)^{-1} \, du.$$
This, hopefully, is an error! See notes A. ]

The equations for affinely parametrized geodesics may be obtained in several ways. One way is to reparametrize the equations for a geodesic curve parametrized by $\phi$. Another way is to solve the equations for a geodesic curve directly. And a third way is to use the symmetries of $S^2$ to generate families of geodesic curves out of a simple geodesic, such as the equator, parametrized by $\phi$. This last approach gives
$$\phi(t) = \phi_0 + \arctan(\cos\alpha \tan(t - t_0))$$
$$\theta(t) = \arctan\left( \frac{\sin\alpha \sin(t - t_0)}{\sqrt{1 - \sin^2\alpha \sin^2(t - t_0)}} \right) = \arctan\left( \frac{\sin\alpha \tan(t - t_0)}{\sqrt{1 + \cos^2\alpha \tan^2(t - t_0)}} \right) = \arcsin(\sin\alpha \sin(t - t_0)),$$
by applying successively $R_z(-t_0)$, $R_x(\alpha)$ and $R_z(\phi_0)$ to the curve given by $\phi(t) = t$ and $\theta(t) = 0$.

[ For affine parametrization, need $\ddot X^i + \Gamma^i_{jk} \dot X^j \dot X^k = 0$. That is, $\ddot\theta + \sin\theta \cos\theta \, \dot\phi^2 = 0$ and $\ddot\phi - 2 \tan\theta \, \dot\phi \dot\theta = 0$. For length-parametrized geodesics, need additionally $g_{ij} \dot X^i \dot X^j = 1$. That is, $\cos^2\theta \, \dot\phi^2 + \dot\theta^2 = 1$. ]

42.10.1 Theorem: Given two non-antipodal points on $S^2$ with spherical coordinates $(\phi_0, \theta_0)$ and $(\phi_1, \theta_1)$, the point $(\phi_\lambda, \theta_\lambda)$ which divides the line joining the two points in the ratio $\lambda$ is given by
$$\phi_\lambda = \ldots \qquad \theta_\lambda = \ldots$$
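The parametrized curve of this section can be tested against the affine geodesic equations $\ddot\theta + \sin\theta \cos\theta \, \dot\phi^2 = 0$ and $\ddot\phi - 2 \tan\theta \, \dot\phi \dot\theta = 0$, together with the unit-speed condition $\cos^2\theta \, \dot\phi^2 + \dot\theta^2 = 1$, which follow from the Christoffel symbols in the proof of Theorem 42.9.1. The sketch below is illustrative only. It evaluates $\phi(t)$ with `atan2`, which agrees with $\arctan(\cos\alpha \tan(t - t_0))$ when $\cos(t - t_0) > 0$, and it estimates derivatives by central differences.

```python
import math

def geodesic(t, alpha, phi0=0.0, t0=0.0):
    """Unit-speed great circle with inclination alpha crossing the equator at phi0."""
    s = t - t0
    phi = phi0 + math.atan2(math.cos(alpha) * math.sin(s), math.cos(s))
    theta = math.asin(math.sin(alpha) * math.sin(s))
    return phi, theta

def residuals(t, alpha, h=1e-4):
    """Residuals of the two geodesic equations and of the unit-speed condition."""
    p0, q0 = geodesic(t, alpha)
    pp, qp = geodesic(t + h, alpha)
    pm, qm = geodesic(t - h, alpha)
    dphi, dth = (pp - pm) / (2 * h), (qp - qm) / (2 * h)
    ddphi = (pp - 2 * p0 + pm) / (h * h)
    ddth = (qp - 2 * q0 + qm) / (h * h)
    eq_theta = ddth + math.sin(q0) * math.cos(q0) * dphi ** 2
    eq_phi = ddphi - 2 * math.tan(q0) * dphi * dth
    speed = math.cos(q0) ** 2 * dphi ** 2 + dth ** 2 - 1.0
    return eq_theta, eq_phi, speed
```

All three residuals should be close to zero for parameter values where the curve stays away from the poles and from the branch cut of `atan2`.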
42.11. Convex sets and functions

[ In this section, it should be stated which sets are convex and which are not. ]

42.12. Normal coordinates

[ Should also deal in this section with the topic of geodesic coordinates. ]

42.12.1 Theorem: The set of points in $S^2$ with distance $r$ from a given point with coordinates $(\phi_0, \theta_0)$ satisfies. . .

42.12.2 Theorem: The set of points on the geodesic whose points are equidistant from a given point $(\phi_0, \theta_0)$ satisfies. . .

42.13. Jacobi fields

[ This section should describe the entire family of Jacobi fields for all geodesics. The magnitude of Jacobi fields should be calculated, and also the first derivatives of the Jacobi field. ]
[ Write out the equation of geodesic variation and the equation of energy minimization etc. ]
[ Also look at the curvature of curves. Curvature equals the rate of change of direction of the curve with respect to distance. Write out the formulas for general curves and circles. ]
42.14. Circles on the sphere

This section deals with the description of circles on $S^2$ in terms of terrestrial coordinates. A circle is defined as the set of points which are equidistant from a given point. Let this distance be $\gamma \in [0, \pi]$ for the unit sphere embedded in $\mathbb{R}^3$. Let $(\phi_0, \theta_0) \in [-\pi, \pi] \times [-\pi/2, \pi/2]$ be terrestrial coordinates for the centre of the circle. The set of points $(\phi, \theta)$ on the circle must satisfy:
$$\cos\gamma = \cos(\theta - \theta_0) + \cos\theta \cos\theta_0 (\cos(\phi - \phi_0) - 1) = \cos\theta \cos\theta_0 \cos(\phi - \phi_0) + \sin\theta \sin\theta_0.$$
This equation may be solved in terms of either $\phi$ or $\theta$ as follows.
$$\phi = \phi_0 \pm \arccos\left( \frac{\cos\gamma - \cos(\theta - \theta_0)}{\cos\theta \cos\theta_0} + 1 \right) = \phi_0 \pm \arccos\left( \frac{\cos\gamma - \sin\theta \sin\theta_0}{\cos\theta \cos\theta_0} \right)$$
$$\theta = \arctan(\cos\theta_0 \cos(\phi - \phi_0), \sin\theta_0) \pm \arccos\left( \cos\gamma \, (\cos^2\theta_0 \cos^2(\phi - \phi_0) + \sin^2\theta_0)^{-1/2} \right). \tag{42.14.1}$$
The formula for $\theta$ follows from Theorem 20.13.21.

One application of circles on a sphere is to horizon lines for satellite pictures. If a camera is placed at a distance $r_0$ from the centre of a sphere of radius $r$, the horizon circle has radius $\gamma$ satisfying $\cos\gamma = r/r_0$. Thus if a satellite’s ground point is $(\phi_0, \theta_0)$, the region which can be imaged by the satellite is bounded by the circle given by the above equations with $\cos\gamma$ replaced by $r/r_0$. The distance $r_0 = |(x_0, y_0, z_0)|$ and the angles $\phi_0 = \arctan(x_0, y_0)$ and $\theta_0 = \arcsin(z_0/r_0)$ can be derived from the Cartesian coordinates $(x_0, y_0, z_0)$ for the camera viewpoint in terms of the 2-parameter length and arctan functions by the following algorithm.
$$\begin{aligned} \phi_0 &= \arctan(x_0, y_0) \\ r_0' &= |(x_0, y_0)| \\ \theta_0 &= \arctan(r_0', z_0) \\ r_0 &= |(r_0', z_0)|. \end{aligned}$$
This is useful for software like MetaPost which offers limited trigonometric functions. This procedure is readily extended to higher-dimensional spherical coordinates. The resulting angle $\phi_0$ here lies in $(-\pi, \pi]$. (Section 20.13 has further details on trigonometric functions.) The coordinates $(r_0, \phi_0, \theta_0)$ can be substituted into Equations (42.14.1) to obtain the horizon circle for the given camera viewpoint. (This is used in the line-hiding algorithm for Figure 42.1.2, for example.)

[ Give here the general formula for the intersection of two circles on a sphere. ]
[ Calculate here with the curvature of the circles and calculate parallel transport along them. ]
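The viewpoint algorithm and the solution (42.14.1) for $\theta$ can be sketched as follows. The function names are hypothetical. The sketch assumes that the two-parameter function $\arctan(x, y)$ of the text denotes the angle of the point $(x, y)$, which corresponds to `math.atan2(y, x)`, and that `math.hypot` plays the role of the two-parameter length function.

```python
import math

def camera_angles(x0, y0, z0):
    """Return (r0, phi0, theta0) for a viewpoint, using only atan2 and hypot."""
    phi0 = math.atan2(y0, x0)        # the text's arctan(x0, y0)
    r0p = math.hypot(x0, y0)         # r0' = |(x0, y0)|
    theta0 = math.atan2(z0, r0p)     # the text's arctan(r0', z0)
    r0 = math.hypot(r0p, z0)         # r0 = |(r0', z0)|
    return r0, phi0, theta0

def circle_theta(phi, phi0, theta0, gamma):
    """Solutions theta of Equation (42.14.1) at longitude phi (possibly none)."""
    a = math.cos(theta0) * math.cos(phi - phi0)
    b = math.sin(theta0)
    amp = math.hypot(a, b)
    if amp == 0.0 or abs(math.cos(gamma)) > amp:
        return ()                    # the circle does not meet this meridian
    delta = math.atan2(b, a)         # the text's arctan(a, b)
    w = math.acos(math.cos(gamma) / amp)
    # A value outside [-pi/2, pi/2] still satisfies the circle equation,
    # but names a point whose terrestrial longitude is phi + pi.
    return (delta + w, delta - w)

def point(phi, theta):
    """Point of the unit sphere with terrestrial coordinates (phi, theta)."""
    return (math.cos(theta) * math.cos(phi),
            math.cos(theta) * math.sin(phi),
            math.sin(theta))
```

For a camera at distance $r_0$ from a unit sphere, the horizon circle would be obtained by calling `circle_theta` with `gamma = math.acos(1.0 / r0)` over a range of longitudes.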
42.15. Calculation of the “hours of daylight”

[ Should do a “length of day” calculation, which gives length of day in terms of latitude and time of year etc.
$$\text{length of day} = \frac{24}{\pi} \arccos(-\tan\delta \tan\theta) \text{ hours} = \frac{\pi/2 - \phi}{\pi/2} \cdot 12 \text{ hours},$$
where $\theta$ is the latitude of the place at which the day is observed, and $\delta$ is the declination angle of the Sun, which reaches a maximum of about $23.5^\circ$. (The mean value was $23^\circ 26' 45''$ on 1 January 1950. See Norton [210], page 6. The angle should also be represented as a decimal, and probably also in radians.) ]

Let the declination of the Sun be $\delta$. Then the path of the Sun satisfies $(x, y, z) = (-\sin\theta_1, 0, \cos\theta_1)$. For $\delta = 0$, have $(-\sin\theta_1, 0, \cos\theta_1) . (\cos\theta \cos\phi, \cos\theta \sin\phi, \sin\theta) = 0$. That is,
$$\begin{aligned} -\sin\theta_1 \cos\theta \cos\phi + \cos\theta_1 \sin\theta &= 0 \\ \cos\phi &= \frac{\cos\theta_1 \sin\theta}{\sin\theta_1 \cos\theta} = \frac{\tan\theta}{\tan\theta_1} \\ \tan\theta &= \cos\phi \tan\theta_1. \end{aligned}$$
For general $\delta$,
$$\begin{aligned} -\sin\theta_1 \cos\theta \cos\phi + \cos\theta_1 \sin\theta &= \sin\delta \\ \cos\phi &= \frac{\cos\theta_1 \sin\theta - \sin\delta}{\sin\theta_1 \cos\theta}. \end{aligned}$$
The length of day calculation should be done after calculating the circle with a given centre and radius. Then take radius equal to $\pi/2 - \delta$. Then solve for $z = 0$. That is, $\theta = 0$. Then get $\cos\phi = -\sin\delta / \sin\theta_1$.

42.16. Some standard map projections

[ Should also have a section on map projections for the sphere. E.g. Mercator’s projection. ]
[ Should find out what the Bessel ellipsoidal coordinates are, and work out the geodesics for this, and other similar things. ]
[ Also in the chapter, do calculations of area of circles, the area above a curve $\theta = f(\phi)$ and so forth. Also deal with the Gauß theorem, Green’s theorem etc. Show that area of a region is related to the change of angle for parallel transport around a boundary. ]

42.17. Projection of a sphere onto a plane

42.17.1 Remark: The projection of points and lines from $\mathbb{R}^3$ onto a plane which is identified with $\mathbb{R}^2$ is not difficult. Points and lines are projected to points and lines. The projection of a 2-sphere onto a plane requires a little more work. The outline of the image under projection of a sphere from $\mathbb{R}^3$ to $\mathbb{R}^2$ may be determined from the tangent lines to the sphere which pass through the viewpoint. Like many Cartesian coordinate calculations, the task is made much easier by guessing a method of attack which avoids unnecessary complexity in the intermediate constructions.

42.17.2 Remark: Define a projection map $P : \mathbb{R}^3 \to \mathbb{R}^2$ by $P : x \mapsto (L(x)_1, L(x)_2)/L(x)_3$, where $L : \mathbb{R}^3 \to \mathbb{R}^3$ is defined by $L : x \mapsto A(x - b)$, where $A \in M_{3,3}(\mathbb{R})$ is an invertible $3 \times 3$ matrix and $b \in \mathbb{R}^3$. The matrix $A$ represents a rotation (or other linear transformation) of the points in $\mathbb{R}^3$, while $b$ is a translation. The point $b \in \mathbb{R}^3$ is the viewpoint where the camera is placed. If $A$ is orthogonal, the row vector $a_3 = (a_{31}, a_{32}, a_{33})$ is the direction in which the camera is pointed and the row vectors $a_1$ and $a_2$ are the directions in which the X and Y axes are oriented.

Assume that the matrix is orthogonal. Then the sphere projection problem reduces to the calculation of the orientation of lines through a point $b \in \mathbb{R}^3$ which are tangent to a sphere with centre $q \in \mathbb{R}^3$ and radius $R \in \mathbb{R}^+$. Such a line has distance $R$ from $q$. Such lines must be calculated for all directions $e_\theta = a_1 \cos\theta + a_2 \sin\theta$ from $q$ for $\theta \in [0, 2\pi]$. The problem is solved if a number $r(\theta) \in \mathbb{R}$ can be determined for each $\theta$ so that the line segment from $b$ to $q + r(\theta) e_\theta$ is tangent to the sphere.

Now focus on a single orientation $e_\theta$. The plane through $b$ and $q$ oriented in the direction $e_\theta$ is a plane in which the problem looks much simpler. In $\mathbb{R}^2$, the value of $t \in \mathbb{R}$ for which the line through $(d, 0)$ and $t(c_1, c_2)$ has distance $R$ from $(0, 0)$ satisfies $t = d / \big( c_1 + c_2 (d^2/R^2 - 1)^{1/2} \big)$ if the denominator is positive. (The denominator is always positive in this application.) To apply the simplified $\mathbb{R}^2$ case to calculate $r(\theta)$, let $d = |q - b|$, $c_1 = e_\theta \cdot (b - q)/|b - q|$ and $c_2 = (|e_\theta|^2 - c_1^2)^{1/2}$. Then $r(\theta) = d / \big( c_1 + c_2 (d^2/R^2 - 1)^{1/2} \big)$. Hence $r(\theta)^{-1} = c_1 d^{-1} + c_2 (R^{-2} - d^{-2})^{1/2}$. Therefore the point $v_\theta = q + r(\theta) e_\theta \in \mathbb{R}^3$ will appear to be on the edge of the sphere for an observer at $b$. But $v_\theta$ will be projected by the map $L$ to $L(v_\theta) = A(q - b) + r(\theta)(\cos\theta, \sin\theta, 0) \in \mathbb{R}^3$ if $A$ is orthogonal. This is then projected to $P(v_\theta) = P(q) + r(\theta)(\cos\theta, \sin\theta)/L(v_\theta)_3 \in \mathbb{R}^2$.

This formula was used for hiding a curve behind the sphere image on the front cover of this book. The outline of a projected sphere may be drawn by joining points $P(v_\theta)$ for a range of $\theta$ values using a cubic spline. The shape is always an ellipse. So probably there’s a much easier way of doing this!
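The construction of the apparent edge points $v_\theta$ can be sketched and checked as follows. The function names are hypothetical, and the sketch assumes that $a_1$ and $a_2$ are orthonormal and that $c_1$ is the component of $e_\theta$ along $(b - q)/|b - q|$. The check confirms that the line through the viewpoint $b$ and the point $v_\theta$ has distance $R$ from the centre $q$, which is the tangency condition.

```python
import math

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def outline_point(theta, a1, a2, b, q, R):
    """Point v_theta = q + r(theta) e_theta on the apparent edge of the sphere."""
    e = tuple(math.cos(theta) * x + math.sin(theta) * y for x, y in zip(a1, a2))
    bq = sub(b, q)
    d = norm(bq)
    c1 = dot(e, bq) / d
    c2 = math.sqrt(max(0.0, 1.0 - c1 * c1))
    inv_r = c1 / d + c2 * math.sqrt(R ** -2 - d ** -2)
    if inv_r <= 0.0:
        raise ValueError("denominator must be positive")
    return tuple(qi + ei / inv_r for qi, ei in zip(q, e))

def distance_to_line(p, a, b):
    """Distance from the point p to the line through a and b."""
    ab = sub(b, a)
    return norm(cross(ab, sub(p, a))) / norm(ab)
```

Joining the projected images of `outline_point` for a range of angles would then give the outline curve described in the text.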
Chapter 43 Examples of manifolds
43.1 Topological space examples
43.2 Euclidean spaces
43.3 Non-Hausdorff locally Euclidean spaces
43.4 Hölder-continuous manifolds
43.5 Torus
43.6 General sphere
43.7 Conical coordinates for Euclidean spaces
43.8 Hyperboloid
43.9 Tractrix
43.10 Analysis on Euclidean spaces
This chapter presents various practical examples of manifolds. These examples are presented separately from the theoretical chapters so that the layers of structure of each example geometry can be presented in a unified fashion in one place rather than scattering the details of, say, the geometry of a sphere among the theoretical chapters. The geometry of $S^2$ is dealt with separately in Chapter 42. Application of the theory to practical examples is a good test of the value of particular choices of definitions. If a definition is so abstract that it cannot be applied to practical geometries, probably the definition should be revised.

43.0.1 Remark: In many fields, particularly in topology and real analysis, it turns out that a very small set of examples serves a very broad set of applications, mostly as counterexamples to conjectures and to show that some theorems cannot be easily improved. (Theorems which cannot be improved, in some specified sense, are called “sharp”.) Differential geometry is similar. A small set of examples suffices for numerous roles, not only as counterexamples, but also to demonstrate how definitions and theorems play out in practice. This is a further justification for collecting popular examples together in their own chapters at the end of the book.

[ Maybe should have a chapter dealing with $S^n$ for $n \ge 3$ also! ]

[ Also should have a section on spherical coordinates for all of $\mathbb{R}^3$ instead of just $S^2$. But that would just be a study of curvilinear coordinates for the flat space $\mathbb{R}^3$ rather than a curved geometry. ]

[ Perhaps should have a chapter on classical differential geometry, just embedded manifolds. See EDM2 [34], App. A.4, page 1730. ]

43.1. Topological space examples

Topological spaces are not all manifolds. But some examples of topological spaces are presented in this chapter. Of special interest are pathological examples.

43.1.1 Remark: Pathological sets and functions are very useful for disproving conjectures, thereby saving a lot of research time trying to prove false conjectures. Research on an open question often proceeds by an alternation between two directions of attack: (A) trying to prove that the conjecture is true, and (B) trying to find a counterexample. Usually trying to find a proof of the conjecture helps to show how to construct a counterexample. Conversely, attempting to construct a counterexample helps to find a proof of the conjecture.

Pathological examples are useful for teaching mathematics. They are useful as “borderline examples” to show what is the most extreme kind of example which is just able to satisfy a particular definition. Extreme examples are also useful for showing why every condition of a definition is necessary. Quite often, a conjecture which is shown to be false by counterexample can be made true by adding an extra condition which excludes the counterexample. Therefore it is useful to learn a large repertoire of pathological sets and functions for use in testing definitions and for disproving and improving conjectures.

43.1.2 Example: [ Present a space-filling curve here. Try to explicitly calculate $t$ so that $f(t) = (\pi, e)$ for a space-filling curve $f : [0, 1] \to [0, 1]^2$ to help make the case that all points are covered, and that the inverse function is practically calculable. ]
43.2. Euclidean spaces

It is not entirely superfluous to study Euclidean spaces $\mathbb{R}^n$ from the perspective of differential geometry. In fact, it is useful because we already know all the answers. So we only have to ensure that the differential geometry approach gives the right answers. This helps to clarify the theory without being distracted by any real geometric interest.

Let $M = \mathbb{R}^n$ for some $n \ge 1$. Let $X = C^\infty(M)$. A one-chart atlas for $M$ uses the identity map $\mathrm{id}_M$ on $M$ as the only chart $\psi$. Tangent operators $\partial_{p,v} : X \to \mathbb{R}$ for $p \in M$ and $v \in \mathbb{R}^n$ are defined so that $\partial_{p,v} : f \mapsto v^i\, \partial f(x)/\partial x^i \big|_{x=p}$. Then the tangent operator space of $M$ is $T(M) = \{\partial_{p,v};\ p \in M,\ v \in \mathbb{R}^n\}$.

It is not easy to define the space $T(T(M))$ in a concrete sense because tangent vectors in $T(M)$ are defined to have a form such as $\partial_{p,v} = v^i\, \partial/\partial x^i \big|_{x=p}$. This tangent vector is an element of the dual of $C^\infty(M)$. Its derivatives with respect to the base point $p$ may be defined in the sense of Schwartz distributions as follows.

$$\partial_{p,v} \otimes w : f \mapsto -v^i w^j \frac{\partial^2 f(x)}{\partial x^i\, \partial x^j}\bigg|_{x=p}.$$

But this is not an element of $T(T(M))$. This functional $\partial_{p,v} \otimes w$ is a map from $C^\infty(M)$ to $\mathbb{R}$, just like $\partial_{p,v}$, whereas an element of $T(T(M))$ should be a map from $C^\infty(T(M))$ to $\mathbb{R}$.

The space $T(M)$ may be parametrized by the pair $(p, v)$ for tangent vectors $\partial_{p,v}$ within a given fixed chart for $M$. This defines a chart $\theta : \mathrm{Dom}(\psi) \times \mathbb{R}^n \to \mathbb{R}^{2n}$ for $T(M)$. In this case, the domain of $\psi = \mathrm{id}_M$ is all of $\mathbb{R}^n$. So $\theta : T(M) \to \mathbb{R}^{2n}$, $\theta : \partial_{p,v} \mapsto (p, v)$. Therefore elements of $T(T(M))$ have the form

$$\partial^{(2)}_{p,v,\alpha,\beta} : g \mapsto \Bigl(\alpha^j \frac{\partial}{\partial q^j} + \beta^k \frac{\partial}{\partial w^k}\Bigr)\bigl(g \circ \theta^{-1}(q, w)\bigr)\bigg|_{(q,w)=(p,v)}.$$

Each vector $\partial^{(2)}_{p,v,\alpha,\beta}$ has a horizontal component $\alpha^j\, \partial/\partial q^j$ and a vertical component $\beta^k\, \partial/\partial w^k$. It is difficult to think how a function $g : T(M) \to \mathbb{R}$ could have much real significance. (This may be contrasted with the definition of a simple tangent vector $\partial_{p,v}$ which acts on a function $f : M \to \mathbb{R}$, which does have a clear significance and application.)

A real problem here is that the representation of a tangent vector as an element $\partial_{p,v} : C^\infty(M) \to \mathbb{R}$ does not really match one’s intuitive idea of a tangent vector. It does happen to have the right transformation laws under changes of coordinates, but when it is time to look at spaces like $T(T(M))$ and the n-frame bundle of a manifold $M$, which is required for defining a connection, the generalized function style of definition is quite difficult to interpret.

One relatively minor inconvenience is the fact that $\partial_{p,0} = \partial_{q,0}$ for all $p, q \in M$. This provided a hint already that the tangent vector definition as a pointwise derivative operator is not right. The difficulty in defining the space $T(T(M))$ is a confirmation that the definition is not suitable for general purposes.

Elements of $T^2(M) = T(M) \otimes T(M)$ must be of the form $(v^i \partial_i) \otimes (w^j \partial_j)$ for $v, w \in \mathbb{R}^n$.

43.2.1 Example: Let $\Omega$ be an open subset of $\mathbb{R}^n$ for some $n \in \mathbb{Z}^+$. Define an atlas on $\Omega$ by $S = \{\psi\}$ with $\psi : \Omega \to \mathbb{R}^n$ defined by $\psi(x) = x$ for all $x \in \Omega$. Then $(\Omega, S)$ is an analytic manifold.
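The tangent operators ∂p,v of this section can be imitated numerically. The sketch below is only an illustration under stated assumptions: the name `tangent_operator`, the central-difference approximation and the sample function f are not from the book. It applies ∂p,v to a function on IR2 and also shows that ∂p,0 and ∂q,0 are the same (zero) functional, whatever the base points p and q are.

```python
def tangent_operator(p, v, h=1e-5):
    """Return the functional f -> v^i * (df/dx^i)(p), approximated by
    central differences with step h (an illustrative choice)."""
    def apply(f):
        total = 0.0
        for i, v_i in enumerate(v):
            if v_i == 0.0:
                continue  # a zero component contributes nothing
            plus = list(p)
            minus = list(p)
            plus[i] += h
            minus[i] -= h
            total += v_i * (f(plus) - f(minus)) / (2.0 * h)
        return total
    return apply

def f(x):
    # Hypothetical test function f(x1, x2) = x1^2 * x2 + 3 * x2.
    return x[0] ** 2 * x[1] + 3.0 * x[1]

# At p = (1, 2): df/dx1 = 2*x1*x2 = 4 and df/dx2 = x1^2 + 3 = 4,
# so with v = (3, -1) the exact value is 3*4 - 1*4 = 8.
value = tangent_operator([1.0, 2.0], [3.0, -1.0])(f)

# The zero vector gives the zero functional at every base point.
zero_at_p = tangent_operator([1.0, 2.0], [0.0, 0.0])(f)
zero_at_q = tangent_operator([-4.0, 7.0], [0.0, 0.0])(f)
print(value, zero_at_p, zero_at_q)
```

The last two values coincide because the functional records only the directional derivative, so the base point cannot be recovered from it when v = 0.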
43.3. Non-Hausdorff locally Euclidean spaces

43.3.1 Remark: This section deals with spaces formed from Euclidean spaces by quotient operations which cause one or more points in the resulting spaces to be non-Hausdorff.

43.3.2 Example: An example of a topological space which is a non-Hausdorff locally Euclidean space may be constructed as follows. Define $X = \mathbb{R} \times \{0, 1\}$ to have the relative topology from $\mathbb{R}^2$. Define an equivalence relation $R$ on $X$ so that $(x_1, y_1)\, R\, (x_2, y_2)$ whenever $x_1 = x_2 \ne 0$ or $(x_1, y_1) = (x_2, y_2)$. Let $Y = X/R$ have the standard quotient topology. (See Definition 15.1.8.) The set $Y$ may be identified with the set $Y' = (\mathbb{R} \times \{0\}) \cup \{(0, 1)\}$. Clearly the two points $(0, 0)$ and $(0, 1)$ cannot be separated. So $Y'$ is not Hausdorff although it is $T_1$.

[ Probably any locally Euclidean space is $T_1$. Should prove this. ]

Define a topological atlas (which just happens to be analytic) on $Y'$ to have two charts $\phi_i : U_i \to \mathbb{R}$ with $U_0 = \mathbb{R} \times \{0\}$, $\phi_0 : (x, 0) \mapsto x$ for $x \in \mathbb{R}$, $U_1 = ((\mathbb{R} \setminus \{0\}) \times \{0\}) \cup \{(0, 1)\}$ and $\phi_1 : (x, y) \mapsto x$ for $(x, y) \in U_1$. Then Definition 26.2.3 (for a locally Euclidean space) is satisfied but $Y'$ is not Hausdorff. This example is illustrated in Figure 43.3.1.

[Figure 43.3.1: Non-Hausdorff locally Euclidean space example 43.3.2. The identification map $f$ takes $X = (\mathbb{R} \times \{0\}) \cup (\mathbb{R} \times \{1\})$ to $Y'$, and the charts $\phi_0$, $\phi_1$ take $Y'$ to $\mathbb{R}$.]
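The failure of the Hausdorff property in Example 43.3.2 can be made concrete in a few lines. This is a rough sketch with assumed names (`in_U0`, `in_U1`, `common_point`), in which points of Y′ are modelled as pairs (x, y). It shows that the chart-preimage neighbourhoods of (0, 0) and (0, 1) always share a point, however small the two radii are, and that the transition map is the identity on the overlap.

```python
def in_U0(pt):
    """U0 = IR x {0}."""
    return pt[1] == 0

def in_U1(pt):
    """U1 = ((IR minus {0}) x {0}) united with {(0, 1)}."""
    return (pt[0] == 0.0 and pt[1] == 1) or (pt[1] == 0 and pt[0] != 0.0)

def phi0(pt):
    if not in_U0(pt):
        raise ValueError("point is outside U0")
    return pt[0]

def phi1(pt):
    if not in_U1(pt):
        raise ValueError("point is outside U1")
    return pt[0]

def common_point(delta0, delta1):
    """A point lying both in the phi0-preimage of (-delta0, delta0), a
    neighbourhood of (0, 0), and in the phi1-preimage of (-delta1, delta1),
    a neighbourhood of (0, 1)."""
    pt = (min(delta0, delta1) / 2.0, 0)
    if not (abs(phi0(pt)) < delta0 and abs(phi1(pt)) < delta1):
        raise AssertionError("construction failed")
    return pt

witness = common_point(1e-3, 1e-6)
print(witness)
```

Since `common_point` succeeds for every pair of positive radii, no choice of basic neighbourhoods separates the two origins.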
43.3.3 Remark: It is shown in Example 43.3.2 that the quotient topological space formed by identifying two real lines at all points except one (the “real line with two origins”) is locally Euclidean but non-Hausdorff. The rest of this section presents the natural generalization of this to the quotient space formed by identifying two copies of a Euclidean space $\mathbb{R}^n$ at all points except those in some subset $S$ of $\mathbb{R}^n$. The set $S$ could be chosen to be dense in $\mathbb{R}^n$, for instance. This kind of quotient topological space construction seems to be the same thing as a topological “graft” as defined in Sections 15.10 and 15.11.
43.4. Hölder-continuous manifolds

As mentioned in Section 27.10, there seem to be no differential geometry textbooks which treat the subject of Hölder-continuity of manifolds and other scales of fractional regularity which interpolate the discrete steps of $C^k$ regularity for integer $k$. For example, Lipschitz manifolds (Definition 27.11.4) are rarely defined or treated. Manifolds with fractional differentiability are not at all pathological. They arise naturally as integrals of systems which have discontinuous force functions.

43.4.1 Example: The set $M = \{x \in \mathbb{R}^{n+1};\ x_{n+1} = |x_1|\}$ is a simple example of a set which is naturally modelled as an n-dimensional $C^{0,1}$ manifold. (See Figure 43.4.1.) The most obvious chart for this set is $\psi_0 : M \to \mathbb{R}^n$ with $\psi_0 : (x_1, \ldots x_{n+1}) \mapsto (x_1, \ldots x_n)$. The atlas $\{\psi_0\}$ containing only this chart makes this set a $C^\infty$ manifold. One might ask why this very smooth state of affairs should be upset by adding further charts. The fly in this ointment is that this would not be an accurate description of the manifold. Problems would arise when the embedding of $M$ in $\mathbb{R}^{n+1}$ is used as a diffeomorphism. Tangent vectors and higher order differential constructions would not map as expected.
[Figure 43.4.1: Projection maps for a Lipschitz manifold. The set $M$ is drawn with horizontal axis $x_1$ and vertical axis $x_{n+1}$, with projection directions for α = 0, α = 0.4 and α = 0.8.]
To expose the non-$C^\infty$ nature of the set $M$, it suffices to project the set onto $\mathbb{R}^n$ in different directions. (See Remark 27.3.3 for general projections for graphs of functions.) For $\alpha \in (-1, 1)$, define the chart $\psi_\alpha : M \to \mathbb{R}^n$ by $\psi_\alpha : (x_1, \ldots x_{n+1}) \mapsto (x_1 - \alpha x_{n+1}, x_2, \ldots x_n)$. This map is clearly $C^\infty$ with respect to the ambient space $\mathbb{R}^{n+1}$, but when $x_{n+1} = |x_1|$ is substituted, this yields $\psi_\alpha : (x_1, \ldots x_{n+1}) \mapsto (x_1 - \alpha|x_1|, x_2, \ldots x_n)$. The transition map $\psi_\alpha \circ \psi_0^{-1} : \mathbb{R}^n \to \mathbb{R}^n$ is defined by $\psi_\alpha \circ \psi_0^{-1} : (x_1, \ldots x_n) \mapsto (x_1 - \alpha|x_1|, x_2, \ldots x_n)$, which is clearly only $C^{0,1}$. (See Figure 43.4.2.)

[Figure 43.4.2: Transition maps for a Lipschitz manifold. Graphs of the first component $\psi_\alpha^1 \circ \psi_0^{-1}(x) = x_1 - \alpha|x_1|$ against $x_1$ for α = 0, α = 0.4 and α = 0.8.]
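The loss of differentiability of the transition map at x1 = 0 can be seen from its one-sided difference quotients. The following is a small sketch, assuming the hypothetical helper names `transition` and `one_sided_slopes`. It evaluates the first component x1 − α|x1| of the transition map and compares the slopes on the two sides of the origin.

```python
def transition(alpha, x1):
    """First component of psi_alpha o psi_0^{-1}, i.e. x1 - alpha*|x1|."""
    return x1 - alpha * abs(x1)

def one_sided_slopes(alpha, h=1e-6):
    """Left and right difference quotients of the transition map at x1 = 0."""
    left = (transition(alpha, 0.0) - transition(alpha, -h)) / h
    right = (transition(alpha, h) - transition(alpha, 0.0)) / h
    return left, right

for alpha in (0.0, 0.4, 0.8):
    left, right = one_sided_slopes(alpha)
    # The slopes are 1 + alpha on the left and 1 - alpha on the right,
    # so they agree only when alpha = 0.
    print(alpha, left, right)
```

Both one-sided slopes are bounded by 1 + |α|, which is consistent with the map being Lipschitz, while the jump of size 2α at the origin rules out C1 regularity for α ≠ 0.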
43.4.2 Remark: The important thing to note here is that for general embedded manifolds, the direction of projection has no natural choice. It would be deceptive to use only one projection of a set because this would give the manifold a structure which depends on the choice of projection chart. To honestly reflect the structure of an embedded manifold, all local projections onto hyperplanes should be included in the atlas so that there will be no perplexing chart-dependent properties when relations between the manifold and the ambient space are examined.

43.4.3 Remark: An interesting question to ask about the $x_{n+1} = |x_1|$ manifold is how the lack of $C^1$ regularity affects the definition of the tangent bundle. In fact, this example manifold has an extra property which is not shared by general $C^{0,1}$ manifolds; namely its transition maps have unidirectional derivatives at all points. This kind of manifold is discussed in Section 28.14.

43.4.4 Example: Figure 43.4.3 shows a function which is $C^{0,1}$ but which has no one-sided derivatives at $x = 0$.
[Figure 43.4.3: Lipschitz function without one-sided derivatives at $x = 0$. Graph of $f(x) = x \sin(\pi \log_2 |x|)$ between the lines $y = x$ and $y = -x$.]
The function $f : \mathbb{R} \to \mathbb{R}$ defined by $f(x) = x \sin(k \ln|x|)$ for $x \ne 0$ and $f(0) = 0$ has derivative $f'(x) = \sin(k \ln|x|) + k \cos(k \ln|x|)$ for $x \ne 0$. So the $C^{0,1}$ norm of $f$ is $\|f\|_{0,1} = (1 + k^2)^{1/2}$. In this case, $k = \pi/\ln 2$.
Define the set M ⊆ IRn+1 by M = {x ∈ IRn+1 ; xn+1 = f (x1 )}. When this is projected at various angles onto the xn+1 = 0 plane as in Example 43.4.1, the resulting charts will have C 0,1 transition maps, but the points on M with x1 = 0 will have no one-sided tangent vectors in most directions.
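The absence of one-sided derivatives at x = 0 can be observed directly from the difference quotients f(h)/h = sin(k ln h). The sketch below is illustrative only, and the choice of the two sample sequences is an assumption made for this example. It evaluates the quotient along h = 2^(−m), where it vanishes, and along h = 2^(−m−1/2), where it has absolute value 1, so the quotient has no limit as h tends to 0 from the right.

```python
import math

K = math.pi / math.log(2.0)  # k = pi / ln 2, as in the text

def f(x):
    """f(x) = x*sin(k*ln|x|) for x != 0, and f(0) = 0."""
    if x == 0.0:
        return 0.0
    return x * math.sin(K * math.log(abs(x)))

def quotient(h):
    """Difference quotient (f(h) - f(0)) / h at the origin."""
    return (f(h) - f(0.0)) / h

# Along h = 2^(-m): sin(-m*pi) = 0, so the quotient is (numerically) zero.
zeros = [quotient(2.0 ** (-m)) for m in range(1, 30)]

# Along h = 2^(-(m + 1/2)): sin(-(m + 1/2)*pi) = +1 or -1.
peaks = [quotient(2.0 ** (-(m + 0.5))) for m in range(1, 30)]

print(max(abs(q) for q in zeros), min(abs(q) for q in peaks))
```

By the symmetry f(−x) = −f(x), the same oscillation occurs for the left-hand quotients.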
43.5. Torus

43.5.1 Definition: The n-torus $T^n$ may be defined to have the base space

$$T^n = \{z \in \mathbb{C}^n;\ \forall i = 1 \ldots n,\ |z_i|^2 = 1\}.$$

43.6. General sphere

[ Should present the Jacobian matrix for $\mathbb{R}^n$ in spherical coordinates. ]

The n-sphere $S^n$ is defined for $n \in \mathbb{Z}^+$ to have the base set

$$S^n = \{x \in \mathbb{R}^{n+1};\ |x| = 1\}.$$

This will be given the differentiable structure of a $C^\infty$ n-manifold.

[ Use the differentiable structure induced by the embedding of $S^n$ in $\mathbb{R}^{n+1}$. This means that the differentiable structures induced by embeddings will have to be defined somewhere. This probably requires taking as an atlas the set of all local coordinate maps which are projections onto tangential planes. If too small a number of such projection maps is used, then a non-$C^2$ manifold could easily appear to be a $C^\infty$ manifold, because it is the overlap of maps which determines the regularity of an atlas. Somewhere there should also be a good treatment of how the “maximal atlas” generated by a given atlas may actually be of variable regularity at different parts of the manifold. There probably should also be a treatment somewhere of manifolds which are actually graphs of functions $f : \mathbb{R}^n \to \mathbb{R}$. ]

To generate a general set of spherical coordinates for all dimensions, it is convenient to look for a pattern in the following formulas for $\mathbb{R}^4$.

$$\begin{aligned}
x_1 &= R \cos\psi \cos\theta \cos\phi \\
x_2 &= R \cos\psi \cos\theta \sin\phi \\
x_3 &= R \cos\psi \sin\theta \\
x_4 &= R \sin\psi.
\end{aligned}$$
This is the same as the formulas for $\mathbb{R}^3$ with the difference that the first three coordinates are multiplied by $\cos\psi$ and the fourth coordinate is $R \sin\psi$. Thus in $\mathbb{R}^n$, the formulas would be

$$x_i = \begin{cases}
R \prod_{j=2}^{n} \cos\theta_j & \text{for } i = 1 \\
R \sin\theta_i \prod_{j=i+1}^{n} \cos\theta_j & \text{for } i > 1,
\end{cases}$$
where $\theta_2, \ldots \theta_n$ are $n - 1$ angle parameters. It is interesting to note that $R$ fits in logically as angle parameter $\theta_1$ in this scheme. In fact, it is traditional to make all of the angles lie in the set $[-\pi/2, \pi/2]$ except for the first angle, which is in this case $\theta_2$. The first angle is generally taken to lie in the set $[0, 2\pi)$ or some such set of width $2\pi$. One might ask why the pattern is broken by this exception. The exception disappears if $R$ is allowed to be negative. For instance, in $\mathbb{R}^2$ this gives $x_1 = R \cos\theta$ and $x_2 = R \sin\theta$ where $R \in \mathbb{R}$ and $\theta \in [-\pi/2, \pi/2]$.

Whether or not $R$ is permitted to be negative, the idea of letting $R$ be the first coordinate $\theta_1$ causes the coordinates $\theta_1, \ldots \theta_n$ to be a right-handed set of coordinates with respect to $x_1, \ldots x_n$, and in fact the Jacobian of the map from the n-tuple $\theta$ to the n-tuple $x$ is the diagonal matrix $\mathrm{diag}(1, R, \ldots R)$ when all angles (but not $R$) are zero, which is the identity matrix when $R = 1$.

A consequence of the above discussion is a preferred order for the coordinates for spherical coordinates for $\mathbb{R}^3$ and higher-dimensional spaces. For $\mathbb{R}^3$, the order should be $(R, \phi, \theta)$, where $\phi$ is the longitude angle and $\theta$ is the latitude angle. Each time the dimension of the space is increased, the angle which brings in the new axis direction is added to the end of the parameter list.
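The general formulas of this section are easy to test numerically. The following sketch is an illustration under assumptions: the function names, the finite-difference step and the sample angles are not from the book. It implements the map from (R, θ2, … θn) to (x1, … xn), compares it with the explicit formulas for IR4, and estimates the Jacobian at zero angles, which comes out as diag(1, R, … R) and hence as the identity matrix for R = 1.

```python
import math

def spherical_to_cartesian(R, angles):
    """Map (R, theta_2, ..., theta_n) to (x_1, ..., x_n).
    `angles` is the list [theta_2, ..., theta_n]."""
    n = len(angles) + 1
    theta = [None, None] + list(angles)  # theta[j] holds theta_j for j >= 2
    x = []
    for i in range(1, n + 1):
        value = R if i == 1 else R * math.sin(theta[i])
        for j in range(max(i + 1, 2), n + 1):
            value *= math.cos(theta[j])
        x.append(value)
    return x

def jacobian(R, angles, h=1e-6):
    """Central-difference estimate of d x_i / d theta_k with theta_1 = R."""
    params = [R] + list(angles)
    n = len(params)
    columns = []
    for k in range(n):
        plus = list(params)
        minus = list(params)
        plus[k] += h
        minus[k] -= h
        xp = spherical_to_cartesian(plus[0], plus[1:])
        xm = spherical_to_cartesian(minus[0], minus[1:])
        columns.append([(a - b) / (2.0 * h) for a, b in zip(xp, xm)])
    return [[columns[k][i] for k in range(n)] for i in range(n)]

# Comparison with the explicit formulas for IR^4 (phi, theta, psi are
# hypothetical sample angles playing the roles of theta_2, theta_3, theta_4).
R, phi, theta, psi = 2.0, 0.3, -0.4, 0.5
x = spherical_to_cartesian(R, [phi, theta, psi])
expected = [
    R * math.cos(psi) * math.cos(theta) * math.cos(phi),
    R * math.cos(psi) * math.cos(theta) * math.sin(phi),
    R * math.cos(psi) * math.sin(theta),
    R * math.sin(psi),
]
J = jacobian(2.0, [0.0, 0.0, 0.0])
print(x, J)
```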
43.7. Conical coordinates for Euclidean spaces

[ These should be the sort of things I used for barrier functions in the old days. They involve hypergeometric functions. This section is really about curvilinear coordinates for flat space. Such stuff should really be in a different section probably. ]

The Laplacian in conical coordinates. For $x \in \mathbb{R}^n$, let $r = |x|$ and $\phi = x_n r^{-1}$. If a function $u$ is conically symmetric, then one can write $u(x) = r^\lambda g(\phi)$. Then

$$u_i = \lambda r^{\lambda-2} x_i\, g(\phi) + r^\lambda g'(\phi)\,(\delta_{in} r^{-1} - r^{-3} x_i x_n)$$

$$\Delta u = r^{\lambda-2} \bigl[(1 - \phi^2)\, g''(\phi) - (n - 1)\, \phi\, g'(\phi) + \lambda(\lambda + n - 2)\, g(\phi)\bigr].$$

Let $u(x) = f(r, \phi)$. Then

$$\Delta u = f_{rr} + \frac{n-1}{r} f_r + \frac{1 - \phi^2}{r^2} f_{\phi\phi} - \frac{n-1}{r^2}\, \phi f_\phi.$$

43.8. Hyperboloid

Hyperboloids are sets of the form

$$H^n_{s,c} = \Bigl\{x \in \mathbb{R}^{n+1};\ \sum_{i=1}^{n+1} s_i x_i^2 = c\Bigr\},$$

where $c \in \mathbb{R}$, and $s \in \{-1, +1\}^{n+1}$.

[ Should give a basic classification of these manifolds, and also put on a differentiable structure. ]

43.9. Tractrix

The tractrix curve is defined in EDM2 [34], 93.H, page 351. In Cartesian coordinates, the tractrix has a parametric formula $f : (0, \pi) \to \mathbb{R}^2$ with

$$f(t) = a\,(\log \tan(t/2) + \cos t,\ \sin t).$$

A surface of constant negative curvature may be constructed as a surface of revolution generated by a tractrix. This surface is called a “pseudo-sphere”. (See Bell [190], pages 303–305, EDM2 [34], 111.I, page 419, 285.E, page 1072.)

[ Calculate the curvatures of a tractrix of revolution. Also determine the form of all geodesics. ]
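The tractrix parametrization f(t) = a(log tan(t/2) + cos t, sin t) of Section 43.9 can be sanity-checked against the classical characterization of the tractrix: the segment of the tangent line between the curve and its asymptote (here the horizontal axis) has constant length a. That characterization is a standard fact which is not stated in the text above, and the function names and sample parameters below are assumptions of this sketch.

```python
import math

def tractrix(t, a=1.0):
    """Point f(t) = a*(log tan(t/2) + cos t, sin t) for 0 < t < pi."""
    return (a * (math.log(math.tan(t / 2.0)) + math.cos(t)), a * math.sin(t))

def tractrix_velocity(t, a=1.0):
    """Derivative f'(t) = a*(1/sin t - sin t, cos t),
    using d/dt log tan(t/2) = 1/sin t."""
    return (a * (1.0 / math.sin(t) - math.sin(t)), a * math.cos(t))

def tangent_length_to_axis(t, a=1.0):
    """Length of the tangent segment from f(t) to the horizontal axis.
    Not defined at t = pi/2, where the velocity vanishes (the cusp)."""
    _, y = tractrix(t, a)
    vx, vy = tractrix_velocity(t, a)
    s = -y / vy  # parameter at which the tangent line meets y = 0
    return abs(s) * math.hypot(vx, vy)

a = 1.5
lengths = [tangent_length_to_axis(t, a) for t in (0.3, 1.0, 2.0, 2.8)]
print(lengths)
```

Each entry of `lengths` should equal a up to rounding, independently of t.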
43.10. Analysis on Euclidean spaces

43.10.1 Example: Figure 43.10.1 shows the “vectors” $df$ (or $\nabla f$) for a simple function on a subset of $\mathbb{R}^2$. The function $f$ is defined by $f(x, y) = (y/(1 - x^2))^2$ for $x \in (-1, 1)$ and $y \in \mathbb{R}$. This has the differential $df(x, y) = (4xy^2 (1 - x^2)^{-3},\ 2y (1 - x^2)^{-2})$. (Any resemblance between this diagram and a Viking opera house is purely coincidental.)

[Figure 43.10.1: Differential of real-valued function on $\mathbb{R}^2$. Level curves $f(x, y) = 0.1, 0.2, \ldots 0.9, 1.0$ of $f(x, y) = (y/(1 - x^2))^2$ with the vectors $(f_x, f_y) = (4xy^2,\ 2y(1 - x^2))/(1 - x^2)^3$.]
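The differential quoted in Example 43.10.1 can be confirmed by comparing it with finite differences. This is a minimal sketch, and the sample point and step size are arbitrary choices made for the illustration.

```python
def f(x, y):
    """f(x, y) = (y / (1 - x^2))^2 for -1 < x < 1."""
    return (y / (1.0 - x * x)) ** 2

def df(x, y):
    """The differential stated in the example."""
    return (4.0 * x * y * y * (1.0 - x * x) ** -3,
            2.0 * y * (1.0 - x * x) ** -2)

def numerical_df(x, y, h=1e-6):
    """Central-difference approximation of (f_x, f_y)."""
    fx = (f(x + h, y) - f(x - h, y)) / (2.0 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2.0 * h)
    return (fx, fy)

exact = df(0.5, 0.8)
approx = numerical_df(0.5, 0.8)
print(exact, approx)
```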
[ Could have a section on Einstein manifolds near here. See Section 39.5 for the definition of an Einstein space. This or some other section should cover such concepts as the de Sitter universe and the Friedman universe, if these concepts are not mere figments of my imagination. Should have a specific model of the universe using current ideas about its age and the Λ kludge factor and other such things. Probably should have a whole chapter on cosmology. ] [ Should have a section on Schwarzschild singularities near here. Give coordinates and solutions for black hole problems. ] [ Should have a section on information geometry near here. See EDM2 [34], 399.D, pages 1488–1489. ]
Chapter 44 Examples of fibre bundles
44.1 Euclidean fibre bundles on Euclidean spaces
44.2 The Möbius strip as a fibre bundle
44.3 The Möbius strip fibre bundle on $S^1$
This chapter presents various practical examples of fibre bundles.
44.1. Euclidean fibre bundles on Euclidean spaces

This section concerns fibre bundles with $B = \mathbb{R}^m$, $F = \mathbb{R}^n$ and $E = \mathbb{R}^{m+n}$ with $m, n \ge 1$, for a wide range of structure groups on $F$ from the trivial group up to the group of topological automorphisms of $F$. The projection map $\pi : E \to B$ is defined as $\pi : (x_1, \ldots x_m, x_{m+1}, \ldots x_{m+n}) \mapsto (x_1, \ldots x_m)$. A full specification tuple for a fibre bundle has the form

$$(E, \pi, B, A_E^F, F, G) \ -\!\!<\ (E, T_E, \pi, B, T_B, A_E^F;\ G, T_G, F, T_F, \sigma_G, \mu).$$

With this section’s assumptions, the specification tuple is as follows.

$$(\mathbb{R}^{m+n}, \pi, \mathbb{R}^m, A, \mathbb{R}^n, G) \ -\!\!<\ (\mathbb{R}^{m+n}, T_{\mathbb{R}^{m+n}}, \pi, \mathbb{R}^m, T_{\mathbb{R}^m}, A;\ G, T_G, \mathbb{R}^n, T_{\mathbb{R}^n}, \sigma_G, \mu),$$

where $A$ is the $\mathbb{R}^n$-fibre atlas for $E = \mathbb{R}^{m+n}$. The topologies on the Euclidean spaces are the standard Euclidean topologies.

Let $G$ be any subgroup of the group of homeomorphisms of the fibre space $F = \mathbb{R}^n$. Define the topology $T_G$ on $G$ as the weak topology induced by $\mu$, namely

$$T_G = \bigl\{\{g \in G;\ \mu(g, x) \in \Omega\};\ \Omega \in T_{\mathbb{R}^n} \text{ and } x \in \mathbb{R}^n\bigr\}.$$

The operation $\sigma_G$ is completely determined by the action of elements $g \in G$ on elements of $F = \mathbb{R}^n$. This operation is simply function composition. Hence the topological transformation group $(G, F) \ -\!\!<\ (G, T_G, F, T_F, \sigma_G, \mu)$ is well defined.

When the group $G$ is chosen, the only freedom remaining in the construction of these fibre bundles is the choice of the atlas $A$. This atlas consists of continuous maps $\phi : \pi^{-1}(U) \to F$ for open sets $U \in T_B = T_{\mathbb{R}^m}$ such that $\pi \times \phi : \pi^{-1}(U) \approx U \times F$. These charts are constrained by the requirement that their transition maps must be elements of $G$ acting on $F$. The smaller the group $G$ is, the stronger the constraint on the charts in the atlas.

First let the structure group be the trivial group $G = \{I\}$, where $I : F \to F$ is the identity map $I : x \mapsto x$. Now define a single chart for $E = \mathbb{R}^{m+n}$ by $\phi_0 : E \to F$ with $\phi_0 : (x_1, \ldots x_m, x_{m+1}, \ldots x_{m+n}) \mapsto (x_{m+1}, \ldots x_{m+n})$. Then $A = \{\phi_0\}$ is clearly a fibre bundle atlas for the fibre bundle.

For this group $G = \{I\}$ and atlas $A = \{\phi_0\}$, it is interesting to determine how much freedom remains for defining compatible fibre charts for this fibre bundle. Let $\phi_1 : \pi^{-1}(U) \to F$ be a fibre chart which is compatible with $A$ for some $U \in T_B$. Then for all $b \in U$, $\phi_{b,1} \circ \phi_{b,0}^{-1} : F \approx F$ and $\phi_{b,0} \circ \phi_{b,1}^{-1} : F \approx F$ must both be elements of $G$, where $\phi_{b,i} = \phi_i \big|_{\pi^{-1}(\{b\})}$ for $i = 0, 1$. This clearly implies that $\phi_{b,1} = \phi_{b,0}$ for all $b \in U$, from which it follows that $\phi_1 = \phi_0 \big|_{\pi^{-1}(U)}$. In other words, all compatible charts for the particular choice of atlas $A$ must agree everywhere on their domain with the function $\phi_0$.
Given $G = \{I\}$ and any $(G, F)$ fibre chart at all for the fibre bundle $(E, \pi, B)$, all other fibre charts agree everywhere with the given fibre chart. This implies that there are infinitely many different, mutually incompatible $(G, F)$ atlases for this fibre bundle. There is a very wide range of choice of fibre chart $\phi_0 : E \to F$. For example, $\phi'_0$ may be defined by $\phi'_0 : (x_1, \ldots x_m, x_{m+1}, \ldots x_{m+n}) \mapsto (x_{m+1} + y^1, \ldots x_{m+n} + y^n)$, where $y \in F$ is an arbitrary element of $\mathbb{R}^n$. The set $A' = \{\phi'_0\}$ still meets all of the requirements for a $(G, F)$ fibre bundle atlas because $\phi'_0$ is still continuous and satisfies $\pi \times \phi'_0 : \pi^{-1}(U) \approx B \times F$ with $U = B$, and $A'$ has no problems with transition maps because there are none. In fact, given any homeomorphism $\theta : F \approx F$, the set $A' = \{\theta \circ \phi_0\}$ is a valid $(G, F)$ atlas for $(E, \pi, B)$ if $\phi_0 : E \to F$ is any valid fibre chart.

Now let $G = \{I, J\}$, where $I$ is as above and $J : F \approx F$ is defined by $J : x \mapsto -x$. Then clearly $G$ is a group of homeomorphisms of $F$. Let the atlas be $A = \{\phi_0\}$ again, where $\phi_0$ is as above. Then a chart $\phi_1 : \pi^{-1}(U) \to F$ for $U \in T_B$ is a compatible chart if and only if $\phi_{b,1} = I \circ \phi_{b,0}$ or $\phi_{b,1} = J \circ \phi_{b,0}$ for all $b \in U$, where the choice of map $I$ or $J$ is constant with respect to $b$ on connected subsets of $U$. If $U = B$, clearly either $\phi_1 = \phi_0$ or $\phi_1 = J \circ \phi_0$. In the case of the two-element group $G = \{I, J\}$, the choice of fibre charts which are compatible with a given single fibre chart is very limited.

Consider now the group $G = \{\theta : F \approx F;\ \theta \text{ is linear}\} = GL(n)$. In this case, it is not necessary that all charts $\phi : \pi^{-1}(U) \to F$ for $U \in T_B$ should be linear, but if any one of these fibre charts is linear for a given point $b \in U$, then all other fibre charts must be linear for this value of $b$. The fibre chart $\phi_0$ does not necessarily map from origin to origin.

The largest possible group $G$ is the set of all homeomorphisms of the topological space $F = \mathbb{R}^n$. Then given any fibre chart $\phi_0 : U_0 \to F$ for the fibre bundle, any other fibre chart $\phi_1 : U_1 \to F$ will have the form $e \mapsto \mu(\theta(\pi(e)), \phi_0(e))$ for $e \in \pi^{-1}(U_0 \cap U_1)$ for some continuous map $\theta : U_0 \cap U_1 \to G$. In this case, all fibre charts are compatible. Therefore it is not necessary to specify an atlas.

One of the important points that comes clearly out of this example is the fact that the total space $E$ is only a topological space and has no other structure than that imposed by the fibre charts. The structure on the structure group imposes structure on the total space, but it does not have to respect any structure on $E$ apart from its topological structure. Thus even if the group is a linear group such as $GL(n)$, the point $(\pi \times \phi)^{-1}(b, 0)$ in each fibre $\pi^{-1}(\{b\})$ which is mapped by $\phi$ to $0 \in F$ is not necessarily any sort of zero point such as $(b_1, \ldots b_m, 0 \ldots 0) \in \pi^{-1}(\{b\})$. This “origin point” may be any point in $\pi^{-1}(\{b\})$, but in the case $G = GL(n)$, this point must be the same for all fibre charts in a given atlas, and this “origin point” must vary continuously with respect to $b \in B$.

This shows the significance of specifying an atlas for a fibre bundle. As soon as one fibre chart is given in a neighbourhood of a point, all other fibre charts are determined up to a continuously varying group operation in that neighbourhood. There is no unique identification of the fibre space $F$ with the fibre $\pi^{-1}(\{b\})$ at each point $b \in B$, except for the trivial structure group, but in the case of an effective structure group, the set of all possible maps from the fibre at a point to the fibre space is homeomorphic to the structure group itself. For instance, if the group has two elements, then there are two possible chart maps at $b$, and so forth.
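The compatibility condition for the two-element structure group G = {I, J} can be illustrated with a toy computation. The following is a rough sketch under assumptions: the dimensions m = 1 and n = 2, the helper names and the sample vectors are all hypothetical, and checking on a few sample vectors only suggests, and does not prove, which group element a transition map equals. It computes the transition map of a candidate chart relative to φ0 on one fibre and looks for a matching element of G.

```python
M, N = 1, 2  # hypothetical dimensions: B = IR^1, F = IR^2, E = IR^3

def phi0(e):
    """The chart phi_0: the last N coordinates of a point of E."""
    return tuple(e[M:])

def I(v):
    return tuple(v)

def J(v):
    return tuple(-c for c in v)

GROUP = {"I": I, "J": J}

def chart_from(g):
    """The global chart g o phi_0 for a map g : F -> F."""
    return lambda e: g(phi0(e))

def transition(chart, b, v):
    """chart_b o (phi_0)_b^{-1} applied to v, where the inverse of phi_0 on
    the fibre over b sends v to the point (b, v) of E."""
    return chart(tuple(b) + tuple(v))

def matching_group_element(chart, b, samples):
    """Name of the element of G agreeing with the transition map on the
    sample vectors, or None if neither I nor J matches."""
    for name, g in GROUP.items():
        if all(transition(chart, b, v) == g(v) for v in samples):
            return name
    return None

samples = [(1.0, 0.0), (0.0, 1.0), (2.0, -3.0)]
b = (0.5,)

flipped = chart_from(J)                                   # J o phi_0
shifted = chart_from(lambda v: (v[0] + 1.0, v[1] + 2.0))  # translated chart

print(matching_group_element(chart_from(I), b, samples),
      matching_group_element(flipped, b, samples),
      matching_group_element(shifted, b, samples))
```

The translated chart is a perfectly good single-chart atlas on its own, but its transition map relative to φ0 is a translation, which lies outside G, so the atlases {φ0} and {φ′0} are mutually incompatible.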
44.2. The M¨ obius strip as a fibre bundle The M¨obius strip is perhaps the simplest non-trivial example of a fibre bundle. (The date 1865 is given by do Carmo [16], page 36, for the first presentation of the M¨obius strip by August Ferdinand M¨obius.) Let B = E = S 1 = {x ∈ IR2 ; |x| = 1} with the standard topology induced by IR2 , and define π : E → B by π(eθ ) = e2θ , where eθ ∈ S 1 is defined by eθ = (cos θ, sin θ). It follows that π −1 ({eθ }) = {eθ/2 , eπ+θ/2 } for all θ ∈ [0, 2π), where S 1 is identified with the set [0, 2π) with the usual equivalent topology. (See Figure 44.2.1.) Clearly each set π −1 ({eθ }) consists of two points with the discrete topology. (With the trivial topology, the set F would need the trivial topology, and the product topology of B ×F would not be locally homeomorphic to the required open sets of E. [ Check to make certain that the trivial topology on F is impossible. If F has the trivial topology, then G must be the trivial group. ]) Any fibre space for (E, π, B) would have to be equivalent to F = S 0 = {1, −1} ⊆ IR with its standard topology. The domains of fibre bundle charts are required to be of the form π −1 (U ) for some open subset of B. For example, consider open sets of the form Ut,r = {θ ∈ [0, 2π); |θ − t| < r} for t ∈ [0, 2π) and r ∈ [0, π], using the obvious folding of θ into [0, 2π). Then π −1 (Ut,r ) = Ut/2,r/2 ∪ Uπ+t/2,r/2 . When r ≤ π, this set has two components, and the value of φ(e) must be constant for e in each component, because F = S 0 [ www.topology.org/tex/conc/dg.html ]
[ draft: UTC 2009–3–21 Saturday 11:36 ]
Differential geometry reconstructed.Copyright (C) 2009, Alan U. Kennington. All Rights Reserved.The author hereby grants permission to print this book draft in A4 format.Printing in all other formats is forbidden.You may not charge any fee for copies of this book draft.
776
44.2. The M¨obius strip as a fibre bundle
E = S1
0
π/4
777
π/2 3π/4
π
5π/4 3π/2 7π/4
2π
π/2
π
3π/2
2π
π −1 : B → E
B = S1 Figure 44.2.1
0
Fibre bundle map for the M¨obius strip
has the discrete topology and φ must be continuous. (Of course, if F has the trivial topology, all functions φ : E → F are continuous.) Since all homeomorphisms are bijections, F must have two elements and the value of φ must be different for the two components of π −1 (Ut,r ). This also makes U = B impossible. And this means that there must be at least 2 charts in the fibre atlas for (E, π, B). [ In fact, it looks like 3 charts will be required, and two of these will flip the values of φ in their intersection. Must check this! ] . . . [ The following sets U1 etc. might contain and error. Check this. Also check φ1 and φ2 . ] Figure 44.2.2 shows a fibre bundle map for the M¨obius strip with U1 = (2π − ε, 2π − 2ε) ⊆ B and π −1 (U1 ) = (2π − ε/2, π − ε) ∪ (π − ε/2, 2π − ε) ⊆ E for some small ε > 0. (Note that intervals are all denoted left-to-right modulo 2π.) −1
[ Figure 44.2.2. Fibre bundle chart for the Möbius strip. Labels: $F = \{-1, 1\}$; $\pi^{-1}(U_1) = (2\pi - \varepsilon/2, \pi - \varepsilon) \cup (\pi - \varepsilon/2, 2\pi - \varepsilon) \subseteq E = S^1$; $\pi : E \to B$; $U_1 = (2\pi - \varepsilon, 2\pi - 2\varepsilon) \subseteq B = S^1$. ]
By following the arrows, it is clear that for each $b \in B$, the two elements of $\pi^{-1}(\{b\})$ map to the two different elements of $F = \{-1, 1\}$. A similar fibre chart can be defined for an open subset of $B$ such as $U_2 = (\pi - \varepsilon, \pi - 2\varepsilon)$. Then $\{U_1, U_2\}$ forms an open covering of $B$, and on the overlap components of the sets $\pi^{-1}(U_i)$, the map to $F$ may differ, but the transition function is an element of the structure group.

The fibre chart represented in Figure 44.2.2 is $\phi_1 : \pi^{-1}(U_1) \to F$ defined as follows.
$$\phi_1(\theta) = \begin{cases} -1 & \text{for } \theta \in [0, \pi - \varepsilon) \cup (2\pi - \varepsilon/2, 2\pi) \\ 1 & \text{for } \theta \in (\pi - \varepsilon/2, 2\pi - \varepsilon). \end{cases}$$
If $\phi_2 : \pi^{-1}(U_2) \to F$ is defined by
$$\phi_2(\theta) = \begin{cases} -1 & \text{for } \theta \in [0, \pi/2 - \varepsilon) \cup (3\pi/2 - \varepsilon/2, 2\pi) \\ 1 & \text{for } \theta \in (\pi/2 - \varepsilon/2, 3\pi/2 - \varepsilon), \end{cases}$$
then $\phi_1$ and $\phi_2$ agree on the sets $[0, \pi/2 - \varepsilon) \cup (2\pi - \varepsilon/2, 2\pi)$ and $(\pi - \varepsilon/2, 3\pi/2 - \varepsilon)$, but they disagree on $(\pi/2 - \varepsilon/2, \pi - \varepsilon)$ and $(3\pi/2 - \varepsilon/2, 2\pi - \varepsilon)$. Hence the group element $g_{\phi_1\phi_2}(b) \in G$ is constant on each of these intervals, and therefore is a continuous function of $b \in B$. (See Figure 44.2.3.)
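The agreement and disagreement sets claimed above can be checked numerically. The following is only a minimal sketch. It assumes the formulas for φ1 and φ2 exactly as stated (the draft itself flags them as unchecked), a hypothetical value ε = 0.1, and the structure group G = {1, −1} acting on F by multiplication, so that the transition value at a point θ of E is the product φ1(θ)φ2(θ). The function names are illustrative and do not come from the text.

```python
import math

EPS = 0.1  # hypothetical small epsilon > 0; the text only requires "small"

def phi1(theta):
    """Chart phi_1 as stated in the text; None where it is not defined."""
    t = theta % (2 * math.pi)
    if t < math.pi - EPS or t > 2 * math.pi - EPS / 2:
        return -1
    if math.pi - EPS / 2 < t < 2 * math.pi - EPS:
        return 1
    return None

def phi2(theta):
    """Chart phi_2 as stated in the text; None where it is not defined."""
    t = theta % (2 * math.pi)
    if t < math.pi / 2 - EPS or t > 3 * math.pi / 2 - EPS / 2:
        return -1
    if math.pi / 2 - EPS / 2 < t < 3 * math.pi / 2 - EPS:
        return 1
    return None

def transition(theta):
    """Element of G = {1, -1} relating phi_1 and phi_2 at theta (assumed action: multiplication)."""
    a, b = phi1(theta), phi2(theta)
    if a is None or b is None:
        return None
    return a * b

# The two points theta and theta + pi lie over the same base point b = 2*theta.
for theta in (0.5, 2.0):
    print(theta, transition(theta), transition(theta + math.pi))
```

In this sketch the two points of each fibre give the same transition value, which is what is needed for the transition function to be well defined as a function of the base point b.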
[ Figure 44.2.3. Fibre bundle charts for the Möbius strip. Labels: $\phi_1 : \pi^{-1}(U_1) \to F$ and $\phi_2 : \pi^{-1}(U_2) \to F$, with $\pi^{-1}(U_1), \pi^{-1}(U_2) \subseteq E = S^1$, a base point $b$, and $F = \{-1, 1\}$. ]
44.2.1 Remark: Even though the sets $B$ and $E$ are both equal to $S^1$ in this example, they are regarded as being two “copies” of the same set $S^1$ rather than the same set. This can be made self-consistent by introducing the concept of a “labelled set”, which is really an ordered pair $(L, S)$ where $L$ is the label and $S$ is the set. Then two sets with different labels are actually different. This would make life too difficult. So it is best to deal with this aspect of set naming in an informal way. However, it is important to know that this situation can be made self-consistent if required. One good way of dealing with this would be to say that $B$ and $E$ are labels in some label space, and the values of both of these labels are the same set $S^1$.
44.2.2 Example: [A torus: $E = S^1 \times S^1$, $B = S^1$, $F = S^1$. Turn this into a Klein bottle by turning the tube inside out at the join?]

44.2.3 Example: [Some sort of multi-Möbius strip: $B = S^1$, $F = \mathbb{Z}_n = \{e^{2\pi ki/n};\ k = 0, \ldots, n-1\}$.]

44.2.4 Example: [$B = S^2$, $F = \mathbb{Z}_n = \{e^{2\pi ki/n};\ k = 0, \ldots, n-1\}$. Some sort of 2-d multi-Möbius strip?]

44.2.5 Example: [A sphere: $B = S^2$, $F = S^1$.]

44.2.6 Remark: The worst joke in mathematics is probably: “Why did the chicken cross the Möbius strip? Answer: To get to the same side.”
[ The rest of these examples need to go in their own sections. ]

44.3. The Möbius strip fibre bundle on $S^1$

[ This is the original section on the Möbius strip written in the early 1990s. ]

A simple non-trivial example of a fibre bundle is the Möbius strip fibre bundle on the circle $S^1$ as the base space. Define $E = [0, 1) \times (0, 1) \subseteq \mathbb{R}^2$, with the topology $T = \tau(\mathrm{Top}(\mathbb{R} \times (0, 1)))$, where $\mathbb{R} \times (0, 1)$ has the topology induced by $\mathbb{R}^2$, and $\tau : \mathbb{R} \times (0, 1) \to E$ is defined by
$$\tau(x, y) = \begin{cases} (x \bmod 1,\ y), & x \bmod 2 \in [0, 1) \\ (x \bmod 1,\ 1 - y), & x \bmod 2 \in [1, 2). \end{cases}$$
Then $T$ is a topology on $E$. This can be seen by noting that for any set $G \in \mathrm{Top}(\mathbb{R} \times (0, 1))$,
$$\tau^{-1} \circ \tau(G) = \bigcup_{i \in \mathbb{Z}} (G + 2i) \ \cup\ \bigcup_{j \in \mathbb{Z}} (G' + 2j + 1) \in \mathrm{Top}(\mathbb{R} \times (0, 1)),$$
where $G' = \{(x, y) \in \mathbb{R}^2;\ (x, 1 - y) \in G\}$. Then it follows by Theorem 15.11.6 that $T$ is indeed a topology on $E$. So denote it by $\mathrm{Top}(E) = T$. Similarly define the topology on $B = [0, 1)$ to be $\mathrm{Top}(B) = \sigma(\mathrm{Top}(\mathbb{R}))$, where $\mathbb{R}$ has the usual topology and $\sigma : \mathbb{R} \to B$ is defined by $\sigma(x) = [x]$. Define $p : E \to B$ by $p(x, y) = x$. Then $p$ is continuous. Let $F = (0, 1)$ have the topology induced by $\mathbb{R}$, and let $G = \{1, g\}$ be the group acting on $F$ with $g : y \mapsto 1 - y$. Define the atlas $S = \{\psi_1, \psi_2\}$ by
$$\begin{aligned}
p(\mathrm{Dom}(\psi_1)) &= (0, 1) \\
p(\mathrm{Dom}(\psi_2)) &= [0, 1) \setminus \{\tfrac12\} \\
\psi_1(x, y) &= y && \forall (x, y) \in (0, 1) \times (0, 1) \\
\psi_2(x, y) &= y && \forall (x, y) \in [0, \tfrac12) \times (0, 1) \\
\psi_2(x, y) &= 1 - y && \forall (x, y) \in (\tfrac12, 1) \times (0, 1).
\end{aligned}$$
Then $p(\mathrm{Dom}(\psi_1)), p(\mathrm{Dom}(\psi_2)) \in \mathrm{Top}(B)$ and $\bigcup_{\psi \in S} p(\mathrm{Dom}(\psi)) = B$. Also note that $p \times \psi : \mathrm{Dom}(\psi) \approx p(\mathrm{Dom}(\psi)) \times F$ for $\psi \in S$, and
$$g_{\psi_i \psi_j}(b) = \psi_{i,b} \circ \psi_{j,b}^{-1} = \begin{cases} 1, & b \in (0, \tfrac12) \text{ or } i = j \\ g, & b \in (\tfrac12, 1) \text{ and } i \ne j, \end{cases}$$
for $i, j \in \{1, 2\}$, and for any topology on $G$, $g_{ij} : U_i \cap U_j \to G$ is continuous, since each $g_{ij}$ is constant on the components of its domain. Thus all the conditions for a fibre bundle atlas are satisfied.

[ Try to create a PostScript graphic for this fibre bundle. ]
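The quotient map τ and the two charts can be written out directly, which makes the flip across x = 1/2 easy to inspect. This is a hedged sketch rather than part of the construction: the function names are illustrative, and dyadic sample values are used only so that the floating-point arithmetic is exact.

```python
def tau(x, y):
    """Quotient map tau : R x (0,1) -> E = [0,1) x (0,1) as defined in the text."""
    if (x % 2) < 1:
        return (x % 1, y)
    return (x % 1, 1 - y)

def psi1(x, y):
    """Chart psi_1, defined for base coordinate x in (0, 1)."""
    assert 0 < x < 1
    return y

def psi2(x, y):
    """Chart psi_2, defined for x in [0, 1) with x != 1/2."""
    assert 0 <= x < 1 and x != 0.5
    return y if x < 0.5 else 1 - y

def transition_12(b, y):
    """Action of psi_{1,b} o psi_{2,b}^{-1} on a fibre value y.

    On each fibre psi_{2,b} is either the identity or y -> 1 - y,
    so it is its own inverse and can be reused for the inverse.
    """
    return psi1(b, psi2(b, y))

print(tau(0.25, 0.25), tau(1.25, 0.25), tau(2.25, 0.25))
print(transition_12(0.25, 0.25), transition_12(0.75, 0.25))
```

Shifting x by 1 flips the fibre coordinate and shifting by 2 restores it, while the transition between the two charts is the identity for b < 1/2 and the flip y ↦ 1 − y for b > 1/2.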
Chapter 45 Derivations, gradient operators, germs and jets
45.1 Definitions
45.2 Some elementary examples
45.3 Further elementary examples
45.4 Spaces of differentiable functions
45.5 Spaces of smooth functions
45.6 The space of analytic functions
45.7 The Hölder spaces
45.8 Further topics on derivations
45.9 Germs
45.10 Jets
The topic of the dimensionality of derivation-style tangent spaces, though apparently peripheral, takes up most of this chapter, just to make sure that the rough edges are ironed out properly. This chapter may be completely ignored. In fact, it probably should be completely ignored.

A proof of the dimensionality of the space of derivations is given by Warner [49], Theorem 1.17, page 13. See also Crampin/Pirani [11], pages 247–248. The difficulty of just proving the dimensionality of the derivation-style tangent space is sufficient reason to reject this approach. The fact that this approach can only be applied to $C^\infty$ manifolds is the coup de grâce.

In an attempt to get coordinate-free in relation to differential geometry, the modern Leibniz-rule definition of a tangent vector has been widely adopted. This corresponds to the concept of a “derivation”. (See [19], p.20.) It is not really possible to deal with manifolds in a totally coordinate-free manner, since manifolds are defined to be topological spaces which can be coordinatized. Attempts to work without coordinates only succeed in hiding the coordinates. But all definitions lead back to coordinates eventually. The “coordinate-free” notations are nevertheless often tidier-looking than their coordinate-oriented equivalents. So it could be useful to establish the correspondence between the two sets of definitions carefully.

This chapter commences with some basic definitions and examples. The examples are intentionally elementary, in order to demonstrate the surprising complexity of the product-rule definition of tangent space. Following the simple examples, realistic function spaces like the smooth functions are dealt with.

[ The “synthetic differential geometry” approach gives a tangent space dimension theorem of more or less the same sort as given in this chapter. See [49], Theorem 1.17. ]

[ See Klingenberg [25], lemma 1.4.6, p.33 for a proof that all derivations are gradient operators. ]
This chapter might be deleted in the first release of the book. The author originally thought that derivations were the best way to define tangent spaces. After long contemplation on this subject, he concluded that derivations were the worst way to define tangent spaces. This chapter has been placed near the end in the hope that everybody will ignore it. The valiant attempt to convert tangent bundle analysis into algebra leads to a mass of convoluted theory which is unnatural and difficult to use. It is better for algebra fans to just grit their teeth and put up with all the limits and derivatives, or just stick to algebra.
45.1. Definitions

[ Gallot/Hulin/Lafontaine [19] has definitions for this section. Defn. 1.45 derivations, defn. 1.50 derivation, theorems 1.49, 1.51 Lie derivatives for derivations. ]

All of the difficulties arise even in the case of the manifold $\mathbb{R}^n$. So it is not necessary to deal with general differentiable manifolds here. In order for the Leibniz rule to make much sense, the function space in question should be closed under the pointwise product of functions.

[ The following definition should explicitly define derivations, not just the tangent space. ]

45.1.1 Definition: Let $F \subseteq \mathbb{R}^{(\mathbb{R}^n)}$ be a real vector space of functions $f : \mathbb{R}^n \to \mathbb{R}$ under pointwise addition and scalar multiplication which is closed under the operation of pointwise multiplication,
$$\forall x \in \mathbb{R}^n, \qquad (f.g)(x) = f(x).g(x),$$
for all $f, g \in F$. For such a space $F$ and any $p \in \mathbb{R}^n$, the tangent space of $\mathbb{R}^n$ at $p$ with respect to $F$, denoted $T_p^F(\mathbb{R}^n)$, is the vector space of linear maps $L : F \to \mathbb{R}$ such that
$$\forall f, g \in F, \qquad L(f.g) = L(f)g(p) + f(p)L(g).$$
Any such linear map $L$ is called a tangent vector of $\mathbb{R}^n$ at $p$ with respect to $F$.

45.1.2 Remark: This definition is quite standard, although it strongly emphasizes the function space as opposed to the underlying manifold. In EDM [33] and Szekeres [86], $F$ is taken to be in $C^\infty(M)$ for any $C^\infty$ manifold $M$. The definition of tangent space makes sense even if the base space $M$ is generalized from $\mathbb{R}^n$ to any space at all. The space $F$ can also be generalized in various ways, for instance by replacing $\mathbb{R}$ with a ring, but life is already complicated enough as it is.
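As a concrete instance of Definition 45.1.1, the ordinary derivative at a point is a tangent vector in this sense. The following worked equation is a sketch under the assumption that $n = 1$ and that $F$ is the space of real polynomial functions on $\mathbb{R}$ (a choice made here for illustration only); the map $L$ is then linear, and the Leibniz rule is just the product rule for derivatives.

```latex
% Assumption (not from the text): n = 1 and F = real polynomial functions on R.
L(f) = f'(p), \qquad
L(f.g) = (f.g)'(p) = f'(p)\,g(p) + f(p)\,g'(p) = L(f)\,g(p) + f(p)\,L(g).
```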
45.2. Some elementary examples

In this section, the most elementary examples of function spaces are examined: those which are generated by a single member of the space.

The smallest vector space $F$ of functions closed under pointwise multiplication is the trivial space $F = \{0\}$, for which $T_p^F(\mathbb{R}^n) = \{0\}$, where 0 represents, according to context, the zero function $0 : \mathbb{R}^n \to \mathbb{R}$ or the zero operator $0 : F \to \mathbb{R}$.

Suppose $F$ contains at least one function $f : \mathbb{R}^n \to \mathbb{R}$ such that $f \ne 0$. Then $F$ also contains the functions $x \mapsto f(x)^i$ for positive integers $i$, and all linear combinations of such powers of $f$. That is, all polynomial expressions of $f$ (without a constant term) are in $F$. This may be denoted by $F[f]$, the space generated by $f$:
$$F[f] = \Big\{ \sum_{i=1}^m a_i f^i;\ m > 0 \text{ and } \forall i = 1 \ldots m,\ a_i \in \mathbb{R} \Big\}.$$
$F[f]$ is a vector space which is closed under pointwise multiplication, and is therefore an appropriate space with respect to which a tangent space on $\mathbb{R}^n$ can be defined. If $f$ is a non-zero constant function, then $F[f]$ is 1-dimensional. More generally, if the range of $f$ is finite, then there will always be a non-trivial linear combination of the powers of $f$ which is constant. Indeed, suppose the range of $f$ consists of the $m$ distinct non-zero values $y_j$ and possibly also 0. Then the formal polynomial in $\phi$,
$$P(\phi) = \phi \prod_{j=1}^m (\phi - y_j),$$
has the value 0 at $\phi = f$. This may be expanded in terms of powers of $\phi$ as
$$P(\phi) = \sum_{k=1}^{m+1} c_k \phi^k = \phi^{m+1} + \sum_{k=1}^{m} c_k \phi^k,$$
where the real coefficients $c_k$ are given by $c_{m+1} = 1$, $c_m = \sum_{j=1}^m (-y_j)$, $c_1 = \prod_{j=1}^m (-y_j)$ and so forth. Thus $f^{m+1}$ can be written as a linear combination of lower positive powers of $f$, and so the dimension of $F[f]$ is equal to at most $m$. The converse fact that no lower power of $f$ can be expressed in terms of yet lower powers of $f$ follows from the unique factorization theorem. So $\dim(F[f]) = \#(f(\mathbb{R}^n) \setminus \{0\})$.
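The expansion of the formal polynomial P can be checked on a small case. The sketch below is illustrative only: it assumes a function f whose non-zero values are 1 and 2 (so m = 2), and the helper names are hypothetical. It computes the coefficients of P and confirms that P vanishes on the range of f, so that the third power of f is a combination of f and its square.

```python
def formal_polynomial(nonzero_values):
    """Ascending coefficients [c_0, c_1, ..., c_{m+1}] of P(phi) = phi * prod_j (phi - y_j)."""
    coeffs = [0, 1]  # start with the polynomial phi
    for y in nonzero_values:
        shifted = [0] + coeffs                   # phi * (current polynomial)
        scaled = [-y * c for c in coeffs] + [0]  # -y * (current polynomial)
        coeffs = [a + b for a, b in zip(shifted, scaled)]
    return coeffs

def evaluate(coeffs, t):
    """Value of the polynomial with ascending coefficients at t."""
    return sum(c * t**k for k, c in enumerate(coeffs))

ys = [1, 2]                      # assumed non-zero values of f
P = formal_polynomial(ys)
print(P)                         # ascending coefficients c_0 .. c_3
print([evaluate(P, t) for t in [0] + ys])
```

For these values the coefficients agree with the formulas in the text: c_1 is the product of the −y_j and c_m is the sum of the −y_j.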
If the range of $f$ is infinite, $F[f]$ clearly has countably infinite dimension. It turns out that the dimension of $F[f]$ has an influence on the dimension of the tangent space $T_p^{F[f]}(\mathbb{R}^n)$.

45.2.1 Theorem: The tangent space $T_p^{F[f]}(\mathbb{R}^n)$ is 0-dimensional if the range of $f$ is finite.

Proof: In the case of a constant function $f$, it is easy to show that $L(f) = 0$ for all $L \in T_p^{F[f]}(\mathbb{R}^n)$. In fact, $L(1) = L(1.1) = L(1).1 + 1.L(1)$, so $L(1) = 0$. Hence $T_p^{F[f]}(\mathbb{R}^n) = \{0\}$.

If the range of $f$ is finite, then $P(f) \in F[f]$ and any $L \in T_p^{F[f]}(\mathbb{R}^n)$ may be applied to $P(f)$:
$$\begin{aligned}
0 = L(P(f)) &= L\Big( \sum_{k=1}^{m+1} c_k f^k \Big) \\
&= \sum_{k=1}^{m+1} c_k L(f^k) \\
&= L(f) \sum_{k=1}^{m+1} k c_k f(p)^{k-1}.
\end{aligned}$$
The last line follows by the inductive application of the Leibniz rule to the product $f^k = f.f^{k-1}$. Viewed as a polynomial in $f(p)$, the coefficient of $L(f)$ in the last line above is the derivative of the formal polynomial $P$ evaluated at $f(p)$. And since $P$ has no multiple zeros, the derivative of $P$ must be non-zero at each of its zeros, which in this case are the possible values of $f(p)$. So the coefficient of $L(f)$ is not equal to 0, and it therefore follows that $L(f) = 0$. This has shown that $T_p^{F[f]}(\mathbb{R}^n)$ is 0-dimensional if the range of $f$ is finite.

45.2.2 Remark: Now suppose that the range of $f$ is infinite. Any function $g$ in $F[f]$ may be written as
$$g = \sum_{i=1}^m a_i f^i$$
for some integer $m$ and coefficients $a_i \in \mathbb{R}$. An inductive application of the Leibniz rule then reduces the expression for $L(g)$ to the following:
$$L(g) = L(f) \sum_{i=1}^m i a_i f(p)^{i-1},$$
by induction on $i$. Thus $T_p^{F[f]}(\mathbb{R}^n)$ is at most 1-dimensional, since the value of any tangent vector $L$ on a function in $F[f]$ is completely determined by its value on $f$.

Although it is clear that the tangent space $T_p^{F[f]}(\mathbb{R}^n)$ is at most 1-dimensional, it is perhaps not so obvious that it is at least 1-dimensional. To show this, it is essentially necessary to find at least one non-zero map $L$ which obeys the Leibniz rule for all pairs of functions $g$ and $h$ in $F[f]$. This is equivalent to showing that no further reductions of the above expression for $L(g)$ are possible by using the Leibniz product rule. This difficulty increases as the size of $F$ increases. It is often difficult to know whether further reductions are possible. It is for this reason that it is useful to examine trivial examples first in depth, so as to find techniques for showing the non-existence of further reductions.

45.2.3 Theorem: The tangent space $T_p^{F[f]}(\mathbb{R}^n)$ is 1-dimensional if the range of $f$ is infinite.
Proof: It is sufficient to demonstrate the existence of a non-zero linear map $L : F[f] \to \mathbb{R}$ which obeys the Leibniz rule for all functions $g, h \in F[f]$. If the range of $f$ is infinite, then any $g \in F[f]$ may be written as
$$g = \sum_{i=1}^{m_1} a_i f^i,$$
for some coefficients $(a_i)_{i=1}^{m_1}$. These coefficients are uniquely determined by $g$, because the vectors $f^i$ are linearly independent for $i \ge 1$ if the range of $f$ is infinite. Define a map $L : F[f] \to \mathbb{R}$ in terms of these unique coefficients for $g$ by
$$L(g) = \alpha \sum_{i=1}^{m_1} i a_i f(p)^{i-1}, \qquad (45.2.1)$$
for some $\alpha \in \mathbb{R}$. Similarly, any $h \in F[f]$ uniquely determines coefficients $(b_j)_{j=1}^{m_2}$ such that
$$h = \sum_{j=1}^{m_2} b_j f^j.$$
(Of course the numbers $m_1$ and $m_2$ are not uniquely determined. They are actually irrelevant, and serve only to emphasize that the sequences are finite.) By expanding the product $g.h$, the value of $L(g.h)$ can be calculated as:
$$\begin{aligned}
L(g.h) &= L\Big( \sum_{i=1}^{m_1} a_i f^i \sum_{j=1}^{m_2} b_j f^j \Big) \\
&= \sum_{i=1}^{m_1} \sum_{j=1}^{m_2} a_i b_j L(f^{i+j}) \\
&= \alpha \sum_{i=1}^{m_1} \sum_{j=1}^{m_2} (i + j) a_i b_j f(p)^{i+j-1}.
\end{aligned}$$
The right hand side of the Leibniz rule is
$$\begin{aligned}
L(g)h(p) + g(p)L(h) &= L\Big( \sum_{i=1}^{m_1} a_i f^i \Big) . \sum_{j=1}^{m_2} b_j f(p)^j + \sum_{i=1}^{m_1} a_i f(p)^i . L\Big( \sum_{j=1}^{m_2} b_j f^j \Big) \\
&= \alpha \sum_{i=1}^{m_1} i a_i f(p)^{i-1} . \sum_{j=1}^{m_2} b_j f(p)^j + \sum_{i=1}^{m_1} a_i f(p)^i . \alpha \sum_{j=1}^{m_2} j b_j f(p)^{j-1} \\
&= \alpha \sum_{i=1}^{m_1} \sum_{j=1}^{m_2} (i + j) a_i b_j f(p)^{i+j-1}.
\end{aligned}$$
Both of these calculations use only the definition (45.2.1) of $L$. This verifies the product rule for $L$ for all $g, h \in F[f]$, for any $\alpha \in \mathbb{R}$.

The tangent space $T_p^{F[f]}(\mathbb{R}^n)$ is thus at least 1-dimensional. So the dimension of the tangent space is precisely 1.
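The two calculations in the proof can be spot-checked with integer arithmetic. In the sketch below, an element of F[f] is represented by its coefficient list (a_1, ..., a_m), the number t stands for f(p), and the operator is the one defined by (45.2.1). The representation, the helper names and the sample values are assumptions made for illustration; the check is meaningful only when the coefficients are unique, that is, when the range of f is infinite.

```python
def L(coeffs, t, alpha):
    """Operator (45.2.1): alpha * sum_i i * a_i * t^(i-1), with coeffs[i-1] = a_i and t = f(p)."""
    return alpha * sum(i * a * t**(i - 1) for i, a in enumerate(coeffs, start=1))

def value(coeffs, t):
    """g(p) = sum_i a_i * t^i."""
    return sum(a * t**i for i, a in enumerate(coeffs, start=1))

def product(a, b):
    """Coefficient list of g.h; the entry at index k-1 is the coefficient of f^k."""
    out = [0] * (len(a) + len(b))
    for i, ai in enumerate(a, start=1):
        for j, bj in enumerate(b, start=1):
            out[i + j - 1] += ai * bj
    return out

g = [1, 2]       # g = f + 2 f^2
h = [3, 0, 1]    # h = 3 f + f^3
t, alpha = 2, 5  # sample values of f(p) and alpha

lhs = L(product(g, h), t, alpha)
rhs = L(g, t, alpha) * value(h, t) + value(g, t) * L(h, t, alpha)
print(lhs, rhs)
```

Both sides agree for this sample, as the proof predicts for every choice of g, h, f(p) and α.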
45.2.4 Remark: It is perhaps not quite obvious why this form of proof would not work when the space $F[f]$ is finite-dimensional. The operator $L$ defined by (45.2.1) is not uniquely defined when $\dim(F[f]) < \infty$, since the value of $L(g)$ for $g \in F[f]$ depends on whether $g$ is reduced to lower powers of $f$ before having $L$ applied to it. One way around this is to require all functions in $F[f]$ to be reduced to the lowest possible powers of $f$ before the application of $L$:
$$g = \sum_{i=1}^m a_i f^i.$$
Such a representation of $g$ is then unique, but the Leibniz rule fails for the resulting operator when $\alpha \ne 0$. To see this, let $g = f^m$ and $h = f$. Then $g.h = f^{m+1}$ is equal to some linear combination of the powers $f^i$ for $1 \le i \le m$:
$$f^{m+1} = \sum_{i=1}^m c_i f^i,$$
for some real numbers $c_i$. Hence the definition of $L$ gives
$$L(g.h) = \alpha \sum_{i=1}^m i c_i f(p)^{i-1}.$$
But
$$L(g)h(p) + g(p)L(h) = \alpha m f(p)^{m-1}.f(p) + f(p)^m.\alpha = \alpha (m + 1) f(p)^m.$$
If the Leibniz rule is to hold, then these two calculations of $L(g.h)$ should give the same result. But as noted in the proof of Theorem 45.2.1, the derivative of a polynomial cannot be zero at a simple zero of the polynomial. So the Leibniz rule does not hold for the product $g.h$.

45.2.5 Remark: Clearly, if the elements of the function space $F$ are required to be continuous, then the only kind of function with a finite range is a constant function, on a connected region at least.
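The failure described in Remark 45.2.4 is already visible in the smallest case. The following worked example is a sketch which assumes that $f$ is the indicator function of some non-empty proper subset of $\mathbb{R}^n$ (an assumption made here, not taken from the text), so that $m = 1$, $y_1 = 1$ and $P(\phi) = \phi(\phi - 1)$.

```latex
% Assumption: f takes only the values 0 and 1, so f^2 = f and c_1 = 1.
f^2 = f, \qquad
L(f.f) = L(f) = \alpha, \qquad
L(f)\,f(p) + f(p)\,L(f) = 2\alpha f(p) \in \{0,\ 2\alpha\}.
% The two values agree only when \alpha = 0.
```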
45.3. Further elementary examples

In this section, the case $F = F[f_1, f_2]$ and similar function spaces are treated, like $F[x, \sin x]$. It is shown that the tangent space can have very large dimension even for $\mathbb{R}^1$.

[ The case $F = F[f_1, f_2]$. If this space is infinite-dimensional and there are no relations between $f_1$ and $f_2$, then the tangent space $T_p^{F[f_1,f_2]}(\mathbb{R}^n)$ should turn out to be 2-dimensional. It should then be possible to come to some reasonable conclusions when there are relations between $f_1$ and $f_2$. Then spaces generated by any number of functions should be dealt with. The case where one of these functions is constant (or finite-valued) should make it possible to subtract $g(p)$ from any function $g \in F$, and therefore concentrate on those functions which vanish at $p$. There should be some connections with abstract algebras here somewhere. ]

[ Next deal with the concrete case $F[x]$. This is the space of all polynomials. The concrete case $F[x, \sin x]$ may or may not be instructive. Could try for instance
$$L(x^i) = \begin{cases} 0, & i \ne 1 \\ 1, & i = 1 \end{cases} \qquad
L(\sin^j x) = \begin{cases} 0, & j \ne 1 \\ 2, & j = 1 \end{cases} \qquad
L(x^i \sin^j x) = \begin{cases} 1, & i = 1,\ j = 0 \\ 2, & i = 0,\ j = 1 \\ 0 & \text{otherwise} \end{cases}$$
where
$$F = \Big\{ f : \mathbb{R} \to \mathbb{R};\ f(x) = \sum_{i,j=1}^{m,n} a_{ij} x^i \sin^j x,\ m, n \ge 0,\ a_{ij} \in \mathbb{R} \text{ for all } i, j \Big\}.$$
Thus $L(f) = a_{10} + 2a_{01}$. Comparison with the space $F[1, x, \sin x]$ may be interesting. ]

45.4. Spaces of differentiable functions

Warner [49], Theorem 1.17, page 13, proves the dimensionality of the space of derivations. He shows that the dimension of the space of derivations equals the dimension of the manifold in the case of $C^\infty$ test functions. But in the case of $C^k$ functions for $k < \infty$, he says on page 16 that the dimension of the derivation space is infinite. (This is shown in Newns/Walker [80].) Then he says that the way to fix this is to define the derivations as standard derivatives. The conclusion, then, is that one can replace derivatives with derivations in a convoluted manner in the $C^\infty$ case, but must revert to derivatives otherwise. Since the derivation approach is so restrictive and artificial, it is very difficult indeed to understand why anyone would use this approach. Maybe it is designed to make algebraists more comfortable.

In this section, the case $F = \{f : \mathbb{R}^n \to \mathbb{R};\ f \text{ is differentiable in a neighbourhood of } p\}$ for $p \in \mathbb{R}^n$ is treated. This shows the difficulties which arise when a space is not closed under the quotient operator.

Here is a hopeful form of proof of the dimension of the tangent space for the space of differentiable functions. To clarify the argument, the simplifying assumption that $n = 1$ is made. Suppose $F$ is the space of real functions differentiable in a neighbourhood of $p \in \mathbb{R}$. For any $f \in F$, for $x$ in a suitable neighbourhood of $p$,
$$\begin{aligned}
f(x) &= f(p) + \int_0^1 f'(p + t(x - p)).(x - p)\, dt \\
&= f(p) + (x - p).\int_0^1 f'(p + t(x - p))\, dt \\
&= f(p) + (x - p).g(x),
\end{aligned}$$
where $g(x) = \int_0^1 f'(p + t(x - p))\, dt$. The function $g$ is well-defined in a neighbourhood of $p$. So now the Leibniz rule may be applied to $f$ as follows:
$$\begin{aligned}
L(f) &= L(f(p)) + L((x - p).g) \\
&= 0 + (x - p)\big|_{x=p}.L(g) + L(x - p).g(p) \\
&= 0.L(g) + f'(p).L(x - p) \\
&= f'(p).L(x - p).
\end{aligned}$$
Thus the value of $L(f)$ is just the value of $L$ on the function $x \mapsto x - p$, multiplied by the derivative of $f$ at $p$. This would imply that the tangent space is 1-dimensional. It doesn’t matter what $L(g)$ is, because it gets multiplied by zero anyway. But that’s exactly where the proof falls down. The value of $L(g)$ is of no relevance as long as it is a real number. But in fact, it is not necessarily even defined. $L(g)$ is only defined if $g \in F$. An example demonstrates that this is not so in general.

The function $g$ defined above is actually just the differential quotient of $f$. For any $p \in \mathbb{R}$, define the (non-linear) operator $Q_p : \mathbb{R}^{\mathbb{R}} \to \overline{\mathbb{R}}^{\mathbb{R}}$ by
$$Q_p(f)(x) = \begin{cases} \dfrac{f(x) - f(p)}{x - p}, & x \ne p \\[2ex] \limsup\limits_{x \to p} \dfrac{f(x) - f(p)}{x - p}, & x = p, \end{cases}$$
for all $x \in \mathbb{R}$, for any $f \in \mathbb{R}^{\mathbb{R}}$. (This operator is linear on subspaces of $\mathbb{R}^{\mathbb{R}}$ in which all functions are differentiable at $p$.) Then if $f$ is differentiable,
$$Q_p(f)(x) = \int_0^1 f'(p + t(x - p))\, dt.$$
To show that $Q_p(F) \not\subseteq F$ if $F$ is the space of differentiable functions, consider $\phi_\alpha \in F$ defined by
$$\phi_\alpha(x) = \begin{cases} x^\alpha \sin x^{-1}, & x \ne 0 \\ 0, & x = 0. \end{cases} \qquad (45.4.1)$$
For $\alpha > 0$,
$$Q_0(\phi_\alpha)(x) = \begin{cases} x^{\alpha-1} \sin x^{-1}, & x \ne 0 \\ 0, & x = 0. \end{cases}$$
$Q_0(\phi_\alpha)$ is differentiable if and only if $\alpha > 2$, whereas $\phi_\alpha$ is differentiable if and only if $\alpha > 1$. In fact, for $\alpha > 1$,
$$\phi_\alpha'(x) = \begin{cases} \alpha x^{\alpha-1} \sin x^{-1} - x^{\alpha-2} \cos x^{-1}, & x \ne 0 \\ 0, & x = 0. \end{cases}$$
So in particular, if $\alpha = 2$ then the above form of proof that the tangent space dimension is 1 does not work.

Essentially, the above form of proof of the dimension of the tangent space proceeds by supposing that the space $F$ is closed under the quotient operator $Q_p$ and then applying the Leibniz rule to the product of the function $x \mapsto x - p$ with the quotient function. This then leads to the conclusion that the value of any tangent vector $L$ applied to the function depends only on the value of the quotient function at $p$, that is $f'(p)$. The requirement that $F$ be closed under $Q_p$ is equivalent to the condition that, under the ring structure of $F$, the function $x \mapsto x - p$ be a divisor of all functions in $F$ which vanish at $p$. Clearly this is not so for the differentiable functions on $\mathbb{R}$.

Two questions are immediately raised by this. Firstly, which spaces $F$ are closed under the quotient operator, so that the above method of proof can be made to work? And secondly, if the above method of proof cannot be made to work, is the dimension of the tangent space no longer equal to $n$, or are there alternative methods of proof? In Section 45.5, it is shown that the quotient operator method works for the $C^\infty$ functions, but not for the $C^k$ functions for integer $k$.
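The case α = 2 can be seen numerically: the quotient function of the function (45.4.1) with α = 2 is x sin(1/x) away from 0, and its own difference quotient at 0 is sin(1/x), which oscillates between −1 and 1 as x tends to 0. The sketch below is illustrative only. It restricts to x > 0 and uses sample points chosen (as an assumption of this example) so that sin(1/x) is close to +1 or −1.

```python
import math

def phi(alpha, x):
    """phi_alpha from (45.4.1), evaluated here only for x >= 0."""
    return 0.0 if x == 0 else x**alpha * math.sin(1.0 / x)

def Q0(f, x):
    """Differential quotient of f at p = 0, for x != 0."""
    return (f(x) - f(0.0)) / x

def second_quotient(x):
    """Difference quotient at 0 of Q0(phi_2), using Q0(phi_2)(0) = 0."""
    return Q0(lambda s: phi(2, s), x) / x

xs_plus = [1.0 / ((2 * k + 0.5) * math.pi) for k in (10, 100, 1000)]
xs_minus = [1.0 / ((2 * k + 1.5) * math.pi) for k in (10, 100, 1000)]

print([round(second_quotient(x), 6) for x in xs_plus])
print([round(second_quotient(x), 6) for x in xs_minus])
```

Since the difference quotients along the two sequences stay near +1 and −1 respectively, they have no limit at 0, which is consistent with the statement that the quotient function is not differentiable at 0 when α = 2.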
45.5. Spaces of smooth functions

In the lead-up to the $C^\infty$ functions, it is useful to deal with the $C^k$ functions first.

45.5.1 Theorem: $\forall k \ge 0$, $\forall p \in \mathbb{R}$, $Q_p(C^k(\mathbb{R})) \not\subseteq C^k(\mathbb{R})$.

Proof: To show that $Q_p(C^k(\mathbb{R})) \not\subseteq C^k(\mathbb{R})$, consider the function $\phi_\alpha$ defined in (45.4.1). $\phi_\alpha \in C^1(\mathbb{R})$ if and only if $\alpha > 2$, and $Q_0(\phi_\alpha) = \phi_{\alpha-1} \in C^1(\mathbb{R})$ if and only if $\alpha > 3$. So for $\alpha = 3$, $\phi_\alpha$ provides a counterexample to the closure of $C^1(\mathbb{R})$ under $Q_0$. To find a counterexample for $k > 1$, it suffices to consider the $(k-1)$-fold integral $I_0^{k-1}(\phi_\alpha)$ of $\phi_\alpha$, where $I_0$ denotes the integral operator
$$I_0(f)(x) = \int_0^x f(t)\, dt.$$
Then clearly $I_0^{k-1}(\phi_\alpha) \in C^k(\mathbb{R})$ if and only if $\alpha > 2$ and
$$\begin{aligned}
\Big(\frac{d}{dx}\Big)^k Q_0(I_0^{k-1}(\phi_\alpha)) &= \Big(\frac{d}{dx}\Big)^k \big( x^{-1} I_0^{k-1}(\phi_\alpha) \big) \\
&= \sum_{i=0}^k \binom{k}{i} (-1)^i i!\, x^{-1-i} I_0^{i-1}(\phi_\alpha).
\end{aligned}$$
Since $I_0^{i-1}(\phi_\alpha)$ is majorized by $|x|^{\alpha+i-1}/(i-1)!$ for $i \ge 1$, $Q_0(I_0^{k-1}(\phi_\alpha))^{(k)}$ is continuous at $x = 0$ when $\alpha > 2$ if and only if the term $x^{-1} I_0^{-1}(\phi_\alpha)$ is continuous at 0. But $I_0^{-1}(\phi_\alpha) = \phi_\alpha'$. Inspection of the terms of $\phi_\alpha'$ shows that $x^{-1} I_0^{-1}(\phi_\alpha)$ is thus discontinuous for $\alpha \le 3$. So if $\alpha = 3$, $Q_0(I_0^{k-1}(\phi_\alpha)) \notin C^k(\mathbb{R})$, but $I_0^{k-1}(\phi_\alpha) \in C^k(\mathbb{R})$. Hence for all $k \ge 1$, $Q_p(C^k(\mathbb{R})) \not\subseteq C^k(\mathbb{R})$. (Note that the case $k = 0$ is a pushover.)

45.5.2 Theorem: $\forall k \ge 1$, $\forall p \in \mathbb{R}$, $Q_p(C^k(\mathbb{R})) \subseteq C^{k-1}(\mathbb{R})$.

Proof: Now to show that $Q_0(C^k(\mathbb{R})) \subseteq C^{k-1}(\mathbb{R})$ for all $k \ge 1$, it is helpful to recall that a function $h : \mathbb{R} \to \mathbb{R}$ with $h(0) = 0$ is continuous at 0 if and only if
$$\forall \varepsilon > 0,\ \exists \delta > 0, \qquad |x| < \delta \Rightarrow |h(x)| < \varepsilon.$$
Suppose $f \in C^k(\mathbb{R})$, and define $g \in C^k(\mathbb{R})$ by
$$g(x) = f(x) - \sum_{i=0}^k \frac{x^i}{i!} f^{(i)}(0).$$
Then $g = I_0^k(g^{(k)})$. But $h = g^{(k)}$ is continuous at 0 and $h(0) = 0$. So for any integer $i$ with $0 \le i \le k$,
$$\forall \varepsilon > 0,\ \exists \delta > 0, \qquad |x| < \delta \Rightarrow |g^{(k-i)}(x)| < \varepsilon |x|^i.$$
So the $(k-1)$-th derivative of $Q_0(g)$ satisfies
$$\begin{aligned}
\Big| \Big(\frac{d}{dx}\Big)^{k-1} Q_0(g) \Big| &= \Big| \Big(\frac{d}{dx}\Big)^{k-1} (x^{-1} g(x)) \Big| \\
&= \Big| \sum_{i=0}^{k-1} \binom{k-1}{i} (-1)^i i!\, x^{-1-i} g^{(k-1-i)}(x) \Big| \\
&\le \sum_{i=0}^{k-1} \binom{k-1}{i} i!\, |x|^{-1-i} |g^{(k-1-i)}(x)| \\
&\le \sum_{i=0}^{k-1} \binom{k-1}{i} i!\, |x|^{-1-i} \varepsilon_i |x|^{i+1} \\
&= \sum_{i=0}^{k-1} \binom{k-1}{i} i!\, \varepsilon_i,
\end{aligned}$$
for any set of positive numbers $\varepsilon_i$, for small enough $|x|$. Hence $Q_0(g)^{(k-1)}(x) \to 0$ as $x \to 0$. It remains to show that $Q_0(g)^{(k-1)}(0) = 0$. Since $g(0) = g'(0) = 0$,
$$Q_0(g)(0) = \limsup_{x \to 0} \frac{g(x)}{x} = 0.$$
Since $g''(0) = 0$ also,
$$Q_0(g)'(0) = \lim_{x \to 0} \frac{Q_0(g)(x)}{x} = \lim_{x \to 0} x^{-2} g(x) = 0.$$
By induction, since $g^{(i)}(0) = 0$ for $0 \le i \le k$,
$$\begin{aligned}
Q_0(g)^{(k-1)}(0) &= \lim_{x \to 0} \frac{Q_0(g)^{(k-2)}(x)}{x} \\
&= \lim_{x \to 0} \sum_{i=0}^{k-2} \binom{k-2}{i} (-1)^i i!\, x^{-2-i} g^{(k-2-i)}(x) \\
&= 0.
\end{aligned}$$
So $Q_0(g) \in C^{k-1}(\mathbb{R})$. But $Q_0(f) = Q_0(g) + Q_0(f - g)$, and $f - g$ is a polynomial. So $Q_0(f) \in C^{k-1}(\mathbb{R})$, since the differential quotient of any polynomial is a polynomial.

45.5.3 Remark: An immediate consequence of the above is the closure of $C^\infty(\mathbb{R})$ under the quotient operator. Hence $\dim T_p^{C^\infty(\mathbb{R})}(\mathbb{R}) = 1$.

45.5.4 Theorem: $\forall p \in \mathbb{R}$, $\dim T_p^{C^\infty(\mathbb{R})}(\mathbb{R}) = 1$.

45.5.5 Theorem: $\forall n \ge 1$, $\forall p \in \mathbb{R}^n$, $\dim T_p^{C^\infty(\mathbb{R}^n)}(\mathbb{R}^n) = n$.
45.6. The space of analytic functions [ Then deal with all analytic functions, which are basically the pointwise limits of functions in F [x], if the pointwise limit exists. This raises the question of whether the domain should be restricted from IRn to some ball around p. Anyway, the tangent space should turn out to have the right dimension for the analytic functions, because the analytic functions are closed under the quotient operator. That is, if f is analytic, then Qp (f ) is analytic for any p. ]
45.7. The Hölder spaces

The spaces $C^{k,\alpha}(\mathbb{R}^n)$ and $W^{k,p}(\mathbb{R}^n)$ probably should be considered. As an example, let $F = C^{1/2}(\mathbb{R})$, and try to define the linear map $L \in T_0^F(\mathbb{R})$ so that $L(|x|^{1/2}) = 1$ and also $L(\mathrm{sign}(x)|x|^{1/2}) = 1$. Then if this can be done, it should result in
$$L(x) = 2|x|^{1/2}\Big|_{x=0}\, L(|x|^{1/2}),$$
and therefore $L(ax) = 0$ for all constants $a \in \mathbb{R}$. Thus the generators of this space are effectively the functions which are only just in $C^{1/2}(\mathbb{R})$. This would make an interesting example maybe.
45.8. Further topics on derivations

Also to be considered is the issue of whether demanding that the tangent maps $L$ be continuous on some topology on $F$ makes things better and/or simpler. Should try to show via some sort of Hahn-Banach theorem that $\dim T_p^F(\mathbb{R}^n) > n$ for $F = C^n(\mathbb{R}^n)$, etc.

In the case of $C^k$ manifolds, this raises the question of just how tangent spaces should be defined in spaces larger than $C^\infty(M)$. Maybe the derivative will have to be defined in terms of coordinates!
45.9. Germs

[ Warner [49], pages 11–22, has a good treatment of derivations and germs. See also Bump [96], section 6. ]

[ This section should be written up as formal definitions and theorems. ]

This is an e-mail I once received on this subject.

“The idea I was telling you about is really due to Grothendieck [68]. It is hinted at in Spencer’s article [141]. It is more in evidence in algebraic geometry than in differential geometry, e.g. Fulton [107], exercise 13-3 or, indeed in analytic geometry, e.g. Gunning [69], corollary 1 to theorem 15.

“Anyway, if $p \in M$ a smooth manifold, let $W$ be the ideal of germs vanishing at $p$ inside $O$, the ring of germs of smooth functions at $p$. Then $O/W = \mathbb{R}$ (induced by evaluation of a function at $p$). So $W/W^2$ is an $\mathbb{R}$-module, i.e. a vector space. I claim that it is a vector space of dimension equal to the dimension of $M$. You can define this as the cotangent space at $p$. For example, if $f$ is a germ at $p$, then $f - f(p)$ is in $W$, so defines an element of $W/W^2$, and this is $df(p)$. Similarly, if $X$ is a derivation of $O$, e.g. a germ of a smooth vector field at $p$, then by the Leibnitz rule, $X W^2$ vanishes. So $X$ defines a linear functional on $W/W^2$, i.e. a tangent vector at $p$.

“This is all ‘explained’ by ‘synthetic differential geometers’. I remember that Reyes & Koch were involved in the basics (but beware that they use intuitionistic logic). The MSC classification number is 51 K 10. So you’ll be able to find lots of this stuff in the library.

“P.S. Anders Kock [72] seems to have written a book on it.”

Let $M$ be an $n$-dimensional $C^\infty$ manifold. Let $p \in M$. Let $W = \{f \in C^\infty(M);\ f(p) = 0\}$, which is an ideal of the ring $O = C^\infty(M)$. Then $O/W = \mathbb{R}$. [ See Section 9.8 for rings and ideals. ]

$O$ is a ring because it is closed under pointwise addition and multiplication. $W$ is an ideal in $O$ because for all $f \in O$ and $g \in W$, $f.g \in W$. $O/W$ is the set $\{f + W;\ f \in O\}$, which is a 1-dimensional vector space under multiplication. For all $f_1, f_2 \in O$, the product of $f_1 + W$ with $f_2 + W$ is equal to $f_1.f_2 + W$. Clearly $f_1 + W = f_2 + W$ if and only if $f_1(p) = f_2(p)$.

Now look at $W/W^2$. The set $W^2$ is the set of self-products of elements of $W$: $W^2 = \{f.f;\ f \in W\}$. Clearly all such functions must be non-negative. Then $W/W^2 = \{f + W^2;\ f \in W\}$. Then two (representatives of) elements of $W/W^2$ are equal if and only if they differ by a square of a function which vanishes at $p$. That is,

$$\forall f_1, f_2 \in W, \quad \bigl(f_1 + W^2 = f_2 + W^2 \;\Leftrightarrow\; \exists g \in W,\ f_1 = f_2 + g.g\bigr).$$
Is $W^2$ really an ideal of $W$? Check that equivalence classes of the form $f + W^2$ for $f \in W$ have well-defined sums and products. Let $f_1, f_2 \in W$. Then $(f_1 + W^2) + (f_2 + W^2) = (f_1 + f_2) + W^2$ if the sum of any pair of elements of $W^2$ can be written as an element of $W^2$ and vice versa. And so forth, and so forth...

See Gallot/Hulin/Lafontaine [19], page 19, for a discussion of germs. They define the space of germs as $W = \{f : M \to \mathbb{R};\ \exists \Omega \in \mathrm{Top}(M),\ p \in \Omega \text{ and } f(p) = 0\}$. Gallot et alia then go on to look at the set of derivations (Definition 1.45) on $W$. Then this set of derivations is an $n$-dimensional vector space, which can be defined to be the tangent space of $M$ at $p$. This is apparently a different definition to the one indicated above.
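For comparison, the usual argument that $W/W^2$ has dimension $n$ can be sketched as follows. This is a sketch under assumptions which differ from the text above: $W^2$ is taken to be the ideal generated by products of pairs of elements of $W$ (finite sums of products, not the set of squares), the argument is carried out for germs at $p$, and a chart $(x^1, \dots, x^n)$ with $x^i(p) = 0$ is chosen. The chart and the symbols $g_i$ are introduced here only for illustration.

```latex
% Assumed convention: W^2 is the ideal generated by products, not the set of squares.
% Assumed setting: germs at p, with a chart (x^1,...,x^n) centred at p.
\begin{align*}
  W^2 &= \Bigl\{ \textstyle\sum_{k=1}^{m} f_k\, g_k \;;\; m \ge 1,\ f_k, g_k \in W \Bigr\}, \\
  f &= \sum_{i=1}^{n} x^i g_i , \qquad g_i(p) = \frac{\partial f}{\partial x^i}(p),
      \qquad \text{for } f \in W \text{ (Hadamard's lemma)}, \\
  f - \sum_{i=1}^{n} \frac{\partial f}{\partial x^i}(p)\, x^i
    &= \sum_{i=1}^{n} x^i \bigl( g_i - g_i(p) \bigr) \in W^2, \\
  \frac{\partial (fg)}{\partial x^i}(p)
    &= f(p)\,\frac{\partial g}{\partial x^i}(p) + g(p)\,\frac{\partial f}{\partial x^i}(p) = 0
      \qquad \text{for } f, g \in W .
\end{align*}
```

The third line shows that the classes $x^1 + W^2, \dots, x^n + W^2$ span $W/W^2$. The fourth line shows that $f + W^2 \mapsto (\partial f/\partial x^i(p))_{i=1}^n$ is well defined, and it sends $x^j + W^2$ to the $j$-th unit vector, so these classes are linearly independent. Under these assumptions $\dim(W/W^2) = n$, which agrees with the claim in the quoted e-mail.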
45.10. Jets
[ See EDM 108.X. This is a set of definitions of tensors which is useful for defining a C ∞ manifold structure on tensor bundles of arbitrary order. ] [ Do jets have anything to do with sheaves? ] [ Are jets really just equivalence classes of curves? ]
Chapter 46 History of differential geometry
46.1 Chronology of mathematicians
46.2 Origins of words and notations
46.3 Etymology of affine spaces
46.4 Logical language in ancient literature
46.1. Chronology of mathematicians

This section outlines the chronology of mathematicians who contributed directly or indirectly to the development of the topics in this book. The tables are based on Bynum et alia [191], EDM2 [34], Bell [189], KEM [121] and other sources. Names are sorted by date of death.

46.1.1 Remark: Ancient history

dates | name | contribution
c639–c546bc | Thales of Miletus | first proof of geometric theorem; learned mathematics in Egypt; KEM [121] gives dates c625–c545bc
572–492bc | Pythagoras of Samos | taught geometry; maybe first to prove Pythagoras’ theorem; fl. 510bc; another source gives dates c.569–c.475bc; KEM [121] gives c580–c496bc
c490–c425bc | Zeno of Elea | paradoxes regarding infinitesimals
480–411bc | Antiphon the Sophist | atomistic calculation of the area of circles; proposed a method of exhaustion; fl.430?
c470–c410bc | Hippocrates of Chios | wrote “Elements of Geometry” (lost)
c460–c370bc | Democritus | computation of the volume of pyramids by dividing them into ‘atomistic’ laminas; fl.430?
c400–347bc | Eudoxus of Cnidus | attributed as having developed ‘method of exhaustion’. (EDM [33] and KEM [121] give dates c408–c355bc.)
c427–c347bc | Plato | first sum of infinite series $\sum_{n=0}^{\infty} 4^{-n}$; provided basis for Euclid’s Elements?
384–322bc | Aristotle | Basic formal logic.
c325–c265bc | Euclid of Alexandria | Elements; organized ruler/compass geometry axiomatically. (fl. c280bc; EDM [33] gives c300bc; KEM [121] gives c365–c300bc)
c287–212bc | Archimedes of Syracuse | rigorous treatment of areas and volumes bounded by curved lines and surfaces using the ‘method of exhaustion’. (EDM [33] gives dates c.282–212bc.)
c276–c194bc | Eratosthenes of Cyrene | measured distance of 1° on Earth
c262–c190bc | Apollonius of Perga | Konikon Biblia; conic sections.
fl.146–126bc | Hipparchus | Founder of trigonometry.
The Mesopotamians and Egyptians made much progress in geometry before classical Greek mathematics, but no personal names are associated with this very early geometry. It is not known, for example, who discovered Pythagoras’ theorem between about 2500bc and 1850bc.

According to Bell [189], page 70, the first deductive proof of a geometric theorem is traditionally ascribed to Thales about 600bc. Maybe Thales needed to use deduction to fill in the gaps in his knowledge which he learned on a visit to Egypt. He probably forgot a few things while sailing back to Anatolia, the Egyptian priests didn’t like to tell everything, and Egyptian and Mesopotamian mathematics texts never gave proofs of rules and theorems.

Euclid was more important as a collector and organizer of geometrical knowledge than as an inventor or discoverer. It is the logical organization of Euclid’s “Elements” which has had a profound effect on mathematics and physics, not so much the particular set of theorems. Bell [189], page 71, says:

“With the completion of Euclid’s Elements, Greek elementary geometry, exclusive of the conics, attained its rigid perfection. It was wholly synthetic and metric. Its lasting contribution—and Euclid’s—to mathematics was not so much the rich store of 465 propositions which it offered as the epoch-making methodology of it all. For the first time in history masses of isolated discoveries were unified and correlated by a single guiding principle, that of rigorous deduction from explicitly stated assumptions. Some of the Pythagoreans and Eudoxus before Euclid had executed important details of the grand design, but it remained for Euclid to see it all and see it whole. He is therefore the great perfector, if not the sole creator, of what is today called the postulational method, the central nervous system of living mathematics.”

Unification and organization of mathematics is still an important task in the 21st century. Bell [190], page 299, says the following.

“Geometrical teaching was dominated by Euclid for over 2200 years. His part in the Elements appears to have been principally that of a coordinator and logical arranger of the scattered results of his predecessors and contemporaries, and his aim was to give a connected, reasoned account of elementary geometry such that every statement in the whole long book could be referred back to the postulates. Euclid did not attain this ideal or anything even distantly approaching it, although it was assumed for centuries that he had.”
46.1.2 Remark: Dark Ages

dates | name | contribution
99bc–1bc | |
1–99 | |
c85–c168 | Ptolemy of Alexandria [Claudius Ptolemaeus] | wrote Almagest on astronomy and geometry.
200–299 | |
fl.300–350 | Pappus of Alexandria | Mathematical Collection referred to lost works on geometry.
400–499 | |
500–599 | |
600–699 | |
700–799 | |
800–899 | |
900–999 | |
1000–1099 | |
1100–1199 | |
1200–1299 | |
1300–1399 | |

Progress in mathematics was woefully slow under the Roman Empire and Catholic church until the Renaissance and Reformation. There was some progress in algebra in these Dark Ages, but geometry and analysis seem to have gone backwards. Bell [189], page 85, describes this as follows.

“It is customary in mathematical history to date the beginning of the sterile period from the onset of the Dark Ages in Christian Europe. But mathematical decadence had begun much earlier, in one of the greatest material civilizations the world has known, in the Roman Empire at the height of its splendor. Mathematically, the Roman mind was crass.”

46.1.3 Remark: Renaissance

dates | name | contribution
1404–1472 | Leone Battista Alberti | theory of perspective; vanishing line. (b. Feb 18, d. Apr 3.)
1512–1558 | Robert Recorde | 1557: invented the “ = ” sign for equality.
1526–1572 | Rafael Bombelli | 1572: negative number arithmetic.
1540–1603 | François Viète | symbolic algebra; Newton’s method. (d. Dec 13.)
1571–1630 | Johannes Kepler | principle of continuity; points at infinity.
1564–1642 | Galileo Galilei | 1591/1612: dynamics. (b. Feb 15, d. Jan 8.)
1596–1650 | René Descartes | 1637: la Géométrie; analytic geometry. (b. Mar 31, d. Feb 11.)
1591–1661 | Girard Desargues | 1636–39: invented projective geometry; points at infinity. (b. Feb 21, d. Sep.)
1623–1662 | Blaise Pascal | 1636–39: synthetic projective geometry. (b. Jun 19, d. Aug 19.)
1601–1665 | Pierre de Fermat | 1629: analytic geometry; 1657/61: tangents to curves as limits of secants; var. principle in optics. (b. Aug 17, d. Jan 12.)
1630–1677 | Isaac Barrow | taught calculus to Newton. (b. Oct, d. May 4.)
1629–1695 | Christiaan Huygens | (b. Apr 14, d. Jul 8.)
The 17th century is notable for the rapid development of analysis, which is distinguished from other mathematics by the use of limits and infinite processes. The big contribution of Newton was in the application of analysis to physics, for which he required some clarification and much development of the methods of analysis. Practical analysis was developed initially by Archimedes. Limits and derivatives were written about by some authors in the century or two before Newton. But these notions were elevated from curiosities to fundamental physical modelling tools by Newton. It was then perhaps inevitable that analysis would be applied to geometry to produce differential geometry.

Bell [190], page 96, has the following comment on how Newton learned calculus from Isaac Barrow.

“Barrow’s geometrical lectures dealt among other things with his own methods for finding areas and drawing tangents to curves—essentially the key problems of the integral and the differential calculus respectively, and there can be no doubt that these lectures inspired Newton to his own attack.”

Bell [189], pages 120–121, has the following comment on the earlier development of Newton’s method for solution of polynomial equations by François Viète.

“Improving on the devices of his European predecessors, Vieta gave a uniform method for the numerical solution of algebraic equations. Its nature is sufficiently recalled here by noting that it was essentially the same as Newton’s (1669) given in textbooks.”

46.1.4 Remark: Enlightenment

dates | name | contribution
1616–1703 | John Wallis | generalization of superscript exponential notation
1646–1716 | Gottfried Wilhelm Leibniz | 1673/75: diff/int. calculus; fundamental theorem of calculus; calculus notation. (b. Jul 1, d. Nov 4.)
1642–1727 | Isaac Newton | 1666/84: diff/int. calculus; 1687: Principia; fund. theorem of calculus; celestial mechanics. (b. Dec 25, d. Mar 20.)
1685–1731 | Brook Taylor |
1698–1746 | Colin Maclaurin |
1707–1783 | Leonhard Euler | coined the term “affine”, 1748. (b. Apr 15, d. Sep 18.)

The 18th century was known as the “Age of Reason” because during this time the objections of European religious authorities to scientific progress were overcome and finally made irrelevant. Once again, as during the golden age of classical Greece, critical thinking and insightful discovery replaced ignorant authority. An important step in this was the publication of Lagrange’s work on mechanics in 1788. Bell [189], page 362, wrote the following on this subject.

“The eighteenth century has been called the Age of Reason, also an age of enlightenment, partly because the physical science of that century attained its freedom from theology. In the hundred years from the death of Newton in 1727 to that of Laplace in 1827, dogmatic authority suffered the most devastating of all defeats at the hands of scientific inquiry: indifference. It simply ceased to matter, so far as science was concerned, whether the assertions of the dogmatists were true or whether they were false.”

46.1.5 Remark: Nineteenth century
dates | name | contribution
1724–1804 | Immanuel Kant | “Proved” that Euclidean geometry was “known a priori”. (b. Apr 22, d. Feb 12.)
1736–1813 | Joseph-Louis Lagrange | calculus of variations; Lagrangian mechanics. (b. Jan 25, d. Apr 10.)
1746–1818 | Gaspard Monge | introduced differential geometry; created descriptive geometry, representing solids by means of projections on a plane and forming the basis of engineering drawing. (b. May 9, d. Jul 28.)
1753–1823 | Lazare Nicholas Marguerite Carnot | 1803: Géométrie de position; 1806: Essai sur les transversailles; projective geometry.
1749–1827 | Pierre-Simon Laplace | analysis, celestial mechanics, potential theory. (b. Mar 23, d. Mar 5.)
1802–1829 | Niels Henrik Abel | (b. Aug 5, d. Apr 6.)
1768–1830 | Jean Baptiste Joseph Fourier | (b. Mar 21, d. May 16.)
1811–1832 | Evariste Galois | finite groups. (b. Oct 25, d. May 31.)
1752–1833 | Adrien Marie Legendre |
1781–1840 | Siméon Denis Poisson |
1781–1848 | Bernard Placidus Johann Nepomuk Bolzano | Bolzano-Weierstraß theorem. (b. Oct 5, d. Dec 18.)
1804–1851 | Carl Gustav Jacob Jacobi | Hamilton-Jacobi equation; Jacobian determinant. (b. Dec 10, d. Feb 18.)
1777–1855 | Johann Carl Friedrich Gauß | perfected differential geometry; Gaußian curvature enabled classic non-Euclidean geometries to be described without embedding in Euclidean space. (b. Apr 30, d. Feb 23.)
1789–1857 | Augustin Louis Cauchy | partial differential equations; Cauchy sequences. (b. Aug 21, d. May 23)
1805–1859 | Johann Peter Gustav Lejeune Dirichlet | boundary value problems. (b. Feb 13, d. May 3)
1796–1863 | Jakob Steiner | contributor to projective geometry. (b. Mar 18, d. Apr 1.)
1805–1865 | William Rowan Hamilton | Hamiltonian mechanics. (b. Aug 4, d. Sep 2.)
1826–1866 | Georg Friedrich Bernhard Riemann | 1854: Über die Hypothesen welche der Geometrie zu Grunde liegen; generalized Gaußian curvature to higher dimensions. (b. Sep 17, d. Jul 20.)
1798–1867 | Karl Georg Christian von Staudt | elimination of metrical considerations from projective geometry
1788–1867 | Jean-Victor Poncelet | 1822: Traité des propriétés projectives des figures; following Desargues, effectively created modern projective geometry; introduced imaginary points. (b. Jul 1, d. Dec 22.)
1790–1868 | August Ferdinand Möbius | 1827: Der barycentrische Calcul, includes many of his results on projective and affine geometry; introduced barycentric coordinates. (b. Nov 17, d. Sep 26.)
1811–1874 | Ludwig Otto Hesse |
1809–1877 | Hermann Günter Grassmann | development of a general calculus for vectors (b. Apr 15, d. Sep 26.)
1845–1879 | William Kingdon Clifford | (b. May 4, d. Mar 3.)
1793–1880 | Michel Chasles | 1852: Traité de Géométrie discusses cross ratio. (b. Nov 15, d. Dec 18.)
1808–1882 | Johann Benedict Listing | 1847: Vorstudien zur Topologie; first printed use of word ‘topology’. (b. Jul 25, d. Dec 24.)
1823–1891 | Leopold Kronecker | arithmetization of mathematics. (b. Dec 7, d. Dec 29.)
1821–1895 | Arthur Cayley | reduction of metrical geometry to projective geometry; ‘invented’ matrices. (b. Aug 16, d. Jan 26)
1815–1897 | Karl Theodor Wilhelm Weierstraß | (b. Oct 31, d. Feb 19.)
1842–1899 | Marius Sophus Lie | continuous transformation groups. (b. Dec 17, d. Feb 18.)
The dearth of British names among mathematicians who died between 1750 and 1900 is quite striking. This is sometimes attributed to the isolation of British mathematicians after the silly arguments about who invented the calculus. Newton ostensibly won the argument, but it was a self-defeating victory. British mathematics went into decline. Europeans dominated the development of mathematics thereafter. Bell [190], page 144, makes the following comment on this subject.

“The upshot of it all was that the obstinate British practically rotted mathematically for all of a century after Newton’s death, while the more progressive Swiss and French, following the lead of Leibniz, and developing his incomparably better way of merely writing the calculus, perfected the subject and made it the simple, easily applied implement of research that Newton’s immediate successors should have had the honor of making it.”

(See also some related comments in Remark 18.2.11.)
46.1.6 Remark: Twentieth century

dates | name | contribution
1835–1900 | Eugenio Beltrami | (b. Nov 16, d. Feb 18.)
1829–1900 | Elwin Bruno Christoffel | covariant differentiation. (b. Nov 10, d. Mar 15.)
1822–1901 | Charles Hermite |
1819–1903 | George Gabriel Stokes | the Stokes theorem (?)
1832–1903 | Rudolf Otto Sigismund Lipschitz | (b. May 14, d. Oct 7.)
1864–1909 | Hermann Minkowski | 1908: introduced Minkowski space-time formulation of special relativity (b. Jun 22, d. Jan 12.)
1854–1912 | Jules Henri Poincaré | ‘opened up the road to algebraic topology’; Poincaré conjecture. (b. Apr 29, d. Jul 17.)
1831–1916 | (Julius Wilhelm) Richard Dedekind | set theory; real numbers. (b. Oct 6, d. Feb 12.)
1838–1916 | Ernst Mach | Mach’s principle [term coined 1918 by Einstein]. (b. Feb 18, d. Feb 19.)
1873–1916 | Karl Schwarzschild | singularities in general relativity. (b. Oct 9, d. May 11.)
1845–1918 | Georg Ferdinand Ludwig Philipp Cantor | Set theory 1870, 1883. (b. Mar 3, d. Jan 6.)
1849–1925 | Felix Klein | Erlanger Programm; unification of Euclidean, projective and other non-Euclidean geometries. (b. May 25, d. Jun 22.)
1848–1925 | Friedrich Ludwig Gottlob Frege | set theory foundations; victim of Russell’s paradox (b. Nov 8, d. Jul 26.)
1853–1925 | Gregorio Ricci-Curbastro | developed tensor calculus; absolute differential calculus (b. Jan 12, d. Aug 6.)
1843–1930 | Moritz Pasch | 1882: statement of geometry as a hypothetico-deductive system
1861–1931 | Cesare Burali-Forti | Burali-Forti paradox. (b. Aug 13, d. Jan 21)
1875–1932 | Giuseppe Vitali | proved “existence” of Lebesgue non-measurable sets. (b. Aug 26, d. Feb 29)
1858–1932 | Giuseppe Peano | 1891–95: reduction of considerable part of mathematics to symbolism (b. Aug 27, d. Apr 20.)
1879–1932 | John Wesley Young | gave a strict axiomatic basis for projective geometry
1878–1936 | Marcel Grossmann | discovered relevance of tensor calculus to relativity? (b. Apr 9, d. Sep 7)
1875–1941 | Henri Léon Lebesgue | measure and integration (b. Jun 28, d. Jul 26.)
1873–1941 | Tullio Levi-Civita | 1917: infinitesimal parallel transport; developed tensor calculus (b. Mar 29, d. Dec 29.)
1862–1943 | David Hilbert | 1899: Grundlagen der Geometrie. (b. Jan 23, d. Feb 14.)
1882–1944 | Arthur Stanley Eddington | 1916: Riemannian and affine connections. (b. Dec 28, d. Nov 22.)
1892–1945 | Stefan Banach |
1861–1947 | Alfred North Whitehead | 1910–13: Principia Mathematica.
1873–1950 | Constantin Carathéodory | geometric measure theory
1869–1951 | Élie Cartan | 1901: exterior derivative; 1923–25: defined general connections. (b. Apr 9, d. May 6.)
1871–1953 | Ernst Friedrich Ferdinand Zermelo | 1908: set theory axioms
1879–1955 | Albert Einstein | 1916: general relativity. (b. Mar 14, d. Apr 18.)
1885–1955 | Hermann Klaus Hugo Weyl | 1916: Riemannian and affine connections. (b. Nov 9, d. Dec 8.)
1871–1956 | Félix Edouard Justin Emile Borel | (b. Jan 7, d. Feb 3.)
1903–1957 | John (János) von Neumann | 1922: early version of Bernays-Gödel set theory. Ordinal numbers using successor sets.
1880–1960 | Oswald Veblen | gave a strict axiomatic basis for projective geometry
1891–1965 | Abraham Adolf Fraenkel | 1922: completed Zermelo’s set theory axioms; KEM [121] gives name order “Adolf Abraham”
1881–1966 | Luitzen Egbertus Jan Brouwer | 1908: “the unreliability of the principles of logic”
1872–1970 | Bertrand Arthur William Russell | 1910–13: Principia Mathematica; Russell’s paradox.
1878–1973 | René Maurice Fréchet | 1906: topological compactness
1921–1977 | Alfred Schild | 1970: Schild’s ladder. (d. May 24.)
1888–1977 | Paul Isaak Bernays | Set theory. (b. Oct 17, d. Sep 18.)
1906–1978 | Kurt Gödel | set theory (b. Apr 28, d. Jan 14.)
1909–1978 | Eduard Ludwig Stiefel | 1936: introduced fibre bundles as a distinct concept
1896–1980 | Kazimierz (Casimir) Kuratowski | represented ordered pair $(a, b)$ as $\{\{a\}, \{a, b\}\}$
1918–1988 | Richard Phillips Feynman | (b. May 11, d. Feb 15.)
1906–1993 | Андрей Николаевич Тихонов | Andrei Nikolaevich Tikhonov. Compactness theorem. (b. Oct 30, d. Oct 7.)
1915–2001 | Frederick (Fred) Hoyle | Astrophysics. Cosmology. (b. Jun 24, d. Aug 20.)
1915–2002 | Laurent Schwartz | theory of distributions. (b. Mar 5, d. Jul 4.)
[ Maybe should have a chronology of events here too. ]
46.2. Origins of words and notations

46.2.1 Remark: The word mathematics comes from the Greek word μάθημα meaning “the act of learning; knowledge, learning, science, art, doctrine”, from μανθάνω meaning “to learn, have learnt, know; to ask, inquire, hear, perceive; to understand”. (See for example Feyerabend [201] for Greek translations.) At Plato’s Academy until 529ad, the “mathemata” meant the quadrivium of subjects: music, astronomy, geometry and arithmetic. (See Remark 2.9.8 and Figure 2.9.1.)

46.2.2 Remark: The word geometry (from Greek γεωμετρία meaning “land-surveying, geometry”, literally “land measurement” from γῆ meaning “earth, land; soil, ground, field; empire, home”) was used by Herodotus (c.484–c.425bc) for the Egyptian methods of redetermining land boundaries after the annual flooding of the Nile.

[ Try to use Arabic font for Arabic names. ]

46.2.3 Remark: The word algorithm comes from “al Khwārizmi”, a nickname of the 9th century Arab mathematician Abū Ja’far Mohammed ibn Mūsā from the town of Kwārizm (died c.850).

46.2.4 Remark: The word algebra comes from the Arabic word “al Jebr”, which means “bone-setting”. This seems to come from the book “al jebr w’almuquabala” (“restoration and reduction”) written by al Khwārizmi. (See Bell [189], page 99.) Presumably algebra is likened to “setting bones” because bone-setting restores the patient to the normal condition, while algebra restores the unknown variables which have been modified by arithmetic operations.

46.2.5 Remark: The addition and subtraction symbols “ + ” and “ − ” were invented in 1489 by J.W. Widmann in Germany, according to Bell [189], page 97. Ball [187], page 206, attributes these symbols to Johannes Widman, born about 1460. (Yet another source gives the name Johannes Widman, 1462–1498, and the name of the 1489 publication as “Behende und hupsche Rechnung auf allen kauffmanschafft”.)

46.2.6 Remark: The equals sign “ = ” was invented by Robert Recorde, “Whetstone of witte”, 1557. According to Bell [189], page 129, Recorde said that: “no two things could be ‘moare equalle’ than ‘a paire of parelleles’.” According to Beckmann [188], page 12, Recorde wrote that “noe .2. thynges, can be moare equalle.”
46.2.7 Remark: Rafael Bombelli’s book “Algebra” in 1572 introduced a consistent theory of imaginary complex numbers, including arithmetic for negative numbers. (The name is also spelled Raffael and Rafaello.) He also introduced an unusual notation for variables such as $x$ and $x^2$. (See Bell [189], page 174; Lakoff/Núñez [172], page 73; Struik [193], page 85; Ball [187], page 228.)

46.2.8 Remark: Descartes introduced the notations $x$, $xx$, $x^3$, $x^4$, etc. for powers of $x$ (Bell [189], page 144). Gauß wrote $xx$ instead of $x^2$. Euler still used the $xx$ notation in 1748. Gauß said that “neither is more wasteful of space than the other”, according to Bell [189], page 125.

46.2.9 Remark: The notations $x^{-n}$ and $x^{1/n}$, instead of $1/x^n$ and $\sqrt[n]{x}$, were introduced by John Wallis in 1655, according to Bell [189], page 129.

46.2.10 Remark: Newton developed the dot notation $\dot y$, $\ddot y$, etc. for derivatives. (Bell [189], page 127.)

46.2.11 Remark: Leibniz invented the derivative notation $\frac{dy}{dx}$ and the integral sign $\int$ from the Latin “summa”. (See Bell [189], page 153.) According to Bell [190], page 99, referring to the notation $\frac{dy}{dx}$: “This symbolism is due (essentially) to Leibniz and is the one in common use today; Newton used another ($\dot y$) which is less convenient.” Similarly, the sum symbol $\sum$ is the Greek letter “S”, related to the word “sum”.

46.2.12 Remark: The word function in the mathematical sense was apparently introduced by Leibniz. Bell [190], page 98, says: “The word function (or its Latin equivalent) seems to have been introduced into mathematics by Leibniz in 1694; the concept now dominates much of mathematics and is indispensable in science. Since Leibniz’ time the concept has been made precise.”

46.2.13 Remark: Beckmann [188], page 12, says that the notation $\pi$ was not used for the ratio of circumference to diameter of a circle until the 18th century. Beckmann [188], page 145, says that a Welsh mathematician, William Jones (1675–1749), used the $\pi$ notation in 1706 in Synopsis Palmariorum Matheseos. He says also that Euler first used the $\pi$ notation in 1737 in his Variae observationes circa series infinitas.
46.2.14 Remark: The word affine was introduced by Leonhard Euler for affine spaces and affine maps in 1748 in his book Introductio in analysin infinitorum [103], volume 2, chapter 18, section 442, as explained in Section 46.3.

46.2.15 Remark: The term group as an algebraic structure was invented by Evariste Galois (1811–32), according to Bell [189], page 234.

46.2.16 Remark: The integer congruence notation $a \equiv b \pmod{m}$ was apparently invented by Gauß. (Bell [190], page 225.)

46.2.17 Remark: The word topology was first used in print by Johann Benedict Listing in “Vorstudien zur Topologie” (1847), according to EDM2 [34], article 426. The earlier term for this word was “analysis situs”, which is Latin for “analysis of place”. (The word “topology” is a Greek-based construction meaning “study of place” from τόπος meaning “place, spot; passage in a book; region, district; space, locality; position, rank, opportunity” and λόγος meaning “thought, reasoning, computation, reckoning, deliberation, account, consideration, opinion” [and 44 other meanings in my dictionary].)

It is not immediately obvious why the word “topology” is used for the study of continuity. The connection can be seen by thinking about what a set would be without a topology. Then every point is equivalent; for instance, there would be no concept of a continuous curve. But with a topology, continuous curves are constrained to move smoothly from one point to another. Thus if progressively shorter segments of a curve are considered, they are progressively more local to their starting point. In other words, a topology gives a set a sense of place or locality. This becomes even clearer when considering how neighbourhoods are used in defining interior, exterior and boundary points of sets. In fact, the aptness of the word “topology” is most clear when a topology is defined in terms of open bases at all points rather than open sets.

46.2.18 Remark: The assertion symbol ⊢ was perhaps (?) invented in 1879 by Frege.

[ Is it possible that the QED symbol “ ” at the end of proofs in modern texts originated from a bold right square bracket? Taylor [144] used a very solid bold right square bracket in 1966. Olaf told me in 2005 that Paul Halmos claimed in “I want to be a mathematician” that he (Halmos) invented this style of QED symbol. ]
46.2.19 Remark: The word homeomorphism was introduced by Henri Poincaré in 1895. (See EDM2 [34], section 425.G.) This comes from Greek omoio meaning “like, similar, resembling; the same, of the same rank; equal citizen; equal; common, mutual; a match for; agreeing, convenient”, and morfh meaning “form, shape, figure, appearance, fashion, image; beauty, grace”.

46.3. Etymology of affine spaces
This section deals with the historical origin of the word “affine” in expressions such as “affine spaces” and “affine transformations”. It seems that the term was introduced by Euler in a slightly erroneous fashion in 1748, and was later defined in the modern sense by Möbius in 1827. The word “affine” does not appear in many English dictionaries. The word “affin/affine” meant “similar” in French from the 12th to the 16th centuries and then disappeared, but reappeared in the mid-19th century [213]. Some sample definitions from various dictionaries are summarized in the following table.

language   dictionary             word      definition
Latin      White [217]            affinis   bordering upon, adjacent to, allied, kindred; a connection or relation by marriage.
English    Oxford shorter [211]   affin/e   1509. A relation by marriage; a connection; closely related.
French     Petit Robert [213]     affin/e   which conserves invariant, by linear relations, transformations in the plane or in space
German     Wahrig [215]           affin/e   parallel-related (from Latin “affinis”: adjacent, adjoining)
German     Duden [199]            affin     produced by parallel projection of one plane onto a second
Italian    Sansoni [207]          affine    similar, allied, kindred, alike
Spanish    Cassell [198]          afin      contiguous, adjacent; allied, related, similar
The term “affine” in geometry seems to have been introduced by Leonhard Euler in 1748 in his book Introductio in analysin infinitorum [103], volume 2, chapter 18, section 442. This is quoted by August Ferdinand Möbius in 1827 in his book Der barycentrische Calcul [75], pages 194–195, as the source for his adoption of the term. So apparently Euler was the first mathematician to use the word “affine” for general linear transformations and Möbius was the second. But the truth is more complex than this.
Euler defined two figures to be affine if they could be oriented and translated so that one could be obtained from the other by a scaling in $\mathbb{R}^2$ such as $(x, y) \mapsto (ax, by)$ (whereas a similarity transformation has the form $(x, y) \mapsto (ax, ay)$, of course). Euler’s thinking on this seems to have been rather woolly. This relation is not an equivalence relation since it is not transitive. (For instance a square can be deformed to a rectangle through one axis, or to a diamond through an axis at 45° to this, but there is no single two-scale scaling which can transform the diamond into the rectangle.) Therefore the set of such transformations is not closed under composition and so does not form a group. Euler’s use of this non-transitive relation is quite understandable since geometric thinking in terms of transformation groups, equivalence relations and invariants did not really take off until the 19th century.
Möbius noted that Euler claimed that for any two affine-related figures in the plane, there must always be a pair of axes for which by scaling the figure differently in those two directions the two figures could be matched. In other words, Euler effectively implied that an affine relation between two figures could be expressed as a translation combined with a transformation matrix such as
$$
\begin{pmatrix} \cos\theta_1 & -\sin\theta_1 \\ \sin\theta_1 & \cos\theta_1 \end{pmatrix}
\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}
\begin{pmatrix} \cos\theta_2 & -\sin\theta_2 \\ \sin\theta_2 & \cos\theta_2 \end{pmatrix},
$$
for some $a, b, \theta_1, \theta_2 \in \mathbb{R}$. The composition of such matrices yields matrices with non-zero off-diagonal entries. The non-closure of Euler’s transformations implies that the relation is not transitive, and therefore his relation is not an equivalence relation as he probably had assumed it would be.
Between Euler in 1748 and Möbius in 1827, no one seems to have taken much interest in affine spaces. Although Euler apparently coined the word, Möbius was probably the real inventor of affine transformations as a subject for study. The central concerns in geometry between the times of Desargues and Möbius were clearly metric-invariant, conformal-invariant and projective geometries, and somehow there was no motivation to consider affine-invariant geometry as a special topic. But Möbius found applications to the engineering problem of determining the centre of mass of a structure. The centre of mass is preserved under affine transformations but not under projections. In fact, the affine transformations make up precisely the group under which convex combinations such as the centre of gravity are invariant.
Within a few decades, the projective, affine, conformal and metric transformation groups were systematized within the framework of the so-called Erlanger Programm (named after Erlangen University, where Felix Klein proposed the program in 1872). The idea of the Erlanger Programm was to study a wide range of geometries, each specified as the set of properties and relations of special subsets (the “figures”) of a given set X which are invariant under a group G of transformations of X which define a generalized notion of congruence. In the case of affine spaces, X is a linear space and G is the set of all affine transformations – the group of all combinations of translations and invertible linear transformations. It seems that this sort of metageometrical point of view did not originate in the Erlanger Programm but was rather merely systematized in Klein’s proposal.
On reading Euler’s original text on the “affinity” relation, it becomes clear that he was not much interested in the significance for geometry. He was interested in the graphs of parametrized families of algebraic functions rather than in the geometry of those graphs as geometrical objects. Here is paragraph 442 of Euler’s Introductio in analysin infinitorum [103], volume 2, chapter 18.
442. Quemadmodum in curvis similibus abscissae et applicatae homologae in eadem ratione sive augentur sive diminuuntur, ita, si abscissae aliam sequantur rationem, aliam vero applicatae, curvae non amplius orientur similes. Verum tamen, quia curvae hoc modo ortae inter se quandam affinitatem tenent, has curvas affines vocabimus; complectitur ergo affinitas sub se similitudinem tanquam speciem, quippe curvae affines in similes abeunt, si ambae illae rationes, quas abscissae et applicatae seorsim sequuntur, evadant aequales. Ex curva ergo quacunque data AMB innumerabiles curvae affines (Fig. 88 et 89) amb reperientur hoc modo: sumatur abscissa ap, ita ut sit AP : ap = 1 : m; harum rationum 1 : m et 1 : n vel alterutram vel utramque, innumerabiles prodibunt curvae, quae primae AMB erunt affines.
This may be translated as follows.
442. Just as the corresponding abscissae and ordinates in similar curves are augmented or diminished in the same ratio, so, if the abscissae follow one ratio and the ordinates follow a different ratio, the curves are no longer similar. Nevertheless, because curves arising in this way have a certain affinity to each other, we will call these curves affine; affinity therefore encompasses the similarity idea, so to speak; in fact, affine curves change to similar curves if both of those ratios, which the abscissae and ordinates separately follow, happen to be equal. Therefore from any given curve AMB, countless affine curves amb (Fig. 88 and 89) are found in this way: the abscissa ap is chosen so that AP : ap = 1 : m; then the ordinate pm is determined so that PM : pm = 1 : n; and thus by changing these ratios 1 : m and 1 : n, either both or one at a time, countless curves will be produced which are affine to the first AMB.
Euler’s figures 88 and 89 are illustrated in Figure 46.3.1. The X-axis is vertical and the Y-axis is oriented to the left. The lines PM and pm represent the ordinates, whereas AP and ap represent the abscissae.
[Figure 46.3.1: Euler’s figures 88 and 89 in the Introductio]
The editor, Andreas Speiser [104], of the complete works of Euler makes this comment regarding chapter 18.
Kapitel 18 handelt von ähnlichen und affinen Kurven, ersteres mit der Substitution x = mu, y = mv, letzteres mit der allgemeineren x = mu, y = nv. Der Ausdruck “affin” ist wohl hier von Euler eingeführt worden.
This may be translated as follows.
Chapter 18 deals with similar and affine curves, the former with the substitution x = mu, y = mv, the latter with the more general x = mu, y = nv. The expression “affine” is no doubt introduced here by Euler.
In other words, this passage appears to be the origin of the term “affine” in this geometrical sense. But it seems that Euler did not do much with it. The subject seems to have not taken off until Möbius took it up and used the same term that Euler had introduced. Here is the relevant comment by Möbius [75], section 147, page 195, just after quoting Euler’s paragraph 442.
Der von Euler hier aufgestellte Begriff der Affinitas ist also ganz mit dem vorhin entwickelten einerlei, und ich will daher gleichfalls diese allgemeinere Verwandtschaft Affinität, und Figuren, zwischen denen sie statt findet, affine Figuren nennen.
Translated into English, this is as follows.
The concept of affinity proposed here by Euler is thus entirely the same as that which was developed above, and likewise, I want to call this more general relation affinity, and call figures between which this relation exists affine figures.
In paragraph 443, Euler discusses how to substitute scaled variables in place of the x and y variables in a given equation. But in paragraph 444, he talks about the supposed distinction between similar and affine curve relationships.
444. Discrimen autem inter curvas similes et affines hoc potissimum est notandum, quod curvae, quae sunt similes respectu unius axis vel puncti fixi, eaedem similes sint futurae respectu aliorum quorumvis axium seu punctorum homologorum. Curvae autem, quae tantum sunt affines, tales tantum sunt respectu eorum axium, ad quos referentur, neque pro lubitu alii axes seu puncta homologa in ipsis dantur, ad quae affinitas referri possit. [. . . ]
This may be translated as follows.
444. Now the most powerful distinction to be noted between similar and affine curves is that curves which are similar in respect of a single axis or fixed point are going to be similar with respect to any other axes or corresponding points. On the other hand, curves which are only affine, are such only in respect of the axes to which they are referred, and other axes or corresponding points are not given in themselves arbitrarily, to which the affinity may be referred. [. . . ]
Euler says here that the axes to be used for a similarity transformation are arbitrary, which is true, but that the axes for an affinity transformation cannot be chosen arbitrarily. Möbius states that in fact the axes for an affinity transformation are not only of arbitrary orientation but are also not necessarily orthogonal, although he continues to use two scale factors m and n for the scalings in the two axial directions. Thus Möbius does not resort to linear combinations of coordinates. He still uses a diagonal matrix effectively, but axes are chosen in different angles in each of the two figures which are supposed to be affine. Clearly Euler thought only in terms of orthogonal coordinates in both figures, with no off-diagonal components in the transformation matrix.
Oddly, though, in paragraphs 452–454, Euler discusses orthogonal transformations using sines and cosines of rotation angles. But he didn’t say anything about making both the diagonal components different and also the off-diagonal components non-zero at the same time. This just shows that Euler was not thinking at all in terms of what we think of as affine transformations today. But in terms of etymology, there seems little doubt that Euler was the originator of the term “affine” for a concept which directly developed into the transformation group that we know today.
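The non-transitivity of the two-scale relation discussed in this section can be checked numerically. The following Python sketch is only an illustration under a stated assumption, not a reconstruction of anything in Euler or Möbius: it assumes that a “two-scale scaling” means scaling by factors a and b along a single pair of orthogonal axes, that is, a matrix of the form R D Rᵀ with R a rotation and D diagonal. Such matrices are exactly the symmetric ones, so symmetry is used as the test. All function names are hypothetical.

```python
import math

def rot(theta):
    """Rotation matrix for angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(p):
    return [[p[j][i] for j in range(2)] for i in range(2)]

def inverse(p):
    """Inverse of an invertible 2x2 matrix."""
    det = p[0][0] * p[1][1] - p[0][1] * p[1][0]
    return [[p[1][1] / det, -p[0][1] / det],
            [-p[1][0] / det, p[0][0] / det]]

def scaling(theta, a, b):
    """Assumed form of a two-scale scaling: factors a, b along
    orthogonal axes rotated by theta, i.e. R D R^T."""
    r = rot(theta)
    return matmul(matmul(r, [[a, 0.0], [0.0, b]]), transpose(r))

def is_symmetric(p, tol=1e-12):
    """A 2x2 matrix has the form R D R^T exactly when it is symmetric."""
    return abs(p[0][1] - p[1][0]) < tol

# Square -> rectangle: stretch along the coordinate axes.
square_to_rectangle = scaling(0.0, 2.0, 1.0)
# Square -> diamond: stretch along axes at 45 degrees.
square_to_diamond = scaling(math.pi / 4, 3.0, 1.0)
# Induced map diamond -> rectangle (undo the second, apply the first).
diamond_to_rectangle = matmul(square_to_rectangle, inverse(square_to_diamond))

print(is_symmetric(square_to_rectangle))
print(is_symmetric(square_to_diamond))
print(is_symmetric(diamond_to_rectangle))
```

Under this assumption, the first two maps are symmetric but the induced diamond-to-rectangle map is not, so no single pair of orthogonal axes carries the diamond to the rectangle by a pure two-scale scaling.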
46.4. Logical language in ancient literature

46.4.1 Remark: Logical language in the epic of Gilgamesh
The incidence of logical language in the epic of Gilgamesh is discussed in Remark 3.5.2. The epic of Gilgamesh contains the following 9 instances of the word “if”. The last line of the following passage [202, page 19] (repeated as lines II.284–286) is an IF-clause.
‘So to keep safe the cedars,   II.227
Enlil made it his lot to terrify men;   II.228
if you penetrate his forest you are seized by the tremors.’   II.229
The following passage is in lines VI.96–100 [202, page 51].
‘If you do not give me the Bull of Heaven,   VI.96
I shall smash the gates of the Netherworld, right down to its dwelling,   VI.97
to the world below I shall grant manumission,   VI.98
I shall bring up the dead to consume the living,   VI.99
I shall make the dead outnumber the living.’   VI.100
A little later, there is the following [202, page 51].
‘If you want from me the Bull of Heaven,   VI.103
let the widow of Uruk gather seven years’ chaff,   VI.104
and the farmer of Uruk grow seven years’ hay.’   VI.105
And a little later, there is the following [202, page 52].
‘Had I caught you too, I’d have treated you likewise,   VI.156
I’d have draped your arms in its guts!’   VI.157
On the next tablet, there is the following [202, page 56].
‘Had I but known, O door, that so you would repay me,   VII.47
had I but known, O door, that so you would reward me,   VII.48
I would have lifted my axe, I would have cut you down,   VII.49
I would have floated you down as a raft to Ebabbara.’   VII.50
On tablet X, there is the following [202, page 77].
‘If you and Enkidu were the ones who slew the Guardian,   X.36
destroyed Humbaba, who dwelt in the Forest of Cedar,   X.37
killed lions in the mountain passes,   X.38
seized and slew the Bull come down from heaven –   X.39
why are your cheeks so hollow, your face so sunken,   X.40
your mood so wretched, your visage so wasted?’   X.41
The form of lines X.76–77 [202, page 78] (repeated as lines X.153–154) is (A ⇒ B) ∧ ((¬A) ⇒ C). This is a two-way decision tree. It suggests the proposition A ∨ ¬A or the corresponding exclusive disjunction.
‘If it may be done, I will cross the ocean,   X.76
if it may not be done, I will wander in the wild!’   X.77
A little later, there is the following echo [202, page 79].
‘If it may be done, go across with him,   X.90
if it may not be done, turn around and go back!’   X.91
Later, there is an unfinished IF-sentence [202, page 86].
‘If, Gilgamesh, the temples of the gods have no provisioner,   X.288
the temples of the goddesses . . . ’   X.289

46.4.2 Remark: Logical language in Bēowulf
As mentioned in Remark 3.5.4, there are 16 OR-constructions, 26 IF-constructions, and numerous “unless” and “except” constructions in the 3182 lines of Bēowulf. So to quote them all would be burdensome, especially if presented in both Old English and modern English. Therefore only a small sample of logical expressions from that work is given here. The following instance of “if” occurs in lines 1380–1382 (Wrenn [218], page 153).
“Ic þē þā fǣhðe fēo lēanige,   1380
eald-gestrēonum, swā ic ǣr dyde,   1381
wundini golde, gyf þū on weg cymest.”   1382
The translation of this passage by Alexander [194], page 94, is as follows.
‘I shall reward the deed, as I did before, with wealthy gifts of wreathèd ore, treasures from the hoard, if you return again.’
The following instance of “or” occurs in lines 1490–1491 (Wrenn [218], page 157).
“ic mē mid Hruntinge   1490
dōm gewyrce, oþðe mec dēað nimeð.”   1491
The translation of this passage by Alexander [194], page 98, is as follows.
‘With Hrunting shall I achieve this deed – or death shall take me!’
The following quotation shows how many “or” alternatives may be chained together (Wrenn [218], page 165).
“Nū is þīnes mægnes blǣd   1761
āne hwīle; eft sōna bið   1762
þæt þec ādl oððe ecg eafoþes getwǣfeð,   1763
oððe fȳres feng oððe flōdes wylm   1764
oððe gripe mēces oððe gāres fliht   1765
oððe atol yldo, oððe ēagena bearhtm   1766
forsiteð ond forsworceð; semninga bið,   1767
þæt ðec, dryht-guma, dēað oferswȳðeð.”   1768
The translation of this passage by Alexander [194], pages 106–107, is as follows.
‘The noon of your strength shall last for a while now, but in a little time sickness or a sword will strip it from you: either enfolding flame or a flood’s billow or a knife-stab or the stoop of a spear or the ugliness of age; or your eyes’ brightness lessens and grows dim. Death shall soon have beaten you then, O brave warrior!’
The following quotation shows an example of the word “nefne”, meaning “unless” (Wrenn [218], page 178).
“Gēn is eall æt ðē   2149
lissa gelong; ic lȳt hafo   2150
hēafod-māga, nefne Hygelāc ðec!”   2151
The translation of this passage by Alexander [194], page 118, is as follows.
‘Joy, for me, always lies in your gift. Little family do I have in the world, Hygelac, besides yourself.’
In the above case, the word “nefne” is best translated as “except” or “besides”. In the following example, the opposite is true. The following quotation shows an example of the word “būtan”, meaning “except” or “without” (Wrenn [218], page 136).
“Ic hine hrædlīce heardan clammum   963
on wæl-bedde wrīþan þōhte,   964
þæt hē for mund-gripe mīnum scolde   965
licgean līf-bysig, būtan his līc swice.”   966
The translation of this passage by Alexander [194], page 81, is as follows.
‘I had meant to catch him, clamp him down with a cruel lock to his last resting-place; with my hands upon him, I would have him soon in the throes of death – unless he disappeared!’
Chapter 47 Exercise questions
47.1 Logic
47.2 Sets, relations and functions
47.3 Numbers
47.4 Algebra
47.5 Linear algebra
47.6 Tensor algebra
47.7 Topology
47.8 Topological fibre bundles
47.9 Topological manifolds
These exercises are intended for the reader’s self-study, as active learning for the concepts in the book. They are not intended for use in exam questions or any other form of assessment. The answers will be given in full (in Chapter 48) to ensure that there is no incompleteness in the presentation due to missing proofs of theorems which the reader is expected to provide.
Since answers will be given for all exercises, teachers may find these exercises useless as homework and exam questions. Questions for assessment can easily be constructed by morphing the exercises in this book (and all other DG books). The difficult part of designing exam questions is to ensure that the answers will not be too difficult or too easy. So it is helpful to see the answers (as in this book) to see what the level of difficulty might be. Hopefully a slightly morphed question will have a slightly morphed answer. (Of course this is not always so!)
This chapter is organized into sections which do not correspond exactly to the chapters of the book, but the topic order should be roughly the same.

47.1. Logic
47.1.1 Exercise (→ 48.1.1): Give a logical statement which is equivalent to τ(A) + τ(B) + τ(C) ≤ 2 for propositions A, B and C. (See Remark 4.3.15.)
47.1.2 Exercise (→ 48.1.2): Give a logical statement which is equivalent to τ(A) + τ(B) + τ(C) = 2 for propositions A, B and C. (See Remark 4.3.15.)
47.1.3 Exercise (→ 48.1.3): Determine the possible truth values of proposition variables A, B and C, given that A ∧ (B ∨ C) is true and (A ∨ B) ∧ C is false. (See Remark 4.4.2.)
47.1.4 Exercise (→ 48.1.4): Prove Theorem 4.10.2 directly without the “Deduction Theorem”. Theorems which are not tainted by the naive “Deduction Theorem” may be used in the proof.
47.1.5 Exercise (→ 48.1.5): Prove Theorem 4.11.11.
47.1.6 Exercise (→ 48.1.6): Write down a logical expression which means “there exist at least 3 things x, y and z with property P”. In other words, $\exists_3 x, P(x)$. (See Remark 4.16.8.)
47.1.7 Exercise (→ 48.1.7): Write down a logical expression which means “there exist at most 3 things x, y and z with property P”. In other words, $\exists^3 x, P(x)$. (See Remark 4.16.8.)
47.1.8 Exercise (→ 48.1.8): Write down a logical expression which means “there exist exactly 3 things x, y and z with property P”. In other words, $\exists^3_3 x, P(x)$. (See Remark 4.16.8.)
47.1.9 Exercise (→ 48.1.9): Write down a logical expression which means “there exist at least 2 and at most 3 things with property P”. In other words, $\exists^3_2 x, P(x)$. (See Remark 4.16.8.)
47.1.10 Exercise (→ 48.1.10): For general non-negative integer n, write down a logical expression which means “there exist at least n things with property P”. In other words, $\exists_n x, P(x)$. (See Remark 4.16.8.)
47.1.11 Exercise (→ 48.1.11): For general non-negative integer n, write down a logical expression which means “there exist at most n things with property P”. In other words, $\exists^n x, P(x)$. (See Remark 4.16.8.)
47.1.12 Exercise (→ 48.1.12): For general non-negative integer n, write down a logical expression which means “there exist exactly n things with property P”. In other words, $\exists^n_n x, P(x)$. (See Remark 4.16.8.)
47.1.13 Exercise (→ 48.1.13): For general non-negative integers m and n with m < n, write down a logical expression which means “there exist at least m and at most n things with property P”. In other words, $\exists^n_m x, P(x)$. (See Remark 4.16.8.)
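Answers to the propositional exercises in this section, such as Exercise 47.1.3, can be cross-checked mechanically by enumerating all truth assignments. The following Python sketch is one possible way to do this; it is not taken from the book, and the names `satisfying_assignments` and `exercise_47_1_3` are hypothetical. It simply lists every assignment of truth values to A, B and C that satisfies a given constraint.

```python
from itertools import product

def satisfying_assignments(constraint, names=("A", "B", "C")):
    """Return every truth assignment (as a dict) for which constraint holds."""
    result = []
    for values in product([False, True], repeat=len(names)):
        env = dict(zip(names, values))
        if constraint(**env):
            result.append(env)
    return result

def exercise_47_1_3(A, B, C):
    """A ∧ (B ∨ C) is true and (A ∨ B) ∧ C is false."""
    return (A and (B or C)) and not ((A or B) and C)

solutions = satisfying_assignments(exercise_47_1_3)
for env in solutions:
    print(env)
```

The same helper can be reused for other constraints on A, B and C by passing a different function.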
47.2. Sets, relations and functions
47.2.1 Exercise (→ 48.2.1): Show that ∃x ∈ S, P(x) is equivalent to ¬(∀x ∈ S, ¬P(x)). (See Remark 5.1.15.)
47.2.2 Exercise (→ 48.2.2): Show that the combination of the specification axiom in line (5.4.1) and the replacement axiom in line (5.4.2) implies the single replacement axiom in Definition 5.1.26 (6). (See Remark 5.4.3 for discussion.)
47.2.3 Exercise (→ 48.2.3): Show that the ZF replacement axiom, Definition 5.1.26 (6), implies the specification axiom in line (5.4.1) of Remark 5.4.3.
47.2.4 Exercise: In ZF set theory (Section 5.1), an infinite number of set membership relations “on the left” is prohibited by the regularity axiom (7), but an infinite number of such relations “on the right” is perfectly okay. Why is there this asymmetry?
47.2.5 Exercise: Rewrite the ZF set theory axioms to be consistent with ∀x, x ∈ x, but only if it is possible to do this. (See Remark 5.7.17.)
47.2.6 Exercise (→ 48.2.4): Prove Theorem 5.13.4.
47.2.7 Exercise (→ 48.2.5): Prove Theorem 5.13.13.
47.2.8 Exercise (→ 48.2.6): Prove Theorem 5.14.7.
47.2.9 Exercise (→ 48.2.7): Prove Theorem 5.14.11.
47.2.10 Exercise (→ 48.2.8): Prove Theorem 5.14.13.
47.2.11 Exercise (→ 48.2.9): Prove Theorem 5.14.16.
47.2.12 Exercise (→ 48.2.10): Prove Theorem 6.1.12.
47.3. Numbers
47.3.1 Exercise (→ 48.3.1): Write the ordinal numbers 0, 1, 2, 3, 4, 5 explicitly in terms of the empty set. (See Remark 7.2.14.)
47.3.2 Exercise (→ 48.3.3): Draw a diagram of the ordinal number 10 in the style of Figure 7.2.1. (See Remark 7.2.15.)
47.3.3 Exercise: Show that the successor set $S^+ = S \cup \{S\}$ in Theorem 7.2.17 is well-defined for all sets S in ZF set theory.
47.3.4 Exercise (→ 48.3.2): Prove Theorem 8.6.10.
47.3.5 Exercise: Prove Theorem 8.6.18.
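For Exercises 47.3.1 to 47.3.3, the successor operation S ∪ {S} can be tried out on small cases. The following Python sketch assumes that the finite ordinals are built in the usual von Neumann way, with 0 as the empty set and each successor formed as S ∪ {S}, which matches the successor set named in Exercise 47.3.3 but is otherwise an assumption about Remark 7.2.14. The function names are hypothetical, and Python’s `frozenset` stands in for hereditarily finite sets.

```python
def successor(s):
    """Return the successor set S ∪ {S} of a frozenset S."""
    return s | frozenset([s])

def ordinal(n):
    """Build the finite ordinal n by applying successor n times to ∅."""
    s = frozenset()
    for _ in range(n):
        s = successor(s)
    return s

def show(s):
    """Render a hereditarily finite set with ∅ and braces,
    listing elements in order of increasing size."""
    if not s:
        return "∅"
    return "{" + ", ".join(show(e) for e in sorted(s, key=len)) + "}"

for n in range(6):
    print(n, "=", show(ordinal(n)))
```

Sorting by size is enough here because the elements of a finite ordinal n are the ordinals 0 to n − 1, which all have different sizes.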
47.4. Algebra
47.4.1 Exercise (→ 48.4.1): Construct a group G and a subgroup S of G such that $\forall g \in G,\ gSg^{-1} = g^{-1}Sg$, but $\exists h \in G,\ hS \ne Sh$. In other words, S is not a normal subgroup of G, although the two possible definitions for the conjugate of S are equivalent. (Hint: $h^2Sh^{-2} = S$, but $hSh^{-1} \ne S$. So an element h of order 2 would be a good hunch.)
47.4.2 Exercise: Referring to Example 9.3.17, construct a similar example with SO(3) on $\mathbb{R}^3$ with h equal to an arbitrary rotation or reflection and g equal to an arbitrary rotation.
47.4.3 Exercise: Show that if $h = S_\phi$, then $g^{-1}hg = S_{\phi-\theta}$ and $ghg^{-1} = S_{\phi+\theta}$. (See Example 9.3.17.)
47.4.4 Exercise: Prove the statements in Remark 9.3.19.
47.4.5 Exercise: Prove Theorem 9.3.20.
47.4.6 Exercise: Prove the statements in Remark 9.4.15.
47.4.7 Exercise: Referring to Example 9.4.32, show by calculation in coordinates that for rotations around each of the axes in $\mathbb{R}^3$, points on the sphere $S^2 \subseteq \mathbb{R}^3$ are mapped to other points which lie on circles on the sphere with circle centres on the respective axes.
47.4.8 Exercise: Verify that the tuples (G, X, σ, µ) in Definitions 9.4.36 and 9.4.37 satisfy Definition 9.4.4 for a left transformation group.
47.4.9 Exercise: Verify that the skew products in Definitions 9.6.4, 9.6.5, 9.6.8 and 9.6.9 satisfy Definitions 9.4.4 and 9.5.2 for left and right transformation groups.

47.5. Linear algebra
47.5.1 Exercise: Show that for any one-dimensional linear space V over a field K,
∀φ ∈ End(V), ∃λ ∈ K, ∀v ∈ V, φ(v) = λv.
See Remark 10.4.1 for context.
47.5.2 Exercise: Prove Theorem 11.1.30.
47.5.3 Exercise (→ 48.5.1): Prove Theorem 11.5.5.
47.6. Tensor algebra
47.6.1 Exercise (→ 48.6.1): Prove Theorem 13.2.2.
47.6.2 Exercise (→ 48.6.2): Prove Theorem 13.3.6.
47.6.3 Exercise (→ 48.6.3): Prove Theorem 13.3.7.
47.7. Topology
47.7.1 Exercise: Show that the trivial and discrete topologies in Definitions 14.2.18 and 14.2.19 satisfy the requirements of Definition 14.2.3.
47.7.2 Exercise: Tabulate the set of all possible topologies on a four-element set. (See Example 14.3.6.)
47.7.3 Exercise (→ 48.7.1): Show by direct calculation that Theorem 14.8.8 is valid for X = ∅.
47.7.4 Exercise (→ 48.7.2): Prove Theorem 15.2.10.
47.7.5 Exercise: Determine whether the empty topology (X, T) = (∅, {∅}) is second countable, separable, Hausdorff, normal, connected, locally connected, simply connected, compact, sequentially compact, locally compact or paracompact. See Chapter 15.
47.7.6 Exercise: Determine whether the single point topology (X, T) = ({x}, {∅, {x}}) is second countable, separable, Hausdorff, normal, connected, locally connected, simply connected, compact, sequentially compact, locally compact or paracompact. See Chapter 15.
47.7.7 Exercise: Determine whether the trivial two-point topology (X, T) = ({x, y}, {∅, {x, y}}) is second countable, separable, Hausdorff, normal, connected, locally connected, simply connected, compact, sequentially compact, locally compact or paracompact. See Chapter 15.
47.7.8 Exercise: Determine if the discrete two-point topology (X, T) = ({x, y}, {∅, {x}, {y}, {x, y}}) is second countable, separable, Hausdorff, normal, connected, locally connected, simply connected, compact, sequentially compact, locally compact or paracompact. See Chapter 15.
47.7.9 Exercise (→ 48.7.3): Prove Theorem 15.2.17.
47.7.10 Exercise: Verify that the lexicographic total order on any Cartesian product $X^n$ for any totally ordered set X and integer $n \in \mathbb{Z}^+_0$ which is defined by
$$ x \le y \iff \forall j \in \mathbb{N}_n,\ \bigl( (\forall i \in \mathbb{N}_n,\ i < j \Rightarrow x_i = y_i) \Rightarrow x_j \le y_j \bigr) $$
is equivalent to the order
$$ x \le y \iff \forall j \in \mathbb{N}_n,\ \bigl( (x_j \le y_j) \vee (\exists i \in \mathbb{N}_n,\ (i < j \wedge x_i \ne y_i)) \bigr). $$
(See Section 7.1 and Remark 16.1.8 for context.)
47.7.11 Exercise: Prove Theorem 17.1.5.
47.7.12 Exercise: Prove Theorem 17.1.12.
47.7.13 Exercise: Prove Theorem 17.1.14.

47.8. Topological fibre bundles
47.8.1 Exercise (→ 48.8.1): Prove the statement in Remark 23.6.6 that the general empty topological (G, F) fibre bundle for non-empty F has the form $(E, T_E, \pi, B, T_B, A^F_E) = (\emptyset, \{\emptyset\}, \emptyset, \emptyset, \{\emptyset\}, A^G_P)$, where $A^G_P = \emptyset$ or $\{\emptyset\}$.
47.8.2 Exercise (→ 48.8.2): Show that condition (23.10.1) in Remark 23.10.4 and Definition 23.10.3 (ii) are equivalent.
47.8.3 Exercise (→ 48.8.3): Prove the statement in Remark 23.12.7 that the relation $\phi(z')y' = \phi(z)y$ in Definition 23.12.3 (i) is independent of the choice of $\phi \in A^G_P$. In other words, prove that
$$ \forall b \in B,\ \forall z, z' \in q^{-1}(\{b\}),\ \forall y, y' \in F,\ \forall \phi_1, \phi_2 \in A^G_P,\quad \phi_1(z')y' = \phi_1(z)y \iff \phi_2(z')y' = \phi_2(z)y. $$
47.9. Topological manifolds
47.9.1 Exercise (→ 48.9.1): Show that any locally Euclidean space is locally connected. (See Remark 26.2.6.)
47.9.2 Exercise (→ 48.9.2): Show that any locally Euclidean space is locally compact. (See Remark 26.2.6.)
47.9.3 Exercise: Prove Theorem 26.4.13.
47.9.4 Exercise: Prove Theorem 26.4.19.
47.9.5 Exercise: Prove Theorem 26.5.1.
47.9.6 Exercise: Prove Theorem 26.5.6.
47.9.7 Exercise: Prove Theorem 26.6.4.
Chapter 48 Exercise answers
48.1 Logic
48.2 Sets, relations and functions
48.3 Numbers
48.4 Algebra
48.5 Linear algebra
48.6 Tensor algebra
48.7 Topology
48.8 Topological fibre bundles
48.9 Topological manifolds
Generally exercises have many possible answers. Take these answers seriously at your peril. [ The numbering scheme for exercise questions and answers obviously needs fixing. The same number should somehow be used for corresponding questions and answers. But for now, arrows point from the answers to the questions. ]
48.1. Logic
48.1.1 Answer (→ 47.1.1): (A ∧ B) ∨ (A ∧ C) ∨ (B ∧ C).
48.1.2 Answer (→ 47.1.2): ((A ∧ B) ∨ (A ∧ C) ∨ (B ∧ C)) ∧ ¬(A ∧ B ∧ C). Alternatively, (A ∧ B ∧ ¬C) ∨ (A ∧ ¬B ∧ C) ∨ (¬A ∧ B ∧ C).
48.1.3 Answer (→ 47.1.3): Since A ∧ (B ∨ C) is true, both A and B ∨ C must be true. Since (A ∨ B) ∧ C is false, either A ∨ B is false or C is false. But since A is true, A ∨ B must be true. Therefore C must be false. Then since B ∨ C is true, B must be true. Conclusion: A and B are true, but C is false. There are no other possible truth-value combinations.
48.1.4 Answer (→ 47.1.4): To prove Theorem 4.10.2 (i) directly without the “Deduction Theorem”:
⊢ (α ⇒ β) ⇒ ((β ⇒ γ) ⇒ (α ⇒ γ))
(1) (β ⇒ γ) ⇒ (α ⇒ (β ⇒ γ))    PC 1
(2) (α ⇒ (β ⇒ γ)) ⇒ ((α ⇒ β) ⇒ (α ⇒ γ))    PC 2
(3) (β ⇒ γ) ⇒ ((α ⇒ β) ⇒ (α ⇒ γ))    Theorem 4.8.3 (iii) (1,2)
(4) (α ⇒ β) ⇒ ((β ⇒ γ) ⇒ (α ⇒ γ))    Theorem 4.8.3 (iv) (3)
48.1.5 Answer (→ 47.1.5): Theorem 4.11.11, part (i) may be shown as follows.
(A ⇒ B) ⇔ ((A ⇒ A) ∧ (A ⇒ B))    (48.1.1)
⇔ (A ⇒ (A ∧ B))
⇔ ((A ⇒ (A ∧ B)) ∧ ((A ∧ B) ⇒ A))    (48.1.2)
⇔ (A ⇔ (A ∧ B)).
Line (48.1.1) follows from the tautology A ⇒ A. Line (48.1.2) follows from the tautology (A ∧ B) ⇒ A.
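The propositional answers above can be cross-checked by exhaustive truth tables. The following is a minimal sketch, not part of the original text; the helper name `implies` and the variable names are assumptions made for illustration. It checks that the two forms in Answer 48.1.2 agree (reading the first form as the three-way disjunction conjoined with ¬(A ∧ B ∧ C)), that Answer 48.1.3 has exactly one solution, and that the equivalence in Answer 48.1.5 is a tautology.

```python
from itertools import product

def implies(p, q):
    """Material implication p => q (hypothetical helper name)."""
    return (not p) or q

BOOLS = [False, True]

# Answer 48.1.2: both forms should say "exactly two of A, B, C are true".
two_forms_agree = all(
    (((a and b) or (a and c) or (b and c)) and not (a and b and c))
    == ((a and b and not c) or (a and not b and c) or (not a and b and c))
    for a, b, c in product(BOOLS, repeat=3)
)

# Answer 48.1.3: A and (B or C) is true, while (A or B) and C is false.
solutions = [
    (a, b, c)
    for a, b, c in product(BOOLS, repeat=3)
    if (a and (b or c)) and not ((a or b) and c)
]

# Answer 48.1.5: (A => B) <=> (A <=> (A and B)).
equivalence_holds = all(
    implies(a, b) == (a == (a and b))
    for a, b in product(BOOLS, repeat=2)
)

print(two_forms_agree, solutions, equivalence_holds)
```

A truth-table check of this kind confirms the semantic claims only; it says nothing about derivability in the axiom system PC 1, PC 2 used in Answer 48.1.4.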
[ The proof of Theorem 4.11.11 in the above answer is most unsatisfactory. The theorem isn’t very useful either. So either scrap it or do it properly. ]
48.1.6 Answer (→ 47.1.6): ∃_3 x, P(x) may be written as:
∃x, ∃y, ∃z, (P(x) ∧ P(y) ∧ P(z) ∧ (x ≠ y) ∧ (y ≠ z) ∧ (x ≠ z)).
48.1.7 Answer (→ 47.1.7): ∃^3 x, P(x) may be written as:
∀w, ∀x, ∀y, ∀z, ((P(w) ∧ P(x) ∧ P(y) ∧ P(z)) ⇒ (w = x ∨ w = y ∨ w = z ∨ x = y ∨ x = z ∨ y = z)).
Alternatively,
∀w, ∀x, ∀y, ∀z, ((w ≠ x ∧ w ≠ y ∧ w ≠ z ∧ x ≠ y ∧ x ≠ z ∧ y ≠ z) ⇒ (¬P(w) ∨ ¬P(x) ∨ ¬P(y) ∨ ¬P(z))).
Of course, saying that there are at most 3 things x such that P(x) is true is not really an existential statement at all. All uniqueness statements (in this case a “tripliqueness” statement) are negative existence statements. Therefore universal quantifiers are used rather than existential quantifiers. The similarity of form to Answer 48.1.6 is not coincidental, because ∃^3 x, P(x) means the same thing as ¬(∃_4 x, P(x)). There exist at most three x if and only if there do not exist four x. It follows by simple negation of ∃_4 x, P(x) that there is a third equivalent form for ∃^3 x, P(x), namely
∀w, ∀x, ∀y, ∀z, (¬P(w) ∨ ¬P(x) ∨ ¬P(y) ∨ ¬P(z) ∨ w = x ∨ w = y ∨ w = z ∨ x = y ∨ x = z ∨ y = z).
48.1.8 Answer (→ 47.1.8): ∃^3_3 x, P(x) may be written as:
(∃x, ∃y, ∃z, (P(x) ∧ P(y) ∧ P(z) ∧ (x ≠ y) ∧ (y ≠ z) ∧ (x ≠ z)))
∧ (∀w, ∀x, ∀y, ∀z, ((P(w) ∧ P(x) ∧ P(y) ∧ P(z)) ⇒ (w = x ∨ w = y ∨ w = z ∨ x = y ∨ x = z ∨ y = z))).
This is clearly the same as the conjunction of ∃_3 x, P(x) and ∃^3 x, P(x) in Answers 48.1.6 and 48.1.7. It is probably not possible to reduce the complexity of the statement by exploiting some sort of redundancy in the combination of statements.
48.1.9 Answer (→ 47.1.9): ∃^3_2 x, P(x) may be written as:
(∃x, ∃y, (P(x) ∧ P(y) ∧ (x ≠ y)))
∧ (∀w, ∀x, ∀y, ∀z, ((P(w) ∧ P(x) ∧ P(y) ∧ P(z)) ⇒ (w = x ∨ w = y ∨ w = z ∨ x = y ∨ x = z ∨ y = z))).
This may be interpreted as: “There are at least 2, but fewer than 4, x such that P(x) is true.” More informally, one could write 2 ≤ #{x; P(x)} < 4.
48.1.10 Answer (→ 47.1.10): It is tempting to approach this exercise using induction. However, that would not yield a closed formula for the desired statement ∃_n x, P(x). It is easier to think of the ordinal number n as a general set. We want to require the existence of a distinct x_i such that P(x_i) is true for each i ∈ n. The logical expression (48.1.3) is one way of writing ∃_n x, P(x).
∃f, ((∀i ∈ n, ∃x, ((i, x) ∈ f ∧ P(x))) ∧ (∀i ∈ n, ∀j ∈ n, ∀x, (((i, x) ∈ f ∧ (j, x) ∈ f) ⇒ i = j))).    (48.1.3)
In other words, there exists a set f which includes an injective relation on n, such that P(f(i)) is true for all i ∈ n. (It is only required that f restricted to n be an injective relation on n. The purpose of this generality is to simplify the logic.) Thus statement (48.1.3) means that there exists a distinct x which satisfies P for each i ∈ n. As a bonus, the logical expression (48.1.3) is valid for any set n, even if n is countably or uncountably infinite.
48.1.11 Answer (→ 47.1.11): Note that ∃^n x, P(x) is equivalent to ¬(∃_{n+1} x, P(x)), which can be obtained from Answer 48.1.10 by simple negation as in the following expression, where n+ = n + 1.
∀f, ((∀i ∈ n+, ∀j ∈ n+, ∀x, (((i, x) ∈ f ∧ (j, x) ∈ f) ⇒ i = j)) ⇒ (∃i ∈ n+, ∀x, ((i, x) ∈ f ⇒ ¬P(x)))).
This may be interpreted as: “For any injective function f on n+, for some i in n+, P(f(i)) is false.” The generalization of this expression to infinite n does not seem to be as straightforward as in Answer 48.1.10.
48.1.12 Answer (→ 47.1.12): This question can be answered by combining Answers 48.1.10 and 48.1.11.
(∃f, ((∀i ∈ n, ∃x, ((i, x) ∈ f ∧ P(x))) ∧ (∀i ∈ n, ∀j ∈ n, ∀x, (((i, x) ∈ f ∧ (j, x) ∈ f) ⇒ i = j))))
∧ (∀g, ((∀i ∈ n+, ∀j ∈ n+, ∀y, (((i, y) ∈ g ∧ (j, y) ∈ g) ⇒ i = j)) ⇒ (∃i ∈ n+, ∀y, ((i, y) ∈ g ⇒ ¬P(y))))).
48.1.13 Answer (→ 47.1.13): The expression ∃^n_m x, P(x) is a simple variation of Answer 48.1.12.
(∃f, ((∀i ∈ m, ∃x, ((i, x) ∈ f ∧ P(x))) ∧ (∀i ∈ m, ∀j ∈ m, ∀x, (((i, x) ∈ f ∧ (j, x) ∈ f) ⇒ i = j))))
∧ (∀g, ((∀i ∈ n+, ∀j ∈ n+, ∀y, (((i, y) ∈ g ∧ (j, y) ∈ g) ⇒ i = j)) ⇒ (∃i ∈ n+, ∀y, ((i, y) ∈ g ⇒ ¬P(y))))).
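The "at least three" and "at most three" formulas of Answers 48.1.6 and 48.1.7 can be sanity-checked on a small finite universe. The sketch below is an illustration only; the five-element `DOMAIN` and the function names are assumptions, not anything defined in the book. Each predicate P is represented by a tuple of truth values, and the quantified formulas are compared with a direct count of the elements satisfying P.

```python
from itertools import product

DOMAIN = range(5)  # hypothetical finite universe of discourse

def at_least_3(P):
    """Answer 48.1.6: three pairwise distinct witnesses exist."""
    return any(
        P[x] and P[y] and P[z] and x != y and y != z and x != z
        for x, y, z in product(DOMAIN, repeat=3)
    )

def at_most_3(P):
    """Answer 48.1.7: any four witnesses include a repeated one."""
    return all(
        (not (P[w] and P[x] and P[y] and P[z]))
        or len({w, x, y, z}) < 4  # some pair among w, x, y, z is equal
        for w, x, y, z in product(DOMAIN, repeat=4)
    )

# Try every predicate on DOMAIN (2**5 = 32 of them).
results = []
for P in product([False, True], repeat=len(DOMAIN)):
    count = sum(P)
    results.append(
        at_least_3(P) == (count >= 3) and at_most_3(P) == (count <= 3)
    )

all_agree = all(results)
print(all_agree)
```

The conjunction `at_least_3(P) and at_most_3(P)` then corresponds to the "exactly three" statement of Answer 48.1.8 on this universe.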
48.2. Sets, relations and functions
48.2.1 Answer (→ 47.2.1): By Remark 5.1.15, ∃x ∈ S, P(x) means ∃x, (x ∈ S ∧ P(x)). By Remark 4.13.2, this means ¬(∀x, ¬(x ∈ S ∧ P(x))). By Remark 5.1.15, this is equivalent to ¬(∀x, (x ∈ S ⇒ ¬P(x))). By Remark 5.1.15, this means ¬(∀x ∈ S, ¬P(x)).
48.2.2 Answer (→ 47.2.2): First assume the specification axiom in line (5.4.1) and the replacement axiom in line (5.4.2) of Remark 5.4.3. That is, assume
∃Y, ∀z, (z ∈ Y ⇔ (z ∈ X ∧ P(z)))    (48.2.1)
for any set X and boolean 1-formula P, and
∃Y, ∀x, ((x ∈ X ∧ ∃a, R(x, a)) ⇒ ∃b, (b ∈ Y ∧ R(x, b)))    (48.2.2)
for any set X and boolean 2-formula R. It must be shown that
(∀x, ∀y, ∀z, ((f(x, y) ∧ f(x, z)) ⇒ y = z)) ⇒ ∃B, ∀y, (y ∈ B ⇔ ∃x, (x ∈ A ∧ f(x, y)))    (48.2.3)
for any set A and boolean 2-formula f.
Let A be a set and let f be a boolean 2-formula such that ∀x, ∀y, ∀z, ((f(x, y) ∧ f(x, z)) ⇒ y = z). (In other words, f is a function.) Then by (48.2.2), there is a set Y which satisfies
∀x, ((x ∈ A ∧ ∃a, f(x, a)) ⇒ ∃b, (b ∈ Y ∧ f(x, b))).    (48.2.4)
Define the boolean 1-formula P by P(z) = “∃x ∈ A, f(x, z)”. Then by (48.2.1), there is a set Z which satisfies ∀z, (z ∈ Z ⇔ (z ∈ Y ∧ ∃x ∈ A, f(x, z))).
Now let y satisfy ∃c ∈ A, f(c, y). Then there is a c ∈ A such that f(c, y). This c then satisfies (c ∈ A) ∧ (∃z, f(c, z)). Therefore ∃b, (b ∈ Y ∧ f(c, b)) by (48.2.4). Thus there is a b such that b ∈ Y and f(c, b). But f is a function. So b = y. Therefore y ∈ Y.
Since the above argument shows that any y which satisfies ∃c ∈ A, f(c, y) also satisfies y ∈ Y, it follows that ∀y, ((∃c ∈ A, f(c, y)) ⇒ (y ∈ Y)). Therefore (z ∈ Y ∧ ∃x ∈ A, f(x, z)) ⇔ (∃x ∈ A, f(x, z)). (This follows from the logical tautology (B ⇒ A) ⇒ ((A ∧ B) ⇔ B).) By substituting this equivalence into the definition of Z, it follows that ∀z, (z ∈ Z ⇔ ∃x, (x ∈ A ∧ f(x, z))). This is the right hand side of (48.2.3) with B = Z. So (48.2.3) is verified.
48.2.3 Answer (→ 47.2.3): It must be shown that (48.2.1) in Remark 5.4.3 follows from line (48.2.3) in Answer 48.2.2. (To be continued . . . )
48.2.4 Answer (→ 47.2.6): To prove Theorem 5.13.4 part (i), note that
x ∈ A ∪ ∅ ⇔ (x ∈ A ∨ x ∈ ∅)
⇔ x ∈ A.    (48.2.5)
Line (48.2.5) follows from Remark 4.11.9 and the fact that ∀x, x ∉ ∅ by the definition of an empty set.
To prove part (ii), note that
x ∈ A ∩ ∅ ⇔ (x ∈ A ∧ x ∈ ∅)
⇔ x ∈ ∅.    (48.2.6)
Line (48.2.6) follows from Remark 4.11.9 and the fact that ∀x, x ∉ ∅ by the definition of an empty set.
To prove part (iii), note that
A ⊆ B ⇔ ∀z, (z ∈ A ⇒ z ∈ B)
⇔ ∀z, (z ∈ A ⇒ (z ∈ A ∧ z ∈ B))
⇔ ∀z, (z ∈ A ⇔ (z ∈ A ∧ z ∈ B)).
48.2.5 Answer (→ 47.2.7): These formulas follow from Theorem 5.13.6. For example, (A ∪ B) ∩ (C ∪ D) = ((A ∪ B) ∩ C) ∪ ((A ∪ B) ∩ D) = (A ∩ C) ∪ (B ∩ C) ∪ (A ∩ D) ∪ (B ∩ D).
48.2.6 Answer (→ 47.2.8): For Theorem 5.14.7 part (i), let S1 and S2 be sets of sets such that S1 ⊆ S2. Let x ∈ ⋃S1. Then ∃A ∈ S1, x ∈ A. But A ∈ S1 ⇒ A ∈ S2. So ∃A ∈ S2, x ∈ A, which means that x ∈ ⋃S2.
To prove part (ii), let S1 and S2 be non-empty sets of sets such that S1 ⊆ S2. Let x ∈ ⋂S2. Then ∀A ∈ S2, x ∈ A. But A ∈ S1 ⇒ A ∈ S2. So ∀A ∈ S1, x ∈ A, which means that x ∈ ⋂S1.
To prove part (iii), let A be a set and S1 be a set of sets. Then
x ∈ A ∩ (⋃S1) ⇔ (x ∈ A) ∧ (∃X ∈ S1, x ∈ X)
⇔ ∃X ∈ S1, (x ∈ A ∧ x ∈ X)
⇔ ∃X ∈ S1, x ∈ A ∩ X
⇔ x ∈ ⋃{A ∩ X; X ∈ S1}.
To prove part (iv), let A be a set and S1 be a non-empty set of sets. Then
x ∈ A ∪ (⋂S1) ⇔ (x ∈ A) ∨ (∀X ∈ S1, x ∈ X)
⇔ ∀X ∈ S1, (x ∈ A ∨ x ∈ X)
⇔ ∀X ∈ S1, x ∈ A ∪ X
⇔ x ∈ ⋂{A ∪ X; X ∈ S1}.
To prove part (v), let S1 and S2 be sets of sets. Then
x ∈ (⋃S1) ∩ (⋃S2) ⇔ (∃X1 ∈ S1, x ∈ X1) ∧ (∃X2 ∈ S2, x ∈ X2)
⇔ ∃X1 ∈ S1, ∃X2 ∈ S2, (x ∈ X1 ∧ x ∈ X2)
⇔ ∃X1 ∈ S1, ∃X2 ∈ S2, x ∈ X1 ∩ X2
⇔ x ∈ ⋃{X1 ∩ X2; X1 ∈ S1, X2 ∈ S2}.
To prove part (vi), let S1 and S2 be non-empty sets of sets. Then
x ∈ (⋂S1) ∪ (⋂S2) ⇔ (∀X1 ∈ S1, x ∈ X1) ∨ (∀X2 ∈ S2, x ∈ X2)
⇔ ∀X1 ∈ S1, ∀X2 ∈ S2, (x ∈ X1 ∨ x ∈ X2)
⇔ ∀X1 ∈ S1, ∀X2 ∈ S2, x ∈ X1 ∪ X2
⇔ x ∈ ⋂{X1 ∪ X2; X1 ∈ S1, X2 ∈ S2}.
To prove part (vii), let S1 and S2 be non-empty sets of sets. Then
x ∈ (⋃S1) ∪ (⋃S2) ⇔ (∃X1 ∈ S1, x ∈ X1) ∨ (∃X2 ∈ S2, x ∈ X2)
⇔ ∃X1 ∈ S1, ∃X2 ∈ S2, (x ∈ X1 ∨ x ∈ X2)
⇔ ∃X1 ∈ S1, ∃X2 ∈ S2, x ∈ X1 ∪ X2
⇔ x ∈ ⋃{X1 ∪ X2; X1 ∈ S1, X2 ∈ S2}.
To prove part (viii), let S1 and S2 be non-empty sets of sets. Then
x ∈ (⋂S1) ∩ (⋂S2) ⇔ (∀X1 ∈ S1, x ∈ X1) ∧ (∀X2 ∈ S2, x ∈ X2)
⇔ ∀X1 ∈ S1, ∀X2 ∈ S2, (x ∈ X1 ∧ x ∈ X2)
⇔ ∀X1 ∈ S1, ∀X2 ∈ S2, x ∈ X1 ∩ X2
⇔ x ∈ ⋂{X1 ∩ X2; X1 ∈ S1, X2 ∈ S2}.
To prove part (ix), let S1 and S2 be sets of sets. Then
x ∈ (⋃S1) ∪ (⋃S2) ⇔ (x ∈ ⋃S1) ∨ (x ∈ ⋃S2)
⇔ (∃X, (X ∈ S1 ∧ x ∈ X)) ∨ (∃X, (X ∈ S2 ∧ x ∈ X))
⇔ ∃X, ((X ∈ S1 ∧ x ∈ X) ∨ (X ∈ S2 ∧ x ∈ X))
⇔ ∃X, ((X ∈ S1 ∨ X ∈ S2) ∧ x ∈ X)
⇔ ∃X, ((X ∈ S1 ∪ S2) ∧ x ∈ X)
⇔ ∃X ∈ S1 ∪ S2, x ∈ X
⇔ x ∈ ⋃(S1 ∪ S2).
To prove part (x), let S1 and S2 be non-empty sets of sets. Then
x ∈ (⋂S1) ∩ (⋂S2) ⇔ (x ∈ ⋂S1) ∧ (x ∈ ⋂S2)
⇔ (∀X, (X ∉ S1 ∨ x ∈ X)) ∧ (∀X, (X ∉ S2 ∨ x ∈ X))
⇔ ∀X, ((X ∉ S1 ∨ x ∈ X) ∧ (X ∉ S2 ∨ x ∈ X))
⇔ ∀X, ((X ∉ S1 ∧ X ∉ S2) ∨ x ∈ X)
⇔ ∀X, ((X ∉ S1 ∪ S2) ∨ x ∈ X)
⇔ ∀X ∈ S1 ∪ S2, x ∈ X
⇔ x ∈ ⋂(S1 ∪ S2).
The above calculations use the fact that the proposition ∀x ∈ X, P(x), for any set X and set-theoretic formula P, means ∀x, (x ∈ X ⇒ P(x)), which is equivalent to the proposition ∀x, (x ∉ X ∨ P(x)). (See Notation 5.1.14 and Remark 5.1.15.)
To prove part (xi), let S be a set of sets, let A ∈ S and let z ∈ A. It follows that ∃y, (z ∈ y ∧ y ∈ S) because this is true for y = A. Therefore z ∈ {x; ∃y, (x ∈ y ∧ y ∈ S)} = ⋃S. Hence A ⊆ ⋃S.
To prove part (xii), let S be a non-empty set of sets, let A ∈ S and let z ∈ ⋂S = {x; ∀y, (y ∈ S ⇒ x ∈ y)}. Then ∀y, (y ∈ S ⇒ z ∈ y). From A ∈ S it then follows that z ∈ A. Hence ⋂S ⊆ A.
To show the part (xiii) left-to-right implication, let S be a set of sets and assume ∀X ∈ S, X ⊆ A. That is, ∀X, (X ∈ S ⇒ X ⊆ A). Let z ∈ ⋃S. Then ∃y ∈ S, z ∈ y. That is, ∃y, (z ∈ y ∧ y ∈ S). From y ∈ S, it follows that y ⊆ A. So ∃y, (z ∈ y ∧ y ⊆ A). Therefore z ∈ A. Hence ⋃S ⊆ A.
To show the part (xiii) right-to-left implication, let S be a set of sets and assume ⋃S ⊆ A. Let X ∈ S. Then X ⊆ ⋃S by part (xi). So X ⊆ A. Hence ∀X ∈ S, X ⊆ A.
To show the part (xiv) left-to-right implication, let S be a non-empty set of sets and assume ∀X ∈ S, X ⊇ A. That is, ∀X, (X ∈ S ⇒ X ⊇ A). Let z ∈ A. Then ∀X ∈ S, z ∈ X. That is, z ∈ ⋂S. Hence A ⊆ ⋂S.
To show the part (xiv) right-to-left implication, let S be a non-empty set of sets and assume ⋂S ⊇ A. Let X ∈ S. Then X ⊇ ⋂S by part (xii). So X ⊇ A. Hence ∀X ∈ S, X ⊇ A.
To prove part (xv), let S1 and S2 be sets of sets. By part (v), (⋃S1) ∩ (⋃S2) = ⋃{X1 ∩ X2; X1 ∈ S1, X2 ∈ S2}. But ⋃{X1 ∩ X2; X1 ∈ S1, X2 ∈ S2} = {y; ∃X1 ∈ S1, ∃X2 ∈ S2, y ∈ X1 ∩ X2}. This equals the empty set if and only if the proposition “∃X1 ∈ S1, ∃X2 ∈ S2, y ∈ X1 ∩ X2” is false for all y. In other words, ∀y, ¬(∃X1 ∈ S1, ∃X2 ∈ S2, y ∈ X1 ∩ X2). That is, ∀y, ∀X1 ∈ S1, ∀X2 ∈ S2, y ∉ X1 ∩ X2. This may be rearranged as ∀X1 ∈ S1, ∀X2 ∈ S2, ∀y, y ∉ X1 ∩ X2. But ∀y, y ∉ X1 ∩ X2 means precisely that X1 ∩ X2 = ∅. Hence (⋃S1) ∩ (⋃S2) = ∅ if and only if ∀X1 ∈ S1, ∀X2 ∈ S2, X1 ∩ X2 = ∅.
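Several of the identities in Answer 48.2.6 can be checked exhaustively on a small universe. The following sketch is an illustration under stated assumptions: the three-element `UNIVERSE` and all function and variable names are hypothetical choices, and only parts (iii), (v), (ix) and (xv), which require no non-emptiness condition, are tested. Sets of sets are modelled as frozensets of frozensets.

```python
from itertools import combinations

UNIVERSE = (0, 1, 2)  # hypothetical small universe

def all_subsets(items):
    """Return every subset of items as a frozenset."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

SUBSETS = all_subsets(UNIVERSE)   # the 8 subsets of UNIVERSE
FAMILIES = all_subsets(SUBSETS)   # the 256 sets of subsets

def big_union(family):
    """Union of a set of sets; the union of the empty family is empty."""
    return frozenset().union(*family)

failures = []
for S1 in FAMILIES:
    U1 = big_union(S1)
    for A in SUBSETS:
        # Part (iii): A ∩ ⋃S1 = ⋃{A ∩ X; X ∈ S1}
        if A & U1 != big_union({A & X for X in S1}):
            failures.append(("iii", A, S1))
    for S2 in FAMILIES:
        U2 = big_union(S2)
        pairwise = {X1 & X2 for X1 in S1 for X2 in S2}
        # Part (v): (⋃S1) ∩ (⋃S2) = ⋃{X1 ∩ X2; X1 ∈ S1, X2 ∈ S2}
        if U1 & U2 != big_union(pairwise):
            failures.append(("v", S1, S2))
        # Part (ix): (⋃S1) ∪ (⋃S2) = ⋃(S1 ∪ S2)
        if U1 | U2 != big_union(S1 | S2):
            failures.append(("ix", S1, S2))
        # Part (xv): the unions are disjoint iff all pairs are disjoint
        if (not (U1 & U2)) != all(not p for p in pairwise):
            failures.append(("xv", S1, S2))

print(len(failures))
```

A finite check of this kind is no substitute for the proofs above, but it is a quick way to catch a mis-stated quantifier or a missing non-emptiness hypothesis.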
48.2.7 Answer (→ 47.2.9): To prove Theorem 5.14.11 part (i), note that
x ∈ ⋃{A ∈ S; P(A)} ⇔ x ∈ ⋃{B ∈ S; P(B)}
⇔ ∃A, (x ∈ A ∧ A ∈ {B ∈ S; P(B)})
⇔ ∃A, (x ∈ A ∧ A ∈ S ∧ P(A))
⇔ ∃A ∈ S, (x ∈ A ∧ P(A)).
To prove part (ii), note that
x ∈ ⋂{A ∈ S; P(A)} ⇔ x ∈ ⋂{B ∈ S; P(B)}
⇔ ∀A, (x ∈ A ∨ A ∉ {B ∈ S; P(B)})    (48.2.7)
⇔ ∀A, (x ∈ A ∨ ¬(A ∈ S ∧ P(A)))    (48.2.8)
⇔ ∀A, (x ∈ A ∨ A ∉ S ∨ ¬P(A))    (48.2.9)
⇔ ∀A ∈ S, (x ∈ A ∨ ¬P(A)).    (48.2.10)
Line (48.2.7) follows from Theorem 5.14.4 (i). Line (48.2.8) follows from Notation 5.1.14. Line (48.2.9) follows from Theorem 5.14.4 (ii). Line (48.2.10) follows from Remark 5.1.15.
48.2.8 Answer (→ 47.2.10): To prove part (i) of Theorem 5.14.13, let A1 be a set. Then A1 ⊆ A1. Therefore A1 ∈ ℙ(A1) by Definition 5.8.18. (See also Remark 5.8.22.)
To prove part (ii), let A1 and A2 be sets with A1 ⊆ A2. Let X be a set such that X ∈ ℙ(A1). Then X ⊆ A1. So X ⊆ A2. So X ∈ ℙ(A2).
To prove part (iii), let A1 be a set. Then A1 ∈ ℙ(A1). So A1 ⊆ ⋃(ℙ(A1)). To show the reverse inclusion, let x ∈ ⋃(ℙ(A1)). Then ∃X ∈ ℙ(A1), x ∈ X. So ∃X, (X ⊆ A1 ∧ x ∈ X). Hence x ∈ A1. Therefore ⋃(ℙ(A1)) ⊆ A1. It follows that ⋃(ℙ(A1)) = A1.
Part (iv) follows immediately from Theorem 5.14.7 (i).
To prove part (v), let S1 be a set of sets. Let X ∈ S1. Then X ⊆ ⋃S1. So X ∈ ℙ(⋃S1). It follows that S1 ⊆ ℙ(⋃S1).
To prove part (vi), suppose S1 ⊆ ℙ(S2) and let A ∈ ⋃S1. Then ∃U, (U ∈ S1 ∧ A ∈ U). So ∃U, (U ∈ ℙ(S2) ∧ A ∈ U) (because S1 ⊆ ℙ(S2)). That is, A ∈ ⋃(ℙ(S2)). But ⋃(ℙ(S2)) = S2 by Theorem 5.14.13 (iii). So A ∈ S2. Therefore ⋃S1 ⊆ S2 as claimed.
48.2.9 Answer (→ 47.2.11): To prove Theorem 5.14.16, part (i), note that
⋃_{y∈Y} {x ∈ X; P(x, y)} = {z; ∃y ∈ Y, z ∈ {x ∈ X; P(x, y)}}
= {x; ∃y ∈ Y, (x ∈ X ∧ P(x, y))}
= {x; x ∈ X ∧ ∃y ∈ Y, P(x, y)}
= {x ∈ X; ∃y ∈ Y, P(x, y)}.
To prove part (ii), note that
⋂_{y∈Y} {x ∈ X; P(x, y)} = {z; ∀y ∈ Y, z ∈ {x ∈ X; P(x, y)}}
= {x; ∀y ∈ Y, (x ∈ X ∧ P(x, y))}
= {x; x ∈ X ∧ ∀y ∈ Y, P(x, y)}    (48.2.11)
= {x ∈ X; ∀y ∈ Y, P(x, y)}.
Line (48.2.11) follows from Y ≠ ∅, which follows from the assumption ∃x ∈ X, ∃y ∈ Y, P(x, y).
48.2.10 Answer (→ 47.2.12): This proof uses the idea that two finite sets that are equal must have the same number of elements. Suppose (a, b) = (c, d). By Definition 6.1.3, (a, b) = {{a}, {a, b}} and (c, d) = {{c}, {c, d}}. So {a} ∈ (a, b). Therefore {a} ∈ (c, d). So by definition of {{c}, {c, d}}, either {a} = {c} or {a} = {c, d}. If {a} = {c}, then a = c. If {a} = {c, d}, then a = c = d. In either case, a = c.
To show that b = d, first suppose that a = b. Then (a, b) = {{a}}. Therefore (c, d) = {{c}}. So c = d. Therefore b = d. Now suppose that a ≠ b. Then {a, b} ∈ (c, d) since {a, b} ∈ (a, b). So {a, b} = {c} or {a, b} = {c, d}. Therefore {a, b} = {c, d} since a ≠ b. Hence c ≠ d. So b = c or b = d. But a = c and a ≠ b. So b = d. (See also proofs of Theorem 6.1.12 by Halmos [159], page 23 and Mendelson [164], page 162.)
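The ordered-pair property in Answer 48.2.10 can be illustrated with Python frozensets. This is a minimal sketch; the function name `kpair` and the four-element test range are assumptions made for the example. It builds the set {{a}, {a, b}} and checks by brute force that two such sets are equal exactly when their components agree in order.

```python
from itertools import product

def kpair(a, b):
    """The set {{a}, {a, b}} used as the ordered pair (a, b)."""
    return frozenset({frozenset({a}), frozenset({a, b})})

ELEMENTS = range(4)  # hypothetical small universe of atoms

# Any quadruple where set equality and componentwise equality disagree.
violations = [
    (a, b, c, d)
    for a, b, c, d in product(ELEMENTS, repeat=4)
    if (kpair(a, b) == kpair(c, d)) != (a == c and b == d)
]

print(violations)          # expected to be empty
print(len(kpair(1, 1)))    # the degenerate pair (a, a) collapses to {{a}}
```

The degenerate case a = b, where the pair has a single element, is the case that the written proof treats separately.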
48.3. Numbers
48.3.1 Answer (→ 47.3.1):
0 = ∅
1 = {∅}
2 = {∅, {∅}}
3 = {∅, {∅}, {∅, {∅}}}
4 = {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}}
5 = {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}, {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}}}
48.3.2 Answer (→ 47.3.4): For all x ∈ ℝ,
ceiling(x) = inf{i ∈ ℤ; i ≥ x}
= − sup{−i; i ∈ ℤ ∧ i ≥ x}
= − sup{i; −i ∈ ℤ ∧ −i ≥ x}
= − sup{i ∈ ℤ; i ≤ −x}
= − floor(−x).
48.3.3 Answer (→ 47.3.2): The ordinal number 10 is illustrated in Figure 48.3.1. Please check that you got all of the boxes right. Deduct one point for each wrong box.
[Figure 48.3.1: The ordinal number 10]
48.4. Algebra
48.4.1 Answer (→ 47.4.1): For the set X = ℤ^2, define the bijections τ_{i,j} : X → X and σ : X → X for i, j ∈ ℤ by τ_{i,j} : (x, y) ↦ (x + i, y + j) and σ : (x, y) ↦ (y, x). These are translation and coordinate-swap actions respectively. Define the group G as the set of actions {τ_{i,j}; i, j ∈ ℤ} ∪ {στ_{i,j}; i, j ∈ ℤ} with the operation of function composition, where στ_{i,j} denotes the composition σ ◦ τ_{i,j}. Let H = {τ_{n,n}; n ∈ ℤ} ∪ {στ_{n,n}; n ∈ ℤ}. This is a subgroup of G. Let g = στ_{a,b} for any a, b ∈ ℤ. It is straightforward to show that gHg^{-1} = g^{-1}Hg for all a, b ∈ ℤ, but gH = Hg if and only if a = b. (When the author verified this example on 2004-12-25, he vowed to sacrifice 3 oxen and a fat hamster to the memory of Pythagoras, who famously sacrificed a hundred oxen to the deities in gratitude for his discovery of a general proof of the Pythagoras theorem. See Struik [193], page 42. The author was assisted by a strong hint from Bill Moran, by the way.) The example would probably have the required properties if ℤ was replaced by the finite group ℤ_k for suitable k.
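The closing remark of Answer 48.4.1, that the example probably survives when ℤ is replaced by a finite group ℤ_k, can be tested by brute force. The sketch below is an illustration only: the modulus K = 5 is an assumed choice of "suitable k" (an odd value, so that 2(a − b) ≡ 0 forces a ≡ b), and the permutation encoding and all names are hypothetical. Group elements are stored as tuples of images of the K² points.

```python
from itertools import product

K = 5  # assumed odd modulus standing in for the text's "suitable k"
POINTS = list(product(range(K), repeat=2))
INDEX = {p: n for n, p in enumerate(POINTS)}

def as_perm(func):
    """Tabulate a map X -> X as a tuple of images, in POINTS order."""
    return tuple(func(p) for p in POINTS)

def compose(f, g):
    """Return f ∘ g (apply g first, then f)."""
    return tuple(f[INDEX[q]] for q in g)

def inverse(f):
    """Return the inverse permutation of f."""
    inv = [None] * len(POINTS)
    for n, q in enumerate(f):
        inv[INDEX[q]] = POINTS[n]
    return tuple(inv)

def tau(i, j):
    """Translation (x, y) -> (x + i, y + j) modulo K."""
    return as_perm(lambda p: ((p[0] + i) % K, (p[1] + j) % K))

SIGMA = as_perm(lambda p: (p[1], p[0]))  # coordinate swap

H = ({tau(n, n) for n in range(K)}
     | {compose(SIGMA, tau(n, n)) for n in range(K)})

conjugates_agree = True
cosets_agree_iff_equal = True
for a, b in product(range(K), repeat=2):
    g = compose(SIGMA, tau(a, b))
    g_inv = inverse(g)
    left = {compose(compose(g, h), g_inv) for h in H}
    right = {compose(compose(g_inv, h), g) for h in H}
    conjugates_agree &= (left == right)
    gH = {compose(g, h) for h in H}
    Hg = {compose(h, g) for h in H}
    cosets_agree_iff_equal &= ((gH == Hg) == (a == b))

print(conjugates_agree, cosets_agree_iff_equal)
```

For even K the second property would be expected to fail for some a ≠ b, since 2(a − b) can then vanish modulo K without a − b vanishing.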
48.5. Linear algebra
48.5.1 Answer (→ 47.5.3): Let n ∈ ℤ_0^+ and A, B ∈ Sym(n, ℝ). Then
∀i, j ∈ ℕ_n, ((AB)^T)_{ij} = (AB)_{ji} = Σ_{k=1}^n a_{jk} b_{ki} = Σ_{k=1}^n a_{kj} b_{ik} = Σ_{k=1}^n b_{ik} a_{kj} = (BA)_{ij}.
Therefore (AB)^T = BA as claimed.
48.6. Tensor algebra
48.6.1 Answer (→ 47.6.1): Let g, h ∈ L((V_α)_{α∈A}; U) and let f = g + h be the pointwise sum of g and h. The expression f(u) in Theorem 13.1.12 evaluates to g(u) + h(u) = λ1 g(v) + λ2 g(w) + λ1 h(v) + λ2 h(w) = λ1 f(v) + λ2 f(w), as required. Similarly, for λ ∈ K and f = λg, f(u) evaluates to λg(u) = λ(λ1 g(v) + λ2 g(w)) = λ1 f(v) + λ2 f(w), as required.
48.6.2 Answer (→ 47.6.2): Let g, h ∈ L^+_m(V; U) and let f = g + h be the pointwise sum of g and h. Then f ∈ L_m(V; U) by Theorem 13.2.2. For permutations P : ℕ_m → ℕ_m and vector sequences (v_i)_{i=1}^m ∈ V^m,
f((v_{P(i)})_{i=1}^m) = g((v_{P(i)})_{i=1}^m) + h((v_{P(i)})_{i=1}^m) = g((v_i)_{i=1}^m) + h((v_i)_{i=1}^m) = f((v_i)_{i=1}^m),
as required.
48.6.3 Answer (→ 47.6.3): Let g, h ∈ L^−_m(V; U) and let f = g + h be the pointwise sum of g and h. Then f ∈ L_m(V; U) by Theorem 13.2.2. For permutations P : ℕ_m → ℕ_m and vector sequences (v_i)_{i=1}^m ∈ V^m,
f((v_{P(i)})_{i=1}^m) = g((v_{P(i)})_{i=1}^m) + h((v_{P(i)})_{i=1}^m)
= parity(P) g((v_i)_{i=1}^m) + parity(P) h((v_i)_{i=1}^m)
= parity(P) f((v_i)_{i=1}^m).
Therefore f is antisymmetric. So f ∈ L^−_m(V; U) as claimed.
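The closure argument of Answer 48.6.3 can be illustrated with a concrete pair of antisymmetric trilinear maps. The following is only a sketch under assumed choices: V is taken to be ℤ^4 with U = ℤ, the maps `g_map` and `h_map` are determinants of two different coordinate projections, and all names and sample vectors are hypothetical. The check confirms that the pointwise sum picks up the factor parity(P) under every permutation P of the three arguments.

```python
from itertools import permutations

def det3(rows):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = rows
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def g_map(v1, v2, v3):
    """Antisymmetric trilinear map: determinant of coordinates 0, 1, 2."""
    return det3([v[:3] for v in (v1, v2, v3)])

def h_map(v1, v2, v3):
    """Antisymmetric trilinear map: determinant of coordinates 1, 2, 3."""
    return det3([v[1:] for v in (v1, v2, v3)])

def f_map(v1, v2, v3):
    """Pointwise sum f = g + h."""
    return g_map(v1, v2, v3) + h_map(v1, v2, v3)

def parity(p):
    """Return +1 for an even permutation and -1 for an odd one."""
    inversions = sum(
        1
        for i in range(len(p))
        for j in range(i + 1, len(p))
        if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1

# Hypothetical sample vectors in Z^4.
vectors = [(1, 2, 0, -1), (0, 3, 1, 4), (2, -1, 5, 1)]

sum_is_antisymmetric = all(
    f_map(*[vectors[p[i]] for i in range(3)]) == parity(p) * f_map(*vectors)
    for p in permutations(range(3))
)

print(sum_is_antisymmetric, f_map(*vectors) != 0)
```

Replacing `parity(p)` by 1 and the determinants by symmetric maps would give the analogous check for Answer 48.6.2.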
48.7. Topology
48.7.1 Answer (→ 47.7.3): Let X = ∅. Then ℙ(X) = {∅}. This gives two possibilities for subsets S ⊆ ℙ(X) of the power set of X, namely S = ∅ or S = {∅}. However, {∅, X} = {∅}. So the requirement that {∅, X} ⊆ S implies that S = {∅}. Therefore ℙ(S) = {∅, {∅}}.
In the equation T′ = {⋂C; C ∈ ℙ(S), 1 ≤ #(C) < ∞}, the only possibility for C is {∅} because 1 ≤ #(C). Then ⋂C = ∅. So T′ = {∅} and ℙ(T′) = {∅, {∅}}.
In the equation T = {⋃D; D ∈ ℙ(T′)}, the only choices for D are ∅ or {∅}. Then ⋃D = ∅ in both cases. Therefore T = {∅}. This is the same as the only possible topology on X, namely the set {∅, X} = {∅}.
To fully verify Theorem 14.8.8 for X = ∅, it must be shown that the topology T = {∅} on X = ∅ is the intersection of all topologies on X. This follows immediately from the fact that there is one and only one possible topology on X = ∅.
48.7.2 Answer (→ 47.7.4): Let X be a finite T1 topological space. If X = ∅, then there is only one topology on X, namely the set Top(X) = {∅} = ℙ(∅) = ℙ(X).
Otherwise, let x ∈ X. By the T1 property, there is a set Ω_y ∈ Top(X) for each y ∈ I = X \ {x} such that x ∈ Ω_y and y ∉ Ω_y. Then ⋂_{y∈I} Ω_y = {x}. Since this is a finite intersection, it follows that {x} ∈ Top(X). Since every subset of X can be written as a union of such singleton sets, it follows that Top(X) = ℙ(X).
48.7.3 Answer (→ 47.7.9): Let (x, y) ∈ X × Y satisfy y ≠ f(x) in Theorem 15.2.17. Since Y is a Hausdorff space, there exist Ω1, Ω2 ∈ Top(Y) such that f(x) ∈ Ω1, y ∈ Ω2 and Ω1 ∩ Ω2 = ∅. Then
graph(f) ∩ (f^{-1}(Ω1) × Ω2) = {(x, y) ∈ X × Y; y = f(x) and f(x) ∈ Ω1 and y ∈ Ω2}
= {(x, y) ∈ X × Y; f(x) ∈ Ω1 and f(x) ∈ Ω2}
= ∅.
But (x, y) ∈ f^{-1}(Ω1) × Ω2 and f^{-1}(Ω1) × Ω2 ∈ Top(X × Y). Therefore (x, y) is in the interior of G = (X × Y) \ graph(f). Since all points of G are in the interior of G, it follows that G is open in the topology of X × Y. Therefore graph(f) is closed in the topology of X × Y.
To make the last statement a little more rigorous, let A = ⋃_{(x,y)∈G} {Ω ∈ Top(X × Y); Ω ⊆ G and (x, y) ∈ Ω}. Then A ∈ Top(X × Y) and A = G. Therefore graph(f) is closed.
48.8. Topological fibre bundles
48.8.1 Answer (→ 47.8.1): An empty fibre bundle means a fibre bundle whose total space is empty. If E in the topological (G, F) fibre bundle (E, T_E, π, B, T_B, A^F_E) has E = ∅, then T_E = {∅} is the only possible topology on E. (See Remarks 14.2.17 and 14.2.10 regarding the empty topological space.) The only possible function π : ∅ → B is the empty function π = ∅. (See Remark 6.5.18.) The empty function on the empty topological space (∅, {∅}) is continuous for any target topology (B, T_B). So Definition 23.6.4 (i) is satisfied.
To satisfy Definition 23.6.4 (ii), it is necessary to set U_φ = ∅ because φ must be the empty function, π^{-1}(U) = ∅ and π × φ is the empty function, which can only be a bijection π × φ : π^{-1}(U_φ) ≈ U_φ × F for F ≠ ∅ if U_φ = ∅. So the empty chart φ = ∅ is the only permissible chart. It then follows by Definition 23.6.4 (iii) that B = ∅.
If A^F_E = {∅} (i.e. if the atlas contains only the empty function), then Definition 23.6.4 (ii) requires that for φ1 = φ2 = ∅ and U_φ1 = U_φ2 = ∅, the condition ∀b ∈ U_φ1 ∩ U_φ2, ∃g ∈ G, β_{b,φ1} ◦ β_{b,φ2}^{-1} = L_g holds. But this is always true because U_φ1 ∩ U_φ2 = ∅. The single transition map g_{φ1,φ2} in Definition 23.6.4 (v) is the empty function. Therefore A^F_E = {∅} is a valid atlas. The fibre atlas A^F_E = ∅ similarly satisfies all of the conditions.
48.8.2 Answer (→ 47.8.2): To show that condition (23.10.1) in Remark 23.10.4 follows from condition (ii) of Definition 23.10.3, first let φ1, φ2 ∈ A^F_E, b ∈ U_φ1 ∩ U_φ2 and g ∈ G, and suppose that ∀z ∈ π^{-1}({b}), φ1(z) = gφ2(z). Since β_{b,φ2} = φ2|_{π^{-1}({b})} : π^{-1}({b}) → F is a bijection, one may write ∀y ∈ F, gy = gφ2(β_{b,φ2}^{-1}(y)) = φ1(β_{b,φ2}^{-1}(y)). But by definition of g_{φ1,φ2}, it is also true that ∀z ∈ π^{-1}({b}), φ1(z) = g_{φ1,φ2}(b)φ2(z). So ∀y ∈ F, g_{φ1,φ2}(b)y = φ1(β_{b,φ2}^{-1}(y)). So ∀y ∈ F, gy = g_{φ1,φ2}(b)y. Since G acts effectively on F, it follows (by Remark 9.4.15) that g = g_{φ1,φ2}(b). Therefore by Definition 23.10.3 (ii) and the definition of g̃_{h(φ1),h(φ2)},
∀z̃ ∈ Ẽ_b, h(φ1)(z̃) = g̃_{h(φ1),h(φ2)}(b)h(φ2)(z̃)
= g_{φ1,φ2}(b)h(φ2)(z̃)
= gh(φ2)(z̃).
This has proved the forward implication of condition (23.10.1):
∀φ1, φ2 ∈ A^F_E, ∀b ∈ U_φ1 ∩ U_φ2, ∀g ∈ G,
(∀z ∈ π^{-1}({b}), φ1(z) = gφ2(z)) ⇒ (∀z̃ ∈ π̃^{-1}({b}), h(φ1)(z̃) = gh(φ2)(z̃)).
The reverse implication follows in the same way because h is a bijection.
To show the converse, suppose now that
∀φ1, φ2 ∈ A^F_E, ∀b ∈ U_φ1 ∩ U_φ2, ∀g ∈ G,
(∀z ∈ π^{-1}({b}), φ1(z) = gφ2(z)) ⇔ (∀z̃ ∈ π̃^{-1}({b}), h(φ1)(z̃) = gh(φ2)(z̃)).    (48.8.1)
Let φ1, φ2 ∈ A^F_E, b ∈ U_φ1 ∩ U_φ2 and g ∈ G. By definition of g_{φ1,φ2}(b), ∀z ∈ E_b, φ1(z) = g_{φ1,φ2}(b)φ2(z). Insert g = g_{φ1,φ2}(b) into the left hand side of condition (48.8.1) to obtain ∀z̃ ∈ π̃^{-1}({b}), h(φ1)(z̃) = gh(φ2)(z̃). Since h(φ2)|_{Ẽ_b} : Ẽ_b → F is a bijection and G acts effectively on F, it follows as above that g = g̃_{h(φ1),h(φ2)}(b). Therefore g_{φ1,φ2}(b) = g̃_{h(φ1),h(φ2)}(b), which was to be proved.
48.8.3 Answer (→ 47.8.3): Let (P, q, B) −< (P, T_P, q, B, T_B, A^G_P) be a topological G-bundle, let (G, F) −< (G, T_G, F, T_F, σ, µ^F_G) be an effective topological left transformation group, let z, z′ ∈ P, y, y′ ∈ F and φ1, φ2 ∈ A^G_P, and suppose that φ1(z′)y′ = φ1(z)y. Then
φ2(z′)y′ = φ2(z′)φ1(z′)^{-1}φ1(z′)y′
= φ2(z′)φ1(z′)^{-1}φ1(z)y
= g_{φ2,φ1}(b)φ1(z′)φ1(z′)^{-1}φ1(z)y
= g_{φ2,φ1}(b)φ1(z)y
= φ2(z)y,
which proves the implication φ1(z′)y′ = φ1(z)y ⇒ φ2(z′)y′ = φ2(z)y. The reverse implication follows in the same way. Therefore the choice of φ in Definition 23.12.3 (i) does not affect the definition.
48.9. Topological manifolds
48.9.1 Answer (→ 47.9.1): Let (X, T) be a locally Euclidean space. By Definition 26.2.3, this means that ∀x ∈ X, ∃Ω ∈ Top_x(X), ∃n ∈ ℤ^+, ∃G ∈ Top(ℝ^n), Ω ≈ G. Let y = φ(x), where φ : Ω ≈ G is the homeomorphism in Definition 26.2.3. Since G is open in ℝ^n, there is an r > 0 such that B_{y,r} ⊆ G. Define Ω′ = φ^{-1}(B_{y,r}). Then Ω′ is connected because B_{y,r} is connected. Therefore Ω′ satisfies the requirements of Definition 15.4.20 for the local connectedness of X.
48.9.2 Answer (→ 47.9.2): Let (X, T) be a locally Euclidean space. By Definition 26.2.3, this means that ∀x ∈ X, ∃Ω ∈ Top_x(X), ∃n ∈ ℤ^+, ∃G ∈ Top(ℝ^n), Ω ≈ G. Let y = φ(x), where φ : Ω ≈ G is the homeomorphism in Definition 26.2.3. Since G is open in ℝ^n, there is an r > 0 such that B̄_{y,r} ⊆ G. Define Ω′ = φ^{-1}(B_{y,r}). Then Ω̄′ = φ^{-1}(B̄_{y,r}) because φ is a homeomorphism, and Ω̄′ is a compact subset of (X, T) because B̄_{y,r} is a compact subset of ℝ^n. Therefore Ω′ satisfies the requirements of Definition 15.7.12 for the local compactness of X.
Chapter 49 Notations and abbreviations
49.1. Notations

The following notations are defined or used in this book. The two-number references are section numbers. The three-number references are definitions, notations, remarks or theorems. Three-number references in parentheses are equation numbers. Three-number references in square brackets are figures. Abbreviations are currently listed in the index in Chapter 50, or else in Section 49.2.

notation            reference   meaning
??                  1.6.6       end of proof; quod erat demonstrandum
T                   3.7.16      proposition-tag “true”
F                   3.7.16      proposition-tag “false”
¬                   4.3.3       logical “not” (negation)
∧                   4.3.3       logical “and” (conjunction)
∨                   4.3.3       logical “or” (disjunction)
⇒                   4.3.3       logical implication operator (“implies”)
⇔                   4.3.3       logical equivalence operator (“if and only if”)
|                   4.3.9       alternative denial operator (NAND operator, Sheffer stroke)
↓                   4.3.9       joint denial operator (NOR operator, Peirce arrow, Quine dagger)
△                   4.3.9       exclusive-or operator (XOR operator)
⊤                   4.3.19      logical operator whose value is always true
⊥                   4.3.19      logical operator whose value is always false
⊢                   4.5.8       assertion
⊣⊢                  4.5.9       two-way assertion
⊤                   4.12.10     logical predicate whose value is always true
⊥                   4.12.10     logical predicate whose value is always false
∀                   4.13.3      for all
∃                   4.13.3      for some
∃′x, P(x)           4.16.3      P(x) is true for one and only one x; unique existence
notation            reference   meaning
x ∈ A               5.1.3       x is an element (or member) of set A
x ∉ A               5.1.4       x is not an element (or member) of set A
A ⊆ B               5.1.8       A is a subset of B
A ⊇ B               5.1.8       A is a superset of B; same as B ⊆ A
A ⊈ B               5.1.9       A is not a subset of B
A ⊉ B               5.1.9       A is not a superset of B; same as B ⊈ A
x ≠ y               5.1.10      x does not equal y; i.e. ¬(x = y)
A ⊊ B               5.1.12      A is a proper subset of B
A ⊋ B               5.1.12      A is a proper superset of B; same as B ⊊ A
∃x ∈ S, P(x)        5.1.14      P(x) is true for some element x of a set S
∀x ∈ S, P(x)        5.1.14      P(x) is true for all elements x of a set S
∅                   5.8.4       the empty set; satisfies ∀x, x ∉ ∅
{x}                 5.8.11      singleton set S satisfying z ∈ S ⇔ z = x
{x; P(x)}           5.8.12      set S satisfying z ∈ S ⇔ P(z)
{x ∈ A; P(x)}       5.8.15      set S satisfying z ∈ S ⇔ (z ∈ A ∧ P(z))
ℙ(X)                5.8.19      the power set of set X; i.e. {A; A ⊆ X}
A ∪ B               5.13.3      union of sets A and B
A ∩ B               5.13.3      intersection of sets A and B
A \ B               5.13.9      complement of set B within set A
A △ B               5.13.15     symmetric set difference of sets A and B
⋃S                  5.14.2      union of set of sets S
⋂S                  5.14.2      intersection of non-empty set of sets S
??                  5.16        X is an abbreviation for Y
??                  5.16        X is an abbreviation for Y
(a, b)              6.1.4       the ordered pair {{a}, {a, b}} for any a and b
A × B               6.2.2       Cartesian product of sets A and B
Dom(R)              6.3.7       domain of relation or function R
Range(R)            6.3.7       range of relation or function R
Im(R)               6.3.7       image of relation or function R
R₁ ∘ R₂             6.3.24      composition of relations (or functions) R₁ and R₂
R⁻¹                 6.3.28      inverse of relation (or function) R
f : X → Y           6.5.7       f is a function from X to Y
f(x)                6.5.12      the value of a function f for an argument x of f
Y^X                 6.5.13      the set of functions from X to Y
id_X                6.5.21      identity function on set X
f|_A                6.5.28      the restriction of function f to set A
f × g               6.9.11      direct product of functions f and g
f ×̇ g               6.9.12      pointwise direct product of functions f and g
X →̊ Y               6.11.4      set of partially-defined functions from X to Y
f : X →̊ Y           6.11.4      f is a partially-defined function from X to Y
f : G −→◦ F         6.11.6      f is a function from G to F → F; same as f : G → (F → F)
G −→◦ F             6.11.6      the set of functions from G to F → F; same as G → (F → F)
X → Y               6.12.2      the set of functions from X to Y; same as Y^X
notation            reference   meaning
ω                   7.2.12      the finite ordinal numbers {0, 1, 2, . . .}
S⁺                  7.2.16      the successor set S ∪ {S} of any set S
ω⁺                  7.2.29      the extended finite ordinal numbers ω ∪ {ω}
ℕ                   7.2.31      the natural numbers {1, 2, 3, . . .}
ℕ_n                 7.2.33      the set {1, 2, . . . n} for n ∈ ω
ℙ_k(X)              7.2.34?     the set {S ∈ ℙ(X); #(S) ≤ k} for any set X and k ∈ ℤ⁺₀
ℕ̄                   7.4.3?      the extended natural numbers ℕ ∪ {∞}
m | k                           m divides k, where m, k ∈ ℕ
ℤ                   7.5.3       the integers {. . . , −2, −1, 0, 1, 2, . . .}
ℤ⁺                  7.5.4       the positive integers {1, 2, 3, . . .}; equivalent to ℕ
ℤ⁺₀                 7.5.4       non-negative integers {0, 1, 2, . . .}; equivalent to ω
ℤ⁻                  7.5.4       the negative integers {−1, −2, −3, . . .}
ℤ⁻₀                 7.5.4       non-positive integers {0, −1, −2, . . .}
ℤ_n                 7.5.6       the set {i ∈ ℤ; 0 ≤ i < n} = {0, 1, . . . n − 1} for n ∈ ℤ⁺₀
∞                   7.6.1       the positive infinite pseudo-integer
−∞                  7.6.1       the negative infinite pseudo-integer
ℤ̄                   7.6.3       the extended integers ℤ ∪ {∞, −∞}
ℤ̄⁺                  7.6.4       the positive extended integers; equivalent to ℕ̄
ℤ̄⁺₀                 7.6.4       non-negative extended integers; equivalent to ω⁺
ℤ̄⁻                  7.6.4       the negative extended integers
ℤ̄⁻₀                 7.6.4       non-positive extended integers
Xⁿ                  7.7.1       Cartesian product of n copies of set X for n ∈ ℤ⁺₀
χ_A                 7.9.3       indicator function of a set A
2ⁿ                  7.9.8       power-of-two function for integer argument n ∈ ℤ⁺₀
δ(i, j)             7.9.10      Kronecker delta function
δ_ij, δ^ij, δ^j_i   7.9.11      Kronecker delta function; same as δ(i, j)
perm(X)             7.10.4      set of permutations of a set X
parity(f)           7.10.9      parity of a permutation f : X → X for a set X
n!                  7.10.14     the value of the factorial function for argument n
(n)_k               7.10.17     the value of the Jordan factorial function for argument (n, k)
ε(f)                7.10.20     Levi-Civita alternating symbol
ε_{i1,...in}        7.10.21     Levi-Civita alternating symbol
ε^{i1,...in}        7.10.21     Levi-Civita alternating symbol
C^n_r               7.11.3      combination symbol
I^n_r               7.11.10     set of increasing maps from ℕ_r to ℕ_n
J^n_r               7.11.10     set of non-decreasing maps from ℕ_r to ℕ_n
List(X)             7.12.2      list space of a set X
List(X)             7.12.5      extended list space of a set X
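As an illustration of the combinatorial symbols listed for Sections 7.9 and 7.10, the following Python sketch evaluates the Kronecker delta, the parity of a permutation, the Levi-Civita symbol and the Jordan factorial. It rests on assumptions which are not taken from this book: parity is encoded as 0 (even) or 1 (odd) and is computed by counting inversions, indices run from 1 to n, and (n)_k is read as the falling factorial n(n − 1) . . . (n − k + 1). All function names are hypothetical.

```python
def kronecker_delta(i, j):
    """Return 1 if i equals j, otherwise 0."""
    return 1 if i == j else 0


def parity(seq):
    """Parity of a sequence of distinct values (assumed encoding: 0 even, 1 odd).

    The parity is the number of inversions modulo 2.
    """
    n = len(seq)
    inversions = sum(
        1 for a in range(n) for b in range(a + 1, n) if seq[a] > seq[b]
    )
    return inversions % 2


def levi_civita(*indices):
    """Levi-Civita symbol for indices taken from {1, ..., n}.

    Returns 0 unless the indices form a permutation of 1..n,
    otherwise +1 for an even permutation and -1 for an odd one.
    """
    n = len(indices)
    if sorted(indices) != list(range(1, n + 1)):
        return 0
    return -1 if parity(indices) else 1


def jordan_factorial(n, k):
    """Assumed reading of (n)_k: the product n (n-1) ... (n-k+1); 1 when k = 0."""
    result = 1
    for j in range(k):
        result *= n - j
    return result
```

For example, levi_civita(2, 3, 1) counts two inversions and therefore yields +1, while any repeated index yields 0.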
notation            reference   meaning
ℚ                   8.1.2       the set of rational numbers
ℚ⁺                  8.1.4       the set of positive rational numbers
ℚ⁺₀                 8.1.4       the set of non-negative rational numbers
ℚ⁻                  8.1.4       the set of negative rational numbers
ℚ⁻₀                 8.1.4       the set of non-positive rational numbers
ℚ̄                   8.2.2       the set of extended rational numbers
ℚ̄⁺                  8.2.4       the set of positive extended rational numbers
ℚ̄⁺₀                 8.2.4       the set of non-negative extended rational numbers
ℚ̄⁻                  8.2.4       the set of negative extended rational numbers
ℚ̄⁻₀                 8.2.4       the set of non-positive extended rational numbers
ℝ                   8.3.7       the set of real numbers
ℝ⁺                  8.3.8       the set of positive real numbers {x ∈ ℝ; x > 0}
ℝ⁺₀                 8.3.8       the set of non-negative real numbers {x ∈ ℝ; x ≥ 0}
ℝ⁻                  8.3.8       the set of negative real numbers {x ∈ ℝ; x < 0}
ℝ⁻₀                 8.3.8       the set of non-positive real numbers {x ∈ ℝ; x ≤ 0}
(a, b)              8.3.10      the open interval of real numbers {x ∈ ℝ; a < x < b}
[a, b]              8.3.10      the closed interval of real numbers {x ∈ ℝ; a ≤ x ≤ b}
[a, b)              8.3.10      the closed-open interval of real numbers {x ∈ ℝ; a ≤ x < b}
(a, b]              8.3.10      the open-closed interval of real numbers {x ∈ ℝ; a < x ≤ b}
ℝ̄                   8.4.2       the set of extended real numbers ℝ ∪ {−∞, ∞}
ℝ̄⁺                  8.4.5       the set of positive extended real numbers {x ∈ ℝ̄; x > 0}
ℝ̄⁺₀                 8.4.5       the set of non-negative extended real numbers {x ∈ ℝ̄; x ≥ 0}
ℝ̄⁻                  8.4.5       the set of negative extended real numbers {x ∈ ℝ̄; x < 0}
ℝ̄⁻₀                 8.4.5       the set of non-positive extended real numbers {x ∈ ℝ̄; x ≤ 0}
ℝⁿ                  8.5.1       the set of real-number n-tuples for n ∈ ℤ⁺₀
Q_{m,n}             8.5.3       the m, n-concatenation operator for real number tuples for m, n ∈ ℤ⁺₀
|x|                 8.6.1       the absolute value of x ∈ ℝ
sign(x)             8.6.2       the sign of x ∈ ℝ
H(x)                8.6.6       the Heaviside function of x ∈ ℝ
floor(x)            8.6.8       the floor function of x ∈ ℝ
ceiling(x)          8.6.9       the ceiling function of x ∈ ℝ
frac(x)             8.6.13      the fractional part function of x ∈ ℝ
round(x)            8.6.14      the round function of x ∈ ℝ
x mod m             8.6.16      x modulo m for x ∈ ℝ and m ∈ ℝ \ {0}
ℂ                   8.7.1       the set of complex numbers
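The elementary real functions listed for Section 8.6 can be sketched in Python as follows. The conventions used here are assumptions rather than the definitions of this book: sign(0) = 0, frac(x) = x − floor(x), and x mod m = x − m floor(x/m) for m ≠ 0. The function names are hypothetical. The functions round(x) and H(x) are omitted because their values at boundary points depend on conventions which are not stated in the table.

```python
import math


def sign(x):
    """Sign of a real number: -1, 0 or +1 (assumed convention sign(0) = 0)."""
    return (x > 0) - (x < 0)


def frac(x):
    """Fractional part, assumed to be x - floor(x), so that 0 <= frac(x) < 1."""
    return x - math.floor(x)


def mod(x, m):
    """Real modulo, assumed to be x - m * floor(x / m) for m != 0."""
    if m == 0:
        raise ValueError("m must be non-zero")
    return x - m * math.floor(x / m)


# floor and ceiling are taken directly from the standard library.
floor = math.floor
ceiling = math.ceil
```

With these conventions, frac(−0.25) is 0.75 and mod(−1, 3) is 2, so that both results are non-negative for positive m.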
notation            reference   meaning
L_g                 9.2.7       left action by a group element g ∈ G on elements of G
R_g                 9.2.7       right action by a group element g ∈ G on elements of G
e                   9.2.10      identity element of a group G
g⁻¹                 9.2.15      inverse of an element g of a group G
Hom(G₁, G₂)         9.2.23      the set of group homomorphisms from G₁ to G₂
Iso(G₁, G₂)         9.2.23      the set of group isomorphisms from G₁ to G₂
End(G)              9.2.23      the set of group homomorphisms from G to G
Aut(G)              9.2.23      the set of group isomorphisms from G to G
Mon(G₁, G₂)         9.2.23      the set of group monomorphisms from G₁ to G₂
Epi(G₁, G₂)         9.2.23      the set of group epimorphisms from G₁ to G₂
gH                  9.3.4       the left coset of a subgroup H of a group G by g ∈ G
Hg                  9.3.4       the right coset of a subgroup H of a group G by g ∈ G
G/H                 9.3.10      quotient of group G with respect to normal subgroup H of G
S^g                 9.3.18      the conjugate of subset S of group G by g ∈ G
N(S)                9.3.28      the normalizer of a subset S of a group G
Z(S)                9.3.28      the centralizer of a subset S of a group G
L_g                 9.4.6       left transformation group action by g ∈ G for a group G
R_g                 9.5.5       right transformation group action by g ∈ G for a group G
Hom_A(M₁, M₂)       9.9.12      set of A-homomorphisms from module M₁ to module M₂ over set A
End_A(M)            9.9.12      set of A-endomorphisms from module M to M over set A
Aut_A(M)            9.9.12      set of A-automorphisms from module M to M over set A
GL(M)               9.9.12      same as Aut_A(M) for module M and set A
gl(V)               9.11.14     Lie algebra associated with associative algebra Aut_K(V) of K-automorphisms of K-module V
ad(X)               9.11.20     adjoint of element X of Lie algebra A under the adjoint representation of A
ad(A)               9.11.23     adjoint Lie algebra {ad(X); X ∈ A} of Lie algebra A
dim(V)              10.2.6      dimension of linear space V
v^i                 10.2.17     the ith component of a vector v for a given basis
Lin(V, W)           10.3.2      set of linear maps from linear space V to linear space W
Hom(V₁, V₂)         10.3.7      the set of linear space homomorphisms from V₁ to V₂
Iso(V₁, V₂)         10.3.7      the set of linear space isomorphisms from V₁ to V₂
End(V)              10.3.7      the set of linear space homomorphisms from V to V
Aut(V)              10.3.7      the set of linear space isomorphisms from V to V
Mon(V₁, V₂)         10.3.7      the set of linear space monomorphisms from V₁ to V₂
Epi(V₁, V₂)         10.3.7      the set of linear space epimorphisms from V₁ to V₂
f_i                 10.5.13     the ith component of linear functional f for a given basis
V₁ ⊕ V₂             10.6.2      external direct sum of linear spaces V₁ and V₂
V₁ ⊕ V₂             10.6.9      internal direct sum of linear spaces V₁ and V₂
V/W                 10.7.2      quotient of linear space V over linear space W
|x|_p               10.8.2      p-norm of x ∈ ℝⁿ
(x, y)              10.8.6      inner product of vectors x, y ∈ ℝⁿ
x · y               10.8.6      inner product of vectors x, y ∈ ℝⁿ
⟨x, y⟩              10.8.6      inner product of vectors x, y ∈ ℝⁿ
notation            reference   meaning
M_{m,n}(K)          11.1.3      the set of m × n matrices over a field K
M_{m,n}(ℝ)          11.1.3      the set of m × n real-valued matrices
M_{m,n}             11.1.4      same as M_{m,n}(ℝ)
Aᵀ                  11.1.13     the transpose of matrix A ∈ M_{m,n}(K), m, n ∈ ℤ⁺₀, field K
AB                  11.1.17     the product of matrices A and B
I_n                 11.1.21     the identity matrix in M_{n,n}(K) for n ∈ ℤ⁺₀ and field K
M_n(K)              11.3.1      same as M_{n,n}(K)
Tr(A)               11.3.3      the trace of a square matrix A
det(A)              11.3.7      the determinant of a square matrix A
A⁻¹                 11.3.16     the inverse of an invertible square matrix A
λ⁺(A)               11.4.2      the upper norm of a real square matrix A
λ⁻(A)               11.4.2      the lower norm of a real square matrix A
Sym(n, ℝ)           11.5.3      the set of real symmetric n × n matrices
Sym⁺₀(n, ℝ)         11.6.1      the set of positive semi-definite real symmetric n × n matrices
Sym⁻₀(n, ℝ)         11.6.1      the set of negative semi-definite real symmetric n × n matrices
Sym⁺(n, ℝ)          11.6.1      the set of positive definite real symmetric n × n matrices
Sym⁻(n, ℝ)          11.6.1      the set of negative definite real symmetric n × n matrices
Σ_{i=0}^r a_i P_i   12.2.4      convex combination of points P_i, i = 0, . . . r
ℒ((V_α)_{α∈A}; U)   13.1.5      set of multilinear maps from ×_{α∈A} V_α to U
ℒ(V₁, . . . V_m; U) 13.1.6      the set ℒ((V_α)_{α∈A}; U) with A = ℕ_m = {1, . . . m}
ℒ_m(V, U)           13.1.7      same as ℒ((V_α)_{α∈A}; U) with A = ℕ_m and V_α = V for all α
ℒ⁺_m(V, U)          13.3.4      the set of symmetric multilinear maps from V^m to U
ℒ⁻_m(V, U)          13.3.5      the set of antisymmetric multilinear maps from V^m to U
⊗_{α∈A} V_α         13.4.8      tensor product of linear spaces (V_α)_{α∈A}
⊗_{i=1}^m V_i       13.4.8      tensor product of linear spaces (V_i)_{i=1}^m
V₁ ⊗ . . . V_m      13.4.8      tensor product of linear spaces (V_i)_{i=1}^m
⊗_{α∈A} v_α         13.5.7      tensor monomial corresponding to (v_α)_{α∈A}
⊗_{i=1}^m v_i       13.5.11     tensor monomial corresponding to (v_i)_{i=1}^m
v₁ ⊗ . . . v_m      13.5.12     tensor monomial corresponding to (v_i)_{i=1}^m
⊗^m V               13.5.14     tensor product of m copies of linear space V
ℒr,s(V, W)          13.7.8      set of multilinear maps for mixture of V and V*
⊗* V                13.8.7      tensor algebra of linear space V
Λ_m(V, W)           13.9.3      the set ℒ⁻_m(V, W) with pointwise vector addition and scalar product
Λ_m V               13.9.5      same as Λ_m(V, K), where K is the field of V
⋀^m V               13.9.7      alternating tensor product of m copies of V; same as Λ_m(V, K)*
∧_{i=1}^m v_i       13.9.11     a simple m-vector
notation            reference   meaning
Top(X)              14.2.5      the topology on a topological space X
Top_x(X)            14.2.11     the set of open neighbourhoods of x ∈ X in a topological space X
Top(X)              14.2.15     the set of closed sets in a topological space X
Int(S)              14.4.2      the interior of a set S in a topological space X
S̄                   14.4.5      the closure of a set S in a topological space X
Int_T(S)            14.4.8      the interior of a set S in a topological space (X, T)
Closure_T(S)        14.4.14     the closure of a set S in a topological space (X, T)
Ext(S)              14.5.3      the exterior of a set S in a topological space X
Bdy(S)              14.5.5      the boundary of a set S in a topological space X
∂S                  14.5.5      the boundary of a set S in a topological space X
Ext_T(S)            14.5.12     the exterior of a set S in a topological space (X, T)
Bdy_T(S)            14.5.12     the boundary of a set S in a topological space (X, T)
C(X, Y)             14.11.12    set of continuous functions from X to Y, for topological spaces X, Y
C⁰(X, Y)            14.11.12    same as C(X, Y)
C⁰(X)               14.11.14    same as C(X, ℝ)
lim_{z→x} f(z)      14.11.16    limit of function f : X → Y at x ∈ X
lim_x f             14.11.18    limit of function f : X → Y at x ∈ X
X ≈ Y               14.12.2     topological spaces X and Y are homeomorphic
f : X ≈ Y           14.12.2     f is a homeomorphism from X to Y
Iso(X, Y)           14.12.3     set of all topological isomorphisms from X to Y
Aut(X)              14.12.3     set of all topological automorphisms on X
Top^c_x(X)          15.4.16     the connected component of X which contains x
S(γ)                16.2.14     the initial point of a curve γ
T(γ)                16.2.14     the terminal point of a curve γ
C₀(M)               16.2.18     the set of C⁰ curves in topological space M
[γ]₀                16.4.1      the set of curves which are path-equivalent to a given curve γ
P₀(M)               16.4.4      the set of C⁰ paths in topological space M
B_{x,r}             17.1.8      open ball {y ∈ M; d(x, y) < r}, metric space M, centre x ∈ M, radius r ∈ ℝ̄⁺₀
B̄_{x,r}             17.1.8      closed ball {y ∈ M; d(x, y) ≤ r}, metric space M, centre x ∈ M, radius r ∈ ℝ̄⁺₀
B_r(x)              17.1.8      open ball {y ∈ M; d(x, y) < r}, metric space M, centre x ∈ M, radius r ∈ ℝ̄⁺₀
B̄_r(x)              17.1.8      closed ball {y ∈ M; d(x, y) ≤ r}, metric space M, centre x ∈ M, radius r ∈ ℝ̄⁺₀
B_{x,r1,r2}         17.1.18     open annulus {y ∈ M; r₁ < d(x, y) < r₂}, x ∈ M, r₁, r₂ ∈ ℝ̄⁺₀
B̄_{x,r1,r2}         17.1.18     closed annulus {y ∈ M; r₁ ≤ d(x, y) ≤ r₂}, x ∈ M, r₁, r₂ ∈ ℝ̄⁺₀
Ḃ_{x,r}             17.1.18     punctured open ball {y ∈ M; 0 ≠ d(x, y) < r}, x ∈ M, r ∈ ℝ̄⁺₀
B̄̇_{x,r}             17.1.18     punctured closed ball {y ∈ M; 0 ≠ d(x, y) ≤ r}, x ∈ M, r ∈ ℝ̄⁺₀
B_{r1,r2}(x)        17.1.18     open annulus {y ∈ M; r₁ < d(x, y) < r₂}, x ∈ M, r₁, r₂ ∈ ℝ̄⁺₀
B̄_{r1,r2}(x)        17.1.18     closed annulus {y ∈ M; r₁ ≤ d(x, y) ≤ r₂}, x ∈ M, r₁, r₂ ∈ ℝ̄⁺₀
Ḃ_r(x)              17.1.18     punctured open ball {y ∈ M; 0 ≠ d(x, y) < r}, x ∈ M, r ∈ ℝ̄⁺₀
B̄̇_r(x)              17.1.18     punctured closed ball {y ∈ M; 0 ≠ d(x, y) ≤ r}, x ∈ M, r ∈ ℝ̄⁺₀
Lip(f)              17.4.11     the infimum of Lipschitz constants for a Lipschitz function f
C^k(U)              18.4.10     set of k-times differentiable functions on open set U ⊆ ℝ for k ∈ ℤ̄⁺₀
C^k(Ω)              18.7.2      set of k-times differentiable functions on Ω ∈ Top(ℝⁿ) for k ∈ ℤ̄⁺₀, n ∈ ℤ⁺₀
C^k(Ω, ℝ^m)         18.7.3      set of k-times ℝ^m-valued differentiable functions on Ω ∈ Top(ℝⁿ), m, n ∈ ℤ⁺₀, k ∈ ℤ̄⁺₀
K                   18.7.7      regularity class such as C^k, C^∞ or analytic
C^{k,α}(U)          18.9.3      set of functions in C^k(U) with α-Hölder kth derivative, k ∈ ℤ⁺₀, α ∈ (0, 1]
notation            reference   meaning
T(ℝⁿ)               19.1.7      tangent bundle ℝⁿ × ℝⁿ on ℝⁿ, n ∈ ℤ⁺₀
X^r(T(ℝⁿ))          19.1.11     set of C^r cross-sections of T(ℝⁿ), n ∈ ℤ⁺₀, r ∈ ℤ̄⁺₀
T*(ℝⁿ)              19.2.8      cotangent bundle ℝⁿ × ℝⁿ on ℝⁿ, n ∈ ℤ⁺₀
X^r(T*(ℝⁿ))         19.2.12     set of C^r cross-sections of T*(ℝⁿ), n ∈ ℤ⁺₀, r ∈ ℤ̄⁺₀
X^r(Λ_m T(ℝⁿ))      20.5.6      set of C^r cross-sections of Λ_m T(ℝⁿ) for m, n ∈ ℤ⁺₀, r ∈ ℤ̄⁺₀
X(E, π, B)          23.3.9      the set of cross-sections of topological fibration (E, π, B)
atlas(E, π, B)      23.6.9      the fibre atlas of fibre bundle (E, π, B)
Iso_G(E_{b1}, E_{b2})  23.8.4   the set of topological isomorphisms from fibre E_{b1} to E_{b2}
Aut_G(E_b)          23.8.5      the set of topological automorphisms of fibre set E_b
L^b_{g,φ}           23.8.8      automorphism through the charts β_{b,φ}⁻¹ ∘ L_g ∘ β_{b,φ} : E_b ≈ E_b
R_g                 23.9.4      right action map on a principal fibre bundle
(P × F)/G           23.12.4     orbit-space version of an associated topological fibre bundle
P ×_G F             23.12.9     same as (P × F)/G
Θ^γ_{s,t}           24.2.2      parallelism map between parameters s and t of a curve γ
atlas(M)            26.4.8      the atlas A_M for a differentiable manifold M −< (M, A_M)
atlas_p(M)          26.4.8      set of charts ψ ∈ atlas(M) such that p ∈ Dom(ψ)
C^r(M, ℝ^m)         27.5.3      the set of C^r ℝ^m-valued functions on a C^r n-dimensional manifold M
C^r(M)              27.5.3      the set of C^r real functions on a C^r manifold M; same as C^r(M, ℝ)
C̊^r(M)              27.5.5      C^r real functions on open subsets of a C^r manifold M; same as C̊^r(M, ℝ)
C̊^r_p(M)            27.5.6      C^r real functions on open neighbourhoods of p ∈ M of a C^r manifold M; same as C̊^r_p(M, ℝ)
C^r(M, N)                       the set of C^r maps from a C^r manifold M to a C^r manifold N
C̊^r(M, N)                       the set of C^r maps from open subsets of a C^r manifold M to a C^r manifold N
T_p(M)              28.3.8      vector space of tangent vectors at p in a C¹ manifold M
T(M)                28.3.9      the total space of tangent (coordinate) vectors of a C¹ manifold M
T̊(M)                28.3.9      the total space of untagged tangent operators of a C¹ manifold M
T̊_p(M)              28.5.3      vector space of untagged tangent operators at p in a C¹ manifold M
T̂_p(M)              28.6.3      vector space of tagged tangent operators at p in a C¹ manifold M
T̂(M)                28.6.4      the total space of tagged tangent operators of a C¹ manifold M
e^{p,ψ}_i           28.7.4      coordinate tangent vector in direction i at p with respect to coordinate map ψ for a C¹ manifold
∂^{p,ψ}_i           28.7.11     coordinate tangent operator in direction i at p with respect to coordinate map ψ for a C¹ manifold
T*(M)               29.2.2      the union of the dual tangent spaces of a C¹ manifold M
T^{r,s}_p(M)        29.3.2      the space of tangent (r, s)-tensors at a point p in a C¹ manifold M
X^k(M)              29.5.7      the set of C^k vector fields in a C^k manifold M
GL(n, ℝ)            34.7.6      group of general linear transformations of ℝⁿ
GL(n)               34.7.6      group of general linear transformations of ℝⁿ
GL(V)                           group of general linear transformations of vector space V
V_z(P)              36.5.11     set of vertical vectors ker((dπ_P)_z) at z ∈ P; C¹ principal G-bundle (P, π_P, M)
Γ^k_{ij}            37.9.5      components of the Christoffel symbol of an affine connection
R^i_{jkl}                       components of the Riemann curvature tensor of an affine connection
g_ij                            components of the metric tensor of a Riemannian manifold
49.2. Abbreviations

abbreviation        reference   meaning
AC                  5.0.9       axiom of choice
AD                              Anno Domini [i.e. Common Era]
AMS                 1.8         American Mathematical Society
BC                              Before Christ [i.e. Before Common Era]
BCE                             Before Common Era
BG                  5.0.9       Bernays-Gödel (set theory)
BVP                 21.0.2      boundary value problem
CC                  5.0.9       axiom of countable choice
CE                              Common Era
DE                  21.0.2      differential equation
DG                              differential geometry
DNA                 37.7.8?     deoxyribonucleic acid
EDM                 50.2        Encyclopedic dictionary of mathematics
FOL + EQ            5.2.3       first order language with equality
FTOC                20.2.2      fundamental theorem of calculus
GL                  11.7.1      general linear group
GR                              general relativity
HOL                 6.5.16      higher-order logic
HTTP                2.5.7       hypertext transfer protocol
IQ                  1.8         intelligence quotient
IVP                 21.0.2      initial value problem
KEM                 50.4        Kleine Enzyklopädie Mathematik
LHS                             left hand side
MP                  4.6.1       modus ponens
MSC                 1.9         mathematics subject classification
NAND                4.3.6       not AND
NBG                 5.0.9       Neumann-Bernays-Gödel (set theory)
NOR                 4.3.6       not OR
O                   11.7.1      orthogonal group
ODE                 21.0.2      ordinary differential equation
OED                 1.6.7       Oxford English Dictionary
OFB                 23.0.1      ordinary fibre bundle
PC                  4.7.2       propositional calculus
PDE                 21.0.2      partial differential equation
PDO                 21.0.2      partial differential operator
PFB                 23.0.1      principal fibre bundle
QC                  4.14.1      predicate calculus
QED                 1.6.6       quod erat demonstrandum
RAA                 3.11.1      reductio ad absurdum
RHS                             right hand side
SL                  11.7.1      special linear group
SNMP                2.5.7       Simple Network Management Protocol
SO                  11.7.1      special orthogonal group
SU                  11.7.1      special unitary group
TS                  29.0.1      tangent space
U                   11.7.1      unitary group
URL                 50.0.1      Uniform Resource Locator
USA                 2.5.20      United States of America
UTC                 2.5.20      Universal Time Coordinated
wf                  4.5.2       well-formed formula
wff                 4.5.2       well-formed formula
XOR                 4.3.7       exclusive OR
ZF                  5.0.9       Zermelo-Fraenkel (set theory)
ZFC                 5.9.5       Zermelo-Fraenkel set theory with axiom of choice
Chapter 50 Bibliography
50.1 Comments on other people’s books
50.2 Differential geometry introductory texts
50.3 Other differential geometry references
50.4 Other mathematics references
50.5 Physics
50.6 Logic and set theory
50.7 Anthropology and linguistics
50.8 Philosophy and ancient history
50.9 History of mathematics
50.10 Other references
Section 50.2 consists mostly of differential geometry books which may be useful as introductory texts. Some of these introductory works are on other subjects such as general relativity, but have substantial introductory material on differential geometry. Section 50.3 contains differential geometry references. These are mostly on specialized topics, or have historical interest, or are research monographs. Section 50.4 contains texts on mathematical topics other than differential geometry.
50.0.1 Remark: Internet locations (“URLs”) are avoided as much as possible in this book because Internet resources and locations are much more ephemeral than books and journals.

50.1. Comments on other people’s books

The comments in this section are the current personal opinions of this author. These comments should be regarded with maximum scepticism. The author’s opinions on other people’s books change over time and are biased by his own favourite application areas and personal preferences. The order in which books are mentioned in this section should not be interpreted as order of preference. Any comments which seem to be negative should be disregarded. The author accepts no responsibility at all for any purchase choices of the reader which may be influenced by the comments in this section.

The book by Frankel [18] is an excellent exposition of a full range of differential geometry topics which combines applicability to physics with a higher level of mathematical precision than is usual in DG texts which are aimed at physicists.

The Crampin/Pirani [11] book is a more mathematically oriented version of differential geometry although it is intended for applications in physics, particularly mechanics. The first 163 pages (42.7%) present the differential layer in the absence of any metric or connection. The following 72 pages (18.8%) present metrics and connections. Manifolds (charts and atlases) are defined only in the last 147 pages (38.5%) of the book. Thus this book is broadly organized according to structural layers.

The Misner/Thorne/Wheeler [37] book presents the physicists’ view of differential geometry (in addition to general relativity). This book is noteworthy for apparently using no function spaces at all. Mathematical objects are presented in isolation without function classes to contain them. This contrasts with the modern approach in mathematics which associates almost all mathematical objects with container classes. This book uses intuition rather than the axiomatic/deductive approach to differential geometry.

The book by Szekeres [44] presents a wide range of mathematical physics topics in a systematic way which attempts to include the principal mathematical prerequisites in the earlier chapters. This is similar to my own attempt to include prerequisites in the early chapters. (Peter Szekeres was one of my mathematical physics lecturers at the University of Adelaide in the 1970s.)

The five-volume Spivak [42] book is not organized according to structural layers. Nor is it a systematic deductive development of differential geometry although it is oriented to mathematics rather than physics applications. This book is useful for its wide range of mathematical applications topics and the analysis of historical DG texts.

The “Encyclopedic dictionary of mathematics” [34] by the Mathematical Society of Japan is a very comprehensive set of definitions covering all of mathematics. The articles on differential geometry are a useful reference for most of the basic definitions.
1. Louis Auslander, Differential geometry, Harper & Row, New York, 1967?. 2. Marcel Berger, Bernard Gostiaux, Differential geometry: manifolds, curves, and surfaces, Springer, New York, 1988?, translated from G´eom´etrie diff´erentielle, P.U.F., Paris, 1986. 3. Marcel Berger, Bernard Gostiaux, G´eom´etrie diff´erentielle, P.U.F., Paris, 1986. 4. Richard L. Bishop, Richard J. Crittenden, Geometry of manifolds, Academic Press, New York, 1964. 5. Wilhelm Blaschke, K. Leichtweiss, Elementare Differentialgeometrie, 5th. ed., Springer, Berlin, 1973. 6. William Munger Boothby, An introduction to differentiable manifolds and Riemannian geometry, Academic Press, New York, 1975. 7. William Munger Boothby, An introduction to differentiable manifolds and Riemannian geometry, 2nd. ed., Academic Press, Orlando, Florida, 1986. 8. Nicolas Bourbaki, Vari´et´es diff´erentielles et analytiques, Hermann et Cie, Paris, 1967/71. 9. F. Brickell, R. S. Clark, Differentiable manifolds, Van Nostrand Reinhold, London, 1970. 10. Robert Coquereaux, Riemannian geometry, fiber bundles etc., ??, ??, 19??. 11. M. Crampin, F.A.E. Pirani, Applicable differential geometry, Cambridge U.P., Cambridge, England, 1986, 1994. 12. W. D. Curtis, F. R. Miller, Differential manifolds and theoretical physics, Academic Press, Orlando, Florida, 1985. 13. R.W.R. Darling, Differential forms and connections, Cambridge U.P., Cambridge, England, 1994. 14. Manfredo Perdig˜ ao do Carmo, Differential forms and applications, Springer, Berlin, 1994, translated from Formas diferenciais e aplicacoes. 15. Manfredo Perdig˜ ao do Carmo, Differential geometry of curves and surfaces, Prentice-Hall, Englewood Cliffs, N.J, 1976. 16. Manfredo Perdig˜ ao do Carmo, Riemannian geometry, Birkh¨auser, Boston, 1992, translated from Geometria Riemanniana, Instituto de Matematica Pura e Aplicada, 1979, 1988. ISBN 0-8176-3490-8. 17. A. T. Fomenko, Differential geometry and topology, Consultants Bureau, New York, 1987?. 18. 
Theodore Frankel, The geometry of physics, an introduction, first. ed., Cambridge University Press, Cambridge, 1997, 1999, 2001. ISBN 0-521-38753-1. 19. Sylvestre Gallot, Dominique Hulin, Jacques Lafontaine, Riemannian Geometry, Springer, Berlin, 1990. 20. Hubert Goenner, Einf¨ uhrung in die spezielle und allgemeine Relativit¨ atstheorie, Spektrum Akademischer Verlag, Heidelberg, 1996. 21. Heinrich Walter Guggenheimer, Differential geometry, McGraw-Hill, New York, 1963. 22. Robert Clifford Gunning, Lectures on Riemann surfaces, Princeton University Press, Princeton, New Jersey, 1966. 23. Sigurdur Helgason, Differential geometry, Lie groups and symmetric spaces, Academic Press, New York, 1978. 24. Morris William Hirsch, Differential Topology, Springer, New York, 1976. 25. Wilhelm Klingenberg, Riemannian geometry, Walter de Gruyter, Berlin, 1982. [ www.topology.org/tex/conc/dg.html ]
26. Shoshichi Kobayashi, Katsumi Nomizu, Foundations of differential geometry, volumes 1, 2, Wiley Interscience, New York, 1963/69.
27. Shoshichi Kobayashi, Transformation groups in differential geometry, Erg. math. 70, Springer, Berlin, 1972.
28. Serge Lang, Introduction to differentiable manifolds, John Wiley and Sons, New York, 1962.
29. Serge Lang, Differential manifolds, Addison-Wesley, Reading, Massachusetts, 1972.
30. Serge Lang, Fundamentals of differential geometry, Springer, New York, 1991. ISBN 0-387-98593-X.
31. Detlef Laugwitz, Differential and Riemannian geometry, Academic Press, New York, 1965, translated from Differentialgeometrie.
32. John M. Lee, Riemannian manifolds: An introduction to curvature, Springer-Verlag, New York, 1997. ISBN 0-387-98271-X.
33. Mathematical Society of Japan, Encyclopedic dictionary of mathematics, 1st. ed., (ed. Shôkichi Iyanaga, Yukiyosi Kawada), MIT Press, Cambridge MA, 1980.
34. Mathematical Society of Japan, Encyclopedic dictionary of mathematics, 2nd. ed., (ed. Kiyosi Itô), MIT Press, Cambridge MA, 1993. ISBN 0-262-59020-4.
35. Paul Malliavin, Géométrie différentielle intrinsèque, Hermann, Paris, 1972.
36. Richard S. Millman, George D. Parker, Elements of differential geometry, Prentice-Hall, Englewood Cliffs, N.J., 1977?.
37. Charles W. Misner, Kip S. Thorne, John Archibald Wheeler, Gravitation, W. H. Freeman, New York, 1970.
38. Tanjiro Okubo, Differential geometry, Marcel Dekker, New York, 1987.
39. Barrett O’Neill, Elementary Differential Geometry, ?, ?, 19??.
40. Walter A. Poor, Differential geometric structures, McGraw Hill, New York, 1981.
41. Mikhail Mikhailovich Postnikov, The variational theory of geodesics, (ed. Bernard R. Gelbaum), Saunders, Philadelphia, 1967, translated from Variatsionnaia teoriia geodezicheskikh.
42. Michael David Spivak, A comprehensive introduction to differential geometry, volumes I-V, 3rd. ed., Publish or Perish, Berkeley, 1999.
43. Norman Earl Steenrod, The topology of fibre bundles, Princeton Univ. Press, Princeton, N.J., 1951.
44. Peter Szekeres, A course in modern mathematical physics: groups, Hilbert space, and differential geometry, Cambridge University Press, Cambridge, UK, 2004. ISBN 0-521-82960-7.
45. Andrzej Trautman, Differential geometry for physicists: Stony Brook lectures, Bibliopolis, Napoli, 1984?.
46. Izu Vaisman, A first course in differential geometry, M. Dekker, New York, 1984?.
47. Robert M. Wald, General relativity, University of Chicago Press, Chicago, 1984.
48. Frank Wilson Warner, Foundations of differentiable manifolds and Lie groups, Scott, Foresman and Co., Glenview, Illinois, 1971.
49. Frank Wilson Warner, Foundations of differentiable manifolds and Lie groups, Springer, New York, 1983.
50. Hermann Weyl, Raum, Zeit, Materie, 7th. ed., Springer, Berlin, 1923, 1988.
51. Hermann Weyl, Gruppentheorie und Quantenmechanik, Hirzel, Leipzig, 1928.
52. Thomas J. Willmore, Introduction to differential geometry, Oxford University Press, London, 1959.
50.3. Other differential geometry references
53. Shun-ichi Amari, Hiroshi Nagaoka, Methods of information geometry, American Math Society, Providence, Rhode Island, 1993, 2000, 2007. ISBN 9780821843024.
54. Thierry Aubin, Nonlinear analysis on manifolds, Monge-Ampère equations, Springer, Berlin, 1982.
55. Arthur L. Besse, Einstein manifolds, Springer, Berlin, 1986.
56. William L. Burke, Applied differential geometry, Cambridge University Press, Cambridge, New York?, 1985.
57. Herbert Busemann, The geometry of geodesics, Academic Press, New York, 1955.
58. Shiing Shen Chern, Complex manifolds without potential theory, Springer, New York, 1979.
59. Shiing Shen Chern, Differentiable manifolds, ?, ?, 198?.
60. Yvonne Choquet-Bruhat, Géométrie différentielle et systèmes extérieurs, Dunod, Paris, 1968.
61. Gaston Darboux, Leçons sur la théorie générale des surfaces et les applications géométriques du calcul infinitésimal, 3rd. ed., Chelsea, New York, 1972.
62. Georges de Rham, Variétés différentiables, Hermann et Cie, Paris, 1955.
63. C. T. J. Dodson, T. Poston, Tensor geometry: the geometric viewpoint and its uses, Pitman, London, 1977.
64. C. Ehresmann, Les connexions infinitésimales dans un espace fibré différentiable, Colloque de Topologie, Brussels (1950), 29–55.
65. Harley Flanders, Differential forms with applications to the physical sciences, Dover, New York, 1963, 1989.
66. A. Gleason, Groups without small subgroups, Ann. of Math. 56 (1952), 193–212.
67. R. E. Greene, H. Wu, Function theory on manifolds which possess a pole, Lecture Notes in Mathematics 699, Springer, Berlin, 1979.
68. Alexandre Grothendieck, Éléments de géométrie algébrique IV, IHES Sci. Pub. Math. 24 (1965), ???–???.
69. Robert Clifford Gunning, Complex algebraic varieties, PUP, 1970.
70. Robert Hermann, Geometry, physics, and systems, M. Dekker, New York, 1973.
71. Shoshichi Kobayashi, Differential geometry of complex vector bundles, Iwanami Shoten, and Princeton University Press, Tokyo, 1987.
72. Anders Kock, Synthetic differential geometry, Cambridge University Press, 1981.
73. Tullio Levi-Civita, Nozione di parallelismo in una varietà qualunque e conseguente specificazione geometrica della curvatura Riemanniana, Rend. Circ. Mat. Palermo 42 (1917), 173–205.
74. John Legat Martin, General relativity: a guide to its consequences for gravity and cosmology, E. Horwood Halsted Press, Chichester, New York, 1988.
75. August Ferdinand Möbius, Der barycentrische Calcul, ein neues Hülfsmittel zur analytischen Behandlung der Geometrie, Verlag von Johann Ambrosius Barth, Leipzig, 1827. (Republished: Georg Olms Verlag, Hildesheim, 1976.)
76. Deane Montgomery, Leo Zippin, Topological transformation groups, Interscience Publishers, London, 1955.
77. Deane Montgomery, Leo Zippin, Small subgroups of finite-dimensional groups, Ann. of Math. 56 (1952), 213–241.
78. Gaspard Monge, Application de l’analyse à la géométrie, 4th. ed., ?, Paris, 1809.
79. Mikio Nakahara, Geometry, topology, and physics, A. Hilger, Bristol, England, 1990?.
80. N. Newns, A. Walker, Tangent planes to a differentiable manifold, J. London Math. Soc. 31 (1956), 400–407.
81. Georg Friedrich Bernhard Riemann, Über die Hypothesen, welche der Geometrie zu Grunde liegen, Habilitationsschrift, ?, ?, 1854.
82. Jacob T. Schwartz, Differential geometry and topology, Gordon and Breach, New York, 1968.
83. Shlomo Sternberg, Lectures on differential geometry, Prentice-Hall, Englewood Cliffs, N.J., 1964.
84. Dirk Jan Struik, Lectures on classical differential geometry, 2nd. ed., Addison-Wesley, Reading, Massachusetts, 1961.
85. R. Sulanke, P. Wintgen, Differentialgeometrie und Faserbündel, Birkhäuser Verlag, Basel, 1972.
86. Peter Szekeres, General relativity, Lecture notes, Adelaide University, 1974.
87. Tracy Yerkes Thomas, Concepts from tensor analysis and differential geometry, 2nd. ed., Academic Press, New York, 1965.
88. R. Walter, Konvexität in riemannschen Mannigfaltigkeiten, Jahresber. DMV 83 (1981), 1–31.
89. Thomas J. Willmore, Total curvature in Riemannian geometry, Ellis Horwood, Chichester, England, 1982.
90. Joseph Albert Wolf, Spaces of constant curvature, McGraw-Hill, New York, 1967.
91. Kentaro Yano, Shigeru Ishihara, Tangent and cotangent bundles: differential geometry, Marcel Dekker, New York, 1973.
50.4. Other mathematics references
92. Robert A. Adams, Sobolev spaces, Academic Press, New York, 1975.
93. Lars Valerian Ahlfors, Complex analysis, an introduction to the theory of analytic functions of one complex variable, 2nd. ed., McGraw-Hill Kogakusha, Tokyo, 1966.
94. G. E. Bredon, Sheaf Theory, McGraw-Hill, New York, 1967.
95. G. E. Bredon, Introduction to compact transformation groups, Academic Press, New York, 1972.
96. Daniel Bump, Lie groups, Springer, New York, 2004.
97. Claude Chevalley, Introduction to the theory of algebraic functions of one variable, Amer. Math. Soc., New York, 1951.
98. Claude Chevalley, Theory of Lie groups I, Princeton University Press, Princeton, 1946.
99. CRC standard mathematical tables, 27th. ed., (ed. William H. Beyer), CRC Press, Boca Raton, Florida, 1964, 1981, 1984.
100. Charles W. Curtis, Linear algebra, 2nd. ed., Allyn and Bacon, Inc., Boston, 1968.
101. René Descartes, Géométrie, ?, Paris, 1637.
102. Tammo tom Dieck, Transformation groups and representation theory, Lecture notes in Math., 766, Springer, Berlin, 1979.
103. Leonhard Euler, Introductio in analysin infinitorum, Marcus-Michael Bousquet, Lausanne, 1748.
104. Leonhard Euler, Leonhardi Euleri opera omnia, first series, volume 9, (ed. Andreas Speiser), B. G. Teubner, 1945.
105. Herbert Federer, Geometric measure theory, Springer, Berlin, 1969.
106. William Feller, An introduction to probability theory and its applications, volume 1, 3rd. ed., John Wiley & Sons, New York, 1950, 1957, 1968.
107. William Fulton, Algebraic curves, an introduction to algebraic geometry, W. A. Benjamin, New York, 1969.
108. William Fulton, Joe Harris, Representation theory: A first course, Springer, New York, 1991, 2004.
109. David Gilbarg, Neil Sidney Trudinger, Elliptic partial differential equations of second order, 2nd. ed., Springer, New York, 1983.
110. David Gilbarg, Neil Sidney Trudinger, Elliptic partial differential equations of second order, 3rd. ed., Springer, New York, 1998, 2001.
111. Robert Gilmore, Lie groups, Lie algebras, and some of their applications, Wiley, New York, 1974.
112. I. S. Gradstein, I. M. Ryzhik, Tables of series, products, and integrals, Verlag Harri Deutsch, Thun, Germany, 1981, translated from «Таблицы интегралов, сумм, рядов и произведений», И. С. Градштейн, И. М. Рыжик, Наука, Москва, 1971.
113. Marvin J. Greenberg, John R. Harper, Algebraic topology, a first course, Benjamin/Cummings, London, 1981.
114. Alexandre Grothendieck, J. A. Dieudonné, Éléments de géométrie algébrique I, Springer, Berlin, 1971.
115. B. Hartley, T. O. Hawkes, Rings, modules and linear algebra, Chapman and Hall, London, 1970.
116. Lester La Verne Helms, Introduction to potential theory, Robert E. Krieger Publishing Company, Huntington, New York, 1975.
117. John L. Kelley, General topology, Van Nostrand Company, Princeton, 1955. (Republished: Graduate Texts in Mathematics 27, Springer-Verlag, New York 1975.)
118. Alan U. Kennington, An improved convexity maximum principle and some applications, Ph.D. Thesis, University of Adelaide, South Australia, 1984.
119. Alan U. Kennington, Power concavity and boundary value problems, Indiana Univ. Math. J. 34 (1985), 687–704.
120. Alan U. Kennington, Convexity of level curves for an initial value problem, J. Math. Anal. Appl. 133 (1988), 324–330.
121. Kleine Enzyklopädie Mathematik, 2nd. ed., (ed. W. Gellert, H. Küstner, M. Hellwich, H. Kästner), Verlag Harri Deutsch, Thun und Frankfurt/M, 1984.
122. Nicholas J. Korevaar, Capillary surface convexity above convex domains, Indiana Univ. Math. J. 32 (1983), 73–81.
123. Erwin Kreyszig, Advanced engineering mathematics, 9th. ed., John Wiley & Sons (Wiley International Edition), New York, 2006.
124. Olga Alexandrovna Ladyzhenskaya, Nina N. Ural’tseva, Linear and quasilinear elliptic equations, Academic Press, New York, 1968, translated from «Линейные и квазилинейные уравнения эллиптического типа», О. А. Ладыженская, Н. Н. Уральцева, Наука, Москва, 1964.
125. Serge Lang, Introduction to algebraic geometry, Interscience, New York, 1958.
126. Carlo Miranda, Partial differential equations of elliptic type, Springer, New York, 1970.
127. Frank Morgan, Geometric measure theory, a beginner’s guide, Academic Press, New York, 1988.
128. Hanna Neumann, Schwartz distributions, Notes in Pure Mathematics 3, Department of Pure Mathematics, ANU, Canberra, 1969.
129. Blaise Pascal, Traité dv triangle arithmetiqve, avec qvelqves avtres petits traitez svr la mesme matiere, Guillaume Desprez, Paris, 1665. URL: http://www.lib.cam.ac.uk/deptserv/rarebooks/PascalTraite/
130. Lev Semenovich Pontryagin, Topological groups, 1st. ed., ?, ?, 1939.
131. Lev Semenovich Pontryagin, Topological groups, 2nd. ed., Gordon & Breach, ?, 1954, 1966.
132. Lev Semenovich Pontryagin, Topological groups, (German translation) 2nd. ed., ?, ?, 1957–58.
133. Lev Semenovich Pontryagin, Topological groups, 2nd. ed., ?, ?, 19??. (Republished in “Selected works”)
134. Fritz Reinhardt, Heinrich Soeder, dtv-Atlas zur Mathematik, Deutscher Taschenbuch Verlag, München, 1974, 1977, 1987, 1990.
135. Alex P. Robertson, Wendy J. Robertson, Topological vector spaces, 2nd. ed., Cambridge University Press, London, 1964, 1973.
136. Walter Rudin, Principles of mathematical analysis, 2nd. ed., McGraw-Hill, New York, 1964.
137. Walter Rudin, Functional analysis, 2nd. ed., McGraw-Hill, New York, 1973, 1991.
138. Abraham Seidenberg, Elements of the theory of algebraic curves, Addison-Wesley, Reading, Massachusetts, 1968.
139. George F. Simmons, Introduction to topology and modern analysis, McGraw-Hill, Tokyo, 1963.
140. Isadore Manuel Singer, John A. Thorpe, Lecture notes on elementary topology and geometry, Springer, New York, 1967/76.
141. Spencer, Overdetermined systems of linear partial differential equations, Bull. AMS 75 (1969), 179–239.
142. Murray R. Spiegel, Mathematical handbook of formulas and tables, McGraw-Hill Book Company, New York, 1968.
143. Michael David Spivak, Calculus, W. A. Benjamin, New York, 1967.
144. Samuel James Taylor, Introduction to measure and integration, Cambridge University Press, London, 1973.
145. John A. Thorpe, Elementary topics in differential geometry, Springer, Berlin, 1979.
146. François Treves, Basic linear partial differential equations, Academic Press, New York, 1975.
147. Bartel Leendert van der Waerden, Einführung in die algebraische Geometrie, Springer, Berlin, 1939.
148. R. J. Walker, Algebraic curves, Dover, New York, 1962[1950?].
149. André Weil, Foundations of algebraic geometry, Amer. Math. Soc., New York, 1946, 1962.
150. Kôsaku Yosida, Functional analysis, 6th. ed., Springer-Verlag, Berlin, 1965, 1971, 1974, 1978, 1980, 1995.
151. Oscar Zariski, Pierre Samuel, Commutative algebra, Van Nostrand, Princeton, N.J., 1958.
152. Ernst Friedrich Ferdinand Zermelo, Untersuchungen über die Grundlagen der Mengenlehre I, Math. Ann. 65 (1908), 261–281.
50.5. Physics
153. Vladimir Igorevich Arnold, Mathematical methods of classical mechanics, Graduate texts in Mathematics 60, Springer, New York, 1978.
154. Claude Cohen-Tannoudji, Bernard Diu, Franck Laloë, Quantum Mechanics, Wiley-Interscience, New York, 1977, translated from Mécanique quantique, Hermann, Paris, 1977.
155. CRC handbook of chemistry and physics, (ed. Robert C. Weast), CRC Press, Boca Raton, Florida, 1988.
156. H. A. Lorentz, Electromagnetic phenomena in a system moving with any velocity smaller than that of light, Proc. Acad. Sci. Amsterdam 6 (1904), 809–835.
157. Albert Abraham Michelson, Edward Williams Morley, On the relative motion of the Earth and the luminiferous ether, Amer. Journ. Sci. 34 (1887), 333–345.
50.6. Logic and set theory
158. Lewis Carroll, Symbolic logic and the game of logic, Dover Publications, New York, 1896, 1958.
159. Paul Richard Halmos, Naive set theory, Springer, New York, 1974.
160. John Harrison, Formal proof—theory and practice, Notices of the American Mathematical Society 55 (2008), 1395–1406. URL: http://www.ams.org/notices/200811/index.html
161. Stephen Cole Kleene, Introduction to metamathematics, Van Nostrand, 1952.
162. Stephen Cole Kleene, Mathematical logic, Wiley, 1967.
163. Edward John Lemmon, Beginning Logic, Thomas Nelson, London, 1965, 1971.
164. Elliott Mendelson, Introduction to mathematical logic, D. Van Nostrand, New York, 1964.
165. Tobias Nipkow, Lawrence C. Paulson, Markus Wenzel, Isabelle/HOL: A proof assistant for higher-order logic, Springer-Verlag, Berlin, 2008. URL: http://www4.in.tum.de/~nipkow/LNCS2283/
166. Chris E. Mortensen, Inconsistent mathematics, Springer-Verlag, New York, 1994. ISBN 9780792331865.
167. Bertrand Arthur William Russell, Alfred North Whitehead, Principia mathematica, Volumes I–III, Cambridge University Press, Cambridge, 1910–1913.
168. Joseph Robert Shoenfield, Mathematical logic, Association for Symbolic Logic, AK Peters, Natick, Massachusetts, 1967, 2001. ISBN 1-56881-135-7.
50.7. Anthropology and linguistics
169. Leslie C. Aiello, Robin Ian MacDonald Dunbar, Neocortex size, group size, and the evolution of language, Current Anthropology 34 (1993), 184–193.
170. Robin Ian MacDonald Dunbar, Coevolution of neocortical size, group size and language in humans, Behavioral and Brain Sciences 16 (1993), 681–735.
171. William A. Foley, Anthropological linguistics: an introduction, Blackwell Publishers Ltd., Malden, Massachusetts, 1997, 2002.
172. George Lakoff, Rafael E. Núñez, Where mathematics comes from: how the embodied mind brings mathematics into being, Basic Books, New York, 2000. ISBN 978-0-465-03771-1.
173. Steven J. Mithen, The prehistory of the mind: The cognitive origins of art and science, Thames and Hudson, London, 1996, 1999. ISBN 0-500-28100-9.
174. Bronislaw Kasper Malinowski, The problem of meaning in primitive languages, in “The meaning of meaning”, (ed. C. Ogden, I. Richards), pp. 296–336, Harcourt, Brace and World, New York, 1923.
175. Nicholas Ostler, Empires of the word: A language history of the world, Harper Perennial, London, 2005, 2006.
176. Bruce Richman, Some vocal distinctive features used by gelada baboons, Journal of the Acoustical Society of America 60 (1972), 718–724.
177. Bruce Richman, The synchronization of voices by gelada monkeys, Primates 19 (1978), 569–581.
178. Bruce Richman, Rhythm and melody in gelada vocal exchanges, Primates 28 (1987), 199–223.
50.8. Philosophy and ancient history
179. Ronald W. Clark, The life of Bertrand Russell, Penguin Books, Harmondsworth, England, 1975, 1978.
180. Henry Bernard Cotterill, Ancient Greece: myth & history, Geddes and Grosset, New Lanark, Scotland, 1913, 2004.
181. Albert Einstein, Essays in science, Wisdom Library, Philosophical Library, New York, 1934, translated from Mein Weltbild, Querido Verlag, Amsterdam, 1933.
182. Albert Einstein, Aus meinen späten Jahren, Ullstein, Frankfurt am Main, 1950, 1993. ISBN 3548347215.
183. Colin McEvedy, The new Penguin atlas of ancient history, Penguin Books, London, 1967, 2002.
184. Stephen Palmquist, Kant on Euclid: Geometry in Perspective, Philosophia Mathematica II 5:1/2 (1990), 88–113. URL: http://www.hkbu.edu.hk/~ppp/srp/arts/KEGP.html
185. Bertrand Arthur William Russell, History of Western philosophy, George Allen & Unwin, London, 1946, 1961, 1974.
186. J. A. K. Thomson, Hugh Tredennick, Jonathan Barnes, The ethics of Aristotle: the Nicomachean ethics, Penguin Books, London, 1953, 1976, Aristotle.
50.9. History of mathematics
187. Walter William Rouse Ball, A short account of the history of mathematics, Dover, New York, 1893, 1908, 1960.
188. Petr Beckmann, A history of π, St. Martin’s Press, New York, 1971.
189. Eric Temple Bell, The development of mathematics, 2nd. ed., McGraw-Hill, New York, 1940, 1945.
190. Eric Temple Bell, Men of mathematics, Simon & Schuster, New York, 1937, 1965, 1986.
191. W. F. Bynum, E. J. Browne, Roy Porter, Dictionary of the history of science, Princeton University Press, Princeton, New Jersey, 1985.
192. J. B. Dubbey, The introduction of the differential notation in Great Britain, Annals of Science 19 (1963), 37–48.
193. Dirk Jan Struik, A concise history of mathematics, Dover, New York, 1948, 1967, 1987. ISBN 0-486-60255-9 (pbk.).
50.10. Other references
194. Beowulf, Penguin, Harmondsworth, England, 1973, 1977, translated by Michael Alexander.
195. Beryl T. Atkins, etc., Collins-Robert French-English, English-French dictionary, 3rd. ed., HarperCollins, London, 1993.
196. Graham Chapman, Terry Jones, Terry Gilliam, Michael Palin, Eric Idle, John Cleese, Monty Python and the Holy Grail (book), Methuen, London, 1977, 1989.
197. Mark Collier, Bill Manley, How to read Egyptian hieroglyphs: a step-by-step guide to teach yourself, The British Museum Press, London, 1998, 1999, 2003.
198. Angel García de Paredes, Cassell’s Spanish-English English-Spanish Dictionary, Cassell, London, 1978.
199. Günther Drosdowski, Duden Deutsches Universalwörterbuch, Dudenverlag, Mannheim, 1989.
200. Maurits Cornelis Escher, The world of M. C. Escher, Harry N. Abrams, New American Library, New York, 1971, 1974.
201. Karl Feyerabend, Langenscheidt’s pocket Greek dictionary: Greek-English, 7th. ed., Langenscheidt, Berlin, 1963.
202. The epic of Gilgamesh, (ed. Andrew George?), Penguin, London, 1999, 2003.
203. The Guinness encyclopedia, (ed. Ian Crofton), Guinness Publishing, Enfield, Middlesex, England, 1990.
204. Patrick Hanks, etc., Collins dictionary of the English language, 2nd. ed., Collins Publishers, Sydney, 1979, 1986.
205. Brian W. Kernighan, Dennis M. Ritchie, The C programming language, 2nd. ed., Prentice Hall, Englewood Cliffs, New Jersey, 1978, 1988.
206. Donald Ervin Knuth, The TeXbook, Addison Wesley Publishing Company, Reading, Massachusetts, 1984, 1986, 1991.
207. Vladimiro Macchi, I Dizionari Sansoni Inglese-Italiano Italiano-Inglese, 2nd. ed., Sansoni Editore, Firenze, 1983.
208. Alfred Mann, The study of counterpoint, W.W. Norton, New York, 1965, 1943, 1971, translated from Gradus ad Parnassum, Johann Joseph Fux, Austrian Empire, Vienna, 1725.
209. Shu Lin, Daniel J. Costello, Error control coding: Fundamentals and applications, Prentice-Hall, Englewood Cliffs, New Jersey, USA, 1983. ISBN 0-13-283796-X.
210. Arthur P. Norton, Norton’s star atlas, 16th. ed., (ed. Gilbert E. Satterthwaite), Gall & Inglis, Edinburgh, 1910, 1973.
211. C. T. Onions, etc., The shorter Oxford English dictionary on historical principles, Oxford University Press, Oxford, 1933, 1973, 1992.
212. John G. Proakis, Digital Communications, 3rd. ed., McGraw-Hill, New York, 1983, 1989, 1995.
213. Alain Rey, Josette Rey-Debove, etc., Le Petit Robert 1, dictionnaire alphabétique et analogique, Le Robert, Paris, 1983.
214. Ruth Schumann-Antelme, Stéphane Rossini, Illustrated hieroglyphics handbook, Sterling Publishing, New York, 2002, translated from Lecture illustrée des hiéroglyphes, Éditions du Rocher, Paris, 1998.
215. Gerhard Wahrig, Deutsches Wörterbuch, Bertelsmann Lexikon Verlag, Gütersloh, 1991.
216. Larry Wall, Tom Christiansen, Randal L. Schwartz, Programming Perl, O’Reilly & Associates, Sebastopol, California, 1991, 1996.
217. John T. White, A complete Latin-English and English-Latin dictionary for the use of junior students, Longmans, Green and Co., London, 1889.
218. Beowulf: with the Finnesburg fragment, (ed. C. L. Wrenn, W. F. Bolton), University of Exeter, 1953, 1958, 1973, 1988.
Chapter 51 Index
a-fortiori, 4.8.6, 18.5.15 a-priori geometry, 3.4.4 a-priori knowledge, 2.1.3, 2.5.4 a-priori mathematics, 2.12.1, 6.0.3 abbreviations, 49.2 Abel, Niels Henrik, 9.2.1, 9.2.21, 20.13.22, 46.1.5 Abelian group, 9.2.21 abgeschlossene H¨ ulle, 14.4.6 aborigine, Australian, 2.2.7 absolute knowledge, 2.5.14 absolute value function, 8.6.1 abstract direct sum of linear spaces, 10.6.3 abstract discussion context, 4.3.1 abstract groups, 9.4.1 abstract logic, 3.9.6, 4.1.11 abstract logic, crisp perfection, 3.4.3 abstract-to-conrete map, logical operation, 3.13.1 abstract variable name space, 2.11.17 abstraction, three stages, 3.9.6 absurdity, 3.11.2 absurdly large infinity, 2.0.3 absurdum, reductio ad, 3.10.12, 3.11.2 Ab¯ u Ja’far Mohammed ibn M¯ us¯ a, 46.2.3 abundance of integers, 2.11.11 AC (axiom of choice), 5.0.9 AC-enhanced theorem, 5.9.1 AC-tainted theorem, 5.9.1, 7.2.27, 7.8.4 AC-tainted theorem example, 10.2.25, 20.1.4 AC-tainted theorem examples, 5.9.10 Academy, 46.2.1 acceleration of curve, covariant, 38.1.4 acceptance/rejection model, proposition, 3.7.2 accumulation point, 14.6.1 Achilles and the tortoise, 2.11.3, 14.0.4, 25.2.3 acknowledgements, 1.8 action, differential, 34.8.0 action, infinitesimal, 36.3.1 action, right, 23.9.2 action of group, differentiable, 35.3.0 action of vector, 33.1.2, 33.1.8, 33.1.16 action of vector field, 33.1.3, 33.1.11, 33.1.17 active set, 9.9.0 acyclic graph, 5.7.19 acyclic network, concepts, 2.1.7 ad-hoc kludge, 5.7.27 ADC (Analogue Digital Conversion), 2.4.3 addition, vector, 10.1.2 adherent set, 14.4.6 adjoint, etymology, 9.11.24 [ www.topology.org/tex/conc/dg.html ]
adjoint Lie algebra, 9.11.22 adjoint representation of Lie algebra, 9.11.19 adjunct, etymology, 9.11.24 advocate, devil’s, 3.10.5 aether, luminiferous, 24.1.4 affine connection, 12.0.1, 37 affine connection, preview, 25.5 affine connection, tensor calculus, 41.3 affine connection, terminology, 37.1 affine connection on principal fibre bundle, 37.8, 37.8.1 affine connection on principal fibre bundle, coefficients, 37.9 affine connection on tangent bundle, 37.3, 37.3.2 affine connection on tangent bundle, differentiability, 37.3.4 affine-invariant geometry, 46.3.0 affine path, 38.0.2 affine space, 12, 12.0.1, 12.2.1 affine space, etymology, 46.3 affine space definitions, 12.2 affine spaces discussion, 12.1 affine structure, 12.2.1 affine transformation, 12.0.1, 37.8.2 affine transformation group, 46.3.0 affinely connected manifold, 36.1.2 affinely parametrized geodesic on two-sphere, 42.10 Ages, Dark, 42.0.1 aggregate, uncountable, name, 2.10.1 Akkadian, 3.5.2 al Khw¯ arizmi, 46.2.3, 46.2.4 Alberti, Leone Battista, 46.1.3 Alcibiades, 3.4.2 Alexander the Great, 3.4.2 algebra, 9 algebra, alternating, 13.10.3 algebra, alternating tensor, 13.10 algebra, associative, 9.10 algebra, Boole’s, 3.6.3 algebra, Boolean, 4.0.1, 4.4.1 algebra, etymology, 3.8.3, 46.2.4 algebra, exterior, 13.10.2 algebra, general linear, 9.10.4 algebra, general tensor, 13.8 algebra, Grassman, 13.0.6 algebra, inverse problem, 4.4.2 algebra, Lie, 9.11, 9.11.1 algebra, Lie, real, 9.11.11 algebra, linear, 10 algebra, logic, 3.1.2 algebra, logical, 4.14.4 algebra, logical, formalized, 3.1.2 [ draft: UTC 2009–3–21 Saturday 11:36 ]
Differential geometry reconstructed.Copyright (C) 2009, Alan U. Kennington. All Rights Reserved.The author hereby grants permission to print this book draft in A4 format.Printing in all other formats is forbidden.You may not charge any fee for copies of this book draft.
The references in this index are not page numbers. (Page numbers will be added in future.) The references are of three kinds: chapter number (simple integer), section number (two dotted integers) or subsection number (three dotted integers). A subsection may be a theorem, definition, remark or other text unit with three dotted integers. Subsections which are underlined are definitions.
51. Index
algebra, matrix, 11 algebra, mixed tensor, 13.8.14 algebra, multilinear, 13.0.6 algebra, predicate, 4.4.1 algebra, propositional, 4.4.1 algebra, rectangular matrix, 11.1 algebra, symbolic, 10.10.4 algebra, tensor, 13, 13.8.6 algebra of sets, 5.13, 5.14 algebra representations, 9.10.6 algebraic closure of group, 9.2.2 algebraic operation symbol, 9.9.1 algebraic real number, 14.1.8 algebraic style, symbolic logic, 4.0.2 algebraic system, 4.1.1, 9.9.0 algebraic topology, 14.1.3, 15.4.24, 16.2.2, 16.6, 25.2.9 algorithm, etymology, 46.2.3 algorithms, 2.10.10 alien minds, 2.5.6 all sets, 5.7.23 allocation, dynamic memory, 2.10.10 allowed homomorphism, 9.9.10 Alpha Centauri, 4.12.1 alphabet, Greek, 14.2.13 alternating algebra, 13.10.3 alternating form, 13.9.4 alternating form bundle, cross-section, 20.5.4 alternating form bundle on Euclidean space, 20.5.2 alternating symbol, Levi-Civita, 7.9.1, 7.10.20 alternating tensor, 13.9 alternating tensor algebra, 13.10 alternating tensor product, 13.9.6 alternative denial, 4.3.6, 4.6.4 alternative denial operator, 4.3.8 always-false proposition, 4.3.20 always-true proposition, 4.3.20 always-true set-theoretic formula, 6.5.16 amazing coincidence, 3.4.5 ambiguity, left/right transformation group, 2.5.4, 5.16.6, 34.8.8 ambiguity, zero tangent operator, 2.6.5, 28.5.12, 28.6.1 ambiguous conjunct, logical expression, 4.3.12 American Mathematical Society, 1.8 analogue digital conversion, 2.4.3 analysis, etymology, 14.7.1 analysis, local continuity, 14.1.2 analysis, numerical, 2.11.6 analysis on Euclidean space, 43.10 analysis situs, 14.1.1, 46.2.17 analytic atlas, 27.9.1 analytic chart, 27.9.3 analytic equivalent atlas, 27.9.4 analytic fibre bundle for Lie left transformation group, 35.2.5 analytic function space, 45.6 analytic logic, 3.9.7 analytic manifold, 27.9, 27.9.2 analytic real function, 8.7.3 anchor chart, 30.0.4, 37.4.3 ancient Greece, 2.1.6 ancient Greek logic, 3.7.11 ancient Greeks, 3.2.8, 3.7.18, 5.0.2 ancient history, law, 
3.10.14 ancient literature, logic in, 3.5 ancient literature, logical language, 46.4 ancient Olympic games, 3.2.8 ancient Roman, 2.5.8 and, 3.7.7, 4.3.3 and-introduction rule, 4.6.2 [ www.topology.org/tex/conc/dg.html ]
angle brackets, 10.8.7 angles, Euler’s, 42.8.2 Anglo-Saxon, 3.5.4 Anglo-Saxon, otherwise, 3.7.17 animal, cognition, 5.7.17 animal, domesticated, 2.2.6 animal, half, 1.4.7 animal, multi-celled, 25.2.10 animal, world model, 5.7.25 animal communication, 2.5.1 animal learning, 7.2.1 animal logic, 3.5.10 animal mind, 2.3.3, 3.4.5 annulus, closed, 17.1.17 annulus, open, 17.1.17 Anschauung, 3.4.4 answer, yes/no, 3.9.5 ant, 1.9.1, 2.10.3, 2.10.7 antecedent subexpression, 4.3.14 antediluvian soup, 5.1.21 anthropocentric, 2.2.6 anthropological linguistics, 2.2.4 anthropological observable, 3.4.1 anthropologist, 2.2.7, 2.2.8 anthropology, 2.5.1, 2.5.17, 3.0.1 anthropology of logic, 3.2.8 anthropology of mathematics, 2.2.5 anthropomorphic principle, 24.1.4 anti-derivative, 21.0.1 anti-reflexive relation, 5.7.10 anticommutative product, 9.11.0 Antiphon the Sophist, 46.1.1 antipodal points on two-sphere, 42.9.4 antisymmetric multilinear effect of vector sequence, 13.0.4 antisymmetric multilinear map, 13.3, 13.3.3 antisymmetric relation, 5.7.10 Apollonius of Perga, 46.1.1 applicability, logic, 3.9.8 application, logic, 3.3.3 application rule, theorem, 4.9.2 applied mathematician, 2.9.9 arc, 16.1.1 arc, Jordan, 16.2.11 arc, open, 16.2.9 archaeologist, future, 2.3.4, 2.3.5 Archimedes of Syracuse, 20.1.1, 46.1.1, 46.1.3 architect, 5.0.6 architecture, 1.4.7 arctangent function, two-parameter, 20.13.6 Arctic, 3.10.6 area, directed, 25.4.2 argument, logical, 3.7.8 argument of function, 6.5.11 argumentation, logical, 4.4 Aristotelian logic, 3.4.2, 3.11.9 Aristotle, 3.4.2, 46.1.1 arithmetic, unsigned integer, 7.4 arithmètic arts, 2.9.6 arithmètic equivalent, logic operator, 4.3.15 arithmètic triangle, 7.11.6 arrow, Peirce, 4.3.6, 4.3.8 arrow, portable, 10.1.6 arrow of time, 2.2.1, 2.2.6 art, logic, 3.2.3 art, non-representational, 3.1.3 artificial intelligence, 2.3.1 ass, 3.5.3 assertion, 4.5.7
assertion, delayed, 3.7.17 assertion, double-negative, 3.7.4 assertion, etymology, 3.7.1 assertion symbol, 4.5.8, 4.6.6, 46.2.18 assertion symbol, two-way, 4.5.9 assertion trigger, 3.7.17, 3.10.14 assertions, uninteresting, 4.8.1 associated differentiable fibre bundle, 35.6, 35.6.3, 35.6.7, 35.6.10 associated fibre bundle, 35.9.4 associated parallelism, 24.3 associated topological fibre bundle, 23.10, 23.10.5, 23.10.9 associated topological fibre bundle, construction, 23.11 associated topological fibre bundle, orbit space method, 23.12, 23.12.3 associated topological pathwise parallelism, 24.3.2 associative algebra, 9.10 associative algebra, associated Lie algebra, 9.11.10 associative algebra, linear representation, 9.10.7 associative algebra over commutative unitary ring, 9.10.1 associativity, logical operator, 4.3.12 associativity rule, 4.3.12 asterisk method of learning, 1.10.1 asteroid, 2.5.6 astronomical spherical coordinates, 42.1.6 astronomy, 2.9.8 astronomy, Ptolemaic, 3.4.2 asymmetric logical operator, 4.6.2 Athens, 3.4.2 atlas, analytic, 27.9.1 atlas, analytic equivalent, 27.9.4 atlas, differentiable, 27.2, 27.2.2, 27.2.4 atlas, fibre, 22.1.7 atlas, manifold, differentiable, 27.2 atlas, maximal, topological, 26.4.15 atlas, standard, for Euclidean space, 27.3.1 atlas, tangent bundle, 28.2.1 atlas, topological, 26.4, 26.4.6 atlas, usual, for Euclidean space, 27.3.1 atlas direct product, 26.5.3 atlas of curves for a path, 16.1.8 atlas of tangent bundle total space, 28.8.4 atlas product, 26.5.3 atom, 2.12.3 atoms, Universe, 2.11.15 attributes, database object, 2.4.4 attributes, set, 5.2.6 audio cable connectors, 2.5.13 audio system, positive feedback, 3.3.7 Australian aborigine, 2.2.7 Auswahlaxiom, 5.9.3 aut, 4.3.5 automated logic, 4.3.10 automorphism, idempotent linear, 10.1.5 automorphism, inner, 9.3.22 automorphism, Lie algebra, 9.11.8 automorphism, linear space, 10.3.6 axiom, comprehension, ZF, 5.4.3 axiom, empty set, 5.3.2 axiom, extension, ZF, 5.1.3, 
5.2 axiom, extensionality, 5.2.1 axiom, infinity, ZF, 2.11.13, 5.6 axiom, naive comprehension, 5.7.2, 5.7.6, 5.7.8 axiom, power set, 5.3.5 axiom, reflexivity of equality, 4.15.1 axiom, regularity, 5.7.27 axiom, regularity, ZF, 5.5, 5.7.19 axiom, replacement, ZF, 5.4
axiom, separation, ZF, 5.4.3 axiom, set existence, ZF, 5.1.17, 5.1.18 axiom, set theory, CC, 5.10.1 axiom, singleton, 5.3.4 axiom, specification, 4.12.6 axiom, specification, ZF, 5.4.1, 5.4.2, 5.7.19, 48 axiom, substitution of equality, 5.2.5 axiom, substitutivity of equality, 4.15.1 axiom, union, 5.3.4 axiom, unordered pair, 5.3.3 axiom, ZF, productive, 5.1.17, 5.1.19 axiom of choice, 2.10.3, 2.12.3, 4.13.10, 5.9, 5.9.6, 5.11.5, 5.15, 6.9.5, 7.8.0, 10.2.24, 14.1.9, 14.8.9, 15.1.5, 15.7.10, 17.3.26, 23.12.5 axiom of choice, useless, 5.1.18 axiom of choice and Lebesgue measure, 20.1.3 axiom of comprehension, 5.11.2 axiom of countable choice, 5.9.1, 5.9.11, 5.10, 5.10.1, 5.10.2, 5.10.3, 7.8.3, 7.8.4 axiom of dependent choice, 5.0.10 axiom of foundation, 5.5.1 axiom of infinity, 2.10.6, 2.10.7 axiom of reckless comprehension, 5.7.24 axiom of replacement, 5.11.1 axiom of separation, 5.11.2 axiom of specification, 5.11.1 axiom of subsets, 5.11.2 axiom of substitution, ZF, 5.11.3 axiom schema, 4.5.2 axiom style, ZF, 5.6.2 axiom system, classical, 2.8.3 axiom system, modern, 2.8.3 axiom system, self-consistency, 3.11.4 axiom systems, summary of experience, 2.5.18 axiom template, 5.11.5 axiomatic approach, definitions, 5.0.4 axiomatic method of definition, 2.8.1, 2.8.2 axiomatic reformulation of logic, 3.13.4, 7.13 axiomatic specification, 2.7.2 axiomatic system, 2.5.13, 4.5.2 axiomatic system, incomplete, 3.8.2 axiomatic system, spartan, 4.6.4 axiomatic system for propositional calculus, 4.7.4 axiomatization, 5.0.7 axiomatization, credibility, 3.2.6 axiomatization, Euclid’s geometry, 3.2.7 axioms, baseless, 2.2.8 axioms, Euclidean geometry, 2.1.9 axioms, inconsistent, 4.1.9, 4.12.7 axioms, linear space, 2.8.7 axioms, non-standard, 5.0.10 axioms, Peano, 2.5.13, 7.3.3 axioms, set theory, Zermelo-Fraenkel, 5.1.26 axioms, ZF set theory, 5.3 axioms and constructions, 2.8 Babbage, Charles, 18.2.11 baboon, gelada, 2.2.4 Babylon, 3.4.2 Babylonian geometry, 39.1.5 background 
claim, 3.5.4 background proposition, 3.6.1, 3.7.3 backwards-deductive search, 4.8.5 bacteria, 2.2.1 ball, closed, 17.1.7, 17.3.9 ball, closed, punctured, 17.1.17 ball, closed, with zero radius, 17.1.10 ball, open, 17.1.7, 17.3.8 ball, open, punctured, 17.1.17
ball, tennis, 2.10.10 ball centre, 17.1.7 ball radius, 17.1.7 Banach, Stefan, 46.1.6 Banach space, 10.2.22 bandwidth, finite, 2.5.13 bandwidth, finite, human communication, 2.11.14 barking dog, Harry’s, 3.6.3 Barrow, Isaac, 46.1.3 base, open, 14.10, 14.10.2, 15.6.1 base point of tangent operator, 28.5.1 baseless axioms, 2.2.8 basis, dual, canonical, 10.5.7 basis, tangent, 28.12.1 basis existence, linear space, 5.9.10, 5.9.12, 10.2.25 basis for linear space, 10.2.23 basis for linear space, finite, 10.2.9 basis vector, 10.2 basis vector, coordinate, 28.7.4, 28.12 bedrock of knowledge, 2.0.1 bedrock of mathematics, 2.1 bedrock of physics, 2.12.3 behaviour, grooming, 2.2.4 behaviour control, social, 3.7.12 behaviourism, 2.5.20 belief, 3.10.15 beliefs, 3.4.3 Beltrami, Eugenio, 41.1.1, 46.1.6 Bēowulf, 3.5.4, 3.12.3 Bēowulf, logical language, 46.4.2 Bernays, Paul Isaak, 2.9.7, 46.1.6 Bernays-Gödel set theory, 5.1.6, 5.5.1, 5.12 BG (Bernays-Gödel), 5.0.9 BG set theory, 5.12 Bianchi identities, 37.7.9 bibliography, 50 biconditional expression, 4.3.14 bidirectional derivative, 19.6.2 bidirectionally differentiable function, 19.6.3 bidual of linear space, 10.5.19 big bang, 40.4 big ideas, 1.4.10 biggest number, 5.7.18 bijection, 6.5.23, 25.2.9 bijective function, 6.5.23 bilinear map, canonical, 13.4.3 bilinear map, universal, 13.4.3 binary number, 7.5.8 binary numbers, 2.10.6 binary operator, 4.3.12 binary set intersection properties, 5.13 binary set union properties, 5.13 biology, discipline, 2.5.17 bird, 2.0.4 bird, tree, 5.7.25 birth, 2.4.2, 2.11.19 bistable transistor circuit, 4.1.1 black hole, 40.3 black number, 2.10.5, 2.11.12 blindingly obvious, 1.5.2 blocking proposition, 3.10.1, 3.10.14 blow-out, unknowns, 3.8.2 bodies, communications standardization, 2.8.9 bogus theorem, 4.9.10 Bolzano, Bernard Placidus Johann Nepomuk, 46.1.5 Bolzano-Weierstraß property, 17.3.29, 17.3.30 Bolzano-Weierstraß theorem, 17.3.29 Bombelli, Rafael, 7.5.11, 46.1.3, 46.2.7
bone-setting, 46.2.4 Bonn, 0.0 book, definition-centric, 1.4.5 books, other people, 50.1 Boole’s algebra, 3.6.3 Boolean algebra, 4.0.1, 4.4.1 boolean formula, 48 boot-strap, 3.14.1, 4.7.7 boot-strap, integer definitions, 7.2.5 boot-strap layer, 2.1.6 boot-strapping mathematics, 2.9.6 boot-strapping of definitions, 2.1.1, 2.1.3, 6.1.1 borderline examples, 43.1.1 Borel, Félix Édouard Justin Émile, 5.9.3, 46.1.6 bottle, Klein, 44.2.2 bottleneck, information, cosmic, 2.11.17 bottleneck, naming, 2.10.1 bound of partially ordered set, lower, 7.1.10 bound of partially ordered set, upper, 7.1.10 bound variable, 5.1.24 boundary, zero-thickness, 25.2.8 boundary, zero-width, 14.0.3 boundary conditions, 2.11.5 boundary differentiability of functions of several variables, 18.5.1 boundary of boundary, 20.6.12 boundary of set, 14.5 boundary of set, open/closed portions, 14.5.9 boundary of set, topological, 14.5.4 boundary point of set, 14.0.2 boundary value problem, physical models, 21.3.0 boundary value problems, 21.3 boundary value problems, differentiability at boundary points, 18.5.1 bounded set in metric space, 17.2.7 bracket, Poisson, 9.11.12, 33.2, 33.2.10, 33.3.2, 33.5.0, 34.0.4 brackets, angle, 10.8.7 brain, human, 4.2.8 Brazilian forest, 3.2.8 bricklaying, 1.4.7 brilliant guesswork, 1.4.7 British spelling, 1.6.7 Brouwer, Luitzen Egbertus Jan, 3.11.2, 46.1.6 bucket/set metaphor, 5.7.17 bug, 4.7.7 bug fix, 3.10.4 buggy set theory, 3.10.4 building safety regulations, 5.0.6 bulk conjunctions, 3.1.2 bulk disjunctions, 3.1.2 bundle, alternating form, cross-section, 20.5.4 bundle, curve, 24.4.3 bundle, differentiable fibre, 35 bundle, line, 35.7.2 bundle, orthogonal, tangent, 39.4.9 bundle, second-order tangent, 30.3.10 bundle, tangent, 28.0.1, 28.8, 28.8.1 bundle, tangent operator, 28.9, 28.9.2 bundle, topological second-order tangent, 30.3.11 bundle, vector, 35.7, 35.7.1 Burali-Forti, Cesare, 3.2.1, 46.1.6 Burali-Forti paradox, 3.2.1, 5.7.13, 5.7.15, 7.2.3 bus,
23.1.2 būtan, Old English, 3.5.4 buzz-word, differential geometry, 25.0.1 BVP (boundary value problem), 21.0.2 calculation, 1.4.7 calculus, differential, 18
calculus, differential, and logic, 4.4.2 calculus, exterior, 13.0.6 calculus, fundamental theorem, 20.2.3, 20.3.1, 20.6.3, 20.9.1, 21.0.1 calculus, integral, 20 calculus, integral, and logic, 4.4.2 calculus, predicate, 3.1.2, 4.12, 4.13, 4.14 calculus, propositional, 3.1.2 calculus, propositional, axiomatic system, 4.7.4 calculus, propositional, formalization, 4.5 calculus, propositional, implication-based, 4.7 calculus, propositional, semantics-free, 4.5.1 calculus, propositional, theorems, 4.8 calculus, tensor, 41 calculus, vector field, 33 calculus of variations, 21.5 camel, straw, 2.11.15, 4.13.12 canonical multilinear map for tensor space, 13.4.1 canonical 1-form, 37.7.4 canonical bilinear map, 13.4.3 canonical construction, 2.8.6 canonical dual basis, 10.5.7 canonical map for tensor space, extended, 13.12.5 canonical map from linear space to second dual, 10.5.20 canonical multilinear map, 13.5.4 canonical representation, 2.8.6 Cantor, Georg Ferdinand Ludwig Philipp, 2.9.7, 3.2.1, 5.0.5, 7.2.7, 14.1.8, 46.1.6 Cantor’s paradox, 3.2.1 capacity, cranial, 2.2.4 capital, intellectual, 4.7.7 car, 23.1.2 car parts, 2.4.3, 2.4.4 Carathéodory, Constantin, 46.1.6 cardinal number comparability, 5.9.10 cardinality, 25.2.9 cardinality, uniqueness, 4.16.6 Carnot, Lazare Nicholas Marguerite, 46.1.5 Carroll, Lewis, 4.0.1 Cartan, Élie, 36.1.5, 46.1.6 Cartesian coordinates, 18.1.1, 26.1.3 Cartesian product, partial, 6.10, 6.10.1, 42.6.0 Cartesian product, sequence, 7.7 Cartesian product, standard identification map, 7.7.6 Cartesian product of family of functions, 6.9 Cartesian product of family of sets, 6.9, 6.9.1 Cartesian product of sets, 6.2, 6.2.1 Cartesian product of sets, properties, 6.2.5 Cartesian product p-norm, 10.8.1 Cartesian product projection map, 6.9.8 Cartesian space, 30.0.3 cat, sat, mat, 3.10.10 cat, Schrödinger’s, 3.2.8, 5.7.12 catalysis, 14.7.1 categories of mathematics ontologies, 2.3.8 category, rabbit, 2.11.14 cattle counting, prehistoric, 2.11.19
Cauchy, Augustin Louis, 3.11.9, 9.2.1, 14.11.1, 20.1.1, 46.1.5 Cauchy sequence of rational numbers, 8.3.5 causality violation, self-containing set, 5.7.24 caveat emptor, 4.1.1 Cayley, Arthur, 9.2.1, 10.1.8, 11.0.1, 46.1.5 CC (axiom of countable choice), 5.0.9 CC set theory axiom, 5.10.1 CC-tainted theorem, 5.9.1, 7.2.27 CC-tainted theorem example, 7.2.26, 7.2.28, 7.2.36 CC-tainted theorem examples, 5.9.10, 5.10.4 ceiling function, 8.6.9
cell, 34.2.5 Centauri, Alpha, 4.12.1 central concepts of differential geometry, 37.7.1 centralizer of group, 9.3.27 centre of ball, 17.1.7 centre of group, 9.3.27 centroid, 9.7.0 chain, cyclic, 5.7.10 chain, implication operators, 4.6.5 chain, set membership, 5.5.3 chain, singular, 25.3.6 challenge/response, 2.11.21 channel, mathematics communication, 2.5.11 chapter groups, 1.3 chapter page counts, 1.3.1 chapters, overview, 1.2 characteristic function, 7.9.4 characteristic polynomial of matrix, 11.5.7 characterization, 2.5.8 chart, analytic, 27.9.3 chart, differentiable, 27.4.6 chart, Earth, 26.1.5 chart, fibre, 22.1.5, 23.3.4 chart, per-fibre-set, 35.2.4 chart, tangent bundle, 28.2.1 chart, topological, 26.4, 26.4.2 chart for path, 16.1.8 Chasles, Michel, 46.1.5 chemistry, 2.1.5 chemistry, discipline, 2.5.17 chess, 3.4.3, 3.9.3, 4.8.1 chicken, 2.5.6, 2.5.8, 44.2.6 chicken, egg, 5.7.21 chicken-foot symbol, 5.16.4, 6.3.20 Chinese mathematics, 7.11.7 choice, dependent, axiom, 5.0.10 choice, multiple, 3.7.3, 3.9.5 choice axiom, 2.10.3, 2.12.3, 4.13.10, 5.9, 5.9.6, 5.11.5, 5.15, 6.9.5, 7.8.0, 10.2.24, 14.1.9, 14.8.9, 15.1.5, 15.7.10, 17.3.26, 23.12.5 choice axiom, countable, 5.9.1, 5.9.11, 5.10, 5.10.1, 5.10.2, 5.10.3, 7.8.3, 7.8.4 choice axiom, useless, 5.1.18 choice function, 7.8.0 choice functions without axiom of choice, 7.8 Christoffel, Elwin Bruno, 36.1.5, 46.1.6 Christoffel symbol, 37.7.8, 37.9.5, 39.4.2, 39.4.4, 41.3.1, 41.5.0 Christoffel symbol, non-tensorial, 39.4.5 Christoffel symbol, tensorization, 30.2.7 Christoffel symbol on two-sphere, 42.2.2 Christoffel symbols, 30.0.4 chronology of mathematicians, 46.1 circle, great, 42.9.2 circle on two-sphere, 42.14 circuit, bistable transistor, 4.1.1 circuit, digital electronics, 4.1.8 circuit, electronic logic, 4.5.5 circuit voltages, transistor, 3.10.6 circularity, logic and set theory, 3.3.8 civilisation, extra-terrestrial, 2.2.6 civilisations, inter-galactic, 2.0.3 claim, foreground/background, 3.5.4 clash of 
definitions, pathological, 2.6.7 class, mathematical, 5.16.2 class, proper, NBG, 5.7.23 class of objects, 5.16.7 class ontology, mathematical, 2.5.4 class operational procedure, 4.1.4
class tag, set, 5.2.6 class test procedure, 4.1.4 class/object model, 5.7.25 classes, topology, 14.1.6, 14.7.10, 15 classical axiom system, 2.8.3 classification, global connectivity, 14.1.2 classification of set, 6.4.5 classification of topologies, 25.2.9 clay tablet, Mycenaean, 2.3.5 Clifford, William Kingdon, 40.2.1, 46.1.5 closed curve, simple, 16.2.11 closed annulus, 17.1.17 closed ball, 17.1.7, 17.3.9 closed ball, punctured, 17.1.17 closed ball with zero radius, 17.1.10 closed curve, 16.2.11 closed interval, 8.3.10 closed-open interval, 8.3.10 closed path, 16.4.7 closed path, simple, 16.4.8 closed-point topology, trivial, 14.7.6 closed portion of boundary of set, 14.5.9 closed set, 14.2.12 closed set symbol F, 14.2.13 closure, concrete proposition domain, 4.9.1 closure, exterior, topology, 14.5.1 closure, topological, 14.4.4 closure, topology, notation, 14.4.6 closure of set, 14.4 closure of set unions under arbitrary unions, 5.15 cloth, 2.5.14 cloud of statistical variations, 2.3.9 cluster point, 14.6.1 Code of Laws, Hammurabi, 3.5.3 codomain of relation, 6.3.6 coefficient, 28.0.6 coefficient tuple, 28.5.1 coefficient tuple of tagged tangent operator, 28.6.2 coefficient vector, 28.5.1 coefficient vector of tagged tangent operator, 28.6.2 coefficients, tensorization, 39.4.5 coefficients of affine connection on principal fibre bundle, 37.9 cognitive science, 5.7.17 cognitive theory, 2.11.14 coherence, recursive model, 3.3.6, 3.3.7 coherent, etymology, 2.1.7 coherent network of concepts, 2.1.7 cohesion, social, 2.2.4 coin, two-sided, proposition analogy, 3.7.4 coincidence, amazing, 3.4.5 collection, 5.1.5 colloquial logic confusion, 3.5.7, 3.13.7 colonies, Spanish, 2.2.7 colour perception, 3.4.5 column vector map, 11.1.10 combination, 7.11 combination, convex, 10.2.27, 12.1.0, 12.2.4, 38.5, 41.5.0 combination, linear, 9.7.0, 9.7.1, 10.2.8, 10.2.22 combination, linear, formal, 10.10.4 combination, truth-functional, 4.2.1, 4.2.3 combination function, convex,
38.5.2 combination symbol, 7.11.2 combinatorics, topology on finite set, 14.3.6 communication, animal, 2.5.1 communication, human, finite bandwidth, 2.11.14 communication channel, mathematics, 2.5.11 communication protocol, 2.5.7 communication systems interoperability, 2.8.9
communications, computer, 2.1.4 communications, socio-mathematical network, 2.5 communications engineering, 2.8.9 communications standardization bodies, 2.8.9 communities of minds, 2.2.3 commutative group, 9.2.20 commutative ring, 9.8.5 commutative ring with identity, 9.11.1 commutative unitary ring, 9.8.6 commutative unitary ring, associative algebra, 9.10.1 commutator of vector fields, 33.2.5 compact analytic manifold, 27.9.5 compact differentiable manifold, 27.4.17 compact-domain curve, 16.2.9 compact-domain curve, rectifiable, in Lipschitz manifold, 27.11.5 compact-open topology, 15.7.9 compact set, 15.7.4 compact topological space, 5.9.10 compactness, Heine-Borel, 15.7.5 compactness, sequential, 17.3.29 compactness classes of topological spaces, 15.7 compass direction, 3.7.3 compatible fibre chart, 23.6.14 competition, 2.2.2, 2.2.3 complement, double, 3.6.1, 3.11.6 complement, set, 5.13.8 complete pseudogroup of diffeomorphisms, 19.4.10 complete pseudogroup of homeomorphisms, 19.4.7 complete pseudogroup of unidirectionally differentiable homeomorphisms, 19.6.7 completely regular topological space, 15.2.18 completeness, logic, 4.1.7 completeness of modelling, mathematical logic, 3.2.5 complex number, 8.7 complexity of vocalizations, 2.2.4 component, 28.0.6 component, connected, 15.4.14 component, horizontal, of vector, 27.12.8 component, vector, 10.5.13 component function of vector field, 29.5.5, 29.6.6 component map, dual, 10.5.11 component map for basis of linear space, 10.2.14 component matrix of a linear map, 11.2.1 component matrix of linear map, 11.2 component tuple, computational second-order tangent, 30.3.1 component tuple, second-order tangent, 30.3.1 component tuple of vector with respect to basis, 10.2.16 components of tensor, 29.3.6 composite integer, 7.4.2 composite number, 14.3.9 composite of functions, 6.7.1 composite of partially defined functions, 6.11.7 composition, diffeomorphisms, 19.1.2 composition of functions, 6.7, 6.7.1 composition of 
operator fields, 33.2.7 composition of partially defined functions, 6.11.7 composition of relations, 6.3.23 composition of vector fields, 33.2.2 composition rule for differentiation, 18.4.11, 18.4.12 compound proposition, 3.13.2 compound proposition, reception, 3.7.14 compound proposition decomposition, 3.7.14 compound proposition space, 4.2.2 compound propositions, on-demand construction, 3.13.5 comprehension, naive, axiom, 5.7.2, 5.7.6, 5.7.8 comprehension, reckless, axiom, 5.7.24 comprehension axiom, 5.11.2
comprehension axiom, ZF, 5.4.3 compressible integer, 2.11.6 compressible number, 2.11.2 computational second-order tangent component tuple, 30.3.1 computational tangent vector, 28.4, 28.4.2 computational tangent vector triple, 28.4.1 computations, mechanistic, 2.9.1 computer communications, 2.1.4 computer file system directory, 5.7.19 computer hardware, 3.13.8 computer language, 2.9.2 Computer Modern Roman font, 1.8 computer operating system, 4.7.7 computer programming, 2.6.9, 5.16.7 computer proof, 3.6.2 computer simulation, 3.13.4, 3.13.8, 3.14.2 computer software, 2.3.5, 3.13.8 concatenation of curves, 16.2.15 concatenation of lists, 7.12.2 concatenation of paths, 16.4.13 concatenation of sequence of curves, 16.2.17 concatenation operator for tuples, 8.5.3 concave function, 41.5.0 concept, primitive, 3.4.3 concept network, coherent, 2.1.7 concepts, acyclic network, 2.1.7 concepts, lowest-level, 3.0.1 conceptual economy, 6.5.2 concrete discussed context, 4.3.1 concrete equality relation, 2.5.11 concrete equality relation, import, 4.15.2 concrete proposition, 3.3.3, 3.10.6 concrete proposition domain, 3.1.2, 4.1, 4.1.3, 4.5.4 concrete proposition domain, dynamic, 4.1.6 concrete proposition domain, static, 4.1.6 concrete proposition domain closure, 4.9.1 concrete proposition domain examples, 4.1.8 concrete set domain, 5.7.23 concrete variable domain, 5.7.3 concrete variable space, 5.2.3 conditional expression, 4.3.14 conditional statements, Gilgamesh epic, 3.5.2, 46.4.1 conditioning, operant, 3.5.10 conditioning, psychology, 7.2.1 cone-shaped neighbourhood, 18.2.13 confidence level, logic, 3.9.3 conformal connection, 12.1.0 conformal-invariant geometry, 46.3.0 conformal map, 9.4.10 conformal metric, 25.6.4 conformal sublayer, 39.0.3 conformal transformation group, 46.3.0 conformality of transition maps, 28.8.9 confusion, colloquial logic, 3.5.7, 3.13.7 confusion, logic, 3.3.4 conical coordinates for Euclidean space, 43.7 conical coordinates Laplacian, 43.7.0 
conjecture, Poincaré, 25.2.9 conjugate of a subset of a group, 9.3.13, 9.3.15 conjugate operator, 10.5.24 conjugate point, 38.7.4 conjugation map, 9.3.22 conjunct, 4.3.11 conjunct, logical expression, ambiguous, 4.3.12 conjunction, logical, 4.3.3 conjunction, proposition list, 3.13.6 conjunction, triple, 3.5.8 conjunctions, bulk, 3.1.2
conjunctive normal form, 4.11.3 connected component, 15.4.14 connected subset, 15.4.3 connected topological space, 15.4.1 connectedness and continuity of functions, 25.2.7 connection, affine, 37 connection, choice of definitions, 36.1 connection, conformal, 12.1.0 connection, continuous, 36.1.2 connection, general, alternative definitions, 36.9 connection, history, 36.1 connection, Lagrangian mechanics, 37.10 connection, Levi-Civita, 39.4, 39.4.2, 39.4.5, 39.6.2, 42.5.2 connection, Levi-Civita, Euclidean space, 12.4.1 connection, Levi-Civita, globality, 25.7.5 connection, Levi-Civita, metric layer, 39.0.1 connection, Levi-Civita, on two-sphere, 42.2.2 connection, Levi-Civita, orthogonal, 25.6.5 connection, Levi-Civita, parallelism, 36.2.3, 37.1.2 connection, Levi-Civita, parallelism at a distance, 24.1.1 connection, Levi-Civita, tensorization, 30.2.7 connection, Levi-Civita, tensorization coefficients, 37.6.10 connection, Lie, 33.4.3, 33.4.4 connection, metric, 39.4.6 connection, naming, 36.1 connection, OFB, 36.3.2 connection, orthogonal, 12.1.0, 42.5.3 connection, orthogonal, Levi-Civita, 25.6.5 connection, PFB, 36.3.2 connection, PFB, connection form, 36.6 connection, PFB, parallel displacement, 36.8 connection, terminology, 15.3.2, 37.1.3 connection, torsion-free, 37.1.3, 37.6.10 connection definition styles, 36.1.8 connection differentiability, 36.3.9, 36.3.14 connection form, PFB connection, 36.6 connection form on principal fibre bundle, 36.6.3 connection layer, 1.1, 36.0.4 connection on differentiable fibre bundle, 36 connection on ordinary fibre bundle, curvature, 36.4 connection on principal fibre bundle, alternative definition, 36.9.2, 36.9.4 connection on principal fibre bundle, differentiability, 36.5.5 connections on manifolds, motivation, 37.2 connective, primitive, 4.5.2, 4.7.4, 4.11.1 connective, principal, logical operator, 4.3.11, 4.3.12 connective, propositional, 4.3.4 connectivity classes of topological spaces, 15.4 connectivity classification, 
global, 14.1.2 consequent subexpression, 4.3.14 conservation equations, 14.0.3 conservative force field, 20.6.12 consistent, etymology, 2.1.7 constant, individual, 4.12.12 constant curvature, surface, 43.9 constant curve, 16.2.11 constant function, continuous, 14.11.3 constant logical function, 4.12.12 constant name, definition, 4.12.13 constant path, 16.4.8 constant predicate, 4.12.12 constant stretch of curve, 16.3.2 construction, canonical, 2.8.6 construction, on-demand, compound propositions, 3.13.5 construction, set, dynamic perspective, 5.7.24 construction of associated topological fibre bundle, 23.11 construction stage, ZF set theory, 5.5.4 constructional method of definition, 2.8.1, 2.8.4
constructions, topology, 15 constructions and axioms, 2.8 constructivism, 2.2.8 container, physical, 2.6.2 container metaphor, set, 5.7.17, 5.7.25 content, determinable, set, 5.5.3 context, discussed, 3.3.4 context, discussed, concrete, 4.3.1 context, discussion, 3.3.4 context, discussion, abstract, 4.3.1 context, meta-discussion, 3.3.4 context stack, 3.10.5 continuity, epsilon-delta, 17.4.2, 17.4.3, 17.4.5, 17.4.14 continuity, Hölder, 18.9, 18.9.1 continuity analysis, local, 14.1.2 continuity of functions in terms of connectivity, 15.5 continuous connection, 36.1.2 continuous curve, 16.2.4 continuous function, 14.11, 14.11.2 continuous function, constant, 14.11.3 continuous function, uniformly, 17.4.8 continuous function on metric space, 17.4, 17.4.1 continuous manifold, 26.4.11 continuous path, directed, 16.4.2 continuous path, oriented, 16.4.2 continuous path, unoriented, 16.4.15 continuous vector field along curve, 29.8.2 continuum, 26.1.9, 36.1.2 continuum hypothesis, 5.0.10, 7.2.7 contour, 16.1.1 contradiction, 3.10.3, 4.3.22, 4.3.23 contradiction, proof by, 2.11.5, 3.1.4, 3.11, 4.1.7 contradiction, proof by, equivalents, 3.11.1 contradiction, proof by, validity, 3.11.10 contradiction, proof by, world-model ontology, 3.11.8 contradiction-free logic, 3.10.12 contradictione sequitur quodlibet, Ex, 3.11.1 contravariant, terminology, 13.6.2 contravariant tensor, 29.1 contravariant tensor space, 29.1 control, social behaviour, 3.7.12 convergent sequence, 14.11.27 conversion, analogue digital, 2.4.3 convex combination, 12.1.0, 12.2.4, 38.5, 41.5.0 convex combination function, 38.5.2 convex combination of vectors, 9.7.0, 10.2.27 convex curvilinear interpolation, 16.5 convex function, 38, 38.8, 38.8.1, 41.5.0 convex function on two-sphere, 42.11 convex neighbourhood, 38.7.5 convex set, 38, 38.4 convex set on two-sphere, 42.11 convex subset, 38.4.1 convex subset of Riemannian manifold, 41.5.0 convexity, 38 convexity test function, 38.8.2 cook, meat-centric, 1.4.5
cook, vegetarian, 1.4.5 cookbook, 1.5.3 cooperation, 2.2.2, 2.2.3 coordinate, 28.0.6 coordinate basis covector, 29.2.5 coordinate basis operator field, 29.6.4 coordinate basis vector, 28.7.4, 28.12 coordinate basis vector field, 29.5.12 coordinate frame bundle, tangent, 35.9.1 coordinate-free, 2.9.6, 11.2.0, 11.2.4, 27.1.3, 27.8.5, 28.1.2, 28.1.3, 28.1.8, 37.6.8, 42.1.1, 42.4.1, 45.0.0
coordinate function, topological, 26.4.2 coordinate-independent, 37.4.1 coordinate map, topological, 26.4.2 coordinate transformation, 19.1.6 coordinate transition matrix, 27.4.7 coordinate triple, tangent, 28.3.2 coordinate triple of tangent operator, 28.5.1 coordinate tuple, 28.5.1 coordinate tuple of tagged tangent operator, 28.6.2 coordinate vector, 28.5.1 coordinate vector of tagged tangent operator, 28.6.2 coordinates, Cartesian, 18.1.1, 26.1.3 coordinates, conical, for Euclidean space, 43.7 coordinates, geodesic, 38.7.1, 38.7.3 coordinates, normal, 38.7.1, 39.4.10 coordinates, spherical, 43.6.0 coordinates, spherical, astronomical, 42.1.6 coordinates, tangent vector, 28.2.1 coordinates, terrestrial, 42.1, 42.6.0 coordinates for polar exponential map, 42.6 corps, 9.8.10 corpus, 9.8.10 corpus, mathematical knowledge, 3.2.5, 5.0.3 corpus, PDE, 1.4.1 corpus, theorem, 5.6.2 correct logic, 3.1.3 correctness, plodding, 1.4.7 corresponding parameters of curves, 16.3.14 coset of subgroup, 9.3.3 cosine function, 20.13.9 cosmic information bottleneck, 2.11.17 cotangent bundle on Euclidean space, 19.2.8 cotangent space, 29.2, 29.2.2 cotangent space, double, pointwise, 29.4.5 cotangent space, double, total, 29.4.6 cotangent space, total, 29.2.8 cotangent vector, 29.2, 29.2.2 cotangent vector space, 29.2.2 cothrom, Domhan, 3.10.8 countability class, 15.6 countable choice axiom, 5.9.1, 5.9.11, 5.10, 5.10.1, 5.10.2, 5.10.3, 7.8.3, 7.8.4 counterexamples and sharpness of theorems, 43.0.1 counting cattle, prehistoric, 2.11.19 court, tennis, 2.10.10 covariant, terminology, 13.6.2 covariant acceleration of curve, 38.1.4 covariant derivative, 33.1.0, 37, 37.4 covariant derivative, terminology, 37.4.3 covariant derivative for general connection, 36.7 covariant derivative of vector field, 37.4.5, 37.4.7, 37.4.12 covariant derivative of vector field along curve, 37.4.11, 38.1, 38.1.2 covariant tensor, 13.6, 29.3 covector, 13.9.8, 29.2.2 covector, coordinate basis, 29.2.5 covector, 
unit coordinate basis, 29.2.6 cover, open, 14.10.12, 15.7.1 cover, open, finite, 14.10.12 covering, open, 15.7.1 covering, open refinement, 15.7.2 cow, 2.4.2, 2.4.3, 2.11.19 craft, 2.5.16 cranial capacity, 2.2.4 cream, ice, 3.3.10 credibility, axiomatization, 3.2.6 crime, 3.10.6 crisp perfection, abstract logic, 3.4.3
curve constant stretch, 16.3.2 curve corresponding parameters, 16.3.14 curve equivalence class, 28.1.1 curve family, 16.2.2 curve family, higher-order differential, 32.4 curve family, vector field derivative, 33.3 curve in Euclidean space, differentiable, 18.6.5 curve length, 17.5.9, 41.5.0 curve on two-sphere, geodesic, 42.9 curve parametrization, 16.1.6 curve path-equivalence, 16.3, 16.3.7 curve reparametrization, 17.5.6 curve sequence, concatenation, 16.2.17 curve terminology, 16.1 curve topics summary, 27.6.4 curve traversal order, 16.1.5 curved space, PDE techniques, 1.4.1 curvilinear interpolation, convex, 16.5 customer, logic, 3.3.8 cut, Dedekind, 2.7.2, 2.7.3, 8.3.4 cycle of disciplines, 2.5.17 cyclic, logic and set theory, 2.1.6 cyclic chain, 5.7.10 cyclic definition, 2.0.1 cyclic inclusion of sets, 4.1.9 cyclic knowledge, 2.1.3 da Vinci, Leonardo, 3.9.8 dagger, Quine, 4.3.6, 4.3.8 danger, reductio ad absurdum, 3.11.4 dark age, 3.4.2 Dark Ages, 2.3.5, 42.0.1, 46.1.2 dark energy, 2.10.5 dark matter, 2.10.5 dark number, 2.10, 2.11.12 dark set, 2.10 dark sets, 20.1.3 data, experimental, 2.9.9 database object attributes, 2.4.4 dateline, international, 42.1.5 datur, tertium non, 3.6.4 daylight hours calculation, 42.15 DE (differential equation), 21.0.2 de Morgan’s law, 5.13.11, 5.14.8 death, 2.4.2, 2.11.19, 3.5.3 decimal classification, Dewey, 1.9.1 decision-making, mammoth hunters, 3.6.1 decomposition, compound proposition, 3.7.14 Dedekind, (Julius Wilhelm) Richard, 2.9.10, 2.12.4, 4.9.12, 9.2.1, 46.1.6 Dedekind cut, 2.7.2, 2.7.3, 8.3.4, 13.0.7 Dedekind-finite set, 5.9.10, 7.2.24 Dedekind-infinite set, 7.2.24 deduction, 3.7.8 deduction, mathematical, 2.9.7 deduction rule, 4.9.1 deduction rule, meta-theorem, 4.9.4 deduction rules, 4.4.2, 4.6 deduction rules, defining logical operators, 4.6.3 deduction theorem, 4.6.7, 4.9, 4.10.1 define-before-use ordering, 1.4.11 definite, negative, 11.4.7 definite, positive, 11.4.7 definite real symmetric matrix, 11.6 
definition, clash, pathological, 2.6.7 definition, constructional method, 2.8.4 definition, cyclic, 2.0.1 definition, function, inline, 14.11.19 definition, inductive, 18.4.1
critical point, Hessian operator, 32.6 cross-brace, 37.7.8 cross product, naive, 6.1.6 cross-section, terminology, 23.3.11 cross-section of a topological fibration, 23.3.8 cross-section of alternating form bundle, 20.5.4 cross-section of alternating form bundle, differentiable, 20.5.5 cross-section of cotangent bundle on Euclidean space, 19.2.10 cross-section of fibration, differentiable, 27.12.13 cross-section of fibre bundle, 6.9.7 cross-section of tangent bundle, 33.5.0 cross-section of tangent bundle on Euclidean space, 19.1.9 cross-section of tangent bundle on Euclidean space, differentiable, 19.1.10 cross-section of tangent fibration, 29.6.1 cross-section of topological fibration, 27.12.12 cross-section over a set, 23.3.8 cryptographic puzzle, 5.15.3 cuneiform writing, 3.5.1 curl, 20.3.6, 20.6.12, 37.7.2 curvature, 37.7 curvature, constant, surface, 43.9 curvature, Gaussian, 19.5.3 curvature, positive, space, 37.0.2 curvature, Ricci, 39.5.6 curvature, scalar, 39.5.7 curvature, sectional, 39.5.4 curvature, sectional, on Riemannian manifold, 41.5.0 curvature, topological generalization, 24.4.0 curvature form, 37.7.3, 39.5.2 curvature of connection on ordinary fibre bundle, 36.4 curvature tensor, 37.7.6, 39.5, 39.5.3 curvature tensor, Riemann, 36.4.1 curvature tensor components, affine connection, 41.3.2 curve, 16.2 curve, closed, 16.2.11 curve, closed simple, 16.2.11 curve, compact-domain, 16.2.9 curve, constant, 16.2.11 curve, continuous, 16.2.4 curve, differentiable, 27.6, 27.6.1 curve, differential, 31.4 curve, differential, for higher-order operator, 32.8 curve, geodesic, 38.2, 38.2.3 curve, higher-order differential, 32.3 curve, infinitesimal, 33.2.0 curve, initial point, 16.2.13 curve, level, 16.1.7 curve, Lie transport, 33.4.6 curve, multiple point, 16.2.13 curve, never-constant, 16.3.3, 24.4.4 curve, non-self-intersecting, 31.4.8 curve, open, 16.2.9 curve, rectifiable, 17.5, 17.5.5, 27.11 curve, representative, 16.4.2 curve, simple, 16.2.11 curve, 
sometimes-constant, 16.3.3 curve, space-filling, 16.1.5, 26.1.4, 43.1.2 curve, tangent operator, 31.4.7 curve, tangent vector, transformation rule, 19.1.3 curve, tangent vector field, 31.4.1 curve, terminal point, 16.2.13 curve, topological, 16 curve atlas, 16.1.4, 16.1.8 curve bundle, 24.4.3 curve class, parallelism, 24.1.6 curve class, tangent, 28.0.4 curve concatenation, 16.2.15
definition, operational, 2.11.8, 2.11.9 definition, template, 18.4.1 definition breeding rules, 3.3.7 definition-centric book, 1.4.5 definition notation, 1.6.5 definition of “constant”, 4.12.13 definitions, boot-strapping, 2.1.1, 2.1.3, 6.1.1 definitions, mathematical, naturalism, 5.0.4 definitions, mathematical systems, 2.8 definitions book, 2.7.1 deflationary theory of truth, 3.10.8 delayed assertion, 3.7.17 deliverables, mathematics, 3.2.3 delta function, Dirac, 28.1.1 delta function, Kronecker, 7.9.1, 7.9.10, 11.1.22, 22.4.0, 28.7.5 demand, on, unbounded modelling, 5.7.25 Democritus, 46.1.1 demonstration prototypes, 2.8.9 denial, alternative, 4.3.6, 4.6.4 denial, joint, 4.3.6, 4.6.4 denial operator, alternative, 4.3.8 denial operator, joint, 4.3.8 dense subset, 18.5.7, 18.5.12 dense subset of topological space, 15.6.5 dependent choice, axiom, 5.0.10 derivation, 28.1.1, 28.1.11, 45, 45.8 derivation examples, 45.2 derivation rules, logic, 4.14.4 derivative, bidirectional, 19.6.2 derivative, covariant, 33.1.0, 37, 37.4 derivative, covariant, for general connection, 36.7 derivative, covariant, of vector field, 37.4.5, 37.4.7, 37.4.12 derivative, covariant, of vector field along curve, 37.4.11 derivative, directional, 19.6.2 derivative, exterior, 20.6, 20.6.2 derivative, exterior, on manifold, 33.6, 33.6.1 derivative, Lie, 33.4.9 derivative, real-valued function, 18.2.10 derivative, unidirectional, 19.6.2 derivative, vector field, naive, 33.1 derivative for curve family, vector field, 33.3 derivative notation, 18.2.11 derivatives for several variables, higher-order, 18.6 Desargues, Girard, 46.1.3 Descartes, René, 2.1.6, 2.4.5, 2.9.6, 18.1.1, 46.1.3, 46.2.8 descriptive interpretation, logical assertion, 3.7.13, 3.12.1 determinable content, set, 5.5.3 determinant of a matrix, 11.3.6 deviation from flatness, parallelism, 36.4.1 devil’s advocate, 3.10.5 Dewey decimal classification, 1.9.1 diagonal argument, 2.11.20 diagonal map, 6.9.13 diagonalization of matrix,
42.8.3 diagram, Venn, 5.7.17, 5.7.25 diagrams, intuition, 3.6.2 diagrams, semantics, 3.6.2 dialysis, 14.7.1 diameter of a set, 17.2 diameter of set, 17.2.4 dictionary, 2.2.7 diffeomorphic, 19.1.1 diffeomorphism, 19.1, 19.1.1 diffeomorphism, composition, 19.1.2 diffeomorphism differentiability class, 27.8.8 diffeomorphism family, vector field, 34.6.8 diffeomorphism group, 34.0.1, 34.6 diffeomorphism group, differentiable, 34.6.1
diffeomorphism in Euclidean space, 19 diffeomorphism invariant, 37.1.4 diffeomorphism on Euclidean space, 19.2 diffeomorphism pseudogroup, 18.7.7, 19.4, 23.1.9, 23.5.0, 23.6.12, 27.2.7 diffeomorphism pseudogroup, complete, 19.4.10 diffeomorphisms, differentiable, topological group, 34.6.5 diffeomorphisms, differentiable group, 34.6.3 diffeomorphisms, family, differentiable, 34.6.7 diffeomorphisms, one-parameter family, 27.7.1 diffeomorphisms, one-parameter group, 27.7.2 differences of this book from other DG texts, 1.7 differentiability, fractional, 43.4.0 differentiability, higher-order, 18.4.1 differentiability-based function space, 18.7 differentiability class of diffeomorphism, 27.8.8 differentiability implies continuity, 18.2.15 differentiability of connection, 36.3.9, 36.3.14 differentiability of connection on principal fibre bundle, 36.5.5 differentiable action of group, 35.3.0 differentiable atlas, 27.2.4 differentiable chart, 27.4.6 differentiable cross-section of alternating form bundle, 20.5.5 differentiable cross-section of cotangent bundle on Euclidean space, 19.2.11 differentiable cross-section of fibration, 27.12.13 differentiable cross-section of tangent bundle on Euclidean space, 19.1.10 differentiable curve, 27.6, 27.6.1 differentiable curve in Euclidean space, 18.6.5 differentiable diffeomorphism group, 34.6.1 differentiable diffeomorphisms, group, 34.6.3 differentiable diffeomorphisms, topological group, 34.6.5 differentiable differential form, 20.5.5 differentiable family of diffeomorphisms, 34.6.7 differentiable family of differentiable transformations, 27.7 differentiable fibration, 27.12 differentiable fibration with differentiable fibre space, 35.1.1 differentiable fibration with fibre atlas for specified fibre space, 27.12.7 differentiable fibration with intrinsic fibre space, 27.12.1 differentiable fibration with specified fibre space, 27.12.3 differentiable fibre atlas, 27.12.6 differentiable fibre bundle, 28.8.0, 35 differentiable fibre 
bundle, associated, 35.6, 35.6.3, 35.6.7, 35.6.10 differentiable fibre bundle, horizontal lift function, 36.3.5 differentiable fibre bundle, lift function, 36.3.12 differentiable fibre bundle, vector field, 35.3 differentiable fibre bundle association, 35.6.2 differentiable fibre bundle connection, 36 differentiable fibre bundle with Lie structure group, 35.2, 35.2.2 differentiable fibre bundle with non-Lie structure group, 35.1, 35.1.3 differentiable fibre chart, 27.12.5 differentiable function, directionally, 18.5.8 differentiable function, partially, 18.5.4 differentiable function, totally, 18.5.17 differentiable function, unidirectionally, 18.3.5, 18.5.14 differentiable function space, 45.4 differentiable group, 28.8.0, 34 differentiable manifold, 27, 27.2.6 differentiable manifold, compact, 27.4.17 differentiable manifold, paracompact, 27.4.18 differentiable manifold, tangent bundle, 28, 35.8 differentiable manifold atlas, 27.2, 27.2.2 differentiable manifold product, 27.4.15
differentiable manifold restriction, 27.4.12 differentiable map, differential, 31.3, 31.3.1, 31.3.4 differentiable map, higher-order differential, 32.2 differentiable map, induced map, 31.3.6 differentiable map between differentiable manifolds, 27.8, 27.8.1 differentiable map for higher-order operator, differential, 32.7 differentiable path, 27.6 differentiable principal fibre bundle, 35.4, 35.4.1 differentiable principal fibre bundle, vector field, 35.5 differentiable real-valued function, 18.2.2, 18.2.5 differentiable real-valued function on differentiable manifold, 27.5, 27.5.1 differentiable second-order vector field, 30.7.3 differentiable structure, 27.1.2 differentiable structure, preview, 25.3 differentiable tensor field, 29.7.2 differentiable transformation, differentiable family, 27.7 differentiable vector field along curve, 29.8.4 differential, higher-order, 32 differential, pointwise, 31.1 differential action, 34.8.0 differential calculus, 18 differential calculus and logic, 4.4.2 differential equation, ordinary, 21.1 differential equations, logic analogy, 4.0.3 differential equations, partial, 37.6.1 differential equations, systems, 21.0.3 differential for second-order operator, 32.7.3 differential for second-order tangent vector, 32.7.1 differential for tangent operator, 31.3.18, 31.3.22 differential form, 29.9, 29.9.1 differential form, differentiable, 20.5.5 differential form in flat space, 20.5 differential geometry history, 46 differential layer, 1.1, 27.0.1 differential manifold tensor calculus, 41.2 differential of curve, 31.4 differential of curve, higher-order, 32.3 differential of curve family, higher-order, 32.4 differential of curve for higher-order operator, 32.8 differential of differentiable map, 31.3, 31.3.1, 31.3.4 differential of differentiable map, higher-order, 32.2 differential of differentiable map for higher-order operator, 32.7 differential of function on Euclidean space, 19.2.13 differential of real-valued function, 31.2, 
31.2.1, 31.2.11 differential of real-valued function, higher-order, 32.1 differential of real-valued function for higher-order operator, 32.5 differential of real-valued function for second-order operator, 32.5.1, 32.5.3 differential on Euclidean space, 19.2 differential on manifold, 31 differential operator in Riemann space, 39.6 differential operator on Euclidean space, second-order, 19.5 differential parallelism, 15.3.2, 36.1.1 differential quotient, 45.4.0 differential topology, 27.0.1 differentiation, composition rule, 18.4.11, 18.4.12 differentiation, linear spaces, 18.8 differentiation, one variable, 18.2 differentiation, several variables, 18.5 differentiation of parallel transport, 36.2 digit sequence, 7.2.5 digital electronics, 3.6.2 digital electronics circuit, 4.1.8 digital images, 2.9.1
dimension, Lebesgue, 15.9.1, 34.2.5 dimension, topological, 15.9, 34.2.5 dimension of dual linear space, 10.5.10 dimension of finite-dimensional linear space, 10.2.5 dimensional analysis, 20.3.3 dip, lucky, 5.11.5 Dirac delta function, 28.1.1 direct product atlas, 26.5.3 direct product manifold, 26.5.5 direct product of family of groups, 9.3.30 direct product of functions, 6.9.11 direct product of functions, pointwise, 6.9.12 direct product of groups, 9.3.30 direct product of two topological fibrations, 23.3.20 direct product topology, 15.1.4 direct sum of linear space sequence standard injection, 10.6.4 direct sum of linear spaces, 10.6 direct sum of linear spaces, abstract, 10.6.3 direct sum of linear spaces, external, 10.6.1 direct sum of linear spaces, formal, 10.6.1 direct sum of linear spaces, internal, 10.6.5 directed area, 25.4.2 directed continuous path, 16.4.2 direction, compass, 3.7.3 directional derivative, 19.6.2 directionally differentiable function, 18.5.8, 19.6.3 directionally differentiable homeomorphism, 19.6 directory, file system, computer, 5.7.19 Dirichlet, Johann Peter Gustav Lejeune, 46.1.5 Dirichlet problem, 21.3.0 dirt, 2.2.1 disbelief, 3.10.15 disciplines, cycle, 2.5.17 disclaimer, 0.0 discomfort level, 2.11.13 discomfort levels, 2.10.3, 2.12.5, 5.9.3 discomfort levels with infinities, 2.9.5 disconnected set of sets, 15.4.9 disconnected subset, 15.4.3 disconnection of a set of sets, 15.4.10 disconnection of sets, 15.3 disconnection of subset of topological space, 15.4.7 disconnection of topological space, 15.4.5 discontinuous function example, 18.5.6, 18.5.7, 18.5.11, 18.5.12 discovery of proof, 4.8.4, 4.8.5 discrete manifold, 26.1.6 discrete topological space, 14.2.19 discrete topology, 14.2.19 discussed context, 3.3.4 discussed context, concrete, 4.3.1 discussion, logic, 3.9.4 discussion context, 3.3.4 discussion context, abstract, 4.3.1 discussion language, 3.10.6 discussions, network, 3.3.4 disjoint sets, 5.13.7 disjoint union, 6.4.2 
disjunct, 4.3.11 disjunction, 3.13.7 disjunction, exclusive, 3.5.5 disjunction, inclusive, 3.5.5 disjunction, logical, 4.3.3 disjunction, meta-proposition, 3.13.6 disjunction, triple, 3.5.8 disjunctions, bulk, 3.1.2 disjunctive normal form, 3.5.9, 4.11.3 distance between point and set, 17.2.1
distance between sets, 17.2, 17.2.1 distance function, 17.1.1 distance function, Hessian of second derivatives, 25.7.1 distance function, nearest integer, 8.6.19 distance function, point-to-point, 39.3 distance function in metric space, 17.1 distribution, 27.10.1 distribution, Schwartz, 28.1.1, 28.5.8 distribution, tempered, 28.5.8 distribution theory, 14.5.6, 34.3.4 distributions representation of tangent bundle, 28.15 distributivity logic axiom, 4.7.5 distrust, 3.9.4 divergent sequence, 14.11.27 divides (integer relation), 7.4.4 DNA (deoxyribonucleic acid), 25.5.4, 37.7.8 dog, 2.4.2, 2.5.5 dog, Harry’s, barking, 3.6.3 dogma, prisoner, 1.4.6 domain, concrete proposition, 3.1.2, 4.1, 4.1.3, 4.5.4 domain, concrete variable, 5.7.3 domain, set, concrete, 5.7.23 domain of function, 6.5.9 domain of interpretation, propositional calculus, 4.12.8 domain of relation, 6.3.6, 6.3.7 domain restriction, relation, 6.3.33 domain/range specification, function, 6.5.5 domesticated animal, 2.2.6 Domhan cothrom, 3.10.8 dotage, Newton’s notation, 18.2.11 double-angle rules, trigonometry, 20.13.12 double complement, 3.6.1, 3.11.6 double cotangent space, pointwise, 29.4.5 double cotangent space, total, 29.4.6 double dual of linear space, 10.5.19 double negation, 3.6.1, 3.10.8 double negation, logical, 3.11.6 double-negative assertion, 3.7.4 double tangent space, 29.4 double tangent space, pointwise, 29.4.1 double tangent space, total, 29.4.3 doubt, active, 3.9.4 dough-nut and tea-cup topology, 14.1.3 dramatis personae, 1.0 dreary argumentation, 4.0.2 drop function, 31.4.3 drop function, Lie derivative, 33.4.15 drop function for second-level tangent vector, 30.5, 30.5.2 drop function of total tangent space, 28.11, 28.11.6 drop-in replacement, 2.8.9 dry argumentation, 4.0.2 dual, multilinear, 13.5.5, 13.6.5 dual basis, canonical, 10.5.7 dual component map, 10.5.11 dual linear space dimension, 10.5.10 dual map, 10.5.24 dual of linear space, double, 10.5.19 dual of linear space, second, 
10.5.18 dual operator, 10.5.24 dual order, 7.1.7 dual space, 10.5, 10.5.5, 13.7.4 dual vector, 13.6.2 duality, logic quantifier, 4.13.10 duality, topology, 14.3.7 dummy variable, 4.3.16, 5.1.23, 5.8.16, 5.8.24 dummy variable, limit of function, 14.11.19 dummy variable, naked, 5.8.28 dummy variable, superfluous, 6.5.16
duple, ordered, 7.7.3 duplicity, 4.16.10 duplique, 4.16.10 dynamic concrete proposition domain, 4.1.6 dynamic memory allocation, 2.10.10 dynamic perspective, set construction, 5.7.24 dynamics, population, 3.7.14 dynasty, Sung, 7.11.7 Earth, flat, 42.0.1 Earth, round, 3.10.6 Earth chart, 26.1.5, 26.4.5 Earthlings, 2.0.3 easy to show, 1.5.2 economy, conceptual, 6.5.2 economy, intellectual, 2.11.14 Eddington, Arthur Stanley, 36.1.5, 46.1.6 education, medieval university, 2.9.8 effective computability, 2.12.3 effective left transformation group of a topological space, 16.8.7 effective right transformation group of topological space, 16.8.13 effective topological left transformation group of topological space, 16.8.8 effective topological right transformation group, 16.8.14 egg, chicken, 5.7.21 eggs, 5.15.6 eggs, goose, golden, 2.10.1 Egypt, 3.11.9 Egyptian geometry, 46.2.2 Egyptian hieroglyphics, 2.5.5, 3.10.6 Egyptian mathematics, 1.4.4, 46.1.1 Egyptian priests, 3.2.8 eigen, etymology, 11.5.7 eigenspace, linear space, 10.4.1 eigenspace of linear space endomorphism, 10.4 eigenvalue, linear space, 10.4.1 eigenvalue, matrix, 11.1.0, 11.5.7 eigenvalues of metric tensor, 25.7.1 eigenvector, linear space, 10.4.1 eigenvector, matrix, 11.1.0, 11.5.7 Einstein, Albert, 2.3.4, 2.3.6, 7.2.7, 25.7.6, 40.2.1, 41.1.2, 46.1.6 Einstein index convention, 7.11.15, 13.7.5, 13.8.18, 13.9.2 Einstein space, 39.5.8 Einstein’s equations, 1.4.10, 25.7.3, 40.2.2 electrolysis, 14.7.1 electromagnetic radiation, 3.4.5 electronic logic circuit, 4.5.5 electronics, digital, 3.6.2 electronics circuit, digital, 4.1.8 elementary particle, 2.0.1 Elements, Euclid, 5.0.7 ellipse projected by sphere, 42.17.2 elliptic functions, 20.13.22 elliptic integrals, 20.13.22 elliptic PDE, 21.3.0 elliptic second-order operator, 30.6, 37.6.3 elliptic second-order operator field, 37.6 elliptic second-order PDE, 27.5.10 elliptic second-order tangent operator, 30.6.3 elliptic second-order vector field, 30.7.6
embedded manifold, 27.3.3 embedded Riemannian manifold, 39.8 embedding of manifold, 31.3.11 embodied mind theory, 2.5.21 empirical proposition, 3.3.10, 4.13.10 emptor, caveat, 4.1.1
empty function, 6.5.17 empty path, 16.4.7 empty product of family of sets, pathological, 15.1.5 empty set, 5.8.3, 5.8.6 empty set, uniqueness, 5.3.2, 5.8.1 empty set axiom, 5.3.2 empty topology, 14.2.10, 14.3.2 endeavour, minimalist, 5.0.7 endomorphism, Lie algebra, 9.11.8 endomorphism, linear space, 10.3.6 endomorphism module, 9.9.27 endomorphisms of a module, ring, 9.9.7 endomorphisms of module, ring, 9.9.15 energy, dark, 2.10.5 energy and time, terrible waste, 4.4.4 English, Old, 3.5.4 epimorphism, Lie algebra, 9.11.8 epimorphism, linear space, 10.3.6 epsilon, 5.1.20 epsilon-delta continuity, 17.4.2, 17.4.3, 17.4.5, 17.4.14 equality, 4.15 equality, definition, 4.15.1 equality, first order language with, 5.2.3 equality, reflexivity, axiom, 4.15.1 equality, substitution of, 5.2.3 equality, substitution of, axiom, 5.2.5 equality, substitutivity, axiom, 4.15.1 equality relation, concrete, 2.5.11 equality relation, concrete, import, 4.15.2 equation, differential, ordinary, 21.1 equation, logical, 4.4.3 equation, self-referential, 5.7.14 equation of geodesic variation, 41.4 equations, logical, simultaneous, 4.4.2 Equator, 3.10.6 equilibrium, 2.11.5 equinumerosity, 5.9.10 equinumerous sets, 7.2.18 equipotent sets, 7.2.18 equipotent theorems, 4.9.2 equivalence, logical, 4.3.3 equivalence class of curves, 28.1.1 equivalence of topological spaces, 25.2.9 equivalence relation, 4.15.3, 6.4, 6.4.1 equivalent logical expressions, 3.10.13 equivalent manifolds, 27.4.10 equivalent topological fibre atlases, 23.6.16 equivalent topological fibre bundles, 23.7.5, 23.7.11 era, pre-logical, 3.2.8 Eratosthenes of Cyrene, 46.1.1 erectus, Homo, 2.2.4 Erlanger Programm, 12.2.0, 14.1.3, 19.4.2, 23.5.0, 34.0.5, 46.3.0 erroneous use of symbol, exists, 4.13.4 error, execution, 3.10.4 error, logic, 3.9.3 essence of a group, 5.16.2 essence of integers, 2.5.6 essential nature, 1.4.7 essential nature, set, 5.2.6 ethics, logic, 3.3.10 etymology, adjoint, 9.11.24 etymology, adjunct, 9.11.24 
etymology, affine space, 46.3 etymology, algebra, 3.8.3, 46.2.4 etymology, algorithm, 46.2.3 etymology, analysis, 14.7.1 etymology, coherent, 2.1.7
etymology, consistent, 2.1.7 etymology, eigen, 11.5.7 etymology, naive, 2.1.2 etymology, sequence, 7.1.16 etymology, solve, 14.7.1 etymology, tangent, 28.0.2 etymology, topology, 14.1.1, 46.2.17 etymology, vector, 28.0.5 Euclid, Elements, 5.0.7 Euclid of Alexandria, 46.1.1 Euclid’s Elements, 46.1.1 Euclid’s fifth postulate, 37.0.2 Euclid’s geometry, axiomatization, 3.2.7 Euclidean fibre bundle on Euclidean space, 44.1 Euclidean geometry, 2.1.9, 2.3.4, 25.3.4, 27.2.1, 37.0.2, 39.1.5 Euclidean inner product, standard, 10.8.5 Euclidean linear space, 10.2.20 Euclidean linear space, standard basis, 10.2.21 Euclidean norm, 10.8.4 Euclidean space, 26.2, 43.2 Euclidean space, atlas, standard, 27.3.1 Euclidean space, atlas, usual, 27.3.1 Euclidean space, Levi-Civita connection, 12.4.1 Euclidean space, locally, non-Hausdorff, 43.3 Euclidean space analysis, 43.10 Euclidean space concepts, 12.4 Euclidean space conical coordinates, 43.7 Euclidean space fibre bundle, 44.1 Euclidean space submanifold, 41.7, 41.7.1 Euclidean space submersion, 41.7.1 Euclidean topological space, 26.2.1 Eudoxus of Cnidus, 20.1.1, 46.1.1 Euler, Leonhard, 12.1.0, 25.5.5, 46.1.4, 46.2.8, 46.3.0 Euler’s angles, 42.8.2 European renaissance, 2.3.5, 46.1.2 even integer, 4.13.10 even permutation, 7.10.10 Ex contradictione sequitur quodlibet, 3.11.1 Ex falso sequitur quodlibet, 3.11.1 exact sequence of linear maps, 10.11, 10.11.1 example fibre bundle, 44 example of manifold, 43 excluded middle, 3.1.4, 3.6.3, 3.11.2, 5.7.17 excluded middle, logical machine ontology, 3.11.6 excluded middle, Russell’s paradox, 5.7.11 exclusive disjunction, 3.5.5 exclusive or, 3.5.2, 3.5.7, 3.13.7, 4.3.5 exclusive or, notation, 4.3.7 exclusive-or operator, 4.3.8 execution error, 3.10.4 exercise answers, 48 exercise questions, 47 exhaustion method, 20.1.1 exhaustive substitution, logical expression, 4.14.3 existence, unique, 4.16.2 existence, unique, notation, 4.16.3, 4.16.5 existence, unknowable, Lebesgue non-measurable set, 
2.10.8 existence axiom, set, ZF, 5.1.17, 5.1.18 existence proof, formalist, faith, 2.2.8 existential quantifier, 4.13.2 existential/universal quantifier, information content, 4.13.9 exists, erroneous use of symbol, 4.13.4 experience, sensible, 2.3.6 experimental physicist, 2.9.9 exponential function, 20.12, 20.12.2 exponential map, 38.7, 38.7.2 exponential of linear map, 10.4.1 exponential of matrix, 21.7
expression, logical, 4.3 expression, logical, exhaustive substitution, 4.14.3 expression, logical, non-atomic, 3.5.4 expression evaluation, logical, 4.4 expression parse-tree, logic, 4.3.10 expressions, equivalent, logical, 3.10.13 extended canonical map for tensor space, 13.12.5 extended integer, 7.6, 7.6.2 extended integer, negative, 7.6.4 extended integer, positive, 7.6.4 extended integer set notations, 7.6.4 extended list space, 7.12.5 extended number notation, 1.6.1 extended rational number, 8.2 extended rational number notation, 1.6.1 extended real number, 8.4, 8.4.1 extended real number notation, 1.6.1 extended truth table, 3.8.1, 4.2.6 extension axiom, ZF, 5.1.3, 5.2 extension of a function, 6.5.31 extension theorem, Tietze, 15.2.27 extensionality, axiom, 5.2.1 extent, point with no, 2.4.3 exterior algebra, 13.10.2 exterior calculus, 13.0.6 exterior closure, topology, 14.5.1 exterior derivative, 20.6, 20.6.2 exterior derivative of parallel transport, 20.6.18 exterior derivative on manifold, 33.6, 33.6.1 exterior derivative using Lie derivative, 20.7 exterior of set, 14.5 exterior of set, topological, 14.5.2 exterior point of set, 14.0.2 external direct sum of linear spaces, 10.6.1 extra-terrestrial civilisation, 2.2.6 extraneous properties management, 2.7.4 extraneous properties of mathematical objects, 2.7.2 extrapolation, concept, 7.2.1 extrinsic tangent vector, 28.1.0 factorial function, 7.10.13, 18.6.1 factorial function, Jordan, 7.10.16 faith, formalist existence proof, 2.2.8 false, abstract label, 3.2.5 false, proposition tag, 3.7.16 false application of theorem, 14.1.6 false generalization, 17.0.1 false-store, proposition, 3.7.19 false zero-operand operator, 4.3.18, 4.3.19 false zero-parameter predicate, 4.12.10 falsity, ontology, 3.6.1 falsity, proposition tagging, 3.7.5 falsity, semantics, 3.9 falsity, subjective, 3.7.2 falso sequitur quodlibet, Ex, 3.11.1 families, parametrized propositions, 4.12 family, 6.5.4 family, proposition, parametrized, 
3.1.2 family of curves, 16.2.2 family of curves, higher-order vector field, 30.8 family of diffeomorphisms, differentiable, 34.6.7 family of diffeomorphisms, vector field, 34.6.8 family of functions, 6.8, 6.8.14 family of functions, Cartesian product, 6.9 family of geodesic curves parametrized by endpoints, 38.6.3 family of geodesic interpolations, 38.6 family of geodesics, one-parameter, 41.4.1 family of local diffeomorphisms, 33.5.0 family of sets, 6.8, 6.8.1
family of sets, Cartesian product, 6.9, 6.9.1 family of sets, intersection, 6.8.5 family of sets, union, 6.8.5 family of systems, parametrized, 2.8.5 feedback, positive, audio system, 3.3.7 feedback, sensor/motor, 3.3.2 feet, 2.2.1 female slave, 3.5.3 Fermat, Pierre de, 46.1.3 Feynman, Richard Phillips, 2.0.4, 46.1.6 fibration, differentiable, 27.12 fibration, differentiable, with fibre atlas for specified fibre space, 27.12.7 fibration, differentiable, with intrinsic fibre space, 27.12.1 fibration, differentiable, with specified fibre space, 27.12.3 fibration, non-topological, 22.1 fibration, non-topological, parallelism, 22.2 fibration, non-uniform non-topological, 22.1.1 fibration, tangent, 28.8.8 fibration, topological, 27.12.0 fibration, topological, cross-section, 27.12.12 fibration, uniform non-topological, 22.1.3 fibration, uniform non-topological with fibre space F , 22.1.9 fibration identification space, 23.4 fibre, standard, 23.3.3 fibre atlas, 22.1.7, 23.3 fibre atlas, differentiable, 27.12.6 fibre atlas, topological, equivalence, 23.6.16 fibre atlas for a topological fibration, 23.3.14 fibre bundle, associated, differentiable, 35.6, 35.6.3, 35.6.7 fibre bundle, cross-section, 6.9.7 fibre bundle, differentiable, 28.8.0, 35 fibre bundle, differentiable, associated, 35.6.10 fibre bundle, differentiable, connection, 36 fibre bundle, differentiable, vector field, 35.3 fibre bundle, non-topological, 22, 22.3, 22.3.1 fibre bundle, principal, affine connection, 37.8 fibre bundle, principal, differentiable, 35.4, 35.4.1 fibre bundle, principal, differentiable, vector field, 35.5 fibre bundle, principal, topological, 23.9 fibre bundle, tangent, 35.8.1 fibre bundle, topological, 23, 23.6, 23.6.4, 35.0.0 fibre bundle, topological, associated, 23.10, 23.10.5 fibre bundle, topological, history, 23.1 fibre bundle, topological, motivation, 23.1 fibre bundle, topological, overview, 23.1 fibre bundle, topological, pathwise parallelism, 24.2 fibre bundle association, 
differentiable, 35.6.2 fibre bundle example, 44 fibre bundle homomorphism, 23.7 fibre bundle isomorphism, 23.7 fibre bundle on Euclidean space, 44.1 fibre bundle product, 23.7 fibre chart, 22.1.5, 23.3.4 fibre chart, compatible, 23.6.14 fibre chart, differentiable, 27.12.5 fibre set automorphism through the charts, 23.8.7 fibre set glue, 27.12.0 fibre set isomorphism through the charts, 23.8.14 fibre set map, structure-preserving, 23.8 fibre set parallelism, topological, 24.1.9 fibre set parallelism space, topological, 24.1.9 fibre space, 23.3.3 fibre space of tangent bundle, 28.2.5 fibre-to-fibre homeomorphism space of topological fibre bundle, 36.8.3 field, 9.8, 9.8.8 field, gravitational, 2.11.18
field, Jacobi, 38.3, 38.3.2 field, metric tensor, 10.2.18 field, operator, composition, 33.2.7 field, Riemannian metric tensor, 39.2.3 field, tangent vector, partial, 31.4.10 field, tensor, 29.7, 29.7.1 field, tensor, differentiable, 29.7.2 field, vector, 29.5, 31.5 field, vector, composition, 33.2.2 fifth problem, Hilbert’s, 34.0.5, 34.2 figure, 37.1.4 figure of transformation group, 9.7, 9.7.0 file system directory, computer, 5.7.19 finite bandwidth, 2.5.13 finite bandwidth, human communication, 2.11.14 finite basis for linear space, 10.2.9 finite-dimensional linear space, 10.2.5 finite-dimensional linear space dimension, 10.2.5 finite group, 7.4.5 finite machine, 2.10.9 finite measurement resolution, 2.12.1 finite mind, 2.10.2 finite naive set theory, 3.13.3 finite open cover, 14.10.12 finite ordinal number, 7.2.12 finite paper, 2.10.7 finite set, 7.2.20 finite set, Dedekind, 7.2.24 finite set topology, combinatorics, 14.3.6 finite transformation group as fibre bundle, 22.4 finitely populated proposition store, 3.3.7 first order language with equality, 5.2.3 first order logic, NBG set theory, 4.13.13 first order logic, semantic space, 4.13.13 first order logic, ZF set theory, 4.13.13 fish, 5.15.6, 39.1.5 Fisher information matrix, 39.9.1 five layers, linguistic structure, predicate calculus, 4.12.4 five layers, linguistic structure, propositional calculus, 4.5.4 fix, bug, 3.10.4 fixed stars, 24.1.4 flat Earth, 42.0.1 flat paper, 26.1.5 flat paradox, 5.7.23 flatness deviation, parallelism, 36.4.1 flavour, topology, 14.1.2 flavours, logic, 4.0.1 floating-point number, 5.16.1, 8.3.1 flooding, Nile, 46.2.2 floor function, 8.6.8 flow, information, incomplete, 3.8.2 flow diagram, topics, 1.2 flow of vector field, 33.4.2 flow velocity, 33.4.3 fluxions, 18.1.1, 18.2.11 fly, 5.15.3, 23.5.0, 33.3.1, 43.4.1 fly, ointment, 7.2.5 foe, friend or, 2.2.4 foisting propositions, 3.7.12 folder, file system, computer, 5.7.19 font, Computer Modern Roman, 1.8 font, Fraktur, 14.2.14 
font, rsfs, 1.8 font, wncyr, 1.8 food, poisonous, 3.10.15 football, 2.9.4 foreground claim, 3.5.4 foreground proposition, 3.6.1, 3.7.3
forest, Brazilian, 3.2.8 form, alternating, 13.9.4 form, curvature, 37.7.3, 39.5.2 form, differential, 29.9, 29.9.1 form, differential, in flat space, 20.5 form, second fundamental, 39.1.4 form, statement, 4.5.2, 4.5.4 form, statement-form-name, 4.5.4 form, torsion, 37.7.5 form bundle, alternating, cross-section, 20.5.4 formal direct sum of linear spaces, 10.6.1 formal linear combination of vectors, 10.10.4 formal notation introduction, 5.8.6 formalism, 5.9.3 formalism, secure, 1.5.1 formalism, topology, 14.2.1 formalisms and notations, plethora, 1.4.1 formalist existence proof, faith, 2.2.8 formalization, logic procedures, 4.5.1 formalization, propositional calculus, 4.5 formalized logical algebra, 3.1.2 formicarium, 2.10.3, 2.10.6 Forms, Platonic, 2.4.1 formula, boolean, 48 formula, set-theoretic, 6.3.3 formula, well-formed, 4.3.10, 4.5.2 foundation, logic, 3.0.1 foundation, mathematics, 5.0.1 foundation axiom, 5.5.1 foundations of mathematics, 1.5.1 Fourier, Jean Baptiste Joseph, 2.9.10, 46.1.5 fractional differentiability, 43.4.0 fractional part function, 8.6.13 Fraenkel, Abraham Adolf, 2.9.7, 46.1.6 Fraktur font, 14.2.14 frame, inertial, 24.1.4 frame, orthogonal, 39.4.6 frame, tangent, 28.12, 28.12.1 frame, tangent operator, 28.12.4 frame bundle, tangent, 35.9, 35.9.1 framework, logic, 4.0.1 framework system, meta-logical, 5.7.22 Fréchet, René Maurice, 46.1.6 free group, 10.10.4 free linear space, 10.10, 10.10.1, 10.10.5 free linear space standard immersion, 10.10.1 free linear space used to define tensor product, 13.11 free variable, 5.1.24, 5.8.16 Frege, Friedrich Ludwig Gottlob, 2.4.6, 2.9.7, 46.1.6, 46.2.18 friend or foe, 2.2.4 frog, 7.2.7 FTOC (fundamental theorem of calculus), 20.2.2 function, 6, 6.5, 6.5.6 function, bijective, 6.5.23 function, continuous, 14.11 function, convex, 38.8, 38.8.1 function, domain/range specification, 6.5.5 function, empty, 6.5.17 function, function-valued, 6.12.1 function, identity, 6.5.19 function, injective, 6.5.23 
function, inverse, 6.5.25 function, local, 6.11.3 function, logical, constant, 4.12.12 function, partial, 6.11, 6.11.3 function, partially defined, 6.11, 6.11.3 function, predicate logic, 4.12.11 function, real-valued, basic, 8.6
function, real-valued, higher-order differential, 32.1 function, set-theoretic, 6.5.1 function, surjective, 6.5.23 function, truth, 3.10.9, 4.2.4 function argument, 6.5.11 function composite, 6.7.1 function composition, 6.7, 6.7.1 function composition notation, 6.7.2 function definition, inline, 14.11.19 function direct product, 6.9.11 function direct product, pointwise, 6.9.12 function domain, 6.5.9 function extension, 6.5.31 function family, 6.8.14 function family, Cartesian product, 6.9 function image, 6.5.9, 6.5.10 function inverse set map, 6.6 function-predicate versus function-set, 6.5.1 function product, direct, 6.9.11 function quotient, 6.7.7 function range, 6.5.9, 6.5.10 function restriction, 6.5.27 function sequence, 7.1.14 function set, notation, 6.5.13, 6.5.15, 6.5.16 function set map, 6.6 function set notations, 6.12 function-set versus function-predicate, 6.5.1 function space, differentiability-based, 18.7 function space, integrability-based, 20.11 function target set, 6.5.10 function template, 6.5.22 function transpose, 6.12.3 function tree, 4.3.12 function value, 6.5.11 function-valued function, 6.12.1 functional, linear, 10.5, 10.5.1 functional module, 4.1.1 fundamental form, second, 39.1.4 fundamental tensor, 39.2.6 fundamental theorem of calculus, 20.2.3, 20.3.1, 20.6.3, 20.9.1, 21.0.1 Fux, Johann Joseph, 1.4.13 fuzzy ideas, 2.4.3 Gaelic, Irish, 3.9.5 Galilean relativity, 30.0.3 Galileo Galilei, 46.1.3 Galileo transformation, 30.0.3 Galois, Evariste, 9.2.1, 46.1.5, 46.2.15 Galois theory, 34.0.5 garden, 2.10.6 Gauß, Johann Carl Friedrich, 39.1.1, 39.1.2, 39.1.3, 39.1.4, 46.1.5, 46.2.8 Gauß-Green theorem, 25.3.6 Gaussian curvature operator, 19.5.3 gelada baboon, 2.2.4 general connection, alternative definitions, 36.9 general connection, covariant derivative, 36.7 general linear algebra, 9.10.4 general linear group, 10.9, 11.7, 37.0.1 general linear transformations, Lie group, 34.7.6 general relativity, 25.7.4, 39.1.3, 40.2 general relativity global 
solution, 40.4 general set intersection properties, 5.14 general set union properties, 5.14 general tensor algebra, 13.8 generalization, false, 17.0.1 generalized function, 28.1.1 generated topology, 14.8.4
geodesic, 37.7.8, 38 geodesic, length-parametrized, 41.5.0 geodesic, minimum-length, 39.3.3 geodesic, normal, 41.5.0 geodesic coordinates, 38.7.1, 38.7.3 geodesic curve, 38.2, 38.2.3 geodesic curve on two-sphere, 42.9, 42.9.3 geodesic curves parametrized by endpoints, 38.6.1 geodesic interpolations, family, 38.6 geodesic leverage, 32.7.0 geodesic leverage map, 38.8.3 geodesic on two-sphere, affinely parametrized, 42.10 geodesic path, 38.2.4 geodesic variation, equation, 41.4 geodesics, family, one-parameter, 41.4.1 geographical terrain, 3.5.10 geometric arts, 2.9.6 geometric measure theory, 13.9.0, 20.8 geometric properties of solutions of BVPs, 38.8.3 geometry, a-priori, 3.4.4 geometry, Babylonian, 39.1.5 geometry, Egyptian, 46.2.2 geometry, Euclid’s, axiomatization, 3.2.7 geometry, Euclidean, 2.1.9, 25.3.4, 37.0.2, 39.1.5 geometry, information, 39.9 geometry, pre-metric, 39.0.1 geometry, synthetic, 3.4.4 geometry of the 2-sphere, 42 geophysics, 8.7.3 germ, 28.1.11, 45, 45.9 gif, Old English, 3.5.4 Gilgamesh, 3.2.8, 3.5.2, 3.12.3 Gilgamesh epic, conditional statements, 3.5.2, 46.4.1 Gilgamesh epic, logical language, 3.5.2, 46.4.1 global connectivity classification, 14.1.2 global solution, general relativity, 40.4 global tangent bundle on two-sphere, 42.7 global warming, 3.10.6 glue, fibre set, 27.12.0 glue, topological, 25.2.1 goblin, 2.10.6, 2.10.7 Gödel, Kurt, 2.9.7, 46.1.6 gold, 3.5.3 golden eggs, goose, 2.10.1 golden tablets, 3.2.7 goose, golden eggs, 2.10.1 gradient in Riemannian space, 41.5.0 gradient operator, 19.2.3, 45 Gradus ad Parnassum, 1.4.13 graft, set, 42.6.0 graft, topological, 43.3.3 graft of sets, 6.10.6 graft topology, 26.6.1 grammar, 2.9.8 grammar book, 2.2.7, 2.2.8 grammatically correct, 2.5.18 granular space-time, 2.12.1 graph, acyclic, 5.7.19 Grassmann, Hermann Günther, 13.0.6, 46.1.5 Grassmann algebra, 13.0.6 gravitational field, 2.11.18 gravity law, Newton’s, 21.0.4 great circle line, 42.9.2 Greece, ancient, 2.1.6 Greek, ancient, logic, 
3.7.11 Greek alphabet, 14.2.13 Greeks, ancient, 3.2.8, 3.7.18, 5.0.2 Green-Stokes formula, 25.3.6
Gregorian year, 2.10.3 grey number, 2.10.1, 2.10.5, 2.11.12, 3.4.3 grey set, 2.10.1 grid, rectangular, 16.5.5 grooming behaviour, 2.2.4 grooves in space, 30.0.4 Grossmann, Marcel, 46.1.6 group, 9.2, 9.2.4, 9.2.18 group, Abelian, 9.2.21 group, commutative, 9.2.20 group, diffeomorphism, 34.0.1, 34.6 group, differentiable, 28.8.0, 34 group, direct product, 9.3.30 group, direct product of family, 9.3.30 group, essence, 5.16.2 group, finite, 7.4.5 group, free, 10.10.4 group, general linear, 10.9, 11.7, 37.0.1 group, homotopy, 16.6.0 group, left module, 9.9.17 group, left transformation, 9.4 group, Lie, 34.0.1, 34.1, 34.1.1 group, Lie transformation, 34.7, 34.7.1 group, locally Euclidean, 34.2.2 group, matrix, 11.7 group, orthogonal, 24.2.8, 36.6.2, 42.8.0 group, right transformation, 9.5 group, structure, 9.7.0, 44.1.0 group, structure, parallelism, 36.0.3 group, topological, 16, 16.7, 16.7.1 group, topological, locally Euclidean, 34.1.2 group, transformation, Lie, 34.0.1 group, trivial, 44.1.0 group centralizer, 9.3.27 group centre, 9.3.27 group identity, 9.2.10 group inverse, 9.2.13 group left inverse, 9.2.13 group morphism notations, 9.2.23 group morphisms, 9.2.22 group normalizer, 9.3.25 group of differentiable diffeomorphisms, 34.6.3 group of differentiable diffeomorphisms, topological, 34.6.5 group of linear transformations of ℝⁿ, 10.9 group right inverse, 9.2.13 group size, social, 2.2.4 groups, abstract, 9.4.1 groups, holonomy, 24.4.0 groups, transformation, 9.4.1 groups of chapters, 1.3 guesswork, brilliant, 1.4.7 Guinea, New, 3.2.8 gyf, Old English, 3.5.4 habilis, Homo, 2.2.4 Hadamard, Jacques Salomon, 5.9.3 half-angle rules, trigonometry, 20.13.13 half animal, 1.4.7 half robot, 1.4.7 hall of mirrors, 5.7.19 Halmos, Paul Richard, 5.0.1 Hamilton, William Rowan, 28.0.5, 46.1.5 Hammurabi Code of Laws, 3.5.3 hamster, 48.4.1 handshake procedure, 2.1.4 hang, system, 3.10.4 hard link, computer, 5.7.19 hardware, computer, 3.13.4, 3.13.8 Harry’s dog, barking, 3.6.3
Hausdorff condition, topological manifold, 26.3.2 Hausdorff separation, 15.2.12 Hausdorff space, 15.2.13, 26.2.4 Hausdorff space condition, topological manifold, 36.1.2 Hausdorff topology, 15.2.13 Haydn, Joseph, 1.4.13 heaven, mathematical, ontology, 2.3.10 Heaviside function, 8.6.6 Heine-Borel compactness, 15.7.5 Heine-Borel theorem, 5.9.10, 5.9.13, 17.3.26, 17.3.27 Hermite, Charles, 2.4.5, 46.1.6 Hesse, Ludwig Otto, 46.1.5 Hessian matrix, 25.6.2, 27.5.9, 27.5.10 Hessian of second derivatives of distance function, 25.7.1 Hessian operator, 30.6.1, 37.5, 37.6.2 Hessian operator at a point, 37.5.3 Hessian operator at critical point, 32.6 hieroglyphics, Egyptian, 2.5.5, 3.10.6 high-level set language, 5.6.2 higher-order derivatives for real-to-real functions, 18.4 higher-order derivatives for several variables, 18.6 higher-order differentiability, 18.4.1 higher-order differential, 32 higher-order differential of curve, 32.3 higher-order differential of curve family, 32.4 higher-order differential of differentiable map, 32.2 higher-order differential of real-valued function, 32.1 higher-order operator, differentiable map, differential, 32.7 higher-order operator, differential of curve, 32.8 higher-order operator, real-valued function, differential, 32.5 higher-order tangent operator, 30.1 higher-order tangent space, 30.4 higher-order tangent vector, 30, 30.3 higher-order vector field, 30.7 higher-order vector field for family of curves, 30.8 Hilbert, David, 34.0.5, 46.1.6 Hilbert space, 10.2.22 Hilbert’s fifth problem, 34.0.5, 34.2 Hipparchus, 46.1.1 Hippocrates of Chios, 7.4.6, 46.1.1 history of differential geometry, 46 history of tensor calculus, 41.1 history of topology, 14.1 Hobby, John D., 1.8 Hoekwater, Taco, 1.8 Hölder condition on Riemannian space, 40.1.2 Hölder continuity, 18.9 Hölder continuous function, 18.9.1 Hölder-continuous manifold, 43.4 Hölder function space, 45.7 holonomy groups, 24.4.0 homeomorphism, 14.12, 25.2.9 homeomorphism, 
directionally differentiable, 19.6 homeomorphism pseudogroup, 14.1.3, 19.4.2, 19.4.3, 19.4.5, 19.4.6, 19.4.9, 19.6.4, 19.6.7 homeomorphism pseudogroup, complete, 19.4.7 Homer, 3.2.8 hominid, 2.2.4 Homo erectus, 2.2.4 Homo habilis, 2.2.4 homology theory, singular, 16.6.0 homomorphism, Lie algebra, 9.11.6, 9.11.16 homomorphism, linear space, 10.3.6 homomorphism, order, 7.1.13 homomorphism between modules, 9.9.10 homomorphism module, 9.9.5, 9.9.14, 9.9.26 homotopy continuous, 24.4.1 homotopy group, 16.6.0
horizontal component of total tangent space, 28.11, 28.11.2 horizontal component of vector, 27.12.8 horizontal lift function for ordinary fibre bundle, 36.3 horizontal lift function for principal fibre bundle, 36.5 horizontal lift function on differentiable fibre bundle, 36.3.5, 36.3.12 horizontal lift function on principal bundle, transposed, 36.5.3 horizontal map function for principal fibre bundle, 36.9.2 horizontal vector, 10.11.11 horizontal vector on principal fibre bundle, 36.9.4 horse, 1.5.4 hours of daylight calculation, 42.15 Hoyle, Frederick (Fred), 46.1.6 Hrunting, 46.4.2 Hülle, abgeschlossene, 14.4.6 human brain, 4.2.8 human communication, finite bandwidth, 2.11.14 human language, 3.2.8 human literature, 3.8.2 human minds, 2.5.7 human procedure, logic, 3.4.3 human speech, 2.2.4 hunters, mammoth, decision-making, 3.6.1 Huygens, Christiaan, 46.1.3 hydrogen, 2.1.5 hydrolysis, 14.7.1 hyperbolic first order systems of PDE, 20.10.0 hyperbolic metric, 17.1.15 hyperbolic PDE, 21.4.0 hyperboloid, 43.8 hypothesis, continuum, 5.0.10 ibn Mūsā, Abū Ja’far Mohammed, 46.2.3 ice cream, 3.3.10 ideal, Platonic, 2.3.3 ideal of ring, 9.8.7 ideas, big, 1.4.10 ideas, fuzzy, 2.4.3 ideas, Plato’s theory, 2.4 idempotent linear automorphism, 10.1.5 identification map, standard, for Cartesian product, 7.7.6 identification set, 6.4.5 identification space, 6.4.5, 6.10, 6.10.5 identification space, topological, 26.6.0 identification space method, 23.12.10 identification space representation, 6.10.6 identity, 2.4.3, 2.4.4 identity, group, 9.2.10 identity, Jacobi, 9.11.0, 9.11.1, 9.11.3 identity, set, 5.2.6 identity element of unitary ring, 9.8.2 identity function, 6.5.19 identity matrix, 11.1.20 if and only if, 4.3.3 IF-construction, 3.5.4 Iliad, 5.0.2 image of function, 6.5.9, 6.5.10 image of path, 16.4.2 image of relation, 6.3.6, 6.3.7 image of set by relation, 6.3.13 images, digital, 2.9.1 imaginary world, model, 2.10.1 imaginative process, predicate calculus, 4.14.4 
immersion of manifold, 31.3.10 imperative verb mood, 3.10.14, 3.12.1, 3.12.3 imperfect universe, 3.4.3 implication, logical, 4.3.3 implication-based propositional calculus, 4.7 implication operator, primacy, 4.6.5
implication operator chain, 4.6.5 implies, 4.3.3 import, concrete equality relation, 4.15.2 importation, logical operator, 4.2.1, 4.3.1 imported mathematical structure, 2.6.2 inclusion of sets, cyclic, 4.1.9 inclusive disjunction, 3.5.5 inclusive or, 3.5.2, 3.5.7, 3.13.7, 4.3.5 incomplete axiomatic system, 3.8.2 incomplete information flow, 3.8.2 incomplete information transfer, 3.8 incompressible number, 2.10.4, 2.11.2 inconsistency, world-model ontology, 5.7.12 inconsistency-tolerant logic, 4.1.5, 5.7.11, 5.7.12 inconsistent axioms, 4.1.9, 4.12.7 inconsistent logic machine, 3.10.4 inconsistent truth value function, 4.1.5 index convention, Einstein, 7.11.15, 13.7.5, 13.8.18, 13.9.2 index function, permutation, 7.10.11 index lowering, 13.8.18 index of this book, 51 indexed, differentiable, 27.2.2 indexed atlas, differentiable, 27.2.2 indexed differentiable atlas, 27.2.2 indicative verb mood, 3.7.1, 3.10.11, 3.12.1, 3.12.2 indicator function, 7.9.2 indirect method of proof, 3.11.1, 3.11.9 individual constant, 4.12.12 Indo-European languages, 3.12.2 induced map, 31.1 induced map for tagged tangent operator, 31.3.24 induced map for tangent operator, 31.3.23 induced map for vector field, 31.3.14 induced map of differentiable map, 31.3.6 induced topology of differentiable atlas, 27.2.4 induced topology on a set by a function, 14.11.5 induction, 3.7.8 induction, mathematical, 2.11.1, 7.3.4 induction, mathematical, principle, 2.10.1 induction, mathematical, validity, 2.11.17 induction, naive, 6.1.15, 7.2.6 induction, transfinite, 18.4.1 induction argument example, 14.2.8 induction-capable machine, 5.7.25 induction principle invalidity, 2.11.3 inductive definition, 18.4.1 inequality, triangle, 17.1.12, 17.1.14 inference from experience, 7.2.1 infimum of partially ordered set, 7.1.10 infinite, 2.11.20 infinite-dimensional linear space, 10.2.5 infinite-dimensional manifold, tangent bundle, 28.16 infinite paper, 2.10.6 infinite proposition space, 4.12.2 infinite sequence, 
termination, 2.11.22 infinite set, Dedekind, 7.2.24 infinite versus unbounded, 5.5.2 infinite world, 2.10.9 infinitesimal, 37.2.1 infinitesimal action, 36.3.1 infinitesimal curve, 33.2.0 infinitesimal transformation, 34.8, 35.3.0 infinitesimal transformation of Lie right transformation group, 34.8.10 infinitesimal transformation of Lie transformation group, 34.8.4 infinitesimal vector, 19.1.5 infinitesimals, 18.1
infinity, 8.4.4 infinity, absurdly large, 2.0.3 infinity, concept, 7.2.1 infinity, logical quantifier, 4.13.10, 4.13.12 infinity, ontology, 2.10.1 infinity, philosophically troubling, 14.1.5 infinity, philosophy, 2.11 infinity axiom, 2.10.6, 2.10.7, 7.2.6 infinity axiom, ZF, 2.11.13, 5.6 infix notation, 6.3.22 information bottleneck, cosmic, 2.11.17 information content, 2.5.15 information content, logical conjunction/disjunction, 3.5.9 information content, logical operator, 3.5.5 information content, proposition, 3.8.1 information content, topology, 14.7.10 information content, universal/existential quantifier, 4.13.9 information flow, incomplete, 3.8.2 information geometry, 39.9 information matrix, Fisher, 39.9.1 information transfer, incomplete, 3.8 initial index, list, 7.12.3 initial point of curve, 16.2.13 initial point of path, 16.4.11 initial value problems, 21.4 injection, 6.5.23 injective function, 6.5.23 injective relation, 6.3.31 inline function definition, 14.11.19 inline proof, 4.9.10 inner automorphism, 9.3.22 inner product, 10.8 inner product, Euclidean, standard, 10.8.5 inner product on Riemannian manifold, 39.7 insert function for list, 7.12.2 insertion, proposition list, 3.7.14, 3.10.2 inside, 5.7.17 inside skew product of transformation groups, left, 9.6.8 inside skew product of transformation groups, right, 9.6.9 insight, 1.10.2 integer, 7 integer, composite, 7.4.2 integer, compressible, 2.11.6 integer, even, 4.13.10 integer, extended, 7.6, 7.6.2 integer, largest, 2.11.5, 2.11.7 integer, nearest, distance function, 8.6.19 integer, nearest, function, 8.6.14 integer, negative, 7.5.4 integer, negative extended, 7.6.4 integer, philosophy, 2.11 integer, positive, 7.5.4 integer, positive extended, 7.6.4 integer, prime, 7.4.2 integer, semi-pixie, 2.11.10 integer, signed, 7.5, 7.5.2 integer arithmetic, unsigned, 7.4 integer definitions, boot-strap, 7.2.5 integer notation summary, 1.6.1 integer representation, 28.1.4 integer set notations, 7.5.4 integer 
set notations, extended, 7.6.4 integers, quantum mechanics, 3.4.3 integrability-based function space, 20.11 integral, Lebesgue, 20.2 integral calculus, 20 integral calculus and logic, 4.4.2 integral of Riemannian metric, 39.3.2
integration, 20 integration, Lebesgue, 20.2 intellect, superior, 3.3.4 intellectual capital, 4.7.7 intellectual economy, 2.11.14 intelligence, artificial, 2.3.1 inter-galactic civilisations, 2.0.3 interior of set, 14.4 interior of set, topological, 14.4.1 interior point of open set, 14.2.9 interior point of set, 14.0.2 internal direct sum of linear spaces, 10.6.5 international dateline, 42.1.5 Internet protocols, 2.5.7 Internet resources, 50.0.1 Internet standards, 2.5.13 interoperability of communication systems, 2.8.9 interoperability tests, 2.5.2 interpolation, convex curvilinear, 16.5 interpretation, domain, propositional calculus, 4.12.8 interpretation, logic, 3.3.3 intersection of family of sets, 6.8.5 intersection of sets, 5.13.2 intersection of sets, properties, binary, 5.13 intersection of sets, properties, general, 5.14 inertial frame, 24.1.4 interval, open, 14.9.2 interval, real numbers, 8.3.10 interval, unit, 8.3.10 introduction, formal notation, 5.8.6 introduction to the book, 1 introspection, 2.5.20 intuition, 5.0.3 intuition, diagrams, 3.6.2 intuitionism, 2.2.8, 3.6.4, 5.9.3 intuitive topology, 14.2.9 invariant of diffeomorphism, 37.1.4 invariant of transformation group, 9.7 invariant vector field, right, 34.4.7 inverse, group, 9.2.13 inverse function, 6.5.25 inverse image of set by relation, 6.3.13 inverse-image topology, 14.11.6 inverse matrix, left, 11.1.24 inverse matrix, right, 11.1.24 inverse of linear map, 10.4.1 inverse problem, algebra, 4.4.2 inverse problem, logic, 4.4.2 inverse relation, 6.3.27, 6.5.25 inverse set map, function, 6.6 inverse set map corresponding to a function, 6.6.1 inverse trigonometric function, 20.13.2 invertible linear transformation, 9.11.15 invertible matrix, 11.3.15 invisible pixie, 2.2.8, 2.10.7 Iraq, 3.5.2 Irish Gaelic, 3.9.5 irrational numbers, 2.9.3, 2.12.4, 2.12.5 is-proposition, 3.3.10 isolated point of set, 14.6.5 isometry on two-sphere, 42.8 isomorphism, Lie algebra, 9.11.8 isomorphism, 
linear space, 10.3.6 isomorphism, order, 7.1.13 isomorphism test, 2.8.6 IVP (initial value problem), 21.0.2 Ja’far Mohammed ibn Mūsā, Abū, 46.2.3
Jacobi, Carl Gustav Jacob, 46.1.5 Jacobi field, 38, 38.3, 38.3.2 Jacobi field on two-sphere, 42.13 Jacobi identity, 9.11.0, 9.11.1, 9.11.3 Jacobian matrix, 19.1.3, 27.5.10, 31.3.15 Jacquard, Joseph-Marie, 2.5.14 Jacquard loom design, 2.5.14 jet, 28.1.11, 45, 45.10 jigsaw puzzle, 0.0 joint denial, 4.3.6, 4.6.4 joint denial operator, 4.3.8 joke, mathematics, 44.2.6 Jones, William, 46.2.13 Jordan arc, 16.2.11 Jordan factorial function, 7.10.16 journey, 16.1.1 juxtaposition, symbolic, 2.5.12 k-combination, 7.11.1 k-permutation, 7.11.1 kangaroo, roast, 2.2.7 Kant, Immanuel, 3.4.4, 46.1.5 Kantian philosophy, 26.1.2 Kepler, Johannes, 46.1.3 key, secondary, 2.6.1 Khwārizmi, al, 46.2.3, 46.2.4 Klein, Felix, 14.1.3, 19.4.2, 23.5.0, 46.1.6, 46.3.0 Klein bottle, 44.2.2 kludge, ad-hoc, 5.7.27 knitting, 2.9.6 knowledge, 1.9.1 knowledge, a-priori, 2.1.3, 2.5.4 knowledge, absolute, 2.5.14 knowledge, cyclic, 2.1.3 knowledge, mathematical, corpus, 3.2.5, 5.0.3 knowledge, prior, 3.0.1 knowledge bedrock, 2.0.1 knowledge wheel, 2.0.1 known unknown, 4.2.8 Knuth, Donald Ervin, 1.8 Konnektion, 36.1.3 Kontinuum, 26.1.9 Körper, 9.8.10 Kronecker, Leopold, 2.12.5, 7.2.7, 46.1.5 Kronecker delta function, 7.9.1, 7.9.10, 11.1.22, 22.4.0, 28.7.5 Kuratowski, Kazimierz (Casimir), 6.1.5, 46.1.6 lacto-vegetarian, 5.15.6 ladder, Schild’s, 37.6.10, 37.7.8 Lagrange, Joseph-Louis, 2.9.10, 46.1.4, 46.1.5 Lagrangian mechanics, connection, 37.10 land measurement, 46.2.2 language, 3.5.1 language, discussion, 3.10.6 language, first order, with equality, 5.2.3 language, human, 3.2.8 language, location, 2.5.7 language, logic, low-level, 5.6.2 language, logical, 3.11.7 language, logical, ancient literature, 46.4 language, logical, Bēowulf, 46.4.2 language, logical, Gilgamesh epic, 3.5.2, 46.4.1 language, mathematical, 2.5.7 language, natural, 3.10.6, 3.12.1 language, natural, logic, 3.13.1 language, predicate, name-to-object map, 4.12.3 language, set, high-level, 5.6.2 language, steering, 2.5.7 language 
families, natural, 1.4.12
languages, Indo-European, 3.12.2 Laplace, Pierre-Simon, 2.9.10, 46.1.4, 46.1.5 Laplace-Beltrami operator, 39.6.3 Laplacian in conical coordinates, 43.7.0 Laplacian operator, 19.5.3, 30.0.6, 30.7.7, 39.6.1, 39.6.2 Laplacian operator in Riemannian space, 41.5.0 large infinity, absurdly, 2.0.3 largest integer, 2.11.5, 2.11.7 largest topology, 14.3.1 Latin, 2.1.2, 3.9.5, 4.6.8 latitude, 26.4.5 law, ancient history, 3.10.14 law, de Morgan’s, 5.13.11, 5.14.8 law, Scottish, not proven, 3.10.15 Laws, Code, Hammurabi, 3.5.3 laws of motion, 20.6.18 layer, boot-strap, 2.1.6 layer, magma, 3.14.1 layers, differential geometry, overview, 25 layers, five, linguistic structure, predicate calculus, 4.12.4 layers, five, linguistic structure, propositional calculus, 4.5.4 layers of structure of differential geometry, 1.1 learning, animal, 7.2.1 learning, asterisk method, 1.10.1 learning, rote, 1.10.2 learning mathematics, 1.10 least action principle, 37.10.1 Lebesgue, Henri Léon, 20.1.1, 46.1.6 Lebesgue dimension, 15.9.1, 34.2.5 Lebesgue integration, 20.2 Lebesgue measure, 13.9.0, 20.1 Lebesgue non-measurable set, 2.10.6, 2.11.9, 3.2.1, 4.13.10, 5.9.3, 5.9.10, 5.9.11, 20.1.3, 20.1.4 Lebesgue non-measurable set, unknowable existence, 2.10.8 Lebesgue number, 17.3.22, 17.3.24 left A-module, 9.9.8 left action on Lie right transformation group, 34.8.9 left-associative operator, 4.3.12 left conjugate of a subset of a group, 9.3.15 left conjugation map, 9.3.22 left conjunct, 4.3.11 left coset of subgroup, 9.3.3 left-differentiable function, 18.3.4 left disjunct, 4.3.11 left inside skew product of transformation groups, 9.6.8 left invariant vector field, Lie group, 34.3.11 left invariant vector field on Lie group, 34.3 left inverse, group, 9.2.13 left inverse matrix, 11.1.24 left module, unitary, 10.1.3 left module over a ring, 9.9.19 left module over a ring, unitary, 9.9.21 left module over group, 9.9.17 left-open set, 18.3.2 left outside skew product of 
transformation groups, 9.6.4 left-side membership relation, 5.2.6 left transformation group, 9.4, 9.4.4 left transformation group, topological, 16.8.17 left transformation group homomorphism, 9.4.9 left transformation group homomorphism, topological, 16.8.6 left transformation group mirror image, 9.6.1 left transformation group of a topological space, effective, 16.8.7 left transformation group of topological space, 16.8.2 left transformation group of topological space, effective topological, 16.8.8
Differential geometry reconstructed.Copyright (C) 2009, Alan U. Kennington. All Rights Reserved.The author hereby grants permission to print this book draft in A4 format.Printing in all other formats is forbidden.You may not charge any fee for copies of this book draft.
860
51. Index
[ www.topology.org/tex/conc/dg.html ]
Lie Lie Lie Lie Lie
group, left invariant vector field, 34.3 group, left translation operator, 34.3.2, 34.7.9 group, right invariant vector field, 34.4 group, right translation operator, 34.4.2 group, right translation operator for tangent vectors, 34.4.3 Lie group left invariant vector field, 34.3.11 Lie group Lie algebra, 34.5, 36.3.7 Lie group Lie algebra, tangent-space version, 34.5.1 Lie group Lie algebra, vector-field version, 34.5.2 Lie group of general linear transformations, 34.7.6 Lie group space, 34.7.2 Lie left transformation group, analytic fibre bundle, 35.2.5 Lie right transformation group, infinitesimal transformation, 34.8.10 Lie right transformation group, left action, 34.8.9 Lie structure group, 33.2.0 Lie structure group, differentiable fibre bundle, 35.2, 35.2.2 Lie subalgebra, 9.11.5 Lie transformation group, 34.0.1, 34.7, 34.7.1 Lie transformation group, infinitesimal transformation, 34.8.4 Lie transformation group, left translation operator, 34.7.8 Lie transformation group, right action, 34.8.3 Lie transformation group space, 34.7.2 Lie transport, 33.4.5 Lie transport curve, 33.4.6 lift function, 10.11.11 lift function, horizontal, for ordinary fibre bundle, 36.3 lift function, horizontal, on differentiable fibre bundle, 36.3.5 lift function, tangent bundle, 28.2.1 lift function, tangent bundle, unidirectional, 28.14.2 lift function for principal fibre bundle, horizontal, 36.5 lift function on differentiable fibre bundle, 36.3.12 lift function on principal bundle, horizontal, transposed, 36.5.3 lift of vector field by connection on principal fibre bundle, 36.5.8 light, speed, 39.1.5 light, visible, 3.4.5 limit, philosophically troubling, 14.1.5 limit notation, 14.11.16 limit of function at a point, 14.11.15 limit of sequence of points, 14.11.27 limit point of sequence, 14.11.26 limit point of sequence of points, 14.11.27 limit point of set, 14.6.1, 14.11.26 limit processes, 14.7.1 limit set of function at a point, 14.11.15 limits, metaphysics, 4.13.12 line 
bundle, 35.7.2 line segment in affine space, 12.2.5 line through points in affine space, 12.2.5 linear algebra, 10 linear automorphism, idempotent, 10.1.5 linear combination of vectors, 9.7.0, 9.7.1, 10.2.8, 10.2.22 linear combination of vectors, formal, 10.10.4 linear functional, 10.5, 10.5.1 linear group, general, 37.0.1 linear map, 10.3, 10.3.1 linear map, component matrix, 11.2 linear map, exact sequence, 10.11 linear map between modules over a ring, 9.9.25 linear map component matrix, 11.2.1 linear map exponential, 10.4.1 linear map for a component matrix, 11.2.3 linear map inverse, 10.4.1 linear map transpose, 10.5.23 linear representation of a Lie algebra, 9.11.17 [ draft: UTC 2009–3–21 Saturday 11:36 ]
Differential geometry reconstructed.Copyright (C) 2009, Alan U. Kennington. All Rights Reserved.The author hereby grants permission to print this book draft in A4 format.Printing in all other formats is forbidden.You may not charge any fee for copies of this book draft.
left transformation group of topological space, topological, 16.8.3 left transformation semigroup, 9.4.2 left translation operator, 34.3.1 left translation operator for tangent vectors, 34.3.7 left translation operator of Lie transformation group, 34.7.8 left translation operator on Lie group, 34.3.2, 34.7.9 left/right transformation group ambiguity, 2.5.4, 5.16.6, 34.8.8 leftward path, set membership, 5.5.2 legal system, Mesopotamia, 3.7.11 Legendre, Adrien Marie, 2.9.10, 20.13.22, 46.1.5 Leibniz, Gottfried Wilhelm, 18.2.11, 20.1.1, 46.1.4, 46.1.5, 46.2.11 Leibniz rule, 45.0.0 Leibniz rule for derivative, 28.1.1 Leibniz rule tangent vector, 45.1.1 length, Planck, 2.12.1 length of curve, 17.5.9, 41.5.0 length of list, 7.12.2 length of multi-index, 18.6.1 length of ordered traversal, 17.5.9 length of rectifiable path, 17.5.17 length of vector in Riemannian manifold, 39.2.7 length-parametrized geodesic, 41.5.0 Leonardo da Vinci, 3.9.8 level curve, 16.1.7 level manifold, 16.1.7 leverage, geodesic, 32.7.0 leverage map, geodesic, 38.8.3 Levi-Civita, Tullio, 36.1.5, 46.1.6 Levi-Civita alternating symbol, 7.9.1, 7.10.20 Levi-Civita connection, 39.4, 39.4.2, 39.4.5, 39.6.2, 42.5.2 Levi-Civita connection, Euclidean space, 12.4.1 Levi-Civita connection, globality, 25.7.5 Levi-Civita connection, metric layer, 36.0.4, 39.0.1 Levi-Civita connection, orthogonal, 25.6.5 Levi-Civita connection, parallelism, 36.2.3, 37.1.2 Levi-Civita connection, parallelism at a distance, 24.1.1 Levi-Civita connection, tensorization, 30.2.7 Levi-Civita connection, tensorization coefficients, 37.6.10 Levi-Civita connection on two-sphere, 42.2.2 Levi-Civita symbol, 7.10.11, 7.10.21, 7.10.22 Levi-Civita tensor, 7.10.20 lexicographic order, 16.1.8 liar’s paradox, 5.7.13, 5.7.22 library, software, 4.7.7 Lie, Marius Sophus, 19.4.1, 34.0.5, 46.1.5 Lie algebra, 9.11, 9.11.1 Lie algebra, adjoint, 9.11.22 Lie algebra, adjoint representation, 9.11.19 Lie algebra, real, 9.11.11 Lie algebra associated with 
associative algebra, 9.11.10 Lie algebra homomorphism, 9.11.6, 9.11.16 Lie algebra linear representation, 9.11.17 Lie algebra morphisms, 9.11.8 Lie algebra of Lie group, 36.3.7, 42.5.2 Lie algebra of Lie group, tangent-space version, 34.5.1 Lie algebra of Lie group, vector-field version, 34.5.2 Lie algebra on Lie group, 34.5 Lie algebra representation, 9.11.17 Lie algebra representation space, 9.11.17 Lie connection, 33.4.3, 33.4.4 Lie derivative, 33.4.9 Lie derivative expression for exterior derivative, 20.7 Lie derivative of tensor field, 33.5 Lie derivative of vector field, 33.4 Lie group, 34.0.1, 34.1, 34.1.1
linear representation of associative algebra, 9.10.7 linear space, 9.9.2, 10.1, 10.1.2 linear space, Euclidean, 10.2.20 linear space, finite-dimensional, 10.2.5 linear space, free, 10.10, 10.10.1, 10.10.5 linear space, infinite-dimensional, 10.2.5 linear space, internal direct sum, 10.6.5 linear space, quotient, 10.7.1 linear space, topological, 16.9.1 linear space axioms, 2.8.7 linear space basis, 10.2.23 linear space basis, finite, 10.2.9 linear space basis existence, 5.9.10, 5.9.12, 10.2.25 linear space bidual, 10.5.19 linear space differentiation, 18.8 linear space dimension, 10.2.5 linear space direct sum, 10.6 linear space direct sum, abstract, 10.6.3 linear space direct sum, external, 10.6.1 linear space direct sum, formal, 10.6.1 linear space double bidual, 10.5.19 linear space dual dimension, 10.5.10 linear space eigenspace, 10.4.1 linear space eigenvalue, 10.4.1 linear space eigenvector, 10.4.1 linear space endomorphism eigenspace, 10.4 linear space exact sequence, 10.11.1 linear space homomorphism linear space, 10.3.4 linear space morphism notations, 10.3.7 linear space morphisms, 10.3.6 linear space of linear space homomorphisms, 10.3.4 linear space of multilinear maps, 13.2, 13.2.3 linear space of rectangular matrices, 11.1.7 linear space of vector fields, 29.5.8 linear space quotient, 10.7 linear space second dual, 10.5.18 linear space sequence direct sum standard injection, 10.6.4 linear space subspace, 10.2.1 linear space tensor product, 13.5 linear subspace, 10.2 linear transformation, invertible, 9.11.15 linear transformations, general, Lie group, 34.7.6 linear transformations of ℝⁿ, group, 10.9 linguist, 2.2.8 linguistic structure, predicate calculus, five layers, 4.12.4 linguistic structure, propositional calculus, five layers, 4.5.4 linguistic style, symbolic logic, 4.0.2 linguistics, 2.2.7 linguistics, anthropological, 2.2.4 link, hard, computer, 5.7.19 Lipschitz, Rudolf Otto Sigismund, 46.1.6 Lipschitz constant, 17.4.10 Lipschitz
continuity, 17.4.10 Lipschitz curve transition map, 16.1.8 Lipschitz function, 17.0.4, 17.4.10 Lipschitz manifold, 19.6.8, 24.1.7, 27.10.1, 27.11, 27.11.4, 43.4.0 Lipschitzian function, 17.4.10 Lisa, Mona, 3.9.8 list, initial index, 7.12.3 list concatenation function, 7.12.2 list element insert function, 7.12.2 list element substitute function, 7.12.2 list element swap function, 7.12.2 list insertion, proposition, 3.7.14, 3.10.2 list length function, 7.12.2 list notation, 4.12.3 list omit function, 7.12.2
list operation, 7.12.2 list operation on ring, 9.12.2 list operation on semigroup, 9.12.1 list operations for sets with algebraic structure, 9.12 list product function, 9.12.2 list projection function, 9.12.1 list restriction function, 7.12.2 list space, 7.12.2 list space, extended, 7.12.5 list space for general sets, 7.12 list subsequence function, 7.12.2 Listing, Johann Benedict, 14.1.1, 46.1.5, 46.2.17 listing, putative, 2.11.21 literature, 1.4.7 literature, ancient, logic in, 3.5 literature, ancient, logical language, 46.4 literature, human, 3.8.2 literature, mathematical, 5.0.3 local continuity analysis, 14.1.2 local diffeomorphism family, 33.5.0 local function, 6.11.3 local maximum, 27.6.3 local maximum on differentiable manifold, 27.5.8 local minimum on differentiable manifold, 27.5.8 local transformations, one-parameter group, 27.7.4 locally compact topology, 15.7.12 locally connected topological space, 15.4.20 locally Euclidean group, 34.2.2 locally Euclidean space, 26.2, 26.2.3 locally Euclidean space, non-Hausdorff, 43.3 locally Euclidean topological group, 34.1.2 locally Euclidean topological space, 26.3.0 locally finite subset of a topological space, 15.7.3 location, mathematical objects, 2.7.1 location language, 2.5.7 logarithm function, 20.12, 20.12.1 logic, 4 logic, abstract, 3.9.6, 4.1.11 logic, analytic, 3.9.7 logic, animal, 3.5.10 logic, anthropology, 3.2.8 logic, applicability, 3.9.8 logic, Aristotelian, 3.4.2, 3.11.9 logic, art, 3.2.3 logic, automated, 4.3.10 logic, axiomatic reformulation, 3.13.4, 7.13 logic, colloquial, confusion, 3.5.7, 3.13.7 logic, completeness, 4.1.7 logic, contradiction-free, 3.10.12 logic, correct, 3.1.3 logic, foundation, 3.0.1 logic, human procedure, 3.4.3 logic, inconsistency-tolerant, 4.1.5, 5.7.11, 5.7.12 logic, inverse problem, 4.4.2 logic, mathematical, tour, 3.1.2 logic, mathematical, true nature, 3.1.2 logic, mechanization, 3.7.5, 4.0.2 logic, mental process model, 3.11.3 logic, modern, universality, 3.4 logic, 
naive, 3.2.1, 4.9.5 logic, natural language, 3.13.1 logic, paraconsistent, 4.1.5, 5.7.11 logic, predicate, 3.1.2 logic, proposition-store ontology, 3.11.6 logic, propositional, world-model ontology, 3.6.3 logic, science, 3.2.5 logic, self-consistency, 4.1.7 logic, semantics, 3
logic, semantics-free, 4.5.1 logic, symbolic, set theory, 5.0.3 logic, synthetic, 3.9.7 logic, three-truth-value, 3.11.7 logic, two-truth-value, 3.11.7 logic, wild, 2.2.7 logic, world-model ontology, 3.11.6 logic algebra, 3.1.2 logic and set theory, circularity, 3.3.8 logic and set theory, cyclic, 2.1.6 logic application, 3.3.3 logic applications, 3.2.5 logic axiom of distributivity, 4.7.5 logic axiom of restriction, 4.7.5 logic circuit, electronic, 4.5.5 logic customer, 3.3.8 logic discussion, 3.3.4 logic expression parse-tree, 4.3.10 logic flavours, 4.0.1 logic function, predicate, 4.12.11 logic in ancient literature, 3.5 logic interpretation, 3.3.3 logic language, low-level, 5.6.2 logic machine, 3.7.5, 3.7.7, 3.10.1, 4.2.8 logic machine, inconsistent, 3.10.4 logic machine, temporal parameter, 3.7.9 logic machine model, 3.6.1 logic machine model, recursive, 3.3.6 logic machine network, 3.7.14 logic machine reboot, 3.10.5 logic metaphor, 3.5.10 logic methods, 4 logic model, 3.3.3 logic ontology, proposition-store, 3.6, 3.7 logic ontology, world-view, 3.6 logic operator, arithmètic equivalent, 4.3.15 logic operator notations, 4.3.6 logic problem, 4.1.7 logic procedures, 4 logic procedures formalization, 4.5.1 logic quantifier duality, 4.13.10 logic quantifier notations, 4.13.7 logic remarks, 2.9 logic service provider, 3.3.8 logic structure, 3.3.3 logic styles, 4.0.1 logic substitution rule, 4.9.1 logic symbols, 4.3.5 logic syntax/semantics, 4.3.10 logical algebra, 4.14.4 logical algebra, formalized, 3.1.2 logical argument, 3.7.8 logical argumentation, 4.4 logical conjunction, 4.3.3 logical disjunction, 4.3.3 logical equation, 4.4.3 logical equations, simultaneous, 4.4.2 logical expression, 4.3 logical expression, exhaustive substitution, 4.14.3 logical expression, non-atomic, 3.5.4 logical expression evaluation, 4.4 logical expressions, equivalent, 3.10.13 logical function, constant, 4.12.12 logical language, 3.11.7 logical language, ancient literature, 46.4 logical language, Bēowulf, 46.4.2 logical language, Gilgamesh epic, 3.5.2, 46.4.1 logical negation, 3.10.1
logical negation, double, 3.11.6 logical negation, semantics, 3.10 logical operation, abstract-to-concrete map, 3.13.1 logical operation tree, 3.10.13 logical operator, 3.10.10, 4.3, 4.11 logical operator, associativity, 4.3.12 logical operator, asymmetric, 4.6.2 logical operator, information content, 3.5.5, 3.5.9 logical operator, principal connective, 4.3.11, 4.3.12 logical operator, zero-operand, 4.3.18, 4.3.19 logical operator importation, 4.2.1, 4.3.1 logical operators, defined by deduction rules, 4.6.3 logical predicate, zero-parameter, 4.12.10 logical proposition, verb mood, 3.12 logical quantifier, infinity, 4.13.10, 4.13.12 logical quantifiers, 4.13 logical sub-expression, parenthesized, 4.3.12 logical thinking, 3.2.8 logical voltage, 4.1.8 longitude, 26.4.5 loom design, Jacquard, 2.5.14 loop, modelling, 3.3.7 Lorentz transformation, 30.0.3 Lorentzian relativity, 30.0.3 low-level logic language, 5.6.2 lower bound of partially ordered set, 7.1.10 lowering the indices, 13.8.18 lowest-level concepts, 3.0.1 lucky dip, 5.11.5 Lukasiewicz, Jan, 4.7.1 luminiferous aether, 24.1.4 Mach, Ernst, 46.1.6 Mach’s principle, 24.1.4, 25.7.5 machine, finite, 2.10.9 machine, induction-capable, 5.7.25 machine, logic, 3.7.5, 3.7.7, 3.10.1, 4.2.8 machine, logic, network, 3.7.14 machine, mathematical thinking, 4.2.8 machine, world-model, 3.6.1 machine model, logic, 3.6.1 machine model, logic, recursive, 3.3.6 machine reboot, logic, 3.10.5 machines, virtual, 2.5.15 Maclaurin, Colin, 46.1.4 magma, molten, 2.1.1, 2.1.3 magma layer, 3.14.1 male slave, 3.5.3 Malus, Étienne-Louis, 3.11.9 mammoth hunters, decision-making, 3.6.1 management of extraneous properties, 2.7.4 management of propositions, predicate calculus, 4.12.1 manifold, affinely connected, 36.1.2 manifold, analytic, 27.9, 27.9.2 manifold, analytic, compact, 27.9.5 manifold, analytic, paracompact, 27.9.6 manifold, C⁰, 26.4.11 manifold, differentiable, 27.2.6 manifold, discrete, 26.1.6 manifold, embedded, 27.3.3 manifold, Hölder-continuous, 43.4 manifold, level, 16.1.7 manifold, Lipschitz, 19.6.8, 24.1.7, 27.11, 27.11.4, 43.4.0 manifold, paracompact, 34.1.2 manifold, pathological, 43.4.0 manifold, pseudo-Riemannian, 40, 41.6 manifold, Riemannian, 39, 39.2.6 manifold, topological, 26, 26.3, 26.3.1 manifold, unidirectionally differentiable, 27.10
manifold atlas, differentiable, 27.2 manifold direct product, 26.5.5 manifold embedding, 31.3.11 manifold equivalence, 27.4.10 manifold example, 43 manifold immersion, 31.3.10 manifold product, 26.5.5 manifold with affine connection, tensor calculus, 41.3 map, 6.5.4 map, differentiable, between differentiable manifolds, 27.8, 27.8.1 map, differentiable, higher-order differential, 32.2 map, exponential, 38.7, 38.7.2 map, name-to-object, predicate language, 4.12.3 map, regular, 31.3.9 map, set, function, 6.6 map, truth value, 4.1.3 map from linear space to second dual, canonical, 10.5.20 map projection for two-sphere, standard, 42.16 mapping, 6.5.4 martian robots, 3.4.1 mat, cat, sat, 3.10.10 mathematical class, 5.16.2 mathematical class notations, 5.16.5 mathematical class ontology, 2.5.4 mathematical class parameters, 5.16.5 mathematical deduction, 2.9.7 mathematical definitions, naturalism, 5.0.4 mathematical induction, 2.11.1, 7.3.4 mathematical induction, principle, 2.10.1 mathematical induction, validity, 2.11.17 mathematical knowledge, corpus, 3.2.5, 5.0.3 mathematical language, 2.5.7 mathematical logic, modelling, 3.2.5 mathematical logic, span, 3.2.5 mathematical logic, tour, 3.1.2 mathematical logic, true nature, 3.1.2 mathematical object, 5.16.2 mathematical objects, extraneous properties, 2.7.2 mathematical objects location, 2.7.1 mathematical physicist, 2.9.9 mathematical structure, imported, 2.6.2 mathematical symbols, intellectual content, 2.3.7 mathematical system definitions, 2.8 mathematical thinking machine, 4.2.8 mathematician, applied, 2.9.9 mathematicians, chronology, 46.1 mathematicians, logic, 3.13.1 mathematics, a-priori, 2.12.1, 6.0.3 mathematics, bedrock, 2.1 mathematics, deliverables, 3.2.3 mathematics, naive, 2.1.1, 3.14, 6.0.3 mathematics, naive, non-axiomatic, 3.14.2 mathematics, recreational, 2.9.9 mathematics, rigorous, 2.1.1 mathematics, self-interest, 2.10.1 mathematics, semantics, 3.2.4 mathematics communication channel, 2.5.11 
mathematics foundation, 5.0.1 mathematics heaven, 2.4.1 mathematics joke, 44.2.6 mathematics learning, 1.10 mathematics ontology, 2.3 mathematics ontology categories, 2.3.8 mathematics outputs, 3.2.3 mathematics package, 1.4.7 mathematics philosophy, 2 mathematics remarks, 2.9
mathematics software packages, symbolic, 2.12.6 matrix, characteristic polynomial, 11.5.7 matrix, coordinate transition, 27.4.7 matrix, definite real symmetric, 11.6 matrix, Fisher information, 39.9.1 matrix, Hessian, 27.5.9, 27.5.10 matrix, invertible, 11.3.15 matrix, Jacobian, 19.1.3, 27.5.10, 31.3.15 matrix, orthogonal, 11.1.25, 11.5.7, 42.17.2 matrix, real definite, 11.4 matrix, real negative definite, 11.4.7 matrix, real negative semi-definite, 11.4.7 matrix, real positive definite, 11.4.7 matrix, real positive semi-definite, 11.4.7 matrix, real semi-definite, 11.4 matrix, real symmetric, 11.5.2 matrix, rectangular, 11.1.1 matrix, semi-definite real symmetric, 11.6 matrix, sparse, 10.10.5 matrix, symmetric, 11.3.19 matrix algebra, 11 matrix algebra, real square, 11.4 matrix algebra, real symmetric, 11.5 matrix algebra, rectangular, 11.1 matrix algebra, square, 11.3 matrix determinant, 11.3.6 matrix diagonalization, 42.8.3 matrix eigenvalue, 11.1.0, 11.5.7 matrix eigenvector, 11.1.0, 11.5.7 matrix exponential, 21.7 matrix group, 11.7 matrix identity, 11.1.20 matrix inverse, left, 11.1.24 matrix inverse, right, 11.1.24 matrix linear space, rectangular, 11.1.7 matrix product, 11.1.16 matrix trace, 11.3.2 matrix transpose, 11.1.12 matter, dark, 2.10.5 maximal atlas, topological, 26.4.15 maximum, local, 27.6.3 maximum, local, on differentiable manifold, 27.5.8 maximum of partially ordered set, 7.1.10 maximum principle, 21.0.5, 21.1.1, 21.1.4 Maxwell’s equations, 25.3.6 measure, Lebesgue, 13.9.0, 20.1 measure, Radon, 20.10 measure theory, 20 measure theory, geometric, 13.9.0, 20.8 measurement, land, 46.2.2 measurement resolution, finite, 2.12.1 meat, 5.15.6 meat-centric cook, 1.4.5 mechanics, Lagrangian, connection, 37.10 mechanics, quantum, 3.4.5 mechanistic computations, 2.9.1 mechanization, logic, 4.0.2 mechanization of logic, 3.7.5 mechanization of mathematics, 2.9.1 medieval Europe, 3.4.2 medieval university education, 2.9.8 membership, tribal, 2.2.2 membership 
chain, set, 5.5.3 membership relation, 5.1.3 membership relation, concrete propositions, 4.1.9 membership relation, left-side, 5.2.6 membership relation, sets, 4.15.1 membership relation network traversal, 5.5.3
membership relation on the left, 5.2.1 membership symbol, set, 5.1.20 membership theory, 5.1.3 memory, virtual, 2.10.10 memory allocation, dynamic, 2.10.10 mental process model, logic, 3.11.3 Mesopotamia, legal system, 3.7.11 Mesopotamian mathematics, 46.1.1 meta-assertion, 3.9.4 meta-discussion, logic, 3.9.4 meta-discussion context, 3.3.4 meta-function, 6.5.22, 7.9.1, 7.9.12 meta-language, 3.10.6 meta-language, semantics, 3.10.6 meta-language, syntax, 3.10.6 meta-logic, 3.10.12 meta-logical framework system, 5.7.22 meta-logical proof, 3.13.8 meta-meta-language, 3.10.8 meta-model, 3.3.5 meta-modelling, 3.3 meta-proof, 4.9.4 meta-proposition, 3.10.7, 3.10.10 meta-proposition, disjunction, 3.13.6 meta-sentence, 3.10.7 meta-set, 5.7.23 meta-set-theory, 5.7.21 meta-theorem, 4.9, 4.9.5 meta-theorem, deduction rule, 4.9.4 meta-theorem, statement about proofs, 4.9.3 metadefinition, tangent bundle, 28.2, 28.2.1 metadefinition, tensor product, 13.4 metaphor, container, set, 5.7.17, 5.7.25 metaphor, logic, 3.5.10 metaphysical universe, 2.7.1 metaphysics, limits, 4.13.12 method of definition, axiomatic, 2.8.1, 2.8.2 method of definition, constructional, 2.8.1, 2.8.4 method of exhaustion, 20.1.1 method of proof, indirect, 3.11.1, 3.11.9 methods, logic, 4 metric, induced topology, 17.3, 17.3.3 metric, point-to-point, 17.3.2 metric, pseudo-Riemannian, 40.1, 40.1.1 metric, pseudo-Riemannian, preview, 25.7 metric, Riemannian, 39.2, 39.2.3 metric, Riemannian, differentiability, 39.2.4 metric, Riemannian, preview, 25.6 metric, two-point, 17.1.3 metric connection, 39.4.6 metric function, 17.1.1 metric function, two-point, 25.6.2 metric-invariant geometry, 46.3.0 metric layer, 1.1 metric layer, Levi-Civita connection, 36.0.4 metric space, 17, 17.1.2 metric space, bounded set, 17.2.7 metric space, continuous function, 17.4, 17.4.1 metric space, paracompact, 17.3.7 metric space, topology, 17.3.3 metric space distance function, 17.1 metric tensor, 39.2.6, 41.5.0 metric tensor calculation 
from distance function, 42.3 metric tensor field, 10.2.18 metric tensor field, Riemannian, 39.2.3 metric tensor on two-sphere, 42.2.2 metric transformation group, 46.3.0 metric transport, 39.1.5
microscope, 1.9.1, 2.12.2 microscope analogy, 2.10.1 middle, excluded, 3.1.4, 3.6.3, 3.11.2, 5.7.17 middle, excluded, Russell’s paradox, 5.7.11 Middle Ages, 3.4.2 min/max equivalent, logic operator, 4.3.15 mind, animal, 2.3.3, 3.4.5 mind, embodied, 2.5.21 mind, finite, 2.10.2 mind, human, finite bandwidth, 2.11.14 mind, mathematical, ontology, 2.3.10 mind, Roman, 46.1.2 mind states, 2.5.3 mind stretching, 2.3.5 minds, 2.9.2 minds, alien, 2.5.6 minds, communities, 2.2.3 minds, human, 2.5.7 mini-logic-machine, 3.10.5 minimalist, 4.6.4, 4.11.3 minimalist endeavour, 5.0.7 minimalist principle, 4.9.4 minimum, local, on differentiable manifold, 27.5.8 minimum-length geodesic, 39.3.3 minimum of partially ordered set, 7.1.10 Minkowski, Hermann, 46.1.6 Minkowski space-time, 25.7.1 Minoan tablets, ancient, 2.5.6 mirror image of left transformation group, 9.6.1 mirror image of right transformation group, 9.6.1 mirrors, hall of, 5.7.19 missing steps, 1.5.2 mixed tensor, 13.7, 29.3 mixed tensor algebra, 13.8.14 mixed tensor product operation, 13.8.11 mixed tensor space, 13.7.2 mixed transformation group, 9.6 mobile vector, 10.1.6 Möbius, August Ferdinand, 12.1.0, 44.2.0, 46.1.5, 46.3.0 Möbius strip, 23.6.17, 44.2.0 Möbius strip as fibre bundle, 44.2 Möbius strip fibre bundle on one-sphere, 44.3 model, acceptance/rejection, proposition, 3.7.2 model, imaginary world, 2.10.1 model, logic, 3.3.3 model, logic machine, 3.6.1 model, logic machine, recursive, 3.3.6 model, mental process, logic, 3.11.3 model, object/class, 5.7.25 model, perfect, 3.4.3 model, recursive, coherence, 3.3.6, 3.3.7 model, set theory, 5.2.3 model, socio-mathematical network, 2.5.3 model, world, animal, 5.7.25 model, world, organism, 3.3.2 modelling, 3.3 modelling, mathematical logic, 3.2.5 modelling, recursive, 3.3 modelling, unbounded, on demand, 5.7.25 modelling loop, 3.3.7 modelling mathematical thinking, 2.6.3 modelling ontology, 2.10.1 models, physics, 21.0.4 models, world, multiple, 3.11.2 modern
axiom system, 2.8.3 modern logic, universality, 3.4 module, 9.9 module, functional, 4.1.1
module, unitary left, 10.1.3 module automorphism, 9.9.10 module homomorphism, 9.9.10 module morphism notations, 9.9.12 module of endomorphisms, 9.9.27 module of homomorphisms, 9.9.5, 9.9.14, 9.9.26 module operator domain, 9.9.8 module over a set, 9.9.8 module over group, left, 9.9.17 module structures summary table, 9.9.0 module with operator domain, 9.9.8 module without operator domain, 9.9.4 modules over a ring, linear map, 9.9.25 modulo function, 8.6.16 modulus function, 8.6.3 modus ponendo ponens, 4.6.8 modus ponens, 4.4.3, 4.5.2, 4.6.1, 4.6.5 modus tollendo tollens, 4.6.8 Mohammed ibn Mūsā, Abū Ja’far, 46.2.3 molten magma, 2.1.1, 2.1.3 Mona Lisa, 3.9.8 Monge, Gaspard, 46.1.5 monkey, gelada, 2.2.4 monkey tribes, 2.2.3 monomial, tensor, 13.12.2 monomorphism, Lie algebra, 9.11.8 monomorphism, linear space, 10.3.6 monster, self-referential, 5.7.27 monstrosities, 4.2.7, 4.2.8 mood, verb, 3.10.6 mood, verb, imperative, 3.10.14, 3.12.1, 3.12.3 mood, verb, indicative, 3.7.1, 3.10.11, 3.12.1, 3.12.2 mood, verb, logical proposition, 3.12 mood, verb, subjunctive, 3.7.1, 3.10.11, 3.12.2 mood, verb, symbolic logic, 3.12.1 Moran, Bill, 48.4.1 Morgan’s law, de, 5.13.11, 5.14.8 morphism notations, linear space, 10.3.7 morphisms, groups, 9.2.22 morphisms, Lie algebra, 9.11.8 morphisms, linear space, 10.3.6 motion, laws, 20.6.18 motivations of this book, 1.4 motor/sensor, feedback, 3.3.2 Mount Olympus, 5.0.2 mouse, 7.2.7 mouse click, 27.1.3 MP, equivalent to RAA, 4.6.2 MSC 2000 subject classification, 1.9 multi-celled animal, 25.2.10 multi-index, 18.6.1 multi-index derivative, 18.6.2 multi-level tangent bundle, 28.10.18 multilinear algebra, 13.0.6 multilinear dual, 13.5.5, 13.6.5 multilinear effect of sequence of vectors, 13.0.1, 25.4.2 multilinear effect of vector sequence, 13.0.5 multilinear map, 13.1, 13.1.3 multilinear map, antisymmetric, 13.3, 13.3.3 multilinear map, canonical, 13.5.4 multilinear map, canonical, tensor space, 13.4.1 multilinear map, symmetric, 13.3,
13.3.2 multilinear map linear space, 13.2, 13.2.3 multilinear quintessence, 13.4.4 multiple choice, 3.7.3, 3.9.5 multiple contexts, logic, 3.3.4 multiple point of curve, 16.2.13 multiple point of path, 16.4.11
multiple-valued function, 6.11.1 multiple world models, 3.11.2 multiplication, scalar, 10.1.2 multiplicative axiom, set theory, 5.9.14 multiplicity quantifier, 4.16.8 Mūsā, Abū Ja’far Mohammed ibn, 46.2.3 music, 2.9.8 Mycenaean clay tablet, 2.3.5 myth, 5.0.2 naive, etymology, 2.1.2 naive comprehension, axiom, 5.7.2, 5.7.6, 5.7.8 naive cross product, 6.1.6 naive induction, 6.1.15, 7.2.6 naive logic, 3.2.1, 4.9.5 naive mathematics, 2.1.1, 2.1.2, 3.14, 6.0.3, 7.2.5 naive mathematics, non-axiomatic, 3.14.2 naive natural number, 3.14.4 naive set, 3.14.4, 4.1.3, 4.1.4, 4.1.9, 4.12.6 naive set theory, 5.1.1, 5.2.2, 5.7.2 naive set theory, finite, 3.13.3 naive theorem, 4.9.5 naive vector field derivative, 33.1 naked dummy variable, 5.8.28 name, constant, definition, 4.12.13 name, statement, 4.5.2 name, statement-form, 4.5.4 name, uncountable aggregate, 2.10.1 name map, proposition, 4.1.10, 4.1.11 name space, abstract variable, 2.11.17 name space, proposition, 4.1.1, 4.1.10, 4.1.11 name-to-object map, predicate language, 4.12.3 naming bottleneck, 2.10.1 NAND (not-and), 4.6.4, 4.7.1, 4.11.3 NAND operator, 4.3.6, 4.3.8 Napoléon Bonaparte, 3.11.9 natural language, 3.10.6, 3.12.1 natural language, logic, 3.13.1 natural language families, 1.4.12 natural number, 7.2.31, 7.3, 7.3.2, 7.3.5 natural number, naive, 3.14.4 naturalism, mathematical definitions, 5.0.4 nature, essential, 1.4.7 nature, essential, set, 5.2.6 nature, set, 5.5.3 nature, true, mathematical logic, 3.1.2 navigation, terrain, 3.5.10 NBG (Neumann-Bernays-Gödel), 5.0.9, 5.2.2 NBG proper class, 5.7.23 NBG set theory, 4.1.5, 5.1.1, 5.2.2, 5.7.5, 5.12 NBG set theory, first order logic, 4.13.13 nearest integer distance function, 8.6.19 nearest integer function, 8.6.14 nefne, Old English, 3.5.4 negated proposition, 3.10.10 negation, double, 3.6.1, 3.10.8 negation, logical, 3.10.1, 4.3.3 negation, logical, double, 3.11.6 negation, logical, semantics, 3.10 negation, truth table, 3.10.10 negation
operator, 3.10.3 negative definite, 11.4.7 negative number one’s complement representation, 7.5.9 negative number two’s complement representation, 7.5.8 negative semi-definite, 11.4.7 neighbourhood, cone-shaped, 18.2.13 neighbourhood, convex, 38.7.5 neighbourhood, per-point, 25.2.5 neighbourhood, per-set, 25.2.5
neighbourhood, topological, 25.2.2 neighbourhood of point in topological space, 14.2.11 nemne, Old English, 3.5.4 network, acyclic, concepts, 2.1.7 network, logic machine, 3.7.14 network communications, socio-mathematical, 2.5 network model, socio-mathematical, 2.5.3 network of concepts, coherent, 2.1.7 network of discussions, 3.3.4 network synchronization, socio-mathematical, 2.5.3 network topology, 16.10 network traversal, membership relation, 5.5.3 Neumann-Bernays-Gödel set theory, 5.1.1, 5.2.2 Neumann problem, 21.3.0 neurophysiology, 2.5.17 neuropsychology, 2.5.17 never-constant curve, 16.3.3, 24.4.4 New Guinea, 3.2.8 Newton, Isaac, 18.1.1, 18.1.2, 18.2.11, 20.1.1, 46.1.3, 46.1.4, 46.1.5, 46.2.10, 46.2.11 Newton’s gravity law, 21.0.4 Nile flooding, 46.2.2 non-AC mathematician, 5.15.6 non-atomic logical expression, 3.5.4 non-axiomatic naive mathematics, 3.14.2 non datur, tertium, 3.6.4 non-existence, proof, 3.10.12 non-Hausdorff locally Euclidean space, 43.3 non-Hausdorff topology, 15.2.15 non-Lie structure group, differentiable fibre bundle, 35.1, 35.1.3 non-linear operator, 45.4.0 non-measurable set, Lebesgue, 2.10.6, 2.11.9, 3.2.1, 4.13.10, 5.9.3, 5.9.10, 5.9.11, 20.1.3, 20.1.4 non-measurable set, Lebesgue, unknowable existence, 2.10.8 non-membership notation, set, 5.1.4 non-representational art, 3.1.3 non-self-intersecting curve, 31.4.8 non-separated pair of sets, 15.3.3 non-tensorial Christoffel symbol, 39.4.5 non-topological fibration, 22.1 non-topological fibration, parallelism, 22.2 non-topological fibre bundle, 22, 22.3, 22.3.1 non-trivial topology, 14.7.10 non-uniform non-topological fibration, 22.1.1 nonsense, pure, 5.7.10 NOR (not-or), 4.6.4 NOR operator, 4.3.6, 4.3.8 norm, 10.8 norm, Euclidean, 10.8.4 normal coordinates, 30.0.4, 38.7.1, 39.4.10 normal coordinates on two-sphere, 42.12 normal form, conjunctive, 4.11.3 normal form, disjunctive, 3.5.9, 4.11.3 normal geodesic, 41.5.0 normal space, 25.2.7 normal subgroup, 9.3.8 normal topological space, 15.2.21 normalizer of group, 9.3.25 not, 4.3.3 not-knowing, tolerance, 3.2.8 not proven, Scottish law, 3.10.15 notation, derivative, 18.2.11 notation for definitions, 1.6.5 notation introduction, formal, 5.8.6 notations, 49.1 notations for sets of functions, 6.12 noumena, 2.12.1
noumena, phenomena, 2.5.10 number, biggest, 5.7.18 number, binary, 7.5.8 number, black, 2.10.5, 2.11.12 number, complex, 8.7 number, composite, 14.3.9 number, compressible, 2.11.2 number, dark, 2.10, 2.11.12 number, extended rational, 8.2 number, extended real, 8.4, 8.4.1 number, floating-point, 5.16.1, 8.3.1 number, grey, 2.10.1, 2.10.5, 2.11.12, 3.4.3 number, incompressible, 2.10.4, 2.11.2 number, Lebesgue, 17.3.22 number, natural, 7.2.31, 7.3, 7.3.2, 7.3.5 number, natural, naive, 3.14.4 number, ordinal, 7.2, 25.2.9 number, prime, 7.4.2 number, rational, 8.1, 8.1.1 number, real, 8.3, 8.3.6 number, real, algebraic, 14.1.8 number, real, philosophy, 2.12 number, real, unknown, 4.2.8 number, unmentionable, 2.10.4 number heaven, 2.4.1 number mysticism, 2.4.5 number notation summary, 1.6.1 number representation, real, 8.3.3 number tuple, real, 8.5 numbers, binary, 2.10.6 numerical analysis, 2.11.6 numerology, Pythagorean, 2.4.6 nymþe, Old English, 3.5.4 object, attributes, database, 2.4.4 object, mathematical, 5.16.2 object class, 5.16.7 object/class model, 5.7.25 objectives of this book, 1.4 objects, underlying, 4.16.1 objects location, mathematical, 2.7.1 oblique projection, 43.4.1, 43.4.2 observable, anthropological, 3.4.1 obvious, 1.5.2, 3.0.1 odd permutation, 7.10.10 ODE (ordinary differential equation), 21.0.2 Odysseus, 25.5.4 OED (Oxford English Dictionary), 1.6.7 OFB (ordinary fibre bundle), 23.0.1 OFB connection, 36.3.2 ointment, 5.15.3, 23.5.0, 33.3.1, 43.4.1 ointment, fly, 7.2.5 Old English, 3.5.4 Olduvai 9, 2.2.4 Olympic games, ancient, 3.2.8 Olympus, Mount, 5.0.2 Omega, 14.2.13 omicron, 14.2.13 omit function for list, 7.12.2 on demand, unbounded modelling, 5.7.25 on-demand construction, compound propositions, 3.13.5 1–1, 6.5.23 one-parameter family of diffeomorphisms, 27.7.1 one-parameter family of geodesics, 41.4.1 one-parameter group of diffeomorphisms, 27.7.2 one-parameter group of local transformations, 27.7.4 one-parameter transformation family, 31.5 one-parameter transformation group, vector field, 31.5.2 one-sphere Möbius strip fibre bundle, 44.3
one-to-one, 6.5.23 one’s complement representation of negative numbers, 7.5.9 ontologically empty, 2.5.18 ontology, definition, 2.3.1 ontology, infinity, 2.10.1 ontology, mathematical classes, 2.5.4 ontology, mathematics, 2.3 ontology, modelling, 2.10.1 ontology, Platonic, 2.4.1 ontology, proposition-store, logic, 3.7, 3.11.6 ontology, proposition-store-machine, 3.6.2 ontology, world-model, 5.7.25 ontology, world-model, inconsistency, 5.7.12 ontology, world-model, logic, 3.11.6 ontology, world-model, proof by contradiction, 3.11.8 ontology, world-model, propositional logic, 3.6.3 ontology, world-model-machine, 3.6.2 ontology categories, mathematics, 2.3.8 ontology for logic, proposition-store, 3.6 ontology for logic, world-view, 3.6 ontology of truth and falsity, 3.6.1 open annulus, 17.1.17 open arc, 16.2.9 open ball, 17.1.7, 17.3.8 open ball, punctured, 17.1.17 open base, 14.10, 14.10.2, 15.6.1 open base for topological space, 15.6.2 open-closed interval, 8.3.10 open cover, 14.10.12, 15.7.1 open cover, finite, 14.10.12 open curve, 16.2.9 open curve, tangent operator, 31.4.7 open curve, tangent vector field, 31.4.1 open interval, 8.3.10, 14.9.2 open path, 16.4.7 open portion of boundary of set, 14.5.9 open refinement of covering, 15.7.2 open set, 14.2.12 open set symbol G, 14.2.13 open set symbol Ω, 14.2.13 open subbase, 14.10.3 open subbase of topological space, 15.6.4 opera house, Viking, 43.10.1 operant conditioning, 3.5.10 operating system, computer, 4.7.7 operation tree, logical, 3.10.13 operational definition, 2.11.8, 2.11.9 operational procedure, class, 4.1.4 operator, binary, 4.3.12 operator, differential, in Riemann space, 39.6 operator, gradient, 45 operator, Hessian, 37.5, 37.6.2 operator, Hessian, at critical point, 32.6 operator, implication, primacy, 4.6.5 operator, Laplace-Beltrami, 39.6.3 operator, Laplacian, 19.5.3, 39.6.1, 39.6.2 operator, left-associative, 4.3.12 operator, left translation, 34.3.1 operator, logic, arithmètic equivalent, 4.3.15
operator, logical, 3.10.10, 4.3, 4.11 operator, logical, associativity, 4.3.12 operator, logical, importation, 4.2.1, 4.3.1 operator, logical, principal connective, 4.3.11, 4.3.12 operator, logical, zero-operand, 4.3.18, 4.3.19 operator, NAND, 4.3.6 operator, negation, 3.10.3 operator, NOR, 4.3.6 operator, right-associative, 4.3.12
operator, second-order, 37.5.1 operator, second-order, differential, 32.7.3 operator, second-order, elliptic, 30.6, 30.6.3 operator, second-order elliptic, 37.6.3 operator, second-order weakly elliptic, 37.6.3 operator, tangent, 28.0.4, 28.5, 28.5.1 operator, tangent, higher-order, 30.1 operator, tangent, second-order, 30.1.1 operator, tangent, second-order, tagged, 30.1.9 operator, tangent, tagged, 28.6, 28.6.2 operator chain, implication, 4.6.5 operator domain of module, 9.9.8 operator field, coordinate basis, 29.6.4 operator field, second-order elliptic, 37.6 operator field, tangent, 29.6 operator field composition, 33.2.7 operator frame, tangent, 28.12.4 operator homomorphism, 9.9.10 operator notations, logic, 4.3.6 operator space, second-order tangent, 30.4.1 operator space, tagged second-order tangent, 30.4.3 operator space, tangent, 28.7.6 operator space, tangent, tagged, 28.7.9 or, 4.3.3 or, exclusive, 3.5.2, 3.5.7, 3.13.7, 4.3.5 or, exclusive, notation, 4.3.7 or, inclusive, 3.5.2, 3.5.7, 3.13.7, 4.3.5 OR-construction, 3.5.4 orbit space method for associated fibre bundles, 23.12.0 order, 7 order, dual, 7.1.7 order, lexicographic, 16.1.8 order, partial, 7.1.1 order, total, 7.1.4 order, word, 3.10.6 order homomorphism, 7.1.13 order isomorphism, 7.1.13 ordered pair, 6.1, 6.1.3, 7.7.3 ordered quadruple, 6.1.14 ordered sample, 7.11.1 ordered selection, 7.11, 7.11.1 ordered set, 7.1, 7.1.1 ordered set, totally, 7.1.4 ordered traversal, 7.1.17, 7.1.18, 16.1.8, 16.2.3, 17.5.11 ordered traversal, length, 17.5.9 ordered triple, 6.1.14 ordered tuple, 7.7.3 ordering, define-before-use, 1.4.11 ordinal number, 7.2, 25.2.9 ordinal number, finite, 7.2.12 ordinal number 10, 48.3.3 ordinal numbers, von Neumann construction, 7.2.5 ordinary differential equation, 21.1 ordinary fibre bundle, connection, curvature, 36.4 ordinary fibre bundle, horizontal lift function, 36.3 organism, 5.7.25 organism, multi-celled, 25.2.10 organism, world model, 3.3.2 oriented continuous path, 
16.4.2 ornithology, 2.0.4 orthogonal bundle, tangent, 39.4.9 orthogonal connection, 12.1.0, 42.5.3 orthogonal connection, Levi-Civita, 25.6.5 orthogonal frame, 39.4.6 orthogonal group, 24.2.8, 36.6.2, 42.8.0 orthogonal matrix, 11.1.25, 11.5.7, 42.17.2 orthogonal transformation, 11.5.7, 19.2.4, 19.5.3, 23.5.0, 46.3.0
orthogonality of transition maps, 28.8.9 orthonormal vectors on Riemannian manifold, 41.5.0 other people’s books, 50.1 otherwise, Anglo-Saxon, 3.7.17 otherwise, logic, 3.5.4 okke, Old English, 3.5.4 okþe, Old English, 3.5.4 oþke, Old English, 3.5.4 outputs, mathematics, 3.2.3 outside, 5.7.17 outsourcing definitions, 28.2.0 overview of chapters, 1.2 ox, 3.5.3, 48.4.1 oxygen, 2.1.5 page counts, chapters, 1.3.1 pair, ordered, 6.1, 6.1.3, 7.7.3 pair axiom, unordered, 5.3.3 pair category, 6.1.2 pantograph, 32.7.0, 38.8.3 paper, finite, 2.10.7 paper, flat, 26.1.5 paper, infinite, 2.10.6 papers, research, 5.0.3 Pappus of Alexandria, 46.1.2 papyrus, 2.5.5 parabolic PDE, 21.4.0 paracompact analytic manifold, 27.9.6 paracompact differentiable manifold, 27.4.18 paracompact manifold, 34.1.2 paracompact metric space, 17.3.7 paracompact topology, 15.7.14 paraconsistent logic, 4.1.5, 5.7.11 paradox, 3.2.7 paradox, Burali-Forti, 3.2.1, 5.7.13, 5.7.15, 7.2.3 paradox, Cantor, 3.2.1 paradox, flat, 5.7.23 paradox, liar’s, 5.7.13, 5.7.22 paradox, recursion-style, 2.11.3 paradox, Russell’s, 3.7.10, 3.10.4, 4.1.9, 5.1.6, 5.5.1, 5.6.4, 5.7, 5.7.7, 5.8.28, 5.12.1, 5.12.3 paradox, Zeno, 2.11.3, 7.2.7, 14.0.4, 14.1.5 parallel displacement for PFB connection, 36.8 parallel transport, 22.2.1, 25.5.1, 33.1.0, 36.0.2, 37.1.1, 37.4.1, 37.7.8 parallel transport, differentiation, 36.2 parallel transport, exterior derivative, 20.6.18 parallelism, associated, 24.3 parallelism, associated topological pathwise, 24.3.2 parallelism, deviation from flatness, 36.4.1 parallelism, differential, 15.3.2, 36.1.1 parallelism, fibre set, topological, 24.1.9 parallelism, Levi-Civita connection, 36.2.3, 37.1.2 parallelism, pathwise, 35.0.0 parallelism, pathwise, topological, 24.2.2 parallelism, structure group, 36.0.3 parallelism at a distance, Levi-Civita connection, 24.1.1 parallelism curve class, 24.1.6 parallelism for non-topological fibration, 22.2 parallelism on topological fibre bundle, 24 parallelism path 
class, 24.1, 24.1.6 paralysis, 14.7.1 parameter, proposition, 4.12.3 parametrization of curve, 16.1.6 parametrization of path, 16.4.18 parametrized family of systems, 2.8.5 parametrized proposition families, 4.12 parametrized proposition family, 3.1.2 parentheses, 4.3.16
Paris, raining, 3.6.3 parity, 7.10.9 parity function, permutation, 7.10.11 parse-tree, logic expression, 4.3.10 part, fractional, function, 8.6.13 partial Cartesian product, 6.10, 6.10.1, 42.6.0 partial differential equations, 30.0.6, 37.6.1 partial function, 6.11, 6.11.3 partial order, 7.1.1 partial second-order tangent vector field, 32.4.2 partial sequence, 6.10.2 partial tangent vector field, 31.4.10 partially defined function, 6.11, 6.11.3 partially defined function, composite, 6.11.7 partially defined function, composition, 6.11.7 partially differentiable function, 18.5.4 partially ordered set, 7.1.1 particle, elementary, 2.0.1 particle trajectory, 16.2.2 partition, 6.4 partition of set, 6.4.2, 6.4.5 parts, car, 2.4.3, 2.4.4 Pascal, Blaise, 7.11.6, 46.1.3 Pascal’s triangle, 7.11.6, 7.11.7 Pasch, Moritz, 46.1.6 passive set, 9.9.0 patch, software, 3.10.4 patchwork quilt, 26.0.1 path, 16.4, 16.4.2 path, affine, 38.0.2 path, closed, 16.4.7 path, constant, 16.4.8 path, continuous, directed, 16.4.2 path, continuous, oriented, 16.4.2 path, continuous, unoriented, 16.4.15 path, differentiable, 27.6 path, empty, 16.4.7 path, geodesic, 38.2.4 path, open, 16.4.7 path, rectifiable, 17.5, 17.5.16, 24.1.7 path, simple, 16.4.8 path, simple closed, 16.4.8 path, topological, 16 path atlas, 16.1.4, 16.1.8 path chart, 16.1.8 path class, parallelism, 24.1, 24.1.6 path concatenation, 16.4.13 path-equivalence of curves, 16.3 path-equivalent curves, 16.3.7 path image, 16.4.2 path initial point, 16.4.11 path multiple point, 16.4.11 path parametrization, 16.4.18 path representative, 16.4.2 path reversal, 16.4.9 path terminal point, 16.4.11 path terminology, 16.1 path topics summary, 27.6.4 pathological clash of definitions, 2.6.7 pathological empty product of family of sets, 15.1.5 pathological examples, 43.1.1 pathological examples, axiomatic system, 3.2.7 pathological manifold, 43.4.0 pathological set, 6.6.2, 6.9.5 pathwise parallelism, 35.0.0 pathwise parallelism, associated 
topological, 24.3.2 pathwise parallelism, reversibility rule, 24.2.5 pathwise parallelism, topological, 24.2.2
pathwise parallelism, transitivity rule, 24.2.4 pathwise parallelism on topological fibre bundle, 24.2 pattern recognition, 2.11.14 PC (propositional calculus), 4.7.2 PC theorem, 4.7.2 PDE, elliptic, 21.3.0 PDE, elliptic second-order, 27.5.10 PDE, hyperbolic, 21.4.0 PDE, parabolic, 21.4.0 PDE (partial differential equation), 21.0.2 PDE corpus, 1.4.1 PDE techniques, curved space, 1.4.1 PDO (partial differential operator), 21.0.2 Peano, Giuseppe, 2.5.13, 2.9.7, 46.1.6 Peano axioms, 2.5.13, 7.2.32, 7.3.2, 7.3.3 Peirce arrow, 4.3.6, 4.3.8 Penelope, 25.5.4 pentuple, ordered, 7.7.3 people, pre-scientific, 3.2.8 per-fibre-set chart, 35.2.4 per-point neighbourhood, 25.2.5 per-set neighbourhood, 25.2.5 perception, colour, 3.4.5 perfect model, 3.4.3 perfection, crisp, abstract logic, 3.4.3 permutation, 7.10, 7.10.3, 7.11 permutation-invariant topology, 14.7.5 perspective, dynamic, set construction, 5.7.24 PFB (principal fibre bundle), 23.0.1 PFB connection, 36.3.2 PFB connection, connection form, 36.6 PFB connection, parallel displacement, 36.8 phase space, 37.10.1 phenomena, 2.12.1 phenomena, noumena, 2.5.10 philosophy of infinity, 2.11 philosophy of integers, 2.11 philosophy of mathematics, 2, 2.5.21 philosophy of real numbers, 2.12 photolysis, 14.7.1 photon, 2.12.3 physical container, 2.6.2 physicist, experimental, 2.9.9 physicist, mathematical, 2.9.9 physicist, theoretical, 2.9.9 physicists and reality, 2.9.9 physics, 2-sphere, 42.0.1 physics, affine spaces, 10.1.1 physics, algebraic expressions, 7.4.6 physics, approximated by real world, 3.4.3 physics, bedrock, 2.12.3, 5.9.11 physics, chart-independence, 32.3.0 physics, discipline, 2.5.17 physics, fibre bundles, 23.1.3 physics, Galileo transformation, 30.0.3 physics, Lie groups, 34.0.1 physics, logical reasoning, 4.9.12 physics, mathematical foundation, 2.10.9 physics, mixed tensors, 13.7.4 physics, network topology, 16.10.1 physics, parallelism, 24.1.4 physics, power series, 8.7.3 physics, Riemannian geometry, 25.7.6 
physics, Riemannian metric, 25.6.1, 25.7.5 physics, second-order derivatives, 25.5.2 physics, second-order operators, 30.0.1 physics, vector, 28.2.8 physics, vector fields, 16.2.2, 23.3.10
physics, vectors, 28.1.0 physics models, 21.0.4 π, 2.3.9, 2.7.2, 2.10.4, 2.10.6, 2.11.9, 2.11.13, 2.11.18, 5.9.11, 20.13.4, 46.2.13 pig, poke, 5.11.5 pitch, 42.8.2 pixie, invisible, 2.2.8, 2.10.7 pixie number, 2.10.5 pixies, 2.10.3, 2.12.3 pixies at the bottom of the garden, 5.9.3 plain TeX, 1.5.3, 1.8 Planck length, 2.12.1 Plato, 2.4.1, 3.4.2, 46.1.1, 46.2.1 Plato’s theory of ideas, 2.4 Platonic Forms, 2.4.5 Platonic ideal, 2.3.3 Platonic ontology, 2.4.1 plethora, formalisms and notations, 1.4.1 plodding correctness, 1.4.7 Poincaré, Jules Henri, 5.9.3, 46.1.6, 46.2.19 Poincaré conjecture, 25.2.9 point-set layer, 1.1 point-to-point distance function, 39.3, 42.3.0 point-to-point metric, 17.3.2 point transformation, 19.1.6 point with no extent, 2.4.3 pointwise convergence topology, 14.11.21, 15.7.8 pointwise differential, 31.1 pointwise direct product of functions, 6.9.12 pointwise double cotangent space, 29.4.5 pointwise double tangent space, 29.4.1 pointwise tangent space, 28.7, 28.7.1 poisonous food, 3.10.15 Poisson, Siméon Denis, 46.1.5 Poisson bracket, 9.11.12, 20.7.2, 33.2, 33.2.10, 33.3.2, 33.5.0, 34.0.4 poke, pig, 5.11.5 polar exponential map coordinates, 42.6 polarization of light, 3.11.9 Polish notation, 4.3.16 politics, 3.7.18 polynomial, characteristic, 11.5.7 polynomial representation of tensor, 13.12.3 Poncelet, Jean-Victor, 46.1.5 ponendo ponens, modus, 4.6.8 ponens, modus, 4.4.3, 4.6.1, 4.6.5 ponere, 4.6.8 populated, finitely, proposition store, 3.3.7 population dynamics, 3.7.14 portable arrow, 10.1.6 portable vector, 10.1.6 portion of boundary of set, open/closed, 14.5.9 positive curvature, space, 37.0.2 positive definite, 11.4.7 positive feedback, audio system, 3.3.7 positive semi-definite, 11.4.7 postulates, 2.5.18 postulational method, 46.1.1 power, etymology, 7.4.6 power-of-two function, 7.9.7 power series, physics, 8.7.3 power set, universe set, 5.7.26 power set axiom, 5.3.5 power set properties, 5.14.13 pre-image of set by 
relation, 6.3.13 pre-logical era, 3.2.8 pre-metric geometry, 39.0.1 pre-scientific people, 3.2.8
precedence rules, 4.7.4 precision of modelling, mathematical logic, 3.2.5 predicate, constant, 4.12.12 predicate, logical, zero-parameter, 4.12.10 predicate algebra, 4.4.1 predicate calculus, 3.1.2, 4.12, 4.13, 4.14 predicate calculus, imaginative process, 4.14.4 predicate calculus, linguistic structure, five layers, 4.12.4 predicate calculus, management of propositions, 4.12.1 predicate function, 5.7.2 predicate language, name-to-object map, 4.12.3 predicate logic, 3.1.2 predicate logic function, 4.12.11 prehistoric cattle counting, 2.11.19 prerequisites, 1.0 prescriptive interpretation, logical assertion, 3.7.13, 3.12.1 priests, Egyptian, 3.2.8 primacy, implication operator, 4.6.5 primal space, 13.7.4 primal vector, 13.6.2 primates, 2.2.4 prime integer, 7.4.2 prime number, 2.11.8 primitive concept, 3.4.3 primitive connective, 4.5.2, 4.7.4, 4.11.1 primitive symbol, 4.3.12, 4.6.4, 4.7.1 principal connective, logical operator, 4.3.11, 4.3.12 principal fibre bundle, 35.9.4 principal fibre bundle, affine connection, 37.8, 37.8.1 principal fibre bundle, connection, differentiability, 36.5.5 principal fibre bundle, connection form, 36.6.3 principal fibre bundle, differentiable, 35.4, 35.4.1 principal fibre bundle, differentiable, vector field, 35.5 principal fibre bundle, lift function, horizontal, 36.5 principal fibre bundle, topological, 23.9 principal fibre bundle, vertical vector, 36.5.10 principal fibre bundle in terrestrial coordinates, 42.4 principal G-bundle, topological, 23.9.2 principle, anthropomorphic, 24.1.4 principle, Mach’s, 24.1.4 principle, principle, 4.9.4 principle of mathematical induction, 2.10.1, 7.3.4 prior knowledge, 3.0.1 prisoner of dogma, 1.4.6 probability, undefined concept, 3.9.1, 5.1.2 probability notation, 5.8.26 probability theory, 7.11.1 problem, Dirichlet, 21.3.0 problem, Hilbert’s fifth, 34.0.5, 34.2 problem, logic, 4.1.7 problem, Neumann, 21.3.0 problem, somebody else’s, 3.3.8 procedure, class test, 4.1.4 procedure, operational, 
class, 4.1.4 procedures, 2.10.10 procedures, logic, 4 procedures, logic, formalization, 4.5.1 product, alternating tensor, 13.9.6 product, anticommutative, 9.11.0 product, Cartesian, partial, 6.10.1 product, Cartesian, sequence, 7.7 product, inner, 10.8 product, tensor, 2.5.12 product, wedge, 13.9.2 product atlas, 26.5.3 product bundle, 23.7.13 product function, list, 9.12.2 product manifold, 26.5.5
product of differentiable manifolds, 27.4.15 product of matrices, 11.1.16 product operation, mixed tensor, 13.8.11 product operation, tensor, 13.8.2 product rules, trigonometry, 20.13.11 product topology, 15.1, 15.1.1 product topology, direct, 15.1.4 productive ZF axiom, 5.1.17, 5.1.19 programming, computer, 2.6.9, 5.16.7 projection, oblique, 43.4.1, 43.4.2 projection function, list, 9.12.1 projection map, 28.9.2, 42.17.2 projection map, tangent bundle, 28.2.1 projection map for Cartesian product, 6.9.8 projection of sphere onto plane, 42.17 projective geometry, 46.3.0 projective transformation group, 46.3.0 pronoun, 4.3.16 proof, indirect method, 3.11.1, 3.11.9 proof, inline, 4.9.10 proof, meta-logical, 3.13.8 proof, non-existence, 3.10.12 proof by computer, 3.6.2 proof by contradiction, 2.11.5, 3.1.4, 3.11, 4.1.7 proof by contradiction, equivalents, 3.11.1 proof by contradiction, validity, 3.11.10 proof by contradiction, world-model ontology, 3.11.8 proof discovery, 4.8.4, 4.8.5 proof symbol, 1.6.6 proofs, statement about, meta-theorem, 4.9.3 proper class, NBG, 5.7.23 properties management, extraneous, 2.7.4 properties of mathematical objects, extraneous, 2.7.2 proponent, 3.7.18 proposition, always-false, 4.3.20 proposition, always-true, 4.3.20 proposition, compound, 3.13.2 proposition, compound, decomposition, 3.7.14 proposition, concrete, 3.3.3, 3.10.6 proposition, empirical, 3.3.10, 4.13.10 proposition, etymology, 3.7.1 proposition, foreground/background, 3.6.1, 3.7.3 proposition, information content, 3.8.1 proposition, logical, verb mood, 3.12 proposition, negated, 3.10.10 proposition, tautological, 4.3.22 proposition, tautologous, 4.3.22 proposition, undecidable, 3.7.19, 3.8 proposition, undecided, 3.10.1 proposition, vacuum, 3.6.1 proposition, value, 3.3.10 proposition acceptance/rejection model, 3.7.2 proposition analogy, two-sided coin, 3.7.4 proposition blocking, 3.10.1, 3.10.14 proposition breeding rules, 3.3.7 proposition domain, concrete, 3.1.2, 4.1, 
4.1.3, 4.5.4 proposition domain, concrete, closure, 4.9.1 proposition domain, concrete, dynamic, 4.1.6 proposition domain, concrete, examples, 4.1.8 proposition domain, concrete, static, 4.1.6 proposition families, parametrized, 4.12 proposition family, parametrized, 3.1.2 proposition list, conjunction, 3.13.6 proposition list insertion, 3.7.14, 3.10.2 proposition name map, 4.1.10, 4.1.11 proposition name space, 4.1.1, 4.1.10, 4.1.11 proposition parameter, 4.12.3 proposition pseudo-name, 3.10.7
proposition space, compound, 4.2.2 proposition space, infinite, 4.12.2 proposition store, 3.7.12 proposition store, finitely populated, 3.3.7 proposition-store-machine ontology, 3.6.2 proposition-store ontology, logic, 3.7, 3.11.6 proposition-store ontology for logic, 3.6 proposition tag, 3.7.15 proposition tagging, 3.9.2 proposition tagging, truth and falsity, 3.7.5 proposition template, 4.12.3, 5.1.25 proposition testing unit, 3.7.6 propositional algebra, 4.4.1 propositional bearing function, 2.2.4 propositional calculus, 3.1.2, 4.8.3 propositional calculus, domain of interpretation, 4.12.8 propositional calculus, implication-based, 4.7 propositional calculus, linguistic structure, five layers, 4.5.4 propositional calculus, semantics-free, 4.5.1 propositional calculus axiomatic system, 4.7.4 propositional calculus formalization, 4.5 propositional calculus theorems, 4.8 propositional connective, 4.3.4 propositional logic, world-model ontology, 3.6.3 propositions, foisting, 3.7.12 prototypes, demonstration, 2.8.9 provability, theorem, truth table, 4.4.4 proven, not, Scottish law, 3.10.15 provider, service, logic, 3.3.8 pseudo-instruction, 3.10.2 pseudo-integer, 3.8.2 pseudo-metric, 17.1.15 pseudo-name, proposition, 3.10.7 pseudo-notation, 1.4.7, 12.4.1, 13.8.18, 19.2.15 pseudo-number, 4.2.8 pseudo-real-number, 3.8.2 pseudo-Riemannian manifold, 40, 41.6 pseudo-Riemannian metric, 40.1, 40.1.1 pseudo-Riemannian metric, preview, 25.7 pseudo-sphere, 43.9 pseudo-theorem, 4.9.5 pseudo-truth-value, 4.2.5 pseudogroup, 37.1.4 pseudogroup for curves, 16.1.8 pseudogroup of diffeomorphisms, 18.7.7, 19.4, 23.1.9, 23.5.0, 23.6.12, 27.2.7 pseudogroup of diffeomorphisms, complete, 19.4.10 pseudogroup of homeomorphisms, 14.1.3, 19.4.2, 19.4.3, 19.4.5, 19.4.6, 19.4.9, 19.6.4, 19.6.7 pseudogroup of homeomorphisms, complete, 19.4.7 pseudogroup of transformations, 27.4.3 psychology, conditioning, 7.2.1 Ptolemaic astronomy, 3.4.2 Ptolemy of Alexandria [Claudius Ptolemaeus], 46.1.2 
pull-back, 33.1.0 pull-back, transformation groups, 9.3.16 pullback operator, 10.5.24 punch cards, 2.5.14 punctured closed ball, 17.1.17 punctured open ball, 17.1.17 pure nonsense, 5.7.10 pure set, 5.5.4 pure set theory, 5.7.25 push-forth, 32.0.0 push-forth, transformation groups, 9.3.16 putative listing, 2.11.21 puzzle, cryptographic, 5.15.3 Pythagoras of Samos, 46.1.1, 48.4.1
Pythagoras theorem, 1.4.4, 39.1.5, 48.4.1 Pythagorean numerology, 2.4.6 Pythagorean triples, 39.1.5 Pythagoreans, 7.4.6, 46.1.1 QC (predicate calculus), 4.14.1 QED (quod erat demonstrandum), 1.6.6 QED symbol, 1.6.6 quadriplicity, 4.16.10 quadriplique, 4.16.10 quadrivium, 2.9.8, 46.2.1 quadruple, ordered, 6.1.14, 7.7.3 quantifier, existential, 4.13.2 quantifier, logical, infinity, 4.13.10, 4.13.12 quantifier, multiplicity, 4.16.8 quantifier, universal, 4.13.2 quantifier duality, logic, 4.13.10 quantifier notations, logic, 4.13.7 quantifiers, logical, 4.13 quantum mechanics, 2.4.2, 3.4.5 quantum mechanics, integers, 3.4.3 quantum mechanics of cattle, 2.11.19 quaternion, 8.7.3, 28.0.5 quilt, patchwork, 26.0.1 Quine dagger, 4.3.6, 4.3.8 quintessence, multilinear, 13.4.4 quod erat demonstrandum, 1.6.6 quodlibet, Ex contradictione sequitur, 3.11.1 quodlibet, Ex falso sequitur, 3.11.1 quotient group, 9.3.10 quotient linear space, 10.7.1 quotient of functions, 6.7.7 quotient of linear spaces, 10.7 quotient set, 6.4.4 quotient topology, 15.1, 15.1.6, 15.1.8 RAA, equivalent to MP, 4.6.2 RAA (reductio ad absurdum), 3.7.10, 3.7.18, 3.11.2 rabbit, 23.1.7 rabbit category, 2.11.14 radiation, electromagnetic, 3.4.5 radio, 1.10.1 radius of ball, 17.1.7 Radon measure, 9.7.0, 20.10 raining, Paris, 3.6.3 random sample, 2.10.9 range, meanings, 6.3.17 range of function, 6.5.9, 6.5.10 range of relation, 6.3.6, 6.3.7 range restriction, relation, 6.3.33 range/domain specification, function, 6.5.5 rat, 4.9.12 rational number, 8.1, 8.1.1 rational number, Cauchy sequence, 8.3.5 rational number, extended, 8.2 rational number notation summary, 1.6.1 real-analytic function, 8.7.3 real definite matrix, 11.4 real Lie algebra, 9.11.11 real negative definite matrix, 11.4.7 real negative semi-definite matrix, 11.4.7 real number, 8.3, 8.3.6 real number, algebraic, 14.1.8 real number, extended, 8.4, 8.4.1 real number, philosophy, 2.12 real number, unknown, 4.2.8 real number interval topology, 15.8 real 
number notation summary, 1.6.1 real number representation, 8.3.3
real number topology, 14.9.1 real number topology, standard, 14.9 real number tuple, 8.5, 8.5.1 real numbers, virtual, 2.10.10 real numbers, well-ordering, 5.9.3 real numbers usual metric, 17.1.4 real positive definite matrix, 11.4.7 real positive semi-definite matrix, 11.4.7 real semi-definite matrix, 11.4 real square matrix algebra, 11.4 real stuff of mathematics, 2.5.19 real symmetric matrix, 11.5.2 real symmetric matrix algebra, 11.5 real-valued function, basic, 8.6 real-valued function, differential, 31.2, 31.2.1, 31.2.11 real-valued function, higher-order differential, 32.1 real-valued function for higher-order operator, differential, 32.5 real-valued function for second-order operator, differential, 32.5.1, 32.5.3 real world, approximation to physics, 3.4.3 reality and physicists, 2.9.9 realm, pure mathematical, 3.2.5 reboot, logic machine, 3.10.5 reboot, system, 3.10.4 reception of compound proposition, 3.7.14 recipient of proposition, 3.7.13, 3.12.1 reckless comprehension, axiom, 5.7.24 recognition, pattern, 2.11.14 Recorde, Robert, 46.1.3, 46.2.6 recreational mathematics, 2.9.9 rectangular grid, 16.5.5 rectangular matrix, 11.1.1 rectangular matrix algebra, 11.1 rectangular matrix linear space, 11.1.7 rectangular Stokes theorem in two dimensions, 20.3 rectifiable compact-domain curve in Lipschitz manifold, 27.11.5 rectifiable curve, 17.0.4, 17.5, 17.5.5, 27.11 rectifiable path, 17.5, 17.5.16, 24.1.7 rectifiable path, length, 17.5.17 rectifiable set, 17.5, 17.5.3 recursion-style paradox, 2.11.3 recursive logic machine model, 3.3.6 recursive model, coherence, 3.3.6, 3.3.7 recursive modelling, 3.3 reduced bundle, 23.7.14 reductio ad absurdum, 3.7.10, 3.7.18, 3.10.5, 3.10.12, 3.11.2, 3.11.9, 4.5.2, 4.7.6 reductio ad absurdum, danger, 3.11.4 reductionism, 2.0.1 reductionist, 4.11.3 redundancy, specification tuple, 14.2.2 redundancy theory of truth, 3.10.8 redundant axiom, ZF set theory, 5.1.17 refinement of covering, 15.7.2 reflexive relation, 6.3.29 
reflexivity, 6.4.1 reflexivity of equality axiom, 4.15.1 reformulation of logic, axiomatic, 3.13.4, 7.13 regular map, 31.3.9 regularity, weak, 1.4.9 regularity axiom, 5.7.27 regularity axiom, ZF, 5.5, 5.7.19 relation, 6, 6.3, 6.3.2 relation, anti-reflexive, 5.7.10 relation, antisymmetric, 5.7.10 relation, codomain, 6.3.6
relation, composition, 6.3.23 relation, domain, 6.3.6, 6.3.7 relation, equality, concrete, 2.5.11 relation, equality, concrete, import, 4.15.2 relation, equivalence, 6.4.1 relation, image, 6.3.6, 6.3.7 relation, injective, 6.3.31 relation, inverse, 6.3.27, 6.5.25 relation, membership, 5.1.3 relation, membership, left-side, 5.2.6 relation, predicate versus set, 6.3.4 relation, range, 6.3.6, 6.3.7 relation, reflexive, 6.3.29 relation, set-theoretic, 6.3.18 relation, source set, 6.3.14, 6.3.15, 6.3.16 relation, symmetric, 6.3.29 relation, target set, 6.3.14, 6.3.15, 6.3.16 relation, transitive, 6.3.29 relation network traversal, membership, 5.5.3 relation on the left, membership, 5.2.1 relation-predicate, 6.3.3 relation-predicate versus relation-set, 6.3.4 relation-set, 6.3.3 relation-set versus relation-predicate, 6.3.4 relation tuple, 6.3.20 relative set complement, 5.13.8 relative topology, 14.10.13 relativity, Galilean, 30.0.3 relativity, general, 25.7.4, 39.1.3, 40.2 relativity, Lorentzian, 30.0.3 relativity, special, 25.7.1, 39.1.3 relativity, truth-value status, 3.3.9 Renaissance, 3.4.2 renaissance, European, 2.3.5, 46.1.2 reparametrization of curve, 17.5.6 replacement axiom, 5.11.1 replacement axiom, ZF, 5.4 representation, canonical, 2.8.6 representation, polynomial, of tensor, 13.12.3 representation, real number, 8.3.3 representation of a Lie algebra, 9.11.17 representation of associative algebra, linear, 9.10.7 representation of identification space, 6.10.6 representation of integer, 28.1.4 representation of Lie algebra, adjoint, 9.11.19 representation of tangent vector, 28.1, 28.1.1 representation space of Lie algebra representation, 9.11.17 representational art, 3.1.3 representations of algebras, 9.10.6 representative curve, 16.4.2 research papers, 5.0.3 resolution, measurement, finite, 2.12.1 resources, Internet, 50.0.1 restriction logic axiom, 4.7.5 restriction of a function, 6.5.27 restriction of differentiable manifold, 27.4.12 restriction of domain 
of relation, 6.3.33 restriction of list, 7.12.2 reversal of path, 16.4.9 reverse Polish, 4.3.16 reversibility rule for pathwise parallelism, 24.2.5 rhetoric, 2.9.8 Ricci-Curbastro, Gregorio, 25.7.6, 36.1.5, 39.1.1, 41.1.1, 46.1.6 Ricci curvature, 39.5.6 Ricci tensor, 39.5.5 Riemann, Georg Friedrich Bernhard, 25.7.6, 26.1.2, 36.1.5, 39.1.1, 39.1.2, 40.2.1, 46.1.5 Riemann curvature tensor, 36.4.1
Riemannian connection coefficients, 39.4.8 Riemannian connection in terrestrial coordinates, 42.5 Riemannian manifold, 39, 39.2.6, 41.5.0 Riemannian manifold, alternative definition, 39.8.1 Riemannian manifold, embedded, 39.8 Riemannian manifold inner product, 39.7 Riemannian manifold orthonormal vectors, 41.5.0 Riemannian manifold sectional curvature, 41.5.0 Riemannian manifold tensor calculus, 41.5 Riemannian manifold vector length, 39.2.7 Riemannian metric, 39.2, 39.2.3 Riemannian metric, preview, 25.6 Riemannian metric differentiability, 39.2.4 Riemannian metric integral, 39.3.2 Riemannian space, 39.2.6 Riemannian space gradient, 41.5.0 Riemannian space Laplacian operator, 41.5.0 right action, 23.9.2 right action on Lie transformation group, 34.8.3 right-associative operator, 4.3.12 right conjugate of a subset of a group, 9.3.13 right conjugation map, 9.3.22 right conjunct, 4.3.11 right coset of subgroup, 9.3.3 right-differentiable function, 18.3.3 right disjunct, 4.3.11 right inside skew product of transformation groups, 9.6.5, 9.6.9 right invariant vector field, 34.4.7 right invariant vector field on Lie group, 34.4 right inverse, group, 9.2.13 right inverse matrix, 11.1.24 right-open set, 18.3.1 right transformation group, 9.5, 23.9.2 right transformation group, effective topological, 16.8.14 right transformation group, topological, 16.8.12, 16.8.17 right transformation group mirror image, 9.6.1 right transformation group of topological space, 16.8.11 right transformation group of topological space, effective, 16.8.13 right transformation semigroup, 9.5 right translation operator for tangent vectors, Lie group, 34.4.3 right translation operator on Lie group, 34.4.2 rigorous mathematics, 2.1.1 ring, 9.8, 9.8.1 ring, commutative, 9.8.5 ring, commutative unitary, 9.8.6 ring, list operation, 9.12.2 ring, unitary, 9.8.2 ring, zero, 9.8.4 ring ideal, 9.8.7 ring of endomorphisms of a module, 9.9.7 ring of endomorphisms of module, 9.9.15 ring with unity, 9.8.2 roast 
kangaroo, 2.2.7 robot, half, 1.4.7 robots, martian, 3.4.1 rogue project, 5.0.6 roll, 42.8.2 Roman, ancient, 2.5.8 Roman Empire, 46.1.2 Roman mind, 46.1.2 rotation of two-sphere, 42.8.0 rote learning, 1.10.2 roulette wheel, 5.9.4 roulette wheel, ex machina, 5.9.9 round, Earth, 3.10.6
round function, 8.6.14 route, 16.1.1 row vector map, 11.1.11 rsfs font, 1.8 rule, and-introduction, 4.6.2 rule, associativity, 4.3.12 rule, deduction, 4.9.1 rule, deduction, meta-theorem, 4.9.4 rule, self-consistency, 3.10.3 rule, substitution, logic, 4.9.1 rule, theorem application, 4.9.2 Ruler, Universe, 2.11.22 rules, deduction, 4.4.2, 4.6 rules, derivation, logic, 4.14.4 Russell, Bertrand Arthur William, 2.1.9, 2.4.6, 2.9.7, 3.2.1, 39.1.5, 46.1.6 Russell, Francis Stanley (Frank), 2.1.9 Russell’s paradox, 3.2.1, 3.7.10, 3.10.4, 4.1.9, 5.1.6, 5.5.1, 5.6.4, 5.7, 5.7.7, 5.8.28, 5.12.1, 5.12.3 Russell’s paradox, excluded middle, 5.7.11 Russia, Soviet Union, 2.4.4 salt, 14.7.1 same-group left inside skew product of transformation groups, 9.6.8 same-group right inside skew product of transformation groups, 9.6.9 sample, ordered, 7.11.1 sample, random, 2.10.9 sample, unordered, 7.11.1 sand, 1.5.1 sat, mat, cat, 3.10.10 satellite image analogy, 2.10.1 sawtooth functions, 8.6.19 scalar curvature, 39.5.7 scalar multiplication, 10.1.2 scepticism, 2.2.8, 3.9.4, 3.10.15 schema, axiom, 4.5.2 Schild, Alfred, 46.1.6 Schild’s ladder, 37.6.10, 37.7.8 Schrödinger’s cat, 3.2.8, 5.7.12 Schwartz, Laurent, 46.1.6 Schwartz distribution, 28.1.1, 28.5.8 Schwarzschild, Karl, 46.1.6 Schwarzschild singularity, 40.3 science, cognitive, 5.7.17 science, logic, 3.2.5 scientific truth, 3.9.3 scope, wff name, 4.9.3 Scottish law, not proven, 3.10.15 search, backwards-deductive, 4.8.5 second countable topological space, 15.6.3 second derivatives of distance function, Hessian, 25.7.1 second dual of linear space, 10.5.18 second fundamental form, 39.1.4 second-level tangent bundle, 28.10.3 second-level tangent bundle on Euclidean space, 19.3.3 second-level tangent bundle total space, 28.10.14 second-level tangent vector, 19.3 second-level tangent vector, drop function, 30.5, 30.5.2 second-order differential operator on Euclidean space, 19.5 second-order operator, 37.5.1 second-order operator, 
differential, 32.7.3 second-order operator, elliptic, 30.6, 37.6.3 second-order operator, real-valued function, differential, 32.5.1, 32.5.3 second-order operator, weakly elliptic, 37.6.3 second-order operator field, elliptic, 37.6 second-order PDE, elliptic, 27.5.10
second-order tangent bundle, 30.3.10 second-order tangent bundle, topological, 30.3.11 second-order tangent component tuple, 30.3.1 second-order tangent operator, 30.1.1 second-order tangent operator, elliptic, 30.6.3 second-order tangent operator, tagged, 30.1.9 second-order tangent operator, tensorization coefficients, 30.2, 30.2.3 second-order tangent operator space, 30.4.1 second-order tangent operator space, tagged, 30.4.3 second-order tangent space, 30.3.7 second-order tangent vector, 30.3.3 second-order tangent vector, differential, 32.7.1 second-order tangent vector field, 32.3.1 second-order tangent vector field, partial, 32.4.2 second-order vector field, 30.7.1 second-order vector field, differentiable, 30.7.3 second-order vector field, elliptic, 30.7.6 secondary key, 2.6.1 section, terminology, 23.3.11 sectional curvature, 39.5.4 sectional curvature on Riemannian manifold, 41.5.0 secure formalism, 1.5.1 selection, ordered, 7.11, 7.11.1 selection, unordered, 7.11.1 self-consistency, axiom system, 3.11.4 self-consistency, logic, 4.1.7 self-consistency rule, 3.10.3 self-containing set, causality violation, 5.7.24 self-interest, mathematics, 2.10.1 self-referential equation, 5.7.14 self-referential monster, 5.7.27 semantic space, first order logic, 4.13.13 semantics, 2.2.3, 2.3.1 semantics, diagrams, 3.6.2 semantics, logic, 3 semantics, mathematics, 3.2.4 semantics, meta-language, 3.10.6 semantics, truth and falsity, 3.9 semantics-free logic, 4.5.1 semantics-free propositional calculus, 4.5.1 semantics of logical negation, 3.10 semantics/syntax, logic, 4.3.10 semi-closed interval, 8.3.10 semi-definite, negative, 11.4.7 semi-definite, positive, 11.4.7 semi-definite real symmetric matrix, 11.6 semi-open interval, 8.3.10 semi-pixie integer, 2.11.10 semicolon set notation, 5.8.26 semigroup, 9.1, 9.1.1 semigroup, list operation, 9.12.1 semigroup, right transformation, 9.5 sender of proposition, 3.7.13, 3.12.1 sensible experience, 2.3.6 sensor/motor, feedback, 
3.3.2 separability of topological space, 15.6 separable topological space, 15.6.6 separated pair of sets, 15.3.3 separation, Hausdorff, 15.2.12 separation axiom, 5.11.2 separation axiom, ZF, 5.4.3 separation class, completely regular, 15.2.18 separation class, Hausdorff, 15.2.13 separation class, normal, 15.2.21 separation class, T4, 15.2.20 separation class T0, topology, 15.2.3 separation class T1, topology, 14.2.9, 14.6.2, 15.2.4
separation class T2 , 15.2.13 separation classes of topological spaces, 15.2 separation of sets, 15.3 sequence, convergent, 14.11.27 sequence, digit, 7.2.5 sequence, divergent, 14.11.27 sequence, etymology, 7.1.16 sequence, infinite, termination, 2.11.22 sequence, partial, 6.10.2 sequence of curves, concatenation, 16.2.17 sequence of functions, 7.1.14 sequence of sets, 7.1.14 sequent, propositional calculus, 4.14.4 sequential compactness, 17.3.29 sequentially compact set, 15.7.11 sequitur quodlibet, Ex contradictione, 3.11.1 sequitur quodlibet, Ex falso, 3.11.1 series, Taylor, 8.7.3, 20.13.1, 21.7 service provider, logic, 3.3.8 set, container metaphor, 5.7.17, 5.7.25 set, dark, 2.10 set, determinable content, 5.5.3 set, essential nature, 5.2.6 set, grey, 2.10.1 set, incompressible, 2.10.4 set, naive, 3.14.4, 4.1.3, 4.1.4, 4.1.9, 4.12.6 set, nature, 5.5.3 set, pure, 5.5.4 set, rectifiable, 17.5 set, self-containing, causality violation, 5.7.24 set, undefined concept, 5.1.2 set, universal, 4.1.9 set, unmentionable, 2.10.4 set algebra, 5.13, 5.14 set attributes, 5.2.6 set boundary, topological, 14.5.4 set category, 6.1.2 set class tag, 5.2.6 set complement, 5.13.8 set construction, dynamic perspective, 5.7.24 set diameter, 17.2, 17.2.4 set distance, 17.2 set domain, concrete, 5.7.23 set existence axiom, ZF, 5.1.17, 5.1.18 set exterior, topological, 14.5.2 set family, 6.8.1 set family, Cartesian product, 6.9, 6.9.1 set graft, 6.10.6, 42.6.0 set identity, 5.2.6 set interior, topological, 14.4.1 set intersection, 5.13.2 set intersection properties, binary, 5.13 set intersection properties, general, 5.14 set language, high-level, 5.6.2 set map, function, 6.6 set map, inverse, function, 6.6 set map corresponding to a function, 6.6.1 set map corresponding to a function, inverse, 6.6.1 set membership chain, 5.5.3 set membership relation, 4.15.1 set membership symbol, 5.1.20 set non-membership notation, 5.1.4 set notation definition, uniqueness, 5.8.6 set partition, 6.4.2 set 
product, Cartesian, 6.2, 6.2.1 set quotient, 6.4.4 set sequence, 7.1.14 set-theoretic formula, 6.3.3
set-theoretic formula, always-true, 6.5.16 set-theoretic function, 6.5.1 set-theoretic relation, 6.3.18 set theory, 5 set theory, Bernays-Gödel, 5.1.6, 5.5.1, 5.12 set theory, BG, 5.12 set theory, buggy, 3.10.4 set theory, naive, 5.1.1, 5.2.2, 5.7.2 set theory, naive, finite, 3.13.3 set theory, NBG, 4.1.5, 5.1.1, 5.2.2, 5.7.5, 5.12 set theory, Neumann-Bernays-Gödel, 5.1.1, 5.2.2 set theory, preview, 25.1 set theory, pure, 5.7.25 set theory, symbolic logic, 5.0.3 set theory, Zermelo, 5.11, 5.11.1 set theory, Zermelo-Fraenkel, 4.1.8, 5.1, 5.7.2 set theory, Zermelo-Fraenkel, with axiom of choice, 5.9.6 set theory, Zermelo-Fraenkel, with axiom of countable choice, 5.10.3 set theory, Zermelo-Skolem-Fraenkel, 5.1.1 set theory, ZF, propositions, 4.1.9 set theory, ZF, redundant axiom, 5.1.17 set theory 8-line summary, ZF, 5.1.27, 5.1.28 set theory and logic, cyclic, 2.1.6 set theory axiom, CC, 5.10.1 set theory axioms, Zermelo-Fraenkel, 5.1.26 set theory axioms, ZF, 5.3 set theory construction stage, ZF, 5.5.4 set theory customer, 3.3.8 set theory model, 5.2.3 set theory service provider, 3.3.8 set translate, 14.7.4 set union, 5.13.2 set union closure, 5.15 set union properties, binary, 5.13 set union properties, general, 5.14 set union topology, 15.10 set universe, 5.7.14 setting bones, 46.2.4 several variables, differentiation, 18.5 several variables, higher-order derivatives, 18.6 Shakespeare, William, 1.0 sharpness of bounds, 21.1.6 sharpness of theorems, 43.0.1 sheep, 3.5.3 Sheffer stroke, 4.3.6, 4.3.8, 4.6.4 should-proposition, 3.3.10 sign function, 8.6.2 sign function, permutation, 7.10.11 signed integer, 7.5, 7.5.2 signum function, 8.6.3 silver, 3.5.3 simple closed curve, 16.2.11 simple closed path, 16.4.8 simple curve, 16.2.11 simple m-vector, 13.9.10 simple path, 16.4.8 simple tensor, 13.5.6 simple topology, countably infinite set, 14.7 simple topology, finite set, 14.3 simulation, computer, 3.13.4, 3.13.8, 3.14.2 simultaneous logical equations, 4.4.2 sine
function, 20.13.9 singleton, 5.3.3, 5.8.9 singleton axiom, 5.3.4 singular homology theory, 16.6.0 singular r-chain, 25.3.6 singularity, Schwarzschild, 40.3
situs, analysis, 14.1.1, 46.2.17 size, social group, 2.2.4 skew product of transformation groups, left inside, 9.6.8 skew product of transformation groups, right inside, 9.6.9 slave, 3.5.3 smallest topology, 14.3.1 smiling, 3.9.8 smooth function space, 45.5 snake, 2.1.1 snakes and ladders, 2.1.3 Sobolev space, 27.10.1 social behaviour control, 3.7.12 social group size, 2.2.4 socio-mathematical network communications, 2.5 socio-mathematical network model, 2.5.3 socio-mathematical network synchronization, 2.5.3 Socrates, 3.4.2 software, computer, 3.13.4, 3.13.8 software library, 4.7.7 software packages, symbolic mathematics, 2.12.6 software patch, 3.10.4 soil, 1.5.1 solidarity, 2.2.4 solve, etymology, 14.7.1 somebody else’s problem, 3.3.8 sometimes-constant curve, 16.3.3 soup, antediluvian, 5.1.21 source set of relation, 6.3.14, 6.3.15, 6.3.16 Soviet Union, Russia, 2.4.4 space, identification, 6.10.5 space, linear, 10.1, 10.1.2 space, Riemannian, 39.2.6 space, vector, 10.1.1 space-filling curve, 16.1.5, 26.1.4, 43.1.2 space grooves, 30.0.4 space of positive curvature, 37.0.2 space-time, granular, 2.12.1 space-time, Minkowski, 25.7.1 span of mathematical logic, 3.2.5 Spanish colonies, 2.2.7 spanning set of linear subspace, 10.2.4 sparse matrix, 10.10.5 spartan axiomatic system, 4.6.4 special functions, 20.1.2 special relativity, 25.7.1, 39.1.3 specification, axiomatic, 2.7.2 specification axiom, 4.12.6, 5.11.1 specification axiom, ZF, 5.4.1, 5.4.2, 5.7.19, 48 specification tuple, 5.16 specification tuple redundancy, 14.2.2 spectral decomposition, 11.5.7 speech, human, 2.2.4 speed of light, 39.1.5 spelling, British, 1.6.7 sphere, 44.2.5 sphere, tangent, 42.17.2 sphere of general dimension, 43.6 sphere projection onto plane, 42.17 spherical coordinates, 43.6.0 spherical coordinates, astronomical, 42.1.6 spherical coordinates, terrestrial, 42.1 Spivak, Michael David, 1.4.2 square function, 8.6.20 square matrix algebra, 11.3 stack, context, 3.10.5 stage, ZF set theory 
construction, 5.5.4 stages of abstraction, three, 3.9.6 standard atlas for Euclidean space, 27.3.1
standard basis of Euclidean linear space, 10.2.21 standard fibre, 23.3.3 standard identification map for Cartesian product, 7.7.6 standard immersion, free linear space, 10.10.1 standard immersion in tensor product using free linear space, 13.11.1 standard injection for direct sum of linear space sequence, 10.6.4 standard map projection for two-sphere, 42.16 standard topology for the real numbers, 14.9 standardization bodies, communications, 2.8.9 standardization processes, 2.5.13 starlike subset, 38.4.2 stars, fixed, 24.1.4 state, successor, 2.4.4 statement about proofs, meta-theorem, 4.9.3 statement form, 4.5.2, 4.5.4 statement-form name, 4.5.4 statement-form-name form, 4.5.4 statement name, 4.5.2, 4.7.4 statements, conditional, Gilgamesh epic, 3.5.2, 46.4.1 states, world, 3.6.3 states of mind, 2.5.3 static concrete proposition domain, 4.1.6 statistical variations, cloud, 2.3.9 statistics, geometry, 39.9.1 steering language, 2.5.7 Steiner, Jakob, 46.1.5 step function, 8.6.5 steps, missing, 1.5.2 stetiger Zusammenhang, 36.1.2 Stiefel, Eduard Ludwig, 23.1.1, 46.1.6 Stokes, George Gabriel, 46.1.6 Stokes theorem, 14.0.3, 20.3.6, 20.6.1, 20.9, 20.9.2, 24.1.2, 24.4.0, 25.3.6 Stokes theorem, rectangular, in three dimensions, 20.4, 20.4.1 Stokes theorem, rectangular, in two dimensions, 20.3, 20.3.2 store, proposition, 3.7.12 store, proposition, finitely populated, 3.3.7 straw, camel, 2.11.15, 4.13.12 stretch of curve, constant, 16.3.2 strictly elliptic second-order tangent operator, 30.6.3 stroke, Sheffer, 4.3.6, 4.3.8, 4.6.4 stronger topology, 14.2.21 strongly connected function, 15.5.4 structural layers of differential geometry, 1.1 structure, differentiable, 27.1.2 structure, logic, 3.3.3 structure, mathematical, imported, 2.6.2 structure group, 9.7.0, 44.1.0 structure group, Lie, 33.2.0 structure group, Lie, differentiable fibre bundle, 35.2, 35.2.2 structure group, non-Lie, differentiable fibre bundle, 35.1, 35.1.3 structure group, parallelism, 36.0.3 structure 
groups discussion, 23.5 structure-preserving fibre set map, 23.8, 23.8.2 stuff of mathematics, 2.0.3 style, symbolic logic, 4.0.2 style, ZF axioms, 5.6.2 style of this book, 1.5 styles, logic, 4.0.1 sub-expression, logical, parenthesized, 4.3.12 subalgebra, Lie, 9.11.5 subbase, open, 14.10.3, 15.6.1 subexpression, antecedent, 4.3.14 subexpression, consequent, 4.3.14 subgroup, 9.3, 9.3.1
subject classification, MSC 2000, 1.9 subjective truth, 3.7.2 subjunctive verb mood, 3.7.1, 3.10.11, 3.12.2 sublayer, conformal, 39.0.3 submanifold, 31.3.12 submanifold in Euclidean space, 41.7.1 submanifold of Euclidean space, 41.7 submersion in Euclidean space, 41.7.1 subsequence function for list, 7.12.2 subsets axiom, 5.11.2 subspace, trivial, 10.2.7 subspace of linear space, 10.2.1 subspace spanned by subset of linear space, 10.2.4, 10.6.6 substitute function for list, 7.12.2 substitution, exhaustive, logical expression, 4.14.3 substitution axiom, ZF, 5.11.3 substitution of equality, 5.2.3 substitution of equality axiom, 5.2.5 substitution operator, 13.1.2 substitution rule, logic, 4.9.1 substitutivity of equality axiom, 4.15.1 successor function, 7.3.3 successor set, 5.7.15, 7.2.2, 7.2.4, 7.3.1 successor state, 2.4.4 sugar, 1.9.1 sum of linear spaces, direct, 10.6 Sumerian, 3.5.2 Sun rises in the East, 3.10.6 Sung dynasty, 7.11.7 super-user, 5.7.19 superfluous dummy variable, 6.5.16 superior intellect, 3.3.4 supremum of partially ordered set, 7.1.10 surface of constant curvature, 43.9 surface of revolution, 43.9 surjection, 6.5.23 surjective function, 6.5.23 suspicion, truth and falsity, 3.9.4 swan, white, 4.13.10 swap function for list, 7.12.2 syllogism, 4.0.1, 4.6.5 symbol, assertion, 4.5.8, 4.6.6, 46.2.18 symbol, chicken-foot, 5.16.4, 6.3.20 symbol, Christoffel, 37.7.8, 37.9.5 symbol, Levi-Civita, 7.10.11, 7.10.21, 7.10.22 symbol, primitive, 4.3.12, 4.6.4, 4.7.1 symbol, two-way assertion, 4.5.9 symbol −, 46.2.5 symbol +, 46.2.5 symbol =, 46.2.6 symbol ⇔, 4.11.2 symbol γ for curves, 16.2.1 symbol ∧, 4.11.2 symbol ∨, 4.11.2 symbolic algebra, 10.10.4 symbolic juxtaposition, 2.5.12 symbolic logic, 2.9.7 symbolic logic, set theory, 5.0.3 symbolic logic, verb mood, 3.12.1 symbolic logic style, 4.0.2 symbolic mathematics software packages, 2.12.6 symbols, intellectual content, 2.3.7 symbols, logic, 4.3.5 symmetric matrix, 11.3.19 symmetric multilinear map, 13.3, 
13.3.2 symmetric relation, 6.3.29 symmetry, 6.4.1 synchronization, socio-mathematical network, 2.5.3
syntax, meta-language, 3.10.6 syntax/semantics, logic, 4.3.10 synthetic geometry, 3.4.4 synthetic logic, 3.9.7 system, algebraic, 4.1.1 system, axiomatic, 4.5.2 system, axiomatic, incomplete, 3.8.2 system, meta-logical framework, 5.7.22 system definitions, mathematical, 2.8 system hang, 3.10.4 system of linear second-order ODEs, 21.2 system reboot, 3.10.4 systems, parametrized family, 2.8.5 systems of differential equations, 21.0.3 T0 separation class, topology, 15.2.3 T1 separation class, topology, 14.2.9, 14.6.2, 15.2.4 T1 topology, trivial, 14.7.7 T1 topology on finite set, 15.2.10 T2 space, 15.2.13 T4 topological space, 15.2.20 Tá an Domhan cothrom, 3.10.8 table, truth, 3.10.1, 4.4.2 table, truth, extended, 3.8.1, 4.2.6 table, truth, theorem provability, 4.4.4 tablet, clay, Mycenaean, 2.3.5 tablets, ancient Minoan, 2.5.6 tablets, golden, 3.2.7 tag, class, set, 5.2.6 tag, proposition, 3.7.15 tagged second-order tangent operator, 30.1.9 tagged second-order tangent operator space, 30.4.3 tagged tangent operator, 28.6, 28.6.2 tagged tangent operator, differential, 31.3.24 tagged tangent operator basis operator, 28.7.11 tagged tangent operator space, 28.7.9 tagging propositions, 3.9.2 tainted theorem, AC, 5.9.1, 7.8.4 tainted theorem, CC, 5.9.1 tainted theorem example, AC, 10.2.25, 20.1.4 tainted theorem example, CC, 7.2.26, 7.2.28, 7.2.36 tangent, etymology, 28.0.2 tangent basis, 28.12.1 tangent bundle, 28.0.1, 28.8, 28.8.1 tangent bundle, affine connection, 37.3, 37.3.2 tangent bundle, affine connection, differentiability, 37.3.4 tangent bundle, multi-level, 28.10.18 tangent bundle, second-level, 28.10.3 tangent bundle, second-order, 30.3.10 tangent bundle, topological second-order, 30.3.11 tangent bundle, unidirectional, 28.14, 28.14.2 tangent bundle atlas, 28.2.1 tangent bundle atlas, unidirectional, 28.14.2 tangent bundle chart, 28.2.1 tangent bundle chart, unidirectional, 28.14.2 tangent bundle fibre space, 28.2.5 tangent bundle lift function, 28.2.1 tangent
bundle lift function, unidirectional, 28.14.2 tangent bundle metadefinition, 28.2, 28.2.1 tangent bundle of differentiable manifold, 35.8 tangent bundle of tangent bundle, 28.10 tangent bundle on differentiable manifold, 28 tangent bundle on Euclidean space, 19.1.7 tangent bundle on Euclidean space, cross-section, 19.1.9 tangent bundle on Euclidean space, cross-section, differentiable, 19.1.10 tangent bundle on infinite-dimensional manifold, 28.16 tangent bundle on two-sphere, global, 42.7 tangent bundle projection map, 28.2.1
tangent bundle projection map, unidirectional, 28.14.2 tangent bundle total space, 28.2.1 tangent bundle total space, unidirectional, 28.14.2 tangent bundle total space atlas, 28.8.4 tangent bundle total space manifold, 28.8.6 tangent bundle transition map, 28.2.6 tangent component tuple, computational second-order, 30.3.1 tangent component tuple, second-order, 30.3.1 tangent coordinate frame bundle, 35.9.1 tangent coordinate triple, 28.3.2 tangent curve class, 28.0.4 tangent fibration, 28.8.8 tangent fibre bundle, 35.8.1 tangent frame, 28.12, 28.12.1 tangent frame bundle, 35.9, 35.9.1 tangent frame space, total, 28.12.8 tangent function, 20.13.9 tangent operator, 28.0.4, 28.5, 28.5.1 tangent operator, differential, 31.3.18, 31.3.22, 31.3.23 tangent operator, higher-order, 30.1 tangent operator, second-order, 30.1.1 tangent operator, second-order, differential, 32.7.3 tangent operator, second-order, elliptic, 30.6.3 tangent operator, second-order, tagged, 30.1.9 tangent operator, second-order, tensorization coefficients, 30.2, 30.2.3 tangent operator, tagged, 28.6, 28.6.2 tangent operator, tagged, differential, 31.3.24 tangent operator, zero, ambiguity, 2.6.5, 28.5.12, 28.6.1 tangent operator basis operator, 28.7.11 tangent operator basis operator, tagged, 28.7.11 tangent operator bundle, 28.9, 28.9.2 tangent operator field, 29.6 tangent operator frame, 28.12.4 tangent operator of curve, 31.4.7 tangent operator space, 28.7.6 tangent operator space, second-order, 30.4.1 tangent operator space, tagged, 28.7.9 tangent operator space, tagged second-order, 30.4.3 tangent orthogonal bundle, 39.4.9 tangent space, double, 29.4 tangent space, double, pointwise, 29.4.1 tangent space, double, total, 29.4.3 tangent space, higher-order, 30.4 tangent space, pointwise, 28.7, 28.7.1 tangent space, second-order, 30.3.7 tangent space building principles, 27.13 tangent space of tangent bundle total space, 28.10.13 tangent space using Leibniz rule, 45.1.1 tangent to curve,
transformation rule, 19.1.3 tangent to sphere, 42.17.2 tangent vector, 19.1, 28.2.1, 28.3, 28.3.3 tangent vector, computational, 28.4, 28.4.2 tangent vector, extrinsic, 28.1.0 tangent vector, higher-order, 30, 30.3 tangent vector, second-level, 19.3 tangent vector, second-level, drop function, 30.5, 30.5.2 tangent vector, second-order, 30.3.3 tangent vector, second-order, differential, 32.7.1 tangent vector, unidirectional, 28.14.2 tangent vector coordinates, 28.2.1 tangent vector field, 29.5.1 tangent vector field, partial, 31.4.10 tangent vector field, second-order, 32.3.1 tangent vector field, second-order, partial, 32.4.2 tangent vector field of curve, 31.4.1 tangent vector on total tangent space, 28.10.7 tangent vector representation, 28.1, 28.1.1
tangent vector triple, computational, 28.4.1 tangent vector using Leibniz rule, 45.1.1 target set of function, 6.5.10 target set of relation, 6.3.14, 6.3.15, 6.3.16 tautological proposition, 4.3.22 tautologous proposition, 4.3.22 tautology, 4.3.22, 4.3.23 Taylor, Brook, 46.1.4 Taylor series, 7.11.1, 8.7.3, 20.13.1, 21.7 tea-cup and dough-nut topology, 14.1.3 techniques, PDE, curved space, 1.4.1 tedium, 5.15.8 tempered distribution, 28.5.8 template, axiom, 5.11.5 template, proposition, 4.12.3, 5.1.25 template definition, 18.4.1 template function, 6.5.22 temporal parameter, logic machine, 3.7.9 tennis ball, 2.10.10 tensor, 13.5.2, 29.3.4 tensor, alternating, 13.9 tensor, contravariant, 29.1 tensor, covariant, 13.6, 29.3 tensor, curvature, 37.7.6, 39.5, 39.5.3 tensor, curvature, Riemann, 36.4.1 tensor, fundamental, 39.2.6 tensor, Levi-Civita, 7.10.20 tensor, metric, 39.2.6, 41.5.0 tensor, mixed, 13.7, 29.3 tensor, Ricci, 39.5.5 tensor, simple, 13.5.6 tensor, torsion, 37.7.7 tensor algebra, 13, 13.8.6 tensor algebra, alternating, 13.10 tensor algebra, general, 13.8 tensor algebra, mixed, 13.8.14 tensor algebra, preview, 25.4 tensor bundle on manifold, 29 tensor calculus, 41 tensor calculus, history, 41.1 tensor calculus for differential manifold, 41.2 tensor calculus for manifold with affine connection, 41.3 tensor calculus for Riemannian manifold, 41.5 tensor calculus in terrestrial coordinates, 42.2 tensor components, 29.3.6 tensor field, 29.7, 29.7.1 tensor field, differentiable, 29.7.2 tensor field, Lie derivative, 33.5 tensor field, metric, 10.2.18 tensor field, Riemannian metric, 39.2.3 tensor field, Riemannian metric, differentiability, 39.2.4 tensor field along curve, 29.8 tensor field on manifold, 29 tensor metadefinition, 13.4.1 tensor monomial, 13.12.2 tensor polynomial representation, 13.12.3 tensor product, 2.5.12 tensor product, alternating, 13.9.6 tensor product defined via free linear space, 13.11 tensor product defined via lists of tensor monomials, 
13.12 tensor product in terms of free linear space, 13.11.1 tensor product metadefinition, 13.4 tensor product of linear spaces, 13.5 tensor product operation, 13.8.2 tensor product operation, mixed, 13.8.11 tensor product space, 13.5.2 tensor product space metadefinition, 13.4.1
tensor product standard immersion using free linear space, 13.11.1 tensor space, 13.5.2 tensor space, contravariant, 29.1 tensor space, mixed, 13.7.2 tensor space canonical multilinear map, 13.4.1 tensor space extended canonical map, 13.12.5 tensorization, 25.5.2 tensorization, Levi-Civita connection, 30.2.7 tensorization coefficients, 37.6.10, 39.4.5 tensorization coefficients for second-order tangent operator, 30.2, 30.2.3 terminal point of curve, 16.2.13 terminal point of path, 16.4.11 termination, infinite sequence, 2.11.22 terminology, connection, 15.3.2, 37.1.3 terminology, contravariant, 13.6.2 terminology, covariant, 13.6.2 terminology, covariant derivative, 37.4.3 terminology, differentiable manifolds, 27.1 terra firma, 2.1.3 terrain navigation, 3.5.10 terrestrial coordinates, 42.1, 42.2.2, 42.6.0, 42.14.0 terrestrial coordinates principal fibre bundle, 42.4 terrestrial coordinates Riemannian connection, 42.5 terrestrial coordinates tensor calculus, 42.2 terrible waste of time and energy, 4.4.4 tertium non datur, 3.6.4 test function, convexity, 38.8.2 test particle, 16.2.2 test procedure, class, 4.1.4 testing unit, proposition, 3.7.6 TEX, plain, 1.5.3, 1.8 text-level argumentation, 4.0.2 Thales of Miletus, 1.4.4, 3.2.8, 46.1.1 the, 4.16.7 theorem, bogus, 4.9.10 theorem, deduction, 4.9 theorem, false application, 14.1.6 theorem, naive, 4.9.5 theorem, PC, 4.7.2 theorem application rule, 4.9.2 theorem corpus, 5.6.2 theorem provability, truth table, 4.4.4 theorems, equipotent, 4.9.2 theorems, propositional calculus, 4.8 theoretical physicist, 2.9.9 theory, membership, 5.1.3 theory of distributions, 14.5.6 theory of ideas, Plato, 2.4 theory of truth, deflationary, 3.10.8 theory of truth, redundancy, 3.10.8 theory of types, 4.1.9 there, 2.10.1 thief, 3.5.3 thinking, logical, 3.2.8 thinking machine, mathematical, 4.2.8 Thomson, William (Lord Kelvin), 20.8 three-storey model for differential geometry, 26.1.9 three-truth-value logic, 3.11.7 Tietze extension
theorem, 15.2.27 Tikhonov, Andrei Nikolaevich, 46.1.6 Tikhonov’s theorem, 5.9.10, 15.7.10 time and energy, terrible waste, 4.4.4 tolerance of not-knowing, 3.2.8 tollendo tollens, modus, 4.6.8 tollere, 4.6.8 topic flow diagram, 1.2
topological atlas, 26.4, 26.4.6 topological boundary of set, 14.5.4 topological chart, 26.4, 26.4.2 topological closure, 14.4.4 topological coordinate function, 26.4.2 topological coordinate map, 26.4.2 topological curve, 16 topological dimension, 15.9, 34.2.5 topological exterior of set, 14.5.2 topological fibration, 23.3, 27.12.0 topological fibration, cross-section, 23.3.8, 27.12.12 topological fibration, direct product, 23.3.20 topological fibration with a fibre atlas, 23.3.17 topological fibration with fibre space F, 23.3.7 topological fibration with intrinsic fibre space, 23.2, 23.2.1 topological fibre atlas, equivalence, 23.6.16 topological fibre bundle, 23, 23.6, 23.6.4, 35.0.0 topological fibre bundle, associated, 23.10, 23.10.5, 23.10.9 topological fibre bundle, associated, orbit space method, 23.12.3 topological fibre bundle, fibre-to-fibre homeomorphism space, 36.8.3 topological fibre bundle, parallelism, 24 topological fibre bundle, pathwise parallelism, 24.2 topological fibre bundle association, 23.10.3 topological fibre bundle homomorphism, 23.7.1, 23.7.7 topological fibre bundle isomorphism, 23.7.4, 23.7.10 topological fibre set parallelism, 24.1.9 topological fibre set parallelism space, 24.1.9 topological G-bundle, 23.9.2 topological glue, 25.2.1 topological graft, 26.6.1, 43.3.3 topological group, 16, 16.7, 16.7.1 topological group, locally Euclidean, 34.1.2 topological group of differentiable diffeomorphisms, 34.6.5 topological identification space, 15.11, 26.6.0 topological interior of set, 14.4.1 topological layer, 1.1 topological left transformation group, 16.8.17 topological left transformation group homomorphism, 16.8.6 topological left transformation group of topological space, 16.8.3 topological left transformation group of topological space, effective, 16.8.8 topological manifold, 26, 26.3, 26.3.1 topological neighbourhood, 25.2.2 topological path, 16 topological pathwise parallelism, 24.2.2 topological principal fibre bundle, 23.9
topological principal fibre bundle with structure group, 23.9.2 topological principal G-bundle, 23.9.2 topological right transformation group, 16.8.12, 16.8.17 topological right transformation group, effective, 16.8.14 topological second-order tangent bundle, 30.3.11 topological space, 14.2, 14.2.3 topological space, compact, 15.7.4 topological space, connected, 15.4.1 topological space, Euclidean, 26.2.1 topological space, locally connected, 15.4.20 topological space, neighbourhood of point, 14.2.11 topological space, normal, 25.2.7 topological space, separable, 15.6.6 topological space compactness classes, 15.7 topological space connectivity classes, 15.4 topological space disconnection, 15.4.5 topological space equivalence, 25.2.9 topological space examples, 43.1 topological space product, 15.1.1
topological space subset disconnection, 15.4.7 topological transformation group, 16.8 topological vector space, 16.9, 16.9.3 topology, 14, 14.2.3 topology, algebraic, 14.1.3, 16.2.2, 16.6, 25.2.9 topology, compact-open, 15.7.9 topology, differential, 27.0.1 topology, direct product, 15.1.4 topology, discrete, 14.2.19 topology, duality, 14.3.7 topology, empty, 14.2.10, 14.3.2 topology, etymology, 14.1.1, 46.2.17 topology, Hausdorff, 15.2.13 topology, history, 14.1 topology, information content, 14.7.10 topology, intuitive, 14.2.9 topology, isolated point, 14.6.5 topology, largest, 14.2.17, 14.3.1 topology, locally compact, 15.7.12 topology, non-Hausdorff, 15.2.15 topology, non-trivial, 14.7.10 topology, paracompact, 15.7.14 topology, permutation-invariant, 14.7.5 topology, preview, 25.2 topology, product, 15.1 topology, purpose, 14.7.1 topology, quotient, 15.1, 15.1.6, 15.1.8 topology, relative, 14.10.13 topology, simple, countably infinite set, 14.7 topology, simple, finite set, 14.3 topology, smallest, 14.2.17, 14.3.1 topology, strength, 14.2.20 topology, stronger, 14.2.21 topology, T0 separation class, 15.2.3 topology, T1 separation class, 14.2.9, 14.6.2, 15.2.4 topology, tea-cup and dough-nut, 14.1.3 topology, translation-invariant, 14.7.4, 14.7.5 topology, trivial, 14.2.18, 17.3.5 topology, trivial closed-point, 14.7.6 topology, trivial T1 , 14.7.7 topology, underlying, 27.2.6 topology, weaker, 14.2.21 topology, weakness, 14.2.20 topology classes, 14.1.6, 14.7.10, 15 topology classification, 25.2.9 topology constructions, 15 topology flavour, 14.1.2 topology for real numbers, 14.9.1 topology for the real numbers, standard, 14.9 topology formalism, 14.2.1 topology generated by collection of subsets of set, 14.8.4 topology induced by a metric, 17.3, 17.3.3 topology induced by differentiable atlas, 27.2.4 topology induced on a set by a function, 14.11.5 topology of metric space, 17.3.3 topology of pointwise convergence, 14.11.21, 15.7.8 topology of real number 
intervals, 15.8 topology on finite set, T1, 15.2.10 topology on finite set, uniform, 14.3.8 torsion, 37.7, 39.4.7 torsion form, 37.7.5 torsion-free connection, 37.1.3, 37.6.10 torsion tensor, 37.7.7 tortoise, Achilles, 2.11.3, 14.0.4, 25.2.3 torus, 43.5, 44.2.2 total cotangent space, 29.2.8 total double cotangent space, 29.4.6 total double tangent space, 29.4.3
total order, 7.1.4 total space atlas, tangent bundle, 28.8.4 total space manifold, tangent bundle, 28.8.6 total space of second-level tangent bundle, 28.10.14 total space of tangent bundle, 28.2.1 total tangent frame space, 28.12.8 total tangent space, drop function, 28.11, 28.11.6 total tangent space, horizontal component, 28.11, 28.11.2 total tangent space, vertical vector, 28.11.3 total tangent space tangent vector, 28.10.7 totally differentiable function, 18.5.17 totally ordered set, 7.1.4 tour, mathematical logic, 3.1.2 trace of a matrix, 11.3.2 track, 16.1.1 tractrix, 43.9 trajectory, 7.1.18, 16.1.1 transcendental functions, 20.12.3 transcendental numbers, 2.9.3, 2.12.5 transfer, incomplete information, 3.8 transfinite induction, 18.4.1 transformation, differentiable, differentiable family, 27.7 transformation, Galileo, 30.0.3 transformation, infinitesimal, 34.8, 35.3.0 transformation, Lorentz, 30.0.3 transformation, orthogonal, 11.5.7, 19.2.4, 19.5.3, 23.5.0, 46.3.0 transformation family, one-parameter, 31.5 transformation group, 9.4.4 transformation group, left, 9.4 transformation group, Lie, 34.0.1 transformation group, mixed, 9.6 transformation group, one-parameter, vector field, 31.5.2 transformation group, right, 9.5, 23.9.2 transformation group, topological, 16.8 transformation group ambiguity, left/right, 2.5.4, 5.16.6, 34.8.8 transformation group as fibre bundle, finite, 22.4 transformation group figure, 9.7, 9.7.0 transformation group homomorphism, 9.4.9 transformation group homomorphism, topological left, 16.8.6 transformation group invariant, 9.7 transformation group of a topological space, effective left, 16.8.7 transformation group of topological space, left, 16.8.2 transformation group of topological space, topological left, 16.8.3 transformation groups, 9.2.3, 9.4.1 transformation groups, left outside skew product, 9.6.4 transformation groups, right inside skew product, 9.6.5 transformation pseudogroup, 27.4.3 transformation rule, tangent to 
curve, 19.1.3 transformation semigroup, 9.4.2 transformation semigroup, right, 9.5 transformations, general linear, Lie group, 34.7.6 transistor circuit, 4.6.4 transistor circuit, bistable, 4.1.1 transistor circuit voltages, 3.10.6 transition map, tangent bundle, 28.2.6 transition map conformality, 28.8.9 transition map orthogonality, 28.8.9 transitive relation, 6.3.29 transitivity, 6.4.1 transitivity rule for pathwise parallelism, 24.2.4 translate of set, 14.7.4 translation-invariant topology, 14.7.4, 14.7.5 translation operator, left, 34.3.1 translation operator, right, on Lie group, 34.4.2
transport, Lie, 33.4.5 transport, metric, 39.1.5 transport, parallel, 33.1.0, 36.0.2, 37.1.1, 37.7.8 transpose of a matrix, 11.1.12 transpose of linear map, 10.5.23 transposed horizontal lift function on differentiable fibre bundle, 36.3.12 transposed horizontal lift function on principal bundle, 36.5.3 transposed map, 10.5.24 transposition of function, 6.12.3 transposition operation on set, 7.10.7 traversal, 16.1.1 traversal, membership relation network, 5.5.3 traversal, ordered, 7.1.17, 7.1.18, 16.1.8, 16.2.3, 17.5.11 tree, bird, 5.7.25 tree, function, 4.3.12 tree, logical operation, 3.10.13 triangle, arithmetic, 7.11.6 triangle inequality, 17.1.12, 17.1.14 triangle of Pascal, 7.11.6 tribal membership, 2.2.2 tribes, monkey, 2.2.3 trichotomy, 5.9.14 trigger, assertion, 3.7.17, 3.10.14 trigonometric function, 20.13 trigonometric function, inverse, 20.13.2 trigonometry double-angle rules, 20.13.12 trigonometry half-angle rules, 20.13.13 trigonometry product rules, 20.13.11 triple, ordered, 6.1.14, 7.7.3 triple, tangent coordinate, 28.3.2 triple conjunction, 3.5.8 triple disjunction, 3.5.8 triplicity, 4.16.10 triplique, 4.16.10 tripliqueness, 48.1.7 trivial bundle, 24.4.4 trivial closed-point topology, 14.7.6 trivial group, 44.1.0 trivial subspace, 10.2.7 trivial T1 topology, 14.7.7 trivial topological space, 14.2.18 trivial topology, 14.2.18, 17.3.5 trivium, 2.9.8 truck, 23.1.2 true, abstract label, 3.2.5 true, proposition tag, 3.7.16 true nature, mathematical logic, 3.1.2 true-store, proposition, 3.7.19 true zero-operand operator, 4.3.18, 4.3.19 true zero-parameter predicate, 4.12.10 truth, definition, 3.4.3 truth, ontology, 3.6.1 truth, proposition tagging, 3.7.5 truth, scientific, 3.9.3 truth, semantics, 3.9 truth, subjective, 3.7.2 truth, theory of, deflationary, 3.10.8 truth, theory of, redundancy, 3.10.8 truth, undefined concept, 3.9.1 truth function, 3.10.9, 4.2.4 truth function, unary, 3.10.10 truth-functional combination, 4.2.1, 4.2.3 truth table, 3.10.1, 
4.4.2 truth table, extended, 3.8.1, 4.2.6 truth table, negation, 3.10.10 truth table, theorem provability, 4.4.4 truth table applicability, predicate calculus, 4.14.3
truth value, 3.7.15 truth value, unknown, 3.2.5, 3.8.2, 4.2.5, 4.2.7 truth value function, inconsistent, 4.1.5 truth value map, 3.1.2, 4.1.3 truth-value status, relativity, 3.3.9 tuple, computational second-order tangent component, 30.3.1 tuple, real number, 8.5 tuple, second-order tangent component, 30.3.1 tuple, specification, 5.16 tuple, specification, redundancy, 14.2.2 tuple concatenation operator, 8.5.3 tuple of real numbers, 8.5.1 two-parameter arctangent function, 20.13.6 two-point metric, 17.1.3 two-port object, 34.4.11 two-sided coin, proposition analogy, 3.7.4 two-sphere antipodal points, 42.9.4 two-sphere geodesic curve, 42.9, 42.9.3 two-sphere geometry, 42 two-sphere global tangent bundle, 42.7 two-sphere isometry, 42.8 two-sphere rotation, 42.8.0 two-truth-value logic, 3.11.7 two-way assertion symbol, 4.5.9 two’s complement representation of negative numbers, 7.5.8 types, theory, 4.1.9 typing, 1.4.7 typographic arts, 2.9.7 Übertragung, 36.1.3 ugly set-construction definitions, 2.8.8 unary truth function, 3.10.10 unbounded modelling, on demand, 5.7.25 unbounded versus infinite, 5.5.2 uncertainty, logic, 3.9.3 uncountable, 2.11.20 uncountable aggregate, name, 2.10.1 uncut space, 10.10.4 undecidable proposition, 3.7.19, 3.8 undecided proposition, 3.10.1 undefined concept, probability, 3.9.1, 5.1.2 undefined concept, set, 5.1.2 underlying objects, 4.16.1 underlying topology, 27.2.6 unidirectional derivative, 19.6.2 unidirectional tangent bundle, 28.14, 28.14.2 unidirectional tangent bundle atlas, 28.14.2 unidirectional tangent bundle chart, 28.14.2 unidirectional tangent bundle lift function, 28.14.2 unidirectional tangent bundle projection map, 28.14.2 unidirectional tangent bundle total space, 28.14.2 unidirectional tangent vector, 28.14.2 unidirectionally differentiable atlas for topological manifold, 27.10.3 unidirectionally differentiable function, 18.3.5, 18.5.14, 19.6.3 unidirectionally differentiable homeomorphisms, complete pseudogroup, 19.6.7
unidirectionally differentiable manifold, 27.10, 27.10.4 uniform non-topological fibration, 22.1.3 uniform non-topological fibration with fibre space F, 22.1.9 uniform topology on finite set, 14.3.8 uniformly continuous function, 17.4.8 uniformly Hölder continuous function, 18.9.1 uninteresting assertions, 4.8.1 union, disjoint, 6.4.2 union axiom, 5.3.4 union of family of sets, 6.8.5 union of sets, 5.13.2 union of sets, properties, binary, 5.13
union of sets, properties, general, 5.14 unique existence, 4.16.2 unique existence notation, 4.16.3, 4.16.5 uniqueness, 4.16 uniqueness, cardinality, 4.16.6 uniqueness, empty set, 5.3.2, 5.8.1 uniqueness, set notation definition, 5.8.6 unit coordinate basis covector, 29.2.6 unit element of unitary ring, 9.8.2 unit interval, 8.3.10 unit step function, 8.6.5 unitary left module, 10.1.3 unitary left module over a ring, 9.9.21 unitary ring, 9.8.2 unitary ring, commutative, 9.8.6 unity of unitary ring, 9.8.2 universal bilinear map, 13.4.3 universal quantifier, 4.13.2 universal set, 4.1.9 universal/existential quantifier, information content, 4.13.9 universality of modern logic, 3.4 Universe, atoms, 2.11.15 universe, imperfect, 3.4.3 universe, mathematical, ontology, 2.3.10 universe, metaphysical, 2.7.1 Universe, Ruler, 2.11.22 universe set, 5.7.14, 5.7.23 universe set, power set, 5.7.26 university education, medieval, 2.9.8 unknowable existence, Lebesgue non-measurable set, 2.10.8 unknown known, 4.2.8 unknown real number, 4.2.8 unknown truth value, 3.2.5, 3.8.2, 4.2.5, 4.2.7 unknown unknown, 3.8.2, 4.2.8 unknowns, blow-out, 3.8.2 UNLESS-construction, 3.5.4 unmentionable number, 2.10.4 unordered pair axiom, 5.3.3 unordered sample, 7.11.1 unordered selection, 7.11.1 unoriented continuous path, 16.4.15 unsigned integer arithmetic, 7.4 upper bound of partially ordered set, 7.1.10 URL (Uniform Resource Locator), 50.0.1 useless axiom of choice, 5.1.18 usual atlas for Euclidean space, 27.3.1 usual metric on real numbers, 17.1.4 usual topology for real numbers, 14.9.1 usual topology for ℝⁿ, 14.9.3 vacuum, proposition, 3.6.1 validity, proof by contradiction, 3.11.10 value, absolute, function, 8.6.1 value, truth, map, 3.1.2 value of function, 6.5.11 value proposition, 3.3.10 variable, bound, 5.1.24 variable, dummy, 4.3.16, 5.1.23, 5.8.16, 5.8.24 variable, dummy, limit of function, 14.11.19 variable, dummy, naked, 5.8.28 variable, dummy, superfluous, 6.5.16 variable, free, 
5.1.24, 5.8.16 variable domain, concrete, 5.7.3 variable name space, abstract, 2.11.17 variable space, concrete, 5.2.3 variation, geodesic, equation, 41.4 Veblen, Oswald, 46.1.6 vector, action, 33.1.2 vector, basis, 10.2
vector, coordinate basis, 28.12 vector, cotangent, 29.2 vector, dual, 13.6.2 vector, etymology, 28.0.5 vector, horizontal, 10.11.11 vector, infinitesimal, 19.1.5 vector, mobile, 10.1.6 vector, portable, 10.1.6 vector, primal, 13.6.2 vector, second-order tangent, 30.3.3 vector, simple, 13.9.10 vector, tangent, 19.1, 28.2.1, 28.3, 28.3.3 vector action, 33.1.8, 33.1.16 vector addition, 10.1.2 vector bundle, 35.7, 35.7.1 vector component, 10.5.13 vector field, 29.5, 31.5 vector field, action, 33.1.3, 33.1.11 vector field, component function, 29.5.5 vector field, composition, 33.2.2 vector field, coordinate basis, 29.5.12 vector field, differentiable second-order, 30.7.3 vector field, elliptic second-order, 30.7.6 vector field, higher-order, 30.7 vector field, induced map, 31.3.14 vector field, Lie derivative, 33.4 vector field, Lie group, left invariant, 34.3.11 vector field, Lie group, right invariant, 34.4 vector field, lift, by connection on principal fibre bundle, 36.5.8 vector field, linear space, 29.5.8 vector field, right invariant, 34.4.7 vector field, second-order, 30.7.1 vector field, tangent, 29.5.1 vector field, tangent, partial, 31.4.10 vector field, tangent, second-order, 32.3.1 vector field, tangent, second-order, partial, 32.4.2 vector field action, 33.1.17 vector field along curve, 29.8, 29.8.1 vector field along curve, continuous, 29.8.2 vector field along curve, differentiable, 29.8.4 vector field calculus, 33 vector field commutator, 33.2.5 vector field component function, 29.6.6 vector field derivative, naive, 33.1 vector field derivative for curve family, 33.3 vector field flow, 33.4.2 vector field for family of curves, higher order, 30.8 vector field generated by family of diffeomorphisms, 34.6.8 vector field generated by one-parameter transformation group, 31.5.2 vector field on differentiable fibre bundle, 35.3 vector field on differentiable principal fibre bundle, 35.5 vector field on Lie group, left invariant, 34.3 vector map, column, 11.1.10
vector map, row, 11.1.11 vector sequence, antisymmetric multilinear effect, 13.0.4 vector sequence multilinear effect, 13.0.1, 13.0.5 vector space, 9.9.2, 10.1.1 vector space, cotangent, 29.2.2 vector space, topological, 16.9, 16.9.3 vegetarian cook, 1.4.5 vel, 4.3.5 velocity of flow, 33.4.3 Venn diagram, 5.7.17, 5.7.25 verb mood, 3.10.6 verb mood, imperative, 3.10.14, 3.12.1, 3.12.3 verb mood, indicative, 3.7.1, 3.10.11, 3.12.1, 3.12.2
verb mood, logical proposition, 3.12 verb mood, subjunctive, 3.7.1, 3.10.11, 3.12.2 verb mood, symbolic logic, 3.12.1 vertical vector, principal fibre bundle, 36.5.10 vertical vector of total tangent space, 28.11.3 vertical vector on total space of fibration, 27.12.9 Viète, François, 46.1.3 viewpoint of projection, 42.17.2 Viking opera house, 43.10.1 Vinci, Leonardo da, 3.9.8 violation, causality, self-containing set, 5.7.24 virtual machines, 2.5.15 virtual memory, 2.10.10 visible light, 3.4.5 visual art, 3.1.3 Vitali, Giuseppe, 3.2.1, 46.1.6 vocalization pattern, 2.2.4 vocalizations, complexity, 2.2.4 voltage, audio system, 3.3.7 voltage, logic, 4.5.5 voltage, logical, 4.1.8 voltages, 2.5.8 voltages, transistor circuit, 3.10.6 von Neumann, John (János), 5.12.1, 7.2.4, 46.1.6 von Neumann construction, ordinal numbers, 7.2.5 von Staudt, Karl Georg Christian, 46.1.5 Wallis, John, 46.1.4, 46.2.9 warming, global, 3.10.6 Washington University, 1.8 waste of time and energy, terrible, 4.4.4 water, 2.1.5, 39.1.5 weak regularity, 1.4.9 weak topology, 14.10.9 weaker topology, 14.2.21 weakly connected function, 15.5.3 weakly elliptic second-order operator, 37.6.3 weakly elliptic second-order tangent operator, 30.6.3 weaving, 2.5.14 Webster, Noah, 1.6.7 wedge product, 13.9.2 Weierstraß, Karl Theodor Wilhelm, 2.9.7, 17.4.2, 46.1.5 well-defined function, 6.11.1 well-formed formula, 4.3.10, 4.5.2 well-ordering of real numbers, 5.9.3 Weyl, Hermann Klaus Hugo, 1.5.4, 10.1.7, 12.2.0, 26.1.9, 36.1.1, 36.1.2, 36.1.5, 46.1.6 wf (well-formed formula), 4.5.2 wff, 4.7.4 wff (well-formed formula), 4.5.2 wff name, 4.7.4 wff name scope, 4.9.3 wheel, roulette, ex machina, 5.9.9 wheel of fortune, 1.9.0 wheel of knowledge, 2.0.1 white swan, 4.13.10 Whitehead, Alfred North, 46.1.6 Widman, Johannes, 46.2.5 Widmann, J.W., 46.2.5 wild logic, 2.2.7 wncyr font, 1.8 word order, 3.10.6 world, imaginary, model, 2.10.1 world, infinite, 2.10.9 world model, animal, 5.7.25 world model, organism, 3.3.2
world-model machine, 3.6.1 world-model-machine ontology, 3.6.2 world-model ontology, 5.7.25
world-model ontology, inconsistency, 5.7.12 world-model ontology, logic, 3.11.6 world-model ontology, proof by contradiction, 3.11.8 world-model ontology, propositional logic, 3.6.3 world models, multiple, 3.11.2 world states, 3.6.3 world-view ontology for logic, 3.6 writing, cuneiform, 3.5.1 XOR (exclusive OR), 4.3.7, 4.3.8 Yang Hui, 7.11.7 yaw, 42.8.2 year, Gregorian, 2.10.3 yes/no answer, 3.9.5 Young, John Wesley, 46.1.6 Zeno of Elea, 2.11.3, 7.2.7, 14.0.4, 14.1.5, 25.2.3, 46.1.1 Zeno’s paradox, 25.2.3 Zermelo, Ernst Friedrich Ferdinand, 2.9.7, 5.9.3, 46.1.6 Zermelo-Fraenkel set theory, 4.1.8, 5.1, 5.7.2 Zermelo-Fraenkel set theory axioms, 5.1.26 Zermelo-Fraenkel set theory with axiom of choice, 5.9.6 Zermelo-Fraenkel set theory with axiom of countable choice, 5.10.3 Zermelo set theory, 5.11, 5.11.1 Zermelo-Skolem-Fraenkel set theory, 5.1.1 zero-operand operator, logical, 4.3.18, 4.3.19 zero-parameter predicate, logical, 4.12.10 zero ring, 9.8.4 zero tangent operator ambiguity, 2.6.5, 28.5.12, 28.6.1
zero-thickness boundary, 25.2.8 zero-width boundary, 14.0.3 ZF (Zermelo-Fraenkel), 5.0.9 ZF axiom, productive, 5.1.17, 5.1.19 ZF axiom of substitution, 5.11.3 ZF axiom style, 5.6.2 ZF-clean theorem, 5.9.1 ZF comprehension axiom, 5.4.3 ZF extension axiom, 5.1.3, 5.2 ZF infinity axiom, 2.11.13, 5.6 ZF regularity axiom, 5.5, 5.7.19 ZF replacement axiom, 5.4 ZF separation axiom, 5.4.3 ZF set existence axiom, 5.1.17, 5.1.18 ZF set theory, first order logic, 4.13.13 ZF set theory, propositions, 4.1.9 ZF set theory, redundant axiom, 5.1.17 ZF set theory 8-line summary, 5.1.27, 5.1.28 ZF set theory axioms, 5.3 ZF set theory construction stage, 5.5.4 ZF specification axiom, 5.4.1, 5.4.2, 5.7.19, 48 ZF-unreachable, 2.12.3 ZFC (Zermelo-Fraenkel set theory with axiom of choice), 5.9.5 Zhu Shijie, 7.11.7 ziggurat, 1.4.14 zoology, 2.2.5 Zorn’s lemma, 5.9.13, 5.9.14, 10.2.25 Zusammenhang, 26.1.9, 36.1.1, 36.1.2 Zusammenhang, stetiger, 36.1.2
item         count    item          count    item          count
pages          902    definitions     721    theorems        399
chapters        51    notations       229    proofs          206
sections       455    examples         51    exercises        68
diagrams       281    lemmas            2    solutions        39
remarks       1993    metadefns         3    [comments]     1200