LOGIC COLLOQUIUM '87
STUDIES IN LOGIC AND THE FOUNDATIONS OF MATHEMATICS VOLUME 129
Editors: J. BARWISE, Stanford
H.J. KEISLER, Madison
P. SUPPES, Stanford
A.S. TROELSTRA, Amsterdam
NORTH-HOLLAND AMSTERDAM • NEW YORK • OXFORD • TOKYO
LOGIC COLLOQUIUM '87 Proceedings of the Colloquium held in Granada, Spain July 20-25, 1987
Edited by
H.-D. EBBINGHAUS
Mathematics Institute, A. Ludwigs University, Freiburg, F.R.G.
J. FERNANDEZ-PRIDA
Universidad Complutense de Madrid, Spain
M. GARRIDO
Universidad Complutense de Madrid, Spain
D. LASCAR
University of Paris VII, France
M. RODRIGUEZ ARTALEJO
Carretera de Valencia, Madrid, Spain
1989
ELSEVIER SCIENCE PUBLISHERS B.V.
Sara Burgerhartstraat 25
P.O. Box 211, 1000 AE Amsterdam, The Netherlands

Distributors for the U.S.A. and Canada:
ELSEVIER SCIENCE PUBLISHING COMPANY, INC.
655 Avenue of the Americas
New York, N.Y. 10010, U.S.A.

ISBN: 0 444 88022 4

© ELSEVIER SCIENCE PUBLISHERS B.V., 1989

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the publisher, Elsevier Science Publishers B.V./Physical Science and Engineering Division, P.O. Box 103, 1000 AC Amsterdam, The Netherlands.

Special regulations for readers in the U.S.A. - This publication has been registered with the Copyright Clearance Center Inc. (CCC), Salem, Massachusetts. Information can be obtained from the CCC about conditions under which photocopies of parts of this publication may be made in the U.S.A. All other copyright questions, including photocopying outside of the U.S.A., should be referred to the publisher.

No responsibility is assumed by the publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein.

pp. 241-274: copyright not transferred

Printed in the Netherlands
PREFACE

“Logic Colloquium ’87”, the European Summer Meeting of the Association for Symbolic Logic, was held at the University of Granada from July 20 to July 25, 1987. The meeting was co-sponsored by the Journal “Teorema” (Madrid) and the University of Granada. The main sections of the conference were Logic, Set Theory, Recursion Theory, Model Theory, Logic for Computer Science and Semantics of Natural Languages.

The program committee consisted of J.J. Acero, N. Batlle, R. Beneyto, H.-D. Ebbinghaus, J. Fdz.-Prida, M. Garrido, D. Lascar, J. Mosterin, J.A. Paulos, M. Rgz. Artalejo and E. Trillas.

Invited lectures were given by F. Delon (Paris), F.R. Drake (Leeds), J. Fenstad (Oslo), J. Flum (Freiburg), W.A. Hodges (London), U. Hrushovski (New Brunswick), J. Ihoda (Jerusalem), G. Jäger (Zurich), P. Koepke (Oxford), A. Kučera (Prague), G. Longo (Pisa), J. Meseguer (Menlo Park), St. C. Simpson (Stanford) and J.F.A.K. van Benthem (Amsterdam).

The present volume contains the papers of the invited talks as they were made available by the authors, together with a paper by G. Kreisel (Salzburg), who was invited for a main lecture but was unable to attend the meeting. All papers have been refereed. Abstracts of most of the contributed papers may be found in the report of the conference in the Journal of Symbolic Logic.

The conference was financially supported by the Comisión Interministerial de Ciencia y Tecnología (CICYT, Ministerio de Educación y Ciencia, Madrid), Fundación Banco Exterior (Madrid), Consejo Superior de Investigaciones Científicas (Madrid), the Division of Logic, Methodology and Philosophy of Science of the International Union of History and Philosophy of Science, Universidad Complutense (Madrid), Universitat de les Illes Balears and the Junta de Andalucía. We wish to thank the institutions mentioned above for their generous support. We also owe a debt of gratitude to J.J. Acero and Carmen Garcia-Trevijano for their invaluable help during the organization of the Colloquium.
We dedicate the present volume to the memory of John R. Myhill (1923-1987) as a token of esteem for him and his scientific work.
H.-D. Ebbinghaus
J. Fernandez-Prida
M. Garrido
D. Lascar
M. Rodriguez Artalejo

John R. Myhill (1923-1987)
(Photograph reproduced by courtesy of Mr. and Mrs. Lejaren Hiller)
CONTENTS

Preface

Photograph of John R. Myhill (1923-1987)

Model Theory of Henselian Valued Fields
F. Delon

On the Foundations of Mathematics in 1987
F.R. Drake

Logic and Natural Language Systems
J.E. Fenstad

Model Theory of Regular and Compact Spaces
J. Flum

Categoricity and Permutation Groups
W. Hodges

Unidimensional Theories. An Introduction to Geometric Stability Theory
E. Hrushovski

Unbounded Filters on ω
J.I. Ihoda

Type Theory and Explicit Mathematics
G. Jäger

An Introduction to Extenders and Core Models for Extender Sequences
P. Koepke

Logical Aspects of the Axiomatic Method: On Their Significance in (Traditional) Foundations and in Some (Now) Common or Garden Varieties of Mathematics
G. Kreisel

On the Use of Diagonally Nonrecursive Functions
A. Kučera

Some Aspects of Impredicativity: Notes on Weyl's Philosophy of Mathematics and on Today's Type Theory
G. Longo

General Logics
J. Meseguer

Semantic Parallels in Natural Language and Computation
J. van Benthem
Model Theory of Henselian Valued Fields
Françoise Delon
Many papers on the model theory of valued fields have now been written, most of which are rather technical. We would like to make this account as accessible as possible, and for this reason we choose to avoid giving the statement of the most complicated results. Instead we will emphasize the underlying ideas, and more understandable corollaries. Some results will not be mentioned at all. One can find a fairly exhaustive bibliography in [W2], and an excellent survey about the case of finite ramification in [M]. Because of our emphasis on simplicity, in parts I and II we will consider only fields of residual characteristic 0. The added complications in the case of positive characteristic are algebraic rather than model theoretic.
A valued field will be denoted by $(K,v)$, with $vK$ its value group, which is an ordered abelian group (not necessarily archimedean), and $K/v$ its residue field. The basic definitions and results may be found in most algebra textbooks. We are going to study these structures in the language

$L_0 = \{0, 1, +, -, \cdot, {}^{-1}, A\}$,

where $A$ is a unary predicate with the interpretation: $A(x) \leftrightarrow v(x) \ge 0$. As one can see, we do not include a cross section in the language. We will briefly explain in the appendix what happens in the case where we adjoin a cross section.
Suppose that:
R is a theory of fields of characteristic 0 in the language $\{0, 1, +, -, \cdot\}$;
V is a theory of non trivial ordered abelian groups (or "o.a.g.") in the language $\{0, +, \le\}$.
We now define theories in $L_0$:
T(R,V) is the theory of valued fields $(K,v)$ such that $K/v \models R$ and $vK \models V$,
TH(R,V) = T(R,V) + "v is henselian",
where (using the fact that the residual characteristic is 0) "henselian" means "algebraically closed modulo the given $K/v$ and $vK$", i.e. more precisely:

$(K,v)$ henselian $\iff$ for all algebraic extensions $(L,w) \supseteq (K,v)$ which satisfy $K/v = L/w$ and $vK = wL$ (for the natural embeddings $K/v \subseteq L/w$ and $vK \subseteq wL$), we have $K = L$.
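For orientation, the characterization above can be set beside the more familiar root-lifting form of Hensel's lemma. The following is only a sketch in notation that is not taken from the text: $A$ is the valuation ring, $\bar a$ is the residue of $a$, and $\bar f$ is the reduction of $f$. That the two forms agree in residual characteristic 0 is standard background, assumed here without proof.

```latex
% Root-lifting form of "henselian" (standard background, assumed here):
\[
  f \in A[y],\quad a \in A,\quad \bar f(\bar a) = 0,\quad \bar f'(\bar a) \neq 0
  \;\Longrightarrow\;
  \exists\, b \in A \ \bigl( f(b) = 0 \ \wedge\ \bar b = \bar a \bigr).
\]
% Worked instance in \mathbb{R}((X)) with the X-adic valuation:
% take f(y) = y^2 - (1 + X). Then \bar f(y) = y^2 - 1 has the simple root 1,
% so 1 + X has a square root b with residue 1:
\[
  b = 1 + \tfrac{1}{2}X - \tfrac{1}{8}X^2 + \cdots, \qquad b^2 = 1 + X .
\]
```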
The Ax-Kochen-Eršov principle asserts the transfer of properties of the theories R and V to TH(R,V). For example we have:

R and V complete $\Rightarrow$ TH(R,V) complete
R and V decidable $\Rightarrow$ TH(R,V) decidable
R and V model complete $\Rightarrow$ TH(R,V) model complete
R model companion of $R_1$ and V model companion of $V_1$ $\Rightarrow$ TH(R,V) model companion of T($R_1$,$V_1$).

On the contrary:

R model completion of $R_1$ and V model completion of $V_1$ $\not\Rightarrow$ TH(R,V) model completion of T($R_1$,$V_1$) (not even of TH($R_1$,$V_1$)).
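As an illustration of the completeness transfer, one may take a concrete pair of theories. The choice of R, V and of the two fields below is ours and does not come from the text; the completeness of the two theories is standard background, assumed here.

```latex
% R = ACF_0 (algebraically closed fields of characteristic 0), complete.
% V = Th(\mathbb{Z}; 0, +, \le), complete, and a theory of non trivial o.a.g.
\[
  R \text{ and } V \text{ complete} \;\Longrightarrow\; TH(R,V) \text{ complete}.
\]
% Both fields below are henselian for the X-adic valuation v_X, with residue
% fields \mathbb{C} and \overline{\mathbb{Q}} (models of R) and value group
% \mathbb{Z} (a model of V), so both are models of TH(R,V). Hence
\[
  \bigl(\mathbb{C}((X)), v_X\bigr) \;\equiv\; \bigl(\overline{\mathbb{Q}}((X)), v_X\bigr)
  \quad\text{in the language } L_0 .
\]
```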
Let us give a counter-example:
R = $R_1$ = the theory of real closed fields,
$V_1$ = the theory of o.a.g.,
V = the theory DOAG of divisible o.a.g.
The field $\mathbb{R}((X))$ with the natural valuation is a model of TH(R,$V_1$), and

$\mathbb{R}((X^{\mathbb{Q}})) := \bigcup_{n \in \mathbb{N}^*} \mathbb{R}((X^{1/n}))$

a model of TH(R,V). There exist several embeddings of $\mathbb{R}((X))$ in $\mathbb{R}((X^{\mathbb{Q}}))$; some of these embeddings are contradictory: for example, one can send X to X or to -X; in the first embedding X becomes a square, but not in the second one. More generally we are going to see the crucial role played by the predicates $P_n(x)$, which mean "x is an nth power".
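The claim about squares can be checked directly. The computation below is a sketch of our own, writing $\mathbb{R}((X^{\mathbb{Q}}))$ for the union of the fields $\mathbb{R}((X^{1/n}))$; the symbols $a$, $q$ and $\varepsilon$ are introduced only for this purpose.

```latex
% Every nonzero y in \mathbb{R}((X^{\mathbb{Q}})) can be written with its
% leading term factored out:
\[
  y = a\,X^{q}\,(1+\varepsilon), \qquad a \in \mathbb{R}^{*},\ q \in \mathbb{Q},\ v(\varepsilon) > 0 .
\]
% X^{q} = (X^{q/2})^2 is a square, and 1+\varepsilon is a square because the
% field is henselian (t^2 - 1 has the simple root 1 in the residue field).
% Conversely the leading coefficient of a square is a square in \mathbb{R}. So
\[
  y \text{ is a square} \iff a > 0 .
\]
% For the two embeddings X \mapsto X and X \mapsto -X this gives:
\[
  X = \bigl(X^{1/2}\bigr)^{2} \ \text{is a square}, \qquad
  -X \ \text{is not, since its leading coefficient is } -1 < 0 .
\]
```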
The link between model completion and quantifier elimination ("q.e.") is well known: T has q.e. iff T is the model completion of its universal part $T_\forall$. The previous counter-example might suggest that q.e. does not transfer from R and V to TH(R,V). In fact: R and V have q.e. $\Rightarrow$ TH(R,V) has q.e., but this is not what we want in a transfer result, because the assumptions are too strong. Indeed:
R has q.e. in $\{0, 1, +, -, \cdot\}$ $\Rightarrow$ R is the theory of algebraically closed fields (of characteristic 0),
V has q.e. in $\{0, +, \le\}$ $\Rightarrow$ V = DOAG,
hence TH(R,V) is the theory of algebraically closed fields with non trivial valuation and residual characteristic 0, which indeed has q.e. A meaningful transfer result must say something about structures whose residue fields and value groups admit q.e. in an enriched language.

Notation. $R_0$ is the theory of fields in $L(R_0) = \{0, 1, +, -, \cdot\}$. $R \supseteq R_0$ a theory in a language $L(R) \supseteq L(R_0)$. $V_0$ is the theory of o.a.g. in $L(V_0) = \{0, +, \le\}$ [...] $\to M^*(\bar y)$;
this enables us, for $(K,v) \models T(R,V)$, to define on $K/v$ an $L(R)$-structure in the following way: $K/v \models M(\bar x/v)$ iff $(K,v) \models M^*(\bar x)$;
we add then the following axioms:
3. $K/v \models R$;
4. for each $M \in L(V) - L(V_0)$:
$M^*(\bar x) \to \bigwedge_i x_i \neq 0$,
$[M^*(\bar x) \wedge \bigwedge_i v(x_i) = v(y_i)] \to M^*(\bar y)$;
hence, for $(K,v) \models T(R,V)$, $vK$ has a natural $L(V)$-structure defined as follows: $vK \models M(v(\bar x))$ iff $(K,v) \models M^*(\bar x)$, and we add:
5. $vK \models V$.
Now we take TH(R,V) = T(R,V) + "v is henselian". With these enriched languages the transfer of the notions of completeness, model completeness, decidability and model companion remains valid, but not q.e. The actual q.e. result is in fact rather complicated. We are going to need the following notation: if P is a quantifier free formula in L(R) with s+r variables and if $n_1, \dots, n_s$ are positive integers, then we let $F_{P,n_1,\dots,n_s}$ be a new predicate symbol of arity s+r, and we [...]
This result has two nice corollaries.

Corollary 1 (Cherlin-Dickmann). Suppose that R and V have quantifier elimination and that for all $k \models R$ and all integers n, the multiplicative group $k^{*n}$ has finite index $i_n$ in $k^*$. Then in the language

$L(R,V) \cup \{P_n ; n \in \omega\} \cup \{c_{n,j} ; n \in \omega, 1 \le j \le i_n\}$

the theory

TH(R,V) $\cup\ \{P_n(x) \leftrightarrow \exists y\ y^n = x ; n \in \omega\}\ \cup$ { the residues of $c_{n,1}, \dots, c_{n,i_n}$ represent the different classes of $k^*/k^{*n}$ ; $n \in \omega$ }

has q.e.
Corollary 2. There is an extension T of TH(R,V) by definition, in a language L, such that:
- T has q.e.
- any symbol of $L - \{0, 1, +, -, \cdot\}$ is relational and its interpretation in the models is stable under the equivalence $\sim$: $x \sim y$ iff $v(x) = v(y) < v(x-y)$.

Remark. The previous theorems of quantifier elimination are very sensitive to slight modifications of the language. For example it is often useful to include rings among the structures of the considered language: this corresponds to eliminating ${}^{-1}$ from the language. It is known classically that we can not do so without some care: algebraically closed valued fields do not have q.e. in $L_1 = \{0, 1, +, -, \cdot, A\}$. Indeed the formula $v(x) \ge v(y)$, which is equivalent to $xy^{-1} \in A$ in $L_0$, can not be expressed quantifier-free in $L_1$. But this is the only problem, in the sense that there is q.e. in $\{0, 1, +, -, \cdot, D\}$ where $D(x,y)$ is interpreted as the relation $v(x) \ge v(y)$. This can not be generalized: for example the Cherlin-Dickmann theorem is not true if we remove ${}^{-1}$ and replace A by D. The problem arises in the following situation: let $(K,v)$ be a valued field, A its valuation ring, M its maximal ideal, and B a subring of K; if we define $B/v$ by $(A \cap B)/(M \cap B)$, then often the quotient field $Q(B/v)$ is strictly included in $Q(B)/v$. For example in the model $k((X))$, the ring $\mathbb{Q}[X, tX]$, $t \in k - \mathbb{Q}$, does not contain any element with constant term t, while t is definable over this ring. The previous lifting of the residual language does not carry enough information. We have to do this lifting more carefully, in order to get a statement not requiring ${}^{-1}$ in the language: to each new residual predicate $M(x_1, \dots, x_n)$ we have associated $M^*(y_1, \dots, y_n)$ such that

$M^*(y_1, \dots, y_n)$ iff $M(y_1/v, \dots, y_n/v)$;

we must now associate $M^*(y_1, \dots, y_n, z_1, \dots, z_n)$ such that

$M^*(y_1, \dots, y_n, z_1, \dots, z_n)$ iff $M(y_1 z_1^{-1}/v, \dots, y_n z_n^{-1}/v)$.
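The example of the ring $\mathbb{Q}[X, tX]$ in the remark can be written out step by step. This is a sketch of our own; we write $\mathcal{M}$ for the maximal ideal (to keep it apart from the predicate symbols $M$) and $B$ for the subring, and we read $Q(B)/v$ as the set of residues of the elements of $Q(B)$ lying in $A$.

```latex
% B = \mathbb{Q}[X, tX] \subseteq k((X)), with t \in k \setminus \mathbb{Q}.
% Every element of B is a polynomial in X and tX with rational coefficients,
% so B \subseteq A and every constant term is rational:
\[
  B/v \;=\; (A \cap B)/(\mathcal{M} \cap B) \;\cong\; \mathbb{Q},
  \qquad\text{hence}\qquad Q(B/v) = \mathbb{Q} .
\]
% But the quotient field of B contains t, which is its own residue:
\[
  t \;=\; \frac{tX}{X} \in Q(B),
  \qquad\text{so}\qquad t/v \in Q(B)/v \setminus Q(B/v) .
\]
```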
We have mentioned that q.e. for R and V does not necessarily imply q.e. for TH(R,V), which means that the definable subsets of $(K,v)$ can not be obtained in a simple way from those of $vK$ and $K/v$, where $(K,v)$ is a model of TH(R,V). On the other hand, there is a very simple converse:

A part of $(K/v)^n$ which is interpretable in $(K,v)$ is already definable in $K/v$, uniformly in the parameters. The same holds for the value group.

This follows from Weispfenning's theorem of q.e. in the language with cross-section (see appendix).
II. AN APPLICATION
We want now to develop one application of quantifier elimination results, an application which is surprising in two aspects:
- it concerns the p-adics, while the q.e. theorem was stated for valued fields with residual characteristic 0,
- it describes, instead of definable subsets, what we call sets definable with external parameters: given a structure M, $A \subseteq M^n$ is definable in M with external parameters if there exist an $N \succ M$ and a definable B in N such that $A = B \cap M^n$.
One can characterize stable theories by the fact that sets definable with external parameters are always definable. It may happen that some models of unstable theories have the same property. In particular Lou van den Dries proved that this is the case for the field $\mathbb{R}$ and asked the question for the $\mathbb{Q}_p$'s.

Theorem 1. Any subset of $\mathbb{Q}_p^n$ definable with external parameters is definable in $\mathbb{Q}_p$. The field $\mathbb{Q}_p$ is the only one of its elementary class having this property.

Let us concentrate on the first assertion of the theorem, and give an idea of the proof. Details may be found in [D1]. We are given an extension $(M \subseteq N, v)$ of valued fields, $G$ the convex closure of $vM$ in $vN$. The convex subgroups are the sub-objects which give rise to quotients in the category of o.a.g., hence we can consider the composition

$w: N \xrightarrow{\ v\ } vN \xrightarrow{\ \pi\ } vN/G$,

where $\pi$ is the canonical projection. The map w is again a valuation and is trivial on M, so M may be identified with M/w and embeds into N/w. We define now the following two properties.
1. This embedding $M \to N/w$ is an isomorphism.
2. Any subset of $M^n$ definable with parameters in $(N,v)$ is definable in $(M,v)$.

Theorem 2. If $(M,v)$ and $(N,v)$ are henselian of characteristic 0, then 1 $\Rightarrow$ 2.
Let us deduce theorem 1 from theorem 2. To do so, let us return to condition 1. We used the convex closure of $vM$ in $vN$ to define w. More generally, for every convex subgroup H of $vN$, we define $w: N \xrightarrow{\ v\ } vN \xrightarrow{\ \pi\ } vN/H$ and hence get a residue field N/w. In fact N/w keeps track of the original (finer) valuation v: we can define a valuation denoted by v/w on N/w by setting $(v/w)(x/w) = v(x)$, for x in N with $w(x) = 0$. Then $(v/w)(N/w) = H$ (because $w(x) = 0 \iff v(x) \in H$), and $(N/w)/(v/w) = N/v$ (this simplification justifies our quotient notation v/w). We saw that for $H = G$, w is trivial on M and hence M/w is identical to M, furthermore (M/w, v/w) is identical to (M,v). Therefore we have an embedding of the whole valued structure (M,v) into (N/w, v/w).

Let $(M,v) = (\mathbb{Q}_p, v_p)$. Then for all $(N,v) \succ (M,v)$, 1. holds. Indeed $vM = \mathbb{Z}$, which is convex in any elementary extension, hence $G = vM$. Then the valued field (N/w, v/w) has value group $vM$ and residue field $N/v = \mathbb{F}_p = M/v$. Hence the inclusion $(M,v) \subseteq (N/w, v/w)$ is immediate. But the value group is $\mathbb{Z}$ and (M,v) is complete, hence we have equality. Then apply theorem 2. □
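The passage from $v$ to the pair $(w, v/w)$ can be checked on a small rank-two example. The field and valuation below are chosen by us for illustration and do not come from the text; $k$ is any field of characteristic 0, and the order on $\mathbb{Z} \times \mathbb{Z}$ is lexicographic with the first coordinate dominant.

```latex
% N = k((X))((Y)), with the rank-two valuation v determined by
%   v(X^a Y^b) = (b, a) \in \mathbb{Z} \times \mathbb{Z}.
\[
  vN = \mathbb{Z} \times \mathbb{Z}, \qquad
  H = \{0\} \times \mathbb{Z} \ \text{(a convex subgroup)}, \qquad
  vN/H \cong \mathbb{Z} .
\]
% The coarsening w = \pi \circ v is the Y-adic valuation:
\[
  w(X^a Y^b) = b, \qquad N/w = k((X)) .
\]
% The induced valuation v/w on N/w is the X-adic valuation of k((X)):
\[
  (v/w)(N/w) = H \cong \mathbb{Z}, \qquad (N/w)/(v/w) = k = N/v .
\]
```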
Proof of theorem 2. We are going to apply quantifier elimination results in a structure (N,w) defined as follows: it is the valued field (N,w) with an additional structure on N/w: N/w = (N/w, v/w). The structure (N,v) is interpretable in (N,w): $v(x) \ge 0$ iff ( $w(x) \ge 0$ and $(v/w)(x/w) \ge 0$ ). Notice that the residual characteristic of w is 0 since M embeds in N/w, and that w is henselian because it is coarser than v which is henselian. Hence the quantifier elimination applies:

(N,w) has q.e. in a language $\{0, 1, +, -, \cdot\} \cup \{F ; F \in \mathcal{F}\}$ where the F's represent predicates stable under the equivalence $\sim$.
For $P \in N[\bar x]$ and such an F, $\{\bar x ; (N,w) \models F(P(\bar x))\}$ is not stable under $\sim$. But we can reduce ourselves to consider sets stable under $\sim$ by the following.

Proposition 2. Let $(M \subseteq N, w)$ be such that w is henselian on N and trivial on M, and M is N/w. Let A be a subset of $N^n$ which is definable in (N,w). Then $A \cap M^n$ is a Boolean combination of sets $B \cap M^n$, where B is definable in (N,w) and either Zariski closed, or stable under $\sim$.

No problem arises with Zariski closed sets, as we have the following.

Lemma 3. If the extension of fields $M \subseteq N$ is separable, and B is Zariski closed in $N^n$, then $B \cap M^n$ is Zariski closed in $M^n$.

End of the proof of theorem 2. Let A be a subset of $M^n$ definable with parameters in (N,v), hence in (N,w). We have to prove it is definable in (M,v). By proposition 2 we restrict ourselves to the case $A = B \cap M^n$ where B is either
- Zariski closed, then $B \cap M^n$ is Zariski closed in $M^n$ by lemma 3,
- or stable under $\sim$. In this case we can define B/w as follows: for $\bar x \in (N/w)^n$,
$\bar x \in B/w$ iff $\exists \bar y \in N^n$, $\bar x = \bar y/w \wedge \bar y \in B$ iff $\forall \bar y \in N^n$, $\bar x = \bar y/w \to \bar y \in B$.
Under hypothesis 1, we have $f: M \cong N/w$, then $f(B \cap M^n) = B/w$; B/w is now a part of $(N/w)^n$ definable in (N,w), hence in (N/w) alone. Hence $f^{-1}(B/w)$ is definable in $f^{-1}(N/w)$, which means: A is definable in (M,v). □
APPENDIX

A cross section of the valued field (K,v) is a section of the valuation considered as a morphism from $(K^*, \cdot)$ onto $vK$, that is a map $\pi: vK \to K$ such that $\pi(a+b) = \pi(a) \cdot \pi(b)$ and $v \circ \pi(a) = a$. Results concerning these structures can be expressed in a nice way in a language with 3 sorts of variables, for elements of K, vK and K/v. Let us then define

LS(R,V) = $\{0, 1, +, -, \cdot\}$ (concerning variables from K) $\cup\ L(R) \cup L(V) \cup \{v, \pi, \mathrm{res}\}$,

where L(R) concerns variables from K/v and is an extension of $\{0, 1, +, -, \cdot\}$, L(V) concerns variables from vK and contains $\{0, +, -, \le, \infty\}$, and v, $\pi$ and res are function symbols, $v: K \to vK \cup \{\infty\}$, $\pi: vK \cup \{\infty\} \to K$, $\mathrm{res}: K \to K/v$. Now we consider the following theories in the language LS(R,V):

TS(R,V) = T(R,V) + "$\pi(a+b) = \pi(a) \cdot \pi(b)$" + "$v \circ \pi(a) = a$"
THS(R,V) = TH(R,V) $\cup$ TS(R,V)
where, to be accurate, we should really have taken, instead of T(R,V) and TH(R,V), their translation in LS(R,V), when v, $\pi$ and res have their usual interpretation (res(x) = x/v if $v(x) \ge 0$ and is not defined otherwise). Things go more smoothly in this language than in L(R,V), in the sense that, as before, completeness, model completeness and decidability transfer from R and V to THS(R,V), and in this case, so does quantifier elimination, see [W1]. Hence why not work in this language? There are several reasons.
- A valued field does not necessarily have a cross section.
- Even if it has one (which is the case for natural models), adding this cross section to the language greatly enriches its original valued field structure. In particular in positive characteristic this can lead to radically different behaviours of the two theories: there are cases where TH(R,V) is decidable, while THS(R,V) is not (provided that we replace the property of being henselian by a stronger one) [K], [M].
Nevertheless every valued field has an elementary extension which has a cross section. This shows that THS(R,V) is a conservative extension of TH(R,V), hence completeness, model completeness and decidability of THS(R,V) will imply that of TH(R,V). On the contrary, it does not seem possible to deduce quantifier elimination in TH(R,V) from that in THS(R,V).

Let us finish with an explanation of why some of the problems which arose when we were working in the language L(R,V) disappear in the new language.
1. Let us look back at the counter-example at the end of Part I: if we choose $n \mapsto X^n$ as a section of $\mathbb{R}((X))$ and $q \mapsto X^q$ as a section of $\mathbb{R}((X^{\mathbb{Q}}))$, then X can only be sent to an element of the form $X^q$.
2. The language LS(R,V) does not include the inverse. But if an LS(R,V)-structure A contains x, it must also contain $\pi(-v(x))$. This ensures that a reasonable number of elements of A are invertible in this ring.
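Item 1 can be spelled out as a short computation. The sketch below is ours: the names $f$ and $g$ for the field component and the value-group component of an embedding, and $\pi'$ for the section on the larger field, are introduced only for illustration.

```latex
% Sections as in item 1:
%   \pi(n) = X^n on \mathbb{R}((X)),   \pi'(q) = X^q on \mathbb{R}((X^{\mathbb{Q}})).
% An embedding of LS(R,V)-structures includes a field embedding f and an
% embedding of ordered groups g : \mathbb{Z} \to \mathbb{Q}, commuting with
% the section symbol:
\[
  f \circ \pi = \pi' \circ g .
\]
% Applied to 1 \in \mathbb{Z}, with q = g(1) > 0:
\[
  f(X) = f(\pi(1)) = \pi'(g(1)) = X^{q} = \bigl(X^{q/2}\bigr)^{2} .
\]
% So f(X) is a square under every such embedding, and X \mapsto -X is excluded.
```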
[D1] F. Delon, Définissabilité avec paramètres extérieurs dans $\mathbb{Q}_p$ et $\mathbb{R}$, to appear in the A.M.S. Proceedings.
[D2] F. Delon, Elimination des quantificateurs dans les corps valués, in preparation.
[DR] F. Delon et Y. Rouani, Indécidabilité de corps de séries formelles, to appear in the J.S.L.
[K] F.-V. Kuhlmann, Henselian Function Fields, Dissertation, Heidelberg, in preparation.
[M] A. Macintyre, Twenty years of p-adic fields, in Logic Colloquium '84, Paris, Wilkie and Wilmers (eds.), North-Holland, Amsterdam 1986.
[W1] V. Weispfenning, On the elementary theory of Hensel fields, Annals Math. Logic 10 (1976), 59-93.
[W2] V. Weispfenning, Quantifier elimination and decision procedures for valued fields, in Models and Sets, Müller and Richter (eds.), Springer-Verlag, Berlin, 1984.
Françoise Delon
Université Paris 7
CNRS, UA 763
45-55 5ème étage
2 place Jussieu
75251 Paris Cedex 05
FRANCE
On the Foundations of Mathematics in 1987
Frank R. Drake
Department of Pure Mathematics, University of Leeds.
1.1 Introduction. It is my intention in this paper to set out what I see as the current state of the foundations of mathematics, and in particular the implications of the results of the programme of Reverse Mathematics, as initiated by Harvey Friedman (see [F74]) and continued by Stephen G. Simpson and others. I consider that there are many such implications, and that these make much of what was written in the past on the philosophy of mathematics obsolete. (Thus to me, many parts of the Benacerraf-Putnam volume of readings in the philosophy of mathematics seem far from relevant today. To be fair, it might be argued that that was true when they were written; but it is clearer today.)
1.2 Foundations of Mathematics. I am first going to make a claim: namely, that any piece of mathematics can be considered as fully formalized in some first order system, and should be so considered from the point of view of foundations. I think there is clear evidence that the way in which doubts (about a piece of mathematics) are resolved, is that the doubtful notions or inferences are refined and clarified to the point where they can be taken as proofs and definitions from existing notions, within some first order theory (which may be intuitionistic, non-classical, or category-theoretical, but in mainstream mathematics is nowadays usually some part of set theory, at least in the final analysis). So I shall work from this claim, that any established piece of mathematics is within some first order formal system, and so involves some assumptions which form a system of axioms within first order logic. For any such system, we can ask about the strength of these assumptions. One natural way to do this is via the quasi-ordering induced by

$T_1 \cup \mathrm{con}(T_1) \vdash \mathrm{con}(T_2)$

(at least for systems with sufficient arithmetic for con(T) to be formalizable in the system). The equivalence classes of this quasi-ordering are measures of “consistency strength”, and so consistency strength enters
as a first measure for any reasonably strong piece of mathematics. It is not the only question, but it is a starting point, to measure how much set theory is involved. [The only pieces of mathematics to which it does not obviously apply are the very weak systems which are involved with practical questions in computing, at levels where exponentiation is not available; at these levels no distinctions seem to be available to measure consistency strength. All such systems appear to be equally weak, using any measures known to me.]

Now, S. Shapiro [SH85] has argued that second order logic is the right logic for mathematics, and in one sense I think he is right. How does this square with my claim? Simply that I regard second order logic and mathematics as amounting to the same thing, and to study either, we study first order approximations. I think that this view is best regarded as part of my philosophy of mathematics, and I think it worthwhile here to draw a working distinction between foundations of mathematics and philosophy of mathematics:

Foundations of Mathematics is (or at least involves) the discussion of the first order systems within which mathematics is set out.

Philosophy of Mathematics is (or at least involves) the justification for accepting or rejecting those systems.

Thus discussion of which notions are needed for presenting a piece of mathematics would fall under foundations, as does the notion of consistency strength. But discussion of what sort of objects those notions actually are, and arguments for the consistency of the systems involved, is part of the philosophy of mathematics. In terms of this distinction, most of this paper will be about the foundations of mathematics; but I shall point out some implications for the philosophy of mathematics.

2.1 The range of consistency strengths.
Although the range is infinite and not linearly ordered, most of the systems that have been found interesting for presenting pieces of mathematics fit into a line between the weakest systems considered (which might be some restricted part of primitive recursive arithmetic PRA, but here I shall take it to be the whole of PRA) and the strongest, which is the inconsistent theory 0 = 1. It should be noted that none of these systems can be proved consistent except by assuming the consistency of a stronger theory, and in that sense there is plenty of work for the philosophy of mathematics. The best known formal system for mathematics is probably Zermelo-Fraenkel set theory ZFC; but there may well be comparatively few pieces of mathematics which actually require the full consistency strength of ZFC. Of course, several are known that require even more, and large cardinals give interesting stages above ZFC (as noted in
[DSS]). But I shall be concerned in this paper with much weaker systems, and in particular with the subsystems of second order arithmetic which are the main focus of the programme of reverse mathematics. These systems are simplest taken as extensions of PRA. PRA has variables for natural numbers ($m, n, p, \dots, x, y, \dots$), and symbols for each primitive recursive function, together with recursion equations as axioms defining each one. PRA also has induction for open formulas. To this we add second order variables $X, Y, Z, \dots$ for subsets of the natural numbers; the membership relation $\in$, and the induction axiom in the form

$(0 \in X \wedge \forall n\,(n \in X \to n+1 \in X)) \to \forall n\,(n \in X)$
This is usually called restricted induction, since it is restricted to the sets which are available in the particular system; in general, induction will not be available for all formulas. Now for the system RCA$_0$ (Recursive Comprehension), which forms a convenient base system for much of reverse mathematics, we add the comprehension axiom scheme for $\Delta^0_1$ formulas: if $\varphi(x)$ is $\Sigma^0_1$ and $\psi(x)$ is $\Pi^0_1$, we add

$\forall x\,(\varphi(x) \leftrightarrow \psi(x)) \to \exists X\,\forall x\,(x \in X \leftrightarrow \varphi(x))$

(so for $\Delta^0_1$ formulas $\varphi(x)$, $\{x \mid \varphi(x)\}$ exists); and the scheme of $\Sigma^0_1$ induction:

$(\varphi(0) \wedge \forall n\,(\varphi(n) \to \varphi(n+1))) \to \forall n\,\varphi(n)$

for $\Sigma^0_1$ formulas $\varphi(x)$. In both schemes, the formulas are allowed to have free set variables $Y, Z, \dots$ as parameters. The power of RCA$_0$ is summed up by saying: all recursive sets (of natural numbers) exist, and if other sets are given as parameters, then any sets Turing reducible to them will also exist. Thus RCA$_0$ can carry out a recursive definition with given sets as parameters, and in particular can carry out primitive recursions; we shall use this later. The systems ACA$_0$ and $\Pi^1_1$-CA$_0$ are now very simply described. These add to RCA$_0$ a stronger comprehension scheme

$\exists X\,\forall x\,(x \in X \leftrightarrow \varphi(x))$

(i.e. $\{x \mid \varphi(x)\}$ exists), where, for ACA$_0$, $\varphi(x)$ is any arithmetic formula; and for $\Pi^1_1$-CA$_0$, $\varphi(x)$ is any $\Pi^1_1$ formula (again allowing parameters in both cases). [ACA$_0$ is an extension of Peano Arithmetic PA, which can be thought of as PRA plus the induction scheme for all arithmetic formulas; in fact ACA$_0$ is conservative over Peano Arithmetic.]
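The difference between recursive and arithmetic comprehension described above can be seen on two small instances. Both are our own illustrations, not examples from the text; the symbol $T$ denotes Kleene's T-predicate, which is standard background and is assumed here.

```latex
% Recursive (\Delta^0_1) comprehension: the set of even numbers.
\[
  \varphi(x) :\equiv \exists y\,(x = 2y) \quad (\Sigma^0_1), \qquad
  \psi(x) :\equiv \forall y\,(x \neq 2y+1) \quad (\Pi^0_1).
\]
% Since \forall x\,(\varphi(x) \leftrightarrow \psi(x)) is provable, the scheme gives
\[
  \exists X\, \forall x\,\bigl(x \in X \leftrightarrow \exists y\,(x = 2y)\bigr).
\]
% Arithmetic comprehension: the halting set is defined by a \Sigma^0_1 formula
% that has no equivalent \Pi^0_1 form, so its existence is an instance of the
% stronger scheme of ACA_0 rather than of recursive comprehension:
\[
  K = \{\, e \mid \exists s\; T(e,e,s) \,\}.
\]
```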
Two other systems, intermediate between RCA$_0$ and ACA$_0$ and between ACA$_0$ and $\Pi^1_1$-CA$_0$, have been shown to have particular interest for reverse mathematics (and so for measuring the amount of set theory used). The first of these is WKL$_0$, which adds to RCA$_0$ König's lemma for 0-1 trees. If T is an infinite set of finite 0-1 sequences, closed under subsequences (so that T forms an infinite binary tree of 0-1 sequences), then WKL$_0$ says that T has an infinite branch; i.e. there is a set Y such that for each n, the characteristic function of $Y \cap n$ belongs to T (or equivalently, there is a 0-1 function f such that for each n, $f \restriction n \in T$). WKL$_0$ stands for weak König's lemma. In ACA$_0$, the full König's lemma (for arbitrary finite branching trees) is provable, and indeed ACA$_0$ is equivalent to RCA$_0$ plus full König's lemma (see table 3), so WKL$_0$ is clearly a subsystem of ACA$_0$.

The other intermediate system is ATR$_0$ (for Arithmetic Transfinite Recursion), which adds to RCA$_0$ the statement that any two wellorderings of the natural numbers are comparable. An equivalent formulation is that transfinite recursion can be carried out for arithmetic formulas (this is the original formulation and leads to the name). Arithmetic comprehension amounts to just one step in such a recursion, so ATR$_0$ includes ACA$_0$; and $\Pi^1_1$-CA$_0$ can carry out stronger recursions (in particular it proves that recursions stabilise) and ATR$_0$ is included in $\Pi^1_1$-CA$_0$.

2.2 Reducibilities. We have already mentioned that ACA$_0$ is conservative over Peano Arithmetic for arithmetic formulas: that is, any arithmetic formula provable in ACA$_0$ is already provable in PA. More important from the point of view of foundations are the following:

(i) WKL$_0$ is conservative over PRA for $\Pi^0_2$ formulas.
Since PRA has been given as the explication of Hilbert's intended notion of finitistic (see Tait [T81]), this means that any $\Pi^0_2$ formula provable in WKL$_0$ is finitistically provable; and in this sense, any mathematics proved in WKL$_0$ is finitistically justified. The extent of this (which can be seen as carrying out Hilbert's programme for these parts of mathematics) can be seen in the table of reversals (table 2). It should be noted, for example, that WKL$_0$ can prove the existence of a non-constructible set of natural numbers (or of a non-recursive function), but it cannot define any particular non-constructible set. (ACA$_0$, in contrast, can define such a set; indeed it can define any arithmetic set.)

(ii) ATR$_0$ is conservative over Feferman's system IR, for $\Pi^1_1$ formulas. Feferman [F64,68] gave the system IR as an explication of the notion of predicative (of Poincaré and others), and so we have a sense in which the mathematics of ATR$_0$ is predicatively justifiable; see table 4
On the Foundations of Mathematics in 1987
for the extent of this. Here we can note that ATR₀ can prove the existence of a non-hyperarithmetic set, but cannot define any particular non-hyperarithmetic set (while Π¹₁-CA₀ can define many such, e.g. the complete Π¹₁ set, Kleene’s O).
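The binary trees that WKL₀ speaks about can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: a tree is given by a membership test on finite 0-1 tuples (assumed closed under initial segments), and the names `has_extension_of_length`, `approximate_branch` and `dead_end_tree` are hypothetical. The search uses a bounded lookahead, so it only approximates a branch; deciding which child really has infinitely many extensions is the step that no bounded search can settle in general.

```python
from itertools import product

def has_extension_of_length(in_tree, node, length):
    """True if some extension of `node` with exactly `length` entries is in the tree."""
    extra = length - len(node)
    if extra < 0:
        return False
    return any(in_tree(node + tail) for tail in product((0, 1), repeat=extra))

def approximate_branch(in_tree, length, lookahead):
    """Greedily build a path of `length` bits, preferring 0 over 1.

    A child is accepted only if it still has an extension `lookahead`
    levels further down. Returns None if the search gets stuck.
    """
    node = ()
    for _ in range(length):
        target = len(node) + 1 + lookahead
        for bit in (0, 1):
            child = node + (bit,)
            if in_tree(child) and has_extension_of_length(in_tree, child, target):
                node = child
                break
        else:
            return None  # neither child survives the lookahead
    return node

def dead_end_tree(seq):
    """Hypothetical example tree: everything starting with 1, plus 0, 00, 000."""
    if len(seq) == 0 or seq[0] == 1:
        return True
    return len(seq) <= 3 and all(b == 0 for b in seq)

if __name__ == "__main__":
    # Lookahead 2 is fooled by the finite side path 0, 00, 000.
    print(approximate_branch(dead_end_tree, 5, 2))
    # Lookahead 3 sees past the dead end and follows the infinite part.
    print(approximate_branch(dead_end_tree, 5, 3))
```

The example tree shows why a fixed lookahead is not enough in general: any bound can be defeated by a longer dead end, which is one informal way to see why the existence of a branch is a genuine axiom over RCA₀.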
Table 1. Some consistency strengths.
[Diagram, read from top to bottom: PRA; RCA₀; WKL₀, with RCA₀ + “ω^ω is well-ordered” on a separate branch; Π¹₁-CA₀; Z₂ (2nd order arithmetic); Z (Zermelo); ZFC (Zermelo-Fraenkel); large cardinals; 0=1.]
F.R. Drake
2.3 Further systems. Other systems have been used in reverse mathematics, and two of these included in table 1 are the systems RCA₀ + “ω^ω is well-ordered” and RCA₀ + “ω^(ω^ω) is well-ordered”. (See Simpson [SS?].) These two systems are between RCA₀ and ACA₀, but independent of WKL₀; and they are not finitistically reducible (they are not Π⁰₂ conservative over PRA since they prove, for example, the existence of Ackermann’s function, which is not provable in PRA). There are also weaker systems, such as a restriction of RCA₀ to functions of exponential growth; this is considered, e.g. in Simpson and Smith [SSSS], where equivalents of Σ⁰₁ induction are given. (I do not present these weaker systems here since the philosophical implications seem to me to involve the higher levels.)
2.4. The results of Reverse Mathematics. A selection of these results is given in tables 2 to 5. The point to notice here is that each of these results constitutes an answer to the question “How much set theory is needed for this piece of mathematics?” (or: what consistency strength is needed); and the reversal part of the proof (the proof of the axioms of the system from the mathematical statement) shows that nothing less will do; these are complete answers. One possible question which might remain is whether RCA₀ is the right starting point for this work. I would want to argue that it is; in particular, if any non-constructive step is going to be considered, then all recursive constructions are simpler and should be accepted first. Hence RCA₀ should be admitted as a background theory in all these cases.
Table 2. Equivalents of WKL₀. Each of the following is equivalent to WKL₀, over the system RCA₀:
(a) The compactness theorem for the propositional calculus.
(b) The completeness theorem for the predicate calculus.
(c) Lindenbaum’s lemma.
(d) The sequential Heine-Borel theorem. (Every covering of the closed unit interval by a sequence of open intervals has a finite subcovering.)
(e) Every countable field has a unique algebraic closure.
(f) Every formally real field has a real closure.
(g) Every countable commutative ring has a prime ideal.
(h) Every continuous function on the closed unit interval is bounded.
(i) Every continuous function on the closed unit interval is uniformly continuous.
(j) Every continuous function on the closed unit interval is Riemann integrable.
(k) Every continuous function on the closed unit interval attains a maximum value.
(l) The local existence theorem (of Cauchy/Peano) for solutions of ordinary differential equations.
(m) The Hahn-Banach theorem for separable Banach spaces.
(n) The Σ⁰₁ separation principle: if φ₀(n) and φ₁(n) are both Σ⁰₁ formulas, ¬∃n(φ₀(n) ∧ φ₁(n)) → ∃X ∀n((φ₀(n) → n ∈ X) ∧ (φ₁(n) → n ∉ X)).
Table 3. Equivalents of ACA₀. Each of the following is equivalent to ACA₀, over the system RCA₀:
(a) König’s Lemma.
(b) Every bounded [or: bounded increasing] sequence of real numbers has a least upper bound.
(c) The sequential Bolzano-Weierstrass theorem (every bounded sequence of real numbers has a convergent subsequence).
(d) Every countable vector space has a basis.
(e) Every countable field has a transcendence base.
(f) Every countable Abelian group has a unique divisible closure.
(g) Every countable commutative ring has a maximal ideal.
(h) The Ascoli lemma.
(i) Ramsey’s theorem ω → (ω)³₂ (or ω → (ω)ⁿₖ for any n ≥ 3, k ≥ 2; but note that for the case ω → (ω)²₂, the reversal is an open problem).
Table 4. Equivalents of ATR₀. Each of the following is equivalent to ATR₀, over the system RCA₀:
(a) The perfect set theorem (every tree with uncountably many paths has a perfect subtree).
(b) Any two well-orderings of the natural numbers are comparable.
(c) If any two distinct reals in an arithmetically definable collection of reals are at least one unit apart, then there is a sequence which includes all the reals in that collection.
(d) The Ramsey property for clopen [or: open] subsets of [ω]^ω.
(e) Every uncountable closed [or: analytic] set of reals contains a perfect set.
(f) Any two disjoint analytic sets can be separated by a Borel set.
(g) The domain of any single-valued Borel set in the plane is Borel.
(h) The Ulm structure theorem for countable reduced abelian p-groups.
(i) The determinacy of open [or: clopen] subsets of ω^ω.
(j) The Σ¹₁ separation principle: if φ₀(n) and φ₁(n) are both Σ¹₁ formulas, ¬∃n(φ₀(n) ∧ φ₁(n)) → ∃X ∀n((φ₀(n) → n ∈ X) ∧ (φ₁(n) → n ∉ X)).
Table 5. Equivalents of Π¹₁-CA₀. Each of the following is equivalent to Π¹₁-CA₀, over the system RCA₀:
(a) The perfect kernel theorem: to every tree T, there is a subtree P, perfect (or empty), and a sequence of paths through T, such that every path through T is either through P or is a member of the sequence.
(b) Every bounded arithmetically definable collection of reals has a least upper bound.
(c) The determinacy of boolean combinations of open subsets of ω^ω.
(d) Kondo’s theorem: every co-analytic set in the plane can be uniformized by a single valued co-analytic set.
(e) Silver’s theorem: for every co-analytic [or: F_σ] equivalence relation with uncountably many equivalence classes, there exists a perfect set of inequivalent elements.
(f) Every countable abelian group is the direct sum of a divisible group and a reduced group.
3.1 A further programme. Amongst the various proofs which are involved in these results, some are clearer than others as examples of using the strength of the systems involved. It would seem worthwhile, for pedagogical purposes, to have specific examples which are as clear as can be found, which necessarily use the strength of each of these systems. Probably there should be different examples from different contexts as well as of different strengths. At this point I shall give just one example, which is not ideal in that it is not a result for which any reversal has been proved. It is presented here as an example to show how an argument which is not predicative can be relatively simple and certainly compelling to most mathematicians. It is a proof formalizable in Π¹₁-CA₀ but not in ATR₀, and so presents rather clearly a problem for any philosophy which demands a constructive justification for all of mathematics. (I also mention a further example which requires II; induction as well as Π¹₁-CA₀.) Let 𝒯 be the set of all finite trees (i.e. sets of sequences of natural numbers closed under initial segments, so that all have a root, the empty sequence, and meets are well-defined). Let ≤ be defined by: T₁ ≤ T₂ iff there is a meet-preserving embedding f : T₁ → T₂. Then the theorem is:
3.2 THEOREM (Kruskal). 𝒯 under ≤ is well-quasi-ordered (is WQO), i.e. for any infinite sequence (Ti)i. . . >ak, po>. . . >ps and no, . . . ,nk, mo, . . . .ms 2 1. Then
Of course, the invariants (ao,no,ak) characterizing the homeomorphism type of T(a) can be defined by purely topological
J. Flum
means. We introduce the corresponding concepts, since we are going to use them later on. Let A be a topological space. If X is a subset of A let X' be the set of accumulation points of X. For an ordinal p denote Ao:= A, At:= fl{A'Iq on (Xa,+) isomorphic to the action of V_a on itself by translation. Assume all the X_a’s are disjoint. Then the trivial affine cover N of M (with respect to the family V_a, a∈S) is obtained from M by adding X = ∪_a X_a as a new sort, together with the projection x ↦ a (x∈X_a), and a relation that gives the V_a-affine-space structure of X_a, uniformly in a. Again, each of M, N interprets the other, though they are not bi-interpretable, and N is totally categorical. This is the well-understood part of the theory. After taking a finite or affine cover, however, there is a step over which we have almost no control at all. Let M be totally categorical and N a trivial cover of one of the two types. Then a cover of M with skeleton N is any expansion of N that induces no new structure on M. Typically, N induces no new structure on any fiber. (In the affine case, if any new structure is induced on a generic fiber, then the covering collapses to a simpler one, covering a set of smaller dimension. The finite case could also be defined so as to make a similar statement true, by considering trivial
Unidimensional Theories
covers by a given finite structure.) Nonetheless it is possible for N to include new relations among the different fibers. We illustrate this with an example. Let A be the countable free (Z/9Z)-module. A is totally categorical. From the point of view of our analysis, it looks as follows.
The single element 0 can be disregarded, since we are concerned with A only up to bi-interpretability. Let B = 3A = {x∈A: ∃y∈A, 3y=x}. Then B has naturally the structure of a GF(3)-vector space. Let D = (B−{0})/~ be the corresponding projective space. One can check that the structure induced on D is precisely the projective structure. Going backwards, B−{0} is a (double) cover of D. With each b∈B−{0} is associated the GF(3)-vector space B. A−{0} is an affine cover of B−{0}: the projection map is a ↦ 3a, and the affine structure on a fiber A_b = {y: 3y=b} is given by (x,y) ↦ x+y (x∈B, y∈A_b).
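The affine cover in this example can be checked by hand on a small case. The following Python sketch is an illustration only: it replaces the countable free (Z/9Z)-module by a rank-2 truncation (an assumption made so that everything is finite), and the helper names `add`, `scale`, `fiber` and `acts_regularly` are hypothetical. It verifies that B = 3A has exponent 3, that each fiber of a ↦ 3a has the same size as B, and that B acts on each fiber by translation, with exactly one translation between any two points of the fiber.

```python
from itertools import product

MOD = 9    # coefficients live in Z/9Z
RANK = 2   # assumption: finite-rank stand-in for the countable free module

def add(x, y):
    """Componentwise addition in (Z/9Z)^RANK."""
    return tuple((a + b) % MOD for a, b in zip(x, y))

def scale(k, x):
    """Multiply every component by the integer k, modulo 9."""
    return tuple((k * a) % MOD for a in x)

A = list(product(range(MOD), repeat=RANK))
ZERO = (0,) * RANK
B = sorted({scale(3, a) for a in A})   # B = 3A

def fiber(b):
    """The fiber A_b = {y in A : 3y = b} of the projection a -> 3a."""
    return {y for y in A if scale(3, y) == b}

def acts_regularly(b):
    """Check that translation by B maps A_b to itself, simply transitively."""
    fib = fiber(b)
    stays_inside = all(add(x, y) in fib for x in B for y in fib)
    unique_translation = all(
        sum(1 for x in B if add(x, y1) == y2) == 1
        for y1 in fib
        for y2 in fib
    )
    return stays_inside and unique_translation

if __name__ == "__main__":
    print(len(A), len(B))
    print(all(scale(3, x) == ZERO for x in B))
    print(all(acts_regularly(b) for b in B if b != ZERO))
```

The check reflects the algebra in the text: if 3y = b and x ∈ B then 3(x+y) = 3x + b = b, since 3x is a multiple of 9.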
Note that in both the double cover and the affine cover, our description misses the group structure on the cover; it cannot be fully recovered from the induced structure on the quotient and on each fiber. The problem of classifying the possible structures on a given totally categorical skeleton remains open, even for the case of a single finite cover of a projective space. We do have a global theorem limiting the number of such expansions. The expansions of a given skeleton M are partially ordered by inclusion (of the set of definable relations), with M itself as the minimal element. We know that this partially ordered set is well-founded: there is no sequence M < M₁ < M₂ < ... with M_{n+1} a proper expansion of M_n.
It follows that if M is totally categorical, there exists
E. Hrushovski
a finite set of relation symbols L such that every relation on M is equivalent to some first order combination of the symbols in L. Moreover, upon restricting to the finite language L, M becomes almost finitely axiomatizable in the following sense: there exists a finite set of sentences T₀ of L, such that if M' ⊨ T₀ and card(M') = card(M), then M' ≅ M.
The well-foundedness of the class of expansions is also the key idea here. It follows, in particular, that there are only countably many possible expansions of a given skeleton. This suggests attempting to classify the maximal expansions of a given skeleton. This has been done in the disintegrated case, when the projective space is in fact degenerate, i.e. a structureless set. Even there certain problems associated with finite nilpotent groups prevent us from a full classification of the intermediate expansions. In addition, a complete solution for the case of the double cover of a projective space over GF(2) has recently been announced by Ahlbrandt and Ziegler. The results described in this section can be found in [BL], [Z1], [CHL], [H2]; this sequence can be read without further references, except for a definition of Morley rank (from [M]; also given in §2). [BL] is more general, and will be described in the next section. [Z1] gives the fundamental theorem on the existence of a (possibly degenerate) projective space with no further structure in a totally categorical theory. To appreciate the impact of this result, and the progress it made possible in the area, it is instructive to look at [BCM], where it was nontrivial even to classify the totally categorical rings. [CHL] widened the context and proved the co-ordinatization theorem described above. (In the totally categorical context some version of this was
known to Zil’ber.) [AZ] proved the well-foundedness, and the quasi-finite axiomatizability, in the case where only finite covers are involved (no affine ones). The general case is in [H2, §2]. It remains to be seen whether the final form of the theory will include the combinatorial device used to prove well-foundedness, or whether an explicit understanding of the possible expansions will eventually make it unnecessary. Intermediate generalizations were achieved in [L1] and [C1]. (Still excluding affine covers, but allowing orthogonal covers and several distinct projective spaces.) The classification of the maximal structures in the disintegrated case is in [H2, §4], following work in [L1]. There is no reference yet to the classification in the case of the double cover, 2-element field.
2. ℵ₁-categorical structures.
There are two points of view regarding the results discussed in the previous section. One is to regard ℵ₀-categoricity as the main assumption; then the second assumption, of ℵ₁-categoricity, is somewhat too restrictive, and one can look for a wider context in which similar results are valid. The main tool then is reduction to finite questions, and the classification of the finite simple groups. See [E], [KLM] and [C2] for a development from this direction.
Here we will take the point of view of stability theory. Our goal in this context is a theory of ℵ₁-categorical structures: the success in the locally finite (or ℵ₀-categorical) case is an encouraging, but extremely special case. The guiding philosophy here is Zil’ber’s conjecture.
Roughly speaking it says that an ℵ₁-categorical theory is a geometric theory; it comes either from linear algebra or from algebraic geometry (over an algebraically closed field). There is also a degenerate case; there is somewhat more to it than in the totally categorical case, but we will ignore it here. (It would not matter to the general picture if some bigger class of degenerate structures existed, if it could be isolated and controlled, and one could show that it cannot support any coverings of interest, or otherwise interact with the ambient model.) Let us start by describing the known structures of non-linear type. If p is prime or 0, the theory of algebraically closed fields of characteristic p is complete and ℵ₁-categorical. The theory has quantifier-elimination, but the structure of the definable subsets of Kⁿ is no longer simple; much of algebraic geometry is devoted to describing it. Every finite cover of an ℵ₁-categorical structure is still ℵ₁-categorical, as one easily verifies. The analog of affine coverings is no longer finite-dimensional, however; the Abelian groups V_a of the affine covering must be replaced with arbitrary algebraic groups. (They may be assumed connected and simple or irreducible Abelian.) To be precise, let M be an ℵ₁-categorical structure, built over (K, +, ·, c)_{c∈K₀} (K₀ is a set of distinguished constants). Let S be a sort in M, let G_a (a∈S) be an algebraic group over K, definable uniformly in a. Let X_a be a G_a-set, isomorphic to G_a; assume the X_a’s are pairwise disjoint and disjoint from M, and let X = ∪_{a∈S} X_a. Then the trivial principal cover of M over S associated with the family (G_a) is given as before, by adding X as
a sort, together with the projection and the group action on each fiber. An arbitrary principal cover is obtained by adding structure between the fibers in a way that is invisible both to M and to each fiber. Every known ℵ₁-categorical structure is either degenerate or of linear type (in which case it is very well understood, as will be shown below), or can be obtained from an algebraically closed field by taking finite and principal coverings a finite number of times. Zil’ber’s conjecture states that all ℵ₁-categorical structures are in fact of this form. The usual statement of the conjecture implies another claim. Note that one has a natural class of finite coverings of the field internal to (i.e. definable over) the field, namely the algebraic curves together with a projection to the field. The “B” conjecture states that every finite model-theoretic cover of an algebraically closed field can be realized algebraically, as a reduct of a possibly reducible algebraic curve, possibly with finitely many points added or deleted. This is somewhat reminiscent of Riemann’s existence theorem, namely that every topological cover of the Riemann sphere (punctured in finitely many points) can be realized algebraically. It is a very interesting statement, but not central to our picture of the field, and we shall say no more about it. It would follow from Zil’ber’s (A) conjecture that the above picture is valid not only for the known structures but for all of them. We will now say how much of the picture is known to be true.
Definition. A structure D is strongly minimal if every definable subset of D is uniformly finite or co-finite. The uniformity condition means this:
if R ⊆ Dⁿ×D is definable, and ā∈Dⁿ, let R(ā) = {x: (ā,x)∈R}. Then for each R,
for some integer m, for all ā, if R(ā) is finite then it has at most m elements. This definition mentions only definable subsets of D itself. One deduces from it, however, the possibility of defining the dimension (Morley rank) of a definable subset of Dⁿ. Assume we have defined the dimension of definable subsets of D^m (always an integer ≤ m), and we have: (*) For each i ≤ m, and each definable R ⊆ Dⁿ×D^m, {ā∈Dⁿ: dim(R(ā)) = i} is a definable subset of Dⁿ. Then for a definable E ⊆ D^(m+1) = D×D^m, for each i ≤ m, {a∈D: dim(E(a)) = i} is finite or cofinite; so it is co-finite for exactly one value i₀ ≤ m; and the maximal value is j₀. Define dim(E) = max(i₀+1, j₀). Then one sees easily that (*) holds for m+1 also. This definition depends on the choice of co-ordinates, but it has many good properties that yield invariant characterizations. Morley’s was this: dim(E) = 0 iff E is finite (and nonempty), and dim(E) ≥ n+1 iff there exists an infinite set of pairwise disjoint subsets of E, each of dimension at least n. A definable set E is said to have multiplicity 1 if it cannot be split into two disjoint sets of the same dimension. Any definable subset of Dⁿ is the disjoint union of a finite number of multiplicity 1 sets of the same rank; this number is called the multiplicity of E. The key point about this notion of dimension is that it applies to all definable sets. This can be rephrased as a strong homogeneity property. Let V ⊆ Dⁿ be a definable set of multiplicity 1. Over any set B of parameters, consider the set of all elements of V that do not lie in any
B-definable set of smaller dimension. These are called the generic elements of V (over B). Any two of them look exactly the same over B: given a formula φ(x,b) with b∈B^k, one of {x∈V: φ(x,b)} and {x∈V: ¬φ(x,b)} has smaller dimension than V, and so any two generics are in the other, and φ does not distinguish them. It follows (it can be shown) that there is an automorphism of the model carrying one to the other. In particular, with n=1, any two elements of D not algebraic over a set B are conjugate by an automorphism fixing the algebraic closure of B pointwise. The property used to define dimension can now be stated thus: (**) Let E, B be definable sets of multiplicity 1, and f: E→B a definable map. Assume f is generically surjective, i.e. {f(e): e a generic element of E} contains the set of generic elements of B. Then d(E) = d(B)+K, where K is the dimension of a generic fiber of f (i.e. K = d(f⁻¹(b)) for a generic b∈B). Note that if the strongly minimal set in question is in fact an algebraically closed field, then we can talk about the dimension and the multiplicity of a definable set, but cannot actually identify the irreducible components. To show how serious this problem is we give an alternate statement. If we wish to talk, in the present language, about curves in a plane (say), we must talk about subsets of D² of multiplicity 1, dimension 1. However, such subsets can be curves with finitely many points removed or added. The best we can do is to identify a curve with an equivalence class of definable sets, two sets being equivalent if they differ by finitely many points. So now we have identified the notion of a curve; what we no longer have is the knowledge, for a given curve,
which points lie on it and which ones do not! The notion of a strongly minimal set is our approximation to the idea of an algebraic structure over which geometry can be done. This is the weakest part of our understanding of ℵ₁-categorical theories. Note that an algebraically closed field is strongly minimal, but so is any irreducible curve over the field, and even any trivial finite cover of such a curve. Our present definitions are not sharp enough to make the distinction. The rest of the picture is much clearer. Given an ℵ₁-categorical structure built over a strongly minimal set D, one naturally defines the class of principal coverings of M with respect to definable groups in D^eq, just as in the algebraic case. Then we do know that every ℵ₁-categorical structure can be obtained from some strongly minimal set by a finite sequence of such coverings. There is also a rudimentary theory of the nature of the definable groups over strongly minimal sets; from the theory of algebraic groups one has the notions of connected component, generic type, and of course dimension, and this can be used to go a certain way. In the case where the fibers are 1-dimensional (of multiplicity 1), the group is forced to be 1-dimensional, since we are assuming the action is regular, and this implies that the group is Abelian. But even if the regularity assumption is removed, we know the precise nature of the structural group in this case. It is either Abelian or isomorphic to one of two specific matrix groups (of dimension 2 or 3) over an algebraically closed field. (And in particular, the strongly minimal set can be taken to have a definable structure of an algebraically closed field. Note that
this gives only half of the conjecture, though, since one does not know that the field has no extra structure above the field relations.) Much less is known when the fibers are either 2- or 0-dimensional. To illustrate the use of the notion of dimension associated with a strongly minimal set we describe the “linear” case. Let us first write down some equivalent definitions, and then explain them and the equivalence.
A) For some (equivalently, every) strongly minimal set D of M:
1) Every definable family of k-dimensional subsets of D^l has dimension at most l−k.
2) Let A, B be algebraically closed sets in D^eq. Then A, B are independent over their intersection. Equivalently, d(A)+d(B) = d(A∪B)+d(A∩B).
B)
1) Let F be a family of k-dimensional definable subsets of a definable set S of M. Then d(F)+k = d(S).
2) Let A, B be algebraically closed sets in M^eq. Then A, B are independent over their intersection.
Explanations: If R ⊆ B×D^l is a definable set, let R(b) = {y: (b,y)∈R}. {R(b): b∈B} is called a definable family of k-dimensional subsets of D^l if R(b) is k-dimensional for b∈B. The equivalence relation: b~b' iff (R(b)−R(b'))∪(R(b')−R(b)) has dimension n
/(m)
< g(m).
Let T ⊆ ω^{<ω} be a tree. We call T superperfect if and only if for every σ ∈ T, there exists t ∈ T extending σ such that
i.e., t has infinitely many immediate extensions in T. A subset A ⊆ ω^ω is K_σ-regular if and only if A is σ-bounded or there exists a superperfect tree T such that every branch of T is in A. Again filters on ω produce Qndl K_σ-regular sets.
4. FACT: If ℱ is a filter on ω then ℱ is K_σ-regular iff ℱ is σ-bounded.
PROOF:If f7is sbounded then clearly f7 is K,,-re.gdar. Now euppoec that f ? is K,-regular and not a-bounded, then there exists T C w h m - l } ; let h m c b m + 1 belonging to [m,fa(2n))n a. Then, by construction c c &, and this implies that b E 7. (e) Suppose that 7is not Ramsey. Then for every a E [w]" there exists 6 E [a]"n?. Let f E w", and u E [w]" such that for every k E u
Then there Uciaa 6 E [a]" such that
Then €or every iE w [fb(2i),fb(2i+
i))n&=1.
Therefore
If@),f(fb(2i)))
nb= 0.
Hence ? is an unbounded filter. I
We can summarize all these facts in the following
Unbounded Filters on ω
8. THEOREM. Let ℱ be a filter on ω, then the following are equivalent:
(i) ℱ is unbounded. (ii) char(ℱ) does not have the Baire property. (iii) ℱ is not Ramsey. (iv) ℱ is not K_σ-regular.
The natural question should be the following: Is it possible to add to Theorem 8 the assertion (v) char(ℱ) is not Lebesgue measurable?
The answer to this question is negative in the two possible directions.
9. FACT (Bartoszyński): If there exists an unbounded filter on ω then there exists a filter ℱ on ω such that
(i) ℱ is unbounded.
(ii) char(ℱ) has measure zero.
PROOF: (This proof is given by T. Bartoszyński.) Let F be an unbounded filter on ω. Let I_n ⊆ ω be such that for every n, I_n = (sup(I_{n−1}) + n + 1) − (sup(I_{n−1}) + 1); then if n < m then I_n ∩ I_m = ∅ and ω = ∪I_n. Let ℱ = {z ⊆ ω : {n : I_n ⊆ z} ∈ F}.
10. CLAIM: ℱ is an unbounded filter.
PROOF: Clearly ℱ is a filter. We need to show that ℱ is unbounded. Let f ∈ ω^ω be fixed. And let g ∈ ω^ω be such that g(n) = inf(I_n). Now let u ∈ F be such that A = {n ∈ ω : [n, f(g(n))) ∩ u = ∅} is infinite.
We define b = ∪{I_n : n ∈ u}. Then clearly b ∈ ℱ and if n ∈ A then
iff there exists m E a such that
and this implies and this implies
ww
J.I. Ihoda
and this implies [A,
a contradiction. Therefore
11.
CLAIM: char(;)
PROOF:char(7) G meMun aero. I
f(d4))n # 0 Q
7 is unbounded.
has outer measure sero.
nu
{f E 2@:(Vi E In)(f(i)
= 1))
and the right dde has
m€wnlm
12. FACT (Talagrand): (MA) There exists a bounded filter ℱ on ω such that char(ℱ) is not Lebesgue measurable.
PROOF: Essentially we will use less than the additivity of measure. Let I_n, n < ω, be defined as in the proof of Fact 9. We will construct a non-measurable filter ℱ on ω satisfying: for every a ∈ ℱ there exists n ∈ ω such that for every m ≥ n, I_m ∩ a ≠ ∅. Clearly this filter is bounded by the function
In order to build the filter we need the following
13. CLAIM: {char(z) : (∃^∞ n ∈ ω)(z ∩ I_n = ∅)} has measure zero.
[PROOF: As Claim 11.]
14. CLAIM: Fix y s.t. (∀^∞ n)(y ∩ I_n ≠ ∅); then B_y = {char(z) : (∃^∞ n)(n ∈ y ∩ z)} has measure one.
[PROOF: char(z) ∉ B_y if and only if there exists n ∈ ω such that for every m ∈ y − n, char(z)(m) = 0; thus B_y has measure one.]
Then our construction will be by induction on α < c. Let (A_α : α < c) be an enumeration of all measure zero sets. Then suppose we are given (z_α : α < β) for β < c satisfying
(i) z_α ∉ A_α for α < β.
(ii) z_α ∩ z_γ ∈ [ω]^ω for α < γ < β.
(iii) (∀α < β)(∃n ∀m ≥ n)(z_α ∩ I_m ≠ ∅).
Then use claims 13 and 14 and the additivity of measure in order to choose z_β ∉ A_β such that (z_α : α < β + 1) satisfies (i), (ii) and (iii).
QUESTION: Does ZFC imply there exists a non-measurable bounded filter on ω? We have a complete picture of the unbounded filters on ω and their relation with the Baire property, etc., in the real numbers. Now we plan to discuss the existence of unbounded filters on ω. The first results in this direction say that from ZFC we can always find unbounded filters on ω.
15. FACT: If ℱ is an ultrafilter then ℱ is an unbounded filter on ω.
-
PROOF:Let f : w 4 w be mch that n c m + f(n) c f(m). Let no = 0, = f(nk) and u = (w:i c w). Then ii E T or ii E 7 , without loss of generdity ii E 7. Then (Vi E w)[osi+l, f(uZi+l)) nii = 6. I
nk+l
The natural direction should be to search for an unbounded ultrafilter with the simplest definition. And the first results on this are the following
16. FACT: If ℱ is a Σ¹₁-filter then ℱ is bounded.
PROOF: If ℱ is a Σ¹₁-filter then char(ℱ) is a Σ¹₁ set of reals and we know that every analytic set has the property of Baire.
17. FACT: If V = L then there exists a Δ¹₂-unbounded filter on ω.
PROOF:Let (so:Q < w1) be a C:-good well order of [w]" n L (this says that quantification w e r the a's less than 9, is The filter wiU be constructed inductively taking the Brst member of the lists witnessing that 8~ belongs to the filter or zp belong ta the tuter. This construction is in geneml but if V = L it is A:. I
c:).
-
18.
c;
THEOREM. The following are equivalent
(i) There exists a Σ¹₂-unbounded filter on ω. (ii) There exists a real number r such that L[r] is not σ-bounded.
PROOF: (i) → (ii). Let ℱ be a Σ¹₂-unbounded ultrafilter on ω. Let r be such that the definition of ℱ belongs to L[r]. We will show that L[r] ∩ ω^ω is an unbounded family. If not there exists a function f : ω → ω satisfying
We will show that this f is a σ-bound to ℱ. Clearly this gives a contradiction. Because ℱ is a Σ¹₂-set there exists (F_i : i < ω₁) in L[r] such that (i) for each i < ω₁, F_i is a Σ¹₁-set. (ii) ℱ = ∪_{i<ω₁} F_i.
Then clearly each F_i is contained in a Σ¹₁-filter on ω (take {a : (∃b ∈ F_i)(b ⊆ a)}). Then in L[r] there exists f_i ∈ ω^ω, a σ-bound for the members of F_i. Using a simple absoluteness argument, we can show that this holds in the universe; and for every i,
(∃n ∀m ≥ n)(f_i(m) < f(m)). Therefore f is a σ-bound to ℱ.
Now we need to show (ii) → (i). In order to do this we need some lemmas and definitions.
19. DEFINITION: A filter ℱ on ω is a rapid filter iff for every f ∈ ω^ω there exists a ∈ ℱ satisfying
(∀n ∈ ω)(|f(n) ∩ a| < n).
Clearly rapid filters are unbounded filters. The converse of this fact is false:
THEOREM (A. Miller). In the Laver model for the Borel conjecture there are no rapid filters on ω.
20. LEMMA. For every a ∈ ℝ there exists a Σ¹₂(a)-filter ℱ on ω such that for every f ∈ ω^ω ∩ L[a] there exists z ∈ ℱ satisfying
PROOF OF (ii) → (i): Suppose that ω^ω ∩ L[r] is an unbounded family. Then the Σ¹₂-filter given in Lemma 20 for L[r] is a Σ¹₂-unbounded filter in V.
It may be interesting to see when it is possible to find $\Delta^1_2$-unbounded filters on $\omega$; in this direction we have the following results.
21. FACT: (a) If MA holds and $\omega_1$ is an accessible cardinal in $L$ then there exists a $\Delta^1_2$-rapid filter on $\omega$.
(b) cons(ZFC) implies cons(ZFC + MA(I) + there is no $\Delta^1_2$-unbounded filter on $\omega$), where I ={P:(tQ e.c.e.)(V* nP e.e.e.y)}.
(c) If MA($\sigma$-centered) holds and $\omega_1$ is an accessible cardinal in $L$ then there exists a $\Sigma^1_2$-unbounded filter on $\omega$.
(d) (Raisonnier) If every $\Sigma^1_2$-set of reals is Lebesgue measurable and $\omega_1$ is an accessible cardinal in $L$ then there exists a z-rapid filter on $\omega$.
PROOF: See [ISh].
To finish, the reader can now draw his own conclusions about the measurability (or Baire property, Ramsey property, $K_\sigma$-regularity) of the projective sets of reals in the presence of distinct forms of MA. For this use 21 and 8 and the following
22. FACT (TALAGRAND): If $\mathcal{F}$ is a rapid filter on $\omega$, then char($\mathcal{F}$) is not Lebesgue measurable.
I would like to remark that the idea of using filters on $\omega$ in order to produce pathological sets was previously introduced by many other mathematicians, for example A. R. D. Mathias, J. Raisonnier, etc. More information may be found in [ISh].
REFERENCES
[ISh] Ihoda, J. and Shelah, S., Martin's axioms, measurability and equiconsistency results, Accepted in The Jour. of Sym. Logic.
Logic Colloquium ’87
H.-D. Ebbinghaus et al. (Editors) © Elsevier Science Publishers B.V. (North-Holland), 1989
Type theory and explicit mathematics
Gerhard Jäger
Institut für Informatik, ETH Zürich

1 Background and introduction
The computational significance of constructive mathematics has been discussed in numerous publications and certainly is an issue of some controversy. The traditional logical and proof-theoretic contributions to this field are mostly concerned with computability in principle whereas practical applications require detailed case studies as well as the development and analysis of relevant subtheories. However, there are some very important approaches which attempt to bridge the gap between abstract formal treatments of (constructive) mathematics and practically useful computer applications. De Bruijn’s AUTOMATH project is a longstanding forerunner in this respect. In his survey of AUTOMATH he says that “The idea was to develop a system of writing entire mathematical theories in such a precise fashion that verification of the correctness can be carried out by formal operations on the text” [7]. Constable’s proof development system Nuprl [5] goes in the same direction. It is sensitive to the computational meaning of terms, assertions and proofs and designed in such a way that it can carry out the actions used to define that computational meaning. One should also mention the important work done in Edinburgh [8] in connection with LCF and ML, the theory of constructions due to Coquand and Huet [6], and the Swedish group which tries to implement Martin-Löf theories [26,27,28].
The general logical background of these approaches is the formalism of constructive type theory and typed lambda calculus. Besides the ideas of traditional Church-Curry type theory, Howard’s formulae-as-types interpretation [21] and the whole area of polymorphism [18,30] are of major importance. Proof theory is the link between formal theories and actual computation. Reducing proofs through cut elimination or similar normalization procedures amounts to computing a formula or a term into its normal form.
Hence this corresponds to functional programming languages in which computation takes place by reduction of terms. The work on the semantics for theories of computations and functional programming languages like LISP has been very much inspired by model constructions for the (untyped) lambda calculus due to Engeler [9], Plotkin [29] and Scott [32]. Denotational semantics [34] and Scott domains [33] are important tools for a mathematical theory of computation with a wide spectrum of applicability. And, after all, a lot of activity is presently invested in the study of the polymorphically typed lambda calculus, both from a semantic and syntactic point of view (cf. e.g. [4]).
Feferman’s theories for explicit mathematics [10,13] provide an alternative approach to constructivism. They were originally devised as a formalization of Bishop’s constructive mathematics [2] but soon turned out to be of general interest for proof theory and in connection with the proof-theoretic analysis of subsystems of set theory and second order arithmetic [3,13]. Beeson’s book [1] contains a detailed comparative study of explicit mathematics and other (formal) versions of constructive mathematics, and Hayashi’s work [20] brings us back to computability. He uses realizability interpretations for a variant of Feferman’s theories in order to extract LISP programs from constructive proofs.
The use of (constructive) mathematics as a specification language and the development of new formalisms for the design of programming environments is stimulating for mathematical logic and a challenge for proof theory. In order to estimate the range of applicability of (subsystems of) proof development systems like Nuprl or AUTOMATH, it is often helpful to gain some information about the proof possibilities and strength of the respective theories. But languages of computer science generally have a rich syntactic structure such that it is often cumbersome to compare them directly with the plain formalisms analyzed in traditional proof theory. With respect to the attempt of extracting algorithms from constructive proofs, we think that it is expecting too much if one tries to capture the whole area of constructive mathematics or at least a very large part of it.
Although general proof-theoretic methods are available, it is very likely that the programs extracted directly from proofs in strong systems for constructive mathematics would be inefficient and practically useless.
It is a more promising approach to the analysis of specification languages and programming environments to work with a formal framework which is built upon a language of significant expressive power and poses gradations by restricting some axioms but not the design of the language. In a way, constructive mathematics (at the top) and PROLOG (at the bottom) form the borderlines of one of the most fascinating areas of research where logic meets computer science. In the following we propose (a variant of) Feferman’s explicit mathematics as a framework for a unifying approach to the kind of type theories and lambda calculi that are used in present day computer science. The next section is concerned with some rather general considerations about monomorphic and polymorphic type structures. Then we turn to explicit mathematics and present the basic ideas and definitions of this approach. The last section, finally, is dedicated to the conceptual interplay between type theories and explicit mathematics.
An excuse might be in order: This is a survey paper which concentrates on concepts and ideas rather than precise results. As a consequence, we do not develop the technical details to such an extent that many deep proof-theoretic theorems can be formulated or even proved. However, a thorough analysis of this approach is partly available in the literature and partly planned for future publications.
2 Type structures
On a practical level, types arise in mathematics and computer science in order to organize (formerly untyped) universes in different ways and for different purposes. A deeper reason for the introduction of types into mathematics is an attempt to avoid inconsistencies and to provide a sound basis for foundational studies. In computer science types and typed languages are useful, among other things, to structure the data, prevent forbidden operations and support the correctness of programs.
As a short review and as motivation for the following we recall some basic notions of type theory. There exists an extensive literature about this topic, and we cannot go into details here. For further information the interested reader may for example consult Feferman’s survey article [11], which is concerned with the connections between type theory and mathematics, or Cardelli-Wegner [4], which discusses type theory from the point of view of computer science.
2.1 Monomorphism
In the following we make a distinction between monomorphic and polymorphic type structures. A type structure is said to be monomorphic if every term can be interpreted to be of one and only one type; we have polymorphism if some terms may have more than one type. If we have a monomorphic type system and the assignment of types to terms is (more or less) explicit, then the system is considered to be statically typed. For programming applications this means that type errors are detected at compile time. If it is only guaranteed that there exists a consistent type assignment, then we call the type system strongly typed. In a strongly typed programming language, the explicit type assignment, and the test whether it is possible at all, are carried out at run time.
Systems of monomorphic lambda calculus $M\lambda C$ are generated from a finite sequence $B_1, \ldots, B_n$ of basic types using lambda abstraction and application. The collection $MT$ of monomorphic types $S, T, \ldots$ is inductively defined by:
1. Every basic type is a type.
2. If $S$ and $T$ are types, then $(S \mapsto T)$ is a type.
A language of $M\lambda C$ has a certain set of constants of various types as well as infinitely many variables $x^S, y^S, z^S, \ldots$ for each of its types $S$. The terms of various types are generated as follows:
1. Every constant and variable of type $S$ is a term of type $S$.
2. If $a$ is a term of type $(S \mapsto T)$ and $b$ a term of type $S$, then $(ab)$ is a term of type $T$.
3. If $a$ is a term of type $T$ and $x^S$ a variable of type $S$, then $\lambda x^S.a$ is a term of type $(S \mapsto T)$.
Over every sequence $G_1, \ldots, G_n$ of non-empty sets we construct a hierarchy $\langle M_T : T \in MT \rangle$ by letting:
1. $M_{B_i} := G_i$ for $1 \le i \le n$.
2. $M_{(S \mapsto T)} := M_T^{M_S}$, the set of all functions from $M_S$ to $M_T$.
It is obvious that the terms can be interpreted in $\langle M_T : T \in MT \rangle$ such that every term of type $S$ denotes an element of $M_S$; term application then corresponds to function application and lambda abstraction to function definition. Monomorphic type structures have a clear mathematical interpretation. Besides the iterative function space construction described above, there are well known models based on hereditarily recursive operations HRO and on hereditarily extensional operations HEO (cf. e.g. Feferman [11] or Troelstra [36]). Disadvantages are the lack of recursors, the lack of uniformity of definitions (every type $S$ has its own identity $\lambda x^S.x^S$, etc.) and the impossibility of forming subtypes.
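The formation rules and the function-space hierarchy above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the class names `Base`, `Arrow`, `Var`, `App`, `Lam` and the helpers `type_of` and `size` are hypothetical, constants are treated like variables, and `size` only makes sense when the sets $G_i$ are finite.

```python
from dataclasses import dataclass
from typing import Union

# Monomorphic types: basic types B_i and arrow types (S -> T).
@dataclass(frozen=True)
class Base:
    name: str

@dataclass(frozen=True)
class Arrow:
    src: "Type"
    tgt: "Type"

Type = Union[Base, Arrow]

# Terms: typed variables (constants are treated alike), application, abstraction.
@dataclass(frozen=True)
class Var:
    name: str
    type: Type

@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"

@dataclass(frozen=True)
class Lam:
    var: Var
    body: "Term"

Term = Union[Var, App, Lam]

def type_of(term: Term) -> Type:
    """Return the one and only type of a term, following the three term rules."""
    if isinstance(term, Var):
        return term.type
    if isinstance(term, App):
        f, a = type_of(term.fun), type_of(term.arg)
        if not isinstance(f, Arrow) or f.src != a:
            raise TypeError("ill-formed application")
        return f.tgt
    if isinstance(term, Lam):
        return Arrow(term.var.type, type_of(term.body))
    raise TypeError("not a term")

def size(t: Type, base_sizes: dict) -> int:
    """Cardinality of M_T in the full hierarchy over finite base sets G_i."""
    if isinstance(t, Base):
        return base_sizes[t.name]
    # M_(S -> T) is the set of all functions from M_S to M_T.
    return size(t.tgt, base_sizes) ** size(t.src, base_sizes)
```

For a basic type `B` with a two-element interpretation, `size(Arrow(B, B), {"B": 2})` counts the four functions from $M_B$ to $M_B$. The lack of uniformity mentioned above shows up directly: `Lam(Var("x", B), Var("x", B))` is the identity on `B` only, and a separate term is needed for every other type.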
2.2 Polymorphism
Polymorphism has its origin in the work of Girard [18] and Reynolds [30]. In the polymorphic lambda calculus $P\lambda C$ one is more generous in the way types are defined. We have basic types $B_1, \ldots, B_n$ as before, but now we also allow infinitely many type variables $X, Y, Z, \ldots$. The collection $PT$ of polymorphic types is defined by the following inductive clauses:
1. Every basic type and every type variable is a type.
2. If $S$ and $T$ are types, then $(S \mapsto T)$ is a type.
3. If $T$ is a type and $X$ a type variable, then $\Pi X.T$ is a type.
These more flexible types are also reflected in the term formation. A language of the polymorphic lambda calculus $P\lambda C$ has a certain set of constants of various types and infinitely many variables $x^S, y^S, z^S, \ldots$ for each type $S$. However, the terms of various types are now generated by the following rules:
1. Every constant and variable of type $S$ is a term of type $S$.
2. If $a$ is a term of type $(S \mapsto T)$ and $b$ a term of type $S$, then $(ab)$ is a term of type $T$.
3. If $a$ is a term of type $T$ and $x^S$ a variable of type $S$, then $\lambda x^S.a$ is a term of type $(S \mapsto T)$.
4. If $a$ is a term of type $\Pi X.T$ and $S$ a type, then $a\{S\}$ is a term of type $T[S/X]$.
5. If $a$ is a term of type $T$ and $X$ a type variable, then $\Lambda X.a$ is a term of type $\Pi X.T$.
We write $a : T$ to express that the term $a$ is of type $T$. As in the case of the untyped and monomorphic typed lambda calculus, there is a reduction relation $\triangleright$ which goes along with the term structure. The principal cases are the following:
$(\lambda x^S.a)b \triangleright a[b/x^S], \qquad (\Lambda X.a)\{S\} \triangleright a[S/X].$
Example 1 We give two elementary examples which explain the strong uniformity of polymorphism:
1. $\Lambda Z.\lambda x^Z.x^Z$ is the term of type $\Pi Z.(Z \mapsto Z)$ which represents the general identity function. If we apply it to an arbitrary type symbol $S$, we obtain the term $(\Lambda Z.\lambda x^Z.x^Z)\{S\}$ of type $(S \mapsto S)$. This term can be reduced to $\lambda x^S.x^S$ and is the identity function on $S$.
2. The general application is given by the term $\Lambda Z.\lambda x^Z.\lambda y^{(Z \mapsto Z)}.(yx)$ of type $\Pi Z.(Z \mapsto ((Z \mapsto Z) \mapsto Z))$.
Girard showed: (i) the terms of the polymorphic lambda calculus can be normalized; (ii) the number-theoretic functions which can be represented in $P\lambda C$ are the provably total number-theoretic functions of second order arithmetic.
With respect to models, the situation is rather complex in the case of polymorphic typed lambda calculus. Reynolds [31] showed that there are no models with a set-theoretic interpretation of the impredicative types, although it is problematic to see whether his notion of "set-theoretic interpretation" really covers all possibilities. Non-set-theoretic models are due to Troelstra [35] and Girard [19]. Troelstra uses hereditarily recursive operations of order 2, $HRO_2$, whereas the recent work of Girard is a conceptually new approach to model polymorphism by employing his notion of qualitative and quantitative domain. Besides this, there are also attempts to find models of the polymorphic lambda calculus related to Scott domains.
In the following sections we will be concerned with theories of explicit constructive types which are a compromise between monomorphic and polymorphic type systems. They have constructive interpretations such that they provide a link with constructive mathematics. We will also come back to full polymorphism and see how it can be treated in an impredicative version of explicit type theory.
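The five term rules of the polymorphic calculus, including the substitution $T[S/X]$ used for type application, can be sketched as a small type checker. The following Python fragment is an illustration under our own assumptions, not part of any cited system: all class and function names are hypothetical, basic types are modelled as type variables that are never bound, and `subst` assumes that bound type variables are named apart from the variables of the substituted type, so no renaming is performed.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class TVar:            # a type variable X (basic types: never-bound variables)
    name: str

@dataclass(frozen=True)
class Arrow:           # the function type (S -> T)
    src: "Type"
    tgt: "Type"

@dataclass(frozen=True)
class Pi:              # the polymorphic type Pi X. T
    var: str
    body: "Type"

Type = Union[TVar, Arrow, Pi]

def subst(t: Type, x: str, s: Type) -> Type:
    """Compute T[S/X]; assumes bound variables of T do not occur in S."""
    if isinstance(t, TVar):
        return s if t.name == x else t
    if isinstance(t, Arrow):
        return Arrow(subst(t.src, x, s), subst(t.tgt, x, s))
    return t if t.var == x else Pi(t.var, subst(t.body, x, s))

@dataclass(frozen=True)
class Var:             # x^S
    name: str
    type: Type

@dataclass(frozen=True)
class App:             # (a b)
    fun: "Term"
    arg: "Term"

@dataclass(frozen=True)
class Lam:             # lambda x^S. a
    var: Var
    body: "Term"

@dataclass(frozen=True)
class TApp:            # a{S}
    fun: "Term"
    arg: Type

@dataclass(frozen=True)
class TLam:            # Lambda X. a
    var: str
    body: "Term"

Term = Union[Var, App, Lam, TApp, TLam]

def type_of(a: Term) -> Type:
    """Type of a term according to rules 1 to 5."""
    if isinstance(a, Var):
        return a.type
    if isinstance(a, App):
        f = type_of(a.fun)
        if not isinstance(f, Arrow) or f.src != type_of(a.arg):
            raise TypeError("ill-formed application")
        return f.tgt
    if isinstance(a, Lam):
        return Arrow(a.var.type, type_of(a.body))
    if isinstance(a, TApp):
        f = type_of(a.fun)
        if not isinstance(f, Pi):
            raise TypeError("type application needs a Pi type")
        return subst(f.body, f.var, a.arg)
    if isinstance(a, TLam):
        return Pi(a.var, type_of(a.body))
    raise TypeError("not a term")
```

With `Z = TVar("Z")` and `x = Var("x", Z)`, the term `TLam("Z", Lam(x, x))` is the general identity of Example 1, and applying it to a type `S` via `TApp` yields a term of type `Arrow(S, S)`.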
3 The formalism of explicit mathematics
The best references about Feferman’s notion of explicit mathematics are [10,12,13]; in these papers he discusses the philosophical and technical aspects of his approach in full length. He shows that explicit mathematics is a suitable framework for the development of constructive mathematics which has close connections to other constructive and non-constructive formalizations of mathematics. Let us begin with describing the general scenario: We proceed on the assumption that we are confronted with a given mathematical universe. The ontology of this universe is not specified in greater detail; its elements may be
• sets in a Platonistic sense
• constructive sets
• constructions and operations
• bitstrings of a computer memory
Explicit mathematics captures the idea of a flexible ontology; constructive and nonconstructive interpretations are possible. However, it follows Feferman’s philosophy that each object of the universe should be given by a linguistic expression, at least in all natural models. Hence the expression "explicit".
The language $L(EM)$ of explicit mathematics has as its first order variables the individual variables $x, y, z, \ldots$, and as its second order variables the type variables $X, Y, Z, \ldots$. There are also individual constants, type constants and relation symbols, to be specified. The individual constants include the symbols $0, 1, k, s, p, p_1, p_2, d, j$ and $c_n$ ($n < \omega$), the meaning of which will be explained later. The principal term formation operation is term application which we write as $(ab)$ or often just as $ab$. In this simplified form we adopt the convention of association to the left such that $a_1 a_2 \ldots a_n$ stands for $(\ldots(a_1 a_2)\ldots a_n)$. We also use the notation $a(b_1, \ldots, b_n)$ for $a b_1 \ldots b_n$.
The individual terms $(a, b, c, \ldots)$ of $L(EM)$ are generated as follows:
1. Each individual variable is a term.
2. Each individual constant is a term.
3. If $a$ and $b$ are terms, then $(ab)$ is a term.
The type terms $(A, B, C, \ldots)$ of $L(EM)$ are the type variables and type constants. The atomic formulas of $L(EM)$ are those of the form $s = t$, $s\downarrow$, $s \in t$ and $R(s_1, \ldots, s_n)$ where $s, t, s_1, \ldots, s_n$ are individual or type terms and $R$ is an $n$-ary relation symbol.
The formulas $(\varphi, \psi, \chi, \ldots)$ of $L(EM)$ are generated as follows:
1. Each atomic formula is a formula.
2. If $\varphi$ and $\psi$ are formulas, then so are $\neg\varphi$, $(\varphi \wedge \psi)$, $(\varphi \vee \psi)$, $(\varphi \to \psi)$ and $(\varphi \leftrightarrow \psi)$.
3. If $\varphi$ is a formula, then so are $\forall x\varphi$, $\exists x\varphi$, $\forall X\varphi$ and $\exists X\varphi$.
The logic of explicit mathematics is the classical or intuitionistic logic of partial terms due to Beeson-Feferman as in [1]. The atomic formula $a\downarrow$ is to be read as "$a$ is defined". We assume that every individual variable and individual constant is defined; if a compound term is defined, then each subterm is defined. We also introduce a partial equality $\simeq$ defined by
$a \simeq b \;:\Leftrightarrow\; (a\downarrow \vee b\downarrow) \to a = b.$
The crucial modifications of ordinary predicate logic are in the quantifier axioms
$\forall x\varphi(x) \wedge a\downarrow \;\to\; \varphi(a), \qquad \varphi(a) \wedge a\downarrow \;\to\; \exists x\varphi(x).$
An $L(EM)$ formula is called stratified if it has no subformulas of the form $a = B$ or $B = a$, $a \in b$ or $A \in B$. Elementary formulas are those stratified formulas which contain no bound type variables. The non-logical axioms of explicit mathematics can be divided into the following six classes.
I. Applicative axioms (APP).
(1) $kxy = x$
(2) $sxyz \simeq xz(yz) \wedge sxy\downarrow$
(3) $k \neq s$
III. Definition by cases.
(5) $d(x,y,z)\downarrow \wedge\ d(0,y,z) = y \wedge d(1,y,z) = z$
IV. Ontological axiom.
(6) $\forall X \exists x (X \simeq x)$
The axioms (1) - (6) are denoted by $(APP)^+$. The applicative axioms state that the universe forms a partial combinatory algebra. By standard arguments one can show that they prove the theorem about lambda abstraction and the recursion theorem. Pairing and projection are as usual. In the following we write $(a, b)$ instead of $pab$. (5) is definition by Boolean cases; a stronger form is definition by cases on the universe (cf. [13]). The ontological axiom formalizes that every type is an individual. This is a natural requirement since in the context of computer science all entities, including types, are represented by bitstrings in a computer memory. Individual terms are the abstraction of these bitstrings, and therefore every type is equal to the individual term which is its formal representation (in the memory).
V. Comprehension axioms. Each formula $\varphi$ is assigned a Gödel number $\lceil\varphi\rceil$ in some standard way; then we write $c_\varphi$ instead of $c_{\lceil\varphi\rceil}$. Elementary comprehension (ECA) is the scheme
(ECA) $\exists X[X \simeq c_\varphi(\vec{Y}, \vec{z}) \wedge \forall x(x \in X \leftrightarrow \varphi(\vec{Y}, \vec{z}, x))]$
for all elementary formulas $\varphi$ with the free type variables $\vec{Y} = Y_1, \ldots, Y_m$ and individual variables $\vec{z} = z_1, \ldots, z_n$. Stratified comprehension $(CA_2)$ is the same scheme as (ECA) except that $\varphi$ may be any stratified formula. More intuitively, we will write $\{x : \varphi(\vec{Y}, \vec{z}, x)\}$ instead of $c_\varphi(\vec{Y}, \vec{z})$.
VI. Join (J). Join is an axiom to reflect the disjoint union operation.
(J) $(\forall x \in A)\exists Y(fx \simeq Y) \;\to\; \exists X[X \simeq j(A,f) \wedge \forall z(z \in X \leftrightarrow (\exists x \in A)\exists y(z \simeq (x,y) \wedge y \in fx))]$
(ECA), $(CA_2)$ and (J) are some of the more important type existence axioms of explicit mathematics; various forms of inductive generation constitute further important principles for the introduction of types (cf. [10,13]). Explicit mathematics is a framework for a specific approach to constructive mathematics, not a formal theory. Accordingly, we do not have a fixed set of axioms, but axioms which share the pattern of explicit representation of the objects under discourse. The exact selection of axioms depends on the particular application one has in mind.
Remark The version of explicit mathematics presented above is a minor modification of that introduced in [13]. The main differences are: (i) Feferman speaks of classes where we use the word type. (ii) Instead of application terms we use the logic of partial terms. (iii) The combinators $k$ and $s$ replace the lambda notation.
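How the combinators $k$ and $s$ can replace lambda notation may be illustrated by the standard bracket abstraction algorithm. The Python sketch below is our own illustration and not taken from [13]: terms are strings (atoms) or pairs (applications), the helper names `app`, `abstract`, `step` and `normalize` are hypothetical, reduction uses only the rules $kxy \to x$ and $sxyz \to xz(yz)$, and the `fuel` bound is an assumption that guards against terms without normal form.

```python
def app(*ts):
    """Left-associated application: app(a, b, c) is ((a b) c)."""
    t = ts[0]
    for u in ts[1:]:
        t = (t, u)
    return t

def occurs(x, t):
    """Does the atom x occur in the term t?"""
    return t == x if isinstance(t, str) else occurs(x, t[0]) or occurs(x, t[1])

def abstract(x, t):
    """Bracket abstraction: a term [x]t without x such that ([x]t)a reduces to t[a/x]."""
    if t == x:
        return app("s", "k", "k")        # s k k a -> k a (k a) -> a
    if not occurs(x, t):
        return app("k", t)               # k t a -> t
    f, a = t
    return app("s", abstract(x, f), abstract(x, a))

def step(t):
    """One leftmost reduction step; returns None if t is in normal form."""
    if isinstance(t, str):
        return None
    f, a = t
    if isinstance(f, tuple) and f[0] == "k":                       # k x y -> x
        return f[1]
    if isinstance(f, tuple) and isinstance(f[0], tuple) and f[0][0] == "s":
        x, y, z = f[0][1], f[1], a                                 # s x y z -> x z (y z)
        return ((x, z), (y, z))
    r = step(f)
    if r is not None:
        return (r, a)
    r = step(a)
    return None if r is None else (f, r)

def normalize(t, fuel=1000):
    """Reduce t to normal form, giving up after `fuel` steps."""
    for _ in range(fuel):
        r = step(t)
        if r is None:
            return t
        t = r
    raise RuntimeError("no normal form found within the step bound")
```

For instance, `abstract("x", abstract("y", app("y", "x")))` plays the role of $\lambda x.\lambda y.(yx)$, and applying it to atoms `"a"` and `"b"` normalizes to the term `("b", "a")`. In a partial combinatory algebra not every term has a value, which is why the sketch uses an explicit step bound instead of assuming termination.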
4 Some features of explicit mathematics
4.1 Consequences of elementary comprehension
In this subsection we work in the theory $(APP)^+ + (ECA)$, i.e. in a partial combinatory algebra where elementary comprehension is the only type formation principle. Although this system is proof-theoretically very weak - $(APP)^+ + (ECA)$ is a conservative extension of $(APP)^+$ - it is nevertheless a useful and flexible framework for the generation of type structures. The first observation refers to the universal and empty type. Since the formulas $(x = x)$ and $(x \neq x)$ are both elementary, we can use (ECA) to form the empty type $\emptyset$ and the universal type $V$,
$\emptyset := \{x : x \neq x\}, \qquad V := \{x : x = x\}.$
$V$ is the collection of all objects but not the collection of all types. The formula
$Typ(x) \;:\Leftrightarrow\; \exists X(x \simeq X)$
which expresses that the object $x$ is a type is not elementary and therefore the "type of all types" is not elementarily definable. However, Feferman [13] shows that the axiom
$\exists X \forall x(x \in X \leftrightarrow Typ(x))$
can be consistently added to $(APP)^+ + (ECA)$. If we have stronger type existence axioms, then the type of all types often yields a Russell style inconsistency. The following types can be defined by elementary comprehension:
$A \times B := \{x : x = (p_1 x, p_2 x) \wedge p_1 x \in A \wedge p_2 x \in B\}$
$A \cap B := \{x : x \in A \wedge x \in B\}$
$A \cup B := \{x : x \in A \vee x \in B\}$
$A \setminus B := \{x : x \in A \wedge x \notin B\}$
$\overline{A} := \{x : x \notin A\}$
$\{a_1, \ldots, a_n\} := \{x : x = a_1 \vee \ldots \vee x = a_n\}$
$(A \mapsto B) := \{f : (\forall x \in A)(fx \in B)\}$
$(A \rightharpoonup B) := \{f : (\forall x \in A)(fx\downarrow \to fx \in B)\}$
$P(A) := A \mapsto \{0,1\}$
$PP(A) := A \rightharpoonup \{0,1\}$
These definitions are uniform in the type parameters $A$ and $B$ and their meaning is obvious. The elements of $P(A)$ are the characteristic functions on $A$ which we also call the subsets of $A$. In this sense, $P(A)$ is the power set of $A$. We write $x \mathrel{\varepsilon} a$ for $ax \simeq 1$ when $a \in P(A)$ and $x \in A$. The subsets of $A$ must not be confused with the subtypes of $A$. We write $B \subset A$ if every element of type $B$ is an element of type $A$,
$B \subset A \;:\Leftrightarrow\; \forall x(x \in B \to x \in A).$
Every subset $f$ of $A$ induces the subtype $\{x : x \in A \wedge fx \simeq 1\}$ of $A$, but a subtype $B$ of $A$ need not have a characteristic function. Accordingly, $PP(A)$ is the type of partial characteristic functions on $A$, i.e. the type of partial subsets of $A$.
Warning. $A \subset B$ and $a \in P(A)$ do not imply that $a \in P(B)$. From $a \in P(A)$ we only know that $(\forall x \in A)(ax \simeq 0 \vee ax \simeq 1)$ but we cannot conclude that $a$ is defined on $B \setminus A$. However, $A \subset B$ and $a \in PP(A)$ imply $a \in PP(B)$.
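The difference between subsets, partial subsets and subtypes can be made concrete in a small finite model. The following Python sketch is an assumption-laden illustration, not a model of the theory: types are finite Python sets, operations are Python functions, "undefined" is modelled by a raised exception, and all helper names (`app_partial`, `in_arrow`, `in_P`, `in_PP`, `induced_subtype`) are hypothetical.

```python
UNDEFINED = object()   # marker for "f x is not defined"

def app_partial(f, x):
    """Partial application: the value f x, or UNDEFINED if the call fails."""
    try:
        return f(x)
    except Exception:
        return UNDEFINED

def in_arrow(f, A, B):
    """f is in the total function type: f x is defined and in B for all x in A."""
    return all(app_partial(f, x) in B for x in A)

def in_partial_arrow(f, A, B):
    """f is in the partial function type: f x is in B whenever it is defined."""
    values = (app_partial(f, x) for x in A)
    return all(v is UNDEFINED or v in B for v in values)

def in_P(f, A):
    """f is a subset of A, i.e. a characteristic function on A."""
    return in_arrow(f, A, {0, 1})

def in_PP(f, A):
    """f is a partial subset of A."""
    return in_partial_arrow(f, A, {0, 1})

def induced_subtype(f, A):
    """The subtype {x : x in A and f x = 1} induced by f."""
    return {x for x in A if app_partial(f, x) == 1}

# A is a subtype of B; `even` is only meaningful on numbers.
A = {0, 1, 2}
B = {0, 1, 2, 3, "a"}
even = lambda x: 1 if x % 2 == 0 else 0
```

In this example `even` is a subset of `A` but not of `B`, because `even("a")` is undefined; this is the situation described in the warning. It is still a partial subset of `B` here, since wherever it is defined on `B` its value is 0 or 1.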
Using the definitions above, it is obvious that elementary comprehension is sufficient to develop the monomorphic type structures of 2.1. Besides this, type variables are available, and we have the possibility to form all subtypes of a given type $A$ which have an elementary definition; i.e.
$\{x \in A : \varphi(x)\}$
is a type provided that $\varphi$ is an elementary formula. $\varphi$ may even contain object and type parameters.
4.2 Polymorphism
The first treatment of polymorphic type structures in the formalism of explicit mathematics goes back to Feferman [16]. The basic idea of his method is a forgetful interpretation of polymorphism which consists of the following three steps (details suppressed): First every term $a$ of polymorphic lambda calculus is translated into a term $a^-$ of pure lambda calculus by taking off from $a$ all type information. The definition of $a^-$ is by induction on $a$:
1. If $a$ is a variable or constant of some type, then $a^- := a$.
2. If $a$ is of type $(S \mapsto T)$ and $b$ of type $S$, then $(ab)^- := (a^- b^-)$.
3. If $a$ is of type $T$ and $x^S$ a variable of type $S$, then $(\lambda x^S.a)^- := \lambda x.a^-$.
4. If $a$ is of type $\Pi X.T$ and $S$ a type, then $(a\{S\})^- := a^-$.
5. If $a$ is of type $T$ and $X$ a type variable, then $(\Lambda X.a)^- := a^-$.
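The five clauses of the forgetful translation amount to a straightforward recursive erasure. The Python sketch below uses a tagged-tuple encoding of terms that is purely our own assumption (`"var"`, `"app"`, `"lam"`, `"tapp"`, `"tlam"` are hypothetical tags, and types are carried along as opaque strings); it follows the clauses one by one.

```python
def erase(a):
    """Forgetful translation a -> a^-: strip all type information from a term.

    Typed terms (hypothetical encoding):
      ("var", name, type)         variable or constant of the given type
      ("app", a, b)               application (a b)
      ("lam", name, type, body)   lambda abstraction over a typed variable
      ("tapp", a, S)              type application a{S}
      ("tlam", X, a)              type abstraction over the type variable X
    Untyped terms:
      ("var", name), ("app", a, b), ("lam", name, body)
    """
    tag = a[0]
    if tag == "var":                       # clause 1
        return ("var", a[1])
    if tag == "app":                       # clause 2
        return ("app", erase(a[1]), erase(a[2]))
    if tag == "lam":                       # clause 3
        return ("lam", a[1], erase(a[3]))
    if tag == "tapp":                      # clause 4: forget the type argument
        return erase(a[1])
    if tag == "tlam":                      # clause 5: forget the type abstraction
        return erase(a[2])
    raise ValueError("not a term: %r" % (a,))

# The general identity, and its instance at a type S.
poly_id = ("tlam", "Z", ("lam", "x", "Z", ("var", "x", "Z")))
id_at_S = ("tapp", poly_id, "S")
```

Both `erase(poly_id)` and `erase(id_at_S)` yield the same untyped term `("lam", "x", ("var", "x"))`, which reflects the uniformity of polymorphic terms: the erased identity does not depend on the type at which it is used.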
In the second step one shows that for every polymorphic type $T(\vec{X}, \vec{z})$ with the type parameters $\vec{X} = X_1, \ldots, X_m$ and object parameters $\vec{z} = z_1, \ldots, z_n$ there exists a stratified formula $\varphi_T(\vec{X}, \vec{z}, y)$ such that $P\lambda C$ proves $t : T(\vec{X}, \vec{z})$ if and only if (a suitably chosen subsystem of) explicit mathematics proves $\varphi_T(\vec{X}, \vec{z}, t^-)$. The critical case is the proper treatment of types of the form $\Pi Y.S(\vec{X}, Y, \vec{z})$. But
$t : \Pi Y.S(\vec{X}, Y, \vec{z})$
is translated into
$\forall Y \varphi_S(\vec{X}, Y, \vec{z}, t^-)$
where $\varphi_S(\vec{X}, Y, \vec{z}, s^-)$ is the formula which corresponds to $s : S(\vec{X}, Y, \vec{z})$. Hence the translation $\varphi_T(t^-, \ldots)$ of the polymorphic expression $t : T(\ldots)$ will be a stratified formula of explicit mathematics.
The third step provides the interpretation. Given a polymorphic type $T(\ldots)$, we determine the formula $\varphi_T(x, \ldots)$ and use stratified comprehension $(CA_2)$ to define its interpretation $\{x : \varphi_T(x, \ldots)\}$.
Theorem 2 The polymorphic lambda calculus can be interpreted into $(APP)^+ + (CA_2)$.
A more direct interpretation $T^*$ of polymorphic types $T$ is obtained as follows:
1. If $X$ is a type variable, then $X^* := X$.
2. If $T$ is the type $(A \mapsto B)$, then $T^* := \{f : (\forall x \in A^*)(fx \in B^*)\}$.
3. If $T$ is the type $\Pi X.S(X)$, then $T^* := \{a : \forall X[a \in S^*(X)]\}$.
If this inductive definition has to be carried through in $(APP)^+ + (CA_2)$, we might have problems since then all formulas $\forall X[a \in S^*(X)]$ should be stratified. However, there exist natural and proof-theoretically harmless extensions of the notion of stratification such that the translation $T^*$ of every polymorphic type $T$ is a type in the sense of explicit mathematics.
Example 3 Let $t$ be the general identity $\Lambda Z.\lambda x^Z.x^Z$ and $T$ the polymorphic type $\Pi Z.(Z \mapsto Z)$. Then $t$ is a term of type $T$, and the forgetful interpretation of $t$ yields $t^- = \lambda x.x$. Then we have $t^- \in T^*$ since $T^* = \{f : \forall Z(f \in (Z \mapsto Z))\} = \{f : \forall Z(\forall x \in Z)(fx \in Z)\}$.
Polymorphic type structures can be interpreted in explicit mathematics but the price is the scheme of stratified comprehension. Often it is more interesting to restrict comprehension and add further axioms which are tailored for specific data structures.
4.3 Specification of abstract data types
Space does not permit us to discuss the concept of abstract data type in the context of explicit type theory in full detail. Instead, we adopt the convention to identify them with many sorted algebras and consider the problem of extending a given theory $Th$ for explicit type theory by an abstract data type ADT. For us, a (new) abstract data type ADT will be syntactically represented with respect to a given theory $Th$ by a series of new symbols and axioms:
1. First we extend the language $L(Th)$ of $Th$ by a sequence $C_1, \ldots, C_k$ of type constants as names for the domains of ADT, a sequence $f_1, \ldots, f_l$ of operation symbols as names for the operations of ADT and a sequence $R_1, \ldots, R_m$ of relation symbols as names for the (basic) relations of ADT.
2. Then we add a collection of axioms about
$(C_1, \ldots, C_k;\ f_1, \ldots, f_l;\ R_1, \ldots, R_m)$
which implicitly define the meaning of these constants. It is natural to assume that for every $1 \le i \le m$ there exist $j_1, \ldots, j_n$ such that the sentence
$\forall x_1 \ldots \forall x_n (R_i(x_1, \ldots, x_n) \to x_1 \in C_{j_1} \wedge \ldots \wedge x_n \in C_{j_n})$
is one of the axioms.
The new theory is called $Th + (ADT)$ and is formulated in the extended language. If the axioms of $Th$ include axiom schemes, one has to decide whether these schemes extend to the new language or not.
Definition 4 We say that the representation of a new abstract data type ADT has no side effects with respect to the theory $Th$ if $Th + (ADT)$ is a conservative extension of $Th$. Otherwise we denote (ADT) as (a representation of the) abstract data type ADT with side effects with respect to $Th$.
4.4 The natural numbers
After the very general remarks of the previous section we turn to a specific data type. The natural numbers constitute the simplest infinite data type. In our context they can be easily represented by introducing the following type and operation constants:
$N$ for the type of the natural numbers,
$0$ for the zero element,
$s_N$ for the successor function,
$p_N$ for the predecessor function.
Sometimes it might be useful to add further relation and operation symbols but we refrain from doing so now. Hence the 4-tuple
$\mathcal{N} = (N, 0, s_N, p_N)$
serves as linguistic representation of the natural numbers. Next we turn to the axioms which go with $\mathcal{N}$.
Closure axioms of $\mathcal{N}$
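(1) $0 \in N$
(2) $\forall x(x \in N \to s_N x \in N)$
(3) $\forall x(x \in N \wedge x \neq 0 \to p_N x \in N)$
$s_N$ and $p_N$ axioms of $\mathcal{N}$
(4) $\forall x(x \in N \to s_N x \neq 0 \wedge p_N(s_N x) = x)$
(5) $\forall x(x \in N \wedge x \neq 0 \to s_N(p_N x) = x)$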
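The closure, $s_N$ and $p_N$ axioms of $\mathcal{N}$ are denoted as the basic $\mathcal{N}$ axioms $Basic_{\mathcal{N}}$. They are augmented by principles of induction on $\mathcal{N}$ where we distinguish between four different types:
Set-IND$_{\mathcal{N}}$: $a \in P(N) \wedge 0 \mathrel{\varepsilon} a \wedge (\forall x \in N)(x \mathrel{\varepsilon} a \to s_N x \mathrel{\varepsilon} a) \;\to\; (\forall x \in N)(x \mathrel{\varepsilon} a)$
Partial-Set-IND$_{\mathcal{N}}$: $a \in PP(N) \wedge 0 \mathrel{\varepsilon} a \wedge (\forall x \in N)(x \mathrel{\varepsilon} a \to s_N x \mathrel{\varepsilon} a) \;\to\; (\forall x \in N)(x \mathrel{\varepsilon} a)$
Type-IND$_{\mathcal{N}}$: $X \subset N \wedge 0 \in X \wedge (\forall x \in N)(x \in X \to s_N x \in X) \;\to\; (\forall x \in N)(x \in X)$
Full-IND$_{\mathcal{N}}$: $\varphi(0) \wedge (\forall x \in N)(\varphi(x) \to \varphi(s_N x)) \;\to\; (\forall x \in N)\varphi(x)$
for all formulas $\varphi$ of the corresponding language. Since every subset of $N$ is a partial subset of $N$, every partial subset of $N$ defines a subtype of $N$ and every subtype of $N$ induces a formula, it is clear that
Full-IND$_{\mathcal{N}}$ $\Rightarrow$ Type-IND$_{\mathcal{N}}$ $\Rightarrow$ Partial-Set-IND$_{\mathcal{N}}$ $\Rightarrow$ Set-IND$_{\mathcal{N}}$.
Notation
$(\mathcal{N})_1 := Basic_{\mathcal{N}}$ + Set-IND$_{\mathcal{N}}$
$(\mathcal{N})_2 := Basic_{\mathcal{N}}$ + Partial-Set-IND$_{\mathcal{N}}$
$(\mathcal{N})_3 := Basic_{\mathcal{N}}$ + Type-IND$_{\mathcal{N}}$
$(\mathcal{N})_4 := Basic_{\mathcal{N}}$ + Full-IND$_{\mathcal{N}}$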
The induction schemes Full-IND$_{\mathcal{N}}$ and Type-IND$_{\mathcal{N}}$ are external inductions in the sense that they refer to the whole type structure. They export induction to all types. Partial-Set-IND$_{\mathcal{N}}$ and Set-IND$_{\mathcal{N}}$, on the other hand, are internal with respect to $N$. In these cases induction is applied to subsets or partial subsets of $N$ but no reference is made to types. Hence the meaning of these induction principles is independent of the type structure and the type existence axioms formulated in the corresponding theory.
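As a sanity check of the basic axioms and of the shape of Set-IND, one can test them in the intended model on an initial segment of the natural numbers. The Python sketch below is only such a finite test under our own assumptions: `N` is modelled by the non-negative Python integers, `s_N` and `p_N` are the obvious functions, the value of `p_N` at 0 is deliberately left undefined since the axioms say nothing about it, and the function names are hypothetical. A finite test of this kind is of course no substitute for the induction axioms themselves.

```python
def in_N(x):
    """Membership in the intended interpretation of N."""
    return isinstance(x, int) and x >= 0

def s_N(x):
    """Successor."""
    return x + 1

def p_N(x):
    """Predecessor; the axioms leave p_N 0 unspecified, so it is undefined here."""
    if x == 0:
        raise ValueError("p_N 0 is not specified by the axioms")
    return x - 1

def check_basic(bound):
    """Test axioms (1) to (5) for all x below `bound`."""
    xs = range(bound)
    return (in_N(0)                                                  # (1)
            and all(in_N(s_N(x)) for x in xs)                        # (2)
            and all(in_N(p_N(x)) for x in xs if x != 0)              # (3)
            and all(s_N(x) != 0 and p_N(s_N(x)) == x for x in xs)    # (4)
            and all(s_N(p_N(x)) == x for x in xs if x != 0))         # (5)

def set_ind_instance(a, bound):
    """Test one instance of Set-IND for a characteristic function a, below `bound`.

    The hypotheses are: 0 is in a, and a is closed under successor (checked
    for x below `bound`). The conclusion is: every x below `bound` is in a.
    """
    hypotheses = a(0) == 1 and all(a(x) != 1 or a(s_N(x)) == 1
                                   for x in range(bound))
    conclusion = all(a(x) == 1 for x in range(bound))
    return (not hypotheses) or conclusion
```

Note that `set_ind_instance` only involves a characteristic function `a` and no types, which matches the remark that Set-IND is internal with respect to $N$.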
To appreciate the relevance of theories with internal inductions for questions concerning the relationship between constructive mathematics and computer programming, the following points should be noted. 1. Feferman has shown that internal inductions are sufficient to develop ordinary mathematics to a reasonable extent; cf. e.g. [11,15,17]. 2. In general the proof-theoretic strength of a theory depends very much on the induction principles which are available. Hence a system with external inductions is very likely to be significantly stronger than the corresponding system with internal induction.
3. It is known from standard proof theory that normalization procedures and proof extraction procedures are very sensitive to the inductions which are involved. Now we relate a few theories for explicit mathematics to well known subsystems of analysis.
Theorem 5 The following proof-theoretic equivalences have been established:
1. $(APP)^+ + (ECA) + (\mathcal{N})_3 \equiv PA$
2. $(APP)^+ + (ECA) + (J) + (\mathcal{N})_3 \equiv (\Sigma^1_1\text{-}AC)_0 \equiv PA$
3. $(APP)^+ + (ECA) + (\mathcal{N})_4 \equiv (\Pi^0_\infty\text{-}CA)$
4. $(APP)^+ + (ECA) + (J) + (\mathcal{N})_4 \equiv (\Sigma^1_1\text{-}AC)$
Similar results for $(\mathcal{N})_1$ and $(\mathcal{N})_2$ are known but will be omitted in this paper.
In joint work with Solomon Feferman (in preparation) we will discuss the proof-theoretic aspects of various internal and external induction principles on $N$. In particular, we will determine the proof-theoretic strength of typical theories with Set-IND$_{\mathcal{N}}$, Partial-Set-IND$_{\mathcal{N}}$, Type-IND$_{\mathcal{N}}$ and Full-IND$_{\mathcal{N}}$. General induction in type theories is studied in Jäger [25].
4.5 Computable types
In accordance with our basic philosophy that the objects of the universe represent computable operations, we define a type X to be computable if it has a characteristic function on the universe V.
Definition 6 Computability of a type $X$ is defined by:
$Comp(X) \;:\Leftrightarrow\; (\exists f \in P(V))\forall x(x \in X \leftrightarrow fx \simeq 1)$
It is important to observe that the type $N$, for example, cannot be proved to be computable even in a fairly strong theory like $(APP)^+ + (CA_2) + (J) + (\mathcal{N})_4$. The number of computable types depends very much on the operations available in our language; an uncomputable type might be made computable in the sense of this definition if an appropriate operation is added to the language. But a Russell style argument shows that there is no chance to make all types computable provided that (ECA) and (J) are admitted as type formation axioms.
Theorem 7 $(APP)^+ + (ECA) + (J) \vdash \exists X \neg Comp(X)$
Proof. By (ECA) we may conclude that
$\forall f \in P(V)\, \exists Z\, (Z = \{z : fz \simeq 1\})$.
Hence (J) guarantees the existence of a type Q such that
$z \in Q \leftrightarrow p_1 z \in P(V) \wedge (p_1 z)(p_2 z) \simeq 1$
for all individuals z. Now we use (ECA) again in order to form the "Russell type" R,
$R := \{z : (z,z) \notin Q\}$.
Type Theory and Explicit Mathematics
We will show that R is uncomputable. Indeed, suppose that R is computable and r is an element of P(V) which witnesses this, i.e.
$\forall z\, (z \in R \leftrightarrow rz \simeq 1)$.
This implies that $rr \simeq 1$ if and only if $rr \simeq 0$; a contradiction. Hence we have proved that there exists an uncomputable type.
Join (J) is essential in the previous argument. It is unclear whether one can find a natural version of explicit mathematics which allows that all types are computable.
Question 8 Is $(APP)^+ + (ECA)$ consistent with the assumption that all types are computable?
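The diagonal step in the proof of Theorem 7 can be spelled out as follows. This is only a sketch: it assumes that $p_1$, $p_2$ are the projections belonging to the pairing operation, so that $p_1(r,r) \simeq p_2(r,r) \simeq r$, and that membership in P(V) means being total with values in $\{0,1\}$; neither assumption is stated explicitly in this excerpt.

```latex
% Sketch of the diagonal argument for the Russell type R.
% Assumptions (not taken from the text): p_1(r,r) = p_2(r,r) = r,
% and r \in P(V) means that r is total with values in {0,1}.
\begin{align*}
r \in R
  &\iff (r,r) \notin Q
     && \text{definition of } R \\
  &\iff \neg\bigl(p_1(r,r) \in P(V) \wedge (p_1(r,r))(p_2(r,r)) \simeq 1\bigr)
     && \text{definition of } Q \\
  &\iff \neg\,(rr \simeq 1)
     && \text{since } r \in P(V) \\
  &\iff rr \simeq 0
     && \text{since } rr \in \{0,1\}.
\end{align*}
% On the other hand r witnesses Comp(R), so  r \in R \iff rr \simeq 1.
% Together:  rr \simeq 1 \iff rr \simeq 0,  which is impossible.
```

The argument uses (J) only to obtain the single type Q that collects all types of the form $\{z : fz \simeq 1\}$ uniformly in f.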
The notion of computable type can be relativized in a very natural way: The type Y is said to be computable over the type X if Y is a subtype of X and if it has a characteristic function on X.
Definition 9 Relative computability of types is defined by:
$Comp(X,Y) :\Leftrightarrow Y \subset X \wedge \exists f \in P(X)\, \forall z \in X\, (z \in Y \leftrightarrow fz \simeq 1)$
It is easy to check that relativized computability is transitive: If Y is computable over X and Z is computable over Y, then Z is computable over X. Also, X is computable if and only if it is computable over V.
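The transitivity claim can be checked directly. The following is a minimal sketch, assuming that the applicative base theory provides a definition by cases on $0,1$ that does not evaluate the unused branch, so that the operation $h$ below exists; the names $f$, $g$, $h$ are chosen here for illustration only.

```latex
% Sketch: Comp(X,Y) and Comp(Y,Z) imply Comp(X,Z).
% Assumption: definition by cases is available, so h below is an operation.
\begin{itemize}
  \item Let $f \in P(X)$ witness $Comp(X,Y)$ and $g \in P(Y)$ witness $Comp(Y,Z)$.
        In particular $Z \subset Y \subset X$.
  \item Define, for $z \in X$,
        \[
          hz \simeq
          \begin{cases}
            gz & \text{if } fz \simeq 1, \\
            0  & \text{if } fz \simeq 0.
          \end{cases}
        \]
        If $fz \simeq 1$ then $z \in Y$, so $gz$ is defined with value in $\{0,1\}$;
        hence $h \in P(X)$.
  \item If $z \in Z$, then $z \in Y$, so $fz \simeq 1$ and $hz \simeq gz \simeq 1$.
  \item If $hz \simeq 1$, then $fz \simeq 1$ and $gz \simeq 1$, so $z \in Y$ and $z \in Z$.
\end{itemize}
% Hence  \forall z \in X (z \in Z \leftrightarrow hz \simeq 1),  i.e. Comp(X,Z).
```

The remark about V is immediate from the definitions: $X \subset V$ holds trivially, so $Comp(V,X)$ and $Comp(X)$ are literally the same condition.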
Computability over N is particularly interesting. The following observation is implicit in Feferman [10,13]. There are models of $(APP)^+ + (CA_2) + (J) + (N)_4$ such that the types computable over N are recursive (hyperarithmetic, $\Delta^1_2$, ...) subsets of the natural numbers. With reference to the previous theorem it is noteworthy that we can consistently assume that all subtypes of N are computable over N.
Theorem 10 $(APP)^+ + (CA_2) + (J) + (N)_4 + (\forall X \subset N)\,Comp(N,X)$ is consistent. Moreover, there is a model of $T_0 + (CA_2)$ such that the types computable over N are exactly the subsets of the natural numbers.
4.6 Universes
Universes are collections of objects which satisfy certain (natural) closure conditions. They have been introduced into many different parts of mathematics and gained importance in connection with foundational questions and the notion of predicative/impredicative definitions. In category theory, universes appear in the form of small and large categories. Martin-Löf [26] is interested in universes as a fundamental principle of constructive mathematics. Jäger [22,23,24] deals with so-called admissible universes in order to design theories for admissible sets of the strength of predicative and constructive mathematics. Feferman [14] uses theories for explicit mathematics with universes for the proof of Hancock's conjecture. The role of universes for proof development systems à la Nuprl and computer science in general is discussed in Constable et al. [5] and Cardelli-Wegner [4].
The relativization $\varphi^A$ of a formula $\varphi$ to a type A is obtained by replacing each type quantifier $QX(\ldots)$ in $\varphi$ by $(QX \in A)(\ldots)$. If $\varphi$ is a formula with the free variables $X_1, \ldots, X_n$, then we write $\varphi(A)$ for the universal closure of
$X_1 \in A \wedge \ldots \wedge X_n \in A \rightarrow \varphi^A$.
Definition 11 Let J be a collection of type constants and K a collection of formulas. Then the type D is called a (J,K) universe if it satisfies the following conditions.
1. Every element of D is a type.
2. D contains all elements of J.
3. D reflects all formulas of K, i.e. $\varphi(D)$ is true for every $\varphi \in K$.
We write $U_{J,K}(D)$ to express that D is a (J,K) universe. The (J,K) limit axiom is the sentence
$(Lim_{J,K})$ $\forall X\, \exists Y\, (X \in Y \wedge U_{J,K}(Y))$.
Not much is known about arbitrary (J,K) universes. Proof theory was mainly concerned with the special case $J = \{N\}$ and $K = (ECA) + (J)$.
Theorem 12 If $J = \{N\}$ and $K = (ECA) + (J)$, then we have the following proof-theoretic results.
1. $(APP)^+ + (ECA) + (J) + (Lim_{J,K}) + (N)_3$ has the proof-theoretic strength $\Gamma_0$, i.e. the strength of predicative mathematics.
2. $(APP)^+ + (ECA) + (J) + (Lim_{J,K}) + (N)_4$ has the proof-theoretic strength $\Gamma_{\varepsilon_0}$.
A proof of this theorem is implicit in Jäger [23]. The corresponding results for theories with universes and the induction principles $(N)_1$ and $(N)_2$ will be published elsewhere.
References
[1] M.J. Beeson, Foundations of Constructive Mathematics, Springer, Berlin, 1985.
[2] E. Bishop, Foundations of Constructive Analysis, McGraw-Hill, New York, 1967.
[3] W. Buchholz, S. Feferman, W. Pohlers, W. Sieg, Iterated Inductive Definitions and Subsystems of Analysis: Recent Proof-Theoretical Studies, Lecture Notes in Mathematics 897, Springer, Berlin, 1981.
[4] L. Cardelli and P. Wegner, On understanding types, data abstraction and polymorphism, Computing Surveys 17 (1985).
[5] R.L. Constable, S.F. Allen, H.M. Bromley, W.R. Cleaveland, J.F. Cremer, R.W. Harper, D.J. Howe, T.B. Knoblock, N.P. Mendler, P. Panangaden, J.T. Sasaki and S.F. Smith, Implementing Mathematics with the Nuprl Proof Development System, Prentice-Hall, Englewood Cliffs, NJ, 1986.
[6] Th. Coquand and G. Huet, Constructions: A higher order proof system for mechanizing mathematics, in: Proceedings EUROCAL 85, Lecture Notes in Computer Science 203, Springer, Berlin, 1985.
[7] N.G. de Bruijn, A survey of AUTOMATH, in: J.P. Seldin and J.R. Hindley (eds.), To H.B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism, Academic Press, New York, 1980.
[8] M. Gordon, R. Milner and C. Wadsworth, Edinburgh LCF, Lecture Notes in Computer Science 78, Springer, Berlin, 1979.
[9] E. Engeler, Algebras and combinators, Algebra Universalis 13 (1981).
[10] S. Feferman, A language and axioms for explicit mathematics, in: Algebra and Logic, Lecture Notes in Mathematics 450, Springer, Berlin, 1975.
[11] S. Feferman, Theories of finite type, in: J. Barwise (ed.), Handbook of Mathematical Logic, North-Holland, Amsterdam, 1977.
[12] S. Feferman, Recursion theory and set theory: a marriage of convenience, in: Generalized Recursion Theory II, North-Holland, Amsterdam, 1978.
[13] S. Feferman, Constructive theories of functions and classes, in: M. Boffa, D. van Dalen, K. McAloon (eds.), Proceedings of Logic Colloquium '78, North-Holland, Amsterdam, 1979.
[14] S. Feferman, Iterated fixed-point theories: application to Hancock's conjecture, in: G. Metakides (ed.), The Patras Symposium, North-Holland, Amsterdam, 1982.
[15] S. Feferman, A theory of variable types, Revista Colombiana de Matemáticas XIX (1985).
[16] S. Feferman, A constructive theory of functions and classes as a framework for polymorphism, Preprint, Stanford, 1986.
[17] S. Feferman, Weyl vindicated: "Das Kontinuum" 70 years later, Preprint, Stanford, 1987.
[18] J.-Y. Girard, Une extension de l'interprétation de Gödel à l'analyse, et son application à l'élimination des coupures dans l'analyse et la théorie des types, in: Proceedings 2nd Scandinavian Logic Symposium, North-Holland, Amsterdam, 1971.
[19] J.-Y. Girard, The system F of variable types, fifteen years later, Theoretical Computer Science 45 (1986).
[20] S. Hayashi and H. Nakano, PX a computational logic, Report Research Institute for Mathematical Sciences, Kyoto University, Kyoto, 1987.
[21] W. Howard, The formulas-as-types notion of construction, in: J.P. Seldin and J.R. Hindley (eds.), To H.B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism, Academic Press, Amsterdam, 1980.
[22] G. Jäger, Iterating admissibility in proof theory, in: J. Stern (ed.), Proceedings of the Herbrand Symposium, Logic Colloquium '81, North-Holland, Amsterdam, 1982.
[23] G. Jäger, The strength of admissibility without foundation, Journal of Symbolic Logic 49 (1984).
[24] G. Jäger, Theories for Admissible Sets: A Unifying Approach to Proof Theory, Bibliopolis, Napoli, 1986.
[25] G. Jäger, Induction in the elementary theory of types and names, to appear in: Proceedings Logik in der Informatik, Lecture Notes in Computer Science ?, Springer, Berlin, 198?, preprint, 1988.
[26] P. Martin-Löf, Constructive mathematics and computer programming, in: Proceedings 6th International Congress for Logic, Methodology and Philosophy of Science, North-Holland, Amsterdam, 1982.
[27] B. Nordström and K. Petersson, Types and specifications, in: Proceedings IFIP '83, North-Holland, Amsterdam, 1983.
[28] B. Nordström and J. Smith, Propositions, types, and specifications in Martin-Löf's type theory, BIT 24, n. 3 (1984).
[29] G.D. Plotkin, A set-theoretical definition of application, School of A.I., Memo MIP-R-95, 1972.
[30] J.C. Reynolds, Towards a theory of type structure, in: Proceedings Colloque sur la Programmation, Lecture Notes in Computer Science 19, Springer, Berlin, 1974.
[31] J.C. Reynolds, Polymorphism is non set-theoretic, in: Internat. Symp. on Semantics of Data Types, Lecture Notes in Computer Science 173, Springer, Berlin, 1984.
[32] D.S. Scott, Continuous lattices, in: F.W. Lawvere (ed.), Toposes, Algebraic Geometry and Logic, Lecture Notes in Mathematics 274, Springer, Berlin, 1972.
[33] D.S. Scott, Domains for denotational semantics, in: Proceedings ICALP '82, Lecture Notes in Computer Science 140, Springer, Berlin, 1982.
[34] J.E. Stoy, Denotational Semantics: The Scott-Strachey Approach to Programming Language Theory, MIT Press, Cambridge, MA, 1977.
[35] A.S. Troelstra, Notes on intuitionistic second order arithmetic, in: A.R.D. Mathias and H. Rogers (eds.), Cambridge Summer School in Mathematical Logic, Lecture Notes in Mathematics 337, Springer, Berlin, 1973.
[36] A.S. Troelstra, Metamathematical Investigation of Intuitionistic Arithmetic and Analysis, Lecture Notes in Mathematics 344, Springer, Berlin, 1973.
Logic Colloquium '87, H.-D. Ebbinghaus et al. (Editors), © Elsevier Science Publishers B.V. (North-Holland), 1989
AN INTRODUCTION TO EXTENDERS AND CORE MODELS FOR EXTENDER SEQUENCES
Peter Koepke
Universität Freiburg, West Germany
Abstract. This article surveys the theory of core models for non-overlapping coherent sequences of extenders. These models of set theory arise canonically when one wants to construct inner models for strong cardinals. Strong cardinals are defined in terms of elementary embeddings of V, and we define extenders as a way of coding elementary embeddings. The natural inner model for a strong cardinal is of the form L[E], where, in L[E], E is a coherent sequence of extenders. We obtain such sequences $E = \langle E_{\kappa\nu} \rangle$ recursively if each $E_{\kappa\nu}$ is a suitable extender on the core model $K[E \restriction \langle\kappa,\nu\rangle]$. We give a definition of the core model K[F] from iterable premice, which are small "L[E]-like" structures. Without proofs, we cite the fundamental properties of this family of core models. If a model L[E] for a strong cardinal does not exist ("$\neg L[E]$") we can define a canonical core model $K[F_{can}]$. $K[F_{can}]$ is the largest core model. Assuming $\neg L[E]$, $K[F_{can}]$ satisfies analogues of the properties of L when $0^\#$ does not exist: $K[F_{can}]$ is rigid, i.e., there is no non-trivial elementary embedding $\pi\colon K[F_{can}] \to K[F_{can}]$, and $K[F_{can}]$ satisfies a weak covering theorem. The coarse (= non-finestructural) characterisation of core models together with some fundamental facts on core models suffice for certain applications. We show: If the existence of a successor cardinal which is Jonsson is consistent, then the existence of a strong cardinal is consistent (relative to ZFC).
Introduction. The study of consistency strengths in axiomatic set theory involves the construction of certain models of set theory out of given models. The method of forcing allows one to extend ground models by generic sets. Smaller models are obtained by forming inner models of set theory within given universes. The well-established program of reducing consistency strengths to large cardinal axioms (see [6]) typically proceeds as follows: To show that a principle A is equiconsistent with a large cardinal axiom B (relative to ZFC) one (a) gives a forcing construction of a model of ZFC+A from a ground model of ZFC+B, and (b) shows that within any model of ZFC+A we can define an inner model of ZFC+B. Part (b) corresponds to the intuition that by suitably restricting the universe large cardinal properties should grow stronger.
Gödel's model L of constructible sets is the paradigm of an inner model. It is distinguished as being the smallest inner model of ZF. L is definable as the union of a hierarchy whose structure is exceedingly uniform. L is a model of the generalized continuum hypothesis (GCH). Jensen's finestructure analysis of the $L_\alpha$-hierarchy produced many important combinatorial principles in L. Unfortunately, L's role as an inner model for large cardinals is restricted since there are no measurable cardinals in L [10]. The decisive step towards larger inner models with a constructible structure was taken by Dodd and Jensen in defining the core model K [4]. Large cardinal properties below measurability are compatible with "V = K". The definition of K readily generalizes to include measurable cardinals, and also measurable cardinals of higher order (see [9]).
In this paper we consider the notion of a strong cardinal, which was introduced by R. Jensen. Like measurables, these cardinals are defined using elementary embeddings of the universe. We define an inner model $K[F_{can}]$ which captures the large cardinal strength of the universe below a strong cardinal. $K[F_{can}]$ relates to a strong cardinal much like K relates to a measurable cardinal. $K[F_{can}]$ is the canonical member of a whole family of inner models, the family of core models for non-overlapping extender sequences. Extenders are a generalization of the concept of a normal measure on a measurable cardinal. They were invented by A. Dodd, R. Jensen, and W. Mitchell. This family of core models satisfies natural embedding properties, and a weak covering theorem holds for $K[F_{can}]$. We apply these in establishing a lower bound for a consistency strength:
Theorem. Assume that the existence of a successor cardinal which is Jonsson is consistent with ZFC. Then the existence of a strong cardinal is consistent with ZFC.
Although the core models for extender sequences are defined here without using any finestructure, the deeper results depend heavily on finestructural techniques. So far, these techniques are of much greater complexity than the finestructure for L. However, ongoing research indicates that by convenient re-structuring of $K[F_{can}]$ as the union of some hierarchy one can get along with an "L-like" finestructure theory.
This paper is structured as follows: First we introduce strong cardinals and show that they are of maximal order of measurability. Extenders are introduced in §2. In §3 we consider extendability, i.e. whether an extender possesses an extension map with a transitive target model. Next we show how extenders can be used to approximate arbitrary elementary embeddings. This is used to re-formulate the notion of strongness within ZFC. §5 introduces the model L[E] for a strong cardinal. To test the "L-likeness" of L[E] we begin to prove the continuum hypothesis (CH) in L[E] and are led to the consideration of iterable premice. §6 studies iterable structures, and §7 develops (part of) the theory of iterable premice. With this we are able to conclude the proof of CH in
L[E]. §8 contains an informal argument why one is led to consider core models if one wants to recursively build a model L[E] for a strong cardinal. The next section quotes the fundamental theorems on these core models, and finally in §10 we prove the result on the consistency strength of successor Jonsson cardinals, strengthening a result of [7]. The theory of extenders presented here is basically that of R. Jensen in [5].
Throughout this paper we use standard set theoretical notation. We work in ZFC, i.e., Zermelo-Fraenkel set theory with the axiom of choice. By $ZFC^-$ we denote the system ZFC but without the powerset axiom. Relative constructibility is done with the J-hierarchy. If $\vec A$ is a finite sequence of predicates, let $rud_{\vec A}(X)$ denote the closure of X under the rudimentary functions and the functions $x \mapsto x \cap A_i$ (see [2]). The $J[\vec A]$-hierarchy is defined as:
$J_0[\vec A] = \emptyset$;
$J_{\alpha+1}[\vec A] = rud_{\vec A}(J_\alpha[\vec A] \cup \{J_\alpha[\vec A]\})$;
$J_\lambda[\vec A] = \bigcup_{\alpha<\lambda} J_\alpha[\vec A]$ for limit $\lambda$.
Then $L[\vec A] = \bigcup_{\alpha \in On} J_\alpha[\vec A]$. We also write $J_\alpha[\vec A]$ and $L[\vec A]$ for the corresponding structures equipped with the predicates $\vec A$. A structure $\langle M, \vec B\rangle$ is amenable if $\forall x \in M\; x \cap B_i \in M$. So $J_\alpha[\vec A]$ and $L[\vec A]$ are amenable.
A map $\pi\colon M \to N$ where M, N are $\in$-structures is called cofinal if $\forall y \in N\, \exists x \in M\; y \in_N \pi(x)$. If $\pi\colon M \to N$ is $\Sigma_0$-elementary (we write $\pi\colon M \to_{\Sigma_0} N$ for this) and cofinal, then $\pi\colon M \to N$ is $\Sigma_1$-elementary. $\Sigma_0$-elementary cofinal maps allow us to map amenable predicates: Let $\langle M, \vec A\rangle$ be amenable, and let $\pi\colon M \to_{\Sigma_0} N$ be cofinal, N transitive. Then there is a unique sequence $\vec B$ such that $\pi\colon \langle M, \vec A\rangle \to \langle N, \vec B\rangle$ is $\Sigma_0$-elementary; $\vec B$ is defined by $B_i = \bigcup\{\pi(x \cap A_i) \mid x \in M\}$. $\langle N, \vec B\rangle$ is amenable, and we define: $\pi(A_i) := B_i$.
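One step behind the claims about the image predicates $B_i = \bigcup\{\pi(x \cap A_i) \mid x \in M\}$ is that each $B_i$ restricts correctly to images of elements of M. The derivation below is a sketch of that single step. It assumes that M is closed under binary intersection (as rudimentarily closed structures are), and it uses that $x \cap A_i \in M$ by amenability and that "$z = u \cap v$" and "$u \subseteq v$" are $\Sigma_0$ statements preserved by $\pi$.

```latex
% Claim (sketch): for every x in M,   \pi(x) \cap B_i = \pi(x \cap A_i),
% where  B_i = \bigcup \{ \pi(x' \cap A_i) : x' \in M \}.
% Assumption: M is closed under binary intersection.
\begin{align*}
&(\supseteq)\quad x \cap A_i \subseteq x \text{ gives } \pi(x \cap A_i) \subseteq \pi(x),
  \text{ and } \pi(x \cap A_i) \subseteq B_i \text{ by definition of } B_i. \\[2pt]
&(\subseteq)\quad \text{Let } y \in \pi(x) \cap B_i \text{ and pick } x' \in M
  \text{ with } y \in \pi(x' \cap A_i). \text{ Then} \\
&\qquad y \in \pi(x) \cap \pi(x' \cap A_i)
     = \pi\bigl(x \cap (x' \cap A_i)\bigr)
     \subseteq \pi(x \cap A_i).
\end{align*}
```

The identity says that $\pi$ respects the atomic statements "$u = x \cap A_i$", which is the basic case in checking that $\pi$ stays $\Sigma_0$-elementary for the expanded structures.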
1. Strong Cardinals.
A most important unifying principle in the theory of large cardinals is their characterisation in terms of elementary embeddings. Scott [10] proved: $\kappa$ is a measurable cardinal iff there exists an elementary embedding $\pi\colon V \to M$ where M is transitive, $\pi \restriction \kappa = id$, and $\pi(\kappa) > \kappa$.
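One direction of Scott's characterisation can be checked directly: an elementary embedding with critical point $\kappa$ yields a measure on $\kappa$. The block below recalls the standard verification as a sketch; the name $U$ for the derived ultrafilter is chosen here for illustration.

```latex
% Sketch: from \pi : V \to M with \pi \restriction \kappa = id and \pi(\kappa) > \kappa,
% define   U = \{ x \subseteq \kappa : \kappa \in \pi(x) \}.
\begin{itemize}
  \item $\kappa \in U$, since $\kappa \in \pi(\kappa)$; and $\emptyset \notin U$,
        since $\pi(\emptyset) = \emptyset$.
  \item Upward closed: $x \subseteq y \subseteq \kappa$ implies $\pi(x) \subseteq \pi(y)$.
  \item Ultrafilter: $\pi(\kappa) = \pi(x) \cup \pi(\kappa \setminus x)$ and
        $\pi(x) \cap \pi(\kappa \setminus x) = \emptyset$, so $\kappa$ lies in exactly one
        of $\pi(x)$, $\pi(\kappa \setminus x)$.
  \item Non-principal: for $\alpha < \kappa$,
        $\pi(\{\alpha\}) = \{\pi(\alpha)\} = \{\alpha\}$, which does not contain $\kappa$.
  \item $\kappa$-complete: if $\gamma < \kappa$ and $x_\xi \in U$ for $\xi < \gamma$, then
        $\pi(\langle x_\xi \mid \xi < \gamma\rangle) = \langle \pi(x_\xi) \mid \xi < \gamma\rangle$
        because $\pi(\gamma) = \gamma$ and $\pi \restriction \gamma = id$; hence
        $\pi\bigl(\bigcap_{\xi<\gamma} x_\xi\bigr) = \bigcap_{\xi<\gamma} \pi(x_\xi) \ni \kappa$.
\end{itemize}
```

The converse direction takes the ultrapower of V by a $\kappa$-complete non-principal ultrafilter and collapses it to a transitive class.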
Measurability can be strengthened by stipulating that M has certain largeness properties, e.g. by requiring that M satisfies some closure conditions or that M contains certain sets. This way, one obtains most of the large cardinal notions above measurability. Let us fix some notation: $\pi\colon M \to N$ at $\kappa$ means that $\pi$ is an elementary embedding between transitive $\in$-structures M, N, $\kappa \in M$, $\pi \restriction \kappa = id$, and $\pi(\kappa) > \kappa$. $\kappa$ is called the critical point of $\pi$. A particularly natural strengthening of measurability is given by:
1.1 Definition. A cardinal $\kappa$ is strong if for all $x \in V$ there exists $\pi\colon V \to M$ at $\kappa$ such that $x \in M$ and $\pi(\kappa) > rank(x)$.
One could show that the clause $\pi(\kappa) > rank(x)$ can be omitted from the definition. If $\kappa$ is strong, it is measurable. Indeed, its Mitchell-order $o(\kappa)$ of measurability takes the largest possible value (see [8]):
1.2. Theorem. Let $\kappa$ be strong. Then $o(\kappa) = (2^\kappa)^+$.
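Proof. $o(\kappa) \le (2^\kappa)^+$ is Proposition 1.2 of [8]. For the converse, take an ordinal $\alpha > \kappa$ such that $V_\alpha$ reflects sufficiently many properties of V. In particular, the functions $\nu \mapsto o(\nu)$ and $\nu \mapsto 2^\nu$ should be absolute between $V_\alpha$ and V. Take $\pi\colon V \to N$ at $\kappa$ such that $V_\alpha \subset N$. It suffices to show that $o^N(\kappa) \ge ((2^\kappa)^+)^N$; then, since $V_\alpha \subset N$, $o(\kappa) = o^N(\kappa) \ge ((2^\kappa)^+)^N = (2^\kappa)^+$. Let $U := \{x \subset \kappa \mid \kappa \in \pi(x)\}$. Let $\sigma\colon V \to M$ at $\kappa$ be the ultrapower of V by the measure U. Define $\tau\colon M \to N$ by $\sigma(f)(\kappa) \mapsto \pi(f)(\kappa)$. $\tau$ is elementary, and the diagram formed by $\sigma\colon V \to M$, $\tau\colon M \to N$ and $\pi\colon V \to N$ commutes, i.e. $\pi = \tau \circ \sigma$.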
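(1) $P(\kappa) \cap M = P(\kappa) = P(\kappa) \cap N$.
(2) $\tau \restriction \kappa+1 = id$, by the definition of $\tau$.
(3) $\tau \ne id \restriction M$, since $U \notin M$.
Let $\delta$ be the critical point of $\tau$.
(4) $\delta > (2^\kappa)^M$.
Proof. Assume $\delta \le (2^\kappa)^M$ instead. Let $f\colon (2^\kappa)^M \to P(\kappa)$ be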
bijective, feM. T(f): (2K)N- P(K) is bijective. Since Gerange(r), we have for c ( ( 2 ) 1 Proof: Assume not. Then (2K)M 5 6 < ( ( 2 K ) + ) M . There is fcM, such that f:G-P(K) is bijective. This gives a contradiction as in (4).
qed (5).
(6) $\tau \restriction o^M(\kappa) = id$.
Proof: Let
t-+ is z 0-elementary and is amenable for all a c [ ~ ] < ~ Then . P(K)nM = P(K)nN. Proof. (c). Let X€P(K)nM. Then x=n(X)nKeN. ( 3 ) . Let xeP(K)nN. x = n(f)(K.a)
for some ae[v] lRWf, (f,U),. . . ,f;lC(t.u)))cBc j ea(x), since <x,c> is one of the <x ,a.>;
3.4. Theorem. Let E be a countably complete extender on M. Then M is extendable by E.
Proof: Assume M, E were a counterexample. We can assume that M is a set: The proof of 2.6 shows that extendability is equivalent to the well-foundedness of a certain relation $\tilde\in$. So $\tilde\in$ formed for M, E is ill-founded; $\tilde\in$ formed for some initial part of M which is a set is ill-founded, hence there is a set counterexample to the theorem. Let $M, E \in H_\theta$, $\theta$ a sufficiently big regular cardinal. Let $\sigma\colon \bar H \to_{\Sigma_\omega} H_\theta$ such that $\bar H$ is transitive, $\bar H$ countable, and $\sigma(\bar M, \bar E, \bar\kappa, \bar\nu) = M, E, \kappa, \nu$. $\bar E$ is an extender on $\bar M$. In $\bar H$, $\bar M$ is not extendable by $\bar E$. Since extensions are uniquely determined up to isomorphism (2.5), $\bar M$ is not extendable by $\bar E$ in V. But the assumptions of Lemma 3.3 are satisfied, hence $\bar M$ is extendable by $\bar E$. Contradiction! QED
The following provides us with countably complete extenders: 3.5 Theorem. Let V be extendable by the extender E at K , V . Then E is countably complete. Proof. Let R : V + ~ M and E'=n(E). Let xieEai, for i e n ( x ~Set ) . q : = n -1rU{n(ai)li<w}. Then: n(ai) (1) r):U{n(ai) I i<w}+n(K) is orderpreserving, and ~n(x~). for i<w. We express the existence of such an r ) as the non-wellfoundedness of a certain partial order <W,>> where WEM. Let W = {fl3n<w vi is ill-founded in V. So n (xi) E E '
-
<W.>> is illfounded in M, and there is ;EM which satisfies (1). So, with 6 = t): M c ~ I + ( K ) 36:U{n(ai)li E Xfor ~ i<w. QED .
We conclude this chapter with two lemmas showing that extendability may be preserved when we change the structure on which a certain extender lives:
3.6 Lemma. Let V be extendable by an extender E at $\kappa,\nu$; let $\pi\colon V \to_E N$. Let M be suitable such that ${}^{\kappa}M \subset M$. Then M is extendable by E, and $\pi \restriction M\colon M \to_E M'$, where $M' = \bigcup\{\pi(x) \mid x \in M\}$.
Proof: ${}^{\kappa}M \subset M$ implies: if $\pi(f)(\kappa,a) \in \pi(g)(\kappa,b)$, $g \in M$, then there is $f' \in M$ such that $\pi(f)(\kappa,a) = \pi(f')(\kappa,a)$. This shows $\pi \restriction M\colon M \to_E M'$. QED
3.7 Lemma. Let E be an extender at $\kappa,\nu$. Then V is extendable by E iff $H_{\kappa^+}$ is extendable by E.
Proof: Assume n:V+M is an extension of V by E. where M is not well-founded. So there are fi,ai,ai"~]sup(z) such that V is extendable by E and if n:V-rEM then zeM. This definition is informally equivalent to Definition 1.1; the following argument could be formalized in a set theory which allows to quantify over classes, e.g. Gdel-Bernays set theory. Suppose K is strong in the sense of 1.1. For zeV, zcOn there is n:V+N at K , such that z c N , sup(z) E F ) . Let dom(F):={lv€F}. For ~ , p € Odefine n FrK:={<x.a,A ,v>EF~AEFlh be amenable. We say that , ~dom(F), if the following <M,F> is coherent at ~ , p where holds : Let n:<M,F+F < M ' , F t > where , F t =IT(F). Then F'rK+1 = Fr, KP i.e. oFrK = oF,PK, oF,( K ) = p , and Fhv = F,!! for all < h . v > E dom(F'), A S K . <M,F> is coherent if <M,F> is coherent at K , U for all ~dom(F). We will see in 6.1-6.3 that being a coherent sequence of extenders is first-order definable, modulo the extendability of <M,F>. Coherency allows to cut off the extenders with . If critical point K at some X , P through an extension by F K 9M , can try to push some undesired property of F holds at ~ , p we this problem up by extending at K . P . By transfinite iteration of this method we are sometimes able to smooth out some problem until it vanishes.
Coherency allows one to define a model for a strong cardinal:
5.4. Theorem. Let $L[F] = \langle L[F], F\rangle$ be a coherent structure such that $o^F(\kappa) = \infty$. Then $L[F] \models$ "$\kappa$ is strong".
Proof. We can assume that $F = F \cap L[F]$. Let $x \in L[F]$; $x \in J_\alpha[F]$ for some $\alpha \in On$. In L[F], form $\pi\colon L[F] \to_{F_{\kappa,\omega\alpha}} L[F']$. $J_\alpha[F] = J_\alpha[F']$ by the coherency of L[F] at $\kappa, \omega\alpha$. Hence $x \in J_\alpha[F] = J_\alpha[F'] \subset L[F']$. QED
The above model L[F] can be shown to be very "L-like": it satisfies the generalized continuum hypothesis (GCH) and various other combinatorial properties of L. As a test for L-likeness we intend to give a proof of the ordinary continuum hypothesis $2^{\aleph_0} = \omega_1$ (CH). We carry the proof up to the point where we are led to the notion of an iterable premouse. We develop the theory of iterable premice in the subsequent two chapters, and we are then able to conclude the proof of CH in L[F].
5.5 Theorem. Let L[F] be a coherent structure. Then $L[F] \models CH$.
We give the initial steps of the proof: Work in L[F] and assume V = L[F]. For $a \subset \omega$ take an $N_a = J_{\alpha(a)}[F]$ with $a \in N_a$. Let $X_a \prec N_a$ such that $a \in X_a$ and $X_a$ is countable. Let $\sigma_a\colon M_a \cong X_a$ where $M_a$ is transitive. Then $M_a$ is of the form $M_a = J_{\beta(a)}[F_a]$, with $\beta(a) < \omega_1$, and $a \in M_a$.
In general, $F_a$ will be incompatible with $F_b$, so that we cannot conclude the proof as in the L-case. We have to develop methods to compare different $M_a$ and $M_b$: $M_a$ will be an iterable premouse. We study these structures in the subsequent chapters and postpone the conclusion of the proof of 5.5 until chapter 7.
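For comparison, the L-case alluded to here runs as follows. This is the standard condensation argument for CH in L, recalled as background rather than taken from this paper; it is phrased for the $L_\alpha$-hierarchy instead of the J-hierarchy used in the text.

```latex
% Background sketch (standard, not from the text): L satisfies CH.
\begin{enumerate}
  \item Work in $L$. Let $a \subseteq \omega$ and pick a limit ordinal
        $\alpha > \omega$ with $a \in L_\alpha$.
  \item Choose a countable $X \prec L_\alpha$ with $a \in X$
        (L\"owenheim--Skolem).
  \item Let $\sigma\colon M \cong X$ with $M$ transitive. By the condensation lemma,
        $M = L_\beta$ for some $\beta < \omega_1$.
  \item Every natural number is definable in $L_\alpha$, so $\omega \subseteq X$ and
        $\sigma \restriction (\omega + 1) = \mathrm{id}$; hence
        $\sigma^{-1}(a) = a$ and $a \in L_\beta \subseteq L_{\omega_1}$.
  \item Thus $P(\omega) \subseteq L_{\omega_1}$, and $|L_{\omega_1}| = \omega_1$,
        so $2^{\omega} = \omega_1$.
\end{enumerate}
```

In step 3 all collapses land in the same hierarchy $\langle L_\beta \rangle$, which is what makes the counting in step 5 possible. With a predicate F the collapses are of the form $J_{\beta(a)}[F_a]$ with possibly different predicates $F_a$, which is why a comparison method is needed.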
6. Iterable Structures.
The structures $M_a = J_{\beta(a)}[F_a]$ formed in the CH-argument are extendable, since $\sigma_a\colon M_a \to_{\Sigma_\omega} J_{\alpha(a)}[F]$. Moreover, the target models of the resulting extensions are extendable, and so on. This process can be carried on into the transfinite: $M_a$ is iterable. We will study iterability for structures $\langle M,F\rangle$ where F is an extender sequence on M and $\langle M,F\rangle$ is amenable. We first convince ourselves that being a (coherent) extender sequence on M is uniformly $\Pi_1(M)$, so that along an iteration the predicates of the target models are again (coherent) extender sequences.
6.1 Lemma. There is a $\Pi_1$-formula $\theta_1(v,w)$ such that for all suitable structures M, all $\kappa,\nu \in On \cap M$, and $E = \langle E_a \mid a \in [\nu]^{<\omega}\rangle$: E is an extender on M at $\kappa,\nu$ iff $\langle M, \tilde E\rangle \models \theta_1(\kappa,\nu)$, where $\tilde E = \{\langle x,a\rangle \mid x \in E_a\}$.
Proof. (Sketch). We can view the extender axioms E1-E5 as axioms for the structure $\langle M, \tilde E\rangle$, and we have to show that E1-E5 on $\langle M, \tilde E\rangle$ are uniformly equivalent to $\Pi_1^{\langle M, \tilde E\rangle}$ statements in $\kappa,\nu$. Now E1-E5 have an obvious $\Pi_1$-structure, except for the use of notions like $[\nu]^{<\omega}$, $\kappa \times [\kappa]^{|a|}$, $x^{ab}$, $f^{ab}$. Some detailed study of these operations shows that they are closely related to rudimentary operations, and that in E1-E5 they are used in a $\Delta_1$-manner. The complete argument is somewhat involved and is omitted for the sake of brevity. QED
As an immediate corollary we get:
6.2. Lemma. There is a $\Pi_1$-sentence $\theta_2$ such that for all suitable structures M and all natural predicates F: F is a sequence of extenders on M iff $\langle M,F\rangle \models \theta_2$.
The coherency of $\langle M,F\rangle$ at $\kappa,\nu$, where $\langle M,F\rangle$ is amenable, can be expressed by two axioms C1 and C2 as follows:
C2 expresses that o ~ , ( K ) < uWe . transform this, using Los Theorem : tt
vt(t,> be an iteration as above. Then: (i) The iteration is uniquely determined by the sequence of indices.
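(ii) Each $\pi_{ij}\colon \langle M_i,F_i\rangle \to \langle M_j,F_j\rangle$ is $\Sigma_1$-elementary.
(iii) If $\kappa < \kappa_i$ for all $i+1 < \theta$ then $\pi_{0j} \restriction \kappa = id$ and $P(\kappa) \cap M_0 = P(\kappa) \cap M_j$ for all $j < \theta$.
Proof: (iii) follows from 2.8 and simple properties of direct limits. QED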
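6.7. Definition. An extender structure $\langle M,F\rangle$ is called iterable if for every iteration $\langle\langle M_i,F_i\rangle, \pi_{ij} \mid i \le j < \theta\rangle$ of $\langle M,F\rangle$:
(i) if $\theta = \bar\theta + 1$ then $\langle M_{\bar\theta}, F_{\bar\theta}\rangle$ is extendable,
(ii) if $\theta$ is a limit ordinal then the direct limit of $\langle\langle M_i,F_i\rangle, \pi_{ij} \mid i \le j < \theta\rangle$ is well-founded.
This means that every iteration of $\langle M,F\rangle$ can be freely continued. We consider some criteria for iterability:
6.8. Lemma. Let $\sigma\colon \langle M,F\rangle \to \langle N,G\rangle$ be a $\Sigma_0$-elementary embedding of extender structures, and let $I = \langle\langle M_i,F_i\rangle, \pi_{ij} \mid i \le j < \theta\rangle$ be an iteration of M with indices $\langle\langle\kappa_i,\nu_i\rangle \mid i+1 < \theta\rangle$. Let $\langle N,G\rangle$ be iterable.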
(ii) $\sigma_0 = \sigma$.
(iii) $\langle\kappa'_i,\nu'_i\rangle = \sigma_i(\langle\kappa_i,\nu_i\rangle)$, for $i+1 < \theta$.
(iv) $\sigma_j \circ \pi_{ij} = \pi'_{ij} \circ \sigma_i$ for $i \le j < \theta$.
(v) $I'$, S are uniquely determined by (a)-(d).
Proof. $I'$, S are constructed recursively. At limit stages form the obvious limits. At successor stages use 3.1. QED
An immediate corollary is:
6.9. Lemma. Let $\sigma\colon \langle M,F\rangle \to \langle N,G\rangle$ be a $\Sigma_0$-elementary embedding of extender structures and assume that $\langle N,G\rangle$ is iterable. Then $\langle M,F\rangle$ is iterable.
Countable completeness implies iterability: 6.10. Theorem. Let a:<M,F>+be a Z0-elementary embedding of extenderstructures, and assume that G is countably complete. Then <M.F> is iterable. ijj<e>> be a Proof: Assume not; let I = , +ij counterexample to iterability, i.e. 6.7(i) or ii) fails for I. Let IeH where p is a regular cardinal. Let X .( H X cc cc’ countable, such that I e X . Let p : & X , fi transitive. Let T = , with d(i)<e, satisfying: ( 4 ) i<j,
contradiction.
> above: We are done if we show the existence of I , < ~ ~ l i < gas In W, we can construct an iteration
I = ,>of <M,F> with index and a strictly increasing sequence of ordinals edom(Fd(i)) there exists jc[d(i).d(i+l)) that
< K ~ , v ~=> n d(i),
such
j()*
Now define )cdom( ~ Fd(i)).
that
-
). 3.1,
there exists an
:di+l,pi+l>+ <Mj+l,Fj+l>,such that ZO
on^,^+^
-
"j,j+l0"d(i -
"i+loni,i+l
To define
- * j+1,d( i+l ) OU ' O"i, i+l - "j+1,d( i+l )On j ,j+lO"d( i ) ,jOCi - f?d(i),d(i+l)ouiS as rewired. -
for limit A < e proceed similarly.
QED .
The next theorem shows that by iterations we can compare coherent extender structures. This is the fundamental method in the theory of inner models for extender sequences, core models, etc.:
6.13. Definition. Let <M.F>. be extender structures. <M.F>. are comparable if (i) o ~ ( K )= o ~ ( K ) for all KeOnnMnN; (ii) FKvanMnN = GKvanMnN for all KeOnnMnN, v,be iterable coherent extender
* *
structures which are sets. Then there are iterates <M ,F >,
* *
of <M.F>,, respectively. which are comparable. Proof. It will simplify notation if in iterations
-
, with indices < < K ~v, > 1 i+l<e> we 1 > be an arbitrary pair of ordinals; in case allow for < K ~ . v ~ to < ~ ~ , ~ ~ > e dio) mwe ( Fthen require that Mi+l = Mi and ni,i+l -
idPMi. It is clear that this notion of iteration is equivalent to the ordinary one. Now assume that the theorem is false with counterexamples <M,F>,.Let e be a regular cardinal > card(M),card(N). We define iterations
IM = >be an iteration of the 7.2. Lemma. Let , premouse J,[F,D] over D. Let T = min{KIedom(F)}. Consider Mi = Ja,[ F a,Di], i oE (7). Case 1: oE (T) 2 K for some T < K . Let U be a measure on the measurable cardinal K . By an indefinite iteration of V by U we obtain a coherent structure L[Ei] from L[E] with oE'(7) = m ("iterating U out of the universe"). Let u:L[E'] + L[F] be an iteration of L[E'] by E i o such that U(T) = K . Then L[F] is a s required. Case 2 : oE (T)p (although the details of the argument are a bit subtle the main intuition is that a strong cardinal is able to supply us with arbitrarily strong extenders). The condensation lemma 7.8 is then used to show that for a proper class &On, if p,p'eB then the predicates
$F^p \cap L[F^p] \cap L[F^{p'}]$ and $F^{p'} \cap L[F^p] \cap L[F^{p'}]$
agree on their common domain. Then the predicate $F = \bigcup_{p \in B} F^p$ is as required for Theorem 8.1.
Now if we have not got this profusion of real extenders to construct F from, we will have to require that each $F_{\kappa\nu}$ is an extender on some sufficiently large auxiliary model, which can already be defined at stage $\langle\kappa,\nu\rangle$ of the recursion. These auxiliary models, which capture $P(\kappa)$ of the final L[F], will be provided by core models. The following line of thought is intended to motivate the definition of these models:
Assume we have, by some method, constructed our natural sequence F. For simplicity assume that all components $F_{\kappa\nu}$ are countably complete (in the sense of 3.2). Now assume the construction failed, i.e. $L[F] \models$ "F is not a coherent sequence of extenders". Since the right hand side is $\Sigma_1$ (6.2, 6.3), there is $\alpha \in On$ such that:
(1) $J_\alpha[F] \models$ "F is a coherent sequence of extenders", and
(2) $J_{\alpha+1}[F] \models$ "F is not a coherent sequence of extenders".
The counterexample in (2) can be taken to be a subset $c \subset \kappa$ for some $\langle\kappa,\nu\rangle \in dom(F)$: $c \in J_{\alpha+1}[F] \setminus J_\alpha[F]$, and c codes a counterexample to the extender properties of $F_{\kappa\nu}$ or to its coherency. Now the elements of $J_{\alpha+1}[F]$ are in a certain sense definable over $J_\alpha[F]$, and for simplicity of this argument let us make the assumption that c is $\Sigma_1$-definable over $J_\alpha[F]$ from some parameters, and $\kappa \le \alpha$. Let $\pi\colon J_\alpha[F] \to_{F_{\kappa\nu}} J_{\alpha'}[F']$. Setting $D = F \restriction \langle\kappa,\nu\rangle$, $G = F' \setminus D$, the structure $M = J_{\alpha'}[G,D]$ is a premouse over D. M is iterable by the countable completeness of F (6.8). c is $\Sigma_1$ over M, since $\pi$ is $\Sigma_1$-elementary and $\pi \restriction \kappa = id$.
By an argument of Dodd and Jensen there exists an iterable premouse $M^+ = J_{\beta+1}[G^+,D]$ over D such that $J_\beta[G^+,D]$ is an iterate of M (this difficult result is called the continuation lemma since it allows one to continue M to a longer iterable premouse, modulo an iteration of M). $c \in \Sigma_1(J_\beta[G^+,D]) \subset M^+$, and we have located c in the low part of an iterable premouse over D. So in taking all low parts of iterable premice over D together we get a domain which contains counterexamples like c in case such exist. This motivates the following:
8.2. Definition. Let D be a natural sequence. If D is a set, define $K[D] := \bigcup\{lp(M) \mid M \text{ is an iterable premouse over } D\}$. If D is a proper class, define $K[D] := \bigcup_{\alpha \in On} K[D \cap V_\alpha]$.
8.3. Theorem. K[D] is an inner model of ZFC + "V=K[D]", and
is amenable. The proof of 8.3 involves the above-mentioned "continuation lemma". Let us just give an indication of the techniques used by showing that the powerset of $\omega$ exists in $K[D]$. First use "comparison iterations" as in 7.4 to put all subsets of $\omega$ in $K[D]$ into the low part of a single iterable premouse $M$ over $D$. By the continuation lemma there is an iterable premouse $M^+ = J_{\beta+1}[F^+,D]$ such that $J_\beta[F^+,D]$ is an iterate of $M$. Then $P(\omega)\cap K[D] = P(\omega)\cap J_\beta[F^+,D] \in \mathrm{lp}(M^+) \subseteq K[D]$, as required. With these new models we are now able to formulate the "local" criterion for the successive choices of $F_{\kappa\nu}$ as described in the beginning of this chapter:
8.4. Theorem. Let $F$ be a natural sequence such that each $F_{\kappa\nu}$ is countably complete. Then is an extendable coherent extenderstructure provided the following holds: Whenever $\langle\kappa,\nu\rangle \in \mathrm{dom}(F)$, then $F_{\kappa\nu}$ satisfies the extender axioms (E1-E5) and the coherency axioms (C1, C2) with respect to $K[F\restriction\langle\kappa,\nu\rangle]$ and $F\restriction\langle\kappa,\nu\rangle$. (H + ) K
The proof of this theorem follows the line of argument that led us to the definition of the structures $K[D]$, but a complete proof is far beyond our means here. In particular the case when the counterexample $c$ is not $\Sigma_1(J_\alpha[F])$ but more complicated necessitates finestructure theory and a different, finestructure preserving notion of iteration. This method of proof also shows that the "intermediate" models $K[F\restriction\langle\kappa,\nu\rangle]$ satisfy that $F\restriction\langle\kappa,\nu\rangle$ is an extendable coherent extendersequence. We have arrived at the notion of core model:
8.5. Definition. A model $K[F]$ is called a core model if $K[F] \models$ "$F$ is an extendable coherent extendersequence". If $K[F]$ is a core model, the predicate $F$ is called strong.
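The two definitions of this chapter can be collected in display form. The following is only a typesetting sketch, assuming the amsmath package for \text; it restates Definitions 8.2 and 8.5 as given above, with $\mathrm{lp}(M)$ denoting the low part of a premouse $M$ as introduced earlier in the paper, and it adds nothing beyond them.

```latex
% Definition 8.2 (D a set, then D a proper class) and Definition 8.5.
\[
  K[D] := \bigcup \{\, \mathrm{lp}(M) \mid M \text{ is an iterable premouse over } D \,\}
  \qquad (D \text{ a set}),
\]
\[
  K[D] := \bigcup_{\alpha \in \mathrm{On}} K[D \cap V_\alpha]
  \qquad (D \text{ a proper class}),
\]
\[
  K[F] \text{ is a core model}
  \iff
  K[F] \models \text{``$F$ is an extendable coherent extender sequence''}.
\]
```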
9. Fundamental Properties of Core Models. Core Models are "L-like" models of set theory. The family of core models possesses natural embedding and extension properties, and there is a unique canonical core model which is the largest of these models. The canonical core model can
be considered to be a direct analogue of Gödel's model $L$: If there is no inner model with a strong cardinal ("$\neg L[E]$"), the canonical core model has many properties of $L$ under the assumption $\neg 0^\#$. Though natural, these results are very difficult to prove. Usually, finestructure theory and iteration theory have to be combined, and we are unable to include any of this in the present paper. If these fundamental properties are taken for granted it is however sometimes possible to apply higher core models to certain problems, just like the Covering Theorem for $L$ can often be used without mentioning finestructure theory. Such an application will be given in the next chapter.
9.1. Definition. "$\neg L[E]$" stands for: there is no inner model with a strong cardinal.
Most proofs about core models presuppose $\neg L[E]$; $\neg L[E]$ allows to define maximal core models which are technically important. Let us assume $\neg L[E]$ for the rest of this chapter. This is no restriction for most applications: if we want to show that there is an inner model of a strong cardinal, assume $\neg L[E]$ and work for a contradiction. Let us remark that the analogy to $L$ and $0^\#$ becomes even closer if we would admit models $L[E]$ for a strong cardinal into the family of core models and define a real $0^{\P}$ ("zero-pistol") which transcends models $L[E]$ as $0^\#$ transcends $L$. We would then get the subsequent theorems under the weaker assumption $\neg 0^{\P}$. But working with $\neg L[E]$ saves us the exact definition of the set $0^{\P}$.
9.2. Theorem ($\neg L[E]$). Let $K[F]$ be a core model. Then $K[F] \models$ GCH. Also other $L$-like principles like $\diamondsuit$, $\square$, ... hold in $K[F]$.
The proof of the continuum hypotheses (CH) could be done like the proof of Theorem 5.5.
9.3. Theorem ($\neg L[E]$). Let $F$ be strong and $\langle\kappa,\nu\rangle \in \mathrm{dom}(F)$. Then: (i) $F\restriction\langle\kappa,\nu\rangle$ is strong.
K)nK[ F] .
9.4. Theorem ($\neg L[E]$) (The Limit Theorem). Let $F$ be a natural sequence such that $F\restriction\langle\kappa,\nu\rangle$ is strong for all $\langle\kappa,\nu\rangle \in \mathrm{dom}(F)$. Then $F$ is strong.
9.5. Theorem ($\neg L[E]$). Let $F$ be strong, and let $\pi: K[F] \to W$ be an elementary embedding into a transitive class $W$. Then $W = K[F']$, where $F' = \pi(F)$.
9.6. Theorem ($\neg L[E]$). Let $K[F]$, $K[G]$ be core models with $\mathrm{dom}(F) = \mathrm{dom}(G)$. Then $|K[F]| = |K[G]|$ and $F \cap K[F] = G \cap K[F]$.
By 9.4 and 9.6 one can define a unique canonical core model by choosing extenders recursively at the smallest possible critical point:
9.7. Definition. Assume $\neg L[E]$. Then there is a unique strong sequence $F_{\mathrm{can}}$ satisfying: (i) $F_{\mathrm{can}} \subseteq K[F_{\mathrm{can}}]$; (ii) for $\kappa \in \mathrm{On}$, $\nu = o_{F_{\mathrm{can}}}(\kappa)$ there is no strong sequence $G$ with $G\restriction\langle\kappa,\nu\rangle = F_{\mathrm{can}}\restriction\langle\kappa,\nu\rangle$ and $\langle\kappa,\nu\rangle \in \mathrm{dom}(G)$.
So extenders with critical point $\kappa$ are included into $F_{\mathrm{can}}$ for as long as possible; only then the next critical point is considered. $K[F_{\mathrm{can}}]$ is called the canonical core model.
9.8. Theorem ($\neg L[E]$). If $\pi: K[F_{\mathrm{can}}] \to W$ is elementary, $W$ transitive, then $\pi$ is the iteration map of an iterated ultrapower of $K[F_{\mathrm{can}}]$.
As an immediate corollary one has the following rigidity property of $K[F_{\mathrm{can}}]$:
9.9. Theorem ($\neg L[E]$). There is no non-trivial elementary embedding $\pi: K[F_{\mathrm{can}}] \to K[F_{\mathrm{can}}]$.
9.10. Theorem ($\neg L[E]$). If $G$ is a strong predicate which is a set and $G \subseteq K[G]$ then there exists an iterated ultrapower $\pi: K[F_{\mathrm{can}}] \to K[F]$ such that $G$ is an initial segment of $F$, i.e. $G = F\restriction\nu$ for some ordinal $\nu$. Finally, we have a weak covering theorem:
9.11. Theorem ($\neg L[E]$). Let $\kappa$ be a singular cardinal in $V$. Then
10. An Application to Jonsson Cardinals.
10.1. Definition. Define $J_\mu(\lambda)$ by: every first-order structure $P$ of cardinality $\lambda$ whose type (= number of relations, functions, and constants) is $< \mu$ possesses a proper elementary substructure of cardinality $\lambda$. $\lambda$ is a Jonsson cardinal if $J_\omega(\lambda)$. Details on Jonsson cardinals can be found in [1, 7.3.2]. Every Jonsson cardinal is $\geq \omega_\omega$. The successor of a regular cardinal is not Jonsson [11]. Here we will obtain some information on successors of singular cardinals being Jonsson.
10.2. Lemma. Let $\lambda$ be a Jonsson cardinal and let $\mu$ be a regular cardinal $< \lambda$. Then $J_\mu(\lambda)$. Proof. Consider a structure S = edom(G).
Iterating $M$ if necessary we can assume that $\kappa \in (\lambda, \lambda^+)$. Now
H c Card(K)A+
are countably complete.
We shall now construct an embedding $\tilde\sigma: K[F'] \to K[\tilde F]$ such that
,..
a P H = o r HK[F'I = +
0.
h
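Let $|Q| := \{\langle f,x\rangle \mid f \in K[F']$, $f$ is a function, $\mathrm{dom}(f) \in H$, $x \in \sigma(\mathrm{dom}(f))\}$.
Define relations $\sim$ and $E$ on $|Q|$ by:
$\langle f,x\rangle \sim \langle g,y\rangle$ iff $\langle x,y\rangle \in \sigma(\{\langle u,v\rangle \mid f(u) = g(v)\})$,
$\langle f,x\rangle \mathrel{E} \langle g,y\rangle$ iff $\langle x,y\rangle \in \sigma(\{\langle u,v\rangle \mid f(u) \in g(v)\})$.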
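The "structure" $Q := \langle |Q|, \sim, E\rangle$ satisfies a Łoś theorem:
(16) Let $\varphi$ be a formula of set theory and $\langle f_1,x_1\rangle, \ldots, \langle f_l,x_l\rangle \in |Q|$. Then $Q \models \varphi(\langle f_1,x_1\rangle, \ldots, \langle f_l,x_l\rangle)$ iff $\langle x_1,\ldots,x_l\rangle \in \sigma(\{\langle u_1,\ldots,u_l\rangle \mid K[F'] \models \varphi(f_1(u_1),\ldots,f_l(u_l))\})$.
Proof. By induction on the complexity of $\varphi$. The induction is clear for atomic formulae and for the case of propositional connectives. Let us consider the induction step for $\varphi \equiv \exists v_0 \psi$. Then:
$Q \models \exists v_0 \psi(v_0, \langle f_1,x_1\rangle, \ldots, \langle f_l,x_l\rangle)$
$\Rightarrow$ $Q \models \psi(\langle f_0,x_0\rangle, \langle f_1,x_1\rangle, \ldots, \langle f_l,x_l\rangle)$ for some $\langle f_0,x_0\rangle \in |Q|$
$\Rightarrow$ $\langle x_0,\ldots,x_l\rangle \in \sigma(\{\langle u_0,\ldots,u_l\rangle \mid K[F'] \models \psi(f_0(u_0),\ldots,f_l(u_l))\})$
$\Rightarrow$ $\langle x_0,\ldots,x_l\rangle \in \sigma(\{\langle u_0,\ldots,u_l\rangle \mid K[F'] \models \exists v_0 \psi(v_0,f_1(u_1),\ldots,f_l(u_l))\})$
$\Rightarrow$ $\langle x_1,\ldots,x_l\rangle \in \sigma(\{\langle u_1,\ldots,u_l\rangle \mid K[F'] \models \exists v_0 \psi(v_0,f_1(u_1),\ldots,f_l(u_l))\})$.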
QC3(>,, ...'< fl,xl>), and Q ~ 3 v o t ( ~ o . < f l , ~ 1 ~ ,). ..., (17) The relation is well-founded on Q. Proof. Assume instead that there exist sQ for neu ~ , ( { < u ~ +un> ~ ,I fn+l( u ~ +E ~ fn( ) un)} ) . There exists an iterable premouse M = J8[G,Firg] over F'Pg for some Q > A such that fnslp(M) for n < w . We can assume that G is countably complete, and sinie F' is countably complete above A we can view M as an iterable premouse over P = F'PA. A downward Uwenheim-Skolem argument shows that there is an iterable premouse N over
with card(N)%. } 2 {ulg(u)*H} 3 {ul3vg(u)ef(v)}. v)}) because co((~g(u)~f(v)})~ 1). qed(C1aim) cH, and we can use the inductive
z = T([9,Y]) = 7([9',Y]) = 0(9')(Y) e o(f)(x). Conversely, let zau(f)(x). z = o(idrd)(z), for some transitive dsH with zsu(d). o(idPd)(z)Ea(f)(x) a e ~({~id~d(u)ef(v)}) .., * r([idPd,z]) E ~([f~x]). By inductive assumption, ~ ( [ i d r d , ~ ]=) o(idrd)(z) Hence Z E T ( [ f,x] ) .
= z.
qed( 19)
z
(20) u 2 u .
Proof. For ZEH,
z
o ( z ) = T ( [ { < z . @ > } , @ ] ) = u ( { < z , @ > } ) ( @ ) o(z). =
qed(20 1
.., (21) FrA = F = FcanrA. z
-
Proof: FrA = u(F'rA) = o(F) = Fcanr~. Consider
-
K[Fcan]
3 K[F'] 2 K[?]
.
By theorem 9 . 8 . O O B is an iteration map of K[Fcan]. Its critical point is < a < A. call it K . Then o ( K ) > o - ( K ) , because Fcan F but o (K) = o _ ( K ) , by (21). Fcan F Contradiction!
z
OOB
is an iteration at
K,
QED.
References
[1] C.C. Chang and H.J. Keisler, Model Theory (North-Holland, Amsterdam, 1973).
[2] K.J. Devlin, Constructibility (Springer, Berlin, 1984).
[3] A.J. Dodd, The Core Model, Lecture Note Series 61 (London Math. Soc., Cambridge, 1982).
[4] A.J. Dodd and R.B. Jensen, The Core Model / The Covering Lemma for K / The Covering Lemma for L[U], Ann. Math. Logic 20 (1981) 43-75 and 22 (1982) 127-135.
[5] A.J. Dodd and R.B. Jensen, The Core Model for Extenders, Handwritten notes, Oxford (1978-1980).
[6] A. Kanamori and M. Magidor, The Evolution of Large Cardinal Axioms in Set Theory, in: G.H. Müller and D.S. Scott, eds., Higher Set Theory, Lecture Notes in Math. 669 (Springer, Berlin, 1978) 99-275.
[7] P. Koepke, Some Applications of Short Core Models, Ann. Pure and Appl. Logic 37 (1988) 179-204.
[8] W.J. Mitchell, Sets Constructible from Sequences of Ultrafilters, J. Symbolic Logic 39 (1974) 57-66.
[9] W.J. Mitchell, The Core Model for Sequences of Measures I, Math. Proc. Camb. Phil. Soc. 95 (1984) 229-260.
[10] D.S. Scott, Measurable Cardinals and Constructible Sets, Bull. Acad. Pol. Sci., Ser. Math. Astron. Phys. 9 (1961) 521-524.
[11] J. Tryba, On Jonsson Cardinals with Uncountable Cofinality, Israel J. of Math. 49 (1984) 315-324.
Logic Colloquium '87
H.-D. Ebbinghaus et al. (Editors)
© Elsevier Science Publishers B.V. (North-Holland), 1989
LOGICAL ASPECTS OF THE AXIOMATIC METHOD: ON THEIR SIGNIFICANCE IN (TRADITIONAL) FOUNDATIONS AND IN SOME (NOW) COMMON OR GARDEN VARIETIES OF MATHEMATICS
G.Kreisel, Salzburg
Introduction. Contributions of mathematical logic (m.l.) to logical foundations (l.f.), for which m.l. was first introduced, are here used for the topic of this article, briefly described in its long title. The new twist is not only to pursue the aims of l.f., but to examine the extent to which their realizations contribute to effective knowledge of the various areas in the title. By and large, the results reflect experience with other branches of mathematics, especially if, like m.l., they started with grandiose aims. Some very elementary properties of logical aspects contribute to all areas. But then shifts of emphasis - over and above refinements - are needed: in the choice of objects (= notions) or at least of their description, and of problems about them. Shifts away from those that are fundamental for l.f., and still most prominent in m.l. since the (cl)aims of l.f. remain enshrined in the terminology of m.l. Generally, the shifts are not needed for mere validity, the preoccupation of l.f. (nor for the mere possibility of combining logical aspects with others), but to avoid sterility; correcting a tacit assumption of l.f., that generality - here, of logical aspects - is enough to guard against sterility. The examination below (cl)aims to describe the shifts and the facts involved in them more fully; by use of a little tool kit of notions containing, but not confined to, the literary forms of m.l. This goes beyond a mere ritual of 'precision', as illustrated by the following two samples. The first starts off with the observation that l.f. are - perhaps, the chemically purest - realizations of some heroic idea(l)s of theoretical understanding, which were formulated by the Greeks so memorably that they have become part of Western culture; for example, ideas on such perennials as truth, definition, proof, and others in the flashy terminology of m.l. (Thus l.f. need not be only a marginal business preoccupied with paradoxes and other would-be dramatic puzzles.)
Those ideals have been pursued also outside m.l., at different levels of sophistication in different areas of knowledge. This suggests, not only to me, a
First job: to state and exploit relations gained by experience in m.l. and other areas;
relations in both directions, positive and negative experience, areas within and outside mathematics.* The main ‘negative’ results extend the (still) best known parts of m.l., which establish limitations of particular branches of l.f., for example, of (would-be evident assumptions in) set-theoretic or formalist foundations. The results emphasized below establish close relations between those different variants, and thus a mainstream of l.f., and its limitations; being more general, the new negative results are generally (even) more elementary than the old. Viewed this way the notorious foundational debates have not only been ‘passively’ sterile, but have ‘actively’ obscured assumptions common to a mainstream of l.f. by magnifying ripples on it. On the positive side this view leads to a broader meaning of ‘foundations’ (in mathematics), conveyed in Chapter III by a straightforward - not necessarily intended - interpretation of Bourbaki’s treatise. More generally, since l.f. and scientific culture are related via the venerable ideals of knowledge alluded to earlier, both the old and the new negative results about l.f. draw attention to questionable assumptions common to the whole heroic tradition. The notions used here to describe the nature, as it were, of that tradition constitute a tool kit for a
Second job: to provide logical hygiene of (parochial) use to (some) specialists in m.l.: most simply, for better use of attempts by experienced mathematicians to express their malaise about l.f. and thus about the most prominent parts of m.l. Since those attempts are usually abysmally ignorant of contemporary m.l., they draw the attention of the (average) logical reader to their own defects, and away from those of m.l.! For example, as it stands, the would-be telling grunt ‘m.l. can be useful, but is not mathematics’ is at best a reminder that (even) mathematicians do not live by mathematics alone. The following use of logical hygiene is more specific. It concerns Model theory. This studies logical aspects of the axiomatic method. Some successful shifts of emphasis were made here, particularly in the sixties. Since then, at least, measured by popularity, there has been a shift back to l.f., specifically to (models of) large cardinality, a prominent topic of l.f. since Cantor: as a holy grail or bogy, no matter. One ‘objective correlative’ to this sociological fact, concerning fashions, is the difference between notions available to formulate (cl)aims involved in shifts to and from l.f. The notions of l.f., which go back to the Greeks, are ipso facto both memorable and accessible without recondite experience.
* Relations of l.f. to areas outside mathematics are to be found mainly in - necessarily sketchy - notes at the end. Other notes, which contain specific and sometimes demanding details, should be skipped unless they are referred to in a passage that is felt to be cavalier, when their style may be of use (to the particular reader). Below [N i] means the i-th note.
In contrast, just because they are derived from extended experience, the new (cl)aims, in shifts from l.f., cannot be conveyed with equally limited, and thus equally widely accessible, background knowledge. The tool kit, meant to reduce that imbalance, offers another option for conveying the facts; at least, partly by contrast (not always: superiority) to l.f., and in terms of m.l. Evidently, this option is not available to those who treat l.f. (or even m.l. generally) with benevolent or other kinds of neglect. This article stresses the diversity of uses for the option; cf. [M 1] for a sustained use of it. The following so to speak introductory generalities are easy to state without specialized knowledge, but, as documented later, neglected with embarrassing effects. We begin with some Disclaimers. First, no overall heuristic value is claimed (for the option), which has to be balanced against the effort of learning (it) and the resulting distraction from ‘getting on with the job’ by the light of nature. Secondly, no general claim is intended that what is ‘seen’ or, simply, ‘known clearly’, here, about shifts, can ipso facto be said, let alone, in a prescribed vocabulary. Thirdly, no dramatic ‘need’ for the option is assumed, implicit in popular talk of ‘bewitchment’ (or ‘demons’, with the corollary that some kind of exorcism by an elaborate ritual, for example, of precision, is required). What is suggested here is, first, that the option is cheap for specialists in m.l., and may help consolidate better some of the shifts in the 60’s (which were usually made without exposing explicitly defects of the old, flashier (cl)aims). Secondly, as a matter of experience, the particular tool kit below allows those shifts to be articulated in civilized terms, not, as somebody said, in barks and grunts (and that we can afford that luxury). Thirdly, the modest word ‘hygiene’ contains a warning against a two-fold mistake: of assuming that (i) insights or errors involved in l.f.
are dramatic, and that (ii) if they are not, they are of no consequence or, at least, easy to correct. The second assumption is a corollary to high-minded idea(l)s of ‘depth’. To underline how simple-minded the latter are, here are some Parallels between logical and (familiar) literal hygiene. (a) The latter does not assume any pathology to be cured, by therapy, once and for all, but has to be applied constantly and with discretion (not as mere cosmetics; to be compared on the logical side with logic chopping, for example, in socalled exact philosophy or with the majestic prose of its discursive tradition). (b) Practically, improvements in hygiene have contributed more to overall health than (expensive) medical technology, and, theoretically, (i) some spectacular catastrophes, previously attributed to Higher Intervention, were discovered to be matters of hygiene, but also (ii) some hygienic successes were (later) related to recondite discoveries about complex ecological systems or single molecules. Disregarding these facts is a blunder about the nature of our bodies, with parallels in the area of knowledge. - Finally, it is to be remembered that (c) neglecting hygiene tends to have more striking effects than applying it; to be compared with profound effects of not knowing or not remembering socalled trivial points. The home truths (a) - (c) do not answer any great theoretical questions of biology. But what is actually said about the latter by professional mandarins (or equally pig-headed antimandarins) does not answer those questions any better, and distracts from the usefulness of (a) - (c) to boot. Viewed this way, the parallels (a) - (c) constitute hygiene for thinking about thinking: epistemology in fancy language. As a matter of fact (not, mere possibility) this hygiene is neglected in most of the literature on the topic of this paper. In current jargon such hygiene is called ‘foundations’ (in the modest sense of I.2b). The rest of the introduction contains more specific hygiene for the heroic tradition of foundations.
Introduction (ctd.): foundational schemes and some rules of thumb. A foundational scheme (cl)aims to start from - notions and properties expressing its (particular) - first principles and to proceed systematically via (its) fundamental results. In the case of what was called above Heroic Tradition (HT) the following blue prints for a Royal Road to knowledge - are among those that - have caught the imagination of many over a couple of millennia: the privileged place given to intuitive notions, either as a principal source of knowledge or of (Cartesian) doubts; cf. [N 1]. More specifically, Kant proposed to start with ‘all’ possibilities-in-principle or, going the whole hog, with just one, and Hegel proposed to synthesize all ‘theses’ in sight (with more on HT in [N 2]). Generalities on examining HT, first, without reference to experience in m.l. With expanding knowledge, concepts beyond the intuitive kind are developed, for example, in trade jargons. Doubts can be dubious, too. Possibilities-in-principle, outer limits as it were, distract from more central (= rewarding) matters. Conflicting ‘theses’ can also be put in their places by being assigned different domains of effective (or, simply, of valid) application; rather than be ‘reconciled’ by synthesis: like proverbs (in [N 2]) and unlike questions-of-principle, the darlings of HT, which are to be answered by a plain Yes or No, independently of grubby applications. The horse sense used in the last paragraph will be called Common Sense (CS); not so much because it is commonly applied, but because it applies to common experience. This idea(lization) of ‘common sense’ is suitably extended in science also to uncommon experience; recall, for example, Clifford’s The common sense of the exact sciences. - Let there be no mistake. The traditions of HT and CS are in conflict; obviously, when HT (cl)aims to correct CS, but also when HT wants to do better in subtler ways, for example, by being systematic (tacitly, according to its particular blue prints, and to the claims implicit in its darlings). In contrast, for CS it is a main burden of research to determine whether a given question is a matter of principle; similarly, with notions accepted in HT as ‘fundamental’. For CS, research determines whether, in relations with other things, a given notion is generally the source of the information flow (as experience expands), and thus fundamental. Last but not least (and still without need for specific experience in m.l.): since each of the traditions puts the other in question, there is the possibility-in-principle of an unending see saw between - pursuing and refuting the assumptions of - HT and CS. But there is also the possibility that as experience expands, stability is reached;
at least, in some more or less wide domain, which is then said (in current jargon) to ‘lend itself to theory’; in the style of HT or CS, as the case may be. This second possibility is belittled in glib ‘relativisms’. ([N 3] contains more on them.) In short, there is no need for m.l. to cast general doubts on - the most cherished theoretical and practical idea(l)s of - HT. But, without forgetting the generalities above, w.r.t. the topic of this article, we can do better still by remembering the following Generalities on examining HT (ctd.), but now by reference to - the shifts of emphasis in m.l. The discovery involved here is that contributions of m.l. generally do not derive from the (would-be) fundamentals of l.f., but, at best, from variants, lemmas, afterthoughts and the like; depending on - and measured by the norms established in - the area considered. Far from relying on some preestablished harmony, the discovery follows the rule of thumb: dégager les hypothèses utiles (extended to notions and theorems). Since the fundamentals of l.f. specialize those of HT, the shifts are candidates for (locating) general, but dubious assumptions; not only in l.f., but also in - the broad idea(l)s of - HT. To repeat what was said about logical hygiene (and cannot be repeated too often): The heuristic value of examining HT, over and above making the shifts, involves a delicate balance. The cultural value does not inasmuch as HT is an integral part of our intellectual heritage. Be that as it may, the generalities generate some practical Rules of thumb, applied throughout the article as a kind of hygiene (or, practically equivalently, like lemmas in mathematics). 1. What is abstractly so uncontroversial as to be banal, is often in conflict with the most prominent parts of m.l.;
and of the erudite literature on HT (not only with imagined situations). (a) The rule reflects the general conflict between HT and CS, and the (systematic) errors in HT so found save its examination from being a mere ritual. (b) Except at a very primitive stage those errors are not brutal inconsistencies, but, for example, obstacles to rewarding combinations mentioned in passing at the outset; cf. I.4 and passim. - In particular, 2. The (double) risk presented by would-be fundamentals increases as their formulations are made more specific or elaborated in other ways; (heroic) risks of error and of distraction. The (implicit) rule agrees with CS, on premature choices, and conflicts with the spirit of the idea(l) - since Plato’s Republic (on mathematicians) - of laying down first ‘what we are talking about’. (a) Not being too specific reduces the need to reculer pour mieux sauter, especially, the risk of unending see saws. (b) Plato’s ideal is of course realized according to the letter by use of abstractions, and the axiomatic method provides an agreeable literary form for formal precision (without being specific). The next and last rule also goes beyond mere validity (cf. 1b above), and is more explicit about premature decisions (than 2a):
3. Contrary to HT, which demands unique answers to its (favourite) questions: What is X?, a main burden of research is to focus on some of the different, impeccably correct answers; with the long range goal of discovering relatively few that are adequate for relatively many situations, where X occurs; cf. for theorems T: Why is T true? to be answered by suitable (correct) proofs of T. - Two words need attention. (a) One is ‘relatively’; here an hypothesis of stability, as experience expands, is made; just the opposite to glib relativisms (of [N 3]). (ii) Below, ‘focus on’ is replaced by the (shorter) ‘choose’; but with the understanding that this is not preceded by going into all correct answers first! which would be a distraction. Manifesto. I view the rules above, and several reminders below, as (parochial) footnotes to the corpus of those thoughts that belong to the broad - as opposed to the academic - tradition of philosophy (in its literal sense as it were), and, above all, to the world’s literature; with its many inventions of literary forms to make such thoughts memorable (enough for spotting when these are relevant); with a modest, but permanent place in that corpus for the literary forms of m.l. Much of what was said earlier about hygiene applies to this corpus. In particular, there is no conflict between the manifesto and the scientific tradition, which lives on extending experience, including the intellectual kind. However, not only for me, but for the silent majority (of scientists):
There is a - high chance of - conflict with HT and, more generally, with the academic tradition of meticulous scholarship, which insists on ‘going back to sources’, especially its venerable (choices of) ‘fundamentals’. As practiced this is tantamount to not learning from (extended) experience; making a virtue out of - one of the less imaginative devices that may be - a necessity where little experience is available. Chapters II and III will illustrate in some detail how the academic tradition functions in the case of l.f. Chapter I is preliminary. It recalls briefly the current meanings of the words in the title, which have become remarkably stable in the last 100 years (after a few initial changes).
I. Reminders: By way of a kind of metareminder, readers are warned that the first rule of thumb above applies with a vengeance. 1. Euclid’s or Frege’s use of the axiomatic method was preoccupied with rigorous analysis of some intuitive concepts (and such all-purpose virtues as clarity and economy). The modern, abstract use is essential also for the passage from such concepts to others that lend themselves better to theory; cf. [N 1]. In flashy language, the abstract kind begins where the other leaves off, roughly, as follows. (a) Given a body of (striking) problems and solutions concerning an area (that has struck our attention) their proofs are analysed axiomatically. This means, as usual, stating (= viewing) them in terms of certain abstract aspects of that area, and deriving the (restated) results from suitable properties (= axioms) of those aspects, say, Ai. The abstractions involved in the Ai may thus be discovered, technical concepts; cf. [N 1]. In general, there are many correct analyses by means of different choices of such Ai; recall the third rule of thumb. (b) As to the choice of Ai, CS conflicts with such demands of HT as: (i) general, systematic principles (for automation as it were, of the area considered), and (ii) relentless doubts on the mere validity of Ai (in the Cartesian tradition). For CS, (i) is premature unless the area is well understood, and thus not very heroic, and (ii) is liable to distract attention from questions of choice (and others less flashy than those of HT). (c) A couple of such questions related to - the generality of - logical aspects are implicit in Bourbaki’s manifesto [B], in obiter dicta, about ‘the least interesting side’ of axiomatic analysis; cf. also [N 2].
(i) Given a (growing) corpus of results, an Ai used in all of them is liable to be ‘least interesting’ if the ‘sum’ Ai ∧ Aj is much greater than its (abstract) ‘part’ Ai (or, more soberly, if the set of elementary consequences of Ai ∧ Aj includes that of Ai with a ‘large’ margin). Of course, as in the material world, weight is not determined only by numbers, but also density.
Viewed this way logical aspects, which are present in all areas (= all possible worlds), are particularly suspect. - The next point is less obvious. (ii) A principal function of the (more slowly growing list of) Ai is to structure the area considered according to the abstract aspects, alias structures, that enter the Aj; to structure the mathematical objects in that area and our - by [N 1b(ii)] corresponding - intuitive resonances to them. So - as always, except for very elementary questions - the choice between the Ai will be decisive, and not what they have in common, for example, their definability from a stock of foundational, here, logical notions. Since this definability is a preoccupation of l.f., the first rule of thumb applies here. In the language of science and technology the Ai constitute modules for the area considered, at least, provided no recondite properties of that stock are used in deriving the Ai; cf. [N 4b]. For CS, (i) and (ii) above are lessons of experience, not matters of principle. Parallels in natural science are differences between different fundamental schemes; with quantum mechanics and molecular biology on the one hand which suffuse the accessible world, and high energy physics and relativity on the other; cf. [N 2c(iii)]. 2. Two principal elements of the metaphor foundations are, first, an order of priority and, secondly, its implications for reliability, the prior supporting the later; as in a building on earth, not - like the earth itself - in space. So for CS the metaphor constitutes an assumption, for example, with the alternative of relying on cross bars for (structural) stability. The metaphor does not reflect a more rewarding function of fundamental schemes in the natural sciences [N 4], which consists in combinations with specific experience; cf. [N 2b(iii)]. (a) In contemporary jargon ‘foundations’ has acquired a modest sense, corresponding to the elementary Ai in I.1c(i) above, with this difference.
They should be (i) combined with more specific material in later developments, like generalities about left hand and right hand inverses in groups, and unique orderings of real closed fields, and (ii) not merely be generally valid, like undecidability, resp. decidability of the elementary theories of those objects. (A fresh start has to be made with, say, the word problem for groups or with effective decisions by use of spherical trigonometry or topology of R). The choice of this kind of foundational material is made after experience with the whole subject, and, generally, there is no (intrinsic) order of priority. Realistically speaking, modest foundations are essential for reliability, in particular, as cross checks (to be compared to cross bars above). Perhaps, less obviously: The attention given to modest foundations by the scientifically mature contrasts with, and is progress over, both (i) the would-be hard-nosed tradition in mathematics (which relies on brutal difficulty for its choice of problems, as in the Credo of [C]) and (ii) the starry-eyed tradition in [N 1c], which expects miracles from such generalities.
Logical Aspects of the Axiomatic Method
(b) Foundations in the traditional, heroic sense do require a global order of priority in order to structure - the exposition, if not acquisition of knowledge of - the broad canvas of mathematics; illustrated in the small (and thus not heroically) in I.1c(ii), and of course in many textbooks. For CS a main question is the kind of order; in particular, in terms of [N 4b]: Does it have logical character? The question is not rhetorical, but corrects the question-of-principle whether mathematics is logic. (i) By comparison with the kind of discoveries that went into the successful foundational (= fundamental) schemes of natural science, as in [N 4a(iii)], logical aspects seem too shadowy for the burden that heroic foundations (cl)aim to carry. (ii) Assuming that logical aspects must be adequate is a typical ideology; going from the observation that those aspects are present to the doctrine that they contribute decisively or at least significantly; cf. numerology and, to a lesser extent, astrology. All finite, in particular, planetary configurations have numerical, resp. spatial aspects. The ideologies assume that these aspects are adequate for all problems, resp. for predicting human destiny. Viewed this way logical ideology is suspect because logical aspects are present everywhere; cf. I.1c(i). The next order of business, in connection with ‘logical character’, is to be more explicit about ‘logical’.
3. The branches of mathematics in the title, m.l. and its common or garden varieties, have shifted their boundaries; cf. also [N 1a] and [N 4a(ii)].
(a) Contemporary m.l. still pursues generality; mainly, in the sense of arbitrary domains of its quantifiers; in its infinitistic part, in the sense of some largest domain. Thus the familiar logical operations, generally, have as domains and ranges arbitrary collections of sets constrained only by weak closure conditions. M.l.
has expanded through discoveries, for example, of the ∀∃ form for Turing’s idea(lization) of mechanical computability, which is not intuitively a matter of logic at all. In fact, it requires a separate treatment; cf. [N 6]. The first rule of thumb applies to the (innocent) last two paragraphs. In particular, (i) the old sense of ‘logic’ as the science of correct reasoning is now, at best, modest, as in I.2a: what is true for all such reasoning is not assumed to be often significant in cases that present themselves. The emphasis above on (ii) generality raises different questions in m.l. from those demanded by popular (cl)aims linking logic primarily to language; cf. [N 5]. These are dubious inasmuch as algebraic geometry is equally concerned with syntactic forms of its objects, say, of polynomials.
(b) As for other branches of mathematics the shifts away from the intuitive kind, like arithmetic or geometry, are reflected in new names; for example, arithmetic geometry (say, over
G. Kreisel
fields of arbitrary characteristic), algebraic topology, topological algebra, and many more besides. Discussion. For readers steeped in the logical tradition (or, more generally, in socalled exact philosophy) such concepts, which are merely named, not elaborated, lack weight. Recall that for CS - and for mathematicians afraid to rush in where angels fear to tread - greater rigour (as in [N 1b], but now applied to the educated attention) may be premature. As a corollary, asides in the mathematical literature are usually more rewarding if not taken literally; cf. the second rule of the thumb and [N 2]. - Samples. (i) Cassels occasionally describes the shift of emphasis, in number theory, from the reals to the p-adics as natural (though, not necessarily to the inexperienced). The list of over 20 counterparts, called ‘analogies’ by Macintyre [M 1], constitute objective correlatives to the subjective aspect touched by Cassels (and do not seem premature). (ii) Mazur has used the question: What is a diophantine equation? (and the answer: I don’t know) to introduce a survey of spectacular results in the field, establishing convincingly two conclusions: (α) The intuitive concept - of all polynomial equations with coefficients and solutions in Z - does not lend itself to available theory (and, by Matyasevic, certainly not to algorithmic treatment), but (β) certain discovered subclasses and (imaginative) questions, listed in his survey, do. The second (and last) item involved in the notion of ‘logical character’ is related to socalled value judgments, as follows.
4. The word significance - in the title, like its synonyms (value, importance) and its antonyms (sterility, triviality) - sounds more problematic than it is here. In particular, not its intuitive meaning is used below (since it is not adequate), but two technical concepts developed in this century: significance tests in statistics, and - with a colourless tradename - generalization in abstract mathematics.
(a) Significance tests apply to classifications, the delicate element in the package deal of I.3a. The familiar tests require small (mean square) deviations from averages, the darlings of statistics. Elsewhere the idea extends to deviations from other things that one may want to know about; with obvious modifications for comparisons between classifications. This banality, about significance, is enough for a broad conclusion (in the case of decision problems): at least, generally, logical classifications lack algorithmic significance. ‘Enough’ together with detailed results in the part of m.l. called ‘complexity theory’, discussed separately in [K 1] for reasons in [N 6]. Ironically, socalled probabilistic complexity theory pays no attention to significance tests or anything else beyond averages; as if statistics had nothing to add to the wisdom that on an average a rich man and a poor man eat 1/2 a chicken, when the former eats one and the latter none.
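The chicken example can be written out as a two-line calculation. This is only an illustrative sketch: the symbols x_1, x_2, \bar{x} and s^2 are introduced here and do not occur in the original.

```latex
% Illustrative notation (not in the original):
% x_1 = 1 chicken for the rich man, x_2 = 0 for the poor man.
\[
\bar{x} \;=\; \frac{x_1 + x_2}{2} \;=\; \frac{1 + 0}{2} \;=\; \frac{1}{2},
\qquad
s^2 \;=\; \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2}{2}
    \;=\; \frac{\tfrac{1}{4} + \tfrac{1}{4}}{2} \;=\; \frac{1}{4}.
\]
% Root mean square deviation: s = 1/2, as large as the average itself.
```

On this reading the average 1/2 is accompanied by a root mean square deviation of 1/2, so the 'small deviation' required by the familiar tests fails, which is what the average alone conceals.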
As in the first rule of thumb, all this is banal abstractly, but in conflict with the popular literature. ‘Popular’ (i) in the literal sense when it is forgotten that generality, the hallmark of logic in I.3a, does not ensure significance (in the precise sense above), and (ii) in the logical fraternity when exorbitant bounds for orthodox classifications are (α) ‘improved’ by even bigger bounds, but (β) not improved by more significant classifications (often already available in the literature of other branches of mathematics).
(b) In contrast to (a) above, there is nothing unorthodox about attention to generalizations, except, perhaps, (stressing) their relation to matters of significance. Thus for proofs π in, say, intuitive arithmetic the question: What is significant about π? is interpreted by: (*) To which abstract structures does π generalize?
Equivalently, as in [N 1c(iii)α]: Of which general forms, in place of Z and other objects referred to, does π partake? with familiar variants for the theorem proved, in place of the specific proof π. Again, conflicts with popular idea(l)s abound; both venerable ones already mentioned, and the recent kind. (i) Generally, (*) - incidentally, a paraphrase of the earlier question: Why is such and such a theorem true? - admits several correct answers; recall: What is water? in [N 4a]. The parallel goes further. (α) When more is known, more specific - loosely, more precise - variants are in order. What is water made of? (when an atomic structure is envisaged), and the question (*) is replaced by a conjecture on specific abstract structures, to which the theorem proved by π generalizes. (β) In a particular area, certain variants may be - discovered to be - fundamental (or ‘privileged’ in low-key jargon, for example, by the direction of the information flow to alternatives). - Until quite recently, (ii) the advice to ‘look at the proof for the meaning of the theorem (proved)’ was touted as if there were no alternative; distracting attention from (*), which - without twisting the words - does the opposite. It looks for the meaning of π in a new (suitably generalized) theorem; in line with [N 2] on treating such ‘theses’ like proverbs. Logical generalizations present a limiting case, when π - is discovered to be so elementary that it - generalizes to all structures of some logical class, for example, those defined by an arbitrary elementary property. Corollary. The idea of ‘logical character’ involves a package deal; according to the refrain: logical aspects are not only present, but used in (the definition of) significant classifications and generalizations. Remark. The discussion above shows that simple reminders are enough - and thus that elaborations are liable to be distracting - when applied to the value of classifications and abstractions, for the issues at hand.
When, for example, material objects - located in space and time (cf. [N 1c(iii)β]), and protected by property laws! - can be freely exchanged then the technical concept of monetary value goes much farther (since it is less sensitive to specific purposes), and lends itself to theory. The remainder is meant for specialists in m.l., and is presented accordingly.
II. First ideas: consolidation and developments. Some of these ideas, which concern logical aspects of the axiomatic method and their most elementary properties, have become part of the language (of all areas in the title), and are barely recognized as separate ideas; cf. [N 5a]. But this no longer applies to several almost immediate extensions. The shifts of emphasis needed to exploit the latter are best described by being more explicit about the easy ideas; more than the mathematical tradition in I.3 for conveying ideas by new terminology; cf. [N 7] for more on terminology.
1. The ideas considered in this section go back to the last century.
(a) Elementary m.l. corrected the - phylogenetically (= in the classics) and ontogenetically early - impression that precise reasoning requires all terms used to be completely defined, recall Aristotle [Met Γ 7, 1012a, 21-23]. Indeed, early abstract mathematics was spoilt by clumsy wording of propositions P about abstractions, say, about any ring; a patently incompletely defined term, when neither P nor its (formal) negation is valid. Malaise about such matters was dispelled by paraphrases in logical notation, and we - including the hard-nosed among us, cf. I.2a(i) - have never looked back. An immediate extension concerns predicates P, say, with one variable, a, about a class R of rings with unity. Let N be the term 1+1+...+1 with n summands, where n ∈ ω. The subsets of ω (i) {n : R ⊢ P[a/N]} and (ii) ω - {n : R ⊢ ¬P[a/N]} are now, in general different (but not for specific rings in place of R). Exceptional P and the corresponding sets were called entscheidungsdefinit by Gödel, resp. definable by Tarski (tacitly, w.r.t. R). These notions have not become household words outside m.l.; nor in the would-be spectacular literature on incompleteness where (Cantor’s) numbering of words on a countable alphabet and the use of self reference in diagonalization are prominent. The term ‘completeness’ is there associated with (Hilbert’s) dramatic, but dubious assumptions about formalist foundations. Cause and effect left aside the sociological facts above go well together. If notions are primarily tied to a discredited doctrine they had better not become (Wordsworth’s) ‘dear and intimate members of the household of men’.
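The difference between the two sets in II.1(a) can be seen in a minimal sketch. The choice of the predicate P(a) as 'a = 0' and the names S_1, S_2 are assumptions made here purely for illustration; they are not taken from the original.

```latex
% Illustrative assumption: P(a) is the predicate  a = 0,
% R is the class of all rings with unity, N = 1 + ... + 1 (n summands).
\[
S_1 = \{\, n : R \vdash P[a/N] \,\}, \qquad
S_2 = \omega \setminus \{\, n : R \vdash \neg P[a/N] \,\}.
\]
% For n >= 2:
%   the ring Z satisfies N \neq 0, so  N = 0     is not valid in R;
%   the ring Z/nZ satisfies N = 0, so  N \neq 0  is not valid in R.
\[
n \ge 2 \;\Longrightarrow\; n \notin S_1 \ \text{ and } \ n \in S_2,
\qquad \text{hence } S_1 \neq S_2 .
\]
```

For a single specific ring in place of R each instance P[a/N] is either true or false, so the two sets coincide, in line with the parenthetical remark in the text.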
(b) Elementary m.l. has separated different sides to questions of the form: What is X? in particular, What is a point? and: What is the number 1? in Euclid, resp. - some 2500 years later - in Frege. (i) Concise, literal answers have been familiar since Hilbert’s principal contribution to logic (in his Foundations of Geometry). (α) The questions are irrelevant for validity of axiomatic geometry or arithmetic. But also: (β) Answers are a sine qua non (= part of the data) for any model of (axiomatic) geometry or arithmetic. In contemporary jargon such models are said to exhibit a geometric, resp. arithmetic structure; for example, the former in R, when a point is a pair of reals, the latter in some collection of sets, when 1 is the set {∅} (at least, when ∈ is the successor or order relation). Viewed this way Euclid’s and Frege’s questions are not only deep, inasmuch as geometric and arithmetic structures can be hard to spot, but reward the effort. (ii) As interest shifted from such mathematical structures as points and numbers to preferred structures on them there was a shift to: What is the structure of the continuum? and of: the number series; cf. the addition of ‘made of’ after ‘water’ in I.4b(i). Dedekind’s, resp. Peano’s axioms constitute impeccable answers.
(c) Those axioms are categorical, and thus, by another memorable result of early m.l., not equivalent to elementary axioms. (They contain quantifiers - in fact, just one, which is universal - ranging over all subsets of the domain D considered, while elementary quantifiers range over the elements of D.) This fits I.3a on 2 kinds of generality, with a couple of provisoes. (i) The second order (subset) quantifiers range over the largest collection of subsets (= full power set of D).
The proviso is that sets are meant, not the mind-boggling multitude of all their descriptions, say, by predicates, which Cantor would have called ‘Vielheit’. The latter is too indefinite to justify a superlative like ‘largest’ (in I.3a); except for assertions that are ostensibly about predicates, but depend only on their extension; recall (Dirichlet’s) graphs of functions in place of rules for them. (ii) Corresponding abstractions, from that largest range, are first-order schemata, where elementary predicates are substituted for the set quantifier. The proviso here affects the choice of structure; specifically, in Peano’s own axioms the only structure is the successor, from which, for example, addition is second order, but not elementarily definable.
2. This section concerns the consolidation of the material of II.1. As the introduction has warned, the consolidation takes different directions when guided by common sense (CS) and by traditional logical foundations (l.f.).
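The contrast drawn in II.1(c) between the single second order axiom and its first-order abstraction can be written out for the case of induction. The notation (0 for the first element, S for successor, X for a set variable, \varphi for an elementary formula) is the conventional one and is supplied here as an assumption, not quoted from the original.

```latex
% Second order induction axiom: one universal quantifier,
% X ranging over all subsets of the domain.
\[
\forall X \,\bigl[\, X(0) \wedge \forall n\,\bigl(X(n) \rightarrow X(Sn)\bigr)
  \;\rightarrow\; \forall n\, X(n) \,\bigr]
\]
% First-order schema: one axiom for each elementary formula \varphi
% substituted for the set variable X.
\[
\varphi(0) \wedge \forall n\,\bigl(\varphi(n) \rightarrow \varphi(Sn)\bigr)
  \;\rightarrow\; \forall n\, \varphi(n)
\]
```

Which formulas \varphi are available depends on the structure chosen (successor alone, or successor with addition and multiplication), which is the point of the second proviso.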
(a) The most familiar descriptions of results stem from l.f.: (negative) incompleteness and non-categoricity, with undecidability belonging to [K 1] by [N 6]. For l.f., improvements consist in (i) generalizing those results and in (ii) ‘bridging the gap’ between them and positive ideals. The glamour surrounding those negative results comes from familiar assumptions about the adequacy of elementary logical operations and formal rules for definitions, resp. proofs; cf. §1.
(b) For CS, even extended by 19th century mathematics, those assumptions are not compelling. This is formally ratified as it were by the negative results. Even without knowledge of details, (more compelling) positive interpretations present themselves; for example, (i) incomplete systems have non-standard models; in the case of arithmetic, different from, say, classical number fields, and (ii) non-categoricity ensures transfer principles to non-isomorphic structures (and thus a precise style of reasoning by analogy at the cost of restricting the propositions considered, in (ii), to their elementary kind). In effect, of course, not on purpose, the negative formulations distract from examining what, if any, effective knowledge is derived from completeness and categoricity in the many areas where these ideals have been achieved. The former comes up in II.3. As to the latter, categorical axioms contribute, literally, less than their abstractions in II.1c(ii) to such questions as: What is significant about R or Z for some given proof or theorem? in I.4b; cf. also [N 7c(iii)β] for marginalia. - The following use of II.1, in particular, II.1a, is less familiar; but cf. [K] for a recent reference.
(c) L.f. are notorious for associating different parts of m.l. to hoary isms, and for seeing irreconcilable conflicts among the latter. For CS, relations between the isms have greater weight, and corresponding versions of l.f. are thus variants (or ripples) on a mainstream.
A mathematical counterpart of this CS view is a common foundation, in the modest sense of I.2a, for - the bulk of - all parts of m.l. (i) II.1a provides the germ of a model-theoretic foundation for recursion theory; definable and representable sets correspond to (total) recursive, resp. semi-recursive (or, again, Δ1 and Σ1) subsets of ω. Such a foundation interprets (ii) proof theory as enriching (i) by (suitable) structures on semi-recursive sets, and finding generators for them, and (iii) intuitionistic logic as introducing a new kind of generality (in I.3a), with weaker closure conditions on the collections of sets that are the domains and ranges of logical operators; in particular, sets without any (sharp) boundary, like the open sets in Tarski’s interpretation, rehashed in the fad called ‘fuzzy logic’. For the present (undemanding consolidation of II.1), (i) - (iii) above are reminders of parallels, within mathematics, to guide the choice of new notions and problems in m.l. Thus,
(ii) and (iii) match algebra, resp. topology. (The continuous maps of the latter - have always been used to - handle approximations, also to ‘fuzzy’ sets!) By themselves (i) - (iii) are not enough to determine to what extent the mainstream above satisfies the demands of heroic foundations (in I.2b); no more than elegance of mathematics ensures its relevance in natural science, cf. [N 4c(iii)β]. (Chapter III goes into such matters.) However, (i) - (iii) are adequate objective correlatives to the view of CS that, what was actually said in the foundational debates, was much ado about nothing (a view shared - by [N 9], oddly - by Einstein).
3. The kind of consolidation demanded by effective contributions to some common or garden varieties of mathematics involves, beyond such generalities as in II.2b, attention to the two key risks: sterile combinations (of logical and specific knowledge; cf. [N 2b(ii)]) and lack of focus (when logical aspects are only present, without contributing effectively; cf. I.2 in fine). Here, the focus is on p-adic fields and their model-theoretic aspects, but without conjectures on some kind of uniqueness of this area; cf. [M 1], in particular the warning on rash conjectures in 1.3 on p. 122. Here logical classifications have been discovered to be effective - in contrast to the algorithmic matters of [N 6] -, provided the foundational tradition is not followed to the letter.
(a) One effective tool is quantifier elimination (QE), related to II.1a by contrast where & # 111, while, under QE, Z, = & for all n ∈ ω. Evidently, QE is effective only if the collection of sets defined without quantifiers has properties contributing to problems that involve closure under projections. In the case of p-adics, certain topological properties and Serre’s conjecture on the rationality of Poincaré series fill the bill; cf. [M 1]. The proviso above - about conflicts between the requirements of effective knowledge and of the foundational tradition - applies here.
QE is sensitive to both explicitly defined and to resp. (cross sections) in other enrichments of the structure considered, for example, in [M], [AK]. For logical foundations explicit definitions are a kind of paragon of irrelevance. - Remark on the foundational hankering after uniqueness (here, of certain fields; cf. also the third ‘rule of the thumb’): 7.2 on p. 144 of [M 1] contains a memorable twist.
(b) Another effective tool is the reasoning by analogy mentioned in II.2b(ii), which is made possible by logical properties of ultraproducts. Evidently (once again), this is effective only if some information is available for such an analogy. In the case of p-adics, specifically, of Artin’s conjecture, useful information comes from the (easier) theory of fields of power series with coefficients in a finite field. The proviso above is liable to apply to research on variants and refinements; for example, of special kinds of ultraproducts that preserve suitable non-elementary properties of the common or garden variety of structures. Here it would be logical ideology, as in I.2b(ii), to assume that such properties must be defined in some (new) logical class, for example, of infinitely long formulae. Though this particular matter is left open here, the next section goes into the general infinity business.
4. It is fair to say that most current developments of m.l. are infinitistic; even such an exception as the theory of Turing degrees is increasingly related to constructibility and other infinitistic objects. The broad interest of those developments is obscured by would-be dramatic metaphors, about large cardinals, which are neither compelling for the infinite nor for the finite kind; cf. [N 7c(iii)α], resp. [BD], where 31@87 is small (enough; numbers are not substances liable to turn into black holes when they are large). Those metaphors simply distract from the following less ethereal matters.
(a) For (informed) mathematicians the background to current infinitistic logic and set theory includes (i) set-theoretic topology before its algebraic (= finitistic) version by Poincaré and Brouwer, and, more generally, (ii) the pursuit of infinitistic notions, for example, in the Polish journal Fundamenta Mathematicae before the success of (algebraic) abstractions of those notions in current abstract mathematics. So reservations about - relative sterility, not brutal incoherence of - infinitistic enterprises need not be prejudice. All this is in sharp contrast to II.3 on ordinary logic: (iii) Except in such marginal areas as universal algebra or the dormant branches in (i) above, mathematics has little to offer for rewarding combinations with infinitistic logic; not even with its high spots.
(b) By a(iii), the most immediate prospect for effective contributions by infinitistic logic is to project down to common or garden varieties of mathematics; as in model theory, from (Shelah’s) classification of uncountable models to socalled stability theory for groups, Banach spaces and the like. Diverse comments.
(i) With relatively few exceptions the work of the Polish school in a(ii) has not been projected down with similar determination, in contrast to m.l. also outside model theory. (ii) In all its main parts there are, incidentally closely related, infinitistic developments; some being projections down from large cardinals, some projecting down to combinatorial mathematics; cf. [K 1]. (iii) Digression, for reference below, on popular ideological responses to a(i) and (ii) above, and philistine counter objections. The former point out that the predecessors of contemporary infinitistic logic had no inkling of (α) the full inwardness of such axioms of infinity that they had stumbled on, for example, measurable cardinals, nor (β) of the (logical) need-in-principle for such axioms in order to establish even m-theorems. Counterobjections are implicit in [N 8]: (γ) Few of the axioms, of ZF, known in the twenties, have been used (let alone needed; cf. [N 9b]), resp. (δ) those needs-in-principle, w.r.t. logical orders of priority, are on trial here.
(c) However defective infinitistic m.l. may be for the concerns of (a) and (b), it is very rewarding for examining assumptions of the logical tradition; by magnifying as it were those discussed in II.2 by use of elementary m.l. (i) Above all it provides the kind of logical hygiene of [N 1c] by removing doubts about the mere possibility of pursuing familiar logical ideals and of extending their orders of priority. Thus (α) in connection with II.2a(ii) on ‘bridging the gap’, Gödel’s large cardinal axioms and Turing’s ordinal logics reach for the Truth, w.r.t. the classes of all elementary formulae about sets, resp. about (the ring of) natural numbers; recall [N 4c(iii)α] on pursuing rational mechanics. As to priorities, (β) according to definability (by standard means), the full cumulative hierarchies of Vα and R are prior to their constructible, resp. algebraic parts, since the wholes are not definable from their parts. Accordingly, (γ) in a fanfare to his (one) popular article, Gödel calls Cantor’s CH - tacitly, for Vα (α > ω + 1), not for L - ‘fundamental’, and stresses questions left open by (α) and (β). (ii) For CS (and for most contemporary mathematicians) such questions distract from broad mathematical experience. (α) By I.3a, the significance of logical classifications is problematic; in (i)α above it is the privileged place given to the class of all elementary formulae. (β) Contrary to (i)β, abstractions of R, quite apart from other fields, have become prominent. As to V - and without denying mere legitimacy - the question arises which abstractions (corresponding to other models of ZF) contribute effectively; at least, to the marginal branches of mathematics in II.4a(iii). (iii) In connection with (i)γ, the open questions distract from shifts of emphasis in the interpretation of existing infinitistic logic. In models of ¬CH some simple subsets of R have cardinal < card R, but do not have a simple enumeration (where ‘simple’ means: definable in the model).
For CS this asymmetry reduces interest in the truth of CH (for Vα), while a refutation would round off, as it were, the result above (by its incomparable - variant without ‘simple’). Remark. Especially (ii) has an empirical flavour by involving mathematical experience. But it is at an opposite extreme to those curious doctrines - about positivist verifications or their mirror images, socalled falsifications, in [N 3] - that would have us establish legitimacy (= validity) itself ‘empirically’, for example, by trial and error (without remembering the delicate problems of weighing numerical evidence for mere conjectures in number theory).
III. A broader view of foundations: an illustration. Experience with logical foundations will not be treated with (benevolent) neglect, let alone, be discarded. It will be used as a frame of reference; as a yard stick to measure progress with alternatives, and often as the most effective means of description, by contrast, available; in particular, of the view to be considered here. The latter does not accept the conclusions that logical ideology or the anarchists (in [N 2b(iii)], who reject heroic foundations altogether) derive from II.4 and from other negative results (cf. II.4b(iii)α and β), nor the familiar argument for the abstract style (cf. [N 2c]), that too many foundational schemes are possible-in-principle to rely on existing experience as a principal guide. Positively, for that view - probably in accordance with CS after familiarity with the last two chapters - different fundamental schemes in the theoretical sciences do provide guidance; cf. [N 4] and particularly I.1 in fine. By (i) and (ii) of [N 2b] an important difference has to be respected: not only weak effects - like those of gravity on individual atoms - but the brutal impossibility of combining separate fundamental schemes must be compared to sterile combinations in mathematics.
1. Thus the following alternatives present themselves as candidates for the 2 principal elements of heroic foundations (in I.2b).
(a) The ‘main burden’, of an alternative order of priority (cf. I.1c and [N 4b]), is a choice of modules (= initial elements), but without the master assumption of orthodox foundations. Specifically, it is not assumed that (i) the choice of modules involves any, let alone, an irreconcilable conflict between their objective and subjective aspects, nor that (ii) the latter are - fleeting, and - unstable; cf. [N 1b(ii)] on anti-psychologism.
(b) As for reliability, the main burden is to resolve a potential conflict; between realistic reliability and any particular idea(lization) concerning reliability-in-principle (= reliability of principles). (i) Abstractly, such an idea may require the elimination of some principles that are, realistically, 100% reliable, and thereby reduce realistic reliability; recall Hilbert’s quotation of Kronecker in the Zahlbericht on the computational idea(l) of reliability. (ii) Concretely, realistic risks go beyond mere probability of error in long computations. Quite simply, if knowledge, say, a proof is to be reliable it has to be taken in, and remembered, two blatantly subjective aspects, but not fleeting; cf. a(ii) above.
(c) On the view considered particular instances of (a) and (b) above may have logical character in the sense of I.3b; cf. [N 4b]. This would be the case if (i) recondite logical, for example, set-theoretic principles are used to establish (defining) properties of the modules or, quite simply, (ii) there just are realistic doubts about particular principles so that they reward logical analysis.
In (ii), research - now within the logical tradition - may decide whether investigation of a particular principle is best served by, say, descriptions of models or by proof-theoretic eliminations; cf. [N 6(ii)] and [K 1].
2. Bourbaki’s Éléments serve to illustrate III.1 above, as follows.
(a) The modules, of III.1a, are here the basic structures (= structures mères) with their defining properties (= axioms); in terms of [B], modules both for the architecture of mathematics and for nos résonances intuitives. More fully, (i) in any particular case such a module (say, a group) is defined explicitly, and the axioms (here, for groups) are derived for those explicit definitions; cf. II.3a. Thus (ii) there is the possibility-in-principle that (i) has logical character. But (iii) not only is (ii) rarely realized (reminder: the bulk of mathematical practice follows from (logically) weak principles), but also - and more significantly - as a matter of experience, (iv) modules contribute effectively independently of (ii); in an extreme case, for spotting a computational error; cf. where (Siegel’s) lemmas were interpreted in terms of a certain unitary group and the odd man out contained an error.
(b) The proofs in the Éléments are broken up into few lemmas, and so are easy to take in. The lemmas are about (i) familiar (basic) structures, and (ii) since the latter are abstract, they code their own proofs; recall I.4b about abstraction stating what is significant about a proof. Thus (i) and (ii) make the lemmas and their proofs easy to remember.
(c) Viewed this way, the Éléments are, for the alternative envisaged, what Whitehead and Russell’s Principia were for logical foundations; perhaps, with the manifesto [B] - not the drivel in [B 1] - corresponding to the introduction of Principia (by Russell, whose style is in the philosophical tradition in contrast to [B]). The following more specific comments are random.
(i) Principia (cl)aims to interpret intuitive mathematics, where R and especially Z are prominent. For the Éléments it is a matter of discovery which areas lend themselves to their style (of axiomatic theory; cf. [N 1a]). (ii) Russell attributes the unity of mathematics to the set-theoretic ‘synthesis’ from one primitive (of each kind: object, ∈, propositional operator, quantifier). Bourbaki sees it in relations by means of those basic structures. (iii) Russell described Principia as ‘a parenthesis in the refutation of Kant’; recall Frege’s anti-psychologism in [N 1b(ii)], and the master assumption in 1(a). The latter is dismissed in [B] by a footnote on intuitive resonances; cf. 2(a) and [N 1b(ii)]. However, the material above is at best an illustration because of the following omission, which incidentally has a parallel in Principia too.
3. The manifesto [B] and a fortiori the view above are only pursued in the Éléments, and not tested; in fact, in contrast to Hilbert’s metamathematics, [B] does not even state any explicit tests. Some are implicit in obiter dicta, some of which, as so often in such cases, are in blatant contradiction to one another. Here are some samples, chosen because of their relation to the positivist literature, touched already at the end of the last chapter.
(a) [B] brings up the logical unity of mathematics as a foil to its own kind. The former is dismissed by a would-be telling comparison with the ‘unity’, of physics and biology, that consists in their both using the experimental method (= the positivist idea(l) of the unity of science). Now this (method of verification) is at best a counterpart of using formal rules of inference, which [B] adopts, too; not of the set-theoretic unity in 2c(ii) above. But it is the latter that competes with the modules of [B]. A better case for [B] compares 2c(ii) with Newton’s own description of the ‘main burden’ of physics: to go from the mechanical phenomena (orbits) to the forces, and from the forces to other phenomena. Here the forces, tacitly, between point masses, correspond to (operations on) sets, and the phenomena to more or less familiar intuitive mathematics. The last 300 years have corrected the description. (i) The composition of matter goes beyond this scheme (of Newtonian point masses) altogether; cf. [N 4a] on microscopic mechanics. But also (ii) within rational mechanics a major burden was the discovery of so to speak its modules (= ‘perfect’ models) to which the general laws of force are applied; cf. (iii), resp. (ii) of [N 4c]. Viewed this way the elementary properties of logical aspects of I.1 correspond to, say, - easy, but useful - dimensional analysis, and high expectations, for infinitistic logic, in II.4b(ii) match the wonders of post-Newtonian mechanics, where minute anomalies (= paradoxes) are related to momentous changes.
(b) [B] does not follow up its disavowal of positivist ideals in (a) above.
Near the end it falls back on two equally feeble ideas of this sort in order to express its malaise about (i) categorical definitions, and (ii) mathematical proofs of the soundness or, at least, consistency of principles. Specifically, (i) is rejected on grounds of economy, and in place of (ii), a policy of wait and see is proposed. The present view is different. As to (i), abstractions from categorical axioms are superior in making explicit what is significant about the latter, cf. II.1c(ii). As to (ii) there is, above all, the master distinction between dubious and realistic doubts. The latter kind exist, even if they are not usual; cf. II.1c(ii). So, for CS, unusual remedies may be appropriate; in particular, logical analysis, even if, in general mathematics, it distracts from more rewarding aspects; cf. [N 10] on elementary analyses of familiar paradoxes. Remark on the policy of wait and see. Here the most elementary (statistical) precautions would be disregarded. By II.4b(iii)γ the bulk of principles that have already been formulated are not used in practice at all, and so would never be tested! Cf. also the remark at the end of Chapter 2.
4. A disclaimer and concluding remarks. They concern opposite extremes.
It is not claimed that logical hygiene is necessary for mathematics or even for m.l.; cf. the Digression in [N 7c(ii)] on doing without logical erudition. Nor is it sufficient for heroic foundations. For the record I suspect that, at the present time, such foundations are not likely to be satisfying unless they have something much more detailed to say about those intuitive resonances (than, say, II.2b); in other words unless they introduce idea(lization)s of our capacities for data processing; especially, of our memory structures. But logical hygiene may serve the purposes stressed in the introduction, and so be a relief to those (of us) less robust souls, who are ill at ease with the flashy language of the logical tradition (and thus are sympathetic to shifts of emphasis), but do not have equally concise aims for any alternatives. Reminder of a bonus. At least, occasionally - for example, in II.4a(i) or II.4c(ii)β - some shifts that are immediate after applying a little logical hygiene, have not been made by the light of nature. For the record, salutary as all this may be, it seems to me to fall short of the ‘raw’ appeal of m.l. - To conclude I turn to more adequate objective correlatives to that appeal.
(a) Would-be dramatic exaggerations about the power of (bad) philosophy - from Kant’s optical illusions [A 422] to Wittgenstein’s bewitchments - obscure the fact that such thoughtlessness does find its way even into works like [B]; cf. II.3b, with garbled formulations of sound insights, which are expressed painlessly by using a little m.l. Put differently, m.l. is neutral, as it were; equally effective for formulating weaknesses and virtues of the logical tradition. Speaking of traditions, readers may feel ill at ease about the irreverence towards some of them in this paper. Again m.l. itself provides object lessons; cf. [N 11] of many examples where almost universal, but quite wrong impressions, alias convictions have persisted.
- But at least for my taste the next and last point is more satisfying still.
(b) As stressed repeatedly, m.l. and, in fact, l.f. constitute a kind of intellectual microcosm, which reflects in a particularly elementary way, broad experience of general, cultural interest. Thus, we find (i) the broad division between those, who see the ‘nature’ of their subject in its most outlandish phenomena with tenuous relations to accessible experience (and ipso facto hard to study empirically), and the rest; in mathematics, we have large cardinals, in physics, black holes or phenomena typical of Planck’s length; cf. the end of I.1a. More specifically, we have (ii) the idea(l) of a complete description, and the division between those who do and those who do not consider the means available (tacitly, according to current theory); in mathematics, we have such structures as N, in physics, nature itself; cf. [N 9b] (compared to the would-be spectacular literature alluded to in II.1a).
Notes
1. The passage from intuitive to discovered, alias technical concepts is illustrated more fully in I.1a.
(a) Roughly, intuitive concepts correspond to (ranges of) phenomena that strike the attention as a unity, and thus vary with experience. Such concepts are useful, for example, for recognizing or describing the phenomena, without needing or even admitting recondite theory; obviously in case of hunters who need not write dissertations about their quarries, but also in the intellectual sphere; cf. [N 5a].
(b) Specific intuitive concepts, at various levels of experience, often have impeccably precise analyses, with so-called informal rigour; sometimes in one-liners, sometimes after scrutiny; recall the cases of area under a curve, resp. area of a curved surface. Less obviously: (i) Limitations of those concepts, as scientific tools, are thus not a matter of mere precision, but of marginal utility, generally for different reasons in common and uncommon experience. In a different vein: (ii) The two geometric concepts above fit both the objects perceived and - the way we think of - our perceptions; so informal rigour is applied here to idea(lization)s of both objective and subjective phenomena; cf. St. Thomas’s adaequatio et rei et intellectus, T. S. Eliot’s objective correlatives (to subjective experience, and the disarming résonances intuitives of [B]). All this is overlooked in (Frege’s) crude anti-psychologism, directed against visualization alias Anschauung - which Kant, gratuitously, claimed to be needed-in-principle (over and above being used effectively) - in mathematical reasoning. By (ii) above, subjective experience has stable elements while the bogy of psychologism comes from an opposite extreme, such as fleeting impressions produced by a couple of photons. (In between there are phenomena like the twinkling of a star where external and psychological elements are, perhaps, more entangled.)
(c) The contribution of informal rigour in (b) is typically two-fold.
It provides a scientific tool (for exploiting the intuitive concept considered), and logical hygiene, which serves (here) to reveal genuine limitations when dubious doubts (about precision) are removed; perhaps, most memorably for the hardcore of those intuitive concepts that Plato calls ‘great general forms’, for example, Truth, and Identity. (i) Realistically speaking, impeccably precise definitions refute the - for CS, most questionable - assumption that (such concepts are so fundamental that) a definition cannot be correct unless it tells us, so to speak, what we want to know most (about them)! as if a correct definition of ‘wealth’ automatically filled our pockets. Reminders: Tarski’s humble adequacy conditions (= implicit definition) for Truth and Leibniz’s (second order) axioms for Identity. - At the same time (ii) just using symbols may be sufficient (but, practically, also necessary!) to apply useful rules of thumb, for example, about significant ranges of the quantifiers involved: (α) the class of propositions for which Truth is to be defined, or (β) the class of predicates and operations for which Identity (=
equivalence) is to be respected. Just how demanding (β) can be, is seen from the following Digression: Which comes first? The chicken or the egg? tacitly, in some hereditary succession. (α) For intuitive identity, by outer appearance, it is the chicken (since, here, ‘eggs is eggs’, and if there is a difference at all, it first appears in a chicken). (β) For genetic identity, it is the egg (a chicken always being identical to the egg from which it is hatched; pedantically, the germ plasm is part of the egg). A variant of the Digression is at the end of Medawar’s The Art of the Soluble. - Let there be no mistake. (iii) Philosophical logic, the scholastic tradition in modern dress, has its own arsenal of (traditional) constraints for applying the rule of thumb in (ii); under such headings as: partial truth definitions or fuzzy identity. But, by and large, these distract from the demanding questions: which parts? resp.: which margins? - Remark. The scholastic tradition has even coarser distractions. Thus (α) (at least, Plato’s) Parmenides wonders whether every particular, including dirt, ‘partakes of’ some general form (= has some abstract aspect); a question-of-principle that is pretty close to the more demanding: Which abstract, for example, logical aspects, if any, contribute to effective knowledge about the particular considered? - In contrast, (β) there are scholastic parlour games about some kind of location of abstractions in time and space, even when - the possibilities of or obstacles to - realizations of a particular abstraction in space or time are well understood. For CS, the (linguistic) conventions of those games are odd: when no location in time is in sight, the abstraction is said to be ‘eternal’; in the case of space, ‘nowhere’ or ‘non-existent’.
2. HT and, especially, gifted philosophers (writing within the literary constraints of HT) are not ignored here nor by CS. But also they are not taken literally.
In particular, matters-of-principle (with Yes/No answers, as mentioned) are treated like proverbs and thus (as explained further in b(ii) below) like common or garden varieties of scientific theories. They all leave open, albeit to different degree, their domains of rewarding application, resp. validity; cf. charity begins at home and familiarity breeds contempt applied to some particular logician writing on logic. Banal as this CS-view is abstractly, it is in sharp conflict with the general scholarly tradition, and especially with the analytical branch of HT, which belabours its precise analyses, and is casual about - defects in - the choice of the objects analysed. The following corollaries to the CS-view above are used throughout the article.
(a) Attention must be paid to the perennials of HT, but without the assumption that systematic (= general and mechanical) answers must be efficient in any wide domain. In fact perennials have counterparts in CS. Here are some samples. (i) Disentangling appearance from reality (or the parallel in [N 1b(ii)] on objective and subjective elements); counterpart: real and apparent motion of the planets. (ii) Possibilities of (α) a priori knowledge, but also of (β) response to experience have counterparts-in-principle with computers: in-built routines, resp. sensors and other processing of inputs. Both (α) and (β) are thus trivial in principle, but generally difficult when specific data processors are to be efficient in a complex environment.
A recent favourite of HT: (iii) Saying what we know (or even: what is shown, for example, by our ‘actions’) in some would-be universal language has other alternatives besides mysticism, such as paraphrases of scattered items. Counterparts are trade jargons including neologisms in, say, topology for elements of visual experience that do not have a name in the vernacular. The list (i) - (iii) could be continued almost indefinitely (and continues similar material in the text).
(b) Especially because of his unique talent for strong language, Kant is a goldmine for applying the CS-view above. For example, according to [B 362] reason demands all knowledge to be derived from one principle, roughly, in order not to confuse the mind. Taken literally, this is not convincing at all. In fact, ‘the’ mind copes very well with different aspects or principles in parallel, which is a different thing from a simultaneous (= running) commentary on what the mind does or knows; cf. a(iii) above. However, if not taken literally, [B 362] underlines, by contrast, a potential conflict, fundamental for the whole of this paper. As soon as 2 aspects are considered, (i) in mathematics - where mere validity is rarely paramount - possible combinations are liable to be sterile. For example, finite groups are ‘much more than the sum of their (abstract) parts’, here, finite structures and arbitrary groups; cf. I.1b; (ii) in natural science, generally knowledge about different aspects cannot be combined - or, in the jargon of the trade, superposed - at all. Thus the equations of motion for a particle under gravitational and under magnetic forces can be superposed, but not their separate solutions. (So near the surface of the earth it can be a major problem to decide which, if either, solution applies).
- An heroic exception: grand unified theories (aim to) eliminate this potential conflict, which is here taken as an objective correlative to romantic gushing; about the beauty or some other all-purpose virtue of such theories. (iii) A familiar alternative to [B 362] (= Simple Simon’s idea of reason) goes back to Socrates: endless lists, but with a new label: epistemological and - a, by [N 1b(ii)], corresponding - ontological anarchy. Scientific experience has established a better alternative:
relatively few aspects that can be combined adequately for relatively large ranges of phenomena (suitably selected; cf. [N 1a]);
see I.1 for more in mathematics, [N 4] in natural science. (A new label would be: oligarchy, as opposed to the absolute monarchy, ruled by a single principle, of [B 362].) Warning against rushing in where angels fear to tread. For CS it seems premature to speculate on reasons for the general success of such oligarchies; especially, in view of the (considerable) background knowledge that is a threshold for any informed discussion of their success in some limited area; for example, in number theory, of - embeddings of Z in - relatively few fields: of the real, complex, p-adic numbers (and, more recently, the use of finite fields). - But CS is also sensitive to the following complement, as it were, of this need for extending experience. CS asks:
(c) Do we already know enough? of course, for some specific conclusion(s). Here the style of HT is often adequate; fittingly, since it goes back to the Greeks, and appeals to (most) able teenagers, both without much experience (phylogenetically, resp. ontogenetically). Concrete examples are those gems of ‘pure thought’ where purely mathematical properties of theories, together with purely ‘qualitative’ experience, are enough to decide between (or discredit) them. First a couple of modest reminders from common experience. (i) Galileo’s refutation of his first theory (ds/dt = cs) for free fall near the earth because then a body at rest would never start. (ii) Olbers’ refutation of an early (perfect?) cosmological principle: the universe cannot be both infinite and homogeneous because, then, at night, it would either be pitch-dark or bright-as-day. - Outside common experience (iii) Einstein’s - expositions of - relativity theories represent a most spectacular singularity of this style (of HT), albeit with domains of significant application for speeds near that of light, resp. for very dense masses (where intuitive notions presumably do not have intuitive appeal!).
3. The CS-view in [N 2], on not being literal-minded, applies also to various relativisms; by Hegel in connection with the so-called historical process, and more recently by Lakatos, in high-faluting jargon: relativity of The Truth. Actually, since precious few details of that ‘process’ are used (or known) it would be less pretentious to abstract from it the knowledge available at any particular time.
This is attempted by Popper, but spoilt by mouthfuls about the inaccessibility of The Truth, embroidered with ‘measures of approximation’ in the scholastic tradition of precision-out-of-all-proportion-to-the-data; cf. (c) below. The ‘evidence’ for this would-be revolutionary stuff consists of snippets of scientific texts, collected and presented without regard for the most elementary statistical precautions (perhaps fittingly, since those relativists eschew inductive reasoning). From a few, admittedly, sometimes radical changes at an early stage - to mention an extreme, when the old Greeks thought of the stars as holes, not bodies, in the sky -, the need for an endless series of corrections is glibly ‘inferred’. So taken literally, the snippets are worthless. CS interprets them differently.
(a) Shifts certainly occur over time. Occasionally they are corrections w.r.t. mere validity. But most often they are shifts in the foci of attention, and hence to new questions, as experience increases; to be compared to changes in the centre(s) of gravity of (significant parts of) a growing material body. But contrary to glib relativisms - and like [N 2b(iii)] on oligarchies - research may establish stability, not only the need for change.
Viewed this way there is nothing revolutionary about those snippets. They are (i) banal abstractly, but, as in the first rule of thumb, (ii) (even) their sober interpretation is in conflict with cherished ideals such as [B 362] in [N 2b] and, indeed, with the ideals of l.f. themselves. - The next two paragraphs contain objective correlatives to literary fashions. Reminder. The latter are, for CS, far too complex for speculations on cause and effect.
(b) Those relativisms have become quite popular even among mature scientists who are (i) ill at ease about the rigidity of l.f., and (ii) generally not accustomed to take grand obiter dicta literally, anyway. So a sober interpretation like (a) above may be an objective correlative to that popularity.
(c) The would-be revolutionary jargon of the relativists, about The Truth, is - not only a piece of exceptionally poor taste, but - also a special case of the tradition in [N 1c(ii)], of straining to find logical defects in l.f.; with the scholastic embroiderings distracting from effective contributions (as in [N 1c(iii)α]), in particular from the reinterpretation in (a) above.
4. Our experience of the material world is so vast and its effects on us, literally, so striking that we cannot afford to ignore any parallels with natural science. Of course, some discretion is needed; if only, because mathematical structures are not substances. But the latter ‘partake of’ abstract, in particular, mathematical forms, which is quite enough for parallels. Digression (again, on the familiar refrain). Banal as this remark is abstractly, it deflates a good deal of literature on - the success of - mathematical physics being unreasonable-in-principle (though in any particular domain the degree of success may be remarkable in either direction; recall [N 2c]). - The following points are more specific.
(a) There are different schemes for describing matter, for example, according to (i) its state (solids, liquids, gases, plasmas), close to some of our intuitive concepts of the material world, cf. [N 1a] or, more poetically, earth, water, air, fire, (ii) old-fashioned chemical composition (of molecules by atoms), refined and enriched in (iii) subatomic structures, resp. lengths of, and angles between the bonds over and above their (graph-theoretic) valency in (ii). The classifications (i) and (ii) are incomparable as opposed to one refining the other: many different chemicals are solid (say, at room temperature), and ice, water, steam have the same chemical composition. Thus (i) and (ii) give different answers to: What is water? At another extreme, the classification (i) does not apply to (iii) at all. In terms of - by I.2 - corresponding orders (of priority), (i) applies only to mixtures (of solids, liquids etc.), and in (ii) elements are prior to molecules. After (iii) was discovered, there was a shift away from some intuitive branches of science corresponding to I.3b, in particular, from chemistry and physics, with a new interpretation of a classical formula: physics and chemistry concern changes of matter itself, resp. of its state. (Changes of macroscopic matter correspond to changes in the state of microscopic matter.)
The moral intended here is that general abstractions about foundations have not reached the threshold of informed discussion unless the experience just adumbrated is taken into account, including such ‘negative’ experience of errors; for example, when (some) physicists in the twenties concluded from the shift above that (iii), the advent of the quantum theory, had superseded (ii), old-fashioned chemistry. In terms of I.1b, the theory may be a ‘least interesting side’, as follows.
(b) The error in question consists in forgetting modules in the sense of I.1c; here, according to the levels of energy available to perturb the objects involved in the phenomenon studied: protons, atoms, molecules, bricks and so forth. In current jargon, CS asks whether any particular phenomenon has quantum-mechanical, chemical, or other character. Roughly, whether or not Planck’s constant appears in its laws, more precisely, whether these are sensitive to details of - alternatives to - current theory. Evidently, this is not settled merely by ‘reducing’ chemistry to the quantum theory. Current physical theory looks very different from the ideals of HT, which were dominant in the twenties; cf. [N 2c(iii)]. Instead, (i) a relatively small number of modules represent oligarchies as in [N 2b(iii)], and (ii) their imaginative combination provides an understanding of the accessible world not dreamt of even 50 years ago; especially, if presented in words and pictures with the artistry of Life on Earth or Planet Earth.
(c) Diverse comments. (i) The intuitive concepts that determine the way we see the world, shapes and colours, do not lend themselves at all well to theory; cf. [N 1a]. - (ii) Those of a(i), together with motion (M, L, T of dimensional analysis), are more amenable to theory. Its early stage, so-called rational mechanics, relied largely on the style of [N 1c], expressed in concepts of perfect (or ideal) rigid bodies, liquids or gases, with quite uneven success.
Accordingly, (iii) the last 300 years of rational mechanics provide memorable examples of both dogged pursuit and shifts of emphasis. (α) The stability and other asymptotic behaviour of, say, 3 body systems have been pursued in celestial mechanics, one of the wonders of contemporary mathematics, but with a tenuous relation to the realities of, say, sun, earth and moon (which are not point masses and are surrounded by interstellar matter, to boot). (β) The theory of functions of a complex variable provides the mathematics of the 2 dimensional steady motion of an ideal liquid. By Cauchy’s theorem such motion does not exert any drag on any cylinder. Accordingly, on the mathematical side there was a shift away from notions and problems required by the hydrodynamic interpretation (= terminology), and on the physical side a shift to phenomena, for example, electric currents to which the mathematical structure is better suited.
(iv) The topic of reliability provides excellent parallels for foundations, too; including the reliability of observations (with questions of principle, and of statistical distribution). But this topic does not seem equally suited to brief reminders.
5. The alliteration of ‘logic’ and ‘language’ is catchy, but obscures some pertinent distinctions and assumptions.
(a) The slogan, of contemporary mathematical jargon, ‘logic is only a language’ (also applied to ‘set-theory’ or ‘category theory’ in place of ‘logic’) codes two complementary points. First, contemporary mathematics uses logical concepts (tacitly, together with some elementary properties), but no recondite theory (= no properties established by fancy deductions); cf. [N 1a] on intuitive concepts. But the slogan also makes the point that, even in mathematics, concepts are used to state facts and not only as objects for mathematical high jinks; this conflicts with the spirit, if not the letter of the would-be hard-nosed tradition in I.2a(ii). - Digression on ‘mandarins’ (or ‘anti mandarins’) pontificating that thrilling pure mathematics must (or cannot) contribute elsewhere; no matter - by the beginning of [N 4] - whether within or outside mathematics. The fact is that, in this respect - current developments of - different branches of mathematics differ remarkably. Thus science or even ordinary life is hardly imaginable without elementary arithmetic, while this is not true of - such main topics as diophantine equations in - higher arithmetic; in contrast, the relative contributions of elementary and higher geometry or analysis are more balanced. The assumptions that any particular mix must be either only temporary or that it must be permanent are equally glib, and comparable to the relativisms of [N 3].
(b) In the jargon of the logical fraternity, logic is said to be even a language; meaning that (i) language is more central to m.l. than to other branches of mathematics.
True, elementary geometry does not use mouthfuls like ‘syntax’ and ‘semantics’ but, say, ‘quadratic equations’, resp. ‘conics’. However, careful attention is given throughout mathematics to notation, for example, to normal forms. Another, perhaps even more embarrassing claim is a kind of converse: (ii) Logical aspects are - not only present in, but - claimed to be particularly significant for languages; from the natural kind(s) to programming languages. This piece of ideology is not new to the computer age, but a legacy from the first half of this century, as follows.
(c) The panacea of ‘reducing’ metaphysical problems to those of (our) language. For CS this is simply an abuse of language. (i) Not only metaphysical, but also other issues have linguistic counterparts. Thus, a particularly coarse oversight may have a purely grammatical counterpart; cf. [N 10] on (ab)uses of the definite article. (ii) As in [N 1a], linguistic concepts, though of course familiar, may be theoretically recalcitrant. They often are, and then the panacea ‘reduces’ obscurum ad obscurius. (iii) In connection with logic and language, CS (supported by [N 2a(iii)]), does not assume that the literary forms of m.l. are generally
adequate, let alone, superior to the rest of the arsenal of literary inventions. M.l. is superior, in the erudite language of [N 4b], in cases that have logical character. HT is a source of such cases, with one proviso. The problems themselves may not have this character, but what the traditional literature says about them does (in which cases m.l. says it better).
6. Much more radical shifts of emphasis are required to exploit the potential of logical aspects in computational than in axiomatic mathematics; (a) away from computation in the case of those aspects that are the subject of recursion theory, even though this is the theory of the perfect computer, and (b) towards computation with those of proof theory, even though the latter (cl)aimed to contribute to reliability-in-principle (and not primarily to algorithmic matters at all). Some of the points alluded to in [N 4c(iii)] are taken up in [K 1].
7. Some uses of terminology in different parts of m.l. and other parts of mathematics.
(a) It is of the essence for informal rigour, cf. [N 1b].
(b) It seems marginal in, say, contemporary lower arithmetic, which does not (seem to) choose its (favourite) problems by the literal meaning of words like amicable or perfect; with compelling parallels in m.l., for example, cardinal arithmetic or Turing degrees. (Hagiographies, alias histories of number theory leave open to what extent words like imaginary or transcendental were (ab)used before a definite meaning, resp. their existence was established; by remarkably simple interpretations, resp. cardinality considerations.) - The remaining uses, which include new names for mathematical disciplines as at the beginning of chapter II and in I.3b, are much more delicate.
(c) The major shifts of emphasis below, mainly in set theory, are (intended to be) conveyed by a single word, as it were, poetically; cf. also [N 2a(iii)]. (i) Gödel’s constructible means ramified (sets), but without the associations of the latter.
The would-be precise literature distracts from this significant shift by making a different point: Gödel used (Zermelo’s) cumulative rather than (Russell’s) simple types. - (ii) Cohen’s forcing is free from distractions by intuitionistic logic, with which equivalent terminology (in the general area of Kripke models) is associated, away from set theory; cf. the review of [Kri]. Pseudo histories about priorities, for example, in anecdotes recounted at Hull (Logic ’86) obscure the significance of the shift. Digression. The anecdotes happen to be garbled. But they are correct enough to document a point not stated there: The kind of formal knowledge of axiomatic set theory, regarded as essential by the logical fraternity, was not needed for Cohen’s particular, significant contributions. - (iii) The following uses relate to second order axioms - in place of the schemata - of ZF. (α) Such axioms are implicit in the familiar term large-cardinal-axioms since the latter are satisfied by countable ordinals, with suitably strange closure properties, if first order schemata are meant. (β) The provocative term second order decidability, say, of CH serves to underline, first, the (corresponding) ineffectiveness of first order, recursive decidability (cf. [N 6a]), and secondly, in line with [N 5a], to state concisely brutal differences in character between different independence proofs. The independence of the parallel axiom
from Euclid’s axioms (together with Dedekind completeness) or replacement from Zermelo’s 1908 axioms is of one kind; of Gödel sentences in formal arithmetic, of CH and ¬CH, but also, (Abel’s) insolubility of quintics by radicals of the other. (In logical terms: the independence of $\forall x_1 \ldots \forall x_5 \exists y\,(y^5 + x_1 y^4 + x_2 y^3 + x_3 y^2 + x_4 y + x_5 = 0)$ from the axioms for fields together with $\forall x \exists y\,(y^n = x)$ for $n \in \omega$.) - Reminder. (γ) The (coarse) notion of second order consequence is perfectly adapted to the simple matters of (α) and (β). By II.2b, in the common or garden variety of problems, second order axioms are generally less suitable than their abstractions.
8. Logical orders of priority have become so familiar that their significance is either assumed or ignored (cf. [N 7b]), and rarely considered. The two elementary examples below concern definitions and proofs; cf. II.4 for more.
(a) Two principal questions are illustrated by the two granddaddies of logical definitions (for N and R). (i) What do Peano’s - original, second order - axioms contribute? (α) The classical answer is: rigour; recall [N 1b(ii)] on safeguards against tacit assumptions about N that may slip in when, say, visualization intervenes. This, realistically dubious, doubt distracts from the need, underlined by Gödel’s incompleteness theorem, for so to speak logical visualization: to see properties of the sets involved in Peano’s axioms in order to exploit the latter. (β) For CS (and [K]) a very different answer presents itself: purity of method; here, of the set-theoretic kind (in connection with arithmetic). - The second question presupposes I.1c(ii), in particular, the fact that the theory of real closed fields is determined by abstraction from Dedekind’s axioms for R. (ii) What does this description of the theory contribute?
compared to (α) the familiar definition, due to Artin and Schreier, which is easy to check in practice, and (β) the logical favourite: the set of first order statements that are true in the field R, which draws attention to possibilities of effective reasoning by analogy (from R to other real closed fields). I do not know any compelling answer except of course for the examination of the logical scheme of abstraction itself; for example, similarities and contrasts with Peano’s axioms and their enrichments by adding equations for + and × separately, resp. jointly. Nor, incidentally, do I know any use of the logical property of (α) being a ∀∃ axiomatization of the theory (in contrast to the other alternatives). In this way the potential of the logical order(s) of priority for definitions is seen to be limited. (b) One old and one recent example, both from set theory, illustrate how CS is at cross purposes with the logical orders for proofs and especially provability that are implicit in the package deal of I.3a. (i) The first concerns the axiom of choice, AC. Here, generally, CS requires the introduction of AC, when some theorem T has been proved by use of a particular choice set, but is discovered to follow from the mere existence of such an object; cf. I.4b. For CS, (α) metatheorems about eliminating AC, if T is, say, absolute, constitute a neat complement provided (β) it does not degenerate into preoccupation with some unspecified validity
of AC (without specifying the kinds of sets considered), and thus into a distraction from the question: Which properties of T - say, of an algorithmic kind (in [K 1]), beyond mere validity - do depend on particular choice sets? Trivially, (γ) the omission, say, by Cantor, to state AC explicitly is, of course, not interpreted as putting its legitimacy (= validity for sets generated by Cantor’s favourite, the power set operation) into question; no more, mutatis mutandis, than Euclid’s omission to state axioms for order. Last, but not least, (δ) AC is not expected to hold in collections of sets that are definable in some simple-minded way (in contrast to Gödel’s constructible or ordinal definable sets; and by the proviso in II.1c(i), intuitive definability may not lend itself to theory anyway). - (ii) The second example concerns (the power set operation, and) the replacement axiom, which Zermelo omitted in 1908, or, more specifically, transfinite recursion, which Fraenkel and von Neumann considered in the twenties, and Borel determinacy. This belongs, by II.4a(iii), to a marginal part of mathematics, and is thus not (professionally) rewarding to contemporary mathematicians. But given the good will to consider Borel sets $B_\alpha$, for $\alpha$ in a countable well ordering $<$, at all, Martin’s idea is compelling: For any ‘game’ X, a strategy $\sigma_\alpha$ is obtained from $\sigma_\beta$: $\beta < \alpha$, for a suitable game on the power set of X, no less compelling (and of course more ingenious) than Cantor’s ‘cardinality considerations’ in transcendence theory; cf. [N 7b]. An imagined (= Gedanken) experiment shows that transfinite induction on $<$ here is comparable to AC and to geometric order in (i)γ above. (α) The metamathematical argument, going back to H. Friedman, which establishes the logical inadequacy of Zermelo’s list in 1908, is a neat complement.
But (β) CS does not interpret (α) as evidence of some kind of reduced reliability (security), which would distract from (obvious) alternative interpretations in terms of the - not algorithmic, but definitional - complexity of $\sigma_\alpha$.

9. It is a (sociological) commonplace that the often heated foundational debates strike outsiders as a storm in a tea cup. An objective correlative refers to the unit of measurement used (for the differences between the foundational ‘schools’). For the logical tradition it is the narrow range of logical orders of priority, for outsiders it is the full or at least a broader range. - (a) Examples of the latter are provided by alternative interpretations of those schools in II.2c(i) or, more fully, in [K] (where II.1c(i) is related to taking homomorphic images, Hilbert’s programme to - here, finitistic - purity of method, intuitionistic logic to enlarging the category of completely to suitable partially defined notions). - (b) Einstein’s well known description of those debates, as a war between frogs and mice, is, abstractly, impeccable, but a bit ironic, since those debates centered around hoary isms (applied to mathematics, cf. end of II.2), and were thus parallel to Einstein’s (later) debate with Bohr. More specifically, that debate concerned (i) idea(l)s on objective and subjective descriptions of nature, in particular, their completeness. Einstein himself disregarded (α) the means of description, tacitly, available according to theory, and thus (β) the obvious parallel with Gödel’s incompleteness theorem for descriptions of N (by formal means, extended by Tarski to so-called internal means). Correspondingly, (ii) for such outsiders - to the debate about (i) - as down-to-earth physicists differences between the views of Bohr and Einstein appear exaggerated compared to, for example, those between the different components of matter that are prominent at different levels of energy; cf. the end of I.1c on [N 4b] on modules.

10. Let there be no mistake. For traditional l.f., especially, the part preoccupied with reliability-in-principle, questions of the validity of principles (like AC in [N 8b(i)α]) are - not only problems to be solved, but - a principal raison d’être; and, here, paradoxes are a godsend. Unlike mathematics, which is a conservative discipline, proverbial for its rigour, an examination of l.f. cannot afford to treat those puzzles with (benevolent) neglect. It is best to look at them provided only they are not belaboured, which, as always, would distract from more rewarding matters, for example, from parallels in broader scientific culture. Some main points are illustrated by Russell’s paradox since it is here typical enough. (a) Of course, the paradox does not arise before abstract sets or predicates are a subject of study, which they are not for CS; nor were they for most contemporaries of Cantor and Frege (who complained of being ignored). (b) For specialists a century ago, oversights in Frege’s form of comprehension were either stated explicitly or implicit in the oral tradition. (i) Cantor himself pointed out - in 1885, 16 years before Russell’s paradox - that a predicate, a Vielheit, as it were, cannot generally be assumed to have a definite extension (a hallmark of an Einheit). At another extreme, (ii) those familiar with, say, Venn diagrams (and the part/whole relation between subsets of some universal set) tacitly restrict $a$ in $a \in b$ by: $\exists y\,(a \in y)$.
In short, (iii) at that stage it would have been premature to lay down what (kinds of) sets to consider. As a corollary, and in accordance with the third rule of thumb, there could be no question of a unique oversight! However, for CS, (i) - (iii) are too delicate compared to the brutal ‘nature’ of the paradox, and belong to the study of particular kinds of sets. (c) Ironically, Russell himself drew attention to a more fitting view, the abuse of the definite article (cf. [N 5c]); at least, modulo the shift from ‘the present king of France’ to ‘the class of sets belonging to themselves’. Consider then the indefinite article, that is any X satisfying R: $\forall Y\,(Y \in X \rightarrow Y \notin Y)$. (i) For a literate use of ‘the’, applied to X: R(X), there must be a largest such X. But (ii) if R(X) then $X \notin X$. So $X \cup \{X\} \ne X$ and $R(X \cup \{X\})$ hold too, recalling (children’s paradoxes about) ‘the’ largest integer, say, $x_0$ (where $x \cup \{x\}$ is - von Neumann’s - successor of $x$); for example, $x_0 = 1$ since otherwise $x_0^2 > x_0$ or $x_0 = 0$, and $1 > 0$. Incidentally, (iii) for Russell’s analysis of ‘the’, both ‘the present king of France is bald’ and ‘the present king of France is not bald’ are false, which is a paradox (and a mild joke) too. (d) What moral follows from (b) and (c)? (i) Not that there is no rewarding relation between paradoxes and other topics such as the import of ‘innocent’ axioms; cf. $sx \ne 0$ for the
arithmetic of finite fields (and the children’s paradox above). (ii) Not that there can be no systematic oversight in CS; but only that the paradoxes aren’t it. - Readers ill at ease with this irreverent view of the logical tradition are recommended to look at [N 11].

11. The reminders below, which list some simple, but salient facts about m.l., also underline differences from earlier dicta; especially those that were accepted by all sides (of the foundational debate, and, perhaps fittingly, have persisted). Thus this note continues the philosophy of [N 1] - [N 4]; far from preaching preestablished harmony it stresses progress, including the discovery of ‘harmony’ between certain aspects of first thoughts (or, at least, words) and later experience. Some - warnings concerning - morals that present themselves are at the end. NB. First thoughts are expressed below (anachronistically) in current jargon. (a) 100 years ago the expressive power of predicate logic seemed wholly inadequate. (i) It is for such intuitive concepts as N and R. (ii) It is enough for much current axiomatic mathematics or, at least, for its axioms (though not always in their simplest form; cf. orderable and real closed fields). (b) Some 75 years ago both ramified types and intuitionistic logic seemed inherently (i) formally intractable and (ii) proof-theoretically weak. (iii) One side accepted those unorthodox (= deviant) types, resp. logics for the sake of The Truth, the other rejected them for the sake of Progress, when in fact both (i) and (ii) are false. The deviant styles are related to classical types and logics by a web of embeddings and equivalences; cf. II.2c on a logical mainstream. More specifically, (α) in connection with ramified types the glamour issue was the notorious reducibility axiom, which, by Gödel’s work on L, holds provided only that - not all (ordinal), but - just (constructibly) cardinal levels are considered.
(β) In connection with intuitionistic logic a principal bone of contention was the notion of choice sequence, recalling (Aristotle’s) qualms about incompletely defined terms; cf. II.1a. Intuitive properties of - variants of - this notion are the single most compelling contribution of this deviant logic to m.l. (and to itself; cf. completeness and disjunction properties); in contrast to threadbare (cl)aims about reliability or algorithmic efficiency (cf. [K 1]). - In a somewhat different vein, (c) after Gödel’s incompleteness theorem, would-be fundamental culprits were ‘identified’ in so to speak philosophical lemmas about metalanguage/object language, self reference etc. They have lost their sting with provability interpretations of modal languages, and simple fixed point theorems. (d) Some 50 or 25 years ago the ‘quantum jumps’ in m.l. seemed to (i) remain isolated, for example, Gödel’s work on L compared to, say, L-measure (when Lebesgue’s dissertation appeared) or (ii) lack of ingenuity and/or elegance in m.l. compared to, say, harmonic analysis. Busy work since then has corrected (i), and, in 300 years, the high spots of contemporary m.l. will surely be appreciated, for their ingenuity, as much as work on amicable and perfect numbers in the 17th century is today; cf. [N 7b(i)] and p. 53 of 11.
Discussion. The contrast in (a) - (d), between so to speak fossils and survivors among (species of) ideas, is not meant here for some kind of developmental history, but mainly for rhetoric. Specifically, for stating discoveries memorably, and generally for that touch of irreverence that is needed for learning from experience; cf. the second rule of thumb in the introduction, and of course [N 2] on not taking the traditional literature literally. But, spurious speculations about cause and effect aside, (a) - (c) are properly related to a general malaise. In (a) and (b), the unfounded specific complaints - about restrictions to elementary logic and to its deviants - are related to a well-founded malaise about the master assumption that the restrictions are needed for reliability, and in (c) to a lack of proportion: between Gödel’s nice idea and its use for merely discrediting oversimplistic programmes (in Principia, and to a lesser extent of Hilbert). But finding a rewarding use requires a new idea; it is not enough to bandy about hackneyed stuff on, say, Mind and Matter.
References

J. Ax and S. Kochen, Diophantine problems over local fields. Amer. Jour. of Math. 87 (1965) 605-648.
A. Baker and H. Davenport, The equations 3x^2 - 2 = y^2 and 8x^2 - 7 = z^2. Quarterly J. Math. Oxford Ser. (2) 20 (1969) 129-137.
N. Bourbaki, L’architecture des mathématiques, pp. 35-47 in: F. Le Lionnais (ed.), Les grands courants de la pensée mathématique. Blanchard, Paris 1948.
-, Foundations of mathematics for the working mathematician. Journal of Symbolic Logic 14 (1949) 1-8.
L. Carleson, Steele Prize (acceptance speech), Notices Amer. Math. Soc. 67 (1984) 568-569.
G. Kreisel, Frege’s foundations and intuitionistic logic. The Monist 67 (1984) 72-91.
-, Logical aspects of computation: contributions and distractions, to appear (in Logic ‘87).
S. A. Kripke, Semantical analysis of intuitionistic logic I, pp. 92-134 in: J. N. Crossley and M. A. E. Dummett (eds.), Formal systems and recursive functions. North-Holland, 1965; reviewed in: Journal of Symb. Logic 35 (1970) 330-332.
A. Macintyre, On definable subsets of p-adic fields. Journal of Symb. Logic 41 (1976) 605-610.
-, Twenty years of p-adic model theory, pp. 121-153 in: J. B. Paris, A. J. Wilkie, and G. M. Wilmers (eds.), Logic Colloquium ‘84. North-Holland 1986.
[W] A. Weil, Sur la formule de Siegel dans la théorie des groupes classiques. Acta math. 113 (1967) 1-87.
[W 1] -, Number theory: an approach through history: from Hammurapi to Legendre. Birkhäuser, Boston 1984.
Logic Colloquium '87
H.-D. Ebbinghaus et al. (Editors)
© Elsevier Science Publishers B.V. (North-Holland), 1989
ON THE USE OF DIAGONALLY NONRECURSIVE FUNCTIONS

Antonín KUČERA
Department of Computer Science, Charles University, Malostranské náměstí 25, 118 00 Praha 1, Czechoslovakia

1. INTRODUCTION

It turns out that functions which diagonalize all partial recursive functions possess important properties and that the study of these functions can give us a new view on some results in recursion theory as well as new methods to prove them. Let $\{e\}$ be the $e$th partial recursive function in some standard enumeration of all such functions and let $W_e$ be the domain of $\{e\}$. A (total) function $g$ is called a fixed point free function (an FPF function) if $W_e \ne W_{g(e)}$ for all $e$. A (total) function $f$ is called a diagonally nonrecursive function (a DNR function) if $f(e) \ne \{e\}(e)$ for all $e$. A degree containing an FPF function is called an FPF degree. Although DNR functions differ from FPF functions, FPF degrees coincide with degrees of DNR functions ([9], Lemma 4.1). Usually it is more convenient to work with the class of DNR functions instead of the class of FPF functions since the class of DNR functions forms a $\Pi^0_1$ class of functions. We often restrict our attention only to the class of 0-1 valued DNR functions (or their equivalents) since (due to the compactness) to work with recursively bounded $\Pi^0_1$ classes has further well-known advantages. Let us also recall that degrees of 0-1 valued DNR functions coincide with degrees of complete extensions of Peano arithmetic and also with degrees of sets which separate an effectively inseparable pair of r.e. sets (see [11], [24], [9]). This is why a degree containing a 0-1 valued DNR function is called a PA degree.

We summarize some known facts about FPF degrees and PA degrees. From a global point of view the class of FPF degrees has (Lebesgue) measure 1 ([13]) while the class of PA degrees has measure 0 ([11]). On the other hand, no 1-generic degree is an FPF degree and thus the class of FPF degrees is meager in the sense of Baire category (cf. [11], [9], [8]). The class of FPF degrees (as well as PA degrees) is closed upwards. Obviously $\mathbf{0}'$ is an FPF degree while $\mathbf{0}$ is not an FPF degree. From a local point of view there is an especially interesting area of degrees $\le \mathbf{0}'$. M. M. Arslanov [1], [2] used the class of FPF degrees to give the following completeness criterion. If $A$ is an r.e. set then $A$ is complete (i.e. $\deg(A) = \mathbf{0}'$) iff there is a DNR function recursive in $A$. This completeness criterion was further extended in [9] from r.e. sets to all finite Boolean combinations of r.e. sets and even to all finite levels of n-REA hierarchy introduced in [10]. Thus, one might conjecture that there are no FPF degrees
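The definition of a diagonally nonrecursive function given in the introduction can be illustrated by a finite toy. The sketch below is only an analogy under loud assumptions: a "partial function" is modelled as a Python callable that returns `None` where it is undefined, and the finite list `table` stands in for the enumeration of all partial recursive functions. Both the name `diagonal` and this representation are hypothetical. In the genuine setting divergence cannot be decided, which is exactly why no DNR function is recursive.

```python
from typing import Callable, List, Optional

# Toy stand-in for a partial function: returns None where it is undefined.
Partial = Callable[[int], Optional[int]]


def diagonal(table: List[Partial]) -> List[int]:
    """Return 0-1 values f(0), ..., f(len(table)-1) such that
    f(e) differs from table[e](e) whenever the latter is defined."""
    values = []
    for e, phi in enumerate(table):
        out = phi(e)
        # Pick 1 when the diagonal value is 0, and 0 in every other case
        # (including the undefined case, where any value would do).
        values.append(1 if out == 0 else 0)
    return values


# A small hypothetical "enumeration" of four partial functions.
table: List[Partial] = [
    lambda n: 0,                      # total, constant 0
    lambda n: None,                   # nowhere defined
    lambda n: n % 2,                  # total, parity
    lambda n: 1 if n == 3 else None,  # defined only at 3
]

f = diagonal(table)
print(f)
```

The only point of the toy is the shape of the condition `f(e) != {e}(e)`: the value at `e` has to avoid a single number, and only when that number exists.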
, and let z
s+l
+
M
..., xs+l-1, E
As n
x
=
, we have
Ext(oz) and, thus, W. is 1
necessarily recursive. Let
It is easy to see that there is a function f recursive in
v
that f =
fs, i.e.
f =
Since the set M
is r.e.
=
'
s+ 1 ,Is+1 )
= W .( z ) for all z and all g 3
$ ' '-recursive.
xs+l
SI
be the least z satisfying (1) (for i
s+ 1 Observe that for all x = x +1,
{i)g(z)
E
u us s x
$ ' I ,
f
such
E Go,
. Note that the sequence {xS1 S E N is
in $ ' for any fixed s it follows that the set of
all strings strictly preceding f
is r.e. in 0' for any s. It easily follows
from the construction that also the set S of all strings strictly preceding f is r.e. in 0'.
Since f
r
S,
deg(f) is r.e.
in
0'.
It remains to show that deg(f) does not bound any nonzero r.e. degree. Observe that nonrecursiveness of W . implies Vi zfs( = xs). Thus, if W j 7 f is nonrecursive then we have for all i {il # W . and, moreover, 3 # Wj for all g E As+l (for s satisfying x = ).
0
Remark 1. The above proof can be easily modified to yield a PA degree $\mathbf{a}$ r.e. in $\mathbf{0}'$ such that $\mathbf{a}$ does not bound any nonzero degree $\le \mathbf{0}'$, i.e. $(\mathbf{0}', \mathbf{a})$ is a minimal pair.

The above theorem shows that if we want to extend the use of GNR functions (equivalently, DNR functions) for more complicated priority constructions we have to take more care about constructions of the corresponding GNR functions and not to rely on a sufficiently powerful oracle in a simple
minded way. We start with a construction of a PA degree 2 such that 5 # a' *
= 0"
to produce a high incomplete PA degree, i.e. a PA degree and
5' L
a'
=
Oft.
= Of', N
for all Y Y y and we obtain a nonempty subclass of A. In fact, {x }(y) is an analogue 0
of a flexible formula in axiomatizable arithmetics (like Peano arithmetic). This concept was first introduced and studied by A. Mostowski [19] and S. Kripke [12] as a generalization of Gödel's incompleteness theorem. Proof. By the Recursion Theorem choose x iff there is a string
A n {f
:
T
0
such that for all y
{x }(y)+ 0
such that
f(<xo,j>) = T(j), for all j
lh(T)}
) = ~ , ( j )
(j < Ih(ro)) so that A n {f : f(<xofj>)= r0(j), j < l h ( ~) } would imply A =
0, a contradiction. Thus, there is no
=
0
A
which
satisfying (2) and
T
therefore {x }(y)t for all y. Moreover, (since ( 2 ) does not hold) for every 0
string T there is a function g
A such that g(<xo,j>) = ~ ( j for ) all
E
j < lh(T). Using the compactness (of the Cantor space {O,l} y
c
3g E
Au y(g(<xo,y>) = C(Y) 1.
0. Observe that if A
-
'c0 which causes {x } (y)J. = 1 0
T
0
=
0
=
o
independently on the as-
0
(y) for y < lh(T
0
)
but we may control 0
(as above), or
iil {xo}(y)l. and V C 3 g a BVy(y some nonempty n o class, A 5 1
B
we have
0 then, of course, we necessarily find
{xo}(y) for y 2 lh(To). We can require for all y 2 1h(T
i) {x )(y)b
)
0
Remark 3. The method of the above proof yields x sumption A #
N
2 lh(ro)
-+
)
either
g(<x ,y>) = C(y)), where 8 is 0
5 G o , given in advance.
By the previous theorem, in every nonempty $\Pi^0_1$ class $A \subseteq G_0$ we can effectively find infinitely many places for coding information. Thus, in a natural way we have a mapping from $\{0,1\}^N$ into a collection of nonempty subclasses of $A$ which assigns a nonempty $\Pi^{0,C}_1$ class to every set $C$. It is easy to see that if $C$ is recursive then the image, i.e. the class

$A(C,x_0) = \{f : f \in A \;\&\; f(\langle x_0,y\rangle) = C(y), \text{ all } y\}$

is again a nonempty $\Pi^0_1$ class. This fact enables us to combine the usual methods for $\Pi^0_1$ classes like various modifications of the proof of the Low Basis Theorem (i.e. forcing by $\Pi^0_1$ subclasses) with the method of coding infinitary (effective) information.
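The coded class described above can also be written as an intersection. The display below is only a restatement in the notation of this section, with $\langle\cdot,\cdot\rangle$ the pairing function; it is a sketch of why the claim is plausible, not part of the original argument.

```latex
\[
  A(C,x_0) \;=\; A \,\cap\, \bigcap_{y \in \mathbb{N}}
  \{\, f : f(\langle x_0, y\rangle) = C(y) \,\}.
\]
```

Each set in the intersection constrains a single value of $f$. When $C$ is recursive, the condition "$f(\langle x_0,y\rangle) = C(y)$ for all $y$" is itself a $\Pi^0_1$ condition on $f$, and the intersection of two $\Pi^0_1$ classes is again $\Pi^0_1$. Nonemptiness is the part supplied by the previous theorem.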
Theorem 3. There is a PA degree $\mathbf{a}$ such that $\mathbf{a} \not\ge \mathbf{0}'$ and $\mathbf{a}' = \mathbf{0}''$.

Proof. 1) An easy way to prove this is to construct (uniformly in $\emptyset''$) a sequence of (indices of) nonempty $\Pi^0_1$ classes $\{B_i\}_{i \in N}$ such that $G_0 = B_0 \supseteq B_1 \supseteq \ldots$, $\bigcap_i B_i$ contains exactly one element, say $f$, $\{i\}^g \ne \emptyset'$ for all $g \in B_{2i+1}$ and all $i$, $f' \equiv_T \emptyset''$. To ensure this we can construct at step $i$
i) a nonempty $\Pi^0_1$ class $B_{2i+1} \subseteq B_{2i}$ so that for some fixed $z$ $\{i\}^g(z) \ne \emptyset'(z)$ for all $g \in B_{2i+1}$, using an oracle $\emptyset''$,
ii) a $w$ such that $B_{2i+1} \cap C_j \ne \emptyset$, where $C_j = \{f : f(\langle w,0\rangle) = j\}$, for $j = 0,1$, applying Theorem 2,
iii) a nonempty $\Pi^0_1$ class $B_{2i+2} = B_{2i+1} \cap C_k$, where $k = \emptyset''(i)$, using an oracle $\emptyset''$.
(Roughly speaking, for each $i$ the information about $i \in \emptyset''$ is coded at just one place.) We will take a slightly different (more complicated) way since we would like to modify the construction later for the case of a high incomplete PA degree (working only with an oracle $\emptyset'$) as well as under further refinement for the case of a high incomplete r.e. degree.
2) We will use a more adaptable type of coding of $\emptyset''$, for each $i$ the information about $i \in \emptyset''$ will be coded by an infinite binary sequence. Let $h$ be a recursive function such that for all $y$: $y \in \emptyset''$ iff $W_{h(y)}$ is finite, and $y \notin \emptyset''$ iff $W_{h(y)} = N$ (see [25], Theorem IV.3.2), and let $H_y$ be the characteristic function of $W_{h(y)}$ ($H_y$ is recursive for all $y$ but we need an oracle $\emptyset''$ to find an index of it).
The construction. Let 8
S t e p i. Let M = {x
= G 0 0' : {i}'(x)+ = 0 holds for all g E
that M is an r.e. set. Thus, there is a z such that
BZi}.
It is easy to see
or z
$' n M. (Such z can be found effectively from an index
E
Let
).
B2i+1
=
B2i
if z E 0' and
11
n {g : lilg(z)+ v {ilg(z)+ =
B2i+l
if z
c
$ 1 .
is a nonempty no class which forces the inequality {i}' 1
B2i+l
for all g E
#
$I,
we have {i}g(z) # $ ' ( z ) . Applying Theorem 2 to
i.e.
2i+1 we
obtain xi such that Yc3g(g
E
B2i+l
&Yy(g(<xity>) = C(Y))).
Observe that for any function g E 8 i
E
i L
2i+2 k(g(<xi,y>) = 0 ) and
$I'
iff
3kYy
$ I T
iff
3 k Q y 5 k(g(<xiIy>) = 1)
2
iff Qy(g(<xiry>) = 1 ) .
This completes the step i of the construction. It is easy to verify that Clearly, f
sT
1 Bi contains exactly one element, say f.
It is obvious from the construction that f 2
$ 1 1 .
It remains to show that f'
T 8'. we did not decide directly at f fl (we decided only whether {i) (214 &
zT $ I t . Although
!,
step i whether i E f' or i f {i} ( z ) = 0 or not) a standard argument can be used to show that we de-
cided this in some other step k, where k can be effectively computed from i. More precisely, we use a recursive function d such that
Thus, f' sT
$ 1 ' .
To see that 0 ' ' sT f' observe that using an oracle f' we
are able to reconstruct the whole construction and decide (recursively in f') at every step for which j = 0 , l we have 3 k V y 2 k(f(<xi,y>) = j ) (where x. is the number chosen by Theorem 2 at step i).
0
We would like to modify the above proof to obtain a high incomplete PA degree. The main difficulty is that during a $\emptyset'$-oracle construction we cannot $\emptyset'$-recursively decide whether $y \in \emptyset''$ or $y \notin \emptyset''$. We shall overcome this obstacle by using $\emptyset'$-approximations to answer these questions. Roughly speaking, we will use a finite injury method during a $\emptyset'$-oracle construction. It will lead to a necessity to change our coding technique to be able to follow changes in $\emptyset'$-approximations to $\emptyset''$. We have to ensure that replacing some $\Pi^0_1$ class by another one at a given step, a finite initial part of the constructed function $f \in G_0$ constructed up to this step is extendible to an element of this new $\Pi^0_1$ class. To do that we leave a uniform style of coding into $\Pi^0_1$ classes (where coding places do not depend on members of a given $\Pi^0_1$ class and form an infinite recursive set) and we use a more flexible one given us by the following lemma.
Lemma 4. For every nonempty $\Pi^0_1$ class $A \subseteq G_0$ and every $A$-extendible string $\sigma$ there is $w$ such that there are $g_0$, $g_1$ from $A$ satisfying $\sigma \subseteq g_i \;\&\; g_i(w) = i$ for $i = 0,1$. Such $w$ can be found effectively from $\sigma$ and an index of $A$. Let us fix some recursive function code such that $\mathrm{code}(a,\sigma)$ is such a $w$ ($a$ being an index of $A$).

Proof. The statement is a straightforward corollary of Theorem 2 when applied to the $\Pi^0_1$ class $A \cap \mathrm{Ext}(\sigma)$.
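The statement of Lemma 4 can be pictured on a finite stand-in. In the sketch below a "class" is just a finite list of equal-length 0-1 strings, and `coding_place` (a hypothetical name, as is the choice of the least suitable position) looks for a position `w` at which two members extending `sigma` take the values 0 and 1. This only illustrates what the lemma asserts. The lemma's guarantee that such a `w` always exists comes from Theorem 2 and the assumption that the class lies inside $G_0$; an arbitrary finite list has no such guarantee, so the toy may return `None`.

```python
from typing import List, Optional


def coding_place(members: List[str], sigma: str) -> Optional[int]:
    """Least position w >= len(sigma) at which some member extending sigma
    has value '0' and another member extending sigma has value '1'.
    Returns None when no such position exists in this finite toy."""
    extensions = [g for g in members if g.startswith(sigma)]
    if len(extensions) < 2:
        return None
    shortest = min(len(g) for g in extensions)
    for w in range(len(sigma), shortest):
        if {g[w] for g in extensions} == {"0", "1"}:
            return w
    return None


# Hypothetical finite "class" of three binary strings.
toy_class = ["0010", "0011", "0110"]

for sigma in ["0", "00", "01"]:
    print(repr(sigma), coding_place(toy_class, sigma))
```

In the toy, the string `"01"` has a single extension, so no coding place exists for it; this is the situation that Lemma 4 rules out for nonempty $\Pi^0_1$ subclasses of $G_0$.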
Remark 4. If we start with the empty string and a nonempty $\Pi^0_1$ class $A \subseteq G_0$ with index $a$ and repeatedly use Lemma 4 in an obvious way we obtain a tree of coding places such that every member $g$ of $A$ determines an infinite path through this tree and, thus, we can find $g$-effectively infinitely many coding places along $g$. Namely, for $g \in A$ let $g_{0,a} = \emptyset$ and for $n \ge 0$ $g_{n+1,a} = g \restriction (\mathrm{code}(a,g_{n,a}) + 1)$. Roughly speaking, $g_{n,a}$ is a piece of $g$ up to the $n$th coding place along $g$. Thus, we have again a mapping from $\{0,1\}^N$ into a collection of nonempty subclasses of $A$ which assigns to any set $C$ a $\Pi^{0,C}_1$ class

$A(C,a) = \{f : f \in A \;\&\; f(\mathrm{code}(a,f_{n,a})) = C(n), \text{ for all } n\}.$

It is easy to see that if $C$ is recursive then $A(C,a)$ is a $\Pi^0_1$ class.
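The iteration in Remark 4 can be sketched on the same kind of finite stand-in. Here `coding_place` plays the role of the function code (with the class passed directly instead of an index; both names and the "least position" choice are assumptions made for illustration), and `decode` follows the recursion $g_{n+1,a} = g \restriction (\mathrm{code}(a,g_{n,a})+1)$, reading off the bit of $g$ at each coding place. In a finite toy every path runs out of coding places, whereas in the setting of the remark there are infinitely many along every member.

```python
from typing import List, Optional


def coding_place(members: List[str], sigma: str) -> Optional[int]:
    """Least position w >= len(sigma) where two members extending sigma differ."""
    extensions = [g for g in members if g.startswith(sigma)]
    if len(extensions) < 2:
        return None
    shortest = min(len(g) for g in extensions)
    for w in range(len(sigma), shortest):
        if {g[w] for g in extensions} == {"0", "1"}:
            return w
    return None


def decode(members: List[str], g: str, max_bits: int) -> List[int]:
    """Read the bits coded along g: start from the empty string, find the next
    coding place w for the current piece of g, record g[w], and extend the
    piece to g[:w+1]; stop after max_bits or when no coding place is left."""
    piece = ""          # corresponds to g_{0,a}, the empty string
    bits: List[int] = []
    for _ in range(max_bits):
        w = coding_place(members, piece)
        if w is None:
            break
        bits.append(int(g[w]))
        piece = g[: w + 1]   # corresponds to g_{n+1,a}
    return bits


# Hypothetical finite "class"; each member codes a short bit sequence.
toy_class = ["0010", "0011", "0110", "1000"]

for g in toy_class:
    print(g, decode(toy_class, g, 5))
```

Note that the coding places depend on the member being read: `"0011"` is read at positions 0, 1, 3 while `"0110"` is read at positions 0, 1 only. This dependence on the member is the flexibility that the remark contrasts with the uniform coding used earlier.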
Theorem 5. There is a high incomplete PA degree, i.e. a PA degree 2 such that 2
) = 0
for
(the set of all such y's is either empty or cofi-
all y for which {x}(y)4
nite and, thus, recursive). Then there is a high r.e.
set
A
recursive in f. Moreover, an index of
A
can be found uniformly from a P'-index of Pt. Note. Let us explicitely mention -
that A is constructed only from Pt inde-
pendently of f. The crucial point is that
A
is recursive in all f's which
satisfy the above assumptions. Proof. Since Pt is gl-partial to Pt, i.e.
recursive we can use recursive approximatjons
by the limit computability there is a recursive function R
such that for all a
Pt(a)+ iff lim R ( s , u ) exists and if Pt(U)+ then S
Pt(a) = lim R ( s , a ) . Thus, at each step s we have current candidates xs ,o , a f xsljlo-
S
...rxs,i ,u
for Pt(al\ 0),
= R ( s , u l \ j) for j = 0,
...,Pt(o) , where i = lh(u), i.e.
...,i.
We describe how to construct an r.e.
set A with the desired properties.
at step s (i.e. y E Wh(i),at s ) . We enumerate Suppose y appears in W h (i) into A at the first step t, t 2 s , at which there are a string (I, lh(a)
..
5
=
if and a finite sequence to, ...,t.
ti 2 s
such that t = to 2 tl 2
....
and the following conditions are satisfied:
(i) {X~,~,~}(Y)+ = 0 at some step 5 ti, (ii) for all those j's for which j < i & u(j) = 0 we have {x }(k)+ Yfjra for described j ' s
0 at some step 5 t for all k = 0, ...,tj+l j t. is an upper bound for a step before which
=
(i.e.
1
{x }(k)+ = 0 for all k = 0, ...,tj+l)f Yfjra (iii) there is no change in recursive approximations to Pt(Ul\ 0),
...,Pt(u)
from step y up to step t, i.e. x = x for all k = y, ...,t Y,jr(J krjru j = 0, ,i,
...
and
(iv) no new element entered W from step y up to step t for all those h(j) = wh(j1 ,Y for j < i & u ( j ) = 1. j's for which j < i & u(j) = 1, i.e. W h(j) ,t (In an obvious sense we say that an element is enumerated into A at step t
on the basis of a ) . A
is obviously an r.e.
set.
It is not difficult to see that $A$ is high. Really, if $\sigma \subseteq \emptyset''$ (i.e. $\sigma$ is a true guess of length $lh(\sigma)$ about $\emptyset''$) then recursive approximations to $Pt(\sigma\restriction 0), \ldots, Pt(\sigma)$ eventually reach the final values, for every $j$ for which $j < lh(\sigma) \;\&\; \sigma(j) = 1$ all elements of $W_{h(j)}$ eventually appear and, therefore, for all sufficiently large $y$ we have $\langle i,y\rangle \in A$ iff $y \in W_{h(i)}$ ($i = lh(\sigma)$). Thus, for all $i$ the set $\{y : \langle i,y\rangle \in A\}$ is cofinite if $i \notin \emptyset''$ and finite otherwise. It follows that $A$ is high, i.e. $A' \equiv_T \emptyset''$.
I t remains t o show t h a t A 2 T Since f o r every i t h e r e a r e only f i n i t e l y many a ' s s a t i s f y i n g l h ( a ) = i
it i s s u f f i c i e n t t o show t h a t f o r a l l i , y and a , l h ( u ) = i , we can f - e f f e c t i v e l y compute an upper bound f o r a s t e p a t which
can be enumerated
i n t o A o n t h e b a s i s of a a t a l l . Observe t h a t f o r any a such t h a t l h ( a ) = i
g T 1 and j 0 i s t h e l e a s t j < i such t h a t a ( j ) # 0" ( j ) then then i n f i n i t e l y many w ' s e n t e r W ad) i f a ( j ) = 1 & j o # 0" 0 h(jo) ( a t l a r g e r and l a r g e r s t e p s ) ,
a) if u
ab) i f a ( j o ) = 0 & j o E Pt(ul\ jo)+) for
0"
then ( s i n c e
x = Pt(uI\ j 0
)
o r j,
5
a''
w e have { x } ( y ) 4
&
and, t h u s , f(<x,y>) = 0
a l l s u f f i c i e n t l y l a r g e y (more p r e c i s e l y , f o r a l l y # W h ( j (note t h a t
{x} ( y ) + + {x} ( y )
= 0 & f ( < x , y > ) = 1 and t h u s
f (<x,y>) = 0 excludes t h e p o s s i b i l i t y b) i f a 5
0
,
then a f t e r both t h e l a s t change i n r e c u r s i v e approximations
t o Pt(al\ 0 ) , ...,Pt(u) for j < i
( y ) + = 0)
for
)),
&
and t h e l a s t s t e p when some element e n t e r e d W
a ( j ) = 1 a l l g r e a t e r y ' s f o r which f ( < x , y > ) = 1 (where
x = P t ( a ) ) a r e r e a l l y enumerated i n t o A.
h(j)
...,
( l h ( a ) = i) w e take x f o r j = 0, i (i.e. the y f j ,a c u r r e n t candidates a t s t e p y f o r P t ( a l \ j ) ) and we claim t h a t we can f-ef-
Thus, given i , y , a
f e c t i v e l y f i n d t such t h a t e i t h e r
i s enumerated i n t o A a t s t e p t
(on t h e b a s i s of a ) o r t h e r e i s no o t h e r chance t o do i t a f t e r t h e s t e p t. Our claim follows from p a r t s a ) and b ) described above s i n c e t h e r e necessa r i l y occurs ( a t l e a s t ) one of t h e following cases:
-
f(<x , y > ) = 0 ( i n t h i s c a s e t h e r e i s "no chance" a t a l l ) , Y,i,cJ t h e r e i s a change i n r e c u r s i v e approximations t o P t ( a l \ j ) , j 5 i , a t
-
f o r some j _< i and t > y ) , Y,jra # Xt,j,u some element e n t e r s W a t s t e p t , t > y , f o r some j < i , a ( j ) = 1 , h(j) f(<x ,k>) = 0 f o r some j < i , a ( j ) = 0 , and some k where t h e value Y,jra k excludes any p o s s i b i l i t y t o s a t i s f y e i t h e r t h e c o n d i t i o n ( i i ) f o r t h e s t e p t , t > y (i.e. x
interval {j+l,
...,i-1}
( i n an obvious sense) o r t h e c o n d i t i o n ( i ) ,
-
or (eventually) or some other u c
is enumerated into A (on the basis of either U = i).
with Ih(u')
It follows from the above that A is recursive in f. Note. A -
-U
similar argument can be used to show that the r.e. set A
(con-
strutted in the above proof) is recursive in the r.e. set t : Y
wh(i)}.
E
We now show how to apply Theorem 7 to get a high incomplete r.e. degree. We need the following modification of Theorem 2. Lemma 8 . For every nonempty
class 8
1
c
and every a there is x
Go
0
such
that for all y i)
{xo1(y)+
-+
ii) {xo}(y)+
txo}(y) = 0, y
cf
-
Waf
E
iii) if bo, bl,
.....
complement of W
)
is an increasing list of all elements of W
then for every set
C
there is a function g
Iw
E
8
(the such that
g(<x ,b.>) = C(i) for all i < I (where I is the cardinality of 0 1 Such x can be found uniformly from an index of 8 and a.
w
1.
0
-
Proof. By the Recursion Theorem we have x 1) whenever there are a) if j < lh(T)
and
T
s
0
such that
such that
T ( j ) = 1, a,s b) we can verify in s steps (by some fixed standard way) that
8 n {g
:
&
j
E
2) if y
T~
0
1
)
then
0,
(observe that the
{xo} (y)+ for all y and
is the first such T which we find (at the least such
Wa,at
E
then
g(<xofy>) = ~ ( y ) ,all y < lh(r)} =
last condition is C
where
W
then {x0}(y)+
and {x0I(y)
=
0
s),
(if $\{x_0\}(y)$ was not previously defined in some other way by the first clause). Observe, as in the proof of Theorem 2, that we never use the first clause since otherwise $\mathcal{B}$ would be empty, which contradicts our assumption. Thus we only use the second clause and we have
$\{x_0\}(y)\downarrow \;\rightarrow\; \{x_0\}(y) = 0$,
$\{x_0\}(y)\downarrow$ iff $y \in W_a$.
The remaining part of our assertion (concerning "flexibility" of $\{x_0\}(y)$ in $\mathcal{B}$ for all $y \in \overline{W}_a$) is proved in exactly the same way as in the proof of Theorem 2. □
Theorem 9. There are a 0-1 valued GNR function f and a $\emptyset'$-partial recursive function Pt such that $f \le_T \emptyset''$, $f \not\ge_T \emptyset'$ (and also $f' \equiv_T \emptyset''$), both satisfying the assumptions of Theorem 7. Thus, there is a high incomplete r.e. set A recursive in f.
The Use of Diagonally Nonrecursive Functions
Proof. We first construct $\emptyset'$-partial recursive functions $Pt(\sigma)$, $Class(\sigma)$ (by induction on the length of $\sigma$). Let $Class(\emptyset)$ be an index of the class $G_0$.
Step $i$. Let $\sigma$ be given, $lh(\sigma) = i$. If $Class(\sigma)\uparrow$ then also $Pt(\sigma)\uparrow$ and $Class(\sigma * j)\uparrow$ for $j = 0, 1$.
Suppose $Class(\sigma)\downarrow$ and $w = Class(\sigma)$. Denote by $\mathcal{A}$ a $\Pi^0_1$ class with index $w$. From $i$ and $w$ we can effectively find (as in the proof of Theorem 3) $z_0$ such that
$z_0 \in \emptyset'$ iff (for all $g \in \mathcal{A}$)($\{i\}^g(z_0)\downarrow = 0$).
Let $\mathcal{B}$ be a $\Pi^0_1$ class such that
$\mathcal{B} = \mathcal{A}$ if $z_0 \in \emptyset'$,
$\mathcal{B} = \mathcal{A} \cap \{g : \{i\}^g(z_0)\uparrow \;\vee\; \{i\}^g(z_0)\downarrow = 1\}$ if $z_0 \notin \emptyset'$,
and let $b$ be an index of $\mathcal{B}$. Apply Lemma 8 to $\mathcal{B}$ (given by its index $b$) and $h(i)$ ($h$ is a recursive function as explained before Theorem 7) to obtain $x_0$ such that
$\{x_0\}(y)\downarrow \;\rightarrow\; \{x_0\}(y) = 0$,
$\{x_0\}(y)\downarrow \;\leftrightarrow\; y \in W_{h(i)}$.
Let $Pt(\sigma) = x_0$, $Class(\sigma * 0) = b$ (it corresponds to a guess $i \notin \emptyset''$, which implies $\{x_0\}(y)\downarrow = 0$ for all $y$). To compute $Class(\sigma * 1)$ we first use an oracle $\emptyset''$ to look for a finite set $D$ such that $W_{h(i)} = D$ ($D$ exists iff $i \in \emptyset''$). Thus, if $i \notin \emptyset''$ then $Class(\sigma * 1)\uparrow$. If we have found such $D$ let $Class(\sigma * 1)$ be an index of a $\Pi^0_1$ class $\mathcal{B} \cap \{g : g(\langle x_0, y\rangle) = 0 \text{ for all } y \notin D\}$.
It is easy to verify that for every $\sigma$
i) if $\sigma \subseteq \emptyset''$ then $Pt(\sigma)\downarrow$ and $Class(\sigma)\downarrow$, $Class(\sigma * 0)\downarrow$,
ii) whenever $Pt(\sigma)\downarrow$ and $x = Pt(\sigma)$ then for all $y$
$\{x\}(y)\downarrow \;\rightarrow\; \{x\}(y) = 0$,
$\{x\}(y)\downarrow \;\leftrightarrow\; y \in W_{h(i)}$ ($i = lh(\sigma)$),
iii) whenever $Class(\sigma)\downarrow$ then $Class(\sigma)$ is an index of a nonempty $\Pi^0_1$ class ($\subseteq G_0$) and for all $j$ such that $j < lh(\sigma)$ & $\sigma(j) = 1$ we have $g(\langle x, y\rangle) = 0$ for all $g$ from a $\Pi^0_1$ class with index $Class(\sigma \upharpoonright (j+1))$ and all sufficiently large $y$ (in fact, for $y \notin W_{h(j)}$), where $x = Pt(\sigma \upharpoonright j)$.
If we denote by $\mathcal{A}_\sigma$ a $\Pi^0_1$ class with index $Class(\sigma)$ (whenever it is defined) then it follows from the above that for a sequence $\{\sigma_j\}_{j \in N}$ such that $lh(\sigma_j) = j$ and $\sigma_j = \emptyset'' \upharpoonright j$, we have $Pt(\sigma_j)\downarrow$, $Class(\sigma_j)\downarrow$, $\mathcal{A}_{\sigma_j} \ne \emptyset$ for all $j$, $\mathcal{A}_{\sigma_0} \supseteq \mathcal{A}_{\sigma_1} \supseteq \mathcal{A}_{\sigma_2} \supseteq \ldots$ and $\bigcap_j \mathcal{A}_{\sigma_j}$ contains exactly one element, say f.
Clearly $f \le_T \emptyset''$ and, further, $f \not\ge_T \emptyset'$. Note that we also have $f' \equiv_T \emptyset''$ (as explained in the proof of Theorem 3). Both Pt and f obviously satisfy the assumptions of Theorem 7. Applying it we obtain a high r.e. set A recursive in f. Since $f \not\ge_T \emptyset'$, we have that A is incomplete. □
Remark 6. 1) In a natural way, $Class(\sigma)$ (or $Pt(\sigma)$) gives us a $\emptyset'$-partial recursive tree and f just corresponds to the path through this tree which is given by $\emptyset''$, or, in other words, f is constructed just along this path. Without going into details, it is close to the "true path" in the tree method (as explained in [25]) but, on the other hand, here we have a $\emptyset'$-partial recursive tree and a path which is true for an oracle construction ($\emptyset'$-oracle) rather than for an r.e. construction.
2) Let us explicitly point out an important feature of our $\emptyset''$-oracle construction of f. Whenever $i \in \emptyset''$, i.e. $\exists x_0\, \forall x \ge x_0\, \forall y\, Q(i,x,y)$ holds for an appropriate recursive Q, we have a uniformly given sequence of true $\Pi^0_1$ sentences (namely, $\forall y\, Q(i,x,y)$ for $x \ge x_0$) and we have to fix for ever a "true evaluation" for them (or for their codes), while when $i \notin \emptyset''$, i.e. $\forall x\, \exists y\, \neg Q(i,x,y)$ holds, we need not take care about it during the $\emptyset''$-oracle construction since we can take care about it during an effective construction (roughly speaking, a uniformly given sequence of true $\Sigma^0_1$ sentences does not cause us troubles, cf. Lemma 8). Of course, we cannot, in general, effectively find indices of the corresponding uniform sequences which code (in f) the "true evaluation".
Remark 7. As discussed with M. Lerman, there is a more general principle concerning the above situation which could give an interesting view on priority arguments in general (but this is not elaborated here).
Theorem 7 and Theorem 9 can be straightforwardly modified to yield also the following result known as the Sacks Jump Theorem ([20]).
Theorem 10. Let S be r.e. in f i t , S >T such that W is equal to N if j s(j)
c
ST
$I,
let q be a recursive function
,! S and is finite if j
E
S, fi T
@I
or S +
0') or give a uniform construction
by combining the method with an explicit care about nonrecursiveness of A (this is easy and we omit details). The latter gives us the desired uniformity concerning an index of A. □
We illustrated the use of 0-1 valued GNR functions in simple cases of $\emptyset''$-priority results. The described approach can be generalized for other $\emptyset''$-priority results but we will not present it here.
4. NAP DEGREES
FPF functions have no fixed points in the standard enumeration of all r.e. sets while DNR (or GNR) functions are connected with a diagonalization against the standard enumeration of all partial recursive functions. We can also consider a kind of a diagonalization of another type of $\Sigma^0_1$ objects, a diagonalization of $\Sigma^0_1$ approximations in measure (roughly speaking, a diagonalization of classes of effectively small measure). The use of effective measure (or measure of various levels of effectiveness) turned out to be very important from several points of view (recursion theory, constructive mathematical analysis, randomness, etc.). We will consider here only a very special case. Recall the Definition from [13].
Definition. 1) A class $\mathcal{A} \subseteq \{0,1\}^N$ is of $\Sigma^0_1$ measure zero if there is a recursive sequence of (indices of) $\Sigma^0_1$ classes $\mathcal{B}_0, \mathcal{B}_1, \mathcal{B}_2, \ldots \subseteq \{0,1\}^N$ such that $\forall n\,(\mu(\mathcal{B}_n) < 2^{-n})$ and $\mathcal{A} \subseteq \bigcap_n \mathcal{B}_n$ (where $\mu$ denotes Lebesgue measure).
2) A set B ($\subseteq N$) is called a NAP set (nonapproximable in measure) if the class $\{B\}$ is not of $\Sigma^0_1$ measure zero.
3) A degree containing a NAP set is called a NAP degree.
Concepts equivalent, or closely related, were studied (for various purposes) by A. Church, P. Martin-Löf, O. Demuth, G. Chaitin, R. Solovay, S. Kurtz and others ([4], [18], [5], [6], [3], [16], [17], [13] is a selection of basic references). For example, NAP sets turned out to be equivalent to so called 1-random sets (or Solovay random sets) (cf. [16], [17], [18], etc.).
P. Martin-Löf [18] first proved that there is a universal class of $\Sigma^0_1$ measure zero. More precisely, there is a recursive sequence of (indices of) $\Sigma^0_1$ classes $U_0, U_1, U_2, \ldots \subseteq \{0,1\}^N$ such that $U_0 \supseteq U_1 \supseteq U_2 \supseteq \ldots$, $\forall n\,(\mu(U_n) < 2^{-n})$ and for any class $\mathcal{A}$ of $\Sigma^0_1$ measure zero $\mathcal{A} \subseteq \bigcap_n U_n$. It follows that the class of all NAP sets is a $\Sigma^0_2$ class, namely, $\bigcup_n P_n$, where $P_n$ denotes the class which is the complement (in $\{0,1\}^N$) of the $\Sigma^0_1$ class $U_n$. Moreover, NAP degrees coincide with degrees of members of $P_n$ for any n and, more generally, for every $\Pi^0_1$ class $\mathcal{A} \subseteq \{0,1\}^N$ of positive measure, degrees of members of $\mathcal{A}$ contain all NAP degrees ([13]). The class of NAP degrees has measure 1 ([13]). Any NAP degree is also an FPF degree ([13]); in fact, any NAP set is bi-effectively immune and, thus, naturally gives rise to a recursively bounded DNR function. Although FPF degrees are closed upwards this is not true for NAP degrees. NAP degrees contain the upper cone
$\{\mathbf{a} : \mathbf{a} \ge \mathbf{0}'\}$ but they are not closed upwards ([13]). In fact, there is a PA degree $\mathbf{a} \le \mathbf{0}''$ such that NAP degrees and PA degrees are disjoint below $\mathbf{a}$ ([13]). Thus no PA degree $\le \mathbf{a}$ is a NAP degree (but every PA degree has many NAP degrees below). Such a PA degree $\mathbf{a}$ can be even pushed below $\mathbf{0}'$ (unpublished). As a corollary we have that the class of FPF degrees does not coincide with the class of PA degrees even below $\mathbf{0}'$.
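As a short worked step (not spelled out in the text, and using only the universal sequence $U_n$ and its complements $P_n$ described above), one can see why the class of all NAP sets has Lebesgue measure one, which is the set-level fact behind the measure statement for NAP degrees:

```latex
% P_n is the complement of U_n in {0,1}^N and \mu(U_n) < 2^{-n}, hence
\mu(P_n) \;=\; 1 - \mu(U_n) \;>\; 1 - 2^{-n} \qquad\text{for every } n,
% and the class of all NAP sets is the union of the P_n, so
\mu\Bigl(\bigcup_{n} P_n\Bigr) \;\ge\; \sup_{n}\,\mu(P_n) \;\ge\; \sup_{n}\,\bigl(1 - 2^{-n}\bigr) \;=\; 1 .
```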
If we denote by $(A)_i$ the set $\{x : \langle i, x\rangle \in A\}$ then for any NAP set A all its components $(A)_i$ are strongly independent ([13]) (i.e. no $(A)_i$ is recursive in $\{\langle j,k\rangle : \langle j,k\rangle \in A \;\&\; j \ne i\}$). Generally, if B and C are Turing comparable then B join C has $\Sigma^0_1$ measure zero. Moreover, all $(A)_i$ for any NAP set A are again NAP sets. Thus, no NAP degree is minimal and any NAP degree has infinitely many NAP degrees below in a uniform way ([13]). It shows that NAP sets and NAP degrees play an important role in the study of the structure of degrees (as well as in constructive mathematical
analysis, as already mentioned). 0.Demuth C71 showed that if A is a NAP set and 0’ 0
K
Ax
and F such that for all U,n,x is nonempty (where A
denotes
a n o class 5 {O,ll 1
i) the measure of 8 is
2
K(U,n,x), and
ii) $F(\sigma,n,x) \ge$ the least y at which the left-most and right-most member of $\mathcal{B}$ differ from each other.
Theorem 12. There is a high incomplete NAP degree.
Proof. We start with the class $P_0$ (containing only NAP sets, as defined above). We use an analogous method as in the proof of Theorem 5 (or Theorem 3) but with a different type of coding. We use the same technique for f avoiding $\mathbf{0}'$, i.e. for ensuring $\{i\}^f \ne \emptyset'$ (and we omit details). The difference concerns the coding technique. For any nonempty $\Pi^0_1$ class $\mathcal{B} \subseteq P_0$ with index b we define
a) a recursive function H such that $H(0) = 0$, $H(n+1) = \max\{F(\sigma,0,b) : lh(\sigma) = H(n)\}$ (F is a recursive function from Lemma 11), and
b) for any $g \in \mathcal{B}$: $g_{0,b} = \emptyset$, $g_{n+1,b} = g \upharpoonright (1+H(n+1))$.
The idea (how to code information in the class $\mathcal{B}$) is
i) we first guess i & p l f f (for given i) and we take a nonempty
C
5 8
TI
0
class
which consists only of those g's for which for all n
g n+l,b is not the beginning of the left-most infinite member of 8 extending g n,b (roughly speaking, for all n we throw away from 8 a neighbourhood, whose
measure is small, of the left-most member of
c=
B
extending gn,b), i.e.
vn3T(T 2 g & lh(T) = lh(gn+l,b) & T Hermann Weyl, 1944.
Part I: Philosophy
1. Logic in Mathematics and in Computer Science
1.1 Why Weyl's philosophy of Mathematics?
2. Objectivity and independence of formalism
3. Predicative and non-predicative definitions
3.1 More circularities
4. The rock and the sand
4.1 Impredicative Type Theory and its semantics
5. Symbolic constructions and the reasonableness of history
Part II: Models
1. Impredicative Type Assignment
2. Polymorphic Types as provable retractions
3. Polymorphic Types as quotients
(*) 1987/88: Computer Science Department, Carnegie Mellon University.
Lecture delivered at the Logic Colloquium 87, European Meeting of the ASL, and written while teaching in the Computer Science Dept. of Carnegie Mellon University, during the academic year 1987/88. The generous hospitality and the exceptional facilities of C.M.U. were of a major help for this work.
G. Longo
Part I: Philosophy
1. Logic in Mathematics and in Computer Science.
There is a distinction which we feel a need to stress when talking (or writing) for an audience of Mathematicians working in Logic. It concerns the different perspectives in which Logic is viewed in Computer Science and in Mathematics. In the aims of the founders and in most of the current research areas of Logic within Mathematics, Mathematical Logic was and is meant to provide a "foundation" and a "justification" for all or parts of Mathematics as an established discipline. Since Frege and, even more, since Hilbert, Proof Theory has tried to base mathematical reasoning on clear grounds, Model Theory displayed the ambiguities of denotation and meaning and the two disciplines together enriched our understanding of Mathematics as well as justified many of its constructions. Sometimes (not often though) results of independent mathematical interest have been obtained, as in the application of Model Theory to Algebra; moreover, some areas, such as Model Theory and Recursion Theory, have become independent branches of Mathematics whose growth goes beyond their original foundational perspective. However, these have never been the main aims of Logic in Mathematics. The actual scientific relevance of Logic, as a mathematical discipline, has been its success in founding deductive reasoning, in understanding, say, the fewest rational tools required to obtain results in a specific area, in clarifying notions such as consistency, categoricity or relative conservativity for mathematical theories. This is not so in Computer Science, where Mathematical Logic is mostly used as a tool, not as a foundation. Or, at most, it has had a mixed role: foundational and "practical". Let us try to explain this. There is no doubt that some existing aspects of Computer Science have been set on clearer grounds by approaches indebted to Logic. The paradigmatic example is the birth of denotational semantics of programming languages.
The Scott-Strachey approach has first of all given a foundation to programming constructs already in use at the time. However, the subsequent success of the topic, broadly construed, is mostly due to the use that computer scientists have made of the denotational approach in the design of new languages and software. There are plenty of examples, from Edinburgh ML to work in compiler design to the current research in polymorphism in functional languages. Various forms of modularity, for example, are nowadays suggested by work in Type Theories and their mathematical meaning. In these cases, results in Logic, in particular in lambda-calculus and its semantics, were not used as a foundation, in the usual sense of Logic, but as guidelines for new ideas and applications. The same happened with Logic Programming, where rather old results in Logic (Herbrand's theorem essentially) were brought to the limelight as core programming styles. Thus Mathematical Logic in Computer Science is mostly viewed as one of the possible mathematical tools,
Some Aspects of Impredicativity
perhaps the main one, for applied work. Its foundational role, which also must be considered, is restricted to conceptual clarification or "local foundation", in the sense suggested by Feferman for some aspects of Logic in Mathematics, instead of the global foundation pursued by the founding fathers of Logic. Of course, the two aspects, "tool" and "local foundation", can't always be distinguished, as a relevant use of a logical framework often provides some sort of foundation for the intended application. It is clear that this difference in perspective deeply affects the philosophical attitude of researchers in Logic according to whether they consider themselves as pure mathematicians, possibly working at foundational problems, or applied mathematicians interested in Computer Science. The latter perspective is ours. In the sequel we will be discussing "explanations" of certain impredicative theories, while we will not try to "justify" them. This is in accordance with the attitude just mentioned. By explanation we essentially mean "explanation by translation", in the sense that new or obscure mathematical constructions are better understood as they are translated into structures which are "already known" or are defined by essentially different techniques. This will neither lay foundations for nor justify those constructs, where by justification we mean the reduction to "safer" grounds or an ultimate foundation based on «unshakable certainties», in Brouwer's words [1923, p.492]. The same aim as Brouwer's was shared by the founders of proof theory. However, we believe that there is no sharp boundary between explanation and foundation, in a modern sense. The coherence among different explanations, say, or the texture of relations among different insights and intuitions does provide a possible, but never ultimate, foundation.
1.1 Why Weyl's philosophy of Mathematics?
As applied mathematicians, we could avoid the issue of foundations and just discuss, as we claimed, explanations which provide understanding of specific problems or suggest tools for specific answers to questions raised by the practice of computing. However, in this context, we would like to justify not the mathematics we will be dealing with, as, we believe, there is no ultimate justification, but the methodological attitude which is leading our work. Our attempt will be developed in this part of the paper, mostly following Hermann Weyl's philosophical perspective in Mathematics. At the same time, with reference to the aim of this talk, we will review Poincaré's and Weyl's understanding of the informal notion of "impredicative definition". The technical part (Part II) is indeed dedicated to the semantics of impredicative Type Theory (and may be read independently of Part I).
The reader may wonder why we should refer to Weyl in the philosophical part of a lecture on impredicative systems, since Weyl's main technical contribution to Logic is the proposal for a predicative foundation of Analysis (see 8.1.4). The point is that this proposal, as we will argue, is just one aspect of Weyl's foundational perspective. His very broad scientific experience led him to explore and appreciate, over the years, several approaches to the foundation of Mathematics, sometimes wavering between different viewpoints. The actual unity of Weyl's thought may be found in his overall philosophy of Mathematics and of scientific knowledge, a matter he treated in several writings from 1910 to 1952, the time of his retirement from the Institute for Advanced Studies, in Princeton. In our view, Weyl's perspective, by embedding Mathematics into the real world of Physics and into the "human endeavors of our historical existence", suggests, among other things, the open minded attitude and the attention to applications, which are so relevant in an applied discipline such as Logic in Computer Science.
2. Objectivity and independence of formalism
The idea of an "ultimate foundation" is, of course, a key aspect of Mathematical Logic since its early days. For Frege or, even more, Hilbert this meant the description of techniques of thinking as a safe calculus of signs, with no semantic ambiguities. With reference to Geometry, the paradigm of axiomatizable Mathematics for centuries, >(Weyl [1927, p.483]). Hilbert's , Weyl [quoted in van Heijenoort, p.481]). The roots of this change of perspective in a disciple of Hilbert may be found in the main foundational writing of Weyl's, i.e. in Das Kontinuum, 1918. As a working mathematician, Weyl cares about the actual expressive power of mathematical tools. He is very unsatisfied though with the (Poincaré [1913, p.47]). Weyl takes up Poincaré's viewpoint and gives a more precise notion of predicativity. First he points out that impredicative definitions do not need to be paradoxical, but rather they are implicitly circular and hence improper (Weyl [1918], Feferman [1986]). Then he stresses that impredicativity is a second order notion as it typically applies in the definition of sets which are impredicatively given when «quantified variables may range on a set which includes the definiendum», Weyl [1918, I.6]. That is, a set b is defined in an impredicative way if given by
$b = \{\, x \mid \forall y \in A.\ P(x,y) \,\}$    (1)
where b may be an element of A. The discussion of impredicative definitions in the second order case is motivated by Poincaré's and Weyl's interest in the foundation of Analysis and, hence, in second order Arithmetic (see also Kreisel [1960], Feferman [1968 through 1987]). Thus the need to talk of sets of numbers, provided that this is done in the safe stepwise manner of a predicative, definitionist approach: «... objects which cannot be defined in a finite number of words ... are mere nothingness» (Poincaré [1913, p.60]). On these grounds Weyl sets the basis for the modern work in predicative analysis, which has been widely developed by Feferman, Kreisel and other authors in Proof Theory. The crucial impredicative notion in Analysis is that of least upper bound (or greatest lower bound). Both are given by intersection or union (i.e. by universal or existential quantification) with the characteristic in (1), since the real number being defined, as a Dedekind cut, may be an element of the set over which one takes the intersection or union that defines it. That is, for the greatest lower bound, $\mathrm{g.l.b.}(A) = \bigcap \{\, r \mid r \in A \,\}$, where g.l.b.(A) may be in A. In Das Kontinuum, Weyl proposes to consider the totality of the natural numbers and induction on them as sufficiently known and safe concepts; then he uses explicit and predicative definitions of subsets and functions, within the frame of Classical Logic, as well as definable sequences of reals, instead of sets, in order to avoid impredicativity. Weyl's hinted project has been widely developed in Feferman [1987].
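To make the shape of definition (1) concrete, here is a small finite sketch in Python. Everything in it is an illustrative assumption rather than part of Weyl's or Poincaré's texts: the universe `UNIVERSE`, the closure condition `is_closed` and the helper names are hypothetical. The family A consists of all subsets of a finite universe that contain 0 and are closed under a truncated "add two" step; the set b is obtained by quantifying over every member of A, and b turns out to be a member of A itself, which is the circularity discussed above.

```python
from itertools import combinations

UNIVERSE = range(6)  # hypothetical finite stand-in for an infinite totality


def is_closed(candidate):
    """True if the set contains 0 and is closed under n -> n + 2 (inside the universe)."""
    return 0 in candidate and all(
        n + 2 in candidate for n in candidate if n + 2 in UNIVERSE
    )


def all_subsets(universe):
    """Enumerate every subset of the finite universe as a frozenset."""
    items = list(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


# A: the family of all closed sets. The quantifier in (1) ranges over A.
family_a = [s for s in all_subsets(UNIVERSE) if is_closed(s)]

# b = { x | for all y in A, x is in y }: the intersection of the whole family.
b = frozenset(x for x in UNIVERSE if all(x in y for y in family_a))

print(sorted(b))       # the least closed set
print(b in family_a)   # b is itself one of the sets quantified over
```

In this finite setting the circularity is harmless, because the family can be listed exhaustively before b is formed; the predicativist worry concerns totalities for which no such prior listing is available.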
3.1 More circularities
At this stage, are we really free of the dangerous vortex of circularities? Observe that even the collection $\omega$ of natural numbers, if defined by comprehension, is given impredicatively, following Frege and Dedekind:
$\forall Y\,(\forall y\,(y \in Y \Rightarrow y+1 \in Y) \Rightarrow (0 \in Y \Rightarrow x \in Y))$
Thus also inductive definitions turn out to be impredicatively given, classically. A set defined inductively by a formula A, say, is the intersection of all the sets which satisfy A. As a matter of fact, Kreisel [1960, p.388] suggests that there is no convincing purely classical argument x . The equational theory "=" is defined, as usual, by taking the reflexive, symmetric and transitive closure of "→". We claim that it is sound to say that the definition of $\delta$ is impredicative here. Indeed, the $\delta$ axiom, when presented equationally, is equivalent to $M = N \Rightarrow \delta M N = M$, for arbitrary terms M, N. Thus $\delta$, by definition, internalizes "=", which $\delta$ itself contributes to define. Or, the definition of $\delta$ refers to "=", which we are in the process of defining. This violates Poincaré's restriction above. An inspection of Klop's proof in Barendregt [1984] may give a feeling of where impredicativity comes in: an infinite Böhm-tree is reduced to different trees, with no common reduct, by $\delta$, once the entire tree is known to $\delta$. (Note that, in contrast to Church's delta, one does not ask for M and N to be in normal form. As a matter of fact, C.L.$\delta$ is provably not Church-Rosser, by a result in Klop [1980]. It is consistent, though, by a trivial model, where impredicativity is lost: just interpret $\delta$ by K. One may wonder if there is any general theorem to be proved here about impredicatively given reduction systems and the Church-Rosser property.) A further understanding of the impredicative nature of C.L.$\delta$ will be given in §II.2, when using what may be roughly considered an extension of it as an interpretation of impredicative second order Type Theory.
4 The rock and the sand. Every finite subset of A has a model. [Otherwise, Z would classically imply a formula -Mai ; where the ai are (negated) L,-atoms occurring in A. But then, this formula would belong to Z(L, and be true in D, whereas by definition, Mai is true in D.] By compactness, then, A itself has a model, say M. Now, consider the submodel M* of M whose domain is generated by all closed term interpretations from L,+L,. By the universal form of Z,M* kZ. Moreover, M* is a minimal model for Z, given its term construction. Finally, D is isomorphic to M* II L,since M* satisfies a faithful description of its atomic
J. Van Benthem
structure. But then, the required extension of D may be constructed by isomorphic copying from M*. CL
Remark: This type of analysis can be pushed further. For instance, for arbitrary theories $T_1 \subseteq T$, the following statements can be proved equivalent:
• $(\mu MOD(T)) \upharpoonright L_1 = \mu MOD(T_1)$
• T is $\mu$-conservative over $T_1$, i.e., for all quantifier-free $L_1$-sentences $\varphi$, $T \models \varphi$ only if $T_1 \models_{ind} \varphi$.
Finally, we note one further similarity between the theory of abstract data types and the philosophy of science. Philosophers have emphasized the existence of a huge network of scientific theories, connected by various relations, such as being an 'extension' of another theory, or being 'interpretable' in it. Again, such relations have also been proposed and studied for abstract data types. In fact, a very interesting calculus of operations on and relations between data types has been developed in Bergstra, Heering and Klint 1986, building upon the seminal work of Burstall and Goguen on the language CLEAR. (Again, a categorial setting, as in Meseguer 1988, may provide the best level of generality here.) We conclude with one example, which is relevant to the useful property of modularity. The sum $\Sigma_1 + \Sigma_2$ of two abstract data types is just the union of their axioms, in their combined language. Again the question arises as to the connection between the minimal models for the various components and for the whole. As before, one direction is easy to establish (again, for the purposes of illustration, we consider the case of individual minimality):
• if D is a minimal model for $\Sigma$, then $D \upharpoonright L_i$ is a minimal model for $\Sigma_i$ (i = 1,2).
For a converse, we would need an amalgamation property:
• if $D_1$ is a minimal model for $\Sigma_1$, and $D_2$ one for $\Sigma_2$, then there exists some minimal model D for $\Sigma$ such that $D \upharpoonright L_i = D_i$ (i = 1,2).
This will hold only if $\Sigma$ satisfies a strong splitting condition: if $\Sigma \models_{ind} \varphi_1 \vee \varphi_2$ for quantifier-free $L_i$-sentences $\varphi_i$ (i = 1,2), then $\Sigma_1 \models_{ind} \varphi_1$ or $\Sigma_2 \models_{ind} \varphi_2$. In such a case, $\Sigma_1$ and $\Sigma_2$ do not really 'interact' in $\Sigma$. As soon as they do, however, the above may be violated.
Example: $\Sigma_1 = \{Pa \vee Qa\}$, $\Sigma_2 = \{\neg Qa \vee Ra\}$. $\Sigma = \Sigma_1 \cup \Sigma_2 \models Pa \vee Ra$, but neither $\Sigma_1 \models_{ind} Pa$ nor $\Sigma_2 \models_{ind} Ra$.
The interest in the fine-structure of axioms $\Sigma$, both as to different kinds of vocabulary involved and as to various syntactic modules, represents a notable tendency in current research.
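The example can be partly checked mechanically. The sketch below is a hedged illustration in Python: it treats Pa, Qa and Ra as three propositional atoms and tests classical entailment by enumerating all valuations. This is a simplifying assumption, since the text works with minimal models and the relation written $\models_{ind}$, which is not modelled here; classical non-entailment by itself does not settle the $\models_{ind}$ claims. All helper names are hypothetical.

```python
from itertools import product

ATOMS = ("Pa", "Qa", "Ra")

# Each axiom set is a list of predicates over a valuation (dict: atom -> bool).
SIGMA_1 = [lambda v: v["Pa"] or v["Qa"]]          # Pa v Qa
SIGMA_2 = [lambda v: (not v["Qa"]) or v["Ra"]]    # not-Qa v Ra
SIGMA = SIGMA_1 + SIGMA_2                          # the sum: union of the axioms


def valuations():
    """Yield every truth assignment to the three atoms."""
    for bits in product([False, True], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, bits))


def entails(axioms, conclusion):
    """Classical entailment: the conclusion holds in every valuation satisfying all axioms."""
    return all(
        conclusion(v) for v in valuations() if all(ax(v) for ax in axioms)
    )


whole = entails(SIGMA, lambda v: v["Pa"] or v["Ra"])   # Sigma entails Pa v Ra
left = entails(SIGMA_1, lambda v: v["Pa"])              # Sigma_1 alone entails Pa?
right = entails(SIGMA_2, lambda v: v["Ra"])             # Sigma_2 alone entails Ra?
print(whole, left, right)
```

The disjunction follows from the union only because the two axiom sets share the atom Qa, which is the kind of interaction the splitting condition rules out.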
Against the background of the earlier general semantic results, actual performance
Semantic Parallels in Natural Language and Computation
of logic programs or data specifications will still be crucially affected by their syntactic design. It is important to bring to light useful structures here. (See also Apt and Pugin 1987 on 'stratified' logic programs.) This is not just a technical concern within computer science. Also in general logic, there is a great need for a better theory of the structure of premises, if we are to arrive at a deeper understanding of at present intangible phenomena like good organization and structuring of arguments.
3. DYNAMICS The study of minimal models and non-monotonicity almost forces one to acknowledge the more dynamic aspects of knowledge acquisition and revision. But even so, it is only one strand among various motives pointing in the latter direction. In this Section, we shall also consider dynamics at the level of building up interpretations of single statements, and related forms of information flow.
3.1. Dynamics of Interpretation
'Processing mechanisms' play a pervasive role in programming languages, obviously, but also in natural language. For instance, it seems natural to understand various possibilities and impossibilities in anaphoric linkage from an algorithmic perspective. And, such considerations are also appropriate in the analysis of conditionals (cf. Stalnaker 1972) or bare plurals when viewed as expressing default rules (cf. Pelletier and Schubert 1985). Here, we shall concentrate on the example of anaphora (see Section 3.1.2). There are various new semantic theories invented especially for their 'dynamic flavour'; but, in fact, ordinary predicate logic itself provides an excellent initial model for studying such phenomena. To see this, we take our starting point in a well-known approach to the semantics of programs, due to Floyd and Hoare.
3.1.1 Operational Semantics for Programs
The basic format of interpretation for predicate logic is Tarski's relational schema ('$\varphi$ is true in M under I, a'):
$M, I \models \varphi\,[a]$
where M is a model structure, I an interpretation of vocabulary into suitable items of M, $\varphi$ some formula, and a an assignment to the variables occurring freely in $\varphi$. Here, the assignment is a modest 'auxiliary interpretation function', needed in order to get a recursion going on the structure of $\varphi$. Later on, however, this Cinderella met her prince. In Computer Science, assignments may be viewed as memory states of a computer (functions from identifiers/addresses to data values) - and as such, they play the leading role in computationally oriented introductions to Logic (such as Gries 1981). One fundamental generalization of the Tarski schema to the semantics of programming languages has taken place in so-called operational semantics. In addition to the set of
descriptive formulas, one also defines a set of programs inductively, and then interprets these according to the following schema (in the context of some model M, I): $[[\pi]]$ is the set of successful state transitions associated with the program $\pi$. Here are some typical steps encountered for simple program constructions:
(1) $[[x := t]] = \{(a, a^{x}_{a(t)}) \mid a \text{ any assignment}\}$
(2) $[[\pi_1 ; \pi_2]] = \{(a,b) \mid \text{for some } c,\ (a,c) \in [[\pi_1]] \text{ and } (c,b) \in [[\pi_2]]\}$
(3) $[[\text{IF } \varepsilon \text{ THEN } \pi_1 \text{ ELSE } \pi_2]] = \{(a,b) \mid M,I \models \varepsilon[a] \text{ and } (a,b) \in [[\pi_1]], \text{ or } M,I \not\models \varepsilon[a] \text{ and } (a,b) \in [[\pi_2]]\}$
Predicate-logical formulas function as static assertions in this context. They may be tests as in (3), or statements of specification for programs, or statements about program behaviour, such as the well-known correctness assertions
$\{\varphi\}\ \pi\ \{\psi\}$: for all assignments a such that $M,I \models \varphi[a]$, and all assignments b such that $(a,b) \in [[\pi]]$, it holds that $M,I \models \psi[b]$
('pre-condition $\varphi$ for $\pi$ implies post-condition $\psi$').
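Clauses (1)-(3) and the correctness assertion can be sketched executably. The following Python fragment is only an illustration under stated assumptions: states are assignments to two hypothetical identifiers `x` and `y` with values in a small finite range, terms and tests are given directly as Python functions on states instead of being evaluated in a model M, I, and all function names are invented for this sketch.

```python
from itertools import product

VARS = ("x", "y")
VALUES = range(4)  # hypothetical finite data domain

# A state (assignment) is a tuple of values, one per identifier in VARS.
STATES = list(product(VALUES, repeat=len(VARS)))


def lookup(state, var):
    return state[VARS.index(var)]


def update(state, var, value):
    i = VARS.index(var)
    return state[:i] + (value,) + state[i + 1:]


def assign(var, term):
    """Clause (1): pairs (a, a') with a' equal to a updated at var by the value of term in a.
    Transitions whose value would leave the finite domain are simply omitted."""
    return {(a, update(a, var, term(a))) for a in STATES if term(a) in VALUES}


def seq(rel1, rel2):
    """Clause (2): relational composition of two transition sets."""
    return {(a, b) for (a, c) in rel1 for (c2, b) in rel2 if c == c2}


def if_then_else(test, rel1, rel2):
    """Clause (3): take the first branch where the test holds, the second elsewhere."""
    return {(a, b) for (a, b) in rel1 if test(a)} | {
        (a, b) for (a, b) in rel2 if not test(a)
    }


def holds(pre, rel, post):
    """Correctness assertion: every transition from a pre-state ends in a post-state."""
    return all(post(b) for (a, b) in rel if pre(a))


always = lambda a: True
equalise = if_then_else(
    lambda a: lookup(a, "x") < lookup(a, "y"),
    assign("x", lambda a: lookup(a, "y")),
    assign("y", lambda a: lookup(a, "x")),
)
two_steps = seq(assign("x", lambda a: 0), assign("y", lambda a: lookup(a, "x") + 1))

ok_equal = holds(always, equalise, lambda b: lookup(b, "x") == lookup(b, "y"))
ok_seq = holds(always, two_steps, lambda b: lookup(b, "y") == 1)
bad = holds(always, assign("x", lambda a: lookup(a, "y")), lambda b: lookup(b, "x") == 0)
print(ok_equal, ok_seq, bad)
```

Representing each program as a plain set of state pairs keeps composition and branching one-liners; it only works because the state space here is finite.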
Although the semantic format is still very much in the usual spirit, even this application yields its own questions. For instance, the logical meta-theory of this framework derives its interest to a large extent from the fact that its predicate-logical counterpart does not transfer smoothly. Notably, in searching for completeness theorems in this area, it turns out that even the correct formulation of what should be regarded as 'completeness' is a tricky issue (cf. Cook 1978, Bergstra and Tucker 1982). Programming languages, at least the traditional imperative ones, exert dynamic control over a series of actions to be performed by the audience, usually the computer. But, this feature is also very conspicuous in natural languages, where speakers direct their listener's representation of information by means of various textual devices. This process occurs explicitly in reasoning, when we set up an argumentative structure using imperatives: 'suppose', 'let', 'take', ... . But it is also present at the sentence level, witness the illustration given below. Thus, the slogan of Language as Action has become a central one in current semantics, and many proposals for 'dynamic formats' of interpretation have appeared (cf. Kamp 1981, Heim 1982, Seuren 1985). One phenomenon where dynamic aspects emerge is that of anaphoric connection. In actual discourse, pronouns are used to refer back to earlier expressions, in such a way that the listener can pick up links intended by the speaker. There are limits to this process, however: not anything goes - and contemporary dynamic theories are often motivated by the desire to explain these constraints by reference to some processing mechanism. Here are three examples of anaphoric facts whose explanation seems to go beyond predicate logic as ordinarily conceived:
'A speaker arrived. He was late.' (1)
How can the 'existential quantifier' pick up a pronoun in another sentence? Moreover, there is a clear-cut ordering involved, witness the impossibility of an intended link as in the following sentence:
*'He was late. A speaker arrived.' (2)
And finally, within a single sentence, consider the pattern:
'If a speaker arrives, he is late.' (3)
This is traditionally transcribed as follows: $\forall x((Sx \wedge Ax) \rightarrow Lx)$, interchanging the conditional with the existential quantifier. (It is as if the, invalid, prenex law $(\exists x\varphi(x) \rightarrow \psi(x)) \leftrightarrow \forall x(\varphi(x) \rightarrow \psi(x))$ had been applied.) Why cannot this sentence be interpreted correctly as it stands, its obvious form being $(\exists x(Sx \wedge Ax) \rightarrow Lx)$? These problems are solved in the cited work by Kamp and Heim through the intermediary of a level of 'discourse representation', in between syntactic form and eventual truth conditions. But, as was pointed out convincingly in Barwise 1987, they can be handled equally well without such a move, by adopting a more dynamic format of interpreting syntactic forms, inspired by the earlier operational semantics.
3.1.2 Operational Semantics for Assertions
A very clean presentation of this idea arises when the language interpreted is predicate logic itself, as is shown in Groenendijk & Stokhof 1987. Thus, the earlier 'static' assertion language for programs itself receives a dynamic interpretation:
$[[Px]] = \{(a,a) \mid M,I \models Px[a]\}$: i.e., atomic formulas are just tests without special side-effects.
$[[\varphi \wedge \psi]] = [[\varphi]] \circ [[\psi]]$: i.e., conjunction is treated like composition ($\circ$ or ;).
$[[\exists x\varphi]] = \{(a,b) \mid (c,b) \in [[\varphi]] \text{ for some } c \sim_x a\}$, where $c \sim_x a$ means that c differs from a at most in its value for x: i.e., existential statements are made true by finding some witness to their matrix.
These stipulations explain the first two anaphoric facts mentioned above: $\exists xAx \wedge Lx$ will have the same interpretation as $\exists x(Ax \wedge Lx)$, but not as $Lx \wedge \exists xAx$. In general, the effect of the above clauses is to widen scopes toward the right. An explanation of the third phenomenon observed requires a treatment of conditionals. We start with a preliminary notion, however:
$[[\neg\varphi]] = \{(a,a) \mid \text{for no } b,\ (a,b) \in [[\varphi]]\}$.
Thus, a negation is a strong denial, without side-effects. Then, a clause for implications $\varphi \rightarrow \psi$ may be derived via their traditional equivalence with $\neg(\varphi \wedge \neg\psi)$, or by direct introspection:
$[[\varphi \rightarrow \psi]] = \{(a,a) \mid \text{for all } b \text{ with } (a,b) \in [[\varphi]], \text{ there exists } c \text{ with } (b,c) \in [[\psi]]\}$.
Finally, we introduce universal quantification, again without side-effects:
$[[\forall x\varphi]] = \{(a,a) \mid \text{for all } b \text{ with } a \sim_x b, \text{ there exists } c \text{ with } (b,c) \in [[\varphi]]\}$.
Then, it may be checked that the following two formulas indeed have the same associated transition relation: $\exists xAx \rightarrow Lx$ and $\forall x(Ax \rightarrow Lx)$. Thus, the anaphoric sentence (3) is vindicated as it stands. On the other hand, no binding across the conditional occurs with the related syntactic form $\neg\exists xAx \rightarrow Lx$. For, the negation in the antecedent 'envelops' the scope of the existential quantifier. But this is as it should be. After all, there is no such link in natural language either, witness the incorrect sequence *'If no speaker arrives, he is late.' It should be admitted at once that this account still has many empirical weaknesses. For instance, it describes anaphoric facts in natural language only indirectly, being dependent on a translation into predicate logic. Moreover, those anaphoric facts are much more diverse and subtle than may be apparent from the four examples given here. But the account does show how standard predicate logic, far from being an obstacle to recognizing the role of dynamical interpretation, can actually be a good testing ground for theorizing about it. In fact, there is still a close connection between ordinary predicate logic and its interpreted variant. In a sense, the latter amounts to reading 'ordinary' formulas with a different scope convention, extending scopes of existential quantifiers as far as possible toward the right, until one hits the boundary of an operator like negation, which 'seals off' its subformula. (For a more general discussion of different scoping conventions, as expressing various strategies of interpretation for one single language, cf. van Benthem 1986b.) More formally, there is a reduction to ordinary predicate logic which works as follows. For each formula $\varphi$
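with free variables $x_1,\ldots,x_n$, the following induction defines a predicate $\mathrm{TRANS}(\varphi; x_1,\ldots,x_n; y_1,\ldots,y_n)$, abbreviated $\mathrm{TRANS}(\varphi; \bar{x}, \bar{y})$ [intuitively, '$(x_1,\ldots,x_n; y_1,\ldots,y_n)$ is a successful $\varphi$-transition between partial assignments']:
$\mathrm{TRANS}(Px_i; \bar{x}, \bar{y}) = (\bar{y} = \bar{x} \wedge Px_i)$
$\mathrm{TRANS}(\varphi \wedge \psi; \bar{x}, \bar{y}) = \exists\bar{z}\,(\mathrm{TRANS}(\varphi; \bar{x}, \bar{z}) \wedge \mathrm{TRANS}(\psi; \bar{z}, \bar{y}))$ (here the $\bar{z}$ are new variables)
$\mathrm{TRANS}(\exists x_j\varphi; \bar{x}, \bar{y}) = \exists z_j\, \mathrm{TRANS}(\varphi; \bar{x}(z_j/x_j), \bar{y})$
$\mathrm{TRANS}(\neg\varphi; \bar{x}, \bar{y}) = (\bar{y} = \bar{x} \wedge \forall\bar{z}\, \neg\mathrm{TRANS}(\varphi; \bar{x}, \bar{z}))$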
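As a small worked instance of this reduction, consider the discourse formula of example (1) with a single variable x. The derivation below is a sketch that assumes the clauses in the form TRANS(Px; x, y) = (y = x ∧ Px) for atoms, an existentially quantified intermediate state for conjunction, and a fresh input variable for the existential quantifier; bound variables are renamed where needed.

```latex
\begin{align*}
\mathrm{TRANS}(\exists x Ax;\, x, z)
  &= \exists z'\, \mathrm{TRANS}(Ax;\, z', z)
   = \exists z'\,(z = z' \wedge Az') \;\equiv\; Az \\
\mathrm{TRANS}(Lx;\, z, y)
  &= (y = z \wedge Lz) \\
\mathrm{TRANS}(\exists x Ax \wedge Lx;\, x, y)
  &= \exists z\,\bigl(\mathrm{TRANS}(\exists x Ax;\, x, z) \wedge \mathrm{TRANS}(Lx;\, z, y)\bigr) \\
  &\equiv \exists z\,(Az \wedge y = z \wedge Lz) \;\equiv\; Ay \wedge Ly
\end{align*}
```

The result no longer mentions the input value x: any input state may be followed by any output state whose x-value satisfies both A and L, which is the static counterpart of the existential quantifier binding the later pronoun.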
By an obvious induction, $\mathrm{TRANS}(\varphi; \bar{x}, \bar{y})$ defines the successful transitions for $\varphi$. Therefore, any central semantic notion of dynamic predicate logic can be reduced to static assertions in this way. Nevertheless, the new formalism as it stands does seem to correspond more closely to practical uses of predicate logic, or semi-formalisms employed in mathematical prose, where we do tend to use scoping conventions closer to the one mentioned above. Another attraction of the dynamic framework is that it suggests taking a fresh look at defining connectives, and logical operators generally. As has often been observed, logical operators do not have meaning only: they also exert control. (For instance, Jennings 1986 points at the use of "and" and "or" as sequencing, or more generally punctuation devices.) What we can do in the new setting, for instance, is to define both 'dynamic' order-dependent versions of connectives, and more classical 'parallel' ones, studying their interplay. One instructive example is provided by conjunction. The natural 'classical' stipulation, in terms of intersection of successful transition sets, will now express a different option from the preceding one: namely, a requirement of parallel execution. And similarly, a classical negation in terms of complement will suddenly acquire a new operational significance. Finally, the new system also suggests several ways of defining valid consequence as arising from successive processing of premises. One natural candidate is the following:
$\varphi_1,\ldots,\varphi_n \models \psi$ if, for all models M, I and assignments $a_1,\ldots,a_{n+1}$ such that $(a_1,a_2) \in [[\varphi_1]], \ldots, (a_n,a_{n+1}) \in [[\varphi_n]]$, there exists an assignment b with $(a_{n+1},b) \in [[\psi]]$.
This notion of consequence, like the ones presented in Section 2, lacks several of the main structural properties of the standard one (be it for different reasons). It is not monotone, for instance: $\exists xAx \models Ax$; but $\exists xAx \wedge \exists x\neg Ax \not\models Ax$.
But this time, it even lacks such simple 'domestic' properties as insensitivity to permutation, or contraction of identical premises:
$\exists xAx \wedge Lx \models \exists x(Ax \wedge Lx)$; but $Lx \wedge \exists xAx \not\models \exists x(Ax \wedge Lx)$,
$\exists xAx \wedge \exists x\neg Ax \wedge \exists xAx \models Ax$; but $\exists xAx \wedge \exists x\neg Ax \not\models Ax$.
Still, as in Section 2, certain special cases will remain valid - including the following leftward form of monotonicity: $\varphi \models \psi$ implies $\chi, \varphi \models \psi$. Once again, the divergences from classical logic here arise from reasons different from those in Section 2. We have defined 'logical consequence' rather close to one particular interpretation algorithm, and are now feeling its effects. Whether this has been a wise policy will be discussed further in Section 4 below.
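The claims about dynamic predicate logic made in this subsection can be checked by brute force on small models. The sketch below is an illustration under stated assumptions, not the cited authors' implementation: formulas are hypothetical nested tuples, there is a single variable x, the models range over a two-element domain with unary predicates A and L, and the universal clause is taken in the form 'every x-variant b of a admits some successful transition'. A positive answer therefore only means 'no counterexample on this small domain'.

```python
from itertools import product, combinations

VARS = ("x",)   # assumption: one variable suffices for the examples

def assignments(dom):
    return list(product(dom, repeat=len(VARS)))

def variant(a, b, i):
    """a and b agree everywhere except possibly at position i."""
    return all(a[j] == b[j] for j in range(len(VARS)) if j != i)

def sem(f, dom, interp):
    """Transition relation [[f]] as a set of (input, output) pairs."""
    S, op = assignments(dom), f[0]
    if op == "atom":                      # ("atom", "A", "x"): a pure test
        i = VARS.index(f[2])
        return {(a, a) for a in S if a[i] in interp[f[1]]}
    if op == "and":                       # sequential composition
        r1, r2 = sem(f[1], dom, interp), sem(f[2], dom, interp)
        return {(a, b) for (a, c) in r1 for (c2, b) in r2 if c == c2}
    if op == "ex":                        # reset the variable, then run the body
        i, body = VARS.index(f[1]), sem(f[2], dom, interp)
        return {(a, b) for a in S for (c, b) in body if variant(a, c, i)}
    if op == "not":                       # test: the body has no successful run
        starts = {a for (a, _) in sem(f[1], dom, interp)}
        return {(a, a) for a in S if a not in starts}
    if op == "imp":                       # test: every run of f[1] can be continued by f[2]
        r1 = sem(f[1], dom, interp)
        starts2 = {b for (b, _) in sem(f[2], dom, interp)}
        return {(a, a) for a in S if all(b in starts2 for (a2, b) in r1 if a2 == a)}
    if op == "all":                       # test: every x-variant starts a run of the body
        i = VARS.index(f[1])
        starts = {b for (b, _) in sem(f[2], dom, interp)}
        return {(a, a) for a in S if all(b in starts for b in S if variant(a, b, i))}
    raise ValueError(op)

def models(dom):
    subsets = [set(c) for r in range(len(dom) + 1) for c in combinations(dom, r)]
    for A, L in product(subsets, repeat=2):
        yield {"A": A, "L": L}

def equivalent(f, g, dom=(0, 1)):
    return all(sem(f, dom, m) == sem(g, dom, m) for m in models(dom))

def entails(premises, concl, dom=(0, 1)):
    """Process the premises in order; the conclusion must then be executable."""
    for m in models(dom):
        reach = {(a, a) for a in assignments(dom)}
        for p in premises:
            r = sem(p, dom, m)
            reach = {(a, b) for (a, c) in reach for (c2, b) in r if c == c2}
        starts = {b for (b, _) in sem(concl, dom, m)}
        if any(b not in starts for (_, b) in reach):
            return False
    return True

Ax, Lx = ("atom", "A", "x"), ("atom", "L", "x")
ExA, ExNotA = ("ex", "x", Ax), ("ex", "x", ("not", Ax))
target = ("ex", "x", ("and", Ax, Lx))

print(equivalent(("and", ExA, Lx), target))                        # scope widening
print(equivalent(("imp", ExA, Lx), ("all", "x", ("imp", Ax, Lx))))  # sentence (3)
print(entails([ExA], Ax), entails([ExA, ExNotA], Ax))               # non-monotonicity
```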
3.2 Dynamics of Information Flow
Changing assignments means no more than changing our links with a certain model. We have been studying the 'dynamics of adjustment', so to speak. But already in Section 2, we also encountered the dynamics of changing information. The latter perspective is currently receiving a good deal of attention too. For instance, in the philosophy of science, there has been work on the dynamics of changing theories, which has also issued in more general epistemic studies of various operations on knowledge states: in particular, addition and retraction of information (see Gärdenfors 1988). There are at least two ways of thinking about the dynamics of changing information states. One is the classical perspective, common to both standard logic and such less standard frameworks as, e.g., possible worlds semantics. Here, incoming information is treated as reduction of the space of a priori possibilities - as was done already in Section 2, when discussing 'classical' versus 'minimal' transformations on information states. The other perspective takes some more primitive notion of epistemic state, in particular, one in which partial information need not be represented by a cloud of all possible total (world) extensions. And then, 'propositions' can act as more abstract operators on such states. We shall consider both approaches.
3.2.1 Reducing Possibilities
First, the method of 'eliminating uncertainty' is easy to implement, witness current folklore. Here is one particular example, due to Veltman 1987 (but compare also Heim 1982, and others). Consider a modal propositional language with the ordinary Boolean connectives as well as a modality $\Diamond$ ("might"). Let U be a set of 'possibilities' (say, ordinary valuations), for which it makes sense to call an atom p true or false. Then we can define, for each formula $\varphi$, a corresponding transformation $[[\varphi]]$ on subsets of U:
$[[p]](X) = \{x \in X \mid x \text{ verifies } p\}$
$[[\varphi \wedge \psi]](X) = [[\varphi]](X) \cap [[\psi]](X)$
$[[\varphi \vee \psi]](X) = [[\varphi]](X) \cup [[\psi]](X)$
$[[\neg\varphi]](X) = X - [[\varphi]](X)$
and
$[[\Diamond\varphi]](X) = X$ if $[[\varphi]](X) \neq \emptyset$, and $\emptyset$ otherwise.
If desired, one might add a sequential conjunction as before:
$[[\varphi ; \psi]] = [[\psi]] \circ [[\varphi]]$, i.e., $[[\varphi ; \psi]](X) = [[\psi]]([[\varphi]](X))$.
For purely propositional formulas $\varphi$, it is easy to see that $[[\varphi]](X)$ merely amounts to intersecting X with the truth range of $\varphi$ in U computed in the standard fashion. But, already with the modal operator, some interesting phenomena occur, once consequence is introduced on the analogy of the previous subsection:
$\varphi_1,\ldots,\varphi_n \models \psi$ if, for all X, $[[\varphi_1;\ldots;\varphi_n]](X) \subseteq [[\psi]](X)$.
Note, e.g., that $\Diamond\neg p ; p$ will be a consistent sequence, implying p, whereas its permutation $p ; \Diamond\neg p$ is inconsistent. As Veltman argues, this reflects the facts of life for our ordinary use of the epistemic modality "might". And there are various other applications of this formal system too. Also as before, the new dynamic consequence can be embedded in 'static' standard logic. For instance, in a sense, the above little system is a part of monadic predicate logic. To see this, assign unary predicate letters P (uniquely) to each proposition letter p, while also taking one distinguished unary predicate letter X. By induction, we translate each formula $\varphi$ into a syntactic operation $A(\varphi)$ on monadic formulas $\alpha = \alpha(x)$ having a free variable x:
$A(p)(\alpha) = \alpha \wedge Px$
$A(\varphi \wedge \psi)(\alpha) = A(\varphi)(\alpha) \wedge A(\psi)(\alpha)$
$A(\varphi \vee \psi)(\alpha) = A(\varphi)(\alpha) \vee A(\psi)(\alpha)$
$A(\neg\varphi)(\alpha) = \alpha \wedge \neg A(\varphi)(\alpha)$
$A(\Diamond\varphi)(\alpha) = (\exists y\,[y/x]A(\varphi)(\alpha)) \wedge \alpha$
$A(\varphi ; \psi)(\alpha) = A(\psi)(A(\varphi)(\alpha))$
Evidently, this is a direct transcription of the above 'truth definition' into a simple language quantifying over states in U. Thus, we have the following reduction:
$\varphi_1,\ldots,\varphi_n \models \psi$ if and only if $\forall x\,(A(\varphi_1;\ldots;\varphi_n)(Xx) \rightarrow A(\psi)(Xx))$ is valid in monadic predicate logic.
As a bonus, decidability, the finite model property and other desirable features are immediate for the new dynamic logic.
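The eliminative update clauses are easy to run on a finite set of possibilities. The following sketch assumes two hypothetical atoms p and q, represents a possibility as the set of atoms it makes true, and reads the "might" clause as 'return X unchanged if updating X with the argument leaves something, else the empty set'; the names `update`, `run`, `consistent` and `entails` are illustrative only.

```python
from itertools import product, combinations

ATOMS = ("p", "q")   # hypothetical proposition letters
# A possibility is a valuation, encoded as the frozenset of atoms it verifies.
U = [frozenset(a for a, bit in zip(ATOMS, bits) if bit)
     for bits in product((0, 1), repeat=len(ATOMS))]

def update(f, X):
    """The transformation [[f]] applied to an information state X (a set of possibilities)."""
    op = f[0]
    if op == "atom":
        return frozenset(x for x in X if f[1] in x)
    if op == "and":
        return update(f[1], X) & update(f[2], X)
    if op == "or":
        return update(f[1], X) | update(f[2], X)
    if op == "not":
        return X - update(f[1], X)
    if op == "might":                     # a test on the whole state
        return X if update(f[1], X) else frozenset()
    raise ValueError(op)

def run(sequence, X):
    """Sequential conjunction: process the formulas one after another."""
    for f in sequence:
        X = update(f, X)
    return X

def all_states():
    return [frozenset(c) for r in range(len(U) + 1) for c in combinations(U, r)]

def consistent(sequence):
    """Some initial state survives the whole sequence non-empty."""
    return any(run(sequence, X) for X in all_states())

def entails(sequence, concl):
    """[[f1;...;fn]](X) is included in [[concl]](X) for every state X."""
    return all(run(sequence, X) <= update(concl, X) for X in all_states())

p = ("atom", "p")
might_not_p = ("might", ("not", p))

print(consistent([might_not_p, p]), entails([might_not_p, p], p))
print(consistent([p, might_not_p]))
```

The order sensitivity shows up because "might" inspects the state as it is at the moment of processing, whereas the Boolean clauses only filter possibilities.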
3.2.2 Building Up Information
Next, implementation of the second approach can actually go in many directions, since we can structure 'knowledge states' in many different ways. (The first approach may in fact be defended as being an elegant way of avoiding such decisions: cf. Stalnaker 1986.) In particular, we have to specify the 'grain size', as it were. Are knowledge states like sets of sentences, with all their syntactic peculiarities? Or should we smoothen these somewhat by thinking of deductively closed theories in some logic? Could we work on the analogy of Beth tableaus, or should we steer away from their particular notational structure, as is done in Hintikka model sets? Slight differences in presentation may now become logically significant. We shall not go into any particular proposal here. Rather, we want to point out a certain danger, of merely revamping existing systems. Suppose that we choose the approach via deductively closed theories (as is suggested by Gärdenfors' treatment). Let us work in
ordinary classical propositional logic. (Note that this decision itself determines what will be 'deductively closed' theories.) The main idea then becomes just this: the action of any formula $\varphi$ on any 'state' T consists in forming the deductive closure of $T \cup \{\varphi\}$. Now, these theories come in an obvious relation of inclusion: $T_1 \sqsubseteq T_2$ if $T_1 \subseteq T_2$. Moreover, they form a distributive lattice, with respect to the available operations of supremum $\sqcup$ and infimum $\sqcap$ on theories. Then, one can set up a recursive definition of operators $[[\varphi]]$ by merely providing a convenient decomposition:
$[[\varphi \wedge \psi]](T) = [[\psi]]([[\varphi]](T)) = [[\varphi]](T) \sqcup [[\psi]](T)$
$[[\varphi \vee \psi]](T) = [[\varphi]](T) \sqcap [[\psi]](T)$.
And other connectives could be treated by postulating additional structure on the set of knowledge states (e.g., for implication, one would have to make it into a Heyting algebra). But obviously, nothing much would be gained in this way. So, there is a danger (to be avoided) of trivially achieving a 'dynamic' presentation. Another approach might be to start from a very abstract notion of dynamic information structure, studying specializations as they arise. For instance, the general type might be this: $(S, \sqsubseteq, \{\tau_p \mid p \in P\})$, where S is a set of 'information states', ordered by increasing strength ($\sqsubseteq$), on which a certain family of transformations acts, indexed by propositions. In other words, we assume a perspective from Group Theory. This perspective makes it easy to ask systematic questions. One is how much structure should be imposed on the transformations. Evidently, they should form a semi-group under composition. But should they also be a group: i.e., should there be an inverse to every proposition (its 'retraction')? [Gärdenfors 1988 has rather looked at this structure of transformations from the point of view of Category Theory, demanding the presence of equalizers, modelling equivalence between propositions.] But also, this question interacts with the possible structure to be imposed on information states: should they form a lattice, a Heyting Algebra or even a Boolean Algebra? Then, it would be natural to have corresponding closure conditions on transformations too. E.g., there should be some operation on them such that $(\tau_p \sqcap \tau_q)(x) = \tau_p(x) \sqcap \tau_q(x)$, for all $x \in S$. Once a class of transformations with certain closure properties has been chosen, we can bring up various additional questions. For instance, what are natural subclasses of transformations satisfying additional mathematical requirements? One natural example is that of idempotent operators, satisfying the condition $\tau_p \circ \tau_p = \tau_p$.
This is certainly a very plausible logical requirement too. Another reasonable condition would be monotonicity, in the sense of respecting growth of information: $x \sqsubseteq y$ only if $\tau_p(x) \sqsubseteq \tau_p(y)$, for all $x,y \in S$. This certainly holds for 'classical' propositions; but also, e.g., for all modal propositions in Veltman 1987 (where $\sqsubseteq$ is identical with $\supseteq$). Here is one elementary observation in this vein.
Proposition: If S is a Boolean Algebra, and $\{\tau_p \mid p \in P\}$ a set of idempotent operators which are also distributive, i.e., $\tau_p(x \sqcap y) = \tau_p(x) \sqcap \tau_p(y)$, for all $x,y \in S$, then the whole information structure can be represented by a set structure of the 'eliminative' kind described earlier. (Compare also the analysis of set transformations given in Section 2.) For each state $s \in S$, set
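$\hat{s} = \{x \mid x \sqsupseteq s\}$.
For each proposition $p \in P$, set
$\hat{p} = \{x \mid \tau_p(x) = x\}$.
(This is the common idea of identifying propositions with their fixed points.) Then it is easy to prove that
(1) $x \sqsubseteq y$ iff $\hat{x} \supseteq \hat{y}$
(2) $\widehat{\tau_p(x)} = \hat{x} \cap \hat{p}$.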
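The conditions on transformations discussed here (idempotence, monotonicity, distributivity) can be tested mechanically on the 'eliminative' set structures of 3.2.1. The sketch below makes several assumptions of its own: states are all subsets of a hypothetical three-element set of possibilities, 'x is at most as strong as y' is read as set inclusion of y in x, and distributivity is checked against both intersection and union, since which of the two plays the role of the lattice infimum depends on how the ordering is oriented.

```python
from itertools import combinations

U = frozenset(range(3))   # hypothetical set of possibilities
STATES = [frozenset(c) for r in range(len(U) + 1)
          for c in combinations(sorted(U), r)]

def eliminate(P):
    """Classical proposition: keep only the possibilities in P."""
    return lambda X: X & P

def might(P):
    """Veltman-style 'might': a test that leaves X alone or destroys it."""
    return lambda X: X if X & P else frozenset()

def idempotent(t):
    return all(t(t(X)) == t(X) for X in STATES)

def monotone(t):
    # Growth of information is read here as shrinking of the set of possibilities.
    return all(t(X) >= t(Y) for X in STATES for Y in STATES if X >= Y)

def distributive(t):
    return all(t(X & Y) == t(X) & t(Y) and t(X | Y) == t(X) | t(Y)
               for X in STATES for Y in STATES)

def fixed_points(t):
    """The states left unchanged by t: the set identified with the proposition."""
    return {X for X in STATES if t(X) == X}

P = frozenset({0, 1})
for name, t in (("eliminate", eliminate(P)), ("might", might(P))):
    print(name, idempotent(t), monotone(t), distributive(t))
```

On this reading the classical operator passes all three checks, while the "might" operator is idempotent and monotone but not distributive, which fits the remark that the modal propositions respect growth of information without being plain eliminations.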
But, one could also try to apply more sophisticated mathematical results here, on the representation of algebras or lattices with certain types of endomorphisms. (Compare Jónsson and Tarski 1951.) Finally, one basic attraction of the present framework is that it also suggests new types of question, beyond the classical case. For instance, where transformations are around, invariants cannot be far away (cf. van Benthem 1985b). A relation R between states is invariant with respect to our class of propositions if, say, $(x,y) \in R$ iff $(\tau_p(x), \tau_p(y)) \in R$, for all $p \in P$. Is there an interesting invariant structure on S?
3.2.3 Relational Calculus and Categorial Grammar
Nevertheless, there are also other possibilities for abstract information structures. Perhaps, propositions are not really functions on information states, but rather relations, which can also take no value, or more than one. For instance, certain propositions might embody choices, leaving several options. In that case, one natural framework would be to view propositions as forming an algebra in the sense of the Relational Calculus, with basic
operations such as composition, intersection, union, etcetera. (Recall the earlier discussion of the dynamics of changing assignments.) One basic question in this perspective too, is what would be the natural operations on propositions. This amounts to asking for some principled account of operations on binary relations. (For an algebraic study of these matters, see Jónsson 1984.) One way of doing this employs the framework of Type Theory. For instance, there is a natural notion of logicality for this type of operator, in terms of invariance for permutations of states (cf. van Benthem 1987b). Moreover, again, we can introduce additional useful conditions, such as continuity, in the sense of respecting arbitrary unions of families of relations. All possibilities can then be classified (cf. van Benthem 1987a), and they form a neat basic set, including the above-mentioned examples.
Proposition: The logical continuous operators on binary relations are exactly those defined by a schema of the form
$\lambda R.\lambda xy.\exists uv.\ Ruv \wedge$ 'some Boolean condition on identities involving x,y,u,v'.
Examples are converse: $\lambda R.\lambda xy.\exists uv.\ Ruv \wedge x=v \wedge y=u$, or diagonal: $\lambda R.\lambda xy.\exists uv.\ Ruv \wedge x=y=u=v$. The result is easily specialized to operations from binary to unary relations, with a typical example such as projection: $\lambda R.\lambda x.\exists uv.\ Ruv \wedge x=u$. It can also be generalized to n-ary operations on binary relations, bringing in (typically) disjunction or conjunction: $\lambda RS.\lambda xy.\exists uv.\exists zw.\ Ruv \wedge Szw \wedge x=u=z \wedge y=v=w$, or composition: $\lambda RS.\lambda xy.\exists uv.\exists zw.\ Ruv \wedge Szw \wedge x=u \wedge v=z \wedge w=y$. This kind of systematic classification is important if we are to bring some order into the plethora of possibilities for 'dynamic' logical operators. Another interesting question is how such a relational structure would translate into a corresponding system of inference. Here, we can follow a suggestion arising from Orlowska 1987, and exploit an analogy with Categorial Grammar.
The logic will be very much like a so-called 'Lambek Calculus', as developed in that area (cf. van Benthem 1986a, 1987a). The main idea can be explained as follows: basic propositions p are interpreted as binary relations $R_p$; complex propositions $p \cdot q$ are interpreted through the composition $R_p \circ R_q$; disjunctions may be interpreted by unions. But what to do about implications?
Here, Orlowska notes that two quite plausible 'implicational' operators may be used, introduced recently by Tony Hoare:
$R\backslash S = \bigcup\{X \mid R \circ X \subseteq S\}$
$S/R = \bigcup\{X \mid X \circ R \subseteq S\}$
which describe 'weakest pre- and post-specifications'. (This continues the well-known work on pre- and post-conditions in the operational semantics of programming languages. But see also Jónsson 1984 for a purely algebraic introduction of $\backslash$ and $/$.) Accordingly, the dynamic perspective suggests introducing two implications, one searching for its argument on the left-hand side, and one searching on the right. [A similar idea has been suggested independently by Gordon Plotkin (private communication).] But this is precisely standard practice in Categorial Grammar, which has developed systems of proof for directed types $a\backslash b$ and $b/a$. For instance, the basic Lambek Calculus is the following Gentzen-type system:
Axiom: $a \Rightarrow a$
Rules:
$$\frac{X \Rightarrow a \qquad Y, b, Z \Rightarrow c}{Y, X, a\backslash b, Z \Rightarrow c} \qquad\qquad \frac{X \Rightarrow a \qquad Y, b, Z \Rightarrow c}{Y, b/a, X, Z \Rightarrow c}$$
$$\frac{a, X \Rightarrow b}{X \Rightarrow a\backslash b} \qquad\qquad \frac{X, a \Rightarrow b}{X \Rightarrow b/a}$$
$$\frac{X \Rightarrow a \qquad Y \Rightarrow b}{X, Y \Rightarrow a \cdot b} \qquad\qquad \frac{X, a, b, Y \Rightarrow c}{X, a \cdot b, Y \Rightarrow c}$$
Derivable sequents in this system L include $a, a\backslash b \Rightarrow b$, but typically exclude $a\backslash b, a \Rightarrow b$. [Here, we have deviated somewhat from the standard version, in allowing empty sequences on the left. As a consequence, we can derive, e.g., $(e/e)\backslash t \Rightarrow t$.]
Proposition: The Lambek Calculus L is sound for the above relational calculus. That is, if the sequent $a_1,\ldots,a_n \Rightarrow b$ is L-derivable, then, for any assignment of binary relations $R_x$ to primitive types x, with $R_a$ for complex a computed as above, $R_{a_1} \circ \ldots \circ R_{a_n} \subseteq R_b$.
It would be very interesting to have a converse too (for the operations $\cdot$, $\backslash$ and $/$, that is). Thus, we would establish a link between basic logics of categories in natural language and plausible systems of 'dynamic logic'.
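The relational reading of the two implications can be made concrete on a finite set of states. The sketch below assumes three hypothetical states and computes the two residuals pointwise, using the fact that the union of all X with R∘X included in S is itself the largest such X (composition distributes over unions); the names `under` and `over` are illustrative. It then checks the relational counterparts of the sequents a, a\b ⇒ b and a\b, a ⇒ b.

```python
from itertools import product
import random

STATES = range(3)                         # hypothetical finite state set
PAIRS = list(product(STATES, repeat=2))

def compose(r, s):
    """r o s: first an r-step, then an s-step."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}

def under(r, s):
    """r\\s: the largest X with r o X included in s."""
    return {(y, z) for (y, z) in PAIRS
            if all((x, z) in s for x in STATES if (x, y) in r)}

def over(s, r):
    """s/r: the largest X with X o r included in s."""
    return {(x, y) for (x, y) in PAIRS
            if all((x, z) in s for z in STATES if (y, z) in r)}

def random_relation(rng):
    return {p for p in PAIRS if rng.random() < 0.5}

# Soundness instances: a, a\b => b and b/a, a => b hold for every choice of relations.
rng = random.Random(0)
samples = [(random_relation(rng), random_relation(rng)) for _ in range(200)]
sound = all(compose(ra, under(ra, rb)) <= rb and compose(over(rb, ra), ra) <= rb
            for ra, rb in samples)

# The permuted sequent a\b, a => b fails for this particular pair of relations.
ra, rb = {(0, 1)}, {(0, 2)}
permuted_ok = compose(under(ra, rb), ra) <= rb

print(sound, permuted_ok)
```

The random sample only illustrates the soundness claim; it is the pointwise definitions of the residuals that guarantee the two inclusions in general.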
3.2.4 Propositional Dynamic Logic
Indeed, there is also a profitable connection to be found with 'Dynamic Logic' in the usual sense of that phrase (see Harel 1984). In that research program, one adopts an enriched modal logic having both propositions and programs, in which the latter denote transition relations between states, whereas the former stand for functions from states to truth values (their more traditional role). One interesting feature is that there are operators mapping one into the other: modalities take programs to operate on propositions, whereas a test operator takes propositions to programs. This turns out to be a third convenient abstract framework to explore for present purposes. Note that here, a possibility comes to the fore which has hitherto been neglected. We have been treating propositions themselves as being operations on information states. But in fact, we may want a separation of concerns: into propositions expressing a certain informational content, and various modes of transforming states (which can use certain propositional contents, to be sure). For instance, the above test operator is one such mode, which checks if a state has a certain property, but then leaves it as it is. Another operator might be addition, which, given a propositional content, transforms any state into a minimal extension (as measured along some prior inclusion relation among states) having that content. Thus, we are now interested in Propositional Dynamic Logic for its potential as a dynamic logic of propositions, rather than programs. Again, the general situation here can be analyzed in type theory. Propositional dynamic logic has primitive types t (for truth values) and s (for states). Propositions have the functional type (s,t) ('from states to truth values'), while programs have (s,(s,t)) ('from pairs of states to truth values'). The above 'switching modes' will then be operators in the type ((s,t), (s,(s,t))). As before, it makes sense to ask for logical items here, being those which are invariant for permutations of states. [Note the formal similarity with the earlier relational calculus case of the converse type ((e,(e,t)), (e,t)).] And in fact, the test operator is logical, while also satisfying the earlier special requirement of continuity. Also as before, we can classify all possibilities of the latter kind in the schema
$\lambda P.\lambda xy.\exists u.\ Pu \wedge$ 'Boolean condition on identities involving x,y,u'.
As a first attempt, one might consider that fragment of dynamic logic which only has basic programs of the form $?\varphi$ (where ? is the test operator) and then the usual program operations ; (sequencing), $\cup$ (choice) and * (iteration).
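But this will reduce to ordinary propositional logic, because of such equivalences as the following:
$\langle\pi_1 \cup \pi_2\rangle\varphi \leftrightarrow \langle\pi_1\rangle\varphi \vee \langle\pi_2\rangle\varphi$
$\langle\pi_1 ; \pi_2\rangle\varphi \leftrightarrow \langle\pi_1\rangle\langle\pi_2\rangle\varphi$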