STUDIES IN LOGIC AND THE FOUNDATIONS OF MATHEMATICS VOLUME 80
Editors
H. J. KEISLER, Madison A. MOSTOWSKI, Warszawa
A. ROBINSON, New Haven P. SUPPES, Stanford A. S. TROELSTRA, Amsterdam Advisory Editorial Board
Y. BAR-HILLEL, Jerusalem K. L. DE BOUVÈRE, Santa Clara H. HERMES, Freiburg i. Br. J. HINTIKKA, Helsinki J. C. SHEPHERDSON, Bristol E. P. SPECKER, Zürich
NORTH-HOLLAND PUBLISHING COMPANY - AMSTERDAM . OXFORD AMERICAN ELSEVIER PUBLISHING COMPANY, INC. - NEW YORK
LOGIC COLLOQUIUM '73 PROCEEDINGS OF THE LOGIC COLLOQUIUM BRISTOL, JULY 1973
Edited by H. E. ROSE Lecturer in Mathematics, Bristol and
J. C. SHEPHERDSON
Professor of Pure Mathematics, Bristol
1975
NORTH-HOLLAND PUBLISHING COMPANY - AMSTERDAM . OXFORD AMERICAN ELSEVIER PUBLISHING COMPANY, INC. - NEW YORK
© NORTH-HOLLAND PUBLISHING COMPANY - 1975
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the copyright owner.
Library of Congress Catalog Card Number 74-79302 North-Holland ISBN S 0 7204 2200 0 0 7204 2282 5 American Elsevier ISBN 0 444 10642 1
Published by: North-Holland Publishing Company-Amsterdam North-Holland Publishing Company, Ltd.-Oxford
Sole distributors for the U.S.A. and Canada: American Elsevier Publishing Company, Inc. 52 Vanderbilt Avenue New York, N.Y. 10017
PRINTED IN THE NETHERLANDS
PREFACE This volume contains most of the proceedings of a conference on mathematical logic held under the auspices of the Association for Symbolic Logic in July 1973 in Bristol, England. The main areas of interest of the conference were: Philosophy of mathematics, metamathematics of algebra, proof theory, category theory and theory of computation. There were fourteen invited papers, thirty contributed papers and a further thirteen contributed papers presented by title only. Abstracts of most of these papers appear in the Journal of Symbolic Logic for 1974. The speakers and titles of the invited papers were:
P. Bernays, Mathematics as a domain of theoretical science and also of mental experience
M. Dummett, Philosophical foundations of intuitionistic logic
A. Robinson, Concerning progress in the philosophy of mathematics
G. Müller, Set theory as a ‘frame’ for mathematics
W. W. Boone, An algebraic characterization of groups with solvable word problem
A. Kino, Introduction to proof theory
P. Martin-Löf, On the formalization and proof theoretical analysis of intuitionistic mathematics
S. Mac Lane, Internal logic in toposes and other categories
F. W. Lawvere, Coherent topos: algebraic geometry = geometric logic
D. Scott, Continuous functions and computability
C. Strachey, Some unsolved problems in the theory of computation
R. Milner, Processes, a mathematical model of computing agents
C. C. Elgot, Monadic computation and iterative algebraic theories
E. Engeler, On the structure of algorithmic problems
In the following pages, ten of the first eleven papers are expanded and developed versions of the corresponding lectures given at the conference; Kino’s introductory lecture has been replaced by the paper by Kino and Myhill listed below. The remaining twelve contributed papers are concerned with classical set theory, recursive function theory, metamathematics of algebra, proof theory, model theory and combinatory logic. We should like to take this opportunity, on behalf of all participants at the conference, to thank the following organizations for financial support: International Union for the History and Philosophy of Science, International Computers Ltd, British Academy, British Council and University of Bristol. It was with deep regret that, during the final preparations for this book, we heard of the death of Abraham Robinson. As an editor of the ‘Studies in Logic’ series and as a friend he gave us much help and encouragement in his usual urbane and efficient manner both during the conference and while working on this volume. His death is a great loss to the communities of mathematical logicians and philosophers of mathematics alike.
April 1974
H. E. Rose J. C. Shepherdson
MATHEMATICS AS A DOMAIN OF THEORETICAL SCIENCE AND OF MENTAL EXPERIENCE
Paul BERNAYS Zurich, Switzerland
Ladies and Gentlemen! Firstly may I apologise for the fact that the abstract of my talk only partly corresponds to what I have to say. I considered the general theory of knowledge in more detail than I will do today. So allow me to deviate from that abstract, and in particular to begin in another way. Ferdinand Gonseth often speaks of the problem of beginning (problème du commencement). Let me begin, at least provisionally, with some words about positivism. Many contemporary philosophers hold the opinion that the appropriate view in philosophy of science is positivism. However, the majority of scientists do not take a positivistic view, and further it cannot be maintained that the results of science are clearly in favour of that view. Positivism appears to many people to be a reasonable attitude of an enlightened man because it suggests a radical opposition to intellectual authority. However, from an antagonism to confessional doctrines it passes to a general antagonism to any kind of belief, whether it is our instinctive belief in the existence of exterior objects, or a more abstract belief in a frame of objectivity which we are induced to adopt in science. The antagonism to the belief in external things and beings is of course not practicable in real life but it seems to some philosophers that it can be maintained as a philosophical attitude. As you know, the doctrine has been set up that natural science deals properly only with our sensations. Perhaps it is not possible to refute this view directly, yet it can hardly be claimed that we get from it a satisfactory understanding of physical science. The relations which are stated in physics, i.e. the relations which constitute the contents of the physical laws, are not (apart from very
special exceptions) between sensations but rather between physical entities. Only indirectly do we get information from them concerning our sensations. Still, it might seem that positivism has a better justification with respect to mathematics: as a device for abstaining from any Hypostasierung or, as Willard V. Quine calls it, ontological commitment. An element of positivism is contained both in the intuitionistic view of mathematics and in the view denoted as ‘formalism’, which arose from Hilbert’s proof theory. I think one main point of Hilbert’s proof theory can be accepted without adopting the philosophy of formalism. Indeed this main point is the idea expressed by Hilbert in his talk ‘Axiomatisches Denken’: to make mathematical proof as such an object of mathematical investigation, a possibility resulting from the method of logical formalization. Yet it was certainly not Hilbert’s intention that mathematics should consist only of proof theory (though some of his statements describing his attitude to metamathematics might suggest this view). Also intuitionistic mathematics can be accepted without the adoption of the positivistic views concurrent with it. After this first consideration let us begin again, let us go back to the traditional view of mathematics, which is that mathematics is the science of quantities. This view certainly originated from the work of Euclid, where geometry was based on a theory of quantities whose axioms were stated in the κοιναὶ ἔννοιαι (common notions). This system of axioms, as it seems, was regarded as expressing evident facts, whereas the other axioms were taken as postulates. Yet the application of the concept of quantity to the geometrical entities was taken as unproblematic. It is one of the characteristics of the newer axiomatizations of geometry, since Pasch, that the privileged status of the concept of quantity has been abolished.
For every kind of geometrical entity quantitative characters become established separately by axioms. In particular for linear quantity one introduces axioms of betweenness or of order. The emphasis on the concept of quantity is related to the traditional opposition of quality and quantity. The inadequacy of this opposition may be seen by noting that linear quantities are measured by observing betweenness. Reflecting on this we come to replace the old opposition by that one of quality and structure; and, in fact, mathematics may be regarded as the science of structures. This characterization has still to be specified more precisely. The objectivity of mathematics has to be distinguished from the objectivity of
physics, and also from the objectivity of phenomenal description. On the one hand this means that structures are considered as such, that is, independent of their occurrence in real objects. Thus the objectivity of mathematics is a kind of phenomenological objectivity and may be explained by the analogy with works of art, in particular musical works, which indeed cannot be identified with their various performances. On the other hand the structures we consider in mathematics have nothing to do with the complications peculiar to sensual phenomena and are therefore more easily conceptualised. We may express this by saying that the structures considered in mathematics are idealized structures. The difference between concrete structures and idealized structures has been stressed especially by Stephan Körner. Idealization is in particular present in the conception of geometrical figures, and in this case it can be observed that idealization does not only occur in science but is already instinctively applied when we look at visual structures and form concepts about them. There is a general relation of the study of idealized structures to theoretical science in the following sense: all theoretical description in natural science is schematic (as Gonseth calls it), in particular it is restricted to a certain scale of quantity (Grössenordnung); and the schemata used in theoretical science are just the idealized structures we consider in mathematics, though not all idealized structures come to be applied in theoretical physics. I have stressed the difference between mathematical objectivity, physical objectivity and the objectivity of sensual description. We have however to admit that there are methodical analogies between mathematics and empirical science: there is a kind of experience in mathematics, to which especially George Pólya points in his investigations on mathematical heuristics.
There is also an analogy with theoretical physics in so far as the general programme is not stable but changes essentially with the development of the science. This means that in mathematics not only the questions to be solved within the particular theories but also the suitable theoretical setting up of the whole of mathematics constitutes a problem. Let us recall some newer developments in this respect. At the beginning of our century the theory of algebraic functions was the central domain; it included function theory, algebra, algebraic geometry, theory of Riemann surfaces and topology. The privileged role of this frame was abandoned
about the time of Hilbert. Then came the idea of taking axiomatic set theory as the general frame of mathematics, and now this position of set theory is being challenged. One has various embracing theories, and axiomatic set theory itself is extended by model theory, where set theoretic concepts are used independently from the axiomatization. The modified situation is especially clear in the general theory of mappings which is called the theory of categories. From this aspect of mathematics we can also deal with the criticism of classical mathematics advanced by the intuitionists. Thus Heyting argues that it is too naive to ask questions like ‘Does there really exist a well-ordering of the continuum?’ But the question need not be put in this way. The question rather may be asked: ‘Is it suitable to adopt the strong extrapolation of classical analysis as it is formulated in the conceptions of set theory, in particular in the theory of transfinite cardinals?’ At the present time no deciding experience has been found to answer this question. Mathematicians have different opinions about the suitability of stronger or only weaker methods of idealizations. At all events the fact that our mathematical idealizations are so successful is a striking mental experience and by no means something trivial.
THE PHILOSOPHICAL BASIS OF INTUITIONISTIC LOGIC
Michael DUMMETT University of Oxford, U.K.
The question with which I am here concerned is: What plausible rationale can there be for repudiating, within mathematical reasoning, the canons of classical logic in favour of those of intuitionistic logic? I am, thus, not concerned with justifications of intuitionistic mathematics from an eclectic point of view, that is, from one which would admit intuitionistic mathematics as a legitimate and interesting form of mathematics alongside classical mathematics: I am concerned only with the standpoint of the intuitionists themselves, namely that classical mathematics employs forms of reasoning which are not valid on any legitimate construal of mathematical statements (save, occasionally, by accident, as it were, under a quite unintended reinterpretation). Nor am I concerned with exegesis of the writings of Brouwer or of Heyting: the question is what forms of justification of intuitionistic mathematics will stand up, not what particular writers, however eminent, had in mind. And, finally, I am concerned only with the most fundamental feature of intuitionistic mathematics, its underlying logic, and not with the other respects (such as the theory of free choice sequences) in which it differs from classical mathematics. It will therefore be possible to conduct the discussion wholly at the level of elementary number theory. Since we are, in effect, solely concerned with the logical constants, that is, with the sentential operators and the first-order quantifiers, our interest lies only with the most general features of the notion of a mathematical construction, although it will be seen that we need to consider these in a somewhat delicate way. Any justification for adopting one logic rather than another as the logic for mathematics must turn on questions of meaning. It would be impossible to contrive such a justification which took meaning for granted, and represented the question as turning on knowledge or certainty. We are certain of the truth of a statement when we have conclusive grounds for it and are certain that the grounds which we have are valid grounds for it and are conclusive. If classical arguments for mathematical statements are called in question, this cannot possibly be because it is thought that we are, in general, unable to tell with certainty whether an argument is classically valid, unless it is also intuitionistically valid: rather, it must be that what is being put in doubt is whether arguments which are valid by classical but not by intuitionistic criteria are absolutely valid, that is, whether they really conclusively establish their conclusions as true. Even if it were held that classical arguments, while not in general absolutely valid, nevertheless always conferred a high probability on their conclusions, it would be wrong to characterise the motive for employing only intuitionistic arguments as lying in a desire to attain knowledge in place of mere probable opinion in mathematics, since the very thesis that the use of classical arguments did not lead to knowledge would represent the crucial departure from the classical conception, beside which the question of whether or not one continued to make use of classical arguments as mere probabilistic reasoning is comparatively insignificant. (In any case, within standard intuitionistic mathematics, there is no reason whatever why the existence of a classical proof of it should render a statement probable, since if, e.g., it is a statement of analysis, its being a classical theorem does not prevent it from being intuitionistically disprovable.) So far as I am able to see, there are just two lines of argument for repudiating classical reasoning in mathematics in favour of intuitionistic reasoning. The first runs along the following lines.
The meaning of a mathematical statement determines and is exhaustively determined by its use. The meaning of such a statement cannot be, or contain as an ingredient, anything which is not manifest in the use made of it, lying solely in the mind of the individual who apprehends that meaning: if two individuals agree completely about the use to be made of the statement, then they agree about its meaning. The reason is that the meaning of a statement consists solely in its role as an instrument of communication between individuals, just as the powers of a chesspiece consist solely in its role in the game according to the rules. An individual cannot communicate what he cannot be observed to communicate: if one individual associated with a mathematical symbol or formula some mental content, where the association did not lie in the use he made of the symbol
or formula, then he could not convey that content by means of the symbol or formula, for his audience would be unaware of the association and would have no means of becoming aware of it. The argument may be expressed in terms of the knowledge of meaning, i.e. of understanding. A model of meaning is a model of understanding, i.e. a representation of what it is that is known when an individual knows the meaning. Now knowledge of the meaning of a particular symbol or expression is frequently verbalisable knowledge, that is, knowledge which consists in the ability to state the rules in accordance with which the expression or symbol is used or the way in which it may be replaced by an equivalent expression or sequence of symbols. But to suppose that, in general, a knowledge of meaning consisted in verbalisable knowledge would involve an infinite regress: if a grasp of the meaning of an expression consisted, in general, in the ability to state its meaning, then it would be impossible for anyone to learn a language who was not already equipped with a fairly extensive language. Hence that knowledge which, in general, constitutes the understanding of the language of mathematics must be implicit knowledge. Implicit knowledge cannot, however, meaningfully be ascribed to someone unless it is possible to say in what the manifestation of that knowledge consists: there must be an observable difference between the behaviour or capacities of someone who is said to have that knowledge and someone who is said to lack it. Hence it follows, once more, that a grasp of the meaning of a mathematical statement must, in general, consist of a capacity to use that statement in a certain way, or to respond in a certain way to its use by others. Another approach is via the idea of learning mathematics.
When we learn a mathematical notation, or mathematical expressions, or, more generally, the language of a mathematical theory, what we learn to do is to make use of the statements of that language: we learn when they may be established by computation, and how to carry out the relevant computations, we learn from what they may be inferred and what may be inferred from them, that is, what role they play in mathematical proofs and how they can be applied in extra-mathematical contexts, and perhaps we learn also what plausible arguments can render them probable. These things are all that we are shown when we are learning the meanings of the expressions of the language of the mathematical theory in question, because they are all that we can be shown: and, likewise, our proficiency in making the correct use of the statements and expressions of the language is all that others have from which to judge whether or not we
M I O . Here a formula b(u,) with vo as its only free variable is said to define a set A of numbers if b ( i ) is true iff i E A. 2.1.-2.3. as above. 2.4'. V,%, is of rank a and rV,?I? = 2'. 3' if each ?i, is of rank s a and f is a Godel number of a formula %(vo) of rank < a defining the set ('%,lIi<w}.
2.5'. A is of rank a and rA, %l = 2' .3', if each 91, is of rank y ’ ) , (x’,x”, y’? y”))).
AN INTUITIONISTIC THEORY OF TYPES
PER MARTIN-LÖF

(Note the tacit use of the principle that a proof of a proposition is also a proof of a definitionally equal proposition!) Hence the functions $f'$ and $f''$ as we have defined them take a closed normal term $z'$ with type symbol $(\Sigma x \in A)B[x]'$ and a proof $z''$ of $(\Sigma x \in A)B[x]''(z')$ into a closed normal term with type symbol $C'[z', z'']$ such that $f(z') \;\mathrm{red}\; f'(z', z'')$ and a proof of $C''[z', z''](f'(z', z''))$, respectively.

$\Sigma$-conversion. If $a \in A$ and $b \in B[a]$ and $f$ has been introduced by the above schema, then

\[ f((a, b))' =_{\mathrm{def}} f'((a, b)', (a, b)'') =_{\mathrm{def}} f'((a', b'), (a', a'', b', b'')) =_{\mathrm{def}} c'[a', a'', b', b''] =_{\mathrm{def}} c[a, b]', \]

\[ f((a, b))'' =_{\mathrm{def}} f''((a, b)', (a, b)'') =_{\mathrm{def}} f''((a', b'), (a', a'', b', b'')) =_{\mathrm{def}} c''[a', a'', b', b''] =_{\mathrm{def}} c[a, b]'', \]

as desired. The last steps in the two chains of definitional equalities follow from the substitution property.

$+$-reflection. For $A \in V_m$ and $B \in V_n$, we put

\[ (A + B)' =_{\mathrm{def}} (A + B) \]

and define the species $(A + B)''$ of closed normal terms with type symbol $(A + B)'$ by the proof conditions

if $x'$ is a closed normal term with type symbol $A'$ and $x''$ is a proof of $A''(x')$, then $i''(x', x'')$ is a proof of $(A + B)''(i(x'))$,

if $y'$ is a closed normal term with type symbol $B'$ and $y''$ is a proof of $B''(y')$, then $j''(y', y'')$ is a proof of $(A + B)''(j(y'))$.

Remembering that $A + B$ is but an informal notation for a constant with type symbol $V_{\max(m,n)}$, it is clear that $(A + B)'$ is a closed normal term with that type symbol such that $(A + B) \;\mathrm{red}\; (A + B)'$, and that $(A + B)''$ is a $\max(m, n)$th order species of closed normal terms with type symbol $(A + B)'$.

$+$-introduction. For a closed normal term $x'$ with type symbol $A'$ and a proof $x''$ of $A''(x')$, we put

\[ i'(x', x'') =_{\mathrm{def}} i(x'), \]

which is a closed normal term with type symbol $(A + B)'$ such that $i(x') \;\mathrm{red}\; i'(x', x'')$, and let $i''$ be the function introduced in the previous paragraph which takes $x'$ and $x''$ into a proof of $(A + B)''(i(x'))$. The second rule of $+$-introduction is treated similarly.

$+$-elimination. If $c[x] \in C[i(x)]$ and $d[y] \in C[j(y)]$ depend on the variables $x \in A$ and $y \in B$, respectively, and the unary function constant $f$ has been introduced by the schema

\[ f(i(x)) \;\mathrm{conv}\; c[x], \qquad f(j(y)) \;\mathrm{conv}\; d[y], \]

we put

\[ f'(i(x'), i''(x', x'')) =_{\mathrm{def}} c'[x', x''], \qquad f'(j(y'), j''(y', y'')) =_{\mathrm{def}} d'[y', y''], \]

\[ f''(i(x'), i''(x', x'')) =_{\mathrm{def}} c''[x', x''], \qquad f''(j(y'), j''(y', y'')) =_{\mathrm{def}} d''[y', y'']. \]

Since

\[ C[i(x)]' =_{\mathrm{def}} C'[i(x'), i''(x', x'')], \]

\[ C[i(x)]''(c[x]') =_{\mathrm{def}} C''[i(x'), i''(x', x'')](f'(i(x'), i''(x', x''))), \]

\[ f(i(x')) \;\mathrm{red}\; c[x'] \;\mathrm{red}\; c'[x', x''] =_{\mathrm{def}} f'(i(x'), i''(x', x'')), \]

and correspondingly for $j(y)$ instead of $i(x)$, the functions $f'$ and $f''$ take a closed normal term $z'$ with type symbol $(A + B)'$ and a proof $z''$ of $(A + B)''(z')$ into a closed normal term with type symbol $C'[z', z'']$ such that $f(z') \;\mathrm{red}\; f'(z', z'')$ and a proof of $C''[z', z''](f'(z', z''))$, respectively.

$+$-conversion. If $a \in A$ and $f$ has been introduced by the above schema, then

\[ f(i(a))' =_{\mathrm{def}} f'(i(a)', i(a)'') =_{\mathrm{def}} f'(i(a'), i''(a', a'')) =_{\mathrm{def}} c'[a', a''] =_{\mathrm{def}} c[a]', \]

\[ f(i(a))'' =_{\mathrm{def}} f''(i(a)', i(a)'') =_{\mathrm{def}} f''(i(a'), i''(a', a'')) =_{\mathrm{def}} c''[a', a''] =_{\mathrm{def}} c[a]'', \]

as desired. Similarly for $j(b)$ instead of $i(a)$.
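The schema of $+$-elimination, $f(i(x)) \;\mathrm{conv}\; c[x]$ and $f(j(y)) \;\mathrm{conv}\; d[y]$, is definition by cases on a disjoint union. The following is only an informal sketch of that schema in Python, not part of the theory: the names `Inl`, `Inr` and `define_by_cases` are hypothetical, the two classes stand in for the injections $i$ and $j$, and ordinary evaluation stands in for conversion. Nothing of the primed model construction is represented.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Inl:
    """Stands in for i(x), the left injection into A + B (hypothetical name)."""
    value: Any


@dataclass(frozen=True)
class Inr:
    """Stands in for j(y), the right injection into A + B (hypothetical name)."""
    value: Any


def define_by_cases(c: Callable[[Any], Any], d: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Return f satisfying f(i(x)) = c(x) and f(j(y)) = d(y)."""
    def f(z: Any) -> Any:
        if isinstance(z, Inl):
            return c(z.value)   # the first clause of the schema
        if isinstance(z, Inr):
            return d(z.value)   # the second clause of the schema
        raise TypeError("argument is not of the form i(x) or j(y)")
    return f


# Illustrative instance: c and d are arbitrary choices made for this example.
describe = define_by_cases(lambda n: f"number {n}", lambda s: f"text {s!r}")
print(describe(Inl(3)))      # number 3
print(describe(Inr("ab")))   # text 'ab'
```

The two clauses of `f` mirror the two conversion rules one for one, which is why a value built by neither injection is rejected rather than given a default.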
$I$-reflection. If the binary function constant $I$ denotes the identity relation on $A \in V_n$, we put

\[ I'(x', x'', y', y'') =_{\mathrm{def}} I(x', y') \]

and define the species $I''(x', x'', y', y'')$ of closed normal terms with type symbol $I'(x', x'', y', y'')$ by the proof condition

if $x'$ is a closed normal term with type symbol $A'$ and $x''$ is a proof of $A''(x')$, then $r''(x', x'')$ is a proof of $I''(x', x'', x', x'')(r(x'))$.

Clearly, $I'$ and $I''$ so defined take closed normal terms $x'$ and $y'$ with type symbol $A'$ and proofs $x''$ and $y''$ of $A''(x')$ and $A''(y')$ into a closed normal term with type symbol $V_n$ such that $I(x', y') \;\mathrm{red}\; I'(x', x'', y', y'')$ and an $n$th order species of closed normal terms with type symbol $I'(x', x'', y', y'')$, respectively.

$I$-introduction. For a closed normal term $x'$ with type symbol $A'$ and a proof $x''$ of $A''(x')$, we put

\[ r'(x', x'') =_{\mathrm{def}} r(x'), \]

which is a closed normal term with type symbol $I'(x', x'', x', x'') =_{\mathrm{def}} I(x', x')$ such that $r(x') \;\mathrm{red}\; r'(x', x'')$, and let $r''$ be the function introduced in the previous paragraph which takes $x'$ and $x''$ into a proof of $I''(x', x'', x', x'')(r(x'))$.

$I$-elimination. If $c[x] \in C[x, x, r(x)]$ depends on the variable $x \in A$ and the function constant $f$ has been introduced by the schema

\[ f(x, x, r(x)) \;\mathrm{conv}\; c[x], \]

we put

\[ f'(x', x'', x', x'', r(x'), r''(x', x'')) =_{\mathrm{def}} c'[x', x''], \]

\[ f''(x', x'', x', x'', r(x'), r''(x', x'')) =_{\mathrm{def}} c''[x', x'']. \]

Since

\[ C[x, x, r(x)]' =_{\mathrm{def}} C'[x', x'', x', x'', r(x'), r''(x', x'')], \]

\[ C[x, x, r(x)]''(c[x]') =_{\mathrm{def}} C''[x', x'', x', x'', r(x'), r''(x', x'')](f'(x', x'', x', x'', r(x'), r''(x', x''))), \]

\[ f(x', x', r(x')) \;\mathrm{red}\; c[x'] \;\mathrm{red}\; c'[x', x''] =_{\mathrm{def}} f'(x', x'', x', x'', r(x'), r''(x', x'')), \]

the functions $f'$ and $f''$ take closed normal terms $x'$ and $y'$ with type symbol $A'$, proofs $x''$ and $y''$ of $A''(x')$ and $A''(y')$, respectively, a closed normal term $z'$ with type symbol $I'(x', x'', y', y'') =_{\mathrm{def}} I(x', y')$ and a proof $z''$ of $I''(x', x'', y', y'')(z')$ into a closed normal term with type symbol $C'[x', x'', y', y'', z', z'']$ such that

\[ f(x', y', z') \;\mathrm{red}\; f'(x', x'', y', y'', z', z'') \]

and a proof of $C''[x', x'', y', y'', z', z''](f'(x', x'', y', y'', z', z''))$, respectively.
$I$-conversion. If $a \in A$ and the function constant $f$ has been introduced by the above schema, then

\[ f(a, a, r(a))' =_{\mathrm{def}} f'(a', a'', a', a'', r(a'), r''(a', a'')) =_{\mathrm{def}} c'[a', a''] =_{\mathrm{def}} c[a]', \]

\[ f(a, a, r(a))'' =_{\mathrm{def}} f''(a', a'', a', a'', r(a'), r''(a', a'')) =_{\mathrm{def}} c''[a', a''] =_{\mathrm{def}} c[a]'', \]

as desired.
$N_n$-reflection. We put $N_n' =_{\mathrm{def}} N_n$ and define the species $N_n''$ of closed normal terms with type symbol $N_n'$ by the proof condition

$m''$ is a proof of $N_n''(m)$ for $m = 1, \ldots, n$.

Clearly, $N_n'$ is a closed normal term with type symbol $V_0' =_{\mathrm{def}} V_0$ such that $N_n \;\mathrm{red}\; N_n'$, and $N_n''$ is a 0th order species of closed normal terms with type symbol $N_n'$.

$N_n$-introduction. For $m = 1, \ldots, n$, we put

\[ m' =_{\mathrm{def}} m, \]

which is a closed normal term with type symbol $N_n' =_{\mathrm{def}} N_n$ such that $m \;\mathrm{red}\; m'$, and let $m''$ be the proof of $N_n''(m') =_{\mathrm{def}} N_n''(m)$ which enters into the proof condition for $N_n''$.
$N_n$-elimination. If $c_1 \in C[1], \ldots, c_n \in C[n]$ and the function constant $f$ has been introduced by the schema

\[ f(1) \;\mathrm{conv}\; c_1, \quad \ldots, \quad f(n) \;\mathrm{conv}\; c_n, \]

we put

\[ f'(m, m'') =_{\mathrm{def}} c_m', \qquad f''(m, m'') =_{\mathrm{def}} c_m'' \]

for $m = 1, \ldots, n$. Since

\[ C[m]' =_{\mathrm{def}} C'[m, m''], \]

\[ C[m]''(c_m') =_{\mathrm{def}} C''[m, m''](f'(m, m'')), \]

\[ f(m) \;\mathrm{red}\; c_m \;\mathrm{red}\; c_m' =_{\mathrm{def}} f'(m, m''), \]

for $m = 1, \ldots, n$, the functions $f'$ and $f''$ take a closed normal term $x'$ with type symbol $N_n' =_{\mathrm{def}} N_n$ and a proof $x''$ of $N_n''(x')$ into a closed normal term with type symbol $C'[x', x'']$ such that $f(x') \;\mathrm{red}\; f'(x', x'')$ and a proof of $C''[x', x''](f'(x', x''))$, respectively.

$N_n$-conversion. If the function constant $f$ has been introduced by the above schema, then

\[ f(m)' =_{\mathrm{def}} f'(m', m'') =_{\mathrm{def}} f'(m, m'') =_{\mathrm{def}} c_m', \]

\[ f(m)'' =_{\mathrm{def}} f''(m', m'') =_{\mathrm{def}} f''(m, m'') =_{\mathrm{def}} c_m'' \]

for $m = 1, \ldots, n$, as desired.
$N$-reflection. We put

\[ N' =_{\mathrm{def}} N, \]

which is a closed normal term with type symbol $V_0' =_{\mathrm{def}} V_0$ such that $N \;\mathrm{red}\; N'$, and define inductively the 0th order species $N''$ of closed normal terms with type symbol $N' =_{\mathrm{def}} N$ by the proof conditions

$0''$ is a proof of $N''(0)$,

if $x'$ is a closed normal term with type symbol $N$ and $x''$ is a proof of $N''(x')$, then $s''(x', x'')$ is a proof of $N''(s(x'))$.

$N$-introduction. We put

\[ 0' =_{\mathrm{def}} 0, \]

which is a closed normal term with type symbol $N' =_{\mathrm{def}} N$ such that $0 \;\mathrm{red}\; 0'$, and let $0''$ be the proof of $N''(0') =_{\mathrm{def}} N''(0)$ which enters into the first proof condition for $N''$. Similarly, for a closed normal term $x'$ with type symbol $N' =_{\mathrm{def}} N$ and a proof $x''$ of $N''(x')$, we put

\[ s'(x', x'') =_{\mathrm{def}} s(x'), \]

which is a closed normal term with type symbol $N' =_{\mathrm{def}} N$ such that $s(x') \;\mathrm{red}\; s'(x', x'')$, and let $s''$ be the function which enters into the second proof condition for $N''$ and takes $x'$ and $x''$ into a proof of $N''(s'(x', x'')) =_{\mathrm{def}} N''(s(x'))$.
$N$-elimination. If $c \in C[0]$, $d[x, y] \in C[s(x)]$ depends on the variables $x \in N$ and $y \in C[x]$, and the function constant $f$ has been introduced by recursion,

\[ f(0) \;\mathrm{conv}\; c, \qquad f(s(x)) \;\mathrm{conv}\; d[x, f(x)], \]

we define $f'$ and $f''$ by the simultaneous recursions

\[ f'(0, 0'') =_{\mathrm{def}} c', \qquad f'(s(x'), s''(x', x'')) =_{\mathrm{def}} d'[x', x'', f'(x', x''), f''(x', x'')], \]

\[ f''(0, 0'') =_{\mathrm{def}} c'', \qquad f''(s(x'), s''(x', x'')) =_{\mathrm{def}} d''[x', x'', f'(x', x''), f''(x', x'')]. \]
Since

\[ C[0]' =_{\mathrm{def}} C'[0, 0''], \qquad C[s(x)]' =_{\mathrm{def}} C'[s(x'), s''(x', x'')], \]

\[ C[0]''(c') =_{\mathrm{def}} C''[0, 0''](f'(0, 0'')), \]

\[ C[s(x)]''(d'[x', x'', f'(x', x''), f''(x', x'')]) =_{\mathrm{def}} C''[s(x'), s''(x', x'')](f'(s(x'), s''(x', x''))), \]

\[ f(0) \;\mathrm{red}\; c \;\mathrm{red}\; c' =_{\mathrm{def}} f'(0, 0'') \]

and, under the induction hypothesis that $f(x') \;\mathrm{red}\; f'(x', x'')$,

\[ f(s(x')) \;\mathrm{red}\; d[x', f(x')] \;\mathrm{red}\; d[x', f'(x', x'')] \;\mathrm{red}\; d'[x', x'', f'(x', x''), f''(x', x'')] =_{\mathrm{def}} f'(s(x'), s''(x', x'')), \]

the functions $f'$ and $f''$ take a closed normal term $x'$ with type symbol $N' =_{\mathrm{def}} N$ and a proof $x''$ of $N''(x')$ into a closed normal term with type symbol $C'[x', x'']$ such that $f(x') \;\mathrm{red}\; f'(x', x'')$ and a proof of $C''[x', x''](f'(x', x''))$, respectively.

$N$-conversion. If $a \in N$ and the function constant $f$ has been introduced by recursion, then

\[ f(0)' =_{\mathrm{def}} f'(0, 0'') =_{\mathrm{def}} c', \]

\[ f(s(a))' =_{\mathrm{def}} f'(s(a'), s''(a', a'')) =_{\mathrm{def}} d'[a', a'', f'(a', a''), f''(a', a'')] =_{\mathrm{def}} d[a, f(a)]', \]

\[ f(0)'' =_{\mathrm{def}} f''(0, 0'') =_{\mathrm{def}} c'', \]

\[ f(s(a))'' =_{\mathrm{def}} f''(s(a'), s''(a', a'')) =_{\mathrm{def}} d''[a', a'', f'(a', a''), f''(a', a'')] =_{\mathrm{def}} d[a, f(a)]'', \]

as required.

$V_n$-reflection. We have already defined $V_n'$ and $V_n''$, but we also have to verify that $V_n' =_{\mathrm{def}} V_n$ is a closed normal term with type symbol $V_{n+1}' =_{\mathrm{def}} V_{n+1}$ such that $V_n \;\mathrm{red}\; V_n'$, which is clearly so, and that $V_n''$ is an $(n+1)$th order species of closed normal terms with type symbol $V_n' =_{\mathrm{def}} V_n$. Remember that, for a closed normal term $A$ with type symbol $V_n$, $V_n''(A) =_{\mathrm{def}}$ the type of $n$th order species of closed normal terms with type symbol $A$, that is, the type of functions from the closed normal terms with type symbol $A$ into the $n$th universe, which is an object of the $(n+1)$th universe. Hence $V_n''$ is indeed an $(n+1)$th order species of closed normal terms with type symbol $V_n' =_{\mathrm{def}} V_n$.

This finishes the construction of the model of closed normal terms and thereby the proof of the normalization theorem for closed terms.

3.4. THEOREM. If $a$ and $b$ are closed normal terms such that $a \;\mathrm{conv}\; b$, then $a = b$, that is, $a$ and $b$ are syntactically identical.
The following proof is due to Peter Hancock. From the construction of the term model, we know that a red a’ and b red b’. But a closed normal
term can only reduce to itself, and hence a = a' and b = b'. On the other hand, a conv b implies a' =def b' (and a'' =def b'', but we do not need that) and, a fortiori, a' = b'. Hence a = a' = b' = b as was to be proved.
3.5. THEOREM. If the closed term a converts into the closed normal term b, then a reduces to b.

We know from the normalization theorem that a red a' and, by assumption, that a conv b, where a' and b are both closed normal terms. From a red a' and a conv b, we can conclude a' conv b. Hence, by the previous theorem, a' = b which, together with a red a', yields a red b as desired.
3.6. UNIQUENESS OF NORMAL FORM. If the closed term a converts into the closed normal terms b and c, then b = c.

The assumptions a conv b and a conv c imply b conv c and hence, b and c being closed normal terms, b = c.

3.7. CHURCH-ROSSER PROPERTY. If a and b are closed terms, then a conv b if and only if there exists a closed term c such that a red c and b red c.
For systems with a sufficiently simple type structure, like the typed lambda calculus or the system of Gödel terms, the Church-Rosser property can be proved by purely combinatorial means. For the present theory, however, it is an open question whether a combinatorial proof can be given at all. The following proof, by the Hancock method, uses in an essential way the properties of the model of closed normal terms. We know that a red a' and b red b'. Moreover, a conv b implies a' =def b'. Hence, if we put c =def a' =def b', we can conclude a red c and b red c as desired.

3.8. DECIDABILITY OF THE CONVERTIBILITY RELATION. For two closed terms a and b, it can be mechanically decided whether or not a conv b.
This was first proved for the Gödel terms by Tait [22]. The (clearly mechanical) decision procedure is this: Reduce a and b to their normal forms a' and b' and check whether or not a' = b', that is, whether or not a' and b' are syntactically identical.
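The decision procedure can be illustrated on a deliberately tiny system. The sketch below is not the theory of types: it assumes a toy language of closed terms built from 0, s and a hypothetical constant add with the reduction rules add(a, 0) red a and add(a, s(b)) red s(add(a, b)). The names normalize and conv are illustrative choices. Only the shape of the procedure (normalize both terms, then compare syntactically) is taken from the text.

```python
# Toy closed terms as nested tuples: ("0",), ("s", t), ("add", t, u).
# Assumed reduction rules (for illustration only):
#   add(a, 0)    red  a
#   add(a, s(b)) red  s(add(a, b))

ZERO = ("0",)

def s(t):
    return ("s", t)

def add(t, u):
    return ("add", t, u)

def normalize(t):
    """Reduce a closed toy term to its normal form, which is a numeral."""
    tag = t[0]
    if tag == "0":
        return t
    if tag == "s":
        return s(normalize(t[1]))
    # t = add(a, b): first bring the recursion argument b to normal form.
    a, b = t[1], normalize(t[2])
    if b == ZERO:
        return normalize(a)               # add(a, 0) red a
    return s(normalize(add(a, b[1])))     # add(a, s(b1)) red s(add(a, b1))

def conv(a, b):
    """Decide convertibility by comparing normal forms syntactically."""
    return normalize(a) == normalize(b)

two = s(s(ZERO))
print(conv(add(two, two), s(s(add(s(ZERO), s(ZERO))))))  # both denote 4
print(conv(add(two, ZERO), s(ZERO)))                      # 2 versus 1
```

Tuple equality in Python is structural, so the final comparison plays the role of syntactic identity of normal forms.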
3.9. UNIQUENESS OF TYPE SYMBOLS. If a and b are closed terms with type symbols A and B, respectively, and a conv b, then A conv B.

From the properties of the term model, we know that
\[
a \ \mathrm{red}\ a', \qquad A \ \mathrm{red}\ A', \qquad b \ \mathrm{red}\ b', \qquad B \ \mathrm{red}\ B',
\]
where a' and b' are closed normal terms with type symbols A' and B', respectively, and, since a conv b, that a' =def b'. Hence, if we put c =def a' =def b', c is at the same time a closed normal term with type symbol A' and a closed normal term with type symbol B'. However, as remarked in connection with the definition of the closed normal terms, the type symbol of a closed normal term is uniquely associated with it, and hence A' = B' so that A conv B as was to be proved.

3.10. COROLLARY. If there are closed derivations of A ∈ V_m, B ∈ V_n and A conv B, then m = n.
This shows that the order of a closed type symbol is unique. From the previous theorem, we can conclude that V_m conv V_n. But V_m and V_n are both closed normal terms, and hence V_m = V_n, that is, m = n.

3.11. THEOREM. From closed derivations of a conv b and b ∈ B, we can find a closed derivation of a ∈ B.

This shows that
\[
\frac{a \ \mathrm{conv}\ b \qquad b \in B}{a \in B}
\]
holds as a derived rule for closed derivations. To prove it, first find a closed derivation of a ∈ A for some A by following the derivation of a conv b. Then, by the uniqueness of type symbols, we can find a closed derivation of A conv B. Applying the rule for converting a type symbol to these two derivations,
\[
\frac{a \in A \qquad A \ \mathrm{conv}\ B}{a \in B}
\]
we get the desired derivation of a ∈ B.
3.12. THEOREM. From a closed derivation of a ∈ A, we can find a closed derivation of A ∈ V_n for some n.
The proof is by induction on the length of the derivation of a ∈ A. The only rule which causes any difficulty is the rule
\[
\frac{a \in A \qquad A \ \mathrm{conv}\ B}{a \in B}
\]
The induction hypothesis then allows us to find a closed derivation of A ∈ V_n for some n. Hence, by the previous theorem and the symmetry of the convertibility relation, B ∈ V_n for the same value of n.

3.13. DECIDABILITY OF THE EPSILON RELATION. It can be mechanically decided whether or not a closed term is a type symbol. And, given a closed term a and a closed type symbol A, it can be mechanically decided whether or not a ∈ A.

A closed term is a type symbol if and only if the type symbol of its normal form is syntactically identical with V_n for some n = 0, 1, ..., and
there is clearly a mechanical procedure for checking whether or not this is the case. Similarly, if a is a closed term and A a closed type symbol, then a ∈ A if and only if the type symbol of the normal form a' of a is syntactically identical with the normal form A' of A, and again this is something that can be mechanically decided.

The decidability of the epsilon relation shows that the theory of types satisfies the adequacy condition formulated in (the discussion following) [15, Problem 10], namely, that it should be recursively decidable whether or not a closed term formally proves a given closed formula in the hypothetical theory of constructions. This is the formal counterpart of the experience that we can decide whether or not a purported proof actually is a proof of a given proposition (in Kreisel's words: we recognize a proof when we see one).

3.14. THEOREM. I(a, b) is provable, that is, there is a closed term with type symbol I(a, b), if and only if a conv b.

Here, of course, it is supposed that a and b are closed terms with common type symbol A and that I denotes the identity relation on A. The sufficiency is trivial, because r(a) is a closed term with type symbol I(a, a) and a conv b implies I(a, a) conv I(a, b) so that, by the rule for converting a type symbol, r(a) is a closed term with type symbol I(a, b). Conversely, suppose that c is a closed term with type symbol I(a, b). Then c' is a closed normal term with type symbol I(a, b)' =def I(a', b') which is only possible if a' = b' and r(a') = c'. Since a red a', b red b' and a' = b', we can conclude a conv b as desired.
3.15. THEOREM. A number theoretic function which can be constructed in the theory of types is mechanically computable.
Of course, the fact that there is a not necessarily mechanical procedure for computing every function in the present theory requires no proof at all once we have recognized that the axioms and rules of inference of the theory are consonant with the intuitionistic notion of function, according to which a function is the same as a rule or method. By saying that a number theoretic function can be constructed in the theory of types, I mean that there is a closed term f with type symbol N → N which denotes it. Suppose that we want to find the value of the function for a certain natural number which is denoted by the numeral m. Then f(m) denotes the value of the function for this argument. But f(m) is a closed term with type symbol N and hence, by the normalization theorem and the fact that the closed normal terms with type symbol N are precisely the numerals, it reduces to a numeral n. It only remains to remark that the normal form of a term can be found in a purely mechanical way, that is, by manipulating symbol strings according to rules which refer solely to their syntactical form and not to their meaning.

Similarly, having formalized the construction of the real numbers (for example, as Cauchy sequences of rationals) in the theory of types, we can prove as a corollary to the normalization theorem that every individual real number which we can construct in the formal theory can be computed by a machine with any preassigned degree of approximation. These corollaries show that formalization taken together with the ensuing proof-theoretical analysis effectuates the computerization of abstract intuitionistic mathematics that above all Bishop [1] and [2] has asked for. What is doubtful at present is not whether computerization is possible, because we already know that, but rather whether these proof-theoretical computation procedures are at all useful in practice. So far, they do not seem to have found a single significant application.
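As a small illustration of computing a value by purely syntactic unfolding, the following sketch represents numerals as nested tuples and unfolds a function constant introduced by recursion, f(0) = c and f(s(a)) = d[a, f(a)]. Everything here is an assumption made for the example: the tuple encoding, the helper names rec, numeral and value, and the choice of doubling as the function. It is not the normalization procedure of the theory itself.

```python
# Numerals as nested tuples: 0 is ("0",), s(t) is ("s", t).
ZERO = ("0",)

def succ(t):
    return ("s", t)

def numeral(m):
    """Build the numeral denoting the natural number m."""
    t = ZERO
    for _ in range(m):
        t = succ(t)
    return t

def rec(c, d, n):
    """Unfold f(0) = c, f(s(a)) = d(a, f(a)) on the numeral n."""
    if n == ZERO:
        return c
    a = n[1]
    return d(a, rec(c, d, a))

def double(n):
    # Hypothetical example function: f(0) = 0, f(s(a)) = s(s(f(a))).
    return rec(ZERO, lambda a, fa: succ(succ(fa)), n)

def value(t):
    """Read off the natural number denoted by a numeral."""
    k = 0
    while t != ZERO:
        t = t[1]
        k += 1
    return k

print(value(double(numeral(3))))
```

The result of double is again a numeral, so the value of the function is read off from the syntactic form of the term alone.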
References

[1] E. Bishop, Foundations of Constructive Analysis (McGraw-Hill, New York, 1967).
[2] E. Bishop, Mathematics as a numerical language, in: J. Myhill, A. Kino and R. E. Vesley, eds., Intuitionism and Proof Theory (North-Holland, Amsterdam, 1970) pp. 53-71.
[3] H. B. Curry and R. Feys, Combinatory Logic, Vol. I (North-Holland, Amsterdam, 1968).
[4] S. Feferman, Systems of predicative analysis, Journal of Symbolic Logic 29 (1964) 1-30.
[5] S. Feferman, Set-theoretical foundations of category theory, in: Reports of the Midwest Category Theory Seminar III, Lecture Notes in Mathematics, Vol. 106 (Springer, Berlin, 1969) pp. 201-247.
[6] G. Gentzen, Untersuchungen über das logische Schliessen, Mathematische Zeitschrift 39 (1934) 176-210, 405-431.
[7] J.-Y. Girard, Interprétation fonctionnelle et élimination des coupures de l'arithmétique d'ordre supérieur, Thèse, Université Paris VII (1972).
[8] N. D. Goodman, A theory of constructions equivalent to arithmetic, in: J. Myhill, A. Kino and R. E. Vesley, eds., Intuitionism and Proof Theory (North-Holland, Amsterdam, 1970) pp. 101-120.
[9] K. Gödel, Über eine bisher noch nicht benützte Erweiterung des finiten Standpunktes, Dialectica 12 (1958) 280-287.
[10] W. A. Howard, The formulae-as-types notion of construction (1969), unpublished.
[11] W. A. Howard, A system of abstract constructive ordinals, Journal of Symbolic Logic 37 (1972) 355-374.
[12] G. Kreisel, Foundations of intuitionistic logic, in: E. Nagel, P. Suppes and A. Tarski, eds., Logic, Methodology and Philosophy of Science (Stanford University Press, Stanford, Calif., 1962) pp. 198-210.
[13] G. Kreisel, Mathematical logic, in: T. L. Saaty, ed., Lectures on Modern Mathematics, Vol. III (Wiley, New York, 1965) pp. 95-195.
[14] G. Kreisel, Functions, ordinals, species, in: B. van Rootselaar and J. F. Staal, eds., Logic, Methodology and Philosophy of Science III (North-Holland, Amsterdam, 1968) pp. 145-159.
[15] G. Kreisel, Church's thesis: a kind of reducibility axiom for constructive mathematics, in: J. Myhill, A. Kino and R. E. Vesley, eds., Intuitionism and Proof Theory (North-Holland, Amsterdam, 1970) pp. 121-150.
[16] P. Martin-Löf, About models for intuitionistic type theories and the notion of definitional equality, paper read at the Orléans Logic Conference (1972).
[17] D. Prawitz, Ideas and results in proof theory, in: J. E. Fenstad, ed., Proceedings of the Second Scandinavian Logic Symposium (North-Holland, Amsterdam, 1971) pp. 235-307.
[18] B. Russell, The Principles of Mathematics, Vol. I (Cambridge University Press, Cambridge, 1903).
[19] D. Scott, Constructive validity, in: Symposium on Automatic Demonstration, Lecture Notes in Mathematics, Vol. 125 (Springer, Berlin, 1970) pp. 237-275.
[20] K. Schütte, Predicative well-orderings, in: J. N. Crossley and M. Dummett, eds., Formal Systems and Recursive Functions (North-Holland, Amsterdam, 1963) pp. 279-302.
[21] K. Schütte, Eine Grenze für die Beweisbarkeit der transfiniten Induktion in der verzweigten Typenlogik, Archiv für mathematische Logik und Grundlagenforschung 7 (1965) 45-60.
[22] W. W. Tait, Intensional interpretations of functionals of finite type, Journal of Symbolic Logic 32 (1967) 198-212.
[23] W. W. Tait, Constructive reasoning, in: B. van Rootselaar and J. F. Staal, eds., Logic, Methodology and Philosophy of Science III (North-Holland, Amsterdam, 1968) pp. 185-199.
SETS, TOPOI, AND INTERNAL LOGIC IN CATEGORIES
Saunders MACLANE*
University of Chicago, Chicago, Ill., U.S.A.
A recent development in category theory has been the description and study of a useful class of categories called 'elementary topoi'. This class includes the ordinary category of sets, that of diagrams of sets, and also that of sheaves of sets on a topological space. The axiomatic description of these categories provides a formulation of axiomatic set theory wholly different from the usual set-theoretic axioms on the membership relation, and the further study casts considerable light on a number of problems of foundations. This paper is intended to give a general survey of these developments.

1. The variety of models
The intention is to axiomatize sets in terms of functions and their composition, as in the definition of a category. A category E consists of objects X, Y, ... and arrows f, g, ...; each arrow f has an object X as domain and an object Y as codomain, as indicated by writing f : X → Y; if g is any arrow g : Y → Z with domain Y, the codomain of f, there is an arrow g ∘ f : X → Z called the composite of g with f; for each object Y there is an arrow 1 = 1_Y : Y → Y called the identity arrow of Y. These data are subject to identity and associativity axioms:
\[
1_Y \circ f = f, \qquad g \circ 1_Y = g, \qquad h \circ (g \circ f) = (h \circ g) \circ f : X \to W, \tag{1}
\]
where the latter axiom is to hold for all arrows h : Z → W. The fundamental model is the category E = Set of sets: Objects X are all sets,
* Supported in part by a grant from the National Science Foundation.
arrows are functions, composition is the usual composition of functions, and identity 1_X the usual identity function. There are many other models of these axioms. For the category of pairs of sets, an object is an ordered pair (X_0, X_1) of two sets, an arrow f : (X_0, X_1) → (Y_0, Y_1) is a pair of functions f_0 : X_0 → Y_0 and f_1 : X_1 → Y_1, with the evident composition and identity. The category Set^→ of functions has as objects the functions t : X_0 → X_1 (as always here, functions with specified domain X_0 and codomain X_1) and as arrows f : (t : X_0 → X_1) → (t' : Y_0 → Y_1) the pairs of functions f_0, f_1 such that in the diagram
\[
\begin{array}{ccc}
X_0 & \xrightarrow{\;t\;} & X_1 \\
\downarrow{\scriptstyle f_0} & & \downarrow{\scriptstyle f_1} \\
Y_0 & \xrightarrow{\;t'\;} & Y_1
\end{array} \tag{2}
\]
one has t'f_0 = f_1 t (one then says that this diagram commutes); the composition is again evident (put another commuting square below (2) and compose the vertical edges). The category Set^N of sets through time has as objects denumerable strings X_0 → X_1 → X_2 → ··· of functions between sets, and as arrows the strings of arrows f_i,
\[
\begin{array}{ccccccccc}
X_0 & \xrightarrow{\;t\;} & X_1 & \xrightarrow{\;t\;} & X_2 & \xrightarrow{\;t\;} & X_3 & \xrightarrow{\;t\;} & \cdots \\
\downarrow{\scriptstyle f_0} & & \downarrow{\scriptstyle f_1} & & \downarrow{\scriptstyle f_2} & & \downarrow{\scriptstyle f_3} & & \\
Y_0 & \xrightarrow{\;t\;} & Y_1 & \xrightarrow{\;t\;} & Y_2 & \xrightarrow{\;t\;} & Y_3 & \xrightarrow{\;t\;} & \cdots
\end{array} \tag{3}
\]
for which all the squares commute (t f_i = f_{i+1} t for i = 0, 1, ... and all the appropriate (different) t's). In this category, we can consider an object to be a set which 'varies' through discrete time, starting at time 0; the notion is reminiscent of the Kripke models for intuitionistic logic [13]. Other such patterns of variability are possible. In model (2), an object consists of sets in the pattern · → ·; in (3) the sets have the pattern · → · → · → ···; one may regard such a pattern as itself a category D, while an object is a 'functor' on D to Set; the resulting models then form a functor category Set^D (as described, for example, in [18]). We think of this as the category of diagrams of shape D, for any D.
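For finite sets the commutativity condition on arrows of the category of functions can be checked mechanically. The sketch below assumes functions are given as Python dicts; the names is_arrow and compose and the sample objects t, u, v are hypothetical and serve only to illustrate the condition t'f_0 = f_1 t and the componentwise composition.

```python
def is_arrow(t, t_prime, f0, f1):
    """Check t' . f0 == f1 . t on every element of X0 (the keys of t)."""
    return all(t_prime[f0[x]] == f1[t[x]] for x in t)

def compose(g, f):
    """Componentwise composite of the pair (g0, g1) after the pair (f0, f1)."""
    (g0, g1), (f0, f1) = g, f
    return ({x: g0[f0[x]] for x in f0}, {x: g1[f1[x]] for x in f1})

# Hypothetical objects t : X0 -> X1, u : Y0 -> Y1 and v : Z0 -> Z1.
t = {1: "a", 2: "a"}
u = {10: "A", 20: "B"}
v = {"p": "P", "q": "P"}

f = ({1: 10, 2: 10}, {"a": "A"})                 # an arrow t -> u
g = ({10: "p", 20: "q"}, {"A": "P", "B": "P"})   # an arrow u -> v
bad = ({1: 10, 2: 20}, {"a": "A"})               # the square fails at x = 2

print(is_arrow(t, u, *f), is_arrow(u, v, *g))
print(is_arrow(t, v, *compose(g, f)))   # stacking two commuting squares
print(is_arrow(t, u, *bad))
```

The third check mirrors the remark in the text that the composite is obtained by placing one commuting square below another and composing the vertical edges.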
Still other models arise from a topological space S; take an object to be a 'sheaf' X of sets on the space S and an arrow X → Y to be a morphism of sheaves (as for example in [7] or [25]); the resulting category will again be a model for the axioms to be formulated; indeed, for suitable generalizations of 'topological space' to a so-called 'Grothendieck topology', it will be substantially the most general model.

2. The basic axioms
To the axioms (1) for a category E we now add some further axioms, initially such as to provide for a 'number' 1, products, and powers (exponentials).

Terminal object. The category E contains an object 1, called a terminal object, such that for any object X of E there is exactly one arrow X → 1. It follows readily that any two such terminal objects 1 and 1' are isomorphic (that is, there is f : 1 → 1' with a two-sided inverse). In Set, any one-element set {*} is terminal, while the terminal diagram of shape D is the diagram with 1 at each vertex (1 → 1 → 1 → ···).

Pullbacks. To every pair X → B ← Y of arrows f : X → B and g : Y → B with common codomain B in E the category E contains a pullback (P, p, q). A pullback consists of a new vertex P and arrows p, q such that the square on P commutes (fp = gq, as below) with the following 'universal' property: Given any commutative square (gk = fh) on the edges f and g with new corner Z, there is a unique arrow s : Z → P such that the whole diagram commutes (ps = h, qs = k):
\[
\begin{array}{ccc}
P & \xrightarrow{\;q\;} & Y \\
\downarrow{\scriptstyle p} & & \downarrow{\scriptstyle g} \\
X & \xrightarrow{\;f\;} & B
\end{array} \tag{4}
\]
In the category Set, this pullback always exists, with P the set of those pairs (x, y) with x ∈ X, y ∈ Y and fx = gy, while s is the unique function with s(z) = (hz, kz). There are several useful special cases: If B is the terminal object 1, the square automatically commutes and P is exactly the Cartesian product X × Y; if Y is a subset Y ⊂ B and the function g is the inclusion, then P is that subset of X which is the inverse image of Y under the map f : X → B. Finally, if X and Y are both subsets of B (and f and g both inclusion maps) then P is clearly just the intersection X ∩ Y ⊂ B. Pullbacks exist in all the other model categories listed, and in each such case they yield products of arbitrary objects, intersections of subobjects and other useful constructions, just as in Set. For example, the unique map s of (4) can be used to construct the product h × k of two arrows h : A → X, k : B → Y, as in
\[
\begin{array}{ccccc}
A & \xleftarrow{\;p_1\;} & A \times B & \xrightarrow{\;p_2\;} & B \\
\downarrow{\scriptstyle h} & & \downarrow{\scriptstyle h \times k} & & \downarrow{\scriptstyle k} \\
X & \xleftarrow{\;p_1\;} & X \times Y & \xrightarrow{\;p_2\;} & Y
\end{array} \tag{5}
\]
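The description of pullbacks in Set can be tried out on finite sets. The following sketch assumes functions are given as dicts; the names pullback and mediating and all sample data are hypothetical. It builds P as the set of pairs (x, y) with fx = gy, and the mediating function s(z) = (hz, kz), and then exercises the inverse-image and product special cases mentioned above.

```python
def pullback(X, Y, f, g):
    """Return (P, p, q) with P = {(x, y) | f(x) == g(y)} and its two projections."""
    P = {(x, y) for x in X for y in Y if f[x] == g[y]}
    p = {pair: pair[0] for pair in P}
    q = {pair: pair[1] for pair in P}
    return P, p, q

def mediating(Z, h, k):
    """The function s : Z -> P with s(z) = (h(z), k(z)); assumes g.k == f.h."""
    return {z: (h[z], k[z]) for z in Z}

# Inverse image: Y is the subset {"even"} of B = {"even", "odd"}, g the inclusion.
X = {1, 2, 3, 4}
f = {1: "odd", 2: "even", 3: "odd", 4: "even"}
Y = {"even"}
g = {"even": "even"}
P, p, q = pullback(X, Y, f, g)
print(sorted(P))

# Product: B is a one-element (terminal) set, so every pair qualifies.
Y2 = {"a", "b"}
P2, _, _ = pullback(X, Y2, {x: "*" for x in X}, {y: "*" for y in Y2})
print(len(P2))

# Universal property for a commuting square with corner Z.
Z = {"u", "v"}
h = {"u": 2, "v": 4}
k = {"u": "even", "v": "even"}
s = mediating(Z, h, k)
print(all(s[z] in P and p[s[z]] == h[z] and q[s[z]] == k[z] for z in Z))
```

In the first case the first components of P are exactly the elements of X that f sends into Y, that is, the inverse image of Y under f.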
Exponential. To every pair of objects Y, X, the category E contains an object Y^X called the exponential, and an arrow e : Y^X × X → Y, called evaluation, with the following 'universal' property: To every arrow f : W × X → Y in E there is a unique arrow g : W → Y^X in E such that the following diagram commutes (e(g × 1) = f):
\[
\begin{array}{ccc}
Y^X \times X & \xrightarrow{\;e\;} & Y \\
\uparrow{\scriptstyle g \times 1} & \nearrow{\scriptstyle f} & \\
W \times X & &
\end{array} \tag{6}
\]
In Set, the exponential is the usual function-set
\[
Y^X = \hom(X, Y) = \{\, t \mid t : X \to Y \,\} \tag{7}
\]
consisting of all arrows from X to Y, and the evaluation is that function e which takes a function t and an element x ∈ X to the value e(t, x) = t(x) of t at x. Then, given any function f of two variables w and x, the corresponding function g in (6) is the function of one variable w with values in Y^X given by g(w)(x) = f(w, x). Thus the correspondence f ↦ g exhibited in (6) is a bijection
\[
\hom(W \times X, Y) \cong \hom(W, Y^X); \tag{8}
\]
in other language [18] it asserts that the functor − × X (product with X) is left adjoint to the functor (−)^X (raise to the power X). Exponentials may be readily constructed in the various diagram categories, such as sets through time.

We could continue to list other categorical constructions possible in Set and in our diagram categories: There is an initial object 0 (a unique arrow from 0 to any object); there are pushouts (dualize the description of pullbacks by reversing all arrows); hence there are sums (disjoint unions in Set). An axiom of choice can be formulated (to every f : X → Y with X ≠ 0 there is a g : Y → X with fgf = f) and an axiom of infinity: the infinite set ω of all natural numbers is characterized by the property that functions on ω can be defined by recursion, as expressed by a suitable diagram (see [17, §II.11]).

3. Characteristic functions of subobjects
An arrow m : S → X is said to be monic in E when it can always be 'cancelled' on the left; that is, when mf = mg for arrows f, g : Y → S always implies f = g. (The property is just like the assertion that m is one-one when ms = mt for elements s, t ∈ S always implies s = t.) In Set, the inclusion m : S → X of a subset S in X is always monic. Conversely, every monic m' : S' → X in Set is essentially of this form, in that there is always a bijection (= isomorphism) θ : S' ≅ S to a subset, with mθ = m'. Hence in any category E, we may define a subobject of an object X to be an equivalence class of monics m : S → X, where two monics m, m' count as equivalent when there is an isomorphism θ with mθ = m'. In a category of diagrams of shape D, a subdiagram is just a subset at each vertex of D, so that all arrows restrict to arrows.

In Set, however, there is another way of describing subobjects, namely by characteristic functions; in Lawvere's original axiomatics for the category of sets [14] this proved quite difficult to handle. Here the characteristic function of a subset S ⊂ X is a function φ : X → {0, 1} to the set of two 'truth values' 0 and 1 with φx = 0 when x ∈ S and φx = 1 when x ∉ S. Moreover, there is an easy way of using pullback to reconstruct a subset of X from its characteristic function φ. Starting from the special subset {0} ⊂ {0, 1} and its inclusion monic {0} → {0, 1}, called the monic true, we can recover the subset S by pullback as in the diagram
\[
\begin{array}{ccc}
S & \longrightarrow & 1 = \{0\} \\
\downarrow & & \downarrow{\scriptstyle \mathrm{true}} \\
X & \xrightarrow{\;\varphi\;} & \{0, 1\};
\end{array} \tag{9}
\]
indeed, the top arrow in this square is necessarily the unique arrow to the terminal set 1, and S is (equivalent to) the set of those pairs (x, 0) with φx = 0.

In the other model categories there are similar characteristic functions, but with appropriately different sets of truth values. First, in any category E, the pullback of a subobject (a monic) is easily seen to be still a subobject (a monic). In the category of pairs (X_0, X_1) of sets, a subobject is a pair of subobjects, so that we can use as characteristic function of a subobject a pair of (ordinary) characteristic functions, with 'truth values' in the product set {0, 1} × {0, 1}: not two-valued, but still a Boolean algebra. In the category Set^→ of functions a subobject of t : X_0 → X_1 is a pair of subsets S_0 ⊂ X_0 and S_1 ⊂ X_1 such that t(S_0) ⊂ S_1. Then the elements of X_0 can be classified not in two but in three ways with respect to the subobject: elements x_0 in S_0, elements x_0 not in S_0 but with tx_0 ∈ S_1, and elements for which even this is false. This is a characteristic function φ_0 : X_0 → {0, 1, 2}, defined by
\[
\varphi_0 x_0 =
\begin{cases}
0 & \text{when } x_0 \in S_0, \\
1 & \text{when } x_0 \notin S_0,\ t x_0 \in S_1, \\
2 & \text{when } t x_0 \notin S_1.
\end{cases}
\]
SETS, TOPOI, AND INTERNAL LOGIC IN CATEGORIES
125
where t '0 = 0, t ' l = 0, t'2 = 1 and where cpl is the ordinary characteristic function of S1-+ XI,we see that the subobject (So,S J -+ ( X o ,XI) can again be obtained from its 'characteristic function' (q0,c p ] ) by pulling back a special subobject 'true' as shown. A similar construction applies to 'sets through time'. A subobject has the form of a string of subsets SiC X i for all i,
such that tS_i ⊂ S_{i+1} for all i. Then the appropriate characteristic function φ_0 on X_0 will have, for elements x_0 ∈ X_0,
\[
\varphi_0 x_0 =
\begin{cases}
0 & \text{if } x_0 \in S_0, \\
n & \text{if } t^{n-1} x_0 \notin S_{n-1},\ t^n x_0 \in S_n, \\
\infty & \text{if no } t^n x_0 \in S_n;
\end{cases} \tag{11}
\]
in other words, φ_0 : X_0 → {0, 1, 2, ..., ∞} has φ_0 x_0 = n, the 'time till truth' (time till x_0 lands in the subobject, or otherwise). The truth values form an infinite set Ω = {0, 1, 2, ..., ∞} on which t : Ω → Ω → Ω → ··· is defined by t0 = 0, t(n + 1) = n and t∞ = ∞. Again the subobject can be obtained by pulling back a special subobject {0} ⊂ Ω along the characteristic function {φ_0, φ_1, ...} of {S_i}.

Much the same description of characteristic functions will apply to other categories of diagrams; the characteristic function φ of a subdiagram S ⊂ X will send each x at a vertex not to 'time till truth' but to 'the set of paths till truth', the paths (arrows) which carry x into the subobject.

This suggests the following definition. A subobject classifier in the category E is an object Ω of E and a monic true : 1 → Ω from the terminal object such that to every monic m : S → X (i.e., to every subobject) there exists exactly one arrow φ : X → Ω such that S, with m, is the pullback of true along φ, as in the diagram
In this pullback square, the top arrow is again the unique arrow from S to the terminal object 1. As indicated above, each of our categories E has a subobject classifier. The basic discovery of Lawvere [15] resides in the observation that the characteristic functions required for the elementary theory of the category of sets could be effectively described by this axiom; Lawvere and Tierney then showed that this axiom had considerable deductive power. It applied also to the categories of sheaves; this was the connection with the theory of sheaves which had been developed by the Grothendieck School [1] in France. There, Giraud had described a 'Topos' by axioms exactly sufficient to characterize the category of sheaves of sets over an arbitrary Grothendieck topology.

4. Internal logic in a topos
An elementary topos is now defined to be a category E which has a terminal object, pullbacks, exponentials, and a subobject classifier. This list of axioms proves to be quite rich in consequences. For example, with suitable ingenuity, Mikkelsen proved that an elementary topos E has an initial object and pushouts (and hence sums); the result can now be found in elegant conceptual form, by an application of universal algebra [21]. Moreover, just as in sets, every arrow in E can be factored as f = mk, where m is monic and k is epi (an epi is by definition an arrow cancellable on the right); thus the subobject m is the image of the arrow f. For any object B in the category E one can construct a new category E/B whose objects are the arrows X → B to B; if E = Set, X is just a B-indexed set. One may prove [6] that whenever E is a topos, so is E/B.

In a topos E one might consider the arrows x : 1 → X as the 'elements' of X; they do correspond to elements if E = Set. However, there are usually not 'enough' such elements, because one may have arrows f, g : X → Y with f ≠ g although fx = gx for every such 'element' of X. Hence it is more effective to think of arrows u : U → X from an arbitrary domain U as (generalized) 'elements' of X. At the same time, the arrows φ : X → Ω are the predicates of X, taking values in the object Ω of truth values. Now the description (12) of a subobject S from its characteristic function in terms of pullbacks means exactly the following (take Z = U, P = S, and Y = 1 in (4) above): An element u of X lies in the subobject S with characteristic function φ if and only if φu = true_U. Here the arrow true_U : U → 1 → Ω, the unique arrow U → 1 followed by true,
is the predicate 'truth for U'.

Because predicates are built into the system E, some portions of the usual logic are 'internalized' in a topos. There is a power set object PX = Ω^X for any object X. Indeed, the subobjects S of X correspond, by the classifier, to arrows φ : X → Ω; since X = 1 × X, these correspond in turn, by the definition of the exponential, to arrows ⌜φ⌝ : 1 → PX; that is, to 'elements' of PX. We may call this arrow ⌜φ⌝ the name of φ or of S. Much as in (8), predicates ψ : W × X → Ω of two variables then correspond to arrows φ : W → PX = Ω^X. This amounts to a sort of comprehension axiom, in that φ would send each w ∈ W to the set {x | ψ(w, x)}. It is an internal version of the usual comprehension axiom: 'internal' because the predicate ψ in question is an arrow in the system, not a metamathematical formula about it.

Let E(X, Ω) denote the set of all arrows X → Ω in the topos E. Since Ω is the subobject classifier, the elements of this set E(X, Ω) correspond exactly to the subobjects of X. Now the set Sub X of these subobjects is a lattice: it is partially ordered by the evident inclusion of subobjects, there is a greatest lower bound, which is just the intersection S ∧ T of subobjects, as found by pullback, and there is a least upper bound or union S ∨ T, found by using the images described above. (Given subobjects S, T → X, form the sum S + T and take S ∨ T to be the image of the resulting map S + T → X.) This lattice is actually a Heyting algebra, where a Heyting algebra (= Brouwerian lattice) is a lattice L which, regarded as a category, has exponents. This category L is the one with objects the elements of L, and a (unique) arrow a → b if and only if a ≤ b in the partial order of L. Since the greatest lower bound ∧ in the lattice is then precisely the product in this associated category, the exponential associates to each pair of elements b, c of L an element b ⇒ c such that, for all a,
\[
a \wedge b \le c \quad \text{if and only if} \quad a \le b \Rightarrow c. \tag{13}
\]
Since a ∧ b ≤ c if and only if the set hom(a ∧ b, c) is non-empty, this statement is exactly the adjunction (8) defining exponentials, rewritten in lattice language. In any Heyting algebra, with zero element 0, the negation ¬a of an element a can then be defined as
\[
\neg a = a \Rightarrow 0. \tag{14}
\]
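Conditions (13) and (14) can be checked by brute force on a small lattice. The sketch below assumes the three-element chain 0 ≤ 1 ≤ 2 (with zero element 0 and top element 2) as a hypothetical example; the function names are illustrative. The implication is computed as the largest a satisfying a ∧ b ≤ c, which is exactly what (13) requires.

```python
L = [0, 1, 2]   # the chain 0 <= 1 <= 2; zero element 0, top element 2

def meet(a, b):
    return min(a, b)

def implies(b, c):
    """The element b => c: the largest a with a ∧ b <= c, as required by (13)."""
    return max(a for a in L if meet(a, b) <= c)

def neg(a):
    """Negation as in (14): not a = (a => 0)."""
    return implies(a, 0)

# Check the adjunction (13) for every triple of elements.
print(all((meet(a, b) <= c) == (a <= implies(b, c))
          for a in L for b in L for c in L))

# Negation and double negation of each element.
print([(a, neg(a), neg(neg(a))) for a in L])
```

In this chain the middle element has negation 0 and double negation 2, so double negation is not the identity and the lattice, although a Heyting algebra, is not Boolean.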
Thus in any topos E the set Sub X of subobjects of X is a Heyting algebra, and so is the set E(X, Ω). Moreover, this structure behaves 'functorially' in the object X, which means that the structure can be transformed to the object Ω of E. Hence, for example, there is an arrow ∧ : Ω × Ω → Ω of E, called the formal conjunction, such that for every object X the conjunction φ ∧ ψ of two predicates of X can be obtained as the composite arrow
\[
X \xrightarrow{\;\Delta\;} X \times X \xrightarrow{\;\varphi \times \psi\;} \Omega \times \Omega \xrightarrow{\;\wedge\;} \Omega, \tag{15}
\]
where Δ is the diagonal map, with both projections pΔ and qΔ the identity X → X. Similarly, the operations ∨, ⇒ and ¬ on the sets E(X, Ω) can be obtained from unique arrows
\[
\vee : \Omega \times \Omega \to \Omega, \qquad \Rightarrow \; : \Omega \times \Omega \to \Omega, \qquad \neg : \Omega \to \Omega. \tag{16}
\]
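Now a Heyting algebra can be defined by equations valid for these operations ∧, ∨, ⇒, and ¬, and each such equation can be rewritten as a suitable commutative diagram in E; for example, the associative law for intersection is expressed by the commutative diagram
\[
\begin{array}{ccccc}
(\Omega \times \Omega) \times \Omega & \xrightarrow{\;\theta\;} & \Omega \times (\Omega \times \Omega) & \xrightarrow{\;1 \times \wedge\;} & \Omega \times \Omega \\
\downarrow{\scriptstyle \wedge \times 1} & & & & \downarrow{\scriptstyle \wedge} \\
\Omega \times \Omega & & \xrightarrow{\qquad \wedge \qquad} & & \Omega
\end{array} \tag{17}
\]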
where θ is the canonical isomorphism stating that the categorical product is associative. In virtue of this set of diagrams, the object Ω is called a Heyting algebra object in E. In this sense, the intuitionistic predicate calculus is internalized in E.

In a general topos, these Heyting algebras are not Boolean; that is, ¬¬ is not always the identity. This is the case already with the topos of functions, where the subobject classifier Ω as in (10) is essentially a set with three elements, hence cannot possibly be Boolean. For the topos of sheaves of sets on a topological space S, the corresponding Heyting algebra Ω is the well-known Heyting algebra of all open sets of S; this is not in general Boolean because the double negation is the interior of the closure.

5. Internal quantifiers
By an insight due to Lawvere, the quantifiers can also be internalized in a topos. For example, in the topos Set of all sets, consider a predicate ψ : X × Y → Ω of two variables (in X and Y, respectively) and the corresponding subset S ⊂ X × Y classified by ψ. Then the predicates (∃y)ψ and (∀y)ψ of one variable (in X) become the subsets
\[
\exists_p S = \{\, x \mid \exists y,\ (x, y) \in S \,\}, \qquad \forall_p S = \{\, x \mid \forall y,\ (x, y) \in S \,\}; \tag{18}
\]
here the function p denotes the usual projection p : X × Y → X. For any subset T ⊂ X, let p^{-1}T be the inverse image
\[
p^{-1} T = \{\, (x, y) \mid p(x, y) = x \in T \,\}.
\]
Then the sets ∃_p S and ∀_p S can be described by
\[
\exists_p S \le T \quad \text{if and only if} \quad S \le p^{-1} T, \tag{19}
\]
\[
T \le \forall_p S \quad \text{if and only if} \quad p^{-1} T \le S. \tag{20}
\]
Now in Sub(X) regarded as a category, ∃_p S ≤ T holds if and only if there is a (unique) arrow from ∃_p S to T. Hence (19) states that ∃_p : Sub(X × Y) → Sub(X) is a left adjoint to p^{-1}, and similarly (20) states that ∀_p is a right adjoint. Since the inverse image functor p^{-1} can be defined categorically (as a pullback), this makes possible a description of quantifiers in any topos E. Explicitly, for any f : Z → X in E, define
\[
f^{-1} : \mathrm{Sub}\, X \to \mathrm{Sub}\, Z.
\]
Then one can prove that this functor has both a left adjoint 3, and a right adjoint Vf. The description of this latter adjoint requires a number of preparatory steps: one must first classify the 'partial functions' from A to B, and then use these to construct suitable cross section functors and hence the quantifier Vf. Observe that these quantifiers when f is not a projection p have an immediate meaning in Set; for f : W -+ X and
130
SAUNDERS MACLANE
$S \subset W$ we can extend the definition used in (18) to
$$\exists_f S = \{x \mid \exists w,\ w \in S \text{ and } fw = x\}, \qquad \forall_f S = \{x \mid (\forall w)\ fw = x \text{ implies } w \in S\}. \tag{21}$$
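In Set the two adjunctions of (19) and (20), extended to an arbitrary function as in (21), can be checked by brute force on small finite sets. The following Python sketch is only an illustration under stated assumptions: the helper names `preimage`, `exists_f`, `forall_f` and the sample data `W`, `X`, `f` are introduced here and are not notation from the text.

```python
from itertools import chain, combinations

def powerset(xs):
    """All subsets of a finite set, as frozensets."""
    xs = list(xs)
    subsets = chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))
    return [frozenset(c) for c in subsets]

def preimage(f, T):
    """Inverse image f^{-1}T for a function f : W -> X given as a dict."""
    return frozenset(w for w in f if f[w] in T)

def exists_f(f, X, S):
    """{x | there is w in S with f(w) = x}, as in (21)."""
    return frozenset(x for x in X if any(f[w] == x for w in S))

def forall_f(f, X, S):
    """{x | every w with f(w) = x lies in S}, as in (21)."""
    return frozenset(x for x in X if all(w in S for w in f if f[w] == x))

def adjunctions_hold(f, W, X):
    """Check both adjunction conditions for all S in Sub(W), T in Sub(X)."""
    for S in powerset(W):
        for T in powerset(X):
            left = (exists_f(f, X, S) <= T) == (S <= preimage(f, T))
            right = (T <= forall_f(f, X, S)) == (preimage(f, T) <= S)
            if not (left and right):
                return False
    return True

# Hypothetical sample data: the point 'c' has an empty fibre.
W = {0, 1, 2, 3}
X = {"a", "b", "c"}
f = {0: "a", 1: "a", 2: "b", 3: "b"}
```

With this data `forall_f` always contains the point `"c"` (its fibre is empty, so the universal condition holds vacuously), while `exists_f` never does; this is the usual asymmetry between the two quantifiers when $f$ is not surjective.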
6. Equivalence to membership axioms
The axioms for an elementary topos can all be stated in first order form, without using any set theory; in particular, without using the ‘sets’ $\mathrm{Sub}\,X$ and $E(X, \Omega)$ (note that the description of a subobject classifier and of an exponential did not use these sets). Hence these axioms provide an alternative way of formulating an ‘axiomatic set theory’. In this form, the axioms cover many models besides the usual sets, such as the sets through time noted above. For set theory proper we consider a stronger system, the elementary theory of the category of sets (ETCS for short): the axioms for an elementary topos, plus the axiom of choice, the axiom of infinity, and the assumption that the subobject classifier is Boolean and two-valued (the latter means that $1 + 1 = \Omega$). These axioms all hold in the category of all sets (supposing say the Zermelo-Fraenkel axioms), but they are considerably weaker than Zermelo-Fraenkel. For a considerable period the exact relation was wholly unclear: starting with a membership-style set theory, one constructs readily the category of these sets, but how can one proceed conversely? Julian Cole [5] and William Mitchell [19] simultaneously and independently found the answer to this question. Consider the following variant $Z_0$ of Zermelo set theory: extensionality, the null set, unordered pairs, power sets, unions, regularity, choice, infinity, and bounded comprehension (comprehension for well-formed formulas where all quantifiers are bounded). Add the following two special axioms, each of which is a consequence of the replacement axiom: to every set $X$, there exists a set which is its transitive closure (call it $T_c X$); to every set $X$, there exists the set
$$H(X) = \{Y \mid \operatorname{cardinal} T_c Y \le \operatorname{cardinal} X\}.$$
Then every model of $Z_0$ yields a category of sets which satisfies ETCS. Conversely, to each model of ETCS, Cole and Mitchell constructed a
model of $Z_0$. They started with the observation that in $Z_0$ a set under the membership relation can be regarded as a tree (of members of members of members ... as in [24]) and that trees, especially for transitive sets, can be described directly as objects in a model of ETCS. Recently their argument has been considerably simplified by Osius, who found a method of characterizing the transitive sets as objects in $E$. In noting the relation of ‘categorical’ set theory to ‘membership’ set theory, we emphasize that the categorical approach starts with a much broader coverage, since there are many different types of models for an elementary topos, in particular, the sheaf-theoretic models. We note also that Kock [9] has shown how the Kripke models fit naturally in the topos-theoretic situation.
7. Independence proofs
This reformulation of set theory raises a question as to the position of the set-theoretic method of forcing. In 1971, Tierney and Lawvere showed that Cohen’s proof of the independence of the continuum hypothesis can be formulated in topos-theoretic language. Moreover, and this is the crucial point, an essential step in Cohen’s proof turns out to be exactly the well-known topological process of turning a presheaf into a sheaf (the sheafification functor). This proof has now been published by Tierney [27]. The continuum hypothesis is easily formulated categorically: the set $\omega$ of natural numbers is at hand; the hypothesis states that there is no object $J$ with proper monics
$$\omega \rightarrowtail J \rightarrowtail 2^{\omega}.$$
We start with a model $S$ of ETCS; in it there is an object $I$ and proper monics
$$\omega \rightarrowtail 2^{\omega} \rightarrowtail I.$$
The intent is to force $I$ inside $2^{\omega}$. Now Cohen [3] considered finite sets $p$ of ‘conditions’, each condition being either a statement $n \in i$ or $n \notin i$, for $n \in \omega$ and $i \in I$. The set $P$ of all such finite sets $p$ is partly ordered by inclusion. Tierney introduces essentially the same $P$, regarded as a category, and constructs the functor category $S^P$ (of all diagrams in $S$ of
shape $P$). This category $S^P$ is also a topos; we aim to construct further topoi as in
$$S \to S^P \xrightarrow{\ L\ } \mathrm{Sh}_{\neg\neg}P \to \mathrm{Sh}_{\neg\neg}P[C^{-1}]. \tag{22}$$
Cohen's process [3] of forcing makes a suitable use of double negation; correspondingly, the operation of double negation $\neg\neg : \Omega \to \Omega$ in the subobject classifier for $S^P$ turns out to be a topology there (not a topological space in the usual sense, but a Grothendieck topology). More exactly, the operation $j = \neg\neg : \Omega \to \Omega$ has the properties that $j^2 = j$, $j(\mathrm{true}) = \mathrm{true}$, and
$$j \wedge = \wedge (j \times j) : \Omega \times \Omega \to \Omega.$$
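The three properties of $j = \neg\neg$ can be seen concretely in the Heyting algebra of open sets of a small finite topological space, where $\neg U$ is the largest open set disjoint from $U$ and $\neg\neg U$ is the interior of the closure of $U$. The Python sketch below is an external illustration only: it works with the lattice of opens rather than with an internal subobject classifier, and the space `OPENS` and the helper names are assumptions made for the example.

```python
from itertools import product

# Hypothetical finite space on the points a, b, c; the family of open
# sets is closed under unions and intersections.
TOP = frozenset("abc")
OPENS = [frozenset(), frozenset("a"), frozenset("b"), frozenset("ab"), TOP]

def neg(U):
    """Heyting negation: the union of all open sets disjoint from U."""
    result = frozenset()
    for V in OPENS:
        if not (V & U):
            result |= V
    return result

def j(U):
    """Double negation, i.e. the interior of the closure of U."""
    return neg(neg(U))

idempotent = all(j(j(U)) == j(U) for U in OPENS)
preserves_true = j(TOP) == TOP
preserves_meets = all(j(U & V) == j(U) & j(V) for U, V in product(OPENS, OPENS))

# The algebra is not Boolean: the dense open set {a, b} is moved by j.
not_boolean = j(frozenset("ab")) != frozenset("ab")
```

Here the open set $\{a, b\}$ is dense, so its double negation is the whole space; this is the finite analogue of the remark that the Heyting algebra of open sets is not in general Boolean.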
From these properties it is possible to define the category $\mathrm{Sh}_{\neg\neg}P$ of sheaves for the topology $j = \neg\neg$, and to construct in (22) the sheafification functor $L$. By a standard argument, this category of sheaves is also an elementary topos. Its subobject classifier $\Omega_{\neg\neg}$ has a Boolean algebra $\Gamma\Omega_{\neg\neg}$ of global cross sections, but this algebra is not yet two-valued. The axiom of choice, however, provides a map 11 : $\Gamma\Omega_{\neg\neg} \to 2$, so consider the class $C$ of those monics $S \to X$ in $\mathrm{Sh}_{\neg\neg}P$ for which j(char($\forall_t S$)) = true, where $t : X \to 1$. A standard construction in category theory now constructs the universal category ‘of fractions’ in which all the monics in $C$ are turned into isomorphisms. This is the last category in (22). One proves that it satisfies the ETCS axioms. Some further complex arguments show that the continuum hypothesis is indeed false in this topos, as intended. This method of translation can be applied to other cases of forcing. For example, the independence of Souslin's hypothesis has been established by Tennenbaum [26] (using forcing) and by Jech [8] (using Vopěnka models). Recently, Bunge [2] has given a categorical formulation of this proof. Its form makes it clear that the same should be possible in other cases.
8. The prospects
The subject sketched here is in rapid development. The basic categorical ideas are formulated in [17, Chapter xv] or more fully in [18], [22] and elsewhere. Systematic developments of the properties of a topos from the
axioms appear in [6] and [12]. Other categories with internal logic and quantifiers, and their relations to standard logical systems, are discussed in [10] and [23]. The necessary basic sheaf theory may be found, say, in [7] or in [25]. Given the basic geometric character of sheaf theory, this common development of ideas from geometry and concepts from set theory embodies exciting prospects.
References
[1] M. Artin, A. Grothendieck and J. L. Verdier, Théorie des topos et cohomologie étale des schémas (SGA 4), Vols. 1, 2, and 3, Lecture Notes in Mathematics, Vol. 269 (1972), Vol. 270 (1972) and Vol. 305 (1973) (Springer, Berlin).
[2] Marta C. Bunge, Boolean topoi and independence of Souslin’s hypothesis, Aarhus Universitet, Matematisk Institut, Preprint Series, 1972/73 No. 25.
[3] Paul Cohen, The independence of the continuum hypothesis, I and II, Proceedings of the National Academy of Sciences of the USA 50 (1963) 1143-1148 and 51 (1964) 105-110.
[4] Paul Cohen, Set Theory and the Continuum Hypothesis (Benjamin, New York, 1966).
[5] J. C. Cole, Categories of sets and models of set theory, in: The Proceedings of the Bertrand Russell Memorial Logic Conference, Denmark 1971 (School of Mathematics, Leeds, England, 1973) pp. 351-399.
[6] Peter Freyd, Aspects of topoi, Bulletin of the Australian Mathematical Society 7 (1972) 1-76.
[7] Roger Godement, Théorie des faisceaux (Hermann, Paris, 1958).
[8] T. Jech, Non-provability of Souslin’s hypothesis, Commentationes Mathematicae Universitatis Carolinae 8 (2) (1967) 291-305.
[9] A. Kock, On a theorem of Läuchli concerning proof bundles, to appear.
[10] A. Kock, Introduction to functorial semantics, Lectures at the Bertrand Russell Memorial Logic Conference, Denmark, 1971.
[11] A. Kock and Chr. Juul Mikkelsen, Topos-theoretic factorization of non-standard extensions, Aarhus University, Preprint.
[12] A. Kock and G. C. Wraith, Elementary toposes, Aarhus Universitet, Lecture Note 30 (1971).
[13] S. Kripke, Semantical analysis of intuitionistic logic I, in: J. N. Crossley and M. A. E. Dummett, eds., Formal Systems and Recursive Functions (North-Holland, Amsterdam, 1965) pp. 92-130.
[14] F. W. Lawvere, An elementary theory of the category of sets, Proceedings of the National Academy of Sciences of the USA 51 (1964) 1506-1510.
[15] F. W. Lawvere, Quantifiers and sheaves, Actes du Congrès International des Mathématiciens 1970, tome 1, p. 329.
[16] F. W. Lawvere, Introduction, in: F. W. Lawvere, ed., Toposes, Algebraic Geometry and Logic, Lecture Notes in Mathematics, Vol. 274 (Springer, Berlin, 1972).
[17] Saunders Mac Lane and Garrett Birkhoff, Algebra (Macmillan, New York, 1967).
[18] Saunders Mac Lane, Categories for the Working Mathematician (Springer, Berlin, 1971).
[19] William Mitchell, Boolean topoi and the theory of sets, Journal of Pure and Applied Algebra 2 (1972) 261-274.
[20] Gerhard Osius, Categorical set theory: a characterization of the category of sets, Journal of Pure and Applied Algebra 4 (1974) 79-119.
[21] Robert Paré, Colimits in topoi, Bulletin of the American Mathematical Society 80 (1974) 556-561.
[22] Bodo Pareigis, Categories and functors (Academic Press, New York, 1970).
[23] Gonzalo Reyes, From sheaves to logic, to appear.
[24] E. Specker, Zur Axiomatik der Mengenlehre (Fundierungs- und Auswahlaxiom), Zeitschrift für Mathematische Logik und Grundlagen der Mathematik 3 (1957) 173-210.
[25] Richard G. Swan, The Theory of Sheaves, Chicago Lectures in Mathematics (The Univ. of Chicago Press, Chicago, 1964).
[26] S. Tennenbaum, Souslin’s problem, Proceedings of the National Academy of Sciences of the USA 59 (1968) 60-63.
[27] Myles Tierney, Sheaf theory and the continuum hypothesis, in: Toposes, Algebraic Geometry and Logic, Lecture Notes in Mathematics, Vol. 274 (Springer, Berlin, 1972) pp. 13-42.
CONTINUOUSLY VARIABLE SETS; ALGEBRAIC GEOMETRY = GEOMETRIC LOGIC
F. William LAWVERE University of Perugia, Perugia, Italy
The (elementary) theory of topoi, the fundamentals of which were outlined in Prof. Mac Lane’s talk at this colloquium (see also [6, 12, 13]), is a basis for the study of continuously variable structures, as classical set theory is a basis for the study of constant structures. The need for the autonomous development of such a theory may be doubted in view of the existence of representations of a variable structure (e.g. a vector bundle or a family of curves) in terms of a domain of variation (considered as a constant structure such as a topological space) and a succession of constant structures, one for each ‘point’ in the domain of variation. But there is an analogy here with the notion of variable quantity, a notion which was taken quite seriously by the founders of analysis and which has not been ‘eliminated’ by set theory any more than continuity has been eliminated by the ‘arithmetization of analysis’ (which is just that and not analysis itself). As Engels remarked in the period when set theory and the arithmetization of analysis did not yet dominate mathematical thinking, the introduction of the advance from constant quantities to variable quantities is a mathematical expression of the advance from metaphysics to dialectics, but many mathematicians continued to work in a metaphysical way with methods which had been obtained dialectically (Anti-Dühring, in the section on Quantity and Quality). The existence of a representation of a commutative ring of variable quantities in terms of functions on its spectrum does not eliminate the need for the theory of commutative rings (and indeed one of the ways of accounting for the differential structure of the variable quantities is precisely through the use of rings with nilpotent elements for which such representation in its classical form is not
faithful). There are also useful concepts of variable quantity such as Schwartz distributions or Sato hyperfunctions, in which the ‘domain of variation’ is clearly ordinary space but just as clearly not the ‘points’ in it. The characterization of motion as the presence of the same body in two places at the same time is only an irresolvable contradiction if we ignore that the metaphysical opposition between points and neighborhoods (introduced by the Platonic deification of points and revived by set theory) is not maintained in the practice even of mathematics. As Lenin affirmed in his Conspectus of Hegel’s Lectures on the History of Philosophy (in the section on the Eleatic School) it is that characterization of motion which correctly expresses the continuity of time and space, whereas the concept of motion as the presence of a body in one place at one time, in another place at a later time, describes only the result of motion and does not contain an explanation of motion itself. Every notion of constancy is relative, being derived perceptually or conceptually as a limiting case of variation¹ and the undisputed value of such notions in clarifying variation is always limited by that origin. This applies in particular to the notion of constant set, and explains² why so much of naive set theory carries over in some form into the theory of variable sets. Our inversion of the old theoretical program of modeling variation within eternal constancy has something in common with that of the intuitionists, though we consider variation generally, not only variation of mathematical knowledge; the internal logic of a topos is always concentrated in a Heyting algebra object. If this object happens to be Boolean, then the variation of the sets is (constant or) random in the sense that for every part $b$ of the domain of variation the topos splits as a full product $\mathcal{E} = \mathcal{E}/b \times \mathcal{E}/b'$, i.e., any motion over $b$ and any motion over the complementary part $b'$ can be combined into a total motion admitted by $\mathcal{E}$, whereas for most topoi there is a continuity condition at the boundary of $b$; this is of course analogous to the contrast between continuous and measurable variable quantities. There is a more profound connection than analogy between structure and quantity, as also was pointed out at this colloquium by Prof. Bernays. The primary subject matter of mathematics is the variation of quantity in time and space, but also this primacy has the nature of a first approximation, not only because occurring systems of quantities have structure, but also because of the fact that each material quantity is a quantity of something and hence has its own particular structure which we can hope to clarify mathematically. Thus what I want to emphasize here about the theory of topoi is that it allows the passage from constant to variable sets (and back) and is a basis for studying relationships between (variable) quantities and (variable) structures. Since the theory arose from geometry and permits a deepening of analysis, it is striking that the axioms we arrived at are essentially (a categorical formulation of) the logician’s definition of analysis: higher-order number theory. No general axiom of extensionality can be assumed but for a particular topos we may be able to discover a particular generalization of extensionality which is applicable (leading to a representation of the objects as sheaves) and if the topos is defined over a topos of constants in which the axiom of choice is valid, there may be enough points (leading to a representation of the objects as families of constant objects; this representation will however not account for the morphisms between objects without the further information of a left exact ‘comonad’, which generalizes the fact that the ‘continuity’ of a classical sheaf in its espace étalé representation is not a property of the family of stalks but is the further information of a specified topology on their sum, and that morphisms between sheaves are represented only by those families of morphisms of constant objects which preserve these specified topologies).
¹ This remark is also relevant to non-standard analysis [4], which can also be clarified by topoi [11].
² ‘Limited by that origin’ has also a positive aspect.
The close connection of the axiom of choice with the existence of points (primes) in algebraic geometry as well as with the existence of models in logic (below we will point out that models are points and show how both Krull’s Theorem and the Gödel-Henkin-Kripke completeness theorem follow from Deligne’s theorem on coherent topoi) is especially striking when we notice (Diaconescu) that the axiom of choice (in the form that all epimorphisms split) implies the law of the excluded middle and hence implies the constancy-randomness of sets as pointed out above; the falsity of the Sierpiński-Banach-Tarski paradox in the world is doubtless connected with the fact that material bodies are varying in a non-random fashion, and for similar reasons it is idealism to claim that something exists in the real world because its theory is consistent, though of course the claim might be defended for a world of eternal thought. In order to extend the realm of direct applicability of the theoretical experience of set theory, part of our programme is the development of
mathematics over an arbitrary base topos (in particular one without the axiom of choice); a simple and beautiful example of this (discussed later in this paper) is a construction due to Joyal of the spectrum of a commutative ring without any use of primes (correcting an error in my paper, written in Nice, in which I mistakenly thought that enough such internal points would exist if only an intuitionistic definition was taken). Part of what follows was developed in discussions with and in unpublished lectures by André Joyal, Gonzalo Reyes, Jean Giraud, and Gavin Wraith. The fact that the axiom of choice implies the law of the excluded middle does not mean that intuitionistic analysis is inconsistent, although the following simple argument, starting from del Ferro’s theorem, might seem to show at first that it is:
$$\forall y\ \exists x\ [\,y = x(x^2 - 3)\,],$$
$$\exists f\ \forall y\ [\,y = f(y)(f(y)^2 - 3)\,],$$
$f : R \to R$ is continuous (since by Brouwer all functions are). However, there is no such continuous $f$, although there is a covering of $R$ by two open intervals $(-\infty, 1)$, $(-1, \infty)$ on each of which continuous functions $f_-$, $f_+$ can be defined which satisfy the equation. There are at least two lessons to be drawn from this: The choice ‘functions’ in intuitionism are not functions, i.e., do not preserve equivalence of Cauchy sequences though they are functions at the level of Cauchy sequences; this suggests a weaker ‘axiom of choice’ which is valid for the topos of sheaves on a zero-dimensional space (such as $N^N$), namely that (although not every object is projective) every object is the epimorphic image of some projective object (the ‘some’ could even be replaced by a functor); however for sheaves on a space, the space would at least have to be connected, no matter which definition of real-numbers object is taken, if Brouwer’s theorem that the real numbers object is not the union of two proper disjoint subobjects is to hold. Kripke’s method of modelling intuitionistic logic in a presheaf category, in which existential quantification commutes with evaluation at the ‘stages’, will probably not work for higher-order logic; the second lesson is that the more general ‘commutation relation’ for existential quantification involves passing to a covering. The latter is typical for general sheaf categories, as we will now explain more precisely for a more general class of categories.
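The two local solutions of $y = x(x^2 - 3)$ mentioned above can be computed numerically. The sketch below is an illustration under assumptions of our own: the names `g`, `f_minus`, `f_plus` and the bisection routine are introduced here, and the branches are taken on the slightly larger intervals $y < 2$ and $y > -2$ on which the outer branches of the cubic are monotone.

```python
def g(x):
    """The cubic x(x^2 - 3); increasing for x <= -1 and for x >= 1."""
    return x * (x * x - 3.0)

def bisect_increasing(y, lo, hi, iters=200):
    """Solve g(x) = y on [lo, hi], assuming g is increasing there
    and g(lo) < y <= g(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def f_minus(y):
    """Continuous local solution for y < 2: the root lying below -1."""
    if not y < 2:
        raise ValueError("f_minus is only defined for y < 2")
    lo = -2.0
    while g(lo) >= y:          # widen the bracket until g(lo) < y
        lo *= 2.0
    return bisect_increasing(y, lo, -1.0)

def f_plus(y):
    """Continuous local solution for y > -2: the root lying above 1."""
    if not y > -2:
        raise ValueError("f_plus is only defined for y > -2")
    hi = 2.0
    while g(hi) <= y:          # widen the bracket until g(hi) > y
        hi *= 2.0
    return bisect_increasing(y, 1.0, hi)
```

On the overlap of the two intervals the branches disagree (at $y = 0$ they give $-\sqrt{3}$ and $+\sqrt{3}$), so they do not patch together to a single continuous $f$; a solution exists on each member of the covering but not globally.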
It is a theorem (Mikkelsen and Paré, unpublished) that any topos (i.e., any category having finite inverse limits, a ‘function space’ functor $(\cdot)^A$ right adjoint to each Cartesian product functor $A \times (\cdot)$, and a ‘truth-value’ object $\Omega$ uniquely classifying arbitrary subobjects of an arbitrary object $A$ by characteristic functions $A \to \Omega$) also satisfies the following conditions characteristic of a pretopos: Besides a terminal object 1 and pullbacks, there exist a coterminal object 0 and finite coproducts (denoted by +) and these are preserved by pullback in the sense that for any $X \xrightarrow{f} Y$, $f^*(0_Y) = 0_X$ and if $A_1 + A_2 \to Y$, then $f^*(A_1 + A_2) = f^*(A_1) + f^*(A_2)$ as objects over $X$; every equivalence relation $E \rightrightarrows A$ on an object $A$ may be obtained by pulling back some $A \to B$ against itself; every morphism $A \to Y$ may be factored uniquely into an epimorphism followed by a monomorphism $A \twoheadrightarrow I \rightarrowtail Y$ and (especially important) this factorization is preserved on pulling back along any $X \to Y$. The subobject $I \rightarrowtail Y$ is called the image of $A \to Y$, and if $A_1 \rightarrowtail Y$, $A_2 \rightarrowtail Y$ are two subobjects, then the image of $A_1 + A_2 \to Y$ may be denoted by $A_1 \cup A_2 \rightarrowtail Y$. It follows that every epimorphism $A \to B$ is the coequalizer of the equivalence relation it induces and that every equivalence relation (though unlike for a topos, not necessarily every pair $A' \rightrightarrows A$ in a pretopos) has a coequalizer; there is clearly the derived rule $f^*(A_1 \cup A_2) = f^*A_1 \cup f^*A_2$. A typical example of a pretopos may be constructed as follows: Take any many-sorted first order theory involving at least the connectives $=, \wedge, \vee, \exists$ (logical equivalences and entailments may be used as axioms but not necessarily in formulas); let the objects of the category be arbitrary formulas of the theory and let the morphisms be (provable equivalence classes of) relations $A \xrightarrow{F} B$ which are provable functions, i.e.,
$$F(a,b) \vdash A(a), \quad F(a,b) \vdash B(b), \quad \exists a\,[F(a,b) \wedge F(a,b')] \vdash b = b', \quad A(a) \vdash \exists b\, F(a,b),$$
where $a$, $b$, $b'$ are appropriate vector variables; adjoin coproducts and quotients formally if necessary. Then $A \xrightarrow{F} B$ is an epimorphism iff $B(b) \vdash \exists a\,[F(a,b)]$ and in general all existential quantifiers and images can be expressed in terms of each other. The fact that monic-epics in a pretopos are isomorphisms means that unique existence implies actual existence, but in general the axiom of choice fails because there are not enough constants. This last remark is connected with the non-trivial ‘commutation rule’ for existential quantification which we will now make more precise with the aid of an auxiliary class of objects.
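The stability of the image factorization under pullback can be checked directly in the simplest pretopos, that of finite sets. The Python sketch below is only an illustration under assumptions of our own: functions are represented as dictionaries, and the names `image`, `pullback_projection`, `inverse_image`, `is_epi` and the sample maps `f`, `g` are hypothetical.

```python
def image(f):
    """Image of a function given as a dict: the subobject I of Y."""
    return frozenset(f.values())

def pullback_projection(g, f):
    """Pull f : A -> Y back along g : X -> Y.

    The pullback object is {(x, a) | g(x) = f(a)}; the returned dict is
    its projection to X, i.e. the pulled-back morphism into X."""
    return {(x, a): x for x in g for a in f if g[x] == f[a]}

def inverse_image(g, I):
    """Pull the subobject I of Y back along g : X -> Y."""
    return frozenset(x for x in g if g[x] in I)

def is_epi(f, Y):
    """In finite sets, f : A -> Y is an epimorphism iff its image is all of Y."""
    return image(f) == frozenset(Y)

# Hypothetical sample data with Y = {1, 2, 3}; the point 3 is not hit by f.
Y = {1, 2, 3}
f = {"a1": 1, "a2": 1, "a3": 2}             # A -> Y
g = {"x1": 1, "x2": 3, "x3": 2, "x4": 3}    # X -> Y

image_of_pullback = image(pullback_projection(g, f))
pullback_of_image = inverse_image(g, image(f))
```

Both constructions give the same subobject of $X$, namely the set of $x$ for which there exists $a$ with $f(a) = g(x)$; this is the set-theoretic shadow of the statement that existential quantifiers and images can be expressed in terms of each other.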
Let $\mathcal{U}$ be any class of objects in a pretopos $\mathcal{E}$. $\mathcal{U}$ could be the class of all objects for the purpose of the following proposition, but for example a useful condition on a topos $\mathcal{E}$ is that there exists a single object the class of whose subobjects forms a suitable class $\mathcal{U}$. The objects of $\mathcal{U}$ may be variously interpreted as stages of subjective or objective time, as open sets, as rings of definition, etc.; any morphism $U \xrightarrow{x} X$ of $\mathcal{E}$ (whose domain $U$ belongs to $\mathcal{U}$) may be called an ‘element of $X$’ defined at (or over) $U$, and if $U' \xrightarrow{t} U$ is any morphism, $U' \xrightarrow{tx} X$ may be interpreted as $x$ restricted to $U'$ or the fate of the element $x$ under the transition $t$. The suitability of $\mathcal{U}$ is expressed by the condition that $\mathcal{U}$ generates $\mathcal{E}$ in the sense that any monomorphism $A \xrightarrow{f} Y$ in $\mathcal{E}$ is an isomorphism provided it is ‘$\mathcal{U}$-surjective’, where a morphism $A \xrightarrow{f} Y$ is called $\mathcal{U}$-surjective iff for any $U \xrightarrow{y} Y$ with $U$ in $\mathcal{U}$ there exists $U \xrightarrow{a} A$ with $af = y$. For a pretopos this suitability is equivalent to another way of expressing $\mathcal{U}$-extensionality: If
$$A \underset{g}{\overset{f}{\rightrightarrows}} Y$$
in $\mathcal{E}$ are such that $af = ag$ for all $U \xrightarrow{a} A$ and all $U$ in $\mathcal{U}$, then $f = g$. In order to state the proposition we need one more definition: a class $\mathcal{A}$ of morphisms all having codomain $A$ (but possibly various domains) is said to cover $A$ iff for any subobject $A' \xrightarrow{i} A$, we have an isomorphism
provided for every $a$ in $\mathcal{A}$ we have $a \in A'$ (in the sense that there exists $a'$ in $\mathcal{E}$ with $a'i = a$). Clearly, if $\mathcal{A}$ is a subclass of $\mathcal{B}$ and covers $A$, then the morphisms in $\mathcal{B}$ with codomain $A$ also cover $A$.
PROPOSITION 1. Let $\mathcal{E}$ be a pretopos and let $\mathcal{U}$ be a class of objects with respect to which $\mathcal{E}$ satisfies $\mathcal{U}$-extensionality and let $X \xrightarrow{f} Y$ be a morphism of $\mathcal{E}$. Let $V \xrightarrow{y} Y$ be an element of $Y$ defined over $V$ in $\mathcal{U}$. Then the validity of the formal condition $\exists x\,[X(x) \wedge xf = y]$ may not imply the existence of $V \xrightarrow{x} X$ with $xf = y$ (even if $V = 1$) but is equivalent with the validity of the categorical condition that the class of all those $U \xrightarrow{a} V$ for which there exists a commutative square
$$\begin{array}{ccc} U & \xrightarrow{\ a\ } & V \\ {\scriptstyle x}\downarrow & & \downarrow{\scriptstyle y} \\ X & \xrightarrow{\ f\ } & Y \end{array}$$
and for which $U$ is in $\mathcal{U}$, covers $V$. The morphism $f$ is an epimorphism iff $Y(y) \vdash \exists x\,[X(x) \wedge xf = y]$ holds formally iff for every $y$ with domain in $\mathcal{U}$, the class of $a$ as above covers $V$, i.e., iff for every such $y$ there is some cover $\mathcal{A}$ of $V$ for each element $(U, a)$ of which $U$ is in $\mathcal{U}$ and the above square can be completed with an $x$. (If we simply took the pullback square, then we would have one epic $a$, but $U$ would usually not be in $\mathcal{U}$.) As another example of Proposition 1, consider the fact that the complex logarithm exists and yet does not exist. Logically speaking this contradiction was solved by passing to a deeper stage of knowledge, geometrically speaking by passing to a covering (the first integer cohomology group shows that the contradiction was not vacuous). Here we take $\mathcal{E}$ as the category of set-valued sheaves on a topological space (such as an open set in the plane) and the class $\mathcal{U}$ of open subsets of the space is suitable; let $X$ be the sheaf of complex-valued continuous functions and $Y$ the sheaf of non-vanishing complex-valued continuous functions and take for $f$ the exponential mapping. Then $f$ is an epimorphism in $\mathcal{E}$ (i.e., $\forall y\, \exists x$ is true with formal variables $x$, $y$); however (taking, say, $V = 1$, i.e., the whole space) for given $y$ (i.e., a non-vanishing function defined on the whole space), there is no $x$ with $xf = y$. But there is an open covering $U_i$ of 1 with $x_i$ such that $x_i f = y|U_i$. A pretopos need not admit an internal universal-quantification operator, but if it does (as for example any topos does) then the ‘commutation relation’ for $\forall$ relative to a suitable $\mathcal{U}$ is the same one familiar from Kripke models and from forcing, i.e., the truth of a universal statement at $V$ involves all elements defined over all $U$ with $U \to V$ and $U$ in $\mathcal{U}$, not only all elements defined over $V$.
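The complex-logarithm example can be made concrete numerically. In the sketch below the space is assumed to be the punctured plane and $y$ is the non-vanishing function $y(z) = z$; the two open sets, the names `log_on_U1`, `log_on_U2`, `winding_number`, and the choice of branches are all assumptions made for the illustration.

```python
import cmath
import math

def log_on_U1(z):
    """A logarithm of z on U1 = plane minus the negative real axis
    (imaginary part in (-pi, pi))."""
    return complex(math.log(abs(z)), cmath.phase(z))

def log_on_U2(z):
    """A logarithm of z on U2 = plane minus the positive real axis
    (imaginary part in (0, 2*pi))."""
    return complex(math.log(abs(z)), cmath.phase(z) % (2 * math.pi))

def winding_number(y, samples=1000):
    """Number of times y(t), for t from 0 to 2*pi, winds around 0,
    obtained by summing small phase increments."""
    total = 0.0
    prev = y(0.0)
    for k in range(1, samples + 1):
        cur = y(2 * math.pi * k / samples)
        total += cmath.phase(cur / prev)
        prev = cur
    return round(total / (2 * math.pi))

def unit_circle(t):
    """The function y(z) = z restricted to the unit circle."""
    return cmath.exp(1j * t)

upper, lower = 1j, -1j   # sample points in the two components of U1 ∩ U2
gap_upper = log_on_U2(upper) - log_on_U1(upper)
gap_lower = log_on_U2(lower) - log_on_U1(lower)
```

The two local logarithms agree on the upper half of the overlap and differ by $2\pi i$ on the lower half, so they cannot be glued to a global $x$; the nonzero winding number of `unit_circle` records the same obstruction, which is the numerical counterpart of the remark about the first integer cohomology group.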
The close relationship between the logical and geometrical ways of solving existential contradictions may be further illuminated by another simple proposition in which we change the global domain of variation (i.e., adjoin a variable ‘constant’). The opposition between global, eternal elements $1 \to X$ and elements $U \to X$ with an arbitrary domain of definition is not metaphysically fixed. Consider the category $\mathcal{E}/U$ whose
objects are ‘over $U$’, i.e., objects of $\mathcal{E}$ equipped with a structural morphism to $U$ and whose morphisms are commutative triangles (i.e., morphisms of $\mathcal{E}$ which respect the structural morphisms). Then the terminal object 1 of $\mathcal{E}/U$ is just the identity morphism of $U$ and if we consider $X_U = U \times X$ as an object of $\mathcal{E}/U$ (namely as the ‘constant’ $X$ vacuously varying over $U$; this may amount to a restriction or an expansion of the original domain of variation according to the particular $U$), then the eternal elements $1 \to X_U$ of $X$ in the sense of $\mathcal{E}/U$ are just the $U \to X$ defined over (or at) $U$ in the sense of $\mathcal{E}$.
PROPOSITION 2. For any object $U$ of $\mathcal{E}$, $\mathcal{E}/U$ is a pretopos if $\mathcal{E}$ is. The functor
$$\mathcal{E} \to \mathcal{E}/U$$
preserves the pretopos structure and the internal universal-quantification operator if $\mathcal{E}$ has it (as well as the topos structure if $\mathcal{E}$ has it). The object $U$ has, after passing into $\mathcal{E}/U$ by the functor, always a canonical eternal element (in terms of $\mathcal{E}$, this global element is just the diagonal map $U \to U \times U$). If $U \to 1$ is an epimorphism in $\mathcal{E}$, then the functor is faithful. Note that $U$ may have had no global elements in $\mathcal{E}$. One geometrical example of the above proposition involves extending the ring of definition; then $U$ is just the spectrum of the extended ring. If we recall the correspondence between pretopoi and many-sorted intuitionistic first-order theories, we see that the above proposition implies the well-known lemma on consistently adjoining constants; namely $U \to 1$ epic just means that $\vdash \exists u\,[U(u)]$ whereas a morphism $1 \to U$ is really a constant satisfying $U$; the construction of $\mathcal{E}/U$ amounts to adjoining the ‘constant’ $d$ and the axiom $\vdash U(d)$ and closing to obtain a theory of the same kind as before (i.e., to maintain the pretopos nature). Recall that a continuous map (geometric morphism) $\mathcal{X} \xrightarrow{f} \mathcal{Y}$ between topoi is just a functor having a left exact left adjoint $f^*$. Given a fixed topos $\mathcal{S}$, by a topos defined over $\mathcal{S}$ is meant a topos $\mathcal{X}$ together with a given continuous map $\mathcal{X} \to \mathcal{S}$, and by a continuous map over $\mathcal{S}$ is meant a continuous map $\mathcal{X} \to \mathcal{Y}$ which commutes up to a given equivalence of functors with the given maps $\mathcal{X} \to \mathcal{S}$, $\mathcal{Y} \to \mathcal{S}$. If
$$\mathcal{X} \underset{g}{\overset{f}{\rightrightarrows}} \mathcal{Y}$$
are two continuous maps over $\mathcal{S}$, then by a morphism $f \to g$ is meant any natural transformation $f^* \to g^*$ which reduces to the identity on $\mathcal{S}$; thus there is a category $\mathrm{Top}_{\mathcal{S}}(\mathcal{X}, \mathcal{Y})$ of continuous maps from $\mathcal{X}$ to $\mathcal{Y}$; seemingly all that can be affirmed about it in general is that it has colimits over filtered category objects from $\mathcal{S}$ (e.g. over directed poset objects in $\mathcal{S}$). In particular, $\mathrm{Top}_{\mathcal{S}}(\mathcal{S}, \mathcal{Y})$, the category of sections of the given structural map of $\mathcal{Y}$, is called the category of ($\mathcal{S}$-valued) points of $\mathcal{Y}$. If $\mathcal{Y}$ is the category of internal $\mathcal{S}$-valued sheaves on a complete Heyting-algebra object in $\mathcal{S}$, the morphisms in the category of points of $\mathcal{Y}$ generalize the usual partial ordering of points of a not-necessarily $T_1$-space. The category of points may be empty, even if $\mathcal{S}$ is ‘the’ category of constant abstract sets; for example, if $\mathcal{Y}$ is the category of sheaves on a complete Boolean-algebra object (in $\mathcal{S}$) which is atomless; of course we can then generalize the ‘point’ analysis of $\mathcal{Y}$ by considering ‘points’ (i.e., continuous maps) of $\mathcal{Y}$ defined over a suitable class of $\mathcal{X}$’s and below we describe a precise theorem of Barr to that effect.
The sense in which a topos $\mathcal{X}$ equipped with a continuous map $\mathcal{X} \xrightarrow{p} \mathcal{S}$ is ‘defined over’ the topos $\mathcal{S}$ has two aspects. The ‘closed category’ aspect (exploited in [15]) is that we can define $\mathcal{X}(A, X) = p_*(X^A)$ for any two objects $A$, $X$ of $\mathcal{X}$, so that the hom sets of $\mathcal{X}$ are not merely abstract sets but are enriched to be objects of $\mathcal{S}$, and in particular we may consider $p_*$ itself as the functor represented by $A = 1$. The other aspect is that for any object $S$ of $\mathcal{S}$ we can define
$$\mathcal{X}^S = \mathcal{X}/p^*(S)$$
and consider the latter as the category of $S$-indexed families of objects of $\mathcal{X}$; such a family (i.e., an object of $\mathcal{X}^S$) should be thought of as ‘$\mathcal{S}$-smoothly’ indexed. Usual formal operations on families remain within these, for example substituting along any ‘change of index set’ $S' \to S$ in $\mathcal{S}$, internal coproduct $\coprod_S$ and product $\prod_S$, etc. One may consider the structure on $\mathcal{X}$ consisting of the notion of families as an atlas on $\mathcal{X}$, with models in $\mathcal{S}$, and these operations on families as coordinate transformations; this notion of atlas will also be
realized for many categories $\mathcal{X}$ over $\mathcal{S}$ which are not topoi, such as the category $\mathrm{ab}(\mathcal{S})$ of abelian group objects in $\mathcal{S}$, the category $\mathrm{top}(\mathcal{S})$ of topological space objects in $\mathcal{S}$, the category $\mathrm{cat}(\mathcal{S})$ of ('small') category objects in $\mathcal{S}$, etc., and in general we should consider between such 'large' categories 'over' $\mathcal{S}$ only those functors which respect the atlas structures (which apparently any functor definable within the set theory $\mathcal{S}$ will automatically do). The general facts about topoi over a base topos $\mathcal{S}$ depend of course on $\mathcal{S}$, but here we will discuss mainly the three cases of any topos $\mathcal{S}$, a topos $\mathcal{S}$ having a natural-numbers object $N$, and any topos $\mathcal{S}$ satisfying the axiom of choice (the conjunction of the last two conditions we may roughly identify with Boolean-valued models $\mathcal{S}$ of Zermelo set theory weakened to bounded comprehension). Naturally, more can be affirmed if we assume that one or both of $\mathcal{X}$, $\mathcal{Y}$ satisfy some 'smallness' condition relative to $\mathcal{S}$; we will consider mainly three such conditions. The first smallness condition reflects a very important construction in the classical sheaf theory, but has not been investigated much in topos theory, and in particular I do not know an internal characterization of those maps $\mathcal{Y} \to \mathcal{S}$ which satisfy it. The external form is this: for every topos $\mathcal{X}$ over $\mathcal{S}$, there is an object $\Gamma_{\mathcal{S}}(\mathcal{X}, \mathcal{Y})$ in $\mathcal{X}$ and an equivalence of categories
\[ \mathrm{Top}_{\mathcal{S}}(\mathcal{X}, \mathcal{Y}) \simeq \mathcal{X}(1, \Gamma_{\mathcal{S}}(\mathcal{X}, \mathcal{Y})) \]
and in particular
\[ \mathrm{Top}_{\mathcal{S}}(\mathcal{X}/X, \mathcal{Y}) \simeq \mathcal{X}(X, \Gamma_{\mathcal{S}}(\mathcal{X}, \mathcal{Y})) \]
for any object $X$ of $\mathcal{X}$. Then $\Gamma_{\mathcal{S}}(\mathcal{X}, \mathcal{Y})$ is determined as the 'sheaf of $\mathcal{Y}$-valued continuous functions (over $\mathcal{S}$) defined in $\mathcal{X}$'. This will exist at least in the case that $\mathcal{Y}$ is the topos of sheaves on a complete Heyting algebra in $\mathcal{S}$ (for example on the open sets of a topological space object in $\mathcal{S}$). The fact that $\Gamma_{\mathcal{S}}(\mathcal{S}, \mathcal{Y})$ is an object of $\mathcal{S}$ is the (quite restrictive) condition that $\mathcal{Y}$ has only a 'set' of points over $\mathcal{S}$; if $\mathcal{S}$ is thought of as the sheaves on a base space more general than a point, it is more usual to call $\Gamma_{\mathcal{S}}(\mathcal{S}, \mathcal{Y})$ the sheaf of sections of $\mathcal{Y} \to \mathcal{S}$ rather than calling it the 'set of points'. Also neglected has been the study of the intrinsic structure of the objects $\Gamma_{\mathcal{S}}(\mathcal{X}, \mathcal{Y})$; for example they have a category structure (it need not reduce to a poset even though it is small; see [1, Vol. I, pp. 479-490 on 'étendues']) and a topology (richer even in the classical case than the
espace étalé topology, which in the intrinsic sense of $\mathcal{X}$ is discrete) derived from the truth-value object of $\mathcal{S}$. For example, if $\mathcal{S}$ has a natural numbers object, then it also has an object $R_{\mathrm{Cauchy}}$ of Cauchy-real numbers, and $\mathcal{Y} = \mathcal{S}/R_{\mathrm{Cauchy}}$ is a topos over $\mathcal{S}$. For any $\mathcal{X}$ over $\mathcal{S}$, $\Gamma_{\mathcal{S}}(\mathcal{X}, \mathcal{Y})$ is then the sheaf of locally constant real-valued functions defined in $\mathcal{X}$; if $\mathcal{X}$ happens to be 'locally connected over $\mathcal{S}$' in the sense that the structural map
\[ \mathcal{X} \to \mathcal{S} \]
is essential, i.e., there is another functor $\mathcal{X} \to \mathcal{S}$ left adjoint to the left adjoint $p^*$, then the sheaf of locally constant real-valued functions in $\mathcal{X}$ is just the object of Cauchy reals calculated in the sense of $\mathcal{X}$. On the other hand, the object of Dedekind reals as calculated in $\mathcal{X}$ will, at least in many cases, be $\Gamma_{\mathcal{S}}(\mathcal{X}, \mathcal{R})$, where $\mathcal{R} = \mathrm{sh}(R, \mathcal{S})$ is the 'usual' category of sheaves over $R \in \mathcal{S}$ with its usual topology; thus the Dedekind reals in $\mathcal{X}$ typically correspond to continuous real functions on a topos $\mathcal{X}$ while the Cauchy reals are only the locally constant ones (even if the two definitions happen to agree in the base topos $\mathcal{S}$). Thus a particular quantity (varying over $\mathcal{S}$) may be represented by a pair of adjoint functors (i.e., continuous map of topoi)
\[ \mathcal{X} \to \mathcal{R} \]
and other kinds of quantities can also be so represented by considering a different topos in place of $\mathcal{R}$. Truth values may be considered a kind of quantity, and the category $\mathcal{S}^{2}$ of morphisms in $\mathcal{S}$ plays the role of the Sierpinski space relative to $\mathcal{S}$ in the sense that continuous maps $\mathcal{X} \to \mathcal{S}^{2}$ correspond to morphisms $1 \to \Omega = \Gamma_{\mathcal{S}}(\mathcal{X}, \mathcal{S}^{2})$ in $\mathcal{X}$.
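As an illustration of the correspondence just stated, consider the classical special case, which is an assumption made here for concreteness and is not part of the discussion above: take the base topos to be the category of sets and the topos over it to be the category $\mathrm{Sh}(T)$ of sheaves on a topological space $T$ (the symbol $T$ is introduced only for this example). The category of morphisms of sets may then be identified with the category of sheaves on the Sierpinski space, and at the level of isomorphism classes of objects the correspondence can be sketched as follows:

```latex
% Assumptions (for illustration only): base topos = Set, X = Sh(T), T a topological space.
% Set^2 denotes the category of morphisms of sets, i.e. sheaves on the Sierpinski space.
\[
\{\text{continuous maps } \mathrm{Sh}(T) \to \mathbf{Set}^{2}\}
\;\longleftrightarrow\;
\{\text{subobjects of } 1 \text{ in } \mathrm{Sh}(T)\}
\;\longleftrightarrow\;
\mathrm{Hom}_{\mathrm{Sh}(T)}(1, \Omega)
\;\longleftrightarrow\;
\{\, U \subseteq T \mid U \text{ open} \,\}
\]
% Here \Omega is the sheaf U \mapsto \{ V \subseteq U \mid V \text{ open} \},
% so a global section of \Omega is exactly an open subset of T.
```

In this special case a 'truth value varying over $T$' is simply an open subset of $T$, in agreement with the classical fact that continuous functions from $T$ to the Sierpinski space correspond to the open subsets of $T$.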
If $\Pi$ is an abelian group in $\mathcal{S}$, we might agree to consider even an element of the cohomology group $H^1(\mathcal{X}, \Pi)$ as a 'quantity varying over $\mathcal{X}$'; at any rate such an element can be represented by a continuous map
\[ \mathcal{X} \to \mathcal{S}^{\Pi} \]
of topoi over $\mathcal{S}$ (where $\mathcal{S}^{\Pi}$ is the category of $\Pi$-sets in $\mathcal{S}$), or again by the structure of a principal homogeneous space in $\mathcal{X}$. We will see below that quite general kinds of structures (e.g., first order structures) varying over $\mathcal{X}$ can also be represented by continuous maps
\[ \mathcal{X} \to \mathcal{E} \]
where $\mathcal{E}$ depends on the particular kind of structure but $\mathcal{X}$ is any topos over $\mathcal{S}$; but since typically there is a 'proper class' of structures of the given kind, there will usually not exist objects I'd