STUDIES IN LOGIC AND
THE FOUNDATIONS OF MATHEMATICS
Editors L. E. J. BROUWER, Laren (N.H.)
A. HEYTING, Amsterdam A. ROBINSON, Los Angeles P. SUPPES, Stanford
Advisory Editorial Board Y. BAR-HILLEL, Jerusalem K. L. DE BOUVERE, Amsterdam
H. HERMES, Münster i/W. J. HINTIKKA, Helsinki A. MOSTOWSKI, Warszawa J. C. SHEPHERDSON, Bristol E. P. SPECKER, Zürich
NORTH-HOLLAND PUBLISHING COMPANY AMSTERDAM
FORMAL SYSTEMS AND RECURSIVE FUNCTIONS PROCEEDINGS OF THE EIGHTH LOGIC COLLOQUIUM OXFORD, JULY 1963
Edited by
J. N. CROSSLEY Fellow of St. Catherine's College, Oxford and
M. A. E. DUMMETT Fellow of All Souls' College, Oxford
1965
NORTH-HOLLAND PUBLISHING COMPANY AMSTERDAM
No part of this book may be reproduced in any form by print, microfilm or any other means without written permission from the publisher
PRINTED IN THE NETHERLANDS
PREFACE
In July 1956 the first Logic Colloquium was held in Oxford thanks to the efforts of Professor A. Prior. It was a fairly small gathering. Since that time, however, the Colloquium has grown considerably. In 1963 the second colloquium to have a larger international membership took place and there were nearly 100 logicians present. The Colloquium was recognized as a meeting of the Association for Symbolic Logic. Financial support was provided by NATO and the Colloquium was a NATO Advanced Study Institute with a Symposium on Recursive Functions sponsored by the Division of Logic, Methodology and Philosophy of Science of the International Union of the History and Philosophy of Science. We are grateful to NATO for their generous contribution towards the publication of this book.
MICHAEL DUMMETT
JOHN N. CROSSLEY
Oxford, June 1964
I. FORMAL SYSTEMS
SOME MODAL CALCULI BASED ON IC
R. A. BULL
Wadham College, Oxford, UK
The motivation for this work was an attempt to find modal logics acceptable to intuitionists. While I do not know whether any of the systems described here would in fact satisfy an intuitionist philosopher, they seem to me to have some formal interest. A fundamental difficulty is that the customary modal equalities $M\alpha = NLN\alpha$ and $L\alpha = NMN\alpha$ yield $CNNLpLp$, which is an intuitionistically implausible thesis. Of the two approaches described here, the first accepts this thesis, and the second avoids it by having only L and not M. The first system formalises the position that, while contingent propositions obey intuitionist logic, necessary propositions obey classical logic. I must confess that I know of no philosopher who actually maintains such a view.
The first system is suggested by the Wajsberg completeness proof of S5 with respect to the Henle model (see [4]). It is obtained by adding to IC
$\vdash \alpha \Rightarrow \vdash L\alpha$
$CLpp$
$CLpLLp$
$CLCpqCLpLq$
$ALNLpLp$
$M\alpha =_{df} NLN\alpha$
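The earlier remark that the customary modal equalities yield $CNNLpLp$ can be checked in a few steps. The following is only a sketch of one possible derivation, which the text does not spell out; it assumes that the equality $L\alpha = NMN\alpha$ licenses replacement and uses the intuitionistically valid law $CNNNqNq$.

```latex
\begin{align*}
&(1)\quad Lp = NMNp              && \text{the equality } L\alpha = NMN\alpha \text{ with } \alpha = p \\
&(2)\quad NNLp = NNNMNp          && \text{from (1), prefixing } NN \\
&(3)\quad CNNNMNpNMNp            && \text{instance of } CNNNqNq \text{ with } q = MNp \\
&(4)\quad CNNLpNMNp              && \text{from (2), (3)} \\
&(5)\quad CNNLpLp                && \text{from (4), (1)}
\end{align*}
```

On this reading only the second equality is needed, which fits the observation that a system having L but not M can avoid the thesis.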
(cf. the Gödel axioms for S5 given on p. 312 of [3]). The model I use for this system has elementsxc., 02' 03' •.. ), where each Gi is an element of i) x ,1) with ¢, ¢, ¢, ... ) designated. The values of A, K, C, and N are found by applying i) x to the corresponding terms of the sequences. The word $L\alpha$ has value x as follows: PROOF.
(1) For each i the sub-tree of X' under (i, n) is a copy of Y.
(2) X' is the closure of these points.
Allocate value $P_j'$ to each $p_j$: it will be found by induction on the length of words that $\alpha$ gets value A' and $\beta$ gets value B'. Clearly A' is empty but B' is not, so this allocation verifies $\alpha$ but rejects $\beta$.
References
[1] M. A. E. Dummett and E. J. Lemmon, Modal Logics between S4 and S5. Zeitschr. f. math. Logik und Grundlagen d. Math. 5 (1959) 250-264.
[2] J. C. C. McKinsey and Alfred Tarski, On Closed Elements in Closure Algebras. Annals of Math. 47 (1946) 122-162.
[3] A. N. Prior, Formal Logic, 2nd ed. (Oxford 1962).
[4] M. Wajsberg, Ein erweiterter Klassenkalkül. Monat. f. Math. Phys. 40 (1933) 113-126.
THE LOGIC OF INTERROGATIVES
M. J. CRESSWELL
Victoria University of Wellington, New Zealand
This paper attempts to give a formal analysis of a concept which bears many resemblances to "p is the answer to d" where p is a statement and d a question. Let Spd represent this concept. The sense of "answer" which it characterizes is such that the following laws hold:
A1 $(Spd \,.\, Sqd) \strictif (p = q)$
Every question has at most one answer. This will exclude answers which give too much information. E.g., if I ask "Is it raining?" and receive the reply, "No it isn't; in fact the sun is shining," then this would usually count as an answer. But in the sense of "answer" characterized by S the only correct answer to "Is it raining?" would be, "It is not raining" though the statement, "It is not raining and the sun is shining" obviously entails the answer to "Is it raining?"
A2 $(\exists p)Spd$
Every question has at least one answer.
A3 $Spd \strictif p$
The answer to a question is true. S then designates the true or correct answer to a question.
A4 $[\square(p \equiv q) \,.\, Spd] \strictif Sqd$
If a statement is the answer to a question then any statement logically equivalent to it answers that question. This does make S diverge from many ordinary senses of "answer". It is a sense which, e.g., is of no use to the mathematician, since all mathematical truths are logically equivalent. But it does represent "answer" in that given a p which answers d we can deduce an answer which satisfies any stricter criterion for "answer".
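The laws A1 to A4 can be tried out on a small finite model. The sketch below is an illustration under assumptions that are not in the paper: a proposition is taken to be the set of worlds where it is true, a question is taken to be a partition of the worlds into its possible answers, and the names `WORLDS`, `S` and `check_A1_to_A3` are hypothetical. A4 holds trivially here because logically equivalent propositions are the same set.

```python
from itertools import chain, combinations

# Toy model (an assumption for illustration, not part of the paper):
# a proposition is the set of worlds where it is true, and a question
# is a partition of the worlds into its possible answers.
WORLDS = frozenset(range(4))

def propositions(worlds):
    """Every subset of the worlds, i.e. every proposition of the model."""
    items = sorted(worlds)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(s) for s in subsets]

def S(p, d, w):
    """Spd evaluated at world w: p is the cell of d that contains w."""
    return p in d and w in p

def check_A1_to_A3(d):
    """A1: at most one answer; A2: at least one; A3: the answer is true."""
    for w in WORLDS:
        answers = [p for p in propositions(WORLDS) if S(p, d, w)]
        if len(answers) != 1 or w not in answers[0]:
            return False
    return True

rain = frozenset({0, 1})                        # worlds where it is raining
is_it_raining = frozenset({rain, WORLDS - rain})
not_raining_and_sunny = frozenset({2})          # entails "not raining", says more
```

In this model the over-informative reply `not_raining_and_sunny` entails the answer at world 2 but is not itself the answer, which matches the restriction described under A1.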
There are cases where the same statement answers different questions. If I ask "Are both Mary and John here?" and someone else asks "Which of Mary and John are here?" then if both Mary and John are here the answer to both questions is "Both Mary and John are here" (though in the first case we would express it by saying "yes" and in the second by saying "both".) But they are different questions; for suppose Mary is here but John isn't. Then the answer to the first is, "Mary and John are not both here" and the answer to the second is "Mary is here but John isn't." (This last is an "answer" to the first question which gives too much information.) Identity of questions is defined,
Def= $(d = e) =_{df} (p)\,\square(Spd \equiv Spe)$
The following question-forming operator on questions is such that $(d \overset{Q}{\vee} e)$ is a question whose answer is the conjunction of the answer to d and the answer to e. If I ask, "Which of Mary and John is here?" then the possible answers are,
"John is here and Mary is here"
"John is here but Mary isn't"
"Mary is here but John isn't"
"Neither Mary nor John is here"
and these are the only possible answers. The axioms for $\overset{Q}{\vee}$ are:
A5 $(Spd \,.\, Sqe) \supset S(p \,.\, q)(d \overset{Q}{\vee} e)$
A6 $(d \overset{Q}{\vee} e) = (e \overset{Q}{\vee} d)$
A7 $((d \overset{Q}{\vee} e) \overset{Q}{\vee} f) = (d \overset{Q}{\vee} (e \overset{Q}{\vee} f))$
A8 $(d = e) \supset [(f \overset{Q}{\vee} d) = (f \overset{Q}{\vee} e)]$
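The axioms A5 to A7 can be checked on the Mary and John example. The sketch below again assumes a toy model that is not in the paper: questions are partitions of a finite set of worlds, and the hypothetical function `q_or` forms the question whose cells are the non-empty intersections of the cells of its arguments.

```python
# Toy partition model (an assumption made for illustration only).
WORLDS = frozenset(range(4))

def cell(d, w):
    """The answer to question d at world w: the cell of d containing w."""
    return next(p for p in d if w in p)

def q_or(d, e):
    """Question whose answer is the conjunction of the answers to d and e."""
    return frozenset(p & q for p in d for q in e if p & q)

def check_A5_to_A7(d, e, f):
    """A5: answers conjoin; A6: commutativity; A7: associativity."""
    a5 = all(cell(q_or(d, e), w) == (cell(d, w) & cell(e, w)) for w in WORLDS)
    a6 = q_or(d, e) == q_or(e, d)
    a7 = q_or(q_or(d, e), f) == q_or(d, q_or(e, f))
    return a5 and a6 and a7

mary = frozenset({0, 1})                  # worlds where Mary is here
john = frozenset({0, 2})                  # worlds where John is here
Q_mary = frozenset({mary, WORLDS - mary})
Q_john = frozenset({john, WORLDS - john})
Q_other = frozenset({frozenset({0, 3}), frozenset({1, 2})})
which_are_here = q_or(Q_mary, Q_john)     # four possible answers
```

A8 needs no separate check in this model, since identical questions are the same partition.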
The simplest kind of question is the yes/no question, i.e. the question which has the form "Is it the case that p?". If I ask "Is it raining?" there are only two possible answers,
"It is raining"
"It is not raining"
Letting Qp = "Is it the case that p?" we have
A9 $SpQp \vee S{\sim}pQp$
from which we may prove (using A3) $SpQp \equiv p$, $S{\sim}pQp \equiv {\sim}p$, and $Qp = Q{\sim}p$. But yes/no questions form only a subclass of questions. A much more extensive class may be formed by introducing a question-forming quantifier. We use (Qa)A(a) to mean, "for which of the a's does A hold?" where a represents any variable and A a wff in which a occurs. I might ask,
$(Qp)fp$, for which p does the propositional function f hold?
or
$(Qx)\varphi x$, which x's $\varphi$?
or
$(Q\varphi)\varphi x$, what properties does x possess? etc.
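The consequence $SpQp \equiv p$ stated above for yes/no questions can be obtained in a few lines. The following is a sketch of one possible derivation, which the paper does not give; it uses only A9 and the material consequences of A3.

```latex
\begin{align*}
&(1)\quad SpQp \supset p                 && \text{A3 with } d = Qp \\
&(2)\quad S{\sim}pQp \supset {\sim}p     && \text{A3 with } {\sim}p \text{ for } p,\ d = Qp \\
&(3)\quad p \supset {\sim}S{\sim}pQp     && \text{contraposition of (2)} \\
&(4)\quad SpQp \vee S{\sim}pQp           && \text{A9} \\
&(5)\quad p \supset SpQp                 && \text{from (3), (4)} \\
&(6)\quad SpQp \equiv p                  && \text{from (1), (5)}
\end{align*}
```

The same steps with the roles of (1) and (2) exchanged give $S{\sim}pQp \equiv {\sim}p$.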
The answer to (Qa)A(a) will be the true conjunction of $A(a_n)$'s or ${\sim}A(a_n)$'s for every $a_n$. To express this formally we express what it is for a proposition p to entail and be entailed by such a conjunction. p entails such a conjunction iff,
$(a)[p \strictif A(a) \;.\vee.\; p \strictif {\sim}A(a)]$
p is entailed by such a conjunction iff it is entailed by every q which entails such a conjunction (since among these q will be the conjunction itself) i.e. iff
$(q)\{(a)[p \strictif A(a) \,.\supset.\, q \strictif A(a) \;:.:\; p \strictif {\sim}A(a) \,.\supset.\, q \strictif {\sim}A(a)] \supset (q \strictif p)\}$
p is the answer to (Qa)A(a) if both these are satisfied and p is true.
A10 $p \,.\, (a)[p \strictif A(a) \;.\vee.\; p \strictif {\sim}A(a)] \,.\, (q)\{(a)[p \strictif A(a) \,.\supset.\, q \strictif A(a) \;:.:\; p \strictif {\sim}A(a) \,.\supset.\, q \strictif {\sim}A(a)] \supset (q \strictif p)\} \;:\equiv:.\; Sp(Qa)A(a)$
If we consider the class of questions definable in terms of (Qa) and restrict our logic to these we can define the expression (Sa)pA(a) (i.e. "p is the answer to the question 'which a's A?'") as:
$p \,.\, (a)[p \strictif A(a) \;.\vee.\; p \strictif {\sim}A(a)] \,.\, (q)\{(a)[p \strictif A(a) \,.\supset.\, q \strictif A(a) \;:.:\; p \strictif {\sim}A(a) \,.\supset.\, q \strictif {\sim}A(a)] \supset (q \strictif p)\}$
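The three conjuncts of this definition can be evaluated directly on a finite model. The sketch below rests on assumptions that are not in the paper: propositions are sets of worlds, strict implication is read as set inclusion, and `A` is a hypothetical dictionary mapping each value of a to the proposition A(a). It compares the propositions satisfying the definition with the conjunction of the true A(a)'s and negated A(a)'s.

```python
from itertools import chain, combinations

# Toy model (an assumption): propositions are sets of worlds and strict
# implication is set inclusion.
WORLDS = frozenset(range(4))

def propositions(worlds):
    """Every subset of the worlds."""
    items = sorted(worlds)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(s) for s in subsets]

def entails(p, q):
    """p strictly implies q: q holds in every world where p holds."""
    return p <= q

def answer(A, w):
    """Conjunction, over all a, of A(a) or its negation, whichever is true at w."""
    p = WORLDS
    for prop in A.values():
        p = p & (prop if w in prop else WORLDS - prop)
    return p

def satisfies_definition(p, A, w, props):
    """The three conjuncts of the definition of (Sa)pA(a), evaluated at w."""
    is_true = w in p
    decides = all(entails(p, X) or entails(p, WORLDS - X) for X in A.values())

    def agrees(q):
        # q entails every A(a) or negated A(a) that p entails
        return all((not entails(p, X) or entails(q, X)) and
                   (not entails(p, WORLDS - X) or entails(q, WORLDS - X))
                   for X in A.values())

    entailed = all(entails(q, p) for q in props if agrees(q))
    return is_true and decides and entailed

A = {"mary": frozenset({0, 1}), "john": frozenset({0, 2})}
PROPS = propositions(WORLDS)
```

In this small example exactly one proposition satisfies the definition at each world, namely the one returned by `answer`.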
A wider class of questions can be considered by defining the schema $Sp(Qna)[A_1(a), \ldots, A_n(a)]$ where $(Qna)[A_1(a), \ldots, A_n(a)]$ would be equivalent to $(Qa)A_1(a) \overset{Q}{\vee} \cdots \overset{Q}{\vee} (Qa)A_n(a)$. It is then possible, substituting $(Qa)A_1(a)/d$, $(Qa)A_2(a)/e$, etc., to prove as theorems the equivalents of A1-A10 by means of this definition.
This method is sufficient to define, in a modal system with quantification, the concept "p is the answer to d" (in the sense of "answer" outlined) whenever the question d can be interpreted as asking, "Which a's A?" I.e. it is sufficient for questions where a set of possible answers is so definable that the true member of the set is the answer to the question. If we can interpret questions like "When ----?" as asking, "At what time ----?", "Why ----?" as asking, "For what reason ----?", "How ----?" as asking "By what means ----?", etc. then the account given will suffice as an account of question logic in respect of "p is the answer to d".
References
[1] C. L. Hamblin, Questions. Australasian Journal of Philosophy 36 (1958) 159. Discussion-Questions aren't Statements. Philosophy of Science 30 (1963) 62.
[2] David Harrah, Communication: A Logical Model (M.I.T. Press 1963).
[3] Henry S. Leonard, Interrogatives, Imperatives, Truth, Falsity and Lies. Philosophy of Science 26 (1959) 172. A Reply to Professor Wheatly. ibid. 28 (1961) 55.
[4] J. M. O. Wheatly, Note on Professor Leonard's Analysis of Interrogatives. ibid. 28 (1961) 52.
SOME GENERALIZATIONS AND APPLICATIONS OF A RELATIVIZATION PROCEDURE FOR PROPOSITIONAL CALCULI 1)
RONALD HARROP
University of Newcastle upon Tyne, UK
1. Introduction
In an earlier paper [2], a study was made of the possibility of reducing propositional calculi of a general structure to ones with equivalent decision problem of a fairly simple structure. As an application a proof was given of a result, a particular case of which was the theorem of Post which states the undecidability of the problem of testing for completeness of a finite set of tautologies under substitution and detachment. To avoid many technical and notational complexities in the proofs of the relativization results, the theory was in the main developed only far enough to enable the application mentioned above to be made. This had, however, the effect that several restrictions were imposed which were unnatural and which, from the point of view of the truth of the main theorems, were completely unnecessary. It was stated that many of these restrictions would be removed in the present paper. It was also said that further applications of the theory would be given, including the construction of sequences of decidable and of undecidable calculi satisfying certain conditions. The present paper will be found to make frequent reference to [2] not only when use is made of lemmas and theorems proved there but also in order to avoid the necessity of excessive repetition of definitions and of notational conventions introduced there.
1) This paper, together with [2], covers the results outlined in the paper presented to the Logical Colloquium at Oxford in July 1963 under the title: A relativization procedure for propositional calculi and some applications.
2. Generalization of the relativization result
For technical simplicity in the proof of Theorem 2 of [2], a propositional calculus L was considered such that (i) its set of formulae was defined in terms of an infinite set of propositional variables and a finite set of binary connectives and (ii) its provable formulae were obtained using a finite number of axiom schemes and two-premise rules, no axiom scheme or premise or conclusion of a rule being a vaf (variable for arbitrary formula). We generalize the theorem in two stages corresponding roughly to the use instead of L of the calculi K, J described below. For each of K, J it is shown that there is a calculus L with equivalent decision problem, the calculus satisfying the conditions imposed on L in [2] except that if K, or J, contains infinitely many axiom schemes or rules or J has infinitely many constants or connectives, then L also has infinitely many axiom schemes or rules. The corresponding calculus $L^+$ has in these special cases infinitely many axiom schemes.
Let K be such that (i) its formulae are built up from an infinite set of (propositional) variables $a_1, a_2, \ldots$ together with a finite or denumerably infinite set $\alpha_1, \alpha_2, \ldots$ of (propositional) constants, by means of a finite or denumerably infinite set $\tau_1, \ldots, \tau_{s_0}$ or $\tau_1, \tau_2, \ldots$ of connectives. For each i for which $\tau_i$ is a connective, $\tau_i$ shall be $t_i$-ary for some $t_i > 0$. It is important to notice that when we later use the phrases "of $\tau_i$ form", "for some $\alpha_j$" we will automatically assume that we are only concerned with values of i, j for which $\tau_i$ is a connective and $\alpha_j$ a constant. Other similar phrases involving suffixes which belong to a set which may be finite should be similarly construed. Only at a few isolated places in the paper, where special emphasis seemed desirable, is direct reference made to this type of restriction on the values of suffixes. We use $A_1, A_2, \ldots$ to denote vafs of K.
By a formula scheme of K we will mean an expression constructed from vafs and constants by means of connectives. $\mathscr{S}X$ for a formula X and $\mathscr{S}\varphi$, $\mathscr{I}\varphi$ for a formula scheme $\varphi$ will be defined as in [2], it being noticed that substitution can take place for vafs or variables but not for constants. As in [2], we will think of substitution or taking instances as being done simultaneously for all vafs or variables occurring in the formula under consideration. Certain vafs or variables may of
course be subject to an identity substitution, that is, be unchanged. Suppose 0.0' X w are formulae of L.
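The conventions on substitution described above (simultaneous replacement of vafs or variables, never of constants, with unmapped symbols left unchanged) can be sketched as a small data structure. This is an illustration only: the class names `Var`, `Vaf`, `Const`, `App` and the function `substitute` are hypothetical and are not taken from [2].

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union

@dataclass(frozen=True)
class Var:          # propositional variable a_i
    index: int

@dataclass(frozen=True)
class Vaf:          # variable for arbitrary formula A_i
    index: int

@dataclass(frozen=True)
class Const:        # propositional constant alpha_i
    index: int

@dataclass(frozen=True)
class App:          # connective tau_i applied to its arguments
    connective: int
    args: Tuple["Term", ...]

Term = Union[Var, Vaf, Const, App]

def substitute(term: Term, mapping: Dict[Term, Term]) -> Term:
    """Simultaneously replace vafs and variables; constants are never replaced."""
    if isinstance(term, (Var, Vaf)):
        # a symbol missing from the mapping undergoes the identity substitution
        return mapping.get(term, term)
    if isinstance(term, Const):
        return term
    return App(term.connective,
               tuple(substitute(arg, mapping) for arg in term.args))

scheme = App(1, (Vaf(1), App(2, (Vaf(2), Const(1)))))
swapped = substitute(scheme, {Vaf(1): Vaf(2), Vaf(2): Vaf(1)})
```

Because the replacement is simultaneous, exchanging `Vaf(1)` and `Vaf(2)` really swaps them rather than collapsing both into one.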
LEMMA 2: If $\theta$ is a PP formula scheme, then, with the above notation, w, the bracketing to be inserted in (9) to obtain $\theta$, and, under the same substitution 1) as is used in (9), $\mathscr{S}A_1, \mathscr{S}\theta_1, \ldots, \mathscr{S}\theta_w$ are all uniquely determined by $\theta$. A similar result holds for PP formulae.
PROOF. We use induction on the length of $\theta$. It follows from the hypothesis of the lemma that $\theta$ must be of the form $(C\zeta) \square (\zeta_1 \square \zeta_2)$ where $\zeta$ is the formula scheme substituted for $A_1$ under the substitution used in (9). If $\zeta_1$ is $C\zeta$, then w = 1, $\mathscr{S}\theta_1 = \zeta_2$ and no inserted bracketing is required. If $\zeta_1$ is not $C\zeta$, then $(C\zeta) \square \zeta_1$, $(C\zeta) \square \zeta_2$ are PP formula schemes which are shorter than $\theta$. Further, any expression in form (9) of $\theta$ can be considered as trivially arising from similar expressions for these schemes. The required result now follows by suitable use of the induction hypothesis.
We say that a formula scheme of L is of $\tau_i$ form (with the usual restriction, if necessary, on the range of i) if it is of the form $\mathscr{S}B_i[A_1 \square \cdots \square A_{t_i}]^*$ and of $\alpha_i$ form if it is of the form $\mathscr{S}A_iA_1$. The formula scheme $A_i\theta$ will be called the $\theta$-$\alpha_i$ formula scheme.
1) Strictly, the restrictions of the substitution to the vafs in the formula schemes concerned.
LEMMA 3: No formula scheme can be of $\tau_i$ and $\alpha_j$ form, or of $\tau_i$ form for two distinct values of i, or of $\alpha_i$ form for two distinct values of i, or the $\theta$-$\alpha_i$ formula scheme for more than one choice of the ordered pair $(\theta, i)$.
PROOF. Immediate from (2), (3) and the definitions of the terms involved.
The next two lemmas give a result which corresponds to Lemma 6 of [2]. Their proofs are similar to the corresponding parts of the proof of that lemma but there are some additional complications. These originate from the fact that the operator H used in this paper is less simple than the corresponding operator T of [2].
LEMMA 4: Suppose $\psi_1, \ldots, \psi_m$; $\chi_1, \ldots, \chi_m$ are formula schemes of K, $\theta, \theta_1, \ldots, \theta_m$ are formula schemes of L and S', S'' substitutions for vafs of L in each of which $A_1$, if it occurs, is replaced by $\theta$. Suppose further that (i) $\theta_i = \mathscr{S}H\psi_i$, $1 \leq i \leq m$, under S', (ii) $\theta_i = \mathscr{S}H\chi_i$, $1 \leq i \leq m$, under S'', (iii) no vaf in any $H\psi_i$, $H\chi_i$, $1 \leq i \leq m$, except possibly $A_1$, gets
replaced under S', S'' respectively by a $\tau_j$ formula scheme or by the $\theta$-$\alpha_j$ formula scheme for any j in the appropriate ranges, (iv) in S' and also in S'', distinct formula schemes are substituted for distinct vafs $A_j$, $A_k$ ($j \geq 2$, $k \geq 2$). Then $\psi_i$ and $\chi_i$, 1 :ij, $1 \leq j \leq k_i$, under S and (iii) $\theta_j = \mathscr{S}H\psi_j$, $1 \leq j \leq k_i$, under S'. Let $\theta_{k_i+1} = \mathscr{S}H\varphi_i$ under S and let $\psi'_{k_i+1}$ be a variable other than $A_1$ which does not occur in any $\psi_j$, $1 \leq j \leq k_i$, so that we can extend S' so that $\theta_{k_i+1} = \mathscr{S}H\psi'_{k_i+1}$ under S'. By Lemma 5, there exist $\psi^*_1, \ldots, \psi^*_{k_i+1}$ such that (iv) $\psi^*_j = \mathscr{S}\varphi_{ij}$, $1 \leq j \leq k_i$, $\psi^*_{k_i+1} = \mathscr{S}\varphi_i$ under a common substitution, (v) $\psi^*_j = \mathscr{S}\psi_j$, $1 \leq j \leq k_i + 1$ under a common substitution, and (vi) $\theta_j = \mathscr{S}H\psi^*_j$, $1 \leq j \leq k_i + 1$ under a common $A_1$-substitution. It follows from (v) that the $\psi^*_j$, $1 \leq j \leq k_i$, are provable formula schemes of K and hence, using (iv) and (1), that $\psi^*_{k_i+1}$ is provable in K. We can deduce from the definitions of $\theta^*$, $\theta_{k_i+1}$, that $\theta$, which is the conclusion of (18), can be written as $C\theta^* \square (C\theta^* \square \theta_{k_i+1})$. Thus, using the case $j = k_i + 1$ of (vi), $\theta$ can be written as $\mathscr{S}(CA_1 \square (CA_1 \square H\psi^*_{k_i+1}))$ which is of the required form. This completes the proof of Theorem 1.
THEOREM 2: The decision problems of K, L are with standard Gödel numbering primitively recursively equivalent to each other.
PROOF. We first note that both in the calculus K and in the calculus L, a formula is provable if and only if the corresponding formula scheme, obtained by replacing $a_i$ by $A_i$ for all occurring i, is provable. If X is provable in K, then, by Theorem 1 (ii) and (5), $Ca_1 \square (Ca_1 \square HX)$ is provable in L. On the other hand, if $Ca_1 \square (Ca_1 \square HX)$ is provable in L, then so is $CA_1 \square (CA_1 \square H\varphi)$ for the formula scheme $\varphi$ which corresponds to X. Using Theorem 1 (i), it follows that this formula scheme can be written in form (13) for some substitution S and some formula schemes $\psi_1, \ldots, \psi_w$ provable in K. By Lemma 2, w must be 1, S must be an $A_1$-substitution, and, with substitution S, $H\varphi$ must be $\mathscr{S}H\psi_1$. Hence, by Lemma 5, there exists $\psi^*$ such that $\psi^* = \mathscr{S}\varphi$, $\psi^* = \mathscr{S}\psi_1$ and $H\varphi = \mathscr{S}H\psi^*$ under an $A_1$-substitution. Since, by Lemma 1(i), $H\psi^*$ ($= H\mathscr{S}\varphi$) can be written in the form $\mathscr{S}H\varphi$ under an $A_1$-substitution, it follows that $H\varphi$ and $H\psi^*$ are variants of each other, the substitution employed in obtaining $H\varphi$ from $H\psi^*$ and that employed in obtaining $H\psi^*$ from $H\varphi$ being an $A_1$-substitution. A trivial application of Lemma 4, taking m = 1 and $\theta$, $\psi_1$, $\chi_1$, $\theta_1$ as $A_1$, $\varphi$, $\psi^*$, $H\varphi$ respectively, now shows that $\varphi$, $\psi^*$ are variants of each other. Hence $\varphi$ is provable in K. Thus a formula X is provable in K if and only if $Ca_1 \square (Ca_1 \square HX)$ is provable in L.
To complete the proof of the theorem we show that we can determine whether or not a formula P is provable in L either directly or in terms of the provability or otherwise in K of certain effectively obtainable formulae of K the Gödel numbers of which would, under standard Gödel numbering, be primitively recursively bounded in terms of the Gödel number of P. P is provable in L if and only if the corresponding formula scheme $\theta$ is provable in L. Using Theorem 1(iii), the fact that we can determine effectively whether or not a formula scheme is potentially provable, Lemma 2 and Theorem 1(i), we note that we can test whether or not $\theta$ is provable directly, or reduce the problem to determining whether or not certain primitively recursively obtainable formula schemes $\zeta_1, \ldots, \zeta_w$ ($\mathscr{S}\theta_1, \ldots, \mathscr{S}\theta_w$ of Lemma 2) can be expressed as $\mathscr{S}H\psi_1, \ldots, \mathscr{S}H\psi_w$ under a common substitution with a given substitution for $A_1$ ($\mathscr{S}A_1$ of Lemma 2), for some $\psi_1, \ldots, \psi_w$ provable in K. We can assume for simplicity and without loss of generality that there are no variables in common between any $\psi_{i_1}$ and $\psi_{i_2}$ for which $i_1 \neq i_2$. Our problem thus reduces to determining, given a formula scheme $\zeta$ of L, whether or not $\zeta$ can be expressed in the form $\mathscr{S}H\psi$ for some provable formula scheme $\psi$ of K.
Since, for any formula scheme $\psi$ of K with at least one connective, $H\psi$ is longer than $\psi$, and, except for fairly trivial substitutions, $\mathscr{S}H\psi$ is longer than $H\psi$, the consideration of $\zeta$ reduces to the determination as to whether or not at least one of a finite number of primitively recursively obtainable formula schemes of K restricted so that no two are variants of each other, is provable in K. Since such a determination can be made by considering the provability or otherwise in K of the corresponding formulae, this completes the proof of Theorem 2.
Suppose given a calculus K satisfying the conditions imposed at the beginning of this section of the paper. We can construct from it the
calculus L described above and from L can construct, in the manner described in [2], calculi $L'$, $L^*$ (ambiguously denoted by $L^+$) which can, if desired, by taking $\supset$ as the only connective, be made subsystems of positive implicational logic. $L^+$ will have detachment as its only rule. If K has only finitely many axiom schemes and rules, the number of connectives and constants possible being infinite, then L will satisfy the conditions required of that calculus in [2] and so, by Theorem 2 of the present paper and Theorem 2 of [2], the decision problems of K and $L^+$ are equivalent, and in fact are primitively recursively equivalent. If K has infinitely many axiom schemes or rules, then $L^+$, as constructed following the method described in [2], will have respectively infinitely many axioms of type $1^+$, $2^+$ (there will never be infinitely many axioms of type $3^+$ since L has only finitely many connectives). The presence of infinitely many axiom schemes in L does not affect the proofs of Theorems 1, 2 of [2] and we can still assert that the decision problems of K, $L^+$ are primitively recursively equivalent. In cases when there are infinitely many rules in K, and therefore in L, although the proof of Theorem 1 of [2] is unaffected, the proof of Theorem 2 of [2] requires some change. In the consideration of the cases ($\beta^+$), ($\gamma^+$), we have to note that by our assumptions concerning the form of the rules of K, and their effect on the form of the rules of L, we can test recursively whether or not the members of a set of formula schemes form variants, under a common substitution, of the premises and conclusion of some rule $R_i$ of L and, if they are, can determine which rule it is. We can thus still show that the decision problems of K and $L^+$ are recursively equivalent but our proof will not show that they are primitively recursively equivalent. We have thus proved
THEOREM 3: (Extension to K of Theorem 2 of [2]). With the above notation the decision problems of K and $L^+$ are recursively equivalent. $L^+$ has finitely many axiom schemes and rules except in cases when K has infinitely many axiom schemes or rules. $L^+$ has, in such cases, infinitely many axiom schemes.
The calculi so far considered have all had their formulae built up by general iterated application of connectives of definite order to the members of an infinite set of propositional variables together possibly with the members of a set of propositional constants. They have had axiom
schemes which were formula schemes, and rules, the premises and conclusions of which were formula schemes. This has meant that there has been a simple duality between the provability of formulae and of formula schemes in that a formula was provable if and only if its corresponding formula scheme was provable and vice versa, that is, a formula scheme was provable if and only if its corresponding formula was provable. Considerable use has been made of this fact.
To conclude our discussion of systems more general than the system L of [2] to which the relativization process applies, we develop in outline the theory for a system J for which the duality properties are not available. There are other examples of similar type which could have been used instead of J.
Consider a system J which is defined exactly as K except that there shall only be finitely many propositional variables, say $a_1, \ldots, a_q$, and that there shall be a recursivity condition imposed on the axiom schemes of J similar to that imposed on the rules of K, and therefore of J, namely that the axiom schemes shall form a recursive set and that the ith axiom scheme shall involve exactly the vafs $A_1, \ldots, A_{r_i}$ for some $r_i$. We can still use $A_1, A_2, \ldots$ for vafs, there being no need for them to be finite in number. Let $\xi_i$, $R_i$ denote respectively the ith axiom scheme and the ith rule of J (for appropriate values of i).
An example of a system of type J which illustrates the differences which can exist between such a system and one of type K is the following: propositional variables - just $a_1$; propositional constants - none; connectives - just $\supset$ (binary); axiom scheme $A_1 \supset A_1$; rules
Ai (A 3
~
~
(A 3 A z)
~
~
A z)
Ai
It is immediate that the set of provable formulae of the system is the set of formulae of the form $U \supset V$ where U, V are formulae, that is, it is the set of instances of $A_1 \supset A_2$. The formula scheme $A_1 \supset A_2$ is not itself provable. If in the definition of this system, the conclusion of the second rule is replaced by $A_1 \supset A_2$, no change would occur in the set of provable formulae but $A_1 \supset A_2$ would become provable. It will be noticed that the axiom schemes and rules of the system both in original and revised form are independent.
We associate with J a calculus K which satisfies the conditions we
imposed on a calculus denoted by K. Its propositional variables will be $a_1, a_2, \ldots$. It will have the same constants as J, and, for connectives, the connectives of J together with F, G (unary) and * (binary). We define a mapping E from formulae of J to formula schemes of K by
$$\left.\begin{aligned}
Ea_i &= F^iA_1, && 1 \leq i \leq q,\\
E\alpha_i &= \alpha_i, && \text{each constant } \alpha_i \text{ of J},\\
E(\tau_i(U_1, \ldots, U_{k_i})) &= \tau_i(EU_1, \ldots, EU_{k_i}), && \text{each connective } \tau_i \ (k_i\text{-ary}) \text{ and all formulae } U_1, \ldots, U_{k_i} \text{ of J.}
\end{aligned}\right\} \qquad (20)$$
In this definition, $F^iA_1$ is defined by 1)
$$F^1A_1 = FA_1, \qquad F^{c+1}A_1 = F(F^cA_1), \quad \text{all } c \geq 1.$$
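The mapping E is a straightforward structural recursion, and reading a formula of J back off its image is equally mechanical. The sketch below illustrates this under an assumed encoding that is not the author's: formulae are nested tuples tagged `"var"`, `"const"`, `"conn"`, `"F"` and `"vaf"`, and the function names `F_power`, `E` and `E_inverse` are hypothetical. The bound $i \leq q$ on variable indices is not checked.

```python
A1 = ("vaf", 1)   # the vaf A_1 of K

def F_power(c, scheme):
    """F^c applied to scheme: F^1 X = F X and F^(c+1) X = F(F^c X)."""
    if c < 1:
        raise ValueError("the exponent must be at least 1")
    for _ in range(c):
        scheme = ("F", scheme)
    return scheme

def E(u):
    """Map a formula of J to a formula scheme of K as in (20)."""
    kind = u[0]
    if kind == "var":        # E a_i = F^i A_1
        return F_power(u[1], A1)
    if kind == "const":      # E alpha_i = alpha_i
        return u
    if kind == "conn":       # E commutes with the connectives of J
        return ("conn", u[1]) + tuple(E(arg) for arg in u[2:])
    raise ValueError("not a formula of J")

def E_inverse(scheme):
    """Return U with E(U) == scheme, or None if there is no such U."""
    kind = scheme[0]
    if kind == "F":
        depth, inner = 0, scheme
        while inner[0] == "F":           # count the nested occurrences of F
            depth, inner = depth + 1, inner[1]
        return ("var", depth) if inner == A1 else None
    if kind == "const":
        return scheme
    if kind == "conn":
        args = [E_inverse(arg) for arg in scheme[2:]]
        if any(arg is None for arg in args):
            return None
        return ("conn", scheme[1]) + tuple(args)
    return None

u = ("conn", 1, ("var", 2), ("const", 1))   # tau_1(a_2, alpha_1)
```

Since distinct variables go to distinct powers of F on the single vaf $A_1$, the image determines the original formula uniquely.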
K has one axiom scheme corresponding to each variable of J and one corresponding to each constant of J. The axiom schemes corresponding to $a_i$ ($1 \leq i \leq q$) and $\alpha_i$ (relevant i) are respectively (21)
K has one rule corresponding to each connective of J, one corresponding to each axiom scheme of J and one to each rule of J. The rules corresponding, for relevant values of i, to the connective $\tau_i$ ($k_i$-ary), the axiom scheme $\xi_i$ and the rule $R_i$ are respectively
$$\frac{FA_1 * FA_2 \quad \cdots \quad FA_1 * FA_{k_i+1}}{FA_1 * F\tau_i(A_2, \ldots, A_{k_i+1})} \qquad (22)$$
$$\frac{FA_1 * FA_2 \quad \cdots \quad FA_1 * FA_{r_i+1}}{FA_1 * G\xi_i'} \qquad (23)$$
where the vafs occurring in $\xi_i$ are $A_1, \ldots, A_{r_i}$ and $\xi_i'$ is obtained from $\xi_i$ by replacing $A_r$ by $A_{r+1}$ for each r, $1 \leq r \leq r_i$; and
$$\frac{FA_1 * FA_2 \quad \cdots \quad FA_1 * FA_{s_i+1} \quad FA_1 * G\rho_{i1}' \quad \cdots \quad FA_1 * G\rho_{ih_i}'}{FA_1 * G\rho_i'} \qquad (24)$$
where $R_i$ is
$$\frac{\rho_{i1} \quad \cdots \quad \rho_{ih_i}}{\rho_i} \qquad (25)$$
$A_1, \ldots, A_{s_i}$ are all the vafs which occur in $R_i$, and $\rho_{ij}'$ ($1 \leq j \leq h_i$) and $\rho_i'$ are obtained from $\rho_{ij}$ ($1 \leq j \leq h_i$) and $\rho_i$ by replacing $A_r$ by $A_{r+1}$ for each r, $1 \leq r \leq s_i$. The special restriction imposed on the form of the rules of K is automatically satisfied in view of the satisfaction of the corresponding conditions imposed on the form of the axiom schemes and rules of J.
1) Note that F is an explicit unary connective of K, not part of an abbreviated form of an expression involving a binary connective like $\square$ in (2).
We can now prove the following results:
(a) A formula is provable in K if and only if the corresponding formula scheme is provable in K and vice versa.
(b) Given formula schemes $\varphi$, $\psi$ of K, we can effectively determine whether or not $\varphi$ can be expressed in the form $\mathscr{S}EU$ for some formula U of J under a substitution in which $A_1$ if it occurs at all is replaced by $\psi$. If it can be so expressed then U is uniquely determined by $\varphi$, $\psi$ and can be effectively obtained. The substitution will also be fully determined by $\varphi$, $\psi$ since no vaf other than $A_1$ can arise in EV for any formula V of J.
(c) We can effectively determine whether or not a formula scheme of K can be expressed in one of the forms $\mathscr{S}(FA_1 * FEU)$, $\mathscr{S}(FA_1 * GEU)$ for some formula U of J. If it can be so expressed then the form concerned and the formula U involved are uniquely determined and can be effectively obtained.
(d) If U is a formula of J then $FA_1 * FEU$ is a provable formula scheme of K.
(e) If U is a provable formula of J then $FA_1 * GEU$ is a provable formula scheme of K.
(f) If $\varphi$ is a provable formula scheme of K then it must be either of the form $\mathscr{S}(FA_1 * FEU)$ for a formula U of J or of the form $\mathscr{S}(FA_1 * GEU)$ for a provable formula U of J. (This result can be proved by induction on the length of proof of $\varphi$ in K, making considerable use of (c)).
It follows from (a), (c), (d), (e) and (f) that the decision problems for provability of formulae in J and K are primitively recursively equivalent. Hence, using Theorem 3, we obtain the following theorem:
THEOREM 4: (Extension to J of Theorem 2 of [2]). With the above notation the decision problems of J and $L^+$ are recursively equivalent. $L^+$ has finitely many axiom schemes and rules except in cases when J has infinitely many constants, connectives, axiom schemes or rules. In such cases $L^+$ has infinitely many axiom schemes.
3. Iteration of the relativization process - preliminary results

Later in this paper we construct, by iteration of the relativization process which led in [2] from $L$ to $L^{+}$, certain sequences of distinct calculi. The proofs that the calculi are distinct rest on some results, now to be obtained, which concern the relation of the structure of provable formulae of $L^{+}$ to that of provable formulae (if any) of $L$ which have $\supset$ as main connective. We will use, as far as possible and often without comment, definitions and notations used in [2] but, since we are working towards an iterative application of relativization, it will not be possible to keep quite as rigidly as in [2] to the use of particular letters to denote formulae or formula schemes of particular calculi. For example, in [2], formulae such as $TX$, $\mathcal{S}TX$ would be "recognized" as formulae of $L'$, $L^{*}$ and $L^{+}$ even before they were stated to be so. Further, the equation $Q = \mathcal{S}TX$ would look "reasonable" whereas the equation $Y = \mathcal{S}TX$ would look "unreasonable". We now consider explicitly some effects of the possibility of $\supset$ being a connective of $L$ and will on occasions wish to consider expressions such as $TX$, $\mathcal{S}TX$ not only as formulae of $L'$, $L^{*}$ and $L^{+}$ but also as formulae of $L$. Under such circumstances any attempt to have a rigid distinction between the use of $Q$ and of $Y$ would probably lead to confusion. 1)

1) The possibility of $\supset$ being a connective of $L$ and of $TX$ being a formula of $L$ as well as of $L^{+}$ was naturally present throughout [2]. The use of the notational conventions referred to relied on the fact that we never wished in that paper to consider $TX$ other than as a formula of $L^{+}$ (or $L'$, $L^{*}$).

Since considerable reference will be made to theorems, lemmas and displayed formulae of [2], we will through the remainder of this paper denote the use of such by means of an asterisk; for example, Theorem 1*, Lemma 1*, (1)* will refer to Theorem 1, Lemma 1 and formula (1) of [2].

Suppose $L$, $L^{+}$, $T$ satisfy the conditions imposed on the calculi $L$, $L^{+}$ and the transformation $T$ in [2] and that, in addition, $L$ and $L^{+}$
A RELATIVIZATION PROCEDURE FOR PROPOSITIONAL CALCULI
both have $\supset$ as their only connective, this having the effect that $s = 1$ and $\sigma_1$ is $\supset$ in the definition of $T$ (see (1)*). Let $S$, $\Sigma$ denote respectively the set of formulae, formula schemes, of $L$, and therefore also of $L^{+}$. Suppose $X \in S$. We define the rank of $X$ through the following conditions, used inductively:
(i) If $X$ cannot be expressed in the form $\mathcal{S}TY. \supset Z^{2}$ for some $Y, Z \in S$, $Y$ having at least one connective and $Z^{2}$ denoting $Z \supset Z$ as in (2)*, then the rank of $X$ is 1.
(ii) If $X$ can be expressed in the form $\mathcal{S}TY. \supset Z^{2}$ for some $Y, Z \in S$, where $Y$ has at least one connective and has rank $r$, but cannot be expressed in the form $\mathcal{S}TY'. \supset Z'^{2}$ for $Y', Z' \in S$ where $Y'$ has at least one connective and has rank greater than $r$, then the rank of $X$ is $r + 1$.
Using the fact that, for any $U$, $V$, and any substitution, $\mathcal{S}TU. \supset V^{2}$ is longer than $U$, we can prove inductively that the rank of a member of $S$ is a uniquely determined, effectively obtainable integer and that the rank of a member of $S$ is equal to the rank of any variant of that member. By changing $X$, $Y$, $Z$, $S$ to $\varphi$, $\psi$, $\chi$, $\Sigma$ respectively, throughout the definition of rank of a formula, we obtain a corresponding definition for the rank of a formula scheme.
Suppose $M$ is a calculus which has $S$ as its set of formulae. $M$ is said to be of finite rank if it does not possess provable formulae of arbitrarily large rank. The rank of $M$ will be defined to be zero if $M$ has no provable formulae and to be $r$ ($> 0$) if $M$ has provable formulae of rank $r$ but none of greater rank. If $M$ is not of finite rank it is said to be of infinite rank.
LEMMA 6: If $X, Y \in S$, $Y = \mathcal{S}X$ and $X$ is of rank $r$, then $Y$ is of rank greater than or equal to $r$.
PROOF. If $r = 1$, there is nothing to prove. If $r > 1$, then there exist $W, Z \in S$, with $W$ having at least one connective and being of rank $r - 1$, such that $X = \mathcal{S}TW. \supset Z^{2}$. Hence $Y$ can be written as $\mathcal{S}(\mathcal{S}TW. \supset Z^{2})$. By Lemma 5(i)*, this formula is of the form $\mathcal{S}'TW. \supset Z'^{2}$ for some substituted form $\mathcal{S}'TW$ of $TW$ and some $Z' \in S$. Since $W$ has rank $r - 1$, the required result can be obtained immediately. We note trivially that for some formulae, for example $a_1$, a substituted form can have a greater rank than the original formula, while for others, for example $a_1 \supset a_1$, all substituted forms have the same rank as the original formula (in this case, 1; see Lemma 2(i)(α)(γ₂)*).
RONALD HARROP
THEOREM 5: Suppose $L$, $L^{+}$, $T$ are as above. If $L$ has finite rank, so has $L^{+}$. More precisely, if the rank of $L$ is $r$, then that of $L^{+}$ is $r + 1$.
PROOF. By Theorem 1*, a formula $P$ is provable in $L^{+}$ if and only if it satisfies condition $D^{+}$. By Lemmas 1(ii)* and 3* and the definition of rank of a formula, if $P$ is provable in $L^{+}$ then it is of rank 1 unless it satisfies $D^{+}$ by $(\alpha^{+})$. Hence if $r = 0$, that is, if $L$ has no provable formulae, then $L^{+}$, which will have some formulae which satisfy $D^{+}$, but none of them by $(\alpha^{+})$, will be of rank 1.
Suppose that $r > 0$ and that $P$, of rank $b$ greater than or equal to $r + 1$, satisfies $D^{+}$. Then, (i) since $P$ has rank greater than 1 and therefore satisfies $D^{+}$ by $(\alpha^{+})$, it is of the form $\mathcal{S}TX. \supset Y^{2}$ for some provable formula $X$ of $L$ and some $Y \in S$, and (ii) since $P$ has rank $b$ it can be written in the form $\mathcal{S}TX'. \supset Y'^{2}$ for some $X', Y' \in S$ where $X'$ has at least one connective and has rank $b - 1$. By Lemma 7*, $P$ can be written in the form $\mathcal{S}TZ. \supset Y^{2}$ for some provable formula $Z$ of $L$ which is of the form $\mathcal{S}X'$. Hence, by Lemma 6, $L$ has a provable formula of rank at least $b - 1$. Thus $b - 1 \le r$ and the rank of $L^{+}$ is not greater than $r + 1$. Since the rank of $L$ is $r$, there is a provable formula $W$ of $L$ of rank $r$. The provable formula $TW. \supset a_1^{2}$ of $L^{+}$, which satisfies $D^{+}$ by $(\alpha^{+})$, has rank at least $r + 1$. Hence the rank of $L^{+}$ is at least $r + 1$ and thus is exactly $r + 1$. This completes the proof of Theorem 5.
Our final results in this section are concerned with a calculus which is essentially the "union" of the relativizations of two "disjoint" calculi. Suppose that $L_1$, $L_2$ are calculi which satisfy the conditions imposed on $L$ in [2] and that they have disjoint sets of binary connectives which we will denote by $\sigma_1, \ldots, \sigma_s$; $\sigma_{s+1}, \ldots, \sigma_k$ respectively. Let $S_1$, $\Sigma_1$; $S_2$, $\Sigma_2$ denote the sets of formulae and formula schemes of these calculi and let $S$, $\Sigma$ denote the set of formulae, formula schemes constructed from $a_1, \ldots$; $A_1, \ldots$, by means of the single binary connective $\supset$.
Let $T$ be the mapping defined as in (1)* with respect to the complete set $\sigma_1, \ldots, \sigma_k$ of connectives. $T$ thus maps the set of formulae constructed from $a_1, \ldots$, by $\sigma_1, \ldots, \sigma_k$, and thus also $S_1$ and $S_2$, into $S$. We construct $L_1^{+}$ as usual, using the restriction of $T$ to $S_1$, $\Sigma_1$, and consider the case in which $L_1^{+}$ has no connectives other than $\supset$. There is a slight complication in the construction of the corresponding relativized form of $L_2$ since the definition of $L^{+}$ in [2] used (1)* and this depended,
through its third line, on the connectives of the calculus concerned being numbered consecutively as the 1st, ..., $s$th connectives, for some $s$, whereas the restriction of $T$ as defined above to $S_2$, $\Sigma_2$ treats the connectives of $L_2$ as if they were numbered as the $(s+1)$st, ..., $k$th connectives. 1) Let $L_2^{+}(s)$ denote the relativized calculus, with $\supset$ as its only connective, which has axiom schemes and rules formally the same as those given for $L_2^{+}$ in [2], that is, for $L^{+}$ with $L$ replaced by $L_2$, except that in the statement of 3*, $1 \le r \le s$, $1 \le t \le s$ should be replaced by $s + 1 \le r \le k$, $s + 1 \le t \le k$ respectively. $T$ should be considered as the restriction of the mapping $T$ defined above to the sets $S_2$, $\Sigma_2$. It will not be surprising to find that $L_2^{+}(s)$ behaves like $L_2^{+}$ (obtained by first renaming the connectives as 1st, ..., $(k-s)$th), since the main aspects of (1)*, (4)* are retained. These are the possibility of carrying over the structure of the formulae of $L$ fully into $L^{+}$ in such a way that formulae of some recognizable structure not previously used (namely (4)*) would be available in $L^{+}$ for use in the construction of the axiom schemes of $L^{+}$.
We denote by $L_{1.2}^{+}$ the calculus with formulae $S$, formula schemes $\Sigma$, axiom schemes the union of the axiom schemes of $L_1^{+}$, $L_2^{+}(s)$ and detachment as its only rule. Similar notation is used in corresponding cases when calculi other than $L_1$, $L_2$ are involved. The calculi $L_{1.2}^{+}$ and $L_{2.1}^{+}$ will, in general, naturally be distinct. It is shown later that their decision problems are primitively recursively equivalent to each other.
Let $D_{1.2}^{+}$ denote the condition obtained from $D^{+}$ by making the following changes:
(i) replace everywhere $D^{+}$ by $D_{1.2}^{+}$ and $S^{+}$ by $S$.
(ii) replace $(\alpha^{+})$, $(\beta^{+})$, $(\gamma^{+})$ by $(\alpha_{1.2}^{+})_1$, $(\alpha_{1.2}^{+})_2$, $(\beta_{1.2}^{+})_1$, $(\beta_{1.2}^{+})_2$, $(\gamma_{1.2}^{+})_1$, $(\gamma_{1.2}^{+})_2$, the conditions with suffix $i$ having $L_i$ for $L$, $i = 1, 2$.
The changes in $(\gamma^{+})$ are consequential on those in $(\beta^{+})$ in the sense that $(\gamma_{1.2}^{+})_i$ shall include a reference to $(\beta_{1.2}^{+})_i$, $i = 1, 2$, where $(\gamma^{+})$ includes a reference to $(\beta^{+})$.
(iii) $(\delta^{+})$, $(\varepsilon^{+})$, $(\zeta^{+})$ shall be renamed $(\delta_{1.2}^{+})$, $(\varepsilon_{1.2}^{+})$, $(\zeta_{1.2}^{+})$ respectively.

1) The third line of the definition of $T$ restricted to the connectives of $L_2$ reads $T(E\sigma_iF) = TE \supset ((TE)^{3+i} \supset (TF)^{2})$, $s + 1 \le i \le k$. For normal theory it would read $T(E\sigma_{s+i}F) = TE \supset ((TE)^{3+i} \supset (TF)^{2})$, $1 \le i \le k - s$.

The special condition imposed on $(\delta_{1.2}^{+})$, which has consequential effects
on $(\varepsilon_{1.2}^{+})$, $(\zeta_{1.2}^{+})$, shall be modified so as to read "it is required that $Q_1$, $Q_2$ should be of the form $\mathcal{S}T(A_1\sigma_rA_3)$, $\mathcal{S}T(A_2\sigma_tA_4)$ respectively, for some $r$, $t$ such that either both $1 \le r \le s$ and $1 \le t \le s$ or both $s + 1 \le r \le k$ and $s + 1 \le t \le k$."
THEOREM 6: With the above notation, a formula $N$ is provable in $L_{1.2}^{+}$ if and only if it satisfies $D_{1.2}^{+}$.
PROOF. The result follows by consideration of the details of the proof of Theorem 1*, that is, of the proofs of Lemmas 4*, 9*. In obtaining, following the proof of Lemma 4*, a proof that if $N$ ($\in S$) satisfies $D_{1.2}^{+}$ then it is provable in $L_{1.2}^{+}$, the main cases are $(\alpha_{1.2}^{+})_1$ and $(\alpha_{1.2}^{+})_2$. These, however, lead the consideration straight back to proofs of corresponding results for $L_1^{+}$, $L_2^{+}(s)$ and these can be obtained by direct use of the method employed in the consideration of $(\alpha^{+})$ in the proof of Lemma 4*. The restrictions imposed in the definition of axiom scheme 3* are satisfied at places where the axiom scheme is required for use both in the $(\alpha_{1.2}^{+})_1$ and in the $(\alpha_{1.2}^{+})_2$ case. The cases $(\gamma_{1.2}^{+})_i$ are referred back to the use of cases of axiom scheme 2+ of $L_{1.2}^{+}$, these being cases due to $L_1^{+}$ or to $L_2^{+}(s)$ according as $i = 1$ or 2. The remaining cases are trivial, noting where necessary the special restriction imposed on $(\delta_{1.2}^{+})$, $(\varepsilon_{1.2}^{+})$ and $(\zeta_{1.2}^{+})$.
We now show that if N is provable in Lt2 then it satisfies »i; This trivially reduces (compare the proof of Lemma 9*) to showing that if P, P :;) R are provable in Li.z and satisfy Di.z then R, which is provable in Ltz, also satisfies D7.2' Due to the splitting of each of (a+), (r), (1'+) of D+ into two parts when Dtz was formed, there are now 81 instead of 36 cases to be considered. Cases corresponding to those in parts (i), (ii) and (iii) of the proof of Lemma 9* can still be shown to be trivially impossible by use of the structural considerations previously employed, or to lead easily to the conclusion that R satisfies Di.2' This accounts for 36 + 27 + 12 (= 75) cases. Consider now cases of type (iv), that is, cases in which P satisfies D7.z by (Ci.2) and P :::> R satisfies D7.z by a method other than (arZ)l' 2 (C7.2)· P is of the form Ql A Qz. :;) M where, in the case of D'1'.z, Ql, Q2 are respectively of the form JT(A lurA 3 ) , fT(A 2utA 4 ) and either 1 ~ r ~ s, 1 S t ~ s, or s + 1 S r S k, s + 1 s t s k. From Lemma 2(i)(a;)(p)*, P :::> R cannot satisfy D1.z by (btz), (S'1'.2)' If P :::> R satisfies
«:»;
A RELATIVIZATION PROCEDURE FOR PROPOSITIONAL CALCULI 35 D~.z by (b~.z), (e~.z) then R satisfies D' by (e~.z), «(~.z) respectively. We know, by Lemma 2(i) (IX;) (f3)* that P :::J R cannot satisfy Dtz by (f3i.Z)i' i = 1, 2. Suppose P :::J R satisfies Di.z by (yi.Z)i' Then R is of the form R t :::J M Z. Since P satisfies ot; it follows, using Lemma 3*, which with trivial notational modification applies to »i; that both Qt :::J M Zand Qz =:> M Zsatisfy Dtz by (IXtz)t, (IXtz)z or Since Qt, Ql' u, are of the form JT({Jt, JT({Jz, JTt/J (under a common substitution) for some rule ({Jt ({Jz/l!J of L i, it follows, again using lemma 3*, that Qt =:> M 1 , Qz =:> M 1 must satisfy Dtz by (IXtz)t or (IXtz)z. Further, by using Lemma 1(iii)* modified so as to apply to Ltz, L t, L z, we see that Qt, Qz, R are respectively of 1';" 1';2' T, form where all of it, i z, j are in the range 1, ... , s if i = 1, and in the range s + 1, ... , kif i = 2. Further application of lemma 1(iii)* shows that Qt =:> M 1 and Qz =:> M Z must both satisfy »t, by (IXtZ)i' Using Lemma 6* as in the proof of Lemma 9*, we can now show that R t :::J M Z satisfies Dtz by (IXi.Z)i by following directly the last part of the consideration of case (iv) in the proof of Lemma 9* substituting r, for Land (IXtZ)i for (IX+). This completes the proof of Theorem 6.
«z,»
THEOREM 7: With the above notation and any standard type of Gödel numbering, the predicate corresponding to provability in $L_{1.2}^{+}$ is primitive recursive in the corresponding predicates for $L_1$, $L_2$, and, for $i = 1, 2$, the predicate corresponding to provability in $L_i$ is primitive recursive in the corresponding predicate for $L_{1.2}^{+}$.
PROOF. This is similar to the proof of Theorem 2* preceded by the proof of a result which corresponds to Lemma 10*. Suppose $X \in S_i$ and that $X$ is provable in $L_i$; then $TX. \supset a_1^{2}$ satisfies $D_{1.2}^{+}$ by $(\alpha_{1.2}^{+})_i$ and is provable in $L_{1.2}^{+}$ by Theorem 6. Suppose now that $X \in S_i$ and that $TX. \supset a_1^{2}$ is provable in $L_{1.2}^{+}$. By Theorem 6, the definitions of $(\alpha_{1.2}^{+})_1$, $(\alpha_{1.2}^{+})_2$, and Lemmas 3*, 1(ii)(iii)* (having put them in a form applicable to $L_{1.2}^{+}$, $L_1$, $L_2$), we can deduce that $X$ is not a variable and that there exists $X'$ which is a provable formula of $L_i$, and is thus also not a variable, such that $TX = \mathcal{S}TX'$. Hence, by Lemma 8*, which generalizes to the new context, $X = \mathcal{S}X'$ and therefore $X$ is provable in $L_i$. Hence, for $i = 1, 2$, if $X \in S_i$ then $X$ is provable in $L_i$ if and only if $TX. \supset a_1^{2}$ is provable in $L_{1.2}^{+}$. Thus the decision problems of $L_1$, $L_2$ are primitive recursive in that of $L_{1.2}^{+}$. The proof that the decision problem of $L_{1.2}^{+}$ is primitive recursive in those of $L_1$, $L_2$ can be obtained, using Theorem 6, by trivial modification of the corresponding part of the proof of Theorem 2*.
Using the notation of Theorem 7 we can obtain the following two corollaries, the first of which is trivial and the second of which can be obtained by noting that the decision problem of $L_{1.2}^{+}$ is primitive recursive in the decision problems of $L_1$, $L_2$, each of which is primitive recursive in the decision problem of $L_{2.1}^{+}$. 1)
Corollary 1. $L_{1.2}^{+}$ is decidable if and only if $L_1$ and $L_2$ are both decidable.
Corollary 2. The decision problem of $L_{1.2}^{+}$ is primitively recursively equivalent to that of $L_{2.1}^{+}$.
Suppose L 3 is a calculus of type L and that t is an integer greater than zero. Consider the calculi Lj and Lj(t). It seems likely that a proof of the primitive recursive equivalence of the decision problems for these calculi could be obtained by constructing explicitly a condition (I) for LjB 2 ) 2 =:> (T(A 3 =:> A 4 ) . =:> B ) (d)(T(A l*A 2).=:> B 2 ) =:>.(T(A l*A 2 ) 2 =:> (T(A 3 * A 4 ) . =:> B )
(e) (AI
=:>
B2)
=:>. (A 2 =:>
B2 )
=:>
(AI
A
A
T«A l * A 2)*(A 3*A 4)).=:> B
2
)
A 2 • =:> B 2 ) .
Here, T is defined as in (1)* with s = 2 and with the connectives being =:>, * respectively. For each n ~ 5, L, will have detachment as its only rule. By Theorem 6 and Lemmas 3* and l(iii)*, the only formulae of Lm n ~ 5, which have rank greater than 1, that is the only ones of the form YT(o/l =:> 0/2)' =:> M 2 (or equivalently .?T(X 1 =:> X 2 ) . =:> M 2 ) ) , where T is temporarily restricted to act only on S so that the rank of a formula will be defined, are those which satisfy condition (et~.3)n of D~.3.1) These are exactly the formulae of the form .?TX. =:> M 2 where X is a provable formula of L, _ i - Hence, by Theorem 1*, the provable formulae of Ln , n ~ 5, of rank greater than 1 are exactly the provable formulae of L~ - 1 of rank greater than 1. If we can show that L 4, which is L~(l) has rank 1, we can deduce at once, by induction and Theorem 5, that L, will have rank n - 3 all n ~ 5, and therefore that the calculi of our sequence are distinct. Let L o now be used to denote the calculus with S as its set of formulae and with no axiom schemes and no rules. Then, compare the proof of Corollary 3 of Theorem 7, L~.2 will be the same calculus as L~(l). Since, by Theorem 6 and Lemmas 3* and 1(iii)*, there are no provable formulae of L~.2 of rank greater than 1, while, on the other hand, L~.2' which has =:> as its only connective, does have some provable formulae, it follows that L~.2 has rank 1. Hence the calculi L m n ~ 5 are distinct. The calculi Ln , n ~ 5 can be seen to be undecidable by induction using Corollary 1 of Theorem 7, the definition of L, and the fact that L 3 which is equivalent to L 2 is undecidable (see the definition of L~.l (=L 2 ) in construction (ii)). Hence, to complete the proof that our construction is satisfactory, it is now sufficient to show that the calculi L m n ~ 5, form an increasing sequence. We first show that the axiom schemes of L 6 include those of Lj. Since schemes (b), (c), (d) and (e) are automatically 0"1,0"2
1) (ex' n.3)n denotes the condition related to L n , L. in the way in which (C/1.2)1 is related to L 1 , L. in the definition of L1.2'
in L 6 all we need to prove is that the axiom schemes of type (a) in L, are axiom schemes of L 6 • Since L, = L~.3' the schemes involved are the axiom schemes of type I' (see the definition of L' in [2]) of L~, that is of (L~(1))'. Now the axiom schemes (b), (d), (e) of L, are the axiom schemes of L~(l) and these will provide the required axiom schemes of (L~(1»)' as axiom schemes of L S . 3 , that is, of L 6 , of type (a). Hence the axiom schemes of L 6 include those of L s. Suppose now that, for some r ~ 5, the axiom schemes of L, are included among those of L, + r- The axiom schemes of L, + 2 consist of (b), (c), (d), (e) and the schemes of type (a) which arise from the axiom schemes of L, + r- Thus by our hypothesis, the axiom schemes of L, + 2 include (b), (c), (d), (e) and the schemes of type (a) which arise from the axiom schemes of L" that is, they include the axiom schemes of L, + i - Hence, by induction, the axiom schemes of L, + 1 include those of L, for all n ~ 5. Since the calculi L m n ~ 5, all have detachment (in form (26)) as their only rule, they form an increasing sequence. This completes the proof that the constructed calculi satisfy the required conditions.
References
[1] R. Harrop, On the Existence of Finite Models and Decision Procedures for Propositional Calculi, Proc. Camb. Phil. Soc. 54 (1958) 1-13.
[2] R. Harrop, A Relativization Procedure for Propositional Calculi with an Application to a Generalized Form of Post's Theorem, Proc. London Math. Soc. 14 (1964) 595-617.
A METHOD FOR PRODUCING REDUCTION TYPES IN THE RESTRICTED LOWER PREDICATE CALCULUS
H. HERMES AND D. RÖDDING 1)
University of Münster (Westf.), Germany
1. A class T of formulae with unsolvable decision problem for validity
The method is based on an appropriate description of a Semi-Thue-system in the language of the restricted lower predicate calculus. No effort has been made to optimize the resulting reduction type. We hope that our method will yield better results in the future.
We start with the well-known notion of a Semi-Thue-system $S$. The words $W$ (including the empty word $\Box$) are built up from the letters $a_1, \ldots, a_N$. The relations $W \Rightarrow_S W'$ ($W$ immediately produces $W'$) and $W \rightarrow_S W'$ ($W$ produces $W'$) are introduced in the usual way by the use of a finite system $(L_k, R_k)$ $(k = 1, \ldots, M)$ of rules. The task of the word-problem for $S$ is to find a method to decide, for arbitrary words $W$, $W'$, whether or not $W \rightarrow_S W'$. This problem is known to be unsolvable for many $S$.
Now let us assume that we have a method to associate effectively to each triplet $S$, $W$, $W'$ ($W$, $W'$ are words in the alphabet of the Semi-Thue-system $S$) a formula $\alpha_{S;W,W'}$ of the restricted lower predicate calculus such that

(*) $W \rightarrow_S W'$ iff $\alpha_{S;W,W'}$ is (universally) valid.

Let us further assume first that the word-problem for $S$ is unsolvable and second that for all $W$, $W'$ in the alphabet of $S$ the resulting formula $\alpha_{S;W,W'}$ belongs to a class $T$ of formulae. Then it is obvious that the decision problem for validity is unsolvable for $T$.

1) Communicated by H. Hermes.
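The two relations of a Semi-Thue-system introduced above can be illustrated by a minimal sketch. It assumes that words are Python strings and that rules are pairs `(left, right)` with non-empty left side; the names `one_step` and `produces` and the bound `max_steps` are illustrative and not part of the paper. Because the word-problem is unsolvable in general, the search is bounded: a `False` result only means that no derivation of at most `max_steps` steps exists.

```python
from collections import deque


def one_step(word, rules):
    """Return all words W' such that `word` immediately produces W'.

    One rule (left, right) is applied at one occurrence of `left`.
    """
    results = set()
    for left, right in rules:
        start = word.find(left)
        while start != -1:
            results.add(word[:start] + right + word[start + len(left):])
            start = word.find(left, start + 1)
    return results


def produces(source, target, rules, max_steps):
    """Bounded breadth-first search for a derivation source -> target.

    Returns True if `target` is reachable in at most `max_steps` rule
    applications (zero steps are allowed), otherwise False. False is
    NOT a proof that no longer derivation exists.
    """
    seen = {source}
    frontier = deque([(source, 0)])
    while frontier:
        word, steps = frontier.popleft()
        if word == target:
            return True
        if steps == max_steps:
            continue
        for nxt in one_step(word, rules):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, steps + 1))
    return False


if __name__ == "__main__":
    rules = [("ab", "ba")]  # hypothetical one-rule system
    print(sorted(one_step("abab", rules)))
    print(produces("abab", "bbaa", rules, max_steps=3))
```

With the hypothetical rule `ab -> ba`, each step moves one `a` to the right past one `b`, so `abab` reaches `bbaa` in exactly three steps.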
2. A reduction type T for validity
The completeness of the restricted lower predicate calculus can be expressed in the following way: There exists a Turing machine $M$ and an initial complete configuration $C_0$ of $M$, and a function $C$ which associates effectively with every formula $\alpha$ a complete configuration $C(\alpha)$ of $M$, such that

(**) $C_0$ gives rise to $C(\alpha)$ iff $\alpha$ is valid.

Every Turing machine $M$ can be described in terms of a suitably chosen Semi-Thue-system $S_M$. To every complete configuration $C$ there corresponds a word $W(C)$. If $C_1$ and $C_2$ are arbitrary complete configurations, we have

(***) $C_1$ gives rise to $C_2$ iff $W(C_1) \rightarrow_{S_M} W(C_2)$.

(*), (**) and (***) imply that

$\alpha$ is valid iff $\alpha_{S_M;\,W(C_0),\,W(C(\alpha))}$ is valid.

Here the formula $\alpha_{S_M;\,W(C_0),\,W(C(\alpha))}$ can be effectively calculated from $\alpha$. The last result shows that $T$ is a reduction type for validity.

3. Construction of the formula $\alpha_{S;W,W'}$
Let $S$ be a Semi-Thue-system over the alphabet $a_1, \ldots, a_N$. Let the rules of $S$ be $(L_k, R_k)$ $(k = 1, \ldots, M)$, where $L_k$, $R_k$ are words in the alphabet of $S$. One can assume that none of the $L_k$, $R_k$ are empty and that every word $L_kR_k$ consists of at least three letters (because all $S_M$ can be chosen in this manner).
The formula $\alpha_{S;W,W'}$ contains at most the following $N + 2$ predicate variables:
Singulary predicate variables $E_j$ $(j = 1, \ldots, N)$,
Binary predicate variables $A$, $T$.
We shall now give the standard interpretation of a closed formula built up from these predicate variables: The individual domain is the class of all words in the alphabet of $S$ (including the empty word $\Box$). The interpretation
of $E_j$ is the property of ending with $a_j$,
of $A$ is the binary relation which holds between two words $W$, $W'$ iff $W' \equiv Wa_j$ for a suitable $a_j$,
of $T$ is the binary relation which holds between two words $W$, $W'$ iff $W \rightarrow_S W'$.
We define $C_jxy$ by

(1) $C_jxy \leftrightarrow Axy \wedge E_jy \quad (j = 1, \ldots, N)$.
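Furthermore we give inductive definitions for $P_Wx$ (for every $W$) and for $C_Wxy$ (for non-empty $W$) by

(2) $P_{\Box}x \leftrightarrow \neg E_1x \wedge \ldots \wedge \neg E_Nx$,
    $P_{Wa_j}x \leftrightarrow \bigvee_z (P_Wz \wedge C_jzx) \quad (j = 1, \ldots, N)$,

(3) $C_{a_j}xy \leftrightarrow C_jxy \quad (j = 1, \ldots, N)$,
    $C_{Wa_j}xy \leftrightarrow \bigvee_z (C_Wxz \wedge C_jzy) \quad (j = 1, \ldots, N)$.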
The standard interpretation respectively associates with
$P_W$ the property of being identical with $W$,
$C_W$ the binary relation which holds between two words $W'$, $W''$ iff $W'' \equiv W'W$.
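The claim about the standard interpretation of $P_W$ and $C_W$ can be checked mechanically on a finite fragment of the word domain. The following is a minimal sketch under stated assumptions: a hypothetical two-letter alphabet, zero-based letter indices, and quantifiers restricted to all words up to a fixed length (which suffices here because every witness needed by definitions (2) and (3) is shorter than the word it witnesses). The function names are illustrative.

```python
from itertools import product

ALPHABET = ("a", "b")  # hypothetical alphabet a_1, a_2 (indices are zero-based below)


def words_up_to(n):
    """All words over ALPHABET of length at most n, including the empty word."""
    return ["".join(p) for k in range(n + 1) for p in product(ALPHABET, repeat=k)]


def E(j, w):
    """E_j: the word w ends with the j-th letter."""
    return w.endswith(ALPHABET[j])


def A(w, w2):
    """A: w2 is w followed by exactly one letter."""
    return any(w2 == w + letter for letter in ALPHABET)


def C_letter(j, x, y):
    """Definition (1): C_j xy <-> Axy and E_j y."""
    return A(x, y) and E(j, y)


def P(W, x, domain):
    """Definition (2), with the existential quantifier ranging over `domain`."""
    if W == "":
        return not any(E(j, x) for j in range(len(ALPHABET)))
    j = ALPHABET.index(W[-1])
    return any(P(W[:-1], z, domain) and C_letter(j, z, x) for z in domain)


def C(W, x, y, domain):
    """Definition (3) for non-empty W, quantifier ranging over `domain`."""
    j = ALPHABET.index(W[-1])
    if len(W) == 1:
        return C_letter(j, x, y)
    return any(C(W[:-1], x, z, domain) and C_letter(j, z, y) for z in domain)


if __name__ == "__main__":
    dom = words_up_to(3)
    # P_W holds of x exactly when x is W.
    print(all(P(W, x, dom) == (x == W) for W in dom for x in dom))
    # C_W holds of (x, y) exactly when y is x followed by W.
    print(all(C(W, x, y, dom) == (y == x + W)
              for W in words_up_to(2) if W for x in dom for y in dom))
```

The check is only a finite sanity test of the inductive definitions, not a proof of the general statement.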
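Now let $\alpha_0$ be the conjunction of the following formulae $\alpha_1$, $\alpha_{2j}$, $\alpha_{3j}$, $\alpha_{4ij}$, $\alpha_5$, $\alpha_6$, $\alpha_{7j}$, $\alpha_{8k}$ $(i, j = 1, \ldots, N)$, $(k = 1, \ldots, M)$, where

$\alpha_1 \leftrightarrow \bigvee_x P_{\Box}x$

$\alpha_5 \leftrightarrow \bigwedge_x \bigwedge_y \bigwedge_z (Txy \wedge Tyz \rightarrow Txz)$

$\alpha_6 \leftrightarrow \bigwedge_x \bigwedge_y (P_{\Box}x \wedge P_{\Box}y \rightarrow Txy)$

$\alpha_{7j} \leftrightarrow \bigwedge_x \bigwedge_y \bigwedge_z \bigwedge_w (Txy \wedge C_jxz \wedge C_jyw \rightarrow Tzw)$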
It is easily seen that $\alpha_0$ is valid under the standard interpretation. Now we are in a position to give the following definition:

(0) $\alpha_{S;W,W'} \leftrightarrow \bigwedge_x \bigwedge_y (\alpha_0 \wedge P_Wx \wedge P_{W'}y \rightarrow Txy)$.
4. Proof of (*)
We have to show
(a) If $\alpha_{S;W,W'}$ is valid, then $W \rightarrow_S W'$,
(b) If $W \rightarrow_S W'$, then $\alpha_{S;W,W'}$ is valid.
For (b) see section 5. (a) can easily be shown as follows: Let $\alpha_{S;W,W'}$ be valid. Then it holds for the standard interpretation. Hence the formula $\alpha_0 \wedge P_Wx \wedge P_{W'}y \rightarrow Txy$ holds when arbitrary words are associated with $x$ and $y$. If we associate $W$ with $x$ and $W'$ with $y$, the formulae $P_Wx$ and $P_{W'}y$ hold. Since $\alpha_0$ also holds, we find that $Txy$ holds. This means that $W \rightarrow_S W'$.

5. Proof of (b) (cf. section 4)
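We start with the following lemmata, where the premise $\alpha_0$ has been suppressed.

LEMMA 1: $\bigvee_x P_Wx$
LEMMA 2: $\bigwedge_x \bigvee_y C_Wxy \quad (W \neq \Box)$
LEMMA 3: $\bigwedge_x \bigwedge_y \bigwedge_z (P_Wx \wedge P_Wy \wedge C_jxz \rightarrow C_jyz)$
LEMMA 4: $\bigwedge_x \bigwedge_y (P_Vx \rightarrow (C_Wxy \leftrightarrow P_{VW}y)) \quad (W \neq \Box)$
LEMMA 5: $\bigwedge_x \bigwedge_y (P_Wx \wedge P_Wy \rightarrow Txy)$
LEMMA 6: $\bigwedge_x \bigwedge_y \bigwedge_w \bigwedge_z (Txy \wedge C_Wxw \wedge C_Wyz \rightarrow Twz) \quad (W \neq \Box)$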
We omit the proofs, which are in all cases straightforward by induction on $W$. To prove (b) we show by induction on $n$ that, if we suppose that $W_1 \Rightarrow_S W_2 \Rightarrow_S \ldots \Rightarrow_S W_n$, then $Txy$ is deducible from $\alpha_0$, $P_{W_1}x$, $P_{W_n}y$. For $n = 1$ see Lemma 5. Let us assume that the proposition is proved for $n$ (induction hypothesis). Now we have to show that if we suppose that $W_1 \Rightarrow_S W_2 \Rightarrow_S \ldots \Rightarrow_S W_n \Rightarrow_S W_{n+1}$, then $Txz$ is deducible from $\alpha_0$, $P_{W_1}x$, $P_{W_{n+1}}z$. We shall show in section 6 that $Tyz$ is deducible from $\alpha_0$, $P_{W_n}y$, $P_{W_{n+1}}z$, provided that $W_n \Rightarrow_S W_{n+1}$. Using the induction hypothesis together with $\alpha_5$ we get $Txz$ from $\alpha_0$, $P_{W_1}x$, $P_{W_n}y$, $P_{W_{n+1}}z$, provided that $W_1 \Rightarrow_S \ldots \Rightarrow_S W_{n+1}$. Now it is obvious from Lemma 1 that we can eliminate $P_{W_n}y$.
6. Deducibility of $Tyz$

$W_n \Rightarrow_S W_{n+1}$ means that there exist words $U$ and $V$ and a $k$ such that $W_n \equiv UL_kV$ and $W_{n+1} \equiv UR_kV$. According to Lemma 1 there exists a $u$ such that $P_Uu$. According to Lemma 2 there exist an $l$ and an $r$ such that $C_{L_k}ul$ and $C_{R_k}ur$ (we should keep in mind that $L_k \neq \Box$ and $R_k \neq \Box$). With $\alpha_{8k}$ we get $Tlr$. Lemma 4 yields $P_{UL_k}l$ and $P_{UR_k}r$.
Case 1: $V \equiv \Box$. Then $W_n \equiv UL_k$, $W_{n+1} \equiv UR_k$, $P_{W_n}l$, $P_{W_{n+1}}r$. Lemma 5 yields $Tyl$ and $Trz$. This, together with $Tlr$ and $\alpha_5$, implies $Tyz$.
Case 2: $V \neq \Box$. Lemma 4 gives $C_Vly$ from $P_{UL_k}l$ and $P_{W_n}y$, and $C_Vrz$ from $P_{UR_k}r$ and $P_{W_{n+1}}z$. From $Tlr$, $C_Vly$, $C_Vrz$ we get $Tyz$ using Lemma 6.

7. Application to the decision problem

$\alpha_1, \ldots, \alpha_{7j}$ are given in prenex normal form ($P_{\Box}$ and $C_j$ are quantifier-free). Their conjunction has a prenex form of prefix $\bigvee\bigwedge\bigvee^{N}\bigwedge^{3}$. $C_{L_k}$ and $C_{R_k}$ are contained in $\alpha_{8k}$. Let $\mu_k + 2$ be the length of the word $L_kR_k$. $\mu_k \geq 1$, according to our assumptions in section 3. Then $\alpha_{8k}$ can be brought into prenex normal form with prefix $\bigwedge^{\mu_k + 3}$. Let $\mu$ be the greatest of all the $\mu_k$. Then the conjunction of all $\alpha_{8k}$ can be brought into prenex normal form with prefix $\bigwedge^{\mu + 3}$. Hence $\alpha_0$ has a prenex normal form with prefix $\bigvee\bigwedge\bigvee^{N}\bigwedge^{\mu + 2}$. $P_Wx \wedge P_{W'}y$ has a prenex normal form with a purely existential prefix. No finite upper bound exists
for the length of this prefix. We denote this type of prefix by the symbol $\bigvee^{\infty}$. $\alpha_0 \wedge P_Wx \wedge P_{W'}y$ has a normal form with prefix $\bigvee^{\infty}\bigwedge\bigvee^{N}\bigwedge$ ...

x, y):
(10) $\pi_{i=1}^{v} B_i(a_1, \ldots, a_k, x) \;\&\; Ct^{d-2}(a_1, \ldots, a_k, x, y)$.
The reason for the choice of the term "bough" should be obvious in view of the tree structure of a-constituents. Notice that the first conjunction of (10) is the same as that of (9). We shall apply to boughs criteria of identity similar to the ones we have been applying to constituents and attributive constituents: notational variation does not constitute a reason for calling two boughs different. Hence we may say, in the same way as in the case of a-constituents and for the same reason, that any two different boughs with the same parameters are logically incompatible. Whenever a bough (or one of its notational variants) is determined by one of the a-constituents of depth $d - 2$ occurring in (9) or in (7) we shall say that this bough is contained in (9) or in (7), respectively. If the roles of $x$ and $y$ are interchanged in (10), we obtain a formula which is again a bough with the same parameters (up to notational variation, which we are here disregarding). This bough will be called the inverse of (10). The operation of forming the inverse will be expressed by "inv". After these preparations, we may argue as follows: By the omission lemma, (7) implies
$(Ex)(Ey)(\pi_{i=1}^{v} B_i(a_1, \ldots, a_k, x) \;\&\; Ct^{d-2}(a_1, \ldots, a_k, x, y))$,
for this formula can be obtained from (7) by omitting quantified formulae and by extending the scope of the quantifier $(Ey)$. In virtue of the permutability of the two existential quantifiers, (7) also implies

(11) $(Ex)(Ey)(\mathrm{inv}(\pi_{i=1}^{v} B_i(a_1, \ldots, a_k, x) \;\&\; Ct^{d-2}(a_1, \ldots, a_k, x, y)))$.
On the other hand, (7) implies by the omission lemma the formula (12)
$(Ux)(Uy)(Bg_1(x, y) \vee Bg_2(x, y) \vee \ldots)$
where the members of the disjunction are all the boughs of depth $d - 2$ that are contained in (7). Because of the incompatibility lemma (as applied to boughs), (11) and (12) are compatible only if the inverse of (10) is among the members of the disjunction in (12). But this means that an a-constituent of depth $d$ is inconsistent unless it contains the inversion of each bough of depth $d - 2$ contained in it. This condition is the formal counterpart to the intuitive requirement that each individual mentioned in the relative list (9) has to find a place in the absolute list (7).
(B) Assume again that we are given (7) and (9) in (7) and that we are also given another a-constituent $Ct_q^{d-1}(a_1, \ldots, a_k, x)$ which likewise occurs in (7). Then by the omission lemma (7) implies

$(Ex)(Uy)(Bg_{u_1}(x, y) \vee Bg_{u_2}(x, y) \vee \ldots)$
where the members of the disjunction are now all the boughs of depth d-2 that are contained in (9). In virtue of the well-known exchange rule for quantifiers of different kinds (7) also implies (13)
$(Ux)(Ey)(\mathrm{inv}(Bg_{u_1}(x, y)) \vee \mathrm{inv}(Bg_{u_2}(x, y)) \vee \ldots)$.
On the other hand, by the omission lemma (7) also implies (14)
$(Ex)(Uy)(Bg_{q_1}(x, y) \vee Bg_{q_2}(x, y) \vee \ldots)$,
where the members of the disjunction are all the boughs of depth $d - 2$ that are contained in $Ct_q^{d-1}(a_1, \ldots, a_k, x)$. In virtue of the incompatibility lemma, (13) and (14) are incompatible unless the two disjunctions share at least one member. But this means that of two attributive constituents of depth $d - 1$ occurring in the same consistent a-constituent of depth $d$ one has to contain the inversion of at least one of the boughs of depth $d - 2$ which the other contains.
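The two closure requirements stated in (A) and (B) can be illustrated on a deliberately small toy model. The sketch below is an assumption-laden simplification, not the construction of the paper: it supposes a language with a single binary predicate R and no free individual symbols, represents a bough of the lowest depth as the tuple of truth values of R(x,x), R(x,y), R(y,x), R(y,y), and represents an attributive constituent simply as the set of boughs it contains. All names are hypothetical.

```python
def inv(bough):
    """Interchange the roles of x and y in a toy bough.

    A bough is the tuple (Rxx, Rxy, Ryx, Ryy) of truth values.
    """
    rxx, rxy, ryx, ryy = bough
    return (ryy, ryx, rxy, rxx)


def satisfies_A(all_boughs):
    """Toy version of (A): the boughs contained in the constituent
    must be closed under forming inverses."""
    return all(inv(b) in all_boughs for b in all_boughs)


def satisfies_B(boughs_u, boughs_q):
    """Toy version of (B): one attributive constituent must contain
    the inverse of at least one bough contained in the other."""
    return any(inv(b) in boughs_q for b in boughs_u)


if __name__ == "__main__":
    b = (True, True, False, False)   # R(x,x), R(x,y), not R(y,x), not R(y,y)
    u = {b}                          # hypothetical attributive constituent
    q = {inv(b)}                     # another one, holding the inverse bough
    print(satisfies_A(u | q), satisfies_B(u, q), satisfies_A(u))
```

Since `inv` is an involution, `satisfies_B` is symmetric in its two arguments, which matches the symmetric wording of the condition.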
DISTRIBUTIVE NORMAL FORMS IN FIRST-ORDER LOGIC
This condition is the formal counterpart to the intuitive requirement that every individual mentioned in the absolute list (7) must find a place in the relative list (9).
(C) By universal instantiation the a-constituent (9) implies

$\pi_{i=1}^{v} B_i(a_1, \ldots, a_k, x) \;\&\; \sigma_{i=1}^{w} Ct_i^{d-2}(a_1, \ldots, a_k, x, x)$.

This formula is inconsistent unless

(15) $\pi_{i=1}^{v} B_i(a_1, \ldots, a_k, x) \;\&\; Ct^{d-2}(a_1, \ldots, a_k, x, x)$
is consistent for at least one a-constituent $Ct^{d-2}(a_1, \ldots, a_k, x, y)$ occurring in (9). In order for (15) to be consistent, it must not contain any conjunction one of whose members is the negation of another. (This is so for the same reason for which the inconsistency lemma is valid.) Whenever this is the case, we say that the corresponding bough (10) is strongly symmetric with respect to $x$ and $y$. By the same token, in order for (7) to be consistent it must contain at least one bough of depth $d - 1$ which is strongly symmetric with respect to $a_k$ and $x$. Furthermore, by the very same token (7) must for each $a_i$ (where $i = 1, 2, \ldots, k$) contain at least one bough of depth $d - 1$ which is strongly symmetric with respect to $a_i$ and $x$. In general, it may be said that every consistent a-constituent of depth $d$ whose outermost bound variable is $x$ must, for every free individual symbol $b$ occurring in it, contain at least one bough of depth $d - 1$ which is strongly symmetric with respect to $x$ and $b$. This is the general, exact form of the intuitive requirement that the individual referred to by $x$ in (9) must find a place in its own list. It is not difficult to see that in a given a-constituent (9) there can be contained at most one bough of depth $d - 2$ (say (10)) which is strongly symmetric with respect to $x$ and $y$, provided that the conditions (A) and (B) of consistency are satisfied. (There cannot be more than one place, we may thus say, which the referent of $x$ may assume in its own "relative" list.) This suffices to explain what the three conditions (A)-(C) of consistency are. When they are said to be applied to a constituent or an a-constituent, it is understood that they are applied to this constituent or a-constituent as well as to all the a-constituents of lesser depth occurring in it. If one of these a-constituents is inconsistent, then so is the given constituent or
JAAKKO HINTIKKA
a-constituent by the inconsistency lemma. Even in this extended sense, the question whether a given constituent or a-constituent fulfills the conditions (A)-(C) (or any single one of them) can always be decided in a finite number of steps. The three sufficient conditions of inconsistency (A)-(C) were the main content of chapter 5 of the author's dissertation. 1) Here they have been reformulated in a different terminology and notation, and derived from one and the same intuitive principle.

11. The effects of an exclusive interpretation of quantifiers

What happens to the conditions (A)-(C) in first-order logic with identity? As was pointed out earlier, a change in the interpretation of quantifiers is the only modification which we have to make here. How, then, do the conditions (A)-(C) fare on the exclusive interpretation of quantifiers? It is easily seen that (A) carries over without any changes. The condition (B) is also seen to apply without major changes. Its applicability has to be restricted to the cases in which the a-constituents (9) and $Ct_q^{d-1}(a_1, \ldots, a_k, x)$ which were assumed to occur in (7) are really different a-constituents. Since two different a-constituents with the same parameters are logically incompatible (by the incompatibility lemma), the individuals satisfying them must be different from each other. And this may be seen to suffice to restore the condition (B). In fact, we shall assume that this restriction is built into the condition (B) itself. This does not make any difference for our purposes. It is true that in the original formulation of (B) we did not exclude the case $u = q$. However, the force of (B) in this special case is also obtained from (C), as one may easily verify. Hence we may exclude this case from (B), and say that it remains unchanged on the exclusive interpretation. In contrast to (A) and (B), the condition (C) is based entirely on modes of reasoning that are incompatible with the exclusive interpretation of quantifiers. It therefore becomes inapplicable in first-order logic with identity.
12. Omitting layers of quantifiers In this paper the conditions (A)-(C) will not be discussed as much as certain consequences of theirs. These consequences may also be derived 1) See the first footnote of this paper.
DISTRIBUTIVE NORMAL FORMS IN FIRST-ORDER LOGIC
independently in a rather simple manner. They are found by asking: What happens to a constituent or a-constituent when a layer of quantifiers is omitted from it?

There are two cases to be considered here: (a) the omission of the innermost layer of quantifiers and (b) the omission of the outermost layer of quantifiers. The other cases in effect reduce to these. In order to eliminate an intermediate layer of quantifiers from an a-constituent (7) - say to eliminate the e-th layer of them - it suffices to omit the outermost layer of quantifiers from every a-constituent $Ct^{d-e+1}$ of depth d-e+1 occurring in (7).

(a) What happens to (7) when all the subformulae of the form

$(Ex_d)\, Ct^{0}_i(a_1, \ldots, a_k, x_1, \ldots, x_{d-1}, x_d)$

or

$(Ux_d)\, \sigma_i\, Ct^{0}_i(a_1, \ldots, a_k, x_1, \ldots, x_{d-1}, x_d)$

are omitted? An answer is obtained by examining what happens to the subformulae of the form $Ct^{1}(a_1, \ldots, a_k, x_1, \ldots, x_{d-1})$ of (7), i.e. by considering the special case d = 1. In this case, the result is seen to be of the form $Ct^{0}(a_1, \ldots, a_k, x_1, \ldots, x_{d-1})$ with the possible exception of notational variation. Because of the way deeper a-constituents depend on shallower ones it follows that in the general case (7) becomes a formula of the form $Ct^{d-1}(a_1, \ldots, a_k)$, i.e. becomes an a-constituent with the same parameters (P.1)-(P.2) but with depth d-1, with the possible and inessential exception of notational variation. From the omission lemma it follows that the resulting a-constituent is implied by (7).

(b) Assume that we are given (7) and (9) in (7). A part of (7) is then $(Ex)\, Ct^{d-1}(a_1, \ldots, a_k, x)$. What happens to this part of (7) if all the atomic formulae containing x (and the quantifier (Ex)) are omitted from it? The result is obviously of the form
(16) $\pi_{i=1}^{p} (Ey)\, Ct^{d-2}_i(a_1, \ldots, a_k, y) \;\&\; (Uy)\, \sigma_{i=1}^{p}\, Ct^{d-2}_i(a_1, \ldots, a_k, y)$

except, perhaps, for notational variation. If we add to (16) as an additional member of the conjunction the unquantified part

$\pi_{i=1}^{s} B_i(a_1, \ldots, a_k)$

of (7), we obtain a formula which is of the form $Ct^{d-1}(a_1, \ldots, a_k)$, again up to notational variation. In virtue of the omission lemma, this formula
is implied by (7). It will be said to be obtained from (7) by reduction with respect to (9).

When (7) is reduced with respect to the different a-constituents of depth d-1 occurring in it, we obtain a number of formulae of the form $Ct^{d-1}(a_1, \ldots, a_k)$. If there are two formulae among them which are different (apart from notational variation), then (7) is inconsistent, for it implies both of these formulae, which are mutually incompatible by the incompatibility lemma. In order for (7) to be consistent, the results of reducing it with respect to the different a-constituents of depth d-1 occurring in it must all coincide.

This gives us a necessary condition of consistency for an a-constituent (7). It will be called condition (D). When in the sequel it is said to be applied to a given constituent or a-constituent, this will be understood to mean that it is also applied to the a-constituents of lesser depth occurring therein.

In the diagram of section 6 the reduction of the constituent illustrated there (with respect to one of the a-constituents of depth d-1 occurring in it) is represented schematically by the solid line. As we just saw, the result should be independent of the choice of $Ct^{d-1}$.

Further conditions of consistency are obtained by comparing the results of eliminating the different layers of quantifiers which occur in an a-constituent, say in (7). If (7) is to be consistent, the result is in the case of each layer an a-constituent of depth d-1, as we just saw. It may now be added that all these resulting a-constituents of depth d-1 must be identical (up to notational variation); otherwise they would be incompatible, though they are all implied by (7). Hence the omission of a layer of quantifiers must yield the same result, no matter which layer is omitted, if a constituent or a-constituent is to be consistent. This requirement will be called condition (E).

In our diagram (section 6) the omission of the last layer of quantifiers is indicated by the dotted line. Notice that the two omissions that are represented schematically in the diagram must yield the same result if the attributive constituent represented by the diagram is to be consistent.

Since the result of omitting one layer of quantifiers from a given a-constituent $Ct^{d}(a_1, \ldots, a_k)$ is unique if this a-constituent is consistent, we can refer to all these results in one and the same way: Each of them will be called $Ct^{d[-1]}(a_1, \ldots, a_k)$. In case $Ct^{d}(a_1, \ldots, a_k)$ does not yield a unique result when a layer of quantifiers is omitted, or if the result is not
an a-constituent, we shall say that $Ct^{d[-1]}(a_1, \ldots, a_k)$ disappears. If it does not disappear, it is easily seen to satisfy the conditions (D)-(E) of consistency and hence to yield a unique result when another layer of quantifiers is omitted. The result of applying the same operation to $Ct^{d}(a_1, \ldots, a_k)$ e times will be called $Ct^{d[-e]}(a_1, \ldots, a_k)$.

Notice that there is an operation which in a certain sense is the inverse of the operation of omitting a layer of quantifiers. It is the operation of expanding a constituent or a-constituent $C^d$ of depth d to a disjunction of a number of subordinate constituents or a-constituents of depth d+1. The procedure which was mentioned earlier in section 5 for converting formulae into the normal form may be assumed to be such that each of the subordinate constituents (or a-constituents) of depth d+1 again yields $C^d$ when the innermost layer of quantifiers is omitted. We may also require that this is the case no matter which layer of quantifiers is omitted from the subordinate constituent or a-constituent in question, for if the result is not $C^d$, the subordinate constituent is inconsistent and hence may be omitted from the normal form of $C^d$ with depth d+1.

The relation of the two sets of conditions (A)-(C) and (D)-(E) (conceived of as conditions of consistency) is straightforward. It may be proved that whenever a constituent or a-constituent satisfies (A)-(C) it also satisfies (D)-(E). In fact, it may be proved that it satisfies (D)-(E) whenever it satisfies (A) plus (B) in its original, strong form. Conversely, it may be shown that a constituent or a-constituent $C^d$ satisfies the conditions (A)-(C) if at least one of its subordinate constituents or a-constituents of depth d+1 satisfies the conditions (D)-(E). In a sense, the two sets of conditions (A)-(C) and (D)-(E) are thus equally powerful for the purpose of discovering inconsistent constituents and a-constituents. In order to apply the latter ones successfully to a constituent, however, we must first explicate its content by expanding it into a disjunction of constituents of depth d+1, and apply the conditions (D)-(E) to each of these.

These results may be proved rather simply by means of the diagrams of constituents and a-constituents which have been explained in section 6. For reasons of space the proofs are not given here. Suffice it to say that the implication from the satisfaction of (A)-(C) to that of (D)-(E) is proved conveniently by induction on d. That this should be the case is not
surprising in view of the intuitive meaning of the conditions (A)-(C), for from the intuitive meaning they have it is rather easy to gather that they require (among other things) that the omission of two adjacent layers of quantifiers has to give us the same result.

13. The effects of identity again
So far, we have considered the process of omitting a layer of quantifiers only in first-order logic without identity. What changes are occasioned by the exclusive interpretation in this respect?

The omission of the last (innermost) layer of quantifiers can be accomplished as before. The omission of an intermediate layer of quantifiers also reduces to the omission of the outermost quantifier in the same way as before.

What cannot be done in the same way as above is the process of relative reduction. Applied to (7) with respect to (9), it no longer gives us a formula which is implied (in all cases) by (7). The reason why the omission lemma fails in this case is the following: In order to get rid of the outermost quantifiers (Ex) and (Ux) we have to omit from (9) not only all the atomic formulae containing x but also all the identities involving x which on the exclusive interpretation are implicit in the quantifiers (Ex) and (Ux) themselves as well as in all the inner quantifiers (Ez) and (Uz) of (9). Simply omitting all these identities does not always fall within the scope of the omission lemma. In some cases it does; thus the part of (7) beginning with (Ux) is omitted altogether in the reduction, eliminating all problems concerning identities occurring in it. Furthermore, all identities implicit in existential quantifiers are easily seen to fall within the scope of the omission lemma. Thus there only remains the problem of dealing with a universal quantifier (Uz) occurring in the inner layers of (7). On the exclusive interpretation, this quantifier really means $(Uz)(z \neq x \supset \ldots)$.

Intuitively, it is easy to see whence the trouble comes here. We are trying to omit x, that is, we are trying to convert a statement about all individuals different from the referent of x into a statement about all individuals without restrictions.
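The difference between the two readings of a universal quantifier can be made concrete over a small finite domain. The following Python sketch is only an illustration under stated assumptions: the helper names `forall_inclusive` and `forall_exclusive`, the three-element domain and the sample predicate are all hypothetical and do not occur in the paper. It shows that a statement which holds of all individuals different from the referent of x may fail once the restriction is dropped, unless the case of x itself is checked separately.

```python
from typing import Callable, Iterable, Set


def forall_inclusive(domain: Iterable[int], body: Callable[[int], bool]) -> bool:
    """Ordinary reading of (Uz): z ranges over the whole domain."""
    return all(body(z) for z in domain)


def forall_exclusive(domain: Iterable[int], body: Callable[[int], bool],
                     outer: Set[int]) -> bool:
    """Exclusive reading of (Uz): z ranges only over individuals that differ
    from the values of the variables bound further out (collected in `outer`)."""
    return all(body(z) for z in domain if z not in outer)


# Hypothetical example: a three-element domain and the predicate "z differs from x".
domain = [0, 1, 2]
x = 0
differs_from_x = lambda z: z != x

exclusive = forall_exclusive(domain, differs_from_x, {x})  # holds: only z = 1, 2 are inspected
inclusive = forall_inclusive(domain, differs_from_x)       # fails: z = 0 is inspected as well

# The unrestricted statement is recovered from the exclusive one only by
# adding a separate clause for the case in which z is the referent of x.
restored = exclusive and differs_from_x(x)
```

Here `restored` agrees with `inclusive`, which is the finite-domain counterpart of the observation that omitting x turns a restricted universal statement into an unrestricted one.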
Clearly this is possible only if we add a new clause which takes care of the case in which one of these "all" individuals is the referent of x. The way to do this is as follows: Assume that we are reducing (7) with respect to (9). Then at the same time as we omit all the atomic formulae which contain x from (9), we add
as a new member of the main conjunction of (9) the following formula:

(17) $(Ey)\, Ct^{d-1[-1]}(a_1, \ldots, a_k, y)$

and as a new member of the outermost disjunction $\sigma\, Ct^{d-2}(a_1, \ldots, a_k, x, y)$ the following formula:

(18) $Ct^{d-1[-1]}(a_1, \ldots, a_k, y).$
The same operation has to be applied to every a-constituent $Ct^{d-e}_p(a_1, \ldots, a_k, x, y, \ldots)$ which occurs in (9), and whose outermost quantifiers are (let us say) (Ez) and (Uz). At the same time as we omit all the atomic formulae which contain x from it, we add as a new member of the main conjunction the formula

(17)* $(Ez)\, Ct^{d-e[-1]}_p(a_1, \ldots, a_k, z, y, \ldots)$

and as a new member of the outermost disjunction the related formula

(18)* $Ct^{d-e[-1]}_p(a_1, \ldots, a_k, z, y, \ldots).$

Here $Ct^{d-1[-1]}(a_1, \ldots, a_k, x)$ and $Ct^{d-e[-1]}_p(a_1, \ldots, a_k, x, y, \ldots)$ may be defined as the respective results of omitting the last layer of quantifiers from (9) and from $Ct^{d-e}_p(a_1, \ldots, a_k, x, y, \ldots)$.

The result of carrying out this operation in all the a-constituents which occur in (9) as well as in (9) itself, at the same time as we omit all the atomic formulae containing x from (9) and add as a new member of the main conjunction of (9) the unquantified part

$\pi_{i=1}^{s} B_i(a_1, \ldots, a_k)$

of (7), will be called the result of reducing (7) with respect to (9) in first-order logic with identity. This result is implied by (7) (on the exclusive interpretation of quantifiers, of course). This implication will not be proved formally here. Probably no proof is needed to convince the reader, for on the basis of the intuitive considerations which led us to modify the process of relative reduction it should be obvious (at least on a moment's reflection) that the modification is just what is needed to reinstate the implication.

After the reduction of an a-constituent with respect to another has thus been redefined so as to restore the crucial implication, everything else may be done in the same way as in first-order logic without identity.
We can define what it means to omit a layer of quantifiers, and we can reformulate the necessary conditions of consistency (D)-(E) for first-order logic with identity. The relation of these conditions to (A)-(B) (the latter in the weak form) is even more clear-cut than before. A constituent or a-constituent $C^d$ satisfies (D)-(E) if it satisfies (A)-(B); and it satisfies (A)-(B) if at least one of its subordinate constituents or a-constituents of depth d+1 satisfies (D)-(E).

14. A disproof procedure defined
The conditions (A)-(E) are connected in an intimate and intuitive way with the structure of constituents and a-constituents. It will now be shown that they provide us with a disproof procedure for inconsistent constituents and a-constituents, a procedure which is semantically complete in that every inconsistent constituent is subject to this disproof procedure.

The procedure can be described very simply. Given a constituent $C^d$ of depth d, how can we try to find out whether it is consistent or inconsistent? It may be the case that our conditions (A)-(E) suffice to establish its inconsistency. If not, we do not yet know whether $C^d$ is consistent or not. What we can do, however, is to expand $C^d$ into a disjunction of a number of subordinate constituents of depth d+1 (with the same parameters (P.1)-(P.2)), to which we may apply our conditions. If all of them are inconsistent by our sufficient conditions, their disjunction and therefore $C^d$ itself is likewise inconsistent, and we have an answer to our question. If not, we have to keep on expanding $C^d$ into a disjunction of subordinate constituents of greater and greater depth d+e. If during this procedure some constituents are inconsistent by our conditions, they may be omitted in the sequel. If for some e all the subordinate constituents of depth d+e turn out to be inconsistent by (A)-(E), then so is $C^d$.

What we want to show is that for each inconsistent constituent $C^d$ there is an e such that this happens at depth d+e. In other words, whatever inconsistencies there may be in a constituent can be brought to light by adding to its depth. And since every formula can be brought to a distributive normal form, this likewise gives us a method of disproving every inconsistent formula. This method consists of the rules for converting
a formula to the (second) distributive normal form (which also give us rules for expanding a constituent to a disjunction of a number of deeper constituents) plus our sufficient conditions (A)-(E) of inconsistency. The statement that this method really can be used to disprove every inconsistent formula will be called the completeness theorem of our theory of distributive normal forms.

Because of the connection between the two sets of conditions (A)-(C) and (D)-(E), either of these combinations of conditions may be used in the disproof procedure just described (in first-order logic without identity). The only difference between the two sets of conditions is that if we use the former we shall be able to see inconsistencies one step earlier than if we used the latter; i.e. if using the former we have to go down to depth d+e (in the case of some particular formula), then using the latter we have to go down to depth d+e+1. The same relationship holds in first-order logic with identity between the two sets of conditions (A)-(B) and (D)-(E).

In proving the completeness theorem either set of conditions may be used. Because of the greater simplicity of (D)-(E) they will be used in what follows instead of (A)-(C).

15. Completeness proof (first part)
The disproofs described in the preceding section have the structure of a tree. Since each constituent of depth d has only a finite number of subordinate constituents of depth d+1, at each point of this tree only a finite number of branches can diverge. Hence the tree theorem (König's lemma) applies, showing that the completeness theorem can be proved by proving the following result:

A constituent is consistent (satisfiable) if it can occur in a sequence of constituents which satisfies the following conditions: (i) each member of the sequence is subordinate to its immediate predecessor; (ii) each member satisfies the conditions (D)-(E) of consistency (in first-order logic with identity, the conditions (D)-(E) as reformulated in section 13).

We may also assume, for simplicity, that the depth of each member of the sequence is d+1 when the depth of its immediate predecessor is d.
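The tree-shaped disproof procedure of section 14 can be sketched as a level-by-level search. The Python fragment below is a minimal sketch under loud assumptions: `expand` (producing the subordinate constituents of depth d+1) and `is_trivially_inconsistent` (applying the chosen conditions of inconsistency) are hypothetical callbacks standing in for the operations of the paper, constituents are treated as opaque values, and the bound `max_extra_depth` is an artificial cut-off, since the procedure itself terminates only on inconsistent inputs.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

C = TypeVar("C")  # an opaque representation of a constituent (assumption)


def disprove(constituent: C,
             expand: Callable[[C], Iterable[C]],
             is_trivially_inconsistent: Callable[[C], bool],
             max_extra_depth: int) -> Optional[int]:
    """Return the least e such that every subordinate constituent of depth d+e
    has been eliminated by the conditions, or None if the cut-off is reached."""
    frontier: List[C] = [constituent]
    for e in range(max_extra_depth + 1):
        # Constituents found inconsistent may be omitted in the sequel.
        frontier = [c for c in frontier if not is_trivially_inconsistent(c)]
        if not frontier:
            return e
        if e == max_extra_depth:
            break
        # Replace each survivor by its subordinate constituents of the next depth.
        frontier = [sub for c in frontier for sub in expand(c)]
    return None


# Toy stand-ins (hypothetical): constituents are strings, expansion appends a letter.
toy_expand = lambda c: [c + "a", c + "b"]
toy_bad = lambda c: c.endswith("a") or c.count("b") >= 2

depth_needed = disprove("", toy_expand, toy_bad, max_extra_depth=5)  # 2
undecided = disprove("", toy_expand, lambda c: c.endswith("a"), max_extra_depth=5)  # None
```

In the toy run every branch dies at extra depth 2, while in the second call one branch survives at every level, which is the situation the tree theorem turns into an infinite sequence satisfying (i)-(ii).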
It clearly suffices to prove that the first member of each sequence which satisfies (i)-(ii) is consistent.

Let us assume that we are given a sequence $S_0$ of deeper and deeper constituents which satisfies all the conditions just mentioned. We shall show that the first member of $S_0$ is satisfiable. For this purpose, we shall first construct a sequence of attributive constituents $S_1$. This will be done in such a way that instead of adding to the depth of constituents (as in $S_0$) we introduce new free individual symbols. In fact, the depth of all the members of $S_1$ will be the same as that of the first member of $S_0$ (say d); however, each of them will have one free individual symbol more than its immediate predecessor. Each member of $S_1$ will be chosen in such a way that it occurs in the corresponding member of $S_0$, i.e. occurs there with appropriate bound variables substituted for some of its free individual symbols, of course. For the first member of $S_1$ we may take the attributive constituent of depth d which occurs in the first member of $S_0$.

The main question which remains to be answered is therefore: How is a member of $S_1$ obtained from its immediate predecessor? In order to answer this question, let us assume that (7) is an arbitrary member of $S_1$. Let us also assume that of the free individual symbols of (7), $a_1, a_2, \ldots, a_j$ ($1 \le j \le k$) occur already in the first member of $S_0$ while $a_{j+1}, a_{j+2}, \ldots, a_k$ do not occur there. Then an a-constituent of the form $Ct^{d}(a_1, \ldots, a_j, x_{j+1}, \ldots, x_k)$ occurs in the corresponding member of $S_0$. Since the next member of $S_0$ is obtained (as was pointed out in section 12) by adding one more layer of quantifiers, there must be in the next member of $S_0$ at least one (usually there are several) attributive constituent which is subordinate to the one just mentioned and which is of the form $Ct^{d+1}(a_1, \ldots, a_j, x_{j+1}, \ldots, x_k)$ or, more explicitly,

(19) $\pi_{i=1}^{s} B_i(a_1, \ldots, a_j, x_{j+1}, \ldots, x_k) \;\&\; \pi_{i=1}^{t'} (Ex_{k+1})\, Ct^{d}_i(a_1, \ldots, a_j, x_{j+1}, \ldots, x_k, x_{k+1}) \;\&\; (Ux_{k+1})\, \sigma_{i=1}^{t'}\, Ct^{d}_i(a_1, \ldots, a_j, x_{j+1}, \ldots, x_k, x_{k+1}).$

In order to obtain the next member of $S_1$, we choose one of the a-constituents of depth d occurring in (19). The principle of selection will be explained later. If the attributive constituent chosen is
(20) $Ct^{d}(a_1, \ldots, a_j, x_{j+1}, \ldots, x_k, x_{k+1}),$

the next member of $S_1$ is simply

(20)* $Ct^{d}(a_1, \ldots, a_j, a_{j+1}, \ldots, a_k, a_{k+1}),$

where $a_{k+1}$ is a new free individual symbol. This a-constituent satisfies by construction the requirement that the corresponding bound-variable formula (20) occurs in the corresponding member of $S_0$.

How is (20) to be selected? We shall first explain one particular method of making the choice. Subsequently, it will be pointed out that this method can be considerably generalized.

Let $T_k$ be the set of all the a-constituents of depth d-1 which occur in (7), and let $T'_k$ be the set of all the similar a-constituents which contain $x_{j+1}, x_{j+2}, \ldots, x_k$ instead of $a_{j+1}, a_{j+2}, \ldots, a_k$. Each of the a-constituents from among which (20) is chosen arises from a member of $T'_k$ through the addition of a new layer of quantifiers. For the members of $T'_k$ we shall shortly establish a certain seniority ranking. After this has been accomplished, (20) may be chosen to be any of the a-constituents which arise from members of $T'_k$ of the highest rank. Because of the similarity of $T_k$ and $T'_k$, a similar ranking will be automatically induced for the members of $T_k$, too.

The only thing that remains in order to define $S_1$ is therefore to explain how the seniority ranking is established. This ranking will be a linear quasi-ordering, i.e. it will be a linear ordering of the different ranks into which the members of $T'_k$ (and of $T_k$) will be partitioned. The ranking of the a-constituents of depth d-1 occurring in the first member of $S_1$ does not matter. Hence the only thing we have to do is to explain how this ranking is carried over from one member of $S_1$ to the next one. (In each case, we have a ranking of the a-constituents of depth d-1 occurring in a member of $S_1$.)

In order to explain this, notice that one and the same result is obtained from (19) in two different ways:

(a) By omitting the last layer of quantifiers;
(b) By reducing it with respect to (20).

This identity follows from the fact that (19) satisfies the conditions (D)-(E). The result of (a) is simply $Ct^{d}(a_1, \ldots, a_j, x_{j+1}, \ldots, x_k)$, which
is therefore also obtainable through the operation (b), i.e. whose quantified part is obtainable by omitting all reference to $x_{k+1}$ from (20). This fact establishes a one-to-many correlation between the members of $T'_k$ and the members of the set $T'_{k+1}$ of all a-constituents of depth d-1 which occur in (20): each of the former arises from one of the latter by omitting all mention of $x_{k+1}$. This correlation will be called the weak correlation between the members of the two sets and conceived of as a symmetric relation. A similar relation, which will be referred to in the same way, obtains between the members of $T_k$ and the members of the set $T_{k+1}$ of all the a-constituents of depth d-1 which occur in (20)*. It is to be observed that this relation is determined as soon as the sequence $S_1$ is given; its definition does not in any way turn on the definition of our seniority ranking.

The weak correlation can be strengthened (artificially, so to speak) into a one-to-one correlation between all the members of $T_k$ (or of $T'_k$) and some of the members of $T_{k+1}$ (or of $T'_{k+1}$, respectively). All we have to do for the purpose is to choose one of the weak correlates of each member of $T_k$ and assign it to this member as its strong correlate. This can of course be done in a variety of ways in most cases; in the sequel we shall consider some particular way of doing so. One important exception will be made here, however. That member of $T_k$ which gave rise to (20)* (i.e. which is identical with $Ct^{d[-1]}(a_1, \ldots, a_k, x_{k+1})$) will not be assigned any strong correlate in $T_{k+1}$; instead, we shall say that it is associated with (20)*. The same will of course apply to $T'_k$, $T'_{k+1}$, and (20).

When the strong correlation has been established, the seniority ranking may be carried over from $T_k$ to $T_{k+1}$ by stipulating that strong correlation preserves relative rank, and that members of $T_{k+1}$ which do not have any strong correlate in $T_k$ rank lower than any member which has such a correlate. The net effect of the transition from (7) to (20)* (i.e. from a member of $S_1$ to the next one) on the seniority ranking is thus that the ranking is preserved except that (i) one member of the highest rank gets lost and that (ii) a new rank will be created which is lower than all the old ones.

From this we can gather what happens in the long run in the sequence $S_1$. Every rank will become empty while lower and lower new ranks will (usually) be created. Every chain of strong correlations will come to an end with an a-constituent which does not have a strong correlate any
more but which is associated with the next member of $S_1$. Conversely, every member of $S_1$ is associated with an a-constituent of depth d-1 which occurs in its predecessor. Notice also that if we move backwards in $S_1$ no chain of weak correlations comes to an end until it reaches the first member of $S_1$.

What has been said so far applies in the first place to first-order logic without identity. In first-order logic with identity, the situation is slightly complicated by the modifications that were needed in the definition of relative reduction in section 13. They enter the present discussion through the operation (b) which was applied to (19) earlier in this section. The net effect of the complications here is that one of the members of $T_k$ may become weakly correlated, not with any member of $T_{k+1}$, but rather with $Ct^{d[-1]}(a_1, \ldots, a_k, x_{k+1})$. This member of $T_k$ will, however, be the one which is not going to have any strong correlate anyway. The fact that it may now also lack a weak correlate does not interfere with the way the seniority ordering is carried forward in $S_1$. Nor does it interfere with the statements made in the preceding paragraph.

This completes our explanation of how the sequence $S_1$ may be obtained from the given sequence $S_0$. We can see that such a sequence can always be obtained. It only remains to show that all the members of $S_1$ are simultaneously satisfiable. If we can do this, then we have shown that the first member of $S_1$ is satisfiable. In first-order logic with identity, the satisfiability of the first member of $S_0$ is thereby made obvious. In first-order logic without identity, the satisfiability of the first member of $S_0$ can be established by an argument similar to the one we shall give at the end of section 17.

16. Attributive constituents and model sets
The simultaneous satisfiability of all the members of $S_1$ may be proved by imbedding them into one and the same model set. 1) If no negation-signs are allowed except those which immediately precede an atomic formula, if no identities are admitted, and if the only propositional connectives are ~, &, and v, then a model set may be defined as a set of formulae - say $\mu$ - which satisfies the following conditions:

(C.~) If $F \in \mu$, then not $\sim F \in \mu$.
(C.&) If $(F \,\&\, G) \in \mu$, then $F \in \mu$ and $G \in \mu$.
(C.v) If $(F \lor G) \in \mu$, then $F \in \mu$ or $G \in \mu$ (or both).
(C.E) If $(Ex)F \in \mu$, then $F(a/x) \in \mu$ for at least one free individual symbol a.

Here F(a/x) is the result of replacing x everywhere by a in F, subject to the usual precautions concerning the binding of variables. The same notation will be used in what follows.

(C.U) If $(Ux)F \in \mu$ and if b is a free individual symbol occurring in at least one formula of $\mu$, then $F(b/x) \in \mu$.

1) For model sets and for a proof of the fact that imbeddability in a model set equals satisfiability (consistency), see my papers, Form and Content in Quantification Theory, Acta Philosophica Fennica 8 (1955) 11-55, and Notes on Quantification Theory, Societas Scientiarum Fennica, Commentationes physico-mathematicae 17, no. 12 (Helsinki 1955). Cf. also Modality and Quantification, Theoria (Lund, Sweden) 27 (1961) 119-128. It is convenient to assume here that (C.&) and (C.v) have been generalized so as to apply also to conjunctions and disjunctions with more than two members.
Satisfiability has to be interpreted here as satisfiability in an empty or non-empty domain of individuals. For the exclusive interpretation of quantifiers, the definition of a model set can obviously be modified by changing (C.E) and (C.U) as follows 1):

(C.E_ex) If $(Ex)F \in \mu$, then $F(a/x) \in \mu$ for at least one free individual symbol a which does not occur in F.
(C.U_ex) If $(Ux)F \in \mu$ and if the free individual symbol b occurs in the formulae of $\mu$ but not in F, then $F(b/x) \in \mu$.
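For a finite set of formulae the defining conditions of a model set, in both their ordinary and their exclusive versions, can be checked mechanically. The following Python sketch is an illustration only and rests on several assumptions that are not part of the paper: formulae are encoded as nested tuples, conjunctions and disjunctions are binary, every term that is not bound by a quantifier counts as a free individual symbol, and candidate witnesses for (C.E) are sought only among the symbols already occurring in the set.

```python
def substitute(f, var, sym):
    """F(sym/var): replace the variable var by sym, stopping where var is rebound."""
    kind = f[0]
    if kind == "atom":
        return ("atom", f[1], tuple(sym if t == var else t for t in f[2]))
    if kind == "not":
        return ("not", substitute(f[1], var, sym))
    if kind in ("and", "or"):
        return (kind, substitute(f[1], var, sym), substitute(f[2], var, sym))
    if f[1] == var:  # quantifier ("E" or "U") rebinding the same variable
        return f
    return (kind, f[1], substitute(f[2], var, sym))


def free_symbols(f, bound=frozenset()):
    """All terms of f that are not bound by a quantifier of f."""
    kind = f[0]
    if kind == "atom":
        return {t for t in f[2] if t not in bound}
    if kind == "not":
        return free_symbols(f[1], bound)
    if kind in ("and", "or"):
        return free_symbols(f[1], bound) | free_symbols(f[2], bound)
    return free_symbols(f[2], bound | {f[1]})


def is_model_set(mu, exclusive=False):
    """Check (C.~), (C.&), (C.v) and (C.E), (C.U), or their exclusive variants."""
    mu = set(mu)
    symbols = set().union(*(free_symbols(f) for f in mu))
    for f in mu:
        kind = f[0]
        if ("not", f) in mu:                                     # (C.~)
            return False
        if kind == "and" and not (f[1] in mu and f[2] in mu):    # (C.&)
            return False
        if kind == "or" and not (f[1] in mu or f[2] in mu):      # (C.v)
            return False
        if kind in ("E", "U"):
            # Exclusive reading: only symbols that do not occur in F are admitted.
            allowed = symbols - free_symbols(f) if exclusive else symbols
            hits = [substitute(f[2], f[1], s) in mu for s in allowed]
            if kind == "E" and not any(hits):                    # (C.E) / (C.E_ex)
                return False
            if kind == "U" and not all(hits):                    # (C.U) / (C.U_ex)
                return False
    return True


# Hypothetical example: (Ux)R(a, x) alone, and together with R(a, a).
R_aa = ("atom", "R", ("a", "a"))
all_x = ("U", "x", ("atom", "R", ("a", "x")))
```

With these definitions, `is_model_set({all_x})` fails on the ordinary reading because the instance R(a, a) is missing, whereas `is_model_set({all_x}, exclusive=True)` succeeds, since the only available symbol a already occurs in the quantified formula and is therefore exempt.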
These conditions can be geared more closely to the structure of constituents and attributive constituents. It is easily seen that an a-constituent is imbeddable in a model set and therefore satisfiable if it can be imbedded in a set A of attributive constituents which satisfies the following conditions (in these conditions (7) is thought of as an arbitrary a-constituent):

(C.ct~) If an atomic formula all of whose individual symbols are free occurs unnegated in a member of A, then it never occurs negated in any member of A.

1) In the terminology of the paper referred to in section 7 (footnote), p. 59, these conditions formulate the weakly exclusive interpretation of quantifiers.
(C.ctE) If (7) occurs in A, then for every a-constituent (9) of depth d-1 which occurs in (7) there is a free individual symbol b such that $Ct^{d-1}(a_1, \ldots, a_k, b) \in A$.

(C.ctU) If (7) occurs in A, then for every free individual symbol b which occurs in at least one member of A there is an a-constituent (9) of depth d-1 which occurs in (7) and which is such that $Ct^{d-1}(a_1, \ldots, a_k, b) \in A$.

A set A of a-constituents which satisfies these conditions will be called a constitutive model set 1). For the exclusive interpretation of quantifiers, the conditions (C.ctE) and (C.ctU) have of course to be modified in the same way as in (C.E_ex) and in (C.U_ex), respectively, by requiring that b does not occur in (9) - or in (7), which is the same thing. The resulting conditions will be called (C.ctE_ex) and (C.ctU_ex).

17. Completeness proof (concluded)
The simultaneous satisfiability of all the members of the sequence $S_1$ which was defined earlier will be proved first for first-order logic with identity. Subsequently, the modifications needed for the ordinary interpretation of quantifiers will be explained briefly.

On the exclusive interpretation of quantifiers, all we have to do in order to imbed $S_1$ in a constitutive model set is to form its closure with respect to the following operations:

(a) The operation of omitting a layer of quantifiers.
(b) The operation of omitting a free individual symbol (i.e. of omitting all the atomic formulae which contain this symbol, together with all the connectives which hence become idle).

1) Given a constitutive model set A, an ordinary model set $\mu$ is easily obtained as the closure of A with respect to the following operations: (i) Whenever a conjunction occurs in A, adjoin to A all its members. (ii) Whenever (7) occurs in A and b occurs in the formulae of A, adjoin the disjunction $\sigma_{i=1}^{t}\, Ct^{d-1}_i(a_1, \ldots, a_k, b)$ to A. This $\mu$ contains A as a part and has exactly the same free individual symbols as A.

The operation (b) requires a few comments. First, we shall restrict it by requiring that the free individual symbol omitted is not the distinguished (last) free individual symbol of an a-constituent. For instance, the symbol $a_k$ must not be omitted from (7). Otherwise (b) might not give an a-constituent as a result when applied to an a-constituent. Furthermore, in first-order logic with identity the operation (b) has to be modified in the same way and for the same reason as the operation (a). Suppose, for instance, that we are omitting the free individual symbol $a_i$ ($1 \le i \le k$) from (7); and suppose that $Ct^{d-e}_p(a_1, \ldots, a_k, x, y, \ldots)$ is an arbitrary a-constituent which occurs in (7) and whose outermost quantifiers are (Ez) and (Uz) (exclusively interpreted, of course). At the same time as we omit all the atomic formulae which contain $a_i$, we must add as a new member of the main conjunction the formula

$(Ez)\, Ct^{d-e[-1]}_p(a_1, \ldots, a_{i-1}, z, a_{i+1}, \ldots, a_k, x, y, \ldots)$

and as a new member of the outermost disjunction the formula

$Ct^{d-e[-1]}_p(a_1, \ldots, a_{i-1}, z, a_{i+1}, \ldots, a_k, x, y, \ldots).$

Let the closure of $S_1$ with respect to the operations (a) and (b), so qualified, be A. Then it is easy to see that A satisfies (C.ct~). For the purpose, consider an arbitrary atomic formula F which occurs in the formulae of A with all its individual symbols free. If it occurs (in this way) in a member of A, negated or unnegated, it must likewise occur, in view of the way A was obtained from $S_1$, in a member of $S_1$. Hence it suffices to verify (C.ct~) for $S_1$ alone. Now all the free individual symbols of $S_1$ are $a_1, a_2, \ldots$; we assume for simplicity that they are different from all the bound individual variables we are dealing with. If the last member of this sequence which occurs in F is $a_k$, then F occurs (negated or unnegated but not both) in exactly one member of $S_1$, viz. in (7), provided of course that k > j. But if $k \le j$, F cannot occur in any member of $S_1$ at all. Hence F cannot occur both negated and unnegated in the members of $S_1$, nor therefore in the members of A.

It is also easy to see that all the members of A satisfy the other two defining conditions of a constitutive model set provided that all the members of $S_1$ do so in A. Indeed, the relation between (7) and (9) which is mentioned in (C.ctE) and (C.ctU) and hence also in (C.ctE_ex) and (C.ctU_ex) continues to obtain if one of the operations (a) and (b) is applied to both of them. From this it follows in first-order logic without identity that the other (new) members of A satisfy (C.ctE) and (C.ctU)
DISTRIBUTIVE NORMAL FORMS IN FIRST-ORDER LOGIC
if the members of $S_1$ do so. In first-order logic with identity, there are two additional cases we have to worry about. They are caused by the fact that in omitting a free individual symbol on the exclusive interpretation a new attributive constituent of depth d-1 is added to each attributive constituent of depth d from which a free individual symbol is omitted. Let the latter be (7) and the former therefore (21). Now A might prima facie fail to satisfy (C.ctE_ex) because of the presence of this new a-constituent of depth d-1 in the modified form of (7), and it might prima facie fail to satisfy (C.ctU_ex) because a free individual symbol (viz. $a_i$) which formerly did occur in (7) does not do so any more and hence seems to open a new possibility of applying (C.ctU_ex). In both cases the violation of one of the two conditions is avoided because A is closed under (a) and hence contains the formula (cf. (21)):
Hence the only thing that remains for us to do in order to show that A is a constitutive model set is to prove that the members of $S_1$ satisfy the conditions (C.ctE) and (C.ctU) or (C.ctE_ex) and (C.ctU_ex), as the case may be, when they are considered as members of A. This may be proved by means of certain lemmata concerning $S_1$. These lemmata follow directly from the way in which $S_1$ was constructed.

LEMMA 1: Whenever $Ct_p^{d-1}(a_1, \ldots, a_k, x) \in T_k$ is weakly correlated with $Ct_q^{d-1}(a_1, \ldots, a_k, a_{k+1}, x) \in T_{k+1}$, the former results from the latter (up to notational variation, as usual) by omitting the free individual symbol $a_{k+1}$ in the sense of operation (b) defined earlier in this section.
This is, on reflection, just what being weakly correlated means. Lemma 1 can be generalized:
LEMMA 1*: Whenever $Ct_p^{d-1}(a_1, \ldots, a_k, x) \in T_k$ is connected by the ancestral of the weak correlation with $Ct_q^{d-1}(a_1, \ldots, a_k, a_{k+1}, \ldots, a_{k+l}, x) \in T_{k+l}$, the former results from the latter by omitting the free individual symbols $a_{k+1}, a_{k+2}, \ldots, a_{k+l}$.
Another useful lemma is the following:
JAAKKO HINTIKKA
LEMMA 2: Whenever $Ct_p^{d-1}(a_1, \ldots, a_k, x) \in T_k$ is associated with the next member (20)* of $S_1$, it results from the latter by omitting one layer of quantifiers.
This is just what being associated means. By means of Lemmata 1* and 2 we can show that the members of $S_1$ satisfy (C.ctE_ex) and (C.ctU_ex) in A. For this purpose, assume that (7) is a member of $S_1$ and that (22) occurs in (7). As pointed out in section 15, the chain of strong correlations which passes through (22) will always come to an end at some a-constituent which is not strongly correlated with any further a-constituent but which is instead associated with the next member of $S_1$, say with (23). Then by Lemmata 1* and 2 it follows that (22)* results from (23) by first omitting one layer of quantifiers and by then omitting the free individual symbols $a_{k+1}, a_{k+2}, \ldots, a_{k+l}$. But since A is closed under the operations (a) and (b), this implies that (22)* belongs to A. This means that (C.ctE) is satisfied by (7). Moreover, since $a_{k+1}$ does not occur in (7) or (22), it also means that (C.ctE_ex) is satisfied.

In order to verify (C.ctU_ex), assume that (7) occurs in $S_1$. Then every free individual symbol of A which does not occur in (7) (nor in the less deep a-constituents which occur in (7)) is of the form $a_{k+l}$. Consider now the first member of $S_1$ which contains $a_{k+l}$; let it be (23). Then (23) is associated with some member of $T_{k+l-1}$ which is in turn connected by a chain of weak correlations with some member of $T_k$, i.e. with some a-constituent of depth d-1 occurring in (7). Let this a-constituent be (22). Then from Lemmata 1* and 2 it follows in the same way as in the case of (C.ctE_ex) that (22)* results from (23) by means of the operations (a) and (b) and hence belongs to A. This suffices to verify (C.ctU_ex) for (7).

This completes our argument to the effect that all the members of $S_1$ are simultaneously satisfiable for a system with identity. For a system without identity, an additional argument is needed to take care of those cases of (C.ctU) in which b occurs in (7), i.e. in which b is an $a_i$ where
$1 \le i \le k$. In the special case of $S_1$ we know that each member (7) of $S_1$ satisfies the condition (C) of consistency. Hence there must be in (7) an a-constituent (9) of depth d-1 such that the bough of (7) determined by (9) is strongly symmetric with respect to x and $a_i$. Consider, then, the formula $Ct^{d-1}(a_1, \ldots, a_{k-1}, a_k, a_i)$ obtained from (9) by replacing x by $a_i$. By a straightforward argument whose details are here omitted, it can be shown that because of strong symmetry this formula is identical with the result of reducing (7) with respect to (9) and therefore also identical with the result of omitting one layer of quantifiers from (7). (Identity here means, as usual, identity except for the naming of bound variables and the order of conjunctions and disjunctions, and in this case also for the vacuous repetition of some members of conjunctions and disjunctions.) Since A is closed with respect to the operation (a), this implies that $Ct^{d-1}(a_1, \ldots, a_{k-1}, a_k, a_i) \in A$ and shows that the condition (C.ctU) is satisfied also in the cases which do not fall within the scope of (C.ctU_ex).

18. General considerations
This brings to an end our completeness proof for the disproof procedure described in section 14, both in first-order logic without identity and in one with it. As a dual of the disproof procedure we obtain a complete proof procedure for first-order logic. Proofs of this form have an easily surveyable structure, and most of their steps are quite innocent-looking. They may be taken to be linear, i.e. each line of the proof is a single formula which is obtained from its immediate predecessor. Most of these formulae are conjunctions; hence we may, if we want, split the proof into branches (Beweisfäden) which are tied together into the form of a tree by the simple inference rule $F, G \vdash (F \,\&\, G)$.

It may be worth while to list, by way of summary, the different kinds of step which will occur in these proofs (with the optional exception of the inference rule just mentioned). In listing these steps, we are looking at the proofs not in the direction from the axioms to the formula to be proved but in the opposite direction. The following operations are needed to convert a formula into the dual of its distributive normal form:
(1) Transformation into propositional normal form without affecting the parameters (P.1)-(P.3) of the formula in question.

(2) The distributivity of the universal quantifier with respect to conjunction.

(3) The possibility of reducing the scope of a universal quantifier by omitting from it members of disjunctions which do not contain the variable which is bound to the quantifier in question.

We must be able to carry out these operations also within a formula (i.e. to apply them to a subformula of a larger formula). Furthermore, we must of course assume the usual interrelation between the two quantifiers and the possibility of renaming bound variables. These operations also enable us to reach the dual of the second distributive normal form. In adding to the depth of the dual of a constituent we must assume something further:

(4) One may add to the depth of a formula by introducing propositionally redundant parts which do not change the other parameters (P.1)-(P.2).

Finally, in each branch of the proof or in each conjunct of the first line of the rest of the proof (looking at it now in the direction from axioms to the formula to be proved) we need exactly one initial operation which is the dual of the argument by means of which attributive constituents not satisfying (A)-(C) (or, in first-order logic with identity, (A)-(B)) were shown to be inconsistent. From section 10 it is seen that in the case of (A) and (B) these operations are, apart from certain inessential preparatory steps, essentially applications of the well-known exchange laws for adjacent quantifiers:

$\vdash (Ux)(Uy)F \equiv (Uy)(Ux)F.$

$\vdash (Ex)(Uy)F \supset (Uy)(Ex)F.$
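The two exchange laws just displayed can be checked mechanically. The following Lean 4 sketch states them for an arbitrary two-place predicate F over a single domain; the theorem names and the type variable are illustrative assumptions, and the quantifiers are read in the ordinary (inclusive) way, so the exclusive interpretation used elsewhere in this paper is not modelled.

```lean
-- Exchange of two adjacent universal quantifiers (an equivalence).
theorem univ_exchange {α : Type} (F : α → α → Prop) :
    (∀ x, ∀ y, F x y) ↔ (∀ y, ∀ x, F x y) :=
  ⟨fun h y x => h x y, fun h x y => h y x⟩

-- Moving an existential quantifier inside a universal one (one direction only).
theorem exist_univ_exchange {α : Type} (F : α → α → Prop) :
    (∃ x, ∀ y, F x y) → (∀ y, ∃ x, F x y) :=
  fun ⟨x, hx⟩ y => ⟨x, hx y⟩
```

Both proofs are one-liners, which fits the remark that the individual steps of these proofs are innocent-looking; the converse of the second law is of course not provable.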
In the case of (C) this initial step is essentially an application of the law

$\vdash (Ux)F \supset F(a/x)$

where a occurs in F. The application of these laws may perhaps be considered as the only
non-trivial step of the argument. In each initial conjunct, only one application of one of them is needed. In first-order logic with identity, a few unimportant complications arise because of the presence of identity. We shall not discuss them in detail here.

We also obtain two dual methods of proofs from assumptions. These proofs are interesting because they are linear; each line of the proof consists of a single formula.1) We may consider that method which turns on our original (second) distributive normal form. In order to prove G from F by means of this method, we may proceed as follows: First, we pool together the parameters (P.1)-(P.2) of F and G, take the maximum (say d) of their depths, and convert F and G into their respective normal forms $F^d$ and $G^d$ in terms of the parameters so obtained. Then we expand the normal forms $F^d$ and $G^d$ by splitting their constituents into disjunctions of deeper and deeper constituents while their other parameters are unchanged. At each depth we test all the constituents for consistency by means of (A)-(C) or (C)-(E) (or, if identities are present, by means of (A)-(B) or (D)-(E)) and omit all the ones which fail the test. Let us call the disjunctions of the remaining ones of depth d+e $F^{d+e}$ and $G^{d+e}$, respectively. Then if G really follows from F there will be an e such that all the members of $F^{d+e}$ are among the members of $G^{d+e}$. A proof of G from F will then proceed from F to $F^d$ to $F^{d+1}$ to ... to $F^{d+e}$ to $G^{d+e}$ to $G^{d+e-1}$ to ... to $G^{d+1}$ to $G^d$ to G. All the steps of this proof except the one from $F^{d+e}$ to $G^{d+e}$ are equivalences.

The same procedure gives us a complete method of equivalence proofs. If F and G are equivalent, then for some e the disjunctions $F^{d+e}$ and $G^{d+e}$ will have the same members and therefore be equivalent. In this case, all the steps of the proof from F to G will be equivalences.
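The membership test at the heart of this method can be illustrated in the much simpler propositional case. The sketch below is only an analogue under stated assumptions: it has no quantifiers, so every formula has depth 0, no expansion into deeper constituents is needed, and no consistency conditions have to be applied. A constituent is here a complete assignment of truth-values to the pooled proposition letters, and the function names `constituents` and `follows` are hypothetical.

```python
from itertools import product

def constituents(formula, letters):
    """Constituents of the normal form of `formula`: the truth-value
    assignments over `letters` (as tuples) under which it holds.
    `formula` is a Python function from a dict of truth-values to bool."""
    return {vals for vals in product([True, False], repeat=len(letters))
            if formula(dict(zip(letters, vals)))}

def follows(f, g, letters):
    """G follows from F iff every constituent of F is among those of G."""
    return constituents(f, letters) <= constituents(g, letters)

def equivalent(f, g, letters):
    """F and G are equivalent iff their normal forms have the same members."""
    return constituents(f, letters) == constituents(g, letters)

# Pooled parameters of the formulae compared (here: two proposition letters).
letters = ["p", "q"]
conj = lambda v: v["p"] and v["q"]          # p & q
disj = lambda v: v["p"] or v["q"]           # p v q
impl = lambda v: (not v["p"]) or v["q"]     # p > q, written as ~p v q
contra = lambda v: (not v["q"]) <= (not v["p"])  # ~q > ~p (bool <= is implication)

print(follows(conj, disj, letters))      # True
print(follows(disj, conj, letters))      # False
print(equivalent(impl, contra, letters)) # True
```

In the first-order case the same subset test is applied to the disjunctions of depth d+e, after the inconsistent constituents have been weeded out.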
In disproofs, proofs, proofs from premises, and equivalence proofs, we thus have to add to the depth of the formulae we are dealing with. Consider disproofs as an example. In order to disprove F, we bring it to its distributive normal form $F^d$ (where d is the depth of F) and keep adding to the depth of the constituents of $F^d$ until at some depth d+e all the subordinate constituents are inconsistent by our conditions. Here the
1) For an interesting discussion of a different standard form of linear proofs from assumptions, see William Craig, Linear Reasoning, A New Form of the Herbrand-Gentzen Theorem, Journal of Symbolic Logic 22 (1957) 250-268.
difference e between the depth of F and the depth at which the inconsistency of F becomes explicit may be considered as a kind of (rough) measure as to how deeply hidden the inconsistency of F is and therefore also as a measure of the amount of "development" or "synthesis" required to bring this inconsistency into the open. This idea seems to have some philosophical interest. It also applies to the other kinds of proofs we have mentioned. For instance, in the method of equivalence proofs just described the difference between d+e and the depth of F perhaps serves as a numerical measure of the amount of "synthesis" which has to be performed on F in order to bring out its equivalence with G.1)

It may be objected here that these suggested measures are all relative to the particular conditions of consistency which are being used in the proof in question, and hence not likely to have much general significance. That they depend on the conditions employed is true; but on the other hand it seems to me that in some rather elusive sense our conditions (A)-(E) of inconsistency are as strong as we can possibly hope natural conditions to be. Further work is needed here to clear up the situation.

From the way we proved the completeness theorem we can also read directions for a different standard form for first-order proofs which is much closer than the ones we have just mentioned to the earlier standard forms of Herbrand and Gentzen. Consider disproofs first. In this new standard form, each stage of the proof is again a disjunction of constituents, preceded by a transformation to the distributive normal form. Again, each disjunction is obtained from its immediate predecessor. The way in which this happens is now different, however. Instead of adding to the depth of our constituents we add new free individual symbols to them. In order to describe the procedure in more detail, assume that (8) occurs in one of the disjunctions.
We choose one of the a-constituents of depth d-1 occurring in (8), say (9); the principles of selection will be commented on later. We drop the existential quantifier (Ex) which precedes an occurrence of (9) in (8), and replace x everywhere in this occurrence of (9) by a new free individual symbol $a_{k+1}$. This leaves the parameter (P.1) of (8) and its depth unchanged, but adds a new member to the set (P.2). Hence we may transform the formula we have obtained into a disjunction of constituents with the same (P.1) and the same depth as (8) but with one more free individual symbol $a_{k+1}$. This procedure corresponds to the expansion of a constituent of depth d into a disjunction of subordinate constituents of depth d+1. To each of the new constituents we may apply the conditions (A)-(E) and eliminate those which turn out to be inconsistent. Each of the remaining ones is related to (8) in the same way as each member of $S_1$ was related to its predecessor. We can set up strong and weak correlations as well as associations in the same way as before and also define a seniority ranking among the a-constituents which occur in (8) and which have depth d-1 in the same way as in the completeness proof. For the particular a-constituent of depth d-1 in (8) into which the new free individual symbol is first introduced we may simply choose one of the highest ranking ones.

1) If the depths of F and G are identical, exactly the same amount of "synthesis" is required to convert F into G by the method just described as is required to disprove $\sim(F \equiv G)$, if these amounts are measured in the suggested way.

This disproof procedure is complete in the same way as the earlier one: Either at some stage of the procedure all the constituents are inconsistent by our conditions (A)-(E) or else we have a sequence of constituents which is just like the sequence $S_1$ of a-constituents and which can be shown to be satisfiable in the same way as $S_1$. The only new feature is that there is not just one way of going from one stage to the next one, for there usually is some choice in the selection of that a-constituent of depth d-1 in (8) into which the new free individual symbol is first introduced.

In fact, both here and in the completeness proof there may even be more choice than we have allowed so far. The situation is the same in the two cases; hence we may consider $S_1$ by way of example. Suppose that we construct $S_1$ without any regard to the selection of (20) in (19) (section 15). Then we can define weak correlation and association as before. We can also define strong correlation in a variety of ways in most cases.
From section 17 (from the argument which follows the lemmata) we can see that there is only one further thing which is needed in order for us to be able to carry out the completeness proof. This is that every chain of strong correlations eventually comes to an end when we proceed further and further in $S_1$. The use of the seniority ranking was simply an artifice for securing this end. If it can be obtained by other means, so much the better. In any case, the conventions concerning the seniority ranking need only be adhered to from some (arbitrarily late) stage on.
This suffices to describe the new disproof procedure. Similar procedures for proofs from premises and for equivalence proofs are obtained as a consequence, and a similar proof procedure as its dual. The number of new free individual symbols which are introduced in one of these proofs may be said to indicate the amount of synthesis performed in it. The situation is now more complicated than before in that a given formula F may now have different disproofs with different numbers of free individual symbols introduced in them. The smallest of these numbers may be taken, however, to indicate the amount of synthesis needed to bring out the hidden inconsistency of F. It may be shown that this measure of the amount of synthesis required to disprove F coincides with the measure suggested earlier. Similar considerations apply to the other types of proofs.

We have described a proof procedure in which the free individual symbols remain unchanged but in which the depth of our formulae grows steadily, and one in which the depth of our formulae remains intact while new free individual symbols are introduced. These are really the two ends of a long spectrum of mixed proofs in which our two methods are employed together.

The method by means of which the completeness theorem was proved is now seen to be tied to the second type of disproof (introduction of new free individual symbols) much more closely than to the first (adding to the depth of constituents), although it is the first one which was initially considered in the completeness proof. In fact, in the completeness proof we first transformed the increase in depth into an increase in the number of free individual symbols. This may seem a rather roundabout way of proving the completeness of the first main method of disproof. In fact, we could have avoided this complication, but only at the expense of complicating the proof considerably in other respects. For this reason, the "direct" proof will not be attempted here.
It would be of interest to construct a model for a sequence $S_0$ of deeper and deeper constituents which satisfy (A)-(E) and which are subordinate to their predecessors directly without going by way of the auxiliary sequence $S_1$, for such a construction promises us a survey of the different kinds of model which $S_0$ (essentially, an arbitrary consistent and complete theory) can have. An interesting line of development seems to open here, a line in which the
important results of Vaught's on models of complete theories (which e.g. entail the Ryll-Nardzewski $\aleph_0$-categoricity theorem) appear to assume a natural place.1)
1) For Vaught's results see Denumerable Models of Complete Theories, in Infinitistic Methods, Proceedings of the Symposium on Foundations of Mathematics, Warsaw, 2-9 September 1959, pp. 303-321 (Pergamon Press, Oxford and London 1961).
SEMANTICAL ANALYSIS OF INTUITIONISTIC LOGIC I
SAUL A. KRIPKE
Harvard University, Cambridge, Mass., USA
The present paper gives a semantical model theory for Heyting's intuitionist predicate logic, and proves the completeness of that system relative to the modelling. The model theory and completeness theorem were announced in [1]. The semantics for modal logic which we announced in [1] and developed in [2], [3], together with the known mappings of intuitionistic logic into the modal system S4, inspired the present semantics for intuitionist logic. It would in fact be possible to derive the completeness of Heyting's predicate logic in our semantics by using the mappings into S4 together with the results of [2], [3]. We prefer, however, to develop the semantics of intuitionistic logic independently of that of S4; this procedure will enable us, we believe, to obtain somewhat more information about intuitionistic logic, including the mapping into S4 as a consequence thereof.1) Further, a fairly recently worked-out development, not contained in the announcement of [1], is included: an exposition of Cohen's notion of "forcing" [5] in terms of the present semantics. In addition to giving a simple decision procedure for Heyting's propositional calculus, Part II will present a result not announced in [1] but mentioned in [4]: the undecidability of monadic intuitionistic quantification theory. The proof is based on the semantics previously developed.

1) The reader who wishes to understand thoroughly the deeper motivation of the present paper, however, is strongly urged to consult [2], [3], and [16], which give the underlying analysis of modal logic.

It should be mentioned that, for the pure implicational intuitionistic propositional logic, Beth [6] has announced the rediscovery of essentially the present modelling; also that, for all of intuitionist propositional logic, a modelling equivalent to ours can be extracted from the results of Lemmon and Dummett [7].1)

The results of this paper, though devoted to intuitionistic logic, are proved only classically, except as mentioned below. Intuitionistically, the situation is essentially the same as that for Beth's completeness theorem [8], as analysed by Dyson and Kreisel in [9]; a reader who is interested in intuitionistically valid proofs can consult [9] and apply a similar analysis to the present results. We will give indications below which (we believe) will be sufficient for a reader familiar with [9] to make such an analysis. In the course of these indications, we will prove some results about Kreisel's system FC which are parenthetical to the main theme of this paper. In particular, we will show that Kuroda's conjecture and Markov's principle are both refutable in FC.

Some notations that will be used throughout the paper are the following: $P^n$, $Q^n$, $R^n$ ($n \ge 0$) are n-adic predicate letters; a 0-adic predicate letter is usually called a "proposition(al) letter." Occasionally the superscript on a predicate letter will be omitted if this does not sacrifice clarity. We use letters x, y, z, ..., with or without subscripts, as (individual) variables. The formulae of the intuitionistic propositional calculus are to be built out of the usual connectives $\wedge$, $\vee$, $\supset$, $\neg$, starting with the propositional letters as atomic formulae. In the predicate calculus, not only propositional letters but also formulae $P^n(x_1, \ldots, x_n)$ are taken as atomic; thence formulae are built up from these in the usual manner, using the connectives just given and the quantifiers $(x)$ and $(\exists x)$. We use A, B, C, ..., for arbitrary formulae of propositional or predicate calculus, depending on the context; if we wish to call attention to certain free variables in a formula, we use such notations as $A(x_1, \ldots, x_n)$.
We assume, finally, that the reader is familiar with standard presentations of Heyting's formalized intuitionistic propositional and predicate calculus, say the presentation in [10].
1) Kreisel's conjectured "reinterpretation of the (intuitionistic) logical constants" in [17] is also, if his conjectures prove correct, related to the present model theory.
1. The model theory
We define an (intuitionistic) model structure (m.s.) to be an ordered triple (G, K, R) where K is a set, G is an element of K, and R is a reflexive and transitive relation on K. An (intuitionistic) model on a m.s. (G, K, R) is a binary function $\phi(P, H)$, where P ranges over arbitrary proposition letters 1) and H ranges over elements of K, whose range is the set {T, F}, and which satisfies the following condition: if $\phi(P, H) = T$ and $H R H'$ ($H, H' \in K$), then $\phi(P, H') = T$.

Given a model $\phi(P, H)$, we can define a value $\phi(A, H)$ (= T or F) for an arbitrary formula A of propositional calculus by induction on the number of connectives in A. If A has no connectives, then it is a proposition letter and $\phi(A, H)$ = T or F has already been defined for each H. Assume that $\phi(A, H)$ and $\phi(B, H)$ have already been defined. Then we stipulate:

a) $\phi(A \wedge B, H) = T$ iff $\phi(A, H) = \phi(B, H) = T$; otherwise, $\phi(A \wedge B, H) = F$.

b) $\phi(A \vee B, H) = T$ iff $\phi(A, H) = T$ or $\phi(B, H) = T$; otherwise, $\phi(A \vee B, H) = F$.

c) $\phi(A \supset B, H) = T$ iff for all $H' \in K$ such that $H R H'$, $\phi(A, H') = F$ or $\phi(B, H') = T$; otherwise, $\phi(A \supset B, H) = F$.

d) $\phi(\neg A, H) = T$ iff for all $H' \in K$ such that $H R H'$, $\phi(A, H') = F$; otherwise, $\phi(\neg A, H) = F$.
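For a finite model structure the clauses (a)-(d) can be evaluated directly. The following Python sketch is an illustration only: the tuple encoding of formulae, the representation of R as a set of pairs, and the names `value`, `successors` and `is_persistent` are assumptions made for the example, and the evaluation is carried out classically.

```python
# Formulae as nested tuples (a hypothetical encoding, not from the text):
#   ("var", "P"), ("and", A, B), ("or", A, B), ("imp", A, B), ("not", A)

def successors(R, h):
    """All H' with h R H'; R is a set of pairs, assumed reflexive and transitive."""
    return [b for (a, b) in R if a == h]

def value(phi, R, formula, h):
    """True (T) or False (F) for `formula` at the element h, by clauses (a)-(d)."""
    tag = formula[0]
    if tag == "var":
        return phi[(formula[1], h)]
    if tag == "and":   # clause (a): evaluated at h alone
        return value(phi, R, formula[1], h) and value(phi, R, formula[2], h)
    if tag == "or":    # clause (b): evaluated at h alone
        return value(phi, R, formula[1], h) or value(phi, R, formula[2], h)
    if tag == "imp":   # clause (c): quantifies over all H' with h R H'
        return all(not value(phi, R, formula[1], h2) or value(phi, R, formula[2], h2)
                   for h2 in successors(R, h))
    if tag == "not":   # clause (d): quantifies over all H' with h R H'
        return all(not value(phi, R, formula[1], h2) for h2 in successors(R, h))
    raise ValueError(f"unknown connective: {tag!r}")

def is_persistent(phi, R, letters):
    """The condition on models: phi(P, H) = T and H R H' imply phi(P, H') = T."""
    return all(phi[(p, b)] for p in letters for (a, b) in R if phi[(p, a)])

# A two-element structure: K = {G, H}, with G R H but not H R G.
R = {("G", "G"), ("G", "H"), ("H", "H")}
phi = {("P", "G"): False, ("P", "H"): True}
P = ("var", "P")
excluded_middle = ("or", P, ("not", P))
double_negation = ("imp", ("not", ("not", P)), P)

print(is_persistent(phi, R, ["P"]))                # True
print(value(phi, R, excluded_middle, "G"))         # False
print(value(phi, R, double_negation, "G"))         # False
```

On this structure both P v ~P and ~~P > P receive the value F at G, so the example is a countermodel for each in the sense of validity at G.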
Notice that the conditions on $\wedge$ and $\vee$ are exact analogues of the corresponding conditions on classical conjunction and disjunction; but the conditions on $\supset$ and $\neg$ are not analogous to the classical conditions.

It is easy to show by induction, for any $H, H' \in K$ such that $H R H'$, that if $\phi(A, H) = T$, then $\phi(A, H') = T$. This property has been stipulated for a propositional letter, and it follows for more complex formulae using the clauses (a)-(d). Notice that, intuitionistically, the inductive definition here given does not work, since it clearly appeals to the law of excluded middle in clauses (c) and (d) (e.g., in (d), either for all H', $\phi(A, H') = F$, or not). Thus intuitionistically, it would be best to define a model $\phi$ as a mapping $\phi(A, H)$ into {T, F}, where A ranges over arbitrary formulae of propositional calculus, and which happens to satisfy the clauses (a)-(d) as well as the condition that $\phi(P, H) = T$ and $H R H'$ implies $\phi(P, H') = T$. Clearly, from the classical viewpoint, this modification leaves the notion of a model essentially unchanged.

1) In [2], we let $\phi(P, H)$ range over $H \in K$ and atomic subformulae of a fixed formula A. We called this a model of A. We could equally well have adopted this orientation here; conversely [2] could have adopted, mutatis mutandis, the present definition. The viewpoint of [2] is exploited in the analysis of Cohen's "forcing", where we consider models assigning values only to formulae built out of a fixed atomic formula P(x). We should also remark that, although in this section we have taken the atomic formulae to be proposition letters and formulae $P^n(x_1, \ldots, x_n)$, the definitions would equally well go through if formulae were built out of an arbitrary fixed class of atomic formulae; this fact is exploited in the "provability interpretation," section 1.3, below.

We call a formula A of propositional calculus valid iff $\phi(A, G) = T$ for every model $\phi$ on a model structure (G, K, R). A model $\phi$ on a m.s. (G, K, R), such that $\phi(A, G) = F$, is called a countermodel for A.

To extend the modelling to quantification theory, we define a quantificational model structure (q.m.s.) to be a model structure (G, K, R), together with a function $\psi$ (the "domain function"), defined on K, such that $\psi(H)$ is a non-empty set for all $H \in K$, and $\psi(H) \subseteq \psi(H')$ if $H R H'$ ($H, H' \in K$). (Intuitionistically, we require that $\psi(H)$ not only be non-empty, but that it contains at least one element; of course, a species may be known not to be empty without any particular element thereof being known.) We define a quantificational model $\phi$ on a q.m.s. (G, K, R) to be a function $\phi(P^n, H)$, where $P^n$ ranges over n-adic predicate letters (for all n), and H ranges over elements of K. If n = 0, $\phi(P^n, H)$ = T or F, and if $n \ge 1$, $\phi(P^n, H)$ is a subset of the Cartesian product $[\psi(H)]^n$. We again require for n = 0, that if $H R H'$, and $\phi(P^n, H) = T$, $\phi(P^n, H') = T$; for $n \ge 1$, analogously we require that if $H R H'$, $\phi(P^n, H) \subseteq \phi(P^n, H')$. Let

$U = \bigcup_{H \in K} \psi(H)$.
Given a quantificational model $\phi$, we can define, for each formula A of intuitionistic quantification theory, a value $\phi(A, H)$ = T or F, for each $H \in K$, relative to a fixed assignment of elements of U to the free individual variables of A. If A is an atomic formula, it is either a propositional letter P, in which case $\phi(P, H)$ = T or F is given, or it is a formula $P^n(x_1, \ldots, x_n)$
($n \ge 1$). In this latter case, let elements $a_1, \ldots, a_n$ of U be assigned to $x_1, \ldots, x_n$; then we can define, relative to this assignment, $\phi(P^n(x_1, \ldots, x_n), H) = T$ iff $(a_1, \ldots, a_n) \in \phi(P^n, H)$, and $\phi(P^n(x_1, \ldots, x_n), H) = F$ iff $(a_1, \ldots, a_n) \notin \phi(P^n, H)$. Given this assignment to atomic formulae, we can build up the assignment to more complex formulae by induction. Suppose $A(x_1, \ldots, x_n, y)$ is a formula, where at most the variables indicated occur free in $A(x_1, \ldots, x_n, y)$. Assume that, relative to each assignment of elements of U to $x_1, \ldots, x_n, y$, a truth-value $\phi(A(x_1, \ldots, x_n, y), H)$ has been defined for each H. We can then obtain values for $\phi((y)A(x_1, \ldots, x_n, y), H)$ and $\phi((\exists y)A(x_1, \ldots, x_n, y), H)$ as follows. Let the elements $a_1, \ldots, a_n$ of U be assigned to the variables $x_1, \ldots, x_n$. Then:
e) We say $\phi((\exists y)A(x_1, \ldots, x_n, y), H) = T$ iff there is a $b \in \psi(H)$ such that $\phi(A(x_1, \ldots, x_n, y), H) = T$ when $x_1, \ldots, x_n$ are assigned $a_1, \ldots, a_n$, respectively, and y is assigned b; otherwise $\phi((\exists y)A(x_1, \ldots, x_n, y), H) = F$.
f) We say $\phi((y)A(x_1, \ldots, x_n, y), H) = T$ iff for each $H' \in K$ such that $H R H'$, $\phi(A(x_1, \ldots, x_n, y), H') = T$ when $x_1, \ldots, x_n$ are assigned $a_1, \ldots, a_n$, and y is assigned any element b of $\psi(H')$; otherwise, $\phi((y)A(x_1, \ldots, x_n, y), H) = F$.
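Clauses (e) and (f) can be added to a finite evaluator in the same way as (a)-(d). The Python sketch below is an illustration under stated assumptions: formulae are nested tuples, $\phi(P^n, H)$ is stored as a set of n-tuples (with a 0-adic letter counted as T exactly when its set contains the empty tuple), an assignment is a dict from variables to elements of U, and the names `val` and `successors` are hypothetical. The sample structure has two elements with a growing domain.

```python
def successors(R, h):
    """All H' with h R H'; R is a set of pairs, assumed reflexive and transitive."""
    return [b for (a, b) in R if a == h]

def val(model, formula, h, env):
    """True (T) or False (F) for `formula` at h, relative to the assignment env."""
    R, psi, phi = model
    tag = formula[0]
    if tag == "atom":
        _, pred, args = formula
        return tuple(env[v] for v in args) in phi[(pred, h)]
    if tag == "and":
        return val(model, formula[1], h, env) and val(model, formula[2], h, env)
    if tag == "or":
        return val(model, formula[1], h, env) or val(model, formula[2], h, env)
    if tag == "imp":
        return all(not val(model, formula[1], h2, env) or val(model, formula[2], h2, env)
                   for h2 in successors(R, h))
    if tag == "not":
        return all(not val(model, formula[1], h2, env) for h2 in successors(R, h))
    if tag == "exists":   # clause (e): a witness must lie in psi(h) itself
        _, y, body = formula
        return any(val(model, body, h, {**env, y: b}) for b in psi[h])
    if tag == "forall":   # clause (f): every H' with h R H' and every b in psi(H')
        _, y, body = formula
        return all(val(model, body, h2, {**env, y: b})
                   for h2 in successors(R, h) for b in psi[h2])
    raise ValueError(f"unknown operator: {tag!r}")

# Two elements G R H; the domain grows from {a} to {a, b}.
R = {("G", "G"), ("G", "H"), ("H", "H")}
psi = {"G": {"a"}, "H": {"a", "b"}}
phi = {("P", "G"): {("a",)}, ("P", "H"): {("a",)},
       ("Q", "G"): set(),    ("Q", "H"): {()}}      # Q is 0-adic: F at G, T at H
model = (R, psi, phi)

Px, Q = ("atom", "P", ("x",)), ("atom", "Q", ())
antecedent = ("forall", "x", ("or", Px, Q))          # (x)(P(x) v Q)
consequent = ("or", ("forall", "x", Px), Q)          # (x)P(x) v Q

print(val(model, antecedent, "G", {}))                        # True
print(val(model, consequent, "G", {}))                        # False
print(val(model, ("imp", antecedent, consequent), "G", {}))   # False
```

The universal quantifier looks ahead to all later elements and their possibly larger domains, whereas the existential quantifier is evaluated at the present element only; this asymmetry is what makes the implication fail at G here.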
Finally, we stipulate that if truth-values $\phi(A, H)$ and $\phi(B, H)$ (for all $H \in K$), are given relative to an assignment to the free variables of A and B, then corresponding values $\phi(A \wedge B, H)$, $\phi(A \vee B, H)$, $\phi(A \supset B, H)$, and $\phi(\neg A, H)$ are to be defined according to the prescriptions (a)-(d).

To get a proper intuitionistic definition of model, we should again modify the given conditions and stipulate that a model $\phi$ is a function $\phi(P^n, H)$ as above, together with a function $\phi(A, H)$, assigning T or F to $\phi(A, H)$ relative to a given assignment of elements of U to the free variables of A, and satisfying the previously stated conditions (e.g., that $\phi(P^n(x_1, \ldots, x_n), H) = T$ when $x_i$ is assigned $a_i$ ($1 \le i \le n$) iff $(a_1, \ldots, a_n) \in \phi(P^n, H)$). Again, this definition is classically substantially equivalent to the old one.

We note that all of the results to follow would remain valid if we allowed [...] B, H) formalizes this requirement. Consider the following two point countermodel to the law of excluded middle:
H
G
Figure 2.
We have $\phi(P, H) = T$, $\phi(P, G) = F$. Since $\phi(P, H) = T$, $\phi(\neg P, G) = F$, and hence $\phi(P \vee \neg P, G) = F$. Intuitively, at the present situation G, we have not yet proved P; nor can we assert $\neg P$, since the possibility remains that we will get enough information later to advance to H and assert P. Thus, at the point G, we are not in a position to assert $P \vee \neg P$.

These considerations can readily be formulated in terms of Kreisel's theory FC of absolutely free choice sequences. Intuitively, an absolutely free choice sequence (a.f.c.s.) is a free choice sequence $\alpha$, chosen from a given spread S, in which it is stipulated from the beginning that no restrictions, other than the conditions defining the spread S, can ever be placed on the choices. Figure 2, then, for example, can be interpreted in terms of the present theory as follows: Consider a.f.c.s.'s from the spread S consisting of free choices of 0's and 1's, in which, however, 1 can be followed only by 1. Intuitively, we interpret the situation G as a choice of 0 and H as a choice of 1. Since, starting with G, we can remain "stuck" at G as long as we like, we permit 0 to be followed by an arbitrary number of 0's as well as by 1; but, since H is followed only by itself, we permit 1 to be followed only by 1. Then $P(\alpha)$ is the assertion "a 1 occurs on the a.f.c.s. $\alpha$" (i.e., $(\exists n)(\alpha(n) = 1)$, where n ranges over natural numbers). As long as we have chosen only 0's in $\alpha$, we have not established $P(\alpha)$; but on the other hand, since $\alpha$ is chosen with no restrictions other than being in S, we cannot exclude the possibility of the choice of a 1 later, so we cannot establish $\neg P(\alpha)$. These considerations can be formalized easily in Kreisel's FC so as to yield a proof of $\neg(\alpha \in S)(P(\alpha) \vee \neg P(\alpha))$, where $\alpha \in S$ ranges over a.f.c.s.'s in S.

More generally, given any (intuitionistically defined) countable tree model $\phi$ of A on a m.s. (G, K, R), suppose we identify the nodes (elements of K) with natural numbers, identifying G in particular with 0.
Define in terms of (G, K, R) a spread S consisting of all free choice sequences in which the initial choice is 0, and the choice of any natural number m must be followed either by a further choice of m or by a choice of some successor ofm on the tree. To any atomic subformulaP of A, and a.f.c.s. a in S, associate a formula P{a) abbreviating (3x) (3m) (a{x) = m and ¢(P, m) = T). Given B, C, and associated formulae B{a) and C{a), associate with B A C, B{a) A C{a); with B v C, B{a) v C{a), etc. Then, it is easily seen by induction that, for any subformula B of A,
SEMANTICAL ANALYSIS OF INTUITIONISTIC LOGIC I
if $\phi(B, m) = T$, then $(\alpha \in S)((\exists x)(\alpha(x) = m) \supset B(\alpha))$, and if $\phi(B, m) = F$, $\neg(\alpha \in S)((\exists x)(\alpha(x) = m) \supset B(\alpha))$. In particular, if $\phi(B, G) = T$ [$= F$], then since every a.f.c.s. in $S$ contains $0$ $(= G)$, we have $(\alpha \in S)B(\alpha)$ [$\neg(\alpha \in S)B(\alpha)$]. If the m.s. $(G, K, R)$ and model $\phi$ can be formally described in Kreisel's FC, the preceding reasoning can be formalized in FC, and thus in particular, if $\phi(A, G) = F$, $\vdash \neg(\alpha \in S)A(\alpha)$ in FC, giving a counterexample to the validity of $A$.
To extend this treatment to quantifiers, consider first the following countermodel to $(x)(P(x) \vee Q) .\supset. (x)P(x) \vee Q$:
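Figure 3. Two nodes with $G R H$: the node $G$, with domain $\{a\}$, verifying $P(a)$; and the node $H$, with domain $\{a, b\}$, verifying $P(a)$ and $Q$.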
We have $\phi(P(x), G) = \phi(P(x), H) = T$ when $x$ is assigned $a$, but $\phi(P(x), G) = \phi(P(x), H) = F$ when $x$ is assigned $b$. Further, $\phi(Q, G) = F$, $\phi(Q, H) = T$, $G R H$ but not $H R G$, and $\psi(G) = \{a\}$, $\psi(H) = \{a, b\}$. All this information is included in the diagram. It is easily verified that $\phi((x)(P(x) \vee Q), G) = T$, but $\phi((x)P(x) \vee Q, G) = F$.
Intuitively, we can interpret the situation as follows: Identify the elements $a$ and $b$ with the integers 0 and 1, respectively. Let $R$ be Fermat's last theorem, and let $Q$ be $R \vee \neg R$. Let $V$ be the species containing 0, and containing 1 if $Q$ is true (i.e., $V = \{m \mid m = 0 \vee (m = 1 \wedge Q)\}$), and let $x$ be a variable ranging over $V$. Let $P(x)$ be the statement $x = 0$. Then, already at the present situation $G$, we can assert $V \subseteq \{0, 1\}$, and $1 \in V$ iff $Q$; so we can assert $(x)(P(x) \vee Q)$. But so long as we have not advanced to the situation $H$, where Fermat's last theorem has been decided, so that we can assert $Q$, we cannot assert $(x)P(x) \vee Q$.
N.B. It should be remarked that $(x)(P(x) \vee Q) .\supset. (x)P(x) \vee Q$ holds in any quantificational model such that $\psi(H)$ is constant.
Thus, in general, if the variables in a formula $A$ range over a domain $D$, then for each situation $H$, $\psi(H)$ is the species of all individuals known to be in $D$ on the basis of the information available at $H$. (So, in the case of the paragraph above, at the present situation $G$, $\psi(G) = \{0\}$; but when at $H$, $Q$ has been proved, $\psi(H) = \{0, 1\}$.) Since $D$ is to contain an
element, we must know at least one element of $D$ from the outset, so that $\psi(G)$ must contain at least one element. The restriction that $H R H'$ is to imply $\psi(H) \subseteq \psi(H')$ should now be obvious on the intended interpretation. Notice that, to assert in a situation $H$ that for every element $x$ of $D$, $P(x)$ is true, we must know not only that $P(x)$ is true for every $x$ in $\psi(H)$, but also that it is true for every $x$ which may later be proved to be in $D$; i.e., for every $x$ in $\psi(H')$, where $H R H'$; and this is exactly the inductive clause for universal quantification. On the other hand, to assert the existence of an $x$ in $D$ such that $P(x)$ is true, we need to find an element $x$ which has already been proved to be in $D$ (i.e., which is in $\psi(H)$), and such that $P(x)$ is true; and this is exactly what the condition on existential quantification requires.
These facts can again be stated more formally in terms of the theory of absolutely free choice sequences. Suppose we are given an (intuitionistically defined) countable tree m.s. $(G, K, R)$ in which $D$, and hence $\psi(H)$ for each $H$, is countable. Then, we can identify both the elements of $K$ and the elements of $D$ with natural numbers, identifying $G$ in particular with 0. We then associate with $(G, K, R)$ a spread $S$ of absolutely free choice sequences, defined just as above. Further, for any a.f.c.s. $\alpha$ in $S$, let $D_\alpha$ be the species of all natural numbers $n$ such that there is a natural number $x$ such that $n \in \psi(\alpha(x))$. Let $x_\alpha$ be a variable ranging over $D_\alpha$ (i.e., $(x_\alpha)(\ldots)$ is to be interpreted as $(x)(x \in D_\alpha \supset \ldots)$, and similarly for $(\exists x_\alpha)$). Then since $\alpha(0) = 0 = G$, and $\psi(G)$ contains a natural number, $D_\alpha$ has an element for all $\alpha$. Let $\phi$ be an (intuitionistically defined) q. model on $(G, K, R)$ for some formula $A$. Given any atomic subformula $P^n(x_1, \ldots, x_n)$, and an a.f.c.s. $\alpha$ of $S$, we associate with these two an assertion $P(\alpha, x_{1\alpha}, \ldots, x_{n\alpha})$, where the variables $x_{1\alpha}, \ldots, x_{n\alpha}$ range over $D_\alpha$, and where $P(\alpha, x_{1\alpha}, \ldots, x_{n\alpha})$ says that $\phi(P^n(x_1, \ldots, x_n), m) = T$ for some $m$ on $\alpha$, when $x_{i\alpha}$ is assigned to the variable $x_i$ ($i = 1, \ldots, n$; note that $x_{i\alpha} \in D_\alpha \subseteq D$). Given formulae $A(\alpha, x_{1\alpha}, \ldots, x_{n\alpha})$ and $B(\alpha, y_{1\alpha}, \ldots, y_{m\alpha})$ associated with $A(x_1, \ldots, x_n)$ and $B(y_1, \ldots, y_m)$, respectively, associate $A(\alpha, x_{1\alpha}, \ldots, x_{n\alpha}) \wedge B(\alpha, y_{1\alpha}, \ldots, y_{m\alpha})$ with $A(x_1, \ldots, x_n) \wedge B(y_1, \ldots, y_m)$, and similarly for the other connectives. Further, associate $(x_{i\alpha})A(\alpha, x_{1\alpha}, \ldots, x_{n\alpha})$ with $(x_i)A(x_1, \ldots, x_n)$, and similarly for the existential quantifier. Then, we prove, by induction, that, for any $m \in K$, if $A(x_1, \ldots, x_n)$ contains only the free variables listed and $x_1, \ldots, x_n$ are assigned $a_1, \ldots, a_n \in \psi(m)$, then if $\phi(A(x_1, \ldots, x_n), m) = T$ [$= F$] relative to this assignment, we have in FC $(\alpha \in S)((\exists x)(\alpha(x) = m) \supset A(\alpha, a_1, \ldots, a_n))$ [$\neg(\alpha \in S)((\exists x)(\alpha(x) = m) \supset A(\alpha, a_1, \ldots, a_n))$]. In particular, if $m = 0 = G$, since $(\alpha \in S)(\exists x)(\alpha(x) = 0)$, we get $(\alpha \in S)A(\alpha, a_1, \ldots, a_n)$ [$\neg(\alpha \in S)A(\alpha, a_1, \ldots, a_n)$]. Thus, if $A$ does not contain free variables, and $\phi(A, G) = F$, we get a proof in FC that $A$ is not generally valid.
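The truth conditions used in the countermodels of Figures 2 and 3 can be checked mechanically on finite model structures. The following is a minimal sketch, not part of the original text: the class name `KripkeModel`, the tuple encoding of formulae, and the labels `"G"`, `"H"`, `"a"`, `"b"` are all illustrative assumptions, and the relation `R` is simply assumed to be reflexive and transitive.

```python
class KripkeModel:
    """Finite quantificational model with variable domains (hypothetical encoding).

    nodes  -- the evidential situations (elements of K)
    R      -- set of pairs (h, k), assumed reflexive and transitive
    domain -- psi: node -> set of individuals known at that node
    facts  -- node -> set of verified atomic facts, e.g. ("P", "a") or ("Q",)
    """

    def __init__(self, nodes, R, domain, facts):
        self.nodes, self.R, self.domain, self.facts = nodes, R, domain, facts

    def later(self, h):
        # all situations H' with h R H' (including h itself, by reflexivity)
        return [k for k in self.nodes if (h, k) in self.R]

    def holds(self, f, h, env=None):
        env = env or {}
        op = f[0]
        if op == "atom":   # ("atom", "P", "x") or ("atom", "Q")
            return (f[1],) + tuple(env[v] for v in f[2:]) in self.facts[h]
        if op == "and":
            return self.holds(f[1], h, env) and self.holds(f[2], h, env)
        if op == "or":     # disjunction is evaluated at the node itself
            return self.holds(f[1], h, env) or self.holds(f[2], h, env)
        if op == "not":    # no later situation verifies the formula
            return all(not self.holds(f[1], k, env) for k in self.later(h))
        if op == "imp":
            return all(not self.holds(f[1], k, env) or self.holds(f[2], k, env)
                       for k in self.later(h))
        if op == "all":    # ranges over the domains of all later situations
            return all(self.holds(f[2], k, {**env, f[1]: d})
                       for k in self.later(h) for d in self.domain[k])
        if op == "ex":     # the witness must already be in psi(h)
            return any(self.holds(f[2], h, {**env, f[1]: d})
                       for d in self.domain[h])
        raise ValueError(op)


R = {("G", "G"), ("G", "H"), ("H", "H")}

# Figure 2: P is verified only at H.
fig2 = KripkeModel(["G", "H"], R, {"G": {"a"}, "H": {"a"}},
                   {"G": set(), "H": {("P",)}})
P0 = ("atom", "P")
excluded_middle = ("or", P0, ("not", P0))

# Figure 3: psi(G) = {a}, psi(H) = {a, b}; Q is verified only at H.
fig3 = KripkeModel(["G", "H"], R, {"G": {"a"}, "H": {"a", "b"}},
                   {"G": {("P", "a")}, "H": {("P", "a"), ("Q",)}})
Px, Q = ("atom", "P", "x"), ("atom", "Q")
premise = ("all", "x", ("or", Px, Q))
conclusion = ("or", ("all", "x", Px), Q)

print(fig2.holds(excluded_middle, "G"))                       # False
print(fig3.holds(premise, "G"), fig3.holds(conclusion, "G"))  # True False
```

The sketch only evaluates the inductive clauses on a finite structure; it does not attempt the translation into FC described above.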
To translate, then, the example given above into FC, notice that, where $B$ is the full binary spread,
(a) $(\alpha \in B)(x)\bigl((\exists y)(\alpha(y) = x) \supset (x = 0 \vee (\exists y)(\alpha(y) = 1))\bigr)$,
but also
(b) $\neg(\alpha \in B)\bigl((x)((\exists y)(\alpha(y) = x) \supset x = 0) \vee (\exists y)(\alpha(y) = 1)\bigr)$.
Thus we have refuted the "law" $(x)(P(x) \vee Q) .\supset. (x)P(x) \vee Q$; for if it held, it would hold for any free choice sequence $\alpha$, with $x$ ranging over the species of all $z$ such that $(\exists y)(\alpha(y) = z)$, contrary to (a) and (b). Notice that, since (a) is a triviality and (b) follows from the fan theorem, we could simply have used the ordinary theory of free choice sequences instead of FC.
We remark that, following Dyson and Kreisel [9], the countermodels in FC that we have described, assigning certain infinite sequences of natural numbers to formulae, can classically be interpreted as countermodels in Baire space (the space of all sequences of natural numbers, with the usual topology). In fact, by examination of the countermodels actually produced below, it follows that every unprovable formula has a countermodel in the Cantor set, as Dyson and Kreisel assert.
The following remarks on the uses of absolutely free choice sequences are not relevant to the main point of the present paper, but will be added here:
REMARK.
1. All the theorems which are proved in the last chapter of Heyting [12], using Brouwer's method of free choice sequences depending on the solving of problems, can be carried out in FC. To take the first example given by Heyting: to show that it is absurd that, for every real number $a$, $a \neq 0$ should imply $a \mathrel{\#} 0$. For if this were true, then for any free choice sequence $\alpha$ in the binary spread, by associating with $\alpha$ the real number
$\sum_{x=0}^{\infty} \alpha(x)/2^{x}$,
we could show that $\neg(x)(\alpha(x) = 0) \supset (\exists x)(\alpha(x) = 1)$; hence, in particular, this would hold for absolutely free choice sequences. But it is easy to show, in FC, that $(\alpha \in B)\neg(x)(\alpha(x) = 0)$. Hence we need only show in FC that $\neg(\alpha \in B)(\exists x)(\alpha(x) = 1)$; but this easily follows from the fan theorem, since $(\alpha \in B)(\exists x)(\alpha(x) = 1)$ would imply $(\exists m)(\alpha \in B)(\exists x \leq m)(\alpha(x) = 1)$, which is absurd. Similar treatments are possible for all the refutations of classical theorems treated by Heyting by this method in [12].
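The absurdity of a uniform bound $m$ in the last step can be made concrete: for any proposed $m$, the initial segment consisting of $m + 1$ zeros is admissible in the binary spread and contains no 1, while it can still be continued by a 1. The sketch below is only an illustration of that arithmetic under assumed names (`real_of`, `refute_bound`); it works with finite initial segments and says nothing about FC itself.

```python
from fractions import Fraction


def real_of(prefix):
    """Partial sum of alpha(x) / 2**x over a finite initial segment."""
    return sum(Fraction(bit, 2 ** x) for x, bit in enumerate(prefix))


def refute_bound(m):
    """Binary initial segment alpha(0), ..., alpha(m) in which no 1 occurs."""
    return [0] * (m + 1)


for m in range(5):
    zeros = refute_bound(m)
    # no 1 occurs up to position m, so the proposed bound m fails
    assert 1 not in zeros and real_of(zeros) == 0
    # yet the same segment may be continued by a 1, giving a positive real
    assert real_of(zeros + [1]) == Fraction(1, 2 ** (m + 1))
```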
I think it probable that such treatments in FC will extend to all the counterexamples to classical theorems which Brouwer gives by his method; but I have not made a survey of the literature. A careful reader of the present section on the interpretation of our models will find it plausible that, conversely, a good deal of the interpretation, at least for propositional calculus, that has just been carried out in FC, could be carried out using Brouwer's method of ips depending on the solving of problems.
2. The following example, which refutes both Kuroda's conjecture (cf. [13]) and Markov's principle (cf. [14]) in FC, was inspired by applying the methods of the present section to obtain a countermodel to $(x)\neg\neg A(x) \supset \neg\neg(x)A(x)$. Let $S$ be the finitary spread consisting of all free choice sequences $\alpha$ such that $\alpha(x + 1) = \alpha(x)$ or $\alpha(x + 1) = \alpha(x) + 1$ for every $x$. We show in FC
(a) $(\alpha \in S)(m)\neg\neg(\exists n)(\alpha(n) \geq m)$,
(b) $(\alpha \in S)\neg(m)(\exists n)(\alpha(n) \geq m)$.
To prove (a), let $\alpha$ be an a.f.c.s. in $S$, let $m$ be an integer and suppose for reductio ad absurdum that $\neg(\exists n)(\alpha(n) \geq m)$. Then, since $\alpha$ is absolutely free, by axiom 5.1 of FC, there is an initial segment $\bar{\alpha}(x)$ of $\alpha$ such that (*) $(\beta \in S)(\bar{\beta}(x) = \bar{\alpha}(x) \supset \neg(\exists n)(\beta(n) \geq m))$. Now $\alpha(x) < m$, for otherwise $(\exists n)(\alpha(n) \geq m)$. Hence, since every ips on $S$ is non-decreasing, for all $y < x$, $\alpha(y) < m$. Now (*) asserts that, if we have chosen the first $x$ components of $\beta$ so that $\bar{\beta}(x) = \bar{\alpha}(x)$, we can never choose $\beta(n) \geq m$ for any $n$. But by axiom 5.3 of FC, there are a.f.c.s.'s $\beta$ in $S$ satisfying the conditions $\bar{\beta}(x) = \bar{\alpha}(x)$ and $\beta(x + i) = \alpha(x) + i$ $(0 \leq i \leq m - \alpha(x))$, since this finite sequence of choices accords with the spread law of $S$. But then if $n = x + m - \alpha(x)$, $\beta(n) = m$, contrary to (*).
To prove (b), let $\alpha$ be an a.f.c.s. in $S$, and for reductio ad absurdum assume $(m)(\exists n)(\alpha(n) \geq m)$. Then, again by axiom 5.1 of FC, there is an $x$ such that (**) $(\beta \in S)(\bar{\beta}(x) = \bar{\alpha}(x) \supset (m)(\exists n)(\beta(n) \geq m))$. Given any a.f.c.s. $\beta$ in $S$, assign a value $f(\beta)$ as follows: If $\bar{\beta}(x) \neq \bar{\alpha}(x)$, let $f(\beta) = 0$; if $\bar{\beta}(x) = \bar{\alpha}(x)$, let $f(\beta)$ be the least $n$ such that $\beta(n) \geq \alpha(x) + 1$. By (**), $f$ is well defined for all such $\beta$, so by the fan theorem there is some finite integer $p$ such that $f(\beta)$ is wholly determined by $\bar{\beta}(p)$. We can thus write $f(\beta)$ as $f(\bar{\beta}(p))$. Clearly, by the definition of $f$, $p \geq x$. Now, again using axiom 5.3 of FC, determine $\beta$ by requiring $\bar{\beta}(x) = \bar{\alpha}(x)$, $\beta(x + i) = \alpha(x)$ $(0 \leq i \leq p - x)$. Then (**) asserts that $\beta(f(\bar{\beta}(p))) \geq \alpha(x) + 1$. But this is clearly absurd, since again by 5.3 we are perfectly free to continue the choices by $\beta(p + j) = \alpha(x)$ $(0 \leq j \leq f(\bar{\beta}(p)) - p)$, so that, taking $j = f(\bar{\beta}(p)) - p$, we would get $\beta(f(\bar{\beta}(p))) = \alpha(x) < \alpha(x) + 1$. So (b) is proved.
We will now use (a) and (b) to refute Kuroda's conjecture [13] and Markov's principle [14]. Kuroda's conjecture asserts that for $m$ a number variable, $(m)\neg\neg A(m)$ implies $\neg\neg(m)A(m)$. Using Kuroda's conjecture, we could derive from (a) the assertion $(\alpha \in S)\neg\neg(m)(\exists n)(\alpha(n) \geq m)$, which directly contradicts (b); so Kuroda's conjecture is refutable in FC. Similarly Markov's principle asserts that, for a decidable predicate $A(x)$ and number variable $n$, $\neg\neg(\exists n)A(n)$ implies $(\exists n)A(n)$. But, if we take $A(n)$ to be $\alpha(n) \geq m$, then $A(n)$ is primitive recursive and hence decidable. Then Markov's principle would allow us to derive $(\alpha \in S)(m)(\exists n)(\alpha(n) \geq m)$ from (a), again contradicting (b).
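Both proofs turn on the freedom to continue a finite initial segment in two different admissible ways. The following sketch illustrates just that combinatorial step on finite lists; the function names `in_spread`, `climb`, and `stall` are assumptions made for this illustration, the indexing is that of Python lists rather than of the text, and nothing here models axioms 5.1 or 5.3 of FC.

```python
def in_spread(prefix):
    """Spread law of S: each choice repeats the previous value or adds 1."""
    return all(b - a in (0, 1) for a, b in zip(prefix, prefix[1:]))


def climb(prefix, m):
    """Continuation used for (a): add 1 at each step until the value m occurs."""
    out = list(prefix)
    while out[-1] < m:
        out.append(out[-1] + 1)
    return out


def stall(prefix, length):
    """Continuation used for (b): repeat the last value up to the given length."""
    out = list(prefix)
    out += [out[-1]] * (length - len(out))
    return out


segment = [0, 0, 1, 1]

# (a): whatever m is, some admissible continuation reaches m,
# so no initial segment can guarantee that m is never reached.
reached = climb(segment, 4)
assert in_spread(reached) and reached[-1] == 4

# (b): whatever stage p is proposed, some admissible continuation
# has still not exceeded the last value of the segment at stage p.
for p in range(len(segment), 12):
    stalled = stall(segment, p + 1)
    assert in_spread(stalled) and max(stalled) == segment[-1]
```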
In spite of the proofs by Gödel and Kreisel that strong completeness of Heyting's predicate calculus implies certain forms of Markov's principle, I am unable to see how to convert these results into a proof in FC that Heyting's predicate calculus is not strongly complete, and I doubt that such a conversion is in fact possible. If $S'$ is the spread consisting of all $\alpha$ such that there is a $\beta$ in $S$ such that $(x)(\alpha(x + 1) = \beta(x))$, it is easy to conclude from the present results that $(\alpha \in S')\neg\neg(\exists n)(\alpha(n) > \alpha(0))$ and that $\neg(\alpha \in S')(\exists n)(\alpha(n) > \alpha(0))$; but, since $\alpha$ here ranges over absolutely free choice sequences of $S'$ and not ordinary free choice sequences, we are unable to apply Theorem 1 of Kreisel [15] to conclude that Heyting's predicate calculus is not strongly complete.
1.2. Relationship to the Beth models
In this section, we discuss the relationship of the present model theory to that of Beth [8]. We will show that the present models can be "translated," in a natural way, into Beth models. Using an intuitive interpretation of the Beth modelling, we will also show that the mapping leads to an interpretation of our own quantificational models which is alternative to that of the previous section; in this interpretation, the variables always range over the species of natural numbers. This section can be omitted, if desired, without loss of continuity.
First, we present the notion of Beth model in our terminology as follows: Let $(G, K, S)$ be a tree, and let $R = S^*$, so that $(G, K, R)$ is a tree m.s. By a path in the tree $(G, K, S)$ we mean a sequence $\{H_i\}$ of elements of $K$, indexed on either the sequence of natural numbers or on some finite initial segment thereof, satisfying the conditions: (a) $H_0 = G$; (b) for $i > 0$, $H_{i-1} S H_i$; (c) if $\{H_i\}$ has a last element $H_n$, $H_n$ is an endpoint of $(G, K, S)$. If some $H_i = H$, we say the path is through $H$. Let $B$ be a subset of $K$. If every path through $H$ intersects $B$, we say that $H$ is barred by $B$. Thus, for example, $H$ is barred by $\{H\}$.
By a Beth model on $(G, K, S)$, we mean a binary function $\eta(P, H)$ satisfying the following conditions: (a) $\eta(P, H) = T$ or $F$, where $P$ is atomic and $H \in K$. (b) If $H S H'$ and $\eta(P, H) = T$, then $\eta(P, H') = T$. (c) If $H$ is barred by $B$ and $\eta(P, H') = T$ for every $H' \in B$, then $\eta(P, H) = T$. Given a Beth model $\eta$, we define by induction values $\eta(A, H)$ for an arbitrary formula $A$ of the propositional calculus. Suppose $\eta(A, H)$ and $\eta(B, H)$ have already been defined. Define $\eta(\neg A, H)$, $\eta(A \wedge B, H)$, and $\eta(A \supset B, H)$ exactly as was done above for a model $\phi$; simply replace "$\phi$" by "$\eta$" throughout. Finally define $\eta(A \vee B, H) = T$ iff there is a subset $B$ of $K$ such that $H$ is barred by $B$ and $\eta(A, H') = T$ or $\eta(B, H') = T$ for every $H' \in B$; otherwise, $\eta(A \vee B, H) = F$.
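On a finite tree the notions of path, bar, and the Beth clause for disjunction can be computed directly. The sketch below is an illustration under stated assumptions rather than anything given in the text: the class `BethModel`, the dictionary encoding of the successor relation, and the node names are hypothetical, and the disjunction clause uses the observation that some bar of the required kind exists exactly when the set of all nodes verifying one disjunct is itself a bar.

```python
def paths(succ, root):
    """All maximal paths from the root of a finite tree (succ: node -> children)."""
    kids = succ.get(root, [])
    if not kids:
        return [[root]]
    return [[root] + p for k in kids for p in paths(succ, k)]


class BethModel:
    def __init__(self, succ, root, atoms):
        # atoms: node -> set of atomic letters verified at that node
        self.succ, self.root, self.atoms = succ, root, atoms
        self.all_paths = paths(succ, root)
        self.nodes = sorted({h for p in self.all_paths for h in p})

    def barred(self, h, B):
        """Every path through h intersects B."""
        return all(any(k in B for k in p) for p in self.all_paths if h in p)

    def later(self, h):
        """Nodes H' with h R H', where R = S* (reflexive and transitive)."""
        out = [h]
        for k in self.succ.get(h, []):
            out += self.later(k)
        return out

    def value(self, f, h):
        op = f[0]
        if op == "atom":
            return f[1] in self.atoms[h]
        if op == "and":
            return self.value(f[1], h) and self.value(f[2], h)
        if op == "not":
            return all(not self.value(f[1], k) for k in self.later(h))
        if op == "imp":
            return all(not self.value(f[1], k) or self.value(f[2], k)
                       for k in self.later(h))
        if op == "or":   # largest candidate bar: all nodes verifying a disjunct
            B = {k for k in self.nodes
                 if self.value(f[1], k) or self.value(f[2], k)}
            return self.barred(h, B)
        raise ValueError(op)

    def well_formed(self, letters):
        """Check conditions (b) and (c) for the given atomic letters."""
        for p in letters:
            true_at = {h for h in self.nodes if p in self.atoms[h]}
            for h in self.nodes:
                if h in true_at and any(k not in true_at for k in self.later(h)):
                    return False   # violates (b)
                if h not in true_at and self.barred(h, true_at):
                    return False   # violates (c)
        return True


# G has two successors; P is established at H1 and Q at H2.
model = BethModel({"G": ["H1", "H2"]}, "G",
                  {"G": set(), "H1": {"P"}, "H2": {"Q"}})
P, Q = ("atom", "P"), ("atom", "Q")
print(model.well_formed(["P", "Q"]))     # True
print(model.value(("or", P, Q), "G"))    # True
print(model.value(P, "G"), model.value(Q, "G"))  # False False
```

In this example the disjunction is verified at the origin although neither disjunct is, which is exactly the situation in which the Beth clause and the clause for models in the present sense come apart.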
Notice that if $\eta(A, H) = T$ or $\eta(B, H) = T$, $\eta(A \vee B, H) = T$; we can take $\{H\}$ as the set $B$ barring $H$. Notice that condition (b) above actually implies the strengthened condition (b'): If $\eta(P, H) = T$ and $H R H'$, then $\eta(P, H') = T$. Using this fact, it is easy to prove by induction that the properties (a) - (c) actually hold not only for an atomic formula $P$, but also for an arbitrary formula $A$. As in the case of models $\phi$, the inductive definition of $\eta(A, H)$ just
given depends on the law of the excluded middle. Again as in the case of a model, we can correct the situation by modifying the definition of a Beth model. We leave the modification to the reader.
A Beth model $\eta$ on a tree $(G, K, S)$ is called finitary if $(G, K, S)$ is finitary. Beth's own version of his models in [8] is actually equivalent to our notion of a finitary Beth model. We call a Beth model $\eta$ a strong Beth model iff for all $H \in K$ and formulae $A$ and $B$, $\eta(A \vee B, H) = T$ implies $\eta(A, H) = T$ or $\eta(B, H) = T$. Notice that, on account of the validity of condition (b') above, a Beth model $\eta$ is also a model in our sense. However, since the inductive clause for disjunction in a Beth model differs from the inductive clause for our sense of model, non-atomic formulae may be given different values according as $\eta$ is considered as a model in our sense or as a Beth model. A strong Beth model is precisely a Beth model in which this eventuality never happens.
The intuitive rationale behind the Beth models is simple: Again the elements ("nodes") of the tree model $(G, K, S)$ are points in time, or evidential situations; but we no longer suppose that we are allowed to remain at a given point $H$ as long as we please. On the contrary, if $H$ is a node of the tree $(G, K, S)$, we are forced, unless $H$ is an endpoint, to "jump" within a fixed, finite time to one of the successors of $H$ in the tree. (Paradigmatic of such a game, of course, are free choices in an (absolutely) free choice sequence $\alpha$: after each choice we are forced to make another, within a finite length of time, unless the spread-law states that the choice we have just made is terminal.) $\eta(P, H) = T$ [$= F$] means that $P$ has been established [has not yet been established] at the time $H$, so the conditions (a) and (b) on $\eta$ are clear. If $H$ is barred by $B \subseteq K$, condition (c) says if we know that $P$ will be established at any $H' \in B$, then we already know at $H$ that $P$ is true; for, once we are at $H$, we must get to some $H' \in B$ in a finite time. Similarly, the inductive clause which defines $\eta(A \vee B, H)$ observes that to establish $A \vee B$ at $H$ it is sufficient to know that, in a finite number of "moves," we must either establish $A$ or establish $B$; that is to say, it suffices to know that there is a $B$ which bars $H$ such that every $H' \in B$ either establishes $A$ or establishes $B$. The inductive clauses for the other connectives are as before.
As in section 1.1, we can give a more precise justification of the
definition in terms of absolutely free choice sequences. As before, we identify the elements of the (countable) tree $(G, K, S)$ with natural numbers, associating 0 with $G$. We then consider the spread $S$ of all absolutely free choice sequences of elements of $K$ whose first term is 0 and which satisfy the condition that $\alpha(n) S \alpha(n + 1)$, unless $\alpha(n)$ is an endpoint of $(G, K, S)$, in which case $\alpha(n) = \alpha(n + 1)$. For any atomic $P$, associate a formula $P(\alpha)$ which says $(\exists n)(\exists x)(\alpha(x) = n \wedge \eta(P, n) = T)$. We then define inductively a formula $A(\alpha)$ associated with an arbitrary formula $A$, exactly as in section 1.1. Again as in 1.1, if $\eta(A, n) = T$ [$= F$] we can derive $(\alpha \in S)((\exists x)(\alpha(x) = n) \supset A(\alpha))$ [$\neg(\alpha \in S)((\exists x)(\alpha(x) = n) \supset A(\alpha))$] in FC.
We now show how the ideas of section 1.1 can be modified so as to show how every model can be transformed into an "equivalent" strong Beth model. Let $\phi$ be a model on a m.s. $(G, K, R)$. Define a tree $(G', K', S')$ as follows: Let $K'$ be the set of all finite non-empty sequences $\{H_i\}_{i=1}^{n}$, where $H_i \in K$ $(1 \leq i \leq n)$, $H_1 = G$, and $H_i R H_{i+1}$ $(1 \leq i < n)$. Let $G'$ be the sequence whose sole term is $G$. We say, for $H'_1, H'_2 \in K'$, that $H'_1 S' H'_2$ iff $H'_1$ is the initial segment of $H'_2$ formed by omitting the last term of $H'_2$. (Then, if $R' = S'^*$, $H'_1 R' H'_2$ iff $H'_1$ is an initial segment of $H'_2$.) For any $H' \in K'$, let $l(H')$ be the last term of $H'$; then $l(H') \in K$. Define $\eta(P, H')$ ($P$ atomic, $H' \in K'$) by $\eta(P, H') = \phi(P, l(H'))$. Let $H' \in K'$, $H' = \{H_i\}_{i=1}^{n}$. We define an associated path $P(H') = \{H'_j\}_{j=0}^{\infty}$ as follows: For $0 \leq j < n$, let $H'_j$ be the unique initial segment of $H'$ with $j + 1$ terms. For $j \geq n$, let $H'_j$ be the $(j + 1)$-termed sequence whose first $n$ terms are $H_1, \ldots, H_n$ and whose other terms are all equal to $H_n$. So for $j \geq n$, $l(H'_j) = H_n$. Clearly $P(H')$ is a path through $H'$; further, for any $H'_j$ on this path, $l(H'_j) R l(H')$. We now assert:
THEOREM 1 (First part): $\eta$ is a strong Beth model. Further $\eta$ is equivalent to $\phi$ in the sense that $\eta(A, H') = \phi(A, l(H'))$ for any $H' \in K'$ and formula $A$. In particular $\eta(A, G') = \phi(A, G)$ for any $A$.
PROOF. First we show that $\eta$ is a Beth model. Condition (a) is clear. For (b), if $\eta(P, H'_1) = T$ and $H'_1 S' H'_2$, then $l(H'_1) R l(H'_2)$. Since $\phi(P, l(H'_1)) = \eta(P, H'_1) = T$, and since $\phi$ is a model, $\phi(P, l(H'_2)) = T$, hence $\eta(P, H'_2) = T$. For (c) let $H'$ be barred by $B \subseteq K'$. Then the path $P(H')$ intersects $B$. Let $H''$ be some point of the intersection. To establish (c)
it is sufficient to show that if $\eta(P, H'') = T$, $\eta(P, H') = T$. Since $H''$ is on the path $P(H')$, $l(H'') R l(H')$. Since $\eta(P, H'') = T$, [...] $0$: in either case there exists $\bar{H}_1 \in \bar{K}$ such that $\bar{H} \bar{R} \bar{H}_1$ and $H_1 = l(\bar{H}_1)$. By assumption, either $\eta(A, \bar{H}_1) = F$ or $\eta(B, \bar{H}_1) = T$, whence, by the induction hypothesis, either $\phi(A, H_1) = F$ or $\phi(B, H_1) = T$. Since $H_1$ was arbitrary (subject to $H R H_1$), $\phi(A \supset B, H) = T$. Conversely, suppose $\eta(A \supset B, \bar{H}) = F$. Then for some $\bar{H}_1$ such that $\bar{H} \bar{R} \bar{H}_1$, $\eta(A, \bar{H}_1) = T$ and $\eta(B, \bar{H}_1) = F$. By the induction hypothesis, $\phi(A, l(\bar{H}_1)) = T$ and $\phi(B, l(\bar{H}_1)) = F$; since $l(\bar{H}) R l(\bar{H}_1)$, $\phi(A \supset B, l(\bar{H})) = F$, as desired. The case of $\neg$ is quite similar. Q.E.D.
Notice that the situation contrasts with that in S4, where it is often impossible to replace an arbitrary finite model by an equivalent finite tree model (cf. [2]).
The third part of Theorem 1 extends the procedure for finding a tree model equivalent to an arbitrary model to quantificational models. Here we cannot use the same construction as a tree q. model and as a Beth q. model, as will be seen when we define the latter, in preparation for the fourth part of Theorem 1.
THEOREM 1 (Third part): Let $(G, K, R)$ be a q.m.s. with domain function $\psi(H)$. ($R$ need not be anti-symmetric.) Let $S$ be any relation (not necessarily irreflexive) such that $R = S^*$. Let $\phi$ be a quantificational model on $(G, K, R)$. Let $(\bar{G}, \bar{K}, \bar{R})$ be defined as in the second part of the theorem, and let $\bar{\psi}(\bar{H}) = \psi(l(\bar{H}))$. Let $\eta(P^n, \bar{H}) = \phi(P^n, l(\bar{H}))$ for each predicate letter $P^n$ and each $\bar{H} \in \bar{K}$. Then $\eta$ is a quantificational model on the q.m.s. $(\bar{G}, \bar{K}, \bar{R})$ with domain function $\bar{\psi}$. Further, relative to a given assignment to the free variables of $A$, $\eta(A, \bar{H}) = \phi(A, l(\bar{H}))$: in particular, $\eta(A, \bar{G}) = \phi(A, G)$.
The proof is left to the reader. Notice that, since $S$ is not required to be irreflexive, it may in particular be $R$ itself: thus $(\bar{G}, \bar{K}, \bar{R})$ may be as in the second part of Theorem 1, or may be identical with the Beth model $(G', K', R')$ of the first part. As a quantificational model, however, $\eta$ will not be a Beth quantificational model, to the definition of which we now turn.
Unlike our own models, with their variable domains (a feature we have noted to be essential), the Beth quantificational models are based on a fixed domain $D$. We define a Beth q.m.s. to be a Beth m.s. $(G, K, R)$,
together with a domain $D$ with at least one element. A Beth q. model $\eta$ is a binary function $\eta(P^n, H)$, whose value is $T$ or $F$ when $n = 0$, and is a subset of $D^n$ for $n \geq 1$. We require, in addition to the conditions (b) and (c) above on $\eta$, the analogues for $n \geq 1$: ($b^n$) If $H R H'$, $\eta(P^n, H) \subseteq \eta(P^n, H')$; ($c^n$) if $H$ is barred by $B \subseteq K$, then
$\bigcap_{H' \in B} \eta(P^n, H') \subseteq \eta(P^n, H)$.
For an atomic formula $P^n(x_1, \ldots, x_n)$, define $\eta(P^n(x_1, \ldots, x_n), H) = T$, relative to an assignment of $a_1, \ldots, a_n \in D$ to $x_1, \ldots, x_n$, iff $(a_1, \ldots, a_n) \in \eta(P^n, H)$; otherwise, $= F$. We then define the values for more complex formulae by induction. The inductive clauses for the propositional connectives are as above. Let the formula $A(x_1, \ldots, x_n, y)$ contain only the free variables listed. We define $\eta((y)A(x_1, \ldots, x_n, y), H) = T$, relative to an assignment of $a_i \in D$ to $x_i$ $(1 \leq i \leq n)$, iff $\eta(A(x_1, \ldots, x_n, y), H) = T$ relative to any assignment of an element $b \in D$ to $y$ and $a_i$ to $x_i$; otherwise, $= F$. Again $\eta((\exists y)A(x_1, \ldots, x_n, y), H) = T$ when $a_i$ is assigned to $x_i$ iff there is a $B \subseteq K$ such that $H$ is barred by $B$ and for any $H' \in B$ there is a $b \in D$ such that $\eta(A(x_1, \ldots, x_n, y), H') = T$ when $a_i$ is assigned to $x_i$ and $y$ is assigned $b$; otherwise, $= F$. Using the inductive clauses and the conditions on atomic formulae, we can prove the analogues of (b) and (c) for an arbitrary formula $A$, relative to a fixed assignment to its free variables in a Beth quantificational model $\eta$: If $\eta(A, H) = T$ and $H R H'$, $\eta(A, H') = T$. If $H$ is barred by $B$ and $\eta(A, H') = T$ for any $H' \in B$, then $\eta(A, H) = T$.
Suppose we are given a quantificational model $\phi$ on a m.s. $(G, K, R)$ such that
$U = \bigcup_{H \in K} \psi(H)$
is countable. We will transform $\phi$ into a Beth quantificational model whose domain $D$ is the set $N$ of non-negative integers. Let $(G', K', S')$ be as above, and $R' = S'^*$. Notice that $N$ is a countable union of disjoint countable sets; call these $N_i$ $(i = 0, \ldots)$. We have a procedure, which, for each $H' \in K'$, generates certain elements of $N$ at $H'$; the set of elements generated at $H'$ will be identical with
$\bigcup_{i=0}^{n} N_i$
for some $n$. Further, if $P$ is any path in $K'$, every $p \in N$ will be generated at some $H' \in P$. Further, the procedure will satisfy the condition that if $H' R' H''$, every element generated at $H'$ is also generated at $H''$. An element generated at $H'$, but not at its predecessor (if any exists), is said to be introduced at $H'$. Further, any natural number $n$ generated at $H'$ is assigned a unique element of $\psi(l(H'))$; this element is called $v(n, H')$. The $v$-function will satisfy the condition that if $n$ is generated at $H'$, and $H' R' H''$, then $v(n, H') = v(n, H'')$.
We give an inductive definition on the tree $(G', K', S')$ of a procedure with these properties; at any stage, satisfaction of these properties will be taken to be part of the inductive hypothesis. First, consider the origin $G'$ of the tree. We generate exactly the elements of $N_0$ at $G'$, and we define $v(n, G')$, for $n \in N_0$, in such a way that $N_0$ is mapped onto $\psi(G)$. (This is possible since $\psi(G)$ is at most countable. All arbitrary choices can be made precise, if desired, using well-orderings of the denumerable sets $N$ and $U$.) Suppose we have defined the set of all integers generated at $H'$ (it is, say, $M = \bigcup_{i=0}^{m} N_i$) and have defined $v(n, H')$ for each $n \in M$. Let $H' S' H''$. Then introduce all elements of $N_{m+1}$, so that the set of elements generated at $H''$ is $M \cup N_{m+1}$. Define $v(n, H'')$ for $n \in M \cup N_{m+1}$ by $v(n, H'') = v(n, H')$ for $n \in M$, and such that $v(n, H'')$ maps $N_{m+1}$ onto $\psi(l(H''))$. Then the inductive definition is complete.
We now define a Beth quantificational model $\eta$ whose domain is $N$ on the Beth m.s. $(G', K', S')$ as follows: If $P$ is a propositional letter, define $\eta(P, H') = \phi(P, l(H'))$. For an $n$-adic predicate letter $P^n$ define $\eta(P^n, H')$ to be the set of $n$-tuples $(m_1, \ldots, m_n)$ of natural numbers such that, for every $H'' \in K'$ such that $m_1, \ldots, m_n$ are all generated at $H''$ and $H' R' H''$, $(v(m_1, H''), \ldots, v(m_n, H'')) \in \phi(P^n, l(H''))$.
THEOREM 1 (Fourth part): $\eta$ is a Beth quantificational model on $(G', K', S')$ whose domain is $N$. For any $H' \in K'$ and formula $A(x_1, \ldots, x_n)$, whose free variables are exactly those listed, and natural numbers $m_1, \ldots, m_n$ which have been generated at $H'$, $\eta(A(x_1, \ldots, x_n), H') = T$ when $x_1, \ldots, x_n$ are assigned $m_1, \ldots, m_n$, respectively, if and only if $\phi(A(x_1, \ldots, x_n), l(H')) = T$ when $x_1, \ldots, x_n$ are assigned $v(m_1, H'), \ldots, v(m_n, H')$, respectively. In particular ($n = 0$), if $A$ is a closed formula, $\eta(A, H') = \phi(A, l(H'))$.
PROOF. We show first that $\eta$ is a Beth quantificational model. Conditions (b) and ($b^n$) are obvious. Condition (c) is proved as in the first part of the theorem. Condition ($c^n$) $(n \geq 1)$ is proved as follows: Suppose $H' \in K'$ is barred by $B \subseteq K'$, and suppose $(m_1, \ldots, m_n)$ is not in $\eta(P^n, H')$. We show that there is an $H'' \in B$ such that $(m_1, \ldots, m_n)$ is not in $\eta(P^n, H'')$. Since $(m_1, \ldots, m_n)$ is not in $\eta(P^n, H')$, there is an $H'_0 \in K'$ such that $H' R' H'_0$, $m_1, \ldots, m_n$ are all generated at $H'_0$, and $(v(m_1, H'_0), \ldots, v(m_n, H'_0))$ is not in $\phi(P^n, l(H'_0))$. As in the first part of this theorem, let $P$ be the path $P(H'_0)$ through $H'_0$, with the property that, for $H''$ on the path and $H'_0 R' H''$, $l(H'_0) = l(H'')$. Then $P$ intersects $B$ in an element $H''$. If $H'' R' H'_0$, then since clearly $(m_1, \ldots, m_n)$ is not in $\eta(P^n, H'_0)$, by condition ($b^n$), it is not in $\eta(P^n, H'')$. If $H'_0 R' H''$, then since $l(H'') = l(H'_0)$, and $v(m_i, H'_0) = v(m_i, H'')$, we have $(v(m_1, H''), \ldots, v(m_n, H'')) \notin \phi(P^n, l(H''))$, so that $(m_1, \ldots, m_n) \notin \eta(P^n, H'')$, the desired conclusion.
We now prove the assertion in the second sentence of the present Fourth part by induction; the third sentence is a special case. Let $A(x_1, \ldots, x_n)$ be atomic. If $n = 0$, see the proof of the first part of this theorem. If $n > 0$, write $A(x_1, \ldots, x_n)$ as $P^n(x_1, \ldots, x_n)$. Suppose $m_1, \ldots, m_n$ are all generated at $H' \in K'$. Let $H = l(H')$, and $a_i = v(m_i, H')$. If $\phi(P^n(x_1, \ldots, x_n), H) = T$, when $x_i$ is assigned $a_i$ $(1 \leq i \leq n)$, then $(a_1, \ldots, a_n) \in \phi(P^n, H)$. If $H' R' H'_0$ $(H'_0 \in K')$, let $H_0 = l(H'_0)$. Then $H R H_0$, hence $a_1, \ldots, a_n \in \psi(H_0)$. Also $a_i = v(m_i, H') = v(m_i, H'_0)$. This shows that $(m_1, \ldots, m_n) \in \eta(P^n, H')$, hence $\eta(P^n(x_1, \ldots, x_n), H') = T$, relative to the assignment of $m_i$ to $x_i$, as desired. On the other hand, if $\phi(P^n(x_1, \ldots, x_n), H) = F$ relative to this assignment, and hence $(a_1, \ldots, a_n) \notin \phi(P^n, H)$, we clearly have $(m_1, \ldots, m_n) \notin \eta(P^n, H')$, again as desired.
The inductive clauses for the propositional connectives are as in the first part of this theorem. Suppose the result proved for $A(x_1, \ldots, x_n, y)$. Again let $m_i$ be assigned to $x_i$, let $H = l(H')$, and let $a_i = v(m_i, H')$ $(i = 1, \ldots, n)$. Let $\phi((\exists y)A(x_1, \ldots, x_n, y), H) = T$ when $x_i$ is assigned $a_i$. Then there is a $b \in \psi(H)$ such that $\phi(A(x_1, \ldots, x_n, y), H) = T$ when in addition $y$ is assigned $b$. $v(p, H')$ maps the elements generated at $H'$ onto $\psi(H)$, so let $v(p, H') = b$, where $p$ is generated at $H'$. Then, by inductive hypothesis $\eta(A(x_1, \ldots, x_n, y), H') = T$ when $x_i$ is assigned $m_i$ $(i = 1, \ldots, n)$ and
[...] $x_n, y), H') = T$ when $x_i$ is assigned $m_i$. On the other hand, suppose [...] $\phi(P, E) = T$ iff $P$ is provable in $E$ and $F$ otherwise, then $\phi(P, E)$ is a model on the m.s. $(E_0, K, R)$. Thus for any complex formula $A$ which is a theorem of the intuitionist propositional calculus, $\phi(A, E_0) = T$. If $E_0$ is elementary number theory $Z$, and $P$ is Gödel's undecidable formula, then $\phi(P \vee \neg P, E_0) = F$; for $P$ is not provable in $E_0$, but it is provable in certain extensions $E$. The larger problem, whether Heyting's propositional calculus is complete with respect to this particular choice of $E_0$, remains open.
To interpret intuitionistic quantification theory in this manner, we must assume that the system $E_0$ and its extensions have notions of free
variables and of constants, and that $E_0$ contains at least one constant. For any $E \in K$, let $\psi(E)$ be the set of all constants of $E$. Then if $E R E'$, $\psi(E) \subseteq \psi(E')$. For every $n$, define an $n$-adic atomic predicate $P^n$ to be a formula of $E_0$ with $n$ free variables, together with a 1-1 function from the integers $1, \ldots, n$ to the free variables of $P^n$. The variable assigned by this function to $m$ $(1 \leq m \leq n)$ is called the $m$th free variable of $P^n$. Define, for $n \geq 1$, the set $\phi(P^n, E) \subseteq [\psi(E)]^n$ as follows: An $n$-tuple $(a_1, \ldots, a_n)$ of constants in $\psi(E)$ is in $\phi(P^n, E)$ iff the result of the simultaneous substitution of $a_i$ $(1 \leq i \leq n)$ for the $i$th free variable of $P^n$ is a theorem of $E$. Out of the atomic $n$-adic predicates (which play the role of the $n$-adic predicate letters above), we can build more complex formulae using the propositional connectives and the quantifiers. $\phi(P^n, E)$ then becomes an intuitionistic quantificational model.
It is clear that in the preceding $K$ can be replaced by any subset $K'$ thereof (e.g., the finitely axiomatizable extensions of $E_0$). Further, restrictions, such as recursive enumerability, on the notion of formal system, can be removed at will. There is also a more "model-theoretic" variant of the present interpretation of Heyting's predicate calculus, which eliminates the assumption that $E$ must contain constants. Further, the interpretations can be extended in other directions so as to yield new interpretations of larger parts of intuitionistic mathematics; in particular, we can give an interpretation of FC which leads to a proof that FC is an inessential extension of Heyting's arithmetic 1). For more on provability interpretations of intuitionistic and modal logics, cf. [3].
1) Kreisel has independently obtained this result using an elimination of free choice sequences by contextual definition.
2. Cohen's notion of "forcing." Let $D$ be an arbitrary countable infinite set. Let $\mathcal{P} = (\mathcal{P}_0, \mathcal{P}_1)$ be a pair of finite, disjoint subsets of $D$, and let $K$ be the set of all such pairs. If $\mathcal{P} = (\mathcal{P}_0, \mathcal{P}_1)$ and $\mathcal{P}' = (\mathcal{P}'_0, \mathcal{P}'_1)$ are in $K$, then define $\mathcal{P} R \mathcal{P}'$ (or, $\mathcal{P}'$ is an extension of $\mathcal{P}$) iff $\mathcal{P}_0 \subseteq \mathcal{P}'_0$ and $\mathcal{P}_1 \subseteq \mathcal{P}'_1$. Further, let $\psi(\mathcal{P}) = \mathcal{P}_0 \cup \mathcal{P}_1$. Now consider a single monadic predicate letter $P$. For any $\mathcal{P} \in K$, define $\phi(P, \mathcal{P}) = \mathcal{P}_0$. Let $K'$ be the set of all $\mathcal{P} \in K$ such that $\psi(\mathcal{P})$ is non-empty. Then for any $\mathcal{P} \in K'$, $(\mathcal{P}, K', R)$ is a q.m.s., with the associated domain function $\psi$. (If we had modified Heyting's predicate calculus so as to admit the empty domain and thus permit $\psi(\mathcal{P})$ to be empty, the rather artificial use of $K'$ in place
SEMANTICAL ANALYSIS OF INTUITIONISTIC LOGIC I
119
of K could be dropped.) Then ¢ is a model on (g'J, K', R), and for any formula A built from P using propositional connectives and quantifiers, the inductive definitions we have given define a truth-value ¢(A, g'J'), for any g'J' E K', relative to a fixed assignment of elements of D to the free variables of A. If this value is T, we say that g'J' forces A relative to the assignment. (Notice that the value of ¢(A, g'J') is clearly independent of the choice of the "designated" element g'J of (g'J, K', R).) If D' is a subset of D, we say that g'J , agrees with D' iff g'J~ s D' and g'J~ s D - D'. We can say that D' forces A (relative to a given assignment to the free variables) iff there is a g'J' E K' which agrees with D' and forces A. Notice that if g'J' and g'J" agree with D', they have a common extension which agrees with D'; thence it easily follows that D' cannot force a statement together with its negation. Call D' generic iff for every A, and fixed assignment to the fret: variables thereof, D' forces either A or ..,A. Cohen proves that generic sets exist: Let {An} be an enumeration ofall the ordered couples Ai = BIB. If D' is generic and forces ..,..,A, it clearly must force A; hence a nonempty, generic D' forces every classically valid formula not containing universal quantifiers. Cohen has proved an even stronger fact: If D' is generic and A has no universal quantifiers, then (relative to an assignment to free variables), A is forced by D' if and only if it is true when the existential quantifiers (taken as ranging over D) and the propositional connectives are interpreted classically, and "P(x)" is interpreted as "x ED'," The
e::
e-:
assertion is readily proved by induction on the complexity of $A$. Since, classically speaking, $(x)$ can always be replaced by $\neg(\exists x)\neg$, the restriction that universal quantifiers be absent is not important.

The definition we have given differs from Cohen's in inessential respects. (It may be closer to a definition given by Feferman, which we have not seen 1).) It is clear that the notion can be extended. For example, we need not deal with a single predicate $P(x)$; we can deal with several such, not all of which need be monadic. The modifications needed for this more general situation should be obvious. Further, we can replace the countable set $D$ by a set of regular cardinality $\aleph_\alpha$; $K$ will consist of disjoint pairs of sets of cardinality less than $\aleph_\alpha$.

Cohen's motivation was radically different from ours, but it is clear that his notion is intimately related to our model theory. The "deeper" reasons for this relation may yet be unknown. It should be noted that Dana Scott had already observed that Cohen's idea was similar to an interpretation conjectured by Kreisel [17]. And indeed, if Kreisel's conjectures prove correct, his interpretation of intuitionism will be closely related to ours.
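The bookkeeping behind the forcing conditions can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the paper: the function names are hypothetical, conditions are modelled as pairs of frozensets, and only atomic formulae and their negations are handled. The clause for negated atoms is not stated in the text; it follows from the definitions, since a condition with $x$ outside its second component can always be extended by adding $x$ to its first component.

```python
def make_condition(p0, p1):
    """Build a condition (p0, p1) from two finite, disjoint subsets of D."""
    p0, p1 = frozenset(p0), frozenset(p1)
    if p0 & p1:
        raise ValueError("the two components must be disjoint")
    return (p0, p1)


def extends(q, p):
    """True when q is an extension of p (p R q): componentwise inclusion."""
    return p[0].issubset(q[0]) and p[1].issubset(q[1])


def agrees_with(p, d_prime):
    """p agrees with D' iff p0 lies inside D' and p1 lies inside D - D'."""
    d_prime = frozenset(d_prime)
    return p[0].issubset(d_prime) and not (p[1] & d_prime)


def common_extension(p, q):
    """Componentwise union; a condition whenever p and q agree with one D'."""
    return make_condition(p[0] | q[0], p[1] | q[1])


def forces_atom(p, x):
    """phi(P, p) = p0, so p forces P(x) exactly when x is in p0."""
    return x in p[0]


def forces_negated_atom(p, x):
    """Derived clause (see lead-in): no extension of p can put x in its
    first component exactly when x is already in p1."""
    return x in p[1]


# Hypothetical example data: D' = {1, 3, 5} inside D = the positive integers.
d_prime = {1, 3, 5}
p = make_condition({1}, {2})
q = make_condition({3}, {4})
r = common_extension(p, q)
```

Here `r` illustrates the remark that two conditions agreeing with the same $D'$ have a common extension that also agrees with $D'$.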
1) See note at end of paper.

2. Semantic tableaux

In this section we develop Beth semantic tableaux for intuitionistic logic. The notion developed here is similar to those of [2], [11], which can be read as background if desired. We deal at each stage of the construction with a system of alternative sets of tableaux; each alternative set is ordered in the form of a tree, and the origin of the tree is called the main tableau of the set. We call the tree ordering relation on an alternative set "$S$"; the smallest reflexive and transitive relation containing "$S$" is called "$R$". We can assume, at a given stage of the construction, that each alternative set is diagrammed on a piece of paper; corresponding to the system of all the alternative sets of the stage, we have a leaflet of which the separate sheets of paper are pages.

Given a formula $A$ of Heyting's predicate calculus, to see whether it is valid we attempt to find a countermodel to $A$. If $A$ has the form $A_1 \wedge \ldots \wedge A_m .\supset. B_1 \vee \ldots \vee B_n$, then what we need is a model $\phi$ such that, relative to some assignment to the free variables of $A$, $\phi(A_i, G) = T$ and $\phi(B_j, G) = F$, $1 \le i \le m$, $1 \le j \le n$. We represent the situation by putting $A_1, \ldots, A_m$ on the left, and $B_1, \ldots, B_n$ on the right of the main tableau of a construction. We continue the construction, which gives a systematic attempt to find a tree countermodel to $A$, by the following rules, which apply to any tableau of any alternative set of the construction:
Nl. If $\neg A$ appears in the left column of a tableau, put $A$ in the right column of that tableau.

Nr. If $\neg A$ appears in the right column of a tableau $t$, start out a new tableau $t^1$, with $t S t^1$, by putting $A$ on the left of $t^1$.

Λl. If $A \wedge B$ appears on the left of a tableau $t$, put $A$ and $B$ on the left of $t$.

Λr. If $A \wedge B$ appears in the right column of a tableau $t$, there are two alternatives; extend the tableau $t$ either by putting $A$ in the right column or by putting $B$ in the right column. If the tableau $t$ is in an ordered set $\mathcal{S}$, it is clear that at the next stage we have two alternative sets, depending on which extension of the tableau $t$ is adopted. Informally speaking, if the original ordered set is diagrammed structurally on a sheet of paper, we copy over the entire diagram twice, in one case putting in addition $A$ in the right column of the tableau and in the other case putting $B$; the two new sheets correspond to the two new alternative sets. The formal statement is rather messy: Given a tableau $t$ in an alternative set $\mathcal{S}$, if $t$ has $A \wedge B$ on the right, we replace $\mathcal{S}$ by two alternative sets $\mathcal{S}_1$ and $\mathcal{S}_2$, where $\mathcal{S}_1 = (\mathcal{S} - \{t\}) \cup \{t_1\}$ and $\mathcal{S}_2 = (\mathcal{S} - \{t\}) \cup \{t_2\}$, and $t_1$ [$t_2$] is like $t$ except that in addition it contains $A$ [$B$] on the right. The tree ordering $S_1$ of the new set $\mathcal{S}_1$ is precisely the same as $S$, save that $t_1$ replaces $t$ throughout; and similarly for the tree ordering $S_2$ of $\mathcal{S}_2$. (Formally, $S_1$ agrees with $S$ on $\mathcal{S} - \{t\}$, and, if $t'$ is the predecessor [a successor] of $t$, then $t' S_1 t_1$ [$t_1 S_1 t'$].) We say $\mathcal{S}$ splits into $\mathcal{S}_1$ and $\mathcal{S}_2$. Similar remarks apply to the rules Vl and Pl below.

Vl. If $A \vee B$ appears on the left of $t$, put either $A$ on the left of $t$ or $B$ on the left of $t$. (As in the case of Λr, this splits the set $\mathcal{S}$ containing $t$ into two alternative sets.)

Vr. If $A \vee B$ appears on the right of $t$, put $A$ and $B$ on the right of $t$.

Pl. If $A \supset B$ appears on the left of $t$, either put $A$ on the right of $t$ or put $B$ on the left. (Thus again the set $\mathcal{S}$ containing $t$ is replaced by two alternative sets.)

Pr. If $A \supset B$ appears on the right of $t$, start out a new tableau $t^1$, with $A$ on the left of $t^1$ and $B$ on the right, such that $t S t^1$.

For a construction involving quantifiers, we associate, at a given stage of a construction, a set $\psi(t)$ of variables with each tableau $t$. We start out the definition of $\psi(t)$ by assuming that, at the initial stage of the construction, which starts out with a single tableau $t_0$, $\psi(t_0)$ consists of a single variable $x$. At later stages $\psi(t)$ is to be enlarged only as required by the rules Πr and Σl below and the stipulation that $t S t^1$ is to imply that $\psi(t) \subseteq \psi(t^1)$. We are now in a position to state the rules for quantifiers as follows:

Πl. If $(x)A(x)$ appears on the left of $t$ and $y$ is any variable in $\psi(t)$, put $A(y)$ on the left of $t$.

Πr. If $(x)A(x)$ appears on the right of $t$, start out a new tableau $t^1$ with $t S t^1$. If $y$ is the alphabetically earliest variable which has not yet occurred in any tableau of any alternative set at this stage, put $y \in \psi(t^1)$ and put $A(y)$ on the right of $t^1$.

Σl. If $(\exists x)A(x)$ appears on the left of a tableau $t$, and $y$ is the alphabetically earliest variable which has not yet appeared in any tableau of any alternative set at this stage, put $y \in \psi(t)$ and put $A(y)$ on the left of $t$.

Σr. If $(\exists x)A(x)$ appears on the right of a tableau $t$, and $y$ is a variable in $\psi(t)$, put $A(y)$ on the right of $t$.
In addition to the rules we have stated, the following stipulation holds throughout the construction: if $t$ and $t^1$ are tableaux of some one alternative set, at any given stage, such that $t S t^1$, and $A$ appears on the left of $t$, then put $A$ on the left of $t^1$. Notice that, since the stipulation is to be iterated an arbitrary number of times, it also applies when $A$ is on the left of $t$ and $t R t^1$. The relation $t S t^1$ is to hold in a construction only as required by the rules listed above. The rules may be applied in any order, as long as the order stipulated is such that every applicable rule is eventually applied.

A tableau $t$ is called closed iff some formula occurs in it on both the left and the right. A set or tree of tableaux is closed iff some tableau in the set is closed. A system of alternative sets is closed iff every set of the system is closed. A construction started out by putting $A$ on the right of the main tableau of the construction is called the construction for $A$. We can place the following restrictions on constructions: A rule is not to be applied to a tableau of a closed set; nor is it to be applied if it is "superfluous" (e.g., Λl is not to be applied if $A$ and $B$ already appear on the left of the tableau $t$ in question).

Let us call an alternative set at any stage of a construction terminal iff it is not replaced at any stage of the construction by another set or pair of sets; thus, in particular, every closed set is terminal. In any construction, let $\alpha$ be some fixed sequence $\mathcal{S}_1, \mathcal{S}_2, \ldots$ of alternative sets such that $\mathcal{S}_1$ is a set at the first stage of the construction and $\mathcal{S}_{i+1}$ is the set, or one of the two sets, which, at the $(i+1)$-th stage, replaces $\mathcal{S}_i$; $\alpha$ terminates at $\mathcal{S}_n$ iff $\mathcal{S}_n$ is terminal. (If the construction does not terminate there is at least one infinite such sequence $\alpha$.) Any tableau $t$ in $\mathcal{S}_1$ or in $\mathcal{S}_{i+1}$ which is not an immediate descendant of any tableau in $\mathcal{S}_i$ is called an initial tableau. Let $K$ be the set of all sequences $\tau$ of tableaux $t_1, t_2, \ldots$ such that $t_1$ is an initial tableau and $t_{i+1}$ is an immediate descendant of $t_i$, and $\tau$ terminates at $t_n$ iff $t_n$ belongs to a terminal set $\mathcal{S}_m$. Let $\tau_0$ be that member of $K$ whose first term $t_1$ is in $\mathcal{S}_1$. Let $\tau \rho \tau'$, for $\tau, \tau'$ in $K$, iff for some $\mathcal{S}_i$ in $\alpha$ there are terms $t, t'$ of $\tau, \tau'$ in $\mathcal{S}_i$ such that $t R t'$ ($R$ the ancestral of the tree ordering $S$). Then, intuitively, $(\tau_0, K, \rho)$ forms a q.m.s. with domain function
'n 'n
.0
'in
If a quantificational model $\phi$ is defined so that, for any sentence letter $P$, $\phi(P, \tau) = T$ iff $P$ appears on the left of some $t$ in $\tau$, and, for any predicate letter $P^n$, $\phi(P^n, \tau)$ is the set of $n$-tuples $(x_1, \ldots, x_n)$ of variables such that $P^n(x_1, \ldots, x_n)$ appears on the left of some $t$ in $\tau$, then, for every formula $B$, if $B$ appears on the left of some $t$ in $\tau$, $\phi(B, \tau) = T$ (relative to the assignment of each free variable in $B$ to itself). Further, the dual law that, for every $B$, if $B$ appears on the right of some $t$ in $\tau$, then $\phi(B, \tau) = F$, holds iff $\alpha$ does not terminate in a closed set $\mathcal{S}_n$. Hence, if the construction was a construction for $A$, this is just the condition under which $\alpha$ provides a countermodel for $A$.
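To make the notion of a countermodel concrete, the following Python sketch evaluates propositional formulae on a small finite tree model. It is an illustration only, assuming the usual truth conditions for such models (a negation or implication is evaluated over all points accessible under $R$, and atoms true at a point stay true at its successors); the encoding of formulae as tuples and all names are hypothetical.

```python
def reachable(succ, h):
    """All points h2 with h R h2, where R is the reflexive and
    transitive closure of the successor relation given by succ."""
    seen, stack = {h}, [h]
    while stack:
        node = stack.pop()
        for nxt in succ.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def value(formula, h, succ, atoms):
    """Truth value of a formula at point h.

    Formulae are tuples: ("atom", name), ("not", A), ("and", A, B),
    ("or", A, B), ("imp", A, B). atoms[h] is the set of atoms true at h.
    """
    tag = formula[0]
    if tag == "atom":
        return formula[1] in atoms[h]
    if tag == "and":
        return value(formula[1], h, succ, atoms) and value(formula[2], h, succ, atoms)
    if tag == "or":
        return value(formula[1], h, succ, atoms) or value(formula[2], h, succ, atoms)
    if tag == "not":
        # True iff the negated formula fails at every accessible point.
        return all(not value(formula[1], k, succ, atoms) for k in reachable(succ, h))
    if tag == "imp":
        # True iff, at every accessible point, the antecedent fails or the consequent holds.
        return all(
            (not value(formula[1], k, succ, atoms)) or value(formula[2], k, succ, atoms)
            for k in reachable(succ, h)
        )
    raise ValueError("unknown connective: " + str(tag))


# Two-point tree: origin G with one successor H; P becomes true only at H.
succ = {"G": ["H"]}
atoms = {"G": set(), "H": {"P"}}
P = ("atom", "P")
excluded_middle = ("or", P, ("not", P))
double_negation = ("imp", ("not", ("not", P)), P)
```

On this two-point tree both `excluded_middle` and `double_negation` receive the value false at the origin `G`, which is the kind of tree countermodel a non-closed construction is meant to produce.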
THEOREM 2: The construction for $A$ is closed if and only if $A$ is valid.
The proof, which follows the lines sketched intuitively above, and in addition shows that the alternative sets of the construction for $A$ exhaust the possibilities of finding a countermodel for it, is omitted because it is a routine variation on the proofs of the corresponding theorems of [2] and [16] 1).

3. Completeness theorem

3.1. Consistency property

THEOREM 3: If $A$ is provable in Heyting's predicate calculus, then $A$ is valid.
This theorem is almost trivial; we need only verify that, in a standard formalization of Heyting's predicate calculus, the axioms are all valid, and the rules preserve validity. Such a verification is left to the reader. It follows that if $A$ is provable, the construction for $A$ is closed.

3.2. Completeness property
We show that every valid formula $A$ is provable by showing that if the construction for $A$ is closed, then $A$ is provable. As in [2] and [16], we do this using a notion of "characteristic formula." As in [2], define the rank of a tableau in a finite tree of tableaux (or, indeed, of a node in any finite tree), as follows: An endpoint of the tree has rank 0. If $t$ is not an endpoint, let $t_1, \ldots, t_n$ be its successors; then $\mathrm{Rank}(t) = \mathrm{Max}\{\mathrm{Rank}(t_i)\} + 1$. It is easy to verify that, for any finite tree of tableaux, a unique rank is defined for each tableau of the tree.

1) Define $A$ to be tree valid iff $\phi(A, G) = T$ for every model $\phi$ on a tree q.m.s. $(G, K, R)$. Then what really is readily proved is that the construction is closed iff $A$ is tree valid. But, by section 1.2 above, validity coincides with tree validity. Alternatively, we can argue as follows without use of section 1.2: Clearly validity implies tree validity, and provability implies validity. The completeness result below shows that tree validity implies provability, so the three notions coincide. We could have defined a tableau procedure, based on a relation $R$, which would have been more appropriate to models than to tree models; a reader familiar with [2] will know how this could be carried out. Notice that, as observed in analogous cases in [2] and [16], the countermodels for non-valid formulae obtained by Theorem 2 from tableaux are always on a countable tree q.m.s. $(G, K, R)$ with a countable set $U$ of individuals involved. This "Löwenheim-Skolem" result will be used in part II to show that the present completeness results include those of Beth [8].
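The definition of rank given above translates directly into a short recursive function. The sketch below is illustrative only; the dictionary encoding of the tree and the tableau names are assumptions made for the example.

```python
def rank(succ, t):
    """Rank of node t in a finite tree.

    succ maps each node to the list of its immediate successors;
    a node that is missing from succ, or mapped to an empty list,
    is an endpoint and has rank 0.
    """
    children = succ.get(t, [])
    if not children:
        return 0
    return max(rank(succ, child) for child in children) + 1


# Hypothetical tree: t0 has successors t1 and t2, and t1 has successor t3.
tree = {"t0": ["t1", "t2"], "t1": ["t3"]}
ranks = {name: rank(tree, name) for name in ["t0", "t1", "t2", "t3"]}
```

Because the tree is finite, the recursion terminates, which is the content of the remark that a unique rank is defined for each tableau.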
Given any tableau $t$ in a tree of tableaux, define the following sequence $\{t_i\}$: $t_0 = t$, $t_{j+1}$ = the predecessor of $t_j$, if such a predecessor exists, and undefined otherwise. The sequence is clearly finite, and its last term is the origin of the tree. We call it the "path from $t$ back to the origin." The terms of the sequence other than $t$ "come before $t$" on the tree. For any $t$ on a tree, let $X(t)$ be the set of all variables occurring free in $t$ but not in any tableau coming before it.

At any stage of a construction, the tableaux of an alternative set form a finite tree. We define the characteristic formula of a tableau $t$ in the set at a given stage by induction on its rank in the set. Given a tableau $t$, let $A_1, \ldots, A_m$ [$B_1, \ldots, B_n$] be the formulae occurring on the left [right] of $t$. Further, let $x_1, \ldots, x_q$ be the elements of $X(t)$. (Possibly $q = 0$.) If $\mathrm{Rank}(t) = 0$, then the characteristic formula of $t$ is defined as $(x_1)\ldots(x_q)(A_1 \wedge \ldots \wedge A_m .\supset. B_1 \vee \ldots \vee B_n)$; or, if there are no formulae on the left [right] of $t$, as $(x_1)\ldots(x_q)(B_1 \vee \ldots \vee B_n)$ [$(x_1)\ldots(x_q)\neg(A_1 \wedge \ldots \wedge A_m)$]. If $\mathrm{Rank}(t) > 0$, let $t_1, \ldots, t_p$ be the successors of $t$, and let $C_1, \ldots, C_p$ be the corresponding characteristic formulae. Then the characteristic formula of $t$ is $(x_1)\ldots(x_q)(A_1 \wedge \ldots \wedge A_m .\supset. B_1 \vee \ldots \vee B_n \vee C_1 \vee \ldots \vee C_p)$; or, if there are no formulae on the left [right] of $t$, the characteristic formula is $(x_1)\ldots(x_q)(B_1 \vee \ldots \vee B_n \vee C_1 \vee \ldots \vee C_p)$ [$(x_1)\ldots(x_q)(A_1 \wedge \ldots \wedge A_m .\supset. C_1 \vee \ldots \vee C_p)$].

The characteristic formula of an alternative set (tree) of tableaux is defined as the characteristic formula of the main tableau of the set. The characteristic formula of the entire system of alternative sets at a given stage of a construction is defined as the conjunction of the characteristic formulae of the alternative sets of the system. In a natural sense, the present notion of characteristic formula is "dual" to that of [2] and [16].
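For the propositional case, where $X(t)$ is always empty and no quantifier prefix arises, the inductive definition of the characteristic formula can be sketched as a recursive string builder. This is an illustration under stated assumptions, not the paper's own formulation: formulae are plain strings, the ASCII connectives `&`, `|`, `->` and `~` stand in for conjunction, disjunction, implication and negation, and a tableau of rank 0 with an empty right column is read as the negation of the conjunction of its left column (the reading used in Case Nr of the proof that follows).

```python
def characteristic_formula(t, left, right, succ):
    """Characteristic formula of tableau t (propositional case only).

    left[t] and right[t] list the formulae in the two columns of t;
    succ[t] lists the immediate successors of t in the tree.
    """
    disjuncts = list(right.get(t, []))
    # The characteristic formulae of the successors join the right-hand disjunction.
    disjuncts += [characteristic_formula(s, left, right, succ) for s in succ.get(t, [])]
    conj = " & ".join(left.get(t, []))
    disj = " | ".join(disjuncts)
    if conj and disj:
        return "(" + conj + " -> " + disj + ")"
    if disj:
        return "(" + disj + ")"
    if conj:
        # Rank 0 with nothing on the right: assumed reading, see lead-in.
        return "~(" + conj + ")"
    raise ValueError("tableau " + str(t) + " is empty")


# Hypothetical stage of the construction for P | ~P: rule Vr has put P and ~P
# on the right of t0, and rule Nr has started a new tableau t1 with P on the left.
left = {"t0": [], "t1": ["P"]}
right = {"t0": ["(P | ~P)", "P", "~P"], "t1": []}
succ = {"t0": ["t1"]}
formula = characteristic_formula("t0", left, right, succ)
```

The resulting string has the shape described in the definition: the right-hand formulae of `t0` followed by the characteristic formula of its successor as a further disjunct.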
It may facilitate the reader's comprehension of the notion of characteristic formula if he consults the corresponding treatment of characteristic formulae in [2], [16].

LEMMA: If $A_0$ is the characteristic formula of the initial stage of a construction, and $B_0$ is the characteristic formula of any stage of the construction, then $\vdash B_0 \supset A_0$.
PROOF. It suffices to show that the characteristic formula of any stage of the construction implies the characteristic formula of the preceding stage. But the characteristic formula of the $m$th stage has in general the form $D_1 \wedge \ldots D_j \wedge \ldots D_n$, where the $D_i$ ($1 \le i \le n$) are the characteristic formulae of the alternative sets of the stage. The rule which is applied and changes the $m$th stage into the $(m+1)$th affects only one alternative set, say with characteristic formula $D_j$. If the rule is Pl, Λr, or Vl, it will change this set into two distinct alternative sets, with characteristic formulae $D'_j$ and $D''_j$; we wish to prove, then, $\vdash D_1 \wedge \ldots D'_j \wedge D''_j \wedge \ldots D_n .\supset. D_1 \wedge \ldots D_j \wedge \ldots D_n$. To do this, it suffices to prove $\vdash D'_j \wedge D''_j .\supset. D_j$. Similarly, if the rule applied is other than Pl, Λr, or Vl, then $D_j$ is transformed into $D'_j$; to prove that $\vdash D_1 \wedge \ldots D'_j \wedge \ldots D_n .\supset. D_1 \wedge \ldots D_j \wedge \ldots D_n$, it suffices to prove $\vdash D'_j \supset D_j$. So, when a rule is applied transforming the $m$th stage of a construction into the $(m+1)$th, we need only consider the characteristic formula of the set to which the rule is actually applied.

Suppose, then, a rule (other than Pl, or Λr, or Vl) transforms a set $\mathcal{S}$ with characteristic formula $D_j$ into one with characteristic formula $D'_j$; we wish to prove $\vdash D'_j \supset D_j$. Let $t$ be the tableau to which the rule is actually applied, and let $C$ be its characteristic formula. Further, let $C'$ be the characteristic formula of the tableau $t'$ into which $t$ is transformed by the given rule. (The rules Nr, Pr and Πr leave $t$ unchanged, appending a new tableau $t^1$. In this case $t'$ will be identical with $t$, but the new characteristic formula $C'$ of $t$ will not be identical with the old one $C$.) Suppose we can show $\vdash C' \supset C$. Then if $t$ is the main tableau of the set $\mathcal{S}$, we have shown $\vdash D'_j \supset D_j$. Otherwise, let $t_1$ be the predecessor at stage $m$ of $t$, let $t'_1$ be the predecessor at stage $m+1$ of $t'$, and let $C_1$ [$C'_1$] be the characteristic formula of $t_1$ [$t'_1$]. Then $C_1$ is a universal quantification (u.q.) of a formula of the form $X .\supset. Y \vee C$, and $C'_1$ is a u.q. of $X .\supset. Y \vee C'$. Since $\vdash C' \supset C$, clearly $\vdash (X .\supset. Y \vee C') \supset (X .\supset. Y \vee C)$. Applying universal generalization to this last statement, and distributing universal quantifiers across the implication sign, we obtain $\vdash C'_1 \supset C_1$. If $t_1$ is the main tableau of $\mathcal{S}$, then $C'_1 \supset C_1$ is $D'_j \supset D_j$. Otherwise, let $t_2$ [$t'_2$] be the predecessor of $t_1$ [$t'_1$], and apply the same reasoning as before. Eventually we will obtain $\vdash D'_j \supset D_j$.

Thus in the case of any rule other than Pl, Vl, or Λr, we need only consider the tableau $t$ to which the rule is actually applied, and prove the formula $C' \supset C$ stated above. Notice that in general $C$, the characteristic formula of $t$, is a u.q. of a certain formula $B$, and $C'$ is a u.q. of a certain formula $B'$. If we prove $\vdash B' \supset B$, then by universal generalization and distribution of the quantifiers across the implication sign, we can obtain $\vdash C' \supset C$.
Bearing these remarks in mind, we break down the proof into the following cases, depending on the rule applied to obtain the $(m+1)$th stage from the $m$th. We can say a case is "justified," if we have shown, for the case, that $\vdash D'_j \supset D_j$, which usually reduces to $\vdash B' \supset B$. The reader is advised to consult the similar treatments in [2] and [16]. In considering a rule, we will in general assume that the tableau $t$ to which it is applied contains formulae both on the left and the right, and that its characteristic formula is therefore an implication. The cases where the left or right side is empty will be left to the reader.

Case Nl. The characteristic formula of $t$ is a u.q. of $X \wedge \neg A .\supset. Y$; after $A$ has been put on the right, its characteristic formula becomes a u.q. of $X \wedge \neg A .\supset. Y \vee A$. The case is justified by $\vdash X \wedge \neg A .\supset. Y \vee A :\supset: X \wedge \neg A .\supset. Y$.

Case Nr. The characteristic formula of $t$ is a u.q. of $X .\supset. \neg A \vee Y$. When we start out a new tableau $t^1$ with $A$ on the left, and $t S t^1$, the characteristic formula of $t^1$ is $\neg A$ (since $X(t^1)$ is empty because any free variable of $A$ already occurs in $t$), and that of $t$ becomes a u.q. of $X .\supset. \neg A \vee Y \vee \neg A$. The case is justified by $\vdash X .\supset. \neg A \vee Y \vee \neg A :\supset: X .\supset. \neg A \vee Y$.

Case Λl. Justified by $\vdash X \wedge (A \wedge B) \wedge A \wedge B .\supset. Y :\supset: X \wedge (A \wedge B) .\supset. Y$.

Case Λr. Let the characteristic formula of $t$, call it $C$, be a u.q. of $X .\supset. Y \vee (A \wedge B)$. The rule Λr "splits" $t$ into two alternative tableaux, $t'$ and $t''$, whose characteristic formulae $C'$ and $C''$ are u.q.'s of $X .\supset. Y \vee (A \wedge B) \vee A$ and $X .\supset. Y \vee (A \wedge B) \vee B$, respectively. Using $\vdash (X .\supset. Y \vee (A \wedge B) \vee A) \wedge (X .\supset. Y \vee (A \wedge B) \vee B) :\supset: X .\supset. Y \vee (A \wedge B)$, and generalizing, and distributing quantifiers, we obtain $\vdash C' \wedge C'' .\supset. C$. If $t$ is the main tableau of the set, this is the desired result $\vdash D'_j \wedge D''_j .\supset. D_j$. Otherwise, let $t_1$ be the predecessor of $t$. The characteristic formula $C_1$ of $t_1$ is a u.q. of $X_1 .\supset. Y_1 \vee C$; it is transformed by Λr into two alternative characteristic formulae $C'_1$ and $C''_1$, which are u.q.'s, respectively, of $X_1 .\supset. Y_1 \vee C'$ and $X_1 .\supset. Y_1 \vee C''$. Using $\vdash C' \wedge C'' .\supset. C$, we easily obtain $\vdash C'_1 \wedge C''_1 .\supset. C_1$. Continuing this process along the path from $t$ back to the origin, in a finite number of steps we obtain $\vdash D'_j \wedge D''_j .\supset. D_j$.

Case Pl. Like Λr, using $\vdash (X \wedge (A \supset B) .\supset. Y \vee A) \wedge (X \wedge (A \supset B) \wedge B .\supset. Y) :\supset: X \wedge (A \supset B) .\supset. Y$.
Case Pr. Let the characteristic formula of $t$ be a u.q. of $X .\supset. Y \vee (A \supset B)$. Pr instructs us to start out a tableau $t^1$, with $A$ on the left and $B$ on the right, whose characteristic formula is thus $A \supset B$ ($X(t^1)$ being empty). Then the characteristic formula of $t$ is transformed into a u.q. of $X .\supset. Y \vee (A \supset B) \vee (A \supset B)$, and $\vdash X .\supset. Y \vee (A \supset B) \vee (A \supset B) :\supset: X .\supset. Y \vee (A \supset B)$ justifies the case.

Case Vl. Like Λr, using $\vdash (X \wedge (A \vee B) \wedge A .\supset. Y) \wedge (X \wedge (A \vee B) \wedge B .\supset. Y) :\supset: (X \wedge (A \vee B) .\supset. Y)$.

Case Vr. Justified by $\vdash X .\supset. Y \vee (A \vee B) \vee A \vee B :\supset: X .\supset. Y \vee (A \vee B)$.

Case Σl. If $t$ has as characteristic formula $C$, a u.q. of $X \wedge (\exists x)A(x) .\supset. Y$, after application of Σl, $t$ is transformed into $t^1$, whose characteristic formula $C'$ is a u.q. of $X \wedge (\exists x)A(x) \wedge A(a) .\supset. Y$. Since $a$ is a new variable not previously introduced, $a \in X(t^1)$. Thus, we can take $C'$ to be a u.q. of $(a)(X \wedge (\exists x)A(x) \wedge A(a) .\supset. Y)$. So $\vdash (a)(X \wedge (\exists x)A(x) \wedge A(a) .\supset. Y) :\supset: X \wedge (\exists x)A(x) .\supset. Y$ justifies the case.

Case Σr. Justified by $\vdash X .\supset. Y \vee (\exists x)A(x) \vee A(a) :\supset: X .\supset. Y \vee (\exists x)A(x)$.

Case Πl. Justified by $\vdash X \wedge (x)A(x) \wedge A(a) .\supset. Y :\supset: X \wedge (x)A(x) .\supset. Y$.

Case Πr. The characteristic formula of $t$ is a u.q. of $X .\supset. Y \vee (x)A(x)$. Πr instructs us to start out a new tableau $t^1$, with $t S t^1$ and with $A(a)$ on the right, where $a$ has not previously been used. Then $X(t^1) = \{a\}$, since $a$ is the only free variable of $t^1$ which does not occur in $t$. Hence the characteristic formula of $t^1$ is $(a)A(a)$, and the characteristic formula of $t$ is transformed into a u.q. of $X .\supset. Y \vee (x)A(x) \vee (a)A(a)$. So $\vdash X .\supset. Y \vee (x)A(x) \vee (a)A(a) :\supset: X .\supset. Y \vee (x)A(x)$ justifies the case.
Finally, we must justify the rule stipulating that if a formula $A$ appears on the left of a tableau $t$, and $t S t^1$, we must put $A$ on the left of $t^1$. This is justified by $\vdash X \wedge A .\supset. Y \vee ((X' \wedge A) \supset Y') :\supset: X \wedge A .\supset. Y \vee (X' \supset Y')$. The lemma is proved.

THEOREM 4: If $A$ is valid, then $A$ is provable in Heyting's predicate calculus.

PROOF. We can assume $A$ has no free variables. Since $A$ is valid, the construction for $A$ is closed. Then there is a stage at which each alternative set is closed; let the characteristic formula of that stage be $D_1 \wedge \ldots D_n$, where the $D_j$'s are the characteristic formulae of the alternative sets of the stage. By the lemma, $\vdash D_1 \wedge \ldots D_n .\supset. A$ (since $A$ is the characteristic formula of the initial stage). So it suffices to show $\vdash D_j$ for each $j$. The alternative set whose characteristic formula is $D_j$, being closed, contains a closed tableau $t$. Then $t$ contains a formula $B$ on both sides, so its characteristic formula $C$ is a u.q. of $X \wedge B .\supset. Y \vee B$. Clearly $\vdash C$. If $t$ is the main tableau of the set, this is $\vdash D_j$. Otherwise, let $t_1$ be the predecessor of $t$. Then the characteristic formula $C_1$ of $t_1$ is a u.q. of $X' .\supset. Y' \vee C$. Clearly $\vdash C_1$. Continuing in this manner, we are driven back along the path from $t$ to the origin until we obtain $\vdash D_j$. Q.E.D.

REMARK. The theorem gives a finitary proof that if the construction for $A$ is closed, $\vdash A$. We could have proved it alternatively by showing that the tableau procedure is equivalent to a standard Gentzen formulation of Heyting's system. Of course the theorem and proof apply to the propositional calculus, even though the proof was carried out for the predicate calculus.
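The propositional implications used to justify the individual cases of the lemma, and the provability of the characteristic formula of a closed tableau used in Theorem 4, are all intuitionistically provable. As a hedged illustration, three of them are checked below in Lean 4, using only core definitions and no classical axioms; the theorem names are hypothetical and the propositional variables follow the lettering of the cases.

```lean
-- Case Nl: (X ∧ ¬A → Y ∨ A) → (X ∧ ¬A → Y)
theorem case_Nl (X Y A : Prop) :
    (X ∧ ¬A → Y ∨ A) → (X ∧ ¬A → Y) :=
  fun h hxa => (h hxa).elim id (fun a => absurd a hxa.2)

-- Case Pr: the duplicated disjunct (A → B) can be dropped.
theorem case_Pr (X Y A B : Prop) :
    (X → (Y ∨ (A → B)) ∨ (A → B)) → (X → Y ∨ (A → B)) :=
  fun h x => (h x).elim id (fun f => Or.inr f)

-- Closed tableau: a formula B on both sides makes X ∧ B → Y ∨ B provable.
theorem closed_tableau (X Y B : Prop) : X ∧ B → Y ∨ B :=
  fun h => Or.inr h.2
```

The remaining propositional cases can be checked in the same style; the quantifier cases additionally need the universal generalization and distribution steps described in the proof.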
References

[1] Saul A. Kripke, Semantical Analysis of Modal Logic (abstract). The Journal of Symbolic Logic 24 (1959) 323-324.
[2] Saul A. Kripke, Semantical Analysis of Modal Logic I. Normal Modal Propositional Calculi. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik 9 (1963) 67-96.
[3] Saul A. Kripke, Semantical Considerations on Modal and Intuitionistic Logic. Acta Philosophica Fennica 16 (1963) 83-94.
[4] Saul A. Kripke, The Undecidability of Monadic Modal Quantification Theory. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik 8 (1962) 113-116.
[5] Paul J. Cohen, The Independence of the Continuum Hypothesis. Proceedings of the National Academy of Sciences, U.S.A. 50 (1963) 1143-1148.
[6] E. W. Beth, Observations on an Independence Proof for Peirce's Law (abstract). The Journal of Symbolic Logic 25 (1960; published, 1962) 389.
[7] M. A. E. Dummett and E. J. Lemmon, Modal Logics between S4 and S5. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik 4 (1958) 250-264.
[8] E. W. Beth, Semantic Construction of Intuitionistic Logic. Mededelingen der Koninklijke Nederlandse Akademie van Wetenschappen, Afd. Letterkunde, Nieuwe Reeks, Deel 19, No. 11.
[9] V. H. Dyson and G. Kreisel, Analysis of Beth's Semantic Construction of Intuitionistic Logic. Technical Report no. 3, Stanford University Applied Mathematics and Statistics Laboratories, Stanford, California.
[10] S. C. Kleene, Introduction to Metamathematics. (Van Nostrand, New York; North-Holland Publishing Co., Amsterdam; and P. Noordhoff Ltd., Groningen, 1952).
[11] G. Kreisel, A Remark on Free Choice Sequences and the Topological Completeness Proofs. The Journal of Symbolic Logic 23 (1958) 369-388.
[12] A. Heyting, Intuitionism: An Introduction. (North-Holland Publishing Co., Amsterdam 1956).
[13] S. Kuroda, Intuitionistische Untersuchungen der formalistischen Logik. Nagoya Mathematical Journal 2 (1951) 35-47. Known only from references.
[14] A. A. Markov, O nepreryvnosti konstruktivnyh funkcij (On the continuity of constructive functions). Uspehi Matem. Nauk 9 (1954) 226-230. Known only from references.
[15] G. Kreisel, On Weak Completeness of Intuitionistic Predicate Logic. The Journal of Symbolic Logic 27 (1962) 139-158.
[16] Saul A. Kripke, A Completeness Theorem in Modal Logic. The Journal of Symbolic Logic 24 (1959) 1-14.
[17] G. Kreisel, Set Theoretic Problems suggested by the Notion of Potential Totality, in: Infinitistic Methods (Warsaw 1961).

Note (added in proof, August 9, 1964). We have since seen Feferman's paper, and his version of forcing is indeed virtually identical with ours, although he, of course, does not base it on any model theory for or connection with intuitionistic logic. He credits his version to Dana Scott.

Note (added in proof, October 28, 1964). In connection with the "Remark" at the end of section 1.1, it should be pointed out that the example in part 1 of the Remark already refutes Markov's principle. For we observed there that, in FC, (a) $(\alpha \in B)\,\neg(x)(\alpha(x) = 0)$, but also (b) $\neg(\alpha \in B)(\exists x)(\alpha(x) = 1)$. By (b), noting that since $B$ is the binary spread, $(\alpha \in B)(x)(\alpha(x) \neq 0 \supset \alpha(x) = 1)$, we have (c) $\neg(\alpha \in B)(\exists x)(\alpha(x) \neq 0)$. But (a) and (c) jointly contradict Markov's principle. The example in part 2 of the Remark is of interest in showing that a single counterexample can refute both Markov's principle and Kuroda's conjecture. It should be noted that Markov's principle would imply, for $\alpha$ on the full binary spread, that $\neg(x)(\alpha(x) = 0) \supset (\exists x)(\alpha(x) = 1)$. From this it is easy to derive, for a real number $a$, that $a \neq 0$ implies $a \mathrel{\#} 0$ (similarly to part 1 of the Remark). Hence if Brouwer's disproof (using ips depending on the solving of problems) of the latter is accepted, Brouwer has already refuted Markov's principle.

I wish to thank M. A. E. Dummett and John Crossley for their help in editing this paper, and in particular, M. A. E. Dummett for an important correction in section 1.2.
SET THEORY AND HIGHER-ORDER LOGIC 1)

RICHARD MONTAGUE
University of California, Los Angeles, Calif., USA
Several mutual applications of set theory and higher-order logic are developed. Second-order logic is used to discover the standard models of Zermelo-Fraenkel set theory; consideration of standard models leads to the introduction of new systems of set theory; one of these systems is applied in finding a definition of truth for higher-order sentences; and finally Zermelo-Fraenkel set theory with individuals is given a philosophical justification as logically true within higher-order logic.

1. Standard models
Let us consider three well-known first-order theories. The first, called Peano's arithmetic, has the non-logical constants $0$, $S$, $+$, $\cdot$, and the following axioms 2):

$\neg\, 0 = Sx$,
$Sx = Sy \rightarrow x = y$,
$x + 0 = x$,
$x + Sy = S(x + y)$,
$x \cdot 0 = 0$,
$x \cdot Sy = (x \cdot y) + x$,
$P[0] \wedge \bigwedge x[P[x] \rightarrow P[Sx]] \rightarrow \bigwedge x P[x]$.

1) I am indebted to the United States' National Science Foundation, which supported the preparation of most of this paper under grant number NSF GP 1603 (Montague).
2) It is convenient for the purposes of the present paper to regard a first-order theory as determined by a sequence of non-logical constants and a set of axioms. I use the logical constants $\neg$, $\wedge$, $\vee$, $\rightarrow$, $\leftrightarrow$, $\bigwedge$, $\bigvee$, $=$, which are the respective symbols of negation, conjunction, disjunction, implication, equivalence, universal quantification, existential quantification, and identity. (I use "$\neg$", "$\wedge$", etc. as names of certain symbols of the object language, and I indicate concatenation by juxtaposition.)
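The last principle is regarded as a schema, called the Induction Schema; we take as axioms all formulas of Peano's arithmetic obtainable without clash of variables by substituting a formula for P in the schema. The second theory, called (at least in an alternative formulation 1)) the theory of real closed fields, has the non-logical constants $0$, $1$, $+$, $\cdot$, $-$, ${}^{-1}$, $\leq$, and the following axioms:

$$x + (y + z) = (x + y) + z,$$
$$x + y = y + x,$$
$$x + 0 = x,$$
$$x + (-x) = 0,$$
$$x \cdot (y \cdot z) = (x \cdot y) \cdot z,$$
$$x \cdot y = y \cdot x,$$
$$x \cdot 1 = x,$$
$$\neg\, x = 0 \rightarrow x \cdot x^{-1} = 1,$$
$$x \cdot (y + z) = (x \cdot y) + (x \cdot z),$$
$$\neg\, 0 = 1,$$
$$0^{-1} = 0,$$
$$0 \leq x \vee 0 \leq -x,$$
$$0 \leq x \wedge 0 \leq -x \rightarrow x = 0,$$
$$0 \leq x \wedge 0 \leq y \rightarrow 0 \leq x + y \wedge 0 \leq x \cdot y,$$
$$x \leq y \leftrightarrow 0 \leq y + (-x),$$
$$\bigvee x\, P[x] \wedge \bigvee y \bigwedge x\,[P[x] \rightarrow x \leq y] \rightarrow \bigvee y\,\bigl(\bigwedge x\,[P[x] \rightarrow x \leq y] \wedge \bigwedge z\,[\bigwedge x\,[P[x] \rightarrow x \leq z] \rightarrow y \leq z]\bigr).$$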
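The last principle is called the Continuity Schema and plays a role analogous to that of the Induction Schema; that is, we consider as axioms all formulas of the present theory obtainable (again without clash of variables) by substituting a formula for P in the schema.

1) The usual formulation involves fewer primitive symbols, and hence has somewhat more complicated axioms. It is clear that several of our symbols, for instance ${}^{-1}$, could be defined in terms of the others.

SET THEORY AND HIGHER-ORDER LOGIC

As a final example, consider Zermelo-Fraenkel set theory, whose only non-logical constant is $\in$ and whose axioms are the following:

$$\bigwedge u\,[u \in a \leftrightarrow u \in b] \rightarrow a = b,$$
$$\bigvee u\; u \in a \rightarrow \bigvee u\,[u \in a \wedge \neg \bigvee v\,(v \in u \wedge v \in a)],$$
$$\bigvee a \bigwedge u\,[u \in a \leftrightarrow u = x \vee u = y],$$
$$\bigvee b \bigwedge u\,[u \in b \leftrightarrow \bigvee v\,(u \in v \wedge v \in a)],$$
$$\bigvee b \bigwedge u\,[u \in b \leftrightarrow \bigwedge x\,(x \in u \rightarrow x \in a)],$$
$$\bigvee a\,[\bigvee u\; u \in a \wedge \bigwedge u\,(u \in a \rightarrow \bigvee v\,[u \in v \wedge v \in a])],$$
$$\bigwedge x \bigwedge y \bigwedge z \bigwedge a \bigwedge b \bigwedge c \bigwedge q \bigwedge r\,[\bigwedge u\,(u \in a \leftrightarrow u = x) \wedge \bigwedge u\,(u \in b \leftrightarrow u = x \vee u = y) \wedge \bigwedge u\,(u \in c \leftrightarrow u = x \vee u = z) \wedge \bigwedge u\,(u \in q \leftrightarrow u = a \vee u = b) \wedge \bigwedge u\,(u \in r \leftrightarrow u = a \vee u = c) \wedge P[q] \wedge P[r] \rightarrow y = z] \rightarrow \bigwedge s \bigvee t \bigwedge y\,[y \in t \leftrightarrow \bigvee x \bigvee a \bigvee b \bigvee q\,(x \in s \wedge \bigwedge u\,[u \in a \leftrightarrow u = x] \wedge \bigwedge u\,[u \in b \leftrightarrow u = x \vee u = y] \wedge \bigwedge u\,[u \in q \leftrightarrow u = a \vee u = b] \wedge P[q])].$$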
The last principle is called the Replacement Schema; as with the schemata above, we take as axioms all formulas of Zermelo-Fraenkel set theory obtainable without clash of variables by substituting a formula for P in this schema. 1) A possible model of Peano's arithmetic is a structure [...] $+ 1)$ = the set of x in A such that, for all y, if [...] for some ordinal $\alpha$; we call the least such ordinal the rank of $\mathfrak{A}$. If B is any set and $\alpha$ any ordinal greater than 0, then there is a type structure $\mathfrak{A}$ with rank $\alpha$ such that $\mathrm{In}_{\mathfrak{A}} = B$. If B is a non-empty set, there is a type structure $\mathfrak{A}$ of rank 0 such that $\mathrm{In}_{\mathfrak{A}} = B$. If $\mathfrak{A}$ and $\mathfrak{A}'$ are type structures of the same rank, and $\mathrm{In}_{\mathfrak{A}}$ and $\mathrm{In}_{\mathfrak{A}'}$ have the same cardinality, then $\mathfrak{A}$ is isomorphic to $\mathfrak{A}'$. If $\mathfrak{A}$ and $\mathfrak{A}'$ are type structures, and f is a one-to-one correspondence between $\mathrm{In}_{\mathfrak{A}}$ and $\mathrm{In}_{\mathfrak{A}'}$, then there is at most one isomorphism between $\mathfrak{A}$ and $\mathfrak{A}'$ which is an extension of f. If $\mathfrak{A}$ is a type structure, $\mathfrak{A}$ = [...] is non-empty, then $\langle \mathfrak{A}(\alpha), R', B \rangle$ is a type structure of rank $\alpha$, where R' is the restriction of R to $\mathfrak{A}(\alpha)$. If in second-order rank-free set theory and second-order rank-free set theory with individuals we drop the initial quantifier of the Aussonderungsaxiom and treat the result as a schema, we shall obtain two first-order theories, which we may call first-order rank-free set theory and first-order rank-free set theory with individuals respectively. Like Peano's arithmetic, these theories have no theoretical pre-eminence among a number of recursively axiomatized first-order subtheories of the corresponding second-order theories, but they have some practical interest. Various useful systems of set theory, some well-known and others as yet unexploited, can be obtained from these two first-order theories in a uniform way, by the addition of "axioms of infinity", that is, principles imposing conditions on the ranks of models. For example, Zermelo-Fraenkel set theory with individuals may be [...]
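THEOREM [...] $C'B \subseteq \beta_1$ and $\alpha_1 \cap \beta_1 = \emptyset$.
(ii) If $C'A$ or $C'B$ is finite, then $A \mathrel{)(}$ [...]

[...] $B \subseteq B_1$ and $A_1 \cap B_1 = \emptyset$. Let $\alpha_1 = C'A_1$, $\beta_1 = C'B_1$, then $\alpha_1$, $\beta_1$ are r.e. (recursive) by theorem I.4.2. Since $A_1$, $B_1$ are reflexive, $x \in \alpha_1 \leftrightarrow \langle x, x\rangle \in A_1$, and similarly for $\beta_1$ and $B_1$. Hence $A_1 \cap B_1 = \emptyset \leftrightarrow \alpha_1 \cap \beta_1 = \emptyset$. Conversely, suppose there are r.e. (recursive) sets $\alpha_1$, $\beta_1$ such that $C'A \subseteq \alpha_1$, $C'B \subseteq \beta_1$ and $\alpha_1 \cap \beta_1 = \emptyset$. Let $A_1 = \alpha_1^2$ and $B_1 = \beta_1^2$. Then $A_1$, $B_1$ are r.e. (recursive) and reflexive and they clearly r.e. (recursively) separate A and B. (ii) This part of the theorem follows at once from (i)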
CONSTRUCTIVE ORDER TYPES, I
201
and the fact that every finite set is recursive and so is the complement of a finite set. The second version of part (i) of this theorem is false if the relations are not assumed to be reflexive. For let T, R be the relations defined in the proof of theorem II.1.3.(iv), and suppose that there exist recursive sets $\tau$, $\rho$ such that $C'T \subseteq \tau$ and $C'R \subseteq \rho$ where $\tau \cap \rho = \emptyset$. Then the sets $\tau' = \{x : x \in C'T$ and x is (a Gödel number of) a single formula$\}$ and $\rho' = \{x : x \in C'R$ and x is (a Gödel number of) a single formula$\}$ are recursively separable by $\tau$ and $\rho$. But $\tau' = T_0$ and $\rho' = R_0$, which contradicts [15] theorem 22. The converse assertion, namely, that if there exist disjoint recursive sets containing the fields of A and B, then A and B are recursively separable, still holds, of course.

THEOREM II.1.5: Let $\alpha = C'A$, $\beta = C'B$; then $A \mathrel{)(} B$ if, and only if, there is a partial recursive function, p, such that

$$\delta p \supseteq \alpha \cup \beta, \qquad \rho p \subseteq \{0, 1\} \qquad \text{(S)}$$

and $x \in \alpha \cup \beta$ implies

$$x \in \alpha \leftrightarrow p(x) = 0 \;.\&.\; x \in \beta \leftrightarrow p(x) = 1.$$
PROOF. If $A \mathrel{)(} B$, then by the preceding theorem, there are r.e. sets $\alpha_1 \supseteq \alpha$ and $\beta_1 \supseteq \beta$ such that $\alpha_1 \cap \beta_1 = \emptyset$. For arbitrary r.e. set $\gamma$ let $c_\gamma(x)$ be the partial recursive function defined only on $\gamma$ such that $c_\gamma(x) = 1$ for $x \in \gamma$. Then $x \in \alpha_1$ implies $1 \dot{-} c_{\alpha_1}(x) = 0$ and $x \in \beta_1$ implies $c_{\beta_1}(x) = 1$. Let $T_\alpha$ be a Turing machine which calculates $1 \dot{-} c_{\alpha_1}$ and let $T_\beta$ be a Turing machine which calculates $c_{\beta_1}$. Further, let T(m, n) = the number (represented) on the tape of the Turing machine T at the m-th step 1) in the calculation for argument n. Now let a new machine $T_0$ be defined such that $T_0(m, n)$ is as follows:
(i) If $T_\alpha$, $T_\beta$ have not halted before the m-th step for argument n, then $T_0(2m+1, n) = T_\alpha(m+1, n)$,
(ii) If $T_\alpha$ has not halted before the (m+1)-st step and $T_\beta$ has not halted before the m-th step, then $T_0(2m+2, n) = T_\beta(m+1, n)$,
(iii) If $T_\alpha$ halts at the m-th step and $T_\beta$ has not halted before the m-th step, then $T_0$ halts at the (2m+1)-st step,

1) "Step" does not mean here just one operation of the Turing machine, but a whole phase in the calculation. We assume $m \geq 1$.
JOHN N. CROSSLEY
202
(iv) If $T_\beta$ halts at the m-th step and $T_\alpha$ has not halted before the (m+1)-st step, then $T_0$ halts at the (2m+2)-nd step.
Let p(x) be the function defined by the machine $T_0$. Then p is partial recursive and satisfies the conclusion of the theorem since, for an argument in $\alpha \cup \beta$, $T_\alpha$ halts if, and only if, $T_\beta$ does not. Conversely, let $\alpha_1 = \{x : p(x) = 0\}$ and $\beta_1 = \{x : p(x) = 1\}$. Then $\alpha_1$ and $\beta_1$ are r.e. and disjoint; the required result follows from the preceding theorem.
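The machine $T_0$ in the proof alternates between the steps of two machines and halts as soon as either of them halts. The following is a rough sketch of that dovetailing idea, assuming Python generators as stand-ins for Turing machines; `semi_decider`, `dovetail` and the `max_steps` cut-off are hypothetical names introduced only for illustration and do not reproduce the step numbering of the proof.

```python
def semi_decider(members, value):
    """Hypothetical stand-in for a machine that halts with `value`
    exactly on the arguments in `members` and runs forever otherwise."""
    def run(n):
        step = 0
        while True:
            step += 1
            # Pretend that membership of n is confirmed only after n + 1 steps.
            if n in members and step > n:
                return value
            yield  # one more step without halting
    return run

def dovetail(run_a, run_b, n, max_steps=None):
    """Alternate single steps of the two machines on argument n.
    Returns the value of whichever halts first; returns None only
    if the optional step budget `max_steps` is exhausted."""
    a, b = run_a(n), run_b(n)
    steps = 0
    while max_steps is None or steps < max_steps:
        for machine in (a, b):
            steps += 1
            try:
                next(machine)
            except StopIteration as halt:
                return halt.value
    return None

# Two disjoint sets playing the roles of alpha_1 and beta_1.
alpha_1 = {0, 2, 4}
beta_1 = {1, 3}
p = lambda x, max_steps=None: dovetail(
    semi_decider(alpha_1, 0), semi_decider(beta_1, 1), x, max_steps)
```

On arguments in `alpha_1` the sketch returns 0, on arguments in `beta_1` it returns 1, and elsewhere it diverges unless a step budget is supplied, which mirrors the partiality of p.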
THEOREM II.1.6: If A, B are r.e. (recursive) relations, then $A \mathrel{)(} B \leftrightarrow A \cap B = \emptyset$ ($A \mathrel{)(} B \leftrightarrow A \cap B = \emptyset$).
THEOREM II.1.7: Any two C.R.T.s have recursively separable representatives.
PROOF (v. [8], theorem 9(a)). Let $A \in A$ and $B \in B$ and let

$$C = \{\langle 2x, 2y\rangle : \langle x, y\rangle \in A\},$$
$$D = \{\langle 2x+1, 2y+1\rangle : \langle x, y\rangle \in B\}.$$

Then $C \simeq A$, $D \simeq B$ and $C \mathrel{)(} D$.
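The construction in this proof can be tried out on finite relations. The following is a small sketch, assuming relations are represented as Python sets of pairs of natural numbers; the helper names `code_even`, `code_odd` and `field` are illustrative only.

```python
def code_even(rel):
    """Image of a relation under x -> 2x (the relation C of the proof)."""
    return {(2 * x, 2 * y) for (x, y) in rel}

def code_odd(rel):
    """Image of a relation under x -> 2x + 1 (the relation D of the proof)."""
    return {(2 * x + 1, 2 * y + 1) for (x, y) in rel}

def field(rel):
    """All numbers occurring in some pair of the relation."""
    return {z for pair in rel for z in pair}

# Two small reflexive orderings on {0, 1} with overlapping fields.
A = {(0, 0), (0, 1), (1, 1)}
B = {(0, 0), (1, 0), (1, 1)}
C, D = code_even(A), code_odd(B)

# Parity is a recursive function separating the two fields.
separator = lambda x: x % 2
```

Since the fields of `C` and `D` consist of even and odd numbers respectively, the parity function plays the part of the function p of theorem II.1.5.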
II.2. THEOREM II.2.1: Let $A_1 \simeq A_2$, $B_1 \simeq B_2$, $A_1 \mathrel{)(} B_1$ and $A_2 \mathrel{)(} B_2$, then $A_1 \mathbin{\hat{+}} B_1 \simeq A_2 \mathbin{\hat{+}} B_2$.

PROOF. Let $\alpha_i = C'A_i$, $\beta_i = C'B_i$ (i = 1, 2). By hypothesis there exist recursive isotonisms p, q such that $p : A_1 \simeq A_2$ and $q : B_1 \simeq B_2$. Further, there are r.e. sets $\alpha_i'$, $\beta_i'$ (i = 1, 2) such that $\alpha_i \subseteq \alpha_i'$, $\beta_i \subseteq \beta_i'$ and $\alpha_i' \cap \beta_i' = \emptyset$. Let $p_1$ be the partial recursive function with domain $\delta p \cap \alpha_1'$ which is equal to p on $\delta p_1$ and let $p_2$ be the partial recursive function with range $\rho p_1 \cap \alpha_2'$ which is equal to $p_1$ on $\delta p_2$. Let $q_2$ be the partial recursive function whose definition is obtained by replacing p by q and $\alpha$ by $\beta$ in the preceding sentence. Then $\delta p_2 \cap \delta q_2 = \emptyset$ and $\rho p_2 \cap \rho q_2 = \emptyset$. Hence $r : A_1 \mathbin{\hat{+}} B_1 \simeq A_2 \mathbin{\hat{+}} B_2$ where r is the partial recursive function which is equal to $p_2$ on its domain and equal to $q_2$ on its (disjoint) domain. r is one-one since $\rho p_2 \cap \rho q_2 = \emptyset$ and $p_2$, $q_2$ are one-one. The other requirements are obviously satisfied.

By virtue of this theorem we can now define addition of C.R.T.s uniquely as follows:
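DEFINITION II.2.2: $A + B = \mathrm{CRT}(A \mathbin{\hat{+}} B)$ where $A \in A$, $B \in B$ and $A \mathrel{)(} B$.
Notation. $0 = \mathrm{CRT}(\emptyset)$. We write "$A + B$" for "$A \mathbin{\hat{+}} B$" when $A \mathrel{)(} B$.

THEOREM II.2.3: (i) $A + 0 = 0 + A = A$, (ii) $A + B = 0 \leftrightarrow A = 0 = B$, (iii) $(A + B)^* = B^* + A^*$.

PROOF of (ii). Let $A \in A$, $B \in B$ where $A \mathrel{)(} B$. Then $A + B = 0$ implies $A \cup B \cup (C'A \times C'B) = \emptyset$. Hence $A = B = \emptyset$.

THEOREM II.2.4: $+$ is associative, viz. for all $A, B, C \in \mathcal{R}$, $A + (B + C) = (A + B) + C$.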
PROOF. By definition II.2.2 there exist $A \in A$, $B \in B$ and $C \in C$ such that $B \mathrel{)(} C$ and $A \mathrel{)(} \{B + C\}$. Now the latter implies $A \mathrel{)(} B$ and $A \mathrel{)(} C$, hence $A + B$ is defined, $\{A + B\} \mathrel{)(} C$ and $(A + B) + C$ is well-defined. We leave the reader to verify that $A + (B + C) = (A + B) + C$.

As in the classical case addition is not commutative in general.

II.3. We can now introduce two relations on the collection $\mathcal{R}$ of all C.R.T.s. These relations are reflexive and transitive, i.e. are quasi-orderings. Later (§§ III, IV) we shall show that the former of these two quasi-orderings is anti-symmetric on a sub-collection of $\mathcal{R}$ and is a partial well-ordering of C.R.T.s of well-orderings.

DEFINITION II.3.1: $A \leq B$ if there is a C.R.T. C such that $A + C = B$. $A < B$ if there is a C.R.T. $C \neq 0$ such that $A + C = B$.

$A < B$ is not, in general, equivalent to $A \leq B \;\&\; A \neq B$. For let $A = \{\langle x, y\rangle : y \leq x\}$, $B = A \restriction (J - \{0\})$, $A = \mathrm{CRT}(A)$ and $B = \mathrm{CRT}(B)$. Then A, B are both of classical order type $\omega^*$ and clearly $B + 1 = A$, where $1 = \mathrm{CRT}(\{$ [...] $A = B$.

[...] 0, then $A.n = 0 \leftrightarrow A.\omega = 0 \leftrightarrow A$ [...]
(x) $m \leq n \rightarrow A.m \leq A.n$,
(xi) $m \leq n \rightarrow \sum_{i=0}^{m} A_i \leq \sum_{i=0}^{n} A_i$.
PROOF. The proofs of the various parts of this theorem are elementary and we only prove parts (ii), (viii) as examples, and leave the other parts to the reader.
(ii) Suppose $p : A \simeq B$, then $q : A.\omega \simeq B.\omega$ where $q(z) = j(pk(z), l(z))$.
(viii) Let $A \in A$; $A.\omega$ then belongs to $A.\omega$ by part (ii). Let q(x), r(x) be the (primitive) recursive functions such that

$$x = n\,q(x) + r(x) \quad\text{and}\quad 0 \leq r(x) < n$$

and let $p(x) = j(j(k(x), r(l(x))), q(l(x)))$. Then p is one-one and (primitive) recursive. We assert that p is relation preserving between A.m and (A.n).m, for

$$(A.n).m = \{\langle j(j(a, s), u),\, j(j(b, t), v)\rangle : (u < v \;.\!\vee.\; u = v \;\&\; s < t) \;\&\; a, b \in C'A \;.\!\vee.\; u = v \;\&\; s = t \;\&\; [\ldots]$$

[...] $C_n \geq C_{n-1} \ldots \geq C_1$ and $A = C_n + \ldots + C_1$. Further, if $A = C_n + \ldots + C_1$ and $A = D_m + \ldots + D_1$ are two decompositions such that $P > C_n \geq C_{n-1} \ldots \geq C_1$ and $P > D_m \geq D_{m-1} \ldots \geq D_1$ and all the $C_r$ and $D_r$ are principal numbers for addition, then n = m and for all $r \leq n$, $C_r = D_r$. Conversely, if A is expressible as $C_n + \ldots + C_1$ where $C_n \geq C_{n-1} \geq \ldots C_1$ and all the $C_r$ are principal numbers for addition, then there is a principal number, namely, $C_n.\omega > A$.
PROOF by transfinite induction with respect to the partial well-ordering $\leq$. We assume $0 < A < P \in \mathcal{H}(+)$ and take as induction hypothesis: If $0 < B < A$, then B is uniquely expressible as a finite sum of principal numbers $< P$. If A is a principal number for addition, then there is nothing to prove.

1) This theorem was conjectured by A. L. Tritter.
Now suppose A is not a principal number, then there exist B, C such that

$$B + C = A, \text{ where } C \neq 0, A \text{ (and hence } B \neq 0). \tag{4}$$

By corollary IV.4.12, $C < A$. Let $C_1$ be the least C satisfying (4) (i.e. under the ordering by initial segments). We now show that $C_1$ is a principal number for addition. Suppose $C_1 = D + E$, then by corollary IV.4.12, $E < P$ and hence by theorem II.5.4, $C_1$ and E are comparable. But $|E| \leq |C_1|$, hence $E \leq C_1$ and by the minimality of $C_1$, $E = C_1$. Thus $C_1$ is a principal number by theorem IV.4.7.(i). Now let $B_1$ be the least B such that $B + C_1 = A$. Then if $B_1 = 0$ we only have to prove uniqueness, and otherwise by the hypothesis of the induction, $B_1$ has a (unique) decomposition $B_1 = C_n + \ldots + C_2$ where $P > C_n \geq \ldots \geq C_2$ and all the $C_r$ (r = 2, ..., n) are principal numbers. Hence $A = C_n + \ldots + C_1$ and, since $C_1 < P$, all the $C_r$ (r = 1, ..., n) are comparable. Suppose $C_2 < C_1$, then by the definition of a principal number for addition, $C_2 + C_1 = C_1$, hence

$$A = (C_n + \ldots + C_2) + C_1 = (C_n + \ldots + C_3) + C_1.$$

Now $C_2 \neq 0 \rightarrow C_n + \ldots + C_3 < C_n + \ldots + C_2 = B_1$. But $B_1$ was chosen as the least B such that $B + C_1 = A$. We therefore cannot have $C_2 < C_1$ and must have $C_2 \geq C_1$. Thus $A = C_n + \ldots + C_1$ is a decomposition of the required type.

As regards uniqueness, let $A = C_n + \ldots + C_1$ and $A = D_m + \ldots + D_1$ be two decompositions of A as a sum of non-increasing principal numbers. By theorem II.5.4, $C_n$ and $D_m$ are comparable. Suppose $C_n > D_m$, then $D_m + C_n = C_n$ since $C_n$ is a principal number for addition. Therefore $A = D_m + C_n + \ldots + C_1$ and by substituting $D_m + C_n$ for $C_n$ m times more we obtain $A = D_m.(m+1) + A$ which implies $D_m.(m+1) < A$. Now if $i \leq m$, then $D_i + D_m = D_m$ or $D_m.2$ according as $D_i < D_m$ or $D_i = D_m$. Therefore $D_m.(m+1) + A = A < A + D_m.m \leq D_m.2m$ and hence, by corollary III.2.11, $A \leq D_m.m$. This contradicts $D_m.(m+1) < A$ and we therefore cannot have $D_m < C_n$. Similarly, $C_n \nless D_m$ and we conclude $C_n = D_m$. Now by corollary III.2.11 it follows that

$$C_{n-1} + \ldots + C_1 = D_{m-1} + \ldots + D_1.$$
Repeating this argument the minimum of m and n times and letting s be this minimum, we obtain $C_{n-r} = D_{m-r}$ (r = 0, ..., s) and hence either $C_t + \ldots + C_1 = 0$ or $D_t + \ldots + D_1 = 0$ where $t = |m - n|$. By theorem II.2.3.(ii) it follows that t = 0 and hence that n = m and $C_r = D_r$ for every r. Conversely, if $A = C_n + \ldots + C_1$, then as for $D_m$ above, $A + C_n.n \leq C_n.2n$ and hence by theorems IV.4.10 and II.4.1.(vii) it easily follows that $A < C_n.\omega$ which is a principal number for addition. This completes the proof.
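The absorption step $C_2 + C_1 = C_1$ for $C_2 < C_1$ used in the proof has a familiar classical counterpart for ordinals below $\omega^\omega$. The following sketch illustrates only that classical analogue and makes no claim about co-ordinals, where comparability has to be established separately. It assumes an ordinal $\omega^{k_n} + \ldots + \omega^{k_1}$ is represented by the non-increasing list of exponents `[k_n, ..., k_1]`; the name `add` is illustrative.

```python
def add(a, b):
    """Classical sum of two ordinals below omega**omega, each given as a
    non-increasing list of exponents of its principal summands."""
    if not b:
        return list(a)
    lead = b[0]
    # Summands of `a` smaller than the leading summand of `b` are absorbed.
    kept = [k for k in a if k >= lead]
    return kept + list(b)

omega = [1]          # omega**1
one = [0]            # omega**0
omega_sq = [2]       # omega**2
```

For example, `add(one, omega)` gives `[1]`, reflecting $1 + \omega = \omega$, while `add(omega, one)` gives `[1, 0]`, so the sketch also shows that addition is not commutative.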
This theorem is not an immediate corollary of theorem 2, p. 280 in [14] for the following reasons: (i) it may be the case that $|P|$ is a classical principal number while P is not a principal number, e.g. V, (ii) P may be a principal number but $|P|$ may not be a classical principal number (see § VIII.1) and (iii) comparability conditions have to be established.

IV.5. By theorem IV.4.1 above there are c co-ordinals corresponding to each limit number (and these co-ordinals are therefore incomparable with each other) but there are some limit number co-ordinals which have no predecessors of some smaller ordinal. More formally: Let A, B, C range over co-ordinals, $\Gamma$, $\Theta$, $\Lambda$ over classical ordinals, then

$$(\exists A)(\exists B)(\exists \Theta)\,\bigl(|A| = \Gamma \;\&\; |B| = \Lambda \;\&\; A < B \;\&\; \Gamma < \Theta < \Lambda \;\&\; (\forall C)(|C| \neq \Theta \vee C \nless B \vee A \nless C)\bigr).$$
This is shown by example IV.5.1 below. If, however, we restrict ourselves to recursive co-ordinals then this situation does not arise. We hope to present the results for recursive co-ordinals in [4].

Example IV.5.1. $\rho$ is as given in § IV.4. Let T be the well-ordering of type $\omega.2$ defined by

$$\langle x, y\rangle \in T \leftrightarrow x \in \rho \;\&\; y \in \bar{\rho} \;.\!\vee.\; x, y \in \rho \;\&\; x \leq y \;.\!\vee.\; x, y \in \bar{\rho} \;\&\; x \leq y.$$

Let $T = \mathrm{CRT}(T)$. Suppose $T = V + V'$ where $|V| = |V'| = \omega$, then by the Separation Lemma (II.5.1) there exist relations U, U' such that $U \mathrel{)(} U'$ and $T = U + U'$. Hence $C'U$ and $C'U'$ are contained in disjoint r.e. sets $\alpha$, $\beta$. But this implies $\alpha = \rho \;\&\; \beta = \bar{\rho}$ and that $\rho$ is recursive, contrary to the choice of $\rho$.

We observe that if the condition on $\Theta$ above is satisfied for some successor number $\Theta_1$ then by theorem IV.3.6 it is satisfied for some limit number $\Theta_2$. On the other hand we do have c co-ordinals which have predecessors representing all ordinals less than that of the given co-ordinal. This is the content of the representation theorem below. We shall use the following classical theorems in proving the representation theorem.
THEOREM IV.5.2: ([14], p. 379, theorem 1.) Every denumerable ordinal which is a limit number is the limit of a strictly increasing sequence, of type $\omega$, of ordinals less than the given number.

THEOREM IV.5.3: ([14], p. 264, corollary 3.) If A and B are isotonic well-orderings then there is an isotonism f such that every isotonism between A and B is an extension of f.

THEOREM IV.5.4: (REPRESENTATION THEOREM.) Let $\Gamma$, $\Delta$ range over (denumerable) classical ordinals, C, D over co-ordinals, then

$$(\forall \Gamma)(\exists C)\,\bigl(|C| = \Gamma \;\&\; (\forall \Delta)(\Delta < \Gamma \rightarrow (E!D)(|D| = \Delta \;\&\; D < C))\bigr).$$
PROOF BY TRANSFINITE INDUCTION. The assertion is trivial if $\Gamma = 0$. We assume the assertion holds for all ordinals less than $\Gamma$. If $\Gamma = \Theta + 1$, then by the hypothesis of the induction there is a co-ordinal T such that

$$|T| = \Theta \;\&\; (\forall \Delta)(\Delta < \Theta \rightarrow (E!D)(|D| = \Delta \;\&\; D < T)).$$

[...] $D \leq T$. Let $C = T + 1$, then by corollary IV.2.7, it easily follows that C has the required properties. If $\Gamma$ is a limit number, then by theorem IV.5.2, $\Gamma$ is the limit of a strictly increasing sequence $\{\Phi_i\}_{i < \omega}$ of ordinals. We may assume $\Phi_0 \neq 0$. Put $\Pi_0 = \Phi_0$, $\Pi_{i+1} = \Phi_{i+1} - \Phi_i$ (by [14], p. 275, $\Pi_i$ is well-defined). Then [...]

By the hypothesis of the induction, for each i there is a $P_i$ such that
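$$|P_i| = \Pi_i \;\&\; (\forall \Lambda)(\Lambda < \Pi_i \rightarrow (E!D)(|D| = \Lambda \;\&\; D < P_i)). \tag{5}$$

Using the axiom of choice, choose a fixed $P_i$ in $P_i$ (such that $0 \in C'P_i$ for each i 1)). Now define

$$C = \{\langle j(p, m), j(q, n)\rangle : p \in C'P_m \;\&\; q \in C'P_n \;\&\; m < n \;.\!\vee.\; m = n \;\&\; \langle p, q\rangle \in P_m\}$$

and $C = \mathrm{CRT}(C)$. Clearly, $|C| = \sum$ [...]. Now suppose $\Lambda$ [...] $n\}$. Then $P^{(n)} \mathrel{)(} P_{(n)}$ since if $x \in C'P^{(n)} \cup C'P_{(n)}$ ($= C'C$), then

$$x \in C'P^{(n)} \leftrightarrow l(x) \leq n \;.\&.\; x \in C'P_{(n)} \leftrightarrow l(x) > n.$$

Hence, if p is the partial recursive function $\mathrm{sg}(l(x) \dot{-} n)$, then p satisfies

1) We shall use this auxiliary condition in the proof of corollary IV.5.5.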
the conditions in theorem II .1. 5. Hence p(n) + p 1, then A.In = A.{Il.n) = (A.I1).n = A.n by the first part of the proof and corollary VI. 1.5. Hence, for all n, A . In = A. n. (ii) A. W = A.{Il'OJ) = (A.I1).OJ = A.OJ.
VI.2. By analogy to principal numbers for addition, we now introduce principal numbers for multiplication (v. [1], p. 66).

DEFINITION VI.2.1: A co-ordinal A is said to be a principal number for multiplication if $A \neq 0, 1$ and 0 [...] 1, then $A < AB$ whenever $A \neq 0$.

PROOF. We prove only (ii) leaving (i) to the reader. (ii) $B > 1 \rightarrow (E!C)(B = 1 + C \;\&\; C \neq 0)$. Hence $AB = A(1 + C) = A + AC$ where $AC \neq 0$ if $A \neq 0$. Thus $A < AB$.

THEOREM VI.2.4: If $A \neq 0$ and A, B, C are co-ordinals, then $AB = AC \rightarrow B = C$.
PROOF. Let $A \in A$, $B \in B$ and $C \in C$ and suppose $p : AB \simeq AC$. Then $AB \sim AC$ and since AB and AC are well-orderings, it follows that p is an extension of the unique minimal isotonism, $p_c$, between AB and AC (theorem IV.5.3). Now, classically, $\Theta \neq 0 \;\&\; \Theta\Gamma = \Theta\Delta \rightarrow \Gamma = \Delta$. Therefore there is an isotonism $q_c$ (not necessarily partial recursive) such that $q_c : B \sim C$. Now the map $r_c : j(a, b) \rightarrow j(a, q_c(b))$ defined only on $C'AB$ is an isotonism between AB and AC. Hence by theorem IV.5.3 p is an extension of $r_c$. Since $A \neq 0$, there is an element, say $a_0$, in $C'A$. Let p' be the map p with domain and range restricted to $\{j(a_0, n) : n \in J\}$, then p' is partial recursive. Further, if $p'(j(a_0, x))$ is defined then its value is $j(a_0, y)$ for some y. Now let q' be the map $x \rightarrow l(p'(j(a_0, x)))$, then clearly q' is partial recursive and q' agrees with $q_c$ on $C'B$ (again by theorem IV.5.3). q' is one-one, since

$$q'(x) = q'(y) \rightarrow l(p'(j(a_0, x))) = l(p'(j(a_0, y)))$$
$$\rightarrow p'(j(a_0, x)) = j(a_0, c) \;\&\; p'(j(a_0, y)) = j(a_0, c)$$
(since $\rho p' \subseteq \{j(a_0, n) : n \in J\}$ by construction)
$$\rightarrow j(a_0, x) = j(a_0, y)$$
(since p is one-one)
$$\rightarrow x = y.$$
Thus q' is partial recursive, agrees with $q_c$ on $C'B$ and is one-one and order-preserving, i.e. $q' : B \simeq C$, from which the theorem follows.

LEMMA VI.2.5: If M is a principal number for multiplication, and $B, C \neq 0$, then $BC < M \leftrightarrow B < M \;\&\; C < M$.

PROOF. Suppose $BC < M$, then $B < M$ or $C = 1$ by theorem VI.2.3.(ii). In the former case $BM = M$ and in the latter trivially, $C < M$. Now $BC < M \rightarrow BCM = M$, and therefore, by theorem VI.2.4, $CM = M$. Using theorem VI.2.3.(ii) it follows that $C < M$. Conversely, $C < M \rightarrow CM = M$ and $B < M \rightarrow BM = M$. Hence $(BC)M = B(CM) = BM = M$ and by theorem VI.2.3.(ii), $BC < M$.

VI.3

THEOREM VI.3.1: (i) If $A \neq 0$, then $B < C \rightarrow AB < AC$, (ii) $B \leq C \rightarrow AB \leq AC$.
PROOF. (i) $B < C \rightarrow (E!D)(B + D = C \;\&\; D \neq 0)$. By theorem VI.1.7, $AC = A(B + D) = AB + AD$. $AD \neq 0$ by theorem VI.1.6, hence $AB < AC$. (ii) follows at once from (i).

THEOREM VI.3.2: There exist co-ordinals A, B, C ($\neq 0$) such that $A < B$ but $AC \nleq BC$.
PROOF. Let A = 1, B = V and C = W, then AC = W and BC = VW. By theorem VI.2.3.(i), $V \leq VW$. Hence if $W \leq VW$, W and V are comparable by theorem II.5.4 which contradicts corollary IV.4.5.

THEOREM VI.3.3: If there is a principal number for multiplication such that $B, C < M$ (or equivalently $BC < M$) then $A < B \rightarrow AC \leq BC$.
PROOF. If B or C = 0 there is nothing to prove. Similarly if A = 0. Otherwise, by lemma VI.2.5, $AC < M$ and $BC < M$. Hence, by theorem II.5.4, AC and BC are comparable. Now, classically, $\Phi < \Psi \rightarrow \Phi\Gamma \leq \Psi\Gamma$, hence $AC \leq BC$.

THEOREM VI.3.4: If A, B, C are co-ordinals, then $AC < BC \rightarrow A < B$.
PROOF. If C = 0, then the assertion is trivial. If $C \neq 0$, then by theorem VI.2.3.(i), $A \leq AC$ and $B \leq BC$. Hence by the transitivity of $\leq$ and theorem II.5.4, A and B are comparable. By the classical theorem $\Phi\Gamma < \Psi\Gamma \rightarrow \Phi < \Psi$, we have $|A| < |B|$ and hence $A < B$.

THEOREM VI.3.5: There exist co-ordinals A, B, C such that $AC \leq BC$ but $A \nleq B$.
PROOF. (As in the classical case.) Let A = 2, B = 1, C = W.

THEOREM VI.3.6: If B, C are comparable, then $AB < AC \rightarrow B < C$.
and e B & »; i= b;u) v (b ru = b;u & <aru' a;u> s A)]}.
Now
ABC = {<e(D), e(E): D = (j(b o, co) .. , j(b m Cn)) s E(A, BC) ao an & E
= (j(b~: c~) ao
: &: D
j(b~,: C~,)) e E(A, BC) an'
= 0 . v. D i= 0 & [en s n'
(Vr) (r
s
n
--+ j(b"
&
c.) = j(b;, c;) & a, = a;))
v (3r) ("Is) {(s < r --+ j(b., cs) = j(b~, c~) & as = a~) & «j(b" cr),j(b;, c;» s BC &j(b" c.) i= j(b;, c;) . v. j(b" c.) = j(b;, c;) & 1, is said to be a principal number for exponentiation if
$$1 \leq B < A \rightarrow B^A = A.$$

We write $\mathcal{H}(\exp)$ for the collection of all principal numbers for exponentiation.
THEOREM VII.4.2: All principal numbers for exponentiation are infinite co-ordinals whose classical ordinals are limit numbers.
PROOF. Left to the reader (cf. theorem IV.4.9).

The condition in definition VII.4.1 is stronger than the condition: $1 \leq B, C < A \rightarrow B^C < A$. This will be shown later in a manner analogous to that referred to in § VI.2 by proving that if $2^A = A$, then W divides A.

THEOREM VII.4.3: W is a principal number for exponentiation.
PROOF. It suffices to prove that, if $N \in n$, then $W \simeq N^W$. Let $N = \{\langle x, y\rangle : 0 \leq x \leq y < n\}$, then clearly $N \in n$. If $s \in J$, then s is expressible in the form

$$s = n^r a_r + \ldots + a_0$$

where for all i, $0 \leq a_i < n$. Let f be the (partial) recursive function defined by

$$f(s) = e\begin{pmatrix} r & r-1 & \ldots & 0 \\ a_r & a_{r-1} & \ldots & a_0 \end{pmatrix}$$

where columns with bottom entry 0 have been omitted. E.g.

$$f(n^2 . 3 + n . 0 + 2) = e\begin{pmatrix} 2 & 0 \\ 3 & 2 \end{pmatrix}.$$

Then, if $u, v \in J$ and $u = n^r a_r + \ldots + a_0$ and $v = n^r b_r + \ldots + b_0$, where $a_i$ and $b_i$ may be zero,

[...] (1)

(We remark that the fact that $a_r, a_{r-1}, \ldots$ and $b_r, b_{r-1}, \ldots$ may be zero does not affect the ordering.) But the ordering given by (1) is precisely the ordering in $N^W$ of the bracket symbols

$$\begin{pmatrix} r & \ldots & 0 \\ a_r & \ldots & a_0 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} r & \ldots & 0 \\ b_r & \ldots & b_0 \end{pmatrix}$$

where columns with bottom row zero have been omitted. Clearly, f is one-one. Hence $f : W \simeq N^W$ and the theorem is proved.
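The idea behind this proof, that the natural ordering of the integers agrees with comparing base-n digit columns from the highest exponent downwards, can be checked numerically. The following is a sketch under the assumption that a bracket symbol is modelled as a list of (exponent, digit) columns; the names `digits`, `bracket_symbol` and `precedes` are illustrative, and the coding function e of the text is not reproduced.

```python
def digits(s, n):
    """Base-n digits of s, least significant first: [a_0, a_1, ...]."""
    out = []
    while s > 0:
        s, a = divmod(s, n)
        out.append(a)
    return out

def bracket_symbol(s, n):
    """Columns (exponent, digit) of s in base n, highest exponent first,
    with columns whose digit is 0 omitted."""
    cols = list(enumerate(digits(s, n)))
    return [(i, a) for i, a in reversed(cols) if a != 0]

def precedes(u, v, n):
    """Strict comparison by the highest exponent at which the
    zero-padded digit strings of u and v differ."""
    du, dv = digits(u, n), digits(v, n)
    width = max(len(du), len(dv))
    du += [0] * (width - len(du))
    dv += [0] * (width - len(dv))
    for i in reversed(range(width)):
        if du[i] != dv[i]:
            return du[i] < dv[i]
    return False
```

With n = 5 the worked example of the text, $n^2 . 3 + n . 0 + 2$, is the number 77, and `bracket_symbol(77, 5)` yields the two columns (2, 3) and (0, 2).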
COROLLARY VII.4.4: $2^W = W$.

THEOREM VII.4.5: If $A > 1$, then $A^B = A^C \rightarrow B = C$.

PROOF. 1) Let $A \in A$, $B \in B$ and $C \in C$, and suppose $p : A^B \simeq A^C$. Then $A^B \sim A^C$ and since $A^B$ and $A^C$ are well-orderings, it follows that p is an extension of the unique minimal isotonism, $p_c$, between $A^B$ and $A^C$. Now, classically, $\Theta > 1 \;\&\; \Theta^\Gamma = \Theta^\Delta \rightarrow \Gamma = \Delta$. Therefore there is an isotonism $q_c$ (not necessarily partial recursive) such that $q_c : B \sim C$. Now the map $r_c$ [...] defined only on $E(A, B)$ is an isotonism between $A^B$ and $A^C$. Hence by theorem IV.5.3, p is an extension of $r_c$. Since $A > 1$, there is a non-minimum element, say $a^0$, in $C'A$. Let p' be the map p with domain and range restricted to [...], then p' is partial recursive. Further, if $p'\bigl(e\binom{x}{a^0}\bigr)$ is defined then its value is $e\binom{y}{a^0}$ for some y. Now let q' be the map [...], then clearly q' is partial recursive 2) and agrees with $q_c$ on $C'B$ (again by theorem IV.5.3). q' is one-one, since

1) We are here using a similar extension procedure to that used in the proof of theorem VI.2.4.
2) $(x)_0$ = exponent of ($p_0$ =) 2 in the prime factorization of x.
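$$q'(x) = q'(y) \rightarrow p'\Bigl(e\tbinom{x}{a^0}\Bigr) = 2^{\,j(u,\, q'(x)) + d} \cdot 3^{x_1} \cdots p_n^{x_n} \;\&\; p'\Bigl(e\tbinom{y}{a^0}\Bigr) = 2^{\,j(u',\, q'(y)) + d'} \cdot 3^{y_1} \cdots p_m^{y_m}$$

for some $u, u', d, d', n, m, x_1, \ldots, x_n, y_1, \ldots, y_m$ where $d, d' = 0$ or 1. But by the definition of p', any image of p' is of the form $2^{\,j(a^0, b) + 1}$ and hence $d = d' = 1$, $n = m = 0$, $u = u' = a^0$ and

$$p'\Bigl(e\tbinom{x}{a^0}\Bigr) = 2^{\,j(a^0,\, q'(x)) + 1}$$

and

$$p'\Bigl(e\tbinom{y}{a^0}\Bigr) = 2^{\,j(a^0,\, q'(y)) + 1}.$$

Therefore

$$p'\Bigl(e\tbinom{x}{a^0}\Bigr) = p'\Bigl(e\tbinom{y}{a^0}\Bigr)$$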
from which it follows, since p' and e are one-one, that x = y. Thus we have shown that q' is a recursive isotonism between B and C. This completes the proof.

THEOREM VII.4.6: There exist co-ordinals A, B, C such that $A^C = B^C$ but $A \neq B$.
PROOF (as in the classical case). Let A = 2, B = 3, C = W. Then by theorem VII.4.3, $2^W = 3^W$.

THEOREM VII.4.7: $C > 1 \;\&\; A < B \rightarrow C^A < C^B$.
PROOF. $A < B \rightarrow (E!D)(D \neq 0 \;\&\; A + D = B)$. Hence by theorem VII.3.1, $C^B = C^{A+D} = C^A . C^D$. Now $C^D \neq 0$ since $C \neq 0$, hence $(\exists E)(C^D = 1 + E)$. Hence $C^B = C^A(1 + E) = C^A + C^A . E$ by theorem VI.1.6 and $C^A \leq C^B$. But

$$C^A = C^B \rightarrow C^A . E = 0 \rightarrow E = 0 \rightarrow C^D = 1 \rightarrow D = 0$$

which is a contradiction. This completes the proof.
LEMMA VII.4.8: (i) If $A, C > 1$, then $A < A^C$. (ii) If $C > 0$, then $A \leq A^C$.
PROOF. (i) Since $C > 1$, there is a $D \neq 0$ such that $1 + D = C$. Therefore $A^C = A^{1+D} = A . A^D$ by theorem VII.3.1. Now $|A^D| > 1$, by classical arguments, hence there is an $E \neq 0$ such that $A^D = 1 + E$. Hence $A^C = A(1 + E) = A + AE$ where $AE \neq 0$, i.e. $A < A^C$. (ii) follows at once.

THEOREM VII.4.9: There exist co-ordinals A, B, C such that $A < B$ but $A^C \nleq B^C$.
PROOF. Let A = 2, B = V and C = W, then by theorem VII.4.3, $A^C = W$ and by lemma VII.4.8, $V < V^W = B^C$. Now if $A^C \leq B^C$, then by theorem II.5.4 and the transitivity of $\leq$, V and W are comparable, which contradicts the construction of these co-ordinals.

Thus we see that the analogue of one of the classical laws for exponentiation breaks down in a very similar way to one of the multiplicative laws (theorem VI.3.2). We have, however, theorem VII.4.11 which is analogous to theorem VI.3.3.

LEMMA VII.4.10: If E is a principal number for exponentiation, then $A, B < E \rightarrow A^B < E$ and conversely if $A, B > 1$.
PROOF. The assertion is trivial if $A, B \leq 1$. Otherwise, if E is a principal number for exponentiation, then $A < E \rightarrow A^E = E$ and similarly for B. Hence $A^{(B^E)} = E$. Now $B < E$ and therefore there is a $C \neq 0$ such that $B + C = E$. Therefore $E = A^{(B^E)} = A^{(B+C)} = A^B . A^C$. But $A^C > 1$, since $C \neq 0$; hence $A^C = 1 + D$ for some $D \neq 0$. It follows that $E = A^B(1 + D) = A^B + A^B D$ where $A^B D \neq 0$, i.e. $A^B < E$. Conversely, suppose $A, B > 1$ and $A^B < E$. Then by lemma VII.4.8.(i), $A < E$. Since E is a principal number for exponentiation, $E = A^E = (A^B)^E = A^{BE}$. By theorem VII.4.5 it follows that $BE = E$ and hence by theorem VI.2.3.(ii) $B < E$.

THEOREM VII.4.11: If there is a principal number for exponentiation, E, such that $B, C < E$ (or equivalently $B^C < E$ or $B, C \leq 1$) then $A < B \rightarrow A^C \leq B^C$.
PROOF. By the transitivity of $\leq$ and lemma VII.4.10, $A^C < E$ and
$B^C < E$. Hence by theorem II.5.4, $A^C$ and $B^C$ are comparable. Now, classically, $\Gamma < \Delta \rightarrow \Gamma^\Phi \leq \Delta^\Phi$, hence $A^C < B^C \rightarrow A < B$.

THEOREM VII.4.12: If A, B, C are co-ordinals, $A^C < B^C \rightarrow A < B$.
PROOF. If C = 0 then there is nothing to prove. Otherwise, by lemma VII.4.8, $A \leq A^C$ and $B \leq B^C$ and therefore, by theorem II.5.4 and the transitivity of $\leq$, A and B are comparable. Hence by the classical theorem $\Phi^\Gamma < \Psi^\Gamma \rightarrow \Phi < \Psi$, we have $|A| < |B|$ and hence $A < B$.

THEOREM VII.4.13: There exist co-ordinals A, B, C such that $1 < A^C \leq B^C$ but $A \nleq B$.
PROOF (as in the classical case). Let A = 3, B = 2 and C = W, then by theorem VII.4.3 (proof), $A^C = B^C = W$.

THEOREM VII.4.14: If B, C are comparable and $A > 1$, then $A^B < A^C \rightarrow B < C$.
PROOF. By theorem VII.4.7.

THEOREM VII.4.15: If there is a principal number for exponentiation, E, such that $A^C < E$, then $1 < A^B < A^C \rightarrow B < C$.
PROOF. $1 < A^B < A^C$ implies A, B, C are all $\geq 1$. By lemma VII.4.10, if $A^C < E$, then $A, C < E$ and $B^C < E \rightarrow B, C < E$. Hence by theorem II.5.4 and the transitivity of $\leq$, B and C are comparable. Hence by theorem VII.4.14, $B < C$.
VIII. Natural well-orderings up to $\omega^{\omega^\omega}$

VIII.1. We showed in § IV that the finite co-ordinals are unique but that for each infinite classical ordinal $\Gamma$ there exist c mutually incomparable co-ordinals of classical ordinal $\Gamma$. We now go on to give criteria for collections of co-ordinals which contain precisely one representative for each member of a given collection of classical ordinals. Using these we can give simple criteria for recursive well-orderings to be natural well-orderings, in the sense that if two recursive well-orderings are of the same classical ordinal, then they are recursively isomorphic provided
they are of not too large an ordinal and they are both natural well-orderings. By theorem I.4.4 it is sufficient to describe co-ordinals which contain such natural well-orderings. In this section and the next we work in a slightly more general context: we do not assume that all our well-orderings are recursive, though it will turn out that they are. In [4] we shall extend our results much further as announced in [21].

DEFINITION VIII.1.1: (i) If $\mathcal{A}$ is a collection of co-ordinals, then $\mathcal{A}$ is said to be $\Gamma$-unique if
IA I = IB I 0, some yep. Hence f-n(x) is defined and t (X; so x t (xo. 2) Since f maps P u (X onto (x, x s (X implies either (' n, then since j is one-one and order-preserving, (jm - "(x), y) s B+ A. But yep and r - "(x) e oe which contradicts B) (A. Hence In ~ n. If m = n, then (x, y) s B+ A where x, yep. We conclude (x, y) e B. This completes the proof of 3).
r:
4) Since A is a quasi-well-ordering and $A_0 \subseteq A$, $A_0$ is a quasi-well-ordering. Now f maps $\alpha_0 = C'A_0$ onto $\alpha_0$ since $x \in \alpha_0 \rightarrow f^{-1}(x) \in \alpha_0 \;\&\; f(x) \in \alpha_0$ which implies $\alpha_0 \subseteq f(\alpha_0) \subseteq \alpha_0$. But f is order-preserving, hence by theorem III.1.6, f = 1 on $\alpha_0$.
5) $x \in \beta_\omega \rightarrow x = f^n(y)$ for some $n > 0$, some $y \in \beta$. Since f is one-one, $x = f(x)$ implies $f^{-n}(x) = f^{-n+1}(x)$. But $f^{-n}(x) \in \beta$ and $f^{-n+1}(x) \in \alpha$ and $\beta \cap \alpha = \emptyset$ since $B \mathrel{)(} A$. Therefore $f(x) \neq x$.
6) Since f is partial recursive, $\delta f$ is r.e. If $x \in \beta_\omega$, then by 5) $f(x) \neq x$.
If XC Ci o, then by 5) f(x) = x. Hence Cio, {J(O are contained in the disjoint r.e. sets {x: x C fJf&f(x) ¥- x} and {x: x s fJf&f(x) = x}. Hence by theorem 1I.1.4.(i) B(O)( A o . 7) By 6), B(O + Ao is well-defined. By 2), C(B(O + Ao) = Ci. By definition B(O ~ A and A o ~ A. It therefore suffices to prove that {JwXCio ~ A and A ~ Bw + A o . If x e {Jw and y s Cio then(3n) (f-n(x) c {J) but ("In) (f-n(y) c «). Hence <J-n(x), rn(y) c {J x rx ~ B + A, for some n, and since f is orderpreserving, <x, y) e A. If (x, y) e A then either (i) x, y e {Jw or (ii) x e {Jw, y s Cio or (iii) x, y c Cio or (iv) x s Cio, Y e {Jw by 2). Hence in order to complete the proof of 7) we only need to show (iv) is impossible. If (iv) holds, then there is an n such that f-n(x) e Cio and f-n(y) c f3 which is impossible since f is order-preserving and (Ci x {J) n (B + A) = 0. We now complete the proof of the lemma. By 3) Bw e B .w. Let C = CRT(A o), then by 7), B.w+C = A and hence B.w ::;; A. LEMMA VIII. 1.5: A co-ordinal A is a principal number for addition if,
and only
if; B
0 and W n is a principal number for addition, then by lemma VIIL1. 7, W n + 1 = W n • w is a principal number for addition. Hence the lemma is proved by induction. PROOF OF THEOREM VIn. 1 .3 (FIRST VERSION). By lemmata VIII. 1.6 and VIII .1. 8 a co-ordinal A of classical ordinal < W W is a principal number for addition if, and only if, it is of the form Wn • Hence £( +) is wW-unique. Now let V, V' be two incomparable upper bounds for {W n : n e Y} constructed as in corollary V. 2.3. Then 1V I = I V' I = co", Now A < V --. A < W n < V for some n, and similarly for V'. But A < W n--. A+W n = W n and therefore A+V = V and A+V' = V', i.e. V and V' are principal numbers for addition. Thus £( +) is strictly wW-unique. LEMMA VIII.l.9: (i) W m < W n if m < n, (ii) Ifn ~ 1,1+ W n = W n ,
(iii)
If m
LIT'
--+
r > I",
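PROOF. Immediate from theorem 2, p. 292 in [14].
THEOREM VIII.2.2: If $\mathbf{A}$ is a co-ordinal, then $\mathbf{B}\mathbf{A} = \mathbf{A} \leftrightarrow \mathbf{B}^{\mathbf{W}}$ divides $\mathbf{A}$, i.e. $\leftrightarrow (\exists \mathbf{C})(\mathbf{A} = \mathbf{B}^{\mathbf{W}}\mathbf{C})$.
PROOF. $\mathbf{B}^{\mathbf{W}}\mathbf{C} = \mathbf{A} \to \mathbf{B}\mathbf{A} = \mathbf{B}^{1+\mathbf{W}}\mathbf{C} = \mathbf{B}^{\mathbf{W}}\mathbf{C} = \mathbf{A}$ by theorem IV.4.4.(i).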
Conversely, suppose BA = A. We may assume that A > 1, since otherwise there is nothing to prove. By hypothesis there exist well-orderings A, B and a recursive isotonism f such that A e A, B e Band f: A
Let
IX
= C' A,
/3
~
BA.
= C' B. We also write
"a n(x) then (lft(x) = (If)n(x)(x).
Let C = A[{x: Lf(x) = x} and let D = O(Bw ) where g is the partial recursive function, defined only on pe, which maps only bracket symbol images ofthe form e (no n l bno bn ,
...
•••
ns) where n i , bn , I> of and no > n l > n2 > ... > ns bn,
~0
onto
n l+ln l nl-I min(B) bn , min(B)
no no-lno-2 ( e bno min(B) min(B)
n, ns-I bn, min(B)
0 ) min(B) .
I.e. $g(x)$ inserts the missing positive integers in the top row of $e^{-1}(x)$ and, in the columns where an integer was missing, inserts $\min(B)$ in the bottom row and takes the image under $e$ of the resulting bracket symbol. It is clear that $g$ is one-one, so $D$ is well-defined. We shall now show that $A \simeq D \cdot C$, from which it follows at once that $\mathbf{A} = \mathbf{B}^{\mathbf{W}}\mathbf{C}$ where $\mathbf{C} = \mathrm{COT}(C)$. Let
.((n(X)-1
hex) = J e kf(lf)"(X) -
...
1 ••.
i
0)
., .
kf(lfi(x) ... kf(x) , (if)
n(x») (x)
.
Clearly, $h$ is partial recursive. Suppose $h(x) = h(y)$, then $kh(x) = kh(y)$ and $lh(x) = lh(y)$. Hence
$(lf)^{n(x)}(x) = (lf)^{n(y)}(y)$. (1)
Now, since $e$ is one-one we have $n(x) = n(y)$ and hence, for $0 \le r < n(x)$,
$kf(lf)^{r}(x) = kf(lf)^{r}(y)$. (2)
Putting $r = n(x) - 1$ in (2) and using (1) we have
$f(lf)^{n(x)-1}(x) = f(lf)^{n(x)-1}(y)$.
But $f$ is one-one, hence
$(lf)^{n(x)-1}(x) = (lf)^{n(x)-1}(y)$.
Now assume
$(lf)^{s}(x) = (lf)^{s}(y)$ (3)
where $s < n(x) = n(y)$. Then by (2) with $r = s - 1$
$(kf)(lf)^{s-1}(x) = (kf)(lf)^{s-1}(y)$
and using (3)
$f(lf)^{s-1}(x) = f(lf)^{s-1}(y)$.
By the one-one property of $f$, $(lf)^{s-1}(x) = (lf)^{s-1}(y)$
and by induction it follows that x = y. I.e. h is one-one. It is clear that h maps C' A onto C' D. C and it only remains to prove that h is order-preserving. Suppose a P2(W) > ... > peCW) at #- 0 and Pt(W) > W, then
A < Wwn and A+ Wwn = wwn if apt < n. Conversely, if A < wwn for some n, then A is expressible in the form (6) where apt < n. PROOF. We prove the two parts simultaneously. Suppose 0 < A < W then I A I has Cantor normal form (cf. e.g. [14], p. 320)
all. at + ... + roT•. a e + q(ro).
where T i > T 2 > ... > T c-
w:
(7)
CONSTRUCTIVE ORDER TYPES,
261 wO' For each i, T, is a polynomial in ro, since otherwise roT; ~ ro which contradicts A < Wwn. Now to every ordinal of the form (7) there corresponds naturally and in a bi-unique way a co-ordinal of the form (6) (i.e. under the mapping p(ro) -+ p(W)). In order to prove the lemma it therefore suffices by virtue of corollary IV. 2 .7 to prove that A + W W" = W w" where n > 0Pl = degree of the polynomial I', (in co), Suppose oq = m - 1, then A+ Wwn = WPl(W).al +
=
WPl(W).al +
I
+ WPe(W).ae+q(W)+ Wwn +q(W)+(Wm+ Wwn)
by lemma VIII. 2 . 8 = WPl(W).al + ...
= Now by
e
i
L ~
1
WPl(W).al + ...
+ WPe(W).a e+ W m+ Wwn + WPe(W).a e+ Wwn =
by lemma VIII. 2 .8 C, say.
a, applications oflemma VIII.2.8 we have C = Wwn.
LEMMA VIII. 2. 10: (i) (3n) (A < WW'') +4 A < Www.
+4
A < WWv,
(ii) (3n) (A < WW)
PROOF. (i) Let V= n+U, then I U 1= co, By lemma VII.4.8, W ~ W U , hence by lemma VIII. 1. 4, 1 + W U = W U • Now Wwn. WWV = W exp (W n+ W v ) = W exp (W n+ W n +u) = W exp (W n • {1 + W u } ) = W exp (W n . W u) = W exp (W n + U ) = WWV. Hence by lemma VIIA.S, Wwn < Wwv. wn Conversely, suppose A < Wwv, then I A 1< ro for some n. But by lemma VIII. 2.9 there is a co-ordinal A' of the form (6) such that I A' I = I A I and A' < Wwn. Hence by corollary IV.2.7, A = A' and A < WW". (ii) follows at once by substituting 'w' for 'V'. (In this case, U = W.) THEOREM VIII. 2. 11; The collection £(.) of all principal numbers for wO' multiplication is strictly ro -unique. PROOF. By lemmata VIII. 2.4 and VIII. 2.9 every principal number for wn wO' multiplication of classical ordinal < ro is of the form W and con-
JOHN N. CROSSLEY n
versely, alI the co-ordinals Ww are principal numbers for multiplication. Hence£{.) is co",W-unique. w V Ww and WW are principal numbers for multiplication, since by the w w v V n. n. proof of lemma VIII. 2.10, Ww Ww = Ww and Ww Ww = WW • ww wv wn Further, A < W or W implies A < W for some n; hence, since w n ww all the Ww are principal numbers for multiplication, A. Ww = W v WV and A. Ww = W • wv But WW,W = W implies, by theorem VII.4. 5 (twice), W = V which is a contradiction. Therefore£"(.) is strictly co"'w -unique, THEOREM VIII .2.12: £' (exp) c £' (.) c £' (+). PROOF. By theorem VII.4. 2 every principal number for exponentiation is infinite. Suppose Pe£(exp), then by lemma VII.4.10, A < P-+ AA < P and hence (AAy = P = A P• Hence if A > 1, then by theorem VII. 4.5, AP = P and hence P is a principal number for multiplication. Now suppose P E £(.), then W :==::; P by lemma VII. 2.4, hence by lemma VIII. 1.4, I+P = P. Therefore if 0 < A < P, P = AP = A(1+P) = A+AP = A+P. I.e. Pe£(+). W W E £(.) - £(exp) since for every F < co'" there is a co-ordinal C < W W but to" is not a (classical) principal number for exponentiation. W 2 E £( +) - £(.) by similar argument. Hence alI the inclusions are strict. THEOREM VIII. 2. 12 indicates how we might extend our classes of co-ordinals to get uniqueness up to higher ordinals. We shall present results obtained by this approach in [4] and [21]. Appendix At. In many theorems concerning (classical) ordinals use is made of the theorem
If a well-ordered set $\alpha$ is similar to a subset of a well-ordered set $\beta$, then $\alpha$ is similar to an initial segment of $\beta$.
The proof of this theorem requires the axiom of choice. Accordingly, it is not surprising that its analogue fails for C.O.T.s and co-ordinals.
In fact, we have made use of this fact in giving counterexamples to analogues of classical laws like $\mathbf{A} < \mathbf{B} \to \mathbf{A}\mathbf{C} \le \mathbf{B}\mathbf{C}$.
DEFINITION A1.1: $A \preceq B$ if there is a recursive isotonism from $A$ onto a (linearly ordered) sub-relation of $B$, i.e. if $A \simeq A' \subseteq B$. Clearly, if $A_1 \simeq A_2$, $B_1 \simeq B_2$ and $A_1 \preceq B_1$, then $A_2 \preceq B_2$.
DEFINITION A1.2: $\mathbf{A} \preceq \mathbf{B}$ if there exist $A \in \mathbf{A}$ and $B \in \mathbf{B}$ such that $A \preceq B$. We write $\mathbf{A}$ ... $a_i \le b_i$, for all $i$. We define further $a \mid b$
$\leftrightarrow (\exists c)(c \le b \;\&\; b = ac)$
$\leftrightarrow (\forall i)(1 \le i \le n \to (\exists c_i)[(c_i \le b_i) \;\&\; (b_i = a_i c_i)])$.
If $f \mid a$ and $f \mid b$, $f$ is called a common factor of $a$, $b$. If $h \mid a$ and $h \mid b$ and if $k \mid a$, $k \mid b \to k \mid h$, then $h$ is the h.c.f. of $a$, $b$; it is easily shown that $h_i$ is the h.c.f. of $a_i$, $b_i$. If 1 is the h.c.f. of $a$, $b$ then $a$, $b$ are said to be relatively prime. It follows that if $a$, $b$ are relatively prime then all $a_i$, $b_i$ are relatively prime and conversely. If $\phi(x)$ is Euler's function which counts the number of numbers (including 1) which are less than and prime to $x$, then $\phi(x)$ is primitive recursive and so there is a $\Phi$ such that
Since $a_i^{\phi(b_i)} = 1 + m_i b_i$ when $a_i$, $b_i$ are relatively prime, therefore
$a^{\Phi(b)} = 1 + mb$,
i.e.
$a^{\Phi(b)} \equiv 1 \pmod{b}$
when $a$, $b$ are relatively prime.
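The componentwise reading of divisibility and of Euler's theorem can be checked numerically. The following Python sketch assumes that the elements of the arithmetic are tuples of natural numbers with all operations taken coordinate by coordinate; the function names `phi`, `divides`, `relatively_prime` and `euler_holds` are illustrative and do not come from the text.

```python
from math import gcd

def phi(x):
    """Euler's function: how many k in 1..x are prime to x."""
    return sum(1 for k in range(1, x + 1) if gcd(k, x) == 1)

def divides(a, b):
    """a | b componentwise: every b_i is a multiple of a_i."""
    return all((bi % ai == 0) if ai else (bi == 0) for ai, bi in zip(a, b))

def relatively_prime(a, b):
    """a, b are relatively prime when every pair a_i, b_i is."""
    return all(gcd(ai, bi) == 1 for ai, bi in zip(a, b))

def euler_holds(a, b):
    """Check a_i ** phi(b_i) = 1 (mod b_i) in every coordinate."""
    return all(pow(ai, phi(bi), bi) == 1 % bi for ai, bi in zip(a, b))

a, b = (3, 7), (10, 12)
print(relatively_prime(a, b), euler_holds(a, b))
```

Here `pow(ai, phi(bi), bi)` is Python's built-in modular exponentiation, so the check never forms the full power.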
References
R. L. Goodstein, Recursive Number Theory (North-Holland Publishing Co., Amsterdam 1957).
M. T. Partis, Commutative partially ordered recursive arithmetics. Mathematica Scandinavica 13 (1963) 199-216.
V. Vuckovic, Partially ordered recursive arithmetics. Mathematica Scandinavica 7 (1959) 306-320.
UNSOLVABLE PROBLEMS IN THE THEORY OF COMPUTABLE NUMBERS
B. H. MAYOH
University of Oslo, Blindern, Norway
With each total, general recursive, singulary function $f$ on the natural numbers (hereafter "recursive function") one can associate the real number $'f' = \alpha = \pm a_0 . a_1 a_2 a_3 \ldots$ that satisfies:
$\alpha \ge 0$ if $f(0)$ is even, $\alpha \le 0$ if $f(0)$ is odd,
$a_0 = [f(0)/2]$,
($\Delta$) for all $i \ge 1$, $a_i$ = the remainder on dividing $f(i)$ by 10.
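As a minimal sketch of this convention, the Python fragment below computes the $n$-place truncation of the number associated with a function. The helper name `associated_prefix` and the two sample functions are assumptions made for illustration only.

```python
from fractions import Fraction

def associated_prefix(f, n):
    """Truncation to n decimal places of the number associated with f."""
    sign = 1 if f(0) % 2 == 0 else -1        # f(0) even: non-negative, odd: non-positive
    value = Fraction(f(0) // 2)              # integer part a_0 = [f(0)/2]
    for i in range(1, n + 1):
        value += Fraction(f(i) % 10, 10 ** i)  # a_i = remainder of f(i) on division by 10
    return sign * value

def third(i):
    """Hypothetical presentation of 0.333..."""
    return 0 if i == 0 else 3

def minus_two_and_a_half(i):
    """Hypothetical presentation of -2.5: f(0) = 5 is odd, a_0 = 2, a_1 = 5."""
    return {0: 5, 1: 5}.get(i, 0)

print(associated_prefix(third, 3))
print(associated_prefix(minus_two_and_a_half, 4))
```

Exact rationals are used so that no rounding enters the comparison of digits.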
A real number $\alpha$ is said to be computable if there is a recursive function associated with $\alpha$. Let $R$ denote the class of real numbers, and $C$ the class of computable numbers.
THEOREM 1: If function $f: R^k \to R$ and open interval $Q \subset R^k$ are such that $f$ restricted to $Q$ is continuous, monotone in each argument, and can be effectively calculated for any $k$-tuple of finite decimals in $Q$, then the value of $f$ is a computable number for every $k$-tuple of computable numbers in $Q$ as argument.
PROOF: Let $(x_1, x_2, \ldots, x_k)$ be any $k$-tuple of computable numbers in $Q$. As all finite decimals are computable, since any function whose value is 0 for all but a finite number of arguments is recursive, it suffices to consider the case when $f(x_1, x_2, \ldots, x_k) = \alpha$ is an infinite decimal. Let $f_1, f_2, \ldots, f_k$ be the recursive functions by which $x_1, x_2, \ldots, x_k$ are presented. For any positive integer $m$, let $f_{1m}, f_{2m}, \ldots, f_{km}$ be the recursive functions given by:
$f_{im}(j) = f_i(j)$ if $j \le m$,
$f_{im}(j) = 0$ if $j > m$,
and $d_{1m}, d_{2m}, \ldots, d_{km}$ be the finite decimals associated with $f_{1m}, f_{2m}, \ldots, f_{km}$. As $Q$ is open, there is a positive integer $l$ such that $m \ge l$ implies that $Q$ includes the $2^k$ arguments ... 0 there is an $m > l$ such that $P_m < \varepsilon$. As $\alpha$ is not a finite decimal, this ensures that for any positive integer $n$ there is an $m > l$ such that all $m$-values agree on the first $n$ decimal places. As $f$ is monotone in each argument, these are the first $n$ decimal places of $\alpha$.
Corollary 1a. $C$ is closed under the rational operations, so it is a field.
Corollary 1b. $C$ is closed under the elementary functions; such functions as $x \to \exp x$, $x \to \log x$, $x \to \sin x$, $x \to n^x$, $x \to \sqrt[n]{x}$, have computable values for computable arguments. In particular $e = \exp 1$ is computable [10, p. 256].
THEOREM 2: If $f: R \to R$ is a continuous function whose sign can be computed effectively at any finite decimal that is not a root, then all simple roots of $f$ are computable.
PROOF. Again one need only consider the case of a simple root $\alpha$ that is not a finite decimal. As $\alpha$ is isolated, there is a finite decimal ... such that $\alpha$ is the only root of $f$ in the closed interval ... By Weierstrass' theorem $\alpha$ is presented by the following recursive function:
$g(i) = 2d_0$ if $\delta > 0$ and $i = 0$,
$g(i) = 2d_0 + 1$ if $\delta < 0$ and $i = 0$,
$g(i) = d_i$ if $1 \le i \le k$,
$g(i)$ = the least $z$ such that $f(d + z \cdot 10^{-i})$ and $f(d + (z+1) \cdot 10^{-i})$ have opposite signs, otherwise.
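The "otherwise" clause amounts to a digit-by-digit search for a sign change. The Python sketch below illustrates it for the positive root of $x^2 - 2$. It assumes that $d$ is the approximation built so far and is advanced after each digit, and that the root is not a finite decimal, so the sign is never requested at the root itself. The names `root_digits` and `sign_of_x2_minus_2` are hypothetical.

```python
from fractions import Fraction

def root_digits(sign, d, k, n):
    """Extend the k-place finite decimal d by n further digits of the
    unique root lying in the closed interval [d, d + 10**-k]."""
    digits = []
    for i in range(k + 1, k + n + 1):
        step = Fraction(1, 10 ** i)
        z = 0
        # least z such that the sign changes between d + z*step and d + (z+1)*step
        while sign(d + z * step) == sign(d + (z + 1) * step):
            z += 1
        digits.append(z)
        d += z * step          # assumption: d tracks the approximation so far
    return digits

def sign_of_x2_minus_2(x):
    """Sign of x*x - 2 at a rational x (never zero, as sqrt(2) is irrational)."""
    return 1 if x * x > 2 else -1

print(root_digits(sign_of_x2_minus_2, Fraction(1), 0, 5))
```

The inner loop always stops at some $z \le 9$ because the root is known to lie in the current interval.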
Corollary 2a. All the roots of a polynomial with computable coefficients are computable [2, 8]. In particular all algebraic numbers are computable [10, p. 254].
Corollary 2b. $x \to \sin x$ satisfies the requirements, so $\pi$ is computable [10, p. 256].
Corollary 2c. $C$ is closed under the inverse circular and hyperbolic functions.
However $C$ can be closed under a function $f$ without $f$ being effective.
THEOREM 3: There is no effective procedure which, given two recursive functions $f_1$ and $f_2$, will stop and present a recursive function $f_3$ such that $'f_3' = {'f_1'} + {'f_2'}$.
PROOF. One can effectively find the following recursive functions for any Turing machine $M$:
$g_1(i) = 0$ if $i < 2$ or $M$ stops and presents 1 within $i$ steps when started on its own Gödel number,
$g_1(i) = 9$ otherwise,
$g_2(i) = 2$ if $i < 2$ or $M$ stops and presents 0 within $i$ steps when started on its own Gödel number,
$g_2(i) = 0$ otherwise.
Let $d$ be the digit in the first decimal place of ${'g_1'} + {'g_2'}$. If $M$ stops and presents 0 (1) when run on its own Gödel number, then $d$ is 1 (0). The usual diagonalisation argument shows that one cannot find $d$ effectively.
Similar proofs show that subtraction, division, multiplication, extraction of roots, exponentiation and the taking of logarithms are also non-effective. Moreover for any computable number $a$ one can show that $x \to x + a$, $x \to x/a$ when $a \ne 0$, and $x \to x - a$ are effective if and only if $a$ is a finite decimal. Thus doubling but not trebling is effective. If we had chosen to work with ternary instead of decimal expansions, the opposite would have been true. Such dependence on the number base can occur as conversion from base $p$ to base $q$ is only effective when $q$ divides a power of $p$ (cf. [5] theorems 3 and 5). It is curious that the existence of a procedure for any of the above non-effective operations does not seem to imply the solvability of the Halting problem [10], though each is reducible
to the Halting problem. However the Halting problem is equivalent to the problem of finding the computable number that is the limit of a given recursively convergent recursive sequence of rationals.
There is a profound analogy between the way in which a computable number is associated with each recursive function and that in which a semigroup is presented by each Thue system. Just as finite semigroups can be given by a multiplication table whilst infinite semigroups require a set of defining relations, so finite decimals can be written down directly whilst infinite decimals must be given by a rule. Just as not all semigroups can be presented by Thue systems, so not all real numbers are computable.
Most interesting properties of semigroups are "Markov properties" [3, 7] in the following sense:
i) There is a Thue system $\Gamma_1$ that presents a semigroup enjoying $P$.
ii) There is a Thue system that presents an inhibiting semigroup $S^*$, i.e. if $S^*$ can be embedded in the finitely presented semigroup $S$, then $S$ does not enjoy $P$.
iii) $P$ is preserved under isomorphisms.
The natural analogue of this definition is: A property of real numbers $P$ is said to be pseudo-Markov if
i) There is a recursive function $Pf_1$ associated with a number enjoying $P$.
ii) There is a recursive function $Pf_2$ associated with an inhibiting number $\alpha^*$, i.e. if recursive function $g$ is associated with a number $\alpha$ that differs from $\alpha^*$ at only finitely many places, then $\alpha$ does not enjoy $P$.
It is known that (1) for every Markov property $P$ of semigroups, the problem of determining whether or not a given Thue system presents a semigroup enjoying $P$ is unsolvable [3, 7], and that (2) for each recursively enumerable degree of unsolvability $D$, there is a class of Thue systems $A$ such that the above problem, restricted to $A$, has $D$ as its degree of unsolvability [1]. N. Shapiro has proved the analogue of (1) [9, theorem 2.2]; the following theorem is the analogue of (2).
THEOREM 4: For any pseudo-Markov property $P$ of real numbers and any recursively enumerable degree of unsolvability $D$, there is a class $A$ of recursive functions such that $D$ is the degree of unsolvability of the problem
of determining whether or not the number associated with a given recursive function in $A$ enjoys $P$.
PROOF. Let $S_D$ be an infinite recursively enumerable set of positive integers whose decision problem has $D$ as its degree of unsolvability. For each positive integer $j$, one can effectively find the recursive function:
$g_j(i) = Pf_2(i) + 10 \cdot j$ if $i \ne 0$ and $j$ is enumerated amongst the first $i$ elements of $S_D$,
$g_j(i) = Pf_1(i)$ if $i = 0$,
$g_j(i) = Pf_1(i) + 10 \cdot j$ otherwise.
The associated computable number enjoys $P$ if and only if $j \notin S_D$, so $\{g_1(i), g_2(i), \ldots\}$ will serve for $A$.
THEOREM 5: If a property $P$ of real numbers is enjoyed by at least one computable number, if any number, agreeing with a number enjoying $P$ at all but a finite number of places, also enjoys $P$, and if one can recursively enumerate a set of recursive functions such that a computable number enjoys $P$ if and only if it is associated with at least one recursive function in the set, then $P$ is a pseudo-Markov property and the problem of determining whether or not the number associated with a given recursive function enjoys $P$ is of degree $0''$, i.e. has the same degree of unsolvability as the problem of determining whether or not a given partial recursive function is total.
PROOF. Consider the recursive function:
$f(i) = 6$ if 5 is the remainder of the value of the $(i+1)$-st listed function for argument $i$ on division by 10,
$f(i) = 5$ otherwise.
Its associated number does not enjoy P, and no computable number that agrees with it on all but a finite number of places can enjoy P. Thus P is a pseudo-Markov property. For any partial function p, one can effectively find the recursive functions:
$g(0) = 1$,
$g(i+1) = g(i)$ if $p(g(i) - 1)$ cannot be computed within $i + 1$ steps,
$g(i+1) = g(i) + 1$ otherwise;
$f_p(0) = 0$,
$f_p(i+1) = h(i)(i+1)$ if $g(i+1) = g(i)$,
$f_p(i+1) = 6$ if $g(i+1) \ne g(i)$ and the remainder on dividing $h(i)(i+1)$ by 10 is 5,
$f_p(i+1) = 5$ otherwise.
The number associated with $f_p$ enjoys $P$ if and only if $p$ is not total. Moreover for each recursive function $r$ one can effectively find the partial recursive function:
$q(i)$ = undefined if $r$ and the $i$-th listed function always agree modulo 10 and they agree exactly for argument 0,
$q(i) = 0$ otherwise.
$q$ is total if and only if the number associated with $r$ does not enjoy $P$.
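The diagonal function $f$ introduced at the start of this proof can be sketched in Python against a toy enumeration. Here `listed(n)` is assumed to return the $n$-th listed function (counting from 1). The enumeration used below is purely illustrative and is not a recursive enumeration tied to any particular property $P$.

```python
def diagonal(listed):
    """Build f with f(i) = 6 if the (i+1)-st listed function has
    remainder 5 at argument i on division by 10, and f(i) = 5 otherwise."""
    def f(i):
        return 6 if listed(i + 1)(i) % 10 == 5 else 5
    return f

def toy_listing(n):
    """Hypothetical enumeration: the n-th listed function is i -> n + i."""
    return lambda i: n + i

f = diagonal(toy_listing)
# f disagrees modulo 10 with the (i+1)-st listed function at argument i,
# and its digits are never 0 or 9, so no second decimal expansion can interfere.
print([f(i) for i in range(6)])
```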
For properties of the form "Being algebraic of degree in $S$" where $S$ is a recursive set of positive integers, and in particular for "Being algebraic" and "Being rational", this has been proved in another way [9, theorem 11.10]. The theorem also applies to the apparently simpler property "Being expressible as a finite decimal".
It is known that (3) for every recursively enumerable degree of unsolvability $D$ there is a class $A$ of Thue systems such that the isomorphism problem restricted to $A$ has $D$ as its degree of unsolvability [1], and that (4) the isomorphism problem for semigroups is unsolvable [4, 6]. The next two theorems give the analogous results in the theory of computable numbers.
THEOREM 6: For every recursively enumerable degree of unsolvability $D$, there is a class of computable numbers $A$ such that the problem of determining whether or not two recursive functions in $A$ have the same associated number has $D$ as its degree of unsolvability.
PROOF. Let $S_D$ be as in the proof of theorem 4. For any positive integer $j$, one can effectively find the recursive function:
$f_j(i) = 0$ if $i = 0$,
$f_j(i) = 1 + 10 \cdot j$ if $i \ne 0$ and $j$ is enumerated among the first $i$ elements of $S_D$,
$f_j(i) = 10 \cdot j$ otherwise.
Let $f_0$ be any recursive function whose value is always 0. Then $\{f_0, f_1, f_2, \ldots\}$ will serve as $A$, since "$j \notin S_D$" reduces to "Do $f_0$ and $f_j$ have the same associated number?", and one can effectively find the natural number $g^*$ such that $f_{g^*} = g$ for any recursive function $g$ in $A$, so "Do the functions $g$ and $h$ in $A$ have the same associated numbers?" reduces to ($g^* \notin S_D$ and $h^* \notin S_D$) or $g^* = h^*$.
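The construction of the functions $f_j$ can be sketched in Python as follows. The helper `enumeration(i)`, returning the first $i$ enumerated elements of $S_D$, is an assumption, and the toy set used below (the positive even numbers) is decidable, so it only illustrates the bookkeeping and not the unsolvability.

```python
def make_f(j, enumeration):
    """Return f_j for a given enumeration of S_D (hypothetical helper)."""
    def f(i):
        if i == 0:
            return 0
        if j in enumeration(i):      # j appears among the first i elements
            return 1 + 10 * j
        return 10 * j
    return f

def evens(i):
    """Toy stand-in for S_D: its first i elements are 2, 4, ..., 2i."""
    return [2 * (k + 1) for k in range(i)]

f4 = make_f(4, evens)   # 4 is enumerated: the digits eventually become 1
f3 = make_f(3, evens)   # 3 is never enumerated: every digit is 0
print([f4(i) for i in range(4)], [f3(i) % 10 for i in range(4)])
```

The multiple of 10 leaves every decimal digit unchanged but lets the index be read back from the function, for example as `f(1) // 10`.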
This proof may not be available for other conventions that associate real numbers with recursive functions; it is then necessary to distinguish between identical recursive functions that are defined differently in order to reduce the decision problems in theorems 4 and 6 to that of $S_D$.
THEOREM 7: The problem of determining whether or not an arbitrary pair of recursive functions have the same associated numbers has the same degree of unsolvability as the Halting problem, viz. $0'$.
PROOF. Let $f_0$ be as in the last proof. For any Turing machine $M$, one can effectively find the recursive function
$f_M(i) = 0$ if $M$ runs for at least $i$ steps when started on blank tape,
$f_M(i) = 1$ otherwise.
$M$ stops if and only if $f_0$ and $f_M$ have different associated numbers. Furthermore for any recursive functions $f$ and $g$ one can effectively find a machine $M^*$ that compares $f(i)$ with $g(i)$ for $i = 0, 1, 2, \ldots$ until $'f'$ and $'g'$ diverge. $M^*$ stops if and only if $f$ and $g$ have different associated numbers. Similarly the following five decision problems are also of this degree: To determine of any pair of recursive functions whether or not the number associated with the first is greater than (not less than, different from, not greater than, less than) the number associated with the second. However one can show, by modifying $M^*$ in a suitable fashion, that all six decision problems are solvable when restricted to pairs of functions that have different associated numbers.
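One possible reading of $M^*$ is sketched below in Python. It compares $n$-place truncations and stops once they are more than $2 \cdot 10^{-n}$ apart, which can only happen when the two associated numbers differ, since each truncation lies within $10^{-n}$ of its number. The names `truncation` and `first_separation`, and the cut-off `max_places` (added so that the sketch always terminates), are assumptions. A genuine $M^*$ would search without bound.

```python
from fractions import Fraction

def truncation(f, n):
    """n-place truncation of the number associated with f."""
    sign = 1 if f(0) % 2 == 0 else -1
    value = Fraction(f(0) // 2)
    for i in range(1, n + 1):
        value += Fraction(f(i) % 10, 10 ** i)
    return sign * value

def first_separation(f, g, max_places):
    """Least n <= max_places at which the truncations prove that the
    associated numbers differ, or None if no such n is found."""
    for n in range(1, max_places + 1):
        gap = abs(truncation(f, n) - truncation(g, n))
        if gap > Fraction(2, 10 ** n):
            return n
    return None

half = lambda i: 5 if i == 1 else 0                          # 0.5000...
almost_half = lambda i: 0 if i == 0 else (4 if i == 1 else 9)  # 0.4999...
third = lambda i: 0 if i == 0 else 3                         # 0.3333...

print(first_separation(half, third, 40))
print(first_separation(half, almost_half, 40))
```

The pair `half`, `almost_half` shows why comparing function values alone is not enough: the functions differ everywhere after the integer part, yet the associated numbers coincide.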
N. Shapiro has proved: If a property $P$ of real numbers is enjoyed by only finitely many computable numbers (and by one at least), then it is a pseudo-Markov property and the problem of determining whether or not the number associated with a given recursive function enjoys $P$ has the same degree of unsolvability as the Halting problem [9, theorem 2.16]. In particular this applies for "$= a$" where $a$ is any computable number. The decision problems for the properties "$> a$", "$\ge a$", "$\ne a$", "$\le a$", "$< a$" are also of this degree. However judicious use of $M^*$ enables us to solve these six decision problems when restricted to recursive functions with associated numbers different from $a$.
If it seems unnatural to associate a real number with every recursive function, as we have done (e.g. one might prefer to replace clause ($\Delta$) in the definition in the first paragraph by "$a_i = f(i)$", or to disregard those decimal expansions that end in a string of nines), then one is free to change the definition; our results will still hold if "recursive function" is replaced throughout by "recursive function that is associated with some number".
References
[1] W. W. Boone, Partial Results regarding Word Problems and Recursively Enumerable Degrees of Unsolvability. Bull. Amer. Math. Soc. 68 (1962) 616-623.
[2] A. Grzegorczyk, Computable Functionals. Fund. Math. 44 (1957) 61-71.
[3] A. A. Markov, Impossibility of Algorithms for recognising some Properties of Associative Systems (in Russian). Dokl. Akad. Nauk. SSR 77 (1951) 953-956.
[4] A. A. Markov, Impossibility of Certain Algorithms in the Theory of Associative Systems (in Russian). Dokl. Akad. Nauk. SSR 77 (1951) 19-20.
[5] A. Mostowski, On Computable Sequences. Fund. Math. 44 (1957) 37-51.
[6] A. Mostowski, Review of [4], J. Symb. Logic 16 (1951) 215.
[7] A. Mostowski, Review of [3], J. Symb. Logic 17 (1952) 151.
[8] H. G. Rice, Recursive Real Numbers. Proc. Amer. Math. Soc. 5 (1954) 784-795.
[9] N. Shapiro, Degrees of Computability. Ph.D. thesis, Princeton University (1955).
[10] A. M. Turing, On Computable Numbers. Proc. London Math. Soc. (2) 42 (1936) 230-265.
PREDICATIVE WELL-ORDERINGS
KURT SCHÜTTE
Kiel University, Germany
A hierarchy of critical numbers of the second number class can be defined in the following way:
(1) The 1-critical numbers are the $\varepsilon$-numbers.
(2) An ordinal $\alpha$ is $\nu$-critical ($\nu > 1$) if $f_\mu(\alpha) = \alpha$ for every $\mu < \nu$, where $f_\mu$ is the ordering function of the $\mu$-critical numbers.
These ordering functions $f_\mu$ are normal functions. For $\mu < \nu$ the set of $\nu$-critical numbers is a proper subset of the set of $\mu$-critical numbers. If $\alpha$ is $\mu$-critical then $\mu \le \alpha$. We say that an ordinal $\kappa$ is strongly critical if it is $\kappa$-critical, i.e. if $f_\kappa(0) = \kappa$. It turns out that the smallest strongly critical number $\kappa_0$ is a least upper bound for predicative reasoning.
We define in section 1 an ordering relation $\prec$ of equivalence classes of natural numbers representing a sufficiently large segment of the second number class in a constructive way. With respect to this representation of ordinals, we prove in section 3 transfinite induction up to ordinals smaller than $\kappa_0$ by using a formal system of ramified type theory which is defined in section 2. In this way well-ordering up to any ordinal $\alpha < \kappa_0$ is provable by predicative methods (if the ordinals are defined in a sufficiently constructive way as in section 1).
In another paper (Schütte [7]) there is a proof that well-ordering up to ordinals $\ge \kappa_0$ cannot be proved by predicative methods. The same result was found independently by S. Feferman [1]-[3]. Our well-ordering $\prec$ is based on normal functions according to a construction of Veblen [8]. The well-ordering $\prec$ of this paper corresponds to a proper segment of a well-ordering in Schütte [5]. It is very closely related to the
well-ordering $\prec$ of section 11 in Schütte [6]. Both well-orderings represent the same segment of ordinals.
1. A constructive system of ordinals
We use small Latin letters as syntactical variables for natural numbers (including 0) and define a binary relation $\preceq$ on natural numbers. We denote by ... $n$.
Obviously, for any natural numbers $a$, $b$ it is decidable whether $a \preceq b$ or $a \npreceq b$. It is easy to prove by mathematical induction (according to the natural ordering of natural numbers) that $\preceq$ is a reflexive total ordering relation, i.e.
$a \preceq b$ or $b \preceq a$ (totality),
If $a \preceq b$ and $b \preceq c$ then $a \preceq c$ (transitivity).
We define:
$a \equiv b$ if and only if $a \preceq b$ and $b \preceq a$,
$a \prec b$ if and only if $b \npreceq a$.
$a \nprec b$ denotes the negation of $a \prec b$.
Since $\preceq$ is a reflexive total ordering relation it follows that $\equiv$ is an equivalence relation, and $\prec$ is an irreflexive total ordering relation with respect to the relation $\equiv$, i.e.
$a \equiv a$
If $a \equiv b$ then $b \equiv a$
If $a \equiv b$ and $b \equiv c$, then $a \equiv c$
$a \nprec a$
If $a \prec b$ and $b \prec c$ then $a \prec c$
$a \prec b$ or $b \prec a$ or $a \equiv b$
If $a \equiv c$, $b \equiv d$, and $a \prec b$, then $c \prec d$
and ...
... is a relation symbol of our formal system. A numerical prime formula is an elementary prime formula which does not contain a free number variable. According to our interpretation of function symbols and relation symbols every numerical prime formula has a computable truth-value "true" or "false". A numerical prime formula $(t_1 \preceq t_2)$ has the value "true" if and only if $|t_1| \preceq |t_2|$ holds. An elementary prime formula $(t_1 \preceq t_2)$ is called verifiable if there is a constructive proof of the following property: Whenever $t_1^*$, $t_2^*$ are the results of substituting arbitrary numerals for the free number variables in $t_1$, $t_2$, the numerical prime formula $(t_1^* \preceq t_2^*)$ has the value "true".
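The way the relations $\equiv$ and $\prec$ of section 1 are derived from a decidable, reflexive, total and transitive relation can be sketched in Python. The relation `toy_leq` below is only a stand-in chosen to make distinct numbers equivalent. It is not the ordering constructed in this paper, and the names `derive` and `toy_leq` are hypothetical.

```python
def derive(leq):
    """From a decidable total preorder leq, build the equivalence
    and the strict ordering as defined in section 1."""
    def equiv(a, b):
        return leq(a, b) and leq(b, a)
    def less(a, b):
        return not leq(b, a)     # a precedes b iff b does not weakly precede a
    return equiv, less

def toy_leq(a, b):
    """Toy total preorder on natural numbers: compare a // 2 with b // 2."""
    return a // 2 <= b // 2

equiv, less = derive(toy_leq)
print(equiv(4, 5), less(3, 4), less(4, 5))
```

In this toy relation 4 and 5 are equivalent without being equal, which is why $\prec$ is a total ordering only with respect to $\equiv$.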
2.7. Nominal forms
A nominal form is a finite sequence containing no symbols other than primitive symbols of our formal system and a nominal symbol (which is not a primitive symbol of our system). If $\mathfrak{A}$ denotes a nominal form and $X$ denotes a primitive symbol or a finite sequence of primitive symbols then $\mathfrak{A}(X)$ denotes the result of substituting $X$ for the nominal symbol in $\mathfrak{A}$.
2.8. The supremum
Let $a$ be a free number variable occurring neither in the nominal form $\mathfrak{s}$ nor in the term $s$, and let $\mathfrak{s}(a)$ be a term. We say that the relation $\sup \mathfrak{s} \equiv s$ is verifiable if there is a constructive proof of the following properties: (1) The elementary prime formula $\mathfrak{s}(a) \preceq s$ is verifiable. (2) Whenever $\mathfrak{s}^*$, $s^*$ are the results of substituting arbitrary numerals for the free number variables in $\mathfrak{s}$, $s$, then for every natural number
n
F -+ 'll(t)
where a is a free number variable which does not occur in the formula F or in the nominal form'll, 'll(a) is a formula, and t is a term. (B3) The inference rule for predicate quantification: F -+ 'll(p(a, I)
=>
F
-+
Axl'll(x')
where p is a free predicate variable which does not occur in the formula F or in the nominal form'll, and a is a free number variable which does not occur in F, 'll or the term t. Remark. A formula Axl'll (x') has the interpretation: "'ll(q) for all predicates q of level -< r: 2 . 15. Definition
A formula of level $s$ is regularly derivable if it is derivable only from formulas of levels $s_i$ such that $s_i \preceq s$ is verifiable.
3. Proof of transfinite induction in the formal system of ramified analysis
We define the following formulas: Pr (q) (