Westfälische Wilhelms-Universität Münster
Institut für Mathematische Logik und Grundlagenforschung
An Introduction to
Mathematical Logic

Lectures given by
Wolfram Pohlers

worked out and supplemented by
Thomas Glaß

Typewritten by Martina Pfeifer
An Introduction to
Mathematical Logic

Wolfram Pohlers
Thomas Glaß

Institut für Mathematische Logik und Grundlagenforschung
Westfälische Wilhelms-Universität Münster
Einsteinstraße 62
D-48149 Münster

Typeset in AMS-LaTeX
Preface
This text is based on my personal notes for the introductory courses in Mathematical Logic I gave at the University of Münster in the years 1988 through 1991. They have been worked out and supplemented by Thomas Glaß, to whom I express my sincere thanks for a well-done job. Our courses have been planned for freshmen in Mathematical Logic with a certain background in Mathematics. Though self-contained in principle, this text will therefore sometimes appeal to the mathematical experience of the reader. According to the aim of the lectures, to give the student a firm basis for further studies, we tried to cover the central parts of Mathematical Logic. The text starts with a chapter treating first order logic. The examples for the application of Gentzen's Hauptsatz in this section give a faint flavour of how to apply proof theoretical methods in Mathematical Logic. Fundamentals of model theory are treated in the second chapter, fundamentals of recursion theory in chapter 3. We close with an outline of other formulations of (first order and non-first order) logics. Nearly nothing, however, is said about set theory. This is usually taught in an extra course. Thus there is an appendix in which we develop the small part of the theory of ordinal and cardinal numbers needed for these notes on the basis of a naive set theory.
Among the highlights of this text are Gödel's incompleteness theorems. The true reason for these theorems is the possibility to code the language of number theory by natural numbers. Only a few conditions have to be satisfied by this coding. Since we believe that a development of such a coding in all its awkward details could mystify the simple basic idea of Gödel's proof, we just required the existence of a suitable arithmetisation and postponed the details of its development to the appendix.
I want to express my warmest thanks to all persons who helped in finishing this text. Besides Thomas Glaß, who did the work of a co-author, I want to mention Andreas Schlüter in the first place. He did not only most of the exercises but also most of the proof-reading. Many improvements are due to him. My thanks go also to all our students who detected and reported errors in a first version and gave us many helpful critical remarks. We do not regard these notes as finished. Therefore we are still open to suggestions and criticism and will appreciate all reports about errors, both typing errors and possibly more serious ones. Last but not least I want to thank our secretary Martina Pfeifer who TeXed the main bulk of this paper.

Münster, October 1992
Wolfram Pohlers
Contents

Historical Remarks
Notational Conventions

1 Pure Logic
   Heuristical Preliminaries
   1.1 First Order Languages
   1.2 Truth Functions
   1.3 Semantics for First Order Logic
   1.4 Propositional Properties of First Order Logic
   1.5 The Compactness Theorem for First Order Logic
   1.6 Logical Consequence
   1.7 A Calculus for Logical Reasoning
   1.8 A Cut Free Calculus for First Order Logic
   1.9 Applications of Gentzen's Hauptsatz
   1.10 First Order Logic with Identity
   1.11 A Tait-Calculus for First Order Logic with Identity

2 Fundamentals of Model Theory
   2.1 Conservative Extensions and Extensions by Definitions
   2.2 Completeness and Categoricity
   2.3 Elementary Classes and Omitting Types

3 Fundamentals of the Theory of Decidability
   3.1 Primitive Recursive Functions
   3.2 Primitive Recursive Coding
   3.3 Partial Recursive Functions and the Normal Form Theorem
   3.4 Universal Functions and the Recursion Theorem
   3.5 Recursive, Semi-recursive and Recursively Enumerable Relations
   3.6 Rice's Theorem
   3.7 Random Access Machines
   3.8 Undecidability of First Order Logic

4 Axiom Systems for the Natural Numbers
   4.1 Peano Arithmetic
   4.2 Gödel's Theorems

5 Other Logics
   5.1 Many-Sorted Logic
   5.2 ω-Logic
   5.3 Higher Order Logic

Appendix
   A.1 The Arithmetisation of NT
   A.2 Naive Theory of the Ordinals
   A.3 Cardinal Numbers

Bibliography
   Historical Texts
   Original Articles
   Text Books

Glossary
Index
Historical Remarks

Nowadays mathematics is an extremely heterogeneous science. If we try to find a generic term for mathematical activities we encounter the astonishing difficulty of such an enterprise. In former times mathematics has been described as the `science of magnitudes'. Nowadays this is no longer true. We regard a science as a mathematical science mainly not because of its contents but rather because of its methods. The characteristic of a mathematical theory is that it proves its claims in an exact way, i.e. it derives its results from generally accepted basic assumptions without using further information such as empirical experiments etc. Then the next question to be asked is: "What does it mean to `derive something from something'?", i.e. what is a mathematical proof? Still in the last century a proof was more or less a matter of intuition. A sentence was regarded to be a theorem when it was accepted by the mathematical community. We know `proofs' of theorems which are considered to be false these days (although the theorem is true, which is a point for the intuition of the researchers involved). However, it seems to have been clear at all times that `logical' reasoning should be ruled by laws, and the efforts to investigate them reach back to the times of antiquity. The oldest known `logical system' is Aristotle's [*384 B.C., †322 B.C.] Syllogistic. We will not describe Syllogistic here. All we want to say is that it is by far too weak to describe mathematical reasoning. About the same time there was also some research on logical reasoning by the Megarians and Stoics, which in some sense was much more modern than that of Aristotle. The influence of Aristotle's work was tremendous. It ruled the Middle Ages. The fact that the Roman Church had taken up, with some adjustments, Aristotle's philosophy created an over-great respect for Aristotle's work.
This together with other traditions paralysed logical research and restricted it mainly to work-outs of Aristotle's systems of Syllogistic. In that time a remarkable book with new ideas was Ars Magna (1270) by Raimundus Lullus [*1235, †1315], in which he suggested that all knowledge in the sciences is obtained from a number of root ideas. The joining together of the root ideas is the `ars magna' (the great art). Lullus himself did not really develop a theory, but his ideas still influenced Leibniz' work. One of the lasting challenges of Lullus' ideas was the development of a general language for a general science. The first one to have the idea of developing a general language in a mathematical way was René Descartes [*1596, †1650]. But because of the great influence of the Roman Church he did not publish his ideas.
Such attempts were made by Gottfried Wilhelm Leibniz [*1646, †1716]. In his
De arte combinatoria (1666) he suggested a `mathematics of ideas'. Leibniz regarded
mathematics as the key-science which should be able to decide all questions `in so far as such decisions are possible by reasoning from given facts'. So he tried to develop a general algorithm which could decide any question (obeying the just mentioned restrictions). Using that algorithm he wanted to decide whether God exists. Of course he failed (he had to, as we know today). The real start of mathematical logic were George Boole's [*1815, †1864] books The Mathematical Analysis of Logic (1847) and The Laws of Thought (1854). Three decades later, Gottlob Frege [*1848, †1925] published his book Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen Denkens (1879). In the work of both authors the central point is a formalisation of the notions of `sentence' and `inference'. Boole, influenced by the work of William Hamilton [*1788, †1856] and Augustus De Morgan [*1806, †1878], opted for an algebraic notation, while Frege designed an artificial language on the model of colloquial language. Because of its complicated two-dimensional notation Frege's formalisation did not succeed. Boole's concept of an algebra of logic still had some flaws which prevented it from being commonly accepted. Nowadays, after the errors in Boole's concept have been removed, the notion of a boolean algebra has become central in mathematical logic. The breakthrough in the development of mathematical logic was the Principia Mathematica (1910, 1912, 1913) by Alfred North Whitehead [*1861, †1947] and Bertrand Russell [*1872, †1970]. Their notions relied on previous work by Giuseppe Peano [*1858, †1932]. His book Formulaire de Mathématiques (1897) presented a completely developed formalism for the theory of logic and thus launched what we call mathematical logic nowadays. Kurt Gödel [*1906, †1978] and other pioneers of modern mathematical logic used the `Principia' as their main reference. Mathematical logic investigates mathematical reasoning by mathematical methods.
This self-referential character distinguishes it from other fields of mathematics and is the reason why logic is sometimes regarded as a somewhat strange part of mathematics. The best examples of the kind of strangeness we mean are Gödel's famous incompleteness theorems, which show us the limits of formalisations. These theorems are included in this book. So the reader is advised to convince her- or himself of the things we claim here. Nowadays mathematical logic is divided into four subfields: Recursion Theory, Set Theory, Model Theory and Proof Theory. Having discussed the basics of logic (pure logic) we are going to obtain some connections to model theory (fundamentals of model theory). After that we develop the basic
notions of recursion theory (fundamentals of the theory of decidability) and turn to the fundamentals of proof theory in the fourth chapter. In the last chapter we will consider some other formulations of logic.
Notational Conventions
iff stands for if and only if.
∅ denotes the empty set.
IN is the set of natural numbers 0, 1, 2, ...
If f : X → Y is a function, we call X the domain of f, i.e. X = dom(f). The range of f is the set rg(f) = {y ∈ Y : ∃x ∈ X (f(x) = y)}.
If f : X → Y is a function and Z ⊆ X, then f↾Z denotes the restriction of f to Z, i.e. f↾Z : Z → Y with (f↾Z)(x) = f(x) for x ∈ Z.
X^Y is the set of functions f : Y → X.
Pow(X) denotes the power set of X, i.e. the set of all subsets of X.
X \ Y denotes the set X without Y, i.e. the set {x ∈ X : x ∉ Y}.
X ∪ Y denotes the union of X and Y, i.e. the set {x : x ∈ X or x ∈ Y}.
X ∩ Y denotes the intersection of X and Y, i.e. the set {x : x ∈ X and x ∈ Y}.
id_X is the identity on X, i.e. dom(id_X) = X and id_X(x) = x for x ∈ X.
Chapter 1
Pure Logic

Heuristical Preliminaries

In the historical remarks we already emphasised that the development of a formal language belongs to the main tasks of mathematical logic. To see what will be needed for a formal language of mathematics we are going to examine examples of mathematical propositions. Let us start with a simple one,
5 | 15,
i.e. the natural number 5 divides the natural number 15, or
(3 + 4) = (2 + 5).
These propositions tell us facts about natural numbers. The first one tells us that the two natural numbers 5 and 15 share the property that one (the first) divides the other (the second). Such properties, which may be shared by one or more objects (natural numbers in our example), will be called predicates. The number of objects which may share a predicate is called the arity of the predicate. The equality of natural numbers, for instance, is a binary predicate. Whenever we have an n-ary predicate P and n objects o1, ..., on, then (P o1 ... on) is a proposition, something which either can be true or false.
The second example is a bit more complex. In it we no longer compare two objects but two things, 3 + 4 and 2 + 5, which are built up from objects by functions. Such things will be called terms. Terms can be evaluated, and the evaluation will yield an object. Thus objects in predicates may well be replaced by terms and still represent a proposition. We may even replace objects occurring in terms by terms and still obtain a term. To get a uniform definition we could say that every object is a term and that more complex terms are obtained by applying an n-ary function f to already formed terms t1, ..., tn, i.e. by building (f t1 ... tn).
Once propositions are formed, we may compose them into more complex ones by using sentential connectives. In colloquial language we compose propositions by connectives such as `and': it is raining and I'm taking my umbrella,
`or': it is raining or the sun is shining, `if ... then': if it is raining, then I'm going to use my umbrella, `not' etc. We will use the symbols ∧, ∨, →, ¬ to denote these connectives. Of course we will have to give them a mathematically exact meaning (which of course should be as close as possible to their meaning in colloquial language, because it is colloquial language which conserves our long experience in thinking). This will be done in section 1.2.
The use of sentential connectives, however, does not exhaust all the possibilities of forming more complex propositions. In mathematics we usually deal with structures. Let's take the example of a group. A group G consists of a non-empty set G of objects, together with a binary group operation, say ∘, a neutral element, say 1, and the equality relation. The only propositions which are legal with respect to our hitherto collected rules are equations between terms and their sentential combinations. But this does not even allow us to express that 1 is the neutral element of G. In order to do that we have to say something like:
1 ∘ x = x and x ∘ 1 = x for all objects x in G.
Here x is a symbol for an arbitrary element of G, i.e. x is a variable. This variable is quantified by saying for all x in G. Thus we also have to introduce the universal quantifier ∀x (or ∀y, ∀z, ..., for other names of the variable). The universal quantifier alone does not suffice to express that G is a group. To formulate the existence of the inverse object we have to say
∀x there exists an object y with x ∘ y = 1.
Thus we also need the dual of the universal quantifier, the existential quantifier ∃x. Altogether this means that in order to describe a group in our formal language, we have to allow object variables replacing objects in the formation of terms, and quantifiers binding them. These are all the ingredients of a first order language. This language is already quite powerful. E.g. it suffices for the formalisation of group axioms, ring axioms etc. However, one can imagine much more powerful languages. So we might introduce variables for predicates and quantifiers binding them. This would be called a second order language. Third or even higher order languages can be obtained by iterating this process, i.e. allowing the quantification over predicates on predicates etc. Now we close our preliminary words and try to put these ideas into mathematical definitions.
1.1 First Order Languages

In the heuristical preliminaries we spoke about creating a language for mathematics. Now we are going to put those informal ideas into mathematical definitions. After
having defined precisely the formal expressions and their meaning we will analyse the expressive power of so-called first order logic. The heuristical studies of the previous section already give us a clear picture of how to design a formal language. All we have to do in this section is to translate this picture into a mathematical definition. The strategy will be the following. To design a language we first have to fix its alphabet. Then we need grammars which tell us how to get regular expressions out of the letters of the alphabet. In first order languages the regular expressions will be the terms and the formulas.

Definition 1.1.1. The alphabet of a first order language consists of
1. countably many object variables, denoted by x, y, z, x0, ...
2. a set C of constant symbols, denoted by c, d, c0, ...
3. a set F of function symbols, denoted by f, g, h, f0, ... Every function symbol f ∈ F has an arity #f ∈ {1, 2, 3, ...}.
4. a set P of predicate symbols, denoted by P, Q, R, S, P0, ... Every predicate symbol P ∈ P has an arity #P ∈ {1, 2, 3, ...}.
5. the sentential connectives ∧ (and), ∨ (or), ¬ (not), → (implies) and the quantifiers ∀ (for all), ∃ (there is).
6. parentheses (, ) as auxiliary symbols.

A first order language is obviously characterised by the set C (the constant symbols), the set F (the function symbols) and the set P (the predicate symbols). We call the elements of C ∪ F ∪ P the non-logical symbols of the language. All the other symbols are variables, sentential connectives, quantifiers and auxiliary symbols. They are called logical symbols. They do not differ between first order languages. To emphasise that L is a first order language depending on the sets C, F and P we often write L = L(C, F, P). First order languages are denoted by L, L0, ...
To make this definition more visible we are going to give an example: think of formalising group theory. We declare a first order language L(C_GT, F_GT, P_GT) in which we are able to make all statements concerning (elementary) group theory. There we have a constant symbol 1 for the neutral element, a function symbol ∘ for the group operation with #∘ = 2, and a predicate symbol = for the equality relation on the group; = is binary, too. Thus we have C_GT = {1}, F_GT = {∘} and P_GT = {=}.
Using this alphabet we are able to talk about statements concerning a group. But how do we build up (regular) statements? This will be done in general (for any first order language) in two steps. In the first step we will declare how to use variables, constant and function symbols. The expressions obtained in this step will be called terms. In the second step we will introduce how to use predicate symbols, sentential connectives and quantifiers to obtain expressions called formulas.
Definition 1.1.2. Let L(C, F, P) be a first order language. We simultaneously list the rules for term formation and for the computation of the set FV(t) of variables occurring free in the term t.
1. Every variable x and every constant symbol c ∈ C is a term. We have FV(x) = {x} and FV(c) = ∅.
2. If t1, ..., tn are terms and f ∈ F is a function symbol of arity #f = n, then (f t1 ... tn) is a term. We have FV(f t1 ... tn) = FV(t1) ∪ ... ∪ FV(tn).
If f is binary we usually write (t1 f t2) instead of (f t1 t2) and call (t1 f t2) the infix notation of the term (f t1 t2).
Terms are denoted by r, s, t, r0, ... Because terms depend on the given language we will call them L-terms if we want to emphasise the language.
With the alphabet for group theory we can build up the following terms
(∘ x 1)   (∘ (∘ 1 x)(∘ 1 (∘ y z)))
which do not look much like something concerning groups, because usually one would like to use the infix notation in connection with binary function and predicate symbols. Then the above terms are read as
(x ∘ 1)   ((1 ∘ x) ∘ (1 ∘ (y ∘ z)))
and we have the free variables
FV((x ∘ 1)) = {x}   FV(((1 ∘ x) ∘ (1 ∘ (y ∘ z)))) = {x, y, z}.
I.e. FV(t) is the set of variables occurring in the term t.

Definition 1.1.3. Let L(C, F, P) be a first order language. Simultaneously with the grammar for formulas we introduce the rules for the computation of the sets FV(F) of variables occurring free and BV(F) of variables occurring bound in the formula F.
1. If t1, ..., tn are L-terms and P ∈ P is a predicate symbol of arity #P = n, then (P t1 ... tn) is a formula. We have FV(P t1 ... tn) = FV(t1) ∪ ... ∪ FV(tn) and BV(P t1 ... tn) = ∅. If P is binary we often write (t1 P t2) instead of (P t1 t2).
2. If F and G are formulas, then so are (¬F), (F ∧ G), (F ∨ G) and (F → G). We have FV(¬F) = FV(F), BV(¬F) = BV(F), and FV(F ◦ G) = FV(F) ∪ FV(G), BV(F ◦ G) = BV(F) ∪ BV(G) for ◦ ∈ {∧, ∨, →}.
3. If F is a formula and x is a variable such that x ∉ BV(F), then (∀xF) and (∃xF) are formulas with FV(QxF) = FV(F) \ {x} and BV(QxF) = BV(F) ∪ {x} for Q ∈ {∀, ∃}.
Formulas are denoted by F, G, H, F0, ... Thus formulas depend on the language, too. If we want to stress this fact we will call them L-formulas. If L does not contain any predicate symbol, there are no L-formulas. So from now on we will assume that we have P ≠ ∅.
Using the infix notation again we have obtained formulas of the shape
(∀x((1 ∘ x) = x))   (x = y)   (∀x(1 = y))
In these cases we have the sets
FV((∀x((1 ∘ x) = x))) = ∅,  BV((∀x((1 ∘ x) = x))) = {x}
FV((x = y)) = {x, y},  BV((x = y)) = ∅
FV((∀x(1 = y))) = {y},  BV((∀x(1 = y))) = {x}.
In the third case of Definition 1.1.3 we have a condition on the variable for building formulas. Thus (∀x(∃x(x = x ∘ 1))) is not a formula because x ∈ BV((∃x(x = x ∘ 1))).
The grammars in Definitions 1.1.2 and 1.1.3 (and in further definitions to come) are often called inductive definitions. An inductive definition is given by a set of rules. The least fixed point of an inductive definition is the smallest set which is closed under all the rules of the inductive definition. A set is inductively defined if it is the least fixed point of an inductive definition. The important feature of inductively defined sets is that we may prove properties of their elements by induction on the definition, which means the following principle: to show that all elements of some least fixed point M share a property φ it suffices to show that φ is preserved under all the rules in the inductive definition of M. We will use `induction on the definition' over and over again, starting with quite simple situations. Quite easy examples of `induction on the definition' are given in the exercises.
We agree upon the following notations and conventions: Formulas which are built according to the first clause in Definition 1.1.3 are called atomic. A term t with FV(t) = ∅ is called closed.
A formula F with FV(F) = ∅ is called a sentence.
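The simultaneous definitions of terms, formulas, FV and BV above can be mirrored directly in a program that recurses on the inductive structure. The following Python sketch, written for illustration only (the tuple encoding of terms and formulas is an assumption of this sketch, not something the text fixes), computes FV for terms and FV/BV for formulas of the group-theory language.

```python
# Hedged sketch: terms and formulas are encoded as nested tuples, an
# assumption made here for illustration.
# Terms:    ("var", "x") | ("const", "1") | ("fun", "o", t1, ..., tn)
# Formulas: ("pred", "=", t1, ..., tn) | ("not", F) | ("and"/"or"/"imp", F, G)
#           | ("all"/"ex", "x", F)

def FV_term(t):
    """Free variables of a term (Definition 1.1.2)."""
    kind = t[0]
    if kind == "var":
        return {t[1]}                     # FV(x) = {x}
    if kind == "const":
        return set()                      # FV(c) is empty
    # (f t1 ... tn): the union of the FV of the subterms
    return set().union(*(FV_term(s) for s in t[2:]))

def FV_BV(F):
    """Free and bound variables of a formula (Definition 1.1.3)."""
    kind = F[0]
    if kind == "pred":
        return set().union(*(FV_term(t) for t in F[2:])), set()
    if kind == "not":
        return FV_BV(F[1])
    if kind in ("and", "or", "imp"):
        fv1, bv1 = FV_BV(F[1]); fv2, bv2 = FV_BV(F[2])
        return fv1 | fv2, bv1 | bv2
    # quantifier case: FV(QxF) = FV(F) minus {x}, BV(QxF) = BV(F) plus {x}
    fv, bv = FV_BV(F[2])
    assert F[1] not in bv, "x must not occur bound in F"
    return fv - {F[1]}, bv | {F[1]}

# The formula (forall x ((1 o x) = x)) from the text
G = ("all", "x", ("pred", "=", ("fun", "o", ("const", "1"), ("var", "x")),
                 ("var", "x")))
```

For G this yields FV = ∅ and BV = {x}, matching the computation carried out in the text; the assertion in the quantifier case enforces the side condition of clause 3.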
Up to now we have described the objects of interest of the first chapter: first order languages. But in this section we only spoke about the syntax of a first order language: about its alphabet and about its regular expressions. In the next two sections we are going to develop the semantics of first order languages. Section 1.2 is devoted to giving meaning to the sentential connectives ∧, ∨, ¬, →, which so far are only syntactical symbols without any meaning. There we will see that first order languages are powerful enough to represent any truth function (a semantical object, cf. Definition 1.2.1) by some kind of syntactical expression.
Exercises
E 1.1.1. We define the set of permitted words (i.e. non-void finite strings) over the alphabet {M, U, I} by the following inductive definition. There x, y are words and concatenated words are denoted by writing them one behind the other.
1. MI is a permitted word.
2. If xI is a permitted word, so is xIU.
3. If Mx is permitted, so is Mxx.
4. If xIIIy is permitted, so is xUy.
5. If xUUy is permitted, so is xy.
Prove the following claims:
a) MUUIU is a permitted word.
b) MU is not permitted.

E 1.1.2. The set M ⊆ IN is defined inductively by:
1. 2 ∈ M.
2. If n ∈ M, then n + 3 ∈ M.
Prove: n ∈ M iff there is an m ∈ IN with n = 3m + 2.
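Inductively defined sets like the permitted words of E 1.1.1 can also be explored mechanically: start from MI and keep closing the set under the five rules. The bounded breadth-first search below is a sketch written for these notes, not a solution to the exercise (the length bound is an artificial assumption needed for termination, so it cannot settle claim b) by itself). But the invariant it checks, that the number of I's in a permitted word is never divisible by 3, is exactly the kind of property one proves by induction on the definition, and it rules out MU.

```python
# Sketch: enumerate permitted words of E 1.1.1 up to a length bound.
# The bound is an assumption for termination; the real set is infinite.
def successors(w, max_len):
    out = set()
    if w.endswith("I"):                      # rule 2: xI -> xIU
        out.add(w + "U")
    if w.startswith("M"):                    # rule 3: Mx -> Mxx
        out.add("M" + w[1:] * 2)
    for i in range(len(w) - 2):              # rule 4: xIIIy -> xUy
        if w[i:i + 3] == "III":
            out.add(w[:i] + "U" + w[i + 3:])
    for i in range(len(w) - 1):              # rule 5: xUUy -> xy
        if w[i:i + 2] == "UU":
            out.add(w[:i] + w[i + 2:])
    return {v for v in out if len(v) <= max_len}

def permitted(max_len=12):
    words, frontier = {"MI"}, {"MI"}         # rule 1: MI is permitted
    while frontier:
        frontier = {v for w in frontier
                    for v in successors(w, max_len)} - words
        words |= frontier
    return words

W = permitted()
# Induction on the definition: rules 2 and 5 keep the number of I's,
# rule 3 doubles it, rule 4 subtracts 3 -- so it never becomes divisible
# by 3.  MU contains 0 I's, and 0 is divisible by 3.
assert all(w.count("I") % 3 != 0 for w in W)
assert "MU" not in W
```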
1.2 Truth Functions

Up to now we have only developed the syntax of first order languages. A term or a formula is nothing but a well-formed sequence of letters according to the rules of the respective grammar. Therefore we need to fix the meaning of the letters and of the expressions formed out of the letters. We start by fixing the meaning of the sentential connectives. The purpose of sentential connectives is to connect propositions. A proposition is something which either can be true or false. Thus sentential connectives can be regarded as syntactical counterparts of truth functions, and we have to develop a theory of truth functions. We represent the truth value `true' by t; f stands for `false'.
Definition 1.2.1. An n-ary truth function is a map from {t, f}^n into {t, f}.
Now we have to give a precise meaning to the sentential connectives of colloquial language. That is easy for negation. We define the truth function ¬ : {t, f} → {t, f} by ¬(t) = f and ¬(f) = t. It is also easy for ∧ and ∨. To make their definition more visible we arrange it in the form of truth tables.

  ∧ | t f        ∨ | t f
  --+----        --+----
  t | t f        t | t t
  f | f f        f | t f

These truth tables are to be read in the following way. ∧ and ∨ are binary truth functions. The first argument is in the vertical column left of the vertical line, the second in the horizontal row above the horizontal line. The value stands at the crossing point of the row of the first and the column of the second argument. A bit more subtle to define is the implication →, formalising the colloquial if ... then. The truth table of → is
  → | t f
  --+----
  t | t f
  f | t t

This is the way implication is defined in classical logic. The controversies about the meaning of implication reach back to the times of the Megarians and Stoics. What annoys people is the `ex falso quodlibet', saying that →(f, x) is always true, independently of the second argument x. However, there are other interpretations of the colloquial if ... then statements, leading to different kinds of logic and thus also to different kinds of mathematics. One example is intuitionistic logic, which interprets if A, then B in such a way that a proof of fact A can be converted into one of fact B. In this lecture we will restrict ourselves to the classical interpretation of implication as given in the above truth table. Usual mathematics is based on classical logic, which uses the classical interpretation of implication.
To study the theory of truth functions more generally we are going to introduce a formal language. The alphabet consists of
propositional variables, denoted by a, b, a0, ...
all connectives, i.e. names for all truth functions.
Now we are able to build up expressions only with respect to their sentential structure. Think of propositional variables as variables for propositions (or formulas). The expressions built up by this means alone are called sentential forms.
Definition 1.2.2. We define the sentential forms inductively as follows.
1. Every propositional variable is a sentential form.
2. If A1, ..., An are sentential forms and ϕ is an n-ary connective (i.e. a name for an n-ary truth function), then (ϕA1 ... An) is a sentential form.
Sentential forms are denoted by A, B, C, A0, ...
Think of sentential forms built up by ¬, ∧ and ∨. Then (using infix notation)
(¬a)   (((¬a) ∧ b) ∨ a)
are examples of sentential forms. Now let's make some conventions to spare parentheses:
outer parentheses will be cancelled;
¬ binds more strongly than ∧, ∨, →, and ∧, ∨ bind more strongly than →, i.e. we will write ¬A ∧ B → C ∨ ¬C instead of (((¬A) ∧ B) → (C ∨ (¬C)));
we will write A1 → A2 → ... → An instead of (A1 → (A2 → (... → An) ...)). This will also be the case if we replace → by ∧ or ∨.
Now we want to formalise the idea that propositional variables are variables for propositions (which either can be true or false). Therefore we assume that a truth value is associated with every propositional variable. Then we are able to determine the truth value of a sentential form by successive application of the corresponding truth functions.
A boolean assignment is a map B : A → {t, f}, where A denotes the set of propositional variables. Boolean assignments are denoted by B, B0, ... We define the value B(A) of a sentential form A induced by a boolean assignment B as follows.
Definition 1.2.3. We define B(A) for sentential forms A inductively, according to the definition of the sentential forms.
1. If A ∈ A, then B(A) is already given by the assignment.
2. If A = (ϕA1 ... An), where the connective ϕ names an n-ary truth function, then B(A) is obtained by applying that truth function to B(A1), ..., B(An), i.e. B(A) = ϕ(B(A1), ..., B(An)).
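Definition 1.2.3 is an evaluation by recursion on the inductive definition of sentential forms, and it can be turned directly into a program. In the Python sketch below, written for illustration (the tuple encoding of sentential forms is an assumption of the sketch), t and f are modelled by True and False.

```python
# Sketch: compute B(A) as in Definition 1.2.3, with t/f as True/False.
# Sentential forms are nested tuples, an encoding assumed here:
#   "a"                      -- a propositional variable
#   ("not", A)               -- negation
#   ("and"/"or"/"imp", A, B) -- binary connectives
TRUTH_FUNCTIONS = {
    "not": lambda x: not x,
    "and": lambda x, y: x and y,
    "or":  lambda x, y: x or y,
    "imp": lambda x, y: (not x) or y,   # the classical implication table
}

def value(A, B):
    """Compute B(A) by induction on the definition of A."""
    if isinstance(A, str):               # clause 1: given by the assignment
        return B[A]
    phi, *args = A                       # clause 2: apply the truth function
    return TRUTH_FUNCTIONS[phi](*(value(arg, B) for arg in args))

# The example computed in the text: B(a) = f, B(b) = t
B = {"a": False, "b": True}
form = ("or", ("and", ("not", "a"), "b"), "a")
```

Here value(form, B) returns True, matching the computation B(((¬a) ∧ b) ∨ a) = t carried out in the text.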
If B is a boolean assignment with B(a) = f and B(b) = t, then we obtain in the above example
B(¬a) = ¬(B(a)) = t
B(((¬a ∧ b) ∨ a)) = ((¬B(a)) ∧ B(b)) ∨ B(a) = ((¬f) ∧ t) ∨ f = t.
Now we give a first example of using `induction on the definition'.
Proposition 1.2.4. If A is a sentential form and B is a boolean assignment, then B(A) ∈ {t, f}.
Proof by induction on the definition of `A is a sentential form'.
1. If A ∈ A, then B(A) ∈ {t, f} according to the definition of a boolean assignment.
2. If A = (ϕA1 ... An), then we have (B(A1), ..., B(An)) ∈ {t, f}^n by the induction hypothesis (which applies because A1, ..., An are previously defined sentential forms). Since ϕ is a name for an n-ary truth function and B(A) = ϕ(B(A1), ..., B(An)), it is B(A) ∈ {t, f}.
Let A be a sentential form and {a1, ..., an} the set of propositional variables occurring in A. It is obvious from Definition 1.2.3 that B(A) only depends on B↾{a1, ..., an} (i.e. B restricted to the finite set {a1, ..., an}). There are only 2^n many boolean assignments which differ on {a1, ..., an}. This means that there is an obvious algorithm for computing B(A), which consists in writing down B(a1), ..., B(an) for the 2^n many assignments which differ on {a1, ..., an} and then computing B(A) according to the truth tables for the functions represented by the connectives occurring in A. For a precise formalisation of this fact cf. the appendix (Lemma A.1.11).
Definition 1.2.5. Let A and B be sentential forms. We say that A and B are sententially equivalent, written as A ≡ B, if B(A) = B(B) for every boolean assignment B.
Proposition 1.2.6. ≡ is an equivalence relation on the sentential forms, i.e. we have
A ≡ A
A ≡ B entails B ≡ A
A ≡ B and B ≡ C entail A ≡ C.
The following proposition gives a list of equivalent sentential forms.
Proposition 1.2.7. a) A ∧ B ≡ B ∧ A, A ∨ B ≡ B ∨ A.
b) ¬(A ∧ B) ≡ ¬A ∨ ¬B, ¬(A ∨ B) ≡ ¬A ∧ ¬B.
c) ¬(¬A) ≡ A.
d) (A ∧ B) ∧ C ≡ A ∧ (B ∧ C), (A ∨ B) ∨ C ≡ A ∨ (B ∨ C).
e) (A ∧ B) ∨ C ≡ (A ∨ C) ∧ (B ∨ C), (A ∨ B) ∧ C ≡ (A ∧ C) ∨ (B ∧ C).
f) A → B ≡ ¬A ∨ B.
The proofs are obtained by mere computation of both sides. At this point we want to single out some connectives, respectively some truth functions, by which all other connectives, respectively truth functions, can be represented, in the way → is represented in Proposition 1.2.7 by ¬ and ∨.
Definition 1.2.8.
a) A sentential form A1 ∧ … ∧ An, in which every Ai (i = 1,…,n) is either a propositional variable or of the form ¬ai for ai ∈ A, is a pure conjunction.
b) Dually, a sentential form A1 ∨ … ∨ An, where the Ai (i = 1,…,n) are as above, is called a pure disjunction (or clause).
c) A sentential form A1 ∧ … ∧ An, where all the Ai (i = 1,…,n) are pure disjunctions, is a conjunctive normal form.
d) Dually, a sentential form A1 ∨ … ∨ An with pure conjunctions Ai (i = 1,…,n) is a disjunctive normal form.
The aim of the following theorem is to obtain an equivalent disjunctive normal form for arbitrary sentential forms. How the normal form can be computed (though not in the general situation) is performed in the following example:
(a ∨ ¬c) ∧ (b ∨ c) ≡ (a ∧ (b ∨ c)) ∨ (¬c ∧ (b ∨ c))
≡ (a ∧ b) ∨ (a ∧ c) ∨ (¬c ∧ b) ∨ (¬c ∧ c)
≡ (a ∧ b) ∨ (a ∧ c) ∨ (¬c ∧ b)
using Proposition 1.2.7 and the fact that B(¬c ∧ c) = f for all boolean assignments B.
Theorem 1.2.9. Let A be a sentential form. Then there is a disjunctive normal form B such that A ≡ B.
Proof. Let A be a sentential form. Then there are only finitely many propositional variables, say a1,…,an, occurring in A. We have 2ⁿ boolean assignments B_1,…,B_2ⁿ which differ on {a1,…,an}. Now we define sentential forms
A_ik = a_k if B_i(a_k) = t, and A_ik = ¬a_k if B_i(a_k) = f,
for i = 1,…,2ⁿ and k = 1,…,n. Here we have B_i(A_ik) = t, and for i ≠ j there is a k ∈ {1,…,n} such that B_i(a_k) ≠ B_j(a_k). This entails
B_j(A_jk) ≠ B_i(A_jk),
and since B_j(A_jk) = t we have B_i(A_jk) = f. Fitting parts together, we have for the pure conjunctions
C_i = A_i1 ∧ … ∧ A_in, i = 1,…,2ⁿ,
the fact that B_i(C_j) = t iff i = j, since B_i(C_j) = t just in the case that B_i(A_jk) = t for all k = 1,…,n. Without loss of generality we may assume that we have numbered the boolean assignments in such a way that B_i(A) = t for i = 1,…,m and B_j(A) = f for j = m+1,…,2ⁿ. If m = 0, define B = a_0 ∧ ¬a_0. Then B is a disjunctive normal form with A ≡ B, since B(A) = f = B(B) for all boolean assignments B. If m ≠ 0, define the disjunctive normal form B = C_1 ∨ … ∨ C_m. Then for i = 1,…,m
B_i(B) = B_i(C_i) = t = B_i(A),
and for j = m+1,…,2ⁿ
B_j(B) = f = B_j(A),
since B_j(C_1) = … = B_j(C_m) = f. So we have for all i = 1,…,2ⁿ
B_i(A) = B_i(B),
and we can conclude A ≡ B.
Corollary 1.2.10. For any sentential form A there is a conjunctive normal form B such that A ≡ B.
Proof. To prove the corollary we observe that, in view of Proposition 1.2.7, the negation of a disjunctive normal form is equivalent to a conjunctive normal form and vice versa. Thus choose a disjunctive normal form B0 equivalent to ¬A, which exists by 1.2.9. Then by 1.2.7, A ≡ ¬¬A ≡ ¬B0, which by the above remark is equivalent to a conjunctive normal form.
Definition 1.2.11. Let M be a set of connectives, i.e. names for some fixed truth functions. We call M complete if for every sentential form there is an equivalent sentential form only containing connectives from M.
So we obtain as another immediate corollary of Theorem 1.2.9:
Theorem 1.2.12. {¬, ∧, ∨}, {¬, ∨} and {¬, ∧} are complete sets of connectives.
Proof. From Theorem 1.2.9 we see that {¬, ∧, ∨} is complete. But according to 1.2.7 we can express ∧ by ¬ and ∨, and ∨ by ¬ and ∧.
At this point we have obtained a justification for taking only the connectives ∧, ∨, ¬, → into the alphabet of a first order language (cf. Definition 1.1.1), since every other connective can be represented by them.
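The proof of Theorem 1.2.9 is effective and can be turned into a small program. The sketch below (Python; the tuple encoding of forms and all names are ours) enumerates the 2ⁿ assignments and collects one pure conjunction C_i for each assignment making A true, exactly as in the proof:

```python
from itertools import product

def value(form, B):
    # evaluator as in Definition 1.2.3 (variables are strings,
    # compound forms are tuples (connective, subform, ...))
    if isinstance(form, str):
        return B[form]
    op, *subs = form
    v = [value(s, B) for s in subs]
    return {"not": lambda: not v[0],
            "and": lambda: v[0] and v[1],
            "or":  lambda: v[0] or v[1]}[op]()

def variables(form, acc=None):
    acc = set() if acc is None else acc
    if isinstance(form, str):
        acc.add(form)
    else:
        for s in form[1:]:
            variables(s, acc)
    return acc

def dnf(form):
    """Disjunctive normal form of `form`, following the proof of Theorem 1.2.9."""
    vs = sorted(variables(form))
    disjuncts = []
    for bits in product([True, False], repeat=len(vs)):
        B = dict(zip(vs, bits))
        if value(form, B):                                  # B_i(A) = t
            lits = [v if B[v] else ("not", v) for v in vs]  # C_i = A_i1 ^ ... ^ A_in
            conj = lits[0]
            for lit in lits[1:]:
                conj = ("and", conj, lit)
            disjuncts.append(conj)
    if not disjuncts:                      # the case m = 0: take a0 ^ (not a0)
        a0 = vs[0] if vs else "a0"
        return ("and", a0, ("not", a0))
    res = disjuncts[0]
    for d in disjuncts[1:]:
        res = ("or", res, d)
    return res
```

By construction the result agrees with the input form on every boolean assignment, which can be checked by brute force over all 2ⁿ assignments.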
Exercises
E 1.2.1 (Sheffer stroke). The binary truth function | : {t, f}² → {t, f} is given by:

  |  | t | f
  t  | f | t
  f  | t | t

One may think of | as a connective. Prove that {|} is a complete set of connectives. Define the connectives ¬, →, ∨, ∧ using only |.
E 1.2.2. The binary truth function ↓ : {t, f}² → {t, f} is given by:

  ↓  | t | f
  t  | f | f
  f  | f | t

One may think of ↓ as a connective. Is {↓} complete?
E 1.2.3. Let {∘1,…,∘n} be a complete set of connectives. Prove or disprove that {¬∘1,…,¬∘n} is complete.
E 1.2.4. Prove: {∧, ∨} is not complete.
E 1.2.5. Is {∧, ∨, →} a complete set of connectives?
E 1.2.6. A king puts a prisoner to a severe test. He orders the prisoner into a room with two doors. Behind each door there may be a tiger or a princess. Choosing the door with the princess, the prisoner will be set free; otherwise he will be torn into pieces by the tiger. Knowing that the prisoner is a logician, the king has mounted a sign on each of the two doors:

The choice of the room doesn't make any difference.

The princess is in the other room.

He gives the following information to the prisoner: "If the princess is behind the door on the left hand side, the sign at that door is true. If the tiger is there, it is false. With the other door it is just the other way round."
Which door should the prisoner choose?
a) Formalise the exercise.
b) Determine the equivalent disjunctive normal form and use it to help the prisoner to make his decision.
1.3 Semantics for First Order Logic

Having discussed the meaning of the sentential connectives, we can now turn to fixing the meaning of the terms and formulas introduced in section 1.1. The first step in that direction is to tell the meaning of the non-logical symbols of a first order language.
Definition 1.3.1. An L(C, F, P)-structure is given by a quadruple
S = (S, C, F, P)
satisfying the following properties:
1. S is a non-void set. It is called the domain of the structure.
2. C = {c^S : c ∈ C} ⊆ S.
3. F = {f^S : f ∈ F} is a set of functions on S such that f^S : Sⁿ → S if #f = n.
4. P = {P^S : P ∈ P} is a set of predicates on S, i.e. for P ∈ P with #P = n it is P^S ⊆ Sⁿ.
Let's give an easy example. Let L_GT = L(C_GT, F_GT, P_GT) be the language of group theory. In an L_GT-structure we have interpretations for 1, ∘ and =. Thus, if G = (G, 1^G, ∘^G) is a group, we have an L_GT-structure with domain G, interpreting
1 by 1^G, which is 1^G ∈ G,
∘ by the function ∘^G, which is ∘^G : G² → G, (x, y) ↦ x ∘^G y,
= by the predicate =^G, which is {(x, x) : x ∈ G} ⊆ G².
Thus every group is an L_GT-structure. But we also have very strange L_GT-structures, e.g. the following one:
the domain of the structure is IN;
∘ is interpreted by the function f : IN² → IN, (x, y) ↦ 2x;
= is interpreted by the predicate {(x, y) ∈ IN² : x < y}.
This structure has of course nothing to do with a group. Now we are going to give a meaning to the syntactical material of section 1.1 (i.e. terms and formulas) with respect to a given structure, i.e. with respect to the meaning of the non-logical symbols. In a first step we are going to assign elements of the domain of the structure to the variables. By an assignment for an L(C, F, P)-structure S = (S, C, F, P) we understand a map ρ : V → S, where V denotes the set of object variables. Assignments are denoted by ρ, ρ′,… Now let L = L(C, F, P) be a first order language, S = (S, C, F, P) an L-structure and ρ an assignment for S. By ρ we have interpreted the variables in S. Now we can lift the interpretation to all L-terms.
Definition 1.3.2. The value t^S[ρ] of an L-term t in the L-structure S with respect to the assignment ρ is defined by induction on the definition of the L-terms as follows:
1. If t is the variable x, then t^S[ρ] = ρ(x).
2. If t = c, then t^S[ρ] = c^S.
3. If t = (f t1 … tn), then t^S[ρ] = f^S(t1^S[ρ],…,tn^S[ρ]).
For an example regard the group G = (Z, 0, +) of the integer numbers as an L_GT-structure. Take the term t = (1 ∘ x) ∘ y and an assignment ρ for G with ρ(x) = 5 and ρ(y) = −3. Then t^G[ρ] = 2.
Proposition 1.3.3. Let S be an L-structure and ρ an S-assignment.
a) t^S[ρ] ∈ S for any L-term t.
b) If t is an L-term with FV(t) = ∅, then t^S[ρ] = t^S[σ] for all S-assignments ρ and σ. In this case we write briefly t^S.
Proof. We only prove the first part at full length. This is an induction on the definition of `t is an L-term'. Therefore we have the following cases:
t = x: then t^S[ρ] = ρ(x) ∈ S by the definition of ρ.
t = c ∈ C: then t^S[ρ] = c^S ∈ S by the definition of c^S.
t = (f t1 … tn) with L-terms t1,…,tn and f ∈ F: by the definition of f^S it is f^S : Sⁿ → S, and by the induction hypothesis we have t1^S[ρ],…,tn^S[ρ] ∈ S. So t^S[ρ] = f^S(t1^S[ρ],…,tn^S[ρ]) ∈ S. This finishes the induction.
The proof of the second part is an induction on the definition of `t is an L-term', too. There one need not consider the case t = x because of FV(t) = ∅.
In the next step we are going to define the truth value Val_S(F, ρ) of an L-formula F under an assignment ρ for S. To simplify the definition we introduce
σ ∼_x ρ :⟺ ∀y ∈ V(x ≠ y → σ(y) = ρ(y)).
This means that the assignments σ and ρ differ at most at the variable x.
Definition 1.3.4. We define Val_S(F, ρ) by induction on the definition of the formulas.
1. If F is an atomic formula (P t1 … tn), we put Val_S(F, ρ) = t if (t1^S[ρ],…,tn^S[ρ]) ∈ P^S, and f otherwise.
2. Val_S(¬F0, ρ) = ¬(Val_S(F0, ρ))
3. Val_S(F1 ∧ F2, ρ) = Val_S(F1, ρ) ∧ Val_S(F2, ρ)
4. Val_S(F1 ∨ F2, ρ) = Val_S(F1, ρ) ∨ Val_S(F2, ρ)
5. Val_S(∀xF0, ρ) = t if Val_S(F0, σ) = t for all σ ∼_x ρ, and f otherwise
6. Val_S(∃xF0, ρ) = t if Val_S(F0, σ) = t for some σ ∼_x ρ, and f otherwise
Instead of Val_S(F, ρ) = t we commonly write S ⊨ F[ρ]. Thus S ⊭ F[ρ] means Val_S(F, ρ) = f. To make clauses 5 and 6 in Definition 1.3.4 better conceivable, we are going to prove that both clauses meet the intuitive understanding of the quantifiers ∀ and ∃ (cf. Lemma 1.3.7). Though more perspicuous, this
alternative formulation has the disadvantage that it needs a bigger apparatus. We denote by Fx (t) the string which is obtained from the string F by replacing all occurrences of x by t, similar for sx (t): Proposition 1.3.5. If F is an L-formula and t is an L-term such that FV(t) \ BV(F) = and x 2= BV(F) then Fx(t) is a formula with FV(Fx (t)) FV(t) (FV(F) n fxg) and BV(Fx (t)) BV(F ): Proof. This is left to the reader as an exercise. From now on we always tacitly assume that FV(t) \ BV(F) = and x 2= BV(F) whenever we write Fx(t). Lemma 1.3.6. Let s t be L-terms, F an L-formula and S an L-structure. If and are S -assignments such that x and (x) = tS ] then sS ] = sx (t)S ] and ValS (F ) = ValS (Fx (t) ): Proof. First we show sS ] = sx(t)S ] by induction on the de nition of s. If s = y 6= x, then sS ] = (y) = (y) because x . If s = x, then sS ] = (x) = tS ] = sx (t)S ]: If s = c 2 C , then sS ] = cS = sx (t)S ]: If s = (fs1 : : :sn ), then by the induction hypothesis sS ] = f S (sS1 ] : : : sSn ]) = f S (s1x (t)S ] : : : snx(t)S ]) = sx (t)S ]: Next we show ValS (F ) = ValS (Fx (t) ) by induction on the de nition of F. If F = (Ps1 : : :sn ), then S j= F] i (s1S ] : : : sSn ]) 2 P S which holds i (s1x (t)S ] : : : snx(t)S ]) 2 P S by the rst part. But this means S j= Fx (t)].
If F = :F0, then ValS (F ) = : (ValS (F0 )) = : (ValS F0x(t) ) = ValS (:F0x(t) ) where the equation between the second and the third term holds by induction hypothesis. In the following we will indicate this by writing: =i:h: . If F = (F1 F2), where is a connective ^ _ !, then ValS (F ) = ValS (F1 ) ValS (F2 ) =i:h: ValS (F1x(t) ) ValS (F2x(t) ) = ValS (Fx (t) ):
If F = 8yG and S j= F] then S j= G0] for all 0 y . We have x 2= BV(F). Thus x = 6 y. Let 0 y . Then de ne ( 0 6 x (z) = (z) for z = (z) for z = x:
Then y because for z 6= y we have (z) = 0 (z) = (z) = (z) if z 6= x and (z) = (z) for z = x. Thus we have S j= G ]: We have x 0 by de nition and obtain (x) = (x) = tS ] = tS 0 ] since y 0 and y 2= FV(t) because of y 2 BV(F) and FV(t) \ BV(F) = : Hence S j= Gx(t)0 ] by induction hypothesis. Since 0 was an arbitrary assignment such that 0 y this entails S j= 8yGx (t)]: For the opposite direction assume S j= 8yGx (t)]: Let 0 y : De ne (0 (z) = (z) for z 6= x (z) for z = x:
Then y because we have for z 6= x (z) = 0(z) = (z) = (z) and (z) = (z) for z = x: Hence S j= Gx(t) ]: But x 0 and 0(x) = (x) = tS ] = tS ] because BV(F ) \ FV(t) = and y 2 BV(F). Thus S j= G0] by the induction hypothesis. Since 0 was arbitrary with 0y this means S j= 8yG: The case that F = 9yG is similar and left to the reader. If S = (S C F P) is an L-structure we may extend L = L(C F P ) to a language LS = L(C S F P ) where S = fs : s 2 S g and expand S to an LS -structure SS = (S C S F P) where sSS = s: It is obvious that any SS -assignment is also an S -assignment and vice versa because an assignment only depends on the domain of a structure. Thus LS is obtained from L by giving `names' to the elements of S. Here look at an easy example. Let L = LGT the language of group theory and S = (Z 0 +) be the group of the integer numbers. Then S is the set fz : z 2 Zg where z is nothing but a new constant symbol. In the expanded structure SS we interpret z (which is thought to be a name for the object z) by the object z: Lemma 1.3.7. Let F be an L-formula and S an L-structure. Then: a) S j= 8xF] i SS j= Fx(s)] for all s 2 S: b) S j= 9xF] i SS j= Fx(s)] for some s 2 S: Proof. Before we start the proof we make the general observation that S j= 8xF] i SS j= 8xF] because F is an L-formula (cf. Exercise E 1.3.5). To show the direction from left to right in a) assume SS j= 8xF ] and choose an arbitrary s 2 S: De ne
(
(y) = (y) if y 6= x s if y = x
Then σ ∼_x ρ and σ(x) = s = s^{S_S}[σ]. Hence S_S ⊨ F[σ], which entails S_S ⊨ F_x(s)[ρ] by Lemma 1.3.6. For the opposite direction assume S_S ⊨ F_x(s)[ρ] for all s ∈ S. Let σ ∼_x ρ and set s = σ(x). Then S_S ⊨ F_x(s)[ρ], which by 1.3.6 entails S_S ⊨ F[σ]. Hence S ⊨ ∀xF[ρ]. The proof of b) runs analogously.
We see from 1.3.7 that we really captured the colloquial meaning of ∀ and ∃ by Definition 1.3.4. Before we continue to investigate the semantical properties we introduce some frequently used phrases.
Definition 1.3.8. Let L be a first order language and M a set of L-formulas.
a) M is satisfiable in S if there is an S-assignment ρ such that S ⊨ F[ρ] for all F ∈ M.
b) M is valid in S if S ⊨ F[ρ] for all F ∈ M and all S-assignments ρ.
c) M is satisfiable or consistent if there is an L-structure S such that M is satisfiable in S.
d) M is valid if M is valid in every L-structure S.
For a formula F we denote by S ⊨ F that F is valid in S and by ⊨ F that F is valid.
Let's illustrate this definition by some examples. Therefore let L_GT be the language of group theory and G a group (which is an L_GT-structure, too). M = {(x = 1)} is satisfiable in G, because if we take a G-assignment ρ with ρ(x) = 1^G we have G ⊨ x = 1[ρ].
If G is not a group with only one element, M = {(x = 1)} is not valid in G, because if we take g ∈ G with g ≠ 1^G and a G-assignment ρ with ρ(x) = g, we have G ⊭ x = 1[ρ]. Now let Ax_GT be the set of `axioms' of group theory, i.e. the set of the following formulas:
∀x∀y∀z(x ∘ (y ∘ z) = (x ∘ y) ∘ z)
∀x(x ∘ 1 = x)
∀x∃y(x ∘ y = 1)
Thus the L_GT-structures S interpreting = by {(x, x) : x ∈ S} in which Ax_GT is valid are just the groups. Ax_GT ∪ {∀x∀y(x ∘ y = y ∘ x)} is consistent because there are commutative groups. It follows from Definition 1.3.4 that Val_S(F, ρ) only depends on ρ↾FV(F). Thus we have the following property:
Proposition 1.3.9. Let S be an L-structure, F an L-formula and ρ and σ S-assignments such that ρ↾FV(F) = σ↾FV(F). Then Val_S(F, ρ) = Val_S(F, σ).
It follows from 1.3.9 that Val_S(F, ρ) does not depend on ρ if F is a sentence. Thus sentences have a fixed truth value in an L-structure S. That is the reason for calling them sentences. For sentences there is no difference between satisfiability and validity with respect to a fixed structure: if F is an L-sentence, then F is satisfiable in an L-structure S iff it is valid in S. If F is an L-formula with FV(F) ⊆ {x1,…,xn} and ρ an assignment such that ρ(xi) = ai, then we often write S ⊨ F[a1,…,an] instead of S ⊨ F[ρ]. According to 1.3.9, Val_S(F, ρ) is determined by a1,…,an.
Definition 1.3.10. Let L be a first order language and F, G L-formulas. We say that F and G are semantically equivalent if Val_S(F, ρ) = Val_S(G, ρ) for any L-structure S and any S-assignment ρ. We denote semantical equivalence by F ≡_S G.
Lemma 1.3.11. We have F ≡_S G iff ⊨ (F → G) ∧ (G → F).
Proof. If F ≡_S G, then Val_S(F, ρ) = Val_S(G, ρ) for any structure S and any S-assignment ρ. According to the truth table of →, this entails Val_S(F → G, ρ) = t and Val_S(G → F, ρ) = t. Hence ⊨ (F → G) ∧ (G → F). For the opposite direction assume Val_S(F, ρ) ≠ Val_S(G, ρ) for some S and S-assignment ρ. If Val_S(F, ρ) = t, then Val_S(F → G, ρ) = f, hence ⊭ (F → G) ∧ (G → F). If Val_S(F, ρ) = f, then Val_S(G → F, ρ) = f, and again we get ⊭ (F → G) ∧ (G → F).
We define the connective ↔ by F ↔ G = (F → G) ∧ (G → F). This means that the truth function interpreting ↔ is given by this combination of the truth functions for conjunction and implication. Then 1.3.11 reads as: F ≡_S G iff ⊨ F ↔ G.
Proposition 1.3.12. The semantical equivalence of L-formulas is an equivalence relation on the L-formulas. The following proposition gives a list of semantically equivalent formulas.
Proposition 1.3.13.
a) F ∧ G ≡_S G ∧ F, F ∨ G ≡_S G ∨ F
b) ¬(F ∧ G) ≡_S ¬F ∨ ¬G, ¬(F ∨ G) ≡_S ¬F ∧ ¬G
c) ¬(¬F) ≡_S F
d) (F ∧ G) ∧ H ≡_S F ∧ (G ∧ H), (F ∨ G) ∨ H ≡_S F ∨ (G ∨ H)
e) (F ∧ G) ∨ H ≡_S (F ∨ H) ∧ (G ∨ H), (F ∨ G) ∧ H ≡_S (F ∧ H) ∨ (G ∧ H)
f) F → G ≡_S ¬F ∨ G
g) ¬(∃xF) ≡_S ∀x(¬F), ¬(∀xF) ≡_S ∃x(¬F)
Proof. Claims a) to f) obviously do not depend on the quantifier structure of the formulas involved. Thus a) to f) follow from 1.2.7 just because of the propositional structure of the involved formulas. We are going to study the propositional structure and properties of first order formulas in the next section; the precise argument will be given by Corollary 1.4.6. Thus all we have to check is g). Assume S ⊨ ¬∃xF[ρ] for some L-structure S and an S-assignment ρ. Then S ⊭ ∃xF[ρ], which says that there is no S-assignment σ ∼_x ρ such that S ⊨ F[σ], i.e. we have S ⊨ ¬F[σ] for all S-assignments σ ∼_x ρ. Hence S ⊨ ∀x¬F[ρ]. If S ⊨ ∀x¬F[ρ], then S ⊨ ¬F[σ] for all σ ∼_x ρ, which shows that there is no S-assignment σ ∼_x ρ such that S ⊨ F[σ]. Hence S ⊨ ¬∃xF[ρ]. The second part follows from the first by the computation
¬∀xF ≡_S ¬∀x¬(¬F) ≡_S ¬(¬∃x¬F) ≡_S ∃x¬F.
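For a finite structure, Definition 1.3.4 is directly executable: the quantifier clauses become a finite conjunction and disjunction over the domain. The following sketch (Python; the encoding of terms and formulas and the example structure are ours, not the text's) evaluates Val_S(F, ρ) in the group Z/3:

```python
def val(form, S, rho):
    """Truth value of `form` in the structure S under the assignment rho.

    S is a dict with keys "dom" (a finite set), "funcs" and "preds";
    terms are variables (strings) or tuples (f, t1, ..., tn)."""
    def term(t, rho):
        if isinstance(t, str):
            return rho[t]
        f, *args = t
        return S["funcs"][f](*(term(a, rho) for a in args))

    kind = form[0]
    if kind == "atom":                       # clause 1 of Definition 1.3.4
        _, P, *ts = form
        return tuple(term(t, rho) for t in ts) in S["preds"][P]
    if kind == "not":
        return not val(form[1], S, rho)
    if kind == "and":
        return val(form[1], S, rho) and val(form[2], S, rho)
    if kind == "or":
        return val(form[1], S, rho) or val(form[2], S, rho)
    if kind in ("all", "ex"):                # clauses 5 and 6, finitised
        _, x, body = form
        vals = (val(body, S, {**rho, x: s}) for s in S["dom"])
        return all(vals) if kind == "all" else any(vals)
    raise ValueError(kind)

# the group Z/3 as a structure for the language of group theory
Z3 = {"dom": {0, 1, 2},
      "funcs": {"1": lambda: 0, "*": lambda x, y: (x + y) % 3},
      "preds": {"=": {(s, s) for s in {0, 1, 2}}}}

# S |= forall x exists y (x * y = 1)  -- existence of inverses
F = ("all", "x", ("ex", "y", ("atom", "=", ("*", "x", "y"), ("1",))))
print(val(F, Z3, {}))   # True
```

Modifying {**rho, x: s} realises exactly the assignments σ ∼_x ρ of the definition, restricted to the finite domain.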
Exercises

E 1.3.1.
a) Prove: ⊨ ∀x(F ∧ G) → ∀xF ∧ ∀xG.
b) Let L be a first order language including a constant symbol 0. Determine L-formulas F and G with ⊭ ∀x(F ∨ G) → ∀xF ∨ ∀xG.
E 1.3.2. Let L be a first order language and P a predicate symbol of L. Which of the following formulas are valid?
a) (F → G) → ((F → ¬G) → ¬F)
b) ∀xF → ∃xF
c) ∀y∃xPyx → ∃x∀yPyx
d) ∃xF ∧ ∃xG → ∃x(F ∧ G)
E 1.3.3.
a) Let S be an L-structure. Prove that if S ⊨ G → F and x ∉ FV(G), then one has S ⊨ G → ∀xF.
b) Is the condition x ∉ FV(G) necessary? Prove your claim.
E 1.3.4. Prove Proposition 1.3.5. Hint: Show first that for a term s also s_x(t) is a term with FV(s_x(t)) ⊆ FV(t) ∪ (FV(s) \ {x}). Do we have in general FV(F_x(t)) = FV(t) ∪ (FV(F) \ {x})?
E 1.3.5. Let F be an L-formula, S an L-structure and ρ an S-assignment. Prove: S ⊨ F[ρ] ⟺ S_S ⊨ F[ρ].
E 1.3.6.
a) Determine a first order language L_VS suited for talking about a vector space and its field. Hint: use a binary predicate symbol `='.
b) Formulate a theory (a set of sentences) T_VS in the language L_VS such that for all L_VS-I-structures S (i.e. L_VS-structures interpreting =^S by {(s, s) : s ∈ S}, cf. also section 1.10) one has: S ⊨ T_VS ⟺ S consists of a field and a vector space over this field.
c) Define the L_VS-I-structure S of the continuous functions over the field R of the real numbers.
d) Determine L_VS-formulas F and G such that the following holds in all L_VS-I-structures S with S ⊨ T_VS:
1. s1,…,sn ∈ S are linearly independent ⟺ S ⊨ F[s1,…,sn].
2. s1,…,sn ∈ S form a vector space basis ⟺ S ⊨ G[s1,…,sn].
E 1.3.7.
a) Let S be an L-structure and ρ an S-assignment. Prove: S ⊨ ∃xF[ρ] ⟺ S ⊨ ∃yF_x(y)[ρ] if y ∉ FV(F) ∪ BV(F).
b) Is the condition y ∉ FV(F) ∪ BV(F) in a) necessary? Prove your claim.
c) Let S be an L-structure and ρ an S-assignment. Now let F and F̃ be two L-formulas which are obtained from each other by renaming bound variables. Prove: S ⊨ F[ρ] ⟺ S ⊨ F̃[ρ].
E 1.3.8. Let F be an L-formula with FV(F) = {x1,…,xn}. Prove that for any L-structure S it is S ⊨ F ⟺ S ⊨ ∀x1…∀xn F.
E 1.3.9. Let LIN = (0, 1, +,

1.8 A Cut Free Calculus for First Order Logic
Definition 1.8.10.
a) A number sequence is a map σ : n → ω, where {0,…,n−1} = n < ω. We call dom(σ) = n also the length of σ. By ω
Now we are going to visualise a tree as follows:

    ⟨0,0,0⟩  ⟨0,0,1⟩              ⟨2,5,0⟩  ⟨2,5,3⟩
        \      /                      \      /
         ⟨0,0⟩                         ⟨2,5⟩
            \          ⟨1,0⟩             |
            ⟨0⟩          |              ⟨2⟩
               \        ⟨1⟩            /
                 \       |           /
                        ⟨⟩
Here we have the topmost nodes ⟨0,0,0⟩, ⟨0,0,1⟩, ⟨1,0⟩, ⟨2,5,0⟩, ⟨2,5,3⟩.
Lemma 1.8.11 (Induction on a well-founded tree). Let T be a well-founded tree. Assume
(1.4) ∀σ(∀n(σ⌢⟨n⟩ ∈ T → φ(σ⌢⟨n⟩)) → φ(σ)).
Then ∀σ ∈ T φ(σ), where φ(σ) is any `property' of the node σ.
Proof. Assume that there is a σ ∈ T such that ¬φ(σ). We define an α ∈ ω^ω such that α is an infinite path in T with ¬φ(α↾n) for all n ≥ dom(σ). Put m = dom(σ) and α(k) = σ(k) for k < m. For n ≥ m we define α(n) by recursion on n: by the induction hypothesis we have ¬φ(α↾n); thus by (1.4) the set
M = {k < ω : (α↾n)⌢⟨k⟩ ∈ T ∧ ¬φ((α↾n)⌢⟨k⟩)}
is not empty. Define α(n) = min M. Thus α↾n ∈ T for all n < ω, which contradicts the
well-foundedness of T.
For the definition of the search tree we need the hypothesis that L is a countable language. Assume that t0, t1,… is an enumeration of all L-terms.
Definition 1.8.12 (The search tree S_Δ). Let L be a countable language and Δ a finite list of L_T-formulas. We define the search tree S_Δ together with a labeling map which assigns a finite list Δ(σ) of formulas to each node σ of S_Δ.
1. ⟨⟩ ∈ S_Δ and Δ(⟨⟩) = Δ.
2. If σ ∈ S_Δ and Δ(σ) is either irreducible or an L-axiom (when viewed as a set), then σ is topmost in S_Δ.
Thus, for the following definitions, assume that σ ∈ S_Δ and Δ(σ) is neither an axiom nor irreducible. Let R be the distinguished redex in Δ(σ). We have the following cases:
3. R = (F0 ∧ F1): then σ⌢⟨i⟩ ∈ S_Δ for i ∈ {0, 1} and Δ(σ⌢⟨i⟩) = Δ(σ)^r, F_i.
4. R = (F0 ∨ F1): then σ⌢⟨0⟩ ∈ S_Δ and Δ(σ⌢⟨0⟩) = Δ(σ)^r, F0, F1.
5. R = ∀xF: then σ⌢⟨0⟩ ∈ S_Δ and Δ(σ⌢⟨0⟩) = Δ(σ)^r, F_x(y), where y is the first variable (in the fixed enumeration of the terms) which does not occur in Δ_σ = Δ(σ) (i.e. y ∉ FV(Δ_σ) = ⋃_{F ∈ Δ_σ} FV(F)).
6. R = ∃xF: then σ⌢⟨0⟩ ∈ S_Δ and Δ(σ⌢⟨0⟩) = Δ(σ)^r, F_x(t), ∃xF, where t is the first term (in the fixed enumeration) such that F_x(t) does not occur in Δ_σ.
A search path α for Δ is a path through S_Δ. We say that a search path α contains a set of formulas Γ if Γ = Δ(α↾n) for some n ≤ dom(α).
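For quantifier-free formulas the search-tree construction terminates, and (anticipating the principal syntactic and semantic lemmas below) derivability holds iff every search path hits an axiom — so it is a decision procedure for boolean validity. The sketch below (Python; the negation-normal-form tuple encoding and all names are ours, and the "distinguished redex" is simply taken to be the first reducible formula in the list) builds the search tree for the propositional fragment:

```python
def search(delta):
    """Build the search tree for the list `delta` of quantifier-free
    formulas in negation normal form; return True iff every search
    path contains an axiom (i.e. delta is derivable)."""
    def is_axiom(d):
        lits = {f for f in d if f[0] in ("atom", "natom")}
        return any(("natom",) + a[1:] in lits for a in lits if a[0] == "atom")

    def redex(d):
        for i, f in enumerate(d):
            if f[0] in ("and", "or"):
                return i
        return None                        # delta is irreducible

    if is_axiom(delta):
        return True
    i = redex(delta)
    if i is None:
        return False                       # an irreducible, non-axiomatic node
    rest = delta[:i] + delta[i + 1:]
    f = delta[i]
    if f[0] == "and":                      # two successor nodes (clause 3)
        return search(rest + [f[1]]) and search(rest + [f[2]])
    return search(rest + [f[1], f[2]])     # one successor node (clause 4)

# (P and Q) or (not P), written in negation normal form:
P, nP, Q = ("atom", "P"), ("natom", "P"), ("atom", "Q")
print(search([("or", ("and", P, Q), nP)]))   # False: not derivable
print(search([("or", P, nP)]))               # True
```

The first call fails because one search path ends in the irreducible, non-axiomatic node {¬P, Q}; this is exactly the situation exploited by the principal semantic lemma to build a countermodel.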
The search tree S_Δ for Δ = ((x = y) ∧ x = 1, (x = 1) ∨ x = 1) is given by a diagram with root ⟨⟩ labeled Δ: the ∧-redex at ⟨⟩ produces the two successor nodes ⟨0⟩ and ⟨1⟩, whose labels are reduced further (resolving the ∨-redex) to the topmost nodes ⟨0,0⟩ and ⟨1,0⟩, each labeled by the corresponding irreducible list of equations.
Lemma 1.8.13 (Principal syntactic lemma). Let L be countable. Assume that every search path for Δ contains an L-axiom. Then ⊢_T Δ.
Proof. Since every search path for Δ contains an axiom, it is finite. Hence S_Δ is well-founded. We show by bar induction that ⊢_T Δ(σ) holds for all σ ∈ S_Δ. Our first observation is:
(A) If P is atomic and P ∈ Δ(σ) for some σ ∈ S_Δ, then P ∈ Δ(τ) for all τ ∈ S_Δ such that σ ⊆ τ.
This is immediate by definition, since P can never be cancelled. To prove the lemma we have the induction hypothesis
σ⌢⟨n⟩ ∈ S_Δ ⟹ ⊢_T Δ(σ⌢⟨n⟩).
If σ⌢⟨n⟩ ∉ S_Δ and σ ∈ S_Δ, then σ is topmost in S_Δ. Thus σ is a search path for Δ, which has to contain an axiom. But then already Δ(σ) has to be an axiom, which entails ⊢_T Δ(σ). If σ⌢⟨n⟩ ∈ S_Δ, then Δ(σ) is reducible. Let R be the distinguished redex in Δ(σ). We have to take cases on the shape of R.
1. If R = (F0 ∧ F1), then σ⌢⟨i⟩ ∈ S_Δ for i ∈ {0, 1}. Hence ⊢_T Δ(σ)^r, F0 ∧ F1 by an ∧-inference. This, however, is ⊢_T Δ(σ).
2. If R = (F0 ∨ F1), then σ⌢⟨0⟩ ∈ S_Δ. By the induction hypothesis we therefore have ⊢_T Δ(σ⌢⟨0⟩). We have Δ(σ⌢⟨0⟩) = Δ(σ)^r, F0, F1 and obtain Δ(σ)^r, R by two ∨-inferences.
3. R = ∀xF: then ⊢_T Δ(σ)^r, F_x(y), where y does not occur in Δ(σ)^r. Hence ⊢_T Δ(σ) by a ∀-inference.
4. R = 9xF: Then T $()r Fx(t) 9xF by the induction hypothesis and we obtain T $() by an 9-inference. Lemma 1.8.14 (Principal semantic lemma). Let L be countable and $ be a nite list of formulas such that there is a search path for $ which does not contain an axiom. Then there is an L-structure S and an S -assignment with S 6j= F] for all F 2 f$( n) : n dom()g: Proof. To prove the lemma we need a couple of observations. (B) If n dom() and R is a redex in $( n), then there is an m dom() such that R is distinguished in $( m): Proof. We induct on the number of redexes which proceed R in the list $. If this number is 0, then R is distinguished in $( n): Otherwise let Fi be the distinguished redex in $. Since contains no axiom we have n 2 dom() and Fi is not longer distinguished in $( n + 1) = $( n)r F G for suited F and G. Thus the number of redexes which proceed R in $ has decreased and we get the existence of some m dom() such that R is distinguished in $( m) by induction hypothesis. (C) If m dom() and (F0 ^ F1) 2 $( m), then there is an i 2 f0 1g and an n dom() such that Fi 2 $( n): Proof. By (B) there is an n dom() such that (F0 ^ F1) is distinguished in $( n): But then n 2 dom() and by de nition we have Fi 2 $( n + 1) for i = 0 1: (D) If m dom() and (F0 _ F1) 2 $( m), then there is an n dom() such that Fi 2 $( n) for i 2 f0 1g: Proof. Let n dom() such that F0 _ F1 is distinguished in $( n): Then n 2 dom() and F0 F1 2 $( n + 1): (E) If m dom() and (8xF ) 2 $( m), then there is a variable y 2= FV(8xF ) and an n dom() such that Fx (y) 2 $( n): Proof. Let n0 dom() be such that 8xF is distinguished in $( n0 ): Then n0 2 dom() and Fx (y) 2 $( n0 +1) where y 2= FV($ ( n)) with n = n0 +1. Thus especially y 2= FV(8xF) since 8xF 2 $ ( n): (F) If m dom() and (9xF ) 2 $( n), then for every L-term t there is an mt dom() such that Fx(t) 2 $( mt ): Proof. Assume that 9xF is distinguished in $( n) for n dom() by (B). 
Then n 2 dom() and for some k Fx (tk ) 2 $( n + 1) as well as Fx(tj ) 2 $ ( n) for all j < k: We show by induction on l k that there is an ml dom() such that Fx(tj ) 2 $ ( ml ) for all j < l and fFx(tl ) 9xF g $( ml ): For l = k this is already clear. Thus assume that the claim is true for l: By (B)
we obtain an n0 dom() such that 9xF is distinguished in $( n0 ): Hence n0 2 dom() and fFx (tl+1 ) 9xF g $( n0 + 1) by de nition since tl+1 is the
rst term in the xed enumeration which does not occur in $ ( n0): We use properties (A)-(F) to prove the lemma. First we de ne an L = L(C F P )structure S = (S C F P) by S = ft : t is an L-termg C = fcS : c 2 Cg where cS = c F = ff S : f 2 Fg where f S (t1 : : : tn) = (ft1 : : :tn ) and P = fP S : P 2 Pg where P S = f(t1 : : : tn ) : 9m dom()Pt1 : : :tn 2 $( m)g: An S -assignment is de ned by (x) = x: This yields (G) tS ] = t for all L-terms t and we show (H) If m dom() and F 2 $( m), then S 6j= F ]: Proof by induction on the length of F. 1. If F = Pt1 : : :tn m dom() and F 2 $( m), then Pt1 : : :tn 2= $( k) for all k dom() by (A) and the fact that does not contain an axiom. Hence (t1 : : : tn) 2= P S which by (G) entails S 6j= F]: 2. If F = Pt1 : : :tn, then we obtain (t1 : : :tn) 2 P S by de nition. Hence S 6j= F] by the de nition of the Tait-language for L: 3. If F = F0 ^ F1, then by (C) there is an m dom() such that Fi 2 $( m) for i 2 f0 1g: Hence S 6j= Fi ] by induction hypothesis which entails S 6j= F ]: 4. If F = F0 _ F1, then by (D) there is an m dom() such that Fi 2 $( m) for i = 0 1: Hence S 6j= Fi ] for i 2 f0 1g which entails S 6j= F]: 5. If F = 8xG, then by (E) there is an m0 dom() such that Gx y] 2 $( m0 ): Hence S 6j= Gx y]]: De ne
(
(z) = (z) for z 6= x (y) for z = x: Then S j6 = G] which entails S 6j= 8xG: 6. If F = 9xG, then by (F) we have Gx (t) 2 $( mt ) for some mt dom(): Assume x : Then let t = (x) and we obtain S 6j= Gx (t)] by induction hypothesis. Hence S 6j= G] and we have S 6j= 9xG]:
The combination of the principal syntactical and semantical lemma yields the completeness theorem.
Theorem 1.8.15 (Completeness of the Tait-calculus). ⊨ F1 ∨ … ∨ Fn implies ⊢_T F1,…,Fn.
Proof. First we remark that we need not bother about the restriction of L being countable in the proofs of the principal syntactic and semantic lemmas, as we can restrict the language L to those non-logical symbols which occur in F1,…,Fn. Assume that ⊢_T F1,…,Fn is false. Then, by the principal syntactic lemma, there is a search path for F1,…,Fn which does not contain an axiom. This, by the principal semantic lemma, entails that there is a structure S and an S-assignment ρ with S ⊭ Fi[ρ] for i = 1,…,n. Hence ⊭ F1 ∨ … ∨ Fn.
Assume that we have any calculus K. We call an inference rule F1,…,Fn ⇒ G admissible for K if ⊢_K F1,…,⊢_K Fn imply ⊢_K G.
Theorem 1.8.16 (Weak form of Gentzen's Hauptsatz). The cut rule F → G, G → H ⇒ F → H is admissible for the Tait-calculus.
Proof. Assume ⊢_T ¬F ∨ G and ⊢_T ¬G ∨ H. Then we have ⊨ ¬F ∨ G and ⊨ ¬G ∨ H by the soundness theorem 1.8.7. Hence ⊨ ¬F ∨ H, which by the completeness theorem 1.8.15 entails ⊢_T ¬F, H. By two inferences (∨) this entails ⊢_T ¬F ∨ H.
We call 1.8.16 the weak form of Gentzen's Hauptsatz, since we did not show that there is a terminating procedure for the elimination of (cut). In the exercises we indicate how such a procedure is obtainable. Dealing with cut elimination procedures is the hard core of proof-theoretical research.
Exercises
Let L_T be a Tait-language. We define the length l(F) of a formula F inductively:
1. If F is atomic, then l(F) = 0.
2. If F = (F0 ∘ F1) with ∘ ∈ {∧, ∨}, then l(F) = max{l(F0), l(F1)} + 1.
3. If F = QxG with Q ∈ {∀, ∃}, then l(F) = l(G) + 1.
E 1.8.1. Prove l(¬F) = l(F).
Now we are going to define a refinement ⊢^n_k Δ of the Tait-calculus, for n, k ∈ IN, inductively.
(Ax) If P is atomic and {P, ¬P} ⊆ Δ, then ⊢^n_k Δ for all n, k ∈ IN.
(∧) If ⊢^{n_0}_k Δ, F0 and ⊢^{n_1}_k Δ, F1, then ⊢^n_k Δ, F0 ∧ F1 for n > max(n_0, n_1).
(∨) If ⊢^{n_0}_k Δ, F_i for i = 0 or i = 1, then ⊢^n_k Δ, F0 ∨ F1 for n > n_0.
(∀) If ⊢^{n_0}_k Δ, F_x(y) and y ∉ FV(Δ), then ⊢^n_k Δ, ∀xF for n > n_0.
(∃) If ⊢^{n_0}_k Δ, F_x(t) for a term t, then ⊢^n_k Δ, ∃xF for n > n_0.
(cut) If ⊢^{n_0}_k Δ, F and ⊢^{n_1}_k Δ, ¬F and l(F) < k, then ⊢^n_k Δ for n > max(n_0, n_1).
E 1.8.2.
a) ⊢^n_k Δ ⟹ ⊢^n_k Δ_x(t)
b) ⊢^n_k Δ ⟹ ⊢^n_k Δ, Γ
c) ⊢^n_k Δ, F0 ∨ F1 ⟹ ⊢^n_k Δ, F0, F1
d) ⊢^n_k Δ, F0 ∧ F1 ⟹ ⊢^n_k Δ, F0 and ⊢^n_k Δ, F1
E 1.8.3.
a) ⊢^{2l(F)}_0 F, ¬F
b) ⊢^{2l(F)+3}_0 ¬∀xF ∨ F_x(t) and ⊢^{2l(F)+3}_0 ¬F_x(t) ∨ ∃xF
c) ⊢^{n_0}_k Δ, F ∨ G and ⊢^{n_1}_k Δ, ¬F and l(F) < k ⟹ ⊢^n_k Δ, G for n > max(n_0, n_1)
d) Let x ∉ FV(Δ, G).
1) ⊢^n_k Δ, G ∨ F ⟹ ⊢^{n+3}_k Δ, G ∨ ∀xF
2) ⊢^n_k Δ, F ∨ G ⟹ ⊢^{n+3}_k Δ, ∃xF ∨ G
E 1.8.4.
a) ⊢^{n_0}_k Δ, F and ⊢^{n_1}_k Γ, ¬F and l(F) = k ⟹ ⊢^{n_0+n_1}_k Δ, Γ. Hint: use induction on n_0 + n_1.
b) ⊢^n_{k+1} Δ ⟹ ⊢^{2^n}_k Δ.
c) (Gentzen's Hauptsatz) Define 2^{(n)}_0 = n and 2^{(n)}_{k+1} = 2^{2^{(n)}_k}.
F: E 1.8.7. Use the preceding exercises to obtain a second prove for the completeness theorem for the Tait-calculus: j= F entails T F: E 1.8.8. Determine the search tree for $ = ft1 6= t1 9y(y = t1 ^ (Pt5 ^ Pt5))g where t0 t1 : : : is the xed enumeration of the terms.
1.9 Applications of Gentzen's Hauptsatz As a rst application of Gentzen's Hauptsatz we want to show a theorem of Jacques Herbrand 1908, y1931]. To prepare this we need the notion of an existential formula, 9-formula for short. De nition 1.9.1. The class of 9-formulas is inductively de ned by the following clauses. 1. Every atomic formula is an 9-formula. 2. If F and G are 9-formulas, so are (F ^ G) and (F _ G): 3. If F is an 9-formula, then so is 9xF: Now let's introduce the problem behind Herbrand's theorem: think of a sentence 9xF with F quanti er free and T 9xF: Then we know by the soundness of the Tait-calculus that we have S j= 9xF for any L-structure S : This means SS j= Fx (s) i.e. in the structure S we have a witness s 2 S for the formula 9xF: But in general there is not one single term t such that the interpretation of t is a witness in any
structure S. This means we do not have a term t such that for all L-structures S we have S ⊨ F_x(t). The following lemma gives the way out in this special case: it is possible to find a finite set of terms so that in each case the witness can be taken to be the interpretation of one of those terms. In Exercise E 1.9.4 we ask if it is possible to fix an upper bound for the number of terms witnessing the existential formula.

Lemma 1.9.2. Let Δ, F be a finite set of ∃-formulas. If ⊢_T Δ, ∃xF, then there are finitely many terms t_1, ..., t_n such that ⊢_T Δ, F_x(t_1), ..., F_x(t_n).
Proof. We show the lemma by induction on the length of the derivation ⊢_T Δ, ∃xF. If ⊢_T Δ, ∃xF is an L-axiom, then ⊢_T Δ, F_x(t_1), ..., F_x(t_n) is also an L-axiom. Thus we may assume that ⊢_T Δ, ∃xF is the conclusion of one of the inferences. There are two possibilities:

1. The main formula of the inference belongs to the set Δ. Then we have the possibilities of an (∧)-, (∨)- or (∃)-inference. (∀)-inferences are excluded because Δ is a set of ∃-formulas. In the case of an ∧-inference we have the premises ⊢_T Δ', F_0, ∃xF and ⊢_T Δ', F_1, ∃xF and obtain terms t_1, ..., t_k, s_1, ..., s_l with ⊢_T Δ', F_0, F_x(t_1), ..., F_x(t_k) and ⊢_T Δ', F_1, F_x(s_1), ..., F_x(s_l) by the induction hypothesis. Using the structural rule and an ∧-inference we obtain ⊢_T Δ', F_0 ∧ F_1, F_x(t_1), ..., F_x(t_k), F_x(s_1), ..., F_x(s_l), which is what was claimed. We may treat the cases of an ∨- or ∃-inference simultaneously. Thus assume that we have the premise ⊢_T Δ', G_0, ∃xF which by either an ∨- or ∃-inference leads to the conclusion ⊢_T Δ', G, ∃xF. Then G_0 is an ∃-formula, too, and by the induction hypothesis we have ⊢_T Δ', G_0, F_x(t_1), ..., F_x(t_n). Using the same inference this yields ⊢_T Δ', G, F_x(t_1), ..., F_x(t_n).
2. The main formula of the inference is ∃xF. Then we have the premise ⊢_T Δ, F_x(t), ∃xF for some term t and obtain ⊢_T Δ, F_x(t), F_x(t_1), ..., F_x(t_m) by the induction hypothesis.
Lemma 1.9.3 (Herbrand's lemma). If F is an ∃-formula with ⊨ ∃xF, then there are finitely many terms t_1, ..., t_n such that ⊨ F_x(t_1) ∨ ... ∨ F_x(t_n).

Proof. From ⊨ ∃xF we obtain ⊢_T ∃xF by the completeness theorem 1.8.15. Using Lemma 1.9.2 this yields ⊢_T F_x(t_1), ..., F_x(t_n) which, by the soundness theorem, entails ⊨ F_x(t_1) ∨ ... ∨ F_x(t_n).

Definition 1.9.4. We say that a formula F is in prenex form if F = Q_1x_1 ... Q_nx_n F_0 where BV(F_0) = ∅ and Q_i ∈ {∃, ∀} for i = 1, ..., n.

Theorem 1.9.5. For any formula F there is a formula F_N in prenex form such that F ≡ F_N.

Proof by induction on the length of the formula F.
1. For an atomic formula we put F_N = F.
2. If F = G ∨ H, then we have by the induction hypothesis formulas G_N ≡ G and H_N ≡ H in prenex form. Let G_N = Q^G_1 x_1 ... Q^G_n x_n G_0 and H_N = Q^H_1 y_1 ... Q^H_m y_m H_0. Without loss of generality we may assume that ∅ = {x_1, ..., x_n} ∩ {y_1, ..., y_m} = {x_1, ..., x_n} ∩ FV(H_0) = {y_1, ..., y_m} ∩ FV(G_0). We put F_N = Q^G_1 x_1 ... Q^G_n x_n Q^H_1 y_1 ... Q^H_m y_m (G_0 ∨ H_0) and have to show
(A) F_N ≡ F
(B) H ∨ ∃xG ≡ ∃x(H ∨ G)
(C) H ∨ ∀xG ≡ ∀x(H ∨ G)
for x ∉ FV(H). Iterated application of (B) and (C) yields (A). To prove (B) let S be an L-structure and Φ an S-assignment such that S ⊨ (H ∨ ∃xG)[Φ]. Then S ⊨ H[Φ] or S ⊨ ∃xG[Φ]. In the second case there is an assignment Φ' ∼_x Φ such that S ⊨ G[Φ'], which entails S ⊨ (H ∨ G)[Φ'], and in the first case we get S ⊨ (H ∨ G)[Φ'] for any Φ' ∼_x Φ since x ∉ FV(H). Hence S ⊨ ∃x(H ∨ G)[Φ]. If S ⊭ (H ∨ ∃xG)[Φ], then S ⊭ H[Φ] and S ⊭ ∃xG[Φ], which entails S ⊭ ∃x(H ∨ G)[Φ]. The proof of (C) is similar: if S ⊨ (H ∨ ∀xG)[Φ], then S ⊨ H[Φ] or S ⊨ G[Φ'] for all Φ' ∼_x Φ. Thus S ⊨ (H ∨ G)[Φ'] for all Φ' ∼_x Φ because x ∉ FV(H). Hence S ⊨ ∀x(H ∨ G)[Φ]. If S ⊭ (H ∨ ∀xG)[Φ], then S ⊭ H[Φ] and S ⊭ G[Φ'] for some Φ' ∼_x Φ. As x ∉ FV(H) this again entails S ⊭ (H ∨ G)[Φ'] and we get S ⊭ ∀x(H ∨ G)[Φ].
3. F = ¬G. Then there is G_N ≡ G where G_N = Q_1 x_1 ... Q_n x_n G_0 is in prenex form. But then F ≡ ¬G_N ≡ Q'_1 x_1 ... Q'_n x_n ¬G_0 = F_N where ∀' = ∃ and ∃' = ∀. Obviously F_N is in prenex form.
4. F = QxG for Q ∈ {∀, ∃}. Then we have a prenex formula G_N ≡ G; we obtain F ≡ QxG_N and put F_N = QxG_N.

Definition 1.9.6. Let F = ∃x_0 ... ∃x_{k-1} ∀x_k Q_{k+1}x_{k+1} ... Q_n x_n G be an L-formula in prenex form with G quantifier free. We extend the language L to L* by adding a new function symbol f and put F* = ∃x_0 ... ∃x_{k-1} Q_{k+1}x_{k+1} ... Q_n x_n G_{x_k}(fx_0 ... x_{k-1}). Let F^(0) = F and L^(0) = L, and define F^(n+1) = (F^(n))* and L^(n+1) = (L^(n))*. Then there is an n ∈ IN such that F^(n+1) = F^(n) and thus also L^(n+1) = L^(n). Let m be the least such n and put F_H = F^(m) and L_H = L^(m). We call F_H a Herbrand form of F and L_H a Herbrand language for F. For a formula F not in prenex form we define F_H = (F_N)_H.
At this point we want to give an easy example for computing the Herbrand form of a formula. Let F = ∀x∃y(x · y = 1) → ∀x∃y(y · x = 1) in the language L_GT of group theory. Because F is not in prenex form we first compute a prenex form:
F ≡ ∃x_1∀x_2 ¬(x_1 · x_2 = 1) ∨ ∀y_1∃y_2 (y_2 · y_1 = 1) ≡ ∃x_1∀x_2∀y_1∃y_2 (¬(x_1 · x_2 = 1) ∨ (y_2 · y_1 = 1)).
This formula is in prenex form, say F_N. Thus we have
(F_N)* = ∃x_1∀y_1∃y_2 (¬(x_1 · fx_1 = 1) ∨ (y_2 · y_1 = 1))
for a new function symbol f, and
(F_N)^(2) = ∃x_1∃y_2 (¬(x_1 · fx_1 = 1) ∨ (y_2 · gx_1 = 1))
with g as a new function symbol. This is a Herbrand form of F.

Lemma 1.9.7. We have ⊨ F iff ⊨ F_H.

Proof. Since F ≡ F_N we may assume that F is in prenex form, and it suffices to show ⊨ F ⇔ ⊨ F*, because then the lemma follows by iteration. So let F = ∃x_0 ... ∃x_{k-1}∀x_k G where G is in prenex form. Then F* = ∃x_0 ... ∃x_{k-1} G_{x_k}(fx_0 ... x_{k-1}). We already know ⊨ ∀x_k G → G_{x_k}(fx_0 ... x_{k-1}), which entails ⊨ ∃x_0 ... ∃x_{k-1}∀x_k G → ∃x_0 ... ∃x_{k-1} G_{x_k}(fx_0 ... x_{k-1}). Hence ⊨ F ⇒ ⊨ F*. For the opposite direction assume ⊭ F. Then there is an L-structure S and an S-assignment Φ such that S ⊭ F[Φ], i.e. S ⊭ ∃x_0 ... ∃x_{k-1}∀x_k G[Φ]. Choose s_0, ..., s_{k-1} ∈ S arbitrarily and define Ψ(x) = Φ(x) for x ∉ {x_0, ..., x_{k-1}} and Ψ(x_i) = s_i. Then we have S ⊭ ∀x_k G[Ψ], which shows that there is an assignment Ψ' ∼_{x_k} Ψ such that S ⊭ G[Ψ']. Put f^S(s_0, ..., s_{k-1}) = Ψ'(x_k).
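Definition 1.9.6 removes the universal quantifiers one at a time, but the iteration collapses into a single left-to-right pass over the quantifier prefix. The following sketch makes that pass explicit on the group-theory example above (the representation of a prenex formula as a prefix list plus a matrix string, and the name `herbrand_form`, are ours; variables are assumed to be plain alphanumeric identifiers).

```python
import re

def herbrand_form(prefix, matrix, names=("f", "g", "h")):
    """Compute a Herbrand form of a prenex formula.  prefix is a list like
    [('E', 'x1'), ('A', 'x2')] (E = existential, A = universal), matrix a
    string.  Each universal variable is replaced by a fresh function symbol
    applied to the existential variables preceding it (Definition 1.9.6,
    iterated to the fixed point)."""
    fresh = iter(names)
    exist_vars, new_matrix = [], matrix
    for quantifier, var in prefix:
        if quantifier == 'E':
            exist_vars.append(var)
        else:  # universal quantifier: replace var by a fresh function term
            term = next(fresh) + "(" + ",".join(exist_vars) + ")"
            new_matrix = re.sub(r"\b%s\b" % re.escape(var), term, new_matrix)
    return [('E', v) for v in exist_vars], new_matrix

# The group-theory example F_N from the text:
prefix = [('E', 'x1'), ('A', 'x2'), ('A', 'y1'), ('E', 'y2')]
matrix = "(not(x1*x2 = 1) or (y2*y1 = 1))"
print(herbrand_form(prefix, matrix))
```

Running this reproduces (F_N)^(2): the prefix shrinks to the existential quantifiers ∃x_1∃y_2, and x_2 and y_1 are replaced by f(x1) and g(x1) in the matrix.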
This expands S to an L*-structure S*. If we had S* ⊨ ∃x_0 ... ∃x_{k-1} G_{x_k}(fx_0 ... x_{k-1})[Φ] we could find Ψ ∼_{x_0, ..., x_{k-1}} Φ with S* ⊨ G_{x_k}(fx_0 ... x_{k-1})[Ψ]. Therefore we had S* ⊨ G[Ψ'] with f^{S*}(s_0, ..., s_{k-1}) = Ψ'(x_k), where we have written s_i = Ψ(x_i). But this contradicts the definition of f^{S*}. So we conclude S* ⊭ ∃x_0 ... ∃x_{k-1} G_{x_k}(fx_0 ... x_{k-1})[Φ] and S* ⊭ F*[Φ].

Theorem 1.9.8 (Herbrand's theorem). Let F be a sentence. Then we have ⊨ F iff there are finitely many k-tuples (t^1_1, ..., t^1_k), ..., (t^n_1, ..., t^n_k) of L_H-terms such that ⊨ G_{x_1, ..., x_k}(t^1_1, ..., t^1_k) ∨ ... ∨ G_{x_1, ..., x_k}(t^n_1, ..., t^n_k), where F_H = ∃x_1 ... ∃x_k G and G is quantifier free.

Proof. By the previous lemma we have ⊨ F ⇔ ⊨ F_H. Thus it suffices to show ⊨ F_H ⇔ ⊨ G_{x_1, ..., x_k}(t^1_1, ..., t^1_k) ∨ ... ∨ G_{x_1, ..., x_k}(t^n_1, ..., t^n_k). To show the '⇐' direction, we apply the completeness theorem for the Tait-calculus to get ⊢_T G_{x_1, ..., x_k}(t^1_1, ..., t^1_k), ..., G_{x_1, ..., x_k}(t^n_1, ..., t^n_k) and conclude ⊢_T ∃x_1 ... ∃x_k G by applications of the ∃-rule. For the opposite direction we assume ⊨ F_H and get ⊢_T ∃x_1 ... ∃x_k G by the completeness theorem. Since ∃x_1 ... ∃x_k G is an ∃-formula we may apply Lemma 1.9.2 to obtain ⊢_T ∃x_2 ... ∃x_k G_{x_1}(t^1_1), ..., ∃x_2 ... ∃x_k G_{x_1}(t^{n_1}_1). From this we get ⊢_T ∃x_2 ... ∃x_k (G_{x_1}(t^1_1) ∨ ... ∨ G_{x_1}(t^{n_1}_1)). Applying 1.9.2 once more this yields ⊢_T ∃x_3 ... ∃x_k (G_{x_1 x_2}(t^1_1, t^1_2) ∨ ... ∨ G_{x_1 x_2}(t^{n_1}_1, t^1_2)), ..., ∃x_3 ... ∃x_k (G_{x_1 x_2}(t^1_1, t^{n_2}_2) ∨ ... ∨ G_{x_1 x_2}(t^{n_1}_1, t^{n_2}_2)). Iterating this procedure and possibly adding dummy terms we finally get the claim.

At this point we finish the observations concerning Herbrand's theorem of 1930. We are going to derive a second consequence of Gentzen's Hauptsatz: the interpolation theorem.
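Herbrand's theorem reduces validity of a sentence to validity of a quantifier-free disjunction, and for a language without equality a quantifier-free ground sentence is valid exactly when it is a boolean tautology, treating distinct atomic sentences as independent propositional variables. Checking that is a finite computation, sketched below (the tuple-based formula representation and the helper names are ours).

```python
from itertools import product

def atoms(f):
    """Collect atomic subformulas of a formula given as nested tuples
    ('atom', name), ('not', g), ('or', g, h), ('and', g, h)."""
    if f[0] == 'atom':
        return {f[1]}
    return set().union(*(atoms(g) for g in f[1:]))

def evaluate(f, val):
    if f[0] == 'atom':
        return val[f[1]]
    if f[0] == 'not':
        return not evaluate(f[1], val)
    if f[0] == 'or':
        return evaluate(f[1], val) or evaluate(f[2], val)
    return evaluate(f[1], val) and evaluate(f[2], val)

def boolean_valid(f):
    """Boolean validity: true under every assignment of the atoms."""
    names = sorted(atoms(f))
    return all(evaluate(f, dict(zip(names, vs)))
               for vs in product([True, False], repeat=len(names)))

# A Herbrand disjunction that closes up propositionally:
disj = ('or', ('atom', 'P(t1)'), ('not', ('atom', 'P(t1)')))
print(boolean_valid(disj))            # True
print(boolean_valid(('atom', 'P(t1)')))  # False
```

This is the computational core of Herbrand-style proof search: enumerate candidate term tuples, form the disjunction of instances, and test it propositionally.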
Theorem 1.9.9 (Interpolation for the Tait-calculus). Let Δ and Γ be finite sets of formulas in the Tait-language for L such that neither ⊢_T Δ nor ⊢_T Γ. If ⊢_T Δ, Γ, then there is a formula E satisfying the following properties:
1. FV(E) ⊆ FV(Δ) ∩ FV(Γ).
2. E contains only predicate symbols which occur in formulas of Δ as well as in formulas of ¬Γ = {¬F : F ∈ Γ}.
3. ⊢_T Γ, E and ⊢_T ¬E, Δ.
We call E an interpolation formula for Γ and Δ.

Proof. We use induction on the definition of ⊢_T Δ, Γ.
1. Assume ⊢_T Γ, Δ by an L-axiom. Since ⊬_T Γ and ⊬_T Δ there is an atomic formula P with P ∈ Δ and ¬P ∈ Γ (or ¬P ∈ Δ and P ∈ Γ). Set E = P (or E = ¬P).
2. Assume that the last inference was ⊢_T Γ, F_0, Δ and ⊢_T Γ, F_1, Δ ⇒ ⊢_T Γ, Δ where (F_0 ∧ F_1) ∈ Γ. There are the following sub-cases:
(a) ⊢_T Γ, F_0. Then ⊬_T Γ, F_1 because otherwise we had ⊢_T Γ. Thus by the induction hypothesis there is an interpolation formula E for Γ, F_1 and Δ. Since F_1 is a sub-formula of (F_0 ∧ F_1) ∈ Γ, E obviously satisfies properties 1. and 2. for Γ and Δ. We have ⊢_T ¬E, Δ and ⊢_T Γ, F_1, E, and ⊢_T Γ, F_0 entails also ⊢_T Γ, F_0, E. Hence ⊢_T Γ, E by an ∧-inference, and E is also an interpolation formula for Γ and Δ.
(b) ⊬_T Γ, F_0 and ⊬_T Γ, F_1. By the induction hypothesis we have interpolation formulas E_1 for Γ, F_0 and Δ, and E_2 for Γ, F_1 and Δ. For the same reasons as above E_1 and E_2 satisfy properties 1. and 2. also for Γ and Δ. From ⊢_T ¬E_1, Δ and ⊢_T ¬E_2, Δ we obtain ⊢_T ¬(E_1 ∨ E_2), Δ, and ⊢_T Γ, F_0, E_1 and ⊢_T Γ, F_1, E_2 first yield ⊢_T Γ, F_0, E_1 ∨ E_2 and ⊢_T Γ, F_1, E_1 ∨ E_2. Then we get by an ∧-inference ⊢_T Γ, E_1 ∨ E_2. Thus (E_1 ∨ E_2) is an interpolation formula for Γ and Δ.
3. The last inference was ⊢_T Γ, F_i, Δ ⇒ ⊢_T Γ, Δ for i ∈ {0, 1} and (F_0 ∨ F_1) ∈ Γ according to the ∨-rule. Then we have ⊬_T Γ, F_i because otherwise we had ⊢_T Γ. By the induction hypothesis there is an interpolation formula E for Γ, F_i and Δ also satisfying 1. and 2. for the sets Γ and Δ. From ⊢_T Γ, F_i, E, however, we obtain ⊢_T Γ, E by an ∨-inference. Thus E is also an interpolation formula for Γ and Δ.
4. The last inference was ⊢_T Γ, F_x(y), Δ ⇒ ⊢_T Γ, Δ according to the ∀-rule where ∀xF ∈ Γ. Again we have ⊬_T Γ, F_x(y), which by the induction hypothesis gives us an interpolation formula E for Γ, F_x(y) and Δ. By the variable condition we have y ∉ FV(Γ, Δ), which by 1. entails that y ∉ FV(E). Thus we get from ⊢_T Γ, F_x(y), E also ⊢_T Γ, E by an ∀-inference. It is obvious that E also satisfies 1. and 2. for Γ and Δ and hence is an interpolation formula for Γ and Δ.
5. The last inference was an ∃-inference ⊢_T Γ, F_x(t), Δ ⇒ ⊢_T Γ, Δ with ∃xF ∈ Γ. By the induction hypothesis we have an interpolation formula D for Γ, F_x(t) and Δ. Then D satisfies property 2. also for Γ and Δ since there are no new predicate symbols in F_x(t). Let {y_1, ..., y_n} = (FV(t) \ FV(Γ)) ∩ FV(Δ). We have ⊢_T Γ, F_x(t), D and ⊢_T ¬D, Δ. By an ∃-inference we thus obtain ⊢_T Γ, D, and by ∃-inferences ⊢_T ∃y_1 ... ∃y_n ¬D, Δ, i.e. ⊢_T ¬(∀y_1 ... ∀y_n D), Δ. Since all of the variables y_1, ..., y_n do not belong to FV(Γ) we apply ∀-inferences to get ⊢_T Γ, ∀y_1 ... ∀y_n D. Putting E = ∀y_1 ... ∀y_n D we see that E is an interpolation formula for Γ and Δ.
Annoyingly we also have to regard the case that the main formula of the last inference belongs to the set Δ. Since most of the cases are dual to those already treated we can be quite short.
6. ⊢_T Γ, F_0, Δ and ⊢_T Γ, F_1, Δ ⇒ ⊢_T Γ, Δ and (F_0 ∧ F_1) ∈ Δ.
(a) ⊢_T F_0, Δ entails ⊬_T F_1, Δ. Let E be an interpolation formula for Γ and F_1, Δ. Then ⊢_T Γ, E and ⊢_T ¬E, F_1, Δ. Thus ⊢_T ¬E, Δ by (∧), and E interpolates also Γ and Δ.
(b) ⊬_T F_0, Δ and ⊬_T F_1, Δ give interpolation formulas E_0 and E_1. Thus ⊢_T Γ, E_i for i = 0, 1 and ⊢_T ¬E_0, F_0, Δ as well as ⊢_T ¬E_1, F_1, Δ; then E = E_0 ∧ E_1 interpolates Γ and Δ.
7. ⊢_T Γ, F_i, Δ ⇒ ⊢_T Γ, Δ and (F_0 ∨ F_1) ∈ Δ. Then ⊬_T F_i, Δ, which gives E with ⊢_T Γ, E and ⊢_T ¬E, F_i, Δ. E also interpolates Γ and Δ.
8. ⊢_T Γ, F_x(y), Δ ⇒ ⊢_T Γ, Δ by an ∀-inference and ∀xF ∈ Δ. By the induction hypothesis there is a formula E with ⊢_T Γ, E and ⊢_T ¬E, F_x(y), Δ. Since y ∉ FV(E) we also get E as an interpolation formula for Γ and Δ.
9. ⊢_T Γ, F_x(t), Δ ⇒ ⊢_T Γ, Δ by an ∃-inference and ∃xF ∈ Δ. Then we have a formula D with ⊢_T Γ, D and ⊢_T ¬D, F_x(t), Δ. Let {y_1, ..., y_n} = (FV(t) \ FV(Δ)) ∩ FV(Γ). By ∃-inferences we first get ⊢_T Γ, ∃y_1 ... ∃y_n D and ⊢_T ¬D, Δ. As before we have y_1, ..., y_n ∉ FV(Δ), which yields ⊢_T ∀y_1 ... ∀y_n ¬D, Δ, i.e. ⊢_T ¬(∃y_1 ... ∃y_n D), Δ. Putting E = ∃y_1 ... ∃y_n D we get an interpolation formula for Γ and Δ.

Corollary 1.9.10. If Δ and Γ are finite sets of formulas such that ⊢_T Δ, Γ but ⊬_T Δ and ⊬_T Γ, then Δ and Γ have at least one common predicate symbol.

The hypotheses ⊬_T Γ and ⊬_T Δ in the interpolation theorem for the Tait-calculus are of course annoying. To get a general formulation of the interpolation theorem we should try to get rid of them. This, however, is not easy. If we assume that we have ⊢_T Γ, then we obviously obtain ⊢_T Γ, Δ for any formula set Δ, even if Γ and Δ have no predicate symbol in common. Thus the only possible interpolation formula is the empty formula, which is not yet available in our language. So let us try to introduce a symbol, say ⊥, for the empty formula. ⊥ may be viewed as a 0-placed predicate symbol. Therefore we need its dual notion, say ⊤, in the Tait-language. We define ¬⊥ = ⊤ and ¬⊤ = ⊥. The L-axiom for this symbol then becomes ⊢ Δ, ⊤, ⊥, which is the same as ⊢ Δ, ⊤ (⊤-axiom) because ⊥ is supposed to stand for the empty formula. The interpretation of the symbol ⊥ is of course Val_S(⊥, Φ) = f in all L-structures S. For this reason the interpretation of the dual symbol in the Tait-language has to be Val_S(⊤, Φ) = t for all L-structures S. Due to these interpretations we have that ¬(Δ ∪ {⊤}) is always inconsistent, and so the soundness theorem for the Tait-calculus extends also to the Tait-calculus with ⊤-axiom. The proof of the completeness theorem remains literally the same. So we have soundness and completeness for the Tait-calculus with ⊤-axiom, too. We first observe that ⊥ really behaves like the empty formula. We will use the obvious notation ⊢_{⊤-Ax} Δ to express that Δ is derivable in the Tait-calculus with ⊤-axiom.

Proposition 1.9.11. ⊢_{⊤-Ax} Δ, ⊥ implies ⊢_{⊤-Ax} Δ.
Proof. By induction on the definition of ⊢_{⊤-Ax} Δ, ⊥. ⊥ can never be the main formula of the last inference. Therefore we get the claim immediately from the induction hypothesis in case that ⊢_{⊤-Ax} Δ, ⊥ is the conclusion of an inference. But if
⊢_{⊤-Ax} Δ, ⊥ is an axiom, then ⊢_{⊤-Ax} Δ has to be an axiom, too.
Another easy consequence is that the Tait-calculus with ⊤-axiom is a conservative extension of the usual Tait-calculus. We will return to conservative extensions later (cf. section 2.1). Therefore we do not give a general definition of conservative extensions here but formulate the result as follows.

Proposition 1.9.12. Assume ⊢_{⊤-Ax} Δ and Δ contains neither the symbol ⊤ nor ⊥. Then ⊢_T Δ.

Proof. The proof is straightforward by induction on the definition of ⊢_{⊤-Ax} Δ.
We are now prepared to formulate the general version of the interpolation theorem.
Theorem 1.9.13. If ⊢_{⊤-Ax} Δ, Γ, then there is an interpolation formula for Δ and Γ.

Proof. In case that we have ⊬_{⊤-Ax} Δ and ⊬_{⊤-Ax} Γ we use Proposition 1.9.11, Proposition 1.9.12 and Theorem 1.9.9. Otherwise we either have ⊢_{⊤-Ax} Δ or ⊢_{⊤-Ax} Γ and obtain ⊢_{⊤-Ax} Δ, ⊥ or ⊢_{⊤-Ax} Γ, ⊥ respectively. In the first case we have the interpolation formula ⊥ because we have ⊢_{⊤-Ax} ⊤, Γ by the ⊤-axiom, and in the second case we get for the same reasons ⊤ as interpolation formula.

As a consequence of Theorem 1.9.13 we get the famous theorem by William Craig published in 1957.
Theorem 1.9.14 (Craig's interpolation theorem). If we have ⊨ F → G, then there is a formula E which interpolates F and G, i.e. we have ⊨ F → E and ⊨ E → G, and E contains only predicate symbols which occur both in F and G. The free variables of E also occur both in F and in G.

Proof. If ⊨ F → G we get ⊢_{⊤-Ax} ¬F, G by the completeness theorem for the Tait-calculus with ⊤-axiom. By Theorem 1.9.13 there is an interpolation formula E for ¬F and G, i.e. we have ⊢_{⊤-Ax} ¬F, E and ⊢_{⊤-Ax} ¬E, G, which yield ⊨ F → E and ⊨ E → G by the soundness theorem. In Theorem 1.9.13 we have proved that E has all the other properties stated in the claim.

There is a nice application of the interpolation theorem which is due to Abraham Robinson [*1918, †1974] in 1956.

Theorem 1.9.15 (Joint consistency theorem). Let M_1 and M_2 be consistent sets of L-sentences. Then M_1 ∪ M_2 is consistent iff there is no sentence F such that M_1 ⊨ F and M_2 ⊨ ¬F.
Proof. If there is a sentence F such that M_1 ⊨ F and M_2 ⊨ ¬F, then M_1 ∪ M_2 is inconsistent. For the other direction let M_1 ∪ M_2 be inconsistent. By the compactness theorem there are finite subsets N_1 ⊆ M_1 and N_2 ⊆ M_2 such that N_1 ∪ N_2 is inconsistent. Now let F_1 be the conjunction of the sentences in N_1 and F_2 that of those in N_2. Since N_1 ∪ N_2 is inconsistent we have N_1 ⊨ ¬F_2, which gives ⊨ F_1 → ¬F_2. By Craig's interpolation theorem there is a sentence E such that ⊨ F_1 → E and ⊨ E → ¬F_2. This implies M_1 ⊨ E since M_1 ⊨ F_1 and F_1 ⊨ E. But we also have M_2 ⊨ ¬E because F_2 ⊨ ¬E and M_2 ⊨ F_2. So we have shown that there is a sentence E such that M_1 ⊨ E and M_2 ⊨ ¬E.

There is a sharper form of the interpolation theorem due to Roger C. Lyndon in 1959. This theorem also tells something about the form of the occurrences of the predicate symbols in the interpolation formula. To formulate the theorem we need the following notion.

Definition 1.9.16. We define inductively the positive and negative occurrences of a predicate symbol P in an L-formula F.
1. P occurs positively in Pt_1 ... t_n.
2. If P occurs positively (negatively) in F, then P occurs negatively (positively) in ¬F.
3. If P occurs positively (negatively) in F, then P occurs positively (negatively) in (F ∨ G) and (G ∨ F).
4. If P occurs positively (negatively) in F, then P occurs positively (negatively) in ∃xF.

For an L-formula F let us denote its translation into the Tait-language L^T by F^T. Formally the translation is given inductively by
1. (Pt_1 ... t_n)^T = Pt_1 ... t_n
2. (¬F)^T = ¬(F^T), the defined negation (cf. Definition 1.8.2)
3. (F ∨ G)^T = F^T ∨ G^T
4. (∃xF)^T = ∃xF^T.
On the other hand, any formula in the Tait-language L^T is easily retranslated into an L-formula, translating an occurrence of ¬Pt_1 ... t_n into ¬Pt_1 ... t_n. Positive and negative occurrences of predicate symbols are easy to locate in the Tait-language. We have the following observation.

Proposition 1.9.17. P occurs positively (negatively) in F iff P (¬P) occurs in F^T.

The proof is an easy exercise.
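The polarity bookkeeping of Definition 1.9.16 is a straightforward recursion and can be sketched mechanically (the tuple representation of formulas and the name `occurrences` are ours; conjunction is omitted since the definition works with the primitives ¬, ∨, ∃).

```python
def occurrences(f, positive=True):
    """Return the set of (predicate, polarity) pairs for a formula built from
    ('atom', P), ('not', g), ('or', g, h), ('ex', x, g), following the
    clauses of Definition 1.9.16."""
    kind = f[0]
    if kind == 'atom':
        return {(f[1], 'pos' if positive else 'neg')}
    if kind == 'not':        # negation flips the polarity
        return occurrences(f[1], not positive)
    if kind == 'or':         # both disjuncts keep the polarity
        return occurrences(f[1], positive) | occurrences(f[2], positive)
    if kind == 'ex':         # the quantifier keeps the polarity
        return occurrences(f[2], positive)
    raise ValueError("unknown connective: %r" % kind)

# In  not P(x) or Q(x)  the symbol P occurs negatively, Q positively:
f = ('or', ('not', ('atom', 'P')), ('atom', 'Q'))
print(sorted(occurrences(f)))  # [('P', 'neg'), ('Q', 'pos')]
```

This mirrors Proposition 1.9.17: a predicate gets polarity 'neg' exactly when its atom sits under an odd number of negations, i.e. when it would appear negated in the Tait-translation.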
Theorem 1.9.18 (Lyndon's interpolation theorem). If ⊨ F → G, then there is an interpolation formula E for F and G such that every predicate symbol occurring positively (negatively) in E occurs positively (negatively) in both formulas F and G.

Proof. The proof uses Proposition 1.9.17. By the interpolation theorem for the Tait-calculus with ⊤-axiom we get an interpolation formula E for ¬F^T and G^T. Any predicate symbol P occurring in E occurs as P in F^T and in G^T, and thus positively in F and G, and any predicate symbol occurring negated as ¬P in E occurs as ¬P in F^T and in G^T, and thus negatively in F and G. The retranslation of E into an L-formula transfers occurrences of P into positive occurrences of P and occurrences of ¬P into negative ones.

A sometimes useful modification of the interpolation theorem is the following one.
Theorem 1.9.19. If M ⊨ F for some formula set M (not necessarily finite), then there is a formula E with FV(E) ⊆ FV(M) ∩ FV(F) such that M ⊨ E and ⊨ E → F, and every predicate symbol occurring positively (negatively) in E also occurs positively (negatively) in F and in some formulas of M.

Proof. By the compactness and the deduction theorem we get ⊨ G_1 ∧ ... ∧ G_n → F for finitely many formulas {G_1, ..., G_n} ⊆ M. Application of Lyndon's interpolation theorem proves the theorem.

Theorem 1.9.19 has the consequence that for proving a theorem F from an axiom system Ax we need at most those axioms in Ax which tell something about the predicate symbols occurring in F. There are two nice applications of the interpolation theorem. The first is Evert W. Beth's definability theorem, which is already a consequence of Craig's interpolation theorem.
Theorem 1.9.20 (Beth's definability theorem). We say that a formula F defines an n-ary predicate symbol P implicitly if we have
(1.5) ⊨ F ∧ F_P(Q) → ∀x_1 ... ∀x_n((Px_1 ... x_n) ↔ (Qx_1 ... x_n)).
Let P be implicitly defined by F. Then there is a formula G such that FV(G) ⊆ {x_1, ..., x_n}, ⊨ F → ∀x_1 ... ∀x_n(Px_1 ... x_n ↔ G), and P does not occur in G. This means that G defines P explicitly.

Proof. From (1.5) we get
⊨ F ∧ Px_1 ... x_n → (F_P(Q) → Qx_1 ... x_n).
By Craig's interpolation theorem there is an interpolation formula G such that
(1.6) ⊨ F ∧ Px_1 ... x_n → G
and
(1.7) ⊨ G → (F_P(Q) → Qx_1 ... x_n).
We have FV(G) ⊆ {x_1, ..., x_n} and neither P nor Q occur in G. Thus we get from (1.7)
(1.8) ⊨ G → (F → Px_1 ... x_n),
and (1.6) and (1.8) together yield
⊨ F → ∀x_1 ... ∀x_n(G ↔ Px_1 ... x_n).

The second application uses Lyndon's version of the interpolation theorem. It is a theorem about monotone operators. To define monotone operators let L be a first order language and S an L-structure. Let P be a new unary predicate symbol. An L(P)-formula F with FV(F) = {x} defines an operator Γ_F : Pow(S) → Pow(S) by Γ_F(N) = {s ∈ S : (S, N) ⊨ F[s]} for N ⊆ S, where (S, N) is the L(P)-expansion of S interpreting P by N ⊆ S. An operator Γ : Pow(S) → Pow(S) is monotone on S if N ⊆ M entails Γ(N) ⊆ Γ(M). Γ_F is globally monotone if it is monotone on every L-structure S.

Lemma 1.9.21. Let F be an L(P)-formula with FV(F) = {x} and at most positive occurrences of P. Then F defines a globally monotone operator.
Proof. It suffices to prove
(1.9) ⊨ ∀x_1 ... ∀x_n(Px_1 ... x_n → Qx_1 ... x_n) → (F → F_P(Q))
for P occurring at most positively in F, and
(1.10) ⊨ ∀x_1 ... ∀x_n(Px_1 ... x_n → Qx_1 ... x_n) → (F_P(Q) → F)
for P occurring at most negatively. We prove (1.9) and (1.10) simultaneously by induction on the length of F.
If F = Pt_1 ... t_n, then P occurs positively and we have
⊨ ∀x_1 ... ∀x_n(Px_1 ... x_n → Qx_1 ... x_n) → (Pt_1 ... t_n → Qt_1 ... t_n).
If F = ¬G and P occurs positively (negatively) in F, then P occurs negatively (positively) in G. By the induction hypothesis we have
(1.11) ⊨ ∀x_1 ... ∀x_n(Px_1 ... x_n → Qx_1 ... x_n) → (G_P(Q) → G)
or
(1.12) ⊨ ∀x_1 ... ∀x_n(Px_1 ... x_n → Qx_1 ... x_n) → (G → G_P(Q)),
respectively. But (1.11) and (1.12) immediately entail (1.9) and (1.10).
If F = G ∨ H and P occurs positively (negatively) in F, then P occurs at most positively (negatively) in both G and H. Thus we have by the induction hypothesis
(1.13) ⊨ ∀x_1 ... ∀x_n(Px_1 ... x_n → Qx_1 ... x_n) → (G → G_P(Q))
and
(1.14) ⊨ ∀x_1 ... ∀x_n(Px_1 ... x_n → Qx_1 ... x_n) → (H → H_P(Q)),
or we have
(1.15) ⊨ ∀x_1 ... ∀x_n(Px_1 ... x_n → Qx_1 ... x_n) → (G_P(Q) → G)
and
(1.16) ⊨ ∀x_1 ... ∀x_n(Px_1 ... x_n → Qx_1 ... x_n) → (H_P(Q) → H),
respectively. But (1.13) and (1.14) entail
⊨ ∀x_1 ... ∀x_n(Px_1 ... x_n → Qx_1 ... x_n) → (G ∨ H → (G ∨ H)_P(Q)),
and (1.15) and (1.16)
⊨ ∀x_1 ... ∀x_n(Px_1 ... x_n → Qx_1 ... x_n) → ((G ∨ H)_P(Q) → G ∨ H).
If F = ∃xG and P occurs positively (negatively) in F, then P occurs positively (negatively) in G and we get
(1.17) ⊨ ∀x_1 ... ∀x_n(Px_1 ... x_n → Qx_1 ... x_n) → (G → G_P(Q))
or
(1.18) ⊨ ∀x_1 ... ∀x_n(Px_1 ... x_n → Qx_1 ... x_n) → (G_P(Q) → G)
by the induction hypothesis. From (1.17) or (1.18), however, we immediately get
⊨ ∀x_1 ... ∀x_n(Px_1 ... x_n → Qx_1 ... x_n) → (∃xG → ∃xG_P(Q))
and
⊨ ∀x_1 ... ∀x_n(Px_1 ... x_n → Qx_1 ... x_n) → (∃xG_P(Q) → ∃xG).
Now we may use Lyndon's interpolation theorem to show that indeed any globally monotone operator is definable by a P-positive formula F.

Theorem 1.9.22. A first order definable operator is globally monotone iff it is definable by some P-positive formula.

Proof. One direction of the theorem is Lemma 1.9.21. To prove the opposite direction let Γ_F be globally monotone. Then we have
(S, P, Q) ⊨ ∀x_1 ... ∀x_n(Px_1 ... x_n → Qx_1 ... x_n) → ∀x(F → F_P(Q))
for any structure S and any expansion (S, P, Q) of S. Thus
⊨ ∀x_1 ... ∀x_n(Px_1 ... x_n → Qx_1 ... x_n) ∧ F → F_P(Q).
By Lyndon's interpolation theorem we get an interpolation formula E, i.e.
(1.19) ⊨ ∀x_1 ... ∀x_n(Px_1 ... x_n → Qx_1 ... x_n) ∧ F → E
and
(1.20) ⊨ E → F_P(Q).
Since Q only occurs positively in
∀x_1 ... ∀x_n(Px_1 ... x_n → Qx_1 ... x_n) ∧ F,
it can occur at most positively in E. Thus choosing Q as P in (1.19) and (1.20) we get ⊨ F ↔ E_Q(P), i.e. F is logically equivalent to a formula E_Q(P) which has at most positive occurrences of P.
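On a finite structure a monotone operator in the sense of Lemma 1.9.21 always reaches its least fixed point (cf. Exercise E 1.9.3 below) by simply iterating from the empty set: monotonicity makes the iterates an increasing chain, which must stabilize. A minimal sketch, with an operator of our own choosing as the example:

```python
def least_fixed_point(gamma):
    """Iterate a monotone operator on a finite powerset starting from the
    empty set.  Monotonicity guarantees an increasing chain, so on a finite
    domain the loop terminates at the least fixed point."""
    s = frozenset()
    while True:
        t = frozenset(gamma(s))
        if t == s:
            return s
        s = t

# Gamma(N) = {0} union {n+1 : n in N, n+1 <= 5} is monotone on Pow({0,...,5});
# its least fixed point is the whole domain, built up stage by stage.
gamma = lambda N: {0} | {n + 1 for n in N if n + 1 <= 5}
print(sorted(least_fixed_point(gamma)))  # [0, 1, 2, 3, 4, 5]
```

The stages ∅, {0}, {0, 1}, ... of this iteration are exactly the finite analogue of the stages of an inductive definition; on infinite domains the iteration has to be continued transfinitely.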
Exercises
E 1.9.1. Let F, G be quantifier free formulas of the first order language L. Let 0 be a constant symbol, f and | | function symbols and < a predicate symbol. Compute a prenex form of the following formulas:
a) ∀xF ↔ ∃xG
b) ∀x(0 < x → ∃y(0 < y ∧ ∀z(|z − x_0| < y → |f(z) − f(x_0)| < x))).
E 1.9.2. Let P, Q, R be binary predicate symbols of the Tait-language L^T. Compute, with respect to the proof of the interpolation theorem for the Tait-calculus, an interpolation formula for ∀x(∃yPxy ∧ ∃yQxy) and ∀x∃y(Pxy ∨ Rxy). Hint: first determine a derivation for ∀x(∃yPxy ∧ ∃yQxy) → ∀x∃y(Pxy ∨ Rxy) in the Tait-calculus.
E 1.9.3. Prove that every monotone operator has a least fixed point, i.e. if Γ is the operator, the least fixed point is a set s with
a) Γ(s) = s
b) ∀t(Γ(t) = t → s ⊆ t).
E 1.9.4. Prove or disprove: for every first order language L containing only finitely many constant and function symbols there is an n ∈ IN such that for every L-formula ∃xF with ⊢_T ∃xF there are t_1, ..., t_m, m ≤ n, with ⊢_T F_x(t_1) ∨ ... ∨ F_x(t_m).
E 1.9.5. Let L(C, F, P) be a first order language with C, F, P finite. Let P, Q be unary predicate symbols not in L, S a finite L-structure and R ⊆ S. We call R S-invariant iff for all isomorphisms φ : S ≅ S it is R = {φ(r) : r ∈ R}. For the notion of an isomorphism cf. Definition 2.2.6. We denote the expansions of an L-structure S' to L(P) and L(P, Q) by (S', P^{S'}) and (S', P^{S'}, Q^{S'}). Now, let P^S = R and F an L(P)-formula such that for all L-structures S'
(S', P^{S'}) ⊨ F ⇔ (S', P^{S'}) ≅ (S, P^S).
Prove the following claims:
a) R is S-invariant and (S', P^{S'}) ⊨ F ⇒ P^{S'} is S'-invariant.
b) R is S-invariant ⇒ (S', P^{S'}, Q^{S'}) ⊨ F ∧ F_P(Q) → ∀x(Px ↔ Qx)
c) R is S-invariant ⇔ there is an L-formula G with FV(G) ⊆ {x_0} such that R = {s ∈ S : S ⊨ G[s]}. Hint: use Beth's definability theorem.
d) For any set X ⊆ S there is an L-formula G with FV(G) ⊆ {x} such that {φ(s) : s ∈ X and φ : S ≅ S} = {s ∈ S : S ⊨ G[s]}.
E 1.9.6. Prove Proposition 1.9.17.
E 1.9.7. It is possible to strengthen Herbrand's lemma in the following way: if T is a theory with T ⊨ ∃xF, then there are finitely many terms t_1, ..., t_n such that T ⊨ F_x(t_1) ∨ ... ∨ F_x(t_n). Which restrictions have to be made on T and F so that the above strengthening is correct?
E 1.9.8. Do we have in general ⊨ F ↔ F_H, i.e. do we have for all L_H-structures S (which are expansions of L-structures) and all S-assignments Φ: S ⊨ (F ↔ F_H)[Φ]?
1.10 First Order Logic with Identity

The equality of objects in the domain of an L-structure is such a basic property that it should be possible to express it in any logic. Up to now we cannot do that. Therefore we are going to introduce a symbol '=' into the first order language L and call the extended language the 'language L with identity', denoted by L_I.

Definition 1.10.1. The 'standard' interpretation of '=' in any L_I-structure S is given by:
=^S is the set {(s, s) : s ∈ S}.

Since we don't want to repeat all the work we have done up to now for languages with identity, we try to treat '=' as a common binary predicate constant, so as to transfer the results of the previous sections to languages with identity. In order to give '=' the special meaning intended by its standard interpretation, we cannot treat it as an arbitrary binary predicate constant but have to fix its meaning by giving defining axioms for it.
Definition 1.10.2. The following sentences are the defining axioms for '='.
1. ∀x(x = x) (reflexivity)
2. ∀x∀y(x = y → y = x) (symmetry)
3. ∀x∀y∀z(x = y ∧ y = z → x = z) (transitivity)
4. ∀x_1 ... ∀x_n∀y_1 ... ∀y_n(x_1 = y_1 ∧ ... ∧ x_n = y_n → fx_1 ... x_n = fy_1 ... y_n) for all function symbols f in L (compatibility with functions)
5. ∀x_1 ... ∀x_n∀y_1 ... ∀y_n(x_1 = y_1 ∧ ... ∧ x_n = y_n → (Px_1 ... x_n → Py_1 ... y_n)) for all predicate symbols P in L (compatibility with relations).
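On a finite structure all five axiom schemes of Definition 1.10.2 are decidable by brute force, so whether a given binary relation can serve as an interpretation of '=' can simply be checked. A small sketch (the encoding of a structure as a domain list plus (function, arity) and (predicate, arity) pairs, and the name `is_congruence`, are ours):

```python
from itertools import product

def is_congruence(dom, eq, funcs, preds):
    """Check that a binary relation eq (a set of pairs) on the finite
    domain dom satisfies the defining axioms Id of Definition 1.10.2:
    an equivalence relation compatible with every function and predicate."""
    refl = all((a, a) in eq for a in dom)
    sym = all((b, a) in eq for (a, b) in eq)
    trans = all((a, c) in eq for (a, b) in eq for (b2, c) in eq if b == b2)
    comp_f = all((f(*xs), f(*ys)) in eq
                 for f, n in funcs
                 for xs in product(dom, repeat=n)
                 for ys in product(dom, repeat=n)
                 if all((x, y) in eq for x, y in zip(xs, ys)))
    comp_p = all(p(*ys)
                 for p, n in preds
                 for xs in product(dom, repeat=n)
                 for ys in product(dom, repeat=n)
                 if p(*xs) and all((x, y) in eq for x, y in zip(xs, ys)))
    return refl and sym and trans and comp_f and comp_p

# Domain {0,...,5} with a ~ b iff a = b (mod 3), successor mod 6, and the
# predicate 'divisible by 3': all Id axioms hold.
dom = list(range(6))
eq = {(a, b) for a in dom for b in dom if a % 3 == b % 3}
succ = lambda a: (a + 1) % 6
is_zero = lambda a: a % 3 == 0
print(is_congruence(dom, eq, [(succ, 1)], [(is_zero, 1)]))  # True
```

Replacing `is_zero` by, say, `lambda a: a == 0` breaks clause 5, since 0 ~ 3 but the predicate distinguishes them; the check then returns False.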
We call the set of defining axioms for '=' Id. Every axiom of Id is of the form ∀x⃗F with quantifier free F. We define Id_0 = {F_{x⃗}(t⃗) : BV(F) = ∅ ∧ ∀x⃗F ∈ Id} and call Id_0 the open version of the identity axioms. The open version is apparently as good as Id since we have

Proposition 1.10.3. Id_0 ⊨ F entails Id ⊨ F.

Proof. If Id_0 ⊨ F, then we have by compactness and the deduction theorem ⊨ F_1 ∧ ... ∧ F_n → F for F_1, ..., F_n ∈ Id_0. Each F_i is an instance of an axiom ∀x⃗_i G_i ∈ Id with ⊨ ∀x⃗_i G_i → F_i for i = 1, ..., n; this entails ⊨ ∀x⃗_1 G_1 ∧ ... ∧ ∀x⃗_n G_n → F. Thus Id ⊨ F by the deduction theorem.

In the rest of this section we are going to investigate the possible differences between the two viewpoints:
1. Taking = as a 'logical symbol' and interpreting it standardly in any L_I-structure;
2. Taking = as a 'non-logical symbol' (i.e. a symbol belonging to the set P) and interpreting it in L-structures which are models of Id.
The first and easiest observation is made by the following proposition.

Proposition 1.10.4. For any L_I-structure S we obviously have S ⊨ Id.

Definition 1.10.5. Let S_1 = (S_1, C^1, F^1, P^1), S_2 = (S_2, C^2, F^2, P^2) be L-structures. We call S_2 epimorphic to S_1 if there is a mapping φ : S_1 → S_2 satisfying the following conditions:
1. φ is onto
2. φ(c^{S_1}) = c^{S_2} for all c ∈ C
3. φ(f^{S_1}(s_1, ..., s_n)) = f^{S_2}(φ(s_1), ..., φ(s_n)) for all f ∈ F
4. (s_1, ..., s_n) ∈ P^{S_1} ⇔ (φ(s_1), ..., φ(s_n)) ∈ P^{S_2} for all P ∈ P.
Mappings satisfying 1.-4. are called epimorphisms. In mathematics we meet a lot of epimorphisms. E.g. if S_1, S_2 are groups, i.e. special L_GT-structures, then φ : S_1 → S_2 is an epimorphism in the sense of Definition 1.10.5 iff it is a group homomorphism and onto. Here we also refer the reader to Definitions 2.2.1 and 2.2.6.
Proposition 1.10.6. Let φ : S_1 → S_2 be an epimorphism and Φ an S_1-assignment. Then there is an S_2-assignment Φ^φ such that
1. t^{S_2}[Φ^φ] = φ(t^{S_1}[Φ]) for all L-terms t
2. Val_{S_2}(F, Φ^φ) = Val_{S_1}(F, Φ) for all L-formulas F.

Proof. Put Φ^φ(x) = φ(Φ(x)) and check 1. and 2. by induction on the length of t and F respectively, as an easy exercise.

Theorem 1.10.7. Let S be an L-structure satisfying Id. Then there is an L_I-structure S̄ epimorphic to S, i.e. a structure which interprets '=' standardly.

Proof. Let S = (S, C, F, P). We define S̄ = (S̄, C̄, F̄, P̄) as follows:
ā = {b ∈ S : a =^S b} for a ∈ S,
S̄ = {ā : a ∈ S},
c^{S̄} = the class of c^S for c ∈ C,
f^{S̄}(s̄_1, ..., s̄_n) = the class of f^S(s_1, ..., s_n) for f ∈ F, and
P^{S̄} = {(ā_1, ..., ā_n) : (a_1, ..., a_n) ∈ P^S} for P ∈ P.
Now it is easy to show that
1. f^{S̄} and P^{S̄} are well defined,
2. φ : S → S̄, φ(a) = ā is an epimorphism,
3. S̄ interprets = standardly.
The proofs are straightforward and left as an exercise.

In 1.10.4 we have seen that the standard interpretation of '=' is always a model of Id. Combining this with Theorem 1.10.7 we see that there is no essential difference between our two standpoints mentioned in the beginning of this section. Therefore it should not be difficult to transfer the results on pure logic to logic with identity. First we get:

Theorem 1.10.8 (Compactness theorem for logic with identity). Let M be a set of L_I-formulas such that every finite subset of M is L_I-consistent. Then M is L_I-consistent.

Proof. Let M_0 ⊆ M be a finite subset of M. Then there is an L_I-structure S and an S-assignment Φ such that S ⊨ F[Φ] for any F ∈ M_0. Since S ⊨ Id we see that M ∪ Id is finitely consistent. By the compactness theorem for pure logic we therefore get an L-structure S' and an S'-assignment Φ' with S' ⊨ F[Φ'] for all F ∈ M, and by 1.10.7 and 1.10.6, S' and Φ' can be boiled down to an L_I-structure S̃ and an S̃-assignment Φ̃ satisfying M.

Let us denote by M ⊨_Id F that S ⊨ M[Φ] ⇒ S ⊨ F[Φ] holds for any L_I-structure S and S-assignment Φ. Then we have
Theorem 1.10.9. M ⊨_Id F iff M ∪ Id ⊨ F.
Proof. We have M ⊨_Id F iff M ∪ {¬F} is LI-inconsistent, i.e. there is no LI-structure S and no S-assignment Φ satisfying the formulas in M ∪ {¬F}. But then M ∪ Id ∪ {¬F} is L-inconsistent, because any L-structure S and any S-assignment Φ satisfying S ⊨ M ∪ Id ∪ {¬F}[Φ] can be boiled down by 1.10.7 to an LI-structure and a corresponding assignment without changing the truth values of the formulas. On the other hand, if M ∪ {¬F} is LI-consistent, then M ∪ Id ∪ {¬F} is L-consistent
by 1.10.4.
Theorems 1.10.8 and 1.10.9 tell us that there is no essential difference between the two standpoints: regarding `=' as a logical or as an additional `non-logical' symbol. Therefore we are going to count `=' among the logical symbols. Since we have the compactness theorem for this language, all `model theoretic' properties are transferred. By Theorem 1.10.9, however, we see that there is also a calculus producing the valid formulas of first order logic with identity. We have ⊨_Id F iff Id ⊨ F, and all we have to do is to use a calculus for pure first order logic, e.g. the Hilbert-calculus introduced in section 1.7, and augment its axioms by the set Id. A bit more delicate to transfer are the results obtained by inspecting the Tait-calculus. A priori we cannot be sure that there is also a cut free calculus for first order logic with identity. We are going to check this in the next section.
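The quotient construction of Theorem 1.10.7 can be tried out concretely: for a finite structure one factors the carrier by the congruence interpreting `=' and checks that the induced operations are well defined. The following Python sketch is purely illustrative; the toy structure and all names are ours, not the text's.

```python
# Toy instance of the quotient construction in Theorem 1.10.7:
# carrier S, a congruence EQ interpreting `=', and a unary function f
# that respects EQ (congruent arguments get congruent values).
S = [0, 1, 2, 3]
EQ = {(a, b) for a in S for b in S if a % 2 == b % 2}  # two classes: evens, odds
f = {0: 1, 1: 0, 2: 3, 3: 2}                           # swaps parity, respects EQ

def cls(a):
    """The equivalence class a-bar = {b in S : a =_S b}."""
    return frozenset(b for b in S if (a, b) in EQ)

S_bar = {cls(a) for a in S}                  # carrier of the quotient S-bar
f_bar = {cls(a): cls(f[a]) for a in S}       # induced map f-bar(a-bar) = f(a)-bar

# Well-definedness (point 1 of the proof): congruent arguments, congruent values.
assert all(cls(f[a]) == cls(f[b]) for a, b in EQ)
# In S-bar, `=' is interpreted standardly: classes are equal iff identical.
print(len(S_bar))  # → 2
```

Point 2 of the proof corresponds to the map a ↦ cls(a), which is exactly the epimorphism φ.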
Exercises E 1.10.1.
a) Prove: Id ⊨ s1 = t1 ∧ … ∧ sn = tn → t_{x1,…,xn}(s1, …, sn) = t_{x1,…,xn}(t1, …, tn).
b) Prove: Id ⊨ s1 = t1 ∧ … ∧ sn = tn → (F_{x1,…,xn}(s1, …, sn) ↔ F_{x1,…,xn}(t1, …, tn)).
E 1.10.2. Prove Proposition 1.10.6. E 1.10.3. Prove Theorem 1.10.7.
1.11 A Tait-Calculus for First Order Logic with Identity
Since we have ⊨_Id F ⇒ Id ⊨ F by 1.10.9, it is obvious that for any formula F with ⊨_Id F we get ⊢_T ¬F1, …, ¬Fn, F for finitely many instances F1, …, Fn of axioms in Id. Thus adding axioms ⊢_T Δ, G for G ∈ Id and the cut rule would immediately give us ⊢_T F. All the results we got
I. Pure Logic
by the Tait-calculus, however, depended heavily on the fact that the calculus was cut free. Therefore we are going to try to cook up a cut free calculus also for predicate logic with identity. In the Tait-language we also have the symbol ≠ for inequality.
Definition 1.11.1. We define the calculus ⊢_Id Δ as follows. We augment the axioms of the Tait-calculus by the additional axiom
⊢_Id Δ, t = t for any L-term t (Id-axiom)
and the rules by the identity rule:
⊢_Id Δ, si = ti for i = 1, …, n and ⊢_Id Δ, P_{x1,…,xn}(t1, …, tn) imply ⊢_Id Δ, P_{x1,…,xn}(s1, …, sn),
where P denotes an atomic formula of the Tait-language for LI (Id-rule). The other axioms and rules are those of the ordinary Tait-calculus (cf. Definition 1.8.4).
Proposition 1.11.2. If ⊢_Id Δ, then ¬Δ is LI-inconsistent.
Proof. The proof is essentially that of 1.8.6, with the additional clauses that we have ⊢_Id Δ by an Id-axiom or an Id-rule. In the case of an Id-axiom we have (t = t) ∈ Δ. Hence (t ≠ t) ∈ ¬Δ and ¬Δ is LI-inconsistent. In the case of an Id-rule we have by the induction hypothesis the LI-inconsistency of ¬Δ ∪ {s1 ≠ t1, …, sn ≠ tn} and of ¬Δ ∪ {¬P_{x1,…,xn}(t1, …, tn)}. Let S be an LI-structure and Φ an S-assignment. We have to show that
S ⊭ (¬Δ ∪ {¬P_{x1,…,xn}(s1, …, sn)})[Φ].
If S ⊨ ¬Δ[Φ], then S ⊨ (si = ti)[Φ] for i = 1, …, n and
S ⊨ P_{x1,…,xn}(t1, …, tn)[Φ].
This immediately entails
S ⊨ P_{x1,…,xn}(s1, …, sn)[Φ]
since S is an LI-structure.
Corollary 1.11.3. ⊢_Id F entails ⊨_Id F and Id ⊨ F.
Proof. By 1.11.2 we first get ⊨_Id F, and this by 1.10.9 entails Id ⊨ F.
Our next aim is the proof of the opposite direction of 1.11.3. To prepare this we need the following lemma.
1.11. A Tait-Calculus for First Order Logic with Identity
Lemma 1.11.4. Assume ⊢_Id Γ, ¬P and ⊢_Id Δ, P for an atomic formula P. Then ⊢_Id Γ, Δ.
Proof. Without loss of generality we may assume that P is not an equation. We show the lemma by induction on the length of the derivation ⊢_Id Δ, P. If P ∈ Γ ∪ Δ we get ⊢_Id Γ, Δ by the structural rule and are done. Thus assume P ∉ Γ ∪ Δ. Then ⊢_Id Δ, P cannot be an L-axiom. If it is an Id-axiom, then (t = t) ∈ Δ for some L-term t, because P is not an equation. Then ⊢_Id Δ, Γ is an Id-axiom, too. If the last inference is
⊢_Id Δi, P ⇒ ⊢_Id Δ, P for i = 0 or i ∈ {0, 1},
then we get ⊢_Id Δi, Γ by the induction hypothesis and obtain ⊢_Id Δ, Γ by the same inference. Thus the only non-trivial case is that the main formula of the last inference is P. Since P is atomic, this can only be an inference according to the Id-rule. But then P = P_{x1,…,xn}(s1, …, sn) and we have the premises
(1.21) ⊢_Id Δ, s1 = t1, …, ⊢_Id Δ, sn = tn
and
(1.22) ⊢_Id Δ, P_{x1,…,xn}(t1, …, tn).
We have the Id-axioms
(1.23) ⊢_Id Δ, t1 = t1, …, ⊢_Id Δ, tn = tn
and obtain from (1.21) and (1.23)
(1.24) ⊢_Id Δ, t1 = s1, …, ⊢_Id Δ, tn = sn
by applications of the Id-rule. From (1.24) and the hypothesis
⊢_Id Γ, ¬P_{x1,…,xn}(s1, …, sn)
we get
(1.25) ⊢_Id Γ, ¬P_{x1,…,xn}(t1, …, tn)
by another application of the Id-rule. But (1.25) together with (1.22) yields ⊢_Id Δ, Γ
by the induction hypothesis.
For the next lemma we introduce the set
Id^t = {∀x_{k+1} … ∀x_n F_{x1,…,xk}(t1, …, tk) : F ∈ Id0 and t1, …, tk are arbitrary L-terms}.
Lemma 1.11.5. Assume Γ ⊆ Id^t and ⊢_T ¬Γ, Δ. Then ⊢_Id Δ.
Proof. We induct on the length of the derivation of ⊢_T ¬Γ, Δ. If ⊢_T ¬Γ, Δ is an L-axiom, then either already Δ is an L-axiom or Δ contains an equation t = t for some L-term t such that (t ≠ t) ∈ ¬Γ (these are the only atomic formulas occurring in ¬Γ). In the first case we get ⊢_Id Δ as an L-axiom, in the second as an Id-axiom.
If the last inference is
⊢_T ¬Γ, Δi ⇒ ⊢_T ¬Γ, Δ for i = 0 or i ∈ {0, 1},
then we get ⊢_Id Δi by the induction hypothesis and deduce ⊢_Id Δ by the same inference, which is possible because the Tait-calculus with identity comprises the pure Tait-calculus. Thus the crucial cases are those in which the main formula of the last inference belongs to the set ¬Γ. There we have the following sub-cases.
1. F = ∀xG. Then the main formula is ¬F, and we have the premise ⊢_T ¬Γ, Δ, ¬Gx(t). But if ∀xG ∈ Id^t we also have Gx(t) ∈ Id^t and obtain ⊢_Id Δ by the induction hypothesis.
2. F = (s = t → t = s). Then ¬F = (s = t ∧ t ≠ s), and we have the premises ⊢_T ¬Γ, Δ, s = t and ⊢_T ¬Γ, Δ, t ≠ s. By the induction hypothesis we get
⊢_Id Δ, s = t and ⊢_Id Δ, t ≠ s.
From these we obtain
⊢_Id Δ, s ≠ s
by an application of the Id-rule. Using an Id-axiom ⊢_Id Δ, s = s we get ⊢_Id Δ.
3. F = (s = t ∧ t = r → s = r). Then ¬F = (s = t ∧ t = r) ∧ s ≠ r, and we have the premises
⊢_T ¬Γ, Δ, s = t ∧ t = r and ⊢_T ¬Γ, Δ, s ≠ r,
which by ∧-inversion (of the exercises) entail that there are derivations
⊢_T ¬Γ, Δ, s = t and ⊢_T ¬Γ, Δ, t = r
which are not longer than that of ⊢_T ¬Γ, Δ, s = t ∧ t = r.
By the induction hypothesis these yield
(1.26) ⊢_Id Δ, s = t,
(1.27) ⊢_Id Δ, t = r,
and
(1.28) ⊢_Id Δ, s ≠ r.
From (1.26) and (1.27) we get ⊢_Id Δ, s = r by the Id-rule. This together with (1.28) entails ⊢_Id Δ by Lemma 1.11.4.
4. F = (s1 = t1 ∧ … ∧ sn = tn → fs1…sn = ft1…tn). Then we have the premises
(1.29) ⊢_T ¬Γ, Δ, s1 = t1, …, ⊢_T ¬Γ, Δ, sn = tn
and
(1.30) ⊢_T ¬Γ, Δ, fs1…sn ≠ ft1…tn
(where we tacitly use ∧-inversion to get (1.29)). By the induction hypothesis these yield
(1.31) ⊢_Id Δ, s1 = t1, …, ⊢_Id Δ, sn = tn
and
(1.32) ⊢_Id Δ, fs1…sn ≠ ft1…tn.
From (1.31) and the Id-axiom
(1.33) ⊢_Id Δ, ft1…tn = ft1…tn
we get
(1.34) ⊢_Id Δ, fs1…sn = ft1…tn.
From (1.32) and (1.34) we get
(1.35) ⊢_Id Δ
by Lemma 1.11.4.
5. F = (s1 = t1 ∧ … ∧ sn = tn → (Ps1…sn → Pt1…tn)), i.e. ¬F = (s1 = t1 ∧ … ∧ sn = tn ∧ Ps1…sn ∧ ¬Pt1…tn). Then we have the premises (again using ∧-inversion)
(1.36) ⊢_T ¬Γ, Δ, s1 = t1, …, ⊢_T ¬Γ, Δ, sn = tn,
(1.37) ⊢_T ¬Γ, Δ, Ps1…sn
and
(1.38) ⊢_T ¬Γ, Δ, ¬Pt1…tn.
We apply the induction hypothesis to (1.36), (1.37) and (1.38) and get by an application of the Id-rule
⊢_Id Δ, ¬Ps1…sn and ⊢_Id Δ, Ps1…sn,
which again by Lemma 1.11.4 entails ⊢_Id Δ.
Theorem 1.11.6. We have ⊨_Id F iff ⊢_Id F.
Proof. The direction from right to left is Corollary 1.11.3. Thus assume ⊨_Id F. Then by 1.10.9 we have Id ⊨ F, which by the compactness and deduction theorem entails ⊨ F1 ∧ … ∧ Fn → F for formulas F1, …, Fn ∈ Id ∪ Id^t. By the completeness theorem for the Tait-calculus this yields ⊢_T ¬F1, …, ¬Fn, F, and by 1.11.5 we finally get ⊢_Id F.
Lemma 1.11.7 (Herbrand's lemma for logic with identity). If F is an ∃-formula with ⊨_Id ∃xF, then there are finitely many L-terms t1, …, tn such that
⊢_Id Fx(t1) ∨ … ∨ Fx(tn).
Proof. Assume ⊨_Id ∃xF. Then
⊢_T ¬F1, …, ¬Fn, ∃xF
for {F1, …, Fn} ⊆ Id by 1.10.9, the deduction theorem and the completeness theorem for the Tait-calculus. All formulas in Id are ∀-formulas. Hence ¬F1, …, ¬Fn are ∃-formulas, and we may apply Lemma 1.9.2 to get finitely many terms t1, …, tn such that ⊢_T ¬F1, …, ¬Fn, Fx(t1), …, Fx(tn). This together with Lemma 1.11.5 entails ⊢_Id Fx(t1) ∨ … ∨ Fx(tn).
Lemma 1.11.7 entails Herbrand's theorem for predicate logic with identity in the same way as Lemma 1.9.2 did in the case of pure logic. All the proofs, including the construction of the prenex form of a formula, can be literally transferred.
Theorem 1.11.8 (Herbrand's theorem for logic with identity). Let F be an L-sentence and FH = ∃x1 … ∃xk G. Then we have ⊨_Id F iff there are finitely many k-tuples
(t11, …, t1k), …, (tn1, …, tnk)
of LH-terms such that
⊢_Id G_{x1,…,xk}(t11, …, t1k) ∨ … ∨ G_{x1,…,xk}(tn1, …, tnk).
Another application of Lemma 1.11.5 is the interpolation theorem for predicate logic with identity. In a language with identity we can even avoid the addition of the empty formula, which was needed to state the interpolation theorem for pure logic in a general setting (cf. Theorem 1.9.13). We observe that the formula t = t for closed terms t behaves like ⊤ and dually t ≠ t like ⊥. We have
(1.39) ⊢_Id Δ, t = t
for any formula set Δ by an Id-axiom, and ⊢_Id Δ, t ≠ t ⇒ ⊢_Id Δ by (1.39) and Lemma 1.11.4. Thus we get the general interpolation theorem for the calculus ⊢_Id. Of course, we can no longer count `=' among the non-logical symbols, because t ≠ t, in the role of the empty formula, must not contain a non-logical predicate symbol.
Theorem 1.11.9 (Interpolation theorem for logic with identity). If we have ⊨_Id F → G, then there is an interpolation formula E for F and G, i.e. we have ⊨_Id F → E and ⊨_Id E → G, and E contains at most those non-logical predicate constants positively (negatively) which occur simultaneously positively (negatively) in F and G.
Proof. By the above remark and Theorem 1.11.6 it is just the same as the proof of 1.9.14, by a slight modification of the proof of Theorem 1.9.9 for the calculus ⊢_Id in the axiom case and for the Id-rule.
We will now leave these more syntactical investigations and turn to the fundamentals of model theory.
Chapter 2
Fundamentals of Model Theory
The objects of interest in model theory are structures and classes of structures. Here the connections to first order logic will be studied. Especially it is analysed whether and how structures and classes of structures can be described within first order logic. In this chapter we will only deal with predicate logic with identity. Therefore there will be no need to emphasise that, so we just write ⊨ instead of ⊨_Id, etc.
2.1 Conservative Extensions and Extensions by Definitions
Definition 2.1.1. Let L be a first order language. An L-theory is a set of L-sentences. L-theories are denoted by T, T0, … If F is an L-formula and T a theory such that T ⊨ F, then F is called provable in T. We call the sentences provable in T theorems of T. The language of a theory T is L(T) = L(C_T, F_T, P_T), where C_T, F_T and P_T are the sets of constant, function and predicate symbols which occur in sentences of T. A structure S which satisfies all sentences of T is called a T-model.
Here we give our standard example. Let LGT be the language of group theory (cf. section 1.1). Then AxGT, the set of the following formulas,
∀x∀y∀z(x ∘ (y ∘ z) = (x ∘ y) ∘ z)
∀x(x ∘ 1 = x)
∀x∃y(x ∘ y = 1)
is an LGT-theory. An AxGT-model is just a group. Because in every group G we have G ⊨ ∀x(1 ∘ x = x), the sentence ∀x(1 ∘ x = x) is a theorem of AxGT. As there are also non-commutative groups,
∀x∀y(x ∘ y = y ∘ x) is not a theorem of AxGT.
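That ∀x(1 ∘ x = x) is indeed a theorem of AxGT can be checked by a standard equational calculation (our own illustration; the text leaves it to the reader): one first shows that a right inverse is also a left inverse, and then that 1 is a left unit.

```latex
% Let y be a right inverse of x (x o y = 1) and z a right inverse of y (y o z = 1).
\begin{align*}
y\circ x &= (y\circ x)\circ 1 = (y\circ x)\circ(y\circ z)
          = ((y\circ x)\circ y)\circ z\\
         &= (y\circ(x\circ y))\circ z = (y\circ 1)\circ z = y\circ z = 1,\\
1\circ x &= (x\circ y)\circ x = x\circ(y\circ x) = x\circ 1 = x.
\end{align*}
```

Each step uses only associativity, the right-unit axiom, and the two right-inverse equations.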
II. Model Theory
Definition 2.1.2. Assume that L ⊆ L′ are first order languages.
a) An L′-theory T′ is an extension of an L-theory T if every T-theorem is also a T′-theorem, i.e. T ⊨ F ⇒ T′ ⊨ F.
b) An extension T′ of T is conservative if for all L-sentences F we also have T′ ⊨ F ⇒ T ⊨ F.
Now think of the language LGT′ ⊇ LGT of section 1.5. In LGT′ we have a unary function symbol ⁻¹ for the inverse function. That is, we set AxGT′ = AxGT ∪ {∀x(x ∘ x⁻¹ = 1)}. Our intuitive understanding of groups may say that AxGT′ is a conservative extension of AxGT, i.e. AxGT′ doesn't prove more LGT-sentences than AxGT does. That this impression is correct can be obtained by the following result.
Theorem 2.1.3. Assume that T and T′ are theories such that every T-model expands to a T′-model. Then T′ is conservative over T.
Proof. Let L = L(T) and L′ = L(T′). Let F be an L-sentence such that T′ ⊨ F. If S is a T-model, expand it to a T′-model S′. Then S′ ⊨ F. But S is the L-retract of S′ and F is in L. Thus S ⊨ F. Hence T ⊨ F.
We already had an example of a conservative extension, namely LI and L: since any L-structure expands to an LI-structure satisfying Id, we re-obtain the result that predicate logic with identity is a conservative extension of pure logic.
The above example of the inverse function in group theory gives reason for the following question: in mathematics it is usual to define new functions and relations and prove theorems using those definitions. Usually we assume that there is no difference whether we use those new definitions or not. By the following definition we make precise what we usually do in mathematics, and the next theorem justifies mathematical practice.
Definition 2.1.4. Let T be a theory with language L = L(T). A language L′ = L(C′, F′, P′) ⊇ L is called an extension of T by definitions if the following conditions are satisfied:
1. For every c ∈ C′ ∖ C there is an L(T)-formula F^c such that FV(F^c) = {x} and T ⊨ ∃x(F^c ∧ ∀y(F^c_x(y) → y = x)), briefly denoted by T ⊨ ∃!x F^c(x).
2. For every f ∈ F′ ∖ F with #f = n there is an L(T)-formula F^f such that FV(F^f) = {x1, …, xn, y} and T ⊨ ∀x1 … ∀xn ∃!y F^f.
3. For every P ∈ P′ ∖ P with #P = n there is an L(T)-formula F^P such that FV(F^P) = {x1, …, xn}.
If L′ is an extension by definitions of a theory T, then T(L′) is the theory which contains the following sentences:
1. all sentences in T, i.e. T ⊆ T(L′),
2. all sentences F^c_x(c) for c ∈ C′ ∖ C,
3. all sentences ∀x1 … ∀xn F^f_y(fx1…xn) for f ∈ F′ ∖ F,
4. all sentences ∀x1 … ∀xn(Px1…xn ↔ F^P) for P ∈ P′ ∖ P.
We have seen LGT′ to be an extension of AxGT by definitions.
Theorem 2.1.5. If L′ is an extension of T by definitions, then T(L′) is conservative over T.
Proof. Let S = (S, C, F, P) be a T-model. We have to expand S to a T(L′)-model S′. To obtain an L′-structure we first have to interpret the symbols in L′ which do not belong to L.
1. If c ∈ C′ ∖ C, then there is a formula F^c such that T ⊨ ∃!x F^c. Because of S ⊨ T we also have S ⊨ ∃!x F^c, which entails that there is a uniquely determined element s ∈ S such that S ⊨ F^c[s]. We put c^S′ = s. Then we obviously get
(2.1) S′ ⊨ F^c_x(c).
2. Let f ∈ F′ ∖ F and #f = n. Then there is an L(T)-formula F^f such that
(2.2) S ⊨ ∀x1 … ∀xn ∃!y F^f.
We define
f^S′(s1, …, sn) = t if S ⊨ F^f[s1, …, sn, t].
This defines a function because for arbitrary s1, …, sn ∈ S there is exactly one t such that S ⊨ F^f[s1, …, sn, t] by (2.2). Then we obviously have
(2.3) S′ ⊨ ∀x1 … ∀xn F^f_y(fx1…xn).
3. For P ∈ P′ ∖ P with #P = n we have an L(T)-formula F^P such that FV(F^P) = {x1, …, xn}. Define
P^S′ = {(s1, …, sn) ∈ S^n : S ⊨ F^P[s1, …, sn]}.
Then we get
(2.4) S′ ⊨ ∀x1 … ∀xn(Px1…xn ↔ F^P).
From (2.1), (2.3) and (2.4) we get that S′ is a T(L′)-model. Thus by 2.1.3, T(L′) is a conservative extension of T.
Now we can strengthen Theorem 2.1.5 in such a way that we have for every formula of the extension by definitions L′ of a theory T an equivalent L(T)-formula, i.e. it is possible to replace the newly defined symbols by their definitions.
Theorem 2.1.6. Let L′ be an extension of T by definitions. Then for each L′-formula F there is an L(T)-formula F^T such that T(L′) ⊨ F ↔ F^T.
Proof. In a first step we prove: for any L′-term t there is an L(T)-formula Gt such that
(2.5) T(L′) ⊨ t = x ↔ Gt for x ∉ FV(t),
by induction on the length of t. If t is an L-term we put Gt = (t = x). If t = c ∈ C′ ∖ C, then there is an L-formula F^c such that T ⊨ ∃!x F^c and T(L′) ⊨ F^c_x(c). We put G^c = F^c. Then if c = x we get G^c from F^c_x(c). On the other hand, if G^c, we get c = x from T(L′) ⊨ ∃!x G^c and T(L′) ⊨ G^c_x(c). Hence T(L′) ⊨ c = x ↔ G^c.
If t = fs1…sn, then there are formulas G^{si} such that
(2.6) T(L′) ⊨ si = x ↔ G^{si}
for i ∈ {1, …, n}. If f ∉ F, then there is an L(T)-formula F^f such that FV(F^f) = {x1, …, xn, y}, T(L′) ⊨ ∀x1 … ∀xn ∃!y F^f and
(2.7) T(L′) ⊨ ∀x1 … ∀xn F^f_y(fx1…xn).
We define
Gt = ∃x1 … ∃xn(G^{s1}_x(x1) ∧ … ∧ G^{sn}_x(xn) ∧ F^f_y(x)).
Then we have
(2.8) T(L′) ⊨ fs1…sn = x ↔ Gt.
To show (2.8) we observe that by (2.6) we have
(2.9) T(L′) ⊨ ∃x1 … ∃xn(G^{s1}_x(x1) ∧ … ∧ G^{sn}_x(xn)).
On the other hand we also have
(2.10) T(L′) ⊨ fs1…sn = x ∧ x1 = s1 ∧ … ∧ xn = sn → fx1…xn = x.
From (2.6), (2.7) and (2.10) we therefore get
T(L′) ⊨ fs1…sn = x ∧ ∃x1 … ∃xn(G^{s1}_x(x1) ∧ … ∧ G^{sn}_x(xn)) → ∃x1 … ∃xn(G^{s1}_x(x1) ∧ … ∧ G^{sn}_x(xn) ∧ F^f_y(x)),
which together with (2.9) yields
T(L′) ⊨ fs1…sn = x → Gt.
For the opposite direction we observe that by (2.5) and (2.6) we have
T(L′) ⊨ F^f_y(x) → fx1…xn = x.
Thus
T(L′) ⊨ ∃x1 … ∃xn(s1 = x1 ∧ … ∧ sn = xn ∧ F^f_y(x)) → fs1…sn = x,
which together with (2.6) entails T(L′) ⊨ Gt → fs1…sn = x. If f ∈ F we put
Gt = ∃x1 … ∃xn(G^{s1}_x(x1) ∧ … ∧ G^{sn}_x(xn) ∧ fx1…xn = x)
and show T(L′) ⊨ fs1…sn = x ↔ Gt as above. This terminates the proof of (2.5).
Next we show: for an atomic formula Pt1…tn there is an L(T)-formula G such that
(2.11) T(L′) ⊨ Pt1…tn ↔ G.
If P ∉ P, then there is an L(T)-formula F^P with FV(F^P) = {x1, …, xn} and
(2.12) T(L′) ⊨ ∀x1 … ∀xn(Px1…xn ↔ F^P).
We put
G = ∃x1 … ∃xn(G^{t1}_x(x1) ∧ … ∧ G^{tn}_x(xn) ∧ F^P),
and if P ∈ P we may just put F^P = Px1…xn, i.e.
G = ∃x1 … ∃xn(G^{t1}_x(x1) ∧ … ∧ G^{tn}_x(xn) ∧ Px1…xn),
where the G^{ti} are the formulas given by (2.5). Then we get T(L′) ⊨ Pt1…tn ↔ G, since we have
T(L′) ⊨ ∃x1 … ∃xn(G^{t1}_x(x1) ∧ … ∧ G^{tn}_x(xn))
and
T(L′) ⊨ Pt1…tn ∧ ∃x1 … ∃xn(G^{t1}_x(x1) ∧ … ∧ G^{tn}_x(xn)) → ∃x1 … ∃xn(G^{t1}_x(x1) ∧ … ∧ G^{tn}_x(xn) ∧ F^P),
whence by (2.6) and (2.12)
T(L′) ⊨ Pt1…tn → G,
and the opposite direction follows since also
T(L′) ⊨ ∃x1 … ∃xn(G^{t1}_x(x1) ∧ … ∧ G^{tn}_x(xn) ∧ F^P) → Pt1…tn
by (2.12) and (2.6). From (2.11), however, we get: for any L′-formula F there is an L(T)-formula F^T such that
(2.13) T(L′) ⊨ F ↔ F^T,
defining F^T inductively by the clauses
1. (Pt1…tn)^T = G, where G is as in (2.11),
2. (F ∧ G)^T = F^T ∧ G^T,
3. (¬F)^T = ¬(F^T),
4. (∃xF)^T = ∃x(F^T).
Now an easy induction on the length of F shows (2.13).
Corollary 2.1.7. Let L′ be an extension of T by definitions. For every L′-formula F there is an L(T)-formula F^T such that T(L′) ⊨ F iff T ⊨ F^T.
Proof. Take F^T as in 2.1.6. Then we have
(2.14) T(L′) ⊨ F ↔ F^T.
Thus if T(L′) ⊨ F, then T(L′) ⊨ F^T, which entails T ⊨ F^T by 2.1.5 since F^T is an L(T)-formula. On the other hand, if T ⊨ F^T, then of course T(L′) ⊨ F^T, which entails T(L′) ⊨ F by (2.14).
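The heart of Theorem 2.1.6 is that newly defined symbols can be unfolded into their defining L(T)-formulas. The following Python sketch illustrates the predicate case on a toy formula representation; the representation, the example definition and all names are ours, not the text's.

```python
# Toy illustration of unfolding a defined predicate symbol (clause 4 of
# Definition 2.1.4): Divides(x, y) is defined as  exists z (x * z = y).
# Formulas are nested tuples; variables and operators are strings.
DEFS = {"Divides": ("exists", "z", ("eq", ("mul", "x", "z"), "y"))}

def substitute(formula, env):
    """Naive substitution of argument terms for the definition's free
    variables (assumes no variable clashes, good enough for the toy)."""
    if isinstance(formula, str):
        return env.get(formula, formula)
    return tuple(substitute(part, env) for part in formula)

def eliminate(formula):
    """Translate F into F^T by unfolding defined predicate symbols."""
    if isinstance(formula, str):
        return formula
    head = formula[0]
    if head in DEFS:                 # Divides(s, t) -> definition with x:=s, y:=t
        return substitute(DEFS[head], {"x": formula[1], "y": formula[2]})
    return tuple(eliminate(part) for part in formula)

F = ("forall", "u", ("Divides", "u", ("mul", "u", "u")))
print(eliminate(F))
# → ('forall', 'u', ('exists', 'z', ('eq', ('mul', 'u', 'z'), ('mul', 'u', 'u'))))
```

The result contains no defined symbol any more, exactly as Corollary 2.1.7 promises.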
Exercises
E 2.1.1. Let L1 be an extension by definitions of T and L2 an extension by definitions of T(L1). Prove that L2 is an extension by definitions of T.
E 2.1.2. Let T be a consistent L-theory and Δ a set of sentences with
F1, …, Fn ∈ Δ ⇒ F1 ∨ … ∨ Fn ∈ Δ.
Show the equivalence of:
1. T has an axiom system Γ ⊆ Δ, i.e. T ⊨ Γ and Γ ⊨ T.
2. For all L-structures S, S′: S ⊨ T and ∀F ∈ Δ(S ⊨ F ⇒ S′ ⊨ F) implies S′ ⊨ T.
Hint: For the interesting direction let Γ = {F ∈ Δ : T ⊨ F}. To prove Γ ⊨ T, take an arbitrary S′ ⊨ Γ and define the set Θ = {¬F : S′ ⊨ ¬F and F ∈ Δ}. Show the consistency of T ∪ Θ and derive the premise of 2.
2.2 Completeness and Categoricity
It is a natural question to ask whether there is a set of sentences, i.e. an axiom system, which characterises the theorems of a given L-structure S. The obvious answer is of course `yes': just take
Th(S) = {F : F is an L-sentence and S ⊨ F}.
But of course this is not what we really meant. Our question was whether there is some `simple' set of sentences, whatever `simple' could mean. A possible `simple' set would be a finite set of sentences, or at least a set of sentences together with an algorithm which allows us to decide whether a sentence belongs to the set or not. Before we make more precise what a `simple' set of sentences really could mean, we shall study some general properties of axiom systems. But first we take a look at the connections between two given structures.
Definition 2.2.1. Let S1, S2 be two L-structures.
a) We call φ : S1 → S2 an embedding, written as φ : S1 ↪ S2, if the following conditions are satisfied:
1. φ is one-one.
2. φ(c^S1) = c^S2 for all c ∈ C.
3. φ(f^S1(s1, …, sn)) = f^S2(φ(s1), …, φ(sn)) for all f ∈ F and s1, …, sn ∈ S1.
4. (s1, …, sn) ∈ P^S1 ⇔ (φ(s1), …, φ(sn)) ∈ P^S2 for all P ∈ P and all s1, …, sn ∈ S1.
b) We call φ : S1 → S2 an elementary embedding, written as φ : S1 ≺ S2, if we have
1. φ : S1 ↪ S2,
2. for any L-formula F and any S1-assignment Φ we have
S1 ⊨ F[Φ] ⇔ S2 ⊨ F[Φ^φ],
where Φ^φ is the S2-assignment φ ∘ Φ.
c) We call S1 a substructure of S2, written as S1 ⊆ S2, if the identity is an embedding, i.e. id_S1 : S1 ↪ S2.
d) S1 is an elementary substructure of S2, or synonymously S2 is an elementary extension of S1, written as S1 ≺ S2, if the identity is an elementary embedding, i.e. id_S1 : S1 ≺ S2.
This definition describes some possible relationships between two given structures. That these relations can in turn be described in terms of the validity of certain sets of formulas will be established in the following lemma. Recall (cf. section 1.3) that for an L-structure S we introduced the language LS, which contains a constant symbol s̄ for every element s ∈ S, the domain of S. By SS we denoted the LS-expansion of S interpreting each constant s̄ by s. We showed in 1.3.6 and 1.3.7:
t^{SS}[s] = (tx(s̄))^{SS} for closed LS-terms tx(s̄), and
SS ⊨ F[s] iff SS ⊨ Fx(s̄),
SS ⊨ ∀xF iff SS ⊨ Fx(s̄) for all s ∈ S,
SS ⊨ ∃xF iff SS ⊨ Fx(s̄) for some s ∈ S,
for LS-sentences Fx(s̄).
Definition 2.2.2.
a) The diagram of an L-structure S is the set
Diag(S) = {F : F is an atomic or negated atomic LS-sentence and SS ⊨ F}.
b) The elementary diagram of an L-structure S is the set
Th(SS) = {F : F is an LS-sentence and SS ⊨ F}.
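For a finite structure the diagram is a finite set and can be listed explicitly. The following sketch uses a toy encoding of our own (predicate atoms only; equations between the names are omitted for brevity):

```python
from itertools import product

# A finite structure for a language with one binary predicate R:
# carrier {0, 1}, R interpreted as the strict order <.
S = [0, 1]
R = {(0, 1)}

def diagram(carrier, rel):
    """Diag(S): the atomic or negated atomic L_S-sentences true in S_S.
    The name of element n is written '_n'."""
    sentences = set()
    for a, b in product(carrier, repeat=2):
        atom = f"R(_{a}, _{b})"
        sentences.add(atom if (a, b) in rel else f"not {atom}")
    return sentences

print(sorted(diagram(S, R)))
```

By Corollary 2.2.5 below, any structure in which all these sentences hold (under some interpretation of the names) contains a copy of S as a substructure.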
Proposition 2.2.3. Let S1, S2 be two L-structures and φ : S1 → S2. Define the LS1-expansion S′ of S2 by s̄^S′ = φ(s).
a) If φ : S1 ↪ S2, then for all LS1-terms t and all S1S1-assignments Φ we have φ(t^{S1S1}[Φ]) = t^S′[Φ^φ].
b) If φ : S1 ↪ S2, then S′ ⊨ Diag(S1).
c) If φ : S1 ≺ S2, then S′ ⊨ Th(S1S1).
Proof. This is left as an exercise to the reader.
Proposition 2.2.4. Let S1, S2 be two L-structures and S′ an LS1-expansion of S2. Define φ : S1 → S2 by φ(s) = s̄^S′.
a) If S′ ⊨ Diag(S1), then φ : S1 ↪ S2.
b) If S′ ⊨ Th(S1S1), then φ : S1 ≺ S2.
Proof. We prove only the first part. First we have to show that φ is one-one. If we have r, s ∈ S1 with r ≠ s, then S1 ⊨ r̄ ≠ s̄. Since S′ ⊨ Diag(S1) we know φ(r) = r̄^S′ ≠ s̄^S′ = φ(s), and so φ is one-one.
Now let c ∈ C. Then we have S1S1 ⊨ c = (c^S1)¯, where (c^S1)¯ is the name for the interpretation of c in S1. Since S′ ⊨ Diag(S1) we obtain
φ(c^S1) = ((c^S1)¯)^S′ = c^S′ = c^S2.
If we take f ∈ F and s1, …, sn ∈ S1, we have S1S1 ⊨ f s̄1…s̄n = (f^S1(s1, …, sn))¯. Because of S′ ⊨ Diag(S1) we have
φ(f^S1(s1, …, sn)) = ((f^S1(s1, …, sn))¯)^S′ = (f s̄1…s̄n)^S′ = f^S2(φ(s1), …, φ(sn)).
If we take P ∈ P, then, since S′ ⊨ Diag(S1),
S1S1 ⊨ P s̄1…s̄n iff S′ ⊨ P s̄1…s̄n.
But this yields
(s1, …, sn) ∈ P^S1 iff (φ(s1), …, φ(sn)) ∈ P^S′ = P^S2.
Corollary 2.2.5. Let S1, S2 be two L-structures whose domains satisfy S1 ⊆ S2.
a) S1 ⊆ S2 iff S2S1 ⊨ Diag(S1).
b) S1 ≺ S2 iff S2S1 ⊨ Th(S1S1).
Definition 2.2.6. Let S1, S2 be two L-structures.
a) We call S1 and S2 elementarily equivalent, denoted by S1 ≡ S2, if Th(S1) = Th(S2).
b) We call S1 and S2 isomorphic, written as S1 ≅ S2, if there is a one-one function φ from S1 onto S2 such that φ : S1 ↪ S2 and φ⁻¹ : S2 ↪ S1.
Definition 2.2.7.
a) An axiom system for an L-structure S is a set Ax ⊆ Th(S).
b) An axiom system Ax is complete for S if Con(Ax) = Th(S), where Con(Ax) = {F : F is an L-sentence and Ax ⊨ F} is the set of logical consequences of Ax.
c) A theory T is complete if T is a complete axiom system for some L-structure S.
d) The model class ModL(T) of an L-theory T is defined as ModL(T) = {S : S ⊨ T}.
e) An axiom system Ax for S is categorical for S if every structure in ModL(Ax) is isomorphic to S.
f) A theory T is (κ-)categorical (for some cardinal κ) if any two models of T (of cardinality κ) are isomorphic.
Proposition 2.2.8. Let Ax be categorical for S. Then Ax is also complete for S.
Proof. Con(Ax) ⊆ Th(S) follows from Ax ⊆ Th(S). If F ∈ Th(S) and S′ ⊨ Ax, then S′ is isomorphic to S. Hence F ∈ Th(S) = Th(S′), i.e. S′ ⊨ F, and we have Ax ⊨ F.
But there are only uninteresting categorical axiom systems (cf. Exercise E 2.2.10), so we will not mention them further. We are able to give some characterisations of complete theories:
Lemma 2.2.9. Let T be an L-theory. Then the following statements are equivalent:
1. T is complete.
2. Con(T) is maximal and consistent.
3. T is consistent and Con(T) = Th(S) for all S ∈ ModL(T).
4. T is consistent and for any two S, S′ ∈ ModL(T) we have S ≡ S′.
Proof. 1. ⇒ 2. is obvious, since Con(T) = Th(S) for some L-structure S and Th(S) is consistent and maximal.
2. ⇒ 3. Let S ∈ ModL(T). Then T ⊆ Th(S). If F ∈ Th(S), then F ∈ Con(T), because F ∉ Con(T) entails ¬F ∈ Con(T) ⊆ Th(S), a contradiction.
3. ⇒ 4. is obvious since Th(S) = Th(S′) = Con(T) for S, S′ ∈ ModL(T).
3. ⇒ 1. holds trivially.
4. ⇒ 3. ModL(T) ≠ ∅ since T is consistent. Pick S ∈ ModL(T). Then T ⊆ Th(S). If F ∈ Th(S) but F ∉ Con(T), then T ∪ {¬F} is consistent and thus has a model S′, which also belongs to ModL(T). Thus S′ ≡ S, which contradicts the fact that ¬F ∈ Th(S′) and F ∈ Th(S).
For an L-theory T we define the cardinality of T by card(T) = card(LT), i.e. by the cardinality of the set of non-logical symbols occurring in T.
Lemma 2.2.10. Let T be an L-theory.
a) If T possesses arbitrarily large finite models, then T also has an infinite model.
b) If T has an infinite model, then T has a model of cardinality κ for any cardinal κ ≥ max(card(T), ℵ0).
Proof.
a) Put
T′ = T ∪ {∀x0 … ∀xn ∃y(x0 ≠ y ∧ … ∧ xn ≠ y) : n ∈ ℕ}.
Then T′ is finitely consistent since T possesses arbitrarily large finite models. By compactness T′ is consistent. A model of T′ of course cannot be finite.
b) Let κ be a cardinal ≥ max(card(T), ℵ0) and choose a set {c_α : α < κ} of new constant symbols. Put
T′ = T ∪ {c_α ≠ c_β : α, β < κ, α ≠ β}.
Any finite subset T0 of T′ contains only finitely many of the new constant symbols c_α. Since T possesses an infinite model S0, we may expand it to a T0-model S1 by interpreting the c_{α1}, …, c_{αn} occurring in T0 such that c_{αi}^{S1} ≠ c_{αj}^{S1} holds for i ≠ j. By Theorem 1.5.12 we get a model S of cardinality card(S) = max(ℵ0, card(T′)) = max(ℵ0, card(T), κ) = κ. Now we have to boil down the model S to a structure interpreting equality standardly. Using Theorem 1.10.7 we obtain an epimorphic model S̄ of T′. Here we have card(S̄) ≤ κ, since we have taken equivalence classes. But since T′ ⊨ c_α ≠ c_β for α ≠ β, the new constant symbols lie in different equivalence classes. So we conclude card(S̄) = κ.
Since there is an infinite group and card(AxGT) = ℵ0, we have by Lemma 2.2.10 groups of any infinite cardinality.
The following theorem goes back to Leopold Löwenheim [*1878, †1957] in 1915 and Thoralf Skolem [*1887, †1963] in 1920.
Theorem 2.2.11 (Löwenheim–Skolem downwards). Let S be an L-structure and card(S) = κ ≥ ℵ0. For any infinite λ with card(L) ≤ λ ≤ κ and any S0 ⊆ S with card(S0) ≤ λ there is an elementary substructure S′ ≺ S such that S0 ⊆ S′ and card(S′) = λ.
Proof. Let L = L(C, F, P). Take S0′ with S0 ⊆ S0′ ⊆ S and card(S0′) = λ, and put
S1 = S0′ ∪ {c^S : c ∈ C}.
Then card(S1) = λ, since λ is infinite, card(S0′) = λ and card{c^S : c ∈ C} ≤ card(L) ≤ λ. The idea of the proof is to take the closure of S1 under all functions f^S for f ∈ F and to add all witnesses for existential sentences valid in S. Formally this is done by the following definitions: for M ⊆ S we put
M̂ = M ∪ {f^S(s1, …, s#f) : f ∈ F and (s1, …, s#f) ∈ M^{#f}}.
For a formula F in L with FV(F) = {x0, …, xn} and (s1, …, sn) ∈ M^n we define
S_{s⃗,F} = {s ∈ S : S ⊨ F[s, s1, …, sn]}.
If S_{s⃗,F} is not empty, let S(s⃗, F) be a fixed element of S_{s⃗,F}. Put
M̃ = M̂ ∪ {S(s⃗, F) : s⃗ ∈ M^n, F an L-formula and S_{s⃗,F} ≠ ∅}.
From card(L) ≤ λ we obtain that card(M) ≤ λ entails card(M̂) ≤ λ as well as card(M̃) ≤ λ. We define S1 as above and S_{n+1} = S̃_n. Then we have card(Sn) ≤ λ for all n ∈ ℕ, and for
S′ = ⋃_{n∈ℕ} Sn
we have card(S′) ≤ λ. Because of S1 ⊆ S′ and card(S1) = λ it is in fact card(S′) = λ. Let S′ = (S′, C^S′, F^S′, P^S′) with c^S′ = c^S for c ∈ C, f^S′(s1, …, sn) = f^S(s1, …, sn) for f ∈ F, and
P^S′ = P^S ∩ S′^{#P} for P ∈ P.
It is
(2.15) S′ ⊆ S,
(2.16) f^S′ : (S′)^{#f} → S′,
(2.17) S_{S′} ⊨ Th(S′_{S′}).
We have Sn ⊆ S by construction. Hence S′ ⊆ S. To show (2.16) assume #f = k and pick s1, …, sk ∈ S′. Then there is an n ∈ ℕ such that s1, …, sk ∈ Sn. Thus
f^S′(s1, …, sk) = f^S(s1, …, sk) ∈ Ŝn ⊆ S_{n+1} ⊆ S′.
It remains to show (2.17). By the definition of f^S′ and P^S′, (2.15) and (2.16), we already have S′ ⊆ S as structures. Hence
(2.18) S_{S′} ⊨ Diag(S′)
by Corollary 2.2.5. We show
(2.19) F ∈ Th(S′_{S′}) ⇒ S_{S′} ⊨ F
by induction on the length of F. Without loss of generality we may assume that F is translated into the Tait-language of L. Then (2.18) covers the case that F is atomic, and we need not consider negations. If F = (F0 ∧ F1) we have F0, F1 ∈ Th(S′_{S′}) and get S_{S′} ⊨ F0 ∧ F1 by the induction hypothesis. Similarly, if F = (F0 ∨ F1) we have F0 ∈ Th(S′_{S′}) or F1 ∈ Th(S′_{S′}), which entails S_{S′} ⊨ F0 ∨ F1 by the induction hypothesis.
Let F = ∃xG. Then there is an s ∈ S′ such that S′_{S′} ⊨ Gx(s̄). By the induction hypothesis this gives S_{S′} ⊨ Gx(s̄), which by S′ ⊆ S entails S_{S′} ⊨ ∃xG.
Let F = ∀xG. Assume F ∈ Th(S′_{S′}) but S_{S′} ⊭ F. Then
(2.20) S_{S′} ⊨ ∃x¬G.
Let (s̄1, …, s̄k) be the list of all constant symbols from L_{S′} ∖ L occurring in G. Then G = G0_{x1,…,xk}(s̄1, …, s̄k), and (2.20) entails S_{(s1,…,sk),¬G0} ≠ ∅ and thus also
(2.21) S ⊨ ¬G0[s1, …, sk, S(s1, …, sk, ¬G0)].
But s1, …, sk ∈ Sn for some n ∈ ℕ, which entails S(s1, …, sk, ¬G0) ∈ S̃n = S_{n+1} ⊆ S′. Let s = S(s1, …, sk, ¬G0). From F ∈ Th(S′_{S′}) we get S′_{S′} ⊨ G0_{x1,…,xk,y}(s̄1, …, s̄k, s̄), which by the induction hypothesis entails S_{S′} ⊨ G0_{x1,…,xk,y}(s̄1, …, s̄k, s̄). This, however, contradicts (2.21) and 1.3.6. This terminates the proof of (2.19), and (2.19) entails S′ ≺ S.
Here we can give another proof of the second part of Lemma 2.2.10. Let T′ be defined
II. Model Theory
as in the proof of 2.2.10. Since we have seen that every finite subset of T′ has a model, T′ has a model S by the compactness theorem for logic with identity. Because all the new constant symbols have to be interpreted differently we have card(S) ≥ κ. Since card(T′) ≤ κ we obtain by Löwenheim-Skolem downwards a model S′ of T with card(S′) = κ.
The Löwenheim-Skolem downwards theorem is another reason for some limits of first order logic. For example it is not possible to characterise the real numbers ℝ (e.g. viewed as a field) up to isomorphism. To be more explicit: there is no set M of first order sentences such that S ⊨ M iff S ≅ ℝ, since we would find some countable structure S ≺ ℝ, and so S cannot be isomorphic to ℝ. In this argumentation we have assumed that ℝ is an L-structure for a countable language L. If we make no restrictions on the cardinality of L, we are able to construct an elementary extension S ≻ ℝ with card(S) > card(ℝ) by the next theorem. Though the following theorem is named after Löwenheim and Skolem, it is due to Alfred Tarski [*1901, †1983] and Robert L. Vaught [*1926] in 1957.
Theorem 2.2.12 (Löwenheim-Skolem upwards). Let S be an L-structure and let κ = card(S) ≥ ℵ₀. For any cardinal λ ≥ max{κ, card(L)} there is an elementary extension S′ ≻ S such that card(S′) = λ.
Proof. Let T = Th(S_S). Since κ ≥ ℵ₀, S is an infinite model of T, which entails that T has a model S′ of any given cardinality λ ≥ card(T) by Lemma 2.2.10. We have S′ ⊨ Th(S_S) and thus S′ ≻ S up to isomorphism. Because we have card(T) = max{κ, card(L)} we are done.
Theorem 2.2.13 (Vaught's test). Let κ ≥ ℵ₀ and let T be a consistent κ-categorical theory without finite models such that card(T) ≤ κ. Then T is complete.
Proof. Let S, S₀ ∈ Mod_L(T). Then we obtain two models S′, S″ of T of cardinality κ by Löwenheim-Skolem upwards. Thus S ≺ S′ ≅ S″ ≻ S₀, which entails S ≡ S₀. By 2.2.9 this entails the completeness of T.
In the exercises we will learn that the theory DLO of dense linear orderings without endpoints is ℵ₀-categorical. Since DLO has no finite models, we know by Vaught's test that DLO is complete, and more: DLO is the complete theory of (ℚ, <).
E 3.1.5. Prove the following properties of the Ackermann-Péter function f: ℕ² → ℕ:
f(x, y+1) > f(x, y)
f(x+1, y) ≥ f(x, y+1)
f(x+1, y) > f(x, y)
f(x₁, y) + f(x₂, y) ≤ f(x₁ + x₂ + 4, y)
∀x₁ … ∀xₙ ∃x ∀y  Σⁿᵢ₌₁ f(xᵢ, y) ≤ f(x, y)
For all primitive recursive functions g: ℕⁿ → ℕ there is an x ∈ ℕ such that
∀x₁ … ∀xₙ  g(x₁,…,xₙ) < f(x, Σⁿᵢ₌₁ xᵢ).
Hint: Use induction on the definition of the primitive recursive functions. If g is built up by the recursor R, then choose x ∈ ℕ such that
∀x₁ … ∀xₙ  g(x₁,…,xₙ) + Σⁿᵢ₌₁ xᵢ < f(x, Σⁿᵢ₌₁ xᵢ).
Conclude that f: ℕ² → ℕ is not primitive recursive.
III. Theory of Decidability
E 3.1.6. Prove that every monotone decreasing function f: ℕ → ℕ is primitive recursive.
E 3.1.7. Let p, q ∈ ℕ with p, q > 0 and f: ℕ² → ℕ with ∀x∀y f(x+p, y) = f(x, y) = f(x, y+q). Prove that f is primitive recursive.
E 3.1.8. Let f: ℕ → ℕ be primitive recursive and a) monotone decreasing, b) strictly monotone increasing. Prove that rg(f) ⊆ ℕ is a primitive recursive relation.
E 3.1.9. We define the class of polynomial functions inductively by:
1. Cₖⁿ, Pₖⁿ, S, + are polynomial.
2. If f, g₁,…,gₙ are polynomial, then so is Sub(f, g₁,…,gₙ).
3. If f, g, h are polynomial and ∀⃗x R(g,h)(⃗x) ≤ f(⃗x), then so is R(g,h).
Prove: a) Every polynomial function is primitive recursive. b) There is a primitive recursive function which is not polynomial.
3.2 Primitive Recursive Coding
The aim of this section is to provide primitive recursive coding functions, i.e. functions cₙ: ℕⁿ → ℕ together with decoding functions pₙᵢ: ℕ → ℕ such that pₙᵢ(cₙ(z₁,…,zₙ)) = zᵢ. There are many different possibilities to obtain such functions. Here we will use the fact that every natural number is the product of uniquely determined prime powers, i.e. we define
cₙ(x₁,…,xₙ) = p(0)^{x₁+1} ⋯ p(n−1)^{xₙ+1},
where λx.p(x) is the function which enumerates the primes. Due to the uniqueness of the factorisation of a natural number into prime powers it is obvious how to get decoding functions. All we have to check is that this can be done primitive recursively. The first step in this direction is the following lemma.
Proposition 3.2.1. The function p which enumerates the primes is primitive recursive.
Proof. p satisfies the following recursion equations:
(p0) p(0) = 2
(pS) p(z+1) = μx ≤ p(z)!+1 (Pr(x) ∧ p(z) < x)
From the previous section it is obvious that this defines p primitive recursively. But of course we have to check that p satisfies (p0) and (pS). (p0) is obvious. All that needs checking is that there is a prime in the interval ]p(z), p(z)!+1]. Towards a contradiction assume that ]p(z), p(z)!+1] ∩ Pr = ∅. Let q be some prime factor of p(z)!+1. Then q ≤ p(z), which entails q | p(z)!. Hence q | 1, a contradiction.
Definition 3.2.2. Let ε(x, y) be the multiplicity of p(y) in the prime factorisation of x, i.e. ε(x, y) = μz ≤ x ¬(p(y)^{z+1} | x). Then ε: ℕ² → ℕ is primitive recursive and we have the following representation.
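The recursion equations (p0) and (pS) can be checked against a small Python sketch (not part of the original text): `p` searches for the next prime below the bound p(z)! + 1, mirroring the bounded μ-operator; `is_prime` is an auxiliary helper of ours.

```python
from math import factorial

def is_prime(x):
    # naive primality test, playing the role of the predicate Pr
    return x >= 2 and all(x % d for d in range(2, x))

def p(z):
    # (p0): p(0) = 2; (pS): p(z+1) = least prime x with p(z) < x <= p(z)! + 1
    if z == 0:
        return 2
    prev = p(z - 1)
    for x in range(prev + 1, factorial(prev) + 2):
        if is_prime(x):
            return x
```

The factorial bound is wasteful but suffices for the primitive recursiveness argument; any bound guaranteed to contain the next prime would do.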
Proposition 3.2.3. If x ≠ 0, then
x = ∏ˣᵢ₌₀ p(i)^{ε(x, i)}.
Now we are prepared for the definition of the coding and decoding functions which, according to the common literature, will be denoted by ⟨…⟩ and (·)ᵢ.
Definition 3.2.4.
a) ⟨⟩ = 0 (code of the empty sequence)
⟨z₀,…,zₙ⟩ = ∏ⁿᵢ₌₀ p(i)^{zᵢ+1}
b) (x)ᵢ = ε(x, i) ∸ 1.
Note that (x)ᵢ is defined for arbitrary natural numbers x. However, not every natural number codes a sequence. Obviously only those natural numbers code sequences which have no gaps in their decomposition into prime factors, i.e. if p(i+1) is a prime factor, then so is p(i). Thus we define
c) Seq = {z : z ≠ 1 ∧ ∀i ≤ z (p(i+1) | z → p(i) | z)}
and call Seq the set of sequence numbers.
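Definition 3.2.4 can be illustrated by a Python sketch (the names `code`, `proj` and `nth_prime` are ours, not the text's): a sequence is coded as a product of prime powers with exponents zᵢ + 1, and decoding reads off the prime multiplicities and subtracts 1.

```python
def nth_prime(i):
    # helper: the i-th prime, 0-indexed (p(0) = 2)
    primes, x = [], 2
    while len(primes) <= i:
        if all(x % q for q in primes):
            primes.append(x)
        x += 1
    return primes[i]

def code(seq):
    # <z0,...,zn> = prod_{i<=n} p(i)^(z_i + 1); the empty sequence is coded by 0
    r = 1
    for i, z in enumerate(seq):
        r *= nth_prime(i) ** (z + 1)
    return r if seq else 0

def proj(x, i):
    # (x)_i = (multiplicity of p(i) in x) - 1, with monus cutting off at 0
    if x == 0:
        return 0
    e, q = 0, nth_prime(i)
    while x % q == 0:
        e += 1
        x //= q
    return max(e - 1, 0)
```

For example, code([3, 0, 5]) = 2⁴ · 3¹ · 5⁶, from which proj recovers 3, 0 and 5.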
Finally we define the length of the sequence coded in x by
d) lh(x) = μz ≤ x ¬(p(z) | x).
Proposition 3.2.5. ⃗x ↦ ⟨⃗x⟩, (x, i) ↦ (x)ᵢ and lh are primitive recursive functions, and Seq is a primitive recursive predicate such that
1. z ∈ Seq ∧ lh(z) = 0 ⇔ z = ⟨⟩ and
2. z = ⟨z₀,…,z_{n−1}⟩ ⇔ z ∈ Seq ∧ lh(z) = n ∧ ∀i < n (zᵢ = (z)ᵢ).
Finally we define the concatenation of sequence numbers by a⌢0 = a, 0⌢b = b and
a⌢b = ⟨(a)₀,…,(a)_{lh(a)∸1}, (b)₀,…,(b)_{lh(b)∸1}⟩ for a, b ≠ 0.
Again this is defined for arbitrary a, b. It is easily verified that (x, y) ↦ x⌢y is primitive recursive. For sequence numbers, however, we obviously get
⟨z₀,…,zₙ⟩⌢⟨x₀,…,xₘ⟩ = ⟨z₀,…,zₙ, x₀,…,xₘ⟩.
It is worthwhile to observe that (x)ᵢ < x holds for all natural numbers x ≠ 0. As a first application of coding (there will be many more of them later) we show that the primitive recursive functions are closed under course-of-values recursion. Let f: ℕⁿ⁺¹ → ℕ be any function. The course-of-values function f̄: ℕⁿ⁺¹ → ℕ is defined by
(CV0) f̄(⃗z, 0) = ⟨⟩
(CVS) f̄(⃗z, x+1) = f̄(⃗z, x)⌢⟨f(⃗z, x)⟩
Then we have the following proposition.
Proposition 3.2.6. a) f̄(⃗z, n+1) = ⟨f(⃗z, 0),…,f(⃗z, n)⟩ and b) f̄ is primitive recursive iff f is.
Proof.
a) follows by induction on n. For n = 0 we have f̄(⃗z, 1) = ⟨f(⃗z, 0)⟩ by (CV0) and (CVS), and for n = x+1 we get
f̄(⃗z, n+1) = f̄(⃗z, x+1)⌢⟨f(⃗z, x+1)⟩ =_{i.h.} ⟨f(⃗z, 0),…,f(⃗z, x)⟩⌢⟨f(⃗z, x+1)⟩ = ⟨f(⃗z, 0),…,f(⃗z, n)⟩.
b) If f is primitive recursive, then f̄ is obviously also primitive recursive. If f̄ is primitive recursive, then f(⃗z, x) = (f̄(⃗z, x+1))ₓ, which shows that f is primitive recursive.
The following result goes back to Thoralf Skolem (1923) and Rózsa Péter [*1905, †1977] (1934).
Lemma 3.2.7 (Course-of-values recursion). Let g: ℕⁿ⁺² → ℕ be primitive recursive. Then the function h: ℕⁿ⁺¹ → ℕ satisfying h(⃗z, x) = g(⃗z, x, h̄(⃗z, x)) is uniquely determined and primitive recursive.
Proof. The course-of-values function of h is given by h̄(⃗z, 0) = ⟨⟩ and h̄(⃗z, x+1) = h̄(⃗z, x)⌢⟨g(⃗z, x, h̄(⃗z, x))⟩. Thus h̄ is uniquely determined and primitive recursive. But then the same holds for h.
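A minimal Python illustration of course-of-values recursion: in Lemma 3.2.7 the value at x may depend on the entire history h̄(⃗z, x). In the sketch below (our names, and with the coded history replaced by an explicit Python list) the Fibonacci function serves as the standard example of a definition that needs more than the immediately preceding value.

```python
def cov_rec(g):
    # builds h with h(x) = g(x, [h(0), ..., h(x-1)]):
    # the list passed to g plays the role of the sequence number h-bar(x)
    def h(x):
        hist = []
        for y in range(x + 1):
            hist.append(g(y, hist[:y]))
        return hist[x]
    return h

# Fibonacci via its course of values: f(n) depends on f(n-1) and f(n-2)
fib = cov_rec(lambda n, hist: 1 if n < 2 else hist[n - 1] + hist[n - 2])
```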
Corollary 3.2.8. Let g: ℕⁿ⁺ᵏ⁺¹ → ℕ, g₀: ℕⁿ → ℕ and hᵢ: ℕ → ℕ, i = 1,…,k, be primitive recursive functions such that hᵢ(x) < x holds for all x > 0 and i = 1,…,k. Then there is a uniquely determined primitive recursive function f: ℕⁿ⁺¹ → ℕ satisfying
f(⃗z, x) = g₀(⃗z) if x = 0, and f(⃗z, x) = g(⃗z, x, f(⃗z, h₁(x)),…,f(⃗z, hₖ(x))) if x ≠ 0.
Proof. We have
f(⃗z, x) = g₀(⃗z) if x = 0, and f(⃗z, x) = g(⃗z, x, (f̄(⃗z, x))_{h₁(x)},…,(f̄(⃗z, x))_{hₖ(x)}) if x ≠ 0.
Uniqueness holds obviously.
Another application of coding is the principle of simultaneous recursion. This result was mentioned by David Hilbert and Paul Bernays [*1888, †1977] in 1934.
Lemma 3.2.9 (Simultaneous recursion). Let g₁,…,gₙ and h₁,…,hₙ be primitive recursive functions. Then there are uniquely determined primitive recursive functions f₁,…,fₙ satisfying fᵢ(⃗z, 0) = gᵢ(⃗z) and fᵢ(⃗z, x+1) = hᵢ(⃗z, x, f₁(⃗z, x),…,fₙ(⃗z, x)) for i = 1,…,n.
Proof. Define a function f̃ by
f̃(⃗z, 0) = ⟨g₁(⃗z),…,gₙ(⃗z)⟩
and
f̃(⃗z, x+1) = ⟨h₁(⃗z, x, (f̃(⃗z, x))₀,…,(f̃(⃗z, x))_{n−1}),…,hₙ(⃗z, x, (f̃(⃗z, x))₀,…,(f̃(⃗z, x))_{n−1})⟩.
Then f̃ is primitive recursive and we get
fᵢ(⃗z, x) = (f̃(⃗z, x))_{i∸1} for i = 1,…,n.
Uniqueness follows straightforwardly by induction on x.
As another application of coding we will show that there are effectively computable functions which are not primitive recursive. To get this result we have to code primitive recursive functions by natural numbers.
Definition 3.2.10. The Gödel number ⌜f⌝ of a primitive recursive function f is defined inductively:
1. ⌜Cₖⁿ⌝ = ⟨0, n, k⟩
2. ⌜Pₖⁿ⌝ = ⟨1, n, k⟩
3. ⌜S⌝ = ⟨2, 1⟩
4. ⌜Sub(g, h₁,…,hₙ)⌝ = ⟨3, (⌜h₁⌝)₁, ⌜g⌝, ⌜h₁⌝,…,⌜hₙ⌝⟩
5. ⌜R(g, h)⌝ = ⟨4, (⌜g⌝)₁ + 1, ⌜g⌝, ⌜h⌝⟩.
Note that we have coded a primitive recursive function f in such a way that #f = (⌜f⌝)₁. Put
CPR = {e : e codes a primitive recursive function}.
Then we observe the following.
Proposition 3.2.11. CPR is primitive recursive.
Proof. We have
e ∈ CPR ⇔ e ∈ Seq ∧ [((e)₀ = 0 ∧ lh(e) = 3) ∨ ((e)₀ = 1 ∧ lh(e) = 3 ∧ 0 < (e)₂ ≤ (e)₁) ∨ ((e)₀ = 2 ∧ lh(e) = 2 ∧ (e)₁ = 1) ∨ ((e)₀ = 3 ∧ lh(e) ≥ 4 ∧ ∀i < lh(e)((i > 1 → (e)ᵢ ∈ CPR) ∧ (i > 2 → (e)_{i,1} = (e)₁)) ∧ (e)_{2,1} = lh(e) ∸ 3) ∨ ((e)₀ = 4 ∧ lh(e) = 4 ∧ (e)₂ ∈ CPR ∧ (e)₃ ∈ CPR ∧ (e)₁ = (e)_{2,1} + 1 ∧ (e)_{3,1} = (e)₁ + 1)].
Lemma 3.2.12.
a) There are exactly ℵ₀ (= card(ℕ)) primitive recursive functions.
b) There are functions which are not primitive recursive.
Proof.
a) All constant functions are primitive recursive. So there are at least ℵ₀ primitive recursive functions. Since distinct primitive recursive functions have distinct codes in CPR ⊆ ℕ, we have at most ℵ₀ primitive recursive functions.
b) There are 2^ℵ₀ (= card(2^ℕ)) functions from ℕ to ℕ. As ℵ₀ < 2^ℵ₀ by the usual Cantor argument, there is a function f: ℕ → ℕ which is not primitive recursive.
One should observe that ⌜f⌝ indeed codes an algorithm for the computation of f. The leading component tells us whether we have to apply recursion or substitution, or whether we have one of the basic functions. Since decoding is primitive recursive we may effectively construct a primitive recursive function [e] out of a code e ∈ CPR. Thus the function
U_PR(e, x) = [e](x) if e ∈ CPR ∧ (e)₁ = 1, and U_PR(e, x) = 0 otherwise,
is effectively computable.
Theorem 3.2.13. U_PR is not primitive recursive.
Proof. Towards a contradiction assume that U_PR is primitive recursive. Then x ↦ U_PR(x, x) + 1 is also primitive recursive. Let e₀ be its code. Then U_PR(e₀, e₀) = [e₀](e₀) = U_PR(e₀, e₀) + 1, a contradiction.
The diagonalisation argument in the proof of Theorem 3.2.13 reveals a general dilemma. Whenever we try to define effective computability in some precise way, there will be some algorithm for computing these functions, and this algorithm can be coded. Thus we will get some kind of coding, and once we have a class of functions which can be coded we may use the same diagonalisation argument as in 3.2.13 to show that there are effectively computable functions not belonging to that class. The way out of this dilemma is to regard partial functions.
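The diagonalisation of Theorem 3.2.13 can be replayed on a toy scale: for any enumeration e ↦ F[e] of total unary functions, the diagonal d(x) = F[x](x) + 1 disagrees with every F[e] at the argument e. The Python sketch below uses a finite list standing in for the (in reality infinite) coding of CPR; it is an illustration of the argument, not of U_PR itself.

```python
# toy enumeration of total unary functions, indexed by position in the list
F = [lambda x: 0, lambda x: x, lambda x: x * x]

def d(x):
    # the diagonal function: d(x) = F[x](x) + 1,
    # so d cannot equal F[e] for any index e (they differ at e)
    return F[x](x) + 1
```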
Exercises
E 3.2.1. Prove that the following predicates are primitive recursive.
a) P₁ = {n : n is the sum of two odd prime numbers}
b) P₂ = {n : n is the sum of its divisors ≠ n}
E 3.2.2. (Raphael M. Robinson, 1947) We define the iterative functions inductively by the following clauses:
1. S, Cₖⁿ, Pₖⁿ, λx.(x)ᵢ, λx₁…xₙ.⟨x₁,…,xₙ⟩ are iterative functions.
2. If f, g₁,…,gₙ are iterative, then so is Sub(f, g₁,…,gₙ).
3. If g is iterative and unary and f: ℕ² → ℕ is defined by f(0, x) = x, f(n+1, x) = g(f(n, x)), then f is iterative.
Prove: a) Every iterative function is primitive recursive. b) Every primitive recursive function is iterative.
Hint: For f = R(g, h) consider the function s with s(n, ⃗x) = ⟨f(⃗x, n), n, ⃗x⟩. Show that for g, h iterative so is s.
E 3.2.3. The Fibonacci function is defined by f(0) = f(1) = 1, f(n+2) = f(n) + f(n+1). Show that f is primitive recursive.
E 3.2.4. Define π: ℕ² → ℕ with π(n, m) = (m+n)(m+n+1)/2 + m. Prove that there are primitive recursive functions π₁, π₂: ℕ → ℕ such that π(π₁(n), π₂(n)) = n, using simultaneous recursion.
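Exercise E 3.2.4 concerns the Cantor pairing function. The following sketch (our code, and computing the inverse by a direct search along the diagonals rather than by simultaneous recursion as the exercise requests) shows that the bijection and its inverse behave as claimed.

```python
def pair(n, m):
    # Cantor pairing: pi(n, m) = (m+n)(m+n+1)/2 + m
    return (m + n) * (m + n + 1) // 2 + m

def unpair(z):
    # find the diagonal s = n + m containing z, then read off m and n
    s = 0
    while (s + 1) * (s + 2) // 2 <= z:
        s += 1
    m = z - s * (s + 1) // 2
    return s - m, m
```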
3.3 Partial Recursive Functions and the Normal Form Theorem
Definition 3.3.1.
a) A function f: M → ℕ with M ⊆ ℕⁿ is called a partial number theoretical function. We denote this by f: ℕⁿ ⇀ ℕ. We call M the domain of f and put dom(f) = M. If x ∉ dom(f), then f(x) is undefined, which is often denoted by f(x)↑. On the other hand, f(x)↓ stands for x ∈ dom(f).
b) For partial functions f and g we define f(n) ≃ g(n) if n ∉ dom(f) ∪ dom(g) ∨ (n ∈ dom(f) ∩ dom(g) ∧ f(n) = g(n)). This could be read as (f(n)↑ ∧ g(n)↑) ∨ (f(n)↓ ∧ g(n)↓ ∧ f(n) = g(n)). The easiest way to memorise this relation is to think that f(x) = ∞ if f(x)↑, where ∞ is some element not in ℕ. For those extended functions f(x) ≃ g(x) really means f(x) = g(x).
Definition 3.3.2. The unbounded search operator assigns to a partial function f: ℕⁿ⁺¹ ⇀ ℕ the partial function μ(f): ℕⁿ ⇀ ℕ defined by
μ(f)(⃗z) = min{x : f(⃗z, x) = 0 ∧ ∀z < x ∃y ≠ 0 (f(⃗z, z) ≃ y)}.
One should notice that μ(f) can be partial even if f was total, i.e. if dom(f) = ℕⁿ⁺¹. This is the case when f does not have zeros. We will also define the unrestricted search operator for predicates. Let P ⊆ ℕⁿ⁺¹ be a predicate. Then μ(P): ℕⁿ ⇀ ℕ is defined by
μ(P)(⃗z) = (μ(s̄g ∘ χ_P))(⃗z) = min{x : (⃗z, x) ∈ P}.
Definition 3.3.3. (S. C. Kleene, 1938) The class of partial recursive functions is the least class of partial functions which contains all the basic functions and is closed under substitution, primitive recursion and unbounded search. A function f: ℕⁿ → ℕ is recursive if it is partial recursive and total, i.e. dom(f) = ℕⁿ.
Proposition 3.3.4. Every recursive function is effectively computable.
Proof. The proof is by 3.1.2, 3.1.5 and the additional observation that for an effectively computable total function f for which μ(f) is total, μ(f) is also effectively computable. The algorithm for μ(f) is to compute f(⃗z, 0), f(⃗z, 1),… successively. Since by hypothesis μ(f)(⃗z)↓ we eventually find the least n ∈ ℕ such that f(⃗z, n) = 0, and μ(f)(⃗z) = n.
One should observe that a partial recursive function is in general not effectively computable. If we try to compute f(⃗x) by applying the algorithm for f given by its definition as a partial recursive function, it may well happen that the algorithm does not terminate. During our computation, however, we cannot know that and may still hope that it eventually will.
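The unbounded search operator of Definition 3.3.2 can be sketched in Python as follows (our names); the sketch makes the source of partiality visible: if f(⃗z, ·) has no zero, the while-loop never terminates.

```python
def mu(f):
    # unbounded search applied to a binary function f:
    # mu(f)(z) = least x with f(z, x) = 0; loops forever when no such x exists
    def g(z):
        x = 0
        while f(z, x) != 0:
            x += 1
        return x
    return g

# example: the least x with x*x >= z, obtained by unbounded search
isqrt_up = mu(lambda z, x: 0 if x * x >= z else 1)
```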
Only for ⃗x ∈ dom(f) will the algorithm terminate. For that reason one sometimes calls partial recursive functions positively computable: there is an algorithm which gives the correct answer in case f(⃗x)↓ but doesn't tell us anything in case f(⃗x)↑.
Proposition 3.3.5. Every primitive recursive function is recursive.
It is easy to code the partial recursive functions. All we have to do is to extend Definition 3.2.10 by the additional clause
6. ⌜μ(f)⌝ = ⟨5, (⌜f⌝)₁ ∸ 1, ⌜f⌝⟩.
Then we put CP = {e : e codes a partial recursive function} and can easily extend Proposition 3.2.11 to:
Proposition 3.3.6. CP is a primitive recursive set.
Proof. We refer to the proof of 3.2.11. All we have to do is to replace CPR by CP and to add
((e)₀ = 5 ∧ lh(e) = 3 ∧ (e)₂ ∈ CP ∧ (e)₁ = (e)_{2,1} ∸ 1)
to the disjunction.
It is again obvious that we may easily reconstruct the function f from its code ⌜f⌝. Usually we denote the function with code e by {e}. Thus we have a relation
R(e, ⃗z, x) ⇔ {e}(⃗z) ≃ x,
and our next aim is to determine the complexity of that relation. For that reason we are going to code the computations of {e}(⃗z). Informally we will do that by defining a sequence c = ⟨λ, e, i, ⟨c₁,…,cₙ⟩⟩, where λ codes the output, e codes the function, i the input, and c₁,…,cₙ the subcomputations which are necessary for the computation of {e}((i)₀,…,(i)_{lh(i)∸1}). Let Cmp denote the set of codes of computations. This we would like to manage as follows: for the basic functions we have computations of the form
⟨k, ⌜Cₖⁿ⌝, i, ⟨⟩⟩, ⟨(i)_{k−1}, ⌜Pₖⁿ⌝, i, ⟨⟩⟩, ⟨(i)₀ + 1, ⌜S⌝, i, ⟨⟩⟩.
In the case of a substitution
⟨λ₀, ⌜Sub(g, h₁,…,hₙ)⌝, i, ⟨⟨λ₀, ⌜g⌝, ⟨λ₁,…,λₙ⟩, …⟩, ⟨λ₁, ⌜h₁⌝, i, …⟩, …, ⟨λₙ, ⌜hₙ⌝, i, …⟩⟩⟩.
For the recursor a computation should look like
⟨λₙ, ⌜R(g, h)⌝, i⌢⟨n⟩, ⟨⟨λ₀, ⌜g⌝, i, …⟩, ⟨λ₁, ⌜h⌝, i⌢⟨0, λ₀⟩, …⟩, …, ⟨λₙ, ⌜h⌝, i⌢⟨n∸1, λ_{n∸1}⟩, …⟩⟩⟩,
and a computation with the unbounded search operator is given by
⟨n, ⌜μ(f)⌝, i, ⟨⟨λ₀, ⌜f⌝, i⌢⟨0⟩, …⟩, ⟨λ₁, ⌜f⌝, i⌢⟨1⟩, …⟩, …, ⟨λₙ, ⌜f⌝, i⌢⟨n⟩, …⟩⟩⟩
where λ₀,…,λ_{n−1} ≠ 0 and λₙ = 0. Then we have
c ∈ Cmp ⇔ c ∈ Seq ∧ lh(c) = 4 ∧ (c)₁ ∈ CP ∧ (c)₂ ∈ Seq ∧ lh((c)₂) = (c)_{1,1} ∧ (c)₃ ∈ Seq ∧
[ ((c)_{1,0} = 0 ∧ (c)₀ = (c)_{1,2} ∧ (c)₃ = ⟨⟩)
∨ ((c)_{1,0} = 1 ∧ (c)₀ = (c)_{2,(c)_{1,2}∸1} ∧ (c)₃ = ⟨⟩)
∨ ((c)_{1,0} = 2 ∧ (c)₀ = (c)_{2,0} + 1 ∧ (c)₃ = ⟨⟩)
∨ ((c)_{1,0} = 3 ∧ ∀i < lh((c)₃)((c)_{3,i} ∈ Cmp) ∧ (c)_{3,0,0} = (c)₀ ∧ (c)_{3,0,1} = (c)_{1,2} ∧ ∀j < lh((c)₃)(j = 0 ∨ ((c)_{3,j,1} = (c)_{1,j+2} ∧ (c)_{3,j,2} = (c)₂)) ∧ ∀j < lh((c)_{3,0,2})((c)_{3,0,2,j} = (c)_{3,j+1,0}))
∨ ((c)_{1,0} = 4 ∧ lh((c)₃) = (c)_{2,lh((c)₂)∸1} + 1 ∧ ∀i < lh((c)₃)((c)_{3,i} ∈ Cmp) ∧ (c)_{3,0,1} = (c)_{1,2} ∧ (c)_{3,0,2} = ⟨(c)_{2,0},…,(c)_{2,lh((c)₂)∸2}⟩ ∧ ∀j < lh((c)₃)(j = 0 ∨ ((c)_{3,j,1} = (c)_{1,3} ∧ (c)_{3,j,2} = ⟨(c)_{2,0},…,(c)_{2,lh((c)₂)∸2}, j∸1, (c)_{3,j∸1,0}⟩)) ∧ (c)₀ = (c)_{3,lh((c)₃)∸1,0})
∨ ((c)_{1,0} = 5 ∧ lh((c)₃) = (c)₀ + 1 ∧ ∀i < lh((c)₃)((c)_{3,i} ∈ Cmp ∧ (c)_{3,i,1} = (c)_{1,2} ∧ (c)_{3,i,2} = ⟨(c)_{2,0},…,(c)_{2,lh((c)₂)∸1}, i⟩ ∧ (i < lh((c)₃)∸1 → (c)_{3,i,0} ≠ 0)) ∧ (c)_{3,lh((c)₃)∸1,0} = 0) ].
It is again obvious by the corollary to the course-of-values recursion that Cmp is a primitive recursive set. Thus we get
{e}ⁿ(z₁,…,zₙ) ≃ x ⇔ ∃z(⟨x, e, ⟨z₁,…,zₙ⟩, z⟩ ∈ Cmp).
According to the common notation we define
Tⁿ(e, z₁,…,zₙ, z) ⇔ ⟨(z)₀, e, ⟨z₁,…,zₙ⟩, (z)₁⟩ ∈ Cmp
and call Tⁿ the Kleene T-predicate. Then we have the following theorem by Stephen Cole Kleene [*1909] in 1936.
Theorem 3.3.7 (Kleene's normal form theorem). There is a primitive recursive relation Tⁿ and a primitive recursive function U such that #Tⁿ = n + 2 and for every partial recursive function f: ℕⁿ ⇀ ℕ there is a natural number e with
f(⃗z) ≃ U(μx Tⁿ(e, ⃗z, x)).
The proof of Theorem 3.3.7 is obvious by the above construction of the Tⁿ-predicate. In e the `program' computing the function f is coded, x codes the computation of f applied to the arguments ⃗z, and U extracts the value f(⃗z) from the computation. The function U is just the decoding function x ↦ (x)₀, which is primitive recursive.
Exercises
E 3.3.1. (S. C. Kleene, 1936) The class R is defined inductively by:
1. The basic functions are in R.
2. R is closed under substitution and the recursor.
3. If f: ℕⁿ⁺¹ → ℕ is in R and ∀x₁ … ∀xₙ ∃y f(x₁,…,xₙ,y) = 0, then μ(f) is in R.
Prove: a) Every function in R is recursive. b) Every recursive function is an element of R.
3.4 Universal Functions and the Recursion Theorem
Definition 3.4.1. A partial function f: ℕⁿ⁺¹ ⇀ ℕ is called universal for a class C of n-ary partial functions (i.e. C ⊆ {g : g: ℕⁿ ⇀ ℕ}) if
∀g ∈ C ∃e ∀x₁ … ∀xₙ f(e, x₁,…,xₙ) ≃ g(x₁,…,xₙ).
Such an e ∈ ℕ is called an index of g w.r.t. f.
As a corollary to Kleene's normal form theorem one obtains the following result, stated by Emil Leon Post [*1897, †1954] in 1922, Alan Mathison Turing [*1912, †1954] in 1936 and S. C. Kleene in 1938.
Lemma 3.4.2. The function Φⁿ: ℕⁿ⁺¹ ⇀ ℕ with Φⁿ(e, ⃗x) ≃ {e}ⁿ(⃗x) is partial recursive and universal for the class of n-ary partial recursive functions.
Proof. It is
{e}ⁿ(⃗x) ≃ U(μy Tⁿ(e, ⃗x, y)),
so Φⁿ is partial recursive, and Theorem 3.3.7 shows that Φⁿ is universal.
Lemma 3.4.3 (Sₙᵐ-theorem). There is an (m+1)-ary primitive recursive function Sₙᵐ: ℕᵐ⁺¹ → ℕ such that
∀x₁ … ∀xₘ ∀y₁ … ∀yₙ {e}ᵐ⁺ⁿ(x₁,…,xₘ, y₁,…,yₙ) ≃ {Sₙᵐ(e, x₁,…,xₘ)}ⁿ(y₁,…,yₙ).
Proof. It is {Sₙᵐ(e, x₁,…,xₘ)} = Sub({e}, Cⁿ_{x₁},…,Cⁿ_{xₘ}, P₁ⁿ,…,Pₙⁿ), and we may define
Sₙᵐ(e, x₁,…,xₘ) = ⌜Sub({e}, Cⁿ_{x₁},…,Cⁿ_{xₘ}, P₁ⁿ,…,Pₙⁿ)⌝ = ⟨3, n, e, ⟨0, n, x₁⟩,…,⟨0, n, xₘ⟩, ⟨1, n, 1⟩,…,⟨1, n, n⟩⟩,
which is primitive recursive.
The Sₙᵐ-theorem tells us that it is possible to obtain a code of the function λy₁…yₙ.f(x₁,…,xₘ, y₁,…,yₙ) out of a code for f in a primitive recursive manner. It is due to S. C. Kleene (1938). Using this lemma it is possible to prove one main tool in recursion theory, also mentioned by Kleene in 1938.
Theorem 3.4.4 (Recursion theorem). For every (n+1)-ary partial recursive function f there is an index e ∈ ℕ such that
∀x₁ … ∀xₙ {e}ⁿ(x₁,…,xₙ) ≃ f(e, x₁,…,xₙ).
Proof. Since f is partial recursive, so is λy⃗x.f(Sₙ¹(y, y), ⃗x). Now let e₀ be an index of this function and define e = Sₙ¹(e₀, e₀). Then we have
{e}ⁿ(⃗x) ≃ {Sₙ¹(e₀, e₀)}ⁿ(⃗x) ≃ {e₀}ⁿ⁺¹(e₀, ⃗x) by the Sₙᵐ-theorem ≃ f(Sₙ¹(e₀, e₀), ⃗x) by the definition of e₀ ≃ f(e, ⃗x).
Here we are going to give an easy example of how to use the recursion theorem. We want to prove, using the recursion theorem, that there is a recursive function f: ℕ² → ℕ with f(x, 0) ≃ x², f(x, n+1) ≃ f(f(x, n), n). Therefore we define h: ℕ³ ⇀ ℕ by
h(e, x, n) ≃ x² if n = 0, and h(e, x, n) ≃ {e}²({e}²(x, n∸1), n∸1) if n ≠ 0.
By Theorem 3.4.2, h is partial recursive. By the recursion theorem there is an e ∈ ℕ such that
∀x ∀n h(e, x, n) ≃ {e}²(x, n).
Now one can prove by induction on n: ∀n ∀x {e}²(x, n)↓. So define f = {e}², i.e. f is partial recursive and dom(f) = ℕ², i.e. f is total. So f is recursive.
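Once the existence of the function f from the recursion-theorem example is established, its two defining equations can be transcribed directly into Python (with the self-referential index e replaced by ordinary recursion; this is an illustration of the equations, not of the index construction):

```python
def f(x, n):
    # f(x, 0) = x^2; f(x, n+1) = f(f(x, n), n)
    if n == 0:
        return x * x
    return f(f(x, n - 1), n - 1)
```

One checks by induction that f(x, n) = x^(2^(2^n)), so the function grows doubly exponentially in n.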
Exercises
E 3.4.1. Prove that there is a primitive recursive function f: ℕ² → ℕ such that
∀x {f(e₀, e₁)}¹(x) ≃ {e₀}¹({e₁}¹(x)).
E 3.4.2. Prove that the Ackermann-Péter function defined in E 3.1.5 is recursive, using the recursion theorem.
E 3.4.3. a) (R. Péter, 1935) Prove that the function U_PR of section 3.2 is recursive.
b) Prove that there is a relation which is recursive (i.e. a relation R such that χ_R is recursive) but not primitive recursive.
E 3.4.4. Prove that there is no normal form theorem of the following shape: there is a primitive recursive relation T ⊆ ℕ³ such that for all partial recursive functions f: ℕ ⇀ ℕ there is an e ∈ ℕ with
∀x f(x) ≃ μy T(e, x, y).
E 3.4.5. (S. C. Kleene, A. M. Turing, 1936) Prove that there is no recursive universal function for the class of recursive functions f with rg(f) ⊆ {0, 1}.
E 3.4.6. Prove that the partial recursive functions are closed under a) course-of-values recursion, b) simultaneous recursion.
3.5 Recursive, Semi-recursive and Recursively Enumerable Relations
Definition 3.5.1. (E. L. Post (1922), S. C. Kleene (1936)) We call a relation R ⊆ ℕⁿ
1. recursive, if χ_R is a recursive function,
2. semi-recursive, if R = dom(f) for some partial recursive function f,
3. recursively enumerable (briefly: r.e.), if R = ∅ or there is a recursive function f such that R = {(z₁,…,zₙ) : ∃x(f(x) = ⟨z₁,…,zₙ⟩)}, i.e. f enumerates ⟨R⟩ = {⟨z₁,…,zₙ⟩ : (z₁,…,zₙ) ∈ R}.
If we had not restricted ourselves to functions with values in ℕ, instead admitting functions with ranges in ℕⁿ, then we could have defined the recursively enumerable relations as the ranges of recursive functions. Because we have primitive recursive coding functions we have the following result.
Proposition 3.5.2. A relation R ⊆ ℕ is recursively enumerable iff R = ∅ or there is a recursive function f such that R = rg(f) = {x : ∃y(f(y) = x)}.
One should notice that recursive predicates are decidable. To check whether ⃗z ∈ P we just have to compute χ_P(⃗z). Semi-recursive predicates, however, are only positively decidable. To check ⃗z ∈ P = dom(f) we may apply the algorithm for f. If ⃗z ∈ dom(f) it will terminate and we get the answer `yes'. But if ⃗z ∉ dom(f) we will never get an answer. Recursively enumerable predicates are also only positively decidable. We may check membership in a recursively enumerable set R with ⟨R⟩ = rg(f) by successively computing the list f(0), f(1), f(2),… If ⃗z ∈ R, then ⟨⃗z⟩ will eventually show up in that list; if ⃗z ∉ R, however, this gives no answer, because ⟨⃗z⟩ will never show up, but at no point can we be sure that it might not show up later.
Our next aim is to study the closure properties of the recursive, semi-recursive and recursively enumerable relations.
Lemma 3.5.3. The class of recursive relations is closed under boolean operations, bounded quantification and recursive substitution.
The proof is the same as that of 3.1.12 and 3.1.14. The key is that every primitive recursive function is of course recursive, and the closure under boolean operations and bounded quantification was obtained by using primitive recursive functions as auxiliary functions.
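The "positively decidable" reading of recursive enumerability described above can be made concrete: given a function f enumerating a set, membership is tested by scanning the list f(0), f(1), … The Python sketch below (our names, with an explicit step bound standing in for "no answer so far") illustrates why a positive answer eventually arrives while a negative one never does.

```python
def semi_decide(f, z, steps):
    # scan f(0), f(1), ... for the value z;
    # True means z is in rg(f); None means "don't know yet" - z may
    # still show up beyond the bound, so this is never a definitive 'no'
    for y in range(steps):
        if f(y) == z:
            return True
    return None
```

For instance, with f enumerating the squares, 49 is confirmed quickly, while for 50 the search is silent forever.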
The closure under recursive substitution is a consequence of the closure of recursive functions under substitution, which in turn holds trivially because partial recursive functions are closed under substitution by definition and substitution preserves the totality of the involved functions.
The following theorem is a normal form theorem for recursively enumerable relations. It is due to S. C. Kleene (1936), John Barkley Rosser [*1907, †1989] (1936) and Andrzej Mostowski [*1913, †1975] (1947).
Theorem 3.5.4. A relation P ⊆ ℕⁿ is r.e. iff there is a recursive relation R ⊆ ℕⁿ⁺¹ such that P = {(z₁,…,zₙ) : ∃x (z₁,…,zₙ,x) ∈ R}.
Proof. Let P ⊆ ℕⁿ be r.e. If P = ∅, then put R = {(z₁,…,zₙ,x) : C₁ⁿ⁺¹(z₁,…,zₙ,x) = 0}. If ⟨P⟩ = rg(f), then
P = {(z₁,…,zₙ) : ∃x(f(x) = ⟨z₁,…,zₙ⟩)}
and
R = {(z₁,…,zₙ,x) : f(x) = ⟨z₁,…,zₙ⟩}
is recursive by 3.5.3. For the opposite direction let P = {(z₁,…,zₙ) : ∃x (z₁,…,zₙ,x) ∈ R} for some recursive relation R. If P ≠ ∅ we choose (a₁,…,aₙ) ∈ P and define
f(x) = ⟨(x)₀,…,(x)_{n∸1}⟩ if ((x)₀,…,(x)ₙ) ∈ R, and f(x) = ⟨a₁,…,aₙ⟩ otherwise.
Then f is recursive and ⟨P⟩ = rg(f).
Of course ∃x in 3.5.4 might be a dummy quantifier. So we have as a corollary the following fact.
Lemma 3.5.5. Every recursive relation is r.e.
Later on we will see that the converse of 3.5.5 is false. Thus the recursive relations form a proper subclass of the r.e. relations. To get the bridge to the semi-recursive relations we define
Wₑⁿ = {(z₁,…,zₙ) : ∃y Tⁿ(e, z₁,…,zₙ, y)},
i.e. Wₑⁿ = dom({e}ⁿ). Since every partial recursive function is {e} for some e ∈ CP, we see that (Wₑ)_{e∈CP} enumerates all the semi-recursive relations. Since {(z₁,…,zₙ,y) : Tⁿ(e, z₁,…,zₙ, y)} is a recursive (even primitive recursive) relation, we get by 3.5.4:
Proposition 3.5.6. Every semi-recursive relation is r.e.
On the other hand, if P is r.e., then P = {⃗z : ∃x (⃗z, x) ∈ R} for some recursive relation R. If we put
f(⃗z) = μx((⃗z, x) ∈ R),
we get P = dom(f), which shows that P is semi-recursive. Thus we have the following characterisation of the r.e. relations by S. C. Kleene (1936).
Theorem 3.5.7. The class of semi-recursive and the class of r.e. relations coincide.
After having seen that the semi-recursive and recursively enumerable relations coincide, we are going to study their closure properties.
Lemma 3.5.8. The class of r.e. relations is closed under positive boolean operations (i.e. ∧, ∨), bounded quantification, recursive substitution and unbounded existential quantification.
Proof. Let P₁ and P₂ be r.e. relations. Then Pᵢ = {⃗z : ∃y (⃗z, y) ∈ Rᵢ} for some recursive relations Rᵢ (i = 1, 2). But then we have
P₁ ∪ P₂ = {⃗z : (∃y₁ (⃗z, y₁) ∈ R₁) ∨ (∃y₂ (⃗z, y₂) ∈ R₂)}
and
P₁ ∩ P₂ = {⃗z : ∃y((⃗z, (y)₀) ∈ R₁ ∧ (⃗z, (y)₁) ∈ R₂)}.
Because of the closure of recursive relations under boolean operations and recursive substitutions we have
{(⃗z, y) : (⃗z, (y)₀) ∈ R₁ ∧ (⃗z, (y)₁) ∈ R₂}
as a recursive relation. Thus P₁ ∪ P₂ and P₁ ∩ P₂ are r.e. by 3.5.4. The closure under recursive substitution again follows from 3.5.4 and the fact that the recursive relations are closed under recursive substitution.
Let us prove the closure under bounded quantification. We postpone the case of bounded ∃-quantification because it is entailed by unbounded ∃-quantification. Thus let Q = {(⃗x, z) : ∀y ≤ z (⃗x, y) ∈ P} for some r.e. relation P. We use 3.5.4 to get Q = {(⃗x, z) : ∀y ≤ z ∃u (⃗x, y, u) ∈ R} for some recursive relation R. But then we claim
Q = {(⃗x, z) : ∃v ∀y ≤ z (⃗x, y, (v)_y) ∈ R}.
The inclusion `⊇' is obvious. To show `⊆' let (⃗x, z) ∈ Q. Then for every y ≤ z there is a u_y such that (⃗x, y, u_y) ∈ R. Put v = ⟨u₀,…,u_z⟩ to see that (⃗x, z) is a member of the set on the right-hand side.
To show closure under ∃-quantification we assume Q = {⃗z : ∃y (⃗z, y) ∈ P} for an r.e. relation P. Then by 3.5.4
Q = {⃗z : ∃u ∃y (⃗z, y, u) ∈ R} = {⃗z : ∃v (⃗z, (v)₀, (v)₁) ∈ R}
for some recursive relation R, which by the closure properties of recursive relations and 3.5.4 entails that Q is also r.e. Because
∃x ≤ z((⃗z, x) ∈ P) ⇔ ∃x(x ≤ z ∧ (⃗z, x) ∈ P),
closure under unbounded ∃-quantification together with closure under ∧ entails closure under bounded ∃-quantification.
Theorem 3.5.9. A function f: ℕⁿ ⇀ ℕ is partial recursive iff its graph G_f = {(⃗z, y) : f(⃗z) ≃ y} is r.e.
Proof. We have for partial recursive f:
(⃗z, y) ∈ G_f ⇔ f(⃗z) ≃ y ⇔ ∃u(Tⁿ(⌜f⌝, ⃗z, u) ∧ U(u) = y ∧ ∀y′ < u ¬Tⁿ(⌜f⌝, ⃗z, y′)).
The relation in parentheses is obviously recursive (even primitive recursive). Thus G_f is r.e. by 3.5.4. For the opposite direction let G_f = {(⃗z, y) : ∃u R(⃗z, y, u)} for a recursive relation R. We claim that
f(⃗z) ≃ (μu R(⃗z, (u)₀, (u)₁))₀.
If f(⃗z)↑, then we have ∀y ∀u ¬R(⃗z, y, u). Thus μu R(⃗z, (u)₀, (u)₁)↑. If f(⃗z)↓, then ∃!y ∃v R(⃗z, y, v). Take the least such v and put u = ⟨y, v⟩. Then f(⃗z) ≃ y ≃ (μu R(⃗z, (u)₀, (u)₁))₀.
Now we have the means to give a second proof to show that there are universal partial recursive functions.
Corollary 3.5.10. The function Φⁿ: ℕⁿ⁺¹ ⇀ ℕ defined by Φⁿ(e, ⃗z) ≃ {e}ⁿ(⃗z) is partial recursive.
Proof. For the graph of the function Φⁿ we have Φⁿ(e, ⃗z) ≃ y iff {e}ⁿ(⃗z) ≃ y, which means
∃u(Tⁿ(e, ⃗z, u) ∧ ∀z < u ¬Tⁿ(e, ⃗z, z) ∧ U(u) = y).
So G_{Φⁿ} is r.e. and by 3.5.9 Φⁿ is partial recursive.
We close this section by a characterisation of recursive relations which is due to E. L. Post (1943), S. C. Kleene (1943) and A. Mostowski (1947).
Theorem 3.5.11 (Post's theorem). A relation R is recursive iff both R and ¬R are r.e.
Proof. For the easy direction let R be recursive. Then R and ¬R are recursive and hence also r.e. by 3.5.5. For the opposite direction let R and ¬R both be r.e. Then we have recursive relations P₁ and P₂ such that
⃗x ∈ R ⇔ ∃u((⃗x, u) ∈ P₁) and ⃗x ∉ R ⇔ ∃v((⃗x, v) ∈ P₂).
Let
f(⃗x) ≃ μz((⃗x, z) ∈ P₁ ∨ (⃗x, z) ∈ P₂).
Then f is partial recursive and obviously also total. Hence f is recursive. We claim
⃗x ∈ R ⇔ (⃗x, f(⃗x)) ∈ P₁,
which by the closure of recursive relations under recursive substitutions entails the recursiveness of R. To prove the claim we observe
⃗x ∈ R ⇒ ∃u((⃗x, u) ∈ P₁) ∧ ∀u((⃗x, u) ∉ P₂).
Thus (⃗x, f(⃗x)) ∈ P₁ and we have the direction from left to right. Conversely, if ⃗x ∉ R, then ∀u((⃗x, u) ∉ P₁), which implies (⃗x, f(⃗x)) ∉ P₁.
Post's theorem is easy to visualise if we think in terms of decidability and positive decidability. If R is decidable, then so is ¬R, and thus both are positively decidable. On the other hand, if R and ¬R are positively decidable, we simultaneously apply the algorithms which decide ⃗x ∈ R and ⃗x ∉ R positively. Since either ⃗x ∈ R or ⃗x ∉ R, we either get an answer assuring ⃗x ∈ R or one assuring ⃗x ∉ R. This, however, decides ⃗x ∈ R.
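The visualisation of Post's theorem, running both positive procedures side by side, can be sketched in Python. Here `pos_R` and `pos_coR` are hypothetical step-bounded positive tests ("does the semi-decision procedure accept x within n steps?"), and the toy example uses the even numbers, pretending the test for x needs x steps.

```python
def decide(pos_R, pos_coR, x, fuel=10**6):
    # dovetail the two positive procedures; since x lies in R or in its
    # complement, exactly one of them eventually answers
    for n in range(fuel):
        if pos_R(x, n):
            return True
        if pos_coR(x, n):
            return False
    raise RuntimeError("fuel exhausted")

# toy step-bounded positive tests for the even numbers and their complement
even = lambda x, n: x % 2 == 0 and n >= x
odd = lambda x, n: x % 2 == 1 and n >= x
```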
Corollary 3.5.12. A (total) function f : INⁿ → IN is recursive iff its graph is recursive.

Proof. If f is recursive, then G_f is r.e. by Theorem 3.5.9. Because
(~z, y) ∉ G_f ⇔ ∃x(f(~z) = x ∧ y ≠ x) ⇔ ∃x((~z, x) ∈ G_f ∧ y ≠ x),
¬G_f is r.e., too. So by 3.5.11 G_f is recursive. The second direction follows directly by 3.5.9.

Finally we show that the classes of recursive and r.e. relations are indeed distinct. The proof is based on a diagonalisation argument and was mentioned by E. L. Post (1922), K. Gödel (1931) and S. C. Kleene (1936).
Theorem 3.5.13. There is an r.e. relation which is not recursive, namely
K = {x : ∃z T¹(x, x, z)} = {x : x ∈ W¹ₓ}.

Proof. K is r.e. by 3.5.4. Towards a contradiction assume that K is recursive. Then ¬K is recursive, too, and there is an e ∈ IN such that ¬K = W¹ₑ, and we obtain
e ∉ K ⇔ e ∈ W¹ₑ ⇔ ∃y T¹(e, e, y) ⇔ e ∈ K.
Thus K cannot be recursive.

Now we close this section by reviewing the connection between the subsets (unary relations) of IN we have studied up to now.
III. Theory of Decidability
[Figure: the class of r.e. sets and the class of complements of r.e. sets overlap; their intersection is the class of recursive sets, which in turn contains the primitive recursive sets.]

We learned that this picture is correct, e.g. by Post's theorem we know that the recursive subsets of IN are just those subsets which are both r.e. and complements of r.e. sets.
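The diagonalisation behind Theorem 3.5.13 can be seen in executable miniature. The following Python sketch works over an enumeration of total functions rather than over the indexing {e} (so no universal machine is needed): given any enumeration, the diagonal function differs from every listed function, just as K escapes every candidate W¹ₑ for its complement.

```python
def diagonal(enumeration):
    """Return a function differing from enumeration(n) at argument n."""
    return lambda n: enumeration(n)(n) + 1

# A sample enumeration of total functions: f_n(x) = n * x.
f = lambda n: (lambda x, n=n: n * x)   # default argument binds n per function
d = diagonal(f)

# d cannot occur anywhere in the enumeration: it disagrees with f_n at n.
print(all(d(n) != f(n)(n) for n in range(1000)))  # True
```

The argument is the same for any enumeration one plugs in; that is exactly why no effective list of programs can exhaust the complement of K.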
Exercises
E 3.5.1. Prove that the class of r.e. relations is not closed under unbounded universal quantification.

E 3.5.2. (S. C. Kleene, 1936) Let R ⊆ IN be an infinite set. Prove:
a) R is r.e. iff R is the range of a one-one recursive function.
b) R is recursive iff R is the range of a strictly increasing recursive function.
E 3.5.3. a) There is an n ∈ IN with W¹ₙ = {n}.
b) For any recursive function f there is an n ∈ IN with W¹_{f(n)} = W¹ₙ.

E 3.5.4. Let R ⊆ IN² be r.e. Is the function f : IN ⇀ IN defined by
f(x) ≃ μy R(x, y)
partial recursive?

E 3.5.5. (S. C. Kleene, 1936) Prove that for an r.e. relation P ⊆ INⁿ⁺¹ there is a partial recursive function f : INⁿ ⇀ IN such that
∃y P(~x, y) ⇔ f(~x) ↓ ∧ P(~x, f(~x)).
3.6 Rice's Theorem Up to now we have been mainly concerned with recognising sets to be recursive. This section is devoted to a theorem which may help us to see that many explicitly given sets are not recursive. It is due to H. Rice in 1953. The idea behind that theorem
is that it is only positively decidable whether two partial recursive functions {e₀}, {e₁} are extensionally different, i.e. whether
∃x({e₀}(x) ≄ {e₁}(x)).
But now we have to observe the following closure property of partial recursive functions.

Proposition 3.6.1 (Definition by cases). If P₁, …, Pₙ are pairwise disjoint r.e. relations and g₁, …, gₙ are partial recursive functions, then the function f defined by
f(~z) ≃ g₁(~z) if ~z ∈ P₁, …, f(~z) ≃ gₙ(~z) if ~z ∈ Pₙ, and f(~z) ↑ otherwise,
is partial recursive.

The transition function Π_P of a random access machine with programme P is given by
Π_P(k, z₀, …, zₙ) =
  (l, z₀, …, z_r + 1, …, zₙ)  if (k, INC(r), l) ∈ P,
  (l, z₀, …, z_r ∸ 1, …, zₙ)  if (k, DEC(r), l) ∈ P,
  (m, z₀, …, z_r, …, zₙ)      if (k, BEQ(r), l, m) ∈ P and z_r = 0,
  (l, z₀, …, z_r, …, zₙ)      if (k, BEQ(r), l, m) ∈ P and z_r ≠ 0,
  (k, z₀, …, zₙ)              otherwise.
We call a function f : INⁿ → INᵐ primitive recursive, partial recursive, recursive if the function ⟨f⟩, i.e. ⟨f⟩(z₁, …, zₙ) = ⟨f(z₁, …, zₙ)⟩, is.

Proposition 3.7.1. The transition function Π_P : INⁿ⁺² → INⁿ⁺² is primitive recursive.

Proof. This is obvious by the definition, since every finite set is primitive recursive. This will be made more explicit in the exercises.

We obtain the iterated transition function Πⁱ_P : IN × INⁿ⁺¹ → M_P × INⁿ⁺¹ by
Πⁱ_P(0, ~z) = (k, ~z), where k is the start mark of the programme P,
Πⁱ_P(n + 1, ~z) = Π_P(Πⁱ_P(n, ~z)).
Thus Πⁱ_P(n, ~z) is the n-fold application of the transition function to the start configuration (k, ~z).

Proposition 3.7.2. The iterated transition function is primitive recursive.

The iterated transition function simulates the computation of the computing device under a given programme P. Thus if we have the input ~z for a random access machine, Πⁱ_P(n, ~z) computes the actual mark in the programme P and the contents of the registers after n steps. This should be done until a stop mark is reached. The result of that computation is available in register number 0, which is (⟨Πⁱ_P(n, ~z)⟩)₁ if (⟨Πⁱ_P(n, ~z)⟩)₀ is the stop mark. Thus we define:

Definition 3.7.3. A function f : INⁿ ⇀ IN is random access machine (RAM) computable if there is a programme P such that
f(~z) ≃ (Πⁱ_P(μn((Πⁱ_P(n, ~z))₀ is a stop mark of P), ~z))₁.
3.7. Random Access Machines
Because we have only finitely many stop marks in a given programme P we obtain:
Lemma 3.7.4. Every RAM computable function is partial recursive. In the exercises we will prove the converse direction. So we have the following characterisation of the partial recursive functions.
Theorem 3.7.5. The partial recursive functions are just the RAM computable functions.
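To get a concrete feel for Definition 3.7.3 and Theorem 3.7.5, here is a small Python interpreter that iterates the transition function until a stop mark is reached and then returns register 0. The concrete representation (tuples tagged "INC", "DEC", "BEQ", marks as dictionary keys, a single stop mark "STOP") is our own choice, not the book's formalism.

```python
def run_ram(program, start, stop, registers):
    """Iterate the transition function of programme `program` until the
    stop mark is reached; the result is the content of register 0.

    program: dict mapping a mark k to an instruction tuple
    registers: initial register contents z_0, ..., z_n
    """
    mark, regs = start, list(registers)        # start configuration (k, z)
    while mark != stop:
        instr = program[mark]
        if instr[0] == "INC":                  # (k, INC(r), l): z_r += 1
            _, r, l = instr
            regs[r] += 1
            mark = l
        elif instr[0] == "DEC":                # (k, DEC(r), l): z_r -·- 1
            _, r, l = instr
            regs[r] = max(regs[r] - 1, 0)      # cut-off subtraction
            mark = l
        elif instr[0] == "BEQ":                # (k, BEQ(r), l, m): branch on z_r = 0
            _, r, l, m = instr
            mark = m if regs[r] == 0 else l
    return regs[0]

# Addition x + y: move register 1 into register 0 one unit at a time.
ADD = {
    0: ("BEQ", 1, 1, "STOP"),   # if register 1 is 0, stop
    1: ("DEC", 1, 2),
    2: ("INC", 0, 0),
}
print(run_ram(ADD, 0, "STOP", [3, 4]))  # 7
```

If the programme never reaches the stop mark, `run_ram` loops forever, which matches f(~z) ↑ in the definition.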
At this point we have been confronted with a universal method to handle any kind of algorithmic computability. In the exercises we will code up the instructions of any given algorithm and then simulate the algorithm by a recursive function. These facts give reason for the so-called Church's thesis, formulated by A. Church and A. M. Turing in 1936.

Church's thesis: Every effectively computable function is recursive.
Exercises E 3.7.1. Prove the following claims: a) The functions Ckn Pkn S are RAM computable. P IN and h : : : h : INm ;! P IN are RAM computable, then b) If g : INn ;! 1 n Sub(g h1 : : : hn) is. P IN and h : INn+2 ;! P IN are RAM computable, then R(g h) is. c) If g : INn ;! P IN is RAM computable, then g is. d) If g : INn+1 ;! e) Every partial recursive function is RAM computable.
E 3.7.2. a) Code the instructions of a random access machine into a set of natural numbers.
b) Determine a primitive recursive relation Prg ⊆ IN such that
Prg(e) ⇔ e codes a programme for a random access machine.
c) Determine a primitive recursive relation End ⊆ IN² such that
End(e, y) ⇔ e codes a programme P and y ∈ dom(P) with (y)₀ a stop mark.
d) Prove that the function Π : INⁿ⁺³ → INⁿ⁺², with Π(e, ~x) = Π_P(~x) if e codes the programme P, is primitive recursive.
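For E 3.7.2 a), instructions can be coded with the sequence coding ⟨z₀, …, zₙ⟩ of Section 3.5. Here is a Python sketch, assuming the prime-power convention ⟨z₀, …, zₙ⟩ = ∏ p(i)^(zᵢ+1) (the exact convention in the text may differ); the numeric tags for INC, DEC, BEQ are our own choice.

```python
def nth_prime(n):
    """p(n): the nth prime, with p(0) = 2."""
    found, candidate = -1, 1
    while found < n:
        candidate += 1
        if all(candidate % d != 0 for d in range(2, int(candidate ** 0.5) + 1)):
            found += 1
    return candidate

def code_seq(seq):
    """<z_0, ..., z_n>: product of p(i)^(z_i + 1)."""
    code = 1
    for i, z in enumerate(seq):
        code *= nth_prime(i) ** (z + 1)
    return code

def component(x, i):
    """(x)_i: multiplicity of p(i) in the factorisation of x, minus 1."""
    p, e = nth_prime(i), 0
    while x % p == 0:
        x, e = x // p, e + 1
    return e - 1

# Code the instruction (5, INC(2), 6) as <5, 0, 2, 6>, using tag 0 for INC.
instr = code_seq([5, 0, 2, 6])
print([component(instr, i) for i in range(4)])  # [5, 0, 2, 6]
```

Since every operand is recoverable by a primitive recursive function, relations such as Prg and End from parts b) and c) can then be built from this coding.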
3.8 Undecidability of First Order Logic

By the completeness theorem for first order logic we have
⊨ F iff ⊢ F.
The relation ⊢ F is positively decidable, because we have an algorithm producing all formulas F with ⊢ F: in a first step we take all axioms of ⊢ (it can be decided whether F is an axiom or not) and then use the rules of ⊢ to derive new formulas from the ones already produced. There it is decidable whether we have the premise of a given rule. The question at this point is: is ⊢ F decidable?

Proposition 3.8.1. P ⊆ INⁿ is semi-recursive iff there is a programme P with
~x ∈ P ⇔ ∃n((Πⁱ_P(n, ~x))₀ is a stop mark of P).
Proof. Let P = dom(f) with f partial recursive, and let P be the programme computing f (cf. 3.7.3 and 3.7.5). Then we have
~x ∈ P ⇔ ~x ∈ dom(f) ⇔ ∃n((Πⁱ_P(n, ~x))₀ is a stop mark of P).

Now we come to a famous result of Alonzo Church and Alan M. Turing (1936), the unsolvability of the 'Entscheidungsproblem': it shows that logical truth (in first order logic) is undecidable.

Theorem 3.8.2 (Church's theorem). The validity of formulas of first order logic is undecidable.

Proof. We will use the following strategy: take P ⊆ INᵏ recursively enumerable but not recursive. Using Proposition 3.8.1 we have a programme P with P = dom(P). Now define F_{P,~x} with
~x ∈ dom(P) ⇔ ⊨ F_{P,~x}.
If ⊨ F were decidable, then so would be ~x ∈ dom(P), contradicting the fact that P is not recursive. Now we are going to construct the formulas F_{P,~x}. Let P be a programme for a random access machine with m registers. Then define L to be the first order language with identity containing a predicate symbol R with #R = m + 2, which should be read as simulating the computation, i.e.
R(n, s, ~r) ⇔ Πⁱ_P(n, ~x) = (s, ~r),
a binary predicate symbol
Glossary

pd — predecessor function, 117
∸ — arithmetical difference, 117
λ~x.f(~x, ~y) — functions taking arguments for ~x, 118
sg — sign function, 119
s̄g — dual function to sg, 119
χ_R — characteristic function of R, 119
μx<z — bounded μ-operator, 120
p(n) — nth prime, 125
exp(x, y) — multiplicity of p(y) in the factorisation of x, 125
⟨ ⟩ — code of the empty sequence, 125
⟨z₀, …, zₙ⟩ — coded sequence, 125
(x)ᵢ — ith component of a coded sequence, 125
Seq — coded sequences, 125
lh(x) — length of a coded sequence, 126
x ⌢ y — concatenation of coded sequences, 126
f̄ — course-of-values function, 126
CPR — codes of primitive recursive functions, 128
UPR — universal function for primitive recursive functions, 129
f : INⁿ ⇀ IN — partial function, 130
f(x) ↑ — f is undefined at x, 130
f(x) ↓ — f is defined at x, 130
f(n) ≃ g(n) — partial equality, 131
μf — unbounded μ-operator, 131
μP — unbounded μ-operator, 131
CP — codes of partial recursive functions, 132
Cmp — computation predicate, 132
{e}ⁿ(~z) — partial recursive function with code e, 133
Tⁿ(e, ~z, z) — Kleene's T-predicate, 133
Φₙ — universal partial recursive function, 134
Sⁿₘ — Sⁿₘ-predicate, 135
Wⁿₑ — recursively enumerable set with index e, 138
G_f — graph of f, 140
K — diagonal set, 141
INC — increase function of a RAM, 145
DEC — decrease function of a RAM, 145
BEQ — branch function of a RAM, 145
M_P — marks of the programme P, 145
Π_P — transition function, 146
Πⁱ_P — iterated transition function, 146
F_{P,~x} — formula describing P, 148
Chapter IV

L_PA — language of Peano Arithmetic, 153
PA — Peano Arithmetic, 153
NT — number theory, 154
Chapter V

L_I — many-sorted language, 165
FV(t) — free variables of t, 166
FVᵢ(t) — free variables of sort i in t, 166
S_I — many-sorted structure, 166
t⁺ — translated term, 168
F⁺ — translated formula, 168
Ont_I — ontological axioms, 168
Ont — ontological axioms, 169
L_ω — language of ω-logic, 172
S_ω — ω-structure, 172
⊢_ω F — calculus with ω-rule, 173
⊨_ω F — validity in ω-structures, 173
L² — second order language, 175
S² — second order structure, 175
L²_PA — language of second order Peano Arithmetic, 175
N — constant symbol for the natural numbers, 175
PA² — second order Peano Arithmetic, 175
L²_w — weak second order language, 176
S²_w — weak second order structure, 176
Index

∀-axiom, 49
admissible rule, 64
ℵ-function, 32
alphabet, 7
antecedent, 54
Aristotle, 1
arity, 5, 7
∀-rule, 49, 55
∧-rule, 55
assignment, 18, 167
  boolean –, 12, 28
atomic formula, 9
axiom
  of choice, 31
  system, 102
axiomatizable, 109
back-and-forth, 108
bar induction, 59
basic
  functions, 116
  operations, 116
Bernays, P., 127
Boole, G., 2
boolean
  assignment, 12, 28
  operation, 119
boolean valid, 49
bounded quantification, 119
calculus
  Gentzen style –, 54
  Hilbert style –, 53
  Tait style –, 54, 55
Cantor, G., 108
cardinal, 32
categorical, 102
characteristic function, 119
Church's thesis, 147
Church, A., 147, 148
clause, 14
closed term, 9
compactness theorem, 84
  2nd version, 46
  for first order logic, 40
  for many-sorted logic, 170
  for propositional logic, 34
  for weak second order logic, 177
complete
  ω- –, 159
  axiom system, 102
  calculus, 48
  set of connectives, 16
  theory, 102
completeness theorem, 50
  for many-sorted logics, 171
connective, 5, 7, 10, 11, 165
  complete set of –s, 16
consistent, 23
  ω- –, 158
  sententially –, 29
  finitely – –, 29
  maximally – –, 29
constant symbol, 7
countable
  language, 31
  set, 33
course-of-values
  function, 126
  recursion, 127
Craig, W., 75
cut-rule, 58
De Morgan, A., 2
deduction
  logical –, 49
definition
  explicit –, 78
  implicit –, 78
  inductive –, 9
Descartes, R., 1
diagram, 100
  elementary –, 100
domain, 109
  of a function, 130
  of a structure, 17
∃-axiom, 49
∃-formula, 66
Ehrenfeucht, A., 112
elementary
  equivalent, 101
  class, 109
  diagram, 100
  embedding, 99
  extension, 99
  substructure, 99
embedding, 99
  elementary –, 99
end extension, 156
epimorphism, 83
equivalent
  elementary –, 101
  semantically –, 24
  sentential –, 13
  well-orderings, 189
∃-rule, 49, 55
ex falso quodlibet, 11, 46
expansion, 38
extension, 94
  by definitions, 94
  conservative –, 94
  elementary –, 99
  end –, 156
field, 109, 188
fixed point, 9
formula, 8, 166
  atomic –, 9
  empty –, 74
  existential –, 66
  interpolation –, 72
  irreducible –, 58
free variable, 8
Frege, G., 2
function, 5
  basic –, 116
  characteristic –, 119
  course-of-values –, 126
  partial –, 130
  partial recursive –, 131
  primitive recursive –, 117
  RAM computable –, 146
  recursive –, 131
  symbol, 7
  total –, 131
  transition –, 145
  truth –, 10, 11
  universal –, 134
generator, 111
Gentzen, G., 54
Gödel number, 128
Gödel, K., 2, 35, 50, 116, 141, 153, 158
grammar, 7, 9
group, 6, 17, 109, 173
  theory, 7, 23
Hamilton, W., 2
Hauptsatz, 64, 66
Henkin
  constant, 38
  degree, 38
  extension, 38
  set, 35
Henkin, L., 35, 50, 173
Herbrand
  form, 69
  language, 69
Herbrand, J., 66
Hilbert, D., 53, 127, 163
Id-axiom, 86
Id-rule, 86
incomplete
  ω- –, 159
incompleteness theorem
  first –, 160, 161
  second –, 160, 164
inconsistent
  ω- –, 159
induction
  bar –, 59
  on the definition, 9
  scheme, 153
  transfinite –, 32, 188, 196
inductive definition, 9
inference
  boolean –, 50
infinitesimal, 41
infix notation, 8
initial segment, 59, 191
  proper –, 191
instruction, 145
interpolation formula, 72
inversion, 56
isomorphic, 101
joint consistency theorem, 75
Kleene, S. C., 131, 133–138, 140–142
λ-abstraction, 118
language, 5
  countable –, 31
  first order –, 6, 7
  Herbrand –, 69
  many-sorted –, 165
  of a theory, 93
  second order –, 6
  Tait –, 54
  with identity, 82
lattice, 109
L-axiom, 49, 55
Leibniz, G. W., 1, 2, 115
lemma
  diagonalisation –, 161
  Herbrand's –, 90
  principal semantic –, 62
  principal syntactic –, 61
  Tarski's –, 107
  Zorn's –, 31, 34
length, 59
letter, 7
liar antinomy, 160
limit ordinal, 31, 195
linear ordering, 188
Löwenheim, L., 104
Löwenheim-Skolem theorem
  downwards, 104
  for weak second order logic, 177
  for many-sorted logic, 170
  upwards, 106
logic
  classical –, 11
  first order –, 7
  higher order –, 175
  intuitionistic –, 11
  many-sorted –, 165
  ω- –, 172
  S- –, 172
  second order –, 175
  third order –, 175
  weak second order –, 176
logical consequence, 46
Lullus, R., 1
Lyndon, R. C., 76
main formula, 55
Mal'cev, A. I., 35, 50
many-sorted
  language, 165
  logic, 165
mark
  identification –, 145
  start –, 145
  stop –, 145
  transition –, 145
Megarians, 1, 11
model, 40, 93
  class, 102
  theory, 2
modus ponens, 46, 49
monotone operator, 78
Mostowski, A., 137, 140
node, 59
  topmost –, 59
normal form
  disjunctive –, 14
  conjunctive –, 14
  theorem, 133
number sequence, 59
number theory, 41
object, 5
occurrence
  positive, 76
  negative, 76
ω-complete, 159
ω-consistent, 158
ω-incomplete, 159
ω-inconsistent, 159
omitting types theorem, 112
operator
  globally monotone, 78
  monotone, 78
ordering
  linear –, 188
  partial –, 34
ordinal, 31, 191
  limit –, 31, 195
  successor –, 31, 195
Orey, S., 173
∨-rule, 55
partial
  function, 130
  recursive function, 131
path, 59
Peano Arithmetic, 153
Peano, G., 2
Péter, R., 127, 136
Post, E. L., 134, 136, 140, 141
predicate, 5
  symbol, 7
prenex form, 68
primitive recursive
  function, 117
  relation, 119
proof theory, 2
proposition, 5
propositional
  atom, 27
  part, 27
provable, 93
pure
  conjunction, 14
  disjunction, 14
quantifier, 6, 7
  existential –, 6
  universal –, 6
random access machine, 145
recursion
  course-of-values –, 127
  simultaneous –, 127
  theory, 2
  transfinite –, 32, 196
recursive function, 131
redex, 58
regular structure, 176
relation, 136
  primitive recursive –, 119
  recursive –, 136
  recursively enumerable –, 136
  semi-recursive –, 136
retract, 38
Rice, H., 142
Robinson, A., 75
Robinson, R. M., 130
root, 59
Rosser, J. B., 137, 158
Russell, B., 2
satisfiable, 23
Schütte, K., 58
search operator
  bounded –, 120
  unbounded –, 131
search tree, 60
semantics, 10, 166
sentence, 10, 24
sentential
  connective, 7
  equivalent, 13
  form, 12
sequent, 54
set theory, 2
Sheffer stroke, 16
Skolem, T., 104, 116, 127
soundness theorem, 49
standard interpretation, 82
Stoics, 1, 11
structural rule, 56
structure, 17, 166, 172, 175, 176
  expanded –, 22
  regular –, 176
sub-language, 38
substitution
  primitive recursive –, 119
substructure, 99
  elementary –, 99
succedent, 54
successor ordinal, 31, 195
Syllogistic, 1
symbol
  auxiliary –, 7, 166
  constant –, 7, 165
  function –, 7, 165
  non-logical –, 7
  predicate –, 7, 165
syntax, 10
Tait calculus, 54, 55
Tait, W. W., 54
Tarski, A., 106, 164
term, 5, 8, 166
  closed –, 9
theorem
  Beth's definability –, 78
  Church's –, 148
  compactness –, 84
    2nd version, 46
    for first order logic, 40
    for many-sorted logic, 170
    for propositional logic, 34
    for weak second order logic, 177
  completeness –, 50, 64
    for many-sorted logic, 171
  Craig's interpolation –, 75
  deduction –, 46
  first incompleteness –, 160, 161
  Herbrand's –, 71, 91
  interpolation –, 72, 91
  joint consistency –, 75
  Löwenheim-Skolem –
    downwards, 104
    for weak second order logic, 177
    for many-sorted logic, 170
    upwards, 106
  Lyndon's interpolation –, 77
  normal form –, 133
  ω-completeness –, 174
  ω-soundness –, 173
  omitting types –, 112
  Post's –, 140
  recursion –, 135
  Rice's –, 143
  Rosser's –, 158
  second incompleteness –, 160, 164
  Sⁿₘ- –, 134
  soundness –, 49, 57
  well-ordering –, 32
theory, 93
total function, 131
transfinite
  induction, 32, 188, 196
  recursion, 32
transition function, 145
tree, 59
  well-founded –, 59
truth
  function, 10, 11
  table, 11
  value, 19
Turing, A. M., 134, 136, 147, 148
type, 110
universal function, 134
universe, 166
valid, 23
  boolean –, 49
value
  of a formula, 19
  of a term, 18
  truth –, 19
variable, 6, 7, 165
  bounded –, 8
  free –, 8
  propositional –, 11
Vaught's test, 106
Vaught, R. L., 106
well-ordering, 188
  theorem, 32
Whitehead, A. N., 2
witness, 35
word, 10