INTRODUCTION TO METALOGIC
WITH AN
APPENDIX
ON
TYPE-THEORETICAL EXTENSIONAL AND INTENSIONAL LOGIC
BY
IMRE RUZSA
ARON PUBLISHERS
* BUDAPEST * 1997
ISBN 963 85504 5 7
ARON PUBLISHERS, H-1447 BUDAPEST, P.O.BOX 487, HUNGARY
© Imre Ruzsa, 1997. Printed in Hungary
Copies of this book are available at Aron Publishers, H-1447 BUDAPEST, P.O.BOX 487, HUNGARY, and L. Eotvos University, Department of Symbolic Logic, H-1364 BUDAPEST, P.O.BOX 107, HUNGARY
Printed by CP STUDIO, Budapest
In memoriam my wife (1942-1996)
ACKNOWLEDGEMENTS

The material of the present monograph originates from a series of lectures held by the author at the Department of Symbolic Logic of L. Eotvos University, Budapest. The questions and the critical remarks of my students and colleagues gave me very valuable help in developing my investigations. My sincere thanks are due to all of them. Special thanks are due to PROFESSOR ISTVAN NEMETI and MS AGNES KURUCZ, who read the first version of the manuscript and made very important critical remarks. In preparing and printing this monograph, I got substantial help from my son DR. FERENC RUZSA as well as from my daughter AGNES RUZSA.
*** This work was partly supported by the Hungarian Scientific Research Foundation
(OTKA II3, 2258) and by the Hungarian Ministry of Culture and Education (MKM 384/94).
Imre Ruzsa
Budapest, June 1996.
TABLE OF CONTENTS

Chapter 1 Introduction 1
1.1. The subject matter of metalogic 1
1.2. Basic postulates on languages 2
1.3. Speaking on languages 4
1.4. Syntax and semantics 5

Chapter 2 Instruments of metalogic 7
2.1. Grammatical means 7
2.2. Variables and quantifiers 10
2.3. Logical means 15
2.4. Definitions 19
2.5. Class notation 20

Chapter 3 Language radices 25
3.1. Definition and postulates 25
3.2. The simplest alphabets 28

Chapter 4 Inductive classes 31
4.1. Inductive definitions 31
4.2. Canonical calculi 36
4.3. Some logical languages 38
4.4. Hypercalculi 42
4.5. Enumerability and decidability 47

Chapter 5 Normal algorithms 51
5.1. What is an algorithm? 51
5.2. Definition of normal algorithms 54
5.3. Deciding algorithms 58
5.4. Definite classes 61

Chapter 6 The first-order calculus (QC) 66
6.1. What is a logical calculus? 66
6.2. First-order languages 67
6.3. The calculus QC 70
6.4. Metatheorems on QC 72
6.5. Consistency. First-order theories 74

Chapter 7 The formal theory of canonical calculi (CC*) 76
7.1. Approaching intuitively 76
7.2. The canonical calculus Σ* 78
7.3. Truth assignment 81
7.4. Undecidability: Church's Theorem 83

Chapter 8 Completeness with respect to negation 85
8.1. The formal theory CC 85
8.2. Diagonalization 87
8.3. Extensions and discussions 90

Chapter 9 Consistency unprovable 93
9.1. Preparatory work 93
9.2. The proof of the unprovability of Cons 94

Chapter 10 Set theory 98
10.1. Sets and classes 98
10.2. Relations and functions 103
10.3. Ordinal, natural, and cardinal numbers 106
10.4. Applications 110

References 114
Index 116
List of symbols 122

APPENDIX (Lecture Notes): Type-Theoretical Extensional and Intensional Logic 123
Part 1: Extensional Logic 127
Part 2: Montague's Intensional Logic 148
References 182
Chapter 1
INTRODUCTION

1.1 The Subject Matter of Metalogic

Modern logic is not a single theory. It consists of a considerable (and ever growing) number of logical systems (often called - regrettably - logics). Metalogic is the science of logical systems. Its theorems are statements either on a particular logical system or about some interrelations between certain logical systems. In fact, every system of logic has its own metalogic that, among others, describes the construction of the system, investigates the structure of proofs in the system, and so on. Many theorems usually known as "laws of logic" are, in fact, metalogical statements. For example, the statement saying that modus ponens is a valid rule of inference - say, in classical first-order logic - is a metalogical theorem about a system of logic. A deeper metalogical theorem about the classical first-order logic tells us that a certain logical calculus is sound and complete with respect to the set-theoretical semantics of this system of logic.

Remark. It is assumed here that this essay is not the first encounter of the reader with logic (nor with mathematics), so that the examples above, and some similar ones later on, are intelligible. However, this does not mean that the understanding of this book is dependent on some previous knowledge of logic or mathematics.
Another very important task of metalogic is to answer the problem: How is logic possible? To give some insight into the seriousness of this question, let me refer to the well-known fact that modern logic uses mathematical means intensively, whereas modern mathematical theories are based on some system(s) of logic. Is there a way out from this - seemingly - vicious circle? This is the foundational problem of logic, and its solution is the task of the introductory part of metalogic. The greater part of this essay is devoted to this foundational problem. The device we shall apply in the course consists in dealing alternately with mathematical and logical knowledge, without drawing a sharp borderline between mathematical and logical means. No knowledge of a logical or a mathematical theory will be presupposed. The only presupposition we shall exploit will be that the reader knows the elements of the grammar of some natural language (e.g., his/her mother tongue), can read, write and count, and is able to follow so-called formal expositions. (Of course, the ability last mentioned assumes - tacitly - some skills which can be best mastered in mathematics.)

The introductory part of metalogic is similar to the discipline created by David Hilbert, called metamathematics. (See, e.g., HILBERT 1926.) The aim of metamathematics was to find a solid foundation of some mathematical systems (e.g., number theory, set theory) by using only so-called finite means. In Hilbert's view, finite mathematics is sufficient for proving the consistency of transfinite (non-finite, infinite) mathematical theories. In a sense, the foundation of a logical calculus (which is sufficient for mathematical proofs in most cases) was included in Hilbert's programme. (Perhaps this is the reason that scientists who think that modern logic is just a branch of mathematics - called mathematical logic - often say metamathematics instead of metalogic.)

Metalogic is not particularly interested in the foundation of mathematical theories. It is interested in the foundation of logical systems. In its beginning, metalogic will use very simple, elementary means which could be qualified as finite ones. However, the author does not dare to draw a borderline between the finite and the transfinite. We shall proceed starting with simple means and using them to construct more complex systems.

Every system of modern logic is based on a formal language. As a consequence, our investigation will start with the problem: How is it possible to construct a language? Some of our results may turn out to be applicable not only for formal languages but for natural languages as well.

This essay is almost self-contained. Two exceptions where most proofs will be omitted are the first-order calculus of logic (called here QC, Chapter 6) and set theory (Chapter 10). The author assumes that the detailed study of these disciplines is the task of other courses in logic.
Technical remarks. Detailed information about the structure of this book is to be found in the Table of Contents. At the end of the book, the Index and the List of Symbols help the reader to find the definitions of the notions and symbols referred to. In inner references, the abbreviations 'Ch.', 'Sect.', 'Def.', and 'Th.' are used for 'Chapter', 'Section', 'Definition', and 'Theorem', respectively. References to the literature are given, as usual, by the (last) name of the author(s) and the year of publication (e.g., 'HILBERT 1926'); the full data are to be found in the References (at the end of the book). - No technical term or symbol will be used without definition in this book. A bullet '•' indicates the end of a longer proof or definition. - A considerable part of the material in this book is borrowed from a work by the author written in Hungarian (RUZSA 1988).
1.2 Basic Postulates on Languages

On the basis of experiences gained from natural languages, we can formulate our first postulate concerning languages:

(L1) Each language is based on a finite supply of primitive objects. This supply is called the alphabet of the language.

In the case of formal languages, this postulate will serve as a normative rule.
In spoken languages, the objects of the alphabet are called phonemes, in written languages letters (or characters). The phonemes are (at least theoretically) perceivable as sound events, and the letters as visible (written or printed) paint marks on the surface of some material (e.g., paper) - this is the difference between spoken and written languages. We shall be interested only in written languages.
In using a language, we form finite strings from the members of the alphabet, allowing the repetition (repeated occurrence) of the members of the alphabet. In the case of spoken languages, the members of such a string are ordered by their temporal consecution. In the case of written languages, the order is regulated by certain conventions of writing. (The author assumes that no further details are necessary on this point.) Finite strings formed from the members of the alphabet are called expressions - or, briefly, words - of that language.

The reader may comment here that not all possible strings of letters (or of phonemes) are used in a (natural) language. Only a part of the totality of possible expressions is useful; this part is called the totality of well-formed or meaningful expressions. (But a counterexample may occur in a formal language!) Be that as it may, to define the well-formed expressions, we certainly must refer to the totality of all expressions. Thus, our notion of expressions (words) is not superfluous.
Note that one-member strings are not excluded from the totality of expressions. Hence, the alphabet of a language is always a part of the totality of words of that language. Moreover, for technical reasons, it is useful - although not indispensable - to include the empty string, called the empty word, amongst the words of a language. (We shall return to this problem later on.)

Our second postulate is again based on experiences with natural languages:

(L2) If we know the alphabet of a language, we know the totality of its words (expressions). In other words: The alphabet of a language uniquely determines the totality of its words.
To avoid philosophical and logical difficulties we need the third postulate:

(L3) The expressions of any language are ideal objects which are sensibly realizable or representable (in any number of copies) by physical objects (paint marks) or events (sound events or others).

This assumption is not a graver one than the view that natural numbers are ideal objects. And it makes intelligible the use of a language in communication. Thus, in speaking of an expression (of a language) we speak of an ideal object rather than of a perceivable object, i.e., a concrete representation of that ideal object. In other words: Statements on an expression refer to all of its possible realizations, not only to some particular representation of the expression. (Reference to a concrete copy of an expression must be indicated explicitly.)
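Postulates (L1) and (L2) admit a small computational illustration (a sketch only; the two-letter alphabet and the enumeration order are our choices, not part of the postulates): a finite alphabet determines the totality of its words, which can be enumerated shortest first, the empty word included.

```python
from itertools import count, product

def words(alphabet):
    """Enumerate all words over a finite alphabet, shortest first.

    The empty word (the string of length 0) is included, in line with
    the remark above that it is useful to admit it.
    """
    for n in count(0):                             # word length 0, 1, 2, ...
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

gen = words(["a", "b"])
first_seven = [next(gen) for _ in range(7)]
print(first_seven)   # ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```

The enumeration never terminates: the totality of words over a non-empty alphabet is infinite, although each single word is finite.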
1.3 Speaking about Languages

Speaking (or writing) about a language, we must use a language. In this case, the language we want to speak about is called the object language, and the language we use is called the language of communication or, briefly, the used language. The object language and the used language may be the same.

The used language is, in most cases, some natural language, even when the object language is a formal one. However, the used language is, in most cases, not simply the everyday language; it is supplemented with some technical terms (borrowed from the language of some scientific discipline) and perhaps with some special symbols which might be useful in the systematic treatment of the object language. (If the object language is a natural one, then the used everyday language is to be supplemented, obviously, with terms of linguistics. In the case of formal languages, the additional terms and symbols are borrowed from mathematics, as we shall see later on.)

The fragment of the used language that is necessary and sufficient for the description and the examination of an object language is usually called the metalanguage of the object language in question. In fact, this metalanguage is relativized to the used language. (Changing the language of communication, the metalanguage of an object language will change, too.) If the object language is a formal one, there might be a possibility to formulate its metalanguage as a formal language. (Theoretically, this possibility cannot be excluded even for natural object languages.) However, in such a case we need a meta-metalanguage for explaining - to make intelligible - the formalized metalanguage. This device can be iterated as often as one wishes, but in the end we cannot avoid the use of a natural language, provided one does not want to play an unintelligible game.

When speaking about an object language, it may occur that we have to formulate some statements about some words of that language.
If we want to speak about a word, we must use a name of the word. Some words (but not too many) of a language may have a proper name (e.g., epsilon is the proper name of the Greek letter ε), others can be named via descriptions (e.g., 'the definite article of English'). A universal method in a written used language (to be used in this essay as well) consists in putting the expression to be named in between simple quotation marks (inverted commas); e.g., 'Alea iacta est' is a Latin sentence which is translated into English by 'The die is cast'. (Note that in a written natural language the space between words counts as a letter.) The omission of quotation marks can lead to ambiguity, and, hence, it is a source of language humour. An example:

- What word becomes shorter by the addition of a syllable?
- The word 'short'. Surely, it becomes 'shorter' but not shorter.
1.4 Syntax and Semantics

The science dealing with symbols (or signs) and systems of symbols is called semiotics. Languages as systems of symbols belong, obviously, under the authority of semiotics. Semiotics is, in general, divided into three main parts: syntax, semantics, and pragmatics. (See, e.g., MORRIS 1938, CARNAP 1961.)

The syntax of a language is a part of the description (or investigation) of the language dealing exclusively with the words of the language, independently of their meaning and use. Its main task is to define the well-formed (meaningful) expressions of the language and to classify them. The part of linguistic investigations that deals with the meaning of words and with the interrelations between language and the outer world but is indifferent with respect to the circumstances of using the language belongs to the sphere of semantics. Finally, if the linguistic investigation is interested even in the circumstances of language use, then it belongs to the sphere of pragmatics. No rigid borderlines exist between these regions of semiotics.

In natural languages, most parts of syntactic investigations cannot be separated from the study of the communicative function of the language, and, hence, investigations in the three regions become strongly intertwined. Of course, in the systematization of the results of studies, it is possible to omit the semantical and the pragmatic aspects; this makes pure syntax possible as a relatively independent area of language investigation.
In the case of formal languages, the situation is somewhat different. A formal language is not (or, at least, rarely) used for communication. It is an artificial product aiming at the theoretical systematization of a scientific discipline (e.g., a system of logic, or a mathematical theory). Its syntax (grammar) and semantics are not discovered by empirical investigations; rather, they are created, constituted. Thus, seemingly, here we have a possibility for the rigid separation of syntax and semantics. However, if our formal language is not destined to be a l'art pour l'art game, its syntax must be suitable for some scientific purposes; at least a part of its expressions must be interpretable, translatable into a natural language. Consequently, the syntax and the semantics of a formal language created for some scientific purpose must be intertwined: it is impossible to outline, to create the former without taking into consideration the latter. After the outline of the language, of course, the description of its syntax is possible independently from its semantics. In this case, the role and function of the syntactic notions and relations will be intelligible only after the study of the semantics.
In this essay, the following strategy will be applied: Syntax and semantics (of a formal language) will be treated separately, but in the description of the syntax, we shall give preliminary hints with respect to the semantics. By this, the reader will get an intuitive picture about the function of the syntactic notions. However, our main subject matter belongs to the realm of syntax.

Formal languages are "used" as well, even if not for communicative purposes, but in some scientific discipline (e.g., in logic). Applications of a formal language can be assumed as belonging to the sphere of pragmatics - if somebody likes to think so. However, this viewpoint is not applied in the literature of logic.

After these introductory explanations, we should like to turn to our first problem: How is it possible to construct the syntax of a language? However, we shall deal first with the means used in the metalanguages. This is the subject of the following chapter.
Chapter 2
INSTRUMENTS OF METALOGIC

2.1 Grammatical Means

The basic grammatical instruments of communication are the declarative sentences. This holds true for any metalanguage - and even for our present hypermetalanguage used for the description of instruments of metalanguages. We formulate definitions, postulates, and theorems by means of declarative sentences. In what follows, we shall speak - for the sake of brevity - of sentences instead of declarative sentences. Thus, sentence is the basic grammatical category of metalanguages.

Another important grammatical category is that of individual names - in what follows, briefly, names. Names may occur in sentences, and they refer to (or denominate) individual objects - of course, in our case, grammatical objects (letters, words, expressions). In the simplest case, they are proper names introduced by convention. Compound names will be mentioned later on.

The most frequent components of sentences will be called functors. In the first approximation, functors are incomplete expressions (in contrast to sentences and names, which are complete ones insofar as their role in communication is fixed) containing one or more empty places, called argument places, which can be filled in by some complete expressions (names or sentences), whereby one gets a complete expression (name or sentence).

Remark. There exist functors of which some empty place is to be filled in by another functor. In our investigations, we shall not meet with such a functor. Hence, the above explanation on functors - although defective - will suffice for our purposes.
Functors can be classified into several types. The type of a functor can be fixed by determining (a) the category of words permitted to fill in its argument places (for each argument place), and (b) the category of the compound expression resulting from filling in all its empty places. According to the number of empty places of a functor, we speak of one-place or monadic, two-place or dyadic, three-place or triadic, ..., multi-place or polyadic functors.

A functor can be considered as an automaton whose inputs are the words filled in its empty places, and whose output is the compound expression resulting from filling in its empty places. Using this terminology, we can say that the type of a functor is determined by the categories of its possible inputs and the category of its output. A functor is said to be homogeneous if all its inputs belong to the same category, i.e., if all its empty places are to be filled in with words of a single category. We shall deal only with homogeneous functors.
In metalanguages, we shall be interested in the following three types of (homogeneous) functors.
(1) Sentence functors forming compound sentences from sentences. Their inputs and outputs are sentences.
(2) Name functors forming compound names from names. Their inputs and outputs belong to the category of names.
(3) Predicates forming sentences from names. Their inputs are names, and their outputs are sentences.

Another cross-classification of functors consists in distinguishing logical and non-logical functors. Logical functors have the same fixed meaning in all metalanguages. All the sentence functors we shall use are logical ones. We shall meet them in Sect. 2.3. Among the predicates, there is a single one that counts as a logical one: this is the dyadic predicate of identity. All the other functors we shall deal with are non-logical ones.
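The three functor types can be mirrored by functions whose input and output types play the role of the categories (a sketch; the concrete functions, and the modelling of sentences by truth values and names by numbers or strings, are our illustrative choices):

```python
def neg(a: bool) -> bool:
    """A monadic sentence functor: sentence in, sentence out."""
    return not a

def add(x: int, y: int) -> int:
    """A dyadic name functor: names (of numbers) in, a name out - cf. '+'."""
    return x + y

def is_uneven(x: int) -> bool:
    """A monadic predicate: a name in, a sentence out."""
    return x % 2 == 1

print(add(5, 3))       # 8
print(is_uneven(11))   # True
```

All three are homogeneous functors in the sense above: every argument place of each takes inputs of a single category.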
Name functors express operations on individual objects in order to get (new) objects. Well-known examples are the dyadic arithmetical operations: addition, multiplication, etc. Thus, the symbol of addition, '+', is a dyadic name functor. The expression '5 + 3' is a compound name (of a number) formed from the names '5' and '3'. Here the input places surround the functor (we can illustrate the empty places by writing '... + ...'); this is the general convention with respect to using dyadic functors (so-called infix notation).

In metalanguages, the most important dyadic name functor we shall use is called concatenation. It expresses the operation by which we form a new word from two given words, linking one of them after the other. For example, from the (English) words 'cow' and 'slip' we get by concatenation the word 'cowslip' (or even, in the reversed order, 'slipcow'). This operation will be expressed as
cow⌢slip

where '⌢' is the concatenation functor. Now, one sees that (in any language) the words consisting of more than one letter are composed from letters by (iterated) applications of concatenation. We shall deal with this functor in more detail in Sect. 3.1.
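The observation that every multi-letter word arises by iterated concatenation of one-letter words can be sketched directly (the function name is ours):

```python
from functools import reduce

def concatenate(x: str, y: str) -> str:
    """The dyadic name functor of concatenation on words."""
    return x + y

print(concatenate("cow", "slip"))   # cowslip
print(concatenate("slip", "cow"))   # slipcow

# a word of more than one letter, composed from one-letter words
# by iterated applications of concatenation:
word = reduce(concatenate, ["c", "o", "w", "s", "l", "i", "p"])
print(word)                          # cowslip
```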
Monadic predicates are used to express properties of individual objects. If the argument place of such a predicate is filled in by a name, we get a sentence stating that the object denominated by the name (if any) bears the property expressed by the predicate. An arithmetical sentence can serve as an illustration:

Eleven is an uneven number.

Here the property is expressed by the predicate '... is an uneven number'; and its argument place is filled in by the name 'eleven'.
Multi-place (or polyadic) predicates express relations between individual objects. In arithmetic, the symbol '<' of 'smaller than' is a well-known example of a dyadic predicate.

[...]

(Since "A ⇔ B" means the same as "(A ⇒ B) & (B ⇒ A)", this condition follows from (b)
and (d) above.)

Remarks. 1. On the basis of everyday experiences, the reader may doubt the assumption that sentences are either true or false. However, this doubt is not justified in metalanguages, where truth is created by fiat, i.e., some basic sentences are true as postulates or as definitions, and other sentences are inferred from these. Perhaps the reader may accept that in some limited domains, the true-false dichotomy of sentences is acceptable, especially if we are speaking about ideal objects as in mathematics or in metalogic.
2. The truth conditions (a)-(e) are more or less in accordance with the everyday use of the words expressing our functors. Rule (d) seems to be most remote from the everyday use of 'if ... then', but this is just the sentence functor very useful in forming metalanguage sentences. In most cases, the symbol '⇒' will occur between open sentences standing in the scope of (tacit or explicit) universal quantification(s). Examples of this use of 'if ... then' are the sentences occurring just in the rules (a)-(e) above.

3. The symbols introduced in this section and in the preceding one will be used sometimes in the following explanations whenever their use is motivated by the aims of exactness and/or conciseness. However, the everyday expressions of these symbols ('and', 'or', 'if ... then', 'iff', 'for all', etc.) will be used frequently.
Given the meaning of our sentence functors, it is clear that they are logical functors. They - and their symbols - are often called (sentence) connectives in the literature of logic. Let us note that in mathematical logic, the terms disjunction, implication, and equivalence are used instead of our alternation, conditional, and biconditional, respectively. These are not apt phrases, for they can suggest misleading interpretations.

If we apply more than one of our sentence functors in a compound sentence, then the order of their applications can be indicated unambiguously by using parentheses. However, some parentheses can be omitted if we take into consideration the properties of our functors. First, it follows from the truth conditions above that conjunction and alternation are
commutative and associative, and hence, we can write, e.g.,

"A & B & C" and "A ∨ B ∨ C"

(where the variables 'A', 'B', and 'C' refer to sentences) without using parentheses. Further, we can see easily that the truth conditions of

"(A & B) ⇒ C" and "A ⇒ (B ⇒ C)"

are the same. That is, if the consequent of a conditional is a conditional, then the antecedent of the latter can be transported by conjunction to the antecedent of the main conditional. This suggests the convention to omit the parentheses surrounding the conditional being the consequent of a conditional, i.e., to write

"A ⇒ B ⇒ C"

instead of

"A ⇒ (B ⇒ C)".
Of course, this convention can be applied repeatedly. Finally, we can realize that the biconditional is

(a) reflexive, in the sense that "A ⇔ A" is always true,
(b) symmetrical, in the sense that from "A ⇔ B" we can infer "B ⇔ A", and
(c) transitive, in the sense that from "A ⇔ B" and "B ⇔ C" we can infer "A ⇔ C".
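Since the sentence functors are truth-functional, the claims above - commutativity and associativity of conjunction and alternation, the exportation law behind the parenthesis-omitting convention, and the reflexivity of the biconditional - can be checked mechanically by running through all truth assignments (a sketch; the function names are ours):

```python
from itertools import product

def cond(a, b):
    """The conditional: false only when the antecedent is true, the consequent false."""
    return (not a) or b

def iff(a, b):
    """The biconditional: true iff both members have the same truth value."""
    return a == b

bools = [True, False]

# conjunction and alternation are commutative and associative:
assert all((a and b) == (b and a) for a, b in product(bools, repeat=2))
assert all(((a or b) or c) == (a or (b or c)) for a, b, c in product(bools, repeat=3))

# "(A & B) => C" and "A => (B => C)" have the same truth conditions:
assert all(cond(a and b, c) == cond(a, cond(b, c)) for a, b, c in product(bools, repeat=3))

# the biconditional is reflexive:
assert all(iff(a, a) for a in bools)

print("all checks passed")
```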
Identity bears the same remarkable properties - the fact which legitimates the use of chains of identities of form

a = b = c = ...

(where the variables 'a', 'b', 'c' refer to names), practised from the beginning of primary school. Then, chains of biconditionals of form

A ⇔ B ⇔ C ⇔ ...

will be used sometimes in the course of metalogical investigations.

As illustrations of our new symbols, let us re-formulate the more detailed logical structure of some sentences used as examples:

(2")* ∧x(x is a substantive noun ⇒ x is a noun phrase).
(4")* ∨x(x is an adverb & x ends in '-ly').
(5)* x is a word ⇒ ∨y(y is longer than x).
(7)* ∧A∧B((A is a sentence & B is a sentence) ⇒ "if A then B" is a sentence).

Inferences. In the metalogical investigations, we draw some inferences from our
starting postulates and definitions (definitions will be treated in the next section) on the basis of the meaning of our logical words - i.e., quantifiers, identity, sentence functors - be they expressed by symbols or by words of a natural language. The meaning of these words or symbols was exactly fixed by their truth conditions in the present and the preceding section. No formal system of logic will be used here as a basis legitimating our inferences - at least not before Chapter 6, which treats of a system of logic.

However, on the basis of the mentioned truth conditions, a list of important inference patterns could be compiled. In the preceding section, it was mentioned, e.g., the inference from a universal sentence to its instantiations. The properties of the sentence functors treated in the present section also contain some hidden inference patterns. Instead of giving a large list of inference patterns, we only stress two important ones:

(A) From a conditional "A ⇒ B" and from its antecedent A, we can correctly infer its consequent B. This pattern is called modus ponens [placing mood] (in formal systems, sometimes called detachment).

(B) From a conditional "A ⇒ B" and from the falsity of its consequent, i.e., from "¬B", we can correctly infer the falsity of its antecedent, i.e., "¬A". This pattern is called modus tollens [depriving mood]. It is the core of the so-called indirect proofs. In such a proof, one shows that accepting the negation of a sentence would lead to a sentence which is known to be false; that is, a conditional of form "¬A ⇒ ¬B" is accepted. From this and from the falsity of "¬B" - i.e., from the truth of B - the falsity of "¬A" - i.e., the truth of A - follows by modus tollens.
2.4 Definitions

In metalanguages, we often use definitions, mainly in order to introduce terms or symbols instead of longer expressions. Such a definition consists of three parts: (a) the new term, called the definiendum; (b) the expression stating that we are dealing with a definition; and (c) the expression that explains the meaning of the definiendum, called the definiens. In verbal definitions, (b) is indicated by words such as 'we say that', 'is said to be', 'let us understand', etc. Some examples of verbal definitions:

(1) By the square of a number let us mean the number multiplied by itself.
(2) We say that a number x is smaller than the number y iff there is a positive number z such that x + z = y.

Here the definienda are italicized, and the words indicating that the sentence is a definition are printed in bold-face letters. In (2), the definiendum is, in fact, the relation expressed by the dyadic predicate 'is smaller than', but its argument places are filled in by variables. This shows that we cannot avoid the use of free variables in definitions.
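Definitions (1) and (2) translate directly into executable form; the free variables of the contextual definition (2) become the parameters of the defining function (a sketch restricted to natural numbers; the function names are ours):

```python
def square(x: int) -> int:
    """(1) The square of a number: the number multiplied by itself."""
    return x * x

def is_smaller(x: int, y: int) -> bool:
    """(2) x is smaller than y iff there is a positive z with x + z = y."""
    return any(x + z == y for z in range(1, y + 1))

print(square(7))          # 49
print(is_smaller(3, 5))   # True
print(is_smaller(5, 5))   # False
```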
Definitions involving free variables will be called contextual definitions. In (1), the use of a variable (referring to numbers) is suppressed due to the fact that it defines a very simple monadic name functor (i.e., the operation of squaring numbers). In the canonical forms introduced below, a special symbol standing between the definiendum and the definiens will indicate that the complex expression counts as a definition. Now, the canonical form of a contextual definition of a predicate has the following shape:

(3) A ⇔df B

where A indicates the definiendum: an open sentence formed from the predicate to be defined by filling in its argument places with (different) variables, and B indicates the definiens: an open sentence involving exactly the same free variables which occur in the definiendum. Of course, the predicate to be defined must not occur in the definiens (the prohibition of circularity in definitions). Furthermore, in order that the definition be reasonable, the definiens must contain only functors known already. For example, the canonical form of (2) - if we use the sign '<' for 'is smaller than' - is:

x < y ⇔df ∨z(z is a positive number & x + z = y).
If two classes are mutually subclasses of each other, then we say that their extensions coincide. Unfortunately, the symbol expressing this coincidence, used generally in the literature, is the sign of identity ('='). Thus, according to this convention, the definition of coincidence of extensions is as follows:
(6) A = B ⇔df (A ⊆ B & B ⊆ A).
Or, by using (5):
A = B ⇔df Λx(x ∈ A ↔ x ∈ B).
Note that "A = B" may hold even in cases where A and B are defined by different properties (open sentences). To mention an example, in the arithmetic of natural numbers we find that {2} = {x: x is an even prime number}.
Identity between individual objects is a primitive relation characterized by the fact that each object is identical only with itself. (However, an object may bear several different names, and this fact makes identity useful.) On the other hand, the symbol '=' as used between class abstracts bears a meaning introduced by definition (6); thus, "A = B" only means what this definition tells us.
If the extensions of A and B do not coincide, and A is a subclass of B, then we say that A is a proper subclass of B; in symbols: "A ⊂ B". That is:
(7) A ⊂ B ⇔df (A ⊆ B & A ≠ B).
It may occur that a quite meaningful predicate defines a class with no members. For example: {x: x is a prime number & 7 < x < 11}. The simplest definition of such an empty class is: {x: x ≠ x}.
The extensions of two empty classes always coincide, that is, empty classes are "identical" with each other (in the sense of (6)). Hence, we can introduce the proper name '∅' for the empty classes:
(8) ∅ =df {x: x ≠ x}.
This definition is to be considered as the concise variant of the following contextual definition:
x ∈ ∅ ⇔df x ≠ x.
Analogously, we can introduce a proper name, say Γ, for any class abstract "{x: φ(x)}" by
Γ =df {x: φ(x)}.
By our definitions in (5) and (8), it is obvious that
∅ ⊆ A, and A ⊆ A.
Operations with classes. We introduce the dyadic class functors of union, intersection, and difference, symbolized by '∪', '∩', and '−', respectively.
(9) A ∪ B =df {x: x ∈ A ∨ x ∈ B},
(10) A ∩ B =df {x: x ∈ A & x ∈ B},
(11) A − B =df {x: x ∈ A & x ∉ B}.
Some properties of these operations:
(12) A ∪ B = B ∪ A;  A ∩ B = B ∩ A;
(13) (A ∪ B) ∪ C = A ∪ (B ∪ C);  (A ∩ B) ∩ C = A ∩ (B ∩ C);
(14) A ∪ A = A;  A ∩ A = A;
(15) A ∪ ∅ = A;  A ∩ ∅ = ∅;
(16) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C);  A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C);
(17) A − (A − B) = A ∩ B;  (A − B) ∪ B = A ∪ B;
(18) A ⊆ B ⇔ (A ∪ B = B) ⇔ (A ∩ B = A);  A − B ⊆ A.
Identities (12), (13), (14) tell us that union and intersection are commutative, associative, and idempotent (self-powering) operations. In (15), the role of the empty class in the operations is shown. Union and intersection are distributive with respect to each other, as (16) tells us. In (17), we see important connections between difference and the two other operations. Finally, in (18), some interrelations between the subclass relation and our operations are presented. The proof of these laws is left to the reader. (Use the definitions (9), (10), (11), (5), and (8).)
Finally, let us note that two classes are said to be disjoint iff they have no common members, i.e., iff their intersection is empty. Thus,
A ∩ B = ∅
expresses that A and B are disjoint classes.
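For readers who like to experiment, the laws (12) to (18) can be spot-checked mechanically. The following Python sketch is not part of the text; it uses Python's built-in sets as stand-ins for finite classes, with |, &, and − playing the roles of ∪, ∩, and class difference:

```python
# Finite "classes" as Python sets; |, &, - correspond to (9), (10), (11).
A, B, C = {1, 2, 3}, {2, 3, 4}, {3, 5}
EMPTY = set()  # the empty class

assert A | B == B | A and A & B == B & A              # (12) commutativity
assert (A | B) | C == A | (B | C)                     # (13) associativity
assert A | A == A and A & A == A                      # (14) idempotence
assert A | EMPTY == A and A & EMPTY == EMPTY          # (15) the empty class
assert A | (B & C) == (A | B) & (A | C)               # (16) distributivity
assert A - (A - B) == A & B and (A - B) | B == A | B  # (17) difference laws
assert (A <= B) == (A | B == B) == (A & B == A)       # (18) subclass tests
assert A - B <= A                                     # (18) second law
print("all sampled class laws hold")
```

Of course, checking samples proves nothing in general; the point is only that the laws are concrete enough to compute with.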
Chapter 3
LANGUAGE RADICES
3.1 Definition and Postulates
In what follows, we shall use script capital letters (𝒜, ℬ, 𝒞, etc.) to refer to an alphabet. The totality of words formed from the letters of an alphabet 𝒜 will be denoted by "𝒜°", and the members of 𝒜° will sometimes be called "𝒜-words". According to our postulate (L1) in Sect. 1.2, an alphabet is always a finite supply of objects. Hence, we can use class notation in displaying an alphabet, e.g.,
𝒜 = {α₁, ..., αₙ}
where α₁, ..., αₙ stand for the letters of 𝒜.
According to the postulate (L2) in Sect. 1.2, if we know 𝒜 then we know 𝒜° as well. By using class notation, we may write: 𝒜° = {x: x is an 𝒜-word}. Of course, this identity cannot serve as a definition of 𝒜°, for 'is an 𝒜-word' is an undefined predicate. We mentioned already in Sect. 2.1 that the introduction of the empty word may be useful for technical reasons (this will be demonstrated in Ch. 4). We shall use the symbol '0' for the empty word:
0 =df the empty word.
Obviously, this notion is language-independent. Also, in Sect. 2.1, the name functor concatenation was mentioned; its symbol is '⌢'. We can imagine that the words of an alphabet are "produced" starting from the empty word via iterated concatenation of letters to words given already. Thus, at the beginning of the description of a language, we have to deal with four basic notions: the letters of the language, the words of the language, the empty word, and the name functor concatenation. Up to this point, we have only an intuitive knowledge of these notions. We shall say that these four notions together form a language radix. Now we shall give a so-called axiomatic treatment of these notions, the significance and the importance of which will be cleared up gradually later on.
DEFINITION. By a language radix let us mean a four-component system of form
(*) ⟨𝒜, 𝒜°, 0, ⌢⟩
where 𝒜 and 𝒜° are nonempty classes, 0 is an individual object (the empty word), ⌢ is a dyadic operation between the members of 𝒜°, and the postulates (R1) to (R6) below are satisfied. The members of 𝒜 will be called letters, and the members of 𝒜° will be called words (𝒜-words).
Remark. Pointed brackets above under (*) are used to sum up the four components of a language radix into a whole. The reader need not think of the set-theoretic notion of an ordered quadruple, which is undefined up to this point.
(R1) 𝒜 ⊆ 𝒜° and 0 ∈ 𝒜°.
(R2) x, y ∈ 𝒜° ⇒ x ⌢ y ∈ 𝒜°.
(Tacit universal quantification of the variables. Similarly in the following postulates.)
(R3) Concatenation is associative: if x, y, z ∈ 𝒜° then (x ⌢ y) ⌢ z = x ⌢ (y ⌢ z).
Consequently, parentheses can (and will) be omitted in the case of iterated concatenations. In what follows, the variables x, y, z will refer to members of 𝒜°.
(R4) x ≠ 0 ⇔ VyVα(α ∈ 𝒜 & x = y ⌢ α).
This tells us that a word is nonempty iff it has a final letter.
(R5) (α, β ∈ 𝒜 & x ⌢ α = y ⌢ β) ⇒ α = β.
That is, the last letter of a word is uniquely determined.
(R6) (x ⌢ y = x ⇔ y = 0) & (x ⌢ y = y ⇔ x = 0).
That is, a concatenation is identical with one of its members iff its other member is the empty word.
Some consequences of our postulates:
By (R6), the empty word is ineffective in a concatenation:
(1) x ⌢ 0 = x = 0 ⌢ x.
From "α ∈ 𝒜 & x = y ⌢ α" it follows by (R4) that "x ≠ 0". In other words:
(2) α ∈ 𝒜 ⇒ y ⌢ α ≠ 0.
Particularly, if y = 0 then, with respect to (1), we get that
(3) α ∈ 𝒜 ⇒ α ≠ 0,
which means that 0 ∉ 𝒜, that is, the empty word does not belong to the class of letters. (Note that this was not explicitly stated in our postulates.)
Assume that x, y ∈ 𝒜°, and
(4) x ⌢ y = 0.
Now, if y ≠ 0, then, by (R4), it is of form "z ⌢ α" where α ∈ 𝒜. Thus, in this case, (4) has the following form:
x ⌢ (z ⌢ α) = 0,
or, with respect to (R3),
(x ⌢ z) ⌢ α = 0.
However, this is excluded by (2). Hence, (4) excludes the case y ≠ 0. On the other hand, "x ⌢ 0 = 0" implies "x = 0", by (1). Thus, we have that
(5) x ⌢ y = 0 ⇒ (x = 0 & y = 0).
By (1), this holds conversely, too. Hence, we can replace the "⇒" in (5) by "⇔".
In words: The result of a concatenation is empty (i.e., the empty word) iff both its members are empty.
Our postulates and their consequences are in full accordance with our intuitions with respect to the four basic notions of a language radix. Among others, they assure that the empty word can be "erased" everywhere, for it is ineffective in concatenations. What is more, these postulates determine the class 𝒜° "almost" uniquely: the objects which are 𝒜-words according to our intuitions are really (provably) in 𝒜°. However, (R1) ... (R6) do not assure that 𝒜° contains no "foreign" objects, i.e., objects which are not words composed from the letters of 𝒜. This deficiency could be supplied e.g. by the following postulate:
(R7) If B is a class such that
(i) 0 ∈ B, and
(ii) (x ∈ B & α ∈ 𝒜) ⇒ x ⌢ α ∈ B,
then 𝒜° ⊆ B.
In other words: 𝒜° must be a subclass of all classes satisfying (i) and (ii). Another usual formulation: 𝒜° is the smallest class which contains the empty word and the lengthening of every contained word by each letter of 𝒜. Now we see that (R7) involves a universal quantification over classes, and, hence, it passes the limits of our class notation introduced in Sect. 2.5. On the other hand, 𝒜° is not perfectly determined by the remaining postulates (R1) to (R6). We are compelled to refer to our postulate (L2) introduced already in Sect. 1.2. On the basis of experience with natural languages, we can accept that the class of 𝒜-words is perfectly determined by the alphabet 𝒜. However, the notion of a language radix introduced in the present section will be utilized later on (mainly in Ch. 7).
Remark. The systems called language radices above are called free groupoids with a unit element in mathematics when the postulate (R7) is accepted as well. They form a particular family of algebraic systems. Here 𝒜° is the field of the groupoid, ⌢ is the groupoid operation, 0 is the unit element, and 𝒜 is the class of free generators. Let us note that accepting (R7) makes it possible to weaken some of the other postulates; e.g., (3) above is sufficient instead of (R4), and (2) instead of (R6).
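A familiar model of the postulates (R1) to (R6) is the totality of character strings over a finite alphabet, with ordinary string concatenation (in algebraic terms, a free monoid). The following Python sketch, an illustration outside the text, checks (R3), (R6) and the consequence (5) on all short words over a two-letter alphabet (the bound on word length is an assumption that keeps the check finite):

```python
from itertools import product

ALPHABET = {"a", "b"}   # the class of letters
EMPTY = ""              # the empty word 0
# all words of length <= 3 stand in for an initial part of the word class
WORDS = ["".join(p) for n in range(4)
         for p in product(sorted(ALPHABET), repeat=n)]

def concat(x, y):
    # the concatenation functor: for strings, simply x followed by y
    return x + y

for x, y, z in product(WORDS, repeat=3):
    # (R3): concatenation is associative
    assert concat(concat(x, y), z) == concat(x, concat(y, z))
    # (R6): a concatenation equals one member iff the other member is empty
    assert (concat(x, y) == x) == (y == EMPTY)
    assert (concat(x, y) == y) == (x == EMPTY)
    # consequence (5): the result is empty iff both members are empty
    assert (concat(x, y) == EMPTY) == (x == EMPTY and y == EMPTY)
print("postulates hold on the sample")
```

That strings satisfy the postulates is, of course, exactly why they are the intended interpretation of a language radix.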
3.2 The Simplest Alphabets
3.2.1. Notational conventions. (a) It is an obvious device to omit the sign of concatenation and use simple juxtaposition instead; i.e., to write "xyz" instead of "x ⌢ y ⌢ z".
(b) Displaying an alphabet, we have to enumerate its letters between braces. The letters are objects; thus, in enumerating them, we have to name them. For example, the two-letter alphabet whose letters are '0' and '1' is to be displayed as {'0', '1'}. Could the quotation marks be omitted here? We can answer this question with YES, agreeing that inside the braces the letters stand in an autonymous sense, as names of themselves. However, we can give a deeper "theoretical" argument for doing so. Namely, if we do not want to use a language for communication, if we are only interested in the structure of that language, then we can totally avoid the use of the letters and words of the language in question, by introducing metalanguage names for the letters and words of the object language. (For example, the grammar of Greek or of Russian could be investigated without using Greek or Cyrillic letters.) Hence, we shall not use quotation marks in enumerating the letters of an alphabet, but we leave undecided the question (being unimportant) whether the characters used in the enumeration are names in the metalanguage for the letters, or else stand in an autonymous sense.
(c) In presenting an alphabet, different characters denote different letters. That is, the alphabet contains just as many letters as are enumerated in it.
The simplest alphabet is, obviously, a one-letter one:
(1) 𝒜₀ = {α}.
However, the words of this minimal alphabet are sufficient for naming the positive integers: the words α, αα, ααα, ... can represent the numbers 1, 2, 3, ... (Even the empty word can be considered as representing 0.) We shall exploit this interpretation of 𝒜₀ later on.
The two-letter alphabet
(2) 𝒜_d = {0, 1}
is used for naming natural numbers in the so-called dyadic (or binary) system. In everyday life, we use the decimal system in writing natural numbers; this system is based on a ten-letter alphabet. However, the dyadic system is exactly as good as the decimal one (although the word expressing the same number is usually longer in the dyadic system than in the decimal one). This leads to the idea: Is it possible to replace any multi-letter alphabet by a two-letter one? The existence of the Morse alphabet suggests the answer YES. In fact, this is a three-letter alphabet
{·, −, |}
where the third character serves to separate the translations of the letters of (say) the English alphabet. For example, the translation of 'apple' into a Morse-word is:
·− | ·−−· | ·−−· | ·−·· | ·
which shows that the Morse alphabet is, in fact, a three-letter one. Let our "canonical" two-letter alphabet be
(3) 𝒜₁ = {α, β}.
Furthermore, let C be an alphabet with more than two letters, e.g.,
C = {γ₀, γ₁, ..., γₙ}.
We define a universal translation method from C into 𝒜₁. Let the translation of the letter γᵢ be the 𝒜₁-word beginning with α and continued by i copies of β (for 0 ≤ i ≤ n). This rule can be displayed in the following table:
γ₀ → α
γ₁ → αβ
γ₂ → αββ
. . .
γₙ → αββ...β (n copies of β)
Then, the translation of a C-word is defined, obviously, as the concatenation of the translations of its letters. Detailed, in a hair-splitting manner: The translation of 0 is 0; and if the translation of a C-word c is c′, and the translation of γᵢ is gᵢ, then the translation of "c ⌢ γᵢ" is "c′ ⌢ gᵢ". •
We avoided here the use of a separating symbol by the rule that the translation of each letter of C begins with α. Translations of C-words among the 𝒜₁-words can be uniquely recognized: divide the given 𝒜₁-word into parts by putting a separating sign, e.g., a vertical stroke, before each occurrence of the letter α. Now if it is the translation of some C-word, then each part must be the translation of a letter of C (it is easy to check whether this holds or not). Re-translating the C-letters, we get the C-word whose translation the given 𝒜₁-word was. Summing up:
3.2.2. THEOREM. Any language based on a finite alphabet with more than two letters can be equivalently replaced with a language based on the two-letter alphabet 𝒜₁.
We shall exploit this theorem in Sect. 4.4. Note that the table above can be continued beyond any value of n. Thus, our theorem holds even for a denumerably infinite alphabet, an alphabet that has just as many letters as there are natural numbers.
A language radix itself is not a "full-fledged" language, although it is the radix of the "full tree" of a language. After fixing the radix of a language, the next task is to define the well-formed ("meaningful") expressions of the language. In most cases, these are divided into classes called categories of the language. The problem of defining such categories will be the main topic of the next chapter.
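The translation method behind Theorem 3.2.2 is easy to mechanize. The sketch below is illustrative only (the sample letter names are arbitrary, and 'a', 'b' stand in for α, β): each letter γᵢ is encoded as 'a' followed by i copies of 'b', the codes are concatenated, and decoding splits the word before each 'a', exactly as described above:

```python
C = ["x", "y", "z", "w"]   # a sample multi-letter alphabet: gamma_0 .. gamma_3

def encode(word):
    # letter gamma_i  ->  "a" followed by i copies of "b"
    return "".join("a" + "b" * C.index(ch) for ch in word)

def decode(code):
    # every letter code begins with "a", so cut the word before each "a"
    parts = code.split("a")[1:]        # drop the empty piece before the first "a"
    return "".join(C[len(p)] for p in parts)

w = "zxwy"
e = encode(w)              # "abb" + "a" + "abbb" + "ab"
assert e == "abbaabbbab"
assert decode(e) == w
print(w, "->", e, "->", decode(e))
```

The decisive design point, as in the text, is that no separator letter is needed: the letter 'a' itself marks the start of each code, which is what makes decoding unambiguous.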
Chapter 4
INDUCTIVE CLASSES
4.1 Inductive Definitions
In order to solve the problem outlined at the end of the preceding chapter, i.e., the definition of the categories of a language, we shall introduce a new type of definition called inductive definition.
4.1.1. Explanation. Let us assume a fixed alphabet 𝒜 and, hence, a language radix ⟨𝒜, 𝒜°, 0, ⌢⟩. Assume, further, that we want to introduce a subclass F of 𝒜° (intended as a category of the language based on 𝒜) by pure syntactic means. The general method realizing this plan is applying an inductive definition that consists of three parts:
(a) The base of the induction: we give a base class B (of course, B ⊆ 𝒜°) by an explicit definition, stating that F must include B (B ⊆ F).
(b) Inductive rules: we present some rules of form
(1) (a₁ ∈ F & ... & aₖ ∈ F) ⇒ b ∈ F
where a₁, ..., aₖ and b are words formed from the letters of 𝒜 and, possibly, from variables referring to 𝒜-words; that is, the shape of these words is
(2) c₀x₁c₁x₂c₂ ... xₙcₙ
where c₀, c₁, ..., cₙ ∈ 𝒜° (any of these may be 0), and x₁, ..., xₙ are variables referring to 𝒜-words.
(c) Closure: we stipulate that the class F must contain just the words prescribed to be in F by (a) and (b). •
Except for some trivial cases, the stipulations (a), (b), (c) cannot be re-written in the form of an explicit or contextual definition (cf. Sect. 2.4); at least we have no means to do so up to the present time. Thus, inductive definition is, in fact, a new type of definition.
4.1.2. Supplements to the explanation of inductive definition.
Ad (a). The base B can be defined explicitly by enumerating its members, e.g.,
B = {b₁, ..., bₖ}
where b₁, ..., bₖ are fixed 𝒜-words (in this case, B is finite), or by using variables as under (2), e.g.,
B = {x: x ∈ 𝒜° & Vy(x = "ayb" ∨ x = "ycy")},
where a, b, c are fixed 𝒜-words (in this case, B is infinite). Finally, B can be a class defined earlier. Note that if B is empty then the class F defined by the induction is empty, too.
Ad (b). The inductive rules involving free variables are to be understood as universal ones, i.e., with suppressed universal quantification of their free variables (in accordance with our convention on metalanguages). Two examples of inductive rules:
(3) x, y ∈ F ⇒ "axbyc" ∈ F,
(4) (x ∈ F & "axbyc" ∈ F) ⇒ y ∈ F.
In both rules, a, b, c are fixed 𝒜-words (some of them may be 0).
Ad (c). In fact, the closure tells us that F must be the smallest subclass of 𝒜° satisfying the conditions in (a) and (b). Thus, the closure contains a suppressed universal quantification over the subclasses of 𝒜°. We met a similar situation in the postulate (R7) (in Sect. 3.1), with the considerable difference that in the present case the quantification is limited to the subclasses of 𝒜° (not all classes in the whole world). However, in the sense we use class notation, the notion of the totality of all subclasses of 𝒜° is undefined; thus, we have no logical justification of the closure. In what follows, we shall omit the formulation of the closure whenever we indicate in advance that we want to give an inductive definition. Clearly, in such a case, it is sufficient to present the base and the inductive rules. •
We saw that inductive definitions are not eliminable by our present means. (The kernel of the problem lies in the closure.) However, our intuition suggests that a subclass of 𝒜° is clearly determined by an inductive definition. And, what is perhaps more important, we are unable to determine categories of languages without using this tool. As a consequence, we accept inductive definitions as a new means in metalogical investigations.
Remark. In set theory, inductive definitions can be reduced to explicit definitions. However, the introduction of the language of set theory is impossible without inductive definitions. (More on this topic, see Ch. 10, Sect. 4.1.)
Our next goal is to find some new devices for presenting inductive definitions that will pave the way for some generalizations as well. We approach this goal via some examples.
4.1.3. The simplest example of an inductive definition is as follows. Given an alphabet 𝒜, let us define the subclass A* of 𝒜° by induction as follows:
Base: 0 ∈ A*.
Inductive rule: (x ∈ A* & α ∈ 𝒜) ⇒ "xα" ∈ A*.
This rule comprises as many rules as there are letters in 𝒜. Thus, if 𝒜 = {α₀, α₁, ..., αₙ}, then we can enumerate them:
x ∈ A* ⇒ "xα₀" ∈ A*.
x ∈ A* ⇒ "xα₁" ∈ A*.
. . .
x ∈ A* ⇒ "xαₙ" ∈ A*.
The reader sees that A* = 𝒜°. A "theoretical" consequence: 𝒜° is an inductively defined subclass of itself, for any alphabet 𝒜. We can simplify the notation by omitting the dull occurrences of "∈ A*" and the quasi-quotation signs. Furthermore, let us use '→' instead of '⇒'. We get a table:
(5)
0
x → xα₀
x → xα₁
. . .
x → xαₙ
Any inductive definition can be presented by such a table. The first line represents the base, and the other lines represent rules. Each rule has an input on the left side of the arrow, and an output on the right side of the arrow. Even the first line can be considered as a rule without any input. Note that the name of the class to be defined does not occur here; you can give it a name later on. The rules mentioned as examples under (3) and (4) contain two inputs each. Remembering that "(A & B) ⇒ C" is the same as "A ⇒ (B ⇒ C)" (cf. Sect. 2.3), we can re-formulate them as follows:
(3′) x → y → axbyc,
(4′) x → axbyc → y.
Thus, a rule may involve more than one input.
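The closure stipulation, the smallest class containing the base and closed under the rules, can be computed for finite fragments by iterating the rules until nothing new appears. The following Python sketch is illustrative; the length bound is an assumption that keeps the computed fragment finite:

```python
def inductive_class(base, rules, max_len=6):
    """Smallest class containing `base` and closed under `rules`,
    restricted to words of length <= max_len so the iteration terminates.
    Each rule maps a word to a set of immediate outputs."""
    F = set(base)
    changed = True
    while changed:
        changed = False
        for w in list(F):
            for rule in rules:
                for out in rule(w):
                    if len(out) <= max_len and out not in F:
                        F.add(out)
                        changed = True
    return F

# Example: the table-(5)-style definition of all words over {'a', 'b'}:
rules = [lambda x: {x + "a", x + "b"}]
words = inductive_class({""}, rules, max_len=2)
assert words == {"", "a", "b", "aa", "ab", "ba", "bb"}
print(sorted(words))
```

The loop mirrors the closure clause literally: it stops exactly when no rule produces a word not already in F.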
4.1.4. Our next example will be more complicated. Let us consider the dyadic alphabet 𝒜_d = {0, 1} introduced in Sect. 3.1. We want to define the class D of dyadic numerals representing natural numbers divisible by three. (We shall call numerals the words expressing numbers in a certain alphabet.) These numerals are 𝒜_d-words such as
(6) 0, 11, 110, 1001, 1100, 1111, 10010, ...
(in decimal notation: 0, 3, 6, 9, 12, 15, 18, ...). Some initial members of this (infinite) totality D can be included in the base, and the other ones are to be introduced via inductive rules. Our intuitive key to the inductive rules is: by adding three to any member of the class D, we get another member of D. The problem is to express "adding three" in the dyadic notation. Let us note that each member greater than three has one of the following forms in the dyadic notation:
x00, x01, x10, x11
where x is any dyadic numeral other than 0. It is obvious that by adding three to "x00" we get "x11". Thus, we can formulate a rule:
(7) x00 → x11.
Assuming that the input is "good" (i.e., represents a number divisible by three), the output is a "good" one as well. By adding three to "x01" we get "y00" where y is the numeral following x (i.e., x plus one). Similarly, "x10" plus three gives "y01", and "x11" plus three gives "y10" where, again, y is the numeral following x. In order to express these facts, let us introduce the notation "xFy" for "x is followed by y". Then we can formulate the missing three rules as follows:
(8) xFy → x01 → y00,
(9) xFy → x10 → y01,
(10) xFy → x11 → y10.
xOFxl
(12)
xFy~
xlFyO.
The first line serves as the base: it says that any word ending with 0 is followed by the word we get by replacing its final letter by 1. The second line is an inductive rule saying that "xl" is followed by "yO" provided x is followed by y. - Now we see that even a
relation (a dyadic predicate) can be defined via induction. Putting the empty word for x in (11) we get 'OFl', i.e., that 0 is followed by 1. Putting 0 for x and 1 for y in (12), we get 'OFI ~ OIFIO'. Knowing that the input holds, we get 'OlFIO' . However '01' is not accepted as a well-formed dyadic number, thus, we cannot accept this result as saying that 1 is followed by 10. We must add an extra stipulation: (13)
IFI0.
34
The continuation is correct: we get 'I OFll' from (11) by putting 1 for x; then we get from (12)
IFI0
-7
IIFIOO,
which, using (13), gives 'llFlOO', and so on. Now, the complete inductive definition of our class D can be compiled by taking the first three members of (6) as input-free rules, and collecting the rules from (7) to (13). We get the following nice table: (14)
o 11
110
xOFxl IF10
xFy -7 xlFyO XOO -7 xII xFy -7 xOl -7 yOO xFy -7 xlO -7 yOl xFy -7 xlI -7 yIO We see that in some cases, we shall use three sorts of letters in an inductive definition: (a) letters of the starting alphabet Yl (b) letters as variables referring to Yl-words - of J
course, these letters must not occur in Yl, (c) subsidiary letters (like F in our example above) representing (monadic, dyadic, etc.) predicates necessary in the definition again, these letters must differ from those in (a) and (b). Furthermore, the character' -7' occurring in rules must be foreign from all the mentioned three supply of letters.
Conventions for using subsidiary letters. In our example, the subsidiary letter 'F' was surrounded by its arguments. If a subsidiary letter, say 'G' represents a monadic predicate, then its argument will follow this letter, as "Gx". If the letter 'H' stands for a triadic predicate, then its arguments, x, y, z will be arranged as follows:
xHyHz. In the case of a tetradic predicate letter 'K', the arrangement of the arguments x, y, z, u is a similar one:
xKyKzKu. The convention can be extended for more than four-place predicate letters, but, fortunately, in our praxis we can stop here.
3S
In what follows, tables such as (5) and (14) representing inductive definitions will be called canonical calculi. In fact, a canonical calculus is a finite supply of rules (permitting some rules without any input). But the more detailed explanations will follow in the next section where the connections with the inductive definitions will be cleared up exactly.
4.2 Canonical Calculi 4.2.1. DEFINITION. Let C be an alphabet and
I
~'
a character not occurring in C.
We define inductively the notion ofa c-rule by the following two stipulations: (i) Iff eC~ f is a C-rule. (ii) Ifr is a c-rule, and f eC O then
(Remark. Note that by (i),
(0
uf~
r'' is a c-rule.
is a c-rule.)
4.2.2. DEFINITION. Let j{ and j{ / be alphabets such that Then, a finite class K of
j{ '.) No other symbol is needed. Thus, the alphabet we need is as follows: .stPL
={(,), 1t, t , -, o}.
(The subscript 'PL' refers to propositionallogic.) In the calculus defining the class of formulas, we shall use the subsidiary letters'!' and 'F' ('1' for index, and 'F' for for.
mula), and the letters 'u' and 'v' as calculus variables. Now our calculus called KpL consists of the followingrules: K pL :
1.
2. 3. 4. 5. 5*.
10 Iu ~ Iut Iu ~ F1tu Fu ~ F-u Fu ~ Fv ~ F(u::) v) Fu~ u
(The numbering of rules does not belong to the calculus. We use the ordinals only for helping to refer to the singularrules.) Comments. Rule 1 tells that the empty word is an index. Let us agree that we omit 0 in any rule except when it stands alone, i.e., if '0' is a rule. Thus, rule 1 can be written simply as 'I'. - Rules 1 and 2 together define the class of indices; one sees that indices are just the {t}-words. - Rule 3 defines the atomic formulas, rules 4 and 5 define compound formulas. Finally, rule 5* serves to release the derived words from the subsidiary letter 'F' . Let us call such rules as releasing rules. Now we can define the class of formulas of classical propositional logic - in symbols: 'FormpL' - as follows:
Let us note that FormpL can be defined by a calculus immediately over .stPL , i.e., without any subsidiaryletters.Namely:
39
11. 12. 13. 14. 15.
Tt Ttt ut~un u~-u
u ~ v ~ (u:::> v)
By 11 and 12 we can get the atomicformulas 'rc' and 'm'. By 13, a word terminating in t can be lengthened by an t . Thus, rules 11, 12, and 13 together are sufficient to producing atomic formulas. Rules 14 and 15 need no comment. However, the preceding calculus seems to be more in accordwith our intuitions concerning the gradual explanations of "what are the formulas". Releasing of subsidiary letters seems to be unimportant, due to the simplicity of the studiedobject language. We can define a subclass of FormpL called the class of logical truths of propositional logic. The formulas of this class have the property that they are always true independently of the fact whetherthe atomicformulas occurring in them are true or false, assuming that negation (-) and conditional (:::» have the same meaning as '-' and '~' in the metalanguages (cf. Sect. 2.3). For this definition, we introduce the calculus KLPL as an enlargementof Kp L above. Our basic alphabetremains J'l.PL ; we use a new subsidiary letter 'L' and a new variable w. Omit rule 5* from Kp L and add the following new rules: ~
Fv -7 L(u :::> (v:::> u))
6.
Fu
7. 8.
Fu -7 Fv ~ Fw ~ L«u :::> (v o w)):::> «u:::> v):» (u:::> w))) Fu ~ Fv ~ L«-u:::> -v):::> (v:::> u))
9. 10.
Lu ~ L(u :::> v) ~ Lv Lu -7 U
The last rule releasesthe subsidiary letter 'L'. The calculusKLPL consists of the rules 1 to 5 (taken fromKpL) and 6 to 10 givenjust. Let us define: L pL = {x: x
E J'l.PL o
& K LPL
Jt+
x}
as the class of logical truths of propositional logic. Referring to the truth conditions of the negation and the conditional (as given in Sect. 2.3), it is easy to prove that the members of L_PL are really logical truths. In fact, L_PL contains all logical truths expressible in propositional logic, but we do not prove this statement here. (The proof belongs to the metatheory of classical propositional logic.)
4.3.2. A first-order language. Our next topic will be a language of classical first-order logic. First-order logic uses all the grammatical and logical means used in metalogic (cf. Sections 2.1, 2.2, 2.3): it applies (individual) variables, names, name functors, predicates, and quantifiers. (The adjective 'first-order' refers to the fact that only variables referring to individuals are used.) First-order languages may use different stocks of name functors and
predicates. We shall deal here with the maximal first-order language, which has an infinite supply of name functors for all possible numbers of argument places and, similarly, an infinite supply of predicates for all possible numbers of argument places (and, of course, an infinite supply of variables). Thus, the alphabet we need must be much richer than 𝒜_PL. We apply as initial letter '𝔵' (gothic eks) for variables; it will be followed by an {ι}-word as an index. For indicating the numbers of empty places of functors, we shall use the Greek letter 'ο' (omicron); {ο}-words may be called arities. As initial letter for name functors we shall use the letter 'φ'; it will be followed by an arity and an index. If the arity is empty, we have a name. For predicates, we shall use the initial letter 'π', followed by an arity and an index. If the arity is empty, we have an unanalyzed atomic formula. We apply '∀' as the universal quantifier, and the symbols '−', '⊃', '=', and the parentheses will be used as well. (The missing sentence functors, e.g., conjunction, and the existential quantifier can be introduced via contextual definitions.) Thus, our alphabet will be:
𝒜_MF = {(, ), ι, ο, 𝔵, φ, π, ∀, −, ⊃, =}.
(The subscript 'MF' refers to the maximal first-order language.)
N(f) = w) .
Obviously, in this case F is a decidable subclass of 𝒜° and N may be called the deciding algorithm of F. We give an example of a deciding algorithm. In Sect. 4.3, we met the alphabet of the maximal first-order language
𝒜_MF = {(, ), ι, ο, 𝔵, φ, π, ∀, −, ⊃, =}.
(B1) (A ⊃ (B ⊃ A))
(B2) ((A ⊃ (B ⊃ C)) ⊃ ((A ⊃ B) ⊃ (A ⊃ C)))
(B3) ((−B ⊃ −A) ⊃ (A ⊃ B))
(B4) (∀xA ⊃ A_t/x)
(B5) (∀x(A ⊃ B) ⊃ (∀xA ⊃ ∀xB))
(B6) (A ⊃ ∀xA), provided A is free from x
(B7) (x = x)
(B8) ((x = y) ⊃ (A_x/z ⊃ A_y/z))
To get basic formulas from these schemata, A, B, C are to be substituted by formulas, x, y, z by variables, and t by terms of L₁.
(ii) If A ∈ BF, and x ∈ Var, then "∀xA" ∈ BF. •
Remark. It can be proved that BF is always a definite subclass of Form (and, even, of 𝒜°).
6.3.2. DEFINITION: Deductibility. Given L₁, Γ ⊆ Form, and A ∈ Form, we define by induction the relation "A is deducible from Γ", in symbols: "Γ ⊢ A", as follows:
(i) If A ∈ Γ ∪ BF then Γ ⊢ A.
(ii) If Γ ⊢ (A ⊃ B) and Γ ⊢ A then Γ ⊢ B.
In case ∅ ⊢ A we say that formula A is provable (in QC), and we write briefly "⊢ A". Rule (ii) is called modus ponens (MP) or sometimes the rule of detachment. •
Remark. In most textbooks of logic, our basic schemata are called axiom schemata, and our basic formulas axioms (of QC). This seems to be a wrong usage of the term axiom. For, in the generally accepted sense of the word, axioms are basic postulates of a scientific theory from which all theorems of the theory follow by means of logic. Are, then, the basic sentences axioms from which all theorems of QC (or what else) follow by means of logic? (Which logic?) Even the question is a confused one. The most we can say is that all provable formulas of QC follow from the basic sentences via applications of modus ponens. Do we identify the class of provable formulas with the theorems of QC? The latter notion is undefined; but the central notion in QC is the deductibility relation rather than provability. It is hard to find an acceptable reasoning in defence of the mentioned use of 'axiom'.
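The inductive definition of deducibility suggests a direct, if inefficient, procedure for finite propositional fragments: start from Γ together with whatever basic formulas one supplies, and close under modus ponens. A Python sketch, illustrative only (formulas are plain strings, with '>' standing in for the horseshoe, and only the explicitly listed basic formulas are available):

```python
def deductive_closure(gamma, basic):
    """Close gamma plus basic formulas under modus ponens:
    from 'A' and '(A>B)' infer 'B'. Formulas are strings."""
    derived = set(gamma) | set(basic)
    changed = True
    while changed:
        changed = False
        for cond in list(derived):
            if not (cond.startswith("(") and cond.endswith(")")):
                continue
            depth = 0
            # find the top-level '>' splitting '(A>B)' into A and B
            for k, ch in enumerate(cond[1:-1], start=1):
                if ch == "(":
                    depth += 1
                elif ch == ")":
                    depth -= 1
                elif ch == ">" and depth == 0:
                    a, b = cond[1:k], cond[k + 1:-1]
                    if a in derived and b not in derived:
                        derived.add(b)
                        changed = True
                    break
    return derived

# Gamma = {p, (p>q)}; one (B1)-style instance thrown in as a basic formula.
closure = deductive_closure({"p", "(p>q)"}, {"(q>(p>q))"})
assert "q" in closure        # by MP from p and (p>q)
assert "(p>q)" in closure
```

This mirrors clause (ii) of 6.3.2 exactly; clause (i) corresponds to the initialization of the derived set. For full QC the basic formulas form an infinite class, so this brute-force closure is only a model of the definition, not a decision procedure.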
6.3.3. An intuitive justification of QC. We gave an intuitive interpretation of the sentence functors negation and conditional (denoted in first-order languages by '−' and '⊃', respectively) in Sect. 2.3, by referring to truth conditions. In Sect. 2.2, the meaning of the universal quantification was clarified as well. Now let us imagine a nonempty domain D of individual objects and assume that the members of Var (and Term) refer to members of D, so that "∀xA" says: "for all members x in D, A holds". Then one can check easily that, according to this intuitive interpretation, any formula of forms (B1) to (B8) is always true (is a logical truth), even independently of the choice of the domain D. (However, if we are dealing with more than one formula, we must assume the same D with respect to all formulas being used.) Of course, in the cases of (B7) and (B8), we must exploit the meaning of identity as well. In addition, if A is always true then so is "∀xA". Hence, the members of BF are logical truths. Furthermore, we see that the rule Modus Ponens leads to a true formula from true ones. Hence, if "Γ ⊢ A" holds in QC, and the members of Γ represent true sentences (with respect to a fixed domain D), then the formula A represents a true sentence (with respect to D). These considerations show that QC is really a logical calculus, a syntactic formulation of the consequence relation. We can use it with confidence in our reasoning.
6.3.4. THE CLASSICAL PROPOSITIONAL CALCULUS (PC). By a zero-order (or pure propositional) language L₀
let us mean a three-component system (based on a certain
alphabet) (Logo. Ato, Formo) where Logo = {(, ), -, ::J}, Ato is a nonempty class of words called atomic formulas, and Forms is defined by the two stipulations: (i) Ato ~
Forms. and (ii) (A, B
E
Forms ) => ("-A", "(A ::J B)"
E
Formo). - We met such a
language in 4.3.1. A logical calculus for zero-order languages is the (classical) propositional cal-
culus, PC. It can be presented as a fragment of QC, based on the schemata (B 1), (B2), (B3) and the rule of Modus Ponens. (Of course, the prescription: (A
E
BF => "'v'xA"
E
BF) is to be omitted here.) - In 4.3.1, the canonical calculus KLPL defines just the logi-
cal truths of PC. PC is, in itself, a very weak system of logic. However, it is interesting as a fragment of QC. For, any first-order language has a zero-order fragment if we define
Ato as containing all the first-order atomic formulas and all formulas of form "'v'xA". Then all laws of PC are laws of QC as well. In proving PC-laws, we use only the basic schemata (B1), (B2), (B3), and the rule MP. - This strategy will be applied in the next section.
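The two stipulations defining Form₀ can be mirrored by a small recursive recognizer. The encoding below (strings for atomic formulas, tuples for negation and the conditional) is an illustrative choice of mine, not the book's notation:

```python
# Sketch of the zero-order grammar of 6.3.4. Representation (invented
# for this illustration): an atomic formula is a string; "-A" is
# ('-', A); "(A > B)" is ('>', A, B), with '>' standing in for the
# conditional sign.

def is_formula(f, atoms):
    """Membership test for Form0 over the class `atoms` of atomic formulas."""
    if isinstance(f, str):                       # stipulation (i): At0 is in Form0
        return f in atoms
    if isinstance(f, tuple) and len(f) == 2 and f[0] == '-':
        return is_formula(f[1], atoms)           # stipulation (ii): negation
    if isinstance(f, tuple) and len(f) == 3 and f[0] == '>':
        return is_formula(f[1], atoms) and is_formula(f[2], atoms)  # conditional
    return False

atoms = {'p', 'q'}
print(is_formula(('>', ('-', 'p'), 'q'), atoms))  # True: the formula (-p > q)
print(is_formula(('>', 'p'), atoms))              # False: ill-formed
```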
6.4 Metatheorems on QC
It was noted already in Sect. 1.1 that the author assumes that the reader has some knowledge of classical first-order logic. Up to this point, this assumption was not formally exploited. In what follows, we shall give a list of metatheorems on QC without proofs, assuming that the reader is able to check (at least) the correctness of these statements. The particular style of presentation of QC given in this chapter, and the metatheorems listed below, will be essentially exploited in the remaining chapters of this essay. Hence, the reader's assumed familiarity with first-order logic does not make superfluous the explanations of the present chapter.
6.4.1. Metatheorems on PC. The first group of our theorems is based on the basic schemata (B1), (B2), and (B3) as well as the rule MP. Then, referring to 6.3.4, these laws may be called PC-laws. Thus, they will be numbered as PC.1, PC.2, .... Some of these will get a particular code-word, too. - In the notation, Γ and Γ′ refer to classes of formulas, and A, B, C to formulas.
PC.1. (Γ ⊢ A, Γ ⊆ Γ′) ⇒ Γ′ ⊢ A.
PC.2. ({A, A ⊃ B} ⊆ Γ) ⇒ Γ ⊢ B.
PC.3. (Γ ⊢ A ⊃ C) ⇒ (Γ ∪ {A} ⊢ C).
PC.4. ⊢ A ⊃ A.
PC.5. (DT) (Γ ∪ {A} ⊢ C) ⇒ (Γ ⊢ A ⊃ C). - Deduction Theorem. The converse of PC.3.
PC.6. (Cut) (Γ ⊢ A, Γ′ ∪ {A} ⊢ B) ⇒ (Γ ∪ Γ′ ⊢ B).
PC.7. (Γ ∪ {−A} ⊢ −B) ⇒ (Γ ∪ {B} ⊢ A).
PC.8. {−−A} ⊢ A, and A ⊢ −−A.
PC.9. (Cp) (Γ ∪ {B} ⊢ A) ⇒ (Γ ∪ {−A} ⊢ −B). (The law of contraposition.)
PC.10. {A, −A} ⊢ B.
PC.11. {A, −B} ⊢ −(A ⊃ B).
PC.12. {−A ⊃ A} ⊢ A, and A ⊢ −A ⊃ A.
PC.13. −A ⊢ A ⊃ B, and B ⊢ A ⊃ B.
PC.14. (Γ ∪ {A} ⊢ B, and Γ ∪ {−A} ⊢ B) ⇒ Γ ⊢ B.
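The PC-laws above ultimately rest on the facts noted in 6.3.3: the basic schemata are always true and MP preserves truth. For the propositional fragment this can be spot-checked by brute force; the three schema shapes below are the standard Hilbert-style ones, taken as an assumption here since the book's list of (B1) to (B3) is not reproduced in this section:

```python
# Brute-force check that three standard propositional schemata are
# tautologies, and that Modus Ponens preserves truth.
from itertools import product

def cond(a, b):        # truth function of the conditional A > B
    return (not a) or b

def taut(f, n):
    """True iff f (a function of n truth values) holds under all assignments."""
    return all(f(*vals) for vals in product([True, False], repeat=n))

b1 = lambda a, b: cond(a, cond(b, a))                                   # A > (B > A)
b2 = lambda a, b, c: cond(cond(a, cond(b, c)),
                          cond(cond(a, b), cond(a, c)))                 # (A>(B>C)) > ((A>B)>(A>C))
b3 = lambda a, b: cond(cond(not a, not b), cond(b, a))                  # (-A>-B) > (B>A)

print(taut(b1, 2), taut(b2, 3), taut(b3, 2))   # True True True

# MP preserves truth: whenever A and A > B are both true, B is true.
mp_ok = all(b for a, b in product([True, False], repeat=2) if a and cond(a, b))
print(mp_ok)                                   # True
```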
6.4.2. Laws of quantification. For the proof of the following laws, one needs the basic schemata (B1) to (B6). - In the notation, x and y refer to variables.
QC.1. (UG) If Γ ⊢ A, and Γ is free from the variable x, then Γ ⊢ ∀xA. - Especially: ⊢ A ⇒ ⊢ ∀xA. (Universal generalization.)
QC.2. If y is substitutable for x in A, and A is free from y, then ∀xA ⊢ ∀yA^y/x and ∀yA^y/x ⊢ ∀xA. (Re-naming of bound variables.)
QC.3. If the name t (i.e., a name functor of arity 0) occurs neither in A nor in the members of Γ, and Γ ⊢ A^t/x, then Γ ⊢ ∀xA.
QC.4. ∀x∀yA ⊢ ∀y∀xA.
QC.5. If Q is a string of quantifiers "∀x₁∀x₂ ... ∀xₙ" (n ≥ 1) then {Q(A ⊃ B), QA} ⊢ QB. (A generalization of (B5).)
6.4.3. Laws of identity. Now we shall use the full list of our basic schemata (B1) to (B8). - In the notation, s, s′ and t refer to terms.
QC.6. ⊢ (t = t).
QC.7. {(s = t), A^s/z} ⊢ A^t/z.
QC.8. {(s = t)} ⊢ (t = s).
QC.9. {(s = s′), (s′ = t)} ⊢ (s = t).
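Two of these laws can be spot-checked semantically in one small, invented finite interpretation (a sanity check on a single model, not a proof of the laws):

```python
# QC.4: the order of adjacent universal quantifiers is immaterial, and
# QC.9: identity is transitive. Domain and relation are invented here.
from itertools import product

D = [0, 1, 2]                                        # a small domain
R = {(x, y) for x, y in product(D, D) if x <= y}     # an arbitrary dyadic relation A(x,y)

lhs = all((x, y) in R for x in D for y in D)         # VxVy A
rhs = all((x, y) in R for y in D for x in D)         # VyVx A
print(lhs == rhs)                                    # True: quantifier order irrelevant

qc9 = all(not (x == y and y == z) or x == z
          for x, y, z in product(D, repeat=3))       # (s=s' & s'=t) => s=t
print(qc9)                                           # True
```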
6.4.4. DEFINITION. Let A be an open formula, and let x₁, ..., xₙ be an enumeration of all variables having free occurrences in A (say, in order of their first occurrences in A). Then, by the universal closure of A let us mean the formula "∀x₁ ... ∀xₙA". - According to QC.4, the order of the quantifiers is unessential here.
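The universal closure of Def. 6.4.4 can be computed mechanically once formulas are given a concrete shape. The AST encoding below is an illustrative choice: variables are strings, ('-', A) is negation, ('>', A, B) the conditional, ('A', x, A) stands for "∀xA", and atomic formulas are tuples ('P', x1, ..., xn) whose first entry (any tag other than '-', '>', 'A') names the predicate:

```python
# Free variables in order of first free occurrence, and the universal
# closure obtained by prefixing quantifiers for them (outermost first).

def free_vars(f):
    """Free variables of formula f, in order of first free occurrence."""
    out = []
    def walk(f, bound):
        op = f[0]
        if op == '-':
            walk(f[1], bound)
        elif op == '>':
            walk(f[1], bound); walk(f[2], bound)
        elif op == 'A':                       # universal quantifier: ('A', x, body)
            walk(f[2], bound | {f[1]})
        else:                                 # atomic: ('P', x1, ..., xn)
            for x in f[1:]:
                if x not in bound and x not in out:
                    out.append(x)
    walk(f, set())
    return out

def universal_closure(f):
    g = f
    for x in reversed(free_vars(f)):          # innermost quantifier = last variable
        g = ('A', x, g)
    return g

A = ('>', ('P', 'x'), ('A', 'y', ('Q', 'y', 'z')))   # P(x) > Vy Q(y,z)
print(free_vars(A))                                   # ['x', 'z']
print(universal_closure(A))                           # VxVz(P(x) > Vy Q(y,z))
```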
6.5 Consistency. First-Order Theories
6.5.1. DEFINITION. Given a logical calculus I and a class of formulas Γ, we shall denote by "Cns_I(Γ)" the class of formulas deducible from Γ; i.e., Cns_I(Γ) = {A: Γ ⊢ A}. We shall be interested in the cases where I is PC or QC. Obviously: Cns_PC(Γ) ⊆ Cns_QC(Γ). We say that Γ is I-inconsistent iff Cns_I(Γ) = Form, i.e., iff every formula is deducible from Γ. - Finally, Γ is said to be I-consistent iff it is not I-inconsistent. •
Clearly, if Γ is PC-inconsistent then it is QC-inconsistent, too. Or, by contraposition: if Γ is QC-consistent then it is PC-consistent as well. We know from PC.10 that a class of form {A, −A} is PC-inconsistent. We could prove that the empty class (or, what is the same, the class BF) is QC-consistent, but we shall get this result as a corollary later on.
6.5.2. THEOREM. Γ ∪ {A} is PC-inconsistent iff Γ ⊢ −A. - The proof is left to the reader: use PC.12, PC.10, DT and Cut.
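A drastically simplified fragment of Cns can be made executable: the sketch below closes a finite class of formulas under Modus Ponens alone. Since no instances of the basic schemata are added, it only under-approximates Cns_PC(Γ); it is meant to illustrate the closure idea, nothing more. Formulas use the tuple encoding introduced earlier (strings for atoms, ('-', A), ('>', A, B)):

```python
# Forward-chaining closure of a finite formula class under Modus Ponens.

def mp_closure(gamma):
    derived = set(gamma)
    changed = True
    while changed:
        changed = False
        for f in list(derived):
            # from A and (A > B), infer B
            if isinstance(f, tuple) and f[0] == '>' and f[1] in derived:
                if f[2] not in derived:
                    derived.add(f[2])
                    changed = True
    return derived

gamma = {'p', ('>', 'p', 'q'), ('>', 'q', 'r')}
print(sorted(mp_closure(gamma), key=str))   # contains p, q, r and the two conditionals
```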
6.5.3. THEOREM. If "A ⊃ B" ∈ Γ, and Γ is QC-consistent, then at least one of the classes Γ ∪ {−A}, Γ ∪ {B} is QC-consistent.
Proof (sketchily): Assume, indirectly, that both of the mentioned classes are inconsistent. Then, by the preceding Theorem and PC.11, we have that
(i) Γ ⊢ A, (ii) Γ ⊢ −B, (iii) {A, −B} ⊢ −(A ⊃ B).
From (i) and (iii) we get Γ ∪ {−B} ⊢ −(A ⊃ B) by Cut. This and (ii) gives - again, by Cut - Γ ⊢ −(A ⊃ B), contradicting the assumption of the theorem.
6.5.4. THEOREM. If "−∀xA" ∈ Γ, Γ is QC-consistent, and the name t occurs neither in A nor in the members of Γ, then Γ ∪ {−A^t/x} is QC-consistent.
Proof (indirectly). If Γ ∪ {−A^t/x} is inconsistent then Γ ⊢ A^t/x (by Th. 6.5.2), and Γ ⊢ ∀xA (by QC.3), contradicting the assumption of the theorem.
6.5.5. DEFINITION. The pair T = (L¹, Γ) is said to be a first-order theory iff L¹ is a first-order language and Γ is a class of closed formulas of L¹. The members of Γ are said to be the postulates (or axioms) of the theory T, and the members of Cns_QC(Γ) are called the theorems of T. The theory T is said to be inconsistent iff Γ is QC-inconsistent, and it is said to be consistent in the contrary case. •
In the limiting case Γ = 0, the theorems of the theory are the logical truths expressible in the language L¹. According to our intuitive interpretation given in 6.3.3, we believe that such a theory is a consistent one. (For a more convincing proof, we need some patience in waiting.)
In the next chapter, we shall introduce a first-order theory that will lead us to a very important metatheorem on QC. In addition, we shall have an opportunity to show the application of a canonical calculus in defining a first-order theory.
Chapter 7 THE FORMAL THEORY OF CANONICAL CALCULI (CC*)
7.1 Approaching Intuitively
Our aim in this chapter is to reconstruct the content of the hypercalculus H₃ (see 4.4.3) in the frame of a first-order theory. We shall call this theory CC*. (Here the star '*' refers to the fact that this is an enlarged theory of canonical calculi. The restricted theory of canonical calculi would be based on H₂ instead of H₃. We shall meet this in Ch. 8.)
The kernel of this reconstruction consists in transforming the rules of H₃ into (closed) first-order formulas which will serve as postulates of CC*. The transformation procedure will be regulated by the following stipulations (i) to (viii).
(i) The subsidiary letters of H₃ are to be considered as predicates of the first-order language L¹* to be defined.
(ii) The variables of H₃ are to be replaced by first-order variables.
(iii) The letters of the alphabet A_CC = {α, β, γ, <, *} are to be considered as names (i.e., name functors of arity 0) of L¹*.
(iv) A_CC-words are to be considered as closed terms of L¹*. Hence, we would need a dyadic name functor in L¹* to express concatenation. However, we shall follow the practice used in metalogic instead, expressing concatenation by simple juxtaposition. To do so, we formulate an unusual rule for terms as follows: if s and t are terms, "st" is a term.
(v) As the subsidiary letters are considered as predicates, their arguments are to be arranged according to the grammatical rules of first-order languages: the arguments are to be surrounded by parentheses, and they have to follow the predicates.
(vi) In some rules of H₃, the invisible empty word 0 occurs as an argument of some subsidiary letter. In the formulas of first-order languages, a predicate symbol must not stand "alone", without any arguments. Hence, we need a name representing the empty word; let it be 'ϑ'.
(vii) The arrows (→) in the rules are to be replaced by the sign of conditional '⊃'. According to our convention, "(A ⊃ B ⊃ C)" stands for "(A ⊃ (B ⊃ C))"; thus, we need no inner parentheses within the translation of a rule.
(viii) Finally, after applying (i) to (vii) to a rule, let us include the result between parentheses if it involves some '⊃', and prefix it by universal quantifiers binding all the free variables (if any) occurring in it. (In case of more quantifiers, their order is unessential, by QC.4.) For example, the translations of rules 1, 13, and 16 are:
(1′) I(ϑ)
(13′) ∀x∀n(K(x) ⊃ R(n) ⊃ K(xn))
(16′) ∀x∀n∀n₁(V(x) ⊃ I(n₁) ⊃ S(x~n₁)(x~n₁)(x₁)(x))
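The string-level core of stipulations (vii) and (viii) - arrows become conditionals, the chain is parenthesized, and the free variables are quantified - can be mimicked by a tiny function. This is a sketch only: the sample rule below is an invented example, not one of the 34 rules of H₃, and 'V' and '>' are used as ASCII stand-ins for the quantifier and the conditional:

```python
# Translate a canonical-calculus rule (premise patterns plus a
# conclusion pattern, already written with predicate syntax per
# stipulations (i)-(vi)) into a first-order formula string.

def translate(premises, conclusion, variables):
    body = ' > '.join(premises + [conclusion])   # (vii): arrows -> conditional chain
    if premises:
        body = '(' + body + ')'                  # (viii): outer parentheses
    return ''.join('V' + v for v in variables) + body   # (viii): quantifier prefix

print(translate(['K(x)', 'R(n)'], 'K(xn)', ['x', 'n']))
# VxVn(K(x) > R(n) > K(xn))          -- compare (13') above
print(translate([], 'I(t)', []))
# I(t)                                -- a premise-free rule, as in (1')
```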
The hypercalculus H₃ consists of 34 rules. (The releasing rule 34* will be omitted here.) Thus, we get 34 postulates by transforming these rules into formulas. However, we need some other postulates in order to assure that the system should behave as a language radix (see Sect. 3.1). This means, in essence, that the postulates (R1) to (R6) are to be included into our planned theory.
After these preliminary explanations we can begin the systematic formulation of CC*.
7.1.1. DEFINITION. The first-order theory of canonical calculi CC* is defined by
where L¹* is a first-order language based on the alphabet A* = {(, ), t, x, −, ⊃, =, ∀, ≺, ...}. (4) "A ⊃ B" is false if A is true and B is false; in all other cases, "A ⊃ B" is true. (5) "∀xA" is false if for some A_CC-word t, A^t/x is false; otherwise it is true. - (b) Open formulas. An open formula is true if its universal closure (cf. Def. 6.4.4) is true; otherwise it is false.
•
On the basis of this definition, the following statements CC.1 to CC.4 are almost trivially true (the detailed checking is left to the reader).
CC.1. All basic sentences of L¹* are true. (See the rules 50 to 58 of Σ* in the preceding section.)
CC.2. The postulates under 61 to 81 that refer to identity are all true.
CC.3. All postulates from 82 to 115 are true. (These are the translations of the rules of H₃.)
If both "A ⊃ B" and A are true, then B is true (according to (4) of the definition above). Then, using that MP is the single proof rule in QC, we have:
CC.4. Every theorem of CC* is true. It is an open question whether all (closed) true formulas are theorems of CC*; we shall return to this question only at the end of the next chapter. However, we can show that the true closed atomic formulas are theorems of CC*. This will be detailed in three steps.
CC.5. If H₃ ⊩ f then Tr(f) is a theorem of CC*.
Proof: by induction with respect to derivations in H₃. Base: f is a rule of H₃. Then Tr(f) is one of the formulas in the list from 82 to 115 (of the rules of Σ*), i.e., it is a member of Γ*. - Induction step (a): Assume that H₃ ⊩ g, Tr(g) = A, A is a theorem of CC*, and we get f from g by substituting certain A_CC-words t₁, ..., tₖ for certain H₃-variables z₁, ..., zₖ in g. In A, these H₃-variables are replaced by some first-order variables x₁, ..., xₖ. Then, we can assume (using QC.4 if necessary) that A is of form "∀x₁ ... ∀xₖ B". Then, Tr(f) is the formula we get from B by substituting the names of the words t₁, ..., tₖ for x₁, ..., xₖ, which is deducible from A by k applications of (B4), and, hence, is a theorem of CC*. - Induction step (b): Assume that H₃ ⊩ g → f, H₃ ⊩ g, where g involves no arrows, Tr(g → f) = "Q₁(B ⊃ A)", Tr(g) = "Q₂B", where Q₁ and Q₂ are (possibly empty) strings of quantifiers of form "∀x". Assume that these formulas are theorems of CC*.
It is clear that, by an appropriate choice of the variables, we can assume that Q₂ is a part of Q₁ (use, if necessary, QC.2), and, by using (B6), we can replace Q₂ by Q₁. Thus, we can assume that our formulas are of form "Q₁(B ⊃ A)" and "Q₁B". From these we get "Q₁A" by QC.5. To get Tr(f), which is of form "Q₃A", we apply (B4) and QC.2 (if necessary) to omit the superfluous quantifiers and re-name the bound variables. Thus, Tr(f) is a theorem of CC*.
CC.6. If A is a true closed atomic formula but not of form "(s = t)" then A is a theorem of CC*.
Proof. According to item (2) of our truth assignment, there is a word f such that H₃ ⊩ f and Tr(f) = A. Then, by CC.5, A is a theorem of CC*.
CC.7. If "(s = t)" is closed and true, it is a theorem of CC*.
Proof. According to item (1) of our truth assignment, we get from s and t the same term c by deleting the occurrences of 'ϑ' (if any). Now, "(c = c)" is obviously a theorem of CC* (see QC.6). The omitted occurrences of 'ϑ' can be placed back by using the postulates 66, 67, and the basic schema (B8). Hence, "(s = t)" is a theorem of CC*.
CC.8. If A is a true closed atomic formula then A is a theorem of CC*. - This is the summary of the previous two statements.
The formula "(α = β)" is obviously false; hence, by CC.4, it is not a theorem of CC*. Thus, not all formulas are theorems of CC*. In other words:
CC.9. Theory CC* is consistent. (Cf. Defs 6.5.1 and 6.5.5.)
COROLLARY. The empty class of formulas - or the class BF of basic formulas - is consistent.
7.4 Undecidability: Church's Theorem
Let us pose the question: Is it possible to find a procedure - an algorithm - by which we would be able to decide for every formula of L¹* whether it is a theorem of CC*? If we had such a procedure then we would be able to decide, among others, for all formulas of form "A(t)", where t is an {α}-word, whether it is a theorem of CC*. Now, Γ* ⊢ A(t) iff H₃ ⊩ At (by CC.4 and CC.8). However,
H₃ ⊩ At iff t ∈ Aut
(cf. 4.4.3). Hence, in the presence of a decision procedure, we would be able to decide for every numeral (i.e., {α}-word) whether it is an autonomous one. By Th. 4.4.4, the class of non-autonomous numerals is not an inductive class. Then, by Th. 5.4.4 and its corollaries, Aut is not a definite class, and, according to Markov's Thesis (see 5.4.2), it is not a decidable one. Hence, the answer to our question turned out to be a negative one: no normal algorithm can decide theoremhood in CC*, and, if we accept Markov's Thesis, no decision procedure exists for the class of theorems of CC*. Summing up:
7.4.1. THEOREM. Theory CC* is undecidable in the sense that the class of its theorems is not a definite subclass of its formulas.
Since "A is a theorem of CC*" means the same as "Γ* ⊢ A", we get from the result above that in QC, no general procedure (or, at least, no normal algorithm) exists to decide the relation "Γ ⊢ A". Of course, we can imagine a general decision procedure as a schematic one that can be adjusted in some way or other to all particular first-order languages. To be more unambiguous, we can state that no decision procedure exists for the maximal first-order language (see 4.3.2 and the first paragraph of Sect. 6.2). For, this maximal language includes L¹*; hence, if we had a decision procedure for the former, then it would be applicable to the latter as well.
Let us realize, furthermore, that "Γ* ⊢ A" tells the same as "⊢ P₆₁ ⊃ ... ⊃ P₁₁₅ ⊃ A", where P₆₁, ..., P₁₁₅ are the members of Γ* (the formulas in Σ* enumerated from 61 to 115) - according to the Deduction Theorem (see PC.5). Thus, it follows from the undecidability of "Γ* ⊢ A" that the class of provable formulas of L¹* is undecidable. The same holds, a fortiori, for the class of provable formulas in the maximal first-order language. Summing up:
7.4.2. THEOREM. (Church's Theorem.) In QC, there exists neither a universal procedure (representable by a normal algorithm) for deciding the deducibility relation ("Γ ⊢ A") nor for recognising the provable formulas ("⊢ A").
This theorem was first proved by Alonzo Church (CHURCH 1936) in a way different from the one applied here. Obviously, this undecidability theorem holds for all larger logical calculi including QC. In some first-order languages there exist decision procedures for the deducibility relation; obviously, this does not contradict our theorem. For example, if a first-order language involves only names and monadic predicates as (nonlogical) constants, then it is decidable. The same holds for zero-order (i.e., propositional) languages.
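The closing remark - zero-order languages are decidable - can be made concrete. By the standard completeness theorem for PC (not proved in this text up to this point), a propositional formula is provable iff it is a tautology, and tautologyhood is decidable by exhausting the finitely many truth-value assignments:

```python
# A decision procedure for the zero-order case: truth-table search.
# Formulas use the tuple encoding from earlier sketches.
from itertools import product

def atoms_of(f, acc=None):
    acc = set() if acc is None else acc
    if isinstance(f, str):
        acc.add(f)
    else:
        for sub in f[1:]:
            atoms_of(sub, acc)
    return acc

def value(f, v):
    if isinstance(f, str):
        return v[f]
    if f[0] == '-':
        return not value(f[1], v)
    return (not value(f[1], v)) or value(f[2], v)    # conditional

def is_tautology(f):
    """Decide tautologyhood by checking every assignment to the atoms."""
    ats = sorted(atoms_of(f))
    return all(value(f, dict(zip(ats, vs)))
               for vs in product([True, False], repeat=len(ats)))

peirce = ('>', ('>', ('>', 'p', 'q'), 'p'), 'p')     # Peirce's law ((p>q)>p)>p
print(is_tautology(peirce))                          # True
print(is_tautology(('>', 'p', 'q')))                 # False
```

No analogous exhaustive search exists at the first-order level, which is exactly what Church's Theorem rules out.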
* * *
The investigations of the present chapter have shown an interesting example of defining a first-order theory by means of a canonical calculus and, in addition, presented a very important metatheorem on the first-order calculus QC.
Chapter 8 COMPLETENESS WITH RESPECT TO NEGATION
8.1 The Formal Theory CC
In Sect. 7.3, we saw that every theorem of CC* is true (see CC.4). If the converse holds too (up to the present point in this essay, this question was not yet answered), then the identity
(1) {A: A is a true formula of CC*} = {A: A is a theorem of CC*}
holds true. Using that for any formula A exactly one of the pair A, "−A" is true, it follows from (1) that for all formulas A, either A or "−A" is a theorem of CC*; or, as we shall express this property, the theory CC* is complete with respect to negation.
8.1.1. DEFINITION. A first-order theory T is said to be complete with respect to negation - briefly: neg-complete - iff for all formulas A of T, one of A, "−A" is a theorem of T.
An inconsistent theory is, trivially, neg-complete. Thus, the problem of neg-completeness is an interesting one only for consistent theories, such as CC*. Intuitively, we can say that a consistent and neg-complete theory grasps its subject matter exhaustively.
Since any consistent class of formulas can serve as a basis (postulate class) of a consistent theory, it is not surprising that many consistent first-order theories are not neg-complete. As we shall see later on, this is the case even with CC*, which means that identity (1) does not hold. Moreover, there are surprising cases of neg-incomplete theories, theories which are irremediably neg-incomplete in the sense that any consistent enlargement (with new postulates) of the theory remains neg-incomplete. Such a theory is especially interesting if we can give a truth assignment of its formulas according to which all its theorems are true, and, in addition, our intuition suggests that the postulates of the theory characterize exhaustively the notions represented by the constants of the language of the theory. We shall give an example of such a surprising first-order theory. It will be a fragment of CC*; let us call it CC.
The intuitive background of CC* is the hypercalculus H₃. In CC, we shall rely on H₂ instead. Let us remember that H₂ defines the notion of a canonical calculus and the derivability in a canonical calculus. The additional notions defined in H₃ are the lexicographic ordering, the Gödel numbering, and the autonomous numerals. These considerations show that the theory CC that will be based on the hypercalculus H₂ is the smallest first-order theory of canonical calculi, whereas CC* is one of the possible enlargements of it. (However, CC* was useful in demonstrating the undecidability of QC.) According to our intuitions, H₂ regulates exhaustively the notions involved in it. This gives the (illusory) hope that the theory CC based on H₂ will be neg-complete. The formulation of CC is simple enough: we get it by certain deletions from CC* (cf. Def. 7.1.1).
8.1.2. DEFINITION. The first-order theory CC is defined by CC = (L¹₀, Γ₀), where L¹₀ is a first-order language based on the 23-letter alphabet A₀ = {(, ), t, x, −, ⊃, =, ∀, ≺, α, ...}
D(σ)(g^) ⊃ D(σ)(h^).
Proof. By Lemma 8.2.1, it follows from our assumptions that
(4) Γ₀ ⊢ D(σ)(f^ ≺ g^ ≺ h^).
Rule 106 of Σ* is a postulate of CC (a member of Γ₀), from which we get, by applications of the basic schema (B4) (of QC), that
(5) Γ₀ ⊢ D(σ)(f^) ⊃ D(σ)(f^ ≺ g^ ≺ h^) ⊃ T(f^) ⊃ D(σ)(g^ ≺ h^).
Furthermore, "T(f^)" is obviously true, and, hence, by CC.8 (see in Sect. 7.3),
(6) Γ₀ ⊢ T(f^).
We get from (4), (5), and (6) - by PC - that
(7) Γ₀ ⊢ D(σ)(f^) ⊃ D(σ)(g^ ≺ h^).
Again, we get from the postulate under 106 that
(8) Γ₀ ⊢ D(σ)(g^) ⊃ D(σ)(g^ ≺ h^) ⊃ T(g^) ⊃ D(σ)(h^)
and
(9) Γ₀ ⊢ T(g^).
From (7), (8), and (9) we get by PC:
Γ₀ ⊢ D(σ)(f^) ⊃ D(σ)(g^) ⊃ D(σ)(h^),
which was to be proven.
We see that if h involves an arrow - i.e., if h^ is of form "h₁^ ≺ h₂^" - we can continue our proof to get "D(σ)(h₁^) ⊃ D(σ)(h₂^)" instead of "D(σ)(h^)", provided h₁ involves no arrow. Thus, we can extend our Lemma to rules of Σ containing more than two arrow-free inputs. Furthermore, our result is independent of whether the rule in question is an original one - i.e., listed in the presentation of Σ - or is a derived rule of Σ. To mention just one derived rule of Σ which will be important in the following discussions, let us consider:
(10) Fu → vSuStSx → Fv.
Obviously, we get a formula from a formula via substitution. Thus, if the two inputs in (10) are derivable in Σ, then the output "Fv" must be derivable in Σ as well.
9.2 The Proof of the Unprovability of Cons
Let us consider the following abbreviations (introduced partly in Sect. 8.2):
A₀ =df ∀x₁∀x₂ −Diag(x₂/x₁, x₂); a₀ =df A₀^;
B₀ =df ∀x₁ −Diag(a₀/x₁, x₂); b₀ =df B₀^;
C₀ =df Diag(a₀/x, b₀); c₀ =df C₀^.
Let us recall the main results of Sect. 8.2:
(1) (Γ₀ ⊢ C₀) ⇔ (Γ₀ ⊢ B₀).
(2) (Γ₀ ⊢ C₀) ⇒ (Γ₀ ⊢ −C₀).
(3) None of B₀, −B₀ is a theorem of CC.
Our proof will be detailed in several steps.
Step 1. Since Σ ⊩ B^A^a^x₁, we have, by Lemma 8.2.1, the corresponding derivability statement. We know that here the word b₀ is uniquely determined by the words a₀, a₀^ and x. Hence, the conditional (4) expressing this is true.
To accept this formula as a theorem of CC, the auxiliary postulate (SUD) is sufficient. Since Γ₀ ⊢ K(σ), we have that (4) follows from SUD (Substitution Uniquely Determined) by QC; thus, (4) is a theorem of CC.
Remark. The introduction of the auxiliary postulate SUD was mentioned in Def. 8.1.3 already. We could formulate a more general version of SUD; however, the present version suffices for our aims. - If someone objects to SUD, he/she can omit it from Σ and Γ₀; the results of Chapter 8 remain correct without SUD as well. However, SUD is indispensable in the present chapter (except if you find a proof of (4) without using SUD; this possibility is not ab ovo excluded).
A particular case of (10) of the preceding section is:
FA₀ → vSA₀Sa₀Sx → Fv
(where v is a Σ-variable). Then, by Lemma 9.1.1, we get a corresponding formula (5). From (4) and (5) we get by PC:
(6) Γ₀ ⊢ (D(σ)(F^a₀) & D(σ)(x₂^a₀[Sa₀Sx]^) & D(σ)(x₂)) ⊃ (D(σ)(F^x₂) & (b₀ = x₂) & D(σ)(x₂)).
Here the antecedent is "Diag(a₀/x, x₂)", and the consequent yields "D(σ)(F^b₀) & D(σ)(b₀)", by the basic schema (B8) of QC. The latter formula is - by (2) of the preceding section - just "Th(b₀)". Hence:
Γ₀ ⊢ Diag(a₀/x, x₂) ⊃ Th(b₀).
Then, by QC (taking into consideration that "Th(b₀)" is free from x, x₁, and x₂), the antecedent may be generalized, and it is just the negation of B₀. Thus, our final result is that:
(7) Γ₀ ⊢ −B₀ ⊃ Th(b₀).
Step 2. By (1), if one of C₀, B₀ is deducible in Σ then so is the other. Thus, we have the corresponding pair of derived rules. Then, by Lemma 9.1.1 and PC, we get easily:
Γ₀ ⊢ D(σ)(b₀) ≡ D(σ)(c₀).
With respect to the definition of 'Th' (see (2) in the preceding section), we then have:
Γ₀ ⊢ Th(b₀) ≡ Th(c₀).
From this and (7) we get:
(8) Γ₀ ⊢ −B₀ ⊃ Th(c₀).
Step 3. By (2), if C₀ is derivable in Σ then so is its negation. This yields a corresponding derived rule. Then, similarly as in the previous step, we have:
(9) Γ₀ ⊢ Th(c₀) ⊃ Th(−^c₀).
Step 4. By PC, from a pair (of formulas) A, "−A", any formula is deducible. Hence, we have the derived rule:
Fu → u → F−^u → −^u → Fv → v.
Then, using the generalization of Lemma 9.1.1 and applying the definition of 'Th', it follows by QC that:
Γ₀ ⊢ (Th(c₀) & Th(−^c₀)) ⊃ ∀x₁(D(σ)(F^x₁) ⊃ D(σ)(x₁)).
Here the consequent is exactly the negation of 'Cons' (see (1) in Sect. 9.1). Hence:
(10) Γ₀ ⊢ (Th(c₀) & Th(−^c₀)) ⊃ −Cons.
Step 5. We get from (8), (9), and (10), by PC, that
Γ₀ ⊢ −B₀ ⊃ −Cons.
Or, by contraposition:
Γ₀ ⊢ Cons ⊃ B₀.
Hence, if 'Cons' were a theorem of CC, then so would be B₀. By (3), B₀ is not a theorem of CC. Consequently, 'Γ₀ ⊢ Cons' does not hold. Our aim was just to prove this statement.
•
Our result can be extended to certain enlargements of CC. The conditions are the same as in Th. 8.3.1.
The metatheorem just proved is an analogue of Gödel's Second Incompleteness Theorem, which states that the consistency of Number Theory is unprovable - although expressible - within Number Theory.
Concluding remarks. We have finished our work on the pure syntactic means of metalogic. However, every system of logic is defective without a semantical foundation - at least according to the views of a number of logicians (including the author). Thus, if the question is posed, 'How to go further in studying metalogic?', the natural answer seems to be, 'Turn to the semantics!'. Now, a logical semantics which is best connected to our intuitions concerning the task and applicability of logic can be explained within the frames of set theory. Set theory is a very important and deep discipline of mathematics. We need only a solid fragment of this theory in logical semantics (at least if we do not go far away from our intuitions concerning logic). Fortunately, set theory can be explained as a first-order theory. After studying its most important notions and devices, we can incorporate a part of this theory into our metalogical knowledge, and we can utilize it in developing logical semantics. Our next (and last) chapter will give a very short outline of set theory as well as some insights on its use in logical semantics. We assume here (similarly as in Ch. 6) that the reader has taken (or will take) a more detailed course in this discipline - our explanations are devoted merely to giving a feeling of the continuity in the transition from syntax to semantics.
Let us mention that another field of logical semantics is algebraic semantics. This is foreign to the subject matter of the present essay, for, in the view of the author, it does not help us to understand the true nature and essence of logic. However, it is a very important and nowadays very fashionable field of mathematical logic, presenting interesting mathematical theorems about systems of logic.
Chapter 10 SET THEORY
10.1 Sets and Classes
10.1.1. Informal introduction. The father of set theory was Georg Cantor (1845-1918). It became a formal theory (based on postulates) in the 20th century, due to the pioneering works of Ernst Zermelo and Abraham Fraenkel (quoted as 'Z-F Set Theory'). Further developments are due to Th. Skolem, J. v. Neumann, P. Bernays, K. Gödel and many other mathematicians. (On the works of Cantor, see CANTOR 1932.)
The intuitive idea of set theory is that some collections - or classes - of individual objects are to be considered as individual objects - called sets - which can be collected, again, into classes which might be, again, individuals, i.e., sets, and so on. Briefly: the operation of forming classes can be iterated; and classes which can be members of other classes are called sets. Thus, according to this intuition, sets are individualized classes. Then, an important task of set theory is to determine which classes can be individualized (i.e., considered as sets).
Now, formal set theory gives no answer to such questions as 'what are classes?' or 'what are sets?'. Its universe of discourse is the totality of sets, and most of its postulates deal with operations forming sets from given sets. There exist different formulations of (the same) set theory. In most formulations, set theory is presented as a first-order theory whose single nonlogical constant is the dyadic predicate '∈' ('is a member of'), and the possible values of free variables are assumed (tacitly) to be sets. Thus, in the formula "x ∈ y", both x and y are sets; members of sets (if any) must be, again, sets. Moreover, identity of sets is introduced via definition:
(1) (a = b) ⇔df ∀x((x ∈ a) ≡ (x ∈ b)).
From this, "a = a" - and "∀x(x = x)" - is deducible; hence, the basic formula (B7) of QC is omitted. The same holds for (B8); instead of the latter, a postulate called the axiom of extensionality is accepted:
(a = b) ⊃ ∀x((a ∈ x) ⊃ (b ∈ x)).
In QC, "∃x(x = x)" follows from "∀x(x = x)". This means that the domain of individuals is not empty. Hence, we do not need a postulate stating that there are sets (for, in this approach, everything is a set).
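The "everything is a set" picture, with identity governed by (1), can be illustrated with hereditarily finite sets. Python's frozenset is used here purely as a modeling device of my own choosing - extensional equality is built into it - and is not anything suggested by the text:

```python
# Hereditarily finite pure sets as nested frozensets: two sets are
# identical exactly when they have the same members, mirroring (1).

empty = frozenset()
one   = frozenset({empty})            # {0}
a     = frozenset({empty, one})       # {0, {0}}
b     = frozenset({one, empty})       # the same members, listed differently

print(a == b)                         # True: same members => identical
print(empty in a, a in a)             # True False: membership; no self-membership here
```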
In logical semantics, it is advantageous to assume that there are domains - i.e., sets - whose members are not sets but other types of individual objects (e.g., physical or grammatical objects). By this, we shall depart a little from the usual formulation of set theory sketched above. The main peculiarity of our approach is to permit individuals other than sets; these will be called primary objects, briefly: primobs. Of course, they will have no members. To differentiate between sets and primobs, we need a monadic predicate 's', where "s(x)" represents the open sentence 'x is a set', and "−s(x)" tells that x is a primob. We cannot omit the identity sign '=' from the supply of our logical constants, for, if we try to apply the definition (1) to primobs, we get that all primobs are identical with each other. Thus, we shall use the full machinery of QC, retaining (B7) and (B8) as well. - Note that we shall not prescribe the existence of primobs; we want only to permit them.
After these preliminary discussions let us return to the systematic explanation of our set theory.
10.1.2. The language of set theory. To avoid superfluous repetitions, it is sufficient to fix that in the language of set theory, the class of nonlogical constants is
Con = {s, ∈}
where s is a monadic and ∈ is a dyadic predicate. No name functors are used - although several such ones can be introduced via definitions. Thus, Term = Var.
Notation conventions in our metalanguage. We shall use lower-case Latin letters (a, b, c, x, y, z) as metavariables for referring to the object language variables (x, xt, xtt, ...). The logical symbols &, ∨, ≡, ∃ introduced via definitions (see 6.2.2, Remark 4) will be used sometimes. The convention for omitting parentheses (see 6.2.4) will be applied, too. We write "(x ∈ y)" instead of "∈(x)(y)", and "(x ∉ y)" and "(x ≠ y)" instead of "−(x ∈ y)" and "−(x = y)", respectively. The expressions "φ(x)" and "ψ(x, y)" refer to arbitrary monadic and dyadic open formulas, respectively.
We do not want to list all postulates of set theory in advance. Instead, we shall present postulates, definitions and theorems alternately, giving a successive construction of the theory. Now, if Γₛ denotes the class of postulates of our set theory, we shall write "⊢ A" instead of "Γₛ ⊢ A" in this chapter. Most theorems - including postulates - will be presented by open formulas; these are to be understood as standing for their universal closures.
10.1.3. First postulates:
(P0) ∃x(x ∈ a) ⊃ s(a).
(P1) s(a) ⊃ s(b) ⊃ ∀x((x ∈ a) ≡ (x ∈ b)) ⊃ (a = b).
(According to our conventions, (P0) stands for "∀a(∃x(x ∈ a) ⊃ s(a))", and (P1) is to be prefixed with '∀a∀b'.)
(P0) says that if something has a member, it is a set. (But it does not state that every set has a member.) By contraposition:
−s(a) ⊃ −∃x(x ∈ a),
i.e., primobs have no members. - (P1) tells us that if two sets coincide in extenso (containing the same members) then they are identical. Here the conditions s(a) and s(b) are essential; without them, all primobs would be identical.
Before going further, we shall extend our metalanguage.
10.1.4. Class abstracts and class variables. As in Sect. 2.5, we introduce class abstracts and class variables, with the stipulation that in the class abstract {x: φ(x)}, φ(x) must be a monadic open formula of the language of set theory. Then, class abstracts and class variables (A, B, C) will be permitted in place of the variables in atomic formulas (that is, everywhere in a formula except in quantifiers), but these occurrences of class symbols will be eliminable by means of the definitions (D1.1) to (D1.6) below. Thus, the introduction of these new symbols does not cause an extension of our object language; it gives only a convenient notation in the metalanguage. (Note that the class of monadic open formulas is a definite one; hence, the same holds for the class of class abstracts, which is the domain of the permitted values of our class variables.) The six definitions below show how a class symbol is eliminable in atomic formulas.
(D1.1) a ∈ {x: φ(x)} ⇔df φ(a).
(D1.2) (A = B) ⇔df ∀x((x ∈ A) ≡ (x ∈ B)).
(D1.3) (a = A) ⇔df (A = a) ⇔df s(a) & ∀x((x ∈ a) ≡ (x ∈ A)).
(D1.4) (A ∈ b) ⇔df ∃a((a = A) & (a ∈ b)).
(D1.5) (A ∈ B) ⇔df ∃a((a = A) & (a ∈ B)).
(D1.6) s(A) ⇔df ∃a(a = A).
We get from (D1.3) that
⊢ −s(a) ⊃ (a ≠ A)
(primobs are not classes). By (D1.6), a class is a set iff it is coextensive with a set. Hence, "−s(A)" means that the extension of A coincides with no set. In this case, A is said to be a proper class.
10.1.5. Proper classes. Set theory would be very easy if we could assume that every class is a set (as Cantor thought before the 1890's). As we know today, we cannot assume this without risking the consistency of our theory. Here follow the definitions of some interesting classes:

(D1.7) Ind =df {x : (x = x)}.
(D1.8) 0 =df {x : (x ≠ x)}.
(D1.9) Set =df {x : s(x)}.
(D1.10) Ru =df {x : s(x) & (x ∉ x)}.

Ind and Set are the classes of individuals and of sets, respectively. 0 is the empty class. Ru is the so-called Russell class: the class of "normal" sets (which are not members of themselves). Except 0, all these are proper classes. It is easy to show this about Ru. - Assume, indirectly, that s(Ru). Then there is a set, say r, such that

∀x((x ∈ r) ≡ (s(x) & (x ∉ x))).

Then, by (B4) of QC, we get

(r ∈ r) ≡ (s(r) & (r ∉ r)),

which implies "-s(r)", contradicting our indirect assumption. Hence:

(Th.1.1) ⊩ -s(Ru),

i.e., Ru is a proper class. Since Ru ⊆ Set ⊆ Ind, we suspect that Set and Ind are proper classes, too. (Proof will follow later on.) Thus, there "exist" proper classes. This - seemingly ontological - statement means merely: We cannot assume, without the risk of a logical contradiction, that for all monadic predicates "φ(x)" of the language of set theory there is a set a such that "∀x((x ∈ a) ≡ φ(x))" holds.

Remark. The Russell class Ru was invented by Bertrand Russell in 1901. (See e.g. RUSSELL 1959, Ch. 7.) The existence of proper classes was recognised (but not published) by Cantor some years earlier (see CANTOR 1932, pp. 443-450). These recognitions led to the investigations of finding new foundations for set theory.
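The diagonal argument behind (Th.1.1) can be replayed mechanically on finite material. The Python sketch below is only an illustration (it is not part of the text): sets are modelled as frozensets, and for any finite family U the "Russell set" r = {x ∈ U : x ∉ x} can never itself be a member of U, for r ∈ U would give (r ∈ r) ≡ (r ∉ r).

```python
# Model hereditarily finite sets as frozensets.
e = frozenset()                       # 0, the empty set
one = frozenset({e})                  # {0}
two = frozenset({e, one})             # {0, {0}}

def russell(U):
    """The 'Russell set' of the finite family U: {x in U : x not in x}."""
    return frozenset(x for x in U if x not in x)

U = [e, one, two, frozenset({two})]
r = russell(U)

# No member of U is a member of itself (frozensets are well-founded) ...
assert all(x not in x for x in U)
# ... so r collects the whole of U; yet r itself cannot lie in U:
# r in U would yield (r in r) iff (r not in r), a contradiction.
assert r == frozenset(U)
assert r not in U
```

The same computation, run over any finite U whatsoever, never finds r inside U — a finitary shadow of the fact that Ru is a proper class.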
10.1.6. Further definitions and postulates. - From now on, our treatment will be very sketchy. Let us introduce an abbreviation for the simplest class abstractions:

(D1.11) aᴱ =df {x : (x ∈ a)}.

By (D1.3) and (D1.1) we have:

(a = aᴱ) ≡ (s(a) & ∀x((x ∈ a) ≡ (x ∈ a))).

From this we get by QC:

(Th.1.2) ⊩ (a = aᴱ) ≡ s(a).

That is, any set a coincides with the class aᴱ. Thus, all sets are classes (but not conversely). By this theorem, all definitions and theorems on classes hold for sets as well. On the other hand, if a is a primob, aᴱ has no members, and, by (D1.2), it coincides with the empty class 0:

(Th.1.3) ⊩ -s(a) ⊃ (aᴱ = 0).
In what follows, we shall use all notions and notations introduced for classes in Sect. 2.5 - see especially (4), (5), (7), (9), (10) and (11) in 2.5. In the case of sets, we speak of sub- and supersets instead of sub- and superclasses, respectively. We define the union class of a class A - in symbols: "u(A)" - as follows:

(D1.12) u(A) =df {x : ∃y((x ∈ y) & (y ∈ A))}.

Note that by (P0), "(x ∈ y)" implies "s(y)". Thus, if no set is a member of A then u(A) = 0. Particularly:

⊩ -s(a) ⊃ (u(aᴱ) = 0)  and  ⊩ u(0) = 0.

Now we can formulate two further postulates:

(P2) s({a, b})  [Axiom of pairs.]
(P3) s(u(aᴱ))  [Axiom of union.]

We omit the proof of the following consequences of these new postulates:

(Th.1.4) ⊩ s({a}).
(Th.1.5) ⊩ s(aᴱ ∪ bᴱ).

If a, b are sets, we can write "u(a)" and "a ∪ b" instead of "u(aᴱ)" and "aᴱ ∪ bᴱ", respectively. Let us introduce provisionally Zermelo's postulate:

(Z) s(aᴱ ∩ A)

which will be a consequence of the postulate (P6) introduced in the next section. Its important consequences (without proof):
(Th.1.6) ⊩ (B ⊆ aᴱ) ⊃ s(B).
(Th.1.7) ⊩ s(aᴱ − B).
(Th.1.8) ⊩ s(0).

It follows from (Th.1.6) that Set and Ind - as superclasses of Ru - are proper classes.
The set corresponding to 0 is uniquely determined and is called the empty set. In set theory, it represents the natural number 0; hence, we shall use '0' as its proper name. However, the use of '0' in formulas is eliminable by the following contextual definition:

(D1.13) φ(0) ⇔df ∃a(s(a) & φ(a) & ∀x(x ∉ a)).

We define the power class of a class A - in symbols: "po(A)" - by

(D1.14) po(A) =df {x : s(x) & (xᴱ ⊆ A)}.

(The denomination is connected with the fact that if A has n members then po(A) has 2ⁿ members.) - Our next postulate:

(P4) s(po(aᴱ))  [Axiom of power set.]
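The counting remark on po(A) is easy to confirm computationally. The sketch below is our own illustration (not the book's apparatus): it builds the power set of a finite set as a set of frozensets and checks the 2ⁿ count.

```python
from itertools import combinations

def power_set(a):
    """All subsets of the finite set a, as frozensets (the power set po(a))."""
    elems = list(a)
    return {frozenset(c) for r in range(len(elems) + 1)
                         for c in combinations(elems, r)}

a = {0, 1, 2}
pa = power_set(a)
assert len(pa) == 2 ** len(a)                      # |po(a)| = 2^n
assert frozenset() in pa and frozenset(a) in pa    # 0 and a itself are subsets
```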
This states that the power class of a set a is, again, a set, called the power set of a. Our next postulate - among others - excludes that a set could be a member of itself:

(P5) (aᴱ ≠ 0) ⊃ ∃x((x ∈ a) & (xᴱ ∩ aᴱ = 0))  [Axiom of regularity.]

Its important consequences:
(Th.1.9) ⊩ (a ∈ b) ⊃ (b ∉ a).
(Th.1.10) ⊩ (a ∉ a).

These mean that the relation '∈' is asymmetrical and irreflexive (cf. Def. 10.2.4).
10.2 Relations and Functions

10.2.1. Ordered pairs. An ordered pair (or couple) is an (abstract) object to which there is associated an object a as its distinguished (or first) member, and an object b as its contingent (or second) member. Such a pair (couple) is denoted by "(a,b)"; the case b = a is not excluded. This seems to be an irreducible primitive notion that can be regulated only by means of the postulate:

(1) ((a,b) = (c,d)) ⇔ ((a = c) & (b = d)).

However, in set theory, there is a possibility of representing (or modelling) ordered pairs. Within set theory, this representation has the form of a definition:

(D2.1) (a,b) =df {{a}, {a,b}}.

This definition satisfies the postulate under (1). Note that (a,a) is reducible to {{a}}. Furthermore, if a, b ∈ Ind then ⊩ s((a,b)). - In what follows, we shall deal with ordered pairs only within set theory.

In defining classes of ordered pairs, we have to use class abstracts of the form

(2) {z : ∃x∃y((z = (x,y)) & ψ(x,y))}.

Let us agree to abbreviate (2) by {(x,y) : ψ(x,y)}.
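That (D2.1) really satisfies postulate (1) can be verified exhaustively over a small domain. The Python sketch below (our illustration, again modelling sets as frozensets) does exactly that.

```python
def kpair(a, b):
    """Kuratowski pair (D2.1): (a,b) = {{a}, {a,b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})

# Postulate (1): (a,b) = (c,d)  iff  a = c and b = d  -- checked exhaustively.
dom = range(4)
for a in dom:
    for b in dom:
        for c in dom:
            for d in dom:
                assert (kpair(a, b) == kpair(c, d)) == (a == c and b == d)

# The degenerate case: (a,a) reduces to {{a}}.
assert kpair(1, 1) == frozenset({frozenset({1})})
```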
The Cartesian product of the classes A and B - in symbols: "A × B" - is defined by

(D2.2) A × B =df {(x,y) : (x ∈ A) & (y ∈ B)}.
(D2.3) A⁽²⁾ =df A × A.

It is easy to show that aᴱ × bᴱ ⊆ po(po(aᴱ ∪ bᴱ)). Then, by (Th.1.5), (P4), and (Th.1.6), we have that:

(Th.2.1) ⊩ s(aᴱ × bᴱ).

The class of all ordered pairs, Orp, is

(D2.4) Orp =df Ind × Ind.
10.2.2. Relations. A class of ordered pairs is a possible (potential) extension of a dyadic predicate. Such a predicate expresses a relation (cf. Sect. 2.1). This is the reason that in set theory, subclasses of Orp are called relations (although, in fact, they are only potential extensions of relations). We introduce the metalogical predicate 'Rel' by

(D2.5) Rel(A) ⇔df A ⊆ Orp.

In the following group of definitions, let us assume that R is a relation.

(D2.6) xRy ⇔df (x,y) ∈ R,
       Dom(R) =df {x : ∃y(xRy)},
       Im(R) =df {y : ∃x(xRy)},
       Ar(R) =df Dom(R) ∪ Im(R).

Here Dom(R), Im(R), and Ar(R) are said to be the first domain, the second or image domain, and the area or field, respectively, of the relation R. A relation R may be considered as a projection from Dom(R) to Im(R). The restriction of a relation R to a class A - denoted by "R↓A" - is defined by

(D2.7) R↓A =df {(x,y) : (x ∈ A) & xRy} = R ∩ (A × Im(R)).
If aRb holds, we can say that b is an R-image of a. The class of all R-images of a will be denoted by "R''{a}". We extend this notation to an arbitrary class A in the place of {a}:

(D2.8) R''A =df {y : ∃x((x ∈ A) & xRy)}.

If every member of Dom(R) has a single R-image then R is said to be a function. The metalogical predicate 'Fnc' is defined by

(D2.9) Fnc(R) ⇔df Rel(R) & ∀x∀y∀z((xRy & xRz) ⊃ (y = z)).

Now we can formulate Fraenkel's postulate of set theory:

(P6) Fnc(R) ⊃ s(R''aᴱ)

telling that the R-image of a set is a set, provided R is a function. Now, let "Id↓A" be the class {(x,x) : x ∈ A}, the identity relation restricted to the class A. Obviously, Id↓A is a function. Then, by (P6), we have:

⊩ s((Id↓A)''aᴱ).

However, (Id↓A)''aᴱ = aᴱ ∩ A, hence:

⊩ s(aᴱ ∩ A),

which is exactly Zermelo's postulate (Z) in 10.1.6. - If R is a function, we can write "R(a) = b" instead of "aRb".
10.2.3. Further notions concerning relations. The following definitions are useful in logical semantics. Interchanging the two domains of a relation R we get its converse, denoted by "R˘":

(D2.10) R˘ =df {(x,y) : yRx}.

The relative product of the relations R and S - denoted by "R|S" - is defined by

(D2.11) R|S =df {(x,y) : ∃z(xRz & zSy)}.

(D2.12) A function is said to be invertible iff its converse is, again, a function.

The class of all functions from B into A - denoted by "ᴮA" - is defined by

(D2.13) ᴮA =df {f : s(f) & Fnc(f) & (fᴱ ⊆ B × A) & (Dom(f) = B)}.

(Th.2.3) ⊩ (s(A) & s(B)) ⊃ s(ᴮA).
(Th.2.4) ⊩ (⁰A = {0}).
(Th.2.5) ⊩ (A ≠ 0) ⊃ (ᴬ0 = 0).

(D2.14) 1 =df {0}.

This is the set-theoretical representation of the natural number 1. Using it, (Th.2.4) can be written as

⊩ (⁰A = 1).
10.2.4. DEFINITION. The relation R is said to be
  reflexive iff ∀x((x ∈ Ar(R)) ⊃ xRx),
  irreflexive iff ∀x(-xRx),
  symmetrical iff ∀x∀y(xRy ⊃ yRx),
  antisymmetrical iff ∀x∀y((xRy & yRx) ⊃ (x = y)),
  asymmetrical iff ∀x∀y(xRy ⊃ -yRx),
  transitive iff ∀x∀y∀z((xRy & yRz) ⊃ xRz),
  connected iff ∀x∀y(((x ∈ Ar(R)) & (y ∈ Ar(R)) & (x ≠ y)) ⊃ (xRy ∨ yRx)),
  an equivalence iff it is both symmetric and transitive,
  a partial ordering iff it is reflexive, antisymmetric, and transitive,
  a linear ordering iff it is irreflexive, transitive, and connected.

Note that
  (R is symmetric and transitive) ⇒ R is reflexive,
  (R is asymmetric) ⇒ R is irreflexive,
  (R is irreflexive and transitive) ⇒ R is asymmetric.
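On finite relations the properties of Def. 10.2.4 can be tested directly. The sketch below (our illustration; properties are checked over the relation's area, which suffices for finite examples) also spot-checks two of the implications noted above.

```python
def area(R):
    return {x for pair in R for x in pair}     # Ar(R) = Dom(R) ∪ Im(R)

def reflexive(R):    return all((x, x) in R for x in area(R))
def irreflexive(R):  return all((x, x) not in R for x in area(R))
def symmetric(R):    return all((y, x) in R for (x, y) in R)
def asymmetric(R):   return all((y, x) not in R for (x, y) in R)
def transitive(R):
    return all((x, w) in R for (x, y) in R for (z, w) in R if y == z)

# Symmetric and transitive, hence reflexive on its area (an equivalence):
E = {(0, 1), (1, 0), (0, 0), (1, 1)}
assert symmetric(E) and transitive(E) and reflexive(E)

# '<' on {0,1,2}: irreflexive and transitive, hence asymmetric:
L = {(0, 1), (0, 2), (1, 2)}
assert irreflexive(L) and transitive(L) and asymmetric(L)
```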
10.3 Ordinal, Natural, and Cardinal Numbers

The successor of an individual a - denoted by "a⁺" - is defined as follows:

(D3.1) a⁺ =df aᴱ ∪ {a}.

Natural numbers in set theory are representable by the following definitions:

(D3.2) 0 =df 0,    1 =df 0⁺ = {0},    2 =df 1⁺ = {0,1},
       3 =df 2⁺ = {0,1,2},    4 =df 3⁺ = {0,1,2,3},

and so on. Intuitively: any natural number n is the set of natural numbers less than n. Or: if a natural number n is defined already then the next natural number is defined as n⁺ = n ∪ {n}.

How can we define the class of all natural numbers? This work needs a series of further definitions.
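The representation (D3.1)-(D3.2) is directly executable: starting from the empty frozenset, iterate n⁺ = n ∪ {n}. A small sketch (our illustration):

```python
def succ(n):
    """Successor of a set n: n+ = n ∪ {n}."""
    return n | frozenset({n})

zero = frozenset()
nats = [zero]
for _ in range(4):
    nats.append(succ(nats[-1]))       # builds 1, 2, 3, 4

one, two, three, four = nats[1:]
assert one == frozenset({zero})
assert two == frozenset({zero, one})
# Any natural number n is the set of all natural numbers less than n:
assert four == frozenset(nats[:4])
assert all(len(n) == i for i, n in enumerate(nats))
```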
10.3.1. Ordinal numbers. We say that the relation R well-orders the class A - in symbols: "R.Wo.A" - iff R↓A is connected, its area is A, and every nonempty subset a of A has a singular member not belonging to Im(R↓aᴱ). In details:

(D3.3) R.Wo.A ⇔df ∀x∀y(((x ∈ A) & (y ∈ A) & (x ≠ y)) ⊃ (xRy ∨ yRx)) &
       ∀a(s(a) ⊃ (aᴱ ⊆ A) ⊃ (a ≠ 0) ⊃ ∃x((x ∈ a) & -∃y((y ∈ a) & yRx))).

One can prove that if R.Wo.A then R↓A is a linear ordering.

(D3.4) A class A is said to be transitive iff ∀x((x ∈ A) ⊃ (s(x) & xᴱ ⊆ A)).

Let us denote by 'Eps' the relation 'is a member of', i.e.,

(D3.5) Eps =df {(x,y) : (x ∈ y)}.

Now, all the sets 0, 1, 2, 3, 4 in (D3.2) are well-ordered by Eps and are transitive.

(D3.6) A class A is said to be an ordinal iff it is transitive and Eps.Wo.A.

Ordinals which are sets will be called ordinal numbers. Their class, On, is defined by

(D3.7) On =df {x : s(x) & xᴱ is an ordinal}.
10.3.2. The following statements can be proved:
(1) Every nonempty class of ordinal numbers has a singular member with respect to the relation Eps.
(2) Every member of an ordinal is an ordinal number.
(3) On is an ordinal.
(4) Every ordinal other than On is a member of On.
(5) On is a proper class, i.e., -s(On).
(6) The successor of an ordinal number is an ordinal number (i.e., (α ∈ On) ⊃ (α⁺ ∈ On)).

We shall use lower-case Greek letters - α, β, γ - referring to ordinal numbers. An ordinal number other than 0 may or may not be a successor of another ordinal number; if not, it is called a limit ordinal number. Now, the class of non-limit ordinal numbers, On₁, is defined by

(D3.8) On₁ =df {α : (α = 0) ∨ ∃β((β ∈ On) & (β⁺ = α))},

whereas On − On₁ is the class of limit ordinal numbers.
10.3.3. Natural numbers. - In set theory, natural numbers are represented by those members of On₁ which, starting from 0, are attainable by means of the successor operation. Thus, the definition of the class of natural numbers, ω, is as follows:

(D3.9) ω =df {α : (α ∈ On₁) & ∀β((β ∈ α) ⊃ (β ∈ On₁))}.

Now, ω is proved to be an ordinal. Hence, either ω = On, or else ω ∈ On. Mathematics cannot be devoid of the following postulate:

(P7) s(ω).

From this it then follows that ω is a limit ordinal number.

(D3.10) (α < β) ⇔df (α ∈ β);  (α ≤ β) ⇔df ((α < β) ∨ (α = β)).

We omit the details of how the full arithmetic of natural numbers - including arithmetical operations - can be developed in the frame of set theory. The essence is that accepting set theory in metalogic, we can use the notions of arithmetic as well. By induction on ω, we can define the notion of ordered (n+1)-tuple (n ≥ 2) as an ordered pair whose first member is an ordered n-tuple. We agree in writing

(a₁, ..., aₙ, aₙ₊₁) =df ((a₁, ..., aₙ), aₙ₊₁).

Similarly, we define for n > 0

A⁽ⁿ⁺¹⁾ =df A⁽ⁿ⁾ × A.
10.3.4. Sequences. Where α is an ordinal number, by an α-sequence let us mean any function defined on α, i.e., a member of ᵅInd. If s is an α-sequence, and β < α, then the s-image of β is called the β-th member of s. The usual notation: s = (s_β)_{β<α}. Sequences defined on a natural number (a member of ω) are called finite sequences. The single 0-sequence is 0. If ω ≤ α (α ∈ On), the α-sequences are called transfinite sequences, whereas ω-sequences are said to be (ordinary) infinite sequences. If s = (sᵢ)ᵢ ...

Proof: by structural induction using the semantic rules. (The details are left to the reader.)

1.1.3.4. DEFINITION. Let Γ be a set of sentences (Γ ⊆ Cat(o)), Ip an interpretation, and v a valuation joining to Ip.
(i) We say that the couple (Ip, v) is a model of Γ iff for all A ∈ Γ: |A|v,Ip = 1.
(ii) Γ is said to be satisfiable iff Γ has a model, and Γ is said to be unsatisfiable iff Γ has no model.
(iii) The sentence A is said to be a semantic consequence of Γ - in symbols: "Γ ⊨ A" - iff every model of Γ is a model of {A}.
(iv) Sentence A is said to be valid (or a logical truth of EL) - in symbols: "⊨ A" - iff A is a semantic consequence of the empty set of sentences.
(v) Terms A and B are said to be logically synonymous iff ⊨ (A = B).

Note that if Γ is unsatisfiable then for all sentences A, Γ ⊨ A; and if ⊨ A then for all Γ, Γ ⊨ A.

1.1.4. SOME SEMANTICAL METATHEOREMS

Throughout this section, a language L^EL and an arbitrary interpretation Ip for L^EL is assumed. Let us denote by "FV(A)" the set of variables having some free occurrences in the term A. Then:

1.1.4.1. LEMMA. If the valuations v and v′ coincide on FV(A), then |A|v,Ip = |A|v′,Ip.

Proof. Our statement is obviously true if A is a variable or a constant. If A is of form "B(C)" or "(B = C)" then use the induction assumption that the lemma holds true for B and C, and take into consideration that in these cases, FV(A) = FV(B) ∪ FV(C) (and use the rules (S1) and (S3)). Finally, if A is of form "(λx.B)" then FV(A) = FV(B) − {x}.
If v and v′ coincide on FV(A) then for all b ∈ D(β), v[x:b] and v′[x:b] coincide on FV(B); thus, by the induction assumption,

|B|v[x:b] = |B|v′[x:b].

Then (using the rule (S2)), for all b ∈ D(β):

|(λx.B)|v(b) = |B|v[x:b] = |B|v′[x:b] = |(λx.B)|v′(b),

which means that |(λx.B)|v = |(λx.B)|v′.

COROLLARY. If A is a closed term then for all valuations v and v′, |A|v = |A|v′.

1.1.4.2. LEMMA. If for all valuations v, |A_α|v = |B_α|v, then for all valuations v, |C|v = |C[B/A]|v. (Cf. (vi) of Def. 1.1.2.2.)

Proof. For the sake of brevity, we write X′ instead of "X[B/A]". Our statement holds trivially if A is not a part of C, or C is A. If C is of form "F(E)" or "(F = E)" then use the induction assumption that |F|v = |F′|v and |E|v = |E′|v (for all v). If C is of form "(λx_β.E)" then C′ must be "(λx.E′)". Using that for all v, |E|v = |E′|v, we have that for all v and for all b ∈ D(β),

|E|v[x:b] = |E′|v[x:b],

which means that for all valuations v, |(λx.E)|v = |(λx.E′)|v.

COROLLARY. If ⊨ (A_α = B_α) then ⊨ (C = C[B/A]).

Let us emphasize a further corollary:

1.1.4.3. THEOREM. The law of replacement. If ⊨ (A_α = B_α) and ⊨ C_o then ⊨ C[B/A].

1.1.4.4. LEMMA. If x_β is substitutable by B_β in the term A then for all valuations v: if |B|v = b ∈ D(β), then |A[B/x]|v = |A|v[x:b].

Proof. If A is free from x then A[B/x] is the same as A, and (by Lemma 1.1.4.1)

|A|v = |A|v[x:b].

Now assume that A is not free from x. Then we use, again, structural induction on A. If A is x then A[B/x] is B, and, trivially,

|B|v = b = |x|v[x:b].

The cases when A is of form "F(C)" or "(C = E)" are left to the reader. Now let us consider the case when A is of form "(λy.C)". Then y ≠ x (for "(λx.C)" is free from x), and B must be free from y (for B is substitutable for x in "(λy.C)"). Hence, if y ∈ Var(γ) then for all c ∈ D(γ),

|B|v[y:c] = |B|v

(by Lemma 1.1.4.1). Then, for all c ∈ D(γ),

|(λy.C[B/x])|v(c) = |C[B/x]|v[y:c] = |C|v[y:c][x:b] (by induction assumption) = |(λy.C)|v[x:b](c),

which yields that |(λy.C[B/x])|v = |(λy.C)|v[x:b].

1.1.4.5. THEOREM. The law of lambda-conversion. If x is substitutable by B in A then

⊨ ((λx_β.A_α)(B_β) = A[B/x]).

Proof. We have by (S1) and (S2) that if |B|v = b then

|(λx_β.A_α)(B_β)|v = |(λx.A)|v(|B|v) = |A|v[x:b].

According to the preceding lemma (using the assumption on substitutability):

|A|v[x:b] = |A[B/x]|v.

Hence: |(λx.A)(B)|v = |A[B/x]|v for all interpretations and valuations. Our statement follows trivially from this fact.
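The content of Lemma 1.1.4.4 and Theorem 1.1.4.5 - that evaluating (λx.A)(B) at v comes to evaluating A at v[x:b], where b = |B|v - can be made concrete with a toy evaluator. The sketch below is our own miniature (not the book's formal languages): terms are tuples, valuations are dictionaries, and the two evaluation routes are checked to agree on a concrete term.

```python
# Toy terms: ('var', x) | ('con', value) | ('app', F, A) | ('lam', x, body)
def ev(t, v):
    tag = t[0]
    if tag == 'var':  return v[t[1]]
    if tag == 'con':  return t[1]
    if tag == 'app':  return ev(t[1], v)(ev(t[2], v))
    if tag == 'lam':                     # rule (S2): |λx.B|v is the map b -> |B|v[x:b]
        _, x, body = t
        return lambda b: ev(body, {**v, x: b})

def subst(t, x, B):
    """A[B/x], assuming no variable capture (B is closed here)."""
    tag = t[0]
    if tag == 'var':  return B if t[1] == x else t
    if tag == 'con':  return t
    if tag == 'app':  return ('app', subst(t[1], x, B), subst(t[2], x, B))
    _, y, body = t
    return t if y == x else ('lam', y, subst(body, x, B))

# A = f(x), with f interpreted as the successor function; B denotes 41.
A = ('app', ('var', 'f'), ('var', 'x'))
B = ('con', 41)
v = {'f': lambda n: n + 1}

lhs = ev(('app', ('lam', 'x', A), B), v)   # |(λx.A)(B)|v
rhs = ev(subst(A, 'x', B), v)              # |A[B/x]|v
assert lhs == rhs == 42
```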
1.1.5. LOGICAL SYMBOLS INTRODUCED VIA DEFINITIONS

We define first the sentences ⊤ and ⊥, called Verum and Falsum, respectively:

⊤ =df ((λp_o.p) = (λp.p));  ⊥ =df ((λp_o.p) = (λp.⊤)).

[Show that |⊤|v,Ip = 1 and |⊥|v,Ip = 0, for all Ip and v.]

We continue by introducing negation, '-':

(Df.-) - =df (λp_o.(⊥ = p)).

Then -A =df (λp.(⊥ = p))(A). By the law of λ-conversion, the right side is logically synonymous to "(⊥ = A)". Hence, the contextual definition of '-' is as follows:

-A =df (⊥ = A).

The explicit definition of the universal quantifier ∀_α (of type o(oα)) is:

(Df.∀) ∀_α =df (λf_oα.(f = (λx_α.⊤))).

Its contextual definition is:

∀(A_oα) =df (A = (λx_α.⊤)).

(Here the type subscript α of ∀ can be omitted.) We can introduce the usual notation by

∀x_α.A_o =df ∀(λx.A) [= ((λx.A) = (λx.⊤))].

The definition of the conjunction '&':

(Df.&) & =df (λp_o(λq_o.∀f_oo[p = (f(p) = f(q))])).

[For the sake of easier reading, we applied here a pair of square brackets instead of the "regular" parentheses. This device will be applied sometimes later on.] We shall write the usual "(A & B)" instead of "&(A)(B)". Thus, the contextual definition of '&' is as follows:

(A & B) =df ∀f_oo(A = (f(A) = f(B)))

where A and B must be free from the variable f. [Show that our definition of '&' satisfies the canonical truth condition of the conjunction.]

The further logical symbols will be introduced via contextual definitions only:

(Df.⊃) (A_o ⊃ B_o) =df -(A & -B),
(Df.∨) (A_o ∨ B_o) =df -(-A & -B).

(We do not need a new symbol for the biconditional since "(A = B)" is appropriate to express it.)

(Df.∃) ∃x_α.A_o =df -∀x.-A.
(Df.≠) (A_α ≠ B_α) =df -(A = B).
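In the two-element standard model D(o) = {0,1}, the definitions above can be checked by brute force: negation as (⊥ = A), and conjunction as ∀f(A = (f(A) = f(B))), where f ranges over all four functions from {0,1} into {0,1}. The sketch below is our own verification harness, not part of the book's apparatus.

```python
from itertools import product

BOOL = (0, 1)
def eq(a, b): return 1 if a == b else 0          # identity '=' between sentences

TOP, BOT = 1, 0                                   # Verum, Falsum
def neg(a): return eq(BOT, a)                     # -A  =df  (Falsum = A)

# All four functions f : {0,1} -> {0,1}
FUNS = [{0: u, 1: w} for u in BOOL for w in BOOL]

def conj(a, b):                                   # (A & B) =df  forall f: A = (f(A) = f(B))
    return 1 if all(eq(a, eq(f[a], f[b])) == 1 for f in FUNS) else 0

def impl(a, b): return neg(conj(a, neg(b)))       # (A > B) =df -(A & -B)
def disj(a, b): return neg(conj(neg(a), neg(b)))  # (A v B) =df -(-A & -B)

# The defined connectives have the classical truth tables:
for a, b in product(BOOL, repeat=2):
    assert conj(a, b) == (a and b)
    assert impl(a, b) == (0 if (a and not b) else 1)
    assert disj(a, b) == (a or b)
assert neg(TOP) == BOT and neg(BOT) == TOP
```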
1.1.6. THE GENERALIZED SEMANTICS

It follows from a result of KURT GÖDEL (1931) that there exists no logical calculus which is both sound and complete with respect to our semantical system EL (i.e., a calculus in which a sentence A is deducible from a set of sentences Γ iff Γ ⊨ A holds in EL). However, by following the method of LEONARD HENKIN 1950, it is possible to formulate a generalized semantics - briefly: a G-semantics - in such a way that the calculus EC - to be introduced in 1.2 - proves to be sound and complete with respect to this G-semantics. The present section is devoted to formulating the G-semantics of the EL languages. The semantics introduced in 1.1.3 may be distinguished by calling it the standard semantics.

1.1.6.1. DEFINITION. By a generalized interpretation - briefly: a G-interpretation - of a language L^EL we mean a triple Ip = (U, D, ρ) satisfying the following conditions:
(i) U is a nonempty set.
(ii) D is a function defined on EXTY such that D(o) = {0,1}, D(ι) = U, and D(αβ) ⊆ the set of all functions from D(β) into D(α).
(iii) ρ is a function defined on Con such that C ∈ Con(α) ⇒ ρ(C) ∈ D(α).
(iv) Whenever v is a valuation joining to Ip (satisfying the condition v(x_α) ∈ D(α)), the semantic rules (S0) to (S3) in Def. 1.1.3.2 are applicable in determining the factual values (according to Ip and v) of the terms of L^EL.

Comparing G-interpretations and standard interpretations (defined in 1.1.3.1), one sees the main difference in permitting '⊆' instead of '=' in the definition of D(αβ). However, the domains D(αβ) must not be quite arbitrary and 'too small': the restriction is contained in item (iv). For example, to assure a factual value for the term "(λx_α.(x = x))", the domain D(oα) must contain a function Φ such that for all a ∈ D(α), Φ(a) = 1 holds.

1.1.6.2. DEFINITION. Consider Def. 1.1.3.4. Replace the term 'interpretation' by 'G-interpretation', and prefix 'G-' before the defined terms. Then one gets the definitions of the following notions: a G-model of a set of sentences Γ, G-satisfiability (and G-unsatisfiability), G-consequence - denoted by Γ ⊨G A - and G-validity (⊨G A), and G-synonymity. Since every standard interpretation is a G-interpretation, we have the following interrelations:

Γ is satisfiable ⇒ Γ is G-satisfiable,
Γ is G-unsatisfiable ⇒ Γ is unsatisfiable,
Γ ⊨G A ⇒ Γ ⊨ A,
⊨G A ⇒ ⊨ A.

We have proved some important semantic laws in Section 1.1.4. Fortunately, their proofs were based in each case on the semantic rules (S0) to (S3), which remain intact in the G-semantics, too. Hence:

1.1.6.3. THEOREM. All logical laws proved in the standard semantics - in Section 1.1.4 - are logical laws of the G-semantics as well.

The most important laws - which will be used in 1.2 - are the law of replacement (1.1.4.3) and the law of lambda-conversion (1.1.4.5).
1.2. THE CALCULUS EC

1.2.1. DEFINITION OF EC
The calculus EC introduced below will be a purely syntactical system joining to the semantical system EL. Our presuppositions here are: the extensional type theory, the grammar of the L^EL languages (including the notational conventions), and the definitions of the (nonprimitive) logical symbols (such as ⊤, ⊥, -, ∀, &, etc.). (See 1.1.1, 1.1.2, 1.1.5.) EC will be based on five basic schemata (E1)-(E5), and a single proof rule called replacement.

Basic schemata:

(E1) (A_α = A_α)
(E2) ((f_oo(⊤) & f_oo(⊥)) = ∀p_o.f(p))
(E3) ((x_α = y_α) ⊃ (f_oα(x) = f_oα(y)))
(E4) ((f_αβ = g_αβ) = ∀x_β[f(x) = g(x)])
(E5) ((λx_β.A_α)(B_β) = A[B/x])

Here the metavariables f, g, x, y refer to variables and A, B to arbitrary terms of a formal language L^EL. (Of course, in (E5), it is assumed that the term B is substitutable for x in A.)

By a basic sentence (of L^EL) we mean a sentence resulting by a correct substitution of terms of L^EL into one of the basic schemata. (A substitution is said to be correct if the lower-case letters are substituted by variables of the indicated types and the upper-case letters are substituted by terms of the indicated types.)

Proof rule. Rule of replacement - RR. From "(A_α = B_α)" and C_o to infer "C[B/A]".

Proofs in EC. By a proof we shall mean a nonempty finite sequence of sentences such that each member of the sequence is either a basic sentence, or else it follows from two preceding members via RR.
A sentence A (of L^EL) is said to be provable in EC - in symbols: "⊢_EC A" - iff there exists a proof in EC terminating in A. (As one sees, our definitions are language-dependent. In fact, we shall be interested, in most cases, in the proofs of sentence schemata rather than of singular sentences of a particular language.) In what follows, we shall omit the subscript 'EC' in the notation '⊢_EC', writing simply '⊢' instead. (The distinction is important only if we are speaking of different calculi.) The notion "A is a syntactic consequence of the set Γ of sentences" (or "A is deducible from Γ") will be introduced in Section 1.2.3.

It is easy to see that all basic sentences are valid in the semantical system EL. Furthermore, by Th. 1.1.4.3, the rule RR yields a valid sentence from valid ones. Hence: If ⊢ A then ⊨ A.

Let us realize that the above statement holds not only for our standard semantics of EL but even for the generalized semantics explained in 1.1.6, in consequence of Th. 1.1.6.3. Consequently:

1.2.1.1. THEOREM. The soundness of EC with respect to the generalized semantics of EL. If ⊢_EC A then ⊨G A.

To prove the converse of this theorem, we need first to prove some theorems about provability in EC.

1.2.2. Some proofs in EC
In this section, we shall prove some metatheorems about provability in EC. Some of these theorems state that a certain sentence or sentence schema is provable in EC, and some others introduce derived proof rules. At the beginning, the proofs will be fully detailed. A detailed proof will be displayed in numbered lines. At the end of each line, there stands a reference between square brackets indicating the provability of the sentence/schema occurring in that line. Our references will have the following forms: 'ass.' stands for an assumption occurring in the formulation of the theorem. Reference to a basic schema or to a schema proved earlier will be indicated by the code of the schema (e.g., '(E2)', 'E3.2', etc.). A reference of form "Df. X" (e.g., 'Df.∀', 'Df.⊃') refers to the definition of the logical symbol standing in the place of 'X'. A reference of form "k/m" in the line numbered n states that line n follows by a replacement (RR) according to the identity standing in line k into the schema in line m. Instead of k or m, we sometimes use codes of schemata proved earlier. We shall refer often to the basic schema (E5) - the identity expressing λ-conversion -; in this case we write simply 'λ' in the place of k.

References to derived proof rules will be of form "RX: k" or "RX: k,m" where "RX" is the code of the rule and k, m are the line numbers (or codes) of the schema(ta) to which the rule is to be applied.

Later on, the proofs will be condensed, leaving some details to the reader. Outermost parentheses will sometimes be omitted. Note that instead of "(A_o ⊃ (B_o ⊃ C_o))" we write "A ⊃ B ⊃ C".
Proofs from (E1)

E1.1. (⊢ (A = B)) ⇒ ⊢ (B = A). - Proof:
1. ⊢ (A = A) [(E1)]
2. ⊢ (A = B) [ass.]
3. ⊢ (B = A) [2/1]

COROLLARY. If ⊢ (A = B) and ⊢ C_o then ⊢ C[B/A]. In what follows, we shall refer to this rule as our basic rule RR.

E1.2. (⊢ (A = B) and ⊢ (B = C)) ⇒ ⊢ (A = C). - Proof:
1. ⊢ (A = B) [ass.]
2. ⊢ (B = C) [ass.]
3. ⊢ (A = C) [1/2]

E1.3. ⊢ ⊤. [By Df.⊤ and (E1).]

E1.4. (⊢ (A_o = ⊤)) ⇒ ⊢ A_o. - Proof:
1. ⊢ (A = ⊤) [ass.]
2. ⊢ ⊤ [E1.3]
3. ⊢ A [1/2]

E1.5. (⊢ (A_o = B_o) and ⊢ A) ⇒ ⊢ B. - Proof:
1. ⊢ (A = B) [ass.]
2. ⊢ A [ass.]
3. ⊢ B [1/2]

These results will be used, in most cases, without particular reference.

Proofs from (E5)

E5.1. ⊢ (A = B) ⇒ ⊢ (A[C/x] = B[C/x]) (provided, of course, that C and x belong to the same category, and C is substitutable for x both in A and B). - Proof:
1. ⊢ (A = B) [ass.]
2. ⊢ ((λx.A)(C) = (λx.A)(C)) [(E1)]
3. ⊢ ((λx.A)(C) = (λx.B)(C)) [1/2]
4. ⊢ ((λx.A)(C) = A[C/x]) [(E5)]
5. ⊢ ((λx.B)(C) = B[C/x]) [(E5)]
6. ⊢ (A[C/x] = B[C/x]) [4,5/3]

Note that line 6 comprises two applications of RR. In what follows, steps such as 2 and 3 will be contracted into a single step with the reference [1/(E1)].

COROLLARIES. We get from (E2) and (E4) by E5.1 that:

(E2*) ⊢ ((F_oo(⊤) & F(⊥)) = ∀p_o.F(p)) [where F is free from p].
(E4*) ⊢ ((F_αβ = G_αβ) = ∀x_β[F(x) = G(x)]) [two applications; F and G must be free from x].

To apply this device to (E3), remember that "(A ⊃ B)" is an abbreviation for "(⊥ = (A & -B))". Hence, E5.1 is applicable to (E3) as well. By three applications we get:

(E3*) ⊢ ((A_α = B_α) ⊃ (F_oα(A) = F(B))).

E5.2. ⊢ ∀x.A_o ⇒ (⊢ (A[B/x] = ⊤) and ⊢ A[B/x]). - Proof:
1. ⊢ ((λx.A) = (λx.⊤)) [ass. and Df.∀]
2. ⊢ ((λx.A)(B) = (λx.⊤)(B)) [1/(E1)]
3. ⊢ (A[B/x] = ⊤) [λ/2, twice]
4. ⊢ A[B/x] [E1.4: 3]
[The provisos are analogous to those of E5.1.]

Proofs from (E4)

E4.1. ⊢ ((A = A) = ⊤). - Proof:
1. ⊢ (((λx.A) = (λx.A)) = ∀x[(λx.A)(x) = (λx.A)(x)]) [(E4*)]
2. ⊢ ∀x[(λx.A)(x) = (λx.A)(x)] [E1.5: 1, (E1)]
3. ⊢ ∀x(A = A) [λ/2, twice]
4. ⊢ ((A = A) = ⊤) [E5.2: 3]

E4.2. ⊢ (∀x.⊤ = ⊤). - Proof: "∀x.⊤" is "((λx.⊤) = (λx.⊤))". Now apply E4.1.

E4.3. ⊢ (-⊥ = ⊤). - Proof: '-⊥' is '(⊥ = ⊥)'. Apply E4.1.

E4.4. ⊢ (∀p_o.p = ⊥). - Use that "∀p.p" is "((λp.p) = (λp.⊤))", and the latter is ⊥, by its definition.

E4.5. ⊢ (∀p_o(p = ⊤) = ⊥). - Proof:
1. ⊢ (((λp.p) = (λp.⊤)) = ∀p(p = ⊤)) [(E4*) and λ]
2. ⊢ (⊥ = ∀p(p = ⊤)) [Df.⊥/1]
Complete by using E1.1.

Proofs from (E2)

E2.1. ⊢ ((⊤ & ⊤) = ⊤). - Proof: In (E2*), let F be "(λp_o.⊤)", and apply λ-conversions. At the right side, use E4.2.

E2.2. ⊢ ((⊤ & ⊥) = ⊥). - Proof: In (E2*), let F be "(λp_o.p)". Apply λ-conversions. At the right side, use E4.4.

E2.3. ⊢ ∀p_o((⊤ & p) = p). - Proof: Let F be "(λp_o((⊤ & p) = p))" in (E2*). After λ-conversions, we have:

⊢ ([((⊤ & ⊤) = ⊤) & ((⊤ & ⊥) = ⊥)] = ∀p((⊤ & p) = p)).

On the left side, ((⊤ & ⊤) = ⊤) and ((⊤ & ⊥) = ⊥) reduce to (⊤ = ⊤) and (⊥ = ⊥) [by E2.1 and E2.2], and these reduce to ⊤ [by E4.1, twice]; so the left side reduces to (⊤ & ⊤), i.e., to ⊤ [by E2.1]. Complete by using E1.1 and E1.4.

E2.4. ⊢ ((⊤ & A_o) = A). [From E2.3, by E5.2.]

E2.5. ⊢ ((⊥ = ⊤) = ⊥). - Proof: Let F be "(λp(p = ⊤))" in (E2*). Use λ-conversions, E4.1, E2.4, and (at the right side) E4.5.

E2.6. ⊢ (-⊤ = ⊥). [Df.-/E2.5.]

E2.7. ⊢ ∀p_o((p = ⊤) = p). - Proof: In (E2*), let F be "(λp((p = ⊤) = p))". After λ-conversions, use E4.1, E2.5, and E2.1. Complete as in the proof of E2.3.

E2.8. ⊢ ((A_o = ⊤) = A). [From E2.7, by E5.2.]

E2.9. ⊢ ∀p_o(--p = p). - Proof: In (E2*), let F be "(λp(--p = p))". Use E4.3 and E2.6.

E2.10. ⊢ (--A = A). [From E2.9, by E5.2.]

R∀. ⊢ A_o ⇒ ⊢ ∀x_β.A. - Proof:
1. ⊢ (A = ⊤) [from the ass. and E2.8]
2. ⊢ ((λx.A) = (λx.⊤)) [1/(E1)]
3. ⊢ ∀x.A [Df.∀: 2]

R&. (⊢ A and ⊢ B) ⇒ ⊢ (A & B). - Proof:
1. ⊢ (A = ⊤) [ass. and E2.8]
2. ⊢ ((⊤ & B) = B) [E2.4]
3. ⊢ ((A & B) = B) [1/2]
4. ⊢ (B = ⊤) [ass. and E2.8]
5. ⊢ ((A & B) = ⊤) [4/3]
6. ⊢ (A & B) [E2.8/5]

R⊤⊥. If A, B ∈ Cat(o), p ∈ Var(o), ⊢ A[⊤/p], and ⊢ A[⊥/p], then ⊢ A[B/p]. - Proof: In (E2*), let F be "(λp.A)". Use the assumptions, R&, and E5.2.

R⊃. (⊢ (A_o ⊃ B_o) and ⊢ A) ⇒ ⊢ B. - Proof:
1. ⊢ (A = ⊤) [ass. and E2.8]
2. ⊢ (⊥ = (A & -B)) [ass. and Df.⊃]
3. ⊢ (⊥ = (⊤ & -B)) [1/2]
4. ⊢ (⊥ = -B) [E2.4/3]
5. ⊢ (-⊥ = --B) [4/(E1)]
6. ⊢ (⊤ = B) [E2.10, E4.3/5]
7. ⊢ B [E1.1, E2.8/6]

Proofs from (E3)

E3.1. ⊢ ((A_α = B_α) ⊃ (B = A)). - Proof: In (E3*), let F be "(λx_α(B = x))", where A and B are free from x. After λ-conversions we have:

⊢ ((A = B) ⊃ ((B = A) = (B = B))).

Complete by using E4.1 and E2.8.

E3.2. ⊢ (⊤ ⊃ ⊤). - Proof: In (E3*), let A and B be ⊤, and let F be "(λp_o.⊤)". We have (by λ-conversions) that:
1. ⊢ ((⊤ = ⊤) ⊃ (⊤ = ⊤))
2. ⊢ (⊤ ⊃ ⊤) [E4.1/1]
Putting ⊥ and ⊤ for A and B, respectively, we get analogously:

E3.3. ⊢ (⊥ ⊃ ⊤). [Cf. E2.5.]

E3.4. ⊢ (A_o ⊃ ⊤). (Verum ex quodlibet.) - From E3.2 and E3.3, by R⊤⊥.

E3.5. ⊢ (⊥ ⊃ ⊥). - Proof: In (E3*), let F be "(λp_o.p)", and let A and B be ⊥ and ⊤, respectively. Use E2.5.

E3.6. ⊢ (⊥ ⊃ A_o). (Ex falso quodlibet.) - From E3.3 and E3.5, by R⊤⊥.

E3.7. ⊢ (A_o ⊃ A). - From E3.2 and E3.5, by R⊤⊥.

Note that by Df.⊃, (⊤ ⊃ ⊥) is (⊥ = (⊤ & -⊥)). By E4.3, E2.1, and E2.5, this reduces to:
(a) ⊢ ((⊤ ⊃ ⊥) = ⊥).
On the other hand, E3.5 and E2.8 yield:
(b) ⊢ ((⊥ ⊃ ⊥) = ⊤).
Similarly, E3.2 and E2.8 yield:
(c) ⊢ ((⊤ ⊃ ⊤) = ⊤).
From (a) and (b) we get by R⊤⊥ that:

E3.8. ⊢ ((A_o ⊃ ⊥) = -A).

We get analogously from (a) and (c) that:

E3.9. ⊢ ((⊤ ⊃ A_o) = A).

It follows from E3.8 that ⊢ (-(A_o ⊃ -⊤) = --A). Using that (A ⊃ -⊤) =df -(A & ⊤), this means that:

E3.10. ⊢ ((A_o & ⊤) = A).

The Propositional Calculus (PC)

PC1. ⊢ (((A_o ⊃ B_o) & (B ⊃ A)) ⊃ (A = B)). - Proof:
1. ⊢ (((A ⊃ ⊤) & (⊤ ⊃ A)) ⊃ (A = ⊤)); here (A ⊃ ⊤), (⊤ ⊃ A), and (A = ⊤) reduce to ⊤, A, and A [E3.4, E3.9, E2.8], so the whole reduces to ((⊤ & A) ⊃ A), i.e., to (A ⊃ A) [E2.4, E3.7].
2. ⊢ (((A ⊃ ⊥) & (⊥ ⊃ A)) ⊃ (⊥ = A)); here (A ⊃ ⊥) and (⊥ ⊃ A) reduce to -A and ⊤ [E3.4, E3.9, E2.8], and (⊥ = A) is -A, so the whole reduces to ((-A & ⊤) ⊃ -A), i.e., to (-A ⊃ -A) [E3.10, (E1)].
3. ⊢ ((⊥ = A) ⊃ (A = ⊥)) [E3.1]
4. ⊢ (((A ⊃ ⊥) & (⊥ ⊃ A)) ⊃ (A = ⊥)) [2/3]
5. ⊢ (((A ⊃ B) & (B ⊃ A)) ⊃ (A = B)) [R⊤⊥: 1,4]

PC2. ⊢ ((A_α = B_α) = (B = A)). - Proof:
1. ⊢ ([((A = B) ⊃ (B = A)) & ((B = A) ⊃ (A = B))] ⊃ [(A = B) = (B = A)]) [by PC1]
2. ⊢ ((A = B) ⊃ (B = A)) [E3.1]
3. ⊢ ((B = A) ⊃ (A = B)) [E3.1]
Now use R& and R⊃.

Consider the proof of PC1. In line 1, the main '⊃' can be replaced by '='. (Why?) In line 2, '(⊥ = A)' can be replaced by '(A = ⊥)' (according to PC2). From these, one gets by R⊤⊥ that:

PC3. ⊢ (((A_o ⊃ B_o) & (B ⊃ A)) = (A = B)).

PC4. ⊢ ((A = B) ⊃ (A ⊃ B)). - Proof:
1. ⊢ ((⊤ = B) ⊃ (⊤ ⊃ B)), which reduces to (B ⊃ B) [E2.8, E3.9].
2. ⊢ ((⊥ = B) ⊃ (⊥ ⊃ B)), which reduces to (-B ⊃ ⊤) [E3.6, E3.4].
Complete by using R⊤⊥.

The following laws can be proved analogously: substitute ⊤ and ⊥, respectively, for A, and use R⊤⊥.

PC5. ⊢ (A_o ⊃ B_o ⊃ A).
PC6. ⊢ ((A_o ⊃ B_o ⊃ C_o) = ((A ⊃ B) ⊃ A ⊃ C)).
PC7. ⊢ ((-B_o ⊃ -A_o) = (A ⊃ B)).

By PC3 (and R⊃), the two latter laws can be weakened as follows:

PC8. ⊢ ((A_o ⊃ B_o ⊃ C_o) ⊃ (A ⊃ B) ⊃ A ⊃ C).
PC9. ⊢ ((-B_o ⊃ -A_o) ⊃ A ⊃ B).

Knowing that PC5, PC8, PC9 and R⊃ are sufficient for the foundation of PC, we have that EC contains PC. Note that PC3 and PC4 assure that the identity symbol '=' acts, between sentences, as the biconditional. In what follows, we shall refer by 'PC' to the laws of the classical propositional calculus.

Laws of Quantification (QC)

QC1. ⊢ ∀p_o(∀x_α(p ⊃ A_o) = (p ⊃ ∀x.A)). - Proof: Show that

∀x(p ⊃ A) = (p ⊃ ∀x.A)

is provable with ⊤ and ⊥ instead of p. Then use (E2*). - COROLLARY:

QC2. ⊢ (∀x_α(C_o ⊃ A_o) ⊃ (C ⊃ ∀x.A)), provided C is free from x.

Using that ⊢ (A_o ⊃ A) [PC] and using R∀, we get from QC2:
QC3. ~ A 0 ~ 'Vxll'A provided A is free from x. QC4. ~ 'VxaA o -::;) A/. -
Proof"
1. ~ ((h.A)=(h. i» -::;) [(A/ro-f(B»(h.A)=(Af/(b»(Ax. i)] [by (E3*)]
[Notethat U(A/ro-f(Bu,)"
E
Cat(o(oa)).]
2. ~ « h .A )=(h . i ) ~ « h.A)(B )=(Ax. i )(B»
[A 11 ]
3. ~ 'Vx.A ~ (A/ =
[Df. 'V and A /2]
i)
4. ~ 'Vx.A:::> A/
[E2.8/3]
QC5. ~ 'VxJ..Ao-::;) B o) -::;) 'Vx.A ~ 'Vx.B. -
1.
~ 'Vx(A ~ B) ~ (A ~ B)
2.
r 'Vx.A ~ A
Proof"
[QC4] [QC4]
r ('Vx(A ~ B) & 'Vx.A) -::;) (A & (A -::;) B» r (A & (A ~ B» -::;) B 5. r ('Vx(A ~ B) & 'Vx.A) ~ B 3.
[PC: 1,2]
4.
[PCl
6.
r 'Vx[('Vx(A -::;) B) & 'Vx.A) -::;) B]
[pC: 3,4]
[R'V: 5]
r ('Vx(A ~ B) & 'Vx.A) ~ 'Vx.B 8. r 'Vx(A ~ B) ~ 'Vx.A ~ 'Vx.B
[R~;
7.
QC2, 6]
[pc: 7]
The laws QC4, QC5, QC3, (EI), (E3*), and R'V - togetherwith the laws of PC - are sufficient for the foundation of the frrst-order Quantification Calculus QC. Hence, all laws of QC are provablein EC, with quantifiablevariables of any type (in contrast to QC where the quantifiable variablesare restrictedto type z). QC6. If the variable yp does not occur in the term A a then
r (hpA~ = (AypAl
). -
Proof" Let A" be Ai , and notethat - owingto our proviso - [A "l/ is the same as AxZ •
r (Ay.A ")(zP) =[A "] / 2. r (Ay.A ")(z) =A/ 3. r (h.A)(z) =Ax 4. r (Ax.A)(z) =(Ay.A ")(z) 5. r 'Vz[(h.A)(z) = (Ay.A ")(z)] 1.
l
6.
~ (ALA)
= (Ay.A ")
[(E5), with suitable z] [from 1, for [A "]/ [(E5)]
[3/2]
[R'V : 4] [(E4*) /5]
141
=Ax
Z
]
This is the law of re-naming bound variables. Finally, we prove a generalization of (E3) which will be useful in the next Section. (E3+)
r
(Ap = Bp ) ::J (CutiA) = C(B». · -
Proof: In (E3*), let F be
"('J....x{iC(x) = C(B) ))" (which belongs to Cat(ofJ». After A-conversions we have:
r (A =B) ::J([C(A) = C(B)] =[C(B) = C(B)]) ~
t By FA.! and E2.8, we get the required result. 1.2.3. EC-consistent and EC-complete sets 1.2.3.1. DEFINITION. A sentence A is said to be a syntactic consequence of the set of
sentences r(ordeducible from n-in symbols: "T ~cA" - iff
r A, or
T is empty and
T is nonempty and there exists a conjunction K (perhaps a one-member one) of
r K::J A. F r A , for all r. - Prove that ru {Co} r A iff
some membersof F such that F
r
[Provethat A ~ this is the so-called Deduction Theorem .]
r C::J A -
1.2.3.2. DEFINmON. A set ris said to be EC-inconsistent iff T
r J,; and ris said to
be EC-consistent iff r is not EC-inconsistent. [Prove that tF is EC-inconsistent) ¢:> (for all sentences A, F some A
E
T, r
r A)
¢:>
(for
r . . A). - Prove that if ris EC-consistent and r r A then ru {A}
is EC-consistent as well.] 1.2.3.3. D EFINmON. A set ris said to be EC-complete iff
(i) ris EC-consistent; is 3- complete (existentially complete) in the sense that whenever
(ii) T
"3x aA o " E F then for some variableYa, A/ E r, (iii) r is maximal in the sense that if Ao~ r then ru {A} is EC-inconsistent. 1.2.3.4. THEOREM . If the membersof F are free from the variable v« . X a does not oc-
r A/ then T r V x.A . r K::J A/ where K is a conjunction of some members of T, and K is free from y. Then, by RV, QC2, and R::J, r K::J 'fIy.A/We get by QC6 that r K::J 'fIx.A, that is, r r 'fIx.A.
cur in the sentence A/ and F t
t
Proof. By the assumption,
1.2.3.5. THEOREM. Every EC-consistent set of sentences is embeddable into an ECcomplete set. More exactly: If ro is an EC-consistent set of sentences of a language Lao:! then there exists a language Lo:! and an EC-complete set T of sentences of Lo:!
such that ro
~
r. 142
Proof For each a E EXTY, let Var'(a) be a sequence of new variables, and let be the enlargement of Loo:J containing these new variables. Let (En)n E iV be an enumeration of all sentences of form "3x aAo " of the extended languageLr;t;!. Starting with the givenset ro, let us definethe sequence of sets (rn)n E iV by the schema:
Lr;t;!
rn+1 = Tn if F; u {En} is EC-inconsistent; and otherwise r n+1 = T; u { En , En} where, if En is "3x a A o " then E is A/ where y is the first memberof Var'(a) occurring neitherin the members of F; nor in En . LEMMA. If T; is EC-consistentthen so is r n+1 • Proof This is obvious in case rn+1 = rn • In the other case, F; U {3x a A o } is assumed to be EC-consistent. Now assume, indirectly, that F; U {3x.A, A/ } is ECinconsistent, i.e., that
F; U {3x.A}
~ ...Al .
Using that T; U {3x.A} is freefrom y, we get by the preceding Th. that
rn U
{3x.A} ~ 'v'x A . . r-
Since "3x.A" is "-'v'x....A" , we have that F; U {3x.A } is EC-inconsistentwhich contradictsthe assumption. Continuing the proofof the theorem, let us define
Showthat riV is EC-consistent (using that in.the contrary case, for some n, rn would be EC-inconsistent) and 3-complete. Now let (Cn )n E aJ be an enumeration of all sentences of Lr;t;!. Let us define the sequence of sets (r~)n E iV by the following recursion:
r~+ l
= T'; if T';
U
{C n
}
is EC-inconsistent, and in the contrary case:
Obviously, for all n, T'; is EC-consistent and 3-complete. Consequently, the same holdsfor Furthermore, Fo s; r. Finally, show that r is maximal (use that if A o ~ I: then for some n, A is C; , and F'; U {Cn } is EC-inconsisteent).
143
1.2.3.6. THEOREM. Assume that r is an EC-completeset of sentences. Then: (i)
r
~ A ~ A E
(ii) {(A
=B), C} ~
r. r
~
UC[BIAj" E F.
(iii) If the term A a occurs in a memberof r then for some variable Xa r "(A a = xa)"
E
r.
(iv) If "(Cf1/J = Df1/J)" E r then for all variables xp ~ "(C(x) = D(x) " E r . Proof. Ad (i). This follows fromthe maximality of F, using that rv {A} is ECconsistent. - Ad (ii), By (E3 +), ~ (A =B) ::J (C = C[BIAJ). Now use PC and (i). Ad (iii).By (i), "(A = A)" ~ (A
=A) ::J 3Ya (A =y),
for some X a , "(A
E
r. By contraposition of QC4 we have that
and, hence, "3y(A
=y)" E r. By the 3-eompleteness of I:
=X a )" E r .
=D) =VYp(C(y) =D(y) (with y such that C and D are free from y), and by QC4, r V y(C(y) = D(y)::J (C(x) = D(x) ), with arbitrary x a. Ad (iv). By (E4*), ~ (C
Complete by using (i). 1.2.4. The completeness of EC 1.2.4.1. THEOREM . If r is an EC-complete set of sentences then there is a Ginterpretation Ip = { V, D, p } and a valuation v such that for all A E F, IA I/P = 1.
(l)
Proof. - Part I: The definition of Ip and v.
We shalldefine D and v by induction on EXTY. (a) For p E Var(o), we defme v(p) simply by v(p) = 1, if p e T; and v(p) = 0 otherwise.
Knowing that i and J. are terms of
£()(!,
we have by (iii) ofTh . 1.2.3.6that for some Po
and qo, "(i =p)"
Since
ri
E
r
"(J. =q)" E r.
and
, we have that pEr r and, hence, v(p) = 1. If q E
r then, by (ii) of the Th.
just quoted, J, E r which is impossible (by the EC-consystency of Ty; hence, q ~ and v(q) =0. Thus, we can define: D( 0) =
r,
to, I}.
Assume that "(Po =q oJ" E r, and v(p) =1, that is, p e T: Then, q E F; too, and v(q) = 1. Assuming that v(q) =1, we get analogously that v(p) = 1. Hence: "(p 0 = q0)" E r iff v(p)
=v(q). (b) Let (Zn}n E (l) be an enumeration of Var( z). Let us definethe function rp by rp(Zn )
=k
~
for some k ~ n, "(Zn " (Zn
=Zj )" ~ r
=Zk )" E r,
[i, k, n
144
E
(J) ] •
and for all i < k,
(In other words: let fP{zn) be the smallest number k such that "(z, = Zk )"
r.
Note that "(Zn =z, )" E r. Then fP{zo) = 0.) Now let U =D(l) be the counterdomain of ¢J, i.e., U =D(l) = {k E (J) : for some x E Var(l), fP{x) =k }. We then define v for members of Var(l) by vex) =df rp(x). [Using the symmetry and the transitivity of the identity, show that "(r, = YI )" E r iff E
vex) = v(y).]
(c) Now assume that D(a), D(j3) are defined already, v is defined for the membersof Yarea) u Var(ft>, and the following two conditions hold for r E {a, fJ} : (i) If C E D(r) thenfor some x r' vex) = c. (ii) "(Xr = Yr)" E r iff vex) =v(y). (Notethat by (a) and (b), these conditions hold for r E {o, l }.) Now we are going to define v for the members of Var(a(p]) as well as the domainD( a(p). Forf E Var(afJ), a E D( a) , b E D(j3), we let: v(f)(b)
=a
iff for some Xa , YP such that vex)
=a, v(y) =b, "(f(y) =x)" E r.
Using (i) and (ii), it is easy to provethat v(f) is a (unique) function from D(j3) to D(a). - We define: D(afJ) = { rp E D(jJ)D(a): for somef ap, v(j) = qJ} c
D(jJ)
D( a).
Now provethat (i) and (ii) hold for r = a(p). (For (ii), use (iv) of Th. 1.2.3.6.) By (a), (b), and (c), the definition of D and v is completed. (d) If C E ConCa) then for some Xa , "(C = x)" E r . We then define P(C) = vex). Show that this definition is unambigous.
By these, the defmition of lp is completed. Part II: The proofof (1). (A) We prove(1) firstly for identities of form "(Ba = Ya)". If B is a variable or a term of form "f(x)" or a constant, then (1) holds according to the defmition of lp and v. In other cases, B is a compound term of form "F(C)", " (Ax.C)" , or U(C = D)". We shall investigate one by one these three cases. Meanwhile, we shall use the induction assumption that (1) holds true for sentences whichare less compound ones than the one underinvestigation. (AI) If U(FatiCp) = Ya )" E r then - by (iii) of Th.1.2.3.6 - for some variables fap and xs . "(F = f)" E F. and "(C = x)" E r. Furthermore, by the ECcompleteness of F, "({(x) =y)" E r. Then,by the definition of lp and v, (2)
I(f(x) =y)l v
= 1.
145
We can assume that (3)
I(F
=f>/v =1
and
(4)
I(C
=x)lv = 1
for F and C are less compoundterms than "F(C)". From (2), (3), and (4) it then follows that I(F(C)
=Y)/v =I
which was to be proved. (The furthercases will be less detailed.) (A2) If "«'AxfJCa) = YaP )"
r
E
then - by (iv) of Th. 1.2.3.6 - for all zp,
"«'Ax. C)(z) = y(z»" E r, and, by 'A-conversion, "(C/
(5)
=y(z) ) E r.
However, for all zpthere is an U z
E
Var(a) such that
"(y(z) = u z )"
(6)
E
r.
By (5) and (6), we have that for all z E Var(/J) there is an Uz E Var(a) such that "(CXZ
=Uz )" E F. By inductionassumption, I(Cx
l(y(z)
=Uz )Iv =1. Henceforth: I(C/ =y(z»lv =1, that is"
Z
= U z )Iv =1, and,furthermore,
/«'Ax. C)(z) = y( z) )/v = 1
(7)
for all zp . Remembering that for all b ED(jJ) there is a zp such that v(z) = b, we get from (7) that for all b E D(jJ), I(AX.C)l v (b) = v(y)(b), which means that I(Ax.C)l v = v(y), and, hence: 1('Ax.C)
=y)l =1. v
(A3) If "«C a = D a ) =Po)" E H(D = y)" E F; and "«(x = y)
(8)
r
thenfor some Xa, Ya: "(C =x)" E T;
=p)" E r.
We can assume that (9)
I(C = x)l v = 1
and
(10)
Now v(p) is 1 or O. If v(p) = 1 then "(p = vex) (11)
I(D
tr E r,
=y)lv
= 1.
and, by (8), "(x = y)
E
r, that is,
=v(y). This and (9) and (10) togetherimply that =
I«(C D) = p)l v
=1.
On the other hand, if v(p) = 0 then "(p = J,)" E r, and, by (8) again, «; (x hence, vex) ;I:. v(y). This and (9) and (10) togetherimply (11).
146
= y) E r,
(B) Secondly, we prove (1) for identities of form "(B a = C a )"where both B
and Cmay be compound terms, If"(B =C)" E r then for somexj and y., , "(B = x)" r, "(C = y)" E r, and "(x = y)" E r. By (A), we have that I(B = x)lv =1,
I(C = y)lv = 1,
and
I(x
whichyieldthat I(B = C)lv =1. (C) Finally, if the sentence A is not an identitythen A
E
E
=y)lv =1, r implies that
"(A =i)" E
r. Since IAl v =I(A =i)l v this case reducesto the preceding one. 1.2.4.2. THEOREM. If the set r is EC-consistent then r is G-satisfiable. This follows fromTheorems 1.2.3.5 and 1.2.4.1. 1.2.4.3. THEOREM. The completeness ofEC with respect to the G-semantics. If r ~G A then r
hr:
A.
Proof. Assume that r ~G A. Then I" =
r
u {- A} is G-unsatisfiable, and, by
contraposing the preceding theorem, F' is EC-inconsistent which meansthat
r u {... A} ~ J,. Then, by the Deduction Theorem, we have that r ~ (- A ::> J,) which reduces to r ~ A. 1.2.4.4. THEOREM. (LbWENHEIM-5KOLEM.) If the set r is G-satisfiable then r is "denumerably" satisfiable in the sense that r has a G-model Ip = {U, D, p} with a valuation v such that each D( a) is at most denumerably infinite. Proof. Note firstly that if r is G-satisfiable then r is EC-~onsistent. (For if r ~ J, then ... continue!) Then, by Th. 1.2.3.5, r is embeddable into an ECcomplete set I" , and, by the proof of Th. 1.2.4.1, I" has a G-model in which the cardinality of each D(a) is not greater than the cardinality of Var(a) (which is; of course, denumerably infinite). That is, I" has a "denumerable" G-model. Since r s: I", this is a G-model of r as well.
147
PART 2: MONTAGUE's INTENSIONAL LOGIC The sourcesof this chapterare the following works: R. MONTAGUE, Universal Grammar, 1970.
R. MONTAGUE, The ProperTreatment of Quantification in Ordinary English, 1973 - briefly:
PTQ. D. GAllIN, Intensional and HigherOrderModalLogic, 1975. We shall presentthe essenceof the most important parts of these writings, of course, without literal repetititons. The resultsof Part 1 will be utilizedextensively.
2.1. THE SEMANTICAL SYSTEMS IL AND IL+ 2.1.1. MONTAGUE'S TYPE THEORY
Montague uses the basic symbols t, e, and s - t for truth value, e for entity, s for sense - in his type theory. The type of a functor with the input type a and output type f3is denoted by "( a~fJl' . The full inductive definition of his types is as follows: t and e are types.
If a, p are types, "( a~p>" is a type. If a is a type, (s,a)" is a type.
Here t and e correspond to our type symbols "(a~p>"
0
and
l,
respectively (cf. 1.1.1), and
correspondto our f3(a). .Finally, "(s,a)" is the type ofexpressions naming the
sense (or the intension) of an expression of type a. It is presupposed here that there exist terms naming intensions (senses) of terms. (For example, if A is a sentence, the term "that A" is a name of the sense (intension) of A; or if B is an individual name say, 'the Pope' - then "the concept of B" is a name of the sense (intension) of B.) Note that the isolated's' is not a type symbol. However, we shall not use Montague's original notation for types. Instead, we shall follow our notation introduced in Section 1.1.1, of course, with suitable enlargements. Thus, our inductive defmition of the set of Montagovian types - denoted by 'TYPM' - runs as follows:
o, lE TIPM, a~pETYPM
pE
TYPM
=> "a(p}'ETYPM, =::) "(/J)s" E TYPM.
Again, if Pconsists of a single character (0 or z), the parentheses surrounding it will be omitted. Furthermore, we write usually "(P'l" instead of "(/J)s" [except when it occurs in a subscript], and instead of "«P'lt " we write simply "(/J)5S "[e.g., d', (55, (Ol)5SSS, etc.].
148
The unrestricted iterations and multiply embedded occurrences of's' may provoke some philosophical criticism, but let us put asidethis problem presently. 2.1.2. THE GRAMMAR OF IL AND IL + The semantical system IL is introduced in Universal Grammar. In PrQ, system IL is extended by the introduction of tense operators; this extended system will be called IL+. 2.1.2.1.
DEFINITION.
By an IL language let us mean a quadruple L (i) = (Log, Var, Con, Cat)
where Log = { (, ), A., =, A
, v }
is the set of logical symbols of the language (containing left and right parentheses, the lambdaoperator A., the symbol of identity, the intensor A, and the extensor v); Var = U
a e TYPM
Var(a)
is the set of variables of the language whereeach Var(a) is a denumerably infinite set of symbols called variables of type a; Can
= UaeTYPM Con(a)
is the set of (nonlogical) constants of the language where each Con(a) is a denumerable (perhaps empty)set of symbols called constants of type a; all the sets mentioned up to this point are pairwise disjoint; Cat
= U ae TYPM Cat(a)
is the set of the well-formed expressions - briefly: terms - of the language where the sets Cat( a) are inductively defined by the grammatical rules (GO) to (G5) below. For a E TVPM, Cat( a) maybe called the a-category of L{i) . [Thenotational conventions will be the same as in 1.1.2.] Grammatical rules: (GO) Var(a) u Con( a)
~
Cat(a).
(G1) "Aa/1(BpJ" E Cat( a). (G2) "(AxpAa) E Cat( ap) . (G3) "(A a = B~" E Cat( 0) (G4) "AA a " E Cat( as).
(G5) "vA a s" E Cat(a). Let us enlargethe set of logical symbols Log by the symbols 'P' and 'F' (called past andjUture tense operators, respectively) and let us add the rule (G6) to the grammaticalrules: (G6) "PAo", "FAd' E Cat(o). 149
By these enlargements, we get the grammar of the IL + languages. [In PrQ, some symbols introduced via definitions in Universal Grammar, are treated as primitive ones; but we do not follow here this policy.] Remark. Montaguespeaksof a singlelanguage of IL, and,hence,he prescribes that each Con( a) must be (denumerably) infinite. We follow the policy of Part 1 in dealingwith a family of languages the members of whichmaydifferfromeachotherin havingdifferent sets of theirnonlogical constants.
2.1.2.2.
DEF1NITION.
(i) Free and bound occurrences of a variable in a term as well
as closed and open terms are distinguished exactly as in EL (cf. (i), (ii) and (iii) of Def. 1.1.2.2). (ii) The set of rigid terms of I}) - denotedby 'RGD' - is definedby the fol-
lowing induction: (a) Var s: RGD;
"AA a "
E
RGD.
RGD ~ F(B) E RGD. (c) A E RGD ~ "(Mp A)" E RGD. (b) Fap, Bp
(d) A, B
E
E
RGD
~
"(A = B)" E RGD.
(e) In case IL": AoE RGD ~ "P(A)", UF(A)"
E
RGD.
In other words: rigid terms are composed of variables and terms of form "AA" via applications of the grammaticalrules (G1), (G2), and (G3). A motivation of the adjective 'rigid' will be given in the next section. (iii) A variable Xa is said to be substitutableby the term Ba in the term A iff (a) whenever "(Ay.C)" is a part of A involving some occurrences of x which counts as a free occurrence of x in A then B is free from y; and (b) if a free occurrence of x in A lies in a part of form "AC' (or - in case of IL + "P(C)" or "F(C)") of A then B is a rigid term. (iv) The result of substitutingBa for Xa in A - in symbols: "[A]/" - and the replacement of A a by B a in a term C - denotedby "C[B fA]" - are defined exactly as in EL (cf. (v) and (vi) of Def. 1.1.2.2). 2.1.3. THE SEMANTICS OF IL AND IL + 2.1.3.1.
DEFINITION.
(a) By an interpretation of an IL language I./) let us mean a
quadruple Ip = (U, W, D, a ) where U and Ware nonemptysets; D is a function defmedon TYPM such that
and a is a functiondefinedon Con such that (2) C E Con(a) ~ o(C) E Int(a) 150
=df W D( a).
(b) By an interpretation of an IL + languagewe mean a sixtuple
Ip = (U, where U,
~
~ T,
-c, D, a}
and T are nonempty sets, < is a linear ordering of T, D is the same as in
(l) except that
D(a S
= I D(a)
)
and a is as in (2) except that Int(a)
where 1= W xT,
= I D( a).
(c) A function v defmed on Var is said to be a valuation joining to Ip iff
Var(a) => vex)
X E
E D( a)
.
The notation "v[x: al" will be used analogouslyas in EL (cf. Def. 1.1.3.1).
Comments. W is said to be the set of (labels of) possible worlds, Tis the set of possible time moments, and < represents the 'earlier than' relation between time moments. I = W x T is said to be the set of indices. For a E TYPM, D( a) is the set of factual values and Int(a) is the set of intensions, of type a, respectively. 2.1.3.2. DEFINmoN. Given an interpretation Ip of an IL or an IL + language If), we shall define for all terms A E Cat and for all valuations v joining to Ip, the intension of Aaccording to Ip and v - denoted by "IIA II /p "- by the semantic rules (SO) to (S6) below. According to our definition,if A E Cat(a) then
will be satisfied where I
=W
in the case of IL, and I
= W x T in the case of IL + •
Hence, I A I /p is defmed iff for all i E I, IAI/p =df
IIA I /p (i) E D( a)
is defmed. We shall exploit this fact in our definition. The object IAI/p may be called
the factual value of A, according to Ip and v, at the world index i. - In what follows, the superscript" Ip' will be, in most cases, omitted. (SO) If x
E
Var, !xlvi
=vex).
If
=IFlvi (IBlvd. (S2) I(AXp A a )Ivi is the function b e D(P) => (S3) I(A a = Ba)vi = 1 if IAlvi =
(SI) IFafJ(Bp)l vi
IIAllv' IVA a s Ivi =IAlvi (i).
C E Con, II CII v =o(C). _ «Apos .vp)(AA) = (Ap.vp)(Ai» 2. ~ (AA =Ai) :::) (vAA =VA T)
[(E3+)]
3. ~ (AA =
[(Il)/2]
A
[All]
T) :::> (A = i )
4. ~ DA :::> A
[Df. 0 and E2.8: 3]
COROLLARY: ~ A o:::> OA.
11.2. If Ao E RGD then
~ O(A o:::> B o ) = (A :::) DB). -
1. ~ (i:::>B) = B
[PC]
2. ~ O(i :::) B) = DB
[1/ (E1)]
3. ~ (i :::> DB)
=DB
[PC]
4. ~ O(i :::) B) = (i :::) DB)
[31 2]
5. ~ (J, :::> B) 6. ~ O(J, :::) B)
[PC]
[RO: 5]
7. ~ (J,:::) DB)
8.
Proof'
[PC]
r O(J, :::> B) =(J, :::) DB)
[pC: 6,7]
9. ~ O(A :::> B) = (A :::> DB)
[Ri J,: 4,8]
Onthe last step, we use that in "o(p 0:::> B) = (p :::> DB)", A is substitutable for p, sinceA is rigid.)
11.3. If A E RGD, 1. ~ «Ai = Ai) = I) 2. ~ (01
=I
3. ~ OJ,:::> J, 4. ~ J,:::) OJ, 5. ~ (OJ,
=J,)
6. ~ (OA =A)
l-
(DA o = A). - Proof' [(EI) and E2.8] [Df. 0 : 1]
[11.1] [PC]
[pC: 3,4] [Ri J,: 2,5]
COROLLARIES:
11.4. ~ (ODA =OA), and ~ (OOA =OA). It follows from 11.3 that if A o is rigid then} (- O-A 11.5. If A o E RGD, ~ (OA = A). 157
= --A) .and, hence:
COROLLARIES:
11.6. t (OOA =OA), and t (oDA =OA). Furthermore, usingthat OA::> A, and A::> 0 A, we have that: 11.7. t 00A ::> A, and t A::> DOA. (TheBrouwerschemata) 11.8. t D(A::> B)::> OA ::> DB, and t D(A::> B)::> oA ::> V'x.DA
r V'x.DA::> DA 5. t o V'x .DA ::> oDA
4.
[QC4]
6.
t
[Lemmon: 4] ·[11.7 (Brouwer)]
7.
t o V'x.DA ::> A
[pC: 5,6]
8.
t
OV'x .DA ::> V'x.A
9.
t
OOV'x.OA ::> DV'x.A
[RV': 7, and QC2] [Lemmon: 8] [11.7 (Brouwer)]
10.
ODA ::> A
t V'x.DA::> DOV'x.DA
r V'x.DA ::> DV'x.A
[pC: 10,9]
12. t (DV'x.A = V'xDA)
[pC: 3,11]
11.
r
=3x.OA.) -
(Prove this!) The next law will be usedin Section2.2.4. COROLLARY:
(03x.A
158
r r 3. r 4. r 5. r 1.
2.
11.10. ~ (- O(B &A) & O(B & C»::J O(C & - A). - Proof: O«B & C)::J (B & A»::J O(B & C)::J O(B & A) [11.8] (-O(B&A) & O(B&C» ::J -O«B&C)::J (B&A» [pc: 1] - «B & C) ::J (B &A»::J (C & - A) [PC] - O«B & C)::J (B & A»::J O(C & - A) [Lemmon:3] (-O(B & A) & O(B & C»::J O(C & - A) [pC: 2,4] If the readeris familiar withmodallogic,he/she realizes that the modalfragment of Ie is an SS-
type modallogic.Furthermore, the combination of quantifiers and modaloperators yielded a Barcan style
systemcharacterized by theschema11.9.
2.2.3. IC-CONSISTENT AND IC-COMPLETE SETS 2.2.3.1. DEFINmoN. (a) A sentenceA is said to be an IC-consequence of the set of sentences T(or IC-deducible from T) - in symbols:"T he A" - iff Tis empty and he A, or T is nonempty and there exists a conjunction K of some members of r such that he K::J A. (b) Compare (a) with Def. 1.2.3.1, the definition of syntactic consequence in ECI One sees that the difference is merely in the reference to the calculus IC instead of EC . Consider Definitions 1.2.3:2-3. Substitute 'EC' everywhere by 'IC'. Then one gets the definitions of IC-inconsistent/IC-consistent sets, and IC-complete sets. 2.2.3.2. THEOREM. If the members of r are free from the variable Yt» the variable Xa does not occur in the sentence Al , and r he Al then T he V x.A. - For the proof see 1.2.3.4. 2.2.3.3. THEOREM. Every IC-consistent set is embeddable into an IC-complete set. Proof Consider the EC variant of this theorem: Th. 1.2.3.5,and its proof. The proof of our present theorem is essentially the same, with obvious modifications. The starting point is that To is an IC-consistent set of sentencesof a language Lo(i) and we shall define an enlargement L P) of L/i) by introducing new variables in all types. Replace in the quoted proofeverywhere: 'Lot)(! , by 'Lo(i) " 'Lt)(! , by 'L PJ ,, J
'EXTY' by 'TYPM " 'EC' by 'IC'. 2.2.3.4. THEOREM. If r is an Ie-completeset of sentences then: (i) r heA => A E I>: (ii) {(A a= B a ), C} ~ r => " C[B/ A]" E r, (iii) if the term A a occurs in a member of r then for some variable X a , "(A a =x a )" E r, (iv) if "(Cup = Dup)" E r then for all variables xp ,"(C(x) = D(x)" E r. Proof" See Th. 1.2.3.6. 159
2.2.4. MODAL ALTERNATIVES 2.2.4.1. DEFINITION. We say that f/J is a modal alternative to r iff
(i)
r and f/J are IC-complete sets of sentences (of the same language)and
(ii) for all sentences A, if "OA III
E
r
then A
f/J .
E
2.2.4.2. LEMMA. If f/J is a modal alternativeto r then for all Co E f/J , " OC'Er.
Proof. Since "....o C"
E
r
r
is IC-complete, one of " A" and "B :::::> B" are synonymous, and so are A and "A & (B v-B)". Hence, no difference in meaning can be exhibited in IL between the sentences: !Marg tlii~ tliat '13i{{sings. !Mary tliink§ tliat '13i{{sings atufJolinsfeeps ordoes notsleep, (B) Returning to Montague's grammar of a fragment of EnglishLE , the followingtwo mistakes may be qualified as simple oversights of the author (easily corrigable): (a) By (SI2), wa{R.anita{R. E T(IV). Then,by (S4), Mary wa{k§ ani taC( E T(t).
For (S4) says that only the first verb is to be replaced by its third person singularpresent. (b) By (S5) and (S4), lieo Coves lieo E T(t). Then, by (S140) , !Mary Coves her E T(t). According to the translation rule (Trl40) , its translation is Cove. {Marg}{Marg} which means that Mary loves herself. It is very doubtful that the sentence Marg Coves her has such a reading. It seems that (SI4n ) needs somerestrictive clauses. A more essential reflection: In the syntax of LE , Montague does not distinguish extensional and intensional functors (in the same category). According to the translation rules, all functors are treatedas intensional ones. Later on, the introduction of "meaning postulates" will make, nevertheless, a distinction between extensional and intensional functors (in certain categories). All this means that a correct construction of a fragment of English is impossible withouta sharp distinction of extensional and intensional termsin some categories. Then it wouldbe a moreplainmethod to makethis distinction in the syntax already. For example, the category of transitive verbs could be handled by introducing the basic sets B(TVext) and B(TVint), that is, the set of extensional and intensional transitive verbs, respectively. Then, the translation rules for these verbs would be as follows: If A E B(TVint) then A E Cone (~E S ), and A * = A. If A E B(TVext) then A E Cone (Oll), and A * = (Ages [Ax lSvg("(AYls A(vx)(vy) )) E cat, (~t ). [Cf. (M4) of the preceding section.] Thus, the translation of every transitive verb belongsto the same logical type.Let us consider an example of application:
180
Vimf apen]* = jimf* (A[a pen]*)
=::
= (AgEs [Axls.Yg(A(AYlsfimf(Vx)(Yy»])(A(Afos .3Zt [pen. (z)& Yf(A z)]) = = [Ax(Af3z[pen. (z) & Yf(VZ)])(A(Ayfi7Ul"tx)ty»)] ::: :=
(Axl s 3zl [pen. (z) & jimf(Yx)(z)])
E
Cat, (b).
['13ifljitufs apen]* = (Afrs Y f(AtJ3ill)(AVimf apen]*) = (Ax.3z l [pen. (z) & jimf(Yx)(Z)])(A'13ill) = 3z[pen. (z) & jimf('13i{{)(z)].
:=
:=
An analogous method is applicable for the categories CN,
1':' and PRE. How-
ever, one can raise some doubts about the existence of intensional terms in B(eN) and B(IV). Montague's example concerning temperature and rise (cf. (5), (6), and (7) in 2.3.2) apparently proves that these predicates are intensional ones. However, tlie tem-
perature in the sentence tlietemperature rises refers to a function (defined on time) whereas the same term in the sentence
tlietemperature is ninety refers to the value (at a given time moment t) of that function. This is a case of the systematic ambiguity of natural language concerning measure functions (as, e.g., 'the
velocity of your car', 'the height of the baby', 'the price of the wine', etc.). Montague's solution of this ambiguity seems to be an ad hoc one. Instead, a general analysis of the syntax and the semantics of measure functions would be necessary. It seems to be somewhat disturbing that the rules governing the verb be permit the construction of the sentence:
tlie woman is eve'!! man. Its translation is:
3x l [V'Yl (woman. (y) =(y =x) & V'Zt (man. (z) :J (x
=z»] .
It is dubious whether an eve'!! expression can occur after is in a well-formed English sentence. Let us note, finally, that T( e) is empty in L E , for the individual names belong to T(NOM). In fact, the basic categories of LE are t, IV, and eN; by means of these all other categories are definable. Why used Montague e at all? The reason is, probably, that the definition of the function f mapping the categories into logical types (see at the beginning of 2.3.2) became very short and elegant. If he had chosen t, IV, and CN as basic categories the definition of f would grow longer with a single line. The mathematical elegance resulted a grammatical unelegance: an empty basic category. 181
REFERENCES BARCAN, R. C. 1946,'A functional calculusof first order based'on strict implication.' The Journal ofSymbolic Logic, 11. GALLIN, D. 1975, Intensional and Higher-Order Modal Logic. North HollandAmericanElsevier,Amsterdam-NewYork. HENKIN, L. 1950, 'Completeness in the theory of types.' The Journal ofSymbolic Logic, 15. LEWIS, C. I. AND LANGFORD, C. H. 1959, Symbolic Logic, seconded., Dover, New York. MONTAGUE, R. 1970, 'Universal Grammar'. In: THOMASON 1974. MONTAGUE, R. 1973, 'The proper treatment of quantification in ordinaryEnglish.' In: 1HOMASON 1974. SKOLEM, TH. 1920, Selected Works in Logic, Oslo-Bergen-Tromso, 1970,pp.l03136. 1HOMASON, R. H. (ed.) 1974, Formal Philosophy: Selected Papers ofRichard Montague. Yale Univ. Press, New Haven-London. RUZSA, 1. 1991, Intensional Logic Revisited, Chapter 1. (Available at the Dept. of SymbolicLogic,E.L. University, Budapest.)
182