THEORETICAL LINGUISTICS Vol.1
1974
WALTER DE GRUYTER · BERLIN · NEW YORK
ISSN 0301-4428 © 1974 by Verlag Walter de Gruyter & Co., vormals G. J. Göschen'sche Verlagsbuchhandlung—J. Guttentag, Verlagsbuchhandlung—Georg Reimer—Karl J. Trübner—Veit & Comp., 1 Berlin 30, Genthiner Straße 13.—Printed in Germany. All rights reserved, including those of translations into foreign languages. No part of this journal may be reproduced in any form—by photoprint, microfilm or any other means—nor transmitted nor translated into a machine language without permission from the publisher. Typesetting: H. Hagedorn, Berlin—Printing: Mercedes-Druck, Berlin—Binding: T. Fuhrmann, Berlin.
CONTENTS
Articles
BELLERT, IRENA On inferences and interpretation of natural language sentences
215
GABBAY, Dov M., and MORAVCSIK, J. M. E. Branching quantifiers, English, and Montague-grammar
139
HOEPELMAN, JAN PH. Tense logic and the semantics of the Russian aspects
158
ISARD, STEPHAN D. What would you have done if... ?
233
KARTTUNEN, LAURI Presupposition and linguistic context
182
KASHER, ASA Mood implicatures: A logical way of doing generative pragmatics
6
KUTSCHERA, FRANZ VON Indicative conditionals
257
LIEB, HANS-HEINRICH Grammars as theories: The case for axiomatic grammar (part I)*
39
SOAMES, SCOTT Rule orderings, obligatory transformations and derivational constraints
116
Discussions and Expositions
BELLERT, IRENA A reply to H.H. Lieb
287
DASCAL, MARCELO and MARGALIT, AVISHAI A new 'revolution' in linguistics? 'Text-grammars' vs. 'sentence-grammars'
195
LEWIS, HARRY A. Model theory and semantics
271
* Part II of Lieb's article is to appear in volume 2, No. 3 of THEORETICAL LINGUISTICS
EDITORIAL
Theoretical Linguistics is concerned with linguistic theories. This statement seems to be almost a tautology, but it is not, though that may not be immediately obvious. The formulation needs interpretation, especially if it is to be taken as a condensed statement of the editorial policy of this journal. Let me begin by stating the general tendencies determining this policy: rigour of presentation, adequacy with respect to the more complicated aspects of natural languages, and critique of the foundations of the analysis of languages. The first tendency stems from logic, the second from linguistics, and the third from philosophy of language, the three "parents" of theoretical linguistics. But let me explain this in detail.

"Theoretical Linguistics" is not only the title of this journal — it is also meant to designate a sub-discipline of linguistics. Theoretical linguistics, in being concerned with the development of theories about general aspects of particular languages or of language and its uses in general, as well as with the discussion and analysis of the form, scope and applicability of such theories, will have to be contrasted with that part of linguistics which is either concerned with experiments, tests and field-work applied to languages or else with setting up particular empirical hypotheses about the structures of languages based on judgements about (introspective or observable) experiences.

Obviously, there should be a very close interrelation between the different parts of linguistics, including theoretical linguistics. In view of this interrelation, linguistic theories will have to be extensively illustrated by derivations of hypotheses about particular phenomena of languages in order to become accepted as systems of hypotheses, and these special hypotheses will have to be tested to evaluate the empirical adequacy of the linguistic theories from which they are derived.
Many expositions and illustrations of linguistic theories describing fragments or aspects of particular languages, dialects etc. will therefore belong to theoretical linguistics, as well as empirical research trying to assess the justification of some theory in general. But the central concern of theoretical linguistics should be the construction and discussion of general theories of a high degree of methodological and, in particular, meta-linguistic sophistication.
Wherever possible, the concept of an axiomatized, though not necessarily formalized, theory should be taken as a standard at which theories presented or discussed in this journal aim. Glossing over a great number of problems, I assume that (1) linguistic theories (or the central empirical propositions correlated with them) are sets of statements, some of which are empirically true or false; and that (2) the logical relations among the statements of a linguistic theory may be exhibited by an axiomatic system. The logical relations may be deductive or inductive. We may require, in particular, that (3) the logical relations of a linguistic theory should be presented either (a) as an axiomatic deductive theory with descriptive constants (which may be expressed, in its strictest form, in some formal language—the metalanguage of the language to be described); or (b) by definition of a set-theoretical predicate (or a set-theoretical structure). In empirical disciplines aiming at all at strict axiomatization, method (3b) is often applied, especially because it is then possible to draw upon the huge stock of standard mathematical notions (under the assumption that their definition as set-theoretical predicates has already been accomplished in meta-mathematics). Indeed, an axiomatization along these lines may have been an implicit aim of mathematical linguistics, to the extent that the latter developed in relation to generative transformational grammars. The basic intention was apparently to define set-theoretical predicates presenting the structural properties of descriptive constants like "... is a grammatical sentence in L" and "... is a syntactic structure of sentence ... in L", defined for each natural language L. This approach was based on some fundamental notions such as "... is an elementary unit of expression" (e.g.
phonetic unit) and the notion of concatenation, supplemented by definitions of auxiliary notions such as "grammatical construction", "grammatical (or syntactical) rule" etc. The basic notions could be taken to correspond to the set-theoretically defined mathematical notions "element of the generating set" and "binary operation" of a free semi-group with a generator set. In connection with these definitions particular emphasis has been placed upon the requirement that the definitions should be recursive. Since, however, the sets corresponding to the notions to be defined could not be taken as finite, methods of recursively constructing the elements had to be applied. This led to speaking of devices that generate sentences, from which some people have mistakenly supposed that the particular method of recursive definition applied might serve as a model of psychological procedures performed by speakers, and possibly also by hearers, of the language for which grammaticality of sentences and their structure had been defined. In fact, only the recursive definition of these notions
and no psychological procedures were in question. (It could at most be said that these notions, if adequately defined, put constraints on possible descriptions of psychological procedures.)

Recently, a complex of set-theoretical predicates (or set-theoretical structures) for syntax and semantics of natural languages has been advanced by applying the methods of model-theory and intensional logic. In complexity and, perhaps, in scope, the resulting systems seem to be at least comparable to generative transformational systems. But in addition to syntax, these proposals included as an essential component an axiomatized theory of indirect semantic interpretation, by defining a translation function into a language with a syntactic structure that is better suited to the definition of a direct model-theoretic interpretation than are natural languages. Recently, method (3a) has been used for axiomatization within the generative theory of grammar and the first steps have been taken towards axiomatizing grammars. Further developments in the field of axiomatic grammars will be presented in the first issues of this journal. There may be alternative ways of presenting precise linguistic theories, but they will have to be comparable in rigour to the standards of formal theories that are already well understood and analysed.

Certain judgements of value may seem to be implicit in my discussion of various approaches to theoretical linguistics. Let me therefore state explicitly that these judgements are of necessity subjective. I do not believe, though, that this should exclude such judgements from an editorial. Any bias that might result from my own convictions will, as a matter of editorial policy, be corrected by consultation of the members of the editorial board, who will not necessarily agree in all their views on theoretical linguistics.
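The editorial's point that grammaticality rests on a recursive definition over a free semigroup of concatenated units, not on a psychological procedure, can be illustrated with a toy sketch. This is our own illustration, not the editorial's: the generator set and the rules are invented for the example.

```python
# A toy free-semigroup view of expressions: elementary units are the
# generators, concatenation is the binary operation, and "grammatical
# sentence" is picked out by a recursive definition over strings.

GENERATORS = {"the", "old", "pope", "sleeps"}  # hypothetical elementary units

def concatenate(x, y):
    """The binary operation of the free semigroup over GENERATORS."""
    return x + " " + y

def is_noun_phrase(s):
    """Recursive definition: 'the pope' is an NP, and so is
    'the old <rest>' whenever 'the <rest>' is an NP."""
    words = s.split()
    if words == ["the", "pope"]:
        return True
    if len(words) > 2 and words[0] == "the" and words[1] == "old":
        return is_noun_phrase("the " + " ".join(words[2:]))
    return False

def is_grammatical_sentence(s):
    """A sentence is an NP concatenated with the verb 'sleeps'.
    The definition generates an infinite set without describing
    any psychological process of speakers."""
    words = s.split()
    return (len(words) >= 2 and words[-1] == "sleeps"
            and is_noun_phrase(" ".join(words[:-1])))
```

Because the NP rule calls itself, infinitely many strings ("the old old ... pope sleeps") fall under the definition even though the definition itself is finite.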
Admitting all this, I would still suggest that certain consequences follow concerning the style of presentation preferred in this journal: currently established mathematical and logical conventions of formulation should be followed as far as possible. Some readers may, perhaps, object that too much emphasis has been placed upon the form in which theories are presented. It is certainly true that the force of a theory does not derive from its rigour but from its leading ideas and from the answers that can be given to the following questions: Are these ideas revealing and fruitful? Do they offer important generalizations and insights? Are they adequate and applicable? Can they be related to neighbouring fields or do they even sustain notions that are basic in a number of neighbouring fields, etc., etc.? True enough! But it is also true that the reliability and the continuity of development of ideas does depend on the perspicuity, rigour and systematic controllability of the theories in which they are presented. It may be granted that fruitful ideas are very often not stated in a rigorous form in the
first instance; a number of attempts at finding the best formulation will have to be allowed. However, the very attempt to integrate an idea into a theory will demand precise and rigorous formulation. Nevertheless, rigour of meta-linguistic presentation is only a virtue in an empirical discipline if it is not empty, i.e. provided that the proposed theory aims at empirical adequacy. Progress in logical and theoretical analyses of languages during recent years should allow for the development of theories that combine rigour with empirical adequacy. The situation has changed from that of the early days of logical analysis of languages, when rigour could only be achieved at the sacrifice of adequacy in the description of natural languages. Still, whereas, because of their different training, the logician has had the advantage over the linguist with regard to meta-linguistic rigour, the position is reversed when it comes to meeting the requirements of empirical adequacy. Logicians tend to underestimate the complexities of natural languages and to ignore whatever does not fit into their framework of description.

Finally, we have to acknowledge that strict adherence to the basic principles of a discipline may, at times, be detrimental to its fruitful development. It is the task of the philosopher to insist in season and out of season on the need for reflection on the basic principles and to doubt the validity of tacit assumptions. And of course, this also applies to the conception of theoretical linguistics as presented in this editorial. In particular, such points as the relation of theoretical linguistics to formalization, which are somewhat controversial even among the members of the editorial board, should be discussed in the journal itself. It is to be hoped that logicians, linguists and philosophers of language—and at times perhaps others such as the computer scientist—will collaborate in the development of theoretical linguistics.
It should be borne in mind, however, that contributions are variously assessed in different disciplines: our field will not be an exception. In order to aid mutual understanding, authors should, therefore, provide their contributions with careful introductions which present the problems succinctly, as well as the criteria for a satisfactory solution, and the methods applied in trying to attain it. This effort on the part of the authors should be met by the reader with patience, tolerance and willingness to learn even from approaches that seem unfamiliar at a first reading. In order to further mutual understanding as well as more thorough acquaintance with the different approaches of theoretical linguistics, the journal will have a special section called "Discussions and Expositions", containing detailed critical analyses as well as contributions designed to give an easily readable introduction and exposition of more complicated ideas and theories which are of interest. At present, for example, a number of aspects of model-theoretic grammars seem to need such an introduction. Authors who are willing and able to provide expository contributions of this type are invited to submit them to the journal. It is up to the authors and readers of this journal to contribute to the development of theoretical linguistics, the authors by providing
not only original contributions of high standard, but also pedagogically perspicuous and lucid introductions to or critical discussions of the different ways of approaching a common field of interest, the readers by analyzing the results published with an appropriate combination of criticism and tolerance and by applying them fruitfully. H. Schnelle
ASA KASHER
MOOD IMPLICATURES: A LOGICAL WAY OF DOING GENERATIVE PRAGMATICS*
In this paper we present an extension of the model-theoretic framework of semantics in which some pragmatical aspects of natural language can be treated adequately. In ch. 1 we specify the scope of generative pragmatics. Ch. 2 outlines the formal framework of semantics. In ch. 3 we pose the problem of non-indicative sentences and in chs. 4 and 5 we reject the solutions suggested by Stenius, Åqvist and Lewis. In ch. 6 we define some pragmatical concepts—preconditions and implicatures of various types—using them in ch. 7 to present a pragmatical characterization of moods in terms of preference-implicatures. Some ramifications are discussed. In ch. 8 we draw a distinction between basic preference-implicatures ("pragmemes") and derived ones. The derivations involve communication rules. In ch. 9 we outline the extended formal framework. Finally, in ch. 10, we present some open questions.
It may be that the next great advance in the study of language will require the forging of new intellectual tools that permit us to bring into consideration a variety of questions that have been cast into the waste-bin of 'pragmatics'.
N. Chomsky, Form and Meaning in Natural Language

1. Introductory
Ideal speakers live in speech-communities, communicating with their fellows regularly and happily. They ask questions and answer questions, as all of us, real speakers, do; some of them issue commands, as some of us do; and a few of them give
* The preparation of this paper was partly supported by the Deutsche Forschungsgemeinschaft (Bonn-Bad Godesberg). I am grateful to John Bacon, L. Jonathan Cohen and Helmut Schnelle for pointing out some important, apparent counterexamples. I owe special thanks to Helmut Schnelle for extensive comments on an earlier version of the paper. As usual, I am grateful also to Yehoshua Bar-Hillel for some comments and remarks.
verdicts, as a few of us do. However, unlike any of us, ideal speakers never fail to perform happily speech acts they intend to carry out. They utter an instance of sentence S in context C only if S is linguistically appropriate to C; in their discourse, preconditions never fail to obtain and presuppositions are always true.

Pragmatics is the study of that part of linguistic knowledge and behavior which pertains to speaker-sentence-context relations. In a homogeneous community of ideal speakers the linguistic behavior matches strictly the linguistic knowledge, but this is not the case when real speakers are under consideration. In that case there is the distinction between Pragmatics in a narrow sense—the study of the pragmatical competence of real speakers, which is, so to speak, the study of the pragmatical aspects of the behavior and knowledge of ideal speakers—and Pragmatics in a broad sense—the study of the pragmatical aspects of the linguistic behavior of real speakers. This paper is confined to Pragmatics in the narrower sense. A pragmatical theory purports to describe and explain adequately part of linguistic competence. When such a theory incorporates a system of explicitly presented postulates and rules that govern the pragmatically basic relations, it will be called a generative pragmatical theory.

Pragmatical investigations are intertwined with semantical considerations. We side with Montague in the dispute about representation of meanings: logical languages and model-theoretical systems of interpretation are the backbone of adequate semantical theories of natural languages.1 On the other hand, we take sides with all those linguists who have argued to the effect that at least some transformations have no semantical bearings, and hence, pace Montague, not every linguistic rule (of a syntactical nature) should be semantically justified.
Dubbing this approach "logical generativism", we may call the present work a logical way of doing generative pragmatics. In this paper we attempt to provide a framework in which the performative functions of sentences can be described and explained. Some aspects of J. L. Austin's views of speech-acts have been systematized in J. R. Ross' work on declarative sentences and J. F. Sadock's works on hyper-structures, but they are restricted to syntactical considerations.2 We are interested in the pragmatical point of view.
2. Semantical Preliminaries3
Semantics is a hydra-headed label, but we are interested here in just two heads. On the one hand it means the science of meaning and on the other hand it refers to "a discipline which, speaking loosely, deals with certain relations between expressions of a language and the objects ... 'referred to' by those expressions ..., or possibly ... [the] 'states
1 Richard Montague's papers will be published in a collection by Yale University Press. Meanwhile, consult Montague (1970) and Montague (1973).
2 Ross (1970) and Sadock (1969).
3 The reader who is fluent in some version of model-theoretical semantics may skip this chapter. See, however, the fourth conclusion of this chapter.
of affairs' described by them." (Tarski 1944, 345). Since the cleavage between theories of meaning and theories of reference is endorsed both by philosophers and linguists4, let me disclose my sticker at the very outset: Unionist. A general theory of reference provides foundations of and a framework for an adequate theory of meaning for natural language. Replacing slogans by examples, consider first the phrase
(1) the former Pope.
This string of words does not refer on its own merits to anybody on earth, even when an appropriate syntactical structure is attached to it. Otherwise it could not be used to refer to different people on different occasions. Consider the following schematic description of the recent papacy:
(Figure 1)
An utterance of (1) at t1 was used to refer to Pius XI while an utterance of it at t2 was used to refer to Pius XII. Generally,
(2) Ref("the former Pope", C) = f1(t_C, P_C)
where C is the context of utterance of (1), t_C is the time of that utterance, and P_C is a succession of popes, say, from St. Peter and on, future popes not excluded. Where the actual state of the papacy is not clear we can stick, say, to the official view of the Catholic Church or to an alternative view. The function f1 transfers us, so to speak, from a given interval in figure 1 to the former interval, if any. The second argument of f1(x, y) specifies the succession under consideration, the first argument specifies which interval in this succession serves as a starting point for the operation of f1, which in turn moves us from x to the former interval of x, if any. Notice that f1(t_C, P_C) is sometimes undefined. Assuming that the consecration of the second pope, St. Linus, took place in 67, f1 is undefined for every earlier point. Presuming that antipopes were popes, f1 is undefined whenever there are two or more former popes; thus f1(1122, P_C) is undefined, because both Gelasius II and Gregory VIII were then former popes. The function f1 is defined at t if and only if any pope at t is an immediate successor of exactly one pope in P_C. We can now see what the contribution of each element of (1) is to the reference made by any utterance of (1). The word "Pope" directs us to P_C, the word "former" brings forward t_C and then f1, and finally "the" introduces restrictions on the domain of f1, mentioned in the former paragraph.
4 Quine (1961) 21f, 47ff and 130ff. Cf. Katz (1966).
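The partiality of f1 can be sketched concretely. The following is our own illustration, not Kasher's formalism: a succession is represented as an ordered list of reign intervals (the data are simplified and hypothetical), and f1 returns None exactly where the text says it is undefined.

```python
# A minimal sketch of the partial function f1: given a time point t and
# a succession of reigns, move from the interval containing t to the
# former interval, if any. None models "undefined".

# Hypothetical succession data: (name, reign_start, reign_end), ordered.
# (Overlapping reigns -- the antipope case -- are not modelled here.)
P_C = [
    ("Pius X", 1903, 1914),
    ("Benedict XV", 1914, 1922),
    ("Pius XI", 1922, 1939),
    ("Pius XII", 1939, 1958),
]

def f1(t, succession):
    """Return the occupant of the interval before the one containing t."""
    for i, (name, start, end) in enumerate(succession):
        if start <= t < end:
            if i == 0:
                return None        # no former interval: f1 is undefined
            return succession[i - 1][0]
    return None                    # t falls in no interval: f1 is undefined
```

With this data, f1(1940, P_C) yields "Pius XI", matching the text's observation that an utterance of "the former Pope" during Pius XII's reign refers to Pius XI; before the second reign, f1 is undefined.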
What would be the reference made by using (1) at t1, had John XXIII been the successor of Pius XI? Obviously, the reference made would be to Pius XI rather than to Pius XII. The word "Pope" in (1) would then direct us not to P_C but to P_C', which is a possible history of the papacy, referred to in a possible context of utterance C' at which (1) was used.
(Figure 2)
One can imagine other contexts of utterance in which (1) is used to refer to popes in other possible courses of events, including different possible papal successions. However, (2) holds for each of these possible contexts of utterance, where "P_C" stands for what is the actual papal succession from the point of view of C rather than from our point of view, or from the present Pope's point of view. Summarizing what we have found so far, we maintain that when (1) is used in a context C to refer, the reference is a function of some aspects of the context C; one is the time-index t_C, involved in the use of "former", another is the exponent (as opposed to index) e_C, which is the possible world or possible history under consideration, that is involved in the use of "Pope". We take the meaning of the phrase "the former Pope" to be fully captured by a satisfactory characterization of the function Ref("the former Pope", C). One way of characterizing this function is sketched in (2), which needs however some further clarification. The reference of "the former Pope" in C is a function of P_C, but how exactly should P_C be characterized? It is not the reference of "the Pope" in C, because it specifies a whole succession of popes and not just the one that heads the Roman Catholic Church in C, according to some view. On the other hand, given some means G_C for determining for a given time point t and a person e whether e is a pope at t or not, we can define the appropriate papal succession P_C in terms of G_C. Let us then amend (2) to read
(3) Ref("the former Pope", C) = f1(t_C, G_C).
Again, one can imagine different C's that determine different G's; this is why we have G_C rather than just G. Every G_C is related to some possible history; the answer to the question whether e is a pope at t is relative to this possible history.
Trying to get rid of this relativity we take G to be the means for determining, for a given possible history h, a time point t and a person e, whether e is a pope at t in h. Again, a satisfactory characterization of G will be here considered to capture in full the meaning of the word "Pope". G_C is then the function G(h_C, ·, ·), where h_C is the possible history under consideration at C and t and e are appropriate variables. Using the λ-notation for specification of variables, we have:
(4) Ref("the former Pope", C) = f1(t_C, λt λe G(h_C, t, e)),
and, consequently, the meaning of (1), which is λC Ref("the former Pope", C), is represented as a function of the meanings of "the", "former" and "Pope" in (5):
(5) H = λC f1(t_C, λt λe G(h_C, t, e)),
from which it is also clear that the context C in which (1) is used to refer contributes significantly to the reference, both by index and exponent. Notice that we take a satisfactory characterization of G to be first of all a sufficient condition for capturing in full the meaning of the word "Pope". Whether it is a necessary condition or not is open to debate. We cannot do here more than declaring that for some expressions it seems to us to be a necessary condition as well. "Pope" is among these expressions. However, we admit that in many cases this could not be a necessary condition. For an extreme anti-necessitarian view, consult Schnelle (1973). Having seen how the meaning and reference of the phrase "the former Pope" are related to semantical properties of its elements, we turn now to see how this phrase plays the role of an element in bigger phrases.
(6) The former Pope was Italian.
Denoting by "I" the meaning of "Italian", which is a function I(h_C, t, ·), similar to the function G we saw earlier, and ignoring the tense of (6), it is clear that the meaning of (6) is a function of H (the meaning of "the former Pope") and I. Since both H and I have contextual arguments, viz. h_C and t_C, it should come as no surprise that the meaning of (6) turns out to have contextual arguments as well. Why (6) depends on the context of utterance should be clear by now. On one occasion, given appropriate index and exponent, the proposition made is true—John XXIII was in the actual history, say, according to the official view of the Catholic Church, an Italian; on other occasions the produced proposition is false—John XXII was, according to the same official history, a Frenchman, and a possible history can be outlined in which John XXIII was not an Italian.
The meaning of (6) will then be taken to be fully captured by an adequate characterization of the function that relates a context of utterance—its indexes and exponent—to the truth-value of the proposition made by using (6) at that context. An adequate characterization of the meaning of (6) should show how the meaning of (6) depends on the semantic properties of its elements. I(h_C, t, e) is a function from exponents, time-points and persons to truth-values. Given any context of utterance C, we determine by I whether the reference of (1) in that context was an Italian, during the appropriate period of the possible history which is the exponent of C. In other words, we apply the function I(x, y, z) to a certain triple of arguments, as follows: for x we substitute "h_C", denoting the exponent of C which is the possible history under consideration, and for z we substitute "H(C)", which refers to the same person, if at all, that (1) refers to in C. What we substitute for y depends on an analysis of the tense of (6) which is beyond the scope of our discussion.
Altogether we have:
(7) I(h_C, t'_C, H(C))
which is either true or false, or undefined in case the last argument fails to refer to anything. The following partial function
(8) λC I(h_C, t'_C, H(C))
from contexts of utterance to truth-values represents the meaning of (6).

The purpose of this section has been to hint at our semantic framework. 1) A semantic theory of a language is a theory of functions that are related to phrases of the language and their semantical properties; given a phrase x, a word or a sentence, and an appropriate context of utterance C, the function attached by the theory to x supplies the value at C, if any. Values of some noun-phrases are individuals referred to by using x in an appropriate C. 2) Values of other phrases, e.g. predicates, adjectives and prepositions, are functions that contribute to the functions attached to other phrases in which the former ones appear. Generally, a linguistic adequacy criterion for a semantic theory is that the function attached by the theory to a phrase x be determined by the functions attached to the elements of x and by its syntactic structure. 3) Functions may not be defined for some arguments; for example, a noun phrase x may fail to refer whenever uttered in contexts in which presuppositions induced by x do not obtain. 4) The contribution of any C is two-fold: first, it provides values of indexes: speaker, addressees, time and place of utterance, and so forth. Secondly, it fixes an exponent, including a possible factual background for x's functioning. An exponent may be a possible state of affairs, a possible succession of such states, or, generally, any class of such states. We use the term "exponent" in order to be neutral with respect to the extent and structure of those classes. Justification and detailed development of such frameworks are not our job here and we refer the reader to works by Montague, Lewis and others.5 In what follows some problems will be considered within such a framework.
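The compositional treatment just summarized can be sketched in miniature. This is our own construction, not Kasher's: the possible histories, the extension of "Italian" and the context representation are invented toy data, and None models undefinedness.

```python
# Meanings as (partial) functions from contexts of utterance to values:
# a context supplies a time index t_C and an exponent h_C (a possible
# history); word meanings like G ("Pope") and the extension of "Italian"
# are relativized to that exponent.

from dataclasses import dataclass

@dataclass
class Context:
    t: int        # time index t_C
    h: str        # exponent h_C: which possible history is in play

# Hypothetical possible histories: exponent -> ordered reigns.
HISTORIES = {
    "actual": [("Pius XI", 1922, 1939), ("Pius XII", 1939, 1958)],
}

ITALIANS = {"Pius XI", "Pius XII"}   # simplified extension of "Italian"

def G(h, t, e):
    """Is person e a pope at time t in possible history h?"""
    return any(n == e and s <= t < end for n, s, end in HISTORIES[h])

def H(C):
    """Meaning of 'the former Pope': its reference at C, or None."""
    succ = HISTORIES[C.h]
    for i, (name, s, end) in enumerate(succ):
        if s <= C.t < end:
            return succ[i - 1][0] if i > 0 else None
    return None

def meaning_of_6(C):
    """The partial function of (8) in miniature: the truth-value of
    'The former Pope was Italian' at C, undefined (None) when
    'the former Pope' fails to refer at C. Tense is ignored."""
    ref = H(C)
    if ref is None:
        return None
    return ref in ITALIANS
```

Evaluating meaning_of_6 at a 1950 context of the "actual" history yields a truth-value, while at a context where no former pope exists in the (truncated) succession the function is undefined, mirroring conclusion 3) above.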
3. The problem of other moods
We assume that the meaning of the sentence
(9) He is still alive
is captured satisfactorily by an adequate presentation of some partial function from contexts of utterance to truth-values. Such a presentation seems possible,
5 Lewis (1972) is an excellent introduction to Montague (1973). Consult also Thomason (1972). Notice that we have not subscribed here to any particular logical language or theory, but to the general conception of model-theoretical semantics for natural languages.
and even natural, because whenever (9) is uttered in an appropriate context, what is said may be true or false. A context of utterance is inappropriate for (9) if, for example, it does not specify the reference of "he" in (9). Thus, lack of information may cause the function attached to (9) to be undefined for some contexts. Consider now the following sentences:
(10) May he be still alive!
(10') Is he still alive?
(10") Who is still alive?
Trying to attach to each of these sentences a function, depicting its meaning along similar lines, we have no difficulty in circumscribing the domains of these functions—the classes of elements serving as arguments of these functions—viz. contexts of utterance determining indexes and exponents. What are the values of these functions when defined?—that is the question. Assuming that no information is lacking in a context C of uttering (10), we are still inclined to say that what was said was neither true nor false. Since we assume that no information pertaining to (10) is lacking in C, the third alternative, viz. that the truth-value is undefined, is also excluded. Our functions are defined for some C's and their values seem not to be 'true' or 'false'. What are they, then? Without adhering to any solution of that problem, we mention two useful distinctions that have been drawn in relation to some solutions. Philosophers since Frege (1918/19) distinguish between two logical elements of a sentence, the element which the sentences (9), (10), (10') and (10") share, and what they do not share. The former is the "descriptive content" of the sentences, indicatives and non-indicatives as well; the latter is the "mood" of the sentence—indicative, interrogative, imperative, etc.—indicated in many cases by a performative verb or by the syntactical form of the sentence. E. Stenius (1964) 168 pointed out that there are different ways in which a mood/radical distinction can be applied, and introduced the terms "grammatical mood" and "semantical mood".6 Obviously, the grammatical mood of a sentence may be different from its "deeper" mood:
(11) You will pack at once and leave this house.
This sentence is of the indicative mood in the grammatical sense, but in many contexts it is of the imperative mood, pragmatically. In what follows we shall confine ourselves to a discussion of semantically non-indicative sentences. Our major problem in the present paper is how to present the meanings of their moods, within the formal framework outlined earlier. However, we shall have something to say about the use of indicative sentences as well.
6 Jespersen used "notional" for the same purpose (1924) 319. Better terms would be "syntactical mood" and "pragmatical mood" but we shall use Stenius' terms.
Mood implicatures: a logical way of doing generative pragmatics
4. Stenius and Åqvist on the other moods
One theory of meaning for (semantically) non-indicative sentences postulates the existence of a sentence-radical and a sentence-mood for every sentence, indicative and non-indicative alike. Thus, at some level of representation the suggestive notation of (16)–(19) can be used for (12)–(15), respectively:

(12) Do you live here now?
(13) It is obligatory for you to live here now, or live here now!
(14) You live here now.
(15) It is necessary that you live here now.

(16) ?p
(17) Op
(18) Ip
(19) INp7
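The pairing of a mood operator with a sentence-radical in (16)–(19) can be pictured with a toy data structure. Everything below is an illustrative assumption, not the paper's formalism: the operator names are taken over as plain strings, and the naive surface realizations are invented for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sentence:
    mood: str       # "?", "O", "I", or "N", as in (16)-(19)
    radical: str    # a sentence-radical, e.g. "you live here now"

def realize(s: Sentence) -> str:
    """Toy sketch of a mood operator acting as a syntactical anchor:
    one radical surfaces differently under different moods."""
    if s.mood == "I":            # indicative, as in (18)
        return s.radical.capitalize() + "."
    if s.mood == "?":            # interrogative, as in (16)
        return "Is it the case that " + s.radical + "?"
    if s.mood == "O":            # obligatory/imperative, as in (17)
        return "See to it that " + s.radical + "!"
    if s.mood == "N":            # necessity, as in (19)
        return "It is necessary that " + s.radical + "."
    raise ValueError("unknown mood operator: " + s.mood)

p = "you live here now"          # the shared radical, cf. (20)
print(realize(Sentence("I", p)))   # -> You live here now.
print(realize(Sentence("?", p)))
```

The point of the sketch is only that the mood component is a separate formal object combined with the radical, so that the same radical can feed several surface forms.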
Notice the difference between Ip and p. The latter does not denote the sentence (14). It is a sentence-radical which does not have a mood, and it may be expressed in English by (20) rather than by (14):

(20) that you live here now.

The operators ?, I and others, which combine with sentence-radicals to form representations of sentences, play a role both in the syntactical and the semantical processes. The syntactical role of such operators is well known, having been used by some linguists, though under different terms. Katz and Postal (1964) 86ff and Katz (1972) 201ff introduced a special morpheme Q into the underlying structures of yes-no questions, and Ross (1970) argues to the effect that declarative sentences are derived from deeper structures including as their main verb a complex symbol, [+V, +performative, +communication, +linguistic, +declarative], which is defended on syntactical grounds. Whether these detailed analyses stand up to criticism or not is not of our concern here (cf. Fraser 1970). It suffices to notice that the postulated operators used in (16)–(19) provide syntactical anchors, that is, starting points for a transformational process accounting for the surface differences between (12) and (14). The main point of Åqvist's variant of Stenius' theory is that the sentence-radical specifies what should be true somewhere and the sentence-mood indicates in a sense where (Åqvist 1967). When the sentence (14) is under consideration, what is true in some possible world is (20)—that you live here now—which is the sentence-radical of (14). The sentence-mood of (14) indicates in which possible world (20) is true, according to what is conveyed by (14) in its standard
7 Stenius' own notation for (19) is "Np". Cf., however, his paper (1967). We shall not discuss the differences between Stenius' two works.
Asa Kasher
use. Since the semantical mood of (14) is the indicative, the possible world indicated is usually the actual world; more accurately, if a context of utterance C has the indexes ⟨addressee: John⟩, ..., and "⇒w" stands for the relation of weak implication, we have:

(R)  S ⇒w PREF_e,C(v1, v2)
     S ⇒w PREF_e,C(u1, u2)

where ... and ...
Applying (R) to (82), one finds still another weak implicature of (60). This process can of course be iterated indefinitely many times, yielding new weak implicatures of (60). The set of preference relations involved, characterizing the semantical mood of (60), includes this infinity of weak implicatures of (60) as well as (79) and (80) and, indeed, similar consequences. But this infinite class is finitely definable: it includes the basic preference relation (74) and is closed under (R) and under logical deduction rules. All the characteristic classes of preference relations are similarly definable. The basic preference relations of these classes we call "pragmemes".22
22 Kasher (1973). See also Kasher (forthcoming). We do not develop here the due distinction between uninterpreted pragmemes and interpreted ones. See chapter 9, below.
23 We introduced the term "pragmemes" in our lecture at the IVth Congress of Logic, Philosophy and Methodology of Science (Bucharest 1971). Since then it has been put to another use by Konrad Ehlich and Jochen Rehbein.
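The idea that an infinite class of implicatures is nevertheless finitely definable — a basic element together with closure under a rule — can be pictured abstractly. The rule `step` below is a mere placeholder standing in for (R), whose exact content is not reproduced above; only the closure construction itself is the point.

```python
from itertools import islice

def closure(basic, step):
    """Lazily enumerate basic, step(basic), step(step(basic)), ...
    A finite definition (one element plus one rule) yields an infinite class."""
    x = basic
    while True:
        yield x
        x = step(x)

# Placeholder "preference relation": a tag plus an iteration depth,
# standing in for the successive weak implicatures produced by (R).
basic_pref = ("PREF", 0)
step = lambda pref: ("PREF", pref[1] + 1)   # one application of the rule

first_four = list(islice(closure(basic_pref, step), 4))
print(first_four)
```

A lazy generator is a natural computational analogue of a finitely definable infinite set: the definition is finite, and any initial segment of the class can be produced on demand.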
9. A formal theoretical framework, outlined
Earlier we have made informal use of a variety of preference relations between possible worlds, at some contexts of utterance. We shall now show how to formalize this conception within a theoretical framework, meant to be a basic, pragmatical component of an adequate linguistic theory. Let L be a natural language. A linguistic theory for L is a system M of postulates and rules, formulated in a language L′ serving as a meta-language of L. First, we outline some elements of this system.

1. Sentences. M specifies recursively the set of sentences of L. By "sentences" we do not mean just strings of phonological units or of letters, but pairs each consisting of such a string and some syntactical surface parsing and lexical specifications of it, in a way that makes sentences syntactically24 and lexically unambiguous.25 In this we deviate from some common ways of understanding the vague term "sentence", but this deviation will make our exposition simpler.

2. Semantical representations. L′ includes a set of uninterpreted formulae by means of which semantic analyses of sentences of L will eventually be represented. We call these formulae "creative". (This term is the pragmatical counterpart of "generative".) These might be the well-formed formulae of a second-order predicate language with intensional operators of various kinds, with many-sorted and restricted variables, branching quantifiers and other logical devices which will be found useful.26 M also includes general means for considering interpretations of the creative formulae. Adopting Thomason's terminology, a valuation consists of an interpretation assignment, which associates semantical values—intensions—with atomic expressions, and a method of projection (or better: a method of combination), which determines the semantical values of molecular expressions recursively.
Where intensions are taken to be functions whose domains consist of classes of possible worlds or other semantical exponents (see section 2 above), a valuation also includes an interpretation structure which specifies a class of semantical exponents. In order to capture some linguistic phenomena, e.g. semantical presuppositions, we admit valuations that are partial in the sense that the functions which serve as semantical values of formulae or other expressions may be left undefined for some elements in their domains.
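A partial valuation of this kind can be sketched concretely. The example sentence below (the textbook "king of France" case) and the toy worlds are assumptions for illustration only, not drawn from the paper; the point is simply an intension that is undefined at worlds where a presupposition fails.

```python
# Toy partial valuation: an intension maps possible worlds to truth values,
# and is left undefined (None) at worlds where a presupposition fails.

worlds = ["w1", "w2", "w3"]

# "The king of France is bald" presupposes that a king of France exists.
has_king = {"w1": True, "w2": True, "w3": False}
king_bald = {"w1": True, "w2": False}

def value(world):
    """Return True/False where the intension is defined, None otherwise."""
    if not has_king[world]:
        return None          # presupposition failure: valuation undefined here
    return king_bald[world]

print([value(w) for w in worlds])   # [True, False, None]
```

Representing undefinedness explicitly (here by `None`) rather than forcing a third truth value is one simple way of modelling "partial" in the sense just described.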
24 Notice that only surface syntax is involved here.
25 For a similar conception of "sentence" see my paper (1972).
26 For the usefulness of intensional operators, consult Montague, loc. cit.; concerning many-sorted and restricted variables, see Kasher (1973a); a discussion of branching quantifiers is included in some unpublished papers of Hintikka; see also Gabbay and Moravcsik, this issue.
3. Contexts of utterance. Another facility of M is the means it provides for considering classes of possible contexts of utterance. A major duty of M is to characterize adequately the notion of appropriateness of contexts for uttering sentences. At the moment we assume that a context of utterance C determines the values of the meta-variables α (the speaker) and β (the addressee or addressees), and the exponent eC.

4. Translation. M is meant to render ultimately adequate descriptions and explanations of meaning and use. For that purpose, M characterizes a translation function that maps pairs (D, C), "where D is a domain statement and C is a structural change statement on D" (310). "A domain statement is any expression of the form D1 or D1 ∧ D0" (309), where D1 is a finite string of symbols from a certain set and D0 is a 'Boolean domain statement'; the latter concept is defined recursively as either (a) a finite string of symbols from the given set or (b) any expression of one of the forms (D1 ∨ D2), (D1 ∧ D2), ~D1, where D1 and D2 are Boolean domain statements (309). A structural change statement on D (D a domain statement) is again a string of symbols. The expressions "domain statement" and "structural change statement" seem to suggest that the members D and C of a transformational rule are intended to be interpreted as statements. There is, however, not the slightest hint how this might be achieved. In the case of the domain statement only a single interpretation suggests itself: Given a certain 'graph assignment', the domain statement might be said to denote the set of 'trees' that 'satisfy' the statement for that assignment (cf. the definitions l.c. 311; "tree" is not defined). For the 'structural change statement', I do not see any interpretation at all. The transformational rule as a whole remains uninterpreted.
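The recursive clause (b) for Boolean domain statements has an obvious computational shape. The sketch below is a toy reconstruction, not Ginsburg-Partee's formalism: what counts as satisfying an atomic statement is left abstract (a set of atoms a 'tree' is stipulated to satisfy), since their actual satisfaction clauses are not reproduced in the text above.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Atom:
    name: str

@dataclass
class Or:          # the form (D1 v D2)
    left: "Stmt"
    right: "Stmt"

@dataclass
class And:         # the form (D1 ^ D2)
    left: "Stmt"
    right: "Stmt"

@dataclass
class Not:         # the form ~D1
    inner: "Stmt"

Stmt = Union[Atom, Or, And, Not]

def satisfies(atoms: set, d: Stmt) -> bool:
    """Recursive satisfaction, mirroring the recursive shape of clause (b)."""
    if isinstance(d, Atom):
        return d.name in atoms
    if isinstance(d, Or):
        return satisfies(atoms, d.left) or satisfies(atoms, d.right)
    if isinstance(d, And):
        return satisfies(atoms, d.left) and satisfies(atoms, d.right)
    return not satisfies(atoms, d.inner)

d = And(Atom("NP"), Not(Atom("VP")))
print(satisfies({"NP"}, d))        # True
print(satisfies({"NP", "VP"}, d))  # False
```

On this reading a domain statement does denote something relative to an assignment (the set of trees satisfying it), which is exactly the "single interpretation" mentioned in the text; what it does not do, by itself, is state anything.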
If an interpretation is sought, there is only one that would seem defensible: The rule (D, C) denotes a two-place relation (more specifically, a function) between 'trees', viz. between trees satisfying D and their 'transforms'.10 Even so, a transformational rule is not understood as a statement. There is an informal way of specifying transformational rules which makes use of an arrow notation. This notation reappears in Ginsburg and Partee (1969) 313 as "τ ⇒T,f τ'", which, on the basis of a similar formulation l.c., may be read as "T and f change τ into τ'" (T a transformational rule, f a graph assignment, τ
8 For the two terms, see Chomsky (1972a) § 3.2.
9 Ginsburg and Partee characterize the scope of their paper as follows (297f): "As yet, no mathematical model has been given which encompasses most of these different versions of a T-grammar. The purpose of this paper is to propose one such mathematical model". (298, fn.:) "One important exception is the notion of "syntactic features" described in Chomsky (1965) and included in most subsequent T-grammars."
10 Again, denotation would have to be relativized to a graph assignment.
Grammars as theories: the case for axiomatic grammar (part I)
and τ' trees).11 By binding the variables or replacing them by constants, we would obtain an expression that could indeed be interpreted as a statement. However, this would be a statement on transformational rules, graph assignments, and trees. From Ginsburg-Partee it is fairly clear that the rules belong to a grammar of a language, not to the intended subject matter of the grammar, however defined. Therefore, any such expression, although interpreted as a statement, would not (or not completely) be interpreted as a statement on the intended subject matter of the grammar. Given the definition of "transformational grammar" l.c. 315, the transformational rules and their two components are the only entities of the transformational component for which an interpretation as statements might be considered.12

Ginsburg and Partee formalize the transformational rules of syntax only. In the Later Model the rules of the phonological component came eventually to be interpreted as being, in a sense, transformational too. The most explicit formulation of this view is found in Chomsky and Halle (1968) 20: The phonological rules are said to be 'local transformations' in the sense of Chomsky (1965).13 Such transformations are apparently covered by the Ginsburg-Partee formalization if it can be extended to allow for the 'features' of phonology, which seems possible.14 In this case, our previous results carry over to the phonological rules. The situation is, however, complicated by the fact that Chomsky and Halle develop a formalization of their own (l.c. 390–399) which makes the phonological rules very different formal objects from the transformational rules of syntax as usually conceived and as formalized by Ginsburg-Partee.15 I will not try to reconcile the two versions.16 Suppose that the Chomsky-Halle formalism is adopted.
No explicit interpretation is given, but it is clear that rules are the only expressions that could possibly be interpreted as statements (besides rule schemata, which can be reduced
11 Ginsburg and Partee omit reference to f, which is, however, presupposed in the context introducing the arrow notation.
12 It is not clear if and how on the basis of Ginsburg and Partee a transformational level could be constructed that would be compatible with the theory of levels in Chomsky (1955) or an appropriate modification of that theory. In the present context, such a level must be considered as unspecified.
13 "By a local transformation (with respect to A) I mean one that effects only a substring dominated by the single category symbol A" (l.c. 215, fn. 18).
14 Ginsburg-Partee did not consider grammars with 'syntactic features', see above, fn. 9.
15 The phonological rules as specified l.c. 391, (6), are isomorphic to context-sensitive phrase structure rules in the following sense: If the so-called 'units' in a rule are replaced by single symbols, then a phonological rule has the same form as a context-sensitive phrase structure rule, understood as an expression of the form φ → ψ. The formalism is eventually expanded to include 'rule schemata' (393–396).
16 The situation is further complicated by the fact that Chomsky and Halle consider extending their formalism to cover certain rules "which are rather similar to transformations in their formal properties" (l.c. 398), i.e. similar to syntactic transformational rules in their usual format.
Hans-Heinrich Lieb
to rules, cf. 394, (13)): In a rule φ → ψ, the arrow is taken as denoting a two-place relation whose members are somehow 'given' by φ and ψ, where "φ" and "ψ" are to be replaced by strings of certain primitive symbols (specified 390, (1)). As a reading of the arrow, "rewrite as" is given; furthermore, the rules are 'applied' so as to allow the transition from strings to strings (391f, (8) and (9)). Hence, if taken as a statement, a rule should be taken as a statement on strings. The following formulation may be roughly adequate: "For any string x corresponding to φ and any string y corresponding to ψ, x is rewritten as y" (where "corresponding to" can be defined on the basis of 391, (8), and 392, (9)). The strings are not understood as belonging to the intended subject matter of the grammar; they are apparently considered as expressions that should be interpreted with respect to it. Hence, even if considered as a statement, a rule is not to be taken as a statement on the intended subject matter of the grammar.

Let us now briefly consider grammars according to Chomsky (1965), disregarding the semantic component. The categorial component of the base and the phonological component are covered by our previous discussion. The same holds for the (syntactic) transformational component if we assume that taking into account 'syntactic features' will leave unaffected our previous statements on transformational rules. Although I have not made a detailed investigation, this seems to be true with the following qualification: If the restated transformational rules involve complete 'lexical entries', those entries are among the entities which might be considered as statements.
There is, however, no explicit interpretation to this effect, and an appropriate interpretation would not invalidate our thesis (1).17 Hence, the lexicon of the grammar (consisting of lexical entries and 'redundancy rules', which are not problematic with respect to our thesis (1)) also contains no expressions that can be understood as statements on the intended subject matter of the grammar. Thus, thesis (1) holds for all the non-semantic components of a grammar conforming to Chomsky (1965), which is the key formulation of the Later Model.18

1.4.
Classical semantics. Later Developments
As conceived in Katz and Fodor (1963), the semantic component of a grammar consists of a dictionary and a projection rule component. (The revisions
17 A lexical entry is a pair (D, C) where D is a phonological distinctive feature matrix, and C is a set of specified syntactic and other features (Chomsky (1965) 84, 87). These features are expressions such as "+N" (corresponding to the phonological specified features in Chomsky and Halle (1968) 391, (6a)). It might be proposed that "(D, C)" is to be interpreted as synonymous with the conjunction of statements "D is F1 and ... and D is Fn", where F1, ..., Fn are the features in C. This is a statement on a phonological distinctive feature matrix (such matrices are the values of the variable "D"; the expressions substitutable for "D" are names of matrices). The distinctive feature matrices do not, however, belong to the intended subject matter of the grammar (see our fn. 4); they are at best expressions that can be interpreted with respect to it.
18 The concept of level must again be regarded as unspecified.
in Katz and Postal (1964) are irrelevant to our problem; hence, we use the earlier conception.) The projection rules are indeed formulated as statements. They are, however, statements on paths in phrase markers and on their 'amalgamation'. The paths and their amalgams consist (in some sense of the word) of 'lexical strings' and strings of 'syntactic' and 'semantic markers' and 'distinguishers'. The markers and distinguishers (and strings and paths?) are regarded as expressions that should be interpreted with respect to the intended subject matter of the grammar—they do not belong to it.19 Hence, the projection rules are not statements on the intended subject matter of the grammar (which apparently consists of certain 'abilities' of speakers, cf. Katz and Fodor (1963) [(1964) 484, 493]). This should also follow from our § 1.3, if Katz's claim is correct that projection rules are transformational rules (Katz (1971)). In the case of dictionary entries, be it in the original form of Katz and Fodor (1963) or as modified in Katz (1967) 144f, 149, we may argue in a similar way as for the lexical entries according to Chomsky (1965).20 The 'semantic interpretation of a sentence' contains statements on that sentence (Katz and Fodor (1963) [(1964) 503]), and the interpretation might appear among the entities of a 'semantic level' (left unspecified) which would correspond to the semantic component. The 'sentence' is, however, not a sentence of the natural language but a corresponding string of symbols of a grammatical level (cf. Lieb (1968b) § 3) and for this reason not part of the intended subject matter of the grammar. Hence, thesis (1) remains unaffected.

This concludes our argumentation in support of (1), which by now should be well established. The thesis is restricted to classical TGs; our original claims in § 1.1 were more general, covering most generative grammars that have been suggested.
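The Katz-Fodor machinery of readings and amalgamation can be given a very loose computational gloss. The sketch below is an assumption-laden toy, not their formalism: readings are flattened to sets of semantic markers, amalgamation to set union, and the dictionary entries (for the classic "colorful ball" example) are invented for illustration.

```python
# Loose sketch of Katz-Fodor projection: each lexical item has one or more
# readings (here, sets of semantic markers); a projection rule amalgamates
# the readings of two constituents pairwise. Entries are illustrative only.

dictionary = {
    "colorful": [{"(Color)", "(Abounding in contrast)"}],
    "ball":     [{"(Object)", "(Physical)"},       # the round-object sense
                 {"(Activity)", "(Social)"}],      # the dance sense
}

def amalgamate(readings1, readings2):
    """Amalgamate every pair of readings of the two constituents."""
    return [r1 | r2 for r1 in readings1 for r2 in readings2]

derived = amalgamate(dictionary["colorful"], dictionary["ball"])
print(len(derived))   # 1 adjective reading x 2 noun readings = 2 amalgams
```

Even this toy makes the point at issue visible: the inputs and outputs are formal objects (marker sets), so the projection "statements" are statements about such objects, not about the language itself.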
We shall briefly discuss whether (1) can be extended to later developments in generative grammar. It is natural to do so in the present context, since in those developments it was the treatment of semantic phenomena that came to be the key issue. Katz (1972) is an updated and expanded version of 'classical' semantics and might have been subsumed under it (as the most elaborate version of the standard
19 There are a few passages in Katz and Fodor (1963) where "marker" does not seem to refer to an expression [(1964) 518]. But cf. Katz (1967) 129: "A semantic marker is a theoretical term that designates a class of equivalent concepts or ideas."
20 There are, however, certain problems with this: (1) So-called complex semantic markers (as introduced by Katz) may perhaps be reconstructed as open formulas of predicate logic (cf. Bierwisch (1969); accepted in principle in Katz (1972) 166, fn. 21; criticized in Bartsch (1971) 43–48). (2) Whereas markers in Katz and Fodor seem to denote classes of lexical items that share a factor in their meanings, they are later interpreted by Katz as denoting a class of 'ideas' (see previous footnote). As the phonological part of a dictionary entry continues to be understood as a phonological distinctive feature matrix (Katz (1967) 144)—i.e., as an expression—an interpretation on the lines of our fn. 17 would require some adjustment.
theory in the sense of Chomsky). The following differences are the most important ones in the present context: (a) Introduction of additional notational devices, in particular, of 'categorized variables'.21 (b) Replacement of different projection rules by a single one and assignment of this rule to 'general semantic theory' rather than to individual grammars (Ch. 3, § 10). (c) Introduction of a new interpretative component of grammars to take care of 'rhetorical' phenomena such as topic and comment. The new notation is covered by our previous discussion.22 The single projection rule is again of a metalinguistic nature; moreover, its status is no longer relevant because the rule is no longer part of individual grammars. Of the 'rhetorical interpretations' it is explicitly stated (l.c. 433) that they "will employ the same vocabulary of representation as semantic interpretation does." Thus, thesis (1) seems to hold for Katz (1972) too.

Let us briefly consider 'the extended standard theory' as proposed in Chomsky (1970b), (1972a) (including the 'lexicalist hypothesis' of Chomsky (1970a)) and developed most fully in Jackendoff (1972). Outside semantics, the only innovation that need concern us is the use of syntactic features instead of non-lexical category symbols such as "NP", both in Chomsky (1970a) and Jackendoff (1972).23 This extension of the feature notation is easily covered by an extension of our earlier discussion (above, fn. 17). As for the semantic rules, Chomsky does not make any proposals of his own. Jackendoff's 'projection rules' (e.g. (1972) 107, 293) are statements or pseudo-statements (using imperatives) on formal objects of the grammar (readings, semantic markers, syntactic phrase markers, and the like). In the case of his 'semantic structures', in particular his 'functional structures', I am at a loss as to how to understand them because of poorly explained notation.
With this proviso, then, thesis (1) may be extended to the extended theory as formulated by its main proponents, and thus to the more important recent developments in 'interpret(at)ive' semantics as a whole. An extension of thesis (1) should also be correct for most or all of generative semantics. Concerning the conceptions developed by McCawley and Lakoff, we argue as follows.24 As far as a systematic account of grammars has
21 Ch. 3, § 9. Also: various parentheses p. 165f; a number of 'abbreviatory' symbols (e.g., p. 314).
22 No demonstration will be attempted. Of a semantic marker it is now stated that it "is a theoretical construct which is intended to represent a concept" (Katz (1972) 38); the question of what the ontological status of a concept is "will be left here without a final answer" (39).
23 In Chomsky, there are additional changes in the form of the base rules, which, however, are of no consequence with respect to thesis (1).
24 I had worked out a detailed argument on the basis of Lakoff's unpublished book [1969], only a small part of which was published as Lakoff (1971). In 1972 I learned from the author that his book is being completely rewritten. I therefore decided not to publish my analysis, which, however, led to basically the same results as the following argument.
been given, the only expressions of a grammar that could possibly be understood as statements are the rules of the grammar. All rules (whether 'descendants' of the earlier phrase structure rules or of the transformational rules) are understood as specifying 'constraints' on 'derivations' that are sequences of phrase markers. The phrase markers do not belong to the intended subject matter of the grammar (a natural language, in some sense) because they are partly constructed from metalinguistic symbols such as "S", "N" etc. Hence, the rules cannot be understood as statements on the intended subject matter of the grammar.25

The work of Keenan (e.g. Keenan (1972), Keenan (to appear)) should also be covered by our previous arguments: There are rules for specifying an interpreted formal system certain expressions of which are the 'logical forms' or 'logical structures' of sentences of a natural language. Neither the rules nor the expressions are statements on the intended subject matter of the grammar. The logical forms are to be related to sentences of the natural language (rather, to names of such sentences) by transformations. To the transformational rules our previous arguments (§ 1.3 and this paragraph) apply.26 In a way similar to Keenan, Bartsch (1972) and Bartsch and Vennemann (1972) are aiming at greater logical explicitness within a generative-semantics-type framework. Thesis (1) can also be extended to cover their work, although this is somewhat difficult to establish due to imperfections of their formalization (e.g. Bartsch (1972) Ch. XX).27
25 Lakoff's account of the concept of phrase marker in [1969] is completely inadequate if taken literally, though similar in intent to McCawley's definition of tree ((1968) 244f), which, allowing for minor modifications, may be taken as basic for later work (McCawley (1972) assumes the 1968 account: 543, fn. 4). Our above statement on phrase markers is based on the following feature of McCawley's conception: McCawley includes a 'labeling relation' among the constituents of a tree and speaks explicitly of trees "whose non-terminal nodes are labeled by syntactic category names" (246). It should be noted, though, that McCawley introduces "tree" as a purely set-theoretical term. One might try to apply his definition to trees that are entities of the natural language: The set of 'nodes' would have to be identified with a sentence of the natural language and certain of its parts, the set of 'labels' with a set of syntactic categories (not category names). But certain conditions in the definition would now exclude many trees of the usual sort. (The difficulties are with multiple occurrence of the same part of the sentence, and with assigning a single part to several categories simultaneously, e.g. to N and NP.) For different but related criticism, cf. Bellert (1972b) 294f, Bellert (1973).
26 For Keenan's own view of his work, cf. the following quotations ((1972) 430, 453): "Using the naturalness condition presented in Part I we shall argue here that the L-structures we have proposed are natural transformational sources for NL-structures." ("L": "logical"; "NL": "natural language".) "Our work is akin in spirit to that in generative semantics (G. Lakoff 1969). The difference here is more one of emphasis: we have been more concerned to formulate rigorously certain logical properties of NL and less concerned to define the functions which derive surface structures from logical structures."
27 The question was discussed with the authors in winter 1972/73 in the Berlin research group on Theory of Language and Theory of Grammar.
Our sketch of recent developments in generative grammar should be sufficient; it would be rather useless to try and aim at exhaustiveness. Most forms of generative grammars that were not considered are systematically compared in Hirschman (1971); apparently they fall in line with the forms that were studied. Thesis (1), or an appropriate extension, may now be considered as well-established for the majority of all work undertaken in generative grammar since its inception. Hence, we have also established the thesis that most generative grammars cannot be considered as theories of their intended subject matter in any ordinary sense of "theory (of)"; moreover, we are confronted with the general problem of how those grammars can be interpreted so as to 'refer', in a reasonable sense, to their intended subject matter. It is these questions that were taken up independently by Wang and myself and studied with respect to the syntactic components of classical TG grammars.

1.5.
Re-interpreting generative grammars

In my own work I attacked the problems connected with (1) by what may be called the re-interpretation approach, as opposed to the correlation approach used by Wang. Briefly, the first method attempts to re-interpret the rules of a generative grammar so that they may be understood as statements on the intended subject matter of the grammar; the second method attempts to correlate an axiomatic theory to a given generative grammar.28 In Lieb (1967), an idea formulated in Lieb (1968b) 379 is applied to (one version of) the categorial subcomponent of a grammar as understood in Chomsky (1965)29: The rules are re-interpreted so as to become statements on the system of the natural language. For this purpose, all rules are considered as expressions containing "→", and are first interpreted in a standard way as statements on strings of a grammatical level. The lexical and grammatical formatives occurring in the terminal strings of the component are then interpreted as names of certain entities called "L-constructs"; the basic morphological elements of the system of the object language are assumed to be L-constructs. The terminal strings are interpreted to denote appropriate sequences of L-constructs. A category symbol
28 The possibility of re-interpretation seems to be naively assumed in much work within the generative framework (see also the quotations from Bach (1964) in Lieb (1967) 369f). For an approach similar to Lieb (1967), though less formal and questionable in detail, cf. Hermanns (1971). The conception of syntax developed in Bellert (1972b) may be considered as a first step to either the re-interpretation or the correlation approach; cf. below, § 4.5. (Both Hermanns and Bellert are unaware of the work done by Wang or myself.) In a number of publications Marcus distinguishes and compares 'generative' and 'analytic' models of language (esp. Marcus (1967a), (1969)). Although this parallels a distinction between algorithmic and axiomatic grammars, Marcus does not recognize the problems presented by our thesis (1) and considers the distinction between analytic linguistics and synthetic or generative linguistics as an essential distinction, irreducible to another, more elementary one ((1969) 320).
29 Lieb (1967) was written later than Lieb (1968b) and was actually published only in 1969.
denotes "a class of sequences of L-constructs, roughly, the class of sequences with the following property: Each sequence is denoted by a certain part of the terminal string of some generalized phrase-marker, where the phrase-marker is a deep structure underlying a surface structure, and the part of the terminal string is dominated by the category symbol." (Lieb (1967) 371). Now a second interpretation of the rules is given, which makes them statements on the system of the object language, formulated either in predicate logic or in set theory. To give an example: Suppose that "S → NP^VP" is one of the rules. "S", "NP", and "VP" denote the same classes as the corresponding category symbols "S", "NP" and "VP".30 The arrow is interpreted as ⊆, and "^" as denoting the product operation on classes of sequences; e.g., "NP^VP" denotes the class of sequences obtained by concatenating every sequence in NP with every sequence in VP. On this interpretation, the rule is equivalent with "Every element of S consists of an NP followed by a VP" (understood as "Every element of S is the concatenation of some NP and some VP"), which corresponds to an intuitive reading given to such rules by many linguists (as could easily be documented from the literature). A way is suggested as to how the rules, understood as statements on L-constructs and classes of L-constructs, can be understood as statements on units or categories of the object language, at least in the case of a 'correct grammar'. Any such possibility presupposes the second interpretation of the rules.31 Lieb (1967) does not yet formulate definite results but outlines a program for research.
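The second interpretation just described, with the arrow as set inclusion and juxtaposition as the product operation on classes of sequences, can be made concrete in a few lines. The sample sequences below are invented for illustration; only the reading of "S → NP^VP" as "S ⊆ NP^VP" is taken from the text.

```python
from itertools import product

# Classes of sequences (tuples of formatives); the members are made up.
NP = {("the", "dog"), ("she",)}
VP = {("sleeps",), ("sees", "him")}
S  = {("the", "dog", "sleeps"), ("she", "sees", "him")}

def concat_product(A, B):
    """The product operation: concatenate every A-sequence with every
    B-sequence, yielding the class denoted by "A^B"."""
    return {a + b for a, b in product(A, B)}

# The rule "S -> NP^VP", read as a statement:
# "Every element of S is the concatenation of some NP and some VP."
print(S <= concat_product(NP, VP))   # True
```

Note that on this reading the rule has a truth value relative to the classes assigned to the category symbols, which is exactly what turns it from a rewriting instruction into a statement.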
In subsequent work (unpublished) I carried the program through in complete detail, but eventually discontinued the entire line of research for two reasons: (a) When the classical theory was supplemented by the interpretative part I had worked out, the whole system seemed to be unnecessarily complex, especially if compared to the conception of theories and their interpretation as developed in the philosophy of science. (b) After 1965, the classical theory dissolved. In Lieb (1967) the question of how to understand a classical TG as an interpreted axiomatic theory was not treated explicitly; rather, I confined myself to the immediate problems raised by thesis (1). The correlation approach developed by Wang is concerned with the former question. It has grown out of work with a somewhat different orientation, which we will also characterize.

1.6. Correlating grammars and axiomatic theories
In Wang (1968), a grammar of a natural language is conceived as a system of rules by which 'grammatical statements' (einfache grammatische Aussagen ["simple grammatical statements"], l.c. 21)

30 For reasons explained in Lieb (1968b) § 2.2, the letter symbols in rules should be italicized and thus distinguished from the corresponding symbols of the grammatical levels. 31 That interpretation is not a psychological one; hence, it may be questioned whether we are establishing a direct relation to the intended subject matter; cf. the quotations above, § 1.1, fn. 4. The same objection may be raised against Wang's solution.
can be derived. These statements have the form of sentences of first-order predicate logic consisting of a one-place predicate and a closed individual expression. (For formulating the rules Wang also introduces expressions analogous to individual variables.) Wang's 'statements' can indeed be interpreted as statements, and they are obviously intended as statements on the natural language.32 It is proposed (l.c. 40) that a 'structural description' (strukturelle Beschreibung) be identified with a construction (Konstruktion, l.c. 23) of a grammatical statement on the basis of the rules, the construction being a sequence of grammatical statements ending with the statement in question (say, "S he came"). Wang's general approach consists in developing formal systems that correspond to a syntactic system for first-order predicate logic (in the sense of Carnap (1958) Ch. B); a grammar of a natural language is the system of rules of one of the formal systems; grammatical statements on the natural language are derived by those rules, which thus correspond to the rules of inference in a syntactic system for logic.33 It is then shown that the resulting grammars correspond (in a specific way and allowing for a number of deviations) to transformational grammars and structural descriptions as characterized in Chomsky (1965). Further developments of this approach are found in Wang (1971a). For structural descriptions in the sense of Wang it is indeed easier to see how they could be directly interpreted with respect to the object language of the grammar. On the other hand, the rules of a grammar cannot be interpreted in this way since they are rules for the derivation of grammatical statements. E.g., "NPu, VPv → Suv" ((1968) 24), if interpreted as a statement, might be read as: "From "NPu" and "VPv", "Suv" is derivable" (where "u" and "v" are variables whose values are expressions of the natural language, and juxtaposition of variables denotes concatenation).
The rule in this example is easily replaced by a sentence of predicate logic: "(u)(v)(NPu & VPv → Suv)"—"for every u and v, if u is an NP and v a VP, then u concatenated with v is an S." This suggests that the language of first-order predicate logic might have been used in the first place: Instead of giving a formal system whose rules of derivation coincide with the rules of the grammar, the latter are introduced as axioms in an axiomatic theory formulated completely within first-order predicate logic itself. The axioms may then be understood as statements on the object language. The usual rules of inference are used to derive theorems which can be interpreted as statements on the object language, such as "S (he comes)".
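For a modern reader the contrast may be easier to grasp in executable form. The following Python sketch is no part of Wang's or Lieb's apparatus; the two-word lexicon and the category names are invented for illustration. It treats the predicate-logic axiom as a closure condition licensing grammatical statements such as "S(he comes)":

```python
# Toy sketch (not Wang's system): the grammar rule recast as an axiom,
# "if u is an NP and v a VP, then u followed by v is an S",
# used to derive grammatical statements such as S('he comes').

# Illustrative lexical axioms: statements of the form K(x).
facts = {("NP", "he"), ("VP", "comes")}

# One axiom schema corresponding to "(u)(v)(NPu & VPv -> Suv)".
def close_under_s(facts):
    """Add the S-statements derivable by the axiom from the current facts."""
    derived = set(facts)
    for cat1, u in facts:
        for cat2, v in facts:
            if cat1 == "NP" and cat2 == "VP":
                derived.add(("S", u + " " + v))  # concatenation of u and v
    return derived

theorems = close_under_s(facts)
# ("S", "he comes") is now a theorem, i.e. "'he comes' is an S".
```

The point of the reformulation is visible even in this toy: the axiom is a statement about strings of the language, not a rewriting instruction.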
32 As constants denoting expressions of the natural language Wang uses those expressions autonymously (cf. l.c. 21). If this is taken literally, he allows only for written forms of natural languages. Cf. "N'man'" (or "Nman") l.c. 20, to be interpreted as meaning that "man" is an N. 33 Basically the same approach is used in the appendix of Smaby (1971), where the concept of a formal system as put forward in Smullyan (1961) is applied.
This possibility is indeed explored in Wang (1971b) (cf. also Wang (1972a) Section IV, (1973a)), where the rule quoted from Wang (1968) 24 is replaced as suggested above. Wang indicates a general way of replacing a context-free grammar by an axiom system formulated within first-order predicate logic (with identity) such that the axioms can be understood as statements on the object language. He formulates a special case of a theorem (proved in Wang (1971a)) stating that a sentence of the form Kx can be derived from the axioms (where K is a category symbol and x a closed term to be interpreted as the name of a syntactic constituent) if x is a terminal string derivable from K in the grammar ((1971b) Section 4).34 An attempt is made to replace transformational rules by axioms, too (Section 5). Emphasizing that his results agree with Chomsky's conception of a grammar as a theory (p. 273), Wang finally applies the concept of deductive-nomological explanation to grammars reformulated as axiom systems. A more detailed application of that concept is attempted in Wang (1972a) (to be discussed below, § 4.4). In recent work Wang has been trying to combine ideas from Katz and Fodor (1963), Knuth (1968) and Montague (1970a, b), (1973) in order to arrive at a semantic component that could be combined with an axiomatic syntax (Wang (1972a) Section V; (1972b); (1973b)); this attempt does not provide an axiomatic reformulation of semantic symbolisms as developed within the generative framework. Wang's work may well represent the most promising attempt to understand the syntactic component of a classical TG as an interpreted axiomatic theory (by mechanically constructing an uninterpreted axiomatic theory from the grammar). Still, his account is problematic because of (A) and inadequate because of (B) and (C): A. Consider phrase structure rules such as "S → NP VP".
The sentence correlated with this rule—"(x)(y)(NPx & VPy → Sx⌢y)"—is equivalent with "NP + VP ⊆ S" (assuming an appropriate language of logic and a definition of "+" so that NP + VP = {z: (∃x)(∃y)(NPx & VPy & z = x⌢y)}). But why not correlate "(z)(Sz → (∃x)(∃y)(NPx & VPy & z = x⌢y))", which would be equivalent with "S ⊆ NP + VP"? Indeed, it is this latter sentence that is equivalent with the rule in the re-interpretation approach of Lieb (1967), and it is only this sentence that corresponds to the immediate constituent analysis which phrase structure grammars were meant to capture. Wang naively assumes that it is only a difference of formulation ("anders ausgedrückt" (1971b) 277) whether constituents are combined or subdivided. But the two axiomatic theories that could be correlated with a phrase structure grammar differ in a non-trivial way. How are they related? Would it be possible to read "=" in each axiom of the two theories
34 In Section 3, it is shown that from any semi-Thue system a corresponding axiom system in first-order predicate logic can be constructed. The resulting system is, however, quite different from the systems obtained in Section 4, and it is left open how an interpretation with respect to natural languages might be effected.
("S = NP + VP"), which collapses the difference and gives us a grammar very much like the ones proposed in Cooper (1964)? B. In many cases, the theory correlated with the complete syntactical component of a classical TG will be inadequate for lack of correspondence and may even be inconsistent. According to Chomsky (1965) 174, "rules of agreement clearly belong to the transformational component." We may surely assume a classical TG of English where the strings the boy is running and the boys are running (or analogous strings that would also serve the purpose of the demonstration) are terminal strings of the syntactic component and have (surface or shallow) phrase markers in which the boy is dominated by "NP" and are running by "VP". By Wang's method we should have "NP the boy" and "VP are running" as theorems of the correlated theory.35 Since agreement is handled by transformational rules, we may assume a base rule "S → NP + VP" (or some other rule that would serve for the rest of the argument). The correlated theory then contains the axiom "(x)(y)(NPx & VPy → Sx⌢y)". We immediately obtain the theorem "S the boy are running". On the other hand, the boy are running cannot be derived from "S" in the grammar, assuming an appropriate rule of agreement that blocks the derivation. Hence the theory is inadequate, for lack of correspondence with the grammar. It may even be the case that we have a theorem of the form ¬(Sx) whenever the derivation of x from "S" by means of the grammar is transformationally blocked. In this case, we have "¬(S the boy are running)", which makes the theory inconsistent. C. Wang's informal indications of how to interpret a theory correlated with a classical TG are inadequate.36 Consider the theory correlated with the categorial subcomponent of a classical TG.
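Argument (B) can be reproduced in miniature. In the sketch below (a hypothetical illustration, not a fragment of any actual TG; the agreement check is deliberately crude), the correlated theory proves "S(the boy are running)" although the grammar's transformational agreement rule blocks the corresponding derivation:

```python
# Miniature illustration of argument (B): the correlated theory
# overgenerates because agreement, handled transformationally in the
# grammar, leaves no trace in the correlated axioms.

np_facts = {"the boy", "the boys"}          # illustrative NP theorems
vp_facts = {"is running", "are running"}    # illustrative VP theorems

# Theorems of the correlated theory: every NP-VP concatenation is an S.
theory_sentences = {n + " " + v for n in np_facts for v in vp_facts}

# What the grammar itself derives, with agreement enforced (sketch only):
def agrees(np, vp):
    plural_np = np.endswith("s")            # crude number check, illustrative
    plural_vp = vp.startswith("are")
    return plural_np == plural_vp

grammar_sentences = {n + " " + v
                     for n in np_facts for v in vp_facts
                     if agrees(n, v)}

# "the boy are running" is a theorem of the theory but not derivable
# in the grammar: the theory lacks correspondence with the grammar.
```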
Suppose the grammar is meant to be a grammar of a certain natural language L (in a sense where L is not identified with a set of strings derived from "S" by means of the grammar, to avoid circularity; cf. Lieb (1968b) § 3). In the theory associated with the categorial subcomponent, all category symbols—"S", "NP" etc.—are primitive (undefined) axiomatic constants. Following Wang's interpretative hints we would interpret them to refer to categories of the system of L by sentences such as "'S' denotes the set of sentences of L" etc. Assuming that denotation for category symbols can be understood as a finite two-place relation and thus be given by enumeration, we are still left with the problem of making
35 "Should", because Wang makes only suggestions for treating the transformational component ((1971b) § 5). If those theorems cannot be obtained, the method is inadequate for that very reason. 36 Wang (1971b) 277: "Ein formales System ist erst für die Linguistik von Interesse, wenn es eine linguistische Deutung besitzt. Wir sagen zum Beispiel im Fall der Regel wie NP → T N, daß NP für eine Nominalphrase, N für ein Nomen und T für einen Artikel steht." ["A formal system is of interest for linguistics only once it has a linguistic interpretation. We say, for example, in the case of a rule like NP → T N, that NP stands for a noun phrase, N for a noun, and T for an article."] Wang probably wants to say that "NP" denotes the set of noun phrases, etc., which is then implicitly assumed also for "NP" in the axioms of the correlated grammar. Wang's interpretative hints are extremely scanty.
sense of the expressions "sentences of", "noun phrases of", etc., and of the constant "L". Wang does not even see the problem. Thus, an interpretation along his lines is empty. Instead of supplementing the theory by an interpretation as above we may suggest adding axioms which are identities: "S = Sentence-of-L", "NP = Noun-phrase-of-L", etc. This would bring out even more clearly the emptiness of Wang's interpretative hints: They do not contain anything about how to interpret the new theory. What are the consequences to be drawn from our investigation of the two approaches that were meant to overcome the difficulties created by thesis (1)? If either the re-interpretation or the correlation approach is successful, a generative grammar (or rather, the syntactic component of a classical TG) may be taken as equivalent, in a certain sense, with an interpreted axiomatic theory. I do not wish to maintain that the problems and inadequacies of Wang's method could not be overcome, just as my own re-interpretation program can be carried through in detail. My argument is that the two attempts are equally misguided: To make sense of a generative grammar (to make it say something about its intended subject matter) we either end up with an interpretative machinery inferior to the interpretation concept that has been or could be developed for axiomatic theories (Lieb), or we make use of an axiomatic theory anyhow (Wang). In order to make sense of an axiomatic theory of a language, we certainly do not need a generative grammar. If it can be shown that such a theory is not otherwise inferior to a corresponding generative grammar, there is no reason why one should bother with such a grammar in the first place. And indeed, this can be shown (§ 4.4, below). Hence, the attempts made by Wang and myself are misguided from the point of view of developing an optimal format for scientific grammars.
Their main value consists in showing that the enormous amount of generative research could be made available, or partly available, for use in a better framework. In turning to our main subject—grammars as axiomatic theories—we have to admit right away that the accepted view of interpreted axiomatic theories is not quite sufficient for our purposes. Thus, a general discussion of axiomatic theories is required.

2. A framework for axiomatic theories (1): Formalized systems and abstract theories
2.1. Introduction
The aim of this and the next section is an account of axiomatic theories— preliminary in many ways—that may be adequate for treating the general problems of axiomatic grammar writing. The two sections have grown out of an attempt to handle those problems; from a systematic point of view they are basic to the discussion of grammars in Part II. At the same time they try to develop a coherent picture that may be of value independently. At the beginning even elementary
points may be stated explicitly (also for the possible benefit of some readers); soon enough we shall have to modify customary conceptions. Such departures will not always be made explicit; the rich literature on axiomatization must to a large extent be presupposed. Our way of presentation will be relatively informal; a rigorous discussion of all relevant problems in the present context would be both impossible and out of place. Even so the two sections assume a wider scope than is customary in the field and may come closer to giving a synthetic picture than most existing work. They also contain some new results, especially on the combination of different theories into new ones (§ 3.4). An important feature of the framework is separation of formalized systems (logical calculi) (§ 2.2f) from axiomatic theories (§§ 2.4 and 3): An axiomatic theory is formulated by means of a calculus without being a mere extension of the latter, hence, another calculus. We also assume a richer structure for axiomatic theories than is customary: An axiomatic theory is not identified with a set of sentences of some calculus—quite apart from questions of 'interpretation'. The basic distinction usually made between 'uninterpreted' and 'interpreted' axiomatic theories is reconstructed as a distinction between 'abstract' and 'realized' theories. Two types of abstract axiomatic theories are distinguished (§ 2.4), one using axiomatic constants, the other axiomatic variables and an 'explicit predicate' (e.g. a set-theoretical predicate). The realized theories are subdivided into 'interpreted' and 'applied' (§ 3.1); the latter correspond to 'partially interpreted (empirical) theories' as discussed by Carnap and others. They are given special attention (§ 3.2f) because our later account of axiomatic grammars (Part II) is based on them.
The realized theories present a more differentiated picture of 'interpretation' than is usually assumed; in particular, there is no 'partial interpretation' in a Carnapian sense, although there may be partial interpretation in a literal sense in the case of an interpreted theory, and there must be such interpretation in the case of an applied one (the 'theoretical terms' are uninterpreted). Interpretation of the logical calculus used in a theory is discussed in connection with 'interpreted formalized systems' (§ 2.3), where a concept of pre-interpreted formalized system is introduced; it is such pre-interpreted systems that are used for formulating axiomatic theories.37 The subsections on formalized systems (§ 2.2f) are in my own view less satisfactory than the ones on axiomatic theories. Not only did I have to be sketchy in my presentation of the ordinary conception of calculi (choosing, moreover, a fairly conservative version that may be too narrow to include recent developments in logic); I also had to introduce some new concepts without being able to develop their consequences. It may be worthwhile for the logician to pursue any nontrivial problem my account may present, since it is completely motivated by the actual needs of linguistic theory construction.
37 Stegmüller has emphasized the ambiguity of "interpretation" as applied to theories ((1969/70) II, 340f); our conception takes his criticisms into account.
These needs led to elaboration of the last subsection (§ 3.4) where a new conception of theory integration is developed. I distinguish two main types of joining given theories into new ones and formulate two theorems which are important for combining linguistic theories with each other and with non-linguistic ones. My account of formalized systems will at the beginning be based on Fraenkel et al. (1973) 280–293.

2.2. Formalized systems
In Fraenkel et al. (1973) a formal system38 is considered as "an ordered quintuple of sets fulfilling certain requirements" (282): "(1) A set of primitive symbols, the (primitive) vocabulary, divided into various kinds, such as variables, constants and auxiliary symbols" (280). "(2) A set of terms as a subset of the set of expressions", an expression being any string (usually finite) of symbols (281). "(3) A set of formulae as a subset of the set of expressions" (281). "(4) A set of axioms as a subset of the set of formulae" (282). (5) A set of "rules of inference according to which a formula is immediately derivable as conclusion" from an appropriate "set of formulae as premises" (282). If those sets satisfy certain requirements of 'effectiveness' (mentioned in their specification pp. 280–282), such as the terms being effectively specifiable, the formal system is said to be a logistic system (285). The notion of theorem is introduced in the usual way (282): A sequence (finite in the case of a logistic system) of one or more formulae is called a derivation from a set of premises if each formula in the sequence is either an axiom, or a member of the premise set, or immediately derivable from a set of formulae preceding it in the sequence. The last formula is called derivable from the premise set. A derivation from the empty set of premises is called a proof of its last formula, hence, of each of its formulae. A formula is called provable or a formal theorem if there exists a proof of which it is the last formula.
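The quoted definitions are simple enough to check mechanically. The sketch below is an illustrative rendering, not part of Fraenkel et al.'s text; the one-step relation supplied at the end (modus ponens over plain strings) is a stand-in for any particular calculus:

```python
# Sketch of the quoted definitions: a sequence of formulas is a
# derivation from a premise set if each member is an axiom, a premise,
# or immediately derivable from the formulas preceding it; a proof is a
# derivation from the empty premise set.

def is_derivation(seq, axioms, premises, immediately_derivable):
    """immediately_derivable(formula, earlier) says whether `formula`
    follows in one step from the set `earlier` of preceding formulas."""
    for i, formula in enumerate(seq):
        earlier = set(seq[:i])
        if formula in axioms or formula in premises:
            continue
        if immediately_derivable(formula, earlier):
            continue
        return False
    return True

def is_proof(seq, axioms, immediately_derivable):
    return is_derivation(seq, axioms, set(), immediately_derivable)

# Stand-in one-step relation: from "A" and "A->B" infer "B" (modus
# ponens over formulas written as strings; purely illustrative).
def mp(formula, earlier):
    return any(p + "->" + formula in earlier and p in earlier
               for p in earlier)
```

With the axioms {"A", "A->B"}, the sequence ["A", "A->B", "B"] is a proof of "B" in this sense.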
In addition to formal systems, Fraenkel et al. introduce what they call 'formalized theories', each being (287) an ordered sixtuple of sets: a set of symbols, a set of terms, a set of formulae, a set of logical axioms, a set of rules of inference, and a set of valid sentences, with various relations obtaining between, and various conditions fulfilled by, these sets.
The essential features are exclusion of non-logical axioms and introduction of a special set of 'valid' sentences (l.c.):
38 On terminological equivalents, cf. op. cit. 281. For technical terms, variables, assumptions, and theorems introduced in §§ 2–4, cf. the index pp. 106ff. (end of Part I of this essay). I gratefully acknowledge some valuable suggestions that were made to me for the present subsection by Franz von Kutschera (Universität Regensburg) and Wolfgang Stegmüller (Universität München).
The exact extension of this term has to be defined from case to case, the only general condition which such a definition will have to fulfil being that the set of valid sentences should be closed with respect to derivability, i.e. that every sentence derivable from valid sentences by the rules of inference should be valid itself ...
It is maintained that any formal system is a formalized theory (287). There is an inconsistency in this, since a formal system is a quintuple, not a sixtuple. Obviously, the authors think of a sixtuple obtained from a formal system by taking the set of theorems as a sixth member. But even then not all formal systems are formalized theories, because a formal system may contain non-logical axioms. Let us introduce "formalized system" as a new term to cover both formal systems and formalized theories. A formalized system, then, would be a sixtuple of a certain kind. This conception is still not adequate for our purposes: Fraenkel et al. "wished to dodge the problem of the status of definitions in formal systems" (284). We shall explicitly recognize definitions by including the set of definitions among the members of the formalized system; thus, a formalized system becomes an ordered septuple. A final change concerns the 'rules of inference'. By the conception that seems to be adopted in Fraenkel et al. (cf. also Carnap (1942) 22, 157), rules are sentences of some language that refer to formulas of the formalized system; thus, they are 'metalinguistic' entities. To avoid such entities, we take the relation of immediate derivability itself, i.e. a certain relation between formulas and sets of formulas, following the suggestions in Carnap (1958) § 26b. (The rules of inference themselves could also be reconstructed as entities that are no longer 'metalinguistic' in nature; cf. Hatcher (1968) 12.) We thus arrive at the following conception. A FORMALIZED SYSTEM (FS) is an ordered septuple of sets satisfying certain conditions (which are partly spelled out in Fraenkel et al. (1973) and cannot be reproduced here). The first member of the FS is called the set of symbols or the vocabulary of the FS; the second is the set of its terms (including the variables and certain constants from the vocabulary); the third the set of its formulas; the fourth the set of its axioms.
These four members are related to each other and to the expressions of the FS (strings of symbols of the FS) as previously indicated. The fifth member is the (possibly empty) set of definitions of the FS, all of them taken as formulas. The sixth member is a two-place relation between formulas and sets of formulas of the FS, called (immediate) derivability in the FS. The last member is the set of valid sentences of the FS, again a subset of the set of formulas, more specifically, a subset of the set of sentences, i.e. the closed formulas of the FS (the ones without 'free variables').39 The provable formulas or formal theorems of the FS are defined as before but
39 This assumes that the expression "free variable" can be defined for arbitrary FSs. Note that validity here is a non-semantic notion that must not be confused with truth.
the concept of derivation is changed by allowing not only axioms but definitions in the derivation sequence. All sentences that are provable and all sentences that are derivable from a set of valid sentences are valid sentences. A FORMAL SYSTEM (F1S) is now defined as an FS whose valid sentences are exactly the provable sentences or theorems of the FS. The terms "logistic system" and "formalized theory" could be re-introduced but we shall have no use for them.40 The following term refers to an important property of FSs: An FS is called axiomatizable if there is an F1S with the same set of valid sentences. Discovery of non-axiomatizable FSs was one of the main results of foundational research in mathematics. Since we explicitly recognize definitions, a few explanatory remarks may be in place. The definitions are formulas of a specific form that satisfy certain conditions relative to each other and relative to the other constitutive sets of the system. Thus, the definitions and axioms form disjoint sets; and there are four functions from the set of definitions: the defined term of, the defining terms of, the definiendum of, and the definiens of. The first is a function into the set of constants of the system; the second into the set of non-empty sets whose elements are either constants or variables; the third and fourth are functions into the set whose elements are either terms or formulas. The defined term is the only constant occurring in the definiendum; it does not occur in the definiens. The defining terms are exactly the constants and variables occurring in the definiens. The defined terms of all the definitions are the defined terms of the system. The set of constants of the system can be partitioned into the set of defined and the set of undefined terms.41 Using "F", "F₁", ... as variables ranging over septuples such as formalized systems, and "C", "C₁", ... as variables over sets of constants of FSs, we now introduce the following notions.
Let C be a set of constants of the FS F. The C-defined terms of F = the defined terms of those definitions of F among whose defining terms there is an element of C or a C-defined term of F. (This should be replaced by a proper recursive definition.) The C-terms of F = those terms of F in which an element of C or a C-defined term of F occurs;42 analogously, the C-formulas, C-definitions, C-axioms, C-theorems, valid C-sentences and C-expressions of F. The following concept (which I have not found in the literature) is fundamental for later definitions: Let F be an FS and C a set of constants of F. F-MINUS-C = the septuple consisting of the following sets: the vocabulary of
40 Also, "formalized theory" will be used later (§ 2.4) for different purposes. 41 For a systematic account of the conditions that must be satisfied by the set of definitions, see a sufficiently comprehensive standard introduction to logic such as Suppes (1957) Ch. 8; for a specialized treatise, cf. Essler (1970). For treating definitions not as formulas but as rules of inference, cf. e.g. Carnap (1942) 157f. For replacing definitions by axioms of a special type cf. Shoenfield (1967) § 4.6. 42 Occurrence does, of course, allow for identity.
F − (C ∪ the C-defined terms of F);43 the terms of F − the C-terms of F; and the five sets obtained analogously from the formulas, axioms, definitions, the derivability relation, and the valid sentences of F (in the case of derivability we take that subrelation of immediate derivability in F for which no first member and no element of a second member is a C-formula of F). We shall say that of two FSs F and F₁, F is contained in F₁ if there is a set C of constants of F₁ such that F = F₁-minus-C. The union of F and F₁ is the septuple obtained by uniting corresponding members of F and F₁, except that the fourth member is to be the set of axioms of F and F₁ that are not definitions. Under certain conditions, here left unspecified, an FS may be called a FORMALIZED SYSTEM OF LOGIC (FS of logic, logical calculus), where "logic" may be replaced by more specific terms such as "predicate logic", "first-order predicate logic" etc.44 It is assumed that in any logical calculus the set of constants (similarly, of axioms) may be partitioned into a non-empty set of logical constants and a set, possibly empty, of non-logical constants.45 If there are no non-logical constants, the calculus will be called pure, otherwise, non-pure.46 If F is an FS of logic, C will be called a harmless set of constants of F if C is a set of constants of F such that some logical axioms of F are not C-axioms of F. We make the following ASSUMPTION ON MINUS: For any FS of logic F and any harmless set C of constants of F, F-minus-C is an FS of logic.47 We finally extend the concept of FS as follows. Most standard logical calculi, at least the pure ones, have 'readings' in natural languages, i.e. some or all of their constants, terms, and formulas are systematically correlated with expressions of the natural language (thus, "All x are P", "Every x is P", "For every x, x is P", "All x are elements of P" etc. may all correspond to "(x)Px"). Frequently axiomatic theories are formulated not directly by means of an FS of logic but by using
Frequently axiomatic theories are formulated not directly by means of an FS of logic but by using 43
43 Where "−" stands for difference and "∪" for union of sets. 44 For a recent introductory survey, see Rogers (1971); cf. also Hatcher (1968). 45 In a sense where this distinction does not depend on an interpretation of the calculus (cf. the remarks in Carnap (1958) § 25a). Logical connectives and operator symbols like "∃" are counted among the logical constants. We shall also assume that in any logical calculus the (possibly empty) set of mathematical constants can be specified by purely formal means. The mathematical constants are assumed to be non-logical. If our conception of FS of logic is to cover systems of natural deduction (cf. Hatcher (1968) § 1.6), we may have to allow for the set of logical axioms to be empty (unless we keep the tautologies as logical axioms). 'Dummy constants' (as introduced by Hatcher l.c., or 'ambiguous names' in Suppes (1957) § 4.3) must be excluded as constants. Concepts such as derivation and proof would have to be redefined. 46 Reflecting Church's "pure functional calculus" and "applied functional calculus", used in an analogous context ((1956) 173f). 47 A complete explication of "FS of logic" might show that the above assumption is too strong. In that case we would have to strengthen the concept of harmlessness by including additional requirements in its definition.
a reading of such a system. I allow for readings from the very beginning, suggesting an explication along the following lines. Let us consider regimented forms of natural languages as a special type of FS.48 Such expressions as "All x are P" would belong to a regimented form of English. Assuming this concept, we define as follows.49 F is a natural language reading (NLR) of F₁ iff: (1) F is an FS. (2) F₁ is an FS of logic whose expressions are not expressions of a regimented part of a natural language. (3) There is an F₂ and a function r such that: (a) F₂ is a regimented form of a natural language. (b) The domain of r is the set of constants, terms, and formulas of F₁. (c) The values of r are sets of expressions of F₂. (d) For all e, e₁ (e ≠ e₁) in the domain of r, r(e) ∩ r(e₁) = ∅.50 (e) The vocabulary (the constants, the variables, terms, formulas, axioms, definitions, valid sentences, respectively) of F = the set of all e such that, for some e₁, e ∈ r(e₁) and e₁ ∈ the vocabulary (the constants, variables, terms, formulas, axioms, definitions, valid sentences) of F₁. (f) Immediate derivability in F = {⟨e, E⟩: (∃e₁)(e ∈ r(e₁) & ⟨e₁, {e₂: (∃e₃)(e₃ ∈ E ∩ r(e₂))}⟩ ∈ Immediate derivability in F₁)}.51 If F is an NLR of F₁, any r that for some F₂ satisfies condition (3) is called a rendering of F₁ as F. We now make the following ASSUMPTION ON NLRs: If F is an NLR of F₁, F is an FS of logic (and if F₁ is an FS of predicate logic etc., so is F). This assumption is, on the whole, unproblematic.52
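Clause (f) of the NLR definition can be illustrated by a small computation. Everything in the sketch is an invented example (the rendering r, the toy one-step relation for F₁, and the English readings); it shows only how derivability in the reading F is induced from derivability in the calculus F₁:

```python
# Sketch of clause (f): derivability transfers from the calculus F1 to
# its natural language reading F through the rendering r. All names and
# the toy rule below are illustrative assumptions.

# r maps each F1-formula to the set of its English readings.
r = {
    "Px": {"x is P"},
    "Px -> Qx": {"if x is P then x is Q"},
    "Qx": {"x is Q"},
}

def derivable_in_f1(formula, premises):
    # Toy one-step relation for F1: modus ponens only.
    return any(p + " -> " + formula in premises and p in premises
               for p in premises)

def derivable_in_f(e, E):
    """Clause (f): e is immediately derivable in F from E iff some
    original e1 of e is immediately derivable in F1 from the set of
    F1-formulas whose readings meet E."""
    for e1, readings in r.items():
        if e in readings:
            originals = {e2 for e2, rds in r.items() if rds & E}
            if derivable_in_f1(e1, originals):
                return True
    return False
```

Thus "x is Q" comes out as derivable in F from {"x is P", "if x is P then x is Q"}, mirroring the underlying modus ponens step in F₁.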
48 In the light of Montague's work (esp. Montague 1970b), this seems justifiable. 49 As of now, there does not seem to exist an explication of the concept of reading that would really do justice to the domain the explicatum should cover, i.e. the controlled use of natural languages (extended by symbolisms) in the rigorous formulation of theories. The following definition may still be simplistic, i.e. may apply to only some of the entities that should be covered. I believe, though, that the rest can be obtained by formal operations from readings in the defined sense. 50 With "∩" for set-theoretical intersection and "∅" for the empty set. 51 I.e. e is immediately derivable in F from E if there is an e₁ such that: e ∈ r(e₁) and e₁ is immediately derivable in F₁ from the set of all e₂ for which there is an element of E that is an element of r(e₂).

⟨⟨U₁, ..., Uₙ⟩, v, f⟩ whose first member
The above account only indicates the starting-point for model-theoretic work; a more detailed presentation would be impractical in the present context.
Grammars as theories: the case for axiomatic grammar (part I)
is a non-repetitious n-tuple of non-empty sets, with n = the number of different types of individual variables of F, such that:
(1) f is a one-place function whose domain is the set of constants, terms, and formulas of F, and there is exactly one m ≥ 1 such that the values of f are 'set-theoretical constructs in ⟨⟨U₁, ..., Uₙ⟩, {1, ..., m}⟩'.54
(2) v is a one-place function whose domain is the (possibly empty) set of undefined non-logical constants of F-minus-C such that, for any argument e of v, v(e) ∈ f(e).55
(3) The domain of v ⊆ the domain of D ⊆ the domain of f.56
(4) For all ⟨e, x⟩ ∈ D, x ∈ f(e), and if e ∈ the domain of v, x = v(e).
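For a finite toy system, the conditions on such a basis can be checked mechanically. The following sketch (invented data, not from the article) verifies conditions (2)-(4) for a one-universe example with one individual constant, one unary predicate, and one sentence:

```python
# Toy check of conditions (2)-(4) on a minus-C basis <<U1,...,Un>, v, f>
# together with a denotation function D.  All data are invented.

U1 = {"socrates", "plato"}                 # the single universe (n = 1)
f = {                                      # possible values relative to <<U1>, {1, 2}>
    "s":    U1,                            # individual constant: values in U1
    "M":    {frozenset(), frozenset(["socrates"]), frozenset(["plato"]),
             frozenset(["socrates", "plato"])},   # unary predicate: subsets of U1
    "M(s)": {1, 2},                        # sentence: 'truth values'
}
v = {"M": frozenset(["socrates", "plato"])}   # value fixed for the undefined constant
D = {"s": "socrates", "M": frozenset(["socrates", "plato"]), "M(s)": 1}

def check_basis(f, v, D):
    # (2) for every argument e of v, v(e) is a possible value: v(e) in f(e)
    if not all(v[e] in f[e] for e in v):
        return False
    # (3) dom v is a subset of dom D, which is a subset of dom f
    if not (set(v) <= set(D) <= set(f)):
        return False
    # (4) for all <e, x> in D: x in f(e), and x = v(e) whenever e is in dom v
    return all(x in f[e] and (e not in v or x == v[e]) for e, x in D.items())

print(check_basis(f, v, D))    # True
```

The point of the example is only that f delimits the admissible values, v pins down some of them, and D must respect both.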
The term "interpreted formalized system" will not be defined; we only give necessary conditions for its application that are indispensable for subsequent discussion. From the very beginning we will relativize the notion to "interpreted except for a set of constants C (and all C-expressions)". This will eventually enable us to deal with so-called partial interpretation of axiomatic theories.
ASSUMPTIONS ON MINUS-C INTERPRETED FORMALIZED SYSTEMS OF LOGIC (C-IFS of logic). If ⟨F, D⟩ is a C-IFS of logic, then: (1) F is an FS of logic. (2) C is a set of constants of F. (3) D is a one-place function. (4) The domain of D is the set of constants (including logical ones), closed terms (i.e. terms without free variables), and closed formulas (i.e. sentences) of F-minus-C. (5) There is a minus-C basis for ⟨F, D⟩.
An interpreted formalized system of logic (IFS of logic) is a couple ⟨F, D⟩ that is a C-IFS of logic, for some C.
54 f is the function assigning to each 'well-formed' expression of L the set of its 'possible values relative to ⟨⟨U₁, ..., Uₙ⟩, {1, ..., m}⟩'. Thus, in the above quotation from Fraenkel et al. (1973) it is specified that the set of 'possible values' for any individual constant is U, the set of 'possible values' of a unary predicate is the power set (the set of subsets) of U, etc.; the set of possible values for formulas could be taken as {1, 2}, the set of 'truth-values'. The term "set-theoretical construct in" will not be defined, but set identity, power set, and Cartesian product may be taken as examples. Following recent advances in intensional logic, intensional entities such as properties of elements of the Uᵢ could also be treated via set-theoretical constructs in an appropriate tuple of sets that would include the Uᵢ; e.g. we might add among others a set J of 'possible worlds' as a fourth member of B and take the values of f as set-theoretical constructs in ⟨⟨U₁, ..., Uₙ⟩, {1, ..., m}, J⟩. Thus, if e is a formula, f(e) could be the set of propositions understood as functions from J into {1, ..., m}, and properties etc. would be treated by also using the Uᵢ. The above formulation is to be understood as generalized on these lines whenever necessary. We did not use a more general formulation in order to leave open the specific form it might take.
55 I.e. v(e) is a 'possible value of e relative to ⟨⟨U₁, ..., Uₙ⟩, {1, ..., m}⟩'.
56 With "⊆" for set inclusion.
Hans-Heinrich Lieb
A minus-C interpreted formalized system of predicate logic (etc.) is a C-IFS of logic ⟨F, D⟩ such that F is an FS of predicate logic (etc.).57 An IFS of predicate logic (etc.) is a C-IFS of predicate logic (etc.), for some C. We further make the following ASSUMPTION 1 ON IFSs OF LOGIC: For all F, B, C, if there is a D such that ⟨F, D⟩ is a minus-C IFS of logic (of predicate logic etc.) and B is a minus-C basis for ⟨F, D⟩, then there is exactly one D such that ⟨F, D⟩ is a minus-C IFS of logic (of predicate logic etc.) and B is a minus-C basis for ⟨F, D⟩. This assumption is justified as follows. A minus-C basis for an interpreted formalized system of logic would be specified by rules of designation, including rules for possible values (these would specify a class of functions to which f belongs). On this basis, the rules of evaluation (including the rules of truth) uniquely determine D if an interpretation of the logical constants of F is presupposed. The assumption is needed for some important considerations on axiomatic theories in § 3.2, Ad (10), and § 3.3.
An IFS of logic ⟨F, D⟩ is called extensional if every sentence e of F is extensional in ⟨F, D⟩ in the following sense: Let e₁ be in the domain of D and occur in e; let e₂ be 'equivalent' to e₁ in ⟨F, D⟩;58 let e₃ be the expression obtained from e by replacing an occurrence of e₁ in e by e₂: Then e is equivalent to e₃ in ⟨F, D⟩.59
In constructing an empirical theory we do not use a formalized system of logic that is completely unspecified semantically.60 Rather, we use systems that have been specified 'up to designation': The semantic rules have been formulated up to the point where indication of a set C of constants and an appropriate triple ⟨⟨U₁, ..., Uₙ⟩, v, f⟩ is sufficient to characterize a relation D that would make the system (minus-C) interpreted. Let us call such systems 'pre-interpreted' and treat them as pairs ⟨F, Δ⟩ (where "Δ" stands for sets of relations D), as follows: ⟨F, Δ⟩ is a PRE-INTERPRETED FORMALIZED SYSTEM OF LOGIC (Pre-IFS of logic) iff: (1) F is an FS of logic.
(2) For any D ∈ Δ: (a) The domain of D ⊆ the set of constants, terms, and formulas of F. (b) For any C, if there is a minus-C basis for ⟨F, D⟩, then ⟨F, D⟩ is a minus-C IFS of logic. (Again, "logic" as part of the defined term may be replaced by a more specific expression like "predicate logic" if the same is done in the definiens.)
A Pre-IFS of logic [...] In the framework developed here it will be important, though, that some non-logical constants of the 'language' of the AT may not be axiomatic constants of the AT. Therefore, in defining the notion of an AT, the axioms, definitions, and constants of the AT should be given independent status. Also, the 'language' of the AT should not be a formalized system but a pre-interpreted formalized system, as noted above (§ 2.3). The 'basic language' can be identified with the axiomatic language excluding the axiomatic constants, in the technical sense of "excluding" (§ 2.3); we start from the axiomatic rather than the basic language.64 This leads us to a third proposal: An AT is considered as a quadruple consisting of the axiomatic language, the axioms, definitions, and axiomatic constants of an AS in the sense of Carnap.65 We will have to distinguish two forms of ATs, only one of which is closely related to ASs. In order to have a second handy term for theories of this form we will re-introduce "axiom system" as a technical term into our framework.
For the following definitions, "L", "L₁", ... are introduced as variables over pairs ⟨F, Δ⟩; "S", "S₁", ... stand for arbitrary sets of formulas of formalized systems; "C", "C₁" for any sets of elements of the vocabulary of any formalized system. An ABSTRACT AXIOMATIC THEORY OF TYPE 1 (AT1) or AXIOM SYSTEM (AS) is an ordered quadruple ⟨L, C, S, S₁⟩ such that: (1) L is a pre-interpreted formalized system of logic. (2) C is a set of undefined constants of L. (3) S is the (non-empty) set of C-axioms of L. (4) S₁ is the set of C-definitions of L. (5) Every element of C occurs in an element of S or S₁.
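The defining conditions (2)-(5) lend themselves to a direct mechanical check. The following toy sketch (invented data; the 'occurs in' test is crude substring matching, and L is only a placeholder) illustrates them:

```python
# Toy well-formedness check for an AT1 <L, C, S, S1>, conditions (3)-(5).
# Axioms and definitions are represented as plain strings.

def is_AT1(C, S, S1):
    if not S:                                   # (3) the set of C-axioms is non-empty
        return False
    if S & S1:                                  # S and S1 must be disjoint
        return False
    # (5) every element of C occurs in an axiom or a definition
    return all(any(c in s for s in S | S1) for c in C)

C  = {"Lexeme", "Sentence"}                     # undefined axiomatic constants
S  = {"every Sentence contains a Lexeme"}       # C-axioms
S1 = set()                                      # the set of C-definitions may be empty
print(is_AT1(C, S, S1))                         # True
print(is_AT1({"Phoneme"}, S, S1))               # False: an 'unused' constant
```

The second call fails precisely because of condition (5): a constant that occurs in no axiom and no definition is excluded.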
Note that C is non-empty because of (3), and that S₁ may be empty. Both the C-axioms and C-definitions may contain non-logical constants of L that are neither in C nor C-defined. (Cf. the definitions in § 2.2, p. 61.) By (5) 'unused' constants are excluded, which would have been too stringent a requirement on formalized or formal systems in general. S and S₁ are disjoint by (1), (3), and (4). Using "T", "T₁", ... to stand for quadruples ⟨L, C, S, S₁⟩ [...] the fourth component ⟨⟨U₁₁, ..., U₁ₘ⟩, v₁, f₁⟩ of Θ is related to the fourth component ⟨⟨U₂₁, ..., U₂ₙ⟩, v₂, f₂⟩ of Θ₁ as follows: U₂ᵢ = U₁ⱼ, for all i = 1, ..., n and some j = 1, ..., m; v₁ ⊆ v₂; and f₁ ⊆ f₂.
[... cor]respondence rules') = the set of C*-axioms of the applied-core language of Θ (the axiomatic language of T₁), where C* = the application constants of Θ (the 'observation terms' occurring in axioms of Θ).
Ad (8) and (9). It has been emphasized by Stegmüller (1969/70), II, 305, that the 'theoretical language' and the 'observation language' can be distinguished only within the total language of the theory (the latter also contains 'mixed sentences' in which both theoretical and observation terms occur, most notably the correspondence rules). Correspondingly, the total language L is introduced independently (8), and two abstract axiomatic theories are distinguished (1, 2) such that the axiomatic language of one is contained in the other (5), whose language in turn is contained in L (9).
Ad (10). As T₁ is an AT1 by (2) and C a set of undefined non-logical, non-mathematical constants of T₁ by (3), (4), and (10), it follows from (10) and the definition of "minus-C interpreted theory" that ⟨T₁, D⟩ is a minus-C interpreted axiomatic theory. Moreover, there is a B₁ and C₁ such that [...] "…" and "⟨x, y⟩ ∈ R" are treated as notational variants of each other. The assumed theory of language will not be developed formally. In informal discussion I may continue to use expressions of an English language reading (obtainable on the basis of Lieb (1968) or by translating from Lieb (1970)), speaking of classes, elements etc. (cf. Carnap (1958) § 28c). I do not wish to maintain that every theory of language should have the logical properties just indicated. On the contrary, if extensional systems are sufficient for formulating a theory of language we may try to avoid complexities such as transfinite levels by formulating theories of language in the language of set theory. This would also solve a problem in our informal version of a theory of grammars: The framework for axiomatic theories in §§ 2 and 3 was formulated by using set-theoretical means of expression.
Assuming that a formal version of the theory of grammars presupposes both a theory based on our framework for axiomatic theories and the theory of language, an awkward situation develops if the language of set theory is used in one but not the other theory. It will soon become apparent, though, that a general restriction to the language of set theory may be problematic.
1 For the reasons, cf. already Carnap (1942) § 12.
4.3. Problems of feasibility
In this subsection we consider various objections that question the general feasibility of our program. Theses A and B both contain the requirement that a grammar should be formulated in the language of predicate logic or set theory (in the sense of § 3.1, p. 76). Although we did not specify the conditions that would make a Pre-IFS of logic a Pre-IFS of predicate logic, the requirement may seem too strong if we take "predicate logic" to imply extensionality of the system, as we certainly would in the case of "set theory". Thus, we arrive at the following
Objection 1. The program is doomed because the total language of a grammar cannot be extensional.
We may argue as follows. According to § 4.1, a complete linguistic grammar will contain not only a semantic part but may even contain a pragmatic one, and its application will establish a relation to actual language use. In the semantic part the grammar will have to deal with meanings of (abstract) expressions and in its application, possibly also in its pragmatic part, with attitudes of speakers. For either task a non-extensional language is required. If Thesis B is adopted we may even strengthen this argument as follows: The axiomatic language of the theory of language in terms of which a complete grammar is formulated is contained in the grammar's total language. In § 4.1 we assumed two subtheories of a theory of language, a systems theory containing a semantic and (possibly) a pragmatic part, and a realization theory containing a speech act part. Developing these three parts requires a non-extensional language for the same reasons as given before in the case of grammars. Hence, the total language of the grammar cannot be extensional.
I would immediately subscribe to these arguments except for the crucial assumption: Non-extensional languages are required for dealing with meanings of abstract expressions (i.e. not of concrete utterances or speech acts) and for attitudes of speakers.110 First, consider meanings.
The first explicit discussion with which I am acquainted is Carnap (1956) § 38. His result, though inconclusive, supports the hypothesis that an extensional language may be sufficient for the semantic description of interpreted formalized systems of logic.111 Recent developments in logical semantics ('possible-worlds semantics') also seem to support that hypothesis: Intensional entities such as properties are all conceived as set-theoretical constructs, and the semantic metalanguage contains only sentences that could be
110 Note that such attitudes are also involved in the meanings of utterances, which are intended meanings (or understood meanings).
111 "On the basis of these considerations, I am inclined to believe that it is possible to give a complete semantical description even of an intensional language system like S2 in an extensional meta-language like Me. However, this problem requires further investigation." (1956, 172).
sentences of the axiomatic language of an abstract set theory, including constants for specific sets. However, I have been unable to ascertain whether this is sufficient for extensionality in a Carnapian sense.112 Next, consider attitudes, in particular, so-called propositional attitudes like knowing, believing, meaning, intending etc. It is apparent from the work of Irena Bellert that these concepts may occupy an important place in grammars, and fairly obvious from the work of Bellert and other authors that they should occupy such a place in a theory of language.113 We need non-logical constants such as "believes", "intends" in a grammar and a theory of language; it may be left open whether these are axiomatic constants of a theory of language or of a psychological theory used in the theory of language.114 The question, then, is whether an axiomatic theory of the relevant propositional attitudes (that might be part of a theory of language or of actual grammars) can be formulated in an extensional language. Relevant discussion is found mainly in logical semantics and the philosophy of language, where an extensive literature on propositional attitudes has accrued, centering to a large extent around the concept of belief. Mainly since Carnap (1956) (i.e. 1947) it has been widely debated whether belief should be taken as a relation between a person and a sentence or a person and a proposition, or still something else.115 A decision in this matter is immediately relevant for our problem: According to Carnap, an extensional language may suffice if sentences are chosen (1954; (1956) 232); Rescher, who advocates propositions, reaches the conclusion that "an adequate logical theory of belief statements is not, it would seem, to be had unless modal concepts be presupposed as an available tool for its development" ((1968) 53).
Again, I am not certain about the extensionality of the language if a possible-worlds semantics is chosen for treating propositional attitudes (cf., e.g., Hintikka (1969)).
112 Carnap (1956) 168f. considers one of his metalanguages as non-extensional simply because it contains sentences which say what the intension of a certain expression is; in his view, any such sentence is non-extensional. This would seem to answer our question also for the more recent developments. But Carnap's supporting example seems to be wrong; he apparently confuses equivalence in the metalanguage with equivalence in the object language. He also presupposes that an expression such as "the property Human" would be a predicate expression of the metalanguage because "Human" is; in a semantics taking properties as particular sets and sets as individuals, such an expression would be an individual expression.
113 Cf. Bellert (1972a), (1972b) (to be discussed in § 4.5). For the role of propositional attitudes in a speech act theory (as a subtheory of a theory of language), cf. Searle (1969) as a classic reference.
114 It may be argued that such constants should be taken as logical operators; this would affect the form, not the substance, of our argument. In any case, we are not concerned with actual 'verbs of believing' etc. in natural languages.
115 The question is taken up from a linguist's point of view in Partee (1973). Cf. also Moravcsik (1973) for a survey of part of the relevant literature.
Given this situation, I am taking the course already hinted at in the preceding paragraph: I explicitly allow as Pre-IFSs of predicate logic systems that are non-extensional. If necessary, I would be prepared to give up the relevant parts of Theses A and B: It is not essential to the program that the underlying logic should have a particularly simple structure.
There is a second objection of a more fundamental nature:
Objection 2. The program is problematic because it naively assumes axiomatizability.
This objection loses part of its force because we did not require that the (axiomatic or total) language of an axiomatic theory should be axiomatizable (in the sense of § 2.2, p. 61). Given a formal theory, abstract, interpreted, or applied, axiomatizability of the theory involves only a subset of the set of valid sentences of the theory's language (cf. § 2.4, p. 70, § 3.1, p. 75). To make sense of the objection within our framework we have to assume that an 'ordinary' grammar can be represented by a formalized theory of some sort. The question then is: Are those theories axiomatizable in one of the previously introduced senses? For representation as a formalized theory we have to decide which set of sentences of the theory's language should be the third member of the theory. Objection 2 may already break down because no such set of sentences can be delimited unless the resulting theory is axiomatizable. In the foundations of mathematics it was possible to find non-axiomatizable formalized systems because in some cases the set of valid sentences could be defined semantically as true under such and such an interpretation; it then turned out that the system with those valid sentences was non-axiomatizable. In empirical theories we have the problem of uninterpreted constants that complicates the concept of truth.
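The axiomatizability question at issue here turns on whether a set of sentences can be effectively listed, i.e. is recursively enumerable. A minimal illustration (a toy example of my own, unrelated to any particular grammar): the terminal strings derivable in a two-rule rewriting calculus can be enumerated one by one, so membership can be confirmed in finite time, while non-membership cannot in general be detected this way.

```python
# A set is recursively enumerable if some procedure lists exactly its
# members.  Toy example: all strings derivable from "S" by the rules
# S -> aSb and S -> ab (i.e. a^n b^n for n >= 1), listed breadth-first.

from collections import deque

def enumerate_theorems(limit):
    """Yield derivable terminal strings, shortest first, up to `limit` of them."""
    queue, seen, count = deque(["S"]), set(), 0
    while queue and count < limit:
        w = queue.popleft()
        if "S" not in w:
            yield w                      # a terminal string: a 'theorem'
            count += 1
            continue
        for repl in ("aSb", "ab"):       # apply each rule to the first S
            new = w.replace("S", repl, 1)
            if new not in seen:
                seen.add(new)
                queue.append(new)

print(list(enumerate_theorems(3)))       # ['ab', 'aabb', 'aaabbb']
```

A generative grammar is, in effect, such an enumeration procedure; the question raised below is whether the true sentences of the form "s is a text of C" form a set of this kind.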
I have been unable to extend to formalized theories even the relativized concept of truth introduced above for realized axiomatic theories (§ 3.3, Third Objection to (10)). One might think of characterizing a set of valid sentences 'pragmatically', but this should always lead to finite sets unless a semantic or syntactical characterization is included. But suppose we have a concept of truth that applies to interpreted and applied formalized theories, and Θ is an applied formalized theory and a theory of a linguistic means of communication C.116 It may be natural to require that every true sentence of the form: s is a text of C, is an element of the third component of Θ (s and C are constants and C is 'meant to refer to the intended subject matter of the theory'). Θ will be axiomatizable only if the set of all those sentences is recursively enumerable. The assumption that it is corresponds to the one most basic assumption of generative grammar (at least, if "text of C" is understood as "sentence of C"). To my knowledge, only Henry Hiz has argued against that assumption in favor of a weaker one (Hiz (1968) 248f). Suppose the set of those sentences is not recursively enumerable. Then we may either give up the requirement that it should be a subset of the third component of Θ, which removes the obstacle to axiomatizability; or we may attempt to find a significant subset (of the set of sentences) that is recursively enumerable; that is, we weaken our claim of axiomatizability to a part of Θ. An axiomatization of that part would then be an 'incomplete axiomatization' of Θ, in an obvious sense. I conclude that the assumption of axiomatizability is at present defensible; in the worst case our program would have to be relativized to 'reduced grammars', i.e. to theories that are 'incomplete axiomatizations' of corresponding formalized theories. One might also consider re-formulating the program for formalized instead of axiomatic theories. It would then lose considerably in interest: There are additional semantic problems with formalized theories; the results on theory integration of § 3.4 do not seem to generalize to formalized theories in a non-trivial way;117 and the systematizing effect of axiomatic theories would be lost.
Objection 3. The program is misguided: It is based on a conception of theories that derives from and aims at theory construction in the natural sciences; this conception is [may be] inadequate, or partially inadequate, for the humanities to which linguistics belongs.
I would subscribe to the premises of the objection in their non-dogmatic version ("may be" instead of "is"), taking as the conception of theories the ordinary one in the philosophy of science.
116 For "theory of", cf. above, § 3.3, fn. 82.
I would not, however, accept the conclusion, for the following reasons: (a) The very attempt to apply a certain conception may be the best means to judge its adequacy in a certain field; the framework for axiomatic theories in §§ 2 and 3 already contains a considerable number of modifications and extensions of the traditional conception that are motivated by the needs of theory construction in linguistics. (b) I would indeed submit that the conception of theories as developed in the philosophy of science, whatever its historic origin, is adequate to a significant degree for any empirical discipline (hence, for linguistics), although there may be no single discipline yet for which it is completely adequate.118
Objection 4. The program is lop-sided: It does not take into account the non-deductive aspects of grammars and grammar writing.
This objection is connected with an important complex of questions on which I can comment only briefly: If the objection is understood as establishing a contrast between deductive theories and theories allowing for probability statements, it is clearly mistaken, since theories of probability can also be formalized or even axiomatized. If, however, the objection refers to the role of inductive reasoning in theory construction, then it has to be admitted that the program makes no reference to it. But this is only natural: Induction plays an important part in the
117 Note that the decisive definitions all contain reference to axioms.
118 There are good reasons for this hypothesis, and so far I have not seen any argument in favour of the singular status of the humanities that could not be refuted or reduced to dogma.
heuristics of theory construction and the confirmation of theories, not in their actual formulation; the program is concerned only with the format of grammars. Indeed, it is one of the advantages of the proposed format that questions of heuristics and confirmation can be formulated in a well-established framework.
4.4.
Advantages of axiomatic grammar writing
Fraenkel et al. (1973, 323) make the following remark on formalization of theories (representation by formalized systems): There are very many mathematicians, and even more so other scientists, who doubt it very much whether mathematical (and other) theories should be formalized even if they can be so in principle, suspecting that the fruits of formalization are not worth the effort.
This applies directly to our program of axiomatization in linguistics, which needs to be justified. There are at least four reasons which recommend an axiomatic format for grammars as stipulated in § 4.1. Although those reasons carry weight independently of any disadvantages that may be connected with generative grammars, I will include some comparison with generative grammars in the following discussion.
First advantage: The language of a grammar is a formalized system of logic.
This is true for the axiomatic language (or total language) of an axiomatic theory even if a restriction to the language of predicate logic or set theory is not accepted. The system may be a natural language reading of a system of logic, which then provides the possibility of symbolization. Thus, the linguist does not have to bother with formally specifying the language he is going to use: This task has been solved for him by the logician. All that is needed is appropriate introduction of the non-logical components: the non-logical constants and the axioms and definitions that involve such constants.
Second advantage: The metalinguistic approach to the theory of language is avoided.
One of the basic ideas of generative grammar can be summarized as follows: Language should be studied by specifying the form of generative grammars that characterize natural languages; more specifically, the generative grammars used by the linguist should eventually be formally constrained in such a way that they characterize precisely the natural languages. Insofar as the symbolism used in a generative grammar can be compared to a constructed language, this approach may be called metalinguistic because it directly studies the formal properties of grammars in order to indirectly arrive at properties of natural languages ("Every natural language has the property of requiring generative grammars with the following formal properties: ...").
This is at best a very clumsy procedure for developing a theory of language; at worst it is a permanent source of pseudo-problems and ad hoc decisions on matters of principle.119 The axioms and theorems of a grammar as stipulated in § 4.1 are sentences of a language which is interpreted to a degree where the axioms and theorems can be understood as assertions on the intended subject matter of the grammar. A theory of language does not deal with the linguistic properties of grammars but with arbitrary natural languages. Grammars and theories of language are related by the concepts of theory integration as developed in § 3.4.
Third advantage: The discussion of theories in the philosophy of science can be made to bear directly on grammars.120
This claim is obviously correct because of our §§ 2 and 3, which incorporate and develop many important features of the conception of theories found in the philosophy of science. Because our framework for axiomatic theories is sufficiently similar to that conception, the latter is relevant in still other respects. Generally, the following points can be made.
(a) Heuristics and explication. It is well-known that a theory must be distinguished from the ways of arriving at a theory, which in turn can be studied systematically. Developing the heuristics of grammar writing is simplified when it can be seen as a special case of characterizing theory construction in the empirical sciences. In particular, the relation between an informal scientific grammar and a corresponding grammar that has been formulated as an applied axiomatic theory can be understood in terms of a set of simultaneous 'explications'.
(b) Interpretation. Transformational generative grammars, or algorithmic grammars in general, run afoul of the problem of 'making sense of them'. The two existing attempts to solve this problem directly or indirectly make use of axiomatic theories. Thus, solving the interpretation problem for generative grammars seems to require a solution of the same problem for axiomatic theories. In § 3, such a solution was attempted, based on the relevant discussion in the philosophy of science.
119 As an example of a pseudo-problem or set of pseudo-problems, take the question of how to develop an 'evaluation measure' for grammars in terms of formal properties of grammars. As a general requirement, evaluation is to correspond to how closely a grammar is in agreement with certain psycholinguistic assumptions, especially on language learning. Normally, this would be seen as a problem of relating sentences of the grammar to sentences of a psycholinguistic theory, regardless of the form of the sentences involved.
From this point of view, the problem of finding corresponding formal properties of the grammar is spurious. For an ad hoc decision on a matter of principle, see the way linguistics is claimed for psychology by Chomsky (cf. Lieb (1970), Ch. 10).
120 In his somewhat mistitled book, Botha (1968) made an ambitious attempt to apply to transformational generative grammars the concepts which have been developed for theories in the philosophy of science. Unfortunately, he failed to demonstrate that such grammars are theories in the required sense. Even if some of his arguments can be saved in the light of Wang's work or my own (cf. above, §§ 1.5f), this does not hold for his elaborate discussion of 'mentalism', 'competence' and related matters, which therefore is largely empty.
(c) Confirmation. The involved discussion of what constitutes the 'data' for a grammar and how the grammar is related to the 'data' could greatly profit from placing it into the general methodological framework for theories: Assuming that grammars are applied axiomatic theories, the discussion could be reformulated as argumentation concerning (a) the heuristics of grammar writing; (b) the interpretation of the total language and the relation between core, applied core, and application of the grammar; (c) the confirmation of grammars.
(d) Explanation. Wang's recent attempts to apply the Hempel-Oppenheim schema for nomological-deductive explanation of facts to generative grammars presuppose their conversion into axiomatic theories (Wang 1972). The value of Wang's proposals is doubtful.121 Still, if a grammar is written as an applied axiomatic theory it should be possible to give a more adequate account of what Chomsky tried to cover by his distinction between observational, descriptive, and explanatory adequacy. The first kind of adequacy concerns questions of heuristics and confirmation. The second concerns the problem of explaining facts of language use by relating them to what is stated (a) in the grammar and (b) in a psycholinguistic theory that indicates how the system described by the grammar is represented in the speaker. The third kind of adequacy concerns the problem of explaining what is stated in the grammar by relating it to what is stated in a theory of language (or language acquisition). The last two problems may perhaps be reconstructed as explanation of facts vs. explanation of laws. In any case, these and other concepts of explanation now apply directly to grammars.
(e) Evaluation. An applied axiomatic theory can be evaluated by such criteria as formal simplicity, organizing power, predictive power, and 'placeability', i.e. the readiness with which it is linked to other theories in the same or related fields.
At present, such evaluations can hardly be more than informal estimates according to certain principles; as such, they are quite important. They may be sufficient, too, for applied axiomatic theories whose application axioms are sufficiently well understood, since any such theory has a semantic basis and a pre-interpreted formalized system of logic as its total language. I suggest that problems of grammar evaluation came to occupy a prominent place in generative transformational grammar mainly because the grammars remained semantically unspecified.
Fourth advantage: Grammars can be combined with grammars and other axiomatic theories by theory use or conflation (§ 3.4).
121 Wang considers atomic sentences of the axiomatic grammar that assign lexemes to lexical categories as statements of facts, and universal sentences on syntactic categories as laws. Sentences of the two types are then taken as the explanans of an explanation whose explanandum is a theorem derived from the explanans to the effect that such and such is a sentence of the language. But this is a way to have an abstract system explain itself. In my view, the facts to be partially explained by invoking sentences of the grammar are facts of the actual use of the system; for this, pointing out deductive relationships between sentences of the grammar is at best a first step.
100
Hans-Heinrich Lieb
This point is important for (a) practical and (b) theoretical reasons.

(a) Practical. Even idiolect grammars will hardly ever be complete. Grammars whose subject matters are a given linguistic means of communication and different subsystems of that means may be integrated into a more comprehensive grammar by theory conflation, if they have the proposed format. Corresponding statements hold for grammars of linguistic complexes.

(b) Theoretical. It is of considerable theoretical interest to have a unified conception for combining partial grammars into more comprehensive ones; for formally relating grammars of idiolects, language varieties, and languages to each other and to a theory of language; and for making a theory of communication or physiological, psychological, and sociological theories available in a theory of language. Take, in particular, the relation between a grammar and a theory of language. Chomsky remarks ((1965) 6): It is only when supplemented by a universal grammar that the grammar of a language provides a full account of the speaker-hearer's competence.
There is no hint as to how "supplemented" is to be understood. It is naively taken for granted that deductive relationships may be established ((1965) 46): Whenever this is done [sc. abstracting a statement or generalization from a particular grammar and attributing it to the general theory of linguistic structure], an assertion about a particular language is replaced by a corresponding assertion, from which the first follows, about language in general.
This is wishful thinking, given the nature of generative grammars (cf. § 1). On the other hand, analogous statements would have a precise meaning on our conception of grammars, where "supplemented by", for instance, could be interpreted as "formulated in terms of". It is in connection with the fourth advantage that the desirability of accepting Thesis B shows most clearly. If a grammar is formulated in terms of a theory of language, the following conditions are satisfied: (a) No sentence of the theory of language is a sentence of the grammar. (b) A sentence derivable from sentences of the grammar and sentences of the theory of language is a sentence of the grammar. (c) A sentence derivable from sentences of the grammar and sentences of the theory of communication that is used in the theory of language (or of any other theory so used) is a sentence of the grammar. (d) Different grammars are comparable if they are formulated in terms of the same theory of language.122 (e) The interpretation problem for theories of language is solved in an elegant way: it is the very grammars formulated in terms of the theory of language that provide applications for that theory, without giving rise to any circularity.

122 This would hold even more strongly if we could replace "formulated in terms of" in (3) by "strongly formulated in terms of", i.e. if the grammar did not contain any defined axiomatic terms; cf. § 3.4, p. 85.
Grammars as theories: the case for axiomatic grammar (part I)
101
Because of (a), there is no mixing of the general and the particular. Because of (b) and (c), the grammar nevertheless specifies properties of its subject matter that are not purely specific. Because of (d), grammars can be used directly in the quest for language universals. Because of (e), the interrelation between the general and the particular is reconstructed in an intuitively satisfying way that should do justice to traditional arguments both for and against a 'universalist' position in general linguistics.123 The above four reasons in favour of axiomatic grammars cannot be jointly adduced for generative grammars. There would seem to be only one advantage of generative grammars, greater succinctness of symbolization. This turns out to be an erroneous assumption, since the very formulas of a generative grammar can be reinterpreted as set-theoretical expressions (Lieb (1967)). Moreover, any abbreviation that might be useful in a grammar (like "VCV", "+", etc.) can be introduced as a defined constant. In the light of our discussion it may be surprising that the axiomatic approach to grammar writing should have been so late in being seriously considered.

4.5. Historical notes
Both in linguistics and in the theory of linguistics, the axiomatic method has a well-established tradition: on the one hand there have been attempts at axiomatic theories of language or of aspects of language124; on the other there are many axiomatic investigations of grammars. Either type of research is quite different, though, from a grammar itself being conceived or formulated as an 'interpreted' axiomatic theory. My own work in (1967), (1968b) was motivated by the hypothesis that axiomatic grammars are preferable to generative ones in definite respects. In the published versions that idea was not made explicit. The possibility of an axiomatic grammar is implicit in the discussion of Hiz (1968) 248f, where it is no longer assumed that the set A of texts in a language is recursively enumerable: this would suggest an axiomatic treatment of A by which various properties of texts could be specified, without an attempt at recursive enumeration. A Hizean approach to grammar writing is at the basis of Smaby (1971); the proposed 'paraphrase grammars' are formal systems as originally used by Wang, and thus close to axiomatic theories (cf. above, § 1.6, p. 54). Moreover, a grammar corresponding to the general theory of Hiz (1969) can be conceived as an applied axiomatic theory, taking as primitive core constants ('theoretical terms') expressions such as "is a paraphrase of" (for a relation between texts). Assuming a general pragmatic theory that is used in the grammar and application terms that refer to speakers, the paraphrase relation between, say, texts of an idiolect could be related to facts about the use of texts that are paraphrases of each other.125

Irena Bellert has recently proposed a format for grammars which makes a grammar "analogous to a formal interpreted theory" (Bellert (1972b) 299).126 Bellert's interesting and important proposal seems to be inadequate as it stands. I will discuss this question in some detail. Following Bellert (1972b) 299f, "a complete grammar or theory of a natural language L" will consist of a syntactic and an interpretative component and, corresponding to each component, of a "meta-rule" or "meta-theorem" which is "independent of any particular natural language" (300). The syntactic component "generates an infinite set of pairs (S, D), where S is a string of symbols over the terminal vocabulary of L and D is a deep structure tree dominated by a distinguished element of the auxiliary vocabulary of L, which denotes the category of sentence" (299). The syntactic meta-rule states that, "for all pairs (S, D) generated by the syntactic component of the grammar, S is a sentence of L and D its structural description." The 'meta-rule' is meant to repair an inadequacy of the pairs (S, D): they are not statements.127 Thus, the 'meta-rule' has an interpretative function relative to the expressions of the syntactic component.

123 The conception formulated in Thesis B plays a prominent role in my study on the concept of language universal, Lieb (to appear). One of the theoretical advantages of Thesis B is exactly its fruitfulness for universality research and for clarifying the concept of language universal.

124 A trend that is exemplified by much of the East European work summarized in Marcus
Proposals of this type were shown to be inadequate in Lieb (1968b), for the following reason: they lead to an interpretation that establishes a relation not to a natural language but only to (sets of) strings of symbols of (levels of) the grammar.128 Bellert's proposal for syntax is inferior to both the re-interpretation and the correlation approach, which are also of limited value.
125 I made a similar proposal to Professor Hiz in 1972, who agreed that it would be compatible with his 1969 theory.

126 In a number of previous publications (collected in Bellert (1972a)), Bellert explored the possibility of describing the semantical aspect of a natural language by stating 'implicational rules' for the use of sentences. In 1971 I suggested to her that these 'rules' might best be taken as axioms on the use of sentences in a given language. In a way, this is now done in her article, although in a form which I find objectionable.

127 Cf. Bellert's discussion (1972b) 294.

128 Bellert apparently fails to notice that she is equating the sentences of the natural language with strings of symbols of (a level of) the grammar; this mistake has been discussed for Chomsky's work in Lieb (1968b) § 3. Thus, Bellert's "theorems of the form: 'X is a sentence of L with a structural description represented by D'" ((1972b) 294) are not statements on the natural language.
Grammars as theories: the case for axiomatic grammar (part I)
103
The interpretative component consists of (299) a finite set of axiomatic implications of the form

C → A PROPOSITIONAL ATTITUDE that S'

where C stands for a Boolean function of conditions on the structural descriptions D, A stands for the addresser, PROPOSITIONAL ATTITUDE stands for one of the purported attitudes, and S' is a sentential form ...129
The corresponding 'meta-rule' says that, for all S, D, R and C, if the sentence S with structural description D is used by addresser A and directed to receiver R and D satisfies condition C, then "A PROPOSITIONAL ATTITUDE that S', where C and A PROPOSITIONAL ATTITUDE that S' are, respectively, the antecedent and the consequent of an axiomatic implication" (299f). The separation of the 'axiomatic implications' from the 'meta-rule' is not tenable. To make sense of the conception, we have to take "A" in an axiomatic implication as a free variable (bound in the meta-rule). We thus arrive at 'axioms' as exemplified by "Question-yes/no[S] → A WANTS R to say if S" (298), where "R" has to be taken as another free variable.130 Because of the free variable(s) these 'axioms' are neither true nor false, and they have no acceptable interpretation: what exactly is it to mean that, for every yes-no question S (S being a sentence, not a concrete utterance or speech act), A wants R to say if S?131 All these problems disappear if Bellert's 'implicational axioms' are given up and the semantic 'meta-rule' is understood as specifying the form of certain axioms such that any grammar of a natural language has such axioms, viz.
universal implications of the following form: If x correctly uses ⟨S, D⟩ in L by addressing some utterance U of S to y and if C(S, D, L), then PA(x, y, f); where x and y are variables ranging over humans, S is a variable ranging over sentences of natural languages and D a variable ranging over syntactic structures of sentences; U ranges over objects or events in space-time; L is a constant 'for' the language in question; C(S, D, L) is a sentential formula (of the language of the grammar) in which L occurs and S and D are the only free variables; PA(x, y, f) is an atomic formula consisting of a three-place or two-place predicate PA, which belongs to a small set of predicates 'for' propositional attitudes, and a three-place or two-place argument expression consisting of the free variables x and (possibly) y and a

129 For "PROPOSITIONAL ATTITUDE" in the above schema we are to substitute expressions such as "BELIEVES", understood as "purports to believe" (roughly: behaves linguistically as though he believed); cf. 297.

130 The above example, like most of the examples on p. 248, does not completely correspond to the schema.

131 Bellert maintains for her axiomatic implications that "the consequents can be said to follow formally from the antecedents" (297). But even in analytic sentences material implication does not mean deducibility. Bellert might mean that, given the antecedent, modus ponens could be applied; but this would not answer my question.
formula f 'specifying properties of propositions relative to C and, possibly, x and y'.132 This description obviously has no place in any particular grammar but only in a theory of grammars. There it may be dispensed with altogether: what Bellert seems to be aiming at is a certain conception of the semantic subsystem of a language system and its relation to 'pragmatic' factors. Such a conception should be developed in a theory of language in terms of which grammars can be formulated. We thus arrive at the following picture: Bellert's proposal as it stands is given up. A partial grammar of a language 'covering syntax and semantics' is an applied axiomatic theory formulated in terms of a theory of language among whose axiomatic constants there are terms such as "uses correctly", "addresses", and among whose non-logical constants there are terms such as "BELIEVES", "WANTS". All axioms of the semantic part of the grammar are formulated by using those constants as indicated above, and this is justified by assumptions made in the theory of language.133 In this way, the basic features of Bellert's approach to semantics (which I continue to consider fruitful) could be kept in a less objectionable framework.

The idea of axiomatic grammar writing has also been emerging in recent work by Helmut Schnelle. In Schnelle (1970), a grammar is taken as a theory of a language which in principle ought to be formulated in one of the 'languages' of logic. Such a formulation would frequently be "zu umfangreich und unübersichtlich" ("too extensive and hard to survey") (8); therefore, 'linguistic algebras' such as context-free grammars are developed.134 For translations into symbolic logic, Schnelle (l.c. 9) refers to Wang (1968), a reference that was premature at the time, since Wang developed his 'correlation method' only in later work.
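The universal-implication format for semantic axioms described above (an antecedent stating that x correctly uses ⟨S, D⟩ addressing y, plus a condition C(S, D, L), and a propositional-attitude consequent PA) can be pictured with a toy model. The sketch below is our own illustration, not Lieb's or Bellert's formalism; the condition name, the attitude label, and the speakers are invented.

```python
# A toy rendering (ours, not Lieb's or Bellert's formalism) of semantic
# axioms of the form: if x correctly uses <S, D> addressing y, and the
# condition C(S, D) holds, then PA(x, y, S).

def yes_no_question(S, D):
    # stands in for a condition C(S, D, L) on structural descriptions
    return D == "Q-yes/no"

def apply_axioms(axioms, uses):
    """Each axiom is a pair (C, PA); each use is a fact (x, y, S, D)."""
    conclusions = []
    for C, PA in axioms:
        for x, y, S, D in uses:
            if C(S, D):
                conclusions.append((PA, x, y, S))
    return conclusions

axioms = [(yes_no_question, "WANTS-to-say-whether")]
uses = [("anna", "ben", "Is it raining?", "Q-yes/no"),
        ("anna", "ben", "It is raining.", "Decl")]

print(apply_axioms(axioms, uses))
# -> [('WANTS-to-say-whether', 'anna', 'ben', 'Is it raining?')]
```

The point of the format survives in the toy: the axiom itself contains only free variables; a truth-evaluable statement arises only relative to facts about actual uses.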
The first explicit studies of axiomatic grammars are Wang (1971b), (1971

... that for each obligatory transformation T there is a statement that (i) [Pi ∈ domain of T ⊃ (⟨Pi, Pi+1⟩ ∈ T ≡ Pi+1 is well formed)] and that for each optional transformation T' there is a statement that (j) [Pj ∈ domain of T' ⊃ (⟨Pj, Pj+1⟩ ∈ T' ⊃ Pj+1 is well formed)]. (Where 'i' and 'j' range over numerical subscripts of phrase markers in derivations.) Then, to avoid contradiction, we would have to restrict the class of transformations in a grammar by demanding that every obligatory transformation T be such that for every other transformation T″ and every phrase marker Pi, if Pi ∈ domain of both T and T″, then for all Pj, ⟨Pi, Pj⟩ ∈ T iff ⟨Pi, Pj⟩ ∈ T″. In practice what this restriction would amount to is the absurdly strong demand that the structural description of every obligatory transformation be incompatible with (not just distinct from) the structural description of every other transformation in the grammar. It would amount to this because otherwise we would have to imagine two distinct transformations yielding identical outputs from the same input. Anyone familiar with the actual work of generative grammarians will recognize that none of the transformations that have been empirically motivated have this property.
V.
Having shown that both Lakoff's characterization of transformations and his distinction between optional and obligatory rules are inadequate, I now turn to what he says about rule ordering. One of the facts that Postal takes to be central to Lakoff's characterization is that rule ordering constraints are to be represented in grammars by independent theoretical statements. Specifically, such statements are supposed to define well formedness conditions on members of K. About this, Lakoff says: Rule orderings, for example, are given by global derivational constraints, since they specify where in a given derivation two local derivational constraints can hold relative to one another. Suppose ... define local derivational constraints. To say that ... is to state a global constraint of the form: (i) (j) [(Pi/Q & Pi+1/Q & Pj/Q & Pj+1/Q) ⊃ i < j] (Lakoff 1971, 234)
Adopting the characterization of transformations that I suggested earlier, we can take Lakoff as asserting that, if T1 and T2 are transformations, then the statement that T1 is ordered before T2 imposes the following constraint on members of K: (i) (j) [(⟨Pi, Pi+1⟩ ∈ T1 & ⟨Pj, Pj+1⟩ ∈ T2) ⊃ i < j] (where 'i' and 'j' range over subscripts of phrase markers in a derivation)
Rule orderings, obligatory transformations and derivational constraints
131
The intent of such constraints is to define a subset of K by eliminating all members of K that, intuitively, are produced by applying rules in the wrong order. There are two major points to notice regarding this proposal. First, if it is accepted, then there is every reason to believe that grammars will be allowed to vary in the extent to which they impose rule ordering, just as there is every reason to believe that grammars vary in the number of language particular transformations that they allow. Second, this characterization does not require that the ordering relation holding between transformations be transitive. In fact, it leads one to expect that it is not. This second point can be readily grasped by considering a grammar that contains three transformations: T, T*, and T#. Suppose further that the grammar orders T before T* and T* before T#. Given Lakoff's characterization of rule ordering, to say this is just to say that the grammar contains the following two derivational constraints.

(19) (i) (j) [(⟨Pi, Pi+1⟩ ∈ T & ⟨Pj, Pj+1⟩ ∈ T*) ⊃ i < j]
(20) (i) (j) [(⟨Pi, Pi+1⟩ ∈ T* & ⟨Pj, Pj+1⟩ ∈ T#) ⊃ i < j]
However, from (19) and (20) one cannot deduce (21).
(21) (i) (j) [(⟨Pi, Pi+1⟩ ∈ T & ⟨Pj, Pj+1⟩ ∈ T#) ⊃ i < j]
Suppose, for example, that W is a member of K which is such that for all ⟨Pi, Pi+1⟩ in W, ⟨Pi, Pi+1⟩ ∉ T*. In a case like this it is not inconsistent to suppose both that (19) and (20) are true of W (by falsity of antecedent) and that for some Pi and Pj in W, ⟨Pi, Pi+1⟩ ∈ T and ⟨Pj, Pj+1⟩ ∈ T# and i > j. What this means is that, according to Lakoff's characterization of rule ordering, the fact that T is ordered before T* and T* is ordered before T# does not guarantee that T is ordered before T#.11 Of course, it is open to Lakoff to accept (21) as a separate grammatical rule in the case just described. If he were to do so, then T, T* and T# would be linearly ordered in the usual sense. However, if it is the case that whenever grammars impose orderings, the orderings imposed are linear, then we want a theory that is not just compatible with this result, but which predicts it. Therefore, if we want to preserve the idea that all orderings are linear, we must reject Lakoff's characterization of such devices. This point can be brought out more clearly by comparing Lakoff's characterization of rule ordering with an alternative characterization which requires all rules that are ordered to be linearly ordered. According to this characterization, universal grammar contains a single rule ordering statement (22).

11 It doesn't even guarantee that T# is not ordered before T.
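The counterexample can be checked mechanically. The sketch below is our own, with toy phrase markers and transformations; it encodes constraints of the form (19)-(21) and exhibits a derivation on which (19) and (20) hold vacuously while (21) fails.

```python
# Ordering constraints of the form
#   (i)(j)[(<Pi,Pi+1> in T1 & <Pj,Pj+1> in T2) -> i < j]
# checked over a derivation W. Phrase markers are toy labels; a
# transformation is modelled as the set of step pairs it licenses.

def ordered_before(T1, T2, W):
    """True iff every application of T1 in W precedes every application of T2."""
    steps = list(zip(W, W[1:]))              # successive phrase-marker pairs
    return all(i < j
               for i, p in enumerate(steps) if p in T1
               for j, q in enumerate(steps) if q in T2)

T      = {("P1", "P2")}
Tstar  = {("X", "Y")}                        # T* never applies in W below
Tsharp = {("P0", "P1")}

W = ["P0", "P1", "P2"]                       # T# applies first, then T

print(ordered_before(T, Tstar, W))           # (19): True, vacuously
print(ordered_before(Tstar, Tsharp, W))      # (20): True, vacuously
print(ordered_before(T, Tsharp, W))          # (21): False
```

Since T* never applies in W, both universally quantified constraints are satisfied by falsity of antecedent, yet T# precedes T, which is exactly the non-transitivity described in the text.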
132
Scott Soames
(22) For every generative grammar G, (i) if T' and T″ are transformations of G to which G assigns numerical subscripts m and n respectively, (ii) and if m < n, then (iii) for all sequences of phrase markers W, (P1, ..., Pn), such that W ∈ the class K defined by G, and for all Pi, Pj ∈ W: if ⟨Pi, Pi+1⟩ ∈ T' and ⟨Pj, Pj+1⟩ ∈ T″, then W is a well formed derivation in G only if Pi precedes Pj in W.

I will use the following formula, considered as a part of universal grammar, as an abbreviation for this constraint.

(22a) (m) (n) [m < n ⊃ (i) (j) [(⟨Pi, Pi+1⟩ ∈ T'm & ⟨Pj, Pj+1⟩ ∈ T″n) ⊃ i < j]]
(i) (T) [T is unordered ⊃ ⟨Pi, Pi+1⟩ ∉ T]13

(Where the variables 'n', 'm' and 'r' range over numerical subscripts of transformations in a grammar, the variables 'i' and 'j' range over numerical subscripts of phrase markers in a derivation, and 'T' ranges over transformations in a grammar.)

12 Though I intend to leave open for the moment whether or not all grammars completely order their transformations.

13 To say that a transformation is unordered is to say that there are no constraints affecting when it can apply. This is reflected in our formalism by not assigning a numerical subscript to it.
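On the alternative characterization, ordering by numerical subscripts is linear by construction, since the subscripts inherit the linear order of the integers. The sketch below (ours, with toy data) implements the check that (22) describes: if T_m applies at step i and T_n at step j with m < n, the derivation is well formed only if i < j.

```python
# (22)-style check (our sketch): transformations carry integer subscripts,
# and an earlier-subscripted transformation must apply at an earlier step.

def respects_ordering(indexed_ts, W):
    steps = list(zip(W, W[1:]))
    # all applications, as (subscript, step-index) pairs
    apps = [(m, i) for m, T in indexed_ts.items()
                   for i, p in enumerate(steps) if p in T]
    return all(i < j for m, i in apps for n, j in apps if m < n)

W = ["P0", "P1", "P2"]
good = {1: {("P0", "P1")}, 2: {("P1", "P2")}}   # T1 applies before T2
bad  = {1: {("P1", "P2")}, 2: {("P0", "P1")}}   # T1 applies after T2

print(respects_ordering(good, W))   # True
print(respects_ordering(bad, W))    # False
```

Because "ordered before" here just means "has a smaller subscript", transitivity can never fail, which is the prediction the text demands of an adequate theory.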
Intuitively, what these conditions state is that if a derivation D defined by a grammar G is such that

(i) it contains a phrase marker Pj that satisfies the structural description of some obligatory transformation Tn,
(ii) in constructing D, Tn did not apply to Pj,
(iii) failure to apply Tn to Pj was not the result of the fact that Tn had not yet been reached in the ordering of transformations at the time that Pj was an available input to transformations,
(iv) failure to apply Tn to Pj was not the result of the fact that Tn had already been passed in the ordering of transformations at the time that Pj was available, and
(v) Pj was not produced by a rule upon which no ordering constraints are stated,

then D is not a well formed derivation in G. The reason that these conditions are adequate to define the standard notion of an obligatory transformation is that to say that a derivation satisfies them (and hence is not well formed) is just to say that the derivation was produced by allowing an obligatory transformation to be passed over in its turn in the ordering even though its structural description was met. Finally, by collapsing these conditions we can formally define the notion of an obligatory transformation. Thus, a transformation Tn of a grammar G is obligatory in G iff G contains a constraint of the following form. (23)
(i) (m) [[⟨Pi, Pi+1⟩ ∈ Tm & (T) (T is unordered ⊃ ⟨Pi, Pi+1⟩ ∉ T) & (j) (r) (r > n & ⟨Pj, Pj+1⟩ ∈ Tr ⊃ j > i)] ⊃ Pi ∉ domain Tn]

Only those members of K that satisfy all constraints of the form (23) are well formed derivations in G. (Note: to say that a derivation W fails to satisfy (23) is logically equivalent to saying it satisfies conditions (i)-(v) above.) I claim that this characterization captures what we standardly take obligatory transformations to be. However, there is one caveat that must be added. Since the above characterization makes use of rule ordering considerations, the extent to which some subset of the rules of a grammar corresponds to what we standardly take obligatory transformations to be depends upon the extent to which the grammar orders its rules. There are three cases to consider. First, if a grammar orders all of its rules, then the obligatory transformations of that grammar will correspond perfectly with what we standardly take obligatory transformations to be. Second, if a grammar orders none of its rules, then, according to the characterization just given, it can have no obligatory transformations. Finally, if a grammar stands somewhere between these two extremes, then (i) none of its unordered rules can be obligatory and (ii) the application of an unordered rule can keep
an obligatory transformation from applying (if it destroys the relevant environment).14 Of course, for one who thinks that grammars can vary in the amount of rule ordering that they impose, considerations like these might lead one to try to define the notion of an obligatory transformation independently of rule ordering considerations. Two obvious ways of doing this would be to require (a) that every obligatory transformation apply to every phrase marker in a derivation that meets its structural description or (b) that if some phrase marker Pj in a derivation satisfies the structural description of an obligatory transformation T, then that derivation must also contain some Pj+1 such that ⟨Pj, Pj+1⟩ ∈ T.

... (product) z and w are shipped ... We shall see later that more complex branching can also be expressed in English. While the respective products make up the classes to which the predicate in (3) applies and thus occupy the last place in the logical sequences of quantifiers, they occupy the initial positions in the sequences as expressed by the grammatical construction of English. This suggests the hypothesis that, when we deal with sequences of quantifiers that are constructed out of prepositional phrases, the grammatical order is the reverse of the logical order. As the examples below indicate, this hypothesis seems to work for the prepositions 'of', 'to', 'in' (and other locationals), and 'for'.

(4) Some gift to every girl and some gift to every boy are bought by the same Santa Claus
(5) Every deer in some forest and every moose in some meadow drink from the same brook
(6) Every sacrifice for some good cause and every prayer for some blessing please the gods equally
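The reversal hypothesis can be stated as a trivial procedure. The sketch below is our own; it merely encodes the claim that, for prepositional constructions like (4), the logical order of the quantified NPs is the grammatical order reversed.

```python
# The order-reversal hypothesis for prepositional constructions, as a
# one-line procedure (our sketch): grammatical order in, logical order out.

def logical_order(grammatical_order):
    return list(reversed(grammatical_order))

# One branch of example (4): 'some gift to every girl'
print(logical_order(["some gift", "every girl"]))
# -> ['every girl', 'some gift']: 'every girl' binds first, 'some gift' last
```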
The same point is illustrated by sentences in which different prepositions are combined, such as:

(7) Some entrances to every freeway to some city of every country are badly constructed

Though this does not involve branching, the same point is exemplified; the logical order will start with the last quantified phrase ('every country') and work in reverse order, ending with 'some entrances', which is linked to the predicate. The hypothesis stands up, even though in this project we do not distinguish between different senses of the prepositions, such as the 'of' of possession, origin, and object (the portrait of David), as well as the 'to' of direction in contrast with the true dative ('he gave it to me'), and the 'for' of purpose as distinct from the 'for' in 'doing something for someone'. One class of exceptions to our hypothesis will be the pseudo-prepositionals: prepositional phrases that function, semantically at least, not as relational phrases. In one such type of case, the phrase functions adjectivally; e.g.:

(8) Every man of some intelligence (with some sense?) smokes cigars.
In another type of case, the preposition links events with their aspects, and thus once more does not denote a genuine relation. E.g.:

(8a) Some aspects of recent linguistic studies and some aspects of recent logical studies are equally depressing.
144
Dov M. Gabbay and J. M. E. Moravcsik
The other exception is the preposition 'with' and its relatives; as of now we have no explanation for this phenomenon. But the point becomes clear from such examples as:

(9) Every man without some woman is like every ship without some sail
(10) Every man with a large income and every woman with a large appetite suit each other

In sentences like these, the logical order and the grammatical order coincide, and the dyadic predicate does not apply to the variables bound by the last quantifiers in the sequences. The same coincidence of logical and grammatical order can be seen in sentences with subject-verb-object structure, such as:

(11) Every farmer has some sons and every banker has some daughters who belong to the same club

But in the case of this syntactic configuration the dyadic predicate is applied to the variables bound by the last quantified phrases in the logical and grammatical order, i.e., those denoting the sons and daughters respectively. Sentences of this structure admit ambiguities, and thus we cannot assume that all readings of all such sentences will preserve the coincidence of grammatical and logical order. Let us now consider the third type of construction mentioned initially, namely relative clauses. Sentences of the following sort are illustrative:

(12) Some truths that were rejected by every ancient sage in some civilization and some falsehoods that are accepted by every modern scientist in some country resemble each other

The main predicate 'resemble each other' applies clearly to the truths and falsehoods. These are clearly not the things bound by the logically last quantifiers in the sequences. Within the relative clauses, however, the regularities observed so far apply. This shows that we have to determine the logical sequences within each relative clause first, and then attach the whole clause as a complex predicate to the head NP's; the main predicate of the sentence then applies to these.
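The two-step procedure just described (first fix the logical order inside each relative clause, then let the main predicate apply to the head NPs) can be sketched as follows. The representation of branches and the blanket use of order reversal inside clauses are our own simplifications.

```python
# Our simplified sketch of the two-step analysis of sentences like (12):
# step 1 orders the quantified NPs inside each relative clause (reverse of
# grammatical order, per the earlier regularity); step 2 applies the main
# predicate to the head NPs themselves.

def analyse(branches, main_predicate):
    """branches: list of (head NP, [quantified NPs inside its clause])."""
    clauses = [(head, list(reversed(nps))) for head, nps in branches]   # step 1
    heads = [head for head, _ in clauses]                               # step 2
    return {"predicate": main_predicate, "applies_to": heads, "clauses": clauses}

result = analyse(
    [("some truths", ["every ancient sage", "some civilization"]),
     ("some falsehoods", ["every modern scientist", "some country"])],
    "resemble each other")

print(result["applies_to"])   # -> ['some truths', 'some falsehoods']
```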
There are sentences where a predicate applies both to something within the clause and outside, such as:

(13) Some men who pursue every woman are rejected by them

In these cases the same considerations apply as to constructions built around 'with'. Similar considerations apply to clauses built with 'where', such as:

(14) Every place where some lawyers lived and every house where some doctors lived comes under the same zoning law
The specificity-quantifiers of English, such as 'a certain', always come first in logical order, and this rule is prior to all other regularities mentioned here.
Branching quantifiers, English, and Montague-grammar
145
Finally, we turn to embedded sentences. One type of construction is exemplified by sentences with 'believe' as the main verb; e.g.:

(15) Some men believe that every woman hates them

This sentence shows that there can be co-reference between noun-phrases and pronouns across the embedding construction. Needless to say, we can also have branching constructions in which one branch is also outside the belief-context, e.g.:

(15a) Some men of some countries believe that they resemble some animals of every species.

But note the impossibility of a branching construction in which the branches would be held together by a predicate denoted by the verb which creates the embedding construction.

(15c) Some x and all y such that ... and some z and all w such that ... and y believes that w.

Let us now review the whole range of syntactic devices that allow us to express a wide variety of branching configurations. So far we dealt only with cases where distinct branches are held together by a predicate. But there are cases in which the branches are preceded by a common node, as in:

(16) Men who make a deal with a certain chisler and women who keep company with him deserve the same fate.
Here the expression referring to the chisler picks out an element that is necessary for the interpretation of both branches, giving us the structure:

There is a chisler s.t.
    men who ...
                        deserve the same fate
    women who ...

Little reflection on (16) shows that one should be able to attach such an expression, thus forming the diamond-like structure, in front of any arbitrary number of quantified phrases and with any number of branches; the guarantee comes from the fact that these structures are equivalent to very complex relative clauses, as the schematic representation above indicates. Furthermore, these same syntactic devices make it possible to add one diamond-like structure after another; e.g. to go on with (16) in some such form as:

... deserve the same fate s.t.
    some angels of all religions
                                        equally abhor
    some devils of all superstitions
Further complexities of this sort are also expressible in English. Another relevant question is: how many branches can we have? The answer is: any number; one can link them with conjunctions, and predicates like the one in (11) can apply to any number of branches.
Still another dimension of complexity is revealed when we see that the main predicate can apply to more than one class denoted within any one of the branches. E.g.:

(17) Some lie of every politician and some weakness of every voter make the voters hate the politicians
Here all four NP's are tied to the main predicate; and further complexity in this dimension (involving more quantified NP's) is imaginable. Examples used in logic textbooks usually treat only verbs that function as monadic or dyadic predicates in English. But the syntactic device of adding cases via prepositional phrases shows that verbs can function as predicates with a much larger number of elements related. E.g., consider the schema: x brings y with z to w in v for u. We already saw above that more complex verb phrases can apply to any arbitrary finite number of elements. Furthermore, grammatically as well as semantically, prepositional phrases such as 'to Mars', 'to London', 'to California' have common structure that must be brought out by analyzing them into further constituents. Thus the treatment of prepositional phrases as adverbial operators is unsatisfactory, since under such treatment (an extra primitive for each phrase) we would lose internal structure and thus miss important generalizations. Thus we have shown how one can build up in English sentences with any number of branches, and predicates tying the branches together, applying to any finite number of variables, and how the language allows us to form the diamond-type branching, with relative clauses. Can we form branches of arbitrary length? Again, there is a syntactic device that guarantees this. For some prepositions, such as 'of', allow of an arbitrary number of iterations. Thus we can always form branches of the form: "Quantified NP of quantified NP of ..." of arbitrary length.

In this section we illustrated a number of regularities linking the order of quantifiers within standard logical paraphrases and the order of quantified noun phrases in a variety of English syntactic constructions.
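The point that prepositional phrases add argument places can be rendered directly: 'bring' in the schema above becomes a six-place predicate rather than a dyadic one. In the sketch below (ours), the single fact in the relation's extension is invented for illustration.

```python
# 'x brings y with z to w in v for u' treated as a six-place predicate
# (our sketch); the one tuple in the extension is a made-up toy fact.

FACTS = {("jo", "a book", "a bag", "London", "a car", "ann")}

def brings(x, y, z, w, v, u):
    return (x, y, z, w, v, u) in FACTS

print(brings("jo", "a book", "a bag", "London", "a car", "ann"))  # True
print(brings("jo", "a book", "a bag", "Paris", "a car", "ann"))   # False
```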
We also indicated several syntactic devices of English that allow the construction of branching structures with an arbitrary number of branches, with an arbitrary number of NP's within any of the branches, and with predicates that can apply to an arbitrary number of elements. We also illustrated diamond-like structures of arbitrary complexity, and sentences in which the common main predicate is applied to more than one quantified NP from each branch. The regularities and the variety of sentences expressible in English indicate the extent to which branching structures are part of a natural language like English. This sets the background for the development of the rigorous semantics.
Branching quantifiers, english, and montague-grammar
147
II. BRANCHING QUANTIFIERS AND MONTAGUE SEMANTICS
§1.
Introduction
Montague (1973) presented a grammar and a semantical interpretation for a certain fragment of English. The fragment is small, but "accommodates all of the more puzzling cases of quantification and reference" known to him. In part I we have shown that branching quantifiers behave according to certain rules. We now show that branching quantifiers can be expressed in Montague grammar and that the respective semantics is correct for them. Thus the system of Montague (1973) allows for the construction of sentences with branching quantification. Our plan is as follows. In §2 we give an introduction to Montague-type semantics. As explained in Gabbay (1973), Montague's paper (1973) is extremely elegant, and some of its features are only technical options and not conceptual necessities. We therefore give in §2 a simplified account of Montague grammar. In §3 we show how the simplified grammar of §2 can accommodate branching quantification, and in §4 we show how Montague grammar (1973) accommodates branching quantification.
§2.
Simplified Montague Semantics
Let us begin with a certain fragment of English, for example the fragment containing words like John, Mary, run, fall, and sentences like 'John runs', 'Mary kills John'. Our first step is to divide the words into categories, e.g., in this case into the four categories: NP (containing John), IV (containing run), TV (containing kill), S (containing 'John runs'). We supply rules that allow us to construct the sentences we have in mind, e.g., in our case: (R1) S → NP + IV, or in diagram:
(R2) IV → TV + NP, or in diagram:
148
DovM. Gabbay and J.M.E. Moravcsik
(The choice of rules and categories depends on what tasks we set for ourselves: which sentences do we want to construct? What ambiguities do we want to account for? Can we give a simple semantics for this grammar? etc.) Given a grammar, that is, given a set of categories and rules of construction, we can supply this grammar with a semantics. A semantics consists of the following. (a) With each properly constructed phrase x of the language we associate a semantical object ||x||. Phrases belonging to the same category obtain the same kind of semantical object. (b) With each rule R we associate a semantical rule SR. If the rule R allows you to construct phrase z from phrases x1, ..., xn (e.g., S → NP + IV), then SR tells you how to construct ||z|| from ||x1||, ..., ||xn||. These assignments must be natural. That is, the semantics keeps close to the meaning of the English phrases and reflects correctly conditions of truth, makes distinctions in case English makes them, etc. As an example let us construct a semantics for the fragment given above. We start with a set of objects U (which can be thought of, e.g., as a set of people). NP's get elements of the set, e.g. ||John|| ∈ U. IV's get subsets of U, e.g. ||run|| ⊆ U (intuitively, the set of those who run). TV's get binary relations on U, e.g. ||kill|| ⊆ U × U (i.e., a list of who kills whom). Sentences get truth values (true or false). We now have to specify the rules SR1, SR2. SR1 tells you to check, given x ∈ NP and y ∈ IV, whether the element ||x|| belongs to the set ||y||, and gives a truth value. So ||John runs|| is true if ||John|| ∈ ||run||, i.e., the element associated with John is in the set ||run|| (those who run). The rule SR2 says: take the relation ||TV|| and collect all those elements that are related to the ||NP||; e.g., ||kill Mary|| is obtained from ||kill|| and ||Mary|| by collecting all those elements that the relation ||kill|| relates to the object ||Mary||.
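The two semantical rules just described can be sketched in a few lines of Python (a toy rendering of our own, not part of the paper; the universe U and the particular assignments are invented):

```python
# Toy model: U is the universe; the assignment gives NP's elements of U,
# IV's subsets of U, and TV's binary relations on U.
U = {"John", "Mary", "Bill"}

sem = {
    "John": "John", "Mary": "Mary", "Bill": "Bill",   # NP's
    "run": {"John"},                                  # IV
    "kill": {("Bill", "Mary")},                       # TV
}

def sr2(tv_rel, np):
    # SR2: ||TV + NP|| = the set of elements the relation relates to ||NP||.
    return {a for a in U if (a, sem[np]) in tv_rel}

def sr1(np, iv_set):
    # SR1: ||NP + IV|| is true iff ||NP|| belongs to ||IV||.
    return sem[np] in iv_set

print(sr1("John", sem["run"]))                  # John runs -> True
print(sr1("Bill", sr2(sem["kill"], "Mary")))    # Bill kills Mary -> True
print(sr1("Mary", sem["run"]))                  # Mary runs -> False
```
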
Thus ||kill Mary|| is the set of all those who kill Mary. Thus semantics for the above grammar are obtained by taking sets U and assignments || ||. There are many possible semantical interpretations. The nature of the interpretation depends on the richness of the fragment, the rules and categories chosen, the various distinctions required, etc. We give two examples: 1. A fragment containing the verb seek cannot be handled like we handled kill, because you kill objects of the universe but you may seek non-existent objects. 2. But even without new kinds of words, how about adding the simple rule (R3) NP → NP + NP (i.e., we want to construct John loves John (and) Mary)? What will be ||NP + NP|| (i.e., ||John (and) Mary||)? John (and) Mary is a member of NP and therefore must be assigned the same type of element as John. A simple way out is to assign to NP's
finite subsets of U. So ||John|| = the set containing one element (John himself). ||John (and) Mary|| is the set containing two elements. If x, y are NP's and z = x + y then ||z|| = ||x|| ∪ ||y||. SR3 tells you to take unions. All the other rules remain the same except that we replace U by the set of all finite subsets of U. If you look at the fragment and rules of Montague (1973), you will begin to see why the semantics of Montague (1973) is complicated. We now describe how we can introduce quantifiers into the language. Let us, for simplicity, confine ourselves to the fragment with rules R1, R2, and categories NP, IV, TV. First we add to the category NP the variable names he1, he2, he3, ... We can now form phrases that depend on unspecified names, e.g., kill he (or kill him). We don't know who him is; for different choices of he we get different sentences. We also introduce common nouns like woman, man. This is a new category CN. The semantical objects for them are subsets of U: ||man|| ⊆ U (i.e., the set of all men), etc. We add the category Q of quantifiers containing every and some and add the rule NP → Q + CN. Now we can form: every man runs or every man kills some woman or he kills every man or he kills her, etc. This is a very simple fragment. However, we need one more rule to allow us to express branching quantification! Luckily this rule was introduced by Montague to treat relative clauses. Recall that we can construct sentences with variables in them, like he runs. We indicate that a sentence S contains a variable name he by writing S(he) (a sentence with he). We want to allow the construction of CN from CN + S(he) by the use of such that, e.g., man such that he runs. We can therefore form "every woman kills some man such that he (the man) runs". In the next section we present this grammar formally and show how it can accommodate branching quantifiers.
§3.
A Grammar for Branching Quantifiers

We have the following basic categories:

1. NP (noun phrases). Contains two sets of basic phrases: N1 = {John, Mary, etc.} and N2 = {x, y, z, he1, he2, she1, she2, etc.}, the set of name variables. Besides these basic NP's we can construct more using the rules below. Morphology is neglected, i.e., run → runs, love → loves, he → him, she → her, he → she. In our examples we use the correct English form only for reasons of style. It doesn't affect the representation of branching quantifiers.
2. IV (basic ones are for example run, walk).

3. TV (basic ones are e.g., love, kill).

4. Q (contains every and some).

5. CN (basic common nouns are man, woman, sheep, etc.)

We now define some derived categories and give some rules of grammar. To do this we shall simultaneously define, for any phrase P of any category, the notion "the name variable x appears free in P". We denote this by writing P(x). The following are the clauses defining the rules and the notion.
6. If P is a basic element of any category (see (1)–(5) for a list of basic elements, such as run, kill, etc.), then x is free in P if P is x itself (as a name in NP).

7. Z is an element of the category S of sentences if it is of the form X + Y where X is an NP and Y is an IV (i.e., the rule S → NP + IV). A name variable x is free in Z if x is free in either X or Y.

8. Z is in IV if Z = X + Y with X in TV and Y in NP. A name variable is free in Z if it is free in either X or Y (i.e., the rule IV → TV + NP).

9. Z is in NP if Z = X + Y and X is in Q and Y is in CN. A name variable is free in Z if it is free in either X or Y.

10. If X is in CN and Y(x) is in IV with x free in Y and not appearing in X, then the following phrase Z is in CN: Z = X such that Y(he_n), where he_n is the first new name variable of this type not appearing in X or Y. u is a free variable of Z if u ≠ x, u ≠ he_n and u is free in X or in Y.

11. If Y(x) is in S with x free and X is in NP then the following Z is in S: (a) If X is a name variable u then take Z(u) to be Y(u), and a variable v is free in Z iff v = u or v ≠ x but v is free in Y. (b) If X is not a name variable then Z is obtained by replacing in Y(x) the first occurrence of x by X and every other occurrence of x by he_n (or optionally she_n, depending on gender), where he_n is the first name of this form not occurring in X or Y(x).
Using the notion of construction tree, Montague (1970, 1973), we can give some examples. Near the node of the tree we indicate the free variables of the phrase of that node and the rule (if in doubt) used to construct it.
(18)

every man loves some woman
   every man
      every
      man
   loves some woman
      love
      some woman
         some
         woman
Meaning: For every man there is a woman (depending on the man) whom the man loves.

(19)

every man loves some woman
   some woman
   every man loves x
      every man
         every
         man
      love x
         love
         x
Meaning: There is some woman such that every man loves her. (The woman is the same for all men.)
(20)² Every man loves some woman such that she1 kills every sheep such that he1 runs.

every man loves some woman such that she1 kills every sheep such that he1 runs
   every man
      every
      man
   love some woman such that she1 kills every sheep such that he1 runs
      love
      some woman such that she1 kills every sheep such that he1 runs
         some
         woman such that she1 kills every sheep such that he1 runs
            woman
            y kills every sheep such that he1 runs
               y
               kill every sheep such that he1 runs
                  kill
                  every sheep such that he1 runs
                     every
                     sheep such that he1 runs
                        sheep
                        he1 runs

The meaning is that the woman depends on the man (i.e., for every man x there is a woman w = w(x) such that w(x) kills every sheep that x runs).
² We are not concerned with the deletions that take us from the sentence of this example to the final surface form.
(21) (Branching quantification): Every man loves some woman (and) every sheep befriends some girl that belongs to the same club (i.e., the woman and the girl).

Z(every man, every sheep)
   every sheep
   Z(every man, u)
      every man
      Z(x, u) = some woman such that x loves she1 belongs to the same club as some girl such that u befriends she2
         some woman such that x loves she1
            some
            woman such that x loves she1
               woman
               x loves y
         belong to the same club as some girl such that u befriends she2
            girl such that u befriends she2
               girl
               u befriends v
Once we know how to express branching like (21), we can also express branching like (16) in part I. Simply form a statement Z(u) where u is a name variable (i.e., u replaces the "chiseler") and now quantify over the u. Clearly any kind of lattice can be created in this way. (We regard "belong to the same club (as)" as one unit here; we abbreviate it by "belong-". In the top nodes (with Z(x, u)) we used rule 11.) The semantics will show that the meaning of the quantifiers is branching.
Now suppose we look at the sentence "every man loves some woman and every sheep befriends some girl that belongs to the same club owned by the man". This sentence has the same tree as in (21) except that the node "belong-" should be replaced by the node "belong to the same club owned by x". This phrase has to be constructed separately (i.e., the node is really replaced by a branch constructing the above). This shows we can express branching quantification where the main verb depends on more than the last of the branching quantifiers. We now turn to the semantics for the grammar of this section. Since our fragment is smaller than Montague's (not including intensional objects), our semantics is less complicated in the sense that the semantical objects associated with elements of the basic categories need not be sets of too high a type. As we remarked in section 2, the natural simple semantics of §2 has to be changed a little to accommodate various technical difficulties. So in order to make the present semantics more transparent, we use the original semantics as our starting point. Let H be a function assigning semantical objects to the elements of the categories. H is defined as follows: Let U be a set (our universe of objects). Let H(x) ∈ U for any name variable x (by name variable we mean a variable for names). Let H(n) ∈ U for any basic name such as John or Mary. Let H(n) ⊆ U for any basic IV such as run or basic CN such as man. Let H(n) ⊆ U² for any basic TV such as love. Given such an assignment H we define our semantical interpretation. We define a semantical object ||P||H associated with any phrase P of any category constructible in the language. The definition is given by induction. || ||H is defined first for the basic elements of the categories, and then, with each syntactical rule of grammar that allows us to obtain new phrases from old,
we associate a semantical rule that allows us to obtain the semantical object associated with the newly constructed phrase. The numbers here follow the numbers of the definition of the grammar.

(S1) If n is a basic NP then ||n||H = the function f such that for any subset A ⊆ U, f(A) is a truth value and f(A) = true exactly when H(n) ∈ A.

(S2) If n is a basic IV then ||n||H = H(n).

(S3) If n is a basic TV then ||n||H is the function f such that for any function F (giving truth values to subsets of U), f yields the set f(F) = {a ∈ U | F({b ∈ U | (a, b) ∈ H(n)}) = true}.

(S4) ||some||H = the function that associates with every subset A ⊆ U the function fA on subsets of U with the property that fA(B) = true exactly when A ∩ B ≠ ∅. ||every||H = the function that associates with every subset A ⊆ U the function GA such that for any B ⊆ U, GA(B) = true iff B ⊇ A.

(S5) If n is a basic CN, then ||n||H = H(n).
To continue we need a definition. For a name variable x, let H =x H¹ if H¹ is like H except possibly for giving a different value on x, i.e., ∀y(y ≠ x → H(y) = H¹(y)).

(S7–S9) Each of the rules (7–9) has the form Z = X + Y, where Z is the new phrase obtained from X and Y. The corresponding semantical rules say ||Z||H = ||X||H(||Y||H), i.e., apply the semantical function ||X||H to the argument ||Y||H and the value is ||Z||H.

(S10) If Z = X such that Y(x) then ||Z||H = ||X||H ∩ {a ∈ U | ||Y(x)||H¹ = true, where H¹ =x H and H¹(x) = a}.

(S11) If Z is obtained by applying rule 11 to the NP X and sentence Y(x) then ||Z||H = ||X||H({a | ||Y(x)||H¹ = true, where H¹ =x H and H¹(x) = a}).

Lemma 1: Let P(x1, ..., xn) be a phrase with the only free name variables x1, ..., xn. Let H, H¹ be two semantic functions that agree on x1, ..., xn and on all the basic phrases appearing in P (such as run, kill, etc.); then ||P||H = ||P||H¹. Proof: clear, by induction.

Lemma 2: Let P be a phrase not containing the free name variable u; then no existential quantifier in P can be dependent on u. Proof: follows from Lemma 1, since by changing the value H assigns to u, ||P||H does not change.

Corollary: The quantifier of the tree (21) is branching, since "some girl" does not depend on "man" and "some woman" does not depend on "sheep". They are both constructed as different phrases (with variables x, u) in different parts of the tree.
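For the extensional core, clauses (S1)–(S4) together with the application rule (S7–S9) can be mimicked directly in Python (a sketch under an invented assignment H; none of the names below are Montague's):

```python
# Basic NP's, 'some' and 'every' all denote functions from subsets of U
# to truth values (S1, S4); (S7-S9) is then just function application.
U = {"a", "b", "c"}
H = {"man": {"a", "b"}, "run": {"a", "b", "c"}, "walk": {"c"}, "John": "a"}

def np_basic(n):
    # (S1): f(A) = true exactly when H(n) is in A
    return lambda A: H[n] in A

def some(A):
    # (S4): f_A(B) = true exactly when A meets B
    return lambda B: len(set(A) & set(B)) > 0

def every(A):
    # (S4): G_A(B) = true iff B contains A
    return lambda B: set(B) >= set(A)

# (S7): ||NP + IV|| = ||NP||(||IV||)
print(every(H["man"])(H["run"]))    # every man runs -> True
print(every(H["man"])(H["walk"]))   # every man walks -> False
print(some(H["man"])(H["walk"]))    # some man walks -> False
print(np_basic("John")(H["run"]))   # John runs -> True
```
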
(22) (Branching quantifier): Every son of some king and every daughter of some queen are friends.

First construct: y is son of x
then construct: u is daughter of v
now construct: man such that he is son of x
now construct: woman such that she is daughter of v
and: every man such that he is son of x; every woman such that she is daughter of v
and you can now construct: S(x, v) = every woman such that she is daughter of v and every man such that he is son of x are friends
now say: S(some king, v)
and finally: S(some king, some queen),

assuming that every king is a man and every queen is a woman. Also note that we are not concerned with rules yielding the final surface form.
§4. BRANCHING QUANTIFIERS IN MONTAGUE (1973)

Our grammar of §3 is a subgrammar of Montague (1973), and therefore this system can express branching quantification. The reader should note that we did not give a rigorous proof that every sentence with branching quantifiers can be expressed in our grammar. We simply gave several examples to convince the reader that this can be done.

§5. DEGREE OF COMPLEXITY OF THE MONTAGUE LANGUAGE

The language with branching quantification is stronger than the 1st order predicate calculus. In fact (due to F. Galvin) there exists a sentence of the form (23)
∀x∃y \
      > A(x, y, u, v)
∀u∃v /
that cannot be expressed in 1st order predicate calculus, as it is true in all and only the models with infinite domains. The set of valid sentences of branching quantification is not recursively enumerable (see Enderton 1970, p. 393). The fact that we can express, e.g., the above sentence in Montague grammar shows that the Montague language is stronger than 1st order predicate calculus. The reader may wonder about this, since on the face of it whatever we can express in Montague grammar can be expressed also in the predicate calculus. The difference, however, is in the semantic interpretation. Take (23). Montague grammar rewrites this essentially as A(x, F(x), u, G(u)), which is expressible in predicate logic. However, Montague semantics gives it the interpretation ∃F∃G ∀x∀u A(x, F(x), u, G(u)), which cannot be done in predicate logic.
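On a finite model the two readings can be compared by brute force, since the branching reading ∃F∃G ∀x∀u A(x, F(x), u, G(u)) just quantifies over finitely many Skolem functions. The matrix A below is a toy example of our own, chosen so that the linear reading ∀x∃y∀u∃v holds while the branching reading fails:

```python
# Linear vs. branching quantification over a two-element domain.
from itertools import product

D = [0, 1]

def A(x, y, u, v):
    return v == x          # v must copy x; y plays no role here

def linear():
    # forall x exists y forall u exists v A(x, y, u, v)
    return all(any(all(any(A(x, y, u, v) for v in D) for u in D)
                   for y in D) for x in D)

def branching():
    # exists f exists g forall x forall u A(x, f(x), u, g(u)),
    # with f a function of x alone and g a function of u alone
    fs = list(product(D, repeat=len(D)))   # all functions D -> D as tuples
    return any(all(A(x, f[x], u, g[u]) for x in D for u in D)
               for f in fs for g in fs)

print(linear())      # True: in the linear reading v may depend on x
print(branching())   # False: g(u) cannot track x
```
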
References

ENDERTON, H. B. (1970) Finite partially ordered quantifiers, Zeitschrift für Mathematische Logik und Grundlagen der Mathematik 16, 393–397.
GABBAY, D. M. (1973) Representation of the Montague-semantics as a form of the Suppes-semantics, pp. 395–412 in: Hintikka, J., Moravcsik, J., and Suppes, P. (Ed.) Approaches to Natural Language, Reidel: Dordrecht.
GABBAY, D. M. (forthcoming) Tense logics and the tenses of English, in: Moravcsik, J. (Ed.) Logic and Philosophy for Linguists, Mouton: The Hague.
GABBAY, D. M. and J. MORAVCSIK (1973) Sameness and individuation, Journal of Philosophy 70, 513–525.
HINTIKKA, J. (forthcoming) Branching quantifiers, Linguistic Inquiry.
MONTAGUE, R. (1970) English as a formal language, pp. 189–223 in: Visentini et al. (Ed.) Linguaggi nella società e nella tecnica, Edizioni di Comunità: Milano.
MONTAGUE, R. (1973) The proper treatment of quantification in ordinary English, pp. 221–242 in: Hintikka, J., Moravcsik, J., Suppes, P. (Ed.) Approaches to Natural Language, Reidel: Dordrecht.
MORAVCSIK, J. (1972) Review of G. Leech's Towards a Semantic Description of English, Language 48, 445–454.
RYLE, G. (1957) The theory of meaning, pp. 239–264 in: Mace, C. A. (Ed.) British Philosophy in the Mid-Century, George Allen & Unwin: London.
J. PH. HOEPELMAN
TENSE-LOGIC AND THE SEMANTICS OF THE RUSSIAN ASPECTS1
We consider the applicability of J. A. W. Kamp's system for S(ince) and U(ntil) in the formalization of the supposed deep structure of Russian sentences in which the aspects occur. We will see that, assuming certain expressions for the representation of the perfective and the imperfective, the consequences that are generally felt to be implied by these aspects in spoken Russian can be inferred, assuming the axioms for linear and dense time. The semantical relations between the imperfective and the perfective aspects become more clear.
Introduction
If a "natural logic" exists (Lakoff 1970), it is to be expected that a tense-logical fragment will occur in it. Even in advanced treatments such as Montague (1973), the tense-operators are those of the propositional tense-logical system Kt and its extensions. These operators, however, cannot give a proper account of the logical form of all tense-phenomena that occur in natural language. In the following we consider the drawbacks of the aforementioned operators in the treatment of the logical form of Russian sentences in which the so-called "aspects" are found. Then we will make some proposals concerning the representation of these forms by means of Kamp's system for the operators S(ince) and U(ntil). We will limit our tense-logical analysis to one standard example, the verb "zakryt'—zakryvat'", "to close". This is not due to a limitation of Kamp's system, but to difficulties in the analysis of "unbalanced" word-pairs, like "to close" and "to open", by means of Potts' operator "Δ" (Cooper 1966). To expose this would take too much room for the purpose of the present article.
¹ This article is part of the research-project "The tense-logical fragment of natural logic", supported by The Netherlands Organization for the Advancement of Pure Research. I am indebted to Prof. S. C. Dik, J. A. W. Kamp, G. Berger and E. Krabbe for their help.
I.
Until recently the study of tenses in linguistics has been more or less primitive. Most linguists treat the tenses in ways similar to those of Russell (1903, 458–476) or Jespersen (1924) and Reichenbach (1947). Prior (1967, 12), however, shows how Russell's analysis leads to a paradox in the treatment of compound tenses, and we in turn can show how Reichenbach's analysis leads to a similar paradox. In his treatment of the tenses, Reichenbach, following Jespersen, uses diagrams like figure 1.
E --- R --- S    I had seen John
S --- E --- R    I shall have seen John
E,R --- S        I saw John
E --- R,S        I have seen John

S: point of speech; R: point of reference; E: point of event.
Fig. 1
Let us now assume that the sentence "Once all speech will have come to an end" is true (cf. Prior 1967, 12). Then a finite number of utterances will have taken place. Assuming further that each utterance has a finite number of reference points, there will be a finite number of reference points. At least one of these is the last one. But Reichenbach's analysis of the sentence "there will have been a last reference point" gives a reference point that is later than the last one. A similar paradox can be constructed when the expression "now" is analysed as "contemporaneous with this utterance" (Krabbe 1972). If the analysis of tenses is related to utterances, one is forced to assume that there always were and always will be utterances, in order to avoid these problems.
II.
Of the different forms of tense-logic, the non-metric propositional ones with proposition-forming operators on propositions seem to bear the greatest formal resemblance to the tensed sentences of natural languages. J. A. W. Kamp (Kamp 1968) studies in detail the advantages of non-metric tense-logics for the treatment of tensed expressions in natural languages. We shall enumerate the axioms of standard propositional tense-logic and briefly mention the properties of the related models. The basis we choose is a system for standard propositional logic. The set of well-formed formulas is extended with those well-formed formulas which are preceded by:

F  "it will be the case that"
P  "it was the case that"
G  "it will always be the case that"
H  "it always was the case that"

plus the usual truth-functional combinations of these. "F" and "P" are undefined. "A", "B", ... are metavariables for well-formed formulas.

Def. G. GA =df ¬F¬A
Def. H. HA =df ¬P¬A

The rules of inference of propositional logic are extended with:

RG. ⊢A ⟹ ⊢¬F¬A
RH. ⊢A ⟹ ⊢¬P¬A
RM ("Mirror-image" rule). If in a thesis we replace all occurrences of F by P, and of P by F, the resulting formula is a thesis.

Ax. 1. ¬F¬(A ⊃ B) ⊃ (FA ⊃ FB)
Ax. 2. P¬F¬A ⊃ A

Ax. 1 and 2 together give the system Kt, the theses of which are valid in every model for the axioms given below. Extensions of Kt:

Ax. 3. FFA ⊃ FA (transitivity)
Ax. 4. PFA ⊃ (A ∨ FA ∨ PA) (linearity)
Ax. 5. ¬F¬A ⊃ FA (non-ending, non-beginning time)
Ax. 6. FA ⊃ FFA (denseness)

Def. D. DA =df A & GA & HA

Ax. 7. D(GA ⊃ PGA) ⊃ (GA ⊃ HGA) (completeness)
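The operators and some of the axioms can be illustrated with a small model checker over a finite strict linear order (our own sketch; a finite frame validates Ax. 1–4 but of course falsifies the non-ending time of Ax. 5 and the denseness of Ax. 6):

```python
# Instants 0 < 1 < ... < 4; a valuation for one propositional variable p.
T = range(5)
p = lambda t: t in {1, 3}

def F(phi): return lambda t: any(phi(s) for s in T if s > t)   # future
def P(phi): return lambda t: any(phi(s) for s in T if s < t)   # past
def G(phi): return lambda t: all(phi(s) for s in T if s > t)   # = not F not
def H(phi): return lambda t: all(phi(s) for s in T if s < t)   # = not P not

print(F(p)(0))      # True: p holds at instant 1 > 0
print(G(p)(3))      # False: p fails at instant 4

# An instance of Ax. 2, P(not F not A) implies A, checked at every instant:
ax2 = lambda t: (not P(lambda s: not F(lambda r: not p(r))(s))(t)) or p(t)
print(all(ax2(t) for t in T))   # True
```
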
III.
Russell offers the following definition of "change": "Change is the difference, in respect of truth or falsehood, between a proposition concerning an entity and a time T, and a proposition concerning the same entity and another time T′, provided that the two propositions differ only by the fact that T occurs in the one where T′ occurs in the other. Change is continuous when the propositions of the above kind form a continuous series, correlated with a continuous series of moments. ... Mere existence at some, but not all moments constitutes change on this definition" (Russell 1903, 469–470). This definition can, with due modifications, equally well be applied to non-metric systems. Von Wright (1963), (1965) has developed a system with a dyadic proposition-forming operator on propositions, T, by means of which four elementary transformations can be described. Clifford (1966) has pointed out that von Wright's system goes together with a discrete series of moments. T(p, q) means "p now, and q the next moment". If p represents the proposition "the window is open", then T(p, ¬p) describes the transformation of a world in which a window is open into a world in which it is closed. T(¬p, p) describes the reverse, T(p, p) describes the staying open of the window, and T(¬p, ¬p) its staying closed. Agreeing with Russell's definition we can say that only T(p, ¬p) and T(¬p, p) describe changes. Anscombe has given an operator Ta, which Prior (1967, 70) has defined as follows: Def. Ta: Ta(A, B) =df P(PA & B) ∨ (PA & B). Ta(p, q) may be called "p and then q". Def. Ta can be given for any of the axiom systems given above, and so does not presuppose discrete time. Ta(p, ¬p) and Ta(¬p, p) describe changes as well, but do not preclude the possibility of there having been more changes in between.
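Prior's definition of Ta can be run over the same kind of finite linear frame as before (again a sketch of our own; the window valuation is invented):

```python
# Ta(A,B) = P(PA & B) v (PA & B): at some point, past or present, B held
# while A had already held strictly earlier.
T = range(5)

def P(phi): return lambda t: any(phi(s) for s in T if s < t)

def Ta(a, b):
    inner = lambda t: P(a)(t) and b(t)     # B now, A somewhere before
    return lambda t: P(inner)(t) or inner(t)

open_ = lambda t: t < 2        # window open at instants 0, 1
closed = lambda t: t >= 2      # window closed from instant 2 on

print(Ta(open_, closed)(3))    # True: open, and then closed
print(Ta(closed, open_)(3))    # False: it was never open after being closed
```
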
IV.
In Russian there are in general two verbal forms corresponding to one English verb, so for instance the verb "to close", which in Russian is represented by the two forms "zakryt'" and "zakryvat'". These two forms are referred to as "perfective" and "imperfective" respectively (if necessary we will indicate the perfectivity or imperfectivity of a form by the superscripts p and i). It has for a long time been thought that the aspects are a feature characterising only the Slavic languages, but recent studies show that they can be assumed in the basis of other languages as well, e.g. in Dutch; cf. (Verkuyl 1971). Aspectual differences, however, are expressed very systematically in the morphology of the Slavic languages.
There is considerable disagreement among linguists as to the meaning of the two Russian aspects and their different functions. The great "Grammatika Russkogo Jazyka" tries to cover all their basic meanings (Forsyth 1970, 3): "The category of aspect indicates that the action expressed by the verb is presented: (a) in its course, in process of its performance, consequently in its duration or repetition, e.g. zit', pet', rabotat', chodit', citat', ... (imperfective); (b) as something restricted, concentrated at some limit of its performance, be it the moment of origin or beginning of the action, or the moment of its completion or result, e.g. zapet', koncit', pobezat', propet', prijti, uznat', ujti, ... (perfective)". Forsyth tries to define the difference between the perfective and the imperfective by means of Jakobson's concept of "privative opposition": "A perfective verb expresses the action as a total event, summed up with reference to a single specific juncture. The imperfective does not inherently express the action as a total event summed up with reference to a single specific juncture" (Forsyth 1970, 7). The Dutch slavist Barentsen too uses the term "limit" to define the meaning of the perfective and says further: "The perfective points out that in the Narrated Period two contrasting parts are distinguished, of which one contains the Contrast Orientation Period. The element NONPERF points out that in the Narrated Period no two contrasting parts are distinguished". Furthermore, analysing the meaning of the perfective and imperfective forms "otkryt'p—otkryvat'i" ("to open"), he states: "The notion of contrast ... asks for explanation. Let us consider the following example: There exists a situation "the window is closed". After some time a change is observed: the window is now open. This means that a transition has taken place from one situation into another" (Barentsen 1971, 10; translation mine, J.H.).
The similarity of this definition (and many others could be adduced) to Russell's definition of change is easily seen. Both the imperfective and the perfective forms can describe a change, but whereas the imperfective past form "zakryvalas'" ("closed") in the sentence "dver' zakryvalas'" ("the door closed/the door was closing") does not necessarily mean that the door was ever closed, the perfective past form "zakrylas'" ("closed") in "dver' zakrylas'" ("the door closed"/"the door is (was) closed") does mean that the "limit" of closing (which in a complete, i.e. continuous, series of moments is the first moment of the door being closed, the first moment at which the sentence "the door is closed" is true) is attained. In other words, the imperfective form may describe a change like "becoming more and more closed" while the door is open, whereas the perfective form describes not only this change, but also what may be called the result of this change: the fact that the door was really closed for the first time. The attainment of this result is an event without duration, which may be called an "instantaneous event", cf. (Anscombe 1964, 17).
V.
Let us now assume that we have attached a system for predicate logic to the systems of propositional tense-logic given above. We express predicate constants by "m1", "m2", ..., predicate variables by "f1", "f2", ..., individual constants by "a", "b", ..., individual variables by "x", "y", "z", ... In this article we will only consider one-place predicates. Now we can express that an individual, a, gets a quality (e.g. "closed") for the first time:

1. H¬m1a & m1a
Clearly (1) is too strong to represent the meaning of "dver' zakrylas'p" ("the door closed"). Neither the Russian nor the English sentence implies that the door has never been closed before. What we want to express is that for some period the door became less and less open and was closed finally. If we try to express this in the propositional tense-logic given above, the best we can get is:

2. HF¬m1a & m1a
In dense time HF-in^a is true if -11%a was true during some interval until now. But it is also possible that HF-in^a is true because in^a is true in the future—as can easily be inferred from ax. 2 and ax. 5. Even if we therefore stipulate that Gm1a, HF-imja can in a dense series be verified by a "fuzz": if between any past moment of -in^a's truth and the present, however close, there is a moment of -i mj a's falsehood and conversely, cf. (Prior 1967,108). A second difficulty is, that the standard predicate logical systems do not enable us to relate the result of an event to the event itself, so that we cannot distinguish between an event that stops because its result is attained and an event that stops without its result being attained: 3.
is true when a in the past gradually became m₁ and finally was (or is) m₁, as well as when a "jumped" to m₁. On the other hand, if we have an expression "Φ(m₁a)" to represent the imperfective verb "zakryvat'"—"to close (gradually)",
(4) would be true if a stopped closing gradually without finally being closed, as well as when this result was indeed attained.
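Formula (1) and the operators H ("it has always been the case that") and P ("it has at some time been the case that") can be made concrete in a toy model. The sketch below assumes a finite, discrete series of instants, whereas the article works with dense (and complete) time orders; it is meant only to illustrate how (1) picks out the first moment at which a predicate holds.

```python
# Toy discrete-time model for the Priorean operators H and P and for
# formula (1): H(not m1 a) & m1 a, "a is closed now and was never
# closed before". A finite list of instants stands in for the dense
# time order of the article (an assumption of this sketch).

def H(prop, t):
    """It has always been the case that prop (at every instant before t)."""
    return all(prop(s) for s in range(t))

def P(prop, t):
    """It has at some time been the case that prop (at some instant before t)."""
    return any(prop(s) for s in range(t))

# Degree of closedness of the door a at instants 0..5:
door = [0.0, 0.3, 0.7, 1.0, 1.0, 0.4]
closed = lambda t: door[t] == 1.0

# Formula (1): the door is closed now, and was never closed before.
first_time_closed = lambda t: H(lambda s: not closed(s), t) and closed(t)

print([t for t in range(len(door)) if first_time_closed(t)])  # -> [3]
```

On this timeline (1) holds exactly at instant 3, the first moment of the door being closed; at instant 4 the door is still closed, but the conjunct H¬m₁a fails.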
VI.
To express the concept of gradual becoming Potts (1969) has devised a system of rules of natural deduction. His main rule we use here as an axiom². If p stands for "x is m₁", then Δp stands for "x becomes m₁".
Ax. P1. Δf₁x ⊃ ¬f₁x
If we substitute "a is closed" for f₁x in P1 we get: "if a becomes closed, a is not closed". Contraposition gives: "if a is closed, it doesn't become closed". We attach Potts' operator, with P1 as an axiom, to the system of predicate logic we choose.
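Potts' axiom P1 can be illustrated in a small numeric model of gradual becoming. The encoding of Δ as "the degree of closedness is strictly increasing" is an assumption of this sketch, not part of Potts' system of natural-deduction rules.

```python
# A numeric model of "gradual becoming": degrees of closedness in
# [0, 1], with 1.0 meaning "closed". Delta(closed) is modelled as
# "the degree is strictly increasing at this instant" (our encoding,
# for illustration only). Axiom P1, Delta p -> not p, then holds
# because a degree that is still increasing cannot already be at
# the limit.

door = [0.0, 0.2, 0.5, 0.9, 1.0, 1.0]

def closed(t):
    return door[t] == 1.0

def becoming_closed(t):
    """Delta(closed) at t: the degree strictly increases from t to t+1."""
    return t + 1 < len(door) and door[t] < door[t + 1]

# Axiom P1: if a becomes closed, a is not (yet) closed.
assert all(not closed(t) for t in range(len(door)) if becoming_closed(t))
# Contraposition: if a is closed, it does not become closed.
assert all(not becoming_closed(t) for t in range(len(door)) if closed(t))
print("P1 holds in this model")
```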
VII. We still cannot express that a proposition, p, was true during a certain interval of time. To express intervals in non-metric systems Kamp (1968) has developed the dyadic proposition-forming operators on propositions S(ince) and U(ntil). S(p, q) means: "Since a time at which p was true, q was true until now", and U(p, q) means: "From now on, q will be true until a time at which p is true". Kamp has proved the functional completeness of S and U (Kamp 1968). We give some expressions defined in terms of S and U (Prior 1967, 106f.):
PA =df S(A, A ⊃ A): "A ⊃ A has been true since A was the case".
FA =df U(A, A ⊃ A): "A ⊃ A will be true until A is the case".
H′A =df S(A ⊃ A, A): "A has been the case since A ⊃ A, i.e. A has been true for some period until now".
G′A =df U(A ⊃ A, A): "A will be true until A ⊃ A will be true, i.e. A will be true during some period from now".
P′A =df ¬H′¬A: "There can be found no interval, however short, stretching from the past until now, during which ¬A is uninterruptedly true".
F′A =df ¬G′¬A.
Eq. From ⊢ A ≡ A′ infer ⊢ B ≡ B′, if A is a subformula of B and B′ results from replacing an occurrence of A in B by A′.
RM. Mirror-image rule for S and U (cp. Sect. II).
Def. P, H′, P′, F, G′, F′, as given above.
The axioms 1–6 correspond to linear time, the axioms 1–7, 8 to linear non-beginning, non-ending time, axiom 9 to dense time, axiom 10 to complete time. Furthermore we assume that we have attached a system for predicate logic, extended with Ax. P1 (cp. Sect. VI), to I and II.
VIII.
6.
H′Δm₁a & m₁a
is true if a is closed now for the first time, after becoming more and more closed during some period. We can prove¹ that
¹ Throughout, proofs of theorems and lemmas can be found in the Appendix.
7. H′p ⊃ Pp
and thus 8.
H′Δf₁x & f₁x ⊃ P¬f₁x & f₁x
can be inferred from axioms 1–6, 9, for dense time (cp. Sect. VII). From P¬m₁x & m₁x—"x wasn't closed and is now closed"—we can, by means of PL., infer Ta(¬m₁x, m₁x), i.e. the contrast that, according to Forsyth, Barentsen and other grammarians, is implied by a sentence like "dver' zakrylas'ᵖ"—"the door closed", and that, according to Russell, can be used to define change. But because 9.
H′Δm₁x & m₁x
is stronger than P¬m₁x & m₁x, we are now able to express formally the difference between the proposition that a door, a, was closing during some period until now and is now, indeed, closed for the first time, and the proposition that a was closing during some period until now, still being open at the present moment. The former was expressed by (9), the latter we can express by 10.
H′Δm₁a & ¬m₁a.
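The contrast between (9) and (10) can be made concrete with discrete stand-ins. Since H′ and Δ only behave as intended in dense time, "was closing during some period until now" is approximated below by a strictly increasing degree over the immediately preceding instants; both encodings are assumptions of this sketch, not the article's system.

```python
# Discrete stand-ins for the contrast between
#   (9)  H'(Delta m1 x) & m1 x      -- closing for a while, now closed --
#   (10) H'(Delta m1 a) & not m1 a  -- closing for a while, still open.
# "Was closing" is approximated by a strictly increasing degree over
# the last `span` instants (our simplification of the dense original).

def was_closing(degrees, t, span=2):
    """Degree strictly increased throughout the last `span` steps before t."""
    return t >= span and all(degrees[u] < degrees[u + 1]
                             for u in range(t - span, t))

def closed(degrees, t):
    return degrees[t] == 1.0

d1 = [0.0, 0.4, 0.8, 1.0]   # the door closes completely
d2 = [0.0, 0.3, 0.5, 0.7]   # the door is closing but never closes

# The (9)-analogue holds of d1 at its last instant ("dver' zakrylas'"),
# the (10)-analogue of d2 ("dver' zakryvalas'", the door still open):
assert was_closing(d1, 3) and closed(d1, 3)
assert was_closing(d2, 3) and not closed(d2, 3)
print("ok")
```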
Although it is possible that a Russian sentence with a perfective-past verb form refers to present time, as in the following examples, this is not often the case: 11.
Umer — vskriknul Kukuskin, brosajas' na koleni u ego krovati — Umer. He is dead/he died — shouted Kukuskin falling on his knees at his bed — He is dead/he died. (B.N. Polevoj, Ex. Russ. Synt. II, 300.)
12.
My pogibli We are lost (Bondarko, 1967, 99)
13.
I teper', poborov otorop', ja resil . . . And now, having fought my shyness, I decided to ... (A. Terc. Pchenc.)
Most Russian sentences with perfective-past verb forms, however, refer to past time. There are, moreover, examples of sentences in which the perfective-past verb is ambiguous with respect to time: 14.
Kogda my prisli, oni usli When we arrived, they had (already) gone/When we arrived, they just went away. (Forsyth 1970, 68.)
To (14) we can ascribe the following structure: 15.
P(Φ(f₁xₙ) & (PΦ(f₂xₘ) ∨ Φ(f₂xₘ))), where Φ(f₁xₙ) represents a formula in which f₁ occurs.
So we can assume that
16. H′Δf₁x & f₁x,
17. P(H′Δf₁x & f₁x)
as well as
18. (H′Δf₁x & f₁x) ∨ P(H′Δf₁x & f₁x)
are represented by forms like "zakrylas'" in surface structure. It is equally possible to assume that only (18) is represented by forms like "zakrylas'" in surface structure, because (18) is implied by (16) as well as by (17), which occur on some supposed deeper level, without direct representation in the surface structure. By means of the λ-operator (cf. Carnap 1959, 82ff., 129ff.) we define a predicate-forming operator on predicates, ρ, such that ρf₁x is true if and only if H′Δf₁x & f₁x is true:
Def. ρ.: ρf₁x ≡ H′Δf₁x & f₁x
From Qa(Δf₁x, f₁x) we infer by P1, Lemma 9 and PL.
32. Qa(¬f₁x, f₁x),
and so from (29) and (31) by Syll.: 33.
So, if we assume that Fρf₁x occurs in the deep structure of Russian sentences with perfective-present verb-forms, the contrast of Barentsen, Forsyth and Russell, mentioned previously, can be inferred for linear, dense time. Furthermore, as we saw, U(f₁x, Δf₁x) ∨ F(U(f₁x, Δf₁x)) can be inferred from Fρf₁x. From U(f₁x, Δf₁x) ∨ FU(f₁x, Δf₁x) we can infer
34. G′Δf₁x ∨ FG′Δf₁x
by Lemma 4 and Lemma 9. Conversely, from (34) we cannot infer Fρf₁x. Assuming that (34) occurs in the deep structure of Russian sentences with the imperfective correlate of perfective verbs with present-tense endings, e.g. "budet zakryvat's'a"—"will be closing"—then the situation described above again corresponds to that of Russian:
35. *Dver' zakroets'aᵖ, no ne budet zakryvat's'aⁱ
The door will close, but it will not be closing
is unacceptable—this, as previously pointed out, for a logical reason—, while 36.
Dver' budet zakryvat's'aⁱ, no ne zakroets'aᵖ
The door will be closing, but it will not be closed
is perfectly acceptable.
X.
A negated perfective verb can often be replaced by the negated corresponding imperfective verb (Forsyth 1970,102f.). This possibility is also accounted for by our assumed deep-structure for perfective verbs. 37.
ρf₁x ≡ H′Δf₁x & f₁x (by Def. ρ.)
If one of the conjuncts of the right member of (37) is negated, then ρf₁x is not true. Thus, if we assume H′Δf₁x to occur in the deep structure of imperfective verbs, the negation of H′Δf₁x suffices for the negation of ρf₁x. On the other hand, as we saw, the negation of a perfective verb can mean that the result of the event described by the verb, i.e. f₁x, is not attained, while nevertheless H′Δf₁x was the case: 38.
Ja dolgo ubezdalⁱ prepodavatel'nicu, cto v etom net nikakogo anarchizma, naoborot.—Ubezdalⁱ, no ne ubedilᵖ.
I tried for a long time to convince the teacher that this was not a manifestation of anarchism, on the contrary. I tried to convince her, but I didn't succeed. (Erenburg. L'udi, gody, zizn'. From Forsyth 1970, 104.)
XI.
We have already assumed that expressions in which H′Δf₁x occurs play a role in the deep structure of Russian sentences with imperfective-past verb forms. We have seen that these forms are implied by the postulated expressions for the deep structure of perfective forms, so that it is impossible to assert the perfective form while denying the imperfective one. It is, however, not possible to replace the perfective form by the imperfective one in all contexts. Perfective forms are required when the verb has the function of a perfect and when a series of successive events is described (Forsyth 1970, 92f.). We will try to find a formal expression for these contexts. The perfect meaning of a Russian perfective verb form expresses that a situation, of which the beginning is described by the perfective-past verb, has existed up to and including now.
39. On pol'ubilᵖ ee
He fell in love with her (and still is in love with her)/He loves her.
40. Moroz snova krepkij—podulᵖ severnyj veter
It's hard frost again (because) the north wind has got up. (Erenburg. Ottepel')
41. Ja zabylᵖ, gde on zivet
I forget where he lives
42. On s uma soselᵖ
He is mad
(examples from Forsyth 1970, loc. cit.)
As a formal expression of this perfect (for the group of verbs considered here) we propose: 43.
f₁x & S(ρf₁x, f₁x)
e.g.: "the door is closed now, and has been closed since it became closed". When perfective verbs are used to describe a series of successive events, each perfective verb describes an event that takes place in a new situation, the beginning of which is described by the preceding perfective verb. This situation can continue to exist after the new event has started, but it is equally possible that it ceases therewith. 44.
D'akon vstalᵖ, odels'aᵖ, vz'alᵖ svoju tolstuju sukovatuju palku i ticho vyselᵖ iz domu
The deacon got up, dressed, took his thick, rough stick and quietly left the house (Cechov, Duel'. Forsyth 1970, 65).
45.
On otkrylᵖ dver', vyselᵖ, i zaperᵖ ee op'at'
He opened the door, went out, and closed it again (Forsyth 1970, 9).
As a general formal expression of such a sequence we propose: 46.
P(Φ(xₙ₊₁) & S(ρfₙxₙ & S(ρfₙ₋₁xₙ₋₁ & S(. . . & S(ρf₁x₁, f₁x₁) . . .), fₙ₋₁xₙ₋₁), fₙxₙ)) ∨ (Φ(xₙ₊₁) & S(ρfₙxₙ & S(ρfₙ₋₁xₙ₋₁ & S(. . . & S(ρf₁x₁, f₁x₁) . . .), fₙ₋₁xₙ₋₁), fₙxₙ))
xⱼ may be identical with xⱼ₊₁. For the future we replace in (46) all occurrences of P by F, and of S by U, in accordance with RM. The formulation of (46) as a disjunction allows us to consider sequences of events of which the last one took place in the past, as well as sequences of events of which the last one takes place in the present (and which eventually goes on in the future). The presence of the expression Φ(xₙ₊₁) allows for the possibility of an interruption or termination of the sequence of perfectives by an imperfective expression, as is often the case in Russian: 47.
Cto i govorit', eto bylo ne samoe obrazcovoe otdelenie. Proderzaliᵖ nas tam minut sorok—kuda-to zvoniliⁱ, vyjasn'aliⁱ, trebovaliⁱ fotoplenku—i tol'ko posle aktivnych nasich ubezdenij . . . i dopol'nitel'nych zvonkov nas otpustiliᵖ i daze izvinilis'ᵖ.
I must say it wasn't a model police-station. They held us there for about forty minutes while they made phone-calls, asked questions, demanded the film from the camera. And it was only after our active persuasions . . . and further phone-calls that they let us go, and even apologised. (V. Nekrasov. Po obe storony okeana. Forsyth 1970, 65.)
We see now that (43) is a special case of the second member of the disjunction (46).
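The perfect reading (43), f₁x & S(ρf₁x, f₁x), can be checked with a discrete evaluator for Kamp's S. The stand-in used for ρ below ("became closed": closed now, not closed at the previous instant) is a simplification of ours, since the article's ρ involves gradual becoming in dense time; the sequence schema (46) nests S-expressions of exactly the same shape.

```python
# Kamp's S over a finite discrete timeline, and the perfect reading
# (43): f1 x & S(rho f1 x, f1 x) -- "the door is closed now, and has
# been closed since it became closed". The discrete stand-in for rho
# is our simplification, for illustration only.

def S(p, q, t):
    """Since a time at which p was true, q has been true until now."""
    return any(p(s) and all(q(u) for u in range(s + 1, t))
               for s in range(t))

door = [False, False, True, True, True]   # closed?
closed = lambda t: door[t]

# Stand-in for rho: "a became closed at t".
became_closed = lambda t: closed(t) and t > 0 and not closed(t - 1)

# Formula (43): closed now, and closed ever since it became closed.
perfect = lambda t: closed(t) and S(became_closed, closed, t)

assert perfect(4)        # the perfect reading holds at the last instant
assert not perfect(1)    # and fails before the door has closed
print("ok")
```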
XII.
Besides the expression of a gradual change (and certain other functions), the imperfective forms of Russian verbs can have two functions that stand in a relationship to one another, and to the perfect meaning of perfective verbs. The first is the expression of a "two-way action", the other that of a repeated action. The imperfective verb describing a two-way action stipulates that the situation which came into being through the action described by the verb no longer exists. This function of the imperfective thus contrasts with the perfect meaning of perfective-past verbs.
48.
Vojd'a v komnatu on skazal tovariscu—Kak zdes' dusno! Ty by chot' otkrylᵖ okno.—Da ja ego nedavno otkryvalⁱ.
When he entered the room he said to his friend: "How stuffy it is in here! You might at least have opened the window". "But I did open it (have it open) not long ago". (Forsyth 1970, 78.)
49.
Prochod'a mimo nee on snimalⁱ sl'apu.
As he passed her he raised (i.e. took off and put on again) his hat.
Compare (49) to (50), in which the corresponding perfective form of snimat'ⁱ, sn'alᵖ, occurs: 50.
Vstretiv ee, on sn'alᵖ sl'apu i skazal . . .
When he met her he took off his hat and said (still with his hat off) . . . (Forsyth 1970, loc. cit.)
In our formalism, and for the group of verbs considered here, we can express this meaning of the imperfective as follows: 51.
(Pρf₁x & H′¬f₁x) ∨ P(Pρf₁x & H′¬f₁x)
H′¬f₁x implies ¬S(ρf₁x, f₁x) in dense, linear time: 52.
H′¬f₁x ⊃ ¬S(ρf₁x, f₁x)
From ¬S(ρf₁x, f₁x) we infer by PL. 53.
¬(f₁x & S(ρf₁x, f₁x))
So we can infer from (51) by PL., Lemma 9 and RM: 54.
P(Pρf₁x & ¬(f₁x & S(ρf₁x, f₁x))) ∨ (Pρf₁x & ¬(f₁x & S(ρf₁x, f₁x))).
(53) is the negation of (43), which we proposed as the formal expression of the perfect meaning of the perfective-past. A repetition of a proposition, p, being true at different moments can, in Anscombe's formalism, be expressed as follows:
Ta(Ta(p, ¬p), p), . . . etc.
or
Ta(Ta(Ta(¬p, p), ¬p), . . .), . . . etc.
Pρf₁x & H′¬f₁x implies Ta(Ta(¬f₁x, f₁x), ¬f₁x), i.e. a repetition of ¬f₁x. 55.
Pρf₁x & H′¬f₁x ⊃ Ta(Ta(¬f₁x, f₁x), ¬f₁x)
This means that, if we assume that (51) occurs in the deep structure of Russian sentences with imperfective-past forms, such as "zakryvat's'a"—"to close"—which denote a "two-way action", then we can infer the repetition that, as may appear from the examples, is implied by this function of the imperfective, given the axioms for linear, dense time.
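The repetition inferred in (55) can be checked in the same discrete setting. In the Appendix, the step to Ta(Ta(¬f₁x, f₁x), ¬f₁x) goes through P(¬f₁x & P(f₁x & P¬f₁x)); the sketch below evaluates that nested past formula ("not-p, before which p, before which not-p, all in the past") on a two-way-action timeline. The finite timeline is an assumption for illustration.

```python
# The nested past formula P(!p & P(p & P !p)) behind the repetition
# Ta(Ta(!p, p), !p) of (55), evaluated over a finite discrete timeline.

def Pop(prop, t):
    """Priorean past operator: prop held at some instant before t."""
    return any(prop(s) for s in range(t))

# A "two-way action" as in (48): the window was opened and closed again.
window_open = [False, True, True, False, False]
is_open = lambda t: window_open[t]

def repetition(t):
    """P(!p & P(p & P !p)) with p = "the window is open"."""
    inner = lambda u: is_open(u) and Pop(lambda v: not is_open(v), u)
    middle = lambda s: (not is_open(s)) and Pop(inner, s)
    return Pop(middle, t)

assert repetition(4)      # by t = 4 the window has been open and shut again
assert not repetition(2)  # not yet, while it is still open
print("ok")
```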
XIII.
The other function of imperfective verb forms we mentioned was the expression of repeated action (the iterative). An imperfective verb that expresses an iterative can be considered as a repetition of perfectives: 56.
Kazdyj den' on vypivalⁱ pered obedom r'umku vodki
Every day before lunch he drank a glass of vodka
i.e.: v ponedel'nik vypilᵖ, vo vtornik vypilᵖ, . . .
on Monday he drank one, on Tuesday he drank one . . . (Forsyth 1970, 164).
57.
Kazdyj den' on zakryvalⁱ okno
He closed the window every day
i.e.: v ponedel'nik zakrylᵖ, vo vtornik zakrylᵖ . . .
he closed it on Monday, on Tuesday . . . etc.
We can infer the repetition Ta(Ta(f₁x, ¬f₁x), f₁x) without any new axioms if we express this function of the imperfective by a repetition of a perfective verb: 58.
P(ρf₁x & Pρf₁x) ∨ (ρf₁x & Pρf₁x) ⊃ Ta(Ta(f₁x, ¬f₁x), f₁x)
Forms of the future tense of imperfective verbs can also have an iterative meaning: 59.
Kazdyj den' on budet vypivat'ⁱ pered obedom r'umku vodki
Every day before lunch he shall drink a glass of vodka
i.e.: v ponedel'nik vyp'etᵖ, vo vtornik vyp'etᵖ . . .
on Monday he shall drink one, on Tuesday he shall drink one, . . .
60.
Kazdyj den' on budet zakryvat'ⁱ okno
Every day he shall close the window
i.e.: v ponedel'nik zakroetᵖ, vo vtornik zakroetᵖ, . . . he shall close it on Monday, on Tuesday, . . .
We can infer that Qa(Qa(f₁x, ¬f₁x), f₁x), if we represent this meaning of the imperfective future by a future repetition of perfectives, i.e. if a perfective will twice or more be the case. 61.
F(ρf₁x & Fρf₁x) ⊃ Qa(Qa(f₁x, ¬f₁x), f₁x)
As to the semantical relationship between the perfective and the imperfective (of the group of verbs considered) these results mean that we can suppose that in surface structure an imperfective verb form occurs when in the deep structure an expression occurs that implies ¬f₁x, P¬f₁x or F¬f₁x, but not "¬f₁x and then f₁x", or a repetition of f₁x, or of ¬f₁x, whereas the perfective occurs in the surface structure when in the deep structure an expression occurs that implies "¬f₁x and then f₁x", but not a repetition of f₁x, or of ¬f₁x.
APPENDIX
Proofs of theorems.
Proof of 8. H′Δf₁x & f₁x ⊃ P¬f₁x & f₁x:
Lemma 1. S(p, q) ⊃ S(p, q ∨ ¬q)
Proof of Lemma 1.:
1. ¬(S(p, q) & S(p, q ∨ ¬q)) ≡ ¬(S(p & p, q & (q ∨ ¬q)) ∨ S(p & (q ∨ ¬q) & S(p, q ∨ ¬q), q & (q ∨ ¬q)) ∨ S(p & q & S(p, q), q & (q ∨ ¬q))) (PL, ax. 2, Subst.)
2. ¬(S(p, q) & S(p, q ∨ ¬q)) ≡ ¬(S(p, q) ∨ S(p & S(p, q ∨ ¬q), q) ∨ S(p & q & S(p, q), q)) (1. PL, Eq.)
3. ¬(S(p, q) & S(p, q ∨ ¬q)) ≡ ¬S(p, q) & ¬S(p & S(p, q ∨ ¬q), q) & ¬S(p & q & S(p, q), q) (2. DeM.)
4. ¬(S(p, q) & S(p, q ∨ ¬q)) ⊃ ¬S(p, q) (3. PL, Sep.)
5. S(p, q) ⊃ S(p, q) & S(p, q ∨ ¬q) (4. Contrap.)
6. S(p, q) & S(p, q ∨ ¬q) ⊃ S(p, q ∨ ¬q) (PL)
7. S(p, q) ⊃ S(p, q ∨ ¬q) (5, 6, Syll.)
Lemma 2. H(p ⊃ q) ⊃ (H′p ⊃ H′q)
Proof of Lemma 2.:
1. ¬S(p ∨ ¬p, ¬q) & S(p ∨ ¬p, p) ⊃ S(¬q & p & S(p ∨ ¬p, p), p) (ax. 4, Sep.)
2. S(¬q & p & S(p ∨ ¬p, p), p) ⊃ S(¬q & p, p) & S(S(p ∨ ¬p, p), p) (1., ax. 2, PL)
3. ¬S(p ∨ ¬p, ¬q) & S(p ∨ ¬p, p) ⊃ S(¬q & p, p) (1, 2, Syll., Sep.)
4. S(¬q & p, p) ⊃ S(¬q & p, ¬p ∨ p) (Lemma 1., Subst.)
5. ¬S(p ∨ ¬p, ¬q) & S(p ∨ ¬p, p) ⊃ S(¬q & p, p ∨ ¬p) (3, 4, Syll.)
6. ¬S(¬(¬p ∨ q), p ∨ ¬p) ⊃ ¬S(p ∨ ¬p, p) ∨ S(p ∨ ¬p, q) (5., Contrap., DeM.)
7. H(p ⊃ q) ⊃ (H′p ⊃ H′q) (6., Def. H, Def. H′, Def. ⊃)
Lemma 3. H′p ⊃ Pp.
Proof of Lemma 3.:
1. (¬S(p ∨ ¬p, ¬(p ∨ ¬p)) & S(p ∨ ¬p, p)) ≡ (S((p ∨ ¬p) & p & S(p ∨ ¬p, p), p) & ¬S(p ∨ ¬p, ¬(p ∨ ¬p))) (Ax. 4.)
2. S(p ∨ ¬p, p) ≡ S((p ∨ ¬p) & p & S(p ∨ ¬p, p), p) (1. Ax. 9, Eq., PL)
3. S(p ∨ ¬p, p) ≡ S(p & S(p ∨ ¬p, p), p) (2. PL, Eq.)
4. S(p & S(p ∨ ¬p, p), p) ⊃ S(p, p) & S(S(p ∨ ¬p, p), p) (ax. 2, PL)
5. S(p ∨ ¬p, p) ⊃ S(p, p) & S(S(p ∨ ¬p, p), p) (3, 4, Syll.)
6. S(p ∨ ¬p, p) ⊃ S(p, p) (5. Sep.)
7. S(p, p) ⊃ S(p, p ∨ ¬p) (Lemma 1.)
8. S(p ∨ ¬p, p) ⊃ S(p, p ∨ ¬p) (6, 7, Syll.)
9. H′p ⊃ Pp (8. Def. H′, P)
Proof of 8. H′Δf₁x & f₁x ⊃ P¬f₁x & f₁x:
1. Δf₁x ⊃ ¬f₁x (Ax. P1)
2. H(Δf₁x ⊃ ¬f₁x) (1. Ax. 1, ax. 6)
3. H′Δf₁x ⊃ H′¬f₁x (2. Lemma 2., MP.)
4. H′¬f₁x ⊃ P¬f₁x (Lemma 3.)
5. H′Δf₁x ⊃ P¬f₁x (3, 4, Syll.)
6. H′Δf₁x & f₁x ⊃ P¬f₁x & f₁x (5. PL) Q.E.D.
Proof of 28. F(q & H′p) ⊃ FU(q, p) ∨ U(q, p):
1. U(q & S(p ⊃ p, p), p ⊃ p) ≡ (U(U(q, p) & (p ⊃ p), p ⊃ p) ∨ ((p ⊃ p) & U(q, (p ⊃ p) & p)) ∨ (S(p ⊃ p, p) & p & U(q, (p ⊃ p) & p))) (ax. 3)
2. U(q & S(p ⊃ p, p), p ⊃ p) ≡ (U(U(q, p), p ⊃ p) ∨ U(q, p) ∨ (S(p ⊃ p, p) & p & U(q, p))) (1. PL, Eq.)
3. U(q & S(p ⊃ p, p), p ⊃ p) ⊃ (U(U(q, p), p ⊃ p) ∨ U(q, p) ∨ U(q, p)) (2. PL, Eq.)
4. U(q & S(p ⊃ p, p), p ⊃ p) ⊃ (U(U(q, p), p ⊃ p) ∨ U(q, p)) (3. PL.)
5. F(q & H′p) ⊃ FU(q, p) ∨ U(q, p) (4. Def. F, Def. H′) Q.E.D.
Proof of 30. (U(p, q) ∨ FU(p, q)) ⊃ Qa(q, p):
Lemma 4. U(p, q) ⊃ U(p ∨ ¬p, q)
Proof of Lemma 4.:
1. ¬U(p ∨ ¬p, q) ≡ ¬U(p, q) & ¬U(¬p, q) (Ax. 5, DeM.)
2. ¬U(p ∨ ¬p, q) ⊃ ¬U(p, q) (1. PL., Sep.)
3. U(p, q) ⊃ U(p ∨ ¬p, q) (2. Contrap.)
Lemma 5. G′p & G′q ⊃ G′(p & q)
Proof of Lemma 5.:
1. U(p ∨ ¬p, p) & U(p ∨ ¬p, q) ≡ (U((p ∨ ¬p) & (p ∨ ¬p), p & q) ∨ U((p ∨ ¬p) & q & U(p ∨ ¬p, q), p & q) ∨ U((p ∨ ¬p) & p & U(p ∨ ¬p, p), p & q)) (ax. 2, Eq.)
2. U(p ∨ ¬p, p) & U(p ∨ ¬p, q) ⊃ (U(p ∨ ¬p, p & q) ∨ U((p ∨ ¬p) & q & U(p ∨ ¬p, q), p & q) ∨ U((p ∨ ¬p) & p & U(p ∨ ¬p, p), p & q)) (1. Lemma 4, PL.)
3. U(p ∨ ¬p, p) & U(p ∨ ¬p, q) ⊃ U(p ∨ ¬p, p & q) ∨ U(p ∨ ¬p, p & q) ∨ U(p ∨ ¬p, p & q) (2. Eq.)
4. U(p ∨ ¬p, p) & U(p ∨ ¬p, q) ⊃ U(p ∨ ¬p, p & q) (3. PL.)
5. G′p & G′q ⊃ G′(p & q) (4. Def. G′, Eq.)
Lemma 6. Fp ⊃ G′Fp
Proof of Lemma 6.:
1. (¬U(p ∨ ¬p, ¬U(p, p ∨ ¬p)) & U(p, p ∨ ¬p)) ≡ (U(¬U(p, p ∨ ¬p) & (p ∨ ¬p) & U(p, p ∨ ¬p), p ∨ ¬p) & ¬U(p ∨ ¬p, ¬U(p, p ∨ ¬p))) (Ax. 4, Subst.)(RM.)
2. (¬U(p ∨ ¬p, ¬U(p, p ∨ ¬p)) & U(p, p ∨ ¬p)) ⊃ U(¬U(p, p ∨ ¬p) & (p ∨ ¬p) & U(p, p ∨ ¬p), p ∨ ¬p) (1. PL., Sep.)
3. (¬U(p ∨ ¬p, ¬U(p, p ∨ ¬p)) & U(p, p ∨ ¬p)) ⊃ U(¬U(p, p ∨ ¬p) & U(p, p ∨ ¬p), p ∨ ¬p) (2. PL., Eq.)
4. ¬U(¬U(p, p ∨ ¬p) & U(p, p ∨ ¬p), p ∨ ¬p) (ax. 1, Subst.)
5. ¬(¬U(p ∨ ¬p, ¬U(p, p ∨ ¬p)) & U(p, p ∨ ¬p)) (3, 4, Contrap., MP.)
6. U(p ∨ ¬p, ¬¬U(p, p ∨ ¬p)) ∨ ¬U(p, p ∨ ¬p) (5. DeM.)
7. U(p, p ∨ ¬p) ⊃ U(p ∨ ¬p, U(p, p ∨ ¬p)) (6. Eq., Def. ⊃)
8. Fp ⊃ G′Fp (7. Def. F, G′)
Lemma 7. FFp ⊃ Fp
Proof of Lemma 7.:
1. U((p ∨ ¬p) & (p ∨ ¬p) & U(p, p ∨ ¬p), (p ∨ ¬p) & (p ∨ ¬p)) ⊃ U(p ∨ ¬p, p ∨ ¬p) & U(p, p ∨ ¬p) (ax. 2, PL.)(RM)
2. U((¬p ∨ p) & U(p, ¬p ∨ p), ¬p ∨ p) ⊃ U(p, ¬p ∨ p) (1. Sep., Eq.)
3. U(U(p, p ∨ ¬p), ¬p ∨ p) ⊃ U(p, p ∨ ¬p) (2. Eq.)
4. FFp ⊃ Fp (3. Def. F.)
Lemma 8. U(p, q) ⊃ F(q & Fp)
Proof of Lemma 8.:
1. U(p, q) ⊃ Fp (Lemma 1, RM)
2. G′(q & Fp) ⊃ F(q & Fp) (Lemma 3, RM)
3. U(p, q) ⊃ G′q (Lemma 4, RM)
4. Fp ⊃ G′Fp (Lemma 6)
5. U(p, q) ⊃ G′q & G′Fp (1, 3, 4, PL)
6. (G′q & G′Fp) ⊃ G′(q & Fp) (Lemma 5)
7. U(p, q) ⊃ F(q & Fp) (5, 6, 2, Syll.)
Lemma 9. G(p ⊃ q) ⊃ (Fp ⊃ Fq)
Proof of Lemma 9.:
1. ¬(U(¬(¬p ∨ q), p ∨ ¬p) ∨ U(q, p ∨ ¬p)) ≡ ¬U(¬(¬p ∨ q) ∨ q, p ∨ ¬p) (ax. 5., Eq.)(RM)
2. ¬U(¬(¬p ∨ q) ∨ q, p ∨ ¬p) ≡ ¬U((p & ¬q) ∨ q, p ∨ ¬p) (1. DeM., Eq.)
3. ¬U((p & ¬q) ∨ q, p ∨ ¬p) ≡ ¬U((p ∨ q) & (q ∨ ¬q), p ∨ ¬p) (2. Distr., Eq.)
4. ¬U((p ∨ q) & (q ∨ ¬q), p ∨ ¬p) ≡ ¬U(p ∨ q, p ∨ ¬p) (3. PL., Eq.)
5. ¬U(p ∨ q, p ∨ ¬p) ≡ ¬U(p, p ∨ ¬p) & ¬U(q, p ∨ ¬p) (ax. 5, DeM.)
6. ¬U(p ∨ q, p ∨ ¬p) ⊃ ¬U(p, p ∨ ¬p) (5. Sep.)
7. ¬(U(¬(¬p ∨ q), p ∨ ¬p) ∨ U(q, p ∨ ¬p)) ⊃ ¬U(p, p ∨ ¬p) (1–4, 6, Syll.)
8. ¬(¬U(¬(¬p ∨ q), p ∨ ¬p) ⊃ U(q, p ∨ ¬p)) ⊃ ¬U(p, p ∨ ¬p) (7. Def. ⊃)
9. ¬(G(p ⊃ q) ⊃ Fq) ⊃ ¬Fp (8. Def. G, F, Def. ⊃)
10. G(p ⊃ q) ⊃ (Fp ⊃ Fq) (9. PL.)
Proof of 30.:
1. G(U(p, q) ⊃ F(q & Fp)) (Lemma 8, ax. 1, RM, Def. G.)
2. FU(p, q) ⊃ FF(q & Fp) (Lemma 9, 1., MP.)
3. FU(p, q) ⊃ F(q & Fp) (2., Lemma 7.)
4. U(p, q) ∨ FU(p, q) ⊃ F(q & Fp) (3., Lemma 8, PL.)
5. U(p, q) ∨ FU(p, q) ⊃ F(q & Fp) ∨ (q & Fp) (4. PL.)
6. U(p, q) ∨ FU(p, q) ⊃ Qa(q, p) (5. Def. Qa.) Q.E.D.
Proof of 52. H′¬f₁x ⊃ ¬S(ρf₁x, f₁x):
Lemma 10. H′q ⊃ ¬S(p ⊃ p, ¬q)
Proof of Lemma 10.:
1. S(p ∨ ¬p, q) & S(p ∨ ¬p, ¬q) ≡ (S((p ∨ ¬p) & (p ∨ ¬p), q & ¬q) ∨ S((p ∨ ¬p) & ¬q & S(p ∨ ¬p, ¬q), q & ¬q) ∨ S((p ∨ ¬p) & q & S(p ∨ ¬p, q), q & ¬q)) (ax. 2)
2. S(p ∨ ¬p, q) & S(p ∨ ¬p, ¬q) ⊃ S(p ∨ ¬p, q & ¬q) ∨ S(p ∨ ¬p, q & ¬q) ∨ S(p ∨ ¬p, q & ¬q) (1. Lemma 4, Eq.)
3. S(p ∨ ¬p, q) & S(p ∨ ¬p, ¬q) ⊃ S(p ∨ ¬p, q & ¬q) (2. PL.)
4. ¬S(p ∨ ¬p, q & ¬q) ⊃ ¬(S(p ∨ ¬p, q) & S(p ∨ ¬p, ¬q)) (3. Contrap.)
5. ¬S(p ∨ ¬p, p & ¬p) (ax. 9.)
6. ¬(S(p ∨ ¬p, q) & S(p ∨ ¬p, ¬q)) (4, 5, MP., Subst.)
7. H′q ⊃ ¬S(p ⊃ p, ¬q) (6. DeM., Def. ⊃, Def. H′.)
Proof of 52.:
1. S(p, q) ⊃ H′q (Lemma 4.)
2. H′q ⊃ P′q (Lemma 10, Def. P′.)
3. S(p, q) ⊃ P′q (1, 2, Syll.)
4. ¬P′q ⊃ ¬S(p, q) (3, Contrap.)
5. H′¬q ⊃ ¬S(p, q) (4, Def. P′.)
6. H′¬f₁x ⊃ ¬S(ρf₁x, f₁x) (5. Subst.) Q.E.D.
Proof of 55. H′¬f₁x & Pρf₁x ⊃ Ta(Ta(¬f₁x, f₁x), ¬f₁x):
Lemma 11. (H′p & Pq) ⊃ P(p & Pq)
Proof of Lemma 11.:
1. H′p & Pq ⊃ H′p & H′Pq (PL., RM., Lemma 6.)
2. H′p & H′Pq ⊃ H′(p & Pq) (Lemma 5, RM.)
3. H′(p & Pq) ⊃ P(p & Pq) (Lemma 3.)
4. H′p & Pq ⊃ P(p & Pq) (1, 2, 3, Syll.)
(Proof of 55.:)
1. H′¬f₁x & Pρf₁x ⊃ P(¬f₁x & Pρf₁x) (Lemma 11, Subst.)
2. P(¬f₁x & Pρf₁x) ≡ P(¬f₁x & P(f₁x & H′Δf₁x)) (Def. ρ.)
3. P(¬f₁x & P(f₁x & H′Δf₁x)) ⊃ P(¬f₁x & P(f₁x & P¬f₁x)) (PL, Lemma 3, Lemma 2, Lemma 9, RM.)
4. H′¬f₁x & Pρf₁x ⊃ Ta(Ta(¬f₁x, f₁x), ¬f₁x) (1, 2, 3, Syll., Def. Ta) Q.E.D.
Proof of 58. (ρf₁x & Pρf₁x) ∨ P(ρf₁x & Pρf₁x) ⊃ Ta(Ta(f₁x, ¬f₁x), f₁x):
Lemma 12. P(q & r) ⊃ (Pq & Pr)
Proof of Lemma 12.:
1. S(p, r) & S(q, r) ≡ (S(p & q, r) ∨ S(p & r & S(q, r), r) ∨ S(q & r & S(p, r), r)) (Ax. 2.)
2. S(p & q, r) ⊃ S(p, r) & S(q, r) (1. PL.)
3. S(p & q, p ∨ ¬p) ⊃ S(p, p ∨ ¬p) & S(q, p ∨ ¬p) (2. Subst.)
4. P(p & q) ⊃ (Pp & Pq) (3. Def. P.)
(Proof of 58.:)
1. ρf₁x & Pρf₁x ≡ (H′Δf₁x & f₁x) & P(H′Δf₁x & f₁x) (Def. ρ)
2. (H′Δf₁x & f₁x) & P(H′Δf₁x & f₁x) ⊃ (H′¬f₁x & f₁x) & P(H′¬f₁x & f₁x) (PL, Lemma 2, Lemma 9, RM.)
3. (H′¬f₁x & f₁x) & P(H′¬f₁x & f₁x) ⊃ H′¬f₁x & Pf₁x & f₁x (Lemma 12, Sep.)
4. H′¬f₁x & Pf₁x & f₁x ⊃ P(¬f₁x & Pf₁x) & f₁x (Lemma 11.)
5. f₁x & P(¬f₁x & Pf₁x) ⊃ Ta(Ta(f₁x, ¬f₁x), f₁x) (Lemma 12, Lemma 7, RM, PL, Def. Ta)
6. ρf₁x & Pρf₁x ⊃ Ta(Ta(f₁x, ¬f₁x), f₁x) (1–5, Syll.)
7. P(ρf₁x & Pρf₁x) ⊃ Ta(Ta(f₁x, ¬f₁x), f₁x) (6, Lemma 9, RM, Lemma 7, PL.)
8. (ρf₁x & Pρf₁x) ∨ P(ρf₁x & Pρf₁x) ⊃ Ta(Ta(f₁x, ¬f₁x), f₁x) (6, 7, PL.) Q.E.D.
Proof of 61. F(ρf₁x & Fρf₁x) ⊃ Qa(Qa(f₁x, ¬f₁x), f₁x):
Lemma 13. F(q & H′p) ⊃ F(p & Fq)
Proof of Lemma 13.:
1. F(H′p & q) ⊃ U(q, p) ∨ FU(q, p) (28.)
2. U(q, p) ∨ FU(q, p) ⊃ (F(p & Fq) ∨ FF(p & Fq)) (Lemma 8., Lemma 9.)
3. F(p & Fq) ∨ FF(p & Fq) ⊃ F(p & Fq) (Lemma 7., PL.)
4. F(H′p & q) ⊃ F(p & Fq) (1, 2, 3, Syll.)
Proof of 61.:
1. F(ρf₁x & Fρf₁x) ≡ F((H′Δf₁x & f₁x) & F(H′Δf₁x & f₁x)) (Def. ρ.)
2. F((H′Δf₁x & f₁x) & F(H′Δf₁x & f₁x)) ⊃ F((H′¬f₁x & f₁x) & F(H′¬f₁x & f₁x)) (P1, Lemma 9.)
3. F((H′¬f₁x & f₁x) & F(H′¬f₁x & f₁x)) ⊃ F(f₁x & F(¬f₁x & Ff₁x)) (Lemma 12, RM., Lemma 7, Lemma 13.)
4. F(f₁x & F(¬f₁x & Ff₁x)) ⊃ F(f₁x & F(¬f₁x & Ff₁x)) ∨ (f₁x & F(¬f₁x & Ff₁x)) (PL.)
5. F(ρf₁x & Fρf₁x) ⊃ Qa(Qa(f₁x, ¬f₁x), f₁x) (1–4, Syll., Def. Qa.) Q.E.D.
References
ANSCOMBE, G.E.M. (1964) 'Before and After'. Philosophical Review, 73:1, 3–24.
BARENTSEN, A. (1971) 'K opisaniju semantiki kategorii "vid" i "vrem'a" (Na materiale sovremennogo russkogo literaturnogo jazyka).' Amsterdam. Unpubl.
BONDARKO, A.V. and BULANIN, L.L. (1967) 'Russkij glagol. Posobije dl'a studentov i ucitelej; pod red. Ju. S. Maslova.' Leningrad.
CARNAP, R. (1958) 'Introduction to Symbolic Logic and its Applications.' New York: Dover.
CLIFFORD, J. (1966) 'Tense logic and the logic of change.' Logique et analyse no. 34, 219–230.
COOPER, N. (1966) 'Scale-words.' Analysis, vol. 27, 1966–1967, pp. 153–159.
FORSYTH, J. (1970) 'A Grammar of Aspect. Usage and Meaning of the Russian Verb.' Cambridge: Cambridge University Press.
JESPERSEN, O. (1924) 'The Philosophy of Grammar.' London: Allen and Unwin.
KAMP, J.A.W. (1968) 'Tense logic and the theory of linear order.' Diss. Univ. of Calif.
KRABBE, E. (1972) 'Propositionele tijdslogica.' Amsterdam. Unpubl.
LAKOFF, G. (1970) 'Linguistics and natural logic.' Synthese 22, 151–271.
POTTS, T. (1969) 'The logical description of changes which take time.' (Abstract) Journal of Symbolic Logic 34, 537.
PRIOR, A. (1967) 'Past, Present and Future.' Oxford: Clarendon Press.
RASSUDOVA, O.P. (1968) 'Upotreblenie vidov glagola v russkom jazyke.' Moskva.
REICHENBACH, H. (1947) 'Elements of Symbolic Logic.' London: Collier-Macmillan.
RUSSELL, B. (1903) 'Principles of Mathematics.' Cambridge: At the University Press.
VERKUYL, H. (1971) 'On the compositional nature of the aspects.' Diss. Utrecht.
VON WRIGHT, G. (1963) 'Norm and Action, a Logical Inquiry.' London: Routledge and Kegan Paul. — (1965) 'And Next', Acta Philosophica Fennica, Fasc. 18.
LAURI KARTTUNEN
PRESUPPOSITION AND LINGUISTIC CONTEXT*
According to a pragmatic view, the presuppositions of a sentence determine the class of contexts in which the sentence could be felicitously uttered. Complex sentences present a difficult problem in this framework. No simple "projection method" has been found by which we could compute their presuppositions from those of their constituent clauses. This paper presents a way to eliminate the projection problem. A recursive definition of "satisfaction of presuppositions" is proposed that makes it unnecessary to have any explicit method for assigning presuppositions to compound sentences. A theory of presuppositions becomes a theory of constraints on successive contexts in a fully explicit discourse.
What I present here is a sequel to a couple of my earlier studies on presuppositions. The first one is the paper "Presuppositions of Compound Sentences" (Karttunen 1973a), the other is called "Remarks on Presuppositions" (Karttunen 1973b). I won't review these papers here, but I will start by giving some idea of the background for the present paper. Earlier I was concerned about two things. First, I wanted to show that there was no adequate notion of presupposition that could be defined in purely semantic terms, that is, in terms of truth conditions. What was needed was a pragmatic notion, something along the lines Stalnaker (1972) had suggested, but not a notion of the speaker's presupposition. I had in mind some definition like the one given under (1).
(1) Surface sentence A pragmatically presupposes a logical form L, if and only if it is the case that A can be felicitously uttered only in contexts which entail L.
* Presented at the 1973 Winter Meeting of the Linguistic Society of America in San Diego. This work was supported in part by the 1973 Research Workshop on Formal Pragmatics of Natural Language, sponsored by the Mathematical Social Science Board. I acknowledge with special gratitude the contributions of Stanley Peters to my understanding of the problems in this paper. Any remaining confusions are my own.
The main point about (1) is that presupposition is viewed as a relation between sentences, or more accurately, as a relation between a surface sentence and the logical form of another.¹ By "surface sentence" I mean expressions of a natural language as opposed to sentences of a formal language with which the former are in some manner associated. "Logical forms" are expressions of the latter kind. "Context" in (1) means a set of logical forms that describe the set of background assumptions, that is, whatever the speaker chooses to regard as being shared by him and his intended audience. According to (1), a sentence can be felicitously uttered only in contexts that entail all of its presuppositions. Secondly, I argued that, if we look at things in a certain way, presupposition turns out to be a relative notion for compound sentences. The same sentence may have different presuppositions depending on the context in which it is uttered. To see what this means, let us use "X" as a variable for contexts (sets of logical forms), let "A" and "B" stand for (surface) sentences, and let "P_A" and "P_B" denote the set of logical forms presupposed by A and B, respectively. Let us assume that A and B in this instance are simple sentences that contain no quantifiers and no sentential connectives. Furthermore, let us assume that we know already what A and B presuppose, that is, we know the elements of P_A and P_B. Given all that, what can we say about the presuppositions of complex sentences formed from A and B by means of embedding and sentential connectives? This is the notorious "projection problem" for presuppositions (Morgan 1969, Langendoen & Savin 1971). For instance, what are the presuppositions of "If A then B"? Intuitively it would seem that sentential connectives such as if ... then do not introduce any new presuppositions. Therefore, the set P_{If A then B} should be either identical to or at least some proper subset of the combined presuppositions of A and B.
This initially simple idea is presented in (2).
(2) P_{If A then B} ⊆ P_A ∪ P_B
However, I found that when one pursues this line of inquiry further, things become very complicated. Consider the examples in (3).
(3) (a) If Dean told the truth, Nixon is guilty too.
(b) If Haldeman is guilty, Nixon is guilty too.
(c) If Miss Woods destroyed the missing tapes, Nixon is guilty too.
In all of these cases, let us assume that the consequent clause "Nixon is guilty too" is interpreted in the sense in which it presupposes the guilt of someone else. The question is: does the compound sentence as a whole carry that presupposition? In the case of (3a), the answer seems to be definitely yes, in the case
¹ There is some question over whether this notion of presupposition is properly labeled "pragmatic". For Stalnaker (1972, 1973), pragmatic presupposing is a propositional attitude of the speaker. However, I will follow Thomason (1973) and others who would like to reserve the term "presupposes" for relations (semantic or pragmatic) between sentences. The idea that it is important to distinguish in this connection between surface sentences and their logical forms is due to Lakoff (1972, 1973).
of (3b) definitely no, and in the case of (3c) a maybe, depending on the context in which the sentence is used. For example, if the destruction of the tapes is considered a crime, then Miss Woods would be guilty in case she did it, and (3c) could be a conditional assertion that Nixon was an accomplice. In this context the sentence does not presuppose that anyone is guilty. But in contexts where the destruction of the tapes in itself would not constitute a crime (3c) apparently does presuppose the guilt of someone other than Nixon. These examples show that if we try to determine the presuppositions of "If A then B" as a particular subset of the joint presuppositions of A and B, the initial simplicity of that idea turns out to be deceptive. In reality it is a very complicated enterprise. The kind of recursive principle that seems to be required is given in (4a) in the form it appears in Karttunen (1973b). (4b) says the same in ordinary English. (4)
(a) P_{If A then B}/X = P_A/X ∪ (P_B/X∪A − (E_{X∪A} − E_X))
where E_X is the set of logical forms entailed (in the standard sense) by X, and X ∪ A is the result of adding the logical form of A to X.

(b) The presuppositions of "If A then B" (with respect to context X) consist of (i) all of the presuppositions of A (with respect to X) and (ii) all of the presuppositions of B (with respect to X ∪ A) except for those entailed by the set X ∪ A and not entailed by X alone.

One would like to find a better way to express this, but I am not sure there is one.2 It really is a complicated question. So much for the background. What I want to show now is that there is another way to think about these matters, and about presuppositions of complex sentences in particular. Let us go back for a moment to the attempted pragmatic definition in (1). The point of that definition is that the presuppositions of a sentence determine in what contexts the sentence could be felicitously used. A
2 Peters has pointed out to me that, under certain conditions, (4a) is equivalent to the following projection principle: P_{If A then B} = P_A ∪ {A ⊃ p : p ∈ P_B}.
Peters' principle has the advantage that it assigns the same set of presuppositions to "If A then B" irrespective of any context. Note that this set is not a subset of P_A ∪ P_B, as required by my initial assumption in (2). Peters' principle says that, for each presupposition of B, "If A then B" presupposes a conditional with that presupposition as the consequent and the logical form of A as the antecedent. In addition, "If A then B" has all of the presuppositions of A. I realize now that some of the complexity in (4a) comes from trying to state the principle in such a way that (2) holds. If this is not worth doing, Peters' way of formulating the rule is superior to mine. However, in the following I will argue that we can just as well do without any explicit projection method at all, hence the choice is not crucial.
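Both rule (4a) and Peters' alternative can be made concrete in a short sketch. The representation below is mine, not Karttunen's or Peters': logical forms are strings, a context X is a set of them, entailment is approximated by a caller-supplied closure function, and the Watergate data are only a toy model of examples (3a) and (3b).

```python
# Sketch of rule (4a) and of Peters' context-independent variant.
# All representation choices are illustrative assumptions: a "logical
# form" is a string, a context X is a set of strings, and entailed_by(X)
# returns E_X, the closure of X under the toy entailment relation.

def karttunen_4a(p_a, p_b, entailed_by, x, a):
    """Presuppositions of 'If A then B' with respect to context X:
    P_A/X together with those members of P_B/X∪A that are not
    newly entailed once A is added to X."""
    e_x = entailed_by(x)             # E_X: entailed by X alone
    e_xa = entailed_by(x | {a})      # E_X∪A: entailed once A is added
    return p_a | (p_b - (e_xa - e_x))

def peters_principle(p_a, p_b, a):
    """Each presupposition p of B becomes the conditional 'if A then p';
    the presuppositions of A carry over unchanged, in any context."""
    return p_a | {"if %s then %s" % (a, p) for p in p_b}

# Toy entailment closure for the examples in (3): in this model,
# Haldeman's guilt entails that someone other than Nixon is guilty.
def entailed_by(x):
    e = set(x)
    if "Haldeman is guilty" in e:
        e.add("someone other than Nixon is guilty")
    return e

p_b = {"someone other than Nixon is guilty"}   # presupposition of 'too'
```

With an empty context, the antecedent "Haldeman is guilty" filters the presupposition out, matching the "definitely no" verdict on (3b), whereas "Dean told the truth" leaves it intact, as in (3a).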
184
Lauri Karttunen
projection method, such as (4a), associates a complex sentence with a class of such contexts by compiling a set of logical forms that must be entailed in any context where it is proper to use the sentence. Thus we say that the sentence "If A then B" can be felicitously uttered in context X only if X entails all of the logical forms in the set P_{If A then B}/X, defined in (4a). There is another, much simpler, way to associate complex sentences with proper contexts of use. Instead of characterizing these contexts by compiling the presuppositions of the sentence, we ask what a context would have to be like in order to satisfy those presuppositions. Of course, it is exactly the same problem but, by turning it upside down, we get a surprisingly simple answer. The reason is that we can answer the latter question directly, without having to compute what the presuppositions actually are. The way we go about this is the following. We start by defining, not presupposition, but a notion of satisfaction of presuppositions. This definition is based on the assumption that we can give a finite list of basic presuppositions for each simple sentence of English. For all cases where A is a simple, non-compound sentence, satisfaction is defined as in (5). (5)
Context X satisfies-the-presuppositions-of A just in case X entails all of the basic presuppositions of A (that is, P_A ⊆ E_X).

If A ⇒ O(B), then O(B) is only said to be the normal case on condition that A, not that O(B) holds in A-worlds in which extraordinary circumstances and strange coincidences obtain. It may then very well be the case that A ⇒ O(B), but ¬(A ∧ C ⇒ O(B)). Such restrictions to normal cases are implied, I think, in most every-day statements of conditional obligation.

References

ADAMS, E. W. (1970), Subjunctive and Indicative Conditionals. Foundations of Language 6, pp. 89—94.
FRAASSEN, B. VAN (1973), Values and the Heart's Command. The Journal of Philosophy 70, pp. 5—19.
8 In another paper I define an epistemic interpretation of conditional necessity and try to show that a purely "objective" interpretation is, as in the cases of unconditional necessity or the similarity of worlds, impossible.
9 For other types of conditional obligations cf. Lewis (1973), 5.1, and Kutschera (1974b).
10 For a way out of this problem that comes close to deontic suicide cf. B. van Fraassen (1973).
Indicative conditionals
269
GOODMAN, N. (1965), Fact, Fiction, Forecast. 2nd ed. Indianapolis 1965.
KRIPKE, S. (1972), Naming and Necessity, pp. 253—355, 763—769 in G. HARMAN and D. DAVIDSON (eds.): Semantics of Natural Language, Dordrecht: Reidel.
KUTSCHERA, F. VON (1974a), Partial Interpretations. To appear in E. KEENAN (ed.): Formal Semantics for Natural Language, Cambridge.
KUTSCHERA, F. VON (1974b), Normative Präferenzen und bedingte Obligationen. In H. LENK (ed.): Normlogik. München-Pullach.
LEWIS, D. (1973), Counterfactuals, Cambridge, Mass.: Harvard Univ. Press.
STALNAKER, R. C. (1968), A Theory of Conditionals, pp. 98—112 in N. RESCHER (ed.): Studies in Logical Theory, Oxford.
DISCUSSIONS AND EXPOSITIONS

H. A. LEWIS

MODEL THEORY AND SEMANTICS1
In this paper some basic concepts of model theory are introduced. The structure of a definition of truth for a formal language is illustrated and the extension and alteration required for model theory proper are explained. The acceptability of a model-theoretic account of truth in a natural language is discussed briefly.
I. Introduction
In recent years several writers have proposed that formal logic has a role to play in linguistics. One suggestion has been that a semantic theory for a natural language might take the same form as the semantic accounts that are usual for the artificial languages of formal logic. Without presupposing a knowledge of formal logic on the part of the reader, I attempt here to sketch the arguments for this suggestion in the forms it has been given by Donald Davidson and by Richard Montague. Readers already familiar with their views will find nothing here that is new, but I hope also nothing that is dangerously misleading. One good reason for seeking to say nothing that presupposes knowledge of formal logic or of model theory is that questions of importance for my subject have to be settled, or begged, before formal semantics can be developed at all. I shall therefore be dealing with some basic concepts of model theory rather than with any detailed formal developments. It may help to identify my topic more clearly if I explain why it seems to me to survive two criticisms that might be brought—two views about the role of formal logic in the semantics of natural languages that imply (from opposite directions) that no question about the applicability of formal semantics to natural languages arises.
1 This paper is a revised version of a paper presented at the meeting of the Semantics Section of the Linguistics Association of Great Britain at the University of York on 5 April 1972. My presentation of the issues owes much to Donald Davidson, in particular to Davidson (1967). I am grateful to the referee for many improvements to an earlier version.
One school of thought is very hospitable to formal logic: allowing a distinction between deep and surface structures in a grammar, it claims that in a correct grammar deep structures will be nothing other than sentences of formal logic, and that such deep structures are necessarily bearers of clear semantic information. The only serious semantic questions that arise for natural languages would then be questions about the derivation of surface structures from deep structures. This view, a caricature to be sure, seems to me too hospitable to formal logic. If formulas of logic are usable as deep structures in a generative grammar, and the principle that meaning does not change in proceeding from deep to surface structure is espoused, semantic questions are simply thrown back onto the deep structures. The semantics of first-order predicate logic (to mention the most familiar logical system) is well established for the purposes of logic textbooks but not without its limitations if it is taken as accounting for meaning in natural languages. A linguist who chooses logical formulas for his deep structures enters the same line of business as the philosophers who have puzzled over the philosophically correct account of the semantics of the logical formulas themselves. Some of their puzzles depend on the acceptance of a standard way of translating to or from logical symbolism, but others arise from the usual semantic accounts for first-order logic.2 Even if it were legitimate to take the semantics of standard first-order logic for granted, this logic notoriously cannot deal in a straightforward way with many aspects of natural languages, such as tenses, modalities and intentional verbs, and indexicals. But the semantics of the more complex logical systems designed to deal with such notions is sufficiently controversial among logicians that no one can safely take it for granted.
Another school of thought, viewing my subject from the opposite direction, holds that we know that a semantic account appropriate to an artificial language could not be appropriate to a natural language just because the former is artificial. The formulas of an artificial language are stipulated at their creation to have the meaning that they have, whereas a natural language, although it is a human creation, must be investigated empirically by the linguist before he can hope to erect a semantic theory.3 It seems to me that the bare charge of artificiality is a pointless one: there is no reason why a semantic account that fits a language we have invented should not also fit another language that we have not. (Just as there is no reason why a human artifact should not be exactly the same shape as an object found in nature.)
2 See below, p. 276/7.
3 In many presentations of first-order logic, only the logical constants (the connectives and quantifiers and perhaps the identity sign) are regarded as having a fixed meaning. In such an approach no formula (except one consisting entirely of logical constants) has a meaning until an interpretation is assigned to the non-logical constants: so the stipulation is a two-stage process.
The idea that the semantics of natural languages is subject to empirical constraints that do not operate for artificial languages is worthy of greater respect, however. Whereas I may, it seems, decree what my artificial symbols are to mean, I must take the natural language as I find it—we can talk of 'getting the semantics right' for a natural language but not for an artificial one. As a matter of fact there is such a thing as getting the semantics wrong, indeed provably wrong, for an artificial language, because of the requirements of consistency and completeness that formal semantic accounts are intended to meet: but this remark, although it may make formal semantics sound more interesting, does not meet the difficulty about natural languages. Let us imagine that we have a proposed semantic theory for English before us, and that it gives an account of the meaning of the words, phrases, and sentences of the language. An empirical linguist must then inspect this theory to see if it squares with the facts. But what facts? The answer to this question depends to some extent on the nature of the theory: it may be a theory with obviously testable consequences. If, for example, it alleges that speakers of the language will assent to certain sentences, then provided you know how to recognize speakers of the language, and their assentings, such consequences can be tested in the field. Alternatively, the theory may provide a translation of sentences of English into another language. If the other language is Chinese, and you speak Chinese, you can check the translations for accuracy. If the other language is a language no one speaks, such as semantic markerese, whose existence is asserted only by the theory, then no such check is possible.4 The problem of empirical adequacy is a central one for semantics. A semantic theory must provide an account of the meaning of the sentences of the language it purports to describe.
If the language is our own language, we should be able to tell without difficulty whether the account is correct. It is a minimal requirement of a semantic theory that it offer a translation, paraphrase or representation of each sentence of the language. Any translations it offers should be synonymous with the sentences of which they are translations. If the translations talk about abstract entities of a kind of whose existence we were unaware, we shall need to be persuaded that we really were talking about them all the time, although we did not realize it. (It is an even more basic requirement that the translation of a declarative sentence should be a declarative sentence, rather than (for example) a verbless string.) It is not obvious that these platitudes bring us any closer to an understanding of the empirical constraints on a semantic theory. Certainly, if we think of translation as a simple pairing of sentences, it does not.5 But if we think of
4 I owe the expression 'semantic markerese' to David Lewis. See Lewis, D. (1972).
5 For the possibility of giving semantics by pairing synonymous expressions, cf. Hiz (1968) and Hiz (1969).
translation as the explaining in one language of the sentences of another, we may find a way out. Compare: (1)
'Pierre attend' means the same as 'Pierre is waiting';
(2)
'Pierre attend' means that Pierre is waiting.
(1) informs us of a relation between two sentences: (2) tells us what a particular French sentence means. A semantic theory should not simply pair sentences, it should tell us what they mean. (1) and (2) are both contingent truths, but the same cannot be said of both (3) and (4): (3)
'John is tall' means the same as 'John is tall';
(4)
'John is tall' means that John is tall.
We know that (3) is true in virtue of our understanding of 'means the same as', and so it is a necessary truth. We also know that (4) is true, but (4) is a contingent truth about the sentence 'John is tall'6. Moreover a semantic theory about English in English, worthy of the name, should have (4) as a consequence, as well as (5): (5)
'Four is the square of two' means that four is the square of two,
and in general should have as consequences all sentences like (A):

(A) S means that p.
where 'S' is replaced by a syntactic description of a sentence and 'p' is replaced by that sentence or a recognizable paraphrase of it. Such a theory does not simply pair sentences: it tells us what they mean. A minimal requirement on a semantic theory for a natural language is that it have as consequences sentences of form (A). The fact that the A-sentences are contingent truths rather than stipulations or necessary truths proves to be no block to providing for a natural language a semantic account similar to some that can be given for formal languages: indeed the founding father of formal semantics, Alfred Tarski, made it one of his basic requirements for a formal semantic theory that it yield something very like the A-sentences.7 It may seem that the requirement that a semantic theory yield the A-sentences is a weak one, and that the production of such a theory would be a trivial matter. It will be part of my purpose in what follows to show that this is not the case.
6 This point can easily be misunderstood because any (true) statement about meanings might be thought to be true 'in virtue of meaning' and so necessarily true. But we do not need to know what 'John is tall' means to recognise (3) as true: all we need to know is that the same expression occurs before and after 'means the same as'.
7 See Tarski, A. (1956), in particular section 3 (pp. 186 sqq.). The idea that Tarski's approach may yet be appropriate for natural language is due to Donald Davidson. See in particular Davidson, D. (1967), (1970) and (1973).
II. Simple semantics
The semantic theories that will now be described have the form of definitions of truth for a language: they set out to give the truth-conditions of its sentences. The classical account of the definition of truth is Tarski's paper 'The concept of truth in formalized languages'. It seems to me to be essential to present briefly the main themes of that important paper.8 A definition of truth is given in a language for a language. There are thus typically two languages involved, the one for which truth is defined, which plays the role of 'object-language', and the one in which truth is defined, playing the role of 'metalanguage'. The metalanguage must be rich enough to talk about the object-language, in particular it must contain names of the symbols or words of the object-language and the means to describe the phrases and sentences of the object-language as they are built up from the words: it must also contain translations of the sentences of the object-language. Tarski lays down, in his Convention T, requirements for an adequate definition in the metalanguage of a truth-predicate: that is, requirements that a definition must fulfil if the predicate so defined is to mean 'is true'. The convention demands that it should follow from the definition that only sentences are true; and that the definition should have as consequences all strings of the form (B)
S is true if and only if p
where 'S' is replaced by a structure-revealing description of a sentence of the object-language and 'p' is replaced by a translation of the sentence S in the metalanguage. Convention T resembles the requirement that an adequate semantic theory must have the A-sentences as consequences. If we require that the definition of truth be finitely statable, then for an object-language with infinitely many sentences it is not possible to take as our definition of truth simply the conjunction of all the infinitely many B-sentences. It is in the attempt to compass all the B-strings in a finite definition of truth that the interest of the Tarski-type truth-definition lies. It is perhaps still not widely appreciated even among philosophers that the production of a definition of truth that fulfils Convention T for an interesting but infinite object-language is far from a trivial matter. Tarski himself showed that one superficially attractive trivialising move does not work. We might be tempted to use the axiom (6)
(x) ('x' is true if and only if x)
but this does not succeed in doing what was intended since the expression immediately to the left of the 'is true' is but a name of the letter 'x'.8

8 cf. note 7 above. A simple introduction to Tarski's ideas is given in Quine (1970), chapter 3: Truth.
In his paper, Tarski showed how Convention T's requirements could be met for one logical language. In order to illustrate his method I shall use a tiny fragment of first-order logic, with the following syntactic description:

(7) Symbols:
variables: w x y z
predicates: F G
the existential quantifier: E
the negation sign: ¬

Sentences:
(i) 'F' followed by a single variable is a sentence;
(ii) 'G' followed by two variables is a sentence;
(iii) If S is a sentence, '¬' followed by S is a sentence, called the negation of S;
(iv) If S is a sentence containing a variable other than 'w' or 'x', the result of writing 'E' followed by the variable followed by S is also a sentence, called (if the variable is v_i) the existential quantification of S with respect to v_i.
Thus the following are sentences of the fragment:

(8) Fx   Gwx   ¬Fw   EyFy   Ez¬Gxz   ¬EzFz   Fy
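The four syntactic rules lend themselves to a small recursive recognizer. The encoding below is my own illustration, not part of Tarski's or the author's apparatus: formulas are plain strings, ASCII '~' stands in for the negation sign, and the variables w, x, y, z count as v1 to v4 in that order.

```python
# A minimal recognizer for the fragment's syntax, rules (i)-(iv).
# Assumptions (mine): strings over the symbols of (7), with '~' for
# the negation sign.

VARS = "wxyz"

def is_sentence(s):
    # (i) 'F' followed by a single variable
    if len(s) == 2 and s[0] == "F" and s[1] in VARS:
        return True
    # (ii) 'G' followed by two variables
    if len(s) == 3 and s[0] == "G" and s[1] in VARS and s[2] in VARS:
        return True
    # (iii) the negation of a sentence
    if s.startswith("~"):
        return is_sentence(s[1:])
    # (iv) 'E' + a variable other than 'w' or 'x' that occurs in S
    if len(s) > 2 and s[0] == "E" and s[1] in "yz":
        return s[1] in s[2:] and is_sentence(s[2:])
    return False
```

Note how rule (iv) builds in both restrictions from (7): the quantified variable must be other than 'w' or 'x', and it must actually occur in the embedded sentence.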
A definition9 of truth will be offered for this fragment, using a small part of ordinary English as metalanguage. This part should at least contain the predicates 'smokes' and 'loves' and the names 'John' and 'Mary'. In order to explain the truth-definition, I need to introduce the notion of satisfaction, since the fundamental semantic notion that we use is not truth but truth of, a relation between an individual and a predicate (or verb-phrase). We say that 'smokes' is true of John just in case John smokes; 'is red' is true of this poppy just in case this poppy is red. We have to complicate the notion in a natural way to fit a sentence like 'John loves Mary'. We can already say: 'loves Mary' is true of John if and only if John loves Mary, and 'John loves' is true of Mary if and only if John loves Mary. But we cannot say that 'loves' is true of John and Mary, for that would also be to say that 'loves' is true of Mary and John, but 'John loves Mary' means something different from 'Mary loves John'. We have to say in what order John and Mary are taken: so we use the notion of an ordered pair, John then Mary, of a sequence with two members, John and Mary (in that order). Then we say that a sequence satisfies a predicate, by which we mean that the objects in the sequence,
9 Since a definition of truth in Tarski's style proceeds by way of a recursive definition of satisfaction, it may more appropriately be styled a theory of truth. If the metalanguage is powerful enough, such a theory may be converted into an explicit definition of truth.
ordered as they are, fit the predicate, ordered as it is. We must use some notational device to keep track of the places in the predicate and to correlate them with the places in the sequence. (Note however that the places in the sequence are occupied by objects, the places in the predicate by names or other noun-phrases.) In the examples just given, both object-language and metalanguage are English. In our logical language, 'F' does duty for 'smokes' and 'G' for 'loves': so 'F' is true of John if and only if John smokes, and 'G' is true of John and Mary (in that order) if and only if John loves Mary. It is now possible to give the recursive definition of satisfaction for the fragment:10

(9) For any sequence q of persons whose first member is John and whose second member is Mary,11 and all i and j:
(i) q satisfies Fv_i if and only if the i'th member of q smokes;
(ii) q satisfies Gv_iv_j if and only if the i'th member of q loves the j'th member of q;
(iii) q satisfies the negation of S if and only if q does not satisfy S;
(iv) q satisfies the existential quantification of S with respect to the i'th variable if and only if at least one sequence differing from q in at most the i'th place satisfies S.
and the definition of truth:

(10) A sentence is true if and only if it is satisfied by all sequences of persons whose first member is John and whose second member is Mary.
If the implications of this definition are unravelled, we find out for example that (11)
'EyGyw' is true if and only if someone loves John.
(It is common to draw a veil, as I have done, over the process of abbreviation that yields this result: mention of sequences has been obliterated, but they are the most important piece of machinery required for the definition.) The definition of truth just stated tells us what sentences of the fragmentary language are to mean rather than what they do, as a matter of fact, mean: but this is only because the language that served as object-language was not one whose sentences already had a definite meaning. The procedure for defining truth could equally be followed for English sentences where both the metalanguage and the object-language were English. If it was followed, the role of the semantic definition of truth would be to articulate the structure of the language—to show how the meanings of sentences of arbitrary complexity depend on the meanings of their parts—rather than to give any very helpful information about the meanings of the parts. A definition of truth of the simple type that I have presented does 10
10 The italicised 'F' and 'G' do duty for names of the predicates 'F' and 'G'. Concatenation, the writing of one symbol next to another, has been left to be understood, although it is usual in formal approaches to make it explicit. 'v_i' means 'the i'th variable', e.g. 'v_3' means 'y'.
11 I owe this device for handling proper names to Donald Davidson.
all its serious work in the recursive clauses, and we look to it in vain for more than obvious information about the meanings of the simples (in this language, the elementary predicates and names). The attempt to extend a definition of truth according to Convention T to a more useful fragment of English is a much more difficult task than it at first appears, however. Although we can leave many questions aside in pursuing this objective, it is still necessary to decide how sentences are built up and to determine the precise semantic role of the parts. I may illustrate the difficulties by a suggestive example familiar to philosophers. Students of modal logic (so-called) interest themselves in the notion of necessity, and concern themselves in particular with 'necessarily' as an adverb modifying whole sentences. The consequences required by Convention T of the truth definition would include sentences such as (12):

(12) 'Necessarily John is tall' is true if and only if necessarily John is tall.
An obvious way to accommodate necessity in the recursive definition of satisfaction would be this: (13)
q satisfies the necessitation of S (i.e. the result of writing 'necessarily' then S) if and only if necessarily q satisfies S.
Such a clause implies that the sequence whose first member is the number one satisfies 'necessarily v_1 is odd' if and only if it is necessary that the sequence satisfies 'v_1 is odd'—but the notion of a necessary link between a sequence and an open sentence is surely not present in the given sentence-form, and thus (13) is false. Intentional notions resist straightforward treatment in the definition of truth. A warning about the idea of a simple definition of truth in English for English that satisfies Convention T is appropriate here. There are formal reasons why a definition of truth in a language for the same language is not possible: unless the language concerned is very weak, a version of the Epimenides paradox will emerge: (14)
'Is not true of itself' is not true of itself.12
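The recursive clauses of (9) and definition (10) translate almost line by line into a short program. Every concrete choice below is my own, for illustration only: a two-person domain, sample extensions for 'smokes' and 'loves', '~' again for the negation sign, and four-place tuples standing in for the sequences (one place per variable w, x, y, z), which finitizes the "differing in at most the i'th place" clause.

```python
# Sketch of the satisfaction clauses (9) and truth definition (10).
# Assumptions (mine): a finite domain, illustrative extensions for
# 'smokes' and 'loves', '~' for negation, 4-tuples as sequences.

from itertools import product

VARS = "wxyz"                                   # v1..v4
SMOKES = {"John"}                               # sample extension of 'F'
LOVES = {("John", "Mary"), ("Mary", "John")}    # sample extension of 'G'
DOMAIN = ["John", "Mary"]

def satisfies(q, s):
    """q is a tuple of persons; q[i] is the value of the (i+1)'th variable."""
    if s[0] == "F":                             # clause (9i)
        return q[VARS.index(s[1])] in SMOKES
    if s[0] == "G":                             # clause (9ii)
        return (q[VARS.index(s[1])], q[VARS.index(s[2])]) in LOVES
    if s[0] == "~":                             # clause (9iii)
        return not satisfies(q, s[1:])
    if s[0] == "E":                             # clause (9iv)
        i = VARS.index(s[1])
        return any(satisfies(q[:i] + (d,) + q[i + 1:], s[2:])
                   for d in DOMAIN)
    raise ValueError("not a sentence: " + s)

def true_sentence(s):
    """(10): true iff satisfied by every sequence whose first member
    is John and whose second member is Mary."""
    return all(satisfies(("John", "Mary") + rest, s)
               for rest in product(DOMAIN, repeat=2))
```

Unravelling the definitions as in (11), true_sentence("EyGyw") holds here because the sample 'loves' relation contains the pair Mary, John, i.e. someone loves John.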
III. Model theory
Model theory is the investigation of the relationship between languages that can be formally described and the structures, interpretations or models for which their sentences are true. The sentences of such a formal language are, typically, true only in certain models, so that, given a collection of sentences, it is often possible to say what features any model in which all the sentences hold must have. Starting from the other end, with a model, it is usual to find that only certain sentences
12 See Tarski (1956), and cf. Martin (1970).
hold in it. Thus, given some sentences, we can investigate their possible models: given a model, we can investigate which sentences it makes true. The connection between a language and a model is set up by means of a definition of truth: by a definition that explains under what conditions a sentence holds, or is true, in the model. The simplest definitions for first-order logic go by way of a recursive characterization of satisfaction, as described above, or directly to a recursive characterization of truth.13 The sole difference from the semantics of Section II is that truth is defined as relative to a given model. In contrast, the simple semantics of Section II offers a definition of truth as absolute. For present purposes I take it as a defining characteristic of model theory that it studies relative definitions of truth. Relative definitions of truth are quite standard in logic textbooks, both for elementary logic (where the notion of model or interpretation is needed to define logical truth and logical consequence) and for modal and intentional logics (where truth is often defined as relative to a 'possible world'14). In applications to natural languages, there is a measure of agreement that truth of sentences is relative to the context of utterance where indexical elements (e.g. tense, location) occur: the philosophical divide comes between those who hold that 'semantic' primitive notions are admissible in the definition of truth and those who do not.15 An influential recent writer who supported the claims of model theory was Richard Montague. Here are two typical statements of faith:
Like Donald Davidson, I regard the construction of a theory of truth—or rather, of the more general notion of truth under an arbitrary interpretation—as the basic goal of serious syntax and semantics; and the developments emanating from the Massachusetts Institute of Technology offer little promise towards that end.16
13 For a recursive definition of truth-in-an-interpretation, see Mates (1972) p. 60, and for a definition of truth-in-an-interpretation by way of a recursive definition of satisfaction, see Mendelson (1964) pp. 50—51.
14 Thus for example the necessitation of S is said to be true just in case S is true in all possible worlds. For formal purposes, possible worlds function as do models—they are abstract entities in which certain sentences hold.
15 This is not the place to argue at length that this is the philosophically interesting divide. Davidson favours the absolute definition of truth, Montague the relative (see Davidson 1973, and Wallace, 1972). It is Davidson who has taught us the importance of this contrast, and I follow his usage in using 'absolute' to cover accounts of truth that do not use semantic relata—e.g. interpretations, possible worlds—even if they use a notion of truth as relative to such things as place and time of utterance.
16 Montague (1970a) p. 189. (Emphasis added.) It will be clear from the last note that 'like Donald Davidson' is here misleading.
There is in my opinion no important theoretical difference between natural languages and the artificial languages of logicians; indeed, I consider it possible to comprehend the syntax and semantics of both kinds of languages within a single natural and mathematically precise theory. On this point I differ from a number of philosophers but agree, I believe, with Chomsky and his associates. It is clear, however, that no adequate and comprehensive semantical theory has yet been constructed, and arguable that no comprehensive and semantically significant syntactical theory yet exists. The aim of the present work is to fill this gap, that is, to develop a universal syntax and semantics.17
Both papers from which I have quoted offer a syntax and semantics for fragments of English. These accounts are technically complex and even a summary is out of the question. What I shall try to do is to characterise in general terms the model-theoretic approach and to indicate the special contributions made by Montague. For first-order predicate logic the standard models are collections of sets, one of them, the domain, containing all the objects that can be talked about; predicates have subsets of the domain or relations on the domain assigned to them, and names have members of the domain assigned to them. An interpretation can thus be construed as a domain together with a function that assigns to each predicate or name an appropriate object taken from the domain. A sentence is then said to be true for an interpretation if it is satisfied by every sequence of objects from the domain, given the interpretation. In this approach, meaning is determined in two stages. The meanings of the logical constants—the connectives ('and', 'if then...' etc.) and the quantifiers ('all', 'some')—that also provide the recursive elements in the definition of satisfaction, are stated in advance for all interpretations. An interpretation then gives the meanings of the non-logical constants (the predicates and proper names of the language). One advantage is that the method permits of the definition of a notion of logical truth as truth in all interpretations. Moreover, truth-in-an-interpretation can be defined in advance of knowing a particular interpretation. In the simple semantics that I sketched earlier, this was not so: to give a definition of truth, we needed to have all the basic clauses for the recursive definition to hand, and we had no general way of characterizing the interpretation of, for example, a one-place predicate.
The new facility could be seen as an advantage: you may have felt that the basic clauses in the simple semantics, such as (9) (i), were disappointingly trivial. The corresponding clause in a relative definition might look like this: (15)
q satisfies Fvi in I if and only if the i-th member of q is a member of the set assigned by I to F.
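Clause (15) can be rendered quite directly; the interpretation I and its data below are invented for illustration:

```python
# A direct rendering of clause (15), with invented data: q satisfies F(v_i)
# in I just in case the i-th member of q belongs to the set I assigns to F.
from itertools import product

I = {"domain": {"a", "b", "c"}, "F": {"a", "b"}}

def satisfies(q, i):
    """q satisfies F(v_i) in I iff q[i] is in the set I assigns to F."""
    return q[i] in I["F"]

def true_in_I():
    """Truth in I: satisfaction by every sequence of objects from the domain
    (only v_0 occurs here, so sequences of length 1 suffice)."""
    return all(satisfies(q, 0) for q in product(I["domain"], repeat=1))

print(satisfies(("a",), 0))   # this sequence satisfies F(v_0)
print(true_in_I())            # but not every sequence does
```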
This appears to open up the possibility of discussing alternative interpretations of the basic elements in the language, a possibility that was not evident for the absolute definition18.

17 Montague (1970b) p. 373.

Model theory and semantics

An interpretation can be thought of as a dictionary for a language whose syntax is known and about which we have semantic information up to a point: we know the meanings of the logical constants (we are not allowed to vary these from one dictionary to another) and we know the kind of interpretation that is allowed for any lexical item whatsoever, since the kind is determined by the syntactic category. What a particular dictionary, or interpretation, does is to specify which meaning of the appropriate kind each lexical item possesses. If we could approach a natural language in a similar way, we could hope to describe its syntax precisely and to determine what the appropriate meaning or interpretation for each lexical item would be. We could expect to discover some logical constants, in particular among the devices for constructing compound or complex sentences out of simpler parts. It is plain, however, that shifting our attention from the absolute to the relative definition of truth has not at all changed the problems that must be met. The standard mode of interpreting predicate logic, together with the rough-and-ready methods we have for translating between English and the logical symbolism, fails to deal with a host of constructions and aspects of English, such as intensional contexts in general (in particular tense, modality, and intensional verbs such as 'believe' and 'seek') and indexical or token-reflexive elements such as personal pronouns. Moreover the lack of a serious syntactic theory linking the logical notation and English is an embarrassment. A defect that is likely to strike the linguist as particularly important is the lack of syntactic ambiguity in the logical notation. The list of obstacles could be prolonged, and they are serious.
Montague and others have sought to overcome them all by providing a single theory of language, embracing syntax and semantics, which provides a framework within which a complete formal grammar for any particular language, natural or artificial, could be fitted19. Although Montague's framework is complex, I believe that a large part of it can be understood by the use of two simple ideas20. The first involves framing all our semantical talk in terms of functions (with truth as the ultimate value). The second is the idea of analysing the syntax of a language into functional relationships. If the functions are matched in a suitable way, an elegant and powerful semantic theory appears to fall out naturally. I shall try to explain first how the notion of function can be used in semantics. The familiar requirement that a semantic theory determine under what conditions a declarative sentence is true could be stated in a more abstract way by asking that a
18 Strictly, the part of the interpretation that assigns meaning to expressions, but not the domain itself.
19 See Montague (1970a), (1970b) and (1973), and Lewis, D. (1972).
20 Here I am indebted to Lewis, D. (1972). I recommend this article as a preliminary to any readers who would understand Montague's writings.
Harry A. Lewis
semantic theory define a function from declarative sentences to truth-values so that every sentence has a truth-value. The abstract way of talking, in terms of functions, appears quite gratuitous at this point, but it is indispensable for the steps that follow. Consider a simple declarative sentence: (16)
John is running.
We want a semantic theory to assign a truth-value to it in a model M: in particular, we should now like it to entail sentences like (17)
val ('John is running') = T in M if and only if John is running in M.
According to the standard approach to the model theory of predicate logic, the name 'John' would receive a member of the domain of an interpretation, and the predicate 'is running' would be given a subset of the domain. (Of course, predicate logic is an artificial language: I continue to use English expressions for illustration only.) The resources of this mode of interpretation do not allow us to say that the subset assigned to 'is running' varies: but this leads to a difficulty, for of course the extension of 'is running'—the set of people who are running—varies from moment to moment, although the meaning of the expression does not. How can we assign a single meaning to 'is running' within a formal semantic theory which allows for this complexity? The answer is, we assign to the predicate a function from times to subsets of the domain; a function, it could be argued, that we already know to exist—for we know that at any time some people are running and some not. The resulting account of the truth-conditions of 'John is running' might look like this: (18)
val('John is running', tk) = T in M if and only if val('x is running')(val('John'), tk) = T in M

This can be read: the valuation function yields truth in M for the arguments 'John is running' and tk (the k-th time) if and only if the result of applying the interpretation of the predicate 'x is running' (which is a function of two arguments) to the arguments (a) the interpretation of 'John', (b) tk, is truth in M. The standard interpretation of predicates by subsets of the domain can be progressively complicated to deal with any features of the occasion of utterance that are relevant to truth-value. A particular valuation, or model, will then ascribe to each predicate an appropriate function. The other simple idea involved is the extension to the semantic realm of Ajdukiewicz' syntactic ideas, which derive in turn from Frege. Ajdukiewicz showed how, given two fundamental syntactic categories, it was possible to assign syntactic categories to other parts of speech21. The categories other than the basic sentence
21 See Ajdukiewicz (1967).
and name are all functions of more or less complexity, so that the criterion of sentencehood is that the syntactic categories of the component expressions should, when applied to one another, yield 's' (sentence). I shall illustrate the idea and its semantic analogue by the case of a simple verb-phrase. If we know that (19)

Arthur walks.

is a sentence, and 'Arthur' a name, we know that the syntactic category of 'walks' is s/n, i.e. the function that takes a name into a sentence. A semantic analogue (much simpler, however, than anything in Montague) would be this: if we know that the interpretation of the whole sentence is to be a truth-value, and the interpretation of the name 'Arthur' is to be a person, we can infer that the interpretation of 'walks' is to be a function from people to truth-values. Montague gives a far more complex account of the interpretation even of simple predicates, as he wishes to allow for the occasion of utterance and further factors. But the principle by which the appropriate meaning for items of a certain syntactic category is discovered is the same. The case of adverbs that modify verb-phrases is in point. Syntactically speaking, such adverbs can be seen as turning verb-phrases into verb-phrases (e.g. 'quickly', 'clumsily'): so semantically speaking, they turn verb-phrase meanings into verb-phrase meanings. We therefore know the type of function that such adverbs require as interpretations: they are functions from verb-phrase interpretations to verb-phrase interpretations. Adjectives that attach themselves to common nouns are treated in the same way. The syntax of the fragment of English described in Montague's 'Universal Grammar' (Montague, 1970b) is sufficiently sophisticated to allow that (20)

Every man such that he loves a woman is a man

is a logical truth, whereas (21)

Every alleged murderer is a murderer

is not.
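The correspondence between syntactic categories and semantic types can be made concrete in a small sketch; the denotations below are invented toys, far simpler than anything in Montague:

```python
# Toy sketch of the category-to-type correspondence: a name denotes an
# individual and a sentence a truth-value, so an expression of category s/n
# denotes a function from individuals to truth-values, and a verb-phrase
# adverb denotes a function from such functions to such functions.

walkers = {"Arthur"}          # invented extensions
quick_agents = {"Arthur"}

def walks(person):            # category s/n: takes a name into a sentence
    return person in walkers

def quickly(vp):              # VP modifier: verb-phrase meaning -> verb-phrase meaning
    return lambda person: vp(person) and person in quick_agents

print(walks("Arthur"))                 # 'Arthur walks'
print(quickly(walks)("Arthur"))        # 'Arthur walks quickly'
```

Note that this simple treatment makes 'walks quickly' entail 'walks'; that is acceptable for intersective modifiers but fails in general (compare 'alleged murderer' in (21)), which is one reason Montague's actual interpretations are richer.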
The assignment of expressions to syntactic categories, given the semantic principle I have just presented, is not a trivial matter, and it seems to me, although I claim no expertise, that the syntactic descriptions given of fragments of English in 'Universal Grammar' and 'English as a formal language' are ingenious and interesting22. Both fragments allow syntactic ambiguities, and in the latter paper Montague suggests a way of dealing with ambiguity by relativizing his semantics to analyses of sentences, where an ambiguous sentence receives two distinct analyses.
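The idea of relativizing the semantics to analyses can likewise be sketched: the two scope readings of 'Every man loves a woman' (my example, not Montague's) receive distinct truth conditions, which come apart in a suitable model:

```python
# Sketch with an invented model: an ambiguous sentence receives two distinct
# analyses, and each analysis gets its own truth condition.

men = {"m1", "m2"}
women = {"w1", "w2"}
loves = {("m1", "w1"), ("m2", "w2")}   # each man loves a different woman

def every_wide(model_loves):
    """Analysis 1: for every man there is some woman he loves."""
    return all(any((m, w) in model_loves for w in women) for m in men)

def some_wide(model_loves):
    """Analysis 2: there is one woman whom every man loves."""
    return any(all((m, w) in model_loves for m in men) for w in women)

print(every_wide(loves))   # True on this model
print(some_wide(loves))    # False: no single woman is loved by all the men
```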
IV.
Conclusion
My aim in this paper has been to present the basic ideas of two approaches to the semantics of natural languages, those associated with Donald Davidson and with Richard Montague. A long critical discussion would be out of place, and it would have to draw on writings and detail not expounded here.

22 See also Montague (1973).
The main themes have been these. A semantic theory can have empirical content even if it is built on the pattern of the theories of truth usually offered for formal languages. Such a theory of truth may represent truth as an absolute notion, or as a relative notion, where the relativity may be to context of utterance (time, place, person) or to "possible worlds". Such notions as 'interpretation' or 'possible world', used as undefined terms in a theory of truth, rest truth on a prior notion that is 'semantic' in that it involves essential use of the notion of truth or a related concept such as reference23. A great deal of philosophy is condensed into the contrast between those accounts of truth that use a semantic primitive and those that do not—for example, the question of the intelligibility of concepts of necessity or logical truth can be phrased as the question of the acceptability of certain semantic primitives. In the present context, the contrast is that between theories of meaning for natural languages that make reference to possible worlds, models and interpretations and those that do not. The reader new to this subject may be tempted to suggest that this contrast is unimportant, and perhaps that allegiance to the truth-definition as the criterion of adequacy in semantics is the sole interesting test. Possible worlds—some but not all models—are theoretical entities, it would seem, whose existence is postulated to help with a smoothly-running account of language. If the question is whether possible worlds are disreputable epicycles or respectable ellipses, surely time alone will tell? To be sure, the understanding of language is in us, not in the heavens: but we now readily concede that the ability to produce grammatical sentences is not the same as the ability to produce a grammar that will model the former ability.
Why should semantics not trade in notions as obscure to the lay mind as are 'phrase-structure grammar' and 'generalised transformation', provided that they aid theory? Surely we can expect our theory of meaning to be at least as complicated as our syntax? One obstacle to such a generous view lies in the criterion of adequacy built into this approach to semantics: Tarski's Convention T. It is a powerful constraint on a theory that it generate the theorems required by the Convention. Such theorems have on one side a translation or paraphrase of the sentence whose truth-conditions they thus state. Proponents of the relative definition of truth as the semantic paradigm have to persuade us that talk of possible worlds does paraphrase everyday English. If they are right, we should all be easy to persuade, for it strains credulity that native speakers cannot recognize synonymous pairs of sentences when they see them24. The welcome feature of this approach to semantics is that a lay audience can easily test the plausibility of particular proposals by asking that the B-sentences be exhibited: they may then inspect the two sides to see if one is a plausible paraphrase of the other. In other words, such a proposal has empirical
23 cf. Davidson (1973).
24 This is the argument of Wallace (1972), VI.
consequences, and like respectable theories in other fields, it is falsifiable: unlike theories in some other fields, these can be tested by any native speakers of the language in question.
References

AJDUKIEWICZ, K. (1967), On syntactical coherence (translated from the Polish by P. T. Geach), Review of Metaphysics 20, 635-647.
DAVIDSON, D. (1967), Truth and meaning, Synthese 17, 304-323.
DAVIDSON, D. (1970), Semantics for natural languages, pp. 177-188 in: Linguaggi nella società e nella tecnica, Milan: Edizioni di Comunità.
DAVIDSON, D. (1973), In defense of Convention T, pp. 76-86 in: Leblanc, H. (Ed.), Truth, Syntax and Modality, Amsterdam: North-Holland.
HIZ, H. (1968), Computable and uncomputable elements of syntax, pp. 239-254 in: van Rootselaar, B. and J. F. Staal (Eds.), Logic, Methodology and Philosophy of Science III, Amsterdam: North-Holland.
HIZ, H. (1969), Aletheic semantic theory, The Philosophical Forum 1 (New Series), 438-451.
LEWIS, D. (1972), General semantics, pp. 169-218 in: Davidson, D. and G. Harman (Eds.), Semantics of Natural Language, Dordrecht: Reidel.
MARTIN, R. L. (1970), The Paradox of the Liar, New Haven: Yale University Press.
MATES, B. (1972), Elementary Logic (second edition), London: Oxford University Press.
MENDELSON, E. (1964), Introduction to Mathematical Logic, Princeton, N. J.: D. Van Nostrand Company.
MONTAGUE, R. (1970a), English as a formal language, pp. 189-223 in: Linguaggi nella società e nella tecnica, Milan: Edizioni di Comunità.
MONTAGUE, R. (1970b), Universal grammar, Theoria 36, 374-398.
MONTAGUE, R. (1973), The proper treatment of quantification in ordinary English, pp. 221-242 in: Hintikka, K. J. J., J. M. E. Moravcsik and P. Suppes (Eds.), Approaches to Natural Language, Dordrecht: Reidel.
QUINE, W. V. O. (1970), Philosophy of Logic, Englewood Cliffs: Prentice-Hall.
TARSKI, A. (1956), The concept of truth in formalised languages, pp. 152-278 in: Logic, Semantics, Metamathematics (translated by J. H. Woodger), Oxford: Clarendon Press.
WALLACE, J. (1972), On the frame of reference, pp. 212-252 in: Davidson, D. and G. Harman (Eds.), Semantics of Natural Language, Dordrecht: Reidel.
IRENA BELLERT
REPLY TO H. H. LIEB
I wish to take this opportunity to reply to Lieb's comments (in: Grammars as Theories, Theoretical Linguistics, Vol. I [1974], pp. 39-115) on my proposal (I. Bellert, Theory of Language as an Interpreted Formal Theory, Proceedings of the 11th International Congress of Linguists, Bologna, 1972). I will discuss only some critical remarks which, if valid, would make my proposal untenable. The first is due to a misinterpretation of one part of my text, which in fact was carelessly formulated and hence misleading; I am indebted to Lieb for his observations. The others, which I found objectionable, give me the opportunity to clarify my statements. On page 103 Lieb says: "The separation of the 'axiomatic implications' from the 'meta-rule' is untenable. To make sense of the conception, we have to take "A" in an axiomatic implication as a free variable (...) Because of the free variable(s) these axioms are neither true nor false and they have no acceptable interpretation (...)". Of course the separation of the axiomatic implications from the meta-rule is untenable, and by no means was it so intended in the proposal. But it is quite evident that the conception would not make any sense at all if we took "A" as a free variable. As I said on page 291: "Notice that in the above implicational scheme the expressions A, R and (S, D) (addresser, receiver and sentence with its structural description, respectively) are all bound by universal quantifiers" (emphasis added). It is obvious, then, that I did not mean "A" to be a free variable. The meta-rule cannot be separated from the expression "C → A PROPOSITIONAL ATTITUDE S1", which constitutes only part of it and thus, if taken separately, could by no means be said to be true or false, nor even to constitute a well-formed formula. However, what evidently misled Lieb was that part of my paper in which, for the sake of brevity, I used the latter expression in referring to axiomatic implications.
The reason for my careless formulation was that the expressions "C" in the antecedent and "PROPOSITIONAL ATTITUDE" and "S1" in the consequent are the only ones which are specific to each implication and essential for analyticity, whereas the remaining expressions are exactly the same in all implications and the variables are bound by universal quantifiers. Therefore, when establishing axiomatic implications for any language or a fragment of a
language, we would have to specify only the mentioned expressions, while the entire meta-rule would always be presupposed as part of each implication. Perhaps the term 'axiomatic scheme' would be more appropriate than 'meta-rule'. In conclusion, I should then have said that the interpretative component will consist of a finite set of axiomatic implications of the form given by the axiomatic schema (meta-rule), the essential and language-specific expressions of which are: "C", "PROPOSITIONAL ATTITUDE" and "S1". Lieb objects to my statement that "the consequents can be said to follow formally from the antecedents". He says: "But even in analytic sentences material implication does not mean deducibility." (Footnote 131, page 104). I cannot agree with his objections. Material implication, clearly, does not mean deducibility, but this is not what my statement says. What I say here is in agreement with the terminology established by Tarski and widely accepted in the literature. Let me recall Tarski's definitions of the terms in question. He considers a finite class of sentences K from which a given sentence X follows. He denotes by the symbol Z the implication whose antecedent is the conjunction of the sentences in the class K and whose consequent is the sentence X. He then gives the following equivalences: "The sentence X is (logically) derivable from the sentences of the class K if and only if Z is logically provable. The sentence X follows formally from the sentences of the class K if and only if Z is analytical. The sentence X follows materially from the sentences of the class K if and only if the sentence Z is true" (Logic, Semantics, Metamathematics, Oxford: At the Clarendon Press, 1956, p. 419). In my proposal I can correspondingly say that the sentence in the consequent of an axiomatic implication follows formally from the class of sentences (or conjunction of sentences) in the antecedent, as the implication is analytical.
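Tarski's three relations may be set out schematically (the formatting, not the content, is mine): let K = {P1, ..., Pn} and let Z be the implication whose antecedent is the conjunction of the members of K and whose consequent is X. Then:

```latex
Z = (P_1 \land \cdots \land P_n) \to X
\begin{align*}
X \text{ is (logically) derivable from } K &\iff Z \text{ is logically provable}\\
X \text{ follows formally from } K &\iff Z \text{ is analytical}\\
X \text{ follows materially from } K &\iff Z \text{ is true}
\end{align*}
```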
Furthermore, Lieb questions the analyticity of the axiomatic implications (Footnote 133, page 104). The question of properly distinguishing analytic statements from synthetic (contingent) statements has been widely discussed in the literature, and there is no complete agreement as to the status of some statements. However, when discussing analyticity the authors agree that a statement is said to be analytical if its truth is based on meaning alone, independently of extra-linguistic facts. Carnap's meaning postulates have been proposed as an intended explication of the concept of analyticity. He defines meaning postulates as L-true implications. L-truth is an explicatum for what Leibniz called necessary and Kant analytic truth. In Carnap's formulation the antecedent L-implies the consequent. His example is: "If Jack is a bachelor, then he is not married" (Meaning and Necessity; Meaning Postulates, Phoenix Books, 1967, pp. 10 and 222). In spite of the controversies involved, undoubtedly there is a difference made in logic
between unconditionally true statements and contingent, factual statements: to the former class belong logically true statements and those that are not theorems in standard logic but whose truth is independent of extra-linguistic facts. Those are usually called analytical. Now, since my implications are intended to be constructed so that their truth depends solely on the meanings of the words and the structures involved, I presume that they can correctly be called analytical. Moreover, if they are taken as axioms of the theory, their truth cannot, without contradiction, be considered contingent, and they have to be taken as unconditionally true statements in the theory. Marian Przełęcki has discussed in detail the status of meaning postulates in axiomatized empirical theories (The Logic of Empirical Theories, Routledge & Kegan Paul, New York, 1969). As he observed, it is a usual practice in axiomatizing empirical theories to explicate the meaning of extra-logical terms by meaning postulates, which are then considered to be analytical sentences of that language. Lieb questions, however, the empirical content of such a theory or grammar (Footnote 133, p. 104). The class of sentences that follow from a given sentence S and some pertinent meaning postulates obviously adds nothing to the meaning of S. Meaning postulates, or axiomatic implications (in my terminology), are established for explicating the non-logical terms and structures contained in S. The axiomatic implications in my proposal are intended to explicate more complex predicates used in specific structural conditions in terms of other predicates, in a way which, in principle, should reflect the native speakers' understanding of the language; that is, in particular, they should account for the conclusions speakers generally can draw from the corresponding utterances in any fixed universe of discourse for which the meaning and denotation of the terms involved are clearly understood.
In order to test the empirical adequacy of axiomatic implications, one must look for a possible state of affairs in which the antecedent holds true but the consequent does not. If such a case is found, the implication in question should be rejected, or the conditions C in the antecedent should be modified in such a way that they become necessary conditions (as they are intended to be). But this can be done only by determining a universe of discourse, as well as the denotation of some predicates (those that are not further explicated by axiomatic implications, but occur in the consequents only), by establishing in some way (other than the verbal way of specifying axiomatic implications) the sets of individuals in the universe of discourse of which the given predicates hold true. Otherwise the theory will indeed have no empirical content. It is clear, however, that such tests are based ultimately on speakers' judgements only. Finally, I wish to add that, being a linguist with only some knowledge of logic, I did not aim at a rigorous formalization of the proposal but, rather, did what is common practice for non-logicians interested in the possibility of formalizing some aspects of their empirical field: I submitted for discussion a rough outline of a theory which would account for the empirical fact that speakers are capable of drawing a number of conclusions from a single utterance
by virtue of the meanings of the words and structures involved alone, independently of extra-linguistic facts. And I am indebted for all critical observations, which may help me in further clarifying my proposal, as has been the case with Lieb's comments.
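The test described in the reply (a search for a possible state of affairs in which the antecedent of an implication holds but the consequent fails) can be sketched as follows; the universe and the extensions of the predicates are invented for illustration, using Carnap's bachelor example:

```python
# Sketch of the testing procedure with an invented universe: an axiomatic
# implication is rejected (or its conditions revised) if some state of
# affairs makes its antecedent true and its consequent false. The postulate
# checked here is Carnap's 'if x is a bachelor, then x is not married'.

def postulate_holds(state):
    """True unless some individual is both a bachelor and married here."""
    return all(not (x in state["bachelor"] and x in state["married"])
               for x in state["universe"])

consistent_state = {"universe": {"Jack", "Jill"},
                    "bachelor": {"Jack"},
                    "married": {"Jill"}}

counterexample_state = {"universe": {"Jack"},
                        "bachelor": {"Jack"},
                        "married": {"Jack"}}   # antecedent true, consequent false

print(postulate_holds(consistent_state))      # True: no counterexample found
print(postulate_holds(counterexample_state))  # False: implication fails here
```

For a genuinely analytic postulate no such counterexample state should be describable once the predicates are correctly understood; finding one shows that the conditions in the antecedent need revision, which is the point of the procedure.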