
JOURNAL OF SEMANTICS AN INTERNATIONAL JOURNAL FOR THE INTERDISCIPLINARY STUDY OF THE SEMANTICS OF NATURAL LANGUAGE VO...

Author: Oxford University Press


JOURNAL OF SEMANTICS
AN INTERNATIONAL JOURNAL FOR THE INTERDISCIPLINARY STUDY OF THE SEMANTICS OF NATURAL LANGUAGE

VOLUME 6, 1988

SWETS & ZEITLINGER B.V., LISSE, THE NETHERLANDS, 1991


Reprinted with permission of Foris Publications, Dordrecht, by

SWETS & ZEITLINGER B.V., LISSE, THE NETHERLANDS, 1991

JOURNAL OF SEMANTICS

CONTENTS OF VOLUME 6 (1988)

Articles

MARK ARONSZAJN, Thought and Circumstance, 271
NICHOLAS ASHER and HAJIME WADA, A computational account of syntactic, semantic and discourse principles for anaphora resolution, 309
MANFRED BIERWISCH, Tools and explanations of comparison, Part I, 57; Part II, 101
ÖSTEN DAHL, The role of deduction rules in semantics, 1
SIMON C. GARROD and ANTHONY J. SANFORD, Discourse models as interfaces between language and the spatial world, 147
KEES HENGEVELD, Illocution, mood and modality in a functional grammar of Spanish, 227
JACK HOEKSEMA, The semantics of non-boolean "and", 19
ROLF MAYER, Motion imperatives, 41
JOSEF PERNER and ALAN GARNHAM, Conditions for mutuality, 369
JOHAN ROORYCK, Restrictions on dative cliticization in French causatives, 345
PIETER SEUREN, Presupposition and negation, 175

Book reviews

BART GEURTS, Hiyan Alshawi, Memory and context for language interpretation, 95
MANFRED KRIFKA, Gerhard Heyer, Generische Kennzeichnungen, 161
PIETER A.M. SEUREN, Collins Cobuild English Language Dictionary, 169

Journal of Semantics 6: 1-18

THE ROLE OF DEDUCTION RULES IN SEMANTICS

ÖSTEN DAHL

ABSTRACT

The distinction between 'partial' and 'total' interpretations (models) is discussed and related to the distinction between proof-theoretical and model-theoretical treatments of logic. It is claimed that there is a parallel between the construction of a proof based on a set of premises and e.g. the production of a natural-language text which is based on information in some kind of data-base. The main part of the paper is devoted to a discussion of the relations between the deduction rules traditionally associated with the existential quantifier and notions pertaining to the theory of reference such as specificity and referentiality/attributivity. Two types of specificity are distinguished, which can be connected with 'Existential Elimination' and 'Existential Introduction', respectively. A distinction is further made between trivial and non-trivial 'Existential Introduction', where only the latter kind involves erasure of 'coreference links'. It is argued that an analogous treatment of the referential-attributive distinction is a way of making sense of Donnellan's suggestion that the latter may depend on the description's role in an argument. Finally, the notions of 'external anchoring' and 'stability of individual concepts' are related to the distinctions made earlier in the paper.

Downloaded from jos.oxfordjournals.org by guest on January 1, 2011

DEDUCTION RULES

The idea of using 'partial' rather than 'total' interpretations or models in logical semantics, which has been around for a rather long time (see e.g. Hintikka 1969), has become quite popular recently, in the guise of 'situation semantics' (e.g. Barwise and Perry 1983), situations, i.e. partial models, being seen as serious alternatives to 'possible worlds', i.e. total models. In a discussion of the relation between logic and computerized data-base systems, Reiter (1978) introduces a distinction between 'open world' and 'closed world' evaluation, which is basically equivalent to that between 'partial' and 'total' models. Sowa (1983) applies Reiter's distinction to knowledge representation. A connection hinted at by Sowa, which I want to develop here, is the close relation between the partial-total distinction and the traditional distinction in logic between proof theory and model theory.

Traditionally, proof theory is seen as the study of the ways in which theorems may be derived from a set of axioms, or, on a more liberal view, as the general study of what statements can be concluded to be true, given a set of assumptions or premises. The relation to partial models is easily seen if we contemplate the fact that a set of assumptions is nothing but a partial assignment of truth-values to the set of sentences that can be asserted about some universe of discourse. Saying that something follows from a set of premises is thus completely analogous to saying that something holds in a partial model.

A proof in mathematics or logic is a sequence of formulas, or sentences, which obey certain well-defined constraints, relating to the concept of logical consequence. From the linguistic point of view, a proof is a special kind of text. Actually, one might claim that proof theory is the only successful formal theory of texts, since only in proof theory has it been possible to define 'texthood' rigorously for a text type. The possible links between proof theory and a theory of texts or discourses were pointed out almost two decades ago by the logician John Corcoran¹, but have not to my knowledge been taken up in a serious way by any linguist, probably because it has seemed that the properties of proofs that interest logicians are too far removed from the issues that are central to the linguistic theory of discourse. It would appear that the relation of logical consequence, which is central to the theory of proofs, has little relevance for everyday discourses. I want to claim that this view is mistaken.

Deductive inferences are by definition 'truth-preserving'. This means that in one sense of the word 'information', a statement derived by deduction does not contain any information that is not contained in the premises. In a proof, no sentence may appear that does not follow logically from the assumptions made at that point in the proof or is itself introduced explicitly as a new, temporary assumption. In other words, a proof contains no information (in the 'logical' sense) which is not contained in the axioms or assumptions, and can, if we like, be seen as a partial 'rephrasing' of the assumptions.
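The idea that a set of assumptions is a partial assignment of truth-values can be illustrated with a small sketch. It is only an illustration under stated assumptions: the dictionary representation, the function names `evaluate_open` and `evaluate_closed`, and the sample sentences are hypothetical and not taken from Reiter (1978) or Sowa (1983).

```python
from typing import Optional

# Assumption: a partial model is a dictionary that gives a truth-value
# to only some of the sentences of the language.
PartialModel = dict[str, bool]


def evaluate_open(model: PartialModel, sentence: str) -> Optional[bool]:
    """Open-world evaluation: a sentence not mentioned stays undecided (None)."""
    return model.get(sentence)


def evaluate_closed(model: PartialModel, sentence: str) -> bool:
    """Closed-world evaluation: a sentence not mentioned counts as false."""
    return model.get(sentence, False)


# Hypothetical set of assumptions.
assumptions: PartialModel = {
    "John loves Mary": True,
    "Mary lives in England": False,
}
```

Under open-world evaluation the sentence "John lives in England" is simply undecided relative to these assumptions, whereas closed-world evaluation treats the same data as if it were a total model and returns false.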
In this way, a proof is like a summary or an abstract of a book or an article, which may be seen as an alternative rendering of the information found in the latter. In fact, an abstract writer can be said to obey the same constraints as a writer of a proof: if he adds a thought of his own, that is, something that cannot be concluded from the original text, he breaks the rules of the game. Of course, this is not to deny that there are differences. The proof constructor's aim is to prove a theorem from a set of axioms, which can be assumed to be known to his readers, whereas the abstract writer's is to communicate the gist of the contents of some work to people who have not read it. In this sense, the abstract conveys new information in a way that the proof does not. However, this should not distract us from the important similarities between the two kinds of activities.

In fact, we can generalize the analogy further. An abstract differs from most other kinds of texts by its essential relation to another text - the book or article that is being abstracted. Although it is true that the 'data-bases' people use when talking about the world are often expressed in the same language as the one they are speaking, we would not like to restrict a theory of text generation to those cases. We need not do so, either, if we extend our concept of a 'partial model' from denoting a truth-value assignment to a set of sentences to denoting a similar assignment to any set of representations of reality, whatever form they are expressed in. If someone is given the assignment to write a description of how to get from Stockholm to Copenhagen, it matters little if he bases his description on a map or on the text in Michelin's Guide. Both may be seen as partial representations of reality from which one 'deduces' the information needed.

In discourse theory, it has been popular to view the processing of a text in terms of the construction of 'mental models' or 'discourse representations': when someone listens to or reads a description of some part of reality, he builds a representation of that part of reality in a stepwise fashion as the discourse goes along. The proof theory analogy suggests a different perspective where one looks at things rather from the point of view of the speaker or writer: how can a text be constructed on the basis of a representation of some part of reality? This is of course nothing but a rephrasing of the problem of what is called 'computer-based text generation', i.e. the construction of programs for automatic compilation of reports written in natural language on the basis of computer data-bases. Notice that since such reports normally provide the answers to a specific set of questions, text generators may be seen as special cases of question-answer systems, and there is an obvious link between such systems and proof theory in that what a question-answer system does when it gets a question is to try to test, i.e. prove, for each possible answer to the question, whether it follows from the information in the data-base or not.
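The link between question-answer systems and proof can be sketched as follows. This is a deliberately minimal illustration, not a description of any actual system: the data-base `FACTS`, the triple format, and the functions `follows` and `answer` are all hypothetical, and 'proof' is reduced here to the most trivial case, namely looking the statement up in the data-base.

```python
# Hypothetical data-base of ground facts: (predicate, subject, object).
FACTS = {
    ("lives_in", "John", "England"),
    ("loves", "John", "Mary"),
    ("lives_in", "Mary", "Scotland"),
}


def follows(fact):
    """Trivial 'proof': a fact follows if it is literally in the data-base."""
    return fact in FACTS


def answer(predicate, obj):
    """Answer 'Who <predicate> <obj>?' by testing each possible answer in turn."""
    # Every individual mentioned anywhere in the data-base is a candidate.
    individuals = {f[1] for f in FACTS} | {f[2] for f in FACTS}
    return sorted(c for c in individuals if follows((predicate, c, obj)))
```

A richer system would replace `follows` by a genuine deduction procedure; the overall shape (enumerate candidate answers, try to prove each one) would stay the same.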
Since extensive work has been done in the area of text generation, I cannot be said to be entering virgin ground, but I want to pursue a line of thought that I have not seen discussed anywhere, viz. to explore to what extent the rules of deduction formulated by logicians in proof theory can serve as a basis for a formal semantics for natural language discourse, and to show that it is indeed possible to throw some light on some classical questions of natural language semantics in this way.

Deduction rules as formulated in modern treatments of proof theory are often based on principles due to Gentzen (1934-5), where the rules were given in pairs, in such a way that for every logical constant, there was one 'introduction rule' and one 'elimination rule'. For example, we may postulate a 'Conjunction Introduction Rule' saying that whenever the assumptions contain the propositions α and β, we are allowed to assert α ∧ β, and the converse 'Conjunction Elimination Rule', to the effect that given the proposition α ∧ β we may assert any of the conjuncts, i.e. either α or β.

The most complex and also most interesting from our point of view among the deduction rules usually postulated are those pertaining to the quantifiers of predicate logic. The ones that we shall discuss here are the 'Existential Introduction Rule' and the 'Existential Elimination Rule', that is, the rules that govern the use of the existential quantifier. The '∃ Introduction Rule' is the rule that allows us to prefix an existential quantifier to a formula, simultaneously replacing all occurrences of a specified free variable with a variable bound by the quantifier. This is the simplest rule of the two, and the most interesting one from our point of view, but to obtain some background, we shall first look at the second one, the '∃ Elimination Rule'.
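The introduction and elimination rules just mentioned can be sketched as operations on formulas. The encoding of formulas as nested tuples and the function names below are assumptions made for this illustration only; they are not Gentzen's notation.

```python
def and_intro(assumptions, a, b):
    """Conjunction Introduction: from a and b among the assumptions, assert (a and b)."""
    if a in assumptions and b in assumptions:
        return ("and", a, b)
    raise ValueError("both conjuncts must be among the assumptions")


def and_elim(formula, which):
    """Conjunction Elimination: from (a and b), assert either conjunct (0 or 1)."""
    op, left, right = formula
    if op != "and":
        raise ValueError("not a conjunction")
    return left if which == 0 else right


def exists_intro(formula, term, var):
    """Existential Introduction: replace every occurrence of `term` by the
    variable `var` and prefix an existential quantifier binding `var`."""
    def subst(f):
        if isinstance(f, tuple):
            return tuple(subst(x) for x in f)
        return var if f == term else f
    return ("exists", var, subst(formula))
```

For example, applying `exists_intro` to the atomic formula `("loves", "John", "Mary")` with the term `"John"` and the variable `"x"` yields `("exists", "x", ("loves", "x", "Mary"))`.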

EXISTENTIAL ELIMINATION AND (T-)SPECIFICITY

'∃ Elimination' is the process by which you substitute an unbound individual term for the occurrences of a bound variable in an existentially quantified expression, simultaneously deleting the quantifier. As this process is usually described, it cannot be performed just like that: the choice of an arbitrary individual term would not guarantee the truth of the resulting formula. Therefore, '∃ Elimination' is only allowed to be used as a temporary step in the proof of a formula which does not contain the individual term. Furthermore, the individual term used must not have occurred earlier in the proof. These conditions are intended to ensure that the choice of the individual term does not influence the validity of the argument. It seems that the conditions could be relaxed in a system where the extensions of individual terms are not determined in advance: in such a system, there could be an '∃ Elimination Rule' allowing one to create new individual terms and substitute them for existentially bound variables. The domain of such a rule would be exactly those variables that correspond to 'specific' indefinite noun phrases in the sense of 'specific' which corresponds to Geach's concept of 'namely-riders', i.e. those noun phrases which can be amplified by a tag of the form '... namely X'. Consider e.g. the classical specificity ambiguity in (1), the two readings of which can be formalized as (2a-b).

(1) Mary wants to marry someone.

(2) (a) (∃x)(Want(Mary, Marry(Mary, x)))
    (b) Want(Mary, (∃x)(Marry(Mary, x)))

The (a) reading admits a 'namely'-rider: we can continue '... namely John'. In the (b) reading, this addition does not make sense. Correspondingly, '∃ Elimination' is applicable only to (2a), since it can only be applied to existentially quantified formulas where the quantifier has scope over the whole remaining formula.
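The scope restriction on '∃ Elimination' can be illustrated with the two readings in (2). The sketch below assumes the relaxed version of the rule, in which a new individual term is simply created; the tuple encoding of formulas, the function name `exists_elim` and the naming scheme `c1, c2, ...` for new terms are hypothetical, and the sketch assumes that no such name already occurs in the formula.

```python
import itertools

# Counter used to create individual terms that have not occurred before.
_fresh = itertools.count(1)


def exists_elim(formula):
    """Relaxed Existential Elimination: delete a widest-scope existential
    quantifier and substitute a newly created individual term for its variable."""
    if not (isinstance(formula, tuple) and formula[0] == "exists"):
        raise ValueError("the quantifier does not have scope over the whole formula")
    _, var, body = formula
    name = f"c{next(_fresh)}"  # assumed not to occur anywhere in the formula

    def subst(f):
        if isinstance(f, tuple):
            return tuple(subst(x) for x in f)
        return name if f == var else f

    return subst(body)


# (2a): the quantifier has scope over the whole formula.
reading_a = ("exists", "x", ("want", "Mary", ("marry", "Mary", "x")))
# (2b): the quantifier is embedded under 'want'.
reading_b = ("want", "Mary", ("exists", "x", ("marry", "Mary", "x")))
```

Calling `exists_elim(reading_a)` returns a formula in which `x` has been replaced by a new term, the formal counterpart of continuing with '... namely John'; calling it on `reading_b` raises an error, since the quantifier there does not have widest scope.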

What I have said here about the relations between the deduction rule '∃ Elimination' and specificity in natural language is hardly original or controversial. It is of course trivial that two truth-conditionally different readings of a sentence differ in their logical consequences and therefore in the roles they play in a logical argument. What I shall claim now is that there are ambiguities which are not truth-conditional and which proof theory may throw light on.

EXISTENTIAL INTRODUCTION AND (P-)SPECIFICITY

The term 'specificity' is often used not only to refer to the property that distinguishes the (a) reading from the (b) reading of (1) but also for distinguishing different kinds of uses of indefinite noun phrases in cases where there is no apparent scope ambiguity involved. Thus, (3) could be uttered in the contexts (4a) and (4b), where (4a) would be said to illustrate a 'specific' use.

(3) Some child has not received his ice-cream cone.

(4) (a) At a party, the host is distributing ice-cream cones to a group of children. He thinks he has given them to everyone, but his wife sees one unhappy face among the children and utters (3).
    (b) At another party, the host knows that he had bought exactly one cone for each child. He thinks he has given everyone a cone but sees that there is still one left on the plate, and so utters (3).

The difference between the two 'readings' of (3) is usually said to depend on 'whether the speaker has a specific individual in mind' or 'whether the speaker is referring to a specific individual'. It thus is very similar to Donnellan's (1966) distinction between referential and attributive readings of definite noun phrases, and could perhaps be claimed to be simply its counterpart for indefinite NPs. In any case, much of what has been said about the referential-attributive distinction carries over directly to the specificity distinction in (3). Like the referential-attributive distinction and unlike the ambiguity in (1), the 'specificity' exemplified in (3) does not obviously correspond to a difference in truth-conditions, although arguments to that effect have been made, e.g. by Kasher and Gabbay (1976), who - unconvincingly - claim that the (a) reading is true only if the speaker can correctly identify the referent of someone.² Donnellan says in his classical paper that the referential-attributive distinction reflects a pragmatic ambiguity. Accepting this for the time being, let us refer to the two kinds of specificity as 'T-specificity' (for 'truth-conditional') and 'P-specificity' (for 'pragmatic'), respectively.

An attempt to formalize P-specificity in a pragmatic framework is made in Groenendijk and Stokhof (1981). G&S build their proposal on a notion of an 'epistemic model', in which one defines, for each predicate and each language user, the set of possible denotations of that predicate relative to that language user's beliefs about the world. This involves statements of the following form (where 'has the information' should be read as 'is of the opinion' rather than 'knows that'):

'x has the information that the predicate o is true of the individuals a, b and c'

(3), according to the view of G&S, would be used P-specifically if there is exactly one individual in the speaker's epistemic model which is both a member of the set of possible denotations of child and of that of has not received an ice-cream cone. G&S say that their notion of 'specific reference' is 'objective in this sense that if two language users have the same information about the denotation of the expressions involved and use the same sentence, it can never happen that one of them refers P-specifically to an object without the other also referring specifically to the same object', whereas the traditional notion of 'having something in mind' is 'purely subjective, and therefore completely uninteresting from the point of view of conversational analysis'. There are a number of rather difficult problems with this kind of treatment, among others those connected with the individuation of objects of belief. The problem I want to discuss at this point is the question of whether we really want the notion of P-specificity to be 'objective' in G&S's sense. Notice to begin with that the objectivity involved is a bit dubious, in practice, since it is entirely dependent on the speaker's beliefs, which are accessible only to himself. One obvious consequence of G&S's view is that e.g. the speaker in (4a) just cannot use (3) non-P-specifically, even if she wants to, since she knows who she is referring to. However, I think that there are fairly clear cases when a speaker may use an indefinite noun phrase non-P-specifically even if he knows the identity of its referent. Suppose e.g. that the husband in (4a) does not accept his wife's statement, claiming that the child she is talking of has already eaten his cone. She might then utter (5).

(5) You know that you bought seven cones, and there are seven children. But there is still one cone on the table. So some child did not receive his cone.

I think that it is reasonable to say that this is a non-P-specific use of an indefinite noun phrase, by a person who knows the identity of its referent. If we admit that this is possible, we must look for a treatment of P-specificity that is at least in principle independent of the speaker's beliefs.

What I want to claim now is that there is a direct relation between P-specificity and the deductive rule of '∃ Introduction'. Let us ask the following question: On the basis of what kind of information do the speakers in the two different situations (a) and (b) make the statement (3)? Possible answers are found in (6a-b).

(6) (a) My husband is distributing cones to the children. I see that Bill looks unhappy and has no cone in his hand. I conclude that Bill has not received his ice-cream cone.
    (b) I bought seven cones. There are seven children. There is still one cone left.

Notice that (6a) - but not (6b) - contains a statement which is identical to (3) except for having a proper name instead of an indefinite noun phrase in the subject position. That means that one may go from (6a) to (3) by an operation which could be said to be the natural language counterpart to '∃ Introduction'. In (6b), on the other hand, no such step is possible. The general idea could then be formulated as follows: a person using an indefinite noun phrase P-specifically is making a statement on the basis of a data-base from which that statement is derivable using a logical operation equivalent to '∃ Introduction'.

Before elaborating the details of this idea, let us first see how it differs from the proposal of G&S, and in particular, how it can be used to explain the fact that a speaker can use a noun phrase non-P-specifically even if he knows the identity of the referent.

What is crucial is the interpretation of the expression 'to make a statement on the basis of a data-base'. The point is that the data-base involved here need not be identical with the speaker's beliefs. The character of the data-base depends on the nature of the 'language-game' that the speaker engages in. Two main types of language-game, and correspondingly, two main types of discourse and text, can be labelled 'descriptive' and 'argumentative'. In a (purely) descriptive discourse, the speaker describes an object or a situation which he has access to information about. In a (purely) argumentative discourse, the speaker gives arguments to support or prove some thesis. The language-games in which such discourses occur typically obey different constraints as to what can be asserted. In the most extreme form of an argumentative discourse, a mathematical proof, all assumptions are supposed to be known to and accepted by all participants in the language game. Nothing can be asserted that does not follow directly from these assumptions. In such a case, the data-base on the basis of which assertions are made is clearly not what the speaker believes: it is the assumptions commonly agreed upon. Even if everyday arguments are governed by less rigid rules, it is still the case that one expects claims to be made on the basis of information that is accepted by everyone. Therefore, it makes sense also in (5) to say that the statement that someone has not received his ice-cream cone is not made on the basis of the speaker's beliefs but on the facts that are commonly known to the speaker and the hearer. In other words, cases of indefinite noun phrases being used non-P-specifically by speakers who know the reference of the noun phrase are to be expected when the noun phrases occur as parts of statements that are presented as conclusions in an argument.
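The contrast between the data-bases (6a) and (6b) can be made concrete in a small sketch. The encoding is a strong simplification and entirely hypothetical: each data-base is reduced to a set of (predicate, argument) pairs, named individuals are strings, and the counting facts of (6b) are stored as numbers. The function `derivable_by_exists_intro` is likewise an assumption of this illustration.

```python
# Hypothetical encodings of the data-bases behind (6a) and (6b).
DB_6A = {
    ("is_distributing_cones", "husband"),
    ("looks_unhappy", "Bill"),
    ("has_not_received_cone", "Bill"),
}
DB_6B = {
    ("cones_bought", 7),
    ("children_present", 7),
    ("cones_left", 1),
}


def derivable_by_exists_intro(database, predicate):
    """True if the data-base contains the predicate asserted of a named
    individual, so that 'someone <predicate>' follows by replacing the name."""
    return any(p == predicate and isinstance(arg, str) for p, arg in database)
```

On this encoding, statement (3) is derivable by the counterpart of '∃ Introduction' from `DB_6A` (via the entry about Bill) but not from `DB_6B`, where it can only be reached by an argument from the numbers. The result depends on which data-base the statement is made from, not on what the speaker privately believes.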

P-SPECIFICITY AND COREFERENCE LINKS

What we have seen so far is how the two concepts of specificity - T-specificity and P-specificity - may each be linked up with one of the two deduction rules associated with the existential quantifier. There is a problem, however: it turns out that there is a trivial way of fulfilling the condition on P-specificity, under which it will collapse with T-specificity. To any T-specific individual term, we may apply '∃ Elimination' or Skolemization, which will give us a sentence containing an individual term. To this sentence, we can then apply '∃ Introduction', taking us back to the original sentence. It thus appears that any T-specific individual term will fulfill the conditions for P-specificity trivially. Is there any way of saving the proposal? It appears that what should be done is to formulate some further condition on the individual terms to which '∃ Introduction' can apply.

Sowa (1983) introduces a system for knowledge representation called 'conceptual graphs', which builds on the 'existential graphs' used by C.S. Peirce for logical representations. In Sowa's system, individuals are represented by nodes, which may be connected by 'coreference links' indicating which nodes stand for the same individual. Existential quantification is implicit: any individual node is by default assumed to be existentially quantified. There is a set of inference rules, by means of which graphs may be derived from each other, and two of which are of special interest here: the ones that involve erasure and insertion of coreference links. About these, Sowa says (1983:155): 'Peirce's rules for drawing and erasing coreference links replace the standard rules of universal instantiation and existential generalization.' He does not give any concrete demonstration of the equivalence of the Peircean rules and the standard rules of inference, however, and we shall soon see that there is indeed a crucial difference between them, with direct bearing on our topic.

To make the discussion more concrete, we shall introduce 'data-bases' of a simple kind, which will be represented by sets of sentences, containing basically proper names and verbs (sometimes with prepositions added). Two examples of such data-bases are given in (7) and (8). For obvious reasons, we shall refer to them as 'first-order data-bases'. The assumption is that we do not have any independent information about the individuals mentioned in the texts.

(7) John lives in England. John loves Mary. Mary lives in Scotland.

(8) John loves Mary. Mary lives in Scotland.

Let us now define 'destructive ∃ Introduction' as an operation which generates a copy of a first-order data-base with the only difference that the indefinite pronoun someone has been substituted for one occurrence of a proper noun in the original data-base. Letting destructive ∃ Introduction apply to John in John loves Mary in (7) and (8) respectively, we obtain (7') and (8').

(7') John lives in England. Someone loves Mary. Mary lives in Scotland.

(8') Someone loves Mary. Mary lives in Scotland.

Using NN as a 'dummy' proper name, we can then use '∃ Elimination' to get to (7'') and (8'').

(7'') John lives in England. NN loves Mary. Mary lives in Scotland.

(8'') NN loves Mary. Mary lives in Scotland.

We can now see that although we have seemingly performed identical operations on (7) and (8), there is an important difference in the effects. (8) and (8'') can be said to be isomorphic - in a way, they describe the same situation or, equivalently, contain the same information, given the assumption that the names are just arbitrary labels. Going from (7) to (7''), on the other hand, there is a clear loss of information: we no longer know that the person who loves Mary lives in England. We may accordingly distinguish two kinds of applications of (destructive) '∃ Introduction': trivial ∃ Introduction, as in (8), which does not entail loss of information, and non-trivial ∃ Introduction, as in (7), which does, and sharpen the condition on P-specificity by saying that it must involve non-trivial ∃ Introduction.

Exactly what information is contained in (7) that makes ∃ Introduction destructive there but not in (8)? We may say that it is the information that the referents of the subjects of the two first sentences in (7) are identical. This information is not expressed as a separate proposition but rather follows from the fact that the same proper name (John) is used twice. In a system like that of Sowa, this kind of information could be expressed by a 'coreference link', which would then get lost in going from (7) to (7''), quite in agreement with Sowa's statement above about the correspondence between ∃ Introduction and erasure of coreference links. But notice that in (8) there is no coreference link to erase - in other words, ∃ Introduction is equivalent to coreference link erasure only in the non-trivial cases.

In order to illustrate this in a more concrete way we shall introduce a notation which is equivalent to the graph representations Sowa uses, but which retains a clausal form (in fact, quite similar to the 'Discourse Representation Structures' of Kamp 1981). As before, we shall use data-bases which are essentially sets of sentences, the main difference being that instead of proper names we use subscripted pronouns. Coreference relations between the pronouns are not indicated primarily by using identical subscripts but by special statements of identity, corresponding to Sowa's coreference links. (7) and (8) will thus correspond to (9) and (10), respectively:

(9) he₁ lives in England. he₂ loves her₃. she₄ lives in Scotland. he₁ = he₂. she₃ = she₄.

(10) he₁ loves her₂. she₃ lives in Scotland. she₂ = she₃.

We can now see that in order to represent (7'') in this format, we have to delete the statement 'he₁ = he₂' from (9), yielding (9'):

(9') he₁ lives in England. he₂ loves her₃. she₄ lives in Scotland. she₃ = she₄.

Both (8) and (8''), on the other hand, will be represented as (10) - this is then what we called the trivial case of ∃ Introduction.
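The difference between trivial and non-trivial destructive ∃ Introduction can be sketched computationally. The representation of first-order data-bases as lists of (subject, verb, object) triples and the function names are assumptions of this illustration; the test for non-triviality used here is simply whether the replaced name still occurs elsewhere in the data-base, i.e. whether a coreference link is lost.

```python
# Data-bases (7) and (8) as lists of (subject, verb, object) triples.
DB7 = [
    ("John", "lives in", "England"),
    ("John", "loves", "Mary"),
    ("Mary", "lives in", "Scotland"),
]
DB8 = [
    ("John", "loves", "Mary"),
    ("Mary", "lives in", "Scotland"),
]


def destructive_exists_intro(db, index, position):
    """Return a copy of `db` in which one occurrence of a proper name
    (sentence `index`, slot `position`) is replaced by 'someone',
    together with the name that was replaced."""
    new_db = [list(sentence) for sentence in db]
    name = new_db[index][position]
    new_db[index][position] = "someone"
    return [tuple(sentence) for sentence in new_db], name


def is_nontrivial(db, index, position):
    """Non-trivial iff the replaced name occurs elsewhere in the data-base,
    so that the replacement erases a coreference link."""
    new_db, name = destructive_exists_intro(db, index, position)
    return any(name in (s[0], s[2]) for s in new_db)
```

Replacing John in the second sentence of `DB7` is non-trivial, because John still occurs in the first sentence and the link between the two occurrences is lost. Replacing John in `DB8` is trivial, since no other occurrence remains.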

THE REFERENTIAL-ATTRIBUTIVE DISTINCTION

We shall now turn to a discussion of the distinction between referential and attributive uses of definite descriptions as introduced by Donnellan (1966):

'A speaker who uses a definite description attributively in an assertion states something about whoever or whatever is the so-and-so. A speaker who uses a definite description referentially in an assertion, on the other hand, uses the description to enable the audience to pick out whom or what he is talking about and states something about that person or thing. In the first case the definite description might be said to occur essentially, for the speaker wishes to assert something about whatever or whoever fits that description; but in the referential use the definite description is merely one tool for doing a certain job - calling attention to a person or thing - and in general any other device for doing the same job, another description or a name, would do as well.' (1966:285)

Donnellan rejects the possibility that the distinction is a function of the speaker's beliefs, using an example that is somewhat similar to the one we used above:

'To use the Smith murder case again, suppose that Jones is on trial for the murder and that I and everyone else believe him guilty. Suppose that I comment that the murderer of Smith is insane, but instead of backing this up, as in the example previously used, by citing Jones' behavior in the dock, I go on to outline reasons for thinking that anyone who murdered poor Smith in that particularly horrible way must be insane. If now it turns out that Jones was not the murderer after all, but someone else was, I think I can claim to have been right if the true murderer is after all insane. Here, I think, I would be using the definite description attributively, even though I believe that a particular person fits the description.'

Thus, according to Donnellan's analysis, the distinction - at least in argumentative texts - does not so much depend on what is in the speaker's mind as on the role of the description in the argument.⁴ We shall illustrate this by spelling out the 'referential' and the 'attributive' arguments explicitly. (Of course, these arguments are not logically valid but should rather be seen as fairly typical examples of everyday reasoning, in which, however, rules of deduction may play an important role.)

(11) The 'referential' argument:
     (a) Someone is sitting in the dock
     (b) Someone murdered Smith
     (c) The man in the dock shows symptoms of insanity
     (d) The man in the dock is insane
     (e) The man in the dock is Smith's murderer
     (f) Smith's murderer is insane

(12) The 'attributive' argument:
     (a) Someone murdered Smith
     (b) Smith was murdered in a cruel way
     [(c) The man in the dock is Smith's murderer]
     (d) Smith's murderer is insane

The difference here seems to be that the referential argument crucially involves a statement which provides an independent way of identifying Smith's murderer. If this statement should turn out to be false, the argument will collapse. In the attributive case, on the other hand, the identification statement can well be deleted without affecting the validity of the argument in any way, as I have indicated by putting (12c) in square brackets.

How is this related to what we have said above about the relation between specificity and the use of deduction rules? There turns out to be a very direct relation, but to see it clearly we have to rewrite the essential premises in the above arguments in a way that does not involve definite descriptions.

(11') (a) he₁ is sitting in the dock
      (b) he₂ murdered him₃
      (c) he₃ = Smith
      (d) he₄ shows symptoms of insanity
      (e) he₄ = he₁
      (f) he₅ is insane
      (g) he₅ = he₁
      (h) he₂ = he₁

(12') (a) he₁ murdered him₂
      (b) he₂ = Smith
      (c) he₃ was murdered in a cruel way
      (d) he₃ = he₂

We can see that according to the definition of P-specificity given above, someone in the statement Someone murdered Smith would be P-specific relative to (11) but not to (12), since the application of ∃ Introduction would be non-trivial only in (11), due to the presence of the coreference link (11'h). We might therefore suggest the following condition on the referential use of definite descriptions:

13 ( 1 3)

A definite description is used referentially if and only i f the existen tial statement it presupposes is derivable by non-trivial 3 I ntro duction .
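The condition in (13) can be illustrated with a small computational sketch. It assumes, purely for illustration, that a data-base is given by its coreference links between discourse referents, and that an application of ∃ Introduction counts as non-trivial when the referent in question is linked to something not mentioned in the statement itself. The function names and this way of operationalizing 'non-trivial' are hypothetical and are not part of the analysis proper.

```python
# Hypothetical sketch: coreference links of (11') and (12') as pairs of referents.

def coreference_classes(links):
    """Group referents into classes of mutually coreferent items."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # shorten the chain while searching
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)
    classes = {}
    for x in list(parent):
        classes.setdefault(find(x), set()).add(x)
    return list(classes.values())


def linked_outside(referent, statement_referents, links):
    """Assumed test for 'non-trivial': the referent is coreferent with
    something that does not occur in the statement itself."""
    for cls in coreference_classes(links):
        if referent in cls:
            return any(other not in statement_referents for other in cls)
    return False  # the referent takes part in no coreference link


# (11'): links (c), (e), (g), (h)
links_11 = [("he3", "Smith"), ("he4", "he1"), ("he5", "he1"), ("he2", "he1")]
# (12'): links (b), (d)
links_12 = [("he2", "Smith"), ("he3", "he2")]

# The murderer referent in (11'b) and in (12'a), respectively.
print(linked_outside("he2", {"he2", "he3"}, links_11))
print(linked_outside("he1", {"he1", "he2"}, links_12))
```

Under these assumptions the murderer referent of (11') is tied to the man in the dock through link (h), whereas the murderer referent of (12') takes part in no link at all, which mirrors the contrast between the referential and the attributive argument.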

EXTERNAL ANCHORING AND STABILITY OF INDIVIDUAL CONCEPTS

To what has been said in this paper so far the following objection may be made: It is all very well to say that a person who uses an indefinite NP P-specifically bases that use on a data-base from which the statement he makes is derivable by something like '∃ Introduction'. But how are we to apply this criterion in practice? How do we know whether the knowledge of the speaker is such that we can apply a logical operation 'equivalent to ∃ Introduction'? More specifically: it is one thing if we have a set of sentences which contain some individual constant - it is then uncontroversial that we can make these sentences into existential claims by '∃ Introduction'. But isn't the problem really what the conditions are for using individual constants or proper names?

Without really claiming to be able to answer those questions, I would like to discuss the relation of the notion of P-specificity to some other concepts, viz. that of external referential anchoring and stability in individual concepts.

Consider a maximally pure case of a fictional text, say the fairy-tale about Cinderella. Such a text clearly defines a 'data-base' or 'relational structure' involving a set of individuals with certain properties and standing in certain relations to each other. But the elements in this structure have no relation - as far as we know - to anything in the real world, or to any other structure outside of it. In the terminology of Chastain (1975), the text is 'referentially segregated', or in the terms of other people, it has no 'referential anchoring'. Still, of course, there are internal coreference links in the story, but none that go outside of it.

It sometimes happens that we read a story, believing it to be fictional, and later on find out that it was in fact about real people and real events. What this illustrates is that having referential anchoring is something that is in principle outside the data-base as such - it is something that may or may not be added to it. We shall illustrate how this might work, using our recently introduced notation for data-bases. Suppose that someone tells me the 'story' in (14).

(14) An old man visited a small girl yesterday. He brought her a present.

The information in (14) could be represented as (14').

(14')
he1 is an old man. She2 is a small girl.
he3 visited her4 yesterday. he1 = he3. she4 = she2.
he5 brought her6 a present. he5 = he1. she6 = she2.

Now, I happen to know about a person who lives in Stockholm and is a teacher. This knowledge could be represented as (15).

(15)
he6 lives in Stockholm. he7 is a teacher. he6 = he7.

Suppose now that I find out that the story I heard really concerns the teacher in Stockholm. Notice that this knowledge is separate, in principle, from both (14') and (15): my total knowledge would have to be represented as in (16).

(16)
he1 is an old man. She2 is a small girl.
he3 visited her4 yesterday. he3 = he1. she4 = she2.
he5 brought her6 a present. he5 = he1. she6 = she2.
he6 lives in Stockholm. he7 is a teacher. he6 = he7.

In this structure, the story and my previous knowledge show up as sub-structures. The point is that the external anchoring of the discourse referents in the story shows up as an ordinary internal coreference link in the total structure. This fact makes it possible to link up the notion of external referential anchoring with the notion of P-specificity. Let us again consider (9) with a slight modification: we have separated out the part of it that underlies the sentence Someone loves Mary:

(9")
he1 lives in England.
| he2 loves she3. |
she4 lives in Scotland.
he1 = he2. she4 = she3.

We can see that if we look at the enclosed part of (9") as a data-base of its own, its referents have 'external anchoring' due to the coreference statements at the bottom of the list. The presence of these coreference statements was also precisely the basis for saying that someone in the sentence Someone loves Mary was P-specific when based on this data-base. In other words, 'having external referential anchoring' and 'being P-specific' are closely related notions.

Let us now return to the question of the conditions for using constants and/or proper names. The rule of ∃ Elimination, by which a 'new' constant is introduced to refer to an individual, resembles the way in which 'arbitrary' proper names may sometimes be introduced in everyday language, e.g. in discourses like the following:

(17) Suppose a man, let us call him John Smith, meets a woman, let us call her Susan Brown, and that John Smith falls in love with Susan Brown...

(17), of course, might grow into quite a long story. What distinguishes this use of proper names from other, more ordinary uses? One thing is immediately clear: regarding (17) and its continuation as a data-base, it will be 'referentially segregated': there is no link from John Smith and Susan Brown to anything outside it. (The same will be true, of course, of proper names used of fictional characters in novels.) This contrasts with the 'normal' use of proper names: a person mentioned in a news item, for instance, will be identifiable also outside that news item. Again, we see that the identifiability of individuals outside the data-base is crucial: this condition singles out P-specific indefinite noun phrases, but it also distinguishes arbitrary and non-arbitrary uses of proper names.

Let us now look at proper names from a slightly different angle, and recall the old discussion about the question whether a proper name has a sense in addition to a reference. Some people have claimed that proper names are abbreviated descriptions. The well-known objection to this theory is that in

most cases, we know so much about an individual that there is a very large set of possible descriptions that each would serve as well as an identification of the individual in question, and thus the choice of one of them as the 'meaning' of a proper name that is used of that individual appears arbitrary. In his discussion of this problem, Kripke (1972) mentions that there are cases when the reference of a name is determined by a description, e.g. when the name Neptune was introduced to refer to a hypothesized planet 'which caused such and such discrepancies in the orbits of certain other planets'. I want to look at somewhat similar cases, when a name is given to a (sometimes hypothetical) historical person for whom there is basically only one possible identifying description. A good example is the Indian linguist [...]

CONCLUSION

In this paper, I have argued for the relevance of deduction rules in formal semantics of natural language. More specifically, recall that many semanticists have equated the meaning of a sentence with the set of propositions that are logical consequences of that proposition. Even if one does not make that identification, it is clear that whenever two sentences have the same truth-conditions, they have the same logical consequences, which also means that if we apply the same logical deduction rules to two sentences with the same truth-conditions, we should get the same results. The upshot of this is that the study of the truth-conditions of linguistic expressions is equivalent to the study of their role as premises in logical arguments. In this paper, I have tried to show the relevance of proof theory for semantics by showing that there are distinctions commonly made in semantics that can

[...] Φ(λx.Ω(λy.P({x,y}))), where [...] the following interpretation: Φ and Ω range over denotations of type T and P is a variable over type ⟨e,t⟩. To see this, note that every application of rule (ii) corresponds to application of a variable and every application of rule (iii) to the abstraction over a variable. So steps (a) and (b) correspond to applying basic (not type-lifted) non-Boolean "and" to x and y, resulting in the doubleton {x,y}. Steps (e), (h) and (j) correspond to the three abstractions in the translation of "and". What is crucial here is that the denotation of "and" after type-lifting is not stipulated ad hoc, but follows from van Benthem's semantics for type shifting.

For a conjunction of two quantificational noun phrases, such as every soldier and every officer, type-lifted non-Boolean conjunction will produce the set of all properties of all pairs of a soldier and an officer. This gives the right interpretation for sentences such as every soldier and every officer met, no soldier and no officer have danced together etc. (cf. also Footnote 1). For cases of mixed conjunctions (i.e. where a quantificational noun phrase is conjoined with a referring term), we now have the option to lift the type of the referring term to that of a generalized quantifier and then apply Boolean conjunction, or else to lift the type of the conjunction operator for one of its arguments. In the latter case we get the set of properties of all pairs consisting of the Pope and a Catholic as the denotation of the conjunction the Pope and every other Catholic.

A major unsolved problem with the type-lifting approach is that we must block the use of type lifting in situations where it is not needed, in particular in the cases of non-Boolean conjunction discussed in the beginning of this paper. Partee and Rooth (1983) suggest a processing strategy, according to which one uses the lowest types possible. However, the facts they had in mind (scopal readings of disjunctions) are of a different status than the ones I am concerned with here. In particular, it seems simply wrong to say that Bill and Harry was watching TV is a possible English sentence, and that its oddness stems from the fact that the sentence Bill and Harry were watching TV is the preferred variant because of some processing strategy. In fact, there are cases where a Boolean interpretation is enforced, such as in the 'X as well as Y' construction, cf. John as well as Harry works on this problem. Here there seems to be no processing problem at all. So we must invoke a rule to use minimal types only, but this rule appears to be not a processing strategy but rather a principle of grammar.
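The two options discussed in this section can be sketched over a toy finite model. The sketch assumes that individuals are strings, that groups are sets of individuals, and that a generalized quantifier is a function from predicates to truth values. All names and the model itself are hypothetical, and the lifted conjunction is written on the reading that the predicate is applied to every pair {x, y} drawn from the two conjuncts.

```python
# Hypothetical toy model: individuals are strings, groups are frozensets.

def lift(x):
    """Lift a referring term to the generalized quantifier of its properties."""
    return lambda P: P(x)


def every(domain):
    """Generalized quantifier for 'every N' over a finite domain."""
    return lambda P: all(P(x) for x in domain)


def boolean_and(Q1, Q2):
    """Boolean conjunction of two generalized quantifiers."""
    return lambda P: Q1(P) and Q2(P)


def lifted_group_and(Q1, Q2):
    """Type-lifted non-Boolean 'and': Q1(lambda x. Q2(lambda y. P({x, y})))."""
    return lambda P: Q1(lambda x: Q2(lambda y: P(frozenset({x, y}))))


soldiers = ["s1", "s2"]
officers = ["o1"]
meetings = {frozenset({"s1", "o1"}), frozenset({"s2", "o1"})}


def met(group):
    """Collective predicate: holds of a group that met."""
    return group in meetings


# 'every soldier and every officer met': true of all soldier-officer pairs.
every_soldier_and_every_officer = lifted_group_and(every(soldiers), every(officers))
print(every_soldier_and_every_officer(met))

# 'John as well as Harry works on this problem': lift both names, conjoin Booleanly.
workers = {"john", "harry"}
print(boolean_and(lift("john"), lift("harry"))(lambda x: x in workers))
```

The sketch leaves the blocking problem untouched: nothing in it prevents lifting where a lower type would do, which is the point at issue in the text.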

9.

It might be supposed that type-lifting will also give us a handle on appositional conjunction. Conjunction of the noun phrases his aged servant and the subsequent editor of his collected papers is blocked at the first-order level, since group formation of a and b is not defined in case a = b. Lifting the type and applying regular Boolean conjunction gives us appositional conjunction for free. However, things are not this simple. It should be noted that appositional conjunction is restricted to definites and indefinites; proper names do not seem to be conjoinable in this way. So we have the following pattern:

(21)
a. My great opponent and the hero of my youth has passed away.
b. A great man and a good father has passed away.
c. A great man and the best magician in New Jersey has passed away.
d. *Dr. Jekyll and Mr. Hyde has passed away.
e. *Charles Dodgson and Lewis Carroll has passed away.

Even though two names can refer to the same entity, appositional conjunction is disallowed. The only way to account for this that comes to my mind is by stipulating that different proper names always correspond to different discourse markers in the discourse representation structure, whereas descriptions may be represented by a single marker. Hence there will always be the possibility of first-order conjunction for proper names, even in cases of coreference, and the availability of such a conjunction will block the higher-order analysis needed for appositional conjunction. As we have seen, some such blocking principle is needed anyway to rule out Boolean conjunctions of non-coreferential singular terms. However, making this distinction seems an ad hoc proposal at the moment, and moreover, it still fails to cover the facts entirely, since it would predict that appositive conjunction of a proper name with a definite description would be possible. As far as I am aware, this is not the case.

(22)
a. *John and my best friend is sick.
b. *My hero and Houdini has passed away.
c. *Amy and a long-time lover lies buried here.

10.

To conclude, let me sum up the main claims of this paper. It was found that quantified noun phrases and referring terms behave differently in conjunctions. This finding makes sense if we take quantified noun phrases to denote generalized quantifiers, while holding on to the view that referring expressions denote individuals. The domain of individuals was sketched with its group structure and compared with alternative proposals. For indefinites, the Kamp/Heim theory of discourse representations was adopted, which has the desirable property of treating indefinites as referring terms instead of existential quantifiers. Since this theory was originally motivated for an entirely different set of phenomena, this paper provides some additional support for it. Finally, it was shown how type raising can be used to relate referring terms to generalized quantifiers so that it is possible to interpret disjunctions of referring terms and mixed conjunctions of referring and quantified expressions. This approach seems preferable over the more unified Montagovian approach, which treats all NPs as having the same logical type.

This brings me to a matter discussed by Ed Keenan (1982) in a paper called "Eliminating the Universe (A Study in Ontological Perfection)". According to Keenan, a semantics for L is ontologically perfect just in case the elements of its ontology are possible denotations for expressions in L. He argues that it is desirable to have an ontologically perfect semantics, for "[o]therwise the denotations of some expressions would be defined in terms of semantic things which we cannot refer to in the language and so in some sense cannot know". While I think we don't have to be able to refer to an object in order to know it (in fact, it is not necessary to speak any language at all to know some objects, as prelinguistic babies seem to show), ontological perfection seems to be a desirable feature. It is closely related to Occam's Razor and similar requirements of parsimony in scientific methodology. The semantics for a fragment of English in Montague's PTQ (Montague 1974) is not ontologically perfect, since it introduces objects without having any expressions denote them. Instead, the objects are needed to build up sets of objects, as denotations for the predicates, and sets of sets of objects, for the NPs and so on. Keenan proposes to eliminate objects and use properties instead as the primitive elements of his Boolean semantics. This paper suggests yet another road to ontological bliss, not by eliminating the universe, but by re-introducing referring terms.

Department of Linguistics
University of Pennsylvania
619 Williams Hall
Philadelphia, PA 19104-6305
USA

NOTES

1. Earlier versions of this paper have been presented at the University of Washington and Stanford University. I am indebted to the audiences at these presentations as well as D. Dowty, C. Roberts and J. Lønning and an anonymous reviewer for comments and criticisms.

2. In many of these cases, both plural and singular agreements are possible. Exactly what causes this variation is not clear to me, but it would seem that the singular agreement is caused by the Boolean nature of the conjunction in these cases (hence semantically motivated) and the plural agreement is due to the formal analogy of these conjunctions with the much more common non-Boolean variety (hence syntactically driven). In the area of agreement, such variation is not uncommon, and usually hard to account for in a rigorous manner. To be sure, the existence of this variation is often taken to be evidence for a syntactic account of number agreement, since there appear to be no semantic differences. However, the position that number agreement is a purely syntactic phenomenon, a position commonly taken in GPSG-studies of agreement and conjunction, such as Sag, Gazdar, Wasow, and Weisler (1985), seems unnecessarily weak. My position is that most facts about number agreement can only be explained (as opposed to described) semantically, but that there remains some arbitrariness which must be ascribed to syntactic encoding. This general position is also taken in Sadock (1983).

3. David Dowty has drawn my attention to the existence of cases where the conjunction of two quantifiers does not behave in a Boolean manner, but rather in the manner of branching quantifiers:

(i) No farmer and no student were ever alike.

Unlike the rather similar examples discussed in Barwise (1979), these cannot be explained away quite as easily by invoking the logical properties of reciprocal predicates, as in the appendix of Hoeksema (1983). For an analysis of these cases in terms of type-lifting of the non-Boolean conjunction operator, see Section 8.

4. In the case of proper names, it should be noted that there is a rather common use of conjoined proper names as singular expressions, namely when the conjunction as a whole is used as a single proper name. Examples of this special use of conjunctions are brandnames (e.g. Strawbridge and Clothier is having a sale; Bolt, Beranek and Newman has hired a linguist; Johnson and Johnson sells baby products), reference to publications by the names of the authors (as in Dowty, Wall and Peters is out of print), etc. Semantically, the internal structure of these names is irrelevant. Each name refers to a single individual-level entity, and not to a group. Of course, this entity may have a certain historical relationship to a group, such as the relationship between a firm and its founders, or that between a paper and its authors, but that relationship does not take part in the interpretation of the complex names under consideration. In this respect, such names are on a par with other complex names, such as booktitles, quotes, or placenames like Bird-in-Hand, Pa. and White Plains, N.Y.

5. Lønning (1986) and Roberts (1987) have also appealed to the distinction between quantificational and referring terms to explain certain differences in distributivity of conjoined expressions.

6. An unsolved problem for almost all accounts of partitives is the non-occurrence of conjoined NPs after partitive of, as illustrated by the ungrammaticality of examples like one of you and me, two of Tom, Dick and Harry, none of Janice and me, etc. Only the account given in Keenan and Stavi (1986) has no difficulty with such cases, because these authors analyze the partitive construction as involving a complex determiner of the form "det of det" combined with a common-noun phrase. While two of the boys or several of my friends allows such a parsing, the cases involving conjunctions do not. However, that account seems flawed for other reasons (cf. Hoeksema 1984 for some discussion).

7. Even more problematic cases like one of every three students actually seem to involve an idiomatic interpretation. Clearly, such expressions do not quantify over all triples of students, but rather over all triples in a given partition of the domain. Note that we could have used one in every three students as well. Otherwise, it is not possible to replace partitive of by in.

8. Of course there is a third reading, according to which none of the three men ever published in the same journal, which does not concern us here.

9. A problem which arises is that either meaning postulates must be allowed to be optional, which is inconsistent, or else rather pervasive lexical ambiguity must be assumed, given the large amount of verbs which give rise to both distributive and collective readings.

REFERENCES

Barwise, J. & Cooper, R. 1981: Generalized quantifiers and natural language. Linguistics and Philosophy 4:159-219.

Bäuerle, R., Schwarze, C. & Stechow, A. von (eds.) 1983: Meaning, Use and Interpretation of Language. De Gruyter, Berlin.

Benthem, J. van 1986: Essays in Logical Semantics. D. Reidel Co., Dordrecht.

Boole, G. 1854: An Investigation of the Laws of Thought on Which are Founded the Mathematical Theories of Logic and Probabilities [Reprint: Dover, New York 1951].

Cresswell, M. 1985: Review of Landman and Veltman (eds.), 1984, Varieties of Formal Semantics. In: Linguistics 23.

Dowty, D. 1986: A note on collective predicates, distributive predicates, and All. In: Marshall, Miller and Zhang (eds.), Proceedings of the Third Eastern States Conference on Linguistics. Department of Linguistics, Ohio State University, Columbus. Pp. 97-115.

Dowty, D. to appear: Type-raising, functional composition and non-constituent coordination. In: R.T. Oehrle, E. Bach and D. Wheeler (eds.), Categorial Grammars and Natural Language Structures. D. Reidel, Dordrecht.

Eijck, J. van 1983: Discourse representation theory and plurality. In: A. ter Meulen (ed.), Studies in Modeltheoretic Semantics. Foris Publications, Dordrecht. Pp. 85-106.

Flickinger, D.P., Macken, M. and Wiegand, N. (eds.) 1982: Proceedings of the First West Coast Conference on Formal Linguistics. Linguistics Department, Stanford University, Stanford.

Gazdar, G. 1980: A cross-categorial semantics for coordination. Linguistics and Philosophy 3:407-409.

Groenendijk, J., Janssen, T. & Stokhof, M. (eds.) 1981: Formal Methods in the Study of Language. Mathematisch Centrum, Amsterdam.

Heim, I. 1982: The Semantics of Definite and Indefinite Noun Phrases. Doctoral dissertation, University of Massachusetts, Amherst.

Hoeksema, J. 1983: Plurality and conjunction. In: A. ter Meulen (ed.), Studies in Modeltheoretic Semantics. Foris, Dordrecht.

Hoeksema, J. 1984: Partitives. MS, Rijksuniversiteit Groningen.

Hoeksema, J. 1986: An account of relative clauses with split antecedents. In: M. Dalrymple, J. Goldberg, K. Hanson, M. Inman, C. Pinon and S. Wechsler (eds.), Proceedings of the West Coast Conference on Formal Linguistics, vol. 5:68-86.

Kamp, H. 1981: A theory of truth and semantic representation. In: Groenendijk et al.

Keenan, E.L. 1982: Eliminating the universe (a study in ontological perfection). In: Flickinger et al., pp. 71-81.

Keenan, E.L. & Faltz, L. 1985: Boolean Semantics for Natural Language. D. Reidel, Dordrecht.

Keenan, E.L. & Stavi, Y. 1986: A semantic characterization of natural language determiners. Linguistics and Philosophy 9-3:253-326.

Ladusaw, W.A. 1982: Semantic constraints on the English partitive construction. In: Flickinger et al., pp. 231-242.

Lambek, J. 1958: The mathematics of sentence structure. American Mathematical Monthly 65:154-170.

Landman, F. 1986: Towards a Theory of Information. The Status of Partial Objects in Semantics. Foris Publications, Dordrecht.

Landman, F. 1987: Groups. MS, University of Massachusetts, Amherst.

Link, G. 1983: The logical analysis of plurals and mass terms: a lattice-theoretical approach. In: R. Bäuerle et al.

Lønning, J.T. 1986: Collective readings of definite and indefinite noun phrases. In: P. Gaerdenfors (ed.), Generalized Quantifiers: Linguistic and Logical Approaches. D. Reidel, Dordrecht.

Montague, R. 1974: Formal Philosophy, edited by R.H. Thomason. Yale University Press, New Haven.

Partee, B. & Rooth, M. 1983: Generalized conjunction and type ambiguity. In: Bäuerle et al.

Perlmutter, D.M. & Ross, J.R. 1970: Relative clauses with split antecedents. Linguistic Inquiry 1:361-383.

Roberts, C. 1987: Modal Subordination, Anaphora and Distributivity. Doctoral dissertation, University of Massachusetts, Amherst.

Sadock, J.M. 1983: The necessary overlapping of grammatical components. In: J.F. Richardson, M. Marks, A. Chukerman (eds.), Papers from the Parasession on the Interplay of Phonology, Morphology and Syntax. Chicago Linguistic Society, Chicago, pp. 198-221.

Sag, I., Gazdar, G., Wasow, T. & Weisler, S. 1985: Coordination and how to distinguish categories. Natural Language and Linguistic Theory 3:117-171.

Scha, R. 1981: Distributive, collective and cumulative quantification. In: Groenendijk et al. Pp. 483-512.

Steedman, M. 1985: Dependency and coordination in the grammar of Dutch and English. Language 61:523-568.

Steedman, M. to appear: Combinatory grammars and parasitic gaps. Natural Language and Linguistic Theory.

Journal of Semantics 6: 41-55

RESTRICTIONS ON DATIVE CLITICIZATION IN FRENCH CAUSATIVES*

JOHAN ROORYCK

ABSTRACT

Causative constructions in French display restrictions as to the cliticization of lexical datives onto the causative. In altogether different frameworks, Fauconnier (1983), Burzio (1986) and Goodall (1987) have related this restriction to the ergative-inergative distinction. However, the inability to formally define ergative verbs in French, as well as further restrictions on the cliticization of datives in causative constructions show that this hypothesis fails to account for the data observed. A thematic condition on dative cliticization in causatives adequately describes the restrictions noted.

1. INTRODUCTION

1 Recent work on causative 'restructuring' constructions in French (Fauconnier 1983; Tasmowski 1984; Burzio 1986) draws the attention to the fact that syntactically similar verbs differ with respect to the cliticization of their animate indirect object or dative complement when inserted into the causative construction. This difference appears most strikingly when verbs corresponding to the NP1 VP à NP2 → NP1 lui2 VP format are constructed with a causative.

(1)
a. J'ai fait parvenir/arriver cette lettre à son amie.
'I made that letter arrive to her friend.'
b. Je lui ai fait parvenir/arriver cette lettre.
'I (to her) made arrive that letter.'

(2)
a. J'ai fait nuire/obéir/ressembler Oscar à ce général.
'I made Oscar harm/obey/resemble that general.'
b. *Je lui ai fait nuire/obéir/ressembler Oscar.
'I made him harm/obey/resemble that general.'

These restrictions also apply to certain verbs selecting both a direct and indirect object (téléphoner, répondre) or two indirect objects (parler) when only the dative is expressed.

(3)
a. J'ai fait/vu téléphoner/parler/répondre Oscar à son frère.
'I made/saw Oscar call/talk to his brother.'
b. *Je lui ai fait/vu téléphoner/parler/répondre Oscar.
'I made/saw him call/talk to his brother.'

(4)
a. J'ai fait/vu donner/conseiller/interdire ce livre à Luc par Max.
'I made/saw give/recommend/refuse that book to Luc by Max.'
b. Je lui ai fait/vu donner/conseiller/interdire ce livre par Max.
'I made/saw him give/recommend/refuse that book by Max.'


In order to explain this observation, both relational grammar (Fauconnier 1983) and Chomskyan generative grammar (Burzio 1986) distinguish two classes among the verbs selecting both a subject and an indirect object. They claim that the superficial subject of 'inaccusative verbs' (RG) or 'ergative' verbs like parvenir, arriver (GG) actually is a direct object at the right side of the verb in deep structure. As such, these verbs cannot constitute an S, but necessarily form a VP. Hence, the anaphor of the dative selected by ergative verbs escapes the Opacity condition when attached to the causative. Both Burzio (1986) and Fauconnier (1983) choose to solve this problem by a double subcategorization of the causative for S and VP respectively. The possibility of sentences like (4b) is explained by the passive interpretation of the embedded infinitive, but this problem will not concern us here.2

Goodall (1987: 128-129) does not accept this double subcategorization scheme for causatives. His analysis mainly rests on a combination of the ergative hypothesis and Case theory. Goodall (1987) assumes that the causative cannot assign accusative case to Oscar in (2b) and (3b) because of the intervening trace of lui. Since Oscar is not adjacent to the complex verb constituted by the causative and the infinitive, Case cannot be assigned and (3b) is ruled out by the Case filter. Goodall (1987) then predicts that whenever the embedded subject does not need Case, the PP complement of the infinitive can freely cliticize on the causative. For Goodall (1987), this is the case in (4b) where the embedded verb need not assign Case to the subject position, since the verb is interpreted as a passive. This situation also occurs in (1b), since inaccusative/ergative verbs do not assign a thematic role and hence no Case to their subject position.

In the remainder of this article, I will critically examine both the analysis based on Case theory and the approaches that only make use of the ergative-inergative distinction. Moreover, I will try to show that a thematic condition on the cliticization of datives onto the causative construction is sufficient to account for the restrictions concerning both 'ergative' verbs (1)-(2) and ditransitive verbs (3)-(4).

2. PROBLEMS FOR CASE THEORY

Goodall's (1987) account of the restrictions on Dative cliticization on the causative does not seem adequate for both theoretical and empirical reasons. A first problem involves the explanation of (3a). The acceptability of this sentence is explained as a result of the extraposition of the dative complement in the following sentence.

(5) *J'ai fait/vu téléphoner/écrire/répondre à son frère Oscar.
'I made call/write/answer to his brother Oscar.'

Goodall (1987: 181n9) assumes that this rule does not involve movement. Consequently, no trace is present to block Case assignment. However, if no movement is involved, extraposition must be viewed as a stylistic rule operating in Phonological Form. This rule should then be ordered before the application of the Case filter, otherwise the rule would have nothing to operate upon: (5) would be excluded by the Case filter because of the intervening PP complement. This ordering of filters and stylistic rules is certainly not a desirable result.

A further problem we want to point out in Goodall's (1987) analysis concerns the exclusion of (5) by virtue of the Case filter. How are acceptable sentences like the following to be explained?

(6) L'infirmière a fait téléphoner à leurs parents tous les enfants qui avaient pleuré pendant la nuit.
'The nurse made call to their parents all the children who cried during the night.'

Apparently, sentences like (5) are perfectly acceptable when the subject is heavier. This pragmatic restriction of 'NP Heaviness' is a well-known condition on the stylistic postposition of embedded subjects in French (see Bailard 1981 for discussion).

(7)
a. Il dit qu'ont été acceptés tous les candidats qui s'étaient présentés ce matin.
'He says that have been accepted all the candidates who came this morning.'
b. *Il dit qu'a été acceptée Violaine.
'He says that has been accepted Violaine.'

Consequently, it seems much more adequate to analyze (5) and (6) along the lines of (7) as sentences where the infinitival subject has been postposed. In this way, the need for a theoretically awkward Dative extraposition rule disappears.

The analysis under discussion also makes for some empirically inadequate predictions. Since the trace of the dative blocks Case assignment in (2b) and (3b), sentences where the dative is subject to Wh-movement should be equally unacceptable. However, this is not the case. (8)

a.

Voilà l'homme à qui j'ai fait/vu téléphoner/répondre les enfants. 'This is the man to whom I made/saw call/answer the children.' b. Voilà la femme à laquelle le sculpteur a fait ressembler sa statue. 'This is the woman to whom the sculptor made resemble his statue.'

(9)

Cunégonde y/lui ressemble/survit/échappe/répond. 'Cunégonde (to it/him/her) resembles/survives/escapes/answers.'

When inserted in the causative construction, the dative cannot be cliticized on the causative, but the y clitic can. Compare (1)-(3) and the following: (10)

a.

Le sculpteur y a fait ressembler sa statue (à l'idée du bonheur). 'The sculptor (to it) made his statue resemble (to the idea of happiness).' b. Mon grand-père y a fait survivre ses trois enfants (à la seconde guerre mondiale). 'My grandfather (to it) made live through his three children (the second World War).' c. J'y ai fait/entendu répondre mon frère avec grand aplomb (à cette question). 'I (to it) made/saw answer my brother undisturbedly (to that question).'

Now Goodall's (1987) analysis predicted that so-called inergative verbs cannot cliticize PP complements onto the causative. The trace of the PP complement should make Case assignment impossible in both (2)-(3b) and (6). Nevertheless, the sentences in (6) are acceptable. For this problem, the only way to save Goodall's (1987) analysis would be to distinguish homonyms for the abovementioned verbs: an 'ergative' verb with y and an inergative verb


The restrictions noted in (1)-(3) only seem to involve cliticization, contrary to what is predicted by Goodall (1987). Moreover, Goodall's (1987) analysis predicts the unacceptability of sentences where PP complements other than datives are cliticized. Verbs like ressembler, échapper, survivre or répondre select an indirect object of the +/- animate type that can be cliticized as lui or y respectively.


with lui. This rather implausible solution brings us to another problem for all analyses outlined in the preceding paragraph: the definition of ergative verbs.

3. PROBLEMS FOR THE ERGATIVE HYPOTHESIS

(11)

a.

Une partie en a bénéficié/profité aux rebelles. 'Part of it profited to the rebels.' b. * Il a été bénéficié/profité aux rebelles. 'There was profited to the rebels.'


All solutions sketched have two serious drawbacks. First, the formal definition of ergative verbs in French does not seem to apply to all verbs allowing for their dative to be cliticized on the causative. A second and more serious problem for the ergative hypothesis lies in the observation that so-called inergative verbs are not the only verbs for which dative cliticization onto the causative is excluded. As far as the formal definition of ergative verbs is concerned, Tasmowski (1984) has pointed out that it is very difficult to define ergative verbs in French, since the formal tests that have been proposed cannot always be applied rigorously. This problem is worth analyzing in some detail. A quick glance at Gross's (1975) lists 5 and 7 shows that 35 verbs correspond to the format NP1 VP à NP2 ↔ NP1 lui2 VP. Of these verbs, 1 belongs to a literary register (agréer), and 16 do not enter the causative scheme because they are stative and have a nonagentive subject.³ 8 verbs enter the scheme NP1 lui Vcaus Vinf NP2, and thus would be ergative: revenir, profiter, incomber, échoir, bénéficier, apparaître, arriver, parvenir. 10 verbs do not enter the scheme, but their indirect object can be realized lexically at the right of the causative construction: céder, échapper, faire obstacle, mentir, obéir, résister, sourire, succéder, survivre, ressembler. Now, considering that most examples adduced in the literature on ergativity in French concern movement verbs, it seems hard to prove that profiter, incomber, échoir, bénéficier are ergative, while échapper, clearly a movement verb, is not. Ruwet (1988) argues that the property of taking être as an auxiliary in the perfect tenses is a sufficient condition for ergativity. According to this definition, échapper could be inergative, since its perfect tenses display avoir in the construction with a dative. However, the ergative status of bénéficier, incomber, and profiter cannot be defined in this way, since they also have avoir in the past tense. Nevertheless, these verbs satisfy some other tests for ergativity cited by Tasmowski (1985): bénéficier, profiter can display a partitive en originating in the 'subject' of the ergative verb, and they do not have an impersonal passive.


(12)

Arnaud lui a fait/entendu donner/conseiller/promettre des livres. 'Arnaud made/heard him give/recommend/promise books (to someone).' 'Arnaud made/heard books be given/recommended/promised to him (by someone).'

When a complement introduced by par is inserted into these sentences, the dative can only be interpreted as the indirect object of the infinitive, since the par-complement absorbs the Agent role. (13)

Arnaud lui a fait/entendu donner/conseiller/promettre des livres par Paul. 'Arnaud made/heard books be given/recommended/promised to him by Paul.'

However, verbs like emprunter, demander, opposer do not generate ambiguous sentences when inserted in the abovementioned construction: the dative lui can only function as the interpretive subject of the infinitive. In causative constructions with these verbs, the dative lui is always of the nonlexical type.


However, Tasmowski (1985: 335-336) points out that these tests are inoperative for a number of reasons that will not concern us here. Finally, Ruwet (1988) notes that ergatives take the repetitive prefix re-. Échoir, bénéficier, profiter do not share this morphological characteristic. Moreover, it can be doubted whether this test applies to all ergatives, since the movement verb arriver, a candidate for ergativity (cf. supra), does not have it either. These problems for a clear definition of ergative verbs in French show that the ergative-inergative distinction in French is too imprecise a tool with which to handle restrictions on causatives. Moreover, it is quite unsatisfactory to note that a clear-cut difference in acceptability of dative cliticization depends on a very sloppy definition of the verbs allowing for this cliticization onto the causative. The ergative hypothesis for causative constructions in French was designed to account for the noncliticization of certain lexical datives onto the causative. However, this hypothesis is at odds with some further restrictions on cliticized lexical datives. The insertion of verbs like donner, promettre, conseiller, interdire in the causative scheme NP1 lui Vcaus Vinf NP2 yields ambiguous sentences. The dative lui on the causative can function as the indirect object of the infinitive or as its interpretive subject (Agent). The dative functioning as the interpretive subject of the infinitive actually is a nonlexical⁴ dative originating in the causative (Milner 1982).

(14)

a. Charles lui a fait/vu emprunter/demander/soustraire cette somme. 'Charles made/saw him borrow/ask/withdraw that sum.'

b. Le directeur leur a entendu opposer cet argument. 'The director heard them oppose that argument.'

c. Je lui ai vu préférer ce candidat. 'I saw him prefer that candidate.'

(15)

a. * Charles lui a fait/vu emprunter/demander/soustraire cette somme par cet escroc. 'Charles made/saw that sum be borrowed/asked/withdrawn from him by that scoundrel.' b. * Le directeur leur a entendu opposer cet argument par son secrétaire. 'The director heard that argument be opposed to him by his secretary.' c. * Je lui ai vu préférer ce candidat par le directeur. 'I saw that candidate be preferred to him by the director.'

The lexical dative of verbs such as demander, emprunter, opposer can only be lexically present at the right of the causative construction to function as the indirect object of the infinitive. (16)

a.

Charles a fait/vu emprunter/demander/soustraire une somme considérable à cet homme par cet escroc. 'Charles made/saw a considerable sum be borrowed/asked/withdrawn from that man by that scoundrel.' b. Le directeur a entendu opposer cet argument au personnel par son secrétaire. 'The director heard that argument be opposed to the personnel by his secretary.' c. J'ai vu préférer par le directeur ce candidat inconnu à son propre frère. 'I saw that candidate be preferred by the director to his own brother.'

These data show that restrictions on dative cliticization also apply to ditransitive verbs which clearly have nothing to do with the ergative-inergative


This absence of ambiguity suggests that the lexical dative clitic of verbs such as emprunter, demander, opposer cannot be attached to the causative. This hypothesis is confirmed by the insertion of par NP, which yields unacceptable sentences.


distinction. It will be clear by now that inergative verbs only constitute a subset of the verbs that do not allow for their lexical dative to be cliticized onto the causative. If the ergative hypothesis were maintained, further restrictions would be necessary in order to deal with these observations. This hypothesis clearly fails to account for all restrictions on the cliticization of lexical datives in the causative construction.

4. A THEMATIC RESTRICTION ON DATIVE CLITICIZATION

(17)

a. * Je lui fais téléphoner/avouer Mathilde. 'I (to him) make call/confess Mathilde.' b. Je lui fais téléphoner/avouer cette histoire par Mathilde. 'I (to him) make call/confess this story by Mathilde.'

Unlike thematic functions of the Agent-Patient type, thematic functions of the Source-Goal type can be thought of as essentially relational. We can say that a Source/Goal function is only fully realized in its link with a Theme. The Theme-Source or the Theme-Goal relation can be conceived of as a chain which bears the thematic function. An independent argument for this position can be found in the fact that the only argument of intransitive verbs can be Agent or Patient, but never Source or Goal.⁵ If this characterization of the Source/Goal relations is correct, the impossibility of (17a) and (3b) is due to the fact that the Goal function cannot be attributed to the indirect


These restrictions can be accounted for straightforwardly when the thematic relations linking the infinitival arguments are taken into consideration. The so-called 'ergative verbs' have a semantic characteristic in common: their indirect object is the Goal argument of the subject-Theme. A parallel observation can be made for infinitives such as donner, conseiller, promettre (cf. (12)), where the direct object is a Theme and the indirect object a Goal argument. In the causative construction, the Theme arguments of both types of infinitive are redistributed around the causative construction as direct objects, and the indirect object Goal can be cliticized on the causative. Whenever the indirect object of the infinitive is not a Goal argument, pronominalization on the causative is impossible. First, it can be noted that the thematic function of the indirect object selected by verbs like emprunter, réclamer, demander, soustraire can be identified as a Source. For the 'inergative' verbs mentioned (mentir, nuire, obéir, résister etc.), this thematic function cannot be clearly defined. However, for our purpose it is sufficient to say that only indirect objects with a Goal function can be cliticized on the causative. The contrast noted in the following sentence can also be explained along these lines.


object. Since the Theme argument is left unexpressed, the thematic Goal chain does not obtain. Consequently, the indirect object clitic cannot be considered a Goal argument, and the sentence is unacceptable by virtue of the general restriction on cliticized Goal datives. On the contrary, (17b) is fully acceptable because the Theme argument is expressed. Note however that the Goal restriction only applies to lexical datives. At first sight, certain nonlexical datives can be interpreted as Source arguments. (18)

If the Goal restriction is only to be applied to lexical datives, we should be able to give a formal definition of both lexical and nonlexical datives. In Rooryck (1987b) it is shown that lexical and nonlexical datives can be distinguished by two formal tests. Unlike the lexical dative, the nonlexical dative cannot appear in the passive construction, or in a construction with a clitic direct object and a lexical indirect object. These properties can be explained if the nonlexical dative is viewed as an essentially clitic element that can marginally be lexicalized (see note 4). Moreover, the absence of nonlexical datives in passive constructions shows that this type of dative has no argument status and is co-selected by the direct object function. (19)

a. *? Je l'ai arraché/confisqué/raflé à Martin. 'I took it from Martin.' b. *? Ce manteau a été arraché/confisqué/raflé à Martin. 'That coat was taken from Martin.'

(20)

a.

Je l'ai demandé/emprunté à Martin. 'I asked/borrowed it from Martin.' b. Ce manteau a été demandé à Martin. 'That coat was asked/borrowed from Martin.'

It can be observed that the thematic function of the nonlexical dative is not stable: in principle, a Benefactive/Malefactive reading obtains, depending on the interpretation of the sentence. (21)

Je lui ai pris ce livre. 'I took that book for/from him.'

However, with certain types of NP (body parts, clothes) this thematic relation can denote a more precise inalienable possession.


Je lui ai fait/vu arracher/confisquer/rafler ce manteau par mon serviteur. 'I made/saw that coat be taken away from him by my servant.'


(22)

a. Je lui ai cassé le bras. 'I broke his arm.'

b. Je lui ai vu cette jupe. 'I saw that skirt on her.'

(23)

a.

Mme. Lafontaine leur a entendu reprocher ces erreurs par l'instituteur. 'Mrs. Lafontaine heard these errors be reproached to them by the teacher.' b. Je lui ai vu pardonner sa tentative de meurtre par le Pape. 'I saw his attempt to murder be forgiven to him by the Pope.'

The lexical dative of the verbs reprocher, pardonner cannot be analyzed as a Goal of the direct object Theme. Nevertheless, the sentences cited are fully acceptable and thus contradict the restriction on cliticized datives. Consequently, we will have to reformulate this descriptive condition if we want to account for these data. In order to achieve this goal, we want to reformulate the thematic relations of the Source/Goal type. For all verbs analyzed, the Theme-Goal relation can be viewed as a relation that obtains possibly (proposer, conseiller, promettre) or necessarily (arriver, parvenir/donner, téléphoner), or that is prevented (cacher, camoufler, interdire) at a time t₁ after the time of action t₀ of the verb itself. In Rooryck (1987a), the Theme-Goal relationship is analyzed as a relation of contact between an argument Y and an argument Z at a time t₁. Likewise, the Theme-Source relation can be described as a contact between an argument Y and an argument Z at a time t₋₁ before the time of action t₀ of the verb under analysis. A Theme-Goal relationship only makes sense when a contact between Theme and Goal is implied. Now, for judgment verbs such as reprocher, pardonner the semantic relation holding between the direct object and the indirect object can also be described in terms of contact.


As we noted above, this nonlexical dative can also function as the interpretive subject (Agent) of the infinitive (Rooryck 1988). We would like to maintain that the Source interpretation is imposed on the thematic instability of the nonlexical dative which normally has a Benefactive/Malefactive interpretation. The abovementioned restrictions on the cliticization of datives in the causative construction can be accounted for by the descriptive condition that only lexical datives with the thematic function of Goal can be pronominalized on the causative. Although this restriction covers the cases hitherto mentioned, some exceptions can be found.


(24)

Je pardonne/reproche cette faute à Louis. 'I forgive/reproach Louis that error.'

(25)

a.

Je lui ai reproché/pardonné son imprudence, donc, de mon point de vue, il a été imprudent/*il a subi l'imprudence. 'I reproach/forgive him his carelessness, so, from my point of view, he has been careless/*underwent carelessness.' b. Ce message lui est parvenu/arrivé/échappé, donc il l'a eu/reçu/*subi. 'That message (to/from him) arrived/escaped, so he has had/*underwent it.' c. Je lui ai donné/demandé ce livre, donc, de mon point de vue, il doit l'avoir/*le subir. 'I gave/asked him that book, so, from my point of view, he should have/*undergo it.'

Note that in all these cases the 'contact' paraphrase cannot be negated without obtaining a contradiction. This shows that the paraphrase can be


From the point of view of the agentive subject, there is a relation of contact between these two arguments: Louis is responsible for the error. As argued in Rooryck (1987a), this relation is independent of the time of action t₀ of the verb. On the contrary, verbs of the type ressembler, nuire, obéir do not imply any contact between the lexical dative and the subject Theme. Verbs like demander, réclamer, emprunter however do imply a contact between the lexical dative and the direct object. Since this relation is between a Theme and a Source, it occurs at a time t₋₁ before the time of action t₀ of the verb. It could be objected that the notion of contact is used metaphorically in the case of reprocher, pardonner, while it is not for verbs implying a 'real' Theme-Goal contact relation. Why do pardonner and reprocher imply contact and not e.g. nuire? However, the presence or absence of a 'contact' relation can be tested by using paraphrases of these relations as relevant inferences. Thematic relations of the Agent-Patient type involve relations of power exerted by someone or something on someone or something. The notion of 'contact' does not imply this type of relation. Rather, it must be expressed as a relation of 'having/being' or 'being responsible for'. A relation of power cannot be expressed in these terms: a Patient undergoes the power of the Agent. The relation of contact can be paraphrased by the verbs avoir, être, recevoir (have, be, receive), a relation of power by subir (undergo). The verbs we have analyzed as implying a 'contact' relation construct sentences to which a 'contact' paraphrase can be adjoined, but not a 'power' paraphrase.

viewed as a necessary implication of the preceding sentence.⁶ The verbs that do not allow for their dative to be cliticized on the causative do not imply 'contact' paraphrases. (26)

a. * Elle lui ressemble/succède/ment, donc elle/il l'a eu/reçu/subi. 'She resembles him/follows him up/lies to him, so she/he has had/received/underwent her/him.' b. Elle lui a obéi/résisté/cédé/survécu, donc elle a dû le subir/*l'avoir/*le recevoir. 'She obeyed/resisted/gave way/survived (to) him, so she has had to undergo/*have/*receive him.'

(27)

a. Ce comportement lui a bénéficié, donc il a dû en obtenir/*subir quelque chose. 'That behaviour benefited to him, so he got something out of it/*underwent it.'

b. Ce comportement lui a nui, donc il a dû en subir/*obtenir quelque chose. 'That behaviour harmed him, so he has had to undergo it/*did not get/got something out of it.'

The metaphorical use of the notion of contact is not only possible, it is even necessary in order to explain certain examples of dative cliticization on the causative. (28)

a. Dieu leur a fait apparaître la Vierge. 'God made the Virgin appear to them.'

b. La Vierge est apparue aux enfants, donc ils l'ont aperçue/*subie. 'The Virgin appeared to the children, so they have seen/*underwent her.'

In addition to the 'psychological' (25a), 'physical' (25bc) or 'indirect' (27a) contact, the paraphrase of (28b) indicates that some sort of 'eye-contact' is necessarily⁷ established between the referents of the arguments of apparaître. For the verbs that do not allow for dative cliticization on the causative, no 'contact' paraphrase can be used as a necessary inference, although in some cases 'power' (Patient) paraphrases are possible (cf. (26b)). This rethinking of thematic relations of the Source/Goal type allows for a reformulation of the restriction on cliticized lexical datives in the causative construction. Only datives that can entertain a relation of contact with an


The contrast between bénéficier (contact, Theme-Goal) and nuire (Agent-Patient) is particularly revealing in this respect.

expressed Theme argument at a moment t₁ after the time of action of the verb can be cliticized on the causative. Since the relation of contact between the direct and indirect object of judgment verbs like reprocher, pardonner is independent of the time t₀ of the verbal action, the restriction formulated also includes these verbs.
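The reformulated restriction can be restated as a small decision procedure. The following Python sketch is only an illustration and not part of the analysis itself: the names Contact, Verb, LEXICON and can_cliticize are hypothetical, the lexicon contains just a handful of the verbs discussed above (written without accents), and the classification of each verb simply follows the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Contact(Enum):
    """Temporal status of the contact between Theme and lexical dative."""
    AFTER = "t1"         # Theme-Goal: contact possible, necessary or prevented after t0
    BEFORE = "t-1"       # Theme-Source: contact holds before t0
    TIMELESS = "any"     # judgment verbs: contact independent of t0


@dataclass(frozen=True)
class Verb:
    lemma: str
    contact: Optional[Contact]  # None: no contact relation is implied


# Illustrative lexicon; the classification follows the discussion in the text.
LEXICON = {
    v.lemma: v
    for v in [
        Verb("donner", Contact.AFTER),
        Verb("promettre", Contact.AFTER),
        Verb("telephoner", Contact.AFTER),
        Verb("arriver", Contact.AFTER),
        Verb("interdire", Contact.AFTER),      # prevented contact
        Verb("emprunter", Contact.BEFORE),
        Verb("demander", Contact.BEFORE),
        Verb("reprocher", Contact.TIMELESS),
        Verb("pardonner", Contact.TIMELESS),
        Verb("ressembler", None),
        Verb("nuire", None),
        Verb("obeir", None),
    ]
}


def can_cliticize(lemma: str, theme_expressed: bool = True) -> bool:
    """Predict whether the lexical dative may cliticize on the causative.

    The dative must stand in a contact relation with an expressed Theme,
    and that contact must hold after t0 or be independent of t0.
    """
    verb = LEXICON[lemma]
    if not theme_expressed:
        return False
    return verb.contact in (Contact.AFTER, Contact.TIMELESS)
```

Under these assumptions, can_cliticize("donner") and can_cliticize("reprocher") come out true, can_cliticize("emprunter") and can_cliticize("nuire") come out false, and can_cliticize("telephoner", theme_expressed=False) reproduces the contrast in (17).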

5. CONCLUSION

Research Assistant of the National Fund for Scientific Research
Department of Linguistics
K.U. Leuven
Blijde-Inkomststraat 21
B-3000 LEUVEN
Belgium

NOTES

* I would like to thank Béatrice Lamiroy, Ludo Melis, Karel Van den Eynde and two anonymous referees of the Journal of Semantics for constructive comments and extensive discussions on this subject, and Dirk Delabastita for improving my English. I would like to express my gratitude to the National Fund for Scientific Research (Belgium) for its financial support.


I have tried to show that the ergative hypothesis is unable to provide a correct account of the restrictions on the cliticization of lexical datives in the causative construction. In order to give a correct description of these restrictions, I have proposed a descriptive semantic condition stipulating that only lexical datives of a certain thematic type may cliticize on the causative. In this way, a single subcategorization scheme can be maintained for the causative 'restructuring' construction of type (4). A last question that I want to raise concerns the theoretical relevance of this analysis. How can the approach presented here be integrated in an existing theoretical framework? Only the pronominal approach presented in Blanche-Benveniste (1984) seems to offer an adequate framework in which to account for these restrictions. Since this approach distinguishes syntactic functions on the basis of their possibility to enter certain constructions, the distinction they already draw between lexical datives of the P2 and the P3 type can be used to formally define the restriction noted on the cliticization of lexical datives on the causative.⁸ On the semantic side, our approach to thematic relations clearly fits in a cognitive semantics framework along the lines of Langacker (1987) for the (prototypical) notion of contact between arguments. In this way, purely structuralist and anti-structuralist currents in respectively syntactic and semantic research seem to converge.

1. As pointed out by Tasmowski (1985: 232-239), Damourette and Pichon (1911-1940: par. 1059-2057) distinguish two infinitival constructions for French causatives. This analysis also shows up in Blanche-Benveniste et alii (1984: 186-188). In the first construction, the complements of the infinitival construction can be cliticized on the infinitive (a). This construction is currently analyzed as a sentential complement containing an infinitive with an overt subject. In the framework of Chomsky (1981, 1986), this construction can be analyzed along the lines of ECM verbs (believe) in English. For such an analysis in the barrier-framework, see D'Hulst and Rooryck (forthcoming). In the second construction, nothing can appear between the main verb and the infinitive, which merge into a complex verb by a restructuring operation (b). This operation is introduced as a Thematic-Index Rewriting rule by Rouveret and Vergnaud, a rule of Union (Fauconnier 1983), or 'Faire-attraction' (Milner 1982). See Rooryck (1988) for a criticism of this type of analysis, which was first advocated by Kayne (1977).

a. Je le fais/entends/vois/laisse leur en donner. 'I make/hear/see/have him give them of it.'

b. - Je leur en fais/entends/vois/laisse donner. 'I make/hear/see/have give them of it.'
- J'en fais/vois/entends/laisse donner par eux. 'I make/hear/see/have give them of it by them.'
- J'y fais/vois/entends/laisse partir/manger Théophraste. 'I make/hear/see/have Théophraste leave/eat there.'

2. This double subcategorization of the causative 'restructuring' construction clearly is in contradiction with the unified account of the causative 'restructuring' construction as a categorial idiom (Rooryck 1988).

3. These verbs are goûter, nuire, répugner, rester, satisfaire, aller, convenir, advenir, appartenir, déplaire, importer, manquer, peser, plaire, réussir, seoir. Nuire is an interesting case, since the verb seems to be acceptable in the causative construction with an animate subject, and unacceptable with an inanimate subject. This opposition is probably due to the strong correlation between agentivity and animacy.

J'ai fait/vu nuire *cette situation/?ce directeur aux intérêts du personnel. 'I made/saw this situation/this director harm the interests of the personnel.'

4. For the distinction between lexical and nonlexical datives, and for the restrictions on the lexicalisation of nonlexical datives, see Leclère (1976), Barnes (1980, 1985), Rooryck (1987b). See Rooryck (1988) for an analysis of the nonlexical dative of causative constructions as the Agent of the infinitive.

5. Moreover, for some ditransitive verbs, a correct thematic description requires that the relation between Theme and Goal cannot obtain: cacher, camoufler, interdire, refuser. Now a negated Goal is simply nonsense, but a negation of the link Theme-Goal by the agentive subject seems to provide for an adequate thematic description of these verbs. Note that they allow for the NP1 lui Vcaus Vinf NP2 construction:

Sa femme lui a fait cacher/interdire le vin par le médecin. 'His wife made wine be stowed away/prohibited for him by the doctor.'

6. However, some verbs imply negated contact relations, where a negated paraphrase is necessary (see note 5).

Je lui ai interdit l'alcool, donc de mon point de vue, il ne pourra plus en avoir/*le subir. 'I prohibited wine for him, so, from my point of view, he is not allowed to have/*undergo it anymore.'

In this case, affirmation of the paraphrase yields a contradiction.

7. This paraphrase cannot be negated without contradiction.

8. Karel Van den Eynde, personal communication.

REFERENCES

Bailard, Joëlle 1981: A functional approach to subject inversion. Studies in Language 5, 1, 1-29.
Barnes, Betsy 1980: The notion of 'Dative' in linguistic theory and the grammar of French. Lingvisticae Investigationes 4: 245-292.
Barnes, Betsy 1985: A functional explanation of French nonlexical datives. Studies in Language 9, 2: 159-195.
Blanche-Benveniste, Claire, J. Deulofeu, J. Stefanini and K. Van den Eynde 1984: L'Approche Pronominale. SELAF, Paris.
Burzio, Luigi 1986: Italian Syntax: a Government-Binding Approach. Reidel, Dordrecht.
Chomsky, Noam 1981: Lectures on Government and Binding. Foris, Dordrecht.
Chomsky, Noam 1986: Barriers. MIT Press, Cambridge Mass.
Damourette, Jacques and E. Pichon 1911-1940: Des Mots à la Pensée. D'Artrey, Paris.
D'Hulst, Yves and J. Rooryck forthcoming: An ECM analysis of French perception and movement verbs.
Fauconnier, Gilles 1983: Generalized union. In: L. Tasmowski and D. Willems (eds.), Problems in Syntax. Communication and Cognition, 195-230.
Goodall, Grant 1987: Parallel Structures in Syntax. Cambridge University Press, Cambridge.
Gross, Maurice 1975: Méthodes en Syntaxe. Hermann, Paris.
Kayne, Richard 1977: Syntaxe du Français. Le Cycle Transformationnel. Le Seuil, Paris.
Langacker, Ronald 1987: Foundations of Cognitive Grammar Vol. I. Stanford University Press, Stanford.
Leclère, Christian 1976: Datifs syntaxiques et datif éthique. In: J. Cl. Chevalier and M. Gross (eds.), Méthodes en Grammaire Française. Klincksieck, Paris, pp. 73-96.
Milner, Jean-Claude 1982: Ordres et Raisons de Langue. Le Seuil, Paris.
Rooryck, Johan 1987a: Les Verbes de Contrôle: une Analyse de l'Interprétation du Sujet Non Exprimé des Constructions Infinitives en Français. Doctoral dissertation, K.U. Leuven.
Rooryck, Johan 1987b: Critères formels pour le datif non lexical en français. To appear in: Studia Neophilologica.
Rooryck, Johan 1988: French causatives and the Dative problem. Preprint nr 115, Department of Linguistics, K.U. Leuven.
Rouveret, Alain and J.-R. Vergnaud 1980: Specifying reference to the subject: French causatives and conditions on representations. Linguistic Inquiry 11: 97-202.
Ruwet, Nicolas 1988: Les verbes météorologiques et l'hypothèse inaccusative. To appear in: Claire Blanche-Benveniste, André Chervel, and Maurice Gross (eds.), Mélanges à la Mémoire de Jean Stefanini.
Tasmowski, Liliane 1984: ?*'lui faire téléphoner quelqu'un d'autre': une stratégie. Lingvisticae Investigationes 8, 2, 403-427.
Tasmowski, Liliane 1985: Faire infinitif. In: L. Melis (ed.), Les Constructions de la Phrase Française. Communication and Cognition, 223-365.

Journal of Semantics 6: 57-93

TOOLS AND EXPLANATIONS OF COMPARISON - PART 1 *

MANFRED BIERWISCH

ABSTRACT

In this paper, I will outline a theory of gradation¹ that builds upon quite a number of previous analyses, preserving as far as possible the concepts that have already been clarified, but modifying the structure of earlier proposals in crucial respects. The reason for adding a new theory to the ones already existing is twofold:

(a) The new theory accounts for a number of relevant facts that have systematically been ignored by earlier analyses.

(b) It relates these facts to those already analysed in a way which does not merely give a descriptive account, but rather an explanation in terms of a few underlying conditions from which the whole range of facts follow in a natural way.

A detailed discussion of the various analyses proposed so far would by far exceed the limits set for the present paper.² I will instead simply list, for the sake of preliminary orientation, the main points that the present theory shares with some or all of its predecessors, and those in which it differs from them. In accordance with other approaches, I will make the following assumptions:

(i) The Positive of relative adjectives must be analysed in close connection with the Comparative, the Equative, and a number of related constructions. More specifically, the constructions in question are all based on a single lexical representation of the adjectives involved.

(ii) The Positive of a relative adjective is interpreted with respect to a contextually determined class of comparison C. Within C, a standard, average, or norm N[C, A] is defined with respect to the property A specified by the adjective in question, so that, e.g., John is tall is interpreted roughly as 'John is taller than N[C, height]'. In the present paper, I will not be concerned with the question how C and N[C, A] are determined, but simply assume that N is available. (I will usually drop the index [C, A] of N.)

(iii) Relative adjectives assign to an individual x a degree d_A where d might be conceived as a class of individuals that are equivalent with respect to A. (This notion will be somewhat modified below.)

Differing from all other approaches, I make the following assumptions:

(iv) The lexical representation of a relational adjective is semantically a kind of three place predicate that relates an individual x, a standard of comparison v, and a difference c. With respect to their semantic type, both v and c are degrees, and the degree assigned to x is composed of the values of v and c.³ One of the possible values of v is N.

(v) Comparative and Equative constructions are related to each other in roughly the following way: the complement clause of the Comparative specifies the value of v, while that of the Equative specifies the value of c.⁴

(vi) Relative adjectives belong to (at least) two classes, which I will call dimensional adjectives (tall, long, heavy etc.), and evaluative adjectives (clever, nice, good etc.). The degrees specified by D-adjectives are extents, the degrees specified by E-adjectives are grades.⁵

(vii) There is a small number of conditions on semantic representations that determine, among others, the value the standard of comparison v can assume in specified configurations.

* Editorial note: Because of its unusual length this paper appears in two parts. Part 2 (i.e. Sections 5-9 of the paper) will be published in JS 6.2.

I . SOME RELEVANT FACTS

In this section, I will briefly discuss three groups of facts that motivate the need of a new theory, as they cannot reasonably be captured by any of the earlier theories. Before turning to the details, I will introduce two notions that are useful in this respect . As is well known, relational adjectives, as a rule, come i n pairs of anto nyms, such as tall vs. short, high vs. low, good vs. bad, clean vs. dirty etc. This pairing exhibits a fair amount of lexical idiosyncrasies, and it is far more regular for D- than for E-adj ectives. As this kind of antonymy is in trinsically related to the phenomena of gradation, an interesting theory of gradation should provide a systematic account of antonymy and the dif ferent role it plays for D- and E-adjectives. For the moment, I will simply call the two sets into which adjectives are to be grouped in this respect + Pol and - Pol adjectives, respectively. Typical examples are the following: (1)

D-Adjectives: a. + Pol: tall, long, high, heavy, old b . - Pol: short, lo w, light, new, young

(2)

E-Adjectives a. + Pol: good, beautiful, pretty, clever, intelligent b . - Pol: bad, ugly, plain, stupid

As will be seen below, the consequences of the + Pol/ - Pol distinction are far more complicated than has been recognized so far.


To conclude this preliminary outline, I should emphasize that more important than the list of individual points relating the present theory to or distinguishing it from other proposals is the general structure of the theory, which is different from its predecessors. This will become clear as we proceed.

It has been noted above that the interpretation of relational adjectives may involve a contextually determined norm N. I will call an expression norm-related (or, for short, NR) if its interpretation involves N. In this sense, the examples in (3) are norm-related, while those in (4) are not:

(3) a. John is short.
    b. Bill is tall.
    c. I know that he is tall.

(4) a. How tall is Bill?
    b. I know how tall he is.
    c. Bill is five feet tall.

Norm-relatedness is mostly looked upon as a phenomenon that is attached, one way or another, to the Positive of relational adjectives. Its actual distribution, however, is far more complicated, as will be seen immediately. As a matter of fact, it cannot be captured along the lines that have been proposed so far.

The first group of facts, to which I will now turn, concerns certain asymmetries between +Pol and -Pol adjectives. To begin with, the following contrast, though often recognized, has never been captured in a systematic analysis:

(5) a. Bill is five feet tall. (-NR)
    b. *Bill is five feet short. (+NR)

Notice that (5b) is deviant, but nevertheless has a definite interpretation that might be paraphrased by (6):

(6) Bill is five feet tall, and that is short.

To what extent sentences like (5b) and their re-interpretation (6) are admissible, although they are deviant, depends in part on the individual adjective. The difference between (5a) and (5b) is quite clearcut, though. Notice that the two cases differ not only in acceptability, but also with respect to norm-relatedness. Hence an account that simply assigns an ungrammatical status to the combination of a measure phrase with a -Pol adjective will not do. This is also shown by the related contrasts in (7) and (8):

(7) a. How tall is Bill? (-NR)
    b. ?How short is Bill? (+NR)

(8) a. I know how tall he is. (-NR)
    b. I know how short he is. (+NR)

The status and interpretation of (7b) is somewhat dubious, while (8b) seems to be unproblematic, meaning something like (9):

(9) I know how far his height is below the average.

Consider next Comparative and Equative constructions:

(10) a. John is taller than Bill is. (-NR)
     b. Bill is shorter than John is. (-NR)

(11) a. John is as tall as Mary is. (-NR)
     b. John is as short as Mary is. (+NR)

Comparatives are not norm-related, as is shown by the possibility of continuing the examples in (10) in the following way:

(12) a. John is taller than Bill, though both are fairly short.
     b. Bill is shorter than John, but both are tall.

The Equative, however, shows the same asymmetry as the examples with measure phrases and Wh-words. Although these facts, together with the different pattern of norm-relatedness in E-adjectives, to which we will turn below, have been discussed in the context of presupposition,⁶ no systematic account has been provided by any of the theories of comparison so far.

Similar phenomena are to be observed with respect to measure phrases (MPs):

(13) a. John is five inches taller than Bill is.
     b. Bill is five inches shorter than John is.

Thus, while only +Pol adjectives allow for a regular MP in the Positive, both +Pol and -Pol adjectives can take an MP in the Comparative. Notice, incidentally, that the MP specifies the difference in the Comparative, while it specifies the whole extension in the Positive. This fact has an obvious analogy in the following contrast:

(14) a. John is three feet tall.
     b. John is three feet too tall.
     c. *John is three feet short.
     d. John is three feet too short.

Formally, this difference is reflected in different wh-phrases:

(15) a. I know how tall he is.
     b. I know how much too tall he is.
     c. I know how much taller he is than Bill.

MPs that go with the Equative are of a different kind: they do not specify units, but multiples of degrees. The crucial point is that these MPs combine only with +Pol, not with -Pol D-adjectives:

(16) a. John is twice as tall as Bill is.
     b. Bill is half as tall as John is.

(17) a. *John is half as short as Bill is.
     b. *Bill is twice as short as John is.

Notice that sentences like (17) are not only deviant, they also lack a derivative interpretation in the sense observed for the Positive construction (5b). It is by no means clear in advance whether and in which way the facts discussed so far are related to each other and to other relevant phenomena. A reasonable theory should, of course, provide a principled answer to these questions.

The second group of facts concerns the asymmetry of D- and E-adjectives in regard to the phenomena discussed so far with respect to D-adjectives only. Notice first of all that E-adjectives are not only less systematic with regard to the +Pol/-Pol antonymy. (Many of them, such as wise, tough, obscure, do not have a clear counterpart at all, or they derive it by morphological processes as in unlucky, inelegant, immobile, etc.) They are also characterized by a different nature of the norm or standard to which gradation might be related. Deferring this problem, I will first note that for them norm-relatedness exhibits a rather different pattern:

(18) a. She is clever. (+NR)
     b. You know, how clever she is. (+NR)
     c. She is cleverer than her sister. (+NR?)
     d. She is as clever as her mother. (+NR)

(19) a. She is dull. (+NR)
     b. You know, how dull she is. (+NR)
     c. She is duller than her sister. (+NR)
     d. She is as dull as her mother. (+NR)

The question mark in (18c) indicates that norm-relatedness for +Pol E-adjectives is sometimes dubious in the Comparative.⁷ In general, however, implicit reference to N seems to be indispensable for E-adjectives.

Consider next the admissibility of measure phrases. It seems to be a trivial fact that E-adjectives do not take MPs, as they are not associated with scales for which units are defined. This is not the whole story, though. Suppose there is a contest in which cleverness or beauty is scored. We might then have sentences like:

(20)

a. She is three points cleverer than her sister.
b. John is almost two points better than all other candidates.

Even under these rather peculiar conditions, sentences like (21) remain strange:

(21) a. *She is five points clever.
     b. *John is ten points good.

While these types of MPs are marginal, multiplicative MPs are completely natural for E-adjectives. The crucial point here is that they are not restricted to +Pol adjectives. Thus, while the sentences in (17) are deviant, both sentences in (22) are well-formed and have an unequivocal interpretation, which is, of course, not necessarily precise in an arithmetical sense:

(22) a. John is at least three times as intelligent as Bill is.
     b. Bob is only half as stupid as his brother is.

The third group of facts to be discussed concerns the adjectives that are admissible in the degree clauses of Comparative and Equative constructions. According to all theories of comparison, sentences like (23a) are related, either by deletion or by interpretive rules, to those like (23b):

(23) a. John is taller than Bill (is).
     b. John is taller than Bill is tall.

The correctness of this assumption becomes extremely dubious for the corresponding -Pol adjectives:

(24) a. John is shorter than Bill (is).
     b. *John is shorter than Bill is short.

Once the ungrammaticality of (24b) is recognized, the status of (23b) becomes dubious as well. Notice, moreover, that for semantic reasons (24a) could more plausibly be related to (25):

(25) John is shorter than Bill is tall.

This is not the right solution either, however: while (24b) is simply deviant, (25) is not synonymous with (24a); it has a different, though somewhat marginal, interpretation that might be paraphrased in the following way:

(26) The degree to which John's height is below the norm is greater than the degree to which Bill's height is above the norm.

Similar problems arise with respect to Equatives:

(27) a. ?John is as tall as Mary is tall.
     b. ?John is as short as Mary is short.

While both of these sentences are dubious as a possible source for their counterparts without the second occurrence of the adjective, this time the -Pol case is at least semantically appropriate, in the following sense: both (27b) and its reduced counterpart (11b) can be paraphrased by (28), in accordance with the +NR status of (11b).

(28) John's height is as much below the norm as Mary's.

For the +Pol counterpart (27a) and the reduced form (11a) the corresponding paraphrase relation does not hold, as (11a) is not norm-related.

Before turning to further ramifications of the problem at hand, I will briefly touch on the role of pitch accent that is involved here. Consider once more example (25), which has the alleged interpretation (26) if and only if tall has a pitch accent that brings into focus its antonymous relation to the matrix adjective short. Without the pitch accent, (25) does not have the meaning (26), nor is it synonymous with (24a), but is simply ungrammatical. This observation sheds some light on one factor involved in the questionable status of (27) as well as (23b) and (24b). In these sentences the second adjective cannot receive pitch accent. But it cannot normally remain without it either, if we assume a condition like (29):

(29) Adjectives in a degree complement clause that are related to the matrix adjective must be assigned a pitch accent.

This condition is related to the semantic aspect in an obvious way: pitch accent means focus, and focus means new information. In other words, adjectives that simply repeat the property A of comparison already fixed by the matrix adjective cannot normally be realized in the surface. It goes without saying that (29) is rather ad hoc and must be reduced to more general principles that relate semantic interpretation to surface structure. It serves the purpose, however, of sorting out a factor that interacts with the semantic properties of adjectives in producing the phenomena under discussion.

Having separated the consequences of pitch accent and focus, which account for the corresponding phenomena in (30) to (31) based on E-adjectives, quite a number of purely semantic phenomena remain.

(30)

a. ?John is cleverer than Bill is clever.
b. ?Bill is more stupid than Mary is stupid.

(31) a. ?He is as good as she is good.
     b. ?He is as bad as she is bad.

Consider first the possibility of comparing two different properties or dimensions, as in the following cases:

(32) a. The table is longer than the door is high.
     b. The bridge is as long as the river is wide.

(33) a. Mary is nicer than her brother is clever.
     b. The plot is as dull as the music is ugly.

As all the adjectives are contrastive, they have pitch accent in accordance with (29). Observe now the following asymmetry between D- and E-adjectives:

(34) a. Mary is nicer than her brother is stupid.
     b. The plot is as interesting as the music is boring.

(35) a. *The closet is higher than the table is short.
     b. ?The tree is as high as the road is narrow.

Once two E-adjectives are construed as commensurable, +Pol and -Pol items combine rather freely. This does not hold for D-adjectives: while high and short or narrow relate to one-dimensional spatial extension, thus being directly commensurable, the combination of +Pol and -Pol D-adjectives is highly restricted. More generally, -Pol D-adjectives cannot normally appear in a degree clause. This restriction seems to hold more strictly for Comparative than for Equative constructions. The observation is related to a final point to be made. Sentences like (35b), which cannot be interpreted in the direct dimensional sense, can be re-interpreted in analogy to E-adjectives. Under this secondary interpretation, (35b) means something like (36):

(36) The grade to which the tree is high is the same as the grade to which the road is narrow.

(This secondary interpretation is similar, though not identical, to the way in which (25) receives the interpretation (26).) Once D-adjectives are re-interpreted as E-adjectives, their +Pol and -Pol elements combine freely, as is the case for E-adjectives in general. This secondary interpretation is more easily available for Equative than for Comparative constructions for systematic reasons. Hence (35b) is less deviant than (35a), whose secondary interpretation would be something like (37):

(37) The grade to which the closet is high is greater than the grade to which the table is short.

For obvious reasons, this group of facts has a wide range of ramifications, a full display of which would take us too far afield. As can be seen from the above examples, differences in acceptability are sometimes rather subtle, and judgements are in many cases controversial. On closer inspection, however, even these uncertainties turn out to be anything but mere chaos; they rather fall into a fairly robust pattern, which a reasonable theory should be able to account for.

There are a number of relevant generalizations to be extracted from the facts discussed in this section and from a number of related phenomena that will come up as we proceed. Instead of stating these generalizations explicitly, I will go on and develop the outlines of a theory from which these facts follow in a rather natural way.

2. BASIC ASSUMPTIONS

The theory to be proposed is to be conceived within a modular view of knowledge structures in the sense discussed in Chomsky (1980). The facts that are of interest in the present context are determined by the interaction of two major systems: the grammar G and the conceptual system C. Both of these systems are modular in their internal organization. As to C, a general conception of which is still lacking, I will merely assume that it contains, among other elements, a subsystem C_S, in terms of which the scalar interpretation of quantitative judgement and comparison is organized. Although the theory of scales and measurement developed originally with respect to problems of psychophysics provides a technical framework in terms of which C_S might be made precise, I will rely on largely intuitive notions specifying types of scales and the conceptual structure of acts of comparison. For the time being, it is sufficient to assume that representations determined by the rules and principles of C_S provide the interpretation of linguistic structures expressing gradation. How this component of conceptual organization is interrelated with other conceptual domains, as well as with perceptual, motoric, and possibly further cognitive systems, must be left open in the present context. It is to be hoped, though, that systematic exploration of particular domains, such as gradation, will eventually contribute to a more comprehensive theory of the conceptual system C and the representations it has to provide.

As to the structure of G, I will accept without further discussion the basic assumptions of the Revised Extended Standard Theory (REST) as developed in Chomsky (1977; 1981), according to which the various components of G determine a system of representations that are related in the following way:

(38)           D-Structure
                    |
               S-Structure
                /        \
     Phonetic Form      Logical Form

While PF-representations are ultimately related to articulatory and perceptual patterns, LF-representations must be related to conceptual structures. Whether this relation results from a direct mapping of LF into conceptual representations, or is mediated by intervening levels of representation, is not clear in advance. The first alternative is argued for e.g. by Jackendoff (1978; 1984), while an intervening level of semantic representation, the rules and principles of which are still part of the linguistic knowledge, i.e. of the grammar G, is assumed e.g. in Bierwisch (1981; 1982) and will be pursued in what follows. The ultimate justification for positing an intermediate level of semantic representation must be provided by relevant generalizations that cannot be stated without recourse to the level in question. The fact that a fairly wide range of phenomena related to gradation can be explained by means of conditions referring crucially to semantic representations is a case in point and hence of interest with respect to the status of the representations in question.

Let me briefly outline the basic characteristics of the intended component of Semantic Form SF. Like any other component of G, the SF-component requires a specification of the general format of pertinent representations and of the rules and principles that relate these representations to other components of G. The general form of SF-representations is that of expressions in a lambda categorial language. For the sake of illustration, suppose that (40) is the SF-representation of (39):

(39) Who did John expect to visit Mary

(40) [Tree diagram; the layout is not recoverable from the source. Terminal string: Q x_i PERSON x_i PAST EXPECT JOHN VISIT x_i MARY. Node labels include N, S, S/N, S/S, (S/S)/S, ((S/S)/S)/N and (S/N)/N; the root is labelled S.]

This example is based on more or less standard assumptions; it is not meant to make substantial claims with respect to particular details. What it illustrates might be summarized as follows:

(a) SF-representations are labelled trees made up of three types of elements: categories, such as S, N, S/S, etc., semantic primes, such as Q, PAST, PERSON, etc., and variables x, x1, x2, etc. Categories are attached to non-terminal nodes; they categorize the dominated subtrees, including the variables and primes attached to the terminal nodes.

(b) Semantic primes and variables are interpreted by appropriate structures of the conceptual system C, in much the same sense in which the primes of PF-representations are interpreted by perceptual and articulatory parameters. I will have to say more on this in the sequel with respect to the primes involved in gradation.

(c) Categories are either basic or complex, the latter dominating functors, so that a category a/b, where a and b are categories, dominates a functor which combines with an argument of category b, yielding a complex expression of category a. The system of categories thus defines the syntax of possible SF-representations. It furthermore specifies the ontology that SF projects on the conceptual system C, insofar as the categories of SF determine the type of conceptual structures in terms of which the categorized expressions are interpreted.⁸
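The combinatorics described in (c) can be made concrete with a small sketch. It is not part of the original analysis: Python is used merely as a notation, and the names Basic, Functor, apply_functor and show are hypothetical. The sketch only checks that a functor of category a/b combines with an argument of category b to yield category a.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Basic:
    """A basic category such as S or N."""
    name: str


@dataclass(frozen=True)
class Functor:
    """A complex category a/b: takes an argument of category b, yields a."""
    result: "Category"
    argument: "Category"


Category = Union[Basic, Functor]

S, N = Basic("S"), Basic("N")


def apply_functor(functor: Category, argument: Category) -> Category:
    """Combine a functor of category a/b with an argument of category b."""
    if not isinstance(functor, Functor):
        raise TypeError("a basic category cannot take an argument")
    if functor.argument != argument:
        raise TypeError("argument category does not match")
    return functor.result


def show(category: Category) -> str:
    """Render a category in the slash notation used in the text."""
    if isinstance(category, Basic):
        return category.name
    left = show(category.result)
    right = show(category.argument)
    if isinstance(category.result, Functor):
        left = f"({left})"
    if isinstance(category.argument, Functor):
        right = f"({right})"
    return f"{left}/{right}"


# A one-place predicate of category S/N applied to a variable of category N.
one_place = Functor(S, N)
print(show(apply_functor(one_place, N)))   # S

# A functor of category (S/S)/S applied to an S yields a sentential operator.
quantifier_like = Functor(Functor(S, S), S)
print(show(apply_functor(quantifier_like, S)))   # S/S
```

A mismatch, such as applying a functor of category S/N to an expression of category S, raises an error in this sketch, which corresponds to the expression not being a well-formed SF-representation.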

Turning next to the rules and principles that relate SF to other levels of representation, or more precisely, to LF, we must recognize two components: (a) the lexical rules specifying the SF-representation of lexical items, and (b) the combinatorial rules determining the compositional representation of syntactically complex expressions. The first component embodies the full range of lexical idiosyncrasies, although it is based, of course, on general principles. The second component consists in a rather limited system of universal principles.

Before I illustrate some of the relevant properties of the two components, two general remarks are in order. Consider first the interaction of lexical information with rules and principles of G. With respect to the syntactic components, REST recognizes at least four types of relevant lexical properties:

(41) a. Syntactic subcategorization
     b. Features determining intrinsic Case
     c. Properties determining θ-roles
     d. Properties determining lexical Control

A crucial condition governing the interaction of lexical information with syntactic rules is the 'projection principle' proposed in Chomsky (1981: 29):

(42) Representations at each syntactic level (i.e. LF, and D- and S-structure) are projected from the lexicon, in that they observe the subcategorization properties of lexical items.

More generally, the projection principle guarantees that lexical items assign thematic roles to their arguments invariably at all syntactic levels. The way in which the projection principle is maintained is paved by the trace theory of movement rules, which allows traces left in the initial position of moved constituents to be interpreted as variables at LF. This in turn is directly relevant to the present discussion, as θ-roles and lexical Control are lexical properties that are substantially related to, or even emanating from, the semantic content of lexical items. Insofar as the projection principle guides the interaction of autonomous syntactic principles preserving the assignment of θ-roles (and possibly other lexical properties) at all syntactic levels, it paves the way, so to speak, along which aspects of meaning enter the computational structure of language.

This leads to the second remark. In order to interact with computational rules and principles, θ-roles must themselves be amenable to computational processes, i.e. they are to constitute a structural aspect of meaning that necessarily induces a corresponding structure into the semantic representation of lexical items. This notion can be pursued in various ways. Suppose that the semantic representation of lexical items exhibits a certain amount of internal structure, i.e., that it is made up from structural components that provide something like a skeleton for principles of conceptual interpretation to work with. Such components would have to be justified on the one hand by systematic relations within the lexical system of possible languages, on the other hand by rules and principles providing a coherent interpretation in terms of conceptual representations. Suppose, furthermore, that the internal semantic structure of lexical items is organized along the lines outlined above for SF-representations in general, so that the ultimate primes of SF are the components from which lexical items are made up according to the syntax of SF. On the basis of these assumptions, θ-roles can be defined by means of primes of SF along the following lines:

(43) If a is an argument of P_i at SF, where P_i is a designated prime of SF, then P_i assigns to a a particular semantic role S_i. A θ-role θ_k that a lexical item LI assigns to a constituent C is the composite of the semantic roles S_j assigned by the SF-representation of LI to the SF-representation of C.⁹

The intuitive notion behind (43) is to define θ-roles in terms of configurations at SF, in a similar vein as grammatical functions are defined in terms of phrase structure configurations.

Instead of developing a more technical account of this notion, I will now turn to the rules that assign SF-representations to syntactic structures. Let me first illustrate the lexical component of these rules by means of a standard example. Suppose that, in accordance with traditional assumptions about semantic decomposition, the SF-skeleton of give can be represented in the following way:

(45) [Tree diagram; the layout is not recoverable from the source. Its terminal string corresponds to the bracketing [CAUSE [DO x u] [CHANGE [NEG [HAVE y z]] [HAVE y z]]], with category labels on the non-terminal nodes and S at the root.]

Under appropriate conceptual interpretation of the posited primes, (45) claims that x gives z to y means something like 'x's doing something u brings about a change from y's not having z to y's having z'. I do not want to justify the substantial details of this analysis.¹⁰ What is of interest here is the general format of lexical representations, with respect to which a number of comments will be made. First, as already mentioned, lexical SF-structures, which assign lexical items their representation at SF, are subject to the general conditions on SF-representations. That is clearly the case in (45). Secondly, the SF of lexical items, like any other kind of lexical information, should be redundancy-free, insofar as general properties of the lexical system can be expressed by redundancy rules. A case in point is the fact that, in general, only the final state of the CHANGE-relation is lexically specified, while the initial state can be derived by the following redundancy rule:

(46) [CHANGE-TO x] → [CHANGE [NEG x] x]
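The effect of the redundancy rule (46) can be sketched as a rewriting step over bracketed structures. This is an illustration only: representing SF-expressions as nested Python tuples and the function name expand_change_to are assumptions made here, not part of the original proposal.

```python
def expand_change_to(term):
    """Rewrite every [CHANGE-TO x] as [CHANGE [NEG x] x], as in rule (46).

    SF-expressions are modelled (hypothetically) as nested tuples whose
    first element is a prime and whose remaining elements are arguments.
    """
    if not isinstance(term, tuple):
        return term  # a variable or a constant
    expanded = tuple(expand_change_to(part) for part in term)
    if len(expanded) == 2 and expanded[0] == "CHANGE-TO":
        final_state = expanded[1]
        return ("CHANGE", ("NEG", final_state), final_state)
    return expanded


# Only the final state is lexically specified ...
lexical = ("CAUSE", ("DO", "x", "u"), ("CHANGE-TO", ("HAVE", "y", "z")))

# ... and the initial state is supplied by the rule.
print(expand_change_to(lexical))
# ('CAUSE', ('DO', 'x', 'u'), ('CHANGE', ('NEG', ('HAVE', 'y', 'z')), ('HAVE', 'y', 'z')))
```

The expanded structure matches the gloss given for (45): a change from y's not having z to y's having z.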

Thirdly, as stated so far, (45) is an open proposition involving four unbound variables: u of category S, and x, y, and z of category N. The verb give, however, should come out as a three-place predicate, rather than as an open proposition, and it should, moreover, assign appropriate θ-roles on the basis of its SF-structure to the subject, direct, and indirect object, respectively. This will be achieved by binding the variables in (45) by two types of operators in the lexical entry for give:

(47) give; [-N, +V], [_ NP2 NP3], [_ NP3 to NP2]; x̂3 x̂2 [x̂1 [Vx4 [CAUSE [DO x1 x4] [CHANGE-TO [HAVE x2 x3]]]]]

As (47) indicates, those variables that correspond to the proper arguments of give are bound by lambda-abstractors, which serve two interrelated purposes: they turn the open proposition (45) into a three-place predicate, and they play a crucial role in the combinatorial rules, to which we turn immediately. The remaining variable x4 (that is, u in (45)), which cannot be related to a syntactic constituent at LF, is bound by an operator that has the standard properties of an existential quantifier at SF, but a somewhat different function with respect to conceptual interpretation, as it selects a contextually determined instance of the appropriate type at the level of conceptual representation. (I will return to this point below.)

Fourthly, the abstractor in a sense 'collects' the semantic functions assigned to the occurrences of the variable it binds inside the lexical representation, thereby creating the θ-role associated with the variable in question.

Finally, subcategorized constituents must be connected with a θ-role (although not every θ-role is connected to a lexically subcategorized constituent). This connection can formally be expressed in various ways. In the format exemplified by (47) it is represented by identical subscripts of variables and subcategorized constituents: x2 and x3 are connected to the indirect and direct object, respectively, while x1 provides the θ-role of the subject; it is therefore not connected to a subcategorized constituent.

It should be obvious from these remarks that abstractors play a crucial role in the combinatorial rules, to which we turn now. The basic process is that of lambda-conversion according to the familiar equivalence (48), applying to lexically interpreted LF-representations.

(48)

x̂ [ ... x ... ] a ≡ [ ... a ... ], where x and a are of the same category.

We will say that an argument a specifies x̂ if a replaces the variables bound by x̂ according to (48). The main combinatorial rule can now be formulated as follows:

(49) Argument Rule: If α and β are directly dominated by γ in LF, and x̂ [φ] is the SF-representation assigned to α, and ψ is the SF-representation assigned to β, and ψ is of the same category as x, then ψ specifies x̂.

The result of Rule (49) is an SF-representation φ' that results from φ by substituting all occurrences of x in φ by ψ. If (49) is applied to all constituents dominated by γ, we get an integrated SF-representation χ, which will be assigned to γ as its representation at SF. Notice that χ might still contain an abstractor, i.e. it might be of the form x̂ [ ... ], if γ does not dominate a constituent whose SF-representation appropriately specifies x̂. In that case, x̂ is open to specification at the next level up in LF. Thus, starting with the SF-representations provided by lexical rules, the Argument Rule (49) recursively assigns integrated SF-representations to all constituents of a given LF.

A number of amendments are required, of which I will mention the following:

(50) Unspecified Argument Rule: ... x̂ [φ] is replaced by ... Vx [φ], where ... does not contain any ŷ.

This rule turns an abstractor that is not specified by (49) into a referential quantifier of the type mentioned earlier, i.e., an operator that selects a contextually determined instance at the conceptual level. Both (49) and (50) are optional and unordered, as are other rules of G.

Instead of pursuing further technical details that must be clarified in order to make the combinatorial rules work appropriately, I will conclude this sketch of the SF-component with a simple example that illustrates the basic ideas. According to standard assumptions, (51a) is derived from the D-Structure (51b) by moving the object NP into the subject position, yielding the S-Structure (51c).

72 (5 1 )

a. b. c.

The book was given to John. l s [5 [ N P e ) INFL[v p b e [given [ N pthe book] lpp t o John]]]]] l s [ 5 [ NP · the book] I NFL[yp be [given [ N P· e ) [PP t o John]])]] I

I

From (51c), the LF-representation is derived by rules of construal which, among other things, interpret indices as variables. Following Higginbotham (1983), I will assume in particular that NP-specifiers are operators that bind appropriate variables, so that (52) will emerge as the LF-representation of (51):

(52) [S [NP the x_i book x_i] Past [VP be [given x_i [PP to John]]]]

Suppose now that along with (47), which assigns the SF-representation to give, we have the following lexical rules:

(53) a. book:          x̂ [BOOK x]    with BOOK of category S/N
     b. John:          JOHN          of category N
     c. the:           x̂ [DEF x]     with DEF of category (S/S)/S
     d. Past:          PAST          of category S/S
     e. be [+Passiv]:  x̂ [x]         with x of category S

The semantic constants appearing in (53) are all abbreviations, which need not be analyzed here. I will assume without further justification that PAST is a sentential operator, and DEF a restricted quantifier which turns a nominal into a sentential operator. The passive auxiliary is an operator that requires its complement to be of category S, which expresses the fact that a passive VP does not assign a θ-role to its subject.¹¹ Given (47) and (53), we assign lexical SF-representations to (52), to which the combinatorial rules can apply. The Argument Rule derives the representations to be attached to the subject NP and to the complement of be in the obvious way:

(54) a. the book: [DEF x_i [BOOK x_i]]   of category S/S
     b. given x_i to John: x̂_1 [vx_4 [CAUSE [DO x_1 x_4] [CHANGE-TO [HAVE JOHN x_i]]]]   of category S/N

As the passive be requires a complement of category S, rule (50) must apply to (54b), converting the abstractor x̂_1 into a referential operator, which expresses the fact that passive constructions involve an unspecified actor. As be adds nothing to this representation, it will come out as the SF assigned to the whole VP of (52), providing the appropriate complement to both PAST and the subject NP. We thus derive (55) as the SF-representation assigned to the top node of (52):

(55) [[DEF x_i [BOOK x_i]] PAST [vx_1 [vx_4 [CAUSE [DO x_1 x_4] [CHANGE-TO [HAVE JOHN x_i]]]]]]

I have arbitrarily chosen PAST to be within the scope of the subject NP. With the present assumptions, the scope could equally be assigned the other way round. I will not go into these problems here.

To summarize, the lexical and combinatorial rules of the SF-component assign a compositional SF-representation to any LF-constituent, where the Argument Rule (49) spells out, so to speak, the semantic aspect of θ-role assignment, the syntactic aspect of which is taken care of by the projection principle. As the combinatorial rules as well as the general format of lexical rules must be regarded as the universal framework, the semantic theory of gradation, to which I will turn now, will consist essentially in the specification of appropriate lexical representations for the basic expressions involved in gradation, including the conceptual interpretation of the relevant semantic components.

3. THE CONCEPTUAL STRUCTURE OF COMPARISON

The semantics of gradation, like that of any other domain, must determine (i) the structure of the pertinent semantic representations, and (ii) the way in which these representations are derived compositionally. In this section, I will be concerned with the first of these interrelated aspects. More specifically, I will motivate the content of the representations in question by means of the conceptual interpretation which they can receive.

To begin with, I take the mental act of comparison to be a basic capacity that organizes the relevant conceptual domain. Suppose that this capacity exhibits a fairly abstract, self-contained structure that determines its interaction with other conceptual components. The minimal requirements for this structure include two entities a and b, the specification of some aspect or dimension D with respect to which a and b are to be compared, and a relation between a and b with respect to D. I will assume more specifically that comparison creates a relation between the values that are assigned to a and b with respect to D by some function f. Thus in its most elementary form, the act of comparison can be indicated by (56), where '⊃' is a conceptual prime.

(56) f(a, D) ⊃ f(b, D)

Intuitively, (56) is to be conceived of as determining that the D-value of a includes the D-value of b. Before I can make these intuitive notions more precise, I will somewhat enrich the structure of comparison.

Suppose that a and b are boards to be compared with respect to length. Let us assume that in fact the act of comparison consists in projecting a and b onto an abstract scale in the way indicated in (58):

(57) [Figure: the boards a and b]

(58) [Figure: a and b projected onto the scale L, with f(a, L), f(b, L), and the difference c marked]

In other words, to ascertain that (56) holds for the situation represented by (57), with L as the dimension of comparison, a and b are assigned a common zero-point in L, which automatically provides the difference c by which f(a, L) exceeds f(b, L). This, then, leads to a slightly more complex structure of comparison, which might be represented by (59), assuming the standard interpretation for '=' and '+'.

(59) f(a, L) = f(b, L) + c

Continuing with this quasi-algebraic notation, (58) might also be represented by (60):

(60) f(b, L) = f(a, L) − c

Anticipating the intended interpretation, we want (59) and (60) to correspond to (61) and (62), respectively, in accordance with the fact that both correctly describe the situation indicated in (57), and in (58) for that matter.

(61) a is longer than b.

(62) b is shorter than a.

Notice, furthermore, that (57) can also be described by (63) and possibly (64), while (65) and (66) induce an additional factor, insofar as they are norm-related in the sense discussed above.

(63) b is not as long as a.

(64) b is less long than a.

(65) a is not as short as b.

(66) a is less short than b.

I will return to these problems presently. For the time being, we will merely note the well-known fact that the same situation can be conceptualized in different ways, which are, moreover, systematically related. With respect to (59) and (60), the difference in conceptualization comprises two interdependent factors: (i) the choice of either a or b as providing the standard or anchor-point for comparison, and (ii) the 'direction' in which c is to be passed through. There is an intuitive sense in which (59) represents the simpler or unmarked case: the direction of the operation determining the value of f(b, L) is preserved in adding c, while in (60) the direction of the operation determining f(a, L) must be reversed in subtracting c. This asymmetry is in fact the root of quite a number of the phenomena discussed in section 2 and will therefore be made precise in our theory. Notice, incidentally, that the elementary structure indicated in (56) does not allow one to express this asymmetry, as it invariably encompasses both (59) and (60).¹²

Another point to be noted in the structure of (59)/(60) is the occurrence of two types of L-intervals: (a) values assigned to the entities to be compared with respect to L; let us call these intervals L-extents; (b) intervals that constitute differences between extents; let us call these intervals L-differences. Extents and differences differ with respect to two properties. Extents include (or start at) the zero-point, whereas differences do not necessarily include 0. Extents have a fixed directionality 'from zero up', whereas differences allow both directions: 'towards' and 'away from' zero. From these considerations, one might derive a preliminary characterization of the notion 'dimension' involved in the structure of comparison:

(67) A dimension D is a (potentially infinite) set of D-intervals which are either D-extents or D-differences.

It is implied by (67) that D has a designated zero-point.¹³ We will give a more precise characterization of the notions of dimension and scale below. Proceeding still on the intuitive level, I will introduce two further concepts involved in comparison. The first is that of average or norm. In accordance with almost all other analyses, I consider Positive constructions as based on comparison, the standard of comparison being provided by a contextually determined norm.¹⁴ Let us represent the standard in question by N[C,D], where C represents the class of comparison on which the relevant norm depends, and D the relevant dimension. (I will continue to omit the subscript and simply write N where no confusion arises.) N will now be treated as a D-interval, or more precisely as a D-extent, whose particular value depends on the contextually determined class C. With these provisions, we get the following interpretations for positive adjectives:

(68)

     a. a is long.
     b. f(a, L) = N + c

(69) a. a is short.
     b. f(a, L) = N − c

There are various well-known questions to be raised in this connection. The first is how to fix C and how to compute N once C is fixed. I have nothing substantial to say in this respect. It seems reasonable, however, that basic conceptual capacities must be responsible for these operations, the computation of N possibly being a constituent part of the capacity of comparison itself. Another question is whether N is a specific interval, or rather a certain range of intervals. Related to this question is the specification of the difference c, as not just any non-empty interval will be sufficient for a to be long, or short. I take these questions to be captured by a general account of vagueness, specifying different demands of precision according to the contextual setting.¹⁵ I will therefore consider N as a specific, though context-dependent, D-interval, and make no particular assumptions about c in cases like (68) and (69), except that it is not empty.¹⁶ Consider now more intricate cases of norm-relatedness like

(70) a is as short as b.

What (70) asserts is that a and b are of equal length, and it entails (or implicates, or presupposes) that both are short. Schematically, the situation can be represented as in (71), with the conceptual representation (72), where the presupposition is included in angled brackets.

(71) [Figure: the extents f(a, L) and f(b, L), the norm N, and the difference c on the scale L]

(72) f(a, L) = f(b, L) ∧ ⟨f(b, L) = N − c⟩

That the presupposition refers to b rather than a in (70) is borne out by the negation a is not as short as b, which still implies that b is short, but makes no implication with respect to a.

I now turn to the second concept that deserves preliminary discussion, viz. measurement. This involves two factors: first, the introduction of a metric into the set of intervals, and second, the specification of units of measurement. As to the metric, this has in fact already been introduced by the use of '+' and '−' and their intended interpretation, which will be made precise below. In other words, I suppose that the structure of comparison is based on scales that are intrinsically metrical in the sense that intervals can be mapped into real numbers preserving standard operations on numbers. With appropriate provisos concerning vagueness, this seems to hold even for E-adjectives, to which we turn later: although John is four times as lazy as Paul is fairly imprecise under normal conditions, it still makes sense, implying the possibility of multiplication. I would like, in fact, to conjecture that we are able to work conceptually with non-metrical scales, but only as a highly marked and derivative accomplishment, the unmarked case being ordered, metrical scales.¹⁷

Units of measurement presuppose metricality, but not vice versa. A unit of measurement can be conceived of as a representative of a designated equivalence class of D-intervals. (Equivalence of intervals will be defined below.) In practice, a unit of measurement is simply a designated D-interval. There may be several such units for one scale, foot, mile, meter, or second, day, month, year being obvious examples. Again, the usual provisos as to the degree of precision apply. Suppose, then, that ft represents the L-unit foot. We now get provisional interpretations of the following kind for constructions containing measure phrases:

(73) a. a is three feet long.
     b. f(a, L) = ft + ft + ft

Instead of (73b) we might simply have

(74) f(a, L) = 3 ft

Similarly for Comparatives:

(75) a. a is three feet longer than b.
     b. f(a, L) = f(b, L) + 3 ft

(76) a. b is three feet shorter than a.
     b. f(b, L) = f(a, L) − 3 ft

Finally, we have measurements without recourse to D-units:

(77) a. a is twice as long as b.
     b. f(a, L) = 2 · f(b, L)
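The provisional interpretations in (59), (60), (68), (69) and (73) to (77) can be checked on concrete numbers. The following is only a minimal sketch under the assumption, stated in the text, that intervals can be mapped into numbers preserving the standard operations; it is not part of the theory itself. The function names, the choice of rational numbers, the two board lengths and the value of the norm are all hypothetical illustration values.

```python
from fractions import Fraction

# Assumption: the L-unit "foot" is the unit of the numeric mapping.
FT = Fraction(1)


def difference(extent_x, extent_y):
    """Return the c for which extent_x = extent_y + c, as in (59)."""
    return extent_x - extent_y


def is_long(extent, norm):
    """(68): f(a, L) = N + c for some non-empty difference c."""
    return difference(extent, norm) > 0


def is_short(extent, norm):
    """(69): f(a, L) = N - c for some non-empty difference c."""
    return difference(norm, extent) > 0


# Hypothetical boards a and b and a hypothetical norm N.
f_a = 6 * FT
f_b = 3 * FT
N = 4 * FT

c = difference(f_a, f_b)
assert f_a == f_b + c        # (59): a is longer than b
assert f_b == f_a - c        # (60): b is shorter than a
assert c == 3 * FT           # (75)/(76): three feet longer, three feet shorter
assert f_a == 2 * f_b        # (77): a is twice as long as b
assert is_long(f_a, N)       # (68): a is long
assert is_short(f_b, N)      # (69): b is short
```

The same situation satisfies (59) and (60) at once, which is the point made in the text: the two formulas differ only in which extent serves as the anchor and in the direction in which c is traversed.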

So far, I have given an intuitive account of the conceptual structure that I assume to provide the purport of expressions of gradation. I will now turn to a more precise formulation of the relevant concepts.

Starting with the act of comparison as the constitutive capacity underlying gradation, we arrived at the concept of D-scale as its relevant formal structure. The main task for a theory of the pertinent conceptual domain is therefore to develop an adequate theory of scales. As alleged above, the types of scale available at the conceptual level are of a rather restricted variety. If this is correct in principle, an adequate theory would have to specify the relevant restrictions as part of an explanatory theory of conceptual organization. Pursuing a much more moderate goal, I will simply introduce the required concepts by way of definition. As a basic building block, I will define the notion of D-scale in the following way:

(78) A D-scale is an ordered pair (D, ⊃), where
     (i)  D = {d_i : d_i = (u_i, v_i)} is an infinite set of D-intervals d, and
     (ii) ⊃ is an asymmetric, transitive, reflexive relation in D, called 'interval-inclusion'.

A D-interval d might be conceived of as representing the result of scanning a pertinent object along a certain dimension. It starts at some initial point u and spans a certain stretch v. The identification of objects and their relevant dimensions must be determined by other systems of conceptual organization, such as spatial and temporal orientation, conceptualization of perception, emotion, etc. They must be analyzed independently and will be presupposed for the present purpose. The pair (u, v) accounts for the intrinsic directionality of intervals, which is motivated on intuitive grounds and will be exploited in the definition of further concepts. The relation of (improper) interval-inclusion makes intervals available for comparison. Intuitively, an interval d_1 includes an interval d_2 if and only if d_1 includes all parts of d_2. Thus, interval-inclusion can be fixed by the following axiom:

Interval-inclusion imposes a partial ordering on D. It allows defining a number of other relations, such as interval-exclusion, proper inclusion, overlapping, etc. We need not pursue these possibilities.

We will now fix a designated initial point in accordance with the assumption implicit in our informal discussion, that generally the scanning of an object assigns it a D-extent that begins at 0. We thus define a D0-scale as follows:

(80) A D0-scale is an ordered triple (D, D0, ⊃), where
     (i)  D and ⊃ are as in (78), and
     (ii) D0 is a subset of D with the conditions
          (a) d_i ∈ D0 iff d_i = (0, v_i)
          (b) for any d_i ∈ D there is a d_j ∈ D0 with d_j ⊃ d_i

Condition (b) guarantees that 0 is the initial point of the scale, i.e. that there are no intervals that start 'below' 0. The set D0 is the set of D-extents in the sense discussed above. I will assume that D0 is ordered with respect to interval-inclusion. This assumption is straightforward for linear dimensions, which constitute the vast majority, but it deserves special consideration if multidimensional comparison must be acknowledged. I will return to this question below.

In order to account for the structure of comparison introduced above, we must define the operations '+' and '−'. Consider first 'addition'. A straightforward way to define '+' in terms of the concepts introduced so far would be as follows:

(81) Let a and b be elements of D0, with a ⊃ b, and c and d any elements of D. Then
     a = b + c =def ⋀d [c ⊃ d ↔ a ⊃ d ∧ ¬ b ⊃ d]

By this definition, a spans exactly the concatenation of b and c. What we need for the analysis of gradation is, however, a somewhat different operation, viz. one whose result is either exactly or at least the extent a. There is a fairly wide variety of constructions whose interpretation is open to these alternative readings. Compare for example:

(82) a. John is ten inches taller than Bill. In fact, he is ten inches and a half (?nine inches and a half) taller.
     b. John is six feet tall. In fact he is six feet and two inches tall.
     c. John is as tall as Bill. In fact he is somewhat taller.

Notice that we are not dealing here with vagueness, as the 'precisely'-reading alternates only with the 'at-least'-reading, that is, there is a clear directionality. A further point to be noted is the asymmetry between the two interpretations: the 'precisely'-interpretation is preferred, and it will only be suspended if evidence to the contrary shows up. This asymmetry is reversed in cases like (83), whose normal interpretation is that John is shorter than Bill.

(83) John is not as tall as Bill.

Although some of these facts have been noted in the literature, no systematic account has yet been given. We will see, as we proceed, that they can in fact all be reduced to the same source, namely the definition of addition (and subtraction). The point to be noted is this: the 'exactly'-interpretation corresponds to the biconditional in (81), while the 'at-least'-interpretation would be represented by a simple implication. These are not options of equal right, though. Rather, the biconditional is preferred, except when there is evidence to the contrary. This is exactly the situation of a default interpretation. I will therefore replace (81) by the following definition:

(84) Let a, b, c, d be as in (81). Then
     a ⊃ b + c =def ⋀d [c ⊃ d → a ⊃ d ∧ ¬ b ⊃ d] : M ⋀d [c ⊃ d ← a ⊃ d ∧ ¬ b ⊃ d]

In (84), the biconditional is split up into a (defining) implication and a default replication, the default being marked by the operator M as introduced in Reiter (1980). This will account, under the SF-representation to be developed in the next section, for the preferred 'exactly'-reading in (82). The reversed preference in cases like (83) will result from the fact that under negation the presupposition of the default is not met, hence the default does not apply. The relation '⊃' in the definiendum of (84), which replaces the '=' of (81), represents improper interval-inclusion, with the preferred mutual inclusion determined by the default. Adding the operation just defined to the D0-scale, we get the following extension:

(85) A D1-scale is an ordered quadruple (D, D0, ⊃, +) with D, D0, and ⊃ as in (80), and + as defined in (84).

Notice that (84) can be turned into a recursive definition that accounts for iterative addition as in a ⊃ b + c + d, so that, furthermore, iterative addition of equivalent intervals can be defined as follows:

(86) a ⊃ n · b =def a ⊃ b_1 + b_2 + ... + b_n, where b, b_1, ..., b_n are elements of an equivalence class as defined in (90) below.

We have thus derived multiplication of intervals as successive addition in the usual way. This kind of multiplication will not only be needed for the interpretation of measure phrases like three feet, ten minutes, etc., but also for sentences like (87):

(87) a. John is three times as tall as his brother.
     b. A is twice as much longer than B as C is longer than D.
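The split in (84) between a defining implication and a default can be mimicked on numbers. The sketch below is an illustration only, under two simplifying assumptions that are not part of the text: extents are mapped to rational numbers, so that interval-inclusion becomes the relation ≥, and the default operator M is imitated by a plain fallback rather than by Reiter's default logic. All function names are hypothetical.

```python
from fractions import Fraction


def at_least(a, b, c):
    """Defining implication of (84): a includes everything in b + c."""
    return a >= b + c


def exactly(a, b, c):
    """Implication plus default replication: a includes nothing beyond b + c."""
    return at_least(a, b, c) and a <= b + c


def interpret(b, c, evidence=None):
    """Return the extent concluded for a from 'a includes b + c'.

    Without evidence the default applies and a is taken to be exactly b + c.
    Evidence that respects the defining implication suspends the default.
    Evidence that violates the defining implication is rejected.
    """
    if evidence is None:
        return b + c
    if not at_least(evidence, b, c):
        raise ValueError("evidence contradicts the defining implication")
    return evidence


def times(n, b):
    """(86): n times b as iterated addition of equivalent intervals."""
    total = Fraction(0)
    for _ in range(n):
        total = total + b
    return total


# (82a) with hypothetical heights in inches: Bill is 60, the difference is 10.
bill, ten = Fraction(60), Fraction(10)
assert interpret(bill, ten) == 70                            # preferred reading
assert interpret(bill, ten, Fraction(141, 2)) == Fraction(141, 2)  # "ten and a half"
assert exactly(Fraction(70), bill, ten)
assert not exactly(Fraction(141, 2), bill, ten)
```

Supplying "nine inches and a half" as evidence, i.e. `interpret(bill, ten, Fraction(139, 2))`, raises `ValueError`, which corresponds to the oddity marked by '?' in (82a): the default can be overridden upward but the defining implication cannot be violated.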

We next enrich D1-scales by adding interval subtraction. The essential point of this move is the introduction of a reversal in the orientation of intervals: while d_1 + d_2 extends an extent d_1 by a difference d_2, d_1 − d_2 first scans d_1 from 0 up, and then d_2 from the end of d_1 down. The formal account is given in (88):

(88) Let a, b ∈ D0 with b ⊃ c, and c, d ∈ D. Then
     a ⊂ b − c =def ⋀d [c ⊃ d → b ⊃ d ∧ ¬ a ⊃ d] : M ⋀d [c ⊃ d ← b ⊃ d ∧ ¬ a ⊃ d]

Notice that '⊃' is an asymmetric relation and must hence be reversed in the definiendum of (88). We are now ready to define the type of scale underlying the interpretation of gradation in D-adjectives:

(89) A D2-scale is an ordered quintuple (D, D0, ⊃, +, −).

This is a fairly rich and specific algebraic structure, whose formal properties and empirical adequacy with respect to the intended purpose might be explored in various directions. There are a number of further concepts that can be defined with respect to this structure. We might e.g. want to define 'a + b' as the default-bound minimal interval that includes all and only the intervals included in a and b, and correspondingly 'a − b' as the default-bound maximal interval that includes all those intervals that are included in a, but not in b. The usual associativity of '+' and non-associativity of '−' will emerge. I will not state those definitions formally, but I will make use of various properties implicit in the structure of D2-scales. One notion still to be defined explicitly is equivalence of intervals, one of the things needed for the definition of units of measurement.

(90) Two intervals d_i, d_j ∈ D are v-equivalent iff d_i = (u_i, v_i) and d_j = (u_j, v_j) and v_i = v_j. D_v is the set of v-equivalent intervals.

Now a unit of measurement in D (or, for short, a unit of D) is a representative of a designated D_v ⊂ D.¹⁸ As already mentioned, there may be more than one unit for a given D. We thus arrive at D2-scales with one or several D-units.

The last notion to be defined is that of the norm or average. Formally, N[C,D] is simply an element of D0 whose specific value depends on a certain class C of objects. One way to think of this dependence might be as follows: the class C obviously selects a certain subset of extents, call it q. The norm N[C,D] = (0, n) might now be determined by some weighted middle over q. Although an empirical justification of this or some alternative approach is certainly necessary, all we need for the present purpose is the specification of N as an element of D0 depending on the parameter C.

Let me conclude this section with some remarks on the nature of D. Although I would claim that gradation concerns only the invariant structure of D-scales, there must be different sets of intervals instantiating this structure. Compare the following examples:

(91)

     a. John is taller than his bed is long.
     b. John is taller than his car is fast.

While (a) has a straightforward interpretation, (b) has not, although there exists a derived, in fact fairly vague, interpretation of (b) which makes height and speed somehow commensurable.¹⁹ In any case, length, weight, speed, loudness, and many others yield different sets of intervals, which are, however, susceptible of the same type of scalar structure. I have illustrated this structure by means of length, where the linear ordering of D0 and the metrical structure of the scale are fairly obvious. Although I suppose that these properties can indeed be generalized, at least as the unmarked case, to arbitrary scales of comparison, there are certain problems with cases that seem to involve multi-dimensional comparison. While some of those cases can be discarded as apparent counterexamples,²⁰ this does not seem to be appropriate with cases like big or large, which essentially involve multi-dimensional conditions. Consider the following situation:

(92) a. a is bigger than b.
     b. [Figure: the objects a and b]

There seem to be four, partially incompatible, interpretations:

(i)   big refers to all relevant extensions, so that in (92) the comparison is two-dimensional, and (92a) comes out undecidable.
(ii)  big refers (in conflicting cases) only to the dominant extension, in which case it is one-dimensional, and (92a) comes out true (assuming vertical extent to be dominant).
(iii) big refers to the product of all relevant dimensions, i.e. to the square-measure in (92); now (92a) comes out false.
(iv)  big refers to something like 'global impression', which is vague, but certainly one-dimensional; again (92a) comes out false.
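The competing readings of big distinguished in this section as (i) to (iii) can be compared on a concrete case. The sketch below is an illustration under stated assumptions, not a reconstruction of the lost figure in (92b): the two rectangles and their measurements are hypothetical, chosen so that a is taller but narrower than b, and all names are invented for the example. Reading (iv), 'global impression', is deliberately left unmodelled because the text characterizes it only as vague.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rect:
    width: int
    height: int


def bigger_all_extensions(a: Rect, b: Rect) -> Optional[bool]:
    """Reading (i): every extension is compared; None means undecidable."""
    a_dims = (a.width, a.height)
    b_dims = (b.width, b.height)
    if a_dims != b_dims and all(x >= y for x, y in zip(a_dims, b_dims)):
        return True
    if all(x <= y for x, y in zip(a_dims, b_dims)):
        return False
    return None  # the dimensions conflict


def bigger_dominant_extension(a: Rect, b: Rect) -> bool:
    """Reading (ii): only the dominant extension counts (assumed vertical)."""
    return a.height > b.height


def bigger_product(a: Rect, b: Rect) -> bool:
    """Reading (iii): the product of all relevant dimensions is compared."""
    return a.width * a.height > b.width * b.height


# Hypothetical objects: a is tall and narrow, b is lower and wider.
a = Rect(width=1, height=4)
b = Rect(width=2, height=3)

assert bigger_all_extensions(a, b) is None       # (i): undecidable
assert bigger_dominant_extension(a, b) is True   # (ii): true
assert bigger_product(a, b) is False             # (iii): false
```

Each of the three functions on its own induces a one-dimensional comparison or, in the case of (i), a bundle of such comparisons, which is in line with the claim that the scales invoked remain linear.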

Whether (iii) and (iv) must ultimately be reduced to the same conceptual condition is unclear. (i), (ii), and (iii), however, are clearly different, and they all seem to play a role under certain conditions.²¹ For reasons which cannot be explained here in detail, I consider (iii) or (iv) the basic interpretation, with (i) as an auxiliary option on demand. What is crucial, however, is the fact that the scales to be invoked are linear in all cases, with the proviso that (i) must be taken to involve more than one comparison at once, so that this interpretation boils down to the situation mentioned in note 20.

I conclude from this incomplete discussion that D2-scales serve in fact as the conceptual structure for the interpretation of gradation. I would like to emphasize that the structure of scales outlined here makes explicit the conditions the conceptual interpretation of gradation must meet. It neither specifies the rules that generate the relevant conceptual representations, nor does it make any claim as to the format of those representations. It might well turn out that conceptual representations are radically different in nature from the algebraic notation I have been using for expository reasons.

4. THE SEMANTIC FORM OF D-ADJECTIVES

Presupposing the framework outlined in Section 2, I will first fix certain relevant properties of adjectives in general. Syntactically, adjectives are of the category [+N, +V], and they are the head of adjective phrases AP as their maximal projection. APs occur as predicates, and as adnominal and adverbial modifiers. The relevant structures can be indicated as follows:

(93) [Tree diagrams: a. adnominal AP: N' dominating AP and N', the lower N' dominating N and Comp; b. predicative AP: V' dominating V (be) and AP; c. adverbial AP: V' dominating V' and AP, the lower V' dominating V and Comp]

In the sequel, I will largely ignore adverbial modification, as it involves various unsolved problems that must be clarified independently.²² Both N' and V' normally assign a θ-role, the latter to the subject (except for passive and raising cases), the former to whatever makes lexical NPs referential. In the present framework, their semantic form will therefore have the general structure x̂ [P x], where [P x] represents an arbitrarily complex proposition with x as the only free variable. The question now is: how do APs contribute to this structure, i.e., how is their semantic representation incorporated into P? This question turns on the much debated issue whether adjectives (or rather APs) are basically predicative or attributive with respect to SF. There are essentially two alternatives which, in the present framework, can be formulated as follows (I assume that N' is semantically of category S/N):

(i)  AP is basically predicative, its semantic category is S/N, and its SF-representation has the structure x̂ [P x]. Its attributive use requires a semantic rule that accounts for modification, yielding x̂ [P x ∧ Q x], where x̂ [Q x] would be the SF-representation of the N' to be modified.
(ii) AP is basically a modifier, its semantic category is (S/N)/(S/N), and its SF-representation has the structure Q̂ [x̂ [P x ∧ Q x]], where Q is a variable of category S/N, which is to be specified by the modified N'. Now its predicative use requires a semantic rule that turns it into an appropriate expression of category S/N.

Obviously, the choice between (i) and (ii) depends on general considerations about predication and modification; it includes the analysis of other N'-modifiers, such as restrictive relative clauses, PPs, etc., which cannot be discussed here. Notice, however, that in the present framework there is no need for a particular semantic rule if (ii) is adopted, as the Unspecified Argument Rule (50) automatically applies, provided the copula has the semantic form x̂ [x], where x is of category S/N (see note 11). This can be seen as follows: as be on the present assumption requires an argument of category S/N, the unspecified abstractor Q̂ of the AP must be turned into a referential quantifier, and the V' will come out as x̂ [vQ [P x ∧ Q x]]. This seems to be a plausible result which argues in favour of (ii), although it is not a very strong argument. There seems to be a somewhat more specific argument in favour of (ii) which directly relates to gradation.

(94) a. John is tall.
     b. John is a tall boy.

Clearly, these sentences are norm-related, and boy seems to provide the relevant class C of N[C,height] in (94b). Hence for D-adjectives the modified nominal has a quite specific function for the interpretation of the AP that must be captured in the SF-representation of D-adjectives. This then requires AP to be basically attributive. There are two objections, though. First, modified nouns restrict, but do not in general uniquely determine, the comparison class C. Consider cases like This is a big cube. Without further contextual information, cube does not provide a sufficiently restricted comparison class. Secondly, the norm, and hence the comparison class, becomes irrelevant (and will in fact disappear) in comparative constructions, even if a modified noun occurs:

(95) John is a taller boy than his brother, though both are not as tall as one would expect.

In the sequel, I will tentatively adopt alternative (ii), but will largely drop the notational complications arising from the modificational structure if predicative constructions are at issue.

After this preliminary discussion of APs, I will turn to the properties of their heads. While adjectives in general may take complements, as for example proud of NP, this does not seem to be the case for D-adjectives.²³ The most characteristic property of adjectives is the rich structure of the degree constituent that realizes the specifier system of adjectives. Postponing the structure of the degree constituent to the next section, I will thus take (96) as the internal structure of AP:

(96) [AP (Deg) [A' A (Comp)]]

Although adjectives are not subcategorized for Deg, I will assume that they may assign to it a particular θ-role, which might be called 'Grade'.²⁴ In the present framework, this θ-role must be provided by a particular abstractor, which I will write 'ĉ', where 'c' is a variable to be interpreted by D-intervals. On these assumptions, we get (97) and (98) as the structure of SF-representations for non-gradable and gradable adjectives, respectively, where P represents the 'proper content' of the adjective, and Q is to be specified by the modified N'. (If we drop the modificational aspect, we get the simplified (b)-versions for predicative adjectives.)

(97) a. Q̂ [x̂ [P x ∧ Q x]]
     b. x̂ [P x]

(98) a. ĉ [Q̂ [x̂ [P x c ∧ Q x]]]
     b. ĉ [x̂ [P x c]]

For gradable adjectives, P must be a relational expression of category S/N N, provided that c is of category N, i.e., that intervals are treated as individuals of some kind. First of all, D-adjectives identify the conditions that select a particular dimension. Suppose that VERT refers to the conditions associated with high, viz. those that select the vertical axis of a given object. Similarly, MAX might specify the maximal axis, relevant for long, etc.²⁵ Let DIM be a meta-variable that covers the semantic primes (or configurations of them) specifying the different conditions of different D-adjectives. I will assume that DIM is a functor-variable that specifies a particular dimension for any appropriate object x. Thus 'DIM x' will be interpreted conceptually by one of the available dimensions of x. Let furthermore QUANT be a semantic prime that represents the scanning of the extent of x along DIM x. While DIM requires a particular instance for each different adjective, QUANT will be a characteristic constant of all D-adjectives. We thus get (99a) as a partial specification of the SF-representation of D-adjectives, where (99b) indicates the conceptual interpretation assigned to it.

(99) a. [QUANT DIM x], with QUANT and DIM of category N/N and x of category N
     b. f(x, D)

Now the crucial point is, of course, that D-adjectives do not merely select the D-extent of x, but relate it to a particular interval that is specified in various ways. I repeat some of the relevant cases together with their intended conceptual interpretation (ignoring for the moment the asymmetrical relation '⊃'):

(100) a. a is long.                         f(a, L) = N + c
      b. a is 3 feet long.                  f(a, L) = 3 ft
      c. a is longer than b.                f(a, L) = f(b, L) + c
      d. a is short.                        f(a, L) = N − c
      e. a is three feet shorter than b.    f(a, L) = f(b, L) − 3 ft

I will adopt three requirements that the SF-representations of D-adjectives should meet:

(i)   D-adjectives are not semantically ambiguous between a norm-related and a purely dimensional interpretation, i.e. (100a) and (100b) should be based on the same lexical representation of long.
(ii)  +Pol and −Pol D-adjectives are based on the same dimensional conditions, i.e. long and short are lexically identical in the relevant respect.
(iii) Specifications of Deg, particularly Comparative and Equative constructions, contribute in a strictly compositional way to the resulting constructions.

87
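The intended interpretations in (100) can be made concrete with a small numerical sketch. It is only an illustration under simplifying assumptions that are not part of the analysis itself: extents are modelled as integers (inches) measured from the origin of the scale, the relation between the two sides is read as plain equality (the asymmetry of '⊃' being ignored, as in (100)), an unspecified difference c is taken to be some positive amount, a measure phrase is compared with the empty extent 0, and the names `NORM`, `plus_pol`, and `minus_pol` as well as all numerical values are hypothetical.

```python
from typing import Optional

# Hypothetical norm N for the class of objects under discussion, in inches.
NORM = 60


def plus_pol(extent: int, compared: int, diff: Optional[int] = None) -> bool:
    """Check extent = compared + diff, as for 'long' in (100a-c).

    If diff is None, the difference is left unspecified and is only
    required to exist as some positive amount (an assumption of this sketch).
    """
    if diff is None:
        return extent > compared
    return extent == compared + diff


def minus_pol(extent: int, compared: int, diff: Optional[int] = None) -> bool:
    """Check extent = compared - diff, as for 'short' in (100d-e)."""
    if diff is None:
        return extent < compared
    return extent == compared - diff


a, b = 72, 108  # hypothetical lengths of two objects, in inches

is_long = plus_pol(a, NORM)             # pattern (100a): f(a, L) = N + c
is_six_feet_long = plus_pol(a, 0, 72)   # pattern (100b), measure phrase of 72 in
is_longer_than_b = plus_pol(a, b)       # pattern (100c): f(a, L) = f(b, L) + c
is_short = minus_pol(a, NORM)           # pattern (100d): f(a, L) = N - c
is_3ft_shorter = minus_pol(a, b, 36)    # pattern (100e): f(a, L) = f(b, L) - 3 ft
```

In every case the same three ingredients appear: the extent of the object, a value it is compared with, and a difference, which is either given by a measure phrase or left unspecified.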

Usually (i) is met by taking the non-norm-related interpretation as basic and then stipulating rules that introduce norm-relatedness under particular conditions. These rules turn out to be the more ad hoc, the more completely they are specified. (Actually, most of the cases discussed in Section 1 have never been taken into account.) Condition (ii) has practically been ignored. Condition (iii) will be taken up in the next section. Suppose now that we postulate the following SF-representations for (100a,b,d):

(100') a. $\exists c\,[[\mathrm{QUANT\ MAX}\ a] = [N + c]]$
       b. $[[\mathrm{QUANT\ MAX}\ a] = [0 + 3\ \mathrm{FOOT}]]$
       d. $\exists c\,[[\mathrm{QUANT\ MAX}\ a] = [N - c]]$

The basic idea is that D-adjectives generally involve three D-intervals: the extent of the object in question, an extent to which it is compared, and a difference between the two. In order to achieve this uniformity, I have used 0 to specify the compared extent in case (b). In order to be consistent, 0 must be interpreted as the empty extent (0,0), not just the initial point of the scale. Although this yields the correct conceptual interpretation according to the definition of '+', so far it has only been a formal move. Its substantial motivation will become clear in the sequel. Notice that on this assumption D-adjectives are three-place predicates, which relate an object to a compared extent (which might be empty) and a difference. This constitutes the first specific assumption of the present theory.

In order to reconcile this analysis with condition (i), a second assumption is necessary: lexical representations of D-adjectives do not specify the value of the compared extent; they rather contain a particular variable v, the value of which might be, among others, 0 or N. The choice of N or 0 will be subject to systematic conditions to be discussed in Section 7 below. I will now propose the following lexical SF-representation of D-adjectives:

(101) a. +Pol adjectives: $\hat{c}\,[\hat{x}\,[[\mathrm{QUANT\ DIM}\ x] = [v + c]]]$
      b. −Pol adjectives: $\hat{c}\,[\hat{x}\,[[\mathrm{QUANT\ DIM}\ x] = [v - c]]]$

The conceptual interpretation in terms of D2-scales is obvious: the values of v and c are interpreted by D-intervals, '+' and '−' by the operators defined above, and '=' is a constant of category S/N N interpreted by '⊃' or '⊂' in the context of '+' and '−', respectively. (Here and in the sequel I use different letters for particular variables instead of x1, x2, etc., in order to make the representations more comprehensible.) The semantic category of these representations is (S/N)/N, i.e., they are two-place relations, associating an individual and an interval.²⁶ The crucial point is that v is an open variable not bound by any operator and hence not amenable to specification by syntactic constituents. (But see below.) It will be specified rather by special conditions on semantic representations. It is by this status of v that condition (i) is met: long is not ambiguous, but rather subject to different conditions according to its context. Condition (ii) is met in the obvious way, and condition (iii) will be shown to be met in the next section.

The way in which representations like (101) yield SF-representations such as (100') should be obvious. Suppose that three feet is an expansion of Deg in (96) with the SF-representation 3 FOOT of category N. Then the Argument Rule (49) will substitute it for c, with subsequent specification of v by 0 by general conditions. If on the other hand Deg is empty, the Unspecified Argument Rule (50) will turn c into a referential quantifier, with subsequent specification of v by N, again by general conditions. In both cases, we derive an expression of category S/N whose θ-role may be assigned to the subject NP.

(continued in JS 6.2)

NOTES (for Part 1, Sections 1-4)

1. A more elaborate version of this theory appears in Bierwisch (in press), which is the translation of Bierwisch (1987). The present paper develops the basic ideas of this theory, which has been modified in the extended version in a number of details, but not in essence. The most important change concerns the conditions discussed in Section 6, which could be simplified in interesting ways. By a systematic reformulation, C1 and C2 could be reduced to a single condition, while C3 could be dispensed with altogether. The resulting three conditions provide a more perspicuous formulation of the relevant principles. Although these and a number of other modifications are of interest for possible future developments, they do not affect the general orientation of the analysis developed here. As the present paper, which has circulated for a while, presents a relatively self-contained version of the relevant proposals, I have left it unchanged. I want to thank my colleagues Reinhard Blutner, Hannes Dölling, Karin Goede, Ewald Lang, Anatoli Strigin, and Ilse Zimmermann for patient discussion of the various versions through which this analysis has passed. Special thanks are due to Arnim von Stechow. The extensive discussion we had about the facts to be dealt with below, and his own analysis of comparative constructions, provided the stimulus that eventually led to the theory proposed here. In a sense, the present analysis grew out of an attempt to overcome what I consider the shortcomings of his approach. Needless to say, he is not responsible for whatever faults there might be in my analysis.

2. For an early survey see Bartsch and Vennemann (1972); for a recent discussion see Von Stechow (1984). It might be noted, incidentally, that the ingredient notions from which the various analyses are built up would allow other combinations of the same descriptive adequacy. The crucial task is, therefore, to capture the relevant generalizations, i.e., to explain the pertinent facts in terms of underlying principles.

3. Previous theories analyse relational adjectives as two-place predicates relating an individual x to a degree d without an internal structure of d. This prevents an appropriate analysis of the Positive and its relation to other constructions. We will see below that nevertheless relational adjectives are two-place predicates, in a certain sense, insofar as v is not available as an argument in the same way as are x and c.

4. Hence the Comparative and the Equative are parallel in a rather different way than that assumed in other theories. The common core of various theories is the assumption that in both constructions the matrix clause as well as the complement clause specify degrees which are then related by means of parallel, though different, operations, such as '>' for the Comparative and '≥' for the Equative, or addition for the Comparative and multiplication for the Equative. Each of these accounts fails in crucial respects. We will see, however, that addition is in fact involved in all pertinent constructions, while multiplication is a completely separate issue.

5. The failure to recognize the essential difference between D-adjectives and E-adjectives corresponds to the tacit, but general assumption that the analysis of tall can directly be generalized to that of all relational adjectives.

6. See e.g. Kiefer (1978) for discussion. Whether +NR is in fact a property that is appropriately analysed as part of the presupposition of the sentences in question is a separate issue. A fairly plausible answer to that question follows from independently motivated assumptions of the theory proposed below.

7. This is borne out by the controversial status of diagnostic sentences like

(i) She is cleverer than her sister, though both are fairly dull.

That sentences like (i) are not ruled out completely is due to the possibility to construe at least some E-adjectives in a quasi-dimensional fashion. What that means will become clear in Section 7.

8. As the reader might notice, (a)-(c) is but a rough outline of notions that have been developed in various forms, e.g., by Lewis (1972), Cresswell (1973), or Hellan (1981), the latter being particularly close to the present approach, as he not only uses a similar system of semantic representation in analysing comparative constructions, but also combines it with a REST-type syntax, albeit in a different way than that expounded below. The present proposal differs, however, from most other versions in that it construes SF-representations as subject to interpretation in terms of conceptual structures - which are mental representations - rather than model-theoretic constructs.

9. There are various points to be clarified with respect to (43). First, if P1 has more than one argument, then S must be relativized to the place of a. Suppose e.g. that there is a semantic prime DO of type S/N S. We might then say that DO assigns the semantic role Agent to its first, and Action to its second argument. A less trivial point, which has a number of consequences, is the distinction between semantic roles and θ-roles. For one thing, it allows one to reconcile the θ-criterion proposed in Chomsky (1981), according to which each syntactic argument has one and only one θ-role and each θ-role is assigned to one and only one argument, with the observation of Jackendoff (1972) that a syntactic argument might bear more than one role, as in (i), where - in one reading - John might be both Agent and Theme:

(i) John rolled down the hill.

According to the present approach, there is only one θ-role assigned to John, which is however the composite of two semantic roles in the intended reading. Another consequence of this distinction is the fact that there may well be semantic roles that do not enter into any θ-role assigned by a lexical item. The Action-argument of DO would be a case in point in most occurrences of that prime. Finally, (43) ought to be extended to syntactic constituents in general, so that it encompasses also θ-roles that are assigned compositionally, e.g. by verbs and prepositions, etc. These points all deserve much more careful consideration than the present context allows.

10. Thus one of the frequently discussed details is the fact that CAUSE must conceptually be interpreted as something like direct causation, since, e.g., the fact that John poisoned Bill's father, as a consequence of which Bill inherited a million dollar, cannot possibly be described by John gave Bill a million dollar. Similarly, CHANGE must conceptually be restricted to a direct transition from one state to another. HAVE on the other hand is open to the interpretation by a large range of different (asymmetric) relations with possession as a kind of unmarked choice. Problems like these are to be dealt with in a systematic account of conceptual interpretation.

11. On this account, be does not contribute substantially to the semantic interpretation. That is certainly an oversimplification, as the verbal head of the VP must presumably be susceptible of appropriate interaction with the tense constituent in determining the temporal interpretation of the sentence. I will ignore this aspect throughout the present paper. Notice, however, that the passive be differs from the copula be, which participates in assigning a θ-role to the subject. To express that difference, I will assume that the copula be has the SF-representation $\hat{x}\,[x]$ with x of category S/N. Like the passive be, the copula does not contribute substantially to the semantic interpretation, except that it transmits the θ-role, which the passive be does not.

12. Present theories of comparison can be classified as to whether they take (56) or (59)/(60) as the conceptual core of comparative constructions. Cresswell (1976) and Bartsch/Vennemann (1972) e.g. are based essentially on (56), while Hellan (1981) and Von Stechow (1984) are based on something like (59)/(60). We will see later on that further consequences depend on this choice.

13. It might be noted that D-intervals correspond in many respects to degrees as introduced by Cresswell (1976) and refined by Klein (1980). The major difference is that Cresswell defines degrees as equivalence-classes of objects, whereas intervals emerge from the operation of comparison. These alternative conceptions differ in a similar way as the two possibilities to introduce natural numbers, viz. as sets of sets of equal cardinality, or as constructed by the successor-operation. Notice that comparison with respect to cardinality has numbers as degrees - or intervals, for that matter.

14. The major exception to this approach is Klein (1980), who rejects not only the comparative nature of positive D-adjectives, but also reference to degrees as a constitutive factor. Klein's arguments, however, seem to me to be misleading. See Von Stechow (1984) for some discussion. The most important objection to Klein's approach is the fact that it cannot account for rather elementary facts, such as the comparison of two dimensions of the same object as in The river is more wide than deep, without disavowing the initial assumptions.

15. For revealing discussion of these problems, see Pinkal (1983). Notice that completely parallel questions are at issue in cases that have nothing to do with comparison and gradation, as in Austin's famous example France is hexagonal.

16. Nothing in particular hinges on these assumptions, though. They simply make the discussion more perspicuous. It would merely be a task of formal adjustment to replace, e.g., N by a certain family of intervals and then to proceed in terms of maximal and minimal elements of this family. Nothing would be gained thereby.

17. This claim must not be confused with the completely different issue of whether particular cognitive capacities, say perception of brightness or temperature, must be explained by theories using non-metrical scales. What is at issue is the conceptual structure of comparison, as a consequence of which in the unmarked case even non-metrical phenomena would be conceptualized in a metrical fashion, as soon as it comes to comparison.

18. This accounts for the peculiarities of measure phrases, which generally are not referential, although nouns designating units can be used referentially. Compare the following contrast:

(i) John is two inches taller than Bill.
(ii) These two inches are too much.

In terms of the present analysis, two inches in (i) is interpreted by a representative of a certain D•, while the two inches in (ii) is interpreted by a particular interval that is a member of D•. I will not pursue here the formal details that would make this account more explicit.

19. This is actually not quite correct. The most (and probably only) plausible rendering of (91)(b) would have to be paraphrased by (i):

(i) The degree to which John is taller than the average exceeds the degree to which his car is faster than the average.

In other words, height and speed are not mapped into a common scale, but rather the relative amount compared to average. This is a fairly different type of reinterpretation, as we will see below.

20. A case in point is the clever discussion of clever in Klein (1980). Klein argues that clever might involve e.g. social or mathematical (or some other) capability yielding incomparable degrees of cleverness. One can plausibly argue, however, that as soon as in fact different criteria or conditions are involved, the comparison must be appropriately relativized at the conceptual level, so that (i) comes out parallel to cases like (ii):

(i) John is as clever (mathematically) as Sue is (socially).
(ii) John is as intelligent as Sue is patient.

Somewhat different remarks apply to the multidimensionality of good discussed in Kaiser (1979). All these cases can eventually be reduced to linear and in fact metrical scales at the conceptual level in fairly plausible ways.

21. For an analysis from the ontogenetic point of view, see Goede (1983). She shows that (ii) seems to be characteristic for a particular transient phase in the development of German groß.

22. Although some of these problems are closely connected to phenomena of gradation, it seems to be a rational strategy to explore the basic structure of gradation first and to explain the more complicated cases by the interaction of this structure with other syntactic and semantic components.

23. It is unclear whether the bracketed phrases in (i) to (iii) should be analyzed as optional complements of adjectives:

(i) Tohru is tall [for a Japanese].
(ii) Sue is short [in comparison to her class mates].
(iii) Bill is clever [in solving mathematical problems].

Semantically, they contribute to the specification of C (in (i) and (ii)) or D (in (iii)) of the norm $N_{C,D}$. As there is, however, a fair number of unsolved problems here, I will make no substantial claim, leaving the question for future research.

24. Notice that there is a parallel with respect to the specifier of nouns and adjectives: the Determiner of an NP specifies the θ-role that makes a noun referential, the Degree of an AP realizes the θ-role by which an adjective identifies a grade. It might be an interesting question whether this parallel can be generalized to specifiers of other categories as well.

25. See Bierwisch (1967), and more recently Lang (1987), for a more detailed discussion of the different conditions involved in spatial D-adjectives.

26. For the sake of perspicuity, I have omitted the modificational aspect of adjectives; their inclusion would lead to the following representation:

$\hat{c}\,[\hat{Q}\,[\hat{x}\,[[Q\ x] \wedge [[\mathrm{QUANT\ DIM}\ x] = [v + c]]]]]$

where Q is of category S/N.

REFERENCES

Bartsch, R. & Vennemann, Th. 1972: Semantic Structures. Athenäum, Frankfurt/M.
Bierwisch, M. 1967: Some semantic universals of German adjectivals. Foundations of Language 3: 1-36.
Bierwisch, M. 1981: Basic issues in the development of word meaning. In: W. Deutsch (ed.), The Child's Construction of Language. Academic Press, London, New York. Pp. 341-387.
Bierwisch, M. 1982: Formal and lexical semantics. Linguistische Berichte 82: 3-17.
Bierwisch, M. 1987: Semantik der Graduierung. In: M. Bierwisch, E. Lang (eds.).
Bierwisch, M. (in press): The semantics of gradation. In: M. Bierwisch, E. Lang (eds.), Grammatical and Conceptual Aspects of Dimensional Adjectives. Springer Verlag, Berlin, Heidelberg, New York, Tokio.
Bierwisch, M. & Lang, E. 1987: Grammatische und konzeptuelle Aspekte von Dimensionsadjektiven. Akademie-Verlag, Berlin.
Bresnan, J. 1973: Syntax of the comparative clause construction in English. Linguistic Inquiry 4: 275-344.
Chomsky, N. 1977: Essays on Form and Interpretation. North-Holland, New York, Amsterdam.
Chomsky, N. 1977a: On Wh-movement. In: P. Culicover, T. Wasow & A. Akmajian (eds.), Formal Syntax. Academic Press, New York. Pp. 71-132.
Chomsky, N. 1980: Rules and Representations. Columbia University Press, New York.
Chomsky, N. 1981: Lectures on Government and Binding. Foris Publications, Dordrecht.
Cresswell, M. 1973: Logics and Languages. Methuen, London.
Cresswell, M. 1976: The semantics of degree. In: B. Partee (ed.), Montague Grammar. Academic Press, New York. Pp. 261-292.
Goede, K. 1983: Zum Zusammenhang zwischen den alterstypischen Antworten auf Fragen mit 'größer' und 'mehr'. Zeitschrift für Psychologie 191: 233-252.
Hellan, L. 1981: Towards an Integrated Analysis of Comparatives. Narr, Tübingen.
Higginbotham, J. 1983: Logical form, binding and nominals. Linguistic Inquiry 14: 337-394.
Hornstein, N. 1977: Towards a theory of tense. Linguistic Inquiry 8: 521-538.
Jackendoff, R. 1972: Semantic Interpretation in Generative Grammar. MIT Press, Cambridge, Mass.
Jackendoff, R. 1977: X-Syntax: A Study of Phrase Structure. MIT Press, Cambridge, Mass.
Jackendoff, R. 1978: Grammar as evidence for conceptual structure. In: M. Halle, J. Bresnan, G. A. Miller (eds.), Linguistic Theory and Psychological Reality. MIT Press, Cambridge, Mass. Pp. 201-228.
Jackendoff, R. 1984: Semantics and Cognition. MIT Press, Cambridge, Mass.
Kaiser, G. 1979: Hoch und gut - Überlegungen zur Semantik polarer Adjektive. Linguistische Berichte 59: 1-26.
Kiefer, F. 1978: Adjectives and presuppositions. Theoretical Linguistics 5: 135-174.
Klein, E. 1980: A semantics for positive and comparative adjectives. Linguistics and Philosophy 4: 1-45.
Lang, E. 1987: Semantik der Dimensionsauszeichnung. In: M. Bierwisch and E. Lang (eds.).
Leisi, E. 1953: Der Wortinhalt; Seine Struktur im Deutschen und Englischen. Winter, Heidelberg.
Lewis, D. 1972: General semantics. In: D. Davidson & G. Harman (eds.), Semantics of Natural Language. Reidel, Dordrecht.
May, R. 1977: The Grammar of Quantification. Dissertation. MIT.
Pinkal, M. 1983: On the limits of lexical meaning. In: R. Bäuerle, C. Schwarze, A. von Stechow (eds.), Meaning, Use, and Interpretation of Language. De Gruyter, Berlin, New York. Pp. 400-423.
Reiter, R. 1980: A logic for default reasoning. Artificial Intelligence 13: 81-132.
Von Stechow, A. 1984: Comparing semantic theories of comparison. Journal of Semantics 3.
Williams, E. 1977: Discourse and logical form. Linguistic Inquiry 8: 101-139.
Williams, E. 1980: Predication. Linguistic Inquiry 11: 203-238.
Williams, E. 1981: Argument structure and morphology. The Linguistic Review 1: 81-114.


Journal of Semantics 6: 95-98

BOOK REVIEW

Hiyan Alshawi, Memory and Context for Language Interpretation. Cambridge University Press, Cambridge, 1987. Pp. ix + 188. $25.00 (hardback).

BART GEURTS


Broadly construed, this book is an investigation into the ways in which world knowledge, lexical information, and context contribute to the interpretation of texts. It does not provide an in-depth study of a narrowly circumscribed problem area, but instead proposes a "relatively unified framework" (p. 1) for text interpretation. The author situates his work in the field of automatic, not human, language processing. It should become clear, however, that its relevance extends beyond the boundaries of that field.

The book falls into two parts. In the first part Alshawi presents his general framework for text interpretation, which comprises a memory model and a context mechanism as its main components. Within this framework solutions are sought to three specific problems: the resolution of referential expressions, word sense disambiguation, and what is called by the author "relationship interpretation", i.e. the problem of making explicit the semantic relations underlying, for instance, genitive noun phrases, prepositional phrases with with, and noun-noun compounds. The memory and context models are then related to other research in artificial intelligence.

The second part of the book describes and evaluates a computer program, called "Capture", which implements the ideas put forward in the first part. The Capture system performs a task that requires the ability to adequately interpret texts and, more specifically, to solve the interpretation problems mentioned above: the system creates a relational data base on the basis of natural-language input texts. Although in some respects it elucidates matters discussed in the preceding pages, this second part will mainly be of interest to readers specifically concerned with automatic language processing. This review therefore concentrates on the first part, which is of a more general interest.

In the memory model worked out by Alshawi, knowledge is encoded in a semantic network, i.e. an aggregate of nodes interconnected by labeled links, which together represents information about objects, their properties, and relationships. Contextual information is represented, at any given moment during text processing, by a collection of so-called context factors. Associated with a context factor are a scope, which is a set of memory entities, and a significance weight. When the context factor comes into existence this weight is set to a pre-determined numerical value, and subsequently it is decremented - by gradual decay or otherwise, depending on its type - in the course of the interpretation process. At any given point, the salience or - the term preferred by Alshawi - context activation of a memory entity is obtained simply by summing the significance weights of the context factors that have that entity in their scope.

Context activation is treated as an open-ended notion in that virtually all processes involved in text interpretation can create context factors. Thus, for instance, when the syntactic analyser recognizes a sentence as a passive, it can foreground its subject by creating a context factor for it, and when a particular entity is referred to, memory mechanisms can activate conceptually related entities. Twelve different types of context factors are discussed, seven of which play a role in the Capture system. The author makes clear, however, that, on the one hand, his theory is not strictly committed to this particular set, and that, on the other hand, the framework he proposes is expressly designed to be able to handle alternative and/or additional types.

The basic concept underlying Alshawi's treatment of memory processes for information retrieval and inference is what he refers to as "marker processing". (More common terms for the same thing are "marker passing" and "marker propagation"; the notion of "spreading activation" may be regarded as its psychological counterpart.) Marker processing is essentially a way to perform set operations on the entities stored in memory (i.e. nodes and links). For instance, in order to retrieve all female doctors it knows, a marker-processing system performs an intersection search. This is done by placing one marker, M1, on all instantiations of the memory entity "doctor", and another marker, M2, on all instantiations of the memory entity "female". Once these marking operations have been completed, a simple search procedure collects all entities marked by both M1 and M2, so as to obtain the set of all female doctors.

To illustrate how all this apparatus is put to work, I will briefly discuss Alshawi's method for the resolution of singular referential expressions, which is part of one of the three interpretation problems mentioned above that Alshawi tackles within his general framework.

Alshawi's basic method for reference resolution is quite straightforward. The procedure starts off with the generation of constraint markers. These are obtained by applying to the syntactic analysis of the referential expressions in question a set of rules that require, for instance, that all the memory entities denoted by the head noun of a full noun phrase are to be marked. After all the appropriate marker-generation rules have been applied, an initial search is performed, which ignores all entities whose context activation remains below the focus threshold, which is a constant that indicates the amount of activation that an entity needs in order to be in focus. If this first search fails, the threshold condition is dropped, and the entity that satisfies the constraints and has the highest context activation is chosen.

On the face of it this procedure might appear overly simplistic, but Alshawi demonstrates that, in fact, it can produce fairly sophisticated results when embedded in the kind of model he proposes. The central component of this model - or rather, this class of models - is the context-factor system. This system can, to give just one example, activate entities that were not mentioned explicitly in the text, but are contextually implicated by the interpretation process. Thus it becomes possible to account for certain well-known cases of anaphoric coherence exemplified e.g. by a text that introduces a house, and immediately afterwards refers to the roof.

The scope of Alshawi's framework is restricted in a number of ways, as the author himself points out. First, the class of sentence types covered is not very broad, and, possibly as a result of this, the texts that were constructed to test the Capture system (which are all listed in an appendix) do not give a very natural impression. Secondly, since Alshawi has attempted "to maximize the 'performance/complexity' ratio" (p. 2) of his system, interpretation tasks that require complex inferences (e.g. because they involve hearer models) or non-deterministic procedures (which would allow the system to give up an interpretation previously chosen in favour of another one, because new input indicates that it is preferable after all) are left out of consideration. On the other hand, however, one could well argue that, given the state of the art in artificial intelligence as well as in its neighbour disciplines, these are exactly the right kinds of limitations to impose on this type of research.

Alshawi outlines a general framework for text processing within which models can be constructed with, judging from the model detailed by the author himself, several attractive properties. To begin with, the framework makes for simple and computationally tractable systems which, the remarks in the previous paragraph notwithstanding, cover a coherent and surprisingly wide range of phenomena. It furthermore assimilates some of the best work in artificial intelligence, either by directly incorporating it or by working out solutions that combine the advantages of alternative approaches. An example of the former is the memory mechanisms that Alshawi employs, which make use of the marker-processing techniques developed by Fahlman. The context-factor system exemplifies the latter, as it is shown to support, in effect, both top-down and bottom-up interpretation processes. Finally, and linking up to the example just given, it is worth mentioning that Alshawi's framework seems to be able to accommodate relatively robust processing models - which is interesting because robustness is an essential, but often ignored, characteristic of human cognitive skills in general, and language skills in particular.

The book's presentation is on the whole fairly adequate. Especially noteworthy in this context are the impartiality and explicitness with which the author assesses the strengths and weaknesses of his theory and of the model he implemented - generally not the most conspicuous features of artificial intelligence texts, or texts emanating from any other field, for that matter.

Department of Language and Literature
University of Brabant
P.O. Box 90153
5000 LE Tilburg
The Netherlands

EDITOR'S NOTE

The editors wish to express their gratitude to the following colleagues who are not consulting editors of this Journal but have been kind enough during the past year to referee papers that were submitted for publication:

D. Geeraerts, T. Hoekstra, E. Lang, K. Oatley, G. Redeker, H. Schotel, R. Schreuder, H. Stark, T. Mitchell

Journal of Semantics 6: 101-146

TOOLS AND EXPLANATIONS OF COMPARISON - PART 2*

MANFRED BIERWISCH

5. COMPARATIVE AND EQUATIVE CONSTRUCTIONS

Measure phrases are but a special instance of Deg. We will now incorporate some of the more complicated Degree constructions. Although I will mainly be concerned with their semantic structure, a few syntactic prerequisites must be fixed. For expository reasons, I will follow Jackendoff (1977), without pursuing the full range of complexities discussed there.²⁷ Suppose, then, that (103) is the LF-structure underlying (102):

(102) John is two {feet taller than / times as tall as} Bill is.

(103) [Tree diagram not recoverable from the source. Surviving node labels: S, NP (John), INFL (Pres), VP, V (be), AP, Deg, A', A, S', COMP, S, two foot/times, Wh x_i, Bill, Pres, V (be), AP, Deg (x_i), [A pro].]

* Editorial note: Because of its unusual length this paper appears in two parts. Part I (i.e. Sections 1-4 of the paper) was published in JS 6.1.


I will briefly comment on a number of more or less crucial points. First, I have assumed that the complement clause is not extraposed, but rather directly generated in its final position. This raises various questions which need careful consideration, but cannot be pursued here. Secondly, I have placed than and as in complementizer position. This is an arbitrary (and rather dubious) decision, which has no substantial consequences in the present context. Than and as are formal markers of the complement constituent of Comparative and Equative constructions without any semantic content. What is crucial, however, is the appearance of a Wh-Deg operator that is moved to the COMP-position. It is phonetically empty, but it has an essential semantic function: the SF-representation of [Wh x] will be an abstractor x̂, such that the SF-representation of the whole clause comes out as a property of intervals. (It is, in fact, a property of D-differences, hence I will represent the abstractor as ĉ, in accordance with the notational conventions used so far.) This assumption, which is basically due to Chomsky (1977), brings comparative complement clauses in close connection to relative clauses: both are formed by Wh-Movement, and both represent properties of individuals of certain types. These characteristics encompass both simple clauses like as Bill is and complex clauses such as than Sue thought that everybody expected Bill to be.

Another crucial point is the appearance of [A pro], which is phonetically empty, but semantically non-distinct from the adjective in the matrix clause. The notion of non-distinctness will be discussed below. An essential condition for its occurrence is the 'correspondence' to the matrix adjective. Although the requirement is intuitively obvious, it is anything but clear how it is to be made precise in general syntactic terms.
For the time being, I will simply rely on the intuitive plausibility of the condition, without introducing any ad hoc formalization. One reason for this is that the problem is probably related to a final point to be mentioned.

Complement clauses of Comparatives can be reduced in a systematic way so that, again in intuitive terms, constituents that are identical to counterparts in the matrix clause are usually not realized at the surface. Two questions must be clarified in this respect: first, whether the phonetically empty elements are deleted under identity, or whether they are not inserted in the first place and then supplied by rules of interpretation or construal. Secondly, deletion as well as interpretation requires some kind of across-the-board correspondence in the sense of Williams (1977), which raises a number of intricate problems: across-the-board correspondence is straightforward for parallel structures, such as coordinate clauses, but Comparative constructions require it for a matrix and an embedded clause, the latter being a constituent of the former. Whatever the correct solution to these problems will turn out to be, I will presuppose here that it provides an independently motivated LF-representation accessible to semantic interpretation without additional assumptions. Let me illustrate the presumed SF-representations of complement clauses by some slightly simplified examples:

(104) a. John is taller than Bill (is).
      b. John is taller [than Wh c Bill is c pro]
      c. ĉ [[QUANT VERT BILL] = [v + c]]

(105) a. The table is two feet longer than (it is) high.
      b. The table is two feet longer [than Wh c it is c high]
      c. ĉ [[QUANT VERT y] = [v + c]]

(106) a. John is taller than Bill expected (him to be).
      b. John is taller [than Wh c Bill expected he be c pro]
      c. ĉ [PAST [EXPECT BILL [[QUANT VERT y] = [v + c]]]]

The parenthesized parts in (a) need not be phonetically realized. The bracketed parts of the LF-representations (b) have the SF-representation (c), where 'y' in (105c) and (106c) is a variable that must be bound by (i.e., must be coreferential with) the subject of the matrix clause. We will see later that the value of v will be 0 in all these cases. Notice that the representations in (c) are of category S/N, viz. expressions for properties of D-intervals, as mentioned above. This holds also for the SF-representation of than Bill or than high in (104) and (105). For ease of reference, I will use W as a variable over SF-expressions of this kind. Hence [W c] represents a proposition that ascribes the property W to the interval c.

Turning now to the analysis of Comparative constructions, remember first what we want their conceptual interpretation to be:

(107) a. a is longer than b.
      b. f(a, L) ⊃ f(b, L) + c

Given the SF-representation of D-adjectives proposed in (101), the basic point is fairly obvious: the D-extent assigned to the object that provides the standard for comparison must become the interpretation of the variable v. Notice that this holds equally for +Pol and −Pol adjectives.

(108) a. a is shorter than b.
      b. f(a, L) ⊂ f(b, L) − c
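The conditions (107b) and (108b) can be illustrated by a minimal sketch. It assumes, for illustration only, that an extent f(x, L) is an interval starting at 0 and can therefore be represented by its end point, that the difference c is a positive number, and that '⊃' and '⊂' are read as 'covers at least' and 'is covered by'. The function names and the sample values are hypothetical and not part of the analysis.

```python
def is_longer(extent_a: float, extent_b: float, c: float) -> bool:
    """(107b): the extent of a covers the extent of b extended by the difference c."""
    return c > 0 and extent_a >= extent_b + c


def is_shorter(extent_a: float, extent_b: float, c: float) -> bool:
    """(108b): the extent of a is covered by the extent of b reduced by the difference c."""
    return c > 0 and extent_a <= extent_b - c


# Hypothetical lengths in centimetres.
print(is_longer(200, 180, 20))   # a reaches 20 cm beyond b
print(is_shorter(180, 200, 20))  # a falls 20 cm short of b
print(is_longer(180, 200, 20))   # a does not cover b's extent plus c
```

In both functions the standard of comparison is the extent of b; only the sign attached to c differs, which mirrors the parallel treatment of +Pol and −Pol adjectives.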

In order to derive the corresponding SF-structures in a compositional way, three things have to be accomplished: (i) The complement clause than Wh c b is c pro must be interpreted as than Wh c b is c long for both (107) and (108). Why this is so, and how it can be achieved, will be discussed below. (ii) The interval property represented by the complement clause must be turned into an extent that actually has that property. Given the property W as just introduced, this is achieved by prefixing the iota-operator 'ιc' to the proposition [W c]. Formally, this amounts simply to substituting the abstractor ĉ contained in W by the iota-operator, which now binds the variable bound by ĉ. Notice that the uniqueness condition associated with the iota-operator requires the default-condition discussed above to be in force, as otherwise c would not be uniquely determined. This has exactly the desired consequences for the interpretation of comparative constructions. (iii) The expression ιc [W c] must eventually be substituted for the variable v in the matrix adjective. These considerations lead to the following SF-structure for longer (than):

(109) longer: ĉ [Ŵ [x̂ [[QUANT MAX x] = [ιc' [W c'] + c]]]]

The lexical representation for shorter will be identical to (109), except that '+' is to be replaced by '−'. Assuming now (104c) as the value specifying the abstractor Ŵ, we derive (111) as the SF-representation for (110):²⁸

(110) John is shorter than Bill.

(111) ∃c [[QUANT VERT JOHN] = [ιc' [[QUANT VERT BILL] = [v + c']] − c]]

The analysis proposed so far is similar in some respects to that of Hellan (1981) and Von Stechow (1984). It differs, however, in crucial respects. The main point is that ιc [W c], and hence the reading of the complement clause, replaces the extent-variable v, which automatically accounts for the fact that Comparative constructions of D-adjectives can never be norm-related, as N can show up only as a specification of v, which is no longer available. (Notice that there is a variable v in (111) which originates, however, from the adjective in the complement clause and must be 0 by general conditions.) Another crucial consequence of this analysis is that the additive (or subtractive) ingredient of Comparatives is not a contribution of the comparative morpheme, but rather of the adjective. This provides the required parallelism for +Pol and −Pol adjectives, giving the 'converse' effect of the Comparative without further ado. In this way condition (iii) formulated in Section 4 is met: Positive and Comparative constructions are in fact derived in a strictly parallel way for all types of adjectives.

There are a number of further consequences that follow from the above analysis. I will mention three of them. First, since W is in the scope of an iota-operator, which requires c to be uniquely specified, the complement clause cannot contain a negation: the value for c in Bill is not c tall is simply not uniquely determined. Hence the ungrammaticality of sentences like John is taller than Bill is not need not be captured by a stipulated syntactic constraint, but can be explained in terms of SF-properties.

Secondly, being in the scope of ιc gives W the status of a presupposition (or conventional implicature), very much like restrictive relative clauses. This does exempt the complement clause from negation. Hence the negation in (112a) does not affect the implicit proposition that Bill is c'-tall. As furthermore the iota-operator requires uniqueness of c', the default interpretation of '+' (and '−') remains in force for the complement clause, so that John is not taller than Bill can only mean that John is at most as tall as Bill. This can be seen by the equivalence of (112b) and (112c). (Here and in the sequel I will abbreviate QUANT DIM by D'.)

(112) a. John is not taller than Bill.
      b. ¬ [∃c [[D' JOHN] = [ιc' [[D' BILL] = [v + c']] + c]]]
      c. ∃c' (c' unique) [[[D' BILL] = [v + c']] ∧ ¬ ∃c [[D' JOHN] = [c' + c]]]
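The role of the uniqueness condition can be sketched as follows. The sketch assumes a finite, hypothetical scale of interval sizes and invented height values; iota, complement, negated and comparative are illustrative names, and '=' is taken literally as identity of end points, which simplifies the analysis given in the text.

```python
from typing import Callable, Optional

DOMAIN = range(0, 301)               # hypothetical interval sizes (centimetres)
HEIGHT = {"JOHN": 185, "BILL": 180}  # hypothetical values of [QUANT VERT x]


def iota(prop: Callable[[int], bool]) -> Optional[int]:
    """Return the unique c in DOMAIN satisfying prop; None if there is none or more than one."""
    witnesses = [c for c in DOMAIN if prop(c)]
    return witnesses[0] if len(witnesses) == 1 else None


def complement(name: str, v: int = 0) -> Callable[[int], bool]:
    """Interval property in the style of (104c): [QUANT VERT name] = [v + c]."""
    return lambda c: HEIGHT[name] == v + c


def negated(prop: Callable[[int], bool]) -> Callable[[int], bool]:
    """Interval property of a negated complement clause (Bill is not c tall)."""
    return lambda c: not prop(c)


def comparative(name: str, prop: Callable[[int], bool], sign: int = 1) -> Optional[bool]:
    """Style of (109)/(111): some c > 0 with [QUANT VERT name] = [iota c'[W c'] +/- c]."""
    standard = iota(prop)
    if standard is None:
        return None  # the complement clause fixes no unique extent
    return any(HEIGHT[name] == standard + sign * c for c in DOMAIN if c > 0)


print(comparative("JOHN", complement("BILL")))           # John is taller than Bill
print(comparative("BILL", complement("JOHN"), sign=-1))  # Bill is shorter than John
print(comparative("JOHN", negated(complement("BILL"))))  # than Bill is not: no unique extent
```

Returning None rather than False for the negated complement keeps the failure of uniqueness apart from ordinary falsity, in line with the presuppositional status that the text assigns to W.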

The third point concerns the content of the complement clause. Here the relevant consequences follow only if further assumptions are introduced. They will be motivated below. What is at issue is the third class of problems discussed in Section 1, viz. the ungrammaticality of sentences like (24b), repeated here as (113):

(113) * John is shorter than Bill is short.

In terms of the conceptual interpretation, the complement clause of a Comparative has to specify a property of an extent, i.e., of an interval that starts at 0, because v must be interpreted by an extent. This corresponds to the intuitive notion that both taller than b and shorter than b refer to b's height, rather than b's shortness. This condition is met by specifying the v of the complement clause by 0, so that ιc' [W c'] comes out as an extent. As an automatic consequence of this condition, −Pol adjectives are not admitted in the complement clause because they require c to be subtracted from v, which is impossible for v = 0. Hence c' pro must be interpreted as c' tall in the complement of both taller and shorter. That is in effect what (111) illustrates. We will return to this point in the next section.

As it stands, the analysis of the Comparative given in (109) is inadequate in two respects: first, it does not account for the analytic Comparative more, hence it is not sufficiently general. And, secondly, it does not fit into the syntactic structure indicated in (103).²⁹ What we need to do in order to complete the analysis is to strip out, so to speak, the effect of the comparative morpheme from (109). To that effect, the open variable v in the SF-structure of D-adjectives must be made available by means of a prefixed abstractor v̂ for syntactically determined specification. Suppose, now, that A is a variable over SF-structures of D-adjectives as given in (109), i.e., A is of category (S/N)/N and contains a free variable v. Then [v̂ [A]] [ιc' [W c']] will have the desired result. This structure must, of course, be prefixed by abstractors binding A and W, which yields (114) as the SF-structure of more and -er (where -er must later be attached to the following adjective by the familiar process of affix hopping).

(114) more/-er: Â [Ŵ [[v̂ [A]] [ιc' [W c']]]]

If (114) is applied to SF-structures of adjectives like (101), it yields representations such as (109) by successive lambda conversion. One further adjustment is called for, though, in order to account for the measure phrases that may appear under Deg 1 in (103). These must first be absorbed by the Comparative morpheme, yielding the SF-structure of two feet more, etc. We thus end up with (115) as the SF-structure of more:

(115) more/-er: ĉ'' [Â [Ŵ [[v̂ [A c'']] [ιc' [W c']]]]]

If this is first applied to the NP two feet, the abstractor ĉ'' is specified, giving the following SF-structure:

(116) two feet more/-er: Â [Ŵ [[v̂ [A 2 FOOT]] [ιc' [W c']]]]

The expression 2 FOOT will eventually be absorbed by the abstractor ĉ in A, such that the Degree constituent indeed specifies the c and hence receives its θ-role, although its SF-representation is by no means a simple constant of category N. If, on the other hand, no measure phrase appears under Deg, then ĉ'' is converted by rule (50) into a referential operator with narrowest scope that ultimately binds the grade-variable in A.³⁰ In order to achieve that result, (116) must apply to the SF-representation A and W of the adjective and the complement clause, in that order, which is in line with the LF-structure (103).

We will turn now to the Equative construction. The basic idea is this: while the Comparative specifies the extent variable v by means of the complement clause, the Equative in the same way specifies the difference variable c. As this account differs substantially from previous theories, let me first give some intuitive motivation for it. Obviously (116a) could approximately be construed as having the interpretation (116b). The situation is more complicated with (117), which means something like (118), involving the presupposed norm-relatedness indicated in (117b):

(116) a. a is as long as b.
      b. f(a, L) = f(b, L)

(117) a. a is as short as b.
      b. f(a, L) = f(b, L) ∧ (f(b, L) ⊂ N − c)

(118) a is as much below the norm as b.

As (118) suggests, the Equative construction involves in fact an equality of differences rather than of extents directly. Furthermore, while −Pol D-adjectives are excluded from complement clauses of Comparatives, this is not the case for Equatives:

(119) a. John is as tall as Bill is short.
      b. The window is almost as low as it is narrow.

Hence complement clauses of Equatives are not subject to the restrictions that hold for extents. These, and a number of further relevant properties of Equatives, follow directly if we assume (120b) to be the SF-representation of (120a):

(120) a. John is as short as Bill (is).
      b. [[QUANT VERT JOHN] = [N − [ηc' [[QUANT VERT BILL] = [N − c']]]]]

The eta-operator that binds c' in (120b) is identical with the iota-operator, except that it does not require uniqueness. This accounts for two interrelated facts: first, Equatives admit negative complement clauses, which Comparatives do not:

(121) a. John is as tall as nobody else.
      b. * John is taller than nobody else.

Secondly, the preferred interpretation of negated Equatives is not based on the uniqueness of c'. This can be seen as follows:

(122) a. John is not as short as Bill.
      b. ¬ [[D' JOHN] = [N − [ηc' [[D' BILL] = [N − c']]]]]
      c. ∃c' [[[D' BILL] = [N − c']] ∧ ¬ [[D' JOHN] = [N − c']]]
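The contribution of the eta-operator in (120b) can be sketched as follows. The sketch assumes a hypothetical norm N, invented height values and a finite set of positive differences; eta, below_norm_by and as_short_as are illustrative names, and '=' is again taken literally as identity, which simplifies the analysis in the text.

```python
from typing import Callable, Optional

N = 175                                            # hypothetical norm of height
HEIGHT = {"JOHN": 170, "BILL": 170, "SUE": 190}    # hypothetical values of [D' x]
DOMAIN = range(1, 301)                             # hypothetical positive differences


def eta(prop: Callable[[int], bool]) -> Optional[int]:
    """Like iota, but without uniqueness: return some c satisfying prop, or None."""
    return next((c for c in DOMAIN if prop(c)), None)


def below_norm_by(name: str) -> Callable[[int], bool]:
    """Property of differences c' such that [D' name] = [N - c']."""
    return lambda c: HEIGHT[name] == N - c


def as_short_as(x: str, y: str) -> Optional[bool]:
    """Style of (120b): [D' x] = [N - eta c'[[D' y] = [N - c']]]."""
    c = eta(below_norm_by(y))
    if c is None:
        return None  # y is not below the norm, so the presupposition fails
    return HEIGHT[x] == N - c


print(as_short_as("JOHN", "BILL"))  # both lie the same distance below the norm
print(as_short_as("JOHN", "SUE"))   # Sue is not short: no difference below the norm exists
print(eta(lambda c: not below_norm_by("BILL")(c)) is not None)  # eta tolerates many witnesses
```

Because eta accepts any witness, a negated property of differences still yields a value, whereas an operator that insists on uniqueness would fail on it; this is the contrast the text draws between (121a) and (121b).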

The preferred interpretation of (122a) is that of John is shorter than Bill, plus the presupposition that Bill is short. (122b) indicates the SF-structure of (122a), and it is by definition equivalent to (122c). The presupposition is maintained as the complement clause is not affected by the negation. The matrix clause, however, is in the scope of negation, so that the default reading of '−' does not apply, and the lack of the uniqueness condition on c' automatically provides the preferred interpretation.

Notice that I have assumed in (120b) that the extent variable v is specified by N for both the matrix adjective and the adjective of the complement clause. This assumption follows from the conditions to be discussed below. It provides, of course, the norm-relatedness involved in these constructions, and it gives it the status of a presupposition, as N occurs within the scope of the eta-operator. This holds only for −Pol adjectives, though. +Pol adjectives admit both N and 0 as the value of v in these constructions, hence norm-relatedness is not obligatory for as tall as, but only for as short as.

Using the notational conventions introduced above, the SF-structure of the Degree-morpheme as can now be given as follows:

(123) as (of Deg): Â [Ŵ [A [ηc' [W c']]]]

If A and W are specified by appropriate readings, (123) yields SF-representations as indicated in (120). Notice that again the Degree constituent ultimately specifies the abstractor ĉ of the D-adjectives to which it applies.

There is one point to be added to (123), in order to account for constructions like twice as tall as S. Let n be a variable over SF-structures assigned to NPs like three times, twice, half etc., which might be called 'factor phrases'. n is a functor of category N/N, and represents multiplication of its argument, so that [n c] is interpreted as the product of c and the factor contained in n. Obviously, n must apply to the extent specified by the complement clause of Equatives. Furthermore, factor phrases must be optional in a different way from measure phrases like two feet: absence of measure phrases triggers the Unspecified Argument Rule, which provides a contextually specified value; absence of factor phrases is nothing but absence.³¹ I represent this optionality by including n and the abstractor by which it is bound into parentheses. We thus get (124) instead of (123):

(124) as (of Deg): (n̂) [Â [Ŵ [A [ηc' [W [(n) c']]]]]]

Notice that n must be precluded from −Pol D-adjectives in order to account for the ungrammaticality noted above:

(17) a. * John is half as short as Bill is.

This will be taken care of by the conditions to be discussed below.

I will now briefly consider constructions with less (which I will call Subtractive constructions). Although less derives from the Comparative of little, Subtractive constructions are more closely related to Equatives than to Comparatives: their complement clause specifies the difference variable c rather than the extent variable v. To see this, consider the intended conceptual interpretation:

(125) a. a is two feet less short than b.
      b. [f(a, L) ⊃ f(b, L) + 2 ft] ∧ (f(b, L) ⊂ N − c')

Just as Equatives, Subtractives preserve norm-relatedness for −Pol adjectives, and they are based on the comparison of differences rather than extents directly, as the paraphrase (126) suggests:

(126) The extent to which a is short is two feet smaller than the extent to which b is short.

On the other hand, Subtractives share with Comparatives the uniqueness condition on c', since they do not allow negated complement clauses. Hence, they are based on the iota-operator. I will thus assume the SF-representation (127b) for sentences like (127a):

(127) a. John is less short than Bill.
      b. ∃c'' [[D' JOHN] = [N − [ιc' [[D' BILL] = [N − c']] − c'']]]

Under standard assumptions, (127b) determines the appropriate conceptual interpretation. And it follows from the assumptions introduced so far, if we set up (128) as the SF-structure for less:

(128) less (of Deg): ĉ'' [Â [Ŵ [A [ιc' [W c'] − c'']]]]

Notice that there are three occurrences of '−' in (127b). The first two of them derive from short (and the pro corresponding to it). They would be replaced by '+' if short is substituted by tall. The last one derives from less, indicating its relatedness to little (which I will not pursue here).

So far, I have considered only constructions whose complement can reasonably be construed as (the residue of) a clause, including cases like than Bill, than tall, than he believes, etc. Notice, incidentally, that even completely missing complements are taken care of, because in those cases the Unspecified Argument Rule would turn Ŵ into a referential operator that picks an appropriate property from the context. Thus the but-clause of (129a) would have the SF-structure (129b), where the most likely interpretation of W is provided by the initial clause:

(129) a. Bill is tall, but John is taller.
      b. ∃c'' ∃W [[QUANT VERT JOHN] = [ιc' [W c'] + c'']]

Consider now sentences like (130):

(130) a. John is taller than six feet.
      b. Sue is shorter than six feet.

Here the status of the measure phrase as a residue of a complement clause is rather dubious: there is no than six feet is. Intuitively, the measure phrase should directly specify the extent variable v, with (131) as the SF-structure of (130a):

(131) ∃c'' [[QUANT VERT JOHN] = [6 FOOT + c'']]

This would require, however, an ad hoc adjustment of the analysis of more/-er given in (115). Can we preserve the analysis given so far and still derive something like (131)? What would be needed to achieve this is some means that turns a measure phrase into a property of intervals. There are various possibilities to be considered here, depending on how measure phrases are to be treated for independent reasons. One way would be to consider expressions like '6 foot' as belonging alternatively to category N or S/N.³² With this proviso, measure phrases could be treated directly as a possible complement to Comparatives. The result would be structures like (131) with ιc' [6 FOOT c'] instead of '6 foot'. Another possibility would be to assign (132a) the SF-structure (132b):

(132) a. than six feet
      b. ĉ [6 FOOT = c]

This alternative is similar in spirit to a proposal made in Bresnan (1973). It can be elaborated in various ways. I will leave it at that, concluding merely that the analysis of sentences like (130) can be reconciled with the present account.

Notice, incidentally, that along with (130), we have (133), where (a) is ultimately equivalent to (130a), while (b) is ungrammatical:

(133) a. John is more than six feet tall.
      b. * Sue is more than six feet short.

This follows directly from the present analysis, since the measure phrase in (130) specifies the extent variable v, while more than six feet is a degree constituent that specifies the difference variable c and is excluded from −Pol adjectives for the same reason as simple measure phrases.

The last remark brings me to a bunch of problems which I have deliberately avoided so far. I am referring to the analysis of much, many and their comparative more. In some sense, these items are expressions of gradation par excellence. In spite of this, they cannot be treated here in full detail, because they raise a large number of problems whose analysis would go far beyond the present limits. The main point is their oscillating syntactic status: they can occur as quantifiers, adjectives, adverbials, degree-modifiers, and it is by no means obvious whether these different occurrences can all be reduced to the same SF-structure, although there must be a common core in any event. In order to indicate how the present theory might eventually cover these items, I will give a few hints regarding their SF-representation.

Obviously, many and few as well as much and little are antonyms in much the same way as +Pol and −Pol adjectives and should hence be analysed in parallel fashion. Their characteristic property is not the fixing of any particular dimension or condition for comparison. They represent, so to speak, gradation per se. As a first upshot, we might therefore assume that much/many differs from tall, long etc. in that it has a variable instead of one of the constants VERT, MAX, etc. We would thus have the following SF-representations, where Q is a variable of category S/N:

(134) much: ĉ [Q̂ [x̂ [[QUANT Q x] = [v + c]]]]

(135) more: ĉ'' [Q̂ [Ŵ [x̂ [[QUANT Q x] = [ιc' [W c'] + c'']]]]]

Notice that (134) is of exactly the same category as the SF-structure of attributive adjectives indicated in (98) above. Thus (134) accounts for the adnominal much, and (135) for its lexically fixed Comparative. Here are some provisional illustrations, based on obvious assumptions and abbreviations:

(136) [N' much water]: x̂ [∃c [[QUANT WATER x] = [N + c]]]

(137) [N' two gallons more water than wine]: x̂ [[QUANT WATER x] = [ιc' [∃y [[QUANT WINE y] = [0 + c']]] + 2 GALLON]]

Interestingly, (135) can also be used as a modifier of adjectives as in (138), which is not to be confused with the Comparative more, nor with the Degree-modifier, to which I shall turn presently.

(138) John is more tall than slim.

In order to make the pertinent SF-structure more perspicuous, I shall introduce the following notational abbreviation, which will be useful for other purposes as well:

(139) x̂ [TALL x] =def x̂ [∃c [[QUANT VERT x] = [v + c]]]

The definiens of (139) is the Positive reading of tall, the v of which will become N by general condition. Similar abbreviations can be defined for all D-adjectives. With this proviso, the SF-structure of (138) derived by means of (135) would be as follows:

(140) ∃c'' [[QUANT TALL JOHN] = [ιc' [[QUANT SLIM JOHN] = [0 + c']] + c'']]

This presupposes that the LF-structure of (138) is (141) with the usual assumption about pro, whose SF-structure will be that of much.

(141) John is more tall [[than Wh c] John is c pro slim]

Notice that according to (140), (138) does not express a comparison between two extents of John, but rather between the extent to which he is tall and slim, respectively. In effect, then, the comparison is shifted from the scale of physical extension to another, more complex scale. This becomes explicit if we spell out the defined structure of TALL (and SLIM, for that matter). We will return to that kind of shift in Section 7.

So far, I have not introduced in the analysis of much and more any new concepts over and above those already used. There is one point, however, that needs some clarification. Let us return to the categorization and interpretation of QUANT and DIM. The intuitive notion that DIM maps its argument into a pertinent D-scale so that QUANT 'scans' the extent of the argument can be made precise in various ways. The notation used so far can reasonably be interpreted in the way of (142a), which might, however, be construed as an abbreviation for something like (142b):

(142) [Category diagrams not recoverable from the source. Surviving labels: (a) QUANT, DIM, x with categories N/N, N/N, N; (b) QUANT, DIM, x, c with categories including (N/S)/N, S/N, N, S.]

The crucial point is that the analysis of much and more, as it stands, cannot be reconciled with the assumptions embodied in either (142a) or (142b), as we have assumed that Q is of category S/N. What we need is a functor that turns one-place predicates like WATER into relations (or functors) that associate the argument of the predicate with a scale of quantities. Suppose that AMOUNT represents such a functor, so that (143b) would be interpreted as mapping an amount of water x into a D-scale of amounts. (143a) assimilates this structure to that of (142a):

(143)

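[Category diagrams for (143a) and (143b) not recoverable from the source. Surviving labels: (a) AMOUNT of category (N/N)/(S/N), WATER of category S/N, x of category N; (b) AMOUNT, WATER, x, c, with result categories S and N.]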

In other words, [AMOUNT Q] turns x into a quantity of Q. This is straightforward for mass nouns like water, sand etc. In order to include adjectives like tall, slim into the values of Q, we must generalize its interpretation, so that [AMOUNT TALL] specifies the amount to which x is tall, i.e. above the norm of height. Notice that by the definition of tall this reduces ultimately to an interval on the D-scale of height, telescoping so to speak the norm-relatedness. With these considerations in mind, (134) is to be replaced by (144):

(144) much: ĉ [Q̂ [x̂ [[QUANT AMOUNT Q x] = [v + c]]]]

A similar adjustment is to be made for more. Suppose now that we let AMOUNT apply also to sets, where it specifies their cardinality, and we represent the Plural, somewhat ad hoc, by a functor SET, which turns a common noun into a set of individuals that have the property represented by the noun. Then the SF-structure of many would be identical to that of much given in (144), assuming appropriate syntactic categorisation.

Let us briefly look now at cases like more than six feet tall, i.e. at the Degree modifier. Obviously, more than six feet must determine an interval, just as the measure phrase six feet, that is, its SF-representation must be an expression of category N that eventually specifies the abstractor ĉ of tall. In other words, the Degree modifiers more and much merely provide an interval amenable to further specification. The obvious way to incorporate these conditions into the present framework is to set up the following SF-structures:

(145) much (of Deg): ĉ [ηc' [c' = [v + c]]]


I over I , and v as selecting I over any value v > I , is logically equivalent to the classical bivalent system, no matter how many truth-values P has. In other words, the number of truth-values is irrelevant for the classical calculus, although with the three operators as defined any truth-value other than "true" or " false" is otiose. On the other hand, any n-valued logic allows for n- 1 different specific negations , whereby the classical negation (i.e. -, ) is equivalent to the disjunction of the other negations. Thus, in the three valued system proposed here, -, p = ( - p v == p). See for details and proofs the Appendix to Seuren ( 1 985) by A. Weijters. In Seuren ( 1 985), no account is taken of other cases of "marked" nega tion, such as those discussed by Horn (1 985) and exemplified in (25) and (26) above. It is Horn's merit to have drawn attention to these cases. Yet, as has been made clear above, I am not convinced that the Horn cases form one (natural) category with the cases of presupposition denial. In my view, presupposition denials form a separate category, distinct from the Horn cases. These I take to involve a specific form of linguistic quotation in (pseu-


It is claimed in Seuren ( 1 985) that all presuppositions are derived from lexi cal preconditions, even those that seem to be associated with, or induced by, grammatical constructions such as (pseudo)cleft, or phonological features such as contrastive accent. Clearly, to uphold this view non-trivial gram matical analyses are required . This analysis is thus heavily dependent on sound grammatical theory. Of more direct interest here is the question of the logical properties of the two negations, and of the relation between the logic and the semantics of natural language negation. In Seuren ( 1 985: 239) the following truth-table is given for the two negations, with the classical bivalent operator thrown in for good measure, though I would claim that this classical operator does not occur in natural language:


2. THE MODEL-THEORY OF THREE-VALUED CALCULI

In this section we will investigate the logical and model-theoretic properties of the trivalent system proposed. This seems useful because surprisingly little work has been done in this area, perhaps due to a deeply rooted mistrust, in logical circles, with regard to such calculi. It is hoped that the straightforward simplicity of the model-theoretic aspects of three- (and multi-)valued calculi will help to take away this distrust or lack of interest. We will concentrate on the model-theory of the calculi. Note, however, that what we call "model-theory" must be distinguished clearly from what we call "semantics for natural language". Most brands of formal semantics for natural language are based on the assumption that linguistic semantics is a variety of the kind of model-theoretic semantics developed in logic around the middle of this century. This assumption is radically dismissed here. We speak of "model-theory" when referring to "semantic" methods developed in logic. "Semantics", for us, is the study of the cognitive and linguistic processes that occur when sentences are understood. The incremental unity of negation, as opposed to its semantic and logical ambiguity, will be adumbrated in this section, but not made explicit until section 3. We will first take a look at the standard model-theory of the classical bivalent calculus.

2.1. Standard Boolean model-theory for bivalent propositional calculus

Let us revert to fig. 1 above, which gives the classical valuation space for a language L(a, b, c) with no entailment relations among the atomic sentences a, b, c. In the standard conception, a sentence p is said to "express a proposition". Let us use the notation "/p/" for the proposition expressed


do)cleft or contrastive accent constructions. They therefore involve the minimal, not the radical, negation. Then , In (1985) I provide no clear analysis of the relation between the lo gical and semantic properties of negation on the one hand, and its incremen tal effect on the other. No clear account is given why negation can be said to be ambiguous and yet sufficiently unified for there to be one single nega tion word. As long as this form of ambiguity is not clearly analysed and argued for, this analysis is open to Gazdar's criticism also applicable to Horn's analysis: why, if negation is ambiguous, is there no language in the world, as far as we know, which disambiguates between the two senses? In Section 3 more will be said about this. But first we shall have a closer look at the logical and model-theoretic aspects of the issue.

by p. /p/ is, in the standard conception, a characteristic function from valuations to truth-values or, in other words, the set of valuations in which p is true: {vn | vn(p) = 1}. Thus, in the system presented in fig. 1, /a/ is the function {⟨v1,1⟩, ⟨v2,2⟩, ⟨v3,1⟩, ⟨v4,2⟩, ⟨v5,1⟩, ⟨v6,2⟩, ⟨v7,1⟩, ⟨v8,2⟩}, or, alternatively, the set {v1, v3, v5, v7}, and a expresses this function, or this set. In other words, an interpretable sentence is associated with a set of valuations, or, if it is n ways ambiguous, with n sets of valuations, in the way sketched in fig. 5 for the non-ambiguous sentence a:

Fig. 5: /a/ as a subset of U.

This has the advantage that the truth-functional operators can be interpreted as simple set-theoretic operations on the valuations in the field of valuations U. The classical bivalent negation is now interpreted as follows: for any sentence p, /¬p/ = U − /p/, or, in other words, the complement of /p/ in U. Thus, /¬a/ = U − /a/ = {v2, v4, v6, v8}. Likewise for conjunction and disjunction: /p ∧ q/ = /p/ ∩ /q/, and /p ∨ q/ = /p/ ∪ /q/. To say that p is true in some vn now means: vn ∈ /p/, and, of course, to say that p is false in vn means: vn ∉ /p/, which is equivalent to saying that vn ∈ U − /p/, or ¬p is true in vn. Again, to say that ¬p is false in vn is: vn ∈ U − /¬p/, i.e., vn ∈ U − (U − /p/), and therefore vn ∈ /p/. On this basis we can give a general definition of the notion "truth-value", independently of the specific properties of the propositional calculus in question, as follows:

(31) In a propositional calculus P for a language L in a universe (set of valuations) U, there is a truth-value α just in case the propositions of the sentences of L structure U in such a way that for every p ∈ L and every vn in U there is precisely one set of valuations H ⊆ U such that vn(p) = α iff vn ∈ H.

Note that this definition allows for gaps in U only if a provision is introduced for a variable U. Take, for example, fig. 2, where a is unvalued in v3 and v6. Here, /a/ = {v1, v4}. Now the complement of /a/ in U, i.e. U − /a/, would be {v2, v3, v5, v6}, and no gaps would be possible. We can,



however, make a provision to let U vary with each sentence p ∈ L in such a way that U_p is precisely the set of valuations for which p is valued, i.e. the intersection of all /q/ of q ∈ L such that for all vn ∈ U, vn ∈ /q/ if vn(p) = 1 or 2. Then, for a in fig. 2, U_a − /a/ = {v2, v5}, and definition (31) will apply, with U relativised with respect to any given p ∈ L. This gives the truth-table for negation in any U and any L without gaps and with one single complement-taking negation, i.e. classical bivalent negation:

p    ¬p
1    2
2    1

Fig. 6: Truth-table for the classical bivalent negation

The truth-tables of the standard binary truth-functional operators ∧ and ∨ are conveniently constructed as follows. Let /a/ and /b/ be represented as in fig. 7. The standard definition of /a ∧ b/ is /a/ ∩ /b/. Hence, any vn ∈ /a/ ∩ /b/ will yield the value "1" for a ∧ b, and all vn ∉ /a/ ∩ /b/ will yield "2".

Fig. 7: Truth-table construction for classical ∧.

Likewise for disjunction. Given that /a ∨ b/ = /a/ ∪ /b/, we have:

Fig. 8: Truth-table construction for classical ∨.
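The set-theoretic reading of the classical operators can be sketched in a few lines of Python. This is only an illustration under stated assumptions: valuations are represented as tuples of values for (a, b, c), which is my own encoding and not the v1 to v8 numbering of fig. 1, and the names `prop`, `neg`, `conj`, `disj` and `table` are hypothetical helpers.

```python
from itertools import product

# Valuation space for L(a, b, c): every assignment of 1 (true) or 2 (false)
# to three logically independent atoms. Tuple order (a, b, c) is an assumption.
U = frozenset(product((1, 2), repeat=3))

def prop(index):
    """Proposition of an atom: the set of valuations in which it is true."""
    return frozenset(v for v in U if v[index] == 1)

A, B = prop(0), prop(1)

def neg(p):
    return U - p      # /not-p/ is the complement of /p/ in U

def conj(p, q):
    return p & q      # /p and q/ is the intersection of /p/ and /q/

def disj(p, q):
    return p | q      # /p or q/ is the union of /p/ and /q/

def value(p, v):
    """Truth-value of a proposition in valuation v: 1 if v is in it, else 2."""
    return 1 if v in p else 2

def table(op):
    """Read a binary truth-table off the sets: (value of a, value of b) -> value."""
    return {(v[0], v[1]): value(op(A, B), v) for v in U}

AND_TABLE = table(conj)
OR_TABLE = table(disj)
```

Because a and b are independent, every combination of their values occurs in some valuation, so the dictionaries cover the whole classical table.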

2.2. Presuppositional Boolean model-theory for trivalent propositional calculus

We have seen (Section 1.2) that the Frege-Strawson analysis of presuppositional facts is untenable on empirical grounds: it is normally possible for a sentence to be true under negation even though its presuppositions are not all fulfilled. We have also seen (Sections 1.3; 1.4), again on empirical grounds, that a return to standard bivalent logic is not a viable alternative. Our conclusion was that a choice had to be made between argument-split theories or NEG-split theories (or a combination of both). We will now have a look at the model-theoretic aspects of NEG-split theories. Instead of restricting U for each sentence p ∈ L to U_p and thus creating gaps, we will now keep U constant again, avoiding gaps, but keep U_p and re-interpret it as the subuniverse of p, still defined as the intersection of all /q/ such that for all vn ∈ U, vn ∈ /q/ if vn(p) = 1 or 2. U_p is to be interpreted as the set of valuations ("possible worlds") in U expressed by the conjunction of all presuppositions of p. If p has no presuppositions in the linguistic sense (as in, e.g., There are trees), we still let p "presuppose" all logical truths, and say that, in such a case, U_p = U. (This, as we shall see in a moment, makes it impossible for a presuppositionless p to be valued "3".)

Thus, for any sentence p, /p/ ⊆ U_p ⊆ U, as is demonstrated in fig. 9 for the sentence a, which we take to presuppose b and not to have any further presuppositions.


Since a and b are represented in figs. 7 and 8 as being logically independent, the tables thus constructed can be generalized to any p, q E L .

Fig. 9: /a/ as a subset of U_a, and U_a as a subset of U.

(32) For any vn in U and any p ∈ L:
vn(p) = 1 iff vn ∈ /p/
vn(p) = 2 iff vn ∈ U_p − /p/
vn(p) = 3 iff vn ∈ U − U_p.

Clearly now, if p has no presuppositions, and thus U_p = U, vn(p) = 3 will be impossible because vn ∈ U − U_p = ∅ is impossible. Note also that there is no room left now for a fourth value corresponding to the total complement, since, given the definitions of the three values in (32), it is not true that for any vn and any p ∈ L, vn(p) = 4 iff vn ∈ U − /p/. (In fact, as the reader may care to ascertain, the assumption of such a fourth value will take away the truth-functionality of the truth-functional operators, and thus destroy the logic.) This enables us to formulate the logical property of presuppositions:

(33) If a sentence p has the set P_p as its presuppositions (p >> P_p), then for all valuations vn ∈ U, vn(p) = 3 iff there is at least one q ∈ P_p such that vn(q) ≠ 1.
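The three values and the property in (33) can be checked mechanically on the fig. 2 example. The following Python sketch uses the sets given in the text for a sentence a presupposing b; it assumes, on the text's reading of the subuniverse, that U_a coincides with /b/, and it further assumes (my own simplification) that b itself has no presuppositions. The names `value`, `PROP_A`, `SUB_A` and so on are hypothetical.

```python
# Valuation names follow the fig. 2 example; the sets are those given in the
# text for a sentence a that presupposes b.
U = frozenset({"v1", "v2", "v3", "v4", "v5", "v6"})
PROP_A = frozenset({"v1", "v4"})                # /a/
SUB_A = frozenset({"v1", "v2", "v4", "v5"})     # U_a, the subuniverse of a

def value(prop, sub, universe, v):
    """Definition (32): 1 in /p/, 2 in U_p - /p/, 3 in U - U_p."""
    assert prop <= sub <= universe
    if v in prop:
        return 1
    if v in sub:
        return 2
    return 3

values_of_a = {v: value(PROP_A, SUB_A, U, v) for v in sorted(U)}

# Assumption: a presupposes only b, so U_a is /b/; b has no presuppositions.
PROP_B = SUB_A
SUB_B = U
values_of_b = {v: value(PROP_B, SUB_B, U, v) for v in sorted(U)}

# Property (33): a is radically false exactly where its presupposition is not true.
presupposition_property = all(
    (values_of_a[v] == 3) == (values_of_b[v] != 1) for v in U
)
```

Since b is given the whole of U as its subuniverse, it never receives the value 3, in line with the remark on presuppositionless sentences.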

Equivalently, if p >> q, then for all vn ∈ U, if vn(p) = 1 or 2 then vn(q) = 1. It is important to realize, however, that the notion of presupposition


We have now created two disjoint complements for any /p/, besides the old classical complement, which is the union of the two others. We shall speak of: the inner complement of p: U_p − /p/; the outer complement of p: U − U_p; the total complement of p: U − /p/ (= the classical complement of p). For L(a, b, c), with a >> b, and the valuation space as in fig. 2 above, this means that /a/ = {v1, v4}, U_a − /a/ = {v2, v5}, U − U_a = {v3, v6}, and, of course, U − /a/ = {v2, v3, v5, v6}, or the union of the inner and the outer complements of /a/. The effect of this is more structure in U, and, notably, the emergence of three truth-values, which we shall call "true" ("1"), "minimally false" ("2"), and "radically false" ("3"), defined as follows:

plays no role in the logic proper once that has been set up. Everything in the logic will be done in terms of three truth-values and, as we will now see, the truth-functions, including the two negations. The point is that presupposition is itself not a logical but a linguistic-semantic notion. The logic is only tailored to fit the presuppositional phenomena. On the basis of this we now define two negation operators, the minimal negation (~), and the radical negation (≃), as follows:

(34) For any p ∈ L:
/~p/ = U_p − /p/
/≃p/ = U − U_p
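Taken together with (32), the two definitions in (34) fix a truth-table for each negation. A minimal Python sketch follows, assuming a toy universe of three valuations (one for each value of p) and assuming, as the argument in the text has it, that ~p keeps the subuniverse of p while ≃p has the whole of U as its subuniverse. The `Sentence` record and the function names are hypothetical.

```python
from typing import NamedTuple

class Sentence(NamedTuple):
    prop: frozenset   # /p/, the valuations in which p is true
    sub: frozenset    # U_p, the subuniverse of p

# Toy universe: p is true in v1, minimally false in v2, radically false in v3.
U = frozenset({"v1", "v2", "v3"})
P = Sentence(prop=frozenset({"v1"}), sub=frozenset({"v1", "v2"}))

def value(s, v):
    """Definition (32)."""
    if v in s.prop:
        return 1
    return 2 if v in s.sub else 3

def minimal_neg(s):
    # (34): /~p/ = U_p - /p/; the subuniverse is unchanged (U_~p = U_p).
    return Sentence(s.sub - s.prop, s.sub)

def radical_neg(s):
    # (34): the proposition of the radical negation is U - U_p; it has no
    # presuppositions, so its own subuniverse is U.
    return Sentence(U - s.sub, U)

def negation_table(neg):
    """Map each value of p to the value of its negation."""
    return {value(P, v): value(neg(P), v) for v in U}

MINIMAL_TABLE = negation_table(minimal_neg)
RADICAL_TABLE = negation_table(radical_neg)
```

The minimal negation swaps 1 and 2 and leaves 3 alone, whereas the radical negation sends 3 to 1 and everything else to 2.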


It now follows that if for some vn, vn(p) = 1, and hence vn ∈ /p/ or, equivalently, vn ∈ U_p − (U_p − /p/), vn(~p) = 2 (and vice versa), since /~p/ = U_p − /p/, according to (34), and thus, according to (32), vn(~p) = 2. Likewise, it follows that if for some vn, vn(p) = 2, and hence vn ∈ U_p − /p/, vn(~p) = 1 (and vice versa), according, again, to (32) and (34). This gives us part of the truth-table for minimal negation (cp. fig. 4), i.e. from 1 to 2 and vice versa. It does not give us yet the function value 3 from 3. This we get when we realize that under the definitions given so far U_p = U_(~p) for any p, since U_(~p) is still the intersection of all /q/ such that for all vn ∈ U, vn ∈ /q/ if vn(p) = 1 or 2. With this knowledge we can now say that if vn(p) = 3, and hence vn ∈ U − U_p, or, equivalently, vn ∈ U − U_(~p), vn(~p) = 3 (and vice versa), according to (32). This completes the construction of the truth-table for minimal negation, as given in fig. 4 above.

In similar fashion we derive the truth-table for radical negation. If, for some vn and for some p, vn(p) = 1, and thus vn ∈ /p/ or, equivalently, vn ∈ U − (U − /p/), then vn ∈ U − (U − U_p), since /p/ ⊆ U_p. Now, U = U_(≃p), since ≃p can have no presuppositions in the linguistic sense. Therefore, vn(≃p) = 3 is impossible, as we have seen. (Since U_(≃p) is the intersection of all /q/ such that for all vn ∈ U, vn ∈ /q/ if vn(≃p) = 1 or 2, it follows that all vn ∈ U must be a member of any /q/ in the intersection forming U_(≃p), i.e. only logical truths can intersect to form U_(≃p), and again, U_(≃p) = U.) This means that if vn(p) = 1, vn ∈ U_(≃p) − (U − U_p). According to (34), U − U_p = /≃p/. Hence, if vn(p) = 1, vn ∈ U_(≃p) − /≃p/, which, according to (32), amounts to saying that vn(≃p) = 2. Likewise, if vn(p) = 2, vn ∈ U_p − /p/, and hence vn ∈ U_p, or, equivalently, vn ∈ U − (U − U_p), and thus vn ∈ U_(≃p) − (U − U_p), so that if vn(p) = 2, vn(≃p) = 2. (It also follows that if vn(≃p) = 2, vn(p) = 1 or 2.)
If, on the other hand, vn(p) = 3, then, by (32), vn ∈ U − U_p, and thus, according to (34), vn ∈ /≃p/, and hence vn(≃p) = 1, and vice versa. This establishes the truth-table for radical negation as given in fig. 4 above. As regards the binary truth-functional operators, in particular conjunction and disjunction, it is quite possible to provide Boolean underpinnings for their truth-tables in a three-valued calculus constrained by minimal and

radical negation. This can be done in a variety of ways, all of which will preserve their classical bivalent properties. For conjunction this means that, anyway, /p ∧ q/ = /p/ ∩ /q/, as in the bivalent logic. The question is now how to define U_(p∧q) in such a way that justice is done to the logical property of presuppositions as defined in (33) above. One way is to define it as U_p ∩ U_q. The resulting truth-table, which is the table given in Seuren (1985: 239), is constructed in fig. 10, based on arbitrary, logically independent a and b:

∧ with U_(a∧b) = U_a ∩ U_b.

It is also possible, however, to define U_(p∧q) as U_p ∪ U_q. This results in the table constructed in fig. 11:

Fig. 11: Truth-table construction for trivalent ∧ with U_(a∧b) = U_a ∪ U_b.
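The two constructions for conjunction can be compared by generating both tables from the sets. The Python sketch below assumes a universe with one valuation for each pair of values of two logically independent sentences a and b; the encoding of valuations as pairs and the names `value` and `conjunction_table` are my own.

```python
from itertools import product

VALUES = (1, 2, 3)
# One valuation per pair (value of a, value of b); a and b are independent.
U = frozenset(product(VALUES, repeat=2))

PROP_A = frozenset(v for v in U if v[0] == 1)     # /a/
SUB_A = frozenset(v for v in U if v[0] != 3)      # U_a
PROP_B = frozenset(v for v in U if v[1] == 1)     # /b/
SUB_B = frozenset(v for v in U if v[1] != 3)      # U_b

def value(prop, sub, v):
    """Definition (32) applied to a proposition and its subuniverse."""
    if v in prop:
        return 1
    return 2 if v in sub else 3

def conjunction_table(sub):
    """Table for the conjunction of a and b, given a choice of subuniverse."""
    prop = PROP_A & PROP_B        # the proposition is the intersection in both cases
    return {v: value(prop, sub, v) for v in U}

TABLE_FIG10 = conjunction_table(SUB_A & SUB_B)    # subuniverse as intersection
TABLE_FIG11 = conjunction_table(SUB_A | SUB_B)    # subuniverse as union
```

The two tables agree wherever both conjuncts have the value 1 or 2, and differ only where exactly one conjunct is radically false: the intersection gives 3 there, the union gives 2.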

For disjunction a similar distinction does not work. We can define U_(p∨q) as U_p ∪ U_q, which gives the truth-table presented in Seuren (1985: 239), and constructed in fig. 12. But if U_(p∨q) is defined as U_p ∩ U_q, as in fig. 13, no coherent interpretation results since, given the definitions in (32), a disjunction will be defined as both true and radically false when one disjunct is true


Fig. 10: Truth-table construction for trivalent

Fig. 12: Truth-table construction for trivalent ∨ with U_(a∨b) = U_a ∪ U_b.

Fig. 13: Truth-table construction for trivalent ∨ with U_(a∨b) = U_a ∩ U_b.

and the other radically false. Fig. 13, therefore, does not represent a possible analysis. In Seuren (1985) the table for conjunction corresponds to fig. 10, as has been said. We now see that this table is preferable to the one constructed in fig. 11. According to fig. 10, a sentence of the form a_c ∧ b_d (i.e. a presupposing c, and b presupposing d) generally presupposes both c and d, since for any vn ∈ /a ∧ b/, vn ∈ U_a and vn ∈ U_b. In a general way, this is correct, since a sentence like:

(35) Angus feeds his horse and Paddy feeds his donkey.

presupposes both that Angus has a horse and that Paddy has a donkey. The table constructed in fig. 11 does the same, but it also makes (35) presuppose that Angus has a horse or that Paddy has a donkey, which is linguistically incorrect, as appears from the infelicity of a discourse like (see Section 3 below):



(36) ! Angus has a horse or Paddy has a donkey, and Angus feeds his horse and Paddy feeds his donkey.

(37) Angus has a horse, and he either feeds it or he starves it.

Fig. 12 thus seems to be empirically correct. Blau (1978: 75) has trivalent truth-tables for both the presupposition-preserving and the presupposition-cancelling negation. The former is identical to our ~; the latter, however, is the classical negation ¬, i.e. with truth converted into falsity, and all other values converted into truth. We have seen that this negation does not define a truth-value in any logic with more than two truth-values, since, given the minimal negation and its concomitant inner complement, the classical negation, with its total complement, no longer satisfies definition (31). Once an inner complement is defined, the only other negation that defines a truth-value is the one associated with the outer complement, i.e. the radical negation. In other words, in any system with more than two truth-values and more than one negation, the classical negation is dysfunctional as a separate operator. Blau's trivalent conjunction and disjunction operators are defined (1978: 87) as follows:


We will see below, anyway, that it is not useful to let conjunctions carry presuppositions at all. And must be taken to block any projection of presuppositions, because, if not, a sentence of the form b and a_b will both assert and presuppose b, which is rightly considered repugnant in all of the relevant literature. So it really does not matter much whether we take fig. 10 or fig. 11 as representing the correct table. Both preserve classical logic equally. As regards disjunction, Seuren (1985: 239) gives the table represented in fig. 12. Thus defined, a_c ∨ b_d >> c ∨ d, but a_c ∨ b_d does not presuppose either c or d singly, because for some vn ∈ /a ∨ b/, vn ∉ U_a or vn ∉ U_b. Moreover, in fig. 12 a disjunction preserves all presuppositions that are shared by both disjuncts, or: a_c ∨ b_c >> c. That this is so is easily seen when one realizes that if a >> c, U_a cannot be larger than /c/ but it may be smaller. Likewise for U_b if b >> c. Clearly, then, for any vn ∈ /a ∨ b/, vn ∈ U_a ∪ U_b, and hence vn ∈ /c/. This is intuitively correct, witness the felicity of:

∧q                      ∨q
p    1  2  3            p    1  2  3
1    1  2  3            1    1  1  1
2    2  2  2            2    1  2  3
3    3  2  3            3    1  3  3

Fig. 14: Truth-tables for trivalent conjunction and disjunction as in Blau (1978: 87).
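The claims about disjunction and about Blau's operators can be checked against the set-theoretic construction. The Python sketch below is illustrative only: the valuation encoding and helper names are mine, and the two Blau tables are entered as I read them from the flattened figure together with the description in the text, so they should be treated as an assumption rather than as a quotation.

```python
from itertools import product

VALUES = (1, 2, 3)
# One valuation per pair (value of a, value of b); a and b are independent.
U = frozenset(product(VALUES, repeat=2))
PROP_A = frozenset(v for v in U if v[0] == 1)     # /a/
SUB_A = frozenset(v for v in U if v[0] != 3)      # U_a
PROP_B = frozenset(v for v in U if v[1] == 1)     # /b/
SUB_B = frozenset(v for v in U if v[1] != 3)      # U_b

def table(prop, sub):
    """Values by (32); membership in the proposition is checked first."""
    return {v: 1 if v in prop else 2 if v in sub else 3 for v in U}

def clashes(prop, sub):
    """Valuations that (32) would make both true and radically false."""
    return prop - sub

OR_PROP = PROP_A | PROP_B
OR_UNION = table(OR_PROP, SUB_A | SUB_B)              # as in fig. 12
OR_INTERSECTION = table(OR_PROP, SUB_A & SUB_B)       # as in fig. 13
OR_CLASHES = clashes(OR_PROP, SUB_A & SUB_B)
AND_UNION = table(PROP_A & PROP_B, SUB_A | SUB_B)     # as in fig. 11

# Blau's tables (assumed reading), keyed by (value of p, value of q).
BLAU_AND = {(1, 1): 1, (1, 2): 2, (1, 3): 3,
            (2, 1): 2, (2, 2): 2, (2, 3): 2,
            (3, 1): 3, (3, 2): 2, (3, 3): 3}
BLAU_OR = {(1, 1): 1, (1, 2): 1, (1, 3): 1,
           (2, 1): 1, (2, 2): 2, (2, 3): 3,
           (3, 1): 1, (3, 2): 3, (3, 3): 3}

AND_DIFFERENCES = {v for v in U if BLAU_AND[v] != AND_UNION[v]}
```

On this reading, the intersection-based disjunction clashes exactly where one disjunct is true and the other radically false, and Blau's conjunction departs from the union-based table at those same two combinations.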

2.3. Constructive model-theory for bivalent non-presuppositional languages

We have seen (Section 2.1) that the standard way of setting up a model-theory for propositional languages is by associating each sentence of the language with the set of valuations, or, in a different terminology, possible worlds, in which it is true. That set is said to be the "proposition" expressed by that sentence. There is, however, another way of constructing the model-theory. Take again the classical valuation space for L(a, b, c) as represented in fig. 1, which is repeated here for convenience:


The conjunction operator thus defined is inconsistent with Boolean semantics. This table is generated by U_(a∧b) = U_a ∪ U_b (i.e. as in fig. 11), except that the combination of truth and radical falsity yields radical falsity, which fits into an analysis as in fig. 10, with U_(a∧b) = U_a ∩ U_b. It should, in the conception of fig. 11, yield minimal falsity for the combination of truth and radical falsity, since any valuation which has this combination of values will be a member of U_a ∪ U_b. Blau's disjunction operator conforms to fig. 13, which, as we have seen, is incoherent in a model-theoretic (Boolean) interpretation. Blau's truth-tables for conjunction and disjunction must thus be rejected on grounds of model-theoretic interpretability. We will see in a moment (Section 3) that the logical presuppositional properties of the conjunction operator do not matter at all for natural language semantics, since, as has already been said, presuppositions do not project through and, this being nothing but a concatenator of subsequent discourse increments. The logical properties of presupposition can only be taken to be epiphenomenal upon an underlying cognitive interpretative mechanism. The language is free to decide when presuppositions appear, these being a semantic property. However, before going into questions of this nature we will have a look at an alternative way of setting up the model-theory of a propositional language.

[Fig. 1, repeated: the classical valuation space U for L(a, b, c), with valuations v1 to v8 and columns for a, b, c, ¬a, ¬b, ¬c, a ∧ b, etc.]

Now, instead of letting a sentence express its proposition, which is a set of valuations, we go the other way around. We drop the notion of proposition as defined in standard model-theory, and consider a valuation vn to be a function from sentences of L to truth-values, that is, as the set of sentences of L for which vn has the value 1. Thus, e.g., v2 = {b, c, ¬a, ...} in fig. 1. It is easily seen that for a language L all of whose sentences are atomic

λe[JUDGe & RASHe & CONTe = ⟨λx[x is happy], Laurie⟩]

Now consider the noetic events described in the following two cases:

Plainly, e3 is an instance of J'', e4 is not, for e3 does while e4 does not have the property of being rash. However, both e3 and e4 are judgments, hence the events have the same relational constituent. And both have the circumstance of Laurie's being happy as their content. Since they have relational constituent and content in common, but one is an instance of J'' while the other is not, J'' fails to satisfy clause a) of (D9). Similar considerations would also show that neither W nor Q satisfy this clause of the definition. What are some examples of noetic event types that do satisfy the definiens of (D9)? Here are some:

J*: Thinking that Fred is careful
W*: Wishing that Mark would not be anxious
Q*: Wondering whether Laurie is happy

Take J* (analogous points can be made, mutatis mutandis, with respect to the other two). The first component of any event that is an instance of J* is the relation of d-thinking; the content of any such event is the circumstance of Fred's being careful. Unlike the cases of J and J', no two instances of this type will involve different contents. Unlike the case of J'', no two noetic events that agree on noetic relation and content will split with respect to instantiating J*. Either both will be instances or both not.


a) Mark is thinking that Laurie is happy, but he is doing so very rashly: he has just considered the matter, and without any good evidence has hastily come to the conclusion that she is happy because, he "reasoned", all blondes are happy. Let 'e3' denote the noetic event here of Mark's thinking that Laurie is happy.

b) Fred is thinking that Laurie is happy, but he is not doing so at all rashly: he has just seen Laurie, seen that she is smiling and knows from much past experience that Laurie is a person who smiles only if happy. A cautious person, Fred has reflected for a moment and concluded that Laurie is happy. Call the noetic event here of Fred's thinking that Laurie is happy, 'e4'.

I propose, then, that the noetic event types satisfying (D9) are objects of thought; that is:

(T2) All content-specific noetic event types are objects of thought

4. THE NEW CONCEPTION AT WORK

I wish to consider an assignment relating sentences and events of thinking to content-specific noetic event types. The assignment applies to a very simple fragment of English including imperatives, interrogatives and indicatives, and some rudimentary that- and whether-clauses that can be embedded in sentences that serve to report events of thinking. The assignment is intended to work as follows. The type assigned to a sentence will serve as the object of thought expressed by the sentence; the type assigned to a that- or whether-clause is intended to serve as the object of thought for an event reported by a sentence embedding that clause. In the long run, I should want to consider a much richer fragment than the one to be considered here. For the purposes of this paper, only a simple illustration is needed, sufficing to give an idea of i) the association of noetic event types as objects of thought to sentences and to events of thinking, and ii) how my proposal circumvents the problem faced by circumstantial accounts in identifying the objects of non-doxastic thoughts. I'll begin by citing the vocabulary and formation rules of the fragment. Then I'll specify a compositional assignment of sentences, and of that- and whether-clauses, to content-specific noetic event types.

The English fragment F

Most of the names and verb phrases of the fragment are already present in the language in which we have been preparing to discuss the assignment.9


What is common to diverse events of thinking in virtue of which we may correctly say that they have the same object is, I claim, at least in some cases the content-specific noetic event type of which the diverse events are all instances. So, I propose that at least in some such cases, it is that content-specific type, the one having the diverse events of thinking as its instances, the very property itself, that is what we are referring to when we speak of what the subjects of those events are thinking. The object of thought just is that type. Let us turn now to consider some applications of this proposal, in particular, how (P4) may be upheld.

i) Let N and ⌜is Φ⌝ be, respectively, a name and a one-place predicate; then Φ' is a sentential clause if either
a) Φ' is an indicative clause (= ⌜that N is Φ⌝),
b) Φ' is an interrogative clause (= ⌜whether N is Φ⌝), or
c) Φ' is a subjunctive clause (= ⌜that N would be Φ⌝).

ii) Φ is a one-place predicate iff either
i) Φ is a basic one-place predicate of F, or
ii) Φ is the result of attaching either
a) an indicative clause to the right of the two-place predicate 'is thinking',
b) an interrogative clause to the right of the two-place predicate 'is wondering', or
c) a subjunctive clause to the right of the two-place predicate 'is wishing'.

iii) Nothing is a sentential clause or a one-place predicate unless obtained from steps i) or ii).

The definition of 'sentence of F' will appeal to three syntactic operations acting on name/predicate pairs.

- The sentential operations, IND, INT and IMP: Where N is a name and ⌜is Φ⌝ a one-place predicate of F,
IND(N, ⌜is Φ⌝), an indicative, = ⌜N is Φ.⌝
INT(N, ⌜is Φ⌝), an interrogative, = ⌜Is N Φ?⌝
IMP(N, ⌜is Φ⌝), an imperative, = ⌜N, be Φ.⌝

All indicatives, interrogatives and imperatives are obtained by application


- The Basic Vocabulary of F:
Names: 'Michael', 'Barbara', 'Fred', 'Laurie', 'Mark'
Basic one-place predicates: 'is careful', 'is anxious', 'is happy'
Two-place predicates: 'is thinking', 'is wondering', 'is wishing'
Subjunctive and imperative copula: 'be'
Modal auxiliary: 'would'
Complementizers: 'that', 'whether'
Punctuation: ',' '.' '?'

Besides the basic vocabulary cited here, there are complex expressions in F.

- Sentential Clauses and One-place Predicates: The definitions here proceed by simultaneous induction:

of the above operations to pairs of name and one-place predicate. Now the definition of sentence of F is straightforward.

Sentence of F: Φ is a sentence of F iff Φ is either an indicative, an imperative, or an interrogative.

Besides simple sentences such as

Laurie is happy.

this definition of sentence, together with the recursive specifications of sentential clause and one-place predicate, gives us much more complex sentences consisting of multiply embedded sentential clauses. For example, the following are well-formed sentences of F:

Michael is thinking that Fred is wondering whether Laurie is happy.
Mark is wishing that Barbara would be wishing that Mark would be careful.

The next step is to specify a function assigning to each sentence of F a noetic event type that fits our intuitions about what is expressed by S.
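The formation rules of F are simple enough to be sketched as string-building functions in Python. This is an illustration under assumptions of my own: a one-place predicate ⌜is Φ⌝ is represented by the string Φ that follows the copula, a sentential clause by a pair of its kind and its text, and the imperative is rendered with a comma after the name, following the example "Fred, be careful." The function names `clause` and `attitude` are hypothetical.

```python
# Each attitude verb takes exactly one kind of sentential clause.
ATTITUDES = {
    "thinking": "indicative",
    "wondering": "interrogative",
    "wishing": "subjunctive",
}

CLAUSE_TEMPLATES = {
    "indicative": "that {n} is {p}",
    "interrogative": "whether {n} is {p}",
    "subjunctive": "that {n} would be {p}",
}

def clause(kind, name, pred):
    """Sentential clause built from a name and the predicate 'is <pred>'."""
    return kind, CLAUSE_TEMPLATES[kind].format(n=name, p=pred)

def attitude(verb, cl):
    """Complex one-place predicate: an attitude verb with a clause to its right."""
    kind, text = cl
    if ATTITUDES[verb] != kind:
        raise ValueError(f"'is {verb}' does not take a clause of kind {kind}")
    return f"{verb} {text}"

def IND(name, pred):
    return f"{name} is {pred}."

def INT(name, pred):
    return f"Is {name} {pred}?"

def IMP(name, pred):
    return f"{name}, be {pred}."

nested = IND("Michael", attitude("thinking", clause(
    "indicative", "Fred", attitude("wondering", clause(
        "interrogative", "Laurie", "happy")))))
```

The recursion in `attitude` and `clause` is what yields the multiply embedded sentences; `nested` rebuilds the first of the two examples given in the text.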

The assignment

The assignment function, *, is compositional, operating on names, one-place predicates, sentential clauses and sentences, as follows.

i) Where N is a name, N* is just the denotation of the name.

Examples: 'Barbara'* = Barbara, 'Fred'* = Fred

Specification of * for one-place predicates goes in three steps, first to basic one-place predicates, then to sentential clauses, then to the complex, one-place predicates formed with the sentential clauses.

ii) Where P is a basic one-place predicate of F, P* is the denotation of the λ-term formed from that predicate.

(Keep in mind F has some predicates in common with our background language.)


Mark is anxious .

Example: 'is careful'* = λx[x is careful] (i.e., the property of being careful)

iii) Where Φ is a sentential clause,
a) if Φ is ⌜that N is P⌝, Φ* = λe[JUDGe & CONTe = ⟨⌜is P⌝*, N*⟩]
b) if Φ is ⌜that N would be P⌝, Φ* = λe[WISHe & CONTe = ⟨⌜is P⌝*, N*⟩]
c) if Φ is ⌜whether N is P⌝, Φ* = λe[QUESe & CONTe = ⟨⌜is P⌝*, N*⟩]

Example: 'that Barbara is careful'*
= λe[JUDGe & CONTe = ⟨'is careful'*, 'Barbara'*⟩]
= λe[JUDGe & CONTe = ⟨λx[x is careful], Barbara⟩]

iv) Where P is a complex, one-place predicate, and Φ is the sentential clause of P: P* = λx[∃e(Φ*e & SUBJe = x)]

An example: 'is thinking that Barbara is careful'*
= λx[∃e('that Barbara is careful'*e & SUBJe = x)]

(See previous example to confirm the *-assignment cited here for 'that Barbara is careful'.)

= λx[∃e(λe'[JUDGe' & CONTe' = ⟨λy[y is careful], Barbara⟩]e & SUBJe = x)],

and by eliminating the λe'-term inside, we can simplify further:

= λx[∃e(JUDGe & CONTe = ⟨λy[y is careful], Barbara⟩ & SUBJe = x)].

The result: the predicate 'is thinking that Barbara is careful' is assigned the property that a person has in virtue of engaging in a judgment whose content is the circumstance of Barbara's being careful. Finally, * acts on sentences according to the following:

v) Where S is a sentence of F, N a name of the basic vocabulary, and P a one-place predicate of F, then either
a) S = IND(N, P) and S* = λe[JUDGe & CONTe = ⟨P*, N*⟩], or
b) S = IMP(N, P) and S* = λe[WISHe & CONTe = ⟨P*, N*⟩], or


Next, the complex, one-place predicates. In general, for any complex, one-place predicate of F, there is a largest sentential clause from which it has been formed; call this "the sentential clause" of the predicate. Then we posit :

c) S = INT(N, P) and S* = λe[QUESe & CONTe = ⟨P*, N*⟩].
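The compositional character of the *-assignment can be made concrete with a small Python sketch. It is an illustration only: event types are modelled as records pairing a noetic relation with a content, a content as a pair of property and individual, and names stand in for their denotations. All class and function names, and the string tags used for clause kinds and moods, are assumptions of mine.

```python
from typing import NamedTuple

class EventType(NamedTuple):
    relation: str     # "JUDG", "WISH" or "QUES"
    content: tuple    # (property, individual), standing in for a circumstance

CLAUSE_RELATION = {"that-is": "JUDG", "that-would-be": "WISH", "whether-is": "QUES"}
MOOD_RELATION = {"IND": "JUDG", "IMP": "WISH", "INT": "QUES"}

def star_name(name):
    """Clause i): the name string stands in for the denotation of the name."""
    return name

def star_pred(pred):
    """Clauses ii) and iv): basic predicates are strings, complex ones are pairs."""
    if isinstance(pred, str):
        return ("property", pred)
    _verb, clause = pred      # the clause kind already fixes the noetic relation
    return ("subject-of", star_clause(clause))

def star_clause(clause):
    """Clause iii): a sentential clause is assigned a content-specific type."""
    kind, name, pred = clause
    return EventType(CLAUSE_RELATION[kind], (star_pred(pred), star_name(name)))

def star_sentence(mood, name, pred):
    """Clause v): the mood of the sentence fixes the noetic relation."""
    return EventType(MOOD_RELATION[mood], (star_pred(pred), star_name(name)))

trio = [star_sentence(mood, "Fred", "careful") for mood in ("IND", "IMP", "INT")]
```

On this encoding the indicative, imperative and interrogative built from 'Fred' and 'is careful' receive three distinct types that share a single content, and a that-clause receives the same type as the corresponding indicative sentence.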

I'm set now to fulfill some promises. I wish to advance two proposals, one identifying the things expressed by the sentences of F, the other identifying the objects of certain noetic events. The proposals place constraints on what things are expressed by sentences (of F), and what things are the objects of thought (of d-thinkings, wishings and wonderings), and together these two proposals entail (P4). The first is very straightforward:

∀Φ (if Φ is a sentence of F, then Φ expresses Φ*)

The claim, then, is that with respect to the sentences of this simple fragment of English, what such a sentence expresses is the noetic event type assigned to it by the * function. I'll consider some examples shortly to motivate the claim. The second thesis associates noetic events with their objects:

(T4) ∀e ∀c:
i) the object of e = λe[JUDGe & CONTe = c] iff JUDGe & CONTe = c
ii) the object of e = λe[WISHe & CONTe = c] iff WISHe & CONTe = c
iii) the object of e = λe[QUESe & CONTe = c] iff QUESe & CONTe = c

Put a little roughly, (T4) tells us that the objects of events of d-thinking, wishing and wondering are simply their respective content-specific types, and that only such events have those types as their objects. Let's turn to examine consequences of (T3) and (T4), and to see how these proposals accommodate (P4).

What some indicatives, imperatives and interrogatives express

From (T3) and (T4) together with some facts we noted about the assignment function * we get the following result:

(R2)
i) indicatives of F express objects of d-thinkings.
ii) imperatives of F express objects of wishings.
iii) interrogatives of F express objects of wonderings.

It is not hard to see how (R2) is derived. Take the case of indicatives - clause i). Let Φ be any indicative sentence of F. By (T3), Φ expresses Φ*. But if you examine how * operates on indicative sentences, you'll see that Φ* is a


(T3)

content-specific judgment type. By (T4), such types are the objects of d-thinkings. Since choice of Φ was arbitrary among the indicatives of F, we get that the indicatives of F express objects of d-thinkings. Exactly analogous reasoning yields clauses ii) and iii). (P4) follows directly from (R2). It has been a long-standing puzzle what to say are the things expressed by imperatives and interrogatives. Moreover, consider the following sentences of F:

(16) Fred is careful.

(17) Fred, be careful.

(18) Is Fred careful?

One is strongly inclined to say that no one of these sentences expresses the same thing as any of the others. Nevertheless, t here does seem to be a sense in which, though these sentences express different things, the things they ex press may be said to concern the same state of affairs or circumstance that of Fred's being careful. These points concerning ( 1 6) - ( 1 8) seem to hold , generally, for all such trios of indicative, imperative and interrogative formed from the same name and predicate. Finally, I think it is natural to hold that , in general, there is a similarity in kind between things expressed by any of the three moods, a common genus of things that are expressed by our utterance of sentences. Here, then , are some intuitive desiderata: For any trio of indicative, interrogative and imperative formed from the same name and predicate, a) each one expresses something different from the thing ex pressed by either of the others, but b) the things they express may all be said in a certain sense to con cern the same circumstance and c) things expressed by (utterances of) sentences fall under a com mon, natural genus. On the present view, with respect to the sentences of the fragment, F, each of these points is satisfied. Before showing this to be the case, let me review briefly some matters that were raised when I first discussed the notion of event type. I noted then that our ordinary usage of the terms 'j udgment' , 'wish ' , ' question' and 'thought' is ambiguous. When w e use one o f these terms we may mean to be speaking of a noetic event, but we also may mean a noetic event type. From here on, when I wish to speak of noetic events , I'll use of 'judgment' (for d-thoughts) , 'wishing ' , or 'wondering' . When I mean to speak of noetic event types, I adopt a slight contrivance in order to dis-


( 1 6)

30 1

(T5)

the JUDGMENT that N is 0 < I is 0 l •, N• > 1

(T6)

the WISH that N would be 0 ( 'is 0 ' • , N• ) ]

(T7)

the QUESTION whether N is 0 = Ae[QUESe & CONTe N• ) ]

=

=

Ae[JUDGe & CONTe Ae[WISHe & CONTe = =

( 'is 0'•,

Armed with these identities, let's see how we arrive at the desiderata, a) -c), cited above. I ' ll work backwards. The three categories of sentence in the fragment are associated by • with different categories of object of thought. Every item assigned by • to a sen tence of F is of the same basic genus as any other: they are all noetic event types. Indicatives (true/false ones) express J UDGMENTS, interrogatives express QUESTIONS, and imperatives express WISHES. It sounds natural enough, but moreover, we have provided a precise account of what these things are, and which of them gets expressed by which indicative, imperative or interrogative in our fragment. So point c) is satisfied . Now consider what's expressed according to the present proposals by the sample sentences ( 1 6) - ( 1 8). These get assigned, respectively, to the follow ing types (the reader is invited to check the results): ( 1 6•)

Ae[JUDGe & CONTe

( 1 7•)

Ae[WISHe & CONTe

( Ax [x is careful] , Fred ) ]

( 1 8•)

Ae[QUESe & CONTe

( Ax [x i s careful] , Fred ) ]

=

( Ax [x is careful] , Fred ) ]

1 ( 1 6•)- ( 1 8•) are different things. 0 Nevertheless, there i s a clear sense in which these things expressed by ( 1 6) - ( 1 8) concern the same circumstance: each one is instantiated solely by events having the same circumstance, that


ambiguate: I ' ll use capital letters; in particular , I 'll use ' JUDGMENT' , ' WISH ' , and 'QUESTION' , respectively, for types o f judgment, wishing and wondering. I also contended earlier that when we speak of "the judgment that 0" , "the wish that 0" , or "the question whether 0" , where 0 is a sentence, our reference is not ambiguous; we are speaking of types. In fact, when the embedded sentence in such a phrase is a sentence of F, the type denoted is, I claim, a content-specific noetic event ytpe. This claim may now be formu lated precisely; we can specify for every such term exactly which content specific noetic event type it denotes. Let N be a proper name of F, and 0 be any expression such that I is 01 is a one-place predicate of F; then I pro pose that all instances of the following schemas are true:

302 of Fred 's being careful, as their content . And it is easy to see that analogous points hold true with respect to the • assignments to any trio of sentences of F formed from the same name and predicate. So we get points a) and b).
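As a rough illustration only, the way the assignment function pairs a mood with a content can be mimicked in a few lines of Python. Everything named here (`Circumstance`, `NoeticEventType`, `star`, and the string labels) is a hypothetical stand-in introduced for this sketch; strings stand in for the property and the individual, and nothing in it is meant as an analysis of what noetic event types are.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Circumstance:
    """Toy stand-in for a circumstance such as Fred's being careful."""
    predicate: str  # stands in for the property assigned to the predicate
    subject: str    # stands in for the individual assigned to the name


@dataclass(frozen=True)
class NoeticEventType:
    """Toy stand-in for a content-specific type: a kind plus a content."""
    kind: str  # "JUDG", "WISH" or "QUES"
    content: Circumstance


# Hypothetical encoding of how the assignment function treats the three moods.
KIND_BY_MOOD = {"indicative": "JUDG", "imperative": "WISH", "interrogative": "QUES"}


def star(mood: str, name: str, predicate: str) -> NoeticEventType:
    """Assign a content-specific type to a sentence given as (mood, name, predicate)."""
    return NoeticEventType(KIND_BY_MOOD[mood], Circumstance(predicate, name))


s16 = star("indicative", "Fred", "careful")     # Fred is careful.
s17 = star("imperative", "Fred", "careful")     # Fred, be careful.
s18 = star("interrogative", "Fred", "careful")  # Is Fred careful?

# a) the three sentences are assigned pairwise distinct types
distinct = len({s16, s17, s18}) == 3
# b) all three types share one and the same content
same_content = s16.content == s17.content == s18.content
# c) all three values belong to a single genus (here: one Python class)
same_genus = all(isinstance(t, NoeticEventType) for t in (s16, s17, s18))
```

On this toy encoding the three desiderata come out as three simple checks: distinctness of the values, identity of their contents, and membership in one class.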

Objects of thought

Let us see how the problem of the objects of non-doxastic thoughts gets handled on the present view. First, let's consider the case of judgments. If both of these sentences

(5) Laurie is thinking that Mark is anxious

(6) Ed is thinking that Mark is anxious

are true, then we should say that Laurie and Ed are thinking the same thing. The circumstantial account that was considered and rejected in section II gets this right; according to it, both noetic events reported by these sentences have the same object: the circumstance of Mark's being anxious. Can the present account use noetic event types to equal advantage? Yes. According to the present account, the object of each of the events reported by (5) and (6) is a JUDGMENT, and moreover, the same JUDGMENT is associated with both cases, namely:

the JUDGMENT that Mark is anxious $= \lambda e[\mathrm{JUDG}\,e \;\&\; \mathrm{CONT}\,e = \langle \lambda x[x \text{ is anxious}], \text{Mark} \rangle]$

But now what about non-doxastic thoughts? Consider, again, sentences:

(7) Barbara is wondering whether Mark is careful

(8) Fred is wondering whether Mark is careful

The trick is to assign a common object to the thoughts reported by these sentences, distinct from the object of the thought reported by

(9) Ed is thinking that Mark is careful

The circumstantial account could not handle this problem. On the present account there's no difficulty. (7) and (8) both report thoughts whose object is the type,

(type1) $\lambda e[\mathrm{QUES}\,e \;\&\; \mathrm{CONT}\,e = \langle \lambda x[x \text{ is careful}], \text{Mark} \rangle]$

this is a different thing from the object of the judgment reported by (9), namely

(type2) $\lambda e[\mathrm{JUDG}\,e \;\&\; \mathrm{CONT}\,e = \langle \lambda x[x \text{ is careful}], \text{Mark} \rangle]$

(type1) is a QUESTION, (type2) is a JUDGMENT. This precisely fits the intuitions I cited for claiming that we should not say on the basis of the truth of (7)-(9) that Ed is thinking the same thing as Fred and Barbara. I said then that it is a question that is crossing the minds of the latter two whereas a judgment is crossing Ed's mind; that was correct, but now the claim can be made without ambiguity: it is a QUESTION that is crossing their minds. Moreover, as desired, we can say that what is crossing Fred's and Barbara's minds is something that, like what's crossing Ed's mind (if (7)-(9) are true), concerns Mark's being careful. For all events that instantiate either (type1) or (type2) have the circumstance of Mark's being careful as their content.

Let me summarize what has been accomplished so far. I am advancing a view that consists of proposals (T1)-(T7). Here are some of its features:

1) the view affords a conception of the objects of thought according to which they are seen to fall under a single, natural and recognizable genus,
2) the view entails (P4), and
3) the resulting conception of the objects of thought does not succumb to the problem posed in Section 2 concerning the objects of non-doxastic thoughts.

These were the merits I claimed for the account at the end of Section 2. Moreover, with respect to our fragment, F, the view affords a clear and natural answer to the question: What do imperative and interrogative sentences express? The answer accords with what the untutored might have been inclined to say all along: that imperatives express wishes and interrogatives express questions (the untutored don't know to use capitals).

5. PROPOSITIONS AND CONCLUSIONS

Propositions, judgments and truth

So far, I have isolated a class of JUDGMENTS, WISHES and QUESTIONS all of which, I claim, are objects of thought. But I haven't said anything yet about the objects of thought that have been the principal concern in the Fregean Tradition; I haven't said anything yet about propositions. According to (D2), a proposition is a truth-valued object of thought. Then, a full account of the nature of propositions ought to do two things: 1) it should pick out a class of objects of thought that bear truth-values, and 2) it should explain what it is for such items to be true or false.

The account I am about to give parallels in certain respects the preceding account of objects of thought, generally. From (T1), the thesis that all objects of thought are noetic event types, together with (D2), the definition of proposition, it follows that all propositions are noetic event types. But since I do not claim that all such types are objects of thought, a fortiori they are not all propositions. Nor can I say in any informative way which noetic event types are propositions. What I shall do, though, analogous to the course taken above in addressing the general question of which noetic event types are objects of thought, is to mark off a subclass of types whose members all are propositions. The character of the members of this subclass is illustrative of the character of propositions at large. Or so I claim.

In the previous section, content-specific JUDGMENTS were associated by the function $^{*}$ to the indicative sentences of F. My proposal is that all such noetic event types are propositions:

(T8) All content-specific JUDGMENTS are propositions

I also propose to accept certain identifications concerning which propositions are which JUDGMENTS. Where N and $\ulcorner \text{is } \varphi \urcorner$ are, respectively, a name and one-place predicate of F, then instances of the following hold:

(T9) The proposition that N is $\varphi$ $= \lambda e[\mathrm{JUDG}\,e \;\&\; \mathrm{CONT}\,e = \langle \ulcorner \text{is } \varphi \urcorner^{*}, N^{*} \rangle]$

As a consequence of (T9) and (T3), we get all instances of:

(19) $\ulcorner \text{N is } \varphi \urcorner$ expresses the proposition that N is $\varphi$.

The claims asserted by instances of this schema seem to me to be exactly what we should want.

I wish to turn now to the matter of truth-value. If content-specific JUDGMENTS are propositions, then such JUDGMENTS must all be possibly such as to have truth-values. What makes a JUDGMENT true or false?

Roughly, my proposal is this: a JUDGMENT is true just in case its content obtains. This is rough in part because, strictly speaking, the notion of content has been defined for noetic events, not their types. But we can fix that:

(D10) $\mathrm{cont}\,T =_{df}$ the $c$ such that $\Box(\forall e\,(\text{if } e \text{ instantiates } T, \text{ then } \mathrm{CONT}\,e = c))$

The content of a type, then, is the circumstance such that necessarily, any event instantiating the type has that circumstance as its content. Now we can say, speaking strictly of the content of a type:

(D11) $T$ is true $=_{df}$ $T$ is a JUDGMENT & $\mathrm{cont}\,T$ obtains.

Let's consider a simple illustration of how (D11) applies to the particular JUDGMENTS I have singled out as propositions. Here is a representative proposition:

(type3) $\lambda e[\mathrm{JUDG}\,e \;\&\; \mathrm{CONT}\,e = \langle \text{'is astute'}^{*}, \text{'Michael'}^{*} \rangle]$

This is a proposition because it is a content-specific JUDGMENT. Since it is a proposition, it is possibly true or false. Under what conditions is it true? By (D11): just in case its content obtains. So we need to establish which circumstance is the content of the type. No event can instantiate (type3) unless the content of that event is the circumstance of Michael's being astute (Hint: apply $^{*}$ to the relevant vocabulary items and consider the fact that (type3) is specific with respect to content). Then by (D10), Michael's being astute is the content of the type. We have the result, then,

(20) The proposition that Michael is astute is true iff the circumstance of Michael's being astute obtains.

This accords, it seems to me, with what we should have expected to be the case, a priori.

Conclusions

Let's consider a brief list of some of the important things not accomplished in this paper; then, in defense, I'll conclude by pointing out how, in many ways, far less might have been accomplished. A short list of shortcomings:

i) No account has been given here of how the logical or semantic relations can be seen as operating with these "new" objects of thought making up their fields.
ii) No set of necessary and jointly sufficient conditions has been given for being an object of thought.
iii) The fragment considered was meager; it included no complex singulars (e.g. definite descriptions) and no quantification, let alone such difficult items as indexicals, demonstratives, or any sort of intensional operator.

I think that it is reasonable to undertake study of the logical and semantic relations independently of having a clear conception of what the terms of those relations are. By the same token, I think it a reasonable task to undertake a study of the nature of the terms of those relations independently of exploring the logical characteristics of those terms. This latter task is the one I have taken up here. It might be claimed that the logical and semantic relations are definitive of their terms, that there simply are no characteristic features of those terms other than those features having to do with how such terms behave under the logical and semantic relations (a number just is whatever obeys the Peano Postulates). Any proponent of The Tradition who holds that the objects of thought are terms of logical and semantic relations should not accept this perspective. The cornerstones of the Tradition and related extensions, like our thesis (P4), provide a wealth of characteristic features independent of the logical and semantic ones.

It is true that I have offered no set of necessary and jointly sufficient conditions for a thing's being an object of thought. However, I have offered a necessary condition: a new conception of what genus of entity the objects of thought fall under. I have suggested that we look at the objects of thought as types of thinking, rather than as circumstantial entities. One advantage of this new conception is that, in contrast to the circumstantial conceptions that have arisen within the Fregean Tradition, we have a coherent account of the objects of non-doxastic thinkings. I have also made what seems to me a good start at deploying this new conception, for I have used it to isolate the things expressed by (utterances of) sentences of several very familiar albeit simple forms. The results of this deployment seem to me to accord exactly with our intuitions.

It is true that no logically tricky or complex constructions are included in the fragment. But my purpose in presenting a fragment of English has not been to show how this new view could afford insights into the proper treatment of the logic and semantics of natural language. It was intended to serve in illustrating how the present account supports the truth of (P4) while handling the particular problem addressed in section II, of identifying the objects of non-doxastic thoughts. It seems to me the fragment serves this purpose well enough. I do tend to think the alternative conception of objects of thought laid out here offers new handles in grappling with some long-standing problems in various areas of semantic analysis. Discussion of such putative applications, though, will have to be left for occasions other than the present one.

NOTES

1. See, for example, Frege (1956), Russell (1918) and Moore (1953, chap. 3).

2. According to the standard dictionary entry, to say something is "noetic" is to say that it is "of or pertaining to . . . the intellect; characterized by, or consisting in intellectual activity". This seems reasonably appropriate. Husserl made use of a related locution, 'noema', in his work on intentionality. My use of 'noetic' is actually derived from some of Alvin Plantinga's recent work in epistemology. Plantinga speaks of a person's "noetic structure", by which he means, roughly, the structure of propositions that comprises the person's beliefs, ordered according to their epistemic status for the person. I don't know whether Plantinga's terminology is derived directly from Husserl's.

3. Certainly states of belief have objects as well; perhaps objects of the same sort. Attention shall be focused here, though, on treatment of occurrent events and acts of thinking. I do not think this policy jeopardizes any central contentions or proposals in the paper.

4. Aren't contextual features such as the time and place of an utterance, who is the utterer, who is being addressed essential features of an utterance? Otherwise, such features as these would also have to determine parameters of our basic locution.

5. There are two important qualifications here. (1) Frege was certainly not the first to propose the view that truth-valued objects of thought are the primary bearers of semantic and logical properties and relations. His teachers, Windelband and Lotze, preceded Frege in holding this view. Indeed, the view was already expressed in the work of the medieval logicians. I owe these points to Goran Sundholm, whose paper at the Cleves conference tracked the historical development of the view with great care. I stick with the phrase 'Fregean Tradition' only because Frege is unquestionably a leading proponent of this view in the recent history of logic, philosophy and semantics. (2) Though Frege held that there are items satisfying the right side of (D2), he did not use the term 'proposition' to apply to them, but rather the term 'thought'. Cf. Frege, op. cit.

6. Kaplan (1974).

7. For Hare, see (1952, chapts. 2 & 12); for Stenius, (1967).

8. I recommend to the reader's attention R. Montague's (1969). His proposal there on how to treat "object talk" of various sorts (what is experienced, what is reported, etc.) is analogous to the proposal made in this paper about proper treatment of some of our talk of objects of thought.

9. This section was influenced by some material in Terry Parsons' (1985).

10. Whether these are distinct noetic event types depends on the proper analysis of their instances. In the relevant cases, I think the matter is pretty clear-cut. However, I am not claiming that an event instantiating one of the generic types $\lambda e[\mathrm{JUDG}\,e]$, $\lambda e[\mathrm{WISH}\,e]$ and $\lambda e[\mathrm{QUES}\,e]$ can't also instantiate one of the others. Maybe, for example, it can be made out that questions comprise a species of wish. This idea is actually at the heart of a fairly current proposal on the proper semantic treatment of questions, cf. Aqvist (1965).

REFERENCES

Aqvist, L. 1965: A New Approach to the Logical Theory of Interrogatives, Part I: Analysis. Uppsala.
Frege, G. 1956: The thought: a logical inquiry. Translated by A. Quinton and H. Quinton, Mind LXV: 289-311.
Hare, R. M. 1952: The Language of Morals. Oxford.
Kaplan, D. 1974: The ramified theory of types. Unpublished.
Montague, R. 1969: On the nature of certain philosophical entities. Monist 53: 159-194.
Moore, G. E. 1953: Some Main Problems of Philosophy. Allen & Unwin.
Parsons, T. 1985: Underlying events in English. In E. LePore and B. McLaughlin (eds.), Actions and Events: Perspectives on the Philosophy of Donald Davidson, Basil Blackwell, Oxford, pp. 235-267.
Russell, B. 1918: The philosophy of logical atomism. Reprinted in R. Marsh (ed.), Logic and Knowledge, Allen & Unwin, 1956.
Stenius, E. 1967: Mood and language-game. Synthese 17: 254-274.

Journal of Semantics 6: 309-344

A COMPUTATIONAL ACCOUNT OF SYNTACTIC, SEMANTIC AND DISCOURSE PRINCIPLES FOR ANAPHORA RESOLUTION

NICHOLAS ASHER and HAJIME WADA

ABSTRACT

We present a unified framework for the computational implementation of syntactic, semantic, pragmatic and even "stylistic" constraints on anaphora. We build on our BUILDRS implementation of Discourse Representation (DR) Theory and Lexical Functional Grammar (LFG) discussed in Wada & Asher (1986). We develop and argue for a semantically based processing model for anaphora resolution that exploits a number of desirable features: (1) the partial semantics provided by the discourse representation structures (DRSs) of DR theory, (2) the use of syntactic and lexical features to filter out unacceptable potential anaphoric antecedents from the set of logically possible antecedents determined by the logical structure of the DRS, (3) the use of pragmatic or discourse constraints, noted by those working on focus, to impose a salience ordering on the set of grammatically acceptable potential antecedents. Only where there is a marked difference in the degree of salience among the possible antecedents does the salience ranking allow us to make predictions on preferred readings. In cases where the difference is extreme, we predict the discourse to be infelicitous if, because of other constraints, one of the markedly less salient antecedents must be linked with the pronoun. We also briefly consider the applications of our processing model to other definite noun phrases besides anaphoric pronouns.

1. INTRODUCTION

The analysis of anaphora and an efficient method for searching for anaphoric antecedents are a central problem of computational linguistics. Intrasentential anaphora involving singular noun phrase antecedents has been for several years a central topic of syntax. More recently intrasentential and intersentential anaphora has received much attention from those working in semantics, pragmatics and discourse theory. The problem of anaphora is also of central concern to computer scientists working on natural language understanding systems. This research has led to a host of constraints within different paradigms on the anaphoric process. We present a semantic framework that integrates these different constraints (syntactic, semantic, pragmatic and even "stylistic") into a unified model of anaphora resolution.

From our perspective anaphora resolution is the process by means of which we come to identify or otherwise relate a "discourse referent," in the sense of Karttunen (1976) and Kamp (1981), introduced by an anaphoric pronoun, with a discourse referent introduced by a possible antecedent. This identification (or other relation) helps to determine the truth conditions of a discourse containing an anaphoric pronoun, but the information that determines the identification itself is not purely semantic, syntactic or pragmatic. Thus, modelling the process of anaphora resolution forces us to confront something that everyone knows in principle but doesn't often face up to in practice: the tools of syntax and semantics won't suffice to determine truth conditions for discourses in general (and in particular those containing anaphoric elements but also many others as well). So in order to have some hope of mechanically generating reasonable, determinate truth conditions (presumably the task of semantics), we need to have a theory which integrates semantic and syntactic information with less well understood notions relevant to pragmatics and discourse structure. We view our work on anaphora, as well as other work on anaphora within DR theory and related paradigms, as an attempt toward the construction of such a theory.

Building on the BUILDRS implementation of Discourse Representation (DR) Theory and Lexical Functional Grammar (LFG) discussed in Wada & Asher (1986), we will describe a computational implementation of our theory of anaphora resolution. We will develop a processing model for anaphora resolution that exploits a number of desirable features: (1) the partial semantics provided by the discourse representation structures (DRSs) of DR theory, (2) the use of syntactic and lexical features to filter out unacceptable potential anaphoric antecedents, (3) the use of pragmatic or discourse constraints, noted by those working on the notion of local focus or topic.¹

The process of anaphora resolution is a process that finds the appropriate anaphoric antecedent discourse referent to be identified or otherwise related to a discourse referent introduced by an anaphoric pronoun. The various constraints provided by syntax, semantics, pragmatics and discourse theory "weed out" potential candidates for anaphoric linkage until a unique or appropriate discourse referent is found. The constraints are ordered in such a way as to minimize backtracking in the search for anaphoric antecedents. There are two sorts of constraints: absolute and interpretive. The absolute constraints, which include constraints based on the logical structure of the DRS as well as constraints relying on syntactic and lexical information, rule out possible anaphoric antecedents absolutely; the absolute constraints are applied first in our implementation. The interpretive constraints impose a preference salience ordering on the set of potential antecedents that are lexically, semantically and syntactically acceptable.

According to our processing model, the salience ordering works in the following way. Where there is no marked difference in the degree of salience among the grammatically and logically acceptable potential antecedents of some anaphoric pronoun, the salience ordering provides no useful information. Where there is a marked difference in the degree of salience among the antecedents, it allows us to make predictions on preferred readings. In cases where the difference is extreme, we predict the discourse to be infelicitous if, because of world knowledge, one of the markedly less salient antecedents must be linked with the pronoun. We apply our processing model also to the processing of definite descriptions, which exhibit, on the familiarity theory of definiteness,² a similar anaphoric behavior to that of anaphoric pronouns. In the concluding section, we compare our work to some alternatives extant in the linguistic and computational literature.
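The two-stage ordering of constraints just described (absolute constraints first, then a salience ordering that is consulted only when the differences are marked) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the PROLOG code of BUILDRS: the `Referent` record, the boolean `accessible` flag standing in for the constraints from DRS structure, the numeric `salience` scores and the `MARKED_GAP` threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Referent:
    name: str
    gender: str
    number: str
    accessible: bool  # stand-in for the constraints from the DRS's logical structure
    salience: float   # hypothetical numeric score; the text gives no numbers


MARKED_GAP = 2.0  # hypothetical threshold for a "marked difference" in salience


def absolute_filter(gender: str, number: str, candidates: List[Referent]) -> List[Referent]:
    """Stage 1: rule out antecedents absolutely (accessibility plus agreement)."""
    return [c for c in candidates
            if c.accessible and c.gender == gender and c.number == number]


def preferred_antecedent(survivors: List[Referent]) -> Optional[Referent]:
    """Stage 2: return a preferred reading only if salience differs markedly."""
    ranked = sorted(survivors, key=lambda c: c.salience, reverse=True)
    if not ranked:
        return None
    if len(ranked) == 1:
        return ranked[0]
    if ranked[0].salience - ranked[1].salience >= MARKED_GAP:
        return ranked[0]
    return None  # no marked difference: the ordering gives no useful information


candidates = [
    Referent("u1", "masc", "sing", True, 3.0),
    Referent("u2", "fem", "sing", True, 1.0),
    Referent("u3", "fem", "sing", True, 0.8),
    Referent("u4", "fem", "sing", False, 9.0),  # salient but not accessible
]
survivors = absolute_filter("fem", "sing", candidates)
choice = preferred_antecedent(survivors)
```

Running the filter before the ranking means an inaccessible referent such as `u4` never competes on salience, which is one way to read the remark that the ordering of constraints is meant to minimize backtracking.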

2. THE BASIC FRAMEWORK: BUILDRS

BUILDRS is an implementation of an LFG parser and a DR-theoretic semantic component in PROLOG.³ It constructs a set of DRSs, semantic representations for a discourse, from the set of f-structures of the discourse's constituent sentences delivered by the parser. It also includes an anaphora resolution module for finding antecedents to anaphoric pronouns. The f-structures derived by the parsing component of BUILDRS are standard. The semantic theory underlying DRSs is discussed in detail in Kamp (1981), (1983), Asher (1986), (1987) and to some extent in Wada and Asher (1986). To recapitulate some of those discussions very briefly, a DRS is a pair of sets $\langle U, \mathrm{Con} \rangle$, where $U$ is a set of discourse referents and $\mathrm{Con}$ a set of conditions, i.e. property ascriptions, whose arguments may be discourse referents in $U$. Discourse referents come in various types; we use lower case letters for discourse referents of individual type, and $K_1, \ldots, K_n$ for discourse referents of proposition-type. The other types of discourse referents will not be at issue here. DRSs are in effect partial models for a discourse. By embedding the DRS for a discourse within a total model, we provide the discourse with a truth conditional interpretation. On a proper embedding $g$ of a DRS $K$ in a model $M$, discourse referents in $U_K$ are mapped onto individuals in the domain of $M$ such that all the conditions in $\mathrm{Con}_K$ are satisfied in $M$ relative to the assignment of objects to discourse referents provided by $g$.⁴

The mapping from f-structures to DRSs works basically as follows.⁵ Suppose $D$ is a discourse consisting of sentences $S_1, \ldots, S_n$. The parser component begins by mapping $S_1$ into an f-structure $F_1$. Then the DRS construction algorithm converts $F_1$ into a DRS $D_1$. Now the process starts over again and $S_2$ is mapped into an f-structure $F_2$; the DRS construction algorithm now builds $D_2$ from $F_2$. Now, however, a new step must be performed. The construction algorithm now incorporates the new information in $D_2$ into the already given information in $D_1$ to get a new DRS for the two sentence discourse $\langle S_1, S_2 \rangle$. $D_1$ acts as a context used in the interpretation of the new information in $D_2$. The processes of f-structure construction, DRS construction, and DRS "amalgamation" continue until all the sentences of the discourse are processed.

The parser component of the system generates a discourse referent with a unique numerical tag whenever it encounters the determiner of a singular or plural NP or a proper name or a bare plural. This discourse referent serves as an index for a noun phrase, but it is used only to coindex gaps, pro and NPs in such phenomena as long distance dependencies and control phenomena. Our parser contrasts with others attempting to incorporate complex syntactic theories like Government and Binding or various extensions to the original LFG in that we do not attempt to use the syntactic representation or a coindexing mechanism in anaphora resolution.⁶ Anaphora resolution is a process that operates only on semantic representations, DRSs, in our system, although it depends on many sources of information, including of course syntactic information. Syntactic information relevant to anaphora resolution is stored in a database and used or translated into the level of semantic representation. We hope thus to avoid the redundancies that come from straightforwardly mating some established syntactic theory of binding with DR theory.⁷ But we also analyse anaphora at the level of semantic representation, because we believe that anaphora must be understood semantically, as part of the determination of truth conditions of a discourse.

As an example of the DRS construction component of BUILDRS at work, consider (1). The DRS in (2) results from (1) using BUILDRS:

(1) A man loves a woman. She loves him.

(2)
    $\mathrm{man}(u_1)$
    $\mathrm{woman}(u_2)$
    $\mathrm{loves}(u_1, u_2)$
    $\mathrm{loves}(x_1, x_2)$
    $x_1 = u_2$
    $x_2 = u_1$

Here is a rough sketch of how BUILDRS arrives at (2). The discourse is read into the parser one sentence at a time. The parser returns an F-structure for the first sentence, which looks like this:

313 (2 ' )

subj det a - u 1 pred man gender masc number sing obj det a - u2 pred woman gender fern number sing pred loves ( ( subj ) ( obj ) )


The components o f this F-structure are now translated into partial DRS structures. These are essentially lambda abstracted formulae in which the lambda abstraction is either over discourse referents or sets of conditions. For example in the subj (subject) F-structure in (2 ' ), the a - u 1 value of the det (determiner) slot receives the translation [ud APA.Q [[u 1 ] , [P, Q]] . This structure, which we call a partial DRS, will become a DRS when the property abstracts P and Q are filled in with the appropriate values. A.P and A.Q abstract over sets of conditions, where those conditions are themselves understood as having at least one lambda abstracted argument. The partial DRS specifies that u 1 is to be applied to the lambda abstracted argument in the sets of conditions that will fill in P and Q after A. conversion. The transla tion process also yields the structure A.x [[] man(x)] as the value of the pred slot in the subj F-structure. We call this an incomplete DRS. The informa tion given by the slots gender and number is stored in a database for the dis course referents under the entry for the discourse referent introduced by the determiner. This information will be used later in anaphora resolution . Once the various entries in the F-structure have been translated, the pro gram calls the routine conversion, which simultaneously applies incomplete DRSs to property abstracts and discourse referents to abstracted arguments in the conditions of incomplete DRSs. So for example the conversion of the incomplete DRS and the partial DRS in the subj F-structure of (2 ' ) is the partial DRS [u 1 ) A.Q [[u d . [man(u 1 ) Q]] . A similar partial D RS results for the obj F-structure: [u2) A.Q [ [u2] , [woman(u2), Q] . The main verb or pred slot of the whole F-structure in (2 ' ) is another incomplete DRS, A.xA.y [[], loves(x,y)] . 
The information that the second argument of loves is the Object F-structure while the first argument is the Subject F-structure guides how the application of the partial DRSs to this incomplete DRS is to proceed; the program will always convert the partial DRSs with this incomplete DRS in such a way that u1 ends up in the first position of loves, while u2 ends up in the second. Because of this encoding, λxλy loves(x, y) is equivalent to λyλx loves(x, y). The program can legitimately first convert either the Subj or the Obj F-structure's translation with the main pred's translation. If we first use the Obj F-structure's translation, we get the incomplete DRS λx [[u2], [woman(u2), loves(x, u2)]]. Converting this with the Subj F-structure's translation yields the completed DRS [[u1, u2], [man(u1), woman(u2), loves(u1, u2)]], here given in list notation.8

Now the program turns to processing the second sentence. First the parser returns an F-structure, and the translation and conversion routines yield a DRS, K2, for the second sentence.
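The conversion steps for (2') can be imitated with ordinary closures. What follows is only a minimal sketch under stated assumptions, not the BUILDRS code: the names `noun`, `indefinite` and `verb2` are hypothetical, a DRS is modelled as a pair of lists (universe, conditions), and the argument slots of the verb are named so that the subject and object translations may be converted in either order, as the text observes.

```python
from typing import Callable, List, Tuple

DRS = Tuple[List[str], List[str]]          # (universe, conditions)
Prop = Callable[[str], DRS]                # incomplete DRS: lambda x. DRS


def noun(pred: str) -> Prop:
    """Incomplete DRS for a common noun, e.g. lambda x [[], man(x)]."""
    return lambda x: ([], [f"{pred}({x})"])


def indefinite(ref: str) -> Callable[[Prop], Callable[[Prop], DRS]]:
    """Partial DRS for 'a': [ref] lambda P lambda Q [[ref], [P, Q]]."""
    def take_p(p: Prop) -> Callable[[Prop], DRS]:
        def take_q(q: Prop) -> DRS:
            pu, pc = p(ref)                # fill in P with ref as argument
            qu, qc = q(ref)                # fill in Q with ref as argument
            return ([ref] + pu + qu, pc + qc)
        return take_q
    return take_p


def verb2(pred: str):
    """Incomplete DRS for a transitive verb with named argument slots."""
    return lambda subj, obj: ([], [f"{pred}({subj},{obj})"])


loves = verb2("loves")
subj = indefinite("u1")(noun("man"))       # [u1] lambda Q [[u1], [man(u1), Q]]
obj = indefinite("u2")(noun("woman"))      # [u2] lambda Q [[u2], [woman(u2), Q]]

# Obj converted first, giving lambda x [[u2], [woman(u2), loves(x,u2)]],
# and the result then converted with the Subj translation.
obj_first = subj(lambda x: obj(lambda y: loves(subj=x, obj=y)))

# Subj converted first, then the Obj translation.
subj_first = obj(lambda y: subj(lambda x: loves(subj=x, obj=y)))
```

In this sketch `obj_first` comes out as `(['u1', 'u2'], ['man(u1)', 'woman(u2)', 'loves(u1,u2)'])`, which matches the completed DRS in list notation. `subj_first` contains the same conditions but lists the universe in the opposite order, a difference that only matters once scope-bearing determiners are involved.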

K2 does not yet look like some portion of (2), however, because of the peculiar conditions introduced by the anaphoric pronouns. Anaphoric pronouns like she and him introduce discourse referents that must be identified with other discourse referents; the conditions they introduce are partially filled-in equations that the anaphora resolution process must complete. In K2, whose conditions are loves(x1, x2), x1 = [ ] and x2 = [ ], they are the conditions x1 = [ ] and x2 = [ ]. But the anaphora resolution process cannot complete these equations until K2 has been merged with the DRS for the previous sentence in the following way: the discourse referents in the universe of K2 are appended to the set of discourse referents in the universe of the first DRS, and the set of conditions of K2 is appended to the set of conditions of the first DRS. Now the anaphora resolution process can find the appropriate antecedents and produce the final result in (2). (2) says that there are objects correlated with u1, u2, x1 and x2 (call these objects u1, u2, x1 and x2 respectively) such that u1 is a man, u2 is a woman, u1 loves u2, x1 = u2, x2 = u1, and x1 loves x2.

Here is an example with a slightly more complex semantic structure and anaphora resolution problem.

(3) Every man loves a woman. She is beautiful.

(4) [Two DRS diagrams for (3), one per scope reading. Recoverable content: the conditions woman(u2) and loves(u1, u2) inside a => condition; beautiful(z) with the unresolved equation z = [ ] in one diagram, and beautiful(z) with z = u2 in the other.]
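The merge step described above for K2 amounts to two list appends. The following is a minimal sketch, assuming the same (universe, conditions) list representation; the names `merge` and `open_equations` and the string encoding of the open equation `x = [ ]` are illustrative choices, not part of the original program.

```python
from typing import List, Tuple

DRS = Tuple[List[str], List[str]]   # (universe, conditions)
UNRESOLVED = "[ ]"                  # right-hand side of an open equation


def merge(first: DRS, new: DRS) -> DRS:
    """Append the universe and conditions of `new` to those of `first`."""
    return (first[0] + new[0], first[1] + new[1])


def open_equations(drs: DRS) -> List[str]:
    """Referents whose identifying equation 'x = [ ]' is still incomplete."""
    suffix = f" = {UNRESOLVED}"
    return [c[: -len(suffix)] for c in drs[1] if c.endswith(suffix)]


# DRS for "A man loves a woman." and K2 for "She loves him."
k1: DRS = (["u1", "u2"], ["man(u1)", "woman(u2)", "loves(u1,u2)"])
k2: DRS = (["x1", "x2"], ["loves(x1,x2)", "x1 = [ ]", "x2 = [ ]"])
merged = merge(k1, k2)
pending = open_equations(merged)    # referents still awaiting an antecedent
```

After the merge, `pending` holds `x1` and `x2`, the two referents that anaphora resolution must still identify with antecedents from the merged universe.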

BUILDRS returns two different DRSs for (3), the first corresponding to the wide scope reading for every man, the second corresponding to the wide scope reading for a woman. The differences result from the order in which subj and obj F-structures are combined with the translation of the main pred of the first sentence. The determiner every yields under translation a partial DRS with a complex structure; in list notation it is [u1] λP λQ [[[u1], P] => Q]. When filled in, this structure yields a complex condition of the form K => K1, where K and K1 are DRSs. This condition should be read informally as saying that for any objects corresponding to the discourse referents declared in the universe of K such that the conditions in K are all satisfied, there are objects corresponding to the discourse referents declared in K1 such that all the conditions in K1 are satisfied.9 K and K1 are termed subordinate DRSs or subDRSs of the principal DRS. More generally, we say that K1 is subordinate to K2 just in case K1 occurs in a condition of Con_K2 or as a component of a condition of the form K1 => K2. SubDRSs in general arise with the introduction of logical operators on DRSs such as the conditional (the operator =>), other quantificational operators, negation, modal operators or attitude predicates.10

The construction of the rest of the DRS for the discourse proceeds as before, and we will not go into details. The anaphora resolution process, however, yields different results for the two DRSs constructed from (3). But to explain this we need to turn to our first and principal semantic constraint on anaphora, accessibility.11

(ACCESS) A discourse referent x is accessible to a discourse referent y just in case (i) x ∈ U_K and y ∈ U_K, or (ii) x ∈ U_K and y ∈ U_K' and there are K1, ..., Kn such that K' is subordinate to K1 and K1 is subordinate to K2 and ... and Kn is subordinate to K.

The point of using DR theory and the DRS construction algorithm as a component in a system of anaphora resolution is to exploit the notion of accessibility; it furnishes the core of our constraints on anaphora. Our approach to anaphora resolution hinges on being able to recover the discourse referents accessible to a given discourse referent.12 The discourse referent database constructed during the production of DRSs for a discourse reflects the relations between subordinate and "superordinate" DRSs. The database has the form of a tree structure; each node in the tree represents the universe of some DRS, and the paths between nodes represent the relation of accessibility. So once a DRS has been constructed, it is an easy matter to find the accessible discourse referents for any discourse referent in the database.

Let us see how accessibility affects anaphora resolution in some detail. The anaphora resolution module is the last module in BUILDRS, and it takes as input the complete DRS formed at the end of processing each sentence in the discourse. The reason it needs to operate on complete DRSs is that the constraints on anaphoric relations imposed by the accessibility relation require that the logical structure of the sentence be determined. The scope of various quantifiers and truth-functional operators is only fully determined at the level of a complete DRS. So once a DRS K for a discourse has been constructed, BUILDRS goes back and searches for conditions of the form 'x = [ ]'. A discourse referent introduced by an anaphoric pronoun will occur on a certain node n_m in the tree; the discourse referents accessible to it are all those on nodes in the path from n_m to the root n_0 of the tree, which stores the universe of the top-level DRS for the whole discourse. We will call nodes on this path n-accessible to n_m. The program searches back along that path to find all the discourse referents accessible to x. We will call the set of discourse referents accessible to x Acc_x.
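The tree-shaped discourse referent database and the search from n_m back to the root can be sketched as follows. This is a hedged illustration rather than the BUILDRS data structure: the class `Node` and the functions `find_node` and `accessible` are hypothetical names, and the sketch assumes that for a condition of the form K => K1 the node for the consequent is stored as a child of the node for the antecedent.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """One node of the accessibility tree: the universe of one DRS."""
    universe: List[str]
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list)

    def add_child(self, universe: List[str]) -> "Node":
        child = Node(universe, parent=self)
        self.children.append(child)
        return child


def find_node(root: Node, ref: str) -> Optional[Node]:
    """Depth-first search for the node whose universe declares `ref`."""
    if ref in root.universe:
        return root
    for child in root.children:
        hit = find_node(child, ref)
        if hit is not None:
            return hit
    return None


def accessible(root: Node, ref: str) -> List[str]:
    """Acc_x: referents on the path from x's node up to the root, minus x."""
    node = find_node(root, ref)
    if node is None:
        raise KeyError(ref)
    found: List[str] = []
    while node is not None:
        found.extend(r for r in node.universe if r != ref)
        node = node.parent
    return found


# (3) with wide scope for 'every man': u2 is declared in the consequent.
wide_every = Node(["z"])
wide_every.add_child(["u1"]).add_child(["u2"])

# (3) with wide scope for 'a woman': u2 is declared at the top level.
wide_a = Node(["u2", "z"])
wide_a.add_child(["u1"]).add_child([])
```

Under these assumptions `accessible(wide_every, "z")` is empty, so the equation for she cannot be completed, whereas `accessible(wide_a, "z")` contains `u2`.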
Our accessibility constraint, ACCESS, says that a discourse referent x introduced by an anaphoric pronoun may only be identified with a discourse referent y ∈ Acc_x. Our program, however, imposes a series of constraints on the set of accessible discourse referents to "filter out" unsuitable discourse referents. The first such constraint is a feature constraint, FEAT, and it relies on the number and gender features stored in the discourse referent database.13 FEAT says that a discourse referent x introduced by an anaphoric pronoun may only be identified with a discourse referent y having the same number and gender features.

To understand how these constraints work, let us return to the DRSs in (4). Note that the logical structure of the DRS representing the left-right scope assignments precludes any successful match for the discourse referent introduced by the anaphoric pronoun. The discourse referent introduced by a woman is not accessible to the discourse referent introduced by she. When the scope assignments are rearranged as in the second DRS, however, the discourse referent introduced by a woman does become accessible to the discourse referent introduced by she, and a successful match is made.

Accessibility is a powerful semantic constraint. Exploiting principles of DRS construction for various constructions, the accessibility constraint rules out or allows the identifications of discourse referents that the indexings in (5) suggest. Note that for us two NPs are coindexed iff the discourse referents they introduce are identified during the anaphoric resolution process (on at least one reading). The coindexings that are starred are those that the constraints do not allow to occur.

(5)
a. Every farmer_i who owns a donkey_j beats it_j. He_i* is unhappy.
b. If a farmer_i owns a donkey_j, he_i beats it_j. It_j* is unhappy (assuming narrow scope for a donkey).
c. Every cadet_i receives a rigorous training. First he_i goes to boot camp. Then he_i gets intensive flight instruction in a basic trainer. Finally, he_i is given 200 hours of flight time in a supersonic aircraft.
d. It's false that a man_i came to visit yesterday. He_i* left a while ago (assuming narrow scope for a man).
e. John suspects that a woman_i broke into his apartment. He believes that the police will never catch her_i (assuming narrow scope for a woman).
f. John doubts that a woman_i broke into his apartment. He believes that the police will never catch her_i* (assuming narrow scope for a woman).
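Before the discussion of (5), the FEAT filter stated above can be sketched in a few lines. This is an illustration only: the function name `feat_filter`, the referent names, and the feature values (in particular the value `neut`, which the F-structure examples do not show) are assumptions made for the example.

```python
from typing import Dict, List, Tuple

Features = Tuple[str, str]   # (number, gender)


def feat_filter(pronoun: str, candidates: List[str],
                features: Dict[str, Features]) -> List[str]:
    """Keep only candidates whose number and gender match the pronoun's."""
    wanted = features[pronoun]
    return [c for c in candidates if features.get(c) == wanted]


# Hypothetical database entries for the first sentence of (5b).
features = {
    "u1": ("sing", "masc"),   # a farmer (gender assumed)
    "u2": ("sing", "neut"),   # a donkey (value 'neut' assumed)
    "z1": ("sing", "masc"),   # he
    "z2": ("sing", "neut"),   # it
}
acc = ["u1", "u2"]            # referents assumed accessible to z1 and z2

he_candidates = feat_filter("z1", acc, features)
it_candidates = feat_filter("z2", acc, features)
```

With these entries the filter leaves exactly one antecedent per pronoun, `u1` for he and `u2` for it. FEAT only narrows a set that ACCESS has already fixed; it never adds candidates.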

The admissible coindexings in the famous pair (5a) and (5b) fall straight out of the analysis of universally quantified sentences and DRT's analysis of conditionals.14 The program yields almost the same DRS for the first sentences of (5a) and (5b); we give here the DRS for (5b) with the narrow scope reading for a farmer and a donkey and wide scope for the conditional operator:

[DRS diagram for (5b). Antecedent sub-DRS: farmer(u1), donkey(u2), owns(u1, u2). Consequent sub-DRS: beats(z1, z2), z1 = u1, z2 = u2. Top-level conditions: unhappy(z3), z3 = [ ].]



The discourse referents u1 and u2 are accessible to the discourse referents z1 and z2 given the definition of accessibility, and since the identifications of z1 with u1 and z2 with u2 pass the test required by FEAT, the anaphora resolution component of the program adds those equations to the DRS. By the definitions, however, none of the discourse referents introduced earlier in the DRS are accessible to z3, and so the DRS for (5b) with this scope reading cannot be completed. A similarly straightforward explanation involving accessibility and the narrow scope of the NP a man vis-à-vis negation accounts for the impossibility of (5d). A more complex account of the interpretation of the attitudes, combined with the constraints ACCESS and FEAT, is needed to explain the data in (5e) and (5f), but it should be noted that this interpretation is independently motivated.15

(5c) is to be explained once again by appealing to accessibility. But here we also require a particular interpretation of the second, third and fourth sentences in the discourse. They all describe what every cadet does and as such can be seen as amplifications of the conditions in the consequent of the => operator introduced by the first sentence. The construction process allows the material of subsequent discourse to be entered into the consequent, subordinate DRS when such redescription or amplification occurs.16 As Roberts (1987) has observed, the subordination of a larger than clausal chunk of a discourse to some supposition requires assigning an implicit quantificational or modal force to the subordinated material.17 In (5c) the subordination succeeds, because the discourse cues make clear that the material after the first sentence amplifies on the training each cadet receives, and so it is taken to have an implicit quantificational force. This subordinated reading is apparently not possible for (5a) and (5b). Unfortunately, it is very difficult to give rules for when such subordinated readings are possible. We use cues like the enumeration of tasks or the presence of modals or quantificational adverbs to check for possible subordinated readings, but that isn't sufficient. It is really a combination of such cue words and lexical knowledge that enables a discourse like (5c) to pass the ACCESS constraint but not (5a) and (5b).18

None of the analyses of the examples (5a-f) are new theoretically. But note that they rely heavily on an analysis of scope as well as a certain amount of interpretive flexibility.19 The definition of accessibility and the structure of the discourse referent database presuppose a determinate scope assignment for all noun phrases. We have also seen that in the DRT framework the scope of NPs is not limited by sentence or clause boundaries but by logical structure. Thus, an indefinite NP in a discourse may have scope over a large, multisentential chunk of discourse, and even a quantificational NP as in (5c) may have scope over a multisentential chunk of discourse. In the original formulation of our program (Wada & Asher 1986), the algorithm placed no constraints on scope possibilities and returned separate DRSs with each scope possibility. In view of the data in (5a-c) (but see also May (1985)), however, this is simplistic and does not do justice to the facts. The formulation of scope constraints is complex and involves perhaps as many diverse sources of information as anaphora itself. We have unfortunately not yet been able to formulate a very thorough list of principles for scope assignment, but we will list those principles of which we are aware and which we believe might be useful.20



We have already discussed to some extent principles relevant to scope with respect to subordinated readings as in (5c). In order to set down our other principles for the analysis of scope, we need to discuss an important difference between discourse referents introduced by definite NPs and those introduced by indefinite NPs. We distinguish between two sorts of definites, dependent and non-dependent. All definites generate presuppositions of familiarity; some of these are part of the non-linguistic and pragmatically given context of utterance, but some also depend on explicitly introduced linguistic elements. Non-dependent definites are those NPs whose presuppositions are met (or assumed to have been met) within the non-linguistic component of the context. Dependent definites are those NPs whose presuppositions depend on already introduced, explicitly linguistic elements in the discourse. Within the theory, the "familiarity" presupposition of a definite is interpreted as the requirement that the discourse referent it introduces be identified with some antecedently occurring discourse referent (in the case of a dependent definite) or some contextually given object.

Often (but not always) a dependent definite is a complex term containing an anaphoric element as a component. The denotation of a dependent definite will typically be a function of the denotation of some other NP. Dependent definites have a restriction on their scope possibilities. Our claim is that they prefer as wide a scope as is possible, relative to the NPs that function as their linguistic antecedents or on which they are functionally dependent. We have built our claim into the theory by requiring that a discourse referent x introduced by a dependent definite be copied into a separate list DEF_i, where i is the node of the accessibility tree which contains the discourse referent with which x, or some discourse referent also introduced ("co-introduced") by the NP that introduces x, is to be identified.21 x is then deleted from its original position in the accessibility tree.

Dependent definites create certain processing complexities. In order to find the appropriate anaphoric attachment for the pronouns within a dependent definite α, the discourse referents introduced by α must be introduced as low down in the accessibility tree as possible. We must then resolve the equations for the discourse referents introduced by pronouns in α first, then raise these discourse referents up as high as possible in the accessibility tree, then resolve other equations introduced by other anaphoric pronouns. We have yet to implement this order of processing in our program, however.

Most, if not all, uses of proper names and demonstratives, and many uses of definite descriptions, are non-dependent definites. Such NPs naturally take a "referential" interpretation (or at least a widest possible scope with respect to other operators) and so introduce discourse referents that are accessible to any other discourse referent introduced in the discourse. We capture this observation in the implemented version of the theory by copying the discourse referents introduced by non-dependent definites into a separate list DEF on the root node of the accessibility tree in the database and by introducing the following constraint.

(DEF) Suppose x is a discourse referent introduced by an anaphoric pronoun into the accessibility tree t at node n. Then x may be identified with y only if y ∈ ACCESS(x) ∪ (∪_i DEF_i), where i is any node n-accessible to n.

Together with our analysis of dependent definites, (DEF) and (ACCESS) appear to account for the following data:

(6)
a. Everyone who likes Gary Cooper is a good judge of films. He was a great actor.
b. *Everyone who likes his favorite movie star_i is a good judge of films. He_i is a great actor.
c. If Mary likes everyone who likes the best dressed senior_i at Austin High, then she likes him_i.
d. *If Mary likes everyone_j who likes his_j mother_i, then she likes her_i.

(6a) is predicted to be good, because Gary Cooper is a non-dependent definite; so the discourse referent introduced by the name moves to the DEF list at the root node of the accessibility tree. Thus, that discourse referent is always accessible to any discourse referent introduced in subsequent discourse. In (6b), on the other hand, his favorite movie star is a dependent definite that moves onto the DEF list for the node in the accessibility tree which contains the discourse referent introduced by everyone. That discourse referent is inaccessible to the discourse referent introduced by he in the subsequent sentence.22 (6c-d) show similar predictions of DEF together with ACCESS for more complex DRS structures.

Indefinites force us to add further to the constraints on anaphora resolution. Indefinites appear to be sensitive to surface order in a way that definites are not. With ACCESS alone the theory predicts (7a) (from Sidner (1983)) to be acceptable; (7b) is predicted to be bad when the indefinite is given narrow scope with respect to the conditional operator. The explanation for this prediction is that the discourse referent provided by John is copied onto the root node of the accessibility tree for (7a); thus that discourse referent is accessible to the discourse referent introduced by he. On the other hand, if the indefinite is interpreted as having narrow scope with respect to the conditional operator, then it introduces a discourse referent x in the DRS for the consequent of the conditional in (7b), while the pronoun introduces a discourse referent z in the DRS for the antecedent. By the accessibility constraint, z cannot be identified with x.

(7)

a. If he_i comes before the show, give John_i these tickets and send him to the show.
b. *If he_i comes before the show, give a man_i these tickets and send him to the show.23

(7b) would be perfectly acceptable if a man were assigned wide scope over the conditional. Indeed, as Fodor and Sag (1982) have argued, indefinites may sometimes be interpreted specifically, and under such an interpretation are typically assigned maximal scope.24 Nevertheless, it appears difficult to assign a scope to a man that is wider than that of the conditional operator.

Besides the logical structure of the DRS that affects anaphora resolution through the accessibility constraint, there is another scope-like relation on discourse referents that affects anaphora resolution. Suppose that α is an NP in sentence S_i in the discourse <S_1, ..., S_i, ..., S_n>. Suppose that α introduces a discourse referent x_α at node n on the accessibility tree t for the discourse. The F-structure to DRS mapping must first process S_1, then S_2, and so on, so that any discourse referent introduced in sentence S_j at node n for j ≤ i must be introduced into the DRS prior to the introduction of the discourse referent x_α at level n. Further, the construction procedure we have devised says that if the partial DRS that is the translation of α_i is applied to the translation of a PRED that is of the form λx_1 ... λx_n φ(x_1, ..., x_n) after the application of the partial DRS that is the translation of α_j, then the discourse referent introduced by α_i at level n in t will precede the discourse referent introduced by α_j in the discourse referent list at level n in t. These facts about the translation procedure from F-structures to DRSs determine a strict ordering, order(,), on discourse referents in the universe of a DRS (or at each level in the accessibility tree) prior to the construction of the DEF lists. We will adopt the following scope-like constraint on anaphora that exploits this ordering:25

(SCOPE) Suppose x is a discourse referent introduced by an anaphoric pronoun into the accessibility tree t at node n and suppose y is also introduced at node n. Then:

(a) if y ∉ ∪_i DEF_i and order(x, y), then x may not be identified with y.

(b) if z is introduced at node n, order(y, x), z ≠ x, order(x, z) and x may be identified with y, then x may not be identified with z.

To use this constraint fully, we would like to specify some constraints on the possible orderings for discourse referents in a DRS using order. One absolute constraint from the work of May (1985) is the following: when a sentence ψ has a SUBJECT NP α containing an anaphoric pronoun β introducing a discourse referent x at level n in t, then order(x, y), where y is a discourse referent introduced by an NP in non-SUBJECT position in ψ. If we constrain the mapping from f-structures to DRSs so that this constraint on order is always satisfied, then part (a) of SCOPE rules out the weak crossover data concerning quantifiers and pronouns in (8):26

(8)
a. ?His_i mother loves John_i.
b. *His_i mother loves someone_i.
c. Someone_i loves his_i mother.
d. *His_i mother is loved by someone_i.

Our notion of scope allows one discourse referent x in U_K to have scope over a discourse referent y if x is added to U_K prior to y.27 Although clearly the different scopes that result from the processing of two indefinite NPs won't affect truth conditions or anaphora, the scope relations that obtain between a discourse referent introduced into a universe U_K by an indefinite and a discourse referent introduced into U_K by a pronoun do affect anaphora.28 The notion of order operative in our constraint SCOPE must be strongly distinguished from the notion of the precedence ordering of the NPs that give rise to the discourse referents in the sentences of the discourse: it is certainly not the case, for instance, that an antecedent NP must precede the pronoun in the sentence for felicitous anaphora to take place. The data in (9) show that definite and indefinite antecedents do not obey a simple precedence restriction. More precisely, if an indefinite introduces a discourse referent x at node n in the accessibility tree t and an antecedently occurring pronoun introduces a discourse referent z at some node m such that n is accessible to m in t and the antecedent and pronoun both occur in the same sentence, then the identification of x with z is perfectly permissible.

(9)
a. The fact that she_i had already climbed this mountain before encouraged Rosa_i to try again.
b. The fact that he_i had already climbed this mountain before encouraged a man_i to try again.

DR theory predicts (9b) to be acceptable if the rules for processing complex NPs like the fact that φ from Asher (1987) are adopted. Those rules yield a DRS like the following:29

[DRS diagram for (9b). Recoverable content: universe K, u; conditions fact(K); K = a sub-DRS with universe z and conditions "z had already climbed this mountain before" and z = u; K encouraged u to try again.]
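Returning to the constraints (DEF) and part (a) of (SCOPE) as stated earlier, their combined effect on the weak crossover data in (8) can be sketched as follows. This is a simplified illustration under several assumptions: the function names are hypothetical, each node's referent list is given directly in the order determined by order(,), the referent for mother is left out, and FEAT is ignored.

```python
from typing import Dict, List, Set


def candidates(acc_x: Set[str], def_lists: Dict[str, List[str]],
               n_accessible: List[str]) -> Set[str]:
    """(DEF): Acc_x together with the DEF lists of all n-accessible nodes."""
    out = set(acc_x)
    for node in n_accessible:
        out.update(def_lists.get(node, []))
    return out


def scope_a(x: str, cands: Set[str], node_order: List[str],
            in_def: Set[str]) -> Set[str]:
    """SCOPE (a): drop y at x's node if y is in no DEF list and order(x, y)."""
    if x not in node_order:
        return set(cands)
    later = set(node_order[node_order.index(x) + 1:])   # all y with order(x, y)
    return {y for y in cands if not (y in later and y not in in_def)}


# (8b) "His mother loves someone": x = his, s = someone, order(x, s).
res_8b = scope_a("x", candidates({"s"}, {}, ["root"]), ["x", "s"], set())

# (8a) "His mother loves John": j sits on the root DEF list, order(x, j).
res_8a = scope_a("x", candidates(set(), {"root": ["j"]}, ["root"]),
                 ["x", "j"], {"j"})

# (8c) "Someone loves his mother": order(s, x), so s survives.
res_8c = scope_a("x", candidates({"s"}, {}, ["root"]), ["s", "x"], set())
```

In this sketch `res_8b` is empty, while `res_8a` keeps `j` and `res_8c` keeps `s`, mirroring the contrast between the starred and unstarred coindexings. Part (b) of SCOPE and the ordering constraint attributed to May (1985) are not modelled here; the orderings are simply supplied by hand.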

The precedence of the indefinite with respect to the pronoun becomes more important, however, when both the discourse referent introduced by the indefinite and the one introduced by the pronoun are accessible to each other or when antecedent indefinite and pronoun occur in different sentences. The second case is especially significant for our theory. Our mapping from F-structures to DRSs in the intersentential case requires that the discourse referent introduced into U_K by a noun phrase from the prior sentence take wide scope over any discourse referent introduced into U_K by a noun phrase from the second sentence, unless the latter is introduced by a definite. Cataphora with definites is acceptable in our theory because of the way we have formulated SCOPE. Part (a) of SCOPE, however, in conjunction with the other constraints of our theory, predicts (10d) and (10f) to be bad, unless one can have specific readings of the indefinite antecedents.

(10)
a. In his_i wallet Bill_i found a dollar.
b. ?In his_i wallet someone_i found a dollar.30
c. First he_i lost his_i wallet. Then his_i car got stolen. Fred_i was having a bad day.
d. *?First he_i lost his_i wallet. Then his_i car got stolen. Someone_i was having a bad day.
e. Everyone likes him_i. Fred_i is very engaging.
f. *?Everyone likes him_i. Some cat_i is very engaging.
g. Some cat_i is very engaging. Everyone likes him_i.

In a contrast like that between (10a) and (10b), the problem is once again a matter of scope, but nothing in our mapping forces the discourse referent introduced by the pronoun to have wide scope. The precedence of the pronoun strongly suggests wide scope, however, and that, we claim, accounts for the marginality of (10b) as opposed to (10a).

The possibilities for cataphora even among definites are also limited. This is what the second part of the SCOPE constraint addresses. We should note that cataphora with anaphoric pronouns (though not anaphorically used demonstratives like this) really is a form of accommodation of the interpreter to an unusual discourse situation. The recipient considers the identification of a pronoun in a previous sentence with a discourse referent introduced by a definite in a subsequent sentence only if there are no antecedently available discourse referents. If there are discourse referents introduced antecedently to the occurrence of the pronoun or introduced in the same sentence, cataphora does not appear to be a possibility. Consider for instance the following possibilities:

(11)

Sam was watching TV. He had prepared the meat the night before so that the meal would be easy to make. Fred was now in the kitchen washing up.

(11) is an attempt to have a neutral discourse. According to common-sense knowledge, it is clearly possible that Fred could have done the preparation. In fact, if the first sentence of (11) were dropped, the antecedent of he would be Fred and there would be no difficulty with this judgment. But no matter how you fill in the details, it is extremely difficult to get Fred to be the antecedent of he once we add the first sentence of (11).31 Such data point to limitations on accommodation with cataphora. Part (b) of SCOPE predicts this fact.

3. DISJOINT REFERENCE IN BUILDRS

The notion of accessibility alone will not rule out certain ungrammatical discourses. For instance, ACCESS together with DEF and SCOPE permit the discourse referents introduced by him and himself to either be identified with the discourse referent introduced by John or not. Nevertheless, in (12a) the identification cannot be made while in (12b) it must be made.

(12)
a. John likes him.
b. John likes himself.

Sentences such as (12) require at least a distinction between reflexive and non-reflexive pronouns, familiar from the literature on syntax, and some constraint making use of this feature. We have introduced a feature Refl: a discourse referent x in U_K gets the feature +Refl if it is introduced into U_K by a reflexive pronoun; otherwise x gets the feature -Refl. Many have argued that syntactic configuration imposes constraints governing the anaphoric behavior of reflexive and non-reflexive pronouns. To appeal to some familiar examples (taken from Lasnik & Freidin (1981) (13a-c) and Reinhart (1983) (13f-g)):

(13)
a. I met a man_i who he_i* said that Mary had seen e_i.
b. The man_i who he_i* wants the woman to like e_i.
c. The man_i who he_i* thinks e_i likes the woman.
d. A man who hardly knows her_i loves Mary_i.
e. *He_i loves a woman_i who_i hardly knows Bill_i.
f. Near him_i Dan_i saw a snake.
g. *Near Dan_i he_i saw a snake.

Examples like these have led many to accept that something like Reinhart's c(onstituent)-command or Bach and Partee's (1980) function-argument constraint determines a constraint of disjoint reference that governs the distinction between reflexive and non-reflexive pronouns.32

There are certain difficulties with syntactic approaches that lead us to take a non-syntactic, semantic interpretation of disjoint reference constraints. Standard syntactic treatments, as Roberts (1987) has argued, assign indices to NPs that do not have a clear interpretation. The intended interpretation seems to be that NPs with the same index are coreferential, while NPs with different indices are not coreferential. There are examples in the literature, however, that belie this simple view. Consider for instance (14), which is an example discussed in Evans (1980):

(14) Reagan voted for Reagan.

According to a binding theory like that put forward in Chomsky's Lectures on Government and Binding, the two occurrences of Reagan cannot be assigned the same index without apparently violating some of the principles of the binding theory; yet it appears clear that the two occurrences could be coreferential.33 An example due to Peter Sells shows another apparent difficulty with this simple view:

(15) Everyone likes John. Fred likes him. Mary likes him. Even John likes him.

In (15) John and him cannot be coindexed without violating the binding theory, yet again the obvious intent of the discourse is to make him and John coreferential.

The moral we draw from these examples is that certain syntactic features determine constraints on which discourse referents may be identified with other discourse referents by means of the equations explicitly introduced by the anaphora resolution process. We take disjoint reference to be a constraint on discourse referents with certain features; the features have to do with whether the discourse referent in question was introduced by a reflexive or non-reflexive pronoun. Disjoint reference requires that a discourse referent with the appropriate feature not be identified with other discourse referents that are introduced by NPs in certain positions. This of course does not preclude that the discourse referents are mapped onto the same individual in a proper embedding of the DRS of which they are part. So our view differs from the simple view; even when two discourse referents cannot be identified explicitly by means of an equation introduced by the anaphoric process, they may nevertheless turn out to be "coreferential." But further, we interpret disjoint reference constraints as requiring in (15) that the discourse referent z introduced by him in the last sentence not be identified with the discourse referent introduced by the second occurrence of the name John. That is, disjoint reference constraints must be intra-sentential. But as we see it, nothing in the disjoint reference constraints precludes that we identify z with the discourse referent introduced by the first occurrence of John.

We have chosen not to generate a separate C-structure with an independent syntactic constraint on coindexing. It proves nevertheless relatively easy to adapt a C-command-like constraint to our mapping from f-structures to DRSs; so we have stored information from the f-structure relevant to disjoint reference in the discourse referent database. This strategy yields a suitable disjoint reference constraint on discourse referents, and it has the benefit that all our constraints apply at a single, representational level, eliminating redundancies. But further, our strategy reflects a commitment to only one interpretation of anaphora: anaphora involves a relation between discourse referents. We do not believe in a variety of different sorts of anaphoric binding relations, some determined by syntax, others by discourse effects.
Rather, we believe that anaphoric binding relations always concern relations between discourse referents, although these relations may be determined by a number of constraints, employing syntactic, semantic, and pragmatic information. 3 4 Disjoint reference phenomena reveal an important facet of the represen tation of the content of a sentence - the semantic topic of the sentence. The subject - or in DR theoretic terms, the discourse referent introduced by the SUBJ f-structure - is, in one sense of aboutness , what the sentence is about. It is the primary topic. But there are other aspects to the representation of the content of a sentence that syntactic treatments o f disjoint reference have noticed. Every verb introduces a state or event discourse referent into a D RS; we think of this discourse referent as having certain "thematic roles slots" described by the lexical subcategorization of the verb. These slots are filled by other discourse referents introduced by NPs in the sentence; which discourse referent fills which slot is determined by the grammatical function of the NP introducing it as is supposed in Case Grammar. 3 5 These dis course individuals play a secondary role to the subject (they are secondary topics), although they play more important roles in the representation of the content than the discourse referents introduced by ADJUNCTS . 3 6 Very roughly, non-reflexive pronouns must not in the predication of a property


We now define a domain of disjoint reference for a reference marker x_0.

(i) Suppose that a discourse referent x_0 is introduced by a DET or PRED in the SUBJ f-structure f_n of a superordinate f-structure f_{n+1} with Pred P_{n+1}. Then Domain(x_0) = {y : y is a discourse referent occurring as an argument in either Tr(f_n), Tr(P_{n+1}) or in the property locally predicated of an argument of Tr(P_{n+1}); or y is a discourse referent related in a predication to the event described by Tr(P_{n+1})}.

(ii) Suppose that a discourse referent x_0 is introduced by a DET or PRED in a non-SUBJ f-structure f_n that is a subcategorized grammatical function in a superordinate f-structure f_{n+1} with Pred P_{n+1}. Then Domain(x_0) = {y : y is a discourse referent occurring as an argument in Tr(P_{n+1}) or Tr(f_n)}.

(iii) Suppose that a discourse referent x_0 is introduced by a DET or PRED in f-structure f_n that is not a subcategorized grammatical function in a superordinate f-structure f_{n+1}. Then Domain(x_0) = {y : y occurs as an argument in Tr(f_n)}.

Using this definition of domain, we can now formulate a disjoint reference constraint.

(DISREF)
(a) A discourse referent x with the feature -Refl cannot be identified with a discourse referent y with feature -Refl in the domain of x. 37
(b) A discourse referent x with feature +Refl must be identified with a discourse referent y with -Refl in the domain of x.
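The two clauses of (DISREF) can be read as a single admissibility test on a candidate identification. The following is only a minimal sketch of that reading, not the BUILDRS implementation: the `Referent` class, the `reflexive` flag and the function name are hypothetical, and Domain(x) is assumed to have been computed already from the f-structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Referent:
    """A discourse referent with the Refl feature stored alongside it (hypothetical)."""
    name: str
    reflexive: bool = False  # True stands for +Refl, False for -Refl


def disref_allows(x, y, domain_of_x):
    """Return True if DISREF permits identifying pronoun referent x with y.

    domain_of_x is assumed to be Domain(x), precomputed elsewhere.
    """
    in_domain = y in domain_of_x
    if x.reflexive:
        # Clause (b): a +Refl referent needs a -Refl antecedent inside its domain.
        return in_domain and not y.reflexive
    # Clause (a): a -Refl referent may not take a -Refl antecedent inside its domain.
    return not (in_domain and not y.reflexive)
```

With a domain containing only the subject referent, a non-reflexive pronoun is barred from that subject but free to pick any referent outside the domain, while a reflexive is restricted to the subject.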


to a primary or secondary topic of the predication be used to ascribe a reflexive property, while reflexive pronouns must be so used. As with syntactic notions of disjoint reference, our "semantic" notion assigns NPs filling certain thematic slots (like SUBJECT or other subcategorized functions) more stringent disjoint reference constraints than others. The crucial feature accounting for the difference in judgments above in our use of semantic disjoint reference constraints involves the notions of local and proper predication. Suppose that a discourse referent x_0 is introduced by a DET, or PRED (in the case of proper names or anaphoric pronouns), in the f-structure f_n. Then the translation of f_n under the construction algorithm (a partial or predicative DRS) yields a property Tr(f_n) locally predicated of x_0. Now suppose that f_n is a subcategorized argument of a PRED P_{n+1} in a superordinate f-structure f_{n+1}. Then the translation of P_{n+1} yields a property Tr(P_{n+1}) properly predicated of x_0.


(16) a. man(u_1), hardly-knows(u_1, x_1), x_1 = m, Mary(m), loves(u_1, m)
b. loves(x_1, u_1), woman(u_1), x_1 = [ ], Bill(b), hardly-knows(u_1, b)

Together the constraints ACCESS, DEF, SCOPE and DISREF account, as far as we are aware, for the data concerning disjoint reference and crossover. But there is one other traditional domain of the syntactic theory of binding, viz. cases of "reconstruction" involving picture nouns, to which they ought to apply and with which they have some difficulty. But picture nouns by themselves already pose problems even without reconstruction. What, for instance, ought to be the relationship between the PP and the NP 'that picture' in (17a)? We suspect that there is a complex interaction between verbs like see and picture nouns, so that one might take the PP as an adjunct; then DISREF would permit (17a). On the other hand, if the PP is part of the object f-structure or some subcategorized grammatical function of see (as in see oneself in a picture), then DISREF would permit only (17b).


For any discourse referents x and y, the constraint DISREF stipulates that if y ∈ Dom_x then x and y must not be identified by BUILDRS's anaphora resolution module. So if we examine, for example, the DRSs for (13d) and (13e) in (16a) and (16b) below, we see that the discourse referents introduced by 'a man' and 'Bill' (u_1 and b) are in each case accessible to the discourse referents introduced by the pronouns. But in (16b), b lies in the domain of x_1, and since x_1 has the feature -Refl, b may not be identified with x_1. Similarly with the adjunct prepositional phrases in (13f) and (13g) or the relative adjunct clauses in (13a-c), the domain relations are such that though ACCESS, DEF and SCOPE would permit either coindexing, DISREF rules out (13a-c) and (13g), where the pronoun gives the discourse referent it introduces the feature -Refl.

(17) a. Fred saw that picture of him.
b. Fred saw that picture of himself.

We suspect that there are, as suggested by Roberts (1987), two constructions, since these two sentences do differ in meaning. Other verbs don't permit both forms; usually they license only reflexive anaphoric constructions like that in (17b). This phenomenon appears puzzling from our point of view, since it contrasts with the behavior of anaphoric pronouns in other PPs like those in (13f) and (13g). In view of these difficulties, however, any account concerning reconstruction phenomena seems premature. 38

4.1. An absolute discourse constraint?

The last "absolute" constraint on anaphora resolution in the current implementation of BUILDRS is one that governs permissible patterns of anaphoric linkage. 39 It may be only one of many such constraints, but we have found only this one so far. Perhaps the best way to understand this constraint is to look at a violation of it. Consider the following discourse.

(18) Mary_i invited Susan_j over for dinner. She_i prepared sukiyaki. She_j arrived late. #She_i served her_j apologetic guest sullenly.

(18) strikes many speakers as simply incomprehensible because of the shifting subject. It appears that one can easily shift from Mary to Susan as subject in the sentences of a discourse, as in the first three sentences of (18). But one cannot switch back again even though world knowledge demands that 'she' in the fourth sentence have 'Mary' as an antecedent. 40 We shall simply take as a constraint on "discourse presentation" the following. Let [ACC] be the set of equivalence classes of discourse referents under ACC. Then

(PRES) Suppose z_1 and z_2 are discourse referents introduced by anaphoric pronouns in a DRS K, and that z_1 is identified with x and z_2 with y in K, and x = y is not a condition in K. Now suppose z_3 is introduced by an anaphoric pronoun into K and [ACC_{z_1}] = [ACC_{z_2}] = [ACC_{z_3}]. Then z_3 cannot be identified with x in K.
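One way to picture (PRES) is as a check on the history of pronoun links. The sketch below is an illustration under two simplifying assumptions that are not stated in the text: the earlier pronouns are listed in discourse order (z_1 before z_2), and all pronouns involved share the same accessibility class, so the [ACC] condition is taken as given. The function name and argument layout are hypothetical.

```python
def pres_blocks(candidate, prior_links, identified=frozenset()):
    """Return True if PRES forbids linking a new pronoun to `candidate`.

    prior_links: antecedents chosen for earlier pronouns, in discourse order.
    identified: frozensets {x, y} for which x = y is a condition in the DRS.
    """
    def same(a, b):
        return a == b or frozenset((a, b)) in identified

    seen_candidate = False
    for antecedent in prior_links:
        if same(antecedent, candidate):
            seen_candidate = True
        elif seen_candidate:
            # An earlier pronoun picked `candidate` and a later one moved on
            # to a distinct referent: switching back is ruled out.
            return True
    return False
```

For (18), the links so far are Mary and then Susan, so `pres_blocks("Mary", ["Mary", "Susan"])` holds while Susan remains available.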

(ACCESS), (FEAT), (DEF), (SCOPE), (DISREF) and (PRES) are all the absolute constraints on anaphora resolution in our current implementation. BUILDRS does not actually construct the set of all accessible discourse


4. PRAGMATIC AND DISCOURSE CONSTRAINTS ON ANAPHORA IN BUILDRS


4.2. Discourse constraints and salience

The other constraints of a pragmatic or discourse nature are more subtle than the absolute constraints we have just detailed; these discourse constraints impose preferences that may be overturned by other evidence. We distinguish two kinds of such constraints. The first pertains to the creation of a salience ranking on discourse referents, the second to world knowledge. An essential component of our analysis and implementation is that the absolute, salience and world knowledge constraints operate as independent modules. Constraints pertaining to world knowledge are well recognized to be important factors in determining coreference, but we have little to offer in this area that is not already well-known. 42 We shall assume the existence of a world knowledge component that "checks" the predictions of the salience filter for coherence, but we will not examine what its structure is. 43 Henceforth, we shall concentrate on salience. Salience constraints form yet another filter used to narrow down the set of potential anaphoric antecedents in BUILDRS. The phenomena that we try to accommodate with salience constraints are usually associated with the literature on local focus or topic. Roughly, the most salient discourse referent at a particular point in the discourse (or in the construction of the DRS from the discourse) is the entity that others have called the "entity in focus."


referents for x for each such x introduced by an anaphoric pronoun. Instead, it constructs a smaller set from the internal database of all discourse referents for the discourse - a set that we shall call ACC'_x. ACC'_x is that subset of ACC_x of discourse referents that agree in number and gender with x and which also have passed the constraints (DEF), (SCOPE), (DISREF) and (PRES). 41 Every discourse referent outside of ACC'_x is excluded from further consideration as a possible anaphoric antecedent for x, so our program applies these constraints prior to applying the constraints that are not absolute. The order in which these constraints may be applied is not necessarily determined in the implementation. These constraints also involve only feature checking and matching of the features in the discourse referent database. Since the database is easy to construct while the DRS itself is being constructed, these constraints are also quite efficient. If ACC'_x is a singleton {z}, the algorithm simply specifies that x = z. If ACC'_x is not a singleton, then BUILDRS invokes another, independent module that computes a salience ranking on ACC'_x. This ranking enables BUILDRS to prefer certain anaphoric links to others and to rule out some candidates for anaphora to which the salience ranking assigns an extremely low rank. We turn now to describing the salience ordering in more detail.
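The control flow just described, restricting the accessible referents first and ranking only when more than one survives, can be sketched as follows. This is an assumption-laden illustration rather than the BUILDRS code: referents are modelled as plain dictionaries with hypothetical `name`, `number` and `gender` keys, the absolute constraints are passed in as predicate functions, and the salience module is an opaque callback.

```python
def restrict_candidates(pronoun, accessible, absolute_filters):
    """Build ACC'_x: accessible referents that agree and pass every filter."""
    survivors = []
    for cand in accessible:
        agrees = (cand["number"] == pronoun["number"]
                  and cand["gender"] == pronoun["gender"])
        if agrees and all(f(pronoun, cand) for f in absolute_filters):
            survivors.append(cand)
    return survivors


def resolve(pronoun, accessible, absolute_filters, rank_by_salience):
    """Identify the pronoun directly if ACC'_x is a singleton, else rank."""
    acc_prime = restrict_candidates(pronoun, accessible, absolute_filters)
    if len(acc_prime) == 1:
        # Singleton case: the algorithm simply sets x = z.
        return acc_prime[0]["name"]
    # Otherwise hand the surviving candidates to the salience module.
    return rank_by_salience(pronoun, acc_prime)
```

In a discourse with one masculine and one feminine referent, a feminine pronoun is resolved without the salience callback ever being invoked, which is the situation the text describes for singleton ACC'_x.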


(D1) (0) Sam_i really goofs sometimes. (1) Yesterday was a beautiful day and he_i was excited about trying out his_i new twin. (2) He_i wanted Fred_k to join him on a practice flight. (3) He_i called him_k at 6am. (4) He_k was furious at being awakened at that hour (just to go for an airplane ride).


We have chosen to elaborate our own notion of a salience ranking rather than use the more familiar notion of focus. 44 We do not think that the notion of the focus, the topic or the backward center of the sentences in the discourse prior to the one containing the pronoun is the right one for the analysis of anaphora. 45 It is not clear to us that for every clause there is just one focus or perhaps any focus at all, although it is undeniable that something like a focus exists in many discourses. Unfortunately, tests for local focus do not seem to capture a clearly defined notion, and recent attempts to clarify the notion of focus tend to make predictions that go beyond any supporting data from speakers' intuitions. 46 One simple modification to the concept of focus, however, that seems to solve at least some of the difficulties is to abandon the idea of one local focus in favor of a degree of focus, or a degree of how much more salient one discourse referent is than another among the set of syntactically and semantically acceptable, potential antecedents. We will use the factors associated with focus to determine a degree of focus or salience. Where several discourse referents all have pretty much the same degree of salience, we claim that the salience filter does not choose among these candidates. This is important, because in many cases the salience ranking does not clearly distinguish between two possible antecedents, and BUILDRS would come to the wrong conclusion if it always had only one choice to make in those cases and that choice had to be based on rather delicate considerations of salience.
Only in cases where a significant disparity in degree of salience exists does the salience filter indicate a preference for one antecedent over another, and only in cases of extreme disparity does the salience filter instill a very strong preference, which, if not obeyed because of constraints pertaining to world knowledge, leads to predictions of discourse infelicity. When salience and world knowledge both agree on a preference for an anaphoric antecedent, the discourse exhibits coherence; when the salience filter instills a very strong preference for an antecedent that is ruled out because of world knowledge, the discourse is infelicitous or awkward. Grosz, Joshi and Weinstein (1986) provide a convincing example showing that in some cases the salience of one discourse referent is so extreme that a failure to choose it as an antecedent for a discourse referent subsequently introduced by an anaphoric pronoun leads to infelicity.


(D2) Sam_i really goofs sometimes. Yesterday was a beautiful day and he_i was excited about trying out his_i new twin. He_i wanted Jill_j to join him on a practice flight. He_i called her_j at 6am. She_j was furious at being awakened at that hour.

Any salience ranking between the discourse referents introduced by Sam and Jill is irrelevant to our choice of antecedent for she in the fourth sentence of (D2). We conclude that salience rankings apply only to those possible antecedent discourse referents that have passed the absolute filters. In this example the set ACC'_x, where x is the discourse referent introduced by she, is the singleton consisting only of the discourse referent introduced by Jill. It is important to note that none of the facts that we have mentioned so far are absolutely decisive in determining anaphoric antecedents. Consider for instance the following variant.

(D3) Sam_i really goofs sometimes. Yesterday was a beautiful day and he_i was excited about trying out his_i new twin. He_i wanted Fred_k to join him on a practice flight. He_i called him_k up at 6 am to invite him_k out flying. Fred_k was fast asleep. He_k was furious at being awakened at that hour.

In (D3) factors that call attention to what others have called a "shift" in focus come into play and they in effect balance out the factors that in (D1) were decisive in determining the salience of Sam over Fred. Suppose that the occurrence of 'he' in the sixth sentence of (D3) introduces a discourse referent z. Because of an aspectual shift and the reference to Fred again in the


World knowledge dictates that he in (D1.4) be coindexed with 'Fred', but the salience ordering prefers Sam very strongly, because Sam is so much more salient than Fred. The result is a clash between salience and world knowledge filters, and the coherence of the discourse suffers. The salience of Sam over Fred is due to a number of factors, none of which by themselves would be sufficient to make the coreference of he and Fred infelicitous. The fact that there has been repeated anaphoric linkage to Sam and not to Fred, and the fact that there is a parallelism between the structure of (D1.2), (D1.3) and (D1.4) and that the subjects in (D1.2) and (D1.3) are anaphoric pronouns coindexed with Sam, lead the reader of (D1) to expect with near certainty that he in subject position of (D1.4) should be coindexed with Sam. It is also crucial for such examples that the absolute constraints by themselves don't dictate a clear choice for the anaphoric antecedent of he in (D1.4). So much is obvious from considering the perfectly felicitous (D2), though it is very close in form to (D1).



fifth sentence, Fred is in fact slightly more salient than Sam, but not so much so that, were the discourse continued in a slightly different way, infelicity would result from coindexing the 'he' in the fifth sentence. 47 We draw two morals from these examples. (1) Salience is a matter of degree. The difference between (D1) and (D3) seems to be one of degree of salience. The most salient discourse referent in ACC'_x may be so much more salient than the next most salient discourse referent in ACC'_x that if the most salient discourse referent cannot, because of world knowledge constraints, be identified with the discourse referent introduced by an anaphoric pronoun that has all the appropriate grammatical features relevant to such an identification, the resulting discourse is infelicitous or awkward. (2) Salience is the result of adding up the various preferences of the factors relevant to determining salience in order to assign a salience ranking. Features like recency, parallelism, reiteration, and the like each add something to the overall salience ranking, but none alone is decisive. Our implementation of the salience ranking consists in a series of constraints. These constraints act as filters on ACC'_x, the set of syntactically and semantically acceptable antecedents for x. Each one of the constituent filters determines a particular salience ranking on ACC'_x, and these rankings must be combined in some way to yield a total salience ranking. One might allow each constraint to have a "vote" on the candidate discourse referents, and the discourse referent with the most votes would win. But not all of these constraints have equal sway.
The literature on focus has already made clear in general what sorts of constituent filters are required to determine the overall salience ranking: recency, grammatical function, parallelism, reiteration, and "focus shifting" expressions and configurations like complex demonstratives and other definite expressions, clefts, fronting and aspectual shift. From the few empirical studies we know of on this topic, it appears that recency should be weighted relatively heavily and that discourse referents introduced in the same sentence should get the same weighting from this filter. 48 Two more observations from the literature on focus are that the next most influential factors in determining the total salience ranking are reiteration and parallelism and that these factors are of roughly equal importance. Reiteration occurs when a discourse referent is repeatedly linked with discourse referents subsequently introduced in the discourse by anaphoric pronouns. Whenever anaphoric pronouns repeatedly refer back to some previously introduced discourse referent, this discourse referent tends to become the focus of the discourse. Parallelism concerns both syntactic and semantic constructions. 49 Parallelism also seems to come in degrees; a parallel structure may be strongly reinforced over several sentences as in (D1) or there may be only a weak syntactic parallelism; also an otherwise weak parallel structure may be reinforced by words like too and also. 50 Finally, focus shifting mechanisms also tend to act cumulatively and in



degrees. They often give a relatively high salience to a discourse referent that previously was accorded a low salience. These observations have led us to develop a quantitative and additive model for salience. Since each interpretive constraint provides its own ranking on ACC'_x, we will require that these rankings be additive; if S_1 and S_2 are rankings on ACC'_x, then so is the ranking obtained by summing the particular assignments by S_1 and S_2 to each member of ACC'_x (we will write this ranking as S_1 + S_2). Each filter adds its assignment to the members of ACC'_x to the sum of the weightings of the previously applied salience filters. The last filter in the sequence of filters is the sum of all the previous filters; we will also sum together the weights assigned to candidates in ACC'_x that have been already explicitly identified in the DRS. We shall call this final sum the total salience ranking, or the salience ranking simpliciter when no confusion results. The particular numerical values our filter model uses in the implementation are of course totally arbitrary, since what our model is looking for is differences in degrees of salience, not absolute values of salience. We must also choose cutoff values for the relevant differences in degree of salience. Here the choice of numerical values does make a difference. Good writing will minimize ambiguity, and so in easily understandable texts one discourse referent (the intended antecedent) will be much more salient. If we are generous and fix the bounds for infelicity quite high, then BUILDRS will usually return more than the intended interpretation; fixing the bound for infelicity low makes BUILDRS able to capture intended interpretations in good writing but perhaps misinterpret ambiguous and less good writing. The parameters relevant to the constraints mentioned so far are mostly recoverable either from the f-structures of the discourse's constituent sentences or from the DRS.
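The additive model and the role of cutoff values can be illustrated with a short sketch. It is not taken from BUILDRS: the function names are hypothetical, each salience filter is modelled as a function from a candidate to a number, and the two cutoffs `prefer_gap` and `strong_gap` are invented values chosen only to make the example concrete.

```python
def total_salience(candidates, filters):
    """Sum the values each salience filter assigns to each candidate."""
    totals = {c: 0 for c in candidates}
    for salience_filter in filters:
        for c in candidates:
            totals[c] += salience_filter(c)
    return totals


def preference(totals, prefer_gap=3, strong_gap=6):
    """Classify the lead of the top candidate over the runner-up.

    prefer_gap and strong_gap are hypothetical cutoffs, not values from the text.
    """
    ranked = sorted(totals, key=totals.get, reverse=True)
    if len(ranked) < 2:
        return ranked, "unique"
    gap = totals[ranked[0]] - totals[ranked[1]]
    if gap >= strong_gap:
        return ranked, "strong"      # overriding this choice predicts infelicity
    if gap >= prefer_gap:
        return ranked, "preferred"   # a preference that world knowledge may overturn
    return ranked, "undecided"       # the salience filter does not choose
```

Raising `strong_gap` corresponds to fixing the bound for infelicity high, so that more readings survive; lowering it makes the sketch commit earlier, which suits unambiguous text but risks misreading ambiguous text.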
Let us investigate each one of these constituent filters in a bit more detail. First, the recency of a particular discourse referent y in ACC'_x in a DRS K for a discourse D refers to the point at which y was introduced into K relative to the point at which x was introduced. In the current implementation, this filter assigns values to discourse referents in ACC'_x as follows: if y is the most recently introduced discourse referent in ACC'_x, then salience(y) = 4; if y is introduced into K during the processing of the same sentence as the one in which the most recently introduced discourse referent in ACC'_x is introduced, then salience(y) = 3; if y is introduced during the processing of one sentence previous to the one in which the most recently introduced discourse referent in ACC'_x is introduced, then salience(y) = 2; if y is introduced two sentences previous to the one in which the most recently introduced discourse referent in ACC'_x is introduced, then salience(y) = 1; else salience(y) = 0. Given this assignment of values for the recency filter, we have been to some degree constrained on the numerical values that the reiteration filter assigns to discourse referents in ACC'_x: if y ∈ ACC'_x and if y has been already linked in K with other reference markers n times, then salience(y) = n, but the maximum value reiteration can assign is 4 (that is, repetition should be no more important than recency can be).

The parallelism filter is quite complex and also quite unsatisfactory as it stands. Parallelism is a relatively local effect - it usually operates on two successive clauses, though it can occur throughout a more extended chunk of discourse. There are a variety of clues suggesting parallelism to consider. First, clue words like too and also may add to the salience of a particular discourse referent in ACC'_x. 51 Suppose that such a clue word occurs. The filter then looks to see what is the surface position or grammatical function of the NP introducing x in ACC'_x. It then looks for the NP in the previous clause with the same grammatical function. If that NP introduces y and y ∈ ACC'_x, the filter assigns salience(y) = 2, and it adds 2 for the presence of the clue word. It also checks to see whether in the DRS there are conditions C_1 containing x as an argument and C_2 containing y as an argument such that C_1 ≈ C_2(y/x), where this means that C_2 contains x in the same argument position where C_1 contains y and C_2 is otherwise an alphabetic variant of C_1. If so, the parallelism filter adds 2 to whatever other value it assigns to y by means of the other tests. Second, the parallelism filter may be triggered by a repeated pattern in previous clauses. If x was introduced in K_0 by an NP with grammatical function F and in (at least) the last two previous clauses y and z were introduced in K_0 by NPs with F and there is a u such that u = y and u = z are already conditions in K_0, then salience(u) = 3; otherwise, salience(y) = 0. 52

The least important factor is the surface grammatical function of the NP introducing the discourse referent. Apparently if y in ACC'_x is in subject position, it tends to be more salient than one in object position. Our grammatical function filter assigns the value 2 to a discourse referent in ACC'_x introduced by an NP whose grammatical function is SUBJ, the value 1 to a discourse referent in ACC'_x introduced by an NP whose grammatical function is OBJ or OBJ2, and 0 otherwise. Finally, with respect to salience shifting constructions, we presently only consider aspect shift and the use of proper names or definite descriptions. But since we encode aspectual character using event-type discourse referents and we have left these out of our description of the DRS construction process here for simplicity, we shall not discuss this in detail. The way our constraint works roughly is as follows. Consider a discourse with an aspectual shift from simple past actions or activities to statives in combination with the use of a proper name or definite description introducing a discourse referent x ∈ ACC'_z, where z is a discourse referent introduced by an anaphoric pronoun which occurs in a clause after the aspect shift. If y ∈ ACC'_x is the most salient discourse referent in ACC'_x, then in ACC'_z salience(z) = salience_{ACC'_x}(y) + 3. In

general, however, if there is a salience shift indication, the program stores the final values as features on the set of discourse referents, and they are used in the next computation of salience needed when resolving a subsequent anaphoric pronoun. Salience shift also affects other factors; the program will not consider repetitions of a discourse referent in the next calculation, if those repetitions occurred prior to the focus shift. Let us now return to (D1) and (D3). Suppose (D1) yields a DRS containing discourse referent t_1 for Sam and t_2 for Fred. Let's focus on the conditions and discourse referents contributed by (D1.3) and (D1.4). Consider first the processing of (D1.3):

Our anaphora resolution module begins with z_1. ACC'_{z_1} = {t_1, t_2} (collapsing equivalence classes of discourse referents under =). Since the absolute constraints do not provide a unique solution, BUILDRS begins a salience ranking computation. First it looks to recency. Since Fred and Sam are both mentioned in (D1.2) (Sam must be the anaphoric antecedent of the pronoun in (D1.2) by the absolute constraints), recency (REC) weights them equally with value 4. But now the repetition filter (REP) assigns t_1 the value 4 + 4 = 8 and t_2 the value 4 + 0 = 4. Parallelism (PAR) also prefers t_1 to t_2; it assigns t_1 the value 8 + 3 = 11, and t_2 the value 4 + 0 = 4. Finally, the grammatical function filter (GF) assigns t_1 the value 11 + 1 = 12 and t_2 the value 4 + 2 = 6. There is a huge gap between the values assigned to t_1 and t_2, so the program strongly prefers t_1 as an antecedent and replaces z_1 = [ ] with z_1 = t_1. For z_2, DISREF forces ACC'_{z_2} = {t_2}, so BUILDRS never uses the salience filter. We now pass to the contribution made by (D1.4):

called(z_1, z_2), z_1 = t_1, z_2 = t_2, z_3 was furious ..., z_3 = [ ]


z_1 called z_2, z_1 = [ ], z_2 = [ ]


5. OTHER DEFINITE NPS AND ANAPHORA RESOLUTION

We return now to examine briefly how other definite NPs besides anaphoric pronouns fare with respect to the filter model for anaphora resolution we have developed. As DR theory and the familiarity theory of definiteness require, discourse referents introduced by definites need to be linked up to the appropriate, accessible discourse referents. A discourse referent x introduced by a definite description or proper name, however, appears to be able to be identified with a discourse referent that is not the most salient in ACC'_x far more felicitously than were x introduced by an anaphoric pronoun. Consider the following variants of (D1): 53

(D4) Sam_i really goofs sometimes. Yesterday was a beautiful day and he_i was excited about trying out his_i new twin. He_i wanted Fred_k to join him on a practice flight. He_i called him_k at 6am. Fred_k was furious at being awakened at that hour.

(D5) Sam_i really goofs sometimes. Yesterday was a beautiful day and he_i was excited about trying out his_i new twin. He_i wanted Fred_k to join him on a practice flight. He_i called him_k at 6am. The potential passenger_k was furious at being awakened at that hour.


With respect to z_3, BUILDRS again must resort to the salience filter. Salience_REC(t_1) = 4 = Salience_REC(t_2); Salience_REP(t_1) = 4 + 5 = 9 and Salience_REP(t_2) = 4 + 1 = 5; Salience_PAR(t_1) = 9 + 3 = 12 and Salience_PAR(t_2) = 5 + 0 = 5. Salience_GF(t_1) = 12 + 1 = 13 and Salience_GF(t_2) = 5 + 2 = 7. The program again strongly prefers t_1 as an antecedent (almost twice as much), so that the program predicts infelicity when world knowledge forces us to choose t_2. Turning to (D3), we note that the discourse referent introduced by the last occurrence of 'he' in (D3), call it z_4, also has as its ACC' set {t_1, t_2}. But here the computation for salience provides almost balanced results due to the presence of a salience shifting expression. Salience_REC(t_2) = 4; Salience_REC(t_1) = 2; Salience_REP(t_1) = 2 + 4 = 6 and Salience_REP(t_2) = 4 + 1 = 5; Salience_PAR(t_1) = 6 + 0 = 6 and Salience_PAR(t_2) = 5 + 2 = 7. Salience_GF(t_1) = 6 + 2 = 8 and Salience_GF(t_2) = 5 + 2 = 7. Salience_FS(t_2) = 7 + 3 = 10, Salience_FS(t_1) = 8. Although t_2 is now the preferred antecedent for z_4 in (D3), the discrepancy between t_1 and t_2 is not so great that infelicity occurs should world knowledge dictate that t_1 rather than t_2 be the antecedent. But if (D3) is continued in such a way that t_2 becomes repeated frequently, then it will become infelicitous to identify a discourse referent introduced by an anaphoric pronoun with t_1.
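The running totals in these two computations can be checked mechanically. The sketch below is only a bookkeeping aid: the per-filter contributions are read off the totals quoted above rather than computed from the discourse, the dictionary layout and function name are hypothetical, and for t_2 in (D3) the total quoted after the grammatical function step stays at 7, so that step's contribution is taken here to be 0.

```python
from itertools import accumulate

# Per-filter contributions read off the running totals quoted in the text.
D1_4 = {
    "t1": {"REC": 4, "REP": 5, "PAR": 3, "GF": 1},
    "t2": {"REC": 4, "REP": 1, "PAR": 0, "GF": 2},
}
D3 = {
    "t1": {"REC": 2, "REP": 4, "PAR": 0, "GF": 2, "FS": 0},
    "t2": {"REC": 4, "REP": 1, "PAR": 2, "GF": 0, "FS": 3},  # GF read as 0 (assumption)
}


def running_totals(contributions):
    """Return the cumulative salience after each filter, in filter order."""
    return {ref: list(accumulate(vals.values()))
            for ref, vals in contributions.items()}
```

For (D1.4) the final totals come out as 13 for t_1 against 7 for t_2, and for (D3) as 8 for t_1 against 10 for t_2, which matches the large gap in the first case and the near balance in the second.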


6. CONCLUSION

According to the filter model proposed here, the list of accessible discourse referents, which is itself determined by the logical structure of the discourse, has been pared down by various "grammatical feature" filters. These filters on the list of potential antecedents for an anaphoric pronoun or other definite NP are absolute. If they fail to produce a unique antecedent, BUILDRS resorts to ranking the candidates according to salience. The salience ranking is the result of applying several filters, each assigning a particular weighting to the members of ACC'_x, where x is the discourse referent introduced by the anaphoric pronoun or definite. BUILDRS allows the user to determine weightings and even the constituent filters determining the salience ranking. The model suggests that by postponing the most difficult tasks in the process of anaphora resolution, we may actually be able to avoid them in all except the worst possible cases. Of course, the list of feature mechanisms relevant to resolving pronominal anaphora and understanding definite descriptions discussed here should not be taken to be complete. We have not talked in detail of the constraints dependent on world knowledge, or at all of those concerning global focussing or conversational planning discussed in Grosz & Sidner (1985). But we must leave the question of where these fit in within the DR-theoretic approach to discourse for another time.

ACKNOWLEDGEMENT We have benefitted from discussions with Hans Kamp, Lee Baker, Franz Guenthner, Werner


Although also very close to (D l ) , (D4) and (D5) are perfectly felicitous. As with (D2), here again the relative salience of one discourse referent over another is irrelevant. In keeping with t he line of explanation that we ad vanced in the case of (D2), we hypothesize that definite descriptions and proper names introduce another absolute or deterministic filter on the set o f accessible discourse referents . For a definite description of the form ' 'the a" introducing a discourse referent x, this filter simply checks for each y i n ACC 'x the conditions applicable to y in a database that contains all those conditions explicitly introduced in the DRS or derived from them by a restricted inferencing component. s4 I f a(y) is a condition in the database, then y passes the filter; otherwise not. We will call the result of filtering ACC 'x with this content filter ACC*x· An analogous filter is defined for proper names. The salience ranking for potential antecedents for x, where x is introduced by a definite is defined on ACC*x· The presence of the extra filter on the set of potential antecedents appears to account for many cases where a definite description makes reference back to a discourse entity that is " no longer in focus . "

3 39 Frey, Andy Schwartz, Henk Zeevat and Ede Zimmerman. We thank the Center for Cognitive Science at the U niversity of Texas at Austin and to the Seminar fuer Natur-Sprachliche Systeme at the University of Tuebingen for research support.

Center for Cognitive Science
Department of Philosophy and Department of Linguistics
The University of Texas at Austin

NOTES

1. In the recent computational literature see Reichman (1978), Sidner (1979), (1983) and Grosz, Joshi & Weinstein (1983, 1985; Joshi & Weinstein 1986). Of course many of the linguistic observations concerning focus and topic have been around for a much longer time. For a survey and bibliography see Smith (1985).

2. We take the familiarity theory of definiteness to be an integral part of the story DR theory has to tell about definiteness and indefiniteness. The DR-theoretic formalization of the familiarity theory is developed in Heim (1982).

3. For an introduction to DR theory as we will be assuming it, see Kamp (1986), Asher (1986), and Wada and Asher (1986). The LFG component we have used is that detailed in Bresnan and Kaplan (1982). The fragment treated by BUILDRS discussed in Wada & Asher (1986) contained indefinite singular and quantificational singular noun phrases, anaphoric pronouns, intransitive, transitive, ditransitive verbs, control and attitude verbs, relative clauses, possessives, and compound sentences using sentential conjunction; we have since expanded the implementation to handle definite descriptions and some prepositional phrases.

4. Although there are extensions to the original fragment covered by DR theory that incorporate discourse referents of other than individual type, we have not implemented any rules that employ them and so we shall ignore them here. For more details, see Kamp (1981), (1986), Wada & Asher (1986).

5. The mapping from F-structures to DRSs is discussed in detail in Wada & Asher (1986). It owes a good deal to the work in Frey (1985). A good deal of work in the area of DRT implementation is now published. Although we are aware of several efforts in this area besides our own (Klein & Johnson (1986), Sedogbo (1986), Guenthner, Lehmann & Schonfeld (1985)), it is the latter that is most relevant. Guenthner & Lehmann advocate a hierarchy of distinct filters on potential antecedents that is stricter and more sequentially bound than ours. They apply morphological, syntactic and semantic constraints using separate mechanisms, whereas we have integrated the anaphora resolution into one process using constraints from a variety of sources. Guenthner et al. have a separate set of constraints defined on the syntactic structures of sentences in a discourse; we do not. We have also incorporated an explicit set of discourse constraints which they do not discuss in any detail.

6. Some of the developments in LFG binding theory are reported in Sells (1985).

7. See for instance Roberts's (1986) mating of GB syntactic constraints on anaphora with DR theory's.

8. Note that conversion here in these complex cases involves merging lists of discourse referents as well as simple application. This description is only intended as a sketch, however; the actual implementation of conversion is quite complicated. For a treatment of more complex NPs such as those containing relative clauses see Wada and Asher (1986).

9. For more details see Wada and Asher (1986).

10. For details see Van Eyck (1985), Asher (1986).

11. The definition of accessibility here differs from that discussed and implemented in the original BUILDRS of Wada & Asher (1986). We have separated out the implicit processing constraint in that earlier definition and made it part of the separate constraint SCOPE defined below.

12. One might also use other theories of discourse semantics like Seuren's to furnish logical constraints on anaphora. See Seuren (1985).

13. In the original BUILDRS (1986) program, the only features in the list were the number and gender of NP that introduced the discourse referent and the feature (+)Refl.

14. See Kamp (1981, 1988).

15. For details on d,e see Asher (1987). With some additional machinery developed in that paper, we can also explain the anaphoric links involved in Geach's Hob Nob sentence given below:

Hob believes that a witch_i has blighted his mare. Nob believes she_i has killed his cow.

16. Apparently, Partee has called such a phenomenon the "telescoping effect." See Roberts (1987) for a discussion.

17. See Roberts (1987) for a treatment of this and similar cases involving modalities.

18. Some examples (see Fodor and Sag (1982)) appear to indicate that subordinated readings are possible even without the presence of quantificational or modal elements (which may of course only emerge upon close reanalysis). These readings appear to rely, however, on some sort of notion of thematic continuity. We have no idea how to capture such cases without appeal to detailed world knowledge; we hope to investigate such subordination cases using knowledge bases in future work.

19. As in subordinated readings. There are other complicated sorts of constructions that require interpretation to fit the accessibility constraint. Consider for instance (5g) (again due to Barbara Partee originally we think):

(5) g. Either Fred does not own a car, or it is in the garage.

The accessibility constraint would dictate that if this sentence is translated in the obvious fashion into a DRS, the discourse referent introduced by the pronoun cannot be identified with the discourse referent introduced by a car. But the intended meaning of this sentence is, we think, very close to an exclusive disjunction with a hidden ellipsis: Fred does not own a car, or he does own a car and it is in the garage. If this elliptical reading of disjunction is plausible, then the accessibility constraint predicts that the intended anaphoric link is admissible. Disjunctive sentences where the subject of the first disjunct is a universally quantified or definite NP seem particularly susceptible to this interpretation. We first heard of this solution from Hans Kamp.

20. David Gadbois of the University of Texas at Austin is currently expanding and refining the implementation of scope in BUILDRS.

21. The latter condition formulates within DRT a criterion of functional dependency of one NP on another.

22. Assuming of course that a subordinated reading of the subsequent material is not available. Our tests indicate that there is none, and most speakers seem to concur with the prediction.

23. This was already implemented in Wada & Asher (1986). This asymmetric behavior of indefinites and definites also occurs in certain discourse contexts:

(a) If he_i were to come home before we clean up this mess, I would be afraid. John_i would get so angry that he might do anything.
(b) If he_i were to come home before we clean up this mess, I would be afraid. A man_i would get so angry that he might do anything.

By appealing to the notion of modal subordination in the construction of the DRSs for (a) and (b), we explain the discrepancy between (a) and (b) in an exactly analogous way to the explanation for the discrepancy between (7a) and (7b).

24. It appears that indefinite descriptions with "enough content" can have referential or specific indefinite readings. Such "specific" indefinites would have just the same status vis a vis accessibility as non-dependent definites. We believe that it is the specificity or the definiteness of the crossing coreferential NPs considered together as a pair that makes Bach-Peters sentences with indefinites like the one below acceptable even though it closely resembles the questionable examples in (10b, d, f):

A man_i who hardly knows her_j loves a woman_j who scorns him_i.

The ungrammaticality or difficulties with (10b, d, f) stem from speakers' difficulties in getting a referential reading for the indefinite NPs in them with minimal descriptive content. We conclude from such examples and others that definiteness is not determined simply by the determiner but by the total structure and content of the NP. With respect to (DEF), we will interpret referentially interpreted indefinites like non-dependent definites.

25. We are indebted to Andy Schwartz for this formulation.

26. A similar constraint might take account of the data concerning questions, but our fragment does not yet encompass them. David Gadbois of the University of Texas is working on this aspect of the program too.

27. This rule for scope is consonant with an observation made about an order dependence of indefinites that was built into DR theory from the start: indefinites are as a rule (but see Fodor and Sag (1982) for some exceptions) supposed to introduce new discourse individuals into the discourse.

28. It is also essential in analysing distributivity readings of plural definites and indefinites.

29. We will not fully process the conditions in the DRS below.

30. The acceptability of (6d) depends on whether someone has wide or narrow scope over the PP in his wallet. Our theory currently treats the semantic structure of b as completely linear. If it is not linear, DR theory will predict the sentence to be good as in (6b). In DR theory currently temporal adverbs do contribute scope distinctions, and perhaps the same might be said of locative adverbs. If that is the case, scope rearrangement even of existentials might be important to get certain anaphoric facts to work out.

31. Substituting the indefinite NP a man produces, as we would hope, the same effect.

32. There are, however, well-known difficulties for those who adopt such a conclusion. Specifically, these difficulties occur when clauses occur in ADJUNCT position. Compare for instance (13h, i) with (13f, g) above:

(13) h. Near the place he_i lived, Dan_i saw a snake.
     i. Near the place Dan_i lived, he_i saw a snake.

Neither C-command nor function-argument constraints easily account for the difference.

33. For a discussion of this example and of the difficulties of interpreting indices generally in syntactic anaphoric binding theories, see Roberts (1987).

34. This distinguishes our approach from that of Reinhart (1983) or Roberts (1987). Reinhart distinguishes bound anaphora and other sorts of coreference. In bound anaphora the pronoun is treated as a bound variable. She claims that only bound anaphora obeys the principles of the binding theory. She also develops a pragmatically based account of disjoint reference along the following lines. Where a speaker uses a pronoun as a bound anaphor (and thus obeys the relevant principles of binding theory), he may not use a pronoun that is not interpreted as a bound variable. Roberts also contemplates two sorts of binding relations, C-command binding and discourse binding. The C-command binding is completely determined at the syntactic level and forces coreference where the same indices are assigned to two NPs. The accessibility relation of DR theory determines the constraint on discourse binding she explicitly mentions. Roberts also adopts a pragmatic construal of discourse reference. She assigns rankings to the various kinds of bindings and the pragmatic strategy is that the speaker must always use the strongest binding potential of the sentence's grammatical structure. Roberts also has a pragmatic rule for interpretation, which is that the hearer must assume that if the speaker does not take advantage of the strongest binding potential, then unless there are reasons to avoid that binding, he does not intend the expressions to corefer.

35. For a theory linking surface grammatical functions with thematic roles, see Dowty (1987).

36. We hypothesize also an indefinite number of open slots to be filled by discourse referents introduced by NPs in ADJUNCT position; typically ADJUNCTs provide arguments on the event of the main clause. Note that when fronted sentential PPs appear to change "thematic role," they function as a backgrounding role for the event being talked about rather than some role within the event. Nevertheless, this remark does not imply that sentences with fronted PPs differ in truth conditions from sentences with sentential PPs in normal position. This is a difference.

37. Note that our disjoint reference constraint also allows the following discourse to be grammatical:

John_j thinks that everyone hates him_j. Well, it's not true. Jim likes him_j. Mary likes him_j and even John likes him_j.

All discourse referents introduced by the occurrences of him in the second sentence are identified with each other and with the discourse referent introduced by John in the first sentence. It is an implicature that the John in the second sentence is the same as the John in the first sentence, but nothing forces this identification in the DRS or, more to the point, the identification of the discourse referent introduced by the occurrence of him in the last clause with the discourse referent introduced by the occurrence of John in the last clause.

38. Perhaps some of the explanatory machinery devised by P. Sells (1986) might be useful here but that is just a guess. We hope to do some more work on this topic.

39. We are indebted to Carlota Smith for pointing out this constraint to us.

40. This appears to be a rule of "focus shifting," but unlike the other rules concerning focus and salience, it does appear to be absolute.

41. In a full theory that treats the problem of discourse segmentation and global focus, ACC'_x will be subject to further constraints. For a discussion of global focussing mechanisms, see Grosz (1977), Grosz & Sidner (1985), and Guindon et al. (1986). Such constraints would limit the size of ACC'_x.

42. Since anaphoric pronouns themselves carry little in the way of lexical semantic information, one way of using world knowledge within the BUILDRS program comes to mind that is similar to the content filter used for definite descriptions (see below, section 5). It is a type of lexical constraint defined relative to the "thematic roles" of the verb arguments in which the pronoun and its potential antecedents occur. Among such lexical constraints might be some that were absolute; others might resemble more closely the behavior of the salience constraints in BUILDRS.

43. Sidner (1983) also advocates the use of world knowledge to check bindings suggested by a focuslike mechanism.

44. We are, however, indebted to the work of Sidner (1979, 1983) and Grosz, Joshi and Weinstein (1981, 1983, 1986) on focus and centering. We have used many of the factors they apply to determining focus to define a preference ordering for the anaphora resolution module by means of a salience ranking.

45. Many researchers like Grosz, Joshi and Weinstein talk about the backward center of the previous sentence in a discourse, but it seems to us that one needs to generalize this notion at least so that the center is the most salient discourse individual so far introduced in the discourse. Perhaps Grosz, Joshi and Weinstein's backward center for the previous sentence is the most salient individual in the discourse, but we are not completely sure on this point.

46. Smith (1985) contains a detailed study of such problems. Sidner (1983) countenances two foci - a discourse focus and an actor focus, it should be noted. But as she always ranks one of these foci above the other, her theory also embodies essentially the one focus view.

47. Consider as a replacement sentence for (D4.5), He_i had to go flying alone.

48. See Guindon et al. (1986).

49. Sidner (1979) uses the constraints of reiteration and parallelism.

50. For instance compare Sidner's (1979) example with 'too' to one without it. We find that our intuitions on these examples are quite different:

(a) The violet is commonly found near the green whittierleaf_i. The wild rose is found near it_i too.
(b) The violet_j is commonly found near the green whittierleaf_i. The wild rose is found near it_i/j.

(b) strikes us as completely ambiguous, even somewhat infelicitous, whereas in (a) there is a clear preference for the coindexing that Sidner notes.

51. This filter should be generalized to look also for contrast as a way of making a discourse referent salient, and perhaps other rhetorical relations should be looked for. But this is not an issue that we can go into here.

52. We need to look two clauses back so that the parallelism filter will not be confused by the "listing" phenomenon: "My friends Bill_i and Harry_j are so crazy. He_i is going around the world in a dinghy, and he_j wants to hanglide in the Himalayas." Note, however, that we do not need to look back more than two clauses for such a parallelism effect, since that will involve a violation of PRES. Note also that given the current state of DRS structure, the parallelism filter is not sufficient to capture parallelism of "argument" or discourse structure. We plan to remedy this by augmenting the purely logical structure of a DRS with information pertaining to the global structure of the discourse.

53. Data concerning this phenomenon are also well known; we have adapted an example of Grosz et al. (1986) again.

54. We have not as yet implemented any such inferencing component. This should be taken to be a placeholder for either some extant program or future research.

REFERENCES

Asher, N. 1986: Belief in Discourse Representation Theory. Journal of Philosophical Logic 15.
Asher, N. 1987: A Typology for Attitude Verbs and Their Anaphoric Properties. Linguistics and Philosophy 10.
Bach, E., & Partee, B. 1980: Anaphora and Semantic Structure. In: K.J. Kreiman & A.E. Ojeda (eds.), Papers from the Parasession on Pronouns and Anaphora, Chicago Linguistic Society, Chicago.
Bresnan, J., & Kaplan, R. 1982: Lexical-Functional Grammar: A Formal System for Grammatical Representation. In: J. Bresnan (ed.), The Mental Representation of Grammatical Relations, MIT Press.
Evans, G. 1980: Pronouns. Linguistic Inquiry 11.
Fodor, J., & Sag, I. 1982: Referential and Quantificational Indefinites. Linguistics and Philosophy 5.
Freidin, R., & Lasnik, H. 1981: Disjoint Reference and Wh-Trace. Linguistic Inquiry 12.
Grosz, B., Joshi, A., & Weinstein, S. 1983: Providing a Unified Account of Definite Noun Phrases in Discourse. ACL Proceedings.
Grosz, B., Joshi, A., & Weinstein, S. 1986: Toward a Computational Theory of Discourse Interpretation. Manuscript.
Grosz, B., & Sidner, C. 1985: The Structure of Discourse Structure. SRI Technical Note 369.
Guenthner, F., Lehmann, H., & Schonfeld, W. 1985: A Theory for the Representation of Knowledge. IBM J. Res. Develop., January.
Guindon, R., Sladky, P., Brunner, H., Conner, J. 1986: The Structure of User-Adviser Dialogues: Is there Method in their Madness? MCC report.
Heim, I. 1982: The Semantics of Definite and Indefinite Noun Phrases. Ph.D. dissertation, University of Massachusetts.
Joshi, A., & Weinstein, S. 1981: Control of Inference: Role of Some Aspects of Discourse Structure: Centering. Proc. IJCAI.
Kamp, H. 1981: A Theory of Truth and Semantic Representation. In: J. Groenendijk, Th. Janssen, & M. Stokhof (eds.), Formal Methods in the Study of Language, Mathematisch Centrum Tracts, Amsterdam.
Kamp, H. 1986: SID Without Time or Questions. Forthcoming CSLI report.
Kamp, H. 1988: Conditionals in DR Theory. Manuscript.
Karttunen, L. 1976: Discourse Referents. In: J.D. McCawley (ed.), Syntax and Semantics, Academic Press, New York.
Klein, E., & Johnson, M. 1986: Discourse, Anaphora and Parsing. COLING Conference Proceedings.
May, R. 1985: Logical Form: Its Structure and Derivation. MIT Press.
Reichman, R. 1978: Conversational Coherency. Cognitive Science 2.
Reinhart, T. 1983: Anaphora and Semantic Interpretation. University of Chicago Press.
Roberts, C. 1986: Modal Subordination, Anaphora, and Distributivity. Ph.D. Thesis, University of Massachusetts.
Sedogbo, C. 1986: Extending the Expressive Capacity of the Semantic Component of the OPERA System. COLING Conference Proceedings.
Sells, P. 1985: Lectures on Contemporary Syntactic Theories. CSLI Lecture Notes Vol. 3.
Sells, P. 1986: On the Nature of Logophoricity. In: A. Zaenen (ed.), Studies in Grammatical Theory and Discourse Structure, Volume 2: Logophoricity and Bound Anaphora.
Seuren, P. 1985: Discourse Semantics. Blackwell.
Sidner, C. 1979: Toward a Computational Theory of Definite Anaphora Comprehension in English. MIT Technical Report AI-TR-537.
Sidner, C. 1983: Focusing in the Comprehension of Definite Anaphora. In: M. Brady, & R. Berwick (eds.), Computational Models of Discourse, MIT Press.
Smith, C. 1985: Sentence Topic in Texts. Studies in the Linguistic Sciences 15.
Van Eyck, J. 1985: Aspects of Quantification in Natural Language. Ph.D. Thesis, University of Groningen.
Wada, H., & Asher, N. 1986: BUILDRS: An Implementation of DR Theory and LFG. COLING Conference Proceedings.

Journal of Semantics 6: 345-367

MOTION IMPERATIVES

ROLF MAYER

ABSTRACT

In this paper a restricted sample of motion imperatives is treated within the framework of discourse representation theory. In order to pave the way for this treatment, the concept of path and linguistic aspects of path connection are discussed. The semantic analysis is then extended in the pragmatic direction: it is shown how semantic inferences may be filtered out via pragmatic considerations. We suggest that a level of execution structure is needed to supplement the level of semantic representation. Motion imperatives are evaluated against maps, and aspects of executability are discussed. It is finally shown how the deontic function of motion imperatives can be fulfilled by texts in the indicative mood and that the criteria of adequacy valid for motion imperatives then have to be met by motion indicatives.

1. INTRODUCTION¹

We are taking as our starting-point a fictitious map in the following form where movement along the edges is possible in any direction:

Fig. 1: The Map M

The object to be moved is represented as a black circle located at one of the "places". A crucial part of the ideas to be presented here has been implemented in PROLOG. The system does the following: If the instructions given as an input are linguistically and logically free of errors (or can be "interpreted" by the object), the object complies with the instructions. After execution, further instructions may be given. The implementation simulates the situation where the object has complete "knowledge" of the map (which means the object can "find" a path from any place to another place).² The instructions make use of a small lexicon of German containing the following items (in some places we will, however, go beyond the fragment):

(i) Verbs in the imperative mood: Reise ("Travel"), Fahre ("Go")
(ii) Prepositions: von ("from"), über (in the sense of "via"), nach ("to")
(iii) Place names: A, B, C, D, E, F, G, H, I
(iv) Adverbs: dann ("then"), dabei (in the sense of "when doing so"), weiter ("farther"), zurück ("back"), (von) dort ("(from) there")

Consider an example (assume the object is at A):

First input: Fahre von A nach C. ("Go from A to C.") Fahre weiter nach F. ("Go farther to F.")
Execution: The object moves from A to C and then to F.
Second input: Fahre zurück. ("Go back.")
Execution: The object moves from F to A.

In the following we are not so much interested in matters of implementation but in the interaction of semantics and pragmatics at the level of discourse structure. Although our interest in this paper is mainly focused on imperatives,³ much that will be said is also relevant with regard to motion texts in the indicative mood.⁴ The aspects discussed below include the following:

a) conceptual machinery related to spatial locations, paths, and path connection
b) coherence (how is the connection of paths linguistically expressed?)
c) discourse representation structures for motion imperatives
d) the logic of (motion) imperatives; semantically versus pragmatically licensed inferences
e) the evaluation of motion imperatives against maps; issues of executability.
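The behaviour described for the example inputs can be imitated in a few lines. The sketch below is in Python rather than the PROLOG of the original system, and everything in it is an assumption made for illustration: the edge list (the map M of Fig. 1 is not reproduced here), the names `find_path` and `MovingObject`, the reading of zurück as "return to where the previous input started", and the decision to reject a start adverbial that does not match the object's position.

```python
import re
from collections import deque

# Hypothetical edges: the paper's map M (Fig. 1) is not available here.
EDGES = [("A", "B"), ("B", "C"), ("C", "F"), ("A", "D"), ("D", "E"),
         ("E", "F"), ("F", "G"), ("G", "H"), ("H", "I")]
GRAPH = {}
for a, b in EDGES:  # edges can be travelled in either direction
    GRAPH.setdefault(a, set()).add(b)
    GRAPH.setdefault(b, set()).add(a)

def find_path(start, goal):
    """Breadth-first search; returns the places from start to goal, or None."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(GRAPH[path[-1]]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Covers only part of the fragment: optional start adverbial, optional connector.
MOVE = re.compile(r"(?:Fahre|Reise) (?:von ([A-I]) )?(?:weiter |dann )?nach ([A-I])")

class MovingObject:
    def __init__(self, position):
        self.position = position
        self.origin = position  # where the previous input started

    def execute(self, text):
        """Carry out one input (one or more imperatives); return places visited."""
        start_of_input = self.position
        visited = [self.position]
        for sentence in (s.strip() for s in text.split(".")):
            if not sentence:
                continue
            if sentence in ("Fahre zurück", "Reise zurück"):
                goal = self.origin
            else:
                match = MOVE.fullmatch(sentence)
                if match is None:
                    raise ValueError(f"cannot interpret: {sentence!r}")
                source, goal = match.groups()
                if source is not None and source != self.position:
                    raise ValueError(f"object is at {self.position}, not {source}")
            path = find_path(self.position, goal)
            if path is None:
                raise ValueError(f"no path from {self.position} to {goal}")
            visited.extend(path[1:])
            self.position = goal
        self.origin = start_of_input
        return visited

obj = MovingObject("A")
first = obj.execute("Fahre von A nach C. Fahre weiter nach F.")
second = obj.execute("Fahre zurück.")
print(first, second, obj.position)
```

On the assumed map the first input takes the object from A through C to F, and the second input brings it back to A, matching the two executions described above.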

2. LOCATIONS AND PATHS

I assume that an object x located in space defines a minimal set of spatial points Loc(x,t) (determined by the object's boundaries) relative to a point of time t. We call this set its minimal location. The following principle holds:

P: If Loc(x,t) = l and Loc(x,t) = l', then l = l'

P says that an object x (located in space) has exactly one minimal location relative to a point of time t. A path can be defined in the following way:

Def.: A path p is, relative to an object x, a continuous function from a temporal interval T onto a set of minimal (spatial) locations Loc(x,t). We call p(t(s)) the starting position, p(t(e)) the ending position of p.

This definition corresponds to the specification of "path" given in Wunderlich and Herweg (1986). Spatial locations as occurring in the above definition may be conceived of as the constituents of "absolute Newtonian space". However, as such they are never perceived. Instead, commonsense experience (and modern science) relies on objects to define locations. In particular, language allows the possibility of locating objects and events x by referring to frame locations l° (fixing "search domains" in the sense of Miller and Johnson-Laird (1976)) such that Loc(x,t) ⊂ Loc(l°,t) (where ⊂ denotes the subset relation); typical examples of frame locations are towns and countries. If we model paths by sequences of spatial frame locations f: ⟨l°(1) ... l°(n)⟩ (where we assume that the spatial intersection of l°(i) and l°(i + 1) is empty, with i ranging from 1 to n - 1), we arrive at a concept of path frame that will play a major role in section 6. The following definition explicitly relates paths and path frames:

Def.: We call path p a realization of path frame f: ⟨l°(1) ... l°(n)⟩ iff the following holds:
(1) p(t(s)) ⊂ Loc(l°(1), t(s)); p(t(e)) ⊂ Loc(l°(n), t(e))
(2) there are t(i) < t(i + 1) (for i = 1 ... n - 1) such that p(t(i)) ⊂ Loc(l°(i), t(i)) and p(t(i + 1)) ⊂ Loc(l°(i + 1), t(i + 1)).
( < denotes the relation of temporal precedence)

Before we define two notions of path connection, we will introduce the concepts of a stationary path and a strictly stationary path as auxiliary notions (where the former is implied by the latter).

Def.: p is a stationary path relative to l° iff p(t(s)) ⊂ Loc(l°, t(s)) and p(t(e)) ⊂ Loc(l°, t(e)).

For p to be a stationary path relative to l°, it will do if l° is the start location and the goal location of p.

Def.: p is a strictly stationary path relative to l° iff p(t) ⊂ Loc(l°, t) for all t in the domain of p.

A strictly stationary path p relative to l° is e.g. established if after moving to l° a person stays there (before possibly continuing his/her movement). We are now ready to define the notions of weak and strong path connection (we presuppose that the paths to be connected have the same moving object).

Def.: Path p (domain: T) is strongly connected with path p' (domain: T') iff
(i) T meets T'
(ii) p(t(e)) = p'(t(s)),
where the meet-relation is defined as follows:

Def.: If T and T' are temporal (closed) intervals, T meets T' iff the end point of T and the starting point of T' coincide.

If path p is strongly connected with path p', p ::::: p' represents the set-theoretical union of p and p'. For reasons of conceptual elegance we assume existence of the empty path p_e such that the following holds: p ::::: p_e = p for all p.

Def.: Path p is weakly connected with path p' (relative to l°) iff
(a) p is strongly connected with p' or
(b) (i) there is a path p" such that p" is stationary relative to l°
    (ii) p is strongly connected with p"
    (iii) p" is strongly connected with p'.

If path p is weakly connected with path p' relative to l°, p - p' represents the set-theoretical union of p, p' and p" (with p" the (possibly empty) linking path in the sense of the above definition). Note that strong path connection implies weak path connection. Keeping l° fixed, we can say that weak path connection is an asymmetric relation: If p is weakly connected with p' (relative to l°), p' is not weakly connected with p (relative to l°).

If paths p and p' are weakly (strongly) connected relative to l°, we could also say that the corresponding motion events are weakly (strongly) connected relative to l°. With regard to our fragment, the parameter l° as occurring in the definition of weak path connection is assumed to be filled via goal adverbials.
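The definitions of stationary paths and of strong and weak path connection can be made concrete in a discrete setting. The following sketch rests on several simplifying assumptions that are not part of the paper: time is sampled at integer points, a minimal location is just a label, a frame location is a fixed set of such labels, and a stationary linking path is taken to exist whenever the first path ends inside the frame and the second starts inside it at a later time. All names are illustrative.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

@dataclass(frozen=True)
class Path:
    """A path sampled at integer times: positions[k] is the location at t_start + k."""
    t_start: int
    positions: Tuple[str, ...]  # assumed non-empty

    @property
    def t_end(self) -> int:
        return self.t_start + len(self.positions) - 1

    @property
    def start(self) -> str:
        return self.positions[0]

    @property
    def end(self) -> str:
        return self.positions[-1]

def meets(p: Path, q: Path) -> bool:
    """T meets T' iff the end point of T and the starting point of T' coincide."""
    return p.t_end == q.t_start

def strongly_connected(p: Path, q: Path) -> bool:
    return meets(p, q) and p.end == q.start

def is_stationary(p: Path, frame: FrozenSet[str]) -> bool:
    """Start and end position both lie inside the frame location."""
    return p.start in frame and p.end in frame

def is_strictly_stationary(p: Path, frame: FrozenSet[str]) -> bool:
    """Every position of the path lies inside the frame location."""
    return all(loc in frame for loc in p.positions)

def weakly_connected(p: Path, q: Path, frame: FrozenSet[str]) -> bool:
    """Clause (a): strong connection. Clause (b): a stationary linking path
    could bridge the gap, which is assumed possible whenever p ends in the
    frame and q starts in the frame at a later time."""
    if strongly_connected(p, q):
        return True
    return p.t_end < q.t_start and p.end in frame and q.start in frame

town_a = frozenset({"a1", "a2"})          # frame location with two minimal locations
p = Path(0, ("e", "x", "a1"))             # arrives in town A at time 2
q = Path(2, ("a1", "y", "f"))             # leaves from the same spot at time 2
r = Path(5, ("a2", "y", "f"))             # leaves from elsewhere in A at time 5

print(strongly_connected(p, q))           # True
print(strongly_connected(p, r))           # False
print(weakly_connected(p, r, town_a))     # True
print(weakly_connected(r, p, town_a))     # False
```

The last two calls illustrate, for this pair of paths, the asymmetry noted above: with the frame kept fixed, p is weakly connected with r but not the other way round.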

3. COHERENCE

"Coherence" is taken to mean that sentences are properly linked with regard to each other. Conditions of coherence determine the conceptual and linguistic connection of paths; they tell us in particular when the construction of a complex path is linguistically licensed. Since in the context of the present paper we are primarily interested in "unmarked" sequences of motion imperatives, weak path connection counts among the conditions of coherence.

With regard to our small lexicon, the weak connection of paths can be established not only by the presence of properly chosen start adverbials (i.e. adverbials that "link up" with goal adverbials of the preceding sentence⁵) but also by text connectors such as dann and weiter. Here is a sample of miniature texts:

(1) Fahre nach A. Fahre von A nach F. ("Go to A. Go from A to F.")

(2) Fahre nach A. Fahre von dort nach F. ("Go to A. Go from there to F.")

(3) Fahre nach A. Fahre dann nach F. ("Go to A. Go then to F.")

(4) Fahre nach A. Fahre weiter nach F. ("Go to A. Go farther to F.")

(5) Fahre nach A. Fahre nach F. ("Go to A. Go to F.")

(5') Fahre nach A. Fahre nach F. Beide Fahrten sind interessant. ("Go to A. Go to F. Both journeys are interesting.")

(6) (Object is assumed to be at A) Fahre von B nach F. ("Go from B to F.")

(1)-(4) represent "unmarked" instructions, in contrast to (5) and (6). If we want an object to move to A and then to F, (5) isn't the conventionalized way of expressing oneself. Neither is (6) if we want to tell the object to move from A to F via B. This does not mean, however, that no appropriate contexts can be found for (5) and (6); (5') is a case in point.⁶ We will now further concentrate on the discourse particles dann and weiter; later on we will also deal with the text connectors zurück and dabei (in so far as these particles are related to our subject matter).
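The linkage condition illustrated by the miniature texts can be sketched procedurally. The following Python fragment is only an illustration under simplifying assumptions, not the definition used in the paper: the names `Instruction`, `links_up` and `is_unmarked` are hypothetical, places are plain strings, and "dort" stands in for the proadverbial von dort. A sentence counts as linked if its start adverbial takes up the preceding goal, or, when no start adverbial is present, if it carries one of the connectors dann or weiter.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Text connectors named in the paper; treating them as a plain set is an assumption.
CONNECTORS = {"dann", "weiter"}


@dataclass(frozen=True)
class Instruction:
    """One motion imperative, reduced to its adverbials (hypothetical encoding)."""
    goal: str                         # goal adverbial, e.g. "F" for "nach F"
    start: Optional[str] = None       # start adverbial, e.g. "A" for "von A"; "dort" for "von dort"
    connector: Optional[str] = None   # "dann", "weiter", or None


def links_up(prev_goal: str, cur: Instruction) -> bool:
    """True if `cur` is linked to a preceding sentence whose goal is `prev_goal`."""
    if cur.start is not None:
        # An explicit start adverbial must take up the preceding goal.
        return cur.start in ("dort", prev_goal)
    # Without a start adverbial, a connector has to establish the link.
    return cur.connector in CONNECTORS


def is_unmarked(text: Sequence[Instruction], position: Optional[str] = None) -> bool:
    """True if every sentence links up with its predecessor (and with `position`, if given)."""
    first = text[0]
    if position is not None and first.start not in (None, position):
        return False
    return all(links_up(prev.goal, cur) for prev, cur in zip(text, text[1:]))


ex1 = [Instruction("A"), Instruction("F", start="A")]
ex2 = [Instruction("A"), Instruction("F", start="dort")]
ex3 = [Instruction("A"), Instruction("F", connector="dann")]
ex4 = [Instruction("A"), Instruction("F", connector="weiter")]
ex5 = [Instruction("A"), Instruction("F")]
ex6 = [Instruction("F", start="B")]
```

On this encoding, `is_unmarked` accepts (1)-(4) and rejects (5), and it rejects (6) when called with `position="A"`. A connector combined with a start adverbial that does not take up the preceding goal is also rejected.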

(7) and (8) serve to bring out a semantic difference between dann and weiter:

(7) Fahre von E nach A. Fahre dann von F nach G. ("Go from E to A. Go then from F to G.")

(8) ?Fahre von E nach A. Fahre von F weiter nach G. ("Go from E to A. Go from F farther to G.")

In both (7) and (8) the paths are not weakly connected. (7) certainly could not be used to tell somebody to get to G via A and F where equal value is attached to each path section ("bridging" in the sense of Clark (1977) seems hardly possible). However, as in the case of (5) and (6), a possible context can be constructed. Just think of a situation where only the path sections from E to A and from F to G are focused on. (8), however, doesn't seem to be licensed by a possible context; this would demonstrate that weiter (as a path connector) requires the weak connection of paths.

In the context of the present paper there is another distinction that is worth noting. We take the following configuration as our starting-point:

First input: Fahre von A nach C. ("Go from A to C.")

Execution: possible path frame: (A, B, C)

Second input:
(a) Fahre nach F. ("Go to F.")
(b) Fahre weiter nach F. ("Go farther to F.")
(c) *Fahre dann nach F. ("Go then to F.")

The contrast in acceptability between (b) and (c) can be accounted for in the following way: (b) presupposes that a path has been covered beforehand, but it does not presuppose that another path still has to be covered before travelling to F takes place. (c), however, presupposes that another action has to be executed before travelling to F takes place. The continuations (a) and (b), on the other hand, are both acceptable. However, the perspectives in (a) and (b) are different: in (a) no conceptual link to the "old" path is explicitly established; in (b) such a connection is focused on by means of weiter; the hearer/reader is invited conceptually to join the two paths into a complex one. The example is related to what we call the path connection problem: When are paths conceptually related to each other and when can they be taken as subpaths of a more inclusive path?

The path connection problem can be illustrated with regard to zurück ("back") (we will only consider the presuppositional use of zurück here). Note that it would be wrong to assume that the movement of an object to a place l° can always be described by means of zurück if the object has ever been at l° before. Our miniature robotic system can be used to make this point clear. Suppose that our "object" has already been to all of the available places and that its current position is A. We also assume that a user (not knowing anything about the object's former "journeys") now starts playing with the system. Certainly the following initial input would be inappropriate:

Input: Fahre zurück nach I. ("Go back to I.")

Use of zurück requires that the events involved be conceptually connected, which suggests that a solution to the path connection problem has to be found within a theory of event structures.

We will conclude this section by taking a short look at the text connector dabei (in its temporal sense). We take (9)-(11) as our starting-point:

(9) Fahre über A nach C. ("Go via A to C.")

(9') Fahre über A nach C. Das ist wichtig. ("Go via A to C. That is important.")

(10) ?Fahre nach C. Fahre über A. ("Go to C. Go via A.")

(11) Fahre nach C. Fahre dabei über A. ("Go to C. When doing so, go via A.")

(11') Fahre nach C. Fahre dabei über A. Das ist wichtig. ("Go to C. When doing so, go via A. That is important.")

(9) and (11) are truth-conditionally equivalent but create different embeddability conditions for subsequent discourse. With regard to a formal representation, I assume that both (9) and (11) explicitly induce one path marker. However, while (9) explicitly induces only one event-marker, (11) induces two. Note that event-markers are potential anchors for sentential anaphors. Das in (11') preferably refers to the event introduced by the second sentence; if the transitional adverbial in (9') is not accented, das refers to the whole complex event denoted by the initial sentence.⁷ Dabei (like von dort) is a local anaphor in the sense that it finds its antecedent in the preceding clause, in contrast to presuppositional uses of zurück, where the distance to the antecedent may be arbitrarily large.
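The presuppositions attributed above to weiter and dann can be made concrete with a small sketch. Everything in it is an illustrative assumption rather than part of the fragment: the class `ExecutionState`, the function `licensed`, and the reading of "a path has been covered beforehand" as "some path precedes, whether already executed or still queued".

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Path = Tuple[str, str]  # (start, goal); a deliberately crude stand-in for a path


@dataclass
class ExecutionState:
    """Hypothetical bookkeeping of the robot's discourse-relevant history."""
    covered: List[Path] = field(default_factory=list)   # paths already executed
    pending: List[Path] = field(default_factory=list)   # instructed, not yet executed


def licensed(connector: Optional[str], state: ExecutionState) -> bool:
    """Check the presupposition of a connector against the current state."""
    if connector is None:
        return True                                   # (a): no link is presupposed
    if connector == "weiter":
        # (b): some path precedes; nothing need still be outstanding.
        return bool(state.covered or state.pending)
    if connector == "dann":
        # (c): another action still has to be executed first.
        return bool(state.pending)
    raise ValueError(f"unknown connector: {connector!r}")


# The configuration from the text: "Fahre von A nach C" has already been executed.
after_execution = ExecutionState(covered=[("A", "C")])
```

With `after_execution`, the continuations (a) and (b) come out as licensed and (c) as unlicensed, matching the judgements given in the text. In a running text such as (3), where the first instruction is still pending when the second is read, dann is licensed.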

4. DRSs FOR MOTION IMPERATIVES

As has been observed by various authors (see for instance Seuren (1985)) and has also been illustrated in this paper, discourse particles are primarily related to the preceding context and not so much to the world. Discourse representation theory as inaugurated by Kamp (1981a, b) is well-suited to do justice to this insight by providing a level of discourse representation structures that is meant to mediate between language and the world. In discourse representation theory the meaning of a sentence can be conceived of as a function from discourse representation structures into discourse representation structures - which simply is to say that the preceding context determines how new verbal input has to be processed; the processing is to be algorithmic in nature and based on the syntactic structure of the input.

In the following we show by means of a sample text how the results from the preceding sections could be incorporated into a DRT framework.⁸ Our presentation is not algorithmic (note that we have not given an explicit syntax of our fragment) but is hopefully explicit enough such that an algorithm could be constructed. The text to be processed into a DRS is the following:

(1) Fahre von A nach C. Fahre dabei über B. Fahre von C über F weiter nach E. Fahre dann von dort zurück nach A. ("Go from A to C. When doing so, go via B. Go from C via F farther to E. Go then back to A.")

We first present the DRS for (1) and then comment on its build-up:

K:

e, e', e'', e''', x, p, p', p'', i, j, k, l, m, n, ...

Going(e)
Agent(e, x)
x = ADR
Path(e, x, p)
VON(e, p, A, ST, i)
NACH(e, p, C, GO, k)
...

... but not to ⟨B, M*⟩ (where M* is again our fictitious map).

(6) Fahre von A nach C. ("Go from A to C.")

(6') Fahre nach C. ("Go to C.")

b) Suppose M** = {⟨A, B⟩, ⟨B, C⟩}. (7) and (7') are then result-equivalent relative to ⟨A, M**⟩:

(7) Fahre nach C. ("Go to C.")

(7') Fahre über B nach C. ("Go via B to C.")

The following holds: If I and I' are result-equivalent relative to ⟨Pos, M⟩, then I and I' are also result-equivalent relative to any ⟨Pos, M'⟩ where M' ⊂ M.

The notion of result-equivalence may become relevant in cases like the following: The speaker thinks the hearer has the map M** (related to a certain situation). Suppose the speaker wants the hearer to move from A (where the hearer is) to C via B. He could either choose (7) or (7'), which are result-equivalent relative to ⟨A, M**⟩. However, if the speaker is quite sure about what the map of the hearer (= M**) is, he would for reasons of linguistic economy utter (7) and not (7').

We can give the following interpretation to the concept of a map M:

Def: M is (relative to a person x) the ability set of x if ⟨v(i), v(j)⟩ ∈ M is to mean: If x is at the place v(i), s(he) is able to find her/his way to v(j).

It is not necessarily the case that M is identical to the transitive closure Cl(M) of M: ⟨v(i), v(j)⟩ and ⟨v(j), v(k)⟩ may be contained in M but ⟨v(i), v(k)⟩ needn't be. This possibility mirrors the general fact that people are often unable to change a situation s(i) into a situation s(k) without getting appropriate information about what the individual "chunks" of the transformation are. This information may for instance consist in specifying that first s(i) has to be changed to s(j), and then s(j) to s(k) (see Moore (1985), Morgenstern (1986)). The above considerations account - in our particular case - for the potential usefulness of motion imperatives as guides of behaviour. Their paradigmatic function can be called deontic. However, the deontic function can also be fulfilled by sentences in the declarative mood, which is evident from the build-up of many route descriptions (see, for example, (10) below). In many cases the imperative verbal form is even left out (so that the ellipsis signals the deontic function). Typical examples are the authentic texts (8)-(9):

(8) Links über die Brücke, dann zunächst auf der Straße weiter. Kurz nach der Kurve nach links in den Wald. ("Left over the bridge, then along the road. Just after the bend, left into the forest.")

(9) Auf dem Weg nach links weiter in Richtung Tal und bei der nächsten Abzweigung rechts. Dann nach links über die Brücke und rechts über eine zweite Brücke zur Aumühle. ("Carry on, then turn left in the direction of the valley, then right at the next bend. Then left over the bridge and right over a second bridge to Aumühle.")

(10) Etwa drei Kilometer nach Beginn der Wanderung wenden wir uns bei den Häusern nach links in Richtung Aumühle. 50 Meter vor dem Gasthaus geht es nach links über die erste, dann rechtshaltend und gleich wieder links über die zweite Brücke. ("About three kilometres after the beginning of the walk we turn left by the houses in the direction of Aumühle. 50 metres before the inn we go left over the first bridge, then immediately left again over the second bridge.")

As was pointed out above, instructions must be such that each step can be executed (which of course depends on the ability set of the addressee). Let me illustrate the linguistically relevant problems by finally looking at the issue of NP-evaluation. Obviously, in order to realize instructions like (8)-(10) one must be able to identify the referents of the NPs involved. Look again at (8)-(10): Note that the uses presented here are not anaphoric ones. This induces the question of how the definite NPs are evaluated. The intuitions are relatively clear. In all three cases "imaginary journeys" are constructed where the paths have a temporal ordering associated with them. (8), for instance, tells us to follow the road until a bend x is reached and then turn left; x is supposed to be the first bend that is reached when following the road. To be a bit more formal: x (a bend which we may for the sake of simplicity conceive of as a frame location) has to be chosen such that (i) p(t) ⊂ Loc(x(t), t) holds - where p is the corresponding path - and there is no t' < t such that (ii) p(t') ⊂ Loc(x'(t'), t'), and x' is a bend different from x.
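The selection condition for the bend x can be read as a simple search along the temporally ordered path. The sketch below is only an approximation under stated assumptions: time is discrete, a path is a sequence of positions p(0), p(1), ..., each bend has a fixed (time-independent) frame location given as a set of positions, and the name `first_reached` is hypothetical.

```python
from typing import Dict, Hashable, Optional, Sequence, Set, Tuple


def first_reached(
    path: Sequence[Hashable],
    landmarks: Dict[str, Set[Hashable]],
) -> Optional[Tuple[str, int]]:
    """Return (landmark, t) for the earliest t at which the path lies in a landmark's location.

    Mirrors conditions (i) and (ii): p(t) is in Loc(x), and no other landmark
    was reached at an earlier time t' < t. If two landmarks contain p(t) at
    the same t, the one listed first in `landmarks` is returned (an arbitrary
    tie-break that the text does not address).
    """
    for t, position in enumerate(path):
        for name, location in landmarks.items():
            if position in location:
                return name, t
    return None


# Toy road with two bends; coordinates are invented for illustration.
road = [(0, 0), (1, 0), (2, 0), (3, 0)]
bends = {"far bend": {(3, 0)}, "near bend": {(1, 0)}}
```

Here `first_reached(road, bends)` picks out the near bend at t = 1, the referent intended by "der Kurve" in (8) on the reading given in the text.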

From a procedural point of view (10) presents additional difficulties: Let us assume (as is reasonable) that the reference of dem Gasthaus has to be established when (10), qua instruction, is used to find one's route. If (10) is to be procedurally optimal, it should be possible visually to identify the inn at a distance of 50 metres (in order to prevent the necessity of going farther towards the inn and then going back to the bridge). This suggests that "visual fields" (qua "resource situations" in the sense of Asher and Bonevac (1987)) relative to path locations p(t) are required in the descriptive apparatus. (10) also points to the possibility that a text may be descriptively adequate without being deontically appropriate (the example may become more convincing if we substitute 1 kilometre for 50 metres in (10)).

I have nothing definite to say about the distribution of definites and indefinites within "pragmatic" motion imperatives (where the latter term is to refer to texts in the imperative or declarative mood that are to guide one's movement). The possibility of both a second bridge and the second bridge in (9) and (10), respectively, suggests that perspectivization is involved (such that, for instance, in the second case a restricted resource situation is induced).¹⁷ The use of the definite article then pragmatically signals to the addressee that a unique referent can be found to guide his/her motion.

Let me end this paper on the following note. In robotics (as part of Artificial Intelligence) one is "interested in the automatic synthesis of robot motions, given high-level specifications of tasks and geometric models of the robot and obstacles" (B.R. Donald (1987: 295)). The present article was not meant to promote robotics as defined above but to take the basic idea from the field and sort out a number of linguistic aspects. It is the hope of the author that the insights gained from the study of the small core fragment presented here are still useful when rich and "realistic" fragments in the area of "linguistic robotics" are thoroughly investigated.

University of Tübingen
Seminar für natürlichsprachliche Systeme

NOTES

1. I thank Franz Guenthner, Rob van der Sandt and the referees of this paper for a number of valuable suggestions.

2. Winograd (1972), Kuipers (1978), McDermott and Davis (1984), and others have devised elaborate programs I do not wish to compete with. The set-up presented here rather serves as a useful environment to bring linguistic aspects into focus.

3. Unless otherwise stated, in this paper the term imperative is paradigmatically used for clauses or texts in the imperative mood that function as orders or commands. However, the interpretative range of imperatives is much wider and includes requests, threats, exhortations, permissions, concessions, warnings, advice and wishes (see Huntley (1984: 103), Donhauser (1986: 164ff.)). - It is not among the aims of this paper to provide rules of speech act determination. - On the general problem of relating sentence types and semantic-pragmatic functions see Meibauer (1987) and Rosengren (1988); imperatives are discussed (among others) by Donhauser (1986, 1987), Wunderlich (1976: 150ff.) and Wunderlich (1984).

4. The present paper is in fact "complementary" to Mayer (1989) where a comprehensive treatment of spatial coherence (relative to a fragment in the indicative mood) is given.

5. "Linkage" with regard to our small fragment means that a goal location is taken up (lexically or proadverbially) by means of a start adverbial in the subsequent clause. For a more detailed analysis see Mayer (1989).

6. (5') is something like a conditional imperative in the sense of Donhauser (1986: 171ff.) and may be paraphrased as "If you go to A or to F, you will find those journeys interesting".

7. See Bäuerle (1987: 41ff.) for an explicit treatment of the sentential anaphor das.

8. I conceive of discourse representation theory as a programme rather than a full-fledged theory. If the reader prefers a different framework - for instance situation semantics (see Barwise/Perry (1983)) - a codification there might be easily possible.

9. See e.g. Herweg et al. (1987) who rightly point out that the calculation of frame locations as induced by spatial prepositional phrases "PRÄP NP" crucially depends on parameters characterizing the object denoted by the NP (which implies the important role of world knowledge).

10. Such an interpretation of sentences or texts in the imperative mood clearly fails when they express wishes the realization of which is not within the power of the addressee (example: "Sleep well", see Donhauser (1986: 164)).

11. Explicit inference rules are given in Mayer (1989).

12. R. van der Sandt has pointed out to me that the alleged entailment relation is in fact a relation of presupposition.

13. For an exposition see Åqvist (1984: 634ff.), Hintikka (1979), Kamp (1979), and Lewis (1979).

14. Procedural aspects of route finding have so far played a dominant role in the field of Artificial Intelligence. See Habel (1987) for an overview. - Linguistic aspects of route directions are e.g. dealt with by Klein (1979) and Wunderlich (1978).

15. There are two major options as far as a semantic representation of (5) is concerned: (i) When processing S(1) from (5), one "looks ahead" and makes use of the locative expression beim Krankenhaus to derive a goal location with respect to the path induced by S(1). (ii) One keeps aspects of execution outside the semantic representation and does not build a goal location into the representation of S(1). One might call this kind of representation "declarative". At a second level the execution structure is calculated. This level of evaluation is "procedural".

16. Psychologists, such as Johnson-Laird (1983), speak of "mental models" as prototypically representing situations by means of a finite number of tokens and relations between them. Herskovits (1985) has particularly stressed the role of paradigmatic configurations and typicality in the linguistic codification of spatial relations.

17. This interpretation is in accord with the account given in Löbner (1985: 304ff.) for cases like (i) Er brach sich das Bein. ("He broke his leg").

REFERENCES

Asher, N. and Bonevac, D. 1987: Determiners and Resource Situations. In: Linguistics and Philosophy 10, 567-596.
Åqvist, L. 1984: Deontic Logic. In: D. Gabbay and F. Guenthner: Handbook of Philosophical Logic. Vol. II; 605-714.
Bäuerle, R. 1987: Ereignisse und Repräsentationen. LILOG-Report 43.
Barwise, J. and Perry, J. 1983: Situations and Attitudes. Cambridge: MIT-Press.
Chellas, B.F. 1971: Imperatives. In: Theoria, Vol. 37, 114-128.

Clark, H.H. 1977: Bridging. In: P.N. Johnson-Laird and P.C. Wason (eds.): Thinking: Readings in Cognitive Science. London: Cambridge University Press.
Davidson, D. 1967: The logical form of action sentences. In: N. Rescher (ed.): The Logic of Decision and Action. Pittsburgh: University of Pittsburgh Press, 81-95.
Donald, B.R. 1987: A Search Algorithm for Motion Planning with Six Degrees of Freedom. In: Artificial Intelligence 31, 295-353.
Donhauser, K. 1986: Der Imperativ im Deutschen: Studien zur Syntax und Semantik des deutschen Modussystems. Hamburg: Buske.
Donhauser, K. 1987: Verbaler Modus oder Satztyp? Zur grammatischen Einordnung des deutschen Imperativs. In: J. Meibauer (ed.): Satzmodus zwischen Grammatik und Pragmatik. Tübingen: Max Niemeyer Verlag, 57-74.
Habel, Ch. 1987: Prozedurale Aspekte der Wegplanung und Wegbeschreibung. LILOG Report 17.
Hanks, S. and McDermott, D. 1987: Nonmonotonic Logic and Temporal Projection. In: Artificial Intelligence 33, 379-412.
Herskovits, A. 1985: Semantics and pragmatics of locative expressions. In: Cognitive Science 9, 341-378.
Herweg, M., Khenkar, M., Pribbenow, S. and Rehkämper, K. 1987: Elsaß-Wanderung für Linguisten. Exemplarische Analyse und Repräsentation eines Satzes aus einer Reisebeschreibung. LILOG Memo 8.
Hintikka, J. 1979: The Ross Paradox as Evidence for the Reality of Semantical Games. In: E. Saarinen (ed.) (1979): Game-theoretical Semantics, 329-345.
Huntley, M. 1984: The Semantics of English Imperatives. In: Linguistics and Philosophy 7, 103-133.
Johnson-Laird, P.N. 1983: Mental Models. Towards a Cognitive Science of Language, Inference and Consciousness. Cambridge: University Press.
Kamp, H. 1979: Semantics versus pragmatics. In: F. Guenthner and S.J. Schmidt (eds.): Formal Semantics and Pragmatics for Natural Languages. Dordrecht: D. Reidel, 255-287.
Kamp, H. 1981(a): A theory of truth and semantic representation. In: J.A. Groenendijk, T.M. Janssen, M.B. Stokhof (eds.): Formal Methods in the Study of Language, Bd. 1, Amsterdam: Mathematisch Centrum, 227-322.
Kamp, H. 1981(b): Evénements, représentations discursives et référence temporelle. In: Langages 64, 39-64.
Klein, W. 1979: Wegauskünfte. In: Zeitschrift für Literaturwissenschaft und Linguistik (9), 9-57.
Kuipers, B. 1978: Modelling spatial knowledge. In: Cognitive Science 2, 129-153.
Lewis, D. 1979: A Problem about Permission. In: E. Saarinen (et alii) (ed.): Essays in Honour of Jaakko Hintikka, 163-175.
Löbner, S. 1985: Definites. In: Journal of Semantics 4, 279-326.
Mayer, R. 1989: Coherence and Motion. To appear in: Linguistics 27/3.
McCarthy, J. 1980: Circumscription - a form of non-monotonic reasoning. In: Artificial Intelligence 13, 27-39.
McCarthy, J. 1984: Applications of Circumscription to Formalizing Common-Sense Knowledge. In: Artificial Intelligence 28, 89-116.
McDermott, D. and Davis, E. 1984: Planning routes through uncertain territory. In: Artificial Intelligence 22, 107-156.
Meibauer, J. 1987: Satzmodus zwischen Grammatik und Pragmatik. Tübingen: Max Niemeyer Verlag.
Miller, G.A. and Johnson-Laird, P.N. 1976: Language and Perception. Cambridge: Cambridge University Press.
Moore, R. 1985: A Formal Theory of Knowledge and Action. In: J.R. Hobbs and R.C. Moore (ed.): Formal Theories of the Commonsense World. Norwood: Ablex Publishing Corporation.
Morgenstern, L. 1986: A First Order Theory of Planning, Knowledge and Action. In: Theoretical Aspects of Reasoning about Knowledge. Proceedings. 99-115.
Reiter, R. 1980: A logic for default reasoning. In: Artificial Intelligence 13(1, 2), 81-132.
Rescher, N. 1966: The Logic of Commands. London: Routledge & Kegan Paul Ltd.
Rosengren, I. 1988: Die Beziehung zwischen Satztyp und Illokutionstyp aus einer modularen Sicht. In: Sprache und Pragmatik 6. Lund.
Seuren, P.A.M. 1985: Discourse Semantics. Oxford: Basil Blackwell.
Sondheimer, N. 1978: A semantic analysis of reference to spatial properties. In: Linguistics and Philosophy 2, 1978, 235-280.
Winograd, T. 1972: Understanding Natural Language. New York/London: Academic Press.
Wunderlich, D. 1976: Studien zur Sprechakttheorie. Frankfurt: Suhrkamp Verlag.
Wunderlich, D. 1978: Wie analysiert man Gespräche? Beispiel Wegauskünfte. In: Linguistische Berichte 58, 41-76.
Wunderlich, D. 1984: Was sind Aufforderungssätze? In: G. Stickel (ed.): Pragmatik in der Grammatik: Jahrbuch 1983 des Instituts für deutsche Sprache. Düsseldorf: Schwann, 92-118.
Wunderlich, D. and Herweg, M. 1986: Lokale und Direktionale. To appear in: Ch. Schwarze and D. Wunderlich (ed.): Handbuch der Semantik. Königstein/Ts.: Athenäum.


Journal of Semantics 6: 369-385

CONDITIONS FOR MUTUALITY

JOSEF PERNER and ALAN GARNHAM

ABSTRACT

We present a finite psychological decision procedure for determining whether a situation S provides a participant a in that situation with grounds G for assuming that a and b, the other participant, mutually know some proposition p indicated by S. Our criterion derives from analytic criteria proposed by Lewis (1969) and Schiffer (1972). We discuss how our criterion applies in a series of test examples, and compare it with Clark and Marshall's (1981) triple copresence heuristic. We argue that triple copresence is empirically incorrect. It is neither a necessary nor a sufficient condition for mutuality, and it fails on a wide variety of examples. We also consider Sperber and Wilson's (1986) recent claim that the concept of mutual knowledge should be replaced by those of mutual manifestness and mutual cognitive environments, and argue that this move fails to solve the problem of mutuality. Finally we discuss how community membership produces mutuality. We argue that mutuality can only be established if certain rules of common sense reasoning can be assumed, and discuss the sense in which these rules must be 'mutually' known.

INTRODUCTION

Mutuality is the characteristic feature of a cluster of concepts that includes mutual knowledge, common knowledge, mutual belief, mutual manifestness, and shared understanding. These concepts play a crucial role in the analysis of such notions as trading and bargaining (Aumann 1976; Milgrom 1981), norm, social practice, rule, role, social group and organization (Bach and Harnish 1979), definite reference (Clark and Marshall 1981), meaning (Schiffer 1972), convention (Lewis 1969) and distributed processing (Halpern and Moses 1984). Indeed, the analysis of almost any cognitive interaction requires one or more of these concepts and they, therefore, form an essential part of any cognitive theory. In this paper our prime concern is with mutuality, though, for concreteness, we will couch our discussion mainly in terms of (reasons for assuming) mutual knowledge.

The question we will focus on is: when can a person engaged in an interaction with another justifiably assume that they have mutual knowledge? This question is important because many consequences follow from the establishment of mutuality. To give just three examples: mutual knowledge about social roles produces expectations about how other people will behave (Bach and Harnish 1979); mutual knowledge about an object licenses certain types of definite reference to that object (Clark and Marshall 1981); mutual knowledge that an agreement (e.g. to meet) has been made produces the expectation of coordinated action at a later date (Lewis 1969).

One standard analysis of mutuality is what Barwise (1985) calls the iterated attitude approach¹. On this view two agents, a and b, mutually know some proposition, p, if the following conditions are satisfied:

a knows that p
b knows that p
a knows that b knows that p
b knows that a knows that p
a knows that b knows that a knows that p
b knows that a knows that b knows that p
and so on, ad infinitum

This analysis led Clark and Marshall to identify the mutual knowledge paradox. Mutual knowledge is common, yet it would seem that two people can never verify mutuality, because it would take an infinite amount of time to verify an infinity of conditions. Clark and Marshall proposed a solution to this paradox - a heuristic finite decision procedure for determining when knowledge is mutual. While we agree with Clark and Marshall that a finite decision procedure for mutuality is required, we will show that the heuristic they describe is not a good one. As an alternative we propose a psychological decision criterion for mutuality based on the analytic criteria put forward by Lewis (1969) and Schiffer (1972).

Sperber and Wilson (1986) have drawn a more radical conclusion from the mutual knowledge paradox. They argue that the concept of mutual knowledge has no place in a theory of communication - and, by implication, no place in cognitive theory - because no two people can ever be sure they have mutual knowledge. However, their arguments are invalid, because they rest on the unwarranted assumption that mutuality must be both defined and tested for in terms of the infinite sequence of iterated attitudes. Nevertheless, Sperber and Wilson make two valid points about mutual knowledge. First, there are many circumstances in which ascription of mutual knowledge would be open to doubt. More generally, assumptions about knowledge or, indeed, about other mental states and achievements, are often mistaken. Assumptions about what people ought to be able to perceive or infer - assumptions about cognitive environments, as Sperber and Wilson call them - are safer. However, this point is a point about knowledge, not about mutuality.

Sperber and Wilson's second valid point is that mutual knowledge is not a precondition for communication, as has sometimes been supposed. Speakers who make a definite reference to a church do not necessarily assume that they and their audience have mutual knowledge of the church, merely that the audience will be able to work out which building they are referring to (1986: 43-4). The church need not be mutually known, only mutually manifest. However, the fact that mutual knowledge is not necessary for successful communication does not mean that the concept of mutuality can simply be dismissed. Sperber and Wilson themselves are forced to distinguish between manifestness and mutual manifestness. The concept of manifestness itself does not avoid the infinite regress².

The fact that mutuality need not be established by an infinite series of checks is also shown by Halpern and Moses' (1984) analysis of when knowledge is common among the processors of a computer system with distributed processing. They show that a processor can assume that information it has sent out will become common knowledge if it knows when and where the information will arrive and if the information includes details of where it comes from. A single message with a known time of arrival and information about who sent it establishes mutuality between sender and receiver. Halpern and Moses also show that commonality of knowledge cannot be established by processors sending acknowledgements back and forth, since an infinite number of acknowledgements would be needed, corresponding to the infinite series of checks that Clark and Marshall and Sperber and Wilson worry about.

We part company, therefore, with both Clark and Marshall and Sperber and Wilson, by suggesting that mutuality should not be defined in terms of iterated attitudes. Nor should the infinite series of iterated attitudes be regarded as the definitive test for mutuality. Rather, mutuality can be established by a finite decision criterion, just as a person's knowledge of a fact, or whether that fact is manifest to them, can be.

The criterion we propose is intended to decide on mutuality, whether it be of knowledge or belief or manifestness. For 'historical' reasons, we formulate the criterion for the case of mutual knowledge. However, it should not be criticised because it does not solve the very difficult problems of establishing when someone knows something. Its aim is to provide a method of deciding about mutuality.

The adequacy of our criterion is primarily an empirical question - does it classify situations in the same way as people do? Its only rival is the triple copresence heuristic of Clark and Marshall (1981), which often fails when our criterion produces the same result as our judgement. Although our criterion is principally an empirical one, intended to explain how people make judgements of mutuality, it also makes clear how mutuality relates to the infinite series of iterated attitudes. One can show that, for states such as having reasons to believe or manifestness, satisfaction of our criterion logically entails the infinite iteration (see Perner and Garnham, ms. Appendix B).
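The iterated attitude conditions of the standard analysis can be enumerated mechanically, which makes the source of the paradox easy to see: every finite prefix can be listed, but the list never ends. The generator below is a small illustrative sketch; the function name `iterated_attitudes` and the string encoding of propositions are assumptions made here, not notation from the literature cited.

```python
from itertools import islice
from typing import Iterator


def iterated_attitudes(a: str = "a", b: str = "b", p: str = "p") -> Iterator[str]:
    """Yield the conditions of the iterated attitude analysis, level by level, without end."""
    level_a = f"{a} knows that {p}"
    level_b = f"{b} knows that {p}"
    while True:
        yield level_a
        yield level_b
        # Each agent's next condition embeds the other agent's current one.
        level_a, level_b = f"{a} knows that {level_b}", f"{b} knows that {level_a}"


# Only a finite prefix can ever be inspected.
first_six = list(islice(iterated_attitudes(), 6))
```

`first_six` reproduces the six conditions spelled out in the list above; asking for the whole sequence would never terminate, which is exactly why a finite decision criterion is needed.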

THE MUTUALITY CRITERION

The work of Lewis (1969:52-3) and Schiffer (1972:34-5) suggests a decision procedure that allows each of two people (a and b) who are participants in an interactive situation S(a,b) to decide whether that situation provides them with good grounds to assume that a target proposition p is mutually known. In stating this procedure, we make more explicit than did Lewis and Schiffer the role of common sense inferences.

Decision Rule for Mutuality:

Any situation S involving two participants a and b, which is perceived by one participant, a, as S[a], provides grounds G (where G is a subset of S[a]) for a to assume that the proposition p is mutually known by a and b iff participant a has reason to believe that G satisfies the following four conditions:

C1. G
C2. G → aRG & bRG
C3. G → aRp & bRp
C4. Whether G satisfies conditions C2 and C3 is established by common sense reasoning.

If q is any proposition and x any rational person then xRq means that person x has reason to believe that q. The symbol '→' stands for material implication and '&' for conjunction.

Condition C1 states the trivial fact that S can provide a with grounds for mutual knowledge only if it gives a reason to believe that G holds. Condition C2 is the central part of our criterion. It requires that G provides grounds for both participants a and b to know that G holds. In other words, G must be self-revealing or 'open' to both participants. Condition C3 states that G must also make the target proposition p known to a and b. Condition C4 requires that the judgement about whether G meets C2 and C3 must depend on application of common sense inference rules.

To apply our criterion we assume that the source situation S can be regarded as a set of propositions. Figure 1 gives a bird's eye view of Schiffer's candle gazing situation - a prototypical example of a situation that warrants the assumption of mutual knowledge - which we characterize as follows:

S = {p1: Bob's eyes are within Ann's visual field.
     p2: Ann's eyes are within Bob's visual field.
     p3: The burning candle is within Ann's visual field.
     p4: The burning candle is within Bob's visual field.
     p5: There is no visual obstruction within the triangular area formed by Ann's eyes, Bob's eyes and the candle.}

Since Ann is fully informed about this situation we can equate S with her view of it (S[a]). The example has also been constructed so that no distinction need be made between S[a] and G. The actual situation S can therefore be equated with G. It is then relatively easy to show that situation S (= G) provides Ann with grounds to assume mutual knowledge of the fact that the candle is burning³.

Fig. 1. Schiffer's candle gazing situation

The application of our criterion to Schiffer's candle gazing situation demonstrates how our decision criterion can be used as a precise algorithm in a concrete situation. It correctly shows that this situation provides grounds for mutual knowledge of the fact that the candle is burning. It also correctly rejects situations where mutual knowledge does not obtain. Clark and Marshall constructed several versions of a scenario in which Ann and Bob share knowledge about some target proposition p (that tonight's movie at the Roxy is 'Monkey Business') and yet fail to have grounds for assuming mutual knowledge of p. Version 5 is the most complicated of these scenarios:

"Version 5. On Wednesday morning Ann and Bob read the early edition of the newspaper and discuss the fact that it says that A Day at the Races is playing that night at the Roxy. Later, Bob sees the late edition, notices the correction of the movie to Monkey Business, and circles it with his red pen. Later, Ann picks up the newspaper, sees the correction, and recognizes Bob's red pen mark. Bob happens to see her notice the correction and his red pen mark. In the mirror Ann sees Bob watch all this, but realizes that Bob hasn't seen that she has noticed him...." (Clark & Marshall, 1981:14).

There is no way of selecting a set of propositions that satisfy our criterion from this scenario, so it does not provide grounds for mutual knowledge⁴.
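The way the criterion operates on a situation regarded as a set of propositions can be made concrete in a small program. The following is a minimal sketch under stated assumptions, not the step-by-step analysis given in Perner and Garnham (ms.): the `Rule` type, the function names, and the particular rules written down for the candle and for Version 5 are all hypothetical encodings chosen for illustration. Condition C4 is not computed; it is simply assumed that every rule passed in is a common sense rule.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, NamedTuple, Optional, Set


class Rule(NamedTuple):
    """Hypothetical common sense rule: if all premises belong to the grounds G,
    then G gives `person` reason to believe the conclusions."""
    person: str
    premises: FrozenSet[str]
    conclusions: FrozenSet[str]


def reasons(person: str, grounds: FrozenSet[str], rules: Iterable[Rule]) -> Set[str]:
    """Propositions that `grounds` give `person` reason to believe (one rule step)."""
    revealed: Set[str] = set()
    for rule in rules:
        if rule.person == person and rule.premises <= grounds:
            revealed |= rule.conclusions
    return revealed


def satisfies_criterion(grounds, view_of_a, target, rules) -> bool:
    """Check C1-C3 from a's point of view; C4 is assumed to hold of `rules`."""
    c1 = grounds <= view_of_a  # a has reason to believe that G holds
    c2 = all(grounds <= reasons(x, grounds, rules) for x in ("a", "b"))
    c3 = all(target in reasons(x, grounds, rules) for x in ("a", "b"))
    return c1 and c2 and c3


def find_grounds(view_of_a, target, rules) -> Optional[FrozenSet[str]]:
    """Search the subsets of a's view S[a] for grounds G that pass the criterion."""
    rules = list(rules)
    facts = sorted(view_of_a)
    for size in range(len(facts) + 1):
        for subset in combinations(facts, size):
            candidate = frozenset(subset)
            if satisfies_criterion(candidate, view_of_a, target, rules):
                return candidate
    return None


# Schiffer's candle: p1-p5 as listed in the text; the two rules are assumptions.
CANDLE = frozenset({"p1", "p2", "p3", "p4", "p5"})
BURNING = "the candle is burning"
candle_rules = [
    Rule("a", frozenset({"p1", "p3", "p5"}), CANDLE | {BURNING}),
    Rule("b", frozenset({"p2", "p4", "p5"}), CANDLE | {BURNING}),
]
candle_grounds = find_grounds(CANDLE, BURNING, candle_rules)

# Version 5, in an assumed encoding of Ann's view:
#   n: the late edition shows the correction, circled in Bob's red pen
#   l: Ann looked at the late edition
#   w: Bob watched Ann notice the correction and the mark
#   m: Ann saw in the mirror that Bob was watching her
VERSION5 = frozenset({"n", "l", "w", "m"})
MOVIE = "tonight's movie is Monkey Business"
version5_rules = [
    Rule("a", frozenset({"n", "l"}), frozenset({"n", "l", MOVIE})),
    Rule("a", frozenset({"m"}), frozenset({"w", "m"})),
    Rule("b", frozenset({"n"}), frozenset({"n", MOVIE})),
    Rule("b", frozenset({"w"}), frozenset({"l", "w"})),
]
version5_grounds = find_grounds(VERSION5, MOVIE, version5_rules)
```

Under these assumed rules the search returns the full set p1-p5 for the candle and finds no admissible subset for Version 5: every fact that Ann needs is revealed to Bob only by a further fact, and the last fact in that chain (the mirror) is not revealed to Bob at all.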

CRITIQUE OF TRIPLE COPRESENCE


Unlike our criterion, Clark and Marshall's triple copresence heuristic is difficult to apply to Version 5 of their scenario. Clark and Marshall (1981:32) point out that to have grounds for assuming mutual knowledge Ann needs 'evidence of triple copresence - of certain events in which Ann, Bob, and the target object are copresent, as when Ann, Bob, and the notice about Monkey Business were openly present together Wednesday morning.' This criterion is difficult to apply because the expression 'openly present together' is too vague to serve as part of a fully explicit decision criterion. As the authors admit: "The trick is to say what counts as triple copresence - as being 'openly present together'" (1981:32). Unfortunately this trick is not revealed in an explicit definition, but only exemplified by reference to Schiffer's candle gazing situation. There, Ann's evidence for things being 'openly present together' is 'evidence that she and Bob are looking at each other and the candle simultaneously' (1981:32-3).

In applying this idea to Version 5 of Clark and Marshall's scenario one has to decide how the expression 'looking at each other' should be interpreted. If it requires eye contact then Version 5 fails the test for triple copresence. However, if it can be construed as 'Bob looks at Ann and Ann looks at Bob' then Ann would have grounds for assuming mutual knowledge in Version 5. For Ann has simultaneous evidence from the mirror that Bob is looking at her and the notice about Monkey Business and evidence that she is looking at Bob and the notice in the mirror. Of course, Bob does not have corresponding evidence of triple copresence and therefore he has no grounds for assuming mutual knowledge. However, for Ann the triple copresence heuristic makes the wrong prediction that she has grounds to assume mutual knowledge.

The fact that Clark and Marshall's criterion leads to a different result for Ann and Bob indicates that it is incorrect, at least under the current interpretation. Any satisfactory criterion should give the same result for both participants except in cases of mistaken beliefs. It seems, therefore, that 'looking at each other' should be construed as 'having eye contact'.

One problem with this interpretation is that there are obvious examples of mutual knowledge based on seeing that do not depend on eye contact. Imagine yourself sitting side by side with a friend at night on a quiet bench in the park and suddenly a flash of lightning lights up the surroundings. In this situation you don't have to look each other in the eye to mutually know that there was a flash of lightning. In contrast to Clark and Marshall's triple copresence heuristic, our criterion is not affected by the absence of eye contact. In fact it does not even demand copresence.

A further problem is that even triple copresence with eye contact (the strongest interpretation of Clark and Marshall's formulation) does not provide sufficient grounds for mutual knowledge. To demonstrate this fact we assume that Ann and Bob are again seated at a table. Ann is holding the menu in front of her reading it. The waiter brings a lighted candle and puts it on the table. After the waiter has left, the candle goes out. Bob sees it go out. He can also see Ann's eyes and the half of the menu folder facing him but not the other half. Ann can see Bob's eyes. She could not see that the candle had gone out if it were not for a little hole in the side of the menu folder that Bob cannot see. This situation is shown in Figure 2.

Fig. 2. The 'hole in the menu' candle gazing situation

In this situation Ann has evidence that she and Bob are looking at each other (eye contact), that Bob can see the extinguished candle, and that she herself can see the candle. So she has evidence for triple copresence (with eye contact) but no grounds for assuming that the candle's extinction is mutually known. The triple copresence heuristic has again led to the wrong decision. Our criterion, in contrast, correctly denies grounds for mutual knowledge in this situation. The reason why it fails our criterion can be easily seen. For Ann to know that the candle has gone out it is essential that she can see through the hole in the menu folder. This fact must therefore be part of G (to satisfy C3). C2 requires that Ann has reason to believe that G must reveal this fact to both participants. However there is nothing in the situation S that would


give her reason to believe that Bob thought that she could see the candle. On the contrary there is every reason for her to assume that Bob cannot see the hole in the menu. Therefore this situation fails condition C2 and Ann has no reason to assume mutual knowledge.

The last few examples have shown that triple copresence is neither a sufficient nor a necessary condition for assuming mutual knowledge. It might be moved, against our objections, that triple copresence is a heuristic test for grounds for mutual knowledge. It is not intended to provide a set of necessary and sufficient conditions. However, the wide range of straightforward examples that this 'heuristic' fails on renders it almost useless as part of a psychological decision criterion.

The basic problem with physical copresence as a test for mutuality is that even if Ann can establish copresence she has not thereby established that Bob will have grounds to assume copresence. Our criterion shows that triple copresence only provides grounds for mutuality when copresence indicates itself to both parties.

Much the same problem arises with linguistic copresence (Clark and Marshall, 1981:35-42). Again, Clark and Marshall do not state precise conditions for linguistic copresence, but illustrate the idea by an example: "Imagine Ann saying to Bob I bought a candle yesterday. By uttering a candle, she posits for Bob the existence of a particular candle. If Bob hears and understands her correctly, he will come to know about the candle's existence at the same time as she posits it. It is as if Ann places the candle on the stage in front of the two of them so that it is physically copresent. The two of them can be said to be in the linguistic copresence of the candle." (Clark & Marshall, 1981:39).

From this quote it appears that linguistic copresence holds if the participants and a spoken or written version of the target proposition are physically copresent, and the auxiliary assumptions of rationality, attention and simultaneity, plus some new ones specific to language, are satisfied. This interpretation would capture the case in which Ann and Bob are present when a third person mentions the candle, but what about a telephone conversation in which Ann and Bob are not physically copresent? Is it sufficient that Ann and Bob hear the mention of the candle simultaneously? This criterion would permit mutual knowledge in a telephone conversation, but would wrongly suggest that a radio announcement by Ann heard by Bob in a different place would provide grounds for mutual knowledge. A possible modification of linguistic copresence to cover the telephone conversation but exclude the radio announcement might be a rather vague formulation that says: 'Ann and Bob have to hear each other's voices at about the same time'. But even this formulation, and probably any other attempt to capture the notion of linguistic copresence, will not be able to account for the mutual knowledge established by a reliable messenger (M) that Ann sends to Bob to tell him that tonight's movie at the Roxy has been changed to Monkey Business (p) and that she sent the message.

Our criterion leads to a judgement of mutual knowledge. Take Ann's point of view. Condition C3 is met since S (= S[a] = G) implies that both Ann and Bob know p. C2 is met, since Ann knows what she told M and since Ann has good reason to anticipate that M will give the message to Bob as instructed. C2 is also met for Bob since Bob can reconstruct from what M told him that Ann must have instructed M to tell him that p and to tell him that the message came from her. Also C4 is met since the judgement that C2 and C3 apply does not depend on any special expertise.

In contrast, this messenger scenario would not provide grounds for mutual knowledge according to Clark and Marshall's triple copresence heuristic since Ann and Bob and the spoken form of p are never copresent. Ann and Bob do not even hear each other's voices at about the same time.

Although Clark and Marshall clearly intend linguistic copresence to mean that Bob and Ann are physically copresent and only the mutually known fact or object is linguistically present, one might try to abandon the requirement of physical copresence of the participants and allow that one of them need only be linguistically present. One could then argue in the messenger example that at the time of M's delivery of the message Ann and p are linguistically and Bob is physically present. However, under this interpretation, the copresence heuristic would give the wrong decision in simpler message situations, for example one in which Ann instructs M to tell Bob that p, and M later tells Bob that p. Since Ann is physically and Bob and p are linguistically present when Ann gives her instruction the copresence heuristic would suggest mutual knowledge. Intuition and condition C2 would clearly reject this suggestion, since Bob has no grounds for knowing who had sent the message (C2 fails) and hence he would not have grounds to assume that Ann knew p.

THE PROBLEM OF UNCERTAINTY

One could object that our criterion succeeds on the messenger example only because of the unreasonable assumption of perfect reliability. Real life is never perfectly predictable and Sperber and Wilson argue that even the slightest uncertainty poses an insurmountable problem for establishing mutual knowledge. They argue (1986:20) that because uncertainty in establishing one step of the infinite iteration affects all later stages multiplicatively, the probability of establishing mutual knowledge itself is vanishingly small. Sperber and Wilson's argument, however, only holds if mutual knowledge is defined as an infinite iteration of higher-order knowledge states, each of which has to be established separately, if mutuality is to be demonstrated. If, as we argue, mutuality can be established on the basis of a finite series of tests, any compounding of uncertainties will be strictly limited. Using our criterion, Ann's uncertainty about whether mutual knowledge is established by her message will only be marginally greater than her uncertainty about whether Bob knows the target proposition p. Her confidence that Bob knows p will depend on how certain she is that the messenger will inform Bob of p. Her confidence that she and Bob have mutual knowledge of p depends, in addition, on the probability that her messenger will not forget to tell Bob that Ann wanted him to know that the message came from her and the probability that Bob will apply the mutuality criterion. If these two probabilities are high Ann has good grounds to assume that mutual knowledge will have been achieved, despite the uncertainty.

COMMUNITY MEMBERSHIP

Besides physical and linguistic copresence the third major way of establishing mutual knowledge, according to Clark and Marshall, depends on community membership. "Even when Ann is not acquainted with Bob, she can assume there are generic and particular things the two of them mutually know. The basic idea is that there are things everyone in a community knows and assumes that everyone else in that community knows too." (Clark and Marshall, 1981:35).

There are two distinct problems in establishing mutual knowledge via community membership. The first is: how can strangers establish mutual knowledge about which community they belong to? The second is to decide what knowledge can be assumed to be mutual between members of a community once they are mutually known to one another. In many cases the first problem is solved quite straightforwardly. Ann says to Bob "Hello, I am from Newick." and Bob replies "So am I". This exchange establishes mutual knowledge of 'Ann and Bob are both from Newick', as our decision criterion shows.

The second problem is different. The important part of its solution is to show why Ann and Bob, after establishing that they both come from Newick, have grounds to believe that they mutually know 'Newick is in Sussex'. A positive decision about mutual knowledge of this fact can be made by our criterion if we allow as a rule of common sense inference: 'if somebody is from Newick then that person knows Newick is in Sussex'. This rule will then be admissible by condition C4 for establishing C2 and C3. The critical question is: under what conditions should such a rule be admitted? It is tempting to argue that all that is needed is that everybody in Newick knows that they live in Sussex. Surely it follows from this fact, as C3 requires, that if Ann and Bob are from Newick, then both know that Newick is in Sussex.

Clark and Marshall (1981:37) argued in exactly this way: "It is instructive to spell out the two main assumptions required here for mutual knowledge of proposition p ('Newick is in Sussex'). First, Ann must believe that she and Bob mutually know they belong to a particular community. Let us call this assumption community membership. And second, Ann must believe that everyone in that community knows that particular proposition p. Let us call this assumption universality of knowledge."

Before we give the correct characterization of how Ann establishes C3 in the previous example, we will show, in a further example, that universality of knowledge cannot be the correct criterion. The vicar of Newick visited every inhabitant in private and told each one of them the following: "I have told everybody in Newick that Mr Mainwaring-Knight is possessed by Satan, but you are the only one who I have told that I told everybody." According to Clark and Marshall's criterion of universality of knowledge within the community of Newick it follows that if Ann and Bob meet and identify each other as being from Newick they then mutually know that Mr Mainwaring-Knight is possessed by Satan. But this conclusion is obviously unwarranted. In fact, Ann has firm evidence that Bob would not know that she knew about Mr Mainwaring-Knight's possession, since the supposedly trustworthy vicar has assured her of that fact.

This unwarranted conclusion can be avoided by our criterion if we require that the rules used for establishing conditions C2 and C3 must be, not just universally but, mutually known. We tried to capture this requirement informally by stipulating that conditions C2 and C3 must be established by rules of common sense reasoning, which are mutually assumed. Indeed, we have had to use a contrived example to illustrate the difference between rules that are merely universally known and those that can be used to establish mutual knowledge.

With this requirement the 'lying vicar' example fails our criterion. Condition C3 requires that situation S (which contains: "Ann and Bob are both from Newick") implies that Ann and Bob have reason to believe that Mr Mainwaring-Knight is possessed by Satan. C4 requires that this implication be established by a mutually known rule 'if somebody is from Newick then that person knows that Mr Mainwaring-Knight is possessed by Satan'. The way the vicar informed his flock ensures that this rule is universally known (everybody in Newick knows it) but not mutually known (Ann, for instance, thinks that she is the only one who knows that everybody knows). The rule, therefore, fails to satisfy condition C4 in our present formulation, which requires mutuality.

Had the vicar told his flock a different story: "I have told everybody that Mr Mainwaring-Knight is possessed by Satan and I have assured them that I will tell everybody else", then Mr Mainwaring-Knight's possession would be mutually known within the community of Newick. That this is so can be established by our criterion. The situation S is the sum of personal encounters with the vicar. Under the assumption that the vicar is reliable and will do what he says, everybody in Newick has reason to believe S (C2 satisfied). And by commonly assumed rules about communication it can be established that everybody who was part of S would have reason to believe that Mr Mainwaring-Knight is possessed by Satan (C3 satisfied). Let us call the body of knowledge that is shared by a community in this way communal knowledge.

One can see from the last example of the truthful vicar how a body of communal knowledge can be built up if a group of people is authoritatively instructed, as children are in school. In school everybody is told (roughly) the same facts. Since this instruction is carried out in the open, nobody can have any suspicion that there is any secret about it, unlike the case of the lying vicar. So educated people have reason to assume that they mutually know the facts typically taught at school. For instance, 'Newick is in Sussex' is taught at the infant school in Newick and can therefore be assumed to be mutually known by all members of that community older than 4 years. So, to return to our unfinished example, Ann can reason from the fact that Bob is from Newick to the fact that he knows 'Newick is in Sussex' using only rules that are mutually known to people from Newick⁵.

COMMON SENSE REASONING AND COMMUNAL KNOWLEDGE

We have seen that, contrary to Clark and Marshall's claim, it is not sufficient that the rules used to determine whether G satisfies conditions C2 and C3 are universally known within a community. They must be, in some sense, mutually known. However, two people do not mutually share the common assumptions of a community until they mutually know that they are members of that community. This fact causes no problem when community membership is established on the basis of some other knowledge, such as knowledge of language. If Ann and Bob tell each other that they are Scottish, they mutually know the folklore of Scots, and can use that knowledge to establish other pieces of mutual knowledge. However, mutual knowledge of community membership can, in some cases, be established on the basis of knowledge about the common beliefs of that community. Ann and Bob may mutually establish that they are sighted using their knowledge of how sighted people behave, or that they are Freemasons by certain characteristic, but subtle, bodily movements. Therefore, just as it is too weak a condition to require that the rules used to establish C2 and C3 are universally known, it is too strong a condition to require that they are mutually known, in the usual sense. We, therefore, distinguish between mutual knowledge (between a and b

that p), and communal knowledge - knowledge that all members of a community 'mutually' entertain. Mutually occurs in quotes here, since the mutuality of communal knowledge is, for two arbitrarily selected members of a community, not actual but conditional on their recognising each other as members of the community. Community members know that other community members share communal knowledge. But they do not necessarily know who the other community members are. When two people meet they may not know which community memberships they have in common. None of their communal knowledge is mutual until they have established (mutually) which communities they belong to. It is often claimed (e.g. Sperber and Wilson, 1986:19) that mutual knowledge must be known to be mutual. But in one sense two members of a community who have met but not established that they are members of that community do not mutually know the relevant communal knowledge.

The rules used to establish C2 and C3 can be either mutually known or communally known, and as communal knowledge these rules can be attributed to someone identified as a member of a community, before that membership is mutually established. In some cases, therefore, communal knowledge of these rules can be used to establish mutual knowledge of community membership and hence mutual knowledge, in the strong sense, of the rules of reasoning used in the community. This method of establishing mutual knowledge works only when the communal knowledge of a community provides a way of identifying members of that community.

The following example illustrates how knowledge of the common sense psychology of seeing - knowledge that is communal but not yet mutual - can be used to establish mutual knowledge of sightedness. Ann and Bob have met for the first time, and have not yet spoken:

S = {p1: Ann is sighted.
     p2: Bob is sighted.
     p3: Ann's eyes are within Bob's visual field.
     p4: Bob's eyes are within Ann's visual field.}

Can Ann assume mutual knowledge of p1 and p2? Trivially, aRp1 and aRp4. Ann also has reason to believe p2 and p3 on the basis of the common knowledge of what sighted people look like and of what they can see. In particular, her visual abilities enable her to establish that Bob is sighted before that fact becomes mutually established. Since he is sighted she can assume he shares common assumptions about seeing, since these assumptions are communal among sighted people. This attribution allows her to establish, straightforwardly, that S → bRS. It follows that S provides grounds for mutual knowledge of all of p1, p2, p3 and p4, and hence of p1 and p2 in particular.

Mutual knowledge of sightedness and of the common sense psychology of seeing can be established by this 'bootstrapping' method because vision can be used to establish whether another person is sighted. However, sighted people can only mutually establish that they are sighted in this way if conditions like p3 and p4 hold. For each of two people to know that the other is sighted is not enough. The fact that each is sighted must be part of the situation or inferable from it. Other types of community membership have the same self-recognising property as seeing. For example, by becoming a Freemason one learns the secret signs that allow one to recognise other Freemasons. Freemasons can establish, between themselves, that they are Freemasons by using these signs before they mutually know, in the usual sense, that they are Freemasons. To do so they use knowledge about Freemasons that is communal and, hence, potentially mutual. However, even a Freemason will not necessarily recognise another Freemason if neither makes the appropriate signs. We are now in a position to reformulate condition C4:

C4: Whether G satisfies C2 and C3 must be established by rules that are tacitly assumed (or that have been previously established) EITHER to be mutually known between participants a and b OR to be communal knowledge of a community of which G establishes that a and b are members.

What is important in this formulation of C4 is that the mutuality of the rules for testing the other conditions must not itself be tested. It must either be tacitly assumed or be explicitly established by the criterion on a previous occasion. If mutual knowledge of these rules had to be tested when the criterion was applied, an infinite regress would result.

Another important feature of this formulation of C4 is that it allows inference rules which are only communally known. Why is it safe to attribute to someone communal knowledge before mutual knowledge of community membership has been established? The reason lies in the nature of common sense psychology and communal knowledge. Two people either are members of a community, and (potentially) mutually know the folklore of the community, or they are not. There is no question of these psychological rules having an intermediate status, such as being known to everyone, but only being known by a select few to be known to everyone. Communal knowledge can never be like the knowledge imparted by the lying vicar. Therefore, if Ann identifies Bob as a member of a certain community, and she knows that she is a member of that community, she already knows that the common sense psychology of that community can become mutual. She will not usually come to any mistaken conclusions if she assumes Bob will use that psychology in his reasoning. If G includes information about community membership, Ann can use her knowledge of Bob's knowledge of common sense psychology to establish mutual knowledge of community membership.
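The reformulated C4 amounts to an admissibility test on inference rules, which can be sketched in a few lines. This is an illustrative encoding only: the `InferenceRule` record, its three fields, and the `admissible` function are hypothetical names introduced here, and the status of each example rule (mutual, communal, or merely universal) is supplied by hand from the discussion above rather than computed.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class InferenceRule:
    """Hypothetical record of what is assumed about one inference rule."""
    text: str
    # Pairs of people between whom the rule is already assumed to be mutual.
    mutual_for: FrozenSet[FrozenSet[str]] = frozenset()
    # Communities in which the rule is communal knowledge.
    communal_in: FrozenSet[str] = frozenset()
    # Communities in which every member merely happens to know the rule.
    universal_in: FrozenSet[str] = frozenset()


def admissible(rule: InferenceRule, a: str, b: str,
               shared_communities: FrozenSet[str]) -> bool:
    """Reformulated C4. `shared_communities` are the communities of which
    the grounds G establish that both a and b are members."""
    if frozenset({a, b}) in rule.mutual_for:
        return True
    # Universality alone (universal_in) is deliberately ignored.
    return bool(rule.communal_in & shared_communities)


sussex = InferenceRule(
    "anyone from Newick knows that Newick is in Sussex",
    communal_in=frozenset({"Newick"}),
)
lying_vicar = InferenceRule(
    "anyone from Newick knows that Mr Mainwaring-Knight is possessed by Satan",
    universal_in=frozenset({"Newick"}),
)
seeing = InferenceRule(
    "a sighted person sees what lies unobstructed in his or her visual field",
    communal_in=frozenset({"sighted"}),
)

newick_established = frozenset({"Newick"})
sussex_ok = admissible(sussex, "Ann", "Bob", newick_established)
vicar_ok = admissible(lying_vicar, "Ann", "Bob", newick_established)
seeing_ok = admissible(seeing, "Ann", "Bob", frozenset({"sighted"}))
sussex_without_membership = admissible(sussex, "Ann", "Bob", frozenset())
```

In this encoding the Newick rule and the rule about seeing are admitted once G establishes the relevant membership, the lying vicar's rule is rejected although it is universally known, and the Newick rule is rejected as long as G does not establish that Ann and Bob are both from Newick. Note that the test never checks whether the rule itself is mutually known, which is what keeps the criterion finite.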

CONCLUSION


We have presented a finite decision criterion that a person, a, can use to decide whether a and b have mutual knowledge of a proposition p in situation S that relates a and b (or whether their reasons to believe p are mutual or whether p is mutually manifest). The criterion can only be used if (a has reason to believe that) a stock of mutual or communal knowledge is already available to a and b. We have also shown how this stock of knowledge can be increased by application of our criterion. A criterion that uses mutual knowledge of one fact to establish mutual knowledge of another is not circular in the way that a definition of mutual knowledge in terms of mutuality would be. However, it follows that our criterion cannot be used to establish all the mutual knowledge that a and b share. Some of that knowledge must be established in other ways. Nevertheless, providing that some mutual knowledge can be established in a different way, and providing that people's knowledge is organised in a suitable manner, our criterion can be used to build a large stock of mutual knowledge from very modest beginnings.

But what precisely are those beginnings? The simplest way in which mutual knowledge can be 'established' without using our criterion is for it to be assumed. Two people might assume any bit of knowledge to be mutual. Such assumptions are risky, in the sense that they might be incorrect. But because the assumption that a piece of knowledge is mutual typically has many consequences, such assumptions can be revised in the light of subsequent evidence. Indeed, the most general assumptions, for example of the mutuality of the minimum rationality needed to apply our decision criterion or of (near) universal perceptual and cognitive abilities, are both the safest and the ones having the most widespread and immediate consequences. Mistaken assumptions about the mutuality of specific pieces of knowledge, on the other hand, can be harder to correct. In such cases mutuality would be better established by our criterion. Nevertheless, the frequent misunderstandings in human interactions are probably attributable, at least in part, to unfounded assumptions of mutuality.

Developmentally, the idea of initially assuming wholesale mutuality from which one gradually retreats is reflected in Piaget's (1923/1926; 1948/1956) doctrine of 'egocentrism', according to which young children operate as if facts known to them are 'open' to everybody. Recent evidence (Wimmer, Hogrefe and Perner, 1988) however suggests that this idea is incorrect. Even children below the age of 4 years judge whether someone else knows a fact independently of whether they themselves know it.

The opposite extreme to a wholesale assumption of mutuality would be to assume the mutuality of only minimal rationality. In any particular case this assumption might have to be given up on the basis of good evidence, such as the gross behaviour of an insane person or the failure of someone to respond normally in a straightforward interaction. More generally, any assumption of mutual knowledge may be abandoned if a notices that b reacts inappropriately to behaviour based on that assumption. However, such an overly cautious approach to mutuality is implausible. We typically impute much more mutual knowledge to strangers than just the assumption of minimal rationality. We do not first test out whether they can hear but tacitly assume mutuality of hearing. Only when their reactions to our noises violate this assumption do we retract from it. Children, probably, make even stronger a priori assumptions about mutuality. For example, they may well assume that everybody understands their language until they have contact with a foreigner. There is, therefore, an important empirical question about which abilities and pieces of knowledge children tacitly assume to be mutual and which ones they attempt to establish using our criterion. However, we must leave this question for another occasion.

Laboratory of Experimental Psychology
University of Sussex
Brighton BN1 9QG
England

ACKNOWLEDGEMENT

The authors express their gratitude to Steve Isard and Richard Power for valuable suggestions and critical comments. This paper is a shortened version of a more extensive unpublished manuscript, which is available from the authors.

NOTES

1. The term 'attitude' is used in the technical sense of 'propositional attitude', which simply means that the verb know has a proposition as one of its arguments. This use is compatible with the idea that knowledge is a mental state but it is also compatible with Ryle's (1949) suggestion that knowing is an achievement.

2. Sperber and Wilson claim that "the situations which establish a mutual cognitive environment are essentially those that have been treated as establishing mutual knowledge" (1986:45) and they refer in a footnote to Lewis and to Clark and Marshall. Clark and Marshall explicitly propose a finite, though heuristic, test for mutuality. And as we shall show, Lewis's conditions can be regarded as providing a finite decision criterion for mutuality. By accepting Lewis's criterion, Sperber and Wilson undermine their own arguments against mutual knowledge. If there is a finite decision criterion, there is no infinite series of checks.

3. Perner and Garnham (ms.) give the full details. To draw a conclusion about mental achievements or states, rather than about cognitive environments (manifestness / reason to believe) it is necessary to import some additional assumptions about what Ann and Bob are paying attention to, but these assumptions are about (ordinary) knowledge or beliefs, not about mutuality.

4. A step-by-step analysis is given in Perner and Garnham (ms.).

5.

It is worth noting that Lewis, in his demonstration of how common knowledge follows

from his three conditions assumes "mutual ascription of some common inductive standards and background information, rationality, mutual ascription of rationality, and so on." ( 1 969: 56-7), though earlier he talked o f merely "share[d) . . . inductive standards and background information" ( 1 969:53), an idea that is shown to be inadequate by our lying vicar example. H owever Lewis did not make this assumption explicit in a fourth condition corresponding to

tled for the insufficient criterion of universality instead of mutuality.

REFERENCES Aumann, R . J . 1 970: Agreeing to disagree. A nnals of Statistics 4: 1 236- 1 239. Bach, K. and R . M . Harnish 1 979: Linguistic communication and Speech Acts. MIT Press. Cambridge, MA. Barwise, J. 1 98 5 : Modeling shared understanding. Unpublished manuscript. Department of Philosophy and CSLI , Stanford University. Clark, H . H . and C . R . Marshall l 98 1 : Definite reference and mutual knowledge. In: A . K . Joshi,

B . Webber, and I . Sag (eds . ) : Elem�nts of Discours� Und�rstanding. Cambridge University

Press. Cambridge.

Halpern, J. Y . and Y.O. Moses 1 984: Knowledge and common k nowledge in a distributed envi ronment. Proceedings of th� Third A CM Conference on Principles of Distributffl Comput

mg, pp. 50-6 1 . Hughes, G . E . and M . J . Cresswell l 968: A n Introduction to Modal Logtc, Methuen. London. Hintikka, J. 1 969: Knowlfflg� and Beli�f. Cornell University Press. Ithaca , N Y . Lewis, O . K . 1 969: Convention: A Philosophical Study. Harvard University Press. Cam bridge, MA. Milgrom, P. 1 98 1 : An axiomatic characterization of common knowledge. Econometrica 49: 2 1 9-222. Moore, R.C. 1 980: R�asoning about knowlfflg� and action. SRI International, Technical Note 1 9 1 . Perner, J . and A . Garnham (ms . ) : Conditions for mutuality: Extended version. Unpublished manuscript: Laboratory o f Experimental Psychology, University of Sussex. Piaget, J. 1 926: The Language and Thought of th� Child. Routledge and Kegan Paul. Lon don. (Originally published, 1 923). Piaget, J. and B . Inhelder 1 956: The Child's Conception of Space. Routledge and Kegan Paul. London. (Originally published, 1948). Ryle, G. 1 949: Th� Concept of Mind. Hutchinson. London . Schi ffer, S. 1 972: M�ning. Oxford University Press. Oxford. Sperber, D. and D. Wilson 1 986: R�l�vana: Communication and Cognition. Blackwell. Oxford. Wimmer, H . , G .-J . Hogrefe and J. Perner 1 988: Children's understanding of informational access as source of knowledge. Child Development 59: 386-396 .


our C4. It is probably for this reason that Clark and Marshall ( 1 98 1 :37), who relied on Lewis's conditions in their treatment of mutual knowledge established by community membership, set
