
JOURNAL OF SEMANTICS Volume 24 Number 2

CONTENTS

NICHOLAS ASHER AND ERIC MCCREADY
Were, Would, Might and a Compositional Account of Counterfactuals (p. 93)

YAEL GREENBERG
Exceptions to Generics: Where Vagueness, Context Dependence and Modality Interact (p. 131)

JO-WANG LIN
On the Semantics of Comparative Correlatives in Mandarin Chinese (p. 169)

Please visit the journal’s web site at www.jos.oxfordjournals.org

Journal of Semantics 24: 93–129 doi:10.1093/jos/ffl013 Advance Access publication March 6, 2007

Were, Would, Might and a Compositional Account of Counterfactuals

NICHOLAS ASHER
The University of Texas at Austin

ERIC MCCREADY
Aoyama Gakuin University

© The Author 2007. Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected]

Abstract
This paper has two purposes. We first give a new dynamic account of epistemic modal operators that accounts for both their test-like behaviour with respect to whole information states and their capacity to induce quantificational dependencies across worlds (modal subordination). We then use this theory, together with an analysis of conditionals and irrealis moods, to give a fully compositional semantics of indicative and counterfactual conditionals. In our analysis, the distinction between counterfactual and indicative conditionals follows directly from the interaction between the semantics of the conditional and irrealis operators and the semantics of the particular modals involved in the conditional consequent. We indicate some theoretical and logical consequences of our approach.

1 INTRODUCTION

Over the past 20 years, with the advent of the ‘dynamic turn’ in semantics, there have been numerous discoveries and theoretical proposals for modals in Romance and Germanic languages, as well as for conditionals and counterfactuals. But there has been little attempt to produce a compositional account of counterfactuals from an account of the meanings for the conditional and the modals, something which Kratzer (1977, 1981, 1986, 1991) argued for convincingly but only carried out in her static framework for analysing modalities.1 Here we present a detailed, dynamic semantics for modals that leads to a new semantics for counterfactuals. There are already many proposals on the semantics of the English modals might and would in the literature, both dynamic and static. But as far as we can see, there are no proposals that can simultaneously capture both of two important sets of observations made in the literature; in fact, it seems fair to say that these observations pull any putative account of these constructions in different directions.

1 Prominent exceptions to this generalization are Gillies (2003) and Kaufmann (2005). Gillies, however, does not consider the modal subordination use of modals and Kaufmann considers complementary interactions between conditionals and modals to the ones that we concentrate on.

First, Veltman’s (1996) seminal paper on epistemic modals introduces the idea that epistemic possibilities conveyed with a modal like might interact dynamically with factual information introduced in a discourse. Factual information introduced into a discourse may rule out certain epistemic possibilities that might otherwise be present. Consider the following minimal pair.

(1) a. It might be sunny. It’s not sunny.
    b. #It is not sunny. It might be sunny.

Veltman’s examples show that the available epistemic possibilities partly depend on what information has already been introduced in the discourse; once it has been established in the discourse that it is not sunny, it is no longer permissible to introduce the epistemic possibility that it might be sunny, as in (1b).2 While some have claimed that (1b) has a reading according to which the speaker revises his contribution, we claim that revisions need much more linguistic marking than is present in (1b). Like Veltman, we believe that (1b) remains very marginal even when one has the revision scenario in mind. An alternative is to adopt a pragmatic story on which an assertion of (1b) would be bad because it violates Grice’s maxim of Quality: since the first sentence explicitly rules out the possibility of it being sunny, the second sentence is obviously false. Put another way, if the discourse is to be uttered, the speaker is not sincere in his utterance of one of the two sentences, since they are inconsistent. In any case, (1b) remains incoherent on an epistemic reading. It is worth noting that introducing as an epistemic possibility something that has already been established in the discourse as in (1c) is also infelicitous, though much less so than (1b).

(1c) ?It is sunny. It might be sunny.

We take the second sentence of (1c) to be uninformative, and so we account for its oddness using a pragmatic principle of informativeness. But the second sentence of (1b) is not uninformative; the discourse is epistemically unsatisfiable. One of the important goals, we think, of a theory of modals is to show why (1b) is so different from (1c).

2 Some people, philosophers notably, claim that might p is ambiguous between the epistemic reading that says ‘for all that is known given the discourse so far, it’s possible that p’ and the metaphysical reading, on which a claim is made about ‘real’ possibilities given the actual facts. We think that (1b) has, at best, a very marginal metaphysical reading, and so that it is pretty bad, whereas (1a) is felicitous on the preferred epistemic reading. We think that there is in fact a metaphysical reading in this variant of (1b): It is not sunny. But it might have been sunny. But this example has rather different properties in general.

The second observation, originally due to Roberts (1987, 1989), that has motivated a considerable amount of work in dynamic semantics is that uses of modals may pick up and refine epistemic possibilities introduced by the use of modals in previous utterances. This phenomenon has come to be known as modal subordination. Consider, for example, the discourses in (2). Here the indefinites should be understood de dicto, as non-specific indefinites under the scope of the modal.

(2) a. A wolf might walk in. It would eat you first.
    b. A wolf might walk in. #It will eat you first.
    c. A wolf might walk in. It might eat you first.

There is a striking difference between (2a) and (2b). In (2a) the use of the epistemic modal would enables the pronoun it to find its intended antecedent, the wolf introduced under the scope of the modal in the first sentence, while futurate will in (2b) does not allow this possibility. Example (2c) shows that the modal might has the same effect as would in enabling the accessibility of the intended antecedent. Nevertheless, since a wolf occurs under the scope of the modal operator in these sentences, it is unavailable as an antecedent for the pronoun in nonmodal contexts, which is what standard dynamic semantics predicts. The accessibility of the antecedent under the scope of a modal to a pronoun also under the scope of a modal, however, was something that standard dynamic semantic accounts of anaphora as well as more traditional accounts could not predict; and the accounts of Roberts and later of Frank (1997) provided significant insights into the semantics of anaphoric expressions. While examples of a dependency of would claims on might claims are common, might claims sometimes also depend on would claims.

(3) Because 1950 DA is large, more than 1 kilometre (0.6 miles) across, the consequences would be grave and global. Clouds of debris would create a multi-year winter that would kill off many species and might even threaten civilization. (From ‘An Asteroid Might Hit Earth in 2880’, R. Britt at space.com)

The epistemic possibility introduced by might depends on the information introduced under the scope of the would operators above. This example is complex. Let us also consider a constructed variant, which may make the point clearer.

(4) If in 1950 DA hit the earth, there would be clouds of debris that would create a multi-year winter. It might even threaten civilization.

Finally, counterfactuals also can depend on might claims:

(5) A wolf might walk in. If it were to eat you first, I’d be unhappy, but not as unhappy as if it ate me first.

The natural way to account for Veltman’s observations and the natural way to account for the data on modal subordination do not easily combine. They involve different techniques for evaluating formulae with respect to a set of indices or points of evaluation: one approach is to evaluate a formula relative to properties of sets of such evaluation points, a sort of ‘collective’ approach (to use some terminology from the interpretation of plurals); the other evaluates a formula relative to individual evaluation points, a ‘distributive’ approach. The history of the efforts in this area suggests that a fully adequate semantics for the modals has to be both collective and distributive with respect to points of evaluation. Moreover, the phenomena concerning modal subordination indicate that modals are dynamic with respect to other modals (they can access and update information under the scope of certain other modals) while remaining static tests with respect to non-modal contexts. Our first goal in this paper is therefore to provide a semantics for modals that has these properties. We will do this in section 3, after first, in the next section, examining further properties of might and would and critically discussing existing accounts of modals.

As we have seen, then, the semantics of the modals is rich and intriguing. But this is not the end of the puzzles related to them. Particularly interesting is the question of how so many modals combine productively together with a conditional whose antecedent is adjusted for the appropriate mood to create counterfactuals (and noncounterfactual conditionals) of various kinds.

(6) a. If I were not to sleep tonight, I would topple over tomorrow (I might topple over tomorrow).
    b. If I had not slept last night, I would have toppled over today.
    c. What should be your mascot if you were a school? (Google)
    d. If we were to get more serious, should I tell him my age? (Google)
    e. If it were easy, anyone could do it (Google, Converge Magazine).

The ease with which modals and conditionals combine strongly suggests that we should try to build a compositional account of the semantics for counterfactuals. The slight differences in meaning between the different modals create different sorts of counterfactuals; for instance, should and would do not have quite the same range of meanings, nor do could and might. And not unexpectedly, there is a definite change in meaning between (6e) and the variant if it were easy, anyone might do it. We thus take as our second goal an account of the contribution of modal expressions to counterfactuals; more specifically, we will provide a compositional account of these constructions. The idea is that, with the right semantics for modals, the conditional mood and the irrealis mood that appears in counterfactual constructions, a theory of counterfactuals will fall out in a compositional manner. Section 4 provides such a theory.

2 SOME DATA AND DESIDERATA FOR A THEORY OF MODALS

Crucial for a dynamic theory of modality is the question of how modals interact. But before investigating this question, let us examine how they behave with respect to non-modal discourse. An interesting place to start is a use of the epistemic modal would that concerns its effects as an agreement marker in discourse. In (7), B’s utterance conveys that the content of A’s utterance conforms to B’s expectations, or, in other words, that in all of B’s epistemic possibilities (before processing A’s utterance) Kim’s teasing Pat was something that was bound to happen.

(7) a. A: Kim teased Pat.
    b. B: Kim would do that.

There are in addition many uses of ‘polite would’. One is

(8) a. Who is Asher?
    b. That would be me.

(9) a. What are you drinking?
    b. I’d like a glass of champagne.

Such polite uses indicate a sort of scalar strength between $\phi$ and Would $\phi$ (Would and Might are operators in our formalization and are the counterparts of the English modals). More specifically, Would $\phi$ is weaker than $\phi$ and so clearly does not entail it. This is made even clearer by discourses like this one.

(10) a. A: Who drank the last beer?
     b. B: That would be me.
     c. A: ‘Would’?
     d. B: OK, it was me.

The lack of entailment allows room for the politeness effect of (8b) and (9b).

Perhaps a more fraught claim is this one. Our view is that $\phi$ implies Would $\phi$ on this use. Our first reason for this is philosophical. On a certain idealization that we adopt here, the discourse context established through prior assertions represents what is the common ground and thus mutually believed by the conversation’s participants.3 The epistemic possibilities through which the meaning of would is defined ought to reflect these convictions. Thus, if $\phi$ is part of the prior discourse context, our analysis should reflect that $\phi$ is true in all the epistemic possibilities determined by that context. An epistemic understanding of would makes, in such a circumstance, Would $\phi$ true.4 Does the usage of would reflect this conceptual point? We think it does. We have two pieces of evidence for this claim in addition to the intuitions already discussed. The first comes from discourses like those in (11). Here we must understand would as in the examples above. If we do, the discourse in (11b) is perfectly sensible.

3 Though, see Asher and Gillies (2003) for a view of discourse context where this is not the case.

4 Here we consider only uses of would on which it does not receive a counterfactual interpretation or one dependent on content in the scope of other modals, as in (2).

(11) a. ?John is at the party. He wouldn’t be at the party.
     b. (?) John is at the party. He would be at the party.

Example (11a) is a flat out contradiction, unless the second sentence is understood as some sort of self-correction.5 We admit that (11b) also sounds pretty odd; indeed, we should predict that it is pragmatically odd since we take Would $\phi$ claims to be weaker than $\phi$. Thus, the second clause of (11b) is redundant. We therefore take the oddness of (11b) to be pragmatic in nature. Still, at least some speakers balk at the claim that the first clause of (11b) entails the second. We think that is perhaps due to a temporal effect on the interpretation of would. Would has some element of past tense evaluation in it, as noted by various authors (e.g. Iatridou 2000); it refers to epistemic possibilities just prior to the now of non-modal discourse, though there are temporal dependences between modals in modal discourse. There is a feeling that prior to the extended ‘now’ for the discourse and thus prior to update with $\phi$, Will $\phi$ (the non-past variant of Would $\phi$) must have been true. $\phi$, that is, should not come as a surprise. Further, it seems that, at least for some speakers, there is a lag between update and readjustment of epistemic possibilities. Our model does not include times, so we cannot model the lag between update and readjustment. But aside from this complication, we take this inference to go through.

5 Or unless we understand some sort of suppressed antecedent of a counterfactual as occurring there, for which there is little, if any, evidence.

Next, consider the discourse in (12). Here it seems clear to us that the content $\phi$ of A’s second utterance is being used to correct B’s claim that $\neg$Would $\phi$. If $\phi$ implies Would $\phi$, then we have a straightforward explanation of the two Corrections in terms of indirect proof or modus tollens.

(12) a. A: John stole some money from Mary.
     b. B: He wouldn’t do that.
     c. A: Well he did.

This leads one to wonder about the relationship between might and would. Their semantics is strongly interconnected, and they appear to be weak duals, in the sense that one cannot have Might $\phi$ and Would $\neg\phi$. Similarly, most speakers balk at Would $\phi$ together with Might $\neg\phi$. Intuitions also support that Would $\phi$ implies Might $\phi$.

(13) a. #John might come to the party but John would not come to the party.
     b. #John would not come to the party but John might come to the party.
     c. #John would come to the party but it’s not the case that John might come to the party.

This basic picture of how Might and Would interact will be extended and expanded as we proceed. For the second part of our project, we also need to consider how they interact with conditionals; the kind of compositional semantics for counterfactuals we want to give must assign meanings to the various modals and to the conditional expressed by if . . . then and then see how those all combine together.6 What sort of semantics ought the conditional itself to have? This is of course a question that many have worried over; but on the present view, the semantics of the conditional in counterfactuals should be exactly the same as that in so-called ‘indicative’ conditionals, in which the antecedent is in the indicative (as opposed to subjunctive) mood. Indicative conditionals have interesting connections with the epistemic modals as Stalnaker (1975) and Kratzer (1981) noted long ago (for a discussion, see Gillies 2004), as well as a number of commonly agreed upon characteristics:

• they have Ramsey test-like behaviour and support modus ponens (MP). (The Ramsey test refers to a theory of conditionals on which, in essence, a conditional is true if the consequent proves true when we hypothetically add the antecedent to our stock of beliefs; see Ramsey 1929).7
• they support logical exportation (across the turnstile), in the sense that $\phi \Rightarrow (\psi \Rightarrow \chi) \models (\phi \wedge \psi) \Rightarrow \chi$ and vice versa.
• they have a modal flavour; more specifically, they should obey the following equivalence: $\neg(\phi \Rightarrow \psi) \leftrightarrow (\Diamond(\phi \wedge \neg\psi))$.

6 A similar project is attempted by McCawley (1996), among others. See Bennett (2003) for some related discussion.

7 This means that the consequent does not contain more information than the antecedent in the sense of dynamic semantics. A more formal definition of the Ramsey test is given in the main text below.

While the first two properties are obvious enough, the modal properties of ordinary conditionals require some comment. Consider the following interaction from Gillies (2004) between two detectives, A and B, who are investigating a crime at a mansion for which grounds staff consists of a driver and a groundskeeper.

(14) a. A: If one of the grounds staff committed the crime, then it was the driver.
     b. B: No, it might have been the groundskeeper.

B’s response is a way of elaborating on his denial of A’s claim; clearly, we take B’s response to explain why he thinks A’s assertion is false. But if this is right, then the truth conditions of a conditional must involve the semantics of modals in some way, and that in turn means that conditionals involve epistemic possibilities.

Now, to set the stage for our analysis, we sketch the dynamic semantic background for the previous analyses we will discuss and build on. Dynamic semantics was designed to account for how the meaning of an utterance in a particular discourse context might change or contribute to that very discourse context. Accordingly, all dynamic semantic theories analyse the meaning of a sentence as a relation between contexts, an input context and an output context. Broadly speaking, contexts in dynamic semantics are sets of n-tuples consisting at least of a world, an assignment function and often other things as well.8 There are two broad parameters in the way dynamic semantics may describe the effects of an utterance on a context: first it may be distributive and define the effects of a sentence on each element of the context, or not, which is the collective view; secondly these effects may affect those elements of the context or not. If they do not, the semantics is called eliminative. Discourse Representation Theory (DRT) of Kamp & Reyle (1993) as well as Dynamic Predicate Logic (DPL) (Groenendijk & Stokhof 1991) are examples of distributive semantics that are not eliminative, whereas Veltman’s (1985, 1996) update semantics, which he uses for his account of the modals, is eliminative but not distributive.

8 In this paper we will use the notion of possible world in its ordinary sense as an evaluation point that gives a truth value to all well-formed sentences of the formal language in a classical fashion. In this paper, we do not investigate a notion of epistemic possibility which may not be logically closed.

Let us take a look at Veltman’s update semantics first. Contexts are understood as sets of worlds and the epistemic sense of might that is to be captured is one that surveys the possibilities common to the discourse participants who have accepted the information in the discourse so far. Thus, the picture is that as discourse proceeds, the set of epistemic possibilities left open to the participants gradually narrows as they build up a common ground of information between them. When updating a context with a formula, we either eliminate worlds from that context that do not satisfy it or the formula operates as a ‘test’ on the context as a whole. Formulae of the form Might $\phi$ and Would $\phi$ and conditionals function as tests on the context. Here is the semantic clause for formulae of the form Might $\phi$:

Veltman’s account of Might: Let $\sigma$ be a set of possible worlds, $\phi$ a formula of a propositional language, $\|\phi\|$ its usual interpretation and $+$ the update operation. Then:

$$\sigma + \mathit{Might}\,\phi = \begin{cases} \sigma & \text{if } \sigma \cap \|\phi\| \neq \emptyset \\ \emptyset & \text{otherwise.} \end{cases}$$

The idea is that an information state will pass the test of Might $\phi$ whenever $\phi$ is true in at least one world in the context. But updating contexts with factual information such as that in (1b) will eliminate all those worlds that support the epistemic possibility that it is sunny, and so attempting to update the information state with it might be sunny will yield an empty information state, a sign that something has gone wrong. Veltman’s semantics for modals naturally extends to a simple semantics for conditionals. Gillies (2004) analyses conditionals as introducing tests on (Veltman style) information states: a state $\sigma$ will pass $\phi \Rightarrow \psi$ iff $\sigma$ obeys the Ramsey test for this conditional, that is,9

$$\sigma + \phi + \psi = \sigma + \phi.$$

9 This conditional should be distinguished from material implication $\to$. $\Rightarrow$ is the semantic representation of the natural language if–then conditional.
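The test clause for Might and the Ramsey-test clause for the conditional can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the authors' formalization: worlds are modelled as sets of true atoms, information states as sets of worlds, and the names `atom`, `neg`, `might`, `cond` and `seq` are hypothetical helpers introduced here. For propositional $\phi$, testing whether $\sigma + \phi$ is non-empty coincides with testing $\sigma \cap \|\phi\| \neq \emptyset$.

```python
from typing import Callable, FrozenSet

# Assumption: a world is the set of atomic sentences true at it,
# and an information state is a set of such worlds.
World = FrozenSet[str]
State = FrozenSet[World]
Update = Callable[[State], State]


def atom(p: str) -> Update:
    """Eliminative update: keep only the worlds where atom p holds."""
    return lambda s: frozenset(w for w in s if p in w)


def neg(phi: Update) -> Update:
    """Eliminative negation: discard the worlds that survive an update with phi."""
    return lambda s: s - phi(s)


def might(phi: Update) -> Update:
    """Test: return the state unchanged if phi is compatible with it, else the empty state."""
    return lambda s: s if phi(s) else frozenset()


def cond(phi: Update, psi: Update) -> Update:
    """Ramsey test: the state passes iff s + phi + psi equals s + phi."""
    return lambda s: s if psi(phi(s)) == phi(s) else frozenset()


def seq(s: State, *updates: Update) -> State:
    """Update a state with a sequence of sentences, left to right."""
    for u in updates:
        s = u(s)
    return s


sunny_world: World = frozenset({"sunny"})
rainy_world: World = frozenset()
start: State = frozenset({sunny_world, rainy_world})

# (1a) It might be sunny. It's not sunny.
a = seq(start, might(atom("sunny")), neg(atom("sunny")))
# (1b) It is not sunny. It might be sunny.
b = seq(start, neg(atom("sunny")), might(atom("sunny")))
```

On this toy model, (1a) leaves a non-empty state containing only the non-sunny world, whereas (1b) ends in the empty state, which mirrors the contrast Veltman's test semantics is meant to capture.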


von Fintel (2002) takes a more ‘dynamic’ approach to counterfactuals, making use of the idea that the set of modal possibilities relevant to the evaluation of the counterfactual dynamically evolves and expands to include possibilities that verify the antecedent of the counterfactual. One might think that the modal might could be understood similarly as an expander of some background set of epistemic possibilities. This is known as the accommodation account of might, according to which Might $\phi$ expands the epistemic possibilities to make $\phi$ true (Gillies 2003). The attraction of this view is that it offers a Lewisian semantics for counterfactuals, or one built on a semantics for conditionals like the one above, a solution for the following pair of problematic examples.10

(15) a. If John were to come to the party, it would be fun.
     b. But of course if John were to come to the party, he might have a heart attack and that wouldn’t be fun.

10 Thanks to Thony Gillies for these examples.

On a standard Lewisian semantics for might counterfactuals, it is difficult to see how to make these two counterfactuals jointly satisfiable, as we show in more detail in section 4. An accommodation account of might offers hope of a satisfactory analysis of (15). However, the accommodation account needs to be strongly constrained, if we are to account for Veltman’s original observations about might, which strongly suggest that Might functions as a test on existing epistemic possibilities. Further, any compositional analysis of counterfactuals that adopts this view of might threatens to trivialize the semantics by making all might counterfactuals with a satisfiable consequent true, since it should always be possible to extend the set of modal possibilities to include those that support some satisfiable sentence. In our semantics we will reject an accommodation account of might and argue for a different analysis of (15).

Veltman’s theory and Gillies’ extension capture the data about the interactions between might sentences, conditionals and non-modal sentences effectively. But they do not extend easily to capture other data about the modals, in particular the data on modal subordination or how modal statements interact dynamically. Might is just a test on the whole information state according to Veltman’s semantics; so if the update with Might $\phi$ is successful, then we get back the same context that we started with. The facts about modal subordination, however, require us to isolate the set of epistemic possibilities in which $\phi$ holds so that subsequent discourse can modify it further; for example we must isolate those worlds in the context where the wolf walks in as in (2a), in order to refine those possibilities to those where the hearer gets eaten first. This is simply not possible on the Veltman semantics: we cannot isolate the relevant set of epistemic possibilities and we cannot modify that set, unless we update with a non-modal assertion. But the facts about modal subordination show that modification is possible with modals and that this modification is not equivalent to updating with a non-modal assertion. Veltman’s test semantics for might alone cannot furnish us an appropriate context in which to evaluate subsequent modal claims.

Now, the work by Veltman and Gillies is for a quantifier-free fragment only, and so we need to extend their semantics to a quantificational one. Most research in dynamic semantics has adopted the distributive semantics for quantification (e.g. Groenendijk & Stokhof 1991; Kamp & Reyle 1993), and we will follow suit. The distributive semantic framework of DRT is also the one in which the first accounts of modal subordination were developed. Roberts’ (1987) account of modals, for instance, considers them to be two place operators, one argument of which is given by what is in the scope of the modal operator, the other argument (the restrictor) accommodated from the context. Modal subordination is then a matter of finding an appropriate proposition in the antecedent discourse to serve as the restrictor or first argument of the modal. For example, in (2a), the proposition under the scope of the Might operator in the first sentence provides the first argument to the operator Would in the second sentence, while the second argument is given by the material under the scope of the would in the second sentence. Roughly, Roberts (and also Frank 1997) gives the following semantics for Would ($\phi$, $\psi$): every element of some fixed set of modal possibilities associated with that context that satisfies $\phi$ also satisfies $\psi$. So, for example in (2a) the formula it would eat you first is satisfied in a context just in case every possibility in which a wolf walks in is also an element in which the wolf eats the hearer first. This is a distributive semantics for the modal operators.

This account is attractive for its simplicity, as are related anaphoric accounts of modal subordination like those of Geurts (1995) and Frank (1997). But we do not think they offer the right approach to modal subordination. First, two place operator accounts like Roberts’ (and also Geurts 1995; Frank 1997) do not specify the first argument of a modal operator; that is left to pragmatics. These accounts do not get us the right interpretation of (2a) by themselves; for example another possible interpretation of (2a) is that the first argument of would is not the proposition under the scope of might, but any other contextually

104 Were, Would, Might and a Compositional Account of Counterfactuals salient proposition. Consider, for instance, (2a) in the following discourse context: (16) a. John doubts/claims that a tiger will walk in. b. But a wolf might walk in. c. It would eat you first.

(17) A tiger might walk in. Then a wolf might walk in. They might eat you first. Example (17) shows that the possibilities relevant to the interpretation of the third modal statement are those where a wolf and a tiger walk in, something which Frank’s approach could accommodate. However, given that both Roberts and Frank work with a fixed set of modal possibilities, information not under the scope of a modal operator doesn’t affect the interpretation of future modal statements.11 And that is not right for epistemic modalities; we need to make the set of modal possibilities dynamically evolve as the information state evolves. Allowing conjunctions of propositions to be antecedents for modals as Frank does, further, threatens to exacerbate the problem mentioned 11

This problem also affects Geurts’ approach. See van Rooij (2005) for a review and criticism.


Anaphoric accounts permit the proposition that a tiger will walk in to be the first argument of the modal. But we do not think that (16) can have that reading. Frank’s account explicitly allows propositions under the scope of negation to count as first arguments of binary modal operators. But if we replace (16a) with It’s not the case that a tiger will walk in, we still think that the only possible reading of the discourse entails that it is the wolf that would eat you first. Further, the distributive semantics does not predict the data in (1a,b); any number of contextually available antecedents might make (1b) acceptable. Another drawback is that these accounts of modal subordination make it very difficult to capture Veltman’s observations. Roberts’ account does not update the epistemic possibilities; it is essentially a static view of modality. Frank’s account allows us to consider essentially conjoined propositions as anaphoric antecedents to be identified with the restrictor: the proposition that is used here need not be the content of a single sentence, but rather could be the conjunction of multiple earlier discourse constituents. This allows one to make use of the content of successive modals to evaluate information under the scope of yet another modal, something that would be needed, for example to interpret the plural pronoun in (17):


in the previous paragraph—namely, that there is an over-generation of possible readings of modally subordinate sentences. Consider:

(19) a. A wolfi might walk in. Then again one might not. #Iti/#The wolfi would eat you first. 12 See also Kaufmann (1997, 2000) and Kibble (1998) for more discussion of these sorts of constraints on modal subordination. 13 Thanks to Ken Safir for coming up with this example. 14 In section 4 we present a constraint on entailments to the effect that ambiguous expressions must be disambiguated the same way both on the left and right of the turnstile. One might wonder whether this constraint makes both the pronoun and modal cases fall out. It would only if one takes the possibility of having distinct antecedents to be a genuine ambiguity, which seems controversial at best. In any case, we do not wish to endorse such a view.


(18) A wolf might walk in. It probably wouldn’t eat you. But a tiger might walk in, and it definitely would eat you.

The modal in the fourth clause in (18) could take any of the embedded propositions expressed by the last three sentences on Frank’s approach, whereas intuitions would suggest that the last modal is constrained to take the last updated set of epistemic possibilities, and this predicts as in (17) that there will be both a wolf and a tiger present.12 Salience concerns dictate that the tiger is the preferred eater. Now one could get the wolf to be the cause of the addressee’s fright, if the context contained the information that wolves normally do not eat people unless there is a tiger present, in which case the wolves become very competitive and aggressive, and tend to eat people present.13 The problem of not specifying in any way the antecedents for the modals on the anaphoric approach leads to an appalling lack of logical structure for the modals in natural language. It is problematic to evaluate whether he is fat entails he is fat without specifying the antecedent of the pronoun in each case; if we do not know whether the two he’s have the same antecedent (or reference), we cannot tell whether the entailment goes through. The anaphoric and accommodation accounts predict that it is equally difficult to evaluate entailments between modal claims, and that just seems wrong to us. John might come to the party clearly entails that John might come to the party. For this reason, the anaphoric account of modals yields very little in the way of an interesting logic for natural language modals and none of the intuitively valid inferences we presented earlier.14 But the most important reason for rejecting anaphoric accounts is that modal subordination relations do not pattern with relevantly similar instances of anaphoric reference to propositions or other abstract entities.
Consider the following examples, where the modal would in (19a,b) and the anaphor in (19c) is supposed to pick up the possibility that a wolf walks in.

b. A wolf$_i$ might walk in. Then again there might not be any wolf. #The wolf$_i$ would scare you.
c. [A wolf might walk in]$_j$. Then again one might not. I’m (not) concerned about it$_j$ (that$_j$, the possibility$_j$).
d. A student owns a car. Another student doesn’t own a car. The car is very expensive.

15 We are thankful to an anonymous reviewer for pointing this out.
16 It appears that van Rooij (2005) takes a similar view. See also Zeevat (1992) for some related discussion. We propose a different approach from van Rooij’s because his paper is primarily concerned with presupposition, which we feel is a somewhat orthogonal issue to the semantics of modals and modal subordination. While the formalism proposed by van Rooij is quite elegant, we think that it has problems with simple cases of modal subordination that we will outline once we get to our technical proposal.


The anaphoric approach requires that the first argument of the modal in (19a) be an accessible antecedent referring to an abstract entity. Now (19c), which involves anaphoric reference to an abstract entity in exactly the same position as (19a,b), is perfectly acceptable. And in fact a theory like DRT augmented with an account of accessibility for abstract entities (e.g. Asher 1993; Frank 1997) predicts that the proposition that a wolf walks in is accessible to the modal in the third sentence of (19a,b). When we add constraints given by discourse structure on anaphora such as those found in Segmented Discourse Representation Theory (SDRT) (Asher & Lascarides 2003), we still predict that the modals in (19a,b) should be able to find appropriate antecedents. Our predictions of anaphoric accessibility using DRT or SDRT indicate correctly that (19c) is acceptable. On the other hand, the modally subordinate cases (19a,b) are semantically uninterpretable. The uninterpretability of the pronoun in (19a,b) could be argued to result from the fact that the wolf mentioned in the first constituent is not salient,15 but it is difficult to maintain that salience is the sole cause of infelicity when we replace the pronoun with the definite, especially in view of the passable (19d), in which the car clearly has the first instance of a car as antecedent. The conclusion we think that one must draw from these observations is that modals are not anaphoric expressions. Our alternative picture is that modals are dynamic operators that take epistemic possibilities as inputs and refine, reset or otherwise modify them.16 Our view is in many ways closer to that advocated by Kratzer, except that for us the ‘modal bases’ are dynamic and change as the discourse context proceeds. The updated epistemic possibilities serve as inputs to subsequent occurrences of modals. 
If we take the superficial order of sentences to determine the sequence of updates, then we should expect that it is always the most recent set of epistemic possibilities that are relevant to evaluating a modal. If this were right, then modals would behave


similarly to tenses in a dynamic semantic theory like DRT (Kamp & Reyle 1993). But the situation is not so simple, as the following example shows:

(20) a. (p1) John may come to the party. (p2) He might drink quite a bit. (p3) We would all have fun.
b. (p4) But then again he might not drink anything. (p5) And then we wouldn’t have fun.

(21) a. (p1) John had a great evening last night.
b. (p2) He had a great meal.
c. (p3) He ate salmon.
d. (p4) He devoured lots of cheese.
e. (p5) He then won a dancing competition.

Example (21c–d) provides ‘more detail’ about the event in (21b), which itself elaborates on (21a). Example (21e) continues the elaboration of John’s evening that (21b) started, forming a narrative with it (indicating temporal progression). Clearly, the ordering of events does not follow the order of sentences, but rather obeys the constraints imposed by discourse structure, as shown graphically below. Thus the eventualities that are understood as elaborating on others are temporally subordinate to them, and those events that represent narrative continuity are understood as following each other. The relevant parameter for interpreting tenses is discourse adjacency in the discourse structure, not superficial adjacency. A theory like SDRT (Asher 1993; Asher & Lascarides 2003) provides the discourse structure in Fig. 1 for (21) and this allows us to get a proper treatment of the tenses therein. Here p6 and p7 are discourse constituents created by the process of inferring the discourse structure. See Asher & Lascarides (2003) for details.
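The claim that tense interpretation follows discourse adjacency rather than textual adjacency can be made concrete with a small sketch. The encoding below is only one plausible rendering of the structure described in the prose for (21): the relation names come from the text, while the membership of the complex constituents p6 and p7, the dictionary layout and the helper names are assumptions made for illustration.

```python
# Hypothetical encoding of the discourse structure described for (21).
# p1..p5 label the clauses; p6 and p7 are the complex constituents, and
# their membership below is an assumption based on the prose description.
RELATIONS = [
    ("Elaboration", "p1", "p6"),  # the evening is elaborated by p6
    ("Narration", "p2", "p5"),    # the meal, then the dancing competition
    ("Elaboration", "p2", "p7"),  # the meal is elaborated by p7
    ("Narration", "p3", "p4"),    # salmon, then cheese
]
MEMBERS = {"p6": ["p2", "p7", "p5"], "p7": ["p3", "p4"]}


def leaves(label):
    """Expand a complex constituent into the clause labels it contains."""
    if label not in MEMBERS:
        return [label]
    return [leaf for member in MEMBERS[label] for leaf in leaves(member)]


def temporal_constraints(relations=RELATIONS):
    """Narration(a, b): a precedes b. Elaboration(a, b): b's events are part of a."""
    precedes, part_of = set(), set()
    for relation, left, right in relations:
        if relation == "Narration":
            precedes.add((left, right))
        elif relation == "Elaboration":
            part_of.update((leaf, left) for leaf in leaves(right))
    return precedes, part_of


precedes, part_of = temporal_constraints()
```

On this encoding p5 is ordered after p2, its attachment site, and no precedence is recorded between p5 and the textually adjacent p4, which is instead part of p2.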


The first modal in (20b) cannot take as input the last updated set of epistemic possibilities in (20a) because then the updated epistemic possibilities in (20b) should support both that John drinks quite a bit and that he does not drink anything. Intuitively, no epistemic possibility should do that. This strategy predicts that (20a,b) is at least modally inconsistent. But the example is perfectly interpretable and ordinary. What intuitively happens with (20b) is that the input epistemic possibilities are those introduced by the first sentence in (20a). That is, the input epistemic possibilities to the first sentence in (20b) are the output epistemic possibilities from the first sentence of (20a). The phenomenon observed in (20a,b) generalizes to tenses as well. Sometimes the temporal structure of a discourse is more elaborate than what is suggested by the DRT analysis of tenses. There are clearly temporal shifts that show that the treatment of tenses cannot simply rely on the superficial order of the sentences in the text.


The same idea is operative with the modals. When we construct an appropriate discourse structure for (20a,b), the appropriate attachment site for (20b) is the first sentence of (20a). The rest of the material in (20a) which elaborates on one possibility offers a contrast with the material in (20b), which elaborates a different, in fact disjoint, possibility, something we signal with the discourse relation Alternation. Figure 2 shows an appropriate discourse structure for (20a,b). Again, p6, p7 and p8 are created during the process of inferring the structure. This discourse structure intuitively makes clear that the two contrasting possibilities depend on the possibilities introduced by the discourse constituent that dominates them and that they elaborate. They offer two different ways the possibility of John’s coming to the party, which they elaborate on, might proceed.17 We now turn to providing the

17 Discourse effects using contrast also operate with conditionals and counterfactuals, a phenomenon that has received extensive discussion by various authors including Frank (1997) and von Fintel (2002).
(i) If the USA threw its weapons into the sea tomorrow, there would be war; but if all the nuclear powers threw their weapons into the sea tomorrow, there would be peace.
(ii) ??If all the nuclear powers threw their weapons into the sea tomorrow, there would be peace, but if the USA threw its weapons into the sea tomorrow, there would be war.
(iii) If the USA throws its weapons into the sea tomorrow, there will be war; but if all the nuclear powers throw their weapons into the sea tomorrow, there will be peace.
(iv) ?If all the nuclear powers throw their weapons into the sea tomorrow, there will be peace, but if the USA throws its weapons into the sea tomorrow, there will be war.
Similar asymmetries also feature in many different sorts of constructions and provide another instance of a general feature of contrasting constituents in discourse. Contrast is not a symmetric relation. The interpretation of the first clause gives rise to the natural implicature that only the USA throws its weapons into the sea, which then generates a natural contrast with the second clause of examples (i) and (iii) above. On the other hand, (ii) and (iv) require a more drastic revision of the asserted content to take place in order for the contrast to be valid. Contrast is a veridical relation in the sense of Asher & Lascarides (2003) and so does not support revision of asserted content, though the revision or cancellation of implicatures is a common phenomenon with contrast. For a more thorough discussion of the semantics of contrast, see Asher (1993).


Figure 1 SDRT graph for (21).


Figure 2 SDRT graph for (20a,b).

3 SEMANTICS OF MODALS

Recall that our purpose here is to combine distributive and nondistributive intuitions, in order to account for modal subordination and Veltman’s observations. To do this, we need to combine a basically distributive dynamic semantics for quantifiers with the non-distributive semantics for the modals. We begin with a standard dynamic semantic formalism like that of DPL with partial assignment functions, which yields a good account of quantification and inter-sentential anaphora. The elements of the context are world–assignment function pairs.18 To this we add another element for evaluating the modals, a pair consisting of a basic or global set of epistemic possibilities and a focussed set of epistemic possibilities. Thus, each point of dynamic evaluation becomes in effect an information state. In order to capture Veltman’s examples in (1a,b), we need to test the current epistemic possibilities when evaluating sentences of the form Might $\varphi$. Following Fernando (1993), we define a notion of satisfaction at the level of sets that rides piggyback on the underlying distributive interpretation. To handle the modals, we will define certain operations on sets of epistemic possibilities understood collectively.19 Accordingly, we extend the standard constituents of dynamic meanings in dynamic semantics, world–assignment pairs, to include two sets of

18 This semantics is essentially identical to that given for DRT in Fernando (1993) or Asher & Lascarides (2003).
19 This semantics has in effect the same formal apparatus as an account of plurals which takes seriously both distributive and collective predications. Collective and distributive interpretations of plurals are analogous to non-distributive and distributive interpretations of modals: collective interpretations test sets of assignments within a context, while distributive interpretations are determined relative to individual assignments that comprise the individual world–assignment pairs in the context.


semantic details of how to interpret modals, all the while with an eye to using this analysis within a discourse-based framework like SDRT, and to its applicability to a compositional theory of counterfactuals.


20 It seems that epistemic modals can embed, at least in the following way (though as one reviewer has pointed out to us, not all embeddings are equally felicitous).
(i) It might be that John might come.
Here embedding with identical modalities seems perfect to us. To ensure that our epistemic possibilities can handle arbitrarily deep (though finite) embeddings, we proceed inductively. We begin with some choice a of a set of world–assignment pairs and use that choice to inductively build up more complicated sets of epistemic possibilities. We detail this in a fuller version of this paper. Still, it might be that this is just an instance of modal concord, as with the possibility that John might come, in which one does not necessarily feel the effect of two distinct modal operators, as suggested by a reviewer. We leave this issue here for the present.
21 Let A be a fixed model consisting of a domain A, a set of worlds and an interpretation of the non-logical constants. The dynamic transitions $[\,]^A$ are defined inductively and we let $\|a\|$ be the semantic value of terms. Recall that in dynamic semantics we actually think of an existentially quantified formula $\exists x\varphi$ as consisting of two actions, one determined by the quantifier and one determined by $\varphi$. Accordingly, we will adopt the convention that $\exists x\varphi$ is dynamically to be understood as $\exists x \wedge \varphi$. In what follows, we make use of projection functions 1, 2 on $r$ to pick out the world and assignment function of a context element, respectively. We assume the usual mechanisms to avoid variable clash and the assignment of two objects to the same variable.

$$r[Rt_1,\ldots,t_n]^A r' \text{ iff } r = r' \wedge \langle \|t_1\|^A_{(1(r),2(r))},\ldots,\|t_n\|^A_{(1(r),2(r))}\rangle \in R^A_{1(r)}.$$
$$r[t_1 = t_2]^A r' \text{ iff } r = r' \wedge \|t_1\|^A_{(1(r),2(r))} = \|t_2\|^A_{(1(r),2(r))}.$$
$$r[\varphi \wedge \psi]^A r' \text{ iff } r\,[\varphi]^A \circ [\psi]^A\, r'.$$
$$r[\neg\varphi]^A r' \text{ iff } r = r' \wedge \neg\exists w'', h\;\; r[\varphi]^A\, r^{1(r),\,2(r)}_{w'',\,h},$$
where $r^{1(r)}_{w''}$ is the result of replacing $1(r)$ with $w''$ and $r^{2(r)}_{h}$ is the result of replacing $2(r)$ with $h$, and $h =_{\varphi} 2(r)$, that is, $h$ agrees on all variables with $2(r)$ except those introduced in $\varphi$.
$$r[\exists x]^A r' \text{ iff } \exists a \in A\;\; r' = r^a_x,$$
where $r^a_x$ is the result of extending $2(r)$ to $2(r) \cup \{\langle x, a\rangle\}$.


epistemic possibilities, each element of which is a set of quadruples consisting of a world, an assignment and two sets of epistemic possibilities, which are sets of world–assignment pairs.20 We adopt the constraint that epistemic possibilities at the outset include the actual world; namely, we stipulate that an initial state for the interpretation of a discourse is as follows: $e_0 = \{\langle w, \emptyset, G, F\rangle : w \in W\}$, where $w$ is the world of some element of $G$, $F = G$ and all assignment functions are undefined. As the notation suggests, $G$ is the global set of possibilities, and $F$ is the focussed set. Note that we assume that the focussed set of possibilities is the same as the global set of possibilities in the initial state. As usual, it is the task of indefinite noun phrases in discourse to introduce new discourse referents to which we can refer in subsequent discourse and to narrow down the information present in the initial information state (cf. Heim 1982; Kamp & Reyle 1993, and many others). It is the task of might modals and other operators to introduce a focussed set of possibilities distinct from the global epistemic possibilities. We believe that it is this part of their meaning that allows for phenomena like modal subordination to arise. It is straightforward to define a distributive dynamic DPL-style semantics, where a dynamic transition, $r[\varphi]r'$, is induced by formulae over our new information states.21 We will speak of $r'$’s being a $\varphi$ descendant of $r$, just in case $r[\varphi]r'$. We now lift this notion to a set of


epistemic possibilities in the spirit of Fernando (1993) and Groenendijk et al. (1996). Doing so will allow us to make use of the global and focussed sets of possibilities directly. $e'$ is a $\varphi$ descendant of $e$ iff $\forall r \in e\; \exists r' \in e'\;\, r[\varphi]r'$ and $\forall r' \in e'\; \exists r \in e\;\, r[\varphi]r'$. As the discourse proceeds, we learn things and so refine and indeed revise our epistemic possibilities in light of what has been learned. The semantics does not yet model this sort of change. Let us call the discourse context that set of quadruples that is the result of the evaluation of successive sentence tokens in a discourse. A discourse context is the same sort of animal as an epistemic possibility: a set of world–assignment, epistemic possibility quadruples; and it contains the information of what has been said up to this point. Simplifying matters considerably, we will take what has been said in discourse as having been established and accepted as part of the common ground.22 Thus, whatever is true or supported in such a discourse context should be reflected in the set of epistemic possibilities of those quadruples $r$ that are part of the discourse context. To accomplish this job, we introduce a new notion of a discourse update, which also incorporates a notion of revision. The global set of epistemic possibilities is just updated with new information, and in it, given our initial set up, $1(r) \in 3(r)$ for all $r$, meaning that update of $G$ always involves the actual world. It is the focussed epistemic possibilities that may shift to a counterfactual possibility and may need to be revised. Thanks to the work of Lewis (1973), Spohn (1988) and others, it is straightforward to define a revision function $\star$ on worlds that transfers to epistemic possibilities if we assume a partial ordering on worlds. The particular ordering we assume is that of Lewis (1973: 13–9).23 This ordering forms a system of spheres centred around each element $r$. A set $e$ of such elements can also have a system of spheres, where each sphere $S_n$ of $e$ is such that:

$$S_n(e) = \bigcup \{S_n(r) : r \in e\}.$$

Let $S^{\varphi}_m(e)$ be the smallest sphere around $e$ such that some elements in $S^{\varphi}_m(e)$ have $\varphi$ descendants.24 Then

$$e \star \|\varphi\| = \{r : \exists r' \in S^{\varphi}_m(e)\;\, r'[\varphi]r\}.$$
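The sphere-based revision just defined can be illustrated with a minimal propositional sketch. Everything concrete in it is an assumption made for illustration and not part of the account itself: worlds are modelled as sets of true atoms, assignments are ignored, formulae are treated as tests (so a world's only descendant is itself), and similarity between worlds is measured by the number of atoms on which they differ, a stand-in for whatever ordering one adopts. The function names are hypothetical.

```python
from itertools import combinations


def worlds(atoms):
    """All valuations over `atoms`, each represented as the frozenset of true atoms."""
    return [frozenset(c) for n in range(len(atoms) + 1)
            for c in combinations(atoms, n)]


def distance(w, v):
    """Assumed similarity measure: number of atoms on which two worlds differ."""
    return len(w ^ v)


def sphere(e, universe, n):
    """S_n(e): the union of the radius-n spheres around the elements of e."""
    return {v for v in universe if any(distance(w, v) <= n for w in e)}


def revise(e, phi, atoms):
    """The phi-worlds of the smallest sphere around e that contains a phi-world."""
    universe = worlds(atoms)
    for n in range(len(atoms) + 1):
        hits = {v for v in sphere(e, universe, n) if phi(v)}
        if hits:
            return hits
    return set()


atoms = ("sunny", "windy")
revised = revise({frozenset({"sunny"})}, lambda w: "sunny" not in w, atoms)
```

Here the only possibility is a sunny, windless world; revising with "not sunny" keeps just the closest non-sunny world, the one that is still windless. With finitely many worlds a smallest sphere always exists, so the limit assumption holds trivially in this sketch.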

22 Thus, we pass over all the problems of correction, denial and disagreement; but see Asher & Lascarides (2003) or Asher & Gillies (2004) for discussions of these phenomena. In any case, we think that revision involves necessarily some sort of discourse indications that this is taking place, indications that are not present in the simple examples we consider here.
23 Lewis’ ordering is connected and satisfies strong ordering. For simplicity we will only define our revision function on possibilities with respect to their world components.
24 We assume that there is such a smallest sphere (the limit assumption) in order to simplify the definitions.


Discourse update is needed to evaluate sequences of formulae that are translations of our examples. Discourse update for formulae other than existential ones is as follows.25 Suppose $\varphi$ is a formula other than $\exists x \wedge \psi$. Then:

$r$ is a $\varphi$ discourse update of $r'$ iff $\exists r_1$ such that $r'[\varphi]^A r_1$ and
$$r = (r_1)^{G_{r_1},\;F_{r_1}}_{G'_{r_1},\;F_{r_1} \star \|\varphi\|},$$
where $G'$ is a $\varphi$ descendant of $G$.

The notion of discourse update formalizes the idea that information introduced into the discourse must be reflected in the updated epistemic possibilities; hence the need for the revision operator $\star$. Note though that the $\star$ operator only applies to the focussed set of possibilities. This ensures that update is not guaranteed success if $G$ lacks $\varphi$ descendants. For example, in evaluating (1a,b), we check whether the translations of those formulae give us a sequence of coherent discourse updates, where a coherent discourse update is one where for some input $r$ there is a non-empty output. Discourse update is also the notion that we need to define logical consequence. Modal formulae also affect the epistemic possibilities via their basic semantics, that is, in how they affect dynamic transitions over $r$. Might $\varphi$ updates the value of the focussed set of epistemic possibilities with $\varphi$ and then revises the global possibilities so as to ensure that the focussed set remains a subset of the global set, provided that the set of focussed epistemic possibilities does have a $\varphi$ descendant. If not, the update fails, as Veltman’s observations attest.26

$$r[\mathit{Might}\ \varphi]^A\; r^{G_r,\,F_r}_{G',\,F'}, \text{ where:}$$

$F'$ is a $\varphi$ descendant of the maximal set $M$ of elements in $F_r$ that have $\varphi$ descendants, and $G' = (G_r \setminus M) \cup F'$, provided there are $\varphi$ descendants in $F_r$.
If not, then $F'$ is a $\varphi$ descendant of the maximal set $M$ of elements in $G_r$ that have $\varphi$ descendants, and $G' = (G_r \setminus M) \cup F'$, provided there are $\varphi$ descendants in $G_r$.
$r[\mathit{Might}\ \varphi]^A\, \emptyset$ otherwise.

$$r[\mathit{Would}\ \varphi]^A\; r^{(G_r,\,F_r)}_{(G',\,F')},$$
provided $G'$ and $F'$ are specified in one of two ways:

1. $G'$ is a $\varphi$ descendant of $G_r$ and $F'$ is a $\varphi$ descendant of $F_r$ or,
2. $F'$ is a $\varphi$ descendant of $F_r$ and $G' = (G_r \setminus F_r) \cup F'$.
$r[\mathit{Would}\ \varphi]^A\, \emptyset$ otherwise.

25 The definition of discourse update for existential formulae is as follows. Suppose $\varphi := \exists x \wedge \psi$. Then, $r$ is a $\varphi$ discourse update of $r'$ iff for some $r_3$, $r'[\exists x]^A r_3$, and each $r_1 \in G_r$ and each $r_2 \in F_r$ are such that $2(r_1)(x) = 2(r_2)(x) = 2(r_3)(x)$ and $r$ is a $\psi$ discourse update of $r_3$.
The clause for $\exists v\,\varphi$ descendants ensures that the bindings of variables outside of modal contexts carry over into them, when we quantify in, as in:
(i) A student just walked in. Pat might grant him an interview.
Processing of the first sentence, $\exists x \wedge Sx$, ensures that there is some individual $a$ that verifies $S$; the input context for the second sentence will thus be $r^a_x$, where $Sa$. The modal formula modifies this input set in such a way that this assignment is preserved, as we will see momentarily.
26 As usual, $r[\varphi]\emptyset$ indicates that update with $\varphi$ yields failure, that is, in our setting, update is not possible for any element of $r$.


Might intuitively involves an existential quantification over epistemic possibilities. And like all existentials in dynamic semantics, it involves a special kind of discourse action—that of resetting, in this case of the focussed epistemic possibilities. Might resets only the focussed set of epistemic possibilities because asserting Might / should not reset all the epistemic possibilities to those in which / holds. This semantics incorporates the test idea of Veltman’s semantics but involves more than a simple test; it allows information under the scope of the Might operator to transform the epistemic possibilities in the input. It also differs from ‘accommodation’ views of might like Gillies’ (2003), according to which might always enlarges the epistemic possibilities under consideration. On our view it rather refines certain epistemic possibilities that must be already in place. As in Kratzer’s original semantics both might and would depend on an antecedently given set of epistemic possibilities. In our account, however, we specify how these epistemic possibilities evolve as discourse proceeds. The anaphoric accounts of modals allow them to pick up any salient proposition in the context, as discussed earlier; our account is much more constrained and is closer to Kratzer’s original view. However, our semantics for would does share an important feature with anaphoric accounts. Would tests and updates the global set of epistemic possibilities or the set of focussed epistemic possibilities. Would thus semantically under-specifies which set of epistemic possibilities it depends on. Anaphoric accounts, of course, take all modalities to be under-specified and in a much more unrestrained way. In modal subordination cases, the update function of would is directed to the focussed set, especially when a failure to do so would leave presuppositions about anaphoric antecedents unsatisfied. 
However, in order to assess the validity of an argument involving would we need to check both update possibilities.
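A heavily simplified propositional sketch may help to see how the Might and Would clauses interact with factual update. It rests on several assumptions that go beyond the definitions above: an information state is reduced to a pair (G, F) of sets of worlds, with no actual world and no assignments; formulae are tests, so a world's only descendant is itself; and factual update is modelled by the hypothetical `learn`, which eliminates worlds from G and crudely refocuses on the new G when F loses all its worlds, standing in for sphere-based revision. None of the function names come from the paper.

```python
from itertools import combinations


def worlds(atoms):
    """All valuations over `atoms`, each a frozenset of the atoms that are true."""
    return frozenset(frozenset(c) for n in range(len(atoms) + 1)
                     for c in combinations(atoms, n))


def might(state, phi):
    """Reset F to its phi-worlds; if F has none, fall back on G; otherwise fail."""
    G, F = state
    for base in (F, G):
        M = frozenset(w for w in base if phi(w))  # elements with phi descendants
        if M:
            F_new = M  # tests leave worlds unchanged
            return ((G - M) | F_new, F_new)
    return None  # update fails


def would(state, phi):
    """Return the outputs of the two readings of Would (an empty list means failure)."""
    G, F = state
    outputs = []
    if all(phi(w) for w in G) and all(phi(w) for w in F):  # reading 1: global
        outputs.append((G, F))
    if all(phi(w) for w in F):                              # reading 2: focussed
        outputs.append(((G - F) | F, F))
    return outputs


def learn(state, phi):
    """Simplified factual update: keep the phi-worlds of G, refocus if F is emptied."""
    G, F = state
    G_new = frozenset(w for w in G if phi(w))
    if not G_new:
        return None
    F_new = frozenset(w for w in F if phi(w)) or G_new
    return (G_new, F_new)


W = worlds(("sunny", "wolf"))
start = (W, W)  # out-of-the-blue context: focussed set equals global set


def sunny(w):
    return "sunny" in w


def not_sunny(w):
    return "sunny" not in w


def wolf(w):
    return "wolf" in w


might_then_fact = learn(might(start, sunny), not_sunny)   # coherent
fact_then_might = might(learn(start, not_sunny), sunny)   # fails
bare_would = would(start, wolf)                           # fails out of the blue
subordinated_would = would(might(start, wolf), wolf)      # focussed reading only
```

In this toy setting a might claim followed by contrary factual information yields an output, the reverse order does not, and an out-of-the-blue would fails on both readings, whereas after a might claim it succeeds on the focussed reading alone.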


(22) A wolf would walk in. It might eat you first.

Our account explains this because in an out of the blue context, both the local and global possibilities are initially set to a very large set of epistemic possibilities, and it seems implausible that all of those possibilities contain a wolf that walks in. But that is what would have to happen in order for the would statement in (22) to go through. Our notions of logical consequence and validity remain like those from DPL with one important change: we replace the basic notion of a dynamic transition with our notion of discourse update. This is needed to ensure that new factual information affects the epistemic possibilities in the correct way.

Logical consequence: Let $C$ be a sequence of formulae. Then $C \models \varphi$ iff for all models $A$ and for all information states $r$, $r'$ such that $r'$ is a $C$ discourse update of $r$, there is an $r''$ such that $r''$ is a $\varphi$ discourse update of $r'$. Given an ambiguous expression on the left- or right-hand side of $\models$, we must test for $\models$ on each disambiguation of the premises and the conclusion modulo a parallelism constraint that tokens of the same ambiguous expressions are all disambiguated in the same way on both sides of the $\models$.

Validity: $\models \varphi$ iff $\emptyset \models \varphi$.

We will see the effect of the ambiguity condition momentarily. For the present we simply note that it corresponds to the definition of consequence in van Deemter (2005). 27 To encode the effects of discourse structure in the model of the sort needed to capture the semantics of (20a,b), we could adopt a more complex analysis of G as a set of sets, partially ordered under . But we prefer to account for discourse structure effects using the attachment constraints of SDRT (Asher & Lascarides 2003).


Notice that if not every element of $G_r$ has a $\varphi$ descendant, then the first possibility of satisfying Would $\varphi$ will fail; if in addition not every element of $F$ has a $\varphi$ descendant, then Would $\varphi$ will have no output. The specification of the focussed and global epistemic possibilities in the update conditions for Would $\varphi$ should be understood as two possible interpretations, not one set of update conditions involving a disjunction.27 Might and would are not duals on our semantics, due to their dynamic properties, and we think rightly so because of the difference in their discourse behaviour, in particular with respect to modal subordination. The standard modal subordination sequence (2a) is not as good when the order of modals is reversed in the same contexts:


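We can verify some logical consequences of our definitions.

Fact 1
(i) Would $\varphi \not\models \varphi$;
(ii) Might $\varphi \not\models$ Would $\varphi$;
(iii) Would $\varphi \models$ Might $\varphi$;
(iv) $\varphi \models$ Might $\varphi$;
(v) $\varphi \models$ Would $\varphi$;
(vi) Would obeys the axioms of $\Box$ in a K modal system, while Might obeys the axioms of $\Diamond$ in K.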
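Items (ii) and (iii) can be checked by brute force in a small propositional approximation. The sketch below is not the paper's system: it assumes states are pairs (G, F) of sets of worlds with F a non-empty subset of G, treats formulae as tests, and represents the two readings of would by the hypothetical functions `would_global` and `would_focus`. An inference is taken to hold when every output of the premise update admits an update with the conclusion.

```python
from itertools import combinations


def powerset(items):
    """All subsets of `items`, each returned as a frozenset."""
    items = list(items)
    return [frozenset(c) for n in range(len(items) + 1)
            for c in combinations(items, n)]


ATOMS = ("p", "q")
WORLDS = powerset(ATOMS)  # a world is the set of atoms true in it
# Every state (G, F) with F a non-empty subset of G.
STATES = [(G, F) for G in powerset(WORLDS) for F in powerset(G) if F]


def might(state, phi):
    """Outputs of Might phi: refocus on phi-worlds of F, else of G, else fail."""
    G, F = state
    for base in (F, G):
        M = frozenset(w for w in base if phi(w))
        if M:
            return [((G - M) | M, M)]
    return []


def would_global(state, phi):
    """Reading 1 of Would: both G and F consist of phi-worlds."""
    G, F = state
    return [state] if all(map(phi, G)) and all(map(phi, F)) else []


def would_focus(state, phi):
    """Reading 2 of Would: F consists of phi-worlds."""
    G, F = state
    return [((G - F) | F, F)] if all(map(phi, F)) else []


def entails(premise, conclusion, phi):
    """True iff every premise output can be updated with the conclusion."""
    return all(conclusion(out, phi)
               for state in STATES
               for out in premise(state, phi))


def p(w):
    return "p" in w


would_to_might = entails(would_global, might, p) and entails(would_focus, might, p)
might_to_would_global = entails(might, would_global, p)
might_to_would_focus = entails(might, would_focus, p)
```

In this approximation the inference from Would to Might holds on both readings, while the inference from Might to Would holds only on the focussed reading. Since the inference has to go through on every disambiguation, the failure on the global reading is what blocks (ii).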

28 Note, however, that prior to any resetting this claim is true, which makes the inference perhaps defeasibly plausible.
29 van Rooij (2005) seems to get this wrong. He requires a puzzling clause for the semantics of would in order to get standard cases of modal subordination to work in his system (his footnote 19). The requirement is that to evaluate $\Box\varphi$ we consider only accessible worlds $u$ such that for any world $v$ accessible to $u$, $u$ and $v$ have exactly the same set of accessible worlds. From this requirement, however, $\Box\varphi$ follows from $\Diamond\varphi$ on a standard notion of dynamic consequence for van Rooij’s system.
30 In the absence of other information in the discourse that might allow us to distinguish between global and focussed sets of possibilities, we would conclude that Would $\varphi$ implies $\neg$ Might $\neg\varphi$. However, this inference is non-monotonic. If antecedent discourse already forces us to distinguish global and focussed epistemic possibilities, then this implication no longer follows, as Would $\varphi$ may have been satisfied relative to the focussed epistemic possibilities of some contextual element.


(i) follows from the fact that even if an information state in our system (a tuple $r$) supports Would $\varphi$ (and hence that either its global or focussed epistemic possibilities support $\varphi$), there is no guarantee that $r$ supports $\varphi$, which is what our notion of entailment requires. For instance, the focussed epistemic possibilities might have been reset by a might claim such that $1(r)$, the world of evaluation, or $2(r)$, the assignment in the context, may not be elements of any of those possibilities.28 (ii) is another key non-equivalence that holds in our system.29 This is crucial because we do not want our existential and universal modalities to collapse. Might $\varphi$ resets and updates the focussed set of epistemic possibilities, but Would $\varphi$ will only follow from the output information state, if both the global and focussed set of epistemic possibilities support $\varphi$. The reason for this is the way we defined $\models$ for potentially ambiguous or under-specified expressions; because would in our system has an under-specified meaning, the inference must go through on both readings for the consequence to go through. The fact that the local possibilities support $\varphi$ given an update with Might $\varphi$ does not guarantee that the global possibilities will. Our semantics verifies (iii), regardless of which disambiguation for would is chosen.30 (iv) and (v) follow from the definition of discourse update. Note that this semantics makes predictions about the cases of standalone would. Consider again (7), repeated below. When attached to A’s

assertion it is natural to understand B as saying that in all of his epistemic possibilities, what A says turns out to be true. It marks a form of agreement, which is intuitively what is going on in (7b). The anaphoric account would make B’s assertion with stand-alone would some sort of logical truth on the standard semantics for would and so should be ruled out on pragmatic grounds of informativeness.

(7) a. A: Kim teased Pat.
b. B: Kim would do that.

(13) a. #John might come to the party but John would not come to the party. b. #John would not come to the party but John might come to the party.


Like Veltman’s original semantics, our semantics verifies $\varphi \models$ Might $\varphi$ by the definition of Might and the definition of discourse update. And also like Veltman, Might $\varphi \not\models \varphi$. Resetting the epistemic possibilities to reflect $\varphi$ does not introduce a requirement for $\varphi$ to be true in the world of evaluation $1(r)$ even after resetting of the epistemic possibilities by Might, although the actual world and assignment of evaluation are elements of the epistemic possibilities associated with them in the initial discourse context. Updating with modal formulae may make our focussed possibilities go counterfactual. Because the epistemic possibilities must always verify what has already been established in the discourse, Veltman’s examples (1a,b) immediately fall out as predicted. Might $\varphi$ updates the focussed epistemic possibilities of an element of the discourse context $r$ to those where $\varphi$ holds, as long as $\varphi$ was a global epistemic possibility in $r$. Updating an information state with it might be sunny resets the focussed possibilities to those where it is sunny; we can then update with the factual information that it is not sunny which will revise the epistemic possibilities to reflect the fact that we have now learned that it’s not sunny. The discourse has a coherent content, assuming that the original input context does have some sunny worlds as possibilities. However, updating first with it’s not sunny makes a subsequent update with it might be sunny fail or return $\emptyset$ because the input information state does not contain sunniness as an epistemic possibility; the requirement of discourse update, namely, that the facts established in the discourse must be reflected in the set of epistemic possibilities, precludes any such possibility. Our semantics also predicts that examples like (13a,b), repeated here, should not yield any coherent output for any given input state because of the nature of discourse update.


Like many other accounts, ours makes sense of the basic operations of quantifying into modal contexts. But it also delivers distinctive treatments of modal subordination, as with the classic examples in (2), repeated below.

(2) a. A wolf might walk in. It would eat you first.
b. A wolf might walk in. #It will eat you first.
c. A wolf might walk in. It might eat you first.

(17) A tiger might walk in. Then a wolf might walk in. They might eat you first.
(18) A wolf might walk in. It probably wouldn’t eat you. But a tiger might walk in, and it definitely would eat you.

To push the account, consider the following modification of (19a) and its counterpart (19e):

(19) a. A wolf_i might walk in. Then again one might not. #It_i/#The wolf_i would eat you first. #Then again it would not.
e. A wolf might walk in. Then again one might not. The wolf might eat you first. Then again there might be no wolf and hence no one gets eaten.

Example (19a) is predicted to be bad, given that the discourse precludes an attachment of the third sentence to the first. Given that the first and second sentences are linked by alternation as these sentences present sets of disjoint possibilities, the third sentence must attach to the second


The classic (2a) works as expected. The set of possibilities introduced by the might modality is picked up straight away and modified by a would sentence. A might sentence can also felicitously follow another might sentence, as in (2c); our semantics predicts modal subordination phenomena in that case as well. On the other hand, we cannot bind variables outside a modal context with quantifiers introduced inside a Might or Would operator, as in (2b). Suppose that it in the second clause of (2b) introduces an occurrence of the same variable x that a wolf inside the modal might binds. Although the might clause resets the epistemic possibilities and assigns x a value in all focussed epistemic possibilities, this resetting does not affect the actual world of evaluation or the actual assignment function. So the value that the variable is assigned by the existential quantifier cannot be passed on to the occurrence of the variable introduced by the pronoun it. One of the salient features of this approach is that the possible interpretations of modal subordination sentences are quite restricted. Examples (2), (17) and (18) come out as desired.


(20) a. ($\pi_1$) John might come to the party. ($\pi_2$) He might drink quite a bit. ($\pi_3$) We would all have fun.
b. ($\pi_4$) But then again he might not drink anything. ($\pi_5$) And then we wouldn’t have fun.

The discourse structure of this example tells us that $\pi_2$ and $\pi_3$ form a complex constituent $\pi_7$ that contrasts with a complex constituent $\pi_8$ containing $\pi_4$ and $\pi_5$. Both of these elaborate the first constituent, which focusses the local epistemic possibilities on those in which John comes to the party. It appears that the complex fronted phrase but then again really provides an alternative set of possibilities. In effect this is not just a contrast but also a contrast of alternative possibilities. The discourse presents two ways of elaborating the set of possibilities introduced in $\pi_1$. In the discourse structure for (20) (see Figure 2), $\pi_7$ and $\pi_8$ are related not only with contrast but also with a modal sort of alternation.31 The semantics for such modal alternations is to us intuitively clear: the two terms of the relation provide two disjoint (and typically exhaustive) sets of epistemic possibilities. To implement this in the semantics, we have to interpret this relation using the local possibilities from $\pi_1$ as input to $\pi_4$, which then outputs a set of modified local possibilities to $\pi_5$. Alternation is thus quite unlike so-called veridical discourse relations in SDRT in which the output from

31 We draw here on Zimmermann’s (2000) views about disjunction, which seem exactly applicable to these cases.


and this fails to give the right input global and local possibilities to interpret the third sentence. To see why, note that this means that the wolf would eat you first must attach to the discourse constituent given by a wolf might not walk in. According to our semantics, would can either test the global possibilities or the focussed ones; the focussed ones clearly fail to support the anaphoric connection required to interpret the pronoun under the scope of would since there are no wolves in the reset, focussed epistemic possibilities. On the other hand, the global epistemic possibilities, though not reset by the might claims, must contain possibilities where there are no wolves, in view of the fact that the second might claim has been satisfied. So the wolf would eat you first also fails to give an output when the global possibilities are used to evaluate the modal. Note also, however, that the first two sentences of (19a) are perfectly fine, given that the second might claim simply resets the focussed epistemic possibilities to those where there are no wolves. To handle (19e), we need to resort to discourse structure. But first let us turn to a simpler example, (20a,b), which we repeat below


the first term of the relation is used to interpret the second term; that is, for some discourse relation $R$,

$$\sigma\,\|R(\pi_1,\pi_2)\|\,\sigma' \quad\text{iff}\quad \sigma\,\|K_{\pi_1}\| \circ \|K_{\pi_2}\| \circ \|\varphi_R\|\,\sigma',$$

where $K_{\pi_i}$ is the logical form associated with the discourse constituent $\pi_i$ and $\varphi_R$ is the contribution to the interpretation by the discourse structure. The semantics for this modal sort of alternation has similarities to that for disjunction in standard dynamic semantics:

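$$\sigma\,\|\mathit{Alternation}(\pi_1,\pi_2)\|\,\sigma' \quad\text{iff}\quad \exists\sigma_1\,\exists\sigma_2\,\bigl[\sigma\,\|K_{\pi_1}\|\,\sigma_1 \ \wedge\ \sigma\,\|K_{\pi_2}\|\,\sigma_2 \ \wedge\ \sigma\,\|K_{\pi_1} \wedge K_{\pi_2}\|\,\emptyset\bigr].$$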
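The three conjuncts of the alternation condition can be checked mechanically in a toy setting. The following is a minimal sketch under strong simplifying assumptions: a state is just a set of worlds, each constituent is a plain eliminative test on worlds, and `update` and `alternation_holds` are hypothetical helper names, not part of the paper's formal system.

```python
from typing import Callable, FrozenSet

World = FrozenSet[str]            # a world = the set of atoms true in it
Prop = Callable[[World], bool]
State = FrozenSet[World]


def update(state: State, k: Prop) -> State:
    """Eliminative update: keep the worlds that verify k."""
    return frozenset(w for w in state if k(w))


def alternation_holds(state: State, k1: Prop, k2: Prop) -> bool:
    """Each term can update the input on its own, but updating with
    both terms together returns the empty state."""
    s1 = update(state, k1)
    s2 = update(state, k2)
    joint = update(s1, k2)
    return bool(s1) and bool(s2) and not joint


# Local possibilities after "John might come to the party":
# in one he drinks and we have fun, in the other neither holds.
local = frozenset({frozenset({"drinks", "fun"}), frozenset()})


def drinks_and_fun(w: World) -> bool:          # stands in for the first complex constituent
    return "drinks" in w and "fun" in w


def no_drink_no_fun(w: World) -> bool:         # stands in for the contrasting constituent
    return "drinks" not in w and "fun" not in w


def fun(w: World) -> bool:
    return "fun" in w


is_alternation = alternation_holds(local, drinks_and_fun, no_drink_no_fun)
not_alternation = alternation_holds(local, drinks_and_fun, fun)
```

Here the two elaborations of (20) count as alternatives because they are individually possible but jointly incompatible, whereas two compatible terms do not.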

32 Discourse structure, in particular the type of discourse relation used to link the new information to the discourse context, offers us the possibility of accounting for negated examples of modal subordination, which feature prominently in Frank’s (1997) account of modal subordination.

(i) I didn’t buy a refrigerator. It would have taken up too much room.
(ii) I didn’t buy a refrigerator. It would have cost too much.

We postulate a special discourse connection here according to which there is a causal link between the possibility described under the negation and the second clause containing the modal, which we gloss as: $\neg p$, but if $p$ then the result would have been $q$ (or, $p$ would have caused $q$). Furthermore, $q$ in turn is a reason for not doing $p$. The pattern of a negated sentence discourse linked to a sentence with an epistemic modal suggests a type of elliptical explanation involving a conditional, a particular type in other words of discourse relation between two constituents in a discourse. This discourse link differs truth conditionally from the sort of narrative link one gets with the two examples. This kind of account is supported by the fact that negated modal subordination is not universally possible, as shown by the following examples, which lack the discourse connection we describe.

(iii) #I didn’t buy a refrigerator. I would like it.
(iv) I didn’t buy a refrigerator. I didn’t like it. (de re OK)

The explanation for the special status of negation in these cases seems not to stem from the meaning of the operator itself but rather from the rhetorical structure that its uses in discourse suggest. Hobbs (2005) has independently told a somewhat similar story about examples like these.


So update with both $K_{\pi_1}$ and $K_{\pi_2}$ is possible, but the two are incompatible. This gives us exactly the desired reading of (20). In (19a), on the other hand, the alternation occurs between the first two constituents. The only place the third sentence can discourse attach is to the second sentence (for details, see Asher 1993; Asher & Lascarides 2003), and here the semantics of alternation does not provide for the right set of epistemic possibilities to verify the modal claim in the third sentence, as before. Finally, let us turn to (19e). The repeated use of might in the third and fourth sentences allows us to set up a complex contrast, where the third sentence provides a possible refinement of the first sentence’s local possibilities, while the fourth does the same for the second (see Asher 1993 for details). This sort of attachment is only possible because we see the contrast explicitly appealed to in the second pair of constituents, which is not the case in (19a).32

4 COUNTERFACTUALS AND MODALITY

(23) a. A: John might come to the party.
b. B: If he’s still smoking, he’ll annoy everyone else there.

The intended reading of the conditional in (23b) is that if John is still smoking and comes to the party, he will annoy everyone else there. Adapting a Veltman–Gillies-style semantics for conditionals to our framework, the definition below ensures that all the epistemic possibilities of a given element of the discourse context together with the world of evaluation and assignment support the conditional.

$\sigma\,[\varphi \Rightarrow \psi]_A\,\sigma$ iff
- every $\varphi$ descendant of $\sigma$ has a $\psi$ descendant and
- every $\varphi$ descendant of $F_\sigma$ has a $\psi$ descendant.
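As a rough illustration of how this test behaves, here is a sketch in a purely propositional toy model. The simplifications are ours, not the paper's: a context element is reduced to a world plus a flat set of focussed worlds, and a world counts as having a descendant for a formula exactly when the formula is true at it. The names `Element`, `supports_conditional` and `learn_conditional` are hypothetical.

```python
from typing import Callable, FrozenSet, List, NamedTuple

World = FrozenSet[str]            # a world = the set of atoms true in it
Prop = Callable[[World], bool]


class Element(NamedTuple):
    world: World                  # world of evaluation
    focus: FrozenSet[World]       # focussed epistemic possibilities


def supports_conditional(el: Element, phi: Prop, psi: Prop) -> bool:
    """The conditional is a test: wherever phi can be processed,
    psi must be processable too, both at the world of evaluation
    and at every focussed possibility."""
    def ok(w: World) -> bool:
        return (not phi(w)) or psi(w)
    return ok(el.world) and all(ok(w) for w in el.focus)


def learn_conditional(context: List[Element], phi: Prop, psi: Prop) -> List[Element]:
    """'Learning' a conditional: drop the context elements that fail the test."""
    return [el for el in context if supports_conditional(el, phi, psi)]


def smokes(w: World) -> bool:
    return "smokes" in w


def annoys(w: World) -> bool:
    return "annoys" in w


w_both = frozenset({"smokes", "annoys"})
w_smokes_only = frozenset({"smokes"})
w_neither = frozenset()

e1 = Element(w_both, frozenset({w_both, w_neither}))
e2 = Element(w_neither, frozenset({w_smokes_only}))

remaining = learn_conditional([e1, e2], smokes, annoys)
```

In this sketch `e2` is eliminated because one of its focussed possibilities has John smoking without annoying anyone, so only `e1` survives the update with (23b).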

Though the interpretation of the conditional is close to Gillies’ or Veltman’s, our approach has some advantages. First of all, we account for the sensitivity of conditionals to the epistemic modalities that have been introduced in discourse.33 Second, this account with nested epistemic possibilities allows for the embedding of conditionals within modals and within other conditionals unproblematically. Further, with

33 Of course, as we saw with might and would, discourse structure will determine what the epistemic possibilities are for any sentence to be interpreted in the discourse context.


Now that we have a semantics for modals in place, we are ready to study counterfactuals. We start this project by giving a semantics in our system for the conditional operator itself; this will serve as a foundation for the interpretation of counterfactuals. The conditional semantics by itself can interpret indicative conditionals (along with the Might operator in some cases); it is the pairing with the irrealis operator and Would that enables the interpretation of counterfactuals. We then turn to the irrealis mood that appears in counterfactuals and give a proposal for its semantics that relies on data about counterfactual semantics. We then show that the result of combining the semantics for the conditional operator and irrealis mood with the semantics for modals developed in the preceding section gives a satisfactory semantics for counterfactuals. A first question that arises in giving a semantics for $\Rightarrow$ in our system is this: given that we manage two sets of epistemic possibilities in our discourse contexts, which of them is the conditional in fact sensitive to? It does appear that conditionals are sensitive to the focussed epistemic possibilities:


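Fact 2

(i) $\varphi \Rightarrow \psi,\ \varphi \models \psi$;
(ii) $\varphi \Rightarrow (\psi \Rightarrow \chi) \models (\varphi \wedge \psi) \Rightarrow \chi$ and vice versa;
(iii) the deduction theorem holds for $\Rightarrow$;35
(iv) $\mathit{Might}(\varphi \wedge \neg\psi) \models \neg(\varphi \Rightarrow \psi)$;
(v) $\neg(\varphi \Rightarrow \psi) \models \mathit{Might}(\varphi \wedge \neg\psi)$.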

Now, with an account of the conditional marker in place, we can give our compositional semantics of counterfactuals. The semantics of counterfactuals is the result of applying the conditional operator to two

34 To see that $\Rightarrow$ verifies MP, suppose that for an arbitrary $\sigma$, $\sigma[\varphi \Rightarrow \psi]_A\,\sigma$ and $\sigma[\varphi]_A\,\sigma'$. So $\sigma$ has a $\varphi$ descendant $\sigma'$; by the semantics of the conditional, every $\varphi$ descendant of $\sigma$ has a $\psi$ descendant. So for some $\sigma''$, $\sigma'[\psi]_A\,\sigma''$, which suffices to validate MP by our definition of logical consequence. This establishes (i). To show (ii), suppose $\sigma[\varphi \Rightarrow (\psi \Rightarrow \chi)]\sigma$ and suppose further that $\sigma[\varphi \wedge \psi]\sigma'$, for some $\sigma'$. Now every $\varphi$ descendant of $\sigma$ must also have a $\psi \Rightarrow \chi$ descendant. And this implies that every $\psi$ descendant of any $\varphi$ descendant of $\sigma$ has a $\chi$ descendant. Now $\sigma'$ is a $\psi$ descendant of a $\varphi$ descendant of $\sigma$. So $\sigma'$ has a $\chi$ descendant. Since $\sigma'$ was arbitrary, we have shown that every $\varphi \wedge \psi$ descendant of $\sigma$ has a $\chi$ descendant and thus $(\varphi \wedge \psi) \Rightarrow \chi$. For the reverse direction, suppose we have $\sigma[(\varphi \wedge \psi) \Rightarrow \chi]\sigma$ and suppose we have $\sigma[\varphi]\sigma'$. We have to show that every $\psi$ descendant of $\sigma'$ has a $\chi$ descendant. Suppose not for some $\psi$ descendant of $\sigma'$, call it $\sigma^*$. But then $\sigma^*$ is a $\varphi \wedge \psi$ descendant of $\sigma$ and it must have a $\chi$ descendant. To show (iii), the direction from right to left is just an application of MP, which we have already shown holds for $\Rightarrow$. Now from left to right, assume $\Gamma, \varphi \models \psi$, which means that every $\Gamma, \varphi$ descendant of some input $\sigma$ has a $\psi$ descendant. Now consider any $\Gamma$ descendant of $\sigma$, call it $\sigma'$, and consider any $\varphi$ descendant of $\sigma'$; it also must have a $\psi$ descendant. And since $\sigma$ and $\sigma'$ were arbitrary, we have shown $\Gamma \models \varphi \Rightarrow \psi$. (iv) follows straightforwardly from the semantics of the conditional. To show (v), suppose that $\sigma[\neg(\varphi \Rightarrow \psi)]\sigma$. Then either some $\varphi$ descendant of $\sigma$ fails to have a $\psi$ descendant or some $\sigma_1 \in 3(\sigma)$ has a $\varphi$ descendant but no $\psi$ descendant. In the latter case, $\sigma$ clearly supports $\mathit{Might}(\varphi \wedge \neg\psi)$. Now suppose $\sigma$ itself has a $\varphi$ descendant that has no $\psi$ descendant. So $\sigma$ supports $\varphi \wedge \neg\psi$. By Fact 1 (iv), $\sigma$ has a $\mathit{Might}(\varphi \wedge \neg\psi)$ descendant, which is what we needed to show.

35 If a particular notion of conditionals (and entailment) satisfies the deduction theorem, then it enables the inference from $\psi \models \varphi$ to $\psi \Rightarrow \varphi$. Clearly, one would like this to hold in one’s logic.


Gillies’ or Veltman’s notion of a test, conditionals cannot contribute to information growth, not unless we take a more complicated picture of what it is to learn new information (where we might distinguish between different information states and so consider sets of information states as inputs). By separating out elements of discourse contexts from epistemic possibilities we can ‘learn’ (in our simplified sense here) conditionals, eliminating those elements of the discourse context whose epistemic possibilities do not support newly introduced conditionals. Because of the semantics of conditionals, updating the discourse context with a conditional will automatically be reflected within the epistemic possibilities permitted by the discourse context. Some desirable properties for conditionals fall out of this analysis, and in fact, we can come close to defining the epistemic conditional in terms of the modals.34

arguments, the first given by the antecedent clause and modified by an irrealis operator, the second given by the consequent, which is (in English at least) invariably modified by an epistemic modal (either might or would) or some other modal operator like should.36 (6a), for instance, is analysed below as (6a′):

(24) (6a) If I were not to sleep tonight, I would topple over tomorrow.
(6a′) Irr (I not sleep tonight) $\Rightarrow$ Would (I topple over tomorrow).

Might counterfactuals receive an entirely similar analysis. So the only thing we need to do now is to specify a meaning for irr. We hypothesize that the antecedent of a counterfactual in Romance and Germanic languages has some sort of an ‘irrealis’ operator, either introduced by the subjunctive mood or a special tense morpheme like the imparfait in French, the Konjunktiv II in German and were or the pluperfect in English. The exact source of irrealis is rather language dependent and further depends on the clause itself: it is not difficult to find counterfactual conditionals in English in which irrealis is expressed through past tense alone, without a subjunctive or pluperfect. Since our focus here is not on the morphosyntax of these constructions, we will not go into detail about where exactly irrealis is located.38 For the

36 The same does not hold for some other Germanic languages, according to a reviewer.

37 The interactions of tense and modality are complex (Abusch 1997; Condoravdi 2002), and we will not go into them here. Nevertheless, our general analysis applies equally well to past as well as ‘present’ tense counterfactuals. We also note that of course might statements can have, as Condoravdi notes, a ‘counterfactual’ reading on which Past outscopes Might. McCready (2006) indicates one way to incorporate Condoravdi’s results about tense into a SDRT theory of modals in the context of an analysis of a particular kind of interaction between temporal interpretation and discourse structure. However, there is a technical difficulty with the (simplified) account of modality presented there (involving clashes of veridicality). There are several possible fixes, but this is not the place to discuss them; here we simply note that any of them is compatible with the present more detailed account of epistemic modality.

38 See Iatridou (2000) for a discussion of the connection between tense and counterfactuality.


Intuitively, such a sentence is satisfied in a structure at an input context or information state just in case revising the input context with my not sleeping tonight makes true that I would topple over tomorrow, and the latter will be true in the adjusted descendant of the input just in case all the epistemic possibilities of that descendant make my toppling over true. We analyse ‘past’ tense counterfactuals like (6b) in just the same way as we handle present tense counterfactuals, except that the irrealis operator has scope over the past tense operator.37

(25) (6b) If I had not slept, I would have toppled over.
(6b′) Irr Past (I not sleep) $\Rightarrow$ (Would Past (I topple over)).


(5) A wolf might walk in. If it were to eat you first, I’d be unhappy, but not as unhappy as if it ate me first.

Lewis’ semantics treats counterfactuals in a context-independent way. But then we cannot interpret:

(26) If it were to eat you first, I would be unhappy.

The anaphoric dependence forces us to interpret the antecedent with respect to those local epistemic possibilities in which the wolf walks in, guaranteed to exist given the update effects of the first sentence of (5). At the very least we need to relativize a Lewis-style semantics to local epistemic possibilities on occasion. This of course already follows for our decompositional analysis of counterfactuals from the way we have treated epistemic conditionals. Now we have to give a semantics for the irrealis operator. Following Lewis’ intuitions, we take it to be an imaging operator.

Semantics for Irr:

$$\sigma\,[\mathit{irr}(\varphi)]_A\ \sigma\,\frac{w_\sigma,\ G,\ F}{w,\ G_{w\|\varphi\|},\ F_{w\|\varphi\|}}$$

where $w \in 1(\sigma)_{w\|\varphi\|}$.


purposes of the present paper, we will, however, make the assumption that irrealis is licensed by a feature located on the mood head that combines with other elements in the clause to yield past tense, pluperfect or whatever the right morphological realization of irrealis may be for a particular language or construction. While this is rather preliminary, it will suffice for our purposes here. We will hypothesize that this operator resets or revises at least the focussed epistemic possibilities of each element in the set of epistemic possibilities—that is for the purpose of evaluating the consequent of the counterfactual. Indeed, such a revision operator is implicit in Lewis’ own semantics for counterfactuals. However, our approach will be decidedly more epistemic, based as it is on a semantics for epistemic modalities and the epistemic conditional. There are reasons not to be completely happy with Lewis’ semantics for counterfactuals. First of all, we cannot account for modal subordination cases involving counterfactuals like (5). Such cases show that the Lewis semantics, which exploits a similarity ordering on worlds that is dependent only on the world of evaluation, cannot be right. Consider again (5) repeated below:


(5) A wolf might walk in. If it were to eat you first, I’d be unhappy, but not as unhappy as if it ate me first.

The first sentence of (5) resets the local epistemic possibilities to those in which a wolf walks in, and these possibilities are refined again by the antecedent of the counterfactual to those in which the wolf eats the addressee first. This last set of local possibilities, call it $F'$, now serves to interpret the consequent of the counterfactual in the world of evaluation, as intuitively desired. Of course, we have required that the consequents of conditionals also be evaluated relative to elements of $F$. Due to discourse update, the local possibilities of each element of $F$ also encode the information that a wolf has walked in and eats the addressee first, ensuring the appropriate interpretation of would in the consequent of the counterfactual. Our semantics also permits us to analyse the motivating examples for the accommodation accounts of might within counterfactuals. Given that we have argued that accommodation accounts of might provide too weak truth conditions to might counterfactuals, this is a good thing. Consider again,

(15) a. If John were to come to the party, it would be fun.
b. But of course if John were to come to the party, he might have a heart attack and that wouldn’t be fun.

Our treatment of these examples turns on the systematic underspecification of the epistemic possibilities relevant to would, that is, the ability of would sentences to be sensitive either to the focussed or global set of epistemic possibilities. By playing on the difference between the sets of global and focussed epistemic possibilities, both of which have been revised to take account of the information under the scope of the irrealis operator, we can jointly satisfy the two conditionals. Our


The operator works on the world of the discourse context element, but as it shifts that world it must also shift the global and epistemic possibilities to reflect the change in the world of evaluation. While operators like Might also reset local epistemic possibilities and have a stand-alone use, the irrealis marker resets the actual point of evaluation as well as the epistemic possibilities. A stand-alone use of the irrealis operator would lead to a loss of information. Thus, we provide a pragmatic reason to expect the lack of the irrealis operator outside the antecedents of conditionals. We have made both the conditional and irr sensitive to local epistemic possibilities. Thus, our semantics is designed to treat examples like (5), repeated once again below:


(27) a. If I strike this match, it won’t light.
b. If I strike this match and it’s wet, it will light.
(28) a. If I were to strike this match it would light.
b. If I were to strike this match and it were wet, it would not light.

39 The contrastive particle but is a necessary discourse cue, underlining the fact that one is switching from one set of epistemic possibilities to the next.

40 Our semantics does depart from Lewis’ in one respect: MP for counterfactuals, that is, the inference $A,\ A \mathrel{\Box\!\!\rightarrow} B \vdash B$, fails to be valid. What our counterfactual validates is the inference $A,\ A \mathrel{\Box\!\!\rightarrow} B,\ B \vdash \mathit{Would}\ B$. However, in our semantics $\mathit{Would}\ B$ does not entail $B$. There may be all sorts of facts reflected in the local epistemic possibilities that make $B$ true in those possibilities and make $B$ expected to hold in the actual world. But from this it does not follow that $B$ holds in $\sigma$. The failure of MP here follows from the fact that $\mathit{Would}\,\varphi$ does not entail $\varphi$. We think this is right, but we will not pursue this issue here, as it takes us to a much more radical revision of Lewis’ semantics and a new semantics for irr.


account permits this because in general the global and focussed epistemic possibilities are not identical. Suppose, for example that would in (15a) is sensitive to the focussed epistemic possibilities. Then it is possible that in all those possibilities in which John comes to the party the party is fun, even though there are epistemic possibilities in the global set where in fact John gets a heart attack. The might in (15b) forces us to reset the focussed possibilities using the global possibilities of the context updated with the antecedent. The output of the processing of the first clause of the consequent then resets the focussed possibilities to those where John comes to the party but gets a heart attack. Then when we evaluate the would claim with respect to those, it turns out indeed that the party is not fun.39 Our semantics relativizes the Lewis semantics for counterfactuals to a discourse-dependent set of epistemic possibilities, but otherwise, the counterfactual behaves much like Lewis. Inferences like strengthening of the antecedent and transitivity fail for our counterfactuals just as they do for Lewis counterfactuals.40 Our semantics takes the epistemic nature of counterfactuals seriously. But is this justified? Suppose we consider pairs like those in (28). These counterfactuals depend for their truth and their acceptability on all sorts of facts that we relegate to the background epistemic possibilities. For instance, we would not take either (28a) or (28b) to be true if we accepted as a background epistemic possibility that there was a world just like ours except that when I strike the match it does not light or a world just like ours except that when I strike the match and it is wet, it lights. We also think that it cannot be the case that counterfactuals like (28a,b) are true while the corresponding standard conditionals in (27a,b) are also true, if the antecedent of the counterfactual is still epistemically possible.

In other words, we have the entailment:

Counterfactuals to conditionals:

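$$\mathit{Might}\,\varphi \rightarrow (\varphi \mathrel{\Box\!\!\rightarrow} \psi) \rightarrow \neg(\varphi \Rightarrow \neg\psi).$$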

5 CONCLUSIONS

This paper has made two main contributions. The first was giving a new semantics for epistemic modals that validates both the Veltman ‘eliminative’ intuitions and Roberts/Frank-style distributive intuitions; this is something that existing accounts have not been able to do, and we see the step we have taken as an important one in giving a universally valid theory of modality. The second contribution crucially builds on the first: an account of counterfactuals that takes seriously the notion that they can be compositionally built up from

41 In fact, we think that we ultimately should have a more complex semantics of irr that would bring closer together counterfactuals and the normality conditionals of Asher & Morreau (1991). But that is a topic for another time.


There thus seems to be a close connection between the counterfactual and the epistemic possibilities as well.41 Our account’s interpretation of the unary modals was designed among other things to provide a compositional semantics for counterfactuals, and it certainly does better than the anaphoric approach. The latter must stipulate that the first argument of would is in fact ‘grammatically’ specified in a counterfactual. The fact that the semantics for would must take the antecedent into account falls out for free on the present account. It follows from the sensitivity of epistemic conditionals to the focussed possibilities as they are determined by previous discourse and to the imaging semantics given for the irrealis operator. Further, our account shows why a would or might is necessary in a counterfactual construction: the irrealis shifts the epistemic possibilities but not the actual world or assignment, and in order for the proposition under the scope of the irrealis operator in the antecedent of the conditional not to be irrelevant to the evaluation of the consequent, we must use an epistemic modal to test the consequent in the relevant set of possibilities. This story, we think, improves on the Lewis account, which has nothing to say directly about the relationship between the irrealis mood, the modal used and the type of counterfactual in play, due to the use of the non-compositional operators $\Box\!\!\rightarrow$ and $\Diamond\!\!\rightarrow$.


ERIC MCCREADY Department of English Aoyama Gakuin University 4-4-25 Shibuya Shibuya-ku, Tokyo 150-8366 [email protected]

Acknowledgements The authors thank Thony Gillies, Hans Kamp, Robert van Rooij, Philippe Schlenker and two anonymous reviewers for this journal for helpful discussion and comments. The second author gratefully acknowledges the support of the Japan Society for the Promotion of Science (grant #P05014).

42 See McCready (2005) for a first attempt at such a project.

REFERENCES

Abusch, D. (1997), ‘Sequence of tense and temporal De Re’. Linguistics and Philosophy 20:1–50.
Asher, N. (1993), Reference to Abstract Objects in Discourse. Kluwer Academic Publishers. Dordrecht.
Asher, N. & A. Gillies (2004), ‘Common ground, corrections and coordination’. Argumentation 17:481–512.
Asher, N. & A. Lascarides (2003), Logics of Conversation. Cambridge University Press. Cambridge.
Asher, N. & M. Morreau (1991), ‘Commonsense entailment: A modal theory of non-monotonic reasoning’. In J. Mylopoulos and R. Reiter (eds.), IJCAI 1991. San Francisco. 387–92.


their semantic constituents: the conditional itself, the modals that appear in the consequent and irrealis mood. The research project described here is just a part of a much larger project: a full semantics of English modals and conditionals. The semantic proposal in this paper provides a way to analyse many other complex conditionals that we did not discuss here—namely, those involving must, ought and should, for instance. We have presented in effect a family of semantic theories for complex conditionals that use the basic epistemic conditional together with various modals. We have also produced an account of the basic epistemic modals that answers to intuitions about modal subordination as well as Veltman’s observations about the non-distributive semantics of epistemic modals. These basic positions can be extended to a host of other modalities, such as, for instance, deontic modalities,42 if the facts warrant.

Bennett, J. (2003), A Philosophical Guide to Conditionals. Oxford University Press. Oxford.
Condoravdi, C. (2002), ‘Temporal interpretation of modals’. In D. Beaver, S. Kaufmann, B. Clark, and L. Casillas (eds.), The Construction of Meaning. CSLI Publications. Stanford, CA. 59–88.
Fernando, T. (1993), ‘Generalized quantifiers as second order programs: ‘dynamically’ speaking, naturally’. In P. Dekker and M. Stokhof (eds), Proceedings, Ninth Amsterdam Colloquium. University of Amsterdam, Amsterdam.
Frank, A. (1997), Context Dependence in Modal Constructions. Ph.D. thesis, University of Stuttgart, Germany.
Geurts, B. (1995), Presupposing. Ph.D. thesis, University of Stuttgart, Germany.
Gillies, A. (2003), ‘Modal scorekeeping and ‘might’ counterfactuals’. In P. Dekker (ed.), Proceedings of Fourteenth Amsterdam Colloquium. ILLC Publications. Amsterdam.
Gillies, A. (2004), ‘Epistemic conditionals and conditional epistemics’. Nous 38:585–616.
Groenendijk, J. & M. Stokhof (1991), ‘Dynamic predicate logic’. Linguistics and Philosophy 14:39–100.
Groenendijk, J., M. Stokhof, & F. Veltman (1996), ‘This might be it’. In D. Westerstahl and J. Seligman (eds.), Language, Logic and Computation: The 1994 Moraga Proceedings. CSLI Publications. Stanford, CA. 255–70.
Heim, I. (1982), On the Semantics of Definite and Indefinite Noun Phrases. Ph.D. thesis, University of Massachusetts. GLSA Publications, University of Massachusetts at Amherst. Amherst, MA.
Hobbs, J. (2005), ‘Toward a useful concept of causality for lexical semantics’. Journal of Semantics 22:181–209.
Iatridou, S. (2000), ‘The grammatical ingredients of counterfactuality’. Linguistic Inquiry 31:231–70.
Kamp, H. & U. Reyle (1993), From Discourse to Logic: Introduction to Model Theoretic Semantics of Natural Language, Formal Logic and Discourse Representation Theory. Kluwer Academic. Boston, MA.
Kaufmann, S. (1997), ‘What’s at stack in discourse’. In P. Dekker and M. Stokhof (eds), Proceedings of the 11th Amsterdam Colloquium. Amsterdam.
Kaufmann, S. (2000), ‘Dynamic context management’. In Martina Faller, Stefan Kaufmann, and Marc Pauly (eds.), Formalizing the Dynamics of Information. CSLI Publications. Stanford, CA. 171–88.
Kaufmann, S. (2005), ‘Conditional truth and future reference’. Journal of Semantics 22:231–80.
Kibble, R. (1998), ‘Modal subordination, focus, and complement anaphora’. In J. Ginzburg, Z. Khasidashvili, C. Vogel, J.-J Levy and E. Vallduví (eds), The Tblisi Symposium on Logic, Language and Computation: Selected Papers. CSLI Publications, Stanford, USA. 73–86.
Kratzer, A. (1977), ‘What ‘must’ and ‘can’ must and can mean’. Linguistics and Philosophy 1:337–55.
Kratzer, A. (1981), ‘Partition and revision: The semantics of counterfactuals’. Journal of Philosophical Logic 10:201–16.
Kratzer, A. (1986), ‘Conditionals’. Chicago Linguistics Society 22:1–15.
Kratzer, A. (1991), ‘Modality’. In A. von Stechow and D. Wunderlich (eds.), Semantik: Ein internationales Handbuch der zeitgenössischen Forschung. Walter de Gruyter. Berlin. 639–50.

Nicholas Asher and Eric McCready 129 Lewis, D. (1973). Counterfactuals. Basil Blackwell. Oxford.

Spohn, W. (1988), ‘Ordinal conditional functions. A dynamic theory of epistemic states’. In W. L. Harper and B. Skyrms (eds.), Causation in

First version received: 21.11.2005 Second version received: 04.04.2006 Final version received: 20.11.2006

McCawley, J. (1996), ‘Conversational scorekeeping and the interpretation of conditional sentences’. In M. Shibatani and S. Thompson (eds.), Grammatical Constructions: Their Form and Meaning. Clarendon Press. Oxford. 77–101. McCready, E. (2005), The Dynamics of Particles. Ph.D. thesis, Dept. of Linguistics. ProQuest/UMI, Ann Arbor, MI. University of Texas Austin, TX. McCready, E. (2006), ‘Created objects, coherence and anaphora’. Journal of Semantics 23:251–79. Ramsey, F. (1929), ‘Law and causality’. Reprinted in Foundations (1978). Routledge. London. 128–51. Roberts, C. (1987), Modal Subordination, Anaphora, and Distributivity. Ph.D. thesis, Dept. of Linguistics, University of Massachusetts. GLSA Publications, University of Massachusetts at Amherst. Amherst, MA.


Roberts, C. (1989), ‘Modal subordination and pronominal anaphora in discourse’. Linguistics and Philosophy 12:683–721.

Decision, Belief Change, and Statistics, vol. II. Kluwer. Dordrecht. 105–34. Stalnaker, R. (1975), ‘Indicative conditionals’. Philosophia 5:269–86. van Deemter, K. (2005), ‘Towards a logic of ambiguous expressions’. In K. van Deemter and S. Peters (eds.), Semantic Ambiguity and Underspecification. CSLI Publications. Stanford, CA. 203–37. van Rooij, R. (2005), ‘A modal analysis of presupposition and modal subordination.’ Journal of Semantics 22:281–306. Veltman, F. (1985), Logics for Conditionals. Ph.D. thesis, Dept. of Philosophy, University of Amsterdam, The Netherlands. Veltman, F. (1996), ‘Defaults in update semantics’. Journal of Philosophical Logic 25:225–61. von Fintel, K. (2002), ‘Counterfactuals in a dynamic context’. In M. Kenstowicz (ed.), Ken Hale: A Life in Language. MIT Press. Cambridge, MA. 123–52. Zeevat, H. (1992), ‘Presupposition and accommodation in update semantics’. Journal of Semantics 9:379–412. Zimmermann, T. E. (2000), ‘Free choice disjunction and epistemic possibility’. Natural Language Semantics 8:255–90.

Journal of Semantics 24: 131–167 doi:10.1093/jos/ffm002 Advance Access publication May 3, 2007

Exceptions to Generics: Where Vagueness, Context Dependence and Modality Interact YAEL GREENBERG Bar-Ilan University

Abstract

This paper deals with the exceptions-tolerance property of generic sentences with indefinite singular and bare plural subjects (IS and BP generics, respectively) and with the way this property is connected to some well-known observations about felicity differences between the two types of generics (e.g. Lawler’s 1973 Madrigals are popular v. #A madrigal is popular). I show that whereas both IS and BP generics tolerate exceptional and contextually irrelevant individuals and situations in a strikingly similar way, which indicates the existence of a basically equivalent tolerance mechanism, there is also a difference between them, unnoticed so far, which concerns the degree to which the properties of the legitimate exceptions can be characterized in advance. Following claims in Greenberg (2003), I argue that both this newly observed difference and the traditional felicity differences result from an underlying contrast in the type of ‘non-accidentalness’ expressed by the two types of generic sentences, and more formally, in the accessibility relations that their generic quantifier (Gen) is compatible with. To capture the new difference in tolerance of exceptions, I develop an improved version of the exceptions-tolerance mechanism for generic sentences suggested in Kadmon & Landman (1993), namely, a restriction on the set of individuals and situations quantified by Gen, which is partially vague to two different degrees using supervaluationist methods. The different degrees of vagueness in this restriction are shown to be systematically dependent on the two types of accessibility relations that IS and BP generics are compatible with, which are redefined as precise and vague restrictions on the generic quantification over worlds.

The Author 2007. Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected]

1 INTRODUCTION

This paper deals with two well-known puzzles in the analysis of generic sentences as in (1), which are usually discussed separately, and makes the novel claim that a proper understanding of one of these puzzles can lead to a better understanding of the other one:

(1) a. A dog has four legs.
b. Dogs have four legs.

The first puzzle has to do with the well-known fact that, although sentences as in (1) express generalizations and are usually represented as headed by a generic operator with universal force, they also tolerate exceptions. For example, unlike ‘Every dog has four legs’, both (1a) and (1b) are considered true, although there are clear cases of three-legged dogs. The second puzzle is that while both minimally contrasting generics with indefinite singular and bare plural subjects as in (1a) and (1b), respectively (IS and BP sentences, henceforth), express generalizations, support counterfactuals and tolerate exceptions, there are also reported differences between them. Examples of these are the ‘definitional’ nature of the IS generics v. the weaker ‘inductive’ nature of the BP ones (noted in Lawler 1973; Burton-Roberts 1977) and the felicity differences seen in example (2) (from Burton-Roberts and Lawler) and (3)–(4) (from Greenberg 2003). The puzzle, then, is how to capture simultaneously both the similarities and differences between the two types of generics:

(2) a. #A madrigal is popular/#A room is square/#A man is blond.
b. Madrigals are popular/Rooms are square/Men are blond.
(3) a. #A Norwegian student with a name ending with ‘s’ wears thick green socks (odd as generic, fine as existential).
b. Norwegian students with names ending with ‘s’ wear thick green socks.
(4) a. #I noticed that a thick book with a red paperback cover deals exactly with your thesis topic (odd as generic, fine as existential).
b. I noticed that thick books with red paperback covers deal exactly with your thesis topic.

In this paper, I discuss a novel observation which connects these two puzzles, namely, the fact that although both IS and BP generics tolerate exceptional and contextually irrelevant entities in a strikingly similar way, the ability to specify the properties of the exceptions to BP generics like (2b)–(4b) (namely those with infelicitous IS counterparts) is much lower than the ability to specify the properties of the exceptions to IS generics. The main claim I make is that both this difference and the felicity differences seen in (2)–(4) are a result of an underlying contrast in the kind of modality that IS and BP generics can express, argued for in Greenberg (2003).1 To capture the newly observed difference, the exceptions-tolerance mechanism for both IS and BP generics is defined as a restriction on the set of individuals and situations quantified over by the generic quantifier (Gen), which is systematically sensitive to the kind of modality of the sentence, namely to the restriction over worlds quantified over, and which is vague to two different degrees, using supervaluationist methods.

1 Preliminary versions of parts of this claim appear in Greenberg (2003, 2004).

The paper is structured as follows: Section 2 examines traditional as well as novel similarities in the way IS and BP generics tolerate exceptional and contextually irrelevant individuals and situations. Section 3 deals with the newly observed contrast in the way IS and a subclass of BP generics tolerate exceptions, which indicates that the similarity in the exceptions-tolerance mechanism cannot be total. Section 4 develops an informal explanation of the contrast, based on Greenberg’s (2003) theory of the two types of modalities that IS and BP generics can express. In section 5, the intuitions and observations developed in the previous sections are integrated into a unified tolerance mechanism for IS and BP generics, namely, a restriction on the domain of individuals (and situations) which is sensitive to the difference in modality and which is vague to two different degrees, using a modified version of the proposal of Kadmon & Landman (1993) for a domain-vague restriction for generics. Section 6 concludes the paper and examines the advantages of the proposal over other exceptions-tolerance proposals.

2 TOLERANCE OF ENTITIES BY IS AND BP GENERICS: THE SIMILARITIES

In addition to the well-known observation, mentioned above, that IS and BP generics tolerate exceptional individuals, another observation about both types of generics is that these legitimate exceptions are those which are considered abnormal or ‘non-standard’ in some sense. The observation is common in the genericity literature (see, e.g. Delgrande 1987, 1988; Asher & Morreau 1995; Krifka 1995; Krifka et al. 1995; Drewery 1997; Pelletier & Asher 1997; Eckardt 1999), but, as I will show below, current attempts to capture it precisely face problems. I will call this ‘the abnormality constraint’ on the exceptions to generics, and formulate it like this: An individual which is considered a legitimate exception to a generic sentence is one which, in addition to not having the VP property, is assumed to be exceptional, non-standard or abnormal in some other respect. For example, legitimate exceptions to (1) are dogs that in addition to not having four legs are those with mutations, those that have undergone an accident, etc.

A much less common observation about IS and BP generics is that both can tolerate not only exceptional, but also contextually irrelevant, individuals. The traditional view about generics (e.g. Dahl 1975; Krifka 1987; Krifka et al. 1995; Condoravdi 1997), supported by contrasts as in (5), is that unlike sentences with explicit quantifiers like ‘every’, generics cannot be contextually restricted:

(5) (Context: There are lions and tigers in this cage.)
a. Every lion is dangerous. (Can mean ‘Every lion in this cage is dangerous’)
b. Lions are dangerous. (Cannot mean ‘Lions in this cage are dangerous’)

But there are many other generics which can be contextually restricted, for example, (6)–(8):

(6) (There are professors and students in this university.) A professor wears a tie/Professors wear a tie. (Can mean ‘A professor/Professors in this university’)
(7) (There are books and periodicals in this library.) Books/A book can be borrowed for a week, but periodicals/A periodical can only be borrowed for one day. (Can mean ‘A book/Books in this library’)
(8) (There are very cheap clothes in Jack’s shop!) A shirt costs only $10/Shirts cost only $10. (Can mean ‘A shirt/Shirts in this shop’)

Both IS and BP generics, then, can tolerate contextually irrelevant individuals, as well.2 The distinction between exceptional and contextually irrelevant individuals is not always clear in the genericity literature, but in this paper I will keep emphasizing its importance. One reason for this is that the abnormality constraint applies only to the former, and not to the latter type of individuals. For example, in evaluating the IS and BP generics in (7), books in other libraries (i.e. contextually irrelevant ones) are clearly not considered abnormal: these are simply not talked about in the first place. In contrast, legitimate exceptions to (7) are books in this library which are taken as abnormal or non-standard, for example, damaged, rare or highly requested books. The same distinction is found with other IS and BP generics [e.g. (6) and (8)].

2 Condoravdi (1997) admits that some cases of contextual restriction in generics exist, as in ‘People in this university dress formally. Professors wear a tie’ (113), but claims that this is possible only in modal subordination cases, that is, when ‘the restricting set of variables is itself within the scope of a generic operator’ (113). Notice, however, that in (6)–(8) the ‘context sentences’ are nongeneric. Moreover, in (i), where the context sentence is clearly extensional, the generic can still be understood as restricted to the books and periodicals in this library:
(i) This library holds 11 800 books, and 578 periodicals. Books/A book can be borrowed for one week only, but periodicals/a periodical can only be borrowed overnight.
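The contrast between contextually irrelevant and exceptional individuals can be made concrete with a small classification procedure. The following Python sketch is only an illustration under stated assumptions: the `Book` record, its `abnormal` flag and the functions `classify` and `generic_holds` are hypothetical names introduced here, and abnormality is simply stipulated rather than derived from anything in the analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    title: str
    library: str      # where the book is held
    abnormal: bool    # stipulated: damaged, rare, highly requested, ...
    loan_days: int    # how long the book can be borrowed


def classify(book, context_library, claimed_days):
    """Classify a book relative to 'Books can be borrowed for a week',
    uttered about context_library (hypothetical helper)."""
    if book.library != context_library:
        # Contextually irrelevant: simply not talked about.
        # The abnormality constraint plays no role here.
        return "irrelevant"
    if book.loan_days == claimed_days:
        return "conforming"
    # Relevant but lacking the VP property: tolerated only if abnormal.
    return "legitimate exception" if book.abnormal else "counterexample"


def generic_holds(books, context_library, claimed_days):
    """The generic survives as long as no relevant, normal book fails."""
    return not any(
        classify(b, context_library, claimed_days) == "counterexample"
        for b in books
    )


# Invented toy data in the spirit of example (7).
books = [
    Book("novel", "this library", abnormal=False, loan_days=7),
    Book("rare atlas", "this library", abnormal=True, loan_days=1),
    Book("novel", "other library", abnormal=False, loan_days=14),
]
```

In this toy setting the book held in another library never enters the evaluation at all, whereas the rare atlas is evaluated and then tolerated because it is stipulated to be abnormal. Only a normal book in this library with a different loan period would count against the generalization.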

Yael Greenberg 135

As is well known (see, e.g. Chierchia 1995), a subclass of IS and BP generics, namely those with stage-level predicates, also tolerate contextually irrelevant situations. Notice that unlike the tolerance of contextually irrelevant individuals, which is usually dependent on the utterance context, the information about irrelevant situations is also contributed by presuppositions, implicatures or real-world knowledge of the VP (for discussion, see e.g. Schubert & Pelletier 1988; Chierchia 1995; Carlson 1999; Cohen 1996). For example, without knowing the utterance context for (9) (e.g. talking about this school or this country), we cannot say anything about which individual first graders are contextually relevant and which are not:

(9) A first grader finishes school at 13.00/First graders finish school at 13.00.

However, even in this null context, we can say something about which situations are contextually irrelevant in (9), namely non-school days situations, since finishing school presupposes going to school. Similarly, even without knowing what the utterance context of (10) is, we easily exclude situations where the computer is off as irrelevant, because working quickly presupposes working:

(10) A Pentium computer works quickly/Pentium computers work quickly.

The presuppositions and implicatures of the VP, or the real-world knowledge about it, limit the choice of contextually (ir)relevant situations, and the utterance context (e.g. talking about what first graders do on Wednesdays) can further specify it.3,4

In addition to contextually irrelevant situations, both IS and BP generics with stage-level predicates also tolerate exceptional situations. Similar to the tolerance of exceptional individuals, discussed above, these are situations which are considered abnormal or less standard. In interpreting (9), for example, these can be extremely stormy day situations, or those where the prime minister comes to visit school. And in interpreting (10), these can be situations where it is extremely hot, or where the computer is placed near a strong magnet.

3 This characteristic of contextually irrelevant situations is found not only with IS and BP generics but also with habituals with proper name subjects. Our real-world knowledge tells us that people usually (though not necessarily!) snore when they are asleep, so the salient reading of ‘John snores’ is roughly ‘Whenever John is asleep, he snores’ (although other readings, e.g. ‘Whenever John is nervous he snores’, are also possible in specific contexts). Similarly, the salient reading of ‘John reads the Journal of Semantics’ is not ‘Whenever John reads, he reads the JoS’, nor ‘Whenever John reads a journal, he reads JoS’ (although these readings are possible in certain contexts), because we know that usually people do not spend their whole reading time reading only a journal and that usually if people read journals, they read more than one. Instead, more plausible readings are ‘Whenever a new issue of JoS is out, John reads it’; ‘Whenever John wants to read an interesting paper, he reads JoS’; ‘Whenever John is in the departmental library, he reads JoS’, etc.
4 But notice that sometimes presuppositional information about the VP can give us information about contextually irrelevant individuals as well. For example, in considering ‘Snakes lay eggs’, male individual snakes are considered irrelevant because the VP (‘lay eggs’) presupposes giving birth, which is only relevant to females (see, e.g. Carlson 1999).

We saw that the abnormality constraint applies only to exceptional, and not to contextually irrelevant, entities. But there is another difference, so far unnoted, between exceptional and irrelevant entities, found with both IS and BP generics. The difference is illustrated in the listener’s reactions to (6) repeated here as (11), in the table in (12):

(11) (Context: There are professors and students in this university.) Professors wear a tie/A professor wears a tie.

(12)
| Listener’s reaction: the following is a counterexample to the generalization | Speaker’s evaluation of listener’s reaction |
| a. But look at John (a professor from another university). He does not wear a tie! | Oops! The listener has misunderstood something. |
| b. But look at Bill (a professor from this university). He does not wear a tie! | Legitimate reaction (but I don’t (necessarily) agree: he is not really a normal or standard professor). |
| c. But look at this situation (where Bill is in the shower). Bill is not wearing a tie now! | Oops! The listener has misunderstood something. |
| d. But look at this situation (where Bill is walking in the university’s campus). Bill is not wearing a tie now! | Legitimate reaction (but I don’t (necessarily) agree: this is not really a normal or standard situation). |

The difference between (12a) and (12b) shows us that taking an irrelevant, but not an exceptional, individual as a counterexample to the generalization is evaluated as a misunderstanding. The contrast between (12c) and (12d) shows us that exactly the same holds for irrelevant and exceptional situations. Crucially, even if I, as the speaker of (11), think that Bill is an exceptional (abnormal) professor and that this legitimizes his not wearing a tie, or that this is an abnormal situation (e.g. extremely hot) and that this legitimizes Bill not wearing a tie now, the fact that my listener still takes these entities to falsify my generalization (i.e. as illegitimate exceptions) is not interpreted as a misunderstanding on his or her side, but rather as a legitimate reaction (although I can still disagree with this reaction). I will take the reactions of ‘misunderstanding’ in (12a,c) to indicate that the listener has failed to accommodate something in what the speaker had in mind, namely the domain restriction. This is very similar to what we feel about B’s reaction to A’s explicitly quantified statement in (13) in the context of talking about this class:

(13) a. No student got less than 85.
b. #You’re wrong! Bill (a student who doesn’t take this course) got 83!

The legitimacy of the listener’s reaction in (12b,d) indicates that, unlike the properties of the contextually relevant entities, the properties of exceptional entities are not required to be accommodated by the listener in the first place. In section 5.1.2 below, I will show that this observation strongly supports the approach of Kadmon & Landman (1993) to Gen as a ‘domain-vague’ quantifier. In section 6 below, I will show that it poses a problem for attempts to define Gen as weaker than universal (e.g. similar to ‘most’).

To summarize so far, we have seen that IS and BP generics tolerate entities in a strikingly similar way: Both can tolerate four types of entities (contextually irrelevant and exceptional individuals and contextually irrelevant and exceptional situations). In both, the information about irrelevant individuals comes from the utterance context, whereas the one about irrelevant situations comes also from presuppositions, implicatures, etc. about the VP. And in both, contextually irrelevant entities differ from exceptional entities in two ways (the presence/absence of the abnormality constraint and the (un)necessary accommodation of properties). These similarities should be captured in any theory of IS and BP generics, and they strongly indicate that the two types of generics have the same basic tolerance mechanism. In the next section, however, I look at data indicating that the similarity with respect to tolerance of exceptional entities is not total.

3 TOLERANCE OF ENTITIES BY IS AND BP GENERICS: THE DIFFERENCE

3.1 Cases where mere abnormality is not enough . . .

Above we talked about the abnormality constraint, according to which the legitimate exceptions to generics are those considered abnormal.

A fact which is much less noted, however, is that in many cases merely being abnormal is not enough to be considered a legitimate exception. Clearly not any abnormal or non-standard dog will be considered a legitimate exception to (1), repeated here as (14). In fact, language users can quite easily divide potential abnormal dogs into those which will and those which will not be easily considered exceptions to (14), as in (14a) and (14b), respectively:

(14) A dog has four legs/Dogs have four legs.
a. Dogs with a mutation, dogs which have undergone an accident, masochistic dogs (which enjoy damaging their body), dogs participating in some cruel scientific experiments . . .
b. Dogs with five names, infertile dogs, dogs that can read and write, dogs with vocal cords problems, dogs whose mother won a national medal . . .

Similarly, with (15) the individuals in (a), but not the ones in (b), are naturally considered legitimate exceptions to the generalization, although both can be thought of as ‘non-standard’ or abnormal children. As (16) shows, such distinctions can be easily made with respect to abnormal situations as well:

(15) Children learn to read at age six/A child learns to read at age six.
a. A child with learning disabilities.
b. A child who has five names ending with ‘t’.
(16) A Pentium VI computer works very quickly/Pentium VI computers work very quickly.
a. Situations where the computer has to process an exceptionally heavy file.
b. Situations where the computer is located in a purple room.

Notice that although the difference between the (a) and the (b) examples is rather strong, it is possible to imagine special contexts where entities in (b) are considered legitimate exceptions, e.g. in (15) cases where children with five names ending with ‘t’ are those from royal families, which are traditionally taught reading and writing at age 4. Crucially, however, no such special context is needed in the case of the (a) examples.

3.2 . . . and cases where ‘mere abnormality’ is enough

On the surface, the distinctions illustrated in (14)–(16) may look rather trivial. As I will show below, however, understanding the mechanism behind them and trying to capture it precisely is not an easy task, and, as I will show in section 6 below, it cannot be handled by current exceptions-tolerance mechanisms. What is even more important at this stage, however, is the observation that there are generics where such apparently trivial distinctions seem much harder, if not impossible, to make. Consider (17):

(17) Well-known forty-five year old teachers do not cook on Monday afternoons.

Like (14)–(16), (17) is clearly a generic sentence: it makes a generalization, which is, moreover, non-accidental (it supports the counterfactual ‘If Ann were a well-known forty-five-year-old teacher she would probably not cook on Monday afternoons either’), and it tolerates exceptions (several such teachers who do cook on Monday afternoon would not falsify it). But what do the legitimate exceptions to this sentence look like? Comparing (17) to (14)–(16) above, we can see that here it is much harder, if not impossible, to predict this matter in advance. If we try to come up with a list of abnormal or non-standard well-known 45-year-old teachers, as in (18), it is much harder to divide it into those who will be easily counted as legitimate exceptions to (17), and those who will not (namely those who, given their abnormality, are not expected to have the VP property and those who, although abnormal, are still expected to have the VP property, respectively):

(18) Potential legitimate exceptions to (17): well-known forty-five-year-old teachers who are exceptionally successful/exceptionally unsuccessful/especially fat/especially thin/who have more than ten sons/who are exceptionally rich/who are exceptionally poor/who never drink tea . . .

Adding more properties to (18) does not seem to change this unclarity. In contrast, if someone were to mix the descriptions in (14a) and (14b), reasonable language users would rather easily manage to redivide the list, and probably all of them would arrive at more or less the same division as in (14a) and (14b) above. Notice that I am not claiming that the legitimate exceptions to sentences like (17) are not those considered abnormal or that with such sentences it is hard to characterize which individuals are normal and which are abnormal instances of the CN property. In fact, the abnormality constraint applies to sentences like (17) just as it does to sentences like (14): with both generics we assume that the legitimate exceptions are abnormal in some sense. Moreover, with both we can characterize which individuals are considered normal and which abnormal. What I am claiming is that whereas in sentences like (14) we can easily tell which of the abnormal individuals are relevant for legitimizing exceptions and which are not, in (17) we cannot. With generics like (14), then, the degree to which we can specify the abnormality relevant for legitimate exceptions is high, whereas with generics like (17), it is very low.

3.3 The connection to the IS/BP puzzle

The crucial observation I would like to make now is that there is a correlation between the ability to specify the legitimate exceptions, which we have just discussed, and the ability to have a felicitous IS generic. Specifically, those BP generics whose legitimate exceptions are hard to characterize are exactly the ones whose IS counterparts are infelicitous. Whereas (14)–(16), for example, can naturally have generic IS counterparts, the IS counterpart of the BP sentence in (17), namely (19), is odd as generic, and is naturally interpreted as existential:

(19) #A well known forty five years old teacher does not cook on Monday nights.

Looking at other BP generics whose IS counterparts are infelicitous as generic, as in (2b)–(4b) above, supports this new observation. What the exceptions to (3b) will look like is hard to predict (exceptionally rich Norwegian students whose names end with ‘s’? exceptionally poor ones? exceptionally educated ones? exceptionally uneducated ones? those who have Italian ancestors? those who do not have Italian ancestors? etc.). In the same manner, thinking about all kinds of abnormal madrigals, rooms, thick books with red paperback covers, etc., it is hard to predict which will be considered legitimate exceptions to the BP generics in (2b) and (4b) above, and which will not. The new observation is schematically summarized in (20):

(20) The degree to which the properties of the exceptions to a generic can be specified is high with felicitous IS generics and their BP counterparts but very low with BP counterparts of infelicitous IS generics.

Obviously, we want to explain the correlation in (20). This, however, is not straightforward given the current views about IS and BP generics. Most current theories of genericity (e.g. Schubert and Pelletier 1988; Wilkinson 1991; Chierchia 1995, 1998; Krifka 1995; Krifka et al. 1995; Link 1995; ter Meulen 1995) ignore the felicity contrasts between characterizing IS and BP sentences described above, and assign them an equivalent semantic representation, including an equivalent exceptions-tolerance mechanism. Such theories will have difficulties deriving the contrast in characterizing exceptions described in (20). Of the few theories which do attempt to capture the felicity contrasts between IS and BP generics, almost all (e.g. Krifka 1987; Dobrovie-Sorin & Laca 1996; Cohen 1999, 2001a) assign IS and BP generics two completely different semantic structures (namely, quantificational and predicational). Such theories will have problems capturing the strong similarities between minimally contrasting IS and BP generics, namely the fact that both express non-accidental, counterfactual-supporting generalizations, and the range of similarities between them concerning the way they tolerate entities, described in section 2. What we need, then, is a theory which attempts to formally capture both the strong similarities and differences between IS and BP sentences and which could lead to an explanation of the generalization in (20). In the next section, I introduce Greenberg’s (2003) theory of generics and show how it meets exactly these needs.

4 A THEORY OF IS/BP GENERICS AND WHAT IT EXPLAINS

4.1 IS and BP generics have a basically equivalent semantic structure with different accessibility relations

Greenberg (2003) argues that while both IS and BP sentences express non-accidental, modalized generalizations, they differ in the type of non-accidentalness they can express. Formally, IS and BP generics are taken to have the same basic structure as in (21), where P and Q are the subject and VP properties, respectively, and the superscript ‘cont.norm’ is a restriction on P which stands for ‘contextually relevant and normal’ (to be made precise in section 5 below). Example (21) roughly says that in all appropriately accessible worlds, every contextually relevant and normal P individual has the Q property in these worlds:

(21) $\forall w'\,[w' \text{ is appropriately accessible from } w_0] \rightarrow [\forall x\, P^{\mathrm{cont.norm}}(x, w') \rightarrow Q(x, w')]$

Following, for example, the line of thought of Heim (1982), Chierchia (1995) or Krifka et al. (1995), the universal quantifiers over accessible worlds and individuals in (21) capture the non-accidentalness of generics and the generalizations over individuals they make, respectively.5 (Quantification over, and tolerance of, situations are dealt with below.)

5 Unlike these theories, though, the universal quantification over worlds and individuals in (21) is separated. This will allow an easier characterization of the accessibility relations below.
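The schema in (21) can be read as a simple model-checking condition. The following Python sketch evaluates it over a finite toy model, purely as an illustration: the function `gen_holds`, the dictionary encoding of worlds, the stipulated accessibility relation and the `normal` flag are all assumptions made here and are not part of Greenberg’s (2003) formal apparatus.

```python
def gen_holds(worlds, accessible, p_cont_norm, q):
    """Check the schema in (21): every contextually relevant and normal
    P individual in every accessible world has Q in that world."""
    return all(
        q(x, w)
        for w in worlds
        if accessible(w)
        for x in w["individuals"]
        if p_cont_norm(x, w)
    )


# Toy model (all names and facts are invented for illustration).
w0 = {"name": "w0", "individuals": {
    "rex": {"dog": True, "normal": True, "legs": 4},
    "tripod": {"dog": True, "normal": False, "legs": 3},
}}
w1 = {"name": "w1", "individuals": {
    "fido": {"dog": True, "normal": True, "legs": 4},
}}
w2 = {"name": "w2", "individuals": {
    "odd": {"dog": True, "normal": True, "legs": 3},
}}
worlds = [w0, w1, w2]


def accessible(w):
    # Stipulated accessibility relation: w2 is not accessible from w0.
    return w["name"] in {"w0", "w1"}


def dog(x, w):
    return w["individuals"][x]["dog"]


def normal_dog(x, w):
    # Stands in for P^cont.norm; normality is stipulated, not derived.
    return dog(x, w) and w["individuals"][x]["normal"]


def four_legs(x, w):
    return w["individuals"][x]["legs"] == 4


# Generic reading: the abnormal three-legged dog is tolerated.
generic = gen_holds(worlds, accessible, normal_dog, four_legs)
# Unrestricted 'every dog' reading: the three-legged dog falsifies it.
universal = gen_holds(worlds, accessible, dog, four_legs)
```

Keeping the world quantifier and the individual quantifier as two separate loops mirrors the separation noted in footnote 5: the accessibility relation can be swapped out without touching the restriction on individuals.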

To capture the contrasts between IS and BP generics (e.g. the felicity differences between them), Greenberg (2003) argues that their modal natures, that is, their accessibility relations, are different. While both IS and BP generics can involve what is called an ‘in-virtue-of’ accessibility relation, the other, ‘descriptive’, accessibility relation is available for BP generics only. The following two sections give a summary of these claims.

4.2 IS sentences express only in-virtue-of generalizations

Following Kratzer’s (1981) and Brennan’s (1993) works on non-generic root or ‘circumstantial’ modality, Greenberg (2003) argues that IS sentences, like A dog has four legs or A boy does not cry, necessarily assert that the generalization they express is true in virtue of a certain property that the subject is assumed to have (e.g. ‘having a four legged genetic makeup’ or ‘being tough’, respectively), that the speaker has in mind and the listener has to accommodate. Formally, the accessibility relation of the sentence is what captures this in-virtue-of part. If you hear, for example, (22a) and (23a), and the in-virtue-of properties you accommodate are ‘has a four legged genetic makeup’ and ‘be tough’, respectively, then the sentences will be interpreted as in (22b) and (23b):

(22) a. A dog has four legs (in virtue of having a four legged genetic makeup).
b. $\forall w'\,[\forall x\, \mathrm{dog}(x, w') \rightarrow \text{has four legs genetic makeup}(x, w')] \rightarrow [\forall x\, \mathrm{dog}^{\mathrm{cont.norm}}(x, w') \rightarrow \text{has 4 legs}(x, w')]$
(‘In all worlds where every dog has a four legged genetic makeup, every (contextually relevant and normal) dog has four legs’)
(23) a. A boy does not cry (in virtue of being tough).
b. $\forall w'\,[\forall x\, \mathrm{boy}(x, w') \rightarrow \mathrm{tough}(x, w')] \rightarrow [\forall x\, \mathrm{boy}^{\mathrm{cont.norm}}(x, w') \rightarrow \neg\mathrm{cry}(x, w')]$
(‘In all worlds where every boy is tough, every (contextually relevant and normal) boy does not cry’)

Sometimes it is hard to determine which in-virtue-of property the speaker has in mind. Consider (24)–(25):

(24) An accountant in this place hardly pays taxes (in virtue of being covered by the local legislation/in virtue of being deeply dishonest/in virtue of earning almost nothing/in virtue of having connections with the mayor . . .).
(25) A woman in this town does not walk alone outside (in virtue of living in such a violent place/in virtue of living in such a religious town/in virtue of having so many children . . .).
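The way an accommodated in-virtue-of property fixes the accessibility relation in (22b) and (23b) can be sketched computationally. The Python fragment below is a minimal illustration under stated assumptions: worlds are encoded as dictionaries from individual names to sets of property labels, and `in_virtue_of_holds`, `has` and all labels are hypothetical names invented here, not part of the theory being summarized.

```python
def in_virtue_of_holds(worlds, p, s, p_cont_norm, q):
    """Schema of (22b)/(23b): restrict attention to worlds where every P
    individual has the in-virtue-of property S, then require Q of every
    contextually relevant and normal P individual there."""
    accessible = [
        w for w in worlds
        if all(s(x, w) for x in w if p(x, w))
    ]
    return all(
        q(x, w)
        for w in accessible
        for x in w
        if p_cont_norm(x, w)
    )


def has(label):
    """Build a predicate testing whether x carries `label` in world w."""
    return lambda x, w: label in w[x]


# Each toy world maps individual names to a set of property labels
# (invented for illustration).
w_a = {
    "rex": {"dog", "four_leg_genes", "normal", "four_legs"},
    "tripod": {"dog", "four_leg_genes"},   # abnormal: lost a leg
}
w_b = {
    "mutt": {"dog", "normal"},             # lacks the genetic makeup
}
worlds = [w_a, w_b]

dog = has("dog")
genes = has("four_leg_genes")
four_legs = has("four_legs")


def normal_dog(x, w):
    # Stands in for dog^cont.norm; normality is stipulated.
    return dog(x, w) and "normal" in w[x]


with_in_virtue_of = in_virtue_of_holds(
    worlds, dog, genes, normal_dog, four_legs
)
# A trivial S (true of everything) makes every world accessible.
without_restriction = in_virtue_of_holds(
    worlds, dog, lambda x, w: True, normal_dog, four_legs
)
```

In this toy model the world `w_b`, where some dog lacks the four-legged genetic makeup, is filtered out by the in-virtue-of property, so its normal but non-four-legged dog does not bear on the generalization. Dropping the restriction lets that world back in and the generalization fails.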


Without supporting context, the listener may end up accommodating the ‘wrong’ in-virtue-of properties in (24) and (25), that is, not the ones that the speaker has in mind. This is very similar to what happens with Kratzer’s (1981) examples of circumstantial modality, as in ‘I cannot play the trombone’, which can be thought to be true in view of the physical condition of the trombone, the physical condition of the speaker, the fact that the speaker does not know how to play, etc. What is crucial, though, is that, as with Kratzer’s example, on hearing sentences like (24) and (25) the listener still assumes that there is, indeed, a unique in-virtue-of property that the speaker has in mind, which he/she has to accommodate, and that this in-virtue-of part fixes the accessibility relation of the sentence. The general form of in-virtue-of generics, then, is (26), where P, Q and $S_C$ stand for the denotations of the subject, VP and the contextually supplied in-virtue-of property, respectively:

(26) $\forall w' [\forall x\, P(x, w') \rightarrow S_C(x, w')] \rightarrow [\forall x\, P_{\text{cont.norm}}(x, w') \rightarrow Q(x, w')]$
(‘In all worlds where every P has $S_C$, all contextually relevant and normal Ps have Q’)

Now, Greenberg (2003) argues that the choice of the in-virtue-of properties is not arbitrary but constrained by two real-world-based presuppositional requirements. The first is that the in-virtue-of property must be associated, given the common ground, with the subject property. A property S is associated with a property P in a world w iff $\forall x\, P(x) \rightarrow S(x)$ follows from known facts, norms, stereotypes, etc. in w, that is, iff this universal statement holds in all worlds which are epistemically or deontically or stereotypically accessible from w. Clearly, without this presuppositional requirement, one could wrongly take a clearly false IS sentence like ‘A dog has three legs’ to be true, for example, in virtue of having a three-legged genetic makeup. The association presupposition prevents this, since the property ‘having a three-legged genetic makeup’ is not associated with being a dog, and hence cannot serve as S in (26). In addition, the association presupposition explains the infelicity of IS generics like (3a) above: Out of context we do not have (non-trivial) shared knowledge, norms or stereotypes about ‘extremely unnatural’ properties like being a Norwegian student whose name ends with ‘s’. Thus, there is no (non-trivial) property we associate with P, and the ‘association’ presupposition on the choice of the S property is not met. The second requirement is that the in-virtue-of property, S, should be taken as a reasonable causer of properties of the sort of the VP property. Thus, although false, ‘A dog has three legs’ is felicitous since intuitively we



can find a property associated with being a dog which reasonably causes having a specific number of legs (a property of the sort of ‘has three legs’). In contrast, ‘#A man is blond’ is infelicitous since, although there are many properties we associate with being a man (‘having male organs’, ‘loving sports’, etc.), none is taken to reasonably cause having a specific hair colour (i.e. a property of the sort of ‘being blond’). Formally, we take S to reasonably cause the sort of Q iff there is a good possibility that the disjunction $[\forall x\, S(x) \rightarrow Q(x)] \lor [\forall x\, S(x) \rightarrow \neg Q(x)]$ holds, where good possibility is defined as truth in a world w, maximally similar to w0. Notice that this world w cannot be w0 itself. In our world, for example, it is not true that every individual with a four-legged genetic makeup has four legs or that every such individual does not have four legs, because of the well-known mutations, accidents, etc. Nor should this world w be a world where everything takes its normal course of events relative to w0, as in Krifka’s (1995) ‘most normal worlds’ suggestion for generics or Dowty’s (1979) inertia worlds proposal for the progressive, since such a world would not be free of mutations or accidents either. These can be seen as part of the natural course of events in our world as well (clearly a world completely free of accidents, mutations, etc. will be considered truly abnormal from the point of view of our world). Rather, following the lines of Landman’s (1992) modification of Dowty’s inertia worlds proposal, we take S to reasonably cause the sort of Q iff there is a world w, maximally similar to ours, except that S is allowed to develop on the basis of what is internal to it, with no external interruptions, and in this world the disjunction $[\forall x\, S(x) \rightarrow Q(x)] \lor [\forall x\, S(x) \rightarrow \neg Q(x)]$ holds. In such a world, for example, we indeed expect having a four-legged genetic makeup to lead to having four legs or to not having four legs, and mutations, accidents, etc. do not intervene.

Notice that the choice of the in-virtue-of properties and their relation to the VP properties is not a matter of objective facts only, but is heavily dependent on the beliefs, stereotypes, norms, etc. of the speaker (and the accommodation of the listener). This is because formally, both the association relation and the reasonable causation relation are required to hold not in the actual world w0, but in worlds similar to w given some norms or given what our beliefs about causation relations are. This may look problematic, as far as the truth conditions of generics are concerned, but I think it reflects, in fact, important observations about in-virtue-of generics, such as the fact that the felicity of IS generics depends on the beliefs/stereotypes/norms available in the common ground. Thus, the felicity of the out-of-the-blue (3a) can significantly improve in the context of (27), whereas


in the context of (28), it is again infelicitous as generic and gets a salient existential reading:

(27) There are very interesting traditions in Norway concerning professions and names. For example a Norwegian student whose name ends with ‘s’ wears thick green socks.
(28) I walked in the dorms and noticed that a Norwegian student whose name ends with ‘s’ wears thick green socks.

Example (27) improves the status of (3a) since in this context we can associate some property with the subject, P, property, namely obeying certain Norwegian traditions concerning names. In contrast, nothing in (28) leads to associating a property with P, so the generic reading is again hard to get. In general, then, even if objectively there is some non-accidental property true of all P individuals, which systematically causes the sort of Q, unless the speaker knows or believes it (and the listener accommodates it), the IS generic is infelicitous.

4.3 BP sentences can express both in-virtue-of and descriptive generalizations

Unlike IS generics, BP ones are ambiguous between an ‘in-virtue-of’ reading and a ‘descriptive’ reading which merely asserts that the generalization is non-accidental,6 that is, expected to hold in other possible worlds. Crucially, in this reading, we do not specify the in-virtue-of factor. Formally, we do not specify the exact sense in which the possible worlds quantified over are similar to ours. Instead, these worlds are defined more vaguely as maximally, or overall, similar to w0, using Stalnaker’s (1968) or Lewis’s (1973) terminology. ‘Boys don’t cry’, for example, is ambiguous. It can express both an in-virtue-of generalization, just like its IS counterpart ‘A boy does not cry’ (asserting that ‘every (relevant and normal) boy does not cry’ holds in all worlds where ‘every boy is tough’ holds), and also a descriptive generalization, which is especially appropriate as a conclusion of some inductive inference. Think about someone watching the behaviour of enough boys in various ‘tear-inducing’ situations. This speaker may use this sentence to assert that not crying is not accidental of boys, but crucially he/she does not try to convey the factor in virtue of which the generalization holds. Maybe he/she does not even know what this factor is, and even if he/she does, conveying it is not an integral part of the assertion, so the listener is not committed to accommodate it. The sentence, then, has an interpretation along the lines of (29), asserting that in all worlds in the union of {w0} and the set of worlds which are maximally similar to w0 (except for what is needed to allow for the existence of different or non-actual boys), every contextually relevant and normal boy does not cry:

(29) $\forall w' [w' \in \{w_0\} \cup \{w'' : w'' R_{\max} w_0\}] \rightarrow [\forall x\, \text{boy}_{\text{cont.norm}}(x, w') \rightarrow \neg\text{cry}(x, w')]$

6 Cf. Cohen (1999) and Prasada & Dillingham (1999) for similar claims.

Crucially, although BP sentences are potentially ambiguous, there are cases where the only possible reading they can have is descriptive. These are exactly the cases seen in (2b)–(4b) above (e.g. ‘Norwegian students whose names end with ‘‘s’’ wear thick green socks’), namely, those where no appropriate in-virtue-of property is available in the context, so the ‘in-virtue-of’ reading is blocked. Since IS sentences can only express ‘in-virtue-of’ generalizations, the IS counterparts of such BP sentences [e.g. (2a)–(4a)] are infelicitous as generic. Since the BP sentences are potentially ambiguous, they are still felicitous in such cases, but crucially, they are unambiguously descriptive.7

7 Cf. the findings of Prasada & Dillingham (1999), who show that whereas informants gave many BP generics both a ‘by virtue of’ and an ‘in general’ paraphrase, some BP sentences can only get the ‘in general’ paraphrase.

4.4 Back to the contrast in tolerating exceptions: the intuitive explanation

We are now in a position to explain the correlation summarized in (20) above, between the ability to characterize legitimate exceptions and the ability to have a felicitous IS generic. Given the claims made above about IS and BP generics, we can now rephrase (20) as (30):

(30) The degree to which the properties of the exceptions to a generic can be specified is high with in-virtue-of generics (where an in-virtue-of property is available), but very low with unambiguously descriptive generics (where no such in-virtue-of property is available).

We can now explain (30) in the following way: Once a language user has in mind in virtue of what the generalization is true, he/she can predict, at least to some extent, which properties characterize the exceptions. Intuitively, these properties are those which are taken, from the point of view of w0, to block the ‘reasonable causation’ relation between the in-virtue-of and the VP property. Assuming that ‘A dog has four legs’ holds in virtue of having a four-legged genetic makeup,


for example, the legitimate exceptions are dogs with properties which are taken to block the reasonable causation relation between ‘having a four-legged genetic makeup’ and ‘having 4 legs’, that is, properties like those in (31a), but not those in (31b), even though the latter can be taken as abnormal properties of dogs, just like the former:

(31) a. Undergoing an accident, having mutations, being part of a medical experiment.
b. Having a vocal cords problem, having 5 names, loving semantics.

Consider, in contrast, a BP sentence like (3b) (‘Norwegian students whose names end with ‘‘s’’ wear thick green socks’), which is, as claimed above, unambiguously descriptive. All we assert in uttering such sentences is that the generalization is non-accidental, crucially, without specifying (or even knowing) in virtue of what it is non-accidentally true. Thus, it is much harder (if not impossible) to say which properties block the in-virtue-of (i.e. the reasonable causation) relation and consequently which properties characterize the legitimate exceptions. All we can say is that these properties must be abnormal in some sense, but we cannot specify the right ‘sense’ of the abnormality relevant here.

There are two pieces of data which support this line of thought. The first is that there are, in fact, IS sentences like (24) and (25), repeated here, where characterizing the legitimate exceptions seems very hard, similar to descriptive BP generics like (3b):

(24) An accountant in this place hardly pays taxes (in virtue of being covered by the local legislation/of being deeply dishonest/of earning almost nothing/of having the right connections with the mayor . . .)
(25) A woman in this town does not walk alone outside (in virtue of living in such a violent place/of living in such a religious town/of having so many children . . .)

Hearing (24) and (25) out of the blue, it is hard to determine what the legitimate exceptions to these sentences will look like. Are the legitimate exceptions to (24) accountants in this place who earn lots of money? Those who earn very little? Those who work under the direct supervision of their manager? Very new ones? Very old ones? Are the legitimate exceptions in the case of (25) women in this town who are fully armed? Those who have a special permission from the local rabbi? Both? Once the sentences are uttered in context, however, and a unique in-virtue-of property is chosen, the apparent vagueness with respect to the exceptions is to a large extent resolved. For example, armed women can be taken as legitimate exceptions to (25) if we accommodate ‘in virtue of living in such a dangerous place’, whereas women with a special permission from the rabbi can be taken as exceptions if we accommodate ‘living in such a religious town’.

The second piece of support comes from examining exceptional situations to habitual sentences with referential subjects. Like BP generics, these habituals are potentially ambiguous between an in-virtue-of and a descriptive reading.8 ‘Mary walks to school’, for example, can make a descriptive generalization, based on watching Mary for a couple of mornings, and merely asserting that her walking to school in every relevant and normal situation is not accidental, without having in mind (or even knowing) the in-virtue-of factor. Crucially, in this case it is hard to characterize in which abnormal situations we expect her not to go to school and in which of them we still expect her to go to school (stormy situations? situations where she is offered a lift? where she has an exceptional amount of money? where she sleeps at a friend’s place? all of them? etc.). This unclarity is resolved to a large extent when the context shows that the sentence is asserted in virtue of a certain property and when such a property is accommodated. For example, assuming that Mary walks to school in virtue of not having money for the bus, we could predict that when she is offered a lift by her friend, she would not walk to school, that is, this situation will be considered legitimately exceptional. But if we assume that Mary walks to school in virtue of her wish to train herself for long walks, then it will be exceptionally stormy days, and not days when she is offered a lift, which will be considered legitimately exceptional.

8 This similarity between BP generics and habituals with referential subjects is one of the indications that the availability of both in-virtue-of and descriptive readings with BP sentences should be derived from the potentially referential status of BP NPs (see, e.g. Carlson (1977) and more recently Chierchia’s (1998) analysis of BP NPs as kind referring even in characterizing sentences). In contrast, the fact that IS sentences are only compatible with an ‘in virtue of a property’ modality should be derived from the property-only interpretation of IS NPs. A fully compositional account deriving the difference in truth conditions of IS and BP sentences from the semantic difference between IS and BP NPs, however, is well beyond the scope of this paper (but see Greenberg 2003 for a preliminary suggestion along these lines).

5 FORMALLY CHARACTERIZING THE TOLERANCE MECHANISM OF IS AND BP GENERICS

We are now in a position to integrate the observations and intuitions developed in the previous sections into a precisely defined exceptions-tolerance mechanism of IS and BP generics. I start by reviewing the


supervaluationist proposal of Kadmon & Landman (1993) for a ‘domain-vague restriction’ on Gen. This will supply us with the formal tools for capturing the degree to which the properties of the exceptions can or cannot be specified.

5.1 Kadmon & Landman: a domain-vague restriction on Gen

5.1.1 The proposal of Kadmon & Landman

Like many other theories of genericity, Kadmon & Landman (1993) (K&L, henceforth) take the generic operator, Gen, to be universal and modalized,9 and propose to account for the tolerance of exceptions by a restriction on the quantification over individuals.10 To capture the difference between generics like ‘An owl hunts mice’ and universals like ‘Every owl hunts mice’, K&L propose that in the latter the restriction is precise: the speaker has in mind a precise set of restricting properties, even if he/she does not specify them explicitly, which is supposed to be accommodated by the listener (often with the help of context). In context, then, no relevant individual can be excluded from the quantification, and no exceptions are tolerated. In contrast, the set of properties restricting Gen is vague:

For a generic statement there is no well-defined set of objects that the universal statement ranges over. We don’t expect the context of utterance to make clear what the objects are exactly that the generalization expressed applies to. And we don’t attempt to accommodate a precise set of objects. Hence, when we encounter objects that do not fall under the generalization expressed, there is always the possibility that they are not among the objects that the generalization is supposed to apply to, and we are therefore able to regard them as legitimate exceptions (409). . . . What we would like to propose, then, is that it is an integral part of the nature of generic statements that the restricting set of properties is vague . . . Saying ‘‘An owl hunts mice’’ is just like saying ‘‘every (possible) owl with the right properties hunts mice’’, while, crucially not committing yourself to what the right properties are. (408, original emphasis)

9 Kadmon & Landman deal only with IS generics (and do not attempt to explain the differences between them and their BP counterparts).

10 A general problem for restricting the subject domain of generics, pointed out by Carlson (1999), is posed by sequences like ‘Pheasants lay speckled eggs. Once rare, they now number in the millions’ (10). Intuitively, the first sentence talks only about female (birthing) pheasants. But if this is done by restricting the domain of the BP, then we have a problem since the pronoun in the second sentence, which seems to refer back to the BP, refers to pheasants in general. One direction of solving this problem is to assume that anaphora in generically interpreted expressions may work differently from anaphora in extensional sentences. Another possible suggestion is that the BP and the pronoun in this sentence are both originally interpreted as kinds (and anaphora is allowed), and at a later stage some type-shifting operation takes place (as suggested in, e.g. Cohen 1999) and we get a quantified structure with two different domain restrictions.

K&L’s proposal is based on the following intuitive observation:

We feel that when you use a generic NP, you are not trying to be precise . . . It is not supposed to be clear to your hearers exactly what owls are supposed to actually hunt mice. Adult owls? Healthy owls? Ones that live in nature? Ones that are not spoiled by some person who brings them food? Ones that have mice to hunt? Ones that don’t happen to be crazy? . . . And so on and so forth. (407)

Notice that this observation is very similar to the one we had above, about the difficulty in characterizing the exceptions to unambiguously descriptive BP generics. As I will show below, however, these two observations are not equivalent. Formally, K&L take Gen to be a nominal GQ, and define a domain-vague restriction on the CN property. I will adapt their definitions to the present framework, in which Gen is a sentential operator. In the case of (32), for example, I will represent the domain-vague restriction on the set of professors as the superscript ‘Xprofessor’, as in (33). (This replaces the informal ‘cont.norm’ restriction from section 4.1.):

(32) (In this university) A professor wears a tie/Professors wear a tie.
(33) $\forall w' [w' R w_0 \rightarrow \forall x\, \text{professor}^{X_{\text{professor}}}(x, w') \rightarrow \text{wear-a-tie}(x, w')]$

K&L take the domain-vague restriction to be a pair ⟨v0, V⟩. v0 is the precise part of the restriction, that is, a (possibly empty) consistent set of properties, all of them compatible with the CN property (‘professor’), which is directly provided by the context. In contrast, V is the vague part of the restriction. K&L follow the supervaluationist approach to vagueness, originally developed by, for example, Kamp (1975) and Fine (1975) to deal with vague predicates like ‘tall’, ‘bald’, etc., according to which the core characteristic of vagueness is that there are various possible ways to resolve it and get to a precise statement or, using Fine’s terminology, various possible ‘precisifications’. Crucially, with vague predicates there is no way to completely determine which of these precisifications are better than the others, so we are left with all of them. Following this line of thought, K&L define V as a set of precisifications on v0, that is, as a set of sets of properties, each of which (a) is consistent, (b) contains only properties compatible with the CN property and (c) is a superset of v0 (the contextually supplied properties). In the case of (33), for example, each precisification in V has ‘in this university’ as a member, together with other properties. Thus, the precisifications (sets of properties) can be {in this university, V1, V2,


V3, V4}, {in this university, V1, V6, V8, V16}, {in this university, V3, V4, V11, V20}, etc. (where each Vn stands for a property). Each such precisification in V represents one possible way of making the restriction precise (which is compatible with what is already known from the context), where crucially, there is at least one context where we do not determine which of these ‘ways of making the restriction precise’ are better than others and consequently where all precisifications are available. Thus, even if we encounter a professor who does not wear a tie, there is a possibility that he lacks a property in one of the (unchosen) precisifications, and is thus not quantified over. Consequently, this professor can be considered a legitimate exception to (32).

K&L’s suggestion naturally captures the interaction, discussed in section 2, between irrelevant and exceptional individuals tolerated by generics. While contextually irrelevant individuals are excluded from the domain of quantification by considering properties in the precise part of the restriction, namely v0, exceptional individuals are excluded by assuming that they lack a property in one of the unspecified precisifications in the vague part V. K&L’s suggestion also explains the observation made in section 2, that taking a contextually irrelevant individual, but not an exceptional individual, as a counterexample to a generic statement is evaluated as a misunderstanding. In K&L’s definitions, only the properties of the relevant entities are in the precise part of the restriction and need to be accommodated. In contrast, properties of the non-exceptional entities are in the vague part, that is, in the set of precisifications, and since the speaker does not necessarily have in mind a unique precisification, the listener is not expected to accommodate it either.

Finally, using K&L’s framework we can easily capture the observation, made in section 2, that IS and BP generics tolerate situations in a way strikingly similar to the way they tolerate individuals. Assuming that VPs denote sets of situations, we do that by imposing a domain-vague restriction on the set of situations quantified over by Gen. For example, (32) will be represented as in (34), where the superscript ‘Ywear-a-tie’ is the domain-vague restriction on the set of situations:

(34) $\forall w' [w' R w_0 \rightarrow \forall x, s\, [\text{professor}^{X_{\text{professor}}}(x, w') \land \text{Involve}^{Y_{\text{wear-a-tie}}}(s, x, w')] \rightarrow \text{wear-a-tie}(s, x, w')]$

Following K&L’s line of thought, we define ‘Ywear-a-tie’ as a pair ⟨k0, K⟩. k0 is the precise part in this pair, namely, a contextually supplied set of properties of situations. Since, as discussed in section 2, the contextual relevance of situations can usually be recovered from the presuppositions, implicatures or real-world knowledge of the VP, we want to define k0 as systematically keyed to the VP property. For example, given what we know about the VP ‘wearing a tie’, k0 in (34) can be defined as the set {being a formally dressed situation}, so, for example, shower situations are considered irrelevant.11 In contrast to k0, K, the vague part of the restriction over situations, is a set of precisifications on k0, namely, a set of sets of properties of situations, each of which is a superset of k0, for example, {being a formally dressed situation, K1, K2, K3, K4}, {being a formally dressed situation, K2, K3, K5, K6}, {being a formally dressed situation, K5, K7, K8, K6}, etc. (where each Kn is a property of situations). Each such precisification represents one way of making the restriction over situations precise, and again, there is at least one context where no unique precisification is chosen. Thus, K excludes from quantification all situations which lack a property in one of the (unchosen) precisifications, and by that allows tolerance of exceptional situations, whose characterization is vague. As mentioned above, irrelevant or exceptional situations are tolerated only when the VP is stage level. This can be easily captured by requiring both k0 and all sets of properties in K to be empty with individual-level VPs [like ‘have four legs’ as in (1)], so all situations end up being quantified over in such cases, and none of them is tolerated. K&L’s approach, then, is also productive in capturing the tolerance of situations by generics. In the remainder of the paper, I will focus again on the tolerance of individuals.

11 But notice again that the utterance context can further limit the domain. For example, in certain contexts only ‘meetings with the Dean’ can be taken to be contextually relevant situations.

5.1.2 Some problems and the direction of an improved theory

K&L’s vague restriction, however, cannot be the whole story regarding the exceptions puzzle since it is, in fact, too vague. The only limitations imposed on it are that all sets of properties in it are consistent and that it contains only properties which are compatible with the CN property. Besides these two ‘logical’ limitations, K&L’s definition imposes no constraint on which property can and which cannot be part of the precisifications. It thus wrongly predicts any property whatsoever to potentially legitimize exceptions. As seen above, however, language users have two systematic intuitions about which properties can and which cannot legitimize exceptions. The first is the abnormality constraint, according to which,


for example, having a mutation, but not having a tail, will be a property of legitimate exceptions to (1). In addition, with in-virtue-of generics we also have a further ‘relevant abnormality’ constraint, according to which, e.g., ‘undergoing an accident’, but not ‘having vocal problems’, will characterize the exceptions to (1). K&L’s definition cannot capture these two systematic constraints since there is nothing in it which will guarantee that some properties should be preferred members of the precisifications over other properties. All properties have an equal status. More generally, K&L’s definitions allow only two possibilities in the restriction of generics, total specificity (with respect to the properties of the irrelevant entities, excluded by v0) and total vagueness (with respect to the properties of the legitimate exceptions, excluded by V), whereas in reality language users take the characterization of the exceptions to generics to be partially vague: though there is no precise list of properties of exceptions, we are not totally ignorant concerning their characterization. Moreover, the restriction is partially vague to two different degrees: a high one (with unambiguously descriptive generics, involving mere abnormality) and a lower one (with in-virtue-of generics, involving relevant abnormality). Our goal, then, is to define two constraints on K&L’s restriction, which capture these two degrees of vagueness.

5.2 The abnormality constraint on K&L’s domain-vague restriction

The abnormality constraint holds for both in-virtue-of and descriptive generics: In both cases, we take the properties which legitimize exceptions to be those considered abnormal. One way to think about this intuition is to define abnormal as ‘true of the minority’. Thus, if a property is assumed to hold of the minority of contextually relevant P individuals, then individuals having it are considered abnormal, and, thus, legitimate exceptions to the generic. Definition (35) is an attempt to make this idea precise:

(35) The abnormality constraint on K&L’s domain-vague restriction: Any set of properties v in V is such that $|\bigcap v \cap P \text{ in } c|$ is not significantly smaller than $|\bigcap v_0 \cap P \text{ in } c|$.

Example (35) says that the number of the P individuals who have the properties in any of the precisifications v in V is not significantly smaller in the context c than the number of contextually relevant P individuals as a whole. Consider, for example, (36), its representation in (37) and

the abnormality constraint on the vague restriction ‘Xfirst grader’ in (38):

(36) (Context: talking about this school) First graders finish school at 13.00.
(37) $\forall w' [w' R w_0 \rightarrow \forall x\, \text{first grader}^{X_{\text{first grader}}}(x, w') \rightarrow \text{finish-at-13.00}(x, w')]$
(38) Abnormality constraint on ‘Xfirst grader’: $|\bigcap v \cap \text{first grader in } c|$ is not significantly smaller than $|\text{in this school} \cap \text{first grader in } c|$.

12

The claim that generics quantify over the majority of the members of the subject set is made in Cohen (1996, 1999). See section 6 for a brief comparison between Cohen’s approach and the present one. 13 As a reviewer correctly points out, this definition cannot help us account for the fact that BP generics like (i) and (ii) (sometimes called ‘Port-Royal puzzle generics’) can be true even if only a minority of the members of the subject set has the VP property: (i) Dutchmen are good sailors. (ii) Frenchmen eat horsemeat. In this paper, however, I follow, for example, the approach of Krifka et al. (1995), who claim that such BP sentences do not seem to be characterizing generics at all, but rather Direct Kind Predication (DKP) structures, which do not involve generic quantification. Thus, sentences like (i) and (ii) are not supposed to be subject to definitions like (35) regarding the domain-vague restriction on the Gen. Krifka et al. (1995) show that, unlike a BP sentence like (i), which seems to mean that ‘the Dutch distinguish themselves from other comparable nations by having good sailors’ (82), the IS counterpart of (i), namely (iii), has the stronger and more standard ‘characterizing’ generic interpretation according to which ‘we can more or less expect that a random Dutchman will turn out to be a good sailor’ (82): (iii) A Dutchman is a good sailor. To this observation we can add two more: First, unlike standard characterizing generics, sentences like (i) and (ii) do not support counterfactuals in the usual way. For example, (i) does not support the truth of ‘If my brother was a Dutchman, he would probably also be a good sailor’. Second, the subjects of all examples of BP sentences with a ‘Port-Royal puzzle’ interpretation I am aware of refer to well-established kinds like nationalities [as in (i) and (ii)] or biological kinds, as in ’Tigers eat people’ (from Cohen 2001b). 
This further supports the claim that such generics express DKP, and not generic quantification. (But see, e.g. Cohen (1996, 2001b) for a quantificational, probability-based analysis of such generics, which he calls ‘relative generics’.)


The contextually supplied property in $v_0$ for (37) is ‘in this school’. According to (38), no matter which properties we put in each precisification $v$ in $V$ in $X_{\text{first grader}}$, and no matter which precisification $v$ we look at, the result of intersecting these properties with the property of first graders will yield a set which is not significantly smaller than the set of first graders in this school. That is, the result is always the significant majority of first graders in this school.12,13 Crucially, this means that not only the intersection of properties but also any single property in any precisification must hold of the majority of relevant individuals (e.g. of first graders in this school), since if we were
trying to intersect a property of the minority with all other properties in v, the intersection could never yield the majority of relevant individuals, as required by (35). This correctly captures our intuitions about what the abnormality constraint means: only properties of the minority (abnormal properties like having no school bag or being younger than 4 years old) are necessarily excluded from the restriction, and only individuals with such properties (i.e. abnormal individuals) are not quantified over, and are therefore considered legitimate exceptions to (36). Notice that whereas according to (35) a precisification can have only properties of majority as members, it clearly does not have all such properties as members. If the precisifications contained all properties of the majority (e.g. ‘not being called David’, ‘not being called Susan’, ‘not being called Harry’, ‘not being called Mary’, etc.), then we would be wrongly left with no individual to quantify over, and this would contradict (35), according to which we should end up with the majority of relevant individuals. This also means that simply being a property of the majority does not necessarily put you in the precisifications and the restriction on Gen. If this were the case, then assuming, for example, that ‘not having a name beginning with A’ is a property of the majority (of first graders in this school), we would wrongly predict (36) to be automatically interpreted as ‘First graders (in this school) whose names do not begin with ‘A’ finish at 13.00’. But we do not. All that (35) requires is that each precisification should consist of a certain combination of properties of the majority, and this is easily met if this combination has only part of these properties. Example (35), then, does not make any prediction about ‘not having a name beginning in A’ or about any other specific property of the majority (being such a property you may or may not end up in the restriction). 
It does make predictions about properties of the minority (being such a property necessarily prevents you from being in the restriction). Finally, notice that a consequence of definition (35) is that although the legitimate exceptions to generic sentences are still considered ‘abnormal’, as the widely held intuition says, we clearly do not use the term ‘abnormal’ in its everyday, common use as ‘far from the norm’ or ‘not stereotypical’. Rather, the term means (roughly) ‘has certain properties (or a property) of the minority’. This is a welcome result, since, as a reviewer correctly pointed out, a generic like (24) above can be true even if the only well-known 45-year-old teacher who does cook on Monday afternoons is Ann, who is the most stereotypical well-known 45-year-old teacher. Put in other words, the most stereotypical member of the set can clearly be considered a legitimate exception to a generic. Indeed, such a member cannot be considered ‘abnormal’ in

the everyday sense of the word, that is, ‘far from the norm’ or ‘far from the stereotype’. However, in the present theory, ‘abnormal’ does not have this everyday use, but the weaker use ‘has a property of the minority’. Thus, using the definition of ‘abnormality’ in (35) is compatible with a stereotypical member of a set being a legitimate exception to a generic sentence, since even such members have ‘properties of the minority’. A general advantage of the present theory, then, is that it clarifies the meaning of what ‘abnormal’ is and what it is not, as far as generic sentences are concerned.14
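The behaviour of the abnormality constraint in (35) can be illustrated with a small toy model. The following Python sketch is only an illustration under stated assumptions: the class of ten children, the property names, the helper functions `restrict` and `satisfies_abnormality`, and above all the numeric `THRESHOLD` standing in for ‘not significantly smaller’ are hypothetical and are not part of the analysis itself, which leaves that notion vague.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical toy model: every child is mapped to the set of properties
# that hold of him or her.
Individuals = Dict[str, FrozenSet[str]]

# ASSUMPTION: 'not significantly smaller' is modelled as a ratio threshold.
# The value 0.6 is purely illustrative.
THRESHOLD = 0.6


def restrict(individuals: Individuals, properties: Set[str]) -> Set[str]:
    """Return the individuals that have every property in `properties`."""
    return {name for name, props in individuals.items() if properties <= props}


def satisfies_abnormality(individuals: Individuals, v0: Set[str], v: Set[str],
                          threshold: float = THRESHOLD) -> bool:
    """Check that restricting by v still leaves a significant majority of
    the contextually relevant individuals (those picked out by v0)."""
    relevant = restrict(individuals, v0)
    if not relevant:
        return False
    quantified = restrict(individuals, v)
    return len(quantified) / len(relevant) >= threshold


names = ["Anna", "Ben", "Dana", "Eli", "Gil", "Hila", "Noa", "Omer", "Ron", "Tal"]


def make_kid(name: str, has_bag: bool = True) -> FrozenSet[str]:
    """Build the property set of one (hypothetical) first grader."""
    props = {"in this school", "has school bag" if has_bag else "no school bag"}
    # Each child has the majority property 'not called N' for every other name.
    props |= {f"not called {other}" for other in names if other != name}
    return frozenset(props)


# Ben is the only child without a school bag.
kids: Individuals = {name: make_kid(name, has_bag=(name != "Ben")) for name in names}
v0 = {"in this school"}
all_not_called = {f"not called {n}" for n in names}

# A property of the majority may appear in a precisification.
print(satisfies_abnormality(kids, v0, v0 | {"has school bag"}))   # True
# A property of the minority is necessarily excluded.
print(satisfies_abnormality(kids, v0, v0 | {"no school bag"}))    # False
# A single majority property is allowed ...
print(satisfies_abnormality(kids, v0, v0 | {"not called Anna"}))  # True
# ... but all majority properties together leave nobody to quantify over.
print(satisfies_abnormality(kids, v0, v0 | all_not_called))       # False
```

The last two calls mirror the point made above: being a property of the majority makes a property eligible for the restriction, but a precisification cannot contain all such properties at once.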

5.3 The relevant abnormality constraint on K&L’s domain-vague restriction

The abnormality constraint in (35) captures correctly the way exceptions to descriptive generics are tolerated. But this is not enough for in-virtue-of generics. As seen above, here we need also to capture the fact that the exceptions are relevantly abnormal, where ‘relevantly abnormal’ means ‘being a property which blocks the reasonable causation relation between the in-virtue-of property, S, and the VP property, Q’. We start, then, by defining a set of blocking properties $B_{\langle\langle S,Q\rangle,w\rangle}$, as in (39):

(39) $B \in B_{\langle\langle S,Q\rangle,w\rangle}$ iff B is taken to be a property which, from the point of view of w, ‘blocks’ the reasonable causation relation between S (the in-virtue-of property) and Q (the VP property).

For example, $B_{\langle\langle \text{have a four-legged genetic makeup},\ \text{have four legs}\rangle, w\rangle}$ is the set of properties which, from the point of view of w, block the reasonable causation between having a four-legged genetic makeup and having four legs. ‘Having a mutation’, ‘undergoing an accident’ or ‘cutting off one’s leg’ are intuitively in this set, while ‘being yellow’ or ‘having vocal problems’ is not.15 An important observation about $B_{\langle\langle S,Q\rangle,w\rangle}$ is that this is, in fact, a vague set of properties, that is, a vague second-order property. We can think about this vagueness very similarly to the way the vagueness of first-order properties, like ‘bald’, is treated in supervaluation theories. Take again $B_{\langle\langle \text{have a four-legged genetic makeup},\ \text{have four legs}\rangle, w\rangle}$. Properties like ‘having a mutation in the gene responsible for number of legs’ or ‘cutting off one’s own leg’ are definitely in this set (in supervaluationist terms: they are in the positive extension and hence present in all precisifications). Other properties, like ‘being infertile’ or ‘having vocal problems’, are definitely not in this set (they are in the negative extension and hence absent from all precisifications). And still other properties, for example, ‘living in an area with many traps’ or ‘having a serious blood infection’, are borderline cases (present in only some precisifications). Speakers can be uncertain whether these properties do or do not block the reasonable causation relation between ‘having a four-legged genetic makeup’ and ‘having four legs’. Another example is (25) above (‘A woman in this place does not walk alone outside’). Suppose (25) is uttered in context, so one unique in-virtue-of property is accommodated, for example, ‘in virtue of living in a violent place’. Thus, we are interested in the set of ‘blocking’ properties $B_{\langle\langle \text{living in this dangerous place},\ \text{not walking alone outside}\rangle, w\rangle}$. This set is vague as well, with properties in its positive extension, for example, ‘being fully armed’; in its negative extension, for example, ‘having a name ending with f’; and crucially, also with borderline blocking properties, for example, ‘being the mafia leader’s wife’. It may be unclear whether this latter property indeed blocks a woman in this violent place from walking alone outside, or not. In supervaluationist terms, the vagueness of the set of blocking properties $B_{\langle\langle S,Q\rangle,w\rangle}$ means that this set of properties is, in fact, a set of sets of properties. We redefine, then, $B_{\langle\langle S,Q\rangle,w\rangle}$ as in (40) and (41) (where B is a set of properties and b is a property):

(40) $B_{\langle\langle S,Q\rangle,w\rangle}$ is a vague set of properties where for any set of properties $B \in B_{\langle\langle S,Q\rangle,w\rangle}$ it holds that $b \in B$ iff b is a property which, from the point of view of w, ‘blocks’ the reasonable causation relation between S and Q.

(41) a. The set of definitely blocking properties is $\{b: b \in \bigcap B_{\langle\langle S,Q\rangle,w\rangle}\}$ (present in all sets of $B_{\langle\langle S,Q\rangle,w\rangle}$).
b. The set of definitely non-blocking properties is $\{b: b \notin \bigcup B_{\langle\langle S,Q\rangle,w\rangle}\}$ (present in no set of $B_{\langle\langle S,Q\rangle,w\rangle}$).
c. The set of borderline blocking properties is $\{b: b \in \bigcup B_{\langle\langle S,Q\rangle,w\rangle} \wedge b \notin \bigcap B_{\langle\langle S,Q\rangle,w\rangle}\}$ (present in some, but not all sets of $B_{\langle\langle S,Q\rangle,w\rangle}$).

14 See section 6 below for a comparison between the present approach to ‘abnormality’ and other approaches.

15 Formally, assuming that S is a reasonable causer for Q, then $B \in B_{\langle\langle S,Q\rangle,w\rangle}$ iff in w $\forall x\, [S(x) \wedge B(x)] \rightarrow \neg Q(x)$. For example, assuming that having a four-legged genetic makeup reasonably causes having four legs, ‘having a mutation in the gene responsible for number of legs’ would be taken as a property blocking this reasonable causation, from the point of view of $w_0$, since in $w_0$ every individual with a four-legged genetic makeup who has a mutation in this genetic makeup would not have four legs (I am disregarding further and more far-fetched scenarios, e.g. the possibility that such an individual would have four legs as a result of some transplant). In contrast, it is false that every individual with such a genetic makeup who has a problem in his vocal cords will not have four legs. But see the discussion of the vagueness of blocking properties below.

Turning back to generics, we want to ensure that (a) the legitimate exceptions to in-virtue-of generics are those individuals with blocking properties and (b) any vagueness in characterizing the legitimate exceptions to in-virtue-of generics is due to the vagueness of what is and what is not considered a blocking property. We achieve that in (42), the relevant abnormality constraint on the domain-vague restriction, by requiring that besides the contextually supplied properties, the properties in the restriction on Gen are the complements of the blocking properties defined in (40) and (41):

(42) The relevant abnormality constraint on the domain-vague restriction $\langle v_0, V\rangle$ (where $\bar{b}$ is the complement of $b$):
a. $\forall v \in V \rightarrow v_0 \subseteq v$
b. (i) If $b \in \bigcap B_{\langle\langle S,Q\rangle,w\rangle}$ then $\bar{b} \in \bigcap V$.
(ii) If $b \notin \bigcup B_{\langle\langle S,Q\rangle,w\rangle}$ then $\bar{b} \notin \bigcup V$.
(iii) If $b \in \bigcup B_{\langle\langle S,Q\rangle,w\rangle} \wedge b \notin \bigcap B_{\langle\langle S,Q\rangle,w\rangle}$ then $\bar{b} \in \bigcup V \wedge \bar{b} \notin \bigcap V$.

Definition (42a) is K&L’s requirement that every precisification in the restriction is a superset of $v_0$, the set of contextually supplied properties. Definition (42bi)–(42biii) guarantees that, besides the properties in $v_0$, complements of definitely blocking properties are definitely in the restriction, complements of definitely non-blocking properties are definitely not in the restriction, and complements of borderline blocking properties are borderline properties in the restriction. To see how (42) works, take again (1) (‘A dog has four legs’). In this sentence, the set of contextually supplied restricting properties $v_0$ is empty (i.e. there are no contextually irrelevant dogs). The relevant abnormality constraint in (42bi) correctly guarantees that we end up definitely quantifying over all dogs that do not have mutations or who did not undergo an accident. These properties are the complements of the definitely blocking properties in $B_{\langle\langle \text{have a four-legged genetic makeup},\ \text{have four legs}\rangle, w\rangle}$, and thus, given (42bi), present in all sets of properties in the restriction of Gen. Consequently, dogs with mutations or who did undergo accidents are definitely not quantified over in the first place, and are thus definitely predicted to be legitimate exceptions to (1), as our intuitions tell us. On the other hand, since ‘living in an area with many traps’ is a borderline member of $B_{\langle\langle \text{have a four-legged genetic makeup},\ \text{have four legs}\rangle, w\rangle}$, its complement (‘not living in an area with many traps’) is present in only some of the precisifications in the restriction on Gen given (42biii). Thus, we have vagueness concerning whether dogs living in an area with many traps will or will not characterize the legitimate exceptions to (1). Finally, since being infertile or having vocal problems are definitely non-blocking properties (they are present in no precisification of $B_{\langle\langle \text{have a four-legged genetic makeup},\ \text{have four legs}\rangle, w\rangle}$), their complements (being fertile and not having vocal problems) are present in no sets of properties in the restriction, as (42bii) dictates. Thus, infertile dogs or dogs with vocal problems are not excluded from the quantification over Gen. Consequently, unless they happen to have some other ‘blocking property’, infertile dogs or dogs with vocal problems are not considered legitimate exceptions to (1), and thus (correctly) predicted to be covered by the generalization in (1).16
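The three-way classification of blocking properties, and the resulting status of individual dogs with respect to ‘A dog has four legs’, can be sketched computationally. The Python fragment below is a minimal illustration, assuming that the vague set of blocking properties is given as a non-empty list of precise sets (one per precisification). The function names, the particular precisifications and the status labels are hypothetical, and the sketch simplifies (42) by reading ‘the complement of b is in the restriction’ as ‘individuals with b are not quantified over’.

```python
from typing import FrozenSet, List, Set

Precisification = FrozenSet[str]


def definitely_blocking(B: List[Precisification]) -> Set[str]:
    """Properties present in every set of B (positive extension)."""
    return set(frozenset.intersection(*B))


def possibly_blocking(B: List[Precisification]) -> Set[str]:
    """Properties present in at least one set of B."""
    return set(frozenset.union(*B))


def borderline_blocking(B: List[Precisification]) -> Set[str]:
    """Properties present in some, but not all, sets of B."""
    return possibly_blocking(B) - definitely_blocking(B)


def definitely_non_blocking(B: List[Precisification], universe: Set[str]) -> Set[str]:
    """Properties present in no set of B (negative extension)."""
    return set(universe) - possibly_blocking(B)


def exception_status(individual_props: Set[str], B: List[Precisification]) -> str:
    """Classify an individual relative to an in-virtue-of generic."""
    if individual_props & definitely_blocking(B):
        return "definite exception"      # excluded in every precisification
    if individual_props & borderline_blocking(B):
        return "borderline"              # excluded in only some precisifications
    return "covered"                     # quantified over in every precisification


# Hypothetical precisifications for <<four-legged genetic makeup, four legs>, w>.
B = [
    frozenset({"mutation", "accident"}),
    frozenset({"mutation", "accident", "lives among traps"}),
    frozenset({"mutation", "accident", "blood infection"}),
]
UNIVERSE = {"mutation", "accident", "lives among traps",
            "blood infection", "infertile", "vocal problems"}

print(sorted(definitely_blocking(B)))                # ['accident', 'mutation']
print(sorted(borderline_blocking(B)))                # ['blood infection', 'lives among traps']
print(sorted(definitely_non_blocking(B, UNIVERSE)))  # ['infertile', 'vocal problems']
print(exception_status({"infertile"}, B))            # covered
print(exception_status({"mutation", "infertile"}, B))  # definite exception
print(exception_status({"lives among traps"}, B))    # borderline
```

On this toy data an infertile dog is still covered by the generalization, a dog with a mutation is a definite legitimate exception, and a dog living among traps is a borderline case, matching the predictions discussed above.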

5.4 Back to the contrast in characterizing exceptions

We are now in a position to explain precisely the generalizations in (20) and (30), above, summarized here as (43):

(43) The degree to which the properties of the exceptions to a generic can be specified is high with in-virtue-of generics (i.e. IS generics and their BP counterparts) but very low with unambiguously descriptive generics (i.e. BP generics with infelicitous IS counterparts).

In the section above we saw that the domain-vague restriction of in-virtue-of generics is limited by the relevant abnormality constraint in (42) (in addition to the abnormality constraint in (35)). This enables us to specify both the positive and the negative extension of the restriction of Gen (by specifying the positive and negative extension of $B_{\langle\langle S,Q\rangle,w\rangle}$), so the only source of vagueness in the characterization of the legitimate exceptions is the borderline properties in the restriction, that is, the borderline properties in the vague set $B_{\langle\langle S,Q\rangle,w\rangle}$ [e.g. ‘living in an area with many traps’, in the case of (1)]. In contrast, descriptive generics are limited by the abnormality constraint only, which, crucially, allows us to characterize only the negative extension of the restriction, namely, the properties which are present in no precisification $v - v_0$. These are the properties which are definitely considered properties of the minority of relevant P individuals, since adding them to the restriction will violate the requirement in (35) that we should end up quantifying over the majority of relevant P individuals. Crucially, however, we have no way to characterize the positive extension of the restriction with descriptive generics, that is, to find even one property X which is present in all precisifications $v - v_0$, since, as explained in section 5.2 above, even if some property is clearly a property of the majority [e.g. ‘not having a name beginning with A’ in the case of (36) above], this does not yet ensure that it is a member of all precisifications (though it may be a member of some of them). Consequently, the negation of such a property (e.g. the abnormal property ‘having a name beginning with A’) is a borderline case: there is no way to predict in advance that it definitely will or definitely will not characterize the exceptions to (36). The same procedure holds for any other property of the majority of relevant P individuals. The higher degree of vagueness concerning the properties of the legitimate exceptions with descriptive generics, then, results from the fact that we end up with many more ‘borderline’ properties in the restriction than we do with in-virtue-of generics.17

16 Notice that the restriction of in-virtue-of generics should be limited using both the relevant abnormality and abnormality constraints, since relevant abnormality alone is not enough to guarantee that we end up quantifying over the majority of relevant individuals. Suppose we hear (25) and accommodate ‘in virtue of the fact that this place is so violent’, so, for example, fully armed women are taken to be legitimate exceptions (they have a ‘blocking property’). The problem arises when such a ‘blocking’ property happens to be a normal property, that is, a property of the majority of women in this town, so despite the violence most women in this town do walk alone outside. In such a situation, (25) is judged as false, but if the relevant abnormality in (42) is the only constraint we use, the sentence is wrongly predicted to be true. Adding the abnormality constraint in (35), then, ensures that only abnormal blocking properties legitimize exceptions to in-virtue-of generics.

17 While the lack of specification of properties of the majority in the restriction of unambiguously descriptive generics has the advantage of capturing the intuition about their vagueness with respect to the properties of their exceptions, a reviewer notes that this lack of specification may lead to problematic or too weak truth conditions for such generics. Further research should attempt to clarify whether more constraints on the restriction of descriptive generics are indeed needed and which intuitive judgments on the truth conditions and legitimate exceptions of such generics justify such additional constraints.

5.5 Characterizing accessibility relations as (potentially) vague restrictions on worlds

We have just defined the restriction on the set of individuals as vaguer with descriptive than with in-virtue-of generics. But notice that we can also treat the difference between the in-virtue-of and descriptive accessibility relations as a difference in degree of vagueness: In Greenberg (2003), the in-virtue-of accessibility relation is highly specified: we look only at worlds where every member of the subject set has a contextually supplied in-virtue-of property (e.g. with ‘A dog has four legs’, we look only at the worlds where every dog has a four-legged genetic makeup). In contrast, with descriptive generics, no in-virtue-of property is specified, and consequently we do not specify the exact way that the accessible worlds are similar to $w_0$, but define them in a vaguer way, as maximally or overall similar to $w_0$. Thus, the degree of vagueness with respect to the restriction on individuals correlates with the degree of vagueness with respect to the restriction on worlds, as schematically summarized in (44):

(44)
                                       Degree of vagueness of      Degree of vagueness with respect
                                       the accessibility relation  to properties of the exceptions
                                       (vagueness of ∀w)           (vagueness of ∀x)
Unambiguously in-virtue-of generics    Low                         Low
Unambiguously descriptive generics     High                        High

Now, instead of unnecessarily defining two contrasts in the degree of vagueness (on $\forall w$ and on $\forall x$), we can define only a difference in degree of vagueness of $\forall w$, and then define an algorithm which will derive from it the difference in degree of vagueness of $\forall x$. Formalizing the intuitive difference in vagueness between the in-virtue-of and the descriptive accessibility relations can be done by slightly deviating from Greenberg’s (2003) original view, according to which with the former there is an in-virtue-of property in the semantic structure, whereas with the latter there is no such property at all (so we use ‘maximal’ or ‘overall’ similarity). In an alternative view, we can claim that with both types of generics we take the generalization to hold in virtue of some property, but that this property is specified to the speaker only with in-virtue-of generics, while with unambiguously descriptive ones it is unknown or unspecified. Formally, we represent the accessibility relation as a set of propositions (and thus as a set of sets of worlds), which is precise in the case of in-virtue-of generics and vague in the case of descriptive generics. In both cases, we universally quantify over all worlds where every P member has the in-virtue-of property S:

(45) $\forall w'\, [w' \in \forall x\, P(x) \rightarrow S(x)] \rightarrow \forall x\, [w' \in \forall x\, PX_p(x) \rightarrow Q(x)]$

With in-virtue-of generics, S is fixed in every context of utterance c. Consequently, in every context c, we end up with a precise set of worlds quantified over. For example, if in a context c where ‘An accountant hardly pays taxes’ is uttered we choose ‘in virtue of being dishonest’, then we quantify over all worlds where every accountant is dishonest. In contrast, with unambiguously descriptive generics like ‘Norwegian students whose names end with “s” wear thick green socks’, the choice of in-virtue-of property S is not resolved by context, that is, even in a specific context c S is unknown, so there are multiple potential properties playing its role in (45). Consequently, we end up having multiple propositions of the form $\forall x\, P(x) \rightarrow S(x)$ in (45) (e.g. $\forall x\, P(x) \rightarrow S_1(x)$, $\forall x\, P(x) \rightarrow S_2(x)$, $\forall x\, P(x) \rightarrow S_3(x)$, etc.), and thus with multiple potential sets of accessible worlds, each of which represents one way of making the accessibility relation precise. We now define a restriction $X_p$ on the set of individuals in (45) for both in-virtue-of and descriptive generics, which is sensitive to the (potentially vague) accessibility relation:

(46) Let $X_p$ be a pair $\langle v_0, v\rangle$, where both $v_0$ and $v$ are sets of properties, every property in $v_0$ is directly supplied by context, and $v_0 \subseteq v$. Let $B_{\langle\langle S,Q\rangle,w\rangle}$ be a set of blocking properties [as in (39) above]:
a. $|\bigcap v \cap P \text{ in } c|$ is not significantly smaller than $|\bigcap v_0 \cap P \text{ in } c|$ (abnormality)
b. If $b \in B_{\langle\langle S,Q\rangle,w\rangle}$ then $\bar{b} \in v - v_0$, where $\bar{b}$ is the complement of $b$ (relevant abnormality)

Notice that unlike K&L’s definition, the second member of the pair $\langle v_0, v\rangle$ is a set of properties, and not a set of sets of properties, that is, the restriction is not explicitly defined as vague. Nonetheless, it indirectly comes out as vague, with no need to stipulate vagueness, as in K&L’s theory. The reason is that, given how the abnormality constraint is defined, the restriction can potentially contain various combinations of properties of the majority, and there is no unique set of properties which are chosen (as discussed above). Moreover, the dependency of (46) on the in-virtue-of property S [through the definition of relevant abnormality in (46b)] guarantees that it correctly comes out vaguer with descriptive generics than with in-virtue-of ones. Whereas in the latter the only source of vagueness is the vague extension of $B_{\langle\langle S,Q\rangle,w\rangle}$ (i.e. which properties are considered complements of ‘blocking properties’ and which are not), in the former it is also the choice of what $B_{\langle\langle S,Q\rangle,w\rangle}$ is in the first place, since we have complete vagueness with respect to the choice of S.

6 CONCLUDING REMARKS AND COMPARISONS WITH OTHER EXCEPTIONS-TOLERANCE MECHANISMS

In this paper I showed that, despite the strong similarities between them, IS and BP generics differ in the degree to which the properties of their legitimate exceptions can be specified. I argued that this contrast is a special case of a much wider and deeper difference between IS and BP generics, namely, a difference in the accessibility relations, which
is also manifested in felicity differences between them (originally observed in, e.g. Lawler 1973). I developed an improved version of the exceptions-tolerance mechanism for generic sentences suggested in Kadmon & Landman (1993) to account for the newly observed difference, namely, a restriction on the set of individuals quantified by Gen, which is partially vague to two degrees using supervaluationist methods. These two degrees of vagueness are not stipulated but follow from the systematic dependency of this restriction on the two types of accessibility relations that IS and BP generics are compatible with, which are redefined here as precise and vague restrictions on the domain of worlds. The theory developed here has much in common with intuitions of other theories proposed in the genericity literature. I want to finish this paper by evaluating the success of the present theory to formally capture two such intuitions, relative to these other proposals. Like many other theories of genericity, the present theory attempts to capture the intuitive observation that the exceptions to generics are somehow abnormal (what I called the abnormality constraint). Unlike the present theory, however, other theories use, in some way or another, the unanalysed adjective ‘(ab)normal’ in their definitions. This, I suggest, is problematic. One type of problem is found with the well-known exceptions-tolerance mechanism of Krifka et al. (1995) and Krifka (1995), in which the quantification over accessible worlds is further restricted to the worlds which are ‘most normal’, from the point of view of our world. The idea is that in those most normal worlds, abnormal things like mutations or accidents do not exist, so dogs with mutations or those who have undergone an accident are not quantified over (cf. Delgrande 1987, 1988). However, treating mutations, accidents, etc. 
as abnormal is problematic, since a world completely free of such phenomena will be considered truly abnormal from the point of view of our world. A similar problem, noted by, for example, Cohen (1999) is that, under this proposal ‘A bird flies’ circularly means that ‘every bird flies in all accessible most normal worlds (where among other things, birds fly)’. The reason for these kinds of problems seems to be the fact that ‘(ab)normality’ in the suggestion of Krifka et al. and Delgrande is total. This is avoided in theories which relativize (ab)normality to the subject property (e.g. with respect to being a dog), as in, e.g. Eckardt (1999), Asher & Morreau (1995) and Pelletier & Asher (1997). These theories, however, have no way to account for the systematic ability of language users, observed in section 3 above, to distinguish between those abnormal subject members who do and who do not count as legitimate exceptions, for example, for the fact that an individual dog can be ‘abnormal for a dog’ (e.g. infertile or with exceptional problems in its vocal cords), but nonetheless we would expect it to be covered by the generalization ‘A dog has four legs’. An intuitively better suggestion may be to relativize abnormality to the VP property, as suggested in circumscription theories like McCarthy (1986) or Drewery (1997) (e.g. requiring the exceptions to be abnormal with respect to having four legs). Such a suggestion, however, cannot account for the fact that the characterization of exceptions varies not only with respect to material in the sentence (the subject and VP) but also with respect to the accommodated material, as in example (25) above (‘A woman in this place doesn’t walk alone outside’). It was shown that considering individuals who are abnormal with respect to being a woman in this town and/or with respect to walking alone outside is not enough, since different types of such abnormal women are considered exceptions to (25) depending on the context in which the generic is uttered, which determines the in-virtue-of factor of the sentence. The present theory gives more precise content to what ‘abnormal’ means, as far as IS and BP generics are concerned. It defines ‘abnormal’ as ‘having a property of the minority (of relevant individuals) which blocks the reasonable causation relation between the accommodated, in-virtue-of property and the VP property’. This allows the type of abnormality to vary depending on both material in the sentence and accommodated material, and correctly predicts that when the in-virtue-of factor cannot be accommodated (as in the out-of-the-blue (25) or with unambiguously descriptive generics like (3b)), the type of abnormality relevant for legitimate exceptions cannot be specified either, that is, it is vague.
The present theory also attempts to capture the intuition that generics make claims about the majority of contextually relevant individuals in the subject set. This intuition is formulated in a simpler and an elegant way in Cohen (1999), who abandons the common idea that generic quantification is universal, and argues instead that its quantificational force is similar to ‘most’. The meaning of ‘Dogs have four legs’ in this theory is roughly ‘Most dogs, in all admissible histories (continuing the present history) have four legs’. This has the immediate advantage that there is no need to stipulate any exceptions-tolerance mechanism. There are, however, two main problems with such a move. The first is that this interpretation does not capture at all the widely held abnormality constraint. The exceptional dogs in (1), namely those

with no four legs, constitute the minority, but nothing guarantees that they are abnormal in some other sense, relative to those with four legs. The second problem has to do with the observation, made in section 2 above, that taking exceptional entities as counterexamples to generics is evaluated as completely legitimate. I suggest that this can only happen if Gen is indeed universal. If the quantificational force with example (11) (‘Professors wear a tie’) was ‘most’, the cooperative listener would never say something like ‘but look at Bill, he does not wear a tie!’ [as in (12b) above], because he/she would not expect the quantification to range over all individuals in the first place. That is, if generics had a ‘most’-like quantifier, we would wrongly expect the listener’s reaction to (12b) to be as infelicitous as the ones in (47b):

(47) a. Most (potential) professors wear a tie.
b. #But look at Bill! He does not wear a tie!

The fact that, unlike (47b), (12b) above is considered legitimate and felicitous indicates, then, that although many generics indeed make claims about the majority of individuals, the quantificational force of Gen cannot be that of ‘most’. This is captured in the present analysis by defining Gen as universal, but at the same time restricting the individuals who are quantified over in such a way that no matter which precisification in the restriction is chosen, we end up indirectly quantifying over the majority of (relevant) individuals.

Acknowledgements
I want to deeply thank Susan Rothstein for her invaluable comments in clarifying the ideas presented in various versions of this paper. Thanks also to Ron Artstein, Greg Carlson, Gennaro Chierchia, Ariel Cohen, Edit Doron, Angelika Kratzer, Anita Mittwoch, Galit Sassoon, participants of the Israeli Association for Theoretical Linguistics 2003 conference at Ben-Gurion University, and especially Fred Landman and three anonymous reviewers for their helpful comments.

YAEL GREENBERG
Department of English
Bar-Ilan University
Ramat Gan 52900
Israel
e-mail: [email protected]

REFERENCES
Asher, N. & M. Morreau (1995), ‘What some generic sentences mean’. In G. Carlson and F. J. Pelletier (eds.), The Generic Book. The University of Chicago Press. Chicago 300–338.
Brennan, V. (1993), Root and Epistemic Modal Auxiliary Verbs. Ph.D. thesis, University of Massachusetts, Amherst.
Burton-Roberts, N. (1977), ‘Generic sentences and analyticity’. Studies in Language 1:155–96.
Carlson, G. (1977), Reference to Kinds in English. Ph.D. thesis, University of Massachusetts, Amherst 224–237.
Carlson, G. (1995), ‘Truth conditions of generic sentences: two contrasting views’. In G. Carlson and F. J. Pelletier (eds.), The Generic Book. The University of Chicago Press. Chicago 224–237.
Carlson, G. (1999), ‘Evaluating generics’. In P. Lasersohn (ed.), Illinois Studies in the Linguistics Sciences 29:1–11.
Chierchia, G. (1995), ‘Individual level predicates as inherent generics’. In G. Carlson and F. J. Pelletier (eds.), The Generic Book. The University of Chicago Press. Chicago 176–223.
Chierchia, G. (1998), ‘Reference to kinds across languages’. Natural Language and Linguistic Theory 6:339–405.
Cohen, A. (1996), Think Generic: The Meaning and Use of Generic Sentences. Ph.D. thesis, Carnegie Mellon University, Pittsburgh, PA, Also published in 1999, CSLI. Stanford, CA.
Cohen, A. (1999), ‘Generics, frequency adverbs and probability’. Linguistics and Philosophy 22:221–53.
Cohen, A. (2001a), ‘On the generic use of indefinite singulars’. Journal of Semantics 18:183–209.
Cohen, A. (2001b), ‘Relative readings of many, often and generics’, Natural Language Semantics 9:41–67.
Condoravdi, C. (1997), Descriptions in Context. Garland. New York.
Dahl, O. (1975), ‘On generics’. In E. Keenan (ed.), Formal Semantics of Natural Language. Cambridge University Press. Cambridge 99–111.
Delgrande, J. P. (1987), ‘A first-order conditional logic for prototypical properties’. Artificial Intelligence 33:105–30.
Delgrande, J. P. (1988), ‘An approach to default reasoning based on a first-order conditional logic: revised report’. Artificial Intelligence 36:63–90.
Dobrovie-Sorin, C. & B. Laca (1996), Generic Bare NPs. Unpublished MS. Université Paris 7 and Université de Strasbourg, France.
Dowty, D. (1979), Word Meaning and Montague Grammar. Kluwer. Dordrecht.
Drewery, A. (1997), ‘Representing generics’. In Proceedings of the Second ESSLLI Students Session, Ninth European Summer School in Logic, Language and Information, Aix-en-Provence, France.
Eckardt, R. (1999), ‘Normal objects, normal worlds, and the meaning of generic sentences’. Journal of Semantics 16:237–78.
Fine, K. (1975), ‘Truth vagueness and logic’. Synthese 30:265–300.
Greenberg, Y. (2003), Manifestations of Genericity. Ph.D. thesis, Bar Ilan University, Israel, Also published in Outstanding Dissertations in Linguistics. Routledge. New York.
Greenberg, Y. (2004), ‘Tolerating exceptions with ‘‘in-virtue-of ’’ and ‘‘descriptive’’ generics: two types of modality and reduced vagueness’. In K. von Heusinger and K. Turner (eds.) Where Semantics Meets Pragmatics: The Michigan Papers. Elsevier. Amsterdam.
Heim, I. (1982), The Semantics of Definite and Indefinite Noun Phrases. Ph.D. thesis, University of Massachusetts, Amherst.
Kadmon, N. & F. Landman (1993), ‘Any’. Linguistics and Philosophy 16:353–422.
Kamp, H. (1975), ‘Two theories about adjectives’. In E. Keenan (ed.), Formal Semantics of Natural Language. Cambridge.
Kratzer, A. (1981), ‘The notional category of modality’. In H.-J. Eikmeyer and H. Reiser (eds.), Worlds and Contexts. New Approaches to World Semantics. de Gruyter. Berlin. 38–74.
Krifka, M. (1987), An Outline of Genericity [partly in collaboration with C. Gerstner]. SNS-Bericht, University of Tübingen. Germany.
Krifka, M. (1995), ‘Focus and the interpretation of generic sentences’. In G. Carlson and F. J. Pelletier (eds.), The Generic Book. The University of Chicago Press. Chicago.
Krifka M., Pelletier, F. J., Carlson G, ter Meulen A., Link, J. & G. Chierchia (1995), ‘Genericity: an introduction’. In G. Carlson and F. J. Pelletier (eds.), The Generic Book. The University of Chicago Press. Chicago. 1–124.
Landman, F. (1992), ‘The Progressive,’ Natural Language Semantics 1.
Lawler, J. (1973), ‘Studies in English generics’. University of Michigan Papers in Linguistics 1:1.
Lewis, D. (1973), Counterfactuals. Harvard University Press. Cambridge, MA.
Link, G. (1995), ‘Generic information and dependent generics’. In G. Carlson and F. J. Pelletier (eds.), The Generic Book. The University of Chicago Press. Chicago. 358–382.
McCarthy, J. (1986), ‘Applications of circumscription to formalizing common-sense knowledge’. Artificial Intelligence 28:89–116.
Pelletier, F. J. & N. Asher (1997), ‘Generics and default’. In J. van Benthem and A. ter Meulen (eds.), Handbook of Logic and Language. MIT Press. Cambridge, MA.
Prasada, S. & E. M. Dillingham (1999), ‘Principled and statistical connections in common sense conception’. Cognition 99:73–112.
Schubert, L. K. & F. J. Pelletier (1988), ‘An outlook on generic statements’. In M. Krifka (ed.), Genericity in Natural Language. SNS-Bericht, University of Tübingen. Germany.
Stalnaker, R. (1968), ‘A theory of conditionals’. In N. Rescher (ed.), Studies in Logical Theory. Blackwell. Oxford.
ter Meulen, A. (1995), ‘Semantic constraints on type shifting anaphora’. In G. Carlson and F. J. Pelletier (eds.), The Generic Book. The University of Chicago Press. Chicago.
Wilkinson, K. (1991), Studies in the Semantics of Generic Noun Phrases. Ph.D. thesis, University of Massachusetts, Amherst.

First version received: 28.01.2006
Second version received: 26.10.2006
Accepted: 15.12.2006

Journal of Semantics 24: 169–213 doi:10.1093/jos/ffm001 Advance Access publication April 21, 2007

On the Semantics of Comparative Correlatives in Mandarin Chinese JO-WANG LIN National Chiao Tung University

The Author 2007. Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected]

Abstract
The main objective of this article is to provide a formal semantics of comparative correlatives of the form ‘yuè ..., (jiù) yuè ...’ in Mandarin Chinese. A new analysis is proposed which treats comparative correlatives as involving a quantificational tripartite structure and which derives all the meanings of the ‘yuè ..., (jiù) yuè ...’ constructions through a comparison of degrees that relate either to different individuals or to the same individual. Whether the degrees compared relate to the same individual or to different individuals is shown to be a consequence of the general processing strategy for noun phrase interpretations suggested by Partee, as well as of the constituent structure of the sentence. It is also shown that the same semantics for Chinese yuè can be extended to the -er in English comparative conditionals, as well as to similar constructions in other languages. At the same time, superficial cross-linguistic variation can be derived from independent factors.

1 INTRODUCTION
Mandarin Chinese has a special construction consisting of two clauses, each of which contains the morpheme yuè ‘more’. In addition to yuè, the second clause of such a construction may contain the morpheme jiù, which is sometimes translated as then in English. This is illustrated in (1)–(3) below.

(1) nǐ yuè shēngqì, tā (jiù) yuè gāoxìng
    you more angry he JIU more happy
    ‘The angrier you are, the happier he is.’

(2) wǒ yuè zhuī tā, tā (jiù) yuè pǎo
    I more chase him he JIU more run
    ‘The more I chased him, the more he ran.’

(3) tā yuè dà, (jiù) yuè piàoliàng
    she more big JIU more beautiful
    ‘The older she becomes, the more beautiful she is.’

The most significant semantic feature of the above construction is that an increase in the degree of the property denoted by the first clause causes an increase in the degree of the property denoted by the second clause. Lü (1982: 367–369) refers to this phenomenon as dependent change [yǐbiàn in Chinese], and Xing (2001: 378) dubs such constructions constructions of conditional dependent change [tiáojiàn yǐbiàn jù in Chinese]. Intuitively, it makes sense to call the ‘yuè ... yuè ...’ constructions conditional, because their semantics is quite close to that of regular conditionals. On the one hand, the meaning of (1) is almost equivalent to (4) or (5), where a conditional connective is inserted, as Liu (2006) has pointed out.

(4) rúguǒ nǐ yuè shēngqì, tā (jiù) yuè gāoxìng
    if you more angry he then more happy
    lit. ‘If you are angrier, he is then happier.’

(5) zhǐyào nǐ yuè shēngqì, tā (jiù) yuè gāoxìng
    as-long-as you more angry he then more happy
    lit. ‘As long as you are angrier, he is happier.’

On the other hand, the ‘yuè ... yuè ...’ constructions are the Chinese counterparts to the English ‘the -er ... the -er’ constructions in terms of translation. For instance, (1) is the Chinese counterpart to the English sentence below.

(6) The angrier you are, the happier he is.

Interestingly, constructions like (6) have been thought of as a kind of conditional. For example, McCawley (1988) refers to this construction as ‘comparative conditionals’, a term which has also been adopted by Beck (1997) and others.
Despite the semantic similarity between the ‘yuè ... yuè ...’ constructions and normal conditionals, the ‘yuè ... yuè ...’ constructions are not fully equivalent to rúguǒ- or zhǐyào-conditionals (if or as-long-as conditionals). There is at least one difference between the two constructions: while the ‘yuè ... yuè ...’ constructions may describe an on-going situation or a fact that has happened in the past, as Xing (2001: 379–380) has already observed, rúguǒ- or zhǐyào-conditionals may not. To illustrate, consider (7). This sentence may describe an on-going situation in which more and more guests are arriving at the time of speech, so that the food may not be enough. In (8), both the event of his scolding me and the event of my getting angry took place yesterday.

(7) kèrén yuè lái yuè duō, shíwù kěnéng huì bú gòu
    guest more come more many food possibly will not enough
    ‘More and more guests are arriving, so the food may not be enough.’

(8) zuótiān tā yuè mà wǒ, wǒ jiù yuè shēngqì,
    yesterday he more scold me I then more angry
    suǒyǐ wǒ jiù bù gēn tā jiǎng huà le
    so I then not with him say word Par
    ‘Yesterday, the more he scolded me, the angrier I got. So I decided not to talk with him.’

Now if the conditional morpheme rúguǒ is inserted into (7) and (8), either the meaning changes or the sentence becomes ungrammatical, as shown in (9) and (10). Sentence (9) only indicates a possibility, not the reality,1 while sentence (10) is unacceptable and cannot be used as a counterfactual conditional.2

(9) rúguǒ kèrén yuè lái yuè duō, shíwù kěnéng huì bú gòu
    if guest more come more many food possibly will not enough
    ‘If more and more people arrive, the food may not be enough.’

(10) zuótiān rúguǒ tā yuè mà wǒ, wǒ jiù yuè shēngqì
     yesterday if he more scold me I then more angry
     lit. ‘Yesterday, if he scolded me more, I would get more angry.’

1 For this fact, see Lin (1996) for more discussion.
2 This fact has been observed by Liu (2006).

Very interestingly, the properties that the ‘yuè ... yuè ...’ constructions display are quite similar to those exhibited by what Cheng and Huang (1998) refer to as ‘bare conditionals’, in which a wh-phrase in the consequent clause is anaphoric to another wh-phrase in the antecedent clause. The construction of ‘bare conditionals’ is illustrated by (11).

(11) shéi xiān lái, shéi xiān chī
     who first come who first eat
     ‘Whoever comes first eats first.’

According to Lin (1996), a major difference between ‘bare conditionals’ and regular conditionals is that the antecedent clause in a ‘bare conditional’ may describe a fact, as shown in (12), whereas a rúguǒ-clause (‘if-clause’) only indicates a possibility, as shown in (13).

(12) nǐ zuótiān gēn shéi yì zǔ, jīntiān jiù hái shì gēn shéi yì zǔ
     you yesterday with who one group today then still be with whom one group
     lit. ‘Whoever you were in the same group with yesterday, you will still be in the same group with him today.’

(13) rúguǒ zuótiān shéi méi jiāo zuòyè dehuà, jīntiān tā jiù bìxū bǔjiāo
     if yesterday who not hand-in homework if today he then must make-up
     ‘If anyone didn’t hand in his homework yesterday, he must do so today.’

Sentence (13) does not entail that someone did not hand in his homework, but (12) entails that you were in the same group with that someone. On the basis of this difference, Lin (1996) concludes that ‘bare conditionals’ are not elliptical rúguǒ-conditionals (‘if-conditionals’). By parity of reasoning, it can be concluded that the ‘yuè ... yuè ...’ constructions are not elliptical rúguǒ-conditionals.
Although the ‘yuè ... yuè ...’ constructions are not elliptical rúguǒ-conditionals, their semantics is very similar to that of the latter. Therefore, in this article, the semantics of the ‘yuè ... yuè ...’ constructions will be approached in the way linguists have approached regular conditionals. However, in what follows, the ‘yuè ... yuè ...’ constructions are referred to as comparative correlatives, because it is not known whether or not they are really a kind of conditional,3 nor can such a view be defended at this stage.

3 This problem particularly stands out when we consider cases where the first occurrence of yuè is embedded in a relative clause, as will be discussed later in section 6.

The main goal of this article is to provide a formal semantics of comparative correlatives and of the morpheme yuè ‘more’. In addition to this general goal, this article will pay particular attention to the following questions. First, when the subject NP of a yuè-predicate is a bare noun, why is it that sometimes different individuals are compared with respect to the degrees of a given property, but at other times the same individual’s different degrees of a given property are compared? For instance, in sentence (14), which contains an individual-level predicate in the first clause, the degrees of beauty evident in different girls are being compared, but in (15), which contains a stage-level predicate in the first clause, the same individual’s different degrees of nervousness are more likely being compared.

(14) nǚ háizi yuè piàoliàng, (jiù) yuè duō rén zhuī
     girl more beautiful then more many man chase
     ‘The more beautiful a girl is, the more men will chase her.’
     ‘∀x,y[girl(x) ∧ girl(y) ∧ y is more beautiful than x][more men chase y than x]’4

(15) xuésheng yuè jǐnzhāng, (jiù) yuè róngyì kǎo de bù hǎo
     student more nervous then more easy take-exam DE not good
     ‘The more nervous a student is, the easier it is for him to not do well in an exam.’
     ‘∀x,s,s′[student(x) ∧ x is more nervous in situation s than in s′][x’s performance is worse in s than in s′]’

Curiously, however, if the first clause in (15) is turned into a complex NP with a relative clause, different individuals’ degrees of nervousness are compared, rather than the same individual’s different degrees of nervousness.

(16) yuè jǐnzhāng de xuésheng (jiù) yuè róngyì kǎo de bù hǎo
     more nervous REL student then more easy take-exam DE not good
     ‘∀x,y[student(x) ∧ student(y) ∧ x is more nervous than y][x’s performance is worse than y’s]’

4 The logical representations in (14)–(16) are not intended as analyses of the examples, but rather as helpful indicators of their meaning.

Another goal of this article is to argue that the different interpretations of comparative correlatives in Mandarin Chinese may be due to the general processing strategy of semantic interpretation proposed by Partee (2004), as stated in (17), as well as to the constituent structure of a given comparative correlative.

(17) Partee’s General Processing Strategy
     There is a general processing strategy of trying lowest types first, using higher types only when they are required in order to combine meanings by available compositional rules. (Partee 2004: 204–205)

Finally, in addition to Chinese comparative correlatives, this paper will also discuss their counterpart constructions in English, aiming to


provide a possible universal semantics of comparative correlatives across languages.
This article is organized as follows. Section 2 is a brief review of Hsiao’s (2003) perspective on the semantics of comparative correlatives, in which some limitations and reservations are presented. Section 3 is a summary of Beck’s (1997) analysis of comparative conditionals. Section 4 argues that Beck’s analysis of comparative conditionals may not be directly extended to Chinese and that, therefore, a new analysis of Chinese comparative correlatives is needed. Section 5 offers a new proposal for the semantics of the morpheme yuè ‘more’ and of the construction containing yuè as a whole. Section 6 extends the analysis proposed in Section 5 to the more complex NP with a relative clause. Section 7 further shows how the proposed semantics of yuè can account for the co-occurrence of yuè with an overt item of comparison, which is not permitted in English comparative conditionals. Finally, Section 8 concludes this article by outlining one possible direction for a unifying semantics of crosslinguistic comparative correlative constructions.

2 HSIAO’S (2003) ANALYSIS OF CHINESE COMPARATIVE CORRELATIVES
The most recent published semantic analysis of Chinese comparative correlatives known to the author is Hsiao’s (2003) work, in which she suggests that comparative correlatives express a proportional reading. According to her, the first yuè ‘more’ in a comparative correlative contains an independent variable X, whereas the second yuè contains a dependent variable Y. The value of the degree or quantity of Y changes with that of the degree or quantity of X. The meaning of the construction is expressed in terms of the following formula:

(18) y = ax + b

In mathematics, the a and b in (18) stand for constants, and b can be the number zero. Now let us suppose that a = 2, b = 0, and x and y are positive natural numbers. We will get the following results.

(19) 2 = 2 × 1 + 0
     4 = 2 × 2 + 0
     6 = 2 × 3 + 0
     8 = 2 × 4 + 0
     ...

In other words, the value of y is always two times the value of x. So the two values are proportional.
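The formula in (18) and the table in (19) can be reproduced numerically. The following is only an illustrative sketch, not part of Hsiao’s (2003) proposal: the function names `hsiao_formula`, `is_strictly_increasing` and `has_constant_step` are hypothetical, and the second and third functions are added here merely to make explicit the difference between a sequence that increases and a sequence that increases by a fixed amount, as y = ax + b predicts when x runs through 1, 2, 3, ...

```python
from typing import List, Sequence, Tuple


def hsiao_formula(x: float, a: float = 2, b: float = 0) -> float:
    """Dependent value y for an independent value x under y = ax + b, as in (18)."""
    return a * x + b


def is_strictly_increasing(values: Sequence[float]) -> bool:
    """True if each value is greater than the one before it."""
    return all(later > earlier for earlier, later in zip(values, values[1:]))


def has_constant_step(values: Sequence[float], tol: float = 1e-9) -> bool:
    """True if successive values differ by a fixed amount (illustrative check)."""
    steps = [later - earlier for earlier, later in zip(values, values[1:])]
    return all(abs(step - steps[0]) <= tol for step in steps)


# The rows of (19): a = 2, b = 0, x = 1..4.
table: List[Tuple[int, float]] = [(x, hsiao_formula(x)) for x in range(1, 5)]
for x, y in table:
    print(f"{y} = 2 x {x} + 0")

# The y-values of (19) pass both checks.
ys = [y for _, y in table]
print(is_strictly_increasing(ys), has_constant_step(ys))
```

A sequence can pass the first check while failing the second. The values 3, 3.5, 5, 6, 6.7, for instance, increase at every step but not by a fixed amount.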

Now the question is: Do comparative correlatives really express a proportional function, as Hsiao (2003) has suggested? Take (20) as an example.

(20) Zhāngsān yuè pǎo yuè kuài
     Zhangsan more run more fast
     ‘The more Zhangsan runs, the faster he is.’

Suppose that (20) represents a proportional meaning and that Zhangsan’s speed is measured in terms of the distance that he can run per second. Further assume that Zhangsan runs 5 metres in the first second. Then according to the formula in (18), his speed should be 10 metres per second in the next second, 15 metres per second in the third second, and so on. Does (20) really express such a proportional relationship? This proportional interpretation is possible, but (20) does not need to have it. All that is required for a sentence like (20) to be true is that the speed during a subsequent second be greater than the speed in the previous second. It is not necessary for the speed during a subsequent second to be proportional to the speed during the previous second. For example, (20) may perfectly well describe the following scenario, though there is no proportional relationship between the speeds in two adjacent seconds.

(21) The speed in the first second is 3 metres per second
     The speed in the second second is 3.5 metres per second
     The speed in the third second is 5 metres per second
     The speed in the fourth second is 6 metres per second
     The speed in the fifth second is 6.7 metres per second

Clearly, in the above scenario, the speed at a later second has no fixed proportional relation to the speed at the previous second. Therefore, it can be concluded that Hsiao’s treatment of comparative correlatives as ‘proportional correlative constructions’ may not adequately account for the semantics of such constructions.5

5 Also see Liu (2006) for a similar conclusion.

3 BECK’S (1997) ANALYSIS OF ENGLISH COMPARATIVE CONDITIONALS
As was noted at the outset of this article, it has been argued by some that the Chinese comparative correlatives are the equivalents of the English comparative conditionals. It is, therefore, helpful to see how such constructions are analysed in the literature.


The most detailed semantic analysis of such constructions is provided by Beck (1997).6 In accordance with earlier works such as Wold (1991), she analyses comparative conditionals as a special kind of conditional which involves the correlative constructions suggested by von Fintel (1994). Her semantic analysis of comparative conditionals is mainly based on two assumptions. First, comparative conditionals display many properties similar to those of regular conditionals, as discussed in Heim (1982). Therefore, Heim’s theory of conditionals can be used to analyse comparative conditionals. Second, both the antecedent and the consequent clause in a comparative conditional involve two compared items. So the semantics of comparison must be incorporated into the semantics of comparative conditionals. The following is a summary of Beck’s treatment of comparative conditionals with respect to these two points.

6 See also Thiersch (1982), Fillmore (1987), Culicover and Jackendoff (1999), Borsley (2004) and Dikken (2005) and references cited there for other discussions of the syntax and semantics of comparative conditionals.

In order to explain donkey anaphora in English, Heim (1982), following Lewis (1975), analyses conditionals on the assumption that they have a quantificational tripartite structure, in which the if-clause is the restriction of a possibly implicit quantifier or adverb of quantification and the consequent clause is the nuclear scope. The quantifier or adverb of quantification may freely select which free variables within its scope it binds. For example, according to Heim, both indefinites and donkey pronouns can be analysed as free variables. Therefore, an (implicit) adverb of quantification may bind the variables introduced by them, as shown in (22)–(23).

(22) a. If a man owns a donkey, he beats it.
     b. ∀x,y[man′(x) ∧ donkey′(y) ∧ own′(x,y)][beat′(x,y)]

(23) a. If a man owns a donkey, he usually/sometimes beats it.
     b. MOST/SOME x,y[man′(x) ∧ donkey′(y) ∧ own′(x,y)][beat′(x,y)]

As for the semantics of comparison, Beck (1997) proposes the following analysis of English comparative clauses. She adopts a maximal degree analysis for comparison. Therefore, the semantics of (24a) is (24c).

(24) a. Louise is (3cm) taller than Otto (is tall).
     b. -er′(λd[tall′(d, Otto)])(3 cm)(λd[tall′(d, Louise)])
     c. The max d1[tall′(d1, Louise)] = 3 cm + the max d2[tall′(d2, Otto)]

According to the above analysis of comparison, the comparative morpheme -er takes three arguments, as shown in (25).

(25) ⟦-er⟧(D1)(d)(D2) = 1 iff the max d2[D2(d2)] = d + the max d1[D1(d1)]

However, we know that the difference argument sometimes need not be present. Beck suggests that such difference arguments are existentially closed. Therefore, when -er takes only two arguments, its semantics is as in (26), while the meaning of (24a) without the difference argument is represented in (27).

(26) ⟦-er⟧(D1)(D2) = 1 iff ∃d[d > 0 ∧ the max d2[D2(d2)] = d + the max d1[D1(d1)]]

(27) ∃d[d > 0 ∧ the max d2[tall′(d2, Louise)] = d + the max d1[tall′(d1, Otto)]]

Returning to comparative conditionals, Beck points out that they share many similarities with regular conditionals. First, like regular conditionals, the default quantifier of a comparative conditional is a universal quantifier. Second, like adverbs of quantification in regular conditionals, the quantifier in comparative conditionals may bind different variables, such as individual, time or world variables. For instance, in (28), (29) and (30), what is quantified over is a world variable, an individual variable and a time variable, respectively.

(28) a. The better Otto is prepared, the better his talk will be.
     b. ∀w1,w2[Otto is better prepared in w1 than in w2] → [Otto’s talk is better in w1 than in w2]

(29) a. The slimmer an attorney looks, the more successful he is.
     b. ∀x,y[attorney(x) ∧ attorney(y) ∧ x looks slimmer than y] → [x is more successful than y]

(30) a. The hotter it is, the more tired Uli is.
     b. ∀t1,t2[it was hotter at t1 than it was at t2] → [Uli was more tired at t1 than he was at t2]

Third, also like regular conditionals, though a universal operator is the default quantifier, an overt adverb of quantification may take over its job. This is illustrated in (31).

(31) a. The stronger a climber is, the better he usually is.
     b. MOST x,y[climber(x) ∧ climber(y) ∧ x is stronger than y][x is better than y]
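The maximal-degree semantics in (25)–(26) and the quantification over pairs in (29b) and (31b) can be made concrete with a small model. The sketch below is an illustration under stated assumptions, not Beck’s (1997) formalization: degrees are modelled as integers on a hypothetical finite scale, a degree predicate is taken to hold of every degree up to an entity’s measure, MOST is read as ‘more than half of the pairs satisfying the restrictor’, and all names and figures (`reaches`, the climbers, the heights) are invented for the example.

```python
from itertools import permutations
from typing import Callable, Iterable, List, Tuple

DegreePredicate = Callable[[int], bool]
DEGREES = range(0, 301)  # hypothetical finite integer scale of degrees


def reaches(measure: int) -> DegreePredicate:
    """Degree predicate true of every degree up to `measure` (an assumption)."""
    return lambda d: d <= measure


def max_degree(D: DegreePredicate) -> int:
    """'the max d[D(d)]': the greatest degree on the scale satisfying D."""
    return max(d for d in DEGREES if D(d))


def er3(D1: DegreePredicate, diff: int, D2: DegreePredicate) -> bool:
    """(25): max d2[D2(d2)] = diff + max d1[D1(d1)]."""
    return max_degree(D2) == diff + max_degree(D1)


def er2(D1: DegreePredicate, D2: DegreePredicate) -> bool:
    """(26): the difference argument is existentially closed, with d > 0."""
    return any(d > 0 and er3(D1, d, D2) for d in DEGREES)


Relation = Callable[[str, str], bool]


def restricted_pairs(domain: Iterable[str], restrictor: Relation) -> List[Tuple[str, str]]:
    """All ordered pairs of distinct individuals that satisfy the restrictor."""
    return [(x, y) for x, y in permutations(domain, 2) if restrictor(x, y)]


def every(domain: Iterable[str], restrictor: Relation, scope: Relation) -> bool:
    return all(scope(x, y) for x, y in restricted_pairs(domain, restrictor))


def most(domain: Iterable[str], restrictor: Relation, scope: Relation) -> bool:
    pairs = restricted_pairs(domain, restrictor)
    return sum(scope(x, y) for x, y in pairs) > len(pairs) / 2


# (24a) with invented heights in cm: Otto 180, Louise 183.
otto_tall, louise_tall = reaches(180), reaches(183)
print(er3(otto_tall, 3, louise_tall), er2(otto_tall, louise_tall))

# (31) with invented strength and skill scores for three climbers.
strength = {"Ann": 90, "Ben": 70, "Cem": 50}
skill = {"Ann": 80, "Ben": 60, "Cem": 65}


def stronger(x: str, y: str) -> bool:
    return er2(reaches(strength[y]), reaches(strength[x]))


def better(x: str, y: str) -> bool:
    return er2(reaches(skill[y]), reaches(skill[x]))


every_holds = every(strength, stronger, better)
most_holds = most(strength, stronger, better)
print(every_holds, most_holds)
```

In this toy model one pair (Ben, Cem) satisfies the restrictor but not the scope, so the universal reading fails while the MOST reading holds, which is the kind of contrast an overt adverb such as usually in (31a) is meant to capture.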

In view of the above similarities between comparative conditionals and regular conditionals, Beck (1997) proposes to adopt Heim’s (1982) analysis of regular conditionals. The only complication is that comparative conditionals additionally involve the semantics of comparison. Beck proposes that both the first and the second clause in a comparative conditional are CP projections and that they form a correlative structure. The comparative morpheme -er projects a functional DegP with the as its specifier, and the whole DegP is raised to the specifier position of CP, as represented below.7

(32) ∀(λw1,w2[the′(w1,w2)(-er′)(λwλd[well′(d, λx[prepared′w(x)])(Otto)])])(λw1,w2[the′(w1,w2)(-er′)(λwλd[good′w(d, Otto’s_talk′)])])

[Tree diagram for (32): the universal quantifier ∀ combines with two CPs, each abstracted over the world pair (λw1,w2). In each CP, a DegP containing the and -er, interpreted as the′(w1,w2), occupies the specifier position, and C′ contains the remainder of the clause: ‘Otto is ti well prepared’ in the first CP and ‘Otto’s talk is tj good’ in the second.]

7 According to Beck, measure phrases which show difference specifications also appear in the specifier position of DegP. Therefore, the morpheme the in a comparative conditional is incompatible with a difference phrase.

In (32), the default quantifier is the implicit universal quantifier, which takes the first clause as its first argument and the second clause as its second argument. The representation in (32) clearly shows that the internal structures of the two clauses in question are exactly the same.
Apart from the above representation, another crucial component of Beck’s analysis is the analysis of the. An important difference between normal comparatives and comparative conditionals is that the former may overtly provide the compared argument, i.e. a than-phrase, but the latter may not. Therefore, it is difficult to represent the comparing and compared arguments in comparative conditionals. Beck’s idea is to let the morpheme the take charge of this. She suggests that the morpheme the takes three arguments, as defined in (33). The first argument is a pair of individual, time or world variables, the second argument is the comparative morpheme -er and the third argument is the

interpretation of C′. Sentence (32) is an example. The first argument is a pair of world variables, which serve as the comparing and compared arguments of the comparative morpheme -er. According to this analysis, we obtain the interpretation (34) for (32).

(33) ⟦the′⟧(w1,w2)(⟦-er′⟧)(D<s,<d,t>>) = 1 iff ∃d[d > 0 ∧ ⟦-er′⟧(D(w1))(d)(D(w2))]

(34) ∀(λw1,w2[the′(w1,w2)(-er′)(λwλd[well′(d, λx[prepared′w(x)])(Otto)])])(λw1,w2[the′(w1,w2)(-er′)(λwλd[good′w(d, Otto’s_talk′)])])
     iff ∀w1,w2[∃d[d > 0 ∧ the max d2[well′(d2, λx[prepared′w2(x)])(Otto)] = d + the max d1[well′(d1, λx[prepared′w1(x)])(Otto)]]] → [∃d′[d′ > 0 ∧ the max d2[good′w2(d2, Otto’s_talk′)] = d′ + the max d1[good′w1(d1, Otto’s_talk′)]]]

The meaning that (34) expresses is as follows: for all worlds w1 and w2, if Otto is better prepared in w2 than in w1, then his talk is better in w2 than in w1. In (34), the variables quantified over are world variables. If (29) is considered instead, the variables quantified over are individual variables. A detailed analysis of that case is omitted here for reasons of space.
According to Beck, the above proposal not only provides a semantics for comparative conditionals, but also has an immediate syntactic consequence concerning the absence of an overt item of comparison. She points out that although standard comparative constructions allow an overt item of comparison, a than-phrase, to appear as in (35), comparative conditionals do not, as is evident from the ungrammatical construction in (36).

(35) Otto is more tired than Hans.
(36) The more tired Otto is than Hans, the more aggressive he is.

According to her, examples such as (36) are ill-formed for the following reason [je/umso/desto are the German counterparts of English the]:

We get the two elements to be compared with the help of the operator denoted by je/umso/desto. This operator allows us to use the information provided by the syntax twice. Thus, to add an item of comparison in the syntax is impossible because we already have one, although one that is not visible as such at S-structure. It is implicitly present with the operator. (Beck 1997: 253)

To put it another way, if the two compared items in (36) are times, the sentence is ruled out because the following paraphrase does not make sense under Beck’s proposal: The more tired Otto is at time t1 than at time t2 than Hans, the more aggressive he is at time t1 than at time t2.

4 MOTIVATING A NEW ANALYSIS OF CHINESE COMPARATIVE CORRELATIVES
From the discussion in the last section, it is clear that Beck’s analysis of English comparative conditionals relies heavily on the semantic analysis of the morphemes the and -er, because they are the two morphemes which provide the source of comparison in comparative conditionals. However, if we compare Chinese comparative correlatives with English comparative conditionals, we find that it is not straightforward to apply Beck’s analysis directly to Chinese comparative correlatives. On the one hand, Chinese lacks a morpheme equivalent to the English morpheme the. On the other hand, it is not clear that yuè ‘more’ is the Chinese counterpart to the English comparative morpheme -er, because it does not appear in comparative constructions, as illustrated in (37).

(37) a. Zhāngsān bǐ Lǐsì gāo
        Zhangsan compare Lisi tall
        ‘Zhangsan is taller than Lisi.’
     b. Zhāngsān bǐjiào gāo
        Zhangsan compare tall
        ‘Zhangsan is taller.’

Despite the superficial differences between Chinese comparative correlatives and comparative conditionals in other languages, Beck (1997) suggests in footnote 14 that the function of the in comparative conditionals is performed by yuè in Chinese comparative correlatives, as it also occurs twice just like the. If this position is adopted and it is further assumed that the Mandarin counterpart of -er is a phonologically null element, then Chinese comparative correlatives would be complete analogues of comparative conditionals in other languages and there would be no problem in applying Beck’s semantics of comparative conditionals to Chinese comparative correlatives. Two arguments that question this line of thinking are presented below.
To begin with, an interesting fact about the positive form of gradable adjectives in Chinese should be pointed out. The positive form of gradable adjectives in Chinese may not stand alone and needs the support of the morpheme hěn ‘very’ in order for the sentence

containing them to be acceptable. This is illustrated by the contrast between (38a) and (38b).

(38) a. Zhāngsān jiànkāng
        Zhangsan healthy
        ‘Zhangsan is healthy.’
     b. Zhāngsān hěn jiànkāng
        Zhangsan very healthy
        ‘Zhangsan is healthy.’

The use of hěn ‘very’ in sentences such as (38b) has been correctly taken as the most neutral ‘positive degree marker’ (see Sybesma 1999 and Xiandai Hanyu Xuci Lishi 1982).8 According to Kennedy (2005), there are two possible universal features of the positive form of gradable adjectives. One is the context-dependent interpretation illustrated by (39) and (40). Whether or not a sentence such as (39) is true depends upon the context in which it is uttered. When the context is about living in various Italian cities, as in (40a), it could be judged true. But the same sentence is false in a context like (40b).

8 According to Xiandai Hanyu Xuci Lishi (1982), hěn is not stressed when functioning as a positive degree marker. When it is stressed, it is interpreted as an intensifier.

(39) The coffee in Rome is expensive.
(40) a. In Rome, even the coffee is expensive.
     b. The rents are high in Rome, but at least the coffee is not expensive.

Kennedy suggests that such variability may be accounted for in terms of a ‘Delineation Function’ which ‘maps a measure function to a degree that represents an appropriate standard of comparison based on the features of the context of utterance’ (also see Lewis 1970; Graff 2000; Barker 2002; Kennedy and McNally 2005; Kennedy 2005). This ‘Delineation Function’ is denoted by a degree morpheme pos, as given below:9

(41) ⟦pos⟧s = λg ∈ D<e,t> λx. g(x) > s(g)

9 According to Kennedy (2005), instead of assuming a covert degree morpheme pos, the same semantics can also be achieved by assuming a lexical type-shifting rule for adjectives.

Kennedy speculates that universally the positive form is not expressed by overt degree morphology, which he takes to be the second possible universal feature of the positive form.
Despite Kennedy’s speculation that there is no overt degree morphology associated with the positive form of gradable adjectives, the obligatory presence of hěn in Chinese sentences such as (38b) seems to falsify this speculation. The first piece of evidence for this claim comes from the fact that ‘hěn + Adj’ displays exactly the same context-dependent interpretation as the positive form of gradable adjectives in English. Thus, like the positive form expensive in (40), the standard of comparison of hěn + qióng ‘very + poor’ in (42a) and (42b) is different.

(42) a. tā hěn qióng, lián chī fàn de qián dōu méi yǒu
        he very poor even eat meal Rel money all not have
        ‘He is poor. He does not even have money to eat meals.’

A second piece of evidence can be adduced by crisp judgments about borderline cases, i.e. objects for which it is unclear whether or not the predicate holds. According to Kennedy (2005), ‘a feature of the delineation function is that it is constrained to return to a value that counts as a significant degree of the relevant property in the context of utterance’. This feature predicts a difference in acceptability when the context involves very slight differences between the compared objects. For example, in the context of (43A), both the explicit and implicit comparisons are acceptable, but in the context of (43B), the positive form, i.e. the implicit comparison, is not acceptable. (43)

Context A: A 600 word essay and a 200 word essay This essay is longer than that one. Compared to that essay, this one is long. Context B: A 600 word essay and a 590 word essay a. This essay is longer than that one. b. ??Compared to that essay, this one is long. a. b.

Significantly, the Chinese sentence below is also unacceptable in the context of (43B) but is acceptable in the context of (43A). (44) Gen with weńzhang article ‘Compared

na`-pian weńzhang bıˇ qıˇ-laí, zhe`-pian that-Cl article compare getting this-Cl (heˇn) chańg very long to that article, this article is long.’

The above fact indicates that heˇn in Chinese is indeed an overt comparison degree morphology that expresses the ‘Delineation Function’.


b. ta heˇn qiońg, liań jiaoche d ou ma˘i-bu`-qıˇ he very poor even sedan all buy-not-afford ‘He is poor. He cannot even afford a sedan.’
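The contrast in (43) between explicit and implicit comparison can be made concrete with a small sketch. It is only an illustration under stated assumptions: the word counts come from (43), but the names `length`, `compared_to` and `MARGIN`, and the idea of modelling a ‘significant degree’ as a fixed margin of 100 words above the compared object, are hypothetical choices and are not part of Kennedy’s or the present analysis.

```python
from typing import Callable

# A measure function maps an entity to a degree (here: an essay to its word count).
Measure = Callable[[str], float]

# Word counts taken from the two contexts in (43); the keys are hypothetical labels.
WORD_COUNT = {"this": 600.0, "that_A": 200.0, "that_B": 590.0}


def length(essay: str) -> float:
    """Measure function g: the length of an essay in words."""
    return WORD_COUNT[essay]


def longer_than(g: Measure, x: str, y: str) -> bool:
    """Explicit comparison: true whenever g(x) exceeds g(y), however slightly."""
    return g(x) > g(y)


def pos(g: Measure, standard: float) -> Callable[[str], bool]:
    """Positive form as in (41): lambda x. g(x) > s(g), with s(g) passed in as `standard`."""
    return lambda x: g(x) > standard


def compared_to(g: Measure, y: str, margin: float) -> float:
    """ASSUMPTION: the contextual standard set by 'compared to y' is g(y) plus a
    margin, so that only a significant difference counts."""
    return g(y) + margin


MARGIN = 100.0  # hypothetical threshold for a 'significant' difference

long_in_A = pos(length, compared_to(length, "that_A", MARGIN))("this")
long_in_B = pos(length, compared_to(length, "that_B", MARGIN))("this")
```

Under these assumptions the explicit comparative holds in both contexts, whereas the positive form holds only in Context A, which mirrors the judgments reported for (43) and (44).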


In addition to hěn, there are other degree comparison morphemes such as gèng ‘more’, or its variant gèngjiā, and bǐjiào ‘more’, or its variant jiàowéi, as the following sentences illustrate.

(45) Gēn Zhāngsān bǐ qǐlái, Lǐsì bǐjiào xiǎoqì
     with Zhangsan compare getting Lisi more stingy
     ‘Compared to Zhangsan, Lisi is stingier.’

(46) Gēn Zhāngsān bǐ qǐlái, Lǐsì gèng xiǎoqì
     with Zhangsan compare getting Lisi more stingy
     ‘Compared to Zhangsan, Lisi is stingier, (though both are stingy).’

The meaning of bǐjiào ‘compare’ in (45) is very close to the English comparison morpheme -er. Sentence (45) is true as long as Lisi is stingier than Zhangsan, who does not have to be stingy at all. In contrast, for (46) to be true, both Zhangsan and Lisi must be stingy according to the standard of comparison, but Lisi is even stingier than Zhangsan. As can be seen from the above, Chinese has different degree morphemes that compare two objects, or one object and a contextual standard of comparison, and say that one object has a higher degree than the other object or the contextual standard with respect to the property denoted by the adjective. More significantly, however, these different degree comparison morphemes cannot co-occur. Thus, the following sentences are all ungrammatical regardless of the word order of the degree comparison morphemes.

(47) a. Gēn Zhāngsān bǐ qǐlái, Lǐsì hěn bǐjiào/bǐjiào hěn xiǎoqì
        with Zhangsan compare getting Lisi very more/more very stingy
        ‘Compared to Zhangsan, Lisi is much stingier.’
     b. Gēn Zhāngsān bǐ qǐlái, Lǐsì hěn gèng/gèng hěn xiǎoqì
        with Zhangsan compare getting Lisi very more/more very stingy
        ‘Compared to Zhangsan, Lisi is even much stingier, (though both are stingy).’
     c. Gēn Zhāngsān bǐ qǐlái, Lǐsì gèng bǐjiào/bǐjiào gèng xiǎoqì
        with Zhangsan compare getting Lisi more more/more more stingy
        ‘Compared to Zhangsan, Lisi is even much stingier.’

Returning to the Chinese comparative correlatives, we find that the morpheme yuè may not occur with other degree comparison morphemes, as is shown by (48).

(48) Píngguǒ yuè hěn/gèng/bǐjiào dà, . . .
     apple YUE very/more/more big
     lit. ‘The bigger an apple is (than another one which is also big), . . . ’

The easiest way to account for the unacceptability of (48) is to assume that yuè itself is a degree comparison morpheme just like hěn, gèng and bǐjiào. Therefore, the unacceptability of (48) simply falls under the same generalization as the sentences in (47); namely, one sentence does not allow two degree comparison morphemes. In contrast, if we assume that yuè is an equivalent to the English morpheme the, as Beck has suggested, and that the comparative correlatives in Chinese involve a null -er, at least two difficult problems arise. The first one is that the ungrammaticality of (48) will immediately become a puzzle, because we can no longer explain the ungrammaticality of (48) in the same way that we can account for the unacceptability of the examples in (47). The second problem is that Beck’s proposal raises the question of why the null comparison degree morpheme is unlike the other overt comparison degree morphemes with respect to the compatibility with yuè. In view of these problems, it can be concluded that the treatment of yuè as a counterpart to the English the and the postulation of a null -er for Chinese comparative correlatives raise more questions than does treating yuè itself as a comparison degree morpheme. This area is treated in more detail later in this paper.

A second problem with Beck’s suggestion to extend her analysis of comparative conditionals to include Chinese comparative correlatives is related to the co-occurrence problem between the (or its equivalents in other languages) and an overt item of comparison. As noted, one property of comparative conditionals in English is the incompatibility between the morpheme the and an overt item of comparison, as seen in (49), reproduced below.

(49) The more tired Otto is than Hans, the more aggressive he is.

Beck’s explanation of this, as already stated, is that a than-phrase is impossible in comparative conditionals because an overt item of comparison is already implicitly present in the semantics, i.e., the pair of variables that serve as the first argument of the. Now if the Chinese yuè did take on the job of the and had the same semantics as the, as Beck has suggested in her footnote, the prediction would be that yuè would be incompatible with an overt item of comparison. Unfortunately, this prediction is not borne out. In Chinese, yuè may naturally occur with an overt item of comparison, as McCawley (1988: 183) and Liu (2006) have already pointed out. One example to illustrate this point is (50).

(50) nǐ yuè bǐ tā kuàilè, tā jiù yuè tòngkǔ
     you more compare he happy he then more painful
     lit. ‘The happier you are than he, the more painful he is.’
     ‘The more your happiness exceeds his, the more painful it is for him.’

Sentence (50) clearly indicates that Beck’s suggested treatment of yuè in Chinese as the equivalent of the in English is problematic, because the semantics of the that she has proposed wrongly excludes examples such as (50). Yuè is actually an adverb adjoined to a VP/AP, indicating an increasing degree of the property denoted by the VP/AP. It alone should be responsible for what the and -er together do in English comparative conditionals; namely, in Chinese comparative correlatives, yuè alone should undertake the responsibility of providing the two items that are being compared.

From the above, it can be seen that there are difficulties in adopting Beck’s suggested treatment of yuè as the Chinese equivalent of English the and in assuming a null -er in Chinese. It is also suggested that yuè alone should be responsible for what the and -er together do in English comparative conditionals. In what follows, it will be shown that a possible modification of Beck’s analysis in which the meanings of the and -er are collapsed, as one anonymous reviewer has suggested, encounters the same problem. Let us assume that the in English is semantically vacuous and that the comparative morpheme -er in English has a meaning similar to that of (51), which collapses the functions of the original the and -er and provides conditions equivalent to the ones originally derived.

(51) $[\![\text{-er}]\!](w_1,w_2)([\![D_{\langle s,\langle d,t\rangle\rangle}]\!])$ iff $\exists d[d > 0 \wedge \max(\lambda d.D(w_1)(d)) = d + \max(\lambda d.D(w_2)(d))]$

This proposal seems more likely to allow the meaning of yuè in Chinese to be identified with that of -er in English. The problem with the above modification of Beck’s original proposal is that, just like the original proposal, the semantics in (51) excludes an overt item of comparison from being present, because the items of comparison are already given by the semantics of -er. Therefore, we cannot say that yuè in Chinese has the same semantics as -er in (51).
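To see why (51) leaves no room for a than-phrase, it may help to spell out its truth conditions as a small sketch. This is an illustration only: the toy worlds `w1` and `w2`, the degree scale `DEGREES` and the tiredness values are hypothetical, and degrees are simplified to integers.

```python
from typing import Callable, Iterable

# D has type <s,<d,t>>: given a world, it returns a property of degrees.
DegreeProp = Callable[[str], Callable[[int], bool]]


def max_degree(D: DegreeProp, w: str, degrees: Iterable[int]) -> int:
    """max(lambda d. D(w)(d)): the greatest degree d such that D(w)(d) holds."""
    return max(d for d in degrees if D(w)(d))


def er(w1: str, w2: str, D: DegreeProp, degrees: Iterable[int]) -> bool:
    """Truth conditions of (51): there is a d > 0 with max in w1 = d + max in w2.
    Over numeric degrees this is equivalent to a positive difference of the maxima."""
    degrees = list(degrees)
    return max_degree(D, w1, degrees) - max_degree(D, w2, degrees) > 0


# Hypothetical model: how tired Otto is in two worlds.
TIREDNESS = {"w1": 7, "w2": 4}
DEGREES = range(0, 11)


def tired_otto(w: str) -> Callable[[int], bool]:
    """Otto is d-tired in w iff d does not exceed his actual tiredness in w."""
    return lambda d: d <= TIREDNESS[w]
```

Both items of comparison, `w1` and `w2`, are already arguments of `er`, so the function has no further slot that an overt item of comparison, such as the bǐ-phrase in (50), could fill.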

In addition, the proposal of (51) has its own drawbacks, as the anonymous reviewer has pointed out:

(i) We can no longer keep the meaning of the comparative morpheme uniform in different comparative constructions; for instance, the -er in standard comparatives is different from the -er in comparative conditionals.
(ii) The modification will lose Beck’s original account for the impossibility of comparative conditionals to include measure phrases.
(iii) German data like (52), discussed by Beck (1997: 254), imply a semantic contribution of umso, the German equivalent of English the.

(52) Gestern war es kühl. Heute ist es umso heisser
     Yesterday was it cool. Today is it the hotter
     ‘It was cool yesterday. Today it’s all the more hot.’

So it seems that if the spirit of Beck’s proposal is to be retained, there will be no easy way to unify Chinese comparative correlatives and English comparative conditionals or their counterparts in other languages. Therefore, an alternative approach is proposed here. First, a semantics of yuè as it occurs in Chinese comparative correlatives is provided, and then the possibility of extending the proposed analysis of yuè to English -er as it occurs in ‘the Adj-er . . . the Adj-er . . . ’ constructions is explored. Meanwhile, a novel proposal for the function of the will be offered which might explain why the morpheme the as it occurs in English comparative conditionals is incompatible with an overt item of comparison.

5 A PROPOSAL OF CHINESE COMPARATIVE CORRELATIVES

Before analysing the Chinese comparative correlatives in detail, the assumptions made need to be clearly stated. As noted, according to Heim (1982) and Lewis (1975), the semantics of conditionals involves a quantificational structure. This approach is also adopted for Chinese comparative correlatives, because the similarities that Beck observed between English comparative conditionals and regular conditionals are also found in this study between Chinese comparative correlatives and regular conditionals. In addition, it is helpful to explain von Fintel’s (1994) use of the extension relation in conditionals. According to von Fintel (1994), the meaning of a proposition refers to those situations in which the proposition is true, and adverbs of quantification may quantify over minimal situations. With this assumption, the meaning of ‘Qadv, if P, Q’ is roughly as follows:

(53) $[\![Q_{adv}(P)(Q)]\!] = Q_{adv}\, s_{min}\,[P \text{ is true in } s]\,[\exists s'[s < s' \wedge Q \text{ is true in } s']]$

What (53) indicates is this: For Qadv-many minimal situations s in which P is true, there is an extension from s to s′ such that Q is true in s′. A few remarks on the notion of extension are in order here. According to von Fintel, under the notion of minimal situations, if the P-situation is a minimal situation, then it is not possible that the minimal P-situation is also a Q situation. Therefore, the notion of extension is necessary in the semantics of conditionals. The use of the notion of ‘extension’ has another important role to play in (53). According to Kratzer (1995), natural language does not allow vacuous quantification, whose definition is given in (54).

(54) Prohibition Against Vacuous Quantification
     For every quantifier Q, there must be a variable x such that Q binds an occurrence of x in both its restrictive clause and its nuclear scope. (Kratzer 1995: 131)

In (53) the use of extension provides the situation variable s in the nuclear scope so that the quantifier Qadv may bind one occurrence of s in the restrictive clause and another occurrence of s in the nuclear scope.

Another assumption that is adopted is that adjectives have an additional degree argument and verbs also have an additional degree or quantity argument. It is not controversial to assume that adjectives have a degree argument,10 but the assumption that verbs have a degree or quantity argument is more controversial. Here, Doetjes (1997), who argues for an extra degree or quantity position on the basis of the distribution of degree adverbs in French, Dutch and English, is pertinent. Doetjes (1997) has argued that the degree position of a predicate is dependent on the event position. This paper also adopts this assumption, but uses situation variables instead of event variables. The variable g is used to represent a verb’s degree or quantity argument without distinguishing them. Both have the semantic type d, which is reminiscent of the word degree. Based on this assumption, an adjective like gāoxìng ‘happy’ or a verb like zǒu ‘walk’ denotes $\lambda x\lambda g\lambda s.\text{walk}'(x)(g)(s)$ of semantic type $\langle e,\langle d,\langle s,t\rangle\rangle\rangle$, where $\text{walk}'(x)(g)(s)$ means that x has done a g-quantity of walking in situation s.

10 For instance, see Cresswell (1976), von Stechow (1984), Heim (2001), among others.

Third, it is assumed that in ‘yuè . . . yuè . . . ’ constructions the first clause is mapped to the restriction of a possibly covert quantifier and the second clause to the nuclear scope.11 Moreover, just as the morphemes the and -er in Beck’s analysis of comparative conditionals take scope over the restriction and nuclear scope, the two yuè’s are adjoined to the restriction and the nuclear scope, respectively, in logical form, and the possibly empty jiù is adjoined to the top of the nuclear scope. Based on the above assumptions, Chinese comparative correlatives can now be analysed. Take (1), reproduced below, for instance. Its logical form looks like (55).

(1) nǐ yuè shēngqì, tā (jiù) yuè gāoxìng
    you more angry he then more happy
    ‘The more angry you are, the happier he is.’

(55)

In addition to the above logical form, two other important features of the Chinese ‘yuè . . . yuè . . . ’ constructions which need to be analysed are the semantics of yuè and jiù. It is proposed that yuè is a degree adverb whose main semantic functions, in its simplest case, are to compare two degree arguments and to claim that one degree in situation s1 is greater than the other degree in situation s2. That is, in the simplest case, the argument of yuè is a relation between a degree and a situation with the form $\lambda g\lambda s.P(g)(s)$ of type $\langle d,\langle s,t\rangle\rangle$. When yuè takes such an argument, the $\langle d,\langle s,t\rangle\rangle$ expression in question will apply to a different degree and situation variable twice, producing two propositions and claiming that one degree in a situation is greater than the other degree in another situation, as is shown in (56).

11 Liu (2006) has made a similar suggestion for comparative correlatives. However, his discussion of the semantics of comparative correlatives remains descriptive and informal. Therefore, this paper will not make substantial comments on his unpublished conference presentation.


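(56) $[\![\textit{yuè}]\!] = \lambda P_{\langle d,\langle s,t\rangle\rangle}\lambda g_1\lambda g_2\lambda s_1\lambda s_2[P(g_1)(s_1) \wedge P(g_2)(s_2) \wedge g_2 > g_1]$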
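The lexical entry in (56) can be read as a higher-order function, and a short sketch may make the order of arguments easier to follow. The toy model is hypothetical: the situation labels, the `ANGER` table and the decision to treat ‘you are g-degree angry in s’ as exact equality of degrees are assumptions made for illustration only.

```python
from typing import Callable

# Type <d,<s,t>>: a curried relation between a degree and a situation.
DegSitRel = Callable[[float], Callable[[str], bool]]


def yue(P: DegSitRel):
    """(56): lambda P. lambda g1 g2 s1 s2 [P(g1)(s1) and P(g2)(s2) and g2 > g1]."""
    return lambda g1: lambda g2: lambda s1: lambda s2: (
        P(g1)(s1) and P(g2)(s2) and g2 > g1
    )


# Hypothetical model: the degree of your anger in two situations.
ANGER = {"s1": 3.0, "s2": 7.0}


def angry_you(g: float) -> Callable[[str], bool]:
    """lambda g. lambda s. angry'(you')(g)(s), with exact degrees (an assumption)."""
    return lambda s: ANGER.get(s) == g


# Applying yue to the clause meaning yields a relation over two degrees and two situations.
yue_angry = yue(angry_you)
```

With these values, `yue_angry(3.0)(7.0)("s1")("s2")` holds, while swapping the degrees or the situations makes it fail, because either the ordering `g2 > g1` or one of the two propositions is no longer satisfied.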

However, the argument of yuè is not always a relation between a degree argument and a situation. It can also be a relation between individuals, degrees, and situations, i.e. an expression of type $\langle e,\langle d,\langle s,t\rangle\rangle\rangle$. In such cases, the semantics of yuè requires that the $\langle e,\langle d,\langle s,t\rangle\rangle\rangle$ expression applies to two different individuals, two different degrees and two different situations, again producing two propositions and claiming that one degree of a given property of the first individual in a situation is greater than the other degree of the same given property of another individual in another situation. This is illustrated in (57).

(57) $[\![\textit{yuè}]\!] = \lambda P_{\langle e,\langle d,\langle s,t\rangle\rangle\rangle}.\lambda x\lambda y\lambda g_1\lambda g_2\lambda s_1\lambda s_2[P(x)(g_1)(s_1) \wedge P(y)(g_2)(s_2) \wedge g_2 > g_1]$

In other words, the argument of yuè actually does not have a fixed semantic type but a flexible one. Despite the flexibility of the semantic type of the argument of yuè, the different cases can be unified under one single generalization as proposed in (58), where each α stands for a semantic type.

(58) $[\![\textit{yuè}]\!] = \lambda P_{\langle\alpha_1,\ldots\langle d,\ldots\langle\alpha_n,\langle s,t\rangle\rangle\rangle\rangle}\lambda\alpha_1'\lambda\alpha_1'' \ldots \lambda d_1\lambda d_2 \ldots \lambda\alpha_n'\lambda\alpha_n''\lambda s_1\lambda s_2[P(\alpha_1') \ldots (d_1) \ldots (\alpha_n')(s_1) \wedge P(\alpha_1'') \ldots (d_2) \ldots (\alpha_n'')(s_2) \wedge d_2 > d_1]$

As for the word jiù, to understand its function in a sentence, let us consider more contexts in which it occurs:

(59) a. rúguǒ xiàyǔ, wǒ jiù bú qù
        if rain I JIU not go
        ‘If it rains, then I will not go.’
     b. yīnwèi xiàyǔ, suǒyǐ wǒ jiù méi qù shàng kè
        because rain so I JIU not go attend class
        ‘Because it rained, I therefore didn’t go to the class.’
     c. zhǐyào nǐ qù, wǒ jiù qù
        as-long-as you go I JIU go
        ‘As long as you go, I will go.’
     d. wǒ cái gāng dào, tā jiù yào zǒu
        I CAI just-now arrive he JIU want leave
        ‘As soon as I just arrived, he wanted to leave.’
     f. cóng dì yī cì jiànmiàn, wǒ jiù xǐhuan tā
        from the first time meet I JIU like him
        ‘I have liked him since I met him for the first time.’

All the examples above seem to involve causation to varying degrees. Causation is apparent in the first three sentences in (59) but is less obvious in (59d) and (59f). In (59d), my arriving can be interpreted as the cause of his leaving, perhaps because he does not like me.12 As for (59f), let us imagine a context like the following:

    Mary met John at a conference. He is a man of great learning and is very handsome. Since their meeting at the conference, Mary has liked John.

In the above context, it is hard to say that the meeting event itself causes Mary to like John. Instead, it is John’s appearance and his knowledge that impressed Mary and made her like him. This example is a good case to show that it is not necessarily the case that when two events e1 and e2 are linked by jiù, e1 must directly cause e2. It can be the case that the properties of the participants of e1 cause the occurrence of e2. Nevertheless, a certain relation, which might be pragmatically determined, still must exist between e1 and e2 in order to make it possible for the properties of the participants of the event to trigger the causation. It is argued that the obvious cases of causation as witnessed by (59a–c) should be analysed the same way, namely, the causation should be derived from the relation that jiù conveys. Although the content of this relation cannot be pinned down, this relation will be referred to as relation R, from which the causation is derived.

It is believed that the jiù as it occurs in Chinese comparative correlatives is the same jiù as the one in (59). Thus, comparative correlatives have the same kind of causation meaning as the examples in (59). This causation meaning explains why the examples in (60), as opposed to those in (61), are not acceptable sentences.13 Sentence (60a) sounds odd, because it is impossible for there to be a causal relation (at least in the actual world) between Zhangsan’s scolding Lisi and Lisi’s growing tall.14 Similarly, (60b) is unacceptable because normally a person’s mental activity has no causal relation to the change of weather. In contrast, the examples in (61) are all acceptable because in each case a reasonable causal relation can be established between the first and the second clauses.

(60) a. Zhāngsān yuè mà, Lǐsì jiù yuè gāo
        Zhangsan more scold Lisi then more tall
        ‘The more Zhangsan scolded (Lisi), the taller Lisi is.’
     b. wǒ xiǎng zhèi ge wèntí yuè jiǔ, yǔ jiù xià de yuè dà
        I think this Cl question more long rain then fall DE more heavy
        ‘The more I think about this question, the heavier the rain is.’

(61) a. Zhāngsān yuè mà, Lǐsì jiù yuè bù lǐ tā
        Zhangsan more scold Lisi then more not pay-attention-to him
        ‘The more Zhangsan scolds Lisi, the more Lisi does not want to pay attention to him.’
     b. wǒ xiǎng zhèi ge wèntí yuè jiǔ, xīnqíng jiù yuè jǔsàng
        I think this Cl question more long mood then more depressed
        ‘The more I think about this question, the more depressed I feel.’
     c. wūyún yuè duō, yǔ jiù xià de yuè dà
        dark-cloud more many rain then fall DE more heavy
        ‘The more dark clouds the sky has, the heavier the rain will be.’

12 Sentence (59d) can also be interpreted in such a way that it is an accident that his leaving is close to my arriving. On this reading, it is the short distance between two time intervals that is behind the relation of the two propositions involved. It is not clear to me whether this interpretation of jiù is the same as that of the jiù as it occurs in (59a–c), because jiù clearly also has a non-causation meaning, as in the examples in (i). It is difficult to pin down what jiù in the examples in (i) below means and therefore it is not possible to discuss this issue.

(i) a. sān ge jiù gòu le
       three Cl JIU enough Par
       ‘Three is enough.’
    b. tā bā diǎn jiù dào le
       he eight o’clock JIU arrive Par
       ‘He arrived as early as eight o’clock.’

13 A reviewer asks why the two examples in (60) remain ungrammatical if jiù is taken out. It is assumed that when the overt jiù does not surface, the construction still contains an implicit equivalent of jiù. Therefore, the causation flavor is always there, no matter whether jiù surfaces in syntax.

14 The analysis predicts that examples like (60) in fantastic contexts should become acceptable when the causal relation is forced. This is indeed the case. Suppose that the context of Alice in Wonderland is (i), where the scolding of the rabbit is a certain kind of spell which causes the person being scolded by the rabbit to become taller for each scolding. Then the Chinese comparative correlative in (ii) is indeed acceptable.

(i) The more the rabbit scolded Alice, the taller she became.
(ii) Tùzi yuè mà Àilìsī, Àilìsī jiù biàn de yuè gāo
     rabbit more scold Alice Alice then become DE more tall
     ‘The more the rabbit scolded Alice, the taller Alice becomes.’

In view of the data in (60)-(61), it can be proposed that jiù in Chinese comparative correlatives links the degree arguments in the first and second clauses through a relation R from which the causation meaning is derived. To be more precise, it is proposed that a ‘yuè . . . (jiù) yuè . . . ’ construction as a whole has a syncategorematic meaning as in (62), which puts the causality into the meaning of the construction through the pragmatically determined relation R:15

In (62), the condition $R(\langle d_1,s_1\rangle,\langle d_3,s_3\rangle)$ is intended to mean that the degree d1 in the situation s1 has a relation R to the degree d3 in the situation s3. Since the relation R may give rise to a causation meaning, the meaning obtained is that the degree d1 in the situation s1 has caused the degree d3 in the situation s3. To illustrate the application of (62), let us reconsider sentence (1). According to the semantic rules discussed above, the step-by-step computation of (1) is (63).

(1) nǐ yuè shēngqì, tā (jiù) yuè gāoxìng
    you more angry he JIU more happy
    ‘The angrier you are, the happier he is.’

(63) a. $[\![[_{AP}\ \textit{shēngqì}]]\!] = \lambda x\lambda g\lambda s.\text{angry}'(x)(g)(s)$
     b. $[\![[_{IP}\ \textit{nǐ shēngqì}]]\!] = \lambda g\lambda s.\text{angry}'(\text{you}')(g)(s)$
     c. $[\![[_{CP}\ \textit{yuè nǐ shēngqì}]]\!] = \lambda P_{\langle d,\langle s,t\rangle\rangle}\lambda g_1\lambda g_2\lambda s_1\lambda s_2[P(g_1)(s_1) \wedge P(g_2)(s_2) \wedge g_2 > g_1](\lambda g\lambda s.\text{angry}'(\text{you}')(g)(s))$
        $= \lambda g_1\lambda g_2\lambda s_1\lambda s_2[\text{angry}'(\text{you}')(g_1)(s_1) \wedge \text{angry}'(\text{you}')(g_2)(s_2) \wedge g_2 > g_1]$
     d. $[\![[_{CP}\ \textit{yuè tā gāoxìng}]]\!] = \lambda g_3\lambda g_4\lambda s_3\lambda s_4[\text{happy}'(\text{he}')(g_3)(s_3) \wedge \text{happy}'(\text{he}')(g_4)(s_4) \wedge g_4 > g_3]$

15

For the sake of simplicity, jiu` has been treated as a syncategorematic item without its own semantic value. It is possible to assign jiu` an independent meaning as in (i). ½½jiu` ¼ k§ >> k< >> ½½§(x1#) (x1$) . . . (g1) (g2) . . . (xn#)(xn$)(s1)(s2) ] / [dd3,d4,s3,s4[s1 < s3 ^ s2 < s4 ^ R(,) ^ R(,) ^ d1] 0 dd3,d4,s3,s4[s1 < s3 ^ s2 < s4 ^ R(,) ^ R(,) ^ Q(a1#) . . . (d3) . . . (an#)(s3) ^ Q(a1$) . . . (d4) . . . (an$)(s4) ^ d4 > d3]


     e. $[\![\forall\ \textit{yuè nǐ shēngqì jiù yuè tā gāoxìng}]\!] = \forall g_1,g_2,s_1,s_2[\text{angry}'(\text{you}')(g_1)(s_1) \wedge \text{angry}'(\text{you}')(g_2)(s_2) \wedge g_2 > g_1] \rightarrow \exists g_3,g_4,s_3,s_4[s_1 < s_3 \wedge s_2 < s_4 \wedge R(\langle g_1,s_1\rangle,\langle g_3,s_3\rangle) \wedge R(\langle g_2,s_2\rangle,\langle g_4,s_4\rangle) \wedge \text{happy}'(\text{he}')(g_3)(s_3) \wedge \text{happy}'(\text{he}')(g_4)(s_4) \wedge g_4 > g_3]$

What (63) says is this: For every g1, g2, s1, s2 such that you are g1-degree angry in s1 and g2-degree angry in s2, and g2 is greater than g1, there exist a g3, g4, s3 and s4 such that s3 is an extended situation of s1 and s4 an extended situation of s2, and he is g3-degree happy in s3 and g4-degree happy in s4, and g4 is greater than g3. Moreover, g1 in s1 has an R relation to g3 in s3 and g2 in s2 has an R relation to g4 in s4. From the R relation the causation meaning of the sentence can supposedly be derived. A great advantage of the above semantics is that it captures the corresponding or resulting relation between the degree arguments in the first and second clauses. This corresponding or resulting relation is expressed via the suggested relation R. As a result, this analysis is immune from the criticisms of Beck’s (1997) analysis that she has discussed in her paper.17

Before further analysing Chinese comparative correlatives, it is necessary to discuss one concern that has been raised by a reviewer about the notion of causation used here and universal quantification over degrees. Suppose that we have a context like (64a) below and the Chinese counterpart to (64b) is (65a). Based on the proposed analysis of yuè, the truth conditions of (65a) are roughly (65b).

(64) a. When we only have one special offer in our store, no retired people come to shop here. They prefer stores that offer considerably more deals. But when we have at least five special offers, they start coming. When we have more than five offers, retirees come in lots.
     b. The more special offers we have, the more retirees come to shop here.

17 Beck (1997: 245) writes:

    Many people I have presented this to have complained that this is too weak. The suggestion is that (15) means something like: if there is a positive difference in how well Otto is prepared in w1 vs. w2, then there must also be a corresponding or resulting positive difference in the quality of his presentation in w1 v. w2. So, the two difference degrees (the one in the antecedent clause and the one in the consequent) should somehow be related: Either they ought to be identical, or proportional, or the second should be functionally dependent on the first.

Beck has presented an argument against the above criticism. However, her argument will not be discussed in detail in this paper. Interested readers are referred to her paper directly.

(65) a. wǒmén tígōng yuè duō tèjiàpǐn, jiù huì yǒu yuè duō tuìxiūrényuán lái zhèlǐ gòuwù
        we provide more many deal then will have more many retirees come here shop
        ‘The more special offers we have, the more retirees come to shop here.’
     b. $\forall g_1,g_2,s_1,s_2[\text{we have } g_1\text{-many special offers in } s_1 \wedge \text{we have } g_2\text{-many special offers in } s_2 \wedge g_2 > g_1] \rightarrow \exists g_3,g_4,s_3,s_4[s_1 < s_3 \wedge s_2 < s_4 \wedge R(\langle g_1,s_1\rangle,\langle g_3,s_3\rangle) \wedge R(\langle g_2,s_2\rangle,\langle g_4,s_4\rangle) \wedge g_3\text{-many retirees come to shop here in } s_3 \wedge g_4\text{-many retirees come to shop here in } s_4 \wedge g_4 > g_3]$

The reviewer says that (65a) is wrongly predicted to be false in the context of (64a), because there is no degree g3 that was caused by g1 for 4 > g1 > 1. However, this problem is not a serious one, because it is arguably another instance of an old problem related to domain restriction of quantification. The domain restriction of quantification can be illustrated by an English sentence such as Every student went to the party. Here, the universal determiner quantifies over the domain of students. However, in order for the sentence in question to be true, it need not be the case that all the students in the whole world went to the party. It is only those students who are relevant, for example those students in my department, that count. A student who does not belong to my department and who didn’t go to the party is not able to falsify the universal statement. So natural language quantification often involves an implicit restriction of the domain of quantification, and (65a) is no exception.18 If this is correct, then the problem that the reviewer has raised can be accounted for: namely, those g1’s such that 4 > g1 > 1 are irrelevant degrees that are not in the domain of restriction. In what follows, in order to simplify the proposed analysis, which is already very complicated, the issue of implicit domain restriction will be ignored.

To further support the semantics of yuè in (58), sentence (66), in which time can be taken to play a role, should be considered. If it is supposed that stage-level predicates such as rè ‘hot’ and bùshūfú ‘uncomfortable’ have a time argument in addition to the individual and degree argument, then the truth conditions of (66) would be computed as in (67).

18 For a detailed discussion of this issue, the reader is referred to von Fintel (1994).
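The domain-restriction reply to the reviewer’s scenario in (64a) can be checked on a toy model. The sketch below is a deliberate simplification and rests on several assumptions that are not part of the analysis itself: situations are collapsed, the relation R is modelled as a hypothetical function `retirees` from the number of offers to the number of retirees, the concrete numbers are invented, and the relevant domain is taken to start at five offers, following the wording of (64a).

```python
from itertools import product
from typing import Iterable


def retirees(offers: int) -> int:
    """Hypothetical model of (64a): nobody comes with fewer than five offers;
    from five offers on, more offers bring more retirees."""
    return 0 if offers < 5 else 10 * (offers - 4)


def correlative_holds(domain: Iterable[int]) -> bool:
    """Simplified (65b): for every pair g1 < g2 in the domain, the number of
    retirees linked to g2 exceeds the number linked to g1."""
    degrees = list(domain)
    return all(
        retirees(g2) > retirees(g1)
        for g1, g2 in product(degrees, repeat=2)
        if g2 > g1
    )


unrestricted = range(1, 9)                          # all numbers of offers from 1 to 8
restricted = [g for g in unrestricted if g >= 5]    # only the contextually relevant degrees

holds_unrestricted = correlative_holds(unrestricted)
holds_restricted = correlative_holds(restricted)
```

On this model the universal statement fails over the unrestricted domain, because pairs such as two and three offers are both linked to zero retirees, but it holds once the irrelevant low degrees are excluded from the domain of quantification.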


(66) Tiānqì yuè rè, wǒ jiù yuè bùshūfú
     weather more hot I then more uncomfortable
     ‘The hotter the weather is, the more uncomfortable I feel.’

(67) a. $[\![[_{AP}\ \textit{rè}]]\!] = \lambda x\lambda g\lambda t\lambda s.\text{hot}'(x)(g)(t)(s)$
     b. $[\![[_{IP}\ \textit{tiānqì rè}]]\!] = \lambda g\lambda t\lambda s.\text{hot}'(\text{the-weather}')(g)(t)(s)$
     c. $[\![[_{CP}\ \textit{yuè tiānqì rè}]]\!] = \lambda P\lambda g_1\lambda g_2\lambda t_1\lambda t_2\lambda s_1\lambda s_2[P(g_1)(t_1)(s_1) \wedge P(g_2)(t_2)(s_2) \wedge g_2 > g_1](\lambda g\lambda t\lambda s.\text{hot}'(\text{the-weather}')(g)(t)(s))$
        $= \lambda g_1\lambda g_2\lambda t_1\lambda t_2\lambda s_1\lambda s_2[\text{hot}'(\text{the-weather}')(g_1)(t_1)(s_1) \wedge \text{hot}'(\text{the-weather}')(g_2)(t_2)(s_2) \wedge g_2 > g_1]$
     d. $[\![[_{CP}\ \textit{yuè wǒ bùshūfú}]]\!] = \lambda g_3\lambda g_4\lambda t_1\lambda t_2\lambda s_3\lambda s_4[\text{uncomfortable}'(\text{I}')(g_3)(t_1)(s_3) \wedge \text{uncomfortable}'(\text{I}')(g_4)(t_2)(s_4) \wedge g_4 > g_3]$
     e. $[\![\forall\ \textit{yuè tiānqì rè jiù yuè wǒ bùshūfú}]\!] = \forall g_1,g_2,t_1,t_2,s_1,s_2[\text{hot}'(\text{the-weather}')(g_1)(t_1)(s_1) \wedge \text{hot}'(\text{the-weather}')(g_2)(t_2)(s_2) \wedge g_2 > g_1] \rightarrow \exists g_3,g_4,s_3,s_4[s_1 < s_3 \wedge s_2 < s_4 \wedge R(\langle g_1,s_1\rangle,\langle g_3,s_3\rangle) \wedge R(\langle g_2,s_2\rangle,\langle g_4,s_4\rangle) \wedge \text{uncomfortable}'(\text{I}')(g_3)(t_1)(s_3) \wedge \text{uncomfortable}'(\text{I}')(g_4)(t_2)(s_4) \wedge g_4 > g_3]$
        (For every degree g1, g2, time t1 and t2 and situation s1, s2, if the weather is g1-degree hot at t1 in s1 and the weather is g2-degree hot at t2 in s2 and g2 is greater than g1, then the g1-degree of heat at t1 in s1 will cause me to feel g3-degree of discomfort at t1 in s3, an extended situation of s1, and the g2-degree of heat will cause me to feel g4-degree of discomfort at t2 in s4, an extended situation of s2, and g4 is greater than g3.)

As the reader can verify, the truth conditions given in (67), where the degree of heat is relative to a given time and situation, seem to be intuitively correct for the sentence in (66). In particular, the truth conditions do not require that the degree of heat increases with the advance of time. For example, the truth conditions in (67) predict that it is possible for (66) to be used in a scenario like the following: The degree of heat at time t1 is greater than the degree of heat at time t2, and t2 follows t1, and I am more uncomfortable at t1 than at t2. This further supports the proposed analysis of yuè.

The above illustrates how degree arguments can be compared with respect to a time and a situation. However, as noted by Beck for English comparative conditionals, it is also possible to compare two different individuals with respect to the degree of a given property. For instance, according to her analysis, the semantic paraphrase of (68a) should be (68b).

(68) a. The bigger an apple is, the more delicious it is.
b. ∀x,y[apple′(x) ∧ apple′(y) ∧ x is bigger than y][x is more delicious than y]

Now the question is: Do we get the same paraphrase for the Chinese counterpart to (68a)? As (69) suggests, the answer seems to be an affirmative one. Sentence (69) has a reading in which the degrees of bigness/size and deliciousness/taste of different apples can be compared.

(69) pingguǒ yuè dà, (jiù) yuè hǎochi
apple more big then more delicious
‘The bigger an apple is, the more delicious it is.’

(70) If a noun phrase translates as λx[N(x)] and a predicate AP translates as λxλgλs.AP(x)(g)(s), then ⟦NP AP⟧ translates as λxλgλs[N(x) ∧ AP(x)(g)(s)].

According to (70) and the semantics of yuè, the first clause of (69) should then translate as (71):

(71) a. ⟦pingguǒ dà⟧ = λxλgλs[apple′(x) ∧ big′(x)(g)(s)]
b. ⟦yuè píngguǒ dà⟧ = λxλyλg1λg2λs1λs2[apple′(x) ∧ big′(x)(g1)(s1) ∧ apple′(y) ∧ big′(y)(g2)(s2) ∧ g2 > g1]
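The two steps in (70) and (71) can be mirrored by a pair of higher-order functions. The Python sketch below is an illustration only: the function names combine and yue, the toy size table and the single situation "s0" are assumptions made for the example, and degrees are modelled as integers.

```python
def combine(noun, ap):
    """Rule (70): from N(x) and AP(x)(g)(s) build N(x) and AP(x)(g)(s)."""
    return lambda x, g, s: noun(x) and ap(x, g, s)

def yue(rel):
    """yuè applied to a relation between individuals and degrees, as in (71b)."""
    return lambda x, y, g1, g2, s1, s2: (
        rel(x, g1, s1) and rel(y, g2, s2) and g2 > g1)

# Hypothetical toy model: final sizes of three objects in one situation "s0".
SIZES = {"a1": 3, "a2": 5, "pear": 9}
apple = lambda x: x in {"a1", "a2"}
big = lambda x, g, s: SIZES[x] == g

apple_big = combine(apple, big)   # corresponds to (71a)
yue_apple_big = yue(apple_big)    # corresponds to (71b)

print(yue_apple_big("a1", "a2", 3, 5, "s0", "s0"))    # True: a2 is the bigger apple
print(yue_apple_big("a2", "a1", 5, 3, "s0", "s0"))    # False: g2 > g1 fails
print(yue_apple_big("a1", "pear", 3, 9, "s0", "s0"))  # False: the pear is not an apple
```

Because the individual argument survives rule (70), yuè duplicates it together with the degree and situation arguments, which is what makes a comparison of two different apples available.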

As for the second clause in (69), the subject NP is an empty NP. It is proposed that in Chinese ‘yuè . . . yuè . . . ’ constructions, such an empty subject NP (indicated by a line through it) copies its interpretation from the subject NP in the first clause.20 Therefore, the meaning of the second clause in (69) without jiù is (72).

19 For a rule quite similar to (70), see also McNally and Boleda (2004).
20 Alternatively, we may assume that the empty subject is a null pronoun which is anaphoric.


Given this reading, the next task is to explain how such a reading comes about. First, a discussion of the interpretation of common nouns is necessary. As is well known, the most basic interpretation of common nouns is the <e,t> interpretation. In (69), the subject NP consists only of the bare common noun pingguǒ ‘apple’. Suppose that the <e,t> interpretation of this bare common noun percolates to the subject NP. Apparently, this meaning of the subject NP cannot be combined with the meaning of the VP of type <e,>> via functional application. Although <e,t> and <e, expressions do not combine via functional application, they may get an interpretation via the following rule of interpretation:19


(72) ⟦yuè hǎo chi⟧ = λxλyλg3λg4λs3λs4[apple′(x) ∧ delicious′(x)(g3)(s3) ∧ apple′(y) ∧ delicious′(y)(g4)(s4) ∧ g4 > g3]

According to (62), the meaning of the whole ‘yue` . . . jiu` yue` . . . ’ construction in (69) is thus (73). (73)

(73) claims that for all x, y, g1, g2, s1 and s2 such that x and y are apples, g1 and g2 are the degrees of bigness of x and y in situations s1 and s2, respectively, and g2 is greater than g1, there exist g3, g4, s3 and s4 such that s3 is an extended situation of s1 and s4 is an extended situation of s2. Furthermore, g1 of x in s1 causes g3 of x in s3 and g2 of y in s2 causes g4 of y in s4, and g4 is greater than g3. In other words, the different degrees of bigness and deliciousness of apples are being compared. It should be emphasized that in (73) (= (69)) two apples are being compared with respect to a given property because the argument of yuè is a relation between individuals and degrees. That is, the argument of yuè is ‘λxλg . . . ’ rather than ‘λg . . . ’. This in turn is due to the fact that the subject noun phrase in (69) is analysed as having a predicative meaning, which is combined with the predicate of the clause by applying rule (70). Another important point which should be noted about (73) is that the adjective dà ‘big’ should be understood as an individual-level predicate, which describes the final size of an apple, rather than a stage-level predicate, which describes a temporary size. When yuè modifies an individual-level predicate and the subject NP is a bare noun, it always compares different individuals’ degrees with respect to a given property. This reading is further corroborated by (74).

(74) nǚháizi yuè piàoliàng, jiù yuè duō rén zhui
girls more beautiful then more many man chase
‘The more beautiful a girl is, the more men will run after her.’

Such an interpretation of bare nouns and individual-level predicates is actually not a surprise, because individual-level properties do not


∀ [yuè pingguǒ dà][jiù yuè hǎochi] = ∀x,y,g1,g2,s1,s2[apple′(x) ∧ big′(x)(g1)(s1) ∧ apple′(y) ∧ big′(y)(g2)(s2) ∧ g2 > g1] → ∃g3,g4,s3,s4[s1 < s3 ∧ s2 < s4 ∧ R(,) ∧ R(,) ∧ apple′(x) ∧ delicious′(x)(g3)(s3) ∧ apple′(y) ∧ delicious′(y)(g4)(s4) ∧ g4 > g3]

change over time. This is the reason why it is difficult to compare the same individual’s different degrees of a permanent property, as the oddness of (75) shows.21

(75) Xiǎomíng yuè cōngmíng, mama jiù yuè xǐhuan ta
Xiaoming more smart mother then more like him
‘The more clever Xiaoming is, the more his mother likes him.’

21 This sentence can be more acceptable if one compares across different possible worlds instead of the actual one.


If an individual-level property does not change over time, naturally one cannot compare its different degrees for the same individual. Therefore, if one wants to compare different degrees of an individual-level property, one has to compare different individuals’ degrees with respect to that property. However, it should be noted that the AP dà ‘big’ in (69) can also be interpreted as a stage-level predicate. With this interpretation, the size modified by yuè does not denote a permanent final size of the apple, but rather the gradual change of the shape of the apple. If this interpretation is chosen, (69) can be used to compare the different degrees of the same apple instead of two different apples. In what follows, how this reading is derived will be discussed. Again, it will be argued that this reading has its source in a proper analysis of an NP consisting of a bare noun. Kamp (1981), Heim (1982), Diesing (1990) and many others have suggested that indefinite noun phrases can be analysed as free variables accompanied by conditions regarding the assignment of those variables. For instance, ‘a girl’ can be translated as a free variable x, including the condition ‘girl′(x)’. According to Partee (2004: 206), such a Kamp-Heim approach to indefinites treats them as e-type expressions. This paper follows Partee in making this assumption and notates the translation as ‘x:girl′(x)’. With respect to Chinese, it is assumed that an NP which consists of a bare noun is also a type of indefinite and can be analysed as a free variable with a condition on it. Given the above assumptions, (69) can now be alternatively interpreted as (76), where the AP dà ‘big’ is understood as a stage-level predicate meaning a temporary size of an entity and the subject NP is given a Kamp-Heim-style treatment of indefinites.
(76) pingguǒ yuè dà jiù yuè hǎochi
apple more big then more delicious
‘The bigger an apple becomes, the more delicious it is.’ (Stage-level interpretation of big)


a. ⟦AP dà⟧ = λxλgλs.big′(x)(g)(s)
b. ⟦NP pingguǒ⟧ = x:apple′(x)
c. ⟦pingguǒ dà⟧ = λgλs.big′(x:apple′(x))(g)(s)
d. ⟦CP yuè pingguǒ dà⟧ = λg1λg2λs1λs2[big′(x:apple′(x))(g1)(s1) ∧ big′(x:apple′(x))(g2)(s2) ∧ g2 > g1]
e. ⟦CP yuè hǎochi⟧ = λg3λg4λs3λs4[delicious′(x:apple′(x))(g3)(s3) ∧ delicious′(x:apple′(x))(g4)(s4) ∧ g4 > g3]
f. ⟦CP ∀ yuè pingguǒ dà jiù yuè hǎochi⟧22 = ∀x,g1,g2,s1,s2[big′(x:apple′(x))(g1)(s1) ∧ big′(x:apple′(x))(g2)(s2) ∧ g2 > g1] → ∃g3,g4,s3,s4[s1 < s3 ∧ s2 < s4 ∧ R(,) ∧ R(,) ∧ delicious′(x:apple′(x))(g3)(s3) ∧ delicious′(x:apple′(x))(g4)(s4) ∧ g4 > g3]

Some remarks on the above technical details are in order here. In (76), since the bare noun subject is analysed as a type e variable as in (76b), it can be combined with the AP dà ‘big’ directly via functional application. The result of this is an expression of type > as in (76c) rather than type <e,>>. Therefore, when yuè applies to this > expression as in (76d), only the degree argument and not the individual argument will be lambda converted twice to produce two propositions. Later the free variable introduced by the bare noun subject is bound by the implicit universal quantifier. After the binding of the free variable by the universal quantifier, the result obtained is that two different degrees of bigness of the same individual apple are compared.23 In connection with the above analysis, it is interesting to note that Beck (1997: 237) has observed that a sentence like (77) might get a reading on which ‘an attorney is understood generically, and we talk about a development in the sliminess of one and the same attorney (i.e. we are not comparing different attorneys)’. Clearly, this reading of (77) analyses slimmer as a stage-level rather than as an individual-level property.

(77) The slimmer an attorney looks, the more successful he is.

Beck briefly notes, without arguments, that this reading of (77) can be formalized as (78):

22 I assume that a lambda operator is first inserted to bind the free variable x (to fix the type mismatch) so that it can be quantified over by the universal quantifier.
23 When the time arguments are also added, we will get the result of comparing degrees of bigness at different times. To simplify the matter, I do not represent the time arguments.

(78) Gen x [attorney(x)][∀t1,t2 [x is slimmer at t1 than at t2] → [x is more successful at t1 than at t2]]

(79) rén yuè máng jiù yuè róngyì shengbìng
man more busy then more easy get-sick
‘The busier a man is, the easier it is for him to get sick.’

(80) xuésheng yuè jǐnzhang jiù yuè róngyì kǎo de bù hǎo
student more nervous then more easy take-exam DE not good
‘The more nervous a student is, the easier it is for him to not do well in the exam.’

Most native speakers consulted agree that the dominant reading of (79) and (80) is one which compares the different degrees of busyness or nervousness of the same individual. Many of them have difficulty getting the reading which compares the different degrees of busyness or nervousness of different individuals, but they do accept such a reading with stronger contextual support. In light of this difference, it is


Although Beck explicitly says that she wants to disregard this reading in her analysis of comparative conditionals, it is still worthwhile to make some comments on the analysis of (78). First, a question raised by (78) is how two universal operators can be generated at the same time from a given conditional. As far as is known, no one seems to have proposed a theory of conditionals that justifies the postulation of an implicit Gen and ∀ simultaneously from the same conditional. Second, it is not clear how and why the indefinite an attorney must move out of the scope of a temporal conditional when the predicate is interpreted as a stage-level property. For these reasons, Beck’s approach to (77) is not explored in this paper. To sum up, Chinese comparative correlative constructions have two important properties in their interpretations. First, when the subject NP is a bare noun and the predicate is an individual-level property, the construction can compare the different degrees of that individual-level property only with respect to different individuals but not with respect to the same individual. This is because an individual’s permanent property does not change over time. Second, when the subject NP is a bare noun and the predicate is a stage-level property, the construction tends to compare the different degrees of the stage-level property with respect to the same individual rather than with respect to different individuals. This is further supported by (79) and (80).


reasonable to say that the first reading is an unmarked one, whereas the second reading is a marked one. If the native speakers’ intuitions and the analysis of the relevant data are correct, then the following generalizations for Chinese comparative correlatives can be made: (81)

The above two generalizations about the ‘yuè . . . yuè . . . ’ constructions in Chinese need not be stipulated and can be derived from a general theory of semantic interpretation. What follows is a plausible explanation of the two generalizations using Partee’s suggested processing strategy for sentence interpretation. In her famous article ‘Noun Phrase Interpretation and Type-shifting Principles’, Partee (2004) made the following suggestion for noun phrase interpretation:

(82) There is a general processing strategy of trying lowest types first, using higher types only when they are required in order to combine meanings by available compositional rules. (Partee 2004: 204–205)

It is suggested here that the two generalizations in (81) about Chinese comparative correlatives follow from the above general processing strategy. Consider the generalization (81A) first. It is accepted that a NP consisting of a bare noun can be analysed either as a free variable of type e or as a predicative expression of type <e,t>.24 On the other hand, in this article an AP or VP is analysed as a type <e,>> expression. Now, according to the processing strategy in (82), the type e meaning of bare noun subjects should be tried first when they are combined with their predicates. This choice then facilitates functional 24

The <e,t> meaning of an NP consisting of a bare noun is supported by examples like (i) and (ii):

(i) ta shì lǎoshi
he be teacher
‘He is a teacher.’
(ii) wǒ shandōng-rén
I Shandong-person
‘I am from Shandong.’


A. When the subject NP is a bare noun and the predicate is a stage-level property, a treatment of the bare noun subject as a type e free variable is necessary prior to an analysis of it as a predicative expression of type <e,t>.
B. When the subject NP is a bare noun and the predicate is an individual-level property, the bare noun subject must be analysed as a predicative expression of type <e,t>.


(83) Jǐngchá zài nàbian
Policeman at there
‘The policeman is over there./A policeman is over there.’

Sentence (83) has a reading in which the bare noun subject is understood as a definite, referring to a specific policeman whom both the hearer and speaker know about. However, it cannot be understood


applications and no special composition rule is needed. On the other hand, the <e,t> meaning of a bare noun NP is forced only when a combination of the bare noun subject and the predicate by a special composition rule like (70) is required. Partee did not further discuss the implication of the general processing strategy in (82), but it seems quite plausible to say that the general processing strategy actually reflects the degree of difficulty in processing the interpretation of a sentence. Namely, the less costly a strategy is, the easier it is to process the sentence using that strategy. Or, to use the term ‘prior’, the implication of Partee’s suggestion is that when one processing strategy must be chosen prior to another one, the reading obtained from the prior strategy is the one that is more salient or unmarked. If this view is correct, it then follows from the proposed analysis of yuè that when the predicate of a bare noun subject is a stage-level property, a reading which compares the same individual is more salient or easier to get than one which compares different individuals, because the first reading utilizes a simpler semantic object and a cost-free strategy, whereas the second reading employs a more complicated semantic type and a special compositional rule. As for the generalization (81B), in principle, when the predicate of a bare noun subject is an individual-level property, the lowest possible type should be tried first. However, if the bare noun subject is analysed as an e-type free variable, i.e., the lowest type possible, this leads to a pragmatically odd and semantically contradictory interpretation in which a permanent individual-level property may change over time. It is exactly for this reason that in a ‘yuè . . . yuè . . . ’ construction, the bare noun subject of an individual-level property must be analysed as a type <e,t> expression (the lowest possible type next to type e expressions) and must make use of a non-functional application rule such as (70) to do the semantic composition. As a result, a reading which compares different individuals is obtained. The above discusses the e and <e,t> uses of a bare noun NP to explain the two generalizations in (81). However, one might wonder why the <<e,t>,t> use, i.e., the generalized quantifier use, has been left out. This is because it is not certain that bare noun NPs in Chinese have this meaning. Consider the example in (83).


as a statement about the existence of a policeman. Namely, the bare noun subject cannot be interpreted as a generalized quantifier of the following type: λP∃x[policeman′(x) ∧ P(x)]. To express an existential reading, the existential operator yǒu ‘have’ must be used as in (84).

(84) yǒu jǐngchá zài nàbian
Have policeman at there
‘There are policemen over there.’

6 AN EXTENSION TO COMPLEX NPs WITH A RELATIVE CLAUSE

At the outset of this article, it was mentioned that comparative correlatives in Chinese usually take the form of two independent clauses, one being placed before the other. However, as Hsiao (2003) has observed, this is not the only form of comparative correlatives. A ‘yuè . . . yuè . . . ’ construction may also take the following form, where the first yuè is inside a relative clause and the second yuè modifies the main predicate.

(85) [CP[IP[DPi[CP tj yuè jǐnzhang de xuésheng ] [I′ jiù yuè róngyì ti kǎo de bù hǎo]]27
more nervous Rel student then more easy take-exam not good
lit. ‘Students who are more nervous are more likely to do badly in their exams.’

Such constructions raise two important questions. The first one is related to the interpretation of the sentence. Sentence (85) is the complex NP version of the normal comparative correlative in (80).

25 See also Cheng (1994) for a discussion of this point.
26 An existential reading of bare noun NP in object position can be obtained. However, such existential readings are arguably derived from the binding by an existential operator adjoined to VP as in Diesing (1990).
27 In this article, the VP-internal subject hypothesis is assumed.


The obligatory presence of the existential operator yǒu ‘have’ in (84) indicates that Chinese bare noun NPs are better analysed as free variables or predicative expressions.25 Both possibilities allow yǒu ‘have’ to do the existential binding.26 Even if the generalized quantifier interpretation is possible for Chinese bare noun NPs, such an interpretation will not be used in Chinese comparative correlatives. This is because, as noted, the general processing strategy for sentence meaning is to try the lowest types first, using higher types only when they are required. Since e and <e,t> are types lower than <<e,t>,t>, the former will be tried before the latter.


(86)

The semantic computation of (86) is (87):

(87) a. ⟦C′⟧ = λgλs.nervous′(x)(g)(s)
b. ⟦CP⟧ = λxλgλs.nervous′(x)(g)(s)
c. ⟦N⟧ = ⟦N′⟧ = λx.student′(x)
d. ⟦NPj⟧ = λxλgλs[nervous′(x)(g)(s) ∧ student′(x)]
e. ⟦NP⟧ = ⟦yuè NPj⟧ = λxλyλg1λg2λs1λs2[nervous′(x)(g1)(s1) ∧ student′(x) ∧ nervous′(y)(g2)(s2) ∧ student′(y) ∧ g2 > g1]
f. ⟦AP1⟧ = ⟦x róngyì kǎo de bù hǎo⟧ = λgλs.easy-do-bad-in-exam′(x)(g)(s)
g. ⟦AP2⟧ = λxλgλs.easy-do-bad-in-exam′(x)(g)(s)


However, the interpretation of (85) differs greatly from that of (80). As noted, the dominant reading of (80) compares the different degrees of nervousness of the same individual. Curiously enough, this is not the reading of (85). For almost every native speaker who was consulted, (85) is used to compare the degrees of nervousness of different individuals. A question thus immediately arises as to why this is the case. A second question is: Can the reading be derived from the above proposed analysis of Chinese comparative correlatives? These are the questions which need to be addressed next. Let it be assumed that, just as for normal ‘yuè . . . yuè . . . ’ constructions, (85) is also mapped to a tripartite structure with an implicit universal operator as the quantifier in Logical Form. Furthermore, let it be assumed that the subject NP is mapped to the restriction of the quantifier and the VP to the nuclear scope. Since the subject NP is raised from the VP-internal position to the specifier position of IP, there is a lambda operator binding the original subject position. Finally, let it be assumed that the two occurrences of yuè are adjoined to the restriction and the nuclear scope, respectively, and that jiù is further adjoined to the top of yuè. Then the Logical Form of (85) looks like (86).


h. ⟦yuè AP2⟧ = λxλyλg3λg4λs3λs4[easy-do-bad-in-exam′(x)(g3)(s3) ∧ easy-do-bad-in-exam′(y)(g4)(s4) ∧ g4 > g3]
i. ⟦∀ yuè jǐnzhāng de xuésheng jiù yuè róngyì kǎo de bù hǎo⟧ = ∀x,y,g1,g2,s1,s2[nervous′(x)(g1)(s1) ∧ student′(x) ∧ nervous′(y)(g2)(s2) ∧ student′(y) ∧ g2 > g1] → ∃g3,g4,s3,s4[s1 < s3 ∧ s2 < s4 ∧ R(,) ∧ R(,) ∧ easy-do-bad-in-exam′(x)(g3)(s3) ∧ easy-do-bad-in-exam′(y)(g4)(s4) ∧ g4 > g3]

7 THE SEMANTICS OF COMPARATIVE CORRELATIVES WITH AN OVERT ITEM OF COMPARISON

It has been noted earlier that, unlike English comparative conditionals, Chinese comparative correlatives allow an overt item of comparison to occur, as (50), reproduced below, shows.

(50) Nǐ yuè bǐ ta kuàilè, ta jiù yuè tongkǔ
you more compare he happy he then more painful
‘The happier you are than he, the more painful he is.’
‘The more your happiness exceeds his, the more painful it is for him.’

28 This assumption, it seems, is implicitly made by Kamp (1981), Heim (1982) and Diesing (1990).


As can be seen from the last step in (87), the final translation is a claim about the comparison of different individuals’ degrees of nervousness instead of a comparison of the same individual’s different degrees of nervousness. This interpretation is a consequence of the phrase structure and can be explained as follows. Suppose that the free variable analysis, i.e. the type e analysis, of indefinites applies only to full indefinite NPs.28 Then this analysis is inapplicable to the noun xuésheng ‘student’ in (86), because xuésheng ‘student’ in (86) at most constitutes an N′. In contrast, there is no problem with the treatment of the bare noun as a type <e,t> predicate. When this is the case, the meaning of N′ can be combined through rule (70) with the meaning of the relative clause, which is a type <e,>> expression, to produce another expression of type <e,>>. Consequently, when this expression serves as the argument of yuè, two different individuals x and y, and two different degrees g1 and g2, will be used to produce two propositions. Therefore, a reading in which different individuals are compared with respect to their degrees of nervousness is obtained.

In this section, it will be shown that the proposed semantics of yuè may apply to (50), given a proper analysis of the comparison morpheme bǐ. It should be noted that this paper does not intend to provide a comprehensive treatment of Chinese comparative constructions. So the treatment of bǐ will be minimal, only to allow discussion of examples like (50). It is proposed that sentences with bǐ have structures like (88) and bǐ has a semantics like (89). The computation of (88) is, therefore, (90). (88)

If there is no other element in addition to the representation, existential closure will apply to the g2 and situation variable, giving rise to the interpretation that he is g1-degree happy in situation s and you are g2-degree happy in situation s, and that g2 is greater than g1. This seems to be intuitively correct. If, in addition to the representation of (88), there is yuè, the computation will continue as follows:

(91) ⟦yuè nǐ bǐ ta kuàilè⟧ = λPλg5λg6λs5λs6[P(g5)(s5) ∧ P(g6)(s6) ∧ g6 > g5](λg2λs∃g1[happy′(he′)(g1)(s) ∧ happy′(you′)(g2)(s) ∧ g2 > g1]) = λg5λg6λs5λs6[∃g1[happy′(he′)(g1)(s5) ∧ happy′(you′)(g5)(s5) ∧ g5 > g1] ∧ ∃g2[happy′(he′)(g2)(s6) ∧ happy′(you′)(g6)(s6) ∧ g6 > g2 ∧ g6 > g5]]

The last line of (91) says that in s5 you are g5-degree happy while he is g1-degree happy and that you are happier than he. It also says that in s6 you are g6-degree happy and he is g2-degree happy, and again that the former is happier than the latter.


(89) ⟦bǐ⟧ = λxλP<e,>>λyλg2λs∃g1[P(x)(g1)(s) ∧ P(y)(g2)(s) ∧ g2 > g1]

(90) ⟦bǐ tā⟧ = λP<e,>>λyλg2λs∃g1[P(he′)(g1)(s) ∧ P(y)(g2)(s) ∧ g2 > g1]
⟦bǐ tā kuàilè⟧ = λyλg2λs∃g1[happy′(he′)(g1)(s) ∧ happy′(y)(g2)(s) ∧ g2 > g1]
⟦Nǐ bǐ tā kuàilè⟧ = λg2λs∃g1[happy′(he′)(g1)(s) ∧ happy′(you′)(g2)(s) ∧ g2 > g1]
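The composition in (89) and (90), and the subsequent application of yuè in (91), can be traced step by step with curried functions. The Python sketch below is an illustration under stated assumptions: the degree scale is finite (so that the existential quantifier over g1 can be checked by enumeration), degrees are integers, and the happiness table, the situation labels and the names bi, yue and DEGREES are hypothetical.

```python
DEGREES = range(0, 11)  # assumed finite degree scale

def bi(standard):
    """bǐ as in (89): takes the standard x, then the predicate P, then the target y."""
    def take_pred(pred):
        def take_target(target):
            def result(g2, s):
                # Existential quantification over the standard's degree g1.
                return any(pred(standard, g1, s) and pred(target, g2, s) and g2 > g1
                           for g1 in DEGREES)
            return result
        return take_target
    return take_pred

def yue(pred):
    """yuè applied to a degree/situation predicate, as in the first line of (91)."""
    return lambda g5, g6, s5, s6: pred(g5, s5) and pred(g6, s6) and g6 > g5

# Hypothetical toy model: degrees of happiness in two situations.
HAPPINESS = {("you", "s5"): 6, ("he", "s5"): 2,
             ("you", "s6"): 8, ("he", "s6"): 3}
happy = lambda x, g, s: HAPPINESS.get((x, s)) == g

ni_bi_ta_kuaile = bi("he")(happy)("you")   # last line of (90)
restrictor = yue(ni_bi_ta_kuaile)          # last line of (91)

print(ni_bi_ta_kuaile(6, "s5"))            # True: 6 exceeds his degree 2 in s5
print(restrictor(6, 8, "s5", "s6"))        # True: you outdo him in both, and 8 > 6
print(restrictor(8, 6, "s6", "s5"))        # False: g6 > g5 fails
```

Since bǐ existentially closes the degree of the standard, what yuè receives is again a predicate of degrees and situations, which is why the overt item of comparison does not interfere with the comparison that yuè itself introduces.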


The last step is then to compute the whole construction, which is as follows.

(92) ⟦∀ yuè nǐ bǐ tā kuàilè jiù yuè tā tongkǔ⟧ = ∀g5,g6,s5,s6[∃g1[happy′(he′)(g1)(s5) ∧ happy′(you′)(g5)(s5) ∧ g5 > g1] ∧ ∃g2[happy′(he′)(g2)(s6) ∧ happy′(you′)(g6)(s6) ∧ g6 > g2 ∧ g6 > g5]] → ∃g3,g4,s3,s4[s5 < s3 ∧ s6 < s4 ∧ R(,) ∧ R(,) ∧ painful′(he′)(g3)(s3) ∧ painful′(he′)(g4)(s4) ∧ g4 > g3]

8 CONCLUDING REMARKS

This paper has argued that Hsiao’s (2003) proposed semantics for Chinese comparative correlatives in terms of the notion of proportion is inadequate and that Beck’s (1997) semantics for English comparative conditionals, which involves reference to comparisons of times, worlds and individuals, cannot be extended to account for the Chinese data. Therefore, a completely new analysis of yuè is proposed, according to which a comparison of degrees is related to different or the same individual, time or situation. The different interpretations of comparative correlatives, i.e. comparison of the same individual’s or different individuals’ degrees of a given property, are shown to be the consequences of Partee’s general processing strategy for noun phrase interpretations, as well as of the constituent structure of the sentence. This new analysis of comparative correlatives, though quite successful in accounting for the Chinese data, leaves us wondering whether it is still possible to maintain a universal semantics of comparative correlatives or to explain the variability in some other way. What follows is an outline of a possible direction that may achieve this goal. If the proposed analysis of Chinese yuè is correct, the best strategy for a universal semantics of comparative correlatives is to accept that


Thus, the final truth conditions are the following: for every g5, g6, s5 and s6 such that you are g5-degree happy in s5, he is g1-degree happy in s5 and g5 is greater than g1, and you are g6-degree happy in s6, he is g2-degree happy in s6 and g6 is greater than both g2 and g5, there exist a degree g3 in s3, caused by (or related to) g5 in s5, and a degree g4 in s4, caused by (or related to) g6 in s6, such that g3 and g4 are degrees of his painfulness and g4 is greater than g3. It seems that the truth conditions are intuitively correct. The proposed semantics of yuè thus successfully accounts for the co-occurrence of yuè and an overt item of comparison, which is simply impossible based on Beck’s analysis of the morpheme the for English comparative correlatives.

other languages employ the same semantics as yuè and therefore any observed cross-linguistic variations can be attributed to independent factors. Indeed, it is suggested here that this is possibly the case and that the cross-linguistic variation is due to syntactic differences between different languages. There are at least the following syntactic differences between Chinese and English-type comparative correlatives:

The second difference, it is suggested, can be attributed to the fact that English employs overt wh-movements, whereas Chinese lacks such movements (however, this issue will not be addressed here). As for the first and third differences, they are, in all likelihood, inter-related. Following Beck (1997), it can be assumed that -er heads a degree phrase which takes an adjective as its complement and the as the specifier. Moreover, it is proposed that the is syntactically a DP bound by the subject and semantically it is a subject copy in disguise. Namely, the denotes the same thing as the subject of IP. Based on these assumptions, the surface structure of an English comparative correlative looks roughly like the following:

(93) a. The busier John is, the happier I am.
b. [CP [CP [DegPi [Speck the] [Deg′ [Deg -er] [AP busy]]] [C′ Johnk is ti]] [CP [DegPj [Specx the] [Deg′ [Deg -er] [AP happy]]] [C′ Ix am tj]]]

However, at LF, the input of semantic interpretation, the DegP is reconstructed back to its launching site (possibly because the predicate


(A) The morpheme the is present in the DegP in English comparative correlatives, whereas such a morpheme does not seem to exist in Chinese comparative correlatives.
(B) English comparative correlatives involve the overt syntactic movement of the ‘the Adj-er’ constituent, whereas Chinese comparative correlatives do not involve such movements.
(C) English comparative correlatives do not allow an overt item of comparison, whereas Chinese comparative correlatives do.


AP must be within the scope of tense and the must be bound).29 Therefore, the ultimate LF is (94).

(94) [CP ∀ [CP [IP [DPk John] [I′ [Tense -es] [VP [V be] [DegP [Speck the] [Deg′ [Deg -er] [AP busy]]]]]]] [CP [IP [DPx I] [I′ [Tense -es] [VP [V be] [DegP [Specx the] [Deg′ [Deg -er] [AP happy]]]]]]]]

If it is assumed that the -er in the above structure has exactly the same denotation as yuè in Chinese, then the semantic computation of the restrictive clause, excluding the contribution of tense and the copular verb be, is the following:

(95) a. ⟦AP⟧ = λxλgλs.busy′(x)(g)(s)
b. ⟦Deg′⟧ = ⟦-er⟧(λxλgλs.busy′(x)(g)(s)) = λx1λx2λg1λg2λs1λs2[busy′(x1)(g1)(s1) ∧ busy′(x2)(g2)(s2) ∧ g2 > g1]
c. ⟦the⟧ = John′
d. ⟦DegP⟧ = ⟦Deg′⟧(⟦the⟧) = λx2λg1λg2λs1λs2[busy′(John′)(g1)(s1) ∧ busy′(x2)(g2)(s2) ∧ g2 > g1]
e. ⟦IP⟧ = λg1λg2λs1λs2[busy′(John′)(g1)(s1) ∧ busy′(John′)(g2)(s2) ∧ g2 > g1]
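The saturation order in (95b) to (95e) can be made concrete by currying -er so that the individual arguments are consumed one at a time. The Python sketch below is illustrative only: the busyness table, the situation labels and the names er, the, deg_p and ip are hypothetical, and degrees are modelled as integers.

```python
def er(rel):
    """-er as in (95b), curried so that x1 is saturated before x2."""
    return lambda x1: lambda x2: lambda g1, g2, s1, s2: (
        rel(x1, g1, s1) and rel(x2, g2, s2) and g2 > g1)

# Hypothetical toy model: John's degrees of busyness in two situations.
BUSYNESS = {("John", "s1"): 2, ("John", "s2"): 5}
busy = lambda x, g, s: BUSYNESS.get((x, s)) == g

the = "John"             # (95c): 'the' is a disguised copy of the subject
deg_p = er(busy)(the)    # (95d): 'the' fills the x1 slot
ip = deg_p("John")       # (95e): the subject fills the x2 slot

print(ip(2, 5, "s1", "s2"))  # True: John is busier in s2 than in s1
print(ip(5, 2, "s2", "s1"))  # False: g2 > g1 fails
```

Once the has filled the first individual slot, the only individual slot left is the one taken by the subject, so in this sketch there is no position that a further phrase could fill.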

At this stage, the denotation of the restrictive clause looks exactly like that of the restrictive clause of the Chinese comparative correlative. This also applies to the nuclear scope. The degree and situation variables are then bound by the default universal quantifier, obtaining the following translation:

(95) f. ∀g1,g2,s1,s2[busy′(John′)(g1)(s1) ∧ busy′(John′)(g2)(s2) ∧ g2 > g1] → ∃g3,g4,s3,s4[s1 < s3 ∧ s2 < s4 ∧ happy′(I′)(g3)(s3) ∧ happy′(I′)(g4)(s4) ∧ g4 > g3]

29 The reconstruction is possibly not necessary, as long as semantic composition can be done. However, for the sake of simplicity, this assumption will be made.

(96) The more nervous a student is in a test, the worse his performance will be.

This sentence seems to have two interpretations. One interpretation compares the degrees of nervousness of the same student and the other compares the degrees of nervousness of different students. The first interpretation can be easily derived as follows. The indefinite NP is analysed as a free variable with a condition on it, i.e., ‘x:student′(x)’ in terms of the notation used in this paper. The morpheme the also denotes ‘x:student′(x)’. Since the variable denoted by the and the variable denoted by the subject are the same, the result is a reading which compares the same individual when the variables are bound by the default universal quantifier. On the other hand, suppose that instead of translating as ‘x:student′(x)’, the morpheme the may also be translated as ‘y:student′(y)’. This possibility is not surprising if the morpheme the is seen as just another copy of the indefinite DP in subject position. An indefinite analysed as a free variable can freely translate as ‘x:student′(x)’ or ‘y:student′(y)’ or with any other variable one prefers to pick. When the variable introduced by the disguised indefinite, i.e., the morpheme the, is different from the variable introduced by the indefinite subject DP, the result is an interpretation in which the comparing argument and the compared argument are different. Therefore, different individuals instead of the same individual are


It is important to note that based on the above analysis, the function of the morpheme the is to saturate the compared argument (the standard of comparison), i.e., the variable x1 of the first proposition, in contrast to the subject DP which saturates the comparing argument (the target of comparison), i.e., the variable x2 of the second proposition. This then predicts that an overt item of comparison such as than Bill is impossible, because the and than Bill compete for the same argument position. Thus an important syntactic difference between English and Chinese comparative correlatives can be explained. Chinese comparative correlatives allow an overt item of comparison, precisely because no morpheme like the saturates the compared argument in syntax. So it can be seen that it is possible to give English and Chinese comparative correlatives a unifying analysis, while at the same time explaining the superficial syntactic difference through independent factors. In (93) and (94), the subject of the restrictive clause is a definite NP. When the NP is an indefinite as in (96), a similar analysis can be given.

Jo-wang Lin 211

Acknowledgements I am indebted to Professors Paul Portner and Cheng-sheng Liu for their comments and discussions on previous versions of this paper. I would also like to thank the two anonymous reviewers for their suggestions which have led to the significant improvement of this paper. Finally, the research for this paper would not have been possible without the support of the government of Taiwan through the MOE ATU Program and a NSC Grant. JO-WANG LIN Department of Foreign Languages and Literatures National Chiao Tung University 1001 Ta Hsueh Road Hsinchu 300 Taiwan email: [email protected]


compared in this case. It can, therefore, be concluded that by making some plausible assumptions it is not a problem to extend the proposed analysis of yue` to the English -er with respect to comparative correlatives, though the exact compositional process of the constructions in the two languages might not look exactly alike due to the syntactic differences of both languages. The above offers an outline of a possible direction for a unifying analysis of cross-linguistic comparative correlatives. Though the suggestion remains somewhat sketchy, it does look very promising and is certainly worth further exploration in the future. However, a caveat about the analysis is in order. Namely, the unifying analysis of crosslinguistic comparative correlative constructions has been achieved at the cost of a dual semantics of the comparative morpheme -er. This is due to the fact that the -er as it occurs in comparative correlatives does not seem to use the same semantics as the -er as it occurs in standard comparative sentences. Therefore, this is in contrast to Beck’s endeavour to maintain a uniform semantics of -er across constructions. However, it is not certain that this cost represents a real disadvantage as lexical items often have different semantics depending upon the contexts in which they appear and also because type-shifting may be used to derive one meaning from another. It can be assumed that this is also the case for the morpheme -er and it can be concluded that it is possible to extend the new semantics of yue` for Chinese comparative correlatives to similar constructions in other languages, even though the constructions may look syntactically different and involve a slightly different compositional process.

REFERENCES

Barker, C. (2002) ‘The dynamics of vagueness’. Linguistics and Philosophy 25:1–36.
Beck, S. (1997) ‘On the semantics of comparative conditionals’. Linguistics and Philosophy 20:229–271.
Borsley, R. D. (2004) ‘An approach to English comparative correlatives’. In S. Müller (ed), Proceedings of the 11th International Conference on Head-Driven Phrase Structure Grammar. CSLI Publications. Stanford. 70–92.
Cheng, L. S. L. (1994) ‘Wh-words as polarity items’. Chinese Languages and Linguistics 2:615–640.
Cheng, L. S. L. & Huang, C. T. J. (1998) ‘Two types of donkey sentences’. Natural Language Semantics 4:121–163.
Cresswell, M. J. (1976) ‘The semantics of degree’. In B. H. Partee (ed), Montague Grammar. Academic Press. New York.
Culicover, P. & Jackendoff, R. (1999) ‘The view from the periphery: The English comparative correlative’. Linguistic Inquiry 30:543–571.
Dikken, M. den (2005) ‘Comparative correlatives comparatively’. Linguistic Inquiry 36:497–532.
Doetjes, J. S. (1997) Quantifiers and Selection: On the Distribution of Quantifying Expressions in French, Dutch and English. ICG Printing. Dordrecht.
Fillmore, C. J. (1987) ‘Varieties of conditional sentences’. ESCOL 3:163–182.
von Fintel, K. (1994) ‘Restrictions on quantifiers domains’. Unpublished PhD thesis. Department of Linguistics. University of Massachusetts. Amherst, MA.
Graff, D. (2000) ‘Shifting sands: An interest-relative theory of vagueness’. Philosophical Topics 20:45–81.
Heim, I. (1982) ‘The semantics of definite and indefinite noun phrases’. Unpublished PhD thesis. Department of Linguistics. University of Massachusetts. Amherst, MA.
Heim, I. (2001) ‘Degree operators and scope’. In C. Féry & W. Sternefeld (eds), Audiatur Vox Sapientiae: A Festschrift for Arnim von Stechow. Akademie Verlag. Berlin. 214–239.
Heim, I. & Kratzer, A. (1998) Semantics in Generative Grammar. Blackwell Publishers Inc. Oxford.
Hsiao, S.-Y. (2003) ‘On proportional correlative constructions in Chinese and Mongolian’. Journal of Taiwanese Languages and Literature 1:243–272.
Kamp, H. (1981) ‘A theory of truth and semantic interpretation’. In J. Groenendijk, T. Janssen & M. Stokhof (eds), Formal Methods in the Study of Language: Proceedings of the Third Amsterdam Colloquium: Part I. Mathematical Centre Tracts. Amsterdam. 277–321.
Kennedy, C. (2005) Parameters of comparison. Unpublished MS. Department of Linguistics, University of Chicago.
Kennedy, C. & McNally, L. (2005) ‘Scale structure and the semantic typology of gradable predicates’. Language 81:345–381.
Kratzer, A. (1995) ‘Stage-level and individual-level predicates’. In G. N. Carlson & F. J. Pelletier (eds), The Generic Book. University of Chicago Press. 125–175.
Lewis, D. (1970) ‘General semantics’. Synthese 22:18–67.
Lewis, D. (1975) ‘Adverbs of quantification’. In E. Keenan (ed), Formal Semantics of Natural Language. Cambridge University Press. Cambridge. 3–15.
Lin, J.-w. (1996) ‘Polarity licensing and Wh-phrase quantification in Chinese’. Unpublished PhD thesis. Department of Linguistics. University of Massachusetts. Amherst, MA.
Liu, C.-S. L. (2006) ‘Polarity items in Chinese comparative Conditionals’. Paper presented in the 14th Annual Conference of the International Association of Chinese Linguistics. 25–28 May 2006. Academia Sinica, Taipei.
Lü, S.-X. (1982) Zhongguo Wenfa Yaoluë [The Basics of Chinese Grammar]. Shangwu Yingshuguan. Beijing.
McCawley, J. D. (1988) ‘The comparative conditional constructions in English, German and Chinese’. Proceedings of the 14th Annual Meeting of the Berkeley Linguistic Society. 176–187.
McNally, L. & Boleda, G. (2004) ‘Relational adjectives as properties of kinds’. In O. Bonami & P. Cabredo Hofherr (eds), Empirical Issues in Formal Syntax and Semantics 5: Papers from CSSP 2003. 179–196.
Partee, B. H. (2004) ‘Noun phrase interpretation and type-shifting principles’. In Compositionality in Formal Semantics: Selected Papers by Barbara H. Partee. Blackwell. Malden. 203–230.
Sybesma, R. (1999) The Mandarin VP. Kluwer Academic Publishers. Dordrecht.
Thiersch, C. (1982) ‘The harder they come . . . : A note on the double comparative construction in English’. In W. Welte (ed), Sprachtheorie und angewandte Linguistik. Festschrift für Alfred Wollmann. Tübinger Beiträge zur Linguistik 195. Narr, Tübingen. 47–65.
von Stechow, A. (1984) ‘My reaction to Cresswell’s, Hellan’s, Hoeksema’s, and Seuren’s comments’. Journal of Semantics 3:183–199.
Wold, D. (1991) ‘A few properties of the . . . the comparative constructions’. In Proceedings of the Second Annual Meeting of the Formal Linguistic Society of Midamerica (FLSM2). University of Michigan. Ann Arbor. 272–281.
Xiandai Hanyu Xuci Lishi [Examples and Explanation of the Empty Words of Modern Chinese] (1982). Shangwu. Beijing.
Xing, F.-y. (2001) Hanyu Fuju Yanjiu [A Study of Complex Sentences]. Shangwu Yingshuguan. Beijing.

First version received: 19.06.2006
Second version received: 18.10.2006
Accepted: 26.12.2006
