JOURNAL OF SEMANTICS Volume 25 Number 1
Special issue on PROCESSING MEANING (Part 2)
Edited by: JULIE SEDIVY ROBYN CARSTON IRA NOVECK BART GEURTS
CONTENTS
ANDREW KEHLER, LAURA KERTZ, HANNAH ROHDE AND JEFFREY L. ELMAN
Coherence and Coreference Revisited
End of Special Issue
MICHELA IPPOLITO On the Meaning of Only
Please visit the journal’s web site at www.jos.oxfordjournals.org
Journal of Semantics 25: 1–44 doi:10.1093/jos/ffm018
Coherence and Coreference Revisited ANDREW KEHLER, LAURA KERTZ, HANNAH ROHDE AND JEFFREY L. ELMAN University of California, San Diego
Abstract
For more than three decades, research into the psycholinguistics of pronoun interpretation has argued that hearers use various interpretation ‘preferences’ or ‘strategies’ that are associated with specific linguistic properties of antecedent expressions. This focus is a departure from the type of approach outlined in Hobbs (1979), who argues that the mechanisms supporting pronoun interpretation are driven predominantly by semantics, world knowledge and inference, with particular attention to how these are used to establish the coherence of a discourse. On the basis of three new experimental studies, we evaluate a coherence-driven analysis with respect to four previously proposed interpretation biases—based on grammatical role parallelism, thematic roles, implicit causality, and subjecthood—and argue that the coherence-driven analysis can explain the underlying source of the biases and predict in what contexts evidence for each will surface. The results further suggest that pronoun interpretation is incrementally influenced by probabilistic expectations that hearers have regarding what coherence relations are likely to ensue, together with their expectations about what entities will be mentioned next, which, crucially, are conditioned on those coherence relations.
1 INTRODUCTION
More than three decades of research has sought to uncover the principles that determine how hearers interpret pronouns in context.1 This work, which has predominantly been carried out in the psycholinguistics and computational linguistics communities, has focused to a large extent on identifying preferences or heuristics that hearers utilize to interpret a pronoun; these preferences are often based on linguistic properties of possible antecedent expressions, such as the grammatical and thematic roles that they fill within a sentence. As a collection, these preferences are often in conflict, and no clear consensus has emerged with respect to how they are utilized or how conflicts among them are reconciled during the interpretation process.
1 Throughout the discussion, we will use pronouns to mean unaccented, third-person pronouns, unless otherwise specified. Accented third-person pronouns are considered in section 3.5.
© The Author 2007. Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected].
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
2 To be clear, we will not ultimately conclude that coherence establishment is the root cause of all biases in pronoun interpretation. See section 7 for further discussion.
3 Smyth does not characterize parallelism effects as the result of an independent preference, but instead as a by-product of the structure of the coreference processor. We discuss his analysis in more detail in sections 3.1 and 3.4.
The emphasis on such factors may partially explain Beaver’s (2004) observation of a ‘curious near absence of work within [the formal semantics and pragmatics] tradition on anaphora resolution’, particularly with respect to its concentration on absolute semantic constraints rather than semantically relevant factors that cause some interpretations to be favoured over others. Indeed, pronouns provide a textbook case of an underspecified linguistic form that must be semantically interpreted within a context, and as such, we would argue, the study of their behaviour offers a window into the larger questions concerning the semantic and discourse interpretation processes that go on around them. Yet, with limited exceptions, the semanticist will find a striking lack of emphasis on meaning in the existing literature on the topic. The current focus on preference-driven theories is in fact a departure from the type of approach outlined in Hobbs (1979), who, working in the artificial intelligence tradition, argued that the mechanisms that drive pronoun interpretation are driven predominantly by semantics, world knowledge and inference, with particular reference to how these are used to establish the coherence of discourses. That is, in his account the same types of inference processes that semanticists commonly appeal to for computing implicatures, accommodating presuppositions, and the like are also those used for computing, using his term, the ‘petty implicatures’ associated with assigning pronouns to their referents. Hobbs’s approach thus gives us a starting point for an attempt to bridge the gap between semantics and psycholinguistic research as they pertain to pronoun interpretation. 
In previous work, Kehler (2002) argued that the preferences commonly cited in the psycholinguistic and computational linguistics literatures are to some extent epiphenomena of the methods by which discourse coherence is established, although he offered no new empirical data to support this position. In this paper, we present new evidence in support of a coherence analysis (sketched in section 2), and describe how it can accommodate a range of previous findings suggestive of conflicting preferences and biases.2 We start section 3 by examining the grammatical subject preference (Crawley et al. 1990, inter alia), which favours referents that occupy the grammatical subject position of the previous clause, and the grammatical role parallelism preference (Sheldon 1974; Smyth 1994; Chambers and Smyth 1998, inter alia),3 which favours referents that occupy the same grammatical role as the pronoun. We present the results of our first experiment that show that both preferences can be neutralized when coherence is carefully controlled for, and furthermore argue that the grammatical role parallelism preference is an epiphenomenon of an independent interaction between information structure and accent placement in a particular class of coherence relations. We follow in section 4 with the results of a second experiment designed to distinguish two types of bias proposed by Stevenson et al. (1994): a thematic role preference, according to which the occupants of the Goal thematic role are preferred to those that occupy the Source, and an event-structure bias, according to which hearers focus on the end state of the previous eventuality when interpreting an utterance. The results support the event-structure bias, and further show that the bias is limited primarily to those coherence relations which implicate event structure in their formulation. In section 5, we address the ramifications of our analysis for the time course of pronoun interpretation during incremental processing, and offer a model that captures how a hearer’s coherence-driven expectations about how the discourse is likely to proceed can predict online measurements of pronoun interpretation difficulty. In section 6, we examine a case study with respect to implicit causality biases that have been well-studied in the psycholinguistics literature, and argue on the basis of a third experiment that they represent one instance of a more comprehensive set of biases that drive predictive discourse interpretation. In section 7, we revisit the grammatical subject preference and offer reasons against interpreting the results of Crawley et al. (1990) as support for an independent subject assignment strategy. We also argue, however, that data from Stevenson et al. (1994) offer more convincing support for the existence of a subject bias beyond what can be explained solely by coherence-driven expectations, and suggest a way in which these data can still be explained without appeal to overlaid interpretation heuristics or preferences. We conclude in section 8 by summarizing the ways in which our analysis provides alternative explanations of previous results and suggests areas for future work.
2 COHERENCE AND COREFERENCE
Hobbs (1979) presents what in some respects could be considered to be the most parsimonious theory offered to date of how pronouns are interpreted. In his account, pronoun interpretation is not even an independent process, but instead results as a by-product of more general reasoning about the most likely interpretation of an utterance,
including the establishment of discourse coherence. Pronouns are modelled as free variables in logical representations which become bound during these inference processes; potential referents of pronouns are therefore those which result in valid proofs of coherence. To illustrate, consider passages (1a) and (1b), adapted from an example from Winograd (1972).
(1) The city council denied the demonstrators a permit because . . .
a. . . . they feared violence.
b. . . . they advocated violence.
Hearers appear to have little difficulty resolving the pronoun they in each case, despite the fact that it refers to the city council in sentence (1a) and the demonstrators in sentence (1b). Note that the only difference is the verb used in the second clause, which suggests that semantics and world knowledge are responsible for determining the correct referents. The Explanation coherence relation, as signalled by because, is operative in each case (the variables S1 and S2 represent the first and second sentences being related, respectively):
Explanation: Infer P from the assertion of S1 and Q from the assertion of S2, where normally Q → P.
Oversimplifying a bit, we encode the world knowledge necessary to establish Explanation for (1) within a single axiom, given in (2).
(2) fear(X, V) ∧ advocate(Y, V) ∧ enable_to_cause(Z, Y, V) → deny(X, Y, Z)
If we assume that the variables X, Y, V and Z are bound to the city council, the demonstrators, violence, and the permit, respectively, axiom (2) says that if the city council fears violence, the demonstrators advocate violence, and a permit would enable the demonstrators to bring about violence, then it might ‘plausibly follow’ that the city council would deny the demonstrators a permit. The first sentence in (1) is represented as in (3).
(3) deny(city_council, demonstrators, permit)
This representation matches the consequent of axiom (2), triggering a process of abductive inference that can be used to establish Explanation. At this point, X will become bound to city_council, Y to demonstrators and Z to permit. Both the follow-ons (1a,b) provide information that can be used to establish one of the conjuncts in the antecedent of the axiom, thereby establishing a causal connection between the clauses. Clause (1a) is represented as in (4), in which the unbound variable T represents the pronoun they.
(4) fear(T, violence)
This predication unifies with the first conjunct in the antecedent of axiom (2), forcing the unification of the variables T and X. Since X is already bound to city_council, the variable T representing they also receives this binding, and the pronoun is therefore resolved. Likewise, clause (1b) is represented in (5).
(5) advocate(T, violence)
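The binding steps described for (4) and (5) can be mimicked with a small unification routine. The following is a minimal sketch only: the function names (`unify`, `resolve`), the tuple encoding of predications, and the convention that capitalised strings are variables are all assumptions made here for illustration. It hard-codes axiom (2) and is not an implementation of Hobbs's abductive inference system.

```python
# Illustrative sketch: resolve the pronoun variable T by unifying the second
# clause with a conjunct in the antecedent of axiom (2).
# Assumption: variables are capitalised strings, constants are lowercase.

def is_var(term):
    return isinstance(term, str) and term[:1].isupper()

def walk(term, bindings):
    """Follow bindings until reaching a constant or an unbound variable."""
    while is_var(term) and term in bindings:
        term = bindings[term]
    return term

def unify(a, b, bindings):
    """Unify two predications (name, arg1, ...); return new bindings or None."""
    if a[0] != b[0] or len(a) != len(b):
        return None
    result = dict(bindings)
    for s, t in zip(a[1:], b[1:]):
        s, t = walk(s, result), walk(t, result)
        if s == t:
            continue
        if is_var(s):
            result[s] = t
        elif is_var(t):
            result[t] = s
        else:
            return None  # two distinct constants cannot be unified
    return result

# Axiom (2): antecedent conjuncts and consequent.
ANTECEDENT = [("fear", "X", "V"),
              ("advocate", "Y", "V"),
              ("enable_to_cause", "Z", "Y", "V")]
CONSEQUENT = ("deny", "X", "Y", "Z")

def resolve(first_clause, second_clause, pronoun_var="T"):
    """Bind X, Y, Z from the first clause, then match the second clause
    against each antecedent conjunct and report the pronoun's binding."""
    bindings = unify(CONSEQUENT, first_clause, {})
    if bindings is None:
        return None
    for conjunct in ANTECEDENT:
        extended = unify(conjunct, second_clause, bindings)
        if extended is not None:
            return walk(pronoun_var, extended)
    return None

s1 = ("deny", "city_council", "demonstrators", "permit")   # (3)
print(resolve(s1, ("fear", "T", "violence")))      # clause (1a), as in (4)
print(resolve(s1, ("advocate", "T", "violence")))  # clause (1b), as in (5)
```

Under these assumptions the two calls return city_council and demonstrators respectively, because T is unified with X in the first case and with Y in the second.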
4 Both these concerns will be addressed later in the paper.
This predication also matches a conjunct in the antecedent of axiom (2), but in this case it is the second conjunct, which will necessitate the unification of the variables T and Y. Since Y is already bound to demonstrators, the representation of they also receives this binding. Thus, identification of the correct referent for the pronoun in both (1a) and (1b) is a by-product of establishing an Explanation relation. Despite the appeal of this example, the literature has largely rejected Hobbs’s approach in favour of methods that rely on more surface-level aspects of linguistic representation, such as the grammatical and thematic roles that potential antecedents occupy. There are no doubt reasons for this; for one, there are statistical tendencies in support of such preferences (e.g. a bias towards references to the previous subject as compared to other grammatical positions) that do not receive obvious explanations from a purely coherence-driven theory. Further, it is unclear how facts concerning incremental processing can be predicted by a coherence-driven account that relies on information that may not become available until well after the pronoun is encountered.4 Kehler (2002) extended Hobbs’s work by presenting a typology of coherence relations, most taken or adapted from those in Hobbs (1990), based on three general classes of ‘connection among ideas’ first articulated by Hume in his Inquiry Concerning Human Understanding—namely Resemblance, Contiguity in time or place and Cause or Effect [Hume 1955: 32 (1748)]. Kehler argues that these categories differ in the types of inference processes used to establish them; this distinction in turn affects how pronouns are interpreted. This, Kehler claims, explains why different heuristic preferences appear to dominate in different contextual circumstances. We will describe exemplar relations in each of the three categories (particularly Occasion,
Parallel and Result) as well as the manner in which establishing each interacts with pronoun interpretation by considering examples (6a–d):
(6) a. Bush narrowly defeated Kerry, and special interests promptly began lobbying him. [=Bush]
b. Kerry was narrowly defeated by Bush, and special interests promptly began lobbying him. [=Kerry]5
c. Bush narrowly defeated Kerry, and Romney absolutely trounced him. [=Kerry]
d. Bush narrowly defeated Kerry, and he quickly demanded a recount. [=Kerry]
The alternation found in examples (6a,b) can be used to argue for the existence of a grammatical subject preference. The difference in voice in the first clause results in different entities being realized in subject position, and most informants find that the favoured interpretation for the pronoun shifts accordingly. Kehler argues that the subject preference is most closely associated with examples that participate in the Contiguity relation Occasion (such as (6a,b)), which is defined as follows (adapted from definitions in Hobbs 1990):
Occasion: Infer a change of state for a system of entities from the assertion of S2, establishing the initial state for this system from the final state of the assertion of S1.
Occasion allows one to express a situation centered around a system of entities by using intermediate states of affairs as points of connection between partial descriptions of that situation. As such, the inference process that underlies Occasion attempts to equate the initial state of the second utterance with the final state of the first, performing inferences as necessary. Biases in pronoun interpretation in Occasion are therefore predicted to correspond to the relative degrees of salience of the event participants with respect to (the hearer’s mental representation of) the event’s end state. As the grammatical subject is the canonical place to mention the topic of a sentence—in the sense that, information structurally, (6a) highlights what Bush did, whereas (6b) highlights what happened to Kerry—it stands to reason that the degree of salience accorded to Bush and Kerry would differ between (6a,b), and with it, the preferred referent for the pronoun.6
5 The preference for Kerry in this case may rely to some degree on the hearer knowing that he is a US Senator, and thus, like Bush, is able to be lobbied.
6 Section 7 will present a refinement of this characterization.
Thematic role biases discussed in the literature can also be linked to Occasion. In a passage-completion experiment, Stevenson et al. (1994) found evidence for both a grammatical subject preference and a bias in favour of entities that occupy the Goal thematic role over those that occupy the Source. Whereas participants were considerably more likely to complete passages like (7a) in a way that requires he to refer to John rather than Bill (here John is both the subject and the Goal), they are equally likely to complete passages like (7b) in a way that requires that he refer to Bill (a non-subject Goal) as John (a subject Source).
(7) a. John seized the comic from Bill. He____________
b. John passed the comic to Bill. He____________
Stevenson et al. thus conclude that there is both a subject assignment strategy and a Goal preference at work—which agree on a referent in (7a), but disagree in (7b)—and that the Goal preference may result from a bias towards focusing on end states. We will return to this topic in section 4, where we argue that the end-state bias is in part a by-product of the manner in which Occasion relations are established.
Example (6c) provides counterevidence to the subject preference we witnessed in (6a,b) since the preferred referent is the object of the first clause rather than the subject. Such examples have been used to argue for a grammatical role parallelism preference, which favours entities that occupy the same grammatical role as the pronoun (Sheldon 1974; Smyth 1994; Chambers and Smyth 1998, cf. footnote 3). Kehler argues that this preference is closely associated with Resemblance coherence relations such as Parallel, in which commonalities and contrasts among corresponding sets of parallel relations and entities are established:
Parallel: Infer P(a1, a2, . . .) from the assertion of S1 and P(b1, b2, . . .) from the assertion of S2, for a common P and similar ai and bi.
In (6c), the entities Bush and Romney are parallel, as are Kerry and the referent assigned to him. We will henceforth refer to such pairs ai and bi for some i as parallel elements.
Examples cited to support a grammatical role parallelism preference are often characterized by Parallel relations, as are the typical stimuli found in psycholinguistic research in support of the preference (Smyth 1994; Chambers and Smyth 1998). The bias towards a pronoun’s parallel element in these constructions is very strong; informants are almost unanimous in judging the pronoun in (6c) to refer to Kerry (assuming that the pronoun is not contrastively accented; more on this in section 3.5). The question, then, is why the same effect is not seen in example (6a). Note that this preference is not straightforwardly predicted on a coherence-driven theory, since assigning either referent in (6c) would result in a perfectly coherent Parallel relation. Kehler (2002) offers a rationale for this association, but that position will be revised in section 3.5.
Finally, example (6d), repeated below as (8), is an instance of the Result relation, which, like the previously discussed Explanation relation, is in the Cause–Effect category.
(8) Bush narrowly defeated Kerry, and he quickly demanded a recount. [=Kerry]
Establishing a Cause–Effect relation requires that a causal link be identified between the propositions denoted by the utterances in a passage. The Result relation is essentially the same as Explanation except that the cause precedes the effect:
Result: Infer P from the assertion of S1 and Q from the assertion of S2, where normally P → Q.
The analysis of example (8) would follow the spirit of the analysis of examples (1a,b). As this example violates both the subject and grammatical parallelism preferences, it argues instead for a ‘common sense’ preference, since the interpretation of the pronoun appears to be determined by the same world knowledge that is used to establish the coherence of the passage, specifically that one would expect the loser of an election to demand a recount rather than the winner. To sum, we have described three categories of coherence relation that are associated with three underlying inference processes, which in turn appear to be correlated with different types of pronoun interpretation biases. In the sections that follow, we describe psycholinguistic experiments intended to evaluate the evidence for these biases in the context of a coherence-driven analysis.
3 GRAMMATICAL ROLE PREFERENCES
The first of our studies addresses the conflict between the subject and grammatical parallelism preferences in light of the coherence analysis. Much of the motivation for the experiment and its design draw from the work of Smyth (1994), and additional aspects are motivated by the work of Wolf et al. (2004). We briefly describe these two works in turn, and then follow with a discussion of our experiment.
3.1 Smyth (1994)
Smyth (1994) posits an Extended Feature Match Hypothesis (EFMH), which characterizes pronoun assignment as a search process based on feature matching that predicts that a ‘pronoun with two or more grammatically and pragmatically possible antecedents in a preceding clause will be interpreted as coreferential with the candidate that has the same grammatical role’ (p. 197). Whereas we have thus far cast the
(9) Mary helped Julie change the tire and then she helped Peter change the oil.
Participants were asked to fill a blank by writing the name of the person that they understood the pronoun to refer to. The results overwhelmingly favoured parallel assignment; 100% of the subject pronouns were assigned to the preceding subject and 88.12% of the non-subject pronouns were assigned to the non-subject referent. Experiment 3 tested the prediction that a reduction in the parallelism between the clauses should reduce the number of parallel responses. It varied three factors: grammatical role parallelism for the non-subjects (parallel v. not parallel), full syntactic parallelism (no adjunct v. adjunct), and pronoun position (subject or non-subject). The results further supported parallel assignment, as the percentage of parallel assignments ranged from 64% to 90% across conditions. There were also main effects of adjunct parallelism and grammatical role parallelism: Cases in the non-parallel adjunct condition received fewer parallel assignments than those in
7 The first experiment was a small study to test the role of context sentences in the experiments of Crawley et al. on their results. The fourth tested a variety of effects in cases involving subordinate structures, which will not concern us here.
parallelism preference as a heuristic, it is worth noting that Smyth explicitly denies this view, stating that ‘PF [=parallel function] is not a special default strategy, but rather an epiphenomenon arising from the structure of the coreference processor’, and thus ‘there is no sense in which it is an independent rule or strategy to be acquired’. Instead, coreference is established by a feature-match process, and due to a priming effect, the identity of the grammatical role filled by the referent is available as one of the criteria for matching, along with other features (e.g. number, gender). A lack of full syntactic parallelism between the clauses—such as when one clause contains an adjunct and the other does not—is predicted to prevent syntactic priming and reactivation, resulting in fewer parallel interpretations (pp. 206–7). We will focus on the two of Smyth’s four experiments that are central to our analysis, his Experiments 2 and 3.7 Both are argued to provide evidence for the EFMH, and hence to contradict the claim of Crawley et al. (1990) that parallel function ‘is not important for understanding pronouns in text’. The Experiment 2 materials were constructed by taking 20 of the stimuli of Crawley et al. and modifying them so that the clauses were fully parallel syntactically. The non-subject roles were varied between direct, indirect and prepositional objects. A sample passage is given in (9).
3.2 Wolf et al. (2004) Wolf et al. (2004) previously tested the predictions of Kehler (2002) against both the grammatical subject and grammatical role parallelism preferences in a reading time experiment that manipulated coherence frame (Parallel/Result) and pronoun gender (masculine/feminine). Coherence was signalled by manipulating the verb in the context sentence as well as a connective between the first two clauses, specifically and similarly (which signals a Parallel relation) and and so (which signals a Result relation). In half of the stimuli, the referent indicated by pronoun gender supported the coherence frame, and in the remaining half it did not. Examples are given in (10a,b). (10) a. Fiona complimented Craig and similarly James congratulated her/him after the match, but nobody took any notice. b. Fiona defeated Craig and so James congratulated her/him after the match, but nobody took any notice. For the Parallel stimuli, faster reading times were measured when the antecedent was in a parallel grammatical role than when it was not. For the Result stimuli, which were semantically biased towards a non-parallel referent, faster reading times were measured for non-parallel antecedents. Wolf et al. thus confirmed that preferences for pronoun interpretation can be reversed by manipulating coherence, per Kehler (2002). Several questions remain that warrant investigation, however. First, Wolf et al. used gender-unambiguous pronouns, which, in the causal continuations, resulted in interpretations that were less coherent in the parallel antecedent condition than in the non-parallel condition (consider the variant of (10b) with the pronoun him v. her). As a result, the increased reading times could have been caused by this incoherence
the parallel condition, and similarly cases in the non-parallel role condition received fewer parallel assignments than those in the parallel condition. As pointed out by Kehler (2002), however, an examination of Smyth’s syntactically parallel stimuli suggests that his modifications to the examples of Crawley et al. may have introduced a confound, in that in some cases they also changed the operative coherence relation from Occasion to Parallel, whereas Occasion appears to be more highly represented in his non-parallel stimuli. Hence, our first experiment controls for and manipulates syntactic parallelism and coherence separately. So as to keep the results as directly comparable as possible, our design will otherwise follow Smyth’s fairly closely, particularly with respect to being an offline task in which readers are explicitly asked for their pronoun assignments.
3.3 Experiment 1
The present experiment addresses a variety of factors by independently varying sentence structure, pronoun position and coherence relation in an ambiguous pronoun resolution task (Kertz et al. 2006). Two versions of each preference are evaluated—a ‘basic’ version, which characterizes it as a single, all-purpose processing strategy, and a particular ‘modified’ version, which corresponds more closely to one of the aforementioned proposals in the literature. The basic version of the grammatical subject preference states an across-the-board preference for antecedents that occupy the subject position of the previous clause. The modified version allows for the possibility that the subject preference will not override an interpretation favoured by a strong pragmatic bias (Crawley et al. 1990). The basic version of the grammatical role parallelism preference states an across-the-board preference for antecedents that occupy the same grammatical role as the pronoun. The modified version adds an additional constraint requiring that the syntactic structures of the two clauses be fully parallel; otherwise the grammatical subject preference is invoked (Smyth 1994).8
8 As Crawley et al. did for the grammatical subject preference, Smyth likewise appeals to the idea that plausibility factors could limit the applicability of the parallel grammatical role preference. We address this proviso in section 3.4.
rather than by the pronoun interpretation process. Second, the stimuli of Wolf et al. contained only object pronouns, and thus each possible interpretation was supported by either the grammatical subject preference or the grammatical role parallelism preference. We ask whether similar results would be found for subject pronouns with object antecedents, which are dispreferred by both preferences. Third, since their stimuli all include a prepositional phrase in the second clause but not in the first, the passages did not have fully parallel structure. Whereas this property is irrelevant to a general grammatical role parallelism preference, it makes their results potentially compatible with the EFMH’s prediction that a lack of full syntactic parallelism will result in a reduced parallelism bias. Finally, it has been proposed that connectives carry their own focusing properties (Stevenson et al. 1994, 2000) that can affect antecedent selection, such that the use of and similarly and and so in the data of Wolf et al. could be claimed to redirect the current focus of attention in different ways. While we find this idea to be uncompelling in several respects (see footnote 25), we can test whether similar effects will be found for stimuli without connectives by not relying on them to disambiguate coherence.
3.3.1 Stimuli
In a 2 × 2 × 2 design, stimulus sets were constructed with eight variants, as in (11a–d). Each stimulus contains two clauses: an introduction and a follow-on that contains an ambiguous pronoun. Both clauses contain a transitive verb in active voice.
(11) Samuel threatened Justin with a knife, and
a. . . . Erin blindfolded him (with a scarf). [Parallel]
b. . . . Erin stopped him (with pepper spray). [Result]
c. . . . he blindfolded Erin (with a scarf). [Parallel]
d. . . . he alerted security (with a shout). [Result]
3.3.2 Participants Thirty-two undergraduates from the University of California San Diego (UCSD) participated for extra course credit. All were self-reported monolingual speakers of English.
Sixteen stimulus sets were constructed for a total of 128 experimental stimuli. Each set varied pronoun position (subject/object), sentence structure (fully/partially parallel) and coherence relation (Parallel/Result). With possible antecedents as the subject and object of the first clause, we are able to test the full 2 × 2 configurations of possible coreference patterns. As in Wolf et al. (2004), passages participating in Result relationships semantically favoured the non-parallel referent, whereas those participating in Parallel relations incorporated no semantic bias. The modified grammatical subject preference can thus be evaluated by analysing the Parallel condition only. The distinction between full and partial syntactic parallelism between the clauses was implemented by either including or excluding a modifier phrase in the second clause to match the modifier in the first clause, which were varied between preverbal adverbial phrases and post-verbal prepositional phrases, balanced across sets. Varying the stimuli across this dimension allows us to determine if pronoun interpretation is affected by the existence of full v. partial syntactic parallelism, as predicted by the modified grammatical role parallelism preference, when coherence is controlled for separately. Coherence type was assessed in a norming phase, during which three trained judges, blind to our hypothesis, were asked to categorize stimuli as instances of either Parallel coherence or Result coherence. All three judges agreed on the coherence relation for 119 out of 128 total stimuli. For the remaining nine stimuli, two of three judges agreed with an averaged confidence level above a pre-determined threshold.
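The counts reported for the stimuli can be checked by enumerating the design directly. The snippet below is a minimal sketch: the variable names and the string labels for factor levels are assumptions chosen here for illustration, and only the factor structure, the number of sets and the agreement figure come from the description above.

```python
from itertools import product

# Factor levels as described in the text; the labels are illustrative.
PRONOUN_POSITION = ("subject", "object")
SENTENCE_STRUCTURE = ("fully parallel", "partially parallel")
COHERENCE_RELATION = ("Parallel", "Result")
N_SETS = 16

# The 2 x 2 x 2 crossing yields the eight variants within each stimulus set.
conditions = list(product(PRONOUN_POSITION, SENTENCE_STRUCTURE, COHERENCE_RELATION))

# Every set is realised in every condition.
stimuli = [(set_id, cond) for set_id in range(1, N_SETS + 1) for cond in conditions]

# Norming: stimuli on which all three judges agreed.
unanimous = 119
agreement_pct = 100 * unanimous / len(stimuli)

print(len(conditions))          # 8 variants per set
print(len(stimuli))             # 128 experimental stimuli
print(round(agreement_pct, 1))  # percentage with unanimous agreement
```

On these figures, unanimous agreement covers roughly 93% of the stimuli, with the remaining nine decided by a two-of-three majority.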
3.3.3 Task A repeated measure design was used, in which each participant was tested on two stimuli from each of the eight types, with no two variants from the same set presented to the same participant. The two replications were block randomized, and the 16 experimental stimuli were interleaved with 24 distractors (16 of which also contained ambiguous pronouns and 8 of which contained unambiguous pronouns). The resulting sixteen lists were then reversed to rule out ordering effects, yielding 32 unique stimulus lists. Participants were presented with a paper and pencil task, for which they read a two-clause passage and answered a question immediately after, as in (12).
(12) Samuel threatened Justin with a knife, and he blindfolded Erin with a scarf. Who blindfolded Erin?
The participant’s answer was taken to indicate the antecedent selected in interpreting the ambiguous pronoun.
3.3.4 Predictions As we have characterized it, the ‘basic’ form of the grammatical subject preference predicts a strong bias towards interpreting all pronouns to refer to the subject of the previous clause. The modified form predicts the same bias, but only in Parallel relations since the Result stimuli were pragmatically biased. The ‘basic’ form of the grammatical role parallelism preference predicts a strong bias towards interpreting subject and object pronouns to refer to subject and object antecedents, respectively. The modified grammatical role parallelism preference makes the same predictions, but only for the stimuli in the fully parallel condition. The coherence hypothesis makes the same predictions as the basic grammatical role parallelism preference for Parallel coherence stimuli (regardless of the full/partial syntactic parallelism distinction), but predicts an interpretation bias towards grammatically non-parallel referents for Result stimuli.
3.3.5 Results The results followed the predictions of the coherence hypothesis, confirming the expected interaction between pronoun position and coherence type, but were not consistent with the other hypotheses. The raw number of subject v. object assignments for each of the eight conditions is shown in Table 1. Table 2 organizes the results according to the predictions of each account. These results show that the manipulations to test the basic and modified forms of both the grammatical subject and grammatical parallelism preferences all resulted in near 50/50 splits, whereas the
Coherence   Syntax         Pronoun position   Subject ante   Object ante
Parallel    Parallel       Subject            64             0
Parallel    Parallel       Object             5              59
Parallel    Non-parallel   Subject            61             3
Parallel    Non-parallel   Object             8              56
Result      Parallel       Subject            2              62
Result      Parallel       Object             59             5
Result      Non-parallel   Subject            4              60
Result      Non-parallel   Object             61             3
Table 1 Results of Experiment 1 by condition
Condition                                        Subject ante   Object ante   n
The subject preference
  All pronouns                                   0.52           0.48          512
The qualified subject preference
  Non-biasing context (Parallel coherence)       0.54           0.46          256
The parallel structure preference
  Subject pronouns                               0.51           0.49          256
  Object pronouns                                0.52           0.48          256
The qualified parallel structure preference
  Subject pronouns (fully parallel structure)    0.52           0.48          128
  Object pronouns (fully parallel structure)     0.50           0.50          128
The coherence hypothesis
  Subject pronouns (Parallel coherence)          0.98           0.02          128
  Subject pronouns (Result coherence)            0.05           0.95          128
  Object pronouns (Parallel coherence)           0.10           0.90          128
  Object pronouns (Result coherence)             0.94           0.06          128
Table 2 Results of Experiment 1 by analysis
predictions of the coherence analysis were all confirmed with at least a 90/10 split. The dependent measure for our statistical analyses was the rate of assignments to the subject antecedent (subject and object assignments received scores of 1 and 0, respectively). A full factorial analysis of variance was conducted with pronoun position (subject/object), sentence structure (fully/partially parallel) and coherence relation (Parallel/Result) as factors, with separate analyses treating participants (F1) and items (F2) as random variables. The analysis confirms that the interaction between coherence type and pronoun position, predicted by the coherence hypothesis, is significant [F1(1, 31) = 1379.23, P < 0.0001; F2(1, 15) = 2016.158, P < 0.0001]. A second smaller effect, which we did not predict, was found for coherence alone [F1(1, 31) = 4.429, P < 0.05; F2(1, 15) = 7.105, P < 0.05], where subject antecedents were selected more often in Parallel coherence relations than in Result relations. Collapsing across conditions, the overall mean score was 0.516 ± 0.062. A one-sample t-test comparing this mean to a hypothetical mean of 0.5 demonstrates that the overall rate of subject antecedent assignment is not significantly different from chance, contra the grammatical subject preference. Whereas the main effect of coherence described above could potentially be interpreted as slight support for the modified subject preference, this effect is overwhelmed by the effect predicted by the coherence analysis. The main effect of pronoun position, predicted by the grammatical role parallelism preference, is not statistically significant; nor is the interaction between sentence structure and pronoun position predicted by the modified grammatical role parallelism preference. There was no significant effect of structure, and no significant interaction between structure and coherence. Likewise, there was no significant three-way interaction among coherence, syntactic structure and pronoun position. As such, lack of parallel structure did not impact the likelihood of a parallel pronoun assignment in the Parallel condition, for either subject or object pronouns. Finally, modifier type (pre-verbal adverbial v. post-verbal prepositional phrase) was not a significant factor alone or within any interaction. These results support the coherence hypothesis, confirming that pronoun interpretation preferences can be triggered or suppressed by manipulating coherence relations.
They also suggest that the contradictory results reported in the literature to date may stem at least in part from a failure to control for coherence in experimental stimuli. We forgo a detailed discussion of these results with respect to the grammatical subject preference until section 7, since Experiments 2 and 3, to be presented subsequently, are relevant to that discussion as well. We discuss these results with respect to the EFMH further in the next section. We then follow up in section 3.5 with a semantic analysis that, we claim, demonstrates that the grammatical role parallelism preference is an epiphenomenon of the interaction of information structure and accent placement in Parallel relations.
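The aggregate proportions in Table 2 can be recomputed from the raw counts in Table 1. The following is a minimal sketch of that bookkeeping; the names `COUNTS` and `subject_rate` are illustrative assumptions and not part of the original study, and the cell labels assume the reading of Table 1 in which each of the eight conditions contributes 64 responses.

```python
# Raw counts from Table 1, keyed by (coherence, syntax, pronoun position).
# Each value is (subject-antecedent responses, object-antecedent responses).
COUNTS = {
    ("Parallel", "parallel", "subject"): (64, 0),
    ("Parallel", "parallel", "object"): (5, 59),
    ("Parallel", "non-parallel", "subject"): (61, 3),
    ("Parallel", "non-parallel", "object"): (8, 56),
    ("Result", "parallel", "subject"): (2, 62),
    ("Result", "parallel", "object"): (59, 5),
    ("Result", "non-parallel", "subject"): (4, 60),
    ("Result", "non-parallel", "object"): (61, 3),
}


def subject_rate(coherence=None, syntax=None, pronoun=None):
    """Return (rate of subject-antecedent assignments, n) over matching cells.

    A factor left as None is collapsed over (hypothetical helper for illustration).
    """
    subject_total = 0
    n = 0
    for (coh, syn, pro), (subj, obj) in COUNTS.items():
        if coherence in (None, coh) and syntax in (None, syn) and pronoun in (None, pro):
            subject_total += subj
            n += subj + obj
    return subject_total / n, n


if __name__ == "__main__":
    # Print the rows of Table 2 that the coherence hypothesis is evaluated on.
    for coh in ("Parallel", "Result"):
        for pro in ("subject", "object"):
            rate, n = subject_rate(coherence=coh, pronoun=pro)
            print(f"{pro} pronouns, {coh} coherence: {rate:.2f} (n = {n})")
```

For instance, `subject_rate(coherence="Parallel", pronoun="subject")` gives 125/128, which rounds to the 0.98 reported for subject pronouns under Parallel coherence, while `subject_rate()` with no arguments gives 264/512, the overall rate of about 0.516 discussed above.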
3.4 Comparison with the EFMH
We conclude from the results of Experiment 1 that coherence is the dominant factor in determining parallel reference assignments, and not grammatical structure: (i) parallel structure did not give rise to parallel coreference in Result stimuli and (ii) reduced syntactic parallelism did not reduce the likelihood of a parallel interpretation in Parallel stimuli. As we have discussed, these results contrast with the predictions of the EFMH, which characterizes pronoun interpretation as a feature-matching process that is sensitive in part to the degree of syntactic parallelism between clauses. With respect to result (i) above, however, it should be noted that Smyth acknowledges that pragmatic biases might be at work in some examples: ‘in some cases, a conjunction can introduce a pragmatic bias which is incompatible with a PF interpretation’ (p. 208). He cites example (13), in which a causal interpretation supports interpreting the non-subject pronoun him as coreferent with the subject Phil:
(13) Phil tickled Shanley, and (so) Liz poked him.
While it is not completely clear to us how such biases are predicted to interact with the feature-matching mechanism of the EFMH, example (13) is of the sort employed in our stimuli for the Result condition. Following Sheldon (1974), however, Smyth also correctly notes that the parallelism effect is so strong that it can seemingly trump gender mismatches (see also Oehrle 1981):
(14) William bumped Bonnie and ?she/SHE poked Rod.
That is, example (14) is infelicitous without accent on she (cf. 13), even though there is only one female referent available.9 This fact weakens the force of Smyth’s appeal to pragmatic biases with respect to examples like (13), however, since one is left with a parallelism effect that is so strong that it can withstand a firm semantic constraint like a gender mismatch but yet is soft enough to be overridden by a more pliable pragmatic bias. And the fact of the matter is that pragmatic biases cannot override parallel function if the operative coherence relation is Parallel, as pointed out in Kehler (2002). Consider (15):
(15) Condi Rice admires Hillary Clinton, and George W. Bush absolutely worships her.
Assuming a Parallel relation with a deaccented her, informants reliably report that the referent must be Clinton, despite a strong pragmatic bias towards Rice given the political persuasions of the politicians involved. It therefore needs to be explained why plausibility can save a nonparallel pronoun interpretation in (13) but not in (15). The crucial difference between the acceptable (13) and the unacceptable version of (14) is that (13) participates in a Result relation, whereas (14) participates in a Parallel relation. Plausibility only comes into play in determining which referent makes for a coherent Result relation in (13).
9 We will offer an explanation for this infelicity in the next section.
3.5 A semantic analysis of the grammatical role parallelism preference
Experiment 1 revealed a dramatic bias in Parallel (but not Result) coherence relations towards a referent in a parallel grammatical role, across both the subject pronoun and object pronoun conditions. In this section, we ask why Parallel coherence is so strongly aligned with parallel coreference, giving the appearance that a parallel grammatical role bias is at play. We argue that the bias emerges from the interaction between coherence relations and information structure, for reasons that are independent of a theory of pronoun interpretation.
3.5.1 Parallelism effects That there would be an association between Parallel coherence and parallel coreference may, at first blush, seem unsurprising on a coherence-driven analysis. After all, when establishing Parallel coherence, the inference mechanism attempts to establish points of similarity between a pronoun and its parallel element. It stands to reason that the way to establish maximal similarity is to assume coreference between the two. Indeed, Kehler (2002) posited an analysis of just this sort. This observation does not fully explain the behaviour we discussed in the previous section, however. For one, as we just saw, the parallelism effect is recalcitrantly strong as compared to other types of preferences noted in the literature, able to withstand strong pragmatic biases (15) and even gender conflicts (16).
(16) Condi Rice admires Donald Rumsfeld, and George W. Bush absolutely worships her. [=Rumsfeld?]
Yet the strength of the parallelism bias cannot be attributed only to the semantics of the Parallel relation, since substituting a mention of either referent by name in place of the pronoun in either of these examples results in a perfectly coherent Parallel passage.
No other preference proposed in the literature is resilient to grammatical and world-knowledge influences in a similar way. Yet, as we have already seen, the effect simply appears to go away when the operative coherence relation is non-Resemblance. That is, the coreference pattern that was infelicitous for (15) and (16) is perfectly acceptable on a Result interpretation, per (17a,b), respectively.
(17) a. Condi Rice defeated Hillary Clinton and George Bush congratulated her.
b. Condi Rice defeated John Kerry and George Bush congratulated her.
Lest there be any doubt that these different interpretation patterns are due to the difference in coherence type, we can ask whether passages that are ambiguous between Parallel and Result construals enforce different constraints on the interpretation of unaccented pronouns across the two coherence construals. This is indeed the case; consider (18):
(18) Powell defied Cheney, and Bush punished him. (Kehler 2002)
On the Parallel construal of (18) (paraphrase ‘and’ as ‘and similarly’), him can only refer to Cheney if unaccented (i.e. it can refer to Powell only if it receives accent). On the other hand, on the Result construal (paraphrase ‘and’ as ‘and as a result’), him can refer to Powell if it is unaccented. All these data show a clear pattern whereby Resemblance relations (e.g. Parallel) require an unaccented pronoun to corefer with its parallel element, whereas pronouns in non-Resemblance (e.g. Result) relations are not similarly constrained. In the remainder of this section, we argue that these facts are predictable from the manner in which different coherence relations partition utterances information structurally with respect to focus and background, and how this partition in turn determines the placement of accent on referring expressions (whether pronominal or not) within an utterance. The analysis explains the data that have been used to support a grammatical role parallelism preference without appeal to any pronoun-specific interpretation mechanisms or strategies.
3.5.2 Coherence, coreference and accent The idea that the aforementioned facts are unrelated to pronominalization goes against the common wisdom in the literature, which often treats accented pronouns in English as being governed by special rules or associated with specific discourse functions. For example, Kameyama (1999) proposes a Complementary Preference Hypothesis, which says that ‘a focused pronoun takes the complementary preference of the unstressed counterpart’, that
is to say, one first computes the preferred referent for an unaccented pronoun, and then selects an entity from the remainder of the ‘currently salient’ set of entities. Similarly, Beaver (2004) offers an analysis in which Kameyama’s predictions result from partial blocking effects between accented and unaccented pronouns in a bidirectional-optimality-theoretic (OT) implementation of a Centering-based pronoun interpretation system. Smyth (1994) likewise posits that accented pronouns selectively block the parallel interpretation when the EFMH applies. Finally, Gundel et al. (1993), in their treatment of referring expressions and cognitive status, place unaccented and accented pronouns into two different categories (in focus and activated, respectively, p. 283, footnote 14). However, it turns out that all the aforementioned facts concerning coherence and accentuation are actually constraints on coreference rather than merely pronominalization (Akmajian and Jackendoff 1970; Venditti et al. 2002; de Hoop 2004). This can be seen by considering variants of our previous examples in which the pronouns are replaced by proper name mentions of their referents. In all these cases, the requirements on accenting the direct object (marked using capital letters) are insensitive to whether a full name or pronoun is used:
(19) Condi Rice_i admires Hillary Clinton, and George W. Bush absolutely worships {HER_i / RICE / #her_i / #Rice}. (cf. 15)
(20) Condi Rice_i admires Donald Rumsfeld, and George W. Bush absolutely worships {HER_i / RICE / #her_i / #Rice}. (cf. 16)
(21) Powell_i defied Cheney, and Bush punished {HIM_i / POWELL / #him_i / #Powell}. (cf. 18, on the Parallel reading)
Likewise, the lack of accenting on the pronoun in the Result cases remains when a proper name is used instead:
(22) Condi Rice_i defeated {Hillary Clinton / John Kerry} and George Bush congratulated {her_i / Rice}. (cf. 17a,b)
(23) Powell_i defied Cheney, and Bush punished {him_i / Powell}. (cf. 18, on the Result reading)
Therefore, the information structural constraint at work is one that relates coherence and coreference to accentuation, and is not specific to pronouns. Simply put, pronouns are not constrained to refer to their parallel elements in Parallel relations. Instead, the information structural constraints imposed by Parallel relations (but not Result relations) require that the pronoun, like any other referring expression, receive accent when it is not coreferential with its parallel element. The factors that determine the ability to pronominalize a mention and those that determine accentuation, while independent, interact to entail that unaccented pronouns in Parallel relations can only corefer with their parallel elements. As such, these data neither result from any special-purpose functions of accented pronouns nor can be used to support the existence of a grammatical role parallelism bias.
3.5.3 An analysis The facts described so far call instead for an explanation for why Parallel and Result relations differ information structurally, such that they impose different constraints on what elements of a sentence must receive accent. Kehler (2005) outlines an analysis, cast using the machinery of Schwarzschild’s (1999) optimization-driven theory of focus marking and accent placement, that accounts for these differences. We only summarize the arguments here.10 The crucial fact is that Parallel and Result relations will give rise to different F(ocus)-markings for otherwise similar (or, in the case of example 18, identical) examples, which in turn results in different distributions of accents. A brief discussion of Schwarzschild’s system should suffice to understand the argument. In Schwarzschild’s analysis, F-marking serves as the interface between semantics and phonology. On the semantics side, felicitous utterances are entailed by the prior discourse (that is, Given), with the proviso that F-marking a phrase effectively turns it into a ‘wildcard’ (or ‘F-variable’) when matching against an antecedent. For instance, in a context that mentions a red apple, the NP a [green]_F apple will be considered Given. On the phonology side, there is a constraint that FOC-marked nodes (F-marked nodes that are not immediately dominated by another F-marked node) must contain an accent. As such, the word green in a [green]_F apple will require accent.11 In establishing Givenness, FOC-marked nodes are assigned discourse antecedents by a function h; in the example just given, h will map the denotation of green to that of red. In Schwarzschild’s system, an OT-style optimization procedure solely determines h. Kehler (2005) argues against this aspect of the analysis, claiming that it cannot predict accent patterns for passages like example (18), repeated below as (24a), in which accent varies depending on the coherence relation inferred.
10 Readers who are not interested in the technical details of the argument can skip the remainder of this section without loss of continuity.
(24) a. Powell defied Cheney, and Bush punished him.
b. BUSH_F1 PUNISHED_F2 HIM_F3. (Parallel, HIM = Powell)
c. BUSH_F1 [PUNISHED_F2 him]_F3. (Result, him = Powell)
Kehler claims that h assigns different mappings to the two coherence construals. In particular, the mapping established for the Parallel relation is precisely the one that results from the identification of parallel elements (i.e. the a_i and b_i). As such, if the pronoun him refers to Powell in (24a), the Parallel relation (and hence h) will enforce the following mapping between entities and predicates in the second clause (left side of the equations) and their parallel elements in the first clause (right side of the equations):
(25) a. $[\![\text{Bush}_{F_1}]\!]^{g,h} = [\![\text{Powell}]\!]^{g}$
b. $[\![\text{punished}_{F_2}]\!]^{g,h} = [\![\text{defied}]\!]^{g}$
c. $[\![\text{Powell}_{F_3}]\!]^{g,h} = [\![\text{Cheney}]\!]^{g}$
Loosely speaking, this F-marking results in the background Who did what to whom. Because h applies only to F-marked constituents, him must be F-marked for this mapping to hold, and by FOC must be accented despite it representing Given information, per (24b). On the other hand, if him refers to Cheney in (24a), h need not map it to a distinct entity as it would then be coreferential with its parallel element. In this case, $[\![\text{Cheney}]\!]^{g,h} = [\![\text{Cheney}]\!]^{g}$, and Cheney becomes part of the background (i.e. Who did what to Cheney). Unlike the Parallel relation, however, F-marking in a Result relation is not governed by a pairwise mapping since its definition does not incorporate one. Instead, the F-marking in (24c) is favoured. In this partition, unlike that in (24b), Powell is part of the background, representing a shared variable in the causal relation used to establish coherence (e.g. the P in defy(P, C) → punish(B, P)). As such, it is not F-marked, and thus need not receive accent. The crucial fact to be abstracted from this brief synopsis is that Parallel relations, by way of establishing a mapping between parallel elements, give rise to a particular focus/background partition. A side effect of this partition is that a noun phrase (pronominal or not) that does not corefer with its parallel element will require accent regardless of its Givenness status in the remainder of the discourse. Result relations are not similarly restricted, and as such, the optimal focus/accent distribution will often result in the deaccenting of a noun phrase that denotes Given information without any parallelism restriction. Hence, we find different constraints at play in (24a) depending on the coherence relation that is construed. This analysis likewise explains the full set of interpretation patterns witnessed in (19–23), and in particular demonstrates how the resistance of the apparent parallelism bias to influences of semantic plausibility (19) and gender conflicts (20) results without recourse to any pronoun-specific principles.12 To summarize this section, our experiments and analysis show how a coherence-driven analysis predicts when evidence for the parallel grammatical role preference will emerge (particularly, in Resemblance relations like Parallel) and the underlying information structural reason why it does. As a result, there is no work left to be done by positing a separate parallel grammatical role bias or heuristic.
11 A variety of other rules and constraints are also at play, which we will not discuss here.
12 The above analysis is restricted to cases in which a common relation over parallel entities comprises the background (the ‘common topic’), which is the case in all of the examples we have considered. Oehrle (1981) notes that in other ‘discourse frames’ a pronoun can remain deaccented even when not coreferential with its parallel element, as in (i):
(i) A: Can you give me an exact description of Bill’s role in the fight? B: John hit Bill_b and he_b hit Max.
The difference between this example and the others is that the context sets up Bill’s participation as the background, as opposed to the question Who hit who? Our analysis predicts this accent pattern given that A’s question is the antecedent of both clauses in B’s response, rather than the first clause of B’s response serving as antecedent to the second. Also, whereas we have focused on accented pronouns in Parallel relations since those are the cases relevant to our argument, accented pronouns can of course occur outside of Parallel relations. For the results of a corpus analysis see Wolters and Beaver (2001), who conclude that most instances of accented pronouns in their data can be seen as signalling rhetorical contrast, of which the examples discussed here would presumably constitute one type. See also Kehler (2005) for a discussion of examples that involve accented pronouns in Result relations (for example, John_i pushed Bill and HE_i/JOHN_i fell) in which accentuation is similarly orthogonal to the form of referring expression used.
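The accent constraint summarized in section 3.5.3 can be illustrated with a deliberately simplified sketch. It is not an implementation of Schwarzschild's optimization procedure or of Kehler's (2005) analysis: the function name `requires_accent`, the set `RESEMBLANCE` and the boolean `given` flag are assumptions made for illustration, and the Result branch encodes only the tendency for Given noun phrases to be deaccented.

```python
# Coherence relations that impose a pairwise mapping between parallel elements.
# Only Parallel is listed here; treating other Resemblance relations the same
# way would be a further assumption.
RESEMBLANCE = {"Parallel"}


def requires_accent(relation, referent, parallel_element, given=True):
    """Simplified rule: must this referring expression be accented?

    The rule ignores the form of the expression (pronoun or name), reflecting
    the claim that the constraint concerns coreference, not pronominalization.
    """
    if relation in RESEMBLANCE:
        # The expression is F-marked, and so accented, unless it corefers
        # with its parallel element in the first clause.
        return referent != parallel_element
    # No parallel-element mapping: a Given referent can stay unaccented,
    # while a referent that is not Given is assumed to need accent.
    return not given


if __name__ == "__main__":
    # Example (24a): "Powell defied Cheney, and Bush punished him."
    # The object of the second clause is parallel to "Cheney".
    for relation in ("Parallel", "Result"):
        for referent in ("Powell", "Cheney"):
            accent = requires_accent(relation, referent, "Cheney")
            print(relation, referent, "accented" if accent else "unaccented")
```

On the Parallel construal of (24a), the sketch requires accent when the object refers to Powell and not when it refers to Cheney; on the Result construal, reference to Powell is compatible with an unaccented form, matching the pattern in (21) and (23).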
4 THEMATIC ROLE BIASES
A finding of Experiment 1 was that participants reliably interpret a subject pronoun to refer to a non-subject referent in Result relations if the semantics of the passage supports that interpretation. This possibility is not limited to Result relations, however. As we indicated in section 2, Stevenson et al. (1994) report on a series of story-completion experiments that suggest that the occupants of some thematic roles are systematically preferred to others. Of particular interest here are the patterns they found for passages with a transfer-of-possession context sentence followed by an ambiguous pronoun, as in (26):
(26) John handed a book to Bob. He____________
In such cases, the subject fills the Source role and the object of the prepositional phrase fills the Goal role. Participants were asked to provide a natural completion to the pronoun prompt provided in the second sentence, and the pronoun was then categorized as referring to the Source or the Goal. They found that Goal continuations, that is those which correspond to a Goal interpretation for the pronoun, occurred about as frequently as Source continuations (a 49–51% split). The result seems intuitive enough: In a passage such as (27), in which the Occasion relation is operative, pronominal reference to Bob appears to be unobjectionable:
(27) John handed a book to Bob. He began reading it.
Yet this is unexpected in light of the grammatical subject and grammatical role parallelism preferences, since both point to John as the preferred referent. Whereas participants could have first assigned the pronoun using these biases and then written a continuation that accommodated that assignment, apparently this is not what happened. Stevenson et al. describe two potential explanations for their result. The first is a thematic-role bias which amounts to a heuristic that ranks Goals above Sources. The second is a bias for focusing on the end state of the previously described event, under the assumption that the Goal is more salient to the end state than the Source. Stevenson et al. ultimately argue for the end-state bias; under this interpretation, the apparent heuristic preference for Goals is an epiphenomenon. Our coherence analysis predicts an end-state bias, but only specifically for passages related by Occasion. Recall that in our analysis, the different biases underlying pronoun interpretation are ultimately traceable to properties of the inference processes that are used to establish coherence. Among the coherence relations discussed in section 2, Occasion is the only one that specifically incorporates a bias towards focusing on the end state of the previous eventuality:
Occasion: Infer a change of state for a system of entities from the assertion of S2, establishing the initial state for this system from the final state of the assertion of S1.
As such, the coherence analysis would predict that different pronoun interpretation biases will emerge for different coherence relations, and in particular, that Occasion relations will give rise to a Goal preference.
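One way to make this prediction concrete, offered here as an illustrative sketch rather than as the formal model of the analysis, is to treat the overall bias towards a referent as a mixture over coherence relations, in line with the idea that expectations about the next-mentioned entity are conditioned on the coherence relation. The symbol CR (ranging over coherence relations such as Occasion, Explanation and Elaboration) and the probabilistic notation are assumptions introduced for this illustration:

```latex
P(\text{referent}) \;=\; \sum_{CR} P(CR)\; P(\text{referent} \mid CR)
```

On this view, an apparent overall Goal preference would arise only to the extent that Occasion relations are frequent and $P(\text{Goal} \mid \text{Occasion})$ is high; changing the distribution $P(CR)$ would shift the overall bias even if each conditional bias stayed the same.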
4.1 Experiment 2
An experiment was designed to distinguish the two possible explanations of Stevenson et al., as well as to test the predictions of the coherence analysis (Rohde et al. 2006). Passages like (26) were paired with versions in which the imperfective form of the main verb was used (28).
(28) John was handing a book to Bob. He____________
Crucially, the thematic roles remain the same in examples (26) and (28), but the perfective verb in (26) describes a completed event which is compatible with end-state focus, whereas the imperfective verb in (28) describes an event as an ongoing process, making it incompatible with end-state focus (Moens and Steedman 1988). The thematic role preference thus predicts a similar distribution of Source and Goal interpretations between the two conditions, whereas the event-structure hypothesis predicts a greater percentage of Source interpretations in the imperfective condition than in the perfective condition. We focus the present discussion on testing these predictions, and will return to the predictions of the coherence analysis momentarily.
4.1.1 Stimuli Twenty-one experimental stimuli consisted of a transfer-of-possession context sentence followed by an ambiguous pronoun prompt, as in (26) and (28). Participants saw either the perfective or the imperfective form of each verb, but not both. The Source referent always appeared in subject position, and the Goal was always the object of a to-phrase. All verbs described physical transfer events (e.g. hand, throw); we excluded verbs that described abstract or conceptual transfer (e.g. show, teach). We also included 29 filler passages with non-transfer verbs (transitive and intransitive) in the context sentence that varied between perfective or imperfective. The transitive verbs (Agent-Patient and Experiencer-Stimulus) varied in active and passive voice. Adverbs, proper names or gender-unambiguous pronouns served as prompts.
4.1.2 Participants Forty-eight monolingual English-speaking undergraduates at UCSD participated in the study for extra credit in linguistics courses.
4.1.4 Evaluation and analysis Two trained judges assessed the participants’ intended pronoun interpretations. Judges were instructed to be cautious, erring on the side of categorizing a pronoun as ambiguous if the pronoun could be plausibly interpreted as coreferential with either referent, even if their personal interpretation biases strongly indicated a particular one. As such, not all responses could be disambiguated.13 4.1.5 Results The results, shown in Table 3, indicate that pronoun interpretation is sensitive to verbal aspect: Imperfective context sentences yielded significantly more Source interpretations (70%) than perfective sentences [51%; F1(1, 47) ¼ 52.854, P < 0.0001; F2(1, 20) ¼ 30.079, P < 0.0001].14 As such, the event-structure hypothesis is supported over a thematic role bias, since the latter predicts no difference in the distribution of interpretations across conditions.
4.2 Effects of coherence These results suggest that the Goal bias is at least in part an epiphenomenon of a bias towards focusing on the end state of the previous eventuality. We now examine the main prediction of the 13 Our use of judges follows Arnold (2001). Stevenson et al. (1994) had participants circle their intended referents after completing the passages. However, they too ultimately relied on judges to remedy contradictions in the participants’ circling. 14 Table 3 excludes cases that were judged to be ambiguous.
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
4.1.3 Task Our design followed Stevenson et al. closely. Participants were asked to write continuations for the 50 passages. They were instructed to imagine a natural continuation to the story, writing the first continuation that came to mind and avoiding humour. As noted by Arnold (2001), in this task participants create a mental model of the event described by the context sentence before writing a continuation; as such, the task involves both interpretation and production. While the prompt constrains the surface realization of the subject to a pronoun, we hypothesize that their continuation depends in part on their expectations about how the discourse will proceed and which individual in the event will be mentioned again.
               Source   Goal
Perfective     0.51     0.39
Imperfective   0.70     0.17
Table 3 Results of aspect manipulation
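The relationship between the Table 3 proportions and the normalized Source bias cited in footnote 22 can be illustrated with a short sketch. It assumes, following footnote 22, that the share missing from each row corresponds to responses judged ambiguous; the names `TABLE_3` and `normalized_source_share` are introduced here purely for illustration, and the imperfective figure it prints is not reported in the text.

```python
# Proportions transcribed from Table 3. Assumption (per footnote 22): the
# remainder of each row corresponds to responses judged ambiguous.
TABLE_3 = {
    "Perfective": {"Source": 0.51, "Goal": 0.39},
    "Imperfective": {"Source": 0.70, "Goal": 0.17},
}


def normalized_source_share(row):
    """Source proportion after setting aside the ambiguous responses."""
    return row["Source"] / (row["Source"] + row["Goal"])


for aspect, row in TABLE_3.items():
    print(f"{aspect}: {normalized_source_share(row):.3f}")
```

Under this assumption the perfective row yields roughly 0.567, which is in line with the 56.7% figure given in footnote 22.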
15 Analysis of the imperfective condition revealed a different distribution of coherence relations, but a highly similar relationship between each coherence relation and the corresponding distribution of Source and Goal interpretations. The fact that the different distributions in Figure 3 can be attributed to a different distribution in coherence relations across the perfective and imperfective conditions provides further support for the coherence analysis. 16 This analysis is similar to one conducted by Arnold (2001), who ran a passage-completion experiment in a no-pronoun, full-stop condition, allowing participants to use either a pronoun or name to re-mention a referent at their discretion. Coding a coarser three-way split between cause, end point and other relations, she similarly found differences in the biases across continuation type. 17 These t-tests use subject means. The results over item means are consistent [Occasion: t(20) = 7.2642, P < 0.0001; Elaboration: t(19) = 69.7292, P < 0.0001; Explanation: t(19) = 9.1115, P < 0.0001 (one-sample t-tests)].
coherence analysis, specifically that the end-state bias will be primarily an epiphenomenon of establishing Occasion relations. As for other coherence relations, the predictions are as before: Resemblance relations (particularly Parallel) should favour a grammatically parallel antecedent, and Cause–Effect relations (e.g. Explanation, Result) will depend on the semantics incorporated in the passage and the referent to which causality or consequentiality is most likely to be imputed in a particular context. To test this prediction, our judges annotated all unambiguous responses with the coherence relation that held between the context sentence and the continuation. Judges resolved disagreements through discussion, following Stevenson et al. (2000). Our analysis is restricted to the perfective cases since only these are compatible with end-state focus.15 Six coherence relations were annotated: Occasion, Explanation, Result, Violated Expectation (another relation in the Cause–Effect category), Parallel and Elaboration (another relation in the Resemblance category), although Parallel, representing less than 2% of the continuations, is not analysed further.16 The results are shown in Table 4, which lists for each coherence relation its overall frequency and the percentage of pronoun interpretations to the Source. We found that Occasion relations were dominated by Goal continuations, whereas Elaborations and Explanations showed a Source preference [Occasion: t(45) = 5.3537, P < 0.0001; Elaboration: t(42) = 19.66, P < 0.0001; Explanation: t(30) = 6.4983, P < 0.0001 (one-sample t-tests)].17 The restriction of the
Coherence relation (n)      Percentage of corpus   Source bias
Occasion (171)              0.38                   0.18
Elaboration (126)           0.28                   0.98
Explanation (82)            0.18                   0.80
Violated Expectation (38)   0.08                   0.76
Result (25)                 0.06                   0.08
Table 4 Probabilities from Experiment 2 (perfectives)
5 ONLINE INTERPRETATION Summing up to this point, the previous two experiments have provided support for a coherence-driven theory of pronoun interpretation. 18 We also found a Goal bias for Result relations, but the small set of Result continuations (< 6%; n = 25) was very homogeneous, more than half consisting of the form X transfers Y to Z. Z thanks X, making extrapolation difficult. Whereas our coherence analysis would predict that causal inference plays a greater role in establishing Result relations than Occasion relations, the effect described by the second eventuality in a Result sequence is often a direct result of the end state brought about by the first, and thus it would perhaps not be surprising to find an end-state bias for Result relations as well. This notwithstanding, Stewart et al. (1998) show that verbs are highly variable with respect to their biases in Result relations; see section 6 for further discussion.
Goal preference to Occasion relations reinforces the conclusion that a generic thematic role preference is insufficient as a predictor of pronoun interpretation.18 Whereas our results support the conclusion of Stevenson et al. that the Goal preference is an epiphenomenon of a bias towards focusing on end states, they further show that the end-state bias is to a large degree an epiphenomenon of the inference processes used to establish Occasion relations. The bias towards the Goal simply disappears when either of the other two common relations (Elaboration or Explanation) is operative. While the context sentences in all of our perfective stimuli describe events with salient end states, the results summarized in Table 4 strongly suggest that it is the coherence relation that dictates the extent to which that end point is relevant. Occasion relations exhibit a clear preference for the Goal, as they are precisely the relations that rely specifically on the end state of an eventuality in establishing coherence. Thus, thematic role biases constitute another case in which a coherence-driven analysis can explain the underlying reasons we see evidence for an interpretation heuristic, as well as why this evidence emerges only in particular contextual circumstances.
19 It also follows recent work in sentence processing that contends that online measurements of interpretation difficulty can be successfully predicted by probabilistic, expectation-driven models (e.g. Hale 2001; Levy 2007). These models posit that the sentence processor implicitly makes predictions about what words are likely to come next in an utterance; degree of processing difficulty corresponds inversely with how well these expectations align with the material that is actually seen. Hale and Levy show that expectations can be estimated to good effect using generative models trained from online corpora, and that they predict a variety of reading time data that have been reported in the sentence processing literature. 20 All terms in (29) are of course conditioned on the current context as well.
Experiment 1 showed that the grammatical subject and parallel grammatical role preferences can be neutralized when coherence has been carefully controlled for in the stimuli. Experiment 2 supported the proposal of Stevenson et al. that event-structure biases are involved in pronoun interpretation (rather than thematic role biases), and furthermore localized them to those coherence relations that could be expected to encode such a bias as a side effect. In each case, we closely followed the design of the antecedent work to which we compared ourselves, which meant using offline methods for assessing interpretations. An obvious remaining question for a coherence-driven theory is what it predicts about incremental processing. There is a wealth of online evidence that language interpretation proceeds in a highly incremental fashion, and pronoun interpretation has been a rich source of such evidence (Caramazza et al. 1977; Gordon and Scearce 1995; Stewart et al. 1998; Koornneef and van Berkum 2006, inter alia). The question is how coherence establishment can influence pronoun interpretation in cases in which the pronoun is encountered before the coherence relation is known. We begin addressing this question in this section, and then continue in the sections that follow with respect to two case studies: implicit causality (IC) effects and the grammatical subject preference. 
Our proposal follows the lead of Arnold (2001), who hypothesized that referent accessibility is influenced by a hearer’s probabilistic expectations about what referents will be subsequently mentioned in the discourse, which are in part driven by expectations about how the discourse is likely to be continued.19 We focus our analysis on the role of coherence-driven expectations associated with discourse contexts in terms of two types of probabilistic information that are naturally combined: (i) expectations concerning how the discourse is likely to be continued with respect to coherence relation, and (ii) the likelihood that a certain referent will get mentioned by a pronoun conditioned on the occurrence of that coherence relation. These come together in the following equation (in which ante stands for an antecedent in a particular grammatical or thematic position, and CR stands for coherence relation):20
(29) $P(\text{pronoun} = \text{ante}) = \sum_{CR \in CRs} P(CR)\, P(\text{pronoun} = \text{ante} \mid CR)$
21 This formula is no doubt too simplistic as a full theory of probabilistic pronoun interpretation (one reason will be discussed in section 7); however, we can nonetheless use it for current purposes to illustrate how our analysis can make predictions about incremental processing. 22 The bias towards the Source was reported as 51% in section 4.1.5, which was the percentage in a Source/Goal/ambiguous distinction (see footnote 14). The 56.7% bias reported here represents normalized percentages after setting aside the ambiguous cases.
For example, to compute the likelihood that a pronoun will corefer with the subject of the previous sentence, we simply sum, over all coherence relations, the likelihood of seeing that coherence relation multiplied by the likelihood of a subject reference given that coherence relation.21 This equation makes explicit the idea that at any point during comprehension the hearer will have expectations about how the discourse will be continued with respect to coherence and that the difficulty in interpreting the linguistic material to follow will be conditioned in part on those expectations. These expectations will then evolve based on subsequent linguistic input that influences the probabilities represented. Values for these terms need to be estimated in order to make predictions about online interpretation. However, we do not have direct access to the relevant probability distributions that language processors implicitly represent at a particular point in a discourse, nor is corpus analysis feasible if one desires a tight control on contextual factors. Instead, sentence-completion tasks like those used in Experiment 2 have become a standard way to estimate such biases (Caramazza et al. 1977; McKoon et al. 1993; Stewart et al. 1998; Koornneef and van Berkum 2006, inter alia). For the case of the perfective transfer-of-possession sentences used in Experiment 2, therefore, the two columns of biases shown in Table 4 provide estimates of P(CR) and P(pronoun = source | CR), respectively. When applied to (29), these numbers result in an average of 56.7% bias towards the Source at the time that a subject pronoun is encountered.22 Following a substantial previous literature that demonstrates that such biases impact reading times (see the review in the next section on implicit causality), these numbers would predict at most a modest reading time delay for Goal interpretations over Source ones. While the overall results are similar to the near 50/50 split found by Stevenson et al.
(1994), our results show that there is nothing 50/50 about the pattern once coherence is taken into account. Each of the coherence relations encodes a considerably stronger bias one way or the other about who
6 IMPLICIT CAUSALITY Perhaps the most well-studied phenomenon relevant to the interaction between coherence and pronoun interpretation involves the so-called Implicit Causality (IC) verbs. The literature on the topic is voluminous; out of necessity our discussion will not be comprehensive (but see Rudolph and Forsterling (1997) for a comprehensive review as of their writing). Consider (30a,b), from Caramazza et al. (1977): (30) a. Jane hit Mary because she had stolen a tennis racket. b. Jane angered Mary because she had stolen a tennis racket. Intuitively, the pronouns in (30a,b) refer to Mary and Jane, respectively. The reason for the difference points directly at the matrix verb, since the passages are otherwise identical. Caramazza et al. (1977) conclude 23 This prediction ignores the difference between pronoun interpretation in syntactically coordinate contexts (as in our experiments) v. syntactically subordinate ones (as would be the case within an adjunct headed by because). As there is evidence that this distinction may matter (Miltsakaki, 2001), it may ultimately need to be accounted for in a richer probabilistic model of pronoun interpretation.
will be mentioned next; it is only after the frequencies of coherence continuation are factored in that the biases have a cancelling effect. Equation (29) further predicts that other phenomena which influence the likelihood of the upcoming coherence relation could impact pronoun interpretation biases, and as such, influence reading times. An obvious example is coherence-constraining connectives. Consider the connective because, which is only consistent with the Explanation relation. Because the occurrence of because after a Source–Goal passage would essentially drive the probability of Explanation towards one and the others towards zero, the probabilities in Table 4 would predict an average 80% bias for a subject referent. In this case, we would expect proportionately longer reading times for pronouns that referred to the Goal as compared to the Source.23 This expectation-driven view of incremental processing contrasts with a common view in the literature, whereby surface-level features determine the initial referent assigned to a pronoun, to be later confirmed or contradicted by plausibility factors (e.g. Gordon and Scearce 1995, inter alia). We believe our analysis provides a more parsimonious account that simultaneously captures documented preferences based on surface cues and a range of phenomena that are problematic for them. We elaborate in the sections that follow, considering first the phenomenon known under the rubric of implicit causality.
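The computation described in this section, applying (29) to the Table 4 estimates and then conditioning on a connective, can be sketched as follows. This is a minimal illustration rather than the procedure used to obtain the reported figures: the names `TABLE_4` and `source_bias` are hypothetical, and the renormalization over the five listed relations (which cover slightly less than the whole corpus once Parallel is set aside) is an assumption of the sketch. With the rounded table values the marginal comes out at about 0.56, close to the reported 56.7%.

```python
# Estimates transcribed from Table 4 (perfective transfer-of-possession passages).
# Each entry maps a coherence relation to (P(CR), P(pronoun = Source | CR)).
TABLE_4 = {
    "Occasion": (0.38, 0.18),
    "Elaboration": (0.28, 0.98),
    "Explanation": (0.18, 0.80),
    "Violated Expectation": (0.08, 0.76),
    "Result": (0.06, 0.08),
}


def source_bias(table, observed_relation=None):
    """Apply (29): sum over CR of P(CR) * P(pronoun = Source | CR).

    If observed_relation is given (for instance because a connective such
    as 'because' signals Explanation), all probability mass is placed on
    that relation, so the bias reduces to its conditional term.
    """
    if observed_relation is not None:
        return table[observed_relation][1]
    total_mass = sum(p_cr for p_cr, _ in table.values())
    weighted = sum(p_cr * p_source for p_cr, p_source in table.values())
    # Assumption of this sketch: renormalize, since the listed relations
    # account for slightly less than 100% of the continuations.
    return weighted / total_mass


overall = source_bias(TABLE_4)
after_because = source_bias(TABLE_4, observed_relation="Explanation")
print(f"No connective:   {overall:.3f}")
print(f"After 'because': {after_because:.2f}")
```

The second call mirrors the because scenario: once the relation is fixed to Explanation, the prediction is simply the 0.80 conditional bias, whatever the other relations would have contributed.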
(31) Tom scolded Bill because he ____________ The percentage of interpretations to a referent was used as a measure of the verb’s bias; with (31), for instance, they found that scold encodes a strong bias towards its direct object (henceforth, an NP2 verb) as opposed to one that encodes a bias towards its subject (henceforth, an NP1 verb). This NP2 bias predicts that example (32a), in which the preferred referent is congruent with the bias, should be read faster than (32b), in which the preferred referent is incongruent with the bias. (32) a. Tom scolded Bill because he was annoying. b. Tom scolded Bill because he was annoyed. Pairs of stimuli per (32a,b) were joined with two controls that used gender-unambiguous pronouns. The results confirmed the prediction; sentences with bias-inconsistent pronoun interpretations took longer to read than sentences with bias-consistent ones in both conditions. There is an obvious relationship between these experiments and our coherence analysis, in light of the fact that the connective because is an explicit indicator of an Explanation relation. The results of Experiment 2 shown in Table 4 also revealed a set of biases, in this case for transfer-of-possession passages, in terms of both the likelihood of each possible coherence relation to follow and of mentioning a particular referent conditioned on each coherence relation. Interestingly, we found what could be characterized as an overall NP1 IC bias here as well, with an average of 80% of NP1 references in Explanation relations. Caramazza et al. (1977) note as a major finding of their work that the ‘IC feature’ can be best represented as a continuum, that is, when the bias is represented as the proportion of continuations that suggest NP1 as the
that IC is a feature of verb roots that selects one entity as the ‘probable instigator or causal source for a series of events’, which is in turn responsible for the corresponding bias in pronoun assignment. Importantly, as with any statistical bias, IC biases can be violated without rendering the passage ungrammatical or incoherent, for example, compare (30a) with Jane hit Mary because she reacts violently to criticism. Nonetheless, one might ask whether these biases affect reading times, insofar as clauses in which the pronoun assignment is incongruent with the preceding verb’s IC bias should take longer to read than ones in which it is congruent. Caramazza et al. (1977) ran a reading time experiment to test this prediction. Norming was done in a previous study (Garvey et al. 1976, Experiment 1) using a sentence-completion task of the sort we employed in Experiment 2, in which participants were asked to write completions for fragments such as (31):
6.1 Experiment 3 This experiment tested whether the biases found for IC verbs in passages containing a because prompt, mimicking the design of Garvey et al. (1976, inter alia), are similar to those found for the Explanation relationships identified in responses within a full-stop condition. Because we are mainly interested in the coherence-driven biases towards referents generated by these different classes of verbs, a pronoun was not included in the prompts. All subsequent first-mentioned referents were therefore catalogued, regardless of form of reference (i.e. pronoun or proper name). This choice allowed the use of contexts that lacked gender ambiguity, which facilitated the identification of the intended referents of pronouns (cf. Stewart et al. 1998; Arnold 2001). 6.1.1 Stimuli A 2 × 3 design was used that crossed verb type (IC verb v. non-IC verb) with continuation type (full stop v. because v. 24 Ehrlich (1980) ran an experiment in which the connective used was varied between because, but and and:
(33) a. Steve blamed Frank because he spilt the coffee. b. Steve blamed Frank and he spilt the coffee. c. Steve blamed Frank but he spilt the coffee. Her results were mixed, which is not surprising on our analysis because neither but nor and select for a single coherence relation: but is consistent with both Contrast and Violated Expectation (which each have different biases), and and is consistent with Occasion, Result and Parallel (again, each having different biases). As such, this manipulation does not reveal much about the predictions of a coherence-driven theory.
referent, the values range continuously between 0 and 1. This is exactly what the final term of (29) captures, although crucially these biases are conditioned on coherence relations. As we alluded to in the previous section, the inclusion of because in the stimulus prompts typically used in the IC literature might do no more with respect to pronoun interpretation than to restrict the operative coherence relation to Explanation. This analysis predicts that the IC bias found in sentence completions using a because prompt as in (31) should align closely with the IC bias found for completions in a similar no-pronoun, full-stop condition when only those passages that participate in an Explanation relation are considered. To our knowledge, such an experiment has not been carried out to date.24 We therefore ran a sentence-completion experiment to test this question. A positive outcome would suggest that IC effects are a microcosm of a more general set of biases that apply in all contexts, distinguishing themselves only with respect to the strength of their bias towards a particular referent when an Explanation relation is operative.
dialogue prompt). The dialogue-prompt condition was included for norming data for an orthogonal future experiment and will not be further analysed here. Examples of the full stop and because condition are shown in (34): (34) a. Tony disappointed Courtney. ____________ b. Tony disappointed Courtney because ____________
6.1.2 Participants Seventy-five monolingual English-speaking undergraduates at UCSD participated in the study for extra credit in linguistics courses. 6.1.3 Task The task followed the design of Experiment 2. Participants were asked to write the first natural completion that comes to mind, without adding extra humour or creativity to the task. 6.1.4 Results The results for NP1 verbs, NP2 verbs and non-IC verbs are presented in Tables 5, 6 and 7, respectively. Entries for coherence relations are not included if they comprised less than 5% of the continuations in the full-stop condition. This was sometimes the case for Violated Expectation and Occasion, and was always the case for Parallel. Table 5 summarizes the results for the IC-NP1 verbs. The NP1 bias of 85% for Explanation relations in the full-stop condition is essentially equivalent to the 84% bias in the because condition, as predicted. [Prompt type is not a significant predictor of bias: F1(1, 70) < 0.0221, P < 0.8822; F2(1, 19) = 0.032, P < 0.86.] The lower 60% overall bias found in the full-stop condition simply represents a watering down of the IC bias due to the existence of passages with coherence relations other than Explanation, to which the IC bias is not relevant. Table 6 summarizes the results for the IC-NP2 verbs. Again the Explanation bias towards NP1 in the full-stop condition is essentially equivalent to the one in the because condition, as predicted. [Prompt type: F1(1, 73) = 0.4424, P < 0.5081; F2(1, 19) = 1.2235; P < 0.2825.]
Forty IC verbs and 40 non-IC verbs were taken from McKoon et al. (1993), with three replacements. (The verbs cheat, jeer and dread were felt to sound awkward in our sentence frames, and were replaced by offend, mock and fear, respectively.) The IC category was further broken down into 20 each of NP1 and NP2 verbs. All context sentences contained mentions of two possible referents, one male and one female. Twenty filler sentences used non-IC verbs and were followed by various interclausal connectives (monologue continuation) or a dialogue response that contained the beginning of a question (dialogue continuation), for a total of 100 stimulus items per participant.
                     Full stop                 because prompt
Coherence relation   P(CR) (%)   P(Subj|CR)    P(CR) (%)   P(Subj|CR)
Explanation          58          0.84          100         0.85
Result               22          0.10          —           —
Elaboration          10          0.61          —           —
Table 5 Probabilities from Experiment 3 (IC-NP1 verbs)
                     Full stop                 because prompt
Coherence relation   P(CR) (%)   P(Subj|CR)    P(CR) (%)   P(Subj|CR)
Explanation          62          0.13          100         0.10
Result               15          0.03          —           —
Elaboration          14          0.46          —           —
Table 6 Probabilities from Experiment 3 (IC-NP2 verbs)
                       Full stop                 because prompt
Coherence relation     P(CR) (%)   P(Subj|CR)    P(CR) (%)   P(Subj|CR)
Explanation            24          0.57          100         0.56
Elaboration            29          0.58          —           —
Result                 22          0.24          —           —
Violated Expectation   13          0.40          —           —
Occasion               9           0.53          —           —
Table 7 Probabilities from Experiment 3 (non-IC verbs)
Finally, Table 7 summarizes the results for the non-IC verbs. We see that even for non-IC verbs, the average bias towards NP1 is consistent between the because condition and the Explanation relations in the full-stop condition. [Prompt type: F1(1, 61) < 1, P < 0.982; F2(1, 36) = 1.4598, P < 0.2348.] This provides further evidence that there is nothing special about IC verbs coupled with the connective because; because simply marks an Explanation relation, and the referent bias gets adjusted accordingly for IC and non-IC verbs alike. The hypothesis is therefore confirmed: The IC biases seen in the because condition are highly consistent with those found for Explanation relations in the full-stop condition across all three verb types. As in Experiment 2, the summary statistics across coherence
6.2 Immediate focusing v. clausal integration A more recent controversy has centered around when IC information is used, that is, whether the information is utilized early enough so as to 25 These results are surprising for the analysis of Stevenson et al. (2000), who argue for a semantic focusing account over a (Hobbsian, coherence-based) relational account. Whereas we argue that connectives influence coherence establishment and coherence establishment in turn influences pronoun interpretation, in their analysis connectives constrain pronoun interpretation more directly by modifying the salience of entities, in a second role that they consider distinct from their role in constraining coherence establishment. The assumptions that they place on coherence-driven analyses are problematic, however, and do not adequately represent either our analysis or Hobbs’s original proposal. Whereas we will not go into further detail on these matters, we do note that the alignment of biases between the full-stop condition for Explanation and the because condition can only be viewed as a coincidence in their theory.
relations hide the considerably stronger biases that are often at play when coherence relations are conditioned on.25 Tables 5–7 also bring to light that there is not one but two noteworthy biases that are associated with IC verbs. Besides the biases towards particular referents that have been our focus thus far, IC verbs are also shown to be significantly more likely to evoke Explanation continuations (60% for NP1 and NP2 continuations combined) than non-IC verbs (24%), regardless of which referent gets mentioned first. This suggests that the lexical semantics of IC verbs create a stronger-than-usual expectation for an explanation. This bias may have gone unnoticed in the literature because previous studies typically have used only because prompts or have otherwise not categorized the coherence relations operative in their passage completions. All these results suggest that contexts trigger rich probabilistic information that is brought to bear during interpretation. In this sense, the IC biases that have been documented in the literature represent just one of up to 10 biases that are exhibited in the no-bias conditions of Tables 5–7 (i.e. a bias for each of five relations coming next and a bias for a particular referent given each relation). In fact, another one of these biases—towards a referent given a Result relation—was previously identified and termed Implicit Consequentiality by Stewart et al. (1998). (See also the discussion in Crinean and Garnham 2006.) Using response completions to passages such as Because John annoyed Bill, he, they identify verbs that have both NP1 and NP2 consequentiality biases, and demonstrate that these biases impact reading times. A prediction of our analysis would therefore extend the online findings found for both IC and implicit consequentiality to the biases found for a broader range of contexts, across all coherence relations per the probabilities assigned by (29).
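The two biases discussed here can be read directly off the estimates in Tables 5–7, as the following sketch illustrates. The variable and function names are introduced only for illustration, and the use of an unweighted mean over the NP1 and NP2 verb classes to obtain the combined Explanation rate is an assumption of the sketch (the two classes contain 20 verbs each).

```python
# (P(CR), P(Subj | CR)) in the full-stop condition, transcribed from Tables 5-7.
FULL_STOP = {
    "IC-NP1": {"Explanation": (0.58, 0.84), "Result": (0.22, 0.10),
               "Elaboration": (0.10, 0.61)},
    "IC-NP2": {"Explanation": (0.62, 0.13), "Result": (0.15, 0.03),
               "Elaboration": (0.14, 0.46)},
    "non-IC": {"Explanation": (0.24, 0.57), "Elaboration": (0.29, 0.58),
               "Result": (0.22, 0.24), "Violated Expectation": (0.13, 0.40),
               "Occasion": (0.09, 0.53)},
}
# P(Subj | Explanation) in the because-prompt condition, from the same tables.
BECAUSE_SUBJ_BIAS = {"IC-NP1": 0.85, "IC-NP2": 0.10, "non-IC": 0.56}


def explanation_gap(verb_type):
    """Absolute difference between the full-stop Explanation bias and the
    because-prompt bias for one verb type."""
    _, full_stop_bias = FULL_STOP[verb_type]["Explanation"]
    return abs(full_stop_bias - BECAUSE_SUBJ_BIAS[verb_type])


def explanation_expectation(verb_types):
    """Unweighted mean of P(Explanation) over the given verb types
    (the equal weighting is an assumption of this sketch)."""
    rates = [FULL_STOP[v]["Explanation"][0] for v in verb_types]
    return sum(rates) / len(rates)


gaps = {v: round(explanation_gap(v), 2) for v in FULL_STOP}
ic_expectation = explanation_expectation(["IC-NP1", "IC-NP2"])
non_ic_expectation = explanation_expectation(["non-IC"])
print("Referent-bias gap (full stop v. because):", gaps)
print(f"P(Explanation): IC = {ic_expectation:.2f}, non-IC = {non_ic_expectation:.2f}")
```

On these figures the referent biases in the two prompt conditions differ by at most 0.03 for every verb type, while the expectation for an upcoming Explanation is 0.60 for IC verbs against 0.24 for non-IC verbs.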
However, our findings are not necessarily inconsistent with an incremental clausal integration account, in which the information made available by the subordinate clause is ‘retroactively’ related to the interpretation of the main clause on a word-by-word basis. (p. 459) This view, which they similarly cast in terms of probabilistically driven expectations, is precisely the type of account that we advocate; it is evident from KvB’s discussion that there is a close relationship between our respective views of expectation-driven discourse interpretation and
essentially constitute a focusing mechanism (the immediate focusing account, e.g. McKoon et al. 1993), or instead is used only as part of a sentence-final clause integration process (the clausal integration account, e.g. Stewart et al. 2000). The clausal integration account predicts that IC effects will arise later during sentence interpretation than the immediate focusing account does, at least when a pronoun occurs early in the clause. Our analysis predicts aspects of both of these models: The biases we have documented should be available at the time the pronoun is encountered and hence should influence reading times at or soon after the pronoun, but so will subsequent words that affect the likely coherence relation, and as a result, the likely referent for a pronoun given that coherence relation. These predictions are supported by the recent study of Koornneef and van Berkum (2006, KvB). Characterizing IC biases as ‘probabilistic asymmetries’ that ‘reflect something more subtle about the way we use various sources of information in everyday language comprehension’, KvB looked for mid-sentence reading delays caused by pronouns that are inconsistent with the bias of a preceding IC verb in two experiments with gender-unambiguous pronouns. In a word-by-word self-paced reading task, they found that words in the pre-critical region were read equally fast across the bias-consistency conditions, but readers slowed down right at a bias-inconsistent pronoun, with a significant main effect emerging at the first two words thereafter. In an eye tracking study that measured mean regression path durations, again no differences were measured in the pre-critical region, but pronouns that were inconsistent with the IC bias reliably perturbed the reading process at or shortly after the pronoun.
The results of both experiments therefore suggest that IC information becomes available rapidly enough to appear mid-sentence, even in passages in which the gender of the pronoun singles out a unique referent. While these results support the immediate focusing account over the clausal integration account, KvB do not discount the latter entirely:
7 THE GRAMMATICAL SUBJECT PREFERENCE Finally, we evaluate the evidence for a grammatical subject preference in light of our analysis and results. Crawley et al. (1990) report on two studies which they argue support the idea that hearers use a subject assignment strategy, contrasting it specifically with the predictions of a parallel function strategy. They characterize such strategies as ‘relatively mechanical rules of thumb which tell us to whom or what to assign a pronoun’, which are nonetheless only invoked when ‘there are no other strong constraints (such as linguistic or pragmatic constraints) on assignment’. In a self-paced reading task, participants read a three-sentence passage that ended with a clause that contained a pronoun in object position. (We restrict discussion to their ambiguous pronoun condition.) Although they acknowledge the difficulty in completely eliminating the influence of general knowledge in their stimuli, three judges checked each stimulus to ensure that either assignment of the pronoun resulted in a plausible interpretation. Participants answered a question that revealed their pronoun assignments. A bias was found towards the grammatical subject over the object, with an average of 23.7 subject interpretations for the 40 passages (a 59.25% bias). It also took slightly longer to read sentences with object referents. The results of a direct assignment task using the same stimuli were very similar, with an average of 24 subject assignments (60% subject bias). Despite their conclusions, however, nothing in their experiment rules out the possibility that their results arose from discourse-driven expectations generated by their stimuli rather than from a distinct pronoun-specific interpretation strategy. Furthermore, as Smyth (1994) points out, properties of their reading time data suggest that some
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
how it influences pronoun interpretation. We therefore consider KvB’s online results as initial evidence for our approach, and would predict similar results from violating other coherence-driven biases as well. Online investigations of these additional hypotheses must await further research. To summarize this section, our results suggest that IC biases are simply microcosms of a more general system of coherence-driven biases that drive pronoun interpretation in all context types. These results also show that IC verbs are exceptional with respect to two biases they engender: In addition to previously known biases towards a particular referent in an Explanation context, they also generate stronger-than-usual expectations for an upcoming Explanation relation.
38 Coherence and Coreference Revisited
26 Although 25 participants saw each of these in the full-stop condition, the actual number of qualifying entries was generally less, since some did not mention either referent, or mentioned both at once with the pronoun they. All but one of the entries included in the table had at least 20 qualifying continuations (saw had 18).
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
stimuli may have been consistently interpreted with subject assignment and others consistently with object assignment. As such, perhaps a different set of stimuli would have yielded a different result. To shed light on this issue, we analysed the biases found for the nonIC verbs in the full-stop condition in our Experiment 3. A representative sample of 10 of the 40 verb frames used in that experiment, all of which were from the stimuli of McKoon et al. (1993), are shown in Table 8.26 As can be seen, the NP1 biases show an even representation across the spectrum from 0 to 1. (The entire set of 40 verbs showed the same even distribution as well, with exactly half of the verbs having a bias above 0.5 and half falling at 0.5 or below.) Granted there are several differences between these biases and those found by Crawley et al.: These were collected from a sentence-completion study without a pronoun prompt, and represent first-mentioned referents rather than only subjects. Nonetheless, these results highlight the degree of freedom afforded in the selection of stimuli. Selecting verb frames from the top half of the table would presumably tilt the results towards a subject assignment strategy, whereas verb frames from the bottom half would presumably tilt the evidence away from it. Whereas researchers who posit the existence of pronoun interpretation preferences and heuristics have consistently exempted ‘pragmatically biased’ examples, viewing stimuli in terms of the statistical distributions they engender makes it clear that there really is no such thing as a passage that is devoid of pragmatic bias. The numbers may be stronger or weaker, but any context will give rise to a set of biases over continuation types with respect to coherence and a set of biases for likelihood of mention given a continuation type. It is therefore incumbent on researchers to explain exactly what counts as a plausibility factor when using it to exempt examples that fail to conform. 
Table 8 Biases for 10 selected verbs from Experiment 3 (non-IC verbs)

Verb frame                  p(NP1)
borrowed-a-bike-from        0.857
saw                         0.722
waited-to-see               0.636
counted-the-money-from      0.545
played-the-piano-for        0.500
edited-an-essay-for         0.400
repaired-a-bike-for         0.350
watched                     0.261
went-to-visit               0.200
read-a-funny-story-to       0.130

In the case of the subject assignment strategy, such conditions would not only have to include the intuitively high-bias NP2 verbs in the IC literature, but also the seemingly mundane perfective transfer-of-possession passages from Experiment 2 and the bottom five verb frames in Table 8, none of which follow the predictions of the strategy. This argument should not be misconstrued to suggest that a bias towards subjects would not emerge if one could compute statistical expectations over all possible contexts. For instance, one might expect that many verb frames are frequently continued with Occasion relations, and that many Occasions will display continuity in the agent role (and as a result, oftentimes continuity in the subject position as well), a combination that would tilt the statistical biases towards the subject position. The point is that this bias then emerges from general mechanisms without any need to posit a separate heuristic. Indeed, a coherence-driven theory is in principle capable of explaining such overall biases while still capturing the differing behaviour of certain other verb frames, using the same types of predictive interpretation mechanisms that we find evidence for in sentence processing.

A final observation is in order, however, as experiments by Stevenson et al. (1994) provide a type of evidence for a subject bias that we have yet to address. Recall that in addition to their pronoun-prompt condition, Stevenson et al. had a no-pronoun, full-stop condition (as we used in our Experiment 3), in which participants chose their own forms of referring expressions. Across their stimulus types, they found that this choice was heavily biased towards a pronoun when the referent was the previous subject, and likewise towards a name when the referent was a non-subject. (Arnold (2001) found similarly strong biases.) At first blush this result seems paradoxical: If participants have a clear production preference to refer to non-subjects with full names, why do they so readily assign a pronoun to a non-subject in the pronoun-prompt condition (e.g. 49.0% to the Goal in Source–Goal contexts)? Stevenson et al. suggest that, in addition to general thematic role preferences, ‘heuristic search processes triggered by the presence of a pronoun’ provide an additional bias to the first-mentioned entity, that is, that there is an overlaid subject assignment strategy. This suggestion would explain, for instance, why they found far more references to the Goal in their Goal–Source pronoun-prompt condition (84.6% by our
calculation), in which the Goal is also the subject, than in the Source–Goal condition (again, 49.0%). However, there are other possible explanations for these results that do not require appeal to any specific interpretation heuristics. As an illustration, we consider the relationship between pronoun production and interpretation that emerges when cast in Bayesian terms:

\[
P(\text{referent} \mid \text{pronoun}) = \frac{P(\text{pronoun} \mid \text{referent})\, P(\text{referent})}{P(\text{pronoun})}
\]
27 We will ignore the term P(pronoun), which is a constant factor over all possible referents in the context. 28 Such a situation occurred in Arnold’s (2001) Source–Goal condition. She found that 76.0% of the references to the subject were pronominalized, whereas only 20.1% of references to the object of the preposition were. However, the next mention bias towards the Goal was an overwhelming 85.6%. 29 Having made this point, we want to stress that it is not our goal to argue for this Bayesian analysis, as it raises a large number of questions that we are not prepared to address. We only wish to offer it as a proof-of-concept of how a subject bias in interpretation could emerge beyond what is predicted by coherence-driven expectations alone. A fuller exploration of the model is the subject of ongoing work.
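As a rough numerical illustration of the Bayesian decomposition, the following sketch plugs in the figures reported for Arnold’s (2001) Source–Goal condition in footnote 28. It assumes, purely for illustration, that the Source (subject) and the Goal (object of the preposition) are the only candidate referents, so that the next-mention bias towards the Source is 1 minus the bias towards the Goal. The function name and dictionary keys are hypothetical and are not part of the original analysis.

```python
def interpretation_bias(p_pronoun_given_ref, p_ref):
    """Return P(referent | pronoun) for each referent via Bayes' rule.

    P(pronoun) is computed by summing over the candidate referents, so it
    acts only as a normalizing constant (cf. footnote 27).
    """
    joint = {r: p_pronoun_given_ref[r] * p_ref[r] for r in p_ref}
    p_pronoun = sum(joint.values())
    return {r: joint[r] / p_pronoun for r in joint}


# Production biases: rate of pronominalization per referent (footnote 28).
p_pronoun_given_ref = {"Source (subject)": 0.760, "Goal (non-subject)": 0.201}

# Next-mention biases: 85.6% towards the Goal (footnote 28). The remainder is
# assigned to the Source under the two-referent assumption stated above.
p_ref = {"Source (subject)": 1 - 0.856, "Goal (non-subject)": 0.856}

posterior = interpretation_bias(p_pronoun_given_ref, p_ref)
for referent, p in posterior.items():
    print(f"P({referent} | pronoun) = {p:.3f}")
```

Under these assumptions the posterior favours the non-subject Goal (roughly 0.61 against 0.39), even though references to the Goal are pronominalized far less often than references to the subject. This is the kind of configuration described for footnote 28, not a fitted model.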
Whereas up to this point we have considered pronoun interpretation biases (P(referent | pronoun)) as conditioned by coherence-driven expectations, this formulation splits the bias into two: an expectation towards a subsequent mention of a referent (P(referent)), and an expectation about the form of referring expression that the speaker would use to mention that referent (P(pronoun | referent)).27 Under this formulation, there is nothing inconsistent about an interpretation bias towards a non-subject referent despite a strong bias against pronominalizing non-subjects, assuming a suitably large subsequent mention bias towards the non-subject.28 Our results and those of Stevenson et al. and Arnold are therefore all consistent with a scenario in which grammatical or information structural factors (subjecthood, topichood) play a greater role in conditioning P(pronoun | referent) and coherence-driven expectations play a greater role in conditioning P(referent). If this is the case, we would expect to find a pronominal bias towards the subject position beyond what is predicted from coherence-driven expectations alone (exempting Parallel relations, per the arguments in section 3), although importantly, without the need to posit that hearers utilize pronoun-specific interpretation strategies or heuristics.29

To conclude this section, our analysis of seemingly unremarkable verb frames as exemplified in Table 8 reveals great variance in their prior contextual biases towards particular referents. It would therefore seem essential that such biases be controlled for before the existence of an overlaid subject assignment preference can be established. Further, while we suspect that there are sources of pronoun-specific subject biases in pronoun interpretation, they do not necessarily entail the existence of special-purpose, heuristic interpretation ‘strategies’, but instead may ultimately prove to be better captured within a more parsimonious, expectation-driven account.

8 CONCLUSION
We have presented new experimental evidence in support of a coherence-driven analysis of pronoun interpretation, and described how it can accommodate previous findings suggestive of conflicting preferences and biases. The results of our first experiment demonstrated that the grammatical subject and grammatical parallelism preferences can be neutralized when coherence is carefully controlled for. We furthermore provided a linguistic analysis that establishes that the grammatical role parallelism preference is an epiphenomenon of an independent interaction between information structure and accent placement in Parallel coherence relations that applies to referring expressions of all types. The results of the second experiment distinguished the thematic role and event-structure biases proposed by Stevenson et al. (1994), supporting the event-structure bias. The experiment further showed that the bias is limited primarily to those coherence relations which implicate event structure in their formulation, and that the approximately 50/50 distribution of references found in Source–Goal passage completions represents but an average of a set of considerably stronger biases evident when coherence relations are conditioned on. Whereas evidence for incremental interpretation has historically been seen as problematic for coherence-based analyses, we have described a model that captures how a hearer’s coherence-driven expectations about how the discourse is likely to proceed could predict online measurements of pronoun interpretation difficulty. The results of Experiment 3 confirmed a prediction of this analysis, specifically that IC biases evident in passage completions with because prompts are essentially equivalent to those in a full-stop condition when only Explanation relations are analysed. 
The results of this experiment also demonstrate that IC biases represent but one instance of a more comprehensive set of biases that drive predictive discourse interpretation, which include biases for what type of continuation will ensue in addition to biases towards mentioning particular referents conditioned on continuation type. Although online tests of the predictions of the analysis await future work, biases estimated from passage-completion experiments have been repeatedly shown in the literature to influence pronoun processing difficulty. Finally, we described how coherence-driven expectations about who will be mentioned next have the potential to dramatically affect evidence for a grammatical subject preference. We also speculated that there are subject biases in pronoun interpretation that go beyond what can be predicted by coherence-driven expectations alone, and suggested how these might be explained without recourse to any heuristic interpretation ‘strategies’. A suitably comprehensive evaluation of the tenability of this approach must await future work, however.

In sum, the coherence analysis is capable of explaining a wide variety of often contradictory results in the previous literature in a theoretically parsimonious manner. It offers an explanation of what the underlying sources of previously proposed biases are, and predicts in what contexts evidence for each will surface. The theory finds no need to include caveats for examples with ‘pragmatic bias’, since the theory directly captures the fact that all passages contain pragmatic bias. A ramification for future psycholinguistics work is the need to control for the pronoun-independent, coherence-driven expectations that are embodied in experimental stimuli, as our results argue that this is required before evidence for overlaid biases or preferences can be successfully established.

Acknowledgements
We would like to thank Jennifer Arnold, Julie Sedivy, and Ron Smyth for extensive comments that led to substantial improvements over earlier drafts, and Roger Levy, Bob Slevc, and audiences at the University of Pennsylvania, MIT, UCLA, Ohio State, the University of Minnesota, San Diego State, the University of Texas at Austin, and the UCSD Center for Research in Language for useful comments and discussions. We also thank Erica Gold for her extensive annotation efforts. This research was supported in part by National Institutes of Health Training Grant 5-T32-DC0041 to the Center for Research in Language at UCSD, and a grant from the UCSD Academic Senate.

ANDREW KEHLER
Department of Linguistics
University of California San Diego
9500 Gilman Drive
La Jolla CA 92093-0108
USA
e-mail: [email protected]

REFERENCES
Akmajian, A. & R. S. Jackendoff (1970), ‘Coreferentiality and stress’. Linguistic Inquiry 1:124–6.
Arnold, J. (2001), ‘The effect of thematic roles on pronoun use and frequency of reference continuation’. Discourse Processes 21:137–62.
Beaver, D. I. (2004), ‘The optimization of discourse anaphora’. Linguistics and Philosophy 27:3–56.
Caramazza, A., E. Grober, C. Garvey, & J. Yates (1977), ‘Comprehension of anaphoric pronouns’. Journal of Verbal Learning and Verbal Behavior 16:601–9.
Chambers, C. C. & R. Smyth (1998), ‘Structural parallelism and discourse coherence: a test of centering theory’. Journal of Memory and Language 39:593–608.
Crawley, R. A., R. J. Stevenson, & D. Kleinman (1990), ‘The use of heuristic strategies in the interpretation of pronouns’. Journal of Psycholinguistic Research 19:245–64.
Crinean, M. & A. Garnham (2006), ‘Implicit causality, implicit consequentiality, and semantic roles’. Language and Cognitive Processes 21:636–48.
de Hoop, H. (2004), ‘On the interpretation of stressed pronouns’. In R. Blutner and H. Zeevat (eds.), Optimality Theory and Pragmatics. Palgrave Macmillan. New York. 25–41.
Ehrlich, K. (1980), ‘Comprehension of pronouns’. Quarterly Journal of Experimental Psychology 32:247–55.
Garvey, C., A. Caramazza, & J. Yates (1976), ‘Factors underlying assignment of pronoun antecedents’. Cognition 3:227–43.
Gordon, P. C. & K. A. Scearce (1995), ‘Pronominalization and discourse coherence, discourse structure and pronoun interpretation’. Memory and Cognition 23:313–23.
Gundel, J. K., N. Hedberg, & R. Zacharski (1993), ‘Cognitive status and the form of referring expressions in discourse’. Language 69:274–307.
Hale, J. (2001), ‘A probabilistic Earley parser as a psycholinguistic model’. In Proceedings of the Annual Meeting of the North American Chapter of the Association for Computational Linguistics. The Association for Computational Linguistics. Morristown, NJ. 159–66.
Hobbs, J. R. (1979), ‘Coherence and coreference’. Cognitive Science 3:67–90.
Hobbs, J. R. (1990), Literature and Cognition. CSLI Lecture Notes 21. Stanford, CA.
Hume, D. (1748), An Inquiry Concerning Human Understanding. The Liberal Arts Press. New York. 1955 edition.
Kameyama, M. (1999), ‘Stressed and unstressed pronouns: complementary preferences’. In P. Bosch and R. van der Sandt (eds.), Focus: Linguistic, Cognitive, and Computational Perspectives. Cambridge University Press. Cambridge. 306–21.
Kehler, A. (2002), Coherence, Reference, and the Theory of Grammar. CSLI Publications. Stanford, CA.
Kehler, A. (2005), ‘Coherence-driven constraints on the placement of accent’. In E. Georgala and J. Howell (eds.), Proceedings of the 15th Conference on Semantics and Linguistic Theory (SALT-15). CLC Publications. Cornell University. 98–115.
Kertz, L., A. Kehler, & J. L. Elman (2006), ‘Grammatical and coherence-based factors in pronoun interpretation’. In R. Sun (ed.), Proceedings of the 28th Annual Conference of the Cognitive Science Society. Lawrence Erlbaum Associates. Mahwah, NJ. 1605–10.
Koornneef, A. W. & J. J. A. van Berkum (2006), ‘On the use of verb-based implicit causality in sentence comprehension: evidence from self-paced reading and eye-tracking’. Journal of Memory and Language 54:445–65.
Levy, R. (2007), ‘Expectation-based syntactic comprehension’. Cognition, forthcoming.
McKoon, G., S. B. Greene, & R. Ratcliff (1993), ‘Discourse models, pronoun resolution, and the implicit causality of verbs’. Journal of Experimental Psychology: Learning, Memory, and Cognition 18:266–83.
Miltsakaki, E. (2001), ‘Toward an aposynthesis of topic continuity and intrasentential anaphora’. Computational Linguistics 28:319–55.
Moens, M. & M. Steedman (1988), ‘Temporal ontology and temporal reference’. Computational Linguistics 14:15–28.
Oehrle, R. T. (1981), ‘Common problems in the theory of anaphora and the theory of discourse’. In H. Parret, M. Sbisà, and J. Verschueren (eds.), Possibilities and Limitations of Pragmatics. John Benjamins. Amsterdam. 509–30. Studies in Language Companion Series, Volume 7.
Rohde, H., A. Kehler, & J. L. Elman (2006), ‘Event structure and discourse coherence biases in pronoun interpretation’. In R. Sun (ed.), Proceedings of the 28th Annual Conference of the Cognitive Science Society. Lawrence Erlbaum Associates. Mahwah, NJ. 617–22.
Rudolph, U. & F. Forsterling (1997), ‘The psychological causality implicit in verbs: a review’. Psychological Bulletin 121:192–218.
Schwarzschild, R. (1999), ‘Givenness, AvoidF, and other constraints on the placement of accent’. Natural Language Semantics 7:141–77.
Sheldon, A. (1974), ‘The role of parallel function in the acquisition of relative clauses in English’. Journal of Verbal Learning and Verbal Behavior 13:272–81.
Smyth, R. (1994), ‘Grammatical determinants of ambiguous pronoun resolution’. Journal of Psycholinguistic Research 23:197–229.
Stevenson, R. J., R. A. Crawley, & D. Kleinman (1994), ‘Thematic roles, focus, and the representation of events’. Language and Cognitive Processes 9:519–48.
Stevenson, R. J., A. Knott, J. Oberlander, & S. McDonald (2000), ‘Interpreting pronouns and connectives: interactions among focusing, thematic roles, and coherence relations’. Language and Cognitive Processes 15:225–62.
Stewart, A. J., M. J. Pickering, & A. J. Sanford (1998), ‘Implicit consequentiality’. In M. A. Gernsbacher and S. J. Derry (eds.), Proceedings of the 20th Annual Conference of the Cognitive Science Society. Lawrence Erlbaum Associates. Mahwah, NJ. 1031–6.
Stewart, A. J., M. J. Pickering, & A. J. Sanford (2000), ‘The role of implicit causality in language comprehension: focus versus integration accounts’. Journal of Memory and Language 42:423–43.
Venditti, J. J., M. Stone, P. Nanda, & P. Tepper (2002), ‘Discourse constraints on the interpretation of nuclear-accented pronouns’. In Proceedings of the 2002 International Conference on Speech Prosody. ISCA Archive. Aix-en-Provence, France. 675–8.
Winograd, T. (1972), Understanding Natural Language. Academic Press. New York.
Wolf, F., E. Gibson, & T. Desmet (2004), ‘Discourse coherence and pronoun interpretation’. Language and Cognitive Processes 19:665–75.
Wolters, M. & D. Beaver (2001), ‘What does he mean?’ In J. D. Moore and K. Stenning (eds.), Proceedings of the 23rd Annual Meeting of the Cognitive Science Society. Lawrence Erlbaum Associates. Mahwah, NJ. 1176–80.

First version received: 01.08.2006
Second version received: 20.06.2007
Third version received: 05.10.2007
Accepted: 09.10.2007
Journal of Semantics 25: 45–91 doi:10.1093/jos/ffm010 Advance Access publication September 12, 2007
On the Meaning of Only MICHELA IPPOLITO University of Toronto
Abstract
This paper investigates the semantics of the focus particle only and is primarily concerned with the relation between the exclusive proposition and the proposition expressed by the prejacent (the only-less sentence). We argue that, in a sentence of the form only A is B, only triggers the conditional presupposition that if something is B, A is B. We show that in a positive-only sentence, the prejacent is a conversational implicature and therefore it is cancellable. Instead, in a negative-only sentence the prejacent is shown to be entailed by any context that satisfies the conditional presupposition and to which the (negative) assertion is added. Hence, the prejacent of a negative-only sentence is not cancellable. The entailment analyses, the strong presupposition analyses and the weak presupposition analyses of only are discussed, together with the problems that each type of theory faces.

1 INTRODUCTION
One of the goals of a semantic theory for the focus-sensitive particle only is to account for the fact that a sentence containing this particle conveys that the only-less sentence is true. For example, the sentence in (1a) conveys that John can speak French [cf. (1b)] and that nobody other than John can [cf. (1c)]. Following Horn (1996), we will call (1b) the prejacent of (1a).1

(1) a. Only John can speak French.
    b. John can speak French.
    c. Nobody other than John can speak French.

1 In this paper, I will not discuss temporal uses of only. I believe that the analysis I will suggest does cover basic temporal uses like the one in (i) below, which conveys the information that it is no later than five o’clock:
(i) It is only five o’clock.
However, the following pair shows an unexpected contrast: while (ii), in line with (i) above, conveys the information that John will not stay at home later than five o’clock, the sentence in (iii) surprisingly conveys the information that John will not arrive earlier than five o’clock.
(ii) John will stay at home only until five o’clock.
(iii) John will arrive only at five o’clock.
I will not be able to discuss these and related temporal cases in this paper.

© The Author 2007. Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected].
2 Because this paper is primarily concerned with the relation between the exclusive proposition and the prejacent, there are a number of theories that I will not be able to discuss here, for example Krifka (2006), Beaver & Clark (2003), von Fintel (1997), Bonomi & Casalegno (1993) and von Stechow (1991), among others.
In this paper, I will be concerned with the question of the relation between (1a) and (1b). There are at least three positions that have been defended in the literature: the entailment analysis according to which (1a) entails (1b); the presupposition analysis according to which (1a) presupposes (1b); the implicature analysis according to which (1a) conversationally implicates (1b).2 As the paradigmatic example of the entailment analysis I will discuss the proposal defended by Atlas (1993, 1996). In the strong form alluded to above, the presuppositional analysis of only has been advocated by Horn (1969) and Rooth (1985, 1992), among others. A weaker form of the presuppositional analysis has been proposed by Horn (1996) and, more recently, by Geurts & van der Sandt (2004). The implicature view was suggested by McCawley (1981) and more recently by van Rooij & Schulz (2005). In light of some problems that affect both the entailment and the (strong and weak) presupposition analyses, Horn (2002) proposed that the prejacent proposition is ‘assertorically inert’, that is, it is entailed but not asserted. In this paper, I will argue that (i) only does trigger a presupposition but this presupposition is neither the prejacent itself nor an existential presupposition and (ii) it is a conditional presupposition which behaves just like we expect a presupposition to behave. These two claims are among the central ideas of the present proposal, from which the properties of an only sentence will be shown to follow. From a more general perspective, this study might contribute from a linguistic perspective to the debate about the semantics/pragmatics distinction that has interested both philosophers and linguists. The fundamental observation that lies at the origin of this debate is that there are components of the information conveyed by an utterance that do not seem to be determined by the conventional linguistic meaning of the sentence that has been uttered. 
The object of most of the current research in this area of philosophical and linguistic investigation is to understand to what extent the (overt and covert) syntactic structure of the sentence determines what is communicated and whether and how exactly semantics and pragmatics relate to each other (see, e.g., the discussion of ‘weak’ versus ‘strong’ pragmatic effects by King & Stanley (2005) and their discussion of apparent pragmatic ‘intrusions’ in the sense of Levinson 2000). Since, as we pointed out at the beginning of this section, the prejacent is part of what the speaker of an only sentence communicates, the question about the nature of this meaning component arises. I follow the focus semantics developed in Rooth (1985, 1996), and assume that the association with focus that characterizes focus-sensitive particles like only is represented syntactically by means of an operator introducing a presupposed set of alternatives. The conclusion of this paper will be that the answer to the question about the nature of the prejacent is more complex than previously thought but that theories of presupposition and conversational implicature are well equipped to account for the meaning of only.

2 THE ENTAILMENT ANALYSIS AND ITS SHORTCOMINGS

According to a simplified version of Atlas’s entailment analysis,3 the sentence ‘Only John can speak French’ asserts that John can speak French and that nobody other than John can speak French. (2a) shows a version of Atlas’s truth conditions for the sentence in question (ignoring tense and the internal composition of the predicate ‘can speak French’).

(2) a. $\text{can-speak-French}(j) \land \neg\exists y\,(y \neq j \land \text{can-speak-French}(y))$
    b. John can speak French and nobody other than John can speak French.

Because the logical form of the sentence ‘Only John can speak French’ is analysed as a conjunction, the sentence is predicted to entail that John can speak French (as well as that nobody other than John can speak French). The most compelling argument in support of the entailment analysis is that it surely seems like we would contradict ourselves were we to utter either (3) or (4).

(3) #Only John can speak French, and/but Bill can too.
(4) #Only John can speak French, and/but John cannot.

In (3), the second part of the sentence contradicts the second conjunct in (2b), that is, that nobody other than John can speak French. In (4), the second part of the sentence contradicts the first conjunct in (2b), that is, that John can speak French. Since according to the entailment story presented above, the two conjuncts are both entailed by the sentence ‘Only John can speak French’, and you cannot deny either without contradicting yourself, the data in (3) and (4) support this view. However, when we test whether the first conjunct in (2b) patterns like an entailment with respect to negation, it seems clear that it does not. Mere entailments of a sentence S are no longer entailed when S is negated.

(5) a. John bought a red Ferrari → John bought a Ferrari.
    b. John didn’t buy a red Ferrari ↛ John bought a Ferrari.

Now, when we negate our only sentence, as in (6), while the second conjunct in (2b) can no longer be true, the first conjunct still is.4

(6) a. Not only JohnF can speak French. Therefore, somebody other than John can speak French.
    b. Not only JohnF can speak French. Therefore, John can speak French.

The conclusion that seems reasonable to draw on the basis of these facts is that the two conjuncts in (2b) are not equal and that only the second conjunct (that nobody other than John can speak French) is an actual entailment of the sentence ‘Only John can speak French’. Of course, the judgment in (4) remains to be accounted for.

3 Horn (1996) points out that this view was already defended by some of the medieval scholars.

4 I will assume here that a sentence of the form ‘not only AF is B’ is the negation of the sentence ‘only AF is B’. Alternatively, someone might suggest that the actual negation of ‘only AF is B’ is something like ‘it is not true/the case that only AF is B’, and that once we take this to be the negation of the only sentence our intuition that the prejacent is required to be true no longer holds. I do not think, though, that the fact that the negation ‘it is not true/the case that’ weakens our intuition about the truth of the prejacent can tell us that, for example, the prejacent is not presupposed. The reason is that, even in cases that are taken to trigger a presupposition, the negation ‘it is not true/the case that’ seems to weaken the intuition that the presupposed material is actually being presupposed. For example, consider the contrast between the two negative sentences in (ib) and (ic), where (ia) presupposes that John has children:
(i) a. John went to the opera with his children. (Presupposes: John has children.)
    b. John didn’t go to the opera with his children: #he does not have children.
    c. It is not true that John went to the opera with his children: he does not have children.
Therefore, in the course of this article I will not use ‘it is not true/the case that’ to form the negation of an only sentence. In section 5.3, however, I will discuss a recent instance of the view that not only is not the negation of only, that is, the view proposed in Beaver (2004), according to which ‘not only AF is B’ is not the negation of ‘only AF is B’ in that in the negative sentence not associates with only and forms an idiomatic phrase.
Michela Ippolito 49
3 THE PRESUPPOSITION ANALYSIS AND ITS SHORTCOMINGS
The survival of an inference under negation is a typical feature of a presupposition. The example below illustrates this point with the presupposition trigger regret: both the positive and the negative sentence require that John was a smoker at some time before the speech time ($\gg$ means ‘presupposes’).
(7) a. John regrets having smoked.
b. John does not regret having smoked.
c. $\gg$ John smoked (at some time before the speech time).
The fact that the inference that John can speak French survives under negation [cf. (6b)] might be accounted for if only, like regret, were a presupposition trigger and the prejacent in question a presupposition. According to this view, the sentence would assert that nobody other than John can speak French and would presuppose that John can. Before we continue, let us make a somewhat more formal proposal about what the assertoric component of an only sentence should be. According to a Horn-style analysis of only, sentence (8) asserts that nobody other than John can speak French.
(8) Only JohnF can speak French.
Assuming that only is an operator taking three arguments (a contextual variable C, a proposition, and a world), the truth conditions for an only sentence would be as shown in (9).5
(9) $\mathit{only}(C)(\varphi)(w) = 1$ if $\forall \psi \in C\,(\psi(w) \rightarrow (\varphi \subseteq \psi))$
In our example, $\varphi$ is the proposition that John can speak French, and C is a contextually salient set of alternative propositions obtained by replacing the focus phrase with an alternative:6 the sentence asserts that for every proposition of the form ‘x can speak French’, if this proposition is true, then it must be entailed by the proposition that John can speak French. Since the only proposition of this form entailed by ‘John can speak French’ is that John can speak French, it follows that nobody other than John can speak French. The actual truth conditions for (8) are given in (10).
(10) $\mathit{only}(C)(\lambda w'.\,\text{John can speak French in } w')(w) = 1$ if $\forall \psi \in C\,(\psi(w) \rightarrow ([\lambda w'.\,\text{John can speak French in } w'] \subseteq \psi))$
5 This view of the truth conditions of an only sentence is not shared by everyone. One notable exception is Horn (2002): there, Horn argues that, while (8) asserts that nobody other than John can speak French, the prejacent is also entailed, even though not asserted. According to this view, (8) is false if John cannot speak French.
6 I will address the question of what C is in section 4.1.
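As an informal illustration, the truth conditions in (9) and (10) can be sketched in a toy model. The sketch below is only an assumption-laden rendering, not part of the analysis itself: worlds are modelled as the set of people who can speak French in them, propositions as sets of worlds, entailment as the subset relation, and the names PEOPLE, WORLDS, speaks and only are hypothetical helpers introduced for this example. The three-person domain is likewise an assumption.

```python
from itertools import combinations

# Hypothetical toy domain: three salient individuals.
PEOPLE = ("John", "Mary", "Bill")

# A world is modelled as the set of people who can speak French in it.
WORLDS = [frozenset(c) for r in range(len(PEOPLE) + 1)
          for c in combinations(PEOPLE, r)]

def speaks(person):
    """Proposition 'person can speak French': the set of worlds where it holds."""
    return frozenset(w for w in WORLDS if person in w)

# Assumed alternative set C: propositions of the form 'x can speak French'.
C = [speaks(p) for p in PEOPLE]

def only(C, phi, w):
    """Truth conditions in (9): every alternative psi in C that is true in w
    must be entailed by phi (phi is a subset of psi)."""
    return all(phi <= psi for psi in C if w in psi)

phi = speaks("John")  # the prejacent of (8)

# Worlds in which 'Only John can speak French' comes out true under (10).
true_worlds = [w for w in WORLDS if only(C, phi, w)]

# The prejacent is not entailed: (10) is also true where nobody speaks French.
prejacent_entailed = all(w in phi for w in true_worlds)
```

Under these assumptions, the sentence is true both in the world where John alone speaks French and in the world where nobody does, so the exclusive truth conditions by themselves leave the prejacent open.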
The truth conditions in (10) do not entail that John can speak French. Indeed, we saw above in our discussion of the entailment analysis that this meaning component does not behave like an entailment, unlike the exclusive proposition in (10). According to the presupposition analysis, that John can speak French is a presupposition of (8), and therefore this information will be preserved when the sentence is negated (i.e. the presupposition projects, since negation is a ‘hole’, to use the terminology in Karttunen 1973a).
(11) Only JohnF can speak French.
a. Asserted: nobody other than John can speak French.
b. Presupposed: John can speak French.
I will call this the ‘strong’ presupposition analysis.
3.1 Problems with the strong presupposition analysis
Presuppositions project under modal adjectives and adverbs, as shown below (Karttunen 1973a, among many others).
(12) a. It’s possible that John doesn’t regret having smoked for ten years.
b. Presupposed: John smoked for ten years.
Both Horn (1996) and Geurts & van der Sandt (2004) have observed, however, that modalized only sentences do not strongly suggest the truth of the only-less sentence. (13) is fine even though the speaker is not assuming the truth of the proposition that John can speak French.7
(13) It is possible that only John can speak French . . .
. . . and maybe not even he can.
7 It should be pointed out, though, that not all speakers agree that these are felicitous cases of cancellation. Therefore, this discussion applies to those dialects that allow cancellation of the prejacent.
Well-established instances of presuppositions, though, cannot be suspended: when we apply the test in (13) to them, we do not obtain a felicitous discourse.8
(14) It’s possible that John regrets having smoked . . .
. . . #and/#but maybe he never smoked.
(15) It’s possible that John quit smoking yesterday . . .
. . . #and/#but maybe he never smoked.
(16) It’s possible that John will go to the Opera with his wife . . .
. . . #and/#but maybe he is not married.
8 This claim needs some qualification. Some people have suggested that presuppositions can be ‘suspended’ in at least some cases. For example, Horn (1972) argued that presuppositions can be suspended if by doing so the proposition that is obtained is more ‘universal’. A reviewer suggests that the following sentences are better than (15) and (16), respectively:
(i) a. It’s possible that John didn’t quit smoking yesterday and maybe he never smoked in the first place.
b. I didn’t meet John’s wife yesterday and for all I know he isn’t even married.
However, it seems to me that the examples in (i) are acceptable only if the first conjunct ‘echoes’ a question under discussion which carries the presupposition in question, as illustrated in (ii).
(ii) Q: Did you meet John’s wife yesterday?
A: No, I didn’t meet John’s wife yesterday, and for all I know he’s not even married.
In this exchange, the following seems true: (a) the presupposition triggered by the definite phrase is not entailed by A’s doxastic state; (b) A’s use of the definite phrase John’s wife echoes a salient utterance (in this case, a salient question). Now, the following example seems to indicate that if the answerer does not believe that John has a wife and the expression John’s wife was not part of the salient question, then the answer is infelicitous.
(iii) A: Who did you meet at the party yesterday?
B: #I met John. I didn’t meet his wife, though, and for all I know he isn’t even married.
We can explain A’s answer in (ii) by appealing to local accommodation, which allows A to answer a question carrying the presupposition that John has a wife, while at the same time asserting that her doxastic state does not entail the locally accommodated proposition that John has a wife. The proposition in question is locally accommodated under the scope of negation.
(iv) Q: Did you meet John’s wife yesterday?
A: No, it is not the case that John has a wife and I met John’s wife yesterday, and for all I know he’s not even married.
However, going back to the discussion in the main text, local accommodation is not the default way of accommodating presuppositions (see Heim 1983 for the view that local accommodation is the preferred way of doing so), and the infelicity of example (14) in the text shows that it is not always available, not even when global accommodation would generate a contradiction. If the felicity of (13) were due to the rescuing effect of local accommodation, local accommodation should rescue (14) too, but it does not. An analysis of the conditions that need to be met for local accommodation to take place, though, must await another occasion.
As the infelicity of the continuations in (14)–(16) shows, the presuppositions triggered by regret, quit and the definite article cannot be ‘cancelled’. If the information that John can speak French in (11) were a presupposition, we would expect the same behaviour.9
The second problem has to do with question–answer pairs. Horn has observed that, if an only sentence presupposed the truth of the only-less sentence, and assuming that presuppositions are propositions that are part of the common ground (Stalnaker 1979, among others), then a question–answer sequence like (17) should be strange, because the answer would be presupposing a piece of information that is not known by the person who is asking the question. Contrary to this expectation, however, such question–answer pairs are perfectly felicitous (Horn 1996).10
To make things worse, standard cases of presuppositions do behave as expected, as shown by the following question–answer pairs, which are not felicitous unless a marked intonation is used. Whatever label we might decide to give to this intonation,11 it is certainly not required in (17).
(18) Q: Is John married?
A: #He went to the Opera with his wife.
(19) Q: Did you ever smoke?
A: #I haven’t quit.
(20) Q: Did John marry Sue?
A: #He doesn’t regret that he did.
9 One alternative would be to claim that there is not just a single type of presupposition trigger and that some presuppositions may be easier to cancel (von Fintel, personal communication). For instance, Karttunen (1973b) and Kay (1992) observed that some presuppositions, for example the presupposition triggered by a lexical verb like quit, disappear when embedded under an epistemic modal like maybe:
(i) (Looking at a guy we don’t know who’s chewing his finger nails.) Maybe he just quit smoking.
In response, we observed above that cancelling the presupposition of quit is not always easy, and this difference must be accounted for. Also, recently Simons (2001) and Abusch (2002) have suggested that these presuppositions that seem to disappear in some contexts might be better analysed as implicatures.
10 Horn attributes this observation to Roger Schwarzschild.
11 I think this intonation has an element of irony or sarcasm. A reviewer points out that there are examples where the connotation of the answer does not seem to be either satirical or ironic but rather ‘indirect’, and provides the following example:
(i) Q: Is John married?
A: Well, he went to the opera with his wife.
If so, I think that the presence of the adverb well, without an ironical or sarcastic intonation, indicates that the speaker has some evidence that the person whom John went to the opera with is his wife, even though she does not fully commit to the truth of that proposition. That is, A’s answer in (i) is understood as something like ‘It seems that he went to the opera with his wife’. In this sense, the reviewer’s choice of label seems appropriate. But the point in the text remains, since even the ‘indirect’ intonation is entirely absent from (17).
(17) A: Who can speak French? B: Only JohnF.
Now, if answering a question by presupposing the answer is indeed an illegitimate move in discourse, then (17) should be odd too. A defender of the strong presupposition analysis might object that there is a crucial difference between the question–answer pair in (17) and the question–answer pairs in (18)–(20).12 The difference, the objection goes, is that in the former but not in the latter cases, the asserted content of the answer is directly relevant to the question. In (18)–(20), the asserted content is not directly relevant to the question, but only indirectly relevant, since it presupposes the answer to the question. In (17), however, both the positive and the negative propositions are partial answers to the question. So, the defender of the strong presupposition analysis would claim that (18)–(20) sound worse than (17) because the asserted content in the former cases is irrelevant to the question. However, the following examples suggest that this difference is not what accounts for why we judge (17), but not (18)–(20), a felicitous question–answer pair. Consider (21) and (22).
(21) Q: Did Mary buy anything?
A: #It is a purse that she bought.
(22) Q: Did Mary buy anything?
A: She bought a purse.
There are two pieces of information encapsulated in the answer in (21): the presupposed information that Mary bought something and the asserted information that it was a purse. Now, these very same pieces of information are encoded in the answer in (22), the difference being that in the latter they are both entailments of the answer. According to the defender of the strong presupposition analysis, what makes the answer in (21) infelicitous is the fact that the asserted content is irrelevant. But this incorrectly predicts that the answer in (22) should be equally bad. The pairs in (23) and (24) raise the same problem for the defender of the strong presupposition analysis.
(23) Q: Did anybody lose anything?
A: #What John lost was a key.
(24) Q: Did anybody lose anything?
A: John lost a key.
We conclude that it is not what information is communicated, but how this information is packaged (as a presupposition or as an entailment), that causes the difference in judgment between (21) and (23) on
12 This objection was raised by Craige Roberts in her comments on an earlier draft of this paper.
the one hand and (22) and (24) on the other hand. The infelicity of (21) and (23) is due to the fact that they both presuppose the answer to the question.13 Similarly, the infelicity of (18)–(20) does not lie in the relevance of the information, contrary to what the defender of the strong presupposition analysis claims, but in the fact that the answer to the question is presupposed.
To sum up, in this section I retraced Horn’s arguments against the strong presupposition analysis, according to which only presupposes the truth of the only-less sentence; I then considered an objection from an advocate of the strong presupposition analysis and refuted it. In section 3.2, I will review the ‘weak’ presupposition analysis and present some arguments against it.
3.2 Weak presupposition analyses and their problems
In light of the problems just discussed, Horn (1996) suggested that the prejacent itself is not presupposed. He suggested that only functions semantically as the inverse of all, so that a sentence like ‘only A is/are B’ is the inverse of ‘all B are A’. Here are some examples.
(25) a. Only [Bostonians]F eat lobsters.
b. All the people who eat lobsters are Bostonians.
(26) a. Only [John]F can speak French.
b. All the people who can speak French are identical to John.
Since the (a) sentences are equivalent to the (b) sentences and since a universal quantifier has the existential requirement that its domain B not be empty, Horn suggested that the only sentence too will require that B not be empty.14 That is, as the (b) sentence in (25) requires that there is somebody who eats lobsters, the (a) sentence will too. Now, this proposition together with the assertion that nobody other than
13 More interesting is the following case, which one reviewer finds rather natural:
(i) Q: Do you love anyone (at all)?
A: It’s youF that I love!
However, it is enough to modify the reviewer’s example slightly to make the cleft answer infelicitous again. Consider the following:
(ii) Q: Does Mary love anyone (at all)?
A: #It’s BillF that she loves!
14 The question of whether the universal quantifier all carries an existential presupposition is part of a larger question about whether quantifiers in general carry an existential presupposition. Among the most prominent analyses are Strawson (1952), McCawley (1972), De Jong & Verkuyl (1985), Diesing (1992), Lappin & Reinhart (1988) and Abusch & Rooth (forthcoming). A clear review of some of these positions can be found in Heim & Kratzer (1998).
Bostonians eats lobsters entails the proposition that (at least some) Bostonians eat lobsters, but the latter is not a presupposition. Similarly in (26): the sentence asserts that nobody other than John can speak French, and it presupposes that somebody can speak French. Therefore, it follows from the assertion and the presupposition together that John can speak French. What is the status of the existential requirement that somebody eats lobsters and that somebody can speak French? Horn is only committed to the claim that this existential proposition is exactly like the existential requirement of a universal quantifier, leaving open the question whether it is a presupposition or an implicature. More recently, Geurts & van der Sandt (2004) have argued that an only sentence carries an existential presupposition, but unlike Horn, they claim that this presupposition is not triggered by only being the semantic inverse of all, but by the focus structure of the sentence and the Background-Presupposition Rule (BPR), stated in (27).
(27) Background-Presupposition Rule: Whenever focusing gives rise to a background $\lambda x.\varphi(x)$, there is a presupposition to the effect that $\lambda x.\varphi(x)$ holds of some individual.
In our example (25), the backgrounded material is $\lambda x.\text{eat-lobsters}(x)$, and according to the BPR, the presupposition will be $\exists x\,[\text{eat-lobsters}(x)]$. Therefore, in the story of Geurts and van der Sandt, as in Horn’s story, the proposition that Bostonians eat lobsters is not the presupposition itself but what follows from the existential presupposition together with the assertion that nobody other than Bostonians eats lobsters. Similarly, the (a) sentence in (26) presupposes that $\exists x\,[\text{can-speak-French}(x)]$ which, together with the assertion that nobody other than John can speak French, entails that John can speak French.
Against Horn’s proposal, Geurts and van der Sandt argue that there is a lack of parallelism between the universal quantifiers and only with respect to downward monotonicity. Horn’s prediction is that, since only A B is equivalent to all B A, the first argument of only (the second argument of $\forall$) should be upward monotone, whereas its second argument (the first argument of $\forall$) should be downward monotone. NPIs should then be allowed to occur only in the second argument of only, and not in the first argument, since NPIs are not allowed in the second argument of $\forall$. However, according to Geurts and van der Sandt, the following data challenge Horn’s thesis.15
15 The issue of NPI licensing in the first argument of only is discussed in detail in von Fintel (1997).
(28) a. Only the students who had ever read anything about polarity passed.
b. *All students who passed had ever read anything about polarity.
(29) a. Only the guests who had seen any of the suspects were questioned.
b. *All the guests who were questioned had seen any of the suspects.
Geurts and van der Sandt assume that the NPIs in the only sentences in (28) and (29) are licensed by only, and propose that the logical form of an only sentence does not feature a universal quantifier but negation and an existential quantifier, as shown below.
(30) $\mathit{only}(C)(\varphi)(w) = 1$ iff $\neg\exists \psi \in C\,(\psi(w) = 1 \wedge (\varphi \not\subseteq \psi))$
However, Beaver (2004) has argued that what licenses the NPIs in the only sentences in (28) and (29) is not only but the definite phrase ‘the students who’, which is confirmed by the grammaticality of (31) (from Beaver 2004).
(31) The students who had ever read anything about polarity passed (with ease).
Furthermore, Beaver observes that those NPIs that do occur in the focus of only cannot have an NPI interpretation but only a literal one. For example, (32) can only be interpreted as asserting that Freda budged exactly one inch (compared with ‘Freda did not budge an inch’, which means that she did not move at all).
(32) Freda only budged an inch.
Even more problematic for their idea that NPIs are licensed in the focus of only is the fact that many NPIs are not licensed in that position. The following example is again from Beaver (2004).
(33) a. *Freda only has [any money]F.
b. *Freda only ate muchF.
Putting aside their arguments against Horn’s analysis of only, we are left with the claim of Geurts and van der Sandt that an only sentence carries an existential presupposition. Ignoring the fact that the existential presupposition has different sources in the theories by Horn and Geurts and van der Sandt, both theories agree that the prejacent is entailed by the exclusive proposition together with the existential presupposition. In what follows, I will present some challenges for the existential analyses of both Horn and Geurts and van der Sandt. Before I do this,
let me mention that, if it is assumed that wh-questions carry an existential presupposition, then both proposals might help with the problem raised by the question–answer pairs in (17), because what B would be presupposing is not that John can speak French (something that A clearly does not know) but the weaker proposition that somebody can speak French, something that A might very well know without knowing who that person is. However, the claim that wh-questions carry such a presupposition is at best controversial.16
What are the problems for the weak presupposition analysis? First, it seems that the weak presupposition analysis of only is too weak for cases where a plural NP is in the scope of the particle. For example, the sentence (34) suggests that John and Mary can speak French, but the weak presupposition analysis predicts that the sentence should only presuppose that somebody can speak French.
(34) Only [John and Mary]F can speak French.
The sentence asserts that nobody other than John and Mary can speak French and presupposes that somebody can speak French. Since it does not follow from this that both John and Mary can speak French, we cannot account for the intuition that what is implicated in (34) is that both John and Mary can speak French.17
Second, there is a problem with the negative-only sentence. Consider (35).
(35) Not only JohnF can speak French.
The sentence asserts that somebody other than John can speak French, and it presupposes that somebody can speak French. It does not follow from the assertion and the presupposition that John can speak French. Now, Geurts and van der Sandt are aware of this problem, and to account for the fact that the prejacent [that John can speak French in (35)] is preserved in a negative sentence, they resort to an implicature story. According to this story, the sentence in (35) only conversationally implicates that John can speak French. This implicature arises from the speaker not having uttered (36).
(36) JohnF cannot speak French.
It follows from the BPR that the sentence in (36) presupposes that somebody can speak French. Moreover, (36) asserts that John cannot
16 Among the people who argue that questions carry existential presuppositions are Postal (1971) and Comorovski (1996). For arguments against this idea, see Fitzpatrick (2005).
17 This problem has been independently pointed out by van Rooij & Schulz (2005).
speak French. Since (35) presupposes that somebody can speak French, but asserts that somebody other than John can speak French, Geurts and van der Sandt suggest that (36) is more informative than (35), presumably because, if the presupposition of both sentences (that somebody can speak French) is true, then the assertion of (36) asymmetrically entails the assertion of (35): if it is true that somebody can speak French and John cannot [the last conjunct is (36)’s assertion], then it is true that somebody other than John can speak French [(35)’s assertion]. However, not vice versa: if it is true that somebody other than John can speak French, it does not follow that John cannot speak French. This proposal predicts that (i) in the positive only sentence ‘Only John can speak French’, the prejacent follows from the presupposition together with the assertion and, therefore, it should not be cancelable; (ii) in the negative-only sentence ‘Not only John can speak French’, the prejacent is a scalar implicature, and therefore should be cancelable, like other scalar implicatures. The problem is that the judgments point in exactly the opposite direction. In the following pair, the second conjunct ‘and maybe not even he can’ (with which the speaker conveys the information that she does not know whether John can speak French) is compatible with the positive-only sentence, but incompatible with the negative-only sentence, just the opposite of what Geurts and van der Sandt predict.
(37) Only John can speak French, and maybe not even he can.
(38) #Not only John can speak French, and maybe he can’t.
Third, we know that presuppositions project when they occur in the antecedent of an (indicative) conditional. For example, in the (a) example in (39), the presupposition triggered by regret (that John smoked) projects.
(39) a. If John regrets having smoked, he will tell his children not to smoke.
b. If John quits smoking tomorrow, he will be grumpy.
The prediction of the existential presupposition analysis is that, when only occurs in the antecedent, the $\exists x[\varphi(x)]$ presupposition will project. Consider the example in (40).
(40) a. If Sue invites only [John]F for dinner, she will upset Mary . . . so, she will either invite nobody or John and Mary.
b. $\exists x$[Sue invites x for dinner]
If the antecedent presupposes $\exists x$[Sue invites x for dinner], this presupposition would project, that is (40) could only be added to
a common ground that entails the presupposition in question. However, the speaker of (40) is not required to be assuming that Sue will invite somebody, since she clearly considers inviting nobody an option. Similarly, if this existential requirement is a presupposition, then it should show the same projection behaviour as any other presupposition, which Geurts and van der Sandt do try to show. However, it seems to me that there is a wealth of cases where clearly it does not. First, consider the lack of projection under the modal adjective possible.
(41) a. It is possible that only John will arrive at the party on time . . .
. . . and maybe not even he will.
b. It is possible that John will quit smoking tomorrow . . .
. . . #but maybe he’s not a smoker.
c. It is possible that John will be glad to have quit smoking . . .
. . . #but maybe he didn’t quit.
Despite the fact that the presuppositions triggered by lexical verbs sometimes fail to project (see footnote 9), there is nevertheless a contrast between them and the existential requirement associated with only, since only the latter seems to be cancellable in the above examples.
4 THE IMPLICATURE ANALYSIS AND ITS PROBLEMS
McCawley (1981) suggested that, while it asserts that nobody other than Muriel, Lyndon and Ed voted for Hubert, the sentence (42) conversationally implicates (43).
(42) Only Muriel, Lyndon, and Ed voted for Hubert.
(43) Muriel, Lyndon and Ed voted for Hubert.
If one of the persons enumerated in (42) is known by the speaker not to have voted for Hubert, then, in uttering (42) the speaker is being misleading, since she could have been more informative by leaving the person out of the list. Since the speaker is not being misleading (she is being cooperative in the Gricean sense), it must be the case that she either does not know whether Muriel, Lyndon and Ed voted for Hubert or knows that they did. More recently a version of the implicature analysis of only has been defended in van Rooij & Schulz (2005). There are two crucial problems for the implicature analyses. The first problem is to explain why the truth of the prejacent is conveyed by a negative-only sentence, which is not at all what we expect from
conversational implicatures. The second problem is to explain why the implicature in an only sentence cannot be simply cancelled, as shown in (44a). The prejacent can be suspended only by means of an epistemic operator such as maybe, as shown in (44b) (we will discuss the contrast in (44) in detail in section 5.3.1 and what van Rooij and Schulz have to say about it in section 6).
(44) a. #Only Mary can speak French: in fact, not even she can.
b. Only Mary can speak French, and maybe not even she can.
In what follows I will argue that only does trigger a presupposition, but this presupposition is not any of the ones discussed so far in the literature. Once we understand what the actual presupposition of only is, we will see that this presupposition, contrary to the strong and weak presuppositions discussed above, does behave just like we expect a presupposition to behave. This presupposition, together with a conversational implicature, will account for the properties of only.
5 A NEW ANALYSIS
The discussion of (some of) the previous theories of the meaning of only and of their shortcomings has highlighted what a satisfactory theory of only should do. It should answer the following questions:
1. Why a positive only sentence of the form only AF is B conveys that nobody other than A is B (the exclusive proposition) and that A is B (the prejacent).
2. Why its negative counterpart not only AF is B conveys that someone other than A is B (the negation of the exclusive proposition) and that A is B (the prejacent).
3. Why the prejacent is suspendable in a positive sentence but only by means of an epistemic operator such as maybe [cf. (44)]; that is, why ‘only A is B’ can be followed by ‘and maybe not even A’, but is infelicitous if the speaker believes that nobody is B.
4. Why the prejacent is not suspendable in a negative sentence.
5. Why the prejacent behaves differently from standard cases of presuppositions in modalized contexts and in questions.
6. Why a sentence of the form ‘only [A and B]F are C’ conveys that both A and B are C.
We saw that the strong presupposition analyses have difficulties with points 3 and 5; weak presupposition analyses have difficulties with
points 2, 3, 4, 5 and 6; and implicature analyses have difficulties with points 2 and 3. The central ideas of the analysis that I will propose below are the following:
(A) The exclusive proposition is the assertoric content of an only sentence.
(B) Only triggers a conditional presupposition: for instance, the sentence ‘only A is B’ presupposes that if someone (in the relevant set of alternatives) is B, A is B.
(C) Because of the conditional presupposition triggered by only, if a negative only sentence is defined and true, the prejacent must be true too.
(D) The prejacent of a positive-only sentence is a scalar implicature.
(E) A pragmatic ban against vacuously true presuppositions prevents a speaker from uttering ‘only A is B’ when she knows that nobody is B. In other words, since the speaker is presupposing that if someone is B, A is B, there is a requirement that the speaker believe that it is possible that somebody is B. Thus, the ban against vacuously true presuppositions together with only’s presupposition require that the speaker believe that it is (at least) possible that A is B.
Because the prejacent of a positive-only sentence is a scalar implicature, it can be suspended (even though not cancelled simpliciter because of the ban against vacuous presuppositions). However, from the satisfaction of the conditional presupposition and the truth of the negative-only sentence it follows that the prejacent must be true. This last point explains why the prejacent of a negative-only sentence is not suspendable, unlike the prejacent of a positive-only sentence. Last, it follows from what we said that the prejacent does not pattern like presuppositions in modalized contexts and questions because it is not a presupposition. In what follows I will elaborate on these ideas and show in detail how they address the points on the to-do list above.
5.1 The assertion and the presupposition
I will couch my analysis in Rooth’s alternative semantics, but before I begin, let me review the analysis of the focusing adverb only developed in Rooth (1992, 1996). According to Rooth, focus evokes a set of alternative propositions in a presuppositional way. This idea is implemented by using a focus operator $\sim$, which introduces a presupposed set of alternatives, as follows.
(45) Where $\varphi$ is a syntactic phrase and C is a syntactically covert semantic variable, $\varphi \sim C$ introduces the presupposition that C is a subset of $[\![\varphi]\!]^f$ containing $[\![\varphi]\!]^o$ and at least one other element.18
Consider our example again, whose structure is given in (47).
(46) Only JohnF can speak French.
(47) [tree diagram not reproduced]
The focusing adverb quantifies over propositions and, like other quantifiers in natural language, its domain is restricted. The Roothian idea of association with focus is implemented as follows: the restriction of the adverb is a variable coindexed with the presuppositional variable introduced by the focus operator $\sim$.
The interpretation of C is a matter of anaphora resolution but the presupposition introduced by $\sim$ constrains the possible values of C along the lines specified in (45). Below is a Roothian lexical entry for only, where the truth of the prejacent is presupposed.19 This presupposition is expressed as a partiality condition on the second argument of the function denoted by the adverb.
(48) $[\![\text{only}]\!]^w = \lambda C \in D_{\langle\langle st\rangle t\rangle} .\ \lambda p : p \in D_{\langle st\rangle} \text{ and } p(w) .\ \forall q \in C\,(q(w) \rightarrow (p \subseteq q))$
Suppose that there are three people salient in the context of utterance, John, Mary and Bill. The focus presupposition in ‘Only JohnF can speak French’ requires that C be a subset of $[\![\text{John}_F \text{ can speak French}]\!]^f$ including at least the proposition that John can speak French and some other proposition, but identifying the value of C is a matter of anaphora resolution. Assuming that the value of C is the Roothian set of alternatives closed under conjunction, in our scenario the value of C will be the following set of alternatives.
18 $[\![\alpha]\!]^o$ denotes the ordinary semantic value of the phrase $\alpha$; $[\![\alpha]\!]^f$ denotes its focus value. For example, if $\alpha$ is the sentence ‘John runs’, then $[\![\text{John}_F \text{ runs}]\!]^o$ denotes the proposition that John runs ($\lambda w.$ John runs in $w$), and $[\![[\text{John}]_F \text{ runs}]\!]^f$ denotes the set of propositions of the form ‘x runs’.
19 This is a variant of Rooth’s (1996) definition.
(49) C = {John can speak French; Mary can speak French; Bill can speak French; John and Mary can speak French; John and Bill can speak French; Mary and Bill can speak French; John and Mary and Bill can speak French}.
According to the meaning in (48), the sentence in (46) presupposes that John can speak French and is true if there is no proposition in C that is true and not entailed by the proposition that John can speak French. The negation of (46) also presupposes that John can speak French and is true if there is a proposition in C that is true and not entailed by the proposition that John can speak French. Now, just like in Rooth’s analysis, I will take the negative proposition to constitute the assertoric content of an only sentence, but unlike it, I will not take the prejacent to be presupposed. I will keep the Roothian focus machinery and assume that C is required by the focus operator to be a subset of $\llbracket \text{John}_F \text{ can speak French} \rrbracket^{f}$ including at least the proposition that John can speak French and some other proposition [in our hypothetical context, C is the set in (49)]. Only quantifies over the Roothian set of alternatives closed under conjunction, and the sentence containing the adverb asserts that there is no proposition in this set that is true and not entailed by p (the proposition that John can speak French). My proposal is that only does trigger a presupposition but that this presupposition is neither the prejacent nor an existential proposition. Only presupposes that if someone (in the relevant set) can speak French, John can speak French.20 Because we are using quantification over propositions in C, we will restate the presupposition as follows: ‘if some proposition in C is true, it is true that John can speak French’. This is expressed more generally in (50).
(50) Presupposition triggered by only (where $\varphi$ is the sentence in its scope): If some proposition in C is true, then $\varphi$ is true.
20 At some earlier stage, I had formulated the only-presupposition as follows: ‘either John can speak French and someone other than John (in the relevant set) can speak French or nobody other than John (in the relevant set) can speak French’. Thanks to David Beaver for pointing out that this presupposition is equivalent to the more perspicuous proposition in the text above. Here is the proof of the equivalence. The presupposition ‘either John can speak French and someone other than John can speak French or nobody other than John can speak French’ has the form $(A \wedge B) \vee \neg B$, and this formula is equivalent to $B \rightarrow (A \wedge B)$, which in turn is equivalent to $B \rightarrow A$. Hence, our presupposition is equivalent to the following: ‘if someone other than John can speak French, John can speak French’. Because it is a tautology that if John can speak French, then John can speak French, this conditional will be equivalent to the more general proposition that if someone can speak French, John can speak French.
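The propositional equivalences used in footnote 20 can be checked mechanically. The following is a minimal sketch, not part of the original argument: it enumerates the four truth-value assignments for A (‘John can speak French’) and B (‘someone other than John can speak French’). The helper `implies` and all variable names are introduced here for illustration only.

```python
from itertools import product

def implies(p, q):
    """Material conditional: false only when p is true and q is false."""
    return (not p) or q

# A: John can speak French.
# B: someone other than John (in the relevant set) can speak French.
rows = list(product([False, True], repeat=2))

earlier_formulation = [(a and b) or (not b) for a, b in rows]  # (A and B) or not B
intermediate_step = [implies(b, a and b) for a, b in rows]     # B -> (A and B)
simplified = [implies(b, a) for a, b in rows]                  # B -> A
generalized = [implies(a or b, a) for a, b in rows]            # (A or B) -> A

# All four formulas agree on every assignment.
all_equivalent = (
    earlier_formulation == intermediate_step == simplified == generalized
)
print(all_equivalent)
```

The last formula, (A or B) -> A, corresponds to the generalized statement ‘if someone can speak French, John can speak French’.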
The truth conditions for the only sentence in (46) will be as shown in (51).21
(51) $\llbracket \textit{only}(C_8)\ \text{John}_F \text{ can speak French} \sim C_8 \rrbracket^{o,R,w}$ is defined only if the following conditions are satisfied:
(1) $R(8) \subseteq \llbracket \text{John can speak French} \rrbracket^{f}$ and $\llbracket \text{John can speak French} \rrbracket^{o} \in R(8)$ and $|R(8)| \geq 2$ (focus pres.)
(2) If some proposition in $R(8)$ is true, $\llbracket \text{John can speak French} \rrbracket^{o,w} = 1$ (only-pres.)
If defined, the sentence $\llbracket \textit{only}(C_8)\ \text{John can speak French} \sim C_8 \rrbracket^{o,R,w}$
$= 1$ if $\neg\exists p \in R(8) : p(w) \wedge \llbracket \text{John can speak French} \rrbracket^{o} \nsubseteq p$
$= 0$ if $\exists p \in R(8) : p(w) \wedge \llbracket \text{John can speak French} \rrbracket^{o} \nsubseteq p$
What follows from this proposal? Consider a world where John and Mary can speak French. In this world, the only presupposition is satisfied: the antecedent is true because there is a true proposition in C8, and the consequent is true because it is true that John can speak French. Therefore, (46) is defined. Furthermore, (46) is false because in C8 there is a proposition that is true and not entailed by the proposition that John can speak French, that is the proposition that John and Mary can speak French [cf. example (52)].
(52) (Scenario: John and Mary can speak French) Only John can speak French. (FALSE)
In a world, however, where only John can speak French, the sentence is defined and true [cf. (53)].
(53) (Scenario: Only John can speak French) Only John can speak French. (TRUE)
Consider now a world where only Mary can speak French. In this world, the sentence is undefined because the presupposition is false: the consequent is false (because John cannot speak French) but the antecedent is true (because there is a true proposition in C8, that is the proposition that Mary can speak French). Since the presupposition is false, the sentence is undefined [cf. example (54)].
(54) (Scenario: only Mary can speak French) #Only John can speak French. (UNDEFINED)
21 The value of C is whatever value the contextual parameter R (the assignment function) assigns to the integer 8.
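The definedness and truth conditions in (51) can be tried out on a small model. The sketch below is an illustration under simplifying assumptions, not the author's formalization: a world is identified with the set of salient people who can speak French, a proposition of the form ‘X can speak French’ with the set X of people it names, and the focus presupposition (condition 1) is taken to be satisfied. The names `alternatives`, `holds`, `entails` and `only` are hypothetical helpers introduced here.

```python
from itertools import combinations

PEOPLE = ("John", "Mary", "Bill")

def alternatives(people):
    """Alternatives closed under conjunction, as in (49):
    one proposition per non-empty group of salient people."""
    return [frozenset(group)
            for size in range(1, len(people) + 1)
            for group in combinations(people, size)]

def holds(prop, world):
    """'X can speak French' is true iff everyone in X speaks French in the world."""
    return prop <= world

def entails(p, q):
    """In this model p entails q iff q names a subset of the people p names."""
    return q <= p

def only(focus, world, alts):
    """Return True or False, or None when the only-presupposition fails."""
    prejacent = frozenset(focus)
    some_alternative_true = any(holds(q, world) for q in alts)
    # (2) only-presupposition: if some alternative is true, the prejacent is true.
    if some_alternative_true and not holds(prejacent, world):
        return None
    # Assertion: no true alternative that is not entailed by the prejacent.
    return not any(holds(q, world) and not entails(prejacent, q) for q in alts)

ALTS = alternatives(PEOPLE)
scenario_52 = only(["John"], frozenset({"John", "Mary"}), ALTS)  # false
scenario_53 = only(["John"], frozenset({"John"}), ALTS)          # true
scenario_54 = only(["John"], frozenset({"Mary"}), ALTS)          # undefined (None)
nobody_case = only(["John"], frozenset(), ALTS)                  # vacuously true
```

In a world where nobody can speak French the function returns True, because the conditional presupposition is vacuously satisfied; the sketch does not model any further pragmatic constraint on such cases.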
The theory also correctly explains the infelicity of the sentence ‘Only John and Mary can speak French’ in a world where only John can speak French.
(55) (Scenario: Only John can speak French) Only John and Mary can speak French. (UNDEFINED)
In this world, the conditional presupposition would be false since, while there is a true proposition among the alternatives (that John can speak French), the prejacent is not true. The sentence is thus correctly predicted to be undefined. Notice that the same prediction with respect to (54) and (55) is made by the strong presuppositional theories. Take (54): according to these theories, the sentence (54), uttered in a context where only Mary can speak French, is undefined because the presupposition that John can speak French is not satisfied. The weak presupposition analyses, on the other hand, make the prediction that, when uttered in a context entailing that only Mary can speak French, (54) is false.22 This analysis also works for cases where the focus phrase associated with only is a numeral or a quantifier.
(56) Only twoF students can speak French.
The set of alternatives C is totally ordered: {one student can speak French; two students can speak French; three students can speak French}. The presupposition in (56) will be satisfied only if two or more students can speak French; it will not be satisfied otherwise (i.e. if fewer than two students can speak French). If defined, the sentence will be true if at most two students can speak French, and it will be false if more than two students can.23 On a different note, this analysis also predicts that when the complement of only is a contradictory proposition, the sentence is always undefined.24 For example, consider the odd sentence in (57).
(57) #Only squares are circles.
Assume for the sake of the argument that C is {squares are circles; circles are circles; triangles are circles}. Now, the presupposition requires that if there is a true proposition in C, then it is true that squares are circles. There is a true proposition in C, that is the proposition that circles are circles, but because it is not true that squares are circles, the presupposition is false and the sentence is undefined.
22 A trickier case is one where nobody can speak French. Given what we have said so far, the analysis seems to predict that the sentence ‘Only John can speak French’ should be felicitous and true in a context where no student can speak French, contrary to our intuition that the sentence should come out undefined in this case. I will address this problem in section 5.3.1 and I will offer a solution based on a pragmatic constraint against vacuously true presuppositions.
23 The problem with cases where no student can speak French arises here too. As I mentioned in the preceding footnote, this problem is addressed in section 5.3.1.
24 Thanks to Josh Brown for raising the issue of contradictions embedded under only.
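The numeral case in (56) can be sketched in the same way. The model below is an assumption-laden illustration: a world is reduced to the number of students who can speak French, the alternative ‘n students can speak French’ is read as ‘at least n’, and the scale (1, 2, 3) mirrors the three alternatives listed in the text. The function name `only_numeral` is hypothetical.

```python
def only_numeral(n, count, scale=(1, 2, 3)):
    """Evaluate 'Only n_F students can speak French' when exactly `count` students can.

    Returns True or False, or None when the conditional presupposition fails.
    The alternative 'm students can speak French' is read as 'at least m',
    so a higher alternative entails every lower one.
    """
    some_alternative_true = any(count >= m for m in scale)
    prejacent_true = count >= n
    # Presupposition: if some alternative is true, the prejacent is true.
    if some_alternative_true and not prejacent_true:
        return None
    # Assertion: no true alternative that is not entailed by the prejacent,
    # i.e. no true alternative strictly higher than n on the scale.
    return not any(count >= m and m > n for m in scale)

# Values for 0, 1, 2 and 3 French-speaking students.
verdicts = [only_numeral(2, count) for count in range(4)]
```

With zero students the conditional presupposition is vacuously satisfied and the sketch returns True, which is the case footnote 23 sets aside for section 5.3.1.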
5.2 The scalar implicature
The prejacent of a positive-only sentence is an inference that follows from the assertion together with an existential conversational scalar implicature. More precisely, the prejacent (John can speak French) arises from a competition between the actual utterance ‘Only JohnF can speak French’ and a more informative one, that is ‘Nobody can speak French’. Given our discussion so far, the meaning of the former sentence is (58). The value of C8 is whatever set of propositions the contextually salient assignment function R assigns to the index 8. Let us assume that R(8) is the set given in (49).
(58) $\llbracket \textit{only}(C_8)\ [\text{John}_F \text{ can speak French}] \sim C_8 \rrbracket^{o,R,w} = 1$ if $\neg\exists p \in R(8)\,[p(w) \wedge \llbracket [\text{John}_F \text{ can speak French}] \rrbracket^{o} \nsubseteq p]$
Let us now look at ‘Nobody can speak French’. As argued in von Fintel (1994) and others, the interpretation of a sentence with a quantifier is relative to a contextually salient set of individuals, which is a subset of the entire domain $D_e$. In our example, the interpretation of (59) depends on the contextually salient value that the salient assignment function assigns to the variable S5, the covert restriction of the negative quantifier nobody.25
(59) $\llbracket \text{Nobody}(S_5) \text{ can speak French} \rrbracket^{R,w} = 1$ iff $\neg\exists x \in R(5)\,[\text{can-speak-French}(x)]$
Because we are comparing (58) and (59) as uttered in the same context, it is reasonable to assume that R(5) is the set of salient individuals (and their sums) such that the set of propositions of the form ‘x can speak French’ where x is a member of R(5) is R(8), that is the salient set of alternatives for only. Given the standard assumption about the covert restriction of quantifiers and since we are interpreting both sentences in the same context, (59) asymmetrically entails (58), and because of this asymmetric entailment, these two propositions can compete according to the neo-Gricean paradigm.26 In (60), I show the computation of the scalar implicature (which I call the ‘weak’ implicature) and how the prejacent (which I call the ‘strong’ implicature) follows from the weak implicature together with the assertion. (I ignore presuppositions for simplicity’s sake.)
25 Technically, we might want to think of indices as pairs $\langle n, \tau \rangle$ where n is a number and $\tau$ is a semantic type (Heim & Kratzer 1998); so, technically, the index on C in (58) should be $\langle 8, \langle st, t \rangle\rangle$, while the index on S in (59) should be $\langle 5, \langle et \rangle\rangle$. In the text, though, I will keep the notation simple and ignore the semantic type.
5.3 Consequences
Unlike the weak presupposition analyses in Horn (1996) and Geurts & van der Sandt (2004), this proposal can account for only sentences with conjoined NPs in focus, and it can explain why when an only sentence is negated, the prejacent remains. Furthermore, unlike both weak and strong presupposition analyses, this proposal also explains the projection and cancellation facts presented above. First, let us consider the problem of conjoined NPs in focus. Recall that weak presupposition analyses were not able to derive the fact that the prejacent of (61) is that John and Mary can speak French. What weak presupposition analyses derive is only the weak presupposition that somebody can speak French.
26 Notice that if the stronger proposition, that is (59), is true, the presupposition triggered by only in the weaker proposition (58) is technically satisfied: if nobody in a salient set can speak French, then the conditional presupposition in (58) is true for any proposition in the salient set.
(60) a. S uttered ‘Only JohnF can speak French’, which asserts: $\neg\exists p \in C_8\,[p(w) \wedge \llbracket \text{John}_F \text{ can speak French} \rrbracket^{o} \nsubseteq p]$
b. The sentence ‘Nobody can speak French’ is stronger and should have been uttered (Maxim of Quantity): $\neg\exists x \in S_5\,[\text{can-speak-French}(x)]$, where, in the context of utterance, $S_5$ is the set of individuals (and their sums) such that the set of propositions of the form ‘x can speak French’, where $x \in S_5$, is $C_8$.
c. Since the speaker is being cooperative, it must be the case that she is not in a position to utter the stronger statement without violating the Maxim of Quality, that is she is not epistemically certain that nobody can speak French. Assuming that she is knowledgeable about the subject matter, she must be epistemically certain that some salient individual can speak French. This is the weak implicature.
d. Since the speaker asserted that nobody other than John can speak French and implicated that somebody can speak French, it follows that she is epistemically certain that John can speak French. This is the strong implicature.
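The step from the weak to the strong implicature in (60) can be illustrated by filtering a set of candidate worlds. This is a rough sketch under stated assumptions: the speaker's epistemic state is modelled as the set of worlds compatible with what she asserted and implicated, each world is the set of salient people who can speak French, and presuppositions are ignored, as in (60). All names are hypothetical.

```python
from itertools import combinations

PEOPLE = ("John", "Mary", "Bill")

# Each world is identified with the set of salient people who can speak French.
worlds = [frozenset(group)
          for size in range(len(PEOPLE) + 1)
          for group in combinations(PEOPLE, size)]

# (60a) Assertion: nobody other than John can speak French.
after_assertion = [w for w in worlds if w <= {"John"}]

# (60c) Weak implicature: some salient individual can speak French.
after_weak_implicature = [w for w in after_assertion if len(w) > 0]

# (60d) Strong implicature: John can speak French in every remaining world.
strong_implicature_holds = all("John" in w for w in after_weak_implicature)
```

The assertion alone leaves two candidate worlds (nobody speaks French, or only John does); the weak implicature removes the first, so only the world in which John speaks French remains.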
(61) Only [John and Mary]F can speak French.
a. Prejacent: John and Mary can speak French.
b. Assertion: Nobody other than John and Mary can speak French.
I will first spell out the truth conditions of (61), and then I will show how the truth of the prejacent follows from two implicatures that the sentence generates. Suppose there are three salient people: John, Mary and Bill. Focus is going to introduce a presupposed set of alternatives C8. Let C8 be as shown in (62).
(62) C8 = {John can speak French; Mary can speak French; Bill can speak French; John and Mary can speak French; John and Bill can speak French; Mary and Bill can speak French; John and Mary and Bill can speak French}.
The truth conditions for (61) are given in (63).
(63) $\llbracket \textit{only}(C_8)\ [_S\ [\text{John and Mary}]_F \text{ can speak French}] \sim C_8 \rrbracket^{o,R,w}$ is defined only if the following conditions are satisfied:
(1) $R(8) \subseteq \llbracket [\text{John and Mary}]_F \text{ can speak French} \rrbracket^{f}$ and $\llbracket [\text{John and Mary}]_F \text{ can speak French} \rrbracket^{o} \in R(8)$ and $|R(8)| \geq 2$ (focus pres.)
(2) If some proposition in $R(8)$ is true, $\llbracket [\text{John and Mary}]_F \text{ can speak French} \rrbracket^{o,w} = 1$ (only-pres.)
If defined, $\llbracket \textit{only}(C_8)\ [_S\ [\text{John and Mary}]_F \text{ can speak French}] \sim C_8 \rrbracket^{o,R,w}$
$= 1$ if $\neg\exists p \in R(8) : p(w) \wedge \llbracket [\text{John and Mary}]_F \text{ can speak French} \rrbracket^{o} \nsubseteq p$
$= 0$ if $\exists p \in R(8) : p(w) \wedge \llbracket [\text{John and Mary}]_F \text{ can speak French} \rrbracket^{o} \nsubseteq p$
The presupposition (2) in (63) requires that if some proposition in C8 is true, then it is true that John and Mary can speak French. These truth conditions make the sentence (61) true in precisely those situations where nobody other than John and Mary can speak French: (i) in a world where only John and Mary can speak French, (63) is defined and true; (ii) in a world where John and Mary and Bill can speak French, the sentence is defined and false; (iii) in a world where only Bill can speak French, (63) is undefined since the antecedent is true (Bill can speak French) but the consequent is false (John and Mary cannot speak French); (iv) in a world where only John can speak French (as well as in a world where only Mary can speak French), the sentence is undefined because the conditional presupposition is false.
Now that we have accounted for the assertion, how do we derive the prejacent that John and Mary can speak French? Just like with the simple cases, the prejacent is a scalar implicature, calculated as follows.
(64) a. The speaker uttered ‘Only [John and Mary]F can speak French’, which asserts: $\neg\exists p \in C_8 : p(w) \wedge \llbracket [\text{John and Mary}]_F \text{ can speak French} \rrbracket^{o} \nsubseteq p$
b. The sentences ‘Only [John]F can speak French’ and ‘Only [Mary]F can speak French’ are stronger and either of them should have been uttered (Maxim of Quantity):
1. $\neg\exists p \in R(8) : p(w) \wedge \llbracket [\text{Mary}]_F \text{ can speak French} \rrbracket^{o} \nsubseteq p$
2. $\neg\exists p \in R(8) : p(w) \wedge \llbracket [\text{John}]_F \text{ can speak French} \rrbracket^{o} \nsubseteq p$
c. Since the speaker is being cooperative, it must be the case that she is not in a position to utter either stronger statement without violating the Maxim of Quality, that is she is not epistemically certain that nobody other than John can speak French and she is not epistemically certain that nobody other than Mary can speak French. Assuming that she is knowledgeable about the subject matter, she must be epistemically certain that there is somebody other than Mary who can speak French and that there is somebody other than John who can speak French. These are the weak implicatures.
d. Since the speaker asserted that nobody other than John and Mary can speak French and implicated that somebody other than Mary can speak French and that somebody other than John can speak French, it follows that the speaker is epistemically certain that John and Mary can speak French. This is the strong implicature.
The current proposal also accounts for only sentences with bare plurals such as ‘Only Bostonians voted for John’, whose prejacent is that some Bostonians voted for John (maybe not even half of them).27 Since the stronger statement that the speaker could have made is that nobody voted for John, the scalar implicature will be that somebody voted for John. Since the speaker asserted that no non-Bostonian voted for John, it follows that (at least) some Bostonian voted for John.28
27 Horn (1996), von Fintel (1997), among others.
28 A trickier case is the case of only sentences where the associate is a disjoined noun phrase, as in the following sentence.
(i) Only Mary or Sue voted for John.
I must leave the question of what the truth-conditions for (i) are open.
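The computation in (64) for the conjoined focus can be illustrated in the same style. The sketch below makes the same simplifying assumptions as before (worlds as sets of French speakers among three salient people, presuppositions ignored), and its names are hypothetical.

```python
from itertools import combinations

PEOPLE = ("John", "Mary", "Bill")

# Each world is identified with the set of salient people who can speak French.
worlds = [frozenset(group)
          for size in range(len(PEOPLE) + 1)
          for group in combinations(PEOPLE, size)]

# (64a) Assertion: nobody other than John and Mary can speak French.
after_assertion = [w for w in worlds if w <= {"John", "Mary"}]

# (64c) Weak implicatures: somebody other than Mary can speak French,
# and somebody other than John can speak French.
after_weak_implicatures = [
    w for w in after_assertion
    if len(w - {"Mary"}) > 0 and len(w - {"John"}) > 0
]

# (64d) Strong implicature: both John and Mary speak French in every remaining world.
strong_implicature_holds = all({"John", "Mary"} <= w for w in after_weak_implicatures)
```

Once the assertion has excluded Bill, the only person other than Mary who could speak French is John, and the only person other than John is Mary, which is why the two weak implicatures jointly yield the prejacent.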
Second, let us consider the case of negative-only sentences. Just like the strong presupposition analysis, but unlike the weak presupposition and the entailment analyses, the present analysis offers an explanation for why, when an only sentence is negated, the prejacent survives: as we will see in detail below, in the current proposal the prejacent of a negative sentence follows from the satisfaction of the only-presupposition together with the assertion. Suppose someone were to utter (65).
(65) Not only John can speak French.
Prejacent: John can speak French.
A condition for adding the proposition expressed by an utterance of (65) to the common ground is that the common ground already entails its presupposition, that is (66).
(66) Presupposition: If some proposition in C is true, it is true that John can speak French.
If this presupposition is satisfied (i.e. if it is entailed by the common ground), then (65) can be added too. Now, as shown in (67), (65) asserts that there is a proposition in C8 that is true and not entailed by the proposition that John can speak French.
(67) $\llbracket \textit{not only}(C_8)\ \text{John}_F \text{ can speak French} \sim C_8 \rrbracket^{o,R,w} = 1$ if $\exists p \in R(8)\,[p(w) \wedge \llbracket \text{John}_F \text{ can speak French} \rrbracket^{o} \nsubseteq p]$
Therefore, as shown below, when the sentence is successfully added to the common ground, the common ground ends up entailing that John can speak French.
(68) P1. If some proposition in C is true, it is true that John can speak French.
P2. $\exists p \in C_8\,[p(w) \wedge \llbracket \text{John}_F \text{ can speak French} \rrbracket^{o} \nsubseteq p]$
C. It is true that John can speak French.
Related to this point is the observation that this analysis correctly predicts that, while the prejacent of a positive-only sentence should be cancellable, the prejacent of a negative sentence should not since it is an entailment. Recall that we observed above that the weak presupposition analysis (at least the version of Geurts and van der Sandt) makes exactly the opposite prediction, that is that the prejacent should be cancellable in the negative sentence but not in the positive one.
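The inference in (68) can be checked by brute force over a small model. This is an illustrative sketch only: worlds are sets of French speakers among three salient people, the alternatives are those in (49), and `p1` and `p2` are hypothetical names for the two premises.

```python
from itertools import combinations

PEOPLE = ("John", "Mary", "Bill")
JOHN = frozenset({"John"})

# Each world is identified with the set of salient people who can speak French.
worlds = [frozenset(group)
          for size in range(len(PEOPLE) + 1)
          for group in combinations(PEOPLE, size)]

# Alternatives closed under conjunction: every non-empty group of people.
alts = [w for w in worlds if w]

def p1(world):
    """P1 (presupposition): if some alternative is true, John can speak French."""
    return (not any(q <= world for q in alts)) or JOHN <= world

def p2(world):
    """P2 (assertion of the negated sentence): some true alternative is not
    entailed by 'John can speak French'."""
    return any(q <= world and not q <= JOHN for q in alts)

supporting_worlds = [w for w in worlds if p1(w) and p2(w)]

# C: John can speak French in every world that satisfies both premises.
conclusion_follows = all(JOHN <= w for w in supporting_worlds)

# The assertion alone does not guarantee the prejacent.
p2_alone_suffices = all(JOHN <= w for w in worlds if p2(w))
```

The second check fails because a world in which only Mary speaks French satisfies P2; it is the presupposition P1 that rules such worlds out.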
The data contradict weak presupposition analyses and support our proposal.
(69) Only John is strong enough to swim across the British Channel, and maybe not even him.
(70) #Not only John is strong enough to swim across the British Channel, and maybe John isn’t.
In footnote 4, I mentioned that Beaver (2006) has suggested an alternative analysis of not only sentences, according to which a sentence of the form ‘not only A is B’ is not the mere negation of the corresponding ‘only A is B’ sentence. Beaver hypothesizes that not only forms a constituent and has an idiomatic nature on the basis of his observation that, while the sentence implies the truth of the prejacent when negation and only are adjacent, this implication is absent when negation and only are separated by some linguistic material. The following pair and the judgments are from Beaver (2004).
(71) I don’t know whether Muriel likes Hubert, but . . .
a. *Muriel likes not only Hubert.
b. Muriel does not like only Hubert.
From the perspective of the weak existential analysis, the infelicity of (71a) is problematic since from the assertion that Muriel likes somebody other than Hubert and the presupposition that Muriel likes someone it does not follow that Muriel likes Hubert: precisely because the prejacent is not presupposed, the sentence should be felicitous when the speaker does not know whether Muriel likes Hubert. Beaver tries to reconcile the facts in (71) with the weak existential presupposition by suggesting that (71a) is not the negation of ‘Muriel likes only Hubert’ for the following reason. Not only is a constituent, where not associates with only, and therefore the relevant set of alternatives in (71a) is {Muriel likes Hubert; Muriel likes only Hubert}; because one alternative is presupposed to be true, it follows that it is presupposed that Muriel likes Hubert. Therefore, according to Beaver, the particular semantics of not only conspires to implicate the truth of the prejacent, thus explaining why (71a) is bad but, because (71a) is not the negation of ‘Muriel likes only Hubert’, its infelicity does not threaten the proposal of Geurts and van der Sandt. As for (71b), Beaver claims that it is the negation of ‘Muriel likes only Hubert’ and its felicity in the context of (71) is taken to be an argument in favour of the weak presupposition analysis of Geurts and van der Sandt.
However, consider the following example. If the particle too is missing, the second sentence is not felicitous.29
(72) a. Mary didn’t eat only the cookies. #She ate the chocolate.
b. Mary didn’t eat only the cookies. She ate the chocolate too.
29 The same judgments hold in Italian:
(i) Maria non ha mangiato solo/soltanto i biscotti. #Ha mangiato il cioccolato.
Maria not has eaten only the cookies. Has eaten the chocolate.
‘Maria didn’t eat only the cookies. #She ate the chocolate.’
(ii) Maria non ha mangiato solo/soltanto i biscotti. Ha mangiato anche il cioccolato.
Maria not has eaten only the cookies. Has eaten also the chocolate.
‘Maria didn’t eat only the cookies. She ate the chocolate too.’
We know from Kripke (1990) and Heim (1990), among others, that the presupposition triggered by too is not an existential proposition but a singular proposition. For example, if Kripke’s example in (73) merely presupposed that there is someone other than Sam who is having dinner in New York tonight, then, since we all know that this proposition is true, the sentence should be felicitous out-of-the-blue, because its presupposition is taken to be true in the common ground.
(73) Sam is having dinner in New York tonight, too.
However, the sentence is not felicitous unless there is a contextually salient individual who is known to have dinner in New York tonight, as shown below.
(74) Tom is having dinner in New York tonight. Sam is having dinner in New York tonight too.
Hence, following the work cited above, we take too to be implicitly anaphoric and to mean something like ‘in addition to x’: it associates with focus and is coindexed with a phrase in the sentence whose denotation is the value of the variable x. The presupposition triggered by too in (74) is that some salient individual other than Sam is having dinner in New York tonight. Because Tom is salient and the context entails that Tom is having dinner in New York tonight, the presupposition of the second sentence is satisfied and the sentence is felicitous. Let us go back to (72a). In Beaver’s version of the proposal of Geurts and van der Sandt, the sentence asserts that Mary ate something other than the cookies and, since not and only do not form a constituent in the first sentence, the presupposition triggered by the BPR is that Mary ate something. Because the sentence neither entails nor presupposes
that Mary ate the cookies, too should (at the very least) not be required to occur and the sentence should be fine without it: the first sentence in (72a) would presuppose that Mary ate something, and would assert that Mary ate something other than the cookies; the second sentence would then assert that she ate chocolate. The infelicity of the sentence without too is unexplained and challenges Beaver’s attempt to defend the weak presupposition analysis of Geurts and van der Sandt. On the other hand, the contrast between (72a) and (72b) is easily explained by the present proposal (as well as by strong presupposition analyses), together with Heim’s observation that an utterance is required to presuppose (at least some of the) information that is entailed by the context (and that is also salient and relevant to the topic under discussion).30 For example, imagine a situation where A already gave B the chance to tell the truth, but B did not. In this context, we judge B’s utterance with the indefinite article a inappropriate.
(75) A already gave B the chance to tell the truth, but B didn’t. So, now B says:
a. #Please, give me a chance.
b. Please, give me another chance.
Going back to the contrast between (72a) and (72b), according to our proposal, the former is infelicitous because, when the second sentence is uttered, the context already entails that Mary ate the cookies and since this proposition is salient and relevant to the topic under discussion (plausibly, what Mary ate), we are required to presuppose it. With respect to the pair in (71), while I found that there is a contrast between (71a) and (71b), with the former being worse than the latter, the speakers I interviewed found (71b) odd too. The infelicity of (71a), together with the fact that speakers do not find the sequence with (71b) flawless, suggests to me that we should be cautious in taking the contrast in (71) as telling us that (71b), unlike (71a), does not entail/presuppose that Muriel likes Hubert. One might argue that both (71a) and (71b) entail/presuppose that Muriel likes Hubert and are both infelicitous in (71), but that what rescues (71b) cannot rescue (71a) thus causing the latter to be more deviant. One possibility might be that, while (71b) is rescued by locally accommodating the conditional presupposition (thus preventing the context from entailing the prejacent proposition once the assertion was added), local accommodation is not possible in (71a).
30 In Pesetsky (2000), this observation is attributed to Irene Heim, via Kai von Fintel.
For convenience, I will use the context change semantics notation to
illustrate this point (cf. Heim 1983, 1992). The operation of adding the context change potential of ‘Muriel does not like only Hubert’ to the context set looks as follows (technically, we add the context change potential of the LF of ‘Muriel does not like only Hubert’, that is ‘not [only(C8) [Muriel likes Hubert] $\sim$ C8]’).
(76) $c + \textit{not}\ [\text{Muriel likes only Hubert}] = c \setminus (c + \text{Muriel likes only Hubert})$
(77) $c + \textit{not}\ [\text{Muriel likes only Hubert}] = c \setminus (c + \text{if Muriel likes someone, Muriel likes Hubert} + \text{Muriel likes only Hubert})$
The resulting c will contain only worlds where Muriel likes someone other than Hubert but not Hubert, and worlds where she likes Hubert and someone else. Since this is consistent with the first assertion in (71), the discourse is acceptable. Now, let us consider (71a). As observed in Wagner (2007) and references cited there, the occurrence of negation in (71a) seems to be an instance of substitutive negation, that is a negation which associates with focus, negates the sentence in which it occurs, and entails the truth of one of the alternatives. Typically the true alternative is introduced by but, and in some cases it must be, as shown below.
(78) a. *Muriel likes not Hubert.
b. Muriel likes not HubertF but Lyndon.
Suppose that, whether the but phrase is overt or not, substitutive negation is always part of a two-place operator taking two propositions as its arguments, that is ‘NOT-BUT(p, q)’: if there is no overt but phrase, there is a covert one that is interpreted. At the relevant level of interpretation, then, the form of (78b) will be the following, where the TP in the second argument of the operator is elided and must find a suitable antecedent in the first argument.
The addition operation + is only defined if the presupposition in the only sentence (i.e. that if Muriel likes someone, Muriel likes Hubert) is entailed by c. However, this presupposition cannot be globally accommodated in c since the speaker said in (71) that she does not know whether Muriel likes Hubert: if it were globally accommodated in c, the addition of the proposition that Muriel likes someone other than Hubert [the assertoric content of (71b)] would require c to entail that Muriel likes both Hubert and someone else, which contradicts the first sentence in (71) and causes c to be empty. However, the conditional presupposition can be locally accommodated as shown below.
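The difference between global and local accommodation of the conditional presupposition can be made concrete with sets of worlds. The sketch below is an illustration under simplifying assumptions: only two individuals are salient (Hubert and Lyndon, the latter borrowed from a later example), a world is the set of people Muriel likes, and the initial context contains every world because the speaker does not know whom Muriel likes. The function names are hypothetical.

```python
from itertools import combinations

PEOPLE = ("Hubert", "Lyndon")

# A world is identified with the set of salient people Muriel likes.
context = [frozenset(group)
           for size in range(len(PEOPLE) + 1)
           for group in combinations(PEOPLE, size)]

def presupposition(world):
    """If Muriel likes someone, Muriel likes Hubert."""
    return len(world) == 0 or "Hubert" in world

def likes_only_hubert(world):
    """Assertoric content of the positive sentence: she likes nobody other than Hubert."""
    return world <= {"Hubert"}

def update_negation_global(c):
    """Accommodate the presupposition in c first, then apply c \\ (c + p)."""
    accommodated = [w for w in c if presupposition(w)]
    return [w for w in accommodated if not likes_only_hubert(w)]

def update_negation_local(c):
    """Accommodate the presupposition inside the scope of negation, as in the
    local update: c \\ (c + presupposition + Muriel likes only Hubert)."""
    inner = [w for w in c if presupposition(w) and likes_only_hubert(w)]
    return [w for w in c if w not in inner]

global_result = update_negation_global(context)
local_result = update_negation_local(context)
```

In this model the global update leaves only the world where Muriel likes both Hubert and Lyndon, which settles that she likes Hubert and so clashes with the speaker's stated ignorance. The local update keeps both a world where she likes Lyndon but not Hubert and a world where she likes both, so the question of whether she likes Hubert stays open.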
(79) NOT [HubertF [$\lambda_1$ Muriel likes $t_1$]] BUT [Lyndon
Turning to (71a), this sentence is understood as meaning that Muriel likes not only Hubert but some (salient) person other than Hubert. Therefore, at the relevant level of interpretation the form of (71a) is similar to the form of (78b), that is something like ‘[NOT [only [HubertF [$\lambda_1$ Muriel likes $t_1$]]] BUT [pro2 ]’, where the underlined material is covert, pro2 is a covert pronoun whose value is some salient individual other than Hubert, and the struck through part is the elided material whose antecedent is a relevant phrase in the first argument.31 We could represent the structure as follows.
(80) [Tree diagram not recoverable from the source text.]
On the one hand, for the reason explained above [i.e. in order not to create a contradiction with the preceding sentence in (71)], the conditional presupposition in the only sentence cannot be accommodated globally; on the other hand, since a presupposition can be accommodated locally only if it occurs within the scope of the relevant operator, local accommodation cannot apply here since the only sentence does not occur in the scope of the whole operator not-but. The only option is for the conditional presupposition to project to the top of the sentence, making the whole utterance incompatible with the context of utterance and causing the discourse to be infelicitous. If this idea is correct, the infelicity of (71a) has nothing to do with only. That this might in fact be the case is indicated by the fact that the following pair of examples shows the same pattern as (71) even though only is absent: while local accommodation is possible with simple negation, it is ruled out by substitutive negation.
31 Notice that, independently of the infelicity of (71a) in the context of (71), speakers find (71a) bad because the but phrase is missing, which is consistent with the hypothesis that substitutive negation is part of a complex operator taking two arguments.
(81) I don’t know whether John has a wife, but . . .
a. ?he certainly didn’t have a fight with her. He looks so happy!
b. *he certainly had a fight not with her but with Fred.
(82) It is possible that only John will arrive at the party on time . . .
. . . and maybe not even he will.
(83) It is possible that John is glad to have quit smoking . . .
. . . # and maybe he didn’t quit.
Horn (1996) makes the same point with the following example.32
(84) John will go to church only on Sunday, and may not even on Sunday if there is a football match in the morning.
Furthermore, the present proposal predicts that, when a negative-only sentence is modalized, the prejacent should not ‘project’, because it is an entailment. I think that the following examples support this
32 Note that, contrary to Atlas’s (1993) claim that ‘if one takes the epistemic, modal quantifier as essential in tests of cancellation, anything, including logical entailment, will, incoherently, turn out to be cancelable’, the only-less counterpart of (84) is not coherent:
(i) #John will go to church on Sunday, and maybe not if there is a football match in the morning.
These facts are consistent with our hypothesis that local accommodation is not allowed because the sentence carrying the presupposition is not in the scope of the discontinuous operator not-but. If this hypothesis is correct, the deviance of (71a) is not due to the fact that it is not the negation of an only sentence: (71a) negates the positive-only sentence just as much as (71b) does, but because this is an instance of substitutive negation, the conditional presupposition occurring in the first argument of not-but cannot be locally accommodated since that position does not count as being in the scope of the operator. The account of the contrast between (71b) and (71a) above is tentative and, for reasons of space, I cannot explore this hypothesis further. However, these facts and the point raised in the text about the occurrence of too indicate that more avenues need to be explored before concluding with Beaver that the contrast between (71a) and (71b) supports the weak presupposition analysis. The third welcome consequence of our approach is that the lack of projection of the prejacent under modal adjectives—the examples are repeated below—is no longer surprising since implicatures do not project under modal adjectives.
prediction since the speaker does not seem committed to the truth of the proposition that Mary will take Syntax in (85).33
(85) Mary will not take Phonology. She might take Syntax or she might take Semantics. So, if she doesn’t take only Syntax, she will take Semantics.
5.3.1 The need for an epistemic operator when cancelling the implicature
The prejacent is a conversational implicature. However, cancelling it by asserting its negation is not possible, as shown in (86).
#Only Mary can speak French – in fact, not even she can.
This is not expected if the prejacent is a conversational implicature since we know that cancelling a conversational implicature is possible and does not give rise to a contradiction.
(87) Mary saw some of Antonioni’s movies – in fact, she saw all of them.
However, when the implicature is suspended by means of an epistemic operator such as maybe, the sentence is felicitous.
(88) Only Mary can speak French, and maybe not even she can.
This raises the question of why (88), but not (86), is acceptable. The sentence ‘Nobody can speak French’ asymmetrically entails ‘Only Mary can speak French’. Therefore, upon uttering the weaker of the two, the conversational implicature that the speaker does not know that nobody can speak French arises. Call this implicature Implicature I. Assuming that the speaker is knowledgeable about who can speak French, the stronger implicature that the speaker knows that someone can speak French arises too. Call this implicature Implicature II. In other words, because of the logical relation between ‘Nobody can speak French’ and ‘Only Mary can speak French’, not having uttered the former has the implications we called Implicatures I and II.
33 This argument, however, may not be very convincing one way or another, as someone intending to defend the presuppositional view might claim that the felicity of (85) is an instance of local accommodation, just like in the following example from Kadmon (2001).
(i) a. It is possible that John has children and it is possible that his children are away.
b. It is possible that John has children and it is possible that John has children and his children are away.
On the other hand, in recent years there have been a number of studies suggesting that the ease with which some alleged lexical presuppositions are ‘cancelled’ is suspicious and casts doubt on the presuppositional nature of the propositions in question [e.g. that John has children in (i)], which might be best analysed as conversational implicatures (see Simons 2001 and Abusch 2002).
(86)
(89) Presupposition triggered by only in ‘Only Mary can speak French’: If there is a true proposition in C, then Mary can speak French.
When applied to (89), the constraint against a vacuously true presupposition makes the sentence ‘Only Mary can speak French’ infelicitous when added to a context set entailing that nobody can speak French. That is, it makes the sentence felicitous only when added to context sets that are compatible with (if not entailing) the proposition that someone can speak French. If this constraint is satisfied, the next step is to check that the presupposition in (89) is entailed.34
34 One might think that the constraint against vacuously true presuppositions interferes with the calculation of the quantity implicature in that in contexts where the constraint is satisfied, the speaker should not be in a position to utter the stronger alternative ‘Nobody can speak French’. This is a worry but I would like to suggest that this constraint does not in fact interfere with the calculation of the implicature because, being a merely pragmatic constraint, it does not affect the logical relation between ‘Only Mary can speak French’ and ‘Nobody can speak French’. That purely pragmatic constraints do not interfere with the calculation of scalar implicature is suggested by a different but related case: an utterance of the sentence ‘Not all round squares are perfect’ conversationally implicates that some round square is perfect, even though the stronger competitor, ‘No round square is perfect’, does not satisfy the pragmatic constraint against vacuously true assertions. That is, we know that there exists no round square and, therefore, the pragmatic constraint is violated. If this violation prevented the stronger proposition from entering the competition (since, knowing that it is vacuously true, the speaker should not be in a position to utter the stronger statement), then an utterance of ‘Not all round squares are perfect’ would never implicate that there is a round square that is perfect.
Now, the felicity of (88) shows that cancelling Implicature II (that the speaker knows that someone can speak French) is possible. However, cancelling Implicature I is not possible, as shown in (86). I am going to suggest that this impossibility is due to a pragmatic (not a semantic) constraint on the actual utterance ‘Only Mary can speak French’, that is a pragmatic constraint against presupposing a vacuous truth. Roughly, this constraint bans sentences uttered in contexts where their presuppositions are vacuously true. I think we can justify the constraint against vacuously true presuppositions within a theory where presuppositions are admittance conditions (Karttunen 1974; Stalnaker 1974), that is requirements that the sentences carrying the presuppositions impose on the context of utterance. The goal of an assertion is to shrink the context set by eliminating the worlds that are incompatible with it: if you make an assertion, that assertion is going to be felicitous only if it meets that conversational goal. Similarly, we can view the goal of a presupposition as that of indicating which context sets will admit the sentence and which ones will not: if a sentence presupposes a proposition that is vacuously true in the context of utterance, that presupposition is in effect imposing no requirement at all on the context set and, therefore, the presupposition does not meet its conversational goal. Recall that the presupposition triggered by only is the conditional proposition below.
(90) W4: Nobody can speak French
#Only MaryF can speak French.
#Only twoF students can speak French.
Finally, as the examples above show, cancelling the prejacent requires either even or either in the suspender clause.35
(91) a. Only Mary can speak French, and maybe not even she can.
b. Only Mary can speak French, and maybe she can’t either.
c. ??Only Mary can speak French, and maybe she can’t.
Let us begin with (91b). The need for either [compare (91b) to (91c)] is again reminiscent of Heim’s observation that the speaker is required to presuppose (at least some) information that is salient and relevant to her utterance (see example (75) above and related discussion). Since the first conjunct in (91b) asserts that everyone other than Mary cannot speak French and since this information is salient and relevant, in the
35 A reviewer suggested that bare focus on she, as in (i), might be enough to rescue the sentence for those speakers who accept (91a) and (91b). However, I did not find this to be true: the speakers I interviewed found (i) as infelicitous as (91c).
(i) ??Only Mary can speak French, and maybe sheF can’t.
In light of the idea that speakers are required to ‘maximize’ the presupposed information in their utterances, and in light of the fact that, as argued already by Rooth, focus itself does not introduce an existential presupposition, the infelicity of (i) is expected.
Among the context sets that survive the pragmatic constraint are the following four: c1 includes worlds where Mary alone can speak French, worlds where John alone can speak French, worlds where both John and Mary can speak French and worlds where nobody can speak French; c2 includes worlds where Mary alone can speak French and worlds where nobody can; c3 includes worlds where John alone can speak French and worlds where nobody can; c4 includes worlds where both John and Mary can speak French and worlds where nobody can. Now, c1 and c3 do not satisfy the conditional presupposition since the latter is not true in all those worlds, but c2 and c4 do, since in all of their worlds it is true that if someone can speak French, Mary can. It follows that context sets that obey the pragmatic constraint against vacuously true presuppositions and satisfy the conditional presupposition must be (at least) compatible with the proposition that Mary can speak French, and therefore, Implicature I cannot be cancelled without contradiction. The infelicity of the sentences ‘Only MaryF can speak French’ and ‘Only two students can speak French’ uttered in a context where it is known that nobody speaks French follows similarly from the pragmatic constraint against vacuously true presuppositions.
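The reasoning about c1 to c4 can be checked mechanically. What follows is only a minimal sketch under simplifying assumptions that are not part of the analysis itself: the domain contains just Mary and John, a world is represented by the set of people who can speak French in it, and the helper names (`is_non_vacuous`, `satisfies_presupposition`, `admits_only_mary`, `compatible_with_mary`) are hypothetical labels chosen for illustration.

```python
# Toy model (an assumption for illustration): a world is the set of people
# who can speak French in it; a context set is a list of such worlds.
M, J = "Mary", "John"
NOBODY = frozenset()

CONTEXTS = {
    "c1": [frozenset({M}), frozenset({J}), frozenset({M, J}), NOBODY],
    "c2": [frozenset({M}), NOBODY],
    "c3": [frozenset({J}), NOBODY],
    "c4": [frozenset({M, J}), NOBODY],
    "W4": [NOBODY],  # a context entailing that nobody can speak French
}


def is_non_vacuous(context):
    """Pragmatic constraint: 'someone can speak French' holds in some world,
    so the conditional presupposition is not vacuously true."""
    return any(len(world) > 0 for world in context)


def satisfies_presupposition(context):
    """Conditional presupposition: in every world, if someone can speak
    French, then Mary can."""
    return all(M in world for world in context if world)


def admits_only_mary(context):
    """A context admits 'Only Mary can speak French' if it passes both checks."""
    return is_non_vacuous(context) and satisfies_presupposition(context)


def compatible_with_mary(context):
    """True if at least one world has Mary speaking French."""
    return any(M in world for world in context)


for name, context in CONTEXTS.items():
    print(name, admits_only_mary(context), compatible_with_mary(context))
```

In this toy model only c2 and c4 pass both checks, and both are compatible with Mary speaking French, whereas the context W4 fails the non-vacuity check.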
5.4 Further issues
5.4.1 Downward entailingness
von Fintel (1999) observes that from the truth of (92a) one cannot infer the truth of (92b).
(92) a. Only John ate vegetables for breakfast.
b. Only John ate kale for breakfast.
The inference failure in (92) conflicts with the observation that only is an NPI licenser, as shown in the examples below from Horn (2002), where the author observes that not only does only license weak NPIs such as ever and any but it also licenses minimizers such as lift a finger and at all.
(93) a. Only John ever suspected David Alexander.
b. Only young writers ever accept suggestions with any sincerity.
c. (Of all her friends,) Only Phil would lift a finger to help Lucy.
d. My nose and my lungs are only alive at all because they are part of my body and share its common life. (C. S. Lewis, Mere Christianity)
36 For the view that the existential requirement is a presupposition, see Karttunen & Peters (1979), Rooth (1985) and Wilkinson (1996), among others; for criticisms of this view, see Krifka (1991), von Stechow (1991) and Rullmann (1997), among others. For the view that what underlies the scalar implicature is not likelihood, see, for example, Kay (1990).
second conjunct the speaker is required to presuppose it: either satisfies this requirement since it triggers the presupposition that somebody other than Mary cannot speak French. As for the contribution of even in (91a), it is twofold: on the one hand, it contributes the presupposition or implicature (depending on the theory) that someone other than Mary cannot speak French; on the other hand, it contributes a scalar presupposition, for example that Mary is the most likely person to be able to speak French.36 Suppose for convenience that both these requirements are presuppositions triggered by even. For this sentence to be felicitous, both these presuppositions must be satisfied. First, the existential presupposition is satisfied by the assertion of the first conjunct (since C must contain the proposition that Mary can speak French and at least one alternative). Second, the scalar presupposition is satisfied by the presupposition of only, since I would argue that believing that if anyone (in the relevant set) can speak French, Mary can, seems to commit the believer to the proposition that Mary is more likely than anyone else (in that relevant set) to be able to speak French. Therefore, if the first conjunct in (91a) is defined and true, the scalar presupposition of even is satisfied too.
According to von Fintel (1999), we can account for the licensing of NPIs if we assume that what licenses an NPI is not a downward entailing (DE) operator but what he labels a ‘Strawson-downward entailing’ (SDE) operator, where an SDE operator is defined as follows.
(94) Strawson Downward Entailingness: A function f of type ⟨σ, τ⟩ is Strawson-DE iff for all x, y of type σ such that x ⇒ y and f(x) is defined: f(y) ⇒ f(x).
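The difference between plain and Strawson downward entailment can be illustrated with a small enumeration. This is only a sketch under assumptions of my own: a two-person domain, three breakfast options, and a partial semantics for ‘Only John P’ that is undefined unless someone satisfies P (the weak existential presupposition) and otherwise true iff nobody other than John satisfies P. All function names are hypothetical.

```python
from itertools import product

# Toy domain (assumed for illustration only).
PEOPLE = ("John", "Bill")
FOODS = ("kale", "carrots", "nothing")  # kale and carrots count as vegetables


def worlds():
    """Enumerate every assignment of one breakfast item to each person."""
    for combo in product(FOODS, repeat=len(PEOPLE)):
        yield dict(zip(PEOPLE, combo))


def ate_vegetables(world, person):
    return world[person] in ("kale", "carrots")


def ate_kale(world, person):
    return world[person] == "kale"


def only_john(pred, world):
    """Partial truth value of 'Only John P': None when the assumed existential
    presupposition fails, otherwise True iff nobody but John satisfies P."""
    if not any(pred(world, x) for x in PEOPLE):
        return None
    return all(not pred(world, x) for x in PEOPLE if x != "John")


def entails(premise, conclusion, strawson):
    """Check entailment over all worlds. With strawson=True, worlds where the
    conclusion is undefined are set aside, as in the Strawson-DE definition."""
    for world in worlds():
        if premise(world) is not True:
            continue
        value = conclusion(world)
        if strawson and value is None:
            continue
        if value is not True:
            return False
    return True


def only_john_vegetables(world):
    return only_john(ate_vegetables, world)


def only_john_kale(world):
    return only_john(ate_kale, world)


print(entails(only_john_vegetables, only_john_kale, strawson=False))
print(entails(only_john_vegetables, only_john_kale, strawson=True))
```

In this model the plain entailment fails because of worlds in which John ate carrots and nobody ate kale, where the conclusion is undefined; once such worlds are set aside, the inference from vegetables to kale goes through.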
(95) a. Someone ate kale for breakfast.
b. Only John ate vegetables for breakfast.
c. Only John ate kale for breakfast.
In an implicature account of only’s prejacent, the NPI licensing with only is not a puzzle since (92a) does downward entail (92b), and the inference in (92) is predicted to be valid. But the question remains why the inference in (92) is strange unless it is known that someone (i.e. John) ate kale for breakfast. Consider the contrast between (92) and (96).
(96) A: Who ate kale for breakfast?
B: We know that only John ate vegetables for breakfast. Therefore, only John ate kale for breakfast, if anybody at all.
In (92), the implicature that John ate kale for breakfast does not follow from the assertion that nobody other than John ate vegetables for breakfast or from the implicature that John ate vegetables for breakfast. It is then plausible to maintain that it is the implicature that John ate kale for breakfast that causes the inference to be strange. In (96), however, this implicature is suspended by the clause ‘if anybody at all’ and the felicity of this example suggests that, when the implicature is suspended, the inference is in fact valid because the assertoric content
37 In von Fintel (1999), the first premise, that is the presupposition of the only sentence that represents the conclusion of the argument, is that John ate kale for breakfast. For the reason explained earlier, the weak existential proposition that someone ate kale for breakfast is enough to make the argument valid.
According to von Fintel’s analysis, (92a) does not downward entail (92b). It does Strawson-downward entail it, under the assumption that ‘someone ate kale for breakfast’ is a presupposition of (95c).37
of ‘Only John ate kale for breakfast’ does follow from the assertoric content of ‘Only John ate vegetables for breakfast’.38
(97) a. Only [Hillary]F trusts Bill, if (even) she does/and perhaps even she does not.
b. *Only [Hillary]F trusts Bill, and (even) she does not.
Their proposal goes as follows. An utterance of ‘Only Hillary trusts Bill’ (semantically) means that nobody other than Hillary trusts Bill and has a weak implicature (what they call the ‘Gricean interpretation’) according to which the speaker does not know whether Hillary does not trust Bill. Assuming that the speaker is maximally competent (what they call the principle of ‘maximizing competence’), though, this
38 In this paper, I have not considered the expression at most n and its counterparts in other languages. Take the Italian counterpart of at most in (i):
(i) Gianni parla al massimo una lingua.
    Gianni speaks at most one language
One reason for doing so is that I am not convinced that (i) asserts what ‘Gianni speaks only one language’ does. It seems to me that at most, at least when occurring in matrix sentences like (i), might have an ‘epistemic’ or ‘evidential’ flavour that is absent from the only sentences so that (i) might be paraphrased as something like ‘the strongest proposition compatible with what I take to be true is that Gianni speaks one language’. If so, (i) would presuppose that there is a strongest (relevant) proposition compatible with what the speaker takes to be true and would assert that this proposition is the proposition that Gianni speaks one language. In fact, a reviewer points out that an epistemic analysis for at least, at most and related modifiers has been recently proposed by Geurts & Nouwen (2007).
5.4.2 van Rooij & Schulz (2005)
Recently, van Rooij & Schulz (2005) have independently argued for an implicature analysis of only. Similarly to the view I have defended here, van Rooij and Schulz suggest that the exclusive proposition is the only component that is asserted by a positive-only sentence, and that the prejacent is a conversational implicature and not a presupposition (whether strong or weak). However, despite this similarity, the two analyses crucially diverge in two important respects. The first point of divergence concerns negative-only sentences. van Rooij and Schulz do mention that the prejacent is inferred in negative-only sentences too, but they do not address the natural objection to any implicature analysis, that is that implicatures do not survive in negative sentences. The second point of divergence concerns Atlas’s observation that it is not possible to deny the prejacent of an only clause. Following on Atlas’s observation, we pointed out in section 5.3.1 that we can only suspend the implicature by using an epistemic operator such as maybe and possibly. van Rooij and Schulz notice the same contrast with respect to the following pair.
(98) a. A: Who will you send a copy of your new book to?
B: I’ll send a copy to John or Mary. In fact, I’ll send a copy to both of them.
b. A: What would you like for Christmas?
B: I would like a book by Alice Munro. In fact, I would like all of them.
van Rooij and Schulz, however, claim that cases where the speaker successfully denies the weak implicature are not counterexamples to their claim that weak implicatures are not cancellable because this type of cancellation is only possible in contexts where the truth of the stronger proposition (whose negation is the implicature) is not relevant. Assuming, correctly, that in an only sentence the prejacent is always relevant, it seems to follow that its cancellation is not possible. Because it is assumed that a quantity implicature is drawn only if a stronger proposition is (contextually) relevant, van Rooij and Schulz’s suggestion amounts to the claim that the cases where it looks like we are denying the implicature are actually cases where no implicature was (conversationally) derived to begin with. This idea is interesting but problematic in that it does not seem to work for (98) and similar examples. Take (98b). Suppose A is asking what B wants for Christmas so that she can buy it: in this situation, it is very relevant to know exactly how many books by Munro B wants, since that will determine what A will buy. Because the stronger proposition (that B wants all the books by Munro) is relevant, an utterance of the sentence ‘I want a book by Alice Munro’ (which expresses a weaker proposition) will trigger the quantity implicature that it is not the case that B wants all of them. Because this implicature arises (and given that B does want all of
implicature is strengthened into the implicature that the speaker knows that Hillary does trust Bill. However, in order to account for the contrast above, van Rooij and Schulz claim that, while the implicature that derives from maximizing competence (which I called Implicature II) can be easily cancelled, thus explaining the felicity of (97a), the implicature stemming from the Gricean maxims of Quality and Quantity (which I called Implicature I) is not. About the impossibility of cancelling the latter implicature, they write ‘The fact that this gives rise to an inappropriate sentence strongly suggests that one cannot cancel inferences based on these maxims [the Gricean maxims of Quality and Quantity—MI] that easily’. The problem is that we know this is not true of quantity implicatures in general, as we have argued in section 5.3.1 and is shown again here.
Munro’s books), B denies it.39 The conclusion seems to be that, even when the stronger alternative is relevant, it is possible to deny the Gricean weak implicature. Therefore, because in van Rooij and Schulz’s proposal the prejacent of an only sentence is a conversational implicature, the question why it is not cancellable like the implicature in (98b) remains unanswered by their theory.
6 CONCLUSION
39 There is an interesting question hidden in this discussion, that is the question why, if he wants all of Munro’s books, B did not say it right away, that is, why he uttered the weaker proposition first. Maybe B was just trying not to sound too greedy, so he first uttered a weak proposition, but then, knowing perfectly well that his Christmas present depends on his answer, he strengthens the answer. A detailed investigation of this issue must await another occasion.
In this paper, I have argued that only in any sentence of the form ‘only A is/are B’ does trigger a presupposition, namely the conditional presupposition that if something is B, then A is/are B. This is a ‘scalar’ presupposition. In a positive-only sentence, the prejacent follows from the assertion together with an existential scalar implicature, and is therefore cancellable. That the prejacent is not cancellable in a negative-only sentence follows from the fact that a context that satisfies the presupposition and to which the (negative) assertion is added will entail the prejacent. The survival of the prejacent when the sentence is negated, which seemed to be the strongest piece of evidence in support of the presuppositional analysis, turns out to reflect an entailment of the negative sentence. I discussed some alternative analyses such as the strong and the weak presuppositional analyses (Horn 1969 and Geurts & van der Sandt 2004, respectively) and the implicature version of Horn (1996), based on the claim that a sentence of the form ‘only A is/are B’ is equivalent to the sentence ‘all B are A’ and that the universal quantifier’s requirement that there be some B holds for only too. The idea defended in this paper that the prejacent of a positive-only sentence is a conversational implicature follows a suggestion made by McCawley (1981), and is inspired by some of the arguments in Horn (1996), but it is framed within a ‘mixed’ analysis. Unlike these implicature analyses and the one defended by van Rooij & Schulz (2005), the current proposal claims that only does trigger a presupposition, albeit one different from what is suggested by both the strong and the weak presupposition analyses. Furthermore, it accounts for the mixed behaviour of the prejacent (its survival under negation and its cancellability in a positive
(99) Only Mary can speak French, and maybe not even she can.
(100) #Only John married Sue, and maybe not even he did.
The contrast between (99) and (100) does not seem accidental. Consider the following examples. The first group consists of examples where cancellation out of context is possible without much effort.
(101) Only John can win the Boston marathon, and maybe not even he can.
(102) Only John understands Mary, and maybe not even he does.
(103) Only John is healthy, and maybe not even he is.
The second group consists of examples where cancellation out of context is not as felicitous.
(104) #Sue hired only John, and maybe not even him.
(105) #Only John is dead, and maybe not even he is.
These data suggest that modal verbs (at least ability modals, e.g. can), stative predicates (e.g. understand) and gradable predicates (e.g. healthy) allow cancellation out of context (at least for those speakers who allow cancellation to begin with), but non-stative verbs (e.g. hire) and nongradable adjectives (e.g. dead) do not. This contrast might have to do
sentence) by arguing that it is the scalar implicature of a positive-only sentence but an entailment of its negative counterpart. I discussed Horn’s observation that in order to suspend the implicature it is necessary to use a modal operator in the suspender clause and that it is not possible to deny the prejacent. I argued that this observation is correct but does not undermine the implicature analysis, once we assume a pragmatic constraint against vacuously true presuppositions. Before concluding this paper, I would like to (i) make a brief note about the cancellability of the prejacent and (ii) spend a few words on an objection raised by Craige Roberts, a defender of the strong presupposition analysis, to the proposal defended in this paper. With respect to the cancellability of the prejacent, I noticed above that there is variation among speakers with respect to whether they allow cancellation of the prejacent. Now, even for speakers who do allow cancellation of the prejacent in the examples above, not all cancellations seem felicitous to the same degree out of context. For example, it is quite easy and felicitous to cancel the prejacent out of context in (99), but not in (100).
with the ‘gradable’ nature of ability modals, stative predicates like understand and adjectives like healthy, but at this stage I do not know why the gradable nature of these predicates would affect the ease with which the prejacent is cancelled. Future research will need to establish, first, whether these facts are correct; second, what the source of this asymmetry is and third, how it might relate to the cancellation issue. Turning to Roberts’s objection, she observes that if the suspender clause precedes the only sentence, the prejacent cannot be suspended, and that this is not true of standard scalar implicatures (Roberts 2006). The pronoun she should be interpreted as referring to Mary.
While the infelicity of (106a) is consistent with the presupposition analyses, the contrast above is puzzling for a theory where the prejacent of a positive-only sentence is a scalar implicature. Now, take (106b). By the time the speaker utters the second clause (but she definitely ate some), the proposition that Mary ate all of the cookies has been eliminated from the set of possible answers (to the salient question under discussion ‘How many cookies did Mary eat?’) by the utterance of the first clause (‘I don’t know whether Mary ate all of the cookies’). In other words, when the second clause is interpreted, the stronger proposition that Mary ate all of the cookies is no longer relevant. Because it is not relevant, it is not a competing alternative to the weaker proposition that Mary ate some of the cookies and therefore, since the neo-Gricean competition does not take place, the scalar implicature does not arise and the sentence is felicitous. Now, take (106a). van Rooij and Schulz claim correctly that in an only sentence the prejacent is always a relevant proposition, which is reflected in the fact that the set of alternatives always includes the prejacent. Here is a tentative suggestion about what causes the infelicity of (106a). The first clause in (106a) eliminates the proposition that Mary can speak French from the set of alternatives, that is, it makes that proposition no longer relevant (with respect to the implicit question ‘Who can speak French?’). However, in the second clause in (106a), because of only’s association with focus and the interpretation of the latter, that same proposition is again among the relevant alternatives in C. Because the proposition that Mary can speak French is included in C (together with any other alternative of the form ‘x can speak French’ where x is a salient individual), the proposition that ‘nobody can speak French’ is a
(106) a. #I don’t know whether Mary can speak French, but definitely only sheF can.
b. I don’t know whether Mary ate all of the cookies, but she definitely ate some.
possible answer to the implicit question ‘Who can speak French?’ and, therefore, can compete with the (asserted) proposition that nobody other than Mary can speak French, according to the neo-Gricean paradigm. Since the competition is possible, it will take place and the scalar implicature will arise, contradicting the preceding assertion.40,41 A second point to consider is the following. Take the felicitous example in (107), where he and John corefer.42
(107) I don’t know whether John or anybody else can speak French, but it is definitely possible that only he can.
(108) #I don’t know whether John can speak French, but John can speak French and it is definitely possible that only John can.
In order to explain the felicity of (107), we can of course appeal to local accommodation, but if we construct examples parallel to (107) with
40 We expect that, if after having uttered the first clause in (106a), we do not make Mary relevant again by including her (or, better, the proposition that she can speak French) among the relevant alternatives, the discourse will be felicitous. The felicity of (i) might be an indication that the tentative suggestion presented above is on the right track. Of course, the contrast between (106a) and (i) is also consistent with the strong presupposition analysis.
(i) I don’t know whether Mary can speak French, but nobody else/nobody other than her can.
41 During the discussion of an earlier version of this paper, Hans Kamp noted that the fact that the suspender can follow the only sentence but cannot precede it is reminiscent of the Lewis-Sobel sequences, whose order also cannot be reversed: while the sequence of conditionals in (i) is fine, the sequence in (ii) is not. (This observation is discussed at length by von Fintel 2001, who attributes it to Irene Heim.)
(i) If John came, it would be a good party. But if John and Mary came, it would be dreadful.
(ii) If John and Mary came, it would be a dreadful party. #But if John came, it would be good.
In the case of conditionals, the generalization is that it is not possible to ‘disregard’ possibilities once they have been raised, unless this conversational move is appropriately ‘flagged’, for example by using only.
(iii) If John and Mary came, it would be a dreadful party. But if only John came, it would be good.
In other words, as von Fintel puts it, ‘more possibilities come into consideration; no shrinking is allowed’ (von Fintel 2001: 140). Kamp’s observation is intriguing, but for reasons of space, the elaboration of this idea in connection with the present proposal must await another occasion.
42 In the intended reading, the adjective possible has scope over only. In this reading, the truth of the sentence is consistent with it being epistemically possible that someone other than John can speak French. It is important to make this point because the presupposition that arises if only has scope over possible (that it is possible that John can speak French) is compatible with the speaker not knowing whether John can speak French and, hence, the felicity of (107) is consistent with the strong presupposition analysis.
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
Because possible is a hole for presuppositions, the alleged strong presupposition triggered by only, that John can speak French, is predicted to project and the sentence should be as bad as the following sentence, which clearly sounds contradictory.
88 On the Meaning of Only other presupposition triggers, we do not get an acceptable sentence, suggesting that we should be cautious when appealing to local accommodation in order to explain the felicity of cases like (107).43 (109) a. #I don’t know whether John smokes, but it is definitely possible that he will quit. b. #I don’t know whether Mary left John, but it is definitely possible that she regrets it/leaving him. c. #I don’t know whether Mary lost anything, but it is definitely possible that it is a purse that she lost.
(110) a.
I don’t know whether John can speak French, but if only he does, then we need to hire a tour guide. b. #I don’t know whether John smokes, but if he quits, he will live longer. c. #I don’t know whether Mary left John, but if she regrets it, she will not admit it.
Therefore, while it is true that (106a) raises questions for the view I am suggesting, it is also true that (107), (110a) and the contrast between these two sentences and the sentences in (109) remain a challenge for the strong presupposition analysis. Acknowledgements I would like to thank two anonymous reviewers and the editor of this journal for helpful comments and suggestions. I am very grateful for comments to David Beaver, Josh Brown, Kai von Fintel, Larry Horn, Craige Roberts and the audience at the University of Michigan Workshop in Linguistics and Philosophy 2005. All mistakes and omissions are mine.
MICHELA IPPOLITO Department of Linguistics University of Toronto 130 st. George Street Toronto, ONTARIO M5S 3H1 e-mail:
[email protected] 43 The trigger in (109c), the it-cleft, is classified in Roberts (2006) as a ‘background presupposition’, just like only. The other two types of presuppositions in Roberts’ classification are the ‘entailed presuppositions’, such as that triggered by quit, and the ‘anaphoric presuppositions’ such as those triggered by too and again.
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
Similarly, for the antecedent of conditionals, another hole for presuppositions.
REFERENCES
Abusch, D. (2002), ‘Lexical alternatives as a source of pragmatic presuppositions’. In B. Jackson (ed.), Proceedings of SALT XII. CLC Publications. Cornell University, Ithaca, NY.
Abusch, D. & M. Rooth (2004), ‘Empty-domain effects for presuppositional and non-presuppositional determiners’. In H. Kamp and B. Partee (eds.), Context Dependence in the Analysis of Linguistic Meaning. Elsevier Publishers.
Atlas, J. (1993), ‘The importance of being only: testing the neo-Gricean versus neo-entailment paradigms’. Journal of Semantics 10:301–18.
Atlas, J. D. (1996), ‘‘Only’ noun phrases, pseudo-negative generalized quantifiers, negative polarity items, and monotonicity’. Journal of Semantics 13:265–328.
Beaver, D. (2004), ‘Five only pieces’. Theoretical Linguistics 30:45–64.
Beaver, D. & B. Clark (2003), ‘Always and only: why not all focus sensitive operators are alike’. Natural Language Semantics 11:323–62.
Bonomi, A. & P. Casalegno (1993), ‘Only: association with focus in event semantics’. Natural Language Semantics 2:1–45.
Comorovski, I. (1996), Interrogative Phrases and the Syntax-Semantics Interface. Kluwer. Dordrecht.
De Jong, F. & H. Verkuyl (1985), ‘Generalizing quantifiers: the properness of their strength’. In J. van Benthem and A. ter Meulen (eds.), Generalized Quantifiers: Theory and Application. GRASS 4. Foris. Dordrecht. 21–43.
Diesing, M. (1992), Indefinites. MIT Press. Cambridge, MA.
Fitzpatrick, J. (2005), ‘The whys and how comes of presupposition and NPI licensing in questions’. In J. Alderete, C.-H. Han, and A. Kochetov (eds.), The Proceedings of WCCFL 24. Cascadilla Proceedings Project. Somerville, Massachusetts, USA.
Geurts, B. & R. Nouwen (2007), ‘At least et al.: the semantics of scalar modifiers’. Language, forthcoming.
Geurts, B. & R. van der Sandt (2004), ‘Interpreting focus’. Theoretical Linguistics 30:1–44.
Heim, I. (1983), ‘On the projection problem for presuppositions’. In D. Flickinger et al. (eds.), Proceedings of WCCFL. Volume 2. Stanford University Press, Stanford, California, USA. 114–25.
Heim, I. (1990), ‘Presupposition projection’. In R. van der Sandt (ed.), Presupposition, Lexical Meaning and Discourse Processes: Workshop Reader. University of Nijmegen.
Heim, I. (1991), ‘Articles and definiteness’. In A. v. Stechow and D. Wunderlich (eds.), Semantics. An International Handbook of Contemporary Research. De Gruyter. Berlin.
Heim, I. (1992), ‘Presupposition projection and the semantics of attitude verbs’. Journal of Semantics 9:183–221.
Heim, I. & A. Kratzer (1998), Semantics in Generative Grammar. Blackwell Publishers. Malden, MA.
Horn, L. (1969), ‘A presuppositional analysis of ‘only’ and ‘even’’. In Papers from the Fifth Regional Meeting of the Chicago Linguistic Society. CLS. Chicago, IL, USA. 98–107.
Horn, L. (1972), On the Semantic Properties of Logical Operators in English. Ph.D. thesis, UCLA. Los Angeles, California, USA.
Horn, L. (1996), ‘Exclusive company: ‘Only’ and the dynamics of vertical inference’. Journal of Semantics 13:1–40.
Horn, L. (2002), ‘Assertoric inertia and NPI licensing’. In M. Andronis, E. Debenport, A. Pycha and K. Yoshimura (eds.), CLS 38 Part Two: The Panels. Chicago Linguistic Society. Chicago, IL, USA. 55–82.
Kadmon, N. (2001), Formal Pragmatics. Blackwell. Malden, MA.
Karttunen, L. (1973a), ‘Presuppositions of compound sentences’. Linguistic Inquiry 4.2:169–93.
Karttunen, L. (1973b), STOP: Is There a Presupposition? Unpublished MS.
Karttunen, L. (1974), ‘Presupposition and linguistic context’. Theoretical Linguistics 1:181–94.
Karttunen, L. & S. Peters (1979), ‘Conventional implicature’. In C.-K. Oh and D. A. Dinneen (eds.), Syntax and Semantics 11: Presupposition. Academic Press, New York. 1–56.
Kay, P. (1990), ‘Even’. Linguistics and Philosophy 13:59–111.
Kay, P. (1992), ‘The inheritance of presuppositions’. Linguistics and Philosophy 15:333–81.
King, J. & J. Stanley (2005), ‘Semantics, pragmatics, and the role of semantic content’. In Z. Szabo (ed.), Semantics vs. Pragmatics. Oxford University Press. 111–64.
Krifka, M. (1991), ‘A compositional semantics for multiple focus constructions’. In S. Moore and A. Z. Wyner (eds.), Proceedings of Semantics and Linguistic Theory I. CLC Publications. Cornell University, Ithaca, NY, USA.
Krifka, M. (2006), ‘Association with focus phrases’. In V. Molnar and S. Winkler (eds.), The Architecture of Focus. Mouton de Gruyter. Berlin, Germany. 105–36.
Kripke, S. (1990), Presupposition and Anaphora: Remarks on the Formulation of the Projection Problem. Princeton University. Princeton, NJ, USA.
Lappin, S. & T. Reinhart (1988), ‘Presuppositional effects of strong determiners’. Linguistics 26:1021–37.
Levinson, S. (2000), Presumptive Meanings. The Theory of Generalized Conversational Implicature. MIT Press. Cambridge, Massachusetts, USA.
McCawley, J. (1972), ‘A program for logic’. In D. Davidson and G. Harman (eds.), Semantics of Natural Language. Reidel. Dordrecht. 498–544.
McCawley, J. (1981), Everything that Linguists Have Always Wanted to Know about Logic but Were Ashamed to Ask. University of Chicago Press. Chicago, IL.
Pesetsky, D. (2000), Phrasal Movement and Its Kin. MIT Press. Cambridge, MA.
Postal, P. (1971), Cross-Over Phenomena. Holt, Rinehart, and Winston.
Roberts, C. (2006), Only, Presupposition and Implicature. Unpublished MS, Ohio State University.
Rooth, M. (1985), Association with Focus. GLSA Publications. Amherst, MA.
Rooth, M. (1992), ‘A theory of focus interpretation’. Natural Language Semantics 1:75–116.
Rooth, M. (1996), ‘Focus’. In S. Lappin (ed.), The Handbook of Contemporary Semantic Theory. Blackwell. 271–97.
Rullmann, H. (1997), ‘Even, polarity, and scope’. In M. Gibson, G. Wiebe and G. Libben (eds.), Papers in Experimental and Theoretical Linguistics. University of Alberta. 40–64.
Simons, M. (2001), ‘On the conversational basis of some presuppositions’. In R. Hastings, B. Jackson, and S. Zvolensky (eds.), Proceedings of SALT 11. CLC Publications. 431–48.
Stalnaker, R. (1974), ‘Pragmatic presuppositions’. In M. Munitz and P. Unger (eds.), Semantics and Philosophy. New York University Press. New York. 197–213.
Stalnaker, R. (1979), ‘Assertion’. In P. Cole (ed.), Syntax and Semantics 9. Academic Press. New York. 315–32.
Strawson, P. (1952), Introduction to Logical Theory. Methuen. London.
van Rooij, R. & K. Schulz (2005), Only: Meaning and Implicature. Unpublished MS. University of Amsterdam.
von Fintel, K. (1994), Restrictions on Quantifier Domains. GLSA. Amherst, MA.
von Fintel, K. (1997), ‘Bare plurals, bare conditionals, and only’. Journal of Semantics 15:1–56.
von Fintel, K. (1999), ‘NPI-licensing, Strawson-entailment, and context-dependency’. Journal of Semantics 16:97–148.
von Fintel, K. (2001), ‘Counterfactuals in a dynamic context’. In M. Kenstowicz (ed.), Ken Hale: A Life in Language. MIT Press. Cambridge, MA. 123–52.
von Stechow, A. (1991), ‘Focusing and background operators’. In W. Abraham (ed.), Discourse Particles. John Benjamins. Amsterdam/Philadelphia. 37–84.
Wagner, M. (2007), ‘Association by movement: evidence from NPI-licensing’. Natural Language Semantics 14:297–324.
Wilkinson, K. (1996), ‘The scope of even’. Natural Language Semantics 4:193–215.

First version received: 16.05.2006
Second version received: 11.02.2007
Accepted: 05.04.2007