JOURNAL OF SEMANTICS
ISSN 0167-5133
VOLUME 25 NUMBER 4 NOVEMBER 2008
www.jos.oxfordjournals.org
Oxford University Press

CONTENTS
CHUNG-HYE HAN AND NANCY HEDBERG  Syntax and Semantics of It-Clefts: A Tree Adjoining Grammar Analysis  345
PAULA RUBIO-FERNÁNDEZ  Concept Narrowing: The Role of Context-independent Information  381
ANNA SZABOLCSI, LEWIS BOTT AND BRIAN MCELREE  The Effect of Negative Polarity Items on Inference Verification  411
Editor's Note  451

FORTHCOMING ARTICLES
ARIEL COHEN: No Alternative to Alternatives
YAEL GREENBERG: Presupposition Accommodation and Informativity Considerations with Aspectual still
JOURNAL OF SEMANTICS
AN INTERNATIONAL JOURNAL FOR THE INTERDISCIPLINARY STUDY OF THE SEMANTICS OF NATURAL LANGUAGE

MANAGING EDITOR: BART GEURTS (University of Nijmegen)

ASSOCIATE EDITORS: DAVID BEAVER (Stanford University), REGINE ECKARDT (Universität Göttingen), IRA NOVECK (Institut des Sciences Cognitives, Lyon), PAUL PORTNER (Georgetown University, Washington), PHILIPPE SCHLENKER (Institut Jean-Nicod, Paris), YAEL SHARVIT (University of Connecticut, Storrs), ANNA SZABOLCSI (New York University)

EDITORIAL BOARD: NICHOLAS ASHER (University of Texas, Austin), CHRIS BARKER (University of California at San Diego), JOHAN BOS (University of Edinburgh), PETER BOSCH (University of Osnabrück), RICHARD BREHENY (University College London), MIRIAM BUTT (University of Konstanz), GREG CARLSON (University of Rochester), ANN COPESTAKE (Stanford University), HENRIËTTE DE SWART (Utrecht University), PAUL DEKKER (University of Amsterdam), KURT EBERLE (Lingenio Heidelberg), MARKUS EGG (Universität des Saarlandes), ULRIKE HAAS-SPOHN (University of Konstanz), LAURENCE R. HORN (Yale University), HANS KAMP (University of Stuttgart), GRAHAM KATZ (University of Osnabrück), TIBOR KISS (Ruhr University, Bochum), JONAS KUHN (University of Texas, Austin), CLAUDIA MAIENBORN (Humboldt University, Berlin), JULIEN MUSOLINO (Rutgers University), FRANCIS JEFFRY PELLETIER (University of Alberta), CHRISTOPHER POTTS (University of Massachusetts, Amherst), MARK STEEDMAN (University of Edinburgh), ZOLTAN GENDLER SZABO (Cornell University), KEES VAN DEEMTER (University of Aberdeen), ROB VAN DER SANDT (University of Nijmegen), ROBERT VAN ROOIJ (University of Amsterdam), KAI VON FINTEL (Massachusetts Institute of Technology), ARNIM VON STECHOW (University of Tübingen), BONNIE WEBBER (University of Edinburgh), HENK ZEEVAT (University of Amsterdam), THOMAS EDE ZIMMERMANN (University of Frankfurt)

EDITORIAL CONTACT: [email protected]

© Oxford University Press 2008
All rights reserved; no part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise without prior written permission of the Publishers, or a licence permitting restricted copying issued in the UK by the Copyright Licensing Agency Ltd, 90 Tottenham Court Road, London W1P 9HE, or in the USA by the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923.
Typeset by TnQ Books and Journals Pvt. Ltd., Chennai, India. Printed by Bell and Bain Ltd, Glasgow, UK. For subscription information please see back of journal.

Scope of this Journal
The Journal of Semantics publishes articles, notes, discussions, and book reviews in the area of academic research into the semantics of natural language. It is explicitly interdisciplinary, in that it aims at an integration of philosophical, psychological, and linguistic semantics as well as semantic work done in logic, artificial intelligence, and anthropology. Contributions must be of good quality (to be judged by at least two referees) and must report original research relating to questions of comprehension and interpretation of sentences, texts, or discourse in natural language. The editors welcome not only papers that cross traditional discipline boundaries, but also more specialized contributions, provided they are accessible to and interesting for a general readership in the field of natural language semantics. Empirical relevance, sound theoretical foundation, and formal as well as methodological correctness by currently accepted academic standards are the central criteria of acceptance for publication. Contributions published in the Journal are also required to link up with currently relevant discussions in the field of natural language semantics.

Information for Authors
Papers for publication should be submitted to the Managing Editor by email as a PDF or PS file attachment. If this is not feasible, please contact the Managing Editor. Receipt of a submission is confirmed by email (when a paper has more than one author, correspondence is sent to the first author, who is assumed to handle all correspondence unless we are instructed otherwise), and the paper is reviewed by two members of the editorial board or external experts chosen by the editors. The reviewers remain anonymous. An editorial decision is normally reached within 2-3 months after submission. Papers are accepted for review only on the condition that they have not, in whole or in part, been published elsewhere, are not under review elsewhere, and have not been accepted for publication elsewhere. In case of any doubt, authors must notify the editor of the relevant circumstances at the time of submission. It is understood that authors accept the copyright conditions stated in the journal if the paper is accepted for publication.
The style requirements of the Journal of Semantics can be found at www.jos.oxfordjournals.org under "Instructions to Authors"; they are binding for the final version to be prepared by the author when the paper is accepted for publication.

LaTeX submission
Please use the Journal class file (http://www3.oup.co.uk/semant/instauth/semant.cls). A tex file (http://www3.oup.co.uk/semant/instauth/guide.tex) is available on how to use the .cls file. Authors who are planning to send source files by email should also include a PostScript or PDF version of their paper. Please follow all the instructions to authors detailed above and note that the text width should be set to 28pc and the text height to 41\baselineskip. Electronic figures can only be used in ps or eps format.
SUBSCRIPTIONS
A subscription to Journal of Semantics comprises 4 issues. All prices include postage, and for subscribers outside the UK delivery is by Standard Air. Journal of Semantics Advance Access contains papers that have been finalised, but have not yet been included within the issue. Advance Access is updated monthly.

Annual Subscription Rate (Volume 25, 4 issues, 2008)
Institutional: Print edition and site-wide online access: £175/$350/€263; Print edition only: £166/$332/€249; Site-wide online access only: £166/$332/€249.
Personal: Print edition and individual online access: £65/$130/€98.
Please note: £ Sterling rates apply in Europe, US$ elsewhere.

There may be other subscription rates available; for a complete listing please visit www.jos.oxfordjournals.org/subscriptions. Full prepayment, in the correct currency, is required for all orders. Orders are regarded as firm and payments are not refundable. Subscriptions are accepted and entered on a complete volume basis. Claims cannot be considered more than FOUR months after publication or date of order, whichever is later. All subscriptions in Canada are subject to GST. Subscriptions in the EU may be subject to European VAT. If registered, please supply details to avoid unnecessary charges. For subscriptions that include online versions, a proportion of the subscription price may be subject to UK VAT. Personal rate subscriptions are only available if payment is made by personal cheque or credit card and delivery is to a private address.

The current year and two previous years' issues are available from Oxford University Press. Previous volumes can be obtained from the Periodicals Service Company, 11 Main Street, Germantown, NY 12526, USA. Email:
[email protected]. Tel: +1 (518) 537 4700. Fax: +1 (518) 537 5899. For further information, please contact: Journals Customer Service Department, Oxford University Press, Great Clarendon Street, Oxford OX2 6DP, UK. Email:
[email protected]. Tel (and answerphone outside normal working hours): +44 (0)1865 353907. Fax: + 44 (0)1865 353485. In the US, please contact: Journals Customer Service Department, Oxford University Press, 2001 Evans Road, Cary, NC 27513, USA. Email:
[email protected]. Tel (and answerphone outside normal working hours): 800 852 7323 (toll-free in USA/Canada). Fax: 919 677 1714. In Japan, please contact: Journals Customer Services, Oxford University Press, 1-1-17-5F, Mukogaoka, Bunkyo-ku, Tokyo, 113-0023, Japan. Email:
[email protected]. Tel: (03) 3813 1461. Fax: (03) 3818 1522. Methods of payment. Payment should be made: by cheque (to Oxford University Press, Cashiers Office, Great Clarendon Street, Oxford, OX2 6DP, UK); by bank transfer [to Barclays Bank Plc, Oxford Office, Oxford (bank sort code 20-65-18) (UK);
overseas only: Swift code BARC GB22 (GB£ Sterling Account no. 70299332, IBAN GB89BARC20651870299332; US$ Dollars Account no. 66014600, IBAN GB27BARC20651866014600; Euro (€) Account no. 78923655, IBAN GB16BARC20651878923655)]; or by credit card (Mastercard, Visa, Switch or American Express).

Journal of Semantics (ISSN 0167-5133) is published quarterly (in February, May, August and November) by Oxford University Press, Oxford, UK. Annual subscription price is £175/$350/€263. Journal of Semantics is distributed by Mercury International, 365 Blair Road, Avenel, NJ 07001, USA. Periodicals postage paid at Rahway, NJ and at additional entry points. US Postmaster: send address changes to Journal of Semantics (ISSN 0167-5133), c/o Mercury International, 365 Blair Road, Avenel, NJ 07001, USA.

Abstracting and Indexing
Annual Bibliography of English Language and Literature (ABEL), INSPEC, International Bibliography of Sociology, Linguistics Abstracts, Linguistics and Language Behaviour Abstracts (LLBA), MLA: International Bibliography of Books and Articles on Modern Languages and Literature, Periodicals Contents Index, Philosopher's Index, Social Planning Policy and Development Abstracts, Bibliographie Linguistique/Linguistic Bibliography, and BLonline.

Permissions
For information on how to request permissions to reproduce articles/information from this journal, please visit www.oxfordjournals.org/jnls/permissions.

Advertising
Inquiries about advertising should be sent to Helen Pearson, Oxford Journals Advertising, PO Box 347, Abingdon OX14 1GJ, UK. Email:
[email protected]. Tel: +44 (0)1235 201904. Fax: +44 (0)8704 296864. Disclaimer Statements of fact and opinion in the articles in Journal of Semantics are those of the respective authors and contributors and not of Journal of Semantics or Oxford University Press. Neither Oxford University Press nor Journal of Semantics make any representation, express or implied, in respect of the accuracy of the material in this journal and cannot accept any legal responsibility or liability for any errors or omissions that may be made. The reader should make his/her own evaluation as to the appropriateness or otherwise of any experimental technique described.
JOURNAL OF SEMANTICS Volume 25 Number 4
CONTENTS
CHUNG-HYE HAN AND NANCY HEDBERG  Syntax and Semantics of It-Clefts: A Tree Adjoining Grammar Analysis  345
PAULA RUBIO-FERNÁNDEZ  Concept Narrowing: The Role of Context-independent Information  381
ANNA SZABOLCSI, LEWIS BOTT AND BRIAN MCELREE  The Effect of Negative Polarity Items on Inference Verification  411
Editor's Note  451

Please visit the journal's web site at www.jos.oxfordjournals.org
Journal of Semantics 25: 345–380 doi:10.1093/jos/ffn007 Advance Access publication August 20, 2008
Syntax and Semantics of It-Clefts: A Tree Adjoining Grammar Analysis
CHUNG-HYE HAN AND NANCY HEDBERG
Simon Fraser University
Abstract

In this paper, we examine two main approaches to the syntax and semantics of it-clefts as in 'It was Ohno who won': an expletive approach where the cleft pronoun is an expletive and the cleft clause bears a direct syntactic or semantic relation to the clefted constituent, and a discontinuous constituent approach where the cleft pronoun has semantic content and the cleft clause bears a direct syntactic or semantic relation to the cleft pronoun. We argue for an analysis using Tree Adjoining Grammar (TAG) that captures the best of both approaches. We use Tree-Local Multi-Component Tree Adjoining Grammar to propose a syntax of it-clefts and Synchronous Tree Adjoining Grammar (STAG) to define a compositional semantics on the proposed syntax. It will be shown that the distinction TAG makes between the derivation tree and the derived tree, the extended domain of locality characterizing TAG and the direct syntax–semantics mapping characterizing STAG allow for a simple and straightforward account of the syntax and semantics of it-clefts, capturing the insights and arguments of both the expletive and the discontinuous constituent approaches. Our analysis reduces the syntax and semantics of it-clefts to copular sentences containing definite description subjects, such as 'The person that won is Ohno'. We show that this is a welcome result, as evidenced by the syntactic and semantic similarities between it-clefts and the corresponding copular sentences.
1 INTRODUCTION

The extant literature on the syntax of it-clefts, as in (1), can be classified into two main approaches. First, the cleft pronoun it is an expletive, and the cleft clause bears a direct syntactic or semantic relation to the clefted constituent, such as one of predication (Jespersen 1937; Chomsky 1977; Williams 1980; Delahunty 1982; Rochemont 1986; Heggie 1988; Delin 1989; É. Kiss 1998). Second, the cleft clause bears a direct syntactic or semantic relation to the cleft pronoun and is spelled out after the clefted constituent through extraposition or by forming a discontinuous constituent with the cleft pronoun (Jespersen 1927; Akmajian 1970b; Emonds 1976; Gundel 1977; Wirth 1978; Hedberg 1990, 2000; Percus 1997). Under this second approach, the cleft pronoun is not necessarily expletive but rather has a semantic function such as that of a definite article.

(1) It was OHNO [who won].
    cleft pronoun + copula + clefted constituent + cleft clause
In this paper, we argue for an analysis using Tree Adjoining Grammar (TAG) that captures the best of both traditional analyses by making use of the distinction in TAG between the derivation tree on which syntactic dependencies between elementary objects and compositional semantics are defined, and the derived tree on which aspects of surface constituency are defined. An illustration of the derivation tree and derived tree in TAG is given in section 3.1. In our analysis, as in the expletive approach, at the level of surface syntax (the derived tree), the clefted constituent and cleft clause form a syntactic constituent. As in the discontinuous constituent approach, however, at the level of syntactic dependencies (the derivation tree), the cleft pronoun and the cleft clause form a syntactic unit, and a semantic unit as a definite description. This aspect of our analysis reduces the syntax and semantics of it-clefts to copular sentences containing definite description subjects. We show that this reduction is supported by the fact that it-clefts and the corresponding copular sentences pattern alike both syntactically and semantically. In particular, we use Tree-Local Multi-Component Tree Adjoining Grammar (MC-TAG) to propose a syntax of it-clefts and Synchronous Tree Adjoining Grammar (STAG) to define a compositional semantics on the proposed syntax. It will be shown that the distinction TAG makes between the derivation tree and the derived tree, the extended domain of locality characterizing TAG and the direct syntax–semantics mapping characterizing STAG allow for a simple and straightforward account of the syntax and semantics of it-clefts, capturing the insights and arguments of both the expletive and the discontinuous constituent approaches. The paper is organized as follows. In section 2, we present arguments supporting the discontinuous constituent analysis as well as some arguments supporting the expletive analysis. We also discuss connectivity effects in it-clefts and parallel effects in copular sentences instantiated by binding and agreement. In section 3, we introduce the basics of TAG for doing natural language syntax and present our TAG analysis of the syntax of it-clefts. In section 4, we introduce STAG and show how compositional semantics is done using STAG, and present our analysis of the semantics of it-clefts. In section 5, we show how our TAG analysis can account for the connectivity effects in it-clefts instantiated by binding and agreement.
2 THE TENSION BETWEEN THE EXPLETIVE AND THE DISCONTINUOUS CONSTITUENT ANALYSES
In this section, we review five main syntactic and semantic properties of it-clefts: semantic content of the cleft pronoun, internal structure of the cleft clause, presence of existential and exhaustive presuppositions, presence of equative and predicational readings, and connectivity. For each property, we discuss how the expletive analysis and the discontinuous constituent analysis fare. The arguments presented in this section are taken from the existing literature on it-clefts.

First, it has been shown in Hedberg (1990, 2000) that the cleft pronoun can be replaced with this or that, as in (2), depending on the discourse contextual interpretation of the cleft clause. The fact that the choice of the cleft pronoun is subject to pragmatic constraints indicates that the cleft pronoun is not an expletive element.

(2) a. This is not Iowa we're talking about. (Hedberg 2000, ex. 17)
    b. That's the French flag you see flying over there. (Hedberg 2000, ex. 20)

In (2), the proximal demonstrative pronoun is selected when the content of the cleft clause indicates that the referent of the clefted constituent is close to the speaker, and the distal demonstrative is selected when the content of the cleft clause indicates that the referent is far from the speaker. Reversing the cleft pronouns would lead to infelicity. The discontinuous constituent analysis allows the cleft pronoun to be treated as having the semantic content of a determiner. Thus, we can view the cleft pronoun and cleft clause in (2) as working together to function as a demonstrative description as in (3).

(3) a. This [place] we're talking about is not Iowa.
    b. That [thing] you see flying over there is the French flag.

Second, the cleft clause has the internal structure of a restrictive relative clause. This is supported by the fact that the initial element in the cleft clause may be realized either as a wh-word (1) or as that (4a), or it may be absent altogether when the gap is not in the subject position (2, 4b). It may even be in the form of a genitive wh-word as in (4c).

(4) a. It was Ohno that won.
    b. It was Ohno Ahn beat.
    c. It was Ohno whose Dad cheered.

The cleft clause, however, does not relate to the clefted constituent in the way that a restrictive relative clause relates to its head noun, as first noted in Jespersen (1927). This is because the clefted constituent can be a proper noun, unlike a head noun modified by a restrictive relative clause, as illustrated in (5). Many expletive analyses (e.g. Delahunty 1982; Rochemont 1986; Heggie 1988) thus do not consider the cleft clause to have the internal structure of a restrictive relative clause. The discontinuous constituent analysis, on the other hand, allows the cleft clause to be treated as such, as argued for in Hedberg (1990), because it assumes that the relative clause forms a constituent with the cleft pronoun.

(5) *Ohno that won is an American.
Even so, as pointed out first in Delahunty (1982), there is some syntactic evidence that the clefted constituent and the cleft clause do form a surface syntactic constituent. The examples in (6), from Hedberg (2000), show that the two together can be deleted as a unit, as in (6a), and coordinated as a unit, as in (6b).

(6) a. I said it should have been [Bill who negotiated the new contract], and it should have been.
    b. It must have been [Fred that kissed Mary] but [Bill that left with her].

It will be shown in section 3.2 that our analysis resolves this tension between the discontinuous constituent analysis and the expletive analysis by making use of TAG's distinction between the derivation tree, on which compositional semantics and syntactic dependencies between elementary objects are defined, and the derived tree, on which surface syntactic relations are defined. On our analysis, the clefted constituent and the cleft clause form a constituent in the derived tree, and the cleft pronoun and the cleft clause form a syntactic unit in the derivation tree.

Third, it-clefts pattern with copular sentences containing definite description subjects syntactically and semantically. Semantically, it-clefts have existential and exhaustive presuppositions, just as definite descriptions do, as pointed out in Percus (1997) and Hedberg (2000). The inference in (7c) associated with (7a) survives in the negative counterpart in (7b). This is exactly the way the presupposition associated with the definite description the king of France behaves: the presupposition spelled out in (8c) survives in both the affirmative (8a) and the negative counterpart in (8b).

(7) a. It was Ohno who won.
    b. It was not Ohno who won.
    c. Someone won, and only one person won.

(8) a. The king of France is bald.
    b. The king of France is not bald.
    c. There is one and only one king of France.

Both Percus and Hedberg argue that this parallelism between definite descriptions and it-clefts can be accounted for if the cleft pronoun and the cleft clause form a semantic unit, with it playing the role of the definite article and the cleft clause the descriptive component. What this translates to syntactically is that the cleft clause is a restrictive relative clause which is situated at the end of the sentence, forming a discontinuous constituent with the cleft pronoun. On this view, the syntax and semantics of it-clefts reduce to that of copular sentences with definite description subjects.

Fourth, it has been observed that it-clefts can have equative and predicational interpretations (Ball 1977; DeClerck 1988; Hedberg 1990, 2000), both of which are readings attested in simple copular sentences, as shown in (9):

(9) a. The teacher is Sue Johnson.
    b. The teacher is a woman.

This observation follows under the discontinuous constituent analysis, as it-clefts there reduce to ordinary copular sentences, unlike some expletive analyses where the copula is treated as a focus marker (É. Kiss 1998). For instance, (7a) (repeated as (10a)) can be paraphrased as (10b), and corresponds to a typical equative sentence. And (11a) can be paraphrased as (11b), and corresponds to a typical predicational sentence. According to the analysis we will present in section 4, (10a) will be assigned the semantic representation in (10c) and (11a) will be assigned the semantic representation in (11c).

(10) a. It was Ohno who won.
     b. The one who won was Ohno.
     c. THEz [won(z)] [z = Ohno]

(11) a. It was a kid who beat John.
     b. The one who beat John was a kid.
     c. THEz [beat(z, John)] [kid(z)]
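The existential and exhaustive presuppositions discussed above can be made explicit with a rough gloss of the THE quantifier in (10c) and (11c). The schema below is our own illustration of the intended reading, not the formal definition given later in section 4:

% Rough gloss (ours): THEz [R(z)] [S(z)] presupposes a unique R-individual and asserts S of it.
\[
\mathrm{THE}z\,[R(z)]\,[S(z)]:\quad
\text{presupposes } \exists z\,[R(z) \wedge \forall w\,(R(w) \rightarrow w = z)],\quad
\text{asserts } S \text{ of that unique } z.
\]

For (10c), this yields the presupposition that someone won and only one person won (cf. (7c)), and the assertion that this person is Ohno; for (11c), the presupposition that exactly one person beat John and the assertion that this person is a kid.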
Fifth, Percus (1997) points out that it-clefts pattern with copular sentences containing definite description subjects with regard to SELF-anaphor binding and negative polarity item (NPI) licensing. In the absence of c-command, a SELF-anaphor in the clefted constituent position can be bound by an antecedent inside the cleft clause, as shown in (12a). Also a pronoun in the clefted constituent position cannot be bound by an antecedent inside the cleft clause, as shown in (13a). Copular sentences with definite description subjects exhibit the same pattern, as in (12b) and (13b). An NPI can occur in the clefted constituent position, licensed by a matrix negative element, as shown in (14a), but it is not licensed by a negation in the cleft clause, as in (15a). This pattern of NPI licensing is attested in copular sentences, as shown in (14b) and (15b).

(12) a. It was himselfi who Johni nominated.
     b. The one that Johni nominated was himselfi.

(13) a. *It was himi who Johni nominated.
     b. *The one that Johni nominated was himi.

(14) a. It isn't anyone I know that John saw.
     b. The one that John saw isn't anyone I know.

(15) a. *It is anyone I know that John didn't see.
     b. *The one that John didn't see is anyone I know.

Since it-clefts and copular sentences with definite description subjects exhibit the same pattern of binding and NPI licensing, a uniform explanation for the two cases can be sought if the cleft pronoun and the cleft clause together form a definite description.1 The NPI facts are not difficult to explain, as the NPI in (14) is c-commanded by the negative element, and the NPI in (15) is not c-commanded by the negative element. However, the SELF-anaphor in (12) and the pronoun in (13) are at first sight mysterious under the discontinuous constituent analysis. This is an example of connectivity, whereby the clefted constituent appears to behave as it would if it were generated inside the cleft clause, thus lending support for the expletive analysis. In section 5, we present a solution to this problem by incorporating the Binding Conditions of Reinhart & Reuland (1993) into our TAG analysis, and also arguing that the SELF-anaphor in (12) is a discourse anaphor of focus.

1 Percus shows that wh-clefts differ from both it-clefts and copular sentences with definite description subjects in that only in the former can post-copular NPIs be licensed by embedded negation. See the examples in (15) and (i). The grammaticality of (i), as opposed to the ungrammaticality of (15), shows that it-clefts should not be treated as deriving from wh-clefts, as was argued, for example, in Akmajian (1970b).

(i) What John didn't see was anything I might recognize.
Agreement facts constitute another example of connectivity, in that when the cleft clause has a subject gap, the verb in the cleft clause agrees in number and person with the clefted constituent. Note also that in equative clefts the copula agrees with the singular cleft pronoun and not with a plural clefted constituent. These facts are shown in (16).

(16) a. It is John and Mary that like Pete.
     b. *It is John and Mary that likes Pete.
     c. *It are John and Mary that like Pete.

The agreement connectivity between the clefted constituent and the cleft clause favours expletive analyses that analyse the clefted constituent as adjoined to or extracted from the cleft clause. Interestingly, as first pointed out in Ball (1977), in predicational clefts, a plural clefted constituent triggers a plural cleft pronoun and the copula agrees with this plural cleft pronoun, while the verb in the cleft clause again agrees with the clefted constituent, as shown in (17).

(17) a. They're just fanatics who are holding him.
     b. These are students who are rioting.
     c. Those are kids who beat John.

This difference in cleft pronoun choice between equative and predicational clefts with plural clefted constituents shows that the distinction is a real one and emphasizes the parallelism between it-clefts and ordinary copular sentences, which also exhibit the distinction, as shown above in (9).2 It would be difficult for an expletive analysis that assumes that the copula as well as the cleft pronoun is semantically inert to account for the distinction between predicational and equative it-clefts. In section 5, we use agreement features and feature unification in TAG to account for the connectivity in agreement and the difference in agreement behaviour between equative and predicational it-clefts, again showing that our TAG analysis can capture the best of both the discontinuous constituent analysis and the expletive analysis.

2 An anonymous reviewer suggests that the indefinite plural clefted constituent examples in (17) could also be produced with a singular pronoun and copula. While we agree that this might be possible, we have the strong intuition that such examples are equative in nature. Thus, in (i), it is no longer the case that the property of being fanatics is being predicated of a set of people independently identified as those who are holding him. Instead, the question of who is holding him is being answered by identifying these people as a group of fanatics.

(i) It's just fanatics who are holding him.

3 SYNTAX OF IT-CLEFTS

3.1 Introduction to TAG syntax
TAG is a tree-rewriting system, first formally defined in Joshi et al. (1975). In TAG for natural language, the elementary objects are lexicalized trees called elementary trees that represent extended projections of a lexical anchor. These trees are minimal in that all and only the syntactic/semantic arguments of the lexical anchor are encapsulated and all recursion is factored away. The elementary trees in TAG are therefore said to possess an extended domain of locality. Frank (2002) formulates the extended projection property of elementary trees as a Condition on Elementary Tree Minimality (CETM) and states that 'the syntactic heads in an elementary tree and their projections must form an extended projection of a single lexical head' (p. 54). Following Grimshaw (1991), Frank takes extended projections of a lexical head to include the projections of all functional heads that embed it. This means that an elementary tree anchoring a verb can project to verb phrase (VP) but also to tense phrase (TP) and complementizer phrase (CP), and an elementary tree anchoring a noun can project to noun phrase (NP) but also to determiner phrase (DP) and prepositional phrase. Further, the fundamental thesis in TAG for natural language is that 'every syntactic dependency is expressed locally within a single elementary tree' (Frank 2002: 22). This allows for a syntactic dependency created by movement to occur within an elementary tree, but not across elementary trees.

The trees in Figure 1 are all examples of well-formed elementary trees. (asaw) is an elementary tree because it is an extended projection of the lexical predicate saw and has argument slots for the subject and the object marked by the downward arrow (↓). Moreover, the movement of the subject DP from [Spec,VP] to [Spec,TP], following the VP-internal subject hypothesis (Koopman & Sportiche 1991), is an operation internal to the elementary tree, and therefore represents a syntactic dependency localized to the elementary tree. (aJohn) and (aa_movie) are valid elementary trees because these DP trees each contain a single lexical head, John for (aJohn) and movie for (aa_movie), that can form an extended projection with a DP, in line with the DP hypothesis (Abney 1987).3

3 In principle, trees such as (aa_movie) could be broken down into trees for determiners and trees for NPs, as in (i). Under this approach, an NP tree anchoring a noun would substitute into a DP tree anchoring a determiner. But strictly speaking, this violates Frank's (2002) formulation of CETM, as the DP tree in (i) is a projection of a functional head (D), not a lexical head.
Figure 1 Initial trees in TAG.
Figure 2 Auxiliary trees in TAG.
4 By convention, names of initial trees are prefixed with a, and names of auxiliary trees are prefixed with b.
Elementary trees are of two types: initial trees and auxiliary trees. A derivation in TAG starts with initial trees such as trees for simple clauses and nominal phrases. The elementary trees in Figure 1 are examples of initial trees. Auxiliary trees are used to introduce recursive structures, for example, adjuncts or other recursive portions of the grammar. Auxiliary trees have a special non-terminal node called the foot node (marked with an asterisk) among the leaf nodes, which has the same label as the root node of the tree. The auxiliary trees in Figure 2 are well-formed elementary trees, as CETM requires only that syntactic heads and their projections form an extended projection, rendering the presence of the VP root node in (breluctantly) and the NP root node in (bscary) consistent with CETM. Further, following Frank (2002), we can count VP* in (breluctantly) and NP* in (bscary) as arguments of the lexical anchor, as the process of theta-identification (Higginbotham 1985) obtains between them and the lexical anchor.4
Figure 3 Substitution in TAG.
Figure 4 Adjoining in TAG.
These elementary trees are combined through two derivational operations: substitution and adjoining. In the substitution operation, the root node on an initial tree is merged into a matching non-terminal leaf node marked for substitution (↓) in another tree. This is illustrated in Figure 3. In an adjoining operation, an auxiliary tree is grafted onto a non-terminal node in another elementary tree that matches the root and foot nodes of the auxiliary tree. For example, Figure 4 illustrates (breluctantly) adjoining to the VP node in (asaw), and (bscary) adjoining to the NP node in (aa_movie), which in turn substitutes into (asaw).
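The two operations can be sketched concretely in code. The following is a minimal illustration, not the authors' implementation: the tree shapes are simplified stand-ins for the elementary trees in Figures 1 and 2, the operations mirror Figures 3 and 4, and the data-structure and function names are ours.

from dataclasses import dataclass, field
from typing import List, Optional
import copy

@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    subst: bool = False   # substitution site, marked with a downward arrow in the paper
    foot: bool = False    # foot node of an auxiliary tree, marked with an asterisk

def find(node: Node, pred) -> Optional[Node]:
    # Return the first node (preorder) satisfying pred, or None.
    if pred(node):
        return node
    for child in node.children:
        hit = find(child, pred)
        if hit is not None:
            return hit
    return None

def substitute(target: Node, label: str, initial: Node) -> None:
    # Merge a copy of an initial tree into a matching substitution site of the target tree.
    site = find(target, lambda n: n.subst and n.label == label)
    assert site is not None, f"no substitution site {label}"
    new = copy.deepcopy(initial)
    site.label, site.children, site.subst = new.label, new.children, False

def adjoin(target: Node, label: str, aux: Node) -> None:
    # Graft a copy of an auxiliary tree onto a matching node of the target tree;
    # the material originally under that node ends up below the foot node.
    site = find(target, lambda n: n.label == label and not n.subst)
    assert site is not None, f"no node {label} to adjoin to"
    new = copy.deepcopy(aux)
    foot = find(new, lambda n: n.foot)
    foot.children, foot.foot = site.children, False
    site.children = new.children

# Toy stand-ins for the elementary trees (flattened relative to the figures).
a_saw = Node("TP", [Node("DP", subst=True),
                    Node("VP", [Node("V:saw"), Node("DP", subst=True)])])
a_john = Node("DP", [Node("D:John")])
a_a_movie = Node("DP", [Node("D:a"), Node("NP", [Node("N:movie")])])
b_scary = Node("NP", [Node("A:scary"), Node("NP", foot=True)])

derivation = []  # flat record of the derivation history: (daughter, parent, site)
adjoin(a_a_movie, "NP", b_scary);    derivation.append(("b_scary", "a_a_movie", "NP"))
substitute(a_saw, "DP", a_john);     derivation.append(("a_john", "a_saw", "DP_subject"))
substitute(a_saw, "DP", a_a_movie);  derivation.append(("a_a_movie", "a_saw", "DP_object"))
# a_saw is now the derived tree for 'John saw a scary movie';
# `derivation` is a flat encoding of the derivation tree.

Running this leaves a_saw as the derived tree and `derivation` as a record of which elementary tree composed into which, mirroring the distinction between derived tree and derivation tree drawn below.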
Figure 5 Derived tree and derivation tree in TAG.
5 The location in the parent elementary tree is usually denoted by the Gorn tree address. Here, we use node labels such as DPs or VPs for the sake of simplicity.
TAG derivation produces two structures: a derived tree and a derivation tree. The derived tree is the conventional phrase structure tree and represents surface constituency. For instance, combining the elementary trees in Figures 1 and 2 through substitution and adjoining as in Figures 3 and 4 generates the derived tree in Figure 5 (left). The derivation tree represents the history of composition of the elementary trees and the dependencies between the elementary trees. In a derivation tree, each node is an elementary tree, and the children of a node N represent the trees which are adjoined or substituted into the elementary tree represented by N. The link connecting a pair of nodes is annotated with the location in the parent elementary tree where adjoining or substitution has taken place.5 An example of a derivation tree is given in Figure 5 (right). Figure 5 (right) records the history of composition of the elementary trees to produce the derived tree in Figure 5 (left): (bscary) adjoins to (aa_movie) at NP, (aJohn) and (aa_movie) substitute into (asaw) at DPi and DP, respectively, and (breluctantly) adjoins to (asaw) at VP.

As first shown by Joshi (1985) and Kroch & Joshi (1985), and explored further in Frank (2002), the properties of TAG permit us to provide computationally feasible accounts for various phenomena in natural language syntax. For example, TAG's extended domain of locality and its factoring of recursion from elementary trees lead, among other things, to a localization of unbounded dependencies. TAG is a mildly context-sensitive grammar (Joshi et al. 1991), formally sitting between context-free and context-sensitive grammar, and is able to generate unbounded cross-serial dependencies such as those that occur between the arguments and verbs in Dutch and Swiss German in a natural way. In section 3.2, we show that TAG's extended domain of locality allows us to provide an elegant syntactic account of the discontinuous constituency of the cleft pronoun and the cleft clause without adopting a movement-based account of the extraposition of the cleft clause. At the same time, TAG's distinction between the derivation and derived trees allows us to account for the surface syntactic constituency of the clefted constituent and the cleft clause.

3.2 Our TAG analysis of the syntax of it-clefts

Inspired by work of Kroch & Joshi (1987) and Abeillé (1994) on discontinuous constituents resulting from extraposition, we propose an analysis for the syntax of it-clefts using tree-local MC-TAG, an extension of TAG. In tree-local MC-TAG, the basic objects of derivation are not only individual elementary trees but also (possibly singleton) sets of such trees, called multi-component sets. All the trees in a multi-component set are restricted to adjoin or substitute simultaneously into a single elementary tree at each step in a derivation. With this restriction, MC-TAG is shown to be identical to basic TAG in terms of the strings and structural descriptions it generates: that is, MC-TAG has the same weak and strong generative capacity as basic TAG (Weir 1988). In addition to extraposition, MC-TAG has been used in the analyses of West Germanic verb raising (Kroch & Santorini 1991), Romance clitic climbing (Bleam 2000) and extraction of an object wh-phrase from a wh-island (Kroch 1989; Frank 2002).

The trees in a multi-component set can be thought of as a single elementary tree decomposed into two or more trees. As these trees substitute or adjoin into different positions in another elementary tree, the effect of discontinuous constituency can be produced. Further, the locality of the syntactic dependencies that exist between these trees is maintained, as they are restricted to compose simultaneously with a single elementary tree, contributing to the restricted generative capacity of MC-TAG.

We propose that the elementary trees for the cleft pronoun and the cleft clause in the derivation of it-clefts such as (10a) (repeated below as
(18)) and (11a) (repeated below as (19)) form a multi-component set, as in {(ait), (bwho_won)} and {(ait), (bwho_beat)} in Figure 6.

(18) It was Ohno who won.
(19) It was a kid who beat John.
Figure 6 Multi-component sets of cleft pronoun and cleft clause.

6 Strictly speaking, the elementary trees representing the cleft clause in the two multi-component sets in Figure 6 should have a substitution site in [Spec,CP] to be substituted in by a separate DP elementary tree anchoring a relative pronoun. Here, to simplify the derivation, we have already substituted in the relative pronoun DP tree.
We capture the intuition that the cleft pronoun and the cleft clause form a syntactic unit by placing the elementary trees for them in a single multi-component set. And as these are two separate trees, they are able to substitute and adjoin onto two different places in a single elementary tree, producing the effect of discontinuity. The first component of each set introduces a determiner and the second component of each set introduces a relative clause anchoring the lexical predicate.6 The multi-component set can be thought of as a DP tree decomposed into two parts: a functional projection of a determiner and a lexical domain on which the determiner operates. That is, the two parts are comparable to a projection of D and a projection of N in a simple DP tree such as (aa_movie) in Figure 1: like a in (aa_movie), it in (ait) is a determiner that heads a DP, and like the NP (movie) in (aa_movie), (bwho_won) and (bwho_beat) include the lexical domains on which the determiner operates. Moreover, just like simple DP trees such as (aa_movie), the two components in the sets {(ait), (bwho_won)} and {(ait), (bwho_beat)} together comply with CETM: each set has a single lexical head, the verb, and all other syntactic heads and their projections (TP, CP and DP) form extended projections of the verb. The presence of FP does not violate CETM, as CETM requires only that syntactic heads and their projections in an elementary tree form an extended projection of the anchor.

For the derivation of equative it-clefts as in (18), we adopt the equative copular tree in (awas) in Figure 7, a tree similar to the one proposed in Frank (2002) for copular sentences. In this tree, FP is a small clause of the copula from which the two DPs being equated originate. (18) is derived by substituting (ait) into DP0 in (awas), adjoining (bwho_won) into FP in (awas) and substituting (aOhno) into DP1 in (awas), as illustrated in Figure 8.
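The tree-locality restriction can be made concrete with a toy check over the composition steps just listed for (18); the record format and the function below are our own illustration, not part of the formalism itself.

# Our encoding of the derivation steps for (18) described in the text:
# (daughter tree, operation, parent elementary tree, site in the parent).
multi_component_sets = [{"ait", "bwho_won"}]   # the set {(ait), (bwho_won)} of Figure 6

derivation_18 = [
    ("ait",      "substitute", "awas", "DP0"),
    ("bwho_won", "adjoin",     "awas", "FP"),
    ("aOhno",    "substitute", "awas", "DP1"),
]

def tree_local(derivation, mc_sets):
    # True iff every multi-component set composes, as a whole, into one elementary tree.
    parent_of = {daughter: parent for daughter, _, parent, _ in derivation}
    return all(len({parent_of[t] for t in mc_set}) == 1 for mc_set in mc_sets)

assert tree_local(derivation_18, multi_component_sets)  # both set members target (awas)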
Figure 7 Equative copula elementary tree.
Figure 8 Elementary trees for 'It was Ohno who won'.
The syntactic derivation tree and the derived tree for (18) are given in (d18) and (c18), respectively, in Figure 9.7 In (d18), the elementary trees for the cleft pronoun and the cleft clause form a unit, represented as a single node, and in (c18), the clefted constituent and the cleft clause form a constituent. Postulating separate projections for the copula (CopP) and the small clause (FP) in (awas) can account for the fact that the clefted constituent and the cleft clause form a constituent, as illustrated in (6a,b) (repeated below as (20a,b)), and yet they can be separated by an adverbial phrase, as in (20c).

(20) a. I said it should have been [Bill who negotiated the new contract], and it should have been.
     b. It must have been [Fred that kissed Mary] but [Bill that left with her].
     c. It was Kim, in my opinion, who won the race.

In our analysis, (20a,b) are possible because the bracketed parts are the higher layers of the FPs in the derived tree. (20c) is possible because an adverbial phrase can adjoin onto FP or F# in the equative copula tree, in which case, the clefted constituent and the cleft clause would be separated by the adverbial phrase in the derived tree.8

Figure 9 Derivation and derived trees for 'It was Ohno who won'.

7 By convention, names of derivation trees are prefixed with d, and names of derived trees are prefixed with c.
8 See Han & Hedberg (2006) for a TAG analysis of coordination in it-clefts, as exemplified in (20b).
Figure 10 Predicational copula tree.
For the derivation of predicational it-clefts as in (19), we adopt a predicational copula tree (awas_kid) in Figure 10. The predicational copula tree in (awas_kid) is similar to the equative copula tree in (awas) in that in both trees, the copula combines with a small clause FP. But the two trees have different anchors and different numbers of argument substitution sites. In (awas_kid), the noun (kid) is the predicate requiring a single argument, and thus the noun (kid) is the lexical anchor of the tree and the subject DP is an argument substitution site. But in (awas), both the subject and the non-subject DPs are argument substitution sites, as they are arguments of an equative predicate. As illustrated in Figure 11, (19) is derived by substituting (ait) into DP0 and adjoining (bwho_beat) onto FP in (awas_kid), and substituting (aJohn) into DP in (bwho_beat). The syntactic derivation tree and the derived tree for (19) are given in (d19) and (c19), respectively, in Figure 12. Just as in the derivation tree and the derived tree for the equative it-cleft in Figure 9, in (d19), the elementary trees for the cleft pronoun and the cleft clause form a unit, represented as a single node, and in (c19), the clefted constituent and the cleft clause form a constituent.
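In the same illustrative format used above for (18) (our own encoding, not the authors'), the predicational derivation just described can be recorded as follows. Note that (aJohn) composes into (bwho_beat) rather than into the copula tree; tree-locality does not prohibit this, since (aJohn) is not a member of the multi-component set.

# Derivation record for (19) 'It was a kid who beat John', transcribing the steps in the text.
derivation_19 = [
    ("ait",       "substitute", "awas_kid",  "DP0"),
    ("bwho_beat", "adjoin",     "awas_kid",  "FP"),
    ("aJohn",     "substitute", "bwho_beat", "DP"),
]
# The multi-component set {(ait), (bwho_beat)} again attaches to a single elementary
# tree, (awas_kid), so the derivation is tree-local.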
Figure 11 Elementary trees for 'It was a kid who beat John'.
Figure 12 Syntactic derivation and derived trees for 'It was a kid who beat John'.

4 SEMANTICS OF IT-CLEFTS
In TAG, the derivation tree, not the derived tree, serves as the input to compositional semantics (Joshi & Vijay-Shanker 1999; Kallmeyer & Joshi 2003). While phrase structure-based compositional semantics computes the meaning of a sentence as a function of the meaning of each node in the syntactic tree, TAG-based compositional semantics computes the meaning of a sentence as a function of the meaning of elementary trees put together to derive the sentence structure. Each syntactic elementary tree is associated with a semantic representation, and following the history of how the elementary trees are put together to derive the sentence structure, the corresponding semantic representation is computed by combining the semantic representations of the elementary trees.

There are two main approaches to doing compositional semantics on the derivation tree: (i) flat semantics (Joshi & Vijay-Shanker 1999; Kallmeyer & Joshi 2003; Romero & Kallmeyer 2005; Kallmeyer & Romero 2008); and (ii) STAG (Shieber & Schabes 1990; Abeillé 1994; Shieber 1994). Under the flat semantics approach, in the style of Minimal Recursion Semantics (Copestake et al. 2005), the main operation for semantic composition is the conjunction of the semantic representations associated with each elementary tree along with the unification of variables contributed by these semantic representations. In Romero & Kallmeyer (2005) and Kallmeyer & Romero (2008), derivation trees are augmented with feature structures to enforce variable unification. The theory of semantic representations developed by Kallmeyer and Romero has been used in a series of empirical work: pied-piping of wh-phrases (Kallmeyer & Scheffler 2004), focus (Babko-Malaya 2004), questions (Romero et al. 2004), VP coordination (Banik 2004), among others.

In this paper, however, we use STAG, a pairing of a TAG for the syntax and a TAG for the semantics, to propose a compositional semantic analysis for it-clefts. In STAG-based compositional semantics, the semantic representations are structured trees with nodes on which substitution and adjoining of other semantic representations can take place. Compositionality obtains with the requirement that the derivation tree in syntax and the corresponding derivation tree in semantics be isomorphic, as specified in Shieber (1994). This isomorphism requirement guarantees that the derivation tree in syntax determines the meaning components needed for semantic composition, and the way these meaning components are combined. Since the semantic representations are structured trees, the semantic objects and the composition of these objects parallel those already utilized in syntax, and so computing semantics only requires the operations of substitution and adjoining used to build the syntactic structures. These properties of STAG allow us to define a simple and elegant syntax–semantics mapping, as has been shown to be the case by Nesson & Shieber (2006), who provide a STAG analysis for various linguistic phenomena, including quantifier scope, long distance wh-movement,
subject-to-subject raising and nested quantifiers and inverse linking, and Han (2007), who provides a STAG analysis for relative clauses and pied-piping. In section 4.1, we introduce the basics of STAG and STAG-based compositional semantics, and in section 4.2, we present our proposed analysis for the semantic composition of it-clefts.
4.1 Introduction to STAG and compositional semantics
We illustrate the framework of STAG and STAG-based compositional semantics and clarify our assumptions, using (21), a simple sentence that contains an existential quantifier and an attributive adjective. A similar example was used in section 3 to illustrate the syntactic derivation in TAG.
(21) John saw a scary movie.
We use STAG as defined in Shieber (1994). In STAG, each syntactic elementary tree is paired with one or more semantic trees that represent its meaning with links between matching nodes. A synchronous derivation proceeds by mapping a derivation tree from the syntax side to an isomorphic derivation tree on the semantics side, and is synchronized by the links specified in the elementary tree pairs. In the tree pairs given in Figure 13, the trees on the left side are syntactic elementary trees and the ones on the right side are semantic trees. In the semantic trees, F stands for formulas, R for predicates and T for terms. We assume that these nodes are typed (e.g. the F node in (α′saw) has type t and the lowest R node in (α′saw) has type ⟨e, ⟨e, t⟩⟩), and we represent predicates as unreduced λ-expressions, following the notation in Han (2007). Making use of unreduced λ-expressions in semantic trees allows the reduction of semantic derived trees to logical forms through the application of λ-conversion and other operations defined on λ-expressions. The linked nodes are shown with boxed numbers. For the sake of simplicity, in the elementary tree pairs, we only include links that are relevant for the derivation of given examples.9 Figure 13 contains elementary trees required to generate the syntactic structure and the logical form of (21). The proper name tree in (αJohn) is paired with a tree representing a term on the semantics side, and the attributive adjective tree in (βscary) is paired with an auxiliary tree on the semantics side that represents a one-place predicate to be adjoined to another one-place predicate. For quantified DPs, we follow Shieber & Schabes (1990) and Nesson & Shieber (2006), and use tree-local MC-TAG on the semantics side.
9 By convention, names of semantic elementary trees are prefixed with α′ or β′, names of semantic derivation trees are prefixed with δ′ and names of semantic derived trees are prefixed with γ′.
Figure 13 Syntactic and semantic elementary trees for ‘John saw a scary movie’.
Thus, the DP in (αa_movie) is paired with a multi-component set {(α′a_movie), (β′a_movie)} on the semantics side: (α′a_movie) provides an argument variable and (β′a_movie) provides an existential quantifier with the restriction and scope. The transitive tree in (αsaw) is paired with a semantic tree representing a formula that consists of a two-place predicate and two term nodes. The links, notated with boxed numbers, guarantee that whatever substitutes into DPi, its corresponding semantic tree will substitute into the term node marked with 1, and whatever substitutes into DP is paired up with a multi-component set on the semantics side where one of the components will substitute into the term node marked with 2 and the other will adjoin to the F node marked with 2. The syntactic and semantic derivation trees are given in Figure 14, and the derived trees are given in Figure 15. Technically, there is only one derivation tree because the syntactic and semantic derivations are isomorphic.
Figure 14 Syntactic and semantic derivation trees for ‘John saw a scary movie’.
In this paper, we provide two derivation trees (one for syntax and the other for semantics) throughout to make the tree-local derivation explicit.10 The semantic derived trees can be reduced by applying λ-conversion, as the nodes dominate typed λ-expressions and terms. When reducing the semantic derived trees, in addition to λ-conversion, we propose to use Predicate Modification, as defined in Heim & Kratzer (1998) in (22).
(22) Predicate Modification
If α has the form [α β γ], and ⟦β⟧ and ⟦γ⟧ are both in D⟨e,t⟩, then ⟦α⟧ = λx_e.⟦β⟧(x) ∧ ⟦γ⟧(x).
10 In semantic derivation trees, we do not annotate the connections between a mother and a daughter node with the location of adjoining or substitution that has taken place in the mother elementary tree, as this is determined by the links between syntactic and semantic elementary trees.
Figure 15 Syntactic and semantic derived trees for ‘John saw a scary movie’.
The application of Predicate Modification and λ-conversion reduces (γ′21) to the formula in (23).
(23) ∃y [scary(y) ∧ movie(y)] [saw(John, y)]
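To make the reduction concrete, the main steps can be sketched as follows. The one-place predicates for scary and movie and the unreduced two-place predicate λx.λx′.saw(x′, x) assumed here stand in for the expressions actually given in Figure 13, so the sketch is an illustration rather than a quotation of the derived tree (γ′21):

    Predicate Modification:   λz.scary(z), λz.movie(z)            ⇒  λz.[scary(z) ∧ movie(z)]
    Restriction of ∃y:        [λz.[scary(z) ∧ movie(z)]](y)       ⇒  scary(y) ∧ movie(y)
    Scope of ∃y:              [λx.λx′.saw(x′, x)](y)(John)        ⇒  saw(John, y)
    Result:                   ∃y [scary(y) ∧ movie(y)] [saw(John, y)], i.e. (23)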
4.2 Our TAG analysis of the semantics of it-clefts
Figure 16 Syntactic and semantic elementary trees for ‘It was Ohno who won’.
11 In (β′who_won), the R node represents the semantics of the relative clause who won. This is a product of composing the semantics of the relative pronoun who and the semantics of the rest of the relative clause. Here, to simplify the derivation and to streamline the discussion, we skipped a step in the derivation with separate semantic trees for the relative pronoun and the rest of the relative clause. For a detailed analysis of the compositional semantics of relative clauses using STAG, see Han (2007).
The elementary tree pairs required for the syntax–semantics mapping of the equative it-cleft in (18) are given in Figure 16. (α′it) and (β′who_won) in the multi-component set in Figure 16 together define the semantics of definite quantification, where the former contributes the argument variable and the latter the definite quantifier, the restriction and scope, and (α′was) represents the semantics of equative sentences.11 The derivation tree for the semantics of (18) is given in (δ′18) in Figure 17 and the semantic derived tree is given in (γ′18) in Figure 18.
Figure 17 Syntactic and semantic derivation trees for ‘It was Ohno who won’.
Note that the semantic derivation tree in (δ′18) is isomorphic to the syntactic one in (δ18). The semantic derived tree in (γ′18) can be reduced to the formula in (24) after the application of λ-conversion.
(24) THEz [won(z)] [z = Ohno]
The elementary tree pairs required for the syntax–semantics mapping of the predicational it-cleft in (19) are given in Figure 19. The difference between the semantics of equative sentences and predicational sentences is represented by the two different semantic trees, (α′was) in Figure 16 and (α′was_kid) in Figure 19. While (α′was) in Figure 16 represents the semantics of equative sentences and has two term nodes with a two-place equative predicate anchoring the tree, (α′was_kid) in Figure 19 represents the semantics of predicational sentences and has one term node with a one-place predicate, λx.kid(x), anchoring the tree. The syntactic and semantic derivation trees for (19), which are isomorphic, are given in Figure 20, and the corresponding derived trees are given in Figure 21.
Figure 18 Syntactic and semantic derived trees for ‘It was Ohno who won’.
The semantic derived tree in (γ′19) can be reduced to the formula in (25) after the application of λ-conversion.
(25) THEz [beat(z, John)] [kid(z)]
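The λ-conversion steps behind (24) and (25) can be sketched in the same way as for (23). The unreduced expressions assumed here (a two-place equative predicate for the copula in (α′was), λx.kid(x) for (α′was_kid), and one-place predicates standing in for the relative-clause semantics packaged as described in footnote 11) are a reconstruction for illustration; the actual expressions are fixed by the trees in Figures 16 through 21:

    (24)  Restriction of THEz:  [λx.won(x)](z)              ⇒  won(z)
          Scope of THEz:        [λx.λx′.x′ = x](Ohno)(z)    ⇒  z = Ohno
          Result:               THEz [won(z)] [z = Ohno]

    (25)  Restriction of THEz:  [λx.beat(x, John)](z)       ⇒  beat(z, John)
          Scope of THEz:        [λx.kid(x)](z)              ⇒  kid(z)
          Result:               THEz [beat(z, John)] [kid(z)]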
5 CONNECTIVITY
5.1 Agreement
In equative it-clefts, the cleft pronoun is always singular and agrees with the copula, but the clefted constituent can be either singular or plural.
Figure 19 Syntactic and semantic elementary trees for ‘It was a kid who beat John’.
Figure 20 Syntactic and semantic derivation trees for ‘It was a kid who beat John’.
Further, when the cleft clause is a subject relative clause, the clefted constituent agrees with the verb in the cleft clause in person and number. This is illustrated in (16), repeated here as (26). This apparent agreement between the clefted constituent and the verb in the cleft clause, even though they are not in the same clause in our analysis, gives rise to a connectivity effect.
(26) a. It is John and Mary who like Pete. b. *It is John and Mary who likes Pete. c. *It are John and Mary who like Pete.
We point out that agreement across clauses is not unique to it-clefts. In (27), the subject of the main clause John and Mary agrees with the copula of the non-restrictive relative clause.
Figure 21 Syntactic and semantic derived trees for ‘It was a kid who beat John’.
So, there is independent motivation for a mechanism in the grammar that allows agreement across clauses in appropriate syntactic contexts.
(27) John and Mary, who are students, came to see me.
Figure 22 Derivation of ‘It is John and Mary who like Pete’.
12 An anonymous reviewer asks why the agreement feature on T in (βwho_like) is not valued as plural. We chose to leave it unspecified, as it is compatible with third person plural as well as second and first person singular and plural.
The agreement phenomena in it-clefts can be easily accommodated by our TAG analysis, with the addition of feature unification (Vijay-Shanker & Joshi 1988). We will postulate an agreement feature attribute, Agr, that can have feature values such as third person singular (3sg) or third person plural (3pl). This Agr feature can also be unspecified in an elementary tree and obtain a value through feature unification as it composes with another elementary tree. An unspecified Agr feature has an arbitrary index as a temporary value, and Agr features with the same indices must have the same value at the end of the derivation. Figure 22 illustrates how our TAG analysis can capture the agreement between the cleft pronoun it and the copula is, and the clefted constituent John and Mary and the verb of the cleft clause like in (26a).12 To simplify the discussion, we have already derived the DP coordination tree for John and Mary and referred to it as (αand), and substituted the DP tree anchoring Pete into (βwho_like). The substitution of (αit) into DP0 in (αis) is licensed because DP in (αit) has the [Agr:3sg] feature, which unifies with [Agr:3sg] in DP0 in (αis). And the agreement between it and is is guaranteed as both DP0 and T in the (αis) tree have the same agreement features, as indicated by the coindexation between the agreement feature on DP0 and the third person singular feature in T.
Figure 23 Syntactic derived tree for ‘It is John and Mary who like Pete’.
As (αand) tree substitutes into DP1 in (αis), the [Agr: 4] feature on FP is valued as 3pl. As (βwho_like) tree adjoins onto FP in (αis), DPl and T in (βwho_like) are valued as 3pl as well. This will guarantee the agreement between John and Mary and like. The derived tree with all the Agr features valued and unified is in Figure 23. In predicational it-clefts, the cleft pronoun can be plural, and it must agree with the copula as well as the clefted constituent. Moreover, if the cleft clause is a subject relative clause, then the clefted constituent must agree with the verb of the cleft clause, even though they are not in the same clause in our analysis, giving rise to a connectivity effect. This is illustrated in (17), repeated here as (28).
(28) a. They’re just fanatics who are holding him. b. Those are students who are rioting. c. Those are kids who beat John.
Figure 24 Derivation of ‘Those are kids who beat John’.
13 We left the agreement feature on T in (βwho_beat) unspecified for the same reason we left it unspecified in (βwho_like): it is compatible with third person plural, and second and first person singular and plural.
14 Why equative clefts require singular cleft pronouns when they contain a plural clefted constituent does not follow from our theory and remains a puzzle. However, the fact that different agreement patterns occur shows that there are clearly two types of it-cleft.
How our TAG analysis can capture the agreement phenomena in predicational it-clefts is illustrated in Figure 24.13 To simplify the discussion, we have already substituted the DP tree anchoring John into (βwho_beat). In our TAG analysis, the lexical anchor of a predicational copula elementary tree is the predicative noun, as in (αare_kids). In this tree, the agreement between the cleft pronoun, the copula and the predicative noun is guaranteed: DP0, T and DP all have the same agreement features as they all have the same indices. Here, they all have third person plural features as the DP containing the predicative noun is specified with the third person plural feature. The substitution of (αthose) tree into DP0 in (αare_kids) is licensed because DP in (αthose) has the [Agr:3pl] feature, which unifies with the third person plural feature in DP0 in (αare_kids). As (βwho_beat) tree adjoins onto FP in (αare_kids), DPl and T in (βwho_beat) will obtain the 3pl value as well. This will guarantee the agreement between kids and beat. The derived tree with all the Agr features valued and unified is given in Figure 25.14
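Since the Agr mechanism is stated only informally here, the following minimal Python sketch may help picture how value copying and clash detection work during composition. It is purely illustrative: the class and function names are hypothetical and do not come from Vijay-Shanker & Joshi (1988) or from any TAG implementation, and coindexation is modelled simply as object sharing.

    class Agr:
        """An Agr feature value holder; coindexed nodes share one Agr object."""
        def __init__(self, value=None):
            self.value = value            # '3sg', '3pl', or None if unspecified

    def unify(a, b):
        """Unify two Agr features, copying values and raising on a clash."""
        if a.value is None:
            a.value = b.value
        elif b.value is not None and a.value != b.value:
            raise ValueError(f"Agr clash: {a.value} vs {b.value}")
        b.value = a.value
        return a.value

    # 'It is John and Mary who like Pete' (equative cleft):
    dp0_and_t = Agr('3sg')            # DP0 and T in (alpha_is) are coindexed
    unify(dp0_and_t, Agr('3sg'))      # substituting (alpha_it): 'it' and 'is' agree

    fp = Agr()                        # [Agr: 4] on FP, initially unspecified
    unify(fp, Agr('3pl'))             # substituting (alpha_and) values FP as 3pl
    rel_clause_t = fp                 # adjoining (beta_who_like): its DP and T share FP's value
    print(rel_clause_t.value)         # -> '3pl', so 'John and Mary' and 'like' agree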
Figure 25 Syntactic derived tree for ‘Those are kids who beat John’.
5.2 Binding
In it-clefts, even though the clefted constituent is not c-commanded by the subject of the cleft clause, a SELF-anaphor in the clefted constituent can be co-indexed with the subject in the cleft clause as in (12a), repeated here as (29a), and a pronoun in the clefted constituent cannot be co-indexed with the subject in the cleft clause as in (13a), repeated here as (29b). In other words, the SELF-anaphor and the pronoun behave as if they are inside the cleft clause as in (30a) and (30b), giving rise to a connectivity effect.
(29) a. It was himselfi who Johni nominated. b. *It was himi who Johni nominated.
(30) a. Johni nominated himselfi. b. *Johni nominated himi.
We will use the Binding Conditions defined in Reinhart & Reuland (1993) to account for this phenomenon. The formulation of Binding Conditions by Reinhart and Reuland and the definitions needed to understand it are given in (31) and (32). Condition A constrains the
distribution of SELF-anaphors and Condition B constrains the distribution of pronouns.
(31) Binding Conditions (Reinhart & Reuland 1993)
a. A: If a syntactic predicate is reflexive-marked, it is reflexive.
b. B: If a semantic predicate is reflexive, it is reflexive-marked.
(32) Definitions (Reinhart & Reuland 1993)
a. The syntactic predicate formed of a head P is P, all its syntactic arguments (the projections assigned theta-roles/case by P), and an external argument of P.
b. The semantic predicate of P is P and all its arguments at the relevant semantic level.
c. P is reflexive iff two of its arguments are co-indexed.
d. P is reflexive-marked iff either P is lexically reflexive or one of P’s arguments is a SELF-anaphor.
According to Reinhart and Reuland, Condition A successfully applies to (30a) because the syntactic predicate ‘John nominated himself’ is reflexive-marked, as one of the arguments, himself, is a SELF-anaphor, and it is also reflexive, as two of its arguments, John and himself, are co-indexed. However, (30b) is ruled out by Condition B. In (30b), the semantic predicate nominated(John, John) is reflexive, as two of its arguments are co-indexed, but it is not reflexive-marked, as nominated is not lexically reflexive and none of nominated’s arguments is a SELF-anaphor. We first apply Condition B of Reinhart and Reuland to rule out (29b), repeated below as (33a). According to our TAG analysis, (33a) would map onto an equative semantic representation as in (33b). Since the clefted constituent him is co-indexed with John, they corefer, and so the variable from the cleft pronoun, z, would be equated with John. We will represent this as z = him_John, just to be explicit about the fact that the form of the clefted constituent here is him. This in turn means that the semantic predicate nominated(John, z) is reflexive. But it is not reflexive-marked, as nominated is not lexically reflexive and none of its arguments is a SELF-anaphor.
(33) a. *It was himi who Johni nominated. b. *THEz [nominated(John, z)] [z = him_John]
We now turn to (29a). According to our TAG analysis, (29a) is also an equative sentence. We thus have a syntactic predicate whose head is the equative copula and with two syntactic arguments, it and himself. But then Condition A should rule out this sentence because even
though the syntactic predicate is reflexive-marked, it is not reflexive, as it and himself are not co-indexed. Reinhart and Reuland point out that focus anaphors can occur in an argument position without a binder, appearing to be exempt from Condition A. Such anaphors are also known as discourse anaphors of focus or emphatic anaphors (Kuno 1987; Zribi-Hertz 1989). Some examples are given in (34).
(34) a. This letter was addressed only to myself. (Reinhart & Reuland 1993, ex. 27a) b. ‘Bismarck’s impulsiveness has, as so often, rebounded against himself’. (Reinhart & Reuland 1993, ex. 27c, originally quoted in Zribi-Hertz 1989)
We note that the clefted constituent occupies a focused position (Akmajian 1970a; Prince 1978). This means that a SELF-anaphor in a clefted constituent position is always focused, and so it can be exempt from Condition A. Further support for this view comes from examples such as those in (35). These examples are acceptable even though myself and yourself do not have possible binders in the sentences in which they occur.
(35) a. It was myself who John nominated. b. It was yourself who John nominated.
A question remains, though, as to why the clefted constituent cannot be occupied by just any SELF-anaphor. For instance, (36) is degraded, where herself in the clefted constituent position does not have a binder.
(36) *It was herself who John nominated.
This implies that even though a focus anaphor in the clefted constituent position is not subject to Condition A, its distribution is constrained by discourse factors. The exact nature of the discourse constraints on the distribution of focus anaphors in it-clefts remains to be investigated.
6 CONCLUSION
We have proposed a syntax and semantics of it-clefts, using tree-local MC-TAG and STAG. We accounted for the equative and predicational interpretations available to it-clefts, the two readings available to simple copula sentences as well, by postulating two types of copula sentences in English, an equative one and a predicational one (Heycock & Kroch 1999). The two types of copula sentences are represented by two
different pairs of syntactic and semantic elementary trees. Our analysis thus contrasts with the inverse analysis of Williams (1983), Partee (1986), Moro (1997) and Mikkelsen (2005), according to which specificational clauses (our equatives) are inverted predicational clauses. On some versions of this analysis, both orders derive from an underlying embedded small clause, with either the subject or the predicate raising to matrix subject position. In our TAG analysis, the derivation of it-clefts starts either with an equative copula elementary tree or with a predicational copula elementary tree. The copula tree then composes with the elementary tree for the cleft pronoun and the elementary tree for the cleft clause. In our analysis, the cleft pronoun and the cleft clause bear a direct syntactic relation because the elementary trees for the two parts belong to a single multi-component set. They do not actually form a syntactic constituent in the derived tree, but as the elementary trees for the two belong to the same multi-component set, the intuition that they form a syntactic unit is captured, represented in the derivation tree as a single node. At the same time, the surface syntactic constituency is represented in the derived tree where the clefted constituent and the cleft clause form a constituent. Further, the semantics of the two trees in the multi-component set is defined as a definite quantified phrase, capturing the intuition that they form a semantic unit as a definite description. We have also shown that our TAG analysis can account for connectivity effects instantiated by binding and agreement: for binding, we applied Binding Conditions of Reinhart & Reuland (1993) and exploited the fact that the clefted constituent is a focused position, and for agreement, we added feature unification to our TAG analysis. The distinction in TAG between the derivation tree and the derived tree enabled us to resolve the tension between the surface constituency and the syntactic and semantic dependency in it-clefts: in the derived tree, the cleft clause forms a constituent with the clefted constituent, not with the cleft pronoun, capturing the insight from the expletive approach, but in the derivation tree, the cleft clause and the cleft pronoun form a syntactic/semantic unit, capturing the insight from the discontinuous constituent approach. The extended domain of locality of TAG and the ability to decompose an elementary tree to a set of trees in MC-TAG enabled us to provide a straightforward syntactic account of the discontinuous constituent property of the cleft pronoun and the cleft clause without having to adopt movement to produce the effect of extraposition of the cleft clause. Moreover, the derivation tree-based compositional semantics and the direct syntax–semantics mapping in STAG enabled us to provide a simple compositional semantics for
it-clefts without using an ad hoc interpretive operation to associate the meaning coming from the cleft pronoun and the meaning coming from the cleft clause. It remains for future work to extend our analysis to it-clefts that have non-DP clefted constituents, such as ‘It was to the library that John went’ and ‘It was happily that John quit his job’.
Acknowledgements
We thank the audience at TAG+8 in Sydney, 2006, for comments and questions on the previous version of this paper. We are also extremely indebted to the two anonymous reviewers for their insightful comments that were crucial in improving this paper. All remaining errors are ours. This work was supported by SSHRC 410-2003-0544 and NSERC RGPIN341442 to Han, and SSHRC 410-2007-0345 to Hedberg.
CHUNG-HYE HAN AND NANCY HEDBERG Department of Linguistics Simon Fraser University 8888 University Drive Burnaby BC V5A 1S6 Canada e-mail:
[email protected],
[email protected]
REFERENCES
Abeillé, Ann (1994), ‘Syntax or semantics? Handling nonlocal dependencies with MC-TAGs or Synchronous TAGs’. Computational Intelligence 10:471–85. Abney, Steven (1987), The English Noun Phrase in Its Sentential Aspect. Doctoral dissertation, MIT, Cambridge, MA. Akmajian, Adrian (1970a), Aspects of the Grammar of Focus in English. Doctoral dissertation, MIT, Cambridge, MA. Akmajian, Adrian (1970b), ‘On deriving cleft sentences from pseudo-cleft sentences’. Linguistic Inquiry 1:149–68. Babko-Malaya, Olga (2004), ‘LTAG semantics of focus’. In Proceedings of TAG+7. Vancouver, Canada. 1–8.
Ball, Catherine N. (1977), ‘Th-clefts’. Pennsylvania Review of Linguistics 2:57–69. Banik, Eva (2004), ‘Semantics of VP coordination in LTAG’. In Proceedings of TAG+7. Vancouver, Canada. 118–25. Bleam, Tonia (2000), ‘Clitic climbing and the power of Tree Adjoining Grammar’. In Ann Abeillé and Owen Rambow (eds.), Tree Adjoining Grammars: Formalisms, Linguistic Analysis, and Processing. CSLI. Stanford, CA. 193–220. Chomsky, Noam (1977), ‘On wh-movement’. In P. W. Culicover, T. Wasow and A. Akmajian (eds.), Formal Syntax. Academic Press. New York. 71–132.
Hedberg, Nancy (2000), ‘The referential status of clefts’. Language 76:891–920. Heggie, Lorie A. (1988), The Syntax of Copular Structures. Doctoral dissertation, University of Southern California, Los Angeles, CA. Heycock, Caroline & Anthony Kroch (1999), ‘Pseudocleft connectedness: implications for the LF interface’. Linguistic Inquiry 30:365–97. Higginbotham, James (1985), ‘On semantics’. Linguistic Inquiry 16:547–94. Jespersen, Otto (1927), A Modern English Grammar, vol. 3. Allen and Unwin. London. Jespersen, Otto (1937), Analytic Syntax. Allen and Unwin. London. Joshi, Aravind K. (1985), ‘Tree Adjoining Grammars: how much context sensitivity is required to provide a reasonable structural description’. In D. Dowty, L. Karttunen and A. Zwicky (eds.), Natural Language Parsing. Cambridge University Press. Cambridge, UK. 206–50. Joshi, Aravind K., L. Levy & M. Takahashi (1975), ‘Tree adjunct grammars’. Journal of Computer and System Sciences 10:136–63. Joshi, Aravind K. & K. Vijay-Shanker (1999), ‘Compositional semantics with Lexicalized Tree-Adjoining Grammar (LTAG): how much underspecification is necessary?’ In H. C. Blunt and E. G. C. Thijsse (eds.), Proceedings of the Third International Workshop on Computational Semantics (IWCS-3). Tilburg. 131–45. Joshi, Aravind K., K. Vijay-Shanker & David Weir (1991), ‘The convergence of mildly context-sensitive grammatical formalisms’. In Peter Sells, Stuart Shieber and Tom Wasow (eds.), Foundational Issues in Natural Language Processing. MIT Press. Cambridge, MA. 31–82.
Copestake, Ann, Dan Flickinger, Ivan A. Sag & Carl Pollard (2005), ‘Minimal recursion semantics: an introduction’. Journal of Research on Language and Computation 3:281–332. DeClerck, Renaat (1988), Studies on Copular Sentences, Clefts and Pseudoclefts. Foris. Dordrecht, The Netherlands. Delahunty, Gerald P. (1982), Topics in the Syntax and Semantics of English Cleft Sentences. Indiana University Linguistics Club. Bloomington, IN. Delin, Judy L. (1989), Cleft Constructions in Discourse. Doctoral dissertation, University of Edinburgh, Edinburgh. É. Kiss, Katalin (1998), ‘Identificational focus versus information focus’. Language 74:245–73. Emonds, Joseph E. (1976), A Transformational Approach to English Syntax. Academic Press. New York. Frank, Robert (2002), Phrase Structure Composition and Syntactic Dependencies. MIT Press. Cambridge, MA. Grimshaw, Jane (1991), Extended Projection. Unpublished MS, Brandeis University, Waltham, MA. Gundel, Jeanette K. (1977), ‘Where do cleft sentences come from?’ Language 53:543–59. Han, Chung-hye (2007), ‘Pied-piping in relative clauses: syntax and compositional semantics using Synchronous Tree Adjoining Grammar’. Research on Language and Computation 5:457–79. Han, Chung-hye & Nancy Hedberg (2006), ‘A Tree Adjoining Grammar Analysis of the Syntax and Semantics of It-clefts’. In Proceedings of the 8th International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+8). COLING-ACL Workshop. Sydney, Australia. 33–40. Hedberg, Nancy (1990), Discourse Pragmatics and Cleft Sentences in English. Doctoral dissertation, University of Minnesota, Minneapolis, MN.
Mikkelsen, Lina (2005), Copular Clauses: Specification, Predication and Equation. John Benjamins. Amsterdam. Moro, Andrea (1997), The Raising of Predicates: Predicative Noun Phrases and the Theory of Clause Structure. Cambridge University Press. Cambridge. Nesson, Rebecca & Stuart M. Shieber (2006), ‘Simpler TAG semantics through synchronization’. In Proceedings of the 11th Conference on Formal Grammar. CSLI. Malaga, Spain. 103–117. Partee, Barbara (1986), ‘Ambiguous pseudoclefts with unambiguous be’. In S. Berman, J. Choe and J. McDonough (eds.), Proceedings of NELS, vol. 16. GLSA, University of Massachusetts. Amherst, MA. 354–66. Percus, Orin (1997), ‘Prying open the cleft’. In K. Kusumoto (ed.), Proceedings of the 27th Annual Meeting of the North East Linguistics Society. GLSA. Amherst, MA. 337–51. Prince, Ellen (1978), ‘A comparison of wh-clefts and it-clefts in discourse’. Language 54:883–906. Reinhart, Tanya & Eric Reuland (1993), ‘Reflexivity’. Linguistic Inquiry 24:657–720. Rochemont, Michael (1986), Focus in Generative Grammar. John Benjamins. Amsterdam, The Netherlands. Romero, Maribel & Laura Kallmeyer (2005), ‘Scope and situation binding in LTAG using semantic unification’. In Proceedings of the Sixth International Workshop on Computational Semantics (IWCS-6). Tilburg. Romero, Maribel, Laura Kallmeyer & Olga Babko-Malaya (2004), ‘LTAG semantics for questions’. In Proceedings of TAG+7. Vancouver, Canada. 186–93. Shieber, Stuart (1994), ‘Restricting the weak-generative capacity of Synchronous Tree-Adjoining Grammars’. Computational Intelligence 10:271–385.
Kallmeyer, Laura & Aravind K. Joshi (2003), ‘Factoring predicate argument and scope semantics: underspecified semantics with LTAG’. Research on Language and Computation 1:3–58. Kallmeyer, Laura & Maribel Romero (2008), ‘Scope and situation binding in LTAG using semantic unification’. Research on Language and Computation 6:3–52. Kallmeyer, Laura & Tatjana Scheffler (2004), ‘LTAG analysis for pied-piping and stranding of wh-phrases’. In Proceedings of TAG+7. Vancouver, Canada. 32–9. Koopman, Hilda & Dominique Sportiche (1991), ‘The position of subjects’. Lingua 85:211–58. Kroch, Anthony (1989), ‘Asymmetries in long-distance extraction in a Tree Adjoining Grammar’. In Mark Baltin and Anthony Kroch (eds.), Alternative Conceptions of Phrase Structure. University of Chicago Press. Chicago, IL. 66–98. Kroch, Anthony & Aravind Joshi (1985), ‘Linguistic relevance of Tree Adjoining Grammar’. Technical Report MS-CS-85-16. Department of Computer and Information Sciences, University of Pennsylvania. Kroch, Anthony S. & Aravind K. Joshi (1987), ‘Analyzing extraposition in a Tree Adjoining Grammar’. In G. Huck and A. Ojeda (eds.), Discontinuous Constituents, volume 20 of Syntax and Semantics. Academic Press. Orlando, FL. 107–49. Kroch, Anthony & Beatrice Santorini (1991), ‘The derived constituent structure of the West Germanic verb raising construction’. In Robert Freidin (ed.), Principles and Parameters in Comparative Grammar. MIT Press. Cambridge, MA. 269–338. Kuno, Susumu (1987), Functional Syntax: Anaphora, Discourse and Empathy. University of Chicago Press. Chicago, IL.
Shieber, Stuart & Yves Schabes (1990), ‘Synchronous Tree Adjoining Grammars’. In Proceedings of COLING’90. Helsinki, Finland. 253–8.
Vijay-Shanker, K. & Aravind K. Joshi (1988), ‘Feature structure based Tree Adjoining Grammars’. In Proceedings of the 12th International Conference on Computational Linguistics. Budapest, Hungary. 714–9. Weir, David (1988), Characterizing Mildly Context-sensitive Grammar Formalisms. Doctoral dissertation, University of Pennsylvania. Philadelphia, PA.
Williams, Edwin (1980), ‘Predication’. Linguistic Inquiry 11:203–38. Williams, Edwin (1983), ‘Semantic vs. syntactic categories’. Linguistics and Philosophy 6:423–46. Wirth, Jessica R. (1978), ‘The derivation of cleft sentences in English’. Glossa 12:58–82. Zribi-Hertz, Anne (1989), ‘A-type binding and narrative point of view’. Language 65:695–727.
First version received: 17.12.2007. Second version received: 08.04.2008. Accepted: 06.05.2008.
Journal of Semantics 25: 381–409 doi:10.1093/jos/ffn004 Advance Access publication June 30, 2008
Concept Narrowing: The Role of Context-independent Information
PAULA RUBIO-FERNÁNDEZ
Princeton University
Abstract
The present study aims to investigate the extent to which the process of lexical interpretation is context dependent. It has been uncontroversially agreed in psycholinguistics that interpretation is always affected by sentential context. The major debate in lexical processing research has revolved around the question of whether initial semantic activation is context sensitive or rather exhaustive, that is, whether the effect of context occurs before or only after the information associated to a concept has been accessed from the mental lexicon. However, within post-lexical access processes, the question of whether the selection of a word’s meaning components is guided exclusively by contextual relevance, or whether certain meaning components might be selected context independently, has not been such an important focus of research. I have investigated this question in the two experiments reported in this paper and, moreover, have analysed the role that context-independent information in concepts might play in word interpretation. This analysis differs from previous studies on lexical processing in that it places experimental work in the context of a theoretical model of lexical pragmatics.
1 CORE FEATURES
Looking at the psycholinguistic literature on lexical processing (see below), it is possible to argue that the properties associated with a given concept all receive an initial boost on word recognition but that only some subset of these remain detectable over the time course of language processing. I am interested in those meaning components which remain active throughout the interpretation process regardless of their contextual relevance (what I call core features of a concept). On a strong interpretation of the term, core features would constitute something corresponding to a core semantic interpretation for the word, which would be constant across contexts. However, on a weaker reading, core features would merely be highly accessible during interpretation, not necessarily computed as part of the word’s interpretation—whether they are or not depending on the actual context. In order to determine whether there are core features of a concept, a contrast needs to be
1 The distinction I make between strong and weak associates refers to the different strength of association between a concept and its properties. In contrast, the distinction between core and noncore features is based on the different time course of activation that the properties of a concept might have during processing by virtue of the strength of their association.
established with other associated properties which do not maintain their initial activation context independently (what I call non-core features). The lexical priming study reported in this paper will therefore try to establish (i) whether those conceptual properties which are more strongly associated to a concept might behave as core features by maintaining their initial activation across contexts and (ii) whether those properties which are more weakly associated to a concept might behave as non-core features and prolong their automatic activation only in those contexts where they are relevant for interpretation.1 The distinction between core and non-core features is related to, but not exactly the same as, that drawn by Barsalou between context-independent and context-dependent properties (Barsalou & Bower 1980; Barsalou 1982). According to Barsalou, context-independent properties are activated by the word for a concept on all occasions, regardless of contextual relevance (e.g. BASKETBALL—ROUND). Context-dependent properties, on the other hand, are only activated by particular contexts in which the word appears (e.g. BASKETBALL—FLOATS). In Barsalou’s view, context-independent properties constitute the core meaning components of words, whereas the activation of context-dependent properties accounts for semantic flexibility. Although Barsalou (1982) offers some empirical evidence for context-dependent and context-independent information in concepts (see below), he does not take into account the question of pre- v. post-lexical access processes. In this respect, it is not clear whether the semantic activation he refers to is the result of an early automatic process of spreading activation of associates or rather reflects a later meaning selection. This question is particularly important in the case of context-dependent properties. According to Barsalou, some context-dependent properties are ad hoc or computed on the fly (Barsalou 1983), whereas others are part of the information associated to a concept in long-term memory (what Barsalou calls conceptual frames; Barsalou 1992). Context-dependent properties therefore include two different types of conceptual properties in this account: those which have to be inferred during processing and those which are accessed automatically by virtue of their association to the concept. The different patterns of activation of these two types of properties make it difficult to make generalizations about the accessibility of context-dependent properties during processing. In contrast, the notion of non-core features refers only to properties
associated to a given concept in long-term memory, although this association would not be strong enough for them to maintain their initial activation context independently, the way core features would.
2 PREVIOUS RESEARCH POINTING TO A CORE/NON-CORE DISTINCTION
2 In the word processing literature, high-dominant properties of a concept are understood as those properties which are frequently processed together with the word for the concept. This results in a strong association between the concept and the property. Likewise, low-dominant properties are not so often processed together with the word for the concept and so are more weakly associated to the concept in long-term memory.
Using a property verification task, Barsalou (1982) showed that context-independent properties (e.g. HAS A SMELL for SKUNK) were primed in neutral sentence contexts (‘The skunk was under a large willow’) just as much as they were in biasing contexts (‘The skunk stunk up the entire neighbourhood’). However, the verification of context-dependent properties (CAN BE WALKED UPON for ROOF) was faster after a biasing sentence (‘The roof cracked under the weight of the repairman’) than after a neutral sentence (‘The roof had been renovated prior to the rainy season’). Using a similarity-judgment task, Barsalou (1982) also showed that the similarity ratings for pairs of words (e.g. SOFA-DESK or RACOON-SNAKE) were positively affected by the prior presentation of the category name (FURNITURE and CAN BE A PET for the examples above) only when the property shared by the two concepts was context dependent. This would confirm the prediction that the properties shared by instances of a common category (FURNITURE) are usually context independent, whereas the properties shared by instances of an ad hoc category (CAN BE A PET) might often be context dependent. Barsalou acknowledges that the results from these experiments provide only a functional account of property availability, since the procedure did not address the time course of activation. Likewise, the results reported in Conrad (1978), Tabossi & Johnson-Laird (1980) and Tabossi (1982) provide empirical evidence for some distinction between context-dependent and context-independent properties without taking into account the time course of semantic processing (see references for details). Whitney et al. (1985) and Greenspan (1986) carried out different online studies of word interpretation, which converged on the same basic conclusions. Relative to controls, both high- and low-dominant properties of the prime concepts2 were activated at 0 ms, regardless of
3 RELEVANCE THEORY: CONCEPT NARROWING
At the sentence level, pragmatics tries to account for the divergence between the encoded meaning of a complex linguistic expression and the meaning that it is used to communicate in context. Likewise, at the word level, lexical pragmatics investigates how the concept communicated by use of a word may differ from the concept encoded by that word. In relevance theory (Sperber & Wilson 1986/1995; Carston 2002), content words encode concepts. What this means is that a content word gives access to a stable concept in the conceptual repertoire of the hearer. As psychological objects, concepts consist of an address in memory, under which various types of information are stored (i.e. logical, encyclopaedic and lexical information). When a conceptual address appears in the mental representation of an utterance being processed, the various types of conceptual information stored at that address would have been activated or made accessible (Sperber & Wilson 1986/1995). One of the cases where the concept communicated by use of a word differs from the concept encoded by that word is concept narrowing, which results from a general pragmatic process of concept adjustment in online utterance interpretation (Carston 2002; Wilson 2003). In
which property was emphasized by the context. However, low-dominant properties were no longer activated 300 ms after the occurrence of the noun if the context biased interpretation towards a high-dominant property (e.g. ‘The fresh meat was protected by the ice’—SLIPPERY). In contrast, high-dominant properties remained active at 1000 ms even in contexts emphasizing a low-dominant property (e.g. ‘Robert fell on the ice’—COLD). It is therefore reasonable to conclude that previous off- and online studies provide support for the notion of context-independent information in concepts. The present study of core features using the cross-modal lexical priming paradigm was designed as a small-scale follow-up of previous online studies such as those of Whitney et al. (1985) and Greenspan (1986). However, unlike previous studies of word processing, I will provide an interpretation of the results in accordance with the theoretical model of lexical pragmatics currently being developed within relevance theory, which is outlined in the next section. The analysis of experimental work from a cognitive-pragmatic perspective takes forward psycholinguistic analyses by adding a necessary communicative dimension to the interpretation of the language processing data.
instances of narrowing, a word is used to convey a more specific concept than the one it encodes.3 Consider the following examples:4 (1)
a. John said he would like a drink. b. This Christmas the bird was delicious. c. Mary is happy here.
3 The opposite process would be concept loosening, where a word is used to convey a more general concept than the one encoded. Most instances of metaphor interpretation, for example, involve both narrowing and loosening the concept for the metaphor vehicle (see Carston 2002). Metonymy, on the other hand, would involve semantic transfer: a third type of process where the outcome is not a narrower or broader concept but a different concept altogether (Recanati 2004). 4 Different versions of these examples are usually cited in the relevance theory literature as instances of narrow and loose use (e.g. Carston 2002). However, it is possible that these uses might be so common as to have given rise to secondary meanings of the corresponding words—with the corresponding concepts not being ad hoc anymore. If that were the case, I would be interested in how online interpretation would have worked before these uses were lexicalized.
In the examples under (1), a content word is used to convey a more specific concept than the lexical one. In (1a), it is understood that John would like an alcoholic drink, rather than any type of drink, just as in (1b) bird would refer to some edible bird, in particular one of the typical birds served at Christmas dinner (i.e. goose or turkey). Similarly, interpreting (1c) would involve understanding the term happy, which covers a wide range of positive emotional states, in a specific way that applies to Mary’s. Understanding the examples in (1a), (1b) and (1c) would involve constructing an ad hoc concept DRINK*, BIRD* and HAPPY*, respectively, with a more restricted denotation than the concepts linguistically specified. Relevance theory seems to share with psycholinguists such as Barsalou (1987) the assumption that an ad hoc concept may be formed from a stable concept by using the information associated with the stable concept to restrict or extend its extension (Carston 2002; Wilson 2003). Some selection of the logical and encyclopaedic information stored under the address of the lexical concept would therefore take place in concept narrowing. According to Carston (1996, 2002), when an ad hoc concept results from a process of narrowing, some encyclopaedic property of the lexical concept is promoted to the status of logical or content constitutive. For example, in processing (1a), the ad hoc concept DRINK* would result from strengthening the lexical concept DRINK by making the encyclopaedic property ALCOHOLIC content constitutive of the concept communicated. Unfortunately, relevance theory and other theoretical work in lexical pragmatics do not offer a more detailed explanation of the mechanisms involved in concept narrowing, especially in terms of the time course of
activation of the relevant properties (e.g. Wilson 2003; Recanati 2004). This is particularly problematic when trying to derive testable predictions from the theory. However, if certain conceptual properties were so strongly associated with a concept as to behave as core features (i.e. become and remain activated in processing the corresponding word regardless of their contextual relevance), then their role in word interpretation would seem particularly interesting in cases of concept narrowing. It seems a reasonable assumption that those encyclopaedic properties of the lexical concept which are promoted to the status of content constitutive in narrowing the encoded concept would remain active during the interpretation process (Rubio-Fernández 2005).5 In this view, it is possible that core features might function as some sort of default meaning component of the prime words, with the encoded concepts being always narrowed down to their stereotypical members, provided the result is consistent with the context (Levinson 2000). For example, if YELLOW is a core feature of the lexical concept BANANA, in processing the sentence ‘The tourist gave the monkey a banana’, the concept BANANA* might be interpreted as referring to yellow bananas. However, it is also possible that core features are only upgraded to the status of definitional of the concept expressed in particular contexts where they are relevant for interpretation and can reasonably be taken to have been intended by the speaker. In all other contexts, core features would still be highly accessible during interpretation, although not actually computed as part of the information communicated by use of the corresponding word. These two alternatives will be further discussed in the light of the experimental results.
5 It also seems reasonable that conceptual properties that might be relevant in a certain context remain active during processing without necessarily being promoted to the status of content constitutive. For example, in ‘Mary travels so often that she cannot even keep a cactus at home’, the non-core feature RESILIENT would probably be highly accessible in the context even though the concept CACTUS would not need to be narrowed down to include only resilient cacti.
4 THE SCOPE OF THE MECHANISM OF SUPPRESSION
Another question that is worth discussing around the notion of core features is the scope of the mechanism of suppression in word processing, that is, the scope of the cognitive mechanism which reduces the activation of lexical information that is irrelevant for word interpretation. According to some views (e.g. Simpson & Kang 1994; Simpson & Adamopoulos 2002), suppression is a rather specific mechanism only operating when candidates are mutually exclusive (e.g. the various meanings of an ambiguous word or the literal and the
figurative interpretations of a metaphor vehicle). In these cases, keeping the inappropriate meaning active or simply leaving it to passively decay could interfere with sentence comprehension, hence the need for an executive mechanism that actively reduces the activation of information that is inconsistent with the interpretation of the sentence. However, according to other views (e.g. the Structure Building Framework; Gernsbacher 1990), suppression is a general cognitive mechanism involved not only in disambiguation and metaphor interpretation but also in reducing the activation of less relevant information of unambiguous words like the ones I tested in the experiments reported here. In another study (see Rubio-Fernández 2006), I found evidence that core features get actively suppressed in contexts where they are inconsistent with the mental representation of the prime concept. For example, FAST, a core feature of the concept CHEETAH, would drop in activation between 0 and 400 ms when processing a conflicting sentence like ‘In their final exam the biology students had to dissect a cheetah’. In other types of contexts where this property was not inconsistent with interpretation, it was still active at 400 ms. This empirical evidence is consistent with both of the above views of suppression. However, what remains to be seen is whether in a context where high-dominant properties or strong associates of a concept are simply irrelevant for interpretation (and not necessarily inconsistent), they would maintain their initial activation at 400 ms. If this is the case, the very notion of core features would support the narrow scope view of suppression, where competition between candidates is necessary to trigger the mechanism. Core features would therefore become and remain activated regardless of their contextual relevance, unless actively suppressed in a conflicting context.
5 A PRIMING STUDY OF CORE FEATURES
In order to investigate the notion of core features, I carried out two experiments using a cross-modal lexical priming procedure adapted from the literature on disambiguation (Swinney 1979; Tanenhaus et al. 1979). This procedure requires that participants make a lexical decision on a series of visual targets while listening to different sentential contexts. Word targets are sometimes related to the sentence (e.g. ‘John found a bug in his room’—ANT), so facilitation of word recognition relative to an unrelated control is taken as a measure of activation. Unlike the experiments discussed in the previous section, I was interested not only in investigating the role of context in semantic activation but also in finding some empirical basis for distinguishing core and non-core features among the associates of a given concept.
The activation of strong and weak associates of the prime concepts6 was tested across three time delays (i.e. 0, 400 and 1000 ms) in two different types of contexts (i.e. neutral and weak-associate biasing). If strong associates behave like core features of the prime concepts, they should become and remain activated both in neutral and weak-associate biasing contexts where they are irrelevant for interpretation. In contrast, if weak associates behave like non-core features of the prime concepts, then they should maintain their initial activation only in weak-associate biasing contexts where they are relevant for interpretation. In terms of concept narrowing, if strong associates behave like core features and remain active across contexts regardless of their contextual relevance, the important issue to discuss is whether they play the same role in lexical interpretation in both types of contexts or rather have a different function in concept comprehension depending on their contextual relevance.
6 I will use the terms ‘strong’ and ‘weak associates’ in order to emphasize the association between the prime concepts and the target properties in long-term memory (in contrast, for example, with the ad hoc properties that Barsalou (1982) would consider as context dependent). However, this distinction is equivalent to that drawn by other terms used in the literature; for example, ‘high-’ and ‘low-dominant properties’ or ‘central’ and ‘peripheral properties’ (Whitney et al. 1985; Greenspan 1986).
5.1 Experiment 1
In the first experiment, I tested the time course of activation of strong and weak associates in neutral contexts, which do not make salient any particular aspect of the primes (e.g. ‘The tourist gave the monkey a banana’—YELLOW [SA]/BENT [WA]). I worked on the assumption that the interpretation of the prime words in this type of context would be rather stereotypical. According to even the most expansive views on the role of suppression (e.g. Gernsbacher 1990), the mechanism of suppression would not have sufficient grounds to operate in neutral contexts as it would not have enough of a basis on which to select the conceptual information that may be irrelevant for interpretation. In this respect, if associates decreased in activation during the processing of neutral contexts, it would be due to passive decay for lack of contextual stimulation rather than to active suppression. As core features of the prime concepts, strong associates would be expected to maintain their initial activation in neutral contexts past the automatic phase of spreading activation. In contrast, as non-core features, weak associates should lose their activation soon after the spreading activation phase as they would only prolong their initial activation in contexts where they are relevant for interpretation.
5.1.1 Method
Participants The participants in this experiment were 60 undergraduate students at Cambridge University who volunteered to take part. They all had English as their first language. Each session lasted approximately 12 minutes.
Materials and design A set of 22 common nouns with predictable distinctive properties were selected as primes. Rather than taking these properties from a dictionary definition of the critical nouns, I undertook a direct assessment of property dominance by using two questionnaires based on the literature on prototypes (Rosch 1973; Rosch & Mervis 1975; Barsalou 1983, 1985, 1987). In Questionnaire 1, participants were presented with two tasks. In the first one, they were asked to give a brief definition for each of the 22 words of a list (example given: ‘HOUSE: Building where people live’). In the second task, they were asked to list what they thought were the distinctive characteristics of the concepts in the same list of words (example given: ‘How could you distinguish a WHALE from other animals? Lives in the sea, large size, mammal’). In Questionnaire 2, participants were presented with a free-association task where they were asked to write down the first characteristic that came to mind when reading the words on a list, which was the same as in Questionnaire 1. After piloting the questionnaires on 15 participants (5 for Questionnaire 1 and 10 for Questionnaire 2), only minor modifications needed to be made to the instructions. The final versions of the questionnaires were distributed among 65 participants (25 for Questionnaire 1 and 40 for Questionnaire 2). Having chosen a list of words with predictable highly distinctive properties, the results were as expected apart from two terms, tip and spring, which are ambiguous and did not show a homogeneous result. These latter were discarded. For each of the 20 remaining concepts, the most frequent distinctive property (what I refer to as strong associate) was selected from the three different tasks. A potential problem in selecting the set of weak associates was the strength of the association between the property and the concept. If prime and target were not related closely enough, it could be argued that the property was not associated to the concept in long-term memory (as seems to be the case with Barsalou’s context-dependent properties; see Whitney et al. 1985 for a critique). Therefore, weak associates were selected not from the least frequent distinctive properties given in the questionnaires but rather from the range between the third and the seventh most frequent properties (see Appendix A for a complete list of primes and targets).
A neutral sentence was then constructed for each one of the 20 primes, which was always the last word in the sentence (see Appendix B for a complete list of the sentential contexts). In order to avoid intralexical priming, sentences did not include any associate of the target word other than the critical prime. The 20 critical sentences were divided into two equal groups matched by the word frequency and the length of the corresponding targets (Johansson & Hofland 1989). One group of sentences were paired with related targets (e.g. ‘For the dinner Mary brought champagne’—BUBBLE [SA]/CELEBRATION [WA]). For the other group, targets were scrambled, and so the sentences were paired with unrelated target words (‘My grandmother knew that old lullaby’—SMALL [SA]/WINDOW [WA]). The unrelated sentences and targets served as controls. Two lists of materials were constructed by pairing one group of sentences with related targets in List A and with unrelated targets in List B and the other group of sentences with unrelated targets in List A and with related targets in List B. Another set of 20 neutral sentences was constructed and paired with English-like non-words (e.g. Today John was late for his meeting—WASK). Critical and filler sentences were randomized individually for each participant in each of the two lists of materials. Given that a word is identified at the point in time when information uniquely specifies it, which may actually occur before the physical ending of the word (Marslen-Wilson 1987), for each of the 20 nouns in the experimental materials, a point was selected where the prime would be unequivocally recognized. Targets were presented visually at the end of the acoustic signal 0, 400 or 1000 ms after the word recognition point was selected for each prime. This enabled accurate measuring of initial semantic activation, while controlling for the possibility that an early contextual effect may result from an early word recognition followed by a fast property selection given the length of the primes. Participants were randomly assigned to one Target Type, List and Inter-Stimulus Interval (ISI), so each participant saw each sentence and the corresponding target only once. Sentences were recorded at a normal rate by a male speaker on an Apple Macintosh computer. The word recognition point of each prime at the end of the context was marked using a sound-editing program. The auditory stimuli and the visual targets were synchronized using a specialized computer program. The experimental materials were preceded by two sets of practice trials. The first one consisted of a lexical decision task on a list of 10 words and 10 non-words that were presented visually one at a time and randomized separately for each participant. The second set of practice
trials included both sentential contexts in the acoustic modality and visual targets for a lexical decision. This set of trials contained six neutral sentences similar to the critical ones, although the corresponding visual targets were not related to the primes in any of the practice trials.
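The counterbalancing scheme described above can be illustrated with a minimal sketch. It is not the author's materials code: the item numbers are arbitrary, and the frequency matching of the two sentence groups (done in the study with the Johansson & Hofland counts) is not modelled here.

```python
import random

ISIS_MS = (0, 400, 1000)   # delays between the word-recognition point and the visual target

def build_lists(sentence_ids):
    """Split the 20 critical sentences into two halves and counterbalance
    relatedness across Lists A and B: each sentence is related in one list
    and unrelated (scrambled target) in the other."""
    half = len(sentence_ids) // 2
    group1, group2 = sentence_ids[:half], sentence_ids[half:]
    list_a = [(s, "related") for s in group1] + [(s, "unrelated") for s in group2]
    list_b = [(s, "unrelated") for s in group1] + [(s, "related") for s in group2]
    return list_a, list_b

def randomize_for_participant(critical, filler_ids, seed):
    """Each participant receives an individually randomized mix of critical
    trials and filler (non-word) trials."""
    trials = list(critical) + [(f, "nonword") for f in filler_ids]
    random.Random(seed).shuffle(trials)
    return trials

list_a, list_b = build_lists(list(range(1, 21)))
print(randomize_for_participant(list_a, range(101, 121), seed=7)[:5])
```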
Procedure The experiment was presented to the participants as a simple psycholinguistic experiment investigating language processing. Participants were first given standard written instructions, which were then explained by the experimenter. Participants were told that they would be listening to a series of sentences through the headphones and that at the end of each sentence a string of letters would appear on the computer screen. They should try to indicate as fast and accurately as possible whether the string of letters was a word of English or not by pressing the corresponding key on the response box. It was emphasized that both tasks, namely listening carefully to the sentences and making a fast lexical decision, were equally important, although they should be taken as two unrelated tasks. In order to avoid the possibility that participants might have divided their attention and tried to find some underlying coherent structure connecting the sentences, it was stressed that the sentences were unconnected and did not make a story. It was also indicated that non-words would not correspond in principle with words of other languages, but rather be orthographically similar to legitimate English words. Participants were tested individually. They ran through the two sets of practice trials with the experimenter and got appropriate feedback on their performance. When being tested on the critical materials, participants were left on their own in a closed room or cubicle. To make sure that participants paid adequate attention to the priming sentences, a short memory test was given at the end of the
experiment. Participants had been told about this memory test in the instructions. Three randomly chosen sentences from the critical set and another three from the filler set were included in the memory test. Another six sentences similar in style but different from any of the sentences used in the experiment completed the memory test. Participants were instructed to tick the sentences that they thought they had listened to in the experiment. It was stressed that no change had been made to the original sentences.

Apparatus The experiment was conducted on a Toshiba laptop computer. The sentences were presented through a pair of headphones plugged into the laptop. The visual probes were presented in capital letters in the middle of the computer screen on a white background. Responses to the visual targets were made via a response box connected to the laptop. Word responses were made with the thumb or the index finger of the right hand on the right-most key of the response box, which was a green key. Non-word responses were made with the thumb or the index finger of the left hand on the left-most key of the response box, which was a red key. Target words remained on the screen until the participant made a lexical decision. There was a 1000-ms delay between the offset of the visual target and the onset of the following acoustic context.
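The timing of a single cross-modal trial can be made explicit as follows. This is a sketch rather than the experiment script; the recognition point and response time in the example are invented.

```python
ISIS_MS = (0, 400, 1000)

def target_onset_ms(recognition_point_ms, isi_ms):
    """The visual target appears at the prime's marked word-recognition
    point plus the inter-stimulus interval assigned to the participant."""
    assert isi_ms in ISIS_MS
    return recognition_point_ms + isi_ms

def lexical_decision_rt(button_press_ms, onset_ms):
    """Reaction time runs from target onset to the button press; the target
    stays on screen until the participant responds."""
    return button_press_ms - onset_ms

# Hypothetical trial: the prime is recognized 2350 ms into the recording,
# the participant is in the 400-ms ISI group and responds at 3286 ms.
onset = target_onset_ms(2350, 400)
print(onset, lexical_decision_rt(3286, onset))   # 2750 536
```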
Table 1 Mean reaction times (in milliseconds), SD, proportions of missing data and facilitation in each condition in Experiment 1

              Strong associates                                     Weak associates
ISI     Related           Unrelated         Facilitation     Related           Unrelated         Facilitation
0       601 (119, 0.07)   649 (121, 0.00)   48***            634 (120, 0.03)   680 (114, 0.04)   46***
400     536 (53, 0.06)    585 (58, 0.05)    49**             645 (85, 0.03)    639 (74, 0.06)    6
1000    518 (50, 0.03)    522 (56, 0.11)    4                740 (188, 0.06)   763 (177, 0.07)   23

**P < 0.05, ***P < 0.01.

7 Because this study was a small-scale follow-up of previous experiments, the design was not powerful enough to carry out reliable analyses per item. Even though this is clearly a limitation of the study, List was included as an independent variable to see whether the distribution of the materials had had any significant effect on the ANOVAs.
5.1.2 Results The minimum number of correct responses required in the memory test was set at 2.5 standard deviations (SD) below the participants' average number of correct responses per ISI. Only one participant was replaced because he failed to meet this criterion. The mean response time, SD and proportions of missing data for each Relatedness condition, together with the facilitation [i.e. the difference between the experimental (related) and the control (unrelated) conditions] and its significance level per Target Type and ISI, are presented in Table 1. Across the two experiments, a response-time data point was treated as 'missing' if it came from an erroneous response or was over 2.5 SD above the participant's average response time to the word targets in that session.

The statistical analysis of Experiment 1 examined the effects of Target Type (strong/weak associate), Target Relatedness (related/unrelated), ISI (0/400/1000 ms) and List (A/B). Mean reaction times were entered into four-way analyses of variance (ANOVAs), with participants (F1) as random variables.7 There was a significant overall priming effect.8 Reaction times to targets in the related condition were 27 ms faster than those in the unrelated condition, F1(1, 48) = 20.77, MSE = 1077.4, P < 0.001. There was also a significant main effect of Target Type, F1(1, 48) = 17.26, MSE = 22 864, P < 0.001. The ISI × Target Type interaction was also significant, F1(2, 48) = 4.750, MSE = 22 864, P < 0.02. This interaction can be explained as an effect of the different speed of reaction of the different groups of participants tested in each condition, the greatest difference between strong and weak associates being observed at the 1000-ms delay (232 ms slower in the weak associate condition). The Relatedness × ISI × Target interaction, which was critical for the investigation, was significant, F1(2, 48) = 3.390, MSE = 1077.4, P < 0.05. List did not show any significant main effect or interaction with any other variable.

A 2 × 2 × 3 × 2 (Target × Relatedness × ISI × List) ANOVA was carried out on the arcsine transformation of the missing data using participants as the random factor. The missing data were arcsine transformed to stabilize variances (Winer 1971). A main effect of ISI was observed, F1(2, 48) = 5.001, MSE = 0.015, P < 0.02, the highest missing rate being observed at the 1000-ms delay (0.193). A significant Target × ISI × List interaction was observed, F1(2, 48) = 4.008, MSE = 0.015, P < 0.03. Since the missing rates for each Target Type did not vary consistently across the List and ISI conditions, this interaction can be understood as an effect of the different accuracy of the different groups of participants tested in each condition. The critical Relatedness × ISI × Target interaction was marginally significant, F1(2, 48) = 2.488, MSE = 0.044, P < 0.1. The average arcsine-transformed missing data were higher in the unrelated than in the related condition across ISIs (0.155 v. 0.140).

8 In both experiments, only significant results will be reported, with the exception of the critical interactions, which will be reported for every ANOVA.
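To make the data treatment concrete, here is a minimal sketch of the trimming rule, the facilitation score and the arcsine transform described above. It is not the author's analysis code; the reaction times, error flags and baseline values below are invented for illustration.

```python
import math
from statistics import mean

def treat_missing(rts, errors, baseline_mean, baseline_sd):
    """A data point is 'missing' if the response was an error or the RT is
    more than 2.5 SD above the participant's mean word-target RT."""
    cutoff = baseline_mean + 2.5 * baseline_sd
    return [rt for rt, err in zip(rts, errors) if not err and rt <= cutoff]

def facilitation(unrelated_rts, related_rts):
    """Facilitation = mean RT in the unrelated (control) condition minus
    mean RT in the related (experimental) condition."""
    return mean(unrelated_rts) - mean(related_rts)

def arcsine_transform(p):
    """Arcsine transform of a missing-data proportion, used to stabilize
    variances before the ANOVA (Winer 1971)."""
    return math.asin(math.sqrt(p))

# Invented RTs (ms) for one participant; baseline mean/SD stand in for the
# participant's overall word-target performance.
related = [601, 590, 615, 640, 700]
unrelated = [660, 655, 649, 630, 1900]      # 1900 ms exceeds the 2.5 SD cutoff
errors = [False] * 5
trimmed_rel = treat_missing(related, errors, baseline_mean=620, baseline_sd=90)
trimmed_unrel = treat_missing(unrelated, errors, baseline_mean=620, baseline_sd=90)
print(round(facilitation(trimmed_unrel, trimmed_rel), 1))   # 19.3
print(round(arcsine_transform(0.07), 3))                    # 0.268
```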
5.1.3 Discussion Both strong and weak associates were significantly primed at 0 ms in neutral contexts. This facilitation would have been the result of an automatic process of spreading activation of associates. Therefore, the choice of targets included associates of the prime concepts in both the strong and the weak associate conditions. However, the level of priming for the two types of associates diverged at 400 ms, where only strong associates remained active. Since neutral contexts did not prime any particular aspect of the prime concepts, strong associates must have maintained their initial activation because of their strong association to the prime concepts and not because they were relevant for interpretation. In this respect, strong associates behaved like core features of the prime concepts in neutral contexts. The loss of activation of weak associates at 400 ms and strong associates at 1000 ms would have been the result of passive decay rather than active suppression since neither of these target types was clearly irrelevant for interpretation in neutral contexts. Most importantly, the overall patterns of activation of strong and weak associates in neutral contexts were significantly different.

It is possible that strong associates would have maintained their initial activation past the 400-ms delay if participants had had a discourse background with which to integrate the incoming information. Because it was emphasized in the instructions that the sentences did not make a story, it is possible that participants may have processed the sentences rather shallowly, without constructing a very detailed mental representation of each sentence. In a different study, I have observed that strong associates remain active up to 1000 ms in contexts where they are highly relevant for interpretation (see Rubio-Fernández 2007).

I take the results of the first experiment as supportive of the core/non-core distinction. Strong associates behaved like core features of the prime concepts since they became and remained active in neutral contexts where they were not particularly relevant for interpretation. In contrast, weak associates lost their initial activation by 400 ms, as predicted for non-core features when they are not relevant for interpretation.

5.2 Experiment 2 In the second experiment, I tested a twofold hypothesis. First, I investigated whether strong associates behave like core features in weak-associate biasing contexts where they are irrelevant for interpretation (e.g. 'The child said the moon was shaped like a banana'—YELLOW). If this was the case, there should be no difference between their pattern of activation in Experiments 1 and 2, with strong associates getting and remaining active in both neutral and weak-associate biasing contexts. Second, I investigated whether weak associates maintain their initial activation in weak-associate biasing contexts where they are relevant for interpretation (e.g. BENT would be the weak associate target for the biasing context above). If they behave like non-core features, weak associates should show a different pattern of activation in the two experiments, since their initial activation decayed by 400 ms in neutral contexts. Overall, the patterns of activation of strong and weak associates should be similar in weak-associate biasing contexts since core features would maintain their initial activation context independently and non-core features would do so in contexts where they are relevant for interpretation.

5.2.1 Method

Participants The participants in this experiment were 60 undergraduate students at Cambridge University who volunteered to take part in the experiment. They all had English as their first language. Each session lasted approximately 12 minutes.
Materials and design Sentences biasing weak associates were used as irrelevant contexts for strong associates. Therefore, biasing contexts had to (i) make weak associates salient while (ii) making strong associates irrelevant for interpretation. Bearing these criteria in mind, 20 biasing sentences were constructed, one for each of the 20 critical nouns used in Experiment 1. The prime was always the last word in the sentence. In order to avoid intra-lexical priming, sentences did not include any associate of the target word other than the critical prime word. The 20 weak-associate biasing sentences were evaluated in a questionnaire that was distributed among 34 students, who had to rate on a 1–5 scale how close each target was to the point of the sentence (example given: 'In the sentence "Despite his poor reading skills, the boy was very good at scrabble," WORD would be much more closely related to the point of the sentence than TILE'). The strong and weak associates were randomized and divided into two questionnaires so that each participant got only one type of target per sentence. Since the strong associates were not biased by the sentences, I expected them to be rated between 1 and 2.5, whereas the ratings for the weak associates should have ranged from 3.5 to 5 given their contextual relevance. In view of the results, minor changes were made to five of the sentences (see Appendix B for a complete list of the sentential contexts).

The same set of primes and targets that were used in Experiment 1, as well as the same experimental design, were used in Experiment 2 to measure the activation of strong and weak associates in weak-associate biasing contexts. As in Experiment 1, the 20 weak-associate biasing sentences were divided into two equal groups matched by the word frequency of the corresponding targets (Johansson & Hofland 1989). One of these groups was paired with related targets ('When they got the results, John went to buy a bottle of champagne'—BUBBLE [SA]/CELEBRATION [WA]) and the other with scrambled and therefore unrelated targets ('They could still remember when their grandmother taught them that lullaby'—TALL [SA]/THIN [WA]). As in the previous experiment, the unrelated sentences and targets served as controls. The two sentence groups and prime–target pairings were combined into two lists of materials. A set of 20 similar filler sentences was paired with English-like non-words. Critical and filler sentences were randomized individually for each participant in both lists of materials. Participants were randomly assigned to one Target Type, List and ISI. The recording and setting of the materials were the same as in the previous experiment. The two sets of practice trials were also the same as those used in Experiment 1. Practice sentences were recorded again together with the new materials.
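The screening of the biasing sentences can be illustrated with a short sketch. The rating bands follow the expectations stated above, but the individual ratings and the flagging rule are invented for illustration (the study reports only that five sentences were revised in view of the results).

```python
def screen_biasing_sentence(sa_ratings, wa_ratings,
                            sa_band=(1.0, 2.5), wa_band=(3.5, 5.0)):
    """Flag a weak-associate biasing sentence for revision when the mean
    1-5 'closeness to the point of the sentence' ratings fall outside the
    expected bands: strong associates should look beside the point (low),
    weak associates should look close to it (high)."""
    sa_mean = sum(sa_ratings) / len(sa_ratings)
    wa_mean = sum(wa_ratings) / len(wa_ratings)
    ok = sa_band[0] <= sa_mean <= sa_band[1] and wa_band[0] <= wa_mean <= wa_band[1]
    return ok, round(sa_mean, 2), round(wa_mean, 2)

# Invented ratings for the 'scrabble' example sentence (17 raters per target type).
print(screen_biasing_sentence([2, 1, 2, 3, 1, 2, 2, 1, 2, 3, 2, 1, 2, 2, 1, 2, 2],
                              [5, 4, 5, 4, 5, 3, 4, 5, 4, 5, 4, 4, 5, 4, 5, 4, 4]))
# (True, 1.82, 4.35)
```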
Apparatus and procedure The apparatus and procedure in Experiment 2 were the same as in Experiment 1. The only modification that was made to the standard instructions was to ask participants to try to visualize the meaning of the sentences they were listening to, in the same way that they would visualize the story if they were reading fiction. With this modification, I tried to make sure that participants arrived at the intended interpretation of the sentence, for which weak-associate biasing sentences had to be processed more deeply than neutral sentences (e.g. 'The only thing that I have been able to grow in my garden is a cactus'—DRY [WA]). The secondary task was the same as in Experiment 1, with the memory test including again six sentences from the experimental materials plus another six sentences similar to the original ones.

5.2.2 Results No participants had to be discarded because of their performance on the memory test. The results of Experiment 2 are presented in Table 2.
Table 2 Mean reaction times (in milliseconds), SD, proportions of missing data and facilitation in each condition in Experiment 2

              Strong associates                                     Weak associates
ISI     Related           Unrelated         Facilitation     Related           Unrelated         Facilitation
0       625 (112, 0.05)   652 (115, 0.04)   27*              743 (263, 0.02)   778 (276, 0.08)   35*
400     663 (101, 0.03)   741 (172, 0.07)   78**             590 (80, 0.02)    642 (81, 0.09)    52***
1000    631 (98, 0.03)    638 (113, 0.07)   7                598 (87, 0.04)    630 (108, 0.05)   32*

*P < 0.1, **P < 0.05, ***P < 0.01.
The statistical analysis of Experiment 2 examined the effects of Target Type (strong/weak associate), Target Relatedness (related/unrelated), ISI (0/400/1000 ms) and List (A/B). Mean reaction times were entered into four-way ANOVAs, with participants (F1) as random variables. A main effect of Relatedness was observed, F1(1, 48) = 27.66, MSE = 1599.4, P < 0.001, with reaction times to targets in the related condition being 38 ms faster than those in the unrelated condition. The Relatedness × ISI interaction was also significant, F1(2, 48) = 3.555, MSE = 1599.4, P < 0.04. Reaction times to related targets were faster than those to unrelated targets across the three time delays, although the difference varied at each ISI (31, 65 and 19 ms, respectively). Unlike in Experiment 1, the critical Relatedness × ISI × Target interaction was not significant, F1(2, 48) = 1.062, MSE = 1599.4, P > 0.3. List did not show any significant main effect or interaction with any other variable.

A 2 × 2 × 3 × 2 (Target × Relatedness × ISI × List) ANOVA was carried out on the arcsine transformation of the missing data using participants as the random factor. Relatedness showed a significant main effect, F1(1, 48) = 6.142, MSE = 0.040, P < 0.02, the highest missing rate being observed in the unrelated condition (0.192). As in the analysis of the reaction-time data, the critical Relatedness × ISI × Target interaction was not significant, F1(2, 48) = 0.954, MSE = 0.040, P > 0.3.

Separate analyses for strong and weak associates across Experiments 1 and 2 examined the effects of Context type (neutral/biasing), Target Relatedness (related/unrelated), ISI (0/400/1000 ms) and List (A/B). Mean reaction times were entered into four-way ANOVAs, with participants (F1) as random variables. Regarding the results for strong associates, main effects of Relatedness, F1(1, 48) = 24.11, MSE = 1573.9, P < 0.001, and Context, F1(1, 48) = 11.96, MSE = 20191, P < 0.002, were observed. The faster reaction times were observed in the related condition (35 ms) and in neutral contexts (90 ms). The Relatedness × ISI interaction was significant, F1(2, 48) = 5.466, MSE = 1573.9, P < 0.008, with the fastest reaction times being observed in the related condition across the three time delays, although facilitation varied at each ISI (38, 64 and 6 ms, respectively). The critical Relatedness × ISI × Context interaction was not significant, F1(2, 48) = 0.960, MSE = 1573.9, P > 0.3. No main effect or significant interaction was observed with the List variable.

A 2 × 2 × 3 × 2 (Context × Relatedness × ISI × List) ANOVA was carried out on the arcsine transformation of the missing data for strong associates using participants as the random factor. Only the Relatedness × ISI interaction was significant, F1(2, 48) = 3.404, MSE = 0.049, P < 0.05. Since the missing rates were consistently lower in the related condition, this interaction can be explained as an effect of this coefficient varying across the three ISIs. As in the analysis of the reaction-time data, the critical Relatedness × ISI × Context interaction was not significant, F1(2, 48) = 1.059, MSE = 0.049, P > 0.3.

In the ANOVAs of the reaction-time data for weak associates, only a significant main effect of Relatedness was observed, F1(1, 48) = 24.72, MSE = 1102.9, P < 0.001, with reaction times to related targets being 30 ms faster than those to unrelated targets. The critical Relatedness × ISI × Context interaction was marginally significant, F1(2, 48) = 2.840, MSE = 1102.9, P < 0.07. Again, List did not show any significant main effect or interaction with other variables. A 2 × 2 × 3 × 2 (Context × Relatedness × ISI × List) ANOVA was carried out on the arcsine transformation of the missing data for weak associates using participants as the random factor. The only significant result was the main effect of Relatedness, F1(1, 48) = 6.258, MSE = 0.036, P < 0.02, the average missing data being higher in the unrelated than in the related condition (0.190 v. 0.104). The critical Relatedness × ISI × Context interaction did not reach significance, F1(2, 48) = 0.207, MSE = 0.036, P > 0.8.

Given that the 400-ms delay is the critical one in determining whether weak associates behave like non-core features, a 2 × 2 × 2 (Relatedness × Context × List) ANOVA was carried out on the reaction-time data for weak associates in the 400-ms condition of Experiments 1 and 2. The only significant result was the critical Relatedness × Context interaction, F1(1, 16) = 6.714, MSE = 1245.1, P < 0.03. A 2 × 2 × 2 (Context × Relatedness × List) ANOVA was carried out on the arcsine transformation of the missing data for weak associates in the intermediate ISI. The only significant result was the main effect of Relatedness, F1(1, 16) = 6.860, MSE = 0.029, P < 0.02, the average missing data being higher in the unrelated than in the related condition (0.213 v. 0.071). The critical Relatedness × Context interaction did not reach significance, F1(1, 16) = 0.720, MSE = 0.029, P > 0.4.
5.2.3 Discussion As in Experiment 1, both strong and weak associates got initially activated in weak-associate biasing contexts by an automatic process of spreading activation. However, their level of activation was only peripherally significant at 0 ms. These weaker results could be an effect of the greater processing load of the task, given that in Experiment 2 participants were asked to form a mental picture of the content of the sentences they were listening to.

Unlike in Experiment 1, both strong and weak associates were significantly primed at 400 ms. These results would have been expected given that, as core features of the prime concepts, strong associates should maintain their initial activation even in contexts where they are irrelevant for interpretation, while, as non-core features of the primes, weak associates should be sensitive to contextual factors and be facilitated in contexts where they are relevant for interpretation. Weak associates still showed a marginal level of activation at the longest delay, whereas strong associates had decayed at that point.

According to an alternative interpretation of the priming levels observed in Experiment 2, the differences observed between Experiments 1 and 2 would not be due to differences in meaning selection but rather to a shift in the activation patterns of the primes (maybe as a result of the different contexts used or the visualization requirement). Since priming was only marginally significant at 0 ms in Experiment 2, facilitation would have not only risen later but also decayed later, being still significant at 400 ms for both types of associates. The results of the ANOVAs, however, show that context does have an effect on the different patterns of activation observed. In Experiment 2, using weak-associate biasing contexts, there was no significant difference in the overall patterns of activation of strong and weak associates, as distinct from the results of Experiment 1, which employed neutral contexts. Comparing the time course of activation of strong and weak associates separately, strong associates showed similar results in neutral and weak-associate biasing contexts, whereas the activation of weak associates was significantly different in the two types of context, especially in the 400-ms condition. If the differences observed in Experiments 1 and 2 only reflected a shift in activation rather than a process of meaning selection, strong and weak associates should show different patterns of activation in both experiments, not only in Experiment 1. Also, if delayed activation resulted in a different priming pattern for weak associates, the same difference should be observed for strong associates between Experiments 1 and 2. The results of the ANOVAs therefore support the twofold hypothesis that strong associates behave as core features of the prime concepts, whereas weak associates behave as non-core features.

Given that strong associates were actively suppressed by 400 ms in contexts where they were inconsistent with interpretation (Rubio-Fernández 2006), I interpret the present results as supporting the narrow scope view of suppression. In this respect, suppression would not have operated on strong associates in weak-associate biasing contexts where they remained active past the 400-ms threshold. On grounds of parsimony, their loss of activation at the longest delay should be viewed as comparable to that observed in neutral contexts. Overall, I interpret these results as supportive of the core/non-core distinction, with strong associates behaving like core features and weak associates like non-core features.

The advantage of the cross-modal priming paradigm used in these experiments is that it allows presenting the visual target at the critical point during the sentence (e.g. presenting BUBBLE right before the offset of champagne), allowing for an accurate measure of activation (Swinney 1979). However, it must be noted that the results of cross-modal priming studies have been controversial in that they might reflect not only purely automatic processing but also post-lexical processing such as participants' strategies to check for relatedness (Shelton & Martin 1992). Eye-tracking methodologies currently used in priming studies (e.g. Myung et al. 2006) may be less susceptible to strategic processing. Also, eye-tracking techniques offer a continuous online measure of activation, rather than relying on an intermittent profile made up of time points corresponding to different delays between the acoustic prime and the visual target.
6 GENERAL DISCUSSION The present results are consistent with the distinction between core and non-core features of a concept. Strong associates behaved like core features in that they became and remained active regardless of their contextual relevance. In contrast, weak associates showed a different pattern of activation in neutral and weak-associate biasing contexts, maintaining their initial activation only in the latter type of context where they were relevant for interpretation. Most importantly, there was no significant difference in the activation patterns of strong associates in neutral and weak-associate biasing contexts, whereas weak associates showed different results in the two types of context. The present study therefore reinforces the findings of previous studies of lexical processing (e.g. Barsalou 1982; Whitney et al. 1985; Greenspan 1986). Further research using eye-tracking techniques should offer even more accurate results bearing on the core/non-core distinction. It is important to notice that, in the word processing literature, the notion of context-independent properties is different from classical views in semantic theory: in the former, these properties are taken as the most strongly associated conceptual information, whereas in the latter they are understood as those meaning aspects which remain necessarily (and not only typically) constant across contexts. For example, from a purely semantic perspective, the meaning of DALMATIAN would
include the entailment IS A DOG but not the property SPOTTED. I have investigated the time course of activation of superordinate terms (e.g. DALMATIAN—DOG) in other lexical priming studies (see Rubio-Fernández 2005, 2007) and, like the strong associates tested in this study (e.g. SPOTTED), they behave like core features of the prime concepts. In neutral contexts (the same as those used in Experiment 1), superordinates get activated at 0 ms and remain active up until the 1000-ms delay (Rubio-Fernández 2005). Even though their patterns of activation are not significantly different, superordinates are accessible for longer than strong associates. In metaphoric contexts where superordinates are inconsistent with the figurative interpretation of the prime, they remain active up until 400 ms, being suppressed only between the intermediate and the longest delays. These long patterns of activation suggest that the semantic nature of the superordinate relation might make their conceptual association even stronger than that of strong associates. This could be related, among other things, to discourse processing strategies since a given noun can be referred to later on in discourse by a definite description of the corresponding superordinate category (e.g. 'Mary wants to take her Dalmatian to a dog show. She thinks the dog can win').

Regarding the scope of the mechanism of suppression, the results support the view that suppression is a specific mechanism that operates only on conceptual information that is inconsistent with the mental representation of the prime and so could interfere with the interpretation process (Simpson & Kang 1994). Contrary to the predictions of the Structure Building Model (Gernsbacher 1990), suppression does not operate on information that is less relevant for interpretation, since core features maintained their initial activation not only in neutral contexts but also in weak-associate biasing contexts where they were irrelevant for interpretation. Given the empirical evidence found in a previous study (see Rubio-Fernández 2006), core features would only get actively suppressed in conflicting contexts, where they are not simply irrelevant but actually inconsistent with word interpretation. In view of the distinction between core and non-core features, it is reasonable to conclude that suppression would only need to operate on core features that were contextually inconsistent. Non-core features, being more weakly associated with the corresponding concept, would passively decay for lack of contextual stimulation unless relevant in the context.

Now I would like to address the interesting question for theoretical lexical pragmatics which is raised by the existence of core features of concepts: do these properties play a part in the pragmatic process of concept narrowing given that they are highly accessible past the point where context has a selective effect on feature activation?9 In concept narrowing, one or more encyclopaedic properties of the encoded concept are upgraded to the status of content constitutive of the resulting ad hoc concept (e.g. ALCOHOLIC in narrowing down DRINK to DRINK*, including only alcoholic drinks in its extension; Carston 2002; Recanati 2004). Given that the set of core features included highly distinctive properties of the prime concepts (e.g. SPOTTED for DALMATIAN), it is possible that the prime concepts were narrowed down to their stereotypical members in neutral contexts, which did not make salient any particular aspect of the primes (e.g. 'This Christmas John wanted a Dalmatian').

9 Note that lexical disambiguation, a highly context-sensitive process, takes place 200–300 ms after the offset of the ambiguous prime (Swinney 1979; Tanenhaus et al. 1979; Onifer & Swinney 1981).

The pragmatic process of narrowing down to a stereotype may be a common one in processing content words in broad contexts, since the speaker would be expected to specify when she is not referring to a stereotypical member of a category, rather than vice versa. For example, Dalmatians may not necessarily refer to spotted Dalmatians by default. However, as the stereotypical type of Dalmatian, spotted Dalmatians would be the unmarked subset of the category. It would therefore be more relevant to use the word Dalmatian to refer to a stereotypical spotted Dalmatian than to a different, marked narrowing of the concept (e.g. albino Dalmatians). This would explain why the following examples might seem intuitively different in comprehensibility:

(2) I love Dalmatians, but not the spotted ones.
(3) I love Dalmatians, but not the albino ones.

Thus, just as Dalmatians may be generally interpreted as spotted Dalmatians in the above examples (hence the pragmatic oddity of (2)), it is possible that the prime concepts may have been narrowed down to a stereotype when interpreting the corresponding prime words in neutral contexts. Unlike in neutral contexts, non-core features maintained their initial activation up to the 1000-ms delay in biasing contexts where they were relevant for interpretation. It is possible that these associates showed a different pattern of activation in neutral and weak-associate biasing contexts because only in the latter were they upgraded to the
status of content constitutive of the resulting narrow concept. However, in this case, the process would not have resulted from the context being broad (as would have been the case with core features in neutral contexts), but rather because non-core features would have been part of the information directly communicated by use of the prime word in weak-associate biasing contexts. For example, in processing the sentence ‘Mary wished her old dog had the figure of a Dalmatian’, the hearer would understand that Mary has in mind a stereotypically slender Dalmatian, and not any odd type of Dalmatian. In this respect, the ad hoc concept DALMATIAN* constructed in understanding the above example would have been narrowed down around the non-core feature SLENDER to include only slender Dalmatians in its extension. Core features also prolonged their initial automatic activation in weak-associate biasing contexts. However, unlike non-core features, these properties were irrelevant for interpretation in this type of context. It is therefore less likely that they would have been computed as communicated by use of the word and upgraded to the status of content constitutive of the resulting ad hoc concept. In the previous example, the type of Dalmatian that Mary would have referred to did not need to be spotted as long as it was slender. The core and non-core features SPOTTED and SLENDER would have, therefore, played different roles in constructing the ad hoc concept DALMATIAN* necessary to understand the above biasing sentence. I would suggest that, as strong associates of the prime concepts, the automatic activation of core features would last beyond the point where context-sensitive processes take place, without these properties having been necessarily selected for interpretation. In other words, contrary to what happens with non-core features, the sustained activation of core features cannot necessarily be interpreted as the outcome of a selection process of contextually relevant properties. It seems therefore legitimate to wonder what the role of core features would be when they remain accessible during interpretation but are not taken as part of what is communicated by the speaker by use of the corresponding word. One possibility is that core features are part of the mental representation of the concept in working memory across contexts. Given the perceptual character of this set of core features as distinctive properties of the prime concepts, it is possible that these features may have an important role in the visual or imagistic mental representation of the corresponding concepts (see Barsalou 1999). It would be interesting to investigate in future research the relation between the propositional and the perceptual mental representation of concepts.
7 CONCLUSIONS
The present empirical findings and theoretical discussion of the data should have implications for both psycholinguistic accounts of lexical processing and pragmatic models of word interpretation. In previous experimental studies of word processing (e.g. Barsalou 1982; Whitney et al. 1985; Greenspan 1986; Tabossi 1988), the high activation of a conceptual property during sentence processing was taken as evidence of it being part of the interpretation—or even the stable meaning, of the corresponding word. Even models of language comprehension that would predict the accessibility of those encoded meaning components that are most frequent or salient (e.g. The Graded Salience Hypothesis; Giora 1997, 2002) fail to differentiate between the roles that highly accessible meanings might play in interpretation depending on whether their long activation is automatic or results from an active process of meaning selection. However, as I have shown, salient meaning components or core features may not necessarily be computed as part of the conceptual information communicated by use of a word since they remain active in contexts where they are irrelevant for interpretation. In other words, there is an important distinction to be made between activation/accessibility, on the one hand, and selection as part of the intended meaning, on the other. It is through bringing together theoretical lexical pragmatics and empirical work on word processing that this insight has emerged. According to pragmatic models of lexical interpretation (e.g. Sperber & Wilson 1986/1995, 2002; Recanati 2004), the accessibility of a conceptual property would be determined, other things being equal, by its recency of processing and contextual relevance. That is, the more recently a property has been processed and the more relevant it is in a given context, the more accessible it would be for word interpretation. Given that core features remain active regardless of their contextual relevance, it seems that there is another factor involved in accessibility which needs to be acknowledged: the strength of association between a property or feature and the corresponding concept. Nevertheless, even if contextual relevance and recency of processing might not fully determine property accessibility, the selection of a core feature as part of the intended interpretation of a word would ultimately be determined by considerations of relevance, especially the maximization of cognitive effects (Wilson & Sperber 2004; see Rubio-Ferna´ndez 2007). The distinction between core and non-core features should therefore allow us to gain a better understanding of both the automatic
processes involved in lexical interpretation and the pragmatic process of determining the concept communicated by use of a word.
APPENDIX A: PRIMES AND TARGETS

Prime           Strong associate (SA)    Weak associate (WA)
Cactus          Spike                    Dry
Lion            Mane                     Roar
Slippers        Comfortable              House
Skyscraper      Tall                     Window
Lullaby         Sleep                    Children
Dalmatian       Spot                     Slender
Mercedes        Expensive                Reliable
Chair           Back                     Wood
Champagne       Bubble                   Celebration
Breakfast       Morning                  Toast
Cheetah         Fast                     Predatory
Sapling         Young                    Thin
Woodpecker      Noise                    Beak
Pacific         Large                    Deep
Rugby           Tough                    Oval
Steel           Strong                   Silver
Minnow          Small                    River
Banana          Yellow                   Bent
Encyclopaedia   Knowledge                Alphabetical
Norway          Cold                     North
APPENDIX B: SENTENTIAL CONTEXTS
Neutral Contexts
Mary bought her mother a cactus
Of all the characters in the story book, John preferred the lion
When travelling, Mary always packs her slippers
Mary's office is in a skyscraper
My grandmother knew that old lullaby
This year for Christmas John wanted a Dalmatian
My friend works for Mercedes
For the new apartment Mary only had to buy a chair
For the dinner Mary brought champagne
John didn't have enough money to buy breakfast
The expedition approached the territory of the cheetah
At school every student planted a sapling
The children went to watch the woodpecker
The hotel room looked over the Pacific
All boys in the school played rugby
In this country, they make a lot of steel
Yesterday John caught a minnow
The tourist gave the monkey a banana
The man at the door was selling an encyclopaedia
They had been planning to visit Norway

WA Biasing Contexts
The child said the moon was shaped like a banana
The zebras didn't notice the hiding cheetah
Mary has lots of light in her office now that she works in a skyscraper
John felt like having marmalade at breakfast
Although the telegraph pole stayed in place, the wind bent the sapling
Their black & white picture looked better in a frame made of steel
John's job involved lots of driving, so he bought a Mercedes
Mary wished her old dog had the figure of a Dalmatian
The only thing that I've been able to grow in my garden is a cactus
When he got the results, John went to buy a bottle of champagne
It would be impossible to rescue the cargo if a ship sunk in the Pacific
When crossing the woods, John went swimming and saw some minnows
After dinner, Mary changed into her slippers
The only bird that could have made this hole is a woodpecker
To make the fire John used a broken chair
The plane encountered some turbulence over Germany and headed towards Norway
They could still remember when their grandmother taught them that lullaby
Not far away from the camp they could hear a lion
Mary didn't know how to look up words in an encyclopaedia
With a round ball the kids won't be able to play rugby
Mary didn’t know how to look up words in an encyclopaedia With a round ball the kids won’t be able to play rugby Acknowledgements
PAULA RUBIO-FERNÁNDEZ
Department of Psychology, Princeton University, Green Hall, Princeton, NJ 08540, USA
e-mail: [email protected]

REFERENCES
Barsalou, L. W. (1982), 'Context-independent and context-dependent information in concepts'. Memory and Cognition 10:82–93.
Barsalou, L. W. (1983), 'Ad hoc categories'. Memory and Cognition 11:211–27.
Barsalou, L. W. (1985), 'Ideals, central tendency and frequency of instantiation as determinants of graded structure in categories'. Journal of Experimental Psychology: Learning, Memory, and Cognition 11:629–49.
Barsalou, L. W. (1987), 'The instability of graded structure: Implications for the nature of concepts'. In U. Neisser (ed.), Concepts and Conceptual Development: Ecological and Intellectual Factors in Categorisation (pp. 101–140). Cambridge University Press. Cambridge.
Barsalou, L. W. (1992), 'Frames, concepts and conceptual fields'. In A. Lehrer and E. Kittay (eds.), Frames, Fields and Contrasts. Lawrence Erlbaum Associates. Hillsdale, NJ.
Barsalou, L. W. (1999), 'Perceptual symbol systems'. Behavioural and Brain Sciences 22:577–660.
Barsalou, L. W. & Bower, G. (September 1980), 'A priori determinants of a concept's highly accessible information'. Paper presented at The Meeting of the American Psychological Association, Montreal.
Carston, R. (1996), 'Enrichment and loosening: complementary processes in deriving the proposition expressed?' UCL Working Papers in Linguistics 8:61–88. Reprinted in (1997) Linguistische Berichte 8:103–27.
Carston, R. (2002), Thoughts and Utterances: The Pragmatics of Explicit Communication. Blackwell. Oxford.
Conrad, C. (1978), 'Some factors involved in the recognition of words'. In J. W. Cotton and R. L. Klatzky (eds.), Semantic Factors in Cognition. Lawrence Erlbaum Associates. Hillsdale, NJ.
Gernsbacher, M. A. (1990), Language Comprehension as Structure Building. Lawrence Erlbaum Associates. Hillsdale, NJ.
Giora, R. (1997), 'Understanding figurative and literal language: the graded salience hypothesis'. Cognitive Linguistics 7:183–206.
Giora, R. (2002), 'Literal vs. figurative language: different or equal?' Journal of Pragmatics 34:487–506.
Greenspan, S. L. (1986), 'Semantic flexibility and referential specificity of concrete nouns'. Journal of Memory and Language 25:539–57.
Johansson, S. & Hofland, K. (1989), Frequency Analysis of English Vocabulary and Grammar Based on the LOB Corpus: Vol. 1, Tag Frequencies and Word Frequencies. Clarendon Press. Oxford.
Levinson, S. C. (2000), Presumptive Meanings. MIT Press. Cambridge, MA.
Marslen-Wilson, W. D. (1987), 'Functional parallelism in spoken word-recognition'. Cognition 25:71–102.
Myung, J., Blumstein, S. E., & Sedivy, J. (2006), 'Playing on the typewriter, typing on the piano: manipulation knowledge of objects'. Cognition 98:223–43.
Onifer, W. & Swinney, D. A. (1981), 'Accessing lexical ambiguities during sentence comprehension: effects of frequency of meaning and contextual bias'. Memory and Cognition 9:225–6.
Recanati, F. (2004), Literal Meaning. Cambridge University Press. Cambridge.
Rosch, E. (1973), 'On the internal structure of perceptual and semantic categories'. In T. E. Moore (ed.), Cognitive Development and the Acquisition of Language. Academic Press. New York.
Rosch, E. & Mervis, C. B. (1975), 'Family resemblances: studies in the internal structure of categories'. Cognitive Psychology 4:573–605.
Rubio-Fernández, P. (2005), 'Pragmatic processes and cognitive mechanisms in lexical interpretation: the on-line construction of concepts'. Ph.D. thesis, University of Cambridge.
Rubio-Fernández, P. (2006), 'Delimiting the power of suppression in lexical processing: the question of below-baseline performance'. UCL Working Papers in Linguistics 18:119–37.
Rubio-Fernández, P. (2007), 'Suppression in metaphor interpretation: differences between meaning selection and meaning construction'. Journal of Semantics, Special Issue on Processing Meaning 24:345–71.
Shelton, J. R. & Martin, R. C. (1992), 'How semantic is automatic semantic priming?' Journal of Experimental Psychology: Learning, Memory, and Cognition 18:1191–210.
Simpson, G. B. & Adamopoulos, A. C. (2002), 'Repeated homographs in word and sentence contexts: multiple processing of multiple meanings'. In D. S. Gorfein (ed.), On the Consequences of Meaning Selection. APA Publications. Washington, DC.
Simpson, G. B. & Kang, H. (1994), 'Inhibitory processes in the recognition of homograph meanings'. In D. Dagenbach and T. H. Carr (eds.), Inhibitory Processes in Attention, Memory, and Language. Academic Press. San Diego, CA.
Sperber, D. & Wilson, D. (1986/1995), Relevance: Communication and Cognition. Blackwell. Oxford.
Sperber, D. & Wilson, D. (2002), 'Pragmatics, modularity and mindreading'. Mind & Language 17:3–23.
Swinney, D. A. (1979), 'Lexical access during sentence comprehension: (re)consideration of context effects'. Journal of Verbal Learning and Verbal Behavior 18:645–59.
Tabossi, P. (1982), 'Sentential context and the interpretation of unambiguous words'. Quarterly Journal of Experimental Psychology 34A:79–90.
Tabossi, P. (1988), 'Effects of context on the immediate interpretation of unambiguous nouns'. Journal of Experimental Psychology: Learning, Memory and Cognition 14:153–62.
Tabossi, P. & Johnson-Laird, P. N. (1980), 'Linguistic context and the priming of semantic information'. Quarterly Journal of Experimental Psychology 32:595–603.
Tanenhaus, M. K., Leiman, J. M., & Seidenberg, M. S. (1979), 'Evidence for multiple stages in the processing of ambiguous words in syntactic contexts'. Journal of Verbal Learning and Verbal Behavior 18:427–40.
Whitney, P., McKay, T., Kellas, G., & Emerson, W. A. (1985), 'Semantic activation of noun concepts in context'. Journal of Experimental Psychology: Learning, Memory and Cognition 11:126–35.
Wilson, D. (2003), 'Relevance theory and lexical pragmatics'. Italian Journal of Linguistics/Rivista di Linguistica 15:273–91. Special issue on pragmatics and the lexicon.
Wilson, D. & Sperber, D. (2004), 'Relevance theory'. In G. Ward and L. Horn (eds.), Handbook of Pragmatics. Blackwell. Oxford. Longer earlier version published in (2002) UCL Working Papers in Linguistics 14:249–87.
Winer, B. J. (1971), Statistical Principles in Experimental Design. McGraw-Hill. New York.

First version received: 03.08.2006
Second version received: 21.01.2008
Accepted: 05.05.2008
Journal of Semantics 25: 411–450 doi:10.1093/jos/ffn008 Advance Access publication August 25, 2008
The Effect of Negative Polarity Items on Inference Verification
ANNA SZABOLCSI New York University
LEWIS BOTT Cardiff University
BRIAN MCELREE New York University
Abstract The scalar approach to negative polarity item (NPI) licensing assumes that NPIs are allowable in contexts in which the introduction of the NPI leads to proposition strengthening (e.g. Kadmon & Landman 1993; Krifka 1995; Lahiri 1997; Chierchia 2006). A straightforward processing prediction from such a theory is that NPIs facilitate inference verification from sets to subsets. Three experiments are reported that test this proposal. In each experiment, participants evaluated whether inferences from sets to subsets were valid. Crucially, we manipulated whether the premises contained an NPI. In Experiment 1, participants completed a metalinguistic reasoning task and Experiments 2 and 3 tested reading times using a self-paced reading task. Contrary to expectations, no facilitation was observed when the NPI was present in the premise compared to when it was absent. In fact, the NPI significantly slowed down reading times in the inference region. Our results therefore favour those scalar theories that predict that the NPI is costly to process (Chierchia 2006), or other, non-scalar theories (Ladusaw 1992; Giannakidou 1998; Szabolcsi 2004; Postal 2005) that likewise predict NPI processing cost but, unlike Chierchia (2006), expect the magnitude of the processing cost to vary with the actual pragmatics of the NPI.
1 INTRODUCTION
Negative polarity items (NPIs) are expressions whose occurrence is restricted to the immediate scope of certain operators, called their licensors. Ever is an NPI licensed, among other operators, by none/few/at most two of the books:

(1) *All of the books have ever been borrowed.
    *Some of the books have ever been borrowed.
    *Most of the books have ever been borrowed.

(2) None of the books have ever been borrowed.
    Few of the books have ever been borrowed.
    At most two of the books have ever been borrowed.

Some other NPIs are yet, at all, in weeks, sleep a wink, much and any. Different NPIs are licensed by somewhat different operators (Zwarts 1981; Giannakidou 1998; De Decker et al. 2005), but the classical view, going back to Ladusaw (1980), is that the licensors of NPIs in English are at least monotone decreasing.1 A monotone-decreasing (DE) operator reverses the ordering in its domain. If A < B and Op is decreasing, then Op(B) < Op(A). One special case is where A < B represents 'A is a subset of B' and Op(B) < Op(A) represents 'Op(B) entails Op(A)'. For example, both inferences below illustrate the fact that 'few of the books' is a decreasing operator:

(3) If VP1 ⊆ VP2, then DE-Op(VP2) ⇒ DE-Op(VP1)
    Given that the things that have been borrowed by children are a subset of those that have been borrowed,
    Few of the books have been borrowed ⇒ Few of the books have been borrowed by children

(4) DE-Op(VP1 or VP2) ⇒ DE-Op(VP1); DE-Op(VP2)
    Given that things that have been borrowed are a subset of those that have been borrowed or used in the reading room,
    Few of the books have been borrowed or used in the reading room ⇒ Few of the books have been borrowed; Few of the books have been used in the reading room

Therefore, (5) is a correct definition and (6) a correct descriptive generalization.

(5) An operator is decreasing iff it supports inferences from sets to subsets.

(6) Correlation between NPI-licensing and inferences: Negative polarity items occur in the immediate scope of operators that support inferences from sets to subsets.

1 There are well-known problems with the 'at least monotone decreasing' generalization in English. For example, only licenses various NPIs, although it is overall non-monotonic:
(i) Only thrillers have ever been borrowed.
Von Fintel (1999) argues that only thrillers satisfies a weaker notion that he dubs Strawson decreasingness. This is restricted to contexts where relevant presuppositions are borne out. See Giannakidou (2006) for critical discussion. Gajewski (2008) unifies weak and strong NPIs as both being licensed by decreasingness and differing in that weak NPIs only care about the truth conditions/assertion, while strong NPIs require decreasingness that is preserved when all non-truth-conditional/inert coordinates of meaning are taken into consideration. On the other hand, Postal (2005) observes that zero books is decreasing but does not license NPIs:
(ii) *Zero books have ever been borrowed.
Finally, there are subtler problems, discussed in Zwarts (1995) for English and Giannakidou (1998) for Modern Greek. This paper will gloss over these matters and investigate a plain version of the classical view.
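Definition (5) lends itself to a direct finite-model check. The following sketch is not from the paper: the domain, the toy threshold used for 'few' and the function names are invented for illustration, and decreasingness is verified by brute force over subsets.

```python
from itertools import chain, combinations

BOOKS = {"b1", "b2", "b3", "b4"}

def subsets(s):
    s = list(s)
    return [set(c) for c in chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]

# Determiner phrases as operators on a VP extension (the set of things with
# that property), evaluated against the fixed restrictor BOOKS.
def few_of_the_books(vp):       # 'few' modelled here as: at most one of the books
    return len(BOOKS & vp) <= 1

def some_of_the_books(vp):
    return len(BOOKS & vp) >= 1

def is_decreasing(op, domain=BOOKS):
    """Definition (5): Op is decreasing iff whenever A is a subset of B,
    Op(B) entails Op(A)."""
    for b in subsets(domain):
        for a in subsets(b):
            if op(b) and not op(a):
                return False
    return True

print(is_decreasing(few_of_the_books))    # True
print(is_decreasing(some_of_the_books))   # False
```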
The central question of this paper pertains to the psychological status of this abstract grammatical generalization. In particular, we ask whether human sentence processing operations (henceforth, the human processor or the processor) recognize that the distribution of NPIs is governed by the same property that supports inferences from sets to subsets.

1.1 The association model
The correlation between NPI licensing and inference to subsets need not be just a descriptive coincidence; it may be essential and explanatory. For example, the main hypothesis in Dowty (1994) is formulated as follows (his (27)):

Hypothesis. Given that (i) ↑M and ↓M inferences are a very significant pattern of natural language reasoning, and (ii) the distribution of NPIs (and NC) is (almost) coextensive with logically ↓M contexts, we can hypothesize that one important reason for the existence of NPI and NC marking is to directly mark positions syntactically which are subject to ↓M inferences.

(↑M and ↓M stand for 'monotonically increasing' and 'decreasing', respectively, and NC stands for 'negative concord'—AS, LB and BM.)

The assumption of an essential correlation also applies to scalar accounts of NPI licensing (Kadmon & Landman 1993; Krifka 1995; Lahiri 1997; Chierchia 2006). Glossing over the differences between these accounts, let us say that the NPI widens the domain of quantification and, moreover, comes with a requirement that domain widening should strengthen the claim as compared to what the use of a plain indefinite would convey. For example, the plain indefinite (some) readers is taken to quantify over typical readers, and the NPI any readers to quantify over a widened domain of either typical readers or marginal readers, such as paid reviewers or people who just browse the book. Now consider (7) and (8).

(7) a. Many of the books had (some) typical readers ⇒ Many of the books had (some) typical or marginal readers
    b. *Many of the books had any readers.

(8) a. Few of the books had (some) typical or marginal readers ⇒ Few of the books had (some) typical readers
    b. OK Few of the books had any readers.
A proposition p is stronger than q if p entails q. In an increasing context such as (7), having typical readers entails having either typical or marginal readers, so by widening the domain we weaken the claim instead of strengthening it. Some cannot be replaced by any. In contrast, (8) is a decreasing context and the entailment relations are reversed. Few of the books having typical or marginal readers entails, and is thus stronger than, few of them having typical readers. Here, some can be replaced by any. If this account is correct, then it is quite plausible for the processor to recognize that NPI licensing and supporting inferences to subsets are the same property. Let us call this the association model.

The association model presupposes that decreasingness is a factor in the processing of inferences, as do theories within formal semantics (Sánchez-Valencia 1991; van Benthem 1991; Bernardi 2002; Fyodorov et al. 2003; Altman et al. 2005). The only psychological data that have tested this claim, however, come from Geurts (2003a) and Geurts & van der Slik (2005). Geurts argued that if monotonicity plays a role in human reasoning, then syllogisms that are complex according to a monotonicity-based logic should show low rates of success in human reasoning tasks. He therefore calculated the number of steps required to solve standard syllogisms according to a set of monotonicity-based inference rules, and compared the complexity of the problem with human performance. He found that the monotonicity-based complexity did indeed correlate with performance (but see Newstead 2003 and Geurts 2003b for commentary). Geurts and van der Slik came to a similar conclusion by demonstrating that participants find sentences with two quantifiers of the same monotonicity profile easier to interpret than sentences with mixed monotonicity profiles. Thus, what evidence there is on processing and monotonicity agrees with the association model.

Given the existing evidence on NPI licensing, it seems almost like a foregone conclusion that the processor should recognize the relationship between monotonicity and NPI licensing. Indeed, when we started this project, our intention was to investigate how this recognition occurred, rather than whether it occurred. As is often the case, however, expectations did not match results and we were unable to find evidence for the recognition. We therefore describe other linguistic models, which do not predict such a relationship.
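The strengthening contrast in (7) and (8) can also be verified mechanically. The sketch below is not from the paper: 'many' and 'few' are given toy numerical thresholds, and entailment is checked by brute force over small invented models in which each book either does or does not have typical and marginal readers.

```python
from itertools import product

BOOKS = ["b1", "b2", "b3", "b4"]

def entails(p, q, models):
    """p entails q iff there is no model where p holds and q fails."""
    return all(q(m) for m in models if p(m))

# A model assigns each book two facts: does it have typical readers, and
# does it have marginal readers (paid reviewers, browsers, ...)?
MODELS = [dict(zip(BOOKS, states))
          for states in product(product([False, True], repeat=2), repeat=len(BOOKS))]

def many(pred):        # 'many' modelled here as: at least three of the four books
    return lambda m: sum(pred(m[b]) for b in BOOKS) >= 3

def few(pred):         # 'few' modelled here as: at most one of the four books
    return lambda m: sum(pred(m[b]) for b in BOOKS) <= 1

typical = lambda facts: facts[0]
typical_or_marginal = lambda facts: facts[0] or facts[1]

# Increasing context: widening the domain weakens the claim, cf. (7).
print(entails(many(typical), many(typical_or_marginal), MODELS))   # True  (narrow => wide)
print(entails(many(typical_or_marginal), many(typical), MODELS))   # False
# Decreasing context: widening the domain strengthens the claim, cf. (8).
print(entails(few(typical_or_marginal), few(typical), MODELS))     # True  (wide => narrow)
print(entails(few(typical), few(typical_or_marginal), MODELS))     # False
```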
1.2 The dissociation model
The essential relation (e.g. scalar) account of NPI licensing is also compatible with the processor failing to recognize the sameness of properties. It is possible, for example, that the NPI licensing effect of decreasing operators is compiled into the syntax, whereas inferencing is performed purely model theoretically. It could be argued that NPI licensing is purely syntactic. For example, NPIs have a [−de] feature that makes the sentence ungrammatical unless it is deleted in construction with a licensor, which bears a [+de] feature. On this view, the set of operators that bear the syntactic [+de] feature may coincide with the set of semantically decreasing ones, but this fact has no significance in the actual licensing process. In contrast, inferences could be computed purely in a model-theoretic semantic manner, not in terms of syntactic features.

The processor may dissociate NPI licensing from inferences for another reason. It may be that the correlation between licensing and inference to subsets holds true, but it is not essential for how NPI licensing works. For example, Zwarts (1995) and Giannakidou (1998) propose that non-veridicality, as opposed to decreasingness, is the semantic property that licenses certain polarity-sensitive items. The baseline definition is that an operator Op is veridical if Op(p) entails p, and non-veridical if Op(p) does not entail p. Giannakidou introduces further qualifications to narrow down the set of licensors; see Simons (1999) for a review. Relevant to us is that all decreasing operators are non-veridical (e.g. never), although not all non-veridical operators are decreasing (e.g. or), as illustrated in (9) and (10):

(9) Mary never ate pizza ⊨ Mary never ate pizza with anchovies (never is decreasing)
    Mary never ate pizza ⊭ Mary ate pizza (never is non-veridical)

(10) Mary is sad or tired ⊭ Mary is sad; Mary is tired (or is non-veridical)
     Mary is sad or tired ⊭ Mary is very sad or very tired (or is not decreasing)

Suppose now that some NPI is licensed by non-veridical operators. The presence of such a licensed NPI would not highlight the decreasingness of the licensor even if the licensor happened to be decreasing, because its decreasingness would not be essential to licensing. Finally, there is a family of theories that identify interpreted or uninterpreted negatives as the crucial factors in NPI licensing. Ladusaw (1992) assimilates Romance negative concord to NPI licensing, arguing
that n-words (nessuno, nadie, personne, etc.), as well as verbal negation in Romance languages, are NPIs, and their licensor is an overt or silent anti-additive item.2 de Swart & Sag (2002) recast this analysis with n-words interpreted as anti-additive quantifiers that are absorbed into a single polyadic quantifier.

(11) Nadie no he visto nada.
     no one not saw nothing
     'No one saw anything'
Postal (2005) and Szabolcsi (2004) propose the flip-side account and assimilate NPI licensing to negative concord. More precisely, according to Postal, NPIs are not lexical items in need of licensing. Instead, surface forms like no one and anyone are alternative morphologies that spell out the combination of an underlying indefinite and one or more negations,3 the choice depending on whether the negations are left alone or cancelled by other negations in the sentence. Szabolcsi recasts Postal's proposal along the lines of de Swart and Sag: both the NPI and the licensor have a negation component in their lexical semantics; these negations are factored out to form a polyadic negative quantifier (see Szabolcsi 2004: 433–6 and 447–50). The following example involves some simplifications that are irrelevant to present purposes:

(12) not-many books had not-some readers        = underlying
     not (_ many books had _ some readers)      = negations factored out
     Few books had any readers.                 = spell-out

As all decreasing operators either are negations or can be decomposed into a negation plus an increasing operator within its scope (for little as degree negation, see Heim 2006), this analysis is fully compatible with the correlation between NPI licensing and decreasingness. But the processor will have no reason to recognize that the NPI licensing property of few books is coextensive with the one that supports inference to subsets. In other words, dissociation, while not strictly necessary, is quite plausible.

2 Op is anti-additive if it bears out this de Morgan law: Op(a or b) iff Op(a) and Op(b). No one, never, without are anti-additive. If Op is merely decreasing, the biconditional holds left to right but not right to left.

3 One of Postal's strong descriptive arguments for the claim that any forms contain a lexical negation comes from the phenomenon known as 'secondary triggering'. The NPI in years requires a clausemate anti-additive licensor. (i) is bad because it does not have one.
(i) *Nobody suspected that astronauts had gone to Mars in years.
(ii) Nobody suspected that no astronauts had gone to Mars in years.
(iii) Nobody suspected that any astronauts had gone to Mars in years.
In (ii), the clausemate anti-additive licensor is no astronauts. In (iii), only any astronauts can be the licensor, but then it must contain an anti-additive operator, just like no astronauts does. Den Dikken (2006) generalizes Postal's account to the Dutch NPI hele 'whole' and recasts the analysis in syntactic terms similar to Ladusaw's, completing the circle.
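The de Morgan property in footnote 2 can be verified mechanically on a small finite domain. The sketch below is an illustration under our own toy assumptions, not the authors' formalization; it compares an anti-additive operator (no one) with a merely decreasing one (at most one): the left-to-right direction holds for both, the right-to-left direction only for the anti-additive operator.

```python
import itertools

PEOPLE = {"p1", "p2", "p3", "p4"}

def no_one(pred):                    # anti-additive
    return not any(pred(x) for x in PEOPLE)

def at_most_one(pred):               # merely decreasing (cf. 'at most one', 'few')
    return sum(pred(x) for x in PEOPLE) <= 1

def left_to_right(op, a, b):         # Op(a or b)  =>  Op(a) and Op(b)
    return (not op(lambda x: a(x) or b(x))) or (op(a) and op(b))

def right_to_left(op, a, b):         # Op(a) and Op(b)  =>  Op(a or b)
    return (not (op(a) and op(b))) or op(lambda x: a(x) or b(x))

def check(op):
    subsets = [set(s) for r in range(len(PEOPLE) + 1)
               for s in itertools.combinations(sorted(PEOPLE), r)]
    l2r = all(left_to_right(op, lambda x, A=A: x in A, lambda x, B=B: x in B)
              for A in subsets for B in subsets)
    r2l = all(right_to_left(op, lambda x, A=A: x in A, lambda x, B=B: x in B)
              for A in subsets for B in subsets)
    return l2r, r2l

print("no one     :", check(no_one))       # (True, True)  -> anti-additive
print("at most one:", check(at_most_one))  # (True, False) -> decreasing only
```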
1.3 Overview of the experiments and empirical predictions
The association and dissociation models make different predictions regarding the effects of an NPI on inference processing. We conducted three experiments to discriminate between the models. In all three of the experiments, participants verified whether sentences involving quantifiers were entailed by other sentences. For example, the participant might read 'None of the colleagues have ever sent flowers or cards', and judge whether 'None of the colleagues have sent cards' was true. The crucial manipulation was whether the disjunction sentence contained an NPI. The association model, combined with Geurts' findings, makes a strong prediction that the occurrence of an NPI within the licensing domain of a decreasing operator should facilitate processing of inferences from sets to subsets. Below we describe how we arrive at this prediction.

According to the association model, facilitation should arise for two reasons. First, the presence of the NPI is assumed to provide additional information as to the decreasing monotonicity of the context. As Geurts and van der Slik write when reviewing the literature on NPIs, 'In effect, a NPI serves to signal that the environment in which it occurs is [decreasing] . . .' (p. 240, emphasis in the original, in Geurts 2003a: 101; Geurts & van der Slik, 2005). With more evidence contributing to the monotonicity classification decision, the classification process will be less susceptible to processing noise, which may mask relevant contextual information. We would expect this to translate into either more accurate monotonicity judgments or quicker processing times, depending on how participants elect to balance speed and accuracy constraints in their classification judgments. If the monotonically decreasing character of the context is more easily determined, and this information is used in the inference verification process, inference verification will show facilitation relative to a situation in which monotonicity decisions are difficult to determine.

Second, facilitation could arise if the NPI forces computation of the monotonicity context prior to the point at which inferences are needed. If there is no NPI in a quantifier sentence, there is no guarantee that the monotonicity of the context will be computed until inference verification is required (indeed, there are good arguments on psychological and linguistic grounds that sentences are minimally processed until discourse constraints force otherwise, see, e.g. McKoon & Ratcliff 1992; Koller & Niehren 1999; Ferreira & Patson 2007). Processing time for the monotonicity calculation will therefore be delayed until inferences need to be verified. This is not the case when an NPI is present in a sentence, however. Since recognizing decreasingness is assumed to be a necessary part of the NPI licensing computation, a sentence with an NPI cannot be processed without determining decreasing monotonicity. Accordingly, when inferences need to be verified later in the discourse, the monotonicity computations will have already been completed, and thus the valid inferences can be determined relatively easily.

The association model predicts that facilitation of inferences arises because the NPI should highlight the monotonicity properties of the quantifier and because it forces the monotonicity computations prior to the inference verification stage of processing. In contrast, if NPI licensing is not essentially linked to the monotonicity computations, as in the dissociation model, no facilitation of inference processing would be expected. These predictions were tested using a reasoning task and two self-paced reading tasks, presented below in section 2. In the case of each experiment, the introductory section explains the hypotheses and how the materials were constructed. This is followed by a technical presentation of the method and the results. The final section for each experiment is the discussion of the extent to which the effects observed in that experiment supported the hypotheses. The more general linguistic interpretation of the results is reserved for the general discussion in section 3.

2 EXPERIMENTS

2.1 Experiment 1

Experiment 1 was a reasoning task in which participants judged whether a decreasing inference was supported by the preceding discourse. Participants read two-line vignettes, followed by a question. The first line set the general context, and the second line involved a verb phrase disjunction within the scope of a subject quantifier. Two types of quantifiers were used, decreasing and non-decreasing. When the quantifier was decreasing, the combination supported an inference to one of the disjuncts within the scope of the same quantifier, but when the quantifier was non-decreasing, the inference was not supported. The question at the end of the vignette explicitly asked about the inference to one of the disjuncts. Crucially, half the items included an NPI (ever or any), while the other half did not. If the association model were correct, we would expect the inferences that included the NPI to be verified more accurately than those that did not include the NPI. We now elaborate on the structure of the materials used in the task and the experimental design.
Participants answered 48 reasoning problems constructed from different vignettes. Each vignette existed in four conditions: (a) decreasing quantifier with an NPI; (b) decreasing quantifier without an NPI; (c) non-decreasing quantifier with an NPI; (d) non-decreasing quantifier without an NPI. DE quantifiers (conditions a and b) supported inferences to subsets (i.e. the inference presented in S2 was valid), whereas non-decreasing quantifiers (conditions c and d) did not (the inferences were invalid). Conditions (a) and (c) contained NPIs, but not (b) and (d) (note that the NPI in (c) is unlicensed for most non-decreasing quantifiers). For each vignette, pairs of comparable quantifiers were used for the DE and the non-decreasing conditions, respectively. After participants had read the context sentence and the disjunction sentence, they answered an inference question in the format, 'Would it be reasonable to say that Quantifier(VP2)?' Table 1 shows two examples of the stimuli. The context sentence, displayed in all conditions, is shown in the top row. Sentences displayed in the different conditions are shown in the following rows, and the question appears in the final row. Note that the inference invariably used the second disjunct so as to avoid repetition of a contiguous stretch of the previous sentence. The correct answer was 'yes' for conditions (a) and (b) and 'no' for (c) and (d). There were 48 vignettes, constructed so as to support ever as the NPI, or to support any (referred to in the discussion below as ever items and any items, respectively, regardless of whether they are used in the NPI-present or the NPI-absent condition). The complete list of the quantifier pairs is shown in Table 2. Each participant saw only one occurrence of each quantifier and only one use of each particular vignette.4

4 There were several reasons for these quantifier choices. Speakers generally find modified quantifiers more natural and plausible, which is important in preventing the task from having an IQ-test feel. Regarding specific quantifiers, almost no(body) is strictly speaking non-monotonic. However, people readily accept inferences like this: (i) Almost no campers have had a sunburn or caught a cold ⊨ Almost no campers have caught a cold. We believe the reason is this. Indeed, it may be that some campers had sunburns but no campers caught colds, in which case almost is not justified in the conclusion. However, given that campers in a camp usually number in the hundreds and sunburns and colds are equally probable and unremarkable, almost no essentially means that the number of incidents was negligible. People freely assume that both kinds of events occurred and if for some reason one did not, it is not important. This is different from inferences to arbitrary subsets, as in Almost no campers had a sunburn or sprouted a second head ⊭ Almost no campers sprouted a second head. The quantifiers involving only license the NPI although they are used in condition (c) (they are non-monotonic). These items were therefore removed for the analysis of the effect of the unlicensed NPI and reinstated for all other analyses.
Table 1  Examples of the stimuli used in Experiment 1

Ever example
  Context sentence: Our camp is on Staten Island.
  (a) decreasing with NPI: Almost no campers have ever had a sunburn or caught a cold.
  (b) decreasing without NPI: Almost no campers have had a sunburn or caught a cold.
  (c) non-decreasing with NPI: Almost every camper has ever had a sunburn or caught a cold.
  (d) non-decreasing without NPI: Almost every camper has had a sunburn or caught a cold.
  Inference question: Would it be reasonable to say that almost no [almost every] camper[s] have [has] caught a cold?

Any example
  Context sentence: After winning a game at a local fair, children get to choose a prize.
  (a) decreasing with NPI: Almost no child chose any mint chocolates or decided on the cotton candy.
  (b) decreasing without NPI: Almost no child chose mint chocolates or decided on the cotton candy.
  (c) non-decreasing with NPI: Almost every child chose any mint chocolates or decided on the cotton candy.
  (d) non-decreasing without NPI: Almost every child chose mint chocolates or decided on the cotton candy.
  Inference question: Would it be reasonable to say that almost no [almost every] child decided on the cotton candy?
Table 2  Quantifier pairs

Non-decreasing        Decreasing
Almost every          Almost no
Almost everybody      Almost nobody
At least five         At most five
At least half         At most half
More than five        Less than five
More than five of     Less than five of
Many                  Not many
Many of               Not many of
No less than fifty    No more than fifty
No less than five     No more than five
Only five             Very few
Only five of          Very few of
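For concreteness, the way one vignette expands into the four conditions (quantifier monotonicity crossed with NPI presence) can be sketched from a template. This is only an illustrative reconstruction using the 'campers' item from Table 1; the helper function and its names are ours, not the authors' materials-generation code.

```python
from itertools import product

# One vignette template (the 'campers' item); the quantifier pair comes from Table 2.
QUANT = {"decreasing": "Almost no campers have",
         "non-decreasing": "Almost every camper has"}
NPI = {"with NPI": "ever ", "without NPI": ""}
TEMPLATE = "{quant} {npi}had a sunburn or caught a cold."

def build_conditions():
    """Cross the quantifier pair with NPI presence to get the four conditions."""
    items = {}
    for (mono, quant), (npi_label, npi) in product(QUANT.items(), NPI.items()):
        items[f"{mono}, {npi_label}"] = TEMPLATE.format(quant=quant, npi=npi)
    return items

for label, sentence in build_conditions().items():
    print(f"{label:>28}: {sentence}")
# decreasing, with NPI        : Almost no campers have ever had a sunburn or caught a cold.
# non-decreasing, without NPI : Almost every camper has had a sunburn or caught a cold.  (etc.)
```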
If the association model is correct and there is an essential link between decreasingness and NPIs, participants should find version (a), in which an NPI is present in the disjunction sentence, easier to process than version (b), in which no NPI is present. As described in section 1.3, the NPI should highlight the monotonicity context and make it easier for participants to verify the inference. If there is not an essential link between NPI licensing and decreasingness, there should be no difference between the versions. Conditions in which there is no NPI
present can be used to measure baseline performance. Participants should respond positively to valid inferences without the NPI (b) and negatively to invalid inferences without the NPI (d). We included condition (c)—with an unlicensed NPI—because we wished to establish whether participants were processing the NPI independently of whether the NPI affected processing of the inference. Under conditions in which participants showed no effects of reading an unlicensed NPI, an association model would not necessarily predict that an NPI would facilitate processing. Condition (c) allows us to eliminate the possibility that participants were merely ignoring the NPI during the task. (This possibility is more likely in later experiments in which participants merely read the sentences without having to explicitly judge the veracity of inferences.)

2.1.1 Method

Participants. Twenty-eight New York University students participated for course credit or payment. Participants were randomly allocated to one of four experimental groups (see below).

Design and stimuli. Forty-eight items were constructed, 24 capable of supporting ever and 24 capable of supporting any. Each item was three lines long and included a context sentence (sentence 1, S1), a disjunction sentence that used or (S2), and an inference question (S3). The disjunction sentence always began with a quantifier, and included an NPI in the (a) and (c) conditions. For the ever items, the NPI occurred prior to the disjunction and scoped over both disjuncts. For the any items, the NPI occurred in the first disjunct and therefore only scoped over the first disjunct. The inference question always began with the phrase, 'Would it be reasonable to say that', followed by the quantifier, followed by the inference. The NPI was not included in the inference sentence because we wished the inference sentence to be identical across all conditions. A selection of the items is shown in the Appendix, where the first two sentences in each item were used as the first two lines in this experiment, and the third sentence was transformed into the question (although note that the any items presented in the Appendix correspond to the self-paced versions used in Experiment 3, not the versions used in Experiment 1). Four experimental groups of items were created, so that each participant would see items from all four conditions, but see no single item more than once. Items were counterbalanced across groups, so that all items occurred equally often in all four conditions across the experiment.

Participants also saw 24 filler items based around simple deductive inferences, 12 of which were true and 12 of which were false. Filler items were of the same general form as the experimental items but did not vary over experimental groups. For example, a true filler item was, 'John looked in the fridge./He found ham and cheese./Would it be reasonable to say that John found food in the fridge?'
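The counterbalancing scheme just described (each participant sees every item once, and every item occurs equally often in every condition across the experiment) amounts to a standard Latin-square rotation of items through the four conditions over four lists. A minimal sketch of that assignment, with invented item identifiers:

```python
CONDITIONS = ["a", "b", "c", "d"]   # the four conditions described above
N_ITEMS = 48
N_LISTS = 4

# Latin-square rotation: on list L, item i is shown in condition (i + L) mod 4,
# so each list has 12 items per condition and each item cycles through all conditions.
lists = {
    lst: {item: CONDITIONS[(item + lst) % N_LISTS] for item in range(N_ITEMS)}
    for lst in range(N_LISTS)
}

# Sanity checks on the two counterbalancing properties.
for assignment in lists.values():
    counts = {c: sum(1 for v in assignment.values() if v == c) for c in CONDITIONS}
    assert all(n == N_ITEMS // N_LISTS for n in counts.values())        # 12 per condition
for item in range(N_ITEMS):
    assert {lists[lst][item] for lst in range(N_LISTS)} == set(CONDITIONS)  # item rotates
print("each list: 12 items per condition; each item appears in every condition across lists")
```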
Procedure. Participants read the sentences from a computer screen. A fixation cross first appeared, followed by the first sentence of the item. Participants pressed the enter key to advance to the second and third sentences, respectively. After reading the third sentence, they pressed a key corresponding to 'yes' or 'no' to indicate their response to the question.

2.1.2 Results

Accuracy on the filler items was at ceiling levels. For the valid inference filler items, the average proportion of 'valid' responses was 0.96, whereas for the invalid items, the average proportion of 'valid' responses was 0.05 (hence, proportion correct = 0.95). Thus, participants understood the instructions and experienced no difficulty in answering simple deduction questions.

Average proportion 'valid' judgments for the four experimental conditions are shown in Table 3.

Table 3  Proportion 'valid' responses for Experiment 1. Mean proportions of 'valid' responses as a function of inference type (valid or invalid), presence of NPI (+ for present, − for absent) and NPI type (ever or any). Standard deviations (SDs) in parentheses. Proportions correct for (a) and (b) are equal to the tabulated value, but proportions correct for (c) and (d) are equal to 1 minus the tabulated value.

         (a) Valid, NPI+   (b) Valid, NPI−   (c) Invalid, NPI+   (d) Invalid, NPI−
any      0.89 (0.17)       0.87 (0.14)       0.51 (0.31)         0.38 (0.31)
ever     0.85 (0.18)       0.87 (0.20)       0.46 (0.31)         0.44 (0.37)
Total    0.87 (0.13)       0.87 (0.14)       0.49 (0.29)         0.42 (0.33)

Valid items (a and b) were generally answered correctly, but participants also believed that a large proportion of the invalid items (c and d) were valid inferences (hence, proportion correct scores were low for the invalid items: M = 0.51 and M = 0.58, respectively). Nonetheless, there was a robust difference between valid and invalid items whether the NPI was present or not, (a) v. (c), F(1,24) = 39.27, P < 0.0005, (b) v. (d), F(1,24) = 40.2, P < 0.0005 (we used counterbalancing group as a blocking factor, as we did in all the analyses of variance (ANOVAs) reported in this article). Participants therefore processed the quantifiers sufficiently deeply to
discriminate between valid and invalid inferences, but they experienced some difficulty in rejecting invalid items.

We next consider the effect of the NPI. We hypothesized that the presence of the NPI would facilitate participants in judging whether the inference was valid. For the valid items, the addition of the NPI should therefore lead to a greater proportion of valid responses, i.e. (a) > (b). Table 3 shows that this was not the case. For the valid items, the means are essentially equal, Ma−b diff = 0.0012, SD = 0.13, 95% confidence interval (CI) participants (one tailed): −0.042 < Mdiff < 0.040, 95% CI items (one tailed): −0.036 < Mdiff < 0.033. We can therefore be confident that the NPI improved performance by a maximum of 4% for the valid items.

Accuracy on the invalid items appears to have been worsened by the introduction of the NPI. To examine this in more detail, we broke the invalid items into the ever set and the any set and performed a repeated-measures ANOVA with presence of NPI and type of NPI as factors. This revealed marginal effects of the presence of the NPI, F1(1,27) = 3.017, P = 0.094, F2(1,46) = 4.21, P = 0.043, and of the interaction, F1(1,27) = 5.05, P = 0.033, F2(1,46) = 2.94, P = 0.093, such that the addition of any worsened accuracy more than the addition of ever. Unfortunately, the drop in accuracy in condition (c) coincided with a progression towards chance. Hence, it is difficult to know whether participants believed more inferences were valid, or whether more participants responded at chance because they found the sentence with the unlicensed NPI too difficult to interpret.

Correct judgments of the valid items were quite high (M = 0.87) and we were concerned that ceiling effects might be obscuring a facilitatory effect of the NPI. We therefore examined the individual participant data to check that incorrect scores were not restricted to a minority of participants. However, only 5 out of 28 participants had perfect scores and there was room for improvement in the remaining 23 participants. The non-ceiling participants scored M = 0.85 v. M = 0.84 in the valid inference, NPI-present and NPI-absent conditions, respectively, which is not a reliable difference, t < 1. Moreover, performance on items in which participants had to reject the inference as being invalid was low (M = 0.58 correct), and the high discrepancy between these scores indicates that high performance on the valid inferences may have been due to a confirmation response bias. In short, our concerns about ceiling effects were unfounded.

We also analysed response times to establish whether participants responded more quickly to valid inferences with the NPI (condition (a)) than valid inferences without the NPI (condition (b)). Mean response times to correctly answered inferences were almost identical, however, Ma = 2.69 s (SD = 0.86) v. Mb = 2.66 s (SD = 0.97), and no significant differences were apparent, 95% CI (participants, two tailed): −0.19 s < Ma−b < 0.24 s, 95% CI (items, two tailed): −0.26 s < Ma−b < 0.27 s. Thus, the RT analysis was consistent with the response proportion analysis. We did not analyse the invalid inferences because there was an insufficient number of correct responses.

Finally, we analysed whole sentence reading times for the sentence that contained the NPI (S2). In particular, we wished to establish whether participants had processed the NPI at all, given that we observed no facilitatory effect in either response proportions or response times. We compared condition (a) with condition (c), both of which contained an NPI but in (a) the NPI was licensed, whereas in (c) it was not. Participants read the unlicensed condition much more slowly than the licensed condition, M = 4.67 s (SD = 1.45) v. M = 4.05 s (SD = 1.07), F1(1,27) = 18.11, P < 0.005, F2(1,46) = 10.81, P < 0.005, but there was no main effect of NPI type (ever v. any), Fs < 1, or interaction, Fs < 1. Thus, we can be sure that participants processed the NPI and the quantifier in sufficient depth that they were sensitive to whether the NPI was licensed.
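The interval logic used above (bounding how large a facilitation effect could plausibly be, given the observed mean difference) is a standard paired-samples computation. A minimal sketch with fabricated numbers standing in for the 28 participants' condition means; nothing here reproduces the actual data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Fabricated per-participant accuracies for the two valid-inference conditions
# (NPI-present vs NPI-absent), used only to illustrate the computation.
npi_present = rng.normal(0.87, 0.13, size=28)
npi_absent  = rng.normal(0.87, 0.13, size=28)

diff = npi_present - npi_absent
mean_diff = diff.mean()
se = diff.std(ddof=1) / np.sqrt(len(diff))

# One-tailed 95% upper bound on the facilitation effect, analogous to the
# 'Mdiff < 0.040' bound reported for the by-participants analysis.
upper = mean_diff + stats.t.ppf(0.95, df=len(diff) - 1) * se
print(f"mean difference = {mean_diff:.3f}, one-tailed 95% upper bound = {upper:.3f}")
```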
2.1.3 Discussion

We tested participants on inferences that involved either decreasing or non-decreasing quantifiers. We hypothesized that when an NPI was included in the text, participants should have been more accurate in judging that inferences involving decreasing quantifiers were valid, compared to contexts in which the NPI was absent. Participants generally accepted valid inferences, but experienced difficulty in rejecting invalid inferences. Nonetheless, there was a robust difference between validity judgments in the two conditions. The presence of the NPI, however, failed to significantly facilitate correct acceptance of the valid inference. Indeed, accuracy on the valid inferences was identical in the NPI-present and NPI-absent conditions, and we can be 95% confident that the NPI facilitated processing by at most 4%. One potential reason for the absence of an effect of the NPI is that participants might not have processed the NPI to a sufficient depth when they read the sentences. However, this is an unlikely explanation because we observed marginally lower accuracy when the NPI was present in the non-decreasing quantifier condition (c) (where it was generally unlicensed) than when it was not present (d), and participants spent half a second longer reading sentences from condition (c) than from condition (a). Hence, there is no doubt that participants were paying attention to the quantifier–NPI combination, yet they appeared not to use this information when evaluating the inferences.

This experiment demonstrated that the presence of an NPI does not substantially facilitate the accuracy with which people can make inference judgments. Nonetheless, facilitation might reduce processing time without improving accuracy. Although we did not observe response time differences on the inference sentences, we could only observe reading times for the whole sentence, which included the 'Would it be reasonable' section and other parts of the sentence that might obscure a small processing time facilitation. We therefore conducted reading time experiments in which we could measure reading times on specific regions of the sentence.

2.2 Experiment 2

Experiment 2 tested whether the addition of the NPI reduced reading time when participants read a version of the inferences presented in Experiment 1. The inferences were embedded in short vignettes containing four sentences: a context sentence, a sentence containing a quantifier and the NPI (in the appropriate condition), a sentence containing the inference and a sentence closing the vignette. The sentences were divided into regions and participants advanced from region to region by pressing a key. Table 4 shows one of the items, the 'Staten Island' vignette. The slashes indicate self-paced reading regions. The first two sentences of each item were the same as those used in Experiment 1. The third sentence always began 'Since Quantifier(VP2)' and contained the inference to the second disjunct that was expressed as the question in Experiment 1. (Notice that S3 did not repeat a contiguous stretch of S2.) We assumed that the most felicitous use of since ('seeing that') is one where the content of its complement is given information, for example, inferable from what has just been conveyed (see also Verbrugge & Schaeken 2006). Using since should therefore force participants to verify the inference, obviating the need for an explicit question.5

We predicted that if the presence of the NPI facilitated processing, reading time on the inference region of the valid inference ('has/have caught a cold') would be quicker when the NPI was present than when it was absent ((a) v. (b)). We predicted effects on the inference region because this is the earliest point at which the NPI could facilitate inference generation. Since the language processor is known to act incrementally, the inference region seemed a more plausible site for facilitation than later regions. Note that S3 was identical across conditions (a) and (b) since the NPI only occurred in S2, the premise with the disjunction. Participants should also be slower in the inference region when reading invalid inferences with no NPI, compared to valid inferences with no NPI (S3.3, (d) v. (b)), and they should be slower reading an unlicensed NPI than a licensed NPI (S2.2, (a) v. (c)). 'S3.3' denotes region 3 of sentence 3; 'S2.2' region 2 of sentence 2, etc.

Table 4  Staten Island vignette

Conditions (a) [NPI+] and (b):
  S1: Our camp/ is on/ Staten Island./
  S2: Almost no campers/ have [ever] had/ a sunburn/ or caught a cold./
  S3: Since/ almost no campers/ have caught a cold,/ the parents are happy,/ and they praise the counselors./
  S4: They will/ use the camp/ again next year./
  S5: Is the camp in Staten Island?

Conditions (c) [NPI+] and (d):
  S1: Our camp/ is on/ Staten Island./
  S2: Almost every camper/ has [ever] had/ a sunburn/ or caught a cold./
  S3: Since/ almost every camper/ has caught a cold,/ the parents are unhappy,/ and they blame the counselors./
  S4: They will/ use the camp/ again next year./
  S5: Is the camp in Staten Island?

2.2.1 Method

Participants. Thirty-two New York University students participated for course credit or payment. None had completed Experiment 1. Participants were randomly assigned to four experimental lists.

Stimuli and design. Items were the same as the ever items generated for Experiment 1,6 except for additional material presented at the end of S3 and S4, and the use of since instead of a question. To ensure that all the sentences were felicitous, the additional material in S3 varied slightly with the quantifier (e.g. 'the parents are happy,/ and they praise the counselors/' vs. 'the parents are unhappy,/ and they blame the counselors/'). These differences occurred after the regions of interest and so they cannot affect the conclusions of the experiment. We also added a question at the end of each item to maintain the participant's attention. Half of these questions were designed to have true answers and half were designed to have false answers. The items were divided into the same four lists as in Experiment 1.

Items were divided into regions through which participants advanced at their own pace. The first sentence was divided into three regions of approximately the same length. For the second sentence, the division of the sentence varied depending on the structure of the item. For all items, the first region contained the quantifier phrase, and the second region contained the NPI (in the (a) and (c) conditions). This region was designed so that it contained an equal number of words across all the items (two words plus the NPI), so that variation across items was minimized on this region. The remaining sentence was divided into two or three regions depending on the length. The third sentence always included five regions. The first was since; the second was the quantifier phrase; the third was the second disjunct of S2 followed by a comma (the inference region) and the fourth and fifth were concluding phrases. We also included 40 filler items that were of similar length to the experimental items but did not include NPIs or inferences based on the quantifier.

5 People seem to find it easier to make both scope judgments and inferences when anaphora and non-metalinguistic questions are used, rather than when metalinguistic reasoning is tested (see Paterson et al. 1998; Tunstall 1998; Szabolcsi 2006).

6 We also included a set of items with any as an NPI. However, when we analysed these items in isolation, we found that participants were only marginally sensitive to the validity of the inference and to the presence of the unlicensed NPI, and there were no experimentally interesting significant differences. Since very little can be concluded from these items, and to aid the exposition, we report only the analysis of the ever items (although the experiment was analysed on the basis that both sets of items were tested to maintain appropriate family-wise error rates, and the reporting of P values and degrees of freedom reflects this). The complete analysis is available on request. One reason why the any items might have behaved differently to the ever items is that any did not modify the proposition on which the inference is based (any modified the first disjunct but we tested on the second disjunct), unlike ever in the ever items. However, this could not explain why we did not get strong effects of the unlicensed NPI or the inference, so we prefer to remain agnostic about possible differences between the NPIs and about the role of the scope position of the NPI.

Procedure. At the start of each trial, the participant saw the first sentence of an item appear as a row of underscores, each underscore representing a missing letter. The first set of words then appeared in the first region, and the participant advanced onto the next region by pressing space. The words of the current region disappeared and the next set arrived in place of the underscores. After the participant had read the first sentence, the next sentence appeared below it and the old sentence disappeared. On completing all four sentences, the comprehension question appeared and the participant pressed a key corresponding to the yes/no response. After the response, the first sentence of the next trial appeared and the participant proceeded as before.

2.2.2 Results

Data preprocessing. We removed all RTs that were more than 4 SDs away from the mean of their region. We also removed all data
associated with items that were incorrectly answered because we had no guarantee that participants had read the items in these cases. For the current experiment, 7% of the data were removed because of incorrectly answered comprehension questions. A further 4% were removed as outliers from S2, and 4% from S3.
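The two exclusion steps just described (dropping trials whose comprehension question was answered incorrectly, then trimming reading times more than 4 SDs from their region mean) are easy to express over a trial-level table. The sketch below is our own illustration with hypothetical column names, not the authors' analysis code:

```python
import pandas as pd

def preprocess(trials: pd.DataFrame) -> pd.DataFrame:
    """trials has (hypothetical) columns: subject, item, region, rt, question_correct."""
    # 1. Drop whole trials whose comprehension question was answered incorrectly.
    kept = trials[trials["question_correct"]].copy()
    # 2. Trim RTs more than 4 SDs from the mean of their region.
    region_stats = kept.groupby("region")["rt"].agg(["mean", "std"])
    kept = kept.join(region_stats, on="region")
    kept = kept[(kept["rt"] - kept["mean"]).abs() <= 4 * kept["std"]]
    return kept.drop(columns=["mean", "std"])
```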
Figure 1 Reading time differences (ms) for S2, Experiment 2. NPI present, unlicensed (c) minus licensed (a). Values on the x-axis correspond to different regions of the sentence. The appropriate section of the ‘Campers’ item is shown on each region (words in square brackets correspond to the (c) version). Positive difference scores indicate longer reading times in the unlicensed NPI condition. Error bars correspond to the standard error of the difference for each region (subject analysis).
Sentence 2. S2 contained the NPI in conditions (a) (licensed) and (c) (generally unlicensed). Since we observed only a marginal effect of the NPI on accuracy in Experiment 1, we analysed S2 to establish that participants were processing the NPI in this experiment. Because condition (c) generally contained an unlicensed NPI, participants should spend longer reading these sentences than those in condition (a). Figure 1 displays the within-subject difference in reading time between conditions (a) and (c) as a function of the sentence regions. We consider regions 2, 3 and 4 because prior to region 2, the NPI had not been presented. Participants were quicker reading the unlicensed NPI than the licensed NPI but slowed down in the following region. This is confirmed by an interaction between region and NPI license condition ((a) v. (c)), F1(2,62) = 6.12, P = 0.004, F2(2,30) = 6.03, P = 0.006.
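Figures 1 and 2 (and Figures 3 and 4 below) plot per-region within-subject difference scores together with the standard error of the difference. A minimal sketch of that computation, assuming a long-format table with hypothetical columns subject, condition, region and rt; this is our reconstruction, not the authors' code:

```python
import pandas as pd

def region_differences(rts: pd.DataFrame, cond_a: str, cond_b: str) -> pd.DataFrame:
    """rts: one row per subject x condition x region with a mean 'rt' column.
    Returns, per region, the mean within-subject difference (cond_b - cond_a)
    and the standard error of that difference (subject analysis)."""
    wide = rts.pivot_table(index=["subject", "region"], columns="condition", values="rt")
    diff = (wide[cond_b] - wide[cond_a]).rename("diff").reset_index()
    out = diff.groupby("region")["diff"].agg(["mean", "sem"])
    return out.rename(columns={"mean": "mean_diff", "sem": "se_diff"})

# Example use (hypothetical data frame `rts`): region_differences(rts, "a", "c")
# would yield the unlicensed-minus-licensed curve plotted in Figure 1.
```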
Figure 2 Reading time differences (ms) for S3, Experiment 2. The upper line is the invalid (d) minus valid (b) inference conditions, no NPIs. The lower line is the valid inference, NPI absent (b) minus valid inference, NPI present (a) condition. Values on the x-axis correspond to different regions of the sentence. The appropriate section of the ‘Campers’ item is shown on each region (words in square brackets correspond to the (d) version). Error bars correspond to the standard error of the difference for each region (subject analysis).
Sentence 3. S3 contained the inference. In conditions (a) and (b), the inference was valid, whereas in (c) and (d) it was invalid. We expected participants to have longer RTs when reading invalid inferences than valid inferences independently of the effect of the NPI. We therefore compared conditions (d) and (b) on the inference region (S3.3). The upper line of Figure 2 shows the within-subject difference between conditions (d) and (b) of S3. On the inference region, the invalid inferences were read significantly more slowly than the valid inferences, t1(31) = 3.79, P = 0.001, t2(23) = 0.004. Furthermore, the effect was not present on regions prior to the inference region, t1(31)s < 1.66, Ps > 0.10, t2(23)s < 1.77, Ps > 0.09. This was confirmed by an interaction between region and inference type, F1(2,56) = 5.22, P = 0.008, F2(2,46) = 4.92, P = 0.012, when regions S3.1, S3.2, and S3.3 were included in a repeated-measures ANOVA. Thus, the increased reading times on the inference region must be due to participants evaluating an invalid inference, and not merely due to participants reading different quantifiers across the two conditions. We hypothesized that if the NPI facilitated inference verification, RTs for valid inferences should be lower when the preceding sentence
had a licensed NPI. We therefore compared valid inference reading times when the preceding sentence contained the NPI to when it did not, i.e. (a) v. (b). The lower line of Figure 2 shows the within-subject difference between the two conditions. Interestingly, and counter to an NPI-facilitating hypothesis, conditions involving an NPI were more difficult to read than those without the NPI. This is illustrated by the predominantly negative values in the figures. There was a robust effect on the inference region, t1(31) = 4.74, P < 0.0005, t2(23) = 2.95, P = 0.007. Furthermore, this effect is greater on the inference region than on the regions prior to it, F1(2,56) = 7.60, P = 0.001, F2(2,46) = 4.76, P = 0.013. In summary, there was no evidence that the NPI facilitated processing. Indeed, we observed a robust increase in RTs in the context of the NPI.

2.2.3 Discussion

The primary goal of this experiment was to determine whether the presence of an NPI facilitated inference making in decreasing contexts. We used the NPI ever and found no evidence of such facilitation. In fact, we observed a slowdown in reading times in the inference region. Although a failure to find facilitation can sometimes be attributed to a lack of statistical power or to a lack of depth of processing on the part of the participants, we observed several effects which suggest that such an explanation would be unlikely. First, we found that participants took longer to read the unlicensed NPIs than the licensed NPIs, indicating that the NPI was being processed to some degree. Secondly, participants took longer to read invalid inferences than valid inferences, indicating that they were aware of the polarity context. These secondary findings suggest that if NPIs did facilitate inference making in negative contexts, we would have found facilitation in these experiments.

In Experiment 2, the NPI ever scoped over both disjuncts in S2, and it was not repeated in the inference proposition in S3. We conducted another experiment to determine whether these facts played a crucial role in our results. This experiment included items where (i) the NPI ever was repeated in the inference proposition in S3 and (ii) the NPI any occurred inside the second disjunct in S2.

2.3 Experiment 3

Experiment 3 employed the same methodology as Experiment 2, but we made several changes to the items. First, we added the NPI to the third sentence of the ever items in the NPI-present condition (a). For example, if the Experiment 2 sentence was
S3.
Since/ almost no campers/ have caught a cold,/ the parents are happy,/ and they praise the counselors./
the equivalent Experiment 3 sentence became S3.
Since/ almost no campers/ have ever/ caught a cold,/ the parents are happy,/ and they praise the counselors./
We made this change to test whether the extra reading time associated with the NPI could have been due to participants experiencing difficulty comparing the proposition in the since sentence, which did not contain an NPI, against the inference propositions generated in the disjunction sentence.

A second change was that we included modified versions of the any items from Experiment 1, the verification task. In Experiment 1, any was present in the first disjunct, whereas the inference proposition contained the second disjunct. These were altered so that now any was present in the second disjunct of S2, and this same proposition was inferred in S3 (although without repeating the NPI in S3). This was achieved by changing the order of the disjuncts in S2, and replacing the inference proposition in S3 with the now-second disjunct. For example, the 'local fair' vignette from Experiment 1 (shown in Table 1) became

S1. After winning a game/ at a local fair,/ children get to choose/ a prize./
S2. Almost no child/ decided on the cotton candy/ or chose any mint chocolates./
S3. Since/ almost no child/ chose mint chocolates,/ the mint chocolates were thrown away,/ and a new prize introduced./
S4. The local fair/ happens/ once a year./
S5. Does the fair happen twice a year?

The NPI any was now in the inference proposition, as ever was for the ever items. Moreover, just as ever was not repeated in S3 in Experiment 2, any was not repeated in S3 in this experiment. Finally, we no longer included the unlicensed NPI sentences. We already knew from Experiments 1 and 2 that participants were processing the NPIs and so there was no remaining need for that condition.

2.3.1 Method

Participants. Forty-eight New York University students participated for course credit or payment. They were randomly allocated to one of the three counterbalancing conditions.
Stimuli and design. The items were very similar to those used in Experiment 2. There were two changes, as described above. First, the NPI in the any items was now in the inference proposition. Note that this resulted in the inference proposition being different from that used in Experiment 1. For example, in the local fair item, above, 'decided on the cotton candy' was the Experiment 1 inference proposition, whereas 'chose mint chocolates' was the Experiment 3 inference proposition. We were unable to keep the same inference proposition across experiments because many of the inference propositions used in Experiment 1 were incompatible with any. Having changed the inference proposition, we also changed the order of propositions in S2 to ensure that the second disjunct was used as the inference proposition across both experiments. We also changed the ever items by including the NPI in the third sentence of the NPI-present items. For example, item 1 (condition (a)) of Experiment 2 read 'Since/ almost no campers/ have caught a cold,/ the parents are happy,/ and they praise the counselors./' whereas the same item and condition in Experiment 3 read 'Since/ almost no campers/ have ever/ caught a cold,/ the parents are happy,/ and they praise the counselors./' Note that an extra region was required for the Experiment 3 items (the have ever region in the previous example). This was because we needed to include the NPI for the NPI-present conditions but not for the NPI-absent conditions, and we needed to keep the inference region identical in length across conditions. The unlicensed NPI condition was no longer included, but all other aspects of the design and procedure were identical to Experiment 2.

2.3.2 Results

Data were preprocessed using the same criteria as Experiment 2. Five per cent of the items were removed because of incorrectly answered comprehension questions, and a further 5% of responses were removed as outliers. All the regions of interest were located in S3, since we had no unlicensed NPI condition (c). The principal reason for conducting the experiment was to establish whether changing the scope of any would generate increased reading times on the inference region. Figure 3 shows the within-subject differences for the any items. On the inference region, participants were marginally slower in the invalid inference condition (d) than the valid condition (b), t1(47) = 2.60, P = 0.013, t2(23) = 1.55, P = 0.13. They were also significantly slower in the following region: S3.4, t1(47) = 3.10, P = 0.003, t2(23) = 2.33, P = 0.029. There were no significant differences between the valid and the invalid inference conditions in S3.1 and
S3.2, t1s < 1, indicating that the slowdown is likely to be restricted to the inference region and further regions, although the interaction between region (S3.1, S3.2 and S3.3) and inference validity condition was not significant, F1(2,90) = 2.11, P = 0.13, F2 < 1. Thus, there is some evidence that participants were sensitive to the validity of the inference on the any items.

The lower line in Figure 3 shows the difference between the NPI-present and NPI-absent valid inference conditions (conditions (b) minus (a)), for the any items. If the NPI facilitated processing, the presence of the NPI should reduce reading times and the values on the b−a curve of Figure 3 should be positive. However, mirroring the effect of the NPI on ever items in Experiment 2, the NPI slowed down processing on the inference region, t1(47) = 2.071, P = 0.044, t2(23) = 3.034, P = 0.006, and there was no effect prior to the inference region, t1(47) = 1.33, P = 0.20, t2(23) = 1.76, P = 0.093, ts < 1, for regions S3.1 and S3.2, respectively, although the interaction was only significant in the items analysis, F1(2,90) = 2.11, P = 0.13, F2(2,46) = 5.70, P = 0.006. There were also significant differences on region S3.4, t1(47) = 2.17, P = 0.035, t2(23) = 2.33, P = 0.013, suggesting that the slowdown in processing overflowed into the next region. In conclusion, allowing any to occur in the inference disjunct meant that the NPI slowed down the RTs.

We were also interested in whether the effect on the ever items would replicate when we included the NPI in S3. First, we consider the effect of the invalid inference on RTs. The upper line in Figure 4 displays the invalid minus the valid within-subject difference ((d) minus (b)). Note that the inference region is now S3.4 (not S3.3, as it was in the previous experiment) because of the inclusion of the NPI in S3. On the inference region, RTs were significantly slower in the invalid inference condition (d) than in the valid inference condition (b), t1(47) = 3.23, P = 0.002, t2(23) = 4.04, P = 0.001. This effect was also present in region S3.2, t1(47) = 2.22, P = 0.031, t2(23) = 2.42, P = 0.024, and marginally so in S3.1, t1(47) = 2.21, P = 0.032, t2 = 1.96, P = 0.062, but not in S3.3, t1(47) = 1.39, P = 0.17, t2(23) = 1.47, P = 0.16. When regions S3.1 to S3.4 were analysed together, a main effect of inference validity was observed, F1(1,135) = 16.66, P < 0.0005, F2(1,23) = 16.07, P = 0.001, and there was some evidence of an interaction, F1(3,135) = 1.35, P = 0.26, F2(3,69) = 3.34, P = 0.024.

The effects of the NPI can be seen in the lower curve of Figure 4. Note that region S3.3 has an extra word (the NPI) in condition (a), making interpretation of RT differences complicated in this region. On the inference region, sentences that included the NPI were read more slowly than sentences that did not, although the effect was only significant in the participants analysis, t1(47) = 2.39, P = 0.021, t2(23) = 1.45, P = 0.16. Similar effects were present in regions prior to the inference region, however, where a main effect of NPI presence was observed (S3.1, S3.2 and S3.4, but not including S3.3 because it contains an extra word), F1(1,45) = 10.11, P = 0.003, F2(1,23) = 5.73, P = 0.025, but there was no evidence of an interaction, Fs < 1. RTs were significantly longer in the NPI-present conditions for S3.1 and S3.2, t1(47) > 2.04, P < 0.05, t2(23) > 2.10, P < 0.05. Thus, we observed reading times that were significantly longer in the NPI-present sentences than in the NPI-absent sentences, as we did in Experiment 2.

Finally, we consider the effects of the NPI across both sets of items. The inference region for the any items was S3.3, whereas for the ever items it was S3.4. We therefore used these scores as the dependent measure in an ANOVA with NPI presence and NPI type as factors. The overall effect of NPI presence was to slow down reading time on the inference region, F1(1,45) = 9.75, P = 0.003, F2(1,46) = 9.24, P = 0.004, and there was no significant interaction between NPI types and NPI presence, Fs < 1. The effect of NPI presence was therefore equal across NPI types. Furthermore, comparing regions S3.1, S3.2 and S3.3 for the any items and S3.1, S3.2 and S3.4 for the ever items revealed that the effect of NPI presence was greatest on the inference region, as shown by an interaction between region and NPI presence, F1(2,90) = 3.75, P = 0.027, F2(1,46) = 3.74, P = 0.028, but no NPI presence by NPI type interaction, F1(1,45) = 1.70, P = 0.20, F2(1,46) = 1.19, P = 0.28, nor region by NPI presence by NPI type interaction, F1(2,90) = 2.35, P = 0.11, F2(2,92) = 2.94, P = 0.058.

Figure 3 Reading time differences (ms) for the any items, S3, Experiment 3. The upper line is the invalid (d) minus valid (b) inference conditions, no NPIs. The lower line is the valid inference, NPI absent (b) minus valid inference, NPI present (a) condition. Values on the x-axis correspond to different regions of the sentence. The appropriate section of the 'local fair' item is shown on each region (words in square brackets correspond to the (d) version). Error bars correspond to the standard error of the difference for each region (subject analysis).

Figure 4 Reading time differences (ms) for the ever items, S3, Experiment 3. The upper line is the invalid (d) minus valid (b) inference conditions, no NPIs. The lower line is the valid inference, NPI absent (b) minus valid inference, NPI present (a) condition. Values on the x-axis correspond to different regions of the sentence. The appropriate section of the 'Campers' item is shown on each region (words in square brackets correspond to the (a) or (d) version). Error bars correspond to the standard error of the difference for each region (subject analysis).

2.3.3 Discussion

In this experiment, we included items with any in the second disjunct, so that any modified the inference proposition. We found that the presence of the NPI slowed reading times on S3, particularly on the inference region, as it did for the ever items in Experiment 2. Clearly, there was no evidence that the NPI facilitated processing of the inference, contrary to the association account of NPI licensing. We also altered the ever items by including the NPI in the third sentence for the NPI-present items. We were concerned that the effects of the NPI were due to the difficulty participants might have experienced comparing the propositions generated with the NPI in the second sentence against propositions presented without the NPI in the third. In contrast to this hypothesis, we observed significantly slower reading times on the inference region in the NPI-present condition than in the NPI-absent condition, and when we considered the any and the ever items together, there was a significant slowdown on the inference region and no significant difference between the two quantifiers. Thus, the effects of the NPI on the inference in this experiment, in which we included the NPI in the since sentence, were similar to those in Experiment 2, in which no NPI occurred in the since sentence. However, the effect of the NPI was less robust than previously (the slowdown on the inference region was significant only in the participants analysis) and spillover effects from the extra word in the NPI condition could have contributed to the slowdown. (This was not an issue in the previous experiment, nor for the any items, because the inference sentences were identical in all relevant conditions.) We cannot therefore completely rule out the possibility that differences between inferences derived from the since sentence and propositions presented in the third sentence contributed something to the observed slowdown. This issue is considered further in section 3.

Finally, we observed significantly slower reading times as a result of the NPI in regions prior to the inference region in S3, as we did in Experiment 2. It is therefore possible that integrating NPI sentences into the discourse is more complex than integrating sentences without an NPI. However, we did not observe this effect with the any items and so the effect appears to be NPI specific or related to the items that supported ever. Moreover, the effects on the inference region were significantly greater than those on prior regions in Experiments 2 and 3 (at least when ever and any items are combined), as measured by the significant interaction terms. Thus, at least some of the slowdown we observe on the inference region cannot be due to whatever it is that causes the slowdown on the earlier regions. We leave the locus of this effect to future research.

3 GENERAL DISCUSSION

The goal of this study was to establish whether the processor recognizes the relationship between decreasingness and NPI licensing. We argued that a domain-widening plus proposition-strengthening theory would plausibly predict that the presence of a licensed NPI facilitates inferences from sets to subsets (the association model). To this end, we constructed vignettes that contained sentences with either a decreasing or a non-decreasing quantifier followed by inferences that were valid or invalid, respectively. Crucially, we manipulated whether these texts contained an NPI. According to our construal of the association model, the presence of the NPI should have facilitated
processing by making it more likely that participants would evaluate the inference correctly or by speeding up the interpretation of the inference. Contrary to this hypothesis, however, we saw no evidence that the presence of the NPI facilitated processing, whether participants were given explicit reasoning tasks (Experiment 1) or whether they read sentences in a self-paced reading task (Experiments 2 and 3), and therefore we found no evidence supporting the claim that decreasingness and NPI licensing are linked. It is unlikely that our results were simply due to low sensitivity in general because we found three other reliable effects involving the NPI and the inferences: (i) participants read the NPI more slowly when it was unlicensed than when it was licensed, (ii) invalid inferences were read more slowly than valid inferences and (iii) the presence of the NPI made processing reliably more difficult on the inference region (i.e. the results were significantly in the opposite direction compared to the facilitation predictions). The results are consistent with the hypothesis that the processor does not recognize the relationship between NPIs and inferences to subsets, i.e. the basic tenet of the dissociation model. Before accepting this conclusion, however, we consider other potential explanations for our results. First, suppose that the processor does recognize the relationship between NPIs and inferences, but that any facilitatory effects were obscured in the contexts that we tested. For example, the quantifier at the beginning of the disjunction sentences might have signalled the DE context sufficiently strongly that the licensed NPI was not necessary for the grammar to fully recognize the context. In effect, the context could have been maximally recognized by the grammar before encountering the NPI. The results of Experiment 1 argue against this explanation, however. If participants could so easily identify the context on the basis of the quantifier alone, performance in the inference task should have been very high without the NPI, because the validity of the inference can be perfectly predicted from the context. Instead, participants correctly rejected only 58% of the invalid inferences and accepted only 87% of the valid inferences. Although this is not unusually poor performance compared to other reasoning tasks reported in the literature, participants clearly did not find it especially easy to identify DE contexts without the NPI. A second possible explanation for why we did not find facilitating effects of the NPI is that the context signalling effect may not have been sufficiently long lasting to carry over from the second sentence to the third sentence. This account assumes that the marker for decreasingness or the effect of the NPI fades across time (or linguistic input). A strong argument against this, however, is that the effects of the NPI were
sufficiently long lasting that they slowed down reading in S3 when the NPI was present in the proposition (in S2). It would be unlikely that the NPI would have this effect while not continuing to signal decreasingness (if it signals it at all). We conclude that these two, linguistically uninteresting, explanations for the absence of any facilitation effects are therefore implausible. While writing up this article, we learned from E. Chemla (personal communication) that he had conducted a pilot study involving inference verification in the presence/absence of an NPI in French and also found no facilitatory effect, converging with the results of our own Experiment 1. In Chemla's experiment, the premise was read independently and did not contribute to response time. The context was set up so as to exclude the one specific European interpretation. The stimuli involved the quantifiers aucun 'no', moins de 4 'less than 4', plupart . . . ne . . . pas 'most . . . didn't' and plus de 4 . . . ne . . . pas 'more than 4 . . . didn't', and were of the following shape:
(15) Aucun chien n'a touché un / le moindre européen(s). 'No dog touched a / a single European'
?⇒ Aucun chien n'a touché de français. 'No dog touched any Frenchman'
(16) La plupart des chiens n'ont pas touché un / le moindre européen(s). 'Most of the dogs didn't touch a / a single European'
?⇒ La plupart des chiens n'ont pas touché de français. 'Most of the dogs didn't touch any Frenchman'
Both the percentage of correct answers and the mean response times for correct answers were worse when the premise contained the NPI le moindre than when it contained indefinites with un. Chemla's results, while very preliminary, suggest that the lack of facilitation in our Experiment 1 is not explained by some accidental feature of the design. Instead of a facilitation effect of the NPI, in the reading time experiments we found that introducing an NPI slowed down inference processing. When the inference verification involved a proposition that contained an NPI (in the preceding sentence), reading times were longer compared to when no NPI was present. For example, 'Since almost no campers have caught a cold' took longer to read when the preceding sentence was 'Almost no campers have ever had a sunburn or caught a cold' than when the preceding sentence did not contain ever. Furthermore, in Experiment 2, the slowdown was localized to the
inference region (caught a cold) and the region immediately following it. Thus, the slowdown was directly related to inference verification, and not to general complexity issues associated with processing NPIs. We formulate two hypotheses consistent with the reported patterns across experiments:
(17) No facilitation plus somewhat costly NPI processing: NPI presence does not improve either the accuracy or the speed of inference processing. On the other hand it incurs some cost that is manifested in increased reading times.
(18) Some facilitation plus very costly NPI processing: NPI presence does facilitate inference processing in some way and to some extent, but it also incurs a cost that is large enough both to wipe out all facilitatory effects and to additionally increase reading times.
In what way can the processing of the NPI be costly? One possibility is that recognizing the relationship between the propositions with ever/any (in S2) and those without ever/any (in S3) was difficult. But it is unlikely that absence of the NPI from S3—i.e. a ‘mismatch effect’—is the main factor in the slowdown. In Experiment 2, S3 always contained the second disjunct of S2 and therefore S3 lacked a big chunk of S2, not just the NPI. For example: S2. Almost no campers have ever had a sunburn or caught a cold. S3. Since almost no campers have caught a cold, S2 At most half of the plants have ever died or lost leaves. S3 Since at most half of the plants have lost leaves, Since S3 was never a verbatim repetition of S2, the stimuli did not specifically generate an expectation for the NPI to occur in S2. A more complex form of the mismatch argument may be that participants required extra time to recognize the relationship between the proposition with the NPI in S2 and the proposition without the NPI in S3. But note that the key idea of the scalar account is that the NPI is licensed if and only if the proposition with the NPI entails its counterpart without the NPI. In other words, the diagnostic of NPI licensing involves a mental comparison of ‘mismatching propositions’. If the scalar account is correct, participants who registered that the NPI in S2 was licensed should have already efficiently compared S2 and S3. Therefore, the possibility that the need to compare propositions with and without NPIs determined the reading times in S3 is more compatible with the No Facilitation hypothesis in (17) than with the Some Facilitation hypothesis in (18). This could be further tested by
experiments focusing on the repetition v. omission of the NPI in the inference proposition. Finally, the NPI was repeated in Experiment 3 and some slowdown nevertheless occurred, whereas the mismatch explanation would predict a complete elimination of the slowdown. An alternative possibility is that the semantics/pragmatics of the NPI incurs a significant processing cost. All three theoretical accounts reviewed in section 1 could predict this. On the particular version of the non-veridicality account presented in Giannakidou (1998), some NPIs are referentially deficient: they contain non-deictic variables and thus have to be bound by, or be anaphoric to, an antecedent (for a purely syntactic version of this idea see Progovac 1994). Thus, their processing cost should be similar to that of bound or anaphoric pronouns. On the Ladusaw–de Swart & Sag–Postal–Szabolcsi account, the factoring out of the negative component of the NPI's lexical representation and the formation of a polyadic quantifier with the negation component of the licensor may well be costly. On the Kadmon & Landman–Krifka–Lahiri–Chierchia account, the NPI itself induces scalar implicatures. This aspect of the account was not detailed in section 1; we summarize it here, based on Chierchia (2006: 554–60). Chierchia follows Krifka and Lahiri in attributing an even-like flavour to the base meaning of the NPI any. This activates a set of domain alternatives and carries the implicature that even the broadest choice of the domain of quantification will make the sentence with any true. (This implicature can only be true in a decreasing local environment; in such an environment, the any-sentence will entail its counterpart with a plain indefinite, which always quantifies over some particular domain.) In a departure from Grice, implicatures are added and strengthened meanings are calculated recursively, at every step of the sentence's composition. Domain widening and implicature calculation are plausibly costly real-time operations. The upshot is that all the recent theoretical accounts are in principle capable of explaining the findings (no observable facilitation of accuracy, some slowdown in reading times). What our findings clearly rule out is an account that predicts that NPIs should have a squarely facilitatory effect, as in Dowty (1994). Further work might tease apart the localization and magnitude of the effects, and determine whether one of the models is favoured by the processing data.7 As one issue of interest, notice that while especially stressed NPIs undeniably have an even-flavoured meaning, not all NPIs do.
7 Among other things, it would be very interesting to find out whether there is a slowdown even if the NPI occurs in a segment of the preceding sentence that is not part of the inference proposition. In fact, the any items mentioned in note 6 were intended to serve that purpose but, as was pointed out in that footnote, for some reason that batch of items yielded non-significant results in general.
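To make the entailment pattern behind this scalar account concrete (the notation here is ours, assuming a simple existential semantics for the indefinite and domains D′ ⊆ D):

  ∃x ∈ D′ [P(x)]  ⊨  ∃x ∈ D [P(x)]      (widening the domain weakens an existential claim)
  ¬∃x ∈ D [P(x)]  ⊨  ¬∃x ∈ D′ [P(x)]    (under a decreasing operator, widening strengthens it)

The implicature that even the widest domain makes the any-sentence true can therefore only be satisfied in environments of the second kind, where the wide-domain proposition entails each of its narrower-domain counterparts.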
Some examples of NPIs without domain widening are n-words interpreted as NPIs (as is observed in Chierchia 2006), occurrences of unstressed any applied to rigorously defined domains ('The empty set does not have any proper subsets' is fully acceptable, but it does not mean 'even a marginal proper subset', Krifka 1995), and items like the adverb anymore ('He doesn't live here anymore' ≈ 'He lived here and that has changed'), the auxiliary need ('He need not come early') and others (van der Wal 1999). The existence of such NPIs is one reason why some accounts maintain that the phenomenon of NPI licensing per se is not an essentially scalar matter. On the other hand, the basically non-scalar theories may freely acknowledge that some NPIs do have an even-flavour that has to be taken into account in the full description of their distribution and meaning (Szabolcsi 2004, and especially Giannakidou 2007). Such representatives of the dissociation model predict that the processing of an NPI is more costly when it actually carries scalar implicatures. Chierchia's (2006) account accommodates the existence of NPIs without actual domain widening in the following way. In contrast to items like some and many, whose scalar alternatives can be deactivated and thus their implicatures ('but not all') suspended in appropriate contexts, any is grammaticized to always activate a set of domain alternatives. On the other hand, Chierchia requires the proposition with the widest domain of quantification only to entail its counterparts with particular domains; that is, it has to be either stronger than or equivalent to them. 'Domain widening, as implemented here, is a potential for domain widening' (Chierchia 2006: 559, emphasis in the original). In this way, his account does not distort interpretations. However, the combined effect of the grammaticized activation of domain alternatives and the recursive computation of scalar implicatures is that NPIs will incur the same processing cost regardless of whether they actually involve domain widening ('The camper has not suffered ANY bruises') or not ('The empty set does not have any proper subsets'). This prediction contrasts with that of the dissociation model. Further work may be able to determine which prediction is borne out by processing. The question is whether domain alternatives are always computed or whether they are only computed when the NPI involves actual domain widening. In this regard, the hypothesis parallels that of processing work on scalar implicatures (e.g. Noveck & Posada 2003; Bott & Noveck 2004; Breheny et al. 2006). These researchers have contrasted neo-Gricean accounts of how scalar implicatures are processed (e.g. Levinson 2000; Chierchia 2004) with
context-dependent theories (e.g. Sperber & Wilson 1985). The crucial difference between these accounts is that neo-Gricean accounts predict scalar implicatures to be computed by default on encountering a scalar term like some, that is, the scalar alternatives are always calculated and scalar implicatures (denial of the stronger element in the scale) go through unless cancelled. The context-dependent account predicts that the scalar alternatives are only calculated in specific contexts, that is, there is no default computation of scalar alternatives. The results of these processing investigations have been that interpreting sentences with scalar implicatures, like 'Some [but not all] children are in the classroom', requires more processing time than interpreting the sentences without the implicature, as in 'Some [and possibly all] of the children are in the classroom' (e.g. Bott & Noveck 2004), thus arguing against a strict default theory. Somewhat modifying his 2004 proposal, Chierchia (2006) assumes that the default activation of alternatives can be suspended to begin with. Scalar terms have a strong [+r] variant with alternatives, and a weak [−r] variant without alternatives. The weak variant is employed in those cases that on the previous account involved implicature cancellation. The claim that NPIs always have active domain alternatives is technically expressed as their having only strong variants in Chierchia (2006). Therefore, it is significant that earlier work has demonstrated that processing with and without implicatures incurs different costs. Similar questions to those that have been asked about non-NPI scalar terms in the experimental studies cited can now be asked about NPIs, contrasting cases of bona fide domain widening with cases where the NPI either lexically or contextually fails to actually widen the domain. Do both slow down the processing of decreasing inferences, or do only bona fide domain-widening NPIs do so?8
8 Not all scalar theories of negative polarity are purely domain-widening theories, see e.g. Krifka (1995), so the predictions will have to be tested in a differentiated manner.
4 CONCLUSION
The classical explanation for how NPIs are licensed is that they are allowable only in the scope of a downward entailing operator. We argued that a plausible processing model derived from such an account would predict that the presence of an NPI should facilitate the processing of inferences from sets to subsets: the NPI should highlight the decreasingness of the context. Yet, we did not observe
the expected facilitation effects. In Experiment 1, we found that the presence of an NPI had no effect whatsoever on the likelihood of participants correctly verifying inferences, while in Experiments 2 and 3 we observed the reverse finding: the NPI significantly slowed participants' inference verification. While we cannot categorically demonstrate which of several explanations is responsible for this slowdown, all the explanations invoke the NPI in the inference verification process, so it is difficult to argue that the NPI was not a significant contributor to the verification process in general. Our findings suggest that NPIs do not play an important facilitatory role in inference making. The most straightforward implication is that the processor does not recognize the relationship between NPIs and decreasingness (at least not in the way in which the simple association model predicts it should do). The challenge for future researchers is to address what role the NPI plays in the inference-making process, and, more generally, which of the models that we outlined in section 3 is the most accurate processing model of NPI licensing. We have indicated various experiments that could be conducted to this end.
Acknowledgements
Lewis Bott was supported by NIMH Grant MH41704 awarded to Gregory L. Murphy for part of this work. Further support was provided by NIH grant R01HD056200 awarded to Brian McElree. We are grateful to Lyn Frazier, Emmanuel Chemla and Mark Steedman for helpful discussion of an earlier version of the manuscript, to the editor and two reviewers for comments, and to Gregory L. Murphy for his advice and the use of his laboratory. We also thank Jessica Piercy for collecting the data and Tuuli Adams and Peter Liem for help with writing the items.
ANNA SZABOLCSI Department of Linguistics New York University 726 Broadway New York NY 10003 USA e-mail:
[email protected]
APPENDIX: EXPERIMENTAL STIMULI
The list below exemplifies each type of stimulus with one entry; the experiments used two entries of each type. The full list can be obtained from the authors. Items 1–12 all contain ever, using the valid inference version with a DE operator. These items were used in all three experiments. Below they appear as presented to the participants using a self-paced reading paradigm in Experiment 2. The slashes separate the regions. Experiment 3 was similar except that the NPI was repeated in the since sentence (S3), in addition to appearing in the disjunction sentence (S2). In Experiment 1, S3 was not a declarative with since but a question using the 'Would it be reasonable . . .?' construction discussed in the text.
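As an illustration of the region notation used below, a minimal sketch (ours, not the presentation software actually used in the experiments) splits a slash-delimited item into the regions a participant would reveal one key press at a time in a moving-window display:

# Illustrative only: derive self-paced reading regions from a slash-delimited item.
item = ("Our camp/ is on/ Staten Island./ Almost no campers/ have ever had/ "
        "a sunburn/ or caught a cold./ Since/ almost no campers/ have caught a cold,/ "
        "the parents are happy,/ and they praise the counselors.")
regions = [region.strip() for region in item.split('/') if region.strip()]
for index, region in enumerate(regions, 1):
    print(index, region)   # each region would be displayed on a separate key press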
1
2
3
4
5
Our camp/ is on/ Staten Island./ Almost no campers/ have ever had/ a sunburn/ or caught a cold./ Since/ almost no campers/ have caught a cold,/ the parents are happy,/ and they praise the counselors./ They will/ use the camp/ again next year./ Is the camp in Staten Island? I keep/ my collectibles/ on the counter./ No more than five pieces/ have ever fallen/ or gotten smashed./ Since/ no more than five pieces/ have gotten smashed,/ the arrangement seems safe,/ and I will stick with it./ I may/ add/ new pieces./ Do I keep my collectibles on the counter? The club/ hikes in/ the Palisades./ At most five members/ have ever spotted/ deer/ or found bluebells./ Since/ at most five members/ have found bluebells,/ the women are grumbling,/ and they are considering other clubs./ The kids/ don’t care./ Is this a hiking club? We assemble/ our own/ furniture./ Not many chairs/ have ever had/ wobbly legs/ or fallen apart./ Since/ not many chairs/ have fallen apart,/ we’ll get some tables,/ and perhaps more complicated pieces./ One saves/ on these/ purchases./ Do we assemble furniture? Mary got/ a toy set/ from her grandma./ Less than five pieces/ have ever been/ damaged/ or gotten lost./
6
8
9
10
11
7
Since/ less than five pieces/ have gotten lost,/ the set is in good shape,/ and Mary will save it for her children./ She only/ keeps/ quality stuff./ Did Mary get a toy set from her grandma? I leave/ messages about/ my theater group./ Very few people/ have ever come/ to the show/ or called back./ Since/ very few people/ have called back,/ my group is unhappy,/ and they say I am lazy./ They want/ to place calls/ themselves./ Is my group unhappy? One of/ our friends/ is sick./ Almost nobody/ has ever sent/ flowers or written a card./ Since/ almost nobody/ has written a card,/ our friend feels neglected,/ and she is complaining./ She will/ return to work/ next month./ Is our friend sick? Every summer/ Tom grows/ tomatoes./ At most half the plants/ have ever died/ or lost leaves./ Since/ at most half the plants/ have lost leaves,/ Tom is quite satisfied,/ and he doesn’t consider growing carrots./ He likes/ his garden./ Does Tom grow tomatoes? I keep/ an eye on/ the enrollments./ No more than fifty people/ have ever signed up/ for squash/ or taken karate./ Since/ no more than fifty people/ have taken karate,/ some classes are cancelled,/ and the program will move to the second floor./ The coaches/ are holding/ a meeting./ Are the coaches holding a meeting? Judy likes/ to display the books/ she gets/ for presents./ Not many of the books/ have ever been/ thick/ or had two volumes./ Since/ not many of the books/ have had two volumes,/ the display is small,/ and it fits on her shelf./ The shelf/ is above/ Judy’s desk./ Is Judy getting books for presents? I use/ a lab/ computer./ Less than five of my files/ have ever been/ tampered with/ or gotten infected./
12
Items 13–24 all contain any. The results for these items were presented in Experiments 1 and 3. The items are in the form presented to participants in Experiment 3 (the disjuncts should be reversed for Experiment 1). 13
14
15
16
Julie keeps/ tropical fish and plants/ in her aquarium./ No more than five fish/ have had a disease /or have eaten any plants. Since/ no more than five fish/ have eaten plants,/ Julie is quite pleased/ and may buy more fish./ She will buy/ some new/ plants too./ Will Julie get some new plants? The ice cream shop/ around the corner/ sold a lot of/ sundaes this weekend./ Not many people/ chose low calorie sundaes /or bought any fruit sorbets. Since/ not many people/ bought the fruit sorbets/ they won’t offer them again,/ and will order ice cream instead./ Next weekend/ they hope to make/ more money./ Did the ice cream shop sell a lot of sundaes? This department store/ had a/ housing goods sale./ Less than five salespeople/ sold living room rugs /or made any significant profit. Since/ less than five people/ made significant profit,/ the store lost money/ and they will lose their jobs./ The store/ usually has sales/ in the fall/ and the spring./ Did the store have a housing goods sale? Our family/ always cooks/ a traditional meal/ for Thanksgiving./
Since/ less than five of my files/ have gotten infected,/ I like the arrangement,/ and I trust the other users./ I can’t afford/ a laptop/ now./ Do I use lab computer? Max commutes/ from/ Tempe./ Very few of his flights/ have ever been/ overbooked/ or suddenly canceled./ Since/ very few of his flights/ have been suddenly canceled,/ Max’s commute is easy,/ and he doesn’t worry about it./ Max likes/ to live/ in Tempe./ Does Max work outside his home town?
17
19
20
21
18
Very few people/ take a second helping of sweet potatoes /or eat any creamed onions. Since/ very few people/ eat creamed onions,/ there are some leftovers/ and we have them again the next day./ There is usually/ turkey left over/ too./ Does the family cook a traditional meal for Thanksgiving? I went/ to the new/ train station yesterday/ to ask about/ my trip to Nantucket./ Almost nobody/ was buying tickets /or boarding any trains. Since/ almost nobody/ was boarding trains,/ the lines were short/ and I could ask about my train times./ I want/ to leave/ as soon as possible./ Did I go to the train station? The Atlantic Hotel/ sends its napkins/ to be washed/ on Wednesdays./ No more than fifty napkins/ are spotted with bleach /or come back with any stains. Since/ no more than fifty napkins/ come back with stains,/ the napkins are reused,/ and the hotel buys new ones the next season./ The hotel/ is especially busy/ during the summer./ Is the hotel busy in the summer? Paradise Cruise Lines/ had a holiday cruise/ that lasted a week./ Less than five of the passengers/ felt seasick during the cruise /or had any serious complaints. Since/ less than five of the passengers/ had serious complaints,/ the captain was happy,/ and he enjoyed the cruise./ The captain/ hopes the next trip/ will go/ as smoothly./ Did the cruise last a week? The local supermarket/ conducted a survey/ of its customers./ Very few of the surveys/ were returned late /or had any blank responses. Since/ very few of the surveys/ had blank responses,/ the results were tallied/ and the information was passed on to the managers./ The supermarket/ conducted a survey/ again the next year./ Was it a supermarket that conducted the survey? After winning a game/ at a local fair,/ children get to choose/ a prize./ Almost no child/ decided on the cotton candy/ or chose any mint chocolates./
22
24
REFERENCES Altman, A., Y. Peterzil, & Y. Winter (2005), ‘Scope dominance with upward monotone quantifiers’. Journal of Logic, Language and Information 14:445–55. Bernardi, R. (2002), Reasoning with Polarity in Categorial Type Logic. Ph.D. dissertation, Utrecht, The Netherlands. Bott, L. & Noveck, I. (2004), ‘Some utterances are underinformative: the onset and time course of scalar in-
ference’. Journal of Memory and Language 51:437–57. Breheny, R., Katsos, N., & Williams, J. (2006), ‘Are generalised scalar implicatures generated by default? An online investigation into the role of context in generating pragmatic inferences’. Cognition 100:434–63. Chierchia, G. (2004), ‘Scalar implicatures, polarity phenomena and the syntax/pragmatics interface’. In
23
Since/ almost no child/ chose mint chocolates,/ the mint chocolates were thrown away,/ and a new prize introduced./ The local fair/ happens/ once a year./ Does the fair happen twice a year? Children from/ a boys and girls club/ went on a campaign/ to raise money./ At most five boys/ got a large donation/ or sold any chocolate candy./ Since/ at most five boys/ sold chocolate candy,/ the club will campaign again,/ and go to a different neighborhood./ Selling chocolate/ is a good way/ to raise money./ Were the children from a basketball team? A clothing designer/ asked a focus group/ to pick between buttons/ of various shapes and colors./ At most half the people/ picked square buttons /or chose any purple buttons. Since/ at most half the people/ picked purple buttons,/ the results were discouraging,/ and the color was discontinued./ There were/ four different/ button colors./ Were there ten different button colors? The curator/ of an art gallery/ set up a new exhibit/ of paintings and sculptures./ Not many of the artists/ sculpted large pieces/ or painted any big pictures. Since/ not many of the artists/ painted big pictures,/ the painting area was small,/ and left lots of space for the sculptures./ Setting up/ an art exhibit/ is itself/ an art./ Did the exhibit feature photography?
Anna Szabolcsi, Lewis Bott and Brian McElree 449 Geurts, B. (2003b), ‘Monotonicity and syllogistic inference: a reply to Newstead’. Cognition 90:201–4. Geurts, B. & van der Slik, F. (2005), ‘Monotonicity and processing load’. Journal of Semantics 22:97–117. Giannakidou, A. (1998), Polarity Sensitivity as (Non)Veridical Dependency. Linguistik Aktuell/Linguistics Today 23. John Benjamins. Giannakidou, A. (2006), ‘Only, emotive factives, and the dual nature of polarity dependency’. Language 82:575–603. Giannakidou, A. (2007), ‘The landscape of EVEN’. Natural Language and Linguistic Theory 25:39–81. Heim, I. (2006), ‘Little’. In C. Tancredi et al. (eds.), Proceedings of Semantics and Linguistic Theory 16. (http://research. nii.ac.jp/salt16/proceedings.html). Kadmon, N. & F. Landman (1993), ‘Any’. Linguistics and Philosophy 16:353–422. Koller, A. & J. Niehren (1999), ‘Scope underspecification and processing’. ESSLLI Reader. (http://www.ps.unisb.de/Papers/abstracts/./ESSLLI:99.ps). Krifka, M. (1995), ‘The semantics and pragmatics of polarity items’. Linguistic Analysis 25:209–57. Ladusaw, W. (1980), Polarity Sensitivity as Inherent Scope Relations. Garland, New York. Ladusaw, W. (1992), ‘Expressing negation’. In C. Barker and D. Dowty (eds.), Proceedings of Semantics and Linguistic Theory II. OSU, Columbus, OH. 237–61. Lahiri, U. (1997), ‘Focus and negative polarity in Hindi’. Natural Language Semantics 6:57–123. Levinson, S. (2000), Presumptive Meanings. The MIT Press. Cambridge, MA. McKoon, G. & Ratcliff, R. (1992), ‘Inference during reading’. Psychological Review 99:440–66.
Belletti, A. (ed., Structures and Beyond. Oxford University Press. Oxford. Chierchia, G. (2006), ‘Broaden your views: implicatures of domain widening and the ‘‘logicality’’ of language’. Linguistic Inquiry 37:535–91. De Decker, P., E. Larsson, & A. Martin (2005), ‘Polarity judgments: an empirical view’. Workshop on Polarity from Different Perspectives. New York University. http://www.nyu.edu/gsas/ dept/lingu/events/polarity/posters/ dedecker-larsson-martin.pdf. den Dikken, M. (2006), ‘Parasitism, secondary triggering, and depth of embedding’. In R. Zanuttini, H. Campos, E. Herburger, and P. H. Portner (eds.), Cross-linguistic Research in Syntax and Semantics: Negation, Tense and Clausal Architecture. Georgetown University Press. Washington, DC. 151–75. de Swart, H. & I. A. Sag (2002), ‘Negation and negative concord in Romance’. Linguistics and Philosophy 25:373–417. Dowty, D. (1994), ‘The role of negative polarity and concord marking in natural language reasoning’. In M. Harvey and L. Santelman (eds.), Proceedings from Semantics and Linguistic Theory IV. Cornell University, Ithaca. 114–45. Ferreira, F. & Patson, N. (2007), ‘The ‘‘good enough’’ approach to language comprehension’. Language and Linguistics Compass 1:71–83. Fyodorov, Y., Y. Winter, & N. Francez (2003), ‘Order-based inference in ‘‘Natural Logic’’’. Logic Journal of the IGPL 11:385–416. Gajewski, J. (2008), ‘Licensing strong NPIs’. Proceedings of the 31st Annual Penn Linguistics Colloquium. Papers in Linguistics 14/1:163–76. Geurts, B. (2003a), ‘Reasoning with quantifiers’. Cognition 86:223–51.
450 Effect of NPIs Szabolcsi, A. (2006), ‘Scope and binding’. Submitted to C. Maienborn, K. von Heusinger, and P. Portner (eds.), Semantics: An International Handbook of Natural Language Meaning. Mouton de Gruyter, Berlin-New York. Tunstall, S. (1998), The Interpretation of Quantifiers: Semantics and Processing. Ph.D. thesis, University of Massachusetts, Amherst, MA. Verbrugge, S. & Schaeken, W. (2006), ‘The ballad of if and since: never the twain shall meet?’ Proceedings of the 28th Annual Conference of the Cognitive Science Society, Vancouver, Canada, 26–29 July 2006.2305–10. (http://www.cogsci.rpi. edu/csjarchive/Proceedings/2006/ docs/p2305.pdf). van Benthem, J. (1991), Language in Action. North-Holland. Amsterdam, The Netherlands. van der Wal, S. (1999), Negative Polarity Items and Licensing: Tandem Acquisition. Ph.D. dissertation. Groningen, The Netherlands. von Fintel, K. (1999), ‘NPI-licensing, Strawson-entailment, and context dependency’. Journal of Semantics 16:97–148. Zwarts, F. (1981), ‘Negatief polaire uitdrukkingen 1’. GLOT 4:35–132. Zwarts, F. (1995), ‘Nonveridical contexts’. Linguistic Analysis 25:286–312. First version received: 17.12.2007 Second version received: 23.06.2008 Accepted: 08.07.2008
Newstead, S. E. (2003), ‘Can natural language semantics explain syllogistic reasoning?’ Cognition 90:193–9. Noveck, I. & Posada, A. (2003), ‘Characterising the time course of an implicature’. Brain and Language 85:203–10. Paterson, K. B., Sanford, A. J., Moxey, L. M. & Dawydiak, E. J. (1998), ‘Quantifier polarity and referential focus during reading’. Journal of Memory and Language 39:290–306. Postal, P. M. (2005), ‘Suppose (if only for an hour) that NPIs are negationcontaining phrases’. Workshop on Polarity from Different Perspectives. New York University. http://www.nyu. edu/gsas/dept/lingu/events/polarity/ papers/postal-paper.pdf. Progovac, L. (1994), Negative and Positive Polarity: A Binding Approach. Cambridge Studies in Linguistics Cambridge U.P., Cambridge. Sa´nchez-Valencia, V. (1991), Studies on Natural Logic and Categorial Grammar. Ph.D. dissertation. Amsterdam, The Netherlands. Simons, M. (1999), Review of Giannakidou, Polarity Sensitivity as (Non) Veridical Dependency. http://linguistlist.org/ issues/10/10-1152.html. Sperber, D. & Wilson, D. (1985), Relevance: Communication and Cognition. Blackwell. Oxford. Szabolcsi, A. (2004), ‘Positive polarity– negative polarity’. Natural Language and Linguistic Theory 22:409–52.
Journal of Semantics 25: 345–380 doi:10.1093/jos/ffn007 Advance Access publication August 20, 2008
Syntax and Semantics of It-Clefts: A Tree Adjoining Grammar Analysis CHUNG-HYE HAN AND NANCY HEDBERG Simon Fraser University
Abstract
In this paper, we examine two main approaches to the syntax and semantics of it-clefts as in 'It was Ohno who won': an expletive approach where the cleft pronoun is an expletive and the cleft clause bears a direct syntactic or semantic relation to the clefted constituent, and a discontinuous constituent approach where the cleft pronoun has a semantic content and the cleft clause bears a direct syntactic or semantic relation to the cleft pronoun. We argue for an analysis using Tree Adjoining Grammar (TAG) that captures the best of both approaches. We use Tree-Local Multi-Component Tree Adjoining Grammar to propose a syntax of it-clefts and Synchronous Tree Adjoining Grammar (STAG) to define a compositional semantics on the proposed syntax. It will be shown that the distinction TAG makes between the derivation tree and the derived tree, the extended domain of locality characterizing TAG and the direct syntax–semantics mapping characterizing STAG allow for a simple and straightforward account of the syntax and semantics of it-clefts, capturing the insights and arguments of both the expletive and the discontinuous constituent approaches. Our analysis reduces the syntax and semantics of it-clefts to copular sentences containing definite description subjects, such as 'The person that won is Ohno'. We show that this is a welcome result, as evidenced by the syntactic and semantic similarities between it-clefts and the corresponding copular sentences.
1 INTRODUCTION
The extant literature on the syntax of it-clefts, as in (1), can be classified into two main approaches. First, the cleft pronoun it is an expletive, and the cleft clause bears a direct syntactic or semantic relation to the clefted constituent, such as one of predication (Jespersen 1937; Chomsky 1977; Williams 1980; Delahunty 1982; Rochemont 1986; Heggie 1988; Delin 1989; É. Kiss 1998). Second, the cleft clause bears a direct syntactic or semantic relation to the cleft pronoun and is spelled out after the clefted constituent through extraposition or by forming a discontinuous constituent with the cleft pronoun (Jespersen 1927; Akmajian 1970b; Emonds 1976; Gundel 1977; Wirth 1978; Hedberg
1990, 2000; Percus 1997). Under this second approach, the cleft pronoun is not necessarily expletive but rather has a semantic function such as that of a definite article.
(1) It was OHNO [who won]. cleft pronoun + copula + clefted constituent + cleft clause
In this paper, we argue for an analysis using Tree Adjoining Grammar (TAG) that captures the best of both traditional analyses by making use of the distinction in TAG between the derivation tree on which syntactic dependencies between elementary objects and compositional semantics are defined, and the derived tree on which aspects of surface constituency are defined. An illustration of the derivation tree and derived tree in TAG is given in section 3.1. In our analysis, as in the expletive approach, at the level of surface syntax (the derived tree), the clefted constituent and cleft clause form a syntactic constituent. As in the discontinuous constituent approach, however, at the level of syntactic dependencies (the derivation tree), the cleft pronoun and the cleft clause form a syntactic unit, and a semantic unit as a definite description. This aspect of our analysis reduces the syntax and semantics of it-clefts to copular sentences containing definite description subjects. We show that this reduction is supported by the fact that it-clefts and the corresponding copular sentences pattern alike both syntactically and semantically. In particular, we use Tree-Local Multi-Component Tree Adjoining Grammar (MC-TAG) to propose a syntax of it-clefts and Synchronous Tree Adjoining Grammar (STAG) to define a compositional semantics on the proposed syntax. It will be shown that the distinction TAG makes between the derivation tree and the derived tree, the extended domain of locality characterizing TAG and the direct syntax–semantics mapping characterizing STAG allow for a simple and straightforward account of the syntax and semantics of it-clefts, capturing the insights and arguments of both the expletive and the discontinuous constituent approaches. The paper is organized as follows. In section 2, we present arguments supporting the discontinuous constituent analysis as well as some arguments supporting the expletive analysis. We also discuss connectivity effects in it-clefts and parallel effects in copular sentences instantiated by binding and agreement. In section 3, we introduce the basics of TAG for doing natural language syntax and present our TAG analysis of the syntax of it-clefts. In section 4, we introduce STAG and show how compositional semantics is done using STAG, and present our analysis of the semantics of it-clefts. In section 5, we show how our TAG analysis can account for the connectivity effects in it-clefts instantiated by binding and agreement.
2 THE TENSION BETWEEN THE EXPLETIVE AND THE DISCONTINUOUS CONSTITUENT ANALYSES
(2) a. This is not Iowa we're talking about. (Hedberg 2000, ex. 17)
b. That's the French flag you see flying over there. (Hedberg 2000, ex. 20)
In (2), the proximal demonstrative pronoun is selected when the content of the cleft clause indicates that the referent of the clefted constituent is close to the speaker, and the distal demonstrative is selected when the content of the cleft clause indicates that the referent is far from the speaker. Reversing the cleft pronouns would lead to infelicity. The discontinuous constituent analysis allows the cleft pronoun to be treated as having the semantic content of a determiner. Thus, we can view the cleft pronoun and cleft clause in (2) as working together to function as a demonstrative description as in (3). (3)
a. This [place] we’re talking about is not Iowa. b. That [thing] you see flying over there is the French flag.
Second, the cleft clause has the internal structure of a restrictive relative clause. This is supported by the fact that the initial element in the cleft clause may be realized either as a wh-word (1) or as that (4a), or it may be absent altogether when the gap is not in the subject position (2,4b). It may even be in the form of a genitive wh-word as in (4c). (4)
a. It was Ohno that won. b. It was Ohno Ahn beat. c. It was Ohno whose Dad cheered.
The cleft clause, however, does not relate to the clefted constituent in the way that a restrictive relative clause relates to its head noun, as first
In this section, we review five main syntactic and semantic properties of it-clefts: semantic content of the cleft pronoun, internal structure of the cleft clause, presence of existential and exhaustive presuppositions, presence of equative and predicational readings, and connectivity. For each property, we discuss how the expletive analysis and the discontinuous constituent analysis fare. The arguments presented in this section are taken from the existing literature on it-clefts. First, it has been shown in Hedberg (1990, 2000) that the cleft pronoun can be replaced with this or that, as in (2), depending on the discourse contextual interpretation of the cleft clause. The fact that the choice of the cleft pronoun is subject to pragmatic constraints indicates that the cleft pronoun is not an expletive element.
348 Syntax and Semantics of It-Clefts noted in Jespersen (1927). This is because the clefted constituent can be a proper noun, unlike a head noun modified by a restrictive relative clause, as illustrated in (5). Many expletive analyses (e.g. Delahunty 1982; Rochemont 1986; Heggie 1988) thus do not consider the cleft clause to have the internal structure of a restrictive relative clause. The discontinuous constituent analysis, on the other hand, allows the cleft clause to be treated as such, as argued for in Hedberg (1990), because it assumes that the relative clause forms a constituent with the cleft pronoun. (5) *Ohno that won is an American.
(6) a. I said it should have been [Bill who negotiated the new contract], and it should have been. b. It must have been [Fred that kissed Mary] but [Bill that left with her]. It will be shown in section 3.2 that our analysis resolves this tension between the discontinuous constituent analysis and the expletive analysis by making use of TAG’s distinction between the derivation tree, on which compositional semantics and syntactic dependencies between elementary objects are defined, and the derived tree, on which surface syntactic relations are defined. On our analysis, the clefted constituent and the cleft clause form a constituent in the derived tree, and the cleft pronoun and the cleft clause form a syntactic unit in the derivation tree. Third, it-clefts pattern with copular sentences containing definite description subjects syntactically and semantically. Semantically, it-clefts have existential and exhaustive presuppositions, just as definite descriptions do, as pointed out in Percus (1997) and Hedberg (2000). The inference in (7c) associated with (7a) survives in the negative counterpart in (7b). This is exactly the way the presupposition associated with the definite description the king of France behaves: the presupposition spelled out in (8c) survives in both the affirmative (8a) and the negative counterpart in (8b). (7) a. It was Ohno who won. b. It was not Ohno who won. c. Someone won, and only one person won.
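One standard way to spell out (7c), in notation that is ours rather than the authors', is to let (7a) presuppose the existence and uniqueness of a winner and assert that this winner is Ohno:

  presupposition:  ∃x[won(x)] ∧ ∀x∀y[(won(x) ∧ won(y)) → x = y]
  assertion:       ιx[won(x)] = Ohno

Ordinary negation in (7b) targets only the asserted identity, so the presupposed material survives, exactly as with the definite description in (8).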
Even so, as pointed out first in Delahunty (1982), there is some syntactic evidence that the clefted constituent and the cleft clause do form a surface syntactic constituent. The examples in (6), from Hedberg (2000), show that the two together can be deleted as a unit, as in (6a), and coordinated as a unit, as in (6b).
(8) a. The king of France is bald.
b. The king of France is not bald.
c. There is one and only one king of France.
(9) a. The teacher is Sue Johnson.
b. The teacher is a woman.
This observation follows under the discontinuous constituent analysis, as it-clefts there reduce to ordinary copular sentences, unlike some expletive analyses where the copula is treated as a focus marker (É. Kiss 1998). For instance, (7a) (repeated as (10a)) can be paraphrased as (10b), and corresponds to a typical equative sentence. And (11a) can be paraphrased as (11b), and corresponds to a typical predicational sentence. According to the analysis we will present in section 4, (10a) will be assigned the semantic representation in (10c) and (11a) will be assigned the semantic representation in (11c).
(10) a. It was Ohno who won. b. The one who won was Ohno. c. THEz [won(z)] [z = Ohno]
(11) a. It was a kid who beat John. b. The one who beat John was a kid. c. THEz [beat(z, John)] [kid(z)]
Fifth, Percus (1997) points out that it-clefts pattern with copular sentences containing definite description subjects with regard to SELF-anaphor binding and negative polarity item (NPI) licensing. In the absence of c-command, a SELF-anaphor in the clefted constituent position can be bound by an antecedent inside the cleft clause, as shown in (12a). Also a pronoun in the clefted constituent position cannot be
Both Percus and Hedberg argue that this parallelism between definite descriptions and it-clefts can be accounted for if the cleft pronoun and the cleft clause form a semantic unit, with it playing the role of the definite article and the cleft clause the descriptive component. What this translates to syntactically is that the cleft clause is a restrictive relative clause which is situated at the end of the sentence, forming a discontinuous constituent with the cleft pronoun. On this view, the syntax and semantics of it-clefts reduce to that of copular sentences with definite description subjects. Fourth, it has been observed that it-clefts can have equative and predicational interpretations (Ball 1977; DeClerck 1988; Hedberg 1990, 2000), both of which are readings attested in simple copular sentences, as shown in (9):
350 Syntax and Semantics of It-Clefts bound by an antecedent inside the cleft clause, as shown in (13a). Copular sentences with definite description subjects exhibit the same pattern, as in (12b) and (13b). An NPI can occur in the clefted constituent position, licensed by a matrix negative element, as shown in (14a), but it is not licensed by a negation in the cleft clause, as in (15a). This pattern of NPI licensing is attested in copular sentences, as shown in (14b) and (15b).
(14) a. It isn't anyone I know that John saw.
b. The one that John saw isn't anyone I know.
(15) a. *It is anyone I know that John didn’t see. b. *The one that John didn’t see is anyone I know. Since it-clefts and copular sentences with definite description subjects exhibit the same pattern of binding and NPI licensing, a uniform explanation for the two cases can be sought if the cleft pronoun and the cleft clause together form a definite description.1 The NPI facts are not difficult to explain, as the NPI in (14) is ccommanded by the negative element, and the NPI in (15) is not ccommanded by the negative element. However, the SELF-anaphor in (12) and the pronoun in (13) are at first sight mysterious under the discontinuous constituent analysis. This is an example of connectivity, whereby the clefted constituent appears to behave as it would if it were generated inside the cleft clause, thus lending support for the expletive analysis. In section 5, we present a solution to this problem by incorporating Binding Conditions of Reinhart & Reuland (1993) to our TAG analysis, and also arguing that the SELF-anaphor in (12) is a discourse anaphor of focus. Agreement facts constitute another example of connectivity, in that when the cleft clause has a subject gap, the verb in the cleft clause agrees in number and person with the clefted constituent. Note also 1 Percus shows that wh-clefts differ from both it-clefts and copular sentences with definite description subjects in that only in the former can post-copular NPIs be licensed by embedded negation. See the examples in (15) and (i). The grammaticality of (i), as opposed to the ungrammaticality of (15), shows that it-clefts should not be treated as deriving from wh-clefts, as was argued, for example, in Akmajian (1970b).
(i) What John didn’t see was anything I might recognize.
(12) a. It was himselfi who Johni nominated. b. The one that Johni nominated was himselfi. (13) a. *It was himi who Johni nominated. b. *The one that Johni nominated was himi.
that in equative clefts the copula agrees with the singular cleft pronoun and not with a plural clefted constituent. These facts are shown in (16). (16) a. It is John and Mary that like Pete. b. *It is John and Mary that likes Pete. c. *It are John and Mary that like Pete.
(17) a. They’re just fanatics who are holding him. b. These are students who are rioting. c. Those are kids who beat John. This difference in cleft pronoun choice between equative and predicational clefts with plural clefted constituents shows that the distinction is a real one and emphasizes the parallelism between it-clefts and ordinary copular sentences, which also exhibit the distinction, as shown above in (9).2 It would be difficult for an expletive analysis that assumes that the copula as well as the cleft pronoun is semantically inert, to account for the distinction between the predicational and equative itclefts. In section 5, we use agreement features and feature unification in TAG to account for the connectivity in agreement and the difference in agreement behaviour between equative and predicational it-clefts, again showing that our TAG analysis can capture the best of both the discontinuous constituent analysis and the expletive analysis. 3 SYNTAX OF IT-CLEFTS
3.1 Introduction to TAG syntax TAG is a tree-rewriting system, first formally defined in Joshi et al. (1975). In TAG for natural language, the elementary objects are 2 An anonymous reviewer suggests that the indefinite plural clefted constituent examples in (17) could also be produced with a singular pronoun and copula. While we agree that this might be possible, we have the strong intuition that such examples are equative in nature. Thus, in (i), it is no longer the case that the property of being fanatics is being predicated of a set of people independently identified as those who are holding him. Instead, the question of who is holding him is being answered by identifying these people as a group of fanatics.
(i) It’s just fanatics who are holding him.
The agreement connectivity between the clefted constituent and the cleft clause favours expletive analyses that analyse the clefted constituent as adjoined to or extracted from the cleft clause. Interestingly, as first pointed out in Ball (1977), in predicational clefts, a plural clefted constituent triggers a plural cleft pronoun and the copula agrees with this plural cleft pronoun, while the verb in the cleft clause again agrees with the clefted constituent, as shown in (17).
3 In principle, trees such as (aa_movie) could be broken down into trees for determiners and trees for NPs, as in (i). Under this approach, an NP tree anchoring a noun would substitute into a DP tree anchoring a determiner. But strictly speaking, this violates Frank’s (2002) formulation of CETM, as the DP tree in (i) is a projection of a functional head (D), not a lexical head.
(i)
lexicalized trees called elementary trees that represent extended projections of a lexical anchor. These trees are minimal in that all and only the syntactic/semantic arguments of the lexical anchor are encapsulated and all recursion is factored away. The elementary trees in TAG are therefore said to possess an extended domain of locality. Frank (2002) formulates the extended projection property of elementary trees as a Condition on Elementary Tree Minimality (CETM) and states that ‘the syntactic heads in an elementary tree and their projections must form an extended projection of a single lexical head’ (p. 54). Following Grimshaw (1991), Frank takes extended projections of a lexical head to include the projections of all functional heads that embed it. This means that an elementary tree anchoring a verb can project to verb phrase (VP) but also to tense phrase (TP) and complementizer phrase (CP), and an elementary tree anchoring a noun can project to noun phrase (NP) but also to determiner phrase (DP) and prepositional phrase. Further, the fundamental thesis in TAG for natural language is that ‘every syntactic dependency is expressed locally within a single elementary tree’ (Frank 2002: 22). This allows for a syntactic dependency created by movement to occur within an elementary tree, but not across elementary trees. The trees in Figure 1 are all examples of well-formed elementary trees. (asaw) is an elementary tree because it is an extended projection of the lexical predicate saw and has argument slots for the subject and the object marked by the downward arrow (Y). Moreover, the movement of the subject DP from [Spec,VP] to [Spec,TP], following the VP-internal subject hypothesis (Koopman & Sportiche 1991), is an operation internal to the elementary tree, and therefore represents a syntactic dependency localized to the elementary tree. (aJohn) and (aa_movie) are valid elementary trees because these DP trees each contain a single lexical head, John for (aJohn) and movie for (aa_movie), that can form an extended projection with a DP, in line with the DP hypothesis (Abney 1987).3
Figure 1 Initial trees in TAG.
Figure 2 Auxiliary trees in TAG.
4 By convention, names of initial trees are prefixed with a, and names of auxiliary trees are prefixed with b.
Elementary trees are of two types: initial trees and auxiliary trees. A derivation in TAG starts with initial trees such as trees for simple clauses and nominal phrases. The elementary trees in Figure 1 are examples of initial trees. Auxiliary trees are used to introduce recursive structures, for example, adjuncts or other recursive portions of the grammar. Auxiliary trees have a special non-terminal node called the foot node (marked with an asterisk) among the leaf nodes, which has the same label as the root node of the tree. The auxiliary trees in Figure 2 are well-formed elementary trees, as CETM requires only that syntactic heads and their projections form an extended projection, rendering the presence of the VP root node in (breluctantly) and the NP root node in (bscary) consistent with CETM. Further, following Frank (2002), we can count VP* in (breluctantly) and NP* in (bscary) as arguments of the lexical anchor, as the process of theta-identification (Higginbotham 1985) obtains between them and the lexical anchor.4
Figure 3 Substitution in TAG.
Figure 4 Adjoining in TAG.
These elementary trees are combined through two derivational operations: substitution and adjoining. In the substitution operation, the root node on an initial tree is merged into a matching non-terminal leaf node marked for substitution (Y) in another tree. This is illustrated in Figure 3. In an adjoining operation, an auxiliary tree is grafted onto a non-terminal node in another elementary tree that matches the root and foot nodes of the auxiliary tree. For example, Figure 4 illustrates (breluctantly) adjoining to the VP node in (asaw), and (bscary) adjoining to the NP node in (aa_movie) which in turn substitutes into (asaw). TAG derivation produces two structures: a derived tree and a derivation tree. The derived tree is the conventional phrase structure
Figure 5 Derived tree and derivation tree in TAG.
5 The location in the parent elementary tree is usually denoted by the Gorn tree address. Here, we use node labels such as DPs or VPs for the sake of simplicity.
tree and represents surface constituency. For instance, combining the elementary trees in Figures 1 and 2 through substitution and adjoining as in Figures 3 and 4 generates the derived tree in Figure 5 (left). The derivation tree represents the history of composition of the elementary trees and the dependencies between the elementary trees. In a derivation tree, each node is an elementary tree, and the children of a node N represent the trees which are adjoined or substituted into the elementary tree represented by N. The link connecting a pair of nodes is annotated with the location in the parent elementary tree where adjoining or substitution has taken place.5 An example of a derivation tree is given in Figure 5 (right). Figure 5 (right) records the history of composition of the elementary trees to produce the derived tree in Figure 5 (left): (bscary) adjoins to (aa_movie) at NP, (aJohn) and (aa_movie) substitute into (asaw) at DPi and DP, respectively, and (breluctantly) adjoins to (asaw) at VP. As first shown by Joshi (1985) and Kroch & Joshi (1985), and explored further in Frank (2002), the properties of TAG permit us to provide computationally feasible accounts for various phenomena in
natural language syntax. For example, TAG’s extended domain of locality and its factoring of recursion from elementary trees lead, among other things, to a localization of unbounded dependencies. TAG is a mildly context-sensitive grammar (Joshi et al. 1991), formally sitting between context-free and context-sensitive grammars, and is able to generate unbounded cross-serial dependencies such as those that occur between the arguments and verbs in Dutch and Swiss German in a natural way. In section 3.2, we show that TAG’s extended domain of locality allows us to provide an elegant syntactic account of the discontinuous constituency of the cleft pronoun and the cleft clause without adopting a movement-based account of the extraposition of the cleft clause. At the same time, TAG’s distinction between the derivation and derived trees allows us to account for the surface syntactic constituency of the clefted constituent and the cleft clause.
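To make the two operations and the two output structures concrete, here is a minimal Python sketch of substitution and adjoining over toy trees of the kind defined above. It is our own illustration under simplifying assumptions (node labels instead of Gorn addresses, no feature checking), not a fragment of an existing TAG system.

    import copy
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Node:
        label: str
        children: List['Node'] = field(default_factory=list)
        subst: bool = False   # substitution site (↓)
        foot: bool = False    # foot node (*)

    def _nodes(root: Node):
        stack = [root]
        while stack:
            n = stack.pop()
            yield n
            stack.extend(n.children)

    def substitute(site: Node, initial_root: Node) -> None:
        # the root of an initial tree is merged into a matching substitution site (Figure 3)
        assert site.subst and site.label == initial_root.label
        site.children = copy.deepcopy(initial_root).children
        site.subst = False

    def adjoin(target: Node, aux_root: Node) -> None:
        # an auxiliary tree is grafted onto a matching interior node (Figure 4): the target's
        # old subtree reattaches below the foot node of the auxiliary tree
        graft = copy.deepcopy(aux_root)
        foot = next(n for n in _nodes(graft) if n.foot)
        assert target.label == graft.label == foot.label
        foot.children, foot.foot = target.children, False
        target.children = graft.children

    # toy fragment of Figure 4: (b_scary) adjoins to the NP node of (a_a_movie),
    # which in turn substitutes into the object DP slot of (a_saw)
    a_a_movie = Node('DP', [Node('D', [Node('a')]),
                            Node('NP', [Node('N', [Node('movie')])])])
    b_scary = Node('NP', [Node('AdjP', [Node('scary')]), Node('NP', foot=True)])
    adjoin(next(n for n in _nodes(a_a_movie) if n.label == 'NP'), b_scary)
    obj_slot = Node('DP', subst=True)      # stands in for the object DP↓ of (a_saw)
    substitute(obj_slot, a_a_movie)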
3.2 Our TAG analysis of the syntax of it-clefts
Inspired by the work of Kroch & Joshi (1987) and Abeillé (1994) on discontinuous constituents resulting from extraposition, we propose an analysis for the syntax of it-clefts using tree-local MC-TAG, an extension of TAG. In tree-local MC-TAG, the basic objects of derivation are not only individual elementary trees but also sets of such trees (possibly singleton sets), called multi-component sets. All the trees in a multi-component set are restricted to adjoin or substitute simultaneously into a single elementary tree at each step in a derivation. With this restriction, MC-TAG is shown to be identical to basic TAG in terms of the strings and structural descriptions it generates: that is, MC-TAG has the same weak and strong generative capacity as basic TAG (Weir 1988). In addition to extraposition, MC-TAG has been used in analyses of West Germanic verb raising (Kroch & Santorini 1991), Romance clitic climbing (Bleam 2000) and extraction of an object wh-phrase from a wh-island (Kroch 1989; Frank 2002). The trees in a multi-component set can be thought of as a single elementary tree decomposed into two or more trees. As these trees substitute or adjoin into different positions in another elementary tree, the effect of discontinuous constituency can be produced. Further, the locality of the syntactic dependencies that exist between these trees is maintained, as they are restricted to compose simultaneously with a single elementary tree, contributing to the restricted generative capacity of MC-TAG. We propose that the elementary trees for the cleft pronoun and the cleft clause in the derivation of it-clefts such as (10a) (repeated below as
(18)) and (11a) (repeated below as (19)) form a multi-component set, as in {(ait), (bwho_won)} and {(ait), (bwho_beat)} in Figure 6. (18) It was Ohno who won. (19) It was a kid who beat John.
Figure 6 Multi-component sets of cleft pronoun and cleft clause. 6 Strictly speaking, the elementary trees representing the cleft clause in the two multi-component sets in Figure 6 should have a substitution site in [Spec,CP] to be substituted in by a separate DP elementary tree anchoring a relative pronoun. Here, to simplify the derivation, we have already substituted in the relative pronoun DP tree.
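The tree-locality restriction on multi-component sets can be stated very compactly. The sketch below is a hypothetical encoding of ours (tree and node names follow the figures, with node labels standing in for Gorn addresses): a derivation step that attaches the members of {(ait), (bwho_won)} to two different elementary trees is rejected, while attaching both to the single equative copula tree (awas) introduced below is accepted.

    from dataclasses import dataclass
    from typing import List

    @dataclass(frozen=True)
    class Attachment:
        component: str   # a tree in the multi-component set, e.g. 'a_it'
        operation: str   # 'substitute' or 'adjoin'
        host_tree: str   # the elementary tree being composed with
        host_node: str   # node label standing in for a Gorn address

    def tree_local(step: List[Attachment]) -> bool:
        # tree-locality: every member of the set composes with one and the same elementary tree
        return len({a.host_tree for a in step}) == 1

    # {(a_it), (b_who_won)} composing with the equative copula tree (a_was) in one step:
    good = [Attachment('a_it', 'substitute', 'a_was', 'DP0'),
            Attachment('b_who_won', 'adjoin', 'a_was', 'FP')]
    # splitting the set across two different elementary trees is not a licit derivation step:
    bad = [Attachment('a_it', 'substitute', 'a_was', 'DP0'),
           Attachment('b_who_won', 'adjoin', 'a_Ohno', 'DP')]
    assert tree_local(good) and not tree_local(bad)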
We capture the intuition that the cleft pronoun and the cleft clause form a syntactic unit by placing the elementary trees for them in a single multi-component set. And as these are two separate trees, they are able to substitute and adjoin into two different places in a single elementary tree, producing the effect of discontinuity. The first component of each set introduces a determiner and the second component of each set introduces a relative clause anchoring the lexical predicate.6 The multi-component set can be thought of as a DP tree decomposed into two parts: a functional projection of a determiner and a lexical domain on which the determiner operates. That is, the two parts are comparable to a projection of D and a projection of N in a simple DP tree such as (aa_movie) in Figure 1: like a in (aa_movie), it in (ait) is a determiner that heads a DP, and like the NP (movie) in (aa_movie), (bwho_won) and (bwho_beat) include the lexical domains on which the determiner operates. Moreover, just like a simple DP tree such as (aa_movie), the two components in the sets {(ait), (bwho_won)} and {(ait), (bwho_beat)} together comply with CETM: each set has a single lexical head, the verb, and all other syntactic heads and their
projections (TP, CP and DP) form extended projections of the verb. The presence of FP does not violate CETM, as CETM requires only that syntactic heads and their projections in an elementary tree form an extended projection of the anchor. For the derivation of equative it-clefts as in (18), we adopt the equative copular tree in (awas) in Figure 7, a tree similar to the one proposed in Frank (2002) for copular sentences. In this tree, FP is a small clause of the copula from which the two DPs being equated originate. (18) is derived by substituting (ait) into DP0 in (awas), adjoining (bwho_won) into FP in (awas) and substituting (aOhno) into DP1 in (awas), as illustrated in Figure 8. The syntactic derivation tree and the
Figure 8 Elementary trees for ‘It was Ohno who won’.
Figure 7 Equative copula elementary tree.
derived tree for (18) are given in (d18) and (c18), respectively, in Figure 9.7 In (d18), the elementary trees for the cleft pronoun and the cleft clause form a unit, represented as a single node, and in (c18), the clefted constituent and the cleft clause form a constituent. Postulating separate projections for the copula (CopP) and the small clause (FP) in (awas) can account for the fact that the clefted constituent and the cleft clause form a constituent, as illustrated in (6a,b) (repeated below as (20a,b)), and yet they can be separated by an adverbial phrase, as in (20c). In our analysis, (20a,b) are possible because the bracketed parts are the higher layers of the FPs in the derived tree. (20c) is possible because an adverbial phrase can adjoin onto FP or F′ in the equative copula tree, in which case, the clefted constituent and the cleft clause would be separated by the adverbial phrase in the derived tree.8
Figure 9 Derivation and derived trees for ‘It was Ohno who won’. 7 By convention, names of derivation trees are prefixed with d, and names of derived trees are prefixed with c. 8 See Han & Hedberg (2006) for a TAG analysis of coordination in it-clefts, as exemplified in (20b).
(20) a. I said it should have been [Bill who negotiated the new contract], and it should have been. b. It must have been [Fred that kissed Mary] but [Bill that left with her]. c. It was Kim, in my opinion, who won the race.
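The derivation tree (d18) can also be recorded as plain data in which the whole multi-component set counts as a single daughter of the copula tree. The encoding below is ours and purely illustrative; attachment sites are abbreviated by node label, as in footnote 5.

    # each derivation-tree node: (tree or multi-component set, attachment into the mother, daughters)
    d18 = ('a_was', None, [
        ('{a_it, b_who_won}', 'substitution at DP0 / adjoining at FP', []),
        ('a_Ohno', 'substitution at DP1', []),
    ])

    def history(node, depth=0):
        # print the composition history encoded in a derivation tree of this form
        name, attachment, daughters = node
        note = '  <- ' + attachment if attachment else ''
        print('  ' * depth + name + note)
        for d in daughters:
            history(d, depth + 1)

    history(d18)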
Figure 10 Predicational copula tree.
For the derivation of predicational it-clefts as in (19), we adopt a predicational copula tree (awas_kid) in Figure 10. The predicational copula tree in (awas_kid) is similar to the equative copula tree in (awas) in that in both trees, the copula combines with a small clause FP. But the two trees have different anchors and different numbers of argument substitution sites. In (awas_kid), the noun (kid) is the predicate requiring a single argument, and thus the noun (kid) is the lexical anchor of the tree and the subject DP is an argument substitution site. But in (awas), both the subject and the non-subject DPs are argument substitution sites as they are arguments of an equative predicate. As illustrated in Figure 11, (19) is derived by substituting (ait) into DP0 and adjoining (bwho_beat) onto FP in (awas_kid), and substituting (aJohn) into DP in (bwho_beat). The syntactic derivation tree and the derived tree for (19) are given in (d19) and (c19), respectively, in Figure 12. Just as in the derivation tree and the derived tree for the equative it-cleft in Figure 9, in (d19), the elementary trees for the cleft pronoun and the cleft clause form a unit, represented as a single node, and in (c19), the clefted constituent and the cleft clause form a constituent.
Figure 12 Syntactic derivation and derived trees for ‘It was a kid who beat John’.
Figure 11 Elementary trees for ‘It was a kid who beat John’.
4 SEMANTICS OF IT-CLEFTS
In TAG, the derivation tree, not the derived tree, serves as the input to compositional semantics (Joshi & Vijay-Shanker 1999; Kallmeyer & Joshi 2003). While phrase structure-based compositional semantics computes the meaning of a sentence as a function of the meaning of each node in
the syntactic tree, TAG-based compositional semantics computes the meaning of a sentence as a function of the meaning of the elementary trees put together to derive the sentence structure. Each syntactic elementary tree is associated with a semantic representation, and following the history of how the elementary trees are put together to derive the sentence structure, the corresponding semantic representation is computed by combining the semantic representations of the elementary trees. There are two main approaches to doing compositional semantics on the derivation tree: (i) flat semantics (Joshi & Vijay-Shanker 1999; Kallmeyer & Joshi 2003; Romero & Kallmeyer 2005; Kallmeyer & Romero 2008); and (ii) STAG (Shieber & Schabes 1990; Abeillé 1994; Shieber 1994). Under the flat semantics approach, in the style of Minimal Recursion Semantics (Copestake et al. 2005), the main operation for semantic composition is the conjunction of the semantic representations associated with each elementary tree, along with the unification of variables contributed by these semantic representations. In Romero & Kallmeyer (2005) and Kallmeyer & Romero (2008), derivation trees are augmented with feature structures to enforce variable unification. The theory of semantic representations developed by Kallmeyer and Romero has been used in a range of empirical work: pied-piping of wh-phrases (Kallmeyer & Scheffler 2004), focus (Babko-Malaya 2004), questions (Romero et al. 2004), VP coordination (Banik 2004), among others. In this paper, however, we use STAG, a pairing of a TAG for the syntax and a TAG for the semantics, to propose a compositional semantic analysis for it-clefts. In STAG-based compositional semantics, the semantic representations are structured trees with nodes on which substitution and adjoining of other semantic representations can take place. Compositionality obtains through the requirement that the derivation tree in syntax and the corresponding derivation tree in semantics be isomorphic, as specified in Shieber (1994). This isomorphism requirement guarantees that the derivation tree in syntax determines the meaning components needed for semantic composition, and the way these meaning components are combined. Since the semantic representations are structured trees, the semantic objects and the composition of these objects parallel those already utilized in syntax, and so computing semantics only requires the operations of substitution and adjoining used to build the syntactic structures. These properties of STAG allow us to define a simple and elegant syntax–semantics mapping, as has been shown by Nesson & Shieber (2006), who provide a STAG analysis for various linguistic phenomena, including quantifier scope, long distance wh-movement,
subject-to-subject raising, and nested quantifiers and inverse linking, and Han (2007), who provides a STAG analysis for relative clauses and pied-piping. In section 4.1, we introduce the basics of STAG and STAG-based compositional semantics, and in section 4.2, we present our proposed analysis for the semantic composition of it-clefts.
4.1 Introduction to STAG and compositional semantics
We illustrate the framework of STAG and STAG-based compositional semantics and clarify our assumptions, using (21), a simple sentence that contains an existential quantifier and an attributive adjective. A similar example was used in section 3 to illustrate the syntactic derivation in TAG.
(21) John saw a scary movie.
We use STAG as defined in Shieber (1994). In STAG, each syntactic elementary tree is paired with one or more semantic trees that represent its meaning, with links between matching nodes. A synchronous derivation proceeds by mapping a derivation tree from the syntax side to an isomorphic derivation tree on the semantics side, and is synchronized by the links specified in the elementary tree pairs. In the tree pairs given in Figure 13, the trees on the left side are syntactic elementary trees and the ones on the right side are semantic trees. In the semantic trees, F stands for formulas, R for predicates and T for terms. We assume that these nodes are typed (e.g. the F node in (a#saw) has type t and the lowest R node in (a#saw) has type ⟨e,⟨e,t⟩⟩), and we represent predicates as unreduced λ-expressions, following the notation in Han (2007). Making use of unreduced λ-expressions in semantic trees allows the reduction of semantic derived trees to logical forms through the application of λ-conversion and other operations defined on λ-expressions. The linked nodes are shown with boxed numbers. For the sake of simplicity, in the elementary tree pairs, we only include links that are relevant for the derivation of given examples.9 Figure 13 contains elementary trees required to generate the syntactic structure and the logical form of (21). The proper name tree in (aJohn) is paired with a tree representing a term on the semantics side, and the attributive adjective tree in (bscary) is paired with an auxiliary tree on the semantics side that represents a one-place predicate to be adjoined to another one-place predicate. For quantified DPs, we follow Shieber & Schabes (1990) and Nesson & Shieber (2006), and use tree-local MC-TAG on the semantics side. Thus, the DP in (aa_movie)
9 By convention, names of semantic elementary trees are prefixed with a# or b#, names of semantic derivation trees are prefixed with d# and names of semantic derived trees are prefixed with c#.
Figure 13 Syntactic and semantic elementary trees for ‘John saw a scary movie’.
is paired with a multi-component set {(a#a_movie), (b#a_movie)} on the semantics side: (a#a_movie) provides an argument variable and (b#a_movie) provides an existential quantifier with the restriction and scope. The transitive tree in (asaw) is paired with a semantic tree representing a formula that consists of a two-place predicate and two term nodes. The links, notated with boxed numbers, guarantee that whatever substitutes into DPi, its corresponding semantic tree will substitute into the term node marked with 1, and whatever substitutes into DP is paired up with a multi-component set on the semantics side where one of the components will substitute into the term node marked with 2 and the other will adjoin to the F node marked with 2. The syntactic and semantic derivation trees are given in Figure 14, and the derived trees are given in Figure 15. Technically, there is only one derivation tree because the syntactic and semantic derivations are
Chung-Hye Han and Nancy Hedberg 365
Figure 14 Syntactic and semantic derivation trees for ‘John saw a scary movie’.
isomorphic. In this paper, we provide two derivation trees (one for syntax and the other for semantics) throughout to make the tree-local derivation explicit.10 The semantic derived trees can be reduced by applying λ-conversion, as the nodes dominate typed λ-expressions and terms. When reducing the semantic derived trees, in addition to λ-conversion, we propose to use Predicate Modification, as defined in Heim & Kratzer (1998) and given in (22).
(22) Predicate Modification
If α has the form [α β γ], and ⟦β⟧ˢ and ⟦γ⟧ˢ are both in D⟨e,t⟩, then ⟦α⟧ˢ = λxₑ.⟦β⟧ˢ(x) ∧ ⟦γ⟧ˢ(x).
In semantic derivation trees, we do not annotate the connections between a mother and a daughter node with the location of adjoining or substitution that has taken place in the mother elementary tree, as this is determined by the links between syntactic and semantic elementary trees.
Figure 15 Syntactic and semantic derived trees for ‘John saw a scary movie’.
The application of Predicate Modification and λ-conversion reduces (c#21) to the formula in (23).
(23) ∃y[scary(y) ∧ movie(y)] [saw(John, y)]
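For concreteness, the reduction to (23) can be spelled out; the intermediate λ-term below is our own reconstruction in the notation of (22) and (23), not an expression quoted from the figures.

    % Predicate Modification conjoins the two <e,t> predicates contributed by scary and movie:
    \lambda y.\, scary(y) \wedge movie(y)
    % This conjunction serves as the restriction of the existential quantifier contributed by the
    % semantic multi-component set for 'a movie'; lambda-conversion on the remaining argument
    % slots then yields (23):
    \exists y\,[\, scary(y) \wedge movie(y)\,]\;[\, saw(John, y)\,]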
4.2 Our TAG analysis of the semantics of it-clefts
Figure 16 Syntactic and semantic elementary trees for ‘It was Ohno who won’. 11 In (b#who_won), the R node represents the semantics of the relative clause who won. This is a product of composing the semantics of the relative pronoun who and the semantics of the rest of the relative clause. Here, to simplify the derivation and to streamline the discussion, we skipped a step in the derivation with separate semantic trees for the relative pronoun and the rest of the relative clause. For a detailed analysis of the compositional semantics of relative clauses using STAG, see Han (2007).
The elementary tree pairs required for the syntax–semantics mapping of the equative it-cleft in (18) are given in Figure 16. (a#it) and (b#who_won) in the multi-component set in Figure 16 together define the semantics of definite quantification, where the former contributes the argument variable and the latter the definite quantifier, the restriction and scope, and (a#was) represents the semantics of equative sentences.11 The derivation tree for the semantics of (18) is given in (d#18) in Figure 17
Figure 17 Syntactic and semantic derivation trees for ‘It was Ohno who won’.
and the semantic derived tree is given in (c#18) in Figure 18. Note that the semantic derivation tree in (d#18) is isomorphic to the syntactic one in (d18). The semantic derived tree in (c#18) can be reduced to the formula in (24) after the application of λ-conversion.
(24) THEz [won(z)] [z = Ohno]
The elementary tree pairs required for the syntax–semantics mapping of the predicational it-cleft in (19) are given in Figure 19. The difference between the semantics of equative sentences and predicational sentences is represented by the two different semantic trees, (a#was) in Figure 16 and (a#was_kid) in Figure 19. While (a#was) in Figure 16 represents the semantics of equative sentences and has two term nodes with a two-place equative predicate anchoring the tree, (a#was_kid) in Figure 19 represents the semantics of predicational sentences and has one term node with a one-place predicate, λx.kid(x), anchoring the tree. The syntactic and semantic derivation trees for (19),
Figure 18 Syntactic and semantic derived trees for ‘It was Ohno who won’.
which are isomorphic, are given in Figure 20, and the corresponding derived trees are given in Figure 21. The semantic derived tree in (c#19) can be reduced to the formula in (25) after the application of λ-conversion.
(25) THEz [beat(z, John)] [kid(z)]
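The operator THE in (24) and (25) is the definite quantifier contributed by the multi-component set for the cleft pronoun and the cleft clause. As a gloss of our own (the truth conditions are not spelled out further here), one standard Russellian unpacking makes the equative/predicational contrast explicit:

    % (24), equative: the unique winner is identical to Ohno
    \exists z\,[\, won(z) \wedge \forall w\,(won(w) \rightarrow w = z) \wedge z = Ohno\,]
    % (25), predicational: the unique individual who beat John has the property of being a kid
    \exists z\,[\, beat(z, John) \wedge \forall w\,(beat(w, John) \rightarrow w = z) \wedge kid(z)\,]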
5 CONNECTIVITY
5.1 Agreement
In equative it-clefts, the cleft pronoun is always singular and agrees with the copula, but the clefted constituent can be either singular or plural.
Figure 19 Syntactic and semantic elementary trees for ‘It was a kid who beat John’.
Figure 20 Syntactic and semantic derivation trees for ‘It was a kid who beat John’.
Further, when the cleft clause is a subject relative clause, the clefted constituent agrees with the verb in the cleft clause in person and number. This is illustrated in (16), repeated here as (26). This apparent agreement between the clefted constituent and the verb in the cleft clause, even though they are not in the same clause in our analysis, gives rise to a connectivity effect. (26) a. It is John and Mary who like Pete. b. *It is John and Mary who likes Pete. c. *It are John and Mary who like Pete. We point out that agreement across clauses is not unique to it-clefts. In (27), the subject of the main clause John and Mary agrees with the
Figure 21 Syntactic and semantic derived trees for ‘It was a kid who beat John’.
copula of the non-restrictive relative clause. So, there is independent motivation for a mechanism in the grammar that allows agreement across clauses in appropriate syntactic contexts. (27) John and Mary, who are students, came to see me.
Figure 22 Derivation of ‘It is John and Mary who like Pete’. 12
An anonymous reviewer asks why the agreement feature on T in (bwho_like) is not valued as plural. We chose to leave it unspecified, as it is compatible with third person plural as well as second and first person singular and plural.
The agreement phenomena in it-clefts can be easily accommodated by our TAG analysis, with the addition of feature unification (Vijay-Shanker & Joshi 1988). We will postulate an agreement feature attribute, Agr, that can have feature values such as third person singular (3sg) or third person plural (3pl). This Agr feature can also be unspecified in an elementary tree and obtain a value through feature unification as it composes with another elementary tree. An unspecified Agr feature has an arbitrary index as a temporary value, and Agr features with the same indices must have the same value at the end of the derivation. Figure 22 illustrates how our TAG analysis can capture the agreement between the cleft pronoun it and the copula is, and between the clefted constituent John and Mary and the verb of the cleft clause like, in (26a).12 To simplify the discussion, we have already derived the DP coordination tree for John and Mary and referred to it as (aand), and substituted the DP tree anchoring Pete into (bwho_like). The substitution of (ait) into DP0 in (ais) is licensed because DP in (ait) has an [Agr:3sg] feature which unifies with [Agr:3sg] in DP0 in (ais). And the agreement between it and is is guaranteed as both DP0 and T in
the (ais) tree have the same agreement features, as indicated by the coindexation between the agreement feature on DP0 and the third person singular feature in T. As the (aand) tree substitutes into DP1 in (ais), the [Agr:4] feature on FP is valued as 3pl. As the (bwho_like) tree adjoins onto FP in (ais), DP1 and T in (bwho_like) are valued as 3pl as well. This will guarantee the agreement between John and Mary and like. The derived tree with all the Agr features valued and unified is in Figure 23. In predicational it-clefts, the cleft pronoun can be plural, and it must agree with the copula as well as the clefted constituent. Moreover, if the cleft clause is a subject relative clause, then the clefted constituent must agree with the verb of the cleft clause, even though they are not in the same clause in our analysis, giving rise to a connectivity effect. This is illustrated in (17), repeated here as (28).
Figure 23 Syntactic derived tree for ‘It is John and Mary who like Pete’.
(28) a. They’re just fanatics who are holding him. b. Those are students who are rioting. c. Those are kids who beat John.
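Before turning to the predicational case, the unification step behind Figure 22 can be illustrated with a toy sketch. The Python code below is ours, uses hypothetical names, and stands in for the feature-based TAG machinery of Vijay-Shanker & Joshi (1988) rather than reproducing it.

    class UnificationFailure(Exception):
        pass

    def unify(a, b):
        # a feature is either a concrete value ('3sg', '3pl') or None (unspecified, i.e. an
        # indexed placeholder that is valued later in the derivation)
        if a is None:
            return b
        if b is None or a == b:
            return a
        raise UnificationFailure(a + ' does not unify with ' + b)

    # (26a) It is John and Mary who like Pete.
    agr_it, agr_is = '3sg', '3sg'
    agr_and = '3pl'                 # John and Mary
    agr_T_cleft = None              # T in (b_who_like), left unspecified (footnote 12)

    assert unify(agr_it, agr_is) == '3sg'        # cleft pronoun agrees with the copula
    agr_FP = unify(None, agr_and)                # the Agr feature on FP is valued 3pl by the subject
    assert unify(agr_T_cleft, agr_FP) == '3pl'   # so 'like', not 'likes', in the cleft clause

    # (26c) *It are John and Mary ...: 'are' carries 3pl, which clashes with the 3sg cleft pronoun
    try:
        unify('3sg', '3pl')
    except UnificationFailure:
        pass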
How our TAG analysis can capture the agreement phenomena in predicational it-clefts is illustrated in Figure 24.13 To simplify the discussion, we have already substituted the DP tree anchoring John into (bwho_beat). In our TAG analysis, the lexical anchor of a predicational copula elementary tree is the predicative noun, as in (aare_kids). In this tree, the agreement between the cleft pronoun, the copula and the predicative noun is guaranteed: DP0, T and DP all have the same agreement features as they all have the same indices. Here, they all have third person plural features as the DP containing the predicative noun is specified with the third person plural feature. The substitution of the (athose) tree into DP0 in (aare_kids) is licensed because DP in (athose) has an [Agr:3pl] feature which unifies with the third person plural feature in DP0 in (aare_kids). As the (bwho_beat) tree adjoins onto FP in (aare_kids), DP1 and T in (bwho_beat) will obtain the 3pl value as well. This will guarantee the agreement between kids and beat. The derived tree with all the Agr features valued and unified is given in Figure 25.14
Figure 24 Derivation of ‘Those are kids who beat John’.
13 We left the agreement feature on T in (bwho_beat) unspecified for the same reason we left it unspecified in (bwho_like): it is compatible with third person plural, and second and first person singular and plural. 14 Why equative clefts require singular cleft pronouns when they contain a plural clefted constituent does not follow from our theory and remains a puzzle. However, the fact that different agreement patterns occur shows that there are clearly two types of it-cleft.
Figure 25 Syntactic derived tree for ‘Those are kids who beat John’.
5.2 Binding
In it-clefts, even though the clefted constituent is not c-commanded by the subject of the cleft clause, a SELF-anaphor in the clefted constituent can be co-indexed with the subject in the cleft clause as in (12a), repeated here as (29a), and a pronoun in the clefted constituent cannot be co-indexed with the subject in the cleft clause as in (13a), repeated here as (29b). In other words, the SELF-anaphor and the pronoun behave as if they are inside the cleft clause as in (30a) and (30b), giving rise to a connectivity effect. (29) a. It was himselfi who Johni nominated. b. *It was himi who Johni nominated. (30) a. Johni nominated himselfi. b. *Johni nominated himi. We will use the Binding Conditions defined in Reinhart & Reuland (1993) to account for this phenomenon. The formulation of the Binding Conditions by Reinhart and Reuland and the definitions needed to understand it are given in (31) and (32). Condition A constrains the distribution of SELF-anaphors and Condition B constrains the distribution of pronouns. (31) Binding Conditions (Reinhart & Reuland 1993) a. A: If a syntactic predicate is reflexive-marked, it is reflexive. b. B: If a semantic predicate is reflexive, it is reflexive-marked.
(32) Definitions (Reinhart & Reuland 1993) a. The syntactic predicate formed of a head P is P, all its syntactic arguments (the projections assigned theta-roles/case by P), and an external argument of P. b. The semantic predicate of P is P and all its arguments at the relevant semantic level. c. P is reflexive iff two of its arguments are co-indexed. d. P is reflexive-marked iff either P is lexically reflexive or one of P’s arguments is a SELF-anaphor.
According to Reinhart and Reuland, Condition A successfully applies to (30a) because the syntactic predicate ‘John nominated himself ’ is reflexive-marked, as one of the arguments, himself, is a SELF-anaphor, and it is also reflexive, as two of its arguments, John and himself, are co-indexed. However, (30b) is ruled out by Condition B. In (30b), the semantic predicate nominated(John, John) is reflexive, as two of its arguments are co-indexed, but it is not reflexive-marked, as nominated is not lexically reflexive and none of nominated’s arguments is a SELF-anaphor. We first apply Condition B of Reinhart and Reuland to rule out (29b), repeated below as (33a). According to our TAG analysis, (33a) would map onto an equative semantic representation as in (33b). Since the clefted constituent him is co-indexed with John, they corefer, and so the variable from the cleft pronoun, z, would be equated with John. We will represent this as z = himJohn, just to be explicit about the fact that the form of the clefted constituent here is him. This in turn means that the semantic predicate nominated(John, z) is reflexive. But it is not reflexive-marked, as nominated is not lexically reflexive and none of its arguments is a SELF-anaphor. (33) a. *It was himi who Johni nominated. b. *THEz [nominated(John, z)] [z = himJohn] We now turn to (29a). According to our TAG analysis, (29a) is also an equative sentence. We thus have a syntactic predicate whose head is the equative copula and with two syntactic arguments, it and himself. But then Condition A should rule out this sentence because even
though the syntactic predicate is reflexive-marked, it is not reflexive, as it and himself are not co-indexed. Reinhart and Reuland point out that focus anaphors can occur in an argument position without a binder, appearing to be exempt from Condition A. Such anaphors are also known as discourse anaphors of focus or emphatic anaphors (Kuno 1987; Zribi-Hertz 1989). Some examples are given in (34).
(34) a. This letter was addressed only to myself. (Reinhart & Reuland 1993, ex. 27a) b. ‘Bismarck’s impulsiveness has, as so often, rebounded against himself ’. (Reinhart & Reuland 1993, ex. 27c, originally quoted in Zribi-Hertz 1989)
We note that the clefted constituent is a focused position (Akmajian 1970a; Prince 1978). This means that a SELF-anaphor in a clefted constituent position is always focused, and so it can be exempt from Condition A. Further support for this view comes from examples such as those in (35). These examples are acceptable even though myself and yourself do not have possible binders in the sentences in which they occur. (35) a. It was myself who John nominated. b. It was yourself who John nominated. A question remains, though, as to why the clefted constituent cannot be occupied by just any SELF-anaphor. For instance, (36) is degraded where herself in the clefted constituent position does not have a binder. (36) *It was herself who John nominated. This implies that even though a focus anaphor in the clefted constituent position is not subject to Condition A, its distribution is constrained by discourse factors. The exact nature of the discourse constraints on the distribution of focus anaphors in it-clefts remains to be investigated.
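The way Conditions A and B discriminate between (30a) and (30b), and between (29a) and (29b), can be checked mechanically. The sketch below is our own toy encoding of the definitions in (31) and (32); it deliberately collapses the syntactic/semantic predicate distinction and ignores the focus exemption just discussed.

    from dataclasses import dataclass
    from typing import List, Optional

    @dataclass
    class Argument:
        index: Optional[str]        # referential index, e.g. 'i'
        self_anaphor: bool = False  # himself, myself, ...

    @dataclass
    class Predicate:
        name: str
        args: List[Argument]
        lexically_reflexive: bool = False

    def reflexive(p: Predicate) -> bool:
        # (32c): two of its arguments are co-indexed
        idx = [a.index for a in p.args if a.index is not None]
        return len(idx) != len(set(idx))

    def reflexive_marked(p: Predicate) -> bool:
        # (32d): lexically reflexive, or one of its arguments is a SELF-anaphor
        return p.lexically_reflexive or any(a.self_anaphor for a in p.args)

    def condition_A(p: Predicate) -> bool:
        return (not reflexive_marked(p)) or reflexive(p)

    def condition_B(p: Predicate) -> bool:
        return (not reflexive(p)) or reflexive_marked(p)

    # (30a) Johni nominated himselfi   vs.   (30b) *Johni nominated himi
    ok = Predicate('nominated', [Argument('i'), Argument('i', self_anaphor=True)])
    bad = Predicate('nominated', [Argument('i'), Argument('i')])
    assert condition_A(ok) and condition_B(ok)
    assert not condition_B(bad)       # (30b) is ruled out by Condition B

    # (33b): nominated(John, z) with z co-valued with John is reflexive but not
    # reflexive-marked, so the pronominal cleft (29b) violates Condition B as well
    cleft_bad = Predicate('nominated', [Argument('i'), Argument('i')])
    assert not condition_B(cleft_bad)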
6 CONCLUSION
We have proposed a syntax and semantics of it-clefts, using tree-local MC-TAG and STAG. We accounted for the equative and predicational interpretations available to it-clefts, the two readings that are also available to simple copula sentences, by postulating two types of copula sentences in English, an equative one and a predicational one (Heycock & Kroch 1999). The two types of copula sentences are represented by two
different pairs of syntactic and semantic elementary trees. Our analysis thus contrasts with the inverse analysis of Williams (1983), Partee (1986), Moro (1997) and Mikkelsen (2005), according to which specificational clauses (our equatives) are inverted predicational clauses. On some versions of this analysis, both orders derive from an underlying embedded small clause, with either the subject or the predicate raising to matrix subject position. In our TAG analysis, the derivation of it-clefts starts either with an equative copula elementary tree or with a predicational copula elementary tree. The copula tree then composes with the elementary tree for the cleft pronoun and the elementary tree for the cleft clause. In our analysis, the cleft pronoun and the cleft clause bear a direct syntactic relation because the elementary trees for the two parts belong to a single multi-component set. They do not actually form a syntactic constituent in the derived tree, but as the elementary trees for the two belong to the same multi-component set, the intuition that they form a syntactic unit is captured, represented in the derivation tree as a single node. At the same time, the surface syntactic constituency is represented in the derived tree where the clefted constituent and the cleft clause form a constituent. Further, the semantics of the two trees in the multi-component set is defined as a definite quantified phrase, capturing the intuition that they form a semantic unit as a definite description. We have also shown that our TAG analysis can account for connectivity effects instantiated by binding and agreement: for binding, we applied Binding Conditions of Reinhart & Reuland (1993) and exploited the fact that the clefted constituent is a focused position, and for agreement, we added feature unification to our TAG analysis. The distinction in TAG between the derivation tree and the derived tree enabled us to resolve the tension between the surface constituency and the syntactic and semantic dependency in it-clefts: in the derived tree, the cleft clause forms a constituent with the clefted constituent, not with the cleft pronoun, capturing the insight from the expletive approach, but in the derivation tree, the cleft clause and the cleft pronoun form a syntactic/semantic unit, capturing the insight from the discontinuous constituent approach. The extended domain of locality of TAG and the ability to decompose an elementary tree to a set of trees in MC-TAG enabled us to provide a straightforward syntactic account of the discontinuous constituent property of the cleft pronoun and the cleft clause without having to adopt movement to produce the effect of extraposition of the cleft clause. Moreover, the derivation tree-based compositional semantics and the direct syntax–semantics mapping in STAG enabled us to provide a simple compositional semantics for
it-clefts without using an ad hoc interpretive operation to associate the meaning coming from the cleft pronoun and the meaning coming from the cleft clause. It remains as future work to extend our analysis to it-clefts that have non-DP clefted constituents, such as ‘It was to the library that John went’ and ‘It was happily that John quit his job’.
Acknowledgements
We thank the audience at TAG+8 in Sydney, 2006, for comments and questions on the previous version of this paper. We are also extremely indebted to the two anonymous reviewers for their insightful comments that were crucial in improving this paper. All remaining errors are ours. This work was supported by SSHRC 410-2003-0544 and NSERC RGPIN 341442 to Han, and SSHRC 410-2007-0345 to Hedberg.
CHUNG-HYE HAN AND NANCY HEDBERG
Department of Linguistics
Simon Fraser University
8888 University Drive
Burnaby BC V5A 1S6
Canada
e-mail: [email protected], [email protected]
REFERENCES
Abeillé, Anne (1994), ‘Syntax or semantics? Handling nonlocal dependencies with MCTAGs or Synchronous TAGs’. Computational Intelligence 10:471–85.
Abney, Steven (1987), The English Noun Phrase in Its Sentential Aspect. Doctoral dissertation, MIT, Cambridge, MA.
Akmajian, Adrian (1970a), Aspects of the Grammar of Focus in English. Doctoral dissertation, MIT, Cambridge, MA.
Akmajian, Adrian (1970b), ‘On deriving cleft sentences from pseudo-cleft sentences’. Linguistic Inquiry 1:149–68.
Babko-Malaya, Olga (2004), ‘LTAG semantics of focus’. In Proceedings of TAG+7. Vancouver, Canada. 1–8.
Ball, Catherine N. (1977), ‘Th-clefts’. Pennsylvania Review of Linguistics 2:57–69.
Banik, Eva (2004), ‘Semantics of VP coordination in LTAG’. In Proceedings of TAG+7. Vancouver, Canada. 118–25.
Bleam, Tonia (2000), ‘Clitic climbing and the power of Tree Adjoining Grammar’. In Anne Abeillé and Owen Rambow (eds.), Tree Adjoining Grammars: Formalisms, Linguistic Analysis, and Processing. CSLI. Stanford, CA. 193–220.
Chomsky, Noam (1977), ‘On wh-movement’. In P. W. Culicover, T. Wasow and A. Akmajian (eds.), Formal Syntax. Academic Press. New York. 71–132.
Copestake, Ann, Dan Flickinger, Ivan A. Sag & Carl Pollard (2005), ‘Minimal recursion semantics: an introduction’. Journal of Research on Language and Computation 3:281–332.
DeClerck, Renaat (1988), Studies on Copular Sentences, Clefts and Pseudoclefts. Foris. Dordrecht, The Netherlands.
Delahunty, Gerald P. (1982), Topics in the Syntax and Semantics of English Cleft Sentences. Indiana University Linguistics Club. Bloomington, IN.
Delin, Judy L. (1989), Cleft Constructions in Discourse. Doctoral dissertation, University of Edinburgh, Edinburgh.
É. Kiss, Katalin (1998), ‘Identificational focus versus information focus’. Language 74:245–73.
Emonds, Joseph E. (1976), A Transformational Approach to English Syntax. Academic Press. New York.
Frank, Robert (2002), Phrase Structure Composition and Syntactic Dependencies. MIT Press. Cambridge, MA.
Grimshaw, Jane (1991), Extended Projection. Unpublished MS, Brandeis University, Waltham, MA.
Gundel, Jeanette K. (1977), ‘Where do cleft sentences come from?’ Language 53:543–59.
Han, Chung-hye (2007), ‘Pied-piping in relative clauses: syntax and compositional semantics using Synchronous Tree Adjoining Grammar’. Research on Language and Computation 5:457–79.
Han, Chung-hye & Nancy Hedberg (2006), ‘A Tree Adjoining Grammar Analysis of the Syntax and Semantics of It-clefts’. In Proceedings of the 8th International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+8). COLING-ACL Workshop. Sydney, Australia. 33–40.
Hedberg, Nancy (1990), Discourse Pragmatics and Cleft Sentences in English. Doctoral dissertation, University of Minnesota, Minneapolis, MN.
Hedberg, Nancy (2000), ‘The referential status of clefts’. Language 76:891–920.
Heggie, Lorie A. (1988), The Syntax of Copular Structures. Doctoral dissertation, University of Southern California, Los Angeles, CA.
Heycock, Caroline & Anthony Kroch (1999), ‘Pseudocleft connectedness: implications for the LF interface’. Linguistic Inquiry 30:365–97.
Higginbotham, James (1985), ‘On semantics’. Linguistic Inquiry 16:547–94.
Jespersen, Otto (1927), A Modern English Grammar, vol. 3. Allen and Unwin. London.
Jespersen, Otto (1937), Analytic Syntax. Allen and Unwin. London.
Joshi, Aravind K. (1985), ‘Tree Adjoining Grammars: how much context sensitivity is required to provide a reasonable structural description’. In D. Dowty, L. Karttunen and A. Zwicky (eds.), Natural Language Parsing. Cambridge University Press. Cambridge, UK. 206–50.
Joshi, Aravind K., L. Levy & M. Takahashi (1975), ‘Tree adjunct grammars’. Journal of Computer and System Sciences 10:136–63.
Joshi, Aravind K. & K. Vijay-Shanker (1999), ‘Compositional semantics with Lexicalized Tree-Adjoining Grammar (LTAG): how much underspecification is necessary?’ In H. C. Blunt and E. G. C. Thijsse (eds.), Proceedings of the Third International Workshop on Computational Semantics (IWCS-3). Tilburg. 131–45.
Joshi, Aravind K., K. Vijay-Shanker & David Weir (1991), ‘The convergence of mildly context-sensitive grammatical formalisms’. In Peter Sells, Stuart Shieber and Tom Wasow (eds.), Foundational Issues in Natural Language Processing. MIT Press. Cambridge, MA. 31–82.
Kallmeyer, Laura & Aravind K. Joshi (2003), ‘Factoring predicate argument and scope semantics: underspecified semantics with LTAG’. Research on Language and Computation 1:3–58.
Kallmeyer, Laura & Maribel Romero (2008), ‘Scope and situation binding in LTAG using semantic unification’. Research on Language and Computation 6:3–52.
Kallmeyer, Laura & Tatjana Scheffler (2004), ‘LTAG analysis for pied-piping and stranding of wh-phrases’. In Proceedings of TAG+7. Vancouver, Canada. 32–9.
Koopman, Hilda & Dominique Sportiche (1991), ‘The position of subjects’. Lingua 85:211–58.
Kroch, Anthony (1989), ‘Asymmetries in long-distance extraction in a Tree Adjoining Grammar’. In Mark Baltin and Anthony Kroch (eds.), Alternative Conceptions of Phrase Structure. University of Chicago Press. Chicago, IL. 66–98.
Kroch, Anthony & Aravind Joshi (1985), ‘Linguistic relevance of Tree Adjoining Grammar’. Technical Report MS-CS-85-16. Department of Computer and Information Sciences, University of Pennsylvania.
Kroch, Anthony S. & Aravind K. Joshi (1987), ‘Analyzing extraposition in a Tree Adjoining Grammar’. In G. Huck and A. Ojeda (eds.), Discontinuous Constituents, volume 20 of Syntax and Semantics. Academic Press. Orlando, FL. 107–49.
Kroch, Anthony & Beatrice Santorini (1991), ‘The derived constituent structure of the West Germanic verb raising construction’. In Robert Freidin (ed.), Principles and Parameters in Comparative Grammar. MIT Press. Cambridge, MA. 269–338.
Kuno, Susumu (1987), Functional Syntax: Anaphora, Discourse and Empathy. University of Chicago Press. Chicago, IL.
Mikkelsen, Lina (2005), Copular Clauses: Specification, Predication and Equation. John Benjamins. Amsterdam.
Moro, Andrea (1997), The Raising of Predicates: Predicative Noun Phrases and the Theory of Clause Structure. Cambridge University Press. Cambridge.
Nesson, Rebecca & Stuart M. Shieber (2006), ‘Simpler TAG semantics through synchronization’. In Proceedings of the 11th Conference on Formal Grammar. CSLI. Malaga, Spain. 103–17.
Partee, Barbara (1986), ‘Ambiguous pseudoclefts with unambiguous be’. In S. Berman, J. Choe and J. McDonough (eds.), Proceedings of NELS, vol. 16. GLSA, University of Massachusetts. Amherst, MA. 354–66.
Percus, Orin (1997), ‘Prying open the cleft’. In K. Kusumoto (ed.), Proceedings of the 27th Annual Meeting of the North East Linguistics Society. GLSA. Amherst, MA. 337–51.
Prince, Ellen (1978), ‘A comparison of wh-clefts and it-clefts in discourse’. Language 54:883–906.
Reinhart, Tanya & Eric Reuland (1993), ‘Reflexivity’. Linguistic Inquiry 24:657–720.
Rochemont, Michael (1986), Focus in Generative Grammar. John Benjamins. Amsterdam, The Netherlands.
Romero, Maribel & Laura Kallmeyer (2005), ‘Scope and situation binding in LTAG using semantic unification’. In Proceedings of the Sixth International Workshop on Computational Semantics (IWCS-6). Tilburg.
Romero, Maribel, Laura Kallmeyer & Olga Babko-Malaya (2004), ‘LTAG semantics for questions’. In Proceedings of TAG+7. Vancouver, Canada. 186–93.
Shieber, Stuart (1994), ‘Restricting the weak-generative capacity of Synchronous Tree-Adjoining Grammars’. Computational Intelligence 10:271–385.
Shieber, Stuart & Yves Schabes (1990), ‘Synchronous Tree Adjoining Grammars’. In Proceedings of COLING’90. Helsinki, Finland. 253–8.
Vijay-Shanker, K. & Aravind K. Joshi (1988), ‘Feature structure based Tree Adjoining Grammars’. In Proceedings of the 12th International Conference on Computational Linguistics. Budapest, Hungary. 714–9.
Weir, David (1988), Characterizing Mildly Context-sensitive Grammar Formalisms. Doctoral dissertation, University of Pennsylvania. Philadelphia, PA.
Williams, Edwin (1980), ‘Predication’. Linguistic Inquiry 11:203–38.
Williams, Edwin (1983), ‘Semantic vs. syntactic categories’. Linguistics and Philosophy 6:423–46.
Wirth, Jessica R. (1978), ‘The derivation of cleft sentences in English’. Glossa 12:58–82.
Zribi-Hertz, Anne (1989), ‘A-type binding and narrative point of view’. Language 65:695–727.
First version received: 17.12.2007
Second version received: 08.04.2008
Accepted: 06.05.2008