Roots
Studies in Generative Grammar 96
Editors
Henk van Riemsdijk Jan Koster Harry van der Hulst
Mouton de Gruyter Berlin · New York
Roots: Linguistics in Search of its Evidential Base
Edited by
Sam Featherston Wolfgang Sternefeld
Mouton de Gruyter (formerly Mouton, The Hague) is a Division of Walter de Gruyter GmbH & Co. KG, Berlin.
The series Studies in Generative Grammar was formerly published by Foris Publications Holland.
Printed on acid-free paper which falls within the guidelines of the ANSI to ensure permanence and durability.
Library of Congress Cataloging-in-Publication Data Roots : linguistics in search of its evidential base / edited by Sam Featherston, Wolfgang Sternefeld. p. cm. – (Studies in generative grammar ; 96) Includes bibliographical references and index. ISBN 978-3-11-019315-2 (cloth : alk. paper) 1. Linguistic analysis (Linguistics) 2. Linguistics – Research – Methodology. 3. Corpora (Linguistics) 4. Computational linguistics. I. Featherston, Sam. II. Sternefeld, Wolfgang, 1953– P126.R665 2007 410–dc22 2007044365
Bibliographic information published by the Deutsche Nationalbibliothek The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available in the Internet at http://dnb.d-nb.de.
ISBN 978-3-11-019315-2 ISSN 0167-4331 © Copyright 2007 by Walter de Gruyter GmbH & Co. KG, D-10785 Berlin. All rights reserved, including those of translation into foreign languages. No part of this book may be reproduced in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Cover design: Christopher Schneider, Berlin. Printed in Germany.
Contents
Introduction: The evidential base of linguistics – Work in progress
Sam Featherston and Wolfgang Sternefeld

Portuguese: Corpora, coordination and agreement
Doug Arnold, Louisa Sadler and Aline Villavicencio

Contributing to the extraction/parenthesis debate: Judgement studies and historical data
Katrin Axel and Tanja Kiziak

Quantifying quantifier scope: A cross-methodological comparison
Oliver Bott and Janina Radó

Is syntactic knowledge probabilistic? Experiments with the English dative alternation
Joan Bresnan

Psycholinguistic perspectives on grammatical representations
Harald Clahsen

Early language separation: A longitudinal study of a Russian-German bilingual child
Elena Dieser

‘I need data which I can rely on’: Corroborating empirical evidence on preposition placement in English relative clauses
Thomas Hoffmann

Locality and accessibility in wh-questions
Philip Hofmeister, T. Florian Jaeger, Ivan A. Sag, Inbal Arnon and Neal Snider

Eye Tracking as a tool to investigate the comprehension of referential expressions
Anke Karabanov, Peter Bosch and Peter König

Corpus data and experimental results as prosodic evidence: On the case of stressed auch in German
Denisa Lenertová and Stefan Sudhoff

The retrieval and classification of Negative Polarity Items using statistical profiles
Timm Lichte and Jan-Philipp Soehn

Geographic distributions of linguistic variation reflect dynamics of differentiation
John Nerbonne and Wilbert Heeringa

Focus and verb order in Early New High German: Historical and contemporary evidence
Christopher D. Sapp

Contrastive topics in pairing answers: A cross-linguistic production study
Stavros Skopeteas and Caroline Féry

Coordinate structures: On the relationship between parsing preferences and corpus frequencies
Ilona Steiner

Adverbs and sentence topics in processing English
Britta Stolterfoht, Lyn Frazier and Charles Clifton, Jr.

List of contributors

Index
The evidential base of linguistics: Work in progress
Sam Featherston and Wolfgang Sternefeld

A range of factors have led to a remarkable revival of interest in issues of the empirical base of linguistic theory in general, and the status of different kinds of linguistic evidence in particular. Amongst these we must count the technological, which has made many sorts of linguistic data more available and more analyzable, and the theoretical, which has revealed itself in a spreading realization that more attention to detail in data can lead to both broader understanding of wider issues and increased insights into detailed questions.

There are at least two aspects to the technological changes which have brought about an increased interest in more empirically founded linguistic work. Both of these are driven by the rise of the personal computer. It is difficult to express the growth in storage capacity, interconnectedness, and computational power of the computer without illustrating them from personal experience. In 1983 Sam Featherston wrote a program to help solve crossword clues, but could only load small parts of the dictionary into his computer’s 48 kilobytes of memory at a time. Neither of us currently knows how many gigabytes of memory sit under our desks, but we do not expect ever to fill them. In fact we suspect that our chances of filling them are receding, since bulkier language resources are perfectly accessible remotely, and do not need to be kept locally. There is thus hardly any upper bound any more to the feasible size of collections of samples of real language use, nor any limitation imposed by their physical location. But not only size and access count: the ability of today’s computers to process and search these large quantities of data means that they can be used at great speed. This has allowed the field of corpus linguistics to grow and mature, but has also brought corpora to the non-specialist linguist.
Not only the size of collections but also the sophistication of their architectures, annotation schemes, and search possibilities have made corpus data a central tool in language study. The availability of computers has had another effect too: it has supported the extension of the experimental approach to more areas of linguistics. Although controlled studies can be carried out without computers, the monitor
as display device and keyboard as input device lend themselves readily to experiments. And only computers can integrate the stimulus and input in different modalities, such as is required by eye-tracking for example, and measure time delays in milliseconds, which forms a central part of many methodologies. One might add that the statistical calculations necessary on the detailed results of these quantitative approaches are also much less forbidding on a computer, as anyone who has worked out even means and standard deviations with pencil and paper will know.

Computer technology has thus made data-orientated approaches to linguistics more attractive. But the availability of these data types has coincided with an increased theoretical interest in the findings. In the study of language at the sentence level, the 1980s and perhaps early 1990s were a period in which the data basis was granted relatively little attention. It is necessary to re-read some works from the 1960s and 1970s to become aware of the extent to which this preference for idealization from the primary data was a change from previous practice. The tipping point of rationalist linguists, as we might call it, may have come roughly at the appearance of Chomsky’s Minimalist Program (Chomsky 1993). While this work was perhaps the high tide mark of the deductive approach to grammar, avowedly using as the starting point for its grammar architecture that which is ‘necessary’, it too contains certain features which appear to be motivated by descriptive requirements. There is the introduction of competition into the definition of well-formedness (Sternefeld 1996; Müller & Sternefeld 2001), which seems to be a response to the pressure of the evidence, and the promotion of ‘economy’, which can be seen as an attempt to reduce the use of abstract functional projections to account for everything and anything in syntax over the previous ten years.
We thus already see signs that calling something ‘an empirical question’ was ceasing to be a way of dismissing it as a matter of little importance. Other papers from the period make the change clear too. We might refer to Pollard & Sag (1994), in which the authors use as their reference point the ‘empirical domain’, and, while they respect criteria such as ‘simplicity and conceptual clarity’, place greater weight on ‘conformity with the facts’ (p. 4). That the authors felt obliged to spell these priorities out reveals that they were not mainstream assumptions at the time. Schütze’s (1996) seminal review of judgements and methodology and Cowart’s (1997) practical applications are examples of how the wind direction had changed.

The results of these technological developments and of the academic paradigm shift in our field are papers such as the ones in this volume. These articles
illustrate the progress which has been made but also reveal where practical, analytical, and interpretational problems have been encountered. Researchers have found that it is not sufficient to collect more data in order to gain greater insight – in this way they may sympathize with their predecessors in the 1960s, whose sometimes disappointed hopes of the clear confirmation of their theoretical positions in experimental data may have led to the reduction in interest in the collection of data which we are now once more reacting against. Nevertheless there are clear gains in understanding and individual successes, which these papers testify to.

There are a number of aspects of the work reported in these papers which will be of interest to those who aspire to empirical adequacy in their research. Most of the papers report work using more exact and controlled methods of data gathering and data analysis, or at least the use of such methods to clarify questions which have not previously been addressed in this way; indeed we may say that this was a criterion for inclusion. The questions addressed and the approaches taken in the papers are varied, but they have the common theme of exploring what the new data implies for the field of linguistics. Rather than merely describing the data, the ambition is to interpret the findings either within existing theoretical models or in contrast to existing models, so as to gain new insights into language structures and advances in linguistic theory.

A wide range of data types are used and discussed, often experimental (e.g. Clahsen; Karabanov et al.), or frequency-based (e.g. Bresnan; Steiner; Arnold et al.), but also those from other sources such as questionnaires and observation (e.g. Dieser; Sapp). Many use familiar methods in new ways or combine sorts of data in order to gain insights (e.g. Axel & Kiziak; Hofmeister et al.).
Several also include explicit reflection on the necessary conditions for more controlled data types to provide added value for linguistic theory (Clahsen; Bott & Radó). Several papers (e.g. Lichte & Soehn; Nerbonne & Heeringa) provide innovative analytical approaches to their data: often these papers reveal insights into successful techniques, but also perils and pitfalls. Many of the papers consider data from more than one language or variety, a relatively simple way of improving the generalizability of results by broadening the data base and gathering what Clahsen calls converging evidence. While such evidence does not exclude the possibility of the findings being due to an artifact of the methodology or the assumptions underlying the approach, cross-linguistic generalizations are particularly valuable to linguists who place a high value on seeking grammatical universals.
A fine example of more sophisticated data being put to use for linguistic ends is the paper by Karabanov, Bosch & König. They use an eye-tracker to follow subjects’ visual attention shift in response to different types of noun phrases in auditory input. Such behavioural data has a strong claim as direct evidence of the process of comprehension: actions speak louder than words. The results provide evidence that pronouns are processed very similarly to full noun phrases, but that a distinction may need to be made between referential and non-referential pronouns.

The studies by Lenertová & Sudhoff provide another good example of the wealth of information which is made accessible by experimental techniques. These researchers carried out a triple investigation on the prosodic correlates of the identification of the constituent that the German additive particle auch (‘also’) is associated with. The data reveals a complex picture with no one-to-one correspondence of prosody and interpretation, but in a complex descriptive field such as this, detailed data is required to get beyond vague generalizations.

Stolterfoht, Frazier & Clifton gather self-paced reading times on clause-sized text chunks which similarly reveal detailed evidence which only tightly controlled quantitative measures can provide. The focus of their study is the hypothesis that a language such as English, which has a fairly rigid word order, will reveal evidence of being sensitive to information structural constraints in the same way that languages with looser word order do. The additional effort involved in gathering evidence with finer definition allows stronger conclusions to be drawn about the structure studied. While the effects of information structure are not so readily visible in English, closer data reveals that speakers are nevertheless sensitive to them.

Other contributions focus more on measures of occurrence.
Ilona Steiner takes a close look at frequency data to test whether parsing preferences as revealed in reading time studies correspond to production preferences as expressed in corpus data. This turns out to require several stages of discounting irrelevant information, but the careful work is rewarded with the result that the two measures are indeed correlated. This would suggest that corpus data, if carefully analysed, can be used to draw conclusions about models of sentence processing. This is an important piece of work in cross-data-type correlation with implications for grammars and processing models too.

Lichte & Soehn report their work on extracting lexical candidates as negative polarity items (NPIs) from a German corpus. The paper makes interesting reading as a case study in the care required and the multiple parameters which need to be addressed in order to maximize both the quantity and the quality of the output. The authors are cautious enough to underline that the data can really only produce negative evidence that a lexical item does not belong to one of the groups of interest, not positive evidence that it does. Here too there is no such thing as a free lunch, but hard work pays off.

Many other papers report work using more than one data type. For example Hofmeister, Jaeger, Sag, Arnon & Snider report the results of four separate studies aiming to clarify the role played by factors such as locality and accessibility in the realisation possibilities of multiple wh-questions. The results from their magnitude estimation and self-paced reading studies show how complex such issues are, and that multiple factors must be taken into consideration. This is important work in the direction of teasing apart variables of processing and putative grammar.

Thomas Hoffmann first looks at preposition placement with English relative pronouns in corpus data and finds that certain structures do not occur. Why not? Are these accidental gaps or grammar-driven exclusions? He therefore carries out experimental studies to clarify these questions, the results of which allow him to group the corpus data for statistical tests. This is a nice example of very different data types complementing each other and contributing to the larger picture with their specific strengths.

Other papers too demonstrate conclusions and predictions derivable across phenomena and data types. Joan Bresnan takes corpus frequencies enriched with fairly detailed information on context and derives continuation probabilities from these. These predictions are tested and found to be accessible to informants in continuation production tests, which shows that these probabilities must be represented in the mind of the speaker. Models of syntactic knowledge need to account for this apparent statistical information.
Two papers combine experimentally obtained judgements of the contemporary language with historical data to provide an additional perspective. Christopher Sapp looks at clause-final verb clusters in Early New High German and compares them with contemporary regional varieties. The question whether object focus affected the relative positioning of verbs in clusters is difficult to answer conclusively on the basis of historical textual evidence alone, because of data sparseness and fuzziness in the corpus. The statistical analysis of the historical data is supported by interview data with speakers of Swabian and Viennese dialects and by a judgement experiment with speakers of Austrian standard German. The evidence from each source is
fairly fine, but in sum it makes a convincing case, a significant achievement for a subtle effect such as focus in a historical variety.

The other paper employing historical data, that of Axel & Kiziak, does so to answer a question about the modern language: whether apparent long wh-extraction in German should not rather be analyzed as a single clause with a parenthetical comment inserted. Two data types complement each other here. First, the evidence from predicate restrictions on extraction obtained from judgement experiments using the pattern matching technique demonstrates differences between unambiguous long extractions and the controversial construction, which no confounding factors are able to account for. Second, historical analysis of Old High German shows that, while the controversial construction occurs in texts from this period, the simple verb-second complement clause, from which it is derived, is not clearly attested at all. In this way two very different sources of evidence would seem to converge in supporting the parenthetical analysis of this German structure.

Nerbonne & Heeringa test the gravity model of linguistic dynamism using quantified data on dialectal differentiation from the Netherlands. While the data (all gathered during the twentieth century) is basically synchronic and directionless, these measures of linguistic similarity nevertheless allow conclusions to be drawn about dynamic processes of dialectal change and the factors favouring and inhibiting them, since the patterns produced by diachronic processes should be visible. This paper takes perhaps the most sophisticated approach to quantified data in this volume.

A very different approach towards stiffening the evidential base of linguistics is to take a cross-linguistic perspective.
Skopeteas & Féry describe their on-going work on the encoding of information structure, using questionnaires and elicitation of prosodic forms to look at single and double questions in English, German, Georgian, and Greek. The work reveals clear generalizations about the range and mix of word order options and prosodic features that languages use to represent information structural content. Other papers include explicit reflection on what contribution data can make to the advancement of theory. Harald Clahsen, for instance, identifies three conditions for psycholinguistic evidence to be relevant to those interested in the representation of language systems. First, potential confounds must be excluded, so that we can be sure that the data addresses the issue at hand; next, there should be converging evidence, that is, corroboration from multiple sources; and third, the data should demonstrate its appropriateness by confirming or falsifying existing linguistic theories. He goes on to show in three programmes of research, involving child language, morphological
processing, and the language of speakers with learning impairments, how these criteria are applied in practice and what insights can be gained.

Many papers critically discuss problems with the availability and interpretation of data types and report the solutions to these limitations. The work using frequencies collected from web searches by Arnold, Sadler & Villavicencio is particularly interesting for those linguists who see the web as the single unique corpus of the future. Their study of agreement patterns in Portuguese noun phrases clearly suggests that web data can provide linguistically interesting and valid information at a level of reliability which allows us to construct linguistic analyses on the basis of the differential frequencies found. This is a useful step in the validation of the web as a linguistic resource.

Other authors too reflect on the value of their evidence. Elena Dieser’s paper reveals an interesting case where the data which has traditionally been advanced for a hypothesis can be seen on closer inspection to be inadequate. Her study of a child growing up bilingually differs from most other studies in not using the one-person-one-language pattern of bilingual child care. She demonstrates effectively that the child recognizes and distinguishes between monolingual and multilingual speakers long before he has translation equivalents in the two languages. His differential production thus reveals evidence of language separation long before his absolute production does. The traditional data criterion is thus shown to be inadequate.

Lastly, we shall mention Bott & Radó, whose paper addresses a very central question for a volume such as this. They report their work comparing methodologies for the elicitation of intuitions for semantic theory, aiming to measure the relative availability of readings of structures with two quantifiers.
The studies prove to be well worthwhile, since the results show real differences which any single study would have implied were systematic. There are clearly pitfalls to avoid, and since theory construction requires a firm empirical base, there is more work to be done in this field.

We hope that this volume gives the reader a taste of the visible advances but also the hard work still necessary in building a more empirical linguistics, particularly in work at the sentence level, and thus succeeds in the same way that the conference Linguistic Evidence 2006 did. We would like to thank and congratulate those many members of the Sonderforschungsbereich 441 here in Tübingen who willingly gave their time and effort to make the conference a success, but above all Beate Starke and Marga Reis. We also thank the chair of the Programme Committee, Ewald Lang, and the
colleagues who reviewed papers: Steven Bird, Joan Bresnan, Greg Carlson, Harald Clahsen, Anette Frank, Jost Gippert, Georg Kaiser, John Nerbonne, Karel Oliva, Janet Pierrehumbert, Mark Steedman, Shravan Vasishth, Tilman Berger, Veronika Ehrich, Erhard Hinrichs, Johannes Kabatek, Stephan Kepser, Claudia Maienborn, Uwe Mönnich, Frank Richter, and Hubert Truckenbrodt.
References

Chomsky, Noam 1993 A minimalist program for linguistic theory. In The View from Building 20, K. Hale & S. Keyser (eds.). Cambridge, MA: MIT Press.
Cowart, Wayne 1997 Experimental Syntax: Applying Objective Methods to Sentence Judgements. Thousand Oaks, CA: Sage.
Müller, Gereon & Wolfgang Sternefeld 2001 The Rise of Competition in Syntax. A Synopsis. In Competition in Syntax, G. Müller & W. Sternefeld (eds.), 1–68. Berlin/New York: Mouton de Gruyter.
Schütze, Carson T. 1996 The Empirical Base of Linguistics: Grammaticality Judgements and Linguistic Methodology. Chicago: University of Chicago Press.
Sternefeld, Wolfgang 1996 Comparing Reference Sets. In The Role of Economy Principles in Linguistic Theory, C. Wilder, H.-M. Gärtner & M. Bierwisch (eds.), 81–114. Berlin: Akademie-Verlag.
Portuguese: Corpora, coordination and agreement
Doug Arnold, Louisa Sadler and Aline Villavicencio

1. Introduction

This paper reports some results from a corpus study of Portuguese, and explores their implications for the analysis of agreement processes involving coordinate structures (CSs), especially as regards gender agreement within noun phrases (NPs).1 Agreement phenomena have received considerable attention in recent years, but agreement involving CSs, and NP-internal agreement processes have received less attention. As will appear, this cannot be taken as a reflection of inherent theoretical interest. Some of the data discussed here appear to be novel, and to pose a serious challenge for existing analyses of coordinate structures. One goal of this paper is to suggest how this challenge can be overcome. More generally, the study demonstrates the value of corpus data in challenging existing analyses, requiring a more sophisticated view of phenomena. It also raises some interesting methodological issues.

The paper is structured as follows. Section 2 introduces some basic ideas about agreement in general, and what is standardly assumed about Portuguese. Section 3 describes the corpus study itself, and the results. The key conclusion is that Portuguese agreement is more complex than has generally been assumed hitherto. Section 4 discusses the theoretical implications, and provides a relatively theory-neutral and intuitive analysis of the facts about Portuguese agreement as they emerge. The main point is that, contrary to what is assumed in most approaches to agreement, CSs must make several kinds of agreement information available at the same time. Section 5 summarises the discussion and provides some brief comments of a methodological nature.

2. Background

In general terms, ‘agreement’ refers to the phenomenon where the form of one element (the ‘agreement target’) varies depending on properties of another (the ‘agreement controller’). For example, the following show that
Portuguese nominals control agreement for number and gender on determiners and adjectives.

(1) o teto colorido
    the.MSG ceiling.MSG coloured.MSG
    ‘the coloured ceiling’

(2) *os/a/as teto coloridos/colorida/coloridas
    the.MPL/the.FSG/the.FPL ceiling.MSG coloured.MPL/coloured.FSG/coloured.FPL
    ‘the coloured ceiling’

Agreement phenomena in general have received considerable attention in recent years. However, the main focus has been on subject-predicate agreement, at the expense of other forms of agreement, notably head-modifier agreement. In particular, there has been relatively little work on the problems posed by head-modifier agreement when the agreement controller is a Coordinate Structure (CS). It turns out that extending analyses based on non-coordinate structures to deal with CSs raises non-trivial problems. In particular, CSs appear to be able to control agreement in a variety of different ways.

The two agreement strategies which are most widely attested cross-linguistically involve (syntactic or semantic) resolution and ‘closest conjunct agreement’ (CCA). Resolution strategies are familiar from many languages (for discussion and references see e.g. Corbett 1991; Dalrymple & Kaplan 2000; Wechsler & Zlatić 2003). Intuitively, under a resolution strategy agreement involves properties of the CS as a whole – more precisely, the agreement properties of a CS are some function of the properties of the conjuncts and the CS as a whole. In the case of Portuguese, this agreement strategy gives rise to examples like (3).

(3) o teto e a parede coloridos
    the.MSG ceiling.MSG and the.FSG wall.FSG coloured.MPL
    ‘the coloured ceiling and wall’
Here, plural agreement has been triggered on the adjective coloridos because the preceding CS is plural (e.g. it denotes a plurality). Masculine gender has been triggered because the CS contains a masculine conjunct (masculine is the default resolution gender in Portuguese – leaving aside
cases of CCA, feminine agreement is only possible if all conjuncts are feminine).

Under a CCA strategy, by contrast, rather than agreeing with the CS as a whole, agreement targets agree with just the closest conjunct. CCA is perhaps less familiar than resolution, but it is nevertheless widely attested. It has been observed in, inter alia, Irish, Welsh, Spanish, Arabic, and Ndebele (e.g. McCloskey 1986; Corbett 1991; Sadler 1999; Camacho 2003; Moosally 1999; Yatabe 2004). Though it does not seem to have been much discussed in the theoretical literature on Portuguese, the existence of this strategy has been noted in descriptive grammars of Portuguese. de Almeida Torres (1981) gives examples like (4):

(4) no povo e gente hebreia
    on the.MSG population.MSG and people.FSG hebrew.FSG
    ‘on the Hebrew people’ (de Almeida Torres, 1981)
Here we see that the postnominal adjective is feminine and singular, like the last conjunct, even though it semantically modifies the whole preceding CS (which contains a masculine noun, and so might be expected to trigger masculine agreement).

These examples involve postnominal agreement, which is what we focus on here. However, a few words about the behaviour of prenominal adjectives and determiners are in order. As regards gender, it seems that in Portuguese CCA is required for prenominal modifiers and determiners modifying coordinated nominals. For example, in (5) the presence of a masculine conjunct in the CS is not sufficient to permit masculine agreement on the prenominal adjective and noun, which must agree with the closest conjunct in gender.

(5) suas/*seus próprias reações ou julgamentos
    his.FPL/*his.MPL own.FPL reactions.FPL or judgements.MPL
    ‘his own reactions or judgements’
As regards number, matters are less clear, and proper discussion would take us too far from the focus of this paper. Part of this complexity arises from the existence of ‘single entity’ readings of CSs (as in examples like my friend and colleague) which are semantically singular. Even leaving cases like this, there seems to be evidence of both CCA and resolution for number
12 Doug Arnold, Louisa Sadler and Aline Villavicencio in Portuguese. Example (6) shows resolved number – a plural determiner and adjective with a CS which is semantically plural, though it consists of singular nominals (prováveis (‘probable’) is plural, but is not marked morphologically for gender); and (7) shows CCA for number – a singular determiner with a CS that is again semantically plural. (6)
Os prováveis diretor e ator principal são Gus van Sant e Johnny Depp, respectivamente.
the.MPL probable.PL director.MSG and actor.MSG principal.MSG are Gus van Sant and Johnny Depp respectively
‘The likely director and main actor are, respectively, Gus van Sant and Johnny Depp.’

(7) O presidente e amigo comeram juntos.
the.MSG president.MSG and friend.MSG ate.3PL together
‘The president and (his) friend ate together.’
However, the issue is complex and somewhat controversial, and not essential to the main point of this paper, and we will not pursue it.2 To summarise: NP internally, Portuguese shows clear evidence of two agreement strategies involving CSs: CCA (postnominally, and prenominally as regards gender), and resolution (postnominally, and perhaps also prenominally for number). Leaving aside the matter of prenominal number, these might be represented schematically as in (8) and (9), respectively. (8)
CCA for number and gender:
DET_NUM,GEN [N_NUM,GEN [N_NUM,GEN] [N_NUM,GEN]] AP_NUM,GEN

(9) Resolved number and gender:
DET_NUM,GEN [N_NUM,GEN [N_NUM,GEN] [N_NUM,GEN]] AP_NUM,GEN
The existence of two patterns raises an obvious question about their relative frequency. As we have noted, CCA in Portuguese has not been much discussed in the literature, and one might wonder if this is because it is rare or marginal. In order to investigate this, a corpus study was undertaken, which will be described below, and whose quantitative results give a clear answer to this question (CCA is not rare or marginal). As it turns out, this study also raises (and answers) an interesting qualitative question, which has not previously been considered: are these the only patterns of agreement that are found? As will appear, some of the examples produced by the study seem to show the existence of ‘mixed’ agreement strategies, whose existence has not been previously noticed, and which have significant implications for the analysis of agreement with CS.
3. Corpus study

This section reports the results of a corpus-based study of the agreement strategies used for NP-internal agreement involving CSs, focusing especially on gender agreement for postnominal dependents. In order to estimate the approximate frequencies with which the agreement strategies are used, a Web-based corpus investigation was performed by means of searches using the Google API service.3 Occurrences of coordinated nominals followed by adjectives were found by posing Google queries of the general form (10).

(10) "ART * e * ADJ"

Here ART stands for instances of the Portuguese (definite and indefinite) articles, ADJ stands for instances of Portuguese plural adjectives, and e is the Portuguese conjunction e (‘and’). The adjectives were extracted from the 1,528,590 entry NILC Lexicon.4 Because we were interested in the correlation between the gender of each of the nominals and the gender of the adjective, only adjectives that overtly reflect gender distinctions were used (9,915 masculine and 9,811 feminine adjectives). The results returned by the queries were manually inspected to remove noise – in cases of putative CCA this entailed removing all cases where, in the judgement of a Portuguese native speaker, the adjective should be interpreted as modifying only the closest nominal, rather than the CS as a whole.
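The query template in (10) can be illustrated with a minimal sketch. It only builds the quoted query strings; how they were submitted to the search service, and how that service treated the wildcard, is not modelled here. The function name `build_queries` and the two sample adjectives (taken from examples in the text, not from the NILC Lexicon) are assumptions made for illustration.

```python
from itertools import product

# Portuguese definite and indefinite articles.
ARTICLES = ["o", "a", "os", "as", "um", "uma", "uns", "umas"]


def build_queries(articles, adjectives, conjunction="e"):
    """Return one quoted phrase query of the form "ART * e * ADJ" per pair.

    Hypothetical helper: it only instantiates the template in (10).
    """
    return [
        f'"{art} * {conjunction} * {adj}"'
        for art, adj in product(articles, adjectives)
    ]


# Illustrative plural adjectives only; the study used gender-marked
# adjectives drawn from the NILC Lexicon.
sample_adjectives = ["sofridas", "modernos"]
queries = build_queries(ARTICLES, sample_adjectives)
print(queries[0])
```

Each article is paired with each adjective, so the number of queries grows as the product of the two lists.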
The overall results found are displayed in Table 1, where ‘Frequency’ indicates the number of hits returned by Google for the searches, and ‘N1’, ‘N2’ and ‘ADJ’ refer to the gender of the first conjunct, second conjunct, and adjective, respectively.5

Table 1. Frequency of Masc vs. Fem Adjectives Modifying Mixed Gender Coordinations of Nominals.

        Frequency   N1   N2   ADJ   Interpretation
(a)             0   f    m    f     (Resolve to f)
(b)          4054   m    f    m     (Resolve to m)
(c)           626   f    m    m     (CCA/Resolve to m)
(d)           550   m    f    f     (CCA)
total        5230
The first thing to notice here is that there are no instances of a feminine nominal conjoined with a masculine triggering feminine agreement (row (a)). That is, there are no instances of the form (11), which would be instances of resolution to feminine, or perhaps ‘furthest conjunct agreement’. This is not particularly surprising, but it supports our implicit assumption that cases of feminine gender agreement where a CS contains a masculine conjunct are indeed cases of CCA, and not some special ‘resolution to feminine’ strategy.

(11) [NF conj NM] ADJF

Similarly, row (b) is unsurprising. This row reports the count of cases which are schematically of the form in (12), where a conjoined masculine and feminine trigger masculine agreement. Leaving aside the possibility of ‘furthest conjunct agreement’, these are unambiguously cases of resolution to masculine, and they are very frequent (4054/5230, or roughly 78% of cases).

(12) [NM conj NF] ADJM

The cases counted in row (c), which are schematically like (13), are ambiguous – they might be either cases of resolution to masculine, or CCA with the masculine conjunct.

(13) [NF conj NM] ADJM
The most interesting case is row (d), which gives the number of cases of the form in (14). These are unambiguously cases of CCA (resolution would produce masculine agreement on the adjective).

(14) [NM conj NF] ADJF

The interesting point is that they are not at all infrequent. Even on the narrowest interpretation, disregarding all ambiguous cases from row (c), CCA for gender is evidently widespread: the ratio of (d) cases to the total is 550/5230, or slightly over 10%. If these data are representative, the odds on speakers using CCA are better than 1 in 10. We can conclude that while resolution is the dominant strategy for postnominal gender agreement, CCA is by no means rare or marginal.

Apart from this quantitative finding, the study also threw up some unexpected qualitative results. Among these results were examples such as (15), which is schematically something like (16).

(15) Esta canção anima os corações e mentes brasileiras.
This song animates the.MPL hearts.MPL and minds.FPL Brazilian.FPL

(16) DETM [NM conj NF] ADJF

What this shows is CCA for gender both prenominally and postnominally, with different effects. In this example, prenominal CCA has produced masculine agreement (recall that CCA for gender appears to be obligatory prenominally in Portuguese, so this cannot be a case of resolution to masculine on the determiner); at the same time, postnominal CCA has produced feminine agreement (resolved agreement would have made the adjective masculine). Given that a language exhibits CCA, and has both prenominal and postnominal dependents, it is perhaps not surprising that this should occur. However, the possibility seems not to have been previously considered, and its existence is a significant result, with important theoretical implications, which we will take up below.

A second kind of case which appears not to have been previously noticed is exemplified in (17) to (21), which are schematically of the form (22).
(17) todo o constrangimento e a dor sofridas
all.MSG the.MSG embarrassment.MSG and the.FSG pain.FSG suffered.FPL
‘all the embarrassment and pain suffered’

(18) o drama e a loucura vividas
the.MSG drama.MSG and the.FSG madness.FSG lived.FPL
‘the drama and the madness experienced’

(19) o aprendizado e a experiência vividas
the.MSG learning.MSG and the.FSG experience.FSG lived.FPL
‘the accumulated learning and experience’

(20) o romantismo e a morbidez profundas da alma alemã
the.MSG romanticism.MSG and the.FSG morbidity.FSG deep.FPL of the soul German
‘The romanticism and morbidity of the German soul’

(21) uma relação entre sobrecarga do organismo e envelhecimento e morte prematuras
a relation between overload of the organism and aging.MSG and death.FSG premature.FPL
‘A relation between overload of the organism and premature aging and death’

(22) [NMSG conj NFSG] ADJFPL

What these examples seem to show is postnominal CCA for gender (the adjective is feminine, like the last conjunct, even though the CS contains a masculine nominal) combined, simultaneously, with resolution for number (the individual conjuncts are singular, but the adjective is plural). These cases raise an interesting theoretical issue, because not only has the existence of such cases not been previously noticed, it seems not even to have been considered as a possibility. We will look at the theoretical implications of this in Section 4, below. Such cases also raise an interesting methodological issue, because though these are all attested examples, some native speakers of Portuguese are uncomfortable with them (though not the one of the present authors who is a native speaker of Brazilian Portuguese).

In this context, it is worth looking at some other quantitative results. Table 2 summarises the number of examples found which involved coordinations of singular nominals (that is, a strict subset of the examples summarised in Table 1). Since all results feature plural adjectives, these are all cases of number resolution. The cases showing CCA for gender – that is, at least the cases in row (d) – thus show this ‘mixed’ agreement strategy of resolved number and CCA for gender. As the table shows, this strategy appears in 90 cases, which is approximately 4.6% of all the cases counted in Table 2, and about 4.9% of all the cases that could show this effect (i.e. all the cases where the final conjunct is feminine, i.e. rows (b) and (d)). This seems to us to be an interestingly large number, which combined with their acceptability to some speakers means that the phenomenon deserves theoretical attention, and should not be dismissed out of hand (however, we will say a little more about the methodological issue raised here in Section 5).

Table 2. Frequency of Masc vs. Fem Adjectives Modifying Mixed Gender Coordinations of Singular Nominals.

        Frequency   N1   N2   ADJ   Interpretation
(a)             0   f    m    f     (Resolve to f)
(b)          1737   m    f    m     (Resolve to m)
(c)           137   f    m    m     (CCA/Resolve to m)
(d)            90   m    f    f     (CCA)
total        1964
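The proportions quoted for Tables 1 and 2 can be re-derived from the row counts with a short sketch. The dictionary layout and the function name `cca_shares` are illustrative assumptions; only the counts come from the tables. Note that the share of row (d) among feminine-final cases in Table 1 is computed here but is not a figure reported in the text.

```python
# Row counts copied from Tables 1 and 2, keyed by row label.
TABLE_1 = {"a": 0, "b": 4054, "c": 626, "d": 550}
TABLE_2 = {"a": 0, "b": 1737, "c": 137, "d": 90}


def cca_shares(table):
    """Return (share of row d among all rows, share of row d among rows b and d)."""
    total = sum(table.values())
    # Rows (b) and (d) are the ones whose final conjunct is feminine,
    # i.e. the only rows where unambiguous CCA to feminine could appear.
    feminine_final = table["b"] + table["d"]
    return table["d"] / total, table["d"] / feminine_final


for name, table in (("Table 1", TABLE_1), ("Table 2", TABLE_2)):
    of_all, of_fem_final = cca_shares(table)
    print(f"{name}: {of_all:.1%} of all cases, "
          f"{of_fem_final:.1%} of feminine-final cases")
```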
To summarise the results of this section: we have shown (a) that while gender resolution is the dominant agreement strategy postnominally, CCA is by no means infrequent or marginal, and (b) that Portuguese agreement is more complex than has been previously assumed. In particular, in addition to ‘pure’ resolution with prenominal CCA for gender, we also see prenominal and postnominal CCA operating independently, and a mixed postnominal strategy that involves CCA for gender with resolved number. Schematically, these strategies may be represented as in (23) to (25). The following section will consider the theoretical implications of this.

(23) Resolved number and gender:
DET_NUM,GEN [N_NUM,GEN [N_NUM,GEN] [N_NUM,GEN]] AP_NUM,GEN

(24) CCA for number and gender:
DET_NUM,GEN [N_NUM,GEN [N_NUM,GEN] [N_NUM,GEN]] AP_NUM,GEN

(25) CCA for gender, resolved number:
DET_NUM,GEN [N_NUM,GEN [N_NUM,GEN] [N_NUM,GEN]] AP_NUM,GEN
4. Linguistic analysis and theoretical implications

In this section, we will consider some of the theoretical implications of the Portuguese data presented above, showing how an account of the data can be formulated. In the interests of generality, we will keep the presentation as intuitive and framework-neutral as possible.6 We will begin with resolution. In general, resolution can be modelled by a grammatical mechanism which ‘calculates’ the set of resolved agreement features to be associated with the coordinate structure as a whole: this set of resolved features then controls agreement on agreement targets (e.g. Dalrymple & Kaplan 2000). So far as we can see, it is reasonable to assume that number resolution in Portuguese is simply a matter of semantics: CSs are plural just in case they denote a plurality or group of some kind. This is expressed in (26).7

(26) The number value on a CS resolves to plural just in case the CS denotes a plurality.

As regards gender, it seems safe to assume that masculine is the default resolution gender, or to put it another way, the resolved gender value on a CS is masculine if it contains one or more masculine conjuncts, and feminine only if all conjuncts are feminine:8

(27) The gender value on a CS resolves to feminine iff all conjuncts are feminine, otherwise it is masculine.
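Principles (26) and (27) can be restated as two small functions. This is only an informal sketch: the function names and the string values "m", "f", "sg" and "pl" are notational assumptions, and whether a CS denotes a plurality is simply passed in as a flag, since the text treats it as a matter of semantics.

```python
def resolve_number(denotes_plurality):
    """Principle (26): plural just in case the CS denotes a plurality."""
    return "pl" if denotes_plurality else "sg"


def resolve_gender(conjunct_genders):
    """Principle (27): feminine iff all conjuncts are feminine, else masculine."""
    return "f" if all(g == "f" for g in conjunct_genders) else "m"


# A mixed-gender CS resolves to masculine; an all-feminine CS to feminine.
print(resolve_gender(["m", "f"]), resolve_gender(["f", "f"]))
```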
In principle, one might try to treat CCA in a similar fashion to resolution – a CS would have a single set of agreement properties calculated from properties of the conjuncts, but rather than involving calculations reflecting principles like (26) and (27), the calculation would simply return values from one designated conjunct (the last one, say). Such an approach might be unproblematic in a language which has only CCA. In a case like Portuguese which has both resolution and CCA, one might try to give every CS a single number and gender value, but allow the values to be calculated in one of two ways: either (a) by a resolution method, or (b) by a CCA method copying the value from (say) the last conjunct. Most existing approaches to agreement involve some kind of ‘single feature’ approach like this. Notice that such an approach predicts that all agreement processes will involve the same set of features.9

The Portuguese data clearly indicate that this sort of approach cannot be correct in general. First, the fact that prenominal CCA for gender is obligatory, while postnominally either CCA or resolution is possible, indicates that CSs cannot be assigned a single agreement value: they need at least two sets of values, one for CCA, and one for resolution-based agreement. Moreover, as we have seen, examples such as (15), repeated here, show CCA operating both prenominally and postnominally, with different effects. So we cannot manage with just one set of ‘CCA agreement’ features – we need two, one for prenominal CCA, and one for postnominal CCA.

(28) Esta canção anima os corações e mentes brasileiras.
This song animates the.MPL hearts.MPL and minds.FPL Brazilian.FPL

It seems the simplest way to approach this is to assume three sets of number/gender features: one reflecting resolved values, one for ‘leftwards’ CCA (i.e. CCA on prenominals), and one for ‘rightwards’ CCA (CCA on postnominals). Let us call these ‘RESOL’, ‘LAGR’ and ‘RAGR’.
The behaviour of these features will be governed by principles such as the following:

(29) The RESOL values of a CS are calculated from the features of the conjuncts, according to principles such as (26) and (27).

(30) The LAGR values of a CS come from its leftmost conjunct.

(31) The RAGR values of a CS come from its rightmost conjunct.

The existence of such features on CSs raises the question of what agreement features the conjuncts have, when they are not themselves CSs. One can imagine two approaches. The first would define features like LAGR, RAGR and RESOL only on CSs – ‘normal’ nominals would have only normal agreement features. But this is unattractive: it would complicate the statement of normal agreement principles, which would have to be different depending on whether the agreement controller was a CS or not. It would also complicate the statement of the agreement percolation principles inside CSs. If, instead, we assume that these features are defined on nominals of all kinds, a much simpler picture emerges. To begin with, we need some principle like the following, to capture the fact that non-coordinate structures exhibit only one kind of agreement behaviour:

(32) In non-coordinate nominal structures the values of RAGR, LAGR and RESOL are identical.

(One way of implementing this would be to make it a lexical requirement of nouns, which is inherited by nominal projections; if noun phrases are analysed as DPs, it would be stated as a requirement on Ds and their associated Ns that is inherited by DPs.) Now, (30) and (31) can be stated more precisely, and with complete generality, as (33) and (34):

(33) The LAGR values of a CS are the LAGR values of its leftmost conjunct.

(34) The RAGR values of a CS are the RAGR values of its rightmost conjunct.

These principles can be seen at work in (36), representing the CS in (35).10

(35) o aprendizado e a experiência
the.MSG learning.MSG and the.FSG experience.FSG
(36)
NP [RESOL mpl, RAGR fs, LAGR ms]
  NP [RESOL ms, RAGR ms, LAGR ms]
    DET → o
    N [RESOL ms, RAGR ms, LAGR ms] → aprendizado
  NP [RESOL fs, RAGR fs, LAGR fs]
    CONJ → e
    NP [RESOL fs, RAGR fs, LAGR fs]
      DET → a
      N [RESOL fs, RAGR fs, LAGR fs] → experiência

Briefly, the lexical nouns aprendizado (‘learning’) and experiência (‘experience’) are lexically specified as masculine singular and feminine singular, and these values appear for all the agreement features. These same values appear on the non-CS nodes that dominate them, as required by (32). The mother node of the CS has LAGR ms from its left daughter and RAGR fs from its right daughter (as required by (33) and (34)). Its RESOL gender is masculine because one of the daughters is masculine, reflecting (27); its RESOL number is plural because it denotes a plurality, reflecting (26).

Precisely how agreement is handled, given structures like (36), will depend on assumptions about the mechanics of determiner-noun and adjective-noun agreement, but the underlying principles will be roughly as follows:11

(37) Post-head modifiers must share either:
a. their agreement controller’s RESOL values (resolved agreement); or
b. their agreement controller’s RAGR values (‘full’ CCA); or
c. their agreement controller’s RESOL.NUMBER and RAGR.GENDER values (‘mixed’ CCA/resolution).
(38) Determiners and pre-head modifiers must share their agreement controller’s LAGR.GENDER (CCA for gender).

The adjective modernos in (39) exemplifies (37a); the adjective monástica in (40) exemplifies (37b); sofridas in (41) exemplifies (37c); and próprias (‘own’) and suas (‘his/her’) in (42) exemplify (38).

(39) o homem e a mulher modernos
[ the.MSG man.MSG and the.FSG woman.FSG ] modern.MPL
‘the modern man and woman’

(40) estudos e profissão monástica
[ studies.MPL and profession.FSG ] monastic.FSG
‘monastic studies and profession’

(41) o constrangimento e a dor sofridas
[ the.MSG embarrassment.MSG and the.FSG pain.FSG ] suffered.FPL
‘the embarrassment and pain suffered’
(42) suas próprias reações ou julgamentos
his.FPL own.FPL [ reactions.FPL or judgements.MPL ]
‘his own reactions or judgements’

Notice that these principles also apply equally, and unproblematically, in the case of non-coordinate agreement controllers (the account of agreement is thus uniform for CSs and non-coordinate structures, as one would wish). For example, with a noun like teto (‘ceiling’) in (43) all the principles in (37) produce exactly the same effect, because in a non-CS all the agreement features have the same values.

(43) o teto colorido
the.MSG ceiling.MSG coloured.MSG
‘the coloured ceiling’

Notice also that, as well as being ‘uniform’ in this sense, this account is consistent with a very standard idea of locality for agreement processes: percolation of features means that agreement can always be stated as a relation between an agreement controller and its sister(s).

Principles like those above appear to account for the data, and from a descriptive point of view they are attractive – they provide a simple conceptual and descriptive vocabulary for the analysis of Portuguese and other
complex agreement systems. But there is clearly a theoretical cost in terms of the introduction of features which might not be otherwise required. However, it is worth pointing out that some proliferation of features seems to be required independently, because somewhat different features are required for handling NP-internal agreement processes, like those we have examined here, and NP-external agreement processes like subject-predicate agreement. Familiar examples of this involve so-called ‘hybrid nouns’ (Corbett 1991), which can trigger different kinds of agreement on different targets. For example, in Spanish the title Majestad (‘Majesty’) is feminine, so it triggers feminine agreement on attributive adjectives and determiners. However, if it refers to a male individual, it triggers masculine agreement on a predicative adjective (cf. e.g. Corbett 1991; Kathol 1999; Wechsler & Zlatić 2003):

(44) Su Majestadᵢ Suprema está contento.
Pron.F Majesty Supreme.F is happy.M
‘His Supreme Majesty is happy.’

In this context, it is interesting to ask whether Portuguese CCA might be ‘NP-bounded’ – a purely NP-internal process, which might limit the number of features required. Examples like the following suggest that it is not.

(45) …(que) o travestismo e a copulação ritual são realizadas para expressar o propósito…
…(that) the.MSG transvestism.MSG and the.FSG copulation.FSG ritual be.PL realized.FPL to express the goal…
‘…(that) the transvestism and the ritual copulation are produced to express the goal…’

Here we see the CS o travestismo e a copulação ritual (‘the transvestism and the ritual copulation’) triggering plural agreement on the predicate (the verb são (‘be’) and the participle realizadas (‘realized’)), which would be consistent with a resolution strategy for number. However, we also see that realizadas is marked feminine – i.e. apparently agreeing with the closest conjunct copulação ritual. That is, subject-predicate agreement may sometimes involve CCA.
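The feature-sharing principles of this section, (26), (27), (29) and (32) to (34) for percolation and (37) and (38) for agreement, can be put together in a minimal framework-neutral sketch. It is not the formal treatment referred to in note 6: the class and function names (`Agr`, `Nominal`, `simple`, `coordinate`, `post_head_ok`, `pre_head_ok`) are hypothetical, coordination is assumed to be binary, and whether the CS denotes a plurality is supplied as a flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agr:
    gender: str  # "m" or "f"
    number: str  # "sg" or "pl"


@dataclass(frozen=True)
class Nominal:
    resol: Agr  # resolved values
    lagr: Agr   # values visible to prenominal ('leftwards') CCA
    ragr: Agr   # values visible to postnominal ('rightwards') CCA


def simple(gender, number):
    """Principle (32): in a non-coordinate nominal all three values coincide."""
    agr = Agr(gender, number)
    return Nominal(agr, agr, agr)


def coordinate(left, right, denotes_plurality=True):
    """Principles (29), (33) and (34) for a two-conjunct CS (an assumption)."""
    both_feminine = left.resol.gender == "f" and right.resol.gender == "f"
    gender = "f" if both_feminine else "m"        # principle (27)
    number = "pl" if denotes_plurality else "sg"  # principle (26)
    return Nominal(Agr(gender, number), left.lagr, right.ragr)


def post_head_ok(controller, target):
    """Principle (37): resolved, 'full' CCA, or 'mixed' CCA/resolution."""
    mixed = Agr(controller.ragr.gender, controller.resol.number)
    return target in (controller.resol, controller.ragr, mixed)


def pre_head_ok(controller, target):
    """Principle (38): pre-head dependents share LAGR.GENDER."""
    return target.gender == controller.lagr.gender


# The CS in (35): o aprendizado (m.sg) e a experiência (f.sg)
cs = coordinate(simple("m", "sg"), simple("f", "sg"))
print(cs)
print(post_head_ok(cs, Agr("f", "pl")))  # vividas in (19): 'mixed' strategy
```

Under these assumptions the CS in (35) receives the values shown on the mother node in (36), a feminine plural post-head adjective such as vividas in (19) is licensed by the mixed clause (37c), and a masculine singular post-head adjective is excluded because it matches none of the three clauses.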
An obvious objection to the analysis we have described is that it is stipulative, and does not really capture the fact that CCA is closest conjunct agreement. That is, the principles we have given could equally well be phrased so as to yield furthest conjunct agreement, which is not observed in Portuguese. However, furthest conjunct agreement is observed in some languages (e.g. Slovene; Corbett 1983). Moreover, notice that any account which tries to express CCA directly, as ‘closest’ conjunct agreement, will be in danger of losing one of the attractions of our account – the fact that it is consistent with standard ideas of locality. For example, an attempt to formulate such an account using any kind of conventional phrase structure will require agreement relations to hold between aunts and nieces, as well as sisters. Moreover, attempts to deal with ‘closeness’ in terms of purely linear adjacency of agreement controllers and targets appear problematic: several of the examples we have given above involve CCA between determiners and nouns which are not adjacent (see, e.g. (5), (6), and (17)).12
5. Conclusion

The foregoing has presented some novel data and conclusions about Portuguese agreement. In particular, we have presented data which suggest that CCA is more widespread than has generally been assumed. We have also presented data which suggest that agreement involving CSs is more complex than has been assumed, in ways that challenge existing analyses of agreement. In particular, we have argued that CSs do not possess a single set of agreement features (because both ‘resolved’ and ‘closest conjunct’ features are needed, and because information about the conjuncts at both ends of a CS may be needed for CCA). We have presented an analysis which captures these facts, and is consistent with a uniform treatment of agreement involving CSs and non-coordinate structures.

The discussion involves what we take to be an interesting mix of ‘empirical’ (e.g. corpus-based) and more traditional ‘theoretical’ linguistic investigation and analysis, a mix which is increasingly common, and productive. It also raises a number of methodological issues which deserve brief attention. One relatively straightforward methodological point is that this study is of necessity based on interpreted corpus data: it is not enough to find appropriate sequences of CSs and modifiers in corpora, it is essential to limit attention to cases where the interpretation makes it clear that the modifier scopes over the whole CS. Not only is there no conflict between corpus methods and methods based on ‘native speaker intuition’ here, both are actually necessary.
A second, rather obvious, methodological point involves the value and limitations of corpus data. On the one hand, the value of corpus data comes out clearly: the existence of examples like those above in corpora forces one to consider the possibility of CCA operating differently in different directions, which one might not have expected, a priori. On the other hand, getting relevant data can be extremely difficult due to various complicating factors – notably of course the fact that even large corpora do not typically show all possible variations and combinations of the phenomena one is interested in. Here one is naturally drawn to constructing examples. But this is not straightforward, because native speakers are often uncertain about the status of some examples. In particular, it seems that some speakers reject examples involving postnominal CCA for gender with resolved number (i.e. examples like (17) to (21)).13 Of course speakers’ acceptability judgements are notoriously unreliable (cf. e.g. Schütze, 1996), especially judgements of unacceptability. And in fact, experience indicates that this sort of conflict between corpus data and intuition is rather rare. It is much more common for exposure to corpus data to persuade speakers that their intuitions are over-restrictive.14 But this just makes the problem harder to deal with when it arises. In the case of a web-based study such as ours one cannot appeal to any pre-existing quality control (e.g. that the texts have been authored and proof-read by native speakers). One may observe, as we did above, that one has many examples of the relevant kind (in our case, 90). But how many is ‘many’? In the case of web-based queries, there is no useful estimate of the total number of words in the corpus, but we found that 4.9% of cases that could have shown the relevant pattern did show it (cf. Table 2). Is this a significant number?
We are inclined to think that it is rather a large number to be just the result of ‘noise’ – that is, simple mistakes and the like. On the other hand, we note that the normal standard for statistical significance is 0.05, or 5%, so one could argue that it is statistically non-significant.
Notes

1. The research was supported by the AHRB Project Noun Phrase Agreement and Coordination, MRGAN10939/APN17606. We are grateful for useful comments from many people, including: the anonymous referees for, and participants at, the LingEvid2006 conference held in Tübingen in February 2006; participants at HPSG05 in Lisbon; participants at the ‘Alliance 05 Project’ Workshop held at Paris 7 in Oct 2005; numerous colleagues at Essex, and Mary Dalrymple and Irina Nikolaeva. See http://privatewww.essex.ac.uk/~louisa/agr/NPagreement.html for more information.

2. For example, King & Dalrymple (2004) claim that a singular determiner can only modify a CS with a ‘single entity’ (‘boolean’ or ‘joint’) interpretation, as in o presidente e diretor da Air France ‘the.MSG president.MSG and director.MSG of Air France’, where it is assumed that the president and director are one and the same individual. On the face of it, (7) is a counter-example to this claim. Another complicating issue is that the presence of material between the determiner and nominal may exert an influence on acceptability. In the (acceptable) example (6) the subject noun phrase is Os prováveis diretor e ator principal (‘the probable director and main actor’) with the adjective prováveis (‘probable’) intervening between the determiner and noun. Omitting it seems to have a deleterious effect, so *Os diretor e ator principal… (‘the director and main actor…’) is judged unacceptable.

3. See http://www.google.com/apis.

4. See http://www.nilc.icmc.usp.br/nilc/index.html.

5. One interesting point which we will not pursue is that the figures seem to show a strong bias for masculine conjuncts to precede feminine conjuncts (feminine conjuncts precede in only 626/5230 cases). This is probably a reflection of a prescriptive bias in favour of this ordering of conjuncts.

6. For a fully worked out formal treatment, see Villavicencio et al. (2005).

7. An example of a CS which does not denote a plurality is given in note 2. A counter-example to our assumption would be a CS containing a plural nominal that triggered singular agreement, where this could not be analysed as a case of CCA. In fact, we have not found any examples of CSs involving plural nominals triggering singular agreement at all.

8. A counter-example would be a CS which contains a masculine conjunct, but triggers feminine agreement, where this cannot be attributed to CCA. As noted in Section 3, no such cases were found in our study.

9. In fact, the ‘single feature’ approach is already known to be inadequate in other languages. Sadler (2003, 1999) shows that it will not work in Welsh, where different agreement processes can target the resolved and the CCA features at the same time, indicating that a CS must be able to have both resolved and CCA features simultaneously. However, Sadler suggests that any one agreement process can only access one kind of feature. The Portuguese data suggest that this is over-restrictive.

10. This representation makes a number of assumptions about the analysis of CSs, e.g. that the conjunction forms a constituent with the final daughter, and that the CS is an NP, rather than (say) a CONJP; none of these assumptions is critical.

11. This formulation evades the issue of number agreement for prenominal adjuncts – we leave open the question of whether they show resolution or CCA for number (or indeed both). Nothing we say hangs on this.

12. It is true that what intervenes may be an adjective which also agrees with the noun, but this is irrelevant: the adjective is not the agreement controller for the determiner.

13. Notice that this is the only case that is problematic in this way. All speakers seem happy with cases of prenominal CCA with postnominal resolution, and cases where pre- and post-nominal CCA give different effects. Thus, the main theoretical claims of the paper are not affected by this issue about data.

14. The following is a simple and uncontroversial example of this. It has sometimes been claimed that alternately cannot be used with or, cf. John was alternately hot and/*or cold. Many speakers accept this judgement at first glance. However, a search of the British National Corpus yields several examples of alternately… or which seem to be fully acceptable to all speakers, and which lead them to revise their judgement – e.g. [they] spent almost three hours in each other’s arms, alternately making love or talking in low whispers.
References

Camacho, José
2003 The Structure of Coordination: Conjunction and Agreement Phenomena in Spanish and Other Languages. Dordrecht: Kluwer.

Corbett, Greville G.
1983 Hierarchies, Targets and Controllers: Agreement Patterns in Slavic. London: Croom Helm.
1991 Gender. Cambridge: Cambridge University Press.

Dalrymple, Mary & Ronald M. Kaplan
2000 Feature indeterminacy and feature resolution in description-based syntax. Language 76 (4): 759–798.

de Almeida Torres, Artur
1981 Moderna gramática expositiva da Língua Portuguesa. São Paulo: Martins Fontes.

Kathol, Andreas
1999 Agreement and the syntax-morphology interface in HPSG. In Studies in Contemporary Phrase Structure Grammar, R. Levine & G. Green (eds.), 209–260. Cambridge/New York: Cambridge University Press.

King, Tracy H. & Mary Dalrymple
2004 Determiner agreement and noun conjunction. Journal of Linguistics 40: 69–104.

McCloskey, James
1986 Inflection and conjunction in Modern Irish. Natural Language and Linguistic Theory 4 (2): 245–282.

Moosally, Michelle J.
1999 Subject and object coordination in Ndebele: an HPSG analysis. In Proceedings of the WCCFL 18 Conference, S. Bird, A. Carnie, J. D. Haugen & P. Norquest (eds.). Somerville, MA: Cascadilla Press.

Sadler, Louisa
1999 Non-distributive features in Welsh coordination. In Proceedings of LFG 1999, M. Butt & T. H. King (eds.). Stanford, CA: CSLI Publications.
2003 Coordination and asymmetric agreement in Welsh. In Nominals: Inside and Out, M. Butt & T. H. King (eds.), 85–118. Stanford, CA: CSLI Publications.

Schütze, Carson T.
1996 The Empirical Base of Linguistics: Grammaticality Judgments and Linguistic Methodology. Chicago, IL: University of Chicago Press.

Villavicencio, Aline, Louisa Sadler & Doug Arnold
2005 An HPSG account of closest conjunct agreement in NP coordination in Portuguese. In Proceedings of the 12th International Conference on Head-Driven Phrase Structure Grammar, Lisbon, S. Müller (ed.). Stanford, CA: CSLI Publications.

Wechsler, Stephen & Larisa Zlatić
2003 The Many Faces of Agreement. Stanford, CA: CSLI Publications.

Yatabe, Shuichi
2004 A comprehensive theory of coordination of unlikes. In Proceedings of the HPSG04 Conference, S. Müller (ed.), 335–355. CSLI Publications, Katholieke Universiteit Leuven.
Contributing to the extraction/parenthesis debate: Judgement studies and historical data Katrin Axel and Tanja Kiziak
1. Introduction* For German constructions as in (1), henceforward ‘controversial construction’, two analyses have been discussed in the theoretical literature. (1)
Wen denkst du hat Ede angerufen? whom think you has Ede called ‘Whom do you think Ede has called?’
Some linguists have analysed the controversial construction as a long wh-extraction from an embedded verb-second clause (e.g. Thiersch 1978; Grewendorf 1988; Staudacher 1990; Haider 1993). Others have proposed that it is a monoclausal extraction with a verb-first parenthetical insert (e.g. Andersson/Kvam 1984, Reis 2002). (2)
Extraction analysis Wen1 denkst du [CP t1 [C hat Ede angerufen t1]]? whom think you has Ede called
(3)
Parenthetical analysis Wen1 [denkst du] hat Ede angerufen t1? whom think you has Ede called
The extraction analysis is feasible because it can be shown that German permits (i) dependent verb-second clauses and (ii) long extractions in other contexts. Regarding (i), some matrix predicates optionally select dependent V2 clauses (V2-clauses) without the complementizer dass ‘that’, (4b), besides the standard verb-final dass-clauses, (4a). As to (ii), German allows the extraction of an XP (in particular of wh-phrases) from a complement clause with the overt complementizer dass, cf. (5).1 The extraction is assumed to take place via the SpecCP-position of the embedded clause. Accordingly, in the extraction analysis for the controversial construction there is an intermediate trace in the SpecCP of the dependent V2-clause as in (2). (4)
a. Du denkst, dass Ede Tim angerufen hat. you think that Ede Tim called has ‘You think that Ede has called Tim.’ b. Du denkst, Ede hat Tim angerufen. you think Ede has Tim called ‘You think Ede has called Tim.’
(5)
a. Wen denkst du dass Ede angerufen hat? whom think you that Ede called has ‘Whom do you think that Ede has called?’ b. Weni denkst du [CP ti [C dass Ede ti angerufen hat]]
On the other hand, it is also possible to envisage a parenthetical analysis for the controversial construction, i.e. a verb-first parenthetical (V1-parenthetical) in the prefinite insertion slot. This is supported by the following facts (see Reis 1996): Not only does exactly this type of parenthetical occur in other insertion slots, e.g. in post-subject and clause-final position as in (6), but it is also the case that other types of parentheticals, e.g. so-parentheticals, occur in exactly this prefinite insertion position, see (7). So both the prefinite insertion slot and this type of verb-first parentheticals are attested independently of the controversial construction. (6)
a. Wen hat Ede denkst du angerufen? whom has Ede think you called b. Wen hat Ede angerufen denkst du? whom has Ede called think you ‘Whom has Ede (do you think) called, (do you think)?’
(7)
Bei Freunden, so denkt Ede, sollte man oft anrufen. at friends so thinks Ede should one often call ‘One should often call one’s friends, thinks Ede.’
The controversial construction has attracted the attention of linguists for more than two decades mainly because of the far-reaching implications its analysis has both for models of German sentence structure in general (cf. Reis 1996: 51) and for the status of dependent V2-clauses. Extractions have
been claimed to be possible only from strictly governed domains (Huang 1982). Thus if an ‘extraction from V2’ analysis is assumed for the controversial construction, dependent V2-clauses have to be analysed as syntactic complement clauses in the same sense as dass-complement clauses. This view has been challenged by Reis (1997), who claims that dependent V2-clauses are not syntactically embedded and that V2 in general is a root phenomenon. It is thus of great theoretical interest to settle the parenthesis/extraction debate. Up to now it has remained unresolved as it is difficult to find any clear evidence which distinguishes between the two accounts. In this paper we will present two types of empirical evidence in order to contribute to this long-standing discussion: evidence from judgement studies of present-day German (section 2) and evidence from historical corpus data of Old High German (section 3). Both these data types provide new insights in themselves, but crucially, their combination allows for even stronger conclusions concerning the controversy at hand, because, as we shall see, they both point in the same direction.
2. Evidence from judgement studies
The overall aim of our judgement studies was to compare the controversial construction to an uncontroversial extraction structure on the one hand and to an uncontroversial parenthetical construction on the other. More specifically, we compared the controversial construction to extractions from a dass-clause (dass-extractions) as in (5) and to V1-parentheticals in postsubject position as in (6a). We assume that constructions of the same structural type receive the same or at least parallel acceptability judgements across contexts. Thus, the finding that the controversial construction patterns with the unambiguous structure A and not with the unambiguous structure B would count as evidence in favour of analysis A. Reis (1996, 2002) systematically compared (her own judgements of) the controversial construction to the two clear structure types for a number of phenomena. In our study we focussed on one of her core objects of investigation, namely predicate restrictions. Reis claims that a number of predicates can appear as matrix predicates in extractions but that they are impossible in prosodically integrated parentheses. Thus they appear as bridge predicates in dass-extractions as in (5), but not inside V1-parentheticals as in (6). The relevant predicates are strong factive predicates, negative or negated predicates, preference predicates, and adjectival predicates in general. The crucial question is whether these predicates occur in the controversial construction.
Reis denies this, i.e. she rates the controversial construction with these predicates as unacceptable and thus on a par with the V1-parentheticals and differently from the dass-extraction. In our study we elicited judgements for the pertinent constructions with a range of predicates in order to test for similarities and differences between the structures across predicates.
2.1. Magnitude estimation methodology and pattern matching technique In all experiments, we applied the magnitude estimation methodology (Bard, Robertson & Sorace 1996) to elicit judgements. In magnitude estimation, informants judge how good or bad sentences are in comparison both to a reference item and to their own previous judgements, i.e. all judgements are relative. Our informants were instructed to rate the naturalness of our example sentences. Subjects can use all positive numbers including decimals to state their judgements, i.e. they can always introduce a score which is better or worse than all of their previous scores. Since the results are numerical and form an interval scale, standard statistical tests can be applied. We made the experiments available on the web using WebExp (Keller et al. 1998). For evaluation of our data we applied the pattern matching technique (Featherston 2004). The basic idea is that if two constructions are structurally alike, their judgements will be identical or behave in a parallel way across conditions, here a hierarchy of predicates. One structure may be better than the other, but their response to the continuum of predicates is the same. Put differently, there is no interaction of structure and predicate type. If, on the other hand, two constructions are structurally different, we expect an interaction of structure and predicate type and no parallel pattern.
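The pattern-matching idea can be made concrete with a small sketch. It is an illustration only, not the procedure of the cited studies: the function names and the example scores are hypothetical. For a table of mean ratings (structure by predicate), it computes the interaction residual of each cell, i.e. the cell mean minus its structure mean and its predicate mean plus the grand mean. If two structures are parallel across predicates, all residuals are zero, even when one structure is rated better throughout.

```python
from statistics import mean


def interaction_residuals(scores):
    """Return interaction residuals for a structure-by-predicate table.

    scores maps a structure name to a list of mean ratings, one per
    predicate, with the predicates in the same order for every structure.
    """
    structures = list(scores)
    n_pred = len(scores[structures[0]])
    grand = mean(v for s in structures for v in scores[s])
    row = {s: mean(scores[s]) for s in structures}           # structure means
    col = [mean(scores[s][j] for s in structures)             # predicate means
           for j in range(n_pred)]
    return {
        s: [scores[s][j] - row[s] - col[j] + grand for j in range(n_pred)]
        for s in structures
    }


def max_abs_interaction(scores):
    """Largest absolute interaction residual in the table."""
    residuals = interaction_residuals(scores)
    return max(abs(r) for values in residuals.values() for r in values)


# Hypothetical mean ratings over three predicates (not data from the studies).
parallel = {"A": [1.0, 0.5, 0.0], "B": [0.6, 0.1, -0.4]}     # B is lower but parallel
diverging = {"A": [1.0, 0.5, 0.0], "B": [1.2, 0.1, -1.0]}    # B declines more steeply

print(max_abs_interaction(parallel))
print(max_abs_interaction(diverging))
```

A residual table of this kind is only descriptive; whether a non-zero interaction is reliable is a matter for the statistical tests reported in the individual experiments.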
2.2. The studies series Our research program was guided by three questions: (i) Does the controversial construction respond to the predicate restrictions like the dass-extractions (see 2.2.1)? (ii) If not, could there be a confounding factor which is responsible for the perceived differences (see 2.2.2)? (iii) Does the controversial construction behave like clear parentheticals (see 2.2.3)?
2.2.1. Experiment I: dass-extraction vs. controversial construction
Recall Reis’ claim that factive, negative/negated, adjectival and preference predicates are acceptable in the dass-extraction but not in the controversial construction. This is striking as the two constructions are otherwise generally assumed to allow the same predicate classes, i.e. mainly predicates of thought and speech. For the parenthesis/extraction debate it is more revealing to focus on those predicates for which the two constructions are supposed to diverge, but one needs to consider their behaviour with the predicates of thought and speech in order to understand their overall pattern. Featherston (2004) carried out a judgement study testing eight predicates from this group with a number of structures, among them the dass-extraction and the controversial construction. Featherston adheres to the standard generative analysis of the controversial construction as an extraction construction. We thus have to be careful about his conclusions, but the data as such is very interesting: Featherston detects a strikingly similar pattern for the two constructions with the predicates of thought and speech, with the controversial construction being consistently judged better than the dass-extraction, but in a parallel fashion. We took Featherston’s results as a starting point for our own investigation: knowing that the dass-extraction and the controversial construction behave alike for some predicates, will we detect a difference between them for negative/negated, adjectival and preference predicates?
2.2.1.1. Design: Our own experiment was deliberately designed as a follow-up study of Featherston (2004) with some of the conditions overlapping, and we will present Featherston’s and our own study jointly in this paper. In (8) we spell out the constructions tested in both experiments, i.e. the controversial construction in (8a) and the dass-extraction in (8b).2 (8)
a. Welchen Bewerber glaubst/hoffst/bevorzugst du which applicant believe/hope/prefer you stellt das Projekt ein? employs the project PARTICLE b. Welchen Bewerber glaubst/hoffst/bevorzugst du which applicant believe/hope/prefer you dass das Projekt einstellt? that the project employs ‘Which applicant do you believe/hope/prefer (that) the project will employ?’
Both experiments included the verbs glauben (believe) and hoffen (hope). Featherston additionally tested the reporting predicates sagen (say), behaupten (claim), fürchten (fear), erzählen (tell), erklären (explain), and the negative predicate bezweifeln (doubt). Our follow-up experiment focussed on preference and adjectival predicates: wollen (want), wünschen (wish), vorziehen (prefer), bevorzugen (prefer), lieber sein (be preferable), ratsam sein (be advisable), das Beste sein (be the best), besser finden (find better), klar sein (be clear) and bekannt sein (be known). In sum, a total of 18 different predicates were covered in the two studies, eight in Featherston’s and twelve in the follow-up study with an overlap of two verbs (glauben and hoffen). Featherston used er (he) as the subject of these predicates, the follow-up study du (you, singular) unless the predicate was an adjective, which only combines with es (it).3 The interrogative constituent was in the accusative in both experiments. Featherston’s study contained ten, the follow-up experiment eleven lexical variants of the experimental material. The lexis was controlled for length, lemma frequency and semantic plausibility. In each experiment 28 subjects were recruited by flier. We refer the reader to Featherston (2004) and Kiziak (2004) for further details and separate discussions of the studies.
2.2.1.2. Results and discussion: For graphical representation, we normalize the data from all subjects by conversion to z-scores. This unifies the different scales that individual informants used, allowing for visual inspection of the results. The data sets from both experiments are combined in figure 1 by using glauben and hoffen as overlap and tool for unification. The vertical scale represents perceived wellformedness, with higher scores indicating better judgements. We ordered the predicates on the horizontal axis according to the scores in the dass-extraction condition.
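The conversion to z-scores can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the ratings are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which variant was used. Each informant's ratings are centred on that informant's own mean and divided by that informant's own standard deviation, so that informants who chose different numerical ranges become comparable.

```python
from statistics import mean, pstdev


def z_scores(ratings):
    """Normalize one informant's ratings to mean 0 and standard deviation 1.

    Assumption: the population standard deviation (pstdev) is used.
    """
    m = mean(ratings)
    sd = pstdev(ratings)
    if sd == 0:
        raise ValueError("all ratings are identical; z-scores are undefined")
    return [(r - m) / sd for r in ratings]


# Two hypothetical informants who use very different scales
# but rank and space the same three sentences identically.
informant_a = [10, 20, 30]
informant_b = [100, 200, 300]

z_a = z_scores(informant_a)
z_b = z_scores(informant_b)
print(z_a)
print(z_b)
```

After normalization the two informants yield the same values, which is what allows their judgements to be pooled and plotted on one vertical scale.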
Figure 1. dass-extraction versus controversial construction
Figure 1 shows that the best predicates for both constructions are the typical predicates of thought and speech. Judgements of the dass-extraction decline fairly evenly as the predicates become worse bridge predicates. In contrast to this, the controversial construction starts off better than the dass-extractions with the reporting predicates, but declines more steeply with the negative, preference and adjectival predicates, plunging past the dass-extractions to become worse than them. Our judgement data thus reveals a division of the predicates into two subgroups as claimed by Reis: on the one hand we have the reporting predicates for which the controversial construction is constantly rated better than the dass-extraction, and on the other hand, we have the negative, preference and adjectival predicates, for which the pattern is reversed.4 On the face of it, this finding disfavours an extraction analysis of the controversial construction since a more parallel behaviour would be expected on this analysis. Yet, the next section addresses a possible objection to this conclusion. 2.2.2. Experiment II: V2-subordination as a confounding factor? On the extraction analysis, the controversial construction is an extraction from a dependent V2-clause. It is noteworthy that dependent V2-clauses are more restricted in their occurrence than dass-clauses, i.e. only a subset of the predicates which select a dass-clause can also select a V2-clause.
Our reasoning is therefore as follows: possibly the controversial construction is an extraction, just like the dass-extraction, but it might be influenced by an additional factor which is irrelevant for the dass-extraction: the acceptability of V2-subordination with certain predicates. If this proved true, the observed dissimilarities between dass-extraction and controversial construction would be due only to the V2-factor, in which case they would be no counterargument to the extraction analysis of the controversial construction.
2.2.2.1. Design: As the primary aim of this study was to understand whether ratings of the controversial construction directly correlate with acceptability of simple, declarative dependent V2-clauses, we tested both constructions with a subset of the predicates from the previous studies. The secondary goal was to see whether we could replicate our earlier findings concerning the dass-extraction and the controversial construction. We thus included sentences as in (8) for replication and declarative dependent V2-clauses as in (9) for comparison with the controversial construction. (9)
Er glaubt/hofft/bevorzugt, die Firma wählt he believes/hopes/prefers the company chooses diesen Standort aus. this location PARTICLE ‘He believes/hopes/prefers the company chooses this location.’
Note that we were careful to include both predicates for which the controversial construction had scored better as well as such for which it had scored worse than the dass-extraction. Our experiment contained four predicates of thought and speech (glauben, hoffen, fürchten and erzählen), six preference predicates, among them two adjectival predicates (wollen, wünschen, bevorzugen, vorziehen, lieber sein and ratsam sein), one adjective of certainty (klar sein) and one negative predicate (bezweifeln). We used third person singular pronouns as subjects for these predicates. We provided 12 versions of the material and included 15 filler items. The 31 subjects were recruited by flyer. 2.2.2.2. Results and discussion:5 The previous results for the dass-extraction and the controversial construction were replicated. For lack of space we cannot discuss this in detail but refer the reader to Kiziak (2007). The repeated measures analysis of variance confirms the replication. The important measure is the interaction of the factors Predicate and Structure. It is significant by subjects and by items (F1 (11,330) = 4.23, p1 < 0.001; F2 (11,121) = 3.07, p2 = 0.003), thus confirming that the two constructions do
not respond in a parallel way to the range of predicates. We omit the comparison from figure 2 for reasons of clarity.
Figure 2. Dependent V2-clauses versus controversial construction. Partial correspondence, but also clear differences
Consider figure 2. Since we seek to understand whether the controversial construction only reflects the quality of dependent V2-clauses with the tested predicates, we aligned the predicates in such a way that the two constructions are as parallel as possible. We find a consistent pattern of dependent V2-clauses and controversial construction over eight of the predicates on the left of the graph. However, the parallelism breaks down for the four predicates on the right. For these, the dependent V2-clauses score disproportionally higher than the controversial construction. It is the word ‘disproportionally’ that should be emphasized, as the dependent V2-clauses are rated better throughout, i.e. even where the two constructions are judged in a parallel fashion. The repeated measures analysis of variance supports the view that the V2-clauses and the controversial construction are rated differently across the continuum of predicates: The interaction of Verb and Structure is highly significant both by subjects and by items (F1 (11,330) = 6.66, p1 < 0.001; F2 (11,121) = 4.73, p2 < 0.001). In summary, the pattern of the controversial construction cannot be explained by attributing it to the quality of dependent V2-clauses. This in turn
means that the differences we repeatedly found between the dass-extraction and the controversial construction cannot be accounted for by the factor ‘V2-subordination’, i.e. we have not found an explanation for the perceived differences between dass-extraction and controversial construction on an extraction analysis of the latter.
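The degrees of freedom reported for the interaction tests in Experiment II can be checked with a short calculation. The sketch assumes a fully crossed repeated measures design in which every unit (subject or lexical version) contributes to every cell and two structures are compared at a time; the function name is hypothetical. Under that assumption the interaction has (predicates − 1) × (structures − 1) degrees of freedom, and the error term has (units − 1) times that number.

```python
def interaction_dfs(n_units, n_predicates, n_structures):
    """Degrees of freedom for a Predicate x Structure interaction.

    Assumes a fully within-unit (repeated measures) design, where a unit
    is a subject (F1 analysis) or a lexical version (F2 analysis).
    """
    df_effect = (n_predicates - 1) * (n_structures - 1)
    df_error = (n_units - 1) * df_effect
    return df_effect, df_error


# Experiment II: 31 subjects, 12 predicates, 2 structures per comparison.
print(interaction_dfs(31, 12, 2))   # (11, 330), as in F1 (11,330)

# By-items analysis: 12 lexical versions of the material.
print(interaction_dfs(12, 12, 2))   # (11, 121), as in F2 (11,121)
```

Both results agree with the values given in the text, which is consistent with the design described in 2.2.2.1.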
2.2.3. Experiment III: clear parenthetical vs. controversial construction
With respect to the parenthesis/extraction debate, the results so far must be considered as negative evidence against the extraction analysis. This is however not equivalent to providing positive evidence in favour of the parenthetical analysis. In our third experiment we therefore compared the controversial construction to clear parentheticals, i.e. V1-parentheticals in postsubject position. These have generally received a parenthetical analysis despite some slifting accounts along the lines of Ross (1973).
2.2.3.1. Design: Apart from the controversial construction and the postsubject V1-parentheticals as in (10), we again included the dass-extraction to test for replication of our earlier findings. We used the same predicates as in experiment II. To ensure that we get clause-internal rather than clause-final V1-parentheticals, we inserted an adverbial in all versions. Apart from this we retained the previously used material. We recruited 27 participants.
(10) Welchen Vorschlag setzt der Vorstand
which proposal implement the board
glaubt/hofft/bevorzugt er im Frühjahr um?
believes/hopes/prefers he in spring PARTICLE
‘Which proposal will the board implement in spring, does he believe/hope/prefer?’
2.2.3.2. Results and discussion: Let us briefly state that the predicate-class dependent contrast between dass-extraction and controversial construction was again replicated in this third experiment. In the repeated measures analysis of variance, the interaction of the factors Predicate and Structure is again highly significant both by subjects and by items (F1 (11,286) = 5.51, p1 < 0.001; F2 (11,121) = 4.489, p2, <S1B-018 #112>, <S2A029 #14>, <S1A-097 #285>) and one finite ∅-token was recoded as a hollow clause (<W1B-011 #112>) and another as a non-finite token (<S1A-033 #75>).
In contrast to this, a thorough reinvestigation of the corpus data revealed that two that-tokens (<S1A-002 #156>, <S1B-008 #15>) and three ∅-tokens (<S1A-062 #1>, <S1A-084 #150>, <S1A-088 #217>) had been overlooked in the original analysis.
3. As Wasow, Jäger & Orr (in prep.) have shown, the choice of head noun seems to affect the presence of a particular relativizer: in their Switchboard corpus data of non-subject relative clauses the antecedent stuff, e.g., clearly favours the presence of a that relativizer (in 62.8% of all cases). It is therefore conceivable that particular head nouns also favour relative clauses which are introduced by a pied piped preposition and a wh-relativizer. Elsewhere I have argued that the antecedent way, e.g., clearly favours relative clauses with in which in SpecC (cf. Hoffmann in prep.). The reason for this appears to be a combination of processing factors and usage-based entrenchment procedures.
4. Here are the full details of the repeated measures analyses by subject (F1) and by item (F2): Preposition Placement (F1(1,33) = 4.536, p < 0.05; F2(1,5) = 32.261, p < 0.01); Relativiser (F1(2,66) = 17.149, p < 0.001; F2(2,10) = 38.783, p < 0.001); PP function (F1(2,66) = 0.997, p > 0.30; F2(2,10) = 30.281, p < 0.001); Preposition Placement*Relativiser (F1(2,66) = 9.740, p < 0.001; F2(2,10) = 78.271, p < 0.001); and Preposition Placement*PP function (F1(2,66) = 4.217, p < 0.02; F2(2,10) = 20.075, p < 0.001).
5. Temporal/location adjuncts are judged as good as the grammatical fillers (t(35) = –1.349, p > 0.18), while prepositional verbs are considered better than the grammatical fillers (t(35) = 3.728, p < 0.005). The latter effect can be explained by the fact that prepositional verbs such as rely on or talk to are stored as complex lexical items, which facilitates the interpretation of such V-P structures.
6. A third possibility would have been to add a fictitious token (Paolillo 2002) coded only for the categorical environment, and for the dependent variant. As pointed out by an anonymous reviewer, however, such a fictitious token may distort model results if the categorical environment is not very frequent.
7. Note that adjusted R2 and adjusted multiple R2 as well as cross-validation parameters are not automatically calculated by Goldvarb since the standard test of model fit in maximum likelihood models is the G2-test (i.e. Goldvarb’s Fit: X-square test; Paolillo 2002). Since many researchers are likely to be unfamiliar with this model fit parameter, however, the additional model fit parameters were calculated by feeding the final model into the R 2.2.1 software to get the cross-validation parameter. The adjusted R2 and adjusted multiple R2 were calculated manually using the actual and the expected applications for all cells of the Binomial One-level output.
References
Aarts, Bas
2000 Corpus linguistics, Chomsky and Fuzzy Tree Fragments. In Corpus Linguistics and Linguistic Theory, Christian Mair & Marianne Hundt (eds.), 5–13. Amsterdam/Atlanta, GA: Rodopi.
Aarts, Jan
1991 Intuition-based and observation-based grammars. In English Corpus Linguistics, Karin Aijmer & Bengt Altenberg (eds.), 44–62. London/New York: Longman.
Bard, Ellen Gurman, Dan Robertson & Antonella Sorace
1996 Magnitude Estimation of Linguistic Acceptability. Language 72: 32–68.
Cowart, Wayne
1997 Experimental Syntax: Applying Objective Methods to Sentence Judgements. Thousand Oaks: Sage.
Featherston, Sam
2004 Bridge verbs and V2 verbs – the same thing in spades? Zeitschrift für Sprachwissenschaft 23 (2): 181–210.
2005 Magnitude estimation and what it can do for your syntax: Some wh-constraints in German. Lingua 115 (11): 1525–1550.
Fillmore, Charles J.
1992 ‘Corpus linguistics’ or ‘Computer aided armchair linguistics’. In Directions in Corpus Linguistics: Proceedings of Nobel Symposium 82, Stockholm, 4–8 August 1991, Jan Svartvik (ed.), 35–60. Berlin/New York: Mouton de Gruyter.
Gries, Stefan Th.
2002 Preposition stranding in English: Predicting speakers’ behaviour. In Proceedings of the Western Conference on Linguistics. Vol. 12, Vida Samiian (ed.), 230–241. Fresno, CA: California State University.
Hawkins, John A.
1994 A Performance Theory of Order and Constituency. Cambridge: Cambridge University Press.
2004 Efficiency and Complexity in Grammars. Oxford: Oxford University Press.
Hoffmann, Thomas
2005 Variable vs. categorical Effects: Preposition pied piping and stranding in British English relative clauses. Journal of English Linguistics 33 (3): 257–297.
2006 Corpora and introspection as corroborating evidence: The case of preposition placement in English relative clauses. Corpus Linguistics and Linguistic Theory 2 (2): 165–195.
in prep. English relative clauses and construction grammar: something that preposition placement can shed light on? In Constructional Explanations in English Grammar, Graeme Trousdale & Nikolas Gisborne (eds.). Berlin/New York: Mouton de Gruyter.
Huddleston, Rodney, Geoffrey K. Pullum & Peter Peterson
2002 Relative Constructions and Unbounded Dependencies. In The Cambridge Grammar of the English Language, Geoffrey K. Pullum & Rodney Huddleston (eds.), 1031–1096. Cambridge: Cambridge University Press.
Johansson, Christine & Christer Geisler
1998 Pied piping in spoken English. In Explorations in Corpus Linguistics, Antoinette Renouf (ed.), 67–82. Amsterdam: Rodopi.
Keller, Frank
2000 Gradience in Grammar: Experimental and Computational Aspects of Degrees of Grammaticality. Ph.D. thesis, University of Edinburgh.
Keller, Frank, Martin Corley, Steffan Corley, Lars Konieczny & Amalia Todirascu
1998 WebExp: A Java toolbox for web-based psychological experiments. Technical Report HCRC/TR-99, Human Communication Research Centre, University of Edinburgh.
Keller, Frank & Theodora Alexopoulou
2005 A crosslinguistic, experimental study of resumptive pronouns and that-trace effects. In Proceedings of the 27th Annual Conference of the Cognitive Science Society, Bruno G. Bara, Lawrence Barsalou & Monica Bucciarelli (eds.), 1120–1125.
Kepser, Stephan & Marga Reis (eds.)
2005 Linguistic Evidence: Empirical, Theoretical, and Computational Perspectives. Berlin/New York: Mouton de Gruyter.
Lu, Bingfu
2002 How does language encode performance limitation into its structure. http://www.people.fas.harvard.edu/~whu/China/chunk.doc
Maindonald, John & John Braun
2003 Data Analysis and Graphics Using R: An Example-based Approach. Cambridge: Cambridge University Press.
Nelson, Gerald, Sean Wallis & Bas Aarts
2002 Exploring Natural Language: Working with the British Component of the International Corpus of English. Amsterdam/Philadelphia: Benjamins.
Olofsson, Arne
1981 Relative Junctions in Written American English. Göteborg: ACTA Universitatis Gothoburgensis.
Pesetsky, David
1998 Some Optimality principles of sentence production. In Is the Best Good Enough? Optimality and Competition in Syntax, Pilar Barbosa et al. (eds.), 337–383. Cambridge, MA: MIT Press.
Pullum, Geoffrey K. & Rodney Huddleston
2002 Prepositions and prepositional phrases. In The Cambridge Grammar of the English Language, Geoffrey K. Pullum & Rodney Huddleston (eds.), 597–661. Cambridge: Cambridge University Press.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech & Jan Svartvik
1985 A Comprehensive Grammar of the English Language. London: Longman.
Robinson, John S., Helen R. Lawrence & Sali A. Tagliamonte
2001 GOLDVARB 2001: A multivariate analysis application for Windows. http://www.york.ac.uk/depts/lang/webstuff/goldvarb.
Sag, Ivan A.
1997 English relative clause constructions. Journal of Linguistics 33: 431–484.
Sampson, Geoffrey
2001 Empirical Linguistics. London, New York: Continuum.
Schütze, Carson T.
1996 The Empirical Base of Linguistics: Grammaticality Judgements and Linguistic Methodology. Chicago: Chicago University Press.
Sigley, Robert J.
1997 Choosing your Relatives: Relative Clauses in New Zealand English. Ph.D. thesis, Victoria University of Wellington.
2003 The importance of interaction effects. Language Variation and Change 15: 227–253.
Trotta, Joe
2000 Wh-clauses in English: Aspects of Theory and Description. Amsterdam/Philadelphia, GA: Rodopi.
Van der Auwera, Johan
1985 Relative that – a Centennial Dispute. Journal of Linguistics 21: 149–179.
Wasow, Thomas, T. Florian Jaeger & David M. Orr
in prep. Lexical Variation in Relativizer Frequency. For the Workshop on Expecting the unexpected: Exceptions in Grammar at the 27th Annual Meeting of the German Linguistic Association, University of Cologne, Germany. http://www.bcs.rochester.edu/people/fjaeger/papers/WasowJaegerOrrDGfSpaper.pdf.
Locality and accessibility in wh-questions Philip Hofmeister, T. Florian Jaeger, Ivan A. Sag, Inbal Arnon and Neal Snider
1. Competing wh-orders Even in relatively configurational languages, such as English, speakers frequently have a choice between different constituent orders. Many of these word order variations have been linked to complexity (Hawkins 2005; inter alia). For example, heavy-NP shift is more likely if the shifted NP is more complex than the NP it shifts over (Wasow 1997). Other cases of word order variations, however, have not been considered in these terms. The choice between different wh-phrase orders, as in (1), has been said to be determined by (categorical) grammatical constraints, such as Superiority (Kuno & Robinson 1972; Chomsky 1973; inter alia). (1)
a. Who bought what? Non-SUV b. What did who buy? Superiority Violation (SUV)
According to such accounts, (1b) is ungrammatical in English. These accounts, however, do not predict the findings of Arnon et al. (2005) and Clifton et al. (2006), both of whom present evidence from corpora, attesting the usage of Superiority-violating examples. Nor can they accommodate the gradient nature of the contrast that has emerged in several studies (Featherston 2005; Fedorenko & Gibson 2006). In Arnon et al. (2005), we examined an alternative account, which we dubbed the Wh-Processing Hypothesis, which treats wh-phrase ordering as being subject to the same type of constraints as other word order variations. The Wh-Processing Hypothesis predicts that speakers disprefer more complex wh-dependencies. Here we examine to what extent factors known to affect the processing of filler-gap dependencies (FGDs) also affect the relative acceptability of different wh-phrase orders. We focus, in particular, on two factors in the processing of wh-questions: locality and accessibility. These factors play significant roles in the processing of FGDs in general, as we discuss below. One of our goals in this paper is to explore the extent to which these factors can explain SUVs.
In the next section, we define and discuss the two factors of locality and accessibility, showing how these factors have been previously related to processing difficulty. In section 2.2, we present the Wh-Processing Hypothesis. In section 3, we present the results of three acceptability surveys and one reading time study which test the effects of the above-mentioned factors on the processing and acceptability of questions. Finally, in section 4, we discuss the implications of these results, other possible factors, and potential problems with this account.
2. Locality and accessibility
The first factor we consider here is the locality of the dependency. Gibson (2000), Hawkins (2005), and many others observe that the distance between the filler and gap strongly affects the processing difficulty and relative acceptability of sentences with FGDs. For example, English object relatives, as compared to the shorter subject relatives, require more resources and increase processing difficulty, as indicated by reading times, question-answer accuracy, and lexical-decision tasks (King & Just 1991; inter alia). Since wh-interrogative dependencies are also non-local, it is reasonable to assume that they are subject to the same processing constraints as relative clauses. In fact, the lack of a specified, identifiable referent associated with a wh-interrogative filler potentially presents an additional cognitive challenge. Hence, we hypothesize that locality is also likely to play an important role in determining the acceptability of multiple wh-phrase (interrogative) constructions.
It has also been noticed that the type of the wh-filler (which-NP vs. bare wh-item) influences the acceptability of SUVs. Karttunen (1977) points out that examples like (2) sound better than (3): (2)
Which class of drug will which patient get?
(3)
What will who get?
Pesetsky (1987) further notices that the type of the in-situ wh-phrase affects acceptability independently, so that (4) is judged better than (3): (4)
What will which patient get?
Pesetsky ascribes this difference to “D(iscourse)-linking” of the which-NP, which exempts it from the normal conditions on wh-phrase ordering. The proposal, however, that the type of wh-filler and wh-intervener affect
grammaticality is both ad hoc and without independent motivation. We propose that the factors explaining SUVs are both more general and independently motivated. We discuss next how wh-order preferences, widely discussed under the label of D-linking, relate to more general processing mechanisms. Specifically, we believe there is a strong relationship between the form and content of an expression and its degree of activation, which has been described in terms of accessibility (Ariel 1990) and that this degree of activation strongly impacts the processing of the FGD. FGDs have been shown to be affected by the referential properties of material intervening between the filler and the gap. For example, in sentences like (5), verbs are read fastest when the relative clause subjects are pronouns, while first or famous names lead to faster reading times than definite descriptions: (5)
The consultant who (we/Donald Trump/the chairman/a chairman) called advised wealthy companies.
Warren & Gibson (2002, 2005) interpret these results in terms of accessibility (Ariel 1990, 2001): the more accessible the intervening referents, the less burden there is on the processor, which is already taxed by maintaining the filler-gap dependency. Accessibility is a measure of activation level, which is partially indicated by the choice of referring expression. The form of an NP acts as a cue to the listener as to how much work is necessary to activate or retrieve the correct antecedent. As information and morphological complexity in the NP increase, the amount of work necessary to retrieve the antecedent also increases. Processing less accessible forms, therefore, requires more work and hence creates an additional processing difficulty while an FGD is being parsed. Interrogative wh-dependencies, like other FGDs, also exhibit sensitivity to the properties of intervening material. Alexopolou & Keller (2003) show that words associated with a higher cognitive cost appearing between a wh-filler and gap impair the integration of wh-phrases with (the subcategorizer of) the gap. There is also evidence from German that certain intervening wh-phrases improve the acceptability of superiority-violating, multiple wh-questions: German speakers disprefer bare in-situ wh-phrases in SUVs (e.g. wer), as compared to complex wh-phrases (e.g. welcher Mann; Featherston 2005). We interpret these results as reflecting the increased processing difficulty introduced by bare wh-words. Locality and accessibility thus constitute the focus of this study. Before we turn to the predictions we make about these factors and how they influence the processing of wh-dependencies, we first address in detail how accessibility applies to wh-phrases.
2.1. Accessibility: Wh-phrases versus referential NPs
While accessibility has been almost exclusively applied to referential NPs, we propose that the same mechanisms that influence the processing of referential NPs are also at play during the processing of wh-phrases. We dwell on this subject here in order to address the issue of why the explicitness of intervening wh-phrases and referential NPs affects processing difficulty in seemingly different ways. As pointed out above, more explicitness correlates with more processing difficulty for referential NPs, but the opposite seems to be true for wh-phrases. To explain this difference, we consider here some hypotheses about the most important predictors of activation for wh-phrases and referential NPs. For referential NPs, morphologically simple and less informative NPs (e.g. pronouns) are used to refer to entities of higher activation or salience, while morphological complexity and high informativity (e.g. definite descriptions) indicate that the referent is less activated at the time of utterance (Ariel 2001). Thus, the choice between a pronoun or a definite description is conditioned by the salience of that particular individual in the preceding discourse. Notice that it only makes sense to compare the accessibility of two phrases when they have the same intended interpretation (i.e. both phrases have the same referent). In addition to marking a current degree of activation, the form of NPs also partially determines the degree of activation subsequent to their utterance – referred to as future accessibility by Ariel (2001). In short, the more explicit an NP is, the greater the subsequent increase in activation of the corresponding referent(s). Increases in activation not only make subsequent references with higher current accessibility markers more likely, they also facilitate other linguistic operations that involve that information, such as the integration of fillers and gaps.
Thus, all other things being equal, the referent of an expression like the gorilla approaching at breakneck speed, as opposed to it, is more likely to become the discourse topic and have a higher activation level at subsequent points in the utterance. In support of this view, Gernsbacher (1989) presents evidence that proper names reactivate an antecedent more strongly than a pronoun. From this perspective, current activation marking is in an inverse relation to future activation marking. A higher accessibility marker like a personal pronoun
indicates high current accessibility, but does relatively little to increase activation. As Ariel (2001: 68) notes, this “can explain why speakers shift to lower accessibility markers from time to time, even when they continue to discuss the same discourse entity.” That is, to maintain topicality, speakers use longer and more explicit forms on occasion to compensate for normal activation decay and interference from other discourse entities. The same reasoning, we hypothesize, applies to wh-phrases: all other things being equal, the concept of politicians is more salient after an utterance of which politician (in context) than after who. Wh-phrases, too, have a range of possible forms from morphologically simple and uninformative (e.g. who) to more complex forms that package more information (e.g. which politician) to ever more complex and informative forms (e.g. which politician from Missouri). Given the greater degree of morphological complexity and explicitness in which-NPs, we categorize them as higher future accessibility markers. Moreover, Frazier & Clifton (2002) provide evidence that which-NPs are better antecedents for pronouns than bare wh-words like who and what. Since high future accessibility phrases encourage the subsequent use of high current accessibility anaphors (i.e. pronouns), the relation between explicitness and future activation is thus the same for anaphoric and wh-expressions. Preliminary results from reading-time experiments conducted by the first author also favor this ranking. In unary, wh-island constructions with supporting contexts (Which employee/Who did Albert learn whether they dismissed after the annual performance reviews?), which-NPs lead to significantly faster reading times than a bare wh-item at the embedded verb and in subsequent regions. 
Accordingly, the evidence from Featherston and Frazier & Clifton can all be seen to reflect the fact that which-phrases are more accessible than simple wh-pronouns at the time that fillers and gaps are integrated. If the difficulty of processing a head is a function (among other things) of the activation levels of its arguments, then the form preferences for both wh-questions and referential NPs emerge as a preference for high argument activation at the point when the head is processed. In examples like (6) from Warren & Gibson (2005), variants with highly salient personal pronouns will have the highest argument activation, because activation starts high and hence can withstand more decay and/or interference effects.
(6) It was {you / Patricia / the lawyer} who {we / Dan / the businessman} avoided at the party.
In contrast, argument activation starts low or at zero in multiple wh-questions, but is boosted higher when more information is expressed in the wh-phrases. Therefore, a which-phrase in either argument position of an SUV should satisfy the preference for higher activation at the verbal head. This still leaves a noticeable distinction between the processing of referential NPs and wh-phrases. Recall that highly salient, but less informative NPs serve as the best kind of intervening referential NP (Warren & Gibson 2005). The above-cited data on wh-phrases, however, appears to indicate that more explicit and informative wh-phrases are preferred as interveners. Assuming that processing ease depends upon activation level, this means that wh-phrase interveners are most activated when the wh-phrase is explicit, while referential NP interveners are most activated when the form is not very informative but marks a highly salient referent. One way to account for the apparently different effects of explicitness is to point to the simple fact that interrogative wh-phrases are not anaphoric. Anaphoric NPs are used to refer back to discourse referents previously mentioned. In other words, they evoke information already in the common ground (explicitly or implicitly). Hence, the primary task in processing a referential NP is retrieving the correct antecedent or, failing that, accommodating the existence of an antecedent. This whole process is expedited when the referent or mental entity is highly salient at the point the anaphor is reached.1 A processing benefit for more explicit anaphoric forms is not apparent in the Warren & Gibson (2005) results.2 This does not preclude the possibility of some positive correlation between explicitness and activation boosting with respect to anaphoric NPs; instead, the results permit the view that the effect of activation boosting is obscured by the profound effect of salience in that study.
One way to account for this is to argue that pronouns, proper names, and definites differ too much in their current activation levels (due to the need to express important differences in salience) for boosting to make much difference. On this view, it is the property of being an anaphor that causes activation boosting to be relatively unimportant.
In contrast, wh-phrases do not function as anaphors, although parts of their interpretation may derive from the preceding discourse. Rather, wh-phrases are used to construct complex objects – questions – which either seek to gain information (as in main clause interrogatives) or else to make a clausal argument that can be predicated over (as in embedded interrogatives). Questions, therefore, will be more easily understood (and better answered, for that matter) when either a) the context strongly provides the focus of the question or b) the wh-phrase itself explicitly narrows down the scope of the inquiry. Under the assumption that wh-phrases have either low or zero activation prior to their utterance (which follows from their non-anaphoric function), using a more explicit wh-form should facilitate the retrieval and integration process. In other words, because the initial activation is so low, activation strength is largely dependent on activation boosts that are, in turn, dependent on explicitness. This hypothesis is consistent with all the wh-phrase data considered so far.3
In sum, we propose that the apparent differences in the effect (size) of explicitness can be attributed to wh-expressions being non-anaphoric. This proposal makes two interesting predictions for future research: a) an indefinite phrase should lead to faster processing at the verb if the indefinite phrase is more explicit (contains more information); b) in the right context, it may even be possible to observe effects of activation boost for anaphoric expressions (see footnote 3) – in such contexts, more explicit anaphoric NPs should lead to faster processing.
2.2. The Wh-Processing Hypothesis
Based on these observations of how locality and accessibility affect FGDs, we propose the following Wh-Processing Hypothesis to account for the relative rareness of examples like (1b) in English, as compared to non-superiority-violating orders like (1a):
(7) The Wh-Processing Hypothesis
a. Factors that have been shown to burden the processing of referential filler-gap dependencies (e.g. relative clauses) burden the processing of all FGDs, including wh-interrogative constructions.
b. Many filler-gap sentences that have standardly been analyzed as ungrammatical (violating ‘island’ constraints) are in fact grammatical, but are judged to be less acceptable by speakers because they are harder to process.
The reasoning implicit in (7b) builds on recent proposals to better understand the relation between speaker judgments and processing factors. See, for example, Fanselow & Frisch (2004). This hypothesis entails that speakers faced with a choice between several grammatical wh-orders, will disprefer those which (given the context) are associated with a greater processing cost. Combined with existing theories of processing complexity (e.g. Gibson 2000), the Wh-Processing Hypothesis makes the following predictions about wh-questions:
(I) In filler-gap constructions, the greater the distance between the filler and its gap, the less acceptable the sentence.
(II) Less accessible fillers make filler-gap sentences less acceptable.
(III) Less accessible interveners make filler-gap sentences less acceptable.
Note that we make no assumptions about the relative importance of these predictions. That is, we do not conjecture whether the effect of distance is more important than accessibility or vice versa; nor does the Wh-Processing Hypothesis indicate whether the accessibility of the filler takes precedence over that of the interveners or vice versa.
3. Experimental evidence
3.1. Methods
We present here the results of three surveys eliciting acceptability judgments and one experiment measuring comprehension complexity in wh-questions via self-paced reading.4 Acceptability judgments were elicited over the WWW using magnitude estimation (ME; Bard et al. 1996) with the WebExp software package (Keller et al. 1998). ME lets participants set their own continuous acceptability scale, allowing them to express as many distinctions as desired. Acceptability judgments are made relative to a reference sentence. Participants’ judgments are subsequently standardized by dividing by the reference sentence’s score. All ME analyses are based on the z-score5 of these (log-transformed) standardized judgments. For the reading time study, residual reading times were used for the analysis. This method reduces variability due to individual differences in reading times.6 All experiments use a Latin-square design: each participant saw each item in exactly one condition, and all conditions occur equally often. All lists include at least as many fillers as experimental items. All results were analyzed using repeated measures analyses of variance (ANOVAs). Participants for the ME experiments were recruited via e-mail lists and online discussion forums. The reading-time study was conducted as part of another reading-time study at MIT’s Tedlab.
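The standardization of ME judgments described above can be sketched as follows. This is a minimal illustration rather than the analysis script actually used: the function name `standardize_me`, the toy scores, the choice to compute z-scores within each participant, and the use of the population standard deviation are all assumptions made for the example.

```python
import math
from statistics import mean, pstdev


def standardize_me(raw_scores, reference_score):
    """Standardize one participant's magnitude-estimation judgments.

    Steps (following the description in the text):
      1. divide each raw judgment by the reference sentence's score,
      2. log-transform the resulting ratio,
      3. convert the log ratios to z-scores.
    Assumption: z-scores are computed per participant with the
    population standard deviation.
    """
    log_ratios = [math.log(score / reference_score) for score in raw_scores]
    mu = mean(log_ratios)
    sigma = pstdev(log_ratios)
    if sigma == 0:
        # A participant who gives every sentence the same score
        # carries no contrast; return zeros rather than divide by zero.
        return [0.0 for _ in log_ratios]
    return [(value - mu) / sigma for value in log_ratios]


# Hypothetical data: two participants using very different numeric scales.
participant_a = standardize_me([50, 100, 200], reference_score=100)
participant_b = standardize_me([5, 10, 20], reference_score=10)
print(participant_a)
print(participant_b)
```

Because each judgment is expressed relative to the participant's own reference score before the log and z transformations, the two hypothetical participants end up with identical standardized values even though one used numbers ten times larger than the other.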
3.2. Locality effects on acceptability (ME1)
3.2.1. Materials
ME1 investigates the effect of locality on the acceptability of wh-questions (Prediction I). Locality-based processing theories (e.g. Gibson 2000) predict that an increase in distance between filler and gap (measured in new discourse referents) makes wh-dependencies harder to process. We manipulated this distance by optionally attaching a six-word PP either to the which-phrase (8c,f) or to the other NP (8b,e). In addition, the which-phrase was either subject-extracted (8a–c) or object-extracted (8d–f): (8)
a. Which man saw the girl?
b. Which man saw the girl in the bar on California Ave.?
c. Which man in the bar on California Ave. saw the girl?
d. Which man did the girl see?
e. Which man did the girl in the bar on California Ave. see?
f. Which man in the bar on California Ave. did the girl see?
We hypothesized that longer filler-gap distances would engender higher processing costs, which would result in lower acceptability judgments. For example, the filler in (8d) is separated from the gap by only one new discourse referent, the girl; but in (8e), three new discourse referents intervene between the filler and the gap. Thus, we predict (8e) to be judged less acceptable than (8d). Notice that we further predict a difference between (8b) and (8e) for the same reasons, despite the roughly equivalent lengths of the questions. In general, Prediction I says that subject-extractions should be judged more acceptable than object extractions. The study includes 36 items in six different conditions. In addition, 34 fillers were included in each list. 18 of these came from another multiple-wh experiment. 42 native English speakers completed the survey, but the results from one individual were removed because of incomplete data for that subject. Participation did not result in compensation.
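A Latin-square assignment of 36 items to six conditions can be sketched as below. The rotation scheme and the function name `latin_square_lists` are assumptions for illustration; the text states only that each participant saw each item in exactly one condition and that all conditions occurred equally often. The condition labels combine the extraction type (SE/OE) with the attachment site (O/NP/WH).

```python
from collections import Counter


def latin_square_lists(n_items, conditions):
    """Build one presentation list per condition by rotating conditions.

    In list k, item i appears in condition (i + k) mod n_conditions, so
    every list contains each item once and each condition equally often.
    """
    n_cond = len(conditions)
    if n_items % n_cond != 0:
        raise ValueError("number of items must be a multiple of the number of conditions")
    lists = []
    for list_index in range(n_cond):
        assignment = {
            item: conditions[(item + list_index) % n_cond]
            for item in range(n_items)
        }
        lists.append(assignment)
    return lists


# Six conditions: extraction type crossed with PP attachment.
CONDITIONS = ["SE.O", "SE.NP", "SE.WH", "OE.O", "OE.NP", "OE.WH"]
lists = latin_square_lists(36, CONDITIONS)

# Each list shows every condition six times (36 items / 6 conditions).
print(Counter(lists[0].values()))
```

Fillers and randomization of presentation order are left out of the sketch; in practice they would be added to each list after the condition assignment.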
3.2.2. Results
As shown in Table 1, object extractions (which have more intervening discourse referents) were judged as less acceptable than subject extractions in the subject but not the item analysis (F1 (1,35) = 4.9, p < .05; non-significant by items, F2 (1,35) = 2.5, p = .12). While the difference between examples like (8a) and (8d) turned out to be non-significant, this is not surprising since neither question involves more than one intervener and the number of interveners differed by only one. Notably though, the object wh-question with three intervening NPs (8e) was judged less acceptable than the subject wh-question (8b) of the same length with zero interveners. Overall, sentences were judged differently from each other if the difference in number of interveners was two or more. This may mean that, for simple unary wh-questions, it takes at least two interveners to invoke any measurable cognitive challenge.
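The pairwise logic behind this generalization can be made concrete with a small sketch. The intervener counts for (8b), (8d), and (8e) are stated in the text; the counts for (8a), (8c), and (8f) are inferred here on the assumption that a PP inside the wh-phrase belongs to the filler and so adds no interveners. The dictionary and function names are hypothetical.

```python
from itertools import combinations

# New discourse referents between filler and gap, per condition.
INTERVENERS = {
    "SE.O (8a)": 0,   # inferred: differs from (8d) by one
    "SE.NP (8b)": 0,  # stated: zero interveners
    "SE.WH (8c)": 0,  # inferred: PP is part of the filler
    "OE.O (8d)": 1,   # stated: the girl
    "OE.NP (8e)": 3,  # stated: three new discourse referents
    "OE.WH (8f)": 1,  # assumption: only "the girl" intervenes
}


def intervener_differences(counts):
    """Absolute difference in intervener count for every pair of conditions."""
    return {
        (first, second): abs(counts[first] - counts[second])
        for first, second in combinations(counts, 2)
    }


diffs = intervener_differences(INTERVENERS)

# Pairs for which the generalization in the text expects a
# detectable acceptability difference (difference of two or more).
large = sorted(pair for pair, diff in diffs.items() if diff >= 2)
for pair in large:
    print(pair, diffs[pair])
```

Under these assumed counts, every pair with a difference of two or more involves (8e), which matches the observation that (8e) is the condition that stands apart from the others.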
Figure 1. Acceptability ratings from ME1 (OE = object extraction; SE = subject extraction; O = no attachment; WH = PP attached to wh-phrase; NP = PP attached to referential NP)
Table 1. Pairwise comparisons of the six conditions in ME1, including the difference in interveners for each pair. (OE = object extracted; SE = subject extracted; NP = six-word PP is attached to referential NP; WH = six-word PP is attached to wh-phrase)
Pairs of Extraction.Attachment    Difference in # of interveners    Subj. analysis    Item analysis
OE.NP (8e)    SE.NP (8b)          3
              SE.WH (8c)          3
p .10). The reading times for Region 2 revealed a main effect of subject type (F1 (1,48) = 8.42, p < .01; F2 (1,23) = 7.57, p < .01). Participants needed more time to read the sentences containing a non-referential subject.
368 Britta Stolterfoht, Lyn Frazier and Charles Clifton, Jr.
Additionally, there were two marginally significant effects: the main effect of adverb position, marginally significant in the subject analysis and fully significant in the item analysis (F1 (1,48) = 2.94, p = .09; F2 (1,23) = 4.13, p < .05), and the interaction of the two factors, significant in the subject analysis and marginally significant in the item analysis (F1 (1,48) = 4.48, p < .05; F2 (1,23) = 3.26, p = .08). The conventional 2 × 2 analysis of variance provides some evidence that participants needed more time to read the sentences with a non-referential subject and a late adverb than each of the other three types of sentences. Since this pattern of results was predicted, we performed more focused tests, comparing the non-referential subject/late adverb condition against each of the remaining three conditions. Each contrast was fully significant (see Table 2). However, none of the remaining three conditions differed significantly from any other (F < 1.0).
Table 2. ANOVA RT Region 2 – planned comparisons
Comparison non-ref.-late with    F1 (1,48)    p1      F2 (1,23)    p2
non-ref.-early                   5.56         .02     5.40         .03
ref.-early                       7.17         .01     10.24        .004
ref.-late                        11.56        .001    8.92         .007
For the choice of the correct paraphrases, only a marginally significant main effect for adverb position was found (F1 (1,48) = 3.33, p = .07; F2 (1,23) = 3.59, p = .07). For unclear reasons, participants gave slightly more correct answers for sentences with a late adverb (93.5%) than for sentences with an early adverb (90%). The analysis of the question-answering times revealed no significant effects (all F < 1.0).
3. Discussion
The results of the self-paced reading study revealed significantly longer reading times for sentences with a non-referential subject preceding a sentential adverbial in comparison to all other conditions. These results can be interpreted as evidence for Hypothesis 2, which assumes a topic position for English comparable to that found in scrambling languages like German. Hypothesis 2 predicted that a subject placed above a sentential adverbial is treated as a topic and thus a non-referential subject preceding the adverb
Adverbs and sentence topics in processing English
will be highly marked. This result is particularly interesting because German and English are so different with respect to the relevant structural properties: German permits various types of fronting operations (fronting to SpecCP, scrambling to various positions within the middlefield) to reflect information-structure whereas English permits very little movement of this type. The results of our study can be interpreted as evidence against Hypothesis 1 which assumes that English has only one landing position for subjects that is not sensitive to the information-structural status of DPs.1 Our data suggest that English patterns with other Germanic languages with regard to information-structural constraints for the position of the subject. One might worry that the long reading times for sentences with a negative subject preceding the adverb were due to the possibility of a scope ambiguity in these sentences, but not in the sentences with a referential subject. However, at least according to our intuitions, there is no scope ambiguity with the adverbs tested in the actual materials (unfortunately, evidently, apparently, surprisingly, etc.). Thus, we think this possibility is remote, given that at best one would be dealing with a potential ambiguity. Further, in the processing literature on scope, one does not find longer processing times due to actual scope ambiguity. For example, Anderson (2004, Experiment 6) tested scope ambiguous sentences like A climbing expert scaled every cliff. In a self-paced reading study, the ambiguous sentences were presented in either a context biased to surface scope or a context biased to inverse scope. Unambiguous control sentences were also tested (The climbing expert scaled every cliff for surface scope; A different climber scaled every cliff for inverse scope). As in all her other studies, the surface scope sentences were read faster than the inverse scope sentences but there was no effect of ambiguity.
4. Conclusions
Our study examined the behavior of subjects in English when a sentential adverbial followed the subject in an embedded clause. The results of the experiment showed that a non-referential subject gave rise to a penalty (longer reading times) when it preceded the adverb but a referential subject did not. This was expected if a subject preceding the sentential adverbial is treated as a topic, since non-referential subjects do not make good topics. Based on evidence from intuitions, a similar effect seems to occur in German. But in German it is not so surprising that adverb placement would influence the information-structure of a sentence. Given the movement possibilities afforded by scrambling, often the only way to be sure where in the syntactic structure a subject sits is by looking at its position relative to an adverb or some other constituent. Further, the fact that a subject or other argument may appear in various syntactic positions allows the positions to be exploited for marking information-structure. But what the present results suggest is that adverbs may play a similar role in English. The results encourage the view that in a non-scrambling language too adverb placement can constrain and signal information-structure.
Acknowledgements
We would like to thank William Evans for his help with collecting the data. Furthermore, we would like to thank Thomas Ernst for helpful suggestions. This work was supported by a fellowship within the Postdoc-Programme of the German Academic Exchange Service (DAAD) awarded to Britta Stolterfoht and by National Institutes of Health Grant HD-18708 to the University of Massachusetts.
Note
1. Alternatively, a single syntactic subject position may be ‘valued’ by information-structure constraints in a context-dependent fashion, depending on its position relative to a sentential adverb. Regardless of whether one adopts a more complicated syntax with straightforward mapping to information-structure, or a simpler syntax and a more complicated statement of the information-structure constraints, it is clear that adverb position and the information-structure status of the subject interact.
Appendix
Materials (one version of all experimental items and paraphrases)
The envoy said that presumably no king defeated the knights.
  The envoy assumed that the knights lost/won.
The electrician reported that presumably no appliance caused the blackout.
  The electrician supposed that there was a/no defective appliance.
The officer noticed that surprisingly no suspect knew the victim.
  The victim was known/not known by a suspect.
The exterminator saw that surprisingly no mouse ate the cheese.
  The exterminator saw that the cheese was gone/still there.
The president declared that evidently no minister lied to the subordinates.
  The president claimed that the subordinates were deceived/not deceived.
The doctor concluded that evidently no patient survived the disease.
  The doctor concluded that the disease was nonlethal/lethal.
The police assumed that possibly no owner torched the warehouse.
  The police assumed that the owner was/the owners were not involved in arson.
The magazine speculated that possibly no actress visited the hospital.
  The magazine speculated that hospital was/was not visited by an actress.
The reporter said that unfortunately no quarterback attended the party.
  The reporter said that the party was/was not attended by a quarterback.
The mother said that unfortunately no nurse called the doctor.
  The mother said that a/no doctor was called.
The organizers announced that probably no band will play at the festival.
  It is likely that the/no band will appear at the festival.
The forecast claimed that probably no storm will reach Amherst.
  The forecast assumes that Amherst will/won’t get nasty weather.
The journalist emphasized that obviously no soldier killed the demonstrators.
  The demonstrators were/were not killed by the military.
The investigator heard that obviously no clerk broke the safe.
  The investigator heard that the safe was/was not broken by the clerk.
The teacher said that certainly no pupil smoked a cigarette.
  The teacher said that the pupil smoked/did not smoke.
The father stated that certainly no son washed the car.
  The father stated that his son has been busy/his sons have not been busy.
The lawyer stated that apparently no priest embezzled the money.
  The lawyer stated that the money was/was not embezzled by a priest.
The director heard that apparently no audience loved his film.
  The film was a/no success.
The judge stated that supposedly no secretary stole the data.
  The data were/were not stolen by a secretary.
The artist recognized that supposedly no gallery owner bought the picture.
  The artist recognized that the picture was/was not sold.
The driver said that fortunately no child missed the bus.
  Some child missed/Everybody caught the bus.
The mayor said that fortunately no people obeyed the request.
  The request was/was not complied.
The professor noticed that amazingly no student passed the exam.
  The professor noticed that somebody/nobody passed the exam.
The activist noticed that amazingly no whale survived the spill.
  The spill left a/no whale alive.
References Abraham, Werner 1992 Wortstellung und das Mittelfeld im Deutschen. In Erklärende Syntax des Deutschen, W. Abraham (ed.), 27–52. Tübingen: Narr. Anderson, Catherine 2004 The structure and real-time comprehension of quantifier scope ambiguity. Unpublished Ph.D. dissertation, Northwestern University. Bader, Markus & Michael Meng 1999 Subject-object ambiguities in German embedded clauses: An acrossthe-board comparison. Journal of Psycholinguistic Research 28: 121– 144. Bobaljik, Jonathan David & Dianne Jonas 1996 Subject positions and the role of TP. Linguistic Inquiry 27: 195–236. Diesing, Molly 1992 Indefinites. Cambridge, MA: MIT Press, Emonds, Joseph 1976 A transformational approach to English syntax: root, structurepreserving, and local transformations. New York: Academic Press. Erteshik-Shir, Nomi 1997 The Dynamics of Focus Structure. Cambridge: Cambridge University Press.
Adverbs and sentence topics in processing English
373
Frey, Werner 2000 Über die syntaktische Position der Satztopiks im Deutschen. ZAS Papers in Linguistics 20: 137–172. Frey, Werner & Karin Pittner 1998 Zur Positionierung der Adverbiale im deutschen Mittelfeld. Linguistische Berichte 176: 489–534. Haftka, Brigitta 1995 Syntactic positions for topic and contrastive focus in the German middlefield. In Proceedings of the Göttingen Focus Workshop, [Arbeitsberichte des Sonderforschungsbereichs 340 Nr. 69] I. Kohlhof (ed.), 1–24. University of Tübingen. 2003 „Möglicherweise tatsächlich nicht immer“ – Beobachtungen zur Adverbialreihenfolge an der Spitze des Rhemas. Folia Linguistica 37: 103–128. Höhle, Tilman 1982 Explikationen für „normale Betonung“ und „normale Wortstellung“. In Satzglieder im Deutschen, W. Abraham (ed.), 75–153. Tübingen: Narr. Kaiser, Elsi & John C. Trueswell 2004 The role of discourse context in the processing of a flexible word order language. Cognition 94: 113–147. Kratzer, Angelika 1995 Stage-level and individual-level predicates. In The Generic Book, G. Carlson & F. Pelletier (eds.), 125–175. Chigaco: University of Chicago Press. Meinunger, André 1995 Discourse dependent DP (De-)Placement. Ph.D. dissertation. University of Potsdam. Platzack, Christer 1983 Germanic word order and the COMP/INFL parameter. Working Papers in Scandinavian Syntax 2. Reinhart, Tanya 1981 Pragmatics and linguistics: An analysis of sentence topics. Philosophica 27: 53–94. 1995 Interface strategies. OTS Working Papers. Utrecht University. Steube, Anita 2000 Ein kognitionswissenschaftliches basiertes Modell für Informationsstrukturierung (in Anwendung auf das Deutsche). In Von der Philologie zur Grammatik, J. Bayer & C. Römer (eds.), 213–238. Tübingen: Niemeyer. Stolterfoht, Britta & Markus Bader 2004 Focus structure and the processing of word order variations in German. In A. Steube (ed.), Information structure: theoretical and empirical aspects. Berlin /New York: Mouton de Gruyter.
Svenonius, Peter 2002 Subject positions and the placement of adverbials. In Subjects, expletives, and the EPP, P. Svenonius (ed.), 201–242. Oxford: Oxford University Press.
List of contributors
Sam Featherston
SFB 441 Linguistic Data Structures
University of Tübingen, Nauklerstraße 35, 72074 Tübingen, Germany
[email protected]

Wolfgang Sternefeld
Department of Linguistics
University of Tübingen, Wilhelmstraße 19-23, 72074 Tübingen, Germany
[email protected]

Doug Arnold
Department of Language and Linguistics
University of Essex, Wivenhoe Park, Colchester, CO4 3SQ, United Kingdom
[email protected]

Inbal Arnon
Department of Linguistics
Stanford University, Margaret Jacks Hall (Building 460), Stanford, CA 94305, USA
[email protected]

Katrin Axel
Department of German Linguistics
Saarland University, Postfach 151150, 66041 Saarbrücken, Germany
[email protected]

Peter Bosch
Institute of Cognitive Science
University of Osnabrück, Albrechtstraße 28, 49069 Osnabrück
[email protected]

Oliver Bott
SFB 441 Linguistic Data Structures
University of Tübingen, Nauklerstraße 35, 72074 Tübingen, Germany
[email protected]

Joan Bresnan
Department of Linguistics
Stanford University, Margaret Jacks Hall (Building 460), Stanford, CA 94305, USA
[email protected]

Harald Clahsen
Department of Language and Linguistics
University of Essex, Wivenhoe Park, Colchester, CO4 3SQ, United Kingdom
[email protected]

Charles Clifton, Jr.
Department of Psychology
Tobin Hall, University of Massachusetts, Amherst, Amherst, MA 01003, USA
[email protected]

Elena Dieser
SFB 441 Linguistic Data Structures
University of Tübingen, Nauklerstraße 35, 72074 Tübingen, Germany
[email protected]

Caroline Féry
Linguistics Department
University of Potsdam, Karl-Liebknecht-Str. 24–25, 14476 Golm, Germany
[email protected]

Lyn Frazier
Department of Linguistics
226 South College, University of Massachusetts, Amherst, Amherst, MA 01003, USA
[email protected]

Wilbert Heeringa
Meertens Institute
Postbus 94264, Joan Muyskenweg 25, 1090 GG Amsterdam, The Netherlands
[email protected]

Thomas Hoffmann
Department of English and American Studies
University of Regensburg, 93040 Regensburg, Germany
[email protected]

Philip Hofmeister
Department of Linguistics
Stanford University, Margaret Jacks Hall (Building 460), Stanford, CA 94305, USA
[email protected]

T. Florian Jaeger
Brain and Cognitive Sciences
University of Rochester, Meliora Hall, Box 270268, Rochester, NY 14627-0268, USA
[email protected]

Anke Karabanov
Department of Women and Child Health
Karolinska Institutet, Stockholm Brain Institute, Solna, 171 76 Stockholm, Sweden
[email protected]

Tanja Kiziak
SFB 441 Linguistic Data Structures
University of Tübingen, Nauklerstraße 35, 72074 Tübingen, Germany
[email protected]

Peter König
Institute of Cognitive Science
University of Osnabrück, Albrechtstraße 28, 49069 Osnabrück, Germany
[email protected]

Denisa Lenertová
Institute of Linguistics
University of Leipzig, Beethovenstraße 15, 04107 Leipzig, Germany
[email protected]

Timm Lichte
Emmy Noether Project, SFB 441 Linguistic Data Structures
University of Tübingen, Nauklerstraße 35, 72074 Tübingen, Germany
[email protected]

John Nerbonne
Center for Language and Cognition
University of Groningen, P.O. Box 716, 9700 AS Groningen, The Netherlands
[email protected]

Janina Radó
SFB 441 Linguistic Data Structures
University of Tübingen, Nauklerstraße 35, 72074 Tübingen, Germany
[email protected]

Louisa Sadler
Department of Language and Linguistics
University of Essex, Wivenhoe Park, Colchester, CO4 3SQ, United Kingdom
[email protected]

Ivan A. Sag
Department of Linguistics
Stanford University, Margaret Jacks Hall (Building 460), Stanford, CA 94305, USA
[email protected]

Christopher D. Sapp
Department of Modern Languages
C-115 Bondurant, University of Mississippi, PO Box 1848, University, MS 38677-1848, USA
[email protected]

Stavros Skopeteas
Linguistics Department
University of Potsdam, Karl-Liebknecht-Str. 24–25, 14476 Golm, Germany
[email protected]

Neal Snider
Department of Linguistics
Stanford University, Margaret Jacks Hall (Building 460), Stanford, CA 94305, USA
[email protected]

Jan-Philipp Soehn
SFB 441 Linguistic Data Structures
University of Tübingen, Nauklerstraße 35, 72074 Tübingen, Germany
[email protected]

Ilona Steiner
SFB 441 Linguistic Data Structures
University of Tübingen, Nauklerstraße 35, 72074 Tübingen, Germany
[email protected]

Britta Stolterfoht
SFB 441 Linguistic Data Structures
University of Tübingen, Nauklerstraße 35, 72074 Tübingen, Germany
[email protected]

Stefan Sudhoff
Institute of Linguistics
University of Leipzig, Beethovenstraße 15, 04107 Leipzig, Germany
[email protected]

Aline Villavicencio
Institute of Informatics
Federal University of Rio Grande do Sul, Av. Bento Gonçalves, 9500, CEP 91501-970, Caixa Postal: 15064, Porto Alegre, Brazil
[email protected]

Index
acceptability, 17, 25–26, 31, 36, 47, 169, 171–173, 185–187, 192–198, 200–203, 314
accessibility, 1, 4–5, 75–76, 100, 185–189, 191–192, 195–200, 202, 279, 319, 356, 359
activation, level of, 187–191, 195, 198, 202–203
adverb placement, 38, 45, 49, 235–236, 239, 302, 306, 345, 361–366, 368–370
agreement, 7, 9–15, 17–27, 62, 111, 115, 169, 199, 223, 260, 326, 363–364
  closest conjunct ~, 10–27
  gender ~, 9, 13–15, 19–22, 224
  number ~, 11, 16–18
  prenominal vs. postnominal ~, 11–12, 15–17, 19, 21–22, 25–27
  resolution, 10–12, 16–21, 25
anaphora, 117, 119, 121, 189–191, 202–203, 207–208, 212, 222–224, 232, 356
antecedent, 117–118, 121, 177, 180, 187–190, 208–210, 212, 215, 218–220, 222–224, 249, 356
auch, 4, 141, 143, 148–149, 151, 227–235, 238–241, 243–244, 246, 251
bilingualism, 7, 80, 133–137, 144–145, 147–157
  ~ and code-switching, 134, 137, 147
  one system theory, 135–136, 148
binding, 116–122, 224
Binding Theory, 117–118, 122, 224
boundary tone, 325
bridge predicate, 31, 35
c-command, 73, 212, 222–223, 227, 251
CELEX, 81, 104, 317
competence, 202
complexity, 11, 70, 163–164, 174, 176–178, 185, 187–189, 191–192, 198, 262
context, 5, 55, 57–62, 65–68, 73, 75, 78–79, 84–85, 87–92, 102, 114, 136, 143, 153–155, 163, 189–191, 201, 203, 207–208, 210, 227, 229, 233–235, 239, 244–245, 255–257, 260, 263, 278, 302, 305–307, 314, 319–321, 335, 363, 369
coordination, 9–14, 16–26, 243, 341, 343–359
  NP vs. S, 349–351, 353–357
corpus
  historical ~, 31, 40, 46, 301
  ~ of spoken language, 76–77, 102, 104, 115, 319, 343
corpus model, 77–78, 80, 83–84
corpus study, 1, 4–5, 9, 13, 24–25, 34, 46, 53, 63–65, 75, 78, 81, 83, 86, 89, 91, 104, 161–163, 165, 173, 178–180, 185, 227, 229, 238, 243–246, 250, 253–259, 261–263, 300–301, 324, 341–344, 348, 353–358
correspondence hypothesis, 107–108
Cosmas Corpus, 64
dative alternation, 75–78, 83–90
deaccenting, 328–329, 331
dialectology, 267–277, 287
dialectometry, 267–268, 278, 288–289
dialects, 267–269, 271–275, 277–280, 283–284, 287–293, 305–306, 309, 311
  Austrian German, 305, 309, 313
  Low Saxon, 277
  Swabian, 5, 305–307, 309–310, 315–316
discourse, 115, 319, 322, 328–330, 361–364
discourse linking, 186–190, 193, 196–201, 320, 322
discourse referent, 190, 193, 202, 207, 364
Down’s Syndrome, 20, 118–123, 253
downstep, 328–329, 331
downward entailing, 11, 48, 134, 249, 252, 254, 264, 317
Dutch, 108, 227, 250–251, 253, 277, 279, 281, 283, 293, 341, 354–355
Early New High German, 5, 299–301, 303–305, 307, 310–311, 314–316
elicitation task, 102–103, 319–320, 322, 324, 334–336
English, 4–6, 75–76, 84, 100–102, 105–106, 108, 161–162, 169, 178, 185–186, 191, 193, 197, 201, 227, 235, 249, 250, 274, 281, 324–325, 327, 329–331, 334–337, 341–343, 361, 364–366, 368–370
event-related brain potentials, 110–114, 116
experiments, 2, 5–6, 32–36, 38, 47, 56, 61, 71, 73, 75–76, 78–84, 86–87, 90–91, 99, 107–108, 110–116, 119–121, 162, 169–171, 173, 189, 192–195–196, 198–199, 203, 209–210, 213, 221, 223, 234, 238, 240–241, 243–246, 250, 268, 272, 278, 280, 286, 311, 313–315, 317, 320, 322, 343, 364, 366–367, 369
extraction/parenthesis debate, 6, 29–36, 38–49
extraposition, 49, 255, 300, 302–307, 310, 315
eye tracking, 2, 4, 207–209, 212–213, 221, 224
filler-gap dependency, 177–178, 185–187, 191–193, 195–197, 199–201
focus, 4–6, 10–11, 33, 134, 147, 185, 187, 190, 207, 209, 213, 218–219, 222–223, 227–229, 233, 243, 246, 252, 268–269, 288, 293, 299–308, 310–317, 321–329, 331–336, 343–344, 348, 361, 368
  association with ~, 4–5, 26, 31–32, 37, 54–57, 66, 71, 73, 80–99, 102, 105, 108, 141, 143, 148–149, 151, 163, 170–172, 227–246, 251, 274, 294, 320, 348–350, 362, 366
  contrastive ~, 302, 304, 310
  ~ particles, 4, 141, 143, 148–149, 151, 227–235, 238–241, 243–244, 246, 251
  ~-background structure, 319, 334, 361
  multiple ~, 313, 324, 331, 334–335
  narrow ~, 331, 334–335, 361
  new information, 302, 310
Georgian, 6, 324–329, 331, 334–337
German, 4–6, 16, 29–31, 40–42, 45–46, 49, 54, 57, 64, 67, 99, 101–102, 104–110, 112–113, 115–116, 124, 133–134, 137–147, 149–152, 154–157, 187, 201, 210, 221–222, 227–228, 232, 246, 250–251, 254, 259, 263, 278, 299–300, 303, 305, 309, 311, 313–316, 324–327, 329–331, 334–337, 361–365, 368–370
grammaticality, 75, 84–85, 87–91, 110–112, 166, 168, 169, 171–173, 178, 185, 187, 191, 201, 249, 305, 307–314, 317
grammaticality judgments, 75, 84, 91, 311–312, 317
gravity hypothesis, 6, 267–270, 273–274, 278–280, 283–284, 287–294
Greek, 6, 324–329, 331, 334–337
historical data, 5–6, 29, 31, 40, 43, 45–46, 166, 268, 278–280, 283, 293, 299, 305, 315
ICE-GB, 162–164, 169, 171, 173–174, 178
information structure, 4, 6, 232, 243–245, 300, 316, 319–320, 322, 324–325, 327, 335–336, 361–362, 364–365, 369–370
intervention, 26, 65, 176, 186–188, 190, 193–196, 199–200, 251–252, 343
intonation, 71, 228, 236, 240, 243, 300, 303, 325, 327–329
intonational phrase (IP), 109, 329, 362, 364
introspection, 161, 171
intuitions, 7, 53, 56, 75–76, 79, 84, 91, 369
judgments, 2, 5–6, 11, 13, 22, 25, 27, 29, 31–35, 40, 46, 53–55, 57, 59, 62–63, 66, 68, 71, 75, 85, 89, 91, 119, 121, 169–171, 173, 191–193, 198, 200–201, 203, 238–239, 241, 244, 277, 305–309, 312–313, 316, 363
language acquisition, 97–98, 100–101, 103–104, 107, 116, 123, 134–137, 150, 154, 156, 271
language disorders, 7, 97–98, 113–114, 116–119, 121–123
language mixing, 133–137, 148–150, 152–153, 155–156
language production, 76, 134, 139–140, 147, 152–153, 156, 341–343, 357, 358
language separation, 7, 133–134, 136–137, 143, 150, 153–155, 157
Levenshtein distance, 275–278, 280–281
linguistic distance, 267, 275, 277–283, 286–290, 292
linguistic variation, 5, 137, 267–269, 271–275, 277–281, 283–284, 287–293, 300–301, 305–306, 309–311
  ~ and geographic distance, 267, 269, 272, 274–275, 279–280, 283, 285–293
  ~ and population size, 11, 117–118, 267, 269–270, 272–274, 279–280, 283–285, 287–290, 292
  ~ and social contact, 267, 270–273, 279, 284, 286, 292–293
locality, 5, 22, 24, 117–118, 185–187, 191, 193, 200–202
magnitude estimation, 5, 32, 53, 66, 162, 169, 171, 179, 192, 300, 311, 315–316
mapping hypothesis, 362
middlefield, 41, 45, 47–48, 227, 229, 234–241, 363, 369
mixed effect regression models, 78, 81–89, 173
natural usage data, 91
negative polarity item, 4, 249–264
  classification, 250, 253, 260–264
  licensing, 249, 251–255, 259, 262–265
  retrieval, 187, 253
Old High German, 6, 31, 40–49
parallel-structure effect, 343–344, 347–353, 357–359
parsing preferences, 4, 341, 343, 356–357
passive, 116–123, 140, 148, 162, 358
pattern matching technique, 6, 32
phonetic parameters, 228–229, 235, 238–239, 242, 244–245
phonetic segment distance, 275–276, 278, 294
Portuguese, 7, 9–13, 15–19, 22–24, 26
prediction, 115, 177, 193, 195–197, 288
prefield, 45, 227, 229, 232–243, 246
preposition placement, 5, 161–180
probability, role of, 5, 75–85, 91, 106, 173, 209, 212–224, 302
production experiment, 234–235, 240, 242, 244–246, 319, 322
pronoun, 4–5, 36, 44, 47–48, 76, 82, 85–90, 117–118, 120–122, 165–166, 187–190, 207–210, 212–225, 230, 302, 316, 335, 342–343, 355–356
  ~ resolution, 208–209
prosody, 4, 6, 31, 45, 89, 124, 177, 227–229, 233–235, 238, 240–241, 243–245, 319–320, 325, 327–328, 331, 334–337
psycholinguistics, 1, 3–6, 33–36, 38, 47, 55–56, 59, 61, 63, 66, 68, 71–72, 75–76, 78–84, 86–87, 90–91, 97–99, 103, 105, 108, 110–116, 119–120, 123, 162, 166, 168–171, 173, 174, 176, 178, 192–193, 195–199, 208–210, 212–213, 221, 223, 227, 234–236, 238, 240–241, 243–246, 250, 268, 272, 278, 280, 286, 291, 311, 313–315, 317, 319–320, 322, 334, 336, 341–344, 359, 361, 364, 366–367, 369, 371
quantifier, 7, 251
  ~ distributivity, 63–68, 71
  ~ scope, 7, 53–60, 62–71, 73, 190, 228, 240, 249, 251–252, 254–255, 362, 369
questionnaire study, 3, 6, 53, 57, 67, 72, 78–82, 87–89, 186, 192, 201, 208, 279, 313, 319–320, 335
referential expression, 207–210, 212–224
reflexive, 117–118, 120–123
relative clause, 43, 123, 161–163, 165–180, 186, 191, 254, 341
relative clause attachment, 341–343, 354–356
Russian, 133–134, 138–152, 154–157
scrambling, 302–307, 309–310, 315, 361–364, 368–370
self-paced reading, 4–5, 110, 112, 116, 186–187, 189, 192, 198–203, 208, 323, 341–343, 353–354, 358, 365–369
semantic classes, 90, 92
sentence comprehension, 4, 97, 107–112, 115–116, 201, 203, 208, 221, 341–343, 358
Spanish, 11, 23, 105
speech perception, 228, 234, 238, 240, 243, 245
spontaneous speech production, 75, 102, 104, 115, 245, 319, 343
stress, 124, 227–236, 238–244, 246, 301, 308, 310, 315–316, 325, 328–331, 334–336, 362
superiority, 185–187, 190, 195–202
syntactic preferences, 91
topic, 188, 228, 232, 234, 243, 271, 292, 319–320, 322, 324–325, 329, 334–336, 361, 363–365, 368–369
  contrastive ~, 228, 243, 246, 319, 321, 334
  Contrastive ~ Hypothesis, 228, 243, 246
translation equivalents, 7, 135, 139–143, 146–148, 154–156
TÜBA-E, 343–354, 357
tuning hypothesis, 341–342, 348, 353–354, 356–357
TÜPP-D/Z, 254
verb inflection, 100–101, 103–107, 124
verb placement, finite, 6, 29–31, 35, 42–44, 48–49, 108–116, 227, 229
verbal complex, 45, 71, 299–311, 315–316
Visual World Paradigm, 207, 209, 213
Wh-Processing Hypothesis, 185–186, 191–192, 198, 201
wh-questions
  multiple wh-dependencies, 185–189, 191, 193, 198, 200–202
  single vs. double ~, 319–322, 324–327, 329–336
  subject vs. object ~, 193–195, 320–321, 323–328, 335
Williams Syndrome, 118–121
word order, 4, 6, 42, 49, 58, 99, 108–112, 114–115, 169, 185, 245, 300–301, 303–306, 309–311, 313–315, 317, 320, 324–327, 331–336, 361, 364